Re: Why have dedicated type for Language rather than using Strings (was Re: Moderate use of classes or only interfaces and strings)

Peter Ansell Sat, 11 Apr 2015 06:16:34 -0700

On 11 April 2015 at 21:04, Reto Gmür <[email protected]> wrote:
> On Mon, Mar 30, 2015 at 9:40 PM, Peter Ansell <[email protected]>
> wrote:
>
>> The use of interfaces is a deliberate step to improve the usefulness
>> of the API by not mandating any particular implementation. We are not
>> going to change that.
>>
>> I don't understand what makes a simple String a bad choice for
>> representing language tags. There are no other attributes that could
>> be attached to the string, per the BCP47 design to have all of the
>> information required in a simple string. In addition that BCP47 string
>> literal is all that is referred to in RDF-1.1, which is our core
>> reference, not the JVM or other libraries' design choices. I.e., the
>> RDF-1.1 specs do not dissect the language tag, so we see no need to do
>> so here. The equality rules (lower case comparison with any casing for
>> the tag literal itself) are all defined on the tag as a whole, so in
>> that case it also doesn't make sense to decompose the string.
>>
>
> By that reasoning you could also use Strings for IRIs.


There are limits to the amount of object orientation that is required.
If there is agreement that there is benefit from adding a type for
LanguageTags then it could be added. However, there is no reason to
generalise to IRIs, which are generally agreed to need to be
represented by a type due to their complexity.

> It's about object orientation: to represent an IRI we have a type IRI, to
> represent a Language a type Language, and to represent an immutable
> sequence of characters, and only for that, we have String.
>
> So if our APIs were focused on the concrete syntax then for the
> serializations "Hello"@en and "Hello"@EN it might be justified to have two
> different language tags and
>
> literal1.getLanguageTag().equals(literal2.getLanguageTag()) evaluating to
> false.

The RDF-1.1 specification sets up specific rules based solely on lower
case string comparison by stating that the lower case language tags
are the value space. There is no ambiguity about the fact that a plain
.equals is not suitable for language tag strings.
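
To make the point concrete, here is a minimal sketch of comparing
language tag strings in lower case. The class and method names
(LanguageTagComparison, sameLanguageTag) are hypothetical and not part
of any proposed API. Locale.ROOT is used so that the lower-casing does
not depend on the JVM's default locale.

```java
import java.util.Locale;

public class LanguageTagComparison {

    // Hypothetical helper: two tags are the same if their
    // lower-cased forms are equal.
    static boolean sameLanguageTag(String a, String b) {
        return a.toLowerCase(Locale.ROOT).equals(b.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Plain String.equals is case-sensitive.
        System.out.println("en".equals("EN"));                  // false
        // Lower-case comparison treats both spellings as one tag.
        System.out.println(sameLanguageTag("en", "EN"));        // true
        System.out.println(sameLanguageTag("en-GB", "en-gb"));  // true
    }
}
```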

> But as our API models the abstract syntax and as in the RDF data model
> there is no difference between "Hello"@en and "Hello"@EN and the string
> representation of the language in some serialization is irrelevant for what
> we model, we should only care about the language (which can be represented
> as a string but which is not a string)
>
> literal1.getLanguage().equals(literal2.getLanguage()) must evaluate to
> true, because the literals have the same language.
>
> The Language type is similar to java.util.Locale but modelling only BCP-47.

As long as the Language type allowed for the preservation of the
original casing for systems that wish to support that, it would not be
any different to String, apart from formalising the case-insensitive
comparison.
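
As a starting point for that discussion, a type along these lines might
look like the sketch below. Everything here is an assumption for
illustration only: the LanguageTag name, the of() factory and the
getOriginal() accessor are hypothetical, and no validation against
BCP47 is attempted. It keeps the original casing and bases equals and
hashCode on the lower-cased form.

```java
import java.util.Locale;
import java.util.Objects;

public final class LanguageTag {

    private final String original;   // casing as supplied by the caller
    private final String normalized; // lower-cased form used for equality

    private LanguageTag(String tag) {
        this.original = Objects.requireNonNull(tag, "tag");
        this.normalized = tag.toLowerCase(Locale.ROOT);
    }

    // Hypothetical factory method.
    public static LanguageTag of(String tag) {
        return new LanguageTag(tag);
    }

    // Returns the tag exactly as it was supplied.
    public String getOriginal() {
        return original;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof LanguageTag)) {
            return false;
        }
        return normalized.equals(((LanguageTag) other).normalized);
    }

    @Override
    public int hashCode() {
        return normalized.hashCode();
    }

    @Override
    public String toString() {
        return original;
    }

    public static void main(String[] args) {
        LanguageTag lower = LanguageTag.of("en");
        LanguageTag upper = LanguageTag.of("EN");
        System.out.println(lower.equals(upper));   // true
        System.out.println(upper.getOriginal());   // EN
    }
}
```

With this shape the only behavioural difference from a String is the
case-insensitive equals and hashCode.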

Feel free to submit a pull request which adds this functionality (or
any other patch submission method that you prefer) and that may make
it simpler to discuss the specifics, including whether it should be
added or not.

Thanks,

Peter
