I don't remember how much you've heard of this idea of merging RDF's
language-tagged literals into datatype literals.  The idea RIF-WG and
OWL-WG are working with is a kind of hack of inventing a datatype whose
lexical space contains a string with an "@" and a language tag appended.
So:

    "[EMAIL PROTECTED]"^^rif:text  == "chat"@fr

As I say, it's a sort of a hack, but it works for getting RDF down to
having just one kind of literal, which is a big win for folks building
semantics around it, e.g. RIF and OWL.  (Maybe some distinction still
needs to be kept between "foo" and "foo"^^xs:string.  I'm fuzzy on that
issue.)

Axel Polleres just had another suggestion, though, which is:

    "chat"^^lang:fr   == "chat"@fr

which really seems a lot better.  It also seems like something RDF Core
would have considered.  (Along with "chat"^lang:fr, I imagine.)   Can
you remember any of why it was not chosen?   If you don't, I guess I'll
ask Brian and Graham next -- from the minutes I see, they seem to be the
most interested in the subject.

       -- Sandro

--- Begin Message ---

Phillips, Addison wrote:
Hi,

Would you consider including I18N WG in your joint task force? These issues 
seem to arise fairly frequently. We'd like to see consistent solutions develop.

Addison

Sure!

As for the namespace, I personally prefer rdf:, sharing Jos' view that doing so is NOT problematic. Several rdf:-namespaced properties already lack a specified formal semantics (reification has been mentioned already), so what?

My point is rather that our intention from the beginning was to reflect the lang-tagged literals from RDF, realizing that they are not uniform with the other (typed) literals.

However, the concerns and issues you raised make me shift a bit towards a completely different proposal which, at first glance, seems much, much cleaner to me. As for your proposals/issues:

1) subtags: if we stick with the current understanding that we talk about a single datatype with pairs in its lexical space, I am not sure whether this needs to be semantically reflected... do you mean to say that e.g. a fact

message("Hello"@en-US^^owl:internationalizedString)

 should in our semantics also imply

message("Hello"@en^^owl:internationalizedString)

???
In RDF's semantics this is certainly not the case now and I think it would complicate things there ...

 A probably more feasible solution would be a real type hierarchy for
language tags: instead of a single datatype owl:internationalizedString or rif:text, which has pairs of strings and language tags as its lexical space, define separate datatypes (and subtypes) for each lang-tag, i.e.

use:

message("Hello"^^lang:en-US)

where e.g. lang:en-US is a subtype of lang:en, i.e. that would also imply

message("Hello"^^lang:en)

(just as xsd:integer is a subtype of xsd:decimal in the XML Schema type hierarchy, see http://www.w3.org/TR/xmlschema-2/#built-in-datatypes)

Anything wrong with that? To me this seems much cleaner than this fiddling around with pairs of strings and lang-tags.

The new lang: namespace could then be something under the control of
our task force, defining a datatype IRI for each of the lang-tags defined in BCP 47.

Again, anything wrong with that? It seems simpler than what we are doing now, and feasible to me. The problem of addressing our concerns would then boil down to deriving a type hierarchy from the BCP 47 document, rather than defining a single datatype... and defining the lang: namespace,
which could somehow refer to BCP 47 in its namespace URI.
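
To make the idea of deriving a hierarchy from the tags a bit more concrete, here is a minimal Python sketch. It assumes the hierarchy is obtained simply by dropping trailing subtags, which is an assumption of this sketch and not something BCP 47 itself defines. The lang: prefix follows the proposal above; the helper names supertypes and is_subtype are made up for illustration.

```python
from typing import List


def supertypes(tag: str) -> List[str]:
    """Return the hypothetical lang: datatypes a tag would belong to,
    most specific first, by dropping trailing subtags one at a time.

    Tags are lowercased because language tags compare case-insensitively.
    """
    subtags = tag.lower().split("-")
    return ["lang:" + "-".join(subtags[:i]) for i in range(len(subtags), 0, -1)]


def is_subtype(sub: str, sup: str) -> bool:
    """True if lang:<sub> would be a (reflexive) subtype of lang:<sup>."""
    return "lang:" + sup.lower() in supertypes(sub)


if __name__ == "__main__":
    print(supertypes("en-US"))
    print(is_subtype("en-US", "en"))
```

Under this reading, a constant typed lang:en-US would also count as lang:en, but not the other way round, and a tag such as enf-US would not be related to en at all, since the comparison is done subtag by subtag.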

This said, summarizing: if that works, I am in favor of a new namespace of its own rather than rif or owl, I guess. If there is some problem with that and we want to stick with the current pairs in the lexical space, I am in favor of the rdf namespace.

2) As for your second issue, i.e. adding several language tags
in a priority list... I am not sure how/whether we could address this, or whether it is really in the scope of what we want to define. We are talking about typed constants which either belong to a certain language or not; also, it would probably complicate things... unless we define in the lang-type hierarchy that e.g. a type "en,fr" would implicitly be a supertype of both en and fr... would that work?

example:

  metal("Gold"^^lang:en,de)

would imply both

  metal("Gold"^^lang:de)

  metal("Gold"^^lang:en)

Does that make sense? I am not sure whether I understand your issue well enough, though, or what further complications such an extension would bring. Still, it looks to me as if this considerably complicates things.
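
One way to read the example: a constant typed with a comma-separated list is treated as if it carried each listed tag separately. Here is a minimal Python sketch under that assumption, which is not settled anywhere in this thread; implied_types is a made-up helper name.

```python
from typing import Set


def implied_types(type_name: str) -> Set[str]:
    """Expand a hypothetical combined type such as 'lang:en,de' into the
    single-tag types the example says it would imply."""
    prefix, _, tags = type_name.partition(":")
    return {prefix + ":" + t.strip() for t in tags.split(",") if t.strip()}


if __name__ == "__main__":
    print(sorted(implied_types("lang:en,de")))
```

A single-tag type just expands to itself, so ordinary constants are unaffected by this reading.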


best,
Axel

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.


-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jie
Bao
Sent: Wednesday, July 09, 2008 11:33 AM
To: Phillips, Addison
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; public-
[EMAIL PROTECTED]
Subject: Re: I18N issues an OWL2

Hi Addison

Thank you for the suggestions. The OWL and RIF WGs are planning to
have a joint task force on internationalized strings. There is a
short state-of-the-art summary [2] and a specification draft [1].
Further revisions will be made after further discussions between
the WGs. Your comments are valuable and will definitely be considered.
I will keep you updated if there is any progress.

[1] http://www.w3.org/2007/OWL/wiki/InternationalizedStringSpec
[2] http://www.w3.org/2007/OWL/wiki/InternationalizedString

Best

Jie

On Tue, Jul 8, 2008 at 6:54 PM, Phillips, Addison
<[EMAIL PROTECTED]> wrote:
All,

I am writing this note in response to Jeremy Carroll's note of 21
May [1] and in response to an action item from the
Internationalization Core WG [2].

I've reviewed the various issue tracker materials you have and
have some comments. I hope you find these useful. Please note that
these are currently personal and not WG comments.

First, a bit of summary/background. IETF BCP 47 defines language
tags. BCP 47 used to be RFC 3066. Currently, it is two RFCs: 4646
and 4647. The latter of these is about "Matching of Language Tags",
which is primarily the issue at hand. Generally speaking, there are
several forms of matching that you might describe in OWL2. Given
the general type of operations you provide, I think you'd be best
off if you implemented something similar to "extended filtering" in
4647. This is the most "regular expression-like" syntax and allows
for the most flexibility for applications using it.

The problem with the proposals I've seen so far is similar to
issues I have often seen with language tags elsewhere at W3C:
language tags have an internal structure made up of subtags
separated by hyphens. If one specifies "en*" (or, better, "en" or
"en-*"), this should match tags like "en-US" or "en-GB", but not
"ena" or "enf-US". That is, the tokens should be interpreted as
subtags.

In reviewing plans, I noticed this message as the most recent
reference about formats and such [3]. This gave me a few concerns:
1. I'm not sure I like the name "internationalizedString". I
realize that this is an expansion on xsd:string and thus needs a
different name. However, it implies that other strings are somehow
"not internationalized". Perhaps something along the lines of
"languageString", "nlString" (nl for natural language), or similar.

2. Definitely langPattern should be case insensitive.
Alternatively, it is permitted to normalize both the literal and
the pattern to lowercase for matching purposes.

3. It would be best to use the terminology from RFC 4647 to the
extent possible. One question would be whether langPattern could be
a true "language priority list" (i.e. have more than one "language
range" in it). That would allow one to say something like:

   DatatypeRestriction(owl:internationalizedString langPattern "en,fr")

... which would mean: any string in some flavor of English or
French (but not, say, German or Japanese), and inclusive of tags
such as "fr-CA" and "EN-us".

This may be difficult, since I don't think other pattern strings
allow for internal structure.
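
As a rough illustration of subtag-aware, case-insensitive matching against a list such as "en,fr", here is a minimal Python sketch. It uses the simpler "basic filtering" rule from RFC 4647 (a range matches a tag if it equals the tag, or equals a prefix of it that is followed by a hyphen), not the extended filtering recommended above, so ranges such as "en-*" are not handled. The function names are made up for this sketch and are not part of any OWL or RIF draft.

```python
def matches_range(tag: str, lang_range: str) -> bool:
    """Basic filtering: compare case-insensitively and on subtag boundaries."""
    tag, lang_range = tag.lower(), lang_range.lower()
    if lang_range == "*":
        # The lone wildcard range matches every tag.
        return True
    return tag == lang_range or tag.startswith(lang_range + "-")


def matches_priority_list(tag: str, priority_list: str) -> bool:
    """True if the tag matches at least one range in a comma-separated list."""
    ranges = [r.strip() for r in priority_list.split(",") if r.strip()]
    return any(matches_range(tag, r) for r in ranges)


if __name__ == "__main__":
    for tag in ["fr-CA", "EN-us", "de", "ena", "enf-US"]:
        print(tag, matches_priority_list(tag, "en,fr"))
```

With the list "en,fr", tags such as "fr-CA" and "EN-us" match, while "de", "ena" and "enf-US" do not, because the comparison stops at hyphen boundaries and does not do plain string-prefix matching.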

I'd be happy, personally and on behalf of the I18N Core WG, to
spend time discussing this with your WG as appropriate. Please note
that I'm also the editor of BCP 47 and that a new revision is
coming up. It won't affect this discussion, but it is a good reason
why one should reference the BCP number and not the RFC :-)

Best Regards,

Addison

[1] http://lists.w3.org/Archives/Public/public-i18n-core/2008AprJun/0065.html
[2] http://www.w3.org/2008/06/04-core-minutes.html#item07
[3] http://lists.w3.org/Archives/Public/public-owl-wg/2008May/0019.html


--
Dr. Axel Polleres, Digital Enterprise Research Institute (DERI)
email: [EMAIL PROTECTED]  url: http://www.polleres.net/

Everything is possible:
rdfs:subClassOf rdfs:subPropertyOf rdfs:Resource.
rdfs:subClassOf rdfs:subPropertyOf rdfs:subPropertyOf.
rdf:type rdfs:subPropertyOf rdfs:subClassOf.
rdfs:subClassOf rdf:type owl:SymmetricProperty.

--- End Message ---
