See replies inline ;-)
> Sorry to say this, but I think you are making a mistake. To say that the
> rdfs:label has to look like a variable name because it is for Web developers
> sounds to me like you are saying that the javadoc of a method should look
> like a piece of code because it is addressed to programmers. I refuse to
> believe that Web developers understand better pseudo code than natural
> language.
I will finally give in and use English spacing and capitalization for
rdfs:labels in GoodRelations, e.g. use
"Business entity"@en for gr:BusinessEntity etc.
But I will keep the cardinality recommendation in the rdfs:label of properties,
e.g.
serial number (0..*) for gr:serialNumber
and the class type information in ontological individuals, as in
By bank transfer in advance (payment method) for gr:ByBankTransferInAdvance
The latter should definitely not irritate human consumers, for it provides
context; the former is in my judgment the best way of indicating cardinality
recommendations in OWL, since the OWL cardinality constructs don't cover what
is needed, yet I have to be able to tell modelers the intended cardinality. It
is not nonsensical, as you state, as many users of GR have confirmed.
> Moreover, Web clients most of the time display raw data (in a nice way)
> extracted from databases. For instance, a Wikipedia article displays a nice
> readable title, which is exactly the raw data that is found in a column of a
> database. Of course, you can decide that you won't use rdfs:label for human
> readable text and reserve another property for that (eg, dc:title), but you
> cannot decide how others will use your data and they may have a preference
> for the rdfs:label. As a matter of fact, rdfs:label is commonly used for
> showing people a nice readable piece of text in natural language.
I was stressing that SW apps that aim at real people will have to use
sophisticated methods for choosing the proper label for data elements anyway;
using the raw rdfs:label will not work for non-geeks in most cases. Most
ordinary people cannot process data, just information.
>
> Now, let's imagine I have a "product browser" which aggregates information
> about products found on the Web, leveraging the GoodRelations vocabulary and
> possibly other vocabularies. It may display the products in a table and have
> a column for "product type", which displays the class of the product. There
> are chances that the client will display the rdfs:label of the class as the
> "product type", which in the case of GoodRelations would look sibylline to a
> casual reader, with camel-toed text and nonsensical information about arity.
Nobody except for very specialized analysts will ever want to use a product
browser that presents raw RDF data.
>
> Moreover, with such practice, how can you provide labels in multiple
> languages? Paymentmethod is not even an English word!
The choice of labels for information consumers cannot be solved by the creator
of the vocabulary, because that depends on the context (e.g. audience) in which
the results will be displayed.
This is independent of the question of translations. A good ontology makes
good (context-independent, lasting, cross-cultural) choices regarding the
categories of things. The linguistic representation of these categories in
a specific context is a completely different story.
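To illustrate what I mean by context-dependent label choice, here is a minimal
Python sketch. It assumes an application keeps its own audience-specific labels
and only falls back to the vocabulary's rdfs:label when it has none. The
function name, the two dictionaries and the "Seller" label are hypothetical and
not part of GoodRelations.

```python
def choose_label(term, audience_labels, vocabulary_labels, lang="en"):
    """Pick a display label for a vocabulary term (hypothetical helper)."""
    # 1. A label the application author chose for this audience wins.
    if term in audience_labels:
        return audience_labels[term]
    # 2. Otherwise use the vocabulary's rdfs:label in the requested language.
    by_lang = vocabulary_labels.get(term, {})
    if lang in by_lang:
        return by_lang[lang]
    # 3. Last resort: the local name of the term.
    return term.split(":", 1)[-1]


# Assumed example data: one vocabulary label, one audience-specific override.
vocabulary = {"gr:BusinessEntity": {"en": "Business entity"}}
shop_frontend = {"gr:BusinessEntity": "Seller"}

print(choose_label("gr:BusinessEntity", shop_frontend, vocabulary))
print(choose_label("gr:BusinessEntity", {}, vocabulary))
```

The point of the ordering is that the vocabulary label is only a fallback; the
party that knows the audience decides first.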
>> But since this class is so frequently used, I want to change it to
>> simply gr:Location while remaining as much of backward compatibility
>> as possible; that is the background of the pattern I suggested.
>
> Ouch! I'm afraid amateur Linked Data producers who are searching for terms in
> a SemWeb search engine will find gr:Location very appropriate for *any*
> location. As a consequence, it will be inferred that all locations recorded
> in geonames are selling something! The Semantic Web will break and bring in
> its downfall the World Wide Web and the Internet, then the end of the world...
>
First, it does no harm if he or she uses gr:Location for that purpose -
there is no contradiction; any place or area in the universe can be said to be
an instance of gr:Location.
Second, I cannot solve the problem of
- amateur linked data producers in general and
- the unsatisfying state of search technology for ontologies and ontology
elements.
The most important audience to cater for nowadays is Web developers who want
to add RDFa to existing sites. Learn from Facebook and their findings re OGP.
>> Well, in my case that would mean I cannot change a)
>> gr:LocationOfSalesOrServiceProvisioning to gr:Location b)
>> gr:ProductOrServicesSomeInstancesPlaceholder to gr:SomeItems and c)
>> gr:ActualProductOrServiceInstance gr:Individual
>
> Those names are horribly long but they have the merit of being little
> ambiguous, as opposed to gr:Individual. In FOAF, the names are very short,
> which certainly helps getting the vocabulary adopted but creates a
> considerable amount of misuses (foaf:img, foaf:mbox, ...). Moreover, these
> long names are easier to discover in keyword-based search engines because
> there is more contextual information to properly index and relate the words
> in the name.
>
I would put it differently: The initial long names were important for me to
develop a clean conceptual model, because other terms would have been much less
generic and much more industry-specific. The fact that you can use
GoodRelations across industries (jobs, restaurants, transportation, cars,
books, consulting, disposal, ...) is because I did not use the quick,
context-bound words for conceptual elements.
But in the three modifications I am planning, I think the gain in brevity is
much more relevant than the risk of wrong usage. Keep in mind that even long
names do not prevent wrong usage.
Basically, I am evaluating only three changes (not yet confirmed with important
stakeholders):
gr:ActualProductOrServiceInstance --> gr:Individual
gr:ProductOrServicesSomeInstancesPlaceholder --> gr:SomeItems
gr:LocationOfSalesOrServiceProvisioning --> gr:Location
The former two are always used as additional classes, so their IDs will always
be in context:
foo:myHammer a <http://www.productontology.org/id/Hammer>, gr:Individual.
foo:someHammers a <http://www.productontology.org/id/Hammer>, gr:SomeItems.
Even I have to look up the GoodRelations Reference for the correct syntax from
time to time, so there is a real need for improvement.
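Should the three renamings go ahead, a data consumer could treat old and new
names alike with a simple lookup. The following Python sketch is only an
illustration on the consumer side; it is not the backward-compatibility pattern
for the ontology itself, and the function name is hypothetical. The mapping
reflects the three changes under evaluation, which are not yet confirmed.

```python
# Proposed renamings (under evaluation, not confirmed): old name -> short name.
RENAMED = {
    "gr:ActualProductOrServiceInstance": "gr:Individual",
    "gr:ProductOrServicesSomeInstancesPlaceholder": "gr:SomeItems",
    "gr:LocationOfSalesOrServiceProvisioning": "gr:Location",
}


def normalize_class(curie):
    """Return the proposed short name for an old class name, else the input unchanged."""
    return RENAMED.get(curie, curie)


print(normalize_class("gr:ActualProductOrServiceInstance"))
print(normalize_class("gr:BusinessEntity"))
```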
>> As said, I am considering to change the formatting from camel word to
>> non-camel style but keep the cardinality and class membership info
>> for developers. The issue of several languages is, in theory, a nice
>> feature, but extremely difficult to implement in six-sigma quality
>> due to the differences in connotations and semantic granularity of
>> natural languages. Having second-class translations would do more
>> harm than good, in my opinion. The only reliable translations I could
>> provide easily would be German, but that would really not increase
>> adoption significantly - most German Web developers speak English.
>
> You do not need to make the translations yourself. Find fluent translators or
> expert linguists.
I do not know whether you have ever tried to get sufficiently precise
translations for rather abstract ideas.
You would need to get at least two independent translations for each language
and then evaluate the differences.
BTW, I am not saying there is no need for translations, but before the
translations could be part of the official spec, they would have to be
extremely reliable.
It's no problem if someone on the Web publishes an RDF graph of French labels
for GoodRelations, even if it is not 100% accurate.
Have a look at 30 years of terminology research (e.g. http://www.termnet.org/)
or google for Eugen Wuester.
>> Snippets or Yahoo SearchMonkey will never see the vocabulary labels,
>> only the person configuring the generation of data.
>
> Google Rich Snippets don't show the labels because it is specifically tuned
> for GoodRelations. But a generic tool which aggregates information from
> various sources using various vocabularies has to make a generic assumption
> on what to display. rdfs:label is what is often chosen by generic tools to be
> shown to people.
I doubt the interaction with RDF data on a Web scale will be a simple
modification of the browser paradigm of HTML content. Pivot-style approaches
are IMO pointing in the right direction, but again, you will need a hard-coded or
pretty intelligent additional layer in between the human and the data, and
selecting the proper name for a piece of data will be among the challenges. A
simple regex on the labels from the vocabulary will be the smallest obstacle of
all.
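To show how small that obstacle is, here is a minimal Python sketch that
separates the display text from a trailing hint in parentheses. It assumes the
label conventions described earlier in this mail: a cardinality hint such as
(0..*) on properties and a class-type hint such as (payment method) on
individuals. The function name and the exact patterns are hypothetical; labels
that follow other conventions would need different rules.

```python
import re

# Trailing hint in parentheses, e.g. " (0..*)" or " (payment method)".
_HINT = re.compile(r"\s*\(([^()]*)\)\s*$")
# Cardinality hints of the form min..max, e.g. 0..1, 1..*, 0..*
_CARDINALITY = re.compile(r"^\d+\.\.(\d+|\*)$")


def split_label(label):
    """Return (display_text, cardinality, class_hint) for a label string."""
    match = _HINT.search(label)
    if not match:
        return label.strip(), None, None
    text = label[:match.start()].strip()
    hint = match.group(1).strip()
    if _CARDINALITY.match(hint):
        return text, hint, None
    return text, None, hint


print(split_label("serial number (0..*)"))
print(split_label("By bank transfer in advance (payment method)"))
print(split_label("Business entity"))
```

A client that wants only the natural-language part takes the first element; a
developer tool can still show the cardinality or class hint.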
I don't think that we as LOD / SW researchers already know how to implement
the larger vision, but it will for sure require a lot more sweat, more
creativity, and more cross-discipline effort than many seem to assume.
Best
Martin