XML and content negotiation

Knud Hinnerk Möller Thu, 25 Jun 2009 10:09:55 -0700

Hi Martin,

On 25.06.2009, at 17:44, Martin Hepp (UniBW) wrote:

Hi all:
After about two months of helping people generate RDF/XML metadata for their businesses using the GoodRelations annotator [1], I have quite some evidence that the current best practices of using .htaccess are a MAJOR bottleneck for the adoption of Semantic Web technology.
Just some data:
- We have several hundred entries in the annotator log - most people spend 10 or more minutes to create a reasonable description of themselves.
- Even though they all operate some sort of Web sites, less than 30% of them manage to upload/publish a single *.rdf file in their root directory.
- Of those 30%, only a fraction manage to set up content negotiation properly, even though we provide a step-by-step recipe.

These are interesting statistics, maybe you want to blog about them or publish them in some other way?

The effects are
- URIs that are not dereferenceable,
- incorrect media types, and
- other problems.
When investigating the causes and trying to help people, we encountered a variety of configurations and causes that we did not expect. It turned out that helping people manage just this tiny step of publishing Semantic Web data would turn into a full-time job for 1-2 administrators.
Typical causes of problems are
- Lack of privileges for .htaccess (many cheap hosting packages give limited or no access to .htaccess)
- Users without Unix background had trouble naming a file so that it begins with a dot
- Microsoft IIS requires completely different recipes
- Many users have access just at a CMS level

Bottom line:
- For researchers in the field, it is a doable task to set up an Apache server so that it serves RDF content according to current best practices.
- For most people out there in reality, this is regularly a prohibitively difficult task, both because of a lack of skills and because of a variety of technical environments that turns what is easy at the textbook level into an engineering challenge.

For the cases where people still want to serve RDF documents, it would be neat if various CMSes had a simple way of handling content negotiation. What I'm thinking of is e.g. a module for Drupal which would allow the Drupal admin to specify that, if RDF/XML for node X is requested (a page), serve RDF document Y. The content negotiation would be handled by PHP code in the module, hence no fiddling with .htaccess required.
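To make the idea concrete, here is a rough sketch of what such a module might do. It is written in Python rather than PHP purely for illustration, and everything in it is an assumption: the mapping table `RDF_FOR_PAGE`, the paths in it, and the function names are hypothetical and not part of Drupal or any existing module. It only inspects the Accept header and decides which document to hand back.

```python
def parse_accept(header):
    """Parse an HTTP Accept header into (media_type, q) pairs, best first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs.append((media_type, q))
    # Stable sort: equal q-values keep the order the client sent them in.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


# Hypothetical table maintained by the CMS admin: page path -> RDF document.
RDF_FOR_PAGE = {"/node/1": "/files/node1.rdf"}


def negotiate(path, accept_header):
    """Return (document_to_serve, content_type) for a requested page."""
    rdf_doc = RDF_FOR_PAGE.get(path)
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "application/rdf+xml" and rdf_doc:
            return rdf_doc, "application/rdf+xml"
        if media_type in ("text/html", "application/xhtml+xml", "text/*", "*/*"):
            return path, "text/html"
    # Nothing matched, or no RDF document is registered: serve the page itself.
    return path, "text/html"
```

A request for /node/1 with "Accept: application/rdf+xml" would get the registered RDF document, while a browser sending "text/html" first would get the normal page. A real module would also need to send a "Vary: Accept" header and probably a redirect rather than serving the file in place, which this sketch leaves out.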

As a consequence, we will modify our tool so that it generates "dummy" RDFa code with span/div that *just* represents the meta-data without interfering with the presentation layer. That can then be inserted as code snippets via copy-and-paste to any XHTML document.

I like it! It's similar to what our Shift tool [2] does for other kinds of data. However, this might lead to other problems: many CMSes only allow a subset of HTML in their input forms, so some of the RDFa could get lost. I remember this was a problem with Blogger in the past (not sure if this problem persists).
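For illustration, a "dummy" RDFa snippet of the kind described could be generated roughly as follows. This is only a sketch under assumptions, not the annotator's actual output: the function name `rdfa_snippet` and the example values are made up, and the GoodRelations terms gr:BusinessEntity and gr:legalName are used as plausible examples. The empty span elements carry their values in the content attribute, so nothing is rendered on the page.

```python
from html import escape

# GoodRelations namespace, bound to the "gr" prefix in the snippet.
GR_NS = "http://purl.org/goodrelations/v1#"


def rdfa_snippet(subject, rdf_type, literals):
    """Build an invisible RDFa block: one div for the subject, one empty
    span per (property, value) pair, with the value in @content."""
    lines = [
        '<div xmlns:gr="%s" about="%s" typeof="%s">'
        % (GR_NS, escape(subject, quote=True), escape(rdf_type, quote=True))
    ]
    for prop, value in literals:
        lines.append(
            '  <span property="%s" content="%s"></span>'
            % (escape(prop, quote=True), escape(value, quote=True))
        )
    lines.append("</div>")
    return "\n".join(lines)


# Hypothetical example data, for illustration only.
snippet = rdfa_snippet(
    "#company",
    "gr:BusinessEntity",
    [("gr:legalName", "Hepp & Co.")],
)
print(snippet)
```

Whether such a snippet survives being pasted into a CMS input form depends on which elements and attributes that form lets through, so it would be worth testing against the target system first.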


Cheers,
Knud

[2] http://kantenwerk.org/shift/



Any opinions?

Best
Martin

[1]  http://www.ebusiness-unibw.org/tools/goodrelations-annotator/

Danny Ayers wrote:

Thank you for the excellent questions, Bill.

Right now IMHO the best bet is probably just to pick whichever format
you are most comfortable with (yup "it depends") and use that as the
single source, transforming perhaps with scripts to generate the
alternate representations for conneg.

As far as I'm aware we don't yet have an easy templating engine for
RDFa, so I suspect having that as the source is probably a good choice
for typical Web applications.

As mentioned already GRDDL is available for transforming on the fly,
though I'm not sure of the level of client engine support at present.
Ditto providing a SPARQL endpoint is another way of maximising the
surface area of the data.

But the key step has clearly been taken, that decision to publish data
directly without needing the human element to interpret it.

I claim *win* for the Semantic Web, even if it'll still be a few years
before we see applications exploiting it in a way that provides real
benefit for the end user.

my 2 cents.

Cheers,
Danny.


--
--------------------------------------------------------------
martin hepp
e-business & web science research group
universitaet der bundeswehr muenchen

e-mail:  [email protected]
phone:   +49-(0)89-6004-4217
fax:     +49-(0)89-6004-4620
www:     http://www.unibw.de/ebusiness/ (group)
       http://www.heppnetz.de/ (personal)
skype:   mfhepp twitter: mfhepp

Check out the GoodRelations vocabulary for E-Commerce on the Web of Data!
========================================================================


Webcast:
http://www.heppnetz.de/projects/goodrelations/webcast/

Talk at the Semantic Technology Conference 2009: "Semantic Web-based E-Commerce: The GoodRelations Ontology"

http://tinyurl.com/semtech-hepp

Tool for registering your business:
http://www.ebusiness-unibw.org/tools/goodrelations-annotator/

Overview article on Semantic Universe:
http://tinyurl.com/goodrelations-universe

Project page and resources for developers:
http://purl.org/goodrelations/

Tutorial materials:

Tutorial at ESWC 2009: The Web of Data for E-Commerce in One Day: A Hands-on Introduction to the GoodRelations Ontology, RDFa, and Yahoo! SearchMonkey


http://www.ebusiness-unibw.org/wiki/GoodRelations_Tutorial_ESWC2009






-------------------------------------------------
Knud Möller, MA
+353 - 91 - 495086
Smile Group: http://smile.deri.ie
Digital Enterprise Research Institute
  National University of Ireland, Galway
Institiúid Taighde na Fiontraíochta Digití
  Ollscoil na hÉireann, Gaillimh

Re: .htaccess a major bottleneck to Semantic Web adoption / Was: Re: RDFa vs RDF/XML and content negotiation
