Workshop: Why, Where and How? of Linked Data

2011-01-20 Thread John Goodwin
Hi all,

 

The DNF Expert Group is pleased to announce the publication of the
agenda for the Why, Where and How? of Linked Data workshop being run
in conjunction with UK Location and the Chartered Institute of IT. 

 

Registration is now open through this site at
http://www.dnf.org/events/register

 

The event takes place in London on 10 February 2011 and is free of charge.
It is the follow-on event to the original workshop held in September
2010.

 

As the venue imposes some strict limits on us as to audience size,
places will be allocated on a strictly first come, first served basis.

 

John

Dr John Goodwin 
Research Scientist, Research, Ordnance Survey
Adanac Drive, SOUTHAMPTON, UK, SO16 0AS  Phone: +44 (0) 23 8005
5761 
 www.ordnancesurvey.co.uk| john.good...@ordnancesurvey.co.uk 



Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Nathan

Alan Ruttenberg wrote:

On Wed, Jan 19, 2011 at 4:45 PM, Nathan nat...@webr3.org wrote:

David Wood wrote:

On Jan 19, 2011, at 10:59, Nathan wrote:


ps: as an illustration of how engrained URI normalization is, I've
capitalized the domain names in the to: and cc: fields, I do hope the mail
still comes through, and hope that you'll accept this email as being sent to
you. Hopefully we'll also find this mail in the archives shortly at
htTp://lists.W3.org/Archives/Public/public-lod/2011Jan/ - Personally I'd
hope that any statements made using these URIs (asserted by man or machine)
would remain valid regardless of the (incorrect?-)casing.


Heh.  OK, I'll bite.  Domain names in email addressing are defined in IETF
RFC 2822 (and its predecessor RFC 822), which defers the interpretation to
RFC 1035 (Domain names - implementation and specification).  RFC 1035
section 2.3.3 states that domain names in DNS, and therefore in (E)SMTP, are
to be compared in a case-insensitive manner.

As far as I know, the W3C specs do not so refer to RFC 1035.


And I'll bite in the other direction, why not treat URIs as URIs? why go
against both the RDF Specification [1] and the URI specification when they
say /not/ to encode permitted US-ASCII characters (like ~ %7E)? why force
case-sensitive matching on the scheme and domain on URIs matching the
generic syntax when the specs say must be compared case insensitively? and
so on and so forth.


[AR]
Which specs?


The various URI/IRI specs and previous revisions of.


http://www.w3.org/TR/REC-xml-names/#NSNameComparison

URI references identifying namespaces

..

In a namespace declaration, the URI reference is

..

The URI references below are all different for the purposes of identifying
namespaces

..

The URI references below are also all different for the purposes of
identifying namespaces

..

So here is another spec that *explicitly* disagrees with the idea that URI
normalization should be a built-in processing.


As far as I can see, that's only for a URI reference used within a 
namespace, and does not govern usage or normalization when you join the 
URI reference up with the local name to make the full URI.


Out of interest, where is that process defined? I was looking for it the 
other day - for instance in the quoted specification we have the example:


<edi:price xmlns:edi='http://ecommerce.example.org/schema'
units='Euro'>32.18</edi:price>


Where's the bit of the XML specification which says you join them up by 
concatenating 'http://ecommerce.example.org/schema' with #(?assumed?) 
and 'Euro' to get 'http://ecommerce.example.org/schema#Euro'?
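For what it is worth, a namespace-aware XML parser does not build a single URI at all. A minimal sketch using Python's standard xml.etree.ElementTree (chosen here purely for illustration) shows the namespace name and local name kept together as a pair, while 'Euro' stays an ordinary attribute value. Concatenating namespace and local name into one URI is something RDF/XML adds on top, not something XML Namespaces itself defines.

```python
import xml.etree.ElementTree as ET

doc = (
    "<edi:price xmlns:edi='http://ecommerce.example.org/schema' "
    "units='Euro'>32.18</edi:price>"
)
elem = ET.fromstring(doc)

# ElementTree writes the expanded name as "{namespace}local".
# It is a (namespace name, local name) pair, not a concatenated URI.
tag = elem.tag

# 'Euro' is just the value of the unqualified attribute "units".
units = elem.attrib["units"]

print(tag)
print(units, elem.text)
```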


And finally, this is why I specifically asked if the non-normalization 
of RDF URI References had XML Namespace heritage, which had then 
filtered down through OWL, SPARQL and RIF.



[AR] More to document, please: Which data is being junked and scrapped?


will document, but essentially every statement made using a non 
normalized URI when other statements are also being made about the 
same resource using normalized URIs - the two most common cases for 
this will be when people are using CMS systems and enter their domain 
name as uppercase in some admin, only to have that filter through to 
URIs in serialized RDF/RDFa, and where bugs in software have led to 
inconsistent URIs over time (for instance where % encoding has been 
fixed, or a :80 has been removed from a URI).
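As a rough illustration of the cases listed above, the following sketch indexes statements by the exact subject string, which is how simple string comparison behaves. The URIs, property names and the helper group_by_subject are all made up for the example.

```python
from collections import defaultdict

def group_by_subject(triples):
    """Index (subject, predicate, object) triples by the exact subject string."""
    index = defaultdict(list)
    for s, p, o in triples:
        index[s].append((p, o))
    return dict(index)

triples = [
    ("http://example.org/doc", "dc:title", "Report"),
    ("http://EXAMPLE.org/doc", "dc:creator", "Alice"),       # uppercase host
    ("http://example.org:80/doc", "dc:date", "2011-01-20"),  # explicit default port
    ("http://example.org/%7Euser", "dc:title", "Home"),      # encoded tilde
    ("http://example.org/~user", "dc:creator", "Bob"),
]

index = group_by_subject(triples)
for subject, statements in index.items():
    print(subject, statements)
```

Each spelling ends up under its own key, so a query for one spelling sees none of the statements made with the others.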



[AR] Hmm. Are you suggesting that the behavior of libraries and clients
should have precedence over specification? My view is that one first looks
to specifications, and then only if specifications are poor or do not speak
to the issue do we look at existing behavior.


Yes I am, that specification should standardize the behaviour of 
libraries and clients - the level of normalization in URIs published, 
consumed or used by these tools is often determined by non sem web stack 
components, and the sem web components are blocked from normalizing 
these should-not-be-differing-URIs by the sem web specifications.



[AR] I think there are many ways to lose in this scenario. For instance, if
the server redirects then the base is the last in the chain of redirects.
http://tools.ietf.org/html/rfc3986#page-29, 5.1.3. Base URI from the
Retrieval URI. My conclusion - don't engineer this way.


That would be my conclusion too, but as RDF(a) moves in to the realms of 
the CMS systems and out of the hands of the sem web community, it will 
be increasingly engineered this way, it's a very common pattern when 
working with (X)HTML (allows people to test locally or on dev servers 
without changing the content).



Further, essentially all RDFa ever encountered by a browser has the casing
on all URIs in href and src, and all these which are resolved, automatically
normalized - so even if you set the base to htTp://EXAMPLE.org/ or use it
in a URI, browser tools, extensions, and js based libraries will only ever
see the normalized URIs (and thus be incompatible with the rest 

Nice domain name for the take

2011-01-20 Thread Rinke Hoekstra
Hi all,

Last year, Christophe Gueret and I registered the domain name 
linkeddatamarketplace.com, thinking we might find the time to set up a market 
place for ... eh... linked data. The idea was to bring together publishers and 
users of data (along the lines of http://www.datamarketplace.com).

As it turned out we didn't have the time to do any of this (apart from setting 
up a 'coming soon' picture), and now the domain name is about to expire.

Question: if anyone's interested in using it, please let us know and we can 
transfer it to your name. The domain is registered at geni.com and will expire 
in 2 months.

Cheers,
Rinke

---
Dr Rinke Hoekstra

AI Department |   Leibniz Center for Law
Faculty of Sciences   |   Faculty of Law
Vrije Universiteit|   Universiteit van Amsterdam
De Boelelaan 1081a|   Kloveniersburgwal 48  
1081 HV Amsterdam |   1012 CX  Amsterdam
+31-(0)20-5987752 |   +31-(0)20-5253497 
r.j.hoeks...@vu.nl|   hoeks...@uva.nl   

Homepage: http://www.few.vu.nl/~hoekstra







Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Kingsley Idehen

On 1/19/11 11:27 PM, Alan Ruttenberg wrote:



On Wed, Jan 19, 2011 at 11:11 AM, Kingsley Idehen 
kide...@openlinksw.com wrote:


On 1/19/11 10:59 AM, Nathan wrote:

htTp://lists.W3.org/Archives/Public/public-lod/2011Jan/ -
Personally I'd hope that any statements made using these URIs
(asserted by man or machine) would remain valid regardless of the
(incorrect?-)casing. 

Okay for Data Source Address Ref. (URL), no good for Entity (Data
Item or Data Object) Name Ref., bar system specific handling via
IFP property or owl:sameAs :-)


Kingsley, same for you as Nathan. To what specification do you refer 
for the definitions and behavior of:

 - Data source address ref
 - Entity
 - Statement.

-Alan


Alan,

My response is purely about managing Identifiers that are used as 
functional unambiguous Name or Address References. Not quoting a W3C 
spec. Basically, expressing a view based on my understanding of what's 
practical.


A system (e.g. a database or client app.) can (should) make a decision 
about how it handles resolvable Identifiers when used as Name or Address 
references.


Kingsley





-- 


Regards,

Kingsley Idehen 
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
















Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Dave Reynolds
On Wed, 2011-01-19 at 21:45 +, Nathan wrote: 
 David Wood wrote:
  On Jan 19, 2011, at 10:59, Nathan wrote:
  ps: as an illustration of how engrained URI normalization is, I've 
  capitalized the domain names in the to: and cc: fields, I do hope the mail 
  still comes through, and hope that you'll accept this email as being sent 
  to you. Hopefully we'll also find this mail in the archives shortly at 
  htTp://lists.W3.org/Archives/Public/public-lod/2011Jan/ - Personally I'd 
  hope that any statements made using these URIs (asserted by man or 
  machine) would remain valid regardless of the (incorrect?-)casing.
  
  Heh.  OK, I'll bite.  Domain names in email addressing are defined in IETF 
  RFC 2822 (and its predecessor RFC 822), which defers the interpretation to 
  RFC 1035 (Domain names - implementation and specification).  RFC 1035 
  section 2.3.3 states that domain names in DNS, and therefore in (E)SMTP, 
  are to be compared in a case-insensitive manner.
  
  As far as I know, the W3C specs do not so refer to RFC 1035.
 
 And I'll bite in the other direction, why not treat URIs as URIs? 

It seems to me the underlying question here is whether aliasing of URIs
(whether they dereference to the same resource) should imply semantic
equality (i.e. use as an identifier in a web logic language like RDF or
OWL).

The position so far in RDF, OWL and RIF has been "no".

As far as the specifications for those languages are concerned a URI is
just a convenient spelling for an identifier and they require
comparison of identifiers to be stable and context-independent. 
Those specs don't constrain what you get back from dereferencing some
URI U to include statements about U.

The URI spec (rfc3986[1]) does allow this usage. In particular Section 6
Normalization and Comparison says:

   URI comparison is performed for some particular purpose.  Protocols
   or implementations that compare URIs for different purposes will
   often be subject to differing design trade-offs in regards to how
   much effort should be spent in reducing aliased identifiers.  This
   section describes various methods that may be used to compare URIs,
   the trade-offs between them, and the types of applications that might
   use them.

and

   We use the terms "different" and "equivalent" to describe the
   possible outcomes of such comparisons, but there are many
   application-dependent versions of equivalence.

While RDF predates this spec it seems to me that the RDF usage remains
consistent with it. The purpose of comparison in RDF is different from
that of cache retrieval of web pages or message delivery of email.

This quote also makes clear that there is no single definitive
normalization. There are different levels of normalization possible
depending on your needs. 

Earlier you pointed out that the place where the URI specs and RDF do
collide is in resolving relative URIs into absolute URIs. Again rfc3986
does not preclude the RDF usage. Section 5.2.1 says:

Normalization of the base URI, as described in Sections 6.2.2 and 
   6.2.3, is optional.

So I claim that in terms of formal published specifications:
(1) RDF, OWL and RIF do not require any normalization of URIs (beyond
the character encoding level) and compare URIs by simple string
comparison.
(2) This usage is *not* precluded by the URI specs, at least by 3986
which sets the current framework for the application of scheme-specific
specs.

** Now we turn to linked data ...

As we've already mentioned :) there are no specs for linked data so we
move onto more subjective grounds.

The linked data convention is that dereferencing some URI U in your RDF
document should return information about U, including further onward
links. So if data set A spells a URI hTTp://example.com/foo but the data
you get from dereferencing that URI talks only about
http://example.com/foo then someone has a problem somewhere. The
question is who, where and how to fix it.
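The mismatch can be sketched with plain Python data. The triples below stand in for whatever a server might return, and the helper mentions is a hypothetical name, not part of any RDF library. It checks, by exact string comparison, whether the requested URI appears in the returned data at all.

```python
def mentions(uri, triples):
    """Return True if `uri` appears, character for character, as a subject or object."""
    return any(uri in (s, o) for s, _p, o in triples)

# Hypothetical data returned when dereferencing either spelling of the URI.
returned = [
    ("http://example.com/foo", "rdfs:label", "Foo"),
    ("http://example.com/foo", "rdfs:seeAlso", "http://example.com/bar"),
]

print(mentions("hTTp://example.com/foo", returned))  # False: alias not described
print(mentions("http://example.com/foo", returned))  # True
```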

It seems to me that this is primarily an issue with publishing, and a
little about being sensible about how you pass on links. If I'm going to
put up some linked data I should mint normalized URIs; I should use the
same spelling of the URIs throughout my data; I'll make sure those URIs
dereference and that the data that comes back is stable and useful. If
someone else refers to my resources using an aliased URI (such as a
different case for the protocol) and makes statements about those
aliases then they have simply made a mistake.

To make sure that dereference returns what I expect, independent of
aliasing, then I should publish data with explicit base URIs (or just
absolute URIs). Publishing with relative URIs and no base is a recipe
for having your data look different from different places. Just don't do
it. No surprise there.
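A small sketch of why this happens, using urljoin from Python's standard library to resolve the same relative reference against different retrieval locations. The host names and paths are invented for the example.

```python
from urllib.parse import urljoin

relative = "people/alice#me"

# The same document fetched from two different places (hypothetical URLs).
retrieved_from = [
    "http://example.org/data/",
    "http://mirror.example.net/copy/data/",
]

# With no declared base, the retrieval URI is used as the base.
resolved = [urljoin(base, relative) for base in retrieved_from]
for uri in resolved:
    print(uri)

# With an explicit base, the result no longer depends on where the copy lives.
explicit_base = "http://example.org/data/"
stable = [urljoin(explicit_base, relative) for _ in retrieved_from]
print(stable[0] == stable[1])
```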

None of this requires us to force URI normalization into the heart of
identifier comparison in RDF itself. It is not a necessary solution and
it is not a sufficient one because there is no universal 

Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Nathan

Hi Dave,

Generally I agree, will address a few specific points in line (just to 
address them) then summarize my intended goals at the end (being the 
substance of the mail).


Dave Reynolds wrote:

The URI spec (rfc3986[1]) does allow this usage. In particular Section 6
Normalization and Comparison says:

   URI comparison is performed for some particular purpose.  Protocols
   or implementations that compare URIs for different purposes will
   often be subject to differing design trade-offs in regards to how
   much effort should be spent in reducing aliased identifiers.  This
   section describes various methods that may be used to compare URIs,
   the trade-offs between them, and the types of applications that might
   use them.

and

   We use the terms "different" and "equivalent" to describe the
   possible outcomes of such comparisons, but there are many
   application-dependent versions of equivalence.

While RDF predates this spec it seems to me that the RDF usage remains
consistent with it. The purpose of comparison in RDF is different from
that of cache retrieval of web pages or message delivery of email.


Indeed, though I also read:

   For all URIs, the hexadecimal digits within a percent-encoding
   triplet (e.g., %3a versus %3A) are case-insensitive and therefore
   should be normalized to use uppercase letters for the digits A-F.

   When a URI uses components of the generic syntax, the component
   syntax equivalence rules always apply; namely, that the scheme and
   host are case-insensitive and therefore should be normalized to
   lowercase...
   - http://tools.ietf.org/html/rfc3986#section-6.2.2.1

And took the "For all" and "always" to literally mean "for all" and 
"always".


Unsure where this leaves things, and which takes precedence.
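For concreteness, the case normalization quoted from section 6.2.2.1 could be sketched as below. This is only one possible reading, written against Python's standard urllib.parse; the function name case_normalize is made up, and rebuilding the URI with urlunsplit drops a trailing empty "?" or "#", which a careful implementation would need to preserve.

```python
import re
from urllib.parse import urlsplit, urlunsplit

_PCT_TRIPLET = re.compile(r"%[0-9a-fA-F]{2}")

def case_normalize(uri):
    """Lowercase scheme and host; uppercase hex digits in percent-encoded triplets."""
    scheme, netloc, path, query, fragment = urlsplit(uri)
    # Userinfo (before "@") is case-sensitive, so only the host:port part is lowered.
    userinfo, at, hostport = netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    rebuilt = urlunsplit((scheme.lower(), netloc, path, query, fragment))
    return _PCT_TRIPLET.sub(lambda m: m.group(0).upper(), rebuilt)

print(case_normalize("htTp://lists.W3.org/Archives/Public/public-lod/2011Jan/"))
print(case_normalize("HTTP://Example.ORG/a%3ab"))
```

Note that the path is left untouched apart from the percent-encoded triplets, since path case is significant.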


This quote also makes clear that there is no single definitive
normalization. There are different levels of normalization possible
depending on your needs. 


agree


So I claim that in terms of formal published specifications:
(1) RDF, OWL and RIF do not require any normalization of URIs (beyond
the character encoding level) and compare URIs by simple string
comparison.


One potential issue on the % encoding, clarified further down.


(2) This usage is *not* precluded by the URI specs, at least by 3986
which sets the current framework for the application of scheme-specific
specs.


Not 100% sure, but I'm tempted to agree with you; it would make sense not to 
preclude it.



As we've already mentioned :) there are no specs for linked data so we
move onto more subjective grounds.


Would be nice to get some specs at some point...


The linked data convention is that dereferencing some URI U in your RDF
document should return information about U, including further onward
links. So if data set A spells a URI hTTp://example.com/foo but the data
you get from dereferencing that URI talks only about
http://example.com/foo then someone has a problem somewhere. The
question is who, where and how to fix it.


agree, good way of putting it.

against both the RDF Specification [1] and the URI specification when 
they say /not/ to encode permitted US-ASCII characters (like ~ %7E)? 


Where did that example come from? 


   The encoding consists of... %-escaping octets that do not correspond
   to permitted US-ASCII characters.
   - http://www.w3.org/TR/rdf-concepts/#section-Graph-URIref

   For consistency, percent-encoded octets in the ranges of ALPHA
   (%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D), period (%2E),
   underscore (%5F), or tilde (%7E) should not be created by URI
   producers and, when found in a URI, should be decoded to their
   corresponding unreserved characters by URI normalizers.
   - http://tools.ietf.org/html/rfc3986#section-2.3

I read those quotes as saying do not encode permitted US-ASCII 
characters in RDF URI References.
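A sketch of the decoding step described in the RFC 3986 section 2.3 quote: percent-encoded octets that correspond to unreserved characters are decoded, everything else is left alone. The function name decode_unreserved and the example URI are assumptions for illustration only.

```python
import re
import string

# Unreserved characters per RFC 3986 section 2.3: ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
_PCT_TRIPLET = re.compile(r"%([0-9a-fA-F]{2})")

def decode_unreserved(uri):
    """Decode %XX triplets only when they encode an unreserved character."""
    def fix(match):
        char = chr(int(match.group(1), 16))
        # Reserved or non-ASCII octets must stay encoded, so return them unchanged.
        return char if char in UNRESERVED else match.group(0)
    return _PCT_TRIPLET.sub(fix, uri)

print(decode_unreserved("http://example.org/%7Euser/a%2fb"))
```

The encoded slash (%2f) is deliberately kept, because decoding a reserved character would change how the path is split.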



At what point have we suggested doing that?


As above

why 
force case-sensitive matching on the scheme and domain on URIs matching 
the generic syntax when the specs say must be compared case 
insensitively?


No, the specs do not say that, see above.


See the "For all" and "always" quote earlier on.

So use normalized URIs in the first place. 

...

RDF/OWL/RIF aren't designed the way they are because someone thought it
would be a good idea to allow such things to be used side by side or
because they *want* people to use denormalized URIs.

...

The point is that there is no single, simple, universal (i.e. across all
schemes) normalization algorithm that could be used.
The current approach gives stable, well-defined behaviour which doesn't
change as people invent new URI schemes. The RDF serializations give you
enough control to enable you to be certain about what URI you are
talking about. Job done.


Okay, I agree, and I'm really not looking to create a lot of work here, 
the general gist of what I'm hoping for is along the lines of:


  RDF Publishers MUST perform Case Normalization and Percent-Encoding 
Normalization on all 

Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread David Booth
On Thu, 2011-01-20 at 13:08 +, Dave Reynolds wrote:
[ . . . ]
 It seems to me that this is primarily an issue with publishing, and a
 little about being sensible about how you pass on links. If I'm going to
 put up some linked data I should mint normalized URIs; I should use the
 same spelling of the URIs throughout my data; I'll make sure those URIs
 dereference and that the data that comes back is stable and useful. If
 someone else refers to my resources using an aliased URI (such as a
 different case for the protocol) and makes statements about those
 aliases then they have simply made a mistake.
 
 To make sure that dereference returns what I expect, independent of
 aliasing, then I should publish data with explicit base URIs (or just
 absolute URIs). Publishing with relative URIs and no base is a recipe
 for having your data look different from different places. Just don't do
 it. 

This advice sounds like an excellent candidate for publication in a best
practices document.  And if it is merely best practice guidance, perhaps
that *is* something that the new RDF working group could address.



-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.




Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Nathan

David Booth wrote:

On Thu, 2011-01-20 at 13:08 +, Dave Reynolds wrote:
[ . . . ]

It seems to me that this is primarily an issue with publishing, and a
little about being sensible about how you pass on links. If I'm going to
put up some linked data I should mint normalized URIs; I should use the
same spelling of the URIs throughout my data; I'll make sure those URIs
dereference and that the data that comes back is stable and useful. If
someone else refers to my resources using an aliased URI (such as a
different case for the protocol) and makes statements about those
aliases then they have simply made a mistake.

To make sure that dereference returns what I expect, independent of
aliasing, then I should publish data with explicit base URIs (or just
absolute URIs). Publishing with relative URIs and no base is a recipe
for having your data look different from different places. Just don't do
it. 


This advice sounds like an excellent candidate for publication in a best
practices document.  And if it is merely best practice guidance, perhaps
that *is* something that the new RDF working group could address.


+1 from me, address at the publishing phase, allow at the consuming 
phase, keep comparison simple.





Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread William Waites
* [2011-01-20 14:29:35 +] Nathan nat...@webr3.org wrote:

]   RDF Publishers MUST perform Case Normalization and Percent-Encoding 
] Normalization on all URIs prior to publishing. When using relative URIs 
] publishers SHOULD include a well defined base using a serialization 
] specific mechanism. Publishers are advised to perform additional 
] normalization steps as specified by URI (RFC 3986) where possible.
] 
]   RDF Consumers MAY normalize URIs they encounter and SHOULD perform 
] Case Normalization and Percent-Encoding Normalization.
] 
]   Two RDF URIs are equal if and only if they compare as equal, 
] character by character, as Unicode strings.
] 
] For many reasons it would be good to solve this at the publishing phase, 
] allow normalization at the consuming phase (can't be precluded as 
] intermediary components may normalize), and keep simple case sensitive 
] string comparison throughout the stack and specs (so implementations 
] remain simple and fast.)
] 
] Does anybody find the above disagreeable?


Sounds about right to me, but what about port numbers,
http://example.org/ vs http://example.org:80/?
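Dropping a default port falls under what RFC 3986 calls scheme-based normalization, because it needs per-scheme knowledge. A minimal sketch, assuming only http and https are of interest; the table DEFAULT_PORTS and the function strip_default_port are hypothetical names.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: only these two schemes are handled; others pass through untouched.
DEFAULT_PORTS = {"http": 80, "https": 443}

def strip_default_port(uri):
    """Remove an explicit port when it equals the scheme's default port."""
    parts = urlsplit(uri)  # note: urlsplit reports the scheme in lowercase
    if parts.port is None or DEFAULT_PORTS.get(parts.scheme) != parts.port:
        return uri  # nothing to strip, return the input unchanged
    # Cut at the last colon so userinfo and IPv6 literals are preserved.
    netloc = parts.netloc.rsplit(":", 1)[0]
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))

print(strip_default_port("http://example.org:80/"))
print(strip_default_port("http://example.org:8080/"))
```

Because the table of default ports grows with every new scheme, this step is harder to pin down in a spec than case or percent-encoding normalization.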

-w

-- 
William Waites           mailto:w...@styx.org
http://eris.okfn.org/ww/ sip:w...@styx.org
F4B3 39BF E775 CF42 0BAB  3DF0 BE40 A6DF B06F FD45



Re: Nice domain name for the take

2011-01-20 Thread Rinke Hoekstra
Hi all,

Thanks for the responses! This call for applicants is now closed ;)

Cheers,
Rinke


PS Did I say geni.com? Surely I meant gandi.net ...


On 20 jan 2011, at 13:13, Rinke Hoekstra wrote:

 Hi all,
 
 Last year, Christophe Gueret and I registered the domain name 
 linkeddatamarketplace.com, thinking we might find the time to set up a 
 market place for ... eh... linked data. The idea was to bring together 
 publishers and users of data (along the lines of 
 http://www.datamarketplace.com).
 
 As it turned out we didn't have the time to do any of this (apart from 
 setting up a 'coming soon' picture), and now the domain name is about to 
 expire.
 
 Question: if anyone's interested in using it, please let us know and we can 
 transfer it to your name. The domain is registered at geni.com and will 
 expire in 2 months.
 
 Cheers,
 Rinke
 









How to declare in a web app's interface which kind of app/version/features and or interfaces or formats it exposes

2011-01-20 Thread Olivier Berger
Hi.

I'm considering the different options for embedding (with the
fewest modifications possible) in the HTML interface of a Web app
a description of which app it is and/or which interfaces it exposes, so
that this would be discoverable and lead to exploitation of such data
by SemWeb apps or existing harvesters.

Which SemWeb standards could be used to do so?

Thanks in advance.

Best regards,
-- 
Olivier BERGER olivier.ber...@it-sudparis.eu
http://www-public.it-sudparis.eu/~berger_o/ - OpenPGP-Id: 2048R/5819D7E8
Ingénieur Recherche - Dept INF
Institut TELECOM, SudParis (http://www.it-sudparis.eu/), Evry (France)




Re: How to declare in a web app's interface which kind of app/version/features and or interfaces or formats it exposes

2011-01-20 Thread Michael Hausenblas

Olivier,

 I'm considering the different options that could help embed (with
 slightest modifications possible) in the in HTML interface of a Web app,
 a description of which app it is and/or which interfaces it exposes, so
 that this would be discoverable and lead to exploitation of such data
 by SemWeb apps, or existing harvesters.

You might find my blog post 'Announcing Application Metadata on the Web
of Data' [1] along with the template [2] useful for this purpose.

Cheers,
  Michael

[1] http://webofdata.wordpress.com/2010/01/06/announcing-application-metadata
[2] http://lab.linkeddata.deri.ie/2010/res/web-app-metadata-template.html

-- 
Dr. Michael Hausenblas, Research Fellow
LiDRC - Linked Data Research Centre
DERI - Digital Enterprise Research Institute
NUIG - National University of Ireland, Galway
Ireland, Europe
Tel. +353 91 495730
http://linkeddata.deri.ie/
http://sw-app.org/about.html



 From: Olivier Berger olivier.ber...@it-sudparis.eu
 Date: Thu, 20 Jan 2011 16:42:16 +0100
 To: Linked Data community public-lod@w3.org
 Subject: How to declare in a web app's interface which kind of
 app/version/features and or interfaces or formats it exposes
 Resent-From: Linked Data community public-lod@w3.org
 Resent-Date: Thu, 20 Jan 2011 15:43:59 +
 
 Hi.
 
 I'm considering the different options that could help embed (with
 slightest modifications possible) in the in HTML interface of a Web app,
 a description of which app it is and/or which interfaces it exposes, so
 that this would be discoverable and lead to exploitation of such data
 by SemWeb apps, or existing harvesters.
 
 Which SemWeb standards could be used to do so ?
 
 Thanks in advance.
 
 Best regards,
 -- 
 Olivier BERGER olivier.ber...@it-sudparis.eu
 http://www-public.it-sudparis.eu/~berger_o/ - OpenPGP-Id: 2048R/5819D7E8
 Ingénieur Recherche - Dept INF
 Institut TELECOM, SudParis (http://www.it-sudparis.eu/), Evry (France)
 
 




Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Martin Hepp

Hi:

On 20.01.2011, at 15:40, Nathan wrote:


David Booth wrote:

On Thu, 2011-01-20 at 13:08 +, Dave Reynolds wrote:
[ . . . ]


To make sure that dereference returns what I expect, independent of
aliasing, then I should publish data with explicit base URIs (or just
absolute URIs). Publishing with relative URIs and no base is a recipe
for having your data look different from different places. Just don't do
it.

This advice sounds like an excellent candidate for publication in a best
practices document.  And if it is merely best practice guidance, perhaps
that *is* something that the new RDF working group could address.


+1 from me, address at the publishing phase, allow at the consuming  
phase, keep comparison simple.





I am not sure whether you are also talking of RDFa, but in case you  
do, I would like to add the following:


Our experience helping about 2,000 sites add  
GoodRelations via our form-based tools shows that


1. RDFa is in many cases the only viable way for people to publish RDF
2. They can often not control and not even predict the exact URI of  
the page that will contain the markup (imagine uncool URIs loaded  
with parameters etc.)


In those scenarios, relative URIs are essential.

We even recommend that people include an empty

   <div rel="foaf:page" resource=""></div>

at the proper position in the nesting so that there will be a link  
between the data entity and the page that contains it.
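The effect of the empty resource value can be sketched with standard URI resolution: an empty relative reference resolves to the base itself, so the publisher never needs to know the final page URI. The shop URL below is invented, and the sketch assumes the page declares no separate base, so the retrieval URI acts as the base.

```python
from urllib.parse import urljoin

# Hypothetical "uncool" page URI that the publisher cannot predict in advance.
page = "http://shop.example.com/item.php?id=42&session=abc"

# resource="" is an empty relative reference: it resolves to the page itself.
page_uri = urljoin(page, "")

# A fragment-only reference names something within that same page.
entity_uri = urljoin(page, "#product")

print(page_uri)
print(entity_uri)
```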


Martin





Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Nathan

Martin Hepp wrote:

On 20.01.2011, at 15:40, Nathan wrote:

David Booth wrote:

On Thu, 2011-01-20 at 13:08 +, Dave Reynolds wrote:
[ . . . ]


To make sure that dereference returns what I expect, independent of
aliasing, then I should publish data with explicit base URIs (or just
absolute URIs). Publishing with relative URIs and no base is a recipe
for having your data look different from different places. Just don't do
it.

This advice sounds like an excellent candidate for publication in a best
practices document.  And if it is merely best practice guidance, perhaps
that *is* something that the new RDF working group could address.


+1 from me, address at the publishing phase, allow at the consuming 
phase, keep comparison simple.


I am not sure whether you are also talking of RDFa, but in case you do, 
I would like to add the following:


Hi Martin,

Yes (re RDFa), see: http://webr3.org/urinorm/2 - all the browsers do the 
normalization so you can't even get to the non-normalized URI.


in a browser you'll note that all the URIs get normalized automatically, 
in that it's impossible to programmatically access the correct casing. 
That's a problem.


if you run it through the RDFa distiller at w3.org [2] you'll find:

  <htTp://WEBR3.org/urinorm/2> dc:creator <http://WEBR3.org/nathan#me> .

  <http://WEBR3.org/urinorm/2#example> dc:title "URI Normalization Example 2" .


note one of the URIs (the one which required relative path resolution) 
has the scheme normalised.


if you run it through check.rdfa.info you'll find that all the URIs are 
normalized. [3]


if you run it through sigma [4] you'll find everything has been 
normalized. You can also see an RDF view of this [5]


if you run it through URI Burner [6], you'll find that /some/ URIs have 
been normalized. It's also worth noting that this caused all kinds of 
problems - I ended up having to create a new resource at this point w/ 
some RDF  N3 to test URI Burner:


  http://webr3.org/urinorm/3

which led to the empty [7], then I figured I'd try [8], and if you click 
the creator ( htTp://WEBR3.org/nathan#me ), since in this case there's no 
normalization (note it was normalized in [6]), you get a 400 Bad Request [9].


and so on and so forth - far from ideal.

Best,

Nathan

[1] http://www.rdfabout.com/demo/validator/ (normalizes all RDF URIs)
[2] http://www.w3.org/2007/08/pyRdfa/
[3] http://check.rdfa.info/check?url=http://webr3.org/urinorm/2&version=1.0
[4] http://sig.ma/search?q=http://webr3.org/urinorm/2
[5] http://sig.ma/entity/e6a2c8319bb3bf21f4b4639216f114a4.rdf#this
[6] 
http://linkeddata.uriburner.com/about/html/http/webr3.org/urinorm/2%01this

[7] http://linkeddata.uriburner.com/about/html/http/webr3.org/urinorm/3
[8] http://linkeddata.uriburner.com/about/html/htTp://WEBR3.org/urinorm/3
[9] http://linkeddata.uriburner.com/about/html/htTp/WEBR3.org/nathan%01me
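
The mixed result reported for [2], where only the URI that went through
relative resolution had its scheme lower-cased, is easy to reproduce. The
following is a minimal sketch using Python's standard urllib.parse. It only
illustrates how a library can normalize as a side effect of resolution; it
is not a claim about how any of the tools listed above are implemented.

```python
from urllib.parse import urljoin, urlsplit

base = "htTp://WEBR3.org/urinorm/2"

# Resolving a relative reference re-assembles the URI from parsed parts;
# urllib lower-cases the scheme while parsing but leaves the host alone.
resolved = urljoin(base, "#example")
print(resolved)  # http://WEBR3.org/urinorm/2#example

# An absolute URI that is never parsed stays exactly as written.
absolute = "htTp://WEBR3.org/nathan#me"

# Plain string comparison treats differently-cased URIs as different names.
print(absolute == "http://webr3.org/nathan#me")  # False

# The parsed view exposes a lower-cased scheme and hostname, while
# netloc keeps the original casing.
parts = urlsplit(base)
print(parts.scheme, parts.hostname, parts.netloc)  # http webr3.org WEBR3.org
```

So whether a consumer sees the original or a partly normalized URI can
depend on which code path the URI happened to take.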



Re: How to declare in a web app's interface which kind of app/version/features and or interfaces or formats it exposes

2011-01-20 Thread Olivier Berger
On Thursday 20 January 2011 at 15:50 +, Michael Hausenblas wrote:
 Olivier,
 
  I'm considering the different options that could help embed (with the
  slightest modifications possible) in the HTML interface of a Web app,
  a description of which app it is and/or which interfaces it exposes, so
  that this would be discoverable and lead to exploitation of such data
  by SemWeb apps, or existing harvesters.
 
 You might find my blog post 'Announcing Application Metadata on the Web
 of Data' [1] along with the template [2] useful for this purpose.
 
 Cheers,
   Michael
 
 [1] 
 http://webofdata.wordpress.com/2010/01/06/announcing-application-metadata
 [2] http://lab.linkeddata.deri.ie/2010/res/web-app-metadata-template.html
 

Thanks for sharing this. The only critique I could have is about the use
of DOAP, where there's a confusion between a project (community) and a
piece of software (developed by that community) behind doap:Project,
IMHO... but that's a common problem with DOAP, which is counterbalanced
by its popularity (perfect model vs. available data).

I was thinking of something maybe less intrusive: adding RDFa to
existing apps may be too hard, as it requires changing their code (in
particular as some (X)HTML may not convert so easily to XHTML+RDFa).

Anything in the domain of HTML headers, maybe? Such headers could more
easily be added by quick-and-dirty patches or sysadmin configuration.

Thanks in advance.
-- 
Olivier BERGER olivier.ber...@it-sudparis.eu
http://www-public.it-sudparis.eu/~berger_o/ - OpenPGP-Id: 2048R/5819D7E8
Ingénieur Recherche - Dept INF
Institut TELECOM, SudParis (http://www.it-sudparis.eu/), Evry (France)




2nd CfP: USEWOD2011 - 1st International Workshop on Usage Analysis and the Web of Data

2011-01-20 Thread Knud Hinnerk Möller
Just a quick reminder that the deadline for USEWOD2011 is approaching fast!

=== Second Call for Papers ===

Workshop on: USAGE ANALYSIS AND THE WEB OF DATA (USEWOD2011)
 USEWOD DATA CHALLENGE

Workshop at WWW 2011 – Hyderabad, India, 28 or 29 March 2011
http://data.semanticweb.org/usewod/2011/


Important dates
===============
* Release of Dataset for the USEWOD Challenge: 21 December 2010
* Paper submission deadline: 8 February 2011
* Workshop and Prize for USEWOD Challenge: 28 or 29 March 2011


Submission
==========
* Long papers: up to 8 pages
* Short papers: up to 4 pages
* Data Challenge papers (see below): up to 4 pages
all in ACM format 
(http://www.acm.org/sigs/publications/proceedings-templates)


Overview
========

This workshop will investigate the synergy between semantics and semantic-web 
technology on the one hand and analysis and mining of usage data on the other 
hand. The two fields are a promising combination. First, semantics can be used 
to enhance the analysis of usage data. Usage logs contain information that can 
help to better understand users or to adapt a system to a user’s needs and 
preferences. Now that more and more explicit knowledge is represented on the 
Web, in the form of ontologies, folksonomies, or linked data, the question 
arises how these semantics can be used to aid large scale web usage analysis 
and mining. Second, usage data analysis can enhance semantic resources as well 
as Semantic Web applications. Traces of users can be used to evaluate, adapt or 
personalize Semantic Web applications. Since logs record real-life users, they 
provide an opportunity to create gold standards for search or recommendation 
tools. In addition, logs can form valuable resources from which knowledge (e.g. 
in the form of ontologies or thesauri) can be extracted bottom-up.

Also, the emerging Web of Data demands a re-evaluation of existing usage mining 
techniques; new ways of accessing information enabled by the Web of Data imply 
the need to develop or adapt algorithms, methods, and techniques to analyze and 
interpret the usage of Web data instead of Web pages. An important question at 
this time is how the Web of Data is being used: how are datasets being accessed 
by human users and how by machines, what kinds of queries are being performed, 
and what can we learn about the usage of semantic applications?

The primary goals of this workshop are to foment a new community of researchers 
from various fields sharing an interest in usage mining and semantics and to 
create a roadmap for future research in this direction.


Data Challenge
==============
In addition to regular papers, we will release a dataset of usage data (server 
log files) from two Linked Open Data sources: Semantic Web Dog Food 
(data.semanticweb.org) and DBpedia (dbpedia.org). Participants are invited to 
present interesting analyses, applications, alignments, etc. for these 
datasets, and to submit their findings as a Data Challenge paper. The best Data 
Challenge paper will get a prize.


Topics of interest
==================
include, but are not limited to:
• Analysis and mining of usage logs of semantic resources and applications
• Inferring semantic information from usage logs
• Methods and tools for semantic analysis of usage logs
• Representing and enriching usage logs with semantic information
• Usage-based evaluation methods and frameworks; gold standards for evaluation 
of semantic web applications
• Specifics and semantics of logs for content-consumption and content-creation
• Using semantics for recommendation, personalization and adaptation
• Usage-based recommendation, personalization and adaptation of semantic web 
applications
• Exploiting usage logs for semantic search
• Data sharing, privacy, and privacy-protecting policies and techniques


Workshop chairs
===============
* Bettina Berendt,  K.U. Leuven, Belgium
* Laura Hollink, Delft University of Technology, The Netherlands
* Vera Hollink, Centre for Mathematics and Computer Science, Amsterdam, The 
Netherlands
* Markus Luczak-Roesch, Freie Universitaet Berlin, Germany
* Knud Moeller, DERI / National University of Ireland, Galway, Ireland
* David Vallet, Universidad Autonoma de Madrid, Spain
   --- Please contact us at usewod2011-cha...@googlegroups.com


Program committee 
=================
see the workshop web page: http://data.semanticweb.org/usewod/2011/

-
Knud Möller, PhD
+353 - 91 - 495086
Smile Group: http://smile.deri.ie
Digital Enterprise Research Institute
  National University of Ireland, Galway
Institiúid Taighde na Fiontraíochta Digití
  Ollscoil na hÉireann, Gaillimh






Re: How to declare in a web app's interface which kind of app/version/features and or interfaces or formats it exposes

2011-01-20 Thread Thomas Steiner
Hi Olivier,

Do you know http://LinkedOpenServices.org/? This might be what you're
looking for (assuming your website is your API, where website reads
like Web app).

Cheers,
Tom

Thank God not sent from a BlackBerry, but from my iPhone






Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Dave Reynolds

Hi Nathan,

I largely agree but have a few quibbles :)

On 20/01/2011 2:29 PM, Nathan wrote:

Dave Reynolds wrote:

The URI spec (rfc3986[1]) does allow this usage. In particular Section 6
Normalization and Comparison says:

URI comparison is performed for some particular purpose. Protocols
or implementations that compare URIs for different purposes will
often be subject to differing design trade-offs in regards to how
much effort should be spent in reducing aliased identifiers. This
section describes various methods that may be used to compare URIs,
the trade-offs between them, and the types of applications that might
use them.

and

We use the terms different and
equivalent to describe the possible outcomes of such comparisons,
but there are many application-dependent versions of equivalence.

While RDF predates this spec it seems to me that the RDF usage remains
consistent with it. The purpose of comparison in RDF is different from
that of cache retrieval of web pages or message delivery of email.


Indeed, I also read though:

For all URIs, the hexadecimal digits within a percent-encoding
triplet (e.g., %3a versus %3A) are case-insensitive and therefore
should be normalized to use uppercase letters for the digits A-F.

When a URI uses components of the generic syntax, the component
syntax equivalence rules always apply; namely, that the scheme and
host are case-insensitive and therefore should be normalized to
lowercase...
- http://tools.ietf.org/html/rfc3986#section-6.2.2.1

And took the For all and always to literally mean for all and
always.


Those quotes come from section (6.2.2) describing normalization but the 
earlier quote is from the start of section 6 saying that choice of 
normalization is application dependent. I interpret the two together as 
*if* you are normalizing then always ...blah 


That was certainly the RIF position where we explicitly said that 
sections 6.2.2 and 6.2.3 of rfc3986 were not applicable.



against both the RDF Specification [1] and the URI specification when
they say /not/ to encode permitted US-ASCII characters (like ~ %7E)?


Where did that example come from?


The encoding consists of... %-escaping octets that do not correspond
to permitted US-ASCII characters.
- http://www.w3.org/TR/rdf-concepts/#section-Graph-URIref

For consistency, percent-encoded octets in the ranges of ALPHA
(%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D), period (%2E),
underscore (%5F), or tilde (%7E) should not be created by URI
producers and, when found in a URI, should be decoded to their
corresponding unreserved characters by URI normalizers.
- http://tools.ietf.org/html/rfc3986#section-2.3

I read those quotes as saying do not encode permitted US-ASCII
characters in RDF URI References.


At what point have we suggested doing that?


As above


Sorry, I didn't mean to dispute that you shouldn't %-encode ~, I was 
wondering where the suggestion that you should do so came from.
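
To illustrate the RFC 3986 section 2.3 guidance quoted above: a producer
that follows it never emits %7E for a tilde. Below is a small sketch with
Python's urllib.parse, assuming Python 3.7 or later, where quote() follows
RFC 3986 and treats ~ as unreserved. The path is made up for the example.

```python
from urllib.parse import quote, unquote

path = "/~nathan/my file"  # hypothetical path, not taken from the thread

# The tilde is unreserved and stays as is; the space must be encoded.
encoded = quote(path)
print(encoded)  # /~nathan/my%20file

# Both spellings decode to the same characters...
print(unquote("/%7Enathan/") == unquote("/~nathan/"))  # True

# ...but as plain strings they are different names.
print("/%7Enathan/" == "/~nathan/")  # False
```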


I believe there are some corner cases, such as the handling of spaces, 
which differ between the RDF spec and the IRI spec. This was down to 
timing. The RDF Core WG was doing its best to anticipate what the IRI 
spec would look like but couldn't wait until that was finalized. 
Resolving any such small discrepancies between that anticipation and the 
actual IRI specs is something I believe to be in scope for the proposed 
new RDF WG.



So use normalized URIs in the first place.

...

RDF/OWL/RIF aren't designed the way they are because someone thought it
would be a good idea to allow such things to be used side by side or
because they *want* people to use denormalized URIs.

...

The point is that there is no single, simple, universal (i.e. across all
schemes) normalization algorithm that could be used.
The current approach gives stable, well-defined behaviour which doesn't
change as people invent new URI schemes. The RDF serializations give you
enough control to enable you to be certain about what URI you are
talking about. Job done.


Okay, I agree, and I'm really not looking to create a lot of work here,
the general gist of what I'm hoping for is along the lines of:

RDF Publishers MUST perform Case Normalization and Percent-Encoding
Normalization on all URIs prior to publishing. When using relative URIs
publishers SHOULD include a well defined base using a serialization
specific mechanism. Publishers are advised to perform additional
normalization steps as specified by URI (RFC 3986) where possible.

RDF Consumers MAY normalize URIs they encounter and SHOULD perform Case
Normalization and Percent-Encoding Normalization.

Two RDF URIs are equal if and only if they compare as equal, character
by character, as Unicode strings.


I'm sort of OK with that but ...

Terms like RDF Publisher and RDF Consumer need to be defined in 
order to make formal statements like these. The RDF/OWL/RIF specs are 
careful to define what sort of processors are subject to conformance 
statements and I don't think RDF 

Standardizing linked data - was Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Nathan

Dave Reynolds wrote:

Okay, I agree, and I'm really not looking to create a lot of work here,
the general gist of what I'm hoping for is along the lines of:

RDF Publishers MUST perform Case Normalization and Percent-Encoding
Normalization on all URIs prior to publishing. When using relative URIs
publishers SHOULD include a well defined base using a serialization
specific mechanism. Publishers are advised to perform additional
normalization steps as specified by URI (RFC 3986) where possible.

RDF Consumers MAY normalize URIs they encounter and SHOULD perform Case
Normalization and Percent-Encoding Normalization.

Two RDF URIs are equal if and only if they compare as equal, character
by character, as Unicode strings.


I'm sort of OK with that but ...

Terms like RDF Publisher and RDF Consumer need to be defined in 
order to make formal statements like these. The RDF/OWL/RIF specs are 
careful to define what sort of processors are subject to conformance 
statements and I don't think RDF Publisher is a conformance point for 
the existing specs.


This may sound like nit-picking, but that's life with specifications. You 
need to be clear how the last para about RDF URIs relates to notions 
like RDF Consumer.


I wonder whether you might want to instead define notions of Linked Data 
Publisher and Linked Data Consumer to which these MUST/MAY/SHOULD 
conformance statements apply. That way it is clear that a component such 
as an RDF store or RDF parser is correct in following the existing RDF 
specs and not doing any of these transformations but that in order to 
construct a Linked Data Consumer/Publisher some other component can be 
introduced to perform the normalizations. Linked Data as a set of 
constraints and conventions layered on top of the RDF/OWL specs.


Fully agree, had the same conversation with DanC this afternoon and he 
too immediately suggested changing RDF Publisher/Consumer to Linked Data 
Publisher/Consumer. Also ties in with earlier comments about 
standardizing Linked Data, however it's done, or worded, my only care 
here is that it positively impacts the current situation, and doesn't 
negatively impact anybody else.


The specific point on the normalization ladder would have to be defined, of 
course, and you would need to define how to handle schemes unknown to 
the consumer.


All this presupposes some work to formalize and specify linked data. Is 
there anything like that planned?  In some ways Linked Data is an 
engineering experiment and benefits from that freedom to experiment. On 
the other hand interoperability eventually needs clear specifications.


Unsure, but I'll also ask the question, is there anything planned? I'd 
certainly +1 standardization and do anything I could to help the process 
along.



For many reasons it would be good to solve this at the publishing phase,
allow normalization at the consuming phase (can't be precluded as
intermediary components may normalize), and keep simple case sensitive
string comparison throughout the stack and specs (so implementations
remain simple and fast.)


Agreed.


cool, thanks again Dave,

Nathan
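
A minimal sketch of the two normalization steps named in the proposal above
(Case Normalization and Percent-Encoding Normalization from RFC 3986
section 6.2.2), followed by plain string comparison. The function names are
hypothetical and the splitting uses the regular expression from RFC 3986
Appendix B. It is an illustration under those assumptions, not a reference
implementation, and it deliberately does nothing scheme-specific such as
removing a default :80 port.

```python
import re

# RFC 3986 Appendix B regular expression, with named groups.
_URI = re.compile(
    r"^(?:(?P<scheme>[^:/?#]+):)?"
    r"(?://(?P<authority>[^/?#]*))?"
    r"(?P<path>[^?#]*)"
    r"(?:\?(?P<query>[^#]*))?"
    r"(?:#(?P<fragment>.*))?$",
    re.DOTALL,
)
_PCT = re.compile(r"%([0-9A-Fa-f]{2})")
_UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)


def _pct_normalize(text):
    """Decode %XX for unreserved characters, upper-case all other triplets."""
    def fix(match):
        char = chr(int(match.group(1), 16))
        if char in _UNRESERVED:
            return char
        return "%" + match.group(1).upper()
    return _PCT.sub(fix, text)


def normalize_uri(uri):
    """Lower-case scheme and host, normalize percent-encoding, keep the rest."""
    m = _URI.match(uri)
    out = []
    if m["scheme"] is not None:
        out.append(m["scheme"].lower() + ":")
    if m["authority"] is not None:
        # Only the host (and port) is case-insensitive, not the userinfo.
        userinfo, at, hostport = m["authority"].rpartition("@")
        out.append("//" + _pct_normalize(userinfo) + at
                   + _pct_normalize(hostport.lower()))
    out.append(_pct_normalize(m["path"]))
    if m["query"] is not None:
        out.append("?" + _pct_normalize(m["query"]))
    if m["fragment"] is not None:
        out.append("#" + _pct_normalize(m["fragment"]))
    return "".join(out)


published = normalize_uri("htTp://WEBR3.org/%7enathan/a%3ab#me")
print(published)  # http://webr3.org/~nathan/a%3Ab#me

# After publishing-side normalization, comparison stays a simple
# character-by-character string test.
print(published == "http://webr3.org/~nathan/a%3Ab#me")  # True
```

The path, query and fragment keep their case, so the comparison itself
never needs to know anything about the scheme.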



Re: Standardizing linked data - was Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Nathan

Nathan wrote:

Dave Reynolds wrote:
All this presupposes some work to formalize and specify linked data. 
Is there anything like that planned?  In some ways Linked Data is an 
engineering experiment and benefits from that freedom to experiment. 
On the other hand interoperability eventually needs clear specifications.


Unsure, but I'll also ask the question, is there anything planned? I'd 
certainly +1 standardization and do anything I could to help the process 
along.


or perhaps an IG/XG follow up to the SWEO, taking into account Read 
Write Web of Data, hopefully with some protocol or best practice 
report giving a migration path to standardization?


There are certainly plenty of other groups to take into account and 
consider in all of this, like the WebID XG.


Best,

Nathan



Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Alan Ruttenberg
On Thu, Jan 20, 2011 at 5:15 AM, Nathan nat...@webr3.org wrote:

 As far as I can see, that's only for a URI reference used within a
 namespace, and does not govern usage or normalization when you join the URI
 reference up with the local name to make the full URI.

 Out of interest, where is that process defined? I was looking for it the
 other day - for instance in the quoted specification we have the example:

 <edi:price xmlns:edi='http://ecommerce.example.org/schema'
 units='Euro'>32.18</edi:price>

 Where's the bit of the XML specification which says you join them up by
 concatenating 'http://ecommerce.example.org/schema' with #(?assumed?) and
 'Euro' to get 'http://ecommerce.example.org/schema#Euro'?


My understanding is that this is governed by the definition of QNames. As I
understand things, the concatenation you describe would happen only if the
attribute was defined in the schema to be an xsi:type
(http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/structures.html#xsi_type),
and without the #. The only case where a # would be added is when rdf:ID
or xml:id is used.

 And finally, this is why I specifically asked if the non-normalization of
 RDF URI References had XML Namespace heritage, which had then filtered down
 through OWL, SPARQL and RIF.


I don't believe so. I believe its genesis lies in the reasons that I
discussed earlier: the difficulty of actually implementing it, combined
with the indeterminacy. But I would be glad if someone else has better
information and can either confirm or deny this.

-Alan


Re: URI Comparisons: RFC 2616 vs. RDF

2011-01-20 Thread Harry Halpin
On Thu, Jan 20, 2011 at 11:15 AM, Nathan nat...@webr3.org wrote:
 Alan Ruttenberg wrote:

 On Wed, Jan 19, 2011 at 4:45 PM, Nathan nat...@webr3.org wrote:

 David Wood wrote:

 On Jan 19, 2011, at 10:59, Nathan wrote:

 ps: as an illustration of how engrained URI normalization is, I've
 capitalized the domain names in the to: and cc: fields, I do hope the
 mail still come through, and hope that you'll accept this email as
 being sent to you. Hopefully we'll also find this mail in the archives
 shortly at htTp://lists.W3.org/Archives/Public/public-lod/2011Jan/ -
 Personally I'd hope that any statements made using these URIs (asserted
 by man or machine) would remain valid regardless of the
 (incorrect?-)casing.

 Heh.  OK, I'll bite.  Domain names in email addressing are defined in
 IETF RFC 2822 (and its predecessor RFC 822), which defers the
 interpretation to RFC 1035 (Domain names - implementation and
 specification).  RFC 1035 section 2.3.3 states that domain names in
 DNS, and therefore in (E)SMTP, are to be compared in a case-insensitive
 manner.

 As far as I know, the W3C specs do not so refer to RFC 1035.

 And I'll bite in the other direction, why not treat URIs as URIs? why go
 against both the RDF Specification [1] and the URI specification when
 they say /not/ to encode permitted US-ASCII characters (like ~ %7E)? why
 force case-sensitive matching on the scheme and domain on URIs matching
 the generic syntax when the specs say must be compared case
 insensitively? and so on and so forth.

 [AR]
 Which specs?

 The various URI/IRI specs and previous revisions of.

 http://www.w3.org/TR/REC-xml-names/#NSNameComparison

 URI references identifying namespaces

 ..

 In a namespace declaration, the URI reference is

 ..

 The URI references below are all different for the purposes of identifying
 namespaces

 ..

 The URI references below are also all different for the purposes of
 identifying namespaces

 ..

 So here is another spec that *explicitly* disagrees with the idea that URI
 normalization should be a built-in processing.

 As far as I can see, that's only for a URI reference used within a
 namespace, and does not govern usage or normalization when you join the URI
 reference up with the local name to make the full URI.

 Out of interest, where is that process defined? I was looking for it the
 other day - for instance in the quoted specification we have the example:

 <edi:price xmlns:edi='http://ecommerce.example.org/schema'
 units='Euro'>32.18</edi:price>

 Where's the bit of the XML specification which says you join them up by
 concatenating 'http://ecommerce.example.org/schema' with #(?assumed?) and
 'Euro' to get 'http://ecommerce.example.org/schema#Euro'?


Actually you don't. A namespace is just that - a tuple (namespace,
localname) in XML. That's why namespaces in XML are for all intents
and purposes broken and why, to a large extent, Web browser developers
in HTML stopped using them and hate implementing them in the DOM, and
so refuse to have them in HTML5. And that's one reason RDF(A) will
probably continue getting a sort of bad rap in the HTML world, as
prefixes are not associated with just making URIs, but with this
terrible namespace tuple.
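
The tuple view can be seen with any namespace-aware XML parser. Here is a
small sketch using Python's xml.etree.ElementTree on the example from the
Namespaces spec quoted earlier in this thread. ElementTree happens to write
the (namespace, localname) pair in Clark notation; that notation is a
convention of the library, not something defined by the XML specs.

```python
import xml.etree.ElementTree as ET

doc = (
    "<edi:price xmlns:edi='http://ecommerce.example.org/schema' "
    "units='Euro'>32.18</edi:price>"
)
elem = ET.fromstring(doc)

# The element name is a (namespace, localname) pair, not a joined URI.
print(elem.tag)  # {http://ecommerce.example.org/schema}price
namespace, localname = elem.tag[1:].split("}", 1)

# The unprefixed attribute is in no namespace; nothing is concatenated
# with its value 'Euro'.
print(elem.attrib)  # {'units': 'Euro'}

# Namespace names are compared as plain strings, so a change of case in
# the host gives a different element name.
other = ET.fromstring("<p:price xmlns:p='http://ECOMMERCE.example.org/schema'/>")
print(other.tag == elem.tag)  # False
```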

For an archeology of the relevant standards, check out the section "What
Namespaces Do" of this paper. While the paper is focussed on why
namespace documents are a mess, the relevant information is in that
section and extensively referenced, with examples:

http://xml.coverpages.org/HHalpinXMLVS-Extreme.html

 And finally, this is why I specifically asked if the non-normalization of
 RDF URI References had XML Namespace heritage, which had then filtered down
 through OWL, SPARQL and RIF.

Indeed, they should be normalized in a sane manner across all Semantic
Web specs, and dependencies on XML Namespaces should obviously be
dropped IMHO.


 [AR] More to document, please: Which data is being junked and scrapped?

 will document, but essentially every statement made using a non normalized
 URI when other statements are also being made about the same resource
 using normalized URIs - the two most common cases for this will be when
 people are using CMS systems and enter their domain name as uppercase in
 some admin, only to have that filter through to URIs in serialized RDF/RDFa,
 and where bugs in software have led to inconsistent URIs over time (for
 instance where % encoding has been fixed, or a :80 has been removed from a
 URI).

 [AR] Hmm. Are you suggesting that the behavior of libraries and clients
 should have precedence over specification? My view is that one first looks
 to specifications, and then only if specifications are poor or do not
 speak to the issue do we look at existing behavior.

Which is the case with namespaces and URI normalization :)


 Yes I am, that specification should standardize the behaviour of libraries
 and clients - the level of normalization in URIs published, consumed or used
 by these tools is often