Re: [I] urn:uuid syntax validation incorrect? [jena]

via GitHub Mon, 30 Oct 2023 04:17:59 -0700


svandenhoek commented on issue #2062:
URL: https://github.com/apache/jena/issues/2062#issuecomment-1784975228


   My main point is indeed not to flag something as invalid if it might 
actually be in-spec, though I do agree that the part about undefined behaviour 
for undefined components is worrying. An informational warning (or 
perhaps a command-line argument that allows toggling between "in-spec only vs 
strict validation") seems like a valid option.
   
   While it's a guess, perhaps other tools simply treat the fragment of a URN 
the same way as the fragment of any other IRI (so the fragment in 
`urn:uuid:0c93930e-709d-431b-add5-9fdca2a117da#something` and 
`http://www.w3.org/2000/01/rdf-schema#label` is treated the same) and 
therefore do not run into any issues?
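   
   As a rough illustration of that guess (a sketch only, not how Jena or any 
other tool actually works; `split_fragment` is a made-up helper), a 
scheme-agnostic split at the first `#` handles both IRIs identically:

```python
from typing import Optional, Tuple


def split_fragment(iri: str) -> Tuple[str, Optional[str]]:
    """Split an IRI at the first '#', ignoring the scheme entirely.

    Hypothetical helper: returns (iri_without_fragment, fragment), where
    fragment is None when the IRI contains no '#'.
    """
    base, hash_sign, fragment = iri.partition("#")
    return base, (fragment if hash_sign else None)


# Both IRIs go through exactly the same code path.
urn_parts = split_fragment("urn:uuid:0c93930e-709d-431b-add5-9fdca2a117da#something")
http_parts = split_fragment("http://www.w3.org/2000/01/rdf-schema#label")
```

   A tool written this way never looks at URN-specific rules, which would 
explain why it does not complain.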
   
   Interesting observation regarding the rq-component indeed. According to 
[RFC 8141 Section 
2.3.1](https://datatracker.ietf.org/doc/html/rfc8141#section-2.3.1), the 
r-component is for now purely reserved: the RFC "reserves it for future use" 
(first paragraph page 13) and states that it "SHOULD NOT be used for URNs before 
their semantics have been standardized" (last paragraph of the section). So a 
warning (or even an error?) sounds valid for now as well, until the RFC is 
updated. Though of course this gets a bit more complicated due to `?+` being 
valid inside the q-component.
   
   Looking at [RFC 8141 Section 
2.3.1](https://datatracker.ietf.org/doc/html/rfc8141#section-2.3.1), I'm not 
sure if `?+abc?=def` should be seen as a single r-component though (but I'll be 
honest that I have no clue regarding passing it to the resolver so you might be 
right regarding implementation):
   
   > The sequence "?+" introduces the r-component.  The r-component ends with a 
"?=" sequence (which begins a q-component) or a "#" character (number sign, 
which begins an f-component).
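   
   To make that reading concrete, here is a minimal sketch of splitting a URN 
along those delimiters. It is based only on the sentence quoted above, it does 
not validate the NID/NSS or the characters allowed in each component, and 
`split_urn` and `UrnParts` are names invented for this example:

```python
from typing import NamedTuple, Optional


class UrnParts(NamedTuple):
    assigned_name: str          # everything before any r-/q-/f-component
    r_component: Optional[str]  # text after "?+", if present
    q_component: Optional[str]  # text after "?=", if present
    f_component: Optional[str]  # text after "#", if present


def split_urn(urn: str) -> UrnParts:
    """Split a URN at the "?+", "?=" and "#" delimiters (no validation)."""
    # The f-component starts at the first "#".
    rest, hash_sign, fragment = urn.partition("#")
    # The q-component starts at the first "?="; a "?+" after that point
    # is simply part of the q-component.
    rest, q_marker, query = rest.partition("?=")
    # Whatever is left may still carry an r-component introduced by "?+".
    name, r_marker, resolution = rest.partition("?+")
    return UrnParts(
        assigned_name=name,
        r_component=resolution if r_marker else None,
        q_component=query if q_marker else None,
        f_component=fragment if hash_sign else None,
    )


parts = split_urn("urn:uuid:0c93930e-709d-431b-add5-9fdca2a117da?+abc?=def#something")
```

   Under this reading, `?+abc?=def` yields an r-component `abc` followed by a 
q-component `def`, rather than a single r-component.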
   
   Regarding the usage: We currently have some non-semantic data which we want 
to be able to query as linked data. Most of it can be converted properly to 
existing IRIs that represent this data (and these are the parts relevant for 
federated querying). Some of the data, which is purely relevant inside the 
project, cannot be converted that easily, so within the current concept version 
a UUID is generated to represent the specific use-case and the fragments 
describe the parts they represent within that specific use-case. This is a bit 
similar to the analogy in [RFC 8141 Section 
2.3.3](https://datatracker.ietf.org/doc/html/rfc8141#section-2.3.3):
   
   > Consider the hypothetical example of obtaining resources that are part of 
a larger entity (say, the chapters of a book).  Each part could be specified in 
the f-component
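   
   A small sketch of the naming scheme described above, purely as an 
illustration: `mint_part_iris` and the part names are hypothetical and not 
taken from the actual project.

```python
import uuid
from typing import Dict, Iterable, Optional


def mint_part_iris(parts: Iterable[str],
                   use_case_id: Optional[uuid.UUID] = None) -> Dict[str, str]:
    """Build one urn:uuid IRI per part, sharing a single use-case UUID.

    Hypothetical helper: a fresh random UUID is generated when none is given.
    """
    if use_case_id is None:
        use_case_id = uuid.uuid4()
    base = f"urn:uuid:{use_case_id}"
    # Each part of the use-case is addressed through the f-component.
    return {part: f"{base}#{part}" for part in parts}


fixed_id = uuid.UUID("0c93930e-709d-431b-add5-9fdca2a117da")
iris = mint_part_iris(["something", "other"], fixed_id)
```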
   
   P.S. I might not be able to respond as much this week.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

