Thank you for the answer. I think I can see the problem, but I'm still convinced
that it should be left to the data producers to standardize on one form, if they
need to.
It looks to me like a common problem that multiple URIs exist for representing
the same entity, for example when combining different sources or different
ontologies. I think data producers get around this by agreeing on one
particular URI, by creating a new URI, or by resorting to reasoning
(owl:sameAs). But I would not expect the database to automatically change any
URI, which is why this was surprising to me.

I think a very similar example is

    http:example.org/path/to/file
    http:/example.org/path/to/file
    http://example.org/path/to/file
    http:///example.org/path//to///file

When importing these, Jena does not change them and treats them as four
different URIs. I would expect the same behaviour for every URI, unless "file:"
needs to be treated differently.
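
To illustrate what I mean, here is a small sketch in Python. It uses Python's
standard urllib.parse, not Jena's own IRI parser, so it only shows how a generic
RFC 3986-style split sees the four strings above; I am not claiming Jena parses
them the same way internally. The variable names are just for illustration.

```python
from urllib.parse import urlsplit

# The four example URIs from above, unchanged.
uris = [
    "http:example.org/path/to/file",
    "http:/example.org/path/to/file",
    "http://example.org/path/to/file",
    "http:///example.org/path//to///file",
]

# Compared as plain strings, the four are all different names.
distinct = len(set(uris)) == len(uris)

# Split each one into (authority, path) to see where "example.org" ends up.
# Note: this is Python's parser, used here only as a stand-in.
parts = [(urlsplit(u).netloc, urlsplit(u).path) for u in uris]

for u, (authority, path) in zip(uris, parts):
    print(f"{u!r}: authority={authority!r} path={path!r}")
```

Only the third form has "example.org" as its authority; in the other three it
is part of the path, so they are different identifiers, not just different
spellings of the same one.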


On Mon, 2025-01-06 at 21:05 +0000, Andy Seaborne wrote:
> 
> 
> On 06/01/2025 19:14, zPlus wrote:
> > > But we need one form for URI matching otherwise "file:/path" does not
> > > match "file:///path"
> > 
> > Why does Jena need to match "file:/path" and "file:///path"? Shouldn't it be
> > left to the user to choose one form or the other in their data?
> 
> There is no "right" answer for file: URLs.
> 
> Having one normalized form means the same name is for data producer 
> (load database) and data consumer (SPARQL query) whether they write it 
> file:/ or file:/// or a mixture; or when multiple sources of data are 
> combined. And across operating systems.
> 
> There isn't "the user".
> 
> https://datatracker.ietf.org/doc/html/rfc8089.html#appendix-B
> 
> """
>   o  A traditional file URI for a local file with an empty authority.
>        This is the most common format in use today.  For example:
> 
>        *  "file:///path/to/file"
> """
> 
> And on Windows ...
> 
> C:/path is the "C:" URI scheme.
> 
> file:C:/path is going to be interpreted different on Windows and linux/Mac.
> 
> The whole thing is messy.
> 
>      Andy
> 

