Re: Querying URL with square brackets

2023-11-24 Thread Marco Neumann
Martynas, I think you have to go way back in time to fully appreciate the
anchor reference and its "interference" with URI local names. :)

Fundamentally, URIs as identifiers are not meant to be retrieved as such,
Laura. So a web browser is not designed to follow the implicit "physical"
link of an identifier.

To "browse" URIs as identifiers only, you need an RDF browser or plugin that
may dereference documents from objects for display as URLs.

Marco


On Fri, Nov 24, 2023 at 1:55 PM Martynas Jusevičius 
wrote:

> On Fri, Nov 24, 2023 at 12:50 PM Laura Morales  wrote:
> >
> > > If you want a page for every book, don't use fragment URIs. Use
> > > http://example.org/book/1 or http://example.org/book/1#this instead of
> > >  http://example.org/book#1.
> >
> > yes yes I agree with this. I only tried to present an example of yet
> another "quirk" between raw data and browsers (where this kind of data is
> supposed to be used).
>
> Still don't understand the problem :) http://example.org/book#1
> uniquely identifies a resource, but you'll need to get the whole
> http://example.org/book document to retrieve it. That's just how HTTP
> works.
>


-- 


---
Marco Neumann


Re: Querying URL with square brackets

2023-11-24 Thread Martynas Jusevičius
On Fri, Nov 24, 2023 at 12:50 PM Laura Morales  wrote:
>
> > If you want a page for every book, don't use fragment URIs. Use
> > http://example.org/book/1 or http://example.org/book/1#this instead of
> >  http://example.org/book#1.
>
> yes yes I agree with this. I only tried to present an example of yet another 
> "quirk" between raw data and browsers (where this kind of data is supposed to 
> be used).

Still don't understand the problem :) http://example.org/book#1
uniquely identifies a resource, but you'll need to get the whole
http://example.org/book document to retrieve it. That's just how HTTP
works.


Re: Querying URL with square brackets

2023-11-24 Thread Laura Morales
> If you want a page for every book, don't use fragment URIs. Use
> http://example.org/book/1 or http://example.org/book/1#this instead of
>  http://example.org/book#1.

yes yes I agree with this. I only tried to present an example of yet another 
"quirk" between raw data and browsers (where this kind of data is supposed to 
be used).


Re: Querying URL with square brackets

2023-11-24 Thread Martynas Jusevičius
On Fri, Nov 24, 2023 at 11:46 AM Laura Morales  wrote:
>
> > > in the case that I want to use these URLs with a web browser.
> >
> > I don't understand what the trouble with the above example is?
>
> The problem with # is that browsers treat it as the start of a local 
> reference. When you open http://example.org/book#1 the server only receives 
> http://example.org/book. In other words it would be an error to create nodes 
> for n different books (#1 #2 #3 #n) if my goal is also to use these URLs with 
> a browser (for example if I want to show one page for every book). It's not a 
> problem with Jena, it's a problem with the way browsers treat the fragment.

If you want a page for every book, don't use fragment URIs. Use
http://example.org/book/1 or http://example.org/book/1#this instead of
 http://example.org/book#1.


Re: Querying URL with square brackets

2023-11-24 Thread Laura Morales
> > in the case that I want to use these URLs with a web browser.
>
> I don't understand what the trouble with the above example is?

The problem with # is that browsers treat it as the start of a local 
reference. When you open http://example.org/book#1 the server only receives 
http://example.org/book. In other words it would be an error to create nodes 
for n different books (#1 #2 #3 #n) if my goal is also to use these URLs with a 
browser (for example if I want to show one page for every book). It's not a 
problem with Jena, it's a problem with the way browsers treat the fragment.
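
A minimal sketch of the point above, in Python with only the standard library. It shows that a client drops the fragment before it makes the request, so all the book#N identifiers lead to one document, while the book/N#this form mentioned elsewhere in this thread keeps one document per book. The helper name request_target is made up for this illustration, and nothing here is specific to Jena.

```python
from urllib.parse import urldefrag

def request_target(uri: str) -> str:
    """Return the part of the URI a client requests: everything before '#'."""
    document, _fragment = urldefrag(uri)
    return document

hash_uris = [f"http://example.org/book#{n}" for n in (1, 2, 3)]
slash_uris = [f"http://example.org/book/{n}#this" for n in (1, 2, 3)]

# Hash URIs collapse to a single document; slash URIs keep one per book.
hash_docs = {request_target(u) for u in hash_uris}
slash_docs = {request_target(u) for u in slash_uris}

print(sorted(hash_docs))   # ['http://example.org/book']
print(sorted(slash_docs))  # three documents: .../book/1, .../book/2, .../book/3
```

Both forms still identify each book uniquely as RDF nodes. The difference is only in how many documents a server has to publish.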


Re: Querying URL with square brackets

2023-11-24 Thread Marco Neumann
The URI syntax is defined by the Internet Engineering Task Force (IETF) in
RFC 3986.

W3C RDF is just a rule-taker here ;)

https://datatracker.ietf.org/doc/html/rfc3986

Marco

On Fri, Nov 24, 2023 at 10:36 AM Laura Morales  wrote:

> > What do you mean by human-readable here? For large technical systems it's
> > simply not feasible to encode meaning into the URI and I might even
> > consider it an anti-pattern.
>
> This is my problem. I do NOT want to encode any meaning into URLs, but I
> do want them to be human readable simply because 1) properties are URLs
> too, 2) they can be used online, and 3) they are simpler to work with, for
> example editing in a Turtle file or writing a query.
>
> :alice :knows :bob
> vs
> :dsa7hdsahdsa782j :d93ifg75jgueeywu :s93oeirugj290sjf
>
> I can avoid [ entirely, but it raises the question of what other characters
> I MUST avoid.
>


-- 


---
Marco Neumann


Re: Querying URL with square brackets

2023-11-24 Thread Laura Morales
> What do you mean by human-readable here? For large technical systems it's
> simply not feasible to encode meaning into the URI and I might even
> consider it an anti-pattern.

This is my problem. I do NOT want to encode any meaning into URLs, but I do 
want them to be human readable simply because 1) properties are URLs too, 2) 
they can be used online, and 3) they are simpler to work with, for example 
editing in a Turtle file or writing a query.

:alice :knows :bob
vs
:dsa7hdsahdsa782j :d93ifg75jgueeywu :s93oeirugj290sjf

I can avoid [ entirely, but it raises the question of what other characters I 
MUST avoid.
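
As a rough way to answer that question for one path segment, the sketch below checks each character against the "pchar" rule of RFC 3986 (the specification cited elsewhere in this thread). It is an illustration under assumptions, not Jena's actual checker: the function name chars_to_encode is made up, and every non-ASCII character is treated as allowed, although RFC 3987 (IRIs) excludes a few code point ranges.

```python
import string

# RFC 3986: pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
SUB_DELIMS = set("!$&'()*+,;=")
PCHAR_ASCII = UNRESERVED | SUB_DELIMS | set(":@")

def chars_to_encode(segment: str) -> list:
    """List the characters of one path segment that may not appear literally.

    Assumption (simplified): all non-ASCII characters are accepted as-is.
    """
    bad = []
    i = 0
    while i < len(segment):
        ch = segment[i]
        if ch == "%":
            # '%' is only valid as the start of a %XX escape.
            hexpart = segment[i + 1:i + 3]
            if len(hexpart) == 2 and all(c in string.hexdigits for c in hexpart):
                i += 3
                continue
            bad.append(ch)
        elif ord(ch) < 128 and ch not in PCHAR_ASCII:
            bad.append(ch)
        i += 1
    return bad

print(chars_to_encode("foo[1]bar"))  # ['[', ']']
print(chars_to_encode("bücher_1"))   # []
```

Under this rule the printable ASCII characters to avoid in a path segment are the space and " < > [ ] \ ^ ` { | }, plus / ? # because they act as delimiters, and % unless it starts an escape.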


Re: Querying URL with square brackets

2023-11-24 Thread Marco Neumann
(side note) preferably the local name of a URI should not start with a
number but a letter or underscore.

What do you mean by human-readable here? For large technical systems it's
simply not feasible to encode meaning into the URI and I might even
consider it an anti-pattern.

There are some community efforts that have introduced single letters and
number sequences for vocabulary development, like CIDOC CRM, which was later
also adopted by community projects like Wikidata. But instance data
typically doesn't have that requirement and can be random, though it has to
be syntax compliant of course.

I am sure Andy can elaborate on the details of the encoding here.




On Fri, Nov 24, 2023 at 9:31 AM Laura Morales  wrote:

> Thank you a lot. FILTER(STR(?id) = "...") works, as suggested by Andy. I
> do recognize though that it is a hack, and that URLs should probably not
> have a [.
>
> But now I have trouble understanding UTF8 addresses. I would use random
> alphanumeric URLs everywhere if I could, or I would %-encode everything.
> But node IDs (URLs) are supposed to be valid, human-readable URLs because
> they're used online. Jena, and browsers, work fine with IRIs (which are
> UTF8), but the way special characters are used is not the same. For example
> it's perfectly fine in my graph to have a URL fragment, such as
> http://example.org/foo#bar but these URLs are not usable with a browser
> because the fragment is a local reference (local to the browser) that is
> not sent to the server. Which means in practice, that if I want to stay out
> of trouble I should not create a graph with IDs
>
> http://example.org/book#1
> http://example.org/book#2
> http://example.org/book#3
>
> in the case that I want to use these URLs with a web browser. Vice versa,
> browsers are perfectly fine with a [ in the path, but Jena is stricter.
>
> So, if I want to use UTF8 addresses (IRIs) in my graph, and if I don't
> want to %-encode them because I want them to be human-readable (also
> because they are much easier to read/edit manually), what is the list of
> characters that MUST be %-encoded?
>
>
> > Sent: Friday, November 24, 2023 at 9:55 AM
> > From: "Marco Neumann" 
> > To: users@jena.apache.org
> > Subject: Re: Querying URL with square brackets
> >
> > Laura, see jena issue #2102
> > https://github.com/apache/jena/issues/2102
> >
> > Marco
>


-- 


---
Marco Neumann


Re: Querying URL with square brackets

2023-11-24 Thread Martynas Jusevičius
On Fri, Nov 24, 2023 at 10:31 AM Laura Morales  wrote:
>
> Thank you a lot. FILTER(STR(?id) = "...") works, as suggested by Andy. I do 
> recognize though that it is a hack, and that URLs should probably not have a 
> [.
>
> But now I have trouble understanding UTF8 addresses. I would use random 
> alphanumeric URLs everywhere if I could, or I would %-encode everything. But 
> node IDs (URLs) are supposed to be valid, human-readable URLs because 
> they're used online. Jena, and browsers, work fine with IRIs (which are 
> UTF8), but the way special characters are used is not the same. For example 
> it's perfectly fine in my graph to have a URL fragment, such as 
> http://example.org/foo#bar but these URLs are not usable with a browser 
> because the fragment is a local reference (local to the browser) that is not 
> sent to the server. Which means in practice, that if I want to stay out of 
> trouble I should not create a graph with IDs
>
> http://example.org/book#1
> http://example.org/book#2
> http://example.org/book#3
>
> in the case that I want to use these URLs with a web browser.

I don't understand what the trouble with the above example is?

> Vice versa, browsers are perfectly fine with a [ in the path, but Jena is 
> stricter.

It's not Jena that's stricter, it's the standard specifications. Or
you can say browsers are too lax. They use their own WHATWG URL
"specification".
Sometimes the URL you see in the address bar is not the actual URL
being sent to the server.
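
A small Python illustration of that last point, using only the standard library. It shows the ASCII-only, percent-encoded form of a readable path. The path /bücher/1 is a made-up example, and quote() follows Python's own rules, which are close to RFC 3986 but are not the WHATWG algorithm that browsers implement, so treat the output as an approximation of what goes over the wire.

```python
from urllib.parse import quote

# Kept literally: "/" plus the RFC 3986 sub-delims, ":" and "@".
# quote() never escapes letters, digits, "-", ".", "_" or "~".
SAFE = "/:@!$&'()*+,;="

def ascii_path(path: str) -> str:
    """Percent-encode a path as UTF-8 bytes, the way Python's quote() does.

    Caveat: an input that is already percent-encoded gets encoded twice,
    because "%" itself is not in SAFE.
    """
    return quote(path, safe=SAFE)

shown = "/bücher/1"          # hypothetical path as a person reads it
sent = ascii_path(shown)
print(sent)                  # /b%C3%BCcher/1

print(ascii_path("/foo[1]bar"))  # /foo%5B1%5Dbar
```

Note that RDF compares IRIs as plain strings, so http://example.org/foo%5B1%5Dbar and http://example.org/foo[1]bar would be two different nodes in a graph. Encoding has to happen before the data is loaded, not only at query time.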

>
> So, if I want to use UTF8 addresses (IRIs) in my graph, and if I don't want 
> to %-encode them because I want them to be human-readable (also because they 
> are much easier to read/edit manually), what is the list of characters that 
> MUST be %-encoded?
>
>
> > Sent: Friday, November 24, 2023 at 9:55 AM
> > From: "Marco Neumann" 
> > To: users@jena.apache.org
> > Subject: Re: Querying URL with square brackets
> >
> > Laura, see jena issue #2102
> > https://github.com/apache/jena/issues/2102
> >
> > Marco


Re: Querying URL with square brackets

2023-11-24 Thread Laura Morales
Thank you a lot. FILTER(STR(?id) = "...") works, as suggested by Andy. I do 
recognize though that it is a hack, and that URLs should probably not have a [.
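
For readers who want to reproduce the workaround, here is a minimal sketch that builds such a query as text. It is written in Python only to keep the string escaping explicit. The function name describe_by_string is made up, and the exact query shape is an assumption, since the thread only quotes the FILTER(STR(?id) = "...") part.

```python
def describe_by_string(iri: str) -> str:
    """Build a DESCRIBE query that matches a subject by the string form of its IRI."""
    # Escape backslash and double quote so the IRI is a valid SPARQL string literal.
    literal = iri.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "DESCRIBE ?id WHERE {\n"
        "  ?id ?p ?o .\n"
        f'  FILTER(STR(?id) = "{literal}")\n'
        "}"
    )

query = describe_by_string("http://example.org/foo[1]bar")
print(query)
```

The IRI never appears between < and > here, so the query parser does not apply its IRI syntax check to it. The price is that the filter is evaluated against every subject in the dataset, which is likely to be slow on large graphs.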

But now I have trouble understanding UTF8 addresses. I would use random 
alphanumeric URLs everywhere if I could, or I would %-encode everything. But 
node IDs (URLs) are supposed to be valid, human-readable URLs because they're 
used online. Jena, and browsers, work fine with IRIs (which are UTF8), but the 
way special characters are used is not the same. For example it's perfectly 
fine in my graph to have a URL fragment, such as http://example.org/foo#bar but 
these URLs are not usable with a browser because the fragment is a local 
reference (local to the browser) that is not sent to the server. Which means in 
practice, that if I want to stay out of trouble I should not create a graph 
with IDs

http://example.org/book#1
http://example.org/book#2
http://example.org/book#3

in the case that I want to use these URLs with a web browser. Vice versa, 
browsers are perfectly fine with a [ in the path, but Jena is stricter.

So, if I want to use UTF8 addresses (IRIs) in my graph, and if I don't want to 
%-encode them because I want them to be human-readable (also because they are 
much easier to read/edit manually), what is the list of characters that MUST be 
%-encoded?


> Sent: Friday, November 24, 2023 at 9:55 AM
> From: "Marco Neumann" 
> To: users@jena.apache.org
> Subject: Re: Querying URL with square brackets
>
> Laura, see jena issue #2102
> https://github.com/apache/jena/issues/2102
>
> Marco


Re: Querying URL with square brackets

2023-11-24 Thread Marco Neumann
Laura, see jena issue #2102
https://github.com/apache/jena/issues/2102

Marco

On Fri, Nov 24, 2023 at 7:12 AM Laura Morales  wrote:

> I have a few URLs containing square brackets like
> http://example.org/foo[1]bar
> I can create a TDB2 dataset without much problems, with warnings but no
> errors. I can also query these nodes "indirectly", that is if I query them
> by some property and not by URI. My problem is that I cannot query them
> directly by URI. As soon as I try to use the URIs explicitly in a query,
> for example "DESCRIBE <http://example.org/foo[1]bar>", I receive this
> error
>
> ERROR SPARQL  :: [line: 1, col: 10] Bad IRI: '
> http://example.org/foo[1]bar':  Code:
> 0/ILLEGAL_CHARACTER in PATH: The character violates the grammar rules for
> URIs/IRIs.
>
> I tried escaping, "foo\[1\]bar", but it doesn't work.
> I tried converting from a string,
> FILTER(?id = URI("http://example.org/foo[1]bar")), but it doesn't work.
> What else could I try?
>


-- 


---
Marco Neumann