Hi Andy and Lorenz, thanks for your quick replies. I am not trying to parse 
full SPARQL, but actually only the Basic Graph Pattern part of a query. Is the 
org.apache.jena.sparql.lang.arq.ARQParser class not parsing SPARQL 1.1?

Based on Andy's directions, the following additional check after parsing the 
string seems to detect graph patterns that have dashes in a variable name at 
the object position.

if (parser1.token.next.kind == ARQParser.EOF) {
        // found valid graph pattern
        System.out.println("Graph pattern parse successful!");
} else {
        // stream not empty, so not a valid graph pattern
        System.out.println("Graph pattern parse failed!");
}

Thanks for your help!

Best regards,

Barry

-----Original Message-----
From: Andy Seaborne <[email protected]> 
Sent: Tuesday, 5 April 2022 14:07
To: [email protected]
Subject: Re: ARQ variables with dashes

Inline.

Summary: the parser didn't consume the whole input, only up to the end of the 
legal part.

On 05/04/2022 12:43, Lorenz Buehmann wrote:
> Hi Barry,
> 
> 
> Did you try the SPARQL 1.1 parser instead? AFAIK, ARQ syntax has always 
> gone beyond SPARQL 1.1, or better said, it already had some extensions 
> before SPARQL 1.1.
> 
> Indeed, Andy will correct me soon :D
> 
> The grammar files for JavaCC are here:
> 
> https://github.com/apache/jena/tree/main/jena-arq/Grammar
> 
> You can check arq.jj and sparql_11.jj
> 
> 
> Or just wait for Andy's response ...
> 
> 
> Cheers,
> 
> Lorenz
> 
> 
> 
> On 05.04.22 13:21, Nouwt, B. (Barry) wrote:
>> Hi everyone,
>>
>> We are using ARQ's SPARQL parser to parse graph patterns and noticed 
>> that it allows dashes in variable names when the variable occurs at 
>> the *object* position of a triple pattern. If a variable name at 
>> the *subject* position of a triple pattern contains dashes, parsing 
>> fails with a ParseException. As far as we could tell, the SPARQL 
>> specification does not allow dashes in variable names at all 
>> (https://www.w3.org/TR/sparql11-query/#rVARNAME). Both pattern1 and
>> pattern2 below should fail, but the first one does not fail and 
>> the second does.
>>
>> String pattern1 = "<test> https://www.tno.nl/example/b ?community-ID .";
>> ARQParser parser1 = new ARQParser(new StringReader(pattern1));
>> parser1.GroupGraphPatternSub();

Calling into the middle of the parse doesn't work so easily.

It has parsed up to the end of the legal triple pattern:

"<test> https://www.tno.nl/example/b ?community"

When it sees the "-", the variable name has ended and, because the "." is not 
required, what has been read so far is a legal GroupGraphPatternSub.

The "-ID ." is left in the token input stream.

You have to test whether end-of-input has been reached.
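
To make the "parsed a legal prefix, left the rest behind" behaviour concrete, 
here is a minimal, self-contained sketch. It is not Jena's tokenizer: the class 
PrefixParseDemo and its methods are hypothetical names made up for this 
illustration, and the variable-name rule is simplified to ASCII letters, digits 
and "_" (the real VARNAME rule also allows many Unicode ranges, but not "-"). 
It only shows why a match can succeed while input is still left over, and why 
an explicit end-of-input test is needed.

```java
public class PrefixParseDemo {

    // Simplified assumption: ASCII letters, digits and '_' only.
    // The SPARQL VARNAME rule is wider, but it never includes '-'.
    public static boolean isVarNameChar(char c) {
        return (c < 128 && Character.isLetterOrDigit(c)) || c == '_';
    }

    // Returns the index just past the variable starting at 'start',
    // or -1 if there is no variable at that position.
    public static int matchVariable(String input, int start) {
        if (start >= input.length() || input.charAt(start) != '?') {
            return -1;
        }
        int i = start + 1;
        while (i < input.length() && isVarNameChar(input.charAt(i))) {
            i++;
        }
        // Require at least one name character after the '?'.
        return i > start + 1 ? i : -1;
    }

    // True only if a variable matches AND nothing but whitespace remains,
    // i.e. the match is followed by end of input.
    public static boolean isWholeInputVariable(String input) {
        int end = matchVariable(input, 0);
        return end != -1 && input.substring(end).trim().isEmpty();
    }

    public static void main(String[] args) {
        String input = "?community-ID";
        int end = matchVariable(input, 0);
        // The match stops at the '-'; the rest stays unread.
        System.out.println("matched:  " + input.substring(0, end));
        System.out.println("leftover: " + input.substring(end));
        System.out.println("whole input valid: " + isWholeInputVariable(input));
    }
}
```

The match itself succeeds on "?community"; only the extra end-of-input test 
reveals that "-ID" was never consumed.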


try

qparse 'SELECT * { <test> <p> ?o-1 }'

This gives a parse error because the next token, "-1", is not legal at that 
point. (Tokenizing runs one token ahead of where the parser grammar is; that 
lookahead is the 1 in LL(1).)

This is illegal because there is a check for end of input:

qparse 'SELECT * { <test> <p> ?o } XXX'

The top level entry point is

void QueryUnit(): { }
{
   ByteOrderMark()
   Query()
   <EOF>
}

so the parser must see <EOF> to be valid and exit without error.

     Andy

>>
>> String pattern2 = "?community-ID https://www.tno.nl/example/b <test> .";
>> ARQParser parser2 = new ARQParser(new StringReader(pattern2));
>> parser2.GroupGraphPatternSub();
>>
>> Is this a bug?
>>
>> Best regards,
>>
>> Barry
