Re: [GSoC 2015 - JENA-491] JavaCC with master.jj

2015-07-09 Thread Qihong Lin
Hi,

I've studied the Jena tests. It looks like the syntax tests are
generated by syn.sh, but the execution tests are not generated by
scripts; they are written by hand, one by one. Is that true?

Since I have enough time, I'd like to go straight to syn.sh and
syn-arq.sh to generate the tests for constructing quads. Thanks!

regards,
Qihong



Re: [GSoC 2015 - JENA-491] JavaCC with master.jj

2015-06-16 Thread Andy Seaborne

On 16/06/15 09:06, Qihong Lin wrote:

Hi,

Thanks! I just marked GRAPH mandatory, and it worked without
producing the warnings. I'll look into the details later.

By the way, once the new parser is ready, how do I test it? I mean,
where should I put the unit test code and the query strings to be
tested? I'm confused by org.apache.jena.sparql.junit.QueryTest (is that
what I need to deal with?). Are there any guidelines or documentation
for ARQ tests?

regards,
Qihong


Most testing of queries is done through externally defined manifest
files (manifest.ttl) under

jena-arq/testing/ARQ

For now, keep it clean and start a new directory

jena-arq/testing/ARQ/ConstructQuads

with both syntax and execution tests.  This is just to keep everything 
in one place for now.



See jena-arq/testing/ARQ/Syntax/Syntax-ARQ/manifest.ttl and
jena-arq/testing/ARQ/Construct/manifest.ttl.

A manifest can have both syntax and execution tests. It so happens that
they are in separate places in the current test suite, which was input
to the working group.


A syntax test looks like:

:test_1 rdf:type   mfx:PositiveSyntaxTestARQ ;
   dawgt:approval dawgt:NotClassified ;
   mf:name    "syntax-select-expr-01.arq" ;
   mf:action  <syntax-select-expr-01.arq> ;.

This parses syntax-select-expr-01.arq, expecting it to be good. An
execution test is an action and a result:


:someExecutionTest rdf:type   mfx:TestQuery ;
   mf:name    "Construct Quads 1" ;
   mf:action
     [ qt:query  <q-construct-1.rq> ;
       qt:data   <data-1.ttl> ] ;
   mf:result  <results-construct-1.ttl>
   .

An action is a query and a data file.
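
As a rough illustration of the two entry shapes above, here is a small
Python sketch that renders them as Turtle text. It is not part of Jena
or of syn.sh; the function names are made up for this example. It
assumes that file references are written as relative IRIs in angle
brackets, that names are quoted strings, and that the mf:, mfx:, qt:
and dawgt: prefixes are declared elsewhere in the manifest. The
manifest header and its list of entries are left out.

```python
# Hypothetical helpers, for illustration only: they render manifest
# entries as Turtle text. Prefix declarations (mf:, mfx:, qt:, dawgt:)
# are assumed to exist at the top of the real manifest.ttl.

def syntax_test(test_id, query_file):
    """Render a positive ARQ syntax test entry for query_file."""
    return "\n".join([
        f":{test_id} rdf:type   mfx:PositiveSyntaxTestARQ ;",
        "   dawgt:approval dawgt:NotClassified ;",
        f'   mf:name    "{query_file}" ;',
        f"   mf:action  <{query_file}> ;.",
    ])


def execution_test(test_id, name, query_file, data_file, result_file):
    """Render an execution test: an action (query + data) and a result."""
    return "\n".join([
        f":{test_id} rdf:type   mfx:TestQuery ;",
        f'   mf:name    "{name}" ;',
        "   mf:action",
        f"     [ qt:query  <{query_file}> ;",
        f"       qt:data   <{data_file}> ] ;",
        f"   mf:result  <{result_file}>",
        "   .",
    ])


if __name__ == "__main__":
    print(syntax_test("test_1", "syntax-select-expr-01.arq"))
    print()
    print(execution_test("someExecutionTest", "Construct Quads 1",
                         "q-construct-1.rq", "data-1.ttl",
                         "results-construct-1.ttl"))
```

Hand-writing a few entries is just as reasonable; the sketch only makes
the structure of a syntax test versus an execution test explicit.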

There are different styles of layout in different places.  The test
suite has grown incrementally over the years of SPARQL 1.0 and SPARQL
1.1. Some tests come from outside the project.


You can test from the command line using the arq.qparse tool.
See other message.

There is a command qtest for running manifests.

Background FYI:

You won't need this when you put everything in
jena-arq/testing/ARQ/ConstructQuads but, to explain: the main syntax
test suites are auto-generated by syn.sh


Part of that is syn-arq.sh.

But hand-writing syntax tests is easier for now.

Andy



Re: [GSoC 2015 - JENA-491] JavaCC with master.jj

2015-06-16 Thread Qihong Lin
Hi,

Thanks! I just marked GRAPH mandatory, and it worked without
producing the warnings. I'll look into the details later.

By the way, once the new parser is ready, how do I test it? I mean,
where should I put the unit test code and the query strings to be
tested? I'm confused by org.apache.jena.sparql.junit.QueryTest (is that
what I need to deal with?). Are there any guidelines or documentation
for ARQ tests?

regards,
Qihong



Re: [GSoC 2015 - JENA-491] JavaCC with master.jj

2015-06-15 Thread Andy Seaborne

Qihong,

There is an ambiguity in the grammar if you make GRAPH optional.

See rule 'Quads'

Consider these two cases:

 :s :p :o .
 :z { :s1 :p1 :o1 } .

 :s :p :o .
 :z :q :o2 .


When the parser gets to the end of the triple in the default graph:

 :s :p :o .

there are two ways forward: more triples (TriplesTemplate) and end of 
the triples part, start of named graph.


It looks ahead one token, sees :z, and needs to decide whether the next
rule is more triples (the :z :q :o2 . case) or the end of the triples
for the default graph and the start of a named graph (the
:z { :s1 :p1 :o1 } . case), where it exits TriplesTemplate and moves
on to QuadsNotTriples.


If GRAPH is required, then the entry to QuadsNotTriples is marked by a
GRAPH, which never occurs in triples.


The grammar is LL(1) - a lookahead of 1 - by default.

There are two solutions (I haven't checked the exact details):

1/ Use LOOKAHEAD(2) so it sees tokens ':z' and ':q' (triples), or ':z'
and '{', which is the named graph case.  I think this is in Quads
somewhere.


2/ Leave GRAPH required.

(2) is fine for now - it will not be too unexpected to users because 
INSERT DATA requires a GRAPH and it is legal TriG, even if not the short 
form in TriG.


You can come back and look at (1) later.  I'm keen for you to get 
something going as soon as possible, not get lost in details.
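
To make the lookahead point concrete, here is a toy Python sketch. It
is not JavaCC and not Jena code; the tokenizer and the alternatives()
function are invented for illustration. It reports which of the two
ways forward is still possible after peeking k tokens, assuming
whitespace-separated tokens as in the examples above.

```python
def tokenize(text):
    """Split on whitespace; enough for the toy inputs used here."""
    return text.split()


def alternatives(tokens, pos, k):
    """Return the alternatives still viable after peeking k tokens.

    "triples": another triple follows (TriplesTemplate continues)
    "graph":   a named graph block follows (QuadsNotTriples starts)
    """
    window = tokens[pos:pos + k]
    if not window or window[0] in ("{", "}", "."):
        return set()              # no term here, so neither alternative
    if len(window) < 2:
        # Only one token seen: a term such as :z can start either case.
        return {"triples", "graph"}
    # Two tokens seen: '{' after the term means a named graph block.
    return {"graph"} if window[1] == "{" else {"triples"}


if __name__ == "__main__":
    more_triples = tokenize(":z :q :o2 .")
    named_graph = tokenize(":z { :s1 :p1 :o1 } .")
    for k in (1, 2):
        print(k,
              sorted(alternatives(more_triples, 0, k)),
              sorted(alternatives(named_graph, 0, k)))
```

With k=1 both inputs leave two alternatives open, which is the choice
conflict; with k=2 each input is left with exactly one.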




Background:

There is a third solution, but it's not so simple: introduce an
intermediate state of MaybeTriplesMaybeQuads. If you do that, more of
the grammar needs rewriting.  I'm not sure how widespread the changes
would be.


Jena's TriG parser (which is not JavaCC-based; see
LangTriG::oneNamedGraphBlock2) has this comment:

// Either :s :p :o or :g { ... }

It does one lookahead to get the :s or :g (the :z above), keeps that
hanging around, does another lookahead to see whether '{' follows, then
calls turtle(n) if it is triples.


In LangTriG:

turtle() is roughly TriplesSameSubject
turtle(n) is roughly  PropertyListNotEmpty
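
The "keep the first node hanging around" approach can be sketched in a
few lines of Python. This is a toy written for this explanation, not
Jena's LangTriG: parse_quads() and triple_rest() are hypothetical
names, tokens are assumed to be whitespace-separated, and only plain
subject-predicate-object triples are handled.

```python
def parse_quads(text):
    """Parse toy input into (graph, s, p, o) tuples.

    graph is None for triples in the default graph.
    """
    tokens = text.split()
    pos = 0
    quads = []

    def triple_rest(graph, subject):
        # The subject is already consumed (compare turtle(n) above):
        # read predicate and object, then an optional '.'.
        nonlocal pos
        if pos + 1 >= len(tokens):
            raise ValueError("truncated triple")
        predicate, obj = tokens[pos], tokens[pos + 1]
        pos += 2
        quads.append((graph, subject, predicate, obj))
        if pos < len(tokens) and tokens[pos] == ".":
            pos += 1

    while pos < len(tokens):
        node = tokens[pos]        # first lookahead: :s or :g, kept in hand
        pos += 1
        if pos < len(tokens) and tokens[pos] == "{":   # second lookahead
            pos += 1              # consume '{'; node is a graph name
            while pos < len(tokens) and tokens[pos] != "}":
                subject = tokens[pos]
                pos += 1
                triple_rest(node, subject)
            if pos >= len(tokens):
                raise ValueError("missing '}'")
            pos += 1              # consume '}'
            if pos < len(tokens) and tokens[pos] == ".":
                pos += 1          # optional '.' after the block
        else:
            triple_rest(None, node)   # node was a subject after all
    return quads


if __name__ == "__main__":
    for quad in parse_quads(":s :p :o . :z { :s1 :p1 :o1 } ."):
        print(quad)
```

The decision between triples and named graph is deferred until the
second token is seen, so no grammar-level lookahead directive is needed.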

Andy

On 15/06/15 11:53, Qihong Lin wrote:

Hi,

I'm trying to play with master.jj, but the grammar script sometimes
prints warning messages. The behavior is strange. To simplify my
question, I'd like to take the following example:

In QuadsNotTriples(), line 691 in master.jj, in the master branch:

GRAPH

If I change it to optional (which is required in future
implementations, for the new grammar):

(GRAPH)?

the grammar script goes like this:

$ ./grammar
 Process grammar -- sparql_11.jj
Java Compiler Compiler Version 5.0 (Parser Generator)
(type javacc with no arguments for help)
Reading from file sparql_11.jj . . .
Warning: Choice conflict in [...] construct at line 464, column 4.
  Expansion nested within construct and expansion following construct
  have common prefixes, one of which is: VAR1
  Consider using a lookahead of 2 or more for nested expansion.
Warning: Choice conflict in [...] construct at line 468, column 6.
  Expansion nested within construct and expansion following construct
  have common prefixes, one of which is: VAR1
  Consider using a lookahead of 2 or more for nested expansion.
Warning: Choice conflict in [...] construct at line 484, column 12.
  Expansion nested within construct and expansion following construct
  have common prefixes, one of which is: VAR1
  Consider using a lookahead of 2 or more for nested expansion.
Warning: Choice conflict in [...] construct at line 759, column 3.
  Expansion nested within construct and expansion following construct
  have common prefixes, one of which is: VAR1
  Consider using a lookahead of 2 or more for nested expansion.
Warning: Choice conflict in [...] construct at line 767, column 5.
  Expansion nested within construct and expansion following construct
  have common prefixes, one of which is: VAR1
  Consider using a lookahead of 2 or more for nested expansion.
File TokenMgrError.java does not exist.  Will create one.
File ParseException.java does not exist.  Will create one.
File Token.java does not exist.  Will create one.
File JavaCharStream.java does not exist.  Will create one.
Parser generated with 0 errors and 5 warnings.
 Create text form
Java Compiler Compiler Version 5.0 (Documentation Generator Version 0.1.4)
(type jjdoc with no arguments for help)
Reading from file sparql_11.jj . . .
Grammar documentation generated successfully in sparql_11.txt
 Fixing Java warnings in TokenManager ...
 Fixing Java warnings in Token ...
 Fixing Java warnings in TokenMgrError ...
 Fixing Java warnings in SPARQLParser11 ...
 Done
 Process grammar -- arq.jj
Java Compiler Compiler Version 5.0 (Parser Generator)
(type javacc with no arguments for help)
Reading from file arq.jj . . .
Warning: Choice conflict in [...] construct at line 486, column 4.
  Expansion nested within 

Re: [GSoC 2015 - JENA-491] JavaCC with master.jj

2015-06-15 Thread Ying Jiang
Hi Qihong,

In addition to Andy's explanation, you might take a look at this
tutorial for more details on JavaCC lookahead:
https://javacc.java.net/doc/lookahead.html


Best regards,
Ying Jiang
