[jira] [Updated] (JENA-978) jena-text: store original literals

2015-08-18 Thread Osma Suominen (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Osma Suominen updated JENA-978:
---
Attachment: text-query.mdtext.diff

> jena-text: store original literals
> ----------------------------------
>
>                 Key: JENA-978
>                 URL: https://issues.apache.org/jira/browse/JENA-978
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: Text
>    Affects Versions: Jena 2.13.0
>            Reporter: Osma Suominen
>            Assignee: Osma Suominen
>             Fix For: Jena 3.0.0
>
>         Attachments: text-query.mdtext.diff
>
>
> As discussed on the dev list, this PR implements a feature where it's 
> possible to store the original literal values in the jena-text Lucene index 
> and to access them when querying the index.
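
To illustrate the idea only (this is not the jena-text or Lucene API), the sketch below contrasts an index that keeps just the analysed tokens with one that also stores the original literal, so that a hit can return the literal without a second lookup in the RDF store. The class name, the `storeValues` flag and the `matchLiterals` method are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy in-memory index; a stand-in for the concept, not for Lucene.
final class StoredLiteralIndexSketch {
    // token -> ids of the indexed literals that contain the token
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // id -> original literal; only filled when storeValues is true
    private final Map<Integer, String> stored = new HashMap<>();
    private final boolean storeValues;
    private int nextId = 0;

    StoredLiteralIndexSketch(boolean storeValues) {
        this.storeValues = storeValues;
    }

    // Tokenises the literal and, optionally, keeps the original text.
    void add(String literal) {
        int id = nextId++;
        for (String token : literal.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (token.isEmpty()) continue;
            postings.computeIfAbsent(token, k -> new LinkedHashSet<>()).add(id);
        }
        if (storeValues) stored.put(id, literal);
    }

    // Returns the original literals matching the token; the list is empty
    // when nothing matches or when the index was built without stored values.
    List<String> matchLiterals(String token) {
        List<String> hits = new ArrayList<>();
        String key = token.toLowerCase(Locale.ROOT);
        for (int id : postings.getOrDefault(key, Collections.<Integer>emptySet())) {
            String literal = stored.get(id);
            if (literal != null) hits.add(literal);
        }
        return hits;
    }
}
```

Without stored values the index can still say which entries matched, but the literal text itself has to be fetched from elsewhere.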



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (JENA-978) jena-text: store original literals

2015-08-18 Thread Osma Suominen (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Osma Suominen reopened JENA-978:


Apparently I need to reopen the issue to be able to attach my patch...







[jira] [Commented] (JENA-978) jena-text: store original literals

2015-08-18 Thread Osma Suominen (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702606#comment-14702606
 ] 

Osma Suominen commented on JENA-978:


I forgot to document this earlier (was busy with family). I have now created 
documentation for this in the form of a patch for text-query.mdtext, which I 
will attach. Could this go into the production web site, since it's already in 
3.0.0?






[jira] [Resolved] (JENA-1010) Wrong parsing of aggregate functions in HAVING clause

2015-08-18 Thread Andy Seaborne (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne resolved JENA-1010.
-
   Resolution: Fixed
Fix Version/s: Jena 3.0.0






[jira] [Assigned] (JENA-1010) Wrong parsing of aggregate functions in HAVING clause

2015-08-18 Thread Andy Seaborne (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne reassigned JENA-1010:
---

Assignee: Andy Seaborne






[jira] [Commented] (JENA-1010) Wrong parsing of aggregate functions in HAVING clause

2015-08-18 Thread Andy Seaborne (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701035#comment-14701035
 ] 

Andy Seaborne commented on JENA-1010:
-

See JENA-963. Fixed in Jena 3.0.0. The problem was in OpAsQuery. With Jena 
3.0.0, the output of the code in the issue description is:
{code}
SELECT  (COUNT(?o) AS ?c) ?prop
WHERE
  { GRAPH ?g
      { ?s  ?prop  ?o }
  }
GROUP BY ?prop
HAVING ( COUNT(?o) > 1 )
ORDER BY DESC(?c) ?prop
{code}






[jira] [Created] (JENA-1010) Wrong parsing of aggregate functions in HAVING clause

2015-08-18 Thread Ruben Navarro Piris (JIRA)
Ruben Navarro Piris created JENA-1010:
-

 Summary: Wrong parsing of aggregate functions in HAVING clause
 Key: JENA-1010
 URL: https://issues.apache.org/jira/browse/JENA-1010
 Project: Apache Jena
  Issue Type: Bug
  Components: Jena
Affects Versions: Jena 2.13.0
Reporter: Ruben Navarro Piris


Using an aggregate function (count) in the HAVING clause leads to some weird 
parsing. The following code reproduces the problem (output also provided)

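{code:java}
// Jena 2.13.0 package names (com.hp.hpl.jena.*); Jena 3.x uses org.apache.jena.*
import com.hp.hpl.jena.query.Query;
import com.hp.hpl.jena.query.QueryFactory;
import com.hp.hpl.jena.sparql.algebra.AlgebraGenerator;
import com.hp.hpl.jena.sparql.algebra.Op;
import com.hp.hpl.jena.sparql.algebra.OpAsQuery;

public class Test {
  public static void main(String[] args) {

    AlgebraGenerator ag = new AlgebraGenerator();
    // this query counts the number of occurrences of a property
    // showing only properties with more than one occurrence
    String q =
        "SELECT (count(?o) AS ?c) ?prop "
        + "WHERE { GRAPH ?g { ?s ?prop ?o } } "
        + "GROUP BY ?prop HAVING ( count(?o) > 1 ) "
        + "ORDER BY DESC(?c) ?prop";
    // 1. parse the query string and print it
    Query query = QueryFactory.create(q);
    System.out.println(query);

    // 2. compile the query to SPARQL algebra and print it
    Op queryOp = ag.compile(query);
    System.out.println(queryOp);

    // 3. convert the algebra back to a query and print it
    Query rewritten = OpAsQuery.asQuery(queryOp);
    System.out.println(rewritten);
  }
}
{code}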

{code}
SELECT  (COUNT(?o) AS ?c) ?prop
WHERE
  { GRAPH ?g
      { ?s ?prop ?o}
  }
GROUP BY ?prop
HAVING ( COUNT(?o) > 1 )
ORDER BY DESC(?c) ?prop

(project (?c ?prop)
  (order ((desc ?c) ?prop)
    (filter (> ?.0 1)
      (extend ((?c ?.0))
        (group (?prop) ((?.0 (count ?o)))
          (graph ?g
            (bgp (triple ?s ?prop ?o))))))))

SELECT  ?c ?prop
WHERE
  { { GRAPH ?g
        { ?s ?prop ?o}
      BIND(COUNT(?o) AS ?c)
    }
    FILTER ( ?.0 > 1 )
  }
GROUP BY ?prop
ORDER BY DESC(?c) ?prop
{code}
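
In the last query of the output, the internal aggregate variable ?.0 from the algebra shows up in a FILTER, and COUNT(?o) is attached with BIND instead of appearing in the SELECT expression and the HAVING clause. The sketch below is not Jena's OpAsQuery code. It only illustrates, with plain strings and hypothetical names, the substitution a correct round trip has to perform: each internal variable introduced by the group operator is replaced by the aggregate expression it stands for.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class HavingSubstitutionSketch {
    // Replaces internal aggregate variables (e.g. "?.0") in an expression
    // with the aggregate expressions recorded by the group operator.
    // Longer names are replaced first so that "?.1" cannot clobber "?.10".
    static String substitute(String expr, Map<String, String> aggregates) {
        List<String> names = new ArrayList<>(aggregates.keySet());
        names.sort(Comparator.comparingInt(String::length).reversed());
        String result = expr;
        for (String name : names) {
            result = result.replace(name, aggregates.get(name));
        }
        return result;
    }

    public static void main(String[] args) {
        // From (group (?prop) ((?.0 (count ?o))) ...)
        Map<String, String> aggregates = new LinkedHashMap<>();
        aggregates.put("?.0", "COUNT(?o)");

        // From (filter (> ?.0 1) ...) and (extend ((?c ?.0)) ...)
        System.out.println("HAVING ( " + substitute("?.0 > 1", aggregates) + " )");
        System.out.println("SELECT (" + substitute("?.0", aggregates) + " AS ?c) ?prop");
    }
}
```

Running `main` prints `HAVING ( COUNT(?o) > 1 )` and `SELECT (COUNT(?o) AS ?c) ?prop`, which match the corresponding parts of the original query.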





Re: Fwd: Final Deliveries of GSoC Project (JENA-491)

2015-08-18 Thread Andy Seaborne

On 18/08/15 08:44, Qihong Lin wrote:

> Hi Andy,
>
> Thanks for your comments! I can understand your idea of A/ B/ C/. But
> there's a problem in practice.
>
> In brief:
> For A/, at the server side, always call
> QueryExecutionBase::execConstructDataset()
> For B/, the problem is that there are overlaps between DEF.rdfOffer and
> DEF.quadsOffer, e.g. TriX, TriXxml and JSONLD.
> For C/, e.g. if the Accept lang is JSONLD, which should be written out
> by RDFDataMgr, model or dataset? Note that the server side doesn't
> know which one was called on the client side,
> QueryEngineHTTP::execConstructTriples() or execConstructQuads().


I wrote:
>> a combined DEF.rdfOffer+DEF.quadsOffer

Create a new offer which combines all the languages you want: 
DEF.constructOffer.


If the language supports quads (RDFLanguages.isQuads is true), process 
as a dataset regardless.


It does not matter if the client is asking for a model or a dataset 
because in all syntax cases, a dataset of just the default (the unnamed) 
graph looks like a single graph.  This is why you can generalise to 
work in datasets for much of the processing.  The default graph in a 
dataset is written as just triples without a graph field.
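
A minimal sketch of that last point, using plain Java collections rather than Jena classes. SketchQuad and toNQuadsLike are hypothetical names introduced only for illustration, and the output is an N-Quads-like layout, not a full serialiser. It shows that a dataset holding only a default graph produces lines with no graph field, so the result looks the same as a triples document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a quad; graph == null means "default graph".
final class SketchQuad {
    final String graph, subject, predicate, object;

    SketchQuad(String graph, String subject, String predicate, String object) {
        this.graph = graph;
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

final class DefaultGraphSketch {
    // Writes one line per quad. Default-graph quads are written
    // without the fourth (graph) field.
    static List<String> toNQuadsLike(List<SketchQuad> dataset) {
        List<String> lines = new ArrayList<>();
        for (SketchQuad q : dataset) {
            StringBuilder sb = new StringBuilder();
            sb.append(q.subject).append(' ')
              .append(q.predicate).append(' ')
              .append(q.object);
            if (q.graph != null) {
                sb.append(' ').append(q.graph);
            }
            lines.add(sb.append(" .").toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        List<SketchQuad> onlyDefault = Arrays.asList(
            new SketchQuad(null, "<http://example/s>", "<http://example/p>", "\"v\""));
        // Each line has three terms, like a line of N-Triples.
        toNQuadsLike(onlyDefault).forEach(System.out::println);
    }
}
```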



> Also at the client side, if
> QueryEngineHTTP::execConstructTriples()/execConstructQuads() gets the
> content type of, e.g. JSONLD, it doesn't know whether it is a model or
> dataset.


If they call execConstructTriples, then either the CONSTRUCT query will 
only have a default graph template or it's an extended syntax query and 
you want the default graph.  In both cases, parse to a stream of triples. 
The parsers skip named graphs when they are asked for triples only.



> So, that's why I introduced DEF.pureRdfOffer to distinguish the
> triple formats that are not quad formats. In my way, both the server
> side and the client side separate model and dataset. It's kind of ugly
> in code. Any more suggestions?


That would probably also work but it is not what the code is doing in 
SPARQL_Query.  It just offers DEF.pureRdfOffer and then tests for *. 
There is no dataset content negotiation possible (e.g. N-Quads vs TriG) 
because the process can't get to ResponseDataset with 
application/n-quads set, say.  The only way I can see to get there is an 
Accept type of "*" in SPARQL_Query.
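
The difference between testing for "*" and negotiating against one combined offer can be sketched in plain Java. This is not Fuseki's ConNeg code: the class, the CONSTRUCT_OFFER list, the choose method and the simplified Accept parsing (q-values ignored, the client's first match wins, fallback instead of a 406 response) are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class ConstructOfferSketch {
    // Hypothetical combined offer: triple and quad syntaxes together.
    static final List<String> CONSTRUCT_OFFER = Arrays.asList(
        "text/turtle", "application/n-triples",
        "application/trig", "application/n-quads");

    // Returns the first offered type matched by the Accept header,
    // or the fallback when the header is empty, a wildcard, or unmatched.
    static String choose(String acceptHeader, List<String> offer, String fallback) {
        if (acceptHeader == null || acceptHeader.trim().isEmpty()) return fallback;
        for (String range : acceptHeader.split(",")) {
            // Drop parameters such as ";q=0.9" (ignored in this sketch).
            int semi = range.indexOf(';');
            String r = (semi >= 0 ? range.substring(0, semi) : range)
                .trim().toLowerCase(Locale.ROOT);
            if (r.equals("*") || r.equals("*/*")) return fallback;
            for (String offered : offer) {
                if (offered.equals(r)) return offered;
                // "type/*" matches any offered subtype of that type.
                if (r.endsWith("/*")
                        && offered.startsWith(r.substring(0, r.length() - 1))) {
                    return offered;
                }
            }
        }
        return fallback;
    }

    // A quad syntax means the result is handled as a dataset.
    static boolean isQuads(String mediaType) {
        return mediaType.equals("application/trig")
            || mediaType.equals("application/n-quads");
    }
}
```

With "Accept: application/n-quads" the sketch selects application/n-quads and isQuads then routes the response through the dataset path. A check that only recognises "*" never reaches that case.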


Try it in a debugger.

Hack to run Fuseki2 clean in Eclipse (string names need changing):

import java.nio.file.Paths ;

import org.apache.jena.fuseki.cmd.FusekiCmd ;

public class DevFuseki2 {
  public static void main(String[] args) {
    // Root of the local checkout; change to match your machine.
    String DIR = "/home/afs/ASF/jena-491/" ;
    System.setProperty("FUSEKI_HOME", DIR+"jena-fuseki2/jena-fuseki-core") ;
    String fusekiBase = DIR+"jena-fuseki2/jena-fuseki-core/run" ;
    System.setProperty("FUSEKI_BASE", fusekiBase) ;
    String runArea = Paths.get(fusekiBase).toAbsolutePath().toString() ;
    // Delete any previous state.
    // FileOps.ensureDir(runArea) ;
    FusekiCmd.main() ;
    System.exit(0) ;
  }
}


Andy







Re: svn commit: r1696184 - /jena/site/trunk/content/documentation/query/construct-quad.mdtext

2015-08-18 Thread Qihong Lin
Hi Dr. Jiang

Thank you! I'll refine the doc following your instructions.

regards,
Qihong

On Tue, Aug 18, 2015 at 12:21 PM, Ying Jiang  wrote:
> Hi Qihong,
>
> You can improve the doc:
> 1) The grammar table does not render correctly. I can see a "|" in the
> first row, which needs to be escaped. Also, a table header is required.
> 2) All of the special symbols (e.g. code, string, expression) should
> be marked up. You can refer to other docs on the Jena website, e.g.
> property_paths.html
> 3) Add relevant links to the doc, e.g. Fuseki (doc),
> ExampleConstructQuads.java (code)
> 4) Fix some indentation problems in the code fragment
> 5) Add the ARQ documentation index at the end of the page, just like the
> other ARQ docs.
>
> Last but not least, make sure to make changes based on the latest source
> doc in svn (Andy modified it):
> http://svn.apache.org/repos/asf/jena/site/trunk/content/documentation/query/construct-quad.mdtext
>
> Best regards,
> Ying Jiang
>
>
> On Mon, Aug 17, 2015 at 4:52 PM, Andy Seaborne  wrote:
>> On 17/08/15 00:36, jpz6311...@apache.org wrote:
>>>
>>> Author: jpz6311whu
>>> Date: Sun Aug 16 23:36:58 2015
>>> New Revision: 1696184
>>>
>>> URL: http://svn.apache.org/r1696184
>>> Log:
>>> compose doc for construct-quad.mdtext
>>>
>>> Modified:
>>>  jena/site/trunk/content/documentation/query/construct-quad.mdtext
>>
>>
>> Please check when making changes.
>>
>> The generated page was blank.
>>
>> I have fixed this (it seems that converting the content to UTF-8 did the
>> trick) as well as some formatting issues I noticed.  I do not promise all
>> formatting issues are fixed.
>>
>> Andy
>>


Re: Fwd: Final Deliveries of GSoC Project (JENA-491)

2015-08-18 Thread Qihong Lin
Hi Andy,

Thanks for your comments! I can understand your idea of A/ B/ C/. But
there's a problem in practice.

In brief:
For A/, at the server side, always call
QueryExecutionBase::execConstructDataset()
For B/, the problem is that there are overlaps between DEF.rdfOffer and
DEF.quadsOffer, e.g. TriX, TriXxml and JSONLD.
For C/, e.g. if the Accept lang is JSONLD, which should be written out
by RDFDataMgr, model or dataset? Note that the server side doesn't
know which one was called on the client side,
QueryEngineHTTP::execConstructTriples() or execConstructQuads().

Also at the client side, if
QueryEngineHTTP::execConstructTriples()/execConstructQuads() gets the
content type of, e.g. JSONLD, it doesn't know whether it is a model or
dataset.

So, that's why I introduced DEF.pureRdfOffer to distinguish the
triple formats that are not quad formats. In my way, both the server
side and the client side separate model and dataset. It's kind of ugly
in code. Any more suggestions?

regards,
Qihong

On Mon, Aug 17, 2015 at 10:49 PM, Andy Seaborne  wrote:
> Thanks for the clarification.  I had made a combined version to start
> testing and hopefully it's a close approximation of the intended
> deliverables.
>
> [ Ying - how's your testing going? ]
>
> A few things about the pull requests so far:
>
> 0/ More tests in TestQuery in Fuseki:
>
> For example, this should show up:
>
> 1/ QueryEngineHTTP.execConstructDataset is not implemented.
>
>
> 2/ SPARQL_Query:
>
> This line
>
>if ( ! rdfMediaType.getType().equals("*") ) {
>
> means that only "Accept: *" will trigger dataset results.
>
> then in ResponseDataset
>
> MediaType i = ConNeg.chooseContentType(request, DEF.quadsOffer,
> DEF.acceptNQuads) ;
>
> will always choose TriG because the accept is "*" ("output=" works but that
> is overriding content negotiation).
>
> There is no way to ask for a specific syntax (n-quads, trig, JSON-LD) using
> "Accept:"
>
> e.g. "Accept: application/n-quads"
>
>
> 3/ ResponseDataset is a copy of ResponseModel yet the only differences (after
> reformatting) are different data values and
>
>  < RDFDataMgr.write(out, dataset, lang) ;
>  ---
>  > RDFDataMgr.write(out, model, lang) ;
>
> It is not good to have copied code because it makes maintenance harder.
>
>
>
> (2) and (3) can be addressed by
>
> A/ SPARQL_Query:
>
> For CONSTRUCT, always work in datasets; execConstructDataset().
> No need for mention of Models.  If it's a triples CONSTRUCT, treating it as a
> dataset will simply put the results into the default graph.
>
> QueryExecutionBase::execConstructQuads
>
> Following on from that, treat datasets as a natural extension of models.
> There is no need to test for extended syntax.  If it's strict SPARQL 1.1, a
> dataset will just have only a default graph.
>
>
> B/ Content negotiate for a combined DEF.rdfOffer+DEF.quadsOffer (I don't
> understand DEF.pureRdfOffer -- N-triples is a standard).
>
> C/ If it's a triple format (test the Lang),
>
>RDFDataMgr.write(out, dataset.getDefaultModel(), lang) ;
> otherwise:
>RDFDataMgr.write(out, dataset, lang) ;
>
>
> Andy