Re: [DRAFT] Jena board report for July 2015
+1 LGTM

On 03/07/2015 09:40, Andy Seaborne a...@apache.org wrote:
> Report from the Apache Jena project [Andy Seaborne]
>
> ## Description:
>
> A framework for developing Semantic Web and Linked Data applications in Java.
>
> ## Activity:
>
> The project is working towards a major release. The most significant user-visible change is converting to org.apache.jena package names everywhere, replacing the historical names for older code areas.
>
> The project has also seen new contributors, helping clean the code up, adding new functionality and generally discussing the code.
>
> ## Issues:
>
> There are no issues requiring board attention at this time.
>
> ## PMC/Committership changes:
>
> - Currently 13 committers and 11 PMC members in the project.
> - Osma Suominen was added to the PMC on Thu Jun 25 2015
> - Osma Suominen was added as a committer on Thu Jun 25 2015
>
> ## Releases:
>
> - Last release was 2.13.0 on Fri Mar 13 2015
>
> ## Mailing list activity:
>
> - us...@jena.apache.org:
>   - 586 subscribers (up 16 in the last 3 months)
>   - 485 emails sent to list (663 in previous quarter)
> - dev@jena.apache.org:
>   - 149 subscribers (down 3 in the last 3 months)
>   - 1480 emails sent to list (1148 in previous quarter)
>
> ## JIRA activity:
>
> - 67 JIRA tickets created in the last 3 months
> - 62 JIRA tickets closed/resolved in the last 3 months
[GitHub] jena pull request: Removing out-of-date comment and empty @Overrid...
Github user asfgit closed the pull request at:

    https://github.com/apache/jena/pull/83
Re: Query parameterization.
On 01/07/15 05:27, Holger Knublauch wrote:
> Hi Andy,
>
> this looks great, and is just in time for the ongoing discussions in the SHACL group. I apologize in advance for not having the bandwidth yet to try this out from your branch, but this topic will definitely bubble up in the priorities soon...
>
> I have not fully understood how the semantics of this are different from the setInitialBinding feature that we currently use in SPIN, and which seems to do a pretty good job. However, having a facility to do further pre-processing in advance may improve performance and provide a more formal definition of what setInitialBinding is doing.
>
> I am personally not enthusiastic about approaches based on text-substitution, so working on the parsed syntax tree looks good to me. There are some (rare) cases where text-substitution would be more powerful, e.g. dynamic path properties

If you can insert compound syntax, then injection attacks need to be considered.

> and some solution modifiers, but as you say no approach is perfect.

Better done on the algebra? Especially around the SELECT clause, as it is several modifiers in a tangle. (See recent OpAsQuery discussion and changes.)

> Questions:
>
> - would this also pre-bind variables inside of nested SELECTs?

Yes (it's a choice - it could not do it with some analysis of the inner projection as it passes through).

> - I assume this can handle blank nodes (e.g. rdf:Lists) as bindings?

Probably! (it's tricky and needs more testing)

... Yes - the replacement with bnodes-are-variables in SPARQL is done during parsing and this is post parse (different to all string based approaches).

If the substituted query is turned into a string, it will become a bnode in SPARQL, which then reparses as a variable. The printing code (specifically NodeToLabelMapBNode.asString) handles it and would need a tweak. The _:label form would be better but needs implementing.

> - What about bound(?var) and ?var is pre-bound?

?var in bound(?var) is replaced (as ?var in all expressions). This is syntax.

    Andy

> Thanks
> Holger
>
> On 6/28/15 8:08 PM, Andy Seaborne wrote:
>> (info / discussion / ...)
>>
>> In working on JENA-963 (OpAsQuery; reworked handling of SPARQL modifiers for GROUP BY), it was easier/better to add the code I had for rewriting syntax by transformation, much like the algebra is rewritten by the optimizer. The use case is rewriting the output of OpAsQuery to remove unnecessary nesting of levels of {} which arise during translation for the safety of the translation.
>>
>> Hence putting a general framework for rewriting syntax in package oaj.sparql.syntax.syntaxtransform, like we have for the SPARQL+ algebra.
>>
>> It is also capable of being a parameterized query system (PQ). We already have ParameterizedSparqlString (PSS), so how do they compare?
>>
>> Work-in-progress:
>>
>> https://github.com/afs/jena-workspace/blob/master/src/main/java/syntaxtransform/ParameterizedQuery.java
>>
>> PQ is a rewrite of a Query object (the template) with a map of variables to constants. That is, it works on the syntax tree after parsing and produces a syntax tree.
>>
>> PSS is a builder with substitution. It builds a string, carefully (injection attacks) and is neutral as to what it is working with - query or update or something weird.
>>
>> http://jena.apache.org/documentation/query/parameterized-sparql-strings.html
>>
>> Summary:
>>
>> PQ is only for replacement of a variable in a template.
>> PSS is a builder that can do that as part of building.
>> PQ covers cases PSS doesn't - neither is perfect.
>> PSS works with INSERT DATA.
>> PQ would use the INSERT { ... } WHERE {} form.
>>
>> Details:
>>
>> PSS:
>>   Can build query, update strings and fragments
>>   Supports JDBC style positional parameters (a '?')
>>     These must be bound to get a valid query.
>>   Can generate illegal syntax.
>>   Tests the type of the injected value (string, iri, double etc).
>>   Has corner cases
>>     Looks for ?x as a string so ...
>>       "This is not a ?x as a variable"
>>       <http://example/foo?x=123>
>>       "SELECT ?x"
>>       ns:local\?x (a legal local part)
>>   Protects against injection by checking.
>>   Works on INSERT DATA.
>>
>> PQ:
>>   Replaces SPARQL variables where identified as variables.
>>     (no extra-syntax positional '?')
>>   Legal query to legal syntax query.
>>     The query may violate scope rules (example below).
>>   Not a query builder.
>>   Post parser, so no reparsing to use the query
>>     (for large updates and queries)
>>   Injection is meaningless - can only inject values, not syntax.
>>   Can rewrite structurally: SELECT ?x => SELECT (:value AS ?x)
>>     which is useful to record the injection variables.
>>   Works with INSERT {?s ?p ?o } WHERE { }
>>
>> PQ example:
>>
>>     Query template = QueryFactory.create(.. valid query ..) ;
>>     Map<String, RDFNode> map = new HashMap<>() ;
>>     map.put("y", ResourceFactory.createPlainLiteral("Bristol")) ;
>>     Query query = ParameterizedQuery.setVariables(template, map) ;
>>
>> A perfect system probably needs a template language which SPARQL extended with a new template variable which is
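The corner cases listed for string-based substitution in this thread can be reproduced without Jena at all. The sketch below is plain Java and only illustrates the general problem; it is not how ParameterizedSparqlString or ParameterizedQuery is implemented, and the class and method names are made up for this example. naiveReplace treats "?x" as raw text, while tokenAwareReplace skips IRIs and simple string literals first. It is still far from a SPARQL parser (no escapes, comments, long strings or ns:local\?x handling), which is the argument for working on the parsed syntax tree instead.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration of text substitution corner cases; not Jena code. */
class NaiveSubstitutionDemo {

    /** Replaces every textual occurrence of "?name", wherever it appears. */
    static String naiveReplace(String query, String name, String replacement) {
        return query.replace("?" + name, replacement);
    }

    /**
     * Copies <...> IRIs and simple "..." literals through unchanged and
     * replaces ?name only when it is not a prefix of a longer variable name.
     */
    static String tokenAwareReplace(String query, String name, String replacement) {
        Pattern p = Pattern.compile(
                "<[^<>\\s]*>"                 // an IRI
                + "|\"[^\"]*\""                // a simple string literal
                + "|\\?" + Pattern.quote(name) + "(?![A-Za-z0-9_])"); // the variable
        Matcher m = p.matcher(query);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(query, last, m.start());
            String token = m.group();
            // Only the variable alternative starts with '?'.
            out.append(token.startsWith("?") ? replacement : token);
            last = m.end();
        }
        out.append(query, last, query.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String q = "SELECT * { <http://example/foo?x=123> ?p ?x . ?s ?q \"not a ?x\" . ?s ?r ?xy }";
        System.out.println(naiveReplace(q, "x", "\"Bristol\""));
        System.out.println(tokenAwareReplace(q, "x", "\"Bristol\""));
    }
}
```

The naive version rewrites the IRI, the string literal and the longer variable ?xy; the token-aware version leaves all three alone and replaces only the standalone ?x.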
[jira] [Commented] (JENA-978) jena-text: store original literals
[ https://issues.apache.org/jira/browse/JENA-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613316#comment-14613316 ]

ASF subversion and git services commented on JENA-978:
------------------------------------------------------

Commit b7eac624cfe5c95b4a7f6ecddbdfc27bd361da0a in jena's branch refs/heads/master from [~andy.seaborne]
[ https://git-wip-us.apache.org/repos/asf?p=jena.git;h=b7eac62 ]

JENA-978: jena-text stored literals: initial functionality and tests for Lucene

> jena-text: store original literals
> ----------------------------------
>
> Key: JENA-978
> URL: https://issues.apache.org/jira/browse/JENA-978
> Project: Apache Jena
> Issue Type: Improvement
> Components: Text
> Affects Versions: Jena 2.13.0
> Reporter: Osma Suominen
>
> As discussed on the dev list, this PR implements a feature where it's possible to store the original literal values in the jena-text Lucene index and to access them when querying the index.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (JENA-966) LazyIterator
[ https://issues.apache.org/jira/browse/JENA-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613322#comment-14613322 ]

ASF subversion and git services commented on JENA-966:
------------------------------------------------------

Commit 66ceff0cb1b2aaddf4477993dda60a1e32026ebf in jena's branch refs/heads/master from [~andy.seaborne]
[ https://git-wip-us.apache.org/repos/asf?p=jena.git;h=66ceff0 ]

JENA-966: Deprecate EarlyBindingIterator and UniqueExtendedIterator

> LazyIterator
> ------------
>
> Key: JENA-966
> URL: https://issues.apache.org/jira/browse/JENA-966
> Project: Apache Jena
> Issue Type: Bug
> Components: Core
> Affects Versions: Jena 3.0.0
> Reporter: Claude Warren
> Assignee: Claude Warren
> Fix For: Jena 3.0.0
>
> LazyIterator is an abstract class. The documentation indicates that the create() method needs to be overridden to create an instance. From this I would expect that
>
>     new LazyIterator(){
>         @Override
>         public ExtendedIterator<Model> create() {
>             ...
>         }};
>
> would work. However, LazyIterator does not override removeNext(), andThen(), toList(), and toSet(). I believe these should be implemented in the class.
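The point of the JENA-966 report is that a lazily created iterator is only safe if every method, including the convenience ones, goes through the deferred delegate. A minimal sketch of that pattern in plain Java follows. LazyIter is a hypothetical stand-in written for this note, built on java.util.Iterator rather than Jena's ExtendedIterator, so it says nothing about how Jena's LazyIterator is or should be written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical lazy iterator: the underlying iterator is created on first use. */
class LazyIter<T> implements Iterator<T> {
    private final Supplier<Iterator<T>> factory;
    private Iterator<T> delegate; // null until something is actually asked of us

    LazyIter(Supplier<Iterator<T>> factory) {
        this.factory = factory;
    }

    /** Single access point, so no method can bypass the lazy creation. */
    private Iterator<T> it() {
        if (delegate == null) {
            delegate = factory.get();
        }
        return delegate;
    }

    @Override public boolean hasNext() { return it().hasNext(); }
    @Override public T next()          { return it().next(); }
    @Override public void remove()     { it().remove(); }

    /** Convenience method; it also has to route through the delegate. */
    List<T> toList() {
        List<T> out = new ArrayList<>();
        while (hasNext()) {
            out.add(next());
        }
        return out;
    }

    public static void main(String[] args) {
        LazyIter<String> lazy = new LazyIter<>(() -> Arrays.asList("a", "b").iterator());
        System.out.println(lazy.toList());
    }
}
```

Funnelling everything through one private accessor is a design choice that makes it hard to add a new method that forgets the lazy step.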
Random error named broken line for same data, same query and same code.
Hello everyone,

I am trying to execute a SPARQL query by making a call to a Fuseki server running at the default port 3030. The query is as follows.

    PREFIX univ: <http://example.com#>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    SELECT ?subject ?predicate ?object
    FROM <http://example.data/University3A>
    WHERE {
      {
        { ?subject univ:doctoralDegreeFrom ?object
          { { ?subject rdf:type univ:FullProfessor} UNION { ?subject rdf:type univ:AssistantProfessor} UNION { ?subject rdf:type univ:AssociateProfessor} UNION { ?subject rdf:type univ:Lecturer} } }
        UNION { ?subject univ:mastersDegreeFrom ?object
          { { ?subject rdf:type univ:FullProfessor} UNION { ?subject rdf:type univ:AssistantProfessor} UNION { ?subject rdf:type univ:AssociateProfessor} UNION { ?subject rdf:type univ:Lecturer} } }
        UNION { ?subject univ:undergraduateDegreeFrom ?object
          { { ?subject rdf:type univ:FullProfessor} UNION { ?subject rdf:type univ:AssistantProfessor} UNION { ?subject rdf:type univ:AssociateProfessor} UNION { ?subject rdf:type univ:Lecturer} } }
        UNION { ?subject univ:emailAddress ?object
          { { ?subject rdf:type univ:FullProfessor} UNION { ?subject rdf:type univ:AssistantProfessor} UNION { ?subject rdf:type univ:AssociateProfessor} UNION { ?subject rdf:type univ:Lecturer} } }
        UNION { ?subject univ:name ?object}
        UNION { ?subject univ:researchInterest ?object}
        UNION { ?subject univ:teacherOf ?object}
        UNION { ?subject univ:worksFor ?object
          { { ?subject rdf:type univ:FullProfessor} UNION { ?subject rdf:type univ:AssistantProfessor} UNION { ?subject rdf:type univ:AssociateProfessor} UNION { ?subject rdf:type univ:Lecturer} } }
        UNION { ?subject univ:name ?object { { ?subject rdf:type univ:Course}} }
        UNION { ?subject univ:name ?object { { ?subject rdf:type univ:Department}} }
        UNION { ?subject univ:suborganizationof ?object { { ?subject rdf:type univ:Department}} }
        UNION { ?subject univ:name ?object { { ?subject rdf:type univ:GraduateCourse}} }
        UNION { ?subject univ:advisor ?object
          { { ?subject rdf:type univ:GraduateStudent} UNION { ?subject rdf:type univ:ResearchAssistant} UNION { ?subject rdf:type univ:TeachingAssistant} } }
        UNION { ?subject univ:emailAddress ?object
          { { ?subject rdf:type univ:GraduateStudent} UNION { ?subject rdf:type univ:ResearchAssistant} UNION { ?subject rdf:type univ:TeachingAssistant} } }
        UNION { ?subject univ:memberOf ?object
          { { ?subject rdf:type univ:GraduateStudent} UNION { ?subject rdf:type univ:ResearchAssistant} UNION { ?subject rdf:type univ:TeachingAssistant} } }
        UNION { ?subject univ:name ?object
          { { ?subject rdf:type univ:GraduateStudent} UNION { ?subject rdf:type univ:ResearchAssistant} UNION { ?subject rdf:type univ:TeachingAssistant} } }
        UNION { ?subject univ:name ?object}
        UNION { ?subject univ:publicationAuthor ?object}
        UNION { ?subject univ:suborganizationOf ?object { { ?subject rdf:type univ:ResearchGroup}} }
        UNION { ?subject rdf:type univ:University}
      }
    }

The data set queried is synthetic data generated with the data generator script available at http://swat.cse.lehigh.edu/projects/lubm/ (UBA 1.7), with 3 universities as the command-line argument and the namespace "http://example.com".

*Problem* : When I run the code that submits the query using QueryEngineHTTP, I get an error, mainly *broken line : some text*, and each time I run the code against the dataset the text is different, for example broken line (new line) : ty, broken line (new line) : www. I looked through the generated data, but I was not able to find any new line in the file or any other error. Interestingly, when the data is generated using the same UBA 1.7 script with 1 university as the argument, I get no such error for the same query (perhaps because the number of triples is about one third of that generated with 3 as the argument).

*Code* used is
[jira] [Updated] (JENA-978) jena-text: store original literals
[ https://issues.apache.org/jira/browse/JENA-978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Seaborne updated JENA-978:
-------------------------------
    Assignee: Osma Suominen  (was: Andy Seaborne)

> jena-text: store original literals
> ----------------------------------
>
> Key: JENA-978
> URL: https://issues.apache.org/jira/browse/JENA-978
> Project: Apache Jena
> Issue Type: Improvement
> Components: Text
> Affects Versions: Jena 2.13.0
> Reporter: Osma Suominen
> Assignee: Osma Suominen
> Fix For: Jena 3.0.0
>
> As discussed on the dev list, this PR implements a feature where it's possible to store the original literal values in the jena-text Lucene index and to access them when querying the index.
[jira] [Resolved] (JENA-978) jena-text: store original literals
[ https://issues.apache.org/jira/browse/JENA-978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Seaborne resolved JENA-978.
--------------------------------
    Resolution: Done
    Fix Version/s: Jena 3.0.0

> jena-text: store original literals
> ----------------------------------
>
> Key: JENA-978
> URL: https://issues.apache.org/jira/browse/JENA-978
> Project: Apache Jena
> Issue Type: Improvement
> Components: Text
> Affects Versions: Jena 2.13.0
> Reporter: Osma Suominen
> Assignee: Osma Suominen
> Fix For: Jena 3.0.0
>
> As discussed on the dev list, this PR implements a feature where it's possible to store the original literal values in the jena-text Lucene index and to access them when querying the index.
[jira] [Commented] (JENA-982) Concurrent Modification Exception on DatasetGraphMem.clear()
[ https://issues.apache.org/jira/browse/JENA-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612981#comment-14612981 ]

ASF subversion and git services commented on JENA-982:
------------------------------------------------------

Commit a3d7609a65606417b4109da201d2aaaf6f87aa2a in jena's branch refs/heads/master from [~andy.seaborne]
[ https://git-wip-us.apache.org/repos/asf?p=jena.git;h=a3d7609 ]

JENA-982 : CME on DatasetGraphMem.clear

> Concurrent Modification Exception on DatasetGraphMem.clear()
> -------------------------------------------------------------
>
> Key: JENA-982
> URL: https://issues.apache.org/jira/browse/JENA-982
> Project: Apache Jena
> Issue Type: Bug
> Components: ARQ
> Affects Versions: Jena 2.13.0
> Reporter: Andy Seaborne
> Assignee: Andy Seaborne
> Fix For: Jena 3.0.0
>
> Report from : [stackoverflow/31188209|http://stackoverflow.com/questions/31188209/jena-concurrentmodificationexception-on-datasetgraph-clear]
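The notification does not say what caused the exception or how the commit fixed it. For readers unfamiliar with this class of bug, the sketch below shows the usual way a clear-style operation ends up throwing ConcurrentModificationException in plain Java: removing entries from a collection while iterating over a live view of it. The map of graph names and the class name are assumptions made for the example, and the snapshot fix shown is only the common remedy, not a description of the Jena change.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of the general fail-fast iterator pattern; not Jena code. */
class ClearWhileIteratingDemo {

    /** Removes entries while iterating the live key set: the iterator fails fast. */
    static void clearUnsafe(Map<String, String> graphs) {
        for (String name : graphs.keySet()) {
            graphs.remove(name); // structural change behind the iterator's back
        }
    }

    /** Iterates over a snapshot of the keys, so removals do not disturb the loop. */
    static void clearSafe(Map<String, String> graphs) {
        for (String name : new ArrayList<>(graphs.keySet())) {
            graphs.remove(name);
        }
    }

    static Map<String, String> sample() {
        Map<String, String> graphs = new HashMap<>();
        graphs.put("g1", "a");
        graphs.put("g2", "b");
        graphs.put("g3", "c");
        return graphs;
    }

    public static void main(String[] args) {
        try {
            clearUnsafe(sample());
        } catch (ConcurrentModificationException e) {
            System.out.println("unsafe clear failed: " + e);
        }
        Map<String, String> graphs = sample();
        clearSafe(graphs);
        System.out.println("safe clear left " + graphs.size() + " entries");
    }
}
```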
Re: Query parameterization.
On 01/07/15 07:17, Claude Warren wrote:
>     SelectBuilder sb = new SelectBuilder()
>         .addVar( "*" )
>         .addWhere( "?s", "?p", "?o" );
>     sb.setVar( Var.alloc( "?o" ), NodeFactory.createURI( "http://xmlns.com/foaf/0.1/Person" ) ) ;
>     Query q = sb.build();

Hi Claude,

Should that be one of

    Var.alloc( "o" )
    Var.alloc(Var.canonical("?o"))

How does it compare to the corner cases in my first message?

There is at least one injection attack: NodeFactory.createURI of

    "http://xmlns.com/foaf/0.1/Person> . ?s ?q <http://example/ns"

Because it is string inclusion, jena-querybuilder needs to do the same checks that ParameterizedSparqlString does for URIs. A check is also needed on literals, but it is a different kind of test.

BTW: and how do I add

    OPTIONAL { ?s q 123 . ?s v ?x . FILTER(?x56) }

?

And for UNION, there seems to be a confusion because it takes a SelectBuilder (a subquery) but that's an SQL-ism, not SPARQL. It seems to cause problems:

    SelectBuilder sb = new SelectBuilder().addVar("*") ;
    sb.addWhere("?s", "?p", "?o") ;

    SelectBuilder sb1 = new SelectBuilder().addVar("*") ;
    sb1.addWhere("?s", "?p", "?o") ;
    sb1.addUnion(sb1) ;

    Query q1 = sb1.build() ;
    String s1 = q1.toString() ;
    System.out.println(s1) ;

I get a stack overflow.

UNION and OPTIONAL are similar - they take graph patterns.

    Andy
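The injection described above works because an IRI string containing the closing angle bracket ends the IRI early when the query is serialized, and the rest of the string is then read as SPARQL. A minimal sketch of the kind of check involved is shown below, assuming only the SPARQL 1.1 grammar rule for IRIREF, which excludes the characters < > " { } | ^ ` \ and anything at or below U+0020. The class and method names are hypothetical, and this does not claim to match the checks ParameterizedSparqlString actually performs.

```java
/** Hypothetical IRI sanity check based on the SPARQL IRIREF production; not Jena code. */
class IriCheckDemo {

    // Characters the IRIREF production does not allow between '<' and '>'.
    private static final String FORBIDDEN = "<>\"{}|^`\\";

    /** True if the string can be written between '<' and '>' without ending the IRI early. */
    static boolean isSafeIriContent(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c <= 0x20 || FORBIDDEN.indexOf(c) >= 0) {
                return false;
            }
        }
        return true;
    }

    /** Formats the string as an IRI reference, or refuses if it could inject syntax. */
    static String asIriRef(String s) {
        if (!isSafeIriContent(s)) {
            throw new IllegalArgumentException("Illegal character in IRI: " + s);
        }
        return "<" + s + ">";
    }

    public static void main(String[] args) {
        System.out.println(asIriRef("http://xmlns.com/foaf/0.1/Person"));
        try {
            asIriRef("http://xmlns.com/foaf/0.1/Person> . ?s ?q <http://example/ns");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Literals need a different test, as the message says: there the concern is escaping quotes and backslashes rather than rejecting characters outright.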
[DRAFT] Jena board report for July 2015
Report from the Apache Jena project [Andy Seaborne]

## Description:

A framework for developing Semantic Web and Linked Data applications in Java.

## Activity:

The project is working towards a major release. The most significant user-visible change is converting to org.apache.jena package names everywhere, replacing the historical names for older code areas.

The project has also seen new contributors, helping clean the code up, adding new functionality and generally discussing the code.

## Issues:

There are no issues requiring board attention at this time.

## PMC/Committership changes:

- Currently 13 committers and 11 PMC members in the project.
- Osma Suominen was added to the PMC on Thu Jun 25 2015
- Osma Suominen was added as a committer on Thu Jun 25 2015

## Releases:

- Last release was 2.13.0 on Fri Mar 13 2015

## Mailing list activity:

- us...@jena.apache.org:
  - 586 subscribers (up 16 in the last 3 months)
  - 485 emails sent to list (663 in previous quarter)
- dev@jena.apache.org:
  - 149 subscribers (down 3 in the last 3 months)
  - 1480 emails sent to list (1148 in previous quarter)

## JIRA activity:

- 67 JIRA tickets created in the last 3 months
- 62 JIRA tickets closed/resolved in the last 3 months
[jira] [Created] (JENA-982) Concurrent Modification Exception on DatasetGraphMem.clear()
Andy Seaborne created JENA-982:
-------------------------------

Summary: Concurrent Modification Exception on DatasetGraphMem.clear()
Key: JENA-982
URL: https://issues.apache.org/jira/browse/JENA-982
Project: Apache Jena
Issue Type: Bug
Components: ARQ
Affects Versions: Jena 2.13.0
Reporter: Andy Seaborne
Assignee: Andy Seaborne
Fix For: Jena 3.0.0

Report from : [stackoverflow/31188209|http://stackoverflow.com/questions/31188209/jena-concurrentmodificationexception-on-datasetgraph-clear]
[jira] [Closed] (JENA-982) Concurrent Modification Exception on DatasetGraphMem.clear()
[ https://issues.apache.org/jira/browse/JENA-982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Seaborne closed JENA-982.
------------------------------
    Resolution: Fixed

> Concurrent Modification Exception on DatasetGraphMem.clear()
> -------------------------------------------------------------
>
> Key: JENA-982
> URL: https://issues.apache.org/jira/browse/JENA-982
> Project: Apache Jena
> Issue Type: Bug
> Components: ARQ
> Affects Versions: Jena 2.13.0
> Reporter: Andy Seaborne
> Assignee: Andy Seaborne
> Fix For: Jena 3.0.0
>
> Report from : [stackoverflow/31188209|http://stackoverflow.com/questions/31188209/jena-concurrentmodificationexception-on-datasetgraph-clear]