Re: Time for Jena 3.2.0 release?

2017-01-22 Thread Claude Warren
Slightly off-topic question: with a more frequent release cycle, are we
going to provide long-term support for particular releases?  I would define
long-term support as providing patches for the long-term versions for a
longer period of time.

The reasoning is that projects that use Jena will then have a stable
platform upon which to build products and can deliver production code safe
in the knowledge that Jena will not be changing out from underneath them.

Claude

On Sun, Jan 22, 2017 at 8:24 PM, Andy Seaborne  wrote:

> It's a little after 3 months since Jena 3.1.1
>
> We have this time:
>
> RDF Connection
>   JENA-1267
>
> Serializable for Quad/triple/Node
>   JENA-1233
>
> JsonLDReader: possibility to override the @context
>   JENA-1279
>   And jsonld-java upgrade to version 0.9.0
>
> jena-spatial - no longer sorts results.
>   Big performance improvement.
>   JENA-1277
>
> What else?
>
> What would the next release be?
>
> If we are moving jena-text/spatial on in the Lucene version, should we
> call the next release 3.3.0?
>
> All JIRA marked as fixed in 3.2.0:
>https://s.apache.org/uhcd
>
> Andy
>
> Adam - are you still interested in being the RM? (If you don't think you
> have time, that's fine.)
>
>


-- 
I like: Like Like - The likeliest place on the web

LinkedIn: http://www.linkedin.com/in/claudewarren


[jira] [Commented] (JENA-1274) Support a writer-per-graph in-memory dataset

2017-01-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15833716#comment-15833716
 ] 

ASF GitHub Bot commented on JENA-1274:
--

Github user ajs6f commented on the issue:

https://github.com/apache/jena/pull/204
  
I'm starting to wonder whether I should back up here and just go for the 
more general 2-phase-locking design as [@afs discusses in the Jira 
ticket](https://issues.apache.org/jira/browse/JENA-1274?focusedCommentId=15809288&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15809288),
 or even to try to generalize to more amorphous locking regions. Doing those 
things would open up more use cases which might make this stuff viable for 
`jena-arq` itself.


> Support a writer-per-graph in-memory dataset
> 
>
> Key: JENA-1274
> URL: https://issues.apache.org/jira/browse/JENA-1274
> Project: Apache Jena
>  Issue Type: Improvement
>  Components: ARQ, Jena
>Reporter: A. Soroka
>Assignee: A. Soroka
>Priority: Minor
>  Labels: ldp, multithreading, named_graphs
>
> Without too much work we could support a writer-per-graph in-memory dataset. 
> The target use case here is LDP-style interaction or other RESTful 
> architectures, where it is normal for updates to occur centered on one 
> resource.
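The writer-per-graph idea described in the ticket can be sketched in plain Java: one read/write lock per named graph, so writers to different graphs never block each other. This is an illustrative sketch only, not the actual PR #204 implementation; all names here are made up.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Sketch: one lock per named graph, so per-graph writers do not contend. */
public class PerGraphLocks {
    private final ConcurrentHashMap<String, ReadWriteLock> locks = new ConcurrentHashMap<>();

    private ReadWriteLock lockFor(String graphName) {
        // Lazily create one lock per graph name.
        return locks.computeIfAbsent(graphName, g -> new ReentrantReadWriteLock());
    }

    /** Run an update while holding only that graph's write lock. */
    public void write(String graphName, Runnable update) {
        ReadWriteLock lock = lockFor(graphName);
        lock.writeLock().lock();
        try {
            update.run();
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Readers of the same graph proceed concurrently with each other. */
    public void read(String graphName, Runnable query) {
        ReadWriteLock lock = lockFor(graphName);
        lock.readLock().lock();
        try {
            query.run();
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        PerGraphLocks store = new PerGraphLocks();
        store.write("http://example.org/g1", () -> System.out.println("updated g1"));
        store.read("http://example.org/g1", () -> System.out.println("queried g1"));
    }
}
```

This matches the LDP-style use case: updates centred on one resource touch one graph's lock only.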



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] jena issue #204: One writable graph per thread/transaction dataset

2017-01-22 Thread ajs6f
Github user ajs6f commented on the issue:

https://github.com/apache/jena/pull/204
  
I'm starting to wonder whether I should back up here and just go for the 
more general 2-phase-locking design as [@afs discusses in the Jira 
ticket](https://issues.apache.org/jira/browse/JENA-1274?focusedCommentId=15809288&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15809288),
 or even to try to generalize to more amorphous locking regions. Doing those 
things would open up more use cases which might make this stuff viable for 
`jena-arq` itself.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (JENA-1275) TransformScopeRename does the wrong thing with FILTER NOT EXISTS

2017-01-22 Thread Andy Seaborne (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15833705#comment-15833705
 ] 

Andy Seaborne commented on JENA-1275:
-

{noformat}
qparse --print=quadopt
{noformat}
{noformat}
(project (?triangles ?openTriplets)
  (project (?openTriplets)
(extend ((?openTriplets ?/.0))
  (group () ((?/.0 (count ?/x)))
(filter (notexists
   (quadpattern (quad  ?/z ?/c 
?/x)))
  (quadpattern
(quad  ?/x ?/a ?/y)
(quad  ?/y ?/b ?/z)
  ))
{noformat}

> TransformScopeRename does the wrong thing with FILTER NOT EXISTS
> 
>
> Key: JENA-1275
> URL: https://issues.apache.org/jira/browse/JENA-1275
> Project: Apache Jena
>  Issue Type: Bug
>  Components: ARQ
>Affects Versions: Jena 3.1.1
>Reporter: Rob Vesse
>Assignee: Andy Seaborne
> Fix For: Jena 3.2.0
>
>
> I have produced the following minimal query from an originally much larger 
> query. When the optimiser is applied to this it incorrectly double-renames the 
> variables inside the {{FILTER NOT EXISTS}} clause, leading to incorrect 
> algebra.
> Query:
> {noformat}
> SELECT ?triangles ?openTriplets
>   {
> {
>   #subQ2: calculate #open-triplets
>   SELECT (COUNT(?x) as ?openTriplets)
>   WHERE {
> ?x ?a ?y .
> ?y ?b ?z .
> FILTER NOT EXISTS {?z ?c ?x}
>   }
> }
>   }
> {noformat}
> Output:
> {noformat}
> (project (?triangles ?openTriplets)
>   (project (?openTriplets)
> (extend ((?openTriplets ?/.0))
>   (group () ((?/.0 (count ?/x)))
> (filter (notexists
>(quadpattern (quad  ?//z ?//c 
> ?//x)))
>   (quadpattern
> (quad  ?/x ?/a ?/y)
> (quad  ?/y ?/b ?/z)
>   ))
> {noformat}
> Note that we apply the quad transformation prior to applying the optimiser. 
> Strangely enough I cannot reproduce the problem using the pure Jena command 
> line tools, i.e. {{qparse}}, although I note from the code that they apply the 
> quad transformation after applying optimisation. This suggests that it is a 
> bug in how TransformScopeRename applies to quad-form algebra.
> I can reproduce it with a unit test like so:
> {noformat}
> @Test
> public void filter_not_exists_scoping_03() {
> //@formatter:off
> Op orig = SSE.parseOp(StrUtils.strjoinNL("(project (?triangles 
> ?openTriplets)",
>"  (project (?openTriplets)",
>"(extend ((?openTriplets ?.0))",
>"  (group () ((?.0 (count ?x)))",
>"(filter (notexists",
>"   (quadpattern (quad 
>  ?z ?c ?x)))",
>"  (quadpattern",
>"(quad 
>  ?x ?a ?y)",
>"(quad 
>  ?y ?b ?z)",
>"  ))"));
> Op expected = SSE.parseOp(StrUtils.strjoinNL("(project (?triangles 
> ?openTriplets)",
> "  (project (?openTriplets)",
> "(extend ((?openTriplets ?/.0))",
> "  (group () ((?/.0 (count ?/x)))",
> "(filter (notexists",
> "   (quadpattern (quad 
>  ?/z ?/c ?/x)))",
> "  (quadpattern",
> "(quad  ?/x ?/a ?/y)",
> "(quad  ?/y ?/b ?/z)",
> "  ))"));
> //@formatter:on
> 
> Op transformed = TransformScopeRename.transform(orig);
> 
> Assert.assertEquals(transformed, expected);
> }
> 
> @Test
> public void filter_not_exists_scoping_04() {
> //@formatter:off
> Op orig = SSE.parseOp(StrUtils.strjoinNL(
>"  (project (?openTriplets)",
>"(extend ((?openTriplets ?.0))",
>"  (group () ((?.0 (count ?x)))",
>"(filter (notexists",
>"   (quadpattern (quad 
>  ?z ?c ?x)))",
>"  (quadpattern",
>"(quad 
>  ?x ?a ?y)",
>"(quad 
>  ?y ?b ?z)",
>"  )"));
> Op expected = SSE.parseOp(StrUtils.strjoinNL(
> "  (project (?openTriple

[jira] [Resolved] (JENA-1275) TransformScopeRename does the wrong thing with FILTER NOT EXISTS

2017-01-22 Thread Andy Seaborne (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne resolved JENA-1275.
-
   Resolution: Fixed
Fix Version/s: Jena 3.2.0

> TransformScopeRename does the wrong thing with FILTER NOT EXISTS
> 
>
> Key: JENA-1275
> URL: https://issues.apache.org/jira/browse/JENA-1275
> Project: Apache Jena
>  Issue Type: Bug
>  Components: ARQ
>Affects Versions: Jena 3.1.1
>Reporter: Rob Vesse
>Assignee: Andy Seaborne
> Fix For: Jena 3.2.0
>
>

Time for Jena 3.2.0 release?

2017-01-22 Thread Andy Seaborne

It's a little after 3 months since Jena 3.1.1

We have this time:

RDF Connection
  JENA-1267

Serializable for Quad/triple/Node
  JENA-1233

JsonLDReader: possibility to override the @context
  JENA-1279
  And jsonld-java upgrade to version 0.9.0

jena-spatial - no longer sorts results.
  Big performance improvement.
  JENA-1277

What else?

What would the next release be?

If we are moving jena-text/spatial on in the Lucene version, should we 
call the next release 3.3.0?


All JIRA marked as fixed in 3.2.0:
   https://s.apache.org/uhcd

Andy

Adam - are you still interested in being the RM? (If you don't think you 
have time, that's fine.)




[jira] [Commented] (JENA-1275) TransformScopeRename does the wrong thing with FILTER NOT EXISTS

2017-01-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15833667#comment-15833667
 ] 

ASF subversion and git services commented on JENA-1275:
---

Commit 555410d9b51e93efbc4f844f03f447563df6963e in jena's branch 
refs/heads/master from [~andy.seaborne]
[ https://git-wip-us.apache.org/repos/asf?p=jena.git;h=555410d ]

JENA-1275 : Process expressions, including 'exists', consistently.


> TransformScopeRename does the wrong thing with FILTER NOT EXISTS
> 
>
> Key: JENA-1275
> URL: https://issues.apache.org/jira/browse/JENA-1275
> Project: Apache Jena
>  Issue Type: Bug
>  Components: ARQ
>Affects Versions: Jena 3.1.1
>Reporter: Rob Vesse
>Assignee: Andy Seaborne
>

[jira] [Commented] (JENA-1275) TransformScopeRename does the wrong thing with FILTER NOT EXISTS

2017-01-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15833668#comment-15833668
 ] 

ASF subversion and git services commented on JENA-1275:
---

Commit 16c3caadd0da223e463d28af98b517419df04904 in jena's branch 
refs/heads/master from [~andy.seaborne]
[ https://git-wip-us.apache.org/repos/asf?p=jena.git;h=16c3caa ]

JENA-1275: Merge commit 'refs/pull/206/head' of github.com:apache/jena

This closes #206.


> TransformScopeRename does the wrong thing with FILTER NOT EXISTS
> 
>
> Key: JENA-1275
> URL: https://issues.apache.org/jira/browse/JENA-1275
> Project: Apache Jena
>  Issue Type: Bug
>  Components: ARQ
>Affects Versions: Jena 3.1.1
>Reporter: Rob Vesse
>Assignee: Andy Seaborne
>

[jira] [Commented] (JENA-1275) TransformScopeRename does the wrong thing with FILTER NOT EXISTS

2017-01-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15833669#comment-15833669
 ] 

ASF GitHub Bot commented on JENA-1275:
--

Github user asfgit closed the pull request at:

https://github.com/apache/jena/pull/206


> TransformScopeRename does the wrong thing with FILTER NOT EXISTS
> 
>
> Key: JENA-1275
> URL: https://issues.apache.org/jira/browse/JENA-1275
> Project: Apache Jena
>  Issue Type: Bug
>  Components: ARQ
>Affects Versions: Jena 3.1.1
>Reporter: Rob Vesse
>Assignee: Andy Seaborne
>

[GitHub] jena pull request #206: JENA-1275 : Transforming Expressions and VarExprList...

2017-01-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/jena/pull/206




Re: JPA contribution

2017-01-22 Thread Andy Seaborne



On 22/01/17 14:21, A. Soroka wrote:

This is a cool project that seems like it would be of use to Jena
users. It raises a question for me about how Jena handles
contributions generally (not specific to this example).

Do we have any policy about how much support must exist from
committers to accept a project? For example, in some other projects
in which I participate, it's necessary for at least two committers to
accept responsibility to maintain a module before it can be accepted,
and if there are ever fewer than that over time, it goes into a
deprecation path that eventuates in it leaving the project. I'm not
arguing for that policy in particular for Jena, just wondering if we
have anything like that, or whether the modules are pruned on an ad
hoc basis.


Good points.  There needs to be activity around a component to keep it 
alive and well.


In addition, if we put everything into a single build process it becomes 
increasingly harder to release and to make changes to the main codebase, 
because everything moves in lock-step.  Two of the recent releases have been 
like that.


I wonder about splitting out "the main build", which we can regularly 
release with fixes, with other parts released separately by the people 
interested in them.


We need to be able to retire parts: jena-maven-tools and jena-csv are 
current examples, as is jena-fuseki1.  This process needn't be fast or 
abrupt but for the long term health of the project, we have to recognize 
that some parts will fade away when no one is interested.



For PA4RDF - what communities are there? User community? Developer 
community?


Andy


Incidentally, in a sort of slow, long term activity, I have some work to 
extract what linked data applications need for XSD datatyping:


https://github.com/afs/xsd4ld

It has the missing types of PA4RDF; it is not tied to Jena at all (no 
dependency).



>
>
> ---
> A. Soroka
>
>> On Jan 21, 2017, at 3:40 AM, Claude Warren  wrote:
>>
>> Greetings,
>>
>> I have a project (PA4RDF) that provides persistence annotations that
>> read/write a Jena graph.
>>
>> It basically turns any RDF subject into an object with the predicates
>> defining the properties of the object.
>>
>> The current implementation can apply the annotations to interfaces,
>> abstract or concrete classes.  It has been used in several projects with
>> different corporate and government owners.
>>
>> I would like to contribute the code and documentation to the Jena project
>> as an "extras" project.  Further information about the project and the code
>> can be found at https://github.com/Claudenw/PA4RDF.
>>
>> Is there any objection to accepting this contribution?
>>
>> Claude
>>
>> --
>> I like: Like Like - The likeliest place on the web
>> 
>> LinkedIn: http://www.linkedin.com/in/claudewarren
>
>
>
>
>


Re: in-memory scale-out with Hazelcast Was: Horizontal scalability and limits of TDB

2017-01-22 Thread A. Soroka
I'm not sure how you are using those filters, Claude, but my idea (not specific 
at all to Hazelcast) would be to use Bloom (or maybe "cuckoo") filters at the 
server that receives a query, or some subtree in a query, and expects to 
distribute subtrees or branches to some other servers. The filters (associated 
with other partitions or servers) might prevent unnecessary network traffic to 
servers that don't have any relevant info. As I understand such schemes, it's 
only when the cost of retrieval is high enough (e.g. network or hard disk or 
_really_ inefficient data representations) that the extra computation involved 
in the filters is worth it.

Is that like what you are doing with your implementation?
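The routing idea above can be sketched with a toy Bloom filter over subject URIs: each partition publishes a filter of what it holds, and the receiving server only forwards work to partitions whose filter admits the subject. This is an assumption-laden sketch (toy filter, made-up partition names), not Jena or Hazelcast code.

```java
import java.util.BitSet;
import java.util.List;
import java.util.Map;

/** Toy Bloom filter used to decide which partitions might hold a subject. */
class ToyBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    ToyBloomFilter(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    // Derive k indexes from two base hashes (Kirsch-Mitzenmacher style).
    private int index(String item, int i) {
        int h1 = item.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9E3779B9;
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String item) {
        for (int i = 0; i < hashes; i++) bits.set(index(item, i));
    }

    /** False means "definitely absent"; true means "possibly present". */
    boolean mightContain(String item) {
        for (int i = 0; i < hashes; i++)
            if (!bits.get(index(item, i))) return false;
        return true;
    }
}

public class QueryRouter {
    /** Ask only the partitions whose filter admits the subject. */
    static List<String> candidatePartitions(Map<String, ToyBloomFilter> filters, String subject) {
        return filters.entrySet().stream()
                .filter(e -> e.getValue().mightContain(subject))
                .map(Map.Entry::getKey)
                .toList();
    }

    public static void main(String[] args) {
        ToyBloomFilter a = new ToyBloomFilter(1 << 16, 3);
        a.add("http://example.org/alice");
        ToyBloomFilter b = new ToyBloomFilter(1 << 16, 3);
        b.add("http://example.org/bob");
        Map<String, ToyBloomFilter> filters = Map.of("server-a", a, "server-b", b);
        // With overwhelming probability only server-a is a candidate for alice.
        System.out.println(candidatePartitions(filters, "http://example.org/alice"));
    }
}
```

This only pays off, as noted above, when skipping a partition saves an expensive network or disk round trip.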

---
A. Soroka


> On Jan 22, 2017, at 1:39 PM, Claude Warren  wrote:
> 
> I have a bloom filter based graph implementation that runs on MySQL, as
> there needs to be a server-side bloom filter check and I could easily
> implement a user-defined function there.
> 
> I also have a new implementation of a bloom filter that uses murmur 128 and
> creates proto-bloom filters that can be expressed as bloom filters of any
> shape.
> 
> Last weekend I implemented a bloom filter join (multi-level bloom filters --
> so different shapes) but I have not tested it at scale and I am not certain
> that it is any better than the standard hash join.  However, it should be
> possible to page out parts of the table so that it can do very large joins
> efficiently.  But that got me wondering: how big are "really large" joins?
> Even with millions of triples, it seems to me that the queries will only
> have thousands of entries to join.
> 
> Claude
> 
> On Sun, Jan 22, 2017 at 4:54 PM, A. Soroka  wrote:
> 
>> Another idea with which I have been playing is to try to scale
>> horizontally but only in-memory:
>> 
>> I could take the one-writable-graph-per-transaction dataset code I've
>> written and replace the ConcurrentHashMap that currently holds the graphs
>> with a Hazelcast [1] distributed map. Naive union graph performance would
>> be awful, but if the workload was chiefly addressing individual graphs and
>> the graphs were large enough, the parallelism might be really worthwhile.
>> 
>> Hazelcast offers per-entry locks [2], so those could be used instead of
>> the lockable graphs I'm using now. It also offers optimistic locking
>> via Map.replace( key, oldValue, newValue ), so I could even imagine
>> offering a switch between "strict mode" in which locks are used and
>> "read-heavy mode" in which it is assumed that the application will prevent
>> contention on individual graphs but that an update could fail if that isn't
>> so.
>> 
>> Hazelcast also offers some support for remote computation at the entries
>> of its distributed maps [3], so it might be possible to distribute
>> findInSpecificNamedGraph() executions (maybe eventually some of the ARQ
>> execution as well?). It also supports a kind of query language [4] that
>> might be used to obtain more efficiency, perhaps by using Bloom filters for
>> graphs, as Claude has discussed before.
>> 
>> All just food for thought, for now.
>> 
>> ---
>> A. Soroka
>> 
>> [1] https://hazelcast.org/
>> [2] http://docs.hazelcast.org/docs/3.7/manual/html-single/
>> index.html#locking-maps
>> [3] http://docs.hazelcast.org/docs/3.7/manual/html-single/
>> index.html#entry-processor
>> [4] http://docs.hazelcast.org/docs/3.7/manual/html-single/
>> index.html#distributed-query
>> 
>>> On Jan 20, 2017, at 8:38 PM, De Gyves  wrote:
>>> 
>>> I'd like to participate in the storage portion of Jena, maybe TDB. As I
>>> have worked many years developing with RDBMS, I like to explore new
>>> horizons of persistence, and graph-based ones seem very promising for my
>>> next projects, so I'd like to use SPARQL and RDF with Jena/TDB and see how
>>> far I can go.
>>> 
>>> So I've spent the last two days exploring the jena-dev mail archives
>>> from August 2015 to January of this year, and found some interesting
>>> threads, such as the development of TDB2, the tests of 100m of BSBM
>>> data, a question of horizontal scaling, and that anything that implements
>>> DatasetGraph can be used for a triple store. Some readings of the Jena docs
>>> include: SPARQL, the RDF API, Txn and TDB transactions.
>>> 
>>> What I am looking for is to get a clear perspective of some requirements
>>> which are taken for granted on a traditional RDBMS. These are:
>>> 
>>> 1. Atomicity, consistency, isolation and durability of a transaction on a
>>> single TDB database: apart from the limitations noted in the documentation
>>> of TDB Transactions and Txn, are there current issues? Edge cases detected
>>> and not yet covered?
>>> 2. Are there currently available strategies to achieve a
>>> horizontally-scaled TDB database?
>>> 3. What do you think of trying to implement horizontal scalability with
>>> DatasetGraph or something else with, let's say, CockroachDB, VoltDB,
>>> PostgreSQL, etc.?
>>> 4. If there are some s
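The "read-heavy mode" quoted above (optimistic commit via Map.replace) can be sketched without Hazelcast, using a plain ConcurrentHashMap as a stand-in for the distributed map. This is a sketch of the idea only; graphs are modelled as immutable string sets rather than Jena graphs.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch of optimistic per-graph updates via compare-and-replace. */
public class OptimisticGraphStore {
    // Stand-in for a distributed map of graph name -> immutable graph snapshot.
    private final ConcurrentHashMap<String, Set<String>> graphs = new ConcurrentHashMap<>();

    public void create(String name) {
        graphs.putIfAbsent(name, Set.of());
    }

    /**
     * Optimistic update: read a snapshot, build a new one, and commit only if
     * the stored value is still the snapshot we read. Returns false if the
     * caller lost the race (or the graph is absent) and must retry.
     */
    public boolean addTriple(String graph, String triple) {
        Set<String> old = graphs.get(graph);
        if (old == null) return false;
        Set<String> updated = new HashSet<>(old);
        updated.add(triple);
        // replace() is the compare-and-swap: it fails on concurrent change.
        return graphs.replace(graph, old, Set.copyOf(updated));
    }

    public Set<String> read(String graph) {
        return graphs.getOrDefault(graph, Set.of());
    }

    public static void main(String[] args) {
        OptimisticGraphStore store = new OptimisticGraphStore();
        store.create("g1");
        boolean ok = store.addTriple("g1", ":s :p :o");
        System.out.println(ok + " " + store.read("g1"));
    }
}
```

Hazelcast's IMap.replace(key, oldValue, newValue) has the same shape, so the "strict mode"/"read-heavy mode" switch mentioned above amounts to choosing between explicit locks and this retry loop.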

Re: subject based graph implementation

2017-01-22 Thread A. Soroka
Well, depending on your queries and how you want to design them, you might be 
able to use an in-memory dataset and put all the triples for one particular 
subject into the named graph with that URI. That would keep things nicely 
partitioned.

For example you could use a DatasetGraphCollection subclass like 
DatasetGraphMapLink, which just uses a HashMap to keep track of graphs that are 
of any implementation you configure, or my (proposed) DatasetGraphGraphPerTxn, 
which implements graphs as persistent maps in order to afford MR+SW (multiple 
readers AND a single writer simultaneously) concurrency for each graph. 

We could maybe alter DatasetGraphMapLink to take its Map of names to graphs as 
a parameter. Then Claude could use a Map implementation with eviction behavior, 
if that is the kind of caching he wants.
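A Map implementation with eviction behavior, as suggested above, could be as simple as an access-ordered LinkedHashMap keyed by subject. A plain-Java sketch follows; the remote fetch function is a hypothetical placeholder, not Jena API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** LRU cache of subject URI -> all triples fetched for that subject. */
public class SubjectCache {
    private final Map<String, List<String>> cache;
    private final Function<String, List<String>> fetchFromRemote;

    public SubjectCache(int maxSubjects, Function<String, List<String>> fetchFromRemote) {
        this.fetchFromRemote = fetchFromRemote;
        // accessOrder=true + removeEldestEntry gives LRU eviction of subjects.
        this.cache = new LinkedHashMap<String, List<String>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, List<String>> eldest) {
                return size() > maxSubjects;
            }
        };
    }

    /** Return all predicate/value pairs for a subject, fetching on a miss. */
    public List<String> triplesFor(String subject) {
        List<String> triples = cache.get(subject);
        if (triples == null) {
            triples = fetchFromRemote.apply(subject);
            cache.put(subject, triples);   // may evict the LRU subject
        }
        return triples;
    }

    public static void main(String[] args) {
        // Hypothetical remote: here it just synthesizes one triple per subject.
        SubjectCache c = new SubjectCache(2, s -> List.of(s + " rdf:type ex:Thing"));
        c.triplesFor("ex:a");
        c.triplesFor("ex:b");
        c.triplesFor("ex:c");            // evicts ex:a (least recently used)
        System.out.println(c.cache.containsKey("ex:a"));  // false: evicted
    }
}
```

A DatasetGraphMapLink taking such a Map as a parameter would give Claude's per-subject cache with eviction for free.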

Maybe you could say more about the use case? Do you want something that 
naturally partitions data by subject because it's the size of transaction you 
are going to be using? Are you going to write queries that involve more than 
one subject?

---
A. Soroka
The University of Virginia Library

> On Jan 22, 2017, at 1:47 PM, Claude Warren  wrote:
> 
> I am wondering if we have a graph implementation that is subject based.
> What I am looking for is a graph cache that is subject based.
> 
> I want to retrieve data from a remote graph and store it in the cache.  I
> will retrieve all the predicates and values for the subject, so a complete
> snapshot of the data.  Then, as queries proceed, when a new subject is
> required I populate the cache.
> 
> My end goal is to be able to take a query, do some data extraction from a
> big remote graph and then answer the query from the smaller in memory
> graph.
> 
> -- 
> I like: Like Like - The likeliest place on the web
> 
> LinkedIn: http://www.linkedin.com/in/claudewarren




subject based graph implementation

2017-01-22 Thread Claude Warren
I am wondering if we have a graph implementation that is subject based.
What I am looking for is a graph cache that is subject based.

I want to retrieve data from a remote graph and store it in the cache. I
will retrieve all the predicates and values for the subject, so a complete
snapshot of the data. Then, as queries proceed, when a new subject is
required I will populate the cache.

My end goal is to be able to take a query, do some data extraction from a
big remote graph and then answer the query from the smaller in memory
graph.

-- 
I like: Like Like - The likeliest place on the web

LinkedIn: http://www.linkedin.com/in/claudewarren


Re: in-memory scale-out with Hazelcast Was: Horizontal scalability and limits of TDB

2017-01-22 Thread Claude Warren
I have a bloom filter based graph implementation that runs on MySQL, as
there needs to be a server-side bloom filter check and I could easily
implement a user-defined function there.

I also have a new bloom filter implementation that uses murmur 128 and
creates proto-bloom filters that can be expressed as bloom filters of any
shape.

Last weekend I implemented a bloom filter join (multi-level bloom filters --
so different shapes), but I have not tested it at scale and I am not certain
that it is any better than the standard hash join.  However, it should be
possible to page out parts of the table so that it can do very large joins
efficiently.  But that got me wondering: how big are "really large" joins?
Even with millions of triples it seems to me that the queries will only
have thousands of entries to join.
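For illustration, a minimal Bloom filter along these lines can be built on the JDK's BitSet using Kirsch-Mitzenmacher double hashing. This `BloomSketch` is an invented stand-in, not the murmur128-based implementation described above:

```java
import java.util.BitSet;

// Minimal Bloom filter: k bit positions per item are derived from two
// base hashes h1 and h2 (double hashing), so only two hash computations
// are needed regardless of k. Illustrative only; a real implementation
// would use a stronger hash such as murmur128, as discussed above.
public class BloomSketch {
    private final BitSet bits;
    private final int m; // number of bits
    private final int k; // number of index positions per item

    public BloomSketch(int m, int k) {
        this.bits = new BitSet(m);
        this.m = m;
        this.k = k;
    }

    private int index(String item, int i) {
        int h1 = item.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9E3779B9; // cheap second hash
        return Math.floorMod(h1 + i * h2, m);
    }

    public void add(String item) {
        for (int i = 0; i < k; i++)
            bits.set(index(item, i));
    }

    // May return a false positive; never a false negative.
    public boolean mightContain(String item) {
        for (int i = 0; i < k; i++)
            if (!bits.get(index(item, i)))
                return false;
        return true;
    }
}
```

The "shape" of such a filter is the pair (m, k); a proto-filter that retains the raw 128-bit hashes, as Claude describes, could be projected into any (m, k) after the fact.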

Claude

On Sun, Jan 22, 2017 at 4:54 PM, A. Soroka  wrote:

> Another idea with which I have been playing is to try to scale
> horizontally but only in-memory:
>
> I could take the one-writable-graph-per-transaction dataset code I've
> written and replace the ConcurrentHashMap that currently holds the graphs
> with a Hazelcast [1] distributed map. Naive union graph performance would
> be awful, but if the workload was chiefly addressing individual graphs and
> the graphs were large enough, the parallelism might be really worthwhile.
>
> Hazelcast offers per-entry locks [2], so those could be used instead of
> the lockable graphs I'm using now. It also offers optimistic locking
> via Map.replace( key, oldValue, newValue ), so I could even imagine
> offering a switch between "strict mode" in which locks are used and
> "read-heavy mode" in which it is assumed that the application will prevent
> contention on individual graphs but that an update could fail if that isn't
> so.
>
> Hazelcast also offers some support for remote computation at the entries
> of its distributed maps [3], so it might be possible to distribute
> findInSpecificNamedGraph() executions (maybe eventually some of the ARQ
> execution as well?). It also supports a kind of query language [4] that
> might be used to obtain more efficiency, perhaps by using Bloom filters for
> graphs, as Claude has discussed before.
>
> All just food for thought, for now.
>
> ---
> A. Soroka
>
> [1] https://hazelcast.org/
> [2] http://docs.hazelcast.org/docs/3.7/manual/html-single/
> index.html#locking-maps
> [3] http://docs.hazelcast.org/docs/3.7/manual/html-single/
> index.html#entry-processor
> [4] http://docs.hazelcast.org/docs/3.7/manual/html-single/
> index.html#distributed-query
>
> > On Jan 20, 2017, at 8:38 PM, De Gyves  wrote:
> >
> > I'd like to participate on the storage portion of Jena, maybe TDB. As I
> > have worked many years developing with RDBMS, I'd like to explore new
> > horizons of persistence, and graph-based ones seem very promising for my
> > next projects, so I'd like to use SPARQL and RDF with Jena/TDB and see
> > how far I can go.
> >
> > So I've spent the last two days exploring the jena-dev mail archives
> > from August 2015 to January of this year and found some interesting
> > threads, such as the development of TDB2, the tests of 100M of BSBM
> > data, a question about horizontal scaling, and the fact that anything
> > that implements DatasetGraph can be used for a triple store. Some
> > readings of the Jena docs include: SPARQL, the RDF API, Txn and TDB
> > transactions.
> >
> > What I am looking for is to get a clear perspective of some requirements
> > which are taken for granted on a traditional RDBMS. These are:
> >
> > 1. Atomicity, consistency, isolation and durability of a transaction on
> > a single TDB database: apart from the limitations in the documentation
> > of TDB Transactions and Txn, are there current issues? Edge cases
> > detected and not yet covered?
> > 2. Are there currently available strategies to achieve a horizontally
> > scaled TDB database?
> > 3. What do you think of trying to implement horizontal scalability with
> > DatasetGraph or something else, with, let's say, CockroachDB, VoltDB,
> > PostgreSQL, etc.?
> > 4. If there are some stress tests available (e.g. I read about a 100M
> > BSBM test), is it included in the src, or may I have a copy of it? I'd
> > like to see what the limits are of the current TDB, and maybe of TDB2:
> > maximum size on disk of a dataset, max number of nodes on a dataset, of
> > models or graphs on a dataset, the limiting behavior of a typical
> > read/write transaction vs. the number of nodes, datasets, etcetera. Or
> > some guidelines, so I can start to create this stress code. Will it be
> > useful to you also?
> >
> > --
> > Víctor-Polo de Gyvés Montero.
> > +52 (55) 4926 9478 (Cellphone in Mexico city)
> > Address: Daniel Delgadillo 7 6A, Agricultura neighborhood, Miguel Hidalgo
> > burough
> > ZIP: 11360, México City.
> >
> > http://degyves.googlepages.com
>
>


-- 
I like: Like Like - The likeliest place on the web

LinkedIn: 

Re: JPA contribution

2017-01-22 Thread Claude Warren
The annotations are not specific to Jena, though the implementation is.

I started trying to apply the standard JPA annotations to Jena but found a
significant impedance mismatch which led me to develop the annotations.

I think you could develop an implementation on Commons RDF but I have not
looked at Commons RDF that closely.
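The general shape of such annotation-driven mapping can be sketched with plain JDK reflection. The annotation and interface names below are invented for illustration; they are not PA4RDF's actual API:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;

// Hypothetical annotation in the spirit of PA4RDF: a getter is tied to
// a predicate URI, and an implementation would resolve calls against
// the triples for the subject in an RDF graph.
public class AnnotationSketch {

    @Retention(RetentionPolicy.RUNTIME)
    public @interface Predicate {
        String uri();
    }

    // An RDF subject exposed as a typed interface.
    public interface Person {
        @Predicate(uri = "http://xmlns.com/foaf/0.1/name")
        String getName();
    }

    // Reflectively read the predicate URI bound to a getter; a dynamic
    // proxy would use this to translate method calls into graph lookups.
    public static String predicateOf(Class<?> type, String method) {
        try {
            Method m = type.getMethod(method);
            return m.getAnnotation(Predicate.class).uri();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

Nothing here touches Jena: only the resolution step (turning a predicate URI plus subject into a value) needs an RDF implementation, which is why a Commons RDF binding seems plausible.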

Claude

On Sun, Jan 22, 2017 at 2:48 PM, Stian Soiland-Reyes 
wrote:

> This is interesting! It seems to have a large overlap with the Juneau
> project in the Incubator, which can map annotated beans from/to various
> formats, including RDF with Jena.
>
> http://juneau.incubator.apache.org/#about.html
>
>
> Sesame in the olden days had something called Alibaba which was very
> similar, but since abandoned, I think it also had such an EntityManager.
>
> Would this module be particularly Jena-specific? I would think PA4RDF could
> work well with Commons RDF and then be useful without having to pick an RDF
> implementation.
>
> (except I guess you use Jena for mapping to native types?).
>
>
> On 21 Jan 2017 8:40 am, "Claude Warren"  wrote:
>
> Greetings,
>
> I have a project (PA4RDF) that provides persistence annotations that
> read/write a Jena graph.
>
> It basically turns any RDF subject into an object with the predicates
> defining the properties of the object.
>
> The current implementation can apply the annotations to interfaces,
> abstract or concrete classes.  It has been used in several projects with
> different corporate and government owners.
>
> I would like to contribute the code and documentation to the Jena project
> as an "extras" project.  Further information about the project and the code
> can be found at https://github.com/Claudenw/PA4RDF.
>
> Is there any objection to accepting this contribution?
>
> Claude
>
> --
> I like: Like Like - The likeliest place on the web
> 
> LinkedIn: http://www.linkedin.com/in/claudewarren
>



-- 
I like: Like Like - The likeliest place on the web

LinkedIn: http://www.linkedin.com/in/claudewarren


in-memory scale-out with Hazelcast Was: Horizontal scalability and limits of TDB

2017-01-22 Thread A. Soroka
Another idea with which I have been playing is to try to scale horizontally but 
only in-memory:

I could take the one-writable-graph-per-transaction dataset code I've written 
and replace the ConcurrentHashMap that currently holds the graphs with a 
Hazelcast [1] distributed map. Naive union graph performance would be awful, 
but if the workload was chiefly addressing individual graphs and the graphs 
were large enough, the parallelism might be really worthwhile.

Hazelcast offers per-entry locks [2], so those could be used instead of the 
lockable graphs I'm using now. It also offers optimistic locking via 
Map.replace( key, oldValue, newValue ), so I could even imagine offering a 
switch between "strict mode" in which locks are used and "read-heavy mode" in 
which it is assumed that the application will prevent contention on individual 
graphs but that an update could fail if that isn't so.
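The optimistic "read-heavy mode" can be sketched with the JDK's own ConcurrentHashMap.replace, which has the same compare-and-set semantics as the Hazelcast call. Class and method names here are invented for illustration:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

// Sketch of "read-heavy mode": apply an update with
// replace(key, oldValue, newValue) and report failure instead of
// blocking, leaving retry policy to the application. G stands in for
// an immutable graph snapshot (replace compares via equals).
public class OptimisticGraphStore<G> {
    private final ConcurrentHashMap<String, G> graphs = new ConcurrentHashMap<>();

    public void create(String name, G initial) {
        graphs.put(name, initial);
    }

    /** Returns true iff no other writer slipped in between read and replace. */
    public boolean tryUpdate(String name, UnaryOperator<G> mutation) {
        G current = graphs.get(name);
        if (current == null)
            return false;
        G updated = mutation.apply(current);
        // Succeeds only if the entry still holds the value we read.
        return graphs.replace(name, current, updated);
    }
}
```

Swapping the ConcurrentHashMap for a Hazelcast IMap would distribute the same pattern, since IMap also exposes replace(key, oldValue, newValue).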

Hazelcast also offers some support for remote computation at the entries of its 
distributed maps [3], so it might be possible to distribute 
findInSpecificNamedGraph() executions (maybe eventually some of the ARQ 
execution as well?). It also supports a kind of query language [4] that might 
be used to obtain more efficiency, perhaps by using Bloom filters for graphs, 
as Claude has discussed before. 

All just food for thought, for now.

---
A. Soroka

[1] https://hazelcast.org/
[2] 
http://docs.hazelcast.org/docs/3.7/manual/html-single/index.html#locking-maps
[3] 
http://docs.hazelcast.org/docs/3.7/manual/html-single/index.html#entry-processor
[4] 
http://docs.hazelcast.org/docs/3.7/manual/html-single/index.html#distributed-query

> On Jan 20, 2017, at 8:38 PM, De Gyves  wrote:
> 
> I'd like to participate on the storage portion of Jena, maybe TDB. As I
> have worked many years developing with RDBMS, I'd like to explore new
> horizons of persistence, and graph-based ones seem very promising for my
> next projects, so I'd like to use SPARQL and RDF with Jena/TDB and see
> how far I can go.
> 
> So I've spent the last two days exploring the jena-dev mail archives from
> August 2015 to January of this year and found some interesting threads,
> such as the development of TDB2, the tests of 100M of BSBM data, a
> question about horizontal scaling, and the fact that anything that
> implements DatasetGraph can be used for a triple store. Some readings of
> the Jena docs include: SPARQL, the RDF API, Txn and TDB transactions.
> 
> What I am looking for is to get a clear perspective of some requirements
> which are taken for granted on a traditional RDBMS. These are:
> 
> 1. Atomicity, consistency, isolation and durability of a transaction on a
> single TDB database: apart from the limitations in the documentation of
> TDB Transactions and Txn, are there current issues? Edge cases detected
> and not yet covered?
> 2. Are there currently available strategies to achieve a horizontally
> scaled TDB database?
> 3. What do you think of trying to implement horizontal scalability with
> DatasetGraph or something else, with, let's say, CockroachDB, VoltDB,
> PostgreSQL, etc.?
> 4. If there are some stress tests available (e.g. I read about a 100M
> BSBM test), is it included in the src, or may I have a copy of it? I'd
> like to see what the limits are of the current TDB, and maybe of TDB2:
> maximum size on disk of a dataset, max number of nodes on a dataset, of
> models or graphs on a dataset, the limiting behavior of a typical
> read/write transaction vs. the number of nodes, datasets, etcetera. Or
> some guidelines, so I can start to create this stress code. Will it be
> useful to you also?
> 
> -- 
> Víctor-Polo de Gyvés Montero.
> +52 (55) 4926 9478 (Cellphone in Mexico city)
> Address: Daniel Delgadillo 7 6A, Agricultura neighborhood, Miguel Hidalgo
> burough
> ZIP: 11360, México City.
> 
> http://degyves.googlepages.com



Re: Horizontal scalability and limits of TDB

2017-01-22 Thread A. Soroka
First, to your specific questions:

> 1. Atomicity, consistency, isolation and durability of a transaction on a 
> single tdb database: Apart from the limitations on the documentation of TDB 
> Transactions and Txn,  there are current issues? edge cases detected and not 
> yet covered?

I'm not really sure what we mean by "consistency" once we go beyond a single 
writer. Without a schema and therefore without any understanding of data 
dependencies within the database, it's not clear to me how we can automatically 
understand when a state is consistent. It seems we have to leave that to the 
applications, for the most part. I'm very interested myself in ways we could 
"hint" to a triplestore the data dependencies we want it to understand (perhaps 
something like OWL/ICV), but that's not really a scaling issue.

I've recently been investigating the possibility of lock regions more granular 
than a whole dataset:

https://github.com/apache/jena/pull/204

for the special case of named graphs as the lock regions. We discussed this 
about a year ago when Claude Warren (Jena committer/PMC) made up some designs 
for discussion:

https://lists.apache.org/thread.html/916eed68e9847c6f4c0330fecff8b6f416a27344f2d995400e834562@1451744303@%3Cdev.jena.apache.org%3E

and there is a _lot_ more to be thought about there. 

Jena uses threads as stand-ins for transactions, and there is definitely work 
to be done to separate those ideas so that more than one thread can participate 
in a transaction and so that transactions can be managed independently of 
threading and low-level concurrency. That would be a pretty major change in the 
codebase, but Andy has been making some moves that will help set that up by 
changing from a single class being transactional to several types together 
composing a transactional thing.
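As a rough illustration of the distinction, a transaction modelled as an explicit handle, rather than keyed off the current thread, might look like this. This is purely a sketch; none of these names are Jena API:

```java
import java.util.concurrent.atomic.AtomicLong;

// A transaction as an explicit, passable handle: any thread holding the
// handle can act within the transaction, and its lifecycle is managed
// independently of which thread created it. Contrast with keying
// transaction state off Thread.currentThread(), as described above.
public class TxnHandle {
    private static final AtomicLong IDS = new AtomicLong();

    public enum State { ACTIVE, COMMITTED, ABORTED }

    private final long id = IDS.incrementAndGet();
    private volatile State state = State.ACTIVE;

    public long id() { return id; }
    public State state() { return state; }

    public synchronized void commit() {
        if (state != State.ACTIVE)
            throw new IllegalStateException("transaction " + id + " not active");
        state = State.COMMITTED;
    }

    public synchronized void abort() {
        if (state != State.ACTIVE)
            throw new IllegalStateException("transaction " + id + " not active");
        state = State.ABORTED;
    }
}
```

With such a handle, a worker pool could service one transaction cooperatively, and a coordinator could commit or abort it from any thread, which is exactly what the thread-as-transaction model rules out.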

> 2. Are there currently available strategies to achieve a horizontal-scaled 
> tdb database?

I'll let Andy speak to this, but I know of none (and I would very much like to!).

> 3. What do you think of try to implement a horizontal scalability with 
> DatasetGraph or something else with, let's say, cockroachdb, voltdb, 
> postgresql, etc?

See Claude's reply about Cassandra. Claude's is not the only work with 
Cassandra for RDF. There is also:

https://github.com/cumulusrdf/cumulusrdf

but that does not seem to be a very active project.

> 4. If there are some stress tests available, e.g. I read about a 100M of BSBM 
> test, is it included in the src? or may I have a copy of it? ... Or, some 
> guidelines, so I can start to create this stress code. Will it be useful to 
> you also?

You will definitely want to know about the work Rob Vesse (Jena committer/PMC) 
has done on this front:

https://github.com/rvesse/sparql-query-bm

Modeling workloads for triplestores, in general, is hard because people use 
them in so many different ways. Also knowing (say) the maximum number of nodes 
you could put in a dataset might not help you very much if the query time for 
that dataset with your queries isn't what you need. That's not to discourage 
you from working on this problem, just to point out that there is a lot of 
subtlety to even defining and scoping the problem well. It seems to me that 
most famous benchmarks for RDF stores take up a particular system of use cases 
and model that.

Otherwise: I've been thinking about scale-out for Jena for a while, too. 
Particularly I've been inspired by some of the advanced ideas being worked on 
in RDFox and TriAD [1], [2], and Andy pointed out this [3] blog post from the 
folks working on the closed-source product Stardog.

In fact, I was about to write some questions to the list (particularly Andy) 
about how we might start thinking about working in ARQ to split queries to 
partitions in different nodes, perhaps using summary graphs to avoid sending 
BGPs where they aren't going to find results or even using metadata at the 
branching nodes of the query tree to do cost accounting and results cardinality 
bounding. It seems we could at least get basic partitioning with enough time to 
work on it (he wrote blithely!). We might use something like Apache Zookeeper 
to manage the partitions and nodes and help figure out where to send different 
branches of the query. TriAD and RDFox are using clever ways of letting 
different paths through the query slip asynchronously against each other, but 
that seems to me like a bridge too far at first. Just getting a distributed 
approach basically working and giving correct results would be a great start! 
:grin: 

---
A. Soroka

[1] https://www.cs.ox.ac.uk/ian.horrocks/Publications/download/2016/PoMH16a.pdf
[2] http://adrem.ua.ac.be/~tmartin/Gurajada-Sigmod14.pdf
[3] http://blog.stardog.com/how-to-read-stardog-query-plans/

> On Jan 20, 2017, at 8:38 PM, De Gyves  wrote:
> 
> I'd like to participate on the storage portion of Jena, maybe TDB. As I
> have worked many years developing with RDBMS, I'd like to explore new
> horizons of persisten

Re: JPA contribution

2017-01-22 Thread Stian Soiland-Reyes
This is interesting! It seems to have a large overlap with the Juneau
project in the Incubator, which can map annotated beans from/to various
formats, including RDF with Jena.

http://juneau.incubator.apache.org/#about.html


Sesame in the olden days had something called Alibaba which was very
similar, but since abandoned, I think it also had such an EntityManager.

Would this module be particularly Jena-specific? I would think PA4RDF could
work well with Commons RDF and then be useful without having to pick an RDF
implementation.

(except I guess you use Jena for mapping to native types?).


On 21 Jan 2017 8:40 am, "Claude Warren"  wrote:

Greetings,

I have a project (PA4RDF) that provides persistence annotations that
read/write a Jena graph.

It basically turns any RDF subject into an object with the predicates
defining the properties of the object.

The current implementation can apply the annotations to interfaces,
abstract or concrete classes.  It has been used in several projects with
different corporate and government owners.

I would like to contribute the code and documentation to the Jena project
as an "extras" project.  Further information about the project and the code
can be found at https://github.com/Claudenw/PA4RDF.

Is there any objection to accepting this contribution?

Claude

--
I like: Like Like - The likeliest place on the web

LinkedIn: http://www.linkedin.com/in/claudewarren


Re: JPA contribution

2017-01-22 Thread A. Soroka
This is a cool project that seems like it would be of use to Jena users. It 
raises a question for me about how Jena handles contributions generally (not 
specific to this example).

Do we have any policy about how much support must exist from committers to 
accept a project? For example, in some other projects in which I participate, 
it's necessary for at least two committers to accept responsibility to maintain 
a module before it can be accepted, and if there are ever fewer than that over 
time, it goes into a deprecation path that eventuates in it leaving the 
project. I'm not arguing for that policy in particular for Jena, just wondering 
if we have anything like that, or whether the modules are pruned on an ad hoc 
basis.


---
A. Soroka

> On Jan 21, 2017, at 3:40 AM, Claude Warren  wrote:
> 
> Greetings,
> 
> I have a project (PA4RDF) that provides persistence annotations that
> read/write a Jena graph.
> 
> It basically turns any RDF subject into an object with the predicates
> defining the properties of the object.
> 
> The current implementation can apply the annotations to interfaces,
> abstract or concrete classes.  It has been used in several projects with
> different corporate and government owners.
> 
> I would like to contribute the code and documentation to the Jena project
> as an "extras" project.  Further information about the project and the code
> can be found at https://github.com/Claudenw/PA4RDF.
> 
> Is there any objection to accepting this contribution?
> 
> Claude
> 
> -- 
> I like: Like Like - The likeliest place on the web
> 
> LinkedIn: http://www.linkedin.com/in/claudewarren