Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-08-08 Thread Stephen Mallette
I thought I had brought attention to this on a different thread somewhere
along the line, but note that the sparql-gremlin PR has been out there long
enough now that it can be merged on my +1 plus lazy consensus:

https://github.com/apache/tinkerpop/pull/902

I'll leave it open a bit longer in case anyone has any final comments, but
this branch has been hanging out there for a long time now so I don't think
there should be much surprise as to what is going on.


Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-04-19 Thread Stephen Mallette
I think I'm going to rebase sparql-gremlin on master so that it's in line to
be part of 3.4.0 when we go to release that. Please let me know if there
are any concerns - if not I'll probably do that first thing next week.


Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-02-07 Thread Harsh Thakkar
No worries, I am on it :)



Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-02-07 Thread Stephen Mallette
Ok - sounds like we are basically on the same page then. I hate to
volunteer you for work :) but I think you are the best person to write up
the capabilities and limitations of sparql-gremlin. I think that if we have
those documented we can more easily decide the appropriate level of
testing, so imo, doing that documentation is the next step. I think you
should just expand what I wrote on sparql-gremlin here:

https://github.com/apache/tinkerpop/blob/TINKERPOP-1878/docs/src/reference/transpilers.asciidoc

and provide a PR for that. Is that a good next step?


Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-02-07 Thread Harsh Thakkar
Hi Stephen,

Having more than one variable inside a GROUP BY or an ORDER BY clause is a
problem on its own, to be honest. Responding to your question about the query:

```
SELECT ?age ?name (COUNT(?name) AS ?name_count)
WHERE {
  ?a e:created ?b .
  ?a v:name ?name .
  ?a v:age ?age .
}
GROUP BY ?age ?name
```

Ideally, SPARQL groups first by ?age and then, within each ?age value, groups
further by ?name. However, to the best of my knowledge, no specific ordering
is followed unless the user specifies one.

In the Gremlin translation or in Gremlin itself (I am not sure yet where the
problem lies), the values are grouped first by ?name and then regrouped (or
rearranged) by ?age. That is why only the grouping on the last variable is
visible.
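
To make the intended semantics concrete, here is a minimal sketch in plain
Python (not the transpiler's code). The rows are made up and the helper
`group_count` is hypothetical; it only illustrates that grouping by several
variables means grouping by the combination of their values, and that keeping
just the last variable produces different groups.

```python
from collections import Counter

# Made-up solution rows, standing in for bindings of ?age and ?name.
rows = [
    {"age": 29, "name": "marko"},
    {"age": 35, "name": "marko"},
    {"age": 32, "name": "josh"},
    {"age": 32, "name": "josh"},
]


def group_count(rows, keys):
    """Count rows per distinct combination of the given key variables."""
    return dict(Counter(tuple(row[k] for k in keys) for row in rows))


# GROUP BY ?age ?name: one group per (age, name) combination.
by_age_and_name = group_count(rows, ["age", "name"])

# "Last variable wins": the two marko rows collapse into a single group.
by_last_variable = group_count(rows, ["name"])

print(by_age_and_name)
print(by_last_variable)
```

Swapping the key order to ["name", "age"] changes only how each key is written,
not which rows end up together.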

Regarding errors and exception reporting: yes, we should state very clearly in
the documentation of the plugin what is and is not feasible at the current
stage of the plugin. For instance, SPARQL has functions that are very specific
to SPARQL, such as the isIRI() filter, which checks whether a particular
variable is bound to a URI or IRI (basically a URL). I do not think this has a
corresponding operator in Gremlin, since Gremlin operates on property graphs
and not RDF graphs, so there is no need for one.

We should indeed state in the readme (documentation) what is not supported, so
that we do not have an outcry of third-party users complaining later that this
doesn't work and that doesn't work. I also agree on having nicely handled
exceptions. Nothing gives more pain than ugly crashing code. :)



Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-02-06 Thread Stephen Mallette
Thanks for your review of my concern with the transpiling for GROUP.  So
there are two aspects to your reply that I'd like to discuss. First, the
specific issue with GROUP that I'm seeing is that it's simply choosing the
last variable given in the GROUP:

https://github.com/apache/tinkerpop/blob/74b568a8babb8b52b790767e7bb05f462dc5c5f0/sparql-gremlin/src/main/java/org/apache/tinkerpop/gremlin/sparql/SparqlToGremlinTranspiler.java#L139-L141

so when you do this (which i think is legitimate SPARQL - i'm still
learning):

SELECT ?age ?name (COUNT(?name) AS ?name_count)
WHERE {
  ?a e:created ?b .
  ?a v:name ?name .
  ?a v:age ?age .
}
GROUP BY ?age ?name

you will get back a grouping on "name" and if you transpose the variables
in that last line to:

GROUP BY ?name ?age

then you get a grouping on "age".  That doesn't seem right to me. Now, that
would lead me to my second issue I'd want to bring up. Perhaps
GROUP with multiple variables is something we don't support yet. Perhaps
there's a long line of other SPARQL capabilities that we aren't quite ready
to provide full transpiling for. Moreover, perhaps there are certain SPARQL
statements that just aren't possible to translate to Gremlin at all. I
think we need to do something smart in those cases. We don't want
situations like the one I presented in GROUP where it transpiles to Gremlin
but doesn't really accomplish what the intention of the SPARQL query was. I
feel like we need to do several things with respect to this:

1. If we can't transpile the SPARQL, we throw an
UnsupportedOperationException with a nice error message that says why the
user's SPARQL didn't transpile (i.e. what don't we support that they tried
to pass through)
2. We document the boundaries of what we do support and what our
limitations are.
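
A rough sketch of point 1, written in Python for brevity (the plugin itself is
Java, where the exception would be UnsupportedOperationException). The feature
names, the SUPPORTED_FEATURES set and check_supported are hypothetical and do
not reflect what sparql-gremlin actually supports:

```python
# Hypothetical set of features; the real supported set is not stated here.
SUPPORTED_FEATURES = {"SELECT", "WHERE", "FILTER", "ORDER BY", "LIMIT"}


class UnsupportedSparqlError(NotImplementedError):
    """Raised when a query uses a feature that cannot be transpiled."""


def check_supported(features_used):
    """Fail fast, naming every unsupported feature found in the query."""
    unsupported = sorted(set(features_used) - SUPPORTED_FEATURES)
    if unsupported:
        raise UnsupportedSparqlError(
            "Cannot transpile query: unsupported SPARQL feature(s): "
            + ", ".join(unsupported)
        )


# A supported query passes silently.
check_supported(["SELECT", "WHERE"])

# An unsupported one is rejected with a message naming the feature,
# instead of being transpiled into Gremlin that does something else.
try:
    check_supported(["SELECT", "WHERE", "isIRI"])
except UnsupportedSparqlError as err:
    print(err)
```

The point of the sketch is only the behaviour: refuse loudly and say why,
rather than silently producing a traversal with different semantics.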

Any thoughts on all that?

> Dharmen and I did check your corrections and comments in the code. We
> found them appropriate.

That's good to know. Thanks.





Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-02-06 Thread Harsh Thakkar
Hi Stephen,

Apologies for being quiet for some time. I have been down with severe flu and
just recovered. I looked into the order by issue, and the reason for having
only an aggregation variable in the select clause lies in SPARQL itself. SPARQL
does not support projecting any variable other than the ones used in group by.
One could write such a SPARQL query; however, it would be incorrect and could
not be parsed by any SPARQL query processor.

For instance, 

 select ?unitOnOrder
  where {
  ?a v:label "product" .
  ?a v:name ?name .
  ?a v:unitsOnOrder ?unitOnOrder .
  } GROUP BY (?unitOnOrder)

the above query will be valid and return an appropriate response, whereas:

 select ?name
  where {
  ?a v:label "product" .
  ?a v:name ?name .
  ?a v:unitsOnOrder ?unitOnOrder .
  } GROUP BY (?unitOnOrder)

OR

 select ?unitOnOrder ?name
  where {
  ?a v:label "product" .
  ?a v:name ?name .
  ?a v:unitsOnOrder ?unitOnOrder .
  } GROUP BY (?unitOnOrder)


would throw the following exception => 
org.apache.jena.query.QueryParseException: Non-group key variable in SELECT
This is specific to Jena, but any other SPARQL processor would throw a similar
exception while processing, as such a query is beyond the formal definition of
the SPARQL query language.
Thus, I believe we will have to live with it.
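
The rule behind that exception can be sketched in a few lines of plain Python.
This is not Jena's implementation; the helper non_group_key_variables and its
parameters are hypothetical, and it only checks the simple case discussed here
(plain variables in SELECT, plain variables in GROUP BY, optional aggregate
aliases):

```python
def non_group_key_variables(select_vars, group_by_vars, aggregate_aliases=()):
    """Return the projected variables that a grouped query may not project.

    In a grouped query, a projected variable must be either a group key or
    the alias of an aggregate such as (COUNT(?x) AS ?n).
    """
    allowed = set(group_by_vars) | set(aggregate_aliases)
    return [v for v in select_vars if v not in allowed]


# The three queries above, reduced to their SELECT and GROUP BY variables.
print(non_group_key_variables(["unitOnOrder"], ["unitOnOrder"]))
print(non_group_key_variables(["name"], ["unitOnOrder"]))
print(non_group_key_variables(["unitOnOrder", "name"], ["unitOnOrder"]))
```

An empty result means the projection is acceptable; any variable returned is
one a processor would reject as a non-group key variable.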

What else is happening on your side? Please let me know where I can get in and 
be of help. Dharmen and I did check your corrections and comments in the code. 
We found them appropriate. 
We will continue to go through and add more comments if need be, especially me.
I haven't been able to work much on this due to ill health. But I am back now.


On 2018/01/29 16:09:50, Stephen Mallette  wrote: 
> > SPARQL 1.1 test suite could be used
> 
> thanks josh - will need to look into that further
> 
> Harsh, the plugin is pushed at this point. After building we can now do:
> 
> gremlin> :install org.apache.tinkerpop sparql-gremlin 3.3.2-SNAPSHOT
> ==>Loaded: [org.apache.tinkerpop, sparql-gremlin, 3.3.2-SNAPSHOT]
> gremlin> :plugin use tinkerpop.sparql
> ==>tinkerpop.sparql activated
> 
> so that's good. i also added a bit of asciidoc for sparql-gremlin. you can
> append whatever documentation you write to that. i will probably step away
> from this branch for a bit to work on other things and give you a chance to
> get familiar with the changes i've pushed so far.
> 
> On Sun, Jan 28, 2018 at 11:21 AM, Joshua Shinavier 
> wrote:
> 
> > For testing, perhaps the SPARQL 1.1 test suite could be used:
> >
> > https://www.w3.org/2009/sparql/docs/tests
> >
> > This would provide a strong guarantee of coverage and correctness of
> > supported features. The metadata about required features for individual
> > tests is limited, so an appropriate subset of the test cases would need to
> > be hand-picked according to that coverage.
> >
> >
> >
> > On Sun, Jan 28, 2018 at 7:20 AM, Stephen Mallette 
> > wrote:
> >
> > > Harsh, on Friday, I pushed a great many changes to the TINKERPOP-1878
> > > branch. I got quite familiar with the code and even fixed up a bug in
> > ORDER
> > > where it wasn't properly handling multiple fields passed to it. I believe
> > > that there are similar problems in GROUP. At this point, I've got most of
> > > what Marko suggested in place, but I'm not especially happy with it - the
> > > java generics are giving me a hard time...kinda ugly. I've added a
> > > reasonable level of javadoc and inline comments, but you may want to
> > review
> > > and include more if I've missed something or mis-stated some intent. The
> > > test suite is still a bit of a mystery to me. I'm not quite sure how we
> > > should go about dealing with that. At this point, I'm basically using
> > code
> > > coverage as a guide to drive the tests written, but I'm not sure if
> > that's
> > > effective in the big picture. I don't have a plugin working at this
> > point.
> > > I only got all that stuff i tweeted to work in the console because I
> > > installed it all by hand manually. That still needs to be done.
> > >
> > > Here's my suggestion for how we proceed forward:
> > >
> > > 1. You start by pulling my latest changes to your fork for review. I
> > > changed a lot of things - renaming, refactoring, removing dead code, etc.
> > > You should get familiar with what's there and let me know if I did
> > anything
> > > dumb.
> > > 2. Perhaps you look at the issue I think that I see with GROUP (which is
> > > basically identical to ORDER in that it only accepts the last field as a
> > > GROUPing...i don't think that's right).
> > > 3. Perhaps you could also think about writing some documentation that
> > > explains the support TinkerPop has for SPARQL - describe the aspects of
> > > SPARQL that we support 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-01-29 Thread Stephen Mallette
> SPARQL 1.1 test suite could be used

thanks josh - will need to look into that further

Harsh, the plugin is pushed at this point. After building we can now do:

gremlin> :install org.apache.tinkerpop sparql-gremlin 3.3.2-SNAPSHOT
==>Loaded: [org.apache.tinkerpop, sparql-gremlin, 3.3.2-SNAPSHOT]
gremlin> :plugin use tinkerpop.sparql
==>tinkerpop.sparql activated

so that's good. i also added a bit of asciidoc for sparql-gremlin. you can
append whatever documentation you write to that. i will probably step away
from this branch for a bit to work on other things and give you a chance to
get familiar with the changes i've pushed so far.

On Sun, Jan 28, 2018 at 11:21 AM, Joshua Shinavier 
wrote:

> For testing, perhaps the SPARQL 1.1 test suite could be used:
>
> https://www.w3.org/2009/sparql/docs/tests
>
> This would provide a strong guarantee of coverage and correctness of
> supported features. The metadata about required features for individual
> tests is limited, so an appropriate subset of the test cases would need to
> be hand-picked according to that coverage.
>
>
>
> On Sun, Jan 28, 2018 at 7:20 AM, Stephen Mallette 
> wrote:
>
> > Harsh, on Friday, I pushed a great many changes to the TINKERPOP-1878
> > branch. I got quite familiar with the code and even fixed up a bug in
> ORDER
> > where it wasn't properly handling multiple fields passed to it. I believe
> > that there are similar problems in GROUP. At this point, I've got most of
> > what Marko suggested in place, but I'm not especially happy with it - the
> > java generics are giving me a hard time...kinda ugly. I've added a
> > reasonable level of javadoc and inline comments, but you may want to
> review
> > and include more if I've missed something or mis-stated some intent. The
> > test suite is still a bit of a mystery to me. I'm not quite sure how we
> > should go about dealing with that. At this point, I'm basically using
> code
> > coverage as a guide to drive the tests written, but I'm not sure if
> that's
> > effective in the big picture. I don't have a plugin working at this
> point.
> > I only got all that stuff i tweeted to work in the console because I
> > installed it all by hand manually. That still needs to be done.
> >
> > Here's my suggestion for how we proceed forward:
> >
> > 1. You start by pulling my latest changes to your fork for review. I
> > changed a lot of things - renaming, refactoring, removing dead code, etc.
> > You should get familiar with what's there and let me know if I did
> anything
> > dumb.
> > 2. Perhaps you look at the issue I think that I see with GROUP (which is
> > basically identical to ORDER in that it only accepts the last field as a
> > GROUPing...i don't think that's right).
> > 3. Perhaps you could also think about writing some documentation that
> > explains the support TinkerPop has for SPARQL - describe the aspects of
> > SPARQL that we support and any limitations that we have in that support.
> > 4. I will work on the plugin and get that working on early this coming
> > week.
> > 5. I will also keep thinking about testing - i still don't think that the
> > approach I have is sufficient. If you have ideas about that, please let
> me
> > know.
> >
> > How does that sound?
> >
> > btw, note that i had to do a bit of trickery to get the sparql-gremlin
> > stuff to work in the console for that screenshot i posted on twitter.
> > obviously, without the plugin things don't work too easily. i had to
> > manually install all the dependencies to the console to get all that to
> > work. again, that should be resolved early this coming week and then it
> can
> > be easily imported to the console and server.
> >
> >
> >
> >
> > On Thu, Jan 25, 2018 at 4:58 PM, Stephen Mallette 
> > wrote:
> >
> > > Marko had a nice idea with:
> > >
> > > gremlin> sparql = graph.traversal(SPARQLTraversalStrategy.class)
> > > .withRemote(“127.0.0.2”)
> > > gremlin> sparql.query(“SELECT ?x ?y WHERE {…}”).toList()
> > > ==>{?x:marko, ?y:29}
> > > ==>{?x:josh, ?y:32}
> > >
> > > The problem i'm seeing is that it requires that the TraversalSource on
> > the
> > > server be a SparqlTraversalSource because when it gets to the server it
> > > ends up trying to deserialize the bytecode into a GraphTraversalSource.
> > > Now, that's exactly how a DSL would work, but a DSL would start with an
> > > existing start step such as V() or E(), but not constant() which is
> what
> > > SparqlTraversalSource is sending with the sparql query in it. I might
> be
> > > not thinking of something right in how he expected to implement it,
> but I
> > > came up with a reasonably simple workaround - I added an empty inject()
> > > step before the constant() so that the GraphTraversalSource will be
> used.
> > > Both of these steps will be wholly replaced by the transpiled traversal
> > > when the SparqlStrategy executes and we thus get:
> > >
> > > gremlin> graph = 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-01-28 Thread Joshua Shinavier
For testing, perhaps the SPARQL 1.1 test suite could be used:

https://www.w3.org/2009/sparql/docs/tests

This would provide a strong guarantee of coverage and correctness of
supported features. The metadata about required features for individual
tests is limited, so an appropriate subset of the test cases would need to
be hand-picked according to that coverage.



On Sun, Jan 28, 2018 at 7:20 AM, Stephen Mallette 
wrote:

> Harsh, on Friday, I pushed a great many changes to the TINKERPOP-1878
> branch. I got quite familiar with the code and even fixed up a bug in ORDER
> where it wasn't properly handling multiple fields passed to it. I believe
> that there are similar problems in GROUP. At this point, I've got most of
> what Marko suggested in place, but I'm not especially happy with it - the
> java generics are giving me a hard time...kinda ugly. I've added a
> reasonable level of javadoc and inline comments, but you may want to review
> and include more if I've missed something or mis-stated some intent. The
> test suite is still a bit of a mystery to me. I'm not quite sure how we
> should go about dealing with that. At this point, I'm basically using code
> coverage as a guide to drive the tests written, but I'm not sure if that's
> effective in the big picture. I don't have a plugin working at this point.
> I only got all that stuff i tweeted to work in the console because I
> installed it all by hand manually. That still needs to be done.
>
> Here's my suggestion for how we proceed forward:
>
> 1. You start by pulling my latest changes to your fork for review. I
> changed a lot of things - renaming, refactoring, removing dead code, etc.
> You should get familiar with what's there and let me know if I did anything
> dumb.
> 2. Perhaps you look at the issue I think that I see with GROUP (which is
> basically identical to ORDER in that it only accepts the last field as a
> GROUPing...i don't think that's right).
> 3. Perhaps you could also think about writing some documentation that
> explains the support TinkerPop has for SPARQL - describe the aspects of
> SPARQL that we support and any limitations that we have in that support.
> 4. I will work on the plugin and get that working on early this coming
> week.
> 5. I will also keep thinking about testing - i still don't think that the
> approach I have is sufficient. If you have ideas about that, please let me
> know.
>
> How does that sound?
>
> btw, note that i had to do a bit of trickery to get the sparql-gremlin
> stuff to work in the console for that screenshot i posted on twitter.
> obviously, without the plugin things don't work too easily. i had to
> manually install all the dependencies to the console to get all that to
> work. again, that should be resolved early this coming week and then it can
> be easily imported to the console and server.
>
>
>
>
> On Thu, Jan 25, 2018 at 4:58 PM, Stephen Mallette 
> wrote:
>
> > Marko had a nice idea with:
> >
> > gremlin> sparql = graph.traversal(SPARQLTraversalStrategy.class)
> > .withRemote(“127.0.0.2”)
> > gremlin> sparql.query(“SELECT ?x ?y WHERE {…}”).toList()
> > ==>{?x:marko, ?y:29}
> > ==>{?x:josh, ?y:32}
> >
> > The problem i'm seeing is that it requires that the TraversalSource on
> the
> > server be a SparqlTraversalSource because when it gets to the server it
> > ends up trying to deserialize the bytecode into a GraphTraversalSource.
> > Now, that's exactly how a DSL would work, but a DSL would start with an
> > existing start step such as V() or E(), but not constant() which is what
> > SparqlTraversalSource is sending with the sparql query in it. I might be
> > not thinking of something right in how he expected to implement it, but I
> > came up with a reasonably simple workaround - I added an empty inject()
> > step before the constant() so that the GraphTraversalSource will be used.
> > Both of these steps will be wholly replaced by the transpiled traversal
> > when the SparqlStrategy executes and we thus get:
> >
> > gremlin> graph = EmptyGraph.instance()
> > ==>emptygraph[empty]
> > gremlin> cluster = Cluster.open()
> > ==>localhost/127.0.0.1:8182
> > gremlin> g = graph.traversal(SparqlTraversalSource.class).
> > ..1> withStrategies(SparqlStrategy.instance()).
> > ..2> withRemote(DriverRemoteConnection.using(
> cluster))
> > ==>sparqltraversalsource[emptygraph[empty], standard]
> > gremlin> g.sparql("SELECT ?name ?age WHERE { ?person v:name ?name .
> > ?person v:age ?age }")
> > ==>[name:marko,age:29]
> > ==>[name:vadas,age:27]
> > ==>[name:josh,age:32]
> > ==>[name:peter,age:35]
> >
> > Treating sparql-gremlin as a DSL really seems like the best way to get
> > this all working - especially since it already is! :)  To get the same
> > pattern going with GLVs we would only need to make use of the DSL
> patterns
> > which already exist. Anyway, it's nice to have these basic premises
> nailed
> > down in 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-01-28 Thread Stephen Mallette
Harsh, on Friday, I pushed a great many changes to the TINKERPOP-1878
branch. I got quite familiar with the code and even fixed up a bug in ORDER
where it wasn't properly handling multiple fields passed to it. I believe
that there are similar problems in GROUP. At this point, I've got most of
what Marko suggested in place, but I'm not especially happy with it - the
java generics are giving me a hard time...kinda ugly. I've added a
reasonable level of javadoc and inline comments, but you may want to review
and include more if I've missed something or mis-stated some intent. The
test suite is still a bit of a mystery to me. I'm not quite sure how we
should go about dealing with that. At this point, I'm basically using code
coverage as a guide to drive the tests written, but I'm not sure if that's
effective in the big picture. I don't have a plugin working at this point;
that still needs to be done. I only got all that stuff i tweeted to work in
the console because I installed it all by hand.

Here's my suggestion for how we proceed:

1. You start by pulling my latest changes to your fork for review. I
changed a lot of things - renaming, refactoring, removing dead code, etc.
You should get familiar with what's there and let me know if I did anything
dumb.
2. Perhaps you look at the issue I think that I see with GROUP (which is
basically identical to ORDER in that it only accepts the last field as a
GROUPing...i don't think that's right).
3. Perhaps you could also think about writing some documentation that
explains the support TinkerPop has for SPARQL - describe the aspects of
SPARQL that we support and any limitations that we have in that support.
4. I will work on the plugin and get that working early this coming week.
5. I will also keep thinking about testing - i still don't think that the
approach I have is sufficient. If you have ideas about that, please let me
know.

How does that sound?

btw, note that i had to do a bit of trickery to get the sparql-gremlin
stuff to work in the console for that screenshot i posted on twitter.
obviously, without the plugin things don't work too easily. i had to
manually install all the dependencies to the console to get all that to
work. again, that should be resolved early this coming week and then it can
be easily imported to the console and server.
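
On the ORDER and GROUP point above (only the last field being honored), here is a small stdlib-only Java sketch of how that kind of bug can arise and what the chained behavior looks like. It is not the sparql-gremlin code; the class, the field names "age" and "rank", and the sample rows are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only; this is not the sparql-gremlin ORDER/GROUP code. */
final class MultiKeyOrderSketch {

    /** Builds one result row with two hypothetical fields, "age" and "rank". */
    static Map<String, Integer> row(int age, int rank) {
        Map<String, Integer> r = new HashMap<>();
        r.put("age", age);
        r.put("rank", rank);
        return r;
    }

    /** Buggy shape: every key overwrites the comparator, so only the last key counts. */
    static Comparator<Map<String, Integer>> lastKeyOnly(List<String> keys) {
        Comparator<Map<String, Integer>> cmp = (a, b) -> 0;
        for (String key : keys) {
            cmp = Comparator.comparing((Map<String, Integer> r) -> r.get(key));
        }
        return cmp;
    }

    /** Intended shape: keys are chained, so later keys only break ties. */
    static Comparator<Map<String, Integer>> allKeys(List<String> keys) {
        Comparator<Map<String, Integer>> cmp = (a, b) -> 0;
        for (String key : keys) {
            cmp = cmp.thenComparing((Map<String, Integer> r) -> r.get(key));
        }
        return cmp;
    }

    /** Sorts a copy of the rows and renders each one as "age/rank". */
    static List<String> sorted(List<Map<String, Integer>> rows,
                               Comparator<Map<String, Integer>> cmp) {
        List<Map<String, Integer>> copy = new ArrayList<>(rows);
        copy.sort(cmp);
        List<String> out = new ArrayList<>();
        for (Map<String, Integer> r : copy) {
            out.add(r.get("age") + "/" + r.get("rank"));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> rows = Arrays.asList(row(30, 2), row(20, 3), row(30, 1));
        List<String> keys = Arrays.asList("age", "rank");
        System.out.println(sorted(rows, lastKeyOnly(keys))); // [30/1, 30/2, 20/3]
        System.out.println(sorted(rows, allKeys(keys)));     // [20/3, 30/1, 30/2]
    }
}
```

The same chaining idea would apply to GROUP with several fields, although how that maps onto Gremlin steps is a separate question.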




On Thu, Jan 25, 2018 at 4:58 PM, Stephen Mallette 
wrote:

> Marko had a nice idea with:
>
> gremlin> sparql = graph.traversal(SPARQLTraversalStrategy.class)
> .withRemote(“127.0.0.2”)
> gremlin> sparql.query(“SELECT ?x ?y WHERE {…}”).toList()
> ==>{?x:marko, ?y:29}
> ==>{?x:josh, ?y:32}
>
> The problem i'm seeing is that it requires that the TraversalSource on the
> server be a SparqlTraversalSource because when it gets to the server it
> ends up trying to deserialize the bytecode into a GraphTraversalSource.
> Now, that's exactly how a DSL would work, but a DSL would start with an
> existing start step such as V() or E(), but not constant() which is what
> SparqlTraversalSource is sending with the sparql query in it. I might be
> not thinking of something right in how he expected to implement it, but I
> came up with a reasonably simple workaround - I added an empty inject()
> step before the constant() so that the GraphTraversalSource will be used.
> Both of these steps will be wholly replaced by the transpiled traversal
> when the SparqlStrategy executes and we thus get:
>
> gremlin> graph = EmptyGraph.instance()
> ==>emptygraph[empty]
> gremlin> cluster = Cluster.open()
> ==>localhost/127.0.0.1:8182
> gremlin> g = graph.traversal(SparqlTraversalSource.class).
> ..1> withStrategies(SparqlStrategy.instance()).
> ..2> withRemote(DriverRemoteConnection.using(cluster))
> ==>sparqltraversalsource[emptygraph[empty], standard]
> gremlin> g.sparql("SELECT ?name ?age WHERE { ?person v:name ?name .
> ?person v:age ?age }")
> ==>[name:marko,age:29]
> ==>[name:vadas,age:27]
> ==>[name:josh,age:32]
> ==>[name:peter,age:35]
>
> Treating sparql-gremlin as a DSL really seems like the best way to get
> this all working - especially since it already is! :)  To get the same
> pattern going with GLVs we would only need to make use of the DSL patterns
> which already exist. Anyway, it's nice to have these basic premises nailed
> down in code to ensure the ideas were sound. Please let me know if you have
> any thoughts
>
>
>
> On Thu, Jan 25, 2018 at 2:37 PM, Stephen Mallette 
> wrote:
>
>> Check this out:
>>
>> gremlin> graph = TinkerFactory.createModern()
>> ==>tinkergraph[vertices:6 edges:6]
>> gremlin> g = graph.traversal(SparqlTraversalSource.class).
>> ..1>   withStrategies(SparqlStrategy.instance())
>> ==>sparqltraversalsource[tinkergraph[vertices:6 edges:6], standard]
>> gremlin> g.sparql("SELECT ?name ?age WHERE { ?person v:name ?name .
>> ?person v:age ?age }")
>> ==>[name:marko,age:29]
>> 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-01-25 Thread Stephen Mallette
Marko had a nice idea with:

gremlin> sparql = graph.traversal(SPARQLTraversalStrategy.class)
.withRemote("127.0.0.2")
gremlin> sparql.query("SELECT ?x ?y WHERE {…}").toList()
==>{?x:marko, ?y:29}
==>{?x:josh, ?y:32}

The problem i'm seeing is that it requires that the TraversalSource on the
server be a SparqlTraversalSource because when it gets to the server it
ends up trying to deserialize the bytecode into a GraphTraversalSource.
Now, that's exactly how a DSL would work, except that a DSL would start with
an existing start step such as V() or E(), not constant(), which is what
SparqlTraversalSource is sending with the sparql query in it. I might be
missing something in how he expected to implement it, but I
came up with a reasonably simple workaround - I added an empty inject()
step before the constant() so that the GraphTraversalSource will be used.
Both of these steps will be wholly replaced by the transpiled traversal
when the SparqlStrategy executes and we thus get:

gremlin> graph = EmptyGraph.instance()
==>emptygraph[empty]
gremlin> cluster = Cluster.open()
==>localhost/127.0.0.1:8182
gremlin> g = graph.traversal(SparqlTraversalSource.class).
..1> withStrategies(SparqlStrategy.instance()).
..2> withRemote(DriverRemoteConnection.using(cluster))
==>sparqltraversalsource[emptygraph[empty], standard]
gremlin> g.sparql("SELECT ?name ?age WHERE { ?person v:name ?name . ?person v:age ?age }")
==>[name:marko,age:29]
==>[name:vadas,age:27]
==>[name:josh,age:32]
==>[name:peter,age:35]

Treating sparql-gremlin as a DSL really seems like the best way to get this
all working - especially since it already is! :)  To get the same pattern
going with GLVs we would only need to make use of the DSL patterns which
already exist. Anyway, it's nice to have these basic premises nailed down
in code to ensure the ideas were sound. Please let me know if you have any
thoughts.
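
To make the inject()/constant() workaround a bit more concrete, here is a toy Java model of the idea. It is only a sketch under heavy assumptions: traversals are modeled as plain lists of step names, the class and method names are invented, and the "transpiled" steps are stand-in strings rather than what sparql-gremlin actually emits. None of this is TinkerPop API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the placeholder rewrite; hypothetical names, not TinkerPop API. */
final class PlaceholderRewriteSketch {

    /** Client side: an empty inject() start step, then constant(query) carrying the SPARQL text. */
    static List<String> placeholder(String sparql) {
        return new ArrayList<>(Arrays.asList("inject()", "constant(" + sparql + ")"));
    }

    /** Strategy side: if the traversal has the placeholder shape, swap both steps out. */
    static List<String> applyStrategy(List<String> steps, List<String> transpiled) {
        boolean isPlaceholder = steps.size() == 2
                && steps.get(0).equals("inject()")
                && steps.get(1).startsWith("constant(");
        return isPlaceholder ? new ArrayList<>(transpiled) : steps;
    }

    public static void main(String[] args) {
        List<String> sent = placeholder("SELECT ?name WHERE { ?person v:name ?name }");
        // Stand-in for whatever the transpiler would actually produce.
        List<String> transpiled = Arrays.asList("match(...)", "select(name)");
        System.out.println(applyStrategy(sent, transpiled));                        // [match(...), select(name)]
        System.out.println(applyStrategy(Arrays.asList("V()", "out()"), transpiled)); // [V(), out()]
    }
}
```

The point of the model is just that the two placeholder steps exist only to get the query string to the server in a form a GraphTraversalSource accepts; once the strategy runs, neither of them remains.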



On Thu, Jan 25, 2018 at 2:37 PM, Stephen Mallette 
wrote:

> Check this out:
>
> gremlin> graph = TinkerFactory.createModern()
> ==>tinkergraph[vertices:6 edges:6]
> gremlin> g = graph.traversal(SparqlTraversalSource.class).
> ..1>   withStrategies(SparqlStrategy.instance())
> ==>sparqltraversalsource[tinkergraph[vertices:6 edges:6], standard]
> gremlin> g.sparql("SELECT ?name ?age WHERE { ?person v:name ?name .
> ?person v:age ?age }")
> ==>[name:marko,age:29]
> ==>[name:vadas,age:27]
> ==>[name:josh,age:32]
> ==>[name:peter,age:35]
>
> The work is horribly hacked together at the moment and I've not pushed it
> to the development branch yet, but that's the general idea for how
> gremlin-sparql will be used based on what we talked about earlier in this
> thread. Pretty neat?
>
>
>
> On Wed, Jan 24, 2018 at 3:38 PM, Stephen Mallette 
> wrote:
>
>> I just wanted to quickly note that sparql-gremlin is now building
>> properly on the TINKERPOP-1878 branch (i just pushed some changes to clean
>> up some pom.xml/dependency conflicts issues). As we discussed in this
>> thread, the branch currently contains a fairly bare bones model and it will
>> need some work to get it complete enough for it to be considered for merge
>> to a release branch. In a way that's good, because it will give the
>> community a chance to shape exactly how sparql-gremlin will work.
>>
>> On Tue, Jan 9, 2018 at 10:47 AM, Harsh Thakkar  wrote:
>>
>>> Hi Stephen,
>>>
>>> It does make sense to me. The work is going on slow but steady. Let's
>>> wait and see how other devs feel about this, as you said.
>>>
>>> Cheers,
>>> Harsh
>>> On 2018-01-09 16:31, Stephen Mallette  wrote:
>>> > I've had some thoughts on this thread since December. Since
>>> sparql-gremlin
>>> > has a pretty long to-do list and there is likely a lot of discussion
>>> > required on this list prior to it being ready for merge to a release
>>> > branch, it seems like we might treat this as a normal feature under
>>> > development. I think we should just merge it to a development branch
>>> in the
>>> > TinkerPop repository and then collaborate on it from there. We've taken
>>> > similar approaches with other "long term" pull requests which has
>>> allowed
>>> > the code to develop as it would typically would. I'm thinking that's a
>>> > better approach than a "big-bang" pull request.
>>> >
>>> > Harsh, if that's ok with you, feel free to issue your PR against
>>> master and
>>> > I'll get it setup against a development branch on our end (no rush,
>>> please
>>> > give it a few days to see if everyone is ok with that approach).
>>> >
>>> > On Mon, Dec 18, 2017 at 5:16 PM, Stephen Mallette <
>>> spmalle...@gmail.com>
>>> > wrote:
>>> >
>>> > > > Should I also remove the northwind file?
>>> > >
>>> > > I think I'd prefer to see all of our sparql examples use the
>>> existing toy
>>> > > graphs - better not to add more options - so I'd remove it as well.
>>> If
>>> > > 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-01-25 Thread Stephen Mallette
Check this out:

gremlin> graph = TinkerFactory.createModern()
==>tinkergraph[vertices:6 edges:6]
gremlin> g = graph.traversal(SparqlTraversalSource.class).
..1>   withStrategies(SparqlStrategy.instance())
==>sparqltraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.sparql("SELECT ?name ?age WHERE { ?person v:name ?name . ?person v:age ?age }")
==>[name:marko,age:29]
==>[name:vadas,age:27]
==>[name:josh,age:32]
==>[name:peter,age:35]

The work is horribly hacked together at the moment and I've not pushed it
to the development branch yet, but that's the general idea for how
sparql-gremlin will be used based on what we talked about earlier in this
thread. Pretty neat?



On Wed, Jan 24, 2018 at 3:38 PM, Stephen Mallette 
wrote:

> I just wanted to quickly note that sparql-gremlin is now building properly
> on the TINKERPOP-1878 branch (i just pushed some changes to clean up some
> pom.xml/dependency conflicts issues). As we discussed in this thread, the
> branch currently contains a fairly bare bones model and it will need some
> work to get it complete enough for it to be considered for merge to a
> release branch. In a way that's good, because it will give the community a
> chance to shape exactly how sparql-gremlin will work.
>
> On Tue, Jan 9, 2018 at 10:47 AM, Harsh Thakkar  wrote:
>
>> Hi Stephen,
>>
>> It does make sense to me. The work is going on slow but steady. Let's
>> wait and see how other devs feel about this, as you said.
>>
>> Cheers,
>> Harsh
>> On 2018-01-09 16:31, Stephen Mallette  wrote:
>> > I've had some thoughts on this thread since December. Since
>> sparql-gremlin
>> > has a pretty long to-do list and there is likely a lot of discussion
>> > required on this list prior to it being ready for merge to a release
>> > branch, it seems like we might treat this as a normal feature under
>> > development. I think we should just merge it to a development branch in
>> the
>> > TinkerPop repository and then collaborate on it from there. We've taken
>> > similar approaches with other "long term" pull requests which has
>> allowed
>> > the code to develop as it would typically would. I'm thinking that's a
>> > better approach than a "big-bang" pull request.
>> >
>> > Harsh, if that's ok with you, feel free to issue your PR against master
>> and
>> > I'll get it setup against a development branch on our end (no rush,
>> please
>> > give it a few days to see if everyone is ok with that approach).
>> >
>> > On Mon, Dec 18, 2017 at 5:16 PM, Stephen Mallette > >
>> > wrote:
>> >
>> > > > Should I also remove the northwind file?
>> > >
>> > > I think I'd prefer to see all of our sparql examples use the existing
>> toy
>> > > graphs - better not to add more options - so I'd remove it as well. If
>> > > anyone disagrees, I don't really feel too strongly about not
>> including it,
>> > > but it would be good to hear some reasoning as to why the existing
>> datasets
>> > > that we already package are insufficient for users to learn with.
>> > >
>> > > >  will need some help (quite possibly) with getting things right as
>> far
>> > > as the DSL pattern for the gremlin language variants is concerned.
>> > >
>> > > We can help point you in the right direction when you get stuck or
>> need to
>> > > clarify things. If you get really stuck, we can move to step 2 and
>> have you
>> > > issue a PR sooner than later and we'll just merge what you have to a
>> > > development branch so others can collaborate with you on it more
>> easily.
>> > > Let's see how things develop.
>> > >
>> > > > Also, since you are very well versed in the test suite, I would also
>> > > request some assistance for the same when we are there :) as it is our
>> > > first time pushing a work to the production level. So bear with us :)
>> > >
>> > > no worries. i will need to think on the testing approach. my thinking
>> will
>> > > be focused on what i would call integration tests i.e. tests that
>> evaluate
>> > > sparql-gremlin across the entire stack. I don't imagine that you need
>> my
>> > > input to write some unit tests to validate the workings of your
>> current
>> > > code though.
>> > >
>> > > > One question, though there is not a strict deadline, when is the
>> 3.3.2
>> > > release planned?
>> > >
>> > > We have no timeline on 3.3.2 at this point (we are just in the
>> process of
>> > > releasing 3.3.1 so it will be a while before we see 3.3.2). I think
>> the
>> > > merging of gremlin-javascript will likely trigger that release, i
>> would
>> > > guess no earlier than February 2018 if all goes right with that. I
>> also
>> > > don't mean to make it sound like sparql-gremlin needs to be part of
>> that
>> > > release, so if it's not ready then, it's not ready and it releases
>> with
>> > > 3.3.3. You'll find that with TinkerPop, we tend to release when
>> software is
>> > > "ready" and not by setting long range 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-01-24 Thread Stephen Mallette
I just wanted to quickly note that sparql-gremlin is now building properly
on the TINKERPOP-1878 branch (i just pushed some changes to clean up some
pom.xml/dependency conflict issues). As we discussed in this thread, the
branch currently contains a fairly bare bones model and it will need some
work to get it complete enough for it to be considered for merge to a
release branch. In a way that's good, because it will give the community a
chance to shape exactly how sparql-gremlin will work.

On Tue, Jan 9, 2018 at 10:47 AM, Harsh Thakkar  wrote:

> Hi Stephen,
>
> It does make sense to me. The work is going on slow but steady. Let's wait
> and see how other devs feel about this, as you said.
>
> Cheers,
> Harsh
> On 2018-01-09 16:31, Stephen Mallette  wrote:
> > I've had some thoughts on this thread since December. Since
> sparql-gremlin
> > has a pretty long to-do list and there is likely a lot of discussion
> > required on this list prior to it being ready for merge to a release
> > branch, it seems like we might treat this as a normal feature under
> > development. I think we should just merge it to a development branch in
> the
> > TinkerPop repository and then collaborate on it from there. We've taken
> > similar approaches with other "long term" pull requests which has allowed
> > the code to develop as it would typically would. I'm thinking that's a
> > better approach than a "big-bang" pull request.
> >
> > Harsh, if that's ok with you, feel free to issue your PR against master
> and
> > I'll get it setup against a development branch on our end (no rush,
> please
> > give it a few days to see if everyone is ok with that approach).
> >
> > On Mon, Dec 18, 2017 at 5:16 PM, Stephen Mallette 
> > wrote:
> >
> > > > Should I also remove the northwind file?
> > >
> > > I think I'd prefer to see all of our sparql examples use the existing
> toy
> > > graphs - better not to add more options - so I'd remove it as well. If
> > > anyone disagrees, I don't really feel too strongly about not including
> it,
> > > but it would be good to hear some reasoning as to why the existing
> datasets
> > > that we already package are insufficient for users to learn with.
> > >
> > > >  will need some help (quite possibly) with getting things right as
> far
> > > as the DSL pattern for the gremlin language variants is concerned.
> > >
> > > We can help point you in the right direction when you get stuck or
> need to
> > > clarify things. If you get really stuck, we can move to step 2 and
> have you
> > > issue a PR sooner than later and we'll just merge what you have to a
> > > development branch so others can collaborate with you on it more
> easily.
> > > Let's see how things develop.
> > >
> > > > Also, since you are very well versed in the test suite, I would also
> > > request some assistance for the same when we are there :) as it is our
> > > first time pushing a work to the production level. So bear with us :)
> > >
> > > no worries. i will need to think on the testing approach. my thinking
> will
> > > be focused on what i would call integration tests i.e. tests that
> evaluate
> > > sparql-gremlin across the entire stack. I don't imagine that you need
> my
> > > input to write some unit tests to validate the workings of your current
> > > code though.
> > >
> > > > One question, though there is not a strict deadline, when is the
> 3.3.2
> > > release planned?
> > >
> > > We have no timeline on 3.3.2 at this point (we are just in the process
> of
> > > releasing 3.3.1 so it will be a while before we see 3.3.2). I think the
> > > merging of gremlin-javascript will likely trigger that release, i would
> > > guess no earlier than February 2018 if all goes right with that. I also
> > > don't mean to make it sound like sparql-gremlin needs to be part of
> that
> > > release, so if it's not ready then, it's not ready and it releases with
> > > 3.3.3. You'll find that with TinkerPop, we tend to release when
> software is
> > > "ready" and not by setting long range time deadlines for ourselves. So,
> > > don't worry about when we release sparql-gremlin too much. Let's stay
> > > focused on just getting the code right.
> > >
> > > Thanks for your understanding.
> > >
> > >
> > >
> > >
> > > On Mon, Dec 18, 2017 at 5:01 PM, Harsh Thakkar 
> wrote:
> > >
> > >> Hello Stephen,
> > >>
> > >> Alright, I will remove the bsbm file from the repository and I refer
> to
> > >> it in the docs (with some examples) sharing a link to download from
> the
> > >> website if that is acceptable. No worries.
> > >> Should I also remove the northwind file?
> > >>
> > >>
> > >> Your expectations are reasonable, it was just that I wasn't very clear
> > >> about what needs to be done. Now it is pretty much clear. It will
> take some
> > >> time for me to wrap my head around the specifics of the tinkerpop
> codebase
> > >> in order to satisfy the 3 requirements. I 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-01-09 Thread Harsh Thakkar
Hi Stephen,

It does make sense to me. The work is going slowly but steadily. Let's wait and 
see how other devs feel about this, as you said.

Cheers,
Harsh
On 2018-01-09 16:31, Stephen Mallette  wrote: 
> I've had some thoughts on this thread since December. Since sparql-gremlin
> has a pretty long to-do list and there is likely a lot of discussion
> required on this list prior to it being ready for merge to a release
> branch, it seems like we might treat this as a normal feature under
> development. I think we should just merge it to a development branch in the
> TinkerPop repository and then collaborate on it from there. We've taken
> similar approaches with other "long term" pull requests which has allowed
> the code to develop as it typically would. I'm thinking that's a
> better approach than a "big-bang" pull request.
> 
> Harsh, if that's ok with you, feel free to issue your PR against master and
> I'll get it setup against a development branch on our end (no rush, please
> give it a few days to see if everyone is ok with that approach).
> 
> On Mon, Dec 18, 2017 at 5:16 PM, Stephen Mallette 
> wrote:
> 
> > > Should I also remove the northwind file?
> >
> > I think I'd prefer to see all of our sparql examples use the existing toy
> > graphs - better not to add more options - so I'd remove it as well. If
> > anyone disagrees, I don't really feel too strongly about not including it,
> > but it would be good to hear some reasoning as to why the existing datasets
> > that we already package are insufficient for users to learn with.
> >
> > >  will need some help (quite possibly) with getting things right as far
> > as the DSL pattern for the gremlin language variants is concerned.
> >
> > We can help point you in the right direction when you get stuck or need to
> > clarify things. If you get really stuck, we can move to step 2 and have you
> > issue a PR sooner than later and we'll just merge what you have to a
> > development branch so others can collaborate with you on it more easily.
> > Let's see how things develop.
> >
> > > Also, since you are very well versed in the test suite, I would also
> > request some assistance for the same when we are there :) as it is our
> > first time pushing a work to the production level. So bear with us :)
> >
> > no worries. i will need to think on the testing approach. my thinking will
> > be focused on what i would call integration tests i.e. tests that evaluate
> > sparql-gremlin across the entire stack. I don't imagine that you need my
> > input to write some unit tests to validate the workings of your current
> > code though.
> >
> > > One question, though there is not a strict deadline, when is the 3.3.2
> > release planned?
> >
> > We have no timeline on 3.3.2 at this point (we are just in the process of
> > releasing 3.3.1 so it will be a while before we see 3.3.2). I think the
> > merging of gremlin-javascript will likely trigger that release, i would
> > guess no earlier than February 2018 if all goes right with that. I also
> > don't mean to make it sound like sparql-gremlin needs to be part of that
> > release, so if it's not ready then, it's not ready and it releases with
> > 3.3.3. You'll find that with TinkerPop, we tend to release when software is
> > "ready" and not by setting long range time deadlines for ourselves. So,
> > don't worry about when we release sparql-gremlin too much. Let's stay
> > focused on just getting the code right.
> >
> > Thanks for your understanding.
> >
> >
> >
> >
> > On Mon, Dec 18, 2017 at 5:01 PM, Harsh Thakkar  wrote:
> >
> >> Hello Stephen,
> >>
> >> Alright, I will remove the bsbm file from the repository and I refer to
> >> it in the docs (with some examples) sharing a link to download from the
> >> website if that is acceptable. No worries.
> >> Should I also remove the northwind file?
> >>
> >>
> >> Your expectations are reasonable, it was just that I wasn't very clear
> >> about what needs to be done. Now it is pretty much clear. It will take some
> >> time for me to wrap my head around the specifics of the tinkerpop codebase
> >> in order to satisfy the 3 requirements. I will need some help (quite
> >> possibly) with getting things right as far as the DSL pattern for the
> >> gremlin language variants is concerned. I am already reading the dev-docs
> >> on this, from here:
> >> http://tinkerpop.apache.org/docs/current/reference/#dsl
> >>
> >> Also, since you are very well versed in the test suite, I would also
> >> request some assistance for the same when we are there :) as it is our
> >> first time pushing a work to the production level. So bear with us :)
> >>
> >> I agree with you on not having any API shifts, this does certainly not
> >> give a good impression, also its a lot of effort down the drain. Quality
> >> must be ensured.
> >>
> >> One question, though there is not a strict deadline, when is the 3.3.2
> >> release planned?
> 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-01-09 Thread Stephen Mallette
I've had some thoughts on this thread since December. Since sparql-gremlin
has a pretty long to-do list and there is likely a lot of discussion
required on this list prior to it being ready for merge to a release
branch, it seems like we might treat this as a normal feature under
development. I think we should just merge it to a development branch in the
TinkerPop repository and then collaborate on it from there. We've taken
similar approaches with other "long term" pull requests which has allowed
the code to develop as it typically would. I'm thinking that's a
better approach than a "big-bang" pull request.

Harsh, if that's ok with you, feel free to issue your PR against master and
I'll get it setup against a development branch on our end (no rush, please
give it a few days to see if everyone is ok with that approach).

On Mon, Dec 18, 2017 at 5:16 PM, Stephen Mallette 
wrote:

> > Should I also remove the northwind file?
>
> I think I'd prefer to see all of our sparql examples use the existing toy
> graphs - better not to add more options - so I'd remove it as well. If
> anyone disagrees, I don't really feel too strongly about not including it,
> but it would be good to hear some reasoning as to why the existing datasets
> that we already package are insufficient for users to learn with.
>
> >  will need some help (quite possibly) with getting things right as far
> as the DSL pattern for the gremlin language variants is concerned.
>
> We can help point you in the right direction when you get stuck or need to
> clarify things. If you get really stuck, we can move to step 2 and have you
> issue a PR sooner than later and we'll just merge what you have to a
> development branch so others can collaborate with you on it more easily.
> Let's see how things develop.
>
> > Also, since you are very well versed in the test suite, I would also
> request some assistance for the same when we are there :) as it is our
> first time pushing a work to the production level. So bear with us :)
>
> no worries. i will need to think on the testing approach. my thinking will
> be focused on what i would call integration tests i.e. tests that evaluate
> sparql-gremlin across the entire stack. I don't imagine that you need my
> input to write some unit tests to validate the workings of your current
> code though.
>
> > One question, though there is not a strict deadline, when is the 3.3.2
> release planned?
>
> We have no timeline on 3.3.2 at this point (we are just in the process of
> releasing 3.3.1 so it will be a while before we see 3.3.2). I think the
> merging of gremlin-javascript will likely trigger that release, i would
> guess no earlier than February 2018 if all goes right with that. I also
> don't mean to make it sound like sparql-gremlin needs to be part of that
> release, so if it's not ready then, it's not ready and it releases with
> 3.3.3. You'll find that with TinkerPop, we tend to release when software is
> "ready" and not by setting long range time deadlines for ourselves. So,
> don't worry about when we release sparql-gremlin too much. Let's stay
> focused on just getting the code right.
>
> Thanks for your understanding.
>
>
>
>
> On Mon, Dec 18, 2017 at 5:01 PM, Harsh Thakkar  wrote:
>
>> Hello Stephen,
>>
>> Alright, I will remove the bsbm file from the repository and I refer to
>> it in the docs (with some examples) sharing a link to download from the
>> website if that is acceptable. No worries.
>> Should I also remove the northwind file?
>>
>>
>> Your expectations are reasonable, it was just that I wasn't very clear
>> about what needs to be done. Now it is pretty much clear. It will take some
>> time for me to wrap my head around the specifics of the tinkerpop codebase
>> in order to satisfy the 3 requirements. I will need some help (quite
>> possibly) with getting things right as far as the DSL pattern for the
>> gremlin language variants is concerned. I am already reading the dev-docs
>> on this, from here:
>> http://tinkerpop.apache.org/docs/current/reference/#dsl
>>
>> Also, since you are very well versed in the test suite, I would also
>> request some assistance for the same when we are there :) as it is our
>> first time pushing a work to the production level. So bear with us :)
>>
>> I agree with you on not having any API shifts, this does certainly not
>> give a good impression, also its a lot of effort down the drain. Quality
>> must be ensured.
>>
>> One question, though there is not a strict deadline, when is the 3.3.2
>> release planned?
>>
>> Cheers,
>> Harsh
>>
>>
>> On 2017-12-18 20:48, Stephen Mallette  wrote:
>> > A quick note about (4) - Having some sample data for user convenience is
>> > good. Files like that though should not be "resources", but should be
>> added
>> > here:
>> >
>> > https://github.com/harsh9t/tinkerpop/tree/master/data
>> >
>> > Placing those files there will allow them to be included in the 

Re: Subject: Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-20 Thread Harsh Thakkar
Hi Josh,

Apologies for the late reply, I almost forgot this one.

I am not exactly familiar with the PropertyGraphSail and GraphSail 
implementations at the moment. Let me get back to you on this once I have a 
more concrete idea.

> How does your content-preserving RDF <--> PG interface compare with the
> GraphSail mapping, the PropertyGraphSail mapping(s), or with Hartig's
> "RDF*"

Thanks for pointing out Olaf Hartig's work in this area. I am currently 
collaborating with him on the proposed "RDF <-> PG" converter. Olaf laid a 
foundation with his paper on the perception of PGs from an RDF perspective. 
This is a good start; however, it has some limitations: for example, it cannot 
handle RDF reification at this point. I am working with him to extend his work 
and create a (pretty much) seamless bi-directional conversion mechanism.

I hope that answers your question to some extent. :)

Cheers!

On 2017-12-15 02:38, Joshua Shinavier  wrote: 
> Hi Harsh,
> 
> Thanks for the detailed reply. I can't say with confidence that the TP2
> suite could be re-implemented on top of Jena in that time frame (as I am a
> long-time Sesame fan without much Jena experience), although a TP2 --> TP3
> port could be done, keeping the Sesame (RDF4j) dependency.  GraphSail has
> already been ported, and just needs some docs and more tests. I wonder if
> it would be too crazy to support both, i.e. rdf4j-gremlin and jena-gremlin.
> 
> At any rate, it has been really good to see the recent upsurge of interest
> in RDF and SPARQL support in the graph DB space. At Data Day Seattle, I
> made the point that although Property Graphs came to prominence as a simple
> and lightweight alternative to the Semantic Web standards, SemWeb-like
> features -- such as schemas/ontologies and rules/reasoning -- keep finding
> their way in.
> 
> How does your content-preserving RDF <--> PG interface compare with the
> GraphSail mapping, the PropertyGraphSail mapping(s), or with Hartig's
> "RDF*" [1]?
> 
> 
> [1] https://arxiv.org/pdf/1409.3288.pdf
> 
> 
> On Thu, Dec 14, 2017 at 8:43 AM, Harsh Thakkar  wrote:
> 
> >
> > Hi Josh,
> >
> > I already wrote an elaborate reply to your comment. I think it went
> > somewhere but didn't show up :(
> >
> > I will summarize my reply here now..
> >
> > Yes, I am of the same opinion of having a continuous SPARQL implementation
> > on top of Gremlin. Also, I am working on a custom interface, (as we speak)
> > in my current research, on proposing an information preserving RDF <-> PG
> > converter. This will allow interoperability between the semantic web and
> > graph database communities to leverage the advantages of one another. i.e.
> > the former can traverse and the latter can have a more diverse access
> > portfolio to rich datasets.
> >
> > My Ph.D. thesis is more or less focused on this. It started from proposing
> > a robust open and extensible benchmarking platform "LITMUS" [], which
> > eventually led me to address all these issues and thus my keen interest :)
> >
> > If I understand correctly, regarding the other interfaces you mentioned:
> > do you wish to see them eventually integrated into tinkerpop, or are
> > you implying that this should already be done before the next release?
> >
> > Thanks for your pointers!
> > Cheers!
> >
> > On 2017-12-13 16:46, Joshua Shinavier  wrote:
> > > Hi Harsh,
> > >
> > > Glad you are taking Daniel's work forward. In porting the code to the
> > > TinkerPop code base, might I suggest we allow for not only
> > SPARQL-Gremlin,
> > > but a whole suite of RDF tools as in TP2. Perhaps call the module
> > > rdf-gremlin. Then we could have all of:
> > >
> > > * SPARQL-Gremlin: executes standard SPARQL queries over a Property Graph
> > > database
> > > * GraphSail [1,2]: stores RDF quads in the database, explicitly, and
> > > enables SPARQL and triple pattern queries over the quads
> > > * PropertyGraphSail [3]: exposes a Property Graph with either of two mappings to
> > > the RDF data model
> > > * SailGraph [4]: takes an RDF triple store (not natively supporting
> > > Gremlin) and enables Gremlin queries
> > > * others? I have often thought that a continuous SPARQL implementation
> > > built on Gremlin would be powerful
> > >
> > > The biggest mismatch between the TP2 suite and what might be built for
> > > Apache TinkerPop is that the previous suite was implemented using
> > (Eclipse)
> > > RDF4j, whereas things seem to be leaning towards (Apache) Jena now.
> > > However, the same principles could be applied.
> > >
> > > Josh
> > >
> > >
> > > [1] https://github.com/tinkerpop/blueprints/wiki/Sail-Ouplementation
> > > [2] https://github.com/joshsh/graphsail
> > > [3]
> > > https://github.com/tinkerpop/blueprints/wiki/PropertyGraphSail-
> > Ouplementation
> > > [4] https://github.com/tinkerpop/blueprints/wiki/Sail-Implementation
> > >
> > [snip]
> 


Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-18 Thread Stephen Mallette
> Should I also remove the northwind file?

I think I'd prefer to see all of our sparql examples use the existing toy
graphs - better not to add more options - so I'd remove it as well. If
anyone disagrees, I don't really feel too strongly about not including it,
but it would be good to hear some reasoning as to why the existing datasets
that we already package are insufficient for users to learn with.

>  will need some help (quite possibly) with getting things right as far as
the DSL pattern for the gremlin language variants is concerned.

We can help point you in the right direction when you get stuck or need to
clarify things. If you get really stuck, we can move to step 2 and have you
issue a PR sooner than later and we'll just merge what you have to a
development branch so others can collaborate with you on it more easily.
Let's see how things develop.

> Also, since you are very well versed in the test suite, I would also
request some assistance for the same when we are there :) as it is our
first time pushing a work to the production level. So bear with us :)

no worries. i will need to think on the testing approach. my thinking will
be focused on what i would call integration tests i.e. tests that evaluate
sparql-gremlin across the entire stack. I don't imagine that you need my
input to write some unit tests to validate the workings of your current
code though.

> One question, though there is not a strict deadline, when is the 3.3.2
release planned?

We have no timeline on 3.3.2 at this point (we are just in the process of
releasing 3.3.1 so it will be a while before we see 3.3.2). I think the
merging of gremlin-javascript will likely trigger that release, i would
guess no earlier than February 2018 if all goes right with that. I also
don't mean to make it sound like sparql-gremlin needs to be part of that
release, so if it's not ready then, it's not ready and it releases with
3.3.3. You'll find that with TinkerPop, we tend to release when software is
"ready" and not by setting long range time deadlines for ourselves. So,
don't worry about when we release sparql-gremlin too much. Let's stay
focused on just getting the code right.

Thanks for your understanding.




On Mon, Dec 18, 2017 at 5:01 PM, Harsh Thakkar  wrote:

> Hello Stephen,
>
> Alright, I will remove the bsbm file from the repository and I refer to it
> in the docs (with some examples) sharing a link to download from the
> website if that is acceptable. No worries.
> Should I also remove the northwind file?
>
>
> Your expectations are reasonable, it was just that I wasn't very clear
> about what needs to be done. Now it is pretty much clear. It will take some
> time for me to wrap my head around the specifics of the tinkerpop codebase
> in order to satisfy the 3 requirements. I will need some help (quite
> possibly) with getting things right as far as the DSL pattern for the
> gremlin language variants is concerned. I am already reading the dev-docs
> on this, from here:
> http://tinkerpop.apache.org/docs/current/reference/#dsl
>
> Also, since you are very well versed in the test suite, I would also
> request some assistance for the same when we are there :) as it is our
> first time pushing a work to the production level. So bear with us :)
>
> I agree with you on not having any API shifts, this does certainly not
> give a good impression, also its a lot of effort down the drain. Quality
> must be ensured.
>
> One question, though there is not a strict deadline, when is the 3.3.2
> release planned?
>
> Cheers,
> Harsh
>
>
> On 2017-12-18 20:48, Stephen Mallette  wrote:
> > A quick note about (4) - Having some sample data for user convenience is
> > good. Files like that though should not be "resources", but should be
> added
> > here:
> >
> > https://github.com/harsh9t/tinkerpop/tree/master/data
> >
> > Placing those files there will allow them to be included in the .zip
> > distribution files we produce for Gremlin Console and Gremlin Server.
> Now,
> > that BSBM file is a bit much. It's 90M in size and 22M compressed to zip.
> > Either way, that's going to push our already large zip distributions
> bigger
> > than they should be. I don't think the value of this file is worth
> > that. We can definitely make it available as a separate download from the
> > web site if everyone thinks it's that important and then provide links to
> > it, but I don't think it should be in the source repository as it is now.
> >
> > Aside from (4) I just wanted to make some general points about my
> > expectations for a sparql-gremlin being part of a TinkerPop release
> branch.
> > Apologies if this wasn't clear from when we started. I think we need to
> see
> > sparql-gremlin as close to a final form as possible before we look to
> merge
> > it. By "final" I mean:
> >
> > 1. sparql-gremlin has a full test suite - that means good unit test
> > coverage at a minimum and integration tests as necessary 

[Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-18 Thread Harsh Thakkar
Hello Stephen,

Alright, I will remove the bsbm file from the repository and refer to it in 
the docs (with some examples), sharing a link to download it from the website, 
if that is acceptable. No worries. 
Should I also remove the northwind file? 


Your expectations are reasonable, it was just that I wasn't very clear about 
what needs to be done. Now it is pretty much clear. It will take some time for 
me to wrap my head around the specifics of the tinkerpop codebase in order to 
satisfy the 3 requirements. I will need some help (quite possibly) with getting 
things right as far as the DSL pattern for the gremlin language variants is 
concerned. I am already reading the dev-docs on this, from here:
http://tinkerpop.apache.org/docs/current/reference/#dsl

Also, since you are very well versed in the test suite, I would also request 
some assistance for the same when we are there :) as it is our first time 
pushing a work to the production level. So bear with us :)

I agree with you on not having any API shifts; they certainly do not give a 
good impression, and it's also a lot of effort down the drain. Quality must be 
ensured. 

One question, though there is not a strict deadline, when is the 3.3.2 release 
planned?

Cheers,
Harsh


On 2017-12-18 20:48, Stephen Mallette  wrote: 
> A quick note about (4) - Having some sample data for user convenience is
> good. Files like that though should not be "resources", but should be added
> here:
> 
> https://github.com/harsh9t/tinkerpop/tree/master/data
> 
> Placing those files there will allow them to be included in the .zip
> distribution files we produce for Gremlin Console and Gremlin Server. Now,
> that BSBM file is a bit much. It's 90M in size and 22M compressed to zip.
> Either way, that's going to push our already large zip distributions bigger
> than they should be. I don't think the value of this file is worth
> that. We can definitely make it available as a separate download from the
> web site if everyone thinks it's that important and then provide links to
> it, but I don't think it should be in the source repository as it is now.
> 
> Aside from (4) I just wanted to make some general points about my
> expectations for a sparql-gremlin being part of a TinkerPop release branch.
> Apologies if this wasn't clear from when we started. I think we need to see
> sparql-gremlin as close to a final form as possible before we look to merge
> it. By "final" I mean:
> 
> 1. sparql-gremlin has a full test suite - that means good unit test
> coverage at a minimum and integration tests as necessary (and I sense they
> will be necessary). I agree with marko, that we also have to consider the
> testing pattern carefully, so that we set the stage properly for future
> languages.
> 2. sparql-gremlin has a clear and easy method of usage that is consistent
> with how TinkerPop works - this is crucial prior to merge because TinkerPop
> has high profile production usage. once merged sparql-gremlin will
> immediately be consumed by users and we will not want to shift that API
> once it is available. we will break the code of too many people if we do
> that. we need to strive to get this right from the start.
> 3. sparql-gremlin has a good body of user documentation.
> 
> I don't think any of this is insurmountable, but it does mean there is a
> fair bit of work to do and it won't happen overnight. We held
> gremlin-dotnet to the same rigorous level before merging and even
> gremlin-javascript all these months later is still not merged for basically
> the same reasons, so this is just the process that we tend to go through.
> If we follow what we did for the GLVs, we will likely follow this basic
> process:
> 
> 1. You get sparql-gremlin "pretty close" to final in your fork
> 2. Once we all agree that you are "pretty close", you offer the pull request
> 3. We merge it into a TinkerPop branch for further evaluation (this will be
> a development branch and not a release branch)
> 4. We work together to get the development branch "final"
> 5. We issue a pull request from that development branch
> 6. The pull request goes through the standard review/vote process and we
> merge to a release branch.
> 7. sparql-gremlin will likely be part of 3.3.2 release
> 
> I hope that makes sense.
> 
> 
> On Mon, Dec 18, 2017 at 12:26 PM, Marko Rodriguez 
> wrote:
> 
> > Actually, my (3) is bad. Given that query() would always return a
> > Traversal>, it would be necessary to have that linearized
> > to Traversal for the test suite to validate it. That would
> > mean making SPARQLTraversal support extended Traversal methods like
> > flatMap(), blah, blah… That seems excessive, though convenient.
> >
> > Hm…… Thoughts?,
> > Marko.
> >
> > http://markorodriguez.com
> >
> >
> >
> > > On Dec 18, 2017, at 9:21 AM, Marko Rodriguez 
> > wrote:
> > >
> > > Hello,
> > >
> > > A couple of items 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-18 Thread Harsh Thakkar
Hello Marko,

I made a mistake earlier when I said that the sparql-gremlin compiler returns a 
string; it does not. It returns a graph traversal. Apologies!

Regarding (7), I agree, it makes sense. I will wrap my head around how to get 
that done. I am already reading the dev-docs on this, from here:
http://tinkerpop.apache.org/docs/current/reference/#dsl 
as mentioned in the reply to Stephen.

Regarding (3), I was just not sure whether or not to include these tests, so I 
left them out. This makes it clear. I will write the test cases, with some 
help from Stephen on the specifics of the test suite. However, these test cases 
will have to be written within the scope of SPARQL. We cannot test a query 
that cannot be written in SPARQL :) I guess you were implying the same.

Let me get this done and get back to you. This will take some time. No worries!

Cheers,
Harsh

On 2017-12-18 18:26, Marko Rodriguez  wrote: 
> Actually, my (3) is bad. Given that query() would always return a 
> Traversal>, it would be necessary to have that linearized to 
> Traversal for the test suite to validate it. That would mean 
> making SPARQLTraversal support extended Traversal methods like flatMap(), 
> blah, blah… That seems excessive, though convenient.
> 
> Hm…… Thoughts?,
> Marko.
>  
> http://markorodriguez.com
> 
> 
> 
> > On Dec 18, 2017, at 9:21 AM, Marko Rodriguez  wrote:
> > 
> > Hello,
> > 
> > A couple of items worth considering.
> > 
> > Regarding (7), that should be done prior to master/ merge. It is necessary 
> > to follow the patterns that are established in TinkerPop regarding language 
> > interoperability. The DSL pattern developed for Gremlin language variants 
> > seems to be the best pattern for distinct languages as well. In essence, if 
> > your language is not a fluent language, and instead, uses a String, then it 
> > should be wrapped as such in a fluent interface using all the Strategy, 
> > Step, and Traversal methods that make sense so it works within the larger 
> > infrastructure of TinkerPop (e.g. testing! — see below). What I proposed 
> > in my previous email seems the easiest and cleanest way to do things.
> > 
> > Regarding (3), testing is crucial. Given that this would be TinkerPop’s 
> > first distinct language, we don’t have a pattern set forth for testing. 
> > However, this doesn’t mean we can’t improvise on our current model. Off 
> > the top of my head, perhaps the best way would be to follow the 
> > ProcessTestSuite and do the SPARQL variants of those. For instance:
> > 
> > 
> > https://github.com/apache/tinkerpop/blob/master/gremlin-test/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/map/VertexTest.java#L62
> >  
> > 
> > 
> > The SPARQL test version would be:
> > 
> > @Override
> > public Traversal get_g_VX1X_out(final Object v1Id) {
> >   return sparql.query("SELECT ?x WHERE {" + toURI(v1Id) + " ?a ?x }");
> > }
> > 
> > In this way, sparql is your SPARQLTraversalSource for each test and query() 
> > will return a Traversal typed accordingly (query() will have to have solid 
> > generic support). From there, you would implement each and every test that 
> > is semantically possible with SPARQL (where SPARQL won’t be able to 
> > semantically cover all Gremlin tests).
> > 
> > Stephen has done a lot of recent work to generalize our test suite out of 
> > Java so it is in a language-agnostic form. I haven’t been following that 
> > work, so I’m not sure that what I’m saying above is exactly how it should 
> > be done, but it is a start.
> > 
> > HTH,
> > Marko.
> > 
> > http://markorodriguez.com 
> > 
> > 
> > 
> >> On Dec 18, 2017, at 7:43 AM, Harsh Thakkar wrote:
> >> 
> >> Hi Stephen and All,
> >> 
> >> Thanks for going through the code. I address your questions below (in the 
> >> same order):
> >> 
> >> 1. Yes, this file can be removed. It was just to test the traversal 
> >> method. 
> >> 
> >> 2. Yes, I have commented out the block of tests at this moment since we do 
> >> not need to run tests at mvn clean install time. However, I kept it (in 
> >> commented out form) if there arose a need in future for the same. It can 
> >> surely be removed if you think, it won't be necessary.
> >> 
> >> 3. There were two testing units (we continued them from Daniel's version), 
> >> one to check whether the prefixes are being encoded correctly, the second 
> >> one is to test whether the generated traversal is correct (in short the 
> >> compiler is functioning as it should). Since, we extended previous work 
> >> supporting a variety of SPARQL operators, more test cases can be added to 
> >> validate that 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-18 Thread Marko Rodriguez
Actually, my (3) is bad. Given that query() would always return a Traversal>, it would be necessary to have that linearized to 
Traversal for the test suite to validate it. That would mean 
making SPARQLTraversal support extended Traversal methods like flatMap(), blah, 
blah… That seems excessive, though convenient.
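
To make the linearization concern concrete, here is a minimal sketch in plain Java streams rather than the TinkerPop API. It assumes each SPARQL solution is a map from variable name to bound value, and SolutionLinearizer and linearize() are hypothetical names used only for illustration. It shows the kind of flatMap() step that would turn a sequence of solution maps into a flat sequence of values that a test could compare against.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of TinkerPop: flattens SPARQL-style solution
// rows (variable name -> bound value) into one linear list of values.
final class SolutionLinearizer {

    static List<Object> linearize(final List<Map<String, Object>> solutions) {
        return solutions.stream()
                // each row expands to its bound values, in insertion order
                .flatMap(row -> row.values().stream())
                .collect(Collectors.toList());
    }

    public static void main(final String[] args) {
        // LinkedHashMap keeps the variables in the order they were added
        final Map<String, Object> first = new LinkedHashMap<>();
        first.put("name", "marko");
        first.put("age", 29);
        final Map<String, Object> second = new LinkedHashMap<>();
        second.put("name", "vadas");
        second.put("age", 27);
        // prints [marko, 29, vadas, 27]
        System.out.println(linearize(Arrays.asList(first, second)));
    }
}
```

Whether such a step belongs on the traversal itself or in a test-side helper is exactly the open question above; the sketch only illustrates the shape of the transformation.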

Hm…… Thoughts?,
Marko.
 
http://markorodriguez.com



> On Dec 18, 2017, at 9:21 AM, Marko Rodriguez  wrote:
> 
> Hello,
> 
> A couple of items worth considering.
> 
> Regarding (7), that should be done prior to master/ merge. It is necessary to 
> follow the patterns that are established in TinkerPop regarding language 
> interoperability. The DSL pattern developed for Gremlin language variants 
> seems to be the best pattern for distinct languages as well. In essence, if 
> your language is not a fluent language, and instead, uses a String, then it 
> should be wrapped as such in a fluent interface using all the Strategy, Step, 
> and Traversal methods that make sense so it works within the larger 
> infrastructure of TinkerPop (e.g. testing! — see below). What I proposed in 
> my previous email seems the easiest and cleanest way to do things.
> 
> Regarding (3), testing is crucial. Given that this would be TinkerPop’s first 
> distinct language, we don’t have a pattern set forth for testing. However, 
> this doesn’t mean we can’t improvise on our current model. Off the top of my 
> head, perhaps the best way would be to follow the ProcessTestSuite and do the 
> SPARQL variants of those. For instance:
> 
>   
> https://github.com/apache/tinkerpop/blob/master/gremlin-test/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/map/VertexTest.java#L62
>  
> 
> 
> The SPARQL test version would be:
> 
> @Override
> public Traversal get_g_VX1X_out(final Object v1Id) {
>   return sparql.query("SELECT ?x WHERE {" + toURI(v1Id) + " ?a ?x }");
> }
> 
> In this way, sparql is your SPARQLTraversalSource for each test and query() 
> will return a Traversal typed accordingly (query() will have to have solid 
> generic support). From there, you would implement each and every test that is 
> semantically possible with SPARQL (where SPARQL won’t be able to semantically 
> cover all Gremlin tests).
> 
> Stephen has done a lot of recent work to generalize our test suite out of 
> Java so it is in a language agnostic form. I haven’t been following that work 
> so I’m not sure what I am saying above is exactly as it should be done, but 
> it is a start.
> 
> HTH,
> Marko.
> 
> http://markorodriguez.com 
> 
> 
> 
>> On Dec 18, 2017, at 7:43 AM, Harsh Thakkar wrote:
>> 
>> Hi Stephen and All,
>> 
>> Thanks for going through the code. I address your questions below (in the 
>> same order):
>> 
>> 1. Yes, this file can be removed. It was just to test the traversal method. 
>> 
>> 2. Yes, I have commented out the block of tests at this moment since we do 
>> not need to run tests at mvn clean install time. However, I kept it (in 
>> commented out form) if there arose a need in future for the same. It can 
>> surely be removed if you think, it won't be necessary.
>> 
>> 3. There were two testing units (we continued them from Daniel's version), 
>> one to check whether the prefixes are being encoded correctly, the second 
>> one is to test whether the generated traversal is correct (in short the 
>> compiler is functioning as it should). Since, we extended previous work 
>> supporting a variety of SPARQL operators, more test cases can be added to 
>> validate that each of these is functioning as expected. However, as I 
>> mentioned in point #2. we need not do it explicitly as we (Dharmen and I) 
>> have already tested them on 3-4 different datasets and query-sets. Now, 
>> since we did not know if that was going to be formally required in the 
>> future or not, we left them as it is, just commented it out.
>> 
>> 4. These resources are the graphml files that we wish to provide the users, 
>> for (i) loading and querying famous datasets - the Berlin SPARQL Benchmark 
>> (BSBM)  (famous in the Semantic Web-RDF community) so that they do not have 
>> to look elsewhere for the same. (ii) Also, it provides a strong use-case for 
>> demonstrating the applicability of sparql-gremlin (creates trust in the SW 
>> community users) and (iii) to keep the plug-in pretty much self-dependent.
>> 
>> 5 & 6  YES, damn it. The IDE did this. I will revert these changes. It's 
>> like when you are not looking, the IDE does things on its own :-/ apologies!
>> 
>> 7. Regarding, Marko's thoughts -- Yes, I was waiting for you to reply to the 
>> thread. I do have some thoughts on this. But first, I was wondering if this 
>> (what 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-18 Thread Marko Rodriguez
Hello,

A couple of items worth considering.

Regarding (7), that should be done prior to the master/ merge. It is necessary to 
follow the patterns that are established in TinkerPop regarding language 
interoperability. The DSL pattern developed for Gremlin language variants seems 
to be the best pattern for distinct languages as well. In essence, if your 
language is not a fluent language and instead uses a String, then it should 
be wrapped as such in a fluent interface using all the Strategy, Step, and 
Traversal methods that make sense, so that it works within the larger 
infrastructure of TinkerPop (e.g. testing; see below). What I proposed in my 
previous email seems the easiest and cleanest way to do things.

Regarding (3), testing is crucial. Given that this would be TinkerPop’s first 
distinct language, we don’t have a pattern set forth for testing. However, this 
doesn’t mean we can’t improvise on our current model. Off the top of my head, 
perhaps the best way would be to follow the ProcessTestSuite and do the SPARQL 
variants of those. For instance:


https://github.com/apache/tinkerpop/blob/master/gremlin-test/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/map/VertexTest.java#L62
 


The SPARQL test version would be:

@Override
public Traversal get_g_VX1X_out(final Object v1Id) {
  return sparql.query("SELECT ?x WHERE {" + toURI(v1Id) + " ?a ?x }");
}

In this way, sparql is your SPARQLTraversalSource for each test and query() 
will return a Traversal typed accordingly (query() will have to have solid 
generic support). From there, you would implement each and every test that is 
semantically possible with SPARQL (where SPARQL won’t be able to semantically 
cover all Gremlin tests).

Stephen has done a lot of recent work to generalize our test suite out of Java 
so it is in a language agnostic form. I haven’t been following that work so I’m 
not sure what I am saying above is exactly as it should be done, but it is a 
start.

HTH,
Marko.

http://markorodriguez.com



> On Dec 18, 2017, at 7:43 AM, Harsh Thakkar  wrote:
> 
> Hi Stephen and All,
> 
> Thanks for going through the code. I address your questions below (in the 
> same order):
> 
> 1. Yes, this file can be removed. It was just to test the traversal method. 
> 
> 2. Yes, I have commented out the block of tests at this moment since we do 
> not need to run tests at mvn clean install time. However, I kept it (in 
> commented out form) if there arose a need in future for the same. It can 
> surely be removed if you think, it won't be necessary.
> 
> 3. There were two testing units (we continued them from Daniel's version), 
> one to check whether the prefixes are being encoded correctly, the second one 
> is to test whether the generated traversal is correct (in short the compiler 
> is functioning as it should). Since, we extended previous work supporting a 
> variety of SPARQL operators, more test cases can be added to validate that 
> each of these is functioning as expected. However, as I mentioned in point 
> #2. we need not do it explicitly as we (Dharmen and I) have already tested 
> them on 3-4 different datasets and query-sets. Now, since we did not know if 
> that was going to be formally required in the future or not, we left them as 
> it is, just commented it out.
> 
> 4. These resources are the graphml files that we wish to provide the users, 
> for (i) loading and querying famous datasets - the Berlin SPARQL Benchmark 
> (BSBM)  (famous in the Semantic Web-RDF community) so that they do not have 
> to look elsewhere for the same. (ii) Also, it provides a strong use-case for 
> demonstrating the applicability of sparql-gremlin (creates trust in the SW 
> community users) and (iii) to keep the plug-in pretty much self-dependent.
> 
> 5 & 6  YES, damn it. The IDE did this. I will revert these changes. It's like 
> when you are not looking, the IDE does things on its own :-/ apologies!
> 
> 7. Regarding, Marko's thoughts -- Yes, I was waiting for you to reply to the 
> thread. I do have some thoughts on this. But first, I was wondering if this 
> (what Marko suggested) is supposed to be entirely implemented in the current 
> version of sparql-gremlin 0.2, i.e. including the withStrategies(), 
> withoutStrategies() and withRemote() features, or it is to be supported 
> eventually (after the sparql-gremlin 0.2.0 plugin is rolled out). Also, I am not 
> entirely sure I got what Marko was exactly suggesting. I bring this to light 
> in the in-line style reply to Marko's comment later here.
> 
> The current implementation is more of a typical compiler, the users, however, 
> can use it by specifying the query file and the dataset against which it is 
> to be executed via the command (once in the gremlin shell):
> 
> gremlin> graph 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-18 Thread Harsh Thakkar
Hi Stephen and All,

Thanks for going through the code. I address your questions below (in the same 
order):

1. Yes, this file can be removed. It was just to test the traversal method. 

2. Yes, I have commented out that block of tests for now, since we do not 
need to run them at mvn clean install time. I kept it (in commented-out 
form) in case it is needed in the future. It can certainly be removed if you 
think it won't be necessary.

3. There were two test units (we continued them from Daniel's version): one 
checks whether the prefixes are encoded correctly, and the other checks 
whether the generated traversal is correct (in short, that the compiler is 
functioning as it should). Since we extended the previous work to support a 
variety of SPARQL operators, more test cases can be added to validate that each 
of these is functioning as expected. However, as I mentioned in point #2, we 
need not do it explicitly, as we (Dharmen and I) have already tested them on 3-4 
different datasets and query sets. Since we did not know whether that was going 
to be formally required in the future, we left the tests as they are, just 
commented out.

4. These resources are the graphml files that we wish to provide to users: 
(i) for loading and querying well-known datasets, such as the Berlin SPARQL 
Benchmark (BSBM), which is widely used in the Semantic Web/RDF community, so 
that they do not have to look elsewhere for them; (ii) to provide a strong use 
case for demonstrating the applicability of sparql-gremlin (which builds trust 
among SW community users); and (iii) to keep the plug-in pretty much 
self-contained.

5 & 6. Yes, damn it. The IDE did this. I will revert these changes. It's like 
when you are not looking, the IDE does things on its own :-/ Apologies!

7. Regarding Marko's thoughts: yes, I was waiting for you to reply to the 
thread. I do have some thoughts on this. But first, I was wondering whether this 
(what Marko suggested) is supposed to be entirely implemented in the current 
version of sparql-gremlin 0.2, i.e. including the withStrategies(), 
withoutStrategies() and withRemote() features, or whether it is to be supported 
eventually (after the sparql-gremlin 0.2.0 plugin is rolled out). Also, I am not 
entirely sure I understood exactly what Marko was suggesting. I come back to 
this in the in-line reply to Marko's comments later in this message.

The current implementation is more of a typical compiler. Users can, however, 
use it by specifying the query file and the dataset against which it is to 
be executed, via the following commands (once in the Gremlin shell):

gremlin> graph = TinkerGraph.open(..) 
gremlin> SparqlToGremlinCompiler.convertToGremlinTraversal(graph, "SELECT ?a WHERE {} ")
==>{?x:marko, ?y:29}
==>{?x:josh, ?y:32}

 
That is, load a graph using the pre-defined TinkerPop methods 
(graph.io(IoCore.gryo()).readGraph(graphName), TinkerGraph.open(), etc.), then 
execute the traversal as above with the arguments (graph, queryString), where 
queryString = "SPARQL query".

Now let me quote Marko's comments and reply in-line to bring more clarity:

> 1. There should be a SPARQLTraversalSource which supports one spawn method: 
> query(String).

This is already happening inside the code. Therefore, we do not need to 
mention it explicitly. Please correct me if I got it wrong here.

> 2. SPARQLTraversal is spawned and it supports only the Traversal methods: 
> next(), toList(), iterate(), etc.

All traversal methods that are available to a regular Gremlin traversal can 
also be used by the traversal generated by the sparql-gremlin compiler.

> 3. query(String) adds a ConstantStep(String).

This is happening internally (as shown in the example above); we can also 
make it explicit, i.e. let the user provide only the queryString instead of 
the whole SparqlToGremlinCompiler.convertToGremlinTraversal(graph, "SELECT 
?a WHERE {} ") command. Does this make sense, or am I missing something here?

> 4. SPARQLTraversalSource has a registered SPARQLStrategy.

At the moment, we leave the strategy selection at its default setting.

> 5. SPARQLTraversalSource should also support withStrategies(), 
> withoutStrategies(), withRemote(), etc.

Once the traversal is generated, it can support all strategies like any 
other Gremlin traversal. Does this make sense to you?

In a nutshell, we are converting the SPARQL queryString into a Gremlin 
traversal and leaving it up to the TinkerPop compiler to choose what is best 
for it. 
We only map a SPARQL query to its corresponding pattern-matching Gremlin 
traversal (i.e. using the .match() step). Since the expressiveness of 
SPARQL is less than that of Gremlin (i.e. SPARQL 1.0 doesn't support 
looping and traversing operations), we can only map what is in the 
scope of the SPARQL language to Gremlin. Once the traversal is generated, it is 
left to the 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-18 Thread Stephen Mallette
Harsh, I looked at the code in a bit more detail than I had before. Here are
some thoughts/questions I had as I was going through things:

1. Can this file be removed - it doesn't appear to have any usage that I
can see:

https://github.com/harsh9t/tinkerpop/blob/master/sparql-gremlin/src/main/java/org/apache/tinkerpop/gremlin/sparql/Runable.java

2. I note that this entire block of tests is commented out - should that be
removed?:

https://github.com/harsh9t/tinkerpop/blob/master/sparql-gremlin/src/test/java/org/apache/tinkerpop/gremlin/sparql/SparqlToGremlinCompilerTest.java

3. I could be wrong, but even if you didn't remove the tests above, it
seems like unit testing is rather thin at this point. Am I missing
something? Is there more work to do there?

4. I don't understand the nature of these resources:

https://github.com/harsh9t/tinkerpop/tree/master/sparql-gremlin/src/main/resources

Is there any need to package those with the jar? Should those be "test"
resources instead? Do we need the really large data/bsbm1m.graphml file for
any specific reason?

5. What are these changes to these poms?

https://github.com/harsh9t/tinkerpop/commit/cb3b6512ea3536f556108e5a257c4586aa4d157a

I assume that your IDE did that accidentally and it was not intended.
Please revert that change.

6. This looks odd too - gremlin-shaded repeated again and again and again:

https://github.com/harsh9t/tinkerpop/commit/143d16f20dcaa9c915b96cdd4adf7b1504db5d36#diff-9e90009f097eabeb25c28159571fc6a2R118

7. Did you have any thoughts in reference to Marko's earlier reply that
described how sparql-gremlin should be used? Right now, it seems like the
code you have there is just the "engine" but lacks the piece that connects
it into the rest of the stack. From my perspective, I think we need to be
sure that users have an easy, clear and consistent way to use
sparql-gremlin before we can merge this work. Obviously, having that aspect
of the code thought through will impact the documentation that you write as
well, so I think you need to go down this path a bit further before we get
to the pull request stage.

8. We aren't big javadoc sticklers here, but we try to at least get class
level javadoc in place for most classes. I don't see much javadoc or
comments in the code right now. I think I'd like to see a modicum of
javadoc/comments present as part of this work.

So, that's my broad level feedback at this point. It seems as though there
are some reasonably large issues there to contend with before a pull
request is worth issuing. That's not a problem, of course - we will just
keep iterating toward the goal. I'm not aware of anything that is pushing
us to rush to a pull request - I'm of the opinion that we can take the time
to get this right.

Thanks,

Stephen


On Fri, Dec 15, 2017 at 1:46 PM, Joshua Shinavier  wrote:

> Hi Marko,
>
> I think we're more or less on the same page here; it's clear that TP3 has a
> different API than TP2. If you look at the guts of TP3 GraphSail [1], it
> uses the modern APIs, and yet does adapt them to the Sail interface.
>
> Something like PropertyGraphSail (or an equivalent Jena thing) still makes
> sense in TP3, as well. One interesting detail here is that in TP3, vertices
> can have labels, which can be turned into rdf:type statements (that, in
> turn, can be used to enable subclass/superclass inheritance if the graph is
> combined with an RDF schema).
>
> A TP3 equivalent of SailGraph would indeed be quite different in
> implementation -- strategies, not wrapper graph -- than what we had for
> Blueprints, and yet would serve the same purpose.
>
> Josh
>
>
> [1]
> https://github.com/joshsh/graphsail/tree/master/src/main/java/net/fortytwo/tpop/sail
>
>
>
> On Fri, Dec 15, 2017 at 10:22 AM, Marko Rodriguez 
> wrote:
>
> > Hello,
> >
> > The model proposed below is in-line with TinkerPop2’s way of thinking.
> > Unfortunately, TinkerPop3 and more so for TinkerPop4, the Graph
> “structure"
> > API will become deprecated. This means that the notion of “wrapping the
> > Graph API” has gone away for TP3 and will be completely gone in TP4. In
> > TP4, there will not even be a Graph API — no more Vertex, Edge, Property,
> > etc. Only the concept of a Graph with only methods like
> Graph.traversal(),
> > Graph.partitions(), etc.
> >
> > Why was this route taken? In TinkerPop3, there was a need to support any
> > language besides Java. This was why Gremlin bytecode and the concept of
> the
> > Gremlin traversal machine was introduced. A provider simply gets Gremlin
> > bytecode and has to do something with it. For the Java-based Gremlin
> > traversal machine, this is why providers implement their own GraphStep,
> > VertexStep, etc. For a Python-based Gremlin traversal machine, likewise…
> >
> > This means that SailGraph, GraphSail, PropertyGraphSail as stated below
> > don’t make sense in the current and future architectures.
> >
> > The next question becomes, "well how would 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-15 Thread Joshua Shinavier
Hi Marko,

I think we're more or less on the same page here; it's clear that TP3 has a
different API than TP2. If you look at the guts of TP3 GraphSail [1], it
uses the modern APIs, and yet does adapt them to the Sail interface.

Something like PropertyGraphSail (or an equivalent Jena thing) still makes
sense in TP3, as well. One interesting detail here is that in TP3, vertices
can have labels, which can be turned into rdf:type statements (that, in
turn, can be used to enable subclass/superclass inheritance if the graph is
combined with an RDF schema).
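
A minimal sketch of the label-to-rdf:type idea, in plain Java with no TinkerPop, RDF4J, or Jena dependency. The class name, the base URI, and the vertex/ and class/ path segments are assumptions made up for illustration; only the rdf:type predicate URI is standard.

```java
// Hypothetical helper for illustration; not part of TinkerPop or GraphSail.
final class LabelToRdfType {
    // Standard RDF predicate meaning "is an instance of".
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    // Returns one N-Triples line typing the vertex with a class derived from its label.
    // The URI layout (baseUri + "vertex/" or "class/") is an assumed convention.
    static String typeStatement(String baseUri, Object vertexId, String label) {
        String subject = "<" + baseUri + "vertex/" + vertexId + ">";
        String object = "<" + baseUri + "class/" + label + ">";
        return subject + " <" + RDF_TYPE + "> " + object + " .";
    }

    public static void main(String[] args) {
        // A vertex with id 1 and label "person" becomes an instance of the class "person".
        System.out.println(typeStatement("http://example.org/", 1, "person"));
    }
}
```

If an RDF schema then declared the person class to be a subclass of some broader class via rdfs:subClassOf, a reasoner could infer that the same vertex also belongs to the broader class.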

A TP3 equivalent of SailGraph would indeed be quite different in
implementation -- strategies, not wrapper graph -- than what we had for
Blueprints, and yet would serve the same purpose.

Josh


[1]
https://github.com/joshsh/graphsail/tree/master/src/main/java/net/fortytwo/tpop/sail



On Fri, Dec 15, 2017 at 10:22 AM, Marko Rodriguez 
wrote:

> Hello,
>
> The model proposed below is in-line with TinkerPop2’s way of thinking.
> Unfortunately, TinkerPop3 and more so for TinkerPop4, the Graph “structure"
> API will become deprecated. This means that the notion of “wrapping the
> Graph API” has gone away for TP3 and will be completely gone in TP4. In
> TP4, there will not even be a Graph API — no more Vertex, Edge, Property,
> etc. Only the concept of a Graph with only methods like Graph.traversal(),
> Graph.partitions(), etc.
>
> Why was this route taken? In TinkerPop3, there was a need to support any
> language besides Java. This was why Gremlin bytecode and the concept of the
> Gremlin traversal machine was introduced. A provider simply gets Gremlin
> bytecode and has to do something with it. For the Java-based Gremlin
> traversal machine, this is why providers implement their own GraphStep,
> VertexStep, etc. For a Python-based Gremlin traversal machine, likewise…
>
> This means that SailGraph, GraphSail, PropertyGraphSail as stated below
> don’t make sense in the current and future architectures.
>
> The next question becomes, "well how would you turn an RDF store into a
> PropertyGraph?” Easy — implement your own custom GraphStep, VertexStep,
> etc. and respective ProviderStrategies that will handle the bytecode
> compilation accordingly.
>
> The next question becomes, “well how would a PropertyGraph support
> reasoning?” Easy — implement your own custom DecorationStrategy that will
> insert reasoning into the traversal giving the RDFS schema. For instance:
> g.V().out("likes")
> ==>
> g.V().out("knows","likes")
> iff "likes" is a sub-property of "knows"
>
> In essence, it is possible to do this integration of RDF and TinkerPop, it
> just needs to be done at the correct level of abstraction so that it stays
> in line with how TinkerPop is evolving, not how it was back in 2012.
>
> Take care,
> Marko.
>
> http://markorodriguez.com
>
>
> On 2017-12-13 07:46, Joshua Shinavier  wrote:
> > Hi Harsh,
> >
> > Glad you are taking Daniel's work forward. In porting the code to the
> > TinkerPop code base, might I suggest we allow for not only SPARQL-Gremlin,
> > but a whole suite of RDF tools as in TP2. Perhaps call the module
> > rdf-gremlin. Then we could have all of:
> >
> > * SPARQL-Gremlin: executes standard SPARQL queries over a Property Graph
> > database
> > * GraphSail [1,2]: stores RDF quads in the database, explicitly, and
> > enables SPARQL and triple pattern queries over the quads
> > * PropertyGraphSail [3]: exposes a Property Graph with either of two mappings to
> > the RDF data model
> > * SailGraph [4]: takes an RDF triple store (not natively supporting
> > Gremlin) and enables Gremlin queries
> > * others? I have often thought that a continuous SPARQL implementation
> > built on Gremlin would be powerful
> >
> > The biggest mismatch between the TP2 suite and what might be built for
> > Apache TinkerPop is that the previous suite was implemented using (Eclipse)
> > RDF4j, whereas things seem to be leaning towards (Apache) Jena now.
> > However, the same principles could be applied.
> >
> > Josh
> >
> >
> > [1] https://github.com/tinkerpop/blueprints/wiki/Sail-Ouplementation
> > [2] https://github.com/joshsh/graphsail
> > [3] https://github.com/tinkerpop/blueprints/wiki/PropertyGraphSail-Ouplementation
> > [4] https://github.com/tinkerpop/blueprints/wiki/Sail-Implementation
>
> http://markorodriguez.com
>
>
>
>


Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-15 Thread Marko Rodriguez
Hello,

The model proposed below is in line with TinkerPop2’s way of thinking. 
Unfortunately, in TinkerPop3, and more so in TinkerPop4, the Graph “structure” 
API is being deprecated. This means that the notion of “wrapping the Graph API” 
has gone away for TP3 and will be completely gone in TP4. In TP4, there will 
not even be a Graph API: no more Vertex, Edge, Property, etc., only the concept 
of a Graph with only methods like Graph.traversal(), Graph.partitions(), etc.

Why was this route taken? In TinkerPop3, there was a need to support 
languages other than Java. This was why Gremlin bytecode and the concept of the 
Gremlin traversal machine were introduced. A provider simply gets Gremlin 
bytecode and has to do something with it. For the Java-based Gremlin traversal 
machine, this is why providers implement their own GraphStep, VertexStep, etc. 
For a Python-based Gremlin traversal machine, likewise… 

This means that SailGraph, GraphSail, PropertyGraphSail as stated below don’t 
make sense in the current and future architectures. 

The next question becomes, “well, how would you turn an RDF store into a 
PropertyGraph?” Easy: implement your own custom GraphStep, VertexStep, etc. 
and respective ProviderStrategies that will handle the bytecode compilation 
accordingly.

The next question becomes, “well, how would a PropertyGraph support reasoning?” 
Easy: implement your own custom DecorationStrategy that will insert reasoning 
into the traversal given the RDFS schema. For instance:
g.V().out("likes") 
==>
g.V().out("knows","likes")
iff "likes" is a sub-property of "knows"
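
The label rewrite such a strategy would perform can be sketched as a small expansion function. This is plain Java and only builds strings; a real DecorationStrategy would rewrite the labels of the steps inside the traversal. The class and method names are hypothetical, and the direction of expansion is an assumption: the sketch follows RDFS entailment, where an edge labeled with a sub-property also counts as an edge of its super-property, so a step over a property is widened to include that property's sub-properties.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not a TinkerPop class.
final class SubPropertyExpansion {
    // Collects the requested label plus every label reachable through the
    // sub-property map (key: property, value: its direct sub-properties).
    static Set<String> expand(String label, Map<String, List<String>> subProperties) {
        Set<String> labels = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.addLast(label);
        while (!pending.isEmpty()) {
            String current = pending.pollFirst();
            if (labels.add(current)) { // labels already seen are skipped, so cycles terminate
                List<String> subs = subProperties.get(current);
                if (subs != null) {
                    pending.addAll(subs);
                }
            }
        }
        return labels;
    }

    // Renders the widened step as Gremlin text, for display only.
    static String toOutStep(Set<String> labels) {
        StringBuilder sb = new StringBuilder("g.V().out(");
        String separator = "";
        for (String label : labels) {
            sb.append(separator).append('"').append(label).append('"');
            separator = ",";
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> schema = new HashMap<>();
        // Assumed schema: "likes" is a sub-property of "knows".
        schema.put("knows", Collections.singletonList("likes"));
        System.out.println(toOutStep(expand("knows", schema)));
    }
}
```

A label with no entry in the map is returned unchanged, so traversals over properties the schema does not mention are left as they are.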

In essence, it is possible to do this integration of RDF and TinkerPop; it just 
needs to be done at the correct level of abstraction so that it stays in line 
with how TinkerPop is evolving, not how it was back in 2012.

Take care,
Marko.

http://markorodriguez.com


On 2017-12-13 07:46, Joshua Shinavier  wrote: 
> Hi Harsh,
>
> Glad you are taking Daniel's work forward. In porting the code to the
> TinkerPop code base, might I suggest we allow for not only SPARQL-Gremlin,
> but a whole suite of RDF tools as in TP2. Perhaps call the module
> rdf-gremlin. Then we could have all of:
>
> * SPARQL-Gremlin: executes standard SPARQL queries over a Property Graph
> database
> * GraphSail [1,2]: stores RDF quads in the database, explicitly, and
> enables SPARQL and triple pattern queries over the quads
> * PropertyGraphSail [3]: exposes a Property Graph with either of two mappings to
> the RDF data model
> * SailGraph [4]: takes an RDF triple store (not natively supporting
> Gremlin) and enables Gremlin queries
> * others? I have often thought that a continuous SPARQL implementation
> built on Gremlin would be powerful
>
> The biggest mismatch between the TP2 suite and what might be built for
> Apache TinkerPop is that the previous suite was implemented using (Eclipse)
> RDF4j, whereas things seem to be leaning towards (Apache) Jena now.
> However, the same principles could be applied.
>
> Josh
>
>
> [1] https://github.com/tinkerpop/blueprints/wiki/Sail-Ouplementation
> [2] https://github.com/joshsh/graphsail
> [3] https://github.com/tinkerpop/blueprints/wiki/PropertyGraphSail-Ouplementation
> [4] https://github.com/tinkerpop/blueprints/wiki/Sail-Implementation

http://markorodriguez.com





Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-15 Thread Marko Rodriguez
Hello,

Regarding how users should use SPARQL-Gremlin, I believe that it should be done 
as follows.

1. There should be a SPARQLTraversalSource which supports one spawn 
method: query(String).
2. SPARQLTraversal is spawned and it supports only the Traversal 
methods: next(), toList(), iterate(), etc.
3. query(String) adds a ConstantStep(String).
4. SPARQLTraversalSource has a registered SPARQLStrategy.
5. SPARQLTraversalSource should also support withStrategies(), 
withoutStrategies(), withRemote(), etc.

In this way, RemoteStrategy will work and moreover, only Gremlin bytecode is 
sent over. What is sent over is constant(String). Server-side, SPARQLStrategy 
will do the compilation from SPARQL string (the String in constant()) to a 
Gremlin traversal.

Thus:

gremlin> sparql = graph.traversal(SPARQLTraversalSource.class).withRemote("127.0.0.2")
gremlin> sparql.query("SELECT ?x ?y WHERE {…}").toList()
==>{?x:marko, ?y:29}
==>{?x:josh, ?y:32}

Behind the scenes:

ConstantStep(“SELECT ?x ?y WHERE { … }”)
==SPARQLStrategy==>
[GraphStep, MatchStep[…]]
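
The rewrite above can be modeled with a toy sketch in plain Java. Nothing here is TinkerPop API: Step, query() and applyStrategy() are hypothetical stand-ins, and the "compiler" is a stub that returns two fixed steps regardless of the query. The sketch only shows the division of labor: the client records the SPARQL text as a constant step, and a strategy later replaces that step with compiled steps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Toy model for illustration; these are stand-ins, not TinkerPop classes.
final class SparqlStrategySketch {
    // A step is just a name plus an optional argument.
    static final class Step {
        final String name;
        final String argument;

        Step(String name, String argument) {
            this.name = name;
            this.argument = argument;
        }

        @Override
        public String toString() {
            return argument == null ? name : name + "(" + argument + ")";
        }
    }

    // Client side: query() only records the SPARQL text as a constant step,
    // so nothing SPARQL-specific has to be understood before it is sent.
    static List<Step> query(String sparql) {
        List<Step> traversal = new ArrayList<>();
        traversal.add(new Step("ConstantStep", sparql));
        return traversal;
    }

    // Server side: the strategy replaces each constant step with the steps
    // produced by the supplied SPARQL compiler and keeps all other steps.
    static List<Step> applyStrategy(List<Step> traversal,
                                    Function<String, List<Step>> compiler) {
        List<Step> rewritten = new ArrayList<>();
        for (Step step : traversal) {
            if ("ConstantStep".equals(step.name)) {
                rewritten.addAll(compiler.apply(step.argument));
            } else {
                rewritten.add(step);
            }
        }
        return rewritten;
    }

    public static void main(String[] args) {
        // Stub compiler: always yields the same two steps, whatever the query says.
        Function<String, List<Step>> stubCompiler =
                sparql -> Arrays.asList(new Step("GraphStep", null), new Step("MatchStep", null));

        List<Step> sent = query("SELECT ?x ?y WHERE { ... }");
        System.out.println(sent);
        System.out.println(applyStrategy(sent, stubCompiler));
    }
}
```

Because the client side holds nothing but a string inside a constant step, the same shape carries over to any language variant that can build that one step.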

The benefits of this model:

1. SPARQL is a DSL. Makes sense in our vision outline.
2. Python, JavaScript, C#, etc. can easily support it via their DSL 
framework. Given the simplicity of just a query()-step, trivial indeed.
3. It works naturally withRemote().
4. It works naturally without remote as well.

Take care,
Marko.

http://markorodriguez.com


On 2017-12-08 05:44, "h...@gmail.com> wrote: 
> Hi Stephen,
>
> Thanks for the insight on the process of this integration. I will reply to 
> your comments in the same manner.
>
> 1. Yes, I will do the fork and migrate the code to the Tinkerpop repository, 
> after cleaning the code a bit. We also need to prepare a detailed doc 
> (how-to) for the plugin. This can also be done in parallel, depending upon 
> the urgency.
>
> 2. Yes, we both are contributing to the v0.2 of the sparql-gremlin plugin. We 
> will both submit the ICLAs.
>
> Yes, we (both) will continue to provide support for the 0.2 plugin and also 
> extend it in the future (trying to cover SPARQL 1.1 specification, also fix 
> the OPTIONAL fix in the current version).
>
> Looking forward to hearing more on this from the devs :)
>
> Cheers!
>
> On 2017-12-08 13:41, Stephen Mallette wrote:
> > I agree with Marko's thoughts, both on this topic of including
> > sparql-gremlin as well as the wider topic of what should be included in
> > TinkerPop code base more generally. Providing a path for rdf/sparql folks
> > to get into the TinkerPop world seems like a smart direction.
> >
> > Now, assuming that we have consensus to include sparql-gremlin in the
> > TinkerPop code base, the process will look something like this:
> >
> > 1. I think that Harsh should fork the TinkerPop repository and migrate
> > sparql-gremlin into its structure. From there we will provide
> > feedback/review to get that fork into best shape possible prior to his
> > submitting a pull request. I think we can handle initial feedback through
> > the dev list in a separate thread.
> >
> > 2. In parallel to the above item, it appears as though there are two
> > contributors on sparql-gremlin:
> >
> > https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin/graphs/contributors
> >
> > Both contributors, Harsh and Dharmen, should submit ICLAs:
> >
> > http://apache.org/licenses/icla.pdf
> >
> > and send them to secret...@apache.org.
> >
> > 3. Once ICLAs are confirmed by secretary, Harsh can submit a pull request
> > from his fork where it can undergo final review.
> >
> > Does that sound sensible to everyone?
> >
> > btw, Harsh, it sounds as though you intend to continue development on
> > sparql-gremlin after it is part of the TinkerPop repository...does Dharmen
> > intend to do the same?
> >
> > On Fri, Dec 8, 2017 at 6:54 AM, Stephen Mallette wrote:
> >
> > > linking marko's reply from the user list:
> > >
> > > https://groups.google.com/d/msg/gremlin-users/zK9jj7bWvrQ/nE1VvhmeAAAJ
> > >
> > > On Thu, Dec 7, 2017 at 1:52 PM, hars...@gmail.com wrote:
> > >
> > >> Hello, dear Gremlin people!
> > >>
> > >> Apologies for raising this topic a bit late. I planned to start this
> > >> thread quite earlier but wasn’t able to due to some reasons.
> > >>
> > >> === short ===
> > >> I seek your guidance and also help for polishing and integrating the
> > >> sparql-gremlin 0.2
> > >> (https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin) plugin
> > >> in the apache tinkerpop code
> > >> base, succeeding its predecessor developed by Daniel Kuppitz (
> > >> https://github.com/dkuppitz/sparql-gremlin). The new plugin offers
> > >> support 

Re: Subject: Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-14 Thread Joshua Shinavier
Hi Harsh,

Thanks for the detailed reply. I can't say with confidence that the TP2
suite could be re-implemented on top of Jena in that time frame (as I am a
long-time Sesame fan without much Jena experience), although a TP2 --> TP3
port could be done, keeping the Sesame (RDF4j) dependency.  GraphSail has
already been ported, and just needs some docs and more tests. I wonder if
it would be too crazy to support both, i.e. rdf4j-gremlin and jena-gremlin.

At any rate, it has been really good to see the recent upsurge of interest
in RDF and SPARQL support in the graph DB space. At Data Day Seattle, I
made the point that although Property Graphs came to prominence as a simple
and lightweight alternative to the Semantic Web standards, SemWeb-like
features -- such as schemas/ontologies and rules/reasoning -- keep finding
their way in.

How does your content-preserving RDF <--> PG interface compare with the
GraphSail mapping, the PropertyGraphSail mapping(s), or with Hartig's
"RDF*" [1]?


[1] https://arxiv.org/pdf/1409.3288.pdf


On Thu, Dec 14, 2017 at 8:43 AM, Harsh Thakkar  wrote:

>
> Hi Josh,
>
> I already wrote an elaborate reply to your comment. I think it went
> somewhere but didn't show up :(
>
> I will summarize my reply here now..
>
> Yes, I share the opinion that a continuous SPARQL implementation on top of
> Gremlin would be valuable. Also, in my current research I am working (as we
> speak) on a custom interface: an information-preserving RDF <-> PG
> converter. This will allow interoperability between the semantic web and
> graph database communities so that each can leverage the advantages of the
> other, i.e. the former can traverse and the latter can have a more diverse
> access portfolio to rich datasets.
>
> My Ph.D. thesis is more or less focused on this. It started from proposing
> a robust open and extensible benchmarking platform "LITMUS" [], which
> eventually led me to address all these issues and thus my keen interest :)
>
> If I understand correctly, regarding the other interfaces you mentioned:
> do you wish to see them eventually integrated into tinkerpop, or are you
> implying that this should already be done before the next release?
>
> Thanks for your pointers!
> Cheers!
>
> On 2017-12-13 16:46, Joshua Shinavier  wrote:
> > Hi Harsh,
> >
> > Glad you are taking Daniel's work forward. In porting the code to the
> > TinkerPop code base, might I suggest we allow for not only
> SPARQL-Gremlin,
> > but a whole suite of RDF tools as in TP2. Perhaps call the module
> > rdf-gremlin. Then we could have all of:
> >
> > * SPARQL-Gremlin: executes standard SPARQL queries over a Property Graph
> > database
> > * GraphSail [1,2]: stores RDF quads in the database, explicitly, and
> > enables SPARQL and triple pattern queries over the quads
> > * PropertyGraphSail [3]: exposes a Property Graph with one of two mappings
> > to the RDF data model
> > * SailGraph [4]: takes an RDF triple store (not natively supporting
> > Gremlin) and enables Gremlin queries
> > * others? I have often thought that a continuous SPARQL implementation
> > built on Gremlin would be powerful
> >
> > The biggest mismatch between the TP2 suite and what might be built for
> > Apache TinkerPop is that the previous suite was implemented using
> (Eclipse)
> > RDF4j, whereas things seem to be leaning towards (Apache) Jena now.
> > However, the same principles could be applied.
> >
> > Josh
> >
> >
> > [1] https://github.com/tinkerpop/blueprints/wiki/Sail-Ouplementation
> > [2] https://github.com/joshsh/graphsail
> > [3]
> > https://github.com/tinkerpop/blueprints/wiki/PropertyGraphSail-
> Ouplementation
> > [4] https://github.com/tinkerpop/blueprints/wiki/Sail-Implementation
> >
> [snip]


Subject: Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-14 Thread Harsh Thakkar

Hi Josh,

I already wrote an elaborate reply to your comment. I think it went somewhere 
but didn't show up :(

I will summarize my reply here now..

Yes, I share the opinion that a continuous SPARQL implementation on top of
Gremlin would be valuable. Also, in my current research I am working (as we
speak) on a custom interface: an information-preserving RDF <-> PG converter.
This will allow interoperability between the semantic web and graph database
communities so that each can leverage the advantages of the other, i.e. the
former can traverse and the latter can have a more diverse access portfolio to
rich datasets.

My Ph.D. thesis is more or less focused on this. It started from proposing a 
robust open and extensible benchmarking platform "LITMUS" [], which eventually 
led me to address all these issues and thus my keen interest :)

If I understand correctly, regarding the other interfaces you mentioned: do you
wish to see them eventually integrated into tinkerpop, or are you implying that
this should already be done before the next release?

Thanks for your pointers!
Cheers!

On 2017-12-13 16:46, Joshua Shinavier  wrote: 
> Hi Harsh,
> 
> Glad you are taking Daniel's work forward. In porting the code to the
> TinkerPop code base, might I suggest we allow for not only SPARQL-Gremlin,
> but a whole suite of RDF tools as in TP2. Perhaps call the module
> rdf-gremlin. Then we could have all of:
> 
> * SPARQL-Gremlin: executes standard SPARQL queries over a Property Graph
> database
> * GraphSail [1,2]: stores RDF quads in the database, explicitly, and
> enables SPARQL and triple pattern queries over the quads
> * PropertyGraphSail [3]: exposes a Property Graph with one of two mappings
> to the RDF data model
> * SailGraph [4]: takes an RDF triple store (not natively supporting
> Gremlin) and enables Gremlin queries
> * others? I have often thought that a continuous SPARQL implementation
> built on Gremlin would be powerful
> 
> The biggest mismatch between the TP2 suite and what might be built for
> Apache TinkerPop is that the previous suite was implemented using (Eclipse)
> RDF4j, whereas things seem to be leaning towards (Apache) Jena now.
> However, the same principles could be applied.
> 
> Josh
> 
> 
> [1] https://github.com/tinkerpop/blueprints/wiki/Sail-Ouplementation
> [2] https://github.com/joshsh/graphsail
> [3]
> https://github.com/tinkerpop/blueprints/wiki/PropertyGraphSail-Ouplementation
> [4] https://github.com/tinkerpop/blueprints/wiki/Sail-Implementation
> 
> 
> 
> 
> On Wed, Dec 13, 2017 at 4:03 AM, Stephen Mallette 
> wrote:
> 
> > I suggest that you read through the dev docs a bit. There's lots of little
> > odds/ends there about how to develop on the TinkerPop code base. For
> > example, for intellij issues, please have a look at this:
> >
> > http://tinkerpop.apache.org/docs/current/dev/developer/#_
> > ide_setup_with_intellij
> >
> > > - Also, when I did a mvn clean install on 3.3.1-SNAPSHOT, it did get
> > build success but a majority of the test cases failed. Not sure if this is
> > worth mentioning.
> >
> > there should be no failures on master and 3.3.1-SNAPSHOT. hard to say what
> > is wrong without some error logs
> >
> > > Also, is it okay if I use the 3.3.0 api version for my module or it is
> > absolutely necessary that we have to only use the 3.3.1-SNAPSHOT api
> > version?
> >
> > you should be on 3.3.1-SNAPSHOT for your API version now that you've
> > integrated sparql-gremlin into the tinkerpop code base. ultimately, it will
> > all release together as part of a single package, so it should all be on
> > the same version.
> >
> > On Wed, Dec 13, 2017 at 6:43 AM, Harsh Thakkar  wrote:
> >
> > > Hi Stephen,
> > >
> > > I cleaned up the code a bit and then, I tried testing the code merge
> > > yesterday and I ran into some issues for 3.3.1-SNAPSHOT version.
> > >
> > > - I forked apache/tinkerpop repository to my local account and loaded the
> > > same using an IDE (as maven project). This immediately threw errors in
> > the
> > > native repositories such as gremlin-core, stating that it is not able to
> > > find org.apache.tinkerpop.shaded.kryo.Kryo.
> > > - When I try building the code with 3.3.0 api it works perfectly without
> > > any error, however for 3.3.1-SNAPSHOT version it is not able to find
> > > various files and throws errors in the core modules of tinkerpop. Thus, I
> > > cannot test my module (sparql-gremlin) with the 3.3.1-SNAPSHOT version.
> > > - Also, when I did a mvn clean install on 3.3.1-SNAPSHOT, it did get
> > build
> > > success but a majority of the test cases failed. Not sure if this is
> > worth
> > > mentioning.
> > >
> > > What do you suggest? How do I fix this?
> > > Also, is it okay if I use the 3.3.0 api version for my module or it is
> > > absolutely necessary that we have to only use the 3.3.1-SNAPSHOT api
> > > version?
> > >
> > > Thanks in advance!
> > >
> > > On 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-14 Thread Harsh Thakkar
Hi Stephen,

Thanks for the pointers. We have good news. It turns out that the error was
mostly because of the IDE environment and some other shady stuff going wrong.
We finally managed to merge the sparql-gremlin work into the tinkerpop code
base. The merged fork of the tinkerpop repository can be found here -
https://github.com/harsh9t/tinkerpop

Please have a look at it and let us know what is to be done next.

Meanwhile, we are looking at the documentation on how to generate the docs and
other specifics. If we have questions, we will get back to you. I haven't done
this before so expect some :D

Cheers!


On 2017-12-13 13:03, Stephen Mallette  wrote: 
> I suggest that you read through the dev docs a bit. There's lots of little
> odds/ends there about how to develop on the TinkerPop code base. For
> example, for intellij issues, please have a look at this:
> 
> http://tinkerpop.apache.org/docs/current/dev/developer/#_ide_setup_with_intellij
> 
> > - Also, when I did a mvn clean install on 3.3.1-SNAPSHOT, it did get
> build success but a majority of the test cases failed. Not sure if this is
> worth mentioning.
> 
> there should be no failures on master and 3.3.1-SNAPSHOT. hard to say what
> is wrong without some error logs
> 
> > Also, is it okay if I use the 3.3.0 api version for my module or it is
> absolutely necessary that we have to only use the 3.3.1-SNAPSHOT api
> version?
> 
> you should be on 3.3.1-SNAPSHOT for your API version now that you've
> integrated sparql-gremlin into the tinkerpop code base. ultimately, it will
> all release together as part of a single package, so it should all be on
> the same version.
> 
> On Wed, Dec 13, 2017 at 6:43 AM, Harsh Thakkar  wrote:
> 
> > Hi Stephen,
> >
> > I cleaned up the code a bit and then, I tried testing the code merge
> > yesterday and I ran into some issues for 3.3.1-SNAPSHOT version.
> >
> > - I forked apache/tinkerpop repository to my local account and loaded the
> > same using an IDE (as maven project). This immediately threw errors in the
> > native repositories such as gremlin-core, stating that it is not able to
> > find org.apache.tinkerpop.shaded.kryo.Kryo.
> > - When I try building the code with 3.3.0 api it works perfectly without
> > any error, however for 3.3.1-SNAPSHOT version it is not able to find
> > various files and throws errors in the core modules of tinkerpop. Thus, I
> > cannot test my module (sparql-gremlin) with the 3.3.1-SNAPSHOT version.
> > - Also, when I did a mvn clean install on 3.3.1-SNAPSHOT, it did get build
> > success but a majority of the test cases failed. Not sure if this is worth
> > mentioning.
> >
> > What do you suggest? How do I fix this?
> > Also, is it okay if I use the 3.3.0 api version for my module or it is
> > absolutely necessary that we have to only use the 3.3.1-SNAPSHOT api
> > version?
> >
> > Thanks in advance!
> >
> > On 2017-12-12 12:58, Stephen Mallette  wrote:
> > > yes - please post questions here. i don't think you need to know much
> > about
> > > TinkerPop internal structure. I'd think that sparql-gremlin is expected
> > to
> > > be included in the root of the TinkerPop source as a sub-module to the
> > > top-level pom. That just means some minor changes to your pom.xml to get
> > it
> > > to build along with everything else. See other projects for examples:
> > >
> > > https://github.com/apache/tinkerpop/blob/f5687ee4497bfbaef4ae89233e4c29
> > f07001ed2c/gremlin-core/pom.xml#L20-L24
> > >
> > > You can drop all of this because it is already defined in the root
> > pom.xml:
> > >
> > > https://github.com/LITMUS-Benchmark-Suite/sparql-to-
> > gremlin/blob/master/pom.xml#L30-L70
> > >
> > > Looking at the rest of your pom.xml now, I'm not sure I understand
> > > everything your  section is doing and if it's all necessary: The
> > > root pom.xml should handle the most common build/deploy options and they
> > > will be thus inherited to your sub-module pom which is why, for example,
> > > the gremlin-core pom is pretty simple for the  section:
> > >
> > > https://github.com/apache/tinkerpop/blob/f5687ee4497bfbaef4ae89233e4c29
> > f07001ed2c/gremlin-core/pom.xml#L119-L151
> > >
> > > If there's anything you're sure can be removed from the sparql-gremlin
> > > pom.xml  section based on how the TinkerPop root pom.xml is setup,
> > > > then please feel free to clean up as much as possible there.
> > >
> > > As for the general project structure of sparql-gremlin, I don't fully
> > > understand how it is arranged. There's
> > >
> > > /Queries
> > > /doc
> > > /docs/images
> > > /output
> > > /src
> > >
> > > and all of that is repeated inside of the /bin directory. something seems
> > > amiss there. maybe once that's cleared up a bit I can think more clearly
> > on
> > > what additional changes you might need.
> > >
> > > > Another important thing to consider: documentation. Right now, it's all
> > > in the README. I think we will want a 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-13 Thread Joshua Shinavier
Hi Harsh,

Glad you are taking Daniel's work forward. In porting the code to the
TinkerPop code base, might I suggest we allow for not only SPARQL-Gremlin,
but a whole suite of RDF tools as in TP2. Perhaps call the module
rdf-gremlin. Then we could have all of:

* SPARQL-Gremlin: executes standard SPARQL queries over a Property Graph
database
* GraphSail [1,2]: stores RDF quads in the database, explicitly, and
enables SPARQL and triple pattern queries over the quads
* PropertyGraphSail [3]: exposes a Property Graph with one of two mappings to
the RDF data model
* SailGraph [4]: takes an RDF triple store (not natively supporting
Gremlin) and enables Gremlin queries
* others? I have often thought that a continuous SPARQL implementation
built on Gremlin would be powerful

The biggest mismatch between the TP2 suite and what might be built for
Apache TinkerPop is that the previous suite was implemented using (Eclipse)
RDF4j, whereas things seem to be leaning towards (Apache) Jena now.
However, the same principles could be applied.

Josh


[1] https://github.com/tinkerpop/blueprints/wiki/Sail-Ouplementation
[2] https://github.com/joshsh/graphsail
[3]
https://github.com/tinkerpop/blueprints/wiki/PropertyGraphSail-Ouplementation
[4] https://github.com/tinkerpop/blueprints/wiki/Sail-Implementation




On Wed, Dec 13, 2017 at 4:03 AM, Stephen Mallette 
wrote:

> I suggest that you read through the dev docs a bit. There's lots of little
> odds/ends there about how to develop on the TinkerPop code base. For
> example, for intellij issues, please have a look at this:
>
> http://tinkerpop.apache.org/docs/current/dev/developer/#_
> ide_setup_with_intellij
>
> > - Also, when I did a mvn clean install on 3.3.1-SNAPSHOT, it did get
> build success but a majority of the test cases failed. Not sure if this is
> worth mentioning.
>
> there should be no failures on master and 3.3.1-SNAPSHOT. hard to say what
> is wrong without some error logs
>
> > Also, is it okay if I use the 3.3.0 api version for my module or it is
> absolutely necessary that we have to only use the 3.3.1-SNAPSHOT api
> version?
>
> you should be on 3.3.1-SNAPSHOT for your API version now that you've
> integrated sparql-gremlin into the tinkerpop code base. ultimately, it will
> all release together as part of a single package, so it should all be on
> the same version.
>
> On Wed, Dec 13, 2017 at 6:43 AM, Harsh Thakkar  wrote:
>
> > Hi Stephen,
> >
> > I cleaned up the code a bit and then, I tried testing the code merge
> > yesterday and I ran into some issues for 3.3.1-SNAPSHOT version.
> >
> > - I forked apache/tinkerpop repository to my local account and loaded the
> > same using an IDE (as maven project). This immediately threw errors in
> the
> > native repositories such as gremlin-core, stating that it is not able to
> > find org.apache.tinkerpop.shaded.kryo.Kryo.
> > - When I try building the code with 3.3.0 api it works perfectly without
> > any error, however for 3.3.1-SNAPSHOT version it is not able to find
> > various files and throws errors in the core modules of tinkerpop. Thus, I
> > cannot test my module (sparql-gremlin) with the 3.3.1-SNAPSHOT version.
> > - Also, when I did a mvn clean install on 3.3.1-SNAPSHOT, it did get
> build
> > success but a majority of the test cases failed. Not sure if this is
> worth
> > mentioning.
> >
> > What do you suggest? How do I fix this?
> > Also, is it okay if I use the 3.3.0 api version for my module or it is
> > absolutely necessary that we have to only use the 3.3.1-SNAPSHOT api
> > version?
> >
> > Thanks in advance!
> >
> > On 2017-12-12 12:58, Stephen Mallette  wrote:
> > > yes - please post questions here. i don't think you need to know much
> > about
> > > TinkerPop internal structure. I'd think that sparql-gremlin is expected
> > to
> > > be included in the root of the TinkerPop source as a sub-module to the
> > > top-level pom. That just means some minor changes to your pom.xml to
> get
> > it
> > > to build along with everything else. See other projects for examples:
> > >
> > > https://github.com/apache/tinkerpop/blob/
> f5687ee4497bfbaef4ae89233e4c29
> > f07001ed2c/gremlin-core/pom.xml#L20-L24
> > >
> > > You can drop all of this because it is already defined in the root
> > pom.xml:
> > >
> > > https://github.com/LITMUS-Benchmark-Suite/sparql-to-
> > gremlin/blob/master/pom.xml#L30-L70
> > >
> > > Looking at the rest of your pom.xml now, I'm not sure I understand
> > > everything your  section is doing and if it's all necessary: The
> > > root pom.xml should handle the most common build/deploy options and
> they
> > > will be thus inherited to your sub-module pom which is why, for
> example,
> > > the gremlin-core pom is pretty simple for the  section:
> > >
> > > https://github.com/apache/tinkerpop/blob/
> f5687ee4497bfbaef4ae89233e4c29
> > f07001ed2c/gremlin-core/pom.xml#L119-L151
> > >
> > > If there's anything you're sure can 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-13 Thread Stephen Mallette
I suggest that you read through the dev docs a bit. There's lots of little
odds/ends there about how to develop on the TinkerPop code base. For
example, for intellij issues, please have a look at this:

http://tinkerpop.apache.org/docs/current/dev/developer/#_ide_setup_with_intellij

> - Also, when I did a mvn clean install on 3.3.1-SNAPSHOT, it did get
build success but a majority of the test cases failed. Not sure if this is
worth mentioning.

there should be no failures on master and 3.3.1-SNAPSHOT. hard to say what
is wrong without some error logs

> Also, is it okay if I use the 3.3.0 api version for my module or it is
absolutely necessary that we have to only use the 3.3.1-SNAPSHOT api
version?

you should be on 3.3.1-SNAPSHOT for your API version now that you've
integrated sparql-gremlin into the tinkerpop code base. ultimately, it will
all release together as part of a single package, so it should all be on
the same version.
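
As a rough illustration of the "same version" point above, here is a minimal
Python sketch that compares the parent version a sub-module pom.xml declares
with the version of the root pom.xml. The two inline POM strings are trimmed,
hypothetical stand-ins (the sparql-gremlin one deliberately still points at
3.3.0), and the function names are made up for this example; it is not part of
the TinkerPop build.

```python
import xml.etree.ElementTree as ET

# Maven POM elements live in this XML namespace.
NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical, trimmed-down root pom (assumption: illustration only).
ROOT_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <groupId>org.apache.tinkerpop</groupId>
  <artifactId>tinkerpop</artifactId>
  <version>3.3.1-SNAPSHOT</version>
</project>"""

# Hypothetical sub-module pom whose parent version was left at 3.3.0.
MODULE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <parent>
    <groupId>org.apache.tinkerpop</groupId>
    <artifactId>tinkerpop</artifactId>
    <version>3.3.0</version>
  </parent>
  <artifactId>sparql-gremlin</artifactId>
</project>"""


def root_version(pom_xml):
    """Return the <version> declared directly under <project>."""
    project = ET.fromstring(pom_xml)
    return project.findtext("m:version", namespaces=NS)


def parent_version(pom_xml):
    """Return the <version> declared inside the <parent> element."""
    project = ET.fromstring(pom_xml)
    return project.findtext("m:parent/m:version", namespaces=NS)


def versions_match(root_pom, module_pom):
    """True when the module's parent version equals the root version."""
    return root_version(root_pom) == parent_version(module_pom)


if __name__ == "__main__":
    print("root:  ", root_version(ROOT_POM))
    print("module:", parent_version(MODULE_POM))
    print("match: ", versions_match(ROOT_POM, MODULE_POM))
```

A mismatch like the one in this sketch is the kind of thing that could explain
a module building against 3.3.0 while the rest of the tree is on
3.3.1-SNAPSHOT.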

On Wed, Dec 13, 2017 at 6:43 AM, Harsh Thakkar  wrote:

> Hi Stephen,
>
> I cleaned up the code a bit and then, I tried testing the code merge
> yesterday and I ran into some issues for 3.3.1-SNAPSHOT version.
>
> - I forked apache/tinkerpop repository to my local account and loaded the
> same using an IDE (as maven project). This immediately threw errors in the
> native repositories such as gremlin-core, stating that it is not able to
> find org.apache.tinkerpop.shaded.kryo.Kryo.
> - When I try building the code with 3.3.0 api it works perfectly without
> any error, however for 3.3.1-SNAPSHOT version it is not able to find
> various files and throws errors in the core modules of tinkerpop. Thus, I
> cannot test my module (sparql-gremlin) with the 3.3.1-SNAPSHOT version.
> - Also, when I did a mvn clean install on 3.3.1-SNAPSHOT, it did get build
> success but a majority of the test cases failed. Not sure if this is worth
> mentioning.
>
> What do you suggest? How do I fix this?
> Also, is it okay if I use the 3.3.0 api version for my module or it is
> absolutely necessary that we have to only use the 3.3.1-SNAPSHOT api
> version?
>
> Thanks in advance!
>
> On 2017-12-12 12:58, Stephen Mallette  wrote:
> > yes - please post questions here. i don't think you need to know much
> about
> > TinkerPop internal structure. I'd think that sparql-gremlin is expected
> to
> > be included in the root of the TinkerPop source as a sub-module to the
> > top-level pom. That just means some minor changes to your pom.xml to get
> it
> > to build along with everything else. See other projects for examples:
> >
> > https://github.com/apache/tinkerpop/blob/f5687ee4497bfbaef4ae89233e4c29
> f07001ed2c/gremlin-core/pom.xml#L20-L24
> >
> > You can drop all of this because it is already defined in the root
> pom.xml:
> >
> > https://github.com/LITMUS-Benchmark-Suite/sparql-to-
> gremlin/blob/master/pom.xml#L30-L70
> >
> > Looking at the rest of your pom.xml now, I'm not sure I understand
> > everything your  section is doing and if it's all necessary: The
> > root pom.xml should handle the most common build/deploy options and they
> > will be thus inherited to your sub-module pom which is why, for example,
> > the gremlin-core pom is pretty simple for the  section:
> >
> > https://github.com/apache/tinkerpop/blob/f5687ee4497bfbaef4ae89233e4c29
> f07001ed2c/gremlin-core/pom.xml#L119-L151
> >
> > If there's anything you're sure can be removed from the sparql-gremlin
> > pom.xml  section based on how the TinkerPop root pom.xml is setup,
> > then please feel free to clean up as much as possible there.
> >
> > As for the general project structure of sparql-gremlin, I don't fully
> > understand how it is arranged. There's
> >
> > /Queries
> > /doc
> > /docs/images
> > /output
> > /src
> >
> > and all of that is repeated inside of the /bin directory. something seems
> > amiss there. maybe once that's cleared up a bit I can think more clearly
> on
> > what additional changes you might need.
> >
> > Another important thing to consider: documentation. Right now, it's all
> > in the README. I think we will want a new section to the Reference
> > Documentation, probably appearing after Gremlin Variants:
> >
> > http://tinkerpop.apache.org/docs/current/reference/#gremlin-variants
> >
> > Perhaps that could be named "Query Languages" where sparql-gremlin would
> be
> > the first sub-section. That would set up for some future where we also
> had
> > sql-gremlin.  And perhaps a section for cypher-gremlin which could point
> to
> > Neo4j's work in this area.  You can find reference docs for TinkerPop
> here:
> >
> > https://github.com/apache/tinkerpop/tree/f5687ee4497bfbaef4ae89233e4c29
> f07001ed2c/docs/src/reference
> >
> > The easiest way to generate docs is with docker via:
> >
> > docker/build.sh -d
> >
> > without that you need hadoop running with appropriate configurations:
> >
> > http://tinkerpop.apache.org/docs/current/dev/developer/#
> documentation-environment
> >

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-13 Thread Harsh Thakkar
Hi Stephen,

I cleaned up the code a bit and then, I tried testing the code merge yesterday 
and I ran into some issues for 3.3.1-SNAPSHOT version.

- I forked apache/tinkerpop repository to my local account and loaded the same 
using an IDE (as maven project). This immediately threw errors in the native 
repositories such as gremlin-core, stating that it is not able to find 
org.apache.tinkerpop.shaded.kryo.Kryo. 
- When I try building the code with 3.3.0 api it works perfectly without any 
error, however for 3.3.1-SNAPSHOT version it is not able to find various files 
and throws errors in the core modules of tinkerpop. Thus, I cannot test my 
module (sparql-gremlin) with the 3.3.1-SNAPSHOT version. 
- Also, when I did a mvn clean install on 3.3.1-SNAPSHOT, it did get build
success but a majority of the test cases failed. Not sure if this is worth 
mentioning.

What do you suggest? How do I fix this? 
Also, is it okay if I use the 3.3.0 api version for my module, or is it
absolutely necessary that we use only the 3.3.1-SNAPSHOT api version?

Thanks in advance!

On 2017-12-12 12:58, Stephen Mallette  wrote: 
> yes - please post questions here. i don't think you need to know much about
> TinkerPop internal structure. I'd think that sparql-gremlin is expected to
> be included in the root of the TinkerPop source as a sub-module to the
> top-level pom. That just means some minor changes to your pom.xml to get it
> to build along with everything else. See other projects for examples:
> 
> https://github.com/apache/tinkerpop/blob/f5687ee4497bfbaef4ae89233e4c29f07001ed2c/gremlin-core/pom.xml#L20-L24
> 
> You can drop all of this because it is already defined in the root pom.xml:
> 
> https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin/blob/master/pom.xml#L30-L70
> 
> Looking at the rest of your pom.xml now, I'm not sure I understand
> everything your  section is doing and if it's all necessary: The
> root pom.xml should handle the most common build/deploy options and they
> will be thus inherited to your sub-module pom which is why, for example,
> the gremlin-core pom is pretty simple for the  section:
> 
> https://github.com/apache/tinkerpop/blob/f5687ee4497bfbaef4ae89233e4c29f07001ed2c/gremlin-core/pom.xml#L119-L151
> 
> If there's anything you're sure can be removed from the sparql-gremlin
> pom.xml  section based on how the TinkerPop root pom.xml is setup,
> then please feel free to clean up as much as possible there.
> 
> As for the general project structure of sparql-gremlin, I don't fully
> understand how it is arranged. There's
> 
> /Queries
> /doc
> /docs/images
> /output
> /src
> 
> and all of that is repeated inside of the /bin directory. something seems
> amiss there. maybe once that's cleared up a bit I can think more clearly on
> what additional changes you might need.
> 
> Another important thing to consider: documentation. Right now, it's all
> in the README. I think we will want a new section to the Reference
> Documentation, probably appearing after Gremlin Variants:
> 
> http://tinkerpop.apache.org/docs/current/reference/#gremlin-variants
> 
> Perhaps that could be named "Query Languages" where sparql-gremlin would be
> the first sub-section. That would set up for some future where we also had
> sql-gremlin.  And perhaps a section for cypher-gremlin which could point to
> Neo4j's work in this area.  You can find reference docs for TinkerPop here:
> 
> https://github.com/apache/tinkerpop/tree/f5687ee4497bfbaef4ae89233e4c29f07001ed2c/docs/src/reference
> 
> The easiest way to generate docs is with docker via:
> 
> docker/build.sh -d
> 
> without that you need hadoop running with appropriate configurations:
> 
> http://tinkerpop.apache.org/docs/current/dev/developer/#documentation-environment
> 
> Well, hope that gives you a few things to work on and think about for your
> first round of changes in your fork. Looking forward to seeing how this PR
> shapes up!
> 
> 
> On Tue, Dec 12, 2017 at 3:40 AM, Harsh Thakkar  wrote:
> 
> > Hi Stephen,
> >
> > Very well then, we will start the migration from today. Also we will
> > submit the signed ICLAs today.
> >
> > If we have some questions regarding building the code properly, can we
> > feel free to ask them here? I assume we might need some guidance on how to
> > get things plugged in correctly. We are both not very familiar with the
> > internal structure of TinkerPop, so that is why.
> >
> > Also, if there is any specific documentation to help us with this, please
> > lend a pointer.
> >
> > Many thanks!
> >
> > On 2017-12-11 21:33, Stephen Mallette  wrote:
> > > As there haven't been any other opinions, it seems we have a lazy
> > consensus
> > > to accept sparql-gremlin into TinkerPop's code base. Cool!
> > >
> > > Harsh, I think you and Dharmen should proceed with the steps I listed
> > > above. Once you have the code integrated and building properly in your
> > > 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-12 Thread Stephen Mallette
yes - please post questions here. i don't think you need to know much about
TinkerPop internal structure. I'd think that sparql-gremlin is expected to
be included in the root of the TinkerPop source as a sub-module to the
top-level pom. That just means some minor changes to your pom.xml to get it
to build along with everything else. See other projects for examples:

https://github.com/apache/tinkerpop/blob/f5687ee4497bfbaef4ae89233e4c29f07001ed2c/gremlin-core/pom.xml#L20-L24

You can drop all of this because it is already defined in the root pom.xml:

https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin/blob/master/pom.xml#L30-L70

Looking at the rest of your pom.xml now, I'm not sure I understand
everything your  section is doing and if it's all necessary: The
root pom.xml should handle the most common build/deploy options and they
will be thus inherited to your sub-module pom which is why, for example,
the gremlin-core pom is pretty simple for the  section:

https://github.com/apache/tinkerpop/blob/f5687ee4497bfbaef4ae89233e4c29f07001ed2c/gremlin-core/pom.xml#L119-L151

If there's anything you're sure can be removed from the sparql-gremlin
pom.xml  section based on how the TinkerPop root pom.xml is setup,
then please feel free to clean up as much as possible there.

As for the general project structure of sparql-gremlin, I don't fully
understand how it is arranged. There's

/Queries
/doc
/docs/images
/output
/src

and all of that is repeated inside of the /bin directory. something seems
amiss there. maybe once that's cleared up a bit I can think more clearly on
what additional changes you might need.
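
To make the duplication concrete, a small Python sketch along these lines
could list which top-level directories also show up inside /bin. It runs
against a throwaway layout built in a temporary directory that only mimics the
names listed above; the helper names are hypothetical, and nothing here
inspects the real sparql-to-gremlin repository.

```python
import tempfile
from pathlib import Path
from typing import List


def duplicated_under_bin(repo_root: Path) -> List[str]:
    """Return names of top-level directories that also exist inside bin/."""
    bin_dir = repo_root / "bin"
    if not bin_dir.is_dir():
        return []
    top_level = {
        p.name for p in repo_root.iterdir() if p.is_dir() and p.name != "bin"
    }
    nested = {p.name for p in bin_dir.iterdir() if p.is_dir()}
    return sorted(top_level & nested)


def make_demo_layout(root: Path) -> None:
    """Create a made-up layout with every directory repeated under bin/."""
    for name in ("Queries", "doc", "docs/images", "output", "src"):
        (root / name).mkdir(parents=True, exist_ok=True)
        (root / "bin" / name).mkdir(parents=True, exist_ok=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        make_demo_layout(root)
        for name in duplicated_under_bin(root):
            print("duplicated under bin/:", name)
```

Pointing duplicated_under_bin at a real checkout would show which copies are
candidates for removal before the module is moved under the TinkerPop root.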

Another important thing to consider: documentation. Right now, it's all
in the README. I think we will want a new section to the Reference
Documentation, probably appearing after Gremlin Variants:

http://tinkerpop.apache.org/docs/current/reference/#gremlin-variants

Perhaps that could be named "Query Languages" where sparql-gremlin would be
the first sub-section. That would set up for some future where we also had
sql-gremlin.  And perhaps a section for cypher-gremlin which could point to
Neo4j's work in this area.  You can find reference docs for TinkerPop here:

https://github.com/apache/tinkerpop/tree/f5687ee4497bfbaef4ae89233e4c29f07001ed2c/docs/src/reference

The easiest way to generate docs is with docker via:

docker/build.sh -d

without that you need hadoop running with appropriate configurations:

http://tinkerpop.apache.org/docs/current/dev/developer/#documentation-environment

Well, hope that gives you a few things to work on and think about for your
first round of changes in your fork. Looking forward to seeing how this PR
shapes up!


On Tue, Dec 12, 2017 at 3:40 AM, Harsh Thakkar  wrote:

> Hi Stephen,
>
> Very well then, we will start the migration from today. Also we will
> submit the signed ICLAs today.
>
> If we have some questions regarding building the code properly, can we
> feel free to ask them here? I assume we might need some guidance on how to
> get things plugged in correctly. We are both not very familiar with the
> internal structure of TinkerPop, so that is why.
>
> Also, if there is any specific documentation to help us with this, please
> lend a pointer.
>
> Many thanks!
>
> On 2017-12-11 21:33, Stephen Mallette  wrote:
> > As there haven't been any other opinions, it seems we have a lazy
> > consensus to accept sparql-gremlin into TinkerPop's code base. Cool!
> >
> > Harsh, I think you and Dharmen should proceed with the steps I listed
> > above. Once you have the code integrated and building properly in your
> > fork, please reply back and point us to it and we can start with some
> > coarse grained review of what you have.
> >
> > Thanks,
> >
> > Stephen
> >
> > On Fri, Dec 8, 2017 at 8:44 AM, hars...@gmail.com 
> wrote:
> >
> > > Hi Stephen,
> > >
> > > Thanks for the insight on the process of this integration. I will reply
> > > to your comments in the same manner.
> > >
> > > 1. Yes, I will do the fork and migrate the code to the TinkerPop
> > > repository, after cleaning the code a bit. We also need to prepare a
> > > detailed doc (how-to) for the plugin. This can also be done in parallel,
> > > depending upon the urgency.
> > >
> > > 2. Yes, we both are contributing to the v0.2 of the sparql-gremlin plugin.
> > > We will both submit the ICLAs.
> > >
> > > Yes, we (both) will continue to provide support for the 0.2 plugin and
> > > also extend it in the future (trying to cover SPARQL 1.1 specification,
> > > also fix OPTIONAL support in the current version).
> > >
> > > Looking forward to hearing more on this from the devs :)
> > >
> > > Cheers!
> > >
> > > On 2017-12-08 13:41, Stephen Mallette  wrote:
> > > > I agree with Marko's thoughts, both on this topic of including
> > > > sparql-gremlin as well as the wider topic of what should be included in
> > > > TinkerPop code 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-12 Thread Harsh Thakkar
Hi Stephen,

Very well then, we will start the migration today. We will also submit the
signed ICLAs today.

If we have questions about building the code properly, may we ask them here?
I expect we will need some guidance on getting things plugged in correctly,
since neither of us is very familiar with the internal structure of
TinkerPop.

Also, if there is any specific documentation that would help us with this,
please point us to it.

Many thanks!

On 2017-12-11 21:33, Stephen Mallette  wrote: 
> As there haven't been any other opinions, it seems we have a lazy consensus
> to accept sparql-gremlin into TinkerPop's code base. Cool!
> 
> Harsh, I think you and Dharmen should proceed with the steps I listed
> above. Once you have the code integrated and building properly in your
> fork, please reply back and point us to it and we can start with some
> coarse grained review of what you have.
> 
> Thanks,
> 
> Stephen
> 
> On Fri, Dec 8, 2017 at 8:44 AM, hars...@gmail.com  wrote:
> 
> > Hi Stephen,
> >
> > Thanks for the insight on the process of this integration. I will reply to
> > your comments in the same manner.
> >
> > 1. Yes, I will do the fork and migrate the code to the TinkerPop
> > repository, after cleaning the code a bit. We also need to prepare a
> > detailed doc (how-to) for the plugin. This can also be done in parallel,
> > depending upon the urgency.
> >
> > 2. Yes, we both are contributing to the v0.2 of the sparql-gremlin plugin.
> > We will both submit the ICLAs.
> >
> > Yes, we (both) will continue to provide support for the 0.2 plugin and
> > also extend it in the future (trying to cover SPARQL 1.1 specification,
> > also fix OPTIONAL support in the current version).
> >
> > Looking forward to hearing more on this from the devs :)
> >
> > Cheers!
> >
> > On 2017-12-08 13:41, Stephen Mallette  wrote:
> > > I agree with Marko's thoughts, both on this topic of including
> > > sparql-gremlin as well as the wider topic of what should be included in
> > > TinkerPop code base more generally. Providing a path for rdf/sparql folks
> > > to get into the TinkerPop world seems like a smart direction.
> > >
> > > Now, assuming that we have consensus to include sparql-gremlin in the
> > > TinkerPop code base, the process will look something like this:
> > >
> > > 1. I think that Harsh should fork the TinkerPop repository and migrate
> > > sparql-gremlin into its structure. From there we will provide
> > > feedback/review to get that fork into best shape possible prior to his
> > > submitting a pull request. I think we can handle initial feedback through
> > > the dev list in a separate thread.
> > >
> > > 2. In parallel to the above item, it appears as though there are two
> > > contributors on sparql-gremlin:
> > >
> > > https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin/graphs/contributors
> > >
> > > Both contributors, Harsh and Dharmen, should submit ICLAs:
> > >
> > > http://apache.org/licenses/icla.pdf
> > >
> > > and send them to secret...@apache.org.
> > >
> > > 3. Once ICLAs are confirmed by secretary, Harsh can submit a pull request
> > > from his fork where it can undergo final review.
> > >
> > > Does that sound sensible to everyone?
> > >
> > > btw, Harsh, it sounds as though you intend to continue development on
> > > sparql-gremlin after it is part of the TinkerPop repository...does
> > > Dharmen intend to do the same?
> > >
> > > On Fri, Dec 8, 2017 at 6:54 AM, Stephen Mallette 
> > > wrote:
> > >
> > > > linking marko's reply from the user list:
> > > >
> > > > https://groups.google.com/d/msg/gremlin-users/zK9jj7bWvrQ/nE1VvhmeAAAJ
> > > >
> > > > On Thu, Dec 7, 2017 at 1:52 PM, hars...@gmail.com 
> > > > wrote:
> > > >
> > > >> Hello, dear Gremlin people!
> > > >>
> > > >> Apologies for raising this topic a bit late. I planned to start this
> > > >> thread quite earlier but wasn’t able to due to some reasons.
> > > >>
> > > >> === short ===
> > > >> I seek your guidance and also help for polishing and integrating the
> > > >> sparql-gremlin 0.2
> > > >> (https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin) plugin in
> > > >> the Apache TinkerPop code base, succeeding its predecessor developed by
> > > >> Daniel Kuppitz (https://github.com/dkuppitz/sparql-gremlin). The new
> > > >> plugin offers support for a wide range of SPARQL queries from the
> > > >> SPARQL 1.0 features.
> > > >>
> > > >>
> > > >> === long ===
> > > >>
> > > >> I am a Ph.D. student at the University of Bonn and work at the
> > > >> intersection of semantic web and graph databases. My thesis is focused on
> > > >> bridging the gap between these two domains by enabling support for SPARQL
> > > >> 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-11 Thread Stephen Mallette
As there haven't been any other opinions, it seems we have a lazy consensus
to accept sparql-gremlin into TinkerPop's code base. Cool!

Harsh, I think you and Dharmen should proceed with the steps I listed
above. Once you have the code integrated and building properly in your
fork, please reply back and point us to it and we can start with some
coarse-grained review of what you have.

Thanks,

Stephen

On Fri, Dec 8, 2017 at 8:44 AM, hars...@gmail.com  wrote:

> Hi Stephen,
>
> Thanks for the insight on the process of this integration. I will reply to
> your comments in the same manner.
>
> 1. Yes, I will do the fork and migrate the code to the TinkerPop
> repository, after cleaning the code a bit. We also need to prepare a
> detailed doc (how-to) for the plugin. This can also be done in parallel,
> depending upon the urgency.
>
> 2. Yes, we both are contributing to the v0.2 of the sparql-gremlin plugin.
> We will both submit the ICLAs.
>
> Yes, we (both) will continue to provide support for the 0.2 plugin and
> also extend it in the future (trying to cover SPARQL 1.1 specification,
> also fix OPTIONAL support in the current version).
>
> Looking forward to hearing more on this from the devs :)
>
> Cheers!
>
> On 2017-12-08 13:41, Stephen Mallette  wrote:
> > I agree with Marko's thoughts, both on this topic of including
> > sparql-gremlin as well as the wider topic of what should be included in
> > TinkerPop code base more generally. Providing a path for rdf/sparql folks
> > to get into the TinkerPop world seems like a smart direction.
> >
> > Now, assuming that we have consensus to include sparql-gremlin in the
> > TinkerPop code base, the process will look something like this:
> >
> > 1. I think that Harsh should fork the TinkerPop repository and migrate
> > sparql-gremlin into its structure. From there we will provide
> > feedback/review to get that fork into best shape possible prior to his
> > submitting a pull request. I think we can handle initial feedback through
> > the dev list in a separate thread.
> >
> > 2. In parallel to the above item, it appears as though there are two
> > contributors on sparql-gremlin:
> >
> > https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin/graphs/contributors
> >
> > Both contributors, Harsh and Dharmen, should submit ICLAs:
> >
> > http://apache.org/licenses/icla.pdf
> >
> > and send them to secret...@apache.org.
> >
> > 3. Once ICLAs are confirmed by secretary, Harsh can submit a pull request
> > from his fork where it can undergo final review.
> >
> > Does that sound sensible to everyone?
> >
> > btw, Harsh, it sounds as though you intend to continue development on
> > sparql-gremlin after it is part of the TinkerPop repository...does
> > Dharmen intend to do the same?
> >
> > On Fri, Dec 8, 2017 at 6:54 AM, Stephen Mallette 
> > wrote:
> >
> > > linking marko's reply from the user list:
> > >
> > > https://groups.google.com/d/msg/gremlin-users/zK9jj7bWvrQ/nE1VvhmeAAAJ
> > >
> > > On Thu, Dec 7, 2017 at 1:52 PM, hars...@gmail.com 
> > > wrote:
> > >
> > >> Hello, dear Gremlin people!
> > >>
> > >> Apologies for raising this topic a bit late. I planned to start this
> > >> thread quite earlier but wasn’t able to due to some reasons.
> > >>
> > >> === short ===
> > >> I seek your guidance and also help for polishing and integrating the
> > >> sparql-gremlin 0.2
> > >> (https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin) plugin in
> > >> the Apache TinkerPop code base, succeeding its predecessor developed by
> > >> Daniel Kuppitz (https://github.com/dkuppitz/sparql-gremlin). The new
> > >> plugin offers support for a wide range of SPARQL queries from the
> > >> SPARQL 1.0 features.
> > >>
> > >>
> > >> === long ===
> > >>
> > >> I am a Ph.D. student at the University of Bonn and work at the
> > >> intersection of semantic web and graph databases. My thesis is focused on
> > >> bridging the gap between these two domains by enabling support for SPARQL
> > >> querying of Property Graph databases. Thus, working on the SPARQL-Gremlin
> > >> interoperability was an obvious idea given the wide popularity of Gremlin
> > >> amongst the Graph DB vendors.
> > >>
> > >> The sparql-gremlin 0.1 (link - https://github.com/dkuppitz/sparql-gremlin)
> > >> plugin was developed by Daniel Kuppitz, which we have extended to support
> > >> various features of the SPARQL 1.0 specification and have tested using
> > >> various synthetic datasets (such as Northwind dataset and the Berlin SPARQL
> > >> Benchmark [BSBM] dataset) and a wide range of SPARQL queries.
> > >>
> > >> The extended version of the plugin (sparql-gremlin 0.2, link -
> > >> https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin) supports a
> > >> variety of query 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-08 Thread hars...@gmail.com
Hi Stephen,

Thanks for the insight on the process of this integration. I will reply to your 
comments in the same manner.

1. Yes, I will fork the TinkerPop repository and migrate the code into it
after cleaning the code up a bit. We also need to prepare a detailed doc
(how-to) for the plugin. This can be done in parallel, depending on the
urgency.

2. Yes, we are both contributing to v0.2 of the sparql-gremlin plugin. We
will both submit the ICLAs.

Yes, we will both continue to support the 0.2 plugin and extend it in the
future (aiming to cover the SPARQL 1.1 specification and to fix OPTIONAL
support in the current version).

Looking forward to hearing more on this from the devs :)

Cheers!

On 2017-12-08 13:41, Stephen Mallette  wrote: 
> I agree with Marko's thoughts, both on this topic of including
> sparql-gremlin as well as the wider topic of what should be included in
> TinkerPop code base more generally. Providing a path for rdf/sparql folks
> to get into the TinkerPop world seems like a smart direction.
> 
> Now, assuming that we have consensus to include sparql-gremlin in the
> TinkerPop code base, the process will look something like this:
> 
> 1. I think that Harsh should fork the TinkerPop repository and migrate
> sparql-gremlin into its structure. From there we will provide
> feedback/review to get that fork into best shape possible prior to his
> submitting a pull request. I think we can handle initial feedback through
> the dev list in a separate thread.
> 
> 2. In parallel to the above item, it appears as though there are two
> contributors on sparql-gremlin:
> 
> https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin/graphs/contributors
> 
> Both contributors, Harsh and Dharmen, should submit ICLAs:
> 
> http://apache.org/licenses/icla.pdf
> 
> and send them to secret...@apache.org.
> 
> 3. Once ICLAs are confirmed by secretary, Harsh can submit a pull request
> from his fork where it can undergo final review.
> 
> Does that sound sensible to everyone?
> 
> btw, Harsh, it sounds as though you intend to continue development on
> sparql-gremlin after it is part of the TinkerPop repository...does Dharmen
> intend to do the same?
> 
> On Fri, Dec 8, 2017 at 6:54 AM, Stephen Mallette 
> wrote:
> 
> > linking marko's reply from the user list:
> >
> > https://groups.google.com/d/msg/gremlin-users/zK9jj7bWvrQ/nE1VvhmeAAAJ
> >
> > On Thu, Dec 7, 2017 at 1:52 PM, hars...@gmail.com 
> > wrote:
> >
> >> Hello, dear Gremlin people!
> >>
> >> Apologies for raising this topic a bit late. I planned to start this
> >> thread quite earlier but wasn’t able to due to some reasons.
> >>
> >> === short ===
> >> I seek your guidance and also help for polishing and integrating the
> >> sparql-gremlin 0.2
> >> (https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin) plugin in
> >> the Apache TinkerPop code base, succeeding its predecessor developed by
> >> Daniel Kuppitz (https://github.com/dkuppitz/sparql-gremlin). The new
> >> plugin offers support for a wide range of SPARQL queries from the
> >> SPARQL 1.0 features.
> >>
> >>
> >> === long ===
> >>
> >> I am a Ph.D. student at the University of Bonn and work at the
> >> intersection of semantic web and graph databases. My thesis is focused on
> >> bridging the gap between these two domains by enabling support for SPARQL
> >> querying of Property Graph databases. Thus, working on the SPARQL-Gremlin
> >> interoperability was an obvious idea given the wide popularity of Gremlin
> >> amongst the Graph DB vendors.
> >>
> >> The sparql-gremlin 0.1 (link - https://github.com/dkuppitz/sparql-gremlin)
> >> plugin was developed by Daniel Kuppitz, which we have extended to support
> >> various features of the SPARQL 1.0 specification and have tested using
> >> various synthetic datasets (such as Northwind dataset and the Berlin SPARQL
> >> Benchmark [BSBM] dataset) and a wide range of SPARQL queries.
> >>
> >> The extended version of the plugin (sparql-gremlin 0.2, link -
> >> https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin) supports a
> >> variety of query modifiers (group-by, order-by, counts, etc) and complex
> >> query features such as union, aggregation, etc. It does not currently
> >> support SPARQL optional queries though. It needs a minor fix.
> >>
> >> I wish to integrate this updated version to the Apache TinkerPop codebase
> >> and wish to see it roll out as a functional plugin (like the old one,
> >> replacing it with the updated version) in the next version of TinkerPop
> >> (or even before, however it works out).
> >>
> >> ===
> >>
> >> I am not much aware of how to do it and what steps I need to follow, so I
> >> seek input 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-08 Thread Stephen Mallette
I agree with Marko's thoughts, both on this topic of including
sparql-gremlin as well as the wider topic of what should be included in
TinkerPop code base more generally. Providing a path for rdf/sparql folks
to get into the TinkerPop world seems like a smart direction.

Now, assuming that we have consensus to include sparql-gremlin in the
TinkerPop code base, the process will look something like this:

1. I think that Harsh should fork the TinkerPop repository and migrate
sparql-gremlin into its structure. From there we will provide
feedback/review to get that fork into the best shape possible prior to his
submitting a pull request. I think we can handle initial feedback through
the dev list in a separate thread.

2. In parallel to the above item, it appears as though there are two
contributors on sparql-gremlin:

https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin/graphs/contributors

Both contributors, Harsh and Dharmen, should submit ICLAs:

http://apache.org/licenses/icla.pdf

and send them to secret...@apache.org.

3. Once ICLAs are confirmed by secretary, Harsh can submit a pull request
from his fork where it can undergo final review.

Does that sound sensible to everyone?

btw, Harsh, it sounds as though you intend to continue development on
sparql-gremlin after it is part of the TinkerPop repository...does Dharmen
intend to do the same?

On Fri, Dec 8, 2017 at 6:54 AM, Stephen Mallette 
wrote:

> linking marko's reply from the user list:
>
> https://groups.google.com/d/msg/gremlin-users/zK9jj7bWvrQ/nE1VvhmeAAAJ
>
> On Thu, Dec 7, 2017 at 1:52 PM, hars...@gmail.com 
> wrote:
>
>> Hello, dear Gremlin people!
>>
>> Apologies for raising this topic a bit late. I planned to start this
>> thread quite earlier but wasn’t able to due to some reasons.
>>
>> === short ===
>> I seek your guidance and also help for polishing and integrating the
>> sparql-gremlin 0.2
>> (https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin) plugin in
>> the Apache TinkerPop code base, succeeding its predecessor developed by
>> Daniel Kuppitz (https://github.com/dkuppitz/sparql-gremlin). The new
>> plugin offers support for a wide range of SPARQL queries from the
>> SPARQL 1.0 features.
>>
>>
>> === long ===
>>
>> I am a Ph.D. student at the University of Bonn and work at the
>> intersection of semantic web and graph databases. My thesis is focused on
>> bridging the gap between these two domains by enabling support for SPARQL
>> querying of Property Graph databases. Thus, working on the SPARQL-Gremlin
>> interoperability was an obvious idea given the wide popularity of Gremlin
>> amongst the Graph DB vendors.
>>
>> The sparql-gremlin 0.1 (link - https://github.com/dkuppitz/sparql-gremlin)
>> plugin was developed by Daniel Kuppitz, which we have extended to support
>> various features of the SPARQL 1.0 specification and have tested using
>> various synthetic datasets (such as Northwind dataset and the Berlin SPARQL
>> Benchmark [BSBM] dataset) and a wide range of SPARQL queries.
>>
>> The extended version of the plugin (sparql-gremlin 0.2, link -
>> https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin) supports a
>> variety of query modifiers (group-by, order-by, counts, etc) and complex
>> query features such as union, aggregation, etc. It does not currently
>> support SPARQL optional queries though. It needs a minor fix.
>>
>> I wish to integrate this updated version to the Apache TinkerPop codebase
>> and wish to see it roll out as a functional plugin (like the old one,
>> replacing it with the updated version) in the next version of TinkerPop
>> (or even before, however it works out).
>>
>> ===
>>
>> I am not much aware of how to do it and what steps I need to follow, so I
>> seek input from you all and have started this thread (as suggested by
>> Stephen Mallette) and already discussed with Marko Rodriguez during Graph
>> Day SF 2017 and in other informal communications.
>>
>> Please guide me through the same and let me know what all I will need to
>> do and/or what you will need to get this done. I am happy to collaborate
>> and be a part of this awesome project :)
>>
>> Cheers,
>> Harsh
>>
>
>


Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-08 Thread Stephen Mallette
linking marko's reply from the user list:

https://groups.google.com/d/msg/gremlin-users/zK9jj7bWvrQ/nE1VvhmeAAAJ

On Thu, Dec 7, 2017 at 1:52 PM, hars...@gmail.com  wrote:

> Hello, dear Gremlin people!
>
> Apologies for raising this topic a bit late. I planned to start this
> thread quite earlier but wasn’t able to due to some reasons.
>
> === short ===
> I seek your guidance and also help for polishing and integrating the
> sparql-gremlin 0.2
> (https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin) plugin in
> the Apache TinkerPop code base, succeeding its predecessor developed by
> Daniel Kuppitz (https://github.com/dkuppitz/sparql-gremlin). The new
> plugin offers support for a wide range of SPARQL queries from the
> SPARQL 1.0 features.
>
>
> === long ===
>
> I am a Ph.D. student at the University of Bonn and work at the
> intersection of semantic web and graph databases. My thesis is focused on
> bridging the gap between these two domains by enabling support for SPARQL
> querying of Property Graph databases. Thus, working on the SPARQL-Gremlin
> interoperability was an obvious idea given the wide popularity of Gremlin
> amongst the Graph DB vendors.
>
> The sparql-gremlin 0.1 (link - https://github.com/dkuppitz/sparql-gremlin)
> plugin was developed by Daniel Kuppitz, which we have extended to support
> various features of the SPARQL 1.0 specification and have tested using
> various synthetic datasets (such as Northwind dataset and the Berlin SPARQL
> Benchmark [BSBM] dataset) and a wide range of SPARQL queries.
>
> The extended version of the plugin (sparql-gremlin 0.2, link -
> https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin) supports a
> variety of query modifiers (group-by, order-by, counts, etc) and complex
> query features such as union, aggregation, etc. It does not currently
> support SPARQL optional queries though. It needs a minor fix.
>
> I wish to integrate this updated version to the Apache TinkerPop codebase
> and wish to see it roll out as a functional plugin (like the old one,
> replacing it with the updated version) in the next version of TinkerPop
> (or even before, however it works out).
>
> ===
>
> I am not much aware of how to do it and what steps I need to follow, so I
> seek input from you all and have started this thread (as suggested by
> Stephen Mallette) and already discussed with Marko Rodriguez during Graph
> Day SF 2017 and in other informal communications.
>
> Please guide me through the same and let me know what all I will need to
> do and/or what you will need to get this done. I am happy to collaborate
> and be a part of this awesome project :)
>
> Cheers,
> Harsh
>


[Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-07 Thread hars...@gmail.com
Hello, dear Gremlin people!

Apologies for raising this topic a bit late. I had planned to start this
thread much earlier but was not able to.

=== short ===
I seek your guidance and help in polishing and integrating the
sparql-gremlin 0.2 plugin
(https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin) into the
Apache TinkerPop code base, succeeding its predecessor developed by Daniel
Kuppitz (https://github.com/dkuppitz/sparql-gremlin). The new plugin
supports a wide range of SPARQL queries built on SPARQL 1.0 features.


=== long ===

I am a Ph.D. student at the University of Bonn and work at the intersection of 
semantic web and graph databases. My thesis is focused on bridging the gap 
between these two domains by enabling support for SPARQL querying of Property 
Graph databases. Thus, working on the SPARQL-Gremlin interoperability was an 
obvious idea given the wide popularity of Gremlin amongst the Graph DB vendors. 

The sparql-gremlin 0.1 plugin (https://github.com/dkuppitz/sparql-gremlin)
was developed by Daniel Kuppitz. We have extended it to support various
features of the SPARQL 1.0 specification and have tested it using various
synthetic datasets (such as the Northwind dataset and the Berlin SPARQL
Benchmark [BSBM] dataset) and a wide range of SPARQL queries.

The extended version of the plugin (sparql-gremlin 0.2,
https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin) supports a variety
of query modifiers (group-by, order-by, counts, etc.) and complex query features
such as union and aggregation. It does not currently support SPARQL OPTIONAL
queries, though; that needs a minor fix.

I wish to integrate this updated version into the Apache TinkerPop codebase and
to see it roll out as a functional plugin (replacing the old one) in the next
version of TinkerPop (or even before, however it works out).

===

I am not very familiar with how to do this or what steps I need to follow, so I
am seeking input from you all. I started this thread at the suggestion of
Stephen Mallette, and I have already discussed the idea with Marko Rodriguez
during Graph Day SF 2017 and in other informal communications.

Please guide me through the process and let me know what I will need to do
and/or what you will need to get this done. I am happy to collaborate and be a
part of this awesome project :)

Cheers,
Harsh