Re: [GitHub] [tinkerpop] spmallette opened pull request #1074: TINKERPOP-2171 Allow sparql() to be extended with Gremlin steps.

2019-03-01 Thread Harsh Thakkar
I vote "+1" for this feature.

On 2019/02/28 14:34:01, spmallette (GitHub)  wrote: 
> https://issues.apache.org/jira/browse/TINKERPOP-2171
> 
> Allows `sparql()` step to be followed by Gremlin steps:
> 
> ```text
> gremlin> g.sparql("SELECT * WHERE { }").out("knows").values("name")
> ==>vadas
> ==>josh
> ```
> 
> 
> [ Full content available at: https://github.com/apache/tinkerpop/pull/1074 ]
> This message was relayed via gitbox.apache.org for dev@tinkerpop.apache.org
> 


Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-02-07 Thread Harsh Thakkar
No worries, I am on it :)


On 2018/02/07 15:04:34, Stephen Mallette  wrote: 
> Ok - sounds like we are basically on the same page then. I hate to
> volunteer you for work :) but I think you are the best person to write up
> the capabilities and limitations of sparql-gremlin. I think that if we have
> those documented we can more easily decide the appropriate level of
> testing, so imo, doing that documentation is the next step. I think you
> should just expand what I wrote on sparql-gremlin here:
> 
> https://github.com/apache/tinkerpop/blob/TINKERPOP-1878/docs/src/reference/transpilers.asciidoc
> 
> and provide a PR for that. Is that a good next step?
> 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-02-07 Thread Harsh Thakkar
Hi Stephen,

Having more than one variable inside a GROUP BY or an ORDER BY clause is a 
problem on its own, to be honest. To answer your question about the query:

```

SELECT ?age ?name (COUNT(?name) AS ?name_count)
WHERE {
?a e:created ?b .
?a v:name ?name .
?a v:age ?age .
}
GROUP BY ?age ?name
```

Ideally, SPARQL groups first by ?age and then, within each ?age value, groups 
further by ?name. However, to the best of my knowledge, no specific ordering is 
followed unless the user specifies one.

In the Gremlin translation or in Gremlin itself (I am not sure yet where the 
problem lies), the values are grouped first by ?name and then re-grouped (or 
re-arranged) by ?age. That is why only the grouping on the last variable is 
visible.
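
To make the difference concrete, here is a minimal Java sketch (not the transpiler's code) that contrasts grouping on the full (?age, ?name) key with grouping on the last variable only, which is the behaviour reported for GROUP BY ?age ?name. The class name, the method names, and the sample rows are assumptions made up for illustration; the rows are not taken from any TinkerPop toy graph.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupBySketch {
    // Hypothetical sample bindings: each row is {age, name}.
    static final List<String[]> ROWS = Arrays.asList(
            new String[]{"29", "marko"},
            new String[]{"32", "josh"},
            new String[]{"32", "josh"},
            new String[]{"27", "josh"});

    // GROUP BY ?age ?name: one group per distinct (age, name) pair.
    static Map<List<String>, Long> countByAgeAndName(List<String[]> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                r -> Arrays.asList(r[0], r[1]), LinkedHashMap::new, Collectors.counting()));
    }

    // The reported behaviour: only the last variable (?name) is used as the key.
    static Map<String, Long> countByLastVariableOnly(List<String[]> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                r -> r[1], LinkedHashMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countByAgeAndName(ROWS));       // {[29, marko]=1, [32, josh]=2, [27, josh]=1}
        System.out.println(countByLastVariableOnly(ROWS)); // {marko=1, josh=3}
    }
}
```

With these made-up rows, the two "josh" entries of different ages stay separate under the composite key but collapse into one group when only the last variable is used.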

Regarding errors and exception reporting: yes, the plugin documentation should 
state very clearly what is and is not feasible at the current stage of the 
plugin. For instance, SPARQL has functions that are very specific to SPARQL, 
such as the isIRI() filter, which checks whether a particular variable is bound 
to an IRI (a URI, basically a URL). I do not think this has a corresponding 
operator in Gremlin, clearly because Gremlin operates on Property Graphs and 
not RDF graphs, so there is no need for one.

We should indeed state in the readme (documentation) what is not supported, so 
that we do not have an outcry of third-party users complaining later that this 
doesn't work and that doesn't work. I also agree on having nicely handled 
exceptions. Nothing gives more pain than ugly crashing code. :)
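
As a rough illustration of "nicely handled exceptions", the following Java sketch rejects a query that uses a feature with no Gremlin counterpart and explains why. It is only a sketch under stated assumptions: the class name, the UNSUPPORTED list, and the naive string matching are hypothetical, and a real check would presumably inspect the parsed query rather than the raw text.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class UnsupportedFeatureCheck {
    // Assumed list, for illustration only; the actual boundaries are what
    // the documentation discussed in this thread would need to pin down.
    static final List<String> UNSUPPORTED = Arrays.asList("isIRI", "isURI");

    // Throws with a descriptive message if the query text calls an
    // unsupported function; returns normally otherwise.
    static void check(String sparql) {
        String lower = sparql.toLowerCase(Locale.ROOT);
        for (String feature : UNSUPPORTED) {
            // Naive match on "name(": misses "isIRI (" with a space, fine for a sketch.
            if (lower.contains(feature.toLowerCase(Locale.ROOT) + "(")) {
                throw new UnsupportedOperationException(
                        "Cannot transpile query: " + feature
                        + "() has no Gremlin equivalent on a property graph");
            }
        }
    }

    public static void main(String[] args) {
        check("SELECT ?x WHERE { ?a v:name ?x }"); // passes silently
        try {
            check("SELECT ?x WHERE { ?a v:name ?x FILTER(isIRI(?x)) }");
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing early with a message that names the offending feature avoids the situation where a query transpiles but silently does something other than what was intended.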


On 2018/02/06 21:58:33, Stephen Mallette  wrote: 
> Thanks for your review of my concern with the transpiling for GROUP.  So
> there are two aspects to your reply that I'd like to discuss. First, the
> specific issue with GROUP that I'm seeing is that it's simply choosing the
> last variable given in the GROUP
> 
> https://github.com/apache/tinkerpop/blob/74b568a8babb8b52b790767e7bb05f462dc5c5f0/sparql-gremlin/src/main/java/org/apache/tinkerpop/gremlin/sparql/SparqlToGremlinTranspiler.java#L139-L141
> 
> so when you do this (which i think is legitimate SPARQL - i'm still
> learning):
> 
> SELECT ?age ?name (COUNT(?name) AS ?name_count)
> WHERE {
>   ?a e:created ?b .
>   ?a v:name ?name .
>   ?a v:age ?age .
> }
> GROUP BY ?age ?name
> 
> you will get back a grouping on "name" and if you transpose the variables
> in that last line to:
> 
> GROUP BY ?name ?age
> 
> then you get a grouping on "age".  That doesn't seem right to me. Now, that
> would lead me to my second issue I'd want to bring up. Perhaps, supporting
> GROUP with multiple variables is something we don't support yet. Perhaps
> there's a long line of other SPARQL capabilities that we aren't quite ready
> to provide full transpiling for. Moreover, perhaps there are certain SPARQL
> statements that just aren't possible to translate to Gremlin at all. I
> think we need to do something smart in those cases. We don't want
> situations like the one I presented in GROUP where it transpiles to Gremlin
> but doesn't really accomplish what the intention of the SPARQL query was. I
> feel like we need to do several things with respect to this:
> 
> 1. If we can't transpile the SPARQL, we throw an
> UnsupportedOperationException with a nice error message that says why the
> user's SPARQL didn't transpile (i.e. what don't we support that they tried
> to pass through)
> 2. We document the boundaries of what we do support and what our
> limitations are.
> 
> Any thoughts on all that?
> 
> >   Dharmen and I did check your corrections and comments in the code. We
> > found them appropriate.
> 
> That's good to know. Thanks.
> 
> 
> 
> 
> On Tue, Feb 6, 2018 at 3:36 PM, Harsh Thakkar  wrote:
> 
> > Hi Stephen,
> >
> > Apologies for being quiet for some time. I have been down with severe flu
> > and just recovered. I looked into the order by issue and the reason for
> > having only an aggregation variable in the select clause is because of
> > SPARQL. SPARQL does not support projecting any other variable other than
> > the one which is being used in group by. One could write such a SPARQL
> > query, however, it would be incorrect and wouldn't be able to be parsed by
> > any SPARQL query processor.
> >
> > For instance,
> >
> >  select ?unitOnOrder
> >   where {
> >   ?a v:label "product" .
> >   ?a v:name ?name .

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-02-06 Thread Harsh Thakkar
> > > [snip] I did anything dumb.
> > > 2. Perhaps you look at the issue I think that I see with GROUP (which is
> > > basically identical to ORDER in that it only accepts the last field as a
> > > GROUPing...i don't think that's right).
> > > 3. Perhaps you could also think about writing some documentation that
> > > explains the support TinkerPop has for SPARQL - describe the aspects of
> > > SPARQL that we support and any limitations that we have in that support.
> > > 4. I will work on the plugin and get that working on early this coming
> > > week.
> > > 5. I will also keep thinking about testing - i still don't think that the
> > > approach I have is sufficient. If you have ideas about that, please let me
> > > know.
> > >
> > > How does that sound?
> > >
> > > btw, note that i had to do a bit of trickery to get the sparql-gremlin
> > > stuff to work in the console for that screenshot i posted on twitter.
> > > obviously, without the plugin things don't work too easily. i had to
> > > manually install all the dependencies to the console to get all that to
> > > work. again, that should be resolved early this coming week and then it can
> > > be easily imported to the console and server.
> > >
> > >
> > >
> > >
> > > On Thu, Jan 25, 2018 at 4:58 PM, Stephen Mallette 
> > > wrote:
> > >
> > > > Marko had a nice idea with:
> > > >
> > > > gremlin> sparql = graph.traversal(SPARQLTraversalStrategy.class).withRemote("127.0.0.2")
> > > > gremlin> sparql.query("SELECT ?x ?y WHERE {…}").toList()
> > > > ==>{?x:marko, ?y:29}
> > > > ==>{?x:josh, ?y:32}
> > > >
> > > > The problem i'm seeing is that it requires that the TraversalSource on the
> > > > server be a SparqlTraversalSource because when it gets to the server it
> > > > ends up trying to deserialize the bytecode into a GraphTraversalSource.
> > > > Now, that's exactly how a DSL would work, but a DSL would start with an
> > > > existing start step such as V() or E(), but not constant() which is what
> > > > SparqlTraversalSource is sending with the sparql query in it. I might be
> > > > not thinking of something right in how he expected to implement it, but I
> > > > came up with a reasonably simple workaround - I added an empty inject()
> > > > step before the constant() so that the GraphTraversalSource will be used.
> > > > Both of these steps will be wholly replaced by the transpiled traversal
> > > > when the SparqlStrategy executes and we thus get:
> > > >
> > > > gremlin> graph = EmptyGraph.instance()
> > > > ==>emptygraph[empty]
> > > > gremlin> cluster = Cluster.open()
> > > > ==>localhost/127.0.0.1:8182
> > > > gremlin> g = graph.traversal(SparqlTraversalSource.class).
> > > > ..1> withStrategies(SparqlStrategy.instance()).
> > > > ..2> withRemote(DriverRemoteConnection.using(cluster))
> > > > ==>sparqltraversalsource[emptygraph[empty], standard]
> > > > gremlin> g.sparql("SELECT ?name ?age WHERE { ?person v:name ?name . ?person v:age ?age }")
> > > > ==>[name:marko,age:29]
> > > > ==>[name:vadas,age:27]
> > > > ==>[name:josh,age:32]
> > > > ==>[name:peter,age:35]
> > > >
> > > > Treating sparql-gremlin as a DSL really seems like the best way to get
> > > > this all working - especially since it already is! :)  To get the same
> > > > pattern going with GLVs we would only need to make use of the DSL patterns
> > > > which already exist. Anyway, it's nice to have these basic premises nailed
> > > > down in code to ensure the ideas were sound. Please let me know if you have
> > > > any thoughts
> > > >
> > > >
> > > >
> > > > On Thu, Jan 25, 2018 at 2:37 PM, Stephen Mallette <spmalle...@gmail.com>
> > > > wrote:
> > > >
> > > >> Check this out:
> > > >>
> > > >> gremlin> graph = TinkerFactory.createModern()
> > > >> ==>tinkergraph[vertices:6 edges:6]
> 
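
The inject()/constant() workaround quoted above can be pictured with a toy model. This Java sketch is not TinkerPop code: a "traversal" here is just a list of step names, and the class, the method names, and the example transpiled steps are all hypothetical. It only shows the shape of the idea, namely that the two placeholder steps carry the SPARQL string and a strategy swaps both of them for the transpiled steps, leaving any steps that follow in place.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PlaceholderRewriteSketch {
    // Toy stand-in for what the traversal source sends: an empty inject()
    // followed by a constant() that carries the SPARQL text.
    static List<String> placeholder(String sparql) {
        return Arrays.asList("inject()", "constant(" + sparql + ")");
    }

    // Toy stand-in for the strategy: if the traversal starts with the
    // inject()/constant() pair, replace both steps with the transpiled
    // steps and keep whatever steps came after them.
    static List<String> applyStrategy(List<String> steps, List<String> transpiled) {
        boolean startsWithPlaceholder = steps.size() >= 2
                && steps.get(0).equals("inject()")
                && steps.get(1).startsWith("constant(");
        if (!startsWithPlaceholder) {
            return steps;
        }
        List<String> result = new ArrayList<>(transpiled);
        result.addAll(steps.subList(2, steps.size()));
        return result;
    }

    public static void main(String[] args) {
        List<String> steps = new ArrayList<>(placeholder("SELECT ..."));
        steps.add("out(knows)");
        // "V()" and "match(...)" are made-up examples of transpiled steps.
        System.out.println(applyStrategy(steps, Arrays.asList("V()", "match(...)")));
    }
}
```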

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2018-01-09 Thread Harsh Thakkar
Hi Stephen,

It does make sense to me. The work is going on slow but steady. Let's wait and 
see how other devs feel about this, as you said.

Cheers,
Harsh
On 2018-01-09 16:31, Stephen Mallette  wrote: 
> I've had some thoughts on this thread since December. Since sparql-gremlin
> has a pretty long to-do list and there is likely a lot of discussion
> required on this list prior to it being ready for merge to a release
> branch, it seems like we might treat this as a normal feature under
> development. I think we should just merge it to a development branch in the
> TinkerPop repository and then collaborate on it from there. We've taken
> similar approaches with other "long term" pull requests which has allowed
> the code to develop as it typically would. I'm thinking that's a
> better approach than a "big-bang" pull request.
> 
> Harsh, if that's ok with you, feel free to issue your PR against master and
> I'll get it set up against a development branch on our end (no rush, please
> give it a few days to see if everyone is ok with that approach).
> 
> On Mon, Dec 18, 2017 at 5:16 PM, Stephen Mallette 
> wrote:
> 
> > > Should I also remove the northwind file?
> >
> > I think I'd prefer to see all of our sparql examples use the existing toy
> > graphs - better not to add more options - so I'd remove it as well. If
> > anyone disagrees, I don't really feel too strongly about not including it,
> > but it would be good to hear some reasoning as to why the existing datasets
> > that we already package are insufficient for users to learn with.
> >
> > >  will need some help (quite possibly) with getting things right as far
> > as the DSL pattern for the gremlin language variants is concerned.
> >
> > We can help point you in the right direction when you get stuck or need to
> > clarify things. If you get really stuck, we can move to step 2 and have you
> > issue a PR sooner than later and we'll just merge what you have to a
> > development branch so others can collaborate with you on it more easily.
> > Let's see how things develop.
> >
> > > Also, since you are very well versed in the test suite, I would also
> > request some assistance for the same when we are there :) as it is our
> > first time pushing a work to the production level. So bear with us :)
> >
> > no worries. i will need to think on the testing approach. my thinking will
> > be focused on what i would call integration tests i.e. tests that evaluate
> > sparql-gremlin across the entire stack. I don't imagine that you need my
> > input to write some unit tests to validate the workings of your current
> > code though.
> >
> > > One question, though there is not a strict deadline, when is the 3.3.2
> > release planned?
> >
> > We have no timeline on 3.3.2 at this point (we are just in the process of
> > releasing 3.3.1 so it will be a while before we see 3.3.2). I think the
> > merging of gremlin-javascript will likely trigger that release, i would
> > guess no earlier than February 2018 if all goes right with that. I also
> > don't mean to make it sound like sparql-gremlin needs to be part of that
> > release, so if it's not ready then, it's not ready and it releases with
> > 3.3.3. You'll find that with TinkerPop, we tend to release when software is
> > "ready" and not by setting long range time deadlines for ourselves. So,
> > don't worry about when we release sparql-gremlin too much. Let's stay
> > focused on just getting the code right.
> >
> > Thanks for your understanding.
> >
> >
> >
> >
> > On Mon, Dec 18, 2017 at 5:01 PM, Harsh Thakkar  wrote:
> >
> >> Hello Stephen,
> >>
> >> Alright, I will remove the bsbm file from the repository and I refer to
> >> it in the docs (with some examples) sharing a link to download from the
> >> website if that is acceptable. No worries.
> >> Should I also remove the northwind file?
> >>
> >>
> >> Your expectations are reasonable, it was just that I wasn't very clear
> >> about what needs to be done. Now it is pretty much clear. It will take some
> >> time for me to wrap my head around the specifics of the tinkerpop codebase
> >> in order to satisfy the 3 requirements. I will need some help (quite
> >> possibly) with getting things right as far as the DSL pattern for the
> >> gremlin language variants is concerned. I am already reading the dev-docs
> >> on this, from here:
> >> http://t

Re: Subject: Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-20 Thread Harsh Thakkar
Hi Josh,

Apologies for late reply, I almost forgot this one.

I am not exactly familiar with the PropertyGraphSail and GraphSail 
implementations at the moment. Let me get back to you on this once I have a 
more concrete idea.

> How does your content-preserving RDF <--> PG interface compare with the
> GraphSail mapping, the PropertyGraphSail mapping(s), or with Hartig's
> "RDF*"

Thanks for pointing out Olaf Hartig's work in this area. I am currently 
collaborating with him on the proposed "RDF <-> PG" converter. Olaf laid a 
foundation with his paper on the perception of PGs from an RDF perspective. 
This is good; however, it has some challenges, such as not being able to handle 
RDF reification at this point. I am working on this with him, to extend his 
work and create a (pretty much) seamless bi-directional conversion mechanism.

I hope that answers your question to some extent. :)

Cheers!

On 2017-12-15 02:38, Joshua Shinavier  wrote: 
> Hi Harsh,
> 
> Thanks for the detailed reply. I can't say with confidence that the TP2
> suite could be re-implemented on top of Jena in that time frame (as I am a
> long-time Sesame fan without much Jena experience), although a TP2 --> TP3
> port could be done, keeping the Sesame (RDF4j) dependency.  GraphSail has
> already been ported, and just needs some docs and more tests. I wonder if
> it would be too crazy to support both, i.e. rdf4j-gremlin and jena-gremlin.
> 
> At any rate, it has been really good to see the recent upsurge of interest
> in RDF and SPARQL support in the graph DB space. At Data Day Seattle, I
> made the point that although Property Graphs came to prominence as a simple
> and lightweight alternative to the Semantic Web standards, SemWeb-like
> features -- such as schemas/ontologies and rules/reasoning -- keep finding
> their way in.
> 
> How does your content-preserving RDF <--> PG interface compare with the
> GraphSail mapping, the PropertyGraphSail mapping(s), or with Hartig's
> "RDF*" [1]?
> 
> 
> [1] https://arxiv.org/pdf/1409.3288.pdf
> 
> 
> On Thu, Dec 14, 2017 at 8:43 AM, Harsh Thakkar  wrote:
> 
> >
> > Hi Josh,
> >
> > I already wrote an elaborate reply to your comment. I think it went
> > somewhere but didn't show up :(
> >
> > I will summarize my reply here now..
> >
> > Yes, I am of the same opinion of having a continuous SPARQL implementation
> > on top of Gremlin. Also, I am working on a custom interface (as we speak)
> > in my current research, proposing an information-preserving RDF <-> PG
> > converter. This will allow interoperability between the semantic web and
> > graph database communities, letting each leverage the advantages of the
> > other, i.e. the former can traverse and the latter can have a more diverse
> > access portfolio to rich datasets.
> >
> > My Ph.D. thesis is more or less focused on this. It started from proposing
> > a robust open and extensible benchmarking platform "LITMUS" [], which
> > eventually led me to address all these issues and thus my keen interest :)
> >
> > If I am not getting it wrong, the other interfaces you mentioned, about
> > that, do you wish to see them eventually integrated into tinkerpop? or are
> > you implying that this should be already done before the next release?
> >
> > Thanks for your pointers!
> > Cheers!
> >
> > On 2017-12-13 16:46, Joshua Shinavier  wrote:
> > > Hi Harsh,
> > >
> > > Glad you are taking Daniel's work forward. In porting the code to the
> > > TinkerPop code base, might I suggest we allow for not only
> > SPARQL-Gremlin,
> > > but a whole suite of RDF tools as in TP2. Perhaps call the module
> > > rdf-gremlin. Then we could have all of:
> > >
> > > * SPARQL-Gremlin: executes standard SPARQL queries over a Property Graph
> > > database
> > > * GraphSail [1,2]: stores RDF quads in the database, explicitly, and
> > > enables SPARQL and triple pattern queries over the quads
> > > * PropertyGraphSail [3]: exposes a Property Graph with either of two
> > > mappings to the RDF data model
> > > * SailGraph [4]: takes an RDF triple store (not natively supporting
> > > Gremlin) and enables Gremlin queries
> > > * others? I have often thought that a continuous SPARQL implementation
> > > built on Gremlin would be powerful
> > >
> > > The biggest mismatch between the TP2 suite and what might be built for
> > > Apache TinkerPop is that the previous suite was implemented using
> > (Eclipse)
> > > RDF4j, whereas things seem to be leaning towards (Apache) Jena now.
> > > However, the same principles could be applied.
> > >
> > > Josh
> > >
> > >
> > > [1] https://github.com/tinkerpop/blueprints/wiki/Sail-Ouplementation
> > > [2] https://github.com/joshsh/graphsail
> > > [3] https://github.com/tinkerpop/blueprints/wiki/PropertyGraphSail-Ouplementation
> > > [4] https://github.com/tinkerpop/blueprints/wiki/Sail-Implementation
> > >
> > [snip]
> 


Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-18 Thread Harsh Thakkar
Hello Marko,

I made a mistake earlier when I said that the sparql-gremlin compiler returns a 
string; it does not. It returns a graph traversal, apologies!

Regarding (7), I agree, it makes sense. I will wrap my head around how to get 
that done. I am already reading the dev-docs on this, from here:
http://tinkerpop.apache.org/docs/current/reference/#dsl 
as mentioned in the reply to Stephen.

Regarding (3), I was just not sure whether or not to include these tests, so I 
left them out. This makes it clear. I will write the test cases, taking some 
help from Stephen on the specifics of the test suite. However, these test cases 
will have to be written within the scope of SPARQL. We cannot test a query 
that cannot be written in SPARQL :) I guess you were implying the same.

Let me get this done and get back to you. This will take some time. No worries!

Cheers,
Harsh
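
One way to picture "test cases written within the scope of SPARQL" is a lookup from the name of a Gremlin process test to its SPARQL variant, where a missing entry means no SPARQL variant exists for that test. The Java sketch below is illustrative only: the class name, the toURI() mapping (the thread does not say how a vertex id becomes a SPARQL term), and the single registered entry are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

class SparqlTestVariants {
    // Hypothetical helper: maps a vertex id to a SPARQL term. The real
    // mapping is not specified in the thread.
    static String toURI(Object vertexId) {
        return "<urn:vertex:" + vertexId + ">";
    }

    // SPARQL variants keyed by the name of the Gremlin test they mirror.
    static Map<String, String> variantsFor(Object v1Id) {
        Map<String, String> variants = new LinkedHashMap<>();
        variants.put("g_VX1X_out", "SELECT ?x WHERE { " + toURI(v1Id) + " ?a ?x }");
        return variants;
    }

    // Empty result: no SPARQL variant has been written for that test.
    static Optional<String> sparqlFor(String gremlinTestName, Object v1Id) {
        return Optional.ofNullable(variantsFor(v1Id).get(gremlinTestName));
    }

    public static void main(String[] args) {
        System.out.println(sparqlFor("g_VX1X_out", 1));
        System.out.println(sparqlFor("some_test_without_a_variant", 1));
    }
}
```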

On 2017-12-18 18:26, Marko Rodriguez  wrote: 
> Actually, my (3) is bad. Given that query() would always return a 
> Traversal>, it would be necessary to have that linearized to 
> Traversal for the test suite to validate it. That would mean 
> making SPARQLTraversal support extended Traversal methods like flatMap(), 
> blah, blah… That seems excessive, though convenient.
> 
> Hm…… Thoughts?,
> Marko.
>  
> http://markorodriguez.com
> 
> 
> 
> > On Dec 18, 2017, at 9:21 AM, Marko Rodriguez  wrote:
> > 
> > Hello,
> > 
> > A couple of items worth considering.
> > 
> > Regarding (7), that should be done prior to master/ merge. It is necessary 
> > to follow the patterns that are established in TinkerPop regarding language 
> > interoperability. The DSL pattern developed for Gremlin language variants 
> > seems to be the best pattern for distinct languages as well. In essence, if 
> > your language is not a fluent language, and instead, uses a String, then it 
> > should be wrapped as such in a fluent interface using all the Strategy, 
> > Step, and Traversal methods that make sense so it works within the larger 
> > infrastructure of TinkerPop (e.g. testing! — see below). What I proposed 
> > in my previous email seems the easiest and cleanest way to do things.
> > 
> > Regarding (3), testing is crucial. Given that this would be TinkerPop’s 
> > first distinct language, we don’t have a pattern set forth for testing. 
> > However, this doesn’t mean we can’t improvise on our current model. Off 
> > the top of my head, perhaps the best way would be to follow the 
> > ProcessTestSuite and do the SPARQL variants of those. For instance:
> > 
> > 
> > https://github.com/apache/tinkerpop/blob/master/gremlin-test/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/map/VertexTest.java#L62
> > 
> > The SPARQL test version would be:
> > 
> > @Override
> > public Traversal get_g_VX1X_out(final Object v1Id) {
> >   return sparql.query("SELECT ?x WHERE { " + toURI(v1Id) + " ?a ?x }");
> > }
> > 
> > In this way, sparql is your SPARQLTraversalSource for each test and query() 
> > will return a Traversal typed accordingly (query() will have to have solid 
> > generic support). From there, you would implement each and every test that 
> > is semantically possible with SPARQL (SPARQL won’t be able to 
> > semantically cover all Gremlin tests).
> > 
> > Stephen has done a lot of recent work to generalize our test suite out of 
> > Java so it is in a language agnostic form. I haven’t been following that 
> > work so I’m not sure what I am saying above is exactly as it should 
> > be done, but it is a start.
> > 
> > HTH,
> > Marko.
> > 
> > http://markorodriguez.com
> > 
> > 
> > 
> >> On Dec 18, 2017, at 7:43 AM, Harsh Thakkar <hars...@gmail.com> wrote:
> >> 
> >> Hi Stephen and All,
> >> 
> >> Thanks for going through the code. I address your questions below (in the 
> >> same order):
> >> 
> >> 1. Yes, this file can be removed. It was just to test the traversal 
> >> method. 
> >> 
> >> 2. Yes, I have commented out the block of tests at this moment since we do 
> >> not need to run tests at mvn clean install time. However, I kept it (in 
> >> commented out form) if there arose a need in future for the same. It can 
> >> surely be removed if you think, it won't be necessary.
> >> 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-18 Thread Harsh Thakkar
Hi Stephen and All,

Thanks for going through the code. I address your questions below (in the same 
order):

1. Yes, this file can be removed. It was just to test the traversal method. 

2. Yes, I have commented out the block of tests at this moment since we do not 
need to run tests at mvn clean install time. However, I kept it (in commented 
out form) in case a need for it arises in the future. It can surely be 
removed if you think it won't be necessary.

3. There were two test units (we continued them from Daniel's version): one 
checks whether the prefixes are being encoded correctly, and the second 
tests whether the generated traversal is correct (in short, that the compiler 
is functioning as it should). Since we extended the previous work to support a 
variety of SPARQL operators, more test cases can be added to validate that each 
of these is functioning as expected. However, as I mentioned in point #2, we 
need not do it explicitly as we (Dharmen and I) have already tested them on 3-4 
different datasets and query-sets. Since we did not know whether the tests were 
going to be formally required in the future, we left them as they are, just 
commented out.

4. These resources are the graphml files that we wish to provide the users, for 
(i) loading and querying famous datasets - the Berlin SPARQL Benchmark (BSBM)  
(famous in the Semantic Web-RDF community) so that they do not have to look 
elsewhere for the same. (ii) Also, it provides a strong use-case for 
demonstrating the applicability of sparql-gremlin (creates trust in the SW 
community users) and (iii) to keep the plug-in pretty much self-dependent.

5 & 6. YES, damn it. The IDE did this. I will revert these changes. It's like 
when you are not looking, the IDE does things on its own :-/ apologies!

7. Regarding Marko's thoughts -- yes, I was waiting for you to reply to the 
thread. I do have some thoughts on this. But first, I was wondering if this 
(what Marko suggested) is supposed to be entirely implemented in the current 
version of sparql-gremlin 0.2, i.e. including the withStrategies(), 
withoutStrategies() and withRemote() features, or whether it is to be supported 
eventually (after the sparql-gremlin 0.2.0 plugin is rolled out). Also, I am 
not entirely sure I got what Marko was exactly suggesting. I bring this to 
light in the in-line style reply to Marko's comment later here.

The current implementation is more of a typical compiler. Users can, however, 
use it by specifying the query and the dataset against which it is to 
be executed via the command (once in the gremlin shell):

gremlin> graph = TinkerGraph.open(..) 
gremlin> SparqlToGremlinCompiler.convertToGremlinTraversal(graph, "SELECT ?a 
WHERE {} ") 
==>{?x:marko, ?y:29}
==>{?x:josh, ?y:32}

 
 i.e. load a graph using pre-defined tinkerpop methods ( 
graph.io(IoCore.gryo()).readGraph(graphName), TinkerGraph.open(), etc ) , then 
execute the traversal as above with arguments -- (graph, queryString), where 
queryString = "SPARQL query".

Now Let me quote Marko's comment and reply in-line to bring more clarity:

1. There should be a SPARQLTraversalSource which supports one spawn method — 
query(String).
This is already happening inside the code. Therefore, we do not need to 
mention it explicitly. Please correct me if I got it wrong here.
 
2. SPARQLTraversal is spawned and it supports only the Traversal methods 
— next(), toList(), iterate(), etc.
All traversal methods that are supported, available to a regular 
gremlin traversal, can be used by the sparql-gremlin compiler generated 
traversal as well.  

3. query(String) adds a ConstantStep(String).
 This is happening internally (as shown in the example above); we 
can also make it explicit, i.e. let the user provide only the queryString 
instead of the whole "SparqlToGremlinCompiler.convertToGremlinTraversal(graph, 
"SELECT ?a WHERE {} ")" command. Does this make sense, or am I missing 
something here?


4. SPARQLTraversalSource has a registered SPARQLStrategy.
At this moment, we leave it to the default setting for this strategy 
selection.

5. SPARQLTraversalSource should also support withStrategies(), 
withoutStrategies(), withRemote(), etc.
Once the traversal is generated, it can support all strategies like any 
other gremlin traversal. Does this make sense to you?
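
To make points 1, 3, 4 and 5 concrete, here is a rough sketch of the flow in 
plain Java. It has no TinkerPop dependency and is not the sparql-gremlin code: 
the class names, withStrategy() and the string-based "steps" are all 
hypothetical stand-ins, and the strategy is applied eagerly inside query() only 
to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Toy model only: none of these types are TinkerPop classes.
final class SparqlSourceSketch {

    private static final String CONSTANT_PREFIX = "constant:";

    /** Toy traversal: just an ordered list of step descriptions. */
    static final class Traversal {
        final List<String> steps = new ArrayList<>();
    }

    /** Hypothetical source with a single spawn method, query(String). */
    static final class SparqlTraversalSource {
        private final List<UnaryOperator<Traversal>> strategies = new ArrayList<>();

        SparqlTraversalSource() {
            // Point 4: the SPARQL strategy is registered by default.
            strategies.add(SparqlSourceSketch::sparqlStrategy);
        }

        /** Point 5: returns a copy of this source with one more strategy. */
        SparqlTraversalSource withStrategy(UnaryOperator<Traversal> strategy) {
            SparqlTraversalSource copy = new SparqlTraversalSource();
            copy.strategies.clear();
            copy.strategies.addAll(this.strategies);
            copy.strategies.add(strategy);
            return copy;
        }

        /** Points 1 and 3: spawn a traversal holding the query as a constant step. */
        Traversal query(String sparql) {
            Traversal t = new Traversal();
            t.steps.add(CONSTANT_PREFIX + sparql);
            // Applied eagerly here for brevity (an assumption of this sketch).
            for (UnaryOperator<Traversal> s : strategies) {
                t = s.apply(t);
            }
            return t;
        }
    }

    /** Replaces the constant step with a placeholder for the compiled match(). */
    static Traversal sparqlStrategy(Traversal in) {
        Traversal out = new Traversal();
        for (String step : in.steps) {
            if (step.startsWith(CONSTANT_PREFIX)) {
                String query = step.substring(CONSTANT_PREFIX.length());
                out.steps.add("match(<compiled from: " + query + ">)");
            } else {
                out.steps.add(step);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        SparqlTraversalSource sparql = new SparqlTraversalSource();
        System.out.println(sparql.query("SELECT ?x WHERE { }").steps);
    }
}
```

With this shape the user only ever passes the queryString; the compile step 
stays hidden behind the registered strategy.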

In a nutshell, 
What is happening is that we are converting the SPARQL queryString into a 
gremlin traversal and leaving it up to the tinkerpop compiler to choose what is 
best for it. 
We only map a SPARQL query to its corresponding pattern matching gremlin 
traversal (i.e. using the .match() step). Since the expressivity of 
SPARQL is less than that of Gremlin (i.e. SPARQL 1.0 doesn't support 
looping and traversing operations), we can only map what is in the 
scope of the SPARQL language to Gremlin. Once the traversal is generated, it is 
left to the tinkerpo
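
The SPARQL-to-match() mapping summarized above can be illustrated with a small 
sketch. It is plain Java that only builds the text of a Gremlin traversal; it 
is not the actual compiler. It assumes the prefix convention of the example 
queries in this thread (e: for an edge label, v: for a property value), and the 
class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: produces Gremlin text, does not execute anything.
final class TriplePatternSketch {

    /** Turns one "?s prefix:name ?o" triple pattern into a match() clause. */
    static String toMatchClause(String subject, String predicate, String object) {
        String s = subject.substring(1); // drop the leading '?'
        String o = object.substring(1);
        int colon = predicate.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("predicate needs a prefix: " + predicate);
        }
        String prefix = predicate.substring(0, colon);
        String name = predicate.substring(colon + 1);
        switch (prefix) {
            case "e": // assumed: edge label, traversed with out()
                return "__.as(\"" + s + "\").out(\"" + name + "\").as(\"" + o + "\")";
            case "v": // assumed: property value, read with values()
                return "__.as(\"" + s + "\").values(\"" + name + "\").as(\"" + o + "\")";
            default:
                throw new IllegalArgumentException("unsupported prefix: " + prefix);
        }
    }

    /** Joins the clauses of a basic graph pattern into one match() traversal. */
    static String toMatch(List<String[]> patterns) {
        List<String> clauses = new ArrayList<>();
        for (String[] p : patterns) {
            clauses.add(toMatchClause(p[0], p[1], p[2]));
        }
        return "g.V().match(" + String.join(", ", clauses) + ")";
    }

    public static void main(String[] args) {
        List<String[]> patterns = new ArrayList<>();
        patterns.add(new String[] {"?a", "e:created", "?b"});
        patterns.add(new String[] {"?a", "v:name", "?name"});
        System.out.println(toMatch(patterns));
    }
}
```

Each triple pattern becomes one match() clause and the shared variable names 
(here ?a) are what tie the clauses together.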

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-14 Thread Harsh Thakkar

Hi Josh,

I already wrote an elaborate reply to your comment. I think it went somewhere 
but didn't show up :(

I will summarize my reply here now..

Yes, I am of the same opinion about having a continuous SPARQL implementation 
on top of Gremlin. Also, as we speak, I am working in my current research on a 
custom interface: an information-preserving RDF <-> PG converter. 
This will allow interoperability between the semantic web and graph database 
communities so that each can leverage the advantages of the other, i.e. the 
former can traverse and the latter can have a more diverse access portfolio to 
rich datasets.

My Ph.D. thesis is more or less focused on this. It started from proposing a 
robust open and extensible benchmarking platform "LITMUS" [], which eventually 
led me to address all these issues and thus my keen interest :)

If I understand correctly, regarding the other interfaces you mentioned: 
do you wish to see them eventually integrated into tinkerpop, or are you 
implying that this should already be done before the next release?

Thanks for your pointers!
Cheers!

On 2017-12-13 16:46, Joshua Shinavier  wrote: 
> Hi Harsh,
> 
> Glad you are taking Daniel's work forward. In porting the code to the
> TinkerPop code base, might I suggest we allow for not only SPARQL-Gremlin,
> but a whole suite of RDF tools as in TP2. Perhaps call the module
> rdf-gremlin. Then we could have all of:
> 
> * SPARQL-Gremlin: executes standard SPARQL queries over a Property Graph
> database
> * GraphSail [1,2]: stores RDF quads in the database, explicitly, and
> enables SPARQL and triple pattern queries over the quads
> * PropertyGraphSail [3]: exposes a Property Graph with one of two mappings to
> the RDF data model
> * SailGraph [4]: takes an RDF triple store (not natively supporting
> Gremlin) and enables Gremlin queries
> * others? I have often thought that a continuous SPARQL implementation
> built on Gremlin would be powerful
> 
> The biggest mismatch between the TP2 suite and what might be built for
> Apache TinkerPop is that the previous suite was implemented using (Eclipse)
> RDF4j, whereas things seem to be leaning towards (Apache) Jena now.
> However, the same principles could be applied.
> 
> Josh
> 
> 
> [1] https://github.com/tinkerpop/blueprints/wiki/Sail-Ouplementation
> [2] https://github.com/joshsh/graphsail
> [3]
> https://github.com/tinkerpop/blueprints/wiki/PropertyGraphSail-Ouplementation
> [4] https://github.com/tinkerpop/blueprints/wiki/Sail-Implementation
> 
> 
> 
> 
> On Wed, Dec 13, 2017 at 4:03 AM, Stephen Mallette 
> wrote:
> 
> > I suggest that you read through the dev docs a bit. There's lots of little
> > odds/ends there about how to develop on the TinkerPop code base. For
> > example, for intellij issues, please have a look at this:
> >
> > http://tinkerpop.apache.org/docs/current/dev/developer/#_ide_setup_with_intellij
> >
> > > - Also, when I did a mvn clean install on 3.3.1-SNAPSHOT, it did get
> > build success but a majority of the test cases failed. Not sure if this is
> > worth mentioning.
> >
> > there should be no failures on master and 3.3.1-SNAPSHOT. hard to say what
> > is wrong without some error logs
> >
> > > Also, is it okay if I use the 3.3.0 api version for my module or it is
> > absolutely necessary that we have to only use the 3.3.1-SNAPSHOT api
> > version?
> >
> > you should be on 3.3.1-SNAPSHOT for your API version now that you've
> > integrated sparql-gremlin into the tinkerpop code base. ultimately, it will
> > all release together as part of a single package, so it should all be on
> > the same version.
> >
> > On Wed, Dec 13, 2017 at 6:43 AM, Harsh Thakkar  wrote:
> >
> > > Hi Stephen,
> > >
> > > I cleaned up the code a bit and then, I tried testing the code merge
> > > yesterday and I ran into some issues for 3.3.1-SNAPSHOT version.
> > >
> > > - I forked apache/tinkerpop repository to my local account and loaded the
> > > same using an IDE (as maven project). This immediately threw errors in
> > the
> > > native repositories such as gremlin-core, stating that it is not able to
> > > find org.apache.tinkerpop.shaded.kryo.Kryo.
> > > - When I try building the code with 3.3.0 api it works perfectly without
> > > any error, however for 3.3.1-SNAPSHOT version it is not able to find
> > > various files and throws errors in the core modules of tinkerpop. Thus, I
> > > cannot test my module (sparql-gremlin) with the 3.3.1-SNAPSHOT version.
> > > - Also, when I did a mvn clean install on 3.3.1-SNAPSHOT, it did get
>

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-14 Thread Harsh Thakkar
Hi Stephen,

Thanks for the pointers. We have good news. It turns out that the error was 
mostly because of the IDE environment and some other shady stuff going wrong. 
We finally managed to merge the sparql-gremlin work into the tinkerpop code 
base. The merged fork of the tinkerpop repository can be found here - 
https://github.com/harsh9t/tinkerpop 

Please have a look at it and let us know what is to be done next. 

Meanwhile, we are having a look at the documentation on how to generate the 
docs and other specifics. If we have questions, we will get back to you. I 
haven't done this before so expect some :D 

Cheers!


On 2017-12-13 13:03, Stephen Mallette  wrote: 
> I suggest that you read through the dev docs a bit. There's lots of little
> odds/ends there about how to develop on the TinkerPop code base. For
> example, for intellij issues, please have a look at this:
> 
> http://tinkerpop.apache.org/docs/current/dev/developer/#_ide_setup_with_intellij
> 
> > - Also, when I did a mvn clean install on 3.3.1-SNAPSHOT, it did get
> build success but a majority of the test cases failed. Not sure if this is
> worth mentioning.
> 
> there should be no failures on master and 3.3.1-SNAPSHOT. hard to say what
> is wrong without some error logs
> 
> > Also, is it okay if I use the 3.3.0 api version for my module or it is
> absolutely necessary that we have to only use the 3.3.1-SNAPSHOT api
> version?
> 
> you should be on 3.3.1-SNAPSHOT for your API version now that you've
> integrated sparql-gremlin into the tinkerpop code base. ultimately, it will
> all release together as part of a single package, so it should all be on
> the same version.
> 
> On Wed, Dec 13, 2017 at 6:43 AM, Harsh Thakkar  wrote:
> 
> > Hi Stephen,
> >
> > I cleaned up the code a bit and then, I tried testing the code merge
> > yesterday and I ran into some issues for 3.3.1-SNAPSHOT version.
> >
> > - I forked apache/tinkerpop repository to my local account and loaded the
> > same using an IDE (as maven project). This immediately threw errors in the
> > native repositories such as gremlin-core, stating that it is not able to
> > find org.apache.tinkerpop.shaded.kryo.Kryo.
> > - When I try building the code with 3.3.0 api it works perfectly without
> > any error, however for 3.3.1-SNAPSHOT version it is not able to find
> > various files and throws errors in the core modules of tinkerpop. Thus, I
> > cannot test my module (sparql-gremlin) with the 3.3.1-SNAPSHOT version.
> > - Also, when I did a mvn clean install on 3.3.1-SNAPSHOT, it did get build
> > success but a majority of the test cases failed. Not sure if this is worth
> > mentioning.
> >
> > What do you suggest? How do I fix this?
> > Also, is it okay if I use the 3.3.0 api version for my module or it is
> > absolutely necessary that we have to only use the 3.3.1-SNAPSHOT api
> > version?
> >
> > Thanks in advance!
> >
> > On 2017-12-12 12:58, Stephen Mallette  wrote:
> > > yes - please post questions here. i don't think you need to know much
> > about
> > > TinkerPop internal structure. I'd think that sparql-gremlin is expected
> > to
> > > be included in the root of the TinkerPop source as a sub-module to the
> > > top-level pom. That just means some minor changes to your pom.xml to get
> > it
> > > to build along with everything else. See other projects for examples:
> > >
> > > https://github.com/apache/tinkerpop/blob/f5687ee4497bfbaef4ae89233e4c29f07001ed2c/gremlin-core/pom.xml#L20-L24
> > >
> > > You can drop all of this because it is already defined in the root
> > pom.xml:
> > >
> > > https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin/blob/master/pom.xml#L30-L70
> > >
> > > Looking at the rest of your pom.xml now, I'm not sure I understand
> > > everything your  section is doing and if it's all necessary: The
> > > root pom.xml should handle the most common build/deploy options and they
> > > will be thus inherited to your sub-module pom which is why, for example,
> > > the gremlin-core pom is pretty simple for the  section:
> > >
> > > https://github.com/apache/tinkerpop/blob/f5687ee4497bfbaef4ae89233e4c29f07001ed2c/gremlin-core/pom.xml#L119-L151
> > >
> > > If there's anything you're sure can be removed from the sparql-gremlin
> > > pom.xml  section based on how the TinkerPop root pom.xml is set up,
> > > then please feel free to clean up as much as possible there.
> > >
> > > As for the general project structure of sparql-gremlin, I

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-13 Thread Harsh Thakkar
Hi Stephen,

I cleaned up the code a bit and then, I tried testing the code merge yesterday 
and I ran into some issues for 3.3.1-SNAPSHOT version.

- I forked apache/tinkerpop repository to my local account and loaded the same 
using an IDE (as maven project). This immediately threw errors in the native 
repositories such as gremlin-core, stating that it is not able to find 
org.apache.tinkerpop.shaded.kryo.Kryo. 
- When I try building the code with 3.3.0 api it works perfectly without any 
error, however for 3.3.1-SNAPSHOT version it is not able to find various files 
and throws errors in the core modules of tinkerpop. Thus, I cannot test my 
module (sparql-gremlin) with the 3.3.1-SNAPSHOT version. 
- Also, when I did a mvn clean install on 3.3.1-SNAPSHOT, it did get build 
success but a majority of the test cases failed. Not sure if this is worth 
mentioning.

What do you suggest? How do I fix this? 
Also, is it okay if I use the 3.3.0 api version for my module, or is it 
absolutely necessary that we only use the 3.3.1-SNAPSHOT api version? 

Thanks in advance!

On 2017-12-12 12:58, Stephen Mallette  wrote: 
> yes - please post questions here. i don't think you need to know much about
> TinkerPop internal structure. I'd think that sparql-gremlin is expected to
> be included in the root of the TinkerPop source as a sub-module to the
> top-level pom. That just means some minor changes to your pom.xml to get it
> to build along with everything else. See other projects for examples:
> 
> https://github.com/apache/tinkerpop/blob/f5687ee4497bfbaef4ae89233e4c29f07001ed2c/gremlin-core/pom.xml#L20-L24
> 
> You can drop all of this because it is already defined in the root pom.xml:
> 
> https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin/blob/master/pom.xml#L30-L70
> 
> Looking at the rest of your pom.xml now, I'm not sure I understand
> everything your  section is doing and if it's all necessary: The
> root pom.xml should handle the most common build/deploy options and they
> will be thus inherited to your sub-module pom which is why, for example,
> the gremlin-core pom is pretty simple for the  section:
> 
> https://github.com/apache/tinkerpop/blob/f5687ee4497bfbaef4ae89233e4c29f07001ed2c/gremlin-core/pom.xml#L119-L151
> 
> If there's anything you're sure can be removed from the sparql-gremlin
> pom.xml  section based on how the TinkerPop root pom.xml is set up,
> then please feel free to clean up as much as possible there.
> 
> As for the general project structure of sparql-gremlin, I don't fully
> understand how it is arranged. There's
> 
> /Queries
> /doc
> /docs/images
> /output
> /src
> 
> and all of that is repeated inside of the /bin directory. something seems
> amiss there. maybe once that's cleared up a bit I can think more clearly on
> what additional changes you might need.
> 
> Another important thing to consider: documentation. Right now, it's all
> in the README. I think we will want a new section to the Reference
> Documentation, probably appearing after Gremlin Variants:
> 
> http://tinkerpop.apache.org/docs/current/reference/#gremlin-variants
> 
> Perhaps that could be named "Query Languages" where sparql-gremlin would be
> the first sub-section. That would set up for some future where we also had
> sql-gremlin.  And perhaps a section for cypher-gremlin which could point to
> Neo4j's work in this area.  You can find reference docs for TinkerPop here:
> 
> https://github.com/apache/tinkerpop/tree/f5687ee4497bfbaef4ae89233e4c29f07001ed2c/docs/src/reference
> 
> The easiest way to generate docs is with docker via:
> 
> docker/build.sh -d
> 
> without that you need hadoop running with appropriate configurations:
> 
> http://tinkerpop.apache.org/docs/current/dev/developer/#documentation-environment
> 
> Well, hope that gives you a few things to work on and think about for your
> first round of changes in your fork. Looking forward to seeing how this PR
> shapes up!
> 
> 
> On Tue, Dec 12, 2017 at 3:40 AM, Harsh Thakkar  wrote:
> 
> > Hi Stephen,
> >
> > Very well then, we will start the migration from today. Also we will
> > submit the signed ICLAs today.
> >
> > If we have some questions regarding building the code properly, can we
> > feel free to ask them here? I assume we might need some guidance on how to
> > get things plugged in correctly. Neither of us is very familiar with the
> > internal structure of TinkerPop, so that is why.
> >
> > Also, if there is any specific documentation to help us with this, please
> > lend a pointer.
> >
> > Many thanks!
> >
> > On 2017-12-11 21:33, Stephen Mallette  wrote:
> > > As 

Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

2017-12-12 Thread Harsh Thakkar
Hi Stephen,

Very well then, we will start the migration from today. Also we will submit the 
signed ICLAs today. 

If we have some questions regarding building the code properly, can we feel 
free to ask them here? I assume we might need some guidance on how to get 
things plugged in correctly. Neither of us is very familiar with the internal 
structure of TinkerPop, so that is why. 

Also, if there is any specific documentation to help us with this, please lend 
a pointer.

Many thanks!

On 2017-12-11 21:33, Stephen Mallette  wrote: 
> As there haven't been any other opinions, it seems we have a lazy consensus
> to accept sparql-gremlin into TinkerPop's code base. Cool!
> 
> Harsh, I think you and Dharmen should proceed with the steps I listed
> above. Once you have the code integrated and building properly in your
> fork, please reply back and point us to it and we can start with some
> coarse grained review of what you have.
> 
> Thanks,
> 
> Stephen
> 
> On Fri, Dec 8, 2017 at 8:44 AM, hars...@gmail.com  wrote:
> 
> > Hi Stephen,
> >
> > Thanks for the insight on the process of this integration. I will reply to
> > your comments in the same manner.
> >
> > 1. Yes, I will do the fork and migrate the code to the Tinkerpop
> > repository, after cleaning the code a bit. We also need to prepare a
> > detailed doc (how-to) for the plugin. This can also be done in parallel,
> > depending upon the urgency.
> >
> > 2. Yes, we both are contributing to the v0.2 of the sparql-gremlin plugin.
> > We will both submit the ICLAs.
> >
> > Yes, we (both) will continue to provide support for the 0.2 plugin and
> > also extend it in the future (trying to cover SPARQL 1.1 specification,
> > also fix the OPTIONAL fix in the current version).
> >
> > Looking forward to hearing more on this from the devs :)
> >
> > Cheers!
> >
> > On 2017-12-08 13:41, Stephen Mallette  wrote:
> > > I agree with Marko's thoughts, both on this topic of including
> > > sparql-gremlin as well as the wider topic of what should be included in
> > > TinkerPop code base more generally. Providing a path for rdf/sparql folks
> > > to get into the TinkerPop world seems like a smart direction.
> > >
> > > Now, assuming that we have consensus to include sparql-gremlin in the
> > > TinkerPop code base, the process will look something like this:
> > >
> > > 1. I think that Harsh should fork the TinkerPop repository and migrate
> > > sparql-gremlin into its structure. From there we will provide
> > > feedback/review to get that fork into best shape possible prior to his
> > > submitting a pull request. I think we can handle initial feedback through
> > > the dev list in a separate thread.
> > >
> > > 2. In parallel to the above item, it appears as though there are two
> > > contributors on sparql-gremlin:
> > >
> > > https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin/graphs/contributors
> > >
> > > Both contributors, Harsh and Dharmen, should submit ICLAs:
> > >
> > > http://apache.org/licenses/icla.pdf
> > >
> > > and send them to secret...@apache.org.
> > >
> > > 3. Once ICLAs are confirmed by secretary, Harsh can submit a pull request
> > > from his fork where it can undergo final review.
> > >
> > > Does that sound sensible to everyone?
> > >
> > > btw, Harsh, it sounds as though you intend to continue development on
> > > sparql-gremlin after it is part of the TinkerPop repository...does
> > Dharmen
> > > intend to do the same?
> > >
> > > On Fri, Dec 8, 2017 at 6:54 AM, Stephen Mallette 
> > > wrote:
> > >
> > > > linking marko's reply from the user list:
> > > >
> > > > https://groups.google.com/d/msg/gremlin-users/zK9jj7bWvrQ/nE1VvhmeAAAJ
> > > >
> > > > On Thu, Dec 7, 2017 at 1:52 PM, hars...@gmail.com 
> > > > wrote:
> > > >
> > > >> Hello, dear Gremlin people!
> > > >>
> > > >> Apologies for raising this topic a bit late. I planned to start this
> > > >> thread quite earlier but wasn’t able to due to some reasons.
> > > >>
> > > > >> === short ===
> > > > >> I seek your guidance and also help for polishing and integrating the
> > > > >> sparql-gremlin 0.2 plugin
> > > > >> (https://github.com/LITMUS-Benchmark-Suite/sparql-to-gremlin) in the
> > > > >> apache tinkerpop code base, succeeding its predecessor developed by
> > > > >> Daniel Kuppitz (https://github.com/dkuppitz/sparql-gremlin). The new
> > > > >> plugin offers support for a wide range of SPARQL queries from the
> > > > >> SPARQL 1.0 features.
> > > >>
> > > >>
> > > > >> === long ===
> > > >>
> > > >> I am a Ph.D. student at the University of Bonn and work at the
> > > >> intersection of semantic web and graph databases. My thesis is
> > focused on
> > > >> bridging the gap between these two domains by enabling support for
> > SPARQL
> > > >> querying of Property Graph databases. Thus, working on the
> > SPARQL-Gremlin
> > > >> interoperability was a