RE: add isNull to org.umlg.sqlg.structure.Property

2022-12-28 Thread pieter gmail
Hi,

Since TinkerPop added `supportsNullPropertyValues` users need to call
`property.isPresent() && property.value() != null` to do a null check.
I propose adding an `isNull` to the Property interface to reduce the
typing required.

Regards
Pieter
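
A minimal sketch of what such a helper could look like. `SketchProperty`
is a hypothetical stand-in and not the real
org.umlg.sqlg.structure.Property; only `isPresent()` and `value()` are
taken from the message above. The semantics of `isNull()` are an
assumption: it is written as the negation of the check quoted above, so
an absent property also counts as null.

```java
import java.util.NoSuchElementException;

// Hypothetical stand-in for a property interface (not the Sqlg/TinkerPop type).
interface SketchProperty<V> {
    boolean isPresent();

    V value();

    // Assumed semantics: negation of `isPresent() && value() != null`.
    default boolean isNull() {
        return !isPresent() || value() == null;
    }

    // A property that exists; its value may itself be null.
    static <V> SketchProperty<V> of(V value) {
        return new SketchProperty<V>() {
            public boolean isPresent() { return true; }
            public V value() { return value; }
        };
    }

    // A property that does not exist on the element.
    static <V> SketchProperty<V> empty() {
        return new SketchProperty<V>() {
            public boolean isPresent() { return false; }
            public V value() {
                throw new NoSuchElementException("property is not present");
            }
        };
    }
}
```

With this, `!property.isNull()` would replace the longer two-part check.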


Re: Async capabilities to TinkerPop

2022-07-30 Thread pieter gmail
Just to clarify, I am not thinking about a client/server architecture.
I am talking only about TinkerPop's core step implementation.
So I think just a library like Reactor (no netty) is needed for that
part.

Regards
Pieter

On Sat, 2022-07-30 at 12:43 +0100, Oleksandr Porunov wrote:
> I used Vert.x before, and know that framework uses an event loop to
> solve that issue. I believe Reactor Netty also uses an event loop to
> solve the issue with infinite callback chains.
> I.e. instead of having a callback which calls another callback which
> calls another callback and so on until a StackOverflowError, it would
> simply put the job which should be asynchronously processed and it
> will be processed as soon as there is a chance to process it (i.e.
> like in JavaScript basically). And so, you always have just a single
> thread which processes all the callbacks. Of course such a technique
> adds some delay because now instead of a function calling another
> function directly the code looks like a function puts another
> function into a queue and there is a thread which processes all the
> functions in the queue one by one. So, if one of the functions in the
> queue has some delay it means this delay will be translated to all
> the functions after that long-running function. 
> I think the only known way to solve the issue is with an event loop,
> but if anyone knows another technique - it would be really great to
> know about it.
> If we decide to stick with an event loop for TinkerPop's async
> queries processing then I would suggest to re-use some of the
> frameworks which provide this functionality.
> I.e. we could consider Reactor Netty (as Pieter suggested) or
> anything else. I don't know which one is better to use due to being
> stuck with a single framework as for now, but I guess Reactor Netty
> should be good for that.
> So, I guess the list to check could be:
> - Reactor Netty
> - RxJava
> - Vert.x
> - Akka
> - etc.
> 
> I believe if we are OK to use one of the existing frameworks for async
> functionality, then it will be much easier to add async queries
> execution in TinkerPop instead of developing it from scratch and
> managing our own event loop.
> 
> Best regards,
> Oleksandr
> 
> 
> On Fri, Jul 29, 2022 at 4:27 PM pieter gmail
>  wrote:
> > Do frameworks like Reactor not resolve this issue with back
> > pressure and other sexy tricks?
> > 
> > Cheers
> > Pieter
> > 
> > On Fri, 2022-07-29 at 13:51 +0100, Oleksandr Porunov wrote:
> > > I'm also not sure but for some reason I feel that we may need
> > > some event loop to be implemented if we want to re-write it to
> > > async capabilities.
> > > The reason I'm telling it is because I feel that in some
> > > situations we may trigger a very long call stack.
> > > I.e.:
> > > Promise -> Promise -> Promise ->  
> > > 
> > > I guess that could be in a situation when the next part of the
> > > execution depends on the previous part of the execution. For
> > > example,
> > > 
> > > g.V().has("hello", "world").barrier(50).limit(5)
> > > 
> > > So, let's assume the next things about the execution of the above
> > > query:
> > > - It is executed in JanusGraph with batch query enabled (graph
> > > provider specific, but it's easier for me to focus on a concrete
> > > implementation)
> > > - There is no necessary index for that property (again, graph
> > > provider specific)
> > > - There are only 4 vertices with such property
> > > - We have 50 million vertices in total
> > > 
> > > If the above facts are true then the query will be executed like
> > > the following:
> > > 1) get first (or next) 50 vertices
> > > 2) filter out unmatched vertices
> > > > 3) if the limit of 5 is not reached then process the next vertices
> > > by starting from step "1" again. Otherwise, return data.
> > >  
> > > So, in fact, with the above scenario we will traverse all 50
> > > million vertices in the graph which will result in about 1
> > > > million chained calls for a promise-based implementation. That will
> > > > probably result in a StackOverflowError.
> > > With synchronous code we don't have these problems because we
> > > don't call a new function recursively each time we need to
> > > retrieve part of the data.
> > > We can overcome the above issue by implementing some kind of
> > > event loop where we put all the results and then a single thread
> > > running that loop will call necessa
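
The batch-by-batch scenario discussed in this thread can be sketched
with a tiny single-threaded event loop. This is only an illustration in
plain Java, not TinkerPop or JanusGraph code: `BatchScanEventLoop`,
`scan` and `step` are hypothetical names, and the "graph" is just a
range of integer ids. The point it shows is that the next batch is put
on a queue instead of being called recursively, so the call stack stays
shallow regardless of how many batches are needed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.IntPredicate;

// Hypothetical sketch: a queue of continuations drained by a single loop.
class BatchScanEventLoop {
    // Pending continuations; they are run one by one in FIFO order.
    private final Queue<Runnable> tasks = new ArrayDeque<>();

    /** Scans ids [0, total) in batches and collects matches until limit is reached. */
    List<Integer> scan(int total, int batchSize, int limit, IntPredicate matches) {
        List<Integer> found = new ArrayList<>();
        tasks.add(() -> step(0, total, batchSize, limit, matches, found));
        // The event loop: every task starts from this frame, so the stack never grows.
        Runnable task;
        while ((task = tasks.poll()) != null) {
            task.run();
        }
        return found;
    }

    private void step(int from, int total, int batchSize, int limit,
                      IntPredicate matches, List<Integer> found) {
        int to = Math.min(from + batchSize, total);
        // Filter the current batch.
        for (int id = from; id < to && found.size() < limit; id++) {
            if (matches.test(id)) {
                found.add(id);
            }
        }
        if (found.size() < limit && to < total) {
            // Enqueue the next batch instead of calling step() recursively.
            tasks.add(() -> step(to, total, batchSize, limit, matches, found));
        }
    }
}
```

A recursive version of `step` would need one stack frame per batch. The
cost of the queue is the extra hop per batch that the quoted message
describes.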

Re: Async capabilities to TinkerPop

2022-07-29 Thread pieter gmail
> > > > > >> checking Traversal.java code I think Pieter is right here.
> > > > >> For some reason I thought we could simply wrap synchronous
> > > > method in
> > > > >> asynchronous, basically something like:
> > > > >>
> > > > >> // the method which should be implemented by a graph
> > > > provider
> > > > >>
> > > > >> Future executeAsync(Callable func);
> > > > >>
> > > > >> public default Future asyncNext(){
> > > > >>     return executeAsync(this::next);
> > > > >> }
> > > > >>
> > > > >> but checking that code I think I was wrong about it.
> > > > Different steps may
> > > > >> execute different logic (i.e. different underlying storage
> > > > queries) for
> > > > >> different graph providers.
> > > > >> Thus, wrapping only terminal steps into async functions
> > > > won't solve the
> > > > >> problem most likely.
> > > > >>
> > > > >> I guess it will require re-writing or extending all steps to
> > > > be able to
> > > > >> pass an async state instead of a sync state.
> > > > >>
> > > > >> I'm not familiar enough with the TinkerPop code yet to claim
> > > > that, so
> > > > >> probably I could be wrong.
> > > > >> I will need to research it a bit more to find out but I
> > > > think that Pieter
> > > > >> is most likely right about a massive re-write.
> > > > >>
> > > > >> Nevertheless, even if that requires massive re-write, I
> > > > would be eager to
> > > > >> start the ball rolling.
> > > > >> I think we either need to try to implement async execution
> > > > in TinkerPop 3
> > > > >> or start making some concrete decisions regarding TinkerPop
> > > > 4.
> > > > >>
> > > > >> I see Marko A. Rodriguez started to work on RxJava back in
> > > > 2019 here
> > > > >>
> > > > https://github.com/apache/tinkerpop/tree/4.0-dev/java/machine/processor/rxjava/src/main/java/org/apache/tinkerpop/machine/processor/rxjava
> > > > >>
> > > > >> but the process didn't go as far as I understand. I guess it
> > > > would be
> > > > >> good to know if we want to completely rewrite TinkerPop in
> > > > version 4 or not.
> > > > >>
> > > > >> If we want to completely rewrite TinkerPop in version 4 then
> > > > I assume it
> > > > >> may take quite some time to do so. In this case I would be
> > > > more likely to
> > > > >> say that it's better to implement async functionality in
> > > > TinkerPop 3 even
> > > > >> if it requires rewriting all steps.
> > > > >>
> > > > >> In case TinkerPop 4 is a redevelopment with breaking changes
> > > > but without
> > > > >> starting to rewrite the whole functionality then I guess we
> > > > could try to
> > > > >> work on TinkerPop 4 by introducing async functionality and
> > > > maybe applying
> > > > >> more breaking changes in places where it's better to re-work
> > > > some parts.
> > > > >>
> > > > >> Best regards,
> > > > >> Oleksandr
> > > > >>
> > > > >>
> > > > >> On Thu, Jul 28, 2022 at 7:47 PM pieter gmail
> > > > 
> > > > >> wrote:
> > > > >>
> > > > >>> Hi,
> > > > >>>
> > > > >>> Does this not imply a massive rewrite of TinkerPop? In
> > > > particular the
> > > > >>> iterator chaining pattern of steps should follow a reactive
> > > > style of
> > > > >>> coding?
> > > > >>>
> > > > >>> Cheers
> > > > >>> Pieter
> > > > >>>
> > > > >>>
> > > > >>> On Thu, 2022-07-28 at 15:18 +0100, Oleksandr Porunov wrote:
> > > > >>> > I'm interested in adding async capabilities to TinkerPop.
> > > > >>> >
> > > > >>> > There were many discussions about async capa

Re: Async capabilities to TinkerPop

2022-07-29 Thread pieter gmail
> > >> asynchronous, basically something like:
> > >>
> > >> // the method which should be implemented by a graph provider
> > >>
> > >> Future executeAsync(Callable func);
> > >>
> > >> public default Future asyncNext(){
> > >>     return executeAsync(this::next);
> > >> }
> > >>
> > >> but checking that code I think I was wrong about it. Different
> > steps may
> > >> execute different logic (i.e. different underlying storage
> > queries) for
> > >> different graph providers.
> > >> Thus, wrapping only terminal steps into async functions won't
> > solve the
> > >> problem most likely.
> > >>
> > >> I guess it will require re-writing or extending all steps to be
> > able to
> > >> pass an async state instead of a sync state.
> > >>
> > >> I'm not familiar enough with the TinkerPop code yet to claim
> > that, so
> > >> probably I could be wrong.
> > >> I will need to research it a bit more to find out but I think
> > that Pieter
> > >> is most likely right about a massive re-write.
> > >>
> > >> Nevertheless, even if that requires massive re-write, I would be
> > eager to
> > >> start the ball rolling.
> > >> I think we either need to try to implement async execution in
> > TinkerPop 3
> > >> or start making some concrete decisions regarding TinkerPop 4.
> > >>
> > >> I see Marko A. Rodriguez started to work on RxJava back in 2019
> > here
> > >>
> > https://github.com/apache/tinkerpop/tree/4.0-dev/java/machine/processor/rxjava/src/main/java/org/apache/tinkerpop/machine/processor/rxjava
> > >>
> > >> but the process didn't go as far as I understand. I guess it
> > would be
> > >> good to know if we want to completely rewrite TinkerPop in
> > version 4 or not.
> > >>
> > >> If we want to completely rewrite TinkerPop in version 4 then I
> > assume it
> > >> may take quite some time to do so. In this case I would be more
> > likely to
> > >> say that it's better to implement async functionality in
> > TinkerPop 3 even
> > >> if it requires rewriting all steps.
> > >>
> > >> In case TinkerPop 4 is a redevelopment with breaking changes but
> > without
> > >> starting to rewrite the whole functionality then I guess we
> > could try to
> > >> work on TinkerPop 4 by introducing async functionality and maybe
> > applying
> > >> more breaking changes in places where it's better to re-work
> > some parts.
> > >>
> > >> Best regards,
> > >> Oleksandr
> > >>
> > >>
> > >> On Thu, Jul 28, 2022 at 7:47 PM pieter gmail
> > 
> > >> wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> Does this not imply a massive rewrite of TinkerPop? In
> > particular the
> > >>> iterator chaining pattern of steps should follow a reactive
> > style of
> > >>> coding?
> > >>>
> > >>> Cheers
> > >>> Pieter
> > >>>
> > >>>
> > >>> On Thu, 2022-07-28 at 15:18 +0100, Oleksandr Porunov wrote:
> > >>> > I'm interested in adding async capabilities to TinkerPop.
> > >>> >
> > >>> > There were many discussions about async capabilities for
> > TinkerPop
> > >>> > but
> > >>> > there was no clear consensus on how and when it should be
> > developed.
> > >>> >
> > >>> > The benefit for async capabilities is that the user calling a
> > query
> > >>> > shouldn't need its thread to be blocked to simply wait for
> > the result
> > >>> > of
> > >>> > the query execution. Instead of that a graph provider should
> > take
> > >>> > care
> > >>> > about implementation of async queries execution.
> > >>> > If that's the case then many graph providers will be able to
> > optimize
> > >>> > their
> > >>> > execution of async queries by handling less resources for the
> > query
> > >>> > execution.
> > >>> > As a real example of potential benefit we could get I would
> > like to
> > >>> > point

Re: Async capabilities to TinkerPop

2022-07-28 Thread pieter gmail
Hi,

Does this not imply a massive rewrite of TinkerPop? In particular the
iterator chaining pattern of steps should follow a reactive style of
coding?

Cheers
Pieter


On Thu, 2022-07-28 at 15:18 +0100, Oleksandr Porunov wrote:
> I'm interested in adding async capabilities to TinkerPop.
> 
> There were many discussions about async capabilities for TinkerPop
> but
> there was no clear consensus on how and when it should be developed.
> 
> The benefit for async capabilities is that the user calling a query
> shouldn't need its thread to be blocked to simply wait for the result
> of
> the query execution. Instead of that a graph provider should take
> care
> about implementation of async queries execution.
> If that's the case then many graph providers will be able to optimize
> their
> execution of async queries by handling less resources for the query
> execution.
> As a real example of potential benefit we could get I would like to
> point
> on how JanusGraph executes CQL queries to process Gremlin queries.
> CQL result retrieval:
> https://github.com/JanusGraph/janusgraph/blob/15a00b7938052274fe15cf26025168299a311224/janusgraph-cql/src/main/java/org/janusgraph/diskstorage/cql/function/slice/CQLSimpleSliceFunction.java#L45
> 
> As seen from the code above, JanusGraph already leverages async
> functionality for CQL queries under the hood but JanusGraph is
> required to
> process those queries in synced manner, so what JanusGraph does - it
> simply
> blocks the whole executing thread until result is returned instead of
> using
> async execution.
> 
> Of course, that's just a case when we can benefit from async
> execution
> because the underneath storage backend can process async queries. If
> a
> storage backend can't process async queries then we won't get any
> benefit
> from implementing a fake async executor.
> 
> That said, I believe quite a few graph providers may benefit from
> having a
> possibility to execute queries in async fashion because they can
> optimize
> their resource utilization.
> I believe that we could have a feature flag for storage providers
> which
> want to implement async execution. Those who can't implement it or
> don't
> want to implement it may simply disable async capabilities which will
> result in throwing an exception anytime an async function is called.
> I
> think it should be fine because we already have some feature flags
> like
> that for graph providers. For example "Null Semantics" was added in
> TinkerPop 3.5.0 but `null` is not supported for all graph providers.
> Thus,
> a feature flag for Null Semantics exists like
> "g.getGraph().features().vertex().supportsNullPropertyValues()".
> I believe we can enable async in TinkerPop 3 by providing async as a
> feature flag and letting graph providers implement it at their will.
> Moreover if a graph provider wants to have async capabilities but
> their
> storage backends don't support async capabilities then it should be
> easy to
> hide async execution under an ExecutorService which mimics async
> execution.
> I believe we could do that for TinkerGraph so that users could
> experiment
> with async API at least. I believe we could simply have a default
> "async"
> function implementation for TinkerGraph which wraps all sync
> executions in
> a function and sends it to that ExecutorService (we can discuss which
> one).
> In such a case TinkerGraph will support async execution even without
> real
> async functionality. We could also potentially provide some
> configuration
> options to TinkerGraph to configure thread pool size, executor
> service
> implementation, etc.
> 
> I didn't think about how it is better to implement those async
> capabilities
> for TinkerPop yet but I think reusing a similar approach like in
> Node.js
> which returns Promise when calling Terminal steps could be good. For
> example, we could have a method called `async` which accepts a
> termination
> step and returns a necessary Future object.
> I.e.:
> g.V(123).async(Traversal.next())
> g.V().async(Traversal.toList())
> g.E().async(Traversal.toSet())
> g.E().async(Traversal.iterate())
> 
> I know that there were discussions about adding async functionality
> to
> TinkerPop 4 eventually, but I don't see strong reasons why we
> couldn't add
> async functionality to TinkerPop 3 with a feature flag.
> It would be really great to hear some thoughts and concerns about it.
> 
> If there are no concerns, I'd like to develop a proposal for further
> discussion.
> 
> Best regards,
> Oleksandr Porunov



Re: A meta model for gremlin's property graph

2022-01-16 Thread pieter gmail
Hi,

This is a continuation of the "first decide what we are trying to
achieve in the first place." part.
We seem to agree on most of what was iterated. 
Do you have more items to add to the list?

Here is the bit I did not quite follow.

> > 3: Extend the gremlin grammar to specify schema create/edit/delete
> > functionality.
> Why is that necessary, if you're embedding schemas in the graph? Just
> embed them in the graph. We don't have extra grammar for updating
> other types of graphs.

I am not entirely sure what you mean here. In my previous example I
did, as an example, create the "modern" schema using pure gremlin based
on the property graph meta model.
However it is far from being user friendly.
Here it is again, is this what you mean by "Just embed them in the
graph."?

modernSchema = g.meta();
Vertex person = modernSchema.addVertex(T.label, "VertexLabel",
        "label", "person");
Vertex personNameVertexProperty = modernSchema.addVertex(T.label, "VertexProperty",
        "name", "name", "type", GremlinDataType.STRING.name());
Vertex personAgeVertexProperty = modernSchema.addVertex(T.label, "VertexProperty",
        "name", "age", "type", GremlinDataType.INTEGER.name());
person.addEdge("properties", personNameVertexProperty);
person.addEdge("properties", personAgeVertexProperty);

Vertex software = modernSchema.addVertex(T.label, "VertexLabel",
        "label", "software");
Vertex softwareNameVertexProperty = modernSchema.addVertex(T.label, "VertexProperty",
        "name", "name", "type", GremlinDataType.STRING.name());
Vertex softwareLangVertexProperty = modernSchema.addVertex(T.label, "VertexProperty",
        "name", "lang", "type", GremlinDataType.STRING.name());
software.addEdge("properties", softwareNameVertexProperty);
software.addEdge("properties", softwareLangVertexProperty);

Vertex knows = modernSchema.addVertex(T.label, "EdgeLabel",
        "label", "knows");
Vertex knowsWeightVertexProperty = modernSchema.addVertex(T.label, "EdgeProperty",
        "name", "weight", "type", GremlinDataType.INTEGER.name());
knows.addEdge("properties", knowsWeightVertexProperty);

Vertex created = modernSchema.addVertex(T.label, "EdgeLabel",
        "label", "created");
Vertex createdWeightVertexProperty = modernSchema.addVertex(T.label, "EdgeProperty",
        "name", "weight", "type", GremlinDataType.INTEGER.name());
created.addEdge("properties", createdWeightVertexProperty);

person.addEdge("outEdge", knows);
person.addEdge("outEdge", created);
software.addEdge("inEdge", knows);
software.addEdge("inEdge", created);

It is far simpler to define a dedicated grammar, something like this,

VertexLabel person = g.getTopology().ensureVertexLabelExist("person",
        new HashMap<>() {{
            put("name", GremlinDataType.STRING);
            put("age", GremlinDataType.INTEGER);
        }});
VertexLabel software = g.getTopology().ensureVertexLabelExist("software",
        new HashMap<>() {{
            put("name", GremlinDataType.STRING);
            put("lang", GremlinDataType.STRING);
        }});
EdgeLabel knows = person.ensureEdgeLabelExist("knows", person,
        new HashMap<>() {{
            put("weight", GremlinDataType.DOUBLE);
        }});
// "created" runs from person to software, matching the embedded example above
EdgeLabel created = person.ensureEdgeLabelExist("created", software,
        new HashMap<>() {{
            put("weight", GremlinDataType.DOUBLE);
        }});

This is from embedded Java so it will need some adjustment and thinking
but I suspect it is easier for the user if we extend the grammar, in
the same way that RDBMSs do not ask users to insert rows into the
information schema but instead give them a DDL grammar that speaks
directly to the task at hand.
It also guarantees that the model is valid at all times as the grammar
won't permit an incorrect schema.
In the "embedded" way it is possible to corrupt the schema, which is
why the property graph meta model defined gremlin constraints to
validate the schema.
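
To illustrate why a dedicated API can keep the schema valid by
construction, here is a small self-contained sketch. All names
(`SketchTopology`, `SketchVertexLabel`, `SketchDataType`) are
hypothetical and only loosely modelled on the `ensureVertexLabelExist`
call above; it is not the Sqlg or TinkerPop implementation. The checks
mirror two of the meta model constraints: a label must be non-empty and
every property must have a name and a type.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical names throughout; not an existing TinkerPop or Sqlg API.
enum SketchDataType { STRING, INTEGER, DOUBLE }

final class SketchVertexLabel {
    final String label;
    final Map<String, SketchDataType> properties;

    SketchVertexLabel(String label, Map<String, SketchDataType> properties) {
        if (label == null || label.isEmpty()) {
            throw new IllegalArgumentException("a VertexLabel must have a label");
        }
        for (Map.Entry<String, SketchDataType> e : properties.entrySet()) {
            if (e.getKey() == null || e.getKey().isEmpty()) {
                throw new IllegalArgumentException("a property must have a name");
            }
            if (e.getValue() == null) {
                throw new IllegalArgumentException("a property must have a type");
            }
        }
        this.label = label;
        // Defensive copy so the schema cannot be changed behind the topology's back.
        this.properties = Collections.unmodifiableMap(new LinkedHashMap<>(properties));
    }
}

final class SketchTopology {
    private final Map<String, SketchVertexLabel> vertexLabels = new LinkedHashMap<>();

    /** Returns the existing label or creates it; invalid definitions are rejected. */
    SketchVertexLabel ensureVertexLabelExist(String label,
                                             Map<String, SketchDataType> properties) {
        SketchVertexLabel existing = vertexLabels.get(label);
        if (existing != null) {
            return existing;
        }
        SketchVertexLabel created = new SketchVertexLabel(label, properties);
        vertexLabels.put(label, created);
        return created;
    }
}
```

In the embedded approach the same rules have to be checked after the
fact with validation queries, because nothing stops a caller from adding
a 'VertexLabel' vertex without a 'label'.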

Preferably all implementations should provide a way to query the schema
based on the property graph meta model.

i.e. 
List persons = g.schema().V().hasLabel("VertexLabel").has("name", 
P.eq("person")).toList();
Assert.assertEquals(1, persons.size());
List knowsAndCreated = 
g.schema().V().hasLabel("VertexLabel").has("name", 
P.eq("person"

Re: A meta model for gremlin's property graph

2022-01-15 Thread pieter gmail
Hi,

Here are some thoughts on your response.

> which parts of the approach you describe below were influenced by OMG

The primary inspiration from UML is the insight that a language can be
self describing.  It is of course inevitable in the real world as we
can not tolerate infinite regression with regards to every level
needing yet another meta level to describe it.
This is precisely an attempt at gremlin describing itself without
recourse to any other language.

> +1 to using or drawing upon standards where we can. 

To be clear I am not using any OMG standard as such. If we were to do
that we would define the property graph model using MOF (meta object
facility) or its counterpart EMF. While this is entirely possible it
is not the approach taken here. Here the attempt is to bootstrap the
property graph model entirely and only with gremlin.

> The problem right now is that Gremlin's declarative semantics aren't
> very clear, and it is a relatively complex language.

This is not an attempt at a specification of the gremlin language. It
is only an attempt at formally specifying the implicit property graph
model assumed by the gremlin language. My understanding is that the
gremlin language will be formally defined by the antlr grammar
accompanied with documentation in English.

> I like the term "schema".

+1

> I agree, and I think there is value in going one step further to
> create a general purpose data model for defining data models, with
> property graphs as a special case.

Here I do not agree. While there certainly is value in meta meta models
I do not think actually designing a new one belongs in TinkerPop.
TinkerPop is about the gremlin language and the property graph model,
not about meta meta models. The job of creating deeper more abstract
models with all that it entails is in my opinion a huge task that has
little to do with TinkerPop, gremlin and its property graph model.

>  the classic graph ("data graph") has elements "Marko", "Josh",
> "ripple" etc. each of which is a value together with a type and a
> name

Here it is the same critique. There is no need to say that a vertex
together with its label is in fact a type with a name. Type is not a
notion in gremlin nor a notion in our meta model so it's not part of our
language.

> Cool, except that I would banish types like Date and Time

I have no strong intuitions about this art/science. Perhaps the meta
model should be extended to provide some support for non primitive data
types.

> > int8, ...

I was actually hoping to avoid some arbitrary attempt at defining a
long list of possible primitives. I looked on the internet but it seems
there is no standards body out there for this, with every language and
database defining its own types. Perhaps the long list is the only
solution?

> Or the other way around: we define a core model as its own thing
> using a well-defined, controlled vocabulary, then map it into
> Gremlin.

Same critique as above. Letting in another language means gremlin does
not bootstrap itself.

> I don't see your approach of embedding model definitions and
> constraints natively in Gremlin as being at odds with having a formal
> data model.

I am afraid I do see them as being at odds with one another. Describing
gremlin using another language, be it MOF/EMF/category theory, is very
different from it being self describing. If we decide against gremlin
self describing then we abort this attempt, no point in hacking it.

For what it's worth this is a bit of a proof of concept, to see if
gremlin can meaningfully self describe. It has done so for the last 10
years.

Perhaps we should, however, before discussing the merits of this
approach or another, first decide what we are trying to achieve in the
first place.

Here goes my understanding of what we are trying to achieve.

1: A property graph meta model. To describe exactly what kind of data
structure the gremlin language operates on.
2: Gremlin grammar together with the documentation specifies gremlin
the language fully.
3: Extend the gremlin grammar to specify schema create/edit/delete
functionality.
4: Extend the grammar to query the schema. (This can be plain gremlin,
just operating at the schema level)
5: A language agnostic specification of how to interact with a remote
gremlin enabled system. i.e. similar to the jdbc specification only
without reference to any particular language.

As an aside, breaking user space should not even be considered. i.e.
99% backward compatibility should be guaranteed at all times.

Thanks
Pieter




On Tue, 2022-01-11 at 10:47 -0800, Joshua Shinavier wrote:
> Hey Pieter,
> 
> Good to see some more motion on this front. Responses inline.
> 
> 
> On Sun, Jan 9, 2022 at 4:28 AM pieter gmail 
> wrote:
> > Hi,
> > 
> > I have done some work on defining a meta model for Gremlin's
> > property graph. I a

Re: A meta model for gremlin's property graph

2022-01-15 Thread pieter gmail
Hi,

Here are the 2 missing images.

The first is the property graph meta model as defined with gremlin.

public static Graph gremlinMetaModel() {
    enum GremlinDataType {
        STRING,
        INTEGER,
        DOUBLE,
        DATE,
        TIME
        //...
    }
    TinkerGraph propertyGraphMetaModel = TinkerGraph.open();
    Vertex graph = propertyGraphMetaModel.addVertex(T.label, "Graph",
            "name", "GremlinDataType::STRING");
    Vertex vertex = propertyGraphMetaModel.addVertex(T.label, "VertexLabel",
            "label", "GremlinDataType::STRING");
    Vertex edge = propertyGraphMetaModel.addVertex(T.label, "EdgeLabel",
            "label", "GremlinDataType::STRING");
    Vertex vertexProperty = propertyGraphMetaModel.addVertex(T.label, "VertexProperty",
            "name", "GremlinDataType::STRING", "type", "GremlinDataType");
    Vertex edgeProperty = propertyGraphMetaModel.addVertex(T.label, "EdgeProperty",
            "name", "GremlinDataType::STRING", "type", "GremlinDataType");

    graph.addEdge("vertices", vertex);
    graph.addEdge("edges", edge);
    vertex.addEdge("properties", vertexProperty);
    // an 'EdgeProperty' hangs off the 'EdgeLabel', not the 'VertexLabel'
    edge.addEdge("properties", edgeProperty);
    vertex.addEdge("out", edge);
    vertex.addEdge("in", edge);

    return propertyGraphMetaModel;
}

propertyGraphMetaModel.png

The second is TinkerPop's modern model/schema also as defined with
gremlin.

public static Graph modernModel() {
    //import this from a base package
    enum GremlinDataType {
        STRING,
        INTEGER,
        DOUBLE,
        DATE,
        TIME
        //...
    }

    TinkerGraph modernModelGraph = TinkerGraph.open();

    Vertex person = modernModelGraph.addVertex(T.label, "VertexLabel",
            "label", "person");
    Vertex personNameVertexProperty = modernModelGraph.addVertex(T.label, "VertexProperty",
            "name", "name", "type", GremlinDataType.STRING.name());
    Vertex personAgeVertexProperty = modernModelGraph.addVertex(T.label, "VertexProperty",
            "name", "age", "type", GremlinDataType.INTEGER.name());
    person.addEdge("properties", personNameVertexProperty);
    person.addEdge("properties", personAgeVertexProperty);

    Vertex software = modernModelGraph.addVertex(T.label, "VertexLabel",
            "label", "software");
    Vertex softwareNameVertexProperty = modernModelGraph.addVertex(T.label, "VertexProperty",
            "name", "name", "type", GremlinDataType.STRING.name());
    Vertex softwareLangVertexProperty = modernModelGraph.addVertex(T.label, "VertexProperty",
            "name", "lang", "type", GremlinDataType.STRING.name());
    software.addEdge("properties", softwareNameVertexProperty);
    software.addEdge("properties", softwareLangVertexProperty);

    Vertex knows = modernModelGraph.addVertex(T.label, "EdgeLabel",
            "label", "knows");
    Vertex knowsWeightVertexProperty = modernModelGraph.addVertex(T.label, "EdgeProperty",
            "name", "weight", "type", GremlinDataType.INTEGER.name());
    knows.addEdge("properties", knowsWeightVertexProperty);

    Vertex created = modernModelGraph.addVertex(T.label, "EdgeLabel",
            "label", "created");
    Vertex createdWeightVertexProperty = modernModelGraph.addVertex(T.label, "EdgeProperty",
            "name", "weight", "type", GremlinDataType.INTEGER.name());
    created.addEdge("properties", createdWeightVertexProperty);

    person.addEdge("outEdge", knows);
    person.addEdge("outEdge", created);
    software.addEdge("inEdge", knows);
    software.addEdge("inEdge", created);
    return modernModelGraph;
}


modernModel.png

Regards
Pieter

On Sun, 2022-01-09 at 14:28 +0200, pieter gmail wrote:
> Hi,
> 
> I have done some work on defining a meta model for Gremlin's property graph. 
> I am using the approach used in the modelling world, in particular as done by 
> the OMG group when defining their various meta models and specifications.
> 
> However where OMG uses a subset of the UML to define their meta models I 
> suggest we use Gremlin. After all Gremlin is the language we use to describe 
> the world and the property graph meta model can also be described in Gremlin.
> 
> I pro

A meta model for gremlin's property graph

2022-01-09 Thread pieter gmail
Hi,

I have done some work on defining a meta model for Gremlin's property
graph. I am using the approach used in the modelling world, in
particular as done by the OMG group when defining their various meta
models and specifications.

However where OMG uses a subset of the UML to define their meta models
I suggest we use Gremlin. After all Gremlin is the language we use to
describe the world and the property graph meta model can also be
described in Gremlin.

I propose that we have 3 levels of modelling. Each of which can itself
be specified in gremlin.

1: The property graph meta model.
2: The model.
3: The graph representing the actual data.

1) The property graph meta model describes the nature of the property
graph itself. i.e. that property graphs have vertices, edges and
properties.

2) The model is an instance of the meta model. It describes the schema
of a particular graph. i.e. for TinkerPop's modern graph this would be
'person', 'software', 'created' and 'knows' together with the properties
'weight', 'age', 'name' and 'lang'.

3) The final level is an instance of the model. It is the actual graph
itself. i.e. for TinkerPop's modern graph it is 'Marko', 'Josh', 'java'
...


1: Property Graph Meta Model

public static Graph gremlinMetaModel() {
    enum GremlinDataType {
        STRING,
        INTEGER,
        DOUBLE,
        DATE,
        TIME
        //...
    }
    TinkerGraph propertyGraphMetaModel = TinkerGraph.open();
    Vertex graph = propertyGraphMetaModel.addVertex(T.label, "Graph",
            "name", "GremlinDataType::STRING");
    Vertex vertex = propertyGraphMetaModel.addVertex(T.label, "VertexLabel",
            "label", "GremlinDataType::STRING");
    Vertex edge = propertyGraphMetaModel.addVertex(T.label, "EdgeLabel",
            "label", "GremlinDataType::STRING");
    Vertex vertexProperty = propertyGraphMetaModel.addVertex(T.label, "VertexProperty",
            "name", "GremlinDataType::STRING", "type", "GremlinDataType");
    Vertex edgeProperty = propertyGraphMetaModel.addVertex(T.label, "EdgeProperty",
            "name", "GremlinDataType::STRING", "type", "GremlinDataType");

    graph.addEdge("vertices", vertex);
    graph.addEdge("edges", edge);
    vertex.addEdge("properties", vertexProperty);
    // an 'EdgeProperty' hangs off the 'EdgeLabel', not the 'VertexLabel'
    edge.addEdge("properties", edgeProperty);
    vertex.addEdge("out", edge);
    vertex.addEdge("in", edge);

    return propertyGraphMetaModel;
}

Notes: 
1) GremlinDataType is an enumeration of named data types that Gremlin
supports. All Gremlin data types are assumed to be atomic, with their
life cycle fully owned by the containing parent. How they are persisted
on disk or transported over the wire is not a concern for the meta model.
2) Gremlin's semantics are too weak to fully specify a valid meta model.
Accompanying the meta model we need a list of constraints, specified as
Gremlin queries, to augment the semantics of the meta model. These
constraints/queries will be able to validate any Gremlin-specified
model for correctness.
3) It is trivial to extend the meta model. e.g. To specify something
like index support just add an 'Index' vertex and an edge from
'VertexLabel' to it.

Property graph meta model constraints. Each query returns the ids of the
elements that violate the constraint, so an empty result means the model
is valid.

1) Every 'VertexLabel' must have a 'label'.
    g.V().hasLabel("VertexLabel").or(__.hasNot("label"), __.has("label", P.eq(""))).id()
2) Every 'EdgeLabel' must have a 'label'.
    g.V().hasLabel("EdgeLabel").or(__.hasNot("label"), __.has("label", P.eq(""))).id()
3) Every 'EdgeLabel' must have at least one 'outEdge' 'VertexLabel'.
    g.V().hasLabel("EdgeLabel").where(__.not(__.in("outEdge"))).id()
4) Every 'EdgeLabel' must have at least one 'inEdge' 'VertexLabel'.
    g.V().hasLabel("EdgeLabel").where(__.not(__.in("inEdge"))).id()
5) Every 'VertexProperty' must have a 'name'.
    g.V().hasLabel("VertexProperty").or(__.hasNot("name"), __.has("name", P.eq(""))).id()
6) Every 'VertexProperty' must have a 'type'.
    g.V().hasLabel("VertexProperty").or(__.hasNot("type"), __.has("type", P.eq(""))).id()
7) Every 'EdgeProperty' must have a 'name'.
    g.V().hasLabel("EdgeProperty").or(__.hasNot("name"), __.has("name", P.eq(""))).id()
8) Every 'EdgeProperty' must have a 'type'.
    g.V().hasLabel("EdgeProperty").or(__.hasNot("type"), __.has("type", P.eq(""))).id()
9) Every 'VertexProperty' must have an incoming 'properties' edge.
    g.V().hasLabel("VertexProperty").where(__.not(__.in("properties"))).id()
10) Every 'EdgeProperty' must have an incoming 'properties' edge.
    g.V().hasLabel("EdgeProperty").where(__.not(__.in("properties"))).id()
...
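
To make the idea of "constraints as queries" concrete without pulling in
TinkerPop, here is a minimal sketch in plain Java. The `Node` class and the
`MetaModelChecks` helper are hypothetical stand-ins for illustration only,
not TinkerPop API. Like the constraint queries in this mail, each check
returns the ids of the offending elements, so an empty list means the
constraint holds.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, minimal stand-in for a vertex in a model graph (not TinkerPop API).
final class Node {
    final int id;
    final String label;
    final Map<String, String> properties = new HashMap<>();
    // Labels of incoming edges, e.g. "outEdge", "inEdge", "properties".
    final List<String> inEdgeLabels = new ArrayList<>();

    Node(int id, String label) {
        this.id = id;
        this.label = label;
    }

    Node property(String key, String value) {
        properties.put(key, value);
        return this;
    }

    Node in(String edgeLabel) {
        inEdgeLabels.add(edgeLabel);
        return this;
    }
}

final class MetaModelChecks {
    // Ids of nodes with the given label whose property is missing or empty.
    static List<Integer> missingProperty(List<Node> model, String label, String key) {
        List<Integer> offenders = new ArrayList<>();
        for (Node n : model) {
            String value = n.properties.get(key);
            if (n.label.equals(label) && (value == null || value.isEmpty())) {
                offenders.add(n.id);
            }
        }
        return offenders;
    }

    // Ids of nodes with the given label that lack an incoming edge with that edge label.
    static List<Integer> missingInEdge(List<Node> model, String label, String edgeLabel) {
        List<Integer> offenders = new ArrayList<>();
        for (Node n : model) {
            if (n.label.equals(label) && !n.inEdgeLabels.contains(edgeLabel)) {
                offenders.add(n.id);
            }
        }
        return offenders;
    }

    public static void main(String[] args) {
        List<Node> model = new ArrayList<>();
        model.add(new Node(1, "VertexLabel").property("label", "person"));
        model.add(new Node(2, "EdgeLabel").property("label", "knows").in("outEdge").in("inEdge"));
        model.add(new Node(3, "EdgeLabel").property("label", "").in("outEdge"));
        System.out.println(missingProperty(model, "EdgeLabel", "label")); // [3]
        System.out.println(missingInEdge(model, "EdgeLabel", "inEdge"));  // [3]
    }
}
```

In a real implementation the two helpers would simply be the Gremlin
queries themselves; the sketch only shows the "return the violators"
convention.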


2: The model

What follows is an example of TinkerPop's 'modern' graph specified as
an instance of the above property graph meta model.

public static Graph modernModel() {
//import this from a base package
enum GremlinDataType {
STRING,
INTEGER,
DOUBLE,
DATE,
TIME
//...

Re: [DISCUSS] Geo-Spatial support

2021-08-03 Thread pieter gmail
Hi,

I'd suggest having a look at Postgis (https://postgis.net/) for some
inspiration. It is mature and rather sophisticated, with a small army of
functions. It has two types, 'geometry' for standard projection
functions and 'geography' for 3D spherical maths functions.

Postgis also has a JDBC driver with a bunch of types which might help
thinking about standards.

LinearRing
MultiPoint
LineString
MultiLineString
Polygon
GeographyPolygon
MultiPolygon
GeometryCollection
Point
GeographyPoint

We use it extensively. I added some types and basic functions to Sqlg
but gave up, as it felt like an "adds no value" layer. It was far
easier to let our engineers work at the Postgis level, as it is well
documented and has lots of support out there in the wild.

Perhaps in our case the 'g.cypher("some cypher")' way would suit us
better,

i.e.

graph.postgis("SELECT superhero.name
FROM city, superhero
WHERE ST_Contains(city.geom, superhero.geom)
AND city.name = 'Gotham';")

I'd also suggest some support for GeoJSON.

Postgis can convert any query's result into GeoJSON, which one can then
pass directly to the map tool. In our case it completely removed the
need for the javascript folk to sweat away at endless performance
issues and GIS complications.

Cheers
Pieter

On Tue, 2021-08-03 at 11:50 -0800, David Bechberger wrote:
> Sorry Josh, I just realized I never responded to this and thanks for
> the feedback.
> 
> The scope for the proposed options is based on what tools like DSE
> Graph and Janusgraph support.  I definitely agree that we should make
> sure that what we choose is extensible as well as in line with
> standards.  I am not too familiar with GeoSPARQL but I have done a lot
> with WKT format which does allow for definitions of items like polygons
> with holes, multi-polygons, and multipoints that we may want to include
> at some point.
> 
> As far as the initial proposed predicates I was sort of looking at what
> was supported by other common indexing backends like Elasticsearch to
> provide a glimpse of the most common types of patterns people are
> searching on.
> 
> Dave
> 
> 
> On Tue, Aug 3, 2021 at 4:37 AM Stephen Mallette 
> wrote:
> 
> > Just noticed I hadn't commented on this thread - I'm in favor of this
> > addition. Other graphs have already built this sort of functionality
> > and it is already satisfying existing use cases so we already have a
> > model for how this sort of functionality will work. I'd agree with
> > Josh that there may yet be some details on the implementation to
> > consider but I don't have much to add to the general proposal Dave has
> > provided. Looks good to me.
> > 
> > On Fri, Jul 23, 2021 at 11:47 AM Joshua Shinavier 
> > wrote:
> > 
> > > Hi Dave,
> > > 
> > > I think something like this is a very good idea, and these look
> > > like useful primitives. IMO when it comes to geospatial queries, the
> > > devil is in the details. For example, at some point we'll have
> > > someone asking for double-precision lat/lon points (GPS is not that
> > > accurate, but some applications use computed/simulated points, or
> > > combine GPS data with local position). Polygons are sometimes defined
> > > as having "holes", etc. It may be worthwhile to take some direction
> > > from OGC standards like GeoSPARQL.
> > > 
> > > Just an initial $0.02. Ideally, the extension would be simple for
> > > developers to use and understand (as this is), while also being
> > > somewhat future-proof and playing well with standards.
> > > 
> > > Josh
> > > 
> > > 
> > > 
> > > On Thu, Jul 22, 2021 at 2:44 PM David Bechberger
> > > 
> > > wrote:
> > > 
> > > > One of the common requests from customers and users of TinkerPop
> > > > is to add support for geographic based searches (TINKERPOP-2558).
> > > > In fact many TinkerPop enabled database vendors such as DataStax
> > > > Graph and JanusGraph have added custom predicates and libraries to
> > > > handle this request. As a query language framework it would make
> > > > sense for TinkerPop to adopt a common geo-predicate framework to
> > > > provide standardization across providers and to support this as
> > > > part of the TinkerPop ecosystem.
> > > > 
> > > > In consultation with some others on the project we have put
> > > > together a proposed scheme for supporting this in TinkerPop which I
> > > > have documented in a gist here:
> > > > https://gist.github.com/bechbd/70f4ce5a537d331929ea01634b1fbaa2
> > > > 
> > > > Interested in hearing others' thoughts?
> > > > 
> > > > Dave
> > > > 
> > > 
> > 




Re: code generation and RDF support in TinkerPop 4

2021-06-03 Thread pieter gmail
Hi,

I kinda lost track of what we discussed previously.
Did we come to a decision regarding what language we are going to use
to describe the structure of the graph?

YAML, XSD, UML, YANG or some category-theory-based language?

From my understanding this would be the biggest change in TP4. A
TinkerPop graph will no longer be a tangle of endless vertices and
edges but instead can, optionally, be well defined and constrained.
This way an engineer can, long after the original creators of a graph
have left, immediately understand the graph, without needing to write a
single query.

Thanks
Pieter




On Thu, 2021-06-03 at 09:59 -0700, Joshua Shinavier wrote:
> Hi Pieter,
> 
> 
> On Thu, Jun 3, 2021 at 9:40 AM pieter gmail 
> wrote:
> > Hi,
> > 
> > Just to understand a bit better what's going on.
> > 
> > Did you hand write the dragon yaml with the antlr grammar as input?
> > 
> 
> 
> 
> Yes, the YAML was written by hand, and based pretty closely on
> Gremlin.g4. You can see Stephen's ANTLR definitions inline with the
> YAML as comments. I also took some direction from the Java API.
> 
> 
>  
> > Did you generate the java classes from the yaml using dragon or
> > something else?
> > 
> 
> 
> 
> Yes, the Java classes are currently generated using Dragon. I'm
> limiting the generated code to Java for now (other possible targets
> being Scala and Haskell) just to keep diffs to a reasonable size, and
> because a new, open-source solution is needed to replace Dragon. My
> current thinking is that the new transformation framework will be
> separate from TinkerPop, as it will serve non-graph as well as graph
> use cases. For now, you can think of the code generation as a
> bootstrapping strategy.
> 
> Josh
> 
> 
>  
> > 
> > Thanks
> > Pieter
> > 
> > On Thu, 2021-06-03 at 07:48 -0700, Joshua Shinavier wrote:
> > > Hello all,
> > > 
> > > I would like to take some concrete steps toward the TinkerPop 4
> > > interoperability goals I've stated a few times (e.g. see TinkerPop
> > > 2020 <https://www.slideshare.net/joshsh/tinkerpop-2020> from last
> > > year). At a meetup
> > > <https://www.meetup.com/Category-Theory/events/277331504/> a couple
> > > of months ago, I demonstrated an approach for generating TinkerPop
> > > APIs consistently into different languages. I have started to check
> > > in some of that generated code in a branch (see my commits here
> > > <https://github.com/apache/tinkerpop/commits/TINKERPOP-2563-language/gremlin-language>)
> > > and add bits and pieces for RDF support, as well.
> > > 
> > > The Apache Software Foundation asks us to discuss any significant
> > > changes to the code base on the dev list. Since these steps toward
> > > TP4 will be major changes if and when they are merged into the master
> > > branch, I will start discussing them here. Expect occasional emails
> > > from me about the various things I will be doing in the branch. I
> > > absolutely invite comments, feedback, and actual discussion on these
> > > design proposals, but even if it's just me issuing self-affirming
> > > statements into the void like the King of Pointland, I will just
> > > carry on, because that's how this process works.
> > > 
> > > A brief summary of the changes so far:
> > > 
> > > 
> > >    - *Abstract specification of Gremlin traversals*. I have turned
> > >    Stephen's Gremlin.g4
> > >    <https://github.com/apache/tinkerpop/blob/TINKERPOP-2563-language/gremlin-language/src/main/antlr4/Gremlin.g4>
> > >    ANTLR grammar into an abstract specification of Gremlin traversal
> > >    syntax using the Dragon (YAML-based) format. Unfortunately, it is
> > >    looking very unlikely that Dragon will become available as
> > >    open-source software, so you can expect this YAML format to change
> > >    just slightly once we have a new Dragon-like tool for schema and
> > >    data transformations. More on that later. Right now, the syntax
> > >    specification can be found here
> > >    <https://github.com/apache/tinker

Re: code generation and RDF support in TinkerPop 4

2021-06-03 Thread pieter gmail
Hi,

Just to understand a bit better what's going on.

Did you hand write the dragon yaml with the antlr grammar as input?
Did you generate the java classes from the yaml using dragon or
something else?

Thanks
Pieter

On Thu, 2021-06-03 at 07:48 -0700, Joshua Shinavier wrote:
> Hello all,
> 
> I would like to take some concrete steps toward the TinkerPop 4
> interoperability goals I've stated a few times (e.g. see TinkerPop
> 2020 from last year). At a meetup a couple of months ago, I
> demonstrated an approach for generating TinkerPop APIs consistently
> into different languages. I have started to check in some of that
> generated code in a branch (see my commits here) and add bits and
> pieces for RDF support, as well.
> 
> The Apache Software Foundation asks us to discuss any significant
> changes to the code base on the dev list. Since these steps toward TP4
> will be major changes if and when they are merged into the master
> branch, I will start discussing them here. Expect occasional emails
> from me about the various things I will be doing in the branch. I
> absolutely invite comments, feedback, and actual discussion on these
> design proposals, but even if it's just me issuing self-affirming
> statements into the void like the King of Pointland, I will just carry
> on, because that's how this process works.
> 
> A brief summary of the changes so far:
> 
> 
>    - *Abstract specification of Gremlin traversals*. I have turned
>    Stephen's Gremlin.g4 ANTLR grammar into an abstract specification of
>    Gremlin traversal syntax using the Dragon (YAML-based) format.
>    Unfortunately, it is looking very unlikely that Dragon will become
>    available as open-source software, so you can expect this YAML
>    format to change just slightly once we have a new Dragon-like tool
>    for schema and data transformations. More on that later. Right now,
>    the syntax specification can be found here, although the file path
>    might change in the future.
> 
> 
>    - *Traversal DTOs*. Based on the abstract specification, I have
>    generated Java classes for building and working with traversals. The
>    generated files can currently be found here. These are essentially
>    POJOs or DTO classes, with special boilerplate methods for equality,
>    pattern matching over alternative constructors, and modification by
>    copying (since the instances are immutable). These classes allow you
>    to build traversals in a declarative way, while all of the logic for
>    evaluating traversals goes elsewhere. Support for serialization and
>    deserialization for traversals is to be added in the future -- and
>    the same goes for all other classes generated in this way.
> 
> 
>    - *RDF 1.1 concepts model*. RDF support was part of TinkerPop from
>    the beginning, but it was de-emphasized for TinkerPop 3 due to other
>    priorities such as OLAP. For years, developers have been asking us
>    for better interoperability with RDF. While we do have some
>    query-level support for RDF these days in sparql-gremlin, we no
>    longer have any data-level support, e.g. supporting loading RDF data
>    into a property graph and getting it back out, evaluating Gremlin
>    traversals over RDF datasets, etc. These things are not especially
>    hard to do, in certain limited ways, but our old approach of writing
>    adapters like GraphSail, SailGraph, and PropertyGraphSail in Java,
>    with no support for other languages, does not seem appropriate for
>    TinkerPop 4. Also, those early mappings were extremely underspecified
>    in a formal sense -- good enough for some practical applications, but
>    not good enough for anything requiring inference, optimization, or
>    composition with other mappings. To that end, I am starting to add
>    abstract specifications for RDF along the lines of the Gremlin
>    specifications I described above. The first of these, a
>    specification of RDF 1.1 Concepts, can currently be found here
>   
> 

Re: [DISCUSS] ANTLR and gremlin-script

2021-03-22 Thread pieter gmail
Hi,

Exciting as this is, I am not quite sure what it means.

Naively, is the idea perhaps,
Arbitrary gremlin string -> antlr parser -> some AST walker -> gremlin
byte code -> java in memory steps ... -> voila
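
A toy sketch of that pipeline, to show what I mean by "string in,
instructions out". It assumes nothing about the actual ANTLR grammar: the
regex stands in for the parser and AST walker, and the `Instruction` and
`ToyGremlinParser` classes are hypothetical names that only loosely
imitate bytecode instructions (a step name plus arguments). It handles
flat step chains only, no nested traversals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for one bytecode instruction: a step name plus its raw arguments.
final class Instruction {
    final String operator;
    final List<String> arguments;

    Instruction(String operator, List<String> arguments) {
        this.operator = operator;
        this.arguments = arguments;
    }

    @Override
    public String toString() {
        return operator + arguments;
    }
}

final class ToyGremlinParser {
    // Matches ".name(args)" where args contain no parentheses; real Gremlin needs a real grammar.
    private static final Pattern STEP = Pattern.compile("\\.(\\w+)\\(([^()]*)\\)");

    static List<Instruction> parse(String gremlin) {
        if (!gremlin.startsWith("g")) {
            throw new IllegalArgumentException("traversal must start with 'g'");
        }
        List<Instruction> instructions = new ArrayList<>();
        Matcher m = STEP.matcher(gremlin);
        int expected = 1; // position right after the leading 'g'
        while (m.find()) {
            if (m.start() != expected) {
                throw new IllegalArgumentException("unexpected input at index " + expected);
            }
            instructions.add(new Instruction(m.group(1), splitArguments(m.group(2))));
            expected = m.end();
        }
        if (expected != gremlin.length()) {
            throw new IllegalArgumentException("unexpected input at index " + expected);
        }
        return instructions;
    }

    private static List<String> splitArguments(String raw) {
        List<String> args = new ArrayList<>();
        for (String part : raw.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                // Strip surrounding single quotes from string literals.
                args.add(trimmed.replaceAll("^'|'$", ""));
            }
        }
        return args;
    }

    public static void main(String[] args) {
        // Prints [V[], out[knows], values[name]]
        System.out.println(parse("g.V().out('knows').values('name')"));
    }
}
```

The resulting instruction list is what a later stage would turn into the
in-memory Java steps.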

Is the grammar going to be the primary and only
interface/specification, or will the native java implementation bypass
the grammar going straight to the steps instead?

Is this aimed at Gremlin 3 or 4?

Cheers
Pieter

On Tue, 2021-03-16 at 15:47 -0400, Stephen Mallette wrote:
> Here is the PR: https://github.com/apache/tinkerpop/pull/1408
> 
> On Tue, Mar 16, 2021 at 6:14 AM Stephen Mallette
> 
> wrote:
> 
> > No branch yet, but I think I will be sending the PR today.
> > 
> > On Mon, Mar 15, 2021 at 9:33 PM Joshua Shinavier
> > 
> > wrote:
> > 
> > > Is there a branch we can take a look at before the PR is ready?
> > > 
> > > Josh
> > > 
> > > On Fri, Mar 12, 2021 at 5:42 AM Stephen Mallette
> > > 
> > > wrote:
> > > 
> > > > I've been working on forming a pull request for this task. I
> > > > don't think IP Clearance is necessary as I originally did because
> > > > the contribution is really just an ANTLR4 grammar file with some
> > > > tests to validate things. Therefore, it's not a big body of
> > > > independent code as I'd perhaps initially envisioned. Compared to
> > > > gremlint, this addition is pretty simple and straightforward. I've
> > > > created this issue in JIRA with some additional notes on what to
> > > > expect in this initial body of work:
> > > > 
> > > > https://issues.apache.org/jira/browse/TINKERPOP-2533
> > > > 
> > > > 
> > > > 
> > > > On Mon, Feb 8, 2021 at 10:06 AM Stephen Mallette
> > > > 
> > > > wrote:
> > > > 
> > > > > Just wanted to leave an update on this thread. It was nice to
> > > > > see some support for it. I've not had time to focus on the task
> > > > > itself so sorry there hasn't been much movement, but I hope to
> > > > > see it on track soon. I thought to update the thread after I came
> > > > > across yet another nice usage for it. I've long wanted to unify
> > > > > our test framework (i.e. deprecate the JVM process suite in favor
> > > > > of the GLV test suite). I was experimenting with what that might
> > > > > look like on Friday and hit a circular dependency which
> > > > > constantly trips things up where gremlin-test wants to depend on
> > > > > gremlin-groovy (for ScriptEngine support) but gremlin-groovy
> > > > > depends on gremlin-test and tinkergraph with  scope already. I
> > > > > think the introduction of gremlin-script would let gremlin-test
> > > > > build the Traversal object from a Gremlin string and thus avoid
> > > > > that circular relationship.
> > > > > 
> > > > > On Fri, Jan 8, 2021 at 2:43 AM pieter gmail
> > > > > 
> > > > > wrote:
> > > > > 
> > > > > > +1
> > > > > > 
> > > > > > I have often thought the language specification should be a
> > > > > > project separate from the implementations, and done in a formal
> > > > > > but plain English format similar to OMG or IETF specifications.
> > > > > > 
> > > > > > I suspect Sqlg's code base would have been vastly different
> > > > > > if it had evolved from a grammar instead of an api.
> > > > > > 
> > > > > > Cheers
> > > > > > Pieter
> > > > > > 
> > > > > > On Thu, 2020-12-24 at 14:41 -0500, Stephen Mallette wrote:
> > > > > > > As a project, over the years, we've often been asked the
> > > > > > > question
> > > as
> > > > > > > to why
> > > > > &

Re: [DISCUSS] Adding motif support to match()

2021-02-04 Thread pieter gmail
+1

Cheers
Pieter

On Thu, 2021-02-04 at 17:15 -0800, Joshua Shinavier wrote:
> Initial thought: if the ASCII art syntax is Cypher-like, why not make
> it openCypher proper? I.e. keep match() as it is, but generalize the
> cypher() step out of Neo4jGraph, with native Neo4j evaluation of Cypher
> as an optimization.
> 
> Josh
> 
> 
> On Thu, Feb 4, 2021 at 2:17 PM David Bechberger 
> wrote:
> 
> > Over the years of working with Gremlin I have found the match()
> > step is difficult to create traversals with and even more difficult
> > to make work efficiently.  While the imperative style of programming
> > in Gremlin provides a powerful path finding mechanism it really lacks
> > an easy way to perform pattern matching queries.  It would be great
> > if we could simplify the match() step to enable users to easily
> > generate these pattern matching traversals.
> > 
> > To accomplish this I was wondering what adding support for a subset
> > of motif/ascii art patterns to the match step might look like.  These
> > types of patterns are very easy for people to understand and I think
> > the ability to combine this pattern matching syntax with the powerful
> > path finding and formatting features of Gremlin would make a powerful
> > combination.
> > 
> > To accomplish this I am suggesting supporting a subset of potential
> > patterns.  The two most common examples of this sort of pattern out
> > there are the openCypher type style and the style used by GraphX.  I
> > have provided a few examples below of what this syntax might look
> > like:
> > 
> > e.g. openCypher style
> > 
> > Find me everything within one hop
> > g.V().match("()-[]->()")
> > 
> > Find me everything within one hop of a Person vertex
> > g.V().match("(p:Person)-[]->()")
> > 
> > Find me all Companies within one hop of a Person vertex
> > g.V().match("(p:Person)-[]->(c:Company)")
> > 
> > Find me all Companies within one hop of a Person vertex with an
> > Employed_at edge
> > g.V().match("(p:Person)-[e:employed_at]->(c:Company)")
> > 
> > 
> > The other option would be to take more of a hybrid approach and use
> > only the basic art/motifs like GraphX and apply the additional
> > filtering in a hybrid type of mode like this:
> > 
> > Find me all Companies within one hop of a Person vertex with an
> > Employed_at edge
> > g.V().match("(p)-[e]->(c)",
> >     __.as('p').hasLabel('Person'),
> >     __.as('e').hasLabel('employed_at'),
> >     __.as('c').hasLabel('Company')
> > )
> > 
> > This also has the potential to enable some significantly more
> > complex patterns like "Find me all Companies within one hop of a
> > Person vertex with an Employed_at edge who also worked at Foo"
> > g.V().match("(p)-[e]->(c)",
> >     __.as('p').hasLabel('Person').out('employed_at').has('Company', 'name', 'Foo'),
> >     __.as('e').hasLabel('employed_at'),
> >     __.as('c').hasLabel('Company')
> > )
> > 
> > Thoughts?
> > 
> > Dave
> > 
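
One way to read the two styles quoted above is that the openCypher-style
motif is sugar for the hybrid form. The sketch below, in plain Java with
no TinkerPop dependency, turns a single one-hop motif into the label
filter clauses of the hybrid form. The class name `MotifDesugarer`, the
supported motif shape and the generated strings are assumptions for
illustration only; none of this is an existing or proposed TinkerPop API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: desugars "(a:Label)-[e:label]->(b:Label)" into hybrid-style filters.
final class MotifDesugarer {
    // One element: optional alias followed by an optional ":label".
    private static final String ELEMENT = "(\\w*)(?::(\\w+))?";
    private static final Pattern ONE_HOP = Pattern.compile(
            "\\(" + ELEMENT + "\\)-\\[" + ELEMENT + "\\]->\\(" + ELEMENT + "\\)");

    static List<String> labelFilters(String motif) {
        Matcher m = ONE_HOP.matcher(motif);
        if (!m.matches()) {
            throw new IllegalArgumentException("unsupported motif: " + motif);
        }
        List<String> filters = new ArrayList<>();
        // Groups come in (alias, label) pairs: source, edge, target.
        for (int group = 1; group <= 5; group += 2) {
            String alias = m.group(group);
            String label = m.group(group + 1);
            if (label == null) {
                continue; // no label constraint on this element
            }
            if (alias.isEmpty()) {
                throw new IllegalArgumentException("label without alias in: " + motif);
            }
            filters.add("__.as('" + alias + "').hasLabel('" + label + "')");
        }
        return filters;
    }

    public static void main(String[] args) {
        for (String filter : labelFilters("(p:Person)-[e:employed_at]->(c:Company)")) {
            System.out.println(filter);
        }
    }
}
```

For the fully labelled motif this yields the same three `__.as(...)`
clauses as the hybrid example, and for "()-[]->()" it yields none.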



Re: [DISCUSS] ANTLR and gremlin-script

2021-01-07 Thread pieter gmail
+1 

I have often thought the language specification should be a project
separate from the implementations, and done in a formal but plain
English format similar to OMG or IETF specifications.

I suspect Sqlg's code base would have been vastly different if it had
evolved from a grammar instead of an api.

Cheers
Pieter

On Thu, 2020-12-24 at 14:41 -0500, Stephen Mallette wrote:
> As a project, over the years, we've often been asked the question as
> to why Gremlin doesn't have an ANTLR style grammar. There have been
> varying answers over the years to explain the reasoning but in recent
> years I've started to see where our dependence on Java for driving
> Gremlin design has not translated well as we have expanded Gremlin into
> other programming ecosystems. Using Java has often allowed idioms of
> that language to leak into Gremlin itself which introduces friction
> when implemented outside of the JVM. I think that there is some
> advantage to designing Gremlin more with just graphs/usage in mind and
> then determining how that design choice looks in each programming
> language.
> 
> I think that using an ANTLR grammar to drive that design work for
> Gremlin makes a lot of sense in this context. We would effectively have
> something like a gremlin-script which would become the new language
> archetype. New steps, language changes, etc. would be discussed in its
> context and then implemented in the grammar and later in each
> programming language we support in the style a developer would expect.
> An interesting upside of this approach is that we can implement
> gremlin-script in the ScriptEngine and replace
> GremlinGroovyScriptEngine which would help us strengthen our security
> story in Gremlin Server. Groovy processing would just be a fallback to
> Gremlin scripts that could not be processed by the AST. In fact users
> who didn't need Groovy could simply not install it at all and thus
> boast a much more secure system.
> 
> I think that inclusion of a grammar in our project is an exciting new
> direction for us to take and will help in a variety of areas beyond
> those I've already related.
> 
> If we like this direction, Amazon Neptune already maintains such a
> grammar and would be willing to contribute it to the project to live in
> open source. The contribution would go through the same IP Clearance
> process gremlint is going through since it was developed outside of
> TinkerPop. I'd be happy to guide that process through if we draw to
> consensus here.




Re: Apply for Index Listing

2020-11-01 Thread pieter gmail
Just had a look at HugeGraph but can't find any English documentation?
Cheers
Pieter

On Wed, 2020-10-28 at 10:44 -0400, Stephen Mallette wrote:
> I'm fine with adding this now +1 - anyone else have concerns?
> > The project must have a high-resolution logo that can be used by
> > Apache TinkerPop.
> Apache lists tend to remove images - could you please make it
> available in some other fashion?
> 
> On Wed, Oct 28, 2020 at 9:19 AM Zhang,Yi(SEC02) 
> wrote:
> > Hi, I am one owner of *HugeGraph
> > <https://github.com/hugegraph/hugegraph>*, an open source
> > *TinkerPop-enabled graph system*, sponsored by *Baidu*. My Github
> > account is zhoney (https://github.com/zhoney). I read your Provider
> > Listing Policy and consider that HugeGraph meets all requirements.
> > So, we are desiring to make HugeGraph listed on your listing. Below
> > is the fundamental state of HugeGraph corresponding to your
> > index/provider listing requirements.
> > 
> > 
> > Index Listing Requirements
> >    - The project must be either a TinkerPop-enabled graph system, a
> >      Gremlin language variant/compiler, a Gremlin language driver, or
> >      a TinkerPop-enabled middleware tool.
> >       - HugeGraph is a TinkerPop-enabled graph system.
> >    - The project must have a public URL that can be referenced by
> >      Apache TinkerPop.
> >       - Github core Project URL is:
> >         https://github.com/hugegraph/hugegraph
> >       - Github Doc URL is:
> >         https://hugegraph.github.io/hugegraph-doc/
> >       - Github Organization URL is: https://github.com/hugegraph
> >    - The project must have at least one release.
> >       - HugeGraph has 5 releases: 0.6.1, 0.7.4, 0.8.0, 0.9.2 and
> >         0.10.4.
> >         https://github.com/hugegraph/hugegraph/releases
> >    - The project must be actively developed/maintained to a current
> >      or previous "y" version of Apache TinkerPop (3.y.z).
> >       - https://github.com/hugegraph/hugegraph/blob/0e4eaa0b8675d5bcd1708813ecb4879513673512/pom.xml#L101
> >       - HugeGraph is actively developed/maintained to 3.4.3 version
> >         of Apache TinkerPop
> >    - The project must have *some* documentation and that
> >      documentation must make explicit its usage of Apache TinkerPop
> >      and its version compatibility requirements.
> >       - HugeGraph make explicit its usage of Apache TinkerPop in
> >         *Features* and *Thanks* section of Readme.md on Github
> >         homepage: https://github.com/hugegraph/hugegraph
> > 
> > Provider Listing Requirements (*extra requirements except index
> > listing*)
> >    - The project must have a homepage that is not simply a software
> >      repository page.
> >       - https://hugegraph.github.io/hugegraph-doc/
> >    - The project must have a high-resolution logo that can be used
> >      by Apache TinkerPop.
> >       -
> > 
> > 
> > FYI:
> > 1.  HugeGraph has got strong support in technical field and open
> > source field from our company Baidu, which is one of gold sponsors of
> > Apache Software Foundation.
> > 2.  HugeGraph's open source version has more than 50 enterprise
> > users in China, spread all over banking business, securities
> > industry, risk management, knowledge graph, etc.
> > 3.  Our architect contribute to tinkerpop some PRs and codes:
> > https://github.com/javeme?tab=overview=2020-04-01=2020-04-30=apache,
> > his github account is javame(https://github.com/javeme)
> > [image: cid:image001.png@01D6AD54.38626940]
> > 
> > 
> > 
> > 
> > Any response is appreciated!
> > 
> > 
> > Yours sincerely,
> > zhoney


Re: [DISCUSS] Review Process

2018-07-10 Thread pieter gmail

Hi,

I feel like the project has become a bit too big and dispersed. A large
portion of the emails, jira or otherwise, is irrelevant to my
interests/time/work.


Perhaps for version 4, TinkerPop could be broken up into more focused 
projects with their own jira/email/process management.


gremlin-language
gremlin-server
js-driver
python-driver
java-driver
.net-driver
reference implementation
...

Thanks
Pieter





On 10/07/2018 22:01, Jason Plurad wrote:

Thanks for starting this conversation, Stephen. Lots of interesting tidbits
here, and perhaps some we can apply to other OSS projects.


I'm not sure if committers/PMC members have just not had time to do
reviews or have not felt comfortable doing them

Probably a combination of both, especially with the GLVs.


I personally chase votes in the background to get PRs to merge.and, I
don't want to do that anymore.

Amazing that you did that, but I agree that nagging is not a great path
forward.


it is perfectly fine to review/VOTE in the following manner (as examples)

It'd be great to have these examples added to the maintainer guidelines.
When I do code reviews, sometimes I feel like one-liner votes are a bit of
a cop out, but having examples like this would lower the mental hurdle to
getting started on reviewing.


It would also be nice for non-committers to do reviews - i don't know how
to encourage that behavior though.

I agree on this, and it would be particularly important on areas of the
code where we only have one primary committer, like each GLV. If we come to
agreement on a new policy, I'd suggest that if we get the docs written up
and published, then we can mention it on gremlin-users sort of as a heads
up to those interested in getting more involved. Their participation helps
drive out releases, and new releases attract more users.

Regarding the proposal, a single binding +1 from a committer with a 1 week
lazy consensus sounds fine to me. If the contribution is a major feature or
significant change, the expectation is that the committer realizes this and
holds it open for 3 votes before committing.



On Tue, Jul 10, 2018 at 1:46 PM, Stephen Mallette 
wrote:


Good point, Ted - that wasn't clear and in truth I didn't think that
through well. I think we could say that the +1 would come from a
committer. If the committer and submitter are one and the same then it has
its single VOTE and technically, the PR just goes through the week long
cooling period and could be merged after that. In the event the PR
submitter is not a committer then they would require at least one committer
to be on board with a +1 and a week long wait.

Ideally, I think we can trust committers enough to do smart things with
this change in process. I would hope that a committer who submits a PR that
is especially complex and touches scary code parts or breaks user/provider
APIs requests an actual review from another committer who might be able to
spot some problems even if the 1 week cool down passes by. I don't want to
subvert the good things that reviews do, but I don't want them holding up
simple code changes either. I'd really like it if we introduced this change
and we still got multiple +1s on PRs. It would also be nice for
non-committers to do reviews - i don't know how to encourage that behavior
though.




On Tue, Jul 10, 2018 at 1:26 PM Ted Wilmes  wrote:


I fell way off the PR review train, I'll get back on. For clarification, is
that a +1 on top of the submitter +1? I'm thinking you
all just meant the submitter's +1 would be adequate after the lazy
consensus period but wanted to be sure. I'd be fine moving forward with
that. My impression is that with the folks involved, if a submitter feels
that another set of eyes is really required and lazy consensus is not
adequate, regardless of the policy, that review will be sought and
performed prior to merge.

--Ted

On Tue, Jul 10, 2018 at 11:44 AM Stephen Mallette 
wrote:


   It looks like it's disabled for this project.

I don't think we can use the GitHub integration without getting off our
Apache Mirror (which we've discussed, but not really pulled the trigger on
for no particular reason other than the hassle of changing everything).


   Does it have to be in that order?

I was thinking that as long as there is a single +1 at any time in that
week (or after that week) then it would be good to merge.

On Tue, Jul 10, 2018 at 12:36 PM Robert Dale
wrote:

There might be a better alternative to privately nagging ;-)  GitHub has a
feature on the sidebar that can be used to request reviews from individuals
or groups. The heading has 'Reviewers' and, when it's active, has a gear
icon to select people.  GitHub will then email the reviewers with the
request.  It looks like it's disabled for this project.

I like the idea of adding the option of having a single vote and a week to
soak.  Does it have to be in that order?  Or can the

Re: CustomId serialization

2018-06-25 Thread pieter gmail

Ok, ah well I briefly searched for "@class" magic and did not find it.
I find Jackson's docs surprisingly bad.

Anyway no matter, it works for now.

Thanks
Pieter

On 25/06/2018 18:48, Stephen Mallette wrote:

I think - "think" being the key word - that Jackson parses that CLASS to
determine the deserializer to use and then hands your deserializer the
contents of the rest of the JSON (which is all the deserializer needs once
the right one is chosen).
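
The flow described above can be sketched in plain Java. This is only an illustration of the idea (read the type id first, pick a deserializer, hand it the rest), not Jackson's actual code; the registry, the "example.RecordId" type id and the read() helper are hypothetical names invented for this sketch.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class TypeIdDispatchSketch {
    // Stand-in for GraphSONTokens.CLASS, using the "@class" key mentioned in this thread.
    static final String CLASS = "@class";

    // Hypothetical registry: type id -> deserializer that only sees the remaining fields.
    static final Map<String, Function<Map<String, Object>, Object>> DESERIALIZERS = new HashMap<>();

    static {
        // Illustrative deserializer: builds a simple string from the two remaining fields.
        DESERIALIZERS.put("example.RecordId",
                fields -> fields.get("schemaTable") + ":" + fields.get("id"));
    }

    // The framework consumes the type id first, then delegates the rest of the object.
    static Object read(Map<String, Object> json) {
        Map<String, Object> rest = new LinkedHashMap<>(json);
        Object typeId = rest.remove(CLASS);
        Function<Map<String, Object>, Object> deserializer = DESERIALIZERS.get(typeId);
        if (deserializer == null) {
            throw new IllegalArgumentException("Could not resolve type id '" + typeId + "'");
        }
        return deserializer.apply(rest);
    }

    public static void main(String[] args) {
        Map<String, Object> json = new LinkedHashMap<>();
        json.put(CLASS, "example.RecordId");
        json.put("schemaTable", "public.Person");
        json.put("id", 1L);
        System.out.println(read(json));
    }
}
```

Under this reading, a custom deserialize() never sees the type id field because it was already consumed when the deserializer was chosen, which would explain the asymmetry with serializeWithType.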

On Mon, Jun 25, 2018 at 8:08 AM pieter gmail 
wrote:


Hi,

Just managed to get it to work, but not really sure what's going on.

So Sqlg's RecordId itself consists of a SchemaTable and a Long. Both
RecordId and SchemaTable have serialization code.

The part I don't quite get is that serializeWithType and deserialize are
not symmetrical.
Here is RecordId's serialization code.

  @Override
  public void serializeWithType(final RecordId recordId, final JsonGenerator jsonGenerator,
          final SerializerProvider serializerProvider, final TypeSerializer typeSerializer)
          throws IOException, JsonProcessingException {
      jsonGenerator.writeStartObject();
      jsonGenerator.writeStringField(GraphSONTokens.CLASS, RecordId.class.getName());
      jsonGenerator.writeObjectField("schemaTable", recordId.getSchemaTable());
      jsonGenerator.writeNumberField("id", recordId.getId());
      jsonGenerator.writeEndObject();
  }

  @Override
  public RecordId deserialize(final JsonParser jsonParser, final DeserializationContext deserializationContext)
          throws IOException, JsonProcessingException {
      org.apache.tinkerpop.shaded.jackson.core.JsonToken jsonToken = jsonParser.nextToken();
      Preconditions.checkState(JsonToken.START_OBJECT == jsonToken);
      SchemaTable schemaTable = deserializationContext.readValue(jsonParser, SchemaTable.class);
      jsonToken = jsonParser.nextToken();
      Preconditions.checkState(org.apache.tinkerpop.shaded.jackson.core.JsonToken.FIELD_NAME == jsonToken);
      Preconditions.checkState("id".equals(jsonParser.getValueAsString()));
      jsonToken = jsonParser.nextToken();
      Preconditions.checkState(JsonToken.VALUE_NUMBER_INT == jsonToken);
      long id = jsonParser.getValueAsLong();
      jsonToken = jsonParser.nextToken();
      Preconditions.checkState(org.apache.tinkerpop.shaded.jackson.core.JsonToken.END_OBJECT == jsonToken);
      return RecordId.from(schemaTable, id);
  }

What happened to the GraphSONTokens.CLASS ?
I was expecting to have to read that also but somewhere I have lost the
flow.

Just to reiterate, it is working now and all the tests are passing, so
it's more of an informational question.

Thanks
Pieter


On 25/06/2018 13:38, Stephen Mallette wrote:

I would think that you could write your own custom deserializer if you
needed to. That error doesn't give me any hints as to what might be wrong
exactly. I can't think of why that wouldn't work, but even with a little
refresh by looking at the code just now, my memory on GraphSON 1.0 is fuzzy.

Maybe you could try to modify the working test in TinkerPop to include a
deserializer and see if you get a similar error for your efforts? Perhaps
that would help yield a clue?

On Mon, Jun 25, 2018 at 2:58 AM pieter gmail 
wrote:


Hi,

I am trying to upgrade Sqlg to 3.3.3 from 3.3.1.

The only tests that are failing are the io tests for graphson V1.

I see CustomId has a CustomIdJacksonSerializerV1d0 but not a
deserializer. Looks like Jackson is using reflection to instantiate the
CustomId and set its cluster and elementId.
Is this how it must be or can it work with a deserializer? Sqlg's
RecordId does not have default constructors.

For Sqlg I added the standard deserializer but it fails with.

org.apache.tinkerpop.shaded.jackson.databind.exc.InvalidTypeIdException:
Could not resolve type id 'org.umlg.sqlg.structure.SchemaTable' as a
subtype of [map type; class java.util.LinkedHashMap, [simple type, class
java.lang.Object] -> [simple type, class java.lang.Object]]: Not a subtype
 at [Source: (ByteArrayInputStream); line: 1, column: 105] (through
reference chain: java.util.HashMap["id"])

   at org.apache.tinkerpop.shaded.jackson.databind.exc.InvalidTypeIdException.from(InvalidTypeIdException.java:43)
   at org.apache.tinkerpop.shaded.jackson.databind.DeserializationContext.invalidTypeIdException(DeserializationContext.java:1628)
   at org.apache.tinkerpop.shaded.jackson.databind.DatabindContext.resolveSubType(DatabindContext.java:200)
   at org.apache.tinkerpop.shaded.jackson.databind.jsontype.impl.ClassNameIdResolver._typeFromId(ClassNameIdResolver.java:49)
   at org.apache.tinkerpop.shaded.jackson.databind.jsontype.

Re: CustomId serialization

2018-06-25 Thread pieter gmail

Hi,

Just managed to get it to work, but not really sure what's going on.

So Sqlg's RecordId itself consists of a SchemaTable and a Long. Both
RecordId and SchemaTable have serialization code.

The part I don't quite get is that serializeWithType and deserialize are
not symmetrical.

Here is RecordId's serialization code.

    @Override
    public void serializeWithType(final RecordId recordId, final JsonGenerator jsonGenerator,
            final SerializerProvider serializerProvider, final TypeSerializer typeSerializer)
            throws IOException, JsonProcessingException {
        jsonGenerator.writeStartObject();
        jsonGenerator.writeStringField(GraphSONTokens.CLASS, RecordId.class.getName());
        jsonGenerator.writeObjectField("schemaTable", recordId.getSchemaTable());
        jsonGenerator.writeNumberField("id", recordId.getId());
        jsonGenerator.writeEndObject();
    }

    @Override
    public RecordId deserialize(final JsonParser jsonParser, final DeserializationContext deserializationContext)
            throws IOException, JsonProcessingException {
        org.apache.tinkerpop.shaded.jackson.core.JsonToken jsonToken = jsonParser.nextToken();
        Preconditions.checkState(JsonToken.START_OBJECT == jsonToken);
        SchemaTable schemaTable = deserializationContext.readValue(jsonParser, SchemaTable.class);
        jsonToken = jsonParser.nextToken();
        Preconditions.checkState(org.apache.tinkerpop.shaded.jackson.core.JsonToken.FIELD_NAME == jsonToken);
        Preconditions.checkState("id".equals(jsonParser.getValueAsString()));
        jsonToken = jsonParser.nextToken();
        Preconditions.checkState(JsonToken.VALUE_NUMBER_INT == jsonToken);
        long id = jsonParser.getValueAsLong();
        jsonToken = jsonParser.nextToken();
        Preconditions.checkState(org.apache.tinkerpop.shaded.jackson.core.JsonToken.END_OBJECT == jsonToken);
        return RecordId.from(schemaTable, id);
    }

What happened to the GraphSONTokens.CLASS ?
I was expecting to have to read that also but somewhere I have lost the 
flow.


Just to reiterate, it is working now and all the tests are passing, so
it's more of an informational question.


Thanks
Pieter


On 25/06/2018 13:38, Stephen Mallette wrote:

I would think that you could write your own custom deserializer if you
needed to. That error doesn't give me any hints as to what might be wrong
exactly. I can't think of why that wouldn't work, but even with a little
refresh by looking at the code just now, my memory on GraphSON 1.0 is fuzzy.

Maybe you could try to modify the working test in TinkerPop to include a
deserializer and see if you get a similar error for your efforts? Perhaps
that would help yield a clue?

On Mon, Jun 25, 2018 at 2:58 AM pieter gmail 
wrote:


Hi,

I am trying to upgrade Sqlg to 3.3.3 from 3.3.1.

The only tests that are failing are the io tests for graphson V1.

I see CustomId has a CustomIdJacksonSerializerV1d0 but not a
deserializer. Looks like Jackson is using reflection to instantiate the
CustomId and set its cluster and elementId.
Is this how it must be or can it work with a deserializer? Sqlg's
RecordId does not have default constructors.

For Sqlg I added the standard deserializer but it fails with.

org.apache.tinkerpop.shaded.jackson.databind.exc.InvalidTypeIdException:
Could not resolve type id 'org.umlg.sqlg.structure.SchemaTable' as a
subtype of [map type; class java.util.LinkedHashMap, [simple type, class
java.lang.Object] -> [simple type, class java.lang.Object]]: Not a subtype
   at [Source: (ByteArrayInputStream); line: 1, column: 105] (through
reference chain: java.util.HashMap["id"])

  at org.apache.tinkerpop.shaded.jackson.databind.exc.InvalidTypeIdException.from(InvalidTypeIdException.java:43)
  at org.apache.tinkerpop.shaded.jackson.databind.DeserializationContext.invalidTypeIdException(DeserializationContext.java:1628)
  at org.apache.tinkerpop.shaded.jackson.databind.DatabindContext.resolveSubType(DatabindContext.java:200)
  at org.apache.tinkerpop.shaded.jackson.databind.jsontype.impl.ClassNameIdResolver._typeFromId(ClassNameIdResolver.java:49)
  at org.apache.tinkerpop.shaded.jackson.databind.jsontype.impl.ClassNameIdResolver.typeFromId(ClassNameIdResolver.java:44)
  at org.apache.tinkerpop.shaded.jackson.databind.jsontype.impl.TypeDeserializerBase._findDeserializer(TypeDeserializerBase.java:156)
  at org.apache.tinkerpop.shaded.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:113)
  at org.apache.tinkerpop.shaded.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:97)
  at org.apache.tinkerpop.shaded.jackson.databind.deser.std.MapDeserializer.d

RE: CustomId serialization

2018-06-25 Thread pieter gmail

Hi,

I am trying to upgrade Sqlg to 3.3.3 from 3.3.1.

The only tests that are failing are the io tests for graphson V1.

I see CustomId has a CustomIdJacksonSerializerV1d0 but not a 
deserializer. Looks like Jackson is using reflection to instantiate the 
CustomId and set its cluster and elementId.
Is this how it must be or can it work with a deserializer? Sqlg's 
RecordId does not have default constructors.


For Sqlg I added the standard deserializer but it fails with.

org.apache.tinkerpop.shaded.jackson.databind.exc.InvalidTypeIdException: 
Could not resolve type id 'org.umlg.sqlg.structure.SchemaTable' as a 
subtype of [map type; class java.util.LinkedHashMap, [simple type, class 
java.lang.Object] -> [simple type, class java.lang.Object]]: Not a subtype
 at [Source: (ByteArrayInputStream); line: 1, column: 105] (through 
reference chain: java.util.HashMap["id"])


    at org.apache.tinkerpop.shaded.jackson.databind.exc.InvalidTypeIdException.from(InvalidTypeIdException.java:43)
    at org.apache.tinkerpop.shaded.jackson.databind.DeserializationContext.invalidTypeIdException(DeserializationContext.java:1628)
    at org.apache.tinkerpop.shaded.jackson.databind.DatabindContext.resolveSubType(DatabindContext.java:200)
    at org.apache.tinkerpop.shaded.jackson.databind.jsontype.impl.ClassNameIdResolver._typeFromId(ClassNameIdResolver.java:49)
    at org.apache.tinkerpop.shaded.jackson.databind.jsontype.impl.ClassNameIdResolver.typeFromId(ClassNameIdResolver.java:44)
    at org.apache.tinkerpop.shaded.jackson.databind.jsontype.impl.TypeDeserializerBase._findDeserializer(TypeDeserializerBase.java:156)
    at org.apache.tinkerpop.shaded.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:113)
    at org.apache.tinkerpop.shaded.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:97)
    at org.apache.tinkerpop.shaded.jackson.databind.deser.std.MapDeserializer.deserializeWithType(MapDeserializer.java:400)
    at org.apache.tinkerpop.shaded.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:68)
    at org.apache.tinkerpop.shaded.jackson.databind.DeserializationContext.readValue(DeserializationContext.java:759)
    at org.apache.tinkerpop.shaded.jackson.databind.DeserializationContext.readValue(DeserializationContext.java:746)
    at org.umlg.sqlg.structure.RecordId$RecordIdJacksonDeserializerV1d0.deserialize(RecordId.java:205)


Any ideas as to how I should implement this?

Thanks
Pieter





Re: [DISCUSS] Depth First Repeat step

2018-04-28 Thread pieter gmail

Hi,

Is the objection to order(SearchAlgo) that it overloads order(), or is it an
objection to specifying DFS/BFS in the traversal itself?
If the latter, I do not really see how it is misplaced from a usability/API
perspective. Seems pretty natural to me and very graphy at that.


As mentioned earlier I am not a fan of making the user deal with
strategies. Strategies, to my mind, are not part of Gremlin as a
language specification. You mention LazyBarrierStrategy, yet it is an
internal strategy of TinkerPop. Sqlg removes it and passes all tests
without me even being aware anymore of whether TinkerPop has a
LazyBarrierStrategy or not. I see this as a strength of TinkerPop.
Strategies can evolve over time without having an effect on the
specification or on users' codebases out there.


Here is where my thoughts have gone:

g.V().anythingTraversal(anyTraversal()).traverse(DFS/BFS)

This should apply to all traversals, not just RepeatStep. e.g. 
g.V().out().out().traverse(DFS).
If no traverse(DFS/BFS) is specified then the graph provider can do 
whatever they want, DFS, BFS or a mix of the two.
If however traverse(BFS/DFS) is specified then the graph provider must 
follow the directive and the test suite will test for it.


traverse(DFS/BFS) is like any other step in that it can be
specified mid traversal. Everything before it must follow the directive,
everything after it need not.


This will also be backward compatible which is kinda nice.

Cheers
Pieter

On 27/04/2018 21:03, Stephen Mallette wrote:

It seems like we have general agreement on the easy things, that is:

1. this is a change for 3.4.0/master and
2. we're all for a DFS option

but we still have the hard part of having to come to consensus on how it's
used/implemented. The quick summary of this thread in that regard goes
something like this: We currently have this PR that introduces DFS, but
does so as a configuration per repeat() step. From what I gather a good
many of us seem to find that approach undesirable for one or more of the
following reasons:

1. The use of order() seems misplaced purely from a usability/API
perspective
2. The approach seems to be at odds with how everything else works given
barrier() and strategies
3. The approach seems to be at odds with our current mixed mode of DFS/BFS

I think that we can see those issues resolve themselves with something
Kuppitz mentioned to me: repeat() should be DFS by default where barrier()
will change that behavior as required. That change would yield the
following approaches:

Full BFS: manually add `barrier()`'s
Mixed mode: Default, let strategies do their thing OR remove strategies and
manually add your own barrier()
Full DFS: execute `.withoutStrategies(Lazy...)`

Furthermore, we probably should have some form of verification strategy
that ensures all BFS or all DFS so that users can't get tricked along the
way. It's not enough to just remove LazyBarrierStrategy to get DFS if
another strategy comes along and throws in a barrier().

So if all that sounds good from a usability perspective, then we get all
three modes that we want using existing traversal semantics which removes
the three concerns I've summarized from this thread. We also get Keith's
desire to have control over which part of a traversal is BFS/DFS if users
want that capability because they can do a manual Mixed Mode and add their
own barrier() to control the flow. For Pieter (or any graph provider)
nothing really changes and there is opportunity to control flow with
strategies as usual.

I haven't really focused much on what's involved in adapting the current
work in the PR to this approach as I more wanted to find the common ground
among all the people who commented on the thread. If we agree that this is
a nice way to go, then we can think more about "how" it could happen.

Keith, I saw you mention earlier that:


  The barrier step that Daniel described doesn’t currently work since
there’s basically booleans in the RepeatStep on whether or not to stash the
starts to make the RepeatStep depth first.

I presume that would be some source of technical derailment to this
approach.




On Tue, Apr 24, 2018 at 3:05 PM, Keith Lohnes <lohn...@gmail.com> wrote:


Yeah, that's what I meant. The steps inside are replaced with some
JanusGraph stuff.

Cheers,
Keith


On Tue, Apr 24, 2018 at 1:52 PM pieter gmail <pieter.mar...@gmail.com>
wrote:


Nah, that looks to me like the RepeatStep survived. Just the nested
VertexStep got replaced with JanusGraphVertexStep.
Good for them, first prize is not replacing anything.

Cheers
Pieter

On 24/04/2018 19:50, Keith Lohnes wrote:

It looks like it,
`g.V().has("foo", "bar").repeat(out()).emit().explain()`

yields

`[JanusGraphStep([],[foo.eq(bar)]),
RepeatStep([JanusGraphVertexStep(OUT,vertex),
RepeatEndStep],until(false),emit(true))]`



On Tue, Apr 24, 2018 at 12:12 PM pieter gmail <pieter.mar...@gmail.com>
wrote:


Hi,

Sqlg complet

Re: [DISCUSS] Depth First Repeat step

2018-04-24 Thread pieter gmail
Nah, that looks to me like the RepeatStep survived. Just the nested 
VertexStep got replaced with JanusGraphVertexStep.

Good for them, first prize is not replacing anything.

Cheers
Pieter

On 24/04/2018 19:50, Keith Lohnes wrote:

It looks like it,
`g.V().has("foo", "bar").repeat(out()).emit().explain()`

yields

`[JanusGraphStep([],[foo.eq(bar)]),
RepeatStep([JanusGraphVertexStep(OUT,vertex),
RepeatEndStep],until(false),emit(true))]`



On Tue, Apr 24, 2018 at 12:12 PM pieter gmail <pieter.mar...@gmail.com>
wrote:


Hi,

Sqlg completely replaces TinkerPop's RepeatStep. The idea being that
with g.V().repeat(out()).times(x) only x round trips to the db are needed
regardless of the size of the graph. Each time it will go to the db with
the full set of the previous step's incoming starts.

But yeah TinkerPop's implementation is always the starting point so I'll
definitely have a look at how you have implemented DFS.

BTW, does JanusGraph use TinkerPop's default RepeatStep as is with no
optimization strategies?

Cheers
Pieter

On 24/04/2018 16:33, Keith Lohnes wrote:

Pieter,

If you take a look at https://github.com/apache/tinkerpop/pull/838 DFS is
implemented as a modification to BFS. It's taking the starts that come in
from a BFS and stashing them to be processed later. I haven't seen a big
performance difference on JanusGraph; At least for the queries that I've
been running with it. I'm not terribly familiar with Sqlg, but I wonder if
in the context of how DFS is implemented there, it may be less of a
concern.

Cheers,
Keith

On Thu, Apr 19, 2018 at 12:46 PM pieter gmail <pieter.mar...@gmail.com>
wrote:


Hi,

Not really sure either what 'global' means technically with respect to
TinkerPop's current configuration support.
Perhaps it can be the start of a global configuration registry that can
be overridden per traversal.

I get that DFS is the preferred default but for Sqlg the performance
impact is so great that I'd rather, if possible have BFS as its default.

I am not sure about this, but I reckon that for any graph where the
TinkerPop VM is not running in the same process space as the actual
graph/data, latency is a big issue. BFS alleviates the latency issue
significantly.

Cheers
Pieter



On 19/04/2018 14:49, Keith Lohnes wrote:

whether this will affect more than just repeat()?

The PR that I started, and the intent of this thread, was to only affect
repeat.


I prefer that the semantics of the traversal be specified in the traversal
as a first class citizen.

+1


I am fine with any default but am wondering whether it would be
worthwhile for the default to be overridden at a global level and not just
per traversal

It might be nice, but I guess I'm not sure what `global` means in this
context. Configured on the graph object?

On Tue, Apr 17, 2018 at 9:28 PM Daniel Kuppitz <m...@gremlin.guru> wrote:

TinkerPop makes no guarantees about the order of elements unless you
specify an explicit order. This also goes back to the fact that certain
strategies (LazyBarrier-, RepeatUnroll- and PathRetractionStrategy) add
NoOpBarrierSteps to your traversal, which ultimately turns it into a
DFS/BFS mix. Check the .explain() output of your traversal to see which
strategy adds which steps.

Cheers,
Daniel


On Tue, Apr 17, 2018 at 4:45 PM, Michael Pollmeier <
mich...@michaelpollmeier.com> wrote:


Also it seems to me that DFS only really applies to repeat() with an
emit().
g.V().hasLabel("A").repeat(out()).times(2) gets rewritten as
g.V().hasLabel("A").out().out(). Are there subtleties that I am not
aware of or does DFS vs BFS not matter in this case?

When I read this I thought: clearly `.out().out()` is DFS for OLTP,
that's also what the documentation says, e.g. in this nice infographic
http://tinkerpop.apache.org/docs/current/images/oltp-vs-olap.png

However, looks like that's not the case. Has my life been a lie? Setting
up a simple flat graph to make things more obvious:
v3 <- v1 <- v0 -> v2 -> v4

```
graph = TinkerGraph.open()
v0 = graph.addVertex("l0")
v1 = graph.addVertex("l1")
v2 = graph.addVertex("l1")
v3 = graph.addVertex("l2")
v4 = graph.addVertex("l2")
v0.addEdge("e", v2)
v2.addEdge("e", v4)
v0.addEdge("e", v1)
v1.addEdge("e", v3)
g = graph.traversal()
g.V(v0).out().sideEffect{println(it)}.out().sideEffect{println(it)}
```

Prints:
v[2]
v[1]
v[4]
==>v[4]
v[3]
==>v[3]

If this was OLTP the output would be:
v[2]
v[4]
==>v[4]
v[1]
v[3]
==>v[3]

Cheers
Michael

On 18/04/18 02:58, pieter gmail wrote:

Hi,

I agree with the question about whether this will affect more than just
repeat()?

I prefer that the semantics of the traversal be specified in the
traversal as a first class citizen. i.e. with order(SearchAlgo).
Strategies are to my mind internal to an implementation. In Robert's
example Lazy

Re: [DISCUSS] Depth First Repeat step

2018-04-24 Thread pieter gmail

Hi,

Sqlg completely replaces TinkerPop's RepeatStep. The idea being that 
with g.V().repeat(out()).times(x) only x round trips to the db are needed
regardless of the size of the graph. Each time it will go to the db with 
the full set of the previous step's incoming starts.
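
To make the round trip argument concrete, here is a minimal sketch in plain Java. It is not Sqlg's implementation; the toy graph, the queryOut() helper and the roundTrips counter are hypothetical stand-ins for a database call. It only compares how many calls are made when each iteration sends the whole set of incoming starts at once versus one call per traverser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class RoundTripSketch {
    // Hypothetical toy graph: a binary tree of depth 2 rooted at vertex 0.
    static final Map<Integer, List<Integer>> OUT = Map.of(
            0, List.of(1, 2),
            1, List.of(3, 4),
            2, List.of(5, 6),
            3, List.of(), 4, List.of(), 5, List.of(), 6, List.of());

    // Counts simulated database calls.
    static int roundTrips = 0;

    // One simulated query: out-neighbours for every vertex in the batch.
    static List<Integer> queryOut(List<Integer> batch) {
        roundTrips++;
        List<Integer> result = new ArrayList<>();
        for (Integer v : batch) {
            result.addAll(OUT.get(v));
        }
        return result;
    }

    // Level at a time: the whole set of incoming starts goes to the db once per iteration.
    static List<Integer> batched(int start, int times) {
        List<Integer> frontier = List.of(start);
        for (int i = 0; i < times; i++) {
            frontier = queryOut(frontier);
        }
        return frontier;
    }

    // One traverser at a time: every traverser issues its own call at every hop.
    static List<Integer> perTraverser(int start, int times) {
        if (times == 0) {
            return List.of(start);
        }
        List<Integer> result = new ArrayList<>();
        for (Integer next : queryOut(List.of(start))) {
            result.addAll(perTraverser(next, times - 1));
        }
        return result;
    }

    public static void main(String[] args) {
        roundTrips = 0;
        System.out.println(batched(0, 2) + " in " + roundTrips + " round trips");
        roundTrips = 0;
        System.out.println(perTraverser(0, 2) + " in " + roundTrips + " round trips");
    }
}
```

On this tiny tree the batched version needs 2 calls for times(2) while the per-traverser version needs 3, and the gap grows with the fan-out of the graph, whereas the batched count stays at x.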


But yeah TinkerPop's implementation is always the starting point so I'll 
definitely have a look at how you have implemented DFS.


BTW, does JanusGraph use TinkerPop's default RepeatStep as is with no
optimization strategies?


Cheers
Pieter

On 24/04/2018 16:33, Keith Lohnes wrote:

Pieter,

If you take a look at https://github.com/apache/tinkerpop/pull/838 DFS is
implemented as a modification to BFS. It's taking the starts that come in
from a BFS and stashing them to be processed later. I haven't seen a big
performance difference on JanusGraph; At least for the queries that I've
been running with it. I'm not terribly familiar with Sqlg, but I wonder if
in the context of how DFS is implemented there, it may be less of a
concern.

Cheers,
Keith

On Thu, Apr 19, 2018 at 12:46 PM pieter gmail <pieter.mar...@gmail.com>
wrote:


Hi,

Not really sure either what 'global' means technically with respect to
TinkerPop's current configuration support.
Perhaps it can be the start of a global configuration registry that can
be overridden per traversal.

I get that DFS is the preferred default but for Sqlg the performance
impact is so great that I'd rather, if possible have BFS as its default.

I am not sure about this, but I reckon that for any graph where the
TinkerPop VM is not running in the same process space as the actual
graph/data, latency is a big issue. BFS alleviates the latency issue
significantly.

Cheers
Pieter



On 19/04/2018 14:49, Keith Lohnes wrote:

whether this will affect more than just repeat()?

The PR that I started, and the intent of this thread, was to only affect
repeat.


I prefer that the semantics of the traversal be specified in the traversal
as a first class citizen.

+1


I am fine with any default but am wondering whether it would be
worthwhile for the default to be overridden at a global level and not just
per traversal

It might be nice, but I guess I'm not sure what `global` means in this
context. Configured on the graph object?

On Tue, Apr 17, 2018 at 9:28 PM Daniel Kuppitz <m...@gremlin.guru> wrote:


TinkerPop makes no guarantees about the order of elements unless you
specify an explicit order. This also goes back to the fact that certain
strategies (LazyBarrier-, RepeatUnroll- and PathRetractionStrategy) add
NoOpBarrierSteps to your traversal, which ultimately turns it into a
DFS/BFS mix. Check the .explain() output of your traversal to see which
strategy adds which steps.

Cheers,
Daniel


On Tue, Apr 17, 2018 at 4:45 PM, Michael Pollmeier <
mich...@michaelpollmeier.com> wrote:


Also it seems to me that DFS only really applies to repeat() with an
emit().
g.V().hasLabel("A").repeat(out()).times(2) gets rewritten as
g.V().hasLabel("A").out().out(). Are there subtleties that I am not
aware of or does DFS vs BFS not matter in this case?

When I read this I thought: clearly `.out().out()` is DFS for OLTP,
that's also what the documentation says, e.g. in this nice infographic
http://tinkerpop.apache.org/docs/current/images/oltp-vs-olap.png

However, looks like that's not the case. Has my life been a lie? Setting
up a simple flat graph to make things more obvious:
v3 <- v1 <- v0 -> v2 -> v4

```
graph = TinkerGraph.open()
v0 = graph.addVertex("l0")
v1 = graph.addVertex("l1")
v2 = graph.addVertex("l1")
v3 = graph.addVertex("l2")
v4 = graph.addVertex("l2")
v0.addEdge("e", v2)
v2.addEdge("e", v4)
v0.addEdge("e", v1)
v1.addEdge("e", v3)
g = graph.traversal()
g.V(v0).out().sideEffect{println(it)}.out().sideEffect{println(it)}
```

Prints:
v[2]
v[1]
v[4]
==>v[4]
v[3]
==>v[3]

If this was OLTP the output would be:
v[2]
v[4]
==>v[4]
v[1]
v[3]
==>v[3]

Cheers
Michael

On 18/04/18 02:58, pieter gmail wrote:

Hi,

I agree with the question about whether this will affect more than just
repeat()?

I prefer that the semantics of the traversal be specified in the
traversal as a first class citizen. i.e. with order(SearchAlgo).
Strategies are to my mind internal to an implementation. In Robert's
example LazyBarrierStrategy may be replaced/removed by an implementation
for whatever internal reason they have.

Regarding the default, I am fine with any default but am wondering
whether it would be worthwhile for the default to be overridden at a
global level and not just per traversal? That way the impact can also be
alleviated when folks upgrade.

Also it seems to me that DFS only really applies to repeat() with an
emit().
g.V().hasLabel("A").repeat(out()).times(2) gets rewritten as
g.V().hasLabel("A").out().out(). Are there subtleties that I am not
a

Re: [DISCUSS] Depth First Repeat step

2018-04-19 Thread pieter gmail

Hi,

Not really sure either what 'global' means technically with respect to 
TinkerPop's current configuration support.
Perhaps it can be the start of a global configuration registry that can 
be overridden per traversal.


I get that DFS is the preferred default but for Sqlg the performance 
impact is so great that I'd rather, if possible have BFS as its default.


I am not sure about this, but I reckon that for any graph where the
TinkerPop VM is not running in the same process space as the actual
graph/data, latency is a big issue. BFS alleviates the latency issue
significantly.


Cheers
Pieter



On 19/04/2018 14:49, Keith Lohnes wrote:

whether this will affect more than just repeat()?

The PR that I started, and the intent of this thread, was to only affect
repeat.


I prefer that the semantics of the traversal be specified in the traversal
as a first class citizen.

+1


I am fine with any default but am wondering whether it would be
worthwhile for the default to be overridden at a global level and not just
per traversal

It might be nice, but I guess I'm not sure what `global` means in this
context. Configured on the graph object?

On Tue, Apr 17, 2018 at 9:28 PM Daniel Kuppitz <m...@gremlin.guru> wrote:


TinkerPop makes no guarantees about the order of elements unless you
specify an explicit order. This also goes back to the fact that certain
strategies (LazyBarrier-, RepeatUnroll- and PathRetractionStrategy) add
NoOpBarrierSteps to your traversal, which ultimately turns it into a
DFS/BFS mix. Check the .explain() output of your traversal to see which
strategy adds which steps.

Cheers,
Daniel


On Tue, Apr 17, 2018 at 4:45 PM, Michael Pollmeier <
mich...@michaelpollmeier.com> wrote:


Also it seems to me that DFS only really applies to repeat() with an
emit().
g.V().hasLabel("A").repeat(out()).times(2) gets rewritten as
g.V().hasLabel("A").out().out(). Are there subtleties that I am not
aware of or does DFS vs BFS not matter in this case?

When I read this I thought: clearly `.out().out()` is DFS for OLTP,
that's also what the documentation says, e.g. in this nice infographic
http://tinkerpop.apache.org/docs/current/images/oltp-vs-olap.png

However, looks like that's not the case. Has my life been a lie? Setting
up a simple flat graph to make things more obvious:
v3 <- v1 <- v0 -> v2 -> v4

```
graph = TinkerGraph.open()
v0 = graph.addVertex("l0")
v1 = graph.addVertex("l1")
v2 = graph.addVertex("l1")
v3 = graph.addVertex("l2")
v4 = graph.addVertex("l2")
v0.addEdge("e", v2)
v2.addEdge("e", v4)
v0.addEdge("e", v1)
v1.addEdge("e", v3)
g = graph.traversal()
g.V(v0).out().sideEffect{println(it)}.out().sideEffect{println(it)}
```

Prints:
v[2]
v[1]
v[4]
==>v[4]
v[3]
==>v[3]

If this was OLTP the output would be:
v[2]
v[4]
==>v[4]
v[1]
v[3]
==>v[3]
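
The two visit orders above can be reproduced without TinkerPop. The following is only a sketch in plain Java: the adjacency map mirrors the example graph (edges in insertion order), and the class and method names are hypothetical. It models "finish the first out() for all traversers, then run the second" versus "push each traverser through both steps before pulling the next".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TraversalOrderSketch {
    // Out-edges in insertion order, mirroring v3 <- v1 <- v0 -> v2 -> v4.
    static final Map<String, List<String>> OUT = new LinkedHashMap<>();

    static {
        OUT.put("v0", List.of("v2", "v1"));
        OUT.put("v1", List.of("v3"));
        OUT.put("v2", List.of("v4"));
        OUT.put("v3", List.of());
        OUT.put("v4", List.of());
    }

    // Barrier-like order: complete the first hop for every traverser, then do the second hop.
    static List<String> breadthFirst(String start) {
        List<String> firstHop = new ArrayList<>(OUT.get(start));
        List<String> visited = new ArrayList<>(firstHop);
        for (String v : firstHop) {
            visited.addAll(OUT.get(v));
        }
        return visited;
    }

    // Depth-first order: each first-hop vertex is followed to its second hop immediately.
    static List<String> depthFirst(String start) {
        List<String> visited = new ArrayList<>();
        for (String v : OUT.get(start)) {
            visited.add(v);
            visited.addAll(OUT.get(v));
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println("breadth-first: " + breadthFirst("v0"));
        System.out.println("depth-first:   " + depthFirst("v0"));
    }
}
```

The breadth-first order (v2, v1, v4, v3) matches what the sideEffect printed in the first listing, and the depth-first order (v2, v4, v1, v3) matches the second listing.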

Cheers
Michael

On 18/04/18 02:58, pieter gmail wrote:

Hi,

I agree with the question about whether this will affect more than just
repeat()?

I prefer that the semantics of the traversal be specified in the
traversal as a first class citizen. i.e. with order(SearchAlgo).
Strategies are to my mind internal to an implementation. In Robert's
example LazyBarrierStrategy may be replaced/removed by an

implementation

for whatever internal reason they have.

Regarding the default, I am fine with any default but am wondering
whether it would be worthwhile for the default to be overridden at a
global level at not just per traversal? That way the impact can also be
alleviated when folks upgrade.

Also it seems to me that DFS only really applies to repeat() with an
emit().
g.V().hasLabel("A").repeat(out()).times(2) gets rewritten as
g.V().hasLabel("A").out().out(). Are there subtleties that I am not
aware of or does DFS vs BFS not matter in this case?

Cheers
Pieter

On 17/04/2018 14:58, Robert Dale wrote:

+1 on DFS
-1 on order(SearchAlgo)

It seems like a strategy may be more appropriate. It could affect more
than just repeat(). And how would this interact with
LazyBarrierStrategy?

Maybe the default should be DFS with LazyBarrierStrategy. Then
LazyBarrierStrategy can be removed with 'withoutStrategies()' and then
it works just like everything else. I'd prefer consistency with the way
everything else works.



Robert Dale

On Tue, Apr 17, 2018 at 8:11 AM, Stephen Mallette <spmalle...@gmail.com>
wrote:


Thanks for the summary. Just a quick note - I'd not worry about the GLV
tests for now. That part should be easy to sort out. Let's first make
sure that we get clear on the other items first before digging too
deeply there.

On an administrative front, I think that this change should just go to
3.4.0/master (so it's currently targeted to the right branch actually)
as it sounds like we want DFS to be the default and that could be a
breaking change as the semantics of the traversal shift 

Re: [DISCUSS] Depth First Repeat step

2018-04-17 Thread pieter gmail
", but not a lot about other aspects like testing, api,
release branch to apply it to, etc. Is that a fair depiction of how this
work has developed so far? if so, let's use this thread to make sure
we're all on the same page as to how this change will get in on all
those sorts of issues.

btw, thanks to you and michael for sticking at this to come to something
that seems to work. looking forward to the continued discussion on this
thread.


On Mon, Apr 16, 2018 at 6:54 PM, Michael Pollmeier <
mich...@michaelpollmeier.com> wrote:


Unsurprisingly I'm also +1 for defaulting to DFS in OLTP. My feeling is
that most users currently expect it to be DFS since that's what the docs
say.

And yes, it's easy to verify the default in the test suite, once we
have agreed on what the default should be.

Cheers
Michael

On 17/04/18 04:40, pieter gmail wrote:

Hi,

I have not properly followed the previous thread. However I thought
there was going to be a way to set the default behavior as opposed to
needing to use barrier().
Is this the case or not?

For Sqlg at least it is possible to optimize BFS much more effectively
than DFS so it will be nice to have a way to set the strategy rather
than having to manually inject barriers.

Is the test suite going to enforce the BFS vs DFS?

Thanks
Pieter

On 16/04/2018 16:56, Daniel Kuppitz wrote:

+1 for DFS. If the query relies on BFS, you can always do
.repeat(barrier())...

^ This holds true as long as there's no significant difference in the
cpu+memory consumption and overall performance of the two approaches.
BFS has its advantages when it comes to bulking; an arbitrary number of
traversers on the same element is processed at the same pace as a single
traverser. I don't think we can benefit from bulking in DFS.

Cheers,
Daniel


On Mon, Apr 16, 2018 at 5:44 AM, Keith Lohnes <lohn...@gmail.com>
wrote:

As part of #838 <https://github.com/apache/tinkerpop/pull/838> there’s
some discussion around whether or not to make DFS the default for the
repeat step. On the one hand, everything else in OLTP is depth first. On
the other hand, there are likely existing traversals that depend on the
breadth first nature of repeat. My general preference is to make DFS
optional at first, and at some later date, change the default and have
that be a separate change from implementing DFS for repeat









Re: [DISCUSS] Depth First Repeat step

2018-04-16 Thread pieter gmail

Hi,

I have not properly followed the previous thread. However I thought 
there was going to be a way to set the default behavior as opposed to 
needing to use barrier().

Is this the case or not?

For Sqlg at least it is possible to optimize BFS much more effectively 
than DFS so it will be nice to have a way to set the strategy rather 
than having to manually inject barriers.


Is the test suite going to enforce the BFS vs DFS?

Thanks
Pieter

On 16/04/2018 16:56, Daniel Kuppitz wrote:

+1 for DFS. If the query relies on BFS, you can always do
.repeat(barrier())...

^ This holds true as long as there's no significant difference in the
cpu+memory consumption and overall performance of the two approaches. BFS
has its advantages when it comes to bulking; an arbitrary number of
traversers on the same element is processed at the same pace as a single
traverser. I don't think we can benefit from bulking in DFS.
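
The bulking argument can be illustrated with a small model. This is a hedged plain-Java sketch, not TinkerPop's traverser implementation: the class name, the `hop` method and the three-edge example graph are all assumptions chosen for illustration. A breadth-first hop sees the whole frontier at once, so traversers that land on the same vertex can be merged into one entry carrying a count (the bulk).

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class BulkingSketch {
    // One breadth-first hop with bulking: the key is the vertex a traverser sits on,
    // the value is how many traversers have been merged into that entry.
    static Map<String, Long> hop(Map<String, Long> frontier, Map<String, List<String>> out) {
        Map<String, Long> next = new TreeMap<>();
        for (Map.Entry<String, Long> traverser : frontier.entrySet()) {
            for (String target : out.getOrDefault(traverser.getKey(), List.of())) {
                // Traversers arriving at the same target are merged by adding their bulks.
                next.merge(target, traverser.getValue(), Long::sum);
            }
        }
        return next;
    }

    public static void main(String[] args) {
        // Hypothetical diamond: a -> b -> d and a -> c -> d.
        Map<String, List<String>> out = Map.of(
                "a", List.of("b", "c"),
                "b", List.of("d"),
                "c", List.of("d"));
        Map<String, Long> frontier = Map.of("a", 1L);
        frontier = hop(frontier, out); // b and c, each with bulk 1
        frontier = hop(frontier, out); // a single entry for d with bulk 2
        System.out.println(frontier);  // {d=2}
    }
}
```

In a strictly depth-first evaluation the two paths reach d at different times, so in this model there is no point at which they could be merged, which is the concern raised above.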

Cheers,
Daniel


On Mon, Apr 16, 2018 at 5:44 AM, Keith Lohnes <lohn...@gmail.com> wrote:


As part of #838 <https://github.com/apache/tinkerpop/pull/838> there’s
some discussion around whether or not to make DFS the default for the
repeat step. On the one hand, everything else in OLTP is depth first. On
the other hand, there are likely existing traversals that depend on the
breadth first nature of repeat. My general preference is to make DFS
optional at first, and at some later date, change the default and have
that be a separate change from implementing DFS for repeat





RE: GremlinPlugin

2018-03-03 Thread pieter gmail

Hi,

I am updating Sqlg's gremlin-console support. I have been upgrading Sqlg 
without testing the console and did not realize the changes that have 
been made there.


So I got it to work again but am not quite sure about the ImportCustomizer.

Previously there was no need to individually specify classes to import. 
Is this optional or does every class need to be individually specified? I 
see both TinkerGraph and Neo4j specify some classes but not all.


Thanks
Pieter



Re: [VOTE] TinkerPop 3.3.1 Release

2017-12-19 Thread pieter gmail

Hi,

Tested Sqlg on 3.3.1.

Custom, structured and process tests all pass.

VOTE +1

Cheers
Pieter

On 17/12/2017 18:54, Stephen Mallette wrote:

Hello,

We are happy to announce that TinkerPop 3.3.1 is ready for release.

The release artifacts can be found at this location:
 https://dist.apache.org/repos/dist/dev/tinkerpop/3.3.1/

The source distribution is provided by:
 apache-tinkerpop-3.3.1-src.zip

Two binary distributions are provided for user convenience:
 apache-tinkerpop-gremlin-console-3.3.1-bin.zip
 apache-tinkerpop-gremlin-server-3.3.1-bin.zip

The GPG key used to sign the release artifacts is available at:
 https://dist.apache.org/repos/dist/dev/tinkerpop/KEYS

The online docs can be found here:
 http://tinkerpop.apache.org/docs/3.3.1/ (user docs)
 http://tinkerpop.apache.org/docs/3.3.1/upgrade/ (upgrade docs)
 http://tinkerpop.apache.org/javadocs/3.3.1/core/ (core javadoc)
 http://tinkerpop.apache.org/javadocs/3.3.1/full/ (full javadoc)

The tag in Apache Git can be found here:

https://git-wip-us.apache.org/repos/asf?p=tinkerpop.git;a=tag;h=48c762d3e6fd6dda95b806a0eb1126989f7569ee

The release notes are available here:

https://github.com/apache/tinkerpop/blob/3.3.1/CHANGELOG.asciidoc#tinkerpop-331-release-date-december-17-2017

The [VOTE] will be open for the next 72 hours --- closing Wednesday,
December 20, 2017, at 12:00pm EST.

My vote is +1.

Thank you very much,
Stephen





Re: [DISCUSS] 3.3.1/3.2.7 December Release

2017-12-02 Thread pieter gmail

Hi,

Can you deploy 3.3.1-SNAPSHOT to the maven repository?
The last entry 
https://repository.apache.org/content/groups/snapshots/org/apache/tinkerpop/gremlin-core/3.3.1-SNAPSHOT/ 
seems to be 2017/10/26


Thanks
Pieter

On 30/11/2017 14:52, Stephen Mallette wrote:

I'd like to propose that we do a release before the end of the year. Now
that the GLV Test Suite is largely in place and we're close to having .NET
running under that suite we have a reasonable degree of confidence to make
that release official and get off the release candidate system we've been
using. It would further be good to get official releases of Gremlin Python
out there as the release candidates for 3.3.1 and 3.2.7 have been out for
about a month now without any report of trouble so those should be ready to
go.

Aside from GLV related changes there are a fair number of bug fixes and the
important feature of math() step that need to get out into the wild.

Other than existing open PRs

https://github.com/apache/tinkerpop/pull/758 (.NET DSLs)
https://github.com/apache/tinkerpop/pull/754 (.NET test suite)

I'm not aware of any major issues that need to be closed before release,
but I don't think we need to be in a huge rush to go to code freeze. Please
raise any issues or concerns that you feel are relevant to 3.3.1 and 3.2.7
and we'll figure out where to go from here.

Thanks,

Stephen





Re: dropStep and BarrierStep

2017-11-30 Thread pieter gmail

Hi,

Any ideas about this one?

Thanks
Pieter

On 22/11/2017 19:49, pieter gmail wrote:

Hi,

Whilst optimizing the DropStep in Sqlg I came across RepeatStep plus 
DropStep in a traversal going into an infinite loop.


To test Sqlg I tested on TinkerPop's ComplexTest.playlistPath() 
changing the gremlin to have a drop()


    @Test
    public void testDropPlaylist() {
        final TinkerGraph g = TinkerGraph.open();
        Io.Builder builder = GraphSONIo.build(GraphSONVersion.V3_0);
        final GraphReader reader = g.io(builder).reader().create();
        try (final InputStream stream = AbstractGremlinTest.class.getResourceAsStream("/grateful-dead-v3d0.json")) {
            reader.readGraph(stream, g);
        } catch (IOException e) {
            Assert.fail(e.getMessage());
        }
        Traversal<Vertex, Vertex> playListTraversal = getPlaylistPaths(g);
        System.out.println(playListTraversal.toString());
        List<Vertex> vertices = playListTraversal.toList();
        Assert.assertEquals(100, vertices.size());
        System.out.println("counted");
        Traversal<Vertex, Vertex> dropTraversal = getPlaylistPaths(g).drop().drop().iterate();
        System.out.println("done");
    }

    public GraphTraversal<Vertex, Vertex> getPlaylistPaths(Graph graph) {
        return graph.traversal().V().has("name", "Bob_Dylan").in("sungBy").as("a").
                repeat(__.out().order().by(Order.shuffle).simplePath().from("a")).
                until(__.out("writtenBy").has("name", "Johnny_Cash")).limit(1).as("b").
                repeat(__.out().order().by(Order.shuffle).as("c").simplePath().from("b").to("c")).
                until(__.out("sungBy").has("name", "Grateful_Dead")).limit(100);
    }

The second dropTraversal goes into an infinite loop.

A simpler scenario illustrating the infinite loop,

    @Test
    public void testRepeatDrop() {
        final TinkerGraph g = TinkerGraph.open();
        Vertex a1 = g.addVertex(T.label, "A");
        Vertex b1 = g.addVertex(T.label, "B");
        Vertex c1 = g.addVertex(T.label, "C");
        a1.addEdge("ab", b1);
        b1.addEdge("ba", a1);
        b1.addEdge("bc", c1);

        Vertex a2 = g.addVertex(T.label, "A");
        Vertex b2 = g.addVertex(T.label, "B");
        Vertex c2 = g.addVertex(T.label, "C");
        a2.addEdge("ab", b2);
        b2.addEdge("ba", a2);
        b2.addEdge("bc", c2);

        a1.addEdge("ac", c1);
        a1.addEdge("ac", c2);

        List<Vertex> vertices = g.traversal().withoutStrategies(LazyBarrierStrategy.class)
                .V().hasLabel("A")
                .repeat(__.out("ab", "ba"))
                .until(__.out("bc"))
                .out("bc")
                .in("ac")
                .out("ac")
                .toList();
        Assert.assertEquals(4, vertices.size());
        g.traversal().withoutStrategies(LazyBarrierStrategy.class)
                .V().hasLabel("A")
                .repeat(__.out("ab", "ba"))
                .until(__.out("bc"))
                .out("bc")
                .in("ac")
                .out("ac")
                .drop()
                .iterate();
        System.out.println("asdasdasd");
    }

Adding a BarrierStep before drop() resolves the issue but probably 
this should happen automatically in a strategy?
Afraid I can not quite tell what the scenarios are that would cause 
this issue so not sure if a BarrierStep should always be added or only 
sometimes.


Cheers
Pieter






RE: dropStep and BarrierStep

2017-11-22 Thread pieter gmail

Hi,

Whilst optimizing the DropStep in Sqlg I came across RepeatStep plus 
DropStep in a traversal going into an infinite loop.


To test Sqlg I tested on TinkerPop's ComplexTest.playlistPath() changing 
the gremlin to have a drop()


    @Test
    public void testDropPlaylist() {
        final TinkerGraph g = TinkerGraph.open();
        Io.Builder builder = GraphSONIo.build(GraphSONVersion.V3_0);
        final GraphReader reader = g.io(builder).reader().create();
        try (final InputStream stream = AbstractGremlinTest.class.getResourceAsStream("/grateful-dead-v3d0.json")) {
            reader.readGraph(stream, g);
        } catch (IOException e) {
            Assert.fail(e.getMessage());
        }
        Traversal<Vertex, Vertex> playListTraversal = getPlaylistPaths(g);
        System.out.println(playListTraversal.toString());
        List<Vertex> vertices = playListTraversal.toList();
        Assert.assertEquals(100, vertices.size());
        System.out.println("counted");
        Traversal<Vertex, Vertex> dropTraversal = getPlaylistPaths(g).drop().drop().iterate();
        System.out.println("done");
    }

    public GraphTraversal<Vertex, Vertex> getPlaylistPaths(Graph graph) {
        return graph.traversal().V().has("name", "Bob_Dylan").in("sungBy").as("a").
                repeat(__.out().order().by(Order.shuffle).simplePath().from("a")).
                until(__.out("writtenBy").has("name", "Johnny_Cash")).limit(1).as("b").
                repeat(__.out().order().by(Order.shuffle).as("c").simplePath().from("b").to("c")).
                until(__.out("sungBy").has("name", "Grateful_Dead")).limit(100);
    }

The second dropTraversal goes into an infinite loop.

A simpler scenario illustrating the infinite loop,

    @Test
    public void testRepeatDrop() {
        final TinkerGraph g = TinkerGraph.open();
        Vertex a1 = g.addVertex(T.label, "A");
        Vertex b1 = g.addVertex(T.label, "B");
        Vertex c1 = g.addVertex(T.label, "C");
        a1.addEdge("ab", b1);
        b1.addEdge("ba", a1);
        b1.addEdge("bc", c1);

        Vertex a2 = g.addVertex(T.label, "A");
        Vertex b2 = g.addVertex(T.label, "B");
        Vertex c2 = g.addVertex(T.label, "C");
        a2.addEdge("ab", b2);
        b2.addEdge("ba", a2);
        b2.addEdge("bc", c2);

        a1.addEdge("ac", c1);
        a1.addEdge("ac", c2);

        List<Vertex> vertices = g.traversal().withoutStrategies(LazyBarrierStrategy.class)
                .V().hasLabel("A")
                .repeat(__.out("ab", "ba"))
                .until(__.out("bc"))
                .out("bc")
                .in("ac")
                .out("ac")
                .toList();
        Assert.assertEquals(4, vertices.size());
        g.traversal().withoutStrategies(LazyBarrierStrategy.class)
                .V().hasLabel("A")
                .repeat(__.out("ab", "ba"))
                .until(__.out("bc"))
                .out("bc")
                .in("ac")
                .out("ac")
                .drop()
                .iterate();
        System.out.println("asdasdasd");
    }

Adding a BarrierStep before drop() resolves the issue but probably this 
should happen automatically in a strategy?
Afraid I can not quite tell what the scenarios are that would cause this 
issue so not sure if a BarrierStep should always be added or only sometimes.


Cheers
Pieter




Re: CountStrategy and TraversalHelper.replaceStep

2017-11-11 Thread pieter gmail
Created a jira <https://issues.apache.org/jira/browse/TINKERPOP-1832> 
for this.

I tested your fix and all tests are passing again. Sqlg and TinkerGraph.

Thanks
Pieter

On 30/10/2017 18:01, Daniel Kuppitz wrote:

Ah, yea, I see what you mean. It's actually replaceStep() which is buggy,
not the strategy. The fix is easy, we just need to change the order of the
2 statements in replaceStep():

public static <S, E> void replaceStep(final Step<S, E> removeStep,
        final Step<S, E> insertStep, final Traversal.Admin traversal) {
    final int i;
    traversal.removeStep(i = stepIndex(removeStep, traversal));
    traversal.addStep(i, insertStep);
}
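
Why the order of the two statements matters can be shown with a toy model. The sketch below is plain Java and deliberately not TinkerPop code: `Node`, `Chain` and `previousOfReplacedStep` are hypothetical names, and it assumes that inserting a step relinks the step that follows it, while removing a step relinks the neighbours but leaves the removed step's own `previous` pointer untouched. Under those assumptions, a step that is inserted in front of the old step before the old step is removed leaves the old step pointing back at its replacement.

```java
import java.util.ArrayList;
import java.util.List;

final class ReplaceStepSketch {
    // Toy stand-in for a step; only the backwards link is modelled.
    static final class Node {
        final String name;
        Node previous;
        Node(String name) { this.name = name; }
    }

    // Plays the role of the empty placeholder step at the start of a traversal.
    static final Node EMPTY = new Node("EmptyStep");

    // Toy stand-in for a traversal: an ordered list that maintains "previous" links.
    static final class Chain {
        final List<Node> steps = new ArrayList<>();

        void add(int index, Node step) {
            step.previous = index == 0 ? EMPTY : steps.get(index - 1);
            // The step currently at this index becomes the follower and is relinked.
            if (index < steps.size()) steps.get(index).previous = step;
            steps.add(index, step);
        }

        // Assumption: the removed node keeps whatever "previous" it had.
        void remove(int index) {
            steps.remove(index);
            if (index < steps.size()) {
                steps.get(index).previous = index == 0 ? EMPTY : steps.get(index - 1);
            }
        }
    }

    // Replaces a single-step chain's only step and reports where the old step points afterwards.
    static String previousOfReplacedStep(boolean removeFirst) {
        Chain chain = new Chain();
        Node outE = new Node("VertexStep(OUT,edge)");
        chain.add(0, outE);
        Node not = new Node("NotStep");
        int i = chain.steps.indexOf(outE);
        if (removeFirst) {
            chain.remove(i);      // old step leaves with previous == EMPTY
            chain.add(i, not);
        } else {
            chain.add(i, not);    // old step is relinked to point at the new step
            chain.remove(i + 1);
        }
        return outE.previous.name;
    }

    public static void main(String[] args) {
        System.out.println("add then remove: " + previousOfReplacedStep(false)); // NotStep
        System.out.println("remove then add: " + previousOfReplacedStep(true));  // EmptyStep
    }
}
```

If the replaced step object is reused elsewhere, as the VertexStep appears to be inside the NotStep's child traversal in this thread, only the remove-then-add order leaves it with an empty previous step in this model.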


Cheers,
Daniel

On Mon, Oct 30, 2017 at 7:43 AM, pieter gmail <pieter.mar...@gmail.com>
wrote:


I am on the latest 3.3.1-SNAPSHOT, just pulled master again.

The actual traversals and results are correct. However after the
CountStrategy the VertexStep(OUT,edge) previousStep pointer
(AbstractStep.previousStep) is incorrect. It should be EmptyStep but in
fact points to the newly inserted NotStep.

toString() on a traversal does not show the previous/nextStep so you can
not see what I am talking about in the console. This is not breaking
anything in TinkerGraph but breaks stuff in Sqlg.

In CountStrategy, if I change line 153 to the following, then my
problems go away and TinkerGraph also works as expected.

//TraversalHelper.replaceStep(prev, new NotStep<>(traversal, inner), traversal);
NotStep notStep = new NotStep<>(traversal, inner);
TraversalHelper.replaceStep(prev, notStep, traversal);
List<Traversal.Admin> notStepTraversal = notStep.getLocalChildren();
Traversal.Admin notStepTraversalFirstStep = notStepTraversal.get(0);
//The first step is pointing to the NotStep, it should point to an EmptyStep
notStepTraversalFirstStep.getSteps().get(0).setPreviousStep(EmptyStep.instance());

So I suppose the question is, is it correct for the previousStep of the
first step of a local traversal to point to the parent step and not an
EmptyStep?

The TraversalHelper.replaceStep always makes the first step of the
traversal point to the newly inserted step. If the traversal however is a
local traversal then the root step should be an EmptyStep.

Hope it makes some sense.

Thanks
Pieter



On 30/10/2017 15:28, Daniel Kuppitz wrote:


I don't see any issues. Which version are you talking about?

*gremlin> Gremlin.version()*
*==>3.2.7-SNAPSHOT*
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(1).repeat(out()).until(__.not(outE())).values('name')
==>lop
==>vadas
==>ripple
==>lop
gremlin> g.V(1).repeat(out()).until(outE().count().is(0)).values('name')
==>lop
==>vadas
==>ripple
==>lop
gremlin> g.V(1).repeat(out()).until(__.not(outE())).values('name').iterate().toString()
==>[TinkerGraphStep(vertex,[1]), RepeatStep([VertexStep(OUT,vertex),
RepeatEndStep],until([NotStep([VertexStep(OUT,edge)])]),emit(false)),
PropertiesStep([name],value)]
gremlin> g.V(1).repeat(out()).until(outE().count().is(0)).values('name').iterate().toString()
==>[TinkerGraphStep(vertex,[1]), RepeatStep([VertexStep(OUT,vertex),
RepeatEndStep],until([NotStep([VertexStep(OUT,edge)])]),emit(false)),
PropertiesStep([name],value)]


*gremlin> Gremlin.version()*
*==>3.3.1-SNAPSHOT*

gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(1).repeat(out()).until(__.not(outE())).values('name')
==>lop
==>vadas
==>ripple
==>lop
gremlin> g.V(1).repeat(out()).until(outE().count().is(0)).values('name')
==>lop
==>vadas
==>ripple
==>lop
gremlin> g.V(1).repeat(out()).until(__.not(outE())).values('name').iterate().toString()
==>[TinkerGraphStep(vertex,[1]), RepeatStep([VertexStep(OUT,vertex),
RepeatEndStep],until([NotStep([VertexStep(OUT,edge)])]),emit(false)),
PropertiesStep([name],value)]
gremlin> g.V(1).repeat(out()).until(outE().count().is(0)).values('name').iterate().toString()
==>[TinkerGraphStep(vertex,[1]), RepeatStep([VertexStep(OUT,vertex),
RepeatEndStep],until([NotStep([VertexStep(OUT,edge)])]),emit(false)),
PropertiesStep([name],value)]


Cheers,
Daniel


On Mon, Oct 30, 2017 at 1:53 AM, pieter gmail <pieter.mar...@gmail.com>
wrote:

Hi,

Whilst optimizing the NotStep I came across what looks to me like a bug
in TraversalHelper.replaceStep or perhaps rather in CountStrategy.

The test below shows the bug.

  @Test
  public void g_VX1X_repeatXoutX_untilXoutE_count_isX0XX_name() {
  Graph graph = TinkerFactory.createModern();
  final Traversal<Vertex, String> traversal1 = graph.traversal()
  .V(convertToVertexId(graph, "marko"))
  .repeat(__.out())
  .until(__.not(__.outE()))
  .values("name");
  checkR

Re: CountStrategy and TraversalHelper.replaceStep

2017-10-30 Thread pieter gmail

Ok great, I'll test a bit with the order changed.

Thanks,
Pieter

On 30/10/2017 18:01, Daniel Kuppitz wrote:

Ah, yea, I see what you mean. It's actually replaceStep() which is buggy,
not the strategy. The fix is easy, we just need to change the order of the
2 statements in replaceStep():

public static <S, E> void replaceStep(final Step<S, E> removeStep,
        final Step<S, E> insertStep, final Traversal.Admin traversal) {
    final int i;
    traversal.removeStep(i = stepIndex(removeStep, traversal));
    traversal.addStep(i, insertStep);
}


Cheers,
Daniel

On Mon, Oct 30, 2017 at 7:43 AM, pieter gmail <pieter.mar...@gmail.com>
wrote:


I am on the latest 3.3.1-SNAPSHOT, just pulled master again.

The actual traversals and results are correct. However after the
CountStrategy the VertexStep(OUT,edge) previousStep pointer
(AbstractStep.previousStep) is incorrect. It should be EmptyStep but in
fact points to the newly inserted NotStep.

toString() on a traversal does not show the previous/nextStep so you can
not see what I am talking about in the console. This is not breaking
anything in TinkerGraph but breaks stuff in Sqlg.

In CountStrategy, if I change line 153 to the following, then my
problems go away and TinkerGraph also works as expected.

//TraversalHelper.replaceStep(prev, new NotStep<>(traversal, inner), traversal);
NotStep notStep = new NotStep<>(traversal, inner);
TraversalHelper.replaceStep(prev, notStep, traversal);
List<Traversal.Admin> notStepTraversal = notStep.getLocalChildren();
Traversal.Admin notStepTraversalFirstStep = notStepTraversal.get(0);
//The first step is pointing to the NotStep, it should point to an EmptyStep
notStepTraversalFirstStep.getSteps().get(0).setPreviousStep(EmptyStep.instance());

So I suppose the question is, is it correct for the previousStep of the
first step of a local traversal to point to the parent step and not an
EmptyStep?

The TraversalHelper.replaceStep always makes the first step of the
traversal point to the newly inserted step. If the traversal however is a
local traversal then the root step should be an EmptyStep.

Hope it makes some sense.

Thanks
Pieter



On 30/10/2017 15:28, Daniel Kuppitz wrote:


I don't see any issues. Which version are you talking about?

*gremlin> Gremlin.version()*
*==>3.2.7-SNAPSHOT*
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(1).repeat(out()).until(__.not(outE())).values('name')
==>lop
==>vadas
==>ripple
==>lop
gremlin> g.V(1).repeat(out()).until(outE().count().is(0)).values('name')
==>lop
==>vadas
==>ripple
==>lop
gremlin> g.V(1).repeat(out()).until(__.not(outE())).values('name').iterate().toString()
==>[TinkerGraphStep(vertex,[1]), RepeatStep([VertexStep(OUT,vertex),
RepeatEndStep],until([NotStep([VertexStep(OUT,edge)])]),emit(false)),
PropertiesStep([name],value)]
gremlin> g.V(1).repeat(out()).until(outE().count().is(0)).values('name').iterate().toString()
==>[TinkerGraphStep(vertex,[1]), RepeatStep([VertexStep(OUT,vertex),
RepeatEndStep],until([NotStep([VertexStep(OUT,edge)])]),emit(false)),
PropertiesStep([name],value)]


*gremlin> Gremlin.version()*
*==>3.3.1-SNAPSHOT*

gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(1).repeat(out()).until(__.not(outE())).values('name')
==>lop
==>vadas
==>ripple
==>lop
gremlin> g.V(1).repeat(out()).until(outE().count().is(0)).values('name')
==>lop
==>vadas
==>ripple
==>lop
gremlin> g.V(1).repeat(out()).until(__.not(outE())).values('name').iterate().toString()
==>[TinkerGraphStep(vertex,[1]), RepeatStep([VertexStep(OUT,vertex),
RepeatEndStep],until([NotStep([VertexStep(OUT,edge)])]),emit(false)),
PropertiesStep([name],value)]
gremlin> g.V(1).repeat(out()).until(outE().count().is(0)).values('name').iterate().toString()
==>[TinkerGraphStep(vertex,[1]), RepeatStep([VertexStep(OUT,vertex),
RepeatEndStep],until([NotStep([VertexStep(OUT,edge)])]),emit(false)),
PropertiesStep([name],value)]


Cheers,
Daniel


On Mon, Oct 30, 2017 at 1:53 AM, pieter gmail <pieter.mar...@gmail.com>
wrote:

Hi,

Whilst optimizing the NotStep I came across what looks to me like a bug
in TraversalHelper.replaceStep or perhaps rather in CountStrategy.

The test below shows the bug.

  @Test
  public void g_VX1X_repeatXoutX_untilXoutE_count_isX0XX_name() {
  Graph graph = TinkerFactory.createModern();
  final Traversal<Vertex, String> traversal1 = graph.traversal()
  .V(convertToVertexId(graph, "marko"))
  .repeat(__.out())
  .until(__.not(__.outE()))
  .values("name");
  checkResults(Arrays.asList("lop", "lop", "ripple", "vadas"),
traversal1);

Re: CountStrategy and TraversalHelper.replaceStep

2017-10-30 Thread pieter gmail

I am on the latest 3.3.1-SNAPSHOT, just pulled master again.

The actual traversals and results are correct. However after the 
CountStrategy the VertexStep(OUT,edge) previousStep pointer 
(AbstractStep.previousStep) is incorrect. It should be EmptyStep but 
in fact points to the newly inserted NotStep.


toString() on a traversal does not show the previous/nextStep so you can 
not see what I am talking about in the console. This is not breaking 
anything in TinkerGraph but breaks stuff in Sqlg.


In CountStrategy, if I change line 153 to the following, then my 
problems go away and TinkerGraph also works as expected.


//TraversalHelper.replaceStep(prev, new NotStep<>(traversal, inner), traversal);

NotStep notStep = new NotStep<>(traversal, inner);
TraversalHelper.replaceStep(prev, notStep, traversal);
List<Traversal.Admin> notStepTraversal = notStep.getLocalChildren();
Traversal.Admin notStepTraversalFirstStep = notStepTraversal.get(0);
//The first step is pointing to the NotStep, it should point to an EmptyStep
notStepTraversalFirstStep.getSteps().get(0).setPreviousStep(EmptyStep.instance());

So I suppose the question is, is it correct for the previousStep of the 
first step of a local traversal to point to the parent step and not an 
EmptyStep?


The TraversalHelper.replaceStep always makes the first step of the 
traversal point to the newly inserted step. If the traversal however is 
a local traversal then the root step should be an EmptyStep.


Hope it makes some sense.

Thanks
Pieter



On 30/10/2017 15:28, Daniel Kuppitz wrote:

I don't see any issues. Which version are you talking about?

*gremlin> Gremlin.version()*
*==>3.2.7-SNAPSHOT*
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(1).repeat(out()).until(__.not(outE())).values('name')
==>lop
==>vadas
==>ripple
==>lop
gremlin> g.V(1).repeat(out()).until(outE().count().is(0)).values('name')
==>lop
==>vadas
==>ripple
==>lop
gremlin>
g.V(1).repeat(out()).until(__.not(outE())).values('name').iterate().toString()
==>[TinkerGraphStep(vertex,[1]), RepeatStep([VertexStep(OUT,vertex),
RepeatEndStep],until([NotStep([VertexStep(OUT,edge)])]),emit(false)),
PropertiesStep([name],value)]
gremlin>
g.V(1).repeat(out()).until(outE().count().is(0)).values('name').iterate().toString()
==>[TinkerGraphStep(vertex,[1]), RepeatStep([VertexStep(OUT,vertex),
RepeatEndStep],until([NotStep([VertexStep(OUT,edge)])]),emit(false)),
PropertiesStep([name],value)]


*gremlin> Gremlin.version()*
*==>3.3.1-SNAPSHOT*
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(1).repeat(out()).until(__.not(outE())).values('name')
==>lop
==>vadas
==>ripple
==>lop
gremlin> g.V(1).repeat(out()).until(outE().count().is(0)).values('name')
==>lop
==>vadas
==>ripple
==>lop
gremlin>
g.V(1).repeat(out()).until(__.not(outE())).values('name').iterate().toString()
==>[TinkerGraphStep(vertex,[1]), RepeatStep([VertexStep(OUT,vertex),
RepeatEndStep],until([NotStep([VertexStep(OUT,edge)])]),emit(false)),
PropertiesStep([name],value)]
gremlin>
g.V(1).repeat(out()).until(outE().count().is(0)).values('name').iterate().toString()
==>[TinkerGraphStep(vertex,[1]), RepeatStep([VertexStep(OUT,vertex),
RepeatEndStep],until([NotStep([VertexStep(OUT,edge)])]),emit(false)),
PropertiesStep([name],value)]


Cheers,
Daniel


On Mon, Oct 30, 2017 at 1:53 AM, pieter gmail <pieter.mar...@gmail.com>
wrote:


Hi,

Whilst optimizing the NotStep I came across what looks to me like a bug in
TraversalHelper.replaceStep or perhaps rather in CountStrategy.

The test below shows the bug.

 @Test
 public void g_VX1X_repeatXoutX_untilXoutE_count_isX0XX_name() {
 Graph graph = TinkerFactory.createModern();
 final Traversal<Vertex, String> traversal1 = graph.traversal()
 .V(convertToVertexId(graph, "marko"))
 .repeat(__.out())
 .until(__.not(__.outE()))
 .values("name");
 checkResults(Arrays.asList("lop", "lop", "ripple", "vadas"), traversal1);

 List<VertexStep> vertexSteps = TraversalHelper.getStepsOfAssignableClassRecursively(VertexStep.class, traversal1.asAdmin());
 VertexStep vertexStep = vertexSteps.stream().filter(s -> s.returnsEdge()).findAny().get();
 Assert.assertEquals(EmptyStep.instance(), vertexStep.getPreviousStep());

 final Traversal<Vertex, String> traversal2 = graph.traversal()
 .V(convertToVertexId(graph, "marko"))
 .repeat(__.out())
 .until(__.outE().count().is(0))
 .values("name");
 checkResults(Arrays.asList("lop", "l

CountStrategy and TraversalHelper.replaceStep

2017-10-30 Thread pieter gmail

Hi,

Whilst optimizing the NotStep I came across what looks to me like a bug 
in TraversalHelper.replaceStep or perhaps rather in CountStrategy.


The test below shows the bug.

    @Test
    public void g_VX1X_repeatXoutX_untilXoutE_count_isX0XX_name() {
        Graph graph = TinkerFactory.createModern();
        final Traversal<Vertex, String> traversal1 = graph.traversal()
                .V(convertToVertexId(graph, "marko"))
                .repeat(__.out())
                .until(__.not(__.outE()))
                .values("name");
        checkResults(Arrays.asList("lop", "lop", "ripple", "vadas"), traversal1);

        List<VertexStep> vertexSteps = TraversalHelper.getStepsOfAssignableClassRecursively(VertexStep.class, traversal1.asAdmin());
        VertexStep vertexStep = vertexSteps.stream().filter(s -> s.returnsEdge()).findAny().get();
        Assert.assertEquals(EmptyStep.instance(), vertexStep.getPreviousStep());

        final Traversal<Vertex, String> traversal2 = graph.traversal()
                .V(convertToVertexId(graph, "marko"))
                .repeat(__.out())
                .until(__.outE().count().is(0))
                .values("name");
        checkResults(Arrays.asList("lop", "lop", "ripple", "vadas"), traversal2);

        vertexSteps = TraversalHelper.getStepsOfAssignableClassRecursively(VertexStep.class, traversal2.asAdmin());
        vertexStep = vertexSteps.stream().filter(s -> s.returnsEdge()).findAny().get();

        //This fails because the vertexStep's previous step is the NotStep when it should be an EmptyStep.
        Assert.assertEquals(EmptyStep.instance(), vertexStep.getPreviousStep());
    }

traversal1 and traversal2 should be the same as the CountStrategy will 
replace the __.outE().count().is(0) with __.not(__.outE()).
The CountStrategy does what it is supposed to do; however, when it calls 
TraversalHelper.replaceStep(prev, new NotStep<>(traversal, inner), 
traversal); the traversal's VertexStep gets its previousStep set to the 
NotStep. This is because of the way TraversalHelper.replaceStep is 
implemented.


I am not sure whether the fix should be in replaceStep or rather in 
CountStrategy.


Thanks
Pieter


Re: [jira] [Created] (TINKERPOP-1812) ProfileTest assumes that graph implementations will not add their own steps

2017-10-26 Thread pieter gmail
In Sqlg many profile tests are ignored because of this, including cases 
where steps are removed.
Back in the day no-one including myself had any interest in fixing it. 
It's not trivial by all accounts.


Cheers
Pieter

On 26/10/2017 13:21, Bryn Cooke (JIRA) wrote:

Bryn Cooke created TINKERPOP-1812:
-

  Summary: ProfileTest assumes that graph implementations will not 
add their own steps
  Key: TINKERPOP-1812
  URL: https://issues.apache.org/jira/browse/TINKERPOP-1812
  Project: TinkerPop
   Issue Type: Test
   Components: process
 Affects Versions: 3.2.6
 Reporter: Bryn Cooke


The following two tests check the number of steps in the traversal:
g_V_sideEffectXThread_sleepX10XX_sideEffectXThread_sleepX5XX_profileXmetricsX
g_V_sideEffectXThread_sleepX10XX_sideEffectXThread_sleepX5XX_profile

This assumes that graph implementations add no steps to the traversal. They 
should probably be checking the structure of the traversal rather than the 
total number of steps. For instance that each step is followed by a profile 
step and that the profile side effect step and cap step are the last steps.
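
A structural check along these lines could look roughly like the sketch below. It works on plain step names rather than real Step objects, and the expected layout (every step followed by a profile step, with the profile side effect step and cap step last) is taken from the description above rather than from the actual ProfileTest. The class and method names are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class ProfileStructureSketch {

    // Step names are plain labels used for illustration only.
    static final String PROFILE = "ProfileStep";
    static final String PROFILE_SIDE_EFFECT = "ProfileSideEffectStep";
    static final String CAP = "SideEffectCapStep";

    // True when the last two steps are the profile side effect and cap steps
    // and every step before them is immediately followed by a profile step.
    static boolean hasProfileStructure(final List<String> steps) {
        final int n = steps.size();
        if (n < 2 || !PROFILE_SIDE_EFFECT.equals(steps.get(n - 2)) || !CAP.equals(steps.get(n - 1))) {
            return false;
        }
        final int body = n - 2;
        if (body % 2 != 0) {
            return false;
        }
        for (int i = 0; i < body; i += 2) {
            if (PROFILE.equals(steps.get(i)) || !PROFILE.equals(steps.get(i + 1))) {
                return false;
            }
        }
        return true;
    }

    static List<String> referenceSteps() {
        return Arrays.asList("GraphStep", PROFILE, "SideEffectStep", PROFILE,
                "SideEffectStep", PROFILE, PROFILE_SIDE_EFFECT, CAP);
    }

    // Same traversal with one extra, provider-specific step added.
    static List<String> providerSteps() {
        return Arrays.asList("GraphStep", PROFILE, "ProviderStep", PROFILE, "SideEffectStep", PROFILE,
                "SideEffectStep", PROFILE, PROFILE_SIDE_EFFECT, CAP);
    }

    // A step that is not followed by a profile step.
    static List<String> brokenSteps() {
        return Arrays.asList("GraphStep", "SideEffectStep", PROFILE, PROFILE_SIDE_EFFECT, CAP);
    }

    public static void main(final String[] args) {
        System.out.println(hasProfileStructure(referenceSteps()));
        System.out.println(hasProfileStructure(providerSteps()));
        System.out.println(hasProfileStructure(brokenSteps()));
    }
}
```

Unlike a check on the total step count, this accepts a traversal to which a provider has added its own steps, as long as the profiling structure is intact.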







Re: remove step

2017-10-26 Thread pieter gmail

Ha, there it is already. Sorry, I did not know about that one. An RTFM moment.

Thanks
Pieter

On 26/10/2017 20:21, Stephen Mallette wrote:

Is there some difference between what you are proposing and:

http://tinkerpop.apache.org/docs/current/reference/#drop-step

On Thu, Oct 26, 2017 at 2:18 PM, pieter gmail <pieter.mar...@gmail.com>
wrote:


Hi,

I was wondering if we can add a remove step. Currently to remove elements
I iterate the traversal and call remove on each element. Adding a remove
step will make it possible to optimize removals.

It's probably less useful for Neo4j and TinkerGraph, and I'm not sure about
Janus, but for Sqlg,

g.V().hasLabel("X").rm() would translate to a single SQL truncate statement.
Currently a remove loads all 'X's and deletes them one by one. There is a bit
of optimization along the lines of bulk deletion, but it could be far better
if there were a step to optimize.

Thanks
Pieter





remove step

2017-10-26 Thread pieter gmail

Hi,

I was wondering if we can add a remove step. Currently to remove 
elements I iterate the traversal and call remove on each element. Adding 
a remove step will make it possible to optimize removals.


It's probably less useful for Neo4j and TinkerGraph, and I'm not sure about 
Janus, but for Sqlg,


g.V().hasLabel("X").rm() would translate to a single SQL truncate statement. 
Currently a remove loads all 'X's and deletes them one by one. There is a 
bit of optimization along the lines of bulk deletion, but it could be far 
better if there were a step to optimize.
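
The difference in statement count can be sketched with a toy model. CountingStore below is a hypothetical in-memory stand-in for a SQL back-end, not Sqlg code. It only counts how many statements each removal strategy would issue, assuming one query to load the elements plus one delete per element versus a single statement for the whole label.

```java
import java.util.HashMap;
import java.util.Map;

final class RemovalSketch {

    // Hypothetical stand-in for a SQL back-end that counts the statements it is asked to run.
    static final class CountingStore {
        private final Map<String, Integer> rowsPerLabel = new HashMap<>();
        private int statements;

        void insert(final String label, final int rows) {
            rowsPerLabel.merge(label, rows, Integer::sum);
        }

        int statements() {
            return statements;
        }

        // Load every element of the label, then delete them one at a time.
        void removeOneByOne(final String label) {
            statements++; // one query to load the elements
            final int rows = rowsPerLabel.getOrDefault(label, 0);
            for (int i = 0; i < rows; i++) {
                statements++; // one delete per element
            }
            rowsPerLabel.remove(label);
        }

        // What a dedicated removal step could allow: one statement for the whole label.
        void removeInBulk(final String label) {
            statements++;
            rowsPerLabel.remove(label);
        }
    }

    static int statementsFor(final int rows, final boolean bulk) {
        final CountingStore store = new CountingStore();
        store.insert("X", rows);
        if (bulk) {
            store.removeInBulk("X");
        } else {
            store.removeOneByOne("X");
        }
        return store.statements();
    }

    public static void main(final String[] args) {
        System.out.println(statementsFor(1000, false));
        System.out.println(statementsFor(1000, true));
    }
}
```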


Thanks
Pieter


Re: Sqlg's description

2017-10-25 Thread pieter gmail

Thanks,
Pieter

On 25/10/2017 14:26, Stephen Mallette wrote:

As there were no objections to making the change, I went ahead and
published - it may take a few minutes for the changes to propagate. Thanks

On Sun, Oct 22, 2017 at 12:35 PM, pieter gmail <pieter.mar...@gmail.com>
wrote:


Hi,

Can I request that Sqlg's description on the homepage
<https://tinkerpop.apache.org/> be changed?
Sqlg now supports more back-ends and chances are that it will support SQL
databases that are not RDBMSs in the future.

Can we change it to

"OLTP implementation on SQL databases."

Thanks
Pieter





Sqlg's description

2017-10-22 Thread pieter gmail

Hi,

Can I request that Sqlg's description on the homepage 
<https://tinkerpop.apache.org/> be changed?
Sqlg now supports more back-ends and chances are that it will support 
SQL databases that are not RDBMSs in the future.


Can we change it to

"OLTP implementation on SQL databases."

Thanks
Pieter


Re: Notes on TraverserSet and Sqlg optimizations

2017-10-18 Thread pieter gmail
Yes, hashCode() and equals() are correct. They are however slightly 
heavier operations than in TinkerGraph, as a Sqlg Element's id is a more 
complex object holding the label and its id.


I should have mentioned that in Sqlg the traverser is always a 
B_LP_O_P_S_SE_SL_Traverser. As Sqlg returns multiple VertexSteps in one 
go, I use the path information to reconstruct the JDBC ResultSet from the 
db. This makes the hashCode() and equals() operations heavier, as they are 
called on B_LP_O_P_S_SE_SL_Traverser, which calls hashCode() and equals() 
on Path, and those in turn are non-trivial operations.


Cheers
Pieter



On 17/10/2017 23:58, Marko Rodriguez wrote:

…do your vertices implement hashCode() and equals() “correctly” ?

Marko.




On Oct 17, 2017, at 2:40 PM, Stephen Mallette <spmalle...@gmail.com> wrote:


> So if I understand correctly the map is only needed for bulking so quite
> often is not needed.

afaik, it is only used for bulking, though it's hard to characterize how
often it is used - I suppose it all depends on the types of traversals you
write and the nature of the data being traversed.

> A significant difference.

The performance numbers are interesting. You don't get a speedup in Sqlg
when bulking would be enacted though - only when bulking would have
no effect - correct?



On Fri, Oct 13, 2017 at 3:48 PM, pieter gmail <pieter.mar...@gmail.com>
wrote:


Hi,

Doing step optimizations I am noticing a rather severe performance hit in
TraverserSet.

Sqlg does a secondary optimization on steps that it cannot optimize from
the GraphStep. Before the secondary optimization these steps will execute
at least one query for each incoming start. The optimization caches the
incoming start traversers and the step is executed for all incoming
traversers in one go. This has the effect of changing the semantics into a
breadth-first traversal as opposed to the default depth-first.

So basically the replaced step's code looks as follows

@Override
protected Traverser.Admin processNextStart() throws
NoSuchElementException {
if (this.first) {
this.first = false;
while (this.starts.hasNext()) {
Traverser.Admin start = this.starts.next();
this.traversal.addStart(start);
}


The performance hit is in this.traversal.addStart(start), which ends up
putting the start into the TraverserSet's internal LinkedHashMap.

So if I understand correctly the map is only needed for bulking, so quite
often it is not needed. Replacing the map with an ArrayList improves the
performance drastically.

For the test the optimization does the following. I replace the
TraversalFilterStep with a custom SqlgTraversalFilterStep which extends from
a custom SqlgAbstractStep. The custom SqlgAbstractStep in turn replaces the
ExpandableStepIterator with a custom SqlgExpandableStepIterator which is a
copy of ExpandableStepIterator except for replacing TraverserSet with a
List<Traverser.Admin> traversers = new ArrayList<>();

@Test
public void testSqlgTraversalFilterStepPerformance() {
this.sqlgGraph.tx().normalBatchModeOn();
int count = 1;
for (int i = 0; i < count; i++) {
Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name",
"a1");
Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name",
"b1");
a1.addEdge("ab", b1);
}
this.sqlgGraph.tx().commit();

StopWatch stopWatch = new StopWatch();
for (int i = 0; i < 1000; i++) {
stopWatch.start();
GraphTraversal<Vertex, Vertex> traversal =
this.sqlgGraph.traversal()
.V().hasLabel("A")
.where(__.out().hasLabel("B"));
List vertices = traversal.toList();
Assert.assertEquals(count, vertices.size());
stopWatch.stop();
System.out.println(stopWatch.toString());
stopWatch.reset();
}
}

Without the ArrayList optimization the output is,
0:00:12.198
0:00:09.756
0:00:09.435
0:00:14.466
0:00:10.197
0:00:04.937
0:00:02.974
0:00:02.942
0:00:02.977
0:00:03.142
0:00:03.207

With the ArrayList optimization the output is,
0:00:00.334
0:00:00.147
0:00:00.114
0:00:00.100
... time for jit
0:00:00.055
0:00:00.056
0:00:00.054
0:00:00.053
0:00:00.054
0:00:00.055

A significant difference.

For TinkerGraph this test's optimization is moot as the TraversalFilterStep
resets the step for every step, making the TraverserSet's map empty so the
traverser's equals method is never called.

Not sure if there are scenarios where this optimization will be any good
for TinkerGraph but thought I'd let you know how I am optimizing steps.

A concern is that I am now replacing core steps, which moves Sqlg further
away from the reference implementation, making it fragile to changes in
TinkerPop and harder to keep up with upstream changes. Perhaps there is a way
to make TraverserSet's current behavior configurable?

Cheers
Pieter








Re: Notes on TraverserSet and Sqlg optimizations

2017-10-18 Thread pieter gmail
Currently Sqlg's optimization strategies remove bulking as it does not 
work with Sqlg's way of accessing the database. Sqlg fetches many 
VertexSteps in one go and bulking needs it to be on a one-by-one basis. 
Bulking is still possible but only by removing Sqlg's strategies from 
the traversal. The way I understand bulking, it is only of use for a 
particular graph shape: graphs with lots of references from the same label 
back to itself. For the kind of graphs I work on, and hopefully those of 
most of my users, the graphs are more like trees, where bulking is less useful.
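
As a rough illustration of why graph shape matters, the sketch below models bulking as merging traversers that sit at the same vertex into a single entry with a count. It uses plain vertex ids and a map, not TinkerPop's Traverser or TraverserSet classes, and the two input lists are made-up examples.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BulkingSketch {

    // Merge traversers located at the same vertex into one entry carrying a bulk count.
    static Map<String, Long> bulk(final List<String> vertexIds) {
        final Map<String, Long> bulks = new LinkedHashMap<>();
        for (final String id : vertexIds) {
            bulks.merge(id, 1L, Long::sum);
        }
        return bulks;
    }

    // Many paths converge on the same vertices, as in a graph that refers back to itself.
    static List<String> convergingArrivals() {
        return Arrays.asList("v1", "v2", "v1", "v1", "v2");
    }

    // Tree-like graph: every vertex is reached exactly once.
    static List<String> treeArrivals() {
        return Arrays.asList("a", "b", "c", "d", "e");
    }

    public static void main(final String[] args) {
        System.out.println(bulk(convergingArrivals()));
        System.out.println(bulk(treeArrivals()));
    }
}
```

In the converging case five traversers collapse into two entries, so later steps do less work. In the tree case nothing merges and the bookkeeping is pure overhead.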


Later I hope to look at bulking and see if it's possible to predict 
whether a query would be better off with bulking.


Cheers
Pieter

On 17/10/2017 22:40, Stephen Mallette wrote:

> So if I understand correctly the map is only needed for bulking so quite
> often is not needed.

afaik, it is only used for bulking, though it's hard to characterize how
often it is used - I suppose it all depends on the types of traversals you
write and the nature of the data being traversed.

> A significant difference.

The performance numbers are interesting. You don't get a speedup in Sqlg
when bulking would be enacted though - only when bulking would have
no effect - correct?



On Fri, Oct 13, 2017 at 3:48 PM, pieter gmail <pieter.mar...@gmail.com>
wrote:


Hi,

Doing step optimizations I am noticing a rather severe performance hit in
TraverserSet.

Sqlg does a secondary optimization on steps that it cannot optimize from
the GraphStep. Before the secondary optimization these steps will execute
at least one query for each incoming start. The optimization caches the
incoming start traversers and the step is executed for all incoming
traversers in one go. This has the effect of changing the semantics into a
breadth-first traversal as opposed to the default depth-first.

So basically the replaced step's code looks as follows

 @Override
 protected Traverser.Admin processNextStart() throws
NoSuchElementException {
 if (this.first) {
 this.first = false;
 while (this.starts.hasNext()) {
 Traverser.Admin start = this.starts.next();
 this.traversal.addStart(start);
 }
 

The performance hit is in this.traversal.addStart(start), which ends up
putting the start into the TraverserSet's internal LinkedHashMap.

So if I understand correctly the map is only needed for bulking, so quite
often it is not needed. Replacing the map with an ArrayList improves the
performance drastically.

For the test the optimization does the following. I replace the
TraversalFilterStep with a custom SqlgTraversalFilterStep which extends from
a custom SqlgAbstractStep. The custom SqlgAbstractStep in turn replaces the
ExpandableStepIterator with a custom SqlgExpandableStepIterator which is a
copy of ExpandableStepIterator except for replacing TraverserSet with a
List<Traverser.Admin> traversers = new ArrayList<>();

 @Test
 public void testSqlgTraversalFilterStepPerformance() {
 this.sqlgGraph.tx().normalBatchModeOn();
 int count = 1;
 for (int i = 0; i < count; i++) {
 Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name",
"a1");
 Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name",
"b1");
 a1.addEdge("ab", b1);
 }
 this.sqlgGraph.tx().commit();

 StopWatch stopWatch = new StopWatch();
 for (int i = 0; i < 1000; i++) {
 stopWatch.start();
 GraphTraversal<Vertex, Vertex> traversal =
this.sqlgGraph.traversal()
 .V().hasLabel("A")
 .where(__.out().hasLabel("B"));
 List vertices = traversal.toList();
 Assert.assertEquals(count, vertices.size());
 stopWatch.stop();
 System.out.println(stopWatch.toString());
 stopWatch.reset();
 }
 }

Without the ArrayList optimization the output is,
0:00:12.198
0:00:09.756
0:00:09.435
0:00:14.466
0:00:10.197
0:00:04.937
0:00:02.974
0:00:02.942
0:00:02.977
0:00:03.142
0:00:03.207

With the ArrayList optimization the output is,
0:00:00.334
0:00:00.147
0:00:00.114
0:00:00.100
... time for jit
0:00:00.055
0:00:00.056
0:00:00.054
0:00:00.053
0:00:00.054
0:00:00.055

A significant difference.

For TinkerGraph this test's optimization is moot as the TraversalFilterStep
resets the step for every step, making the TraverserSet's map empty so the
traverser's equals method is never called.

Not sure if there are scenarios where this optimization will be any good
for TinkerGraph but thought I'd let you know how I am optimizing steps.

A concern is that I am now replacing core steps, which moves Sqlg further
away from the reference implementation, making it fragile to changes in
TinkerPop and harder to keep up with upstream changes. Perhaps there is a way
to make TraverserSet's current behavior configurable?

Cheers
Pieter








Notes on TraverserSet and Sqlg optimizations

2017-10-13 Thread pieter gmail

Hi,

Doing step optimizations I am noticing a rather severe performance hit 
in TraverserSet.


Sqlg does a secondary optimization on steps that it cannot optimize 
from the GraphStep. Before the secondary optimization these steps will 
execute at least one query for each incoming start. The optimization 
caches the incoming start traversers and the step is executed for all 
incoming traversers in one go. This has the effect of changing the 
semantics into a breadth-first traversal as opposed to the default 
depth-first.


So basically the replaced step's code looks as follows

    @Override
    protected Traverser.Admin processNextStart() throws NoSuchElementException {
        if (this.first) {
            this.first = false;
            while (this.starts.hasNext()) {
                Traverser.Admin start = this.starts.next();
                this.traversal.addStart(start);
            }
    

The performance hit is in this.traversal.addStart(start), which ends 
up putting the start into the TraverserSet's internal LinkedHashMap.


So if I understand correctly the map is only needed for bulking, so quite 
often it is not needed. Replacing the map with an ArrayList improves the 
performance drastically.
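
One part of that cost can be shown in isolation. The sketch below is not TinkerPop code. PathTraverser is a made-up key whose hashCode() walks a path, standing in for a traverser whose hash depends on its Path. It counts how often hashCode() runs when such keys are put into a LinkedHashMap compared with being appended to an ArrayList. It does not try to reproduce the timings above, only the difference in work per insertion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TraverserStoreSketch {

    // Counts hashCode() invocations; reset by each measurement method.
    static int hashCalls;

    // Made-up stand-in for a traverser whose identity depends on its whole path.
    static final class PathTraverser {
        private final List<String> path;

        PathTraverser(final List<String> path) {
            this.path = path;
        }

        @Override
        public int hashCode() {
            hashCalls++;
            return path.hashCode(); // hashes every element of the path
        }

        @Override
        public boolean equals(final Object other) {
            return other instanceof PathTraverser && path.equals(((PathTraverser) other).path);
        }
    }

    static PathTraverser traverser(final int i) {
        return new PathTraverser(Arrays.asList("a" + i, "ab" + i, "b" + i));
    }

    // A map-backed store hashes every traverser it receives.
    static int hashCallsForMap(final int n) {
        hashCalls = 0;
        final Map<PathTraverser, PathTraverser> map = new LinkedHashMap<>();
        for (int i = 0; i < n; i++) {
            final PathTraverser t = traverser(i);
            map.put(t, t);
        }
        return hashCalls;
    }

    // A list-backed store only appends and never hashes.
    static int hashCallsForList(final int n) {
        hashCalls = 0;
        final List<PathTraverser> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add(traverser(i));
        }
        return hashCalls;
    }

    public static void main(final String[] args) {
        System.out.println(hashCallsForMap(1000));
        System.out.println(hashCallsForList(1000));
    }
}
```

The trade-off is that the list can no longer detect equal traversers, so it only fits cases where bulking is not wanted.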


For the test the optimization does the following. I replace the 
TraversalFilterStep with a custom SqlgTraversalFilterStep which extends 
from a custom SqlgAbstractStep. The custom SqlgAbstractStep in turn 
replaces the ExpandableStepIterator with a custom 
SqlgExpandableStepIterator which is a copy of ExpandableStepIterator 
except for replacing TraverserSet with a 
List<Traverser.Admin> traversers = new ArrayList<>();


    @Test
    public void testSqlgTraversalFilterStepPerformance() {
        this.sqlgGraph.tx().normalBatchModeOn();
        int count = 1;
        for (int i = 0; i < count; i++) {
            Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "a1");
            Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "name", "b1");
            a1.addEdge("ab", b1);
        }
        this.sqlgGraph.tx().commit();

        StopWatch stopWatch = new StopWatch();
        for (int i = 0; i < 1000; i++) {
            stopWatch.start();
            GraphTraversal<Vertex, Vertex> traversal = this.sqlgGraph.traversal()
                    .V().hasLabel("A")
                    .where(__.out().hasLabel("B"));
            List vertices = traversal.toList();
            Assert.assertEquals(count, vertices.size());
            stopWatch.stop();
            System.out.println(stopWatch.toString());
            stopWatch.reset();
        }
    }

Without the ArrayList optimization the output is,
0:00:12.198
0:00:09.756
0:00:09.435
0:00:14.466
0:00:10.197
0:00:04.937
0:00:02.974
0:00:02.942
0:00:02.977
0:00:03.142
0:00:03.207

With the ArrayList optimization the output is,
0:00:00.334
0:00:00.147
0:00:00.114
0:00:00.100
... time for jit
0:00:00.055
0:00:00.056
0:00:00.054
0:00:00.053
0:00:00.054
0:00:00.055

A significant difference.

For TinkerGraph this test's optimization is moot as the 
TraversalFilterStep resets the step for every step, making the 
TraverserSet's map empty so the traverser's equals method is never called.


Not sure if there are scenarios where this optimization will be any good 
for TinkerGraph but thought I'd let you know how I am optimizing steps.


A concern is that I am now replacing core steps, which moves Sqlg further 
away from the reference implementation, making it fragile to changes in 
TinkerPop and harder to keep up with upstream changes. Perhaps there is a 
way to make TraverserSet's current behavior configurable?


Cheers
Pieter





Re: io and graphson-v3

2017-09-06 Thread pieter gmail

Hi,

Pulled TINKERPOP-1767 branch, changed SqlgGraph's io method and ran the 
tests.


All the io tests are passing.
Only SerializationTest fails, for the same reason. It too needs the 
version specified. I did that locally and then all tests pass.


Thanks
Pieter

On 06/09/2017 18:09, Stephen Mallette wrote:

Pieter, I created this issue:

https://issues.apache.org/jira/browse/TINKERPOP-1767

and made an effort to try to figure a way to fix it:

https://github.com/apache/tinkerpop/tree/TINKERPOP-1767

Note the change to TinkerGraph and its io() method. I suppose you could do
something similar to get the right registry in play? could you have a look
and see if what i did helps? if that works then i'll issue a PR and we can
get it reviewed/merged.


On Tue, Sep 5, 2017 at 12:10 PM, pieter gmail <pieter.mar...@gmail.com>
wrote:


Ok, at present there is only one SimpleModule, the default. I can make it
v2 or v3 but not both.

Let's say I make the SimpleModule support V2.
Then when calling IoEdgeTest for

 {"graphson-v3", true, true,
 (Function<Graph, GraphReader>) g -> g.io(IoCore.graphson()).reader
().mapper(g.io(GraphSONIo.build(GraphSONVersion.V3_0)).
mapper().create()).create(),
 (Function<Graph, GraphWriter>) g -> g.io(IoCore.graphson()).writer
().mapper(g.io(GraphSONIo.build(GraphSONVersion.V3_0)).
mapper().create()).create()},

then the deserializers that run are for V2,

 static class SchemaTableIdJacksonDeserializerV2d0 extends
AbstractObjectDeserializer {
 SchemaTableIdJacksonDeserializerV2d0() {
 super(SchemaTable.class);
 }

 @Override
 public SchemaTable createObject(final Map data) {
 return SchemaTable.of((String)data.get("schema"), (String)
data.get("table"));
 }
 }

When createObject fires, the map data has V3 elements in it like @type and
@value whilst it's expecting "schema" and "table".

If we make the SimpleModule support V3 then graphson-v2 will fail.

{"graphson-v2", false, false,
(Function<Graph, GraphReader>) g -> g.io(IoCore.graphson()).reader
().mapper(g.io(GraphSONIo.build(GraphSONVersion.V2_0)).
mapper().typeInfo(TypeInfo.NO_TYPES).create()).create(),
(Function<Graph, GraphWriter>) g -> g.io(IoCore.graphson()).writer
().mapper(g.io(GraphSONIo.build(GraphSONVersion.V2_0)).
mapper().typeInfo(TypeInfo.NO_TYPES).create()).create()},

Now the deserializers are for V3.

 static class SchemaTableJacksonDeserializerV3d0 extends
StdDeserializer {
 public SchemaTableJacksonDeserializerV3d0() {
 super(RecordId.class);
 }

 @Override
 public SchemaTable deserialize(final JsonParser jsonParser, final
DeserializationContext deserializationContext) throws IOException,
JsonProcessingException {
 final Map<String, Object> data = 
deserializationContext.readValue(jsonParser,
Map.class);
 return SchemaTable.of((String)data.get("schema"), (String)
data.get("table"));
 }

 @Override
 public boolean isCachable() {
 return true;
 }
 }

This does not fire at all. Eventually I get a detached edge with an id
that is a map. It was never deserialized.

So basically it only works if the SimpleModule version, i.e. the
serialize/deserialize code, matches up with the GraphSON version.

Sqlg serializes RecordId with,

 static class RecordIdJacksonSerializerV3d0 extends
StdScalarSerializer {
 public RecordIdJacksonSerializerV3d0() {
 super(RecordId.class);
 }
 @Override
 public void serialize(final RecordId recordId, final JsonGenerator
jsonGenerator, final SerializerProvider serializerProvider)
 throws IOException, JsonGenerationException {
 final Map<String, Object> m = new HashMap<>();
 m.put("schemaTable", recordId.getSchemaTable());
 m.put("id", recordId.getId());
 jsonGenerator.writeObject(m);
 }
 }

and

 static class SchemaTableJacksonSerializerV3d0 extends
StdScalarSerializer {
 SchemaTableJacksonSerializerV3d0() {
 super(SchemaTable.class);
 }

 @Override
 public void serialize(final SchemaTable schemaTable, final
JsonGenerator jsonGenerator, final SerializerProvider serializerProvider)
 throws IOException, JsonGenerationException {
 // when types are not embedded, stringify or resort to JSON
primitive representations of the
 // type so that non-jvm languages can better interoperate with
the TinkerPop stack.
 final Map<String, Object> m = new LinkedHashMap<>();
 m.put("schema", schemaTable.getSchema());
 m.put("table", schemaTable.getTable());
 jso

Re: io and graphson-v3

2017-09-06 Thread pieter gmail

Thanks, I'll have a look.
For now on 3.3.0 I'll OptOut of some io tests. I'll let you know the 
OptOut list.


Thanks
Pieter

On 06/09/2017 18:09, Stephen Mallette wrote:

Pieter, I created this issue:

https://issues.apache.org/jira/browse/TINKERPOP-1767

and made an effort to try to figure a way to fix it:

https://github.com/apache/tinkerpop/tree/TINKERPOP-1767

Note the change to TinkerGraph and its io() method. I suppose you could do
something similar to get the right registry in play? could you have a look
and see if what i did helps? if that works then i'll issue a PR and we can
get it reviewed/merged.


On Tue, Sep 5, 2017 at 12:10 PM, pieter gmail <pieter.mar...@gmail.com>
wrote:


Ok, at present there is only one SimpleModule, the default. I can make it
v2 or v3 but not both.

Let's say I make the SimpleModule support V2.
Then when calling IoEdgeTest for

 {"graphson-v3", true, true,
 (Function<Graph, GraphReader>) g -> g.io(IoCore.graphson()).reader
().mapper(g.io(GraphSONIo.build(GraphSONVersion.V3_0)).
mapper().create()).create(),
 (Function<Graph, GraphWriter>) g -> g.io(IoCore.graphson()).writer
().mapper(g.io(GraphSONIo.build(GraphSONVersion.V3_0)).
mapper().create()).create()},

then the deserializers that run are for V2,

 static class SchemaTableIdJacksonDeserializerV2d0 extends
AbstractObjectDeserializer {
 SchemaTableIdJacksonDeserializerV2d0() {
 super(SchemaTable.class);
 }

 @Override
 public SchemaTable createObject(final Map data) {
 return SchemaTable.of((String)data.get("schema"), (String)
data.get("table"));
 }
 }

When createObject fires, the map data has V3 elements in it like @type and
@value whilst it's expecting "schema" and "table".

If we make the SimpleModule support V3 then graphson-v2 will fail.

{"graphson-v2", false, false,
(Function<Graph, GraphReader>) g -> g.io(IoCore.graphson()).reader
().mapper(g.io(GraphSONIo.build(GraphSONVersion.V2_0)).
mapper().typeInfo(TypeInfo.NO_TYPES).create()).create(),
(Function<Graph, GraphWriter>) g -> g.io(IoCore.graphson()).writer
().mapper(g.io(GraphSONIo.build(GraphSONVersion.V2_0)).
mapper().typeInfo(TypeInfo.NO_TYPES).create()).create()},

Now the deserializers are for V3.

 static class SchemaTableJacksonDeserializerV3d0 extends
StdDeserializer {
 public SchemaTableJacksonDeserializerV3d0() {
 super(RecordId.class);
 }

 @Override
 public SchemaTable deserialize(final JsonParser jsonParser, final
DeserializationContext deserializationContext) throws IOException,
JsonProcessingException {
 final Map<String, Object> data = 
deserializationContext.readValue(jsonParser,
Map.class);
 return SchemaTable.of((String)data.get("schema"), (String)
data.get("table"));
 }

 @Override
 public boolean isCachable() {
 return true;
 }
 }

This does not fire at all. Eventually I get a detached edge with an id
that is a map. It was never deserialized.

So basically it only works if the SimpleModule version, i.e. the
serialize/deserialize code, matches up with the GraphSON version.

Sqlg serializes RecordId with,

 static class RecordIdJacksonSerializerV3d0 extends
StdScalarSerializer {
 public RecordIdJacksonSerializerV3d0() {
 super(RecordId.class);
 }
 @Override
 public void serialize(final RecordId recordId, final JsonGenerator
jsonGenerator, final SerializerProvider serializerProvider)
 throws IOException, JsonGenerationException {
 final Map<String, Object> m = new HashMap<>();
 m.put("schemaTable", recordId.getSchemaTable());
 m.put("id", recordId.getId());
 jsonGenerator.writeObject(m);
 }
 }

and

 static class SchemaTableJacksonSerializerV3d0 extends
StdScalarSerializer {
 SchemaTableJacksonSerializerV3d0() {
 super(SchemaTable.class);
 }

 @Override
 public void serialize(final SchemaTable schemaTable, final
JsonGenerator jsonGenerator, final SerializerProvider serializerProvider)
 throws IOException, JsonGenerationException {
 // when types are not embedded, stringify or resort to JSON
primitive representations of the
 // type so that non-jvm languages can better interoperate with
the TinkerPop stack.
 final Map<String, Object> m = new LinkedHashMap<>();
 m.put("schema", schemaTable.getSchema());
 m.put("table", schemaTable.getTable());
 jsonGenerator.writeObject(m);
 }

 }

Hope it all makes some sense,
Pieter


On 05/09/2017 17:31, Stephen Mallette wrote:


I guess I'

Re: io and graphson-v3

2017-09-05 Thread pieter gmail
Ok, at present there is only one SimpleModule, the default. I can make 
it v2 or v3 but not both.


Let's say I make the SimpleModule support V2.
Then when calling IoEdgeTest for

    {"graphson-v3", true, true,
    (Function<Graph, GraphReader>) g -> 
g.io(IoCore.graphson()).reader().mapper(g.io(GraphSONIo.build(GraphSONVersion.V3_0)).mapper().create()).create(),
    (Function<Graph, GraphWriter>) g -> 
g.io(IoCore.graphson()).writer().mapper(g.io(GraphSONIo.build(GraphSONVersion.V3_0)).mapper().create()).create()},


then the deserializers that run are for V2,

    static class SchemaTableIdJacksonDeserializerV2d0 extends AbstractObjectDeserializer {
        SchemaTableIdJacksonDeserializerV2d0() {
            super(SchemaTable.class);
        }

        @Override
        public SchemaTable createObject(final Map data) {
            return SchemaTable.of((String) data.get("schema"), (String) data.get("table"));
        }
    }

When createObject fires, the map data has V3 elements in it like @type 
and @value whilst it's expecting "schema" and "table".
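
The mismatch can be pictured with plain maps. This is only a sketch, not Sqlg or GraphSON code: the exact V3 payload is not shown in this thread, so the wrapper below with an "@type" name and an "@value" body is an assumed shape, and "example:SchemaTable" is a made-up type name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TypedPayloadSketch {

    // The flat shape a V2-style reader expects.
    static Map<String, Object> plain() {
        final Map<String, Object> m = new LinkedHashMap<>();
        m.put("schema", "public");
        m.put("table", "Person");
        return m;
    }

    // Assumed typed shape: the same fields nested under "@value", tagged with "@type".
    static Map<String, Object> typed() {
        final Map<String, Object> m = new LinkedHashMap<>();
        m.put("@type", "example:SchemaTable"); // hypothetical type name
        m.put("@value", plain());
        return m;
    }

    // Reads "schema" and "table" from the top level, as the V2 deserializer does.
    static String readPlain(final Map<?, ?> data) {
        return data.get("schema") + "." + data.get("table");
    }

    // Unwraps the typed envelope first when it is present.
    static String readUnwrapping(final Map<?, ?> data) {
        final Object value = data.get("@value");
        if (data.containsKey("@type") && value instanceof Map) {
            return readPlain((Map<?, ?>) value);
        }
        return readPlain(data);
    }

    public static void main(final String[] args) {
        System.out.println(readPlain(plain()));
        System.out.println(readPlain(typed()));
        System.out.println(readUnwrapping(typed()));
    }
}
```

Reading the typed map with the flat reader finds neither key, which is the kind of failure described above.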


If we make the SimpleModule support V3 then graphson-v2 will fail.

   {"graphson-v2", false, false,
       (Function<Graph, GraphReader>) g -> 
g.io(IoCore.graphson()).reader().mapper(g.io(GraphSONIo.build(GraphSONVersion.V2_0)).mapper().typeInfo(TypeInfo.NO_TYPES).create()).create(),
   (Function<Graph, GraphWriter>) g -> 
g.io(IoCore.graphson()).writer().mapper(g.io(GraphSONIo.build(GraphSONVersion.V2_0)).mapper().typeInfo(TypeInfo.NO_TYPES).create()).create()},


Now the deserializers are for V3.

    static class SchemaTableJacksonDeserializerV3d0 extends StdDeserializer {
        public SchemaTableJacksonDeserializerV3d0() {
            super(RecordId.class);
        }

        @Override
        public SchemaTable deserialize(final JsonParser jsonParser, final DeserializationContext deserializationContext)
                throws IOException, JsonProcessingException {
            final Map<String, Object> data = deserializationContext.readValue(jsonParser, Map.class);
            return SchemaTable.of((String) data.get("schema"), (String) data.get("table"));
        }

        @Override
        public boolean isCachable() {
            return true;
        }
    }

This does not fire at all. Eventually I get a detached edge with an id 
that is a map. It was never deserialized.


So basically it only works if the SimpleModule version, i.e. the 
serialize/deserialize code, matches up with the GraphSON version.


Sqlg serializes RecordId with,

    static class RecordIdJacksonSerializerV3d0 extends StdScalarSerializer {
        public RecordIdJacksonSerializerV3d0() {
            super(RecordId.class);
        }

        @Override
        public void serialize(final RecordId recordId, final JsonGenerator jsonGenerator, final SerializerProvider serializerProvider)
                throws IOException, JsonGenerationException {
            final Map<String, Object> m = new HashMap<>();
            m.put("schemaTable", recordId.getSchemaTable());
            m.put("id", recordId.getId());
            jsonGenerator.writeObject(m);
        }
    }

and

    static class SchemaTableJacksonSerializerV3d0 extends StdScalarSerializer {
        SchemaTableJacksonSerializerV3d0() {
            super(SchemaTable.class);
        }

        @Override
        public void serialize(final SchemaTable schemaTable, final JsonGenerator jsonGenerator, final SerializerProvider serializerProvider)
                throws IOException, JsonGenerationException {
            // when types are not embedded, stringify or resort to JSON primitive representations of the
            // type so that non-jvm languages can better interoperate with the TinkerPop stack.
            final Map<String, Object> m = new LinkedHashMap<>();
            m.put("schema", schemaTable.getSchema());
            m.put("table", schemaTable.getTable());
            jsonGenerator.writeObject(m);
        }
    }

Hope it all makes some sense,
Pieter


On 05/09/2017 17:31, Stephen Mallette wrote:

I guess I'm trying to understand why it matters for the purpose of the test. If
you mix/match versions I can't think of why the test would care one way or
the other. Does Sqlg serialize its id to a JSON Map?

On Tue, Sep 5, 2017 at 11:19 AM, pieter gmail <pieter.mar...@gmail.com>
wrote:


I looked at TinkerGraph's implementation. In fact I copied it. TinkerGraph
does not have any special id serialization. In fact both its
TinkerIoRegistryV3d0 and TinkerIoRegistryV2d0 registries use TinkerModuleV2d0.


In the IoCustomTest tests you call
g.io(GraphSONIo.build(GraphSONVersion.V2_0)).mapper().addCustomModule(moduleV2d0).
I.e. the test manually registers the appropriate SimpleModule for each
version.
For IoEdgeTest the same needs to happe

Re: io and graphson-v3

2017-09-05 Thread pieter gmail
I looked at TinkerGraph's implementation. In fact I copied it. 
TinkerGraph does not have any special id serialization. In fact both its 
TinkerIoRegistryV3d0 and TinkerIoRegistryV2d0 registries use TinkerModuleV2d0.



In IoCustomTest tests you call 
g.io(GraphSONIo.build(GraphSONVersion.V2_0)).mapper().addCustomModule(moduleV2d0).
I.e. the test manually registers the appropriate SimpleModule for each 
version.


For IoEdgeTest the same needs to happen somehow. Currently I have only 
one default V3 SimpleModule same as TinkerGraph. I have written a V2 
SimpleModule but how in IoEdgeTest will the correct IoRegistry or 
SimpleModule be selected? The test itself does not call 
addCustomModule() and between the builders, mappers, registries and 
modules I don't see how to add it.


Thanks,
Pieter

On 05/09/2017 16:30, Stephen Mallette wrote:

You have registries for each version as well and default to v3. Please see
TinkerGraph:

https://github.com/apache/tinkerpop/blob/master/tinkergraph-gremlin/src/main/java/org/apache/tinkerpop/gremlin/tinkergraph/structure/TinkerGraph.java#L198

If the user wants to override that then that's their choice, but they have
to rig it all up. We probably need a better system than this. IO is way too
complicated and confusing.

On Tue, Sep 5, 2017 at 9:52 AM, pieter gmail <pieter.mar...@gmail.com>
wrote:


Afraid I still don't quite get it. How do I register the different
SimpleModules depending on the version?

Currently it starts in SqlgGraph,

@Override
public <I extends Io> I io(final Io.Builder<I> builder) {
    return (I) builder.graph(this).onMapper(mapper ->
            mapper.addRegistry(SqlgIoRegistry.getInstance())).create();
}

and the SqlgIoRegistry registers the SimpleModule

private SqlgIoRegistry() {
 final SqlgSimpleModule sqlgSimpleModule = new SqlgSimpleModule();
 register(GraphSONIo.class, null, sqlgSimpleModule);
 register(GryoIo.class, RecordId.class, null);
}


Is the SqlgGraph.io(...) method supposed to interrogate the builder to
check the version and add a corresponding IoRegistry and SimpleModule?

Thanks
Pieter






On 05/09/2017 13:30, Stephen Mallette wrote:


I think you should just create a new SimpleModule for each version - don't
try to put them in the same SqlgSimpleModule. It's a naming convention that
users will have to follow rather than something explicitly enforced through
code.

On Sun, Sep 3, 2017 at 3:20 PM, pieter gmail <pieter.mar...@gmail.com>
wrote:

Hi,

I am getting IO tests failures on 3.3.0.

Sqlg has a SimpleModule which adds serializers for its custom id.

  SqlgSimpleModule() {
      super("custom");
//    addSerializer(RecordId.class, new RecordId.RecordIdJacksonSerializerV2d0());
//    addDeserializer(RecordId.class, new RecordId.RecordIdJacksonDeserializerV2d0());
//    addSerializer(SchemaTable.class, new SchemaTable.SchemaTableIdJacksonSerializerV2d0());
//    addDeserializer(SchemaTable.class, new SchemaTable.SchemaTableIdJacksonDeserializerV2d0());

      addSerializer(RecordId.class, new RecordId.RecordIdJacksonSerializerV3d0());
      addDeserializer(RecordId.class, new RecordId.RecordIdJacksonDeserializerV3d0());
      addSerializer(SchemaTable.class, new SchemaTable.SchemaTableJacksonSerializerV3d0());
      addDeserializer(SchemaTable.class, new SchemaTable.SchemaTableJacksonDeserializerV3d0());
  }

How is it supposed to distinguish between v2 and v3?

An example of a failure is 'IoEdgeTest.shouldReadWriteEdge'

If ...V2d0.. is added to the serializers then 'graphson-v3' fails.
If ...V3d0.. is added to the serializers then 'graphson-v2' fails.

TinkerPop's own CustomId tests do not rely on default behavior and
manually create SimpleModules for each scenario.

Are they both supposed to work somehow?

Thanks
Pieter






Re: io and graphson-v3

2017-09-05 Thread pieter gmail
Afraid I still don't quite get it. How do I register the different
SimpleModules depending on the version?


Currently it starts in SqlgGraph,

@Override
public <I extends Io> I io(final Io.Builder<I> builder) {
    return (I) builder.graph(this)
            .onMapper(mapper -> mapper.addRegistry(SqlgIoRegistry.getInstance()))
            .create();
}

and the SqlgIoRegistry registers the SimpleModule

private SqlgIoRegistry() {
    final SqlgSimpleModule sqlgSimpleModule = new SqlgSimpleModule();
    register(GraphSONIo.class, null, sqlgSimpleModule);
    register(GryoIo.class, RecordId.class, null);
}


Is the SqlgGraph.io(...) method supposed to interrogate the builder to
check the version and add a corresponding IoRegistry and SimpleModule?


Thanks
Pieter






On 05/09/2017 13:30, Stephen Mallette wrote:

I think you should just create a new SimpleModule for each version - don't
try to put them in the same SqlgSimpleModule. It's a naming convention that
users will have to follow rather than something explicitly enforced through
code.

On Sun, Sep 3, 2017 at 3:20 PM, pieter gmail <pieter.mar...@gmail.com>
wrote:


Hi,

I am getting IO test failures on 3.3.0.

Sqlg has a SimpleModule which adds serializers for its custom id.

    SqlgSimpleModule() {
        super("custom");
        // addSerializer(RecordId.class, new RecordId.RecordIdJacksonSerializerV2d0());
        // addDeserializer(RecordId.class, new RecordId.RecordIdJacksonDeserializerV2d0());
        // addSerializer(SchemaTable.class, new SchemaTable.SchemaTableIdJacksonSerializerV2d0());
        // addDeserializer(SchemaTable.class, new SchemaTable.SchemaTableIdJacksonDeserializerV2d0());

        addSerializer(RecordId.class, new RecordId.RecordIdJacksonSerializerV3d0());
        addDeserializer(RecordId.class, new RecordId.RecordIdJacksonDeserializerV3d0());
        addSerializer(SchemaTable.class, new SchemaTable.SchemaTableJacksonSerializerV3d0());
        addDeserializer(SchemaTable.class, new SchemaTable.SchemaTableJacksonDeserializerV3d0());
    }

How is it supposed to distinguish between v2 and v3?

An example of a failure is 'IoEdgeTest.shouldReadWriteEdge'

If ...V2d0.. is added to the serializers then 'graphson-v3' fails.
If ...V3d0.. is added to the serializers then 'graphson-v2' fails.

TinkerPop's own CustomId tests do not rely on default behavior and
manually create SimpleModules for each scenario.

Are they both supposed to work somehow?

Thanks
Pieter





RE: io and graphson-v3

2017-09-03 Thread pieter gmail

Hi,

I am getting IO test failures on 3.3.0.

Sqlg has a SimpleModule which adds serializers for its custom id.

    SqlgSimpleModule() {
        super("custom");
        // addSerializer(RecordId.class, new RecordId.RecordIdJacksonSerializerV2d0());
        // addDeserializer(RecordId.class, new RecordId.RecordIdJacksonDeserializerV2d0());
        // addSerializer(SchemaTable.class, new SchemaTable.SchemaTableIdJacksonSerializerV2d0());
        // addDeserializer(SchemaTable.class, new SchemaTable.SchemaTableIdJacksonDeserializerV2d0());

        addSerializer(RecordId.class, new RecordId.RecordIdJacksonSerializerV3d0());
        addDeserializer(RecordId.class, new RecordId.RecordIdJacksonDeserializerV3d0());
        addSerializer(SchemaTable.class, new SchemaTable.SchemaTableJacksonSerializerV3d0());
        addDeserializer(SchemaTable.class, new SchemaTable.SchemaTableJacksonDeserializerV3d0());
    }

How is it supposed to distinguish between v2 and v3?

An example of a failure is 'IoEdgeTest.shouldReadWriteEdge'

If ...V2d0.. is added to the serializers then 'graphson-v3' fails.
If ...V3d0.. is added to the serializers then 'graphson-v2' fails.

TinkerPop's own CustomId tests do not rely on default behavior and
manually create SimpleModules for each scenario.

Are they both supposed to work somehow?

Thanks
Pieter


Re: [VOTE] TinkerPop 3.3.0 Release

2017-08-24 Thread pieter gmail
Afraid I won't get time to refactor Sqlg's ChooseStep/OptionalStep
optimizations to deal with the new OptionalStep before the vote
expires. So not all tests are passing, but it's a Sqlg problem.


Apart from that all seems well.

VOTE +1

Cheers
Pieter

On 24/08/2017 16:44, Stephen Mallette wrote:

doh - i can't believe that's still there. i'll just quick delete that line
in svn. stupid.

On Thu, Aug 24, 2017 at 10:41 AM, Robert Dale  wrote:


Seemed like there was something to do here before the release

http://tinkerpop.apache.org/docs/3.3.0/upgrade/#_changes_to_io
WILL NEED TO WRITE SOMETHING MORE COHESIVE HERE - JUST LISTING STUFF FOR
RIGHT NOW

Maybe 'here' doesn't mean there.

VOTE +1


Robert Dale

On Tue, Aug 22, 2017 at 12:57 PM, Daniel Kuppitz  wrote:


*Validating binary distributions*

* downloading Apache TinkerPop Gremlin
(apache-tinkerpop-gremlin-console-3.3.0-bin.zip)... OK
* validating signatures and checksums ...
   * PGP signature ... OK
   * MD5 checksum ... OK
   * SHA1 checksum ... OK
* unzipping Apache TinkerPop Gremlin ... OK
* validating Apache TinkerPop Gremlin's docs ... OK
* validating Apache TinkerPop Gremlin's binaries ... OK
* validating Apache TinkerPop Gremlin's legal files ...
   * LICENSE ... OK
   * NOTICE ... OK
* validating Apache TinkerPop Gremlin's plugin directory ... OK
* validating Apache TinkerPop Gremlin's lib directory ... OK
* testing script evaluation ... OK

* downloading Apache TinkerPop Gremlin
(apache-tinkerpop-gremlin-server-3.3.0-bin.zip)... OK
* validating signatures and checksums ...
   * PGP signature ... OK
   * MD5 checksum ... OK
   * SHA1 checksum ... OK
* unzipping Apache TinkerPop Gremlin ... OK
* validating Apache TinkerPop Gremlin's docs ... OK
* validating Apache TinkerPop Gremlin's binaries ... OK
* validating Apache TinkerPop Gremlin's legal files ...
   * LICENSE ... OK
   * NOTICE ... OK
* validating Apache TinkerPop Gremlin's plugin directory ... OK
* validating Apache TinkerPop Gremlin's lib directory ... OK

Validating source distribution

* downloading Apache TinkerPop 3.3.0 (apache-tinkerpop-3.3.0-src.zip)...
OK
* validating signatures and checksums ...
   * PGP signature ... OK
   * MD5 checksum ... OK
   * SHA1 checksum ... OK
* unzipping Apache TinkerPop 3.3.0 ... OK
* building project ... OK


VOTE: +1

Cheers,
Daniel


On Tue, Aug 22, 2017 at 8:57 AM, Stephen Mallette 
wrote:


Hello,

We are happy to announce that TinkerPop 3.3.0 is ready for release.

The release artifacts can be found at this location:
 https://dist.apache.org/repos/dist/dev/tinkerpop/3.3.0/

The source distribution is provided by:
 apache-tinkerpop-3.3.0-src.zip

Two binary distributions are provided for user convenience:
 apache-tinkerpop-gremlin-console-3.3.0-bin.zip
 apache-tinkerpop-gremlin-server-3.3.0-bin.zip

The GPG key used to sign the release artifacts is available at:
 https://dist.apache.org/repos/dist/dev/tinkerpop/KEYS

The online docs can be found here:
 http://tinkerpop.apache.org/docs/3.3.0/ (user docs)
 http://tinkerpop.apache.org/docs/3.3.0/upgrade/ (upgrade docs)
 http://tinkerpop.apache.org/javadocs/3.3.0/core/ (core javadoc)
 http://tinkerpop.apache.org/javadocs/3.3.0/full/ (full javadoc)

The tag in Apache Git can be found here:

https://git-wip-us.apache.org/repos/asf?p=tinkerpop.git;a=tag;h=e8aa977ab33e0ceec6ac9d06a3262d52b4e16511

The release notes are available here:

https://github.com/apache/tinkerpop/blob/3.3.0/CHANGELOG.asciidoc#tinkerpop-330-release-date-august-21-2017

The [VOTE] will be open for the next 72 hours --- closing Friday (August
21, 2017) at 12pm EST.

My vote is +1.

Thank you very much,
Stephen





Re: [VOTE] TinkerPop 3.2.6 Release

2017-08-22 Thread pieter gmail
Ran Sqlg's own test suite and TinkerPop's structure and process test
suites on Sqlg.

All pass.

VOTE +1

Cheers
Pieter


On 22/08/2017 20:29, Robert Dale wrote:

VOTE +1

Robert Dale

On Tue, Aug 22, 2017 at 10:33 AM, Daniel Kuppitz  wrote:


*Validating binary distributions*

* downloading Apache TinkerPop Gremlin
(apache-tinkerpop-gremlin-console-3.2.6-bin.zip)... OK
* validating signatures and checksums ...
   * PGP signature ... OK
   * MD5 checksum ... OK
   * SHA1 checksum ... OK
* unzipping Apache TinkerPop Gremlin ... OK
* validating Apache TinkerPop Gremlin's docs ... OK
* validating Apache TinkerPop Gremlin's binaries ... OK
* validating Apache TinkerPop Gremlin's legal files ...
   * LICENSE ... OK
   * NOTICE ... OK
* validating Apache TinkerPop Gremlin's plugin directory ... OK
* validating Apache TinkerPop Gremlin's lib directory ... OK
* testing script evaluation ... OK

* downloading Apache TinkerPop Gremlin
(apache-tinkerpop-gremlin-server-3.2.6-bin.zip)... OK
* validating signatures and checksums ...
   * PGP signature ... OK
   * MD5 checksum ... OK
   * SHA1 checksum ... OK
* unzipping Apache TinkerPop Gremlin ... OK
* validating Apache TinkerPop Gremlin's docs ... OK
* validating Apache TinkerPop Gremlin's binaries ... OK
* validating Apache TinkerPop Gremlin's legal files ...
   * LICENSE ... OK
   * NOTICE ... OK
* validating Apache TinkerPop Gremlin's plugin directory ... OK
* validating Apache TinkerPop Gremlin's lib directory ... OK

Validating source distribution

* downloading Apache TinkerPop 3.2.6 (apache-tinkerpop-3.2.6-src.zip)...
OK
* validating signatures and checksums ...
   * PGP signature ... OK
   * MD5 checksum ... OK
   * SHA1 checksum ... OK
* unzipping Apache TinkerPop 3.2.6 ... OK
* building project ... OK



VOTE: +1

Cheers,
Daniel


On Tue, Aug 22, 2017 at 4:54 AM, Stephen Mallette 
wrote:


Hello,

We are happy to announce that TinkerPop 3.2.6 is ready for release.

The release artifacts can be found at this location:
 https://dist.apache.org/repos/dist/dev/tinkerpop/3.2.6/

The source distribution is provided by:
 apache-tinkerpop-3.2.6-src.zip

Two binary distributions are provided for user convenience:
 apache-tinkerpop-gremlin-console-3.2.6-bin.zip
 apache-tinkerpop-gremlin-server-3.2.6-bin.zip

The GPG key used to sign the release artifacts is available at:
 https://dist.apache.org/repos/dist/dev/tinkerpop/KEYS

The online docs can be found here:
 http://tinkerpop.apache.org/docs/3.2.6/ (user docs)
 http://tinkerpop.apache.org/docs/3.2.6/upgrade/ (upgrade docs)
 http://tinkerpop.apache.org/javadocs/3.2.6/core/ (core javadoc)
 http://tinkerpop.apache.org/javadocs/3.2.6/full/ (full javadoc)

The tag in Apache Git can be found here:

https://git-wip-us.apache.org/repos/asf?p=tinkerpop.git;a=tag;h=43ed34fb10002308117b6cfabd4870d958cc2d99

The release notes are available here:

https://github.com/apache/tinkerpop/blob/3.2.6/CHANGELOG.asciidoc#tinkerpop-326-release-date-august-21-2017

The [VOTE] will be open for the next 72 hours --- closing Friday (August
25, 2017) at 8AM EST.

My vote is +1.

Thank you very much,
Stephen





Re: [VOTE] TinkerPop 3.2.5 Release

2017-06-17 Thread pieter gmail

VOTE +1

Cheers
Pieter

On 15/06/2017 22:33, Stephen Mallette wrote:

Hello (again),

We are happy to announce (again) that TinkerPop 3.2.5 is ready for release.

The release artifacts can be found at this location:
 https://dist.apache.org/repos/dist/dev/tinkerpop/3.2.5/

The source distribution is provided by:
 apache-tinkerpop-3.2.5-src.zip

Two binary distributions are provided for user convenience:
 apache-tinkerpop-gremlin-console-3.2.5-bin.zip
 apache-tinkerpop-gremlin-server-3.2.5-bin.zip

The GPG key used to sign the release artifacts is available at:
 https://dist.apache.org/repos/dist/dev/tinkerpop/KEYS

The online docs can be found here:
 http://tinkerpop.apache.org/docs/3.2.5/ (user docs)
 http://tinkerpop.apache.org/docs/3.2.5/upgrade/ (upgrade docs)
 http://tinkerpop.apache.org/javadocs/3.2.5/core/ (core javadoc)
 http://tinkerpop.apache.org/javadocs/3.2.5/full/ (full javadoc)

The tag in Apache Git can be found here:

https://git-wip-us.apache.org/repos/asf?p=tinkerpop.git;a=tag;h=0ecbb067a212711a6b996af828cc932956282dcc

The release notes are available here:

 https://github.com/apache/tinkerpop/blob/master/CHANGELOG.asciidoc#tinkerpop-320-nine-inch-gremlins

The [VOTE] will be open for the next 72 hours --- closing Sunday (June 18,
2017) at  4:30pm EST

My vote is +1 (again).

Thank you very much,
Stephen





RE: gremlin-test and h2 database

2017-06-14 Thread pieter gmail

Hi,

I just noticed for the first time that gremlin-test has a dependency on 
h2 database.



<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <version>1.3.171</version>
</dependency>


Where is it being used?

Thanks
Pieter


Re: [VOTE] TinkerPop 3.2.5 Release

2017-06-14 Thread pieter gmail

Ran all Sqlg's custom tests and the structure and process test suites.
All tests pass.

VOTE +1

On 14/06/2017 03:36, David Brown wrote:

Ran ./validate-distribution.sh 3.2.5 - all ok

Installed gremlin-python 3.2.5 from source as aiogremlin dependency - successful

Tested aiogremlin against Gremlin-Server 3.2.5 - all tests pass

Manually ran Gremlin-Python tests using Pytest against Gremlin-Server 3.2.5:

* Python 2.7.12 - all tests pass
* Python 3.5.2 - all tests pass

Nice work everyone!

VOTE +1

On Tue, Jun 13, 2017 at 2:36 PM, Robert Dale  wrote:

+1 LGTM

Robert Dale

On Tue, Jun 13, 2017 at 6:29 AM, Daniel Kuppitz  wrote:


*Validating binary distributions*

* downloading Apache TinkerPop Gremlin
(apache-tinkerpop-gremlin-console-3.2.5-bin.zip)... OK
* validating signatures and checksums ...
   * PGP signature ... OK
   * MD5 checksum ... OK
   * SHA1 checksum ... OK
* unzipping Apache TinkerPop Gremlin ... OK
* validating Apache TinkerPop Gremlin's docs ... OK
* validating Apache TinkerPop Gremlin's binaries ... OK
* validating Apache TinkerPop Gremlin's legal files ...
   * LICENSE ... OK
   * NOTICE ... OK
* validating Apache TinkerPop Gremlin's plugin directory ... OK
* validating Apache TinkerPop Gremlin's lib directory ... OK
* testing script evaluation ... OK

* downloading Apache TinkerPop Gremlin
(apache-tinkerpop-gremlin-server-3.2.5-bin.zip)... OK
* validating signatures and checksums ...
   * PGP signature ... OK
   * MD5 checksum ... OK
   * SHA1 checksum ... OK
* unzipping Apache TinkerPop Gremlin ... OK
* validating Apache TinkerPop Gremlin's docs ... OK
* validating Apache TinkerPop Gremlin's binaries ... OK
* validating Apache TinkerPop Gremlin's legal files ...
   * LICENSE ... OK
   * NOTICE ... OK
* validating Apache TinkerPop Gremlin's plugin directory ... OK
* validating Apache TinkerPop Gremlin's lib directory ... OK

Validating source distribution

* downloading Apache TinkerPop 3.2.5 (apache-tinkerpop-3.2.5-src.zip)...
OK
* validating signatures and checksums ...
   * PGP signature ... OK
   * MD5 checksum ... OK
   * SHA1 checksum ... OK
* unzipping Apache TinkerPop 3.2.5 ... OK
* building project ... OK

VOTE: +1


On Mon, Jun 12, 2017 at 11:20 PM, Stephen Mallette 
wrote:


Ok - fixed:

https://github.com/apache/tinkerpop/commit/3977783f9b66bde8728b04be215ff1460e9b0af9

Uploaded fresh zips. Please continue with review/VOTE. Thanks

On Mon, Jun 12, 2017 at 3:04 PM, Stephen Mallette 
wrote:


dah - wonder how that happened. that's like the worst kind of mistake to
make. small enough to be largely inconsequential, but still staring you
right in the face to be big enough to have to re-do stuff. ok - consider
this vote thread on hold for now. i have to fix that. thanks for noticing
that.

On Mon, Jun 12, 2017 at 2:58 PM, Robert Dale wrote:

Looks like some extra chars got introduced here:

commit 9e14d674d26bf0a2b6b5bcd3d6a57feeeba24b16:
CHANGELOG.asciidoc:
-* Fixed an `NullPointerException` in `GraphMLReader` that occurred when an
`` didn't have an ID field and the base graph supported ID assignment.
+* Fixed an `NullPoiBugs
+nterException` in `GraphMLReader` that occurred when an ``
didn't have an ID field and the base graph supported ID assignment.


Robert Dale

On Mon, Jun 12, 2017 at 12:00 PM, Stephen Mallette <spmalle...@gmail.com>
wrote:


Hello,

We are happy to announce that TinkerPop 3.2.5 is ready for release.

The release artifacts can be found at this location:
 https://dist.apache.org/repos/dist/dev/tinkerpop/3.2.5/

The source distribution is provided by:
 apache-tinkerpop-3.2.5-src.zip

Two binary distributions are provided for user convenience:
 apache-tinkerpop-gremlin-console-3.2.5-bin.zip
 apache-tinkerpop-gremlin-server-3.2.5-bin.zip

The GPG key used to sign the release artifacts is available at:
 https://dist.apache.org/repos/dist/dev/tinkerpop/KEYS

The online docs can be found here:
 http://tinkerpop.apache.org/docs/3.2.5/ (user docs)
 http://tinkerpop.apache.org/docs/3.2.5/upgrade/ (upgrade docs)
 http://tinkerpop.apache.org/javadocs/3.2.5/core/ (core javadoc)
 http://tinkerpop.apache.org/javadocs/3.2.5/full/ (full javadoc)

The tag in Apache Git can be found here:

https://git-wip-us.apache.org/repos/asf?p=tinkerpop.git;a=tag;h=9a60e34a3d590a311d4d1681143b6407c9b5dc13

The release notes are available here:

https://github.com/apache/tinkerpop/blob/master/CHANGELOG.asciidoc#tinkerpop-320-nine-inch-gremlins

The [VOTE] will be open for the next 72 hours --- closing Thursday (June
15, 2017) at  12pm EST

My vote is +1.

Thank you very much,
Stephen










Re: TraversalHelper.anyStepRecursively bug

2017-06-14 Thread pieter gmail

Hi,

No, I cannot demonstrate a traversal failure, at least not using
TinkerGraph. The failure I get is while optimizing a step in Sqlg.
Previously I would look for CyclicPathStep or SimplePathStep, neither of
which is a TraversalParent, so TraversalHelper.anyStepRecursively
happened to work.


With the 3.2.5 refactor I am looking for PathFilterStep, which is a
TraversalParent, thus causing the bug to occur.


I can show that TraversalHelper.anyStepRecursively is incorrect.

@Test
public void test2() {
    final TinkerGraph g = TinkerGraph.open();
    Traversal.Admin traversal = (Traversal.Admin) g.traversal()
            .V()
            .repeat(
                    __.out().simplePath()
            );
    Predicate p = s -> s.getClass().equals(PathFilterStep.class);
    Assert.assertTrue(TraversalHelper.anyStepRecursively(p, traversal));

    traversal = (Traversal.Admin) g.traversal()
            .V()
            .optional(
                    __.out().path()
            );
    p = s -> s.getClass().equals(PathStep.class);
    Assert.assertTrue(TraversalHelper.anyStepRecursively(p, traversal));
}


The only place I see TraversalHelper.anyStepRecursively being used in 
TinkerPop is in the PathRetractionStrategy. Its code is a bit too dense 
for me to quickly understand the effect on it.


Thanks
Pieter

On 13/06/2017 21:32, Marko Rodriguez wrote:

Hello,

Can you provide a CyclicPathTest that demonstrates the failure?

Marko.



On Jun 13, 2017, at 10:34 AM, pieter gmail <pieter.mar...@gmail.com> wrote:

Hi,

With 3.2.5 CyclicPathStep and SimplePathStep have been replaced with
PathFilterStep.
This is fine, but whilst doing the refactor I noticed what appears to be a bug
in TraversalHelper.anyStepRecursively

public static boolean anyStepRecursively(final Predicate<Step> predicate, final Traversal.Admin<?, ?> traversal) {
    for (final Step step : traversal.getSteps()) {
        if (predicate.test(step)) {
            return true;
        }

        if (step instanceof TraversalParent) anyStepRecursively(predicate, ((TraversalParent) step));
    }
    return false;
}

Surely the second if statement should return true if
anyStepRecursively(predicate, ((TraversalParent) step)); returns true?
i.e.

public static boolean anyStepRecursively(final Predicate<Step> predicate, final Traversal.Admin<?, ?> traversal) {
    for (final Step step : traversal.getSteps()) {
        if (predicate.test(step)) {
            return true;
        }

        if (step instanceof TraversalParent) {
            if (anyStepRecursively(predicate, ((TraversalParent) step))) {
                return true;
            }
        }
    }
    return false;
}

Cheers
Pieter






TraversalHelper.anyStepRecursively bug

2017-06-13 Thread pieter gmail

Hi,

With 3.2.5 CyclicPathStep and SimplePathStep have been replaced with
PathFilterStep.
This is fine, but whilst doing the refactor I noticed what appears to be
a bug in TraversalHelper.anyStepRecursively


public static boolean anyStepRecursively(final Predicate<Step> predicate, final Traversal.Admin<?, ?> traversal) {
    for (final Step step : traversal.getSteps()) {
        if (predicate.test(step)) {
            return true;
        }

        if (step instanceof TraversalParent) anyStepRecursively(predicate, ((TraversalParent) step));
    }
    return false;
}

Surely the second if statement should return true if
anyStepRecursively(predicate, ((TraversalParent) step)); returns true?

i.e.

public static boolean anyStepRecursively(final Predicate<Step> predicate, final Traversal.Admin<?, ?> traversal) {
    for (final Step step : traversal.getSteps()) {
        if (predicate.test(step)) {
            return true;
        }

        if (step instanceof TraversalParent) {
            if (anyStepRecursively(predicate, ((TraversalParent) step))) {
                return true;
            }
        }
    }
    return false;
}
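
The difference between the two versions can be reproduced without TinkerPop. The following is a minimal sketch on a toy step tree; `AnyStepDemo`, `Node`, `anyBuggy` and `anyFixed` are hypothetical names for illustration only, and a node with children stands in for a TraversalParent.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class AnyStepDemo {
    // Toy stand-in for a step: a name plus optional child steps (a "parent" step).
    public static final class Node {
        public final String name;
        public final List<Node> children;

        public Node(String name, Node... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    // Mirrors the reported behaviour: the recursive result is computed, then dropped.
    public static boolean anyBuggy(Predicate<Node> p, List<Node> steps) {
        for (Node step : steps) {
            if (p.test(step)) return true;
            if (!step.children.isEmpty()) anyBuggy(p, step.children); // result ignored
        }
        return false;
    }

    // Mirrors the proposed fix: a match found in a child is propagated upwards.
    public static boolean anyFixed(Predicate<Node> p, List<Node> steps) {
        for (Node step : steps) {
            if (p.test(step)) return true;
            if (!step.children.isEmpty() && anyFixed(p, step.children)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // V().repeat(out().simplePath()) modelled as a two-level tree.
        List<Node> traversal = Arrays.asList(
                new Node("V"),
                new Node("repeat", new Node("out"), new Node("pathFilter")));
        Predicate<Node> isPathFilter = n -> n.name.equals("pathFilter");
        System.out.println(anyBuggy(isPathFilter, traversal)); // prints false
        System.out.println(anyFixed(isPathFilter, traversal)); // prints true
    }
}
```

In the toy model the buggy version can only ever match top-level steps, which would explain why a search for a step that is never nested happened to work.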

Cheers
Pieter




RE: mutation visibility semantics

2017-06-13 Thread pieter gmail

Hi,

Testing Sqlg on 3.2.5, I am getting failures in
EventStrategyProcessTest.shouldDetachVertexPropertyWhenRemoved


final GraphTraversalSource gts = create(eventStrategy);

gts.V(v).properties("to-remove").drop().iterate();
tryCommit(graph);

assertEquals(1, IteratorUtils.count(v.properties()));

The code assumes that the v that is currently in memory will
automatically be kept in sync.


This is not my understanding of TinkerPop's semantics.
The v object was not itself updated, so I would expect the test to
re-fetch it before asserting.


Here is a simpler test to illustrate the issue.

@Test
public void test() {
    Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "name", "John");
    this.sqlgGraph.tx().commit();

    Vertex a1Again = this.sqlgGraph.traversal().V(a1).next();
    a1Again.property("name", "Peter");
    this.sqlgGraph.tx().commit();

    // This fails, TinkerPop does not specify transaction memory visibility
    Assert.assertEquals("Peter", a1.value("name"));
}
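
The same situation can be modelled in plain Java. This is a toy sketch that assumes snapshot-style element handles; `StaleHandleDemo` and `Handle` are hypothetical names, and nothing here describes how Sqlg or any other provider is actually implemented.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleHandleDemo {
    // Toy "database": committed property maps keyed by element id.
    private final Map<Long, Map<String, String>> store = new HashMap<>();

    // A handle copies the committed properties once, at load time.
    public final class Handle {
        private final long id;
        private final Map<String, String> snapshot;

        Handle(long id) {
            this.id = id;
            this.snapshot = new HashMap<>(store.get(id));
        }

        public String value(String key) {
            return snapshot.get(key);
        }

        // Writes through to the store, standing in for property(...) plus commit().
        public void property(String key, String value) {
            snapshot.put(key, value);
            store.get(id).put(key, value);
        }
    }

    public void add(long id, String key, String value) {
        Map<String, String> props = new HashMap<>();
        props.put(key, value);
        store.put(id, props);
    }

    public Handle load(long id) {
        return new Handle(id);
    }

    public static void main(String[] args) {
        StaleHandleDemo db = new StaleHandleDemo();
        db.add(1L, "name", "John");
        Handle a1 = db.load(1L);
        Handle a1Again = db.load(1L);
        a1Again.property("name", "Peter");
        System.out.println(a1.value("name"));          // prints John: the old handle is stale
        System.out.println(db.load(1L).value("name")); // prints Peter: a re-fetch sees the commit
    }
}
```

Under this model a test that asserts on the old handle only passes if the provider keeps every in-memory handle in sync, which is the assumption being questioned.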

Am I correct in my understanding here, or is the test supposed to pass?

Thanks
Pieter


RE: hasId

2017-05-28 Thread pieter gmail

Hi,

The following code illustrates my concern/confusion.

@Test
public void testHasId() {
    final TinkerGraph graph = TinkerGraph.open();
    Vertex a = graph.addVertex(T.label, "A");
    Vertex b = graph.addVertex(T.label, "B");

    List vertices = graph.traversal().V().hasId(a.id()).hasId(b.id()).toList();

    Assert.assertTrue(vertices.isEmpty());
}

The test fails, as both vertices are returned.
Is this expected? I expected 'and' behavior, not 'or'.

Similar to,

@Test
public void testHasLabel() {
    final TinkerGraph graph = TinkerGraph.open();
    Vertex a = graph.addVertex(T.label, "A");
    Vertex b = graph.addVertex(T.label, "B");

    List vertices = graph.traversal().V().hasLabel("A").hasLabel("B").toList();

    Assert.assertTrue(vertices.isEmpty());
}

This one passes.

I checked the docs,

|hasLabel(labels...)|: Remove the traverser if its element does not have 
any of the labels.
|hasId(ids...)|: Remove the traverser if its element does not have any 
of the ids.


Seems they should behave the same?
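
To spell out the expectation, here is a small plain-Java sketch of the two readings. It is only a model of filter composition; `ChainedFilterDemo` and its `hasId` helper are hypothetical and do not call TinkerPop.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class ChainedFilterDemo {
    // One filter call with several ids keeps an element if it matches ANY of them.
    public static List<Integer> hasId(List<Integer> elements, Integer... ids) {
        Set<Integer> wanted = new HashSet<>(Arrays.asList(ids));
        return elements.stream().filter(wanted::contains).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> all = Arrays.asList(1, 2);
        // Two chained filters: an element must pass both, so nothing survives.
        System.out.println(hasId(hasId(all, 1), 2)); // prints []
        // One filter with two ids: either id is enough, so both survive.
        System.out.println(hasId(all, 1, 2));        // prints [1, 2]
    }
}
```

In this model the 'or' applies only within the arguments of a single call, while chaining two calls gives 'and', which is what the hasLabel test shows and what I expected from hasId.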

I am working on version 3.2.4

Thanks
Pieter



Re: [VOTE] TinkerPop 3.2.4 Release

2017-02-12 Thread pieter-gmail
Ran Sqlg's own test suite and TinkerPop's structure and process test
suites on Sqlg.
No failures.

VOTE: +1

Thanks
Pieter

On 11/02/2017 14:49, Robert Dale wrote:
> website passes linkchecker. my app passes all tests.
> VOTE +1
>
>
> Robert Dale
>
> On Fri, Feb 10, 2017 at 9:14 AM, Stephen Mallette <spmalle...@gmail.com>
> wrote:
>
>> i think it's on schedule still. pieter sounded like he needed a little
>> extra time, but given the nature of the issue he described it didn't seem
>> like that would ultimately turn up something run afoul in the TinkerPop
>> side of things, so my personal inclination would be to close vote when
>> posted and release. on the other hand, the vote closes on a saturday, which
>> means that unless jason intends to do release work on the weekend, there's
>> not much point to closing the vote on saturday and pieter can have some
>> extra time. Anyway, I'll leave that up to Jason to coordinate with Ted as
>> we traditionally release all our lines of code at once. Once they do close
>> vote and release the artifacts, it will take up to 24 hours for apache
>> infrastructure to sync the releases to all the mirrors, so everything is
>> usually available the day after the vote closes.
>>
>> as for voting, every vote counts - though PMC votes are the ones that are
>> technically "binding" for purpose of a release (in apache terms we need a
>> "majority approval"[1] to put the release out the door). Now, if any vote,
>> non-binding or binding came in as -1, we would take that fairly seriously,
>> discuss it and depending on circumstances could scrap/delay the release.
>> Testing releases and voting is an excellent and easy way to contribute to
>> TinkerPop.
>>
>> [1] http://apache.org/foundation/glossary.html#MajorityApproval
>>
>>
>> On Fri, Feb 10, 2017 at 8:47 AM, Paul A. Jackson <paul.jack...@pb.com>
>> wrote:
>>
>>> I'm looking for some insight for when 3.2.4 will release. It looks like
>>> voting closes tomorrow. Is that on schedule, and if so what's a typical
>>> ETA for binaries to be posted?
>>>
>>> I don't know if my vote counts, but FTW!
>>>
>>> VOTE: +1
>>>
>>> Thanks,
>>> -Paul
>>>
>>>
>>> -Original Message-
>>> From: Stephen Mallette [mailto:spmalle...@gmail.com]
>>> Sent: Friday, February 10, 2017 7:03 AM
>>> To: dev@tinkerpop.apache.org
>>> Subject: Re: [VOTE] TinkerPop 3.2.4 Release
>>>
>>> validate-distribution.sh was good for me - thanks for doing all the work
>>> on this one Jason. nice job
>>>
>>> VOTE: +1
>>>
>>> On Fri, Feb 10, 2017 at 1:22 AM, Ted Wilmes <twil...@gmail.com> wrote:
>>>
>>>> Release and docs looks good to me.
>>>>
>>>> Validating source distribution
>>>>
>>>> * downloading Apache TinkerPop 3.2.4 (apache-tinkerpop-3.2.4-src.zip)...
>>>> OK
>>>> * validating signatures and checksums ...
>>>>   * PGP signature ... OK
>>>>   * MD5 checksum ... OK
>>>>   * SHA1 checksum ... OK
>>>> * unzipping Apache TinkerPop 3.2.4 ... OK
>>>> * building project ... OK
>>>>
>>>>
>>>> VOTE: +1
>>>>
>>>> Thanks,
>>>> Ted
>>>>
>>>> On Thu, Feb 9, 2017 at 12:32 PM, pieter-gmail
>>>> <pieter.mar...@gmail.com>
>>>> wrote:
>>>>
>>>>> Ok, however I did test on the 3.2.4-SNAPSHOT immediately after
>>>>> Jason's email on the 2/2/2017 and those changes were not there.
>>>>> They are there now but there was a SNAPSHOT release on the
>>>>> 08/02/2017 so things changed.
>>>>> Anyhow that might just be some SNAPSHOT confusion thing.
>>>>>
>>>>> Next time I'll pull the code and build it manually to make sure.
>>>>>
>>>>> Thanks
>>>>> Pieter
>>>>>
>>>>>
>>>>>
>>>>> On 09/02/2017 22:27, Marko Rodriguez wrote:
>>>>>> Hi,
>>>>>>
>>>>>>> The significant change is
>>>>>>> ffe1b4c Marko A. Rodriguez <okramma...@gmail.com> on 2016/11/15
>>>>>>> at
>>>>> 12:44 AM
>>>>>> Ah yea. Thats different from what I thought you had issue with —
>>>>> has-containers and arrays/collections h

Re: failing optional and order by gremlin

2017-02-11 Thread pieter-gmail
OK got it.

Thanks for the help.

Cheers,
Pieter

On 11/02/2017 16:46, Daniel Kuppitz wrote:
> This looks like a bug in 3.2.3. It's clearly expected to fail, since by()
> modulators should always fail if the traverser can't emit a value. What you
> actually want to do is this:
>
> gremlin> g.traversal().V(a1.id()).optional(
> ..1> outE("ab").as("e").otherV().as("vb").optional(
> ..2>   outE("bc").as("e").otherV().as("vc"))).
> ..3>   order().by(select(first, "e").by("order")).
> ..4>   by(select(last, "e").by("order")).values("name")
> ==>b3
> ==>b2
> ==>c3
> ==>c2
> ==>c1
>
> Not related to your problem, but I thought I should point that out: don't
> use otherV() when you know the direction.
>
> g.traversal().V(a1.id()).optional(
> outE("ab").as("e").inV().as("vb").optional(
>   outE("bc").as("e").inV().as("vc"))).
>   order().by(select(first, "e").by("order")).
>   by(select(last, "e").by("order")).values("name")
>
> Also, some labels are redundant in this particular traversal; get rid of
> them:
>
> g.traversal().V(a1.id()).optional(
> outE("ab").as("e").inV().optional(
>   outE("bc").as("e").inV())).
>   order().by(select(first, "e").by("order")).
>   by(select(last, "e").by("order")).values("name")
>
> Oh, and there's something else: The query would fail if there wouldn't be a
> single "ab" edge. If you want to take this into account, do:
>
> g.traversal().V(a1.id()).optional(
> outE("ab").as("e").inV().optional(
>   outE("bc").as("e").inV())).
>   order().by(coalesce(select(first, "e").by("order"), constant(0))).
>   by(coalesce(select(last, "e").by("order"),
> constant(0))).values("name")
>
> And finally, pointing out the obvious: don't create a new traversal source
> for every query.
>
> That's it.
>
> Cheers,
> Daniel
>
>
> On Sat, Feb 11, 2017 at 3:03 PM, pieter-gmail <pieter.mar...@gmail.com>
> wrote:
>
>> Hi,
>>
>> The following query no longer works on 3.2.4
>>
>> @Test
>> public void testOptionalWithOrderBy() {
>>     final TinkerGraph g = TinkerGraph.open();
>>     Vertex a1 = g.addVertex(T.label, "A", "name", "a1");
>>     Vertex b1 = g.addVertex(T.label, "B", "name", "b1");
>>     Vertex b2 = g.addVertex(T.label, "B", "name", "b2");
>>     Vertex b3 = g.addVertex(T.label, "B", "name", "b3");
>>     Vertex c1 = g.addVertex(T.label, "C", "name", "c1");
>>     Vertex c2 = g.addVertex(T.label, "C", "name", "c2");
>>     Vertex c3 = g.addVertex(T.label, "C", "name", "c3");
>>     a1.addEdge("ab", b1, "order", 3);
>>     a1.addEdge("ab", b2, "order", 2);
>>     a1.addEdge("ab", b3, "order", 1);
>>     b1.addEdge("bc", c1, "order", 3);
>>     b1.addEdge("bc", c2, "order", 2);
>>     b1.addEdge("bc", c3, "order", 1);
>>     GraphTraversal<Vertex, Vertex> traversal = g.traversal().V(a1.id())
>>             .optional(
>>                     __.outE("ab").as("ab").otherV().as("vb")
>>                             .optional(
>>                                     __.outE("bc").as("bc").otherV().as("vc")
>>                             )
>>             )
>>             .order().by(__.select("ab").by("order"), Order.incr)
>>             .by(__.select("bc").by("order"), Order.incr);
>>     while (traversal.hasNext()) {
>>         System.out.println(traversal.next().value("name"));
>>     }
>> }
>>
>> On 3.2.3 it returns
>>
>> b3
>> b2
>> c3
>> c2
>> c1
>>
>> On 3.2.4 it throws the following exception,
>>
>> java.lang.IllegalArgumentException: The provided traverser does not map
>> to a value: v[6]->[SelectOneStep(bc,value(order))]
>>     at org.apache.tinkerpop.gremlin.process.traversal.util.TraversalUtil.apply(TraversalUtil.java:45)
>>     at org.apache.tinkerpop.gremlin.process.traversal.step.map.OrderGlobalStep.createProjectedTraverser(OrderGlobalStep.java:155)
>>     at org.apache.tinkerpop.gremlin.process.traversal.step.map.OrderGlobalStep.processAllStarts(OrderGlobalStep.java:74)
>>     at org.apache.tinkerpop.gremlin.process.traversal.step.util.CollectingBarrierStep.processNextStart(CollectingBarrierStep.java:108)
>>     at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
>>     at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:184)
>>
>> Has the semantics changed or is it a bug?
>>
>> Thanks
>> Pieter
>>
>>
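
Daniel's point in the reply quoted above, that a by() modulator fails when no value can be produced and that a coalesce(..., constant(0)) fallback avoids this, can be modelled in plain Java. This is a rough analogy only: `OrderByDefaultDemo`, `strictKey`, `keyOrDefault` and the map-based rows are hypothetical stand-ins, not Gremlin semantics.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OrderByDefaultDemo {
    // Builds a toy "path" row; a null argument means that edge label is absent.
    public static Map<String, Integer> row(Integer ab, Integer bc) {
        Map<String, Integer> r = new HashMap<>();
        if (ab != null) r.put("ab", ab);
        if (bc != null) r.put("bc", bc);
        return r;
    }

    // Strict lookup: fails when the row has no value for the key.
    public static int strictKey(Map<String, Integer> row, String key) {
        Integer v = row.get(key);
        if (v == null) throw new IllegalArgumentException("no value for key: " + key);
        return v;
    }

    // Lenient lookup: falls back to a constant when the key is missing.
    public static int keyOrDefault(Map<String, Integer> row, String key, int fallback) {
        Integer v = row.get(key);
        return v == null ? fallback : v;
    }

    // Orders by "ab" and then "bc", treating a missing key as 0.
    public static List<Map<String, Integer>> sortLenient(List<Map<String, Integer>> rows) {
        List<Map<String, Integer>> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator
                .comparingInt((Map<String, Integer> r) -> keyOrDefault(r, "ab", 0))
                .thenComparingInt(r -> keyOrDefault(r, "bc", 0)));
        return sorted;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> rows = new ArrayList<>();
        rows.add(row(2, null));
        rows.add(row(1, 3));
        rows.add(row(1, null));
        List<Map<String, Integer>> sorted = sortLenient(rows);
        System.out.println(keyOrDefault(sorted.get(1), "bc", 0)); // prints 3
        try {
            strictKey(rows.get(0), "bc");
        } catch (IllegalArgumentException e) {
            System.out.println("strict lookup failed: " + e.getMessage());
        }
    }
}
```

The strict lookup plays the role of the 3.2.4 behaviour, and the lenient one the role of the coalesce fallback.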



RE: failing optional and order by gremlin

2017-02-11 Thread pieter-gmail
Hi,

The following query no longer works on 3.2.4

@Test
public void testOptionalWithOrderBy() {
    final TinkerGraph g = TinkerGraph.open();
    Vertex a1 = g.addVertex(T.label, "A", "name", "a1");
    Vertex b1 = g.addVertex(T.label, "B", "name", "b1");
    Vertex b2 = g.addVertex(T.label, "B", "name", "b2");
    Vertex b3 = g.addVertex(T.label, "B", "name", "b3");
    Vertex c1 = g.addVertex(T.label, "C", "name", "c1");
    Vertex c2 = g.addVertex(T.label, "C", "name", "c2");
    Vertex c3 = g.addVertex(T.label, "C", "name", "c3");
    a1.addEdge("ab", b1, "order", 3);
    a1.addEdge("ab", b2, "order", 2);
    a1.addEdge("ab", b3, "order", 1);
    b1.addEdge("bc", c1, "order", 3);
    b1.addEdge("bc", c2, "order", 2);
    b1.addEdge("bc", c3, "order", 1);
    GraphTraversal<Vertex, Vertex> traversal = g.traversal().V(a1.id())
            .optional(
                    __.outE("ab").as("ab").otherV().as("vb")
                            .optional(
                                    __.outE("bc").as("bc").otherV().as("vc")
                            )
            )
            .order()
            .by(__.select("ab").by("order"), Order.incr)
            .by(__.select("bc").by("order"), Order.incr);
    while (traversal.hasNext()) {
        // Explicit type witness keeps the println overload unambiguous.
        System.out.println(traversal.next().<String>value("name"));
    }
}

On 3.2.3 it returns

b3
b2
c3
c2
c1

On 3.2.4 it throws the following exception,

java.lang.IllegalArgumentException: The provided traverser does not map
to a value: v[6]->[SelectOneStep(bc,value(order))]
    at org.apache.tinkerpop.gremlin.process.traversal.util.TraversalUtil.apply(TraversalUtil.java:45)
    at org.apache.tinkerpop.gremlin.process.traversal.step.map.OrderGlobalStep.createProjectedTraverser(OrderGlobalStep.java:155)
    at org.apache.tinkerpop.gremlin.process.traversal.step.map.OrderGlobalStep.processAllStarts(OrderGlobalStep.java:74)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.CollectingBarrierStep.processNextStart(CollectingBarrierStep.java:108)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
    at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:184)

Have the semantics changed, or is this a bug?

Thanks
Pieter



Re: [VOTE] TinkerPop 3.2.4 Release

2017-02-09 Thread pieter-gmail
OK, however I did test the 3.2.4-SNAPSHOT immediately after Jason's
email on 2/2/2017 and those changes were not there.
They are there now, but there was a SNAPSHOT release on 08/02/2017, so
things changed.
Anyhow, that might just be some SNAPSHOT confusion.

Next time I'll pull the code and build it manually to make sure.

Thanks
Pieter



On 09/02/2017 22:27, Marko Rodriguez wrote:
> Hi,
>
>> The significant change is
>> ffe1b4c Marko A. Rodriguez <okramma...@gmail.com> on 2016/11/15 at 12:44 AM
> Ah yea. Thats different from what I thought you had issue with — 
> has-containers and arrays/collections handling.
>
>> I don't have a problem with the changes just that is quite a refactor on
>> my side as it changes the structure of the HasSteps and its
>> HasContainers. I had made some assumptions around how HasSteps and
>> HasContainers look when optimizing.
> Yea, it “folds left” now so you don’t have to walk over has()-chains.
>
>> The change is quite old but somehow it was not present in the
>> 3.2.4-SNAPSHOT I tested on it when the code freeze announcement was
>> made. Not sure what happened there but alas its not giving me enough
>> time to get things working again.
> I would say don’t wait till releases to test Sqlg. In principle, you should 
> VOTE on every PR by building and testing the changes against Sqlg. That is 
> where you can make a huge contribution.
>
>> So I am not voting negative just requesting a weekend, if possible, to
>> get through the refactor.
> Again, PR awareness, not release awareness.
>
> Marko.
>
> http://markorodriguez.com
>
>
>> Cheers
>> Pieter
>>
>>
>>
>>
>>
>> "added TraversalHelper.addHasContainer() on 2016/11/15
>>
>> On 09/02/2017 22:11, Stephen Mallette wrote:
>>> Unless I'm missing something, HasStep hasn't changed in 4 months and
>>> HasContainer hasn't changed in 3 months. The only update that went in after
>>> 2/2 that I can think of that would have any bearing for graph providers who
>>> tested before/after that date would be the AutoCloseable stuff:
>>>
>>> https://github.com/apache/tinkerpop/pull/548
>>>
>>> Your problems aren't related to that are they?  Can you provide some
>>> synopsis of where the problems lie?
>>>
>>>
>>>
>>> On Thu, Feb 9, 2017 at 2:47 PM, pieter-gmail <pieter.mar...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Some issues regarding the release process.
>>>>
>>>> On the 2/2/2017 the 3.2.4-SNAPSHOT was released.
>>>> I then started testing and found almost no issues.
>>>>
>>>> However yesterday when the VOTE mail came I found many issues on 3.2.4.
>>>> To understand the confusion I tested again on 3.2.4-SNAPSHOT and found
>>>> the same new issues.
>>>> I then checked the 3.2.4-SNAPSHOT timestamp and it changed to 08/02/2017
>>>>
>>>> Not sure what happened there as I can not, nor is it worth it, check an
>>>> old SNAPSHOT version's binary.
>>>> The relevant code changes are quite old (December 2016) but it may have
>>>> been done on a separate branch and rebasing may loose the merging
>>>> information. Not sure about this though.
>>>>
>>>> This only leaves the 72 hours to catch up.
>>>>
>>>> Even though it is a minor release there has been significant changes to
>>>> HasStep and HasContainer which breaks the heart (mine at least) of
>>>> implementors optimization code.
>>>>
>>>> Basically to get any value from a vote I for one will need more time.
>>>>
>>>> So far I have not found any TinkerPop issues but will need at least the
>>>> weekend to know better.
>>>>
>>>> Thanks
>>>> Pieter
>>>>
>>>> On 08/02/2017 16:51, Jason Plurad wrote:
>>>>> Hello,
>>>>>
>>>>> We are happy to announce that TinkerPop 3.2.4 is ready for release.
>>>>>
>>>>> The release artifacts can be found at this location:
>>>>>https://dist.apache.org/repos/dist/dev/tinkerpop/3.2.4/
>>>>>
>>>>> The source distribution is provided by:
>>>>>apache-tinkerpop-3.2.4-src.zip
>>>>>
>>>>> Two binary distributions are provided for user convenience:
>>>>>apache-tinkerpop-gremlin-console-3.2.4-bin.zip
>>>>>apach

Re: [VOTE] TinkerPop 3.2.4 Release

2017-02-09 Thread pieter-gmail
The significant change is

ffe1b4c Marko A. Rodriguez <okramma...@gmail.com> on 2016/11/15 at 12:44 AM

The description is

"added TraversalHelper.addHasContainer() which will either append a
HasStep with container or if the traversal ends with a
HasContainerHolder, fold the container into the holder. This just makes
the code in GraphTraversal cleaner with less copy/paste."

I don't have a problem with the changes; it is just quite a refactor on
my side, as it changes the structure of the HasSteps and their
HasContainers. I had made some assumptions about how HasSteps and
HasContainers look when optimizing.
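
To make the folding described in the commit message concrete, here is a
minimal plain-Java sketch. It is not TinkerPop code: HasStepModel and this
addHasContainer are hypothetical stand-ins, and the predicates are plain
strings. It only shows why a chain of has() calls now ends up as one step
holding several containers rather than several single-container steps.

```java
import java.util.ArrayList;
import java.util.List;

final class FoldHasContainerSketch {

    // Hypothetical model of a step that can hold several predicates.
    static final class HasStepModel {
        final List<String> containers = new ArrayList<>();
    }

    // Folds the predicate into a trailing has-step if there is one,
    // otherwise appends a new has-step holding just this predicate.
    static void addHasContainer(List<Object> steps, String container) {
        Object last = steps.isEmpty() ? null : steps.get(steps.size() - 1);
        if (last instanceof HasStepModel) {
            ((HasStepModel) last).containers.add(container);
        } else {
            HasStepModel step = new HasStepModel();
            step.containers.add(container);
            steps.add(step);
        }
    }

    public static void main(String[] args) {
        List<Object> steps = new ArrayList<>();
        steps.add("V()");
        addHasContainer(steps, "label = A");
        addHasContainer(steps, "name = a1");
        // One V() step plus a single has-step holding both predicates.
        HasStepModel folded = (HasStepModel) steps.get(1);
        System.out.println(steps.size() + " steps, " + folded.containers.size() + " containers");
    }
}
```

Optimizer code that assumed one container per step would have to read the
whole container list of each step instead.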

The change is quite old, but somehow it was not present in the
3.2.4-SNAPSHOT I tested when the code freeze announcement was
made. Not sure what happened there, but alas it's not giving me enough
time to get things working again.

So I am not voting negative just requesting a weekend, if possible, to
get through the refactor.

Cheers
Pieter





"added TraversalHelper.addHasContainer() on 2016/11/15

On 09/02/2017 22:11, Stephen Mallette wrote:
> Unless I'm missing something, HasStep hasn't changed in 4 months and
> HasContainer hasn't changed in 3 months. The only update that went in after
> 2/2 that I can think of that would have any bearing for graph providers who
> tested before/after that date would be the AutoCloseable stuff:
>
> https://github.com/apache/tinkerpop/pull/548
>
> Your problems aren't related to that are they?  Can you provide some
> synopsis of where the problems lie?
>
>
>
> On Thu, Feb 9, 2017 at 2:47 PM, pieter-gmail <pieter.mar...@gmail.com>
> wrote:
>
>> Hi,
>>
>> Some issues regarding the release process.
>>
>> On the 2/2/2017 the 3.2.4-SNAPSHOT was released.
>> I then started testing and found almost no issues.
>>
>> However yesterday when the VOTE mail came I found many issues on 3.2.4.
>> To understand the confusion I tested again on 3.2.4-SNAPSHOT and found
>> the same new issues.
>> I then checked the 3.2.4-SNAPSHOT timestamp and it changed to 08/02/2017
>>
>> Not sure what happened there as I can not, nor is it worth it, check an
>> old SNAPSHOT version's binary.
>> The relevant code changes are quite old (December 2016) but it may have
>> been done on a separate branch and rebasing may loose the merging
>> information. Not sure about this though.
>>
>> This only leaves the 72 hours to catch up.
>>
>> Even though it is a minor release there has been significant changes to
>> HasStep and HasContainer which breaks the heart (mine at least) of
>> implementors optimization code.
>>
>> Basically to get any value from a vote I for one will need more time.
>>
>> So far I have not found any TinkerPop issues but will need at least the
>> weekend to know better.
>>
>> Thanks
>> Pieter
>>
>> On 08/02/2017 16:51, Jason Plurad wrote:
>>> Hello,
>>>
>>> We are happy to announce that TinkerPop 3.2.4 is ready for release.
>>>
>>> The release artifacts can be found at this location:
>>> https://dist.apache.org/repos/dist/dev/tinkerpop/3.2.4/
>>>
>>> The source distribution is provided by:
>>> apache-tinkerpop-3.2.4-src.zip
>>>
>>> Two binary distributions are provided for user convenience:
>>> apache-tinkerpop-gremlin-console-3.2.4-bin.zip
>>> apache-tinkerpop-gremlin-server-3.2.4-bin.zip
>>>
>>> The GPG key used to sign the release artifacts is available at:
>>> https://dist.apache.org/repos/dist/dev/tinkerpop/KEYS
>>>
>>> The online docs can be found here:
>>> http://tinkerpop.apache.org/docs/3.2.4/ (user docs)
>>> http://tinkerpop.apache.org/docs/3.2.4/upgrade/ (upgrade docs)
>>> http://tinkerpop.apache.org/javadocs/3.2.4/core/ (core javadoc)
>>> http://tinkerpop.apache.org/javadocs/3.2.4/full/ (full javadoc)
>>>
>>> The tag in Apache Git can be found here:
>>>
>>> https://git-wip-us.apache.org/repos/asf?p=tinkerpop.git;a=
>> tag;h=refs/tags/3.2.4
>>> The release notes are available here:
>>>
>>> https://github.com/apache/tinkerpop/blob/3.2.4/
>> CHANGELOG.asciidoc#release-3-2-4
>>> The [VOTE] will be open for the next 72 hours --- closing Saturday
>>> (February 11, 2017) at 10:00 AM EST.
>>>
>>> My vote is +1.
>>>
>>> Thank you very much,
>>> Jason Plurad
>>>
>>



RE: static code in TinkerPop

2017-02-09 Thread pieter-gmail
Hi,

I have recently found a bug in Sqlg related to the static nature of
strategies.

In particular it is happening for me on the PathRetractionStep, but it is
probably a general issue.

Sqlg supports running many graphs in one JVM.

Because the strategy state is static, it gets mixed up between graphs
when their strategies overlap.

Is it possible to make GlobalCache not static?
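
As a plain-Java illustration of the general hazard (not TinkerPop's or
Sqlg's actual code; SharedCache and PerGraphCache are hypothetical names),
compare state held in a static map with state held per graph instance:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins; none of these classes exist in TinkerPop or Sqlg.
final class StaticCacheSketch {

    // Shared by every graph in the JVM: a later registration replaces an earlier one.
    static final class SharedCache {
        private static final Map<String, String> STATE = new HashMap<>();

        static void put(String key, String value) {
            STATE.put(key, value);
        }

        static String get(String key) {
            return STATE.get(key);
        }
    }

    // One cache per graph instance: state cannot leak between graphs.
    static final class PerGraphCache {
        private final Map<String, String> state = new HashMap<>();

        void put(String key, String value) {
            state.put(key, value);
        }

        String get(String key) {
            return state.get(key);
        }
    }

    public static void main(String[] args) {
        // Two "graphs" register strategy state under the same key.
        SharedCache.put("strategies", "graphA");
        SharedCache.put("strategies", "graphB");
        System.out.println(SharedCache.get("strategies")); // graphA's entry was replaced

        PerGraphCache a = new PerGraphCache();
        PerGraphCache b = new PerGraphCache();
        a.put("strategies", "graphA");
        b.put("strategies", "graphB");
        System.out.println(a.get("strategies") + " " + b.get("strategies"));
    }
}
```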

Thanks
Pieter


Re: [VOTE] TinkerPop 3.2.4 Release

2017-02-09 Thread pieter-gmail
Hi,

Some issues regarding the release process.

On 2/2/2017 the 3.2.4-SNAPSHOT was released.
I then started testing and found almost no issues.

However, yesterday when the VOTE mail came I found many issues on 3.2.4.
To understand the confusion I tested again on 3.2.4-SNAPSHOT and found
the same new issues.
I then checked the 3.2.4-SNAPSHOT timestamp and it had changed to 08/02/2017.

Not sure what happened there, as I cannot check an old SNAPSHOT version's
binary, nor is it worth it.
The relevant code changes are quite old (December 2016) but they may have
been done on a separate branch, and rebasing may lose the merging
information. Not sure about this though.

This only leaves the 72 hours to catch up.

Even though it is a minor release there have been significant changes to
HasStep and HasContainer which break the heart (mine at least) of
implementors' optimization code.

Basically to get any value from a vote I for one will need more time.

So far I have not found any TinkerPop issues but will need at least the
weekend to know better.

Thanks
Pieter

On 08/02/2017 16:51, Jason Plurad wrote:
> Hello,
>
> We are happy to announce that TinkerPop 3.2.4 is ready for release.
>
> The release artifacts can be found at this location:
> https://dist.apache.org/repos/dist/dev/tinkerpop/3.2.4/
>
> The source distribution is provided by:
> apache-tinkerpop-3.2.4-src.zip
>
> Two binary distributions are provided for user convenience:
> apache-tinkerpop-gremlin-console-3.2.4-bin.zip
> apache-tinkerpop-gremlin-server-3.2.4-bin.zip
>
> The GPG key used to sign the release artifacts is available at:
> https://dist.apache.org/repos/dist/dev/tinkerpop/KEYS
>
> The online docs can be found here:
> http://tinkerpop.apache.org/docs/3.2.4/ (user docs)
> http://tinkerpop.apache.org/docs/3.2.4/upgrade/ (upgrade docs)
> http://tinkerpop.apache.org/javadocs/3.2.4/core/ (core javadoc)
> http://tinkerpop.apache.org/javadocs/3.2.4/full/ (full javadoc)
>
> The tag in Apache Git can be found here:
>
> https://git-wip-us.apache.org/repos/asf?p=tinkerpop.git;a=tag;h=refs/tags/3.2.4
>
> The release notes are available here:
>
> https://github.com/apache/tinkerpop/blob/3.2.4/CHANGELOG.asciidoc#release-3-2-4
>
> The [VOTE] will be open for the next 72 hours --- closing Saturday
> (February 11, 2017) at 10:00 AM EST.
>
> My vote is +1.
>
> Thank you very much,
> Jason Plurad
>



Re: lazy iteration semantics

2017-01-26 Thread pieter-gmail
Ping!

Any ideas on this?

Thanks
Pieter

On 23/01/2017 19:00, pieter-gmail wrote:
> Hi,
>
> Ran some more tests, seems all is not well.
>
> So the previous example always works (on TinkerPop and Neo4j) because
> bothE is not seen as two queries but rather one and all is well.
>
> Running the same test with a union instead has the same issue as Sqlg.
>
> @Test
> public void testLazy() {
>     final TinkerGraph graph = TinkerGraph.open();
>     final Vertex a1 = graph.addVertex(T.label, "A");
>     final Vertex b1 = graph.addVertex(T.label, "B");
>     final Vertex c1 = graph.addVertex(T.label, "C");
>     a1.addEdge("ab", b1);
>     a1.addEdge("ac", c1);
>
>     AtomicInteger count = new AtomicInteger(0);
>     graph.traversal().V(a1).union(outE(), inE()).forEachRemaining(edge -> {
>         a1.addEdge("ab", b1); // #1
>         c1.addEdge("ac", a1); // #2
>         count.getAndIncrement();
>     });
>     Assert.assertEquals(2, count.get());
> }
>
> If only #1 is executed the test passes, as the outE() is traversed first
> and the subsequent inE() is unchanged.
> However, if #2 is executed, by the time inE() executes it sees 2 new in
> edges and the count == 4.
>
> I tested on Neo4j and it has the same behavior.
>
> As far as I can tell all is actually behaving as expected with lazy
> iteration assumed.
>
> The unspecified semantics are with bothE(): both sides done immediately,
> as with Neo4j and TinkerPop, or out and in done lazily, as with Sqlg.
> If lazily, it matters which side goes first, as it affects the semantics.
>
> Another caveat is that if barrier steps are injected, the semantics
> are not the same as without the barrier step.
>
> Cheers
> Pieter
>
> On 23/01/2017 14:38, Stephen Mallette wrote:
>> I'd say TinkerGraph demonstrates the expected behavior - I guess we don't
>> have a test that enforces that?
>>
>> On Sat, Jan 21, 2017 at 3:23 PM, pieter-gmail <pieter.mar...@gmail.com>
>> wrote:
>>
>>> Ok, thanks.
>>>
>>> Its tricky as it messes with the laziness and visibility of the queries.
>>> Can really think of a solution though.
>>>
>>> Cheers
>>> Pieter
>>>
>>> On 21/01/2017 21:14, Daniel Kuppitz wrote:
>>>> Hi Pieter,
>>>>
>>>> forEachRemaining iterates over the result and hence I would expect the
>>>> result to be 2. Otherwise you would / should end up with an endless loop.
>>>> However, the same behavior is seen when you replace forEachRemaining
>>> with a
>>>> sideEffect lambda step. Not sure what the expected behavior is / should
>>> be,
>>>> but I think I prefer the way it's done now.
>>>>
>>>> Cheers,
>>>> Daniel
>>>>
>>>>
>>>> On Sat, Jan 21, 2017 at 7:30 PM, pieter-gmail <pieter.mar...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I need to clarify the expected semantics of gremlin's lazy iteration
>>>>> semantics.
>>>>> The following gremlin is what highlighted it.
>>>>>
>>>>> ```
>>>>> @Test
>>>>> public void testLazy() {
>>>>> final TinkerGraph graph = TinkerGraph.open();
>>>>> final Vertex a1 = graph.addVertex(T.label, "A");
>>>>> final Vertex b1 = graph.addVertex(T.label, "B");
>>>>> final Vertex c1 = graph.addVertex(T.label, "C");
>>>>> a1.addEdge("ab", b1);
>>>>> a1.addEdge("ac", c1);
>>>>>
>>>>> AtomicInteger count = new AtomicInteger(0);
>>>>> graph.traversal().V(a1).bothE().forEachRemaining(edge -> {
>>>>> a1.addEdge("ab", b1);
>>>>> //a1.addEdge("ac", c1);
>>>>> count.getAndIncrement();
>>>>> });
>>>>> Assert.assertEquals(2, count.get());
>>>>> }
>>>>>
>>>>> ```
>>>>>
>>>>> On TinkerGraph the count is always 2.
>>>>>
>>>>> On Sqlg's Postgresql dialect the `bothE` first traverses the 'ac' edges
>>>>> adds in a 'ab' edge and then traverses the 'ab' edges and ends up with a
>>>>> count of 3.
>>>>>
>>>>> On Sqlg's HSQLDB and H2 dialects the `bothE` first traverses the 'ab'
>>>>> edges adds in a 'ab' edge then traverses the 'ac' edges and ends up with
>>>>> a count of 2.
>>>>>
>>>>> So for Sqlg the added edge will be seen by subsequent traversals due to
>>>>> the lazy nature but the order affects the result.
>>>>>
>>>>> TinkerGraph seems to get both 'ab' and 'ac' edges upfront and does not
>>>>> subsequently see the added edge. A bit like it has a hidden barrier step
>>>>> before the `forEachRemaining`.
>>>>>
>>>>> What is the expected semantics in this situation?
>>>>> Is a traversal that traverses an element that has been modified earlier
>>>>> in the same traversal suppose to see the change?
>>>>>
>>>>> Thanks
>>>>> Pieter
>>>>>
>>>>>



Re: lazy iteration semantics

2017-01-24 Thread pieter-gmail
Does it make sense?
Any ideas?

Cheers
Pieter

On 23/01/2017 19:00, pieter-gmail wrote:
> Hi,
>
> Ran some more tests, seems all is not well.
>
> So the previous example always works (on TinkerPop and Neo4j) because
> bothE is not seen as two queries but rather one and all is well.
>
> Running the same test with a union instead has the same issue as Sqlg.
>
> @Test
> public void testLazy() {
>     final TinkerGraph graph = TinkerGraph.open();
>     final Vertex a1 = graph.addVertex(T.label, "A");
>     final Vertex b1 = graph.addVertex(T.label, "B");
>     final Vertex c1 = graph.addVertex(T.label, "C");
>     a1.addEdge("ab", b1);
>     a1.addEdge("ac", c1);
>
>     AtomicInteger count = new AtomicInteger(0);
>     graph.traversal().V(a1).union(outE(), inE()).forEachRemaining(edge -> {
>         a1.addEdge("ab", b1); // #1
>         c1.addEdge("ac", a1); // #2
>         count.getAndIncrement();
>     });
>     Assert.assertEquals(2, count.get());
> }
>
> If only #1 is executed the test passes, as the outE() is traversed first
> and the subsequent inE() is unchanged.
> However, if #2 is executed, by the time inE() executes it sees 2 new in
> edges and the count == 4.
>
> I tested on Neo4j and it has the same behavior.
>
> As far as I can tell all is actually behaving as expected with lazy
> iteration assumed.
>
> The unspecified semantics are with bothE(): both sides done immediately,
> as with Neo4j and TinkerPop, or out and in done lazily, as with Sqlg.
> If lazily, it matters which side goes first, as it affects the semantics.
>
> Another caveat is that if barrier steps are injected, the semantics
> are not the same as without the barrier step.
>
> Cheers
> Pieter
>
> On 23/01/2017 14:38, Stephen Mallette wrote:
>> I'd say TinkerGraph demonstrates the expected behavior - I guess we don't
>> have a test that enforces that?
>>
>> On Sat, Jan 21, 2017 at 3:23 PM, pieter-gmail <pieter.mar...@gmail.com>
>> wrote:
>>
>>> Ok, thanks.
>>>
>>> Its tricky as it messes with the laziness and visibility of the queries.
>>> Can really think of a solution though.
>>>
>>> Cheers
>>> Pieter
>>>
>>> On 21/01/2017 21:14, Daniel Kuppitz wrote:
>>>> Hi Pieter,
>>>>
>>>> forEachRemaining iterates over the result and hence I would expect the
>>>> result to be 2. Otherwise you would / should end up with an endless loop.
>>>> However, the same behavior is seen when you replace forEachRemaining
>>> with a
>>>> sideEffect lambda step. Not sure what the expected behavior is / should
>>> be,
>>>> but I think I prefer the way it's done now.
>>>>
>>>> Cheers,
>>>> Daniel
>>>>
>>>>
>>>> On Sat, Jan 21, 2017 at 7:30 PM, pieter-gmail <pieter.mar...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I need to clarify the expected semantics of gremlin's lazy iteration
>>>>> semantics.
>>>>> The following gremlin is what highlighted it.
>>>>>
>>>>> ```
>>>>> @Test
>>>>> public void testLazy() {
>>>>> final TinkerGraph graph = TinkerGraph.open();
>>>>> final Vertex a1 = graph.addVertex(T.label, "A");
>>>>> final Vertex b1 = graph.addVertex(T.label, "B");
>>>>> final Vertex c1 = graph.addVertex(T.label, "C");
>>>>> a1.addEdge("ab", b1);
>>>>> a1.addEdge("ac", c1);
>>>>>
>>>>> AtomicInteger count = new AtomicInteger(0);
>>>>> graph.traversal().V(a1).bothE().forEachRemaining(edge -> {
>>>>> a1.addEdge("ab", b1);
>>>>> //a1.addEdge("ac", c1);
>>>>> count.getAndIncrement();
>>>>> });
>>>>> Assert.assertEquals(2, count.get());
>>>>> }
>>>>>
>>>>> ```
>>>>>
>>>>> On TinkerGraph the count is always 2.
>>>>>
>>>>> On Sqlg's Postgresql dialect the `bothE` first traverses the 'ac' edges
>>>>> adds in a 'ab' edge and then traverses the 'ab' edges and ends up with a
>>>>> count of 3.
>>>>>
>>>>> On Sqlg's HSQLDB and H2 dialects the `bothE` first traverses the 'ab'
>>>>> edges adds in a 'ab' edge then traverses the 'ac' edges and ends up with
>>>>> a count of 2.
>>>>>
>>>>> So for Sqlg the added edge will be seen by subsequent traversals due to
>>>>> the lazy nature but the order affects the result.
>>>>>
>>>>> TinkerGraph seems to get both 'ab' and 'ac' edges upfront and does not
>>>>> subsequently see the added edge. A bit like it has a hidden barrier step
>>>>> before the `forEachRemaining`.
>>>>>
>>>>> What is the expected semantics in this situation?
>>>>> Is a traversal that traverses an element that has been modified earlier
>>>>> in the same traversal suppose to see the change?
>>>>>
>>>>> Thanks
>>>>> Pieter
>>>>>
>>>>>



Re: lazy iteration semantics

2017-01-21 Thread pieter-gmail
Ok, thanks.

It's tricky as it messes with the laziness and visibility of the queries.
Can't really think of a solution though.

Cheers
Pieter

On 21/01/2017 21:14, Daniel Kuppitz wrote:
> Hi Pieter,
>
> forEachRemaining iterates over the result and hence I would expect the
> result to be 2. Otherwise you would / should end up with an endless loop.
> However, the same behavior is seen when you replace forEachRemaining with a
> sideEffect lambda step. Not sure what the expected behavior is / should be,
> but I think I prefer the way it's done now.
>
> Cheers,
> Daniel
>
>
> On Sat, Jan 21, 2017 at 7:30 PM, pieter-gmail <pieter.mar...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I need to clarify the expected semantics of gremlin's lazy iteration
>> semantics.
>> The following gremlin is what highlighted it.
>>
>> ```
>> @Test
>> public void testLazy() {
>> final TinkerGraph graph = TinkerGraph.open();
>> final Vertex a1 = graph.addVertex(T.label, "A");
>> final Vertex b1 = graph.addVertex(T.label, "B");
>> final Vertex c1 = graph.addVertex(T.label, "C");
>> a1.addEdge("ab", b1);
>> a1.addEdge("ac", c1);
>>
>> AtomicInteger count = new AtomicInteger(0);
>> graph.traversal().V(a1).bothE().forEachRemaining(edge -> {
>> a1.addEdge("ab", b1);
>> //a1.addEdge("ac", c1);
>> count.getAndIncrement();
>> });
>> Assert.assertEquals(2, count.get());
>> }
>>
>> ```
>>
>> On TinkerGraph the count is always 2.
>>
>> On Sqlg's Postgresql dialect the `bothE` first traverses the 'ac' edges
>> adds in a 'ab' edge and then traverses the 'ab' edges and ends up with a
>> count of 3.
>>
>> On Sqlg's HSQLDB and H2 dialects the `bothE` first traverses the 'ab'
>> edges adds in a 'ab' edge then traverses the 'ac' edges and ends up with
>> a count of 2.
>>
>> So for Sqlg the added edge will be seen by subsequent traversals due to
>> the lazy nature but the order affects the result.
>>
>> TinkerGraph seems to get both 'ab' and 'ac' edges upfront and does not
>> subsequently see the added edge. A bit like it has a hidden barrier step
>> before the `forEachRemaining`.
>>
>> What is the expected semantics in this situation?
>> Is a traversal that traverses an element that has been modified earlier
>> in the same traversal suppose to see the change?
>>
>> Thanks
>> Pieter
>>
>>



RE: lazy iteration semantics

2017-01-21 Thread pieter-gmail
Hi,

I need to clarify the expected semantics of Gremlin's lazy iteration.
The following Gremlin is what highlighted it.

```
@Test
public void testLazy() {
    final TinkerGraph graph = TinkerGraph.open();
    final Vertex a1 = graph.addVertex(T.label, "A");
    final Vertex b1 = graph.addVertex(T.label, "B");
    final Vertex c1 = graph.addVertex(T.label, "C");
    a1.addEdge("ab", b1);
    a1.addEdge("ac", c1);

    AtomicInteger count = new AtomicInteger(0);
    graph.traversal().V(a1).bothE().forEachRemaining(edge -> {
        a1.addEdge("ab", b1);
        //a1.addEdge("ac", c1);
        count.getAndIncrement();
    });
    Assert.assertEquals(2, count.get());
}

```

On TinkerGraph the count is always 2.

On Sqlg's Postgresql dialect the `bothE` first traverses the 'ac' edges,
adds an 'ab' edge, then traverses the 'ab' edges and ends up with a
count of 3.

On Sqlg's HSQLDB and H2 dialects the `bothE` first traverses the 'ab'
edges, adds an 'ab' edge, then traverses the 'ac' edges and ends up with
a count of 2.

So for Sqlg the added edge will be seen by subsequent traversals due to
the lazy nature but the order affects the result.

TinkerGraph seems to get both 'ab' and 'ac' edges upfront and does not
subsequently see the added edge. A bit like it has a hidden barrier step
before the `forEachRemaining`.
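
The three counts above can be reproduced with a plain-Java analogy. This is
not TinkerPop or Sqlg code: edges are modelled as a list of labels, and the
sketch assumes each per-label "query" materializes its matches at the moment
it starts, roughly as a SQL result set would.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LazyIterationSketch {

    // Runs one "query" per label, in the given order. Each query collects its
    // matches when it starts (an assumption); every visited edge adds one "ab" edge.
    static int countLazyPerLabel(List<String> edges, List<String> labelOrder) {
        int count = 0;
        for (String label : labelOrder) {
            List<String> result = new ArrayList<>();
            for (String edge : edges) {
                if (edge.equals(label)) {
                    result.add(edge);
                }
            }
            for (String ignored : result) {
                edges.add("ab"); // the side effect from the forEachRemaining lambda
                count++;
            }
        }
        return count;
    }

    // Collects every edge up front, like a barrier, before any side effect runs.
    static int countWithSnapshot(List<String> edges) {
        int count = 0;
        for (String ignored : new ArrayList<>(edges)) {
            edges.add("ab");
            count++;
        }
        return count;
    }

    // The starting state: one 'ab' edge and one 'ac' edge on a1.
    static List<String> startEdges() {
        return new ArrayList<>(Arrays.asList("ab", "ac"));
    }

    public static void main(String[] args) {
        System.out.println(countWithSnapshot(startEdges()));                            // 2
        System.out.println(countLazyPerLabel(startEdges(), Arrays.asList("ac", "ab"))); // 3
        System.out.println(countLazyPerLabel(startEdges(), Arrays.asList("ab", "ac"))); // 2
    }
}
```

The snapshot variant corresponds to the TinkerGraph result, and the two
label orders correspond to the Postgresql and HSQLDB/H2 results.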

What are the expected semantics in this situation?
Is a traversal that traverses an element that has been modified earlier
in the same traversal supposed to see the change?

Thanks
Pieter



Re: [TinkerPop] gremlin-x

2016-12-06 Thread pieter-gmail
Hi,

Now that we have established our disagreement, let's continue...

"You prefer your model to be that of the underlying graph (following
that logic, you would use Hibernate to map to Table objects?) and I
prefer using application domain models."

When working with TinkerPop I certainly prefer that its model is that of
the underlying meta model, i.e. having the property graph model as a
first-class construct. For modeling
Cat/Dog/Person/Organization/Molecule... I also, like you, use application
models as an abstraction on top of TinkerPop. Hibernate on top of
TinkerPop as opposed to JDBC, if you will.

"You prefer your query to return the underlying graph model and I prefer
it to return any data."

True, however I am totally fine with g.V().this().that().values(...);
for the any data query.

"You prefer your query to always return all properties and I prefer it
to always return only selected properties."

This is partially true, and indeed it is the default that I, and I'd say
the TinkerPop test suites, have assumed.
There are countless gremlins that return GraphTraversal<Vertex, Vertex>
and happily iterate the traversal accessing its properties, assuming
that they are right there and not another db hit. It has bitten me
though, with some fat vertices making it not such a sensible default
anymore. And when loading vertices just to create edges, loading
the properties is somewhat silly.

"You prefer your objects to be proxies to the underlying datastore (I
think this blurs the lines between being a graph provider and gremlin
client) and I prefer my objects to be detached with load/store being
explicit."

True. My programming model assumes this to be the case, and again so has
the TinkerPop test suite since day one. I'd argue it makes for eloquent
code. No need to worry about attaching and detaching and reattaching
with transaction semantics getting confused. You read and write within
the same transaction boundaries, with the Vertex/Edge being bound to the
transaction. Very similar to your previous mail where you suggested
binding the interaction to a connection, except that the object we bind
to is the transaction. Pretty much a one-to-one between transaction
and connection for most ACID databases.

"In the end, it sounds like you want gremlin to be an object-graph
mapper in the graph model and I prefer a layered approach where gremlin
is a simple query language of which an object-graph mapper, in any
domain model, could be built on top (like so many other query languages)."

Not so true. Object Graph Mappers will use TinkerPop as the graph layer
to model their domain on top of. TinkerPop's responsibility is to model
its own model: the property graph model. One can think of
Vertex/Edge/Property as a baby hardcoded meta ORM, but maybe it's not a
useful analogy.

All that said, I think we are not all that far apart, as the way I read it
our primary disagreement is the properties.

Let's take GraphTraversal<Vertex, Vertex>.
If it's a ReferenceVertex then every property access will be another
round trip. This seems unacceptable to me.
If it returns a Vertex with all its properties pre-loaded then it's also
unacceptable. (Even if it's the current default that I assume.)

So perhaps what Marko suggested

g.withDetachment(…)
  - Detach.reference // just ids
  - Detach.reduced // ids, labels, keys, values (basically, what you
need for toString())
  - Detach.rich // property data included
  - Detach.full // edge data included (basically, StarGraph)

will make us both happy?
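
A minimal sketch of how such levels could be modelled, purely to make the
trade-off concrete. Nothing here is an existing TinkerPop API: the Detach
and Part enums and needsRoundTrip are hypothetical, and mapping "reduced"
to id plus label for a vertex is my reading of the proposal.

```java
import java.util.EnumSet;
import java.util.Set;

final class DetachSketch {

    // The pieces of data a detached element could carry (hypothetical).
    enum Part { ID, LABEL, PROPERTIES, EDGES }

    // Levels named after the proposal in this thread; not an existing TinkerPop API.
    enum Detach {
        REFERENCE(EnumSet.of(Part.ID)),
        REDUCED(EnumSet.of(Part.ID, Part.LABEL)),
        RICH(EnumSet.of(Part.ID, Part.LABEL, Part.PROPERTIES)),
        FULL(EnumSet.allOf(Part.class));

        private final Set<Part> parts;

        Detach(Set<Part> parts) {
            this.parts = parts;
        }

        // True if an element detached at this level already carries the part.
        boolean includes(Part part) {
            return parts.contains(part);
        }
    }

    // A part that was not shipped with the element costs another round trip.
    static boolean needsRoundTrip(Detach level, Part part) {
        return !level.includes(part);
    }

    public static void main(String[] args) {
        for (Detach level : Detach.values()) {
            System.out.println(level + " needs a round trip for properties: "
                    + needsRoundTrip(level, Part.PROPERTIES));
        }
    }
}
```

With something like this the caller picks the level per traversal: reference
when only creating edges, rich when the properties will be read.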

Cheers
Pieter



On 05/12/2016 17:06, Robert Dale wrote:
> Clearly we have different use cases.
>
> You prefer your model to be that of the underlying graph (following
> that logic, you would use Hibernate to map to Table objects?) and I
> prefer using application domain models.
>
> You prefer your query to return the underlying graph model and I
> prefer it to return any data.
>
> You prefer your query to always return all properties and I prefer it
> to always return only selected properties.
>
> You prefer your objects to be proxies to the underlying datastore (I
> think this blurs the lines between being a graph provider and gremlin
> client) and I prefer my objects to be detached with load/store being
> explicit.
>
> In the end, it sounds like you want gremlin to be an object-graph
> mapper in the graph model and I prefer a layered approach where
> gremlin is a simple query language of which an object-graph mapper, in
> any domain model, could be built on top (like so many other query
> languages).
>
> So I guess we'll just have to agree to disagree.
>
>
> Robert Dale
>
> On Fri, Dec 2, 2016 at 10:30 AM, pieter-gmail <pieter.mar...@gmail.com
> <mailto:pieter.mar...@gmail.com>> wrote:
>
> Hi,
>
> Let me disagree with your disagreement ;-)

Re: [TinkerPop] gremlin-x

2016-12-03 Thread pieter-gmail
Hi,

Had some more thoughts regarding this.

If TinkerPop recipes were to become steps somehow, then providers could
optimize them individually. Currently some recipes (complex gremlins in
general) are unoptimizable. However, if the recipe were a step
then the individual recipe/step could be optimized using a different
strategy to that of the default recipe's gremlin.

Providers would then also be able to add their own recipes, and if one is
popular enough it could make it into TinkerPop's library of recipes via
standard gremlin.

Cheers
Pieter

On 02/12/2016 14:03, Stephen Mallette wrote:
> > Perhaps this is also the time to think about custom steps and some
> way for providers to inject custom steps into the traversal.
>
> I'm not completely clear on how custom steps, DSLs, etc. should be
> properly implemented. I think it would be important to have an
> approach for that under this model and get it documented.
>
> On Fri, Dec 2, 2016 at 6:49 AM, pieter-gmail <pieter.mar...@gmail.com
> <mailto:pieter.mar...@gmail.com>> wrote:
>
> Hi,
>
> Perhaps this is also the time to think about custom steps and some way
> for providers to inject custom steps into the traversal.
> Currently the steps are all defined on `GraphTraversal`. Maybe
> providers can extend the interface with their functionality and if
> `graph.traversal()` returns the extended interface the custom
> steps will be available.
>
> e.g.
>
> graph.traversal().streamVertex(...)
>
> Or if the traversal becomes session aware then perhaps,
>
> GraphTraversal gt = graph.traversal();
> gt.bulkModeOn();
> for 1 .. 1 000 000
>    gt.addV().next();
>
> gt.V().specialRecipe().next();
> gt.commit();
>
> Just some thoughts
>
> Cheers
> Pieter
>
> On 01/12/2016 22:57, Marko Rodriguez wrote:
> > Ah. Yes. I concur Pieter.
> >
> > Then I think we need to get smart about “what data?” We sort of
> > already do with HaltedTraverserStrategy but this is really
> > specific to internal computing and GremlinServer. We could go
> > deeper into this path with:
> >
> > g.withDetachment(…)
> >   - Detach.reference // just ids
> >   - Detach.reduced // ids, labels, keys, values (basically, what
> > you need for toString())
> >   - Detach.rich // property data included
> >   - Detach.full // edge data included (basically, StarGraph)
> >
> >
> > Next, GraphSON would have to specify a “subtype” to the various
> > g:Element types.
> >
> > { @type:"g:Vertex", @detach:"reference", @value: { id:1 } }
> >
> >
> > Then, we could add a method to Element.
> >
> > Element.detachment() -> Detach enum
> >
> >
> > This way, users can always know what is available to them.
> > Finally for cleanliness:
> >
> > Element.detach(Detach) -> Element
> > Element.attach(Graph) -> Element (always a Detach.full element)
> >
> >
> > Just spit-ballin’,
> > Marko.
> >
> > http://markorodriguez.com
> >
> >
> >
> >> On Dec 1, 2016, at 12:28 PM, pieter-gmail <pieter.mar...@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> "So with ReferenceElements, latency will be less too because it
> >> takes less time to construct the ReferenceVertex than it does to
> >> construct a DetachedVertex. Imagine a vertex with 100 properties
> >> and meta properties. ?!"
> >>
> >> But ReferenceElement does not have the properties so more round
> >> trips are needed, increasing latency. One of the first things to
> >> make Sqlg at all usable was to make sure that a Vertex contains all
> >> of its properties. Else at least one more call is needed per Vertex.
> >> It's a latency killer. For those few cases where the Vertex is so
> >> fat that it is slow to load and only a few properties are needed,
> >> g.V().hasLabel("label").values("property1", "property2") is used.
> >> So to my mind ReferenceElement increases latency and does not
> >> decrease it.
> >>

Re: [TinkerPop] gremlin-x

2016-12-02 Thread pieter-gmail
sically arguing that the default behavior is the sql
> equivalent of SELECT *.  This is not a good practice. Then you go on
> to say that if the dev is aware that this is a 'fat' element, they
> should ask for exact properties.  I think what we're arguing is that
> the default behavior should be 'always ask for exact properties'. This
> is the most accepted practice in querying any database, sql, nosql,
> mongodb, cassandra, etc.
>
> That leads us to your Hibernate comment.  In the abstract sense,
> Vertex is just a map wrapper. I think you're just splitting hairs
> trying to distinguish a Dog Vertex and a Dog Map. In either case, you
> would have to query the label.  In any case, I don't know anyone who
> wants to deal with Vertex/Edges.  What most devs deal with, in my
> experience, is a domain-specific model.  So whether I get back a
> Vertex or a Map, either way, I'm going to translate that to my domain
> model.  Also, in hibernate, when I get a property I didn't query for,
> I will get a null.  If I set a property, it does not automatically
> persist it to the database. In your model, there is no difference
> between transient, in-memory state (e.g. workflow) and database state.
> BTW, this would also be lots of round trips to the database in your
> case. Finally, believe it or not, Hibernate attempts to do smart
> querying where it will actually retrieve only the IDs, then look for
> them in its second-level cache, if not found, go back to the database
> to get them.  This is a very common pattern across sql/nosql datastores.
>
> So it's not just about becoming more like jdbc but more about a
> low-level paradigm. To that I agree with you on one thing, the first
> thing you should do is create a 'baby hibernate' because I don't think
> gremlin should be an ORM (OGM?).
>
>
>
> Robert Dale
>
> On Thu, Dec 1, 2016 at 2:28 PM, pieter-gmail <pieter.mar...@gmail.com
> <mailto:pieter.mar...@gmail.com>> wrote:
>
> Hi,
>
> "So with ReferenceElements, latency will be less too because it takes
> less time to construct the ReferenceVertex than it does to construct a
> DetachedVertex. Imagine a vertex with 100 properties and meta
> properties. ?!"
>
> But ReferenceElement does not have the properties so more round trips
> are needed, increasing latency. One of the first things to make Sqlg at
> all usable was to make sure that a Vertex contains all of its
> properties. Else at least one more call is needed per Vertex. It's a
> latency killer. For those few cases where the Vertex is so fat
> that it is slow to load and only a few properties are needed,
> g.V().hasLabel("label").values("property1", "property2") is used.
> So to my mind ReferenceElement increases latency and does not
> decrease it.
>
> Using ReferenceElement, which is hardly an Element at all (after all,
> it throws exceptions on almost all of its own interface), the user has
> to get the properties manually and is then back in a world of Maps and
> Lists of Maps.
>
> A refactor of much existing code will need to toss the Vertex notion
> altogether and replace it with Maps and Lists of Maps. Almost like
> writing an application in pure JDBC code with thousands of lines
> iterating through ResultSets, mapping things back and forth. Unless I
> am missing something, this change seems huge.
>
> I get that all this is important for non-Java devs, but it would be a
> pity if their problems became Java devs' problems.
>
> Cheers
> Pieter
>
>
> On 01/12/2016 20:38, Marko Rodriguez wrote:
> > Hi,
> >
> > *PIETER REPLIES:*
> >
> >> One of the first reasons I came to graphs, Neo4j and then
> >> TinkerPop way back, was precisely because of the direct access to
> >> Node/Vertex. The user treats it like any other object, not a remote
> >> connection. It is the embedded nature that makes life so easy. In a
> >> way it was like having a simplistic Hibernate as the core API. 99%
> >> of the queries we write are to retrieve vertices. Not Maps and Lists
> >> of something. TinkerPop's own test suite applies this type of
> >> thinking. Querying/modifying Elements and asserting them. Vertex and
> >> Edge abound as first-class citizens.
> >
> > So Graph/Vertex/Edge/VertexProperty/Property will still exist for
> > users as objects in the respective GLV language, it is just they are
> > not “attached” and “rich.”
> >
> > For instanc

Re: [TinkerPop] gremlin-x

2016-12-02 Thread pieter-gmail
Hi,

Perhaps this is also the time to think about custom steps and some way
for providers to inject custom steps into the traversal.
Currently the steps are all defined on `GraphTraversal`. Maybe providers
can extend the interface with their functionality and if
`graph.traversal()` returns the extended interface the custom steps will
be available.

e.g.

graph.traversal().streamVertex(...)

Or if the traversal becomes session aware then perhaps,

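// bulkModeOn(), specialRecipe() and commit() are hypothetical, not existing steps
GraphTraversal gt = graph.traversal();
gt.bulkModeOn();
for (int i = 0; i < 1_000_000; i++) {
    gt.addV().next();
}

gt.V().specialRecipe().next();
gt.commit();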
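
To make the interface-extension idea above a little more concrete, here is a minimal sketch. It deliberately uses stand-in types rather than the real TinkerPop interfaces: `SketchTraversal`, `ProviderTraversal`, `ProviderTraversalImpl` and `ProviderGraph` are all made-up names, and the steps only record themselves in a list instead of touching a graph.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a core traversal interface (hypothetical, heavily simplified).
interface SketchTraversal {
    SketchTraversal addV(String label);
    List<String> steps();
}

// A provider extends the core interface with its own custom step.
interface ProviderTraversal extends SketchTraversal {
    @Override
    ProviderTraversal addV(String label); // covariant return keeps the chain provider-typed

    ProviderTraversal streamVertex(String label); // provider-only step
}

// Records each step as text; a real provider would build executable steps instead.
final class ProviderTraversalImpl implements ProviderTraversal {
    private final List<String> steps = new ArrayList<>();

    @Override
    public ProviderTraversal addV(String label) {
        steps.add("addV(" + label + ")");
        return this;
    }

    @Override
    public ProviderTraversal streamVertex(String label) {
        steps.add("streamVertex(" + label + ")");
        return this;
    }

    @Override
    public List<String> steps() {
        return steps;
    }
}

// Because traversal() returns the extended interface, custom steps are visible to callers.
final class ProviderGraph {
    ProviderTraversal traversal() {
        return new ProviderTraversalImpl();
    }
}
```

With this shape, `new ProviderGraph().traversal().addV("A").streamVertex("B")` compiles because every step returns the provider's interface; whether that scales to the full step library is exactly the open question.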

Just some thoughts

Cheers
Pieter

On 01/12/2016 22:57, Marko Rodriguez wrote:
> Ah. Yes. I concur Pieter.
>
> Then I think we need to get smart about “what data?” We sort of
> already do with HaltedTraverserStrategy but this is really specific to
> internal computing and GremlinServer. We could go deeper into this
> path with:
>
> g.withDetachment(…)
>   - Detach.reference // just ids
>   - Detach.reduced // ids, labels, keys, values (basically, what
> you need for toString())
>   - Detach.rich // property data included
>   - Detach.full // edge data included (basically, StarGraph)
>
>
> Next, GraphSON would have to specify a “subtype” to the various
> g:Element types.
>
> { @type:"g:Vertex", @detach:"reference", @value: { id:1 } }
>
>
> Then, we could add a method to Element.
>
> Element.detachment() -> Detach enum
>
>
> This way, users can always know what is available to them. Finally for
> cleanliness:
>
> Element.detach(Detach) -> Element
> Element.attach(Graph) -> Element (always a Detach.full element)
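
The detachment levels listed above could be sketched as a plain Java enum. This is only an illustration of the proposal: no such `Detach` enum is confirmed to exist in TinkerPop, the constant names follow the list, and the two helper methods are my own assumption about how the levels nest.

```java
// Hypothetical enum modelled on the proposed levels; not part of TinkerPop.
enum Detach {
    REFERENCE, // just ids
    REDUCED,   // ids, labels, keys, values (what toString() needs)
    RICH,      // property data included
    FULL;      // edge data included (basically, StarGraph)

    // Assumption: each level includes everything from the levels before it.
    boolean includesProperties() {
        return compareTo(RICH) >= 0;
    }

    boolean includesEdges() {
        return this == FULL;
    }
}
```

A client holding an element could then check something like `level.includesProperties()` before asking for a property value, instead of running into an exception.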
>
>
> Just spit-ballin’,
> Marko.
>
> http://markorodriguez.com
>
>
>
>> On Dec 1, 2016, at 12:28 PM, pieter-gmail <pieter.mar...@gmail.com
>> <mailto:pieter.mar...@gmail.com>> wrote:
>>
>> Hi,
>>
>> "So with ReferenceElements, latency will be less too because it takes
>> less time to construct the ReferenceVertex than it does to construct a
>> DetachedVertex. Imagine a vertex with 100 properties and meta
>> properties. ?!"
>>
>> But ReferenceElement does not have the properties so more round trips
>> are needed, increasing latency. One of the first things to make Sqlg at
>> all usable was to make sure that a Vertex contains all of its
>> properties. Else at least one more call is needed per Vertex. It's a
>> latency killer. For those few cases where the Vertex is so fat
>> that it is slow to load and only a few properties are needed,
>> g.V().hasLabel("label").values("property1", "property2") is used. So to
>> my mind ReferenceElement increases latency and does not decrease it.
>>
>> Using ReferenceElement, which is hardly an Element at all (after all,
>> it throws exceptions on almost all of its own interface), the user has
>> to get the properties manually and is then back in a world of Maps and
>> Lists of Maps.
>>
>> A refactor of much existing code will need to toss the Vertex notion
>> altogether and replace it with Maps and Lists of Maps. Almost like
>> writing an application in pure JDBC code with thousands of lines
>> iterating through ResultSets, mapping things back and forth. Unless I
>> am missing something, this change seems huge.
>>
>> I get that all this is important for non-Java devs, but it would be a
>> pity if their problems became Java devs' problems.
>>
>> Cheers
>> Pieter
>>
>>
>> On 01/12/2016 20:38, Marko Rodriguez wrote:
>>> Hi,
>>>
>>> *PIETER REPLIES:*
>>>
>>>> One of the first reasons I came to graphs, Neo4j and then TinkerPop
>>>> way back, was precisely because of the direct access to Node/Vertex.
>>>> The user treats it like any other object, not a remote connection. It
>>>> is the embedded nature that makes life so easy. In a way it was like
>>>> having a simplistic Hibernate as the core API. 99% of the queries we
>>>> write are to retrieve vertices. Not Maps and Lists of something.
>>>> TinkerPop's own test suite applies this type of thinking.
>>>> Querying/modifying Elements and asserting them. Vertex and Edge
>>>> abound as first-class citizens.
>>>
>>> So Graph/Vertex/Edge/VertexProperty/Property will still exist for
>>> users as objects in the respective GLV language, it is just they are
>>> not “attached” and “rich.”
>>>
>>> For instance, in Gremlin-Python, you have:
>>>
>>>v = g.V().next()

Re: [TinkerPop] gremlin-x

2016-12-01 Thread pieter-gmail
Hi,

"So with ReferenceElements, latency will be less too because it takes
less time to construct the ReferenceVertex than it does to construct a
DetachedVertex. Imagine a vertex with 100 properties and meta
properties. ?!"

But ReferenceElement does not have the properties so more round trips
are needed, increasing latency. One of the first things to make Sqlg at
all usable was to make sure that a Vertex contains all of its
properties. Else at least one more call is needed per Vertex. It's a
latency killer. For those few cases where the Vertex is so fat
that it is slow to load and only a few properties are needed,
g.V().hasLabel("label").values("property1", "property2") is used. So to
my mind ReferenceElement increases latency and does not decrease it.
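
As a back-of-the-envelope illustration of the round-trip argument, the sketch below counts calls under the simplest possible assumption: exactly one extra call per vertex to fetch its properties (the text above only says "at least one"). The class and method names are illustrative and not part of Sqlg or TinkerPop.

```java
// Toy latency model; all names here are made up for illustration.
final class RoundTripModel {
    // Reference-only results: one query for the vertices, then one call per vertex for properties.
    static long referenceRoundTrips(long vertexCount) {
        return 1 + vertexCount;
    }

    // Vertices that arrive with their properties need only the initial query.
    static long fullVertexRoundTrips(long vertexCount) {
        return 1;
    }

    // Total time spent waiting on the network, ignoring payload size.
    static double totalLatencyMillis(long roundTrips, double millisPerRoundTrip) {
        return roundTrips * millisPerRoundTrip;
    }
}
```

For 1000 vertices at an assumed 0.5 ms per round trip, the reference-only variant spends 1001 round trips waiting, while the full-vertex variant spends one. Bandwidth is not modelled here at all.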

Using ReferenceElement, which is hardly an Element at all (after all, it
throws exceptions on almost all of its own interface), the user has to
get the properties manually and is then back in a world of Maps and
Lists of Maps.

A refactor of much existing code will need to toss the Vertex notion
altogether and replace it with Maps and Lists of Maps. Almost like
writing an application in pure JDBC code with thousands of lines
iterating through ResultSets, mapping things back and forth. Unless I am
missing something, this change seems huge.

I get that all this is important for non-Java devs, but it would be a
pity if their problems became Java devs' problems.

Cheers
Pieter


On 01/12/2016 20:38, Marko Rodriguez wrote:
> Hi,
>
> *PIETER REPLIES:*
>
>> One of the first reasons I came to graphs, Neo4j and then TinkerPop way
>> back was precisely because of the direct access to Node/Vertex. The user
>> treats it like any other object, not a remote connection. It is the
>> embedded nature that makes life so easy. In a way it was like having a
>> simplistic Hibernate as the core api. 99% of queries we write is to
>> retrieve vertices. Not Maps and Lists of something. TinkerPop's own test
>> suite applies this type of thinking. Querying/modifying Elements and
>> asserting them. Vertex and Edge abound as first class citizens.
>
> So Graph/Vertex/Edge/VertexProperty/Property will still exist for
> users as objects in the respective GLV language, it is just they are
> not “attached” and “rich.”
>
> For instance, in Gremlin-Python, you have:
>
> v = g.V().next()
> v.id
>
> A ReferenceVertex contains the id of the vertex so you can always
> “re-attach” it to the source.
>
> g.V(v).out()
>
>
>> Graph, Vertex and Edge is the primary abstraction that users deal with.
>> Having the direct representation of this is very very nice.
>> It makes user code easy and readable.  You know you are dealing with the
>> "Person/Address/Dog/This/That" entity/label as opposed to just a
>> decontextualized bunch of data, Maps and Lists. If Vertex/Edge/Property
>> were to disappear I'd say it would be the first call of duty to write a
>> baby hibernate to bring the property model back in again into userspace.
>
> Again, the abstraction is still there, but just ALWAYS in a detached form.
>
>>
>> Regarding jdbc, this kinda makes the point. Sqlg and Hibernate and many
>> many other tools exists precisely so that users do not need to use JDBC
>> with endless hardcoded strings guiding the application. Making TinkerPop
>> more like JDBC is not an obvious plus point.
>
> So, RemoteConnection differs from JDBC in that it's not a fat string,
> but RemoteConnection.submit(Bytecode). Thus, you still work at the
> GraphTraversal level in every GLV.
>
>> A ReferencedElement is also no good as the problem I experience is
>> latency not bandwidth.
>
> So with ReferenceElements, latency will be less too because it takes
> less time to construct the ReferenceVertex than it does to construct a
> DetachedVertex. Imagine a vertex with 100 properties and meta
> properties. ?!
>
>> I reckon the experience and usage of TinkerPop is rather different for
>> Java and non-Java people, and perhaps even among Java folks. Hopefully
>> I am not the only one who has made such heavy, happy use of the
>> TinkerPop property meta model and would be sad to see it go.
>>
>> Cheers
>> Pieter
>>
>
>
> *ROBERT REPLIES:*
>
>> I agree the focus should be on the Connection (being separate from
>> Graph) and Traversal. I wouldn't constrain it to "RemoteConnection",
>> just Connection or GraphConnection. Perhaps there's an
>> EmbeddedConnection and a RemoteConnection or maybe it's URI-oriented
>> similar to how JDBC does it. In either case, the behavior  of Remote
>> and Embedded is the same which is what I think we're striving for.
>
> Yes. Good point. Just Connection.
>
>> I would also like to see Transactions be Connection-oriented. With
>> the right API, it could hook into JTA and be able to take advantage
>> of various annotations for marking transaction boundaries.
>
> g = g.openTx()
> g.V().out().out()
> g.addV()
> g.V(1).addE().to(2)
> g.closeTx();
>
>
> ??? This way, its all about 

Re: [TinkerPop] gremlin-x

2016-12-01 Thread pieter-gmail
;http://graph.io>() - in one
>> sense i'd like to see that as something hidden as it is the source of
>> a fair bit of confusion (i.e. someone trying to load a multi GB
>> graphson file) but on the other hand it's convenient for small graphs
>> - that story could be made better. Maybe Graph.io()
>> being a part of "Graph" is hidden and only used by the test suite to
>> load data into graph providers implementations for testing purposes.
>> Of course, we will need our story in order for loading data in
>> different formats if that isn't available (which I guess we need to
>> do anyway).
>> + Can remoting Gremlin bytecode cover everything that users currently
>> do in embedded/local mode? The drawback with a remote traversal is
>> that it is bound to a single transaction and you therefore lose some
>> efficiency with certain graph databases as logic with three
>> traversals that might span a single transaction, for example, will be
>> executed less efficiently as three different transactions, refreshing a
>> transaction cache each time. Maybe the server protocol needs to be
>> session based? Maybe, going back to the first bullet I had, the
>> transaction model needs to change so that it's not bound to a single
>> thread by default?
>> + Related to the last point, is there an easy way to blend client
>> code with server code? Meaning, if i have a body of complex logic
>> right now, my only option is to submit a script. I think the
>> ScriptEngine stuff should really just be used for those situations
>> where you need a lambda. That's a nice, simple, easy-to-explain rule,
>> whereas "do it if there is complex logic" is a bit more subjective
>> (and potentially troublesome). How can GLVs get complex logic on the
>> "server" (let's say the logic is written in java and available to
>> gremlin server) so that it could be executed as part of some bytecode
>> in any GLV?
>> + It would be neat to see what kind of "server" changes we could make
>> to optimize for bytecode. Right now we sorta tied bytecode execution
>> into the same Gremlin Server execution pipeline as ScriptEngine
>> processing. I think we could probably do better on that.
>>
>> Not sure there are any immediate answers to any of the above, but
>> it's what I've been thinking about with respect to TinkerPop 3.3.0
>> lately.
>>
>>
>>
>> On Wed, Nov 30, 2016 at 1:11 PM, pieter-gmail
>> <pieter.mar...@gmail.com <mailto:pieter.mar...@gmail.com>> wrote:
>>
>> Ok, if Sqlg just implements its own RemoteConnection and there is no
>> need for an additional server in the architecture then I am a lot
>> less nervous about gremlin-x.
>>
>> As an aside, Sqlg has its own peculiarities and optimizations outside
>> the standard TinkerPop interface. Most providers will have such
>> features. If the graph is not exposed directly there will need to be
>> some way to call custom features. No idea how now, but still.
>>
>> E.g. for Sqlg the most used such feature is
>> SqlgGraph.tx().batchMode(batchModeType) but there are others. Index
>> management will be another. Without being able to access these
>> features too many benefits of choosing a particular provider will be
>> lost.
>>
>> Cheers
>> Pieter
>>
>> On 30/11/2016 19:31, Marko Rodriguez wrote:
>> > Hi,
>> >
>> >> I generally recommend the exact opposite. Neo4j shines when
>> >> embedded.
>> >
>> > And that's fine. There would simply be an EmbeddedRemoteConnection
>> > which is basically just a proxy that does nothing. No need for
>> > server stuff.
>> >
>> >> Sqlg already works off databases that are designed for remote
>> >> access.
>> >
>> > And yes, that's where a full-fledged RemoteConnection (like
>> > GremlinServer’s) is needed for marshaling data over a network.
>> >
>> >> Wrapping that with yet another server is an unnecessary
>> >> complication.
>> >
>> > It's not about GremlinServer so much as it's about
>> > RemoteConnection. If Sqlg has its own RemoteConnection
>> > implementation, then it uses that. Likewise, Neo4jServer could
>> > provide its own RemoteConnection implementation and thus, no need
>> > for GremlinServer.
>> >
>> >> (Not

Re: [DISCUSS] Graph.addVertex(Map)

2016-11-27 Thread pieter-gmail
Hi,

I saw this:
http://iteratrlearning.com/java9/2016/11/09/java9-collection-factory-methods
Creating maps will become easier (less typing) in Java 9.
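
For comparison, here is a small sketch of building the same property map the pre-Java 9 way and with the Java 9 `Map.of` factory method. The class name and the property values are placeholders, and the sketch does not call any `Graph.addVertex(Map)` overload since that is only being discussed.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: two ways to build the same key/value pairs.
final class VertexPropertyMaps {
    // Before Java 9: create a mutable map and fill it entry by entry.
    static Map<String, Object> classic() {
        Map<String, Object> properties = new HashMap<>();
        properties.put("name", "marko");
        properties.put("age", 29);
        return properties;
    }

    // Java 9 and later: a single expression producing an immutable map.
    static Map<String, Object> java9() {
        return Map.of("name", "marko", "age", 29);
    }
}
```

Note that `Map.of` returns an immutable map and rejects null keys and values, so it is a convenience for literals rather than a drop-in replacement for `HashMap`.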

Cheers
Pieter

On 20/09/2016 18:49, Stephen Mallette wrote:
> Anyone interested in seeing a Graph.addVertex(Map) overload?
>
> https://issues.apache.org/jira/browse/TINKERPOP-1174
>
> I don't imagine there would be any change to addV() in this case. I'm
> thinking that we wouldn't likely use this method internally and so it would
> more be something for user convenience, in which case, it seems to
> encourage more use of the Graph API which we're typically trying to do less
> of.
>



Re: [jira] [Updated] (TINKERPOP-1541) Select should default to Pop.last semantics

2016-11-08 Thread pieter-gmail
I'd say no. If you wanted it once, you would have selected it once, no?
In SQL, `select a.name, a.name from Person a` returns name twice
because that's what you asked for.
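
A tiny sketch of that projection behaviour, using a plain map as the row. `ProjectionSketch` and its `select` method are made-up names for illustration, not SQL or Gremlin API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative projection: one output value per requested column, in request order.
final class ProjectionSketch {
    static List<Object> select(Map<String, Object> row, List<String> columns) {
        List<Object> projected = new ArrayList<>();
        for (String column : columns) {
            // No de-duplication: asking for a column twice yields its value twice.
            projected.add(row.get(column));
        }
        return projected;
    }
}
```

Selecting `name` twice from a row therefore gives a two-element result, which is the SQL behaviour described above.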

5 cents
Pieter



On 08/11/2016 14:02, Marko A. Rodriguez (JIRA) wrote:
>  [ 
> https://issues.apache.org/jira/browse/TINKERPOP-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>  ]
>
> Marko A. Rodriguez updated TINKERPOP-1541:
> --
> Summary: Select should default to Pop.last semantics  (was: Should 
> repeated objects be returned for the same label?)
>
>> Select should default to Pop.last semantics
>> ---
>>
>> Key: TINKERPOP-1541
>> URL: https://issues.apache.org/jira/browse/TINKERPOP-1541
>> Project: TinkerPop
>>  Issue Type: Improvement
>>  Components: process
>>Affects Versions: 3.2.3
>>Reporter: Marko A. Rodriguez
>>  Labels: breaking
>>
>> Check this out:
>> {code}
>> gremlin> g.V().as('a').select('a').as('a').select('a')
>> ==>[v[1],v[1]]
>> ==>[v[2],v[2]]
>> ==>[v[3],v[3]]
>> ==>[v[4],v[4]]
>> ==>[v[5],v[5]]
>> ==>[v[6],v[6]]
>> {code}
>> Shouldn't we just return the uniques? This is a big decision as this can 
>> cause massive rippling breakage for users :).
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)



Re: [jira] [Commented] (TINKERPOP-1372) ImmutablePath should not use Java recursion (call stacks are wack)

2016-11-02 Thread pieter-gmail
Perused the code. Looks good.

I noticed there is a Travis failure on GroupCountTest; however, the same
test passes on my machine on branch TINKERPOP-1372.

VOTE +1

On 01/11/2016 16:41, ASF GitHub Bot (JIRA) wrote:
> [ 
> https://issues.apache.org/jira/browse/TINKERPOP-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15625594#comment-15625594
>  ] 
>
> ASF GitHub Bot commented on TINKERPOP-1372:
> ---
>
> GitHub user okram opened a pull request:
>
> https://github.com/apache/tinkerpop/pull/473
>
> TINKERPOP-1372: ImmutablePath should not use Java recursion (call stacks 
> are wack)
>
> https://issues.apache.org/jira/browse/TINKERPOP-1372
> 
> `ImmutablePath` used heavy method-recursion, which is expensive in Java
> because every recursive call pushes a new stack frame. All
> method-recursion has been replaced with `while(true)`-recursion.
> Furthermore, I was able to get rid of `ImmutablePath.TailPath` with a
> `public static ImmutablePath TAIL_PATH = new
> ImmutablePath(null,null,null)`.  This makes things much cleaner and we
> don't need the package protected `ImmutablePathImpl` interface.
> Finally, I stole @pietermartin's trick to use a direct reference to the
> `Set` labels as the labels are immutable.
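
The recursion-to-loop change described above can be sketched on a toy linked path. This is not the actual `ImmutablePath` code: `PathSketch`, its fields and its methods are simplified stand-ins, with a shared `TAIL` sentinel playing the role of the tail path.

```java
// Simplified stand-in for a linked, immutable path; not the real ImmutablePath.
final class PathSketch {
    // Shared sentinel that terminates every path; its currentObject is null.
    static final PathSketch TAIL = new PathSketch(null, null, null);

    private final PathSketch previous;
    private final Object currentObject;
    private final String label;

    private PathSketch(PathSketch previous, Object currentObject, String label) {
        this.previous = previous;
        this.currentObject = currentObject;
        this.label = label;
    }

    // Returns a new head segment that points back at this one.
    PathSketch extend(Object object, String label) {
        return new PathSketch(this, object, label);
    }

    // Method recursion: one stack frame per segment, so long paths get expensive.
    int sizeRecursive() {
        return this == TAIL ? 0 : 1 + previous.sizeRecursive();
    }

    // Loop version: walks back to the tail with constant stack depth.
    int size() {
        int count = 0;
        PathSketch current = this;
        while (true) {
            if (current == TAIL) break;
            count++;
            current = current.previous;
        }
        return count;
    }

    // Finds the most recently added object with the given label, or null if absent.
    Object get(String wanted) {
        PathSketch current = this;
        while (true) {
            if (current == TAIL) return null;
            if (wanted.equals(current.label)) return current.currentObject;
            current = current.previous;
        }
    }
}
```

The loop versions keep working on paths far longer than the recursive one can handle, since they never grow the call stack.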
> 
> Here is a benchmark of a bunch of `match()`-traversals on the Grateful 
> Dead graph where the first two columns are time in milliseconds and the last 
> column is the number of returned results.
> 
> ```
>  PREVIOUS NEW      # RESULTS
> 
> [12.676,  12.019,  93]  
> [222.123, 177.596, 2952]
> [27.187,  35.787,  6]
> [80.917,  77.891,  5421]
> [189.354, 176.308, 5096]
> [14.644,  14.969,  18]
> [2.214,   0.908,   3]
> [924.093, 777.707, 314932]
> ```
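
To read the table above as relative changes, a small helper like the following can be used. `BenchmarkDelta` is a made-up name and the formula is just the usual percentage difference; the inputs in the note below are taken from the rows above.

```java
// Illustrative helper for comparing the PREVIOUS and NEW columns.
final class BenchmarkDelta {
    // Positive result: NEW is faster than PREVIOUS. Negative result: NEW is slower.
    static double percentFaster(double previousMillis, double newMillis) {
        return (previousMillis - newMillis) / previousMillis * 100.0;
    }
}
```

For the last row (924.093 ms to 777.707 ms) this gives roughly 16% faster. Not every row improves, though: the third row goes from 27.187 ms to 35.787 ms, which comes out negative.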
> 
> VOTE +1.
>
> You can merge this pull request into a Git repository by running:
>
> $ git pull https://github.com/apache/tinkerpop TINKERPOP-1372
>
> Alternatively you can review and apply these changes as the patch at:
>
> https://github.com/apache/tinkerpop/pull/473.patch
>
> To close this pull request, make a commit to your master/trunk branch
> with (at least) the following in the commit message:
>
> This closes #473
> 
> 
> commit 04fe38a28d3dce2a910c40c49658c083785b6473
> Author: Marko A. Rodriguez 
> Date:   2016-11-01T12:39:45Z
>
> removed call stack recursion in ImmutablePath. All is while(true) based 
> with a break on 'tail path.' ImmutablePath.TailPath is no longer required as 
> the 'tail' is the path segment with currentObject == null. Some 
> preliminary tests show a significant speed up. Benchmark to follow suit. 
> Added more test cases to PathTest. Removed TailPath Class.forName() in 
> GryoRegistrator as it is no longer an existing class.
>
> commit 3caa5c8aa38b108f9548ce345ddd97bd7378f99e
> Author: Marko A. Rodriguez 
> Date:   2016-11-01T12:41:17Z
>
> removed ImmutablePathImpl. Was initially Deprecated as TailPath is no 
> longer needed, but since its a package local interface, it is not possible to 
> implement outside of the package. Thus, if its no longer used in the package, 
> delete.
>
> commit cd000995d1670170b9b5f3d726f20fb8cf45ffc9
> Author: Marko A. Rodriguez 
> Date:   2016-11-01T13:09:45Z
>
> removed more method-based recursions in ImmutablePath and inlined the 
> singleHead() and singleTail() methods as they are no longer interface methods 
> and are only called in one other method.
>
> commit 3896a981fdfced7b19a830738b2f3ef51f82672a
> Author: Marko A. Rodriguez 
> Date:   2016-11-01T13:19:54Z
>
> Overrode Path.isSimple() default impl for ImmutablePath that doesn't 
> create so many objects.
>
> commit deaf38a7ed35f3236614d833eeb0eac2a25334fc
> Author: Marko A. Rodriguez 
> Date:   2016-11-01T14:27:08Z
>
> added @pietermartin's direct reference to Step.getLabels() optimization 
> to ImmutablePath. Added JavaDoc to Traverser for the 
> dropLabels()/keepLabels() method. Fixed a spelling mistake in 
> AbstractTraverser.
>
> 
>
>
>> ImmutablePath should not use Java recursion (call stacks are wack)
>> --
>>
>> Key: TINKERPOP-1372
>> URL: https://issues.apache.org/jira/browse/TINKERPOP-1372
>> Project: TinkerPop
>>  Issue Type: Improvement
>>  Components: process
>>Affects Versions: 3.2.0-incubating
>>Reporter: Marko A. Rodriguez
>>Assignee: Marko A. Rodriguez
>>
>> {{ImmutablePath}} sucks for a few reasons:
>> 1. It has {{ImmutablePathImpl}} interface to combine {{Tail}} and 
>> {{ImmutablePath}}. Lame.
>> 2. It uses recursion to find data. Lame.
>> For 3.2.1, I have done a lot to use {{while()}}-based recursion and I 
>> suspect I can gut 

Re: path query optimization

2016-11-01 Thread pieter-gmail
The branch is TINKERPOP-1404
https://github.com/apache/tinkerpop/commit/c1556fe82c58527dc4425d23d1d69ce324e62cfa

Cheers
Pieter

On 01/11/2016 15:23, Marko Rodriguez wrote:
> What branch are you in? Perhaps give me URLs (to GitHub) to the files 
> touched? (if its not too many)
>
> Marko.
>
> http://markorodriguez.com
>
>
>
>> On Nov 1, 2016, at 7:19 AM, pieter-gmail <pieter.mar...@gmail.com> wrote:
>>
>> Hi,
>>
>> Yes I am, but I'm afraid I do not have time at present to concentrate
>> on it. I just noticed your ImmutablePath ticket, which will overlap
>> with some of what I have done.
>>
>> I'd suggest to pull my branch and look at what I did there. It was very
>> little, but dangerous code, which is why I was reluctant to submit a PR
>> at first. If you don't continue with it, I should in 2 or 3 weeks be
>> able to look at it again.
>>
>> Thanks
>> Pieter
>>
>>
>> On 01/11/2016 15:11, Marko Rodriguez wrote:
>>> Hi Pieter,
>>>
>>> I’m still really interested in your work in this area. Are you still doing 
>>> this?
>>>
>>> Marko.
>>>
>>> http://markorodriguez.com
>>>
>>>
>>>
>>>> On Aug 7, 2016, at 9:12 AM, Pieter Martin <pieter.mar...@gmail.com> wrote:
>>>>
>>>> To avoid the collection logic altogether. For most steps there is no
>>>> need to check the labels as it is known that they are added,
>>>> immutable and correct.
>>>>
>>>> Also with the current strategy the `ImmutablePath.currentLabels` is
>>>> exactly the same collection as that of the step. It is not a copy.
>>>>
>>>> `Traverser.addLabels()` is only called for the steps previously
>>>> mentioned. They too can be optimized as there is no need to create a
>>>> new `ImmutablePath` just to set the labels. There could be a
>>>> `Traverser.split()` method that creates the initial `ImmutablePath`
>>>> with the correct labels to start with. As things stand now the
>>>> `ImmutablePath` is created just to be replaced 1 or 2 millis later. I
>>>> have however not benchmarked any of those queries so am not touching
>>>> that at the moment.
>>>>
>>>> In some ways it's a matter of style. The label logic in
>>>> `AbstractStep` is not relevant to all of its subclasses and should
>>>> be higher in the inheritance hierarchy.
>>>>
>>>> Cheers
>>>> Pieter
>>>>
>>>>
>>>> Excerpts from Marko Rodriguez's message of August 7, 2016 3:54 :
>>>>> Why not just check to see if the labels to be added already exist, if 
>>>>> they do, don’t addLabels() and thus, don’t create a new collection.
>>>>> Marko.
>>>>> http://markorodriguez.com
>>>>>> On Aug 7, 2016, at 6:07 AM, Pieter Martin <pieter.mar...@gmail.com> wrote:
>>>>>> Here is what I have come up with so far.
>>>>>> The idea is that `Traverser.split(r, step)` already copies the labels
>>>>>> to the traverser so there is no need to call
>>>>>> `Traverser.addLabels(labels)` again.
>>>>>> I removed the `Traverser.addLabels(labels)` call from `AbstractStep`.
>>>>>> For the traversers that do not call `Traverser.split(r, step)` I
>>>>>> manually added the `traverser.addLabels(labels)` call in
>>>>>> `processNextStart()`. This was done by fixing test failures rather
>>>>>> than searching and investigating all calls to `Traverser.split()`.
>>>>>> The following steps needed the labels to be added manually.
>>>>>> `AggregateStep`
>>>>>> `CollectingBarrierStep`
>>>>>> `FilterStep`
>>>>>> `SideEffectStep`
>>>>>> `StartStep`
>>>>>> `NoOpBarrierStep`
>>>>>> Further, seeing as `Step.getLabels()` already returns an unmodifiable
>>>>>> collection and `ImmutablePath` is, well, immutable, there is no need
>>>>>> for it to have its own copy of the labels. I set the labels directly
>>>>>> on the path as opposed to making a copy.
>>>>>> `TinkerGraphPro

Re: path query optimization

2016-11-01 Thread pieter-gmail
ase of the
>>>>> ImmutablePath, is that it isolates ImmutablePath from whatever the 
>>>>> subclass
>>>>> of set was that the caller passed in.  I think that's what is causing the
>>>>> serialization test failure in this case since the caller passed in an
>>>>> unmodifiable set.
>>>>> --Ted
>>>>> On Fri, Aug 5, 2016, 2:31 PM Marko Rodriguez <okramma...@gmail.com> wrote:
>>>>>> Hello,
>>>>>> This is cool. Check out also ImmutablePath.extend(labels) as that is
>>>>>> ultimately what Traverser.addLabels() calls. We have a lot of set
>>>>>> copying and I don’t know if it's needed (as you seem to be
>>>>>> demonstrating). What I don’t like about your solution is the explicit
>>>>>> reference to the B_L_P…Traverser in AbstractStep. See if you can work
>>>>>> your solution without it.
>>>>>> Good luck,
>>>>>> Marko.
>>>>>> http://markorodriguez.com
>>>>>>> On Aug 5, 2016, at 12:44 PM, pieter-gmail <pieter.mar...@gmail.com>
>>>>>> wrote:
>>>>>>> Sorry forgot to add a rather important part.
>>>>>>>
>>>>>>> I changed ImmutablePath's constructor to
>>>>>>>
>>>>>>>private ImmutablePath(final ImmutablePathImpl previousPath, final
>>>>>>> Object currentObject, final Set currentLabels) {
>>>>>>>this.previousPath = previousPath;
>>>>>>>this.currentObject = currentObject;
>>>>>>>this.currentLabels = currentLabels;
>>>>>>> //this.currentLabels.addAll(currentLabels);
>>>>>>>}
>>>>>>>
>>>>>>> Setting the collection directly as opposed to calling `addAll`.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Pieter
>>>>>>>
>>>>>>>
>>>>>>> On 05/08/2016 20:40, pieter-gmail wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I have been optimizing Sqlg of late and eventually arrived at TinkerPop
>>>>>>>> code.
>>>>>>>>
>>>>>>>> The gremlin in particular that I am interested in is path queries.
>>>>>>>>
>>>>>>>> Here is the test that I am running in jmh.
>>>>>>>>
>>>>>>>>//@Setup
>>>>>>>>Vertex a = graph.addVertex(T.label, "A", "name", "a1");
>>>>>>>>for (int i = 1; i < 1_000_001; i++) {
>>>>>>>>Vertex b = graph.addVertex(T.label, "B", "name", "name_" +
>>>>>> i);
>>>>>>>>a.addEdge("outB", b);
>>>>>>>>for (int j = 0; j < 1; j++) {
>>>>>>>>Vertex c = graph.addVertex(T.label, "C", "name", "name_"
>>>>>>>> + i + " " + j);
>>>>>>>>b.addEdge("outC", c);
>>>>>>>>}
>>>>>>>>}
>>>>>>>>
>>>>>>>> And the query being benchmarked is
>>>>>>>>
>>>>>>>>GraphTraversal<Vertex, Path> traversal =
>>>>>>>> g.V(a).as("a").out().as("b").out().as("c").path();
>>>>>>>>while (traversal.hasNext()) {
>>>>>>>>Path path = traversal.next();
>>>>>>>>}
>>>>>>>>
>>>>>>>> Before the optimization, (as things are now)
>>>>>>>>
>>>>>>>> Benchmark Mode Cnt  Score Error
>>>>>> Units
>>>>>>>> GremlinPathBenchmark.g_path  avgt  100  1.086 ± 0.020   s/op
>>>>>>>>
>>>>>>>> The optimization I did is in AbstractStep.prepareTraversalForNextStep,
>>>>>>>> to not call addLabels() for path gremlins as the labels are known by 
>>>>>>>> the
>>>>>>>> step and do not change again so there is no need to keep adding them.
>>>>>>>>
>>>>>>>>private final Traverser.Admin prepareTraversalForNextStep(final
>>>>>>>> Traverser.Admin traverser) {
>>>>>>>>if (!this.traverserStepIdAndLabelsSetByChild) {
>>>>>>>>traverser.setStepId(this.nextStep.getId());
>>>>>>>>if (traverser instanceof B_LP_O_P_S_SE_SL_Traverser) {
>>>>>>>>} else {
>>>>>>>>traverser.addLabels(this.labels);
>>>>>>>>}
>>>>>>>>}
>>>>>>>>return traverser;
>>>>>>>>}
>>>>>>>>
>>>>>>>> After optimization,
>>>>>>>>
>>>>>>>> Benchmark Mode Cnt  Score Error
>>>>>> Units
>>>>>>>> GremlinPathBenchmark.g_path  avgt  100  0.680 ± 0.004   s/op
>>>>>>>>
>>>>>>>> 1.086 vs 0.680 seconds for the traversal.
>>>>>>>>
>>>>>>>> I ran the Structured and Process test suites. 2 tests are failing with
>>>>>>>> this optimization.
>>>>>>>>
>>>>>>>> InjectTest.g_VX1X_out_name_injectXdanielX_asXaX_mapXlengthX_path fails
>>>>>> with
>>>>>>>> "java.lang.IllegalArgumentException: The step with label a does not
>>>>>> exist"
>>>>>>>> and
>>>>>>>>
>>>>>>>> SerializationTest.shouldSerializePathAsDetached fails with
>>>>>>>>
>>>>>>>> "Caused by: java.lang.IllegalArgumentException: Class is not
>>>>>> registered:
>>>>>>>> java.util.Collections$UnmodifiableSet"
>>>>>>>>
>>>>>>>> Before investigating the failures is this optimization worth pursuing?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Pieter
>>>>>>>>



Re: PathRetractionStrategy and TraverserRequirement.PATH

2016-10-26 Thread pieter-gmail
Thanks, now I know Sqlg is indeed bugged. I am losing the label
after the emit().as("b").

Cheers
Pieter

On 25/10/2016 21:29, Marko Rodriguez wrote:
> Here is a simple test. Remove PathRetractionStrategy from TinkerGraph 
> traversal and see what you get? Do you get what Sqlg returns or the same as 
> if with PathRetractionStrategy.
>
> E.g.
>
> graph = TinkerFactory.createModern();
> g = graph.traversal().withoutStrategies(PathRetractionStrategy.class);
> g.V().the().traversal().to().test()
>
> If you get the same answer without PathRetractionStrategy, then you know that 
> Sqlg is bugged.
>
> HTH,
> Marko.
>
> http://markorodriguez.com
>
>
>
>> On Oct 24, 2016, at 2:21 PM, pieter-gmail <pieter.mar...@gmail.com> wrote:
>>
>> Ok apologies. I thought I spotted the difference and simplified the
>> gremlin too much to highlight what I thought I saw. The above mentioned
>> queries are returning the same result in Sqlg as TinkerGraph.
>>
>> Here is what is not working.
>>
>>final TinkerGraph g = TinkerFactory.createModern();
>>GraphTraversal<Vertex, Map<Vertex, Collection>>
>> traversal = g.traversal()
>>.V().as("a")
>>.repeat(both()).times(3).emit().as("b")
>>.<Vertex, Collection>group().by(select("a"));
>>printTraversalForm(traversal);
>>while (traversal.hasNext()) {
>>Map<Vertex, Collection> vertexMap = traversal.next();
>>for (Vertex vertex : vertexMap.keySet()) {
>>Collection coll = vertexMap.get(vertex);
>>System.out.println("key: " + vertex.value("name") + ",
>> value: " + coll.size());
>>}
>>}
>>
>> For this Sqlg has the same result as TinkerGraph.
>>
>> TinkerGraph
>>
>> post-strategy:[TinkerGraphStep(vertex,[])@[a],
>> RepeatStep([VertexStep(BOTH,vertex),
>> RepeatEndStep],until(loops(3)),emit(true))@[b],
>> GroupStep([SelectOneStep(a), NoOpBarrierStep(2500)],[FoldStep])]
>>
>> Sqlg
>>
>> post-strategy:[SqlgGraphStepCompiled(vertex,[])@[sqlgPathFakeLabel],
>> GroupStep([SelectOneStep(a)],[FoldStep])]
>>
>> key: marko, value: 27
>> key: vadas, value: 11
>> key: lop, value: 27
>> key: josh, value: 27
>> key: ripple, value: 11
>> key: peter, value: 11
>>
>> Adding in the extra by()
>>
>>final TinkerGraph g = TinkerFactory.createModern();
>>GraphTraversal<Vertex, Map<Vertex, Collection>>
>> traversal = g.traversal()
>>.V().as("a")
>>.repeat(both()).times(3).emit().as("b")
>>.<Vertex, Collection>group().by(select("a"))
>>.by(select("b").dedup().order().by(T.id).fold());
>>printTraversalForm(traversal);
>>while (traversal.hasNext()) {
>>Map<Vertex, Collection> vertexMap = traversal.next();
>>for (Vertex vertex : vertexMap.keySet()) {
>>Collection coll = vertexMap.get(vertex);
>>System.out.println("key: " + vertex.value("name") + ",
>> value: " + coll.size());
>>}
>>}
>>
>> TinkerGraph prints
>>
>> post-strategy:[TinkerGraphStep(vertex,[])@[a],
>> RepeatStep([VertexStep(BOTH,vertex),
>> RepeatEndStep],until(loops(3)),emit(true))@[b],
>> GroupStep([SelectOneStep(a), NoOpBarrierStep(2500)],[SelectOneStep(b),
>> DedupGlobalStep, OrderGlobalStep([[id, incr]]), FoldStep])]
>>
>> key: marko, value: 6
>> key: vadas, value: 6
>> key: lop, value: 6
>> key: josh, value: 6
>> key: ripple, value: 6
>> key: peter, value: 6
>>
>> and Sqlg
>>
>> post-strategy:[SqlgGraphStepCompiled(vertex,[])@[sqlgPathFakeLabel],
>> GroupStep([SelectOneStep(a)],[SelectOneStep(b), DedupGlobalStep,
>> OrderGlobalStep([[id, incr]]), FoldStep])]
>>
>> key: marko, value: 0
>> key: ripple, value: 0
>> key: peter, value: 0
>> key: lop, value: 0
>> key: josh, value: 0
>> key: vadas, value: 0
>>
>> The difference being the NoOpBarrierStep but I am not sure if that is
>> the culprit or not.
>>
>> Thanks
>> Pieter
>>
>>
>>
>>
>>
>>
>> On 24/10/2016 21:31, Marko Rodriguez wrote:
>>> Hi Pieter,
>>>
>>> What are

Re: PathRetractionStrategy and TraverserRequirement.PATH

2016-10-24 Thread pieter-gmail
Ok apologies. I thought I spotted the difference and simplified the
gremlin too much to highlight what I thought I saw. The above mentioned
queries are returning the same result in Sqlg as TinkerGraph.

Here is what is not working.

    final TinkerGraph g = TinkerFactory.createModern();
    GraphTraversal<Vertex, Map<Vertex, Collection>> traversal = g.traversal()
            .V().as("a")
            .repeat(both()).times(3).emit().as("b")
            .<Vertex, Collection>group().by(select("a"));
    printTraversalForm(traversal);
    while (traversal.hasNext()) {
        Map<Vertex, Collection> vertexMap = traversal.next();
        for (Vertex vertex : vertexMap.keySet()) {
            Collection coll = vertexMap.get(vertex);
            System.out.println("key: " + vertex.value("name") + ", value: " + coll.size());
        }
    }

For this Sqlg has the same result as TinkerGraph.

TinkerGraph

post-strategy:[TinkerGraphStep(vertex,[])@[a],
RepeatStep([VertexStep(BOTH,vertex),
RepeatEndStep],until(loops(3)),emit(true))@[b],
GroupStep([SelectOneStep(a), NoOpBarrierStep(2500)],[FoldStep])]

Sqlg

post-strategy:[SqlgGraphStepCompiled(vertex,[])@[sqlgPathFakeLabel],
GroupStep([SelectOneStep(a)],[FoldStep])]

key: marko, value: 27
key: vadas, value: 11
key: lop, value: 27
key: josh, value: 27
key: ripple, value: 11
key: peter, value: 11

Adding in the extra by()

    final TinkerGraph g = TinkerFactory.createModern();
    GraphTraversal<Vertex, Map<Vertex, Collection>> traversal = g.traversal()
            .V().as("a")
            .repeat(both()).times(3).emit().as("b")
            .<Vertex, Collection>group().by(select("a"))
            .by(select("b").dedup().order().by(T.id).fold());
    printTraversalForm(traversal);
    while (traversal.hasNext()) {
        Map<Vertex, Collection> vertexMap = traversal.next();
        for (Vertex vertex : vertexMap.keySet()) {
            Collection coll = vertexMap.get(vertex);
            System.out.println("key: " + vertex.value("name") + ", value: " + coll.size());
        }
    }

TinkerGraph prints

post-strategy:[TinkerGraphStep(vertex,[])@[a],
RepeatStep([VertexStep(BOTH,vertex),
RepeatEndStep],until(loops(3)),emit(true))@[b],
GroupStep([SelectOneStep(a), NoOpBarrierStep(2500)],[SelectOneStep(b),
DedupGlobalStep, OrderGlobalStep([[id, incr]]), FoldStep])]

key: marko, value: 6
key: vadas, value: 6
key: lop, value: 6
key: josh, value: 6
key: ripple, value: 6
key: peter, value: 6

and Sqlg

post-strategy:[SqlgGraphStepCompiled(vertex,[])@[sqlgPathFakeLabel],
GroupStep([SelectOneStep(a)],[SelectOneStep(b), DedupGlobalStep,
OrderGlobalStep([[id, incr]]), FoldStep])]

key: marko, value: 0
key: ripple, value: 0
key: peter, value: 0
key: lop, value: 0
key: josh, value: 0
key: vadas, value: 0

The difference being the NoOpBarrierStep but I am not sure if that is
the culprit or not.

Thanks
Pieter






On 24/10/2016 21:31, Marko Rodriguez wrote:
> Hi Pieter,
>
> What are the two answers --- TinkerGraph and Sqlg for the respective test 
> traversal?
>
> (I suspect the test is bad because group() pushes traversers through with 
> bulks and all so the test might just add to a collection without adding 
> respecting bulks. Probably should change that test regardless to do like a 
> count or something instead).
>
> Marko.
>
> http://markorodriguez.com
>
>
>
>> On Oct 24, 2016, at 12:57 PM, pieter-gmail <pieter.mar...@gmail.com> wrote:
>>
>> Hi,
>>
>> This is on 3.2.3
>>
>> I have been investigating why
>> `DedupTest.g_V_asXaX_repeatXbothX_timesX3X_emit_asXbX_group_byXselectXaXX_byXselectXbX_dedup_order_byXidX_foldX_selectXvaluesX_unfold_dedup`
>> fails on Sqlg. It is a fairly recently added test.
>>
>> My investigation so far has narrowed the problem to the
>> `PathRetractionStrategy`
>>
>> On the modern graph,
>>
>>GraphTraversal<Vertex, Map<Vertex, Collection>>
>> traversal = g.traversal()
>>.V().as("a")
>>.out().as("b")
>>.<Vertex, Collection>group().by(select("a"))
>>.by(select("b"));
>>printTraversalForm(traversal);
>>
>> Outputs the following on TinkerGraph
>>
>> pre-strategy:[GraphStep(vertex,[])@[a], VertexStep(OUT,vertex)@[b],
>> GroupStep([SelectOneStep(a)],[SelectOneStep(b)])]
>>  post-strategy:[TinkerGraphStep(vertex,[])@[a],
>> VertexStep(OUT,vertex)@[b], GroupStep([SelectOneStep(a),
>> NoOpBarrierStep(2500)],[Selec

Re: [jira] [Commented] (TINKERPOP-1506) Optional/Coalesce should not allow sideEffect traversals.

2016-10-21 Thread pieter-gmail
Hi,

`coalesce` was discussed way back when `optional` was first discussed.
@Daniel's comment seems to show that `coalesce` was not what we wanted.
@Marko's comment indicated "Ah smart. The reason choose works and
coalesce doesn't is because one uses globalTraversals and the other uses
localTraversals"

Do these comments still hold?

Thanks
Pieter

On 21/10/2016 23:42, Marko A. Rodriguez (JIRA) wrote:
> [ 
> https://issues.apache.org/jira/browse/TINKERPOP-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596463#comment-15596463
>  ] 
>
> Marko A. Rodriguez commented on TINKERPOP-1506:
> ---
>
> Huh. I just realized we can implement {{optional()}} using {{CoalesceStep}} 
> and we don't have this problem.
>
> {code}
> gremlin> g.inject(1).coalesce(addV('twin'),identity())
> ==>v[0]
> gremlin> g.V()
> ==>v[0]
> {code}
>
> Thus, {{optional(x)}} -> {{coalesce(x,identity())}}. Easy fix. Any objections 
> to this direction?
>
>
>> Optional/Coalesce should not allow sideEffect traversals.
>> -
>>
>> Key: TINKERPOP-1506
>> URL: https://issues.apache.org/jira/browse/TINKERPOP-1506
>> Project: TinkerPop
>>  Issue Type: Improvement
>>  Components: process
>>Affects Versions: 3.1.4, 3.2.2
>>Reporter: Marko A. Rodriguez
>>
>> It took me a long time to realize what was wrong with a traversal I wrote 
>> that used {{optional(blah.sideEffect.blah)}}. {{optional()}} maps to 
>> {{ChooseStep}} under the hood and the provided traversal is first tested for 
>> a {{hasNext()}}. If so, then it plays itself out. The problem is that if 
>> there is a side-effect in the traversal child, then it gets executed twice. 
>> {code}
>> gremlin> g = TinkerGraph.open().traversal()
>> ==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
>> gremlin> g.inject(1).optional(addV('twin'))
>> ==>v[1]
>> gremlin> g.V().valueMap(true)
>> ==>[id:0,label:twin]
>> ==>[id:1,label:twin]
>> {code}
>> We should NOT allow {{optional()}} to have {{SideEffectStep}} steps in it so 
>> as not to cause unexpected behavior. {{StandardVerificationStrategy}} can 
>> analyze and throw an exception if necessary.
>> Also, {{coalesce()}} has a similar problem, though perhaps it can be a 
>> useful 'technique.'
>> {code}
>> gremlin> g = TinkerGraph.open().traversal()
>> ==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
>> gremlin> g.inject(1).coalesce(addV('twin1').limit(0), addV('twin2'))
>> ==>v[1]
>> gremlin> g.V().valueMap(true)
>> ==>[id:0,label:twin1]
>> ==>[id:1,label:twin2]
>> gremlin>
>> {code}
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
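
The double execution described in the quoted issue can be reproduced
without TinkerPop at all. The sketch below is a plain-Java simulation, not
the actual ChooseStep or CoalesceStep code: the class and method names
(OptionalSideEffectDemo, testThenRun, runOnce, addVertex) are made up for
illustration, and a List stands in for the graph. It only shows why
probing a side-effecting child with hasNext() and then running it again
creates two vertices, while evaluating it once creates one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical simulation; none of these names exist in TinkerPop.
final class OptionalSideEffectDemo {
    private OptionalSideEffectDemo() {}

    // Test-then-run: the child is evaluated once to probe hasNext()
    // and a second time to produce the results.
    static <T> List<T> testThenRun(final Supplier<Iterator<T>> child, final T fallback) {
        if (child.get().hasNext()) {
            return drain(child.get());
        }
        return Collections.singletonList(fallback);
    }

    // Run-once: the child is evaluated a single time and its results
    // are used if there are any, otherwise the incoming object passes through.
    static <T> List<T> runOnce(final Supplier<Iterator<T>> child, final T fallback) {
        final Iterator<T> results = child.get();
        return results.hasNext() ? drain(results) : Collections.singletonList(fallback);
    }

    private static <T> List<T> drain(final Iterator<T> it) {
        final List<T> out = new ArrayList<>();
        it.forEachRemaining(out::add);
        return out;
    }

    // Side-effecting child: every evaluation adds a "vertex" to the list.
    static Iterator<String> addVertex(final List<String> graph, final String label) {
        graph.add(label);
        return Collections.singletonList(label).iterator();
    }

    public static void main(final String[] args) {
        final List<String> graphA = new ArrayList<>();
        testThenRun(() -> addVertex(graphA, "twin"), "input");

        final List<String> graphB = new ArrayList<>();
        runOnce(() -> addVertex(graphB, "twin"), "input");

        System.out.println("test-then-run vertices: " + graphA.size()); // 2
        System.out.println("run-once vertices:      " + graphB.size()); // 1
    }
}
```

Whether the real steps differ for exactly this reason is what the question
about global versus local traversals above is asking.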



Re: [VOTE] TinkerPop 3.2.3 Release

2016-10-18 Thread pieter-gmail
All tests pass except for one, but it's a Sqlg bug.

Vote +1


On 17/10/2016 23:21, Stephen Mallette wrote:
> Hello,
>
> We are happy to announce that TinkerPop 3.2.3 is ready for release.
>
> The release artifacts can be found at this location:
> https://dist.apache.org/repos/dist/dev/tinkerpop/3.2.3/
>
> The source distribution is provided by:
> apache-tinkerpop-3.2.3-src.zip
>
> Two binary distributions are provided for user convenience:
> apache-tinkerpop-gremlin-console-3.2.3-bin.zip
> apache-tinkerpop-gremlin-server-3.2.3-bin.zip
>
> The GPG key used to sign the release artifacts is available at:
> https://dist.apache.org/repos/dist/dev/tinkerpop/KEYS
>
> The online docs can be found here:
> http://tinkerpop.apache.org/docs/3.2.3/ (user docs)
> http://tinkerpop.apache.org/docs/3.2.3/upgrade/ (upgrade docs)
> http://tinkerpop.apache.org/javadocs/3.2.3/core/ (core javadoc)
> http://tinkerpop.apache.org/javadocs/3.2.3/full/ (full javadoc)
>
> The tag in Apache Git can be found here:
>
> https://git-wip-us.apache.org/repos/asf?p=tinkerpop.git;a=tag;h=0fdd98d8b657185b766310562926c155427594d6
>
> The release notes are available here:
>
> https://github.com/apache/tinkerpop/blob/master/CHANGELOG.asciidoc#tinkerpop-323-release-date-october-17-2016
>
> The [VOTE] will be open for the next 72 hours --- closing Thursday (October
> 20, 2016) at 5:30pm EST.
>
> My vote is +1.
>
> Thank you very much,
> Stephen
>



Re: [DISCUSS] Graph.addVertex(Map)

2016-10-18 Thread pieter-gmail
Perhaps to add some strength to my argument I can give some indication
of our current graph's shape however.

Currently we have about 14 000 tables and about 220 000 columns in the
RDBMS and it grows at runtime as we support more and more vendors and
technologies. This is in the telco space.
100% of that is added, updated and deleted via the graph.

Maps make this job way, way easier.
If I were to use some graph implementation that does not have the map
interface natively I'd add some wrapper or util or something immediately
anyways.

Data that comes in from etl processes or web frontends or wherever is
almost always already in a Map. Very very seldom do we work with varargs.
Not having a map interface forces clients to convert maps to varargs
meaning they'll add that wrapper/utility in their apps anyways.

The interface Sqlg added is `graph.addVertex(String label, Map keyValues)`
The original varargs method is never invoked.
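
For anyone without a Map overload, the wrapper I mean is roughly the
following. This is only a sketch: KeyValues and toVarargs are made-up
names, not part of Sqlg or TinkerPop, and it assumes the keys are plain
Strings. It flattens a Map into the alternating key/value array that a
varargs call such as `graph.addVertex(Object...)` expects; the label would
still have to be prepended by the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the name is illustrative only.
final class KeyValues {
    private KeyValues() {}

    // Flattens {k1=v1, k2=v2} into [k1, v1, k2, v2], keeping the
    // iteration order of the map that was passed in.
    static Object[] toVarargs(final Map<String, Object> properties) {
        final List<Object> flat = new ArrayList<>(properties.size() * 2);
        for (final Map.Entry<String, Object> entry : properties.entrySet()) {
            flat.add(entry.getKey());
            flat.add(entry.getValue());
        }
        return flat.toArray();
    }

    public static void main(final String[] args) {
        // A row as it might arrive from an ETL process or a web frontend.
        final Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", "cell_1");
        row.put("vendor", "acme");

        final Object[] keyValues = toVarargs(row);
        System.out.println(Arrays.toString(keyValues)); // [name, cell_1, vendor, acme]
    }
}
```

Every client that receives its data as a Map ends up writing something
like this, which is the argument for having the overload natively.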

Cheers
Pieter



On 18/10/2016 16:05, Marko Rodriguez wrote:
> Hi,
>
> addV(Object…) was deprecated because it didn’t support Parameters (that is, 
> dynamic traversal based parametrization).
>
> g.V().has(’name’,’bob’).as(‘a’).
>   addV(‘person’).
> property(‘name’,’stephen’).
> property(‘status’,select(‘a’).by(’status’))
>
> We could support it in Object.., but then stuff looks weird:
>
> g.V().has(’name’,’bob’).as(‘a’).
>   addV(label,’person’,’name’,’stephen’,‘status’,select(‘a’).by(‘status’))
>
> You really don’t save that much typing and I think its best to be explicit so 
> traversals are more readable.
>
> To @pieter. In terms of Map arguments. We don’t have any steps/sources that 
> take Map arguments. I would prefer not to introduce a new data structure 
> especially when its so fuggly to create in Java.
>
> Thoughts?,
> Marko.
>
> http://markorodriguez.com
>
>
>
>> On Sep 28, 2016, at 1:03 PM, Pieter Martin  wrote:
>>
>> Well, I have to say I really like Map. In almost all of our code by the time 
>> we
>> are ready to create a vertex we have the properties already in a map. Data 
>> for
>> the most part are not captured by humans typing gremlin but by machines and
>> they store key value pairs in a Map.
>>
>> Cheers
>> Pieter
>>
>> Excerpts from Marko Rodriguez's message of September 28, 2016 7:18 :
>>> Hi,
>>> Right now we have:
>>> addV().property(a,b).property(c,d,e,f)
>>> The second property() call creates a c=d vertex property with e=f 
>>> meta-property.
>>> We could do this:
>>> addV(a,b,c,d).property(c).property(e,f)
>>> That is, addV() has a Object[]… arg. However, seems to be the same length 
>>> of characters. Though, without meta-properties:
>>> addV().property(‘a’,b’).property(‘c’,’d’)
>>> …becomes:
>>> addV(‘a’,’b’,’c’,’d’)
>>> I don’t really like Map as that is not a type we use anywhere else… Marko.
>>> http://markorodriguez.com
>>>> On Sep 28, 2016, at 10:41 AM, Stephen Mallette wrote:
>>>> Matthias re-opened that issue now looking to see g.addV(Map) given my
>>>> reasoning for closing.
>>>> On Tue, Sep 20, 2016 at 12:49 PM, Stephen Mallette wrote:
>>>>> Anyone interested in seeing a Graph.addVertex(Map) overload?
>>>>> https://issues.apache.org/jira/browse/TINKERPOP-1174
>>>>> I don't imagine there would be any change to addV() in this case. I'm
>>>>> thinking that we wouldn't likely use this method internally and so it
>>>>> would more be something for user convenience, in which case, it seems to
>>>>> encourage more use of the Graph API which we're typically trying to do
>>>>> less of.
>>>>>



Re: Code Freeze 3.2.3/3.1.5

2016-10-14 Thread pieter-gmail
I just upgraded and things are looking good.

Cheers

Pieter


On 14/10/2016 20:50, Stephen Mallette wrote:
> It's been fairly quiet on this thread for this release for some reason. I
> assume that can only mean that this is to be the best release ever!
>
> I just published the latest 3.2.3-SNAPSHOT again to Apache Snapshots
> Repository and also republished the docs:
>
> http://tinkerpop.apache.org/docs/3.2.3-SNAPSHOT/
>
> and they look pretty good (there was a problem with upgrade doc
> formatting). Upgrade docs look really solid for this release. Hopefully,
> master is fully stable now and we won't need any more changes before I
> build up the release for vote on monday.
>
>
> On Tue, Oct 11, 2016 at 7:20 PM, Ted Wilmes  wrote:
>
>> I was getting failures earlier today off of master.  Just did a pull and
>> things are looking good.
>>
>> --Ted
>>
>> On Tue, Oct 11, 2016 at 5:48 PM, Stephen Mallette 
>> wrote:
>>
>>> Pushed a commit to master earlier today to fix that issue we talked about
>>> last week regarding the failing TraversalInterruptionTest. Travis has been
>>> happy and I can't seem to get it to fail locally. I think it's in good
>>> shape. If you were having problems with that before, please give it a try
>>> now. Marko is still planning to do some work to fix up PeerPressure test
>>> (can't say I've had trouble with that one myself).
>>>
>>> Also, I published a 3.2.3 -SNAPSHOT earlier today btw to the Apache
>>> snapshot repository for testing.
>>>
>>> On Fri, Oct 7, 2016 at 6:44 PM, Stephen Mallette 
>>> wrote:
>>>
>>>> We're supposed to start code freeze tomorrow, but we are a little behind.
>>>> Still have one PR left to merge and it needs a rebase:
>>>>
>>>> https://github.com/apache/tinkerpop/pull/448
>>>>
>>>> So expect that to get merged for 3.2.3 during code freeze week, but
>>>> nothing in that PR should preclude providers from testing their
>>>> implementations.  Other than that, I think everything else of substance
>>>> is in.
>>>>
>>>> I do have one worry about that TraversalInterruption test that has been
>>>> failing randomly since the LazyBarrierStrategy stuff went in (i think).
>>>> Marko also mentioned the PeerPressure test. We'll put some elbow grease
>>>> into that next week and try to get those figured out and more stable.
>>>>
>>>> As a reminder Ted will be release manager for 3.1.5 and I'll be doing
>>>> 3.2.3. As usual, we will use this thread to coordinate during code freeze
>>>> week. Please bring up relevant issues here.
>>>>
>>>> Thanks,
>>>>
>>>> Stephen




Re: path query optimization

2016-08-05 Thread pieter-gmail
Sorry forgot to add a rather important part.

I changed ImmutablePath's constructor to

    private ImmutablePath(final ImmutablePathImpl previousPath,
                          final Object currentObject,
                          final Set currentLabels) {
        this.previousPath = previousPath;
        this.currentObject = currentObject;
        this.currentLabels = currentLabels;
        //this.currentLabels.addAll(currentLabels);
    }

Setting the collection directly as opposed to calling `addAll`.
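
To make the trade-off concrete, here is a small stand-alone sketch. It is
not TinkerPop code: LabelHolderDemo, shareDirectly and copyDefensively are
made-up names. It shows that keeping the caller's Set saves the copy but
also keeps the caller's concrete class, whereas copying always yields a
known type. I assume the step hands over an unmodifiable set, which is
what `Step.getLabels()` was said to return.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical demo class; names are illustrative only.
final class LabelHolderDemo {
    private LabelHolderDemo() {}

    // Keeps the caller's Set instance as-is: no copying, but the holder
    // now depends on whatever Set subclass the caller passed in.
    static Set<String> shareDirectly(final Set<String> labels) {
        return labels;
    }

    // Copies into a fresh set of a known concrete type, at the cost of
    // one allocation and one pass over the labels.
    static Set<String> copyDefensively(final Set<String> labels) {
        return new LinkedHashSet<>(labels);
    }

    public static void main(final String[] args) {
        final Set<String> fromStep =
                Collections.unmodifiableSet(new HashSet<>(Collections.singleton("a")));

        // Prints java.util.Collections$UnmodifiableSet
        System.out.println(shareDirectly(fromStep).getClass().getName());
        // Prints java.util.LinkedHashSet
        System.out.println(copyDefensively(fromStep).getClass().getName());
    }
}
```

The first class name is the one a serializer with a fixed class registry
would have to know about once the copy is removed.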

Thanks
Pieter


On 05/08/2016 20:40, pieter-gmail wrote:
> Hi,
>
> I have been optimizing Sqlg of late and eventually arrived at TinkerPop
> code.
>
> The gremlin in particular that I am interested in is path queries.
>
> Here is the test that I am running in jmh.
>
> //@Setup
> Vertex a = graph.addVertex(T.label, "A", "name", "a1");
> for (int i = 1; i < 1_000_001; i++) {
> Vertex b = graph.addVertex(T.label, "B", "name", "name_" + i);
> a.addEdge("outB", b);
> for (int j = 0; j < 1; j++) {
> Vertex c = graph.addVertex(T.label, "C", "name", "name_"
> + i + " " + j);
> b.addEdge("outC", c);
> }  
> }
>
> And the query being benchmarked is
>
> GraphTraversal<Vertex, Path> traversal =
> g.V(a).as("a").out().as("b").out().as("c").path();
> while (traversal.hasNext()) {
> Path path = traversal.next();
> }
>
> Before the optimization, (as things are now)
>
> Benchmark Mode Cnt  Score Error   Units
> GremlinPathBenchmark.g_path  avgt  100  1.086 ± 0.020   s/op
>
> The optimization I did is in AbstractStep.prepareTraversalForNextStep,
> to not call addLabels() for path gremlins as the labels are known by the
> step and do not change again so there is no need to keep adding them.
>
> private final Traverser.Admin prepareTraversalForNextStep(final
> Traverser.Admin traverser) {
> if (!this.traverserStepIdAndLabelsSetByChild) {
> traverser.setStepId(this.nextStep.getId());
> if (traverser instanceof B_LP_O_P_S_SE_SL_Traverser) {
> } else {
> traverser.addLabels(this.labels);
> }  
> }  
> return traverser;
> } 
>
> After optimization,
>
> Benchmark Mode Cnt  Score Error   Units
> GremlinPathBenchmark.g_path  avgt  100  0.680 ± 0.004   s/op
>
> 1.086 vs 0.680 seconds for the traversal.
>
> I ran the Structured and Process test suites. 2 tests are failing with
> this optimization.
>
> InjectTest.g_VX1X_out_name_injectXdanielX_asXaX_mapXlengthX_path fails with
>
> "java.lang.IllegalArgumentException: The step with label a does not exist"
>
> and
>
> SerializationTest.shouldSerializePathAsDetached fails with
>
> "Caused by: java.lang.IllegalArgumentException: Class is not registered:
> java.util.Collections$UnmodifiableSet"
>
> Before investigating the failures is this optimization worth pursuing?
>
> Thanks
> Pieter
>



RE: path query optimization

2016-08-05 Thread pieter-gmail
Hi,

I have been optimizing Sqlg of late and eventually arrived at TinkerPop
code.

The gremlin in particular that I am interested in is path queries.

Here is the test that I am running in jmh.

    //@Setup
    Vertex a = graph.addVertex(T.label, "A", "name", "a1");
    for (int i = 1; i < 1_000_001; i++) {
        Vertex b = graph.addVertex(T.label, "B", "name", "name_" + i);
        a.addEdge("outB", b);
        for (int j = 0; j < 1; j++) {
            Vertex c = graph.addVertex(T.label, "C", "name", "name_" + i + " " + j);
            b.addEdge("outC", c);
        }
    }

And the query being benchmarked is

    GraphTraversal<Vertex, Path> traversal =
            g.V(a).as("a").out().as("b").out().as("c").path();
    while (traversal.hasNext()) {
        Path path = traversal.next();
    }

Before the optimization, (as things are now)

Benchmark Mode Cnt  Score Error   Units
GremlinPathBenchmark.g_path  avgt  100  1.086 ± 0.020   s/op

The optimization I did is in AbstractStep.prepareTraversalForNextStep,
to not call addLabels() for path gremlins as the labels are known by the
step and do not change again so there is no need to keep adding them.

    private final Traverser.Admin prepareTraversalForNextStep(final Traverser.Admin traverser) {
        if (!this.traverserStepIdAndLabelsSetByChild) {
            traverser.setStepId(this.nextStep.getId());
            // Skip addLabels() for path traversers: the step's labels are
            // already known and do not change again.
            if (!(traverser instanceof B_LP_O_P_S_SE_SL_Traverser)) {
                traverser.addLabels(this.labels);
            }
        }
        return traverser;
    }

After optimization,

Benchmark Mode Cnt  Score Error   Units
GremlinPathBenchmark.g_path  avgt  100  0.680 ± 0.004   s/op

1.086 vs 0.680 seconds for the traversal.
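
Put as ratios, and using only the two JMH scores from the tables above
(1.086 s/op before, 0.680 s/op after), that is roughly a 1.6x speedup, or
about 37% less time per traversal. The class below is just the arithmetic;
the name Speedup and its methods are made up for this note.

```java
import java.util.Locale;

// Illustrative arithmetic only; not part of the benchmark itself.
final class Speedup {
    private Speedup() {}

    // How many times faster the optimized run is.
    static double speedupFactor(final double before, final double after) {
        return before / after;
    }

    // Share of the original time that was saved, in percent.
    static double percentReduction(final double before, final double after) {
        return (before - after) / before * 100.0;
    }

    public static void main(final String[] args) {
        final double before = 1.086; // s/op, score before the optimization
        final double after = 0.680;  // s/op, score after the optimization
        System.out.printf(Locale.ROOT, "%.2fx faster, %.1f%% less time%n",
                speedupFactor(before, after), percentReduction(before, after));
    }
}
```

The error bars (± 0.020 and ± 0.004) are small compared to the difference,
so the gain is not measurement noise.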

I ran the Structured and Process test suites. 2 tests are failing with
this optimization.

InjectTest.g_VX1X_out_name_injectXdanielX_asXaX_mapXlengthX_path fails with

"java.lang.IllegalArgumentException: The step with label a does not exist"

and

SerializationTest.shouldSerializePathAsDetached fails with

"Caused by: java.lang.IllegalArgumentException: Class is not registered:
java.util.Collections$UnmodifiableSet"

Before investigating the failures is this optimization worth pursuing?

Thanks
Pieter



Re: [DISCUSS] interrupt

2016-07-22 Thread pieter-gmail
"It's not clear to me if the problem exists in HSQLDB, the test, or tail
step"

This had nothing to do with the TailStep bug. That one is resolved for
the most part.

For the rest, where the problem lies is itself part of the problem.
Thread.interrupt() has rather weak semantics, with many different
behaviors. Some callees reset the flag, some throw an exception, some
swallow it and some do a combination of all. I don't think engaging 3rd
parties with regards to this is an option. Firstly there are way too many
3rd parties where the InterruptedException is being caught to even start.
Secondly I reckon as the semantics are weak every 3rd party engagement
will turn into a discussion itself. From what I gather many 3rd parties
call Thread.sleep/join or Object.wait and handle the InterruptedException
for the interrupt that they are expecting. I imagine they are swallowing
the exception and not resetting the flag with good cause.
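
A minimal illustration of the point, using only the JDK. The class name
InterruptFlagDemo and its methods are made up; sleepAndSwallow stands in
for any 3rd party call that catches the exception and carries on. Once
Thread.sleep() throws InterruptedException the interrupt status is
cleared, so a later check of the flag by the caller sees nothing unless
the library restored it.

```java
// Hypothetical demo; stands in for library code that blocks briefly.
final class InterruptFlagDemo {
    private InterruptFlagDemo() {}

    // Library style 1: catches the exception and swallows it.
    static void sleepAndSwallow() {
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            // Swallowed: the interrupt status was cleared when this was thrown.
        }
    }

    // Library style 2: catches the exception and restores the flag.
    static void sleepAndRestore() {
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Interrupts a fresh thread, lets it run the "library" call, and
    // reports whether the interrupt flag is still set afterwards.
    static boolean flagSurvives(final Runnable libraryCall) {
        final boolean[] flagAfterCall = new boolean[1];
        final Thread worker = new Thread(() -> {
            Thread.currentThread().interrupt();
            libraryCall.run();
            flagAfterCall[0] = Thread.currentThread().isInterrupted();
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the demo thread", e);
        }
        return flagAfterCall[0];
    }

    public static void main(final String[] args) {
        System.out.println("flag after swallowing call: "
                + flagSurvives(InterruptFlagDemo::sleepAndSwallow)); // false
        System.out.println("flag after restoring call:  "
                + flagSurvives(InterruptFlagDemo::sleepAndRestore)); // true
    }
}
```

A step that polls Thread.interrupted() after the first kind of call will
never see the interrupt.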

Regarding delegating the query to a separate thread, even if Sqlg
executes the sql in a different thread there are still many other 3rd
party libraries that might interfere with the expected interrupt logic
outside of just the sql query.

This makes me of the opinion that Thread.interrupt is an unreliable
mechanism for interrupting a traversal.

Regarding asynchronous or synchronous I'd say the interrupt request
should be asynchronous with a Future that returns on a successful
cancellation. That way you can wait for it or not.

From what I understand the complexity is more in GremlinServer, which
executes scripts and has no real concept of a traversal. It does not
even really want to interrupt a traversal as such but rather a script
which may itself contain many traversals. I reckon it will have to pass
in an object when executing the script which the graph will store in a
thread-local variable. The graph can then register all traversals
executing in the thread on that object. And when the time comes to
interrupt a script GremlinServer will call interrupt on that object which
in turn will interrupt the currently executing traversal. Something like
that is what I am thinking of.
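
Very roughly, and only as a sketch of the idea rather than a proposal for
the actual API: Interruptible and ScriptInterruptHandle below are made-up
names, nothing like them exists in TinkerPop or GremlinServer today. The
handle is the object passed in with the script, traversals register on it
while they run, and the interrupt request returns a future so the caller
can wait for it or not. The thread-local lookup is left out.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical: stands in for whatever a running traversal would expose.
interface Interruptible {
    void interrupt();
}

// Hypothetical handle that the server would pass in per script execution.
final class ScriptInterruptHandle {
    private final Set<Interruptible> running = ConcurrentHashMap.newKeySet();

    // Called by the graph when a traversal starts in the script's thread.
    void register(final Interruptible traversal) {
        running.add(traversal);
    }

    // Called by the graph when a traversal finishes normally.
    void unregister(final Interruptible traversal) {
        running.remove(traversal);
    }

    // Asynchronous interrupt request: completes with the number of
    // traversals that were still registered and got interrupted.
    CompletableFuture<Integer> interruptAll() {
        return CompletableFuture.supplyAsync(() -> {
            int interrupted = 0;
            for (final Interruptible traversal : running) {
                traversal.interrupt();
                interrupted++;
            }
            return interrupted;
        });
    }

    public static void main(final String[] args) {
        final ScriptInterruptHandle handle = new ScriptInterruptHandle();
        final Interruptible finished = () -> System.out.println("finished traversal interrupted");
        final Interruptible active = () -> System.out.println("active traversal interrupted");

        handle.register(finished);
        handle.register(active);
        handle.unregister(finished); // first traversal completed on its own

        // The caller may block on the future or just fire and forget.
        final int count = handle.interruptAll().join();
        System.out.println("interrupted: " + count); // 1
    }
}
```

How each traversal reacts to interrupt() is a separate question; the point
here is only that the request no longer depends on the thread's interrupt
flag surviving 3rd party code.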

Cheers
Pieter


On 22/07/2016 14:24, Robert Dale wrote:
> Trying to summarize the concerns I think I'm hearing:
> 1. cancelling the gremlin job
> 2. cancelling the task in the backend database, this implies handling
> at minimum:
>   a. commit state: interruptable
>   b: rollback state: probably not interruptable
> 3. responding to the client, returning the thread
>
> Should these things be done synchronously or asynchronously or some
> combination? The answer may depend on how decoupled they are.
>
> Separately, are tests doing the right thing? It's not clear to me if
> the problem exists in HSQLDB, the test, or tail step.
>
> I think if Thread.interrupt() is the right way, then that's the way it
> should be done regardless of bad citizen libraries.
>
> Handle 3rd party bad citizens by:
> - filing a bug with them. Maybe they will fix or justify the behavior.
> - tracking them in a Known Issues list
> - workaround them as close as possible to the problem:
> I'm not familiar with how providers work so I don't know how generally
> applicable this would be, but in the case of Sqlg, the sql query
> itself could be delegated to a separate thread in which special
> interrupt strategies could be implemented such as the while loop.
>
> Side question: are there management tools in gremlin server to see
> currently running tasks and kill them? Or is that something that would
> be delegated to the backend database?
>



Re: [DISCUSS] interrupt

2016-07-21 Thread pieter-gmail
Ok, np, it's not serious, Postgres is the important one for me anyhow and
it is behaving. I'll investigate how to tell Postgres to cancel the
query. Just stopping the traversal is not quite good enough as every now
and again we have queries on Postgres that persist even if the java
thread dies.
Thanks,
Pieter

On 21/07/2016 22:16, Stephen Mallette wrote:
>> For every traversal that starts it notifies the caller via the reference
> object about the traversal.
>
> that's the tricky bit. you'd have to have some global tracking of spawned
> traversals to know that and it would have to be bound to the Thread that
> started it I guess. That information isn't going to be available out of a
> standard JSR-223 ScriptEngine.eval() call. We are making some changes to
> ScriptEngine where we extend upon it for purposes of Gremlin. Maybe there's
> opportunity in those changes to make a change like this somewhere in that
> work (though how that would happen is still murky to me).
>
> If we need changes to the ScriptEngine to even think about doing this, it
> may be a bit of a way off before we can make much progress here. I don't
> expect to see all the ScriptEngine work I had in mind done until 3.3.x as
> it must include some breaking changes to some public APIs to happen.
>
>
>
>
>
> On Thu, Jul 21, 2016 at 4:01 PM, pieter-gmail <pieter.mar...@gmail.com>
> wrote:
>
>> Well no, the problem is Thread.interrupted() is not reliable. Does not
>> really matter who the caller is, GremlinServer or other.
>> Just about every 3rd party library I can see might reset the flag
>> meaning that the check will randomly return false or true. Something as
>> trivial as a logger might even reset the flag. It seems to me interrupt
>> is more for code that actually calls wait/join/sleep and handles any
>> subsequent InterruptedException as it pleases.
>>
>> All I can think of for GremlinServer is a way more complex multi
>> threaded solution.
>> The ScriptEngine.eval passes in a reference object and returns
>> immediately. For every traversal that starts it notifies the caller via
>> the reference object about the traversal. The caller then uses that
>> traversal to interrupt it. Plus some more logic to know when the script
>> is done.
>>
>> Ok had another idea but kinda want to try it first as it might be
>> nonsense. Basically keep retrying the Thread.interrupt() till the thread
>> via exceptions bubbles to the top of the stack and gets handled
>> appropriately.
>>
>> On 21/07/2016 18:47, Stephen Mallette wrote:
>>> thanks for all that pieter. the primary reason for traversal interruption
>>> in the first place was so that gremlin server would have a chance to kill
>>> traversals that were running too long. Without a solution to that problem,
>>> I'm not sure what to do here. just tossing ideas around - could we still
>>> check for thread interruption as an additional way to interrupt a
>>> Traversal. maybe instead of:
>>>
>>> if (Thread.interrupted()) throw new TraversalInterruptedException();
>>>
>>> we need:
>>>
>>> if (Thread.interrupted()) this.traversal.interrupt()
>>>
>>> that would then trigger whatever interrupt logic the traversal had?
>>>
>>> If we need to do a better job with AbstractStep, please create a JIRA
>>> (and/or submit a PR) so we don't forget to make some improvements there.
>>>
>>> On Thu, Jul 21, 2016 at 12:37 PM, pieter-gmail <pieter.mar...@gmail.com>
>>> wrote:
>>>
>>>> I just did a global Intellij search in the Sqlg project.
>>>>
>>>> HSQLDB has 13 catch (InterruptedException e) clauses. All of them
>>>> swallow the exception and none resets the interrupt flag.
>>>>
>>>> Postgresql jdbc driver has 3 catch (InterruptedException e) clauses. 2
>>>> swallow the exception without resetting the interrupt flag and one
>>>> throws an exception.
>>>>
>>>> The rest,
>>>>
>>>> logback, 7 catch (InterruptedException e), 1 resets the flag while the
>>>> rest swallow the exception without resetting the interrupt flag
>>>>
>>>> google guava about 25 catch (InterruptedException e), all reset the
>>>> interrupt flag
>>>>
>>>> hazelcast 85 catch (InterruptedException e), too many to count but some
>>>> reset the interrupt flag and some don't
>>>>
>>>> mchange c3p0 pool 7 catch (InterruptedException e), 4 throw an exception
>>>> without resetting the in

Re: [DISCUSS] interrupt

2016-07-21 Thread pieter-gmail
Well no, the problem is Thread.interrupted() is not reliable. Does not
really matter who the caller is, GremlinServer or other.
Just about every 3rd party library I can see might reset the flag
meaning that the check will randomly return false or true. Something as
trivial as a logger might even reset the flag. It seems to me interrupt
is more for code that actually calls wait/join/sleep and handles any
subsequent InterruptedException as it pleases.

All I can think of for GremlinServer is a way more complex
multi-threaded solution.
The ScriptEngine.eval passes in a reference object and returns
immediately. For every traversal that starts it notifies the caller via
the reference object about the traversal. The caller then uses that
traversal to interrupt it. Plus some more logic to know when the script
is done.

Ok, had another idea but kinda want to try it first as it might be
nonsense. Basically keep retrying the Thread.interrupt() until the
resulting exception bubbles to the top of the thread's stack and gets
handled appropriately.
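
A rough sketch of the retry idea above, using only plain java.lang.Thread calls. RetryInterrupt and interruptUntilDead are hypothetical names, nothing here is TinkerPop API, and it assumes the target thread will eventually stop swallowing interrupts. pauseMillis has to be greater than zero, because join(0) waits forever.

```java
// Hypothetical helper: re-sends the interrupt until the target thread has terminated.
final class RetryInterrupt {

    // Returns true if the target thread is dead when the method returns.
    static boolean interruptUntilDead(Thread target, int maxAttempts, long pauseMillis) {
        for (int attempt = 0; attempt < maxAttempts && target.isAlive(); attempt++) {
            target.interrupt();
            try {
                // Give the target a moment to unwind before interrupting again.
                target.join(pauseMillis);
            } catch (InterruptedException e) {
                // The calling thread was interrupted itself: restore its flag and give up.
                Thread.currentThread().interrupt();
                break;
            }
        }
        return !target.isAlive();
    }
}
```

The obvious downside is that code which swallows every interrupt in a loop never terminates, so the attempt limit is needed.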

On 21/07/2016 18:47, Stephen Mallette wrote:
> thanks for all that pieter. the primary reason for traversal interruption
> in the first place was so that gremlin server would have a chance to kill
> traversals that were running too long. Without a solution to that problem,
> I'm not sure what to do here. just tossing ideas around - could we still
> check for thread interruption as an additional way to interrupt a
> Traversal. maybe instead of:
>
> if (Thread.interrupted()) throw new TraversalInterruptedException();
>
> we need:
>
> if (Thread.interrupted()) this.traversal.interrupt()
>
> that would then trigger whatever interrupt logic the traversal had?
>
> If we need to do a better job with AbstractStep, please create a JIRA
> (and/or submit a PR) so we don't forget to make some improvements there.
>
> On Thu, Jul 21, 2016 at 12:37 PM, pieter-gmail <pieter.mar...@gmail.com>
> wrote:
>
>> I just did a global Intellij search in the Sqlg project.
>>
>> HSQLDB has 13 catch (InterruptedException e) clauses. All of them
>> swallow the exception and none resets the interrupt flag.
>>
>> Postgresql jdbc driver has 3 catch (InterruptedException e) clauses. 2
>> swallow the exception without resetting the interrupt flag and one
>> throws an exception.
>>
>> The rest,
>>
>> logback, 7 catch (InterruptedException e), 1 resets the flag while the
>> rest swallow the exception without resetting the interrupt flag
>>
>> google guava about 25 catch (InterruptedException e), all reset the
>> interrupt flag
>>
>> hazelcast 85 catch (InterruptedException e), too many to count but some
>> reset the interrupt flag and some don't
>>
>> mchange c3p0 pool 7 catch (InterruptedException e), 4 throw an exception
>> without resetting the interrupt flag and 3 swallow the exception without
>> resetting the interrupt flag.
>>
>> mchange common 8 catch (InterruptedException e), 2 throw an exception
>> without resetting the interrupt flag and 6 completely swallow it without
>> resetting.
>>
>> commons-io 8 catch (InterruptedException e), 1 resets the interrupt
>> flag, 7 swallow the exception without resetting the interrupt flag
>>
>> jline 3 catch (InterruptedException e), all swallow the exception without
>> resetting the flag.
>>
>>
>> All in all I don't think using interrupt will be a reliable strategy to
>> use.
>>
>> http://stackoverflow.com/questions/10401947/methods-that-clear-the-thread-interrupt-flag
>> says that it is good practice to always reset the flag. It might be good
>> but it is not common.
>> From the above rather quick search only google guava respected that good
>> practice.
>>
>> AbstractStep code
>> if (Thread.interrupted()) throw new TraversalInterruptedException();
>>
>> will also reset the interrupt flag potentially making someone else's
>> Thread.interrupted() check fail.
>>
>>
>> All that said I do not have a solution for GremlinServer not having
>> access to the traversal.
>>
>> Thanks
>> Pieter
>>
>>
>>
>>
>>
>>
>> On 21/07/2016 17:09, Stephen Mallette wrote:
>>> I don't recall all the issues with doing traversal interruption with a
>>> flag. I suppose it could work in the same way that thread interruption
>>> works now. I will say that I'm hesitant to say that we should change this
>>> on the basis of this being a problem general to databases as we've only
>>> seen it so far in HSQLDB. If it was shown to be a problem in other graphs
>>> i'd be more amplified to see a cha

Re: [DISCUSS] interrupt

2016-07-21 Thread pieter-gmail
I just did a global Intellij search in the Sqlg project.

HSQLDB has 13 catch (InterruptedException e) clauses. All of them
swallow the exception and none resets the interrupt flag.

Postgresql jdbc driver has 3 catch (InterruptedException e) clauses. 2
swallow the exception without resetting the interrupt flag and one
throws an exception.

The rest,

logback, 7 catch (InterruptedException e), 1 resets the flag while the
rest swallow the exception without resetting the interrupt flag

google guava about 25 catch (InterruptedException e), all reset the
interrupt flag

hazelcast 85 catch (InterruptedException e), too many to count but some
reset the interrupt flag and some don't

mchange c3p0 pool 7 catch (InterruptedException e), 4 throw an exception
without resetting the interrupt flag and 3 swallow the exception without
resetting the interrupt flag.

mchange common 8 catch (InterruptedException e), 2 throw an exception
without resetting the interrupt flag and 6 completely swallow it without
resetting.

commons-io 8 catch (InterruptedException e), 1 resets the interrupt
flag, 7 swallow the exception without resetting the interrupt flag

jline 3 catch (InterruptedException e), all swallow the exception without
resetting the flag.


All in all I don't think using interrupt will be a reliable strategy to
use.
http://stackoverflow.com/questions/10401947/methods-that-clear-the-thread-interrupt-flag
says that it is good practice to always reset the flag. It might be good
but it is not common.
From the above rather quick search only google guava respected that good
practice.

AbstractStep code
if (Thread.interrupted()) throw new TraversalInterruptedException();

will also clear the interrupt flag, potentially making someone else's
Thread.interrupted() check fail.
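
To make the difference concrete, here is a small self-contained sketch in plain Java (InterruptFlagDemo is just an illustrative name). "Resetting the flag" in the survey above is read here as calling Thread.currentThread().interrupt() again in the catch block, which is my interpretation of the good practice in the linked answer. The second half shows that Thread.interrupted() itself clears the flag.

```java
final class InterruptFlagDemo {

    // Swallows the exception: sleep() has already cleared the flag, so the interrupt is lost.
    static void swallow() {
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            // ignored
        }
    }

    // Restores the flag so that callers further up the stack can still see the interrupt.
    static void restore() {
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt();
        swallow();
        System.out.println(Thread.currentThread().isInterrupted()); // prints false

        Thread.currentThread().interrupt();
        restore();
        System.out.println(Thread.currentThread().isInterrupted()); // prints true

        // Thread.interrupted() reports the flag and clears it in the same call.
        System.out.println(Thread.interrupted()); // prints true
        System.out.println(Thread.interrupted()); // prints false
    }
}
```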


All that said I do not have a solution for GremlinServer not having
access to the traversal.

Thanks
Pieter






On 21/07/2016 17:09, Stephen Mallette wrote:
> I don't recall all the issues with doing traversal interruption with a
> flag. I suppose it could work in the same way that thread interruption
> works now. I will say that I'm hesitant to say that we should change this
> on the basis of this being a problem general to databases as we've only
> seen it so far in HSQLDB. If it was shown to be a problem in other graphs
> i'd be more amplified to see a change. Not sure if any other graph
> providers out there can attest to a problem with the thread interruption
> approach but it would be nice to hear so if they did.
>
> Of course, I think you alluded to the bigger problem, which is that Gremlin
> Server uses thread interruption to kill script executions and iterations
> that exceed timeouts. So, the problem there is that, if someone submits a
> script like this:
>
> t = g.V()
> x = t.toList()
>
> that script gets pushed into a ScriptEngine.eval() method. That method
> blocks until it is complete. Under that situation, Gremlin Server doesn't
> have access to the Traversal to call interrupt on it. "t" is iterating via
> toList() and there is no way to stop it. Not sure what we could do about
> situations like that.
>
> On Wed, Jul 20, 2016 at 4:00 PM, pieter-gmail <pieter.mar...@gmail.com>
> wrote:
>
>> The current interrupt implementation is failing on Sqlg's HSQLDB
>> implementation.
>> The reason for this is that HSQLDB itself relies on Thread.interrupt()
>> for its own internal logic. When TinkerPop interrupts the thread it
>> thinks it has to do with its own logic and as a result the interrupt
>> flag is reset and no exception is thrown.
>>
>> The Thread.interrupt javadocs say that wait(), join() and sleep() will
>> all reset the interrupt flag and throw an InterruptedException.
>> This makes TinkerPop's reliance on the flag being set somewhat fragile.
>> All of those methods I suspect are common with database io code and
>> TinkerPop being a high level database layer makes it susceptible to 3rd
>> party interpretations of interrupt semantics.
>>
>> In some ways the TraversalInterruptionTest itself had to carefully reset
>> the flag with its usage of Thread.sleep().
>>
>> My proposal is to mark the traversal itself as interrupted rather than
>> the thread and keep the logic contained to TinkerPop's space.
>>
>> Another benefit is that the traversal.interrupt() can raise an event
>> that implementations can listen to. On receipt of the event they would
>> then be able to send a separate request to the database to cancel a
>> particular query. In my case it would be a nice way for Sqlg to tell
>> Postgresql or HSQLDB to cancel a particular query (the latest one the
>> traversal executed).
>>
>> In many ways the semantics are the same. Currently for client code
>> wantin

[DISCUSS] interrupt

2016-07-20 Thread pieter-gmail
The current interrupt implementation is failing on Sqlg's HSQLDB
implementation.
The reason for this is that HSQLDB itself relies on Thread.interrupt()
for its own internal logic. When TinkerPop interrupts the thread, HSQLDB
thinks the interrupt has to do with its own logic, and as a result the
interrupt flag is cleared and no exception is thrown.

The Thread.interrupt javadocs say that wait(), join() and sleep() will
all clear the interrupt flag and throw an InterruptedException.
This makes TinkerPop's reliance on the flag being set somewhat fragile.
I suspect all of those methods are common in database io code, and
TinkerPop, being a high level database layer, is susceptible to 3rd
party interpretations of interrupt semantics.

In some ways the TraversalInterruptionTest itself had to carefully reset
the flag with its usage of Thread.sleep().

My proposal is to mark the traversal itself as interrupted rather than
the thread and keep the logic contained to TinkerPop's space.

Another benefit is that the traversal.interrupt() can raise an event
that implementations can listen to. On receipt of the event they would
then be able to send a separate request to the database to cancel a
particular query. In my case it would be a nice way for Sqlg to tell
Postgresql or HSQLDB to cancel a particular query (the latest one the
traversal executed).

In many ways the semantics are the same. Currently for client code
wanting to interrupt a particular traversal it needs to have a reference
to the thread the traversal is executing in. Now instead it needs to
keep a reference to executing traversals and interrupt them directly.

Concretely: add Traversal.interrupt() and Traversal.isInterrupted(boolean
clearInterrupted).
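
A minimal sketch of what such a traversal-level flag could look like, assuming nothing about TinkerPop's actual classes. InterruptibleTraversal, onInterrupt and checkInterrupted are hypothetical names used only for illustration; only interrupt() and isInterrupted(boolean) mirror the proposal above.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a traversal that carries its own interrupt state.
final class InterruptibleTraversal {
    private final AtomicBoolean interrupted = new AtomicBoolean(false);
    private final List<Runnable> listeners = new CopyOnWriteArrayList<>();

    // Providers register a callback, for example to cancel the running database query.
    void onInterrupt(Runnable listener) {
        listeners.add(listener);
    }

    // Callable from any thread; marks the traversal rather than the thread.
    // Listeners fire only on the transition from not interrupted to interrupted.
    void interrupt() {
        if (interrupted.compareAndSet(false, true)) {
            listeners.forEach(Runnable::run);
        }
    }

    // Mirrors the proposed isInterrupted(boolean clearInterrupted).
    boolean isInterrupted(boolean clearInterrupted) {
        return clearInterrupted ? interrupted.getAndSet(false) : interrupted.get();
    }

    // A step would call this between traversers instead of Thread.interrupted().
    void checkInterrupted() {
        if (isInterrupted(false)) {
            throw new IllegalStateException("traversal interrupted");
        }
    }
}
```

Because the flag lives on the traversal, no third-party library can clear it by accident, which is the whole point of moving it off the thread.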

Caveat, I am not familiar with GremlinServer nor the complications
around interrupt there so perhaps I am missing something.

Thanks
Pieter


Re: [VOTE] TinkerPop 3.2.1 Release

2016-07-20 Thread pieter-gmail
Hi,

Ran all Sqlg's tests and the process and structure test suites.
But alas there are failures.

The TraversalInterruptionTest tests are failing on HSQLDB as the
Thread.interrupt() is intercepted by HSQLDB and the interrupt flag is reset.
The TraversalInterruptionTest tests themselves suffer from this, as their
own Thread.sleep() logic resets the interrupt flag and requires special
resetting. I'd say the current interrupt strategy needs rethinking.

TailTest.g_V_repeatXbothX_timesX3X_tailX7X fails. I added a few more
tests in Sqlg with a repeat followed by a tail step, all of which also fail.
Jason has already proposed a fix for this here
.

vote -1

Thanks
Pieter



On 19/07/2016 15:20, Stephen Mallette wrote:
> Hello,
>
> We are happy to announce that TinkerPop 3.2.1 is ready for release - note
> the lack of "-incubating" everywhere.  :)
>
> The release artifacts can be found at this location:
> https://dist.apache.org/repos/dist/dev/tinkerpop/3.2.1/
>
> The source distribution is provided by:
> apache-tinkerpop-3.2.1-src.zip
>
> Two binary distributions are provided for user convenience:
> apache-gremlin-console-3.2.1-bin.zip
> apache-gremlin-server-3.2.1-bin.zip
>
> The GPG key used to sign the release artifacts is available at:
> https://dist.apache.org/repos/dist/dev/tinkerpop/KEYS
>
> The online docs can be found here:
> http://tinkerpop.apache.org/docs/3.2.1/reference/ (user docs)
> http://tinkerpop.apache.org/docs/3.2.1/upgrade/ (upgrade docs)
> http://tinkerpop.apache.org/javadocs/3.2.1/core/ (core javadoc)
> http://tinkerpop.apache.org/javadocs/3.2.1/full/ (full javadoc)
>
> The tag in Apache Git can be found here:
>
> https://git-wip-us.apache.org/repos/asf?p=tinkerpop.git;a=tag;h=c5a9e2815e76f044e6b33b773b6bb0bb048270cc
>
> The release notes are available here:
> https://github.com/apache/tinkerpop/blob/3.2.1/CHANGELOG.asciidoc#release-3-2-1
>
> The [VOTE] will be open for the next 72 hours --- closing Friday (July 22,
> 2016) at 9:30 am EST.
>
> My vote is +1.
>
> Thank you very much,
> Stephen
>



Re: RepeatStep

2016-07-16 Thread pieter-gmail
More investigation shows that the new RepeatUnrollStrategy completely
removes the RepeatStep because of the times 0 logic.

For the global case this leaves behind the GraphStep and the HasStep, thus
returning a1.

For the LocalStep case the LocalSteps are now empty (after the
RepeatSteps have been removed), throwing a FastNoSuchElementException and
thus returning no elements.

However I think we first need to understand what the expected semantics
of a times(0) before the repeat() are.
I have a feeling that my previous passing tests were just getting lucky?

Thanks
Pieter


On 16/07/2016 18:37, pieter-gmail wrote:
> Hi,
>
> I am running the following tests on 3.2.1-SNAPSHOT
>
> @Test
> public void testRepeat() {
> final TinkerGraph g = TinkerGraph.open();
> Vertex a1 = g.addVertex(T.label, "A", "name", "a1");
> Vertex b1 = g.addVertex(T.label, "B", "name", "b1");
> Vertex b2 = g.addVertex(T.label, "B", "name", "b2");
> Vertex b3 = g.addVertex(T.label, "B", "name", "b3");
> a1.addEdge("ab", b1);
> a1.addEdge("ab", b2);
> a1.addEdge("ab", b3);
> Vertex c1 = g.addVertex(T.label, "C", "name", "c1");
> Vertex c2 = g.addVertex(T.label, "C", "name", "c2");
> Vertex c3 = g.addVertex(T.label, "C", "name", "c3");
> b1.addEdge("bc", c1);
> b1.addEdge("bc", c2);
> b1.addEdge("bc", c3);
>
> //this passes   
> List vertices =
> g.traversal().V().hasLabel("A").times(0).repeat(out("ab").out("bc")).toList();
> assertEquals(1, vertices.size());
> assertTrue(vertices.contains(a1));
>
> //this fails
> vertices =
> g.traversal().V().hasLabel("A").local(__.times(0).repeat(out("ab").out("bc"))).toList();
> assertEquals(1, vertices.size());
> assertTrue(vertices.contains(a1));
> }
>
> Previously on 3.2.0-incubating the test passed.
>
> Is this a bug or a new interpretation of the while do zero times logic?
>
> Thanks
> Pieter
>
>
>



Re: RepeatStep

2016-07-16 Thread pieter-gmail
Hi,

I am running the following tests on 3.2.1-SNAPSHOT

@Test
public void testRepeat() {
    final TinkerGraph g = TinkerGraph.open();
    Vertex a1 = g.addVertex(T.label, "A", "name", "a1");
    Vertex b1 = g.addVertex(T.label, "B", "name", "b1");
    Vertex b2 = g.addVertex(T.label, "B", "name", "b2");
    Vertex b3 = g.addVertex(T.label, "B", "name", "b3");
    a1.addEdge("ab", b1);
    a1.addEdge("ab", b2);
    a1.addEdge("ab", b3);
    Vertex c1 = g.addVertex(T.label, "C", "name", "c1");
    Vertex c2 = g.addVertex(T.label, "C", "name", "c2");
    Vertex c3 = g.addVertex(T.label, "C", "name", "c3");
    b1.addEdge("bc", c1);
    b1.addEdge("bc", c2);
    b1.addEdge("bc", c3);

    // this passes
    List vertices =
        g.traversal().V().hasLabel("A").times(0).repeat(out("ab").out("bc")).toList();
    assertEquals(1, vertices.size());
    assertTrue(vertices.contains(a1));

    // this fails
    vertices =
        g.traversal().V().hasLabel("A").local(__.times(0).repeat(out("ab").out("bc"))).toList();
    assertEquals(1, vertices.size());
    assertTrue(vertices.contains(a1));
}

Previously on 3.2.0-incubating the test passed.

Is this a bug or a new interpretation of the while do zero times logic?

Thanks
Pieter