Re: TraversalStrategies.setTraverserGeneratorFactory's removal

pieter-gmail Thu, 07 Apr 2016 10:58:41 -0700

Ok thanks, let me try that.

Pieter


On 07/04/2016 19:54, Marko Rodriguez wrote:
> Hi,
>
> There is ImmutablePath and MutablePath. ImmutablePath is used in OLTP and is 
> much more efficient in terms of space & time than MutablePath. You can create 
> either one as you please; you just do:
>
> Path path = XXXPath.make()
> path = path.extend(…)
> path = path.extend(…)
>
> However, I don't recommend you work at that level. What I would recommend you 
> do is this:
>
> public class MyBigSelectLabelStep<S,E> {
>
>   Traverser traverser = this.starts.next();
>   Map<String,Object> result = doLowLevelProviderSpecificStuff(traverser);
>   // simulate GraphStep
>   traverser = traverser.split(result.get("a"), EmptyStep.instance());
>   traverser.addLabels("a");
>   // simulate VertexStep
>   traverser = traverser.split(result.get("b"), EmptyStep.instance());
>   traverser.addLabels("b");
>   // simulate SelectStep
>   traverser = traverser.split(result, EmptyStep.instance());
>   return traverser;
>
> }
>
> This handles all the low-level mechanisms of generating a traverser that 
> looks like it went through multiple steps even though it only went through 
> one. This is hand-typed from memory of the API so please be aware that I 
> might have gotten an argument wrong or something. Also, out() is a 
> FlatMapStep and thus processes an iterator of results -- you will have to be 
> smart to flatten that iterator… see FlatMapStep's implementation for how this 
> is typically handled in TinkerPop.
>
> HTH,
> Marko. 
>
> http://markorodriguez.com
>
> On Apr 7, 2016, at 10:59 AM, pieter-gmail <pieter.mar...@gmail.com> wrote:
>
>> Hi,
>>
>> I have been working on this without using a custom traverser. It is fine
>> but I do need to be able to set the path of the traverser.
>>
>> We had a ticket for this previously here
>> <https://issues.apache.org/jira/browse/TINKERPOP-766>. I mentioned there
>> that I no longer needed the path setter and you mentioned that is ok but
>> it was never done.
>>
>> To give some background.
>>
>> g.V(a1).as("a").out().as("b").select("a", "b")
>>
>> This will be compiled to one step, which means that the collapsed
>> steps' label information is somewhat obfuscated and the path is
>> incorrectly calculated by the traverser.
>> However, the label information is not lost: I recalculate the path but
>> still need to set it on the traverser.
>>
>> Is it still ok to make the path mutable?
>>
>> Thanks
>> Pieter
>>
>> On 30/03/2016 23:19, Marko Rodriguez wrote:
>>> Hi,
>>>
>>> So Titan does something similar where it takes a row in Cassandra and turns 
>>> it into Traversers. It uses FlatMapStep to do so where the iterator in 
>>> FlatMapStep is a custom iterator that knows how to do data conversions.
>>>
>>> Would something like that help?
>>>
>>> If not and you really need your own TraverserGenerator, then you can use 
>>> reflection to set it in DefaultTraversal. It's a private member now.
>>>
>>> Moving forward, I would highly recommend you don't create classes so low in 
>>> the stack. Graph database providers should only create (if necessary):
>>>
>>>     1. Steps that extend non-final TinkerPop steps.
>>>     2. TraversalStrategies that implement ProviderOptimizationStrategy.
>>>     3. Classes that extend Graph, Vertex, Edge, Property, VertexProperty, 
>>> and GraphComputer.
>>>     4. Their own InputFormat or InputRDD if they want to have Spark/Giraph 
>>> work against them.
>>>
>>> Anything beyond that (I think) is starting to get into murky territory.
>>>
>>> Marko.
>>>
>>> http://markorodriguez.com
>>>
>>> On Mar 30, 2016, at 2:46 PM, pieter-gmail <pieter.mar...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I need it to keep state. Mapping sql ResultSet's grid nature to a graph
>>>> nature gets complex quickly and being able to store and manipulate the
>>>> state of the traverser makes it easier. Also it was possible and the
>>>> solution presented itself to me as such.
>>>>
>>>> A single row, i.e. AbstractStep.processNextStart(), from a sql ResultSet
>>>> might map to many traversers. This is at the heart of what makes
>>>> Sqlg a worthwhile enterprise: fetching lots of data (reducing latency
>>>> cost) in a denormalized manner and mapping it to a graph format.
>>>>
>>>> To manage this I needed to store state and add a custom method
>>>> "customSplit(...)". customSplit is similar to split() but it uses and
>>>> updates the said state. The custom traverser also keeps additional state
>>>> of where we are with respect to the processNextStart() (a sql row) as I
>>>> need it in order to calculate the next traverser from the same row.
>>>>
>>>> So if all this becomes impossible there is bound to be a different
>>>> solution to the same problem but it would require quite some thinking
>>>> and effort.
>>>>
>>>> With a little bit of arrogance and ignorance, perhaps letting OLAP
>>>> constraints leak into OLTP is not a good idea. I'd say OLTP is 99% of
>>>> use-cases so whatever these serialization issues are they ought to be
>>>> contained to OLAP.
>>>>
>>>> Thanks
>>>> Pieter
>>>>
>>>>
>>>>
>>>> On 30/03/2016 21:38, Marko Rodriguez wrote:
>>>>> Hello Pieter,
>>>>>
>>>>>> SqlgGraph 
>>>>>> <https://github.com/pietermartin/sqlg/blob/schema/sqlg-core/src/main/java/org/umlg/sqlg/structure/SqlgGraph.java>
>>>>>> invokes the following in a static code block:
>>>>>>
>>>>>> static {
>>>>>>     TraversalStrategies.GlobalCache.registerStrategies(Graph.class,
>>>>>>         TraversalStrategies.GlobalCache.getStrategies(Graph.class).clone()
>>>>>>             .addStrategies(new SqlgVertexStepStrategy()));
>>>>>>     TraversalStrategies.GlobalCache.registerStrategies(Graph.class,
>>>>>>         TraversalStrategies.GlobalCache.getStrategies(Graph.class).clone()
>>>>>>             .addStrategies(new SqlgGraphStepStrategy()));
>>>>>>     TraversalStrategies.GlobalCache.getStrategies(Graph.class)
>>>>>>         .setTraverserGeneratorFactory(new SqlgTraverserGeneratorFactory());
>>>>>> }
>>>>> This all looks great except the TraverserGeneratorFactory. Traverser 
>>>>> classes are so low-level and so tied to serialization code in OLAP that I 
>>>>> removed all concept of users being able to create traverser species. I 
>>>>> need full control at that level to maneuver.
>>>>>
>>>>> I really need to create a section in the docs that says stuff like:
>>>>>
>>>>>   * Graph System Providers: only implement steps that extend non-final 
>>>>> TinkerPop-steps (e.g. GraphStep, VertexStep, etc.).
>>>>>   * Graph Language Providers: only have Traversal.steps() that can be 
>>>>> represented as a composition of TinkerPop-steps.
>>>>>
>>>>> When providers get too low level, then it's hard for us to maneuver and 
>>>>> optimize and move forward with designs. There are so many assumptions in 
>>>>> the code that we make around Traverser instances, Step interfaces, etc. 
>>>>> that if people just make new ones, then strategies, serialization, etc. 
>>>>> break down.
>>>>>
>>>>> The question I have is: why do you have your own Traverser 
>>>>> implementation? I can't imagine a reason a provider would need their own 
>>>>> traverser class.
>>>>>
>>>>> Thanks,
>>>>> Marko.
>>>>>
>>>>> http://markorodriguez.com
>
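
Note on the extend-style path API mentioned in the thread (path = path.extend(...)): the reassignment matters because an immutable path returns a new value on every extend and leaves the old one untouched. The sketch below illustrates that idea only. SketchPath and all of its methods are hypothetical stand-ins written for this note; they are not TinkerPop's ImmutablePath or MutablePath, and the real classes may differ in signatures and behaviour.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical persistent path: each extend() allocates one node that
// points back at the previous path, so earlier paths are never modified.
final class SketchPath {
    private final SketchPath previous;   // shared tail; null only for the empty path
    private final Object object;         // object visited at this position
    private final Set<String> labels;    // step labels attached to this position
    private final int size;

    private SketchPath(SketchPath previous, Object object, Set<String> labels) {
        this.previous = previous;
        this.object = object;
        this.labels = labels;
        this.size = (previous == null) ? 0 : previous.size + 1;
    }

    // Creates the empty path.
    static SketchPath make() {
        return new SketchPath(null, null, Collections.<String>emptySet());
    }

    // Returns a NEW path with one more position; 'this' is left unchanged.
    SketchPath extend(Object object, Set<String> labels) {
        return new SketchPath(this, object,
                Collections.unmodifiableSet(new HashSet<>(labels)));
    }

    int size() {
        return size;
    }

    // Objects in traversal order (oldest first).
    List<Object> objects() {
        List<Object> out = new ArrayList<>();
        for (SketchPath p = this; p.previous != null; p = p.previous) {
            out.add(p.object);
        }
        Collections.reverse(out);
        return out;
    }

    // Most recent object carrying the given label.
    Object get(String label) {
        for (SketchPath p = this; p.previous != null; p = p.previous) {
            if (p.labels.contains(label)) {
                return p.object;
            }
        }
        throw new IllegalArgumentException("no position labeled " + label);
    }
}
```

With this shape, recomputing a path for collapsed steps amounts to starting from make() and calling extend once per simulated step, passing that step's labels.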

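Note on the FlatMapStep suggestion in the thread: the general pattern is that each incoming item (for example one denormalized sql row) expands into an iterator of results, and the step drains that iterator before pulling the next item. The sketch below shows only that flattening pattern in plain Java. FlatteningIterator and its constructor arguments are hypothetical names for this note; this is not TinkerPop's FlatMapStep, which should be consulted for how traversers are actually split.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Hypothetical lazy flattener: S is the incoming item type (e.g. a row),
// E is the type of each result produced from it.
final class FlatteningIterator<S, E> implements Iterator<E> {
    private final Iterator<S> starts;                 // upstream items
    private final Function<S, Iterator<E>> expand;    // one item -> many results
    private Iterator<E> current = Collections.emptyIterator();

    FlatteningIterator(Iterator<S> starts, Function<S, Iterator<E>> expand) {
        this.starts = starts;
        this.expand = expand;
    }

    @Override
    public boolean hasNext() {
        // Skip over items that expand to nothing; stop when upstream is exhausted.
        while (!current.hasNext()) {
            if (!starts.hasNext()) {
                return false;
            }
            current = expand.apply(starts.next());
        }
        return true;
    }

    @Override
    public E next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}
```

Because the inner iterator is only created when the previous one is exhausted, any per-row state can live inside the iterator returned by expand rather than on a custom traverser.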