The branch is TINKERPOP-1404
https://github.com/apache/tinkerpop/commit/c1556fe82c58527dc4425d23d1d69ce324e62cfa

Cheers
Pieter

On 01/11/2016 15:23, Marko Rodriguez wrote:
> What branch are you in? Perhaps give me URLs (to GitHub) to the files 
> touched? (if its not too many)
>
> Marko.
>
> http://markorodriguez.com
>
>
>
>> On Nov 1, 2016, at 7:19 AM, pieter-gmail <pieter.mar...@gmail.com> wrote:
>>
>> Hi,
>>
>> Yes I am but afraid I do not have time at present to concentrate on it.
>> I just noticed your ImmutablePath ticket which will overlap with some of
>> what I have done.
>>
>> I'd suggest to pull my branch and look at what I did there. It was very
>> little, but dangerous code, which is why I was reluctant to submit a PR
>> at first. If you don't continue with it, I should in 2 or 3 weeks be
>> able to look at it again.
>>
>> Thanks
>> Pieter
>>
>>
>> On 01/11/2016 15:11, Marko Rodriguez wrote:
>>> Hi Pieter,
>>>
>>> I’m still really interested in your work in this area. Are you still doing 
>>> this?
>>>
>>> Marko.
>>>
>>> http://markorodriguez.com
>>>
>>>
>>>
>>>> On Aug 7, 2016, at 9:12 AM, Pieter Martin <pieter.mar...@gmail.com> wrote:
>>>>
>>>> To avoid the collection logic alltogether. For most steps there is no need 
>>>> to
>>>> check the labels as it is known that they are added, immutable and correct.
>>>>
>>>> Also with the current strategy the `ImmutablePath.currentLabels` is 
>>>> exactly the
>>>> same collection as that of the step. It is not a copy.
>>>>
>>>> `Traverser.addLabels()` is only called for the steps previously mentioned. 
>>>> They
>>>> too can be optimized as there is no need to create a new `ImmutablePath` 
>>>> just to
>>>> set the labels. There could be a `Traverer.split()` method that creates the
>>>> initial `ImmutablePath` with the correct labels to start with. As things 
>>>> stand
>>>> now the `ImmutablePath` is created just to be replaced 1 or 2 millis 
>>>> later. I
>>>> have however not benchmarked any of those queries so am not touching that 
>>>> at
>>>> the moment.
>>>>
>>>> In some ways its a matter of style. The label logic in `AbstractStep` is 
>>>> not
>>>> relevant to all of its subclasses and should be higher in the inheritance
>>>> hierarchy.
>>>>
>>>> Cheers
>>>> Pieter
>>>>
>>>>
>>>> Excerpts from Marko Rodriguez's message of August 7, 2016 3:54 :
>>>>> Why not just check to see if the labels to be added already exist, if 
>>>>> they do, don’t addLabels() and thus, don’t create a new collection.
>>>>> Marko.
>>>>> http://markorodriguez.com
>>>>>> On Aug 7, 2016, at 6:07 AM, Pieter Martin <pieter.mar...@gmail.com> 
>>>>>> wrote:
>>>>>> Here is what I have come up with so far.
>>>>>> The idea is that `Traverser.split(r, step)` already copies the labels to 
>>>>>> the
>>>>>> traverser so there is no need to call `Traverser.addLabels(labels)` 
>>>>>> again.
>>>>>> I removed the `Traverser.addLabels(labels)` call from `AbstractStep`.
>>>>>> For the traversers that do not call `Traverer.split(r, step)` I manually 
>>>>>> added
>>>>>> the `traverser.addLabels(labels)` call in `processNextStart()`. This was 
>>>>>> done
>>>>>> by fixing test failures rather than searching and investigating all 
>>>>>> calls to
>>>>>> `Traverser.split()`.
>>>>>> The following steps needed the labels to be added manually.
>>>>>> `AggregateStep`
>>>>>> `CollectingBarrierStep`
>>>>>> `FilterStep`
>>>>>> `SideEffectStep`
>>>>>> `StartStep`
>>>>>> `NoOpBarrierStep`
>>>>>> Further seeing as `Step.getLabels()` already returns a unmodifiable 
>>>>>> collection and
>>>>>> `ImmutablePath` is well immutable there is no need for it to have its 
>>>>>> own copy
>>>>>> of the labels. I set the labels directly on the path as oppose to making 
>>>>>> a copy.
>>>>>> `TinkerGraphProcessComputerTest` passes.
>>>>>> `TinkerGraphProcessStandardTest` passes.
>>>>>> `TinkerGraphStructureStandardTest` has one failure.
>>>>>> `SerializationTest$GryoTest.shouldSerializePathAsDetached` fails with
>>>>>> `java.lang.IllegalArgumentException: Class is not registered:
>>>>>> java.util.Collections$UnmodifiableSet`
>>>>>> Let me know what you think.
>>>>>> Thanks
>>>>>> Pieter
>>>>>> Excerpts from Ted Wilmes's message of August 7, 2016 12:16 :
>>>>>>> Neat find Pieter.  With regards to the update to ImmutablePath, I think
>>>>>>> that defensive copying of inbound collections is generally a good idea 
>>>>>>> but
>>>>>>> if we can target specific areas where we can reap big gains from not
>>>>>>> creating new collections it may be worth it to relax that constraint,
>>>>>>> especially if we're only talking about internal APIs.  If we do relax 
>>>>>>> that
>>>>>>> constraint though, we'll need to careful during code review and unit
>>>>>>> testing because those bugs can be difficult to track down.  The other 
>>>>>>> nice
>>>>>>> thing that the defensive copy gets us, particularly in the case of the
>>>>>>> ImmutablePath, is that it isolates ImmutablePath from whatever the 
>>>>>>> subclass
>>>>>>> of set was that the caller passed in.  I think that's what is causing 
>>>>>>> the
>>>>>>> serialization test failure in this case since the caller passed in an
>>>>>>> unmodifiable set.
>>>>>>> --Ted
>>>>>>> On Fri, Aug 5, 2016, 2:31 PM Marko Rodriguez <okramma...@gmail.com> 
>>>>>>> wrote:
>>>>>>>> Hello,
>>>>>>>> This is cool. Check out also ImmutablePath.extend(labels) as that is
>>>>>>>> ultimately what Traverser.addLabels() calls. We have a lot of set 
>>>>>>>> copying
>>>>>>>> and I don’t know if its needed (as you seem to be demonstrating). What 
>>>>>>>> I
>>>>>>>> don’t like about your solution is the explicit reference to the
>>>>>>>> B_L_P…Traverser in AbstractStep. See if you can work your solution 
>>>>>>>> without
>>>>>>>> it.
>>>>>>>> Good luck,
>>>>>>>> Marko.
>>>>>>>> http://markorodriguez.com
>>>>>>>>> On Aug 5, 2016, at 12:44 PM, pieter-gmail <pieter.mar...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>> Sorry forgot to add a rather important part.
>>>>>>>>>
>>>>>>>>> I changed ImmutablePath's constructor to
>>>>>>>>>
>>>>>>>>>   private ImmutablePath(final ImmutablePathImpl previousPath, final
>>>>>>>>> Object currentObject, final Set<String> currentLabels) {
>>>>>>>>>       this.previousPath = previousPath;
>>>>>>>>>       this.currentObject = currentObject;
>>>>>>>>>       this.currentLabels = currentLabels;
>>>>>>>>> //        this.currentLabels.addAll(currentLabels);
>>>>>>>>>   }
>>>>>>>>>
>>>>>>>>> Setting the collection directly as oppose to `addAll`
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Pieter
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 05/08/2016 20:40, pieter-gmail wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I have been optimizing Sqlg of late and eventually arrived at 
>>>>>>>>>> TinkerPop
>>>>>>>>>> code.
>>>>>>>>>>
>>>>>>>>>> The gremlin in particular that I am interested is path queries.
>>>>>>>>>>
>>>>>>>>>> Here is the test that I am running in jmh.
>>>>>>>>>>
>>>>>>>>>>       //@Setup
>>>>>>>>>>       Vertex a = graph.addVertex(T.label, "A", "name", "a1");
>>>>>>>>>>       for (int i = 1; i < 1_000_001; i++) {
>>>>>>>>>>           Vertex b = graph.addVertex(T.label, "B", "name", "name_" +
>>>>>>>> i);
>>>>>>>>>>           a.addEdge("outB", b);
>>>>>>>>>>           for (int j = 0; j < 1; j++) {
>>>>>>>>>>               Vertex c = graph.addVertex(T.label, "C", "name", 
>>>>>>>>>> "name_"
>>>>>>>>>> + i + " " + j);
>>>>>>>>>>               b.addEdge("outC", c);
>>>>>>>>>>           }
>>>>>>>>>>       }
>>>>>>>>>>
>>>>>>>>>> And the query being benchmarked is
>>>>>>>>>>
>>>>>>>>>>       GraphTraversal<Vertex, Path> traversal =
>>>>>>>>>> g.V(a).as("a").out().as("b").out().as("c").path();
>>>>>>>>>>       while (traversal.hasNext()) {
>>>>>>>>>>           Path path = traversal.next();
>>>>>>>>>>       }
>>>>>>>>>>
>>>>>>>>>> Before the optimization, (as things are now)
>>>>>>>>>>
>>>>>>>>>> Benchmark                                 Mode Cnt  Score Error
>>>>>>>> Units
>>>>>>>>>> GremlinPathBenchmark.g_path  avgt  100  1.086 ± 0.020   s/op
>>>>>>>>>>
>>>>>>>>>> The optimization I did is in 
>>>>>>>>>> AbstractStep.prepareTraversalForNextStep,
>>>>>>>>>> to not call addLabels() for path gremlins as the labels are known by 
>>>>>>>>>> the
>>>>>>>>>> step and do not change again so there is not need to keep adding 
>>>>>>>>>> them.
>>>>>>>>>>
>>>>>>>>>>   private final Traverser.Admin<E> prepareTraversalForNextStep(final
>>>>>>>>>> Traverser.Admin<E> traverser) {
>>>>>>>>>>       if (!this.traverserStepIdAndLabelsSetByChild) {
>>>>>>>>>>           traverser.setStepId(this.nextStep.getId());
>>>>>>>>>>           if (traverser instanceof B_LP_O_P_S_SE_SL_Traverser) {
>>>>>>>>>>           } else {
>>>>>>>>>>               traverser.addLabels(this.labels);
>>>>>>>>>>           }
>>>>>>>>>>       }
>>>>>>>>>>       return traverser;
>>>>>>>>>>   }
>>>>>>>>>>
>>>>>>>>>> After optimization,
>>>>>>>>>>
>>>>>>>>>> Benchmark                                 Mode Cnt  Score Error
>>>>>>>> Units
>>>>>>>>>> GremlinPathBenchmark.g_path  avgt  100  0.680 ± 0.004   s/op
>>>>>>>>>>
>>>>>>>>>> 1.086 vs 0.689 seconds for the traversal.
>>>>>>>>>>
>>>>>>>>>> I ran the Structured and Process test suites. 2 tests are failing 
>>>>>>>>>> with
>>>>>>>>>> this optimization.
>>>>>>>>>>
>>>>>>>>>> InjectTest.g_VX1X_out_name_injectXdanielX_asXaX_mapXlengthX_path 
>>>>>>>>>> fails
>>>>>>>> with
>>>>>>>>>> "java.lang.IllegalArgumentException: The step with label a does not
>>>>>>>> exist"
>>>>>>>>>> and
>>>>>>>>>>
>>>>>>>>>> SerializationTest.shouldSerializePathAsDetached fails with
>>>>>>>>>>
>>>>>>>>>> "Caused by: java.lang.IllegalArgumentException: Class is not
>>>>>>>> registered:
>>>>>>>>>> java.util.Collections$UnmodifiableSet"
>>>>>>>>>>
>>>>>>>>>> Before investigating the failures is this optimization worth 
>>>>>>>>>> pursuing?
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Pieter
>>>>>>>>>>

Reply via email to