Hi,
> SubgraphStrategyTest contains test cases that access both vertices incident
> on edges which isn't allowed under GraphComputer semantics. It silently
> passes in GraphComputerTest because the TP3 implementations ignore those
> cases. I believe implementations should throw an exception on such access.
> This example demonstrates how easy it is to end up with difficult to find
> bugs otherwise. We would put a high burden on the user to write unit tests
> for their strategies to make sure they detect those cases where it makes a
> difference.
Yes. I agree that StarGraph should throw an exception when, right now, it
returns Collections.emptyIterator().
I will add to the GraphComputer TestSuite verification of the topology of the
"star vertex" so that all vendors have the same accessible data and the same
Exceptions thrown.
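
To illustrate the difference between the two behaviors, here is a minimal sketch. The classes StarVertexSketch and AdjacentVertexSketch are hypothetical and are not TinkerPop's StarGraph implementation. They only model the idea that a neighbor of the star vertex exposes its id but not its own edges.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified model of a star vertex; these are NOT TinkerPop classes.
final class StarVertexSketch {
    private final Object id;
    private final List<AdjacentVertexSketch> adjacent = new ArrayList<>();

    StarVertexSketch(Object id) {
        this.id = id;
    }

    // Record an edge from this vertex to a neighbor known only by its id.
    AdjacentVertexSketch addEdgeTo(Object neighborId) {
        AdjacentVertexSketch neighbor = new AdjacentVertexSketch(neighborId);
        adjacent.add(neighbor);
        return neighbor;
    }

    Object id() {
        return id;
    }

    List<AdjacentVertexSketch> adjacentVertices() {
        return Collections.unmodifiableList(adjacent);
    }
}

// A neighbor of the star vertex: only its id is available locally.
final class AdjacentVertexSketch {
    private final Object id;

    AdjacentVertexSketch(Object id) {
        this.id = id;
    }

    // The id is part of the star, so it is always accessible.
    Object id() {
        return id;
    }

    // Lenient behavior: looks like "this vertex has no edges", which hides the bug.
    Iterator<Object> edgesLenient() {
        return Collections.emptyIterator();
    }

    // Strict behavior: fails loudly because the data is outside the star.
    Iterator<Object> edgesStrict() {
        throw new IllegalStateException(
                "Edges of adjacent vertex " + id + " are not accessible from the star vertex");
    }
}
```

With the lenient variant, a filter that inspects the neighbor's edges silently sees nothing. With the strict variant, the same filter fails on first access, which is what a shared test suite could then assert for every vendor.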
> Moreover, we should think about such FilterStrategies in general. Should
> there be a way to restrict them to just stargraph access to ensure that
> they always work in GraphComputer? Basically, I think it would be great to
> avoid a case where a strategy works in OLTP but doesn't in OLAP because
> most people would probably implement it for OLTP first. This is
> particularly true for SubgraphStrategies where you try to restrict access.
There are various situations in which OLAP and OLTP do not line up. However, I
don't think we should restrict OLTP to what OLAP can handle, primarily because
many of the limitations of OLAP will be solved in the future. Here are the
limitations as I know them:
1. OLAP does not support mutating steps (addV(), addE(), etc.). This is
because we haven't come up with a good model for mutating in OLAP.
- The latest idea on this is the "GraphMutator" interface that
all OLAP vendors should supply, blah blah.
- In essence, it will happen, just not right now (3.1+).
2. OLAP does not support lambdas. This is because Java8 lambdas are not
serializable.
- We can fake this by passing Groovy strings, but it's not
"true" Gremlin-Java8.
- Kryo3 supposedly figured out a way to serialize Java8 lambdas
-- perhaps our savior?
- In essence, it will happen, just not right now.
3. OLAP does not support mid-traversal barriers (e.g.
g.V.count().is(gt(100))). This is because a barrier step requires a MapReduce
and we can't go MapReduce -> VertexProgram.
- There is a ticket to support OLAP->OLTP->OLAP which will allow
this.
- In essence, it will happen, just not right now.
4. OLAP does not support non-graph object processing (e.g.
__.(1,2,3,4).order(local).by(incr).sum() in OLAP (i.e. no graph objects)).
- This is possible (and actually not that hard) and there is a
ticket for it.
- In essence, it will happen, just not right now.
5. OLAP localTraversals cannot leave "the star graph."
- This is because we don't have that data available locally,
though I don't see why we can't solve this with "TraverserSet -- scoping" in
OLAP.
- The solution would be complex to write, but doable. The way
TraversalMatrix works in OLAP, the code is staged for this.
- In essence, it will happen, just not right now.
6. OLAP does not support MatchStep.
- There are currently no thoughts on solving this, but I suspect
it's a natural fallout once 4 is complete.
- In essence, it will happen, just not right now.
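
On item 2, a small plain-Java sketch of the serialization issue. It assumes standard Java serialization (not Kryo), and the class name LambdaSerializationSketch is made up for illustration. A lambda is not serializable by default. It only becomes so when its target type also includes Serializable, for example through an intersection cast.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.function.Function;

// Illustrative only: shows how java.io serialization treats Java 8 lambdas.
final class LambdaSerializationSketch {

    // A plain lambda: its target type (Function) is not Serializable.
    static Function<Integer, Integer> plainLambda() {
        return x -> x + 1;
    }

    // Same lambda, but the intersection cast makes the target type Serializable.
    static Function<Integer, Integer> serializableLambda() {
        return (Function<Integer, Integer> & Serializable) x -> x + 1;
    }

    // Returns true if the object can be written with standard Java serialization.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The catch is that the caller has to opt in at the point where the lambda is written, so an arbitrary user lambda handed to a traversal still cannot be assumed to serialize.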
So, while OLAP and OLTP differ, I would not restrict OLTP to be what OLAP can
do. What we could do is make the ComputerVerificationStrategy's analysis easily
accessible to users. For instance:
ComputerVerificationStrategy.check(myTraversal) // which is simply a
call to ComputerVerificationStrategy.instance().apply(myTraversal) :)
This way, a user can easily determine if their traversal will work on OLAP
without having to actually try and execute the job. One of the things that
Daniel Kuppitz is working on is making sure that ComputerVerificationStrategy
is able to identify the 5 areas above and throw the appropriate exception
explaining "why" to the user. As we move forward with 3.1, 3.2, 3.3, etc.,
hopefully the list becomes smaller and smaller.
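
A rough sketch of what such an up-front check could look like, in plain Java. Everything here is hypothetical: OlapCheckSketch is not a TinkerPop class, a traversal is modeled as a simple list of step names, and only limitations 1, 3, and 6 from the list above are covered. Treating count and sum as the barrier steps is also an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a verification helper; NOT the TinkerPop API.
final class OlapCheckSketch {
    // Step name -> reason it cannot run in OLAP (limitations 1 and 6).
    private static final Map<String, String> UNSUPPORTED = new LinkedHashMap<>();
    // Steps treated as barriers in this sketch (limitation 3).
    private static final Set<String> BARRIERS = new HashSet<>(Arrays.asList("count", "sum"));

    static {
        UNSUPPORTED.put("addV", "mutating steps are not supported in OLAP");
        UNSUPPORTED.put("addE", "mutating steps are not supported in OLAP");
        UNSUPPORTED.put("match", "MatchStep is not supported in OLAP");
    }

    // Collect every reason the given steps would fail, without executing anything.
    static List<String> problems(List<String> steps) {
        List<String> found = new ArrayList<>();
        for (int i = 0; i < steps.size(); i++) {
            String step = steps.get(i);
            String reason = UNSUPPORTED.get(step);
            if (reason != null) {
                found.add(step + ": " + reason);
            }
            boolean isLast = i == steps.size() - 1;
            if (BARRIERS.contains(step) && !isLast) {
                found.add(step + ": mid-traversal barriers are not supported in OLAP");
            }
        }
        return found;
    }

    // Throw with an explanation of "why" if the steps cannot run in OLAP.
    static void check(List<String> steps) {
        List<String> found = problems(steps);
        if (!found.isEmpty()) {
            throw new IllegalArgumentException("Not valid for OLAP: " + found);
        }
    }
}
```

For example, OlapCheckSketch.check(Arrays.asList("V", "count", "is")) throws and names the mid-traversal barrier, while the same steps ending in "count" pass.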
Thanks,
Marko.
http://markorodriguez.com