Re: Async capabilities to TinkerPop

Joshua Shinavier Thu, 28 Jul 2022 16:38:56 -0700

Well, the wrapper I mentioned before did not require a full rewrite of
TinkerPop :-) Rather, it provided async interfaces for vertices and edges,
on which operations like subgraph and shortest-path queries were evaluated
in an asynchronous fashion (using a special language, as it happened, but
limited Gremlin queries would have been an option). So I think a basic
async API might be a useful starting point even if it doesn't go very deep.


Josh


On Thu, Jul 28, 2022 at 4:21 PM Oleksandr Porunov wrote:

> Hi Joshua and Pieter,
>
> Thank you for joining the conversation!
>
> I haven't looked into the implementation details yet, but after quickly
> checking the Traversal.java code I think Pieter is right here.
> For some reason I thought we could simply wrap the synchronous methods in
> asynchronous ones, basically something like:
>
> // the method which should be implemented by a graph provider
> Future<E> executeAsync(Callable<E> func);
>
> // default async terminal step which delegates to the synchronous next()
> default Future<E> asyncNext() {
>     return executeAsync(this::next);
> }
>
> but after checking that code I think I was wrong. Different steps may
> execute different logic (i.e. different underlying storage queries) for
> different graph providers.
> Thus, wrapping only the terminal steps in async functions most likely
> won't solve the problem.
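
As a rough, runnable sketch of the wrapper idea quoted above: the names here (SketchTraversal, PooledTraversal, lastWorker) are hypothetical and are not TinkerPop APIs, and CompletableFuture stands in for the plain Future of the snippet. It is only meant to illustrate why wrapping the terminal step alone is limited: the synchronous next() still runs, and still blocks, on some thread.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;

// Hypothetical stand-in for a traversal; NOT TinkerPop's Traversal interface.
interface SketchTraversal<E> {
    // Synchronous terminal step; in a real provider this may block on storage I/O.
    E next();

    // To be implemented by the graph provider.
    CompletableFuture<E> executeAsync(Callable<E> func);

    // Async variant that simply wraps the synchronous terminal step.
    default CompletableFuture<E> asyncNext() {
        return executeAsync(this::next);
    }
}

// Hypothetical provider that "mimics" async by running next() on a thread pool.
class PooledTraversal<E> implements SketchTraversal<E> {
    private final Iterator<E> source;
    private final ExecutorService pool;
    volatile String lastWorker; // name of the thread that last ran next()

    PooledTraversal(List<E> data, ExecutorService pool) {
        this.source = data.iterator();
        this.pool = pool;
    }

    @Override
    public E next() {
        lastWorker = Thread.currentThread().getName();
        return source.next();
    }

    @Override
    public CompletableFuture<E> executeAsync(Callable<E> func) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return func.call();
            } catch (Exception e) {
                throw new CompletionException(e);
            }
        }, pool);
    }
}
```

With this shape, asyncNext() frees the caller's thread, but any storage query issued inside next() would still occupy the worker thread for its full duration.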
>
> I guess it will require re-writing or extending all steps to be able to
> pass an async state instead of a sync state.
>
> I'm not familiar enough with the TinkerPop code yet to claim that, so I
> could be wrong.
> I will need to research it a bit more to find out, but I think that Pieter
> is most likely right about a massive re-write.
>
> Nevertheless, even if that requires a massive re-write, I would be eager
> to get the ball rolling.
> I think we either need to try to implement async execution in TinkerPop 3
> or start making some concrete decisions regarding TinkerPop 4.
>
> I see Marko A. Rodriguez started working on RxJava back in 2019 here:
> https://github.com/apache/tinkerpop/tree/4.0-dev/java/machine/processor/rxjava/src/main/java/org/apache/tinkerpop/machine/processor/rxjava
>
> but that work didn't get very far, as far as I understand. I guess it would
> be good to know whether we want to completely rewrite TinkerPop in version
> 4 or not.
>
> If we want to completely rewrite TinkerPop in version 4 then I assume it
> may take quite some time. In that case I would lean towards implementing
> async functionality in TinkerPop 3, even if it requires rewriting all
> steps.
>
> If TinkerPop 4 is instead a redevelopment with breaking changes, but not a
> rewrite of the whole functionality, then I guess we could work on
> TinkerPop 4 by introducing async functionality and maybe applying more
> breaking changes in places where it's better to re-work some parts.
>
> Best regards,
> Oleksandr
>
>
> On Thu, Jul 28, 2022 at 7:47 PM pieter gmail wrote:
>
>> Hi,
>>
>> Does this not imply a massive rewrite of TinkerPop? In particular,
>> wouldn't the iterator chaining pattern of steps have to follow a reactive
>> style of coding?
>>
>> Cheers
>> Pieter
>>
>>
>> On Thu, 2022-07-28 at 15:18 +0100, Oleksandr Porunov wrote:
>> > I'm interested in adding async capabilities to TinkerPop.
>> >
>> > There were many discussions about async capabilities for TinkerPop but
>> > there was no clear consensus on how and when they should be developed.
>> >
>> > The benefit of async capabilities is that a user calling a query
>> > shouldn't need their thread to be blocked simply to wait for the result
>> > of the query execution. Instead, a graph provider should take care of
>> > implementing async query execution.
>> > If that's the case then many graph providers will be able to optimize
>> > their execution of async queries by using fewer resources for the query
>> > execution.
>> > As a real example of the potential benefit, I would like to point to
>> > how JanusGraph executes CQL queries to process Gremlin queries.
>> > CQL result retrieval:
>> >
>> > https://github.com/JanusGraph/janusgraph/blob/15a00b7938052274fe15cf26025168299a311224/janusgraph-cql/src/main/java/org/janusgraph/diskstorage/cql/function/slice/CQLSimpleSliceFunction.java#L45
>> >
>> > As seen from the code above, JanusGraph already leverages async
>> > functionality for CQL queries under the hood, but JanusGraph is required
>> > to process those queries in a synchronous manner. So what JanusGraph
>> > does is simply block the whole executing thread until the result is
>> > returned instead of using async execution.
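
The difference between the two styles described above can be sketched as follows. SketchStore and SketchQueries are hypothetical stand-ins, not JanusGraph or CQL driver classes; the only assumption is a storage client that already hands back a CompletableFuture.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical storage client whose driver is already asynchronous.
class SketchStore {
    CompletableFuture<List<String>> fetchRowsAsync(String key) {
        // Simulated backend call; a real driver would complete this from its I/O threads.
        return CompletableFuture.supplyAsync(() -> List.of(key + ":a", key + ":b"));
    }
}

class SketchQueries {
    private final SketchStore store = new SketchStore();

    // Synchronous contract: the calling thread waits until the rows arrive.
    int countRowsBlocking(String key) {
        return store.fetchRowsAsync(key).join().size();
    }

    // Asynchronous contract: returns immediately; the continuation runs once rows arrive.
    CompletableFuture<Integer> countRowsAsync(String key) {
        return store.fetchRowsAsync(key).thenApply(List::size);
    }
}
```

Both methods issue the same backend call; only the second lets the caller's thread do other work while the backend is busy.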
>> >
>> > Of course, that's just a case where we can benefit from async execution
>> > because the underlying storage backend can process async queries. If a
>> > storage backend can't process async queries then we won't get any
>> > benefit from implementing a fake async executor.
>> >
>> > That said, I believe quite a few graph providers may benefit from being
>> > able to execute queries in an async fashion because they can optimize
>> > their resource utilization.
>> > I believe that we could have a feature flag for storage providers which
>> > want to implement async execution. Those who can't or don't want to
>> > implement it may simply disable async capabilities, which will result in
>> > an exception being thrown any time an async function is called. I think
>> > that should be fine because we already have some feature flags like that
>> > for graph providers. For example, "Null Semantics" was added in
>> > TinkerPop 3.5.0 but `null` is not supported by all graph providers.
>> > Thus, a feature flag for Null Semantics exists:
>> > "g.getGraph().features().vertex().supportsNullPropertyValues()".
>> > I believe we can enable async in TinkerPop 3 by providing async as a
>> > feature flag and letting graph providers implement it at will.
>> > Moreover, if a graph provider wants to have async capabilities but its
>> > storage backends don't support them, then it should be easy to hide
>> > async execution behind an ExecutorService which mimics async execution.
>> > I believe we could do that for TinkerGraph so that users could at least
>> > experiment with the async API. I believe we could simply have a default
>> > "async" function implementation for TinkerGraph which wraps all sync
>> > executions in a function and sends it to that ExecutorService (we can
>> > discuss which one).
>> > In such a case TinkerGraph will support async execution even without
>> > real async functionality. We could also potentially provide some
>> > configuration options for TinkerGraph to configure the thread pool size,
>> > executor service implementation, etc.
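
A minimal sketch of the feature-flag idea above, under stated assumptions: SketchProvider, supportsAsync and submit are hypothetical names invented for illustration and do not exist in TinkerPop. One provider opts out and throws; the other has no async backend and mimics async execution on an Executor, as suggested for TinkerGraph.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Supplier;

// Hypothetical provider contract; the flag name supportsAsync is an assumption.
interface SketchProvider {
    boolean supportsAsync();

    <T> CompletableFuture<T> submit(Supplier<T> query);
}

// Provider that disables async: any async call fails fast with an exception.
class SyncOnlyProvider implements SketchProvider {
    @Override
    public boolean supportsAsync() {
        return false;
    }

    @Override
    public <T> CompletableFuture<T> submit(Supplier<T> query) {
        throw new UnsupportedOperationException("async execution is disabled for this provider");
    }
}

// Provider without a truly async backend: runs the synchronous query on an executor.
class ExecutorBackedProvider implements SketchProvider {
    private final Executor executor; // pool size and implementation would be configurable

    ExecutorBackedProvider(Executor executor) {
        this.executor = executor;
    }

    @Override
    public boolean supportsAsync() {
        return true;
    }

    @Override
    public <T> CompletableFuture<T> submit(Supplier<T> query) {
        return CompletableFuture.supplyAsync(query, executor);
    }
}
```

A caller could check supportsAsync() before choosing the async path, in the same way existing feature flags are checked today.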
>> >
>> > I haven't yet thought about how best to implement those async
>> > capabilities for TinkerPop, but I think reusing an approach similar to
>> > Node.js, which returns a Promise when calling terminal steps, could be
>> > good. For example, we could have a method called `async` which accepts a
>> > terminal step and returns the corresponding Future object.
>> > I.e.:
>> > g.V(123).async(Traversal.next())
>> > g.V().async(Traversal.toList())
>> > g.E().async(Traversal.toSet())
>> > g.E().async(Traversal.iterate())
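
In Java, the `async(...)` shape proposed above could be approximated by passing the terminal step as a method reference. This is only a sketch: MiniTraversal is a hypothetical miniature class, not TinkerPop's Traversal, and running the step on the common pool merely mimics async execution.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical miniature traversal; NOT TinkerPop's Traversal.
class MiniTraversal<E> {
    private final Iterator<E> it;

    MiniTraversal(List<E> elements) {
        this.it = elements.iterator();
    }

    // Synchronous terminal steps.
    E next() {
        return it.next();
    }

    List<E> toList() {
        List<E> out = new ArrayList<>();
        it.forEachRemaining(out::add);
        return out;
    }

    // Accepts a terminal step and returns a future of that step's result.
    <R> CompletableFuture<R> async(Function<MiniTraversal<E>, R> terminalStep) {
        return CompletableFuture.supplyAsync(() -> terminalStep.apply(this));
    }
}
```

Usage would then look like `traversal.async(MiniTraversal::toList)` or `traversal.async(MiniTraversal::next)`, with the return type of the future following from the chosen terminal step.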
>> >
>> > I know that there were discussions about eventually adding async
>> > functionality to TinkerPop 4, but I don't see strong reasons why we
>> > couldn't add async functionality to TinkerPop 3 with a feature flag.
>> > It would be really great to hear some thoughts and concerns about it.
>> >
>> > If there are no concerns, I'd like to develop a proposal for further
>> > discussion.
>> >
>> > Best regards,
>> > Oleksandr Porunov
>>
>>
