Well, the wrapper I mentioned before did not require a full rewrite of TinkerPop :-) Rather, it provided async interfaces for vertices and edges, on which operations like subgraph and shortest-path queries were evaluated asynchronously (using a special language, as it happened, but limited Gremlin queries would have been an option). So I think a basic async API might be a useful starting point even if it doesn't go very deep.

Josh

On Thu, Jul 28, 2022 at 4:21 PM Oleksandr Porunov <[email protected]> wrote:

> Hi Joshua and Pieter,
>
> Thank you for joining the conversation!
>
> I didn't actually look into the implementation details yet but quickly
> checking Traversal.java code I think Pieter is right here.
> For some reason I thought we could simply wrap synchronous method in
> asynchronous, basically something like:
>
> // the method which should be implemented by a graph provider
>
> Future<E> executeAsync(Callable<E> func);
>
> public default Future<E> asyncNext(){
>     return executeAsync(this::next);
> }
>
> but checking that code I think I was wrong about it. Different steps may
> execute different logic (i.e. different underlying storage queries) for
> different graph providers.
> Thus, wrapping only terminal steps into async functions won't solve the
> problem most likely.
>
> I guess it will require re-writing or extending all steps to be able to
> pass an async state instead of a sync state.
>
> I'm not familiar enough with the TinkerPop code yet to claim that, so
> probably I could be wrong.
> I will need to research it a bit more to find out but I think that Pieter
> is most likely right about a massive re-write.
>
> Nevertheless, even if that requires massive re-write, I would be eager to
> start the ball rolling.
> I think we either need to try to implement async execution in TinkerPop 3
> or start making some concrete decisions regarding TinkerPop 4.
>
> I see Marko A. Rodriguez started to work on RxJava back in 2019 here
> https://github.com/apache/tinkerpop/tree/4.0-dev/java/machine/processor/rxjava/src/main/java/org/apache/tinkerpop/machine/processor/rxjava
>
> but the process didn't go as far as I understand. I guess it would be good
> to know if we want to completely rewrite TinkerPop in version 4 or not.
>
> If we want to completely rewrite TinkerPop in version 4 then I assume it
> may take quite some time to do so. In this case I would be more likely to
> say that it's better to implement async functionality in TinkerPop 3 even
> if it requires rewriting all steps.
>
> In case TinkerPop 4 is a redevelopment with breaking changes but without
> starting to rewrite the whole functionality then I guess we could try to
> work on TinkerPop 4 by introducing async functionality and maybe applying
> more breaking changes in places where it's better to re-work some parts.
>
> Best regards,
> Oleksandr
>
>
> On Thu, Jul 28, 2022 at 7:47 PM pieter gmail <[email protected]>
> wrote:
>
>> Hi,
>>
>> Does this not imply a massive rewrite of TinkerPop? In particular the
>> iterator chaining pattern of steps should follow a reactive style of
>> coding?
>>
>> Cheers
>> Pieter
>>
>>
>> On Thu, 2022-07-28 at 15:18 +0100, Oleksandr Porunov wrote:
>> > I'm interested in adding async capabilities to TinkerPop.
>> >
>> > There were many discussions about async capabilities for TinkerPop
>> > but there was no clear consensus on how and when it should be
>> > developed.
>> >
>> > The benefit for async capabilities is that the user calling a query
>> > shouldn't need its thread to be blocked to simply wait for the
>> > result of the query execution. Instead of that a graph provider
>> > should take care about implementation of async queries execution.
>> > If that's the case then many graph providers will be able to
>> > optimize their execution of async queries by handling less
>> > resources for the query execution.
>> > As a real example of potential benefit we could get I would like to
>> > point on how JanusGraph executes CQL queries to process Gremlin
>> > queries.
>> > CQL result retrieval:
>> > https://github.com/JanusGraph/janusgraph/blob/15a00b7938052274fe15cf26025168299a311224/janusgraph-cql/src/main/java/org/janusgraph/diskstorage/cql/function/slice/CQLSimpleSliceFunction.java#L45
>> >
>> > As seen from the code above, JanusGraph already leverages async
>> > functionality for CQL queries under the hood but JanusGraph is
>> > required to process those queries in synced manner, so what
>> > JanusGraph does - it simply blocks the whole executing thread until
>> > result is returned instead of using async execution.
>> >
>> > Of course, that's just a case when we can benefit from async
>> > execution because the underneath storage backend can process async
>> > queries. If a storage backend can't process async queries then we
>> > won't get any benefit from implementing a fake async executor.
>> >
>> > That said, I believe quite a few graph providers may benefit from
>> > having a possibility to execute queries in async fashion because
>> > they can optimize their resource utilization.
>> > I believe that we could have a feature flag for storage providers
>> > which want to implement async execution. Those who can't implement
>> > it or don't want to implement it may simply disable async
>> > capabilities which will result in throwing an exception anytime an
>> > async function is called. I think it should be fine because we
>> > already have some feature flags like that for graph providers. For
>> > example "Null Semantics" was added in TinkerPop 3.5.0 but `null` is
>> > not supported for all graph providers. Thus, a feature flag for
>> > Null Semantics exists like
>> > "g.getGraph().features().vertex().supportsNullPropertyValues()".
>> > I believe we can enable async in TinkerPop 3 by providing async as
>> > a feature flag and letting graph providers implement it at their
>> > will.
>> > Moreover if a graph provider wants to have async capabilities but
>> > their storage backends don't support async capabilities then it
>> > should be easy to hide async execution under an ExecutorService
>> > which mimics async execution.
>> > I believe we could do that for TinkerGraph so that users could
>> > experiment with async API at least. I believe we could simply have
>> > a default "async" function implementation for TinkerGraph which
>> > wraps all sync executions in a function and sends it to that
>> > ExecutorService (we can discuss which one).
>> > In such a case TinkerGraph will support async execution even
>> > without real async functionality. We could also potentially provide
>> > some configuration options to TinkerGraph to configure thread pool
>> > size, executor service implementation, etc.
>> >
>> > I didn't think about how it is better to implement those async
>> > capabilities for TinkerPop yet but I think reusing a similar
>> > approach like in Node.js which returns Promise when calling
>> > Terminal steps could be good. For example, we could have a method
>> > called `async` which accepts a termination step and returns a
>> > necessary Future object.
>> > I.e.:
>> > g.V(123).async(Traversal.next())
>> > g.V().async(Traversal.toList())
>> > g.E().async(Traversal.toSet())
>> > g.E().async(Traversal.iterate())
>> >
>> > I know that there were discussions about adding async functionality
>> > to TinkerPop 4 eventually, but I don't see strong reasons why we
>> > couldn't add async functionality to TinkerPop 3 with a feature
>> > flag.
>> > It would be really great to hear some thoughts and concerns about
>> > it.
>> >
>> > If there are no concerns, I'd like to develop a proposal for
>> > further discussion.
>> >
>> > Best regards,
>> > Oleksandr Porunov
>>
>>
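
For illustration only, the ExecutorService fallback described in the original proposal (the "mimicked" async that TinkerGraph could offer) might be sketched roughly as below. This is plain Java with no TinkerPop dependency; the class name `AsyncTerminalSketch` and its methods are hypothetical and are not an existing or proposed TinkerPop API. `asyncNext` takes an `Iterator` because a `Traversal` is itself an `Iterator`, which is also why `this::next` works in the quoted snippet.

```java
import java.util.Iterator;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;

// Hypothetical helper (assumption, not TinkerPop code): submits a blocking
// terminal call to a thread pool and hands back a future.
final class AsyncTerminalSketch {
    private final ExecutorService pool;

    AsyncTerminalSketch(ExecutorService pool) {
        this.pool = pool;
    }

    // Counterpart of the executeAsync(Callable) idea quoted in the thread.
    <E> CompletableFuture<E> executeAsync(Callable<E> func) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return func.call();
            } catch (Exception e) {
                // Surface checked exceptions through the future.
                throw new CompletionException(e);
            }
        }, pool);
    }

    // Stand-in for asyncNext(): the blocking next() runs on a pool thread.
    <E> CompletableFuture<E> asyncNext(Iterator<E> traversal) {
        return executeAsync(traversal::next);
    }
}
```

Note that this only moves the blocking call onto a pool thread; as the thread points out, it gives no real benefit unless the steps themselves and the storage backend are non-blocking.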
