Hi Oleksandr, I agree about the long-standing need for async queries. A "fake" async API for TinkerPop was one of the first things we had to build when I first started at Uber in 2017 (using JanusGraph on Cassandra, and later an in-house Cassandra-based graph DB). Feel free to share an early version of your proposal here, or post a link to a design doc; I would be happy to be in the loop from an interoperability point of view -- e.g. making sure that async APIs in different languages are analogous. Callbacks / promise-based RPC would have been my first thought, as well.
Josh

On Thu, Jul 28, 2022 at 7:18 AM Oleksandr Porunov <[email protected]> wrote:

> I'm interested in adding async capabilities to TinkerPop.
>
> There were many discussions about async capabilities for TinkerPop but
> there was no clear consensus on how and when it should be developed.
>
> The benefit of async capabilities is that the user calling a query
> shouldn't need their thread to be blocked simply to wait for the result of
> the query execution. Instead, a graph provider should take care of
> implementing async query execution.
> If that's the case then many graph providers will be able to optimize their
> execution of async queries by using fewer resources for the query
> execution.
> As a real example of the potential benefit, I would like to point
> to how JanusGraph executes CQL queries to process Gremlin queries.
> CQL result retrieval:
>
> https://github.com/JanusGraph/janusgraph/blob/15a00b7938052274fe15cf26025168299a311224/janusgraph-cql/src/main/java/org/janusgraph/diskstorage/cql/function/slice/CQLSimpleSliceFunction.java#L45
>
> As seen from the code above, JanusGraph already leverages async
> functionality for CQL queries under the hood but JanusGraph is required to
> process those queries in a synchronous manner, so JanusGraph simply
> blocks the whole executing thread until the result is returned instead of
> using async execution.
>
> Of course, that's just a case where we can benefit from async execution
> because the underlying storage backend can process async queries. If a
> storage backend can't process async queries then we won't get any benefit
> from implementing a fake async executor.
>
> That said, I believe quite a few graph providers may benefit from being
> able to execute queries in async fashion because they can optimize
> their resource utilization.
> I believe that we could have a feature flag for storage providers which
> want to implement async execution. Those who can't implement it or don't
> want to implement it may simply disable async capabilities, which will
> result in throwing an exception anytime an async function is called. I
> think it should be fine because we already have some feature flags like
> that for graph providers. For example "Null Semantics" was added in
> TinkerPop 3.5.0 but `null` is not supported by all graph providers. Thus,
> a feature flag for Null Semantics exists like
> "g.getGraph().features().vertex().supportsNullPropertyValues()".
> I believe we can enable async in TinkerPop 3 by providing async as a
> feature flag and letting graph providers implement it at will.
> Moreover, if a graph provider wants to have async capabilities but their
> storage backends don't support async capabilities then it should be easy to
> hide async execution under an ExecutorService which mimics async execution.
> I believe we could do that for TinkerGraph so that users could at least
> experiment with the async API. I believe we could simply have a default
> "async" function implementation for TinkerGraph which wraps all sync
> executions in a function and sends it to that ExecutorService (we can
> discuss which one).
> In such a case TinkerGraph will support async execution even without real
> async functionality. We could also potentially provide some configuration
> options to TinkerGraph to configure thread pool size, executor service
> implementation, etc.
>
> I haven't yet thought through how best to implement those async
> capabilities for TinkerPop, but I think reusing an approach similar to
> Node.js, which returns a Promise when calling Terminal steps, could be
> good. For example, we could have a method called `async` which accepts a
> termination step and returns a corresponding Future object.
> I.e.:
> g.V(123).async(Traversal.next())
> g.V().async(Traversal.toList())
> g.E().async(Traversal.toSet())
> g.E().async(Traversal.iterate())
>
> I know that there were discussions about adding async functionality to
> TinkerPop 4 eventually, but I don't see strong reasons why we couldn't add
> async functionality to TinkerPop 3 with a feature flag.
> It would be really great to hear some thoughts and concerns about it.
>
> If there are no concerns, I'd like to develop a proposal for further
> discussion.
>
> Best regards,
> Oleksandr Porunov
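
For reference, the ExecutorService-backed fallback and the feature flag described in the quoted message could be sketched roughly as below. This is only an illustration under stated assumptions, not TinkerPop code: the class name `FakeAsyncSketch`, the `supportsAsync` flag, and the `async(Supplier)` signature are all hypothetical, and a plain `Supplier` stands in for a blocking terminal step such as `toList()`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical sketch: none of these names exist in TinkerPop.
public class FakeAsyncSketch {

    // Stand-in for a provider-level feature flag (assumed name).
    private final boolean supportsAsync;

    // Pool that runs the blocking work; size and type would be configurable.
    private final ExecutorService executor;

    public FakeAsyncSketch(boolean supportsAsync, ExecutorService executor) {
        this.supportsAsync = supportsAsync;
        this.executor = executor;
    }

    // Wraps a blocking terminal step and hands it to the executor.
    // Throws immediately if the provider has disabled async support.
    public <T> CompletableFuture<T> async(Supplier<T> blockingTerminalStep) {
        if (!supportsAsync) {
            throw new UnsupportedOperationException(
                    "Async execution is not supported by this graph provider");
        }
        return CompletableFuture.supplyAsync(blockingTerminalStep, executor);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            FakeAsyncSketch graph = new FakeAsyncSketch(true, pool);

            // The lambda stands in for something like g.V().toList().
            CompletableFuture<List<Integer>> result =
                    graph.async(() -> Arrays.asList(1, 2, 3));

            // The caller is free to do other work; here we just wait and print.
            result.thenAccept(list ->
                    System.out.println("got " + list.size() + " results")).join();
        } finally {
            pool.shutdown();
        }
    }
}
```

Note that this only moves the blocking call off the caller's thread; a pool thread still waits for the result, which is why the quoted message expects no resource benefit from a fake executor when the storage backend itself cannot process async queries.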
