I've opened a PR for the UnifiedChannelizer: https://github.com/apache/tinkerpop/pull/1414
Over the weekend I passed 10s of millions of messages through the UnifiedChannelizer and the server stayed quite stable throughout it all. With all the tests passing nicely now in Travis, Docker and locally I think this one is now ready for review. On Fri, Apr 2, 2021 at 8:35 AM Stephen Mallette <[email protected]> wrote: > I've solved the Travis issue. Some tests simply required more threads in > gremlinPool than the default which is set to the available cores. On Travis > you probably get a machine with 2 cores available and some tests required 3 > or more to keep that many sessions open at once. Silly problem, though the > analysis did help uncover other issues so I guess Travis did its job. > > I've merged the g.tx() work and rebased the UnifiedChannelizer branch and > will continue with profiling/testing into early next week. I think that > should set this up for a PR at that point. If you'd like to see what the > work looks like, I think it would be fine to do so now at: > > https://github.com/apache/tinkerpop/compare/TINKERPOP-2245 > > > > > On Thu, Apr 1, 2021 at 11:57 AM Stephen Mallette <[email protected]> > wrote: > >> I've been fussing with Travis for a while now and haven't had much luck >> figuring out what is wrong. Decided to start profiling the server to see if >> that provided any hints (and it was something I intended to do in any >> case). Two issues came up: >> >> 1. shouldEventuallySucceedOnSameServerWithDefault() had some bad >> assertion logic and it fails for both UnifiedChannelizer and for the old >> OpProcessor stuff when the assertions are corrected. I've @Ignore'd that >> test for now. No idea if it ever worked or if it's busted with some more >> recent changes to the driver. I don't think that this was the cause of the >> problems with Travis but I can't be sure. >> 2. AliasClusteredClient did not proxy calls to its underlying Client >> instance. This was purposeful, but has bad implications for >> DriverRemoteConnection. 
I'd now consider it a bug in 3.5.0 and I've changed >> the behavior. Ultimately, the whole Client aliasing probably needs to be >> changed, but it's too big a thing to try to get into now. Not completely >> sure yet if this fixes travis either, but I sense it has something to do >> with it. >> >> Travis is a bit backed up today so it's slow getting results. Hopefully >> (2) solved the problem or at least leads to an error i can see. >> >> The nice side-effect of this profiling is that I've come to see how much >> faster UnifiedChannelizer is as compared to the OpProcessor. Using the new >> remote transaction approach to load 100k vertices and 50k edges in 50k >> separate transactions (2 vertices/1 edge per tx), took under 3 minutes with >> UnifiedChannelizer and the server stayed remarkably stable in terms of >> resources/gc. The same workload on OpProcessor.........I gave up waiting >> for it to finish after 1500 transactions, which took 9 minutes. Looking at >> the jfr, there was a really weird gc pattern and memory usage under >> OpProcessor which perhaps had some connection to our quickly >> starting/killing threads and GremlinExecutor instances. >> >> I plan to keep testing/profiling for a bit longer before issuing the PR >> (and hopefully getting travis fixed in the process)...more updates to come >> i imagine. >> >> Divij, I'm glad that you brought this issue up early. I imagine I would >> have noticed the problem in profiling, but it was smart to call it out >> early. Nice job. >> >> >> >> On Mon, Mar 29, 2021 at 6:55 AM Stephen Mallette <[email protected]> >> wrote: >> >>> I can't seem to get Travis to pass the build for this branch with the >>> UnifiedChannelizer enabled. >>> >>> https://travis-ci.com/github/apache/tinkerpop/builds/221427334 >>> >>> It just hangs without much information. I can't seem to get it to behave >>> otherwise and it doesn't fail for me locally (of course). 
If anyone has >>> some free CPU cycles could you please do: >>> >>> mvn clean install -DskipTests && mvn verify -pl :gremlin-server >>> -DskipTests -DskipIntegrationTests=false -DincludeNeo4j -DtestUnified=true >>> >>> I'm curious if anyone else can recreate the issue... Thanks >>> >>> >>> >>> On Thu, Mar 25, 2021 at 3:33 PM Stephen Mallette <[email protected]> >>> wrote: >>> >>>> It took me a long time to get to the final steps of this change. >>>> Uncovered some unexpected bugs in the WsAndHttpChannelizer and that was >>>> gnarly to sort out. Also found some problems with authorization through the >>>> SaslAndHttpBasicAuthenticationHandler. I can't say I really like how these >>>> combined channelizers really work. They seem like the new tangle of stuff in >>>> Gremlin Server. Not sure how much of that I can sort out for 3.5.0 at >>>> this point. I might make some attempts on this body of work since it's >>>> fresh in my mind but we'll see. >>>> >>>> After getting through these little nits I've managed to get all of the >>>> Gremlin Server integration test cases passing with the new >>>> UnifiedChannelizer. I've added a -DtestUnified property to run the >>>> UnifiedChannelizer rather than the old stuff. This way we can add a new job >>>> to Travis to cover that by itself in parallel to the old. You can run it >>>> like this: >>>> >>>> mvn verify -pl gremlin-server -DskipIntegrationTests=false >>>> -DtestUnified=true >>>> >>>> Having all the tests pass on the new channelizer makes me feel pretty >>>> confident that what I have at least works (and without much change in >>>> behavior) and therefore have pushed a branch for early review: >>>> >>>> https://github.com/apache/tinkerpop/tree/TINKERPOP-2245 >>>> >>>> I still need to write a few more tests and after that it would be time >>>> to do a bit of performance testing to see how it compares to the old >>>> method. I'm also missing "metrics". 
If all that is good then I'll need to >>>> do a gang of documentation work. I imagine I will continue to keep this >>>> work separate from the bytecode transaction work that spawned it and merge >>>> the transaction stuff first followed by this. In that way, expect this >>>> branch to be a bit volatile with rebasing. >>>> >>>> >>>> >>>> >>>> >>>> On Tue, Mar 23, 2021 at 6:51 PM Stephen Mallette <[email protected]> >>>> wrote: >>>> >>>>> More reasonable progress with the UnifiedChannelizer. The short >>>>> version is that I have a pretty major quantity of the server integration >>>>> tests passing under this new Channelizer. The ones that are failing are >>>>> typically ones that assert some specific log messages or exception messages >>>>> which may not necessarily apply anymore. I'm still mulling those over. >>>>> >>>>> The basic structure of the UnifiedChannelizer is that it sorta copies >>>>> the WsAndHttpChannelizer and builds from there, so when this >>>>> UnifiedChannelizer is configured you'll get HTTP and WebSockets. The >>>>> UnifiedChannelizer gets rid of the entire OpProcessor infrastructure. I >>>>> think that was a good body of code years ago, but we've now learned quite >>>>> explicitly what we want Gremlin Server to do and what role it plays. >>>>> OpProcessor just feels like an unnecessary abstraction now. Removing it >>>>> simplifies the code of the UnifiedChannelizer, which adds a single >>>>> UnifiedHandler to the Netty pipeline. That UnifiedHandler streamlines all >>>>> those OpProcessor implementations into one so all the duplicate code is >>>>> effectively removed by running RequestMessage instances through a single >>>>> common flow. That means all the request validation, error handling, >>>>> iterator processing, result building, transaction management, etc. will be >>>>> in one place. >>>>> >>>>> The UnifiedHandler also does away with the GremlinExecutor. 
That means all >>>>> the Lifecycle callbacks and random lambdas that generated so much >>>>> indirection in the name of extensibility are no longer confusing the code >>>>> base and the wicked hierarchy of timeouts and exception throws is all >>>>> flattened. >>>>> >>>>> When the UnifiedHandler gets a RequestMessage it validates it, builds >>>>> a Context and creates a Rexster instance (thought I'd bring back the name >>>>> rather than go with the generic "Worker" interface) that has the Context >>>>> added to it as a task. The Rexster instance is then submitted to the >>>>> Gremlin thread pool to do work when a thread is available. Rexster will >>>>> hang on to that thread awaiting tasks that are assigned to it. For >>>>> sessionless requests, that means Rexster handles one request and exits. >>>>> For >>>>> a sessionful request it means the Rexster will wait for more tasks >>>>> (Context >>>>> objects from new requests) to be added to it. The UnifiedHandler holds >>>>> those Rexster instances in a Map aligning session identifiers to Rexster >>>>> instances. As Divij alluded to, I suppose this means we are putting down >>>>> a foundation for query cancellation and better insight into the queue of >>>>> things that are running. >>>>> >>>>> Now that the gremlinPool server setting covers both types of requests >>>>> we no longer have a situation where the number of threads that can be >>>>> running is unbounded and a "session" will now look a bit like a "long-running >>>>> job" in that it will consume a thread from the pool for as long as the >>>>> session is open. So if the gremlinPool is 16, you could have 16 sessions >>>>> active. Sessionless requests would begin to queue as would requests for >>>>> new >>>>> sessions and they would be serviced in the order received. >>>>> >>>>> I'm still not quite ready for a pull request. 
Despite having the bulk >>>>> of the integration tests working, the code isn't quite the way I think it >>>>> will be finally organized (I used a lot of inner classes and things which >>>>> probably should change). I also need to figure out how to run the tests. I >>>>> think I need to run them all per channelizer (old and new) but I think >>>>> that >>>>> will push the build time over the maximum on Travis, so I may need to break >>>>> things up somehow. I'm reasonably sure I'll have a branch ready to push >>>>> before the end of the week. >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Mon, Mar 22, 2021 at 5:22 PM Stephen Mallette <[email protected]> >>>>> wrote: >>>>> >>>>>> I've started to play with some ideas for pushing all requests >>>>>> (scripts/bytecode, sessions/sessionless) down one unified pipeline with one >>>>>> shared thread pool. The most interesting thing I've seen so far is how >>>>>> much >>>>>> the code could potentially simplify under this approach. The >>>>>> GremlinExecutor was the wrong abstraction in a sense and it forced way >>>>>> too >>>>>> many callbacks as arguments and made error handling kinda sloppy which >>>>>> made >>>>>> the server code harder to reuse, extend and read. Of course, all that was >>>>>> built before the first remote graph was built and we were just wildly >>>>>> guessing at how DS Graph was going to work with all this. Now, I think >>>>>> the >>>>>> remote model is much clearer, so maybe all this can be made "right" >>>>>> now. >>>>>> >>>>>> I think that if I can see any success at all with this in the next >>>>>> few days, we could include a new UnifiedChannelizer that would replace >>>>>> the >>>>>> existing ones. I think we'd keep the old one in place by default and keep >>>>>> my copy/paste work for SessionOpProcessor that added bytecode there, with >>>>>> the idea that the UnifiedChannelizer will become the future default as we >>>>>> get it tested further along the 3.5.x line. 
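The session/worker model described in this thread (a bounded gremlinPool whose threads are held by sessions while sessionless work queues behind them) can be illustrated with a small sketch. This is not the actual TinkerPop code; the class name SessionWorker and the structure are invented for illustration, using only JDK concurrency primitives.

```java
import java.util.concurrent.*;

// Hypothetical sketch of the model in this thread: one bounded "gremlinPool",
// where a session occupies a worker thread for its lifetime and each request
// arrives at that worker as a task. Names here are illustrative only.
public class SessionWorker implements Runnable {
    private final BlockingQueue<Runnable> tasks = new LinkedBlockingQueue<>();
    private volatile boolean open = true;

    public void submit(Runnable task) { tasks.add(task); }

    public void close() {
        open = false;
        tasks.add(() -> {});  // no-op task to wake the loop so it can exit
    }

    @Override
    public void run() {
        // the worker hangs on to its pool thread, draining tasks until closed
        while (open) {
            try {
                tasks.take().run();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService gremlinPool = Executors.newFixedThreadPool(2); // "gremlinPool = 2"
        SessionWorker session = new SessionWorker();
        gremlinPool.submit(session);  // the session now holds one pool thread

        CountDownLatch done = new CountDownLatch(2);
        session.submit(done::countDown);  // two requests for the same session
        session.submit(done::countDown);  // run on the same worker, in order
        done.await();

        session.close();
        gremlinPool.shutdown();
        System.out.println("session tasks completed: " + (done.getCount() == 0));
    }
}
```

With a pool of 16 this gives exactly the behavior the thread describes: at most 16 sessions active, with further sessionless requests and new-session requests queuing in arrival order.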
>>>>>> >>>>>> I imagine I'll have more details on this task tomorrow and will post >>>>>> back here when I do. >>>>>> >>>>>> >>>>>> On Thu, Mar 18, 2021 at 12:37 PM Divij Vaidya < >>>>>> [email protected]> wrote: >>>>>> >>>>>>> > >>>>>>> > I suppose the >>>>>>> > question is how would we ensure that each request for a session >>>>>>> ends up >>>>>>> > being executed on the same thread from the previous request if >>>>>>> jobs are >>>>>>> > randomly submitted to a worker pool? >>>>>>> >>>>>>> >>>>>>> I haven't thought through the details, but off the top of my head, we >>>>>>> would have >>>>>>> to maintain some request<->thread mapping on the server. This >>>>>>> mapping is >>>>>>> also a building block for a request cancellation feature in the future >>>>>>> where a >>>>>>> client would be able to send a cancellation request to the server, >>>>>>> the >>>>>>> server would map the request to a thread executing that request and >>>>>>> then set >>>>>>> an interrupt on that thread to signify cancellation. >>>>>>> >>>>>>> Divij Vaidya >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Thu, Mar 18, 2021 at 5:00 PM Stephen Mallette < >>>>>>> [email protected]> >>>>>>> wrote: >>>>>>> >>>>>>> > I started a fresh thread for this topic Divij brought up, with more >>>>>>> > context: >>>>>>> > >>>>>>> > > In a scenario where we have both >>>>>>> > > session-less and session-ful requests being made to the server, >>>>>>> the >>>>>>> > resource >>>>>>> > > allocation will not be fair and may lead to starvation for some >>>>>>> requests. >>>>>>> > > This problem exists even today, hence not totally correlated to >>>>>>> what you >>>>>>> > > are proposing but I am afraid a wider adoption of explicit >>>>>>> > > transactions will bring this problem to the spotlight. Hence, we >>>>>>> should >>>>>>> > fix >>>>>>> > > this problem first. A solution would entail converging the >>>>>>> worker pool >>>>>>> > for >>>>>>> > > both session vs session-less requests. 
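The request<->thread mapping that Divij sketches above could look roughly like the following. This is a hypothetical illustration, not TinkerPop code: the server keeps a map from request id to the Future executing it, so a cancellation request can interrupt the right thread via `Future.cancel(true)`.

```java
import java.util.Map;
import java.util.concurrent.*;

// Hypothetical sketch of a request<->thread mapping for cancellation.
// RequestRegistry and its method names are invented for illustration.
public class RequestRegistry {
    private final Map<String, Future<?>> running = new ConcurrentHashMap<>();
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    public void submit(String requestId, Runnable work) {
        Future<?> f = pool.submit(() -> {
            try { work.run(); } finally { running.remove(requestId); }
        });
        running.put(requestId, f);
    }

    // cancel(true) sets an interrupt on the thread running the request, if any
    public boolean cancel(String requestId) {
        Future<?> f = running.remove(requestId);
        return f != null && f.cancel(true);
    }

    public void shutdown() { pool.shutdown(); }

    public static void main(String[] args) throws Exception {
        RequestRegistry registry = new RequestRegistry();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch interrupted = new CountDownLatch(1);

        registry.submit("req-1", () -> {
            started.countDown();
            try {
                Thread.sleep(60_000);             // a long-running "query"
            } catch (InterruptedException e) {
                interrupted.countDown();          // cancellation observed
            }
        });

        started.await();
        boolean cancelled = registry.cancel("req-1");
        interrupted.await();                      // the worker saw the interrupt
        registry.shutdown();
        System.out.println("cancelled: " + cancelled);
    }
}
```

Note the caveat that interruption is cooperative: the executing work has to check or react to the interrupt (as Gremlin traversal iteration does) for the cancellation to take effect.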
>>>>>>> > >>>>>>> > I'm not sure we can get that done in 3.5.0, but maybe? I suppose >>>>>>> the >>>>>>> > question is how would we ensure that each request for a session >>>>>>> ends up >>>>>>> > being executed on the same thread from the previous request if >>>>>>> jobs are >>>>>>> > randomly submitted to a worker pool? >>>>>>> > >>>>>>> > >>>>>>> > On Thu, Mar 18, 2021 at 11:52 AM Divij Vaidya < >>>>>>> [email protected]> >>>>>>> > wrote: >>>>>>> > >>>>>>> > > Great idea Stephen. I agree with standardizing the explicit >>>>>>> transaction >>>>>>> > > semantics in cases of remote server (implicit aka sessionless is >>>>>>> already >>>>>>> > > self-explanatory) and unifying the syntax with embedded graph >>>>>>> usage. >>>>>>> > > >>>>>>> > > Couple of notes: >>>>>>> > > >>>>>>> > > 1. I would favor begin() instead of create() as it's closer >>>>>>> to >>>>>>> > > universally prominent SQL transaction syntax. This would reduce >>>>>>> the >>>>>>> > > learning curve for users of TP. >>>>>>> > > >>>>>>> > > 2. From an implementation standpoint, sessionless requests are >>>>>>> queued on >>>>>>> > > the server side and are serviced by the worker thread pool. But a >>>>>>> > > session-ful aka explicit transaction aka managed transaction >>>>>>> starts a new >>>>>>> > > single-worker thread pool every time. In a scenario where we >>>>>>> have both >>>>>>> > > session-less and session-ful requests being made to the server, >>>>>>> the >>>>>>> > resource >>>>>>> > > allocation will not be fair and may lead to starvation for some >>>>>>> requests. >>>>>>> > > This problem exists even today, hence not totally correlated to >>>>>>> what you >>>>>>> > > are proposing but I am afraid a wider adoption of explicit >>>>>>> > > transactions will bring this problem to the spotlight. Hence, we >>>>>>> should >>>>>>> > fix >>>>>>> > > this problem first. A solution would entail converging the >>>>>>> worker pool >>>>>>> > for >>>>>>> > > both session vs session-less requests. 
>>>>>>> > > >>>>>>> > > 3. You are proposing the idea of having a transaction-scoped >>>>>>> traversal >>>>>>> > > object. I agree with it but we need more clarification in >>>>>>> behavior wrt >>>>>>> > the >>>>>>> > > following scenarios: >>>>>>> > > >>>>>>> > > Q. What happens when g.tx().commit() is called on a transaction-scoped >>>>>>> > > traversal object? Does the traversal get closed? >>>>>>> > > Q. Currently, the same traversal object could be used to execute >>>>>>> multiple >>>>>>> > > session-less requests simultaneously in a thread-safe manner. >>>>>>> Does the >>>>>>> > same >>>>>>> > > behaviour apply to a transaction-scoped traversal? If not, then >>>>>>> how do I >>>>>>> > send >>>>>>> > > multiple requests in parallel from the client all scoped to the >>>>>>> same >>>>>>> > > transaction on the server? Note that this is different from the case >>>>>>> of multi-threaded >>>>>>> > > transactions because on the server all requests are >>>>>>> scoped to >>>>>>> > > a single transaction, i.e. a single thread, but on the client they may >>>>>>> be >>>>>>> > > submitted via multiple threads. >>>>>>> > > >>>>>>> > > Regards, >>>>>>> > > Divij Vaidya >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > On Thu, Mar 18, 2021 at 4:05 PM Stephen Mallette < >>>>>>> [email protected]> >>>>>>> > > wrote: >>>>>>> > > >>>>>>> > > > The Transaction object is a bit important with remote cases >>>>>>> because it >>>>>>> > > > holds the reference to the session. With embedded use cases we >>>>>>> > generally >>>>>>> > > > adhere to ThreadLocal transactions so the Transaction >>>>>>> implementation in >>>>>>> > > > that case is more of a controller for the current thread but >>>>>>> for remote >>>>>>> > > > cases the Transaction implementation holds a bit of state that >>>>>>> can >>>>>>> > cross >>>>>>> > > > over threads. In some ways, that makes remote cases feel like a >>>>>>> > "threaded >>>>>>> > > > transaction" so that may be familiar to users?? 
Here's some >>>>>>> example >>>>>>> > > > syntax I currently have working in a test case: >>>>>>> > > > >>>>>>> > > > g = traversal().withRemote(conn) >>>>>>> > > > gtx = g.tx().create() >>>>>>> > > > assert gtx.isOpen() == true >>>>>>> > > > gtx.addV('person').iterate() >>>>>>> > > > gtx.addV('software').iterate() >>>>>>> > > > gtx.commit() // alternatively you could explicitly rollback() >>>>>>> > > > assert gtx.isOpen() == false >>>>>>> > > > >>>>>>> > > > I hope that documentation changes are enough to unify the >>>>>>> syntax and >>>>>>> > > remove >>>>>>> > > > confusion despite there still being ThreadLocal transactions >>>>>>> as a >>>>>>> > default >>>>>>> > > > for embedded cases and something else for remote. At least >>>>>>> they will >>>>>>> > look >>>>>>> > > > the same, even if you technically don't need to do a >>>>>>> g.tx().create() >>>>>>> > for >>>>>>> > > > embedded transactions and can just have the control you always >>>>>>> had with >>>>>>> > > > g.tx().commit() directly. >>>>>>> > > > >>>>>>> > > > Note that the remote Transaction object can support >>>>>>> configuration via >>>>>>> > > > onClose(CLOSE_BEHAVIOR) and you could therefore switch from >>>>>>> the default >>>>>>> > > of >>>>>>> > > > "commit on close" to rollback or manual (or I suppose something >>>>>>> > custom). >>>>>>> > > > It's nice that this piece works. I don't see a point in >>>>>>> supporting >>>>>>> > > > onReadWrite(READ_WRITE_BEHAVIOR) as it just doesn't make sense >>>>>>> in >>>>>>> > remote >>>>>>> > > > contexts. >>>>>>> > > > >>>>>>> > > > Another point is that I've added what I termed "Graph >>>>>>> Operation" >>>>>>> > > bytecode >>>>>>> > > > instances, which are bytecode that isn't related to traversal >>>>>>> > > construction. 
I >>>>>>> > > > hope this isn't one of those things where we end up wanting to >>>>>>> > deprecate >>>>>>> > > an >>>>>>> > > > idea as fast as we added it but we needed some way to pass >>>>>>> > > commit/rollback >>>>>>> > > > commands and they aren't really part of traversal >>>>>>> construction. They >>>>>>> > are >>>>>>> > > > operations that execute on the graph itself and I couldn't >>>>>>> think of how >>>>>>> > > to >>>>>>> > > > treat them as traversals nicely, so they sorta just exist as a >>>>>>> one off. >>>>>>> > > > Perhaps they will grow in number?? Folks have always asked if >>>>>>> bytecode >>>>>>> > > > requests could get "GraphFeature" information - I suppose this >>>>>>> could >>>>>>> > be a >>>>>>> > > > way to do that?? >>>>>>> > > > >>>>>>> > > > Anyway, I will keep going down this general path as it's >>>>>>> appearing >>>>>>> > > > relatively fruitful. >>>>>>> > > > >>>>>>> > > > >>>>>>> > > > >>>>>>> > > > >>>>>>> > > > On Thu, Mar 18, 2021 at 6:34 AM Stephen Mallette < >>>>>>> [email protected] >>>>>>> > > >>>>>>> > > > wrote: >>>>>>> > > > >>>>>>> > > > > > We should just communicate clearly that a simple remote >>>>>>> traversal >>>>>>> > > > > already uses a transaction by default, >>>>>>> > > > > >>>>>>> > > > > That's a good point because even with this change we still >>>>>>> have a >>>>>>> > > > > situation where a single iterated remote traversal will >>>>>>> generally >>>>>>> > mean >>>>>>> > > a >>>>>>> > > > > server managed commit/rollback, but in embedded mode, we >>>>>>> have an open >>>>>>> > > > > transaction (for transaction enabled graphs - TinkerGraph >>>>>>> will behave >>>>>>> > > > more >>>>>>> > > > > like a remote traversal, so more confusion there i guess). >>>>>>> I'm not >>>>>>> > sure >>>>>>> > > > how >>>>>>> > > > > to rectify that at this time except by way of documentation. 
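The onClose(CLOSE_BEHAVIOR) configuration discussed above (default "commit on close", switchable to rollback or manual) can be modeled with a tiny sketch. This is not the TinkerPop Transaction class; the Tx class and its fields are invented to illustrate the close-behavior semantics only.

```java
// Illustrative model of close-behavior configuration on a transaction:
// the default of "commit on close" can be switched to rollback or manual.
public class Tx {
    public enum CloseBehavior { COMMIT, ROLLBACK, MANUAL }

    private CloseBehavior onClose = CloseBehavior.COMMIT; // remote default
    private String lastAction = "none";
    private boolean open = true;

    public Tx onClose(CloseBehavior behavior) { this.onClose = behavior; return this; }
    public boolean isOpen() { return open; }
    public void commit()   { lastAction = "commit";   open = false; }
    public void rollback() { lastAction = "rollback"; open = false; }

    public void close() {
        if (!open) return;
        switch (onClose) {
            case COMMIT:   commit();   break;
            case ROLLBACK: rollback(); break;
            case MANUAL:   throw new IllegalStateException(
                "transaction must be committed or rolled back explicitly");
        }
    }

    public static void main(String[] args) {
        Tx tx = new Tx();                       // default: commit on close
        tx.close();
        System.out.println("default close -> " + tx.lastAction);

        Tx tx2 = new Tx().onClose(CloseBehavior.ROLLBACK);
        tx2.close();
        System.out.println("rollback close -> " + tx2.lastAction);
    }
}
```

The MANUAL case shows why "manual" is a distinct behavior: closing without an explicit commit() or rollback() is treated as an error rather than silently picking one.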
>>>>>>> > > > > >>>>>>> > > > > The only thing I can think of is that the default for >>>>>>> embedded would >>>>>>> > > have >>>>>>> > > > > to be auto-commit per traversal. In that way it would work >>>>>>> like >>>>>>> > remote >>>>>>> > > > > traversals and for graphs that don't support transactions >>>>>>> like >>>>>>> > > > TinkerGraph. >>>>>>> > > > > Of course, that's the kind of change that will break a lot >>>>>>> of code. >>>>>>> > > Maybe >>>>>>> > > > > we just keep that idea for another version. >>>>>>> > > > > >>>>>>> > > > > On Thu, Mar 18, 2021 at 4:21 AM <[email protected]> >>>>>>> wrote: >>>>>>> > > > > >>>>>>> > > > >> I like the idea of adding transactions for remote >>>>>>> traversals as they >>>>>>> > > > >> close a gap in functionality that we currently have for >>>>>>> remote >>>>>>> > > > traversals. >>>>>>> > > > >> We should just communicate clearly that a simple remote >>>>>>> traversal >>>>>>> > > > already >>>>>>> > > > >> uses a transaction by default, meaning that the server will >>>>>>> roll >>>>>>> > back >>>>>>> > > > any >>>>>>> > > > >> changes if any exception occurs. Users often ask about how >>>>>>> to do >>>>>>> > > > >> transactions with a remote traversal because they don't >>>>>>> know about >>>>>>> > > this >>>>>>> > > > and >>>>>>> > > > >> I'm afraid that we might add even more confusion if we now >>>>>>> add >>>>>>> > > > transactions >>>>>>> > > > >> for remote traversals. Hence why I think that we should >>>>>>> communicate >>>>>>> > > this >>>>>>> > > > >> clearly when we add remote transactions. >>>>>>> > > > >> >>>>>>> > > > >> Getting this to work in the GLVs should be possible but >>>>>>> will require >>>>>>> > > > some >>>>>>> > > > >> effort. 
I think we would have to introduce some >>>>>>> > > > >> "DriverRemoteTransactionTraversal" that doesn't submit the >>>>>>> traversal >>>>>>> > > on >>>>>>> > > > a >>>>>>> > > > >> terminal step but saves it and then submits all saved >>>>>>> traversals >>>>>>> > > > together >>>>>>> > > > >> on close(). >>>>>>> > > > >> This also means that we should directly add an async >>>>>>> version of >>>>>>> > > close(), >>>>>>> > > > >> maybe closeAsync()? (Same for commit() and rollback()) >>>>>>> > > > >> >>>>>>> > > > >> I also like create() better than traversal() because it >>>>>>> would >>>>>>> > confuse >>>>>>> > > me >>>>>>> > > > >> to first start a traversal with traversal() and then also >>>>>>> start a >>>>>>> > > > >> transaction with the same method. >>>>>>> > > > >> >>>>>>> > > > >> -----Original Message----- >>>>>>> > > > >> From: Stephen Mallette <[email protected]> >>>>>>> > > > >> Sent: Thursday, March 18, 2021 00:01 >>>>>>> > > > >> To: [email protected] >>>>>>> > > > >> Subject: Re: [DISCUSS] Remote Transactions >>>>>>> > > > >> >>>>>>> > > > >> One thing I should have noted about tx().create(). The >>>>>>> create() very >>>>>>> > > > well >>>>>>> > > > >> could have been named traversal(), thus the previous >>>>>>> example reading >>>>>>> > > as: >>>>>>> > > > >> >>>>>>> > > > >> g = traversal().withEmbedded(graph) >>>>>>> > > > >> // or >>>>>>> > > > >> g = traversal().withRemote(conn) >>>>>>> > > > >> gtx = g.tx().traversal() >>>>>>> > > > >> gtx.addV('person').iterate() >>>>>>> > > > >> gtx.addV('software').iterate() >>>>>>> > > > >> gtx.close() // alternatively you could explicitly commit() >>>>>>> or >>>>>>> > > rollback() >>>>>>> > > > >> >>>>>>> > > > >> You're basically spawning a new GraphTraversalSource from >>>>>>> the >>>>>>> > > > Transaction >>>>>>> > > > >> object rather than from a Graph or AnonymousTraversal. 
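The "DriverRemoteTransactionTraversal" idea above (buffer traversals instead of submitting on each terminal step, then submit everything together on close, with an async closeAsync()) can be sketched like this. The BufferedTx class and its traversal strings are invented for illustration; they are not a real driver API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch of a buffering transaction: terminal steps don't
// submit immediately; traversals are collected and sent together when the
// transaction closes. closeAsync() returns a future, as proposed above.
public class BufferedTx {
    private final List<String> buffered = new ArrayList<>();
    private final List<String> submitted = new ArrayList<>();

    public BufferedTx add(String traversal) {
        buffered.add(traversal);  // saved, not yet sent to the server
        return this;
    }

    // submit everything buffered as one unit, off the caller's thread
    public CompletableFuture<Integer> closeAsync() {
        return CompletableFuture.supplyAsync(() -> {
            submitted.addAll(buffered);
            buffered.clear();
            return submitted.size();
        });
    }

    public static void main(String[] args) {
        BufferedTx tx = new BufferedTx()
            .add("g.addV('person').iterate()")
            .add("g.addV('software').iterate()");
        int sent = tx.closeAsync().join();  // a blocking close() would just join this
        System.out.println("submitted on close: " + sent);
    }
}
```

One design consequence worth noting: buffering until close() means intermediate reads within the transaction can't see their own writes, which is one reason the thread treats this approach as requiring "some effort" rather than a drop-in change.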
I >>>>>>> chose >>>>>>> > > create() >>>>>>> > > > >> because it felt like this looked weird: >>>>>>> > > > >> >>>>>>> > > > >> g = traversal().withRemote(conn).tx().traversal() >>>>>>> > > > >> >>>>>>> > > > >> which would be a weird thing to do I guess, but just seeing >>>>>>> > > > "traversal()" >>>>>>> > > > >> over and over seemed odd looking and I wanted to >>>>>>> differentiate with >>>>>>> > > the >>>>>>> > > > >> Transaction object even though the methods are identical in >>>>>>> what >>>>>>> they >>>>>>> > > > do. I >>>>>>> > > > >> suppose I also drew inspiration from: >>>>>>> > > > >> >>>>>>> > > > >> Transaction.createThreadedTx() >>>>>>> > > > >> >>>>>>> > > > >> which I think we might consider deprecating. JanusGraph >>>>>>> would simply >>>>>>> > > > >> expose their own Transaction object in the future that has >>>>>>> that >>>>>>> > method >>>>>>> > > > as I >>>>>>> > > > >> imagine folks still use that feature. As far as I know, no >>>>>>> other >>>>>>> > graph >>>>>>> > > > >> implements that functionality and my guess is that no one >>>>>>> will >>>>>>> > likely >>>>>>> > > > do so >>>>>>> > > > >> in the future. >>>>>>> > > > >> >>>>>>> > > > >> Anyway, if you like traversal() better than create() or the >>>>>>> other >>>>>>> > way >>>>>>> > > > >> around, please let me know. >>>>>>> > > > >> >>>>>>> > > > >> On Wed, Mar 17, 2021 at 5:33 PM David Bechberger < >>>>>>> > [email protected] >>>>>>> > > > >>>>>>> > > > >> wrote: >>>>>>> > > > >> >>>>>>> > > > >> > I am in favor of this change, as any idea that unifies >>>>>>> the >>>>>>> > multiple >>>>>>> > > > >> > different ways things work in TP will only make it easier >>>>>>> to learn >>>>>>> > > and >>>>>>> > > > >> > adopt. >>>>>>> > > > >> > >>>>>>> > > > >> > I don't know that I have enough knowledge of the inner >>>>>>> workings of >>>>>>> > > > >> > transactions to know what/if this will cause problems. 
>>>>>>> > > > >> > >>>>>>> > > > >> > Dave >>>>>>> > > > >> > >>>>>>> > > > >> > On Wed, Mar 17, 2021 at 12:57 PM Stephen Mallette >>>>>>> > > > >> > <[email protected]> >>>>>>> > > > >> > wrote: >>>>>>> > > > >> > >>>>>>> > > > >> > > We haven't touched "transactions" since TP3 was >>>>>>> originally >>>>>>> > > designed. >>>>>>> > > > >> > > They have remained a feature for embedded use cases >>>>>>> even in the >>>>>>> > > face >>>>>>> > > > >> > > of the >>>>>>> > > > >> > rise >>>>>>> > > > >> > > of remote graph use cases and result in major >>>>>>> inconsistencies >>>>>>> > that >>>>>>> > > > >> > > really bother users. >>>>>>> > > > >> > > >>>>>>> > > > >> > > As we close on 3.5.0, I figured that perhaps there was >>>>>>> a chance >>>>>>> > to >>>>>>> > > > >> > > do something radical about transactions. Basically, I'd >>>>>>> like >>>>>>> > > > >> > > transactions to work irrespective of remote or embedded >>>>>>> usage >>>>>>> > and >>>>>>> > > > >> > > for them to have the >>>>>>> > > > >> > same >>>>>>> > > > >> > > API when doing so. 
In mulling it over for the last day >>>>>>> or so I >>>>>>> > > had a >>>>>>> > > > >> > > realization that makes me believe that the following is >>>>>>> > possible: >>>>>>> > > > >> > > >>>>>>> > > > >> > > g = traversal().withEmbedded(graph) >>>>>>> > > > >> > > // or >>>>>>> > > > >> > > g = traversal().withRemote(conn) >>>>>>> > > > >> > > gtx = g.tx().create() >>>>>>> > > > >> > > gtx.addV('person').iterate() >>>>>>> > > > >> > > gtx.addV('software').iterate() >>>>>>> > > > >> > > gtx.close() // alternatively you could explicitly >>>>>>> commit() or >>>>>>> > > > >> > > rollback() >>>>>>> > > > >> > > >>>>>>> > > > >> > > // you could still use g for sessionless, but gtx is >>>>>>> "done" >>>>>>> > after >>>>>>> > > // >>>>>>> > > > >> > > you close so you will need to create() a new gtx >>>>>>> instance for a >>>>>>> > // >>>>>>> > > > >> > > fresh transaction assert 2 == g.V().count().next() >>>>>>> > > > >> > > >>>>>>> > > > >> > > Note that the create() method on tx() is the new bit of >>>>>>> > necessary >>>>>>> > > > >> syntax. >>>>>>> > > > >> > > Technically, it is a do-nothing for embedded mode (and >>>>>>> you could >>>>>>> > > > >> > > skip it for thread-local auto-transactions) but all the >>>>>>> > > > >> > > documentation can be shifted so that the API is >>>>>>> identical to >>>>>>> > > remote. >>>>>>> > > > >> > > The change would be non-breaking as the embedded >>>>>>> transaction >>>>>>> > > > >> > > approach would remain as it is, but would no longer be >>>>>>> > documented >>>>>>> > > as >>>>>>> > > > >> > > the preferred approach. Perhaps we could one day >>>>>>> disallow it but >>>>>>> > > > >> > > it's a bit of a tangle of ThreadLocal that I'm not sure >>>>>>> we want >>>>>>> > to >>>>>>> > > > >> touch in TP3. 
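The gtx lifecycle shown above (spawn a transaction-scoped source with g.tx().create(), do work, then close) can be modeled with a toy sketch. This is not the TinkerPop implementation; the class and field names are invented to show how a remote Transaction can hold session state that a new traversal source is bound to.

```java
import java.util.UUID;

// Toy model of the transaction-scoped traversal source idea: the Transaction
// spawns a new source whose requests all carry the same session id. All names
// here are illustrative, not the actual TinkerPop types.
public class RemoteTx {
    static class TraversalSource {
        final String sessionId;                     // null means sessionless
        TraversalSource(String sessionId) { this.sessionId = sessionId; }
        TraversalSource() { this(null); }
        Transaction tx() { return new Transaction(); }
    }

    static class Transaction {
        private final String sessionId = UUID.randomUUID().toString();
        private boolean open = true;

        // create() spawns a source bound to this transaction's session
        TraversalSource create() { return new TraversalSource(sessionId); }
        boolean isOpen() { return open; }
        void commit() { open = false; }  // remotely this would send a commit message
    }

    public static void main(String[] args) {
        TraversalSource g = new TraversalSource();  // like traversal().withRemote(conn)
        Transaction tx = g.tx();
        TraversalSource gtx = tx.create();          // gtx requests share one session
        System.out.println("bound to session: " + (gtx.sessionId != null));
        System.out.println("open before commit: " + tx.isOpen());
        tx.commit();
        System.out.println("open after commit: " + tx.isOpen());
    }
}
```

The point of the model is the asymmetry the email describes: g stays usable for sessionless work, while gtx is "done" once its transaction closes and a fresh create() is needed for a new transaction.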
>>>>>>> > > > >> > > What happens behind the scenes of g.tx().create() is that in >>>>>>> > > remote >>>>>>> > > > >> > > cases the RemoteConnection constructs a remote >>>>>>> Transaction >>>>>>> > > > >> > > implementation which basically is used to spawn new >>>>>>> traversal >>>>>>> > > source >>>>>>> > > > >> > > instances with session-based connections. The remote >>>>>>> > Transaction >>>>>>> > > > >> > > object can't support >>>>>>> > > > >> > transaction >>>>>>> > > > >> > > listeners or special read-write/close configurations, >>>>>>> but I >>>>>>> > don't >>>>>>> > > > >> > > think that's a problem as those things don't make a lot >>>>>>> of sense >>>>>>> > > in >>>>>>> > > > >> > > remote use cases. >>>>>>> > > > >> > > >>>>>>> > > > >> > > From the server perspective, this change would also >>>>>>> mean that >>>>>>> > > > >> > > sessions would have to accept bytecode and not just >>>>>>> scripts, >>>>>>> which >>>>>>> > > > >> > > technically shouldn't be a problem. >>>>>>> > > > >> > > >>>>>>> > > > >> > > One downside is that I'm thinking mostly >>>>>>> in Java >>>>>>> at >>>>>>> > > > >> > > the moment. I'm not sure how this all works in other >>>>>>> variants >>>>>>> just >>>>>>> > > > >> > > yet, but >>>>>>> > > > >> > I'd >>>>>>> > > > >> > > hope to keep similar syntax. >>>>>>> > > > >> > > >>>>>>> > > > >> > > I will keep experimenting tomorrow. If you have any >>>>>>> thoughts on >>>>>>> > > the >>>>>>> > > > >> > matter, >>>>>>> > > > >> > > I'd be happy to hear them. Thanks!
