Re: [DISCUSS] Making Transaction.readWrite() changes ThreadLocal

Dylan Millikin Tue, 13 Oct 2015 01:23:10 -0700

> I never write my Gremlin with manual transactions so it feels wrong to me,
> but if someone did a mid-commit in their sessionless submitted script
they'd
> be in MANUAL and would have to open a transaction.


It's funny I feel slightly different about this. I find it weird that if I
committed mid query, another transaction would be implicitly opened (you
need understanding of MANUAL vs AUTO to grasp what is going on and that is
-from what I've seen- beyond the knowledge of most driver users).
I would think that if you committed a transaction in a sessionless request
you had knowledge that the transaction was opened at some point. If you
forgot to reopen a transaction the error message should be explicit enough.
With that said it's still a little awkward because you wouldn't need to
commit it.... weird

Just to make sure we have the bigger picture. It might be worth keeping in
mind that transactions are generally handled on the driver layer. I'm not
sure how often a driver user would manually commit mid-sessionless or do
any other kind of transaction management in a sessionless request (as
opposed to using the driver's transaction management).

I'm not really pushing for this. Just wanted to be thorough.

Making the configuration ThreadLocal would definitely work, and seeing how
crippling this can be for gremlin-server users I'd push for it. But at this
point in time it's a little hard for me to grasp what the implications are
for implementation providers.


On Mon, Oct 12, 2015 at 6:10 PM, Stephen Mallette <[email protected]>
wrote:

> > I'm not all that familiar with the internals and how mutations of the
> first graph would be passed to the second but apart from that it sounds
> like it could work.
>
> hmm - doesn't matter - as i think of it more closely it doesn't work
> uniformly across all graphs.  won't work with neo4j, but would work with
> titan, for example.
>
> > Is there a reason why AUTO is the default for gremlin-server?
>
> that's a good point.  if it was all MANUAL then it makes sense that it
> would work.  of course, the default of AUTO isn't coming from Gremlin
> Server - that is a function of gremlin-core and enforced by the test
> suite.  Gremlin Server could override that to be MANUAL whenever
> transactions are supported, but i don't know if i like that idea really.  I
> never write my Gremlin with manual transactions so it feels wrong to me,
> but if someone did a mid-commit in their sessionles submitted script they'd
> be in MANUAL and would have to open a transaction.
>
> maybe we don't have an "OR" situation here.  we may only have one choice to
> settle this:
>
> > we go with the original proposal to just make that configuration
> ThreadLocal.
>
>
> On Mon, Oct 12, 2015 at 11:21 AM, Dylan Millikin <[email protected]
> >
> wrote:
>
> > Interesting talk!
> >
> > I've been testing things today and decided to take a deeper look at the
> > code. I have to say that I've also gone off the idea of making
> Transactions
> > ThreadLocal. It looks like it would work fine but if some issue arose it
> > would just end up being more maintenance than it's worth. And that's
> often
> > a good sign that it's not the best design.
> >
> > > In other words, in Gremlin Server the user would configure two graphs
> one
> > > that was in manual mode and one that was in auto mode.  Dylan, that
> would
> > > solve the problem you came across?  You would use rebindings to be sure
> > > that your query could always be written with a "g", but in Gremlin
> Server
> > > have two distinct graphs called "manualG" and "autoG" or whatever.  Use
> > the
> > > manualG in sessionful communication and use "autoG" for sessionless.
> >
> > I'm not all that familiar with the internals and how mutations of the
> first
> > graph would be passed to the second but apart from that it sounds like it
> > could work. This is a lot of stress on the end user though, perhaps it's
> > something gremlin-server could do in the background (seeing as it will be
> > necessary every time a client transaction is required)?
> >
> > I thought of something else as well, and was wondering. Is there a reason
> > why AUTO is the default for gremlin-server? Seeing as the
> > AbstractOpProcessor is already capable of managing commit and rollback of
> > transactions could we not extend this to also opening sessions? In which
> > case we would set MANUAL on all graphs as the default for gremlin-server
> > and it -should- solve these issues.
> > I don't know if there's a way of working with GremlinExecutor.eval() to
> > support this right now but maybe this could be another direction to look
> > into.
> >
> >
> >
> >
> > On Mon, Oct 12, 2015 at 1:59 PM, Stephen Mallette <[email protected]>
> > wrote:
> >
> > > I think Bryn answered your question Matt.
> > >
> > > I know Matt was just trying to understand the Transaction model here so
> > we
> > > got a bit away from the original point, but just to bring us back
> there:
> > >  We have transactions bound to ThreadLocal, but the manual/auto
> > transaction
> > > configuration is bound globally to the Graph. Without any changes at
> all,
> > > we could do what Matt suggested:
> > >
> > > > Could a client simply instantiate multiple Graph objects to maintain
> > > different
> > > ways of interacting with the underlying graph?
> > >
> > > In other words, in Gremlin Server the user would configure two graphs
> one
> > > that was in manual mode and one that was in auto mode.  Dylan, that
> would
> > > solve the problem you came across?  You would use rebindings to be sure
> > > that your query could always be written with a "g", but in Gremlin
> Server
> > > have two distinct graphs called "manualG" and "autoG" or whatever.  Use
> > the
> > > manualG in sessionful communication and use "autoG" for sessionless.
> > >
> > > OR
> > >
> > > we go with the original proposal to just make that configuration
> > > ThreadLocal.  As I think about it now, I kinda prefer not introducing
> > more
> > > ThreadLocal stuff and keeping the configuration global to the Graph
> > > instance.
> > >
> > > > However, it would be such a major change..... Perhaps this could be
> > > considered for TP4?
> > >
> > > whoever said there would be a TP4? :)
> > >
> > > I don't mean to stifle conversation on the general Transaction model -
> I
> > > would think it is a change beyond TP3 unless someone has something
> really
> > > smart that nicely retrofits the existing model.  If so, please begin
> new
> > > thread for that discussion.  We may yet need to talk about transactions
> > > more though as I'm not sure they are completely well connected to the
> > > TraversalSource.
> > >
> > >
> > >
> > > On Mon, Oct 12, 2015 at 4:59 AM, Bryn Cooke <[email protected]>
> wrote:
> > >
> > > > Transaction isn't even quite that, it is the controller for the
> > 'current'
> > > > transaction and also a builder for thread independent transactions.
> > > Current
> > > > transaction may or may not refer to a thread local graph.
> > > >
> > > > I agree that this is confusing, mostly because it bucks the trend of
> > > using
> > > > the connection factory to handle shared state between connections.
> > > >
> > > > Matt is correct that it would be much simpler if every thread that
> > access
> > > > to a graph should just obtain a new Graph instance from a
> GraphFactory
> > > > instance. Creating a GraphFactory should be expensive, but creating
> > > > multiple graphs from that factory should be cheap. You could then get
> > rid
> > > > of the Transaction interface altogether and move commit() and
> > rollback()
> > > > back on to the graph, all thread local stuff can go and every graph
> > > becomes
> > > > thread independent.
> > > >
> > > >
> > > > //The graph factory instance will handle sharing resources. The
> > concrete
> > > > implementation of the factory will be provided by the graph provider.
> > > > GraphFactory gf = GraphFactory.open(props);
> > > >
> > > > //These graphs will all point to the same underlying data.
> > > > Graph g1 = gf.create();
> > > > Graph g2 = gf.create();
> > > > Graph g3 = gf.create();
> > > >
> > > > g1.commit();
> > > > g2.commit();
> > > > g3.commit();
> > > >
> > > >
> > > > This way if the client wants to stash a graph in a thread local then
> > they
> > > > are free to do it under their own terms. We don't interfere with the
> > > > application threading model.
> > > >
> > > >
> > > > However, it would be such a major change..... Perhaps this could be
> > > > considered for TP4?
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On 10/10/15 16:53, Matt Frantz wrote:
> > > >
> > > >> If the threading model is well-defined, then ThreadLocal can be a
> > > >> convenient way to inject state.  For example, if you have a pool of
> > > >> long-lived worker threads, then each worker thread can store state
> > > related
> > > >> to shared objects in ThreadLocal.  However, if the threading model
> is
> > > less
> > > >> well-defined or there is no clear affinity between threads and the
> > > shared
> > > >> objects, then ThreadLocal can be awkward for some situations, such
> as
> > > the
> > > >> ones you point out where multiple threads collaborate on a
> > Transaction.
> > > >>
> > > >> I'm trying to uncover the requirements here, and understand why the
> > > >> Transaction.onReadWrite state cannot simply manifest in the
> > Transaction
> > > >> object itself, as it appears to do in my read of the Transaction
> > class.
> > > >> When you call graph.tx().onReadWrite(X), is the intent to affect one
> > > >> Transaction or all Transactions from that point onward [in the
> > caller's
> > > >> thread]?
> > > >>
> > > >> [reads code]
> > > >>
> > > >> Ah, now that I see Graph.tx()'s Javadoc, it appears I misunderstood
> > what
> > > >> Transaction means.  Transaction is not a transaction, but is
> something
> > > >> more
> > > >> like a Transaction builder.  Is there an interface that represents a
> > > >> single
> > > >> transaction?
> > > >>
> > > >> Graph might therefore be better understood as a conflation of the
> > graph
> > > >> itself and a way of (==set of options for) interacting with the
> graph.
> > > >>
> > > >> Could a client simply instantiate multiple Graph objects to maintain
> > > >> different ways of interacting with the underlying graph?
> > > >>
> > > >> Could we cleave the two aspects of Graph into Graph and
> > > >> TransactionBuilder?
> > > >>
> > > >>
> > > >> On Sat, Oct 10, 2015 at 4:36 AM, Stephen Mallette <
> > [email protected]
> > > >
> > > >> wrote:
> > > >>
> > > >> It sounds like you're proposing a bigger picture change to
> > transactions
> > > >>> than just what this particular issue is about. Just to be clear,
> > Dylan
> > > >>> and
> > > >>> I were trying to come to conclusion on a specific problem with
> > > >>> Transactions
> > > >>> within our existing model.  For that, I'm not sure I see a way
> > outside
> > > of
> > > >>> ThreadLocal for that issue.  If i'm wrong in understanding your
> > > proposal
> > > >>> perhaps you need to add more details.
> > > >>>
> > > >>> More generally speaking, in terms of the bigger picture,
> ThreadLocal
> > > has
> > > >>> a
> > > >>> convenience for transactions but also is a genuine pain for a lot
> of
> > > >>> applications (server applications).  Perhaps you could expand your
> > > >>> thoughts
> > > >>> a bit as you might bring a new perspective on how to do this stuff.
> > Of
> > > >>> course, i would imagine it pretty hard to change our flow on that
> at
> > > this
> > > >>> point, but I suppose it doesn't hurt to talk about it.
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Fri, Oct 9, 2015 at 5:58 PM, Matt Frantz <
> > > [email protected]>
> > > >>> wrote:
> > > >>>
> > > >>> Sometimes, use of thread-local storage is a code smell.  It's a
> > > "global"
> > > >>>> variable from the standpoint of the thread.  Assuming in this case
> > > that
> > > >>>>
> > > >>> the
> > > >>>
> > > >>>> Transaction object itself is not the appropriate place for this
> > state,
> > > >>>>
> > > >>> then
> > > >>>
> > > >>>> is there something like a TransactionFactory abstraction that
> could
> > > >>>> encapsulate the context that is being stored in ThreadLocal, and
> > then
> > > >>>> provided explicitly to the txn?
> > > >>>>
> > > >>>> On Fri, Oct 9, 2015 at 10:54 AM, Stephen Mallette <
> > > [email protected]
> > > >>>> >
> > > >>>> wrote:
> > > >>>>
> > > >>>> Graphs that support transactions in TinkerPop (going back to
> > > >>>>>
> > > >>>> Blueprints)
> > > >>>
> > > >>>> have always implemented as ThreadLocal, meaning the transaction is
> > > >>>>>
> > > >>>> bound
> > > >>>
> > > >>>> to
> > > >>>>
> > > >>>>> the thread it is executed in.  Some graphs, those that can
> support
> > > it,
> > > >>>>>
> > > >>>> can
> > > >>>>
> > > >>>>> break out of that model with a "threaded transaction":
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>
> > >
> >
> http://tinkerpop.incubator.apache.org/docs/3.0.1-incubating/#_threaded_transactions
> > > >>>
> > > >>>> In this case multiple threads can operate on the same transaction.
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> On Fri, Oct 9, 2015 at 1:43 PM, Matt Frantz <
> > > >>>>>
> > > >>>> [email protected]>
> > > >>>
> > > >>>> wrote:
> > > >>>>>
> > > >>>>> This is the first time I've noticed a reference to ThreadLocal
> in a
> > > >>>>>>
> > > >>>>> TP
> > > >>>
> > > >>>> discussion, so if you could share a bit of background, I'd
> > appreciate
> > > >>>>>>
> > > >>>>> it.
> > > >>>>
> > > >>>>> What is the concurrency model for TP from the client standpoint?
> > > >>>>>>
> > > >>>>> When
> > > >>>
> > > >>>> you
> > > >>>>>
> > > >>>>>> say "thread local", do you mean the client thread?  Does this
> > > >>>>>>
> > > >>>>> behavior
> > > >>>
> > > >>>> port
> > > >>>>>
> > > >>>>>> across other client libraries, especially threadless
> environments
> > > >>>>>>
> > > >>>>> like
> > > >>>
> > > >>>> JavaScript?
> > > >>>>>>
> > > >>>>>> I understand API's that affect transactions, but not ones that
> > > affect
> > > >>>>>> threads.  Can't this be a Transaction API?
> > > >>>>>>
> > > >>>>>> On Fri, Oct 9, 2015 at 7:32 AM, Stephen Mallette <
> > > >>>>>>
> > > >>>>> [email protected]
> > > >>>
> > > >>>> wrote:
> > > >>>>>>
> > > >>>>>> Not sure why, but from the beginning I'd thought of this
> settings
> > > >>>>>>>
> > > >>>>>> as
> > > >>>
> > > >>>> being
> > > >>>>>>
> > > >>>>>>> global to the Graph instance, but framed in this context, it
> > > >>>>>>>
> > > >>>>>> doesn't
> > > >>>
> > > >>>> seem
> > > >>>>>
> > > >>>>>> right (in any context).  I agree with Dylan that this setting
> > > >>>>>>>
> > > >>>>>> should
> > > >>>
> > > >>>> be
> > > >>>>
> > > >>>>> ThreadLocal.
> > > >>>>>>>
> > > >>>>>>> On Fri, Oct 9, 2015 at 8:58 AM, Dylan Millikin <
> > > >>>>>>>
> > > >>>>>> [email protected]
> > > >>>>>
> > > >>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>> Hey guys,
> > > >>>>>>>>
> > > >>>>>>>> It's become apparent from the discussion in the following
> issue
> > > >>>>>>>>
> > > >>>>>>> that
> > > >>>>
> > > >>>>> there
> > > >>>>>>>
> > > >>>>>>>> are some limitations with making the transactional
> > > >>>>>>>>
> > > >>>>>>> READ_WRITE_BEHAVIOR
> > > >>>>>
> > > >>>>>> global to all graph operations  (
> > > >>>>>>>> https://issues.apache.org/jira/browse/TINKERPOP3-875 )
> > > >>>>>>>>
> > > >>>>>>>> With gremlin-server as an example this implies that:
> > > >>>>>>>> If you issue
> > > >>>>>>>>
> > > >>>>>>> graph.tx().onReadWrite(Transaction.READ_WRITE_BEHAVIOR.MANUAL)
> > > >>>>>>>
> > > >>>>>>>> (as is automatic when you use the sessionOpProcessor) every
> > > >>>>>>>>
> > > >>>>>>> subsequent
> > > >>>>>
> > > >>>>>> request will need to handle transactions manually. I think this
> > > >>>>>>>>
> > > >>>>>>> also
> > > >>>>
> > > >>>>> includes other clients that have no idea this operation was
> > > >>>>>>>>
> > > >>>>>>> performed.
> > > >>>>>
> > > >>>>>> The side effect of this is that sessionless requests
> > > >>>>>>>>
> > > >>>>>>> (StandardOpProcessor)
> > > >>>>>>>
> > > >>>>>>>> that should be automatically committed, break.
> > > >>>>>>>>
> > > >>>>>>>> Seeing as fixing this would be a breaking change, let's
> discuss
> > > >>>>>>>>
> > > >>>>>>> pros
> > > >>>>
> > > >>>>> and
> > > >>>>>>
> > > >>>>>>> cons and depending on how things go, create another JIRA ticket
> > > >>>>>>>>
> > > >>>>>>> (the
> > > >>>>
> > > >>>>> one
> > > >>>>>>
> > > >>>>>>> linked earlier is being used for another issue)
> > > >>>>>>>>
> > > >>>>>>>> Best,
> > > >>>>>>>> Dylan
> > > >>>>>>>>
> > > >>>>>>>>
> > > >
> > >
> >
>

Re: [DISCUSS] Making Transaction.readWrite() changes ThreadLocal

Reply via email to