Re: [DISCUSS] Making Transaction.readWrite() changes ThreadLocal

Dylan Millikin Tue, 13 Oct 2015 05:05:36 -0700

Fair enough. I just wanted to put that out there for full disclosure. I'm
good with going the AUTO way.


I was waiting to see what direction this post was taking for that issue. I
guess that if we make this a ThreadLocal setting then the previous change
discussed there isn't necessary anymore? I'll post within the JIRA to keep
the info centralized.

On Tue, Oct 13, 2015 at 1:09 PM, Stephen Mallette <[email protected]>
wrote:

> I've never used MANUAL transaction in any application I've ever built on
> TinkerPop - so speaking for myself - i find it hard to reconcile other
> default behavior.  Plus the Gremlin Console works AUTO and every example is
> essentially AUTO.  MANUAL sorta goes against the grain.  It's offered as an
> option to those who are more comfortable with that way of doing things
> (which it seems you are).
>
> >  It might be worth keeping in mind that transactions are generally
> handled on the driver layer. I'm not sure how often a driver user would
> manually commit mid-sessionless or do any other kind of transaction
> management in a sessionless request (as opposed to using the driver's
> transaction management).
>
> I've heard tale of folks sending fairly transaction-complex scripts to
> Rexster on sessionless requests.  Like someone will generate a reasonably
> large script with transactions and send that to the server for processing.
> Not sure that is a great idea to do, but since folks are doing it, I guess
> it is a use case to protect?  or perhaps it's enabling bad patterns?
>  ...bah...another discussion
>
> anyway, given that there haven't been other thoughts on this specific
> issue, perhaps a ThreadLocal setting for this is what is in order.  It will
> solve the problem, but it just means that this setting is not an
> initialization anymore.  it is a setting that must be made in each thread
> that uses it.
>
> dylan, do you still want to take a stab at this one on:
> https://issues.apache.org/jira/browse/TINKERPOP3-875
>
>
>
>
>
>
> On Tue, Oct 13, 2015 at 4:21 AM, Dylan Millikin <[email protected]>
> wrote:
>
> > > I never write my Gremlin with manual transactions so it feels wrong to
> > me,
> > > but if someone did a mid-commit in their sessionless submitted script
> > they'd
> > > be in MANUAL and would have to open a transaction.
> >
> > It's funny I feel slightly different about this. I find it weird that if
> I
> > committed mid query, another transaction would be implicitly opened (you
> > need understanding of MANUAL vs AUTO to grasp what is going on and that
> is
> > -from what I've seen- beyond the knowledge of most driver users).
> > I would think that if you committed a transaction in a sessionless
> request
> > you had knowledge that the transaction was opened at some point. If you
> > forgot to reopen a transaction the error message should be explicit
> enough.
> > With that said it's still a little awkward because you wouldn't need to
> > commit it.... weird
> >
> > Just to make sure we have the bigger picture. It might be worth keeping
> in
> > mind that transactions are generally handled on the driver layer. I'm not
> > sure how often a driver user would manually commit mid-sessionless or do
> > any other kind of transaction management in a sessionless request (as
> > opposed to using the driver's transaction management).
> >
> > I'm not really pushing for this. Just wanted to be thorough.
> >
> > Making the configuration ThreadLocal would definitely work, and seeing
> how
> > crippling this can be for gremlin-server users I'd push for it. But at
> this
> > point in time it's a little hard for me to grasp what the implications
> are
> > for implementation providers.
> >
> >
> > On Mon, Oct 12, 2015 at 6:10 PM, Stephen Mallette <[email protected]>
> > wrote:
> >
> > > > I'm not all that familiar with the internals and how mutations of the
> > > first graph would be passed to the second but apart from that it sounds
> > > like it could work.
> > >
> > > hmm - doesn't matter - as i think of it more closely it doesn't work
> > > uniformly across all graphs.  won't work with neo4j, but would work
> with
> > > titan, for example.
> > >
> > > > Is there a reason why AUTO is the default for gremlin-server?
> > >
> > > that's a good point.  if it was all MANUAL then it makes sense that it
> > > would work.  of course, the default of AUTO isn't coming from Gremlin
> > > Server - that is a function of gremlin-core and enforced by the test
> > > suite.  Gremlin Server could override that to be MANUAL whenever
> > > transactions are supported, but i don't know if i like that idea
> > really.  I
> > > never write my Gremlin with manual transactions so it feels wrong to
> me,
> > > but if someone did a mid-commit in their sessionles submitted script
> > they'd
> > > be in MANUAL and would have to open a transaction.
> > >
> > > maybe we don't have an "OR" situation here.  we may only have one
> choice
> > to
> > > settle this:
> > >
> > > > we go with the original proposal to just make that configuration
> > > ThreadLocal.
> > >
> > >
> > > On Mon, Oct 12, 2015 at 11:21 AM, Dylan Millikin <
> > [email protected]
> > > >
> > > wrote:
> > >
> > > > Interesting talk!
> > > >
> > > > I've been testing things today and decided to take a deeper look at
> the
> > > > code. I have to say that I've also gone off the idea of making
> > > Transactions
> > > > ThreadLocal. It looks like it would work fine but if some issue arose
> > it
> > > > would just end up being more maintenance than it's worth. And that's
> > > often
> > > > a good sign that it's not the best design.
> > > >
> > > > > In other words, in Gremlin Server the user would configure two
> graphs
> > > one
> > > > > that was in manual mode and one that was in auto mode.  Dylan, that
> > > would
> > > > > solve the problem you came across?  You would use rebindings to be
> > sure
> > > > > that your query could always be written with a "g", but in Gremlin
> > > Server
> > > > > have two distinct graphs called "manualG" and "autoG" or whatever.
> > Use
> > > > the
> > > > > manualG in sessionful communication and use "autoG" for
> sessionless.
> > > >
> > > > I'm not all that familiar with the internals and how mutations of the
> > > first
> > > > graph would be passed to the second but apart from that it sounds
> like
> > it
> > > > could work. This is a lot of stress on the end user though, perhaps
> > it's
> > > > something gremlin-server could do in the background (seeing as it
> will
> > be
> > > > necessary every time a client transaction is required)?
> > > >
> > > > I thought of something else as well, and was wondering. Is there a
> > reason
> > > > why AUTO is the default for gremlin-server? Seeing as the
> > > > AbstractOpProcessor is already capable of managing commit and
> rollback
> > of
> > > > transactions could we not extend this to also opening sessions? In
> > which
> > > > case we would set MANUAL on all graphs as the default for
> > gremlin-server
> > > > and it -should- solve these issues.
> > > > I don't know if there's a way of working with GremlinExecutor.eval()
> to
> > > > support this right now but maybe this could be another direction to
> > look
> > > > into.
> > > >
> > > >
> > > >
> > > >
> > > > On Mon, Oct 12, 2015 at 1:59 PM, Stephen Mallette <
> > [email protected]>
> > > > wrote:
> > > >
> > > > > I think Bryn answered your question Matt.
> > > > >
> > > > > I know Matt was just trying to understand the Transaction model
> here
> > so
> > > > we
> > > > > got a bit away from the original point, but just to bring us back
> > > there:
> > > > >  We have transactions bound to ThreadLocal, but the manual/auto
> > > > transaction
> > > > > configuration is bound globally to the Graph. Without any changes
> at
> > > all,
> > > > > we could do what Matt suggested:
> > > > >
> > > > > > Could a client simply instantiate multiple Graph objects to
> > maintain
> > > > > different
> > > > > ways of interacting with the underlying graph?
> > > > >
> > > > > In other words, in Gremlin Server the user would configure two
> graphs
> > > one
> > > > > that was in manual mode and one that was in auto mode.  Dylan, that
> > > would
> > > > > solve the problem you came across?  You would use rebindings to be
> > sure
> > > > > that your query could always be written with a "g", but in Gremlin
> > > Server
> > > > > have two distinct graphs called "manualG" and "autoG" or whatever.
> > Use
> > > > the
> > > > > manualG in sessionful communication and use "autoG" for
> sessionless.
> > > > >
> > > > > OR
> > > > >
> > > > > we go with the original proposal to just make that configuration
> > > > > ThreadLocal.  As I think about it now, I kinda prefer not
> introducing
> > > > more
> > > > > ThreadLocal stuff and keeping the configuration global to the Graph
> > > > > instance.
> > > > >
> > > > > > However, it would be such a major change..... Perhaps this could
> be
> > > > > considered for TP4?
> > > > >
> > > > > whoever said there would be a TP4? :)
> > > > >
> > > > > I don't mean to stifle conversation on the general Transaction
> model
> > -
> > > I
> > > > > would think it is a change beyond TP3 unless someone has something
> > > really
> > > > > smart that nicely retrofits the existing model.  If so, please
> begin
> > > new
> > > > > thread for that discussion.  We may yet need to talk about
> > transactions
> > > > > more though as I'm not sure they are completely well connected to
> the
> > > > > TraversalSource.
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Oct 12, 2015 at 4:59 AM, Bryn Cooke <[email protected]>
> > > wrote:
> > > > >
> > > > > > Transaction isn't even quite that, it is the controller for the
> > > > 'current'
> > > > > > transaction and also a builder for thread independent
> transactions.
> > > > > Current
> > > > > > transaction may or may not refer to a thread local graph.
> > > > > >
> > > > > > I agree that this is confusing, mostly because it bucks the trend
> > of
> > > > > using
> > > > > > the connection factory to handle shared state between
> connections.
> > > > > >
> > > > > > Matt is correct that it would be much simpler if every thread
> that
> > > > access
> > > > > > to a graph should just obtain a new Graph instance from a
> > > GraphFactory
> > > > > > instance. Creating a GraphFactory should be expensive, but
> creating
> > > > > > multiple graphs from that factory should be cheap. You could then
> > get
> > > > rid
> > > > > > of the Transaction interface altogether and move commit() and
> > > > rollback()
> > > > > > back on to the graph, all thread local stuff can go and every
> graph
> > > > > becomes
> > > > > > thread independent.
> > > > > >
> > > > > >
> > > > > > //The graph factory instance will handle sharing resources. The
> > > > concrete
> > > > > > implementation of the factory will be provided by the graph
> > provider.
> > > > > > GraphFactory gf = GraphFactory.open(props);
> > > > > >
> > > > > > //These graphs will all point to the same underlying data.
> > > > > > Graph g1 = gf.create();
> > > > > > Graph g2 = gf.create();
> > > > > > Graph g3 = gf.create();
> > > > > >
> > > > > > g1.commit();
> > > > > > g2.commit();
> > > > > > g3.commit();
> > > > > >
> > > > > >
> > > > > > This way if the client wants to stash a graph in a thread local
> > then
> > > > they
> > > > > > are free to do it under their own terms. We don't interfere with
> > the
> > > > > > application threading model.
> > > > > >
> > > > > >
> > > > > > However, it would be such a major change..... Perhaps this could
> be
> > > > > > considered for TP4?
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On 10/10/15 16:53, Matt Frantz wrote:
> > > > > >
> > > > > >> If the threading model is well-defined, then ThreadLocal can be
> a
> > > > > >> convenient way to inject state.  For example, if you have a pool
> > of
> > > > > >> long-lived worker threads, then each worker thread can store
> state
> > > > > related
> > > > > >> to shared objects in ThreadLocal.  However, if the threading
> model
> > > is
> > > > > less
> > > > > >> well-defined or there is no clear affinity between threads and
> the
> > > > > shared
> > > > > >> objects, then ThreadLocal can be awkward for some situations,
> such
> > > as
> > > > > the
> > > > > >> ones you point out where multiple threads collaborate on a
> > > > Transaction.
> > > > > >>
> > > > > >> I'm trying to uncover the requirements here, and understand why
> > the
> > > > > >> Transaction.onReadWrite state cannot simply manifest in the
> > > > Transaction
> > > > > >> object itself, as it appears to do in my read of the Transaction
> > > > class.
> > > > > >> When you call graph.tx().onReadWrite(X), is the intent to affect
> > one
> > > > > >> Transaction or all Transactions from that point onward [in the
> > > > caller's
> > > > > >> thread]?
> > > > > >>
> > > > > >> [reads code]
> > > > > >>
> > > > > >> Ah, now that I see Graph.tx()'s Javadoc, it appears I
> > misunderstood
> > > > what
> > > > > >> Transaction means.  Transaction is not a transaction, but is
> > > something
> > > > > >> more
> > > > > >> like a Transaction builder.  Is there an interface that
> > represents a
> > > > > >> single
> > > > > >> transaction?
> > > > > >>
> > > > > >> Graph might therefore be better understood as a conflation of
> the
> > > > graph
> > > > > >> itself and a way of (==set of options for) interacting with the
> > > graph.
> > > > > >>
> > > > > >> Could a client simply instantiate multiple Graph objects to
> > maintain
> > > > > >> different ways of interacting with the underlying graph?
> > > > > >>
> > > > > >> Could we cleave the two aspects of Graph into Graph and
> > > > > >> TransactionBuilder?
> > > > > >>
> > > > > >>
> > > > > >> On Sat, Oct 10, 2015 at 4:36 AM, Stephen Mallette <
> > > > [email protected]
> > > > > >
> > > > > >> wrote:
> > > > > >>
> > > > > >> It sounds like you're proposing a bigger picture change to
> > > > transactions
> > > > > >>> than just what this particular issue is about. Just to be
> clear,
> > > > Dylan
> > > > > >>> and
> > > > > >>> I were trying to come to conclusion on a specific problem with
> > > > > >>> Transactions
> > > > > >>> within our existing model.  For that, I'm not sure I see a way
> > > > outside
> > > > > of
> > > > > >>> ThreadLocal for that issue.  If i'm wrong in understanding your
> > > > > proposal
> > > > > >>> perhaps you need to add more details.
> > > > > >>>
> > > > > >>> More generally speaking, in terms of the bigger picture,
> > > ThreadLocal
> > > > > has
> > > > > >>> a
> > > > > >>> convenience for transactions but also is a genuine pain for a
> lot
> > > of
> > > > > >>> applications (server applications).  Perhaps you could expand
> > your
> > > > > >>> thoughts
> > > > > >>> a bit as you might bring a new perspective on how to do this
> > stuff.
> > > > Of
> > > > > >>> course, i would imagine it pretty hard to change our flow on
> that
> > > at
> > > > > this
> > > > > >>> point, but I suppose it doesn't hurt to talk about it.
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>> On Fri, Oct 9, 2015 at 5:58 PM, Matt Frantz <
> > > > > [email protected]>
> > > > > >>> wrote:
> > > > > >>>
> > > > > >>> Sometimes, use of thread-local storage is a code smell.  It's a
> > > > > "global"
> > > > > >>>> variable from the standpoint of the thread.  Assuming in this
> > case
> > > > > that
> > > > > >>>>
> > > > > >>> the
> > > > > >>>
> > > > > >>>> Transaction object itself is not the appropriate place for
> this
> > > > state,
> > > > > >>>>
> > > > > >>> then
> > > > > >>>
> > > > > >>>> is there something like a TransactionFactory abstraction that
> > > could
> > > > > >>>> encapsulate the context that is being stored in ThreadLocal,
> and
> > > > then
> > > > > >>>> provided explicitly to the txn?
> > > > > >>>>
> > > > > >>>> On Fri, Oct 9, 2015 at 10:54 AM, Stephen Mallette <
> > > > > [email protected]
> > > > > >>>> >
> > > > > >>>> wrote:
> > > > > >>>>
> > > > > >>>> Graphs that support transactions in TinkerPop (going back to
> > > > > >>>>>
> > > > > >>>> Blueprints)
> > > > > >>>
> > > > > >>>> have always implemented as ThreadLocal, meaning the
> transaction
> > is
> > > > > >>>>>
> > > > > >>>> bound
> > > > > >>>
> > > > > >>>> to
> > > > > >>>>
> > > > > >>>>> the thread it is executed in.  Some graphs, those that can
> > > support
> > > > > it,
> > > > > >>>>>
> > > > > >>>> can
> > > > > >>>>
> > > > > >>>>> break out of that model with a "threaded transaction":
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>
> > > > >
> > > >
> > >
> >
> http://tinkerpop.incubator.apache.org/docs/3.0.1-incubating/#_threaded_transactions
> > > > > >>>
> > > > > >>>> In this case multiple threads can operate on the same
> > transaction.
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>> On Fri, Oct 9, 2015 at 1:43 PM, Matt Frantz <
> > > > > >>>>>
> > > > > >>>> [email protected]>
> > > > > >>>
> > > > > >>>> wrote:
> > > > > >>>>>
> > > > > >>>>> This is the first time I've noticed a reference to
> ThreadLocal
> > > in a
> > > > > >>>>>>
> > > > > >>>>> TP
> > > > > >>>
> > > > > >>>> discussion, so if you could share a bit of background, I'd
> > > > appreciate
> > > > > >>>>>>
> > > > > >>>>> it.
> > > > > >>>>
> > > > > >>>>> What is the concurrency model for TP from the client
> > standpoint?
> > > > > >>>>>>
> > > > > >>>>> When
> > > > > >>>
> > > > > >>>> you
> > > > > >>>>>
> > > > > >>>>>> say "thread local", do you mean the client thread?  Does
> this
> > > > > >>>>>>
> > > > > >>>>> behavior
> > > > > >>>
> > > > > >>>> port
> > > > > >>>>>
> > > > > >>>>>> across other client libraries, especially threadless
> > > environments
> > > > > >>>>>>
> > > > > >>>>> like
> > > > > >>>
> > > > > >>>> JavaScript?
> > > > > >>>>>>
> > > > > >>>>>> I understand API's that affect transactions, but not ones
> that
> > > > > affect
> > > > > >>>>>> threads.  Can't this be a Transaction API?
> > > > > >>>>>>
> > > > > >>>>>> On Fri, Oct 9, 2015 at 7:32 AM, Stephen Mallette <
> > > > > >>>>>>
> > > > > >>>>> [email protected]
> > > > > >>>
> > > > > >>>> wrote:
> > > > > >>>>>>
> > > > > >>>>>> Not sure why, but from the beginning I'd thought of this
> > > settings
> > > > > >>>>>>>
> > > > > >>>>>> as
> > > > > >>>
> > > > > >>>> being
> > > > > >>>>>>
> > > > > >>>>>>> global to the Graph instance, but framed in this context,
> it
> > > > > >>>>>>>
> > > > > >>>>>> doesn't
> > > > > >>>
> > > > > >>>> seem
> > > > > >>>>>
> > > > > >>>>>> right (in any context).  I agree with Dylan that this
> setting
> > > > > >>>>>>>
> > > > > >>>>>> should
> > > > > >>>
> > > > > >>>> be
> > > > > >>>>
> > > > > >>>>> ThreadLocal.
> > > > > >>>>>>>
> > > > > >>>>>>> On Fri, Oct 9, 2015 at 8:58 AM, Dylan Millikin <
> > > > > >>>>>>>
> > > > > >>>>>> [email protected]
> > > > > >>>>>
> > > > > >>>>>> wrote:
> > > > > >>>>>>>
> > > > > >>>>>>> Hey guys,
> > > > > >>>>>>>>
> > > > > >>>>>>>> It's become apparent from the discussion in the following
> > > issue
> > > > > >>>>>>>>
> > > > > >>>>>>> that
> > > > > >>>>
> > > > > >>>>> there
> > > > > >>>>>>>
> > > > > >>>>>>>> are some limitations with making the transactional
> > > > > >>>>>>>>
> > > > > >>>>>>> READ_WRITE_BEHAVIOR
> > > > > >>>>>
> > > > > >>>>>> global to all graph operations  (
> > > > > >>>>>>>> https://issues.apache.org/jira/browse/TINKERPOP3-875 )
> > > > > >>>>>>>>
> > > > > >>>>>>>> With gremlin-server as an example this implies that:
> > > > > >>>>>>>> If you issue
> > > > > >>>>>>>>
> > > > > >>>>>>>
> > graph.tx().onReadWrite(Transaction.READ_WRITE_BEHAVIOR.MANUAL)
> > > > > >>>>>>>
> > > > > >>>>>>>> (as is automatic when you use the sessionOpProcessor)
> every
> > > > > >>>>>>>>
> > > > > >>>>>>> subsequent
> > > > > >>>>>
> > > > > >>>>>> request will need to handle transactions manually. I think
> > this
> > > > > >>>>>>>>
> > > > > >>>>>>> also
> > > > > >>>>
> > > > > >>>>> includes other clients that have no idea this operation was
> > > > > >>>>>>>>
> > > > > >>>>>>> performed.
> > > > > >>>>>
> > > > > >>>>>> The side effect of this is that sessionless requests
> > > > > >>>>>>>>
> > > > > >>>>>>> (StandardOpProcessor)
> > > > > >>>>>>>
> > > > > >>>>>>>> that should be automatically committed, break.
> > > > > >>>>>>>>
> > > > > >>>>>>>> Seeing as fixing this would be a breaking change, let's
> > > discuss
> > > > > >>>>>>>>
> > > > > >>>>>>> pros
> > > > > >>>>
> > > > > >>>>> and
> > > > > >>>>>>
> > > > > >>>>>>> cons and depending on how things go, create another JIRA
> > ticket
> > > > > >>>>>>>>
> > > > > >>>>>>> (the
> > > > > >>>>
> > > > > >>>>> one
> > > > > >>>>>>
> > > > > >>>>>>> linked earlier is being used for another issue)
> > > > > >>>>>>>>
> > > > > >>>>>>>> Best,
> > > > > >>>>>>>> Dylan
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Making Transaction.readWrite() changes ThreadLocal

Reply via email to