On Wed, Dec 30, 2009 at 12:14 PM, Tobias Ivarsson
<[email protected]> wrote:
> Hi,
>
> Great question. It took some discussion to come up with a good answer.
> Honestly I'm not sure this answer is as good an answer as this question
> would deserve, but it is a summary of the discussions I've been involved
> with off-list and that I am now moving to the list. Combine this with the
> holidays and you might get to a point where it's understandable that the
> response took so long.

Perfectly understandable! And it wasn't so long :)

> Given the fact that the need for open transactions for read operations is
> less than it used to be, and the fact that a number of people seem to be
> questioning the fact that transactions are required to read from Neo4j,
> perhaps it's time to change this. We'll have a meeting with the dev team
> next week to discuss this. In the meantime it would be great to get some
> input from all of you guys on the list. Do you think we should remove the
> requirement for transactions for read operations? Why/Why not?

When I'm doing a traversal, I think of an SQL "SELECT".
And a transaction is an SQL "BEGIN ... COMMIT;".

When doing a purely read-only operation, transactions are not explicitly
required (and there is nothing to roll back).
The SQL transaction for a select is "automagic": when a potentially
long select runs, the query works on a "frozen" state of the
database.
If you want to do multiple selects using the same "frozen state" as the
first select, then you need to wrap everything in a transaction.
Otherwise, it's not needed.
When I put a transaction around a "traverser ... for ..." loop, I assume
that, while crawling the node space with my traverser, a concurrent
thread can commit changes to the database without causing consistency
problems for my traverser: I work on a "frozen state" of the graph.
Of course, removing the transaction could cause some consistency
problems, but in most cases that's perfectly acceptable ... if known.

On the other hand, if I must wrap my traverser loop in a transaction,
it totally kills the granularity of the transaction.

Example:
- I have a huge graph
- I want to update every single node
- it will take many days because of some slow process independent of Neo4j
- I wrapped everything in a top-level Tx to be able to traverse the graph
- one of the updates fails: I lose all my updates and a few days of work.

That was *exactly* my problem.

As a workaround, the traverser loop filled a good old Java array (a few seconds).
Then I iterated over the array instead of working on the graph.
That way I avoided the pain of a huge top-level Tx (filling the Java heap, etc.).
An update on a node could fail safely without rolling back all my work.
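
To make the workaround concrete, here is a minimal Java sketch of the pattern.
It deliberately does not use the Neo4j API: Store, snapshotIds and
updateInOwnTx are hypothetical stand-ins for "the graph", "the short
traversal that fills an array" and "one small Tx per node". Treat it as an
illustration under those assumptions, not as how Neo4j does it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

final class PerNodeUpdates {

    /** Hypothetical stand-in for the graph: node id -> one property value. */
    static final class Store {
        private final Map<Long, String> committed = new HashMap<>();

        void put(long id, String value) { committed.put(id, value); }

        String get(long id) { return committed.get(id); }

        /** Phase 1: a short read that copies the node ids into a plain list. */
        List<Long> snapshotIds() { return new ArrayList<>(committed.keySet()); }

        /**
         * Phase 2: one small "transaction" per node. If the change throws,
         * nothing is written for this node and other nodes are unaffected.
         */
        boolean updateInOwnTx(long id, UnaryOperator<String> change) {
            try {
                String updated = change.apply(committed.get(id));
                committed.put(id, updated); // "commit" for this node only
                return true;
            } catch (RuntimeException e) {
                return false; // "rollback": this node keeps its old value
            }
        }
    }

    /** Updates every node independently and returns the ids that failed. */
    static List<Long> updateAll(Store store, UnaryOperator<String> change) {
        List<Long> failed = new ArrayList<>();
        for (long id : store.snapshotIds()) {
            if (!store.updateInOwnTx(id, change)) {
                failed.add(id);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        Store store = new Store();
        store.put(1L, "a");
        store.put(2L, "b");
        store.put(3L, "c");

        // The update blows up on one node; the others still go through.
        List<Long> failed = updateAll(store, v -> {
            if (v.equals("b")) throw new IllegalStateException("update failed");
            return v.toUpperCase();
        });
        System.out.println("failed ids: " + failed);
    }
}
```

The price is the one already mentioned: the list of ids is only a copy, so
nodes added or removed by another thread after the snapshot are not seen.
The failed ids can simply be retried later.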

That could lead to possible consistency problems, but consistency
isn't my priority; I always deal with "eventual consistency".

Both the pro- and anti-transaction sides have very good arguments.
I just want to have the choice.

What about some kind of "autocommit" feature?

-- 
Laurent "ker2x" Laborde
Sysadmin & DBA at http://www.over-blog.com/
_______________________________________________
Neo mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user
