Re: [Neo4j] Neo4j Write Performance

2011-09-24 Thread Mattias Persson
For the record, that branch is outdated and not working correctly in HA
mode.

2011/9/12 Peter Neubauer 

> James,
> we are experimenting with that feature, namely, not forcing a flush()
> at the end of a transaction and letting the OS take care of the actual
> flushing. You potentially lose some last-transaction data, but the
> store is still going to recover and will not get corrupted.
> Mattias has been testing this in the ordered-writes branch at
> https://github.com/neo4j/community/tree/ordered-writes . This needs to
> be fleshed out to give access to these settings per transaction. I
> think it will not make it into 1.5 unless someone in the community
> steps up and puts in the effort to expose it. But feel free to try it
> out and give feedback on your findings!
>
> /peter



-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnol

Re: [Neo4j] Neo4j Write Performance

2011-09-12 Thread Peter Neubauer
James,
we are experimenting with that feature, namely, not forcing a flush()
at the end of a transaction and letting the OS take care of the actual
flushing. You potentially lose some last-transaction data, but the
store is still going to recover and will not get corrupted.
Mattias has been testing this in the ordered-writes branch at
https://github.com/neo4j/community/tree/ordered-writes . This needs to
be fleshed out to give access to these settings per transaction. I
think it will not make it into 1.5 unless someone in the community
steps up and puts in the effort to expose it. But feel free to try it
out and give feedback on your findings!
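
To make the trade-off concrete, here is a minimal sketch in Python of a
per-commit "force" switch on an append-only log. It is not code from the
ordered-writes branch and does not use any Neo4j API; the AppendLog class,
its commit() method and the force parameter are hypothetical names chosen
for illustration only.

```python
import os
import tempfile


class AppendLog:
    """Toy append-only transaction log (hypothetical, not Neo4j code)."""

    def __init__(self, path):
        # Unbuffered binary append: write() passes bytes to the OS directly.
        self._file = open(path, "ab", buffering=0)
        self.forced_flushes = 0

    def commit(self, record: bytes, force: bool = True) -> None:
        # Records are appended to the file in commit order.
        self._file.write(record + b"\n")
        if force:
            # Durable commit: block until the OS reports the data is on disk.
            os.fsync(self._file.fileno())
            self.forced_flushes += 1
        # With force=False the OS decides when to write the data back, so an
        # OS crash or power loss may lose the most recent records.

    def close(self) -> None:
        # Force whatever is still pending before closing.
        os.fsync(self._file.fileno())
        self._file.close()


def demo_force_toggle():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "tx.log")
        log = AppendLog(path)
        log.commit(b"tx-1", force=True)   # waits for the disk
        log.commit(b"tx-2", force=False)  # returns without waiting
        log.close()
        with open(path, "rb") as f:
            return f.read().splitlines(), log.forced_flushes
```

Both records end up in the file, but only the first commit pays for an
fsync. Whether a store can always recover cleanly after losing the tail of
such a log depends on the store's recovery logic, which this sketch does
not model.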

/peter

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Neo4j Write Performance

2011-09-11 Thread espeed
I added a ticket for this here...

https://github.com/neo4j/community/issues/18

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-Write-Performance-tp3323638p3327618.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.


[Neo4j] Neo4j Write Performance

2011-09-09 Thread espeed
Hi Guys -

I have been working on loading WordNet (http://wordnet.princeton.edu/) into
Neo4j, and have been using it as an opportunity to tune write performance on
Linux for a Web application I am developing. 

My initial idea was to load WordNet RDF
(http://semanticweb.cs.vu.nl/lod/wn30/) through the Blueprints SailGraph
interface, but then I decided to use NLTK (http://www.nltk.org) and load it
directly from Bulbs into Rexster.

Stephen recently added batch transactions to Rexster
(https://github.com/tinkerpop/rexster-kibbles/tree/master/batch-kibble), but
right now I am not using them because I want to see what type of write
performance you can get in non-batch mode.

The Neo4j performance guides were helpful:

* http://wiki.neo4j.org/content/Performance_Guide
* http://wiki.neo4j.org/content/Linux_Performance_Guide
* http://wiki.neo4j.org/content/Configuration_Settings

As were Peter and Tobias' recommendations to put Neo4j transactions in manual
mode (https://groups.google.com/d/msg/gremlin-users/vl4IZO7O8H4/20Yc4rUObNcJ)
so you don't have to flush to disk for each write.

However, manual/batch modes are not practical for writes in a Web
application. It would be cool if there was a tunable parameter where you
could set Neo4j to flush to disk at some interval instead of after every
create/update statement. 

Obviously you would have an issue if the server crashed before it was
written to disk, but this could be mitigated through HA redundancy, and
because it's a tunable parameter, you could dial it up or down depending on
your requirements. 
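
As a rough illustration of the tunable parameter described above, the
following Python sketch forces data to disk at most once per interval
instead of after every write. Everything in it is an assumption made for
illustration: IntervalFlushLog, flush_interval_s and the injectable clock
are hypothetical names, and none of this is an existing Neo4j setting.

```python
import os
import tempfile
import time


class IntervalFlushLog:
    """Toy log that fsyncs at most once per interval (hypothetical)."""

    def __init__(self, path, flush_interval_s, clock=time.monotonic):
        self._file = open(path, "ab", buffering=0)
        self._interval = flush_interval_s  # the tunable: 0 forces every write
        self._clock = clock                # injectable so it can be tested
        self._last_flush = clock()
        self.flush_count = 0

    def write(self, record: bytes) -> None:
        self._file.write(record + b"\n")
        # Only pay for an fsync once the interval has elapsed.
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self) -> None:
        os.fsync(self._file.fileno())
        self._last_flush = self._clock()
        self.flush_count += 1

    def close(self) -> None:
        self.flush()
        self._file.close()


def demo_interval_flush():
    now = [0.0]  # fake clock so the example does not need to sleep
    with tempfile.TemporaryDirectory() as tmp:
        log = IntervalFlushLog(os.path.join(tmp, "tx.log"), 1.0,
                               clock=lambda: now[0])
        for i in range(5):
            log.write(b"w%d" % i)      # all inside the first second
        before = log.flush_count
        now[0] = 1.5
        log.write(b"w5")               # interval elapsed, triggers one fsync
        after = log.flush_count
        log.close()
        return before, after
```

Note that this sketch only checks the clock when a write arrives; a real
implementation would probably also flush from a background timer so that
an idle server does not hold unflushed data indefinitely.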

MongoDB does something similar, and it is reported that a single server can
do 20,000-30,000 writes per second
(http://www.dbms2.com/2011/04/04/the-mongodb-story/).

Here are some of the things Mongo does to make writes fast:

* A memory-mapped data model.
* Deferred writes — a write might take a couple of seconds to actually
persist.
* Optimism — you don’t have to wait for an acknowledgement if you write
something to the database.
* “Upsert in place” – update in place without checking whether you’re doing
an update or insert.
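
The first two items in the list can be illustrated with Python's standard
mmap module. This is only a sketch of the general technique, not MongoDB or
Neo4j code; the file name and page size are arbitrary. A write to the
mapping is visible in memory immediately, and the explicit flush() call is
the point at which the program asks for the change to be written to the
file.

```python
import mmap
import os
import tempfile


def demo_mmap():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "store.db")
        # Pre-size the file; a mapping cannot cover an empty file.
        with open(path, "wb") as f:
            f.write(b"\x00" * 4096)

        with open(path, "r+b") as f:
            mm = mmap.mmap(f.fileno(), 4096)
            mm[0:5] = b"hello"      # update in place, in memory
            seen = bytes(mm[0:5])   # readable right away through the mapping
            mm.flush()              # deferred step: request write-back now
            mm.close()

        with open(path, "rb") as f:
            on_disk = f.read(5)
        return seen, on_disk
```

Without the flush() call the operating system would still write the page
back eventually, which is the "deferred write" behaviour, at the cost of
losing the change if the machine goes down first.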

What would it take for Neo4j to approach these levels?

Neo4j does memory-mapped IO:

 
http://wiki.neo4j.org/content/Configuration_Settings#Memory_mapped_I.2FO_settings

There have been talks about adding optimistic locking:

  http://neo4j.org/forums/#nabble-td2891798
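
For readers unfamiliar with the term, here is a minimal, single-threaded
Python sketch of optimistic locking using version numbers. It shows the
general idea only and says nothing about how Neo4j would implement it;
VersionedStore and VersionConflict are hypothetical names.

```python
class VersionConflict(Exception):
    """Raised when a writer's view of a record is out of date."""


class VersionedStore:
    """Toy key-value store with version-checked writes (hypothetical)."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        # No lock is taken; the caller just remembers the version it saw.
        return self._data.get(key, (0, None))

    def write(self, key, expected_version, value):
        current_version, _ = self._data.get(key, (0, None))
        if current_version != expected_version:
            # Someone else wrote in between: reject instead of blocking.
            raise VersionConflict(key)
        self._data[key] = (current_version + 1, value)
        return current_version + 1


def demo_optimistic():
    store = VersionedStore()
    version_a, _ = store.read("node:1")  # writer A reads version 0
    version_b, _ = store.read("node:1")  # writer B reads version 0 too
    store.write("node:1", version_a, "from A")
    try:
        store.write("node:1", version_b, "from B")
        conflict = False
    except VersionConflict:
        conflict = True
    return store.read("node:1"), conflict
```

Writer B's update is rejected and would have to re-read and retry. A real
implementation also needs the check and the update in write() to happen
atomically, which this single-threaded sketch does not address.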

And Peter has said that deferred writes are on the drawing board
(http://lists.neo4j.org/pipermail/user/2011-May/008792.html):


Peter Neubauer wrote:
> 
> However, we are looking into Neo4j normal mode speedups by having a mode
> that drops the JTA dependencies and thus can relax on the logfile flushing
> requirements for each transaction, by that being able to use the underlying
> OS for ordered (deferred) writing, adjustable on a case-by-case level (e.g.
> batch inserting big data). This will give Neo4j insertions in this mode
> comparable performance with the batchinserter, while keeping all other
> semantics and layers in place. I hope this can make it into 1.4, and it will
> speed up the RDF insertion considerably!
> 

Is support for optimistic locking and deferred writes planned for an
upcoming release?

Thanks.

- James

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-Write-Performance-tp3323638p3323638.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.