Re: [Neo4j] Question about REST interface concurrency

2011-04-26 Thread Mattias Persson
Do your disk benchmark tests flush the data to disk, or just write to
it and let the file system / OS flush whenever it feels like it (which
is much faster, of course)?
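
A minimal sketch of the distinction in question, in Python: the same small writes timed once with the OS left to flush on its own schedule and once with an explicit flush after every write. The function names, the 9-byte record and the record count are illustrative assumptions, not taken from any benchmark tool mentioned in this thread.

```python
import os
import tempfile
import time


def timed_writes(path, record, count, sync_each_write):
    """Write `count` records to `path`; optionally fsync after every write."""
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(record)
            if sync_each_write:
                f.flush()              # push Python's buffer to the OS
                os.fsync(f.fileno())   # ask the OS to push it to the device
    return time.perf_counter() - start


def compare(count=200, record=b"\0" * 9):
    """Return (buffered_seconds, synced_seconds) for the same workload."""
    with tempfile.TemporaryDirectory() as tmp:
        buffered = timed_writes(os.path.join(tmp, "buffered.bin"), record, count, False)
        synced = timed_writes(os.path.join(tmp, "synced.bin"), record, count, True)
    return buffered, synced


if __name__ == "__main__":
    buffered, synced = compare()
    print(f"buffered: {buffered:.4f}s  fsync per write: {synced:.4f}s")
```

On most systems the second number is expected to be much larger, but the actual ratio depends on the disk, the file system and any write caching, so treat the output as indicative only.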

2011/4/25 Stephen Roos sr...@careerarcgroup.com:
 Hi Jim,

 I took a look at my disk utilization and I'm only getting up to about 9379 
 KBps (write).  My disk benchmarking tests show max write rates to be around 
 220 MBps, so I shouldn't be maxed out there.  Interestingly, I don't see that 
 much data in the graph.db directory (I see about 15 MB there after creating 
 150k empty nodes, no relationships, no index).  The largest file is 
 nioneo_logical.log.1 (14 MB), the next largest is the neostore.nodestore.db 
 (1.3 MB).  I don't know if that information is helpful, but I thought it was 
 a bit strange that I'm sustaining disk write rates of 9 MBps for over 40 
 secs yet I don't have anywhere close to 9 * 40 = 360 MB of data.
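
 As a rough check of that mismatch, the figures above can be put side by side. This is a back-of-the-envelope sketch in Python, assuming the peak rate was sustained for the full 40 seconds and that KB and MB are binary units; neither is stated in the message.

```python
KB = 1024
MB = 1024 * KB

write_rate_kbps = 9379   # observed write rate in KB/s (from the message)
duration_s = 40          # "over 40 secs", so this is a lower bound
on_disk_mb = 15          # size of the graph.db directory after the run
node_store_mb = 1.3      # size of neostore.nodestore.db
node_count = 150_000     # empty nodes created

# Total volume pushed to disk if the rate held for the whole interval.
written_mb = write_rate_kbps * KB * duration_s / MB

# How many times more data was written than ended up on disk.
amplification = written_mb / on_disk_mb

# Average node store bytes per created node.
bytes_per_node = node_store_mb * MB / node_count

print(f"written:        {written_mb:.0f} MB")
print(f"amplification:  {amplification:.1f}x")
print(f"bytes per node: {bytes_per_node:.1f}")
```

 Under those assumptions roughly 366 MB were written against 15 MB left on disk, about 24 times more, while the node store itself works out to about 9 bytes per node.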

 I do wonder about the flush operation though.  Flush is a blocking operation, 
 so maybe that's the bottleneck even though the disk isn't overutilized.  I'll 
 look into that.  Let me know if you have any other ideas.

 Thanks!
 Stephen

-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Question about REST interface concurrency

2011-04-26 Thread Stephen Roos
Hi Jim,

From what I understand, the disk benchmark flushes at various granularities, 
though I suspect it's not flushing after writes as small as an empty node, so 
this is certainly a possible bottleneck.  I've been looking through the code and 
don't see exactly where the flush takes place.  Can you point me at the right 
class?

I did come across the PersistenceWindowPool class which seems to come into play 
when the underlying node record is updated during the transaction commit.  It 
looks as if the windows are mapped over contiguous blocks of the primitive ID 
space and that, because new node IDs are typically sequential, each of my 
create-node operations is likely to target the same window.  These windows are 
locked and waiting threads are queued up to wait for the locking thread to 
notify on unlock.  Am I reading the code correctly?  If so, do you have any 
thoughts on how we might remove that bottleneck?
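
To make the suspected effect concrete, here is a small Python sketch of how consecutive IDs collapse onto one window when windows cover contiguous ID ranges. It is not Neo4j's code: the window size of 1024 records and the function names are hypothetical, chosen only to illustrate the reading of PersistenceWindowPool described above.

```python
from collections import Counter

WINDOW_SIZE = 1024  # hypothetical number of records covered by one window


def window_index(record_id, window_size=WINDOW_SIZE):
    """Map a record ID to the index of the contiguous window that covers it."""
    return record_id // window_size


def windows_touched(record_ids, window_size=WINDOW_SIZE):
    """Count how many of the given IDs fall into each window."""
    return Counter(window_index(i, window_size) for i in record_ids)


sequential = range(4096, 4896)           # 800 consecutive, freshly allocated IDs
scattered = range(0, 800 * 4096, 4096)   # 800 IDs spread far apart

print(len(windows_touched(sequential)))  # prints 1
print(len(windows_touched(scattered)))   # prints 800
```

If each window is guarded by its own lock, the 800 sequential creates all queue on a single lock, while the scattered IDs would each take a different one.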

Thanks again for your help,
Stephen


Re: [Neo4j] Question about REST interface concurrency

2011-04-22 Thread Jim Webber
Hi Stephen,

I think the network IO you've measured is consistent with the rest of the 
behaviour you've described. 

What I'm thinking is that you're simply reaching the limits of the cycle: create 
a transaction, create a node, complete the transaction, flush to the filesystem 
(that is, you're basically testing disk write speed/seek time/etc).
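
A quick way to see what that cycle implies is to turn a throughput figure into a per-commit time budget. The Python sketch below assumes commits complete strictly one after another and each waits for its own flush. This is a simplification for illustration, and the function names are made up. The ~1600 nodes/sec input is the rate Stephen reported elsewhere in this thread.

```python
def per_commit_budget_ms(commits_per_second):
    """Time available per commit if commits complete strictly one after another."""
    return 1000.0 / commits_per_second


def max_commits_per_second(flush_latency_ms):
    """Upper bound on throughput when every commit waits for its own flush."""
    return 1000.0 / flush_latency_ms


print(per_commit_budget_ms(1600))     # prints 0.625
print(max_commits_per_second(0.625))  # prints 1600.0
```

Under that assumption, ~1600 nodes/sec corresponds to about 0.625 ms per commit, so a flush taking on the order of half a millisecond would be enough to explain the ceiling without the disk's bandwidth being anywhere near saturated.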

Can you check how busy your IO to disk is? I expect it'll be relatively high.

Jim


Re: [Neo4j] Question about REST interface concurrency

2011-04-21 Thread Stephen Roos
Hi Peter,

I'd be glad to share the code; I'll commit soon and share it with the users list.

I've run some more load/concurrency tests and am seeing some strange results.  
Maybe someone can help explain this to me:

I run a load test where I fire off 100K create-empty-node REST requests to 
Neo as quickly as possible.  With my code updates to allow configuration of the 
Jetty thread pool size, I can effectively reduce or increase the maximum 
concurrent transaction limit on the server.  If I limit the thread pool so that 
there is only 1 thread available for requests, I see (as expected) the 
PeakNumberOfConcurrentTransactions reported by the Neo4j Transactions MBean is 
1.  If I scale the thread pool up so that there are 800 available request 
threads, I can throw enough load at the server to cause 800 concurrent 
transactions.  From what I have read, node creation causes a node-local lock, 
not a global node lock, so there shouldn't be a lock-imposed concurrency 
bottleneck.
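
For readers who want to reproduce this kind of test, a load driver along these lines is one option. This is a minimal Python sketch, not the code used in this thread: `run_load_test` and `fake_create_node` are hypothetical names, and the fake request simply sleeps so the example runs without a server. To test a real server, replace it with a function that issues one create-node request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(send_request, total_requests, client_threads):
    """Call `send_request()` `total_requests` times from `client_threads` threads.

    Returns (completed_requests, completed_requests_per_second).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=client_threads) as pool:
        futures = [pool.submit(send_request) for _ in range(total_requests)]
        # exception() blocks until the call finishes; None means it succeeded.
        completed = sum(1 for f in futures if f.exception() is None)
    elapsed = time.perf_counter() - start
    return completed, completed / elapsed


def fake_create_node():
    """Stand-in for one 'create empty node' REST call; sleeps instead of doing I/O."""
    time.sleep(0.001)


if __name__ == "__main__":
    for threads in (1, 8, 64):
        done, rate = run_load_test(fake_create_node, 500, threads)
        print(f"{threads:>3} threads: {done} requests, {rate:.0f} req/s")
```

With the sleeping stand-in the rate should rise as threads are added, which is the scaling one would expect from a server with no shared bottleneck.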

The strange thing is, no matter whether I have 1 or 800 concurrent 
transactions, my total node creation throughput is always the same (~1600 
nodes/sec).  Even with 800 concurrent transactions, my server is only using 
~15% CPU and ~25% memory (JVM Xms/Xmx = 1024m/2048m), so server load wouldn't 
appear to be an issue.  I've followed all the recommendations I could find 
including sysctl limits and JVM settings, but the rate doesn't change.  I have 
also tried running the load test from multiple clients simultaneously (just to 
be sure I'm not running into any limits on the client machine), and indeed as 
soon as I add a second load test client, the throughput on each client gets cut 
in half.  If I'm talking to Neo in a way that is unrestricted by things like 
thread pool size and concurrency limits, I'd expect to be able to scale up my 
load tests and see at least some level of throughput improvement until I start 
to saturate/overload the box.  The fact that increasing concurrency doesn't 
increase throughput makes me think that there's some internal bottleneck or 
synchronization point that's limiting.
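
The pattern described here, throughput that stays flat as concurrency rises, is what a simple bound predicts when most of each request is spent inside one shared critical section. The Python sketch below is a deliberately simplified model with made-up parameter names, not a measurement of Neo4j; the 0.625 ms figure is just 1/1600 of a second, matching the observed rate.

```python
def modelled_throughput(threads, parallel_ms, serialized_ms):
    """Requests/sec when each request does `parallel_ms` of work that can overlap
    across threads, followed by `serialized_ms` inside a single shared lock."""
    # With enough threads the lock is always busy, so it caps the rate.
    lock_limit = 1000.0 / serialized_ms
    # With few threads, each thread completes one request per full cycle.
    thread_limit = threads * 1000.0 / (parallel_ms + serialized_ms)
    return min(lock_limit, thread_limit)


for threads in (1, 800):
    # Almost everything serialized: extra threads change nothing.
    print(threads, modelled_throughput(threads, 0.0, 0.625))
    # Mostly parallel work: extra threads help until the lock saturates.
    print(threads, modelled_throughput(threads, 5.0, 0.625))
```

In this model, identical rates at 1 and 800 threads only occur when the overlappable part of each request is negligible next to the serialized part, which is consistent with the suspicion of an internal synchronization point.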

Any thoughts?  I'm glad to look through the code and investigate; any ideas you 
have would be a big help.

Thanks, and sorry for the long question!

Stephen


Re: [Neo4j] Question about REST interface concurrency

2011-04-21 Thread Stephen Roos
I'm running on Linux (2.6.18).  Watching network utilization, I never see rates 
higher than ~2.5 MBps on the server.  I've also set net.core.rmem_min/max and 
net.ipv4.tcp_rmem/wmem in sysctl to be quite high based on some recommendations 
I've found.  Is this contrary to your own load tests?  Are you able to hit the 
server with enough load that the system is maxed out?  I was considering adding 
some instrumentation around transactions so that I can see the average internal 
transaction time span during a load test.  If you have any other thoughts on 
what to look for/test, I'd be very appreciative.
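
The kind of instrumentation mentioned above could look roughly like the following. It is a generic Python sketch of timing a block of work and averaging the results; `SpanStats` and `measure` are hypothetical names, and nothing here corresponds to an existing Neo4j API.

```python
import threading
import time
from contextlib import contextmanager


class SpanStats:
    """Accumulates how many timed spans ran and how long they took in total."""

    def __init__(self):
        self._lock = threading.Lock()  # spans may finish on many threads at once
        self.count = 0
        self.total_seconds = 0.0

    @contextmanager
    def measure(self):
        """Time the body of a `with` block and record it, even if it raises."""
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            with self._lock:
                self.count += 1
                self.total_seconds += elapsed

    def average_ms(self):
        """Mean span duration in milliseconds, or 0.0 if nothing was recorded."""
        with self._lock:
            if self.count == 0:
                return 0.0
            return 1000.0 * self.total_seconds / self.count


if __name__ == "__main__":
    stats = SpanStats()
    for _ in range(5):
        with stats.measure():
            time.sleep(0.002)  # stand-in for the transaction body
    print(f"{stats.count} spans, average {stats.average_ms():.2f} ms")
```

Comparing the average span with the per-request time seen by the client would show how much of each request is spent inside the transaction rather than in the HTTP layer.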

Thanks again,
Stephen

-----Original Message-----
From: Jim Webber [mailto:j...@neotechnology.com] 
Sent: Thursday, April 21, 2011 12:24 PM
To: Neo4j user discussions
Subject: Re: [Neo4j] Question about REST interface concurrency

Hi Stephen,

Are you running on Linux (or Windows) by any chance? I wonder whether the 
asymptotic performance you're seeing is because you've gotten to a point 
where you're exercising the IO channel and file system.

Jim


Re: [Neo4j] Question about REST interface concurrency

2011-04-18 Thread Peter Neubauer
Stephen,
did you fork the code? Would be good to merge in the changes or at
least take a look at them!

Cheers,

/peter neubauer

GTalk:      neubauer.peter
Skype       peter.neubauer
Phone       +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      http://twitter.com/peterneubauer

http://www.neo4j.org               - Your high performance graph database.
http://startupbootcamp.org/    - Öresund - Innovation happens HERE.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.



Re: [Neo4j] Question about REST interface concurrency

2011-04-17 Thread Stephen Roos
Hi Jim,

Thanks for the quick reply.  I tried the configuration mentioned here 
(rest_max_jetty_threads): 

https://trac.neo4j.org/changeset/6157/laboratory/components/rest

But it doesn't seem to have changed anything.  I took a look through the code 
and didn't see any configuration settings exposed in Jetty6WebServer.  I added 
the changes myself and am starting to see some good results (I've exposed 
settings for min/max threadpool size, # acceptor threads, acceptor queue size, 
and request buffer size).  Is there anything else that you'd recommend tweaking 
to improve throughput?

Thanks again for your help!



Re: [Neo4j] Question about REST interface concurrency

2011-04-15 Thread Jim Webber
Hi Stephen,

The same Jetty tweaks that worked in previous versions will work with 1.3. We 
haven't changed any of the Jetty stuff under the covers.

Jim


[Neo4j] Question about REST interface concurrency

2011-04-14 Thread Stephen Roos
Hello Neo Team!

Congrats on the recent release!  I'm using 1.3 enterprise in my development 
environment.  I noticed that in earlier versions there were some patches to 
allow setting the min/max thread pool size for the REST servlet container.  Are 
there any similar options now?  Under load tests, it seems like I would benefit 
from at least having a higher initial thread pool size.  Are there any other 
configuration changes or strategies that would help with overall throughput 
under heavy load?

Thanks for your help!

Stephen Roos
Software Engineer

CareerArc Group
The Social Exceleration Network
www.careerarcgroup.com


