Re: [Neo4j] Stumped by performance issue in traversal - would take a month to run!

2010-08-06 Thread Jeff Klann
To anyone interested, after reading Martin's e-mail I ran a test of a small (75mb) Neo4J graph and traversed it (~500k edges) with memory mapping turned off in about 1.5 minutes on my hard drive and 30sec when copied to a (probably slow) flash drive. Which incidentally is the same time it took

Re: [Neo4j] Stumped by performance issue in traversal - would take a month to run!

2010-08-05 Thread Jeff Klann
Thanks for the answers. Yes, I can do online updates of the datastore, but while this is in R&D I will need to rerun the main loop when I change the algorithm and just for personal benefit I don't want to wait hours to see the changes. Seems to be running acceptably now, though. However, I haven't

Re: [Neo4j] Stumped by performance issue in traversal - would take a month to run!

2010-08-05 Thread Martin Neumann
- Martin, I'm confused a bit about SSDs. I read up on them after I read your post. You said flash drives are best, but I read that even the highest performing flash drives are about 30MB/s read, whereas modern hard drives are at least 50MB/s. True SSDs claim to be 50MB/s too but

Re: [Neo4j] Stumped by performance issue in traversal - would take a month to run!

2010-08-02 Thread David Montag
Hi Jeff, If I'm not mistaken, Neo4j loads all properties for a node or relationship when you invoke any operation that touches a property. As for the performance of traversals, it is highly dependent on how deep you traverse, and what you do during the traversal, so ymmv. Using a traverser is

Re: [Neo4j] Stumped by performance issue in traversal - would take a month to run!

2010-08-02 Thread David Montag
Hi Jeff, Please see answers below. On Mon, Aug 2, 2010 at 5:47 PM, Jeff Klann jkl...@iupui.edu wrote: Thank you all for your continued interest in helping me. I tweaked the code more to minimize writes to the database and it now looks like: For each item A For each customer that purchased

Re: [Neo4j] Stumped by performance issue in traversal - would take a month to run!

2010-08-01 Thread Martin Neumann
Hi, there are some environmental optimizations you can do to speed things up. Neo4j is stored as a graph on disk, so traversals translate to moving the cursor on the hard drive if the data is not in RAM. For good performance you need a fast hd (flash drive would do best). Deleting lots of nodes

Re: [Neo4j] Stumped by performance issue in traversal - would take a month to run!

2010-07-30 Thread Jeff Klann
Hi, so I got 2GB more RAM and noticed that after adding some more memory map and increasing the heap space, my small query went from 6hrs to 3min. Quite reasonable! But the larger one that would take a month would still take a month. So I've been performance testing parts of it: The algorithm as

[Neo4j] Stumped by performance issue in traversal - would take a month to run!

2010-07-28 Thread Jeff Klann
Hi, I have an algorithm running on my little server that is very very slow. It's a recommendation traversal (for all A and B in the catalog of items: for each item A, how many customers also purchased another item in the catalog B). It's processed 90 items in about 8 hours so far! Before I dive
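A minimal sketch of the co-purchase count described in this message, for readers who want to see its shape. It is plain in-memory Python, not the Neo4j API, and it is not the poster's code: the `purchases` mapping, the function name, and the sample data are all hypothetical. It also loops over customers rather than over item pairs, which is a different loop order from the "for each item A" loop in the message, chosen here so that each customer's purchases are read only once.

```python
from collections import Counter
from itertools import permutations


def co_purchase_counts(purchases):
    """Count, for every ordered pair (A, B), the customers who bought both.

    `purchases` is a hypothetical mapping: customer id -> set of item ids.
    """
    counts = Counter()
    for items in purchases.values():
        # Every ordered pair of distinct items bought by this customer
        # contributes one co-purchase to (A, B).
        for a, b in permutations(sorted(set(items)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical sample data, for illustration only.
purchases = {
    "alice": {"book", "lamp"},
    "bob": {"book", "lamp", "mug"},
    "carol": {"mug"},
}
counts = co_purchase_counts(purchases)
print(counts[("book", "lamp")])  # 2 (alice and bob bought both)
```

The work is proportional to the sum of squared basket sizes, so a few customers with very large purchase histories would dominate the running time.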

Re: [Neo4j] Stumped by performance issue in traversal - would take a month to run!

2010-07-28 Thread Rick Bullotta
Jeff, when you're doing your traversal/update process, how often do you commit the transactions?

Re: [Neo4j] Stumped by performance issue in traversal - would take a month to run!

2010-07-28 Thread Rick Bullotta
Oh, and you DEFINITELY need more RAM!

Re: [Neo4j] Stumped by performance issue in traversal - would take a month to run!

2010-07-28 Thread Tim Jones
I can't give too much help on this unfortunately, but as far as possibility 1) goes, my database contains around 8 million nodes, and I traverse them in about 15 seconds for retrievals. It's 2.8GB on disk, and the machine has 4GB of RAM. I allocate a 1GB heap to the JDK. Inserts take a little

Re: [Neo4j] Stumped by performance issue in traversal - would take a month to run!

2010-07-28 Thread Jeff Klann
Thank you both for your responses. - I will get some more RAM tomorrow and give Neo4J another shot. Hopefully that's a huge factor. - Tim, I like your algorithm trick! Would save a lot of reading/writing but would definitely require more memory due to the massive increase in # of edges. -

Re: [Neo4j] Stumped by performance issue in traversal - would take a month to run!

2010-07-28 Thread Rick Bullotta
Hi, Jeff. If you are committing after each item, it definitely will slow down performance. Start a single transaction, commit when you're all done with the entire traversal, and report back the results. You will still see the changes you've made prior to committing the transaction, as long as you're

Re: [Neo4j] Stumped by performance issue in traversal - would take a month to run!

2010-07-28 Thread Jeff Klann
I don't think that's the problem. Here's why... When I was importing my data, it eventually slowed down to a crawl (though it was pretty fast at first). Someone pointed out that since I was trying to do it all in one transaction, it was filling the java heap too much. They suggested I commit
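The message above is cut off, so the advice it goes on to quote is not recoverable. As an illustration only, the sketch below assumes a middle ground between the two extremes raised in this thread (committing after every item, and holding everything in one huge transaction): commit once every `batch_size` items. It is plain Python and does not use the Neo4j API; `apply_change`, `commit`, and `batch_size` are hypothetical names standing in for whatever the real write and commit calls would be.

```python
def process_in_batches(items, apply_change, commit, batch_size=1000):
    """Apply a change per item, committing once every `batch_size` items.

    `apply_change` and `commit` are hypothetical callbacks, not Neo4j calls.
    Returns the number of commits performed.
    """
    pending = 0
    commits = 0
    for item in items:
        apply_change(item)
        pending += 1
        if pending >= batch_size:
            commit()
            commits += 1
            pending = 0
    if pending:
        # Flush the final, partially filled batch.
        commit()
        commits += 1
    return commits


# Usage with stand-in callbacks that only record what happened.
applied = []
commit_log = []
n_commits = process_in_batches(
    range(2500),
    applied.append,
    lambda: commit_log.append(len(applied)),
    batch_size=1000,
)
print(n_commits, commit_log)  # 3 [1000, 2000, 2500]
```

A larger `batch_size` means fewer commits but more uncommitted state held in memory at once, which is the trade-off the two messages are arguing about.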