Re: [Neo4j] Concerning getting all nodes after a traversal

Peter Neubauer Mon, 21 Nov 2011 07:01:16 -0800

Daniel,
there is the ImpermanentGraphDatabase,
https://github.com/neo4j/community/blob/master/kernel/src/test/java/org/neo4j/test/ImpermanentGraphDatabase.java
that you could use to try things out in memory, since it abstracts
away the file system layer. It would be interesting to see whether you
get any difference. It's mostly a test utility, but it could prove useful.
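
On the earlier point in this thread (quoted below) that traversals are
lazy: here is a minimal plain-Java sketch of why creating a traverser can
take 63 ms while collecting its nodes takes far longer. It does not use
Neo4j at all. The class LazyBfs, its expanded counter and its allNodes()
method are made-up names for illustration, and the depth limit only stands
in for a stop evaluator; treat it as a sketch of the idea, not as how the
kernel implements it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical stand-in for a traverser. This is NOT the Neo4j API.
class LazyBfs implements Iterable<Integer> {
    private final Map<Integer, List<Integer>> outgoing; // adjacency list
    private final int start;
    private final int maxDepth; // plays the role of a depth-based stop evaluator
    int expanded = 0;           // how many nodes had their neighbours read

    LazyBfs(Map<Integer, List<Integer>> outgoing, int start, int maxDepth) {
        // Only records the description of the traversal; no node is touched.
        this.outgoing = outgoing;
        this.start = start;
        this.maxDepth = maxDepth;
    }

    @Override
    public Iterator<Integer> iterator() {
        Deque<int[]> queue = new ArrayDeque<>(); // entries are {node, depth}
        Set<Integer> seen = new HashSet<>();
        queue.add(new int[] {start, 0});
        seen.add(start);
        return new Iterator<Integer>() {
            @Override
            public boolean hasNext() {
                return !queue.isEmpty();
            }

            @Override
            public Integer next() {
                if (queue.isEmpty()) {
                    throw new NoSuchElementException();
                }
                int[] current = queue.poll();
                // The graph is read here, one node per call to next().
                if (current[1] < maxDepth) {
                    expanded++;
                    for (int neighbour : outgoing.getOrDefault(current[0], List.of())) {
                        if (seen.add(neighbour)) {
                            queue.add(new int[] {neighbour, current[1] + 1});
                        }
                    }
                }
                return current[0];
            }
        };
    }

    // Eager counterpart: walks the whole traversal and collects every node.
    List<Integer> allNodes() {
        List<Integer> result = new ArrayList<>();
        for (int node : this) {
            result.add(node);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> graph = Map.of(
                1, List.of(2, 3), 2, List.of(4), 3, List.of(4), 4, List.of(5));
        LazyBfs traversal = new LazyBfs(graph, 1, 2);
        System.out.println("expanded after construction: " + traversal.expanded);
        System.out.println("nodes: " + traversal.allNodes());
        System.out.println("expanded after collecting: " + traversal.expanded);
    }
}
```

Timing only the constructor call measures next to nothing, because all of
the graph access happens inside next(). A fair measurement wraps the loop
that consumes the nodes (or the eager allNodes() call) instead.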


Cheers,

/peter neubauer

GTalk:      neubauer.peter
Skype       peter.neubauer
Phone       +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      http://twitter.com/peterneubauer

http://www.neo4j.org              - NOSQL for the Enterprise.
http://startupbootcamp.org/    - Öresund - Innovation happens HERE.



On Wed, Sep 28, 2011 at 8:30 PM, Daniel Morozoff
<danielmoroz...@gmail.com> wrote:
> Michael,
> The number of nodes traversed and the number of relationships vary. I
> personally have seen it go from 2 to 2k+ for relationships, and similarly
> for the number of nodes traversed. The total number of nodes in the DB is
> ~4M. We assign the number of steps dynamically, usually ~3-4 steps.
> The use case is basically that we developed a novel algorithm to detect
> subspace similarities in graph networks (based solely on the geometry of
> the space). The applications, as you can imagine, span numerous fields.
>
> Hope this helps,
> Dan
> P.S. As for the graph generator -- I don't think you can recreate a
> similarly sparse graph algorithmically; the best option is to use some
> sort of real-world sparse data.
>
> Daniel Morozoff-Abezgauz
>
> Mobile: 415-652-2388
> Email: danielmoroz...@gmail.com
> Website:  http://patentula.com
>
>
>
> On Tue, Sep 27, 2011 at 3:06 PM, Michael Hunger
> <michael.hun...@neotechnology.com> wrote:
>>
>> Dan, about how many nodes are we talking about as the traversal result?
>> You said you have one type of relationship. How many relationships do you
>> have per node pointing to other nodes (outgoing) and incoming (min, max, avg
>> or distribution) ?
>> How many steps does your traversal go?
>> And what is the usecase behind it? So that we can understand your graph
>> model and the operations you'd like to perform.
>> Perhaps you could also share (privately) a graph generator that would
>> create a graph as yours. Then we could run some performance tests of our
>> own.
>> Cheers
>> Michael
>> On 27.09.2011 at 23:17, Daniel Morozoff wrote:
>>
>> Hi Michael,
>> Sure we may move it to the mailing list.
>> We have single nodes connected by one type of relationship. We are
>> running with a cold cache, but we use an index to find the start node
>> (we have only one). We also implemented a stop evaluator based on depth,
>> and it is working, so I believe we are traversing exactly the amount
>> we need.
>> Does neo4j support loading an entire db into memory, btw?
>> The reason I asked about blackray is that we need to return close to
>> instantaneous responses to a web query. So we will need to run the
>> calculation on the back end before exposing it to the front end.
>> Regards,
>> Dan
>>
>>
>>
>> On Tue, Sep 27, 2011 at 1:39 PM, Michael Hunger
>> <michael.hun...@neotechnology.com> wrote:
>>>
>>> Can we take this to the mailing list?
>>> Also, what is your use case?
>>> What is the structure of your graph, what are your starting nodes, etc.?
>>> Do you have cold or hot caches?
>>> I think the traverser is probably not limited to the right set and
>>> traverses too much of the graph.
>>> Cheers
>>> Michael
>>> On 27.09.2011 at 20:41, Peter Neubauer wrote:
>>>
>>> Daniel,
>>> I think Michael has been doing some testing with these setups...
>>>
>>> /peter
>>>
>>> Sent from my phone.
>>>
>>> On Sep 27, 2011 6:21 PM, "Daniel Morozoff" <danielmoroz...@gmail.com>
>>> wrote:
>>> > Hi Peter,
>>> >
>>> > Thanks for your response. Makes total sense! Can you recommend any
>>> > in-memory DBs like blackray that work well with neo4j and java?
>>> >
>>> > Thanks,
>>> >
>>> > Dan
>>> >
>>> > On Tue, Sep 27, 2011 at 12:26 AM, Peter Neubauer <
>>> > peter.neuba...@neotechnology.com> wrote:
>>> >
>>> >> Daniel,
>>> >> remember that traversals are lazy, so nothing is traversed until
>>> >> you actually iterate. Does that explain the difference? Also, try
>>> >> running the code several times; the caches and the JVM will help
>>> >> to bring the times down compared to cold runs.
>>> >>
>>> >> HTH,
>>> >>
>>> >> Cheers,
>>> >>
>>> >> /peter neubauer
>>> >>
>>> >> GTalk: neubauer.peter
>>> >> Skype peter.neubauer
>>> >> Phone +46 704 106975
>>> >> LinkedIn http://www.linkedin.com/in/neubauer
>>> >> Twitter http://twitter.com/peterneubauer
>>> >>
>>> >> http://www.neo4j.org - Your high performance graph database.
>>> >> http://startupbootcamp.org/ - Öresund - Innovation happens HERE.
>>> >> http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing
>>> >> party.
>>> >>
>>> >>
>>> >>
>>> >> On Mon, Sep 26, 2011 at 9:54 PM, <danielmoroz...@gmail.com> wrote:
>>> >> > Hi Peter,
>>> >> >
>>> >> > I know you are one of the admins for the forums and I was wondering
>>> >> > if you could assist me with my question. I posted it already on the
>>> >> > forum, but have not received a response.
>>> >> >
>>> >> > My question pertains to running the getAllNodes() method on a
>>> >> > Traverser object. It takes drastically longer to get all nodes than
>>> >> > to traverse them.
>>> >> >
>>> >> > I assumed it was an indexing issue and decompiled the kernel lib
>>> >> > file, but could not find where the indexing was occurring (as it was
>>> >> > not in the Traverser class).
>>> >> >
>>> >> > Could you give me some input? We are attempting to optimize our
>>> >> > algorithms, but 95% of the run time is spent in this one method.
>>> >> >
>>> >> > Here is a copy of my post:
>>> >> >
>>> >> > For your reference the size of the DB is ~6.8 GB
>>> >> > ------------------------------------------------------------
>>> >> >
>>> >> > long startTime = System.currentTimeMillis();
>>> >> > Traverser treeTraverser = root.traverse(
>>> >> >         Traverser.Order.BREADTH_FIRST,
>>> >> >         operator.getStopEvaluator(),
>>> >> >         ReturnableEvaluator.ALL,
>>> >> >         relationshipType,
>>> >> >         Direction.OUTGOING);
>>> >> >
>>> >> > long endTime = System.currentTimeMillis();
>>> >> >
>>> >> > System.out.println("\n||TIME NEEDED FOR TRAVERSAL: " + (endTime - startTime) + " ms||");
>>> >> > int size = 0;
>>> >> > startTime = System.currentTimeMillis();
>>> >> > Iterable<Node> nodeCollection = treeTraverser.getAllNodes();
>>> >> > endTime = System.currentTimeMillis();
>>> >> >
>>> >> > System.out.println("\n||TIME NEEDED TO GET NODES: " + (endTime - startTime) + " ms||");
>>> >> >
>>> >> >
>>> >> > -----------------------------------------------------------
>>> >> > Console output:::
>>> >> >
>>> >> > ||TIME NEEDED FOR TRAVERSAL: 63 ms||
>>> >> >
>>> >> > ||TIME NEEDED TO GET NODES: 56875 ms||
>>> >> >
>>> >> > ------------------------------------------------------------
>>> >> >
>>> >> > Thank you so much,
>>> >> >
>>> >> > Dan
>>> >> >
>>> >>
>>>
>>
>>
>
>
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
