Re: [Neo4j] server with big (huge?) graph

Kiss Miklós Wed, 23 Feb 2011 05:48:36 -0800

Thanks for the response.

So my idea of a server plugin wasn't a bad one, great.
My next question is then: how do I traverse only a part of the possible 
sub-graph?
I mean: suppose I start traversing from node 'A' and want to get 
all paths of length 2 over relationships 'TYPE_X' and 'TYPE_Y'. Let's say 
node 'A' has 5000 'TYPE_X' relationships and each connected node has 10 
'TYPE_Y' relationships. If I start my traversal from node 'A' on the 
server side, how can I stop after fetching 1000 paths and later continue 
from where I stopped? Or am I missing something very important?
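
To make the question concrete, here is a minimal sketch of what "stop after 
1000 paths and continue later" could look like. It is plain Python over an 
in-memory dictionary, not Neo4j code: the Graph layout, the function name 
two_hop_page and the (i, j) cursor are all assumptions made for illustration. 
It also assumes that each path follows 'TYPE_X' first and 'TYPE_Y' second, 
and that relationships come back in a stable order between calls, which a 
real server would have to guarantee separately.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory stand-in for the graph:
# node -> relationship type -> ordered list of neighbour nodes.
Graph = Dict[str, Dict[str, List[str]]]
# (index into the TYPE_X neighbours, index into that neighbour's TYPE_Y list)
Cursor = Tuple[int, int]
Path = Tuple[str, str, str]


def two_hop_page(graph: Graph, start: str, cursor: Cursor = (0, 0),
                 limit: int = 1000) -> Tuple[List[Path], Optional[Cursor]]:
    """Return up to `limit` paths start -TYPE_X-> mid -TYPE_Y-> end,
    plus the cursor to resume from (None once everything was returned)."""
    paths: List[Path] = []
    mids = graph.get(start, {}).get("TYPE_X", [])
    i, j = cursor
    while i < len(mids):
        ends = graph.get(mids[i], {}).get("TYPE_Y", [])
        while j < len(ends):
            if len(paths) == limit:
                return paths, (i, j)  # page is full: resume exactly here
            paths.append((start, mids[i], ends[j]))
            j += 1
        i, j = i + 1, 0  # move on to the next TYPE_X neighbour
    return paths, None  # traversal exhausted


def build_example(n_mid: int = 5000, n_end: int = 10) -> Graph:
    """Build the shape described above: 'A' has n_mid TYPE_X neighbours,
    each of which has n_end TYPE_Y neighbours."""
    graph: Graph = {"A": {"TYPE_X": [f"m{k}" for k in range(n_mid)]}}
    for k in range(n_mid):
        graph[f"m{k}"] = {"TYPE_Y": [f"m{k}_e{e}" for e in range(n_end)]}
    return graph
```

The client keeps the cursor and sends it back with the next request, so the 
server would not need to hold any traversal state between calls. With the 
numbers above there are 5000 * 10 = 50000 paths in total.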


I have an idea to put extra nodes into my graph, each collecting (let's 
say) 100 connections of the same type and standing as an intermediate node 
between 'A' and 'B'. This would allow me to collect all paths between 'A' 
and 'B' in multiple steps, each step returning at most 100 paths. 
However, this scheme is harder to maintain, makes the connections 
harder to read and has a performance penalty (direct connections 
become indirect 2-step connections).
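
A rough sketch of this intermediate-node scheme, again in plain Python with 
an adjacency dictionary rather than real Neo4j calls. The function names and 
the "A#bucket0" naming are hypothetical and only there to illustrate the idea.

```python
from typing import Dict, List, Tuple


def add_bucket_nodes(source: str, targets: List[str],
                     bucket_size: int = 100) -> Dict[str, List[str]]:
    """Replace direct source -> target links with
    source -> bucket -> target links, bucket_size targets per bucket."""
    adjacency: Dict[str, List[str]] = {source: []}
    for offset in range(0, len(targets), bucket_size):
        # Hypothetical naming scheme for the intermediate node.
        bucket = f"{source}#bucket{offset // bucket_size}"
        adjacency[source].append(bucket)
        adjacency[bucket] = targets[offset:offset + bucket_size]
    return adjacency


def fetch_bucket(adjacency: Dict[str, List[str]], source: str,
                 bucket_index: int) -> List[Tuple[str, str, str]]:
    """One step: return the (at most bucket_size) paths through one bucket."""
    bucket = adjacency[source][bucket_index]
    return [(source, bucket, target) for target in adjacency[bucket]]
```

Each call to fetch_bucket is one of the "steps": the client walks the bucket 
indexes one at a time. The maintenance cost shows up as soon as connections 
are added or removed, because the buckets then have to be rebalanced.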

Or can something very similar be achieved with proper 
indexing (like you mentioned)?

Am I on the right track?

Miklós Kiss

On 2011.02.23 at 14:20, Michael Hunger wrote:
> First - you should perhaps write a Server-Plugin that does your heavy
> lifting on the server and provides a REST endpoint to get the results.
> Not sure if non-GET verbs are supported yet (otherwise you can always
> go for an unmanaged extension defining your own resources).
>
> You can do indexing for certain start nodes and then use the traversal
> facilities to update your graph (if this is fitting). E.g. you can use
> the javascript evaluators not only to evaluate/query but also to
> update the graph.
>
> Hope that helps
>
> Michael
>
> We're also thinking about a more terse or binary API that would make
> server interaction more efficient but I think that is the wrong direction for
> your usecase. Rather move into the server what belongs there and
> expose appropriate resources for your clients to interact with.
>
> 2011/2/23 Kiss Miklós<[email protected]>:
>> Hi all,
>>
>> I'd like to get ideas on how to handle a (relatively) big graph. My
>> graph is stored in a neo4j server. The structure is simple but highly
>> interconnected:
>> - I have nodes containing longer texts
>> - and I have many nodes containing tokens of those texts.
>> Relationships connect tokens to texts so I have many relationships. The
>> actual graph does have many other nodes too but this is irrelevant now.
>> The graph contains 300k nodes, 2.5 million properties and 1 million
>> relationships (and is still growing).
>>
>> My question is how to execute queries on the graph. I have to execute
>> operations that usually require querying huge parts of the graph. I
>> mean: get all the tokens for some of the texts; or even get all the
>> tokens. (I'm creating a text processing system that is learning and the
>> teaching process involves manipulation of all tokens - I think it's much
>> faster executed in memory rather than querying each token separately).
>>
>> The naive solution (traverse the graph from the root node to depth 1 to
>> get all the nodes of a certain type) is now unusable since my graph is
>> too big. The server simply runs out of memory (I gave it 1024 MB - this
>> is around the maximum until the server gets separate hardware).
>>
>> So my question is how to implement correctly and efficiently the
>> querying of the graph? Should I create custom extensions that traverse
>> and return only a part of the graph in such scenario? Or should I insert
>> additional "control" nodes to the graph which can be used as reference
>> points for querying? The main problem is that I have many relationships
>> of the same type. I don't know how to manage traversing the graph partially
>> if it is only accessible through the REST protocol.
>>
>> Any help would be appreciated!
>>
>> Thanks in advance,
>> Miklós Kiss
>> _______________________________________________
>> Neo4j mailing list
>> [email protected]
>> https://lists.neo4j.org/mailman/listinfo/user
>>

