2011/2/23 Kiss Miklós <[email protected]>

> Thanks for the response.
>
> Then my idea of a server plugin wasn't a bad idea, great.
> My next question is then: how do I traverse only a part of the possible
> sub-graph?
> I mean: let's suppose I start traversing from node 'A' and want to get
> all paths of length 2 on relationships 'TYPE_X' and 'TYPE_Y'. Let's say
> node 'A' has 5000 'TYPE_X' relationships and all connected nodes have 10
> 'TYPE_Y' relationships. If I start my traversal from node 'A' on the server
> side, how can I stop after fetching 1000 paths and later continue from
> where I stopped? Or am I missing something very important?
>
> I have an idea to put extra nodes into my graph that collect (let's say)
> 100 same-typed connections and stand as an intermediate node between 'A'
> and 'B'. Using this would allow me to collect all paths between 'A' and
> 'B' in multiple steps, each step returning at most 100 paths only.
> However, this scheme is harder to maintain and makes the connections
> harder to read and also has a performance penalty (direct connections
> become indirect 2-step connections).

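To make the "stop after 1000 paths and continue later" scenario concrete, here is a rough Python sketch of the general idea: keep the traversal as a lazy iterator and pull fixed-size batches from it. This is only an illustration under stated assumptions. It uses a toy in-memory dictionary, not the Neo4j API, and the names `two_step_paths`, `next_batch` and the graph layout are hypothetical.

```python
from itertools import islice


def two_step_paths(graph, start, first_type, second_type):
    """Lazily yield (start, middle, end) paths; nothing is built up front."""
    for middle in graph.get((start, first_type), []):
        for end in graph.get((middle, second_type), []):
            yield (start, middle, end)


def next_batch(paths, size):
    """Pull at most `size` paths; the iterator itself remembers its position."""
    return list(islice(paths, size))


# Toy data with the numbers from the question (assumed layout):
# 'A' has 5000 TYPE_X neighbours, each of which has 10 TYPE_Y neighbours.
graph = {("A", "TYPE_X"): [f"m{i}" for i in range(5000)]}
for i in range(5000):
    graph[(f"m{i}", "TYPE_Y")] = [f"m{i}_e{j}" for j in range(10)]

paths = two_step_paths(graph, "A", "TYPE_X", "TYPE_Y")
first = next_batch(paths, 1000)   # the first 1000 paths
second = next_batch(paths, 1000)  # resumes where the first batch stopped
```

Because the generator only advances when asked, memory use is bounded by the batch size rather than by the 50,000 total paths. Something on the server still has to hold on to the iterator between requests.
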
Seems awfully complex to maintain, as you mentioned. There should be
another way of doing this, but I think you'll have to roll your own.
Have you made any progress here? I just thought about it: could it be
done by writing your own server plugin that does the path calculation
and returns a URI or ID from which you can start fetching results? The
plugin merely does the calculation, assigns the result a new ID and
returns. Another GET request could then read that result and iterate
over only N items, and the next GET request to that ID could continue
the iteration where the previous one stopped. I don't know, just
thinking out loud.

> Or is this something very similar that can be achieved with proper
> indexing (like you mentioned)?

Doing path algorithms with mixed indexing can be rather slow, and
limiting the index results wouldn't solve your problem, would it?

> Am I on the right track?
>
> Miklós Kiss
>
> On 2011.02.23 at 14:20, Michael Hunger wrote:
> > First - you should perhaps write a Server-Plugin that does your heavy
> > lifting on the server and provides a REST endpoint to get the results.
> > Not sure if non-GET verbs are supported yet (otherwise you can always
> > go for an unmanaged extension defining your own resources).
> >
> > You can do indexing for certain start nodes and then use the traversal
> > facilities to update your graph (if this is fitting). E.g. you can use
> > the javascript evaluators not only to evaluate/query but also to
> > update the graph.
> >
> > Hope that helps
> >
> > Michael
> >
> > We're also thinking about a more terse or binary API that would make
> > server interaction more efficient but I think that is the wrong
> > direction for your use case. Rather move into the server what belongs
> > there and expose appropriate resources for your clients to interact with.
> >
> > 2011/2/23 Kiss Miklós<[email protected]>:
> >> Hi all,
> >>
> >> I'd like to get ideas on how to handle a (relatively) big graph.
> >> My graph is stored in a neo4j server. The structure is simple but highly
> >> interconnected:
> >> - I have nodes containing longer texts
> >> - and I have many nodes containing tokens of those texts.
> >> Relationships connect tokens to texts so I have many relationships. The
> >> actual graph does have many other nodes too but this is irrelevant now.
> >> The graph contains 300k nodes, 2.5 million properties and 1 million
> >> relationships (and is still growing).
> >>
> >> My question is how to execute queries on the graph. I have to execute
> >> operations that usually require querying huge parts of the graph. I
> >> mean: get all the tokens for some of the texts; or even get all the
> >> tokens. (I'm creating a text processing system that is learning and the
> >> teaching process involves manipulation of all tokens. I think it's much
> >> faster executed in memory rather than querying each token separately.)
> >>
> >> The naive solution (traverse the graph from the root node with depth 1 to
> >> get all the nodes of a certain type) is now unusable since my graph is
> >> too big. The server simply runs out of memory (I gave it 1024 MB; this
> >> is around the maximum until the server gets separate hardware).
> >>
> >> So my question is how to implement the querying of the graph correctly
> >> and efficiently. Should I create custom extensions that traverse
> >> and return only a part of the graph in such a scenario? Or should I insert
> >> additional "control" nodes into the graph which can be used as reference
> >> points for querying? The main problem is that I have many same-typed
> >> relationships. I don't know how to manage traversing the graph partially
> >> if it is only accessible through the REST protocol.
> >>
> >> Any help would be appreciated!
> >>
> >> Thanks in advance,
> >> Miklós Kiss
> >> _______________________________________________
> >> Neo4j mailing list
> >> [email protected]
> >> https://lists.neo4j.org/mailman/listinfo/user

--
Mattias Persson, [[email protected]]
Hacker, Neo Technology
www.neotechnology.com

