Re: [Neo] how to get an specific node
Hi!

Thanks for the input, I'll look into doing it in the returnable evaluator. Good points :-)

/anders

Johan Svensson wrote:

Hi,

You could implement your own returnable evaluator that returns only best-match or complete-match nodes. If not, I would pick the nested-loop one since the code is cleaner, or write a combination of traverser + for loop in one of the evaluators.

-Johan

___
Neo mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo] how to get an specific node
Hi,

You could implement your own returnable evaluator that returns only best-match or complete-match nodes. If not, I would pick the nested-loop one since the code is cleaner, or write a combination of traverser + for loop in one of the evaluators.

-Johan

On Sat, Aug 9, 2008 at 2:28 PM, Anders Nawroth [EMAIL PROTECTED] wrote:

Hi!

Johan Svensson wrote:

One way would be to do a traversal from the word-node with the least number of relationships.

OK, I implemented this. But I *do* have to iterate over the relationships to get the count?

At the moment I only want the search to return one best match: either the first that matches all search words or the one matching as many words as possible (this alternative is a bit simplistic as I start the traversal from one point only).

I attached some screenshots from my current IMDB node space.

I tried two different ways of performing the traversal:

1. traverser framework
2. nested loops

Both have their pros and cons, I think. Any suggestions for improvements to the code? Which one would you choose?!

/anders
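A minimal sketch of the "one best match" rule described above, written in plain Python rather than against the Neo API. The dictionary `index` stands in for word-nodes and their relationships to movie nodes, and the names `build_word_index` and `best_match` are illustrative assumptions, not code from the attached screenshots. It starts from a single word (the one with the fewest titles), returns the first title that matches all search words, and otherwise falls back to the title matching the most words.

```python
from collections import defaultdict


def build_word_index(titles):
    """Map each lower-cased word to the set of titles containing it."""
    index = defaultdict(set)
    for title in titles:
        for word in title.lower().split():
            index[word].add(title)
    return dict(index)


def best_match(index, query):
    """Return the first complete match, else the best partial match, else None."""
    words = query.lower().split()
    known = [w for w in words if w in index]
    if not known:
        return None
    # Start from one point only: the known word with the fewest titles.
    start = min(known, key=lambda w: len(index[w]))
    best, best_hits = None, 0
    for title in sorted(index[start]):
        hits = sum(1 for w in words if title in index.get(w, ()))
        if hits == len(words):
            return title  # complete match: stop immediately
        if hits > best_hits:
            best, best_hits = title, hits
    return best
```

As noted in the message, the fallback is simplistic: because only titles reachable from the starting word are examined, a title that matches many of the other words but not the starting one is never considered.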
Re: [Neo] how to get an specific node
hi!

Johan Svensson wrote:

anders.nawroth wrote:

As I'm currently looking into the IMDB Workshop stuff, it would be helpful to hear your idea on a graph layout to make searches for actor names and movie titles in a better way than just using indexing! There's a need for supporting partial matches somehow in this case, I think.

Depends on requirements of application and data size. As I recall the IMDB dataset isn't that big, so it could be enough to just sort the first few letters in the string (binary tree).

Not sure what you mean by this, could you expand a little on what will go into indexes and what into the node space?

Or you could use the index service to index on every word in each string, having a node for each word and then drawing relationships to all movie/actor nodes that contain it (reverse index).

OK, sounds similar to materialized views (but better, as it's easy to keep the data 100% in sync).

Would it make sense to do the following: To search for multiple words, create a Map<Node, Integer>, keeping track of the number of hits (relationships from the word-nodes) for every Node. Then sort the map by value to get the best hits.

/anders
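The hit-counting idea can be sketched as follows, in plain Python rather than the Neo API. A `Counter` plays the role of the proposed map from node to hit count, and a dictionary from word to set of titles stands in for the word-nodes and their relationships. The function name `rank_by_hits` is an assumption made for illustration.

```python
from collections import Counter


def rank_by_hits(index, query):
    """Count, per title, how many distinct query words point at it,
    then return (title, hits) pairs sorted by hits, highest first."""
    hits = Counter()
    for word in set(query.lower().split()):
        for title in index.get(word, ()):
            hits[title] += 1
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))
```

Note that this visits every relationship of every search word before anything can be returned, so the cost grows with the most common word in the query.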
Re: [Neo] how to get an specific node
On Fri, Aug 8, 2008 at 12:34 PM, Anders Nawroth [EMAIL PROTECTED] wrote:

hi!

Johan Svensson wrote:

anders.nawroth wrote:

As I'm currently looking into the IMDB Workshop stuff, it would be helpful to hear your idea on a graph layout to make searches for actor names and movie titles in a better way than just using indexing! There's a need for supporting partial matches somehow in this case, I think.

Depends on requirements of application and data size. As I recall the IMDB dataset isn't that big, so it could be enough to just sort the first few letters in the string (binary tree).

Not sure what you mean by this, could you expand a little on what will go into indexes and what into the node space?

Everything in the node space, nothing into the index service. Let's say you have 1M nodes, each having a title string as a property, and you want to find all nodes that have a specific title. If the titles are reasonably evenly distributed alphabetically and you have a simple sorted tree (in the node space) on the first two letters, you would then have to linearly search about 2000 entries for the title. Depending on application requirements that may be acceptable.

Or you could use the index service to index on every word in each string, having a node for each word and then drawing relationships to all movie/actor nodes that contain it (reverse index).

OK, sounds similar to materialized views (but better, as it's easy to keep the data 100% in sync).

Would it make sense to do the following: To search for multiple words, create a Map<Node, Integer>, keeping track of the number of hits (relationships from the word-nodes) for every Node. Then sort the map by value to get the best hits.

One way would be to do a traversal from the word-node with the least number of relationships. If I do an AND search for "kill bill vol 2" and find that the word "vol" only had 10 relationships on it while the others all had 1000+, it would be smart to start on the "vol" node, since then we would only have to check 10 movie nodes to see if the title really contained all the words searched for.

Another great thing you get when using a traverser to solve this problem is that you use the iterator idiom. If you have a page rendering the first 10 hits then you will only use CPU time to find the first 10 hits. After that the user must press "next page" to get more hits. This is (often) much better than calculating the complete search result, consuming a lot of unnecessary CPU time.

-Johan
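The two points above (start from the word with the fewest relationships, and produce hits lazily) can be sketched together in plain Python. This is not the Neo traverser framework: a dictionary from word to set of titles stands in for the word-nodes, a generator stands in for the traverser's iterator, and the name `and_search` is an illustrative assumption.

```python
from itertools import islice


def and_search(index, query):
    """Lazily yield titles that contain every word in the query."""
    words = set(query.lower().split())
    if not words or any(w not in index for w in words):
        return  # an unknown word means no title can match all words
    # Start on the word with the fewest titles, so as few
    # candidates as possible have to be checked.
    rarest = min(words, key=lambda w: len(index[w]))
    others = words - {rarest}
    for title in sorted(index[rarest]):
        if all(title in index[w] for w in others):
            yield title


def first_page(index, query, page_size=10):
    """Pull only the first page of hits; the rest are never computed."""
    return list(islice(and_search(index, query), page_size))
```

Because `and_search` is a generator, `first_page` stops checking candidates as soon as it has `page_size` hits, which mirrors the "first 10 hits" behaviour described in the message.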
Re: [Neo] how to get an specific node
Hi Anders,

On Tue, Jul 29, 2008 at 5:06 PM, Anders Nawroth [EMAIL PROTECTED] wrote:

hi Johan!

Johan Svensson wrote:

As you say, IndexService is very simple and only supports exact lookup at the moment. Use the index service to position yourself somewhere in the graph (or, if you can get away with it, use Neo's native ids, since getNodeById is lightning fast). Once you have a position you start traversing to perform the rest of the query. The graph is in itself an index, and a well designed graph layout can perform certain queries on large datasets orders of magnitude faster compared to a traditional relational/set oriented approach.

... jumping into this thread again ...

As I'm currently looking into the IMDB Workshop stuff, it would be helpful to hear your idea on a graph layout to make searches for actor names and movie titles in a better way than just using indexing! There's a need for supporting partial matches somehow in this case, I think.

Depends on requirements of application and data size. As I recall the IMDB dataset isn't that big, so it could be enough to just sort the first few letters in the string (binary tree). Or you could use the index service to index on every word in each string, having a node for each word and then drawing relationships to all movie/actor nodes that contain it (reverse index).

-Johan
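A rough sketch of the "sort the first few letters" idea, in plain Python rather than the node space. A dictionary keyed on a two-letter prefix stands in for the sorted tree, and the function names are assumptions made for illustration. With 26 letters there are at most 676 two-letter prefixes, so 1M evenly spread titles would leave roughly 1,500 per bucket, the same order of magnitude as the "about 2000" estimate given elsewhere in this thread.

```python
from collections import defaultdict


def build_prefix_buckets(titles, prefix_len=2):
    """Group titles by their lower-cased first `prefix_len` characters."""
    buckets = defaultdict(list)
    for title in titles:
        buckets[title[:prefix_len].lower()].append(title)
    return dict(buckets)


def find_by_title(buckets, title, prefix_len=2):
    """Exact lookup: jump to one bucket, then search it linearly."""
    candidates = buckets.get(title[:prefix_len].lower(), [])
    return [t for t in candidates if t == title]


def find_starting_with(buckets, text, prefix_len=2):
    """Partial match on the start of the title, within one bucket."""
    if len(text) < prefix_len:
        raise ValueError("search text is shorter than the bucket prefix")
    candidates = buckets.get(text[:prefix_len].lower(), [])
    return [t for t in candidates if t.lower().startswith(text.lower())]
```

This only helps with matches anchored at the start of the string; finding a word in the middle of a title is what the reverse-index alternative addresses.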
Re: [Neo] how to get an specific node
Hi Andreas,

As you say, IndexService is very simple and only supports exact lookup at the moment. Use the index service to position yourself somewhere in the graph (or, if you can get away with it, use Neo's native ids, since getNodeById is lightning fast). Once you have a position you start traversing to perform the rest of the query. The graph is in itself an index, and a well designed graph layout can perform certain queries on large datasets orders of magnitude faster compared to a traditional relational/set oriented approach.

I can further clarify this with some use cases if needed, but the idea is to position yourself in the graph and then traverse (some queries/searches perform even better starting from two or more positions concurrently). When modeling your data in a graph, many types of searches/queries will then perform in constant time as the dataset grows, instead of increasing linearly (or, even worse, exponentially).

It may feel easier to just use some query language, declare a question, execute and get the result, but once you start thinking in graphs, performing searches using the Neo API is very intuitive. We do lack library and tool support but are working on new components. Soon we will release a graph library with many common graph algorithms implemented, and we're also improving our RDF/SPARQL support continuously.

However, do you have some example/use case where the simple indexing+graph isn't enough/applicable to solve your problem? What kind of additional support are you intending to provide in Neo4j.rb? I would very much like to hear what you are working on, and about Neo4j.rb in general.

Regards,
Johan

On Tue, Jul 29, 2008 at 11:35 AM, Andreas Ronge [EMAIL PROTECTED] wrote:

Hi

IndexService is indeed very simple, maybe too simple? Would it not be nice to be able to search for a node id by several properties with AND/OR values, ranges and regexps, as is available in Lucene? Or is it not needed? I'm working on those issues in the neo4j.rb project.

Sincerely
Andreas

On Sat, Jul 26, 2008 at 12:17 AM, Anders Nawroth [EMAIL PROTECTED] wrote:

I want to add that the IndexService interface can be found here:
https://svn.neo4j.org/components/index-util/trunk/src/main/java/org/neo4j/util/index/IndexService.java

It's simple to use.

/anders
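The "position yourself, then traverse" pattern can be sketched in plain Python. None of this is the Neo or IndexService API: the `name_index` dictionary stands in for an exact-lookup index, the `graph` adjacency dictionary stands in for the node space, and the node labels are made-up placeholders.

```python
from collections import deque


def traverse(graph, start, max_depth):
    """Breadth-first traversal that lazily yields (node, depth) pairs."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        yield node, depth
        if depth < max_depth:
            for neighbour in graph.get(node, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append((neighbour, depth + 1))


# Hypothetical data: two actors and two movies.
name_index = {"Actor A": "actor:A", "Actor B": "actor:B"}
graph = {
    "actor:A": ["movie:M1", "movie:M2"],
    "actor:B": ["movie:M2"],
    "movie:M1": ["actor:A"],
    "movie:M2": ["actor:A", "actor:B"],
}

# Step 1: exact index lookup gives a starting position.
start = name_index["Actor A"]
# Step 2: traverse from there; nodes at depth 2 are co-actors.
co_actors = [node for node, depth in traverse(graph, start, 2) if depth == 2]
```

The index is touched once, for the starting position; everything after that follows relationships outward from that node, so the work depends on the size of the neighbourhood rather than on the total number of nodes.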
Re: [Neo] how to get an specific node
hi Johan!

Johan Svensson wrote:

As you say, IndexService is very simple and only supports exact lookup at the moment. Use the index service to position yourself somewhere in the graph (or, if you can get away with it, use Neo's native ids, since getNodeById is lightning fast). Once you have a position you start traversing to perform the rest of the query. The graph is in itself an index, and a well designed graph layout can perform certain queries on large datasets orders of magnitude faster compared to a traditional relational/set oriented approach.

... jumping into this thread again ...

As I'm currently looking into the IMDB Workshop stuff, it would be helpful to hear your idea on a graph layout to make searches for actor names and movie titles in a better way than just using indexing! There's a need for supporting partial matches somehow in this case, I think.

/anders