Re: [Neo] how to get an specific node

2008-08-10 Thread Anders Nawroth
Hi!

Thanks for the input, I'll look into doing it in the returnable evaluator.

Good points :-)

/anders

Johan Svensson wrote:
 Hi,

 You could implement your own returnable evaluator that returns only
 best-match or complete-match nodes. Otherwise I would pick the nested
 loop version, since the code is cleaner, or write a combination of
 traverser + for loop in one of the evaluators.

 -Johan
   

___
Neo mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo] how to get an specific node

2008-08-09 Thread Johan Svensson
Hi,

You could implement your own returnable evaluator that returns only
best-match or complete-match nodes. Otherwise I would pick the nested
loop version, since the code is cleaner, or write a combination of
traverser + for loop in one of the evaluators.
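
To make the "returnable evaluator" idea concrete, here is a minimal sketch
in plain Python. It does not use the Neo traverser API: the traverse and
complete_match functions, the graph dictionary and the node names are all
hypothetical stand-ins. The point is only that the walk visits every node
while a separate predicate decides which nodes are returned.

```python
from collections import deque

def traverse(graph, start, returnable):
    """Breadth-first walk from start; yield only nodes the predicate accepts."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if returnable(node):
            yield node
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)

def complete_match(query_words, titles):
    """Build a predicate accepting only title nodes that contain every query word."""
    wanted = set(query_words)
    return lambda node: node in titles and wanted <= set(titles[node].lower().split())

# Hypothetical graph: the word node "vol" is linked to the title nodes containing it.
graph = {"vol": ["t1", "t2"]}
titles = {"t1": "Kill Bill Vol 1", "t2": "Kill Bill Vol 2"}
found = list(traverse(graph, "vol", complete_match(["kill", "bill", "vol", "2"], titles)))
```

A best-match variant would swap the predicate for one that scores each
node and remembers the highest score seen so far.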

-Johan

On Sat, Aug 9, 2008 at 2:28 PM, Anders Nawroth [EMAIL PROTECTED] wrote:
 Hi!

 Johan Svensson wrote:

 One way would be to do a traversal from the word-node with the least
 number of relationships.

 OK, I implemented this. But I *do* have to iterate over the relationships to
 get the count?

 At the moment I only want the search to return one best match: either the
 first that matches all search words or the one matching as many words as
 possible (this alternative is a bit simplistic as I start the traversal from
 one point only).

 I attached some screenshots from my current IMDB node space.

 I tried two different ways of performing the traversal:
 1. traverser framework
 2. nested loops
 Both do have their pros and cons, I think.

 Any suggestions for improvements on the code?
 Which one would you choose?

 /anders


Re: [Neo] how to get an specific node

2008-08-08 Thread Anders Nawroth
hi!

Johan Svensson wrote:
 anders.nawroth wrote:

 As I'm currently looking into the IMDB Workshop stuff, it would be helpful
 to hear your idea on a graph layout to make searches for actor names and
 movie titles in a better way than just using indexing! There's a need for
 supporting partial matches somehow in this case, I think.

 
 Depends on the requirements of the application and the data size. As I
 recall, the IMDB dataset isn't that big, so it could be enough to just
 sort on the first few letters of the string (binary tree).

Not sure what you mean by this. Could you expand a little on what
will go into indexes and what into the node space?

 Or you could use the
 index service to index on every word in each string, having a node for
 each word and then drawing relationships to all movie/actor nodes that
 contain it (reverse index).

OK, sounds similar to materialized views (but better, as it's easy to 
keep the data 100% in sync).

Would it make sense to do the following:
To search for multiple words, create a Map<Node, Integer> that keeps track
of the number of hits (relationships from the word-nodes) for every
Node, then sort the map by value to get the best hits?
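
The hit-counting idea above can be sketched in a few lines of Python. This
is only an illustration under stated assumptions: a dictionary of sets
stands in for the word-nodes and their relationships, title ids stand in
for Node objects, and rank_by_hits and the sample data are hypothetical.

```python
from collections import Counter

def rank_by_hits(index, query):
    """Count, per title id, how many distinct query words point at it."""
    hits = Counter()
    for word in set(query.lower().split()):
        for title_id in index.get(word, ()):
            hits[title_id] += 1
    # most_common() returns (title_id, hit_count) pairs, highest count first.
    return hits.most_common()

# Hypothetical word -> title-id sets standing in for word-nodes.
index = {
    "kill": {1, 2},
    "bill": {1, 2, 3},
    "vol": {1, 2},
    "2": {2},
}
ranking = rank_by_hits(index, "kill bill vol 2")
```

Note that this visits every relationship of every query word, so the cost
grows with the most common word in the query.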

/anders



Re: [Neo] how to get an specific node

2008-08-08 Thread Johan Svensson
On Fri, Aug 8, 2008 at 12:34 PM, Anders Nawroth
[EMAIL PROTECTED] wrote:
 hi!

 Johan Svensson wrote:
 anders.nawroth wrote:

 As I'm currently looking into the IMDB Workshop stuff, it would be helpful
 to hear your idea on a graph layout to make searches for actor names and
 movie titles in a better way than just using indexing! There's a need for
 supporting partial matches somehow in this case, I think.


 Depends on the requirements of the application and the data size. As I
 recall, the IMDB dataset isn't that big, so it could be enough to just
 sort on the first few letters of the string (binary tree).

 Not sure what you mean by this. Could you expand a little on what
 will go into indexes and what into the node space?


Everything in the node space, nothing in the index service. Let's say you
have 1M nodes, each having a title string as a property, and you
want to find all nodes that have a specific title. If the titles are
reasonably evenly distributed alphabetically, a simple sorted tree
(in the node space) over the first two letters would leave you with
a linear search of about 2000 entries for the title. Depending on
application requirements that may be acceptable.
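
A minimal Python sketch of the two-letter prefix idea follows. It assumes
a plain dictionary as a stand-in for the sorted tree in the node space;
build_prefix_buckets, find_title and the sample titles are hypothetical.
The arithmetic assumes 26 evenly used letters, which gives roughly 1,500
entries per bucket for 1M titles, the same order of magnitude as the
figure quoted above.

```python
from collections import defaultdict

def build_prefix_buckets(titles, prefix_len=2):
    """Group titles by their first prefix_len characters (lowercased)."""
    buckets = defaultdict(list)
    for title in titles:
        buckets[title[:prefix_len].lower()].append(title)
    return buckets

def find_title(buckets, wanted, prefix_len=2):
    """Jump to one bucket, then scan it linearly for an exact match."""
    candidates = buckets.get(wanted[:prefix_len].lower(), [])
    return [t for t in candidates if t == wanted]

# Back-of-the-envelope bucket size, assuming 26 evenly used letters.
average_bucket = 1_000_000 / 26 ** 2

titles = ["Kill Bill Vol 1", "Kill Bill Vol 2", "Kiss Kiss Bang Bang", "Alien"]
buckets = build_prefix_buckets(titles)
matches = find_title(buckets, "Kill Bill Vol 2")
```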

 Or you could use the
 index service to index on every word in each string, having a node for
 each word and then drawing relationships to all movie/actor nodes that
 contain it (reverse index).

 OK, sounds similar to materialized views (but better, as it's easy to
 keep the data 100% in sync).

 Would it make sense to do the following:
 To search for multiple words, create a Map<Node, Integer> that keeps track
 of the number of hits (relationships from the word-nodes) for every
 Node, then sort the map by value to get the best hits?


One way would be to do a traversal from the word-node with the least
number of relationships. If I do an AND search for "kill bill vol 2" and
find that the word "vol" only has 10 relationships on it while the
others all have 1000+, it would be smart to start on the "vol" node,
since then we would only have to check 10 movie nodes to see whether the
title really contains all the words searched for.
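
The "start from the rarest word" strategy can be sketched in Python as
follows. This is an illustration, not Neo API code: the index dictionary
stands in for word-nodes and their relationships, and and_search plus the
sample data are hypothetical.

```python
def and_search(index, titles, query):
    """Return ids of titles containing every query word, starting from the rarest word."""
    words = query.lower().split()
    if not words or any(w not in index for w in words):
        return []
    # Start from the word with the fewest relationships (smallest candidate set).
    rarest = min(words, key=lambda w: len(index[w]))
    wanted = set(words)
    return sorted(
        title_id
        for title_id in index[rarest]
        if wanted <= set(titles[title_id].lower().split())
    )

titles = {1: "Kill Bill Vol 1", 2: "Kill Bill Vol 2", 3: "Bill and Ted"}
index = {
    "kill": {1, 2}, "bill": {1, 2, 3}, "vol": {1, 2},
    "1": {1}, "2": {2}, "and": {3}, "ted": {3},
}
result = and_search(index, titles, "kill bill vol 2")
```

Only the titles linked to the rarest word are ever inspected, so the work
is bounded by the smallest relationship count among the query words.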

Another great thing you get when using a traverser to solve this
problem is that you use the iterator idiom. If you have a page
rendering the first 10 hits, then you will only use CPU time to find
the first 10 hits. After that the user must press "next page" to
get more hits. This is (often) much better than calculating the
complete search result and consuming a lot of unnecessary CPU time.
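
The iterator idiom described above maps directly onto a Python generator.
The sketch below is an assumption-laden illustration: matching_titles, the
checked list (used only to show how much work was done) and the generated
sample titles are all hypothetical.

```python
from itertools import islice

def matching_titles(titles, word, checked):
    """Lazily yield titles containing word; 'checked' records each title examined."""
    for title in titles:
        checked.append(title)
        if word in title.lower().split():
            yield title

titles = [f"Movie {n} Vol {n % 3}" for n in range(1000)]
checked = []
hits = matching_titles(titles, "vol", checked)

# Rendering the first page examines only as many titles as needed for 10 hits.
first_page = list(islice(hits, 10))
```

Calling list(islice(hits, 10)) again resumes where the first page stopped,
which corresponds to the user pressing "next page".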

-Johan


Re: [Neo] how to get an specific node

2008-07-30 Thread Johan Svensson
Hi Anders,

On Tue, Jul 29, 2008 at 5:06 PM, Anders Nawroth
[EMAIL PROTECTED] wrote:
 hi Johan!

 Johan Svensson wrote:

 As you say IndexService is very simple and only supports exact lookup
 at the moment. Use the index service to position yourself somewhere in
 the graph (or use Neo's native ids if you can get away with it, since
 getNodeById is lightning fast). Once you have a position you start
 traversing to perform the rest of the query. The graph is in itself an
 index and a well designed graph layout can perform certain queries on
 large datasets orders of magnitude faster compared to a traditional
 relational/set oriented approach.

 ... jumping into this thread again ...

 As I'm currently looking into the IMDB Workshop stuff, it would be helpful
 to hear your idea on a graph layout to make searches for actor names and
 movie titles in a better way than just using indexing! There's a need for
 supporting partial matches somehow in this case, I think.


Depends on the requirements of the application and the data size. As I
recall, the IMDB dataset isn't that big, so it could be enough to just
sort on the first few letters of the string (binary tree). Or you could
use the index service to index on every word in each string, having a
node for each word and then drawing relationships to all movie/actor
nodes that contain it (reverse index).
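
As a rough illustration of the reverse index, here is a minimal Python
sketch. It uses a dictionary of sets in place of word-nodes and
relationships; build_reverse_index and the sample titles are hypothetical
and not taken from the IMDB dataset.

```python
from collections import defaultdict

def build_reverse_index(titles):
    """Map each lowercased word to the set of title ids that contain it."""
    index = defaultdict(set)
    for title_id, title in titles.items():
        for word in title.lower().split():
            index[word].add(title_id)
    return index

# Hypothetical sample data.
titles = {
    1: "Kill Bill Vol 1",
    2: "Kill Bill Vol 2",
    3: "Bill and Ted",
}
index = build_reverse_index(titles)
```

In the graph version, each key would be a word-node found through an
exact index lookup, and each set member a relationship to a movie or
actor node.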

-Johan


Re: [Neo] how to get an specific node

2008-07-29 Thread Johan Svensson
Hi Andreas,

As you say IndexService is very simple and only supports exact lookup
at the moment. Use the index service to position yourself somewhere in
the graph (or use Neo's native ids if you can get away with it, since
getNodeById is lightning fast). Once you have a position you start
traversing to perform the rest of the query. The graph is in itself an
index and a well designed graph layout can perform certain queries on
large datasets orders of magnitude faster compared to a traditional
relational/set oriented approach.
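
The "position yourself, then traverse" pattern can be sketched in Python
like this. Everything here is a hypothetical stand-in: name_index plays
the role of an exact-lookup index, acted_in and cast_of play the role of
relationships, and the names and ids are made up for the example.

```python
# Hypothetical exact-match index: actor name -> node id.
name_index = {"Actor A": "a1", "Actor B": "a2"}
# Hypothetical relationships: actor -> movies, and movie -> cast.
acted_in = {"a1": ["m1", "m2"], "a2": ["m1"]}
cast_of = {"m1": ["a1", "a2"], "m2": ["a1"]}

def co_actors(name):
    """Exact index lookup for the start node, then two traversal hops."""
    start = name_index.get(name)
    if start is None:
        return set()
    found = set()
    for movie in acted_in.get(start, ()):
        for actor in cast_of.get(movie, ()):
            if actor != start:
                found.add(actor)
    return found

result = co_actors("Actor A")
```

The cost of the two hops depends on how many movies and cast members are
attached to the start node, not on the total number of nodes stored.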

I can further clarify this with some use cases if needed but the idea
is to position yourself in the graph then traverse (some
queries/searches perform even better starting from two or more
positions concurrently). When modeling your data in a graph, many types
of searches/queries will then run in constant time as the dataset
grows, instead of slowing down linearly (or, even worse, exponentially).

It may feel easier to just use some query language, declare a
question, execute it and get the result, but once you start thinking in
graphs, performing searches using the Neo API is very intuitive. We do
lack library and tool support but are working on new components. Soon
we will release a graph library with many common graph algorithms
implemented and we're also improving our RDF/SPARQL support
continuously.

However, do you have some example/use case where the simple
indexing+graph isn't enough/applicable to solve your problem? What
kind of additional support are you intending to provide in
Neo4j.rb? I would very much like to hear what you are working on and
about Neo4j.rb in general.

Regards,
Johan

On Tue, Jul 29, 2008 at 11:35 AM, Andreas Ronge [EMAIL PROTECTED] wrote:
 Hi

 IndexService is indeed very simple, maybe too simple?
 Would it not be nice to be able to search for a node id by several properties
 with AND/OR values, ranges and regexps, as is available in Lucene? Or is
 it not needed?
 I'm working on those issues in the neo4j.rb project.

 Sincerely
 Andreas

 On Sat, Jul 26, 2008 at 12:17 AM, Anders Nawroth
 [EMAIL PROTECTED] wrote:
 I want to add that the IndexService interface can be found here:
 https://svn.neo4j.org/components/index-util/trunk/src/main/java/org/neo4j/util/index/IndexService.java

 It's simple to use.

 /anders



Re: [Neo] how to get an specific node

2008-07-29 Thread Anders Nawroth

hi Johan!

Johan Svensson wrote:

As you say IndexService is very simple and only supports exact lookup
at the moment. Use the index service to position yourself somewhere in
the graph (or use Neo's native ids if you can get away with it, since
getNodeById is lightning fast). Once you have a position you start
traversing to perform the rest of the query. The graph is in itself an
index and a well designed graph layout can perform certain queries on
large datasets orders of magnitude faster compared to a traditional
relational/set oriented approach.


... jumping into this thread again ...

As I'm currently looking into the IMDB Workshop stuff, it would be 
helpful to hear your idea on a graph layout to make searches for actor 
names and movie titles in a better way than just using indexing! There's
a need for supporting partial matches somehow in this case, I think.


/anders
