Re: [Neo4j] Batch Insertion, how to index relationships?

Pablo Pareja Wed, 02 Mar 2011 02:35:51 -0800

On Wed, Mar 2, 2011 at 11:00 AM, Mattias Persson
<[email protected]> wrote:


> 2011/3/2 Pablo Pareja <[email protected]>:
> > Hi!
> >
> > I just checked the wiki looking for information on how to index
> > relationships in batch insertion mode but
> > didn't find anything so far.
> > This can be found in the wiki regarding relationship indexing:
> >
> > RelationshipIndex friendships = graphDb.index().forRelationships( "friendships" );
> > // "type" isn't a reserved key and isn't indexed automatically
> > Relationship relationship = friendships.get( "type", "knows", morpheus, trinity ).getSingle();
> >
> > However I cannot find any code snippet for adding relationships to the
> > index, not just querying it.
> > How can these two different cases (*batch insertion and standard mode*)
> > be carried out?
>
> It's just like adding to a node index. A RelationshipIndex extends
> Index<Relationship>, so adding to a RelationshipIndex is:
>
>  RelationshipIndex relIndex = ...
>  relIndex.add( myRelationship, "key", "value" );
>
> I'll add it to the wiki as well.
>

Thanks ;)

Just one more question about this: what if you want to index a relationship
not by any property but only by the nodes involved?
I've seen that once a relationship has been indexed, you can query the
index with start/end node parameters, but how can I index it when I only
care about those nodes and not about any key-value pair?
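
To make the question concrete, here is a toy model in plain Java of what I
have in mind. It is not the Neo4j API: ToyRelationshipIndex and the
placeholder key "indexed" are names I made up for illustration. Every
relationship is added under the same constant key/value, and lookups then
filter on the start/end nodes only. Whether the real RelationshipIndex can
or should be used this way is exactly what I'm asking.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a relationship index. This is NOT the Neo4j API.
final class ToyRelationshipIndex {
    // One indexed entry: relationship id, its key/value pair and its two end nodes.
    private static final class Entry {
        final long relId, startNode, endNode;
        final String key;
        final Object value;

        Entry(long relId, long startNode, long endNode, String key, Object value) {
            this.relId = relId;
            this.startNode = startNode;
            this.endNode = endNode;
            this.key = key;
            this.value = value;
        }
    }

    private final List<Entry> entries = new ArrayList<Entry>();

    // Index a relationship under a key/value pair, remembering its end nodes.
    void add(long relId, long startNode, long endNode, String key, Object value) {
        entries.add(new Entry(relId, startNode, endNode, key, value));
    }

    // A null argument means "do not filter on this"; the key/value filter is
    // skipped if either key or value is null.
    List<Long> get(String key, Object value, Long startNode, Long endNode) {
        List<Long> hits = new ArrayList<Long>();
        for (Entry e : entries) {
            boolean keyOk = key == null || value == null
                    || (e.key.equals(key) && e.value.equals(value));
            boolean startOk = startNode == null || e.startNode == startNode;
            boolean endOk = endNode == null || e.endNode == endNode;
            if (keyOk && startOk && endOk) {
                hits.add(e.relId);
            }
        }
        return hits;
    }
}
```

In this model, adding each relationship with add( id, start, end, "indexed",
"true" ) and querying with get( "indexed", "true", start, end ) returns only
the relationships between those two nodes.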

>
> >
> > Besides that, I was wondering how node relationships retrieval is
> > implemented.
> > In my domain model I have some nodes which have hundreds of thousands of
> > relationships, including different types of them.
> >
> > Suppose you are already positioned on one of these nodes and you want to
> > get only one specific type of incoming relationship. Does retrieval
> > time depend on how many relationships there are in total, including
> > other types? Or, once you specify the relationship type in the
> > node.getRelationships(...) method, does the number of relationships of
> > other types stop mattering for retrieval time?
>
> I'd really like to answer: "it doesn't matter". But currently it
> does... there are two phases here: one where the relationships aren't
> cached in memory yet (or have fallen out of memory to make room for
> loading other relationships). In that phase, loading relationships,
> whether of a specific type or all of them, is linear in the number of
> relationships on that node.
>
> The other is when the node is fully cached with all its relationships
> and you get relationships of a specific type(s). Then it won't matter
> how many relationships there are other than those of the given type(s).
> Direction for those relationships is filtered when you get them, but I
> think soon the cache layer will also optimize on direction so that not
> even filtering will be required when getting relationships of a
> specific type and direction.
>
> There are some thoughts about also making the storing/loading of
> relationships type-and-direction-aware so that the number of
> relationships (outside of your given type/direction) really won't
> matter. Will probably be a while before such a thing even has a chance
> to make it in though.
>
>
OK, that's actually what I suspected after running some test cases; however,
I was somehow hoping it wouldn't be that way and that I was simply doing
something wrong.
This means that, at least in my case, I'll have to index a fairly large
number of relationships in order to achieve decent retrieval times for
highly connected nodes. Any advice or suggestions for tuning performance in
these cases, apart from indexing relationships?
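
For reference, this is how I understand the two phases described above,
written as a toy model in plain Java. It is my own sketch, not Neo4j
internals: ToyNode, cacheByType and recordsRead are invented names. The
first call reads every relationship on the node regardless of the requested
type; later calls are a single lookup by type.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of one node's relationships. NOT how Neo4j is implemented.
final class ToyNode {
    private final List<String> storedTypes;          // position i holds the type of relationship i
    private Map<String, List<Integer>> cacheByType;  // null until the first load
    int recordsRead = 0;                             // counts simulated store reads

    ToyNode(List<String> storedTypes) {
        this.storedTypes = storedTypes;
    }

    List<Integer> getRelationships(String type) {
        if (cacheByType == null) {
            // Cold phase: every relationship is read, whatever type was asked for.
            cacheByType = new HashMap<String, List<Integer>>();
            for (int id = 0; id < storedTypes.size(); id++) {
                recordsRead++;
                String t = storedTypes.get(id);
                List<Integer> ids = cacheByType.get(t);
                if (ids == null) {
                    ids = new ArrayList<Integer>();
                    cacheByType.put(t, ids);
                }
                ids.add(id);
            }
        }
        // Warm phase: one map lookup; relationships of other types are not touched.
        List<Integer> hit = cacheByType.get(type);
        return hit == null ? Collections.<Integer>emptyList() : hit;
    }
}
```

With 1000 relationships of one type and 2 of another, the first request for
the rare type still costs 1002 reads in this model, which matches what I saw
in my test cases; only repeated requests become cheap.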

Cheers,

Pablo


> >
> > Thanks in advance,
> >
> > Pablo
> >
> > --
> > Pablo Pareja Tobes
> > LinkedIn    http://www.linkedin.com/in/pabloparejatobes
> > Twitter       http://www.twitter.com/pablopareja
> >
> > http://about.me/pablopareja
> > http://www.ohnosequences.com
> > _______________________________________________
> > Neo4j mailing list
> > [email protected]
> > https://lists.neo4j.org/mailman/listinfo/user
> >
>
>
>
> --
> Mattias Persson, [[email protected]]
> Hacker, Neo Technology
> www.neotechnology.com
>



-- 
Pablo Pareja Tobes
LinkedIn    http://www.linkedin.com/in/pabloparejatobes
Twitter       http://www.twitter.com/pablopareja

http://about.me/pablopareja
http://www.ohnosequences.com
