Re: [Neo4j] Batch Insertion, how to index relationships?

José Devezas Wed, 02 Mar 2011 09:36:59 -0800

Hello,

I'd also like to know more about this. I want to keep an edge weight
representing how many times A links to B, instead of using multiple edges
(which is what I'm currently doing). My work is based on the Blueprints
implementation of Neo4j, so would I need to index relationships with, for
instance, an attribute "edgeId" set to something like "nodeA->nodeB"?
Iterating through all the edges of a node until we find the right one is an
alternative, but it doesn't seem good performance-wise.


So, the question is, during a batch insertion where we want to increment an
edge attribute, what would be the best way to do it? Indexing relationships?
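
To make the "edgeId" idea concrete, here is a minimal sketch in plain Java. It
does not use the Neo4j or Blueprints API: the class EdgeWeights and its methods
are hypothetical names, and the in-memory map only stands in for whatever
relationship index would be used in practice. It assumes the set of distinct
A->B pairs fits in memory during the batch run, so that each edge could be
written once with its final weight instead of being looked up and updated
repeatedly.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-in for the lookup; no Neo4j or Blueprints classes are used.
final class EdgeWeights {
    // Maps a composite key such as "1->2" to the number of times A links to B.
    private final Map<String, Integer> weights = new HashMap<>();

    // Builds the lookup key from the two endpoint ids (the "edgeId" idea).
    static String edgeId(long from, long to) {
        return from + "->" + to;
    }

    // Records one more link from 'from' to 'to' and returns the updated weight.
    int increment(long from, long to) {
        return weights.merge(edgeId(from, to), 1, Integer::sum);
    }

    // Returns the current weight, or 0 if the pair has not been seen yet.
    int weight(long from, long to) {
        return weights.getOrDefault(edgeId(from, to), 0);
    }

    // Number of distinct directed edges recorded so far.
    int distinctEdges() {
        return weights.size();
    }
}
```

Calling increment(a, b) for every observed link and then creating one
relationship per map entry, with the count stored as its weight property, would
avoid both the multiple edges and the per-node edge scan. The key is
directional, so "1->2" and "2->1" are counted separately.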

On Wed, Mar 2, 2011 at 10:35 AM, Pablo Pareja <[email protected]> wrote:

> On Wed, Mar 2, 2011 at 11:00 AM, Mattias Persson
> <[email protected]> wrote:
>
> > 2011/3/2 Pablo Pareja <[email protected]>:
> > > Hi!
> > >
> > > I just checked the wiki looking for information on how to index
> > > relationships in batch insertion mode but
> > > didn't find anything so far.
> > > This can be found in the wiki regarding relationship indexing:
> > >
> > > RelationshipIndex friendships = graphDb.index().forRelationships( "friendships" );
> > > // "type" isn't a reserved key and isn't indexed automatically
> > > Relationship relationship = friendships.get( "type", "knows", morpheus, trinity ).getSingle();
> > >
> > > However I cannot find any code snippet for adding relationships to the
> > > index, not just querying it.
> > > How can these two different cases (*batch insertion and standard mode*)
> > > be carried out?
> >
> > It's just like adding to a node index. A RelationshipIndex extends
> > Index<Relationship> so adding to a RelationshipIndex is:
> >
> >  RelationshipIndex relIndex = ...
> >  relIndex.add( myRelationship, "key", "value" );
> >
> > I'll add it to the wiki as well.
> >
>
> Thanks ;)
>
> Just one more question about this, what if you want to index a relationship
> not by any property but only by
> the nodes involved?
> I've seen that once the relationship has been indexed, you can query the
> index with start/end node parameters; but
> how can I index it in the case where I only care about these nodes and not
> any key-value pair?
>
> >
> > >
> > > Besides that, I was wondering how node relationships retrieval is
> > > implemented.
> > > In my domain model I have some nodes which have hundreds of thousands of
> > > relationships, including different types of them.
> > >
> > > Suppose you are already on one of these nodes and you want to get
> > > only one specific type of incoming relationship. Would retrieval
> > > time depend on how many relationships there are in total, including
> > > other types? Or, once you specify the relationship type in the
> > > node.getRelationships(...) method, does the number of relationships
> > > of other types no longer affect retrieval time?
> >
> > I'd really like to answer: "it doesn't matter". But currently it
> > does... there are two phases here: one where the relationships aren't
> > cached in memory yet (or have fallen out of memory to make room for
> > loading other relationships). Then loading relationships, whether a
> > specific type or all, is linear in the number of relationships on
> > that node.
> >
> > The other is when the node is fully cached with all its relationships
> > and you get relationships of a specific type(s). Then it won't matter
> > how many relationships there are other than for the given type(s).
> > Direction for those relationships is filtered when you get them, but I
> > think soon the cache layer will also optimize on direction so that not
> > even filtering will be required when getting relationships of a
> > specific type and direction.
> >
> > There are some thoughts about also making the storing/loading of
> > relationships type-and-direction-aware so that the number of
> > relationships (outside of your given type/direction) really won't
> > matter. Will probably be a while before such a thing even has a chance
> > to make it in though.
> >
> >
> Ok, that's actually what I concluded after running some test cases; I was
> still hoping it wasn't so and that I was simply doing something wrong.
> This means that, at least in my case, I'll have to index a fairly large
> number of relationships to achieve decent retrieval times for highly
> connected nodes. Any advice or suggestions for tuning performance in
> these cases, apart from relationship indexing?
>
> Cheers,
>
> Pablo
>
>
> > >
> > > Thanks in advance,
> > >
> > > Pablo
> > >
> > > --
> > > Pablo Pareja Tobes
> > > LinkedIn    http://www.linkedin.com/in/pabloparejatobes
> > > Twitter       http://www.twitter.com/pablopareja
> > >
> > > http://about.me/pablopareja
> > > http://www.ohnosequences.com
> > > _______________________________________________
> > > Neo4j mailing list
> > > [email protected]
> > > https://lists.neo4j.org/mailman/listinfo/user
> > >
> >
> >
> >
> > --
> > Mattias Persson, [[email protected]]
> > Hacker, Neo Technology
> > www.neotechnology.com
> >



-- 
José Luís Devezas
Labs SAPO/UP
http://labs.sapo.pt/up
