Re: [Neo4j] Re: Re: neo4j read with getNodeById && index.with...

Mattias Persson Wed, 23 Mar 2011 03:54:20 -0700

2011/3/23 孤竹 <[email protected]>

> Sorry for my bad English :)
>
>
> I mean that in my application there will be 10~20 million nodes, and
> between one node and another there will be 10~20 thousand relationships.
> What should the index be created for? In other words, up to what size
> can an index grow without causing file I/O, and which factors cause
> this problem?
>

So you have thousands of relationships for every node? If you want to get
specific relationships between two nodes based on one or more properties,
then you could index relationships in a relationship index, but if you're
just looping through them and aggregating data, or using them to find a
shortest path or whatever, then you wouldn't need such an index. I think you
need to supply even more information for others to answer as correctly as
possible.
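
To make that distinction a bit more concrete, here is a rough plain-Java
model of the two access patterns. It is only a sketch and not the Neo4j API:
RelationshipLookupSketch, Rel, add, get and scan are made-up names, and a
HashMap stands in for the relationship index. In Neo4j itself the index
would come from GraphDatabaseService#index().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model only; none of these names are Neo4j API.
final class RelationshipLookupSketch {
    // A relationship between two node ids carrying one indexed property value.
    static final class Rel {
        final long start;
        final long end;
        final String value;

        Rel(long start, long end, String value) {
            this.start = start;
            this.end = end;
            this.value = value;
        }
    }

    // Exact-match index keyed by (start node, end node, property value).
    private final Map<String, List<Rel>> index = new HashMap<>();

    void add(Rel rel) {
        index.computeIfAbsent(key(rel.start, rel.end, rel.value), k -> new ArrayList<>()).add(rel);
    }

    // Indexed lookup: a single hash probe instead of a walk over all relationships.
    List<Rel> get(long start, long end, String value) {
        return index.getOrDefault(key(start, end, value), Collections.emptyList());
    }

    // No index: walk every relationship of the start node and filter.
    static List<Rel> scan(List<Rel> relsOfStart, long end, String value) {
        List<Rel> hits = new ArrayList<>();
        for (Rel rel : relsOfStart) {
            if (rel.end == end && rel.value.equals(value)) {
                hits.add(rel);
            }
        }
        return hits;
    }

    private static String key(long start, long end, String value) {
        return start + "|" + end + "|" + value;
    }

    public static void main(String[] args) {
        RelationshipLookupSketch sketch = new RelationshipLookupSketch();
        List<Rel> relsOfNodeOne = new ArrayList<>();
        for (int i = 0; i < 20_000; i++) {
            Rel rel = new Rel(1, 2 + (i % 5), "v" + (i % 100));
            relsOfNodeOne.add(rel);
            sketch.add(rel);
        }
        System.out.println("indexed hits = " + sketch.get(1, 2, "v0").size());
        System.out.println("scanned hits = " + scan(relsOfNodeOne, 2, "v0").size());
    }
}
```

If all you ever do is the loop in scan (aggregate, traverse, find paths),
the index part buys you nothing and only costs extra writes and disk space.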


Also, how often do you perform node/relationship lookups? Are you merely
getting one node and traversing from there, or are you doing something else?

And yes, indexes are of course stored on disk, so bigger indexes will result
in I/O operations for sure. I can't give you any kind of numbers on that...
I think you'll have to experiment. Start with the simplest solution and
if that becomes a problem, evolve it, maybe adding an index for particular
areas if that helps.
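
As a starting point for that kind of experiment, something like the timing
helper below might do. It is a minimal sketch in plain Java and does not
touch Neo4j at all: LookupTimer and timeLookups are hypothetical names, and
the array and HashMap are just stand-ins for "lookup by id" and "lookup via
an index". In a real test you would pass db.getNodeById(...) or your index
lookup as the lambda, and run against data of realistic size.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

// Hypothetical helper for comparing lookup strategies; not part of Neo4j.
final class LookupTimer {
    // Calls lookup for ids 0..count-1 and returns the elapsed time in milliseconds.
    static <T> long timeLookups(int count, LongFunction<T> lookup) {
        long start = System.nanoTime();
        for (long id = 0; id < count; id++) {
            if (lookup.apply(id) == null) {
                throw new IllegalStateException("no entry for id " + id);
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int count = 10_000;
        String[] byId = new String[count];            // stand-in for lookup by id
        Map<String, String> byKey = new HashMap<>();  // stand-in for an index lookup
        for (int i = 0; i < count; i++) {
            byId[i] = "node-" + i;
            byKey.put(String.valueOf(i), byId[i]);
        }

        // Run each strategy once before measuring so JIT warm-up skews the numbers less.
        timeLookups(count, id -> byId[(int) id]);
        timeLookups(count, id -> byKey.get(String.valueOf(id)));

        System.out.println("by id  = " + timeLookups(count, id -> byId[(int) id]) + " ms");
        System.out.println("by key = " + timeLookups(count, id -> byKey.get(String.valueOf(id))) + " ms");
    }
}
```

Measure with cold and warm caches separately; the first run after startup
mostly tells you about disk I/O, later runs about the caches.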


> Did I make it clearer? Maybe I should improve my English.
>
>
>
>
> ------------------ Original Message ------------------
> From: "Tobias Ivarsson"<[email protected]>;
> Sent: Wednesday, March 23, 2011, 5:58 PM
> To: "Neo4j user discussions"<[email protected]>;
>
> Subject: Re: [Neo4j] Re: neo4j read with getNodeById && index.with...
>
>
> Why not just keep the reference to the actual Node object then? Neo4j's
> internal cache and memory management make this really cheap to do. [1]
>
> As to your questions about the proper size of an index and maximum memory,
> you will have to be clearer; I really don't understand what you are asking
> for.
>
> -tobias
>
> [1] a Node object always occupies exactly 32B in a 64bit JVM, and I believe
> that would translate to 20B on a 32bit JVM. The actual state of the Node is
> stored internally in Neo4j and managed through the Neo4j cache.
>
> 2011/3/23 孤竹 <[email protected]>
>
> > Yes, I know I can't assign the IDs. But I can get some nodes' IDs and
> > cache them somewhere (i.e. in a cache).
> >
> >
> > When I need to search for something, I get the ID from the cache. Is
> > that better?
> >
> >
> > Lastly, I wonder what the proper size of the index is. Does it depend on
> > the memory I give to Neo4j?
> >
> >
> > Is there a suggested ratio? For example, if the max memory for Neo4j is
> > 1G, is it better to keep the index under 500M?
> >
> >
> > ------------------ Original Message ------------------
> > From: "mattias"<[email protected]>;
> > Sent: Wednesday, March 23, 2011, 4:24 PM
> > To: "Neo4j user discussions"<[email protected]>;
> >
> > Subject: Re: [Neo4j] neo4j read with getNodeById && index.with...
> >
> >
> > And also, to circumvent this difference in performance for not-very-big
> > indexes you can make use of the built-in caching (for the new API)... see
> >
> >
> > http://docs.neo4j.org/chunked/snapshot/indexing-lucene-extras.html#indexing-lucene-caching
> >
> > 2011/3/23 Tobias Ivarsson <[email protected]>
> >
> > > It is correct that getNodeById is much faster than an index lookup, but
> > > IDs are assigned by Neo4j; there is no way for you as a user to assign
> > > IDs, which makes it a very blunt tool for looking up entities. To know
> > > which id corresponds to a particular name or similar attribute, you
> > > would have to go through an index, and that is exactly what the Neo4j
> > > index API does for you.
> > >
> > > By the way, the LuceneIndexService is a deprecated API, use
> > > GraphDatabaseService#index() instead:
> > >
> > >
> > > http://components.neo4j.org/neo4j/1.3.M04/apidocs/org/neo4j/graphdb/GraphDatabaseService.html#index()
> > >
> > > Cheers,
> > > Tobias
> > >
> > > On Wed, Mar 23, 2011 at 3:45 AM, 孤竹 <[email protected]> wrote:
> > >
> > > > Hi all,
> > > >
> > > >       I found something interesting. The test is as follows:
> > > >
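> > > > import org.neo4j.graphdb.GraphDatabaseService;
> > > > import org.neo4j.graphdb.Node;
> > > > import org.neo4j.index.IndexService;
> > > > import org.neo4j.index.lucene.LuceneIndexService;
> > > > import org.neo4j.kernel.EmbeddedGraphDatabase;
> > > >
> > > > public class LookupTest { // class name assumed; the post showed only the methods
> > > >     private static IndexService indexService;
> > > >
> > > >     public static void main(String[] args) {
> > > >         GraphDatabaseService db = new EmbeddedGraphDatabase("testDB");
> > > >         indexService = new LuceneIndexService(db);
> > > >         long startTime = System.currentTimeMillis();
> > > >         for (int i = 0; i < 10000; i++) {
> > > >             Node node = getNodeByIndexProperty(db, String.valueOf(i));
> > > >         }
> > > >         long endTime = System.currentTimeMillis();
> > > >         System.out.println("time = " + (endTime - startTime));
> > > >         indexService.shutdown();
> > > >         db.shutdown();
> > > >     }
> > > >
> > > >     // Swap the two lookups to compare them; KEY_NAME was not shown in the post.
> > > >     public static Node getNodeByIndexProperty(GraphDatabaseService db, String keyValue) {
> > > >         // Node node = indexService.getSingleNode(KEY_NAME, keyValue);
> > > >         Node node = db.getNodeById(Long.parseLong(keyValue));
> > > >         return node;
> > > >     }
> > > > }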
> > > >
> > > > When I use the method indexService.getSingleNode...... 9999 times, it
> > > > takes nearly 2 seconds, but when I use db.getNodeById.... the same
> > > > number of times, it takes just 350 milliseconds. Does that mean
> > > > getById is faster than using an index? Or does it mean the natural
> > > > index is better than an index, if I can use it?
> > > >
> > > > Thanks for your help and replies, they help me very much!
> > > > _______________________________________________
> > > > Neo4j mailing list
> > > > [email protected]
> > > > https://lists.neo4j.org/mailman/listinfo/user
> > > >
> > >
> > >
> > >
> > > --
> > > Tobias Ivarsson <[email protected]>
> > > Hacker, Neo Technology
> > > www.neotechnology.com
> > > Cellphone: +46 706 534857
> > >
> >
> >
> >
> > --
> > Mattias Persson, [[email protected]]
> > Hacker, Neo Technology
> > www.neotechnology.com
> >
>
>
>
> --
> Tobias Ivarsson <[email protected]>
> Hacker, Neo Technology
> www.neotechnology.com
> Cellphone: +46 706 534857
>



-- 
Mattias Persson, [[email protected]]
Hacker, Neo Technology
www.neotechnology.com
