Or simply index (only) supernodes?

Jim

On 22 Mar 2011, at 06:35, Michael Hunger wrote:

> Hi,
> You can try to create intermediary nodes that aggregate certain kinds of
> relationships, i.e. create an abstraction on top of them. This is also used
> for write-heavy scenarios, e.g. activity streams with supernodes that are
> connected to millions of other nodes: you introduce a second layer of nodes
> below the supernode, sharded on the properties of either the relationships
> or the target nodes. This lowers the write load on the supernode. And if
> your sharding key takes domain read considerations into account, reads
> benefit too: you can go from your initial node to the right subnode and
> operate only on the relationships from there.
> 
> Hope that helps
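
A minimal sketch of the intermediary-node idea, written as a toy in-memory model rather than against the Neo4j API. The class name, the method names and the choice of a plain string as the sharding key (a day, a country, a relationship property) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model (not the Neo4j API): a supernode whose targets hang off per-key subnodes. */
final class ShardedSupernode {
    // One entry per intermediary node: sharding key -> ids of the targets attached below it.
    private final Map<String, List<String>> subnodes = new HashMap<>();

    /** A write only touches the subnode for its key, never the supernode's full fan-out. */
    void connect(String shardKey, String targetId) {
        subnodes.computeIfAbsent(shardKey, k -> new ArrayList<>()).add(targetId);
    }

    /** A read that knows the key visits one subnode and only its relationships. */
    List<String> targetsFor(String shardKey) {
        return subnodes.getOrDefault(shardKey, Collections.emptyList());
    }

    /** Number of intermediary nodes created so far. */
    int subnodeCount() {
        return subnodes.size();
    }
}
```

For an activity stream sharded by day, `connect("2011-03-22", "event-1")` would attach the event below that day's subnode, and `targetsFor("2011-03-22")` would read only that day's events.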
> On 22 Mar 2011, at 02:54, 孤竹 wrote:
> 
>> OK, thanks for your help! It helped me a lot!
>> 
>> 
>> I have another question. In my application there are lots of nodes and
>> relationships (maybe a million nodes and tens of thousands of
>> relationships). I have a way to model the data with fewer relationships,
>> but with proportionally more nodes. Would that be faster or better for my
>> searches? I think it would be faster, because the nodes have an index.
>> Please give me some advice :)
>> 
>> ------------------ Original Message ------------------
>> From: "Tobias Ivarsson"<[email protected]>;
>> Sent: Saturday, 19 March 2011, 5:59 PM
>> To: "Neo4j user discussions"<[email protected]>;
>> 
>> Subject: Re: [Neo4j] Fans of Neo4j From Chinese
>> 
>> 
>> Neo4j serializes commits. I.e. at most one thread is committing a
>> transaction at once.
>> For the actual work of building up the data to be committed, Neo4j supports
>> multiple concurrent threads.
>> 
>> This fact alone, that there is a single congestion point, means that if an
>> application, like in your case, is very write centric, it is unlikely for it
>> to scale beyond two threads, with one building up the next commit while the
>> other is committing its data. It might scale to a few more threads than that
>> if the buildup time is significantly larger than the commit time. It is
>> simple time slicing, "only one train can be at the station at once", then
>> you have to do the maths on how many "trains can be out on the track" during
>> that time.
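
The train analogy can be turned into rough arithmetic. This is only a sketch under a simplifying assumption: every transaction spends a fixed buildup time outside the commit and a fixed commit time inside it, and nothing else contends. The class and method names are made up for illustration and are not part of Neo4j.

```java
/** Back-of-the-envelope model of a single serialized commit slot. */
final class CommitPipeline {
    private CommitPipeline() {}

    /**
     * Threads needed to keep the commit slot busy all the time:
     * ceil((buildup + commit) / commit). More threads than this only queue up.
     */
    static long usefulThreads(long buildupMillis, long commitMillis) {
        if (buildupMillis < 0 || commitMillis <= 0) {
            throw new IllegalArgumentException("buildup must be >= 0 and commit must be > 0");
        }
        long cycle = buildupMillis + commitMillis;
        // Integer ceiling division of cycle by commitMillis.
        return (cycle + commitMillis - 1) / commitMillis;
    }
}
```

With equal buildup and commit times, `usefulThreads(10, 10)` gives 2, matching the "one builds while the other commits" case. With buildup three times the commit time, `usefulThreads(30, 10)` gives 4.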
>> 
>> It is also worth keeping in mind, that for CPU bound operation, an
>> application doesn't scale much further than the number of CPUs in the
>> computer. The threads that are not in commit mode - i.e. the ones that are
>> building up the data for their next commit - are CPU bound, and contending
>> for the same CPU resources. This means that your application is not going to
>> scale much further than the number of CPUs in your computer, and few
>> desktop/laptop computers have more than 4 CPUs these days, which makes 5
>> threads about the most you can squeeze out of it, anything more than that is
>> just going to add contention, and possibly even slow things down.
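
One way to apply this is to size the writer pool from the CPU count instead of picking 10, 20 or 30 threads. The sketch below uses only the standard `java.util.concurrent` classes; the "CPUs plus one" rule is taken from the 4 CPUs, 5 threads estimate above, and the class name is hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Sizes a writer pool as "one thread per CPU, plus one that can sit in commit". */
final class WriterPool {
    private WriterPool() {}

    /** Thread count for a machine with the given number of CPUs (treated as at least 1). */
    static int writerThreads(int cpus) {
        return Math.max(1, cpus) + 1;
    }

    /** Fixed-size pool sized for the machine this JVM is running on. */
    static ExecutorService create() {
        int cpus = Runtime.getRuntime().availableProcessors();
        return Executors.newFixedThreadPool(writerThreads(cpus));
    }
}
```

The right number for a real workload still has to be measured; this only gives a starting point that avoids piling on threads that can do nothing but contend.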
>> 
>> Finally, the (CPU bound) threads that create the graph might be contending
>> on the same resources, as Peter said. If multiple threads modify the same
>> node or relationship, i.e. if they create relationships to the same node
>> (the root node for example), they are all going to block on that resource.
>> Neo4j only allows one transaction to modify each entity at a time. This
>> means that to get maximum concurrency out of your data creation, each thread
>> should be creating its own disconnected subgraph. And if they have
>> connected parts, the connections to the "global data" should be made last in
>> the transaction (in a predictable order to avoid deadlocks[1]), to maximize
>> the time the thread is operational before hitting the
>> "congestion point" that is the (potentially) contended data.
>> 
>> Cheers,
>> Tobias
>> 
>> [1] Neo4j will detect if a deadlock has occurred and throw a
>> DeadlockDetectedException in that case.
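
The "predictable order" advice can be illustrated with plain Java locks. This is a toy model, not how Neo4j locks entities internally: it assumes each shared entity is identified by a numeric id and has one lock, and it always acquires those locks in ascending id order so that two threads can never wait on each other in a cycle. All names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Toy model: one lock per shared entity id, always taken in ascending id order. */
final class OrderedLocks {
    private final Map<Long, ReentrantLock> locks = new ConcurrentHashMap<>();

    /** The order in which ids will be locked: ascending, duplicates removed. */
    static List<Long> lockOrder(List<Long> entityIds) {
        return new ArrayList<>(new TreeSet<>(entityIds));
    }

    /** Locks the shared entities in a fixed order, runs the work, unlocks in reverse. */
    void withLocks(List<Long> entityIds, Runnable work) {
        List<ReentrantLock> held = new ArrayList<>();
        try {
            for (Long id : lockOrder(entityIds)) {
                ReentrantLock lock = locks.computeIfAbsent(id, k -> new ReentrantLock());
                lock.lock();
                held.add(lock);
            }
            work.run();
        } finally {
            // Release whatever was acquired, last taken first.
            for (int i = held.size() - 1; i >= 0; i--) {
                held.get(i).unlock();
            }
        }
    }
}
```

A thread would do all of its private subgraph work first and call `withLocks` only for the final step that touches shared nodes such as the root, which keeps the time spent holding contended locks short.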
>> 
>> 2011/3/18 孤竹 <[email protected]>
>> 
>>> Hi,
>>> 
>>> 
>>> Sorry to disturb you. I am a Chinese engineer; please excuse my bad
>>> English :) .
>>> 
>>> 
>>> Recently I have been learning Neo4j and trying to use it in my project. But
>>> when I put load on Neo4j with 5, 10, 20 and 30 threads, I found that the
>>> number of nodes inserted into Neo4j does not change noticeably (sometimes
>>> not at all). Does the number of threads not matter? Does the kernel
>>> serialize the writes? Are there any documents about the performance of
>>> Neo4j? Thanks for your help.
>>> 
>>> 
>>> 
>>> The program is as follows.
>>> I submit this function to an ExecutorService with 5/10/30 threads, then
>>> measure how many nodes are inserted in the same amount of time. (The counts
>>> do not change noticeably.)
>>> 
>>> 
>>> Transaction tx = null;
>>>      Node before = null;
>>>      try {
>>>          for (int i = 0; i < 1000000; i++) {
>>>              if (stop) {
>>>                  return;
>>>              }
>>>              if (graphDb == null) {
>>>                  return;
>>>              }
>>>              try {
>>>                  if (tx == null) {
>>>                      tx = graphDb.beginTx();
>>>                  }
>>>                  // Increment the reference count by 1
>>>                  writeCount.addAndGet(1);
>>>                  int startNodeString = name.addAndGet(1);
>>>                  Node start = getOrCreateNodeWithOutIndex("" + startNodeString);
>>>                  if (before == null) {
>>>                      // Root node. Hahaha, I got U
>>>                      Node root = graphDb.getNodeById(0);
>>>                      root.createRelationshipTo(start, LEAD);
>>>                  }
>>>                  if (before != null) {
>>>                      before.createRelationshipTo(start, LOVES);
>>>                  }
>>>                  int endNodeName = name.addAndGet(1);
>>>                  Node end = getOrCreateNodeWithOutIndex("" + endNodeName);
>>>                  start.createRelationshipTo(end, KNOWS);
>>>                  before = end;
>>>                  // Commit once every 100 iterations
>>>                  if (i % 100 == 0) {
>>>                      tx.success();
>>>                      tx.finish();
>>>                      tx = null;
>>>                  }
>>>              } catch (Exception e) {
>>>                  System.out.println("write : = " + e);
>>>              }
>>>          }
>>>      } catch (Exception e) {
>>>      } finally {
>>>          // tx is null right after a commit, so guard against a
>>>          // NullPointerException here
>>>          if (tx != null) {
>>>              tx.finish();
>>>          }
>>>      }
>>>      }
>>> 
>> -- 
>> Tobias Ivarsson <[email protected]>
>> Hacker, Neo Technology
>> www.neotechnology.com
>> Cellphone: +46 706 534857
>> _______________________________________________
>> Neo4j mailing list
>> [email protected]
>> https://lists.neo4j.org/mailman/listinfo/user
