Re: [Neo4j] Fans of Neo4j From Chinese

2011-03-22 Thread Tobias Ivarsson
Rick, I don't quite understand what you are asking for. Could you elaborate
further?

-tobias

2011/3/22 Rick Bullotta rick.bullo...@thingworx.com

 I'd like to explore this question a bit further.  Does this mean that
 basically there's no way to scale beyond a single thread/CPU for
 disconnected graphs if you have complex graph dependencies (e.g. you cannot
 create disjoint subgraphs)?




Re: [Neo4j] Fans of Neo4j From Chinese

2011-03-21 Thread Rick Bullotta
I'd like to explore this question a bit further.  Does this mean that basically 
there's no way to scale beyond a single thread/CPU for disconnected graphs if 
you have complex graph dependencies (e.g. you cannot create disjoint subgraphs)?




Re: [Neo4j] Fans of Neo4j From Chinese

2011-03-19 Thread Peter Neubauer
Hi there,
Without having too much insight into the concurrency internals: Neo4j
locks at the node level. That means that if two transactions in different
threads try to modify the same nodes or relationships, an
exception will be thrown or the transactions will be queued.

Also, I am not sure how the Lucene index behaves with multiple
threads, since we wrap it to provide transactional
support, which is not built into Lucene itself.

Mattias, do you have any details on that?

/peter

 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user



Re: [Neo4j] Fans of Neo4j From Chinese

2011-03-19 Thread Tobias Ivarsson
Neo4j serializes commits, i.e. at most one thread is committing a
transaction at any one time.
For the actual work of building up the data to be committed, Neo4j supports
multiple concurrent threads.

This fact alone, that there is a single congestion point, means that if an
application, like yours, is very write-centric, it is unlikely
to scale beyond two threads, with one building up the next commit while the
other is committing its data. It might scale to a few more threads than that
if the buildup time is significantly larger than the commit time. It is
simple time slicing: only one train can be at the station at once, so
you have to do the maths on how many trains can be out on the track during
that time.
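
The train analogy can be turned into a rough estimate. The following is a
minimal sketch, not Neo4j code: it assumes every transaction has a fixed
build time and a fixed commit time, and that commits are fully serialized.
Under those assumptions the commit step is saturated once
threads * commit >= build + commit, so roughly 1 + build/commit threads can
be kept busy. The class and method names are hypothetical.

```java
class CommitPipelineEstimate {

    /** Smallest thread count that keeps the single commit slot busy all the time. */
    static int usefulThreads(double buildMillis, double commitMillis) {
        if (buildMillis < 0 || commitMillis <= 0) {
            throw new IllegalArgumentException("build time must be >= 0 and commit time > 0");
        }
        // Each thread needs (build + commit) per transaction, but only one thread may
        // commit at a time, so the slot is saturated once n * commit >= build + commit.
        return 1 + (int) Math.ceil(buildMillis / commitMillis);
    }

    /** Transactions per millisecond for a given thread count under the same model. */
    static double throughput(int threads, double buildMillis, double commitMillis) {
        double unconstrained = threads / (buildMillis + commitMillis);
        double commitLimit = 1.0 / commitMillis;
        return Math.min(unconstrained, commitLimit);
    }

    public static void main(String[] args) {
        double build = 10.0;
        double commit = 10.0;
        System.out.println("useful threads: " + usefulThreads(build, commit));
        for (int n : new int[] {1, 2, 5, 30}) {
            System.out.println(n + " threads -> " + throughput(n, build, commit) + " tx/ms");
        }
    }
}
```

With equal build and commit times the estimate is two threads, and in this
model adding threads beyond that point leaves the throughput unchanged. Real
build and commit times vary, so treat the result as an order of magnitude.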

It is also worth keeping in mind that, for CPU-bound operations, an
application doesn't scale much further than the number of CPUs in the
computer. The threads that are not in commit mode, i.e. the ones that are
building up the data for their next commit, are CPU bound and contend
for the same CPU resources. This means that your application is not going to
scale much further than the number of CPUs in your computer. Few
desktop/laptop computers have more than 4 CPUs these days, which makes 5
threads about the most you can squeeze out of it; anything more than that is
just going to add contention, and possibly even slow things down.
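
A minimal sketch of sizing a writer pool along these lines, using only the
JDK. The "CPUs + 1" rule is read off the "4 CPUs, 5 threads" remark above;
WriterPool, writerThreads and runBatches are hypothetical names, and each
task merely stands in for building one batch of nodes (no Neo4j calls are
made).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class WriterPool {

    /** One thread per CPU for building data, plus one that may be waiting in commit. */
    static int writerThreads(int cpus) {
        if (cpus < 1) {
            throw new IllegalArgumentException("cpus must be >= 1");
        }
        return cpus + 1;
    }

    /** Runs the batches on a fixed-size pool and returns the number of nodes "built". */
    static int runBatches(int batches, int nodesPerBatch, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (int b = 0; b < batches; b++) {
                // Placeholder for the CPU-bound work of preparing one transaction.
                Callable<Integer> buildBatch = () -> {
                    int built = 0;
                    for (int i = 0; i < nodesPerBatch; i++) {
                        built++;
                    }
                    return built;
                };
                pending.add(pool.submit(buildBatch));
            }
            int total = 0;
            for (Future<Integer> result : pending) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int threads = writerThreads(Runtime.getRuntime().availableProcessors());
        System.out.println("writer threads: " + threads);
        System.out.println("nodes built: " + runBatches(30, 100, threads));
    }
}
```

The pool size comes from Runtime.availableProcessors() rather than a fixed
5/10/30, so the same program adapts to the machine it runs on.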

Finally, the (CPU-bound) threads that create the graph might be contending
for the same resources. As Peter said, if multiple threads modify the same
node or relationship, e.g. if they create relationships to the same node
(the root node for example), they are all going to block on that resource.
Neo4j only allows one transaction to modify each entity at a time. This
means that to get maximum concurrency out of your data creation, each thread
should be creating its own disconnected subgraph. And if they have
connected parts, the connections to the global data should be made last in
the transaction (in a predictable order to avoid deadlocks[1]), to maximize
the time the thread is operational before hitting the
congestion point that is the (potentially) contended data.

Cheers,
Tobias

[1] Neo4j will detect if a deadlock has occurred and throw a
DeadlockDetectedException in that case.
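
The advice to connect to shared data last, in a predictable order, can be
sketched with plain JDK locks. This is a model only, under stated
assumptions: a ReentrantLock per shared node stands in for Neo4j's
per-entity write lock, an int counter stands in for the relationships
created on that node, and OrderedConnect with its methods is a hypothetical
name. No Neo4j API is used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class OrderedConnect {
    // Stand-ins for shared nodes: one lock and one relationship counter per node id.
    private final ReentrantLock[] locks;
    private final int[] incoming;

    OrderedConnect(int sharedNodes) {
        locks = new ReentrantLock[sharedNodes];
        incoming = new int[sharedNodes];
        for (int i = 0; i < sharedNodes; i++) {
            locks[i] = new ReentrantLock();
        }
    }

    /** Every writer locks shared nodes in ascending id order, whatever order it was given. */
    static List<Integer> lockOrder(Collection<Integer> sharedIds) {
        return new ArrayList<>(new TreeSet<>(sharedIds));
    }

    /** One simulated transaction: private work first, shared connections last. */
    void writeTransaction(int privateNodes, Collection<Integer> sharedIds) {
        // 1. Build the private subgraph; nothing here is visible to other threads.
        List<Integer> subgraph = new ArrayList<>();
        for (int i = 0; i < privateNodes; i++) {
            subgraph.add(i);
        }
        // 2. Touch the shared nodes as late as possible, in a predictable order.
        List<Integer> order = lockOrder(sharedIds);
        int held = 0;
        try {
            for (int id : order) {
                locks[id].lock();
                held++;
            }
            for (int id : order) {
                incoming[id]++; // models creating a relationship to a shared node
            }
        } finally {
            for (int i = held - 1; i >= 0; i--) {
                locks[order.get(i)].unlock();
            }
        }
    }

    /** Runs the writers (needs at least 3 shared nodes) and returns the count per node. */
    int[] run(int threads, int txPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            // Half of the writers name the shared nodes in reverse, which could
            // deadlock if the locks were taken in the order given.
            final List<Integer> wanted =
                    (t % 2 == 0) ? Arrays.asList(0, 1, 2) : Arrays.asList(2, 1, 0);
            pool.submit(() -> {
                for (int i = 0; i < txPerThread; i++) {
                    writeTransaction(10, wanted);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writers did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return incoming.clone();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(new OrderedConnect(3).run(4, 1000)));
    }
}
```

Sorting the ids before locking is what makes the order predictable: two
writers that both need nodes 0 and 2 always request 0 first, so neither can
hold one lock while waiting for the other in the opposite order.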


[Neo4j] Fans of Neo4j From Chinese

2011-03-18 Thread 孤竹
Hi,


   Sorry to disturb you. I am a Chinese engineer; please excuse my bad English
:).


   Recently I have been learning Neo4j and trying to use it in my project. But
when I put load on Neo4j with 5, 10, 20 and 30 threads, I found that the number
of nodes inserted does not change noticeably (sometimes it does not change at
all). Does the number of threads not matter? Does the kernel serialize the
writes? Is there any documentation about the performance of Neo4j? Thanks for
your help.


   The program is as follows.
   I submit this function to an ExecutorService with 5/10/30 threads, then
measure how many nodes are inserted in the same period of time. (The counts do
not change noticeably.)


    Transaction tx = null;
    Node before = null;
    try {
        for (int i = 0; i < 100; i++) {
            if (stop == true) {
                return;
            }
            if (graphDb == null) {
                return;
            }
            try {
                if (tx == null) {
                    tx = graphDb.beginTx();
                }
                // increment the reference count by 1
                writeCount.addAndGet(1);
                int startNodeString = name.addAndGet(1);
                Node start = getOrCreateNodeWithOutIndex("" + startNodeString);
                if (before == null) {
                    // root node. Ha ha ha, I got U
                    Node root = graphDb.getNodeById(0);
                    root.createRelationshipTo(start, LEAD);
                }
                if (before != null) {
                    before.createRelationshipTo(start, LOVES);
                }
                int endNodeName = name.addAndGet(1);
                Node end = getOrCreateNodeWithOutIndex("" + endNodeName);
                start.createRelationshipTo(end, KNOWS);
                before = end;
                // commit once every thousand iterations
                if (i % 100 == 0) {
                    tx.success();
                    tx.finish();
                    tx = null;
                }
            } catch (Exception e) {
                System.out.println("write : = " + e);
            }
        }
    } catch (Exception e) {
    } finally {
        tx.finish();
    }
}