Re: [Neo4j] Batch find

2011-08-08 Thread Michael Hunger
I don't see why there should be any delay.

If you just try this, it should be able to add several thousand nodes per
second to the graph.

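import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Transaction;
import org.neo4j.graphdb.index.Index;
import org.neo4j.kernel.EmbeddedGraphDatabase;

GraphDatabaseService graphdb = new EmbeddedGraphDatabase("words.db");
Index<Node> index = graphdb.index().forNodes("words");

// One transaction per document. "documents" and Document.words() are
// placeholders for your own input types.
for (Document doc : documents) {
    Transaction tx = graphdb.beginTx();
    try {
        for (String word : doc.words()) {
            // Look the word up in the index; create the node if it is new.
            Node node = index.get("word", word).getSingle();
            if (node == null) {
                node = graphdb.createNode();
                node.setProperty("word", word);
                node.setProperty("count", 1);
                index.add(node, "word", word);
            } else {
                // Word already in the graph: increment its counter.
                node.setProperty("count", (Integer) node.getProperty("count") + 1);
            }
        }
        tx.success();
    } finally {
        tx.finish();
    }
}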

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Batch find

2011-08-08 Thread ahmed.elsharkasy
Also, what is still causing the delay?

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Batch-find-tp3221634p3235850.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.


Re: [Neo4j] Batch find

2011-08-08 Thread ahmed.elsharkasy
Yes, this is my initial load of the database.

Yes, I know I may be mixing the two and that this is not right. But how can I
get the same functionality using the batch operations? Can I drop the
transaction and do the inserts/updates with the batch inserter?



Re: [Neo4j] Batch find

2011-08-08 Thread Michael Hunger
Ahmed,

Is this your initial load of the graph database?

It looks like you're mixing batch insertion and the normal transactional API in
a single program.

Please use just one of them per program.

I'd really suggest just going with the transactional API and inserting/updating
one or more documents per transaction.

What are you using "reference" for? Is it set to the "created" or "result"
node (id)?



Re: [Neo4j] Batch find

2011-08-08 Thread ahmed.elsharkasy
Transaction tx = graphDb.beginTx();
try {
    for (all words in a document) {
        // search for the word

        if (result == null) {
            long created = inserter.createNode(properties);
            wordsIndex.add(created, properties);

            Map properties2 = MapUtil.map("value", reference);

            // create relation

            reference = created;
        } else {
            // update with the new properties
            inserter.setNodeProperties(result, new_properties);

            // create relation

            reference = result;
        }
    }
} finally {
    tx.finish();
    index.shutdown();
    inserter.shutdown();
    graphDb.shutdown();
}



Re: [Neo4j] Batch find

2011-08-08 Thread Michael Hunger
Ahmed,

Could you please share some code?

The batch inserter should really only be used to insert millions or billions of
nodes.

With the normal API you can insert/update about 10k nodes/rels per transaction 
without any issues.
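
The per-transaction sizing above can be sketched as follows. This is only a hedged illustration in plain Java: `TxBatches` and `chunk` are hypothetical names, the batch size of 10,000 is simply the figure quoted above, and the Neo4j calls appear only in comments so that the sketch runs without the library.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits pending operations into transaction-sized batches. */
class TxBatches {

    /** Returns consecutive slices of items, each holding at most batchSize elements. */
    static <T> List<List<T>> chunk(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>();
        for (int i = 0; i < 25000; i++) {
            words.add("word" + i);
        }
        for (List<String> batch : chunk(words, 10000)) {
            // With the embedded API, one transaction would wrap each batch here:
            //   Transaction tx = graphdb.beginTx();
            //   try { /* create or update one node per word */ tx.success(); }
            //   finally { tx.finish(); }
            System.out.println("would commit " + batch.size() + " operations");
        }
    }
}
```

Running `main` reports three batches of 10000, 10000 and 5000 operations. The right batch size depends on heap and data shape, so the number is worth measuring rather than taking as fixed.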

You should be able to insert/update several thousand nodes per second into your
graph and index.

Anything slower than that is not OK.

The REST API is at least one order of magnitude slower than the normal Java API
(probably even two orders of magnitude, depending on the types of operations).

Michael



Re: [Neo4j] Batch find

2011-08-08 Thread ahmed.elsharkasy
Yes, I am inserting 3 nodes. For each one I use the batch index to check
whether the word is already in the graph, and then either update the existing
node or create a new one. This took 1 second, which is too slow for me.

Another problem besides the time is that I have to open a transaction, and
inside the transaction I use a batch inserter. That means I open a graph
database service to begin the transaction, close it, and then start my batch
operations, which I think is not good either. Shutting down the service and
starting the batch inserter also eats up a good number of milliseconds.

Each node carries only 1 string property besides the id.

Do you think this time is reasonable? With the REST API I used to insert more
than 50 nodes in less than a second.

By the way, when I increased the number of inserted nodes to 500, the time
was still 1 second. The problem is that I want to bring this 1 second down to
maybe half a second or so.



Re: [Neo4j] Batch find

2011-08-08 Thread Michael Hunger
Yes: executing a number of Java API calls in a single transaction is just what
the REST API does.

The batching there happens at the protocol level, i.e. you need only one
network operation (and one serializer/deserializer call) for the whole set of
operations. None of those concerns are relevant in the Java API.

The question is: do you run into performance issues?

Michael



Re: [Neo4j] Batch find

2011-08-08 Thread ahmed.elsharkasy
I got your point. The reason I asked is that I have already done similar
operations through the REST API, i.e. finding groups of words with one
request and doing more than one operation in one call.

Isn't there a Java equivalent to a request like this one in the REST API:
http://docs.neo4j.org/chunked/1.4.M04/rest-api-batch-ops.html



Re: [Neo4j] Batch find

2011-08-08 Thread Michael Hunger
How many words does your text document contain? Probably not millions or
billions?
Then using the batch-inserter API for that is not sensible.

Otherwise (unless you're really experiencing performance issues) I would
stay with iterating over the words (of your word set). You might use the
Lucene query syntax that I mentioned before to construct a query that looks
for nodes with your words. That will give you the nodes already in the graph;
you'd have to keep track of which words of your set have already been dealt
with and create the others afterwards.
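
A minimal sketch of that idea, under stated assumptions: `WordQuery`, `quote`, `buildQuery` and `missing` are hypothetical helper names, the index key "word" is taken from the example code elsewhere in this thread, and the index call itself is shown only in a comment (it assumes the index's `query(...)` method accepts a Lucene query string), so the sketch runs without Neo4j on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: one Lucene query for a whole word set, plus bookkeeping. */
class WordQuery {

    /** Quotes a term as a Lucene phrase, escaping backslashes and double quotes. */
    static String quote(String term) {
        return "\"" + term.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds a query such as word:("graph" OR "node") for the given distinct words. */
    static String buildQuery(String key, Set<String> words) {
        if (words.isEmpty()) {
            throw new IllegalArgumentException("need at least one word");
        }
        List<String> quoted = new ArrayList<>();
        for (String word : words) {
            quoted.add(quote(word));
        }
        return key + ":(" + String.join(" OR ", quoted) + ")";
    }

    /** Words that the query did not return, i.e. the ones still to be created. */
    static Set<String> missing(Set<String> wanted, Set<String> found) {
        Set<String> rest = new LinkedHashSet<>(wanted);
        rest.removeAll(found);
        return rest;
    }

    public static void main(String[] args) {
        Set<String> words = new LinkedHashSet<>(Arrays.asList("graph", "node", "graph", "edge"));
        String query = buildQuery("word", words);
        System.out.println(query);

        // Assumed Neo4j usage (not executed here):
        //   for (Node n : index.query(query)) found.add((String) n.getProperty("word"));
        Set<String> found = new LinkedHashSet<>(Arrays.asList("node")); // pretend result
        System.out.println(missing(words, found));
    }
}
```

`main` prints `word:("graph" OR "node" OR "edge")` followed by `[graph, edge]`. For very large word sets the query would probably need to be split into several smaller ones, since Lucene limits the number of boolean clauses per query (1024 by default).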




Re: [Neo4j] Batch find

2011-08-08 Thread ahmed.elsharkasy
Still, how can I look up all the words of a document in one shot, so that I
can determine which nodes should be inserted by batch and which should be
updated?



Re: [Neo4j] Batch find

2011-08-03 Thread Rick Bullotta
You'll probably want to use an index for this: either a Lucene index or an
in-graph index. I would recommend a Lucene index, since you can also leverage
Lucene's (and even Solr's) analyzers and parsers to process your document.
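
To make the "process your document" step concrete, here is a rough sketch of what an analyzer does to the text before lookup. It is a plain-Java stand-in, not the Lucene analyzer API: `SimpleTokenizer` and `tokenize` are hypothetical names, and the rule used here (lower-case, split on anything that is not a letter or digit) is an assumption chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for an analyzer: turns raw text into lower-case word tokens. */
class SimpleTokenizer {

    /** Lower-cases the text and splits it on runs of non-letter, non-digit characters. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            // split() can yield an empty leading piece; skip it.
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Graphs, graphs and MORE graphs!"));
    }
}
```

`main` prints `[graphs, graphs, and, more, graphs]`. A real analyzer would typically add steps such as stop-word removal or stemming, which this sketch leaves out.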



Re: [Neo4j] Batch find

2011-08-03 Thread Niels Hoogeveen

Correction to my previous message: that should read "without having to do any lookups".



Re: [Neo4j] Batch find

2011-08-03 Thread Niels Hoogeveen

The batch insert is intended to push data into the database with having to do 
any look ups.
You could preprocess your input data so that it can be loaded in one go. You
could, for example, read your input file against an existing database, fetch
the IDs of the nodes and relationships that contain the information you need
to update, and create two new input files: one containing the data that can be
inserted using the batch inserter, and one containing the information that
needs to be updated (including the IDs of the PropertyContainers that need to
be updated).
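
The preprocessing step described above might look roughly like this. It is a sketch under assumptions: `LoadPlan` and `split` are hypothetical names, the two collections stand in for the two output files, and the map of existing words to node IDs is assumed to have been fetched from the existing database beforehand.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical result of preprocessing: what to batch-insert and what to update. */
class LoadPlan {
    /** New words, destined for the batch inserter. */
    final Set<String> toInsert = new LinkedHashSet<>();
    /** Words already stored, mapped to the ID of the node to update. */
    final Map<String, Long> toUpdate = new LinkedHashMap<>();

    /** Splits the input words by whether an ID is already known for them. */
    static LoadPlan split(List<String> inputWords, Map<String, Long> existingIds) {
        LoadPlan plan = new LoadPlan();
        for (String word : inputWords) {
            Long id = existingIds.get(word);
            if (id != null) {
                plan.toUpdate.put(word, id);
            } else {
                plan.toInsert.add(word); // the set ignores repeated words
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        Map<String, Long> existing = new LinkedHashMap<>();
        existing.put("graph", 1L);
        existing.put("node", 7L);

        LoadPlan plan = split(Arrays.asList("graph", "edge", "node", "edge", "path"), existing);
        System.out.println("insert: " + plan.toInsert);
        System.out.println("update: " + plan.toUpdate);
    }
}
```

`main` prints `insert: [edge, path]` and `update: {graph=1, node=7}`. Writing each collection to its own file would give the two inputs described above.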
Niels




Re: [Neo4j] Batch find

2011-08-03 Thread ahmed.elsharkasy
I am trying to insert a document containing a list of words, and I want to
check whether some of these words are already in my graph. If they are, I
will update their properties; otherwise I will create new nodes for the new
words.



Re: [Neo4j] Batch find

2011-08-03 Thread Peter Neubauer
Ahmed,
are you trying to find a text or name, or a node? I am not sure what
you mean. Do you have some example code so we can understand your
problem better?

Cheers,

/peter neubauer


[Neo4j] Batch find

2011-08-03 Thread ahmed.elsharkasy
How can I batch-find a whole document in Neo4j, instead of looping through the
document's words and searching for them one by one?
I am using Java.
