Hello,

Last week, I decided to download your graph database core in order to use
it. First, I created a new project to parse my CSV files and create a new
graph database with Neo4j. These CSV files contain 150 million edges and 20
million nodes.

When I finished writing the code that creates the graph database, I
executed it and, after six hours, the program crashed because of a Lucene
exception. The exception is related to index merging and has the following
message:
"mergeFields produced an invalid result: docCount is 385282378 but fdx file
size is 3082259028; now aborting this merge to prevent index corruption"

I searched the web and found that this is a Lucene bug. The libraries I was
using to run my project were:
neo-1.0-b10
index-util-0.7
lucene-core-2.4.0

So, I decided to use a newer Lucene version. I saw that you have a newer
index-util version, so I updated the libraries:
neo-1.0-b10
index-util-0.9
lucene-core-2.9.1

After updating those libraries, I ran my project again and found that, on
many occasions, it was not indexing properly. So I tried optimizing the
index after every insertion. That fixed the indexing, but the execution
time increased a lot.

I am not using transactions; instead, I am using the Batch Inserter
together with the LuceneIndexBatchInserter.
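For reference, my insertion loop looks roughly like the sketch below
(class and package names are my best understanding of the neo-1.0-b10 and
index-util APIs and may differ slightly; the optimize() call is the
workaround I described):

```java
// Sketch of the batch-insertion approach described above. Package and
// class names assume the neo-1.0-b10 / index-util APIs; they are an
// approximation, not copied from my actual project.
import java.util.HashMap;
import java.util.Map;
import org.neo4j.kernel.impl.batchinsert.BatchInserter;
import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;
import org.neo4j.index.lucene.LuceneIndexBatchInserter;
import org.neo4j.index.lucene.LuceneIndexBatchInserterImpl;

public class CsvImport {
    public static void main(String[] args) {
        BatchInserter inserter = new BatchInserterImpl("target/graph.db");
        LuceneIndexBatchInserter indexService =
                new LuceneIndexBatchInserterImpl(inserter);
        try {
            // For each parsed CSV row (one shown here as an example):
            Map<String, Object> props = new HashMap<String, Object>();
            props.put("name", "example");
            long node = inserter.createNode(props);
            indexService.index(node, "name", "example");

            // Workaround: with index-util-0.9 the index is only correct
            // if I optimize after every insertion, which is very slow.
            indexService.optimize();
        } finally {
            indexService.shutdown();
            inserter.shutdown();
        }
    }
}
```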

So, my question is: what can I do to solve this problem? If I use
index-util-0.7, the graph database creation never finishes, and if I use
index-util-0.9, I have to optimize the index on every insertion and the
execution never ends either.

Thank you very much in advance,

Núria.
_______________________________________________
Neo mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user