Re: [Neo4j] How to create a graph database out of a huge dataset?

Michael Hunger Mon, 18 Jul 2011 11:28:21 -0700

Stephan,

since you pull the zip file from Wikipedia, can you point me to where you get 
it from?
That would be simpler than writing an artificial data generator.


Thanks 

Michael
> 
> What did you mean by a typical distribution? So far I don't know the number
> of articles per author.

Just thought that you had a kind of distribution :)
> 
> The runtime of that import was 30hours. For that import I used the following
> hardware:
> Intel Core i5-460M (2,53 GHz)
> 4GB DDR3 RAM
> ATI Mobility Radeon HD5650

Ok, what parameters do you give your JVM (heap space?)

> I didn't understand what you mean by that small converter, because the
> CSV-file has one entry per line and I think that every line is correct so
> that I don't need to filter any line.
> That split function in my code is just to check if it creates an array with
> 3 entries.

The time needed for that adds up as well :) That's what I meant: converting 
the original CSV once into a file that is cheaper to parse pays off on every 
import.
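
A minimal sketch of such a one-off converter, under some assumptions: the 
class name CsvPreprocessor is made up, the fields are assumed to be separated 
by commas, and the third field is assumed to be the timestamp. It writes a 
binary file, so each import can read the values back with DataInputStream 
instead of splitting and parsing Strings again.

```java
import java.io.BufferedReader;
import java.io.DataOutputStream;
import java.io.IOException;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

public class CsvPreprocessor {
    // Hypothetical helper: reads 3-field lines and writes them in binary form.
    // Assumption: comma-separated fields, ISO-like UTC timestamp in the third field.
    static void convert(BufferedReader in, DataOutputStream out)
            throws IOException, ParseException {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
        format.setTimeZone(TimeZone.getTimeZone("UTC")); // trailing 'Z' means UTC
        String line;
        while ((line = in.readLine()) != null) {
            String[] fields = line.split(",");
            if (fields.length != 3) {
                continue; // skip malformed lines once, here, not on every import
            }
            out.writeUTF(fields[0]);
            out.writeUTF(fields[1]);
            out.writeLong(format.parse(fields[2]).getTime()); // timestamp as millis
        }
        out.flush();
    }
}
```

The importer then reads readUTF(), readUTF(), readLong() in a loop until the 
end of the file.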

> You have read my code correctly.
Ok, great.

> Sorry, I didn't get that. What did you mean by split them by some sharding
> key. Could you give me an example for that part. That would be nice.

You're creating millions of relationships to a single node. That might have 
performance implications later, and possibly already now during the import.
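
One way to read "sharding key", as a rough sketch: put a fixed number of 
intermediate bucket nodes between the single node and the others, and pick 
the bucket from some key of the node being attached. The class ShardKey and 
the choice of a hash of a String key are assumptions for illustration only, 
and no Neo4j calls are shown.

```java
public class ShardKey {
    // Hypothetical helper: maps a key (e.g. an article title) to a bucket index
    // in the range [0, shards). The same key always lands in the same bucket.
    static int shardFor(String key, int shards) {
        // floorMod keeps the result non-negative even for negative hash codes
        return Math.floorMod(key.hashCode(), shards);
    }
}
```

With, say, 1000 buckets, each bucket node would hold roughly a thousandth of 
the relationships instead of one node holding all of them.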

For the timestamp String I suggest using Java's SimpleDateFormat to parse it 
into a long value and store that instead, that should decrease your store 
file tremendously.

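java.text.SimpleDateFormat format = new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
format.setTimeZone(java.util.TimeZone.getTimeZone("UTC")); // the trailing 'Z' means UTC
long millis = format.parse("2002-08-27T03:07:09Z").getTime();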

Cheers

Michael

> As you can see at top, the timestamp String looks like this:
> 2002-08-27T03:07:09Z
> What is the best method to save that String in the database?
> Right now I'm just using the timestamp to save it as a property for the
> relationships.
> Maybe I will have to use the timestamp in some traversals, but I'm not sure
> yet.
> 
> Cheers
> Stephan
> 
> 
> --
> View this message in context: 
> http://neo4j-community-discussions.438527.n3.nabble.com/How-to-create-a-graph-database-out-of-a-huge-dataset-tp3177076p3179878.html
> Sent from the Neo4J Community Discussions mailing list archive at Nabble.com.
> _______________________________________________
> Neo4j mailing list
> [email protected]
> https://lists.neo4j.org/mailman/listinfo/user

