Hello Michael,

I got the compressed dump from here: http://download.wikimedia.org/enwiki/20110526/enwiki-20110526-stub-meta-history.xml.gz
The unpacked file is an XML file, from which I extracted the important information and saved it in my CSV file: the article name, the timestamp and the author.
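In case it helps to see what I do there, this is a simplified sketch of the extraction step (the element names <title>, <timestamp>, <username> and <ip> come from the MediaWiki export format; the semicolon-separated layout and the class name are just my choice for the sketch):

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class StubHistoryToCsv extends DefaultHandler {

    private final BufferedWriter out;
    private final StringBuilder text = new StringBuilder();
    private String title;
    private String timestamp;
    private String author;

    public StubHistoryToCsv(BufferedWriter out) {
        this.out = out;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        text.setLength(0); // collect the character data of the current element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if ("title".equals(qName)) {
            title = text.toString();
        } else if ("timestamp".equals(qName)) {
            timestamp = text.toString();
        } else if ("username".equals(qName) || "ip".equals(qName)) {
            author = text.toString(); // anonymous edits only carry an <ip>
        } else if ("revision".equals(qName)) {
            try {
                // one CSV line per revision: article;timestamp;author
                out.write(title + ";" + timestamp + ";" + author);
                out.newLine();
            } catch (IOException e) {
                throw new SAXException(e);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0] = unpacked stub-meta-history XML, args[1] = CSV file to write
        BufferedWriter out = new BufferedWriter(new FileWriter(args[1]));
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new File(args[0]), new StubHistoryToCsv(out));
        out.close();
    }
}

So the CSV ends up with one row per revision, i.e. one row per edit of an article.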
> Ok, what parameters do you give your JVM (heap space?)

Actually, I haven't changed anything so far; I just use the standard options from Eclipse and Java. Should I change the heap space, and if so, what are the best settings and how do I set them?

> The time needed for that adds up as well :) That's what I meant, so converting the original csv to a file
> that costs less time to parse (for every import) pays off.

What kind of file would be faster to parse, or should I change something in the CSV file I created?

> You're creating millions of relationships to a single node. That might have performance implications for
> later and might have also for now for the import.

Every article is connected to the reference node, which accounts for about 20M relationships. The authors are only connected to the articles they have written or edited, so apart from the reference node I don't see where I create millions of relationships to a single node. I can drop the relationships to the reference node if they are the problem. (I've put a simplified sketch of the import loop in the P.S. below.)

If the author nodes are also a problem, how do I use this sharding key or the index for them too? To be honest, I don't know what a "sharding key" means. :-) An example would be nice.

Cheers
Stephan
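P.S. In case the structure of the import matters for your answer, here is a simplified sketch of roughly what my import loop does. The BatchInserterImpl usage follows the batch insertion examples in the Neo4j 1.x manual; the CSV layout article;timestamp;author, the relationship type names ARTICLE and EDITED, the store path and the assumption that the reference node has id 0 in a fresh store are just for this sketch.

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.HashMap;
import java.util.Map;

import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.graphdb.RelationshipType;
import org.neo4j.kernel.impl.batchinsert.BatchInserter;
import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

public class WikiImport {

    // relationship type names are my own choice, nothing prescribed by Neo4j
    private static final RelationshipType ARTICLE = DynamicRelationshipType.withName("ARTICLE");
    private static final RelationshipType EDITED  = DynamicRelationshipType.withName("EDITED");

    public static void main(String[] args) throws Exception {
        BatchInserter inserter = new BatchInserterImpl("target/wiki-db");

        long referenceNode = 0; // reference node of a fresh store (node id 0)

        // remember node ids of articles/authors already created,
        // so repeated CSV lines do not create duplicate nodes
        Map<String, Long> articles = new HashMap<String, Long>();
        Map<String, Long> authors  = new HashMap<String, Long>();

        BufferedReader in = new BufferedReader(new FileReader(args[0]));
        String line;
        while ((line = in.readLine()) != null) {
            String[] cols = line.split(";");   // article;timestamp;author
            String articleName = cols[0];
            String timestamp   = cols[1];
            String authorName  = cols[2];

            Long article = articles.get(articleName);
            if (article == null) {
                Map<String, Object> props = new HashMap<String, Object>();
                props.put("name", articleName);
                article = inserter.createNode(props);
                articles.put(articleName, article);
                // this is the part that creates ~20M relationships to one node
                inserter.createRelationship(referenceNode, article, ARTICLE, null);
            }

            Long author = authors.get(authorName);
            if (author == null) {
                Map<String, Object> props = new HashMap<String, Object>();
                props.put("name", authorName);
                author = inserter.createNode(props);
                authors.put(authorName, author);
            }

            Map<String, Object> relProps = new HashMap<String, Object>();
            relProps.put("timestamp", timestamp);
            inserter.createRelationship(author, article, EDITED, relProps);
        }
        in.close();
        inserter.shutdown();
    }
}

The two HashMaps grow with the number of distinct articles and authors, so I suspect that is also where most of my heap goes during the import.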

