Hi Rich,

If you are a bit familiar with Java, you can also use the batch-inserter API yourself to implement the things you need.
This also applies to other JVM languages, like JRuby, Jython, JavaScript, Scala, Clojure, Groovy, etc.

On 12.05.2014 at 09:49, Rich Morin <[email protected]> wrote:

> I need to import ~300 million RDF triples from YAGO2s, a
> mechanically-generated ontology. The Batch Importer (preferably the
> 2.0 version) is an obvious candidate for this task, if I can figure
> out some pesky usage details. Help?
>
> -r
>
> Background
>
> If a triple defines a relation between subject and object URIs, I
> can express it as a Neo4j relationship.
>
> However, many triples define values (eg, hasLatitude) for entities.
> I'd like to express these as node properties, but the Batch Importer
> uses TSV syntax, which has a fixed set of properties per node.

Yep, good insight: you don't want to store those value triples as relationships.

> Questions
>
> Q: If I define properties in the TSV header, but leave the data
>    fields empty, what will the Batch Importer do? For example:
>
>      name     works_on  works_in
>      Michael  neo4j     Java
>      Richard            Ruby
>      Xavier

Yes, it skips empty cells.

>    Would this create the following nodes?
>
>      Michael:
>        works_on: neo4j
>        works_in: Java
>      Richard:
>        works_in: Ruby
>      Xavier:
>
> Q: If I have already used the Batch Importer to define nodes and
>    relationships, can I use it again to simply add properties?
>
>      name     speaks
>      Michael  German
>      Richard  English

Unfortunately not; it is really meant for inserts. In theory it would be possible, but I'm not sure about the performance overhead.

>    Given that the nodes file no longer has ID numbers, how do I
>    tell the Batch Importer which entities to modify?

If it worked, you could state the properties to look up from an index and then use those to find and update the nodes. But index read performance is much slower than batch-inserter write performance.

What I usually do is programmatically read all nodes of the graph and store the relevant lookup property (e.g. url) together with the node id in a Map or sorted array. Then you can quickly find the node id for a given lookup value and update the node by id.

HTH,
Michael

> --
> http://www.cfcl.com/rdm           Rich Morin          [email protected]
> http://www.cfcl.com/rdm/resume    San Bruno, CA, USA  +1 650-873-7841
>
> Software system design, development, and documentation
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
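
P.S. In case it helps, here is a rough sketch of the lookup-map approach described above: one pass over all nodes to build a map from the lookup property to the node id, then a second pass over the new rows to set properties by id. Everything named here (NodeStore, InMemoryStore, buildLookup, addProperty) is hypothetical and made up for illustration; NodeStore is only a stand-in, not the Neo4j batch-inserter API, so you would need to map its three methods onto whatever your Neo4j version provides.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class LookupMapSketch {

    // Hypothetical stand-in for the node store; this is NOT the Neo4j API.
    interface NodeStore {
        Iterable<Long> nodeIds();
        Object getProperty(long nodeId, String key);
        void setProperty(long nodeId, String key, Object value);
    }

    // In-memory implementation so the sketch runs without Neo4j.
    static class InMemoryStore implements NodeStore {
        private final Map<Long, Map<String, Object>> nodes = new LinkedHashMap<>();
        private long nextId = 0;

        long createNode(String key, Object value) {
            long id = nextId++;
            Map<String, Object> props = new HashMap<>();
            props.put(key, value);
            nodes.put(id, props);
            return id;
        }

        public Iterable<Long> nodeIds() {
            return nodes.keySet();
        }

        public Object getProperty(long nodeId, String key) {
            return nodes.get(nodeId).get(key);
        }

        public void setProperty(long nodeId, String key, Object value) {
            nodes.get(nodeId).put(key, value);
        }
    }

    // Pass 1: read every node once and remember lookup value -> node id.
    static Map<String, Long> buildLookup(NodeStore store, String lookupKey) {
        Map<String, Long> idByKey = new HashMap<>();
        for (long id : store.nodeIds()) {
            Object value = store.getProperty(id, lookupKey);
            if (value != null) {
                idByKey.put(value.toString(), id);
            }
        }
        return idByKey;
    }

    // Pass 2: each row is {lookup value, new property value}.
    // Rows with an empty cell or an unknown lookup value are skipped.
    // Returns the number of nodes that were updated.
    static int addProperty(NodeStore store, Map<String, Long> idByKey,
                           String property, String[][] rows) {
        int updated = 0;
        for (String[] row : rows) {
            if (row.length < 2 || row[1].isEmpty()) {
                continue; // empty cell: nothing to set
            }
            Long id = idByKey.get(row[0]);
            if (id == null) {
                continue; // no node with this lookup value
            }
            store.setProperty(id, property, row[1]);
            updated++;
        }
        return updated;
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        store.createNode("name", "Michael");
        store.createNode("name", "Richard");

        Map<String, Long> idByName = buildLookup(store, "name");
        String[][] rows = { {"Michael", "German"}, {"Richard", "English"} };
        int updated = addProperty(store, idByName, "speaks", rows);
        System.out.println("updated nodes: " + updated);
    }
}
```

With ~300 million triples the map itself becomes the memory concern, which is why a sorted array of keys with a parallel array of ids can be the better choice over a HashMap.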
