Hi Tom,
What I did was basically dump the nodes and rels from the 2.1 database into
CSV files using the shell tools and then import them into 2.0 using the
batch-importer (it would also have worked with the shell tools, but I was too
lazy :).
It could be that I missed some.
I think you'd be faster importing your CSV data in 2.0 with the shell tools
(use a batch-size of 1k for the relationships) or with the batch-importer.
With the shell tools you would do much the same as with LOAD CSV (adapt the
statements to the shape of your CSV files):
import-cypher -i nodes.csv create (n:#{label}) set n.jurt_id = {jurt_id}
import-cypher -i rels.csv match (n),(m) where id(n) = {start} and id(m) = {end} create (n)-[:#{label}]->(m)
Michael
Here is what I did:
#1 Built the shell tools for 2.1 ->
s3://dist.neo4j.org/jexp/shell/neo4j-shell-tools-2.1.zip
#2 Unzipped the zip in the lib directory of the 2.1 server
#3 Started the shell:
bin/neo4j-shell -path ~/Downloads/tom/graph21.db
#4 Ran the following 2 export commands, which took perhaps 20s in total (for a
real export you'd probably export all node properties one by one):
import-cypher -o nodes.csv match(n) return id(n) as `:id`,labels(n)[0] as `:label`, n.jurt_id as jurt_id
Query: match(n) return id(n) as `:id`,labels(n)[0] as `:label`, n.jurt_id as
jurt_id infile (none) delim ',' quoted false outfile nodes.csv batch-size 1000
Import statement execution created 28184 rows of output.
import-cypher -o rels.csv match(n)-[r]->(m) return id(n) as `s:id`,id(m) as `e:id`,type(r) as `:label`
Query: match(n)-[r]->(m) return id(n) as `s:id`,id(m) as `e:id`,type(r) as
`:label` infile (none) delim ',' quoted false outfile rels.csv batch-size 1000
Import statement execution created 1276254 rows of output.
Then I imported it with the batch-importer
import.sh tom.db nodes.csv rels.csv
using a properties file with
batch_import.csv.delim=,
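Since some nodes or rels may have been missed, a quick consistency check on the two export files can help before importing. This is only a minimal Python sketch and not part of the shell tools or the batch-importer. It assumes the comma-delimited headers shown above (`:id` in the node file, `s:id` and `e:id` in the relationship file) and that the relationships ended up in a separate rels.csv; the function name `check_export` is made up for illustration.

```python
import csv


def check_export(nodes_path, rels_path, delim=","):
    """Return (distinct node ids, relationship rows, dangling relationships)."""
    # Collect every node id from the ':id' column of the node file.
    with open(nodes_path, newline="") as f:
        node_ids = {row[":id"] for row in csv.DictReader(f, delimiter=delim)}

    rel_count = 0
    dangling = []  # (start, end) pairs that reference an id missing from the node file
    with open(rels_path, newline="") as f:
        for row in csv.DictReader(f, delimiter=delim):
            rel_count += 1
            if row["s:id"] not in node_ids or row["e:id"] not in node_ids:
                dangling.append((row["s:id"], row["e:id"]))
    return len(node_ids), rel_count, dangling
```

Calling `check_export("nodes.csv", "rels.csv")` returns the two counts, which can be compared with the row counts reported by the export, plus any relationships whose start or end id does not appear in the node file.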
On 10.03.2014 at 10:08, Tom Zeppenfeldt <[email protected]> wrote:
> Hi Michael,
>
> Thanks for your reply. Basically you suggest not to overspecify the queries,
> by leaving out the labels or identifiers when not necessary. And I learned
> my lesson with regard to using snapshots :)
>
> BTW: Assuming that you are using the db that I shared with you, converted
> to a 2.0.1 version, I appreciate the increased speed, but does the
> conversion also explain why the returned counts are different?
>
> If you have converted the db, could you share the datastore (the 2.0.1 one)
> back with me?
>
> Thanks a lot !
>
> Best, Tom
>
> On Monday, 10 March 2014 07:11:00 UTC+1, Michael Hunger wrote:
> Hi Tom,
>
> With 2.0.1 the query time went down to 1.6 seconds.
> It still has to pull through and aggregate about 500,000 rels, but it should
> actually be faster at doing this.
>
> match (j1:jurt)-[:HAS_TERM]->(t)<-[:HAS_TERM]-(j2)
> where j1.jurt_id = 'J70000' AND j2 <> j1
> RETURN j2,count(*) as commonterms
> order by commonterms desc
> limit 3;
>
> +---------------------------------------------+
> | j2                            | commonterms |
> +---------------------------------------------+
> | Node[19946]{jurt_id:"J72191"} | 68          |
> | Node[20977]{jurt_id:"J73483"} | 67          |
> | Node[21658]{jurt_id:"J74261"} | 64          |
> +---------------------------------------------+
> 3 rows
> 1614 ms
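To make explicit what the query above counts: for every term t attached to j1, each other jurt that also has t contributes one row, and the rows are then grouped per j2. Below is a small in-memory Python sketch of the same aggregation. The toy data structure and the helper name `common_terms` are made up for illustration, and the sketch assumes at most one HAS_TERM relationship per jurt/term pair; Neo4j itself of course traverses the graph rather than building dictionaries like this.

```python
from collections import Counter


def common_terms(has_term, j1, limit=3):
    """Return up to `limit` (jurt, shared-term count) pairs for j1, highest first.

    has_term maps a jurt id to the set of term ids it is connected to,
    standing in for the (jurt)-[:HAS_TERM]->(term) relationships.
    """
    # Invert the mapping: term -> jurts that have it (the <-[:HAS_TERM]- hop).
    jurts_by_term = {}
    for jurt, terms in has_term.items():
        for term in terms:
            jurts_by_term.setdefault(term, set()).add(jurt)

    counts = Counter()
    for term in has_term.get(j1, ()):
        for j2 in jurts_by_term[term]:
            if j2 != j1:         # WHERE j2 <> j1
                counts[j2] += 1  # count(*) grouped by j2
    # ORDER BY commonterms DESC LIMIT <limit>
    return counts.most_common(limit)
```

With toy data such as `{"J1": {"a", "b", "c"}, "J2": {"a", "b", "c"}, "J3": {"a", "b"}}`, calling `common_terms(data, "J1")` ranks J2 ahead of J3, since J2 shares three terms with J1 and J3 only two.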
>
> Cheers,
>
> Michael
>
> ----
> (michael)-[:SUPPORTS]->(YOU)-[:USE]->(Neo4j)
>
> On 09.03.2014 at 20:00, Michael Hunger <[email protected]> wrote:
>
>> Ouch
>>
>> Share via dropbox
>>
>> You can share the 2.1 store with me or the loadcsv script with your csv files
>>
>> Thanks for all the great feedback btw
>>
>> Can you send me your postal address and t-shirt size?
>>
>> Thx
>>
>> Sent from mobile device
>>
>> On 09.03.2014 at 19:08, Tom Zeppenfeldt <[email protected]> wrote:
>>
>>> Ok Michael,
>>>
>>> - Just a question that may sound stupid: what's the best way to share
>>> things privately over here? I'm not seeing any clear option to do so.
>>> - I'll try to set up a server with 2.0.1 and try to use the
>>> shell-import-tools. FYI: uploading the 1.2M rels using LOAD CSV took over
>>> 24 hrs... I hope your shell-import-tools work faster.
>>>
>>> Best,
>>>
>>> Tom
>>>
>>>
>>> 2014-03-09 16:27 GMT+01:00 Michael Hunger <[email protected]>:
>>> Could you send me the profile output from the shell? It's easier to read on
>>> mobile. Also, please share the db with me privately.
>>>
>>> Can you also try the query in 2.0.1?
>>>
>>> You can import the data using my shell-import-tools
>>>
>>> Or just generate textual cypher statements from load-csv
>>>
>>> Sent from mobile device
>>>
>>> On 09.03.2014 at 16:11, Tom Zeppenfeldt <[email protected]> wrote:
>>>
>>>> query is executed as follows, in which I spot:
>>>>
>>>> "_rows" : 478380,
>>>> "_db_hits" : 956760,
>>>>
>>>> which is actually higher (= worse ??) than the original ..
>>>>
>>>> {
>>>> "columns" : [ "j1.jurt_id", "j2.jurt_id", "commonterms" ],
>>>> "data" : [ [ "J70000", "J72191", 68 ], [ "J70000", "J73483", 67 ], [
>>>> "J70000", "J75683", 66 ] ],
>>>> "plan" : {
>>>> "args" : {
>>>> "returnItemNames" : [ "j1.jurt_id", "j2.jurt_id", "commonterms" ],
>>>> "_rows" : 3,
>>>> "_db_hits" : 0,
>>>> "symKeys" : [ "j1.jurt_id", "j2.jurt_id", " INTERNAL_AGGREGATEb6207bc9-3236-4e8f-ad48-51d2d73e3372" ]
>>>> },
>>>> "dbHits" : 0,
>>>> "name" : "ColumnFilter",
>>>> "children" : [ {
>>>> "args" : {
>>>> "limit" : "Literal(3)",
>>>> "orderBy" : [ "SortItem(Cached( INTERNAL_AGGREGATEb6207bc9-3236-4e8f-ad48-51d2d73e3372 of type Integer),false)" ],
>>>> "_rows" : 3,
>>>> "_db_hits" : 0
>>>> },
>>>> "dbHits" : 0,
>>>> "name" : "Top",
>>>> "children" : [ {
>>>> "args" : {
>>>> "keys" : [ "Cached(j1.jurt_id of type Any)", "Cached(j2.jurt_id of type Any)" ],
>>>> "_rows" : 9992,
>>>> "aggregates" : [ "( INTERNAL_AGGREGATEb6207bc9-3236-4e8f-ad48-51d2d73e3372,Count(t))" ],
>>>> "_db_hits" : 0
>>>> },
>>>> "dbHits" : 0,
>>>> "name" : "EagerAggregation",
>>>> "children" : [ {
>>>> "args" : {
>>>> "_rows" : 478380,
>>>> "_db_hits" : 956760,
>>>> "exprKeys" : [ "j1.jurt_id", "j2.jurt_id" ],
>>>> "symKeys" : [ "j1", "t", " UNNAMED79", "j2", " UNNAMED62" ]
>>>> },
>>>> "dbHits" : 956760,
>>>> "name" : "Extract",
>>>> "children" : [ {
>>>> "args" : {
>>>> "_rows" : 478380,
>>>> "_db_hits" : 0,
>>>> "pred" : "NOT(j2 == j1)"
>>>> },
>>>> "dbHits" : 0,
>>>> "name" : "Filter",
>>>> "children" : [ {
>>>> "args" : {
>>>> "g" : "(j1)-[' UNNAMED62']-(t),(j2)-[' UNNAMED79']-(t)",
>>>> "_rows" : 478380,
>>>> "_db_hits" : 0
>>>> },
>>>> "dbHits" : 0,
>>>> "name" : "SimplePatternMatcher",
>>>> "children" : [ {
>>>> "args" : {
>>>> "identifiers" : [ "j1" ],
>>>> "query" : "{jurtid}",
>>>> "producer" : "SchemaIndex",
>>>> "_rows" : 1,
>>>> "property" : "jurt_id",
>>>> "label" : "jurt",
>>>> "_db_hits" : 0,
>>>> "identifier" : "j1"
>>>> },
>>>> "dbHits" : 0,
>>>> "name" : "SchemaIndex",
>>>> "children" : [ ],
>>>> "rows" : 1
>>>> } ],
>>>> "rows" : 478380
>>>> } ],
>>>> "rows" : 478380
>>>> } ],
>>>> "rows" : 478380
>>>> } ],
>>>> "rows" : 9992
>>>> } ],
>>>> "rows" : 3
>>>> } ],
>>>> "rows" : 3
>>>> }
>>>> }
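The nested profile output above is hard to scan by eye. As a reading aid, here is a minimal Python sketch that flattens such a plan into one line per operator. It relies only on the `name`, `rows`, `dbHits` and `children` keys visible in the JSON above; the function name `summarize_plan` is made up, and the example dictionary is a trimmed copy of three operators from that plan.

```python
def summarize_plan(plan, depth=0):
    """Yield one indented line per operator (name, rows, dbHits), root first."""
    yield "{}{} rows={} dbHits={}".format(
        "  " * depth, plan.get("name"), plan.get("rows"), plan.get("dbHits"))
    for child in plan.get("children", []):
        yield from summarize_plan(child, depth + 1)


# Trimmed copy of three nested operators from the plan above.
example = {
    "name": "Extract", "rows": 478380, "dbHits": 956760,
    "children": [{
        "name": "Filter", "rows": 478380, "dbHits": 0,
        "children": [{"name": "SimplePatternMatcher", "rows": 478380,
                      "dbHits": 0, "children": []}],
    }],
}

for line in summarize_plan(example):
    print(line)
```

For a complete response like the one above, the same function can be applied to the value of the top-level `plan` key after parsing the text with `json.loads`. In this plan all 956760 db hits sit in the Extract step, which works out to two hits for each of its 478380 rows.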
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google Groups
>>>> "Neo4j" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send an
>>>> email to [email protected].
>>>> For more options, visit https://groups.google.com/d/optout.