Re: [Neo4j] Version 2.1.0-M01 CSV Import Index Lookup

Michel Ávila Thu, 27 Mar 2014 07:12:03 -0700

Yes Michael, that definitely did the trick! It works like a charm now.
Can you explain the differences between MERGE and MATCH in this case, so I 
can choose between them deliberately next time?


Thank you again!

On Thursday, 27 March 2014 at 10:11:00 UTC-3, Michael Hunger wrote:
>
> Can you try using MERGE instead of MATCH in your relationship statement? 
> That should definitely use the index.
>
>
> On Wed, Mar 26, 2014 at 10:13 PM, Michel Ávila <[email protected]> wrote:
>
>> I have 3 files containing a set of companies, a set of persons, and the 
>> relationships between these entities, respectively.
>> I managed to load the companies and persons files in no time, but I'm 
>> having performance issues when loading the last one (the relationships).
>> It ran for more than 1 hour before I killed it, because I knew something 
>> was not right.
>> The sample contains:
>>
>>    - ~100k companies;
>>    - ~100k persons;
>>    - ~250k relationships; 
>>
>> I needed to be sure that the file was being read correctly, so I left 
>> only one data row in the "rels" file and ran the following Cypher query:
>>
>> LOAD CSV WITH HEADERS FROM "file:D:\\rels.csv" AS f
>> MATCH (c:company { document: f.company_document })
>> RETURN c
>>
>> The query took about 20 seconds to return the company, so the problem 
>> was not reading the file but finding the company.
>> I then profiled the query, and the result was:
>>
>> ColumnFilter(symKeys=["f", "c"], returnItemNames=["c"], _rows=1, _db_hits=0)
>> Filter(pred="Property(c,document(3)) == Property(f,company_document)",_rows=1, _db_hits=112865)
>>   NodeByLabel(identifier="c", _db_hits=0, _rows=112865, label="company",identifiers=["c"], producer="NodeByLabel")
>>     LoadCSV(_rows=1, _db_hits=0)
>>
>> The way I see it, the loader is reading the entire set of nodes with the 
>> label "company" and only applying the document filter afterwards.
>> When I run the same MATCH outside the LOAD CSV command, the profile is 
>> this:
>>
>> profile MATCH (c:company { document: "76875897000169" } ) RETURN c;
>> SchemaIndex(identifier="c", _db_hits=0, _rows=1, label="company", query="Literal(76875897000169)", identifiers=["c"], property="document",producer="SchemaIndex")
>>
>> Here it is clearly using the schema index on the "company" label's 
>> "document" property, as it was designed to do.
>> So why does LOAD CSV use a different query plan for the same lookup?
>>
>> Thanks in advance!
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Neo4j" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

