Re: [Neo4j] Version 2.1.0-M01 CSV Import Index Lookup

Michael Hunger Thu, 27 Mar 2014 06:11:46 -0700

Can you try using MERGE instead of MATCH in your relationship statement?
That should definitely use the index.
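
As a minimal sketch, the single-row test quoted below could be rerun with
MERGE swapped in for MATCH. It reuses only the label, property, and column
names from the original query. Whether the planner in 2.1.0-M01 then picks
the schema index is something to confirm with the profile output, not a
guarantee. Note that MERGE creates a company node when no match exists, so
this assumes every company_document in the file has already been loaded.

```cypher
// Same lookup as the quoted test query, with MERGE in place of MATCH.
// Assumption: every f.company_document already exists as a :company node;
// otherwise MERGE creates a new node instead of returning nothing.
LOAD CSV WITH HEADERS FROM "file:D:\\rels.csv" AS f
MERGE (c:company { document: f.company_document })
RETURN c
```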



On Wed, Mar 26, 2014 at 10:13 PM, Michel Ávila <
[email protected]> wrote:

> I have 3 files containing a set of companies, a set of persons, and the
> relationships between these entities, respectively.
> I managed to load the companies and persons files in no time, but I'm
> having performance issues when loading the last one (the relationships).
> It ran for more than 1 hour before I killed it, because I knew something
> was not right.
> This sample has the following:
>
>    - ~100k companies;
>    - ~100k persons;
>    - ~250k relationships;
>
> I needed to be sure that the file was being read correctly, so I left only
> one data row in the "rels" file and ran the following Cypher:
>
> LOAD CSV WITH HEADERS FROM "file:D:\\rels.csv" AS f
> MATCH (c:company { document: f.company_document })
> RETURN c
>
> It took about 20 seconds to return the company, so the problem was not
> reading the file but finding the company.
> Then I profiled the query, and the result was:
>
> ColumnFilter(symKeys=["f", "c"], returnItemNames=["c"], _rows=1, _db_hits=0)
> Filter(pred="Property(c,document(3)) == Property(f,company_document)", _rows=1, _db_hits=112865)
>   NodeByLabel(identifier="c", _db_hits=0, _rows=112865, label="company", identifiers=["c"], producer="NodeByLabel")
>     LoadCSV(_rows=1, _db_hits=0)
>
> The way I see it, the loader reads the entire node set under the
> label "company" and applies the document filter afterwards.
> When I run the same MATCH query outside the LOAD CSV command, the
> profile is this:
>
> profile MATCH (c:company { document: "76875897000169" }) RETURN c;
> SchemaIndex(identifier="c", _db_hits=0, _rows=1, label="company", query="Literal(76875897000169)", identifiers=["c"], property="document", producer="SchemaIndex")
>
> It's clear to me that this one queries the index on the "company" label,
> as it was designed to do.
> So why does LOAD CSV use a different query plan for the same lookup?
>
> Thanks in advance!
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>

