Yes Michael, that definitely did the trick! Works like a charm now. Can you explain the differences between MERGE and MATCH, in this case, so i can choose between them consciously next time?
Thank you again! Em quinta-feira, 27 de março de 2014 10h11min00s UTC-3, Michael Hunger escreveu: > > Can you try to use MERGE instead of MATCH in your relationship-statement > that should definitely use the index. > > > On Wed, Mar 26, 2014 at 10:13 PM, Michel Ávila > <[email protected]<javascript:> > > wrote: > >> I have 3 files, containing a set of companies, persons and the >> relationships between these entities, respectively. >> I managed to load the companies and the persons files in no time, but and >> i'm having some performance issues when loading the last one (the >> relationships). >> It took more than 1 hour and i killed it, because i knew something was >> not right. >> This sample has following: >> >> - ~100k companies; >> - ~100k persons; >> - ~250k relationships; >> >> I needed to be sure that the file was being read correctly, so i left >> only one data row in the "rels" file and ran the following cypher: >> >> LOAD CSV WITH HEADERS FROM "file:D:\\rels.csv" AS f MATCH (c:company >> {document >> : f.company_document } ) RETURN c >> >> The result took about 20 seconds to bring me back the company, so it was >> not a problem reading the file, but finding the company. >> Then i asked the prompt to profile the cypher, and the result was: >> >> ColumnFilter(symKeys=["f", "c"], returnItemNames=["c"], _rows=1, _db_hits >> =0) >> Filter(pred="Property(c,document(3)) == Property(f,company_document)",_rows >> =1, _db_hits=112865) >> NodeByLabel(identifier="c", _db_hits=0, _rows=112865, >> label="company",identifiers >> =["c"], producer="NodeByLabel") >> LoadCSV(_rows=1, _db_hits=0) >> >> The way i see it, the loader is reading the entire node set under the >> label "company" and applying the document filter later. >> When i make the same "MATCH" cypher outside the "LOAD" command, the >> profile is this: >> >> profile MATCH (c:company { document: "76875897000169" } ) RETURN c; >> SchemaIndex(identifier="c", _db_hits=0, _rows=1, label="company", query= >> "Literal(76875897000169)", identifiers=["c"], property="document",producer >> ="SchemaIndex") >> >> It's clear to me that it's querying the "company" label index as it was >> designed to do. >> So, why the "LOAD CSV" uses another query plan to do the same lookup? >> >> Thanks in advance! >> >> -- >> You received this message because you are subscribed to the Google Groups >> "Neo4j" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected] <javascript:>. >> For more options, visit https://groups.google.com/d/optout. >> > > -- You received this message because you are subscribed to the Google Groups "Neo4j" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. For more options, visit https://groups.google.com/d/optout.
