I think most of the slowness is when doing count. If I just get a small set
doing limit it works fine.
Also, while trying to test your suggestions, I wan’t able to get USING INDEX to
work, is there anything wrong in the syntax?
neo4j> profile
MATCH (n1:`Label1`) WHERE n1.`prop1` = 'bbb'
USING INDEX n1:Label1(prop1)
MATCH (n:`Label2`)-[:REL1]->(n1) WHERE n.`prop2`='aaa'
USING INDEX n:Label2(prop2)
RETURN COUNT(*) AS `Count`;
Invalid input 'S': expected 'n/N' (line 4, column 2 (offset: 56))
"USING INDEX n1:Label1(prop1)"
^
Also, how can I make sure is it completely memory mapped? I have 16gb on the
environment I’m testing it and and I left the neo4j defaults for memory.
> El 29 dic. 2016, a las 18:12, 'Michael Hunger' via Neo4j
> <[email protected]> escribió:
>
> In 3.1 (and then also 3.2)
> you could also use the compiled runtime (except for the count)
>
> So a query like
>
> cypher runtime=compiledExperimentalFeatureNotSupportedForProductionUse
> profile
> MATCH (n1:`Label1`)<-[r1:REL1]-(n:`Label2`) WHERE n.`prop1` = 'aaa' AND
> n1.`prop2`='bbb' RETURN 1
>
> should use the compiled runtime and run faster than the interpreted runtime.
> In 3.2 it will be again (runtime=compiled) or actually the default in
> enterprise.
>
> Michael
>
>
> On Thu, Dec 29, 2016 at 10:07 PM, Michael Hunger
> <[email protected] <mailto:[email protected]>>
> wrote:
> It depends a bit.
>
> If you can aggressively filter on both sides (doesn't seem so with your > 2M
> results).
>
> Then you can do an index lookup for both sides and do a WHERE on the cross
> product.
>
> The other thing you can try is this (make sure n1 is the label with smaller
> cardinality, which you can determine upfront quickly, by running MATCH
> (:Label)-[:TYPE]->() return count(*)
>
> profile
> MATCH (n:`Label2`) WHERE n.`prop1` = 'aaa' WITH collect(n) as nodes
> MATCH (n1:`Label1`)<-[:REL1]-(n) WHERE n1.`prop2`='bbb' AND n IN nodes
> RETURN COUNT(*) AS `Count`;
>
>
> profile
> MATCH (n1:`Label1`) WHERE n1.`prop1` = 'bbb'
> USING INDEX n1:Label1(prop1)
> MATCH (n:`Label2`)-[:REL1]->(n1) WHERE n.`prop2`='aaa'
> USING INDEX n:Label2(prop2)
> RETURN COUNT(*) AS `Count`;
>
> profile
> MATCH (n1:`Label1`) WHERE n1.`prop1` = 'bbb'
> USING INDEX n1:Label1(prop1)
> MATCH (n:`Label2`) WHERE n.`prop2`='aaa' AND (n)-[:REL1]->(n1)
> USING INDEX n:Label2(prop2)
> RETURN COUNT(*) AS `Count`;
>
> Please try these variants, I can imagine reasons for either being faster.
> And let us know.
>
> Your store is not that big (3M nodes 100M rels) but have you made sure it is
> completely memory mapped?
>
> Michael
>
> On Thu, Dec 29, 2016 at 3:10 PM, Matias Burak <[email protected]
> <mailto:[email protected]>> wrote:
> Thanks Michael, maybe using the boolean property as label would help but it's
> very complicated for us to do that because we are allowing the users to build
> their own data model, so there might be a lot of boolean properties, and we
> would have to turn all those properties into labels, making the database very
> complex and the way to retrieve data too.
> Also, it's not just about booleans, we also have other cases like filtering
> by strings instead of boolean, and that's slow too.
> The players/clubs graph was something i made up to protect our data.
> This is what we get when we try to filter on 2 properties on related nodes.
>
> neo4j-sh (?)$ profile MATCH (n1:`Label1`)<-[r1:REL1]-(n:`Label2`) WHERE
> n.`prop1` = 'aaa' AND n1.`prop2`='bbb' RETURN COUNT(*) AS `Count`;
> +---------+
> | Count |
> +---------+
> | 2127237 |
> +---------+
> 1 row
> 5747 ms
>
> Compiler CYPHER 3.1
>
> Planner COST
>
> Runtime INTERPRETED
>
> +-------------------+----------------+---------+---------+-------------+-----------------------------------------------+
> | Operator | Estimated Rows | Rows | DB Hits | Variables |
> Other |
> +-------------------+----------------+---------+---------+-------------+-----------------------------------------------+
> | +ProduceResults | 498 | 1 | 0 | Count |
> Count |
> | |
> +----------------+---------+---------+-------------+-----------------------------------------------+
> | +EagerAggregation | 498 | 1 | 0 | Count |
> |
> | |
> +----------------+---------+---------+-------------+-----------------------------------------------+
> | +Filter | 248298 | 2127237 | 4815283 | n, n1, r1 |
> n.prop1 == { AUTOSTRING0} AND n:Label2 |
> | |
> +----------------+---------+---------+-------------+-----------------------------------------------+
> | +Expand(All) | 815288 | 2444571 | 2444572 | n, r1 -- n1 |
> (n1)<-[r1:REL1]-(n) |
> | |
> +----------------+---------+---------+-------------+-----------------------------------------------+
> | +NodeIndexSeek | 1 | 1 | 2 | n1 |
> :Label1(prop2) |
> +-------------------+----------------+---------+---------+-------------+-----------------------------------------------+
>
> Total database accesses: 7259857
>
>
>
> El miércoles, 28 de diciembre de 2016, 8:06:45 (UTC-3), Michael Hunger
> escribió:
> I would turn boolean properties, like international into a label.
>
>
> like this:
> MATCH (n1:`Club`)<-[r1:CLUB]-(n:International) WHERE n1.name
> <http://n1.name/>='FC Barcelona' RETURN COUNT(*) AS `Count`
>
> Something seems to be off with your data too.
>
> As there are hardly 2.2M international players in FC Barcelona (and 2.4M in
> total) it will take a bit to expand / load them.
> What kind of Hardware do you run this on?
>
> Or is this some computer game and players there?
>
> Please also try something like this:
>
> MATCH (c:Club {name:"FC Barcelona"})
> MATCH (p:Player:International) WHERE (p)-[:CLUB]->(c)
> RETURN count(*);
>
> Michael
>
>
> On Tue, Dec 27, 2016 at 9:38 PM, Matias Burak <[email protected] <>> wrote:
> neo4j-sh (?)$ profile MATCH (n1:`Club`)<-[r1:CLUB]-(n:`Player`) WHERE
> n.`international` = true AND n1.name <http://n1.name/>='FC Barcelona' RETURN
> COUNT(*) AS `Count`;
> +---------+
> | Count |
> +---------+
> | 2181771 |
> +---------+
> 1 row
> 7912 ms
>
> Compiler CYPHER 3.1
>
> Planner COST
>
> Runtime INTERPRETED
>
> +-------------------+----------------+---------+---------+-------------+--------------------------------------------------+
> | Operator | Estimated Rows | Rows | DB Hits | Variables |
> Other |
> +-------------------+----------------+---------+---------+-------------+--------------------------------------------------+
> | +ProduceResults | 593 | 1 | 0 | Count |
> Count |
> | |
> +----------------+---------+---------+-------------+--------------------------------------------------+
> | +EagerAggregation | 593 | 1 | 0 | Count |
> |
> | |
> +----------------+---------+---------+-------------+--------------------------------------------------+
> | +Filter | 351532 | 2181771 | 4869817 | n, n1, r1 |
> n.international == { AUTOBOOL0} AND n:Player |
> | |
> +----------------+---------+---------+-------------+--------------------------------------------------+
> | +Expand(All) | 815288 | 2444571 | 2444572 | n, r1 -- n1 |
> (n1)<-[r1:CLUB]-(n) |
> | |
> +----------------+---------+---------+-------------+--------------------------------------------------+
> | +NodeIndexSeek | 1 | 1 | 2 | n1 |
> :Club(name) |
> +-------------------+----------------+---------+---------+-------------+--------------------------------------------------+
>
> Total database accesses: 7314391
>
> That's the actual query and the execution plan. And yes, we do have indexes
> on :Club(name) and :Player(international)
>
>
>
>
> El martes, 27 de diciembre de 2016, 1:00:41 (UTC-3), Max De Marzi Jr.
> escribió:
>
> Have you created Indexes for the fields you will be searching on? For example:
>
> CREATE INDEX ON :Club(name)
>
> What is the actual query you are trying to optimize and what is the actual
> query execution plan of the query?
>
> On Sunday, December 25, 2016 at 1:17:30 PM UTC-6, Matias Burak wrote:
> Well, actually I was simplifying the query but what we really need to do is
> to filter by some property in club too (like club.name <http://club.name/> =
> 'FC Barcelona') so that solution wouldn't work...
>
>
> El 25 de dic. de 2016 13:27 -0300, Max De Marzi Jr. <[email protected] <>>,
> escribió:
>> Try:
>>
>> MATCH (n:`Player`) WHERE n.`international` = true RETURN
>> SUM(size((n)-[:CLUB]->()) ) AS `Count`;
>>
>> On Saturday, December 24, 2016 at 3:55:26 AM UTC-6, Matias Burak wrote:
>> We are working with the latest 3.1 version, we have a database of around
>> 3.5M nodes and 100M relationships.
>> Doing some basic queries is very slow that makes the application unusable.
>>
>> neo4j-sh (?)$ profile MATCH (n1:`Club`)<-[r1:CLUB]-(n:`Player`) WHERE
>> n.`international` = true RETURN COUNT(*) AS `Count`;
>> +---------+
>> | Count |
>> +---------+
>> | 2181771 <> |
>> +---------+
>> 1 row
>> 28676 ms
>>
>> Compiler CYPHER 3.1
>>
>> Planner COST
>>
>> Runtime INTERPRETED
>>
>> +-------------------+----------------+---------+---------+-------------+-----------------------------------------------+
>> | Operator | Estimated Rows | Rows | DB Hits | Variables |
>> Other |
>> +-------------------+----------------+---------+---------+-------------+-----------------------------------------------+
>> | +ProduceResults | 1027 | 1 | 0 | Count |
>> Count |
>> | |
>> +----------------+---------+---------+-------------+-----------------------------------------------+
>> | +EagerAggregation | 1027 | 1 | 0 | Count |
>> |
>> | |
>> +----------------+---------+---------+-------------+-----------------------------------------------+
>> | +Filter | 1054596 <> | 2181771 <> | 4872405 <> | n, n1,
>> r1 | n.international == { AUTOBOOL0} AND n:Player |
>> | |
>> +----------------+---------+---------+-------------+-----------------------------------------------+
>> | +Expand(All) | 2445865 <> | 2445865 <> | 2445868 <> | n, r1 --
>> n1 | (n1)<-[r1:CLUB]-(n) |
>> | |
>> +----------------+---------+---------+-------------+-----------------------------------------------+
>> | +NodeByLabelScan | 3 | 3 | 4 | n1 |
>> :Club |
>> +-------------------+----------------+---------+---------+-------------+-----------------------------------------------+
>>
>> Total database accesses: 7318277 <>
>> There are around 2.3M players and 3 clubs.
>> It's taking 28secs the first time, then in takes around 9-10 secs.
>> It's really annoying because it's a very simple query.
>> Is there anything we are doing wrong? Is this something we can do about or
>> it's the expected behaviour? This is running on a 16gbRAM machine, getting
>> more RAM is the only way to improve it?
>>
>> Thanks,
>> Matias.
>>
>>
>> --
>> You received this message because you are subscribed to a topic in the
>> Google Groups "Neo4j" group.
>> To unsubscribe from this topic, visit
>> https://groups.google.com/d/topic/neo4j/uZtQWJOC1HE/unsubscribe
>> <https://groups.google.com/d/topic/neo4j/uZtQWJOC1HE/unsubscribe>.
>> To unsubscribe from this group and all its topics, send an email to
>> [email protected] <>.
>> For more options, visit https://groups.google.com/d/optout
>> <https://groups.google.com/d/optout>.
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected] <>.
> For more options, visit https://groups.google.com/d/optout
> <https://groups.google.com/d/optout>.
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected]
> <mailto:[email protected]>.
> For more options, visit https://groups.google.com/d/optout
> <https://groups.google.com/d/optout>.
>
>
>
> --
> You received this message because you are subscribed to a topic in the Google
> Groups "Neo4j" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/neo4j/uZtQWJOC1HE/unsubscribe
> <https://groups.google.com/d/topic/neo4j/uZtQWJOC1HE/unsubscribe>.
> To unsubscribe from this group and all its topics, send an email to
> [email protected]
> <mailto:[email protected]>.
> For more options, visit https://groups.google.com/d/optout
> <https://groups.google.com/d/optout>.
--
You received this message because you are subscribed to the Google Groups
"Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.