Thanks Michael, maybe using the boolean property as label would help but
it's very complicated for us to do that because we are allowing the users
to build their own data model, so there might be a lot of boolean
properties, and we would have to turn all those properties into labels,
making the database very complex and the way to retrieve data too.
Also, it's not just about booleans, we also have other cases like filtering
by strings instead of boolean, and that's slow too.
The players/clubs graph was something i made up to protect our data.
This is what we get when we try to filter on 2 properties on related nodes.
neo4j-sh (?)$ profile MATCH (n1:`Label1`)<-[r1:REL1]-(n:`Label2`) WHERE
n.`prop1` = 'aaa' AND n1.`prop2`='bbb' RETURN COUNT(*) AS `Count`;
+---------+
| Count |
+---------+
| 2127237 |
+---------+
1 row
5747 ms
Compiler CYPHER 3.1
Planner COST
Runtime INTERPRETED
+-------------------+----------------+---------+---------+-------------+-----------------------------------------------+
| Operator | Estimated Rows | Rows | DB Hits | Variables |
Other |
+-------------------+----------------+---------+---------+-------------+-----------------------------------------------+
| +ProduceResults | 498 | 1 | 0 | Count |
Count |
| |
+----------------+---------+---------+-------------+-----------------------------------------------+
| +EagerAggregation | 498 | 1 | 0 | Count |
|
| |
+----------------+---------+---------+-------------+-----------------------------------------------+
| +Filter | 248298 | 2127237 | 4815283 | n, n1, r1 |
n.prop1 == { AUTOSTRING0} AND n:Label2 |
| |
+----------------+---------+---------+-------------+-----------------------------------------------+
| +Expand(All) | 815288 | 2444571 | 2444572 | n, r1 -- n1 |
(n1)<-[r1:REL1]-(n) |
| |
+----------------+---------+---------+-------------+-----------------------------------------------+
| +NodeIndexSeek | 1 | 1 | 2 | n1 |
:Label1(prop2) |
+-------------------+----------------+---------+---------+-------------+-----------------------------------------------+
Total database accesses: 7259857
El miércoles, 28 de diciembre de 2016, 8:06:45 (UTC-3), Michael Hunger
escribió:
>
> I would turn boolean properties, like international into a label.
>
>
> like this:
> MATCH (n1:`Club`)<-[r1:CLUB]-(n:International) WHERE n1.name='FC
> Barcelona' RETURN COUNT(*) AS `Count`
>
> Something seems to be off with your data too.
>
> As there are hardly 2.2M international players in FC Barcelona (and 2.4M
> in total) it will take a bit to expand / load them.
> What kind of Hardware do you run this on?
>
> Or is this some computer game and players there?
>
> Please also try something like this:
>
> MATCH (c:Club {name:"FC Barcelona"})
> MATCH (p:Player:International) WHERE (p)-[:CLUB]->(c)
> RETURN count(*);
>
> Michael
>
>
> On Tue, Dec 27, 2016 at 9:38 PM, Matias Burak <[email protected]
> <javascript:>> wrote:
>
>> neo4j-sh (?)$ profile MATCH (n1:`Club`)<-[r1:CLUB]-(n:`Player`) WHERE
>> n.`international` = true AND n1.name='FC Barcelona' RETURN COUNT(*) AS
>> `Count`;
>> +---------+
>> | Count |
>> +---------+
>> | 2181771 |
>> +---------+
>> 1 row
>> 7912 ms
>>
>> Compiler CYPHER 3.1
>>
>> Planner COST
>>
>> Runtime INTERPRETED
>>
>>
>> +-------------------+----------------+---------+---------+-------------+--------------------------------------------------+
>> | Operator | Estimated Rows | Rows | DB Hits | Variables |
>> Other |
>>
>> +-------------------+----------------+---------+---------+-------------+--------------------------------------------------+
>> | +ProduceResults | 593 | 1 | 0 | Count |
>> Count |
>> | |
>> +----------------+---------+---------+-------------+--------------------------------------------------+
>> | +EagerAggregation | 593 | 1 | 0 | Count |
>> |
>> | |
>> +----------------+---------+---------+-------------+--------------------------------------------------+
>> | +Filter | 351532 | 2181771 | 4869817 | n, n1, r1 |
>> n.international == { AUTOBOOL0} AND n:Player |
>> | |
>> +----------------+---------+---------+-------------+--------------------------------------------------+
>> | +Expand(All) | 815288 | 2444571 | 2444572 | n, r1 -- n1 |
>> (n1)<-[r1:CLUB]-(n) |
>> | |
>> +----------------+---------+---------+-------------+--------------------------------------------------+
>> | +NodeIndexSeek | 1 | 1 | 2 | n1 |
>> :Club(name) |
>>
>> +-------------------+----------------+---------+---------+-------------+--------------------------------------------------+
>>
>> Total database accesses: 7314391
>>
>> That's the actual query and the execution plan. And yes, we do have
>> indexes on :Club(name) and :Player(international)
>>
>>
>>
>>
>> El martes, 27 de diciembre de 2016, 1:00:41 (UTC-3), Max De Marzi Jr.
>> escribió:
>>>
>>>
>>> Have you created Indexes for the fields you will be searching on? For
>>> example:
>>>
>>> CREATE INDEX ON :Club(name)
>>>
>>> What is the actual query you are trying to optimize and what is the
>>> actual query execution plan of the query?
>>>
>>> On Sunday, December 25, 2016 at 1:17:30 PM UTC-6, Matias Burak wrote:
>>>>
>>>> Well, actually I was simplifying the query but what we really need to
>>>> do is to filter by some property in club too (like club.name = 'FC
>>>> Barcelona') so that solution wouldn't work...
>>>>
>>>>
>>>> El 25 de dic. de 2016 13:27 -0300, Max De Marzi Jr. <[email protected]>,
>>>> escribió:
>>>>
>>>> Try:
>>>>
>>>> MATCH (n:`Player`) WHERE n.`international` = true RETURN SUM(
>>>> size((n)-[:CLUB]->()) ) AS `Count`;
>>>>
>>>> On Saturday, December 24, 2016 at 3:55:26 AM UTC-6, Matias Burak wrote:
>>>>>
>>>>> We are working with the latest 3.1 version, we have a database of
>>>>> around 3.5M nodes and 100M relationships.
>>>>> Doing some basic queries is very slow that makes the application
>>>>> unusable.
>>>>>
>>>>> neo4j-sh (?)$ profile MATCH (n1:`Club`)<-[r1:CLUB]-(n:`Player`) WHERE
>>>>> n.`international` = true RETURN COUNT(*) AS `Count`;
>>>>> +---------+
>>>>> | Count |
>>>>> +---------+
>>>>> | 2181771 |
>>>>> +---------+
>>>>> 1 row
>>>>> 28676 ms
>>>>>
>>>>> Compiler CYPHER 3.1
>>>>>
>>>>> Planner COST
>>>>>
>>>>> Runtime INTERPRETED
>>>>>
>>>>>
>>>>> +-------------------+----------------+---------+---------+-------------+-----------------------------------------------+
>>>>> | Operator | Estimated Rows | Rows | DB Hits | Variables
>>>>> | Other |
>>>>>
>>>>> +-------------------+----------------+---------+---------+-------------+-----------------------------------------------+
>>>>> | +ProduceResults | 1027 | 1 | 0 | Count
>>>>> | Count |
>>>>> | |
>>>>> +----------------+---------+---------+-------------+-----------------------------------------------+
>>>>> | +EagerAggregation | 1027 | 1 | 0 | Count
>>>>> | |
>>>>> | |
>>>>> +----------------+---------+---------+-------------+-----------------------------------------------+
>>>>> | +Filter | 1054596 | 2181771 | 4872405 | n, n1, r1
>>>>> | n.international == { AUTOBOOL0} AND n:Player |
>>>>> | |
>>>>> +----------------+---------+---------+-------------+-----------------------------------------------+
>>>>> | +Expand(All) | 2445865 | 2445865 | 2445868 | n, r1 --
>>>>> n1 | (n1)<-[r1:CLUB]-(n) |
>>>>> | |
>>>>> +----------------+---------+---------+-------------+-----------------------------------------------+
>>>>> | +NodeByLabelScan | 3 | 3 | 4 | n1
>>>>> | :Club |
>>>>>
>>>>> +-------------------+----------------+---------+---------+-------------+-----------------------------------------------+
>>>>>
>>>>> Total database accesses: 7318277
>>>>>
>>>>> There are around 2.3M players and 3 clubs.
>>>>> It's taking 28secs the first time, then in takes around 9-10 secs.
>>>>> It's really annoying because it's a very simple query.
>>>>> Is there anything we are doing wrong? Is this something we can do
>>>>> about or it's the expected behaviour? This is running on a 16gbRAM
>>>>> machine,
>>>>> getting more RAM is the only way to improve it?
>>>>>
>>>>> Thanks,
>>>>> Matias.
>>>>>
>>>>>
>>>> --
>>>> You received this message because you are subscribed to a topic in the
>>>> Google Groups "Neo4j" group.
>>>> To unsubscribe from this topic, visit
>>>> https://groups.google.com/d/topic/neo4j/uZtQWJOC1HE/unsubscribe.
>>>> To unsubscribe from this group and all its topics, send an email to
>>>> [email protected].
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>> --
>> You received this message because you are subscribed to the Google Groups
>> "Neo4j" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected] <javascript:>.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
--
You received this message because you are subscribed to the Google Groups
"Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.