Re: [Neo4j] label implementation details

Sun Yuhan Sun, 06 Aug 2017 20:21:01 -0700

The two databases store graphs with the same structure. Each node has 
only one label. The two graphs have 2 and 100 labels respectively, named 
GRAPH_0, GRAPH_1, and so on.


The first query is:

profile match (a0:GRAPH_0)-->(a1:GRAPH_1) return id(a0), id(a1)

The second one is:

profile match (a0:GRAPH_2)-->(a1:GRAPH_1) return id(a0), id(a1)

Their query plans are as follows:

+------------------+----------------+---------+---------+------------------------------------+----------------------------------------------------+
| Operator         | Estimated Rows | Rows    | DB Hits | Variables                          | Other                                              |
+------------------+----------------+---------+---------+------------------------------------+----------------------------------------------------+
| +ProduceResults  |        3298257 | 2638399 |       0 | id(a0), id(a1)                     | id(a0), id(a1)                                     |
| |                +----------------+---------+---------+------------------------------------+----------------------------------------------------+
| +Projection      |        3298257 | 2638399 |       0 | id(a0), id(a1) -- anon[19], a0, a1 | {id(a0) : IdFunction(a0), id(a1) : IdFunction(a1)} |
| |                +----------------+---------+---------+------------------------------------+----------------------------------------------------+
| +Filter          |        3298257 | 2638399 | 3298257 | anon[19], a0, a1                   | a1:GRAPH_1                                         |
| |                +----------------+---------+---------+------------------------------------+----------------------------------------------------+
| +Expand(All)     |        3298257 | 3298257 | 4053211 | anon[19], a1 -- a0                 | (a0)-->(a1)                                        |
| |                +----------------+---------+---------+------------------------------------+----------------------------------------------------+
| +NodeByLabelScan |         754954 |  754954 |  754955 | a0                                 | :GRAPH_0                                           |
+------------------+----------------+---------+---------+------------------------------------+----------------------------------------------------+


+------------------+----------------+-------+---------+------------------------------------+----------------------------------------------------+
| Operator         | Estimated Rows | Rows  | DB Hits | Variables                          | Other                                              |
+------------------+----------------+-------+---------+------------------------------------+----------------------------------------------------+
| +ProduceResults  |          33082 | 26410 |       0 | id(a0), id(a1)                     | id(a0), id(a1)                                     |
| |                +----------------+-------+---------+------------------------------------+----------------------------------------------------+
| +Projection      |          33082 | 26410 |       0 | id(a0), id(a1) -- anon[19], a0, a1 | {id(a0) : IdFunction(a0), id(a1) : IdFunction(a1)} |
| |                +----------------+-------+---------+------------------------------------+----------------------------------------------------+
| +Filter          |          33082 | 26410 |   33082 | anon[19], a0, a1                   | a1:GRAPH_1                                         |
| |                +----------------+-------+---------+------------------------------------+----------------------------------------------------+
| +Expand(All)     |          33082 | 33082 |   40735 | anon[19], a1 -- a0                 | (a0)-->(a1)                                        |
| |                +----------------+-------+---------+------------------------------------+----------------------------------------------------+
| +NodeByLabelScan |           7653 |  7653 |    7654 | a0                                 | :GRAPH_2                                           |
+------------------+----------------+-------+---------+------------------------------------+----------------------------------------------------+
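
To make the two profiles easier to compare, here is a small sketch that tallies them. It uses only the Rows and DB Hits figures from the two plans above. The names plan_1, plan_2, total_db_hits and summarize are made up for illustration and are not part of Neo4j or its drivers.

```python
from typing import Dict, Tuple

# operator -> (rows, db_hits), copied by hand from the two PROFILE outputs.
Plan = Dict[str, Tuple[int, int]]

plan_1: Plan = {  # match (a0:GRAPH_0)-->(a1:GRAPH_1)
    "NodeByLabelScan": (754954, 754955),
    "Expand(All)": (3298257, 4053211),
    "Filter": (2638399, 3298257),
    "Projection": (2638399, 0),
    "ProduceResults": (2638399, 0),
}

plan_2: Plan = {  # match (a0:GRAPH_2)-->(a1:GRAPH_1)
    "NodeByLabelScan": (7653, 7654),
    "Expand(All)": (33082, 40735),
    "Filter": (26410, 33082),
    "Projection": (26410, 0),
    "ProduceResults": (26410, 0),
}


def total_db_hits(plan: Plan) -> int:
    """Sum the DB Hits column over all operators of one plan."""
    return sum(db_hits for _rows, db_hits in plan.values())


def summarize(plan: Plan) -> Dict[str, float]:
    """Derive a few ratios from the Rows column of one plan."""
    scanned = plan["NodeByLabelScan"][0]   # nodes returned by the label scan
    expanded = plan["Expand(All)"][0]      # (a0)-->(a1) pairs after expansion
    kept = plan["Filter"][0]               # pairs where a1 has label GRAPH_1
    return {
        "total_db_hits": total_db_hits(plan),
        "rels_per_scanned_node": expanded / scanned,
        "filter_pass_rate": kept / expanded,
    }


if __name__ == "__main__":
    for name, plan in (("plan 1", plan_1), ("plan 2", plan_2)):
        print(name, summarize(plan))
    print("db hit ratio:", total_db_hits(plan_1) / total_db_hits(plan_2))
```

The totals are 8,106,423 DB hits for the first plan and 81,471 for the second, roughly two orders of magnitude apart, which tracks the number of nodes returned by NodeByLabelScan in each case. This only restates the profile numbers; it does not measure wall-clock query time.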

On Friday, August 4, 2017 at 4:20:52 AM UTC-7, Michael Hunger wrote:
>
> You need to supply more details, i.e. the queries, query plans, query 
> times, how many labels you have per node etc.
>
> Michael
>
> On Fri, Aug 4, 2017 at 7:02 AM, Sun Yuhan <[email protected]> wrote:
>
>> I tested a query of the form (a:A) --> (b:B) on two graphs. The two graphs 
>> have the same structure: the same number of nodes and edges, with each edge 
>> connecting the same pair of nodes in both graphs. The only difference is 
>> that the first graph has only two labels, A and B, while the second graph 
>> has ten different labels, from A to I. For the same query, the query time 
>> on the second graph is about 30x worse than on the first graph.
>>
>> Is there any feasible explanation for this?
>>
>> On Thursday, July 6, 2017 at 9:10:24 PM UTC-7, Michael Hunger wrote:
>>>
>>> The index lookup is used when using labels with properties.
>>> The label scan store that I described is used for all label scans.
>>>
>>> Sent from my iPhone
>>>
>>> On 07.07.2017 at 03:31, Sun Yuhan <[email protected]> wrote:
>>>
>>> Actually, I am using Neo4j to execute graph pattern queries (subgraph 
>>> isomorphism). The given query pattern is a labeled graph, and the results 
>>> are all subgraphs in the database that match the pattern. Can the index 
>>> you mentioned in your reply achieve better performance than using labels?
>>>
>>> On Wednesday, July 5, 2017 at 1:32:15 AM UTC-7, Michael Hunger wrote:
>>>>
>>>> It should be a rare operation if you have indexes in place.
>>>>
>>>> There is a reverse store mapping labels to sets of node ids.
>>>>
>>>> In 3.1 it uses a compressed Lucene format.
>>>> In 3.2 it uses a custom generational B+ tree.
>>>> Details for both are only in the source.
>>>>
>>>>
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On 05.07.2017 at 07:31, Sun Yuhan <[email protected]> wrote:
>>>>
>>>> NodeByLabelScan is very commonly used in Cypher query execution 
>>>> plans. However, it is not clear how this operation is executed in Neo4j. 
>>>>
>>>> What's more, I do not know how labels are managed and stored in 
>>>> Neo4j. Are there any materials online that I can refer to? Or can I refer 
>>>> to some Neo4j source code to see the implementation details?
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "Neo4j" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to [email protected].
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>
>
>

