[ https://issues.apache.org/jira/browse/HELIX-345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871418#comment-13871418 ]

Kanak Biscuitwala commented on HELIX-345:
-----------------------------------------

Ran this again; here is more information:

- There is one ZK server accepting all reads and writes, and it is the only 
active process on that machine
- There is one controller running on one machine, and the 100 participants are 
all running on one (different) machine
- There are ~30300 ZNodes to read, of which ~30000 are messages (as expected, 
given the cluster size)
- If I prevent writes by the participants, performance is unchanged; writes 
were not the source of the bottleneck
- Reading stats is no faster than reading data, so even though only hundreds 
of dirty properties are actually read, this stage is not optimized
- Discussed with Jason whether ZkCacheBaseDataAccessor would make sense here; 
we agreed it would be no faster than simply caching the controller messages 
within ClusterDataCache. That class is designed for single writers, and 
messages don't fit that model.
- Thus, in approach 2, I updated the code that creates the message to also 
populate the cache so that the initial read goes away
- Approach 2 is the only approach that can read this cluster with this setup in 
500ms (because it never has to read messages or stats)
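To illustrate the idea behind approach 2, here is a minimal sketch of a write-through message cache: the code path that creates a message also populates an in-memory cache, so the next pipeline run can serve the message without reading it back from ZooKeeper. The class and method names (MessageCache, onMessageCreated, etc.) are hypothetical, not actual Helix APIs, and a plain map stands in for the real ClusterDataCache bookkeeping.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only: a write-through cache keyed by ZNode path.
public class MessageCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Called from the code that creates a message: the ZK write and the
    // cache update happen together, so the initial read-back goes away.
    public void onMessageCreated(String zkPath, String payload) {
        cache.put(zkPath, payload);
    }

    // Called from the read stage: a cache hit avoids a ZooKeeper read.
    public String read(String zkPath) {
        return cache.get(zkPath);
    }

    // Called when the message is consumed/deleted, to keep the cache coherent.
    public void onMessageDeleted(String zkPath) {
        cache.remove(zkPath);
    }
}
```

With ~30000 of the ~30300 ZNodes being messages, serving them from such a cache is what lets the pipeline skip the bulk of the reads.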

> Speed up the controller pipelines
> ---------------------------------
>
>                 Key: HELIX-345
>                 URL: https://issues.apache.org/jira/browse/HELIX-345
>             Project: Apache Helix
>          Issue Type: Bug
>    Affects Versions: 0.6.2-incubating, 0.7.0-incubating
>            Reporter: Kanak Biscuitwala
>            Assignee: Kanak Biscuitwala
>
> ReadClusterDataStage can take some time. We should have techniques for 
> speeding it up like parallelizing or caching.
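
The parallelizing technique mentioned in the description could look roughly like the following sketch: fan the per-ZNode reads out over a thread pool instead of reading serially. This is illustrative only; readZNode is a placeholder for a real ZooKeeper getData call, and the method names are not Helix APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of parallelizing the reads in ReadClusterDataStage.
public class ParallelRead {
    // Placeholder for a ZooKeeper getData call on one ZNode.
    static String readZNode(String path) {
        return "data@" + path;
    }

    // Submit one read per path, then collect results in path order.
    static Map<String, String> readAll(List<String> paths, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String p : paths) {
                futures.add(pool.submit(() -> readZNode(p)));
            }
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < paths.size(); i++) {
                try {
                    result.put(paths.get(i), futures.get(i).get());
                } catch (InterruptedException | ExecutionException e) {
                    throw new RuntimeException(e);
                }
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

Parallelism hides per-read latency but still issues every read, which is why the comment above found caching (avoiding the reads entirely) to be the approach that meets the 500ms target.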



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
