[
https://issues.apache.org/jira/browse/HELIX-345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871418#comment-13871418
]
Kanak Biscuitwala commented on HELIX-345:
-----------------------------------------
Ran this again; here is more information:
- There is one ZK server accepting all reads and writes, and it is the only
active process on that machine
- There is one controller running on one machine, and the 100 participants are
all running on one (different) machine
- There are ~30300 ZNodes to read, of which ~30000 are messages (as expected,
given the cluster size)
- If I prevent writes by the participant, the performance is unchanged; this
was not the source of a bottleneck
- Reading stats is no faster than reading data, so even though only a few
hundred dirty properties are actually read, this stage is not optimized
- Discussed with Jason whether ZkCacheBaseDataAccessor would make sense here;
we agreed it would be no faster than simply caching the controller messages
within ClusterDataCache. ZkCacheBaseDataAccessor is designed for single
writers, and messages don't fit this model.
- Thus, in approach 2, I updated the code that creates the message to also
populate the cache so that the initial read goes away
- Approach 2 is the only approach that can read this cluster with this setup in
500ms (because it never has to read messages or stats)
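The gist of approach 2 is a write-through cache: the code path that creates a message also records it locally, so the next pipeline run never has to read the message ZNodes back from ZooKeeper. A minimal sketch (class and method names here are illustrative, not the actual Helix API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative stand-in for caching controller messages in ClusterDataCache.
// Not the real Helix classes; just the write-through idea from approach 2.
class MessageCache {
    private final Map<String, String> cachedMessages = new ConcurrentHashMap<>();

    // Write-through: the send path populates the cache as a side effect,
    // so the initial read of the message ZNodes goes away.
    void onMessageSent(String msgId, String payload) {
        cachedMessages.put(msgId, payload);
    }

    // The read stage consults the local cache instead of ZooKeeper.
    String getMessage(String msgId) {
        return cachedMessages.get(msgId);
    }

    int size() {
        return cachedMessages.size();
    }
}

public class Approach2Sketch {
    public static void main(String[] args) {
        MessageCache cache = new MessageCache();
        // The controller "sends" a few messages; each send caches its payload.
        for (int i = 0; i < 5; i++) {
            cache.onMessageSent("msg-" + i, "STATE_TRANSITION:" + i);
        }
        // Later pipeline runs serve message reads with zero ZK round trips.
        System.out.println(cache.getMessage("msg-3")); // prints "STATE_TRANSITION:3"
        System.out.println(cache.size());              // prints "5"
    }
}
```

In the real controller the hard part is invalidation (messages are deleted once processed), which is why this only works when the controller is effectively the single writer of its own messages, as discussed above.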
> Speed up the controller pipelines
> ---------------------------------
>
> Key: HELIX-345
> URL: https://issues.apache.org/jira/browse/HELIX-345
> Project: Apache Helix
> Issue Type: Bug
> Affects Versions: 0.6.2-incubating, 0.7.0-incubating
> Reporter: Kanak Biscuitwala
> Assignee: Kanak Biscuitwala
>
> ReadClusterDataStage can take some time. We should have techniques for
> speeding it up like parallelizing or caching.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)