Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9123 )
Change subject: IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates
......................................................................

Patch Set 15:

I was able to borrow a 140 node cluster and ran a workload with a few streams of concurrent queries. I looked at the statestored topics page, impalad and statestored metrics, "top" and "perf top" to make sure that resource consumption was as expected.

Things look good. The statestore does remain active, consuming a moderate amount of CPU just to poll the 140 subscribers for updates every 100ms, but that is expected.

One interesting thing is that the request-queue topic updates very frequently when running without mem_limits, because in that case it tracks the actual memory consumption, which tends to fluctuate a lot.

Detailed notes follow.

On an idle 140 node cluster, statestored consumes ~40% cpu and ~500kb/s network. Perf shows that time is mainly spent checking topic versions.

   9.19%  impalad       [.] impala::Statestore::GetMinSubscriberTopicVersion(std::string const&, std::string*)
   7.50%  [kernel]      [k] find_busiest_group
   4.47%  impalad       [.] _ZN5boost9unordered6detail12mix64_policyImE10apply_hashINS_4hashISsEESsEEmRKT_RKT0_.isra.271
   1.82%  [kernel]      [k] find_next_bit
   1.69%  impalad       [.] impala::Statestore::Subscriber::LastTopicVersionProcessed(std::string const&) const
   1.68%  [kernel]      [k] _spin_lock
   1.27%  libc-2.12.so  [.] __memcmp_sse4_1
   1.25%  [kernel]      [k] cpumask_next_and
   0.98%  [kernel]      [k] thread_return
   0.80%  [kernel]      [k] smaps_pte_entry
   0.74%  libjvm.so     [.] GenericTaskQueueSet<Padded<GenericTaskQueue<oopDesc*, (unsigned short)1280, 131072u>, 64ul>, (unsigned short
   0.70%  [kernel]      [k] schedule
   0.60%  libc-2.12.so  [.] memcpy
   0.54%  [kernel]      [k] _spin_lock_irqsave
   0.54%  [kernel]      [k] ixgbe_poll

When I ran a light workload of queries with no mem_limits, the request-queue topic updated frequently with the current memory consumption and statestored CPU consumption increased to 50-60%. If I set a default pool mem_limit, the topic version increments only rarely and there is no noticeable increase in load.

On an idle cluster, prioritized updates were delivered in a timely manner and took minimal time to process.

Snapshot of metrics from an Impalad while idle after running some queries:

statestore-subscriber.connected
  true
  Whether the Impala Daemon considers itself connected to the StateStore.

statestore-subscriber.last-recovery-time
  N/A
  The local time that the last statestore recovery happened.

statestore-subscriber.topic-update-interval-time
  Last (of 67983): 0.100672. Min: 0, max: 2.43858, avg: 0.144938
  The time (sec) between Statestore subscriber topic updates.

statestore-subscriber.topic-update-duration
  Last (of 34783): 0.000256518. Min: 0, max: 1.06605, avg: 0.00530736
  The time (sec) taken to process Statestore subscriber topic updates.

statestore-subscriber.heartbeat-interval-time
  Last (of 3344): 1.00001. Min: 0, max: 1.82638, avg: 1.00039
  The time (sec) between Statestore heartbeats.

statestore-subscriber.topic-impala-request-queue.update-interval
  Last (of 33200): 0.100672. Min: 0, max: 2.43848, avg: 0.100697
  Interval between topic updates for Topic impala-request-queue

statestore-subscriber.topic-impala-membership.update-interval
  Last (of 33200): 0.100671. Min: 0, max: 2.43858, avg: 0.100696
  Interval between topic updates for Topic impala-membership

statestore-subscriber.topic-impala-request-queue.processing-time-s
  Last (of 33200): 0.000254981. Min: 0, max: 0.0113791, avg: 0.000155621
  Statestore Subscriber Topic impala-request-queue Processing Time

statestore-subscriber.topic-impala-membership.processing-time-s
  Last (of 33200): 0.000177222. Min: 0, max: 0.0113666, avg: 0.000120381
  Statestore Subscriber Topic impala-membership Processing Time

statestore-subscriber.topic-catalog-update.update-interval
  Last (of 1583): 2.0009. Min: 1, max: 2.20628, avg: 2.00069
  Interval between topic updates for Topic catalog-update

statestore-subscriber.topic-catalog-update.processing-time-s
  Last (of 1583): 0.136297. Min: 0, max: 1.06605, avg: 0.113322
  Statestore Subscriber Topic catalog-update Processing Time

rpc-method.statestore-subscriber.StatestoreSubscriber.UpdateState.call_duration
  Count: 34783, min / max: 0 / 1s117ms, 25th %-ile: 0, 50th %-ile: 0, 75th %-ile: 0, 90th %-ile: 1ms, 95th %-ile: 1ms, 99.9th %-ile: 132ms

rpc-method.statestore-subscriber.StatestoreSubscriber.Heartbeat.call_duration
  Count: 3344, min / max: 0 / 1ms, 25th %-ile: 0, 50th %-ile: 0, 75th %-ile: 0, 90th %-ile: 0, 95th %-ile: 0, 99.9th %-ile: 1ms

statestore-subscriber.last-recovery-duration
  0
  The amount of time the StateStore subscriber took to recover the connection the last time it was lost.

From the statestore's side, prioritized topic update RPCs were low latency: ~0.5ms on average.

statestore.heartbeat-durations
  Last (of 473797): 0.000238798. Min: 0, max: 2.0773, avg: 0.000334941
  The time (sec) spent sending heartbeat RPCs. Includes subscriber-side processing time and network transmission time.

statestore.priority-topic-update-durations
  Last (of 4668620): 0.0010134. Min: 0, max: 2.19275, avg: 0.000593646
  The time (sec) spent sending priority topic update RPCs. Includes subscriber-side processing time and network transmission time.

statestore.topic-update-durations
  Last (of 224106): 0.121756. Min: 0, max: 2.96739, avg: 0.11604
  The time (sec) spent sending non-priority topic update RPCs. Includes subscriber-side processing time and network transmission time.

Topic versions for the request queue update very frequently while queries are running, as expected:

Topics

Topic Id              Number of entries  Version  Oldest subscriber version  Oldest subscriber ID                      Size (keys / values / total)
impala-membership     130                130      130                        impa...@vc1517.halxg.cloudera.com:22000   4.95 KB / 12.82 KB / 17.77 KB
impala-request-queue  129                543176   543048                     impa...@ve1139.halxg.cloudera.com:22000   5.54 KB / 4.28 KB / 9.83 KB
catalog-update        88308              88322    88322                      impa...@vc1517.halxg.cloudera.com:22000   2.35 MB / 5.45 MB / 7.80 MB

--
To view, visit http://gerrit.cloudera.org:8080/9123
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc49c2d0f2a5bfad822545616b8c62b4b95dc210
Gerrit-Change-Number: 9123
Gerrit-PatchSet: 15
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Tue, 13 Feb 2018 23:46:35 +0000
Gerrit-HasComments: No
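
A rough sanity check of the figures quoted in the notes above can be sketched in Python. This is only back-of-envelope arithmetic, not anything taken from the Impala code: the function names are hypothetical, and it assumes one priority topic update RPC per subscriber per 100ms interval, which the notes imply but do not state outright.

```python
# Back-of-envelope arithmetic on numbers quoted in the notes above.
# All function names are hypothetical; nothing here calls into Impala.

SUBSCRIBERS = 140                      # cluster size from the notes
PRIORITY_UPDATE_INTERVAL_S = 0.1       # "every 100ms"
AVG_PRIORITY_RPC_S = 0.000593646       # avg of statestore.priority-topic-update-durations


def priority_rpcs_per_second(subscribers, interval_s):
    """Priority update RPCs per second, ASSUMING one RPC per subscriber per interval."""
    return subscribers / interval_s


def rpc_seconds_per_second(subscribers, interval_s, avg_rpc_s):
    """Cumulative time spent in priority update RPCs per wall-clock second."""
    return priority_rpcs_per_second(subscribers, interval_s) * avg_rpc_s


def version_lag(topic_version, oldest_subscriber_version):
    """How many topic versions the slowest subscriber is behind."""
    return topic_version - oldest_subscriber_version


if __name__ == "__main__":
    rate = priority_rpcs_per_second(SUBSCRIBERS, PRIORITY_UPDATE_INTERVAL_S)
    busy = rpc_seconds_per_second(
        SUBSCRIBERS, PRIORITY_UPDATE_INTERVAL_S, AVG_PRIORITY_RPC_S
    )
    print(f"priority update RPCs/s: {rate:.0f}")
    print(f"RPC-seconds per second: {busy:.2f}")
    # Versions taken from the topics table in the notes.
    print("impala-request-queue lag:", version_lag(543176, 543048))
    print("catalog-update lag:", version_lag(88322, 88322))
```

Under that assumption the statestore issues on the order of 1400 priority update RPCs per second, and the slowest impala-request-queue subscriber in the topics table trails the topic by 128 versions, while catalog-update shows no lag.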