Aman Sinha has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16315 )


Change subject: IMPALA-10063: [WIP] Fix crash during ComputeCpuRatios
......................................................................

IMPALA-10063: [WIP] Fix crash during ComputeCpuRatios

ComputeCpuRatios computes 'total_tics = cur_sum - old_sum',
where cur_sum is the sum of the cpu counters in the current
snapshot and old_sum is the sum in the prior snapshot. In
some cases (see IMPALA-10063) this subtraction produced a
negative 'total_tics', which is invalid. The negative value
triggered the DCHECK and caused a crash.
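
For illustration only, the invariant can be sketched as
follows. This is not the code in system-state-info.cc: the
Snapshot type, the field count of 7 and the function name
ComputeTotalTics are assumptions made for this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <numeric>

// Hypothetical snapshot type: one tick counter per field of the "cpu" line.
// The field count of 7 is an assumption made for this sketch.
using Snapshot = std::array<int64_t, 7>;

// Computes total_tics = cur_sum - old_sum and stores it in '*total_tics'.
// Returns false when the difference is negative, which is the invalid
// condition described in the commit message.
bool ComputeTotalTics(const Snapshot& old_vals, const Snapshot& cur_vals,
                      int64_t* total_tics) {
  assert(total_tics != nullptr);
  const int64_t old_sum =
      std::accumulate(old_vals.begin(), old_vals.end(), int64_t{0});
  const int64_t cur_sum =
      std::accumulate(cur_vals.begin(), cur_vals.end(), int64_t{0});
  *total_tics = cur_sum - old_sum;
  return *total_tics >= 0;
}
```

With monotonically increasing counters the function returns
true. Swapping the two snapshots yields a negative difference
and a false return value.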

The tick counts are read from the cpu entry in /proc/stat,
so successive snapshots should only increase in value or
stay the same. The root cause was a missing memset for the
array that is re-used to hold these values. The fix calls
memset on this array to initialize it to 0 before reading
new values. Based on the symptoms, previously reported cpu
ratios would have been wrong.
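
The commit message does not show the parsing code, so the
following is only a sketch of how stale entries in a re-used
array can inflate a sum. It assumes a hypothetical parser
that may write fewer entries than the array holds. The names
CpuValues, kNumCpuValues, ParseCpuLine and SumTicks are
invented for this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <numeric>
#include <sstream>
#include <string>

// Hypothetical field count; the real constant in system-state-info.cc may differ.
constexpr int kNumCpuValues = 7;
using CpuValues = std::array<int64_t, kNumCpuValues>;

// Parses the aggregate "cpu" line of /proc/stat into the re-used buffer 'out'.
// When 'zero_first' is true the buffer is cleared with memset before parsing,
// which mirrors the fix described above. Entries the line does not provide
// are left untouched by the loop.
void ParseCpuLine(const std::string& line, CpuValues* out, bool zero_first) {
  assert(out != nullptr);
  if (zero_first) std::memset(out->data(), 0, sizeof(int64_t) * out->size());
  std::istringstream ss(line);
  std::string label;
  ss >> label;  // Skip the leading "cpu" token.
  int64_t v = 0;
  for (int i = 0; i < kNumCpuValues && (ss >> v); ++i) (*out)[i] = v;
}

// Sums all tick counters in a snapshot.
int64_t SumTicks(const CpuValues& values) {
  return std::accumulate(values.begin(), values.end(), int64_t{0});
}
```

If the buffer still holds 1000 in every slot and the line
"cpu 10 20 30" is parsed without zeroing, the sum is 4060
instead of 60. A later, correctly populated snapshot can then
have a smaller sum, which makes the difference negative.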

Testing:
 - Manual testing on my desktop which was showing this
   symptom earlier (need to monitor over a few days).
 - TODO: run E2E tests where the above metrics are used
   (need to identify these tests).

Change-Id: Ib5be8aef55f19e84994dd6cb556546ca00a060c9
---
M be/src/util/system-state-info.cc
1 file changed, 1 insertion(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/16315/1
--
To view, visit http://gerrit.cloudera.org:8080/16315
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib5be8aef55f19e84994dd6cb556546ca00a060c9
Gerrit-Change-Number: 16315
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha <[email protected]>
