Aman Sinha has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16315 )
Change subject: IMPALA-10063: [WIP] Fix crash during ComputeCpuRatios
......................................................................

IMPALA-10063: [WIP] Fix crash during ComputeCpuRatios

The subtraction of the sums of the cpu counters in ComputeCpuRatios
was, in some cases (see IMPALA-10063), producing a negative
'total_tics' value, which is invalid. The calculation is
'total_tics = cur_sum - old_sum', where cur_sum is the sum for the
current snapshot and old_sum is the sum for the prior snapshot. A
negative result triggers the DCHECK and causes a crash.

The number of ticks is obtained from the cpu entry in /proc/stat,
and successive snapshots should only increase in value or stay the
same.

It turns out there was a missing memset for the array that is
re-used. Fixed it by calling memset on this array to initialize it
to 0 before reading new values. Based on the symptoms, previous
reports of cpu ratio would have been wrong.

Testing:
 - Manual testing on my desktop, which was showing this symptom
   earlier (need to monitor over a few days).
 - TODO: run E2E tests where the above metrics are used (need to
   identify these tests).

Change-Id: Ib5be8aef55f19e84994dd6cb556546ca00a060c9
---
M be/src/util/system-state-info.cc
1 file changed, 1 insertion(+), 0 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/16315/1
--
To view, visit http://gerrit.cloudera.org:8080/16315
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib5be8aef55f19e84994dd6cb556546ca00a060c9
Gerrit-Change-Number: 16315
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha <[email protected]>
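
For reviewers who want the pattern from the commit message in concrete form, here is a minimal standalone sketch of zeroing a re-used counter array before each read and then computing 'total_tics = cur_sum - old_sum'. This is not the code in be/src/util/system-state-info.cc and it does not reproduce the exact failure. The names kNumCpuValues, CpuValues, ReadCpuLine, Sum and ComputeTotalTics, as well as the choice of seven counters, are assumptions made for illustration only.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <numeric>
#include <sstream>
#include <string>

// Hypothetical: number of counters kept from the aggregate "cpu" line.
constexpr int kNumCpuValues = 7;
using CpuValues = std::array<int64_t, kNumCpuValues>;

// Parses one aggregate "cpu ..." line (as found in /proc/stat) into 'out'.
// The array is zeroed first, so a re-used buffer never keeps values from an
// earlier snapshot when the line has fewer fields than expected.
void ReadCpuLine(const std::string& line, CpuValues* out) {
  assert(out != nullptr);
  std::memset(out->data(), 0, sizeof(int64_t) * out->size());
  std::istringstream ss(line);
  std::string label;
  ss >> label;  // Skip the leading "cpu" token.
  for (int i = 0; i < kNumCpuValues; ++i) {
    int64_t v = 0;
    if (!(ss >> v)) break;  // Missing fields stay at 0.
    (*out)[i] = v;
  }
}

// Sum of all counters in one snapshot.
int64_t Sum(const CpuValues& values) {
  return std::accumulate(values.begin(), values.end(), int64_t{0});
}

// Computes total_tics = cur_sum - old_sum. Returns false when the delta is
// not positive, so a caller can skip the ratio update instead of asserting.
bool ComputeTotalTics(const CpuValues& old_values, const CpuValues& cur_values,
                      int64_t* total_tics) {
  *total_tics = Sum(cur_values) - Sum(old_values);
  return *total_tics > 0;
}
```

A caller would keep two CpuValues buffers, alternate between them on each read, and pass the older one as old_values. Returning a bool rather than asserting is a design choice of this sketch, not something the change itself does.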
