[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/1200 ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179292376 --- Diff: distribution/src/resources/drillbit.sh --- @@ -127,6 +127,44 @@ check_before_start() fi } +check_after_start(){ +#check if the process is running +if [ -f $pid ]; then + dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit) + if [ -n "$dbitProc" ]; then +# Check and enforce for CGroup +if [ -n "$DRILLBIT_CGROUP" ]; then + check_and_enforce_cgroup `cat $pid` +fi + fi +fi +} + +check_and_enforce_cgroup(){ +dbitPid=$1; +#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then +if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then + echo $dbitPid > /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs + # Verify Enforcement + cgroupStatus=`grep -w $pid /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs` + if [ -z "$cgroupStatus" ]; then +#Ref: https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt +let cpu_quota=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us` +let cpu_period=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_period_us` +if [ $cpu_period -gt 0 ] && [ $cpu_quota -gt 0 ]; then + coresAllowed="(up to "`echo $(( 100 * $cpu_quota / $cpu_period )) | sed 's/..$/.&/'`" cores allowed)" +fi +echo "WARN: Drillbit's CPU resource usage will be managed under the CGroup : $DRILLBIT_CGROUP "$coresAllowed + else +echo "ERROR: Failed to add Drillbit to CGroup ( $DRILLBIT_CGROUP ) for resource usage management. Ensure that the cgroup manages CPU" +exit 1 + fi +else + echo "ERROR: cgroup $DRILLBIT_CGROUP does not found. Ensure that daemon is running and cgroup exists" + exit 1 --- End diff -- Good point. Ideally, we should prevent the Drillbit from starting up (or in this case, it should shut down), if CGroups couldn't be applied. My concern is that if CGroups is not being enforced, we're running a process that can (potentially) consume excess CPU resources. Should I shut down the Drillbit in such a scenario, or move on with just a WARN message? ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179287166 --- Diff: distribution/src/resources/drillbit.sh --- @@ -127,6 +127,44 @@ check_before_start() fi } +check_after_start(){ +#check if the process is running +if [ -f $pid ]; then + dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit) + if [ -n "$dbitProc" ]; then +# Check and enforce for CGroup +if [ -n "$DRILLBIT_CGROUP" ]; then + check_and_enforce_cgroup `cat $pid` +fi + fi +fi +} + +check_and_enforce_cgroup(){ +dbitPid=$1; +#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then +if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then + echo $dbitPid > /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs + # Verify Enforcement + cgroupStatus=`grep -w $pid /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs` + if [ -z "$cgroupStatus" ]; then +#Ref: https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt +let cpu_quota=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us` +let cpu_period=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_period_us` +if [ $cpu_period -gt 0 ] && [ $cpu_quota -gt 0 ]; then + coresAllowed="(up to "`echo $(( 100 * $cpu_quota / $cpu_period )) | sed 's/..$/.&/'`" cores allowed)" +fi +echo "WARN: Drillbit's CPU resource usage will be managed under the CGroup : $DRILLBIT_CGROUP "$coresAllowed --- End diff -- Thought that since the Drill process is being brought up in a restricted mode, it should appear as a WARN. I can switch it to an INFO because there is nothing compelling to use WARN. Will make the change on the variable within quotes. ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179286598 --- Diff: distribution/src/resources/drillbit.sh --- @@ -127,6 +127,44 @@ check_before_start() fi } +check_after_start(){ +#check if the process is running +if [ -f $pid ]; then + dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit) + if [ -n "$dbitProc" ]; then +# Check and enforce for CGroup +if [ -n "$DRILLBIT_CGROUP" ]; then + check_and_enforce_cgroup `cat $pid` +fi + fi +fi +} + +check_and_enforce_cgroup(){ +dbitPid=$1; +#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then +if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then + echo $dbitPid > /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs + # Verify Enforcement + cgroupStatus=`grep -w $pid /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs` + if [ -z "$cgroupStatus" ]; then +#Ref: https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt +let cpu_quota=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us` --- End diff -- Ok. Will fix this ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179159323 --- Diff: distribution/src/resources/drillbit.sh --- @@ -154,6 +192,7 @@ start_bit ( ) nohup nice -n $DRILL_NICENESS "$DRILL_HOME/bin/runbit" exec ${args[@]} >> "$logout" 2>&1 & echo $! > $pid sleep 1 + check_after_start --- End diff -- ð Will fix this, thanks! ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179158686 --- Diff: distribution/src/resources/drillbit.sh --- @@ -127,6 +127,44 @@ check_before_start() fi } +check_after_start(){ +#check if the process is running +if [ -f $pid ]; then + dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit) + if [ -n "$dbitProc" ]; then +# Check and enforce for CGroup +if [ -n "$DRILLBIT_CGROUP" ]; then + check_and_enforce_cgroup `cat $pid` +fi + fi +fi +} + +check_and_enforce_cgroup(){ +dbitPid=$1; +#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then --- End diff -- Thanks for catching that! ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179158585 --- Diff: distribution/src/resources/drillbit.sh --- @@ -127,6 +127,44 @@ check_before_start() fi } +check_after_start(){ +#check if the process is running +if [ -f $pid ]; then + dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit) --- End diff -- Agreed. I'll incorporate this change. ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179158191 --- Diff: distribution/src/resources/drill-env.sh --- @@ -86,6 +86,12 @@ #export DRILL_PID_DIR=${DRILL_PID_DIR:-$DRILL_HOME} +# CGroup to which the Drillbit belong when running as a daemon using --- End diff -- :) I did speculate about enforcement for other resources, but it seems that CPU is the primary resource that needs to be managed. Memory 'management' (by YARN, for e.g.) otherwise works without the need for CGroup. ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179156839 --- Diff: distribution/src/resources/drillbit.sh --- @@ -127,6 +127,44 @@ check_before_start() fi } +check_after_start(){ +#check if the process is running +if [ -f $pid ]; then + dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit) + if [ -n "$dbitProc" ]; then +# Check and enforce for CGroup +if [ -n "$DRILLBIT_CGROUP" ]; then + check_and_enforce_cgroup `cat $pid` +fi + fi +fi +} + +check_and_enforce_cgroup(){ +dbitPid=$1; +#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then +if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then --- End diff -- I've looked to check for older versions, but wasn't sure which one aligned to which version and OS. It's easy to check if CGroups is running, but I don't see a guaranteed way of confirming specifically about that CGroup. Let me do some more research on this front. ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179026313 --- Diff: distribution/src/resources/drillbit.sh --- @@ -127,6 +127,44 @@ check_before_start() fi } +check_after_start(){ +#check if the process is running +if [ -f $pid ]; then + dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit) + if [ -n "$dbitProc" ]; then +# Check and enforce for CGroup +if [ -n "$DRILLBIT_CGROUP" ]; then + check_and_enforce_cgroup `cat $pid` +fi + fi +fi +} + +check_and_enforce_cgroup(){ +dbitPid=$1; +#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then --- End diff -- Remove commented out line ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179027582 --- Diff: distribution/src/resources/drillbit.sh --- @@ -154,6 +192,7 @@ start_bit ( ) nohup nice -n $DRILL_NICENESS "$DRILL_HOME/bin/runbit" exec ${args[@]} >> "$logout" 2>&1 & echo $! > $pid sleep 1 + check_after_start --- End diff -- To make things easier: ``` procId=$! echo $procId > $pid # Yeah, $pid is a file, $procId is the pid... sleep 1 check_after_start $procId ``` Also, we now have to naming conventions: `waitForProcessEnd` and `check_after_start`. Doesn't match which we choose, but we should stick with it. Where is check style for scripts? ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179025617 --- Diff: distribution/src/resources/drill-env.sh --- @@ -86,6 +86,12 @@ #export DRILL_PID_DIR=${DRILL_PID_DIR:-$DRILL_HOME} +# CGroup to which the Drillbit belong when running as a daemon using --- End diff -- Also, maybe provide a bit more of a description. My guess is that the CGroup must already be set up? That we only enforce CPU? Or, is what we enforce determined entirely by what the user has configured? If so, wouldn't "drill" be a better CGroup name? ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179028127 --- Diff: distribution/src/resources/drillbit.sh --- @@ -127,6 +127,44 @@ check_before_start() fi } +check_after_start(){ +#check if the process is running +if [ -f $pid ]; then + dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit) + if [ -n "$dbitProc" ]; then +# Check and enforce for CGroup +if [ -n "$DRILLBIT_CGROUP" ]; then + check_and_enforce_cgroup `cat $pid` +fi + fi +fi +} + +check_and_enforce_cgroup(){ +dbitPid=$1; +#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then +if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then + echo $dbitPid > /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs + # Verify Enforcement + cgroupStatus=`grep -w $pid /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs` + if [ -z "$cgroupStatus" ]; then +#Ref: https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt +let cpu_quota=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us` +let cpu_period=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_period_us` +if [ $cpu_period -gt 0 ] && [ $cpu_quota -gt 0 ]; then + coresAllowed="(up to "`echo $(( 100 * $cpu_quota / $cpu_period )) | sed 's/..$/.&/'`" cores allowed)" +fi +echo "WARN: Drillbit's CPU resource usage will be managed under the CGroup : $DRILLBIT_CGROUP "$coresAllowed + else +echo "ERROR: Failed to add Drillbit to CGroup ( $DRILLBIT_CGROUP ) for resource usage management. Ensure that the cgroup manages CPU" +exit 1 + fi +else + echo "ERROR: cgroup $DRILLBIT_CGROUP does not found. Ensure that daemon is running and cgroup exists" + exit 1 --- End diff -- Not sure we want to fail the script in this case. Failure means that the Drillbit did not start, but it did. ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179028625 --- Diff: distribution/src/resources/drillbit.sh --- @@ -127,6 +127,44 @@ check_before_start() fi } +check_after_start(){ +#check if the process is running +if [ -f $pid ]; then + dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit) + if [ -n "$dbitProc" ]; then +# Check and enforce for CGroup +if [ -n "$DRILLBIT_CGROUP" ]; then + check_and_enforce_cgroup `cat $pid` +fi + fi +fi +} + +check_and_enforce_cgroup(){ +dbitPid=$1; +#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then +if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then + echo $dbitPid > /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs + # Verify Enforcement + cgroupStatus=`grep -w $pid /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs` + if [ -z "$cgroupStatus" ]; then +#Ref: https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt +let cpu_quota=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us` --- End diff -- `let` is used only for math. Just say `cpu_quota=cat...` ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179027761 --- Diff: distribution/src/resources/drillbit.sh --- @@ -127,6 +127,44 @@ check_before_start() fi } +check_after_start(){ +#check if the process is running +if [ -f $pid ]; then + dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit) + if [ -n "$dbitProc" ]; then +# Check and enforce for CGroup +if [ -n "$DRILLBIT_CGROUP" ]; then + check_and_enforce_cgroup `cat $pid` +fi + fi +fi +} + +check_and_enforce_cgroup(){ +dbitPid=$1; +#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then +if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then + echo $dbitPid > /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs + # Verify Enforcement + cgroupStatus=`grep -w $pid /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs` + if [ -z "$cgroupStatus" ]; then +#Ref: https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt +let cpu_quota=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us` +let cpu_period=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_period_us` +if [ $cpu_period -gt 0 ] && [ $cpu_quota -gt 0 ]; then + coresAllowed="(up to "`echo $(( 100 * $cpu_quota / $cpu_period )) | sed 's/..$/.&/'`" cores allowed)" +fi +echo "WARN: Drillbit's CPU resource usage will be managed under the CGroup : $DRILLBIT_CGROUP "$coresAllowed --- End diff -- Why is this a WARN? Isn't this exactly what I requested? Maybe emit this as a message (no WARN) if a -v (verbose) flag is set. Also, put the variable inside the quotes; bash is handy that way... Message will be "...CGroup : drillcpu 5". Maybe, "CGroup drillcpu will limit Drill to 5 cpu(s)". ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179028078 --- Diff: distribution/src/resources/drillbit.sh --- @@ -127,6 +127,44 @@ check_before_start() fi } +check_after_start(){ +#check if the process is running +if [ -f $pid ]; then + dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit) + if [ -n "$dbitProc" ]; then +# Check and enforce for CGroup +if [ -n "$DRILLBIT_CGROUP" ]; then + check_and_enforce_cgroup `cat $pid` +fi + fi +fi +} + +check_and_enforce_cgroup(){ +dbitPid=$1; +#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then +if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then + echo $dbitPid > /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs + # Verify Enforcement + cgroupStatus=`grep -w $pid /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs` + if [ -z "$cgroupStatus" ]; then +#Ref: https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt +let cpu_quota=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us` +let cpu_period=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_period_us` +if [ $cpu_period -gt 0 ] && [ $cpu_quota -gt 0 ]; then + coresAllowed="(up to "`echo $(( 100 * $cpu_quota / $cpu_period )) | sed 's/..$/.&/'`" cores allowed)" +fi +echo "WARN: Drillbit's CPU resource usage will be managed under the CGroup : $DRILLBIT_CGROUP "$coresAllowed + else +echo "ERROR: Failed to add Drillbit to CGroup ( $DRILLBIT_CGROUP ) for resource usage management. Ensure that the cgroup manages CPU" +exit 1 + fi +else + echo "ERROR: cgroup $DRILLBIT_CGROUP does not found. Ensure that daemon is running and cgroup exists" --- End diff -- cgroup --> CGroup (or at least make the format consistent across messages) does not found --> not found We can check if the daemon is running using `kill -0 $pid`. If so, then we just tell the user something line "Check CGroup configuration." Also, errors should go to stdout: `echo "msg" >&2` ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179026462 --- Diff: distribution/src/resources/drillbit.sh --- @@ -127,6 +127,44 @@ check_before_start() fi } +check_after_start(){ +#check if the process is running +if [ -f $pid ]; then + dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit) + if [ -n "$dbitProc" ]; then +# Check and enforce for CGroup +if [ -n "$DRILLBIT_CGROUP" ]; then + check_and_enforce_cgroup `cat $pid` +fi + fi +fi +} + +check_and_enforce_cgroup(){ +dbitPid=$1; +#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then +if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then --- End diff -- I believe cgroup setup is OS-specific. There are, AFAIK, multiple group versions. For this reason, I wonder if this needs to have checks. Are there scripts we can crib from somewhere that do those checks? ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179025777 --- Diff: distribution/src/resources/drillbit.sh --- @@ -127,6 +127,44 @@ check_before_start() fi } +check_after_start(){ +#check if the process is running +if [ -f $pid ]; then + dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit) --- End diff -- If you get this far, the pid must exist or the scripts are broken. The only question is whether the process is still running. The chance that it is not is very low. Also, there is a race condition: the process may exit just after we check it. So, not sure it is even worth doing the check. Finally, the standard way to check for the process is `kill -0 $pid`". See `waitForProcessEnd()`. ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179028377 --- Diff: distribution/src/resources/drillbit.sh --- @@ -127,6 +127,44 @@ check_before_start() fi } +check_after_start(){ +#check if the process is running +if [ -f $pid ]; then + dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit) + if [ -n "$dbitProc" ]; then +# Check and enforce for CGroup +if [ -n "$DRILLBIT_CGROUP" ]; then + check_and_enforce_cgroup `cat $pid` +fi + fi +fi +} + +check_and_enforce_cgroup(){ +dbitPid=$1; +#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then +if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then + echo $dbitPid > /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs + # Verify Enforcement + cgroupStatus=`grep -w $pid /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs` + if [ -z "$cgroupStatus" ]; then +#Ref: https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt +let cpu_quota=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us` +let cpu_period=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_period_us` +if [ $cpu_period -gt 0 ] && [ $cpu_quota -gt 0 ]; then + coresAllowed="(up to "`echo $(( 100 * $cpu_quota / $cpu_period )) | sed 's/..$/.&/'`" cores allowed)" +fi +echo "WARN: Drillbit's CPU resource usage will be managed under the CGroup : $DRILLBIT_CGROUP "$coresAllowed + else +echo "ERROR: Failed to add Drillbit to CGroup ( $DRILLBIT_CGROUP ) for resource usage management. Ensure that the cgroup manages CPU" +exit 1 --- End diff -- See below about `exit 1` ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179025400 --- Diff: distribution/src/resources/drill-env.sh --- @@ -86,6 +86,12 @@ #export DRILL_PID_DIR=${DRILL_PID_DIR:-$DRILL_HOME} +# CGroup to which the Drillbit belong when running as a daemon using --- End diff -- belong --> belongs ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179027910 --- Diff: distribution/src/resources/drillbit.sh --- @@ -127,6 +127,44 @@ check_before_start() fi } +check_after_start(){ +#check if the process is running +if [ -f $pid ]; then + dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit) + if [ -n "$dbitProc" ]; then +# Check and enforce for CGroup +if [ -n "$DRILLBIT_CGROUP" ]; then + check_and_enforce_cgroup `cat $pid` +fi + fi +fi +} + +check_and_enforce_cgroup(){ +dbitPid=$1; +#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then +if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then + echo $dbitPid > /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs + # Verify Enforcement + cgroupStatus=`grep -w $pid /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs` + if [ -z "$cgroupStatus" ]; then +#Ref: https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt +let cpu_quota=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us` +let cpu_period=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_period_us` +if [ $cpu_period -gt 0 ] && [ $cpu_quota -gt 0 ]; then + coresAllowed="(up to "`echo $(( 100 * $cpu_quota / $cpu_period )) | sed 's/..$/.&/'`" cores allowed)" +fi +echo "WARN: Drillbit's CPU resource usage will be managed under the CGroup : $DRILLBIT_CGROUP "$coresAllowed + else +echo "ERROR: Failed to add Drillbit to CGroup ( $DRILLBIT_CGROUP ) for resource usage management. Ensure that the cgroup manages CPU" --- End diff -- Inconsistent format: parens here, colon above. Actually, neither are needed. Also, "for resource usage management" --> "for cpu" ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
Github user Ben-Zvi commented on a diff in the pull request: https://github.com/apache/drill/pull/1200#discussion_r179005732 --- Diff: distribution/src/resources/drillbit.sh --- @@ -127,6 +127,44 @@ check_before_start() fi } +check_after_start(){ +#check if the process is running +if [ -f $pid ]; then + dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit) + if [ -n "$dbitProc" ]; then +# Check and enforce for CGroup +if [ -n "$DRILLBIT_CGROUP" ]; then + check_and_enforce_cgroup `cat $pid` +fi + fi +fi +} + +check_and_enforce_cgroup(){ +dbitPid=$1; +#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then +if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then --- End diff -- Why is the path hardcoded ? Is it a standard ? I checked on my cluster, where /sys/fs/cgroup/. is used instead. ---
[GitHub] drill pull request #1200: DRILL-143: Support CGROUPs resource management
GitHub user kkhatua opened a pull request: https://github.com/apache/drill/pull/1200 DRILL-143: Support CGROUPs resource management Introduces the `DRILLBIT_CGROUP` option in defined in `drill-env.sh` The startup script checks if the specified CGroup (ver 2) is available and tries to apply it to the launched Drillbit JVM. This would benefit not just Drill-on-YARN usecases, but any setup that would like CGroups for enforcement of (cpu) resources management. e.g when Drillbit is configured to use `drillcpu` cgroup ``` [root@maprlabs ~]# /opt/mapr/drill/apache-drill-1.14.0-SNAPSHOT/bin/drillbit.sh restart Stopping drillbit .. Starting drillbit, logging to /var/log/drill/drillbit.out WARN: Drillbit's CPU resource usage will be managed under the CGroup : drillcpu (up to 4.00 cores allowed) ``` e.g. Non-existent CGroup `droolcpu` is used ``` [root@maprlabs ~]# /opt/mapr/drill/apache-drill-1.14.0-SNAPSHOT/bin/drillbit.sh restart Stopping drillbit .. Starting drillbit, logging to /var/log/drill/drillbit.out ERROR: cgroup droolcpu does not found. Ensure that daemon is running and cgroup exists ``` You can merge this pull request into a Git repository by running: $ git pull https://github.com/kkhatua/drill DRILL-143 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1200.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1200 commit e9a2551b315b4395c5227dad017f2b4340f41108 Author: Kunal KhatuaDate: 2018-04-03T05:35:53Z DRILL-143: Support CGROUPs resource management Introduces the DRILLBIT_CGROUP option in drill-env.sh. The startup script checks if the specified CGroup (ver 2) is available and tries to apply it to the launched Drillbit JVM. This would benefit not just Drill-on-YARN usecases, but any setup that would like CGroups for enforcement of (cpu) resources management. e.g when Drillbit is configured to use `drillcpu` cgroup ``` [root@maprlabs ~]# /opt/mapr/drill/apache-drill-1.14.0-SNAPSHOT/bin/drillbit.sh restart Stopping drillbit .. Starting drillbit, logging to /var/log/drill/drillbit.out WARN: Drillbit's CPU resource usage will be managed under the CGroup : drillcpu (up to 4.00 cores allowed) ``` e.g. Non-existent CGroup `droolcpu` is used ``` [root@kk127 ~]# /opt/mapr/drill/apache-drill-1.14.0-SNAPSHOT/bin/drillbit.sh restart Stopping drillbit .. Starting drillbit, logging to /var/log/drill/drillbit.out ERROR: cgroup droolcpu does not found. Ensure that daemon is running and cgroup exists ``` ---