[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-06-14 Thread Bridget Bevens (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513121#comment-16513121
 ] 

Bridget Bevens commented on DRILL-143:
--

Doc updated here: 
https://drill.apache.org/docs/configuring-cgroups-to-control-cpu-usage/ 


> Support CGROUPs resource management
> ---
>
> Key: DRILL-143
> URL: https://issues.apache.org/jira/browse/DRILL-143
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Kunal Khatua
>Priority: Major
>  Labels: doc-complete, ready-to-commit
> Fix For: 1.14.0
>
> Attachments: 253ce178-ddeb-e482-cd64-44ab7284ad1c.sys.drill
>
>
> For the purpose of playing nice on clusters that don't have YARN, we should 
> write up configuration and scripts to allow users to run Drill next to 
> existing workloads without sharing resources.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-05-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465258#comment-16465258
 ] 

ASF GitHub Bot commented on DRILL-143:
--

asfgit closed pull request #1239: DRILL-143: CGroup Support for Drill-on-YARN
URL: https://github.com/apache/drill/pull/1239
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/distribution/src/resources/drillbit.sh b/distribution/src/resources/drillbit.sh
index 7dad34cf50..88d56c8a14 100755
--- a/distribution/src/resources/drillbit.sh
+++ b/distribution/src/resources/drillbit.sh
@@ -119,9 +119,9 @@ check_before_start()
 {
   #check that the process is not running
   mkdir -p "$DRILL_PID_DIR"
-  if [ -f $pid ]; then
-if kill -0 `cat $pid` > /dev/null 2>&1; then
-  echo "$command is already running as process `cat $pid`.  Stop it first."
+  if [ -f $pidFile ]; then
+if kill -0 `cat $pidFile` > /dev/null 2>&1; then
+  echo "$command is already running as process `cat $pidFile`.  Stop it first."
   exit 1
 fi
   fi
@@ -146,8 +146,8 @@ check_and_enforce_cgroup(){
 if [ -f $SYS_CGROUP_DIR/cpu/$DRILLBIT_CGROUP/cgroup.procs ]; then
   echo $dbitPid > $SYS_CGROUP_DIR/cpu/$DRILLBIT_CGROUP/cgroup.procs
   # Verify Enforcement
-  cgroupStatus=`grep -w $pid $SYS_CGROUP_DIR/cpu/${DRILLBIT_CGROUP}/cgroup.procs`
-  if [ -z "$cgroupStatus" ]; then
+  cgroupStatus=`grep -w $dbitPid $SYS_CGROUP_DIR/cpu/${DRILLBIT_CGROUP}/cgroup.procs`
+  if [ -n "$cgroupStatus" ]; then
 #Ref: https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt
 cpu_quota=`cat ${SYS_CGROUP_DIR}/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us`
 cpu_period=`cat ${SYS_CGROUP_DIR}/cpu/${DRILLBIT_CGROUP}/cpu.cfs_period_us`
@@ -189,8 +189,8 @@ start_bit ( )
   echo "`ulimit -a`" >> "$DRILLBIT_LOG_PATH" 2>&1
   nohup nice -n $DRILL_NICENESS "$DRILL_HOME/bin/runbit" exec ${args[@]} >> "$logout" 2>&1 &
   procId=$!
-  echo $procId > $pid # Yeah, $pid is a file, $procId is the pid...
-  echo $! > $pid
+  echo $procId > $pidFile # Yeah, $pidFile is a file, $procId is the pid...
+  echo $! > $pidFile
   sleep 1
   check_after_start $procId
 }
@@ -198,8 +198,8 @@ start_bit ( )
 stop_bit ( )
 {
   kill_drillbit=$1
-  if [ -f $pid ]; then
-pidToKill=`cat $pid`
+  if [ -f $pidFile ]; then
+pidToKill=`cat $pidFile`
 # kill -0 == see if the PID exists
 if kill -0 $pidToKill > /dev/null 2>&1; then
   echo "Stopping $command"
@@ -211,15 +211,15 @@ stop_bit ( )
   retval=$?
   echo "No $command to stop because kill -0 of pid $pidToKill failed with status $retval"
 fi
-rm $pid > /dev/null 2>&1
+rm $pidFile > /dev/null 2>&1
   else
-echo "No $command to stop because no pid file $pid"
+echo "No $command to stop because no pid file $pidFile"
 retval=1
   fi
   return $retval
 }
 
-pid=$DRILL_PID_DIR/drillbit.pid
+pidFile=$DRILL_PID_DIR/drillbit.pid
 logout="${DRILL_LOG_PREFIX}.out"
 
 thiscmd=$0
@@ -271,12 +271,12 @@ case $startStopStatus in
   ;;
 
 (status)
-  if [ -f $pid ]; then
-TARGET_PID=`cat $pid`
+  if [ -f $pidFile ]; then
+TARGET_PID=`cat $pidFile`
 if kill -0 $TARGET_PID > /dev/null 2>&1; then
   echo "$command is running."
 else
-  echo "$pid file is present but $command is not running."
+  echo "$pidFile file is present but $command is not running."
   exit 1
 fi
   else
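
The corrected verification step in the diff above (grep for the Drillbit PID rather than the PID file path, and report success when the match is non-empty) can be sketched in isolation. This is a minimal illustration, not the code from `drillbit.sh`: the function name `enforce_cgroup` and its second argument are assumptions made here, with the directory argument standing in for `$SYS_CGROUP_DIR/cpu/$DRILLBIT_CGROUP`.

```shell
#!/usr/bin/env bash
# Sketch only. enforce_cgroup is a hypothetical helper, not a Drill function.
# Usage: enforce_cgroup <pid> <cgroup-dir>
enforce_cgroup() {
  local dbitPid=$1 cgroupDir=$2
  local procsFile="$cgroupDir/cgroup.procs"

  if [ ! -f "$procsFile" ]; then
    echo "ERROR: no cgroup.procs found in $cgroupDir" >&2
    return 1
  fi

  # Writing a PID to cgroup.procs asks the kernel to move that process into the cgroup.
  echo "$dbitPid" > "$procsFile"

  # Verify enforcement: the PID itself must now be listed (whole-word match).
  if grep -qw "$dbitPid" "$procsFile"; then
    local quota period
    quota=$(cat "$cgroupDir/cpu.cfs_quota_us")
    period=$(cat "$cgroupDir/cpu.cfs_period_us")
    # A quota of -1 means no bandwidth limit is configured.
    if [ "$quota" -gt 0 ] && [ "$period" -gt 0 ]; then
      echo "pid $dbitPid limited to $((quota * 100 / period))% of one CPU"
    else
      echo "pid $dbitPid is in the cgroup; no CPU quota is set"
    fi
  else
    echo "ERROR: pid $dbitPid was not added to $cgroupDir" >&2
    return 1
  fi
}
```

With the CFS bandwidth controller, the ratio of `cpu.cfs_quota_us` to `cpu.cfs_period_us` is the CPU share the group may consume per period, so a quota of 400000 against a period of 100000 corresponds to four CPUs' worth of run time.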


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-05-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465220#comment-16465220
 ] 

ASF GitHub Bot commented on DRILL-143:
--

vvysotskyi commented on issue #1239: DRILL-143: CGroup Support for Drill-on-YARN
URL: https://github.com/apache/drill/pull/1239#issuecomment-386892638
 
 
   @paul-rogers, @kkhatua, thanks for the clarification, will merge it soon.




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-05-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465215#comment-16465215
 ] 

ASF GitHub Bot commented on DRILL-143:
--

kkhatua commented on issue #1239: DRILL-143: CGroup Support for Drill-on-YARN
URL: https://github.com/apache/drill/pull/1239#issuecomment-386891400
 
 
   @vvysotskyi  Yes. This commit should carry only changes to one file, instead 
of the original 2 files. You can go ahead and commit it.




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16464938#comment-16464938
 ] 

ASF GitHub Bot commented on DRILL-143:
--

paul-rogers commented on issue #1239: DRILL-143: CGroup Support for 
Drill-on-YARN
URL: https://github.com/apache/drill/pull/1239#issuecomment-386842324
 
 
   Please check with @kkhatua about his intention. The change is OK as far as 
it goes; the discussion was about whether it went far enough.




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16464800#comment-16464800
 ] 

ASF GitHub Bot commented on DRILL-143:
--

vvysotskyi commented on issue #1239: DRILL-143: CGroup Support for Drill-on-YARN
URL: https://github.com/apache/drill/pull/1239#issuecomment-386811130
 
 
   @paul-rogers, @kkhatua, is this PR ready to be merged, or should some 
additional work be done?




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458877#comment-16458877
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/1239
  
Yes, I agree. Support folks gave me similar feedback, so I'll commit this 
change in the MapR distro _IFF_ there is a request for that. YARN is already a 
complex beast with numerous settings. Introducing caveats like this will only 
add to the confusion.




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458866#comment-16458866
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/1239
  
Just to be clear, I have no objection to Drill enforcing its own cgroup limits.

My point is rather that CPU limits must be integrated with YARN, via the 
DoY config file, so that the user specifies CPU limits in one place and that 
limit is the same one passed to YARN for container allocation and to whichever 
code is doing cgroup enforcement. By default, it will be YARN if cgroups are 
enabled in YARN, as they can be for Apache YARN.

It is then fine to have an additional option to enable self-enforcement in 
Drill for those odd cases where users cannot or choose not to use versions of 
YARN that do the work.

The problem would be if the user has to configure CPU in two places: in a 
shell script to enable self-enforcement, and in the DoY config to tell YARN the 
container size. This goes against the ease-of-use experience DoY was designed 
to provide. (That is, getting multiple settings to work consistently and 
correctly is very hard if you're just trying to get Drill to run and are not a 
Drill internals expert.)




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458850#comment-16458850
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/1239
  
@paul-rogers I went through with support on this and found that the issue 
is not specific to MapR. However, you make a strong argument in favor of 
letting YARN handle the CGroup management rather than Drill over-reaching. 
I've reverted the change to `yarn-drillbit.sh` in the latest commit.




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452883#comment-16452883
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/1239
  
One other thing to highlight from an earlier comment. CPU is something that 
the user specifies in the DoY config file. That information is passed to YARN 
in container requests. This feature asks the user to modify the drill-env.sh 
file to enable cgroups to limit CPU. As a result, the user must specify the CPU 
limit in two places.

DoY went to extreme lengths to unify memory configuration so it is set in 
one place. The assumption was that, since Apache YARN handles cgroups, we've 
also got unified CPU specification. But, in this "side-car" approach we don't. 
So, it would be good to capture the CPU amount from the YARN config as explained 
earlier and use that to set the Drill cgroups env vars -- but only if some 
"self-enforcing cgroup" flag is enabled in the config (which can be done by 
default for the limited MapR YARN.)
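
The idea in the comment above, a single CPU figure that drives both the YARN container request and the cgroup limit, with self-enforcement gated by a flag, might look roughly like the sketch below. The variable names `DRILL_SELF_ENFORCE_CGROUP` and `DRILL_YARN_VCORES` and the helper `cpu_quota_us` are hypothetical, invented for illustration; they are not settings that Drill or Drill-on-YARN is known to define.

```shell
#!/usr/bin/env bash
# Sketch only. All names below are hypothetical, not real Drill settings.
# cpu_quota_us <vcores> [period_us]
# Converts a whole number of cores into a CFS quota for the given period.
# 100000 us (100 ms) is the kernel's default cpu.cfs_period_us.
cpu_quota_us() {
  local vcores=$1 period=${2:-100000}
  echo $((vcores * period))
}

# Off by default: only act when the (hypothetical) flag is explicitly enabled
# and a core count has been handed down from the single config source.
if [ "${DRILL_SELF_ENFORCE_CGROUP:-false}" = "true" ] && [ -n "${DRILL_YARN_VCORES:-}" ]; then
  echo "would set cpu.cfs_quota_us=$(cpu_quota_us "$DRILL_YARN_VCORES")"
fi
```

Deriving the quota from the same core count that is sent to YARN keeps the two limits from drifting apart, which is the ease-of-use concern raised in the comment.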




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452877#comment-16452877
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/1239
  
@kkhatua, putting on my Apache hat... Apache Drill is an Apache project 
that must work with other Apache projects such as Apache YARN. The Apache Drill 
DoY support is designed to work well with Apache YARN (and has a few special 
additions for MapR YARN's unique limitations.) It is important that the Apache 
DoY work well with the generic YARN. No harm in adding tweaks (such as this 
one) to work with vendor-specific limitations. But, the overriding concern is 
that the DoY feature be useful in Apache. 




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452872#comment-16452872
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1239#discussion_r184169114
  
--- Diff: distribution/src/resources/yarn-drillbit.sh ---
@@ -175,4 +209,11 @@ fi
 echo "`date` Starting drillbit on `hostname` under YARN, logging to $DRILLBIT_LOG_PATH"
 echo "`ulimit -a`" >> "$DRILLBIT_LOG_PATH" 2>&1
 
-"$DRILL_HOME/bin/runbit" exec
+# Run in background
+"$DRILL_HOME/bin/runbit" exec &
--- End diff --

Under YARN, it is YARN that maintains the pid, not Drill. YARN expects its 
child processes to run in the foreground and will handle capturing the pid.

This is a case in which "native" Apache YARN works differently than "MapR 
YARN." Since Apache YARN handles cgroups, it is the one that needs the pid. 
Under MapR's limited YARN, Drill is second-guessing YARN and needs the 
pid. It may be that MapR's YARN can handle a background process; I don't recall 
the details.

Is there a way to run Drill in the background, get the pid, then return 
to the foreground so that the script does not exit until Drill itself exits?

In fact, if I remember correctly, the scripts have two layers; in one 
layer, the script replaces itself with the Drill process. Something to check.
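
One general shell pattern for the question above is to start the child with `&`, read its PID from `$!`, and then block on `wait` so the launching script does not exit until the child does. The sketch below shows only that pattern; the function name `run_and_wait` is made up for illustration, and whether YARN's container handling is satisfied by it is not established in this thread.

```shell
#!/usr/bin/env bash
# Sketch only. run_and_wait is a hypothetical helper, not part of Drill's scripts.
# Usage: run_and_wait <command> [args...]
run_and_wait() {
  # Start the command in the background just long enough to learn its PID.
  "$@" &
  local childPid=$!
  echo "started pid $childPid" >&2

  # Per-PID work (for example, cgroup enrollment) would go here.

  # Block until the child exits; wait returns the child's exit status,
  # which becomes the function's return status.
  wait "$childPid"
}
```

Note that the wrapper script remains the process the launcher sees, so a signal sent to the script is not delivered to the child unless the script traps and forwards it. That differs from a script layer that replaces itself with the Drill process via `exec`.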




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452793#comment-16452793
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/1239
  
@paul-rogers DoY is no longer a MapR-only feature, and if it helps to have 
Drill self-enforce, this works. If YARN is able to enforce for Drill, the user 
need not specify the settings in their `drill-env.sh`. 

I see https://issues.apache.org/jira/browse/YARN-810 as being an open 
issue. (Source: [Issue 
List](https://issues.apache.org/jira/issues/?jql=project%20%3D%20YARN%20AND%20status%20in%20(Open%2C%20"Patch%20Available")%20AND%20text%20~%20"cgroup"%20ORDER%20BY%20key%20%2C%20status%20ASC)
 ). So, I probably don't need to do any explicit checks for hadoop distros and 
documentation should be sufficient to address this, IMO.




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452774#comment-16452774
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1239#discussion_r184153040
  
--- Diff: distribution/src/resources/yarn-drillbit.sh ---
@@ -175,4 +209,11 @@ fi
 echo "`date` Starting drillbit on `hostname` under YARN, logging to $DRILLBIT_LOG_PATH"
 echo "`ulimit -a`" >> "$DRILLBIT_LOG_PATH" 2>&1
 
-"$DRILL_HOME/bin/runbit" exec
+# Run in background
+"$DRILL_HOME/bin/runbit" exec &
--- End diff --

The process is momentarily in the background to capture the PID. We 
eventually wait for it. Are you saying that a process will not continue to run 
because it is in the background??




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452711#comment-16452711
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/1239
  
There may be some misunderstanding of how DoY works. The only info that 
users can pass to DoY is that which is in the DoY config file. We should add 
arguments to that file which will be passed through the DoY client to the DoY 
AM, and from there, as an env var, to the Drillbit containers. We already do 
this for memory so that both Drill and YARN agree on memory. We should do the 
same for CPU so that Drill, YARN and cgroups agree on the number of CPUs that 
Drill can use.

Since this feature is MapR-only, it might be OK to require that users alter 
their `drill-env.sh` to set an environment variable that forces Drill to police 
itself under YARN.




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452712#comment-16452712
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1239#discussion_r184144439
  
--- Diff: distribution/src/resources/yarn-drillbit.sh ---
@@ -175,4 +209,11 @@ fi
 echo "`date` Starting drillbit on `hostname` under YARN, logging to $DRILLBIT_LOG_PATH"
 echo "`ulimit -a`" >> "$DRILLBIT_LOG_PATH" 2>&1
 
-"$DRILL_HOME/bin/runbit" exec
+# Run in background
+"$DRILL_HOME/bin/runbit" exec &
--- End diff --

This can't be for YARN. Under YARN, Drill must be run in the foreground. 
The original Drill-on-YARN work ensured that all this works.




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452706#comment-16452706
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/1239
  
Thanks for that pointer, @paul-rogers ! I'll make the relevant changes and 
add to this commit.




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452696#comment-16452696
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/1239
  
@kkhatua, it turns out that upstream YARN has long had effective cgroup 
support per container. (I have the pleasure of sitting near the guy who 
maintains that work.) There has long been a discussion about whether the MapR 
version of YARN picked up those changes; we believe that MapR does *not* 
support this upstream work.

As a result, under Apache YARN, the YARN NM itself will impose cgroup 
controls and Drill need not do it itself. For MapR YARN (only) Drill (and all 
other YARN apps) must do their own cgroup control.

Please make sure that this feature is off by default to allow YARN to do 
the work. Only enable it for versions of YARN (such as MapR) which do not 
provide cgroup control in YARN itself.




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450198#comment-16450198
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1239#discussion_r183805056
  
--- Diff: distribution/src/resources/yarn-drillbit.sh ---
@@ -110,6 +114,36 @@
 # Enables Java GC logging. Passed from the drill.yarn.drillbit.log-gc
 # garbage collection option.
 
+### Function to enforce CGroup (Refer local drillbit.sh)
+check_and_enforce_cgroup(){
+dbitPid=$1;
+kill -0 $dbitPid
+if [ $? -gt 0 ]; then 
+  echo "ERROR: Failed to add Drillbit to CGroup ( $DRILLBIT_CGROUP ) for 'cpu'. Ensure that the Drillbit ( pid=$dbitPid ) started up." >&2
+  exit 1
+fi
+SYS_CGROUP_DIR=${SYS_CGROUP_DIR:-"/sys/fs/cgroup"}
+if [ -f $SYS_CGROUP_DIR/cpu/$DRILLBIT_CGROUP/cgroup.procs ]; then
+  echo $dbitPid > $SYS_CGROUP_DIR/cpu/$DRILLBIT_CGROUP/cgroup.procs
+  # Verify Enforcement
+  cgroupStatus=`grep -w $pid $SYS_CGROUP_DIR/cpu/${DRILLBIT_CGROUP}/cgroup.procs`
+  if [ -z "$cgroupStatus" ]; then
--- End diff --

Fixed the changes.




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449858#comment-16449858
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1239#discussion_r183732780
  
--- Diff: distribution/src/resources/yarn-drillbit.sh ---
@@ -110,6 +114,36 @@
 # Enables Java GC logging. Passed from the drill.yarn.drillbit.log-gc
 # garbage collection option.
 
+### Function to enforce CGroup (Refer local drillbit.sh)
+check_and_enforce_cgroup(){
+dbitPid=$1;
+kill -0 $dbitPid
+if [ $? -gt 0 ]; then 
+  echo "ERROR: Failed to add Drillbit to CGroup ( $DRILLBIT_CGROUP ) 
for 'cpu'. Ensure that the Drillbit ( pid=$dbitPid ) started up." >&2
+  exit 1
+fi
+SYS_CGROUP_DIR=${SYS_CGROUP_DIR:-"/sys/fs/cgroup"}
+if [ -f $SYS_CGROUP_DIR/cpu/$DRILLBIT_CGROUP/cgroup.procs ]; then
+  echo $dbitPid > $SYS_CGROUP_DIR/cpu/$DRILLBIT_CGROUP/cgroup.procs
+  # Verify Enforcement
+  cgroupStatus=`grep -w $pid 
$SYS_CGROUP_DIR/cpu/${DRILLBIT_CGROUP}/cgroup.procs`
+  if [ -z "$cgroupStatus" ]; then
--- End diff --

You're right. I seem to have missed negating the check. Since this check 
only affects publication of a message and not the actual application of the 
CGroup, we didn't catch it during testing. I'll fix this port and the original 
patch as well.
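
A minimal sketch of the corrected verification, assuming the fix is to test for a non-empty match on the numeric PID (`$dbitPid`) instead of an empty one. The function name `verify_cgroup_membership` is hypothetical and not part of the patch; `SYS_CGROUP_DIR` and `DRILLBIT_CGROUP` are the variables used in the diff above.

```shell
# Hypothetical helper: returns 0 if the PID is listed in the CGroup's
# cgroup.procs file, 1 otherwise.
verify_cgroup_membership() {
  local dbitPid="$1"
  local procsFile="${SYS_CGROUP_DIR:-/sys/fs/cgroup}/cpu/${DRILLBIT_CGROUP}/cgroup.procs"
  # -x matches whole lines only, so PID 123 does not match an entry of 1234
  if grep -qx "$dbitPid" "$procsFile" 2>/dev/null; then
    echo "INFO: Drillbit ( pid=$dbitPid ) is managed under CGroup $DRILLBIT_CGROUP"
    return 0
  fi
  echo "ERROR: Failed to add Drillbit ( pid=$dbitPid ) to CGroup $DRILLBIT_CGROUP" >&2
  return 1
}
```

Returning a status instead of calling `exit` leaves the decision about stopping the Drillbit to the caller.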




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449301#comment-16449301
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1239#discussion_r183607110
  
--- Diff: distribution/src/resources/yarn-drillbit.sh ---
@@ -110,6 +114,36 @@
 # Enables Java GC logging. Passed from the drill.yarn.drillbit.log-gc
 # garbage collection option.
 
+### Function to enforce CGroup (Refer local drillbit.sh)
+check_and_enforce_cgroup(){
+dbitPid=$1;
+kill -0 $dbitPid
+if [ $? -gt 0 ]; then 
+  echo "ERROR: Failed to add Drillbit to CGroup ( $DRILLBIT_CGROUP ) 
for 'cpu'. Ensure that the Drillbit ( pid=$dbitPid ) started up." >&2
+  exit 1
+fi
+SYS_CGROUP_DIR=${SYS_CGROUP_DIR:-"/sys/fs/cgroup"}
+if [ -f $SYS_CGROUP_DIR/cpu/$DRILLBIT_CGROUP/cgroup.procs ]; then
+  echo $dbitPid > $SYS_CGROUP_DIR/cpu/$DRILLBIT_CGROUP/cgroup.procs
+  # Verify Enforcement
+  cgroupStatus=`grep -w $pid 
$SYS_CGROUP_DIR/cpu/${DRILLBIT_CGROUP}/cgroup.procs`
+  if [ -z "$cgroupStatus" ]; then
--- End diff --

I'm confused: is this checking for $dbitPid (in cgroup.procs) or for $pid?
If the former, the following "-z" condition needs to be negated.
 




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438162#comment-16438162
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1200




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436074#comment-16436074
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/1200
  
@Ben-Zvi The batch committers would be doing that. I've received feedback 
that it helps to preserve the commit-and-comment history, and the squashing can 
be done in the pre-commit merge branch.
@paul-rogers ?




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433028#comment-16433028
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/1200
  
@Ben-Zvi / @paul-rogers I've updated the PR with changes based on your 
comments. Waiting for a review.




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426286#comment-16426286
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/1200
  
@Ben-Zvi / @paul-rogers 
I discovered that the path `/cgroup` is specific to my installation of 
the package on CentOS. The standard path is `/sys/fs/cgroup`.
So, I've made changes that allow users to specify the CGroups location. 
Without this setting, the latest commit complains as expected:
```
[root@kk127 ~]#  
/opt/mapr/drill/apache-drill-1.14.0-SNAPSHOT/bin/drillbit.sh restart
Stopping drillbit
..
Starting drillbit, logging to /var/log/drill/drillbit.out
ERROR: CGroup drillcpu not found. Ensure that daemon is running, 
SYS_CGROUP_DIR is correctly set (currently, /sys/fs/cgroup ), and that the 
CGroup exists
```
The `drill-env.sh` also specifies that the enforcement is only for CPU. 
I'll put this in the documentation of the feature as well.
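
For illustration, the configuration described above could look like the following in `drill-env.sh`. The variable names `DRILLBIT_CGROUP` and `SYS_CGROUP_DIR` come from the patch and the value `drillcpu` matches the log above; the helper `cgroup_procs_file` is a hypothetical name used only in this sketch.

```shell
# drill-env.sh (sketch): name of the CPU CGroup and, optionally, the location
# where the CGroup hierarchy is mounted.
export DRILLBIT_CGROUP=${DRILLBIT_CGROUP:-"drillcpu"}
export SYS_CGROUP_DIR=${SYS_CGROUP_DIR:-"/sys/fs/cgroup"}

# Hypothetical helper: print the cgroup.procs path for the cpu controller,
# or report an error on stderr if the CGroup is missing.
cgroup_procs_file() {
  local procsFile="$SYS_CGROUP_DIR/cpu/$DRILLBIT_CGROUP/cgroup.procs"
  if [ -f "$procsFile" ]; then
    echo "$procsFile"
  else
    echo "ERROR: CGroup $DRILLBIT_CGROUP not found under $SYS_CGROUP_DIR" >&2
    return 1
  fi
}
```

On an installation like mine, the override would be `export SYS_CGROUP_DIR=/cgroup`.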




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426226#comment-16426226
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/1200
  
@paul-rogers very good point about the issue of the heartbeat thread 
competing with other Drillbit process threads for CPU. When writing up the 
documentation for this, we'll have to ensure that users also configure the 
parallelization to a reasonable level to ensure that CPU thrashing is minimized 
with CGroups enforced. 




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426218#comment-16426218
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179292376
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -127,6 +127,44 @@ check_before_start()
   fi
 }
 
+check_after_start(){
+#check if the process is running
+if [ -f $pid ]; then
+  dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit)
+  if [ -n "$dbitProc" ]; then
+# Check and enforce for CGroup
+if [ -n "$DRILLBIT_CGROUP" ]; then 
+  check_and_enforce_cgroup `cat $pid`
+fi
+  fi
+fi
+}
+
+check_and_enforce_cgroup(){
+dbitPid=$1;
+#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then 
+if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then 
+  echo $dbitPid > /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs
+  # Verify Enforcement
+  cgroupStatus=`grep -w $pid 
/cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs`
+  if [ -z "$cgroupStatus" ]; then
+#Ref: 
https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt
+let cpu_quota=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us`
+let cpu_period=`cat 
/cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_period_us`
+if [ $cpu_period -gt 0 ] && [ $cpu_quota -gt 0 ]; then
+  coresAllowed="(up to "`echo $(( 100 * $cpu_quota / $cpu_period 
)) | sed 's/..$/.&/'`" cores allowed)"
+fi
+echo "WARN: Drillbit's CPU resource usage will be managed under 
the CGroup : $DRILLBIT_CGROUP "$coresAllowed
+  else
+echo "ERROR: Failed to add Drillbit to CGroup ( $DRILLBIT_CGROUP ) 
for resource usage management.  Ensure that the cgroup manages CPU"
+exit 1
+  fi
+else
+  echo "ERROR: cgroup $DRILLBIT_CGROUP does not found. Ensure that 
daemon is running and cgroup exists"
+  exit 1
--- End diff --

Good point. Ideally, we should prevent the Drillbit from starting up (or in 
this case, it should shut down), if CGroups couldn't be applied. My concern is 
that if CGroups is not being enforced, we're running a process that can 
(potentially) consume excess CPU resources. 
Should I shut down the Drillbit in such a scenario, or move on with just a 
WARN message?




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426191#comment-16426191
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179287166
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -127,6 +127,44 @@ check_before_start()
   fi
 }
 
+check_after_start(){
+#check if the process is running
+if [ -f $pid ]; then
+  dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit)
+  if [ -n "$dbitProc" ]; then
+# Check and enforce for CGroup
+if [ -n "$DRILLBIT_CGROUP" ]; then 
+  check_and_enforce_cgroup `cat $pid`
+fi
+  fi
+fi
+}
+
+check_and_enforce_cgroup(){
+dbitPid=$1;
+#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then 
+if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then 
+  echo $dbitPid > /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs
+  # Verify Enforcement
+  cgroupStatus=`grep -w $pid 
/cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs`
+  if [ -z "$cgroupStatus" ]; then
+#Ref: 
https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt
+let cpu_quota=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us`
+let cpu_period=`cat 
/cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_period_us`
+if [ $cpu_period -gt 0 ] && [ $cpu_quota -gt 0 ]; then
+  coresAllowed="(up to "`echo $(( 100 * $cpu_quota / $cpu_period 
)) | sed 's/..$/.&/'`" cores allowed)"
+fi
+echo "WARN: Drillbit's CPU resource usage will be managed under 
the CGroup : $DRILLBIT_CGROUP "$coresAllowed
--- End diff --

I thought that since the Drill process is being brought up in a restricted 
mode, it should appear as a WARN.
I can switch it to an INFO because there is no compelling reason to use WARN.
I'll also move the variable inside the quotes.
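
As a sketch of how the "cores allowed" figure in that message could be derived, assuming the quota and period have already been read from `cpu.cfs_quota_us` and `cpu.cfs_period_us`. The `sed 's/..$/.&/'` approach in the diff only inserts a decimal point when the scaled value has at least two digits, so this version uses `awk` instead. The function name `cores_allowed` is hypothetical.

```shell
# Hypothetical helper: print quota/period with two decimals, or nothing when
# no limit is set (a cgroup v1 quota of -1 means "unlimited").
cores_allowed() {
  local cpu_quota="$1" cpu_period="$2"
  if [ "$cpu_quota" -gt 0 ] && [ "$cpu_period" -gt 0 ]; then
    awk -v q="$cpu_quota" -v p="$cpu_period" 'BEGIN { printf "%.2f\n", q / p }'
  fi
}

# Example values only; the variable is expanded inside the quotes.
coresAllowed=$(cores_allowed 400000 100000)
echo "INFO: Drillbit's CPU usage will be managed under the CGroup : drillcpu (up to ${coresAllowed} cores allowed)"
```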




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426186#comment-16426186
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179286598
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -127,6 +127,44 @@ check_before_start()
   fi
 }
 
+check_after_start(){
+#check if the process is running
+if [ -f $pid ]; then
+  dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit)
+  if [ -n "$dbitProc" ]; then
+# Check and enforce for CGroup
+if [ -n "$DRILLBIT_CGROUP" ]; then 
+  check_and_enforce_cgroup `cat $pid`
+fi
+  fi
+fi
+}
+
+check_and_enforce_cgroup(){
+dbitPid=$1;
+#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then 
+if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then 
+  echo $dbitPid > /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs
+  # Verify Enforcement
+  cgroupStatus=`grep -w $pid 
/cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs`
+  if [ -z "$cgroupStatus" ]; then
+#Ref: 
https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt
+let cpu_quota=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us`
--- End diff --

Ok. Will fix this 




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425605#comment-16425605
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179159323
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -154,6 +192,7 @@ start_bit ( )
   nohup nice -n $DRILL_NICENESS "$DRILL_HOME/bin/runbit" exec ${args[@]} 
>> "$logout" 2>&1 &
   echo $! > $pid
   sleep 1
+  check_after_start
--- End diff --

 
Will fix this, thanks!




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425604#comment-16425604
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179158686
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -127,6 +127,44 @@ check_before_start()
   fi
 }
 
+check_after_start(){
+#check if the process is running
+if [ -f $pid ]; then
+  dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit)
+  if [ -n "$dbitProc" ]; then
+# Check and enforce for CGroup
+if [ -n "$DRILLBIT_CGROUP" ]; then 
+  check_and_enforce_cgroup `cat $pid`
+fi
+  fi
+fi
+}
+
+check_and_enforce_cgroup(){
+dbitPid=$1;
+#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then 
--- End diff --

Thanks for catching that!




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425602#comment-16425602
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179158585
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -127,6 +127,44 @@ check_before_start()
   fi
 }
 
+check_after_start(){
+#check if the process is running
+if [ -f $pid ]; then
+  dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit)
--- End diff --

Agreed. I'll incorporate this change.




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425600#comment-16425600
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179158191
  
--- Diff: distribution/src/resources/drill-env.sh ---
@@ -86,6 +86,12 @@
 
 #export DRILL_PID_DIR=${DRILL_PID_DIR:-$DRILL_HOME}
 
+# CGroup to which the Drillbit belong when running as a daemon using
--- End diff --

:)
I did speculate about enforcement for other resources, but it seems that 
CPU is the primary resource that needs to be managed. Memory 'management' (by 
YARN, for example) otherwise works without the need for CGroups.




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425594#comment-16425594
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179156839
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -127,6 +127,44 @@ check_before_start()
   fi
 }
 
+check_after_start(){
+#check if the process is running
+if [ -f $pid ]; then
+  dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit)
+  if [ -n "$dbitProc" ]; then
+# Check and enforce for CGroup
+if [ -n "$DRILLBIT_CGROUP" ]; then 
+  check_and_enforce_cgroup `cat $pid`
+fi
+  fi
+fi
+}
+
+check_and_enforce_cgroup(){
+dbitPid=$1;
+#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then 
+if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then 
--- End diff --

I've looked into checking for older versions, but wasn't sure which one aligned 
with which version and OS. It's easy to check whether CGroups is running, but I 
don't see a guaranteed way of confirming that a specific CGroup exists. 
Let me do some more research on this front.
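
One possible way to avoid guessing the mount point, sketched here under the assumption of a cgroup v1 hierarchy listed in `/proc/mounts`, is to look for the mount whose options include the `cpu` controller. The function name `find_cpu_cgroup_mount` is hypothetical and this approach is not part of the patch.

```shell
# Hypothetical helper: print the mount point of the cgroup v1 'cpu' controller.
# Takes an optional mounts file (defaults to /proc/mounts) so that it can be
# tried against sample data.
find_cpu_cgroup_mount() {
  local mountsFile="${1:-/proc/mounts}"
  # Fields: device, mount point, filesystem type, options, ...
  awk '$3 == "cgroup" && $4 ~ /(^|,)cpu(,|$)/ { print $2; exit }' "$mountsFile"
}
```

An empty result would mean that no v1 `cpu` controller is mounted; the named CGroup would then still need to be checked under the returned directory.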




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425032#comment-16425032
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179028078
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -127,6 +127,44 @@ check_before_start()
   fi
 }
 
+check_after_start(){
+#check if the process is running
+if [ -f $pid ]; then
+  dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit)
+  if [ -n "$dbitProc" ]; then
+# Check and enforce for CGroup
+if [ -n "$DRILLBIT_CGROUP" ]; then 
+  check_and_enforce_cgroup `cat $pid`
+fi
+  fi
+fi
+}
+
+check_and_enforce_cgroup(){
+dbitPid=$1;
+#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then 
+if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then 
+  echo $dbitPid > /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs
+  # Verify Enforcement
+  cgroupStatus=`grep -w $pid 
/cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs`
+  if [ -z "$cgroupStatus" ]; then
+#Ref: 
https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt
+let cpu_quota=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us`
+let cpu_period=`cat 
/cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_period_us`
+if [ $cpu_period -gt 0 ] && [ $cpu_quota -gt 0 ]; then
+  coresAllowed="(up to "`echo $(( 100 * $cpu_quota / $cpu_period 
)) | sed 's/..$/.&/'`" cores allowed)"
+fi
+echo "WARN: Drillbit's CPU resource usage will be managed under 
the CGroup : $DRILLBIT_CGROUP "$coresAllowed
+  else
+echo "ERROR: Failed to add Drillbit to CGroup ( $DRILLBIT_CGROUP ) 
for resource usage management.  Ensure that the cgroup manages CPU"
+exit 1
+  fi
+else
+  echo "ERROR: cgroup $DRILLBIT_CGROUP does not found. Ensure that 
daemon is running and cgroup exists"
--- End diff --

cgroup --> CGroup (or at least make the format consistent across messages)

does not found --> not found

We can check if the daemon is running using `kill -0 $pid`. If so, then we 
just tell the user something like "Check CGroup configuration."

Also, errors should go to stderr: `echo "msg" >&2`
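
A minimal sketch of both suggestions, assuming a numeric PID is available. The names `is_running` and `report_error` are hypothetical and used only for illustration.

```shell
# Hypothetical helper: kill -0 sends no signal; it only checks that the
# process exists and that we are permitted to signal it.
is_running() {
  kill -0 "$1" 2>/dev/null
}

# Hypothetical helper: write an error message to stderr rather than stdout.
report_error() {
  echo "ERROR: $*" >&2
}

dbitPid=$$   # stand-in for the Drillbit PID in this sketch
if ! is_running "$dbitPid"; then
  report_error "Drillbit ( pid=$dbitPid ) is not running."
fi
```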




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425026#comment-16425026
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179025400
  
--- Diff: distribution/src/resources/drill-env.sh ---
@@ -86,6 +86,12 @@
 
 #export DRILL_PID_DIR=${DRILL_PID_DIR:-$DRILL_HOME}
 
+# CGroup to which the Drillbit belong when running as a daemon using
--- End diff --

belong --> belongs




[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425031#comment-16425031
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179028127
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -127,6 +127,44 @@ check_before_start()
   fi
 }
 
+check_after_start(){
+#check if the process is running
+if [ -f $pid ]; then
+  dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit)
+  if [ -n "$dbitProc" ]; then
+# Check and enforce for CGroup
+if [ -n "$DRILLBIT_CGROUP" ]; then 
+  check_and_enforce_cgroup `cat $pid`
+fi
+  fi
+fi
+}
+
+check_and_enforce_cgroup(){
+dbitPid=$1;
+#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then 
+if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then 
+  echo $dbitPid > /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs
+  # Verify Enforcement
+  cgroupStatus=`grep -w $pid /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs`
+  if [ -z "$cgroupStatus" ]; then
+#Ref: https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt
+let cpu_quota=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us`
+let cpu_period=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_period_us`
+if [ $cpu_period -gt 0 ] && [ $cpu_quota -gt 0 ]; then
+  coresAllowed="(up to "`echo $(( 100 * $cpu_quota / $cpu_period )) | sed 's/..$/.&/'`" cores allowed)"
+fi
+echo "WARN: Drillbit's CPU resource usage will be managed under the CGroup : $DRILLBIT_CGROUP "$coresAllowed
+  else
+echo "ERROR: Failed to add Drillbit to CGroup ( $DRILLBIT_CGROUP ) for resource usage management.  Ensure that the cgroup manages CPU"
+exit 1
+  fi
+else
+  echo "ERROR: cgroup $DRILLBIT_CGROUP does not found. Ensure that daemon is running and cgroup exists"
+  exit 1
--- End diff --

Not sure we want to fail the script in this case. Failure means that the 
Drillbit did not start, but it did.
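
One way to report the problem without failing the script might look like the sketch below. `enforce_cgroup` and the path passed to it are hypothetical stand-ins, not names from the PR; the point is only that the caller logs and carries on.

```shell
# Hypothetical stand-in for the enforcement step: writes the pid into the
# given cgroup.procs file and returns non-zero if that is not possible.
enforce_cgroup() {
  local procsFile="$1" procId="$2"
  [ -w "$procsFile" ] && echo "$procId" > "$procsFile"
}

# The Drillbit is already up at this point, so report the problem
# instead of exiting with a failure code.
if ! enforce_cgroup "/nonexistent/cgroup.procs" "$$"; then
  echo "WARN: Drillbit started, but could not be added to the cgroup" >&2
fi
```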


> Support CGROUPs resource management
> ---
>
> Key: DRILL-143
> URL: https://issues.apache.org/jira/browse/DRILL-143
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: 253ce178-ddeb-e482-cd64-44ab7284ad1c.sys.drill
>
>
> For the purpose of playing nice on clusters that don't have YARN, we should 
> write up configuration and scripts to allow users to run Drill next to
> existing workloads without sharing resources.





[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425025#comment-16425025
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179025777
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -127,6 +127,44 @@ check_before_start()
   fi
 }
 
+check_after_start(){
+#check if the process is running
+if [ -f $pid ]; then
+  dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit)
--- End diff --

If you get this far, the pid must exist or the scripts are broken.

The only question is whether the process is still running. The chance that 
it is not is very low. Also, there is a race condition: the process may exit 
just after we check it. So, not sure it is even worth doing the check.

Finally, the standard way to check for the process is `kill -0 $pid`. See 
`waitForProcessEnd()`.
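
As a rough sketch of that check (the helper name `is_process_alive` is made up for illustration and is not part of `drillbit.sh`): signal 0 delivers nothing, it only reports whether the pid exists and can be signalled by the current user.

```shell
# Hypothetical helper: returns 0 if a process with the given pid exists
# and the current user is allowed to signal it.
is_process_alive() {
  local procId="$1"
  [ -n "$procId" ] && kill -0 "$procId" 2>/dev/null
}

# Usage sketch: $$ is the current shell, so this branch is taken.
if is_process_alive "$$"; then
  echo "process $$ is running"
fi
```

The race mentioned above still applies: the process can exit right after the check returns.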


> Support CGROUPs resource management
> ---
>
> Key: DRILL-143
> URL: https://issues.apache.org/jira/browse/DRILL-143
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: 253ce178-ddeb-e482-cd64-44ab7284ad1c.sys.drill
>
>
> For the purpose of playing nice on clusters that don't have YARN, we should 
> write up configuration and scripts to allow users to run Drill next to
> existing workloads without sharing resources.





[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425023#comment-16425023
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179027582
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -154,6 +192,7 @@ start_bit ( )
   nohup nice -n $DRILL_NICENESS "$DRILL_HOME/bin/runbit" exec ${args[@]} >> "$logout" 2>&1 &
   echo $! > $pid
   sleep 1
+  check_after_start
--- End diff --

To make things easier:

```
procId=$!
echo $procId > $pid # Yeah, $pid is a file, $procId is the pid...
sleep 1
check_after_start $procId
```

Also, we now have two naming conventions: `waitForProcessEnd` and 
`check_after_start`. It doesn't matter which we choose, but we should stick with 
it. Where is checkstyle for scripts?


> Support CGROUPs resource management
> ---
>
> Key: DRILL-143
> URL: https://issues.apache.org/jira/browse/DRILL-143
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: 253ce178-ddeb-e482-cd64-44ab7284ad1c.sys.drill
>
>
> For the purpose of playing nice on clusters that don't have YARN, we should 
> write up configuration and scripts to allow users to run Drill next to
> existing workloads without sharing resources.





[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425022#comment-16425022
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179026313
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -127,6 +127,44 @@ check_before_start()
   fi
 }
 
+check_after_start(){
+#check if the process is running
+if [ -f $pid ]; then
+  dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit)
+  if [ -n "$dbitProc" ]; then
+# Check and enforce for CGroup
+if [ -n "$DRILLBIT_CGROUP" ]; then 
+  check_and_enforce_cgroup `cat $pid`
+fi
+  fi
+fi
+}
+
+check_and_enforce_cgroup(){
+dbitPid=$1;
+#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then 
--- End diff --

Remove commented out line


> Support CGROUPs resource management
> ---
>
> Key: DRILL-143
> URL: https://issues.apache.org/jira/browse/DRILL-143
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: 253ce178-ddeb-e482-cd64-44ab7284ad1c.sys.drill
>
>
> For the purpose of playing nice on clusters that don't have YARN, we should 
> write up configuration and scripts to allow users to run Drill next to
> existing workloads without sharing resources.





[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425030#comment-16425030
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179027761
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -127,6 +127,44 @@ check_before_start()
   fi
 }
 
+check_after_start(){
+#check if the process is running
+if [ -f $pid ]; then
+  dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit)
+  if [ -n "$dbitProc" ]; then
+# Check and enforce for CGroup
+if [ -n "$DRILLBIT_CGROUP" ]; then 
+  check_and_enforce_cgroup `cat $pid`
+fi
+  fi
+fi
+}
+
+check_and_enforce_cgroup(){
+dbitPid=$1;
+#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then 
+if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then 
+  echo $dbitPid > /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs
+  # Verify Enforcement
+  cgroupStatus=`grep -w $pid /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs`
+  if [ -z "$cgroupStatus" ]; then
+#Ref: https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt
+let cpu_quota=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us`
+let cpu_period=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_period_us`
+if [ $cpu_period -gt 0 ] && [ $cpu_quota -gt 0 ]; then
+  coresAllowed="(up to "`echo $(( 100 * $cpu_quota / $cpu_period )) | sed 's/..$/.&/'`" cores allowed)"
+fi
+echo "WARN: Drillbit's CPU resource usage will be managed under the CGroup : $DRILLBIT_CGROUP "$coresAllowed
--- End diff --

Why is this a WARN? Isn't this exactly what I requested?

Maybe emit this as a message (no WARN) if a -v (verbose) flag is set.

Also, put the variable inside the quotes; bash is handy that way...

Message will be "...CGroup : drillcpu 5". Maybe, "CGroup drillcpu will 
limit Drill to 5 cpu(s)".
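
A possible shape for that message, with the variable inside the quotes, is sketched below. `format_cores` is a hypothetical helper and the quota and period values are assumed samples rather than values read from a real cgroup; it uses plain integer arithmetic instead of the `sed` trick in the diff.

```shell
# Hypothetical helper: prints quota/period with two decimals using only
# integer arithmetic (both values are in microseconds, as in cpu.cfs_*_us).
format_cores() {
  local quota="$1" period="$2"
  printf '%d.%02d' $(( quota / period )) $(( (quota % period) * 100 / period ))
}

DRILLBIT_CGROUP="drillcpu"   # example name taken from the PR description
cpu_quota=400000             # assumed sample value
cpu_period=100000            # assumed sample value
echo "CGroup $DRILLBIT_CGROUP will limit Drill to $(format_cores "$cpu_quota" "$cpu_period") cpu(s)"
```

With these sample values the message ends in `4.00 cpu(s)`; quotas below one hundredth of the period would print as `0.00`.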


> Support CGROUPs resource management
> ---
>
> Key: DRILL-143
> URL: https://issues.apache.org/jira/browse/DRILL-143
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: 253ce178-ddeb-e482-cd64-44ab7284ad1c.sys.drill
>
>
> For the purpose of playing nice on clusters that don't have YARN, we should 
> write up configuration and scripts to allow users to run Drill next to
> existing workloads without sharing resources.





[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425033#comment-16425033
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179028377
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -127,6 +127,44 @@ check_before_start()
   fi
 }
 
+check_after_start(){
+#check if the process is running
+if [ -f $pid ]; then
+  dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit)
+  if [ -n "$dbitProc" ]; then
+# Check and enforce for CGroup
+if [ -n "$DRILLBIT_CGROUP" ]; then 
+  check_and_enforce_cgroup `cat $pid`
+fi
+  fi
+fi
+}
+
+check_and_enforce_cgroup(){
+dbitPid=$1;
+#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then 
+if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then 
+  echo $dbitPid > /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs
+  # Verify Enforcement
+  cgroupStatus=`grep -w $pid /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs`
+  if [ -z "$cgroupStatus" ]; then
+#Ref: https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt
+let cpu_quota=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us`
+let cpu_period=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_period_us`
+if [ $cpu_period -gt 0 ] && [ $cpu_quota -gt 0 ]; then
+  coresAllowed="(up to "`echo $(( 100 * $cpu_quota / $cpu_period )) | sed 's/..$/.&/'`" cores allowed)"
+fi
+echo "WARN: Drillbit's CPU resource usage will be managed under the CGroup : $DRILLBIT_CGROUP "$coresAllowed
+  else
+echo "ERROR: Failed to add Drillbit to CGroup ( $DRILLBIT_CGROUP ) for resource usage management.  Ensure that the cgroup manages CPU"
+exit 1
--- End diff --

See below about `exit 1`


> Support CGROUPs resource management
> ---
>
> Key: DRILL-143
> URL: https://issues.apache.org/jira/browse/DRILL-143
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: 253ce178-ddeb-e482-cd64-44ab7284ad1c.sys.drill
>
>
> For the purpose of playing nice on clusters that don't have YARN, we should 
> write up configuration and scripts to allow users to run Drill next to
> existing workloads without sharing resources.





[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425029#comment-16425029
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179028625
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -127,6 +127,44 @@ check_before_start()
   fi
 }
 
+check_after_start(){
+#check if the process is running
+if [ -f $pid ]; then
+  dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit)
+  if [ -n "$dbitProc" ]; then
+# Check and enforce for CGroup
+if [ -n "$DRILLBIT_CGROUP" ]; then 
+  check_and_enforce_cgroup `cat $pid`
+fi
+  fi
+fi
+}
+
+check_and_enforce_cgroup(){
+dbitPid=$1;
+#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then 
+if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then 
+  echo $dbitPid > /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs
+  # Verify Enforcement
+  cgroupStatus=`grep -w $pid /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs`
+  if [ -z "$cgroupStatus" ]; then
+#Ref: https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt
+let cpu_quota=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us`
--- End diff --

`let` is used only for math. Just say `cpu_quota=$(cat ...)`.
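
To illustrate the distinction, here is a small sketch. It reads from a temporary file that stands in for `cpu.cfs_quota_us` so it runs anywhere; the value and the `cpu_quota_ms` variable are assumptions for the example only.

```shell
# Stand-in for /cgroup/cpu/<group>/cpu.cfs_quota_us so the sketch runs anywhere.
quotaFile=$(mktemp)
echo 400000 > "$quotaFile"

# Plain assignment via command substitution; no `let` required.
cpu_quota=$(cat "$quotaFile")

# Arithmetic is where $(( ... )) (or `let`) is actually needed.
cpu_quota_ms=$(( cpu_quota / 1000 ))
echo "quota: ${cpu_quota} us (${cpu_quota_ms} ms)"

rm -f "$quotaFile"
```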


> Support CGROUPs resource management
> ---
>
> Key: DRILL-143
> URL: https://issues.apache.org/jira/browse/DRILL-143
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: 253ce178-ddeb-e482-cd64-44ab7284ad1c.sys.drill
>
>
> For the purpose of playing nice on clusters that don't have YARN, we should 
> write up configuration and scripts to allow users to run Drill next to
> existing workloads without sharing resources.





[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425028#comment-16425028
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179025617
  
--- Diff: distribution/src/resources/drill-env.sh ---
@@ -86,6 +86,12 @@
 
 #export DRILL_PID_DIR=${DRILL_PID_DIR:-$DRILL_HOME}
 
+# CGroup to which the Drillbit belong when running as a daemon using
--- End diff --

Also, maybe provide a bit more of a description. My guess is that the 
CGroup must already be set up? That we only enforce CPU? Or, is what we enforce 
determined entirely by what the user has configured? If so, wouldn't "drill" be 
a better CGroup name?


> Support CGROUPs resource management
> ---
>
> Key: DRILL-143
> URL: https://issues.apache.org/jira/browse/DRILL-143
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: 253ce178-ddeb-e482-cd64-44ab7284ad1c.sys.drill
>
>
> For the purpose of playing nice on clusters that don't have YARN, we should 
> write up configuration and scripts to allow users to run Drill next to
> existing workloads without sharing resources.





[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425027#comment-16425027
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179027910
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -127,6 +127,44 @@ check_before_start()
   fi
 }
 
+check_after_start(){
+#check if the process is running
+if [ -f $pid ]; then
+  dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit)
+  if [ -n "$dbitProc" ]; then
+# Check and enforce for CGroup
+if [ -n "$DRILLBIT_CGROUP" ]; then 
+  check_and_enforce_cgroup `cat $pid`
+fi
+  fi
+fi
+}
+
+check_and_enforce_cgroup(){
+dbitPid=$1;
+#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then 
+if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then 
+  echo $dbitPid > /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs
+  # Verify Enforcement
+  cgroupStatus=`grep -w $pid /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs`
+  if [ -z "$cgroupStatus" ]; then
+#Ref: https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt
+let cpu_quota=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_quota_us`
+let cpu_period=`cat /cgroup/cpu/${DRILLBIT_CGROUP}/cpu.cfs_period_us`
+if [ $cpu_period -gt 0 ] && [ $cpu_quota -gt 0 ]; then
+  coresAllowed="(up to "`echo $(( 100 * $cpu_quota / $cpu_period )) | sed 's/..$/.&/'`" cores allowed)"
+fi
+echo "WARN: Drillbit's CPU resource usage will be managed under the CGroup : $DRILLBIT_CGROUP "$coresAllowed
+  else
+echo "ERROR: Failed to add Drillbit to CGroup ( $DRILLBIT_CGROUP ) for resource usage management.  Ensure that the cgroup manages CPU"
--- End diff --

Inconsistent format: parens here, colon above. Actually, neither is needed.

Also, "for resource usage management" --> "for cpu"


> Support CGROUPs resource management
> ---
>
> Key: DRILL-143
> URL: https://issues.apache.org/jira/browse/DRILL-143
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: 253ce178-ddeb-e482-cd64-44ab7284ad1c.sys.drill
>
>
> For the purpose of playing nice on clusters that don't have YARN, we should 
> write up configuration and scripts to allow users to run Drill next to
> existing workloads without sharing resources.





[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425024#comment-16425024
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179026462
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -127,6 +127,44 @@ check_before_start()
   fi
 }
 
+check_after_start(){
+#check if the process is running
+if [ -f $pid ]; then
+  dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit)
+  if [ -n "$dbitProc" ]; then
+# Check and enforce for CGroup
+if [ -n "$DRILLBIT_CGROUP" ]; then 
+  check_and_enforce_cgroup `cat $pid`
+fi
+  fi
+fi
+}
+
+check_and_enforce_cgroup(){
+dbitPid=$1;
+#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then 
+if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then 
--- End diff --

I believe cgroup setup is OS-specific. There are, AFAIK, multiple cgroup 
versions.

For this reason, I wonder if this needs to have checks. Are there scripts 
we can crib from somewhere that do those checks?


> Support CGROUPs resource management
> ---
>
> Key: DRILL-143
> URL: https://issues.apache.org/jira/browse/DRILL-143
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: 253ce178-ddeb-e482-cd64-44ab7284ad1c.sys.drill
>
>
> For the purpose of playing nice on clusters that don't have YARN, we should 
> write up configuration and scripts to allow users to run Drill next to
> existing workloads without sharing resources.





[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424839#comment-16424839
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user Ben-Zvi commented on the issue:

https://github.com/apache/drill/pull/1200
  
Could there be users on an older Linux kernel (pre-4.5, circa March 2016) that 
does not support cgroups v2?



> Support CGROUPs resource management
> ---
>
> Key: DRILL-143
> URL: https://issues.apache.org/jira/browse/DRILL-143
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: 253ce178-ddeb-e482-cd64-44ab7284ad1c.sys.drill
>
>
> For the purpose of playing nice on clusters that don't have YARN, we should 
> write up configuration and scripts to allow users to run Drill next to
> existing workloads without sharing resources.





[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424835#comment-16424835
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1200#discussion_r179005732
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -127,6 +127,44 @@ check_before_start()
   fi
 }
 
+check_after_start(){
+#check if the process is running
+if [ -f $pid ]; then
+  dbitProc=$(ps -ef | grep `cat $pid` | grep Drillbit)
+  if [ -n "$dbitProc" ]; then
+# Check and enforce for CGroup
+if [ -n "$DRILLBIT_CGROUP" ]; then 
+  check_and_enforce_cgroup `cat $pid`
+fi
+  fi
+fi
+}
+
+check_and_enforce_cgroup(){
+dbitPid=$1;
+#if [ $(`ps -o cgroup` | grep -c $DRILLBIT_CGROUP ) -eq 1 ]; then 
+if [ -f /cgroup/cpu/${DRILLBIT_CGROUP}/cgroup.procs ]; then 
--- End diff --

Why is the path hardcoded? Is it a standard? I checked on my cluster, 
where /sys/fs/cgroup/. is used instead.
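
One way to avoid the hardcoded path might be to probe a short list of candidate roots, as in the sketch below. `find_cgroup_root` is a hypothetical helper, and the two candidate roots are only the path from the diff and the one mentioned above; other distributions may mount the hierarchy elsewhere.

```shell
# Hypothetical helper: prints the first candidate root under which the named
# group has a cpu controller directory containing a cgroup.procs file.
find_cgroup_root() {
  local group="$1"; shift
  local root
  for root in "$@"; do
    if [ -f "$root/cpu/$group/cgroup.procs" ]; then
      echo "$root"
      return 0
    fi
  done
  return 1
}

# Candidate roots are assumptions, not an exhaustive list.
if cgroupRoot=$(find_cgroup_root "${DRILLBIT_CGROUP:-drillcpu}" /sys/fs/cgroup /cgroup); then
  echo "Using cgroup root: $cgroupRoot"
else
  echo "No cpu cgroup named ${DRILLBIT_CGROUP:-drillcpu} found under the known roots"
fi
```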



> Support CGROUPs resource management
> ---
>
> Key: DRILL-143
> URL: https://issues.apache.org/jira/browse/DRILL-143
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: 253ce178-ddeb-e482-cd64-44ab7284ad1c.sys.drill
>
>
> For the purpose of playing nice on clusters that don't have YARN, we should 
> write up configuration and scripts to allow users to run Drill next to
> existing workloads without sharing resources.





[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423565#comment-16423565
 ] 

ASF GitHub Bot commented on DRILL-143:
--

Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/1200
  
@paul-rogers / @Ben-Zvi, could you review this?


> Support CGROUPs resource management
> ---
>
> Key: DRILL-143
> URL: https://issues.apache.org/jira/browse/DRILL-143
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: 253ce178-ddeb-e482-cd64-44ab7284ad1c.sys.drill
>
>
> For the purpose of playing nice on clusters that don't have YARN, we should 
> write up configuration and scripts to allow users to run Drill next to
> existing workloads without sharing resources.





[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-03 Thread Kunal Khatua (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423555#comment-16423555
 ] 

Kunal Khatua commented on DRILL-143:


Attached a sample profile ([^253ce178-ddeb-e482-cd64-44ab7284ad1c.sys.drill]) 
in which node `kk127.qa.lab` was restricted to using no more than 4 CPU cores.

As a result, the HashJoin operator on that node took 3 times longer than on the 
other 3 nodes, whose CPU utilization was not restricted with CGroups.

> Support CGROUPs resource management
> ---
>
> Key: DRILL-143
> URL: https://issues.apache.org/jira/browse/DRILL-143
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: 253ce178-ddeb-e482-cd64-44ab7284ad1c.sys.drill
>
>
> For the purpose of playing nice on clusters that don't have YARN, we should 
> write up configuration and scripts to allow users to run Drill next to
> existing workloads without sharing resources.





[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423552#comment-16423552
 ] 

ASF GitHub Bot commented on DRILL-143:
--

GitHub user kkhatua opened a pull request:

https://github.com/apache/drill/pull/1200

DRILL-143: Support CGROUPs resource management

Introduces the `DRILLBIT_CGROUP` option, defined in `drill-env.sh`.
The startup script checks whether the specified CGroup (ver 2) is available and 
tries to apply it to the launched Drillbit JVM.
This would benefit not just Drill-on-YARN use cases, but any setup that 
wants CGroups to enforce (CPU) resource management.

E.g., when the Drillbit is configured to use the `drillcpu` cgroup:
```
[root@maprlabs ~]# /opt/mapr/drill/apache-drill-1.14.0-SNAPSHOT/bin/drillbit.sh restart
Stopping drillbit
..
Starting drillbit, logging to /var/log/drill/drillbit.out
WARN: Drillbit's CPU resource usage will be managed under the CGroup : drillcpu (up to 4.00 cores allowed)
```

E.g., when a non-existent CGroup `droolcpu` is used:
```
[root@maprlabs ~]# /opt/mapr/drill/apache-drill-1.14.0-SNAPSHOT/bin/drillbit.sh restart
Stopping drillbit
..
Starting drillbit, logging to /var/log/drill/drillbit.out
ERROR: cgroup droolcpu does not found. Ensure that daemon is running and cgroup exists
```

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kkhatua/drill DRILL-143

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1200.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1200


commit e9a2551b315b4395c5227dad017f2b4340f41108
Author: Kunal Khatua 
Date:   2018-04-03T05:35:53Z

DRILL-143: Support CGROUPs resource management

Introduces the DRILLBIT_CGROUP option in drill-env.sh.
The startup script checks whether the specified CGroup (ver 2) is available and 
tries to apply it to the launched Drillbit JVM.
This would benefit not just Drill-on-YARN use cases, but any setup that 
wants CGroups to enforce (CPU) resource management.

E.g., when the Drillbit is configured to use the `drillcpu` cgroup:
```
[root@maprlabs ~]# /opt/mapr/drill/apache-drill-1.14.0-SNAPSHOT/bin/drillbit.sh restart
Stopping drillbit
..
Starting drillbit, logging to /var/log/drill/drillbit.out
WARN: Drillbit's CPU resource usage will be managed under the CGroup : drillcpu (up to 4.00 cores allowed)
```

E.g., when a non-existent CGroup `droolcpu` is used:
```
[root@kk127 ~]# /opt/mapr/drill/apache-drill-1.14.0-SNAPSHOT/bin/drillbit.sh restart
Stopping drillbit
..
Starting drillbit, logging to /var/log/drill/drillbit.out
ERROR: cgroup droolcpu does not found. Ensure that daemon is running and cgroup exists
```




> Support CGROUPs resource management
> ---
>
> Key: DRILL-143
> URL: https://issues.apache.org/jira/browse/DRILL-143
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.14.0
>
>
> For the purpose of playing nice on clusters that don't have YARN, we should 
> write up configuration and scripts to allow users to run Drill next to
> existing workloads without sharing resources.





[jira] [Commented] (DRILL-143) Support CGROUPs resource management

2018-03-16 Thread Kunal Khatua (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402385#comment-16402385
 ] 

Kunal Khatua commented on DRILL-143:


Marking this as a feature for 1.14.0 since Drill-on-YARN will be part of 1.13.0.

However, this would be a generic feature for Drill honoring CGroups 
irrespective of whether the node is managed by YARN or not.

> Support CGROUPs resource management
> ---
>
> Key: DRILL-143
> URL: https://issues.apache.org/jira/browse/DRILL-143
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Priority: Major
> Fix For: 1.14.0
>
>
> For the purpose of playing nice on clusters that don't have YARN, we should 
> write up configuration and scripts to allow users to run Drill next to
> existing workloads without sharing resources.


