[
https://issues.apache.org/jira/browse/DRILL-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Paul Rogers updated DRILL-6263:
-------------------------------
Description:
As part of the Drill 1.13 release process, I tested out DoY after a year of not
having used it. That time gap pointed out some improvements for first-time
users.
* Copy the
[USAGE.md|https://github.com/apache/drill/blob/master/drill-yarn/USAGE.md] file
into the Drill home directory with the name "DRILL_YARN_USAGE.md".
* Change the {{drill-on-yarn-example.conf}} file to be a valid file for the
default Drill and YARN configurations.
{noformat}
heap: "2G"
max-direct-memory: "2G"
memory-mb: 5125
{noformat}
* Change the {{drill-on-yarn-example.conf}} to disable SSL by default. Just
comment out the following line:
{noformat}
#ssl-enabled: true
{noformat}
* Change the {{drill-on-yarn-example.conf}} to disable authorization by
default. That is, comment out the following line:
{noformat}
#auth-type: "drill"
{noformat}
* Change the {{drill-on-yarn-example.conf}} to use no AM node labels by
default. That is, comment out the following line:
{noformat}
#node-label-expr: "drill-am"
{noformat}
Failure to comment out this line results in the following error:
{noformat}
Failed to start Drill application master
Caused by: Submit application failed
Caused by: Invalid resource request, node label not enabled but request
contains label expression
{noformat}
Also, add this to the Troubleshooting section in {{USAGE.md}}.
* Change {{DrillOnYarnConfig.findSuffix}} to allow the {{.tar}} suffix. This
is what one ends up with if the Mac does its automatic extract. A tar file is
larger than the compressed version, but there is no reason it should not be
allowed (assuming YARN supports it).
* Otherwise, change {{DrillOnYarnConfig.getRemoteDrillHome()}}, where we emit
the error "does not name a valid archive", to differentiate between no
suffix and an unsupported suffix. (I got the following error and had to
look at the source to figure out what I'd done wrong):
{noformat}
drill.yarn.drill-install.client-path does not name a valid archive:
/Users/paulrogers/bin/apache-drill-1.13.0.tar
{noformat}
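A minimal sketch of how the suffix check could tell the two cases apart. This is not the actual {{DrillOnYarnConfig}} code: the class name, the {{validate}} helper, the accepted-suffix list and the message wording are all assumptions made for illustration, as is the heuristic that treats an all-digit "extension" (the tail of a version number) as no suffix.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; names and messages are illustrative, not Drill's. */
final class ArchiveSuffixSketch {

  // Assumed list of accepted suffixes; ".tar" is the addition proposed above.
  static final List<String> SUFFIXES =
      Arrays.asList(".tar.gz", ".tgz", ".zip", ".tar");

  /** Returns the accepted archive suffix of the name, or null if none matches. */
  static String findSuffix(String baseName) {
    String lower = baseName.toLowerCase();
    for (String suffix : SUFFIXES) {
      if (lower.endsWith(suffix)) {
        return suffix;
      }
    }
    return null;
  }

  /** Returns null if the path names an accepted archive, else an error message. */
  static String validate(String key, String path) {
    String name = path.substring(path.lastIndexOf('/') + 1);
    if (findSuffix(name) != null) {
      return null;
    }
    int dot = name.lastIndexOf('.');
    String ext = dot < 0 ? "" : name.substring(dot + 1);
    // Heuristic (assumption): an empty or all-digit tail such as the "0" in
    // "apache-drill-1.13.0" is part of the version, not a file suffix.
    if (ext.matches("\\d*")) {
      return key + " has no archive suffix: " + path
          + " (expected one of " + SUFFIXES + ")";
    }
    return key + " has unsupported archive suffix \"." + ext + "\": " + path
        + " (expected one of " + SUFFIXES + ")";
  }
}
```

With a split like this, the message itself says whether the archive name is missing a suffix or merely has one that is not accepted, so there is no need to read the source.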
* Change the newly-added error reporting code in {{DrillOnYarn.displayError}}
to omit the exception cause if it just repeats the main error message. Here is
the full error message from above; the "Caused by" line is redundant:
{noformat}
drill.yarn.drill-install.client-path does not name a valid archive:
/Users/paulrogers/bin/apache-drill-1.13.0.tar
Caused by: drill.yarn.drill-install.client-path does not name a valid
archive: /Users/paulrogers/bin/apache-drill-1.13.0.tar
{noformat}
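One way the redundant cause could be dropped, as a rough sketch only. The class and method names here are hypothetical and this is not the existing {{DrillOnYarn.displayError}} code; it simply walks the cause chain with the standard {{Throwable.getCause()}} and skips any cause whose message is missing or identical to the line just printed.

```java
/** Hypothetical sketch; not the actual DrillOnYarn.displayError code. */
final class ErrorDisplaySketch {

  /** Formats an exception chain, omitting causes that repeat the previous message. */
  static String format(Throwable e) {
    // Fall back to the class name when the top-level exception has no message.
    String last = e.getMessage() == null
        ? e.getClass().getSimpleName() : e.getMessage();
    StringBuilder buf = new StringBuilder(last);
    for (Throwable cause = e.getCause(); cause != null; cause = cause.getCause()) {
      String msg = cause.getMessage();
      if (msg == null || msg.equals(last)) {
        continue; // Adds nothing beyond the line already shown.
      }
      buf.append("\nCaused by: ").append(msg);
      last = msg;
    }
    return buf.toString();
  }
}
```

A chain such as the node-label failure, where each cause carries a different message, would still be shown in full; only exact repeats are suppressed.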
* Add to
[USAGE.md|https://github.com/apache/drill/blob/master/drill-yarn/USAGE.md]
pointers for how to set up a basic HDFS, ZK and YARN configuration. Mostly just
state what is to be done and point to the [relevant Hadoop
docs|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html],
"Pseudo-Distributed Operation". In particular, we want to create an actual
HDFS file system rather than use the default local file system.
* Add to {{USAGE.md}} a description of the supported YARN (actually Hadoop)
versions. Feature was developed with 2.7.1. Currently verifying with 2.9.0.
Probably needs to be rechecked on the 3.x series.
* Add to {{USAGE.md}} the fact that Drill is built with, and includes the jars
for, Hadoop 2.7.1. It is not clear what version compatibility Hadoop has; are
these jars compatible with the latest 2.x series Hadoop? With Hadoop 3.x?
* Until DRILL-6268 is fixed, explain that the HDFS configuration *must* use
port 8020. Also, add this to the Troubleshooting section in {{USAGE.md}}.
* Add to {{USAGE.md}}, Troubleshooting: if configuration issues cause Drill to
fail to start, then Drill-on-YARN will blacklist each node after several tries.
Unfortunately, the YARN UI appears to not provide access to the logs for failed
application containers. So, to track down the failure, look for the container
logs in YARN. In the default single-node install, they are in
{{$HADOOP_HOME/logs/userlogs/application_xxx/container_xx_00000y}} where y=1 is
the AM, y>1 are the Drillbit containers.
* In {{USAGE.md}}, change the following line:
{noformat}
cp $DRILL_HOME/conf/drill-override-example.conf $DRILL_SITE/drill-override.conf
{noformat}
to the following:
{noformat}
cp $DRILL_HOME/conf/drill-override.conf $DRILL_SITE
{noformat}
Without this change, Drill will fail to start and you'll see the following in
the {{drillbit.log}} file in the YARN container log directory:
{noformat}
2018-03-17 16:11:25,293 [main] ERROR o.a.d.e.r.u.s.PamUserAuthenticator -
Problem in finding the native library of JPAM (Pluggable Authenticator Module
API). Make sure to set Drillbit JVM option 'java.library.path' to point to the
directory where the native JPAM exists.
java.lang.UnsatisfiedLinkError: no jpam in java.library.path
{noformat}
None of these are show stoppers; each is instead just a bit of sand in the
gears that makes progress a bit slower than it need be.
> Improvements to DoY initial experience
> --------------------------------------
>
> Key: DRILL-6263
> URL: https://issues.apache.org/jira/browse/DRILL-6263
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.13.0
> Reporter: Paul Rogers
> Priority: Minor
> Fix For: 1.14.0