[
https://issues.apache.org/jira/browse/DRILL-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Paul Rogers updated DRILL-6263:
-------------------------------
Description:
As part of the Drill 1.13 release process, I tested out DoY after a year of not
having used it. That time gap pointed out some improvements for first-time
users.
* Copy the
[USAGE.md|https://github.com/apache/drill/blob/master/drill-yarn/USAGE.md] file
into the Drill home directory with the name "DRILL_YARN_USAGE.md".
* Change the {{drill-on-yarn-example.conf}} file to be a valid file for the
default Drill and YARN configurations.
{noformat}
heap: "2G"
max-direct-memory: "2G"
memory-mb: 5125
{noformat}
* Change the {{drill-on-yarn-example.conf}} to disable SSL by default. Just
comment out the following line:
{noformat}
#ssl-enabled: true
{noformat}
* Change the {{drill-on-yarn-example.conf}} to disable authorization by
default. That is, comment out the following line:
{noformat}
#auth-type: "drill"
{noformat}
* Change the {{drill-on-yarn-example.conf}} to use no AM node labels by
default. That is, comment out the following line:
{noformat}
#node-label-expr: "drill-am"
{noformat}
* Change {{DrillOnYarnConfig.findSuffix}} to allow the {{.tar}} suffix. This
is what one ends up with if the Mac does its automatic extract. A tar file is
larger than the compressed version, but there is no reason it should not be
allowed (assuming YARN supports it).
* Otherwise, change {{DrillOnYarnConfig.getRemoteDrillHome()}}, where we emit
the error "does not name a valid archive", to differentiate between no
suffix and an unsupported suffix. (I got the following error and had to
look at the source to figure out what I'd done wrong):
{noformat}
drill.yarn.drill-install.client-path does not name a valid archive:
/Users/paulrogers/bin/apache-drill-1.13.0.tar
{noformat}
* Change the newly-added error reporting code in {{DrillOnYarn.displayError}}
to omit the exception cause if it just repeats the main error
message. Here is the full error message from above; the second line is
redundant:
{noformat}
drill.yarn.drill-install.client-path does not name a valid archive:
/Users/paulrogers/bin/apache-drill-1.13.0.tar
Caused by: drill.yarn.drill-install.client-path does not name a valid
archive: /Users/paulrogers/bin/apache-drill-1.13.0.tar
{noformat}
* Add to
[USAGE.md|https://github.com/apache/drill/blob/master/drill-yarn/USAGE.md]
pointers for how to set up a basic HDFS, ZK and YARN configuration. Mostly just
state what is to be done and point to the [relevant Hadoop
docs|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html],
"Pseudo-Distributed Operation". In particular, we want to create an actual
HDFS file system, not use the default of local file system.
* Add to {{USAGE.md}} a description of the supported YARN (actually Hadoop)
versions. Feature was developed with 2.7.1. Currently verifying with 2.9.0.
Probably needs to be rechecked on the 3.x series.
* Add to {{USAGE.md}} the fact that Drill is built with, and includes the jars
for, Hadoop 2.7.1. It is not clear what version compatibility Hadoop has; are
these jars compatible with the latest 2.x series Hadoop? With Hadoop 3.x?
* Until DRILL-6268 is fixed, explain that the HDFS configuration *must* use
port 8020.
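The two archive-suffix items above could be handled together. What follows is only a minimal sketch in Java, not the actual {{DrillOnYarnConfig}} code: the class name {{ArchiveSuffixSketch}}, the {{describeProblem}} helper, the list of accepted suffixes, and the message wording are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; the names and the suffix list are assumptions. */
final class ArchiveSuffixSketch {

  // Assumed set of accepted archive suffixes; ".tar" is the proposed addition.
  static final List<String> SUFFIXES =
      Arrays.asList(".tar.gz", ".tgz", ".zip", ".tar");

  /** Returns the recognized suffix of the path, or null if there is none. */
  static String findSuffix(String path) {
    String lower = path.toLowerCase();
    for (String suffix : SUFFIXES) {
      if (lower.endsWith(suffix)) {
        return suffix;
      }
    }
    return null;
  }

  /**
   * Returns null for a valid archive name; otherwise a message that says
   * whether the suffix is missing or merely unsupported.
   * Limitation: a name such as "apache-drill-1.13.0" is reported as having
   * the unsupported suffix ".0", since the text after the last dot is used.
   */
  static String describeProblem(String key, String path) {
    if (findSuffix(path) != null) {
      return null;
    }
    String name = path.substring(path.lastIndexOf('/') + 1);
    int dot = name.lastIndexOf('.');
    if (dot <= 0) {
      return key + " has no archive suffix: " + path;
    }
    return key + " has unsupported archive suffix \""
        + name.substring(dot) + "\": " + path;
  }
}
```

With {{.tar}} in the list, the path from the error above is accepted; a name with some other extension gets a message that names the offending suffix instead of the generic "does not name a valid archive".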
None of these are show stoppers; each is instead just a bit of sand in the
gears that makes progress a bit slower than it need be.
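For the {{DrillOnYarn.displayError}} item, one possible approach is to walk the cause chain and skip any cause whose message repeats the one just printed. This is a sketch under assumptions, not the existing Drill code: the class {{ErrorDisplaySketch}} and its {{format}} method are hypothetical names.

```java
/** Hypothetical formatter; not the actual DrillOnYarn.displayError code. */
final class ErrorDisplaySketch {

  /** Formats an error, omitting causes that repeat the previous message. */
  static String format(Throwable e) {
    StringBuilder buf = new StringBuilder();
    String last = String.valueOf(e.getMessage());
    buf.append(last);
    for (Throwable cause = e.getCause(); cause != null; cause = cause.getCause()) {
      String msg = cause.getMessage();
      // Skip causes with no message or with the same message as the line above.
      if (msg == null || msg.equals(last)) {
        continue;
      }
      buf.append(System.lineSeparator()).append("Caused by: ").append(msg);
      last = msg;
    }
    return buf.toString();
  }
}
```

Under this sketch, the redundant "Caused by:" line in the example above would be dropped, while a cause carrying a different message would still be shown.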
was:
As part of the Drill 1.13 release process, I tested out DoY after a year of not
having used it. That time gap pointed out some improvements for first-time
users.
* Copy the
[USAGE.md|https://github.com/apache/drill/blob/master/drill-yarn/USAGE.md] file
into the Drill home directory with the name "DRILL_YARN_USAGE.md".
* Change the {{drill-on-yarn-example.conf}} file to be a valid file for the
default Drill and YARN configurations.
{noformat}
heap: "2G"
max-direct-memory: "2G"
memory-mb: 5125
{noformat}
* Change the {{drill-on-yarn-example.conf}} to disable SSL by default. Just
comment out the following line:
{noformat}
#ssl-enabled: true
{noformat}
* Change the {{drill-on-yarn-example.conf}} to disable authorization by
default. That is, comment out the following line:
{noformat}
#auth-type: "drill"
{noformat}
* Change {{DrillOnYarnConfig.findSuffix}} to allow the {{.tar}} suffix. This
is what one ends up with if the Mac does its automatic extract. A tar file is
larger than the compressed version, but there is no reason it should not be
allowed (assuming YARN supports it).
* Otherwise, change {{DrillOnYarnConfig.getRemoteDrillHome()}}, where we emit
the error "does not name a valid archive", to differentiate between no
suffix and an unsupported suffix. (I got the following error and had to
look at the source to figure out what I'd done wrong):
{noformat}
drill.yarn.drill-install.client-path does not name a valid archive:
/Users/paulrogers/bin/apache-drill-1.13.0.tar
{noformat}
* Change the newly-added error reporting code in {{DrillOnYarn.displayError}}
to omit the exception cause if it just repeats the main error
message. Here is the full error message from above; the second line is
redundant:
{noformat}
drill.yarn.drill-install.client-path does not name a valid archive:
/Users/paulrogers/bin/apache-drill-1.13.0.tar
Caused by: drill.yarn.drill-install.client-path does not name a valid
archive: /Users/paulrogers/bin/apache-drill-1.13.0.tar
{noformat}
* Add to
[USAGE.md|https://github.com/apache/drill/blob/master/drill-yarn/USAGE.md]
pointers for how to set up a basic HDFS, ZK and YARN configuration. Mostly just
state what is to be done and point to the [relevant Hadoop
docs|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html],
"Pseudo-Distributed Operation". In particular, we want to create an actual
HDFS file system, not use the default of local file system.
* Add to {{USAGE.md}} a description of the supported YARN (actually Hadoop)
versions. Feature was developed with 2.7.1. Currently verifying with 2.9.0.
Probably needs to be rechecked on the 3.x series.
* Add to {{USAGE.md}} the fact that Drill is built with, and includes the jars
for, Hadoop 2.7.1. It is not clear what version compatibility Hadoop has; are
these jars compatible with the latest 2.x series Hadoop? With Hadoop 3.x?
* Until DRILL-6268 is fixed, explain that the HDFS configuration *must* use
port 8020.
None of these are show stoppers; each is instead just a bit of sand in the
gears that makes progress a bit slower than it need be.
> Improvements to DoY initial experience
> --------------------------------------
>
> Key: DRILL-6263
> URL: https://issues.apache.org/jira/browse/DRILL-6263
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.13.0
> Reporter: Paul Rogers
> Priority: Minor
> Fix For: 1.14.0
>
>
> As part of the Drill 1.13 release process, I tested out DoY after a year of
> not having used it. That time gap pointed out some improvements for
> first-time users.
> * Copy the
> [USAGE.md|https://github.com/apache/drill/blob/master/drill-yarn/USAGE.md]
> file into the Drill home directory with the name "DRILL_YARN_USAGE.md".
> * Change the {{drill-on-yarn-example.conf}} file to be a valid file for the
> default Drill and YARN configurations.
> {noformat}
> heap: "2G"
> max-direct-memory: "2G"
> memory-mb: 5125
> {noformat}
> * Change the {{drill-on-yarn-example.conf}} to disable SSL by default. Just
> comment out the following line:
> {noformat}
> #ssl-enabled: true
> {noformat}
> * Change the {{drill-on-yarn-example.conf}} to disable authorization by
> default. That is, comment out the following line:
> {noformat}
> #auth-type: "drill"
> {noformat}
> * Change the {{drill-on-yarn-example.conf}} to use no AM node labels by
> default. That is, comment out the following line:
> {noformat}
> #node-label-expr: "drill-am"
> {noformat}
> * Change {{DrillOnYarnConfig.findSuffix}} to allow the {{.tar}} suffix. This
> is what one ends up with if the Mac does its automatic extract. A tar file is
> larger than the compressed version, but there is no reason it should not be
> allowed (assuming YARN supports it).
> * Otherwise, change {{DrillOnYarnConfig.getRemoteDrillHome()}}, where we emit
> the error "does not name a valid archive", to differentiate between no
> suffix and an unsupported suffix. (I got the following error and had to
> look at the source to figure out what I'd done wrong):
> {noformat}
> drill.yarn.drill-install.client-path does not name a valid archive:
> /Users/paulrogers/bin/apache-drill-1.13.0.tar
> {noformat}
> * Change the newly-added error reporting code in {{DrillOnYarn.displayError}}
> to omit the exception cause if it just repeats the main error
> message. Here is the full error message from above; the second line is
> redundant:
> {noformat}
> drill.yarn.drill-install.client-path does not name a valid archive:
> /Users/paulrogers/bin/apache-drill-1.13.0.tar
> Caused by: drill.yarn.drill-install.client-path does not name a valid
> archive: /Users/paulrogers/bin/apache-drill-1.13.0.tar
> {noformat}
> * Add to
> [USAGE.md|https://github.com/apache/drill/blob/master/drill-yarn/USAGE.md]
> pointers for how to set up a basic HDFS, ZK and YARN configuration. Mostly
> just state what is to be done and point to the [relevant Hadoop
> docs|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html],
> "Pseudo-Distributed Operation". In particular, we want to create an actual
> HDFS file system, not use the default of local file system.
> * Add to {{USAGE.md}} a description of the supported YARN (actually Hadoop)
> versions. Feature was developed with 2.7.1. Currently verifying with 2.9.0.
> Probably needs to be rechecked on the 3.x series.
> * Add to {{USAGE.md}} the fact that Drill is built with, and includes the
> jars for, Hadoop 2.7.1. It is not clear what version compatibility Hadoop
> has; are these jars compatible with the latest 2.x series Hadoop? With Hadoop
> 3.x?
> * Until DRILL-6268 is fixed, explain that the HDFS configuration *must* use
> port 8020.
> None of these are show stoppers; each is instead just a bit of sand in the
> gears that makes progress a bit slower than it need be.