[jira] [Updated] (DRILL-6263) Improvements to DoY initial experience

Paul Rogers (JIRA) Sat, 17 Mar 2018 16:12:00 -0700

     [ 
https://issues.apache.org/jira/browse/DRILL-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Paul Rogers updated DRILL-6263:
-------------------------------
    Description: 
As part of the Drill 1.13 release process, I tested out DoY after a year of not 
having used it. That time gap pointed out some improvements for first-time 
users.

* Copy the 
[USAGE.md|https://github.com/apache/drill/blob/master/drill-yarn/USAGE.md] file 
into the Drill home directory with the name "DRILL_YARN_USAGE.md".

* Change the {{drill-on-yarn-example.conf}} file to be a valid file for the 
default Drill and YARN configurations.

{noformat}
    heap: "2G"
    max-direct-memory: "2G"
    memory-mb: 5125
{noformat}

* Change the {{drill-on-yarn-example.conf}} to disable SSL by default. Just 
comment out the following line:

{noformat}
    #ssl-enabled: true
{noformat}

* Change the {{drill-on-yarn-example.conf}} to disable authorization by 
default. That is, comment out the following line:

{noformat}
    #auth-type: "drill"
{noformat}

* Change the {{drill-on-yarn-example.conf}} to use no AM node labels by 
default. That is, comment out the following line:

{noformat}
    #node-label-expr: "drill-am"
{noformat}

* Change {{DrillOnYarnConfig.findSuffix}} to allow the {{.tar}} suffix. This 
is what one ends up with if the Mac does its automatic extract. A tar file is 
larger than the compressed version, but there is no reason it should not be 
allowed (assuming YARN supports it).

* Otherwise, change {{DrillOnYarnConfig.getRemoteDrillHome()}}, where we emit 
the error "does not name a valid archive", to differentiate between no 
suffix and an unsupported suffix. (I got the following error and had to 
look at the source to figure out what I'd done wrong):

{noformat}
drill.yarn.drill-install.client-path does not name a valid archive: 
/Users/paulrogers/bin/apache-drill-1.13.0.tar
{noformat}
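
A minimal sketch of how the two suffix items above might fit together. This is not the existing {{DrillOnYarnConfig}} code: the class name, the method signatures, the message wording and the list of accepted suffixes are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper for illustration; not the actual DrillOnYarnConfig code. */
final class ArchiveSuffixCheck {

  // Assumed set of accepted suffixes, with ".tar" added as proposed above.
  private static final List<String> SUPPORTED =
      Arrays.asList(".tar.gz", ".tgz", ".zip", ".tar");

  /** Returns the supported archive suffix of the path, if it has one. */
  static Optional<String> findSuffix(String path) {
    String lower = path.toLowerCase(Locale.ROOT);
    for (String suffix : SUPPORTED) {
      if (lower.endsWith(suffix)) {
        return Optional.of(suffix);
      }
    }
    return Optional.empty();
  }

  /**
   * Returns an error message for an invalid archive path, or empty if the
   * path is acceptable. Distinguishes "no suffix" from "unsupported suffix".
   */
  static Optional<String> describeProblem(String key, String path) {
    if (findSuffix(path).isPresent()) {
      return Optional.empty();
    }
    String name = path.substring(path.lastIndexOf('/') + 1);
    int dot = name.lastIndexOf('.');
    if (dot <= 0) {
      return Optional.of(key + " has no archive suffix: " + path);
    }
    // Simplification: a name ending in a dotted version number (for example
    // "drill-1.13.0") is reported here as having an unsupported suffix.
    return Optional.of(key + " has unsupported archive suffix \""
        + name.substring(dot) + "\": " + path);
  }
}
```

With something along these lines, the {{.tar}} path from the error above would be accepted, while a path with an unknown extension or with no extension at all would each get its own message.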

* Change the newly-added error reporting code in {{DrillOnYarn.displayError}} 
to omit the exception cause if it just repeats the main error 
message. Here is the full error message from above; the second line is 
redundant:

{noformat}
drill.yarn.drill-install.client-path does not name a valid archive: 
/Users/paulrogers/bin/apache-drill-1.13.0.tar
  Caused by: drill.yarn.drill-install.client-path does not name a valid 
archive: /Users/paulrogers/bin/apache-drill-1.13.0.tar
{noformat}
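
One possible shape for the de-duplication, as a rough sketch only. The {{ErrorDisplay}} class and its {{format}} method are hypothetical names, not the current {{DrillOnYarn.displayError}} implementation, and the output layout simply mimics the message shown above.

```java
/** Hypothetical formatter for illustration; not the actual DrillOnYarn code. */
final class ErrorDisplay {

  /** Uses the exception message, falling back to the class name if it is null. */
  private static String text(Throwable t) {
    return t.getMessage() != null ? t.getMessage() : t.getClass().getSimpleName();
  }

  /**
   * Formats an exception and its causes, skipping any cause whose text
   * merely repeats the line printed just before it.
   */
  static String format(Throwable e) {
    StringBuilder buf = new StringBuilder();
    String last = text(e);
    buf.append(last);
    for (Throwable cause = e.getCause(); cause != null; cause = cause.getCause()) {
      String msg = text(cause);
      if (msg.equals(last)) {
        continue; // Redundant: same text as the previous line.
      }
      buf.append(System.lineSeparator()).append("  Caused by: ").append(msg);
      last = msg;
    }
    return buf.toString();
  }
}
```

For the archive error above, where the wrapper and its cause carry the same text, this would print only the first line; a cause with a different message would still be shown.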

* Add to 
[USAGE.md|https://github.com/apache/drill/blob/master/drill-yarn/USAGE.md] 
pointers for how to set up a basic HDFS, ZK and YARN configuration. Mostly just 
state what is to be done and point to the [relevant Hadoop 
docs|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html],
 "Pseudo-Distributed Operation". In particular, we want to create an actual 
HDFS file system rather than use the default local file system.

* Add to {{USAGE.md}} a description of the supported YARN (actually Hadoop) 
versions. Feature was developed with 2.7.1. Currently verifying with 2.9.0. 
Probably needs to be rechecked on the 3.x series.

* Add to {{USAGE.md}} the fact that Drill is built with, and includes the jars 
for, Hadoop 2.7.1. It is not clear what version compatibility Hadoop has; are 
these jars compatible with the latest 2.x series Hadoop? With Hadoop 3.x?

* Until DRILL-6268 is fixed, explain that the HDFS configuration *must* use 
port 8020.

None of these are show stoppers; each is just a bit of sand in the 
gears that makes progress a bit slower than it need be.

  was:
As part of the Drill 1.13 release process, I tested out DoY after a year of not 
having used it. That time gap pointed out some improvements for first-time 
users.

* Copy the 
[USAGE.md|https://github.com/apache/drill/blob/master/drill-yarn/USAGE.md] file 
into the Drill home directory with the name "DRILL_YARN_USAGE.md".

* Change the {{drill-on-yarn-example.conf}} file to be a valid file for the 
default Drill and YARN configurations.

{noformat}
    heap: "2G"
    max-direct-memory: "2G"
    memory-mb: 5125
{noformat}

* Change the {{drill-on-yarn-example.conf}} to disable SSL by default. Just 
comment out the following line:

{noformat}
    #ssl-enabled: true
{noformat}

* Change the {{drill-on-yarn-example.conf}} to disable authorization by 
default. That is, comment out the following line:

{noformat}
    #auth-type: "drill"
{noformat}

* Change {{DrillOnYarnConfig.findSuffix}} to allow the {{.tar}} suffix. This 
is what one ends up with if the Mac does its automatic extract. A tar file is 
larger than the compressed version, but there is no reason it should not be 
allowed (assuming YARN supports it).

* Otherwise, change {{DrillOnYarnConfig.getRemoteDrillHome()}}, where we emit 
the error "does not name a valid archive", to differentiate between no 
suffix and an unsupported suffix. (I got the following error and had to 
look at the source to figure out what I'd done wrong):

{noformat}
drill.yarn.drill-install.client-path does not name a valid archive: 
/Users/paulrogers/bin/apache-drill-1.13.0.tar
{noformat}

* Change the newly-added error reporting code in {{DrillOnYarn.displayError}} 
to omit the exception cause if it just repeats the main error 
message. Here is the full error message from above; the second line is 
redundant:

{noformat}
drill.yarn.drill-install.client-path does not name a valid archive: 
/Users/paulrogers/bin/apache-drill-1.13.0.tar
  Caused by: drill.yarn.drill-install.client-path does not name a valid 
archive: /Users/paulrogers/bin/apache-drill-1.13.0.tar
{noformat}

* Add to 
[USAGE.md|https://github.com/apache/drill/blob/master/drill-yarn/USAGE.md] 
pointers for how to set up a basic HDFS, ZK and YARN configuration. Mostly just 
state what is to be done and point to the [relevant Hadoop 
docs|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html],
 "Pseudo-Distributed Operation". In particular, we want to create an actual 
HDFS file system rather than use the default local file system.

* Add to {{USAGE.md}} a description of the supported YARN (actually Hadoop) 
versions. Feature was developed with 2.7.1. Currently verifying with 2.9.0. 
Probably needs to be rechecked on the 3.x series.

* Add to {{USAGE.md}} the fact that Drill is built with, and includes the jars 
for, Hadoop 2.7.1. It is not clear what version compatibility Hadoop has; are 
these jars compatible with the latest 2.x series Hadoop? With Hadoop 3.x?

* Until DRILL-6268 is fixed, explain that the HDFS configuration *must* use 
port 8020.

None of these are show stoppers; each is just a bit of sand in the 
gears that makes progress a bit slower than it need be.


> Improvements to DoY initial experience
> --------------------------------------
>
>                 Key: DRILL-6263
>                 URL: https://issues.apache.org/jira/browse/DRILL-6263
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.13.0
>            Reporter: Paul Rogers
>            Priority: Minor
>             Fix For: 1.14.0



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
