[jira] [Commented] (HADOOP-11296) hadoop-daemons.sh throws 'host1: bash: host3: command not found...'

2014-11-12 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208697#comment-14208697
 ] 

Allen Wittenauer commented on HADOOP-11296:
---

It looks like this is dependent upon either the version of bash or the version 
of xargs, as I'm having trouble reproducing this on OS X.  Even the test code 
gives me the expected output.  What version is showing the problem?

 hadoop-daemons.sh throws 'host1: bash: host3: command not found...'
 ---

 Key: HADOOP-11296
 URL: https://issues.apache.org/jira/browse/HADOOP-11296
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 3.0.0
Reporter: Vinayakumar B
Assignee: Vinayakumar B
 Attachments: HADOOP-11296-001.patch, HADOOP-11296-002.patch


 *hadoop-daemons.sh* throws command not found.
 {noformat}
 [vinay@host2 install]$ /home/vinay/install/hadoop/sbin/hadoop-daemons.sh --config /home/vinay/install/conf --hostnames 'host1 host2' start namenode
 host1: bash: host2: command not found...
 {noformat}
 *hadoop-daemons.sh* is mainly used to start the cluster, for example via start-dfs.sh.
 Without this fix, the cluster will not start.
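A minimal shell reduction of the symptom (hypothetical; it mirrors the reported error, not the actual hadoop-daemons.sh internals): when the space-separated --hostnames value is expanded unquoted, word splitting hands the second hostname to the remote shell as a command.

```shell
# Hypothetical reduction of the bug: $hostlist is expanded unquoted, so
# "host2" survives as a separate word and ends up as the "command" the
# remote shell (stubbed out here) is asked to run.
hostlist="host1 host2"

remote_exec() {
  # stand-in for `ssh "$target" "$@"`; just report what would run remotely
  printf 'remote shell runs: %s\n' "$1"
}

set -- $hostlist   # unquoted on purpose: this is the bug
target=$1
shift              # drop the target host; "host2" is now the "command"
remote_exec "$1"   # prints: remote shell runs: host2
```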



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11298) slaves.sh is missing a /

2014-11-12 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-11298:
--
Component/s: scripts

 slaves.sh is missing a / 
 -

 Key: HADOOP-11298
 URL: https://issues.apache.org/jira/browse/HADOOP-11298
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 3.0.0
Reporter: Allen Wittenauer
Priority: Trivial
  Labels: newbie, shell

 Just need to turn dev/null into /dev/null in the cd statement in the preamble.
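For illustration, a runnable demonstration of what the typo does (a sketch of the failure mode, not the script itself): redirecting to the relative path dev/null creates an ordinary file instead of discarding output.

```shell
# The typo'd redirect writes to a regular file named dev/null relative to
# the current directory; the intended /dev/null discards the output.
demo=$(mktemp -d)
cd "$demo"
mkdir dev
echo "noise" > dev/null     # typo: creates/overwrites ./dev/null
test -s dev/null && echo "oops: ./dev/null is a real, non-empty file"
echo "noise" > /dev/null    # correct: output is discarded
```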





[jira] [Updated] (HADOOP-11298) slaves.sh is missing a /

2014-11-12 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-11298:
--
Affects Version/s: 3.0.0

 slaves.sh is missing a / 
 -

 Key: HADOOP-11298
 URL: https://issues.apache.org/jira/browse/HADOOP-11298
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 3.0.0
Reporter: Allen Wittenauer
Priority: Trivial
  Labels: newbie, shell

 Just need to turn dev/null into /dev/null in the cd statement in the preamble.





[jira] [Created] (HADOOP-11298) slaves.sh is missing a /

2014-11-12 Thread Allen Wittenauer (JIRA)
Allen Wittenauer created HADOOP-11298:
-

 Summary: slaves.sh is missing a / 
 Key: HADOOP-11298
 URL: https://issues.apache.org/jira/browse/HADOOP-11298
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Allen Wittenauer
Priority: Trivial


Just need to turn dev/null into /dev/null in the cd statement in the preamble.





[jira] [Commented] (HADOOP-11296) hadoop-daemons.sh throws 'host1: bash: host3: command not found...'

2014-11-12 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208735#comment-14208735
 ] 

Allen Wittenauer commented on HADOOP-11296:
---

OK, it looks like OS X is the outlier.  I've been able to reproduce this on 
both Linux and Illumos. As part of that, it looks like the Illumos xargs 
doesn't support the -P parameter (SVID issue? non-POSIX extension?).  At 
first glance, the patch seems reasonable, but I want to test a few things 
out. :)
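For what it's worth, -P support can be probed at runtime; this is a hedged sketch of one possible approach (my own, not what the attached patch does):

```shell
# Detect whether this xargs implementation supports the non-POSIX -P
# (parallel) flag; fall back to serial invocation when it does not
# (e.g. an older SysV-derived xargs).
if printf 'x\n' | xargs -n 1 -P 2 echo >/dev/null 2>&1; then
  xargs_parallel="-P 4"
else
  xargs_parallel=""
fi
# $xargs_parallel is deliberately unquoted so it splits into flag + value
printf 'host1\nhost2\n' | xargs -n 1 $xargs_parallel echo "would ssh" | sort
```

With -P available, output order is nondeterministic, hence the sort at the end.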

 hadoop-daemons.sh throws 'host1: bash: host3: command not found...'
 ---

 Key: HADOOP-11296
 URL: https://issues.apache.org/jira/browse/HADOOP-11296
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 3.0.0
Reporter: Vinayakumar B
Assignee: Vinayakumar B
 Attachments: HADOOP-11296-001.patch, HADOOP-11296-002.patch


 *hadoop-daemons.sh* throws command not found.
 {noformat}
 [vinay@host2 install]$ /home/vinay/install/hadoop/sbin/hadoop-daemons.sh --config /home/vinay/install/conf --hostnames 'host1 host2' start namenode
 host1: bash: host2: command not found...
 {noformat}
 *hadoop-daemons.sh* is mainly used to start the cluster, for example via start-dfs.sh.
 Without this fix, the cluster will not start.





[jira] [Commented] (HADOOP-11257) Update hadoop jar documentation to warn against using it for launching yarn jars

2014-11-12 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208891#comment-14208891
 ] 

Allen Wittenauer commented on HADOOP-11257:
---

While this is nice and all, I'm not really sure if it ultimately fixes 
anything.  We still have two code paths to test.

 Update hadoop jar documentation to warn against using it for launching yarn 
 jars
 --

 Key: HADOOP-11257
 URL: https://issues.apache.org/jira/browse/HADOOP-11257
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Allen Wittenauer
Assignee: Masatake Iwasaki
 Attachments: HADOOP-11257.1.patch, HADOOP-11257.1.patch, 
 HADOOP-11257.2.patch, HADOOP-11257.3.patch


 We should update the hadoop jar documentation to warn against using it for 
 launching yarn jars.





[jira] [Commented] (HADOOP-11025) hadoop-daemons.sh should just call hdfs directly

2014-11-12 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208910#comment-14208910
 ] 

Allen Wittenauer commented on HADOOP-11025:
---

+1 will commit to trunk.

Thanks!!

 hadoop-daemons.sh should just call hdfs directly
 

 Key: HADOOP-11025
 URL: https://issues.apache.org/jira/browse/HADOOP-11025
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Reporter: Allen Wittenauer
Assignee: Masatake Iwasaki
 Attachments: HADOOP-11025.1.patch, HADOOP-11025.2.patch


 There is little-to-no reason for it to call hadoop-daemon.sh anymore.





[jira] [Resolved] (HADOOP-11025) hadoop-daemons.sh should just call hdfs directly

2014-11-12 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-11025.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

Committed to trunk.

 hadoop-daemons.sh should just call hdfs directly
 

 Key: HADOOP-11025
 URL: https://issues.apache.org/jira/browse/HADOOP-11025
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Reporter: Allen Wittenauer
Assignee: Masatake Iwasaki
 Fix For: 3.0.0

 Attachments: HADOOP-11025.1.patch, HADOOP-11025.2.patch


 There is little-to-no reason for it to call hadoop-daemon.sh anymore.





[jira] [Commented] (HADOOP-11284) Fix variable name mismatch in hadoop-functions.sh

2014-11-12 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208937#comment-14208937
 ] 

Allen Wittenauer commented on HADOOP-11284:
---

Ugh.  I really messed these up.  

Thanks for finding them. :)

+1 will commit to trunk.

 Fix variable name mismatch in hadoop-functions.sh
 -

 Key: HADOOP-11284
 URL: https://issues.apache.org/jira/browse/HADOOP-11284
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor
 Attachments: HADOOP-11284.1.patch


 Some functions use variables that are not passed in as arguments but are 
 defined outside the function. These variables are used as the name of the 
 pid file. Though hadoop-functions.sh works by chance now, it should be 
 fixed to avoid future bugs.
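The failure mode being described can be sketched like this (hypothetical function and variable names, not the actual hadoop-functions.sh code):

```shell
# Fragile: the function silently reads $daemonname from whatever caller
# scope happens to have set it -- the hidden dependency described above.
pidfile_fragile() {
  echo "/tmp/hadoop-${daemonname}.pid"
}

# Robust: the name is an explicit argument, so the dependency is visible
# and the function cannot pick up a stale value by accident.
pidfile() {
  local daemonname="$1"
  echo "/tmp/hadoop-${daemonname}.pid"
}

pidfile namenode   # prints: /tmp/hadoop-namenode.pid
```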





[jira] [Updated] (HADOOP-11284) Fix variable name mismatches in hadoop-functions.sh

2014-11-12 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-11284:
--
Summary: Fix variable name mismatches in hadoop-functions.sh  (was: Fix 
variable name mismatch in hadoop-functions.sh)

 Fix variable name mismatches in hadoop-functions.sh
 ---

 Key: HADOOP-11284
 URL: https://issues.apache.org/jira/browse/HADOOP-11284
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor
 Attachments: HADOOP-11284.1.patch


 Some functions use variables that are not passed in as arguments but are 
 defined outside the function. These variables are used as the name of the 
 pid file. Though hadoop-functions.sh works by chance now, it should be 
 fixed to avoid future bugs.





[jira] [Resolved] (HADOOP-11284) Fix variable name mismatches in hadoop-functions.sh

2014-11-12 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-11284.
---
   Resolution: Fixed
Fix Version/s: 3.0.0

Committed to trunk.

Thanks!

 Fix variable name mismatches in hadoop-functions.sh
 ---

 Key: HADOOP-11284
 URL: https://issues.apache.org/jira/browse/HADOOP-11284
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor
 Fix For: 3.0.0

 Attachments: HADOOP-11284.1.patch


 Some functions use variables that are not passed in as arguments but are 
 defined outside the function. These variables are used as the name of the 
 pid file. Though hadoop-functions.sh works by chance now, it should be 
 fixed to avoid future bugs.





[jira] [Updated] (HADOOP-11298) slaves.sh and stop-all.sh are missing slashes

2014-11-12 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-11298:
--
Summary: slaves.sh and stop-all.sh are missing slashes   (was: slaves.sh is 
missing a / )

 slaves.sh and stop-all.sh are missing slashes 
 --

 Key: HADOOP-11298
 URL: https://issues.apache.org/jira/browse/HADOOP-11298
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 3.0.0
Reporter: Allen Wittenauer
Priority: Trivial
  Labels: newbie, shell

 Just need to turn dev/null into /dev/null in the cd statement in the preamble.





[jira] [Updated] (HADOOP-11298) slaves.sh and stop-all.sh are missing slashes

2014-11-12 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-11298:
--
Status: Patch Available  (was: Open)

 slaves.sh and stop-all.sh are missing slashes 
 --

 Key: HADOOP-11298
 URL: https://issues.apache.org/jira/browse/HADOOP-11298
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 3.0.0
Reporter: Allen Wittenauer
Priority: Trivial
  Labels: newbie, shell
 Attachments: HADOOP-11298.patch


 Just need to turn dev/null into /dev/null in the cd statement in the preamble.





[jira] [Updated] (HADOOP-11298) slaves.sh and stop-all.sh are missing slashes

2014-11-12 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-11298:
--
Attachment: HADOOP-11298.patch

Effectively, a two-character patch.  I wonder if this is a record.

 slaves.sh and stop-all.sh are missing slashes 
 --

 Key: HADOOP-11298
 URL: https://issues.apache.org/jira/browse/HADOOP-11298
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 3.0.0
Reporter: Allen Wittenauer
Priority: Trivial
  Labels: newbie, shell
 Attachments: HADOOP-11298.patch


 Just need to turn dev/null into /dev/null in the cd statement in the preamble.





[jira] [Assigned] (HADOOP-11298) slaves.sh and stop-all.sh are missing slashes

2014-11-12 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer reassigned HADOOP-11298:
-

Assignee: Allen Wittenauer

 slaves.sh and stop-all.sh are missing slashes 
 --

 Key: HADOOP-11298
 URL: https://issues.apache.org/jira/browse/HADOOP-11298
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 3.0.0
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
Priority: Trivial
  Labels: newbie, shell
 Attachments: HADOOP-11298.patch


 Just need to turn dev/null into /dev/null in the cd statement in the preamble.





[jira] [Commented] (HADOOP-11278) hadoop-daemon.sh script doesn't hornor --config option

2014-11-12 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208996#comment-14208996
 ] 

Allen Wittenauer commented on HADOOP-11278:
---

I realize this is closed, but is there an error condition we should be checking 
for to help prevent this issue in the future?

 hadoop-daemon.sh script doesn't hornor --config option
 --

 Key: HADOOP-11278
 URL: https://issues.apache.org/jira/browse/HADOOP-11278
 Project: Hadoop Common
  Issue Type: Bug
  Components: bin
Affects Versions: 3.0.0
Reporter: Brandon Li







[jira] [Updated] (HADOOP-11278) hadoop-daemon.sh script doesn't honor --config option

2014-11-12 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-11278:
--
Summary: hadoop-daemon.sh script doesn't honor --config option  (was: 
hadoop-daemon.sh script doesn't hornor --config option)

 hadoop-daemon.sh script doesn't honor --config option
 -

 Key: HADOOP-11278
 URL: https://issues.apache.org/jira/browse/HADOOP-11278
 Project: Hadoop Common
  Issue Type: Bug
  Components: bin
Affects Versions: 3.0.0
Reporter: Brandon Li







[jira] [Commented] (HADOOP-11300) KMS startup scripts must not display the keystore / truststore passwords

2014-11-12 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209068#comment-14209068
 ] 

Allen Wittenauer commented on HADOOP-11300:
---

Umm, given this is a -D setting, doesn't that also mean it's passed on the 
command line...  which in turn means that anyone doing a ps or reading /proc 
will also see the password?  It sounds like this security service has a 
pretty major security hole...
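The exposure is easy to demonstrate locally (a generic sketch; any long-running process stands in for the KMS JVM here):

```shell
# Whatever appears on a command line -- including -D...password=... flags --
# is readable by every local user via ps (or /proc/<pid>/cmdline on Linux).
sleep 300 &
pid=$!
ps -o args= -p "$pid"    # prints: sleep 300
kill "$pid"
```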

 KMS startup scripts must not display the keystore / truststore passwords
 

 Key: HADOOP-11300
 URL: https://issues.apache.org/jira/browse/HADOOP-11300
 Project: Hadoop Common
  Issue Type: Bug
  Components: kms
Reporter: Arun Suresh
 Attachments: HADOOP-11300.1.patch


 Sample output of the KMS startup scripts :
 {noformat}
 Setting KMS_HOME:  /usr/lib/hadoop-kms
 Using   KMS_CONFIG:/var/run/kms-config/
 Using   KMS_LOG:   /var/log/kms-log
 Using   KMS_TEMP:   /var/run/kms-tmp/
 Using   KMS_HTTP_PORT: 16000
 Using   KMS_ADMIN_PORT: 16001
 Using   KMS_MAX_THREADS: 250
 Using   KMS_SSL_KEYSTORE_FILE: /etc/conf/kms-keystore.jks
 Using   KMS_SSL_KEYSTORE_PASS: keystorepass
 Using   CATALINA_BASE:   /var/lib/kms/tomcat-deployment
 Using   KMS_CATALINA_HOME:   /usr/lib/hadoop-kms/lib/bigtop-tomcat
 Setting CATALINA_OUT:/var/log/kms-log/kms-catalina.out
 Setting CATALINA_PID:/tmp/kms.pid
 Using   CATALINA_OPTS:   . 
 -Djavax.net.ssl.trustStorePassword=truststorepass 
 Adding to CATALINA_OPTS: -Dkms.home.dir=..  -Dkms.ssl.keystore.pass= 
 keystorepass 
 {noformat}
 The keystore password and truststore password are in clear text, which 
 should be masked.





[jira] [Updated] (HADOOP-11208) Replace daemon with better name in scripts like hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs

2014-11-12 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-11208:
--
Affects Version/s: 3.0.0

 Replace daemon with better name in scripts like 
 hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs
 ---

 Key: HADOOP-11208
 URL: https://issues.apache.org/jira/browse/HADOOP-11208
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Yongjun Zhang
Assignee: Allen Wittenauer

 Per discussion in HDFS-7204, creating this jira.
 Thanks [~aw] for the work on HDFS-7204.





[jira] [Updated] (HADOOP-11208) Replace daemon with better name in scripts like hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs

2014-11-12 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-11208:
--
Attachment: HADOOP-11208.patch

Changes the local 'daemon' variable to 'supportdaemonization'.

Documentation change will be made on the wiki to the shell scripting guide 
after commit.

 Replace daemon with better name in scripts like 
 hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs
 ---

 Key: HADOOP-11208
 URL: https://issues.apache.org/jira/browse/HADOOP-11208
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Yongjun Zhang
Assignee: Allen Wittenauer
 Attachments: HADOOP-11208.patch


 Per discussion in HDFS-7204, creating this jira.
 Thanks [~aw] for the work on HDFS-7204.





[jira] [Assigned] (HADOOP-11208) Replace daemon with better name in scripts like hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs

2014-11-12 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer reassigned HADOOP-11208:
-

Assignee: Allen Wittenauer

 Replace daemon with better name in scripts like 
 hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs
 ---

 Key: HADOOP-11208
 URL: https://issues.apache.org/jira/browse/HADOOP-11208
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Yongjun Zhang
Assignee: Allen Wittenauer

 Per discussion in HDFS-7204, creating this jira.
 Thanks [~aw] for the work on HDFS-7204.





[jira] [Commented] (HADOOP-11278) hadoop-daemon.sh script doesn't honor --config option

2014-11-12 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209171#comment-14209171
 ] 

Allen Wittenauer commented on HADOOP-11278:
---

Yeah, --debug isn't passed through and it probably should be.  I'm sure that 
would have helped tremendously!

 hadoop-daemon.sh script doesn't honor --config option
 -

 Key: HADOOP-11278
 URL: https://issues.apache.org/jira/browse/HADOOP-11278
 Project: Hadoop Common
  Issue Type: Bug
  Components: bin
Affects Versions: 3.0.0
Reporter: Brandon Li







[jira] [Updated] (HADOOP-8989) hadoop fs -find feature

2014-11-13 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-8989:
-
Summary: hadoop fs -find feature  (was: hadoop dfs -find feature)

 hadoop fs -find feature
 ---

 Key: HADOOP-8989
 URL: https://issues.apache.org/jira/browse/HADOOP-8989
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Marco Nicosia
Assignee: Jonathan Allen
 Attachments: HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, 
 HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, 
 HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, 
 HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, 
 HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, 
 HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch


 Both sysadmins and users make frequent use of the unix 'find' command, but 
 Hadoop has no correlate. Without this, users are writing scripts which make 
 heavy use of hadoop dfs -lsr, and implementing find one-offs. I think hdfs 
 -lsr is somewhat taxing on the NameNode, and a really slow experience on the 
 client side. Possibly an in-NameNode find operation would be only a bit more 
 taxing on the NameNode, but significantly faster from the client's point of 
 view?
 The minimum set of options I can think of which would make a Hadoop find 
 command generally useful is (in priority order):
 * -type (file or directory, for now)
 * -atime/-ctime/-mtime (... and -creationtime?) (both + and - arguments)
 * -print0 (for piping to xargs -0)
 * -depth
 * -owner/-group (and -nouser/-nogroup)
 * -name (allowing for shell pattern, or even regex?)
 * -perm
 * -size
 One possible special case, but could possibly be really cool if it ran from 
 within the NameNode:
 * -delete
 The hadoop dfs -lsr | hadoop dfs -rm cycle is really, really slow.
 Lower priority, some people do use operators, mostly to execute -or searches 
 such as:
 * find / \(-nouser -or -nogroup\)
 Finally, I thought I'd include a link to the [Posix spec for 
 find|http://www.opengroup.org/onlinepubs/009695399/utilities/find.html]
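The -print0 item in the wish list above is easy to motivate with a local find (runnable anywhere; the eventual hadoop fs -find flag set may only cover a subset of the list):

```shell
# Filenames containing spaces survive a pipe to xargs only when the
# records are NUL-delimited: -print0 paired with xargs -0. With plain
# -print, "a b.txt" would be split into two bogus arguments.
demo=$(mktemp -d)
: > "$demo/a b.txt"
find "$demo" -type f -print0 | xargs -0 -n 1 basename   # prints: a b.txt
```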





[jira] [Updated] (HADOOP-8989) hadoop fs -find feature

2014-11-13 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-8989:
-
   Resolution: Fixed
Fix Version/s: 2.7.0
   Status: Resolved  (was: Patch Available)

With several +1's, I've committed this to branch-2 and trunk.

Thanks!

 hadoop fs -find feature
 ---

 Key: HADOOP-8989
 URL: https://issues.apache.org/jira/browse/HADOOP-8989
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Marco Nicosia
Assignee: Jonathan Allen
 Fix For: 2.7.0

 Attachments: HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, 
 HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, 
 HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, 
 HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, 
 HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch, 
 HADOOP-8989.patch, HADOOP-8989.patch, HADOOP-8989.patch


 Both sysadmins and users make frequent use of the unix 'find' command, but 
 Hadoop has no correlate. Without this, users are writing scripts which make 
 heavy use of hadoop dfs -lsr, and implementing find one-offs. I think hdfs 
 -lsr is somewhat taxing on the NameNode, and a really slow experience on the 
 client side. Possibly an in-NameNode find operation would be only a bit more 
 taxing on the NameNode, but significantly faster from the client's point of 
 view?
 The minimum set of options I can think of which would make a Hadoop find 
 command generally useful is (in priority order):
 * -type (file or directory, for now)
 * -atime/-ctime/-mtime (... and -creationtime?) (both + and - arguments)
 * -print0 (for piping to xargs -0)
 * -depth
 * -owner/-group (and -nouser/-nogroup)
 * -name (allowing for shell pattern, or even regex?)
 * -perm
 * -size
 One possible special case, but could possibly be really cool if it ran from 
 within the NameNode:
 * -delete
 The hadoop dfs -lsr | hadoop dfs -rm cycle is really, really slow.
 Lower priority, some people do use operators, mostly to execute -or searches 
 such as:
 * find / \(-nouser -or -nogroup\)
 Finally, I thought I'd include a link to the [Posix spec for 
 find|http://www.opengroup.org/onlinepubs/009695399/utilities/find.html]





[jira] [Updated] (HADOOP-11298) slaves.sh and stop-all.sh are missing slashes

2014-11-13 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-11298:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks!

Committed to trunk.

 slaves.sh and stop-all.sh are missing slashes 
 --

 Key: HADOOP-11298
 URL: https://issues.apache.org/jira/browse/HADOOP-11298
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 3.0.0
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
Priority: Trivial
  Labels: newbie, shell
 Fix For: 3.0.0

 Attachments: HADOOP-11298.patch


 Just need to turn dev/null into /dev/null in the cd statement in the preamble.





[jira] [Updated] (HADOOP-11150) hadoop command should show the reason on failure by invalid COMMAND or CLASSNAME

2014-11-13 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-11150:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

+1 will commit to trunk.

Thanks!

 hadoop command should show the reason on failure by invalid COMMAND or 
 CLASSNAME
 

 Key: HADOOP-11150
 URL: https://issues.apache.org/jira/browse/HADOOP-11150
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor
 Fix For: 3.0.0

 Attachments: HADOOP-11150-0.patch, HADOOP-11150-1.patch


 hadoop_validate_classname checks whether the classname contains a '.'. It 
 is possible that a classname without a package is used in some examples or 
 tutorials.
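A sketch of the check being discussed (an assumed shape; the real hadoop_validate_classname may differ):

```shell
# Accept only names containing a '.', i.e. fully qualified class names;
# bare names like "WordCount" are rejected with a hint on stderr.
validate_classname() {
  case "$1" in
    *.*) return 0 ;;
    *)   echo "ERROR: $1 is not a fully qualified class name" >&2
         return 1 ;;
  esac
}

validate_classname org.example.WordCount && echo "accepted"
validate_classname WordCount || echo "rejected"
```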





[jira] [Commented] (HADOOP-11300) KMS startup scripts must not display the keystore / truststore passwords

2014-11-14 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213227#comment-14213227
 ] 

Allen Wittenauer commented on HADOOP-11300:
---

This feels extremely fragile, but since it's tomcat there's only so much we can 
do. :(  Hopefully in the future we can dump tomcat and all of its extra 
baggage/issues. 

 KMS startup scripts must not display the keystore / truststore passwords
 

 Key: HADOOP-11300
 URL: https://issues.apache.org/jira/browse/HADOOP-11300
 Project: Hadoop Common
  Issue Type: Bug
  Components: kms
Affects Versions: 2.6.0
Reporter: Arun Suresh
Assignee: Arun Suresh
 Attachments: HADOOP-11300.1.patch, HADOOP-11300.2.patch


 Sample output of the KMS startup scripts :
 {noformat}
 Setting KMS_HOME:  /usr/lib/hadoop-kms
 Using   KMS_CONFIG:/var/run/kms-config/
 Using   KMS_LOG:   /var/log/kms-log
 Using   KMS_TEMP:   /var/run/kms-tmp/
 Using   KMS_HTTP_PORT: 16000
 Using   KMS_ADMIN_PORT: 16001
 Using   KMS_MAX_THREADS: 250
 Using   KMS_SSL_KEYSTORE_FILE: /etc/conf/kms-keystore.jks
 Using   KMS_SSL_KEYSTORE_PASS: keystorepass
 Using   CATALINA_BASE:   /var/lib/kms/tomcat-deployment
 Using   KMS_CATALINA_HOME:   /usr/lib/hadoop-kms/lib/bigtop-tomcat
 Setting CATALINA_OUT:/var/log/kms-log/kms-catalina.out
 Setting CATALINA_PID:/tmp/kms.pid
 Using   CATALINA_OPTS:   . 
 -Djavax.net.ssl.trustStorePassword=truststorepass 
 Adding to CATALINA_OPTS: -Dkms.home.dir=..  -Dkms.ssl.keystore.pass= 
 keystorepass 
 {noformat}
 The keystore password and truststore password are in clear text, which 
 should be masked.





[jira] [Commented] (HADOOP-6962) FileSystem.mkdirs(Path, FSPermission) should use the permission for all of the created directories

2012-12-10 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528431#comment-13528431
 ] 

Allen Wittenauer commented on HADOOP-6962:
--

Is there any reason not to make this a blocker for 2.0 or even 1.2.0?  This is 
really causing us 'out here' a lot of pain and really needs to get fixed.

 FileSystem.mkdirs(Path, FSPermission) should use the permission for all of 
 the created directories
 --

 Key: HADOOP-6962
 URL: https://issues.apache.org/jira/browse/HADOOP-6962
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Reporter: Owen O'Malley
Assignee: Daryn Sharp
 Attachments: HADOOP-6962.patch


 Currently, FileSystem.mkdirs only applies the permissions to the last level 
 if it was created. It should be applied to *all* levels that are created.
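A local analogue of the reported behavior (POSIX mkdir -p -m, not HDFS itself): the requested mode lands only on the final component, while intermediate directories get the umask default.

```shell
# With umask 022, the intermediate dir comes out 755 while only the leaf
# honors the -m 700 request -- the same shape as the HDFS mkdirs bug.
# Note: stat -c is a GNU coreutils option (BSD stat uses -f).
demo=$(mktemp -d)
( umask 022 && mkdir -p -m 700 "$demo/a/b" )
stat -c '%a %n' "$demo/a" "$demo/a/b"
```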

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-6962) FileSystem.mkdirs(Path, FSPermission) should use the permission for all of the created directories

2012-12-10 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-6962:
-

Labels: security  (was: )

 FileSystem.mkdirs(Path, FSPermission) should use the permission for all of 
 the created directories
 --

 Key: HADOOP-6962
 URL: https://issues.apache.org/jira/browse/HADOOP-6962
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Reporter: Owen O'Malley
Assignee: Daryn Sharp
  Labels: security
 Attachments: HADOOP-6962.patch


 Currently, FileSystem.mkdirs only applies the permissions to the last level 
 if it was created. It should be applied to *all* levels that are created.



[jira] [Commented] (HADOOP-9160) Adopt JMX for management protocols

2012-12-30 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541180#comment-13541180
 ] 

Allen Wittenauer commented on HADOOP-9160:
--

bq. The fsck operation takes a long time to complete, could have lot of output 
streamed as response for a long time. 

HDFS-2538 fixes this problem: output is reduced, fsck runs faster, and it's 
much easier for ops teams to build tools around.  From a JMX perspective, it 
would just need to provide a percentage.

I can understand the desire to do JMX.  It's a de facto standard supported by 
many industry tools.  That said...

If we put admin interfaces in JMX, then we need to be concerned about security. 
 When I last looked at it, JMX requires the use of keystores full of certs in 
order to handle multiple identities.  PKI+keystores means a lot of pain on the 
ops side of the house.  So if we enable JMX for any 'writable' interfaces, we 
need to have a way to turn it off so that those of us that don't want to go 
through that pain and still have a secure system can stick with Hadoop RPC/HTTP 
with GSSAPI/SPNEGO.

 Adopt JMX for management protocols
 --

 Key: HADOOP-9160
 URL: https://issues.apache.org/jira/browse/HADOOP-9160
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Luke Lu

 Currently we use Hadoop RPC (and some HTTP, notably fsck) for admin 
 protocols. We should consider adopting JMX for future admin protocols, as it's 
 the industry standard for Java server management with wide client support.
 Having an alternative/redundant RPC mechanism is very desirable for admin 
 protocols. I've seen multiple cases in the past where NN and/or JT RPC 
 were locked up solid due to various bugs and/or RPC thread pool exhaustion, 
 while HTTP and/or JMX worked just fine.
 Other desirable benefits include admin protocol backward compatibility and 
 introspectability, which is convenient for a centralized management system to 
 manage multiple Hadoop clusters of different versions. Another notable 
 benefit is that it's much easier to implement new admin commands in JMX 
 (especially with MXBean) than Hadoop RPC, especially in trunk (as well as 
 0.23+ and 2.x).
 Since Hadoop RPC doesn't guarantee backward compatibility (probably not ever 
 for branch-1), there are few external tools depending on it. We can keep the 
 old protocols for as long as needed. New commands should be in JMX. The 
 transition can be gradual and backward-compatible.



[jira] [Commented] (HADOOP-9164) Add version number and/or library file name to native library for easy tracking

2012-12-30 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13541181#comment-13541181
 ] 

Allen Wittenauer commented on HADOOP-9164:
--

bq. I don't think C/C++ library version numbers are the most interesting things 
to report, though. The reality is, we very seldom increment those numbers, so 
just knowing that you're using libhadoop-1.0.0 doesn't give you much 
information (there were never any other version numbers for that library :\ )

This should probably get fixed as part of this patch.

 Add version number and/or library file name to native library for easy 
 tracking
 ---

 Key: HADOOP-9164
 URL: https://issues.apache.org/jira/browse/HADOOP-9164
 Project: Hadoop Common
  Issue Type: Improvement
  Components: native
Affects Versions: 2.0.2-alpha
Reporter: Binglin Chang
Assignee: Binglin Chang
Priority: Minor
 Attachments: HADOOP-9164.v1.patch, HADOOP-9164.v2.patch, 
 HADOOP-9164.v3.patch






[jira] [Updated] (HADOOP-6962) FileSystem.mkdirs(Path, FSPermission) should use the permission for all of the created directories

2013-01-02 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-6962:
-

 Priority: Blocker  (was: Critical)
Fix Version/s: 1.2.0

I'm changing this to a blocker for 1.2.0.  This is a pretty major security 
hole, when one considers that HDFS does permission inheritance. 

The only real choices appear to be:
a) Use 0777 + applied umask (i.e., POSIX)
b) Use inherited perms + applied umask (what I remember from the testing we did 
in Hadoop 0.14/15-ish)

I don't view this as a backwards compatibility problem as much as I view this 
as a regression.  I'm fairly confident that at some point in time this was 
working as intended (option b), but somewhere along the way no one noticed that 
it was broken.
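For concreteness, option (a) is what POSIX `mkdir -p` already does: every directory it creates gets mode 0777 filtered by the caller's umask. A minimal shell sketch of that behavior (the function name is made up for illustration; assumes GNU `stat`):

```shell
# With umask 022, every level that `mkdir -p` creates ends up as
# 0777 & ~022 = 0755 -- the behaviour option (a) would give
# FileSystem.mkdirs for all created levels, not just the last one.
demo_mkdirs_umask() {
  local root
  root="$(mktemp -d)"
  ( umask 022 && mkdir -p "$root/a/b/c" )
  stat -c '%a' "$root/a" "$root/a/b" "$root/a/b/c"   # prints 755 on each line
  rm -rf "$root"
}
demo_mkdirs_umask
```

Option (b) would instead start from the parent directory's mode rather than 0777 before applying the umask.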

 FileSystem.mkdirs(Path, FSPermission) should use the permission for all of 
 the created directories
 --

 Key: HADOOP-6962
 URL: https://issues.apache.org/jira/browse/HADOOP-6962
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs, security
Affects Versions: 1.0.4
Reporter: Owen O'Malley
Assignee: Daryn Sharp
Priority: Blocker
  Labels: security
 Fix For: 1.2.0

 Attachments: HADOOP-6962.patch


 Currently, FileSystem.mkdirs only applies the permissions to the last level 
 if it was created. They should be applied to *all* levels that are created.



[jira] [Commented] (HADOOP-9160) Adopt JMX for management protocols

2013-01-21 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13558952#comment-13558952
 ] 

Allen Wittenauer commented on HADOOP-9160:
--

bq. The users of the protocols are sysadmins and management daemons.

As a member of this subset and as mentioned previously, I want the ability to 
turn off writes to guarantee that JMX can be used as a read-only interface.  
I'll -1 any patch that doesn't have it.

 Adopt JMX for management protocols
 --

 Key: HADOOP-9160
 URL: https://issues.apache.org/jira/browse/HADOOP-9160
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Luke Lu
 Attachments: hadoop-9160-demo-branch-1.txt


 Currently we use Hadoop RPC (and some HTTP, notably fsck) for admin 
 protocols. We should consider adopting JMX for future admin protocols, as it's 
 the industry standard for Java server management with wide client support.
 Having an alternative/redundant RPC mechanism is very desirable for admin 
 protocols. I've seen multiple cases in the past where NN and/or JT RPC 
 were locked up solid due to various bugs and/or RPC thread pool exhaustion, 
 while HTTP and/or JMX worked just fine.
 Other desirable benefits include admin protocol backward compatibility and 
 introspectability, which is convenient for a centralized management system to 
 manage multiple Hadoop clusters of different versions. Another notable 
 benefit is that it's much easier to implement new admin commands in JMX 
 (especially with MXBean) than Hadoop RPC, especially in trunk (as well as 
 0.23+ and 2.x).
 Since Hadoop RPC doesn't guarantee backward compatibility (probably not ever 
 for branch-1), there are few external tools depending on it. We can keep the 
 old protocols for as long as needed. New commands should be in JMX. The 
 transition can be gradual and backward-compatible.



[jira] [Updated] (HADOOP-6962) FileSystem.mkdirs(Path, FSPermission) should use the permission for all of the created directories

2013-01-27 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-6962:
-

Target Version/s: 1.2.0
   Fix Version/s: (was: 1.2.0)

My tests with the current patch did not work on 1.0.4 when running a MapReduce 
program.

 FileSystem.mkdirs(Path, FSPermission) should use the permission for all of 
 the created directories
 --

 Key: HADOOP-6962
 URL: https://issues.apache.org/jira/browse/HADOOP-6962
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs, security
Affects Versions: 1.0.4
Reporter: Owen O'Malley
Assignee: Daryn Sharp
Priority: Blocker
  Labels: security
 Attachments: HADOOP-6962.patch


 Currently, FileSystem.mkdirs only applies the permissions to the last level 
 if it was created. They should be applied to *all* levels that are created.



[jira] [Commented] (HADOOP-9296) Authenticating users from different realm without a trust relationship

2013-02-13 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13577796#comment-13577796
 ] 

Allen Wittenauer commented on HADOOP-9296:
--

How does this work when multiple grids are involved (e.g., distcp)?

 Authenticating users from different realm without a trust relationship
 --

 Key: HADOOP-9296
 URL: https://issues.apache.org/jira/browse/HADOOP-9296
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: HADOOP-9296-1.1.patch, multirealm.pdf


 Hadoop Masters (JobTracker and NameNode) and slaves (Data Node and 
 TaskTracker) are part of the Hadoop domain, controlled by Hadoop Active 
 Directory. 
 The users belong to the CORP domain, controlled by the CORP Active Directory. 
 In the absence of a one-way trust from HADOOP DOMAIN to CORP DOMAIN, how will 
 Hadoop Servers (JobTracker, NameNode) authenticate CORP users?
 The solution and implementation details are in the attachment.



[jira] [Commented] (HADOOP-9317) User cannot specify a kerberos keytab for commands

2013-02-20 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582660#comment-13582660
 ] 

Allen Wittenauer commented on HADOOP-9317:
--

Maybe I'm missing something, but I don't understand why just using a different 
KRB5CCNAME for every invocation doesn't fix this.  i.e., program flow should be:

{code}
export KRB5CCNAME=/tmp/mycoolcache.$$
kinit -k -t keytab identity
hadoop jar blah
rm /tmp/mycoolcache.$$
{code}

You could even be smarter and check the creation timestamp vs. expiry.  
Additionally, I'm not sure, but I don't think kinit -R removes the file.  (But 
I could be wrong.)
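A slightly hardened version of that flow, sketched as a wrapper function. This is only an illustration of the idea, not Hadoop's actual behavior: the keytab path and principal are placeholders, and the `KINIT`/`HADOOP` variables are a made-up convenience that defaults to the real commands but can be overridden for dry runs.

```shell
KINIT="${KINIT:-kinit}"
HADOOP="${HADOOP:-hadoop}"

# Run one hadoop invocation under a private, self-cleaning ticket cache,
# so concurrent commands never race on a shared cache file.
run_with_private_cache() {
  (
    export KRB5CCNAME="/tmp/mycoolcache.$$"   # unique cache for this run
    trap 'rm -f "$KRB5CCNAME"' EXIT           # remove the cache even on failure
    "$KINIT" -k -t /path/to/keytab identity && "$HADOOP" jar "$@"
  )
}
```

For a dry run without a KDC, something like `KINIT=: HADOOP=echo run_with_private_cache app.jar` exercises the flow with the real commands stubbed out.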

 User cannot specify a kerberos keytab for commands
 --

 Key: HADOOP-9317
 URL: https://issues.apache.org/jira/browse/HADOOP-9317
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical
 Attachments: HADOOP-9317.branch-23.patch, 
 HADOOP-9317.branch-23.patch, HADOOP-9317.patch, HADOOP-9317.patch, 
 HADOOP-9317.patch


 {{UserGroupInformation}} only allows kerberos users to be logged in via the 
 ticket cache when running hadoop commands.  {{UGI}} allows a keytab to be 
 used, but it's only exposed programmatically.  This forces keytab-based users 
 running hadoop commands to periodically issue a kinit from the keytab.  A 
 race condition exists during the kinit when the ticket cache is deleted and 
 re-created.  Hadoop commands will fail when the ticket cache does not 
 momentarily exist.



[jira] [Commented] (HADOOP-9296) Authenticating users from different realm without a trust relationship

2013-02-20 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582674#comment-13582674
 ] 

Allen Wittenauer commented on HADOOP-9296:
--

After more thought: as far as I can tell, this doesn't actually do anything to 
protect the web interfaces for the TaskTracker or the DataNode. I'm guessing 
this is built around the idea that something else is protecting those or the 
user will always connect to the JT or NN first in order to get a delegation 
token?  Also, how does SPNEGO for the NN/2NN work under this scenario?  Will 
the hdfs user need to come from the user realm as well? 

I recognize this is a kludge for broken company policies and politics that, for 
whatever reason, aren't willing to do Kerberos properly with a one-way trust.  
But I'm worried this is going to give a false sense of security without making 
sure that other things are in place.  At the minimum, the documentation 
accompanying this change should be explicit about its use cases and promote the 
usage of real trusts.

 Authenticating users from different realm without a trust relationship
 --

 Key: HADOOP-9296
 URL: https://issues.apache.org/jira/browse/HADOOP-9296
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: HADOOP-9296-1.1.patch, multirealm.pdf


 Hadoop Masters (JobTracker and NameNode) and slaves (Data Node and 
 TaskTracker) are part of the Hadoop domain, controlled by Hadoop Active 
 Directory. 
 The users belong to the CORP domain, controlled by the CORP Active Directory. 
 In the absence of a one-way trust from HADOOP DOMAIN to CORP DOMAIN, how will 
 Hadoop Servers (JobTracker, NameNode) authenticate CORP users?
 The solution and implementation details are in the attachment.



[jira] [Created] (HADOOP-9520) _HOST doesn't resolve to bound interface

2013-04-27 Thread Allen Wittenauer (JIRA)
Allen Wittenauer created HADOOP-9520:


 Summary: _HOST doesn't resolve to bound interface
 Key: HADOOP-9520
 URL: https://issues.apache.org/jira/browse/HADOOP-9520
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.0.4-alpha
Reporter: Allen Wittenauer


_HOST appears to ignore bound interfaces.  For example, if a host has two 
interfaces such that:

nic0 = gethostname()
nic1 = someothername

and then I configure the namenode or resource manager to use 
someothername:, the system still treats _HOST = nic0.  This is especially 
harmful for Kerberos principals.
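To make the report concrete, the substitution the reporter wants would look roughly like this: take the host part from the configured bind address instead of from `gethostname()`. The function name, principal, and the `:8020` port are made-up examples, not Hadoop code.

```shell
# Replace _HOST in a Kerberos principal with the host part of a configured
# bind address (everything before the first colon).
resolve_principal() {
  local principal="$1" bind_addr="$2"
  local host="${bind_addr%%:*}"          # strip the :port suffix
  printf '%s\n' "${principal//_HOST/$host}"
}

resolve_principal "nn/_HOST@EXAMPLE.COM" "someothername:8020"
# prints nn/someothername@EXAMPLE.COM
```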



[jira] [Created] (HADOOP-9521) krb5 replay error triggers log file DoS with Safari

2013-04-27 Thread Allen Wittenauer (JIRA)
Allen Wittenauer created HADOOP-9521:


 Summary: krb5 replay error triggers log file DoS with Safari
 Key: HADOOP-9521
 URL: https://issues.apache.org/jira/browse/HADOOP-9521
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.0.4-alpha
Reporter: Allen Wittenauer
Priority: Blocker


While investigating YARN-621, looking at the web interface with Safari 
triggered a loop which both filled the log with stack traces and left the 
browser in a continual loading state.



[jira] [Updated] (HADOOP-9521) krb5 replay error triggers log file DoS with Safari

2013-04-27 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-9521:
-

Component/s: security
Environment: Mac OS X 10.8.3, Safari 6.0.3 (8536.28.10)

 krb5 replay error triggers log file DoS with Safari
 ---

 Key: HADOOP-9521
 URL: https://issues.apache.org/jira/browse/HADOOP-9521
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 2.0.4-alpha
 Environment: Mac OS X 10.8.3, Safari 6.0.3 (8536.28.10)
Reporter: Allen Wittenauer
Priority: Blocker

 While investigating YARN-621, looking at the web interface with Safari 
 triggered a loop which both filled the log with stack traces and left the 
 browser in a continual loading state.



[jira] [Created] (HADOOP-9522) web interfaces are not logged until after opening

2013-04-27 Thread Allen Wittenauer (JIRA)
Allen Wittenauer created HADOOP-9522:


 Summary: web interfaces are not logged until after opening
 Key: HADOOP-9522
 URL: https://issues.apache.org/jira/browse/HADOOP-9522
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 2.0.4-alpha
Reporter: Allen Wittenauer


If one mis-configures certain interfaces (in my case 
yarn.resourcemanager.webapp.address), neither Hadoop nor Jetty throws any 
error indicating that the interface doesn't exist. Worse yet, the system 
appears to be hung. It would be better if we logged which hostname:port we 
were attempting to open before opening it.



[jira] [Commented] (HADOOP-9520) _HOST doesn't resolve to bound interface

2013-04-30 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645913#comment-13645913
 ] 

Allen Wittenauer commented on HADOOP-9520:
--

FWIW, I'm fully expecting to fix this bug myself like I did for our branch-1 
install.  People wanted me to file bugs.  I did, and got the fully expected 
pushback that ops teams are required to hard-code everything (despite this 
being completely unnecessary and mostly unintuitive vs ~4 code change).



 _HOST doesn't resolve to bound interface
 

 Key: HADOOP-9520
 URL: https://issues.apache.org/jira/browse/HADOOP-9520
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.0.4-alpha
Reporter: Allen Wittenauer

 _HOST appears to ignore bound interfaces.  For example, if a host has two 
 interfaces such that:
 nic0 = gethostname()
 nic1 = someothername
 and then I configure the namenode or resource manager to use 
 someothername:, the system still treats _HOST = nic0.  This is especially 
 harmful for Kerberos principals.



[jira] [Commented] (HADOOP-9521) krb5 replay error triggers log file DoS with Safari

2013-04-30 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645929#comment-13645929
 ] 

Allen Wittenauer commented on HADOOP-9521:
--

I don't have anything but 6.0.3 here to test against.

Stack trace is what you'd expect:

{code}
2013-04-30 19:58:54,576 WARN 
org.apache.hadoop.security.authentication.server.AuthenticationFilter: 
Authentication exception: GSSException: Failure unspecified at GSS-API level 
(Mechanism level: Request is a replay (34))
org.apache.hadoop.security.authentication.client.AuthenticationException: 
GSSException: Failure unspecified at GSS-API level (Mechanism level: Request is 
a replay (34))
at 
org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:329)
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:349)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:384)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at 
org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:1069)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at 
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at 
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at 
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: GSSException: Failure unspecified at GSS-API level (Mechanism level: 
Request is a replay (34))
at 
sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:741)
at 
sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:323)
at 
sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:267)
at 
sun.security.jgss.spnego.SpNegoContext.GSS_acceptSecContext(SpNegoContext.java:874)
at 
sun.security.jgss.spnego.SpNegoContext.acceptSecContext(SpNegoContext.java:541)
at 
sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:323)
at 
sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:267)
at 
org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler$2.run(KerberosAuthenticationHandler.java:299)
at 
org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler$2.run(KerberosAuthenticationHandler.java:291)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:291)
... 25 more
Caused by: KrbException: Request is a replay (34)
at sun.security.krb5.KrbApReq.authenticate(KrbApReq.java:298)
at sun.security.krb5.KrbApReq.init(KrbApReq.java:134)
at 
sun.security.jgss.krb5.InitSecContextToken.init(InitSecContextToken.java:79)
at 
sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:724)
... 36 more
{code}

 krb5 replay error triggers log file DoS with Safari
 ---

 Key: HADOOP-9521

[jira] [Updated] (HADOOP-9521) krb5 replay error triggers log file DoS with Safari

2013-04-30 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-9521:
-

Environment: 
Mac OS X 10.8.3, Safari 6.0.3 (8536.28.10)
Mac OS X 10.6.8, Safari 6.0.3 (8536.28.10)

  was:Mac OS X 10.8.3, Safari 6.0.3 (8536.28.10)


 krb5 replay error triggers log file DoS with Safari
 ---

 Key: HADOOP-9521
 URL: https://issues.apache.org/jira/browse/HADOOP-9521
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 2.0.4-alpha
 Environment: Mac OS X 10.8.3, Safari 6.0.3 (8536.28.10)
 Mac OS X 10.6.8, Safari 6.0.3 (8536.28.10)
Reporter: Allen Wittenauer
Priority: Blocker

 While investigating YARN-621, looking at the web interface with Safari 
 triggered a loop which both filled the log with stack traces and left the 
 browser in a continual loading state.



[jira] [Updated] (HADOOP-9710) Modify security layer to support QoP based on ports

2013-07-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-9710:
-

Description: 
Hadoop Servers currently support only one quality of protection (QOP) for all of 
the cluster.
This jira allows a server to have different QOP on different ports. 

The QOP is set based on the port.

  was:
Hadoop Servers currently support only one QOP for all of the cluster.
This jira allows a server to have different QOP on different ports. 

The QOP is set based on the port.

Summary: Modify security layer  to support QoP based on ports  (was: 
Modify security layer  to support QOP based on ports)

 Modify security layer  to support QoP based on ports
 

 Key: HADOOP-9710
 URL: https://issues.apache.org/jira/browse/HADOOP-9710
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: HADOOP-9710.patch


 Hadoop Servers currently support only one quality of protection (QOP) for all 
 of the cluster.
 This jira allows a server to have different QOP on different ports. 
 The QOP is set based on the port.



[jira] [Updated] (HADOOP-9777) RPM should not claim ownership of paths owned by the platform

2013-07-26 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-9777:
-

Priority: Critical  (was: Major)

 RPM should not claim ownership of paths owned by the platform
 -

 Key: HADOOP-9777
 URL: https://issues.apache.org/jira/browse/HADOOP-9777
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 1.1.2
 Environment: Fedora 19 x64
Reporter: Stevo Slavic
Priority: Critical

 Installing Apache Hadoop rpm ( hadoop-1.1.2-1.x86_64.rpm ) on Fedora 19 x64 
 fails with:
 [root@laptop hadoop]# rpm -i /home/sslavic/Downloads/hadoop-1.1.2-1.x86_64.rpm
 file /usr/bin from install of hadoop-1.1.2-1.x86_64 conflicts with file from 
 package filesystem-3.2-12.fc19.x86_64
 file /usr/lib from install of hadoop-1.1.2-1.x86_64 conflicts with file from 
 package filesystem-3.2-12.fc19.x86_64
 file /usr/lib64 from install of hadoop-1.1.2-1.x86_64 conflicts with file 
 from package filesystem-3.2-12.fc19.x86_64
 file /usr/sbin from install of hadoop-1.1.2-1.x86_64 conflicts with file from 
 package filesystem-3.2-12.fc19.x86_64
 Same issue occurs if one tries to install as non-root user:
 [sslavic@laptop ~]$ sudo rpm -i Downloads/hadoop-1.1.2-1.x86_64.rpm 
 file /usr/bin from install of hadoop-1.1.2-1.x86_64 conflicts with file from 
 package filesystem-3.2-12.fc19.x86_64
 file /usr/lib from install of hadoop-1.1.2-1.x86_64 conflicts with file from 
 package filesystem-3.2-12.fc19.x86_64
 file /usr/lib64 from install of hadoop-1.1.2-1.x86_64 conflicts with file 
 from package filesystem-3.2-12.fc19.x86_64
 file /usr/sbin from install of hadoop-1.1.2-1.x86_64 conflicts with file from 
 package filesystem-3.2-12.fc19.x86_64
 It seems these 4 directories in Hadoop rpm have wrong permissions (+w for 
 owner).
 This is a violation of packaging rules. Hadoop rpm spec and/or build scripts 
 need to be fixed, so that rpm on installation doesn't try to claim ownership 
 of paths owned by the platform, in this case, filesystem.



[jira] [Commented] (HADOOP-9870) Mixed configurations for JVM -Xmx in hadoop command

2013-08-14 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739896#comment-13739896
 ] 

Allen Wittenauer commented on HADOOP-9870:
--

Is there something inherently wrong with letting the JVM make the decision?  
Are we worried about a JVM that doesn't follow the same set of rules?  (which, 
at this point, is a de facto API)

 Mixed configurations for JVM -Xmx in hadoop command
 ---

 Key: HADOOP-9870
 URL: https://issues.apache.org/jira/browse/HADOOP-9870
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Wei Yan

 When we use hadoop command to launch a class, there are two places setting 
 the -Xmx configuration.
 *1*. The first place is located in file 
 {{hadoop-common-project/hadoop-common/src/main/bin/hadoop}}.
 {code}
 exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"
 {code}
 Here $JAVA_HEAP_MAX is configured in hadoop-config.sh 
 ({{hadoop-common-project/hadoop-common/src/main/bin/hadoop-config.sh}}). The 
 default value is -Xmx1000m.
 *2*. The second place is set with $HADOOP_OPTS in file 
 {{hadoop-common-project/hadoop-common/src/main/bin/hadoop}}.
 {code}
 HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
 {code}
 Here $HADOOP_CLIENT_OPTS is set in hadoop-env.sh 
 ({{hadoop-common-project/hadoop-common/src/main/conf/hadoop-env.sh}})
 {code}
 export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
 {code}
 Currently the final default java command looks like:
 {code}java -Xmx1000m  -Xmx512m CLASS_NAME ARGUMENTS{code}
 And if users also specify the -Xmx in the $HADOOP_CLIENT_OPTS, there will be 
 three -Xmx configurations. 
 The hadoop setup tutorial only discusses hadoop-env.sh, and it appears that 
 users should not make any changes in hadoop-config.sh.
 We should make hadoop smart enough to choose the right one before launching 
 the java command, instead of leaving the decision to the JVM.
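Since the JVM honors the last `-Xmx` flag it is given, `java -Xmx1000m -Xmx512m ...` actually runs with a 512m heap. A hypothetical helper sketching the "choose before launching" idea from the description (the function name is made up; this is not part of the hadoop scripts):

```shell
# Return only the last -Xmx flag from an option string, i.e. the one the
# JVM would actually honor, so the launcher can pass a single heap flag.
effective_xmx() {
  local last="" opt
  for opt in $1; do                  # word-split the option string on purpose
    case "$opt" in -Xmx*) last="$opt" ;; esac
  done
  printf '%s\n' "$last"
}

effective_xmx "-Xmx1000m -Xmx512m"   # prints -Xmx512m
```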



[jira] [Commented] (HADOOP-9874) hadoop.security.logger output goes to both logs

2013-08-14 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13740473#comment-13740473
 ] 

Allen Wittenauer commented on HADOOP-9874:
--

I'm fairly certain this is a regression as well, but I can't verify that at the 
moment.

 hadoop.security.logger output goes to both logs
 ---

 Key: HADOOP-9874
 URL: https://issues.apache.org/jira/browse/HADOOP-9874
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Allen Wittenauer

 Setting hadoop.security.logger (for SecurityLogger messages) to non-null 
 sends authentication information to the other log as specified.  However, 
 that logging information also goes to the main log.   It should only go to 
 one log, not both.
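For context, this routing is controlled by the log4j configuration; a sketch of the relevant log4j.properties fragment, where the additivity flag is what should keep security events out of the main log (the appender name RFAS is an assumption that may differ per install):

```properties
# send SecurityLogger events to the appender named by hadoop.security.logger
hadoop.security.logger=INFO,RFAS
log4j.category.SecurityLogger=${hadoop.security.logger}
# additivity=false stops those events from also flowing to the root logger
log4j.additivity.SecurityLogger=false
```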



[jira] [Created] (HADOOP-9874) hadoop.security.logger output goes to both logs

2013-08-14 Thread Allen Wittenauer (JIRA)
Allen Wittenauer created HADOOP-9874:


 Summary: hadoop.security.logger output goes to both logs
 Key: HADOOP-9874
 URL: https://issues.apache.org/jira/browse/HADOOP-9874
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Allen Wittenauer


Setting hadoop.security.logger (for SecurityLogger messages) to non-null sends 
authentication information to the other log as specified.  However, that 
logging information also goes to the main log.   It should only go to one log, 
not both.



[jira] [Commented] (HADOOP-9874) hadoop.security.logger output goes to both logs

2013-08-14 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13740504#comment-13740504
 ] 

Allen Wittenauer commented on HADOOP-9874:
--

Sure, but we should do the correct thing out of the box.

 hadoop.security.logger output goes to both logs
 ---

 Key: HADOOP-9874
 URL: https://issues.apache.org/jira/browse/HADOOP-9874
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Allen Wittenauer

 Setting hadoop.security.logger (for SecurityLogger messages) to non-null 
 sends authentication information to the other log as specified.  However, 
 that logging information also goes to the main log.   It should only go to 
 one log, not both.



[jira] [Commented] (HADOOP-9884) Hadoop calling du -sk is expensive

2013-08-19 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13744184#comment-13744184
 ] 

Allen Wittenauer commented on HADOOP-9884:
--

We need to tread carefully here.  Replacing the du call has the potential to 
break distributed cache (and probably other things), especially for 
non-HDFS-based systems.

 Hadoop calling du -sk is expensive
 --

 Key: HADOOP-9884
 URL: https://issues.apache.org/jira/browse/HADOOP-9884
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Alex Newman

 On numerous occasions we've had customers worry about slowness while Hadoop 
 calls du -sk under the hood. For most of these users, getting the 
 information from df would be sufficient and much faster. In fact, there is a 
 fairly common hack going around that replaces du with df. Sometimes 
 people have to tune the vcache. What if we just allowed users to use the df 
 information instead of the du information, with a patch and config setting? 
 I'd be glad to code it up.



[jira] [Created] (HADOOP-9902) Shell script rewrite

2013-08-24 Thread Allen Wittenauer (JIRA)
Allen Wittenauer created HADOOP-9902:


 Summary: Shell script rewrite
 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer


Umbrella JIRA for shell script rewrite.  See first comment for more details.



[jira] [Updated] (HADOOP-9902) Shell script rewrite

2013-08-24 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-9902:
-

Description: Umbrella JIRA for shell script rewrite.  See more-info.txt for 
more details.  (was: Umbrella JIRA for shell script rewrite.  See first comment 
for more details.)

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: more-info.txt


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



[jira] [Updated] (HADOOP-9902) Shell script rewrite

2013-08-24 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-9902:
-

Attachment: more-info.txt

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: more-info.txt


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



[jira] [Updated] (HADOOP-9902) Shell script rewrite

2013-08-24 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-9902:
-

Attachment: scripts.tgz

Just to give an idea of what I'm thinking, here is a sample.  Note this is a) 
not even close to final, b) likely has bugs, c) is very incomplete, and d) 
hasn't been fully optimized at all.  

This is for 2.1.0.  Sorry for not being in patch format, but I'm not at that 
stage yet.

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: more-info.txt, scripts.tgz


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



[jira] [Commented] (HADOOP-9902) Shell script rewrite

2013-08-25 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13749657#comment-13749657
 ] 

Allen Wittenauer commented on HADOOP-9902:
--

Adding a bunch of links to JIRAs for xref to when various things got added.

A quick read leaves me with one impression: YARN is incredibly inconsistent, and 
its attempts to make things easier have actually made things harder for both 
the user and the developer.  Worse, a lot of the stuff is completely 
undocumented outside of JIRAs.  I don't know if this situation is salvageable 
without undoing some of this nonsense.

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: more-info.txt, scripts.tgz


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



[jira] [Commented] (HADOOP-9902) Shell script rewrite

2013-08-26 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750154#comment-13750154
 ] 

Allen Wittenauer commented on HADOOP-9902:
--

Question for the crowd.  In bin/yarn is... this:

{code}
# for developers, add Hadoop classes to CLASSPATH
if [ -d "$HADOOP_YARN_HOME/yarn-api/target/classes" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/yarn-api/target/classes
fi
if [ -d "$HADOOP_YARN_HOME/yarn-common/target/classes" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/yarn-common/target/classes
fi
if [ -d "$HADOOP_YARN_HOME/yarn-mapreduce/target/classes" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/yarn-mapreduce/target/classes
fi
if [ -d "$HADOOP_YARN_HOME/yarn-master-worker/target/classes" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/yarn-master-worker/target/classes
fi
if [ -d "$HADOOP_YARN_HOME/yarn-server/yarn-server-nodemanager/target/classes" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/yarn-server/yarn-server-nodemanager/target/classes
fi
if [ -d "$HADOOP_YARN_HOME/yarn-server/yarn-server-common/target/classes" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/yarn-server/yarn-server-common/target/classes
fi
if [ -d "$HADOOP_YARN_HOME/yarn-server/yarn-server-resourcemanager/target/classes" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/yarn-server/yarn-server-resourcemanager/target/classes
fi
if [ -d "$HADOOP_YARN_HOME/build/test/classes" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/target/test/classes
fi
if [ -d "$HADOOP_YARN_HOME/build/tools" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/build/tools
fi
{code}

[I'm pretty sure at this point in the execution path, the YARN jars from the 
non-build directories have already been inserted into the classpath via the 
early call into hadoop-config.sh... which means this code likely isn't working 
as intended.  For now, let's assume that it is.]

After cleanup, it looks a bit more like this, using the 'before' option to push 
the entries to the front of the classpath and reversing the order to maintain 
the pathing order. [altho I suspect that a) we can trim this down even further 
with an ls -d and b) ordering doesn't matter]:

{code}
add_classpath "$HADOOP_YARN_HOME/build/tools" before
add_classpath "$HADOOP_YARN_HOME/build/test/classes" before
for debugpath in yarn-server-resourcemanager yarn-server-common yarn-server-nodemanager \
 yarn-master-worker yarn-mapreduce yarn-common yarn-api; do
  add_classpath "$HADOOP_YARN_HOME/$debugpath/target/classes" before
done
{code}
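For reference, here is a minimal sketch of what an add_classpath helper with the semantics assumed above might look like; this is an assumption about its behavior, not the actual implementation:

```shell
# Assumed semantics for the add_classpath helper referenced above (the real
# implementation may differ): append by default, prepend when "before" is
# passed, and skip paths that do not exist.
add_classpath() {
  local path=$1 position=${2:-after}
  [ -e "$path" ] || return 1                 # ignore missing entries
  if [ "$position" = "before" ]; then
    CLASSPATH="${path}:${CLASSPATH}"
  else
    CLASSPATH="${CLASSPATH}:${path}"
  fi
}

CLASSPATH="base.jar"
add_classpath /tmp before                    # /tmp exists, so it is prepended
echo "$CLASSPATH"                            # prints: /tmp:base.jar
```

The existence check is what lets the same loop run unchanged on both build trees and installed trees: missing target/classes directories simply contribute nothing.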

Since this is buried in bin/yarn, this is only getting set if the yarn command 
is being used.  This might lead to some interesting situations where we're 
running test yarn code on stable HDFS.  This may or may not be desirable.  So 
now the question:

*Should test classpaths always be inserted if we detect them?*

Your choices:

a) We actually cover this as part of the unit tests.  Strip all this stuff out 
so our commands run faster!
b) Keep the debug code per-section.  i.e., hdfs command will only get hdfs and 
common test code, yarn command will get the yarn and common test code, hadoop 
command only gets common.
c) Everyone gets everything.  i.e., using the hdfs command will add in the yarn 
test code.  

Reminder: hadoop-config.sh adds in *all* of the classpaths we know about.  I 
don't think this is fixable without breaking compatibility in a major way.  
(Changing the 'hadoop classpath' command to show all paths is certainly do-able 
but who knows what *else* would break...)

Thoughts?

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: more-info.txt, scripts.tgz


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



[jira] [Commented] (HADOOP-9902) Shell script rewrite

2013-08-26 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750233#comment-13750233
 ] 

Allen Wittenauer commented on HADOOP-9902:
--

or a 4th option:

d) set HADOOP_BUILD_DEBUG="sub sub ..." which would only enable the classpath 
for the subprojects listed (i.e., HADOOP_BUILD_DEBUG="hdfs yarn" would enable 
both hdfs and yarn but not common).
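Option d) could be implemented with a small membership test; a sketch under the assumption that the variable holds a space-separated list (all names here are illustrative, not from any patch):

```shell
# Illustrative sketch of option d); names are hypothetical, not from a patch.
# A subproject's build classpath is enabled only if it is listed in
# HADOOP_BUILD_DEBUG.
build_debug_enabled() {
  local subproject
  for subproject in ${HADOOP_BUILD_DEBUG}; do
    if [ "${subproject}" = "$1" ]; then
      return 0
    fi
  done
  return 1
}

HADOOP_BUILD_DEBUG="hdfs yarn"
build_debug_enabled hdfs && echo "would add hdfs build classes"
build_debug_enabled common || echo "common build classes stay off"
```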

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: more-info.txt, scripts.tgz


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



[jira] [Commented] (HADOOP-9902) Shell script rewrite

2013-09-03 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13756796#comment-13756796
 ] 

Allen Wittenauer commented on HADOOP-9902:
--

Digging into this further, it looks like YARN has a different build structure 
than HDFS, common, and mapreduce, which is why these extra classpaths aren't 
added.  I'll see if I can work out what should be added and wrap them around a 
new flag (--buildpaths).

Thanks!

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: more-info.txt, scripts.tgz


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



[jira] [Updated] (HADOOP-9902) Shell script rewrite

2013-09-03 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-9902:
-

Attachment: scripts.tgz

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: more-info.txt, scripts.tgz, scripts.tgz


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



[jira] [Updated] (HADOOP-9902) Shell script rewrite

2013-09-03 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-9902:
-

Attachment: (was: scripts.tgz)

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: more-info.txt, scripts.tgz


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



[jira] [Commented] (HADOOP-9902) Shell script rewrite

2013-09-03 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13757341#comment-13757341
 ] 

Allen Wittenauer commented on HADOOP-9902:
--

Uploaded another mostly untested code drop with contents of bin/ and libexec/ 
to show progress, get some feedback, etc.  Basic stuff does appear to work for 
me, but I haven't tried starting any daemons yet since I'm still working out 
the new secure DN starter code to be much more flexible.  Plus I'm still 
working my way through sbin. A few things worth pointing out:

Load order should be consistent now.  Basic path is:

* bin/command sets HADOOP_NEW_CONFIG to disable auto-population.  It then loads:
** xyz-config.sh 
*** hadoop-config.sh
**** hadoop-env.sh
**** hadoop-functions.sh
*** xyz-env.sh - loading this here should allow for users to override quite a 
bit more, at least that's the hypothesis
* (do whatever)
* finalize - fills in any missing -D's
* exec java

This mainly has implications for YARN which did/does really oddball things with 
YARN_OPTS.  There is bound to be some (edge-case?) breakage here, but (IMO) 
consistency is more important.  I tried to 'make it work', but...

Misc.

* users can override functions in hadoop-env.sh.  This means if they need 
extra/replacement functionality, totally doable, without replacing anything in 
libexec.  I might make a specific call out
* double-dash options (i.e., --config) are handled by the same code, 
consistently, in hadoop-config.sh.  Also, since this is a loop, the order of 
the options no longer matters, except for --config (for what are hopefully 
obvious reasons). --help and friends work by having the top level define a 
function called usage(). 
* Most/all of the crazy if/fi constructions (esp those buried inside a case!) 
have been replaced with a single-parent case statement.  Also, an effort has 
been made to mostly alphabetize the commands in the case statement, although 
I'm sure I missed one or two.
* Option C from above has been implemented.  I think. ;)
* I haven't touched httpfs yet at all. 
* You can see some previews of some of the stuff in sbin.  For example, 
slaves.sh now uses pdsh if it is installed.
* LD_LIBRARY_PATH, CLASSPATH, JAVA_LIBRARY_PATH are now de-duped.
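The de-dupe mentioned in the last bullet can be done in pure shell; a minimal sketch (the function name is illustrative, not necessarily what the patch uses) that keeps the first occurrence of each colon-delimited entry and preserves order:

```shell
# Sketch of colon-delimited de-duplication in the spirit of the bullet above
# (the function name is illustrative): keep the first occurrence of each
# entry and preserve the original order.
dedupe_path() {
  local out="" entry
  local IFS=':'
  for entry in $1; do
    case ":${out}:" in
      *":${entry}:"*) ;;                     # already present, skip
      *) out="${out:+${out}:}${entry}" ;;
    esac
  done
  printf '%s\n' "$out"
}

dedupe_path "/a:/b:/a:/c:/b"                 # prints: /a:/b:/c
```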


 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: more-info.txt, scripts.tgz


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



[jira] [Commented] (HADOOP-9902) Shell script rewrite

2013-09-03 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13757349#comment-13757349
 ] 

Allen Wittenauer commented on HADOOP-9902:
--

Oh, one other thing:
* removed rm-config/log4j.properties and nm-config/log4j.properties support. 
These appear to be completely undocumented.

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: more-info.txt, scripts.tgz


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



[jira] [Commented] (HADOOP-9902) Shell script rewrite

2013-09-10 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763221#comment-13763221
 ] 

Allen Wittenauer commented on HADOOP-9902:
--

Would anyone miss any of the following YARN properties being defined:

* yarn.id.str
* yarn.home.dir
* yarn.policy.file

None of these are used in the Hadoop source and don't appear to be documented.

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: more-info.txt, scripts.tgz


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



[jira] [Commented] (HADOOP-9902) Shell script rewrite

2013-09-16 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13768713#comment-13768713
 ] 

Allen Wittenauer commented on HADOOP-9902:
--

Since I'm getting ready to post a patch, how about an 'end result' example!  
Here is the command line for the resource manager from my real, 100+ node test 
grid.

Before the changes:
{code}
/usr/java/default/bin/java
-Dproc_resourcemanager
-Xmx1000m
-Xmx24g
-Dyarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log
-Dyarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY
-Xloggc:/export/apps/hadoop/logs/gc-nn.log-201308261726
-Dcom.sun.management.jmxremote.port=9010
-verbose:gc
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintGCDateStamps
-Dcom.sun.management.jmxremote=true
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-Dhadoop.log.dir=/export/apps/hadoop/logs
-Dyarn.log.dir=/export/apps/hadoop/logs
-Dhadoop.log.file=yarn-yarn-resourcemanager-eat1-hcl4083.grid.linkedin.com.log
-Dyarn.log.file=yarn-yarn-resourcemanager-eat1-hcl4083.grid.linkedin.com.log
-Dyarn.home.dir=
-Dyarn.id.str=yarn
-Dhadoop.root.logger=INFO,DRFA
-Dyarn.root.logger=INFO,DRFA
-Djava.library.path=/export/apps/hadoop/latest/lib/native
-Dyarn.policy.file=hadoop-policy.xml
-Dhadoop.log.dir=/export/apps/hadoop/logs
-Dyarn.log.dir=/export/apps/hadoop/logs
-Dhadoop.log.file=yarn-yarn-resourcemanager-eat1-hcl4083.grid.linkedin.com.log
-Dyarn.log.file=yarn-yarn-resourcemanager-eat1-hcl4083.grid.linkedin.com.log
-Dyarn.home.dir=/export/apps/hadoop/latest
-Dhadoop.home.dir=/export/apps/hadoop/latest
-Dhadoop.root.logger=INFO,DRFA
-Dyarn.root.logger=INFO,DRFA
-Djava.library.path=/export/apps/hadoop/latest/lib/native
-classpath
/export/apps/hadoop/site/etc/hadoop
/export/apps/hadoop/site/etc/hadoop
/export/apps/hadoop/site/etc/hadoop
/export/apps/hadoop/latest/share/hadoop/common/lib/*
/export/apps/hadoop/latest/share/hadoop/common/*
/export/apps/hadoop/latest/share/hadoop/hdfs
/export/apps/hadoop/latest/share/hadoop/hdfs/lib/*
/export/apps/hadoop/latest/share/hadoop/hdfs/*
/export/apps/hadoop/latest/share/hadoop/yarn/lib/*
/export/apps/hadoop/latest/share/hadoop/yarn/*
/export/apps/hadoop/latest/share/hadoop/mapreduce/lib/*
/export/apps/hadoop/latest/share/hadoop/mapreduce/*
/export/apps/hadoop/site/lib/grid-topology-1.0.jar
/export/apps/hadoop/latest/contrib/capacity-scheduler/*.jar
/export/apps/hadoop/site/lib/grid-topology-1.0.jar
/export/apps/hadoop/latest/contrib/capacity-scheduler/*.jar
/export/apps/hadoop/site/lib/grid-topology-1.0.jar
/export/apps/hadoop/latest/contrib/capacity-scheduler/*.jar
/export/apps/hadoop/latest/share/hadoop/yarn/*
/export/apps/hadoop/latest/share/hadoop/yarn/lib/*
/export/apps/hadoop/site/etc/hadoop/rm-config/log4j.properties
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
{code}

After the changes:
{code}
/usr/java/default/bin/java
-Dproc_resourcemanager
-Xloggc:/export/apps/hadoop/logs/gc-nn.log-201309162014
-Dcom.sun.management.jmxremote.port=9010
-verbose:gc
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintGCDateStamps
-Dcom.sun.management.jmxremote=true
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-Xmx24g
-Dyarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log
-Dyarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY
-Dyarn.log.dir=/export/apps/hadoop/logs
-Dyarn.log.file=yarn-yarn-resourcemanager-eat1-hcl4083.grid.linkedin.com.log
-Dyarn.home.dir=/export/apps/hadoop/latest
-Dyarn.root.logger=INFO,DRFA
-Djava.library.path=/export/apps/hadoop/latest/lib/native
-Dhadoop.log.dir=/export/apps/hadoop/logs
-Dhadoop.log.file=yarn-yarn-resourcemanager-eat1-hcl4083.grid.linkedin.com.log
-Dhadoop.home.dir=/export/apps/hadoop/latest
-Dhadoop.id.str=yarn
-Dhadoop.root.logger=INFO,DRFA
-Dhadoop.policy.file=hadoop-policy.xml
-Dhadoop.security.logger=INFO,NullAppender
-Djava.net.preferIPv4Stack=true
-classpath
/export/apps/hadoop/site/lib/grid-topology-1.0.jar
/export/apps/hadoop/latest/contrib/capacity-scheduler/*.jar
/export/apps/hadoop/site/etc/hadoop
/export/apps/hadoop/latest/share/hadoop/common/lib/*
/export/apps/hadoop/latest/share/hadoop/common/*
/export/apps/hadoop/latest/share/hadoop/hdfs
/export/apps/hadoop/latest/share/hadoop/hdfs/lib/*
/export/apps/hadoop/latest/share/hadoop/hdfs/*
/export/apps/hadoop/latest/share/hadoop/yarn/lib/*
/export/apps/hadoop/latest/share/hadoop/yarn/*
/export/apps/hadoop/latest/share/hadoop/mapreduce/lib/*
/export/apps/hadoop/latest/share/hadoop/mapreduce/*
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
{code}

2500 bytes vs. 1750 bytes; almost all of the savings come from the classpath.

There are still a few problems with the 'after' output but... they are mainly 
from my local config and not coming from the scripts. :)

 Shell script rewrite
 

[jira] [Updated] (HADOOP-9902) Shell script rewrite

2013-09-16 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-9902:
-

Attachment: (was: scripts.tgz)

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: hadoop-9902-1.patch, more-info.txt


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



[jira] [Updated] (HADOOP-9902) Shell script rewrite

2013-09-16 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-9902:
-

Attachment: hadoop-9902-1.patch

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: hadoop-9902-1.patch, more-info.txt


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



[jira] [Commented] (HADOOP-9902) Shell script rewrite

2013-09-16 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13768889#comment-13768889
 ] 

Allen Wittenauer commented on HADOOP-9902:
--

Removed the tarball.  Added a patch.

This still needs a lot of testing and some of the features aren't quite 
complete (start-dfs.sh firing off secure datanodes, for example).

httpfs hasn't been touched.

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: hadoop-9902-1.patch, more-info.txt


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



[jira] [Commented] (HADOOP-9902) Shell script rewrite

2013-10-02 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13784664#comment-13784664
 ] 

Allen Wittenauer commented on HADOOP-9902:
--

These are sort of out of order.

bq. playing with this. sometimes the generated classpath is, say, 
share/hadoop/yarn/*; the capacity scheduler is /*.jar - should everything be 
consistent?

At one point I thought about processing the regex string to dedupe it down to 
the jar level. This opens up a big can of worms, however: if you hit two of 
them, do you always take the latest?  What does latest mean anyway (date or 
version)?  Will we be able to parse the version out of the filename? How do we 
deal with user overrides?  Still take the latest no matter what?

I've opted to basically let the classpath as it is passed to us stand.  
Currently the dedupe code is pretty fast for interpreted shell. :) The *only* 
sub-optimization that I might be tempted to do is to normalize any symlinks and 
relative paths.  There is a good chance we'll catch a few dupes this way... but 
it likely isn't worth the extra execution time.

It's worth pointing out that a user can feasibly replace the add_classpath code 
in hadoop-env.sh to override the functionality without changing the base Apache 
code if they want/need more advanced classpath handling. (e.g., HADOOP-6997 
seems to be a non-issue to me since passing duplicate class names is just bad 
practice;  changing the collation is fixing a symptom of a much 
bigger/dangerous problem. But someone facing this issue could theoretically fix 
a collation problem on their own, legally in a stable way using this trick.)
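The override trick described above works because hadoop-env.sh is sourced after the function library, so a later definition simply wins; a minimal sketch under that assumption (the add_classpath bodies here are illustrative, not the stock code):

```shell
# Sketch of the override mechanism described above. Because hadoop-env.sh is
# sourced after the function library, a site-local redefinition wins; the
# add_classpath bodies here are illustrative, not the stock code.

# --- libexec/hadoop-functions.sh (stock definition) ---
add_classpath() { CLASSPATH="${CLASSPATH}:$1"; }

# --- etc/hadoop/hadoop-env.sh (site override, sourced later) ---
add_classpath() {
  CLASSPATH="$1:${CLASSPATH}"       # site policy: prepend instead of append
}

CLASSPATH="stock.jar"
add_classpath site.jar
echo "$CLASSPATH"                   # prints: site.jar:stock.jar
```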

bq. I don't see hadoop tools getting on the CP: is there a plan for that?

Tools path gets added as needed.  I seem to recall this is exactly the same way 
in the current shell scripts.

bq. Because it would suit me to have a directory into which I could put things 
to get them on a classpath without playing with HADOOP_CLASSPATH

I was planning on bringing up this exact issue after I get this one committed.  
It's a harder discussion because the placement is tricky and there are a lot of 
options to make this functionality happen.  Do we add another env var?  Do we 
just auto-prepend $HADOOP_PREFIX/lib/share/site/*?  Do we offer both prepend 
and append options? etc etc. All have pros and cons.  Some of the choices 
really only become feasible after this is committed, however. 

bq. we do need to think when and how to react to (conf dir) absence

Good point.  That's pretty easy to add given that the conf dir handling is 
fairly well contained now in the hadoop_find_confdir function in 
hadoop-functions.sh.  It's pretty trivial to throw a fatal error if we don't 
detect, say, hadoop-env.sh in what we resolved HADOOP_CONF_DIR to.  Suggestions 
on what to check for?
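The fatal-error check suggested above might look like the following sketch. `hadoop_verify_confdir` is a hypothetical function name, not the real hadoop-functions.sh API; it simply fails fast when hadoop-env.sh is absent from the resolved HADOOP_CONF_DIR.

```shell
# Illustrative only: fail fast when hadoop-env.sh is missing from the
# resolved HADOOP_CONF_DIR.  Function name is an assumption.
hadoop_verify_confdir() {
  if [ ! -f "${HADOOP_CONF_DIR}/hadoop-env.sh" ]; then
    echo "ERROR: cannot find hadoop-env.sh in ${HADOOP_CONF_DIR}" 1>&2
    return 1
  fi
  return 0
}

HADOOP_CONF_DIR="$(mktemp -d)"                    # empty dir: check fails
hadoop_verify_confdir || echo "fatal: bad conf dir"
touch "${HADOOP_CONF_DIR}/hadoop-env.sh"          # now the check passes
hadoop_verify_confdir && echo "conf dir ok"
```

Any file the scripts genuinely require (hadoop-env.sh being the obvious candidate) could be substituted into the test.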

bq. actually a rebuild fixes that. What I did have to do was drop 
hadoop-functions.sh into libexec

Yeah, after commit this is pretty much a flag day for all of the Hadoop 
subprojects. I talked to a few folks about it, and it was generally felt that 
this should be one big patch+JIRA rather than several smaller ones per project, 
given the interdependency on common.  We'll have to advertise on the various 
-dev mailing lists post-commit to say do a full rebuild.  Hopefully, though, 
folks won't have to change their *-env.sh files and they will continue to work 
without modification.

Thanks!

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: hadoop-9902-1.patch, more-info.txt


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HADOOP-9902) Shell script rewrite

2013-10-03 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13785489#comment-13785489
 ] 

Allen Wittenauer commented on HADOOP-9902:
--

Agreed. The edge cases are too painful.

The only dupe jar detection that occurs now is an extremely simple string 
match.  So if someone does something like $DIR/lib/blah.jar and 
$DIR/lib/../lib/blah.jar, it won't get deduped.  (It does, however, verify that 
$DIR/lib and $DIR/lib/../lib exist!)  Even just this simple approach 
eliminates multiple instances of the conf dir, at a minimum.
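The limitation described above can be demonstrated directly. This is a sketch under assumed names, not the actual Hadoop code: a plain string match treats `lib` and `lib/../lib` as distinct entries even though they resolve to the same directory, while still verifying that each directory exists.

```shell
# Illustrative only: string-match dedupe keeps both spellings of the
# same directory, but does verify each one exists before appending.
add_entry() {
  [ -d "$1" ] || return 0            # verify the directory exists
  case ":${CP}:" in
    *":$1:"*) ;;                     # exact string duplicate: skip
    *) CP="${CP:+${CP}:}$1" ;;
  esac
}

DIR="$(mktemp -d)"
mkdir "${DIR}/lib"
CP=""
add_entry "${DIR}/lib"
add_entry "${DIR}/lib/../lib"        # same directory, different string: kept
echo "${CP}"
```

Catching this case would require canonicalizing each path first, which is exactly the normalization trade-off discussed earlier.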

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: hadoop-9902-1.patch, more-info.txt


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.





[jira] [Commented] (HADOOP-10034) optimize same-filesystem symlinks by doing resolution server-side

2013-10-09 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13791005#comment-13791005
 ] 

Allen Wittenauer commented on HADOOP-10034:
---

Won't doing this preclude us from ever adding real relative paths into HDFS?  i.e., 
supporting '..' 

 optimize same-filesystem symlinks by doing resolution server-side
 -

 Key: HADOOP-10034
 URL: https://issues.apache.org/jira/browse/HADOOP-10034
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs
Reporter: Colin Patrick McCabe

 We should optimize same-filesystem symlinks by doing resolution server-side 
 rather than client side, as discussed on HADOOP-9780.





[jira] [Commented] (HADOOP-9902) Shell script rewrite

2014-01-21 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13877702#comment-13877702
 ] 

Allen Wittenauer commented on HADOOP-9902:
--

Adding a link to HADOOP-10177 and HDFS-4763 to include the changes added by 
those patches.

(It should be noted that neither patch listed included CLI help info for the 
new sub-commands they added...)

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: hadoop-9902-1.patch, more-info.txt


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HADOOP-9902) Shell script rewrite

2014-01-21 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13877709#comment-13877709
 ] 

Allen Wittenauer commented on HADOOP-9902:
--

Would anyone be too upset by a patch to trunk that removed the 'deprecated' 
status? i.e., no longer warning, etc?  The fact that we no longer support the 
HDFS and MR sub-commands will already have been in a release by then.

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.1.1-beta
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
 Attachments: hadoop-9902-1.patch, more-info.txt


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.





[jira] [Commented] (HADOOP-7476) task-controller can drop last char from config file

2011-07-19 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067978#comment-13067978
 ] 

Allen Wittenauer commented on HADOOP-7476:
--

While working on porting task-controller, I ran into getline():

{code}
size_read = getline(&line, &linesize, conf_file);
// feof() returns true only after we read past EOF,
// so a file with no trailing newline can reach this point;
// if size_read is negative, check for the EOF condition.
if (size_read == -1) {
  if (!feof(conf_file)) {
    fprintf(LOGFILE, "getline returned error.\n");
    exit(INVALID_CONFIG_FILE);
  } else {
    free(line);
    break;
  }
}
// trim the trailing newline
line[strlen(line) - 1] = '\0';
// comment line
{code}

My read of this code says that we always remove the last character of the 
buffer prior to the null terminator.  In the vast majority of cases, this 
should be \n.  However, getline() doesn't appear to guarantee this:

"The buffer is null-terminated and includes the newline character, if one was 
found."

If the configuration file was built in such a way that it does not end with a 
newline, this code will chop off the last character of the final line. 

 task-controller can drop last char from config file
 ---

 Key: HADOOP-7476
 URL: https://issues.apache.org/jira/browse/HADOOP-7476
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 0.20.203.0
Reporter: Allen Wittenauer
Priority: Trivial

 It looks as though task-controller's configuration file reader assumes that 
 the output of getline() always ends with \n\0.  This assumption does not 
 appear to be safe.  See comments for more. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-7476) task-controller can drop last char from config file

2011-07-19 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067988#comment-13067988
 ] 

Allen Wittenauer commented on HADOOP-7476:
--

sure is, missed that one.  thanks.

 task-controller can drop last char from config file
 ---

 Key: HADOOP-7476
 URL: https://issues.apache.org/jira/browse/HADOOP-7476
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 0.20.203.0
Reporter: Allen Wittenauer
Priority: Trivial

 It looks as though task-controller's configuration file reader assumes that 
 the output of getline() always ends with \n\0.  This assumption does not 
 appear to be safe.  See comments for more. 





[jira] [Commented] (HADOOP-7476) task-controller can drop last char from config file

2011-07-19 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13068073#comment-13068073
 ] 

Allen Wittenauer commented on HADOOP-7476:
--

OS X and Solaris.  As usual, this typically means removing the GNU-only crud*.  
At this point, my task-controller uses fgetln() instead of getline().  Since 
I'm lazy, it is easier to find importable code that implements fgetln() in a 
portable fashion than getline(). (Although if getline() is present without 
fgetln(), I've got a wrapper that implements fgetln() using getline().)

* Technically, getline() was added to POSIX fairly recently, but none of the 
platforms that I have access to have it other than glibc-based machines.  So 
it isn't that portable yet. :(

 task-controller can drop last char from config file
 ---

 Key: HADOOP-7476
 URL: https://issues.apache.org/jira/browse/HADOOP-7476
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 0.20.203.0
Reporter: Allen Wittenauer
Priority: Trivial

 It looks as though task-controller's configuration file reader assumes that 
 the output of getline() always ends with \n\0.  This assumption does not 
 appear to be safe.  See comments for more. 





[jira] [Commented] (HADOOP-7371) Improve tarball distributions

2011-07-29 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13072972#comment-13072972
 ] 

Allen Wittenauer commented on HADOOP-7371:
--

bq. Sources are compressed to a jar file as 
$HADOOP_PREFIX/share/hadoop/hadoop-source-[version].jar, Javadoc is compressed 
as $HADOOP_PREFIX/share/javadoc/hadoop-javadoc-[version].jar

Do we really want to use jar for these?  This could lead to massive confusion.  
Besides, if these are part of the *tarball* distribution, the user clearly has 
*tar* available...

 Improve tarball distributions
 -

 Key: HADOOP-7371
 URL: https://issues.apache.org/jira/browse/HADOOP-7371
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
 Environment: Java 6, Redhat 5.5
Reporter: Eric Yang
Assignee: Eric Yang
 Fix For: 0.23.0

 Attachments: HADOOP-7371.patch


 Hadoop release tarball contains both raw source and binary.  This leads users 
 to use the release tarball as base for applying patches, to build custom 
 Hadoop.  This is not the recommended method to develop hadoop because it 
 leads to mixed development system where processed files and raw source are 
 hard to separate.  
 To correct the problematic usage of the release tarball, the release build 
 target should be defined as:
 ant source generates source release tarball.
 ant binary is binary release without source/javadoc jar files.
 ant tar is a mirror of binary release with source/javadoc jar files.
 Does this sound reasonable?





[jira] [Commented] (HADOOP-7371) Improve tarball distributions

2011-07-31 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073419#comment-13073419
 ] 

Allen Wittenauer commented on HADOOP-7371:
--

Why would Eclipse users use the tarball?  Besides, don't Eclipse users have 
other things they need to do before they can actually do things with Hadoop?

 Improve tarball distributions
 -

 Key: HADOOP-7371
 URL: https://issues.apache.org/jira/browse/HADOOP-7371
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
 Environment: Java 6, Redhat 5.5
Reporter: Eric Yang
Assignee: Eric Yang
 Fix For: 0.23.0

 Attachments: HADOOP-7371.patch


 Hadoop release tarball contains both raw source and binary.  This leads users 
 to use the release tarball as base for applying patches, to build custom 
 Hadoop.  This is not the recommended method to develop hadoop because it 
 leads to mixed development system where processed files and raw source are 
 hard to separate.  
 To correct the problematic usage of the release tarball, the release build 
 target should be defined as:
 ant source generates source release tarball.
 ant binary is binary release without source/javadoc jar files.
 ant tar is a mirror of binary release with source/javadoc jar files.
 Does this sound reasonable?





[jira] [Commented] (HADOOP-7371) Improve tarball distributions

2011-08-01 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073662#comment-13073662
 ] 

Allen Wittenauer commented on HADOOP-7371:
--

FWIW, I'm not going to block this, but I still think it is going to lead to 
confusion, except for maybe the three people who debug production grids with 
eclipse.

 Improve tarball distributions
 -

 Key: HADOOP-7371
 URL: https://issues.apache.org/jira/browse/HADOOP-7371
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
 Environment: Java 6, Redhat 5.5
Reporter: Eric Yang
Assignee: Eric Yang
 Fix For: 0.23.0

 Attachments: HADOOP-7371.patch


 Hadoop release tarball contains both raw source and binary.  This leads users 
 to use the release tarball as base for applying patches, to build custom 
 Hadoop.  This is not the recommended method to develop hadoop because it 
 leads to mixed development system where processed files and raw source are 
 hard to separate.  
 To correct the problematic usage of the release tarball, the release build 
 target should be defined as:
 ant source generates source release tarball.
 ant binary is binary release without source/javadoc jar files.
 ant tar is a mirror of binary release with source/javadoc jar files.
 Does this sound reasonable?





[jira] [Commented] (HADOOP-7494) Add -c option for FSshell -tail

2011-08-02 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13078315#comment-13078315
 ] 

Allen Wittenauer commented on HADOOP-7494:
--

What happens when this is used against non-HDFS filesystems, or with large values of -c?

 Add -c option for FSshell -tail
 ---

 Key: HADOOP-7494
 URL: https://issues.apache.org/jira/browse/HADOOP-7494
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 0.23.0
Reporter: XieXianshan
Assignee: XieXianshan
Priority: Trivial
 Fix For: 0.23.0

 Attachments: HADOOP-7494.patch


 Add the -c option for FSshell -tail to allow users to specify the output 
 bytes (currently, it's -1024 by default).
 For instance:
 $ hdfs dfs -tail -c -10 /user/hadoop/xiexs
 or
 $ hdfs dfs -tail -c+10 /user/hadoop/xiexs





[jira] [Commented] (HADOOP-7499) Add method for doing a sanity check on hostnames in NetUtils

2011-08-02 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13078476#comment-13078476
 ] 

Allen Wittenauer commented on HADOOP-7499:
--

Does the test actually do a DNS lookup?

 Add method for doing a sanity check on hostnames in NetUtils
 

 Key: HADOOP-7499
 URL: https://issues.apache.org/jira/browse/HADOOP-7499
 Project: Hadoop Common
  Issue Type: Bug
  Components: util
Affects Versions: 0.23.0
Reporter: Jeffrey Naisbitt
Assignee: Jeffrey Naisbitt
 Fix For: 0.23.0

 Attachments: HADOOP-7499.patch


 As part of MAPREDUCE-2489, we need a method in NetUtils to do a sanity check 
 on hostnames





[jira] [Commented] (HADOOP-7506) hadoopcommon build version cant be set from the maven commandline

2011-08-03 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13079039#comment-13079039
 ] 

Allen Wittenauer commented on HADOOP-7506:
--

If we can't change the version # at build time, I don't think we'll be able to 
upgrade server side-only components without also upgrading all the clients.  
That's a major hit on the ops side.  If that holds true, then we'll need to 
back out the maven patch before release if we can't fix this.

 hadoopcommon build version cant be set from the maven commandline
 -

 Key: HADOOP-7506
 URL: https://issues.apache.org/jira/browse/HADOOP-7506
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: build
Affects Versions: 0.23.0
Reporter: Giridharan Kesavan
Assignee: Giridharan Kesavan
 Attachments: HADOOP-7506.PATCH


 pom.xml had to introduce a hadoop.version property with the default value set 
 to the snapshot version. If someone wants to override the version at build 
 time from the Maven command line, they can do so by passing 
 -Dhadoop.version=. People who don't want to change the default version 
 can continue building as before. 





[jira] [Commented] (HADOOP-7519) hadoop fs commands should support tar/gzip or an equivalent

2011-08-05 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13080114#comment-13080114
 ] 

Allen Wittenauer commented on HADOOP-7519:
--

BTW, I'm fairly certain that distcp works against file:// .

 hadoop fs commands should support tar/gzip or an equivalent
 ---

 Key: HADOOP-7519
 URL: https://issues.apache.org/jira/browse/HADOOP-7519
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 0.20.1
Reporter: Keith Wiley
Priority: Minor
  Labels: hadoop

 The hadoop fs subcommand should offer options for batching, unbatching, 
 compressing, and uncompressing files on hdfs.  The equivalent of hadoop fs 
 -tar or hadoop fs -gzip.  These commands would greatly facilitate moving 
 large data (especially in a large number of files) back and forth from hdfs.





[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

2011-08-05 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13080121#comment-13080121
 ] 

Allen Wittenauer commented on HADOOP-7521:
--

-1

This completely breaks with customary tarball behavior.  The expectation 
when you unpack a tarball is that it will be in (pkgname)-(version).  Users are 
*expecting* component separation and in many cases *prefer* component 
separation.  If someone wants a more integrated experience, they'll use the 
rpm, deb, etc., packaging.



 bintar created tarball should use a common directory for prefix
 ---

 Key: HADOOP-7521
 URL: https://issues.apache.org/jira/browse/HADOOP-7521
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.0
 Environment: Java 6, Maven, Linux/Mac
Reporter: Eric Yang
Assignee: Eric Yang
 Attachments: HADOOP-7521.patch


 The binary tarball contains the directory structure like:
 {noformat}
 hadoop-common-0.23.0-SNAPSHOT-bin/bin
  /etc/hadoop
  /libexec
  /sbin
  /share/hadoop/common
 {noformat}
 It would be nice to rename the prefix directory to a common directory where 
 it is common to all Hadoop stack software.  Therefore, user can untar hbase, 
 hadoop, zookeeper, pig, hive all into the same location and run from the top 
 level directory without manually renaming them to the same directory again.
 By default the prefix directory can be /usr.  Hence, it could merge with the 
 base OS.





[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

2011-08-05 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13080161#comment-13080161
 ] 

Allen Wittenauer commented on HADOOP-7521:
--

bq. Allen, the isolated tarball (pkgname-version) is still supported by tar 
profile. We are discussing merged layout here.

If merged layout is:

bq. Therefore, user can untar hbase, hadoop, zookeeper, pig, hive all into the 
same location and run from the top level directory without manually renaming 
them to the same directory again.

then I'm still -1.

It is just a flawed idea to try to treat tar as the equivalent of rpm.  They 
aren't equivalent.



 bintar created tarball should use a common directory for prefix
 ---

 Key: HADOOP-7521
 URL: https://issues.apache.org/jira/browse/HADOOP-7521
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.0
 Environment: Java 6, Maven, Linux/Mac
Reporter: Eric Yang
Assignee: Eric Yang
 Attachments: HADOOP-7521.patch


 The binary tarball contains the directory structure like:
 {noformat}
 hadoop-common-0.23.0-SNAPSHOT-bin/bin
  /etc/hadoop
  /libexec
  /sbin
  /share/hadoop/common
 {noformat}
 It would be nice to rename the prefix directory to a common directory where 
 it is common to all Hadoop stack software.  Therefore, user can untar hbase, 
 hadoop, zookeeper, pig, hive all into the same location and run from the top 
 level directory without manually renaming them to the same directory again.
 By default the prefix directory can be /usr.  Hence, it could merge with the 
 base OS.





[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

2011-08-05 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13080183#comment-13080183
 ] 

Allen Wittenauer commented on HADOOP-7521:
--

You mean other than the fact that few to no other tarballs on the Internet do 
this? 

People who use binary tarballs to deploy things where there is an RPM almost 
always want package separation and a higher level of control over where things 
get placed. Changing this paradigm is going to be surprising and counter to 
those end-user goals.

In other words:  This isn't broke.  Stop trying to fix it.

 bintar created tarball should use a common directory for prefix
 ---

 Key: HADOOP-7521
 URL: https://issues.apache.org/jira/browse/HADOOP-7521
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.0
 Environment: Java 6, Maven, Linux/Mac
Reporter: Eric Yang
Assignee: Eric Yang
 Attachments: HADOOP-7521.patch


 The binary tarball contains the directory structure like:
 {noformat}
 hadoop-common-0.23.0-SNAPSHOT-bin/bin
  /etc/hadoop
  /libexec
  /sbin
  /share/hadoop/common
 {noformat}
 It would be nice to rename the prefix directory to a common directory where 
 it is common to all Hadoop stack software.  Therefore, user can untar hbase, 
 hadoop, zookeeper, pig, hive all into the same location and run from the top 
 level directory without manually renaming them to the same directory again.
 By default the prefix directory can be /usr.  Hence, it could merge with the 
 base OS.





[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

2011-08-05 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13080202#comment-13080202
 ] 

Allen Wittenauer commented on HADOOP-7521:
--

bq. It is standard practice with popular Ops tools. 

Yet your examples are dev tools.  

-1 remains.  Might as well close this as won't fix.

 bintar created tarball should use a common directory for prefix
 ---

 Key: HADOOP-7521
 URL: https://issues.apache.org/jira/browse/HADOOP-7521
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.0
 Environment: Java 6, Maven, Linux/Mac
Reporter: Eric Yang
Assignee: Eric Yang
 Attachments: HADOOP-7521.patch


 The binary tarball contains the directory structure like:
 {noformat}
 hadoop-common-0.23.0-SNAPSHOT-bin/bin
  /etc/hadoop
  /libexec
  /sbin
  /share/hadoop/common
 {noformat}
 It would be nice to rename the prefix directory to a common directory where 
 it is common to all Hadoop stack software.  Therefore, user can untar hbase, 
 hadoop, zookeeper, pig, hive all into the same location and run from the top 
 level directory without manually renaming them to the same directory again.
 By default the prefix directory can be /usr.  Hence, it could merge with the 
 base OS.





[jira] [Commented] (HADOOP-7521) bintar created tarball should use a common directory for prefix

2011-08-05 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13080251#comment-13080251
 ] 

Allen Wittenauer commented on HADOOP-7521:
--

Beyond just the tarbomb problem, you've got file and permission problems.

 bintar created tarball should use a common directory for prefix
 ---

 Key: HADOOP-7521
 URL: https://issues.apache.org/jira/browse/HADOOP-7521
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.0
 Environment: Java 6, Maven, Linux/Mac
Reporter: Eric Yang
Assignee: Eric Yang
 Attachments: HADOOP-7521.patch


 The binary tarball contains the directory structure like:
 {noformat}
 hadoop-common-0.23.0-SNAPSHOT-bin/bin
  /etc/hadoop
  /libexec
  /sbin
  /share/hadoop/common
 {noformat}
 It would be nice to rename the prefix directory to a common directory where 
 it is common to all Hadoop stack software.  Therefore, user can untar hbase, 
 hadoop, zookeeper, pig, hive all into the same location and run from the top 
 level directory without manually renaming them to the same directory again.
 By default the prefix directory can be /usr.  Hence, it could merge with the 
 base OS.





[jira] [Commented] (HADOOP-7550) Need for Integrity Validation of RPC

2011-08-18 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13087431#comment-13087431
 ] 

Allen Wittenauer commented on HADOOP-7550:
--

From what I remember, krb5 vs. krb5i was about a 5-10% perf degradation, and 
krb5p was about another 5%. I'd expect going from nothing to krb5i or krb5p to 
be fairly horrific.  On the plus side, these are already implemented, known 
quantities, etc.  With hardware-accelerated crypto now common, the numbers are 
likely lower for anyone using anything relatively modern on non-Intel gear.  
For Intel gear, enabling AES support would probably help.

 Need for Integrity Validation of RPC
 

 Key: HADOOP-7550
 URL: https://issues.apache.org/jira/browse/HADOOP-7550
 Project: Hadoop Common
  Issue Type: Improvement
  Components: ipc
Reporter: Dave Thompson
Assignee: Dave Thompson

 Some recent investigation of network packet corruption has shown a need for 
 hadoop RPC integrity validation beyond assurances already provided by 802.3 
 link layer and TCP 16-bit CRC.
 During an unusual occurrence on a 4k node cluster, we've seen as high as 4 
 TCP anomalies per second on a single node, sustained over an hour (14k per 
 hour).   A TCP anomaly  would be an escaped link layer packet that resulted 
 in a TCP CRC failure, TCP packet out of sequence
 or TCP packet size error.
 According to this paper[*]:  http://tinyurl.com/3aue72r
 TCP's 16-bit CRC has an effective detection rate of 2^10.   1 in 1024 errors 
 may escape detection, and in fact what originally alerted us to this issue 
 was seeing failures due to bit-errors in hadoop traffic.  Extrapolating from 
 that paper, one might expect 14 escaped packet errors per hour for that 
 single node of a 4k cluster.  While the above error rate
 was unusually high due to a broadband aggregate switch issue, hadoop not 
 having an integrity check on RPC makes it problematic to discover, and limit 
 any potential data damage due to
 acting on a corrupt RPC message.
 --
 [*] In case this jira outlives that tinyurl, the IEEE paper cited is:  
 Performance of Checksums and CRCs over Real Data by Jonathan Stone, Michael 
 Greenwald, Craig Partridge, Jim Hughes.





[jira] [Commented] (HADOOP-7596) Enable jsvc to work with Hadoop RPM package

2011-09-01 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13095440#comment-13095440
 ] 

Allen Wittenauer commented on HADOOP-7596:
--

bq. Hadoop only works with Sun Java. 

This isn't true, and it is one of the reasons why attempting to figure out 
programmatically which Java to use is full of potholes.

 Enable jsvc to work with Hadoop RPM package
 ---

 Key: HADOOP-7596
 URL: https://issues.apache.org/jira/browse/HADOOP-7596
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.20.204.0
 Environment: Java 6, RedHat EL 5.6
Reporter: Eric Yang
Assignee: Eric Yang
 Fix For: 0.20.205.0

 Attachments: HADOOP-7596.patch


 For a secure Hadoop 0.20.2xx cluster, the datanode can only run with a 32-bit JVM 
 because Hadoop only packages a 32-bit jsvc.  The build process should download 
 the proper jsvc version based on the build architecture.  In addition, the shell 
 script should be enhanced to locate the Hadoop jar files in the proper location.
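
Selecting the jsvc artifact per build architecture could look like the sketch below.  The mapping and names here are assumptions for illustration, not the actual Hadoop build logic.

```python
# Illustrative only: map the build machine's reported architecture
# to a jsvc artifact label, falling back to the raw machine string.
import platform

ARCH_TO_JSVC = {           # assumed mapping, not from the patch
    "x86_64": "amd64",
    "amd64": "amd64",
    "i386": "i386",
    "i686": "i386",
}

machine = platform.machine()
jsvc_arch = ARCH_TO_JSVC.get(machine, machine)
print("would download jsvc build for", jsvc_arch)
```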





[jira] [Commented] (HADOOP-7596) Enable jsvc to work with Hadoop RPM package

2011-09-01 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13095500#comment-13095500
 ] 

Allen Wittenauer commented on HADOOP-7596:
--

http://wiki.apache.org/hadoop/HadoopJavaVersions






[jira] [Commented] (HADOOP-7603) Set default hdfs, mapred uid, and hadoop group gid for RPM packages

2011-09-02 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13096074#comment-13096074
 ] 

Allen Wittenauer commented on HADOOP-7603:
--

bq. What group uses 49?

wnn uses it:

http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Deployment_Guide/s1-users-groups-standard-users.html



 Set default hdfs, mapred uid, and hadoop group gid for RPM packages
 ---

 Key: HADOOP-7603
 URL: https://issues.apache.org/jira/browse/HADOOP-7603
 Project: Hadoop Common
  Issue Type: Bug
 Environment: Java, Redhat EL, Ubuntu
Reporter: Eric Yang
Assignee: Eric Yang

 The Hadoop rpm package creates hdfs and mapred users and a hadoop group for 
 automatically setting up the pid directory and log directory with proper 
 permissions.  The default headless users should have fixed uid and gid 
 numbers defined.
 Searched through the standard uids and gids on both the Redhat and Debian 
 distros.  It looks like:
 {noformat}
 uid: 201 for hdfs
 uid: 202 for mapred
 gid: 49 for hadoop
 {noformat}
 would be free for use.
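
Whether a proposed fixed id is actually unclaimed can be checked against the local account databases.  This helper is an illustration, not part of the patch; as noted in the comment above, gid 49 is already assigned to wnn on Red Hat systems, so results vary by host.

```python
# Check whether candidate uid/gid values are unclaimed locally.
import grp
import pwd

def uid_free(uid: int) -> bool:
    """True if no passwd entry claims this uid."""
    try:
        pwd.getpwuid(uid)
    except KeyError:
        return True
    return False

def gid_free(gid: int) -> bool:
    """True if no group entry claims this gid."""
    try:
        grp.getgrgid(gid)
    except KeyError:
        return True
    return False

for uid in (201, 202):       # proposed hdfs / mapred uids
    print(uid, uid_free(uid))
print(49, gid_free(49))      # proposed hadoop gid
```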





[jira] [Commented] (HADOOP-7624) Set things up for a top level hadoop-tools module

2011-09-12 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13102835#comment-13102835
 ] 

Allen Wittenauer commented on HADOOP-7624:
--

We need a rule set for what goes into tools so that we don't create contrib v2.  
Until then, I'm very much -1 on this.

 Set things up for a top level hadoop-tools module
 -

 Key: HADOOP-7624
 URL: https://issues.apache.org/jira/browse/HADOOP-7624
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Reporter: Vinod Kumar Vavilapalli

 See this thread: http://markmail.org/thread/cxtz3i6lvztfgfxn
 We need to get things up and running for a top level hadoop-tools module. 
 DistCpV2 will be the first resident of this new home. Things we need:
  - The module itself and a top level pom with appropriate dependencies
  - Integration with the patch builds for the new module
  - Integration with the post-commit and nightly builds for the new module.





[jira] [Commented] (HADOOP-7624) Set things up for a top level hadoop-tools module

2011-09-12 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13102921#comment-13102921
 ] 

Allen Wittenauer commented on HADOOP-7624:
--

It is not acceptable to say "we're going to create this anyway and deal with 
the consequences later."






[jira] [Commented] (HADOOP-7624) Set things up for a top level hadoop-tools module

2011-09-12 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13102963#comment-13102963
 ] 

Allen Wittenauer commented on HADOOP-7624:
--

This JIRA was set up under the premise of creating the hadoop-tools space.  
Even the summary statement says "Set things up for a top level hadoop-tools 
module".  It seems logical to me that this is the space where this discussion 
needs to happen.  






[jira] [Commented] (HADOOP-7624) Set things up for a top level hadoop-tools module

2011-09-12 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13103080#comment-13103080
 ] 

Allen Wittenauer commented on HADOOP-7624:
--

No, there was not consensus.  I even said on the mailing list that I would 
oppose this without some rules to prevent it from turning into contrib v2.0.

I honestly think that the only way to prevent this from turning into a complete 
mess is to essentially make it a full-fledged sub-project.






[jira] [Commented] (HADOOP-7624) Set things up for a top level hadoop-tools module

2011-09-13 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13103834#comment-13103834
 ] 

Allen Wittenauer commented on HADOOP-7624:
--

I see a handful of choices:

a) rename contrib to tools and quit lying to ourselves that putting random 
stuff in a different directory makes them special

b) integrate these components directly into the mapreduce jar

c) make a new Hadoop sub project to hold these random things

d) just keep these components in contrib

e) make tools separate from contrib, but actually put some rules and process 
around what goes in there so that we don't end up with the same mess we had 
before

I basically don't want to see a repeat of history.  If we don't do this now, in 
a year or three we're going to be back to "we need to prune 
contrib^H^H^H^H^H^H^Htools of all this abandoned source."

If these things are important, just integrate them directly into the mainline 
jars and be done with it.

 Set things up for a top level hadoop-tools module
 -

 Key: HADOOP-7624
 URL: https://issues.apache.org/jira/browse/HADOOP-7624
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Reporter: Vinod Kumar Vavilapalli
Assignee: Alejandro Abdelnur
 Attachments: HADOOP-7624.patch


 See this thread: http://markmail.org/thread/cxtz3i6lvztfgfxn
 We need to get things up and running for a top level hadoop-tools module. 
 DistCpV2 will be the first resident of this new home. Things we need:
  - The module itself and a top level pom with appropriate dependencies
  - Integration with the patch builds for the new module
  - Integration with the post-commit and nightly builds for the new module.





[jira] [Commented] (HADOOP-7624) Set things up for a top level hadoop-tools module

2011-09-17 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13107297#comment-13107297
 ] 

Allen Wittenauer commented on HADOOP-7624:
--

I'm removing my -1.  Commit away.






[jira] [Resolved] (HADOOP-7228) jar names are not compatible with 0.20.2

2011-09-19 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-7228.
--

Resolution: Won't Fix

Surprise.

 jar names are not compatible with 0.20.2
 

 Key: HADOOP-7228
 URL: https://issues.apache.org/jira/browse/HADOOP-7228
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.20.203.0
Reporter: Allen Wittenauer
Priority: Critical

 The jars in 203 are named differently vs. Apache Hadoop 0.20.2.  I understand 
 this was done to make the Maven people less cranky.  However, this breaks 
 compatibility, especially for streaming users.  We need to make sure we have a 
 release note or something significant so that users aren't taken by surprise.





[jira] [Commented] (HADOOP-7228) jar names are not compatible with 0.20.2

2011-09-22 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13112895#comment-13112895
 ] 

Allen Wittenauer commented on HADOOP-7228:
--

Pretty much too late.  branch-20-security is all about breaking compatibility 
with 0.20.[0-2], it seems.





