[jira] [Updated] (HADOOP-8569) CMakeLists.txt: define _GNU_SOURCE and _LARGEFILE_SOURCE

2012-07-06 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HADOOP-8569:
-

Attachment: HADOOP-8569.001.patch

 CMakeLists.txt: define _GNU_SOURCE and _LARGEFILE_SOURCE
 

 Key: HADOOP-8569
 URL: https://issues.apache.org/jira/browse/HADOOP-8569
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HADOOP-8569.001.patch


 In the native code, we should define _GNU_SOURCE and _LARGEFILE_SOURCE so 
 that all of the functions on Linux are available.
 _LARGEFILE enables fseeko and ftello; _GNU_SOURCE enables a variety of 
 Linux-specific functions from glibc, including sync_file_range.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8569) CMakeLists.txt: define _GNU_SOURCE and _LARGEFILE_SOURCE

2012-07-06 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HADOOP-8569:
-

Status: Patch Available  (was: Open)

 CMakeLists.txt: define _GNU_SOURCE and _LARGEFILE_SOURCE
 

 Key: HADOOP-8569
 URL: https://issues.apache.org/jira/browse/HADOOP-8569
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HADOOP-8569.001.patch


 In the native code, we should define _GNU_SOURCE and _LARGEFILE_SOURCE so 
 that all of the functions on Linux are available.
 _LARGEFILE enables fseeko and ftello; _GNU_SOURCE enables a variety of 
 Linux-specific functions from glibc, including sync_file_range.





[jira] [Commented] (HADOOP-8566) AvroReflectSerializer.accept(Class) throws a NPE if the class has no package (primitive types and arrays)

2012-07-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13407745#comment-13407745
 ] 

Hadoop QA commented on HADOOP-8566:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12535300/HADOOP-8566.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common:

  org.apache.hadoop.ha.TestZKFailoverController
  org.apache.hadoop.io.file.tfile.TestTFileByteArrays
  org.apache.hadoop.io.file.tfile.TestTFileJClassComparatorByteArrays

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1173//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1173//console

This message is automatically generated.

 AvroReflectSerializer.accept(Class) throws a NPE if the class has no package 
 (primitive types and arrays)
 -

 Key: HADOOP-8566
 URL: https://issues.apache.org/jira/browse/HADOOP-8566
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 2.0.0-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 2.0.1-alpha

 Attachments: HADOOP-8566.patch


 the accept() method should consider the case where the class getPackage() 
 returns NULL.





[jira] [Created] (HADOOP-8570) Bzip2Codec should accept .bz files too

2012-07-06 Thread Harsh J (JIRA)
Harsh J created HADOOP-8570:
---

 Summary: Bzip2Codec should accept .bz files too
 Key: HADOOP-8570
 URL: https://issues.apache.org/jira/browse/HADOOP-8570
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 2.0.0-alpha, 1.0.0
Reporter: Harsh J


The default extension reported for Bzip2Codec today is .bz2. This causes it 
not to pick up .bz files as Bzip2Codec files. Although the extension is not 
very popular today, it is still mentioned as a valid extension in the bunzip 
manual, and we should support it.

We should either change the Bzip2Codec default extension to .bz, or add 
support for an extension list to allow for better detection across various 
aliases.
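The extension-list idea above could look like the following sketch. This is a hypothetical, self-contained illustration; the class and method names are not Hadoop's actual CompressionCodec API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: a codec that recognizes several extension aliases
// instead of reporting a single default extension like ".bz2".
public class Bzip2Extensions {
    // Aliases the codec would accept; ".bz2" stays the primary default.
    static final List<String> EXTENSIONS = Arrays.asList(".bz2", ".bz");

    // True if the file name ends with any known bzip2 extension.
    static boolean matches(String path) {
        return EXTENSIONS.stream().anyMatch(path::endsWith);
    }

    public static void main(String[] args) {
        System.out.println(matches("data.bz2")); // true
        System.out.println(matches("data.bz"));  // true
        System.out.println(matches("data.gz"));  // false
    }
}
```

With such a list, codec lookup by file name would treat .bz and .bz2 uniformly, rather than requiring the default extension to change.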





[jira] [Created] (HADOOP-8571) Improve resource cleaning when shutting down

2012-07-06 Thread Guillaume Nodet (JIRA)
Guillaume Nodet created HADOOP-8571:
---

 Summary: Improve resource cleaning when shutting down
 Key: HADOOP-8571
 URL: https://issues.apache.org/jira/browse/HADOOP-8571
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Guillaume Nodet








[jira] [Commented] (HADOOP-8571) Improve resource cleaning when shutting down

2012-07-06 Thread Guillaume Nodet (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13407876#comment-13407876
 ] 

Guillaume Nodet commented on HADOOP-8571:
-

I've committed a patch in a github fork at
   
https://github.com/gnodet/hadoop-common/commit/d7a6738429716000376df344ae68ee1a1a630223

 Improve resource cleaning when shutting down
 

 Key: HADOOP-8571
 URL: https://issues.apache.org/jira/browse/HADOOP-8571
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Guillaume Nodet







[jira] [Commented] (HADOOP-8541) Better high-percentile latency metrics

2012-07-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13407914#comment-13407914
 ] 

Hadoop QA commented on HADOOP-8541:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12535283/hadoop-8541-2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 2 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 javadoc.  The javadoc tool appears to have generated 1 warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common:

  org.apache.hadoop.ha.TestZKFailoverController
  org.apache.hadoop.io.file.tfile.TestTFileByteArrays
  org.apache.hadoop.io.file.tfile.TestTFileJClassComparatorByteArrays

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1174//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1174//console


 Better high-percentile latency metrics
 --

 Key: HADOOP-8541
 URL: https://issues.apache.org/jira/browse/HADOOP-8541
 Project: Hadoop Common
  Issue Type: Improvement
  Components: metrics
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hadoop-8541-1.patch, hadoop-8541-2.patch


 Based on discussion in HBASE-6261 and with some HDFS devs, I'd like to make 
 better high-percentile latency metrics a part of hadoop-common.
 I've already got a working implementation of [1], an efficient algorithm for 
 estimating quantiles on a stream of values. It allows you to specify 
 arbitrary quantiles to track (e.g. 50th, 75th, 90th, 95th, 99th), along with 
 tight error bounds. This estimator can be snapshotted and reset periodically 
 to get a feel for how these percentiles are changing over time.
 I propose creating a new MutableQuantiles class that does this. [1] isn't 
 completely without overhead (~1MB memory for reasonably sized windows), which 
 is why I hesitate to add it to the existing MutableStat class.
 [1] Cormode, Korn, Muthukrishnan, and Srivastava. Effective Computation of 
 Biased Quantiles over Data Streams in ICDE 2005.
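The snapshot-and-reset semantics described above can be illustrated with a naive exact-quantile window. This sketch buffers all samples and sorts them, which is precisely the overhead the proposed CKMS estimator avoids; it only shows the intended usage pattern, not the algorithm from the cited paper.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch of snapshot-and-reset percentile tracking.
// Exact quantiles over a buffered window; the JIRA proposes the
// space-efficient CKMS estimator instead, which avoids buffering.
public class LatencyWindow {
    private final List<Long> samples = new ArrayList<>();

    void add(long latencyMs) { samples.add(latencyMs); }

    // Value at quantile q (e.g. 0.99) over the current window.
    long quantile(double q) {
        List<Long> sorted = new ArrayList<>(samples);
        Collections.sort(sorted);
        int idx = (int) Math.ceil(q * sorted.size()) - 1;
        return sorted.get(Math.max(idx, 0));
    }

    // Called periodically to start a fresh window, so percentiles
    // reflect recent behavior rather than the whole process lifetime.
    void reset() { samples.clear(); }

    public static void main(String[] args) {
        LatencyWindow w = new LatencyWindow();
        for (long i = 1; i <= 100; i++) w.add(i);
        System.out.println(w.quantile(0.50)); // 50
        System.out.println(w.quantile(0.99)); // 99
    }
}
```

A MutableQuantiles-style metric would expose several such quantiles per snapshot interval while keeping memory bounded via the streaming estimator.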





[jira] [Commented] (HADOOP-8096) add single point where System.exit() is called for better handling in containers

2012-07-06 Thread Guillaume Nodet (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13407928#comment-13407928
 ] 

Guillaume Nodet commented on HADOOP-8096:
-

FWIW, I've been able to run hadoop in OSGi (I'm working on patches right now) 
and I haven't hit any issues with System.exit being called.
I think most of the calls are in Main classes (tools or main runners) and can 
be avoided by controlling the configuration in advance.
For those cases, I think it would be better to have the code throw an exception, 
catch it in the respective main() methods, and call System.exit() there.
In OSGi, the main() methods would not be called, so the OSGi layer would have a 
way to intercept the problems gracefully.
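The throw-then-exit pattern described in the comment might look like this sketch. The class names are illustrative stand-ins, not Hadoop's actual exit-handling code.

```java
// Sketch of the pattern above: library code throws instead of calling
// System.exit() directly, and only main() translates the exception into
// an exit code. An OSGi container that never invokes main() can catch
// ExitException itself and handle shutdown gracefully.
public class ExitDemo {
    static class ExitException extends RuntimeException {
        final int status;
        ExitException(int status, String msg) {
            super(msg);
            this.status = status;
        }
    }

    // Library code signals termination without killing the JVM.
    static void doWork(boolean fail) {
        if (fail) throw new ExitException(1, "fatal condition");
    }

    public static void main(String[] args) {
        try {
            doWork(args.length > 0);
        } catch (ExitException e) {
            // Only the entry point actually exits.
            System.err.println(e.getMessage());
            System.exit(e.status);
        }
    }
}
```

The single choke point also makes it easy to log blocked exits when a container's security manager converts System.exit() into a SecurityException.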


 add single point where System.exit() is called for better handling in 
 containers
 

 Key: HADOOP-8096
 URL: https://issues.apache.org/jira/browse/HADOOP-8096
 Project: Hadoop Common
  Issue Type: New Feature
  Components: util
Affects Versions: 0.24.0
Reporter: Steve Loughran
Assignee: Csaba Miklos
Priority: Trivial
 Fix For: 0.24.0

 Attachments: HADOOP-8096.patch


 with plans for OSGI integration afoot in HADOOP-7977, Hadoop needs unified 
 place where System.exit() calls. When one runs any bit of Hadoop in a 
 containers the container will block those exits with a security manager and 
 convert the calls into security exceptions. A single exit method would enable 
 such exceptions to be logged, and conceivably handled slightly more 
 gracefully (e.g. tell the daemon to die).





[jira] [Created] (HADOOP-8572) Have the ability to force the use of the login user

2012-07-06 Thread Guillaume Nodet (JIRA)
Guillaume Nodet created HADOOP-8572:
---

 Summary: Have the ability to force the use of the login user 
 Key: HADOOP-8572
 URL: https://issues.apache.org/jira/browse/HADOOP-8572
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Guillaume Nodet


In Karaf, most of the code is run under the karaf user. When a user sshes into 
Karaf, commands will be executed under that user.
Deploying hadoop inside Karaf requires that the authenticated Subject has the 
required hadoop principals set, which forces the reconfiguration of the whole 
security layer, even at dev time.

My patch proposes the introduction of a new configuration property 
{{hadoop.security.force.login.user}} which, if set to true (it would default to 
false to keep the current behavior), would force the use of the login user 
instead of the authenticated subject (which is what happens when there's 
no authenticated subject at all).  This greatly simplifies the use of hadoop in 
such environments where security isn't really needed (at dev time).
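The proposed flag's behavior can be sketched as below. The property name comes from the JIRA; everything else (the map-based conf and string users) is a simplified stand-in for Hadoop's Configuration and UserGroupInformation, used here only to show the decision logic.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the proposed hadoop.security.force.login.user behavior.
// Types are hypothetical stand-ins, not the Hadoop security API.
public class LoginUserDemo {
    static final String FORCE_LOGIN_USER = "hadoop.security.force.login.user";

    static String currentUser(Map<String, String> conf,
                              String authenticatedSubject,
                              String loginUser) {
        // Defaults to false, preserving the current behavior.
        boolean force = Boolean.parseBoolean(
            conf.getOrDefault(FORCE_LOGIN_USER, "false"));
        // Forced, or no authenticated subject at all: use the login user.
        if (force || authenticatedSubject == null) return loginUser;
        return authenticatedSubject;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(currentUser(conf, "alice", "karaf")); // alice
        conf.put(FORCE_LOGIN_USER, "true");
        System.out.println(currentUser(conf, "alice", "karaf")); // karaf
    }
}
```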





[jira] [Commented] (HADOOP-8569) CMakeLists.txt: define _GNU_SOURCE and _LARGEFILE_SOURCE

2012-07-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408016#comment-13408016
 ] 

Hadoop QA commented on HADOOP-8569:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12535323/HADOOP-8569.001.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  org.apache.hadoop.hdfs.TestDatanodeBlockScanner
  org.apache.hadoop.hdfs.TestHDFSTrash

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1175//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1175//console


 CMakeLists.txt: define _GNU_SOURCE and _LARGEFILE_SOURCE
 

 Key: HADOOP-8569
 URL: https://issues.apache.org/jira/browse/HADOOP-8569
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HADOOP-8569.001.patch


 In the native code, we should define _GNU_SOURCE and _LARGEFILE_SOURCE so 
 that all of the functions on Linux are available.
 _LARGEFILE enables fseeko and ftello; _GNU_SOURCE enables a variety of 
 Linux-specific functions from glibc, including sync_file_range.





[jira] [Commented] (HADOOP-8572) Have the ability to force the use of the login user

2012-07-06 Thread Guillaume Nodet (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408020#comment-13408020
 ] 

Guillaume Nodet commented on HADOOP-8572:
-

Patch available for review at 
https://github.com/gnodet/hadoop-common/commit/02af662eecd79aa4fd09afd47249e6d025de985e

 Have the ability to force the use of the login user 
 

 Key: HADOOP-8572
 URL: https://issues.apache.org/jira/browse/HADOOP-8572
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Guillaume Nodet

 In Karaf, most of the code is run under the karaf user. When a user sshes 
 into Karaf, commands will be executed under that user.
 Deploying hadoop inside Karaf requires that the authenticated Subject has the 
 required hadoop principals set, which forces the reconfiguration of the whole 
 security layer, even at dev time.
 My patch proposes the introduction of a new configuration property 
 {{hadoop.security.force.login.user}} which, if set to true (it would default 
 to false to keep the current behavior), would force the use of the login user 
 instead of the authenticated subject (which is what happens when there's 
 no authenticated subject at all).  This greatly simplifies the use of hadoop 
 in such environments where security isn't really needed (at dev time).





[jira] [Created] (HADOOP-8573) Configuration tries to read from an inputstream resource multiple times.

2012-07-06 Thread Robert Joseph Evans (JIRA)
Robert Joseph Evans created HADOOP-8573:
---

 Summary: Configuration tries to read from an inputstream resource 
multiple times. 
 Key: HADOOP-8573
 URL: https://issues.apache.org/jira/browse/HADOOP-8573
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 1.0.2, 0.23.3, 2.0.1-alpha, 3.0.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans


If someone calls Configuration.addResource(InputStream) and then 
reloadConfiguration is called for any reason, Configuration will try to reread 
the contents of the InputStream after it has already closed it.

This never showed up in 1.0 because the framework itself does not call 
addResource with an InputStream, and typically by the time user code starts 
running that might call this, all of the default and site resources have 
already been loaded.

In 0.23 mapreduce is now a client library, and mapred-site.xml and 
mapred-default.xml are loaded much later in the process.





[jira] [Commented] (HADOOP-8573) Configuration tries to read from an inputstream resource multiple times.

2012-07-06 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408061#comment-13408061
 ] 

Robert Joseph Evans commented on HADOOP-8573:
-

The only real way to fix this is to cache the contents of the InputStream, or a 
parsed version of it.  The question is how we reduce the potential memory 
impact of this.  We could change Configuration so that it is a list of 
properties instead of a single combined Properties object, but that is a very 
large change for this issue. Because of this I am inclined not to worry 
about the memory impact right now.  This feature is not that commonly 
used, and most Configuration objects tend to be shared a lot, so I don't see the 
impact being that large.
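The caching idea can be sketched as follows. This is a minimal illustration, not Hadoop's Configuration class: it handles a single resource and uses plain java.util.Properties instead of Hadoop's XML parsing, purely to show parse-once, reload-from-cache semantics.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Sketch of the fix direction discussed above: consume the InputStream
// exactly once, cache the parsed result, and serve reloads from the cache.
public class CachedResource {
    private Properties cached;

    void addResource(InputStream in) throws IOException {
        Properties props = new Properties();
        props.load(in);   // the stream is read here, and only here
        in.close();
        cached = props;   // later reloads reuse the parsed copy
    }

    Properties reload() {
        return cached;    // safe to call repeatedly; the stream is gone
    }

    public static void main(String[] args) throws IOException {
        CachedResource c = new CachedResource();
        c.addResource(new ByteArrayInputStream("a=1\n".getBytes()));
        System.out.println(c.reload().getProperty("a")); // 1
        System.out.println(c.reload().getProperty("a")); // still 1
    }
}
```

In the real class the cached properties would replace the InputStream entry in the resources list, which is the memory trade-off the comment weighs.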

 Configuration tries to read from an inputstream resource multiple times. 
 -

 Key: HADOOP-8573
 URL: https://issues.apache.org/jira/browse/HADOOP-8573
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 1.0.2, 0.23.3, 2.0.1-alpha, 3.0.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans

 If someone calls Configuration.addResource(InputStream) and then 
 reloadConfiguration is called for any reason, Configuration will try to 
 reread the contents of the InputStream after it has already closed it.
 This never showed up in 1.0 because the framework itself does not call 
 addResource with an InputStream, and typically by the time user code starts 
 running that might call this, all of the default and site resources have 
 already been loaded.
 In 0.23 mapreduce is now a client library, and mapred-site.xml and 
 mapred-default.xml are loaded much later in the process.





[jira] [Updated] (HADOOP-8571) Improve resource cleaning when shutting down

2012-07-06 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-8571:


 Target Version/s: 1.1.0
Affects Version/s: 3.0.0
   1.0.0
   2.0.0-alpha

Your patch seems to be for just the branch-1 code. Any chance we can get a 
trunk patch as well, with some more detail on what you've changed and why? 
(Some changes are obvious, but others could do with some more info; it would 
help acceptance.)

Thanks!

P.S. Also upload your patch here and grant ASF permission to use it; otherwise 
reviewers may be hesitant to review it, because we can't integrate it unless you 
manually give ASF permission to do so.

 Improve resource cleaning when shutting down
 

 Key: HADOOP-8571
 URL: https://issues.apache.org/jira/browse/HADOOP-8571
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 1.0.0, 2.0.0-alpha, 3.0.0
Reporter: Guillaume Nodet







[jira] [Commented] (HADOOP-8571) Improve resource cleaning when shutting down

2012-07-06 Thread Guillaume Nodet (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408081#comment-13408081
 ] 

Guillaume Nodet commented on HADOOP-8571:
-

Yes, I want to first fix the 1.0 branch to be ready for OSGi and then backport 
everything to trunk.
I'll upload real patches and attach them the usual way (btw, I'm an ASF 
committer already, so I don't think such a grant is really needed in that case 
anyway).

Related to the changes, I think the most problematic issue is CleanupQueue, 
which is a singleton with a thread which is never stopped.  I think 
most of the other threads are controlled somehow, but I need to include the 
following code when stopping the OSGi bundles to correctly stop all the threads:

{code}
FileSystem.closeAll();
CleanupQueue.getInstance().stop();
DefaultMetricsSystem.INSTANCE.shutdown();
{code}


 Improve resource cleaning when shutting down
 

 Key: HADOOP-8571
 URL: https://issues.apache.org/jira/browse/HADOOP-8571
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 1.0.0, 2.0.0-alpha, 3.0.0
Reporter: Guillaume Nodet







[jira] [Created] (HADOOP-8574) Enable starting hadoop services from inside OSGi

2012-07-06 Thread Guillaume Nodet (JIRA)
Guillaume Nodet created HADOOP-8574:
---

 Summary: Enable starting hadoop services from inside OSGi
 Key: HADOOP-8574
 URL: https://issues.apache.org/jira/browse/HADOOP-8574
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Guillaume Nodet


This JIRA captures what is needed in order to start hadoop services in OSGi.

The main idea I used so far consists of:
  * using the OSGi ConfigAdmin to store the hadoop configuration
  * in that configuration, using a few boolean properties to determine which 
services should be started (nameNode, dataNode ...)
  * exposing a configured url handler so that the whole OSGi runtime can use 
urls in hdfs:/xxx
  * using an OSGi ManagedService, so that when the configuration changes, the 
services will be stopped and restarted with the new configuration
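The last bullet's restart-on-reconfigure pattern can be sketched like this. The Service interface and the "nameNode" flag are illustrative stand-ins; real code would implement org.osgi.service.cm.ManagedService and receive the Dictionary pushed by ConfigAdmin.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the ManagedService idea: when the container pushes a new
// configuration, stop the running services and restart them with the
// new values. Types here are hypothetical, not the OSGi API itself.
public class HadoopServices {
    interface Service {
        void start(Map<String, String> conf);
        void stop();
    }

    private final Service nameNode;
    private boolean running;

    HadoopServices(Service nameNode) { this.nameNode = nameNode; }

    // Invoked by the container whenever the stored configuration changes.
    void updated(Map<String, String> conf) {
        if (running) { nameNode.stop(); running = false; }
        // A boolean property in the configuration decides what starts.
        if (Boolean.parseBoolean(conf.getOrDefault("nameNode", "false"))) {
            nameNode.start(conf);
            running = true;
        }
    }

    public static void main(String[] args) {
        HadoopServices hs = new HadoopServices(new Service() {
            public void start(Map<String, String> c) { System.out.println("start"); }
            public void stop() { System.out.println("stop"); }
        });
        Map<String, String> conf = new HashMap<>();
        conf.put("nameNode", "true");
        hs.updated(conf); // start
        hs.updated(conf); // stop, then start with the new configuration
    }
}
```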







[jira] [Updated] (HADOOP-8573) Configuration tries to read from an inputstream resource multiple times.

2012-07-06 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated HADOOP-8573:


Attachment: HADOOP-8573.txt

This patch just does basic caching on the parsed properties for an InputStream. 
 It replaces the InputStream in the resources array with the properties 
themselves.

 Configuration tries to read from an inputstream resource multiple times. 
 -

 Key: HADOOP-8573
 URL: https://issues.apache.org/jira/browse/HADOOP-8573
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 1.0.2, 0.23.3, 2.0.1-alpha, 3.0.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: HADOOP-8573.txt


 If someone calls Configuration.addResource(InputStream) and then 
 reloadConfiguration is called for any reason, Configuration will try to 
 reread the contents of the InputStream after it has already closed it.
 This never showed up in 1.0 because the framework itself does not call 
 addResource with an InputStream, and typically by the time user code starts 
 running that might call this, all of the default and site resources have 
 already been loaded.
 In 0.23 mapreduce is now a client library, and mapred-site.xml and 
 mapred-default.xml are loaded much later in the process.





[jira] [Updated] (HADOOP-8573) Configuration tries to read from an inputstream resource multiple times.

2012-07-06 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated HADOOP-8573:


Status: Patch Available  (was: Open)

I forgot to add in that this patch feels like a hack to me, with loadResource 
returning the properties to be added in, but I could not think of a cleaner way 
to do it without completely refactoring a lot of that code.  If someone else 
has a better way to do this I would be very happy to switch.

 Configuration tries to read from an inputstream resource multiple times. 
 -

 Key: HADOOP-8573
 URL: https://issues.apache.org/jira/browse/HADOOP-8573
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 1.0.2, 0.23.3, 2.0.1-alpha, 3.0.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: HADOOP-8573.txt


 If someone calls Configuration.addResource(InputStream) and then 
 reloadConfiguration is called for any reason, Configuration will try to 
 reread the contents of the InputStream after it has already closed it.
 This never showed up in 1.0 because the framework itself does not call 
 addResource with an InputStream, and typically by the time user code starts 
 running that might call this, all of the default and site resources have 
 already been loaded.
 In 0.23 mapreduce is now a client library, and mapred-site.xml and 
 mapred-default.xml are loaded much later in the process.





[jira] [Commented] (HADOOP-8574) Enable starting hadoop services from inside OSGi

2012-07-06 Thread Guillaume Nodet (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408144#comment-13408144
 ] 

Guillaume Nodet commented on HADOOP-8574:
-

Possible patch to kick off the discussion: 
https://github.com/gnodet/hadoop-common/commit/742ab08aa068424fc2292cf1cd2d64a345053173
The OSGi metadata are not there yet, though, so this is not really testable yet 
(will upload a patch for that soon or JB).

There is one possibly controversial change, which is the one in 
Configuration (see 
https://github.com/gnodet/hadoop-common/commit/742ab08aa068424fc2292cf1cd2d64a345053173#L3R207).
The idea is that in OSGi, the whole configuration (at least the defaults) is 
controlled by ConfigAdmin.  The benefit is that clients don't really have to 
deal with configuration.
One thing I haven't really understood is why the configuration isn't a global 
singleton (at least the defaults), as the configuration files are read 
multiple times (each time a new Configuration is created).



 Enable starting hadoop services from inside OSGi
 

 Key: HADOOP-8574
 URL: https://issues.apache.org/jira/browse/HADOOP-8574
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Guillaume Nodet

 This JIRA captures what is needed in order to start hadoop services in OSGi.
 The main idea I used so far consists of:
   * using the OSGi ConfigAdmin to store the hadoop configuration
   * in that configuration, using a few boolean properties to determine which 
 services should be started (nameNode, dataNode ...)
   * exposing a configured url handler so that the whole OSGi runtime can use 
 urls in hdfs:/xxx
   * using an OSGi ManagedService, so that when the configuration changes, the 
 services will be stopped and restarted with the new configuration





[jira] [Created] (HADOOP-8575) No mapred-site.xml present in the configuration directory. This is very trivial but thought would be less confusing for a new user if it came packaged.

2012-07-06 Thread Pavan Kulkarni (JIRA)
Pavan Kulkarni created HADOOP-8575:
--

 Summary: No mapred-site.xml present in the configuration 
directory. This is very trivial but thought would be less confusing for a new 
user if it came packaged.
 Key: HADOOP-8575
 URL: https://issues.apache.org/jira/browse/HADOOP-8575
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 0.23.1, 0.23.0
 Environment: Linux
Reporter: Pavan Kulkarni
Priority: Minor
 Fix For: 0.23.2, 0.23.3


The binary distribution of hadoop-0.23.3 has no mapred-site.xml file in the 
/etc/hadoop directory, yet setting up a cluster requires configuring 
mapred-site.xml.
Though this is a trivial issue, new users might get confused while configuring.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

2012-07-06 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408193#comment-13408193
 ] 

Suresh Srinivas commented on HADOOP-8230:
-

I had marked HADOOP-8365 as a blocker for 1.1.0. 

Since HADOOP-8365 has not been fixed yet for 1.1.0, I am -1 on this patch. If 
HADOOP-8365 gets fixed, I will remove my -1.


 Enable sync by default and disable append
 -

 Key: HADOOP-8230
 URL: https://issues.apache.org/jira/browse/HADOOP-8230
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 1.0.0
Reporter: Eli Collins
Assignee: Eli Collins
 Fix For: 1.1.0

 Attachments: hadoop-8230.txt


 Per HDFS-3120 for 1.x let's:
 - Always enable the sync path, which is currently only enabled if 
 dfs.support.append is set
 - Remove the dfs.support.append configuration option. We'll keep the code 
 paths though in case we ever fix append on branch-1, in which case we can add 
 the config option back

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-7823) port HADOOP-4012 to branch-1 (splitting support for bzip2)

2012-07-06 Thread Matt Foley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Foley updated HADOOP-7823:
---

Summary: port HADOOP-4012 to branch-1 (splitting support for bzip2)  (was: 
port HADOOP-4012 to branch-1)

 port HADOOP-4012 to branch-1 (splitting support for bzip2)
 --

 Key: HADOOP-7823
 URL: https://issues.apache.org/jira/browse/HADOOP-7823
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 0.20.205.0
Reporter: Tim Broberg
Assignee: Andrew Purtell
 Attachments: HADOOP-7823-branch-1-v2.patch, 
 HADOOP-7823-branch-1-v3.patch, HADOOP-7823-branch-1-v3.patch, 
 HADOOP-7823-branch-1-v4.patch, HADOOP-7823-branch-1.patch


 Please see HADOOP-4012 - Providing splitting support for bzip2 compressed 
 files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-7723) Automatically generate good Release Notes

2012-07-06 Thread Matt Foley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Foley updated HADOOP-7723:
---

Target Version/s: 0.23.0, 1.1.1  (was: 1.1.0, 0.23.0)

 Automatically generate good Release Notes
 -

 Key: HADOOP-7723
 URL: https://issues.apache.org/jira/browse/HADOOP-7723
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 0.20.204.0, 0.23.0
Reporter: Matt Foley
Assignee: Matt Foley

 In branch-0.20-security, there is a tool src/docs/relnotes.py, that 
 automatically generates Release Notes.  Fix deficiencies and port it up to 
 trunk.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-07-06 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408243#comment-13408243
 ] 

Jonathan Eagles commented on HADOOP-8523:
-

+1. Verified that test-patch.sh aborts early on badly formatted patches and 
continues normally with correctly formatted patches.

 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jack Dintruff
Priority: Minor
  Labels: newbie
 Attachments: HADOOP-8523.patch, HADOOP-8523.patch, Hadoop-8523.patch, 
 Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8574) Enable starting hadoop services from inside OSGi

2012-07-06 Thread Jean-Baptiste Onofré (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408260#comment-13408260
 ] 

Jean-Baptiste Onofré commented on HADOOP-8574:
--

Yep, I'll take the OSGi metadata and merge back the Karaf feature descriptor. 
It will be done over the weekend.

 Enable starting hadoop services from inside OSGi
 

 Key: HADOOP-8574
 URL: https://issues.apache.org/jira/browse/HADOOP-8574
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Guillaume Nodet

 This JIRA captures the things needed in order to start hadoop services in 
 OSGi.
 The main idea I have used so far consists in:
   * using the OSGi ConfigAdmin to store the hadoop configuration
   * in that configuration, using a few boolean properties to determine which 
 services should be started (nameNode, dataNode ...)
   * exposing a configured URL handler so that the whole OSGi runtime can use 
 URLs like hdfs://xxx
   * using an OSGi ManagedService, which means that when the configuration 
 changes, the services will be stopped and restarted with the new configuration

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8566) AvroReflectSerializer.accept(Class) throws a NPE if the class has no package (primitive types and arrays)

2012-07-06 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated HADOOP-8566:
---

Attachment: HADOOP-8566.patch

Ran the failed tests locally with and without the patch and they pass. The 
failures seem unrelated. Reattaching the patch to force a rerun. 


 AvroReflectSerializer.accept(Class) throws a NPE if the class has no package 
 (primitive types and arrays)
 -

 Key: HADOOP-8566
 URL: https://issues.apache.org/jira/browse/HADOOP-8566
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 2.0.0-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 2.0.1-alpha

 Attachments: HADOOP-8566.patch, HADOOP-8566.patch


 the accept() method should consider the case where the class getPackage() 
 returns NULL.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-07-06 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408282#comment-13408282
 ] 

Jonathan Eagles commented on HADOOP-8523:
-

Thanks, Jack. I put this into trunk.

 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jack Dintruff
Priority: Minor
  Labels: newbie
 Fix For: 2.0.1-alpha, 3.0.0

 Attachments: HADOOP-8523.patch, HADOOP-8523.patch, Hadoop-8523.patch, 
 Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8572) Have the ability to force the use of the login user

2012-07-06 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408281#comment-13408281
 ] 

Owen O'Malley commented on HADOOP-8572:
---

You need to also upload the patch to Apache's jira.

Can you explain more of the context? Is Karaf invoking Hadoop via the command 
line, or in the same JVM?

 Have the ability to force the use of the login user 
 

 Key: HADOOP-8572
 URL: https://issues.apache.org/jira/browse/HADOOP-8572
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Guillaume Nodet

 In Karaf, most of the code is run under the karaf user. When a user sshes 
 into Karaf, commands will be executed under that user.
 Deploying hadoop inside Karaf requires that the authenticated Subject has the 
 required hadoop principals set, which forces the reconfiguration of the whole 
 security layer, even at dev time.
 My patch proposes the introduction of a new configuration property 
 {{hadoop.security.force.login.user}} which, if set to true (it would default 
 to false to keep the current behavior), would force the use of the login user 
 instead of the authenticated subject (which is what happens when there's 
 no authenticated subject at all).  This greatly simplifies the use of hadoop 
 in environments where security isn't really needed (at dev time).
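A minimal sketch of the proposed check, assuming a plain map in place of Hadoop's Configuration. Only the property name comes from the description above; the class and method names are illustrative, not the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: decide which identity to use. The authenticated Subject's user
// wins, unless hadoop.security.force.login.user forces the JVM login user
// (which is also the fallback when there is no authenticated subject).
public class ForceLoginUserSketch {
    static final String FORCE_LOGIN_USER = "hadoop.security.force.login.user";

    static String currentUser(Map<String, String> conf,
                              String subjectUser, String loginUser) {
        boolean force =
            Boolean.parseBoolean(conf.getOrDefault(FORCE_LOGIN_USER, "false"));
        if (force || subjectUser == null) {
            return loginUser;  // forced, or no authenticated subject
        }
        return subjectUser;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(currentUser(conf, "karaf", "hadoop")); // karaf
        conf.put(FORCE_LOGIN_USER, "true");
        System.out.println(currentUser(conf, "karaf", "hadoop")); // hadoop
    }
}
```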

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8564) Create a Windows native InputStream class to address datanode concurrent reading and writing issue

2012-07-06 Thread Chuan Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408283#comment-13408283
 ] 

Chuan Liu commented on HADOOP-8564:
---

{quote}
Can this be merged into the existing NativeIO JNI library? Or are the number of 
#ifdef WINDOWS macros required so numerous that we should just have two 
entirely separate libhadoops?
{quote}
The NativeIO JNI library is only available on Linux, while this class is only 
needed on Windows, so I think it makes sense to create a separate native lib 
file. We don't necessarily need to name it libhadoop. For example, if the 
class is called 'WindowsFileInputStream', the new lib could be 
'WindowsFileInputStream.dll'. Is there any concern over this? E.g. do you want 
to reduce the number of native library files exposed in Hadoop in general?

 Create a Windows native InputStream class to address datanode concurrent 
 reading and writing issue
 --

 Key: HADOOP-8564
 URL: https://issues.apache.org/jira/browse/HADOOP-8564
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 1-win
Reporter: Chuan Liu
Assignee: Chuan Liu

 HDFS files are made up of blocks. First, let’s look at writing. When data 
 is written to a datanode, an active or temporary file is created to receive 
 packets. After the last packet for the block is received, we finalize 
 the block. One step during finalization is to rename the block file into a 
 new directory. The relevant code can be found via the call sequence: 
 FSDataSet.finalizeBlockInternal - FSDir.addBlock.
 {code}
 if ( ! metaData.renameTo( newmeta ) ||
      ! src.renameTo( dest ) ) {
   throw new IOException( "could not move files for " + b +
                          " from tmp to " +
                          dest.getAbsolutePath() );
 }
 {code}
 Let’s then switch to reading. On HDFS, the client is expected to be able to 
 read these unfinished blocks. So when a read call from the client reaches the 
 datanode, the datanode opens an input stream on the unfinished block file.
 The problem comes in when the file is open for reading while the datanode 
 receives the last packet from the client and tries to rename the finished 
 block file. This rename will succeed on Linux, but not on Windows.  The 
 behavior can be modified on Windows by opening the file with the 
 FILE_SHARE_DELETE flag, i.e. sharing the delete (including renaming) 
 permission with other processes while the file is open. There is also a Java 
 bug ([id 
 6357433|http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6357433]) reported 
 a while back on this. However, since this behavior has existed for Java on 
 Windows since JDK 1.0, the Java developers do not want to break backward 
 compatibility on this behavior. Instead, a new file system API is proposed in 
 JDK 7.
 As outlined in the [Java forum|http://www.java.net/node/645421] by the Java 
 developer (kbr), there are three ways to fix the problem:
 # Use a different mechanism in the application for dealing with files.
 # Create a new implementation of the InputStream abstract class using Windows 
 native code.
 # Patch the JDK with a private patch that alters FileInputStream behavior.
 The third option cannot fix the problem for users running the stock Oracle 
 JDK.
 We discussed some options for the first approach. For example, one option is 
 to use two-phase renaming, i.e. first hardlink, then remove the old hardlink 
 when the read is finished. This option was thought to be rather invasive.  
 Another option discussed was to change the HDFS behavior on Windows by not 
 allowing clients to read unfinished blocks. However, this behavior change is 
 thought to be problematic and may affect other applications built on top of 
 HDFS.
 For all the reasons discussed above, we will use the second approach to 
 address the problem.
 If there are better options to fix the problem, we would also like to hear 
 about them.
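The rename-while-open race described above can be reproduced with a few lines of plain Java (this sketch is not from any patch). On Linux the rename succeeds even though a reader still holds the file open; on Windows the same rename through java.io.FileInputStream fails, because the stream does not open the file with FILE_SHARE_DELETE.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;

// Demonstrates renaming a file while another stream still has it open,
// mimicking the datanode finalizing a block while a client is reading it.
// The outcome is platform-dependent: true on Linux, false on Windows.
public class RenameWhileOpen {
    public static void main(String[] args) throws IOException {
        File src = File.createTempFile("blk_", ".tmp");
        File dest = new File(src.getParent(), src.getName() + ".final");
        Files.write(src.toPath(), "packet data".getBytes());

        try (FileInputStream in = new FileInputStream(src)) {
            // The reader is still open when the "finalize" rename happens.
            boolean renamed = src.renameTo(dest);
            System.out.println("rename while open: " + renamed);
        }
        dest.delete();
        src.delete();
    }
}
```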

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-07-06 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved HADOOP-8523.
-

   Resolution: Fixed
Fix Version/s: 3.0.0
   2.0.1-alpha

 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jack Dintruff
Priority: Minor
  Labels: newbie
 Fix For: 2.0.1-alpha, 3.0.0

 Attachments: HADOOP-8523.patch, HADOOP-8523.patch, Hadoop-8523.patch, 
 Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-07-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408286#comment-13408286
 ] 

Hudson commented on HADOOP-8523:


Integrated in Hadoop-Hdfs-trunk #1095 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1095/])
HADOOP-8523. test-patch.sh doesn't validate patches before building (Jack 
Dintruff via jeagles) (Revision 1358394)

 Result = FAILURE
jeagles : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1358394
Files : 
* /hadoop/common/trunk/dev-support/smart-apply-patch.sh
* /hadoop/common/trunk/dev-support/test-patch.sh
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jack Dintruff
Priority: Minor
  Labels: newbie
 Fix For: 2.0.1-alpha, 3.0.0

 Attachments: HADOOP-8523.patch, HADOOP-8523.patch, Hadoop-8523.patch, 
 Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8563) don't package hadoop-pipes examples/bin

2012-07-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408290#comment-13408290
 ] 

Hudson commented on HADOOP-8563:


Integrated in Hadoop-Hdfs-trunk #1095 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1095/])
HADOOP-8563. don't package hadoop-pipes examples/bin (Colin Patrick McCabe 
via tgraves) (Revision 1357811)

 Result = FAILURE
tgraves : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1357811
Files : 
* 
/hadoop/common/trunk/hadoop-assemblies/src/main/resources/assemblies/hadoop-tools.xml
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


 don't package hadoop-pipes examples/bin
 ---

 Key: HADOOP-8563
 URL: https://issues.apache.org/jira/browse/HADOOP-8563
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 2.0.1-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Fix For: 2.0.1-alpha

 Attachments: HADOOP-8563.001.patch


 Let's not package hadoop-pipes examples/bin

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-07-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408294#comment-13408294
 ] 

Hudson commented on HADOOP-8523:


Integrated in Hadoop-Mapreduce-trunk-Commit #2446 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2446/])
HADOOP-8523. test-patch.sh doesn't validate patches before building (Jack 
Dintruff via jeagles) (Revision 1358394)

 Result = FAILURE
jeagles : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1358394
Files : 
* /hadoop/common/trunk/dev-support/smart-apply-patch.sh
* /hadoop/common/trunk/dev-support/test-patch.sh
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jack Dintruff
Priority: Minor
  Labels: newbie
 Fix For: 2.0.1-alpha, 3.0.0

 Attachments: HADOOP-8523.patch, HADOOP-8523.patch, Hadoop-8523.patch, 
 Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-07-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408318#comment-13408318
 ] 

Hudson commented on HADOOP-8523:


Integrated in Hadoop-Hdfs-trunk-Commit #2496 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2496/])
HADOOP-8523. test-patch.sh doesn't validate patches before building (Jack 
Dintruff via jeagles) (Revision 1358394)

 Result = SUCCESS
jeagles : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1358394
Files : 
* /hadoop/common/trunk/dev-support/smart-apply-patch.sh
* /hadoop/common/trunk/dev-support/test-patch.sh
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jack Dintruff
Priority: Minor
  Labels: newbie
 Fix For: 2.0.1-alpha, 3.0.0

 Attachments: HADOOP-8523.patch, HADOOP-8523.patch, Hadoop-8523.patch, 
 Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8564) Create a Windows native InputStream class to address datanode concurrent reading and writing issue

2012-07-06 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408323#comment-13408323
 ] 

Todd Lipcon commented on HADOOP-8564:
-

bq. The NativeIO JNI library is only available on Linux, while this class is 
only needed on Windows, so I think it makes sense to create a separate native 
lib file. We don't necessarily need to name it libhadoop. For example, if the 
class is called 'WindowsFileInputStream', the new lib could be 
'WindowsFileInputStream.dll'. Is there any concern over this? E.g. do you want 
to reduce the number of native library files exposed in Hadoop in general?

Currently the NativeIO JNI is Linux-only, but I think all of the stuff found 
in there is useful on Windows as well. For example:
- Native CRC32 computation: the SSE instructions probably need slightly 
different syntax for the Windows C++ compiler, but are necessary for good 
performance
- Various other flags to open() needed for race-condition-free security 
support: these probably need different APIs on Windows, but equivalents are 
likely available
- Compression: Windows equally needs fast compression libraries, etc.

So, I think it makes sense to get libhadoop compiling on Windows generally and 
make it the central place for native dependency code.

 Create a Windows native InputStream class to address datanode concurrent 
 reading and writing issue
 --

 Key: HADOOP-8564
 URL: https://issues.apache.org/jira/browse/HADOOP-8564
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 1-win
Reporter: Chuan Liu
Assignee: Chuan Liu

 HDFS files are made up of blocks. First, let’s look at writing. When data 
 is written to a datanode, an active or temporary file is created to receive 
 packets. After the last packet for the block is received, we finalize 
 the block. One step during finalization is to rename the block file into a 
 new directory. The relevant code can be found via the call sequence: 
 FSDataSet.finalizeBlockInternal - FSDir.addBlock.
 {code}
 if ( ! metaData.renameTo( newmeta ) ||
      ! src.renameTo( dest ) ) {
   throw new IOException( "could not move files for " + b +
                          " from tmp to " +
                          dest.getAbsolutePath() );
 }
 {code}
 Let’s then switch to reading. On HDFS, the client is expected to be able to 
 read these unfinished blocks. So when a read call from the client reaches the 
 datanode, the datanode opens an input stream on the unfinished block file.
 The problem comes in when the file is open for reading while the datanode 
 receives the last packet from the client and tries to rename the finished 
 block file. This rename will succeed on Linux, but not on Windows.  The 
 behavior can be modified on Windows by opening the file with the 
 FILE_SHARE_DELETE flag, i.e. sharing the delete (including renaming) 
 permission with other processes while the file is open. There is also a Java 
 bug ([id 
 6357433|http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6357433]) reported 
 a while back on this. However, since this behavior has existed for Java on 
 Windows since JDK 1.0, the Java developers do not want to break backward 
 compatibility on this behavior. Instead, a new file system API is proposed in 
 JDK 7.
 As outlined in the [Java forum|http://www.java.net/node/645421] by the Java 
 developer (kbr), there are three ways to fix the problem:
 # Use a different mechanism in the application for dealing with files.
 # Create a new implementation of the InputStream abstract class using Windows 
 native code.
 # Patch the JDK with a private patch that alters FileInputStream behavior.
 The third option cannot fix the problem for users running the stock Oracle 
 JDK.
 We discussed some options for the first approach. For example, one option is 
 to use two-phase renaming, i.e. first hardlink, then remove the old hardlink 
 when the read is finished. This option was thought to be rather invasive.  
 Another option discussed was to change the HDFS behavior on Windows by not 
 allowing clients to read unfinished blocks. However, this behavior change is 
 thought to be problematic and may affect other applications built on top of 
 HDFS.
 For all the reasons discussed above, we will use the second approach to 
 address the problem.
 If there are better options to fix the problem, we would also like to hear 
 about them.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-07-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408325#comment-13408325
 ] 

Hudson commented on HADOOP-8523:


Integrated in Hadoop-Common-trunk-Commit #2428 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2428/])
HADOOP-8523. test-patch.sh doesn't validate patches before building (Jack 
Dintruff via jeagles) (Revision 1358394)

 Result = SUCCESS
jeagles : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1358394
Files : 
* /hadoop/common/trunk/dev-support/smart-apply-patch.sh
* /hadoop/common/trunk/dev-support/test-patch.sh
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jack Dintruff
Priority: Minor
  Labels: newbie
 Fix For: 2.0.1-alpha, 3.0.0

 Attachments: HADOOP-8523.patch, HADOOP-8523.patch, Hadoop-8523.patch, 
 Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8561) Introduce HADOOP_PROXY_USER for secure impersonation in child hadoop client processes

2012-07-06 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408342#comment-13408342
 ] 

Owen O'Malley commented on HADOOP-8561:
---

I'm not against adding an environment variable/property to set the user, but 
we might as well use the one we already have and make HADOOP_USER_NAME in 
secure mode mean "act as a proxy for the given user".

 Introduce HADOOP_PROXY_USER for secure impersonation in child hadoop client 
 processes
 -

 Key: HADOOP-8561
 URL: https://issues.apache.org/jira/browse/HADOOP-8561
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Reporter: Luke Lu
Assignee: Yu Gao

 To solve the problem for an authenticated user to type hadoop shell commands 
 in a web console, we can introduce an HADOOP_PROXY_USER environment variable 
 to allow proper impersonation in the child hadoop client processes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-07-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408349#comment-13408349
 ] 

Hudson commented on HADOOP-8523:


Integrated in Hadoop-Mapreduce-trunk #1128 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1128/])
HADOOP-8523. test-patch.sh doesn't validate patches before building (Jack 
Dintruff via jeagles) (Revision 1358394)

 Result = SUCCESS
jeagles : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1358394
Files : 
* /hadoop/common/trunk/dev-support/smart-apply-patch.sh
* /hadoop/common/trunk/dev-support/test-patch.sh
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jack Dintruff
Priority: Minor
  Labels: newbie
 Fix For: 2.0.1-alpha, 3.0.0

 Attachments: HADOOP-8523.patch, HADOOP-8523.patch, Hadoop-8523.patch, 
 Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking whether the patch is valid.  It would help devs if it checked 
 the patch first, in case we run test-patch.sh with a bad patch file. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8563) don't package hadoop-pipes examples/bin

2012-07-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408353#comment-13408353
 ] 

Hudson commented on HADOOP-8563:


Integrated in Hadoop-Mapreduce-trunk #1128 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1128/])
HADOOP-8563. don't package hadoop-pipes examples/bin (Colin Patrick McCabe 
via tgraves) (Revision 1357811)

 Result = SUCCESS
tgraves : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1357811
Files : 
* 
/hadoop/common/trunk/hadoop-assemblies/src/main/resources/assemblies/hadoop-tools.xml
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


 don't package hadoop-pipes examples/bin
 ---

 Key: HADOOP-8563
 URL: https://issues.apache.org/jira/browse/HADOOP-8563
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 2.0.1-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Fix For: 2.0.1-alpha

 Attachments: HADOOP-8563.001.patch


 Let's not package hadoop-pipes examples/bin

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8564) Create a Windows native InputStream class to address datanode concurrent reading and writing issue

2012-07-06 Thread Chuan Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408360#comment-13408360
 ] 

Chuan Liu commented on HADOOP-8564:
---

Hi Todd, thanks for the clarification. I see your point now. However, I think 
there are three things here.

# Make existing NativeIO work on Windows.
# Create new Windows native IO functionality that solves the above issue.
# Build and organize the code/lib so that we have a central place for the 
native code.

For this Jira, we only intend to solve 2. I agree with you on 1. For 3, I can 
see both pros and cons. But once 1 is done, there should be only modest work to 
create a common lib for all native code. Does this make sense to you?

 Create a Windows native InputStream class to address datanode concurrent 
 reading and writing issue
 --

 Key: HADOOP-8564
 URL: https://issues.apache.org/jira/browse/HADOOP-8564
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 1-win
Reporter: Chuan Liu
Assignee: Chuan Liu

 HDFS files are made up of blocks. First, let’s look at writing. When the data 
 is written to a datanode, an active or temporary file is created to receive 
 packets. After the last packet for the block is received, we will finalize 
 the block. One step during finalization is to rename the block file to a new 
 directory. The relevant code can be found via the call sequence: 
 FSDataSet.finalizeBlockInternal -> FSDir.addBlock.
 {code} 
 if ( ! metaData.renameTo( newmeta ) ||
 ! src.renameTo( dest ) ) {
   throw new IOException( "could not move files for " + b +
   " from tmp to " + 
  dest.getAbsolutePath() );
 }
 {code}
 Let’s then switch to reading. On HDFS, it is expected that the client can also 
 read these unfinished blocks. So when the read calls from the client reach the 
 datanode, the datanode will open an input stream on the unfinished block file.
 The problem comes in when the file is opened for reading while the datanode 
 receives the last packet from the client and tries to rename the finished 
 block file. This operation will succeed on Linux, but not on Windows.  The behavior can 
 be modified on Windows to open the file with FILE_SHARE_DELETE flag on, i.e. 
 sharing the delete (including renaming) permission with other processes while 
 opening the file. There is also a Java bug ([id 
 6357433|http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6357433]) reported 
 a while back on this. However, since this behavior exists for Java on Windows 
 since JDK 1.0, the Java developers do not want to break the backward 
 compatibility on this behavior. Instead, a new file system API is proposed in 
 JDK 7.
 As outlined in the [Java forum|http://www.java.net/node/645421] by the Java 
 developer (kbr), there are three ways to fix the problem:
 # Use different mechanism in the application in dealing with files.
 # Create a new implementation of InputStream abstract class using Windows 
 native code.
 # Patch JDK with a private patch that alters FileInputStream behavior.
 For the third option, it cannot fix the problem for users running the Oracle JDK.
 We discussed some options for the first approach. For example, one option is 
 to use two-phase renaming, i.e. first hardlink, then remove the old hardlink 
 when the read is finished. This option was thought to be rather invasive.  
 Another option discussed is to change the HDFS behavior on Windows by not 
 allowing clients to read unfinished blocks. However, this behavior change is 
 thought to be problematic and may affect other applications built on top of 
 HDFS.
 For all the reasons discussed above, we will use the second approach to 
 address the problem.
 If there are better options to fix the problem, we would also like to hear 
 about them.
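 A hedged illustration of the share-mode difference (class and file names below 
 are made up for the demo): since JDK 7, java.nio.file channels are opened on 
 Windows with FILE_SHARE_DELETE, so a concurrent rename of the block file need 
 not fail the way it does through FileInputStream. The same sequence also runs 
 on Linux, where renames of open files have always succeeded:

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ShareDeleteDemo {
    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("blk");
        Path tmp = dir.resolve("blk_tmp");
        Files.write(tmp, "packet-data".getBytes(StandardCharsets.UTF_8));

        // NIO.2 channels are opened with FILE_SHARE_DELETE on Windows, so a
        // concurrent rename (block finalization) does not fail the reader.
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
            // Simulate the datanode finalizing the block while a reader holds it open.
            Files.move(tmp, dir.resolve("blk_final"));

            ByteBuffer buf = ByteBuffer.allocate(64);
            ch.read(buf);
            buf.flip();
            System.out.println(StandardCharsets.UTF_8.decode(buf).toString());
        }
    }
}
```

 This is the behavior the proposed Windows-native InputStream would provide for 
 the old FileInputStream-based code paths as well.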

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8525) Provide Improved Traceability for Configuration

2012-07-06 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated HADOOP-8525:


Status: Open  (was: Patch Available)

Canceling the patch to address test failures

 Provide Improved Traceability for Configuration
 ---

 Key: HADOOP-8525
 URL: https://issues.apache.org/jira/browse/HADOOP-8525
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Priority: Trivial
 Attachments: HADOOP-8525.txt, HADOOP-8525.txt, HADOOP-8525.txt


 Configuration provides basic traceability to see where a config setting came 
 from, but once the configuration is written out that information is written 
 to a comment in the XML and then lost the next time the configuration is read 
 back in.  It would really be great to be able to store a complete history of 
 where the config came from in the XML, so that it can then be retrieved later 
 for debugging.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8525) Provide Improved Traceability for Configuration

2012-07-06 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated HADOOP-8525:


Attachment: HADOOP-8525.txt

 Provide Improved Traceability for Configuration
 ---

 Key: HADOOP-8525
 URL: https://issues.apache.org/jira/browse/HADOOP-8525
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Priority: Trivial
 Attachments: HADOOP-8525.txt, HADOOP-8525.txt, HADOOP-8525.txt


 Configuration provides basic traceability to see where a config setting came 
 from, but once the configuration is written out that information is written 
 to a comment in the XML and then lost the next time the configuration is read 
 back in.  It would really be great to be able to store a complete history of 
 where the config came from in the XML, so that it can then be retrieved later 
 for debugging.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8525) Provide Improved Traceability for Configuration

2012-07-06 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated HADOOP-8525:


Status: Patch Available  (was: Open)

 Provide Improved Traceability for Configuration
 ---

 Key: HADOOP-8525
 URL: https://issues.apache.org/jira/browse/HADOOP-8525
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Priority: Trivial
 Attachments: HADOOP-8525.txt, HADOOP-8525.txt, HADOOP-8525.txt


 Configuration provides basic traceability to see where a config setting came 
 from, but once the configuration is written out that information is written 
 to a comment in the XML and then lost the next time the configuration is read 
 back in.  It would really be great to be able to store a complete history of 
 where the config came from in the XML, so that it can then be retrieved later 
 for debugging.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8543) Invalid pom.xml files on 0.23 branch

2012-07-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408370#comment-13408370
 ] 

Hudson commented on HADOOP-8543:


Integrated in Hadoop-Hdfs-0.23-Build #305 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/305/])
HADOOP-8543. Invalid pom.xml files on 0.23 branch. Updated to fix a bug in the 
original patch. (Revision 1357785)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1357785
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-project-dist/pom.xml


 Invalid pom.xml files on 0.23 branch
 

 Key: HADOOP-8543
 URL: https://issues.apache.org/jira/browse/HADOOP-8543
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.3
 Environment: FreeBSD 8.2, 64bit, Artifactory
Reporter: Radim Kolar
Assignee: Radim Kolar
  Labels: build
 Fix For: 0.23.3

 Attachments: hadoop-invalid-pom-023-2.txt, 
 hadoop-invalid-pom-023-3.txt, hadoop-invalid-pom-023.txt


 This is a backport of HADOOP-8268 to the 0.23 branch. It fixes the invalid 
 pom.xml files so that they can be uploaded into the Artifactory Maven 
 repository manager, and adds schema declarations that allow the use of XML 
 validation tools.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8333) src/contrib/fuse-dfs build fails on non-Sun JVM environments

2012-07-06 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408377#comment-13408377
 ] 

Andy Isaacson commented on HADOOP-8333:
---

Since HADOOP-8368 on trunk (which switched us from automake to CMake), this is 
not an issue on OpenJDK at least.  I realize that doesn't help much on branch-1 
though...

 src/contrib/fuse-dfs build fails on non-Sun JVM environments
 

 Key: HADOOP-8333
 URL: https://issues.apache.org/jira/browse/HADOOP-8333
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 1.0.2
 Environment: IBM Java 6 
Reporter: Kumar Ravi

 The src/contrib/fuse-dfs build fails when building in an IBM Java 6 
 environment. The message on the console when the build aborts is:
  [exec] /usr/bin/ld: cannot find -ljvm
  [exec] collect2: ld returned 1 exit status
  [exec] make[1]: *** [fuse_dfs] Error 1
  [exec] make[1]: Leaving directory 
 `/home/hadoop/branch-1.0_0427/src/contrib/fuse-dfs/src'
  [exec] make: *** [all-recursive] Error 1
 The reason this seems to be happening is because of the last line in 
 src/contrib/fuse-dfs/src/Makefile.am
 AM_LDFLAGS= -L$(HADOOP_HOME)/build/libhdfs -lhdfs -L$(FUSE_HOME)/lib -lfuse 
 -L$(JAVA_HOME)/jre/lib/$(OS_ARCH)/server -ljvm
 For hadoop to build on IBM Java, this last line should read as follows, since 
 this is where the libjvm library resides:
 AM_LDFLAGS= -L$(HADOOP_HOME)/build/libhdfs -lhdfs -L$(FUSE_HOME)/lib -lfuse 
 -L$(JAVA_HOME)/jre/lib/$(OS_ARCH)/j9vm -ljvm
 IMO, changes like the following will need to be made to 
 src/contrib/fuse-dfs/configure.ac (?) to include changes similar to those in 
 src/native/ to check for the appropriate JVM and configure the appropriate 
 path for ljvm.
 dnl Check for '-ljvm'
 JNI_LDFLAGS=""
 if test "$JAVA_HOME" != ""
 then
   JNI_LDFLAGS="-L$JAVA_HOME/jre/lib/$OS_ARCH/server"
   JVMSOPATH=`find $JAVA_HOME/jre/ -name libjvm.so | head -n 1`
   JNI_LDFLAGS="$JNI_LDFLAGS -L`dirname $JVMSOPATH`"
 fi
 ldflags_bak=$LDFLAGS
 LDFLAGS="$LDFLAGS $JNI_LDFLAGS"
 AC_CHECK_LIB([jvm], [JNI_GetCreatedJavaVMs])
 LDFLAGS=$ldflags_bak
 AC_SUBST([JNI_LDFLAGS])
 # Checks for header files.
 dnl Check for Ansi C headers
 AC_HEADER_STDC
 dnl Check for other standard C headers
 AC_CHECK_HEADERS([stdio.h stddef.h], [], AC_MSG_ERROR(Some system headers not 
 found... please ensure their presence on your platform.))
 dnl Check for JNI headers
 JNI_CPPFLAGS=""
 if test "$JAVA_HOME" != ""
 then
   for dir in `find $JAVA_HOME/include -follow -type d`
   do
     JNI_CPPFLAGS="$JNI_CPPFLAGS -I$dir"
   done
 fi
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8561) Introduce HADOOP_PROXY_USER for secure impersonation in child hadoop client processes

2012-07-06 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408383#comment-13408383
 ] 

Luke Lu commented on HADOOP-8561:
-

We'd also like to use the proxy user in semi-secure mode.

 Introduce HADOOP_PROXY_USER for secure impersonation in child hadoop client 
 processes
 -

 Key: HADOOP-8561
 URL: https://issues.apache.org/jira/browse/HADOOP-8561
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Reporter: Luke Lu
Assignee: Yu Gao

 To solve the problem for an authenticated user to type hadoop shell commands 
 in a web console, we can introduce an HADOOP_PROXY_USER environment variable 
 to allow proper impersonation in the child hadoop client processes.
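 As a rough sketch of the proposal (the class and helper names below are 
 hypothetical; in Hadoop proper the actual impersonation would go through 
 UserGroupInformation.createProxyUser), the child client process would resolve 
 its effective user from the environment along these lines:

```java
import java.util.Map;

public class ProxyUserResolver {
    // Hypothetical helper: decide which user the child hadoop client should
    // act as. The proposed HADOOP_PROXY_USER wins over HADOOP_USER_NAME;
    // otherwise fall back to the authenticated login user.
    static String effectiveUser(Map<String, String> env, String loginUser) {
        String proxy = env.get("HADOOP_PROXY_USER");
        if (proxy != null && !proxy.isEmpty()) {
            return proxy;
        }
        return env.getOrDefault("HADOOP_USER_NAME", loginUser);
    }

    public static void main(String[] args) {
        // A web console authenticated as "alice" spawns a client owned by "webapp".
        System.out.println(effectiveUser(Map.of("HADOOP_PROXY_USER", "alice"), "webapp"));
        // Without the variable, the client keeps its own login identity.
        System.out.println(effectiveUser(Map.of(), "webapp"));
    }
}
```

 This keeps the original semantics of HADOOP_USER_NAME intact while layering 
 impersonation on top.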

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8566) AvroReflectSerializer.accept(Class) throws a NPE if the class has no package (primitive types and arrays)

2012-07-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408382#comment-13408382
 ] 

Hadoop QA commented on HADOOP-8566:
---

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12535414/HADOOP-8566.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1176//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1176//console

This message is automatically generated.

 AvroReflectSerializer.accept(Class) throws a NPE if the class has no package 
 (primitive types and arrays)
 -

 Key: HADOOP-8566
 URL: https://issues.apache.org/jira/browse/HADOOP-8566
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 2.0.0-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 2.0.1-alpha

 Attachments: HADOOP-8566.patch, HADOOP-8566.patch


 the accept() method should consider the case where the class getPackage() 
 returns NULL.
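 The underlying JDK behavior is easy to demonstrate (the guard method below is 
 a sketch, not the actual AvroReflectSerialization code): Class.getPackage() 
 returns null for primitive and array types, so any accept() logic that 
 dereferences it unconditionally throws an NPE.

```java
public class PackageNullCheck {
    // Sketch of the null guard accept() needs before inspecting the package.
    static boolean inAcceptedPackage(Class<?> c, String acceptedPrefix) {
        Package p = c.getPackage();
        return p != null && p.getName().startsWith(acceptedPrefix);
    }

    public static void main(String[] args) {
        // Primitives and arrays have no package.
        System.out.println(int.class.getPackage() == null);
        System.out.println(String[].class.getPackage() == null);
        // The guard simply rejects them instead of throwing.
        System.out.println(inAcceptedPackage(int.class, "org.example"));
        System.out.println(inAcceptedPackage(String.class, "java.lang"));
    }
}
```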

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8561) Introduce HADOOP_PROXY_USER for secure impersonation in child hadoop client processes

2012-07-06 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408393#comment-13408393
 ] 

Luke Lu commented on HADOOP-8561:
-

@Owen, I'm fine with repurposing HADOOP_USER_NAME for proxy user (better 
auditing and access control even without Kerberos), though it's an incompatible 
change. One of the reasons we added HADOOP_PROXY_USER is to preserve the 
original semantics of HADOOP_USER_NAME.

 Introduce HADOOP_PROXY_USER for secure impersonation in child hadoop client 
 processes
 -

 Key: HADOOP-8561
 URL: https://issues.apache.org/jira/browse/HADOOP-8561
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Reporter: Luke Lu
Assignee: Yu Gao

 To solve the problem for an authenticated user to type hadoop shell commands 
 in a web console, we can introduce an HADOOP_PROXY_USER environment variable 
 to allow proper impersonation in the child hadoop client processes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8567) Backport conf servlet with dump running configuration to branch 1.x

2012-07-06 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408394#comment-13408394
 ] 

Suresh Srinivas commented on HADOOP-8567:
-

+1 for backport. This will be a very useful feature on the stable release.

 Backport conf servlet with dump running configuration to branch 1.x
 ---

 Key: HADOOP-8567
 URL: https://issues.apache.org/jira/browse/HADOOP-8567
 Project: Hadoop Common
  Issue Type: New Feature
  Components: conf
Affects Versions: 1.0.3
Reporter: Junping Du
Assignee: Junping Du
 Fix For: 0.21.1, 2.0.1-alpha


 HADOOP-6408 provided a conf servlet that can dump the running configuration, 
 which greatly helps admins troubleshoot configuration issues. However, that 
 patch only applies to branches after 0.21 and should be backported to branch 1.x.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8566) AvroReflectSerializer.accept(Class) throws a NPE if the class has no package (primitive types and arrays)

2012-07-06 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated HADOOP-8566:
---

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

committed to trunk and branch-2

 AvroReflectSerializer.accept(Class) throws a NPE if the class has no package 
 (primitive types and arrays)
 -

 Key: HADOOP-8566
 URL: https://issues.apache.org/jira/browse/HADOOP-8566
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 2.0.0-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 2.0.1-alpha

 Attachments: HADOOP-8566.patch, HADOOP-8566.patch


 the accept() method should consider the case where the class getPackage() 
 returns NULL.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HADOOP-8365) Provide ability to disable working sync

2012-07-06 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins reassigned HADOOP-8365:
---

Assignee: Eli Collins

 Provide ability to disable working sync
 ---

 Key: HADOOP-8365
 URL: https://issues.apache.org/jira/browse/HADOOP-8365
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 1.1.0
Reporter: Eli Collins
Assignee: Eli Collins
Priority: Blocker

 Per HADOOP-8230 there's a request for a flag to disable the sync code paths 
 that dfs.support.append used to enable. The sync method itself will still be 
 available and have a broken implementation as that was the behavior before 
 HADOOP-8230. This config flag should default to false as the primary 
 motivation for HADOOP-8230 is so HBase works out-of-the-box with Hadoop 1.1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

2012-07-06 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408408#comment-13408408
 ] 

Eli Collins commented on HADOOP-8230:
-

Patch for HADOOP-8365 coming, please review when you get a sec.

 Enable sync by default and disable append
 -

 Key: HADOOP-8230
 URL: https://issues.apache.org/jira/browse/HADOOP-8230
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 1.0.0
Reporter: Eli Collins
Assignee: Eli Collins
 Fix For: 1.1.0

 Attachments: hadoop-8230.txt


 Per HDFS-3120 for 1.x let's:
 - Always enable the sync path, which is currently only enabled if 
 dfs.support.append is set
 - Remove the dfs.support.append configuration option. We'll keep the code 
 paths though in case we ever fix append on branch-1, in which case we can add 
 the config option back

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HADOOP-8365) Provide ability to disable working sync

2012-07-06 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HADOOP-8365:


Attachment: hadoop-8365.txt

Patch attached.

 Provide ability to disable working sync
 ---

 Key: HADOOP-8365
 URL: https://issues.apache.org/jira/browse/HADOOP-8365
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 1.1.0
Reporter: Eli Collins
Assignee: Eli Collins
Priority: Blocker
 Attachments: hadoop-8365.txt


 Per HADOOP-8230 there's a request for a flag to disable the sync code paths 
 that dfs.support.append used to enable. The sync method itself will still be 
 available and have a broken implementation as that was the behavior before 
 HADOOP-8230. This config flag should default to false as the primary 
 motivation for HADOOP-8230 is so HBase works out-of-the-box with Hadoop 1.1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8566) AvroReflectSerializer.accept(Class) throws a NPE if the class has no package (primitive types and arrays)

2012-07-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408446#comment-13408446
 ] 

Hudson commented on HADOOP-8566:


Integrated in Hadoop-Mapreduce-trunk-Commit #2447 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2447/])
HADOOP-8566. AvroReflectSerializer.accept(Class) throws a NPE if the class 
has no package (primitive types and arrays). (tucu) (Revision 1358454)

 Result = FAILURE
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1358454
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/serializer/avro/AvroReflectSerialization.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/serializer/avro/TestAvroSerialization.java


 AvroReflectSerializer.accept(Class) throws a NPE if the class has no package 
 (primitive types and arrays)
 -

 Key: HADOOP-8566
 URL: https://issues.apache.org/jira/browse/HADOOP-8566
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 2.0.0-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 2.0.1-alpha

 Attachments: HADOOP-8566.patch, HADOOP-8566.patch


 the accept() method should consider the case where the class getPackage() 
 returns NULL.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8554) KerberosAuthenticator should use the configured principal

2012-07-06 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408456#comment-13408456
 ] 

Alejandro Abdelnur commented on HADOOP-8554:


@Eli, the line of code you point out runs on the client side: if your URL is 
of the form http://foohost/, then the principal is created as 
'HTTP/foohost'. There is a JIRA to add support for Kerberos name rules, 
HADOOP-8518. IMO this JIRA is invalid.

 KerberosAuthenticator should use the configured principal
 -

 Key: HADOOP-8554
 URL: https://issues.apache.org/jira/browse/HADOOP-8554
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 1.0.0, 2.0.0-alpha, 2.0.1-alpha, 3.0.0
Reporter: Eli Collins
  Labels: security, webconsole

 In KerberosAuthenticator we construct the principal as follows:
 {code}
 String servicePrincipal = "HTTP/" + KerberosAuthenticator.this.url.getHost();
 {code}
 Seems like we should use the configured 
 hadoop.http.authentication.kerberos.principal instead, right?
 I hit this issue because a distcp using webhdfs://localhost fails, since 
 HTTP/localhost is not in the kerb DB, but using webhdfs://eli-thinkpad works 
 because HTTP/eli-thinkpad is (and is my configured principal). distcp using 
 hftp://localhost with the same config works, so it looks like this check is 
 webhdfs-specific for some reason (webhdfs uses SPNEGO and hftp does not?).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HADOOP-8148) Zero-copy ByteBuffer-based compressor / decompressor API

2012-07-06 Thread Tim Broberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408459#comment-13408459
 ] 

Tim Broberg commented on HADOOP-8148:
-

One more thought here: we could define an asynchronous decompressor interface 
with a Future<ByteBuffer> read(ByteBuffer dest) method for pipelined 
decompressor streams.

This would allow the app to own the buffers, at the expense of making it track 
multiple outstanding requests.

Likewise, there could be a Future<ByteBuffer> write(ByteBuffer source) method 
on compression streams, where the Future would return the buffer to the app 
for reuse.

The caller would also be responsible for ensuring that the buffers are large 
enough.

It's cute, but I don't know that I like it any better.
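
As a rough sketch of that contract (the interface and the trivial identity 
codec below are illustrative, not part of any Hadoop API):

```java
import java.nio.ByteBuffer;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

public class AsyncCodecSketch {
    // Hypothetical shape of the asynchronous decompressor stream discussed
    // above: the caller hands over a destination buffer and gets back a
    // Future that completes once decompressed bytes are in the buffer,
    // so the app owns the buffers but must track outstanding requests.
    interface AsyncDecompressorStream {
        Future<ByteBuffer> read(ByteBuffer dest);
    }

    // Trivial "identity codec" standing in for a real decompressor, just to
    // exercise the contract: it fills dest immediately and completes the
    // Future; a real implementation would complete it from a pipeline.
    static AsyncDecompressorStream identity(byte[] decompressed) {
        return dest -> {
            dest.put(decompressed).flip();
            return CompletableFuture.completedFuture(dest);
        };
    }

    public static void main(String[] args) throws Exception {
        ByteBuffer out = identity("abc".getBytes())
                .read(ByteBuffer.allocate(16))
                .get();
        System.out.println(out.remaining()); // 3: bytes ready for the app
    }
}
```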


 Zero-copy ByteBuffer-based compressor / decompressor API
 

 Key: HADOOP-8148
 URL: https://issues.apache.org/jira/browse/HADOOP-8148
 Project: Hadoop Common
  Issue Type: New Feature
  Components: io, performance
Reporter: Tim Broberg
Assignee: Tim Broberg
 Attachments: hadoop-8148.patch, hadoop8148.patch, zerocopyifc.tgz


 Per Todd Lipcon's comment in HDFS-2834, 
   Whenever a native decompression codec is being used, ... we generally have 
 the following copies:
   1) Socket -> DirectByteBuffer (in SocketChannel implementation)
   2) DirectByteBuffer -> byte[] (in SocketInputStream)
   3) byte[] -> Native buffer (set up for decompression)
   4*) decompression to a different native buffer (not really a copy - 
 decompression necessarily rewrites)
   5) native buffer -> byte[]
   with the proposed improvement we can hopefully eliminate #2, #3 for all 
 applications, and #2, #3, and #5 for libhdfs.
 
 The interfaces in the attached patch attempt to address:
  A - Compression and decompression based on ByteBuffers (HDFS-2834)
  B - Zero-copy compression and decompression (HDFS-3051)
  C - Provide the caller a way to know the max space required to hold 
 compressed output.





[jira] [Resolved] (HADOOP-8554) KerberosAuthenticator should use the configured principal

2012-07-06 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-8554.
-

Resolution: Invalid

You're right; thanks for the explanation. I didn't realize the principal config 
was server-side only. Also, the reason I hit this with webhdfs and not hftp is 
that hftp doesn't support SPNEGO.

 KerberosAuthenticator should use the configured principal
 -

 Key: HADOOP-8554
 URL: https://issues.apache.org/jira/browse/HADOOP-8554
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 1.0.0, 2.0.0-alpha, 2.0.1-alpha, 3.0.0
Reporter: Eli Collins
  Labels: security, webconsole

 In KerberosAuthenticator we construct the principal as follows:
 {code}
 String servicePrincipal = "HTTP/" + KerberosAuthenticator.this.url.getHost();
 {code}
 Seems like we should use the configured 
 hadoop.http.authentication.kerberos.principal instead right?
 I hit this issue as a distcp using webhdfs://localhost fails because 
 HTTP/localhost is not in the kerb DB, but using webhdfs://eli-thinkpad works 
 because HTTP/eli-thinkpad is (and is my configured principal). distcp using 
 hftp://localhost with the same config works, so it looks like this check is 
 webhdfs-specific for some reason (webhdfs is using spnego and hftp is not?).





[jira] [Updated] (HADOOP-8552) Conflict: Same security.log.file for multiple users.

2012-07-06 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated HADOOP-8552:
-

Status: Patch Available  (was: Open)

 Conflict: Same security.log.file for multiple users. 
 -

 Key: HADOOP-8552
 URL: https://issues.apache.org/jira/browse/HADOOP-8552
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf, security
Affects Versions: 2.0.0-alpha, 1.0.3
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: HADOOP-8552_branch1.patch, HADOOP-8552_branch2.patch


 In log4j.properties, hadoop.security.log.file is set to SecurityAuth.audit. 
 In the presence of multiple users, this can lead to a conflict.
 Adding the username to the log file name would avoid this scenario.
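
A sketch of the kind of log4j.properties change the description suggests (the 
${user.name} form is an assumption; log4j 1.x expands Java system properties 
in property values):

```properties
# Current default: one shared file, which conflicts when multiple users
# write audit logs into the same log directory.
# hadoop.security.log.file=SecurityAuth.audit

# Proposed: include the username so each user gets a distinct audit file.
hadoop.security.log.file=SecurityAuth-${user.name}.audit
```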





[jira] [Commented] (HADOOP-8525) Provide Improved Traceability for Configuration

2012-07-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408536#comment-13408536
 ] 

Hadoop QA commented on HADOOP-8525:
---

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12535438/HADOOP-8525.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 2 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1179//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1179//console

This message is automatically generated.

 Provide Improved Traceability for Configuration
 ---

 Key: HADOOP-8525
 URL: https://issues.apache.org/jira/browse/HADOOP-8525
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Priority: Trivial
 Attachments: HADOOP-8525.txt, HADOOP-8525.txt, HADOOP-8525.txt


 Configuration provides basic traceability to see where a config setting came 
 from, but once the configuration is written out that information is written 
 to a comment in the XML and then lost the next time the configuration is read 
 back in.  It would really be great to be able to store a complete history of 
 where the config came from in the XML, so that it can then be retrieved later 
 for debugging.





[jira] [Commented] (HADOOP-8573) Configuration tries to read from an inputstream resource multiple times.

2012-07-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408542#comment-13408542
 ] 

Hadoop QA commented on HADOOP-8573:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12535381/HADOOP-8573.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common:

  org.apache.hadoop.ha.TestZKFailoverController
  org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl
  org.apache.hadoop.io.file.tfile.TestTFileByteArrays
  org.apache.hadoop.io.file.tfile.TestTFileJClassComparatorByteArrays

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1178//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1178//console

This message is automatically generated.

 Configuration tries to read from an inputstream resource multiple times. 
 -

 Key: HADOOP-8573
 URL: https://issues.apache.org/jira/browse/HADOOP-8573
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 1.0.2, 0.23.3, 2.0.1-alpha, 3.0.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: HADOOP-8573.txt


 If someone calls Configuration.addResource(InputStream) and then 
 reloadConfiguration is called for any reason, Configruation will try to 
 reread the contents of the InputStream, after it has already closed it.
 This never showed up in 1.0 because the framework itself does not call 
 addResource with an InputStream, and typically by the time user code starts 
 running that might call this, all of the default and site resources have 
 already been loaded.
 In 0.23 mapreduce is now a client library, and mapred-site.xml and 
 mapred-default.xml are loaded much later in the process.
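
The failure mode can be reproduced in isolation; this sketch (not Hadoop code) 
shows why naively re-reading the same InputStream on reload returns nothing:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

public class StreamReuseDemo {
    public static void main(String[] args) throws Exception {
        // Stand-in for the resource passed to Configuration.addResource(InputStream).
        InputStream in = new ByteArrayInputStream("<configuration/>".getBytes());

        // First load: the parser consumes the stream to the end and closes it.
        int n = 0;
        while (in.read() != -1) { n++; }
        in.close();

        // "reloadConfiguration": re-reading the same InputStream yields no
        // data (or an IOException for stream types that enforce close);
        // this is the failure mode described above.
        System.out.println(n);          // bytes seen on the first pass
        System.out.println(in.read());  // -1: nothing left on the second pass
    }
}
```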





[jira] [Commented] (HADOOP-8566) AvroReflectSerializer.accept(Class) throws a NPE if the class has no package (primitive types and arrays)

2012-07-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408544#comment-13408544
 ] 

Hudson commented on HADOOP-8566:


Integrated in Hadoop-Common-trunk-Commit #2429 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2429/])
HADOOP-8566. AvroReflectSerializer.accept(Class) throws a NPE if the class 
has no package (primitive types and arrays). (tucu) (Revision 1358454)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1358454
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/serializer/avro/AvroReflectSerialization.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/serializer/avro/TestAvroSerialization.java


 AvroReflectSerializer.accept(Class) throws a NPE if the class has no package 
 (primitive types and arrays)
 -

 Key: HADOOP-8566
 URL: https://issues.apache.org/jira/browse/HADOOP-8566
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 2.0.0-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 2.0.1-alpha

 Attachments: HADOOP-8566.patch, HADOOP-8566.patch


 the accept() method should consider the case where the class getPackage() 
 returns NULL.
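
A standalone demonstration of the null case and the kind of guard the fix 
needs (illustrative only, not the actual AvroReflectSerialization code):

```java
public class PackageNullDemo {
    // Class.getPackage() returns null for primitives and arrays, so any
    // check that dereferences it must be guarded to avoid the NPE.
    static boolean inPackage(Class<?> c, String prefix) {
        Package p = c.getPackage();
        return p != null && p.getName().startsWith(prefix);
    }

    public static void main(String[] args) {
        System.out.println(int.class.getPackage());     // null (primitive)
        System.out.println(byte[].class.getPackage());  // null (array)
        System.out.println(inPackage(String.class, "java.lang")); // true
        System.out.println(inPackage(int.class, "java.lang"));    // false
    }
}
```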





[jira] [Commented] (HADOOP-8566) AvroReflectSerializer.accept(Class) throws a NPE if the class has no package (primitive types and arrays)

2012-07-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408545#comment-13408545
 ] 

Hudson commented on HADOOP-8566:


Integrated in Hadoop-Hdfs-trunk-Commit #2497 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2497/])
HADOOP-8566. AvroReflectSerializer.accept(Class) throws a NPE if the class 
has no package (primitive types and arrays). (tucu) (Revision 1358454)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1358454
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/serializer/avro/AvroReflectSerialization.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/serializer/avro/TestAvroSerialization.java


 AvroReflectSerializer.accept(Class) throws a NPE if the class has no package 
 (primitive types and arrays)
 -

 Key: HADOOP-8566
 URL: https://issues.apache.org/jira/browse/HADOOP-8566
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 2.0.0-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Fix For: 2.0.1-alpha

 Attachments: HADOOP-8566.patch, HADOOP-8566.patch


 the accept() method should consider the case where the class getPackage() 
 returns NULL.





[jira] [Commented] (HADOOP-8552) Conflict: Same security.log.file for multiple users.

2012-07-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408546#comment-13408546
 ] 

Hadoop QA commented on HADOOP-8552:
---

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12535000/HADOOP-8552_branch2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/1180//console

This message is automatically generated.

 Conflict: Same security.log.file for multiple users. 
 -

 Key: HADOOP-8552
 URL: https://issues.apache.org/jira/browse/HADOOP-8552
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf, security
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: HADOOP-8552_branch1.patch, HADOOP-8552_branch2.patch


 In log4j.properties, hadoop.security.log.file is set to SecurityAuth.audit. 
 In the presence of multiple users, this can lead to a conflict.
 Adding the username to the log file name would avoid this scenario.





[jira] [Commented] (HADOOP-8572) Have the ability to force the use of the login user

2012-07-06 Thread Guillaume Nodet (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408579#comment-13408579
 ] 

Guillaume Nodet commented on HADOOP-8572:
-

I'm working on deploying Hadoop in OSGi (and Karaf in particular).  Karaf has a 
console, similar to a unix shell, where you can run commands (deploying new 
osgi bundles, starting and stopping bundles, and much more).  This console is 
also available remotely using an ssh client (Karaf embeds a java sshd server).
In Karaf, we don't use any security manager, but we still have an authenticated 
user associated with the thread: the user logged into the console.  When you 
install the hadoop osgi bundle (which I'm working on), the karaf security layer 
and the hadoop security mechanism do not work well together.  The patch allows 
hadoop to simply ignore the currently associated Subject and behave as if no 
Subject were associated with the thread (i.e. defaulting to the OS login user).

I'll attach a patch asap.


 Have the ability to force the use of the login user 
 

 Key: HADOOP-8572
 URL: https://issues.apache.org/jira/browse/HADOOP-8572
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Guillaume Nodet

 In Karaf, most of the code is run under the karaf user. When a user ssh 
 into Karaf, commands will be executed under that user.
 Deploying hadoop inside Karaf requires that the authenticated Subject has the 
 required hadoop principals set, which forces the reconfiguration of the whole 
 security layer, even at dev time.
 My patch proposes the introduction of a new configuration property 
 {{hadoop.security.force.login.user}} which, if set to true (it would default 
 to false to keep the current behavior), would force the use of the login user 
 instead of the authenticated subject (which is what happens when there is 
 no authenticated subject at all).  This greatly simplifies the use of hadoop 
 in environments where security isn't really needed (at dev time).
