date:20160414

[jira] [Updated] (HADOOP-12924) Add default coder key for creating raw coders

2016-04-14 Thread Rui Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HADOOP-12924:

Attachment: HADOOP-12924.5.patch

Update patch to address Kai's comments.
# The codec names are still strings rather than enum. That's because we're 
using string as codec name in ECSchema. And enum is not friendly to custom 
codec names.
# I moved the system schemas and codec names to a new class 
{{ErasureCodeConstants}}.

> Add default coder key for creating raw coders
> -
>
> Key: HADOOP-12924
> URL: https://issues.apache.org/jira/browse/HADOOP-12924
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HADOOP-12924.1.patch, HADOOP-12924.2.patch, 
> HADOOP-12924.3.patch, HADOOP-12924.4.patch, HADOOP-12924.5.patch
>
>
> As suggested 
> [here|https://issues.apache.org/jira/browse/HADOOP-12826?focusedCommentId=15194402=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15194402].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12989) Some tests in org.apache.hadoop.fs.shell.find occasionally time out

2016-04-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242472#comment-15242472
 ] 

Hudson commented on HADOOP-12989:
-

FAILURE: Integrated in Hadoop-trunk-Commit #9617 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9617/])
HADOOP-12989. Some tests in org.apache.hadoop.fs.shell.find occasionally 
(aajisaka: rev 6e6b6dd5aaf93cfb373833cd175ee722d2cb708f)
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/find/TestResult.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/find/TestPrint.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/find/TestPrint0.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/find/TestIname.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/find/TestFilterExpression.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/find/TestName.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/find/TestFind.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/find/TestAnd.java


> Some tests in org.apache.hadoop.fs.shell.find occasionally time out
> ---
>
> Key: HADOOP-12989
> URL: https://issues.apache.org/jira/browse/HADOOP-12989
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: test
>Reporter: Akira AJISAKA
>Assignee: Takashi Ohnishi
>  Labels: newbie
> Fix For: 2.8.0, 2.7.3
>
> Attachments: HADOOP-12989.1.patch, HADOOP-12989.2.patch
>
>
> An example:
> {noformat}
> java.lang.Exception: test timed out after 1000 milliseconds
>   at java.lang.ClassLoader$NativeLibrary.load(Native Method)
>   at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1965)
>   at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1890)
>   at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1872)
>   at java.lang.Runtime.loadLibrary0(Runtime.java:849)
>   at java.lang.System.loadLibrary(System.java:1088)
>   at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:67)
>   at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:47)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.NetworkInterface.(NetworkInterface.java:56)
>   at org.apache.htrace.core.TracerId.getBestIpString(TracerId.java:179)
>   at org.apache.htrace.core.TracerId.processShellVar(TracerId.java:145)
>   at org.apache.htrace.core.TracerId.(TracerId.java:116)
>   at org.apache.htrace.core.Tracer$Builder.build(Tracer.java:159)
>   at org.apache.hadoop.fs.FsTracer.get(FsTracer.java:42)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2794)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2837)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2819)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:381)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:180)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:365)
>   at org.apache.hadoop.fs.shell.PathData.(PathData.java:81)
>   at org.apache.hadoop.fs.shell.find.TestName.applyGlob(TestName.java:74)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-13010) Refactor raw erasure coders

2016-04-14 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242468#comment-15242468
 ] 

Rui Li commented on HADOOP-13010:
-

I think dummy raw coders are intended for performance benchmark. By eliminating 
the en/decode work, it can help set an upper bound for the real raw coders, and 
find potential performance issues in how we read/write erasure coded files.
The end-user should be able to configure them so they can conduct their own 
benchmark.
Regarding the name, I'm not sure which one is better. I think the naming should 
indicate the coder is only for testing and not for real data.

> Refactor raw erasure coders
> ---
>
> Key: HADOOP-13010
> URL: https://issues.apache.org/jira/browse/HADOOP-13010
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: 3.0.0
>
> Attachments: HADOOP-13010-v1.patch, HADOOP-13010-v2.patch
>
>
> This will refactor raw erasure coders according to some comments received so 
> far.
> * As discussed in HADOOP-11540 and suggested by [~cmccabe], better not to 
> rely class inheritance to reuse the codes, instead they can be moved to some 
> utility.
> * Suggested by [~jingzhao] somewhere quite some time ago, better to have a 
> state holder to keep some checking results for later reuse during an 
> encode/decode call.
> This would not get rid of some inheritance levels as doing so isn't clear yet 
> for the moment and also incurs big impact. I do wish the end result by this 
> refactoring will make all the levels more clear and easier to follow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12989) Some tests in org.apache.hadoop.fs.shell.find occasionally time out

2016-04-14 Thread Akira AJISAKA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242466#comment-15242466
 ] 

Akira AJISAKA commented on HADOOP-12989:


and thanks Xiao for the comment.

> Some tests in org.apache.hadoop.fs.shell.find occasionally time out
> ---
>
> Key: HADOOP-12989
> URL: https://issues.apache.org/jira/browse/HADOOP-12989
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: test
>Reporter: Akira AJISAKA
>Assignee: Takashi Ohnishi
>  Labels: newbie
> Fix For: 2.8.0, 2.7.3
>
> Attachments: HADOOP-12989.1.patch, HADOOP-12989.2.patch
>
>
> An example:
> {noformat}
> java.lang.Exception: test timed out after 1000 milliseconds
>   at java.lang.ClassLoader$NativeLibrary.load(Native Method)
>   at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1965)
>   at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1890)
>   at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1872)
>   at java.lang.Runtime.loadLibrary0(Runtime.java:849)
>   at java.lang.System.loadLibrary(System.java:1088)
>   at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:67)
>   at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:47)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.NetworkInterface.(NetworkInterface.java:56)
>   at org.apache.htrace.core.TracerId.getBestIpString(TracerId.java:179)
>   at org.apache.htrace.core.TracerId.processShellVar(TracerId.java:145)
>   at org.apache.htrace.core.TracerId.(TracerId.java:116)
>   at org.apache.htrace.core.Tracer$Builder.build(Tracer.java:159)
>   at org.apache.hadoop.fs.FsTracer.get(FsTracer.java:42)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2794)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2837)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2819)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:381)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:180)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:365)
>   at org.apache.hadoop.fs.shell.PathData.(PathData.java:81)
>   at org.apache.hadoop.fs.shell.find.TestName.applyGlob(TestName.java:74)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12989) Some tests in org.apache.hadoop.fs.shell.find occasionally time out

2016-04-14 Thread Akira AJISAKA (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HADOOP-12989:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.7.3
   2.8.0
   Status: Resolved  (was: Patch Available)

Committed this to branch-2.7 and above. Thanks [~bwtakacy] for the contribution!

> Some tests in org.apache.hadoop.fs.shell.find occasionally time out
> ---
>
> Key: HADOOP-12989
> URL: https://issues.apache.org/jira/browse/HADOOP-12989
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: test
>Reporter: Akira AJISAKA
>Assignee: Takashi Ohnishi
>  Labels: newbie
> Fix For: 2.8.0, 2.7.3
>
> Attachments: HADOOP-12989.1.patch, HADOOP-12989.2.patch
>
>
> An example:
> {noformat}
> java.lang.Exception: test timed out after 1000 milliseconds
>   at java.lang.ClassLoader$NativeLibrary.load(Native Method)
>   at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1965)
>   at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1890)
>   at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1872)
>   at java.lang.Runtime.loadLibrary0(Runtime.java:849)
>   at java.lang.System.loadLibrary(System.java:1088)
>   at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:67)
>   at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:47)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.NetworkInterface.(NetworkInterface.java:56)
>   at org.apache.htrace.core.TracerId.getBestIpString(TracerId.java:179)
>   at org.apache.htrace.core.TracerId.processShellVar(TracerId.java:145)
>   at org.apache.htrace.core.TracerId.(TracerId.java:116)
>   at org.apache.htrace.core.Tracer$Builder.build(Tracer.java:159)
>   at org.apache.hadoop.fs.FsTracer.get(FsTracer.java:42)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2794)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2837)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2819)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:381)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:180)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:365)
>   at org.apache.hadoop.fs.shell.PathData.(PathData.java:81)
>   at org.apache.hadoop.fs.shell.find.TestName.applyGlob(TestName.java:74)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12613) TestFind.processArguments occasionally fails

2016-04-14 Thread Akira AJISAKA (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HADOOP-12613:
---
Fix Version/s: (was: 2.9.0)
   2.7.3
   2.8.0

Cherry-picked to branch-2.8 and branch-2.7 because HADOOP-12989 depends on this 
commit.

> TestFind.processArguments occasionally fails
> 
>
> Key: HADOOP-12613
> URL: https://issues.apache.org/jira/browse/HADOOP-12613
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: test
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Fix For: 2.8.0, 2.7.3
>
> Attachments: HADOOP-12613.001.patch, HADOOP-12613.002.patch
>
>
> This failure seems to exist after November 3rd. I am still tracing where this 
> can come from.
> https://builds.apache.org/job/Hadoop-Common-trunk/2066/testReport/org.apache.hadoop.fs.shell.find/TestFind/processArguments/
> Error Message
> test timed out after 1000 milliseconds
> Stacktrace
> {noformat}
> java.lang.Exception: test timed out after 1000 milliseconds
>   at java.util.AbstractList$Itr.next(AbstractList.java:357)
>   at java.util.SubList$1.next(AbstractList.java:707)
>   at java.util.AbstractCollection.toArray(AbstractCollection.java:141)
>   at java.util.ArrayList.addAll(ArrayList.java:559)
>   at 
> org.mockito.internal.exceptions.base.StackTraceFilter.filter(StackTraceFilter.java:54)
>   at org.mockito.internal.debugging.Location.(Location.java:22)
>   at org.mockito.internal.debugging.Location.(Location.java:17)
>   at org.mockito.internal.invocation.Invocation.(Invocation.java:60)
>   at 
> org.mockito.internal.creation.MethodInterceptorFilter.intercept(MethodInterceptorFilter.java:46)
>   at 
> org.apache.hadoop.fs.FileStatus$$EnhancerByMockitoWithCGLIB$$a131b1e2.isSymlink()
>   at org.apache.hadoop.fs.shell.find.Find.recursePath(Find.java:355)
>   at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:323)
>   at org.apache.hadoop.fs.shell.Command.recursePath(Command.java:377)
>   at org.apache.hadoop.fs.shell.find.Find.recursePath(Find.java:369)
>   at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:323)
>   at 
> org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:293)
>   at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:275)
>   at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:259)
>   at org.apache.hadoop.fs.shell.find.Find.processArguments(Find.java:427)
>   at 
> org.apache.hadoop.fs.shell.find.TestFind.processArguments(TestFind.java:253)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12989) Some tests in org.apache.hadoop.fs.shell.find occasionally time out

2016-04-14 Thread Akira AJISAKA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242449#comment-15242449
 ] 

Akira AJISAKA commented on HADOOP-12989:


Thank you for updating the patch [~bwtakacy]! +1, committing this.

> Some tests in org.apache.hadoop.fs.shell.find occasionally time out
> ---
>
> Key: HADOOP-12989
> URL: https://issues.apache.org/jira/browse/HADOOP-12989
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: test
>Reporter: Akira AJISAKA
>Assignee: Takashi Ohnishi
>  Labels: newbie
> Attachments: HADOOP-12989.1.patch, HADOOP-12989.2.patch
>
>
> An example:
> {noformat}
> java.lang.Exception: test timed out after 1000 milliseconds
>   at java.lang.ClassLoader$NativeLibrary.load(Native Method)
>   at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1965)
>   at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1890)
>   at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1872)
>   at java.lang.Runtime.loadLibrary0(Runtime.java:849)
>   at java.lang.System.loadLibrary(System.java:1088)
>   at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:67)
>   at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:47)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.NetworkInterface.(NetworkInterface.java:56)
>   at org.apache.htrace.core.TracerId.getBestIpString(TracerId.java:179)
>   at org.apache.htrace.core.TracerId.processShellVar(TracerId.java:145)
>   at org.apache.htrace.core.TracerId.(TracerId.java:116)
>   at org.apache.htrace.core.Tracer$Builder.build(Tracer.java:159)
>   at org.apache.hadoop.fs.FsTracer.get(FsTracer.java:42)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2794)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2837)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2819)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:381)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:180)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:365)
>   at org.apache.hadoop.fs.shell.PathData.(PathData.java:81)
>   at org.apache.hadoop.fs.shell.find.TestName.applyGlob(TestName.java:74)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-13010) Refactor raw erasure coders

2016-04-14 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242373#comment-15242373
 ] 

Kai Zheng commented on HADOOP-13010:


Continue to address the original comments.
bq. AbstractRawErasureCoder – why do we need this base class? Its function 
seems to be just storing configuration values.
Having the common base class would allow encoder and decoder to share common 
properties, not just configurations, but also schema info and some options. We 
can also say that encoder and decoder are also coders, which allows to write 
some common  behaviors to deal with coders, not encoder or decoder specific. I 
understand it should also work by composition, but  right now I don't see very 
much benefits to switch this from one style to the other, or troubles if we 
don't.
bq. AbstractRawErasureEncoder /AbstractRawErasureDecoder – why are these 
classes separate from RawErasureEncoder / RawErasureDecoder? ...  Base classes 
are also easier to extend in the future than interfaces because you can add new 
methods without breaking backwards compatibility
It sounds better not to have the interfaces since the benefit is obvious. So in 
summary how about having these classes (no interface) now: still 
AbstractRawErasureCoder, RawErasureEncoder/Decoder (no Abstract prefix now, 
with the original interface combined), and all kinds of concrete inherent 
encoders/decoders. All client codes will declare RawErasureEncoder/Decoder type 
when creating instances.
bq. DummyRawDecoder – NoOpRawDecoder would be a better name than "Dummy".  Is 
this intended to be used just in unit tests, or is it something the end-user 
should be able to configure?
Hmm, I'm not sure but I think "Dummy" is more simple and flexible to interpret. 
It's for tests but may be configured and used in a benchmark test when 
comparing. [~zhz] and [~lirui] please help clarify if I missed something. 

> Refactor raw erasure coders
> ---
>
> Key: HADOOP-13010
> URL: https://issues.apache.org/jira/browse/HADOOP-13010
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: 3.0.0
>
> Attachments: HADOOP-13010-v1.patch, HADOOP-13010-v2.patch
>
>
> This will refactor raw erasure coders according to some comments received so 
> far.
> * As discussed in HADOOP-11540 and suggested by [~cmccabe], better not to 
> rely class inheritance to reuse the codes, instead they can be moved to some 
> utility.
> * Suggested by [~jingzhao] somewhere quite some time ago, better to have a 
> state holder to keep some checking results for later reuse during an 
> encode/decode call.
> This would not get rid of some inheritance levels as doing so isn't clear yet 
> for the moment and also incurs big impact. I do wish the end result by this 
> refactoring will make all the levels more clear and easier to follow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-13027) Add unit test for MiniKDC to issue tickets for >1 persons in the same instance

2016-04-14 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-13027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HADOOP-13027:
---
Summary: Add unit test for MiniKDC to issue tickets for >1 persons in the 
same instance  (was: Add UT for MiniKDC to issue tickets for >1 person in the 
same JVM)

> Add unit test for MiniKDC to issue tickets for >1 persons in the same instance
> --
>
> Key: HADOOP-13027
> URL: https://issues.apache.org/jira/browse/HADOOP-13027
> Project: Hadoop Common
>  Issue Type: Test
>Reporter: Jiajia Li
>Assignee: Jiajia Li
>
> As discussed in https://issues.apache.org/jira/browse/HADOOP-12911, we should 
> verify Kerby SimpleKDC can issue tickets for >1 person in the same JVM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-13027) Add unit test for MiniKDC to issue tickets for >1 persons in the same instance

2016-04-14 Thread Jiajia Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-13027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajia Li updated HADOOP-13027:
---
Description: As discussed in HADOOP-12911, we should verify MiniKDC can 
issue tickets for >1 persons in the same instance.  (was: As discussed in 
https://issues.apache.org/jira/browse/HADOOP-12911, we should verify Kerby 
SimpleKDC can issue tickets for >1 person in the same JVM.)

> Add unit test for MiniKDC to issue tickets for >1 persons in the same instance
> --
>
> Key: HADOOP-13027
> URL: https://issues.apache.org/jira/browse/HADOOP-13027
> Project: Hadoop Common
>  Issue Type: Test
>Reporter: Jiajia Li
>Assignee: Jiajia Li
>
> As discussed in HADOOP-12911, we should verify MiniKDC can issue tickets for 
> >1 persons in the same instance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (HADOOP-13027) Add UT for MiniKDC to issue tickets for >1 person in the same JVM

2016-04-14 Thread Jiajia Li (JIRA)

Jiajia Li created HADOOP-13027:
--

 Summary: Add UT for MiniKDC to issue tickets for >1 person in the 
same JVM
 Key: HADOOP-13027
 URL: https://issues.apache.org/jira/browse/HADOOP-13027
 Project: Hadoop Common
  Issue Type: Test
Reporter: Jiajia Li
Assignee: Jiajia Li


As discussed in https://issues.apache.org/jira/browse/HADOOP-12911, we should 
verify Kerby SimpleKDC can issue tickets for >1 person in the same JVM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12224) new method to ReflectionUtils to test if the item is already public

2016-04-14 Thread Greg Phillips (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Phillips updated HADOOP-12224:
---
Attachment: HADOOP-12224.002.patch

Fixed formatting issues

> new method to ReflectionUtils to test if the item is already public
> ---
>
> Key: HADOOP-12224
> URL: https://issues.apache.org/jira/browse/HADOOP-12224
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: util
>Affects Versions: 2.5.0
>Reporter: Purvesh Patel
>Priority: Minor
> Attachments: HADOOP-12224-1.patch, HADOOP-12224.002.patch
>
>
> It's noticed that it is common practice to call setAccessible on a method 
> whether it is required or not. This patch provides a new method to 
> ReflectionUtils to test if the item is already public and only call 
> setAccessible if it is not public. Please see attached file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12911) Upgrade Hadoop MiniKDC with Kerby

2016-04-14 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242233#comment-15242233
 ] 

Jiajia Li commented on HADOOP-12911:


bq. It's a good idea to support tickets > 1, I believe Kerby SimpleKDC can do 
so and the new MiniKDC will do. It would be good to add tests to verify this. 
Agree, I will add the test to verify this.

bq.  there were some issues related to MiniKDC in the current codebase, maybe 
you would check them out and see how this new implementation will solve them or 
not? 
Yes, I think it will make MiniKDC better, I will do it.

> Upgrade Hadoop MiniKDC with Kerby
> -
>
> Key: HADOOP-12911
> URL: https://issues.apache.org/jira/browse/HADOOP-12911
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: test
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HADOOP-12911-v1.patch, HADOOP-12911-v2.patch, 
> HADOOP-12911-v3.patch, HADOOP-12911-v4.patch, HADOOP-12911-v5.patch, 
> HADOOP-12911-v6.patch
>
>
> As discussed in the mailing list, we’d like to introduce Apache Kerby into 
> Hadoop. Initially it’s good to start with upgrading Hadoop MiniKDC with Kerby 
> offerings. Apache Kerby (https://github.com/apache/directory-kerby), as an 
> Apache Directory sub project, is a Java Kerberos binding. It provides a 
> SimpleKDC server that borrowed ideas from MiniKDC and implemented all the 
> facilities existing in MiniKDC. Currently MiniKDC depends on the old Kerberos 
> implementation in Directory Server project, but the implementation is stopped 
> being maintained. Directory community has a plan to replace the 
> implementation using Kerby. MiniKDC can use Kerby SimpleKDC directly to avoid 
> depending on the full of Directory project. Kerby also provides nice identity 
> backends such as the lightweight memory based one and the very simple json 
> one for easy development and test environments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-13010) Refactor raw erasure coders

2016-04-14 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242201#comment-15242201
 ] 

Kai Zheng commented on HADOOP-13010:


Thanks Colin for the quick response!
bq. Let's get rid of the special case, unless we have some benchmarks showing 
that it helps.
Ok. It's not hot at all.
bq. If we want to do multiple decode operations in parallel, we can just create 
multiple Decoder objects, right?
The problem is, a decoder associates expensive coding buffers and computed 
coding matrices, which would be good to stay in CPU core near enough caches for 
the performance. The cached data is per decoder, not only schema specific, but 
also erasure index specific in decode call, so it's not good to keep the cache 
out of decoder, but still makes sense to cache it because in HDFS side it's 
repeatedly called in a loop for a large block size (64k cell size -> 256mb 
block size). You might have a check about the native codes for native coders 
about the expensive buffers and data cached in every decode call. We had 
benchmarked the coders and showed this optimization obtained great speedup. 
Java InputStreams are similar to here, but not exactly because it's pure 
view-only and leverages OS/IO level caches for file reading stuffs. 


> Refactor raw erasure coders
> ---
>
> Key: HADOOP-13010
> URL: https://issues.apache.org/jira/browse/HADOOP-13010
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: 3.0.0
>
> Attachments: HADOOP-13010-v1.patch, HADOOP-13010-v2.patch
>
>
> This will refactor raw erasure coders according to some comments received so 
> far.
> * As discussed in HADOOP-11540 and suggested by [~cmccabe], better not to 
> rely class inheritance to reuse the codes, instead they can be moved to some 
> utility.
> * Suggested by [~jingzhao] somewhere quite some time ago, better to have a 
> state holder to keep some checking results for later reuse during an 
> encode/decode call.
> This would not get rid of some inheritance levels as doing so isn't clear yet 
> for the moment and also incurs big impact. I do wish the end result by this 
> refactoring will make all the levels more clear and easier to follow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (HADOOP-13010) Refactor raw erasure coders

2016-04-14 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242157#comment-15242157
 ] 

Colin Patrick McCabe edited comment on HADOOP-13010 at 4/14/16 11:53 PM:
-

bq. The underlying buffer for the empty trunk is assumed read-only and will 
only be used to zero coding buffers. Making the entire function safe and also 
private is a good idea because in practice that level should be good enough.

Right.  The arrays themselves are read-only.  But we still have to control 
access to the pointer to the array, which is not read-only and which needs to 
be accessed in a thread-safe fashion.

bq. For pure Java coders that use byte array and on-heap bytebuffer, this way 
to zero buffers is efficient (perhaps the most one but I'm not totally sure); 
to zero direct bytebuffer the more efficient way would be to use an empty 
direct bytebuffer. I don't optimize this because pure Java coder is better not 
to use direct bytebuffer overall. Note native coders will prefer direct 
bytebuffer but won't need to bump into this, as we discussed in HADOOP-11540.

Yeah, the JNI encoders can be more efficient, so we don't have to worry about 
optimizing this.  I was just commenting that it's unfortunate that we have to 
keep around the empty array.

bq. Ok. Comment can be made here to tell the null indexes include erased units 
and the units that's not to read.

The function just finds null array entries.  What these entries mean is up to 
the caller.

bq. Because I want \[the first element\] to return fast considering it's the 
most often case.

I don't see any evidence that adding a special case makes this faster than just 
running the loop.  The loop starts at the first element anyway.  If the loop 
usually stops after the first iteration, I would expect the just-in-time 
compiler to optimize this code.  Let's get rid of the special case, unless we 
have some benchmarks showing that it helps.

bq. \[Decoders are\] intended not to be stateful, thus many threads can use the 
same decoder instance. I'm not sure all the existing coders are already good in 
this aspect, but effort will be made to achieve so if necessary, not sure all 
be done here.

Part of the appeal of object-oriented programming is to combine the data with 
the methods used to operate on that data.  I'm not sure why we would want to 
keep the decoder state separate from the decoder functions.  If we want to do 
multiple decode operations in parallel, we can just create multiple Decoder 
objects, right?

Java InputStreams don't have an InputStreamState that you have to pass in to 
every function.  Instead, if you want multiple views of the same file, you just 
create multiple streams.  It seems like we can take the same approach here.


was (Author: cmccabe):
bq. The underlying buffer for the empty trunk is assumed read-only and will 
only be used to zero coding buffers. Making the entire function safe and also 
private is a good idea because in practice that level should be good enough.

Right.  The arrays themselves are read-only.  But we still have to control 
access to the pointer to the array, which is not read-only and which needs to 
be accessed in a thread-safe fashion.

bq. For pure Java coders that use byte array and on-heap bytebuffer, this way 
to zero buffers is efficient (perhaps the most one but I'm not totally sure); 
to zero direct bytebuffer the more efficient way would be to use an empty 
direct bytebuffer. I don't optimize this because pure Java coder is better not 
to use direct bytebuffer overall. Note native coders will prefer direct 
bytebuffer but won't need to bump into this, as we discussed in HADOOP-11540.

Yeah, the JNI encoders can be more efficient, so we don't have to worry about 
optimizing this.  I was just commenting that it's unfortunate that we have to 
keep around the empty array.

bq. Ok. Comment can be made here to tell the null indexes include erased units 
and the units that's not to read.

The function just finds null array entries.  What these entries mean is up to 
the caller.

bq. Because I want \[the first element\] to return fast considering it's the 
most often case.

I don't see any evidence that adding a special case makes this faster than just 
running the loop.  The loop starts at the first element anyway.  If the loop 
usually stops after the first iteration, I would expect the just-in-time 
compiler to optimize this code.  Let's get rid of the special case, unless we 
have some benchmarks showing that it helps.

bq. \[Decoders are\] intended not to be stateful, thus many threads can use the 
same decoder instance. I'm not sure all the existing coders are already good in 
this aspect, but effort will be made to achieve so if necessary, not sure all 
be done here.

Part of the appeal of object-oriented programming is to combine the data with 
the methods used to operate

[jira] [Commented] (HADOOP-13010) Refactor raw erasure coders

2016-04-14 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242157#comment-15242157
 ] 

Colin Patrick McCabe commented on HADOOP-13010:
---

bq. The underlying buffer for the empty trunk is assumed read-only and will 
only be used to zero coding buffers. Making the entire function safe and also 
private is a good idea because in practice that level should be good enough.

Right.  The arrays themselves are read-only.  But we still have to control 
access to the pointer to the array, which is not read-only and which needs to 
be accessed in a thread-safe fashion.

bq. For pure Java coders that use byte array and on-heap bytebuffer, this way 
to zero buffers is efficient (perhaps the most one but I'm not totally sure); 
to zero direct bytebuffer the more efficient way would be to use an empty 
direct bytebuffer. I don't optimize this because pure Java coder is better not 
to use direct bytebuffer overall. Note native coders will prefer direct 
bytebuffer but won't need to bump into this, as we discussed in HADOOP-11540.

Yeah, the JNI encoders can be more efficient, so we don't have to worry about 
optimizing this.  I was just commenting that it's unfortunate that we have to 
keep around the empty array.

bq. Ok. Comment can be made here to tell the null indexes include erased units 
and the units that's not to read.

The function just finds null array entries.  What these entries mean is up to 
the caller.

bq. Because I want \[the first element\] to return fast considering it's the 
most often case.

I don't see any evidence that adding a special case makes this faster than just 
running the loop.  The loop starts at the first element anyway.  If the loop 
usually stops after the first iteration, I would expect the just-in-time 
compiler to optimize this code.  Let's get rid of the special case, unless we 
have some benchmarks showing that it helps.

bq. \[Decoders are\] intended not to be stateful, thus many threads can use the 
same decoder instance. I'm not sure all the existing coders are already good in 
this aspect, but effort will be made to achieve so if necessary, not sure all 
be done here.

Part of the appeal of object-oriented programming is to combine the data with 
the methods used to operate on that data.  I'm not sure why we would want to 
keep the decoder state separate from the decoder functions.  If we want to do 
multiple decode operations in parallel, we can just create multiple Decoder 
objects, right?

> Refactor raw erasure coders
> ---
>
> Key: HADOOP-13010
> URL: https://issues.apache.org/jira/browse/HADOOP-13010
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: 3.0.0
>
> Attachments: HADOOP-13010-v1.patch, HADOOP-13010-v2.patch
>
>
> This will refactor raw erasure coders according to some comments received so 
> far.
> * As discussed in HADOOP-11540 and suggested by [~cmccabe], better not to 
> rely class inheritance to reuse the codes, instead they can be moved to some 
> utility.
> * Suggested by [~jingzhao] somewhere quite some time ago, better to have a 
> state holder to keep some checking results for later reuse during an 
> encode/decode call.
> This would not get rid of some inheritance levels as doing so isn't clear yet 
> for the moment and also incurs big impact. I do wish the end result by this 
> refactoring will make all the levels more clear and easier to follow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-13010) Refactor raw erasure coders

2016-04-14 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242107#comment-15242107
 ] 

Kai Zheng commented on HADOOP-13010:


bq. This isn't safe for multiple threads, since we could be reading 
CoderUtil#emptyChunk while it's in the middle of being written. You must either 
make this volatile or hold the lock for this entire function.
The underlying buffer for the empty trunk is assumed read-only and will only be 
used to zero coding buffers. Making the entire function safe and also private 
is a good idea because in practice that level should be good enough.
bq. It's unfortunate that we need a function like this-- I was hoping that 
there would be some more efficient way of zeroing a ByteBuffer.
For pure Java coders that use byte array and on-heap bytebuffer, this way to 
zero buffers is efficient (perhaps the most one but I'm not totally sure); to 
zero direct bytebuffer the more efficient way would be to use an empty direct 
bytebuffer. I don't optimize this because pure Java coder is better not to use 
direct bytebuffer overall. Note native coders will prefer direct bytebuffer but 
won't need to bump into this, as we discussed in HADOOP-11540.
bq. Maybe something like cloneAsDirectByteBuffer would be a better name?
Ah yes.
bq. Should be named getNullIndexes?
Ok. Comment can be made here to tell the null indexes include erased units and 
the units that's not to read.
bq. Why do we need the special case for the first element here?
Because I want it to return fast considering it's the most often case.
bq. Should be named getNonNullIndexes? Also, why does this one take an array 
passed in, whereas getNullIndexes returns an array? I also don't see how the 
caller is supposed to know how many of the array slots were used by the 
function. ...  Perhaps we could mandate that the caller set all the array slots 
to a negative value before calling the function, ...
Ah good catch and thoughts! It does be able to unify with getNullIndexes, and 
needs to be made general enough as it stays in the utility class, even though 
for now it's only internal and used by {{RSRawDecoder}}. How many slots to be 
used? By controlled by the fact exactly {{numDataUnits}} is expected and the 
valid indexes are put into the array in order (out of 6+3 inputs). The array is 
zeroed before, yes -1 may be better since 0 is also a valid index, but 0 can 
only occur in the first place so no problem here. Yes better to have some 
JavaDoc explaining about this.
bq. I'm not sure why we wouldn't just store DecoderState in the Decoder? These 
are stateful objects, I assume.
It's intended not to be stateful, thus many threads can use the same decoder 
instance. I'm not sure all the existing coders are already good in this aspect, 
but effort will be made to achieve so if necessary, not sure all be done here.


> Refactor raw erasure coders
> ---
>
> Key: HADOOP-13010
> URL: https://issues.apache.org/jira/browse/HADOOP-13010
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: 3.0.0
>
> Attachments: HADOOP-13010-v1.patch, HADOOP-13010-v2.patch
>
>
> This will refactor raw erasure coders according to some comments received so 
> far.
> * As discussed in HADOOP-11540 and suggested by [~cmccabe], better not to 
> rely class inheritance to reuse the codes, instead they can be moved to some 
> utility.
> * Suggested by [~jingzhao] somewhere quite some time ago, better to have a 
> state holder to keep some checking results for later reuse during an 
> encode/decode call.
> This would not get rid of some inheritance levels as doing so isn't clear yet 
> for the moment and also incurs big impact. I do wish the end result by this 
> refactoring will make all the levels more clear and easier to follow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-13010) Refactor raw erasure coders

2016-04-14 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242079#comment-15242079
 ] 

Kai Zheng commented on HADOOP-13010:


Thanks [~cmccabe] a lot for the thorough review and insights over the whole 
erasure coder codes, in your view it looks like a good chance to make the even 
bigger change. It makes sense doing it now before any release if the change 
will be good. I will go thru your ideas and work on them.


> Refactor raw erasure coders
> ---
>
> Key: HADOOP-13010
> URL: https://issues.apache.org/jira/browse/HADOOP-13010
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: 3.0.0
>
> Attachments: HADOOP-13010-v1.patch, HADOOP-13010-v2.patch
>
>
> This will refactor raw erasure coders according to some comments received so 
> far.
> * As discussed in HADOOP-11540 and suggested by [~cmccabe], better not to 
> rely class inheritance to reuse the codes, instead they can be moved to some 
> utility.
> * Suggested by [~jingzhao] somewhere quite some time ago, better to have a 
> state holder to keep some checking results for later reuse during an 
> encode/decode call.
> This would not get rid of some inheritance levels as doing so isn't clear yet 
> for the moment and also incurs big impact. I do wish the end result by this 
> refactoring will make all the levels more clear and easier to follow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12702) Add an HDFS metrics sink

2016-04-14 Thread Daniel Templeton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated HADOOP-12702:
--
Description: 
We need a metrics2 sink that can write metrics to HDFS. The sink should accept 
as configuration a "directory prefix" and do the following in {{putMetrics()}}

* Get MMddHH from current timestamp.
* If HDFS dir "dir prefix" + MMddHH doesn't exist, create it. Close any 
currently open file and create a new file called .log in the new 
directory.
* Write metrics to the current log file.

  was:
We need a metrics2 sink that can write metrics to HDFS. The sink should accept 
as configuration a "directory prefix" and do the following in {{putMetrics()}}

* Get MMddHH from current timestamp.
* If HDFS dir "dir prefix" + MMddHH doesn't exist, create it. Close any 
currently open file and create a new file called .log in the new 
directory.
* Write metrics to the current log file.
* If a write fails, it should be fatal to the process running the sink.


> Add an HDFS metrics sink
> 
>
> Key: HADOOP-12702
> URL: https://issues.apache.org/jira/browse/HADOOP-12702
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 2.7.1
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
> Fix For: 2.9.0
>
> Attachments: HADOOP-12702.001.patch, HADOOP-12702.002.patch, 
> HADOOP-12702.003.patch, HADOOP-12702.004.patch, HADOOP-12702.005.patch, 
> HADOOP-12702.006.patch
>
>
> We need a metrics2 sink that can write metrics to HDFS. The sink should 
> accept as configuration a "directory prefix" and do the following in 
> {{putMetrics()}}
> * Get MMddHH from current timestamp.
> * If HDFS dir "dir prefix" + MMddHH doesn't exist, create it. Close any 
> currently open file and create a new file called .log in the new 
> directory.
> * Write metrics to the current log file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (HADOOP-12975) Add jitter to CachingGetSpaceUsed's thread

2016-04-14 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241805#comment-15241805
 ] 

Colin Patrick McCabe edited comment on HADOOP-12975 at 4/14/16 8:08 PM:


bq. But a percentage is chosen as it makes the jitter scale with anyone who 
changes du periods. If it's a set number then someone with a refresh period of 
days won't get any benefit from the jitter.

Hmm.  It seems like a fixed amount of jitter still provides a benefit, even to 
someone with a longer refresh interval.  Let's say my refresh period is 7 days. 
 At the end of that, I would still appreciate having my DU processes launch at 
slightly different times on the 7th day, rather than all launching at once.

My concern with varying based on a percentage is that there will be enormous 
variations in how long different volumes go between DU operations, when longer 
refresh intervals are in use.  Like if I have a 7 day period and one volume 
refreshes after 3.5 days, and the other waits for the full 7 days, that's quite 
a variation.  Similarly, if our period is short -- like 1 hour-- having some 
datanodes refresh after only 30 minutes seems unwelcome.  That's why I 
suggested a fixed jitter amount, to be configured by the sysadmin.

I don't feel very strongly about this, though, so if you want to make it 
percentage-based, that's fine too.  As long as it's configurable and the 
defaults are reasonable.  I definitely think that a maximum jitter percentage 
of 0.15 or 0.20 seems more reasonable than 0.5.


was (Author: cmccabe):
bq. But a percentage is chosen as it makes the jitter scale with anyone who 
changes du periods. If it's a set number then someone with a refresh period of 
days won't get any benefit from the jitter.

Hmm.  It seems like a fixed amount of jitter still provides a benefit, even to 
someone with a longer refresh interval.  Let's say my refresh period is 7 days. 
 At the end of that, I would still appreciate having my DU processes launch at 
slightly different times on the 7th day, rather than all launching at once.

My concern with varying based on a percentage is that there will be enormous 
variations in how long different volumes go between DU operations, when longer 
refresh intervals are in use.  Like if I have a 7 day period and one volume 
refreshes after 3.5 days, and the other waits for the full 7 days, that's quite 
a variation.  Similarly, if our period is short -- like 1 hour-- having some 
datanodes refresh after only 30 minutes seems unwelcome.  That's why I 
suggested a fixed jitter amount, to be configured by the sysadmin.

I don't feel very strongly about this, though, so if you want to make it 
percentage-based, that's fine too.  As long as it's configurable and the 
defaults are reasonable.

> Add jitter to CachingGetSpaceUsed's thread
> --
>
> Key: HADOOP-12975
> URL: https://issues.apache.org/jira/browse/HADOOP-12975
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 2.9.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HADOOP-12975v0.patch, HADOOP-12975v1.patch, 
> HADOOP-12975v2.patch
>
>
> Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike. We should add some 
> jitter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (HADOOP-12975) Add jitter to CachingGetSpaceUsed's thread

2016-04-14 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241805#comment-15241805
 ] 

Colin Patrick McCabe edited comment on HADOOP-12975 at 4/14/16 8:07 PM:


bq. But a percentage is chosen as it makes the jitter scale with anyone who 
changes du periods. If it's a set number then someone with a refresh period of 
days won't get any benefit from the jitter.

Hmm.  It seems like a fixed amount of jitter still provides a benefit, even to 
someone with a longer refresh interval.  Let's say my refresh period is 7 days. 
 At the end of that, I would still appreciate having my DU processes launch at 
slightly different times on the 7th day, rather than all launching at once.

My concern with varying based on a percentage is that there will be enormous 
variations in how long different volumes go between DU operations, when longer 
refresh intervals are in use.  Like if I have a 7 day period and one volume 
refreshes after 3.5 days, and the other waits for the full 7 days, that's quite 
a variation.  Similarly, if our period is short -- like 1 hour-- having some 
datanodes refresh after only 30 minutes seems unwelcome.  That's why I 
suggested a fixed jitter amount, to be configured by the sysadmin.

I don't feel very strongly about this, though, so if you want to make it 
percentage-based, that's fine too.  As long as it's configurable and the 
defaults are reasonable.


was (Author: cmccabe):
bq. But a percentage is chosen as it makes the jitter scale with anyone who 
changes du periods. If it's a set number then someone with a refresh period of 
days won't get any benefit from the jitter.

Hmm.  It seems like a fixed amount of jitter still provides a benefit, even to 
someone with a longer refresh interval.  Let's say my refresh period is 7 days. 
 At the end of that, I would still appreciate having my DU processes launch at 
slightly different times on the 7th day, rather than all launching at once.

My concern with varying based on a percentage is that there will be enormous 
variations in how long different volumes go between DU operations, when longer 
refresh intervals are in use.  Like if I have a 7 day period and one volume 
refreshes after 3.5 days, and the other waits for the full 7 days, that's quite 
a variation.  Similarly, if our period is short -- like 1 hour-- having some 
datanodes refresh after only 30 minutes seems unwelcome.  That's why I 
suggested a fixed jitter amount, to be configured by the sysadmin.

I don't feel very strongly about this, though, so if you want to make it 
percentage-based, that's fine too.  As long as its configurable and the 
defaults are reasonable.

> Add jitter to CachingGetSpaceUsed's thread
> --
>
> Key: HADOOP-12975
> URL: https://issues.apache.org/jira/browse/HADOOP-12975
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 2.9.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HADOOP-12975v0.patch, HADOOP-12975v1.patch, 
> HADOOP-12975v2.patch
>
>
> Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike. We should add some 
> jitter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (HADOOP-12975) Add jitter to CachingGetSpaceUsed's thread

2016-04-14 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241805#comment-15241805
 ] 

Colin Patrick McCabe edited comment on HADOOP-12975 at 4/14/16 8:07 PM:


bq. But a percentage is chosen as it makes the jitter scale with anyone who 
changes du periods. If it's a set number then someone with a refresh period of 
days won't get any benefit from the jitter.

Hmm.  It seems like a fixed amount of jitter still provides a benefit, even to 
someone with a longer refresh interval.  Let's say my refresh period is 7 days. 
 At the end of that, I would still appreciate having my DU processes launch at 
slightly different times on the 7th day, rather than all launching at once.

My concern with varying based on a percentage is that there will be enormous 
variations in how long different volumes go between DU operations, when longer 
refresh intervals are in use.  Like if I have a 7 day period and one volume 
refreshes after 3.5 days, and the other waits for the full 7 days, that's quite 
a variation.  Similarly, if our period is short -- like 1 hour-- having some 
datanodes refresh after only 30 minutes seems unwelcome.  That's why I 
suggested a fixed jitter amount, to be configured by the sysadmin.

I don't feel very strongly about this, though, so if you want to make it 
percentage-based, that's fine too.  As long as its configurable and the 
defaults are reasonable.


was (Author: cmccabe):
bq. But a percentage is chosen as it makes the jitter scale with anyone who 
changes du periods. If it's a set number then someone with a refresh period of 
days won't get any benefit from the jitter.

Hmm.  It seems like a fixed amount of jitter still provides a benefit, even to 
someone with a longer refresh interval.  Let's say my refresh period is 7 days. 
 At the end of that, I would still appreciate having my DU processes launch at 
slightly different times on the 7th day, rather than all launching at once.

My concern with varying based on a percentage is that there will be enormous 
variations in how long different volumes go between DU operations, when longer 
refresh intervals are in use.  Like if I have a 7 day period and one volume 
refreshes after 3.5 days, and the other ways for the full 7 days, that's quite 
a variation.  Similarly, if our period is short -- like 1 hour-- having some 
datanodes refresh after only 30 minutes seems unwelcome.  That's why I 
suggested a fixed jitter amount, to be configured by the sysadmin.

I don't feel very strongly about this, though, so if you want to make it 
percentage-based, that's fine too.  As long as its configurable and the 
defaults are reasonable.

> Add jitter to CachingGetSpaceUsed's thread
> --
>
> Key: HADOOP-12975
> URL: https://issues.apache.org/jira/browse/HADOOP-12975
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 2.9.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HADOOP-12975v0.patch, HADOOP-12975v1.patch, 
> HADOOP-12975v2.patch
>
>
> Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike. We should add some 
> jitter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12975) Add jitter to CachingGetSpaceUsed's thread

2016-04-14 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241805#comment-15241805
 ] 

Colin Patrick McCabe commented on HADOOP-12975:
---

bq. But a percentage is chosen as it makes the jitter scale with anyone who 
changes du periods. If it's a set number then someone with a refresh period of 
days won't get any benefit from the jitter.

Hmm.  It seems like a fixed amount of jitter still provides a benefit, even to 
someone with a longer refresh interval.  Let's say my refresh period is 7 days. 
 At the end of that, I would still appreciate having my DU processes launch at 
slightly different times on the 7th day, rather than all launching at once.

My concern with varying based on a percentage is that there will be enormous 
variations in how long different volumes go between DU operations, when longer 
refresh intervals are in use.  Like if I have a 7 day period and one volume 
refreshes after 3.5 days, and the other ways for the full 7 days, that's quite 
a variation.  Similarly, if our period is short -- like 1 hour-- having some 
datanodes refresh after only 30 minutes seems unwelcome.  That's why I 
suggested a fixed jitter amount, to be configured by the sysadmin.

I don't feel very strongly about this, though, so if you want to make it 
percentage-based, that's fine too.  As long as its configurable and the 
defaults are reasonable.

> Add jitter to CachingGetSpaceUsed's thread
> --
>
> Key: HADOOP-12975
> URL: https://issues.apache.org/jira/browse/HADOOP-12975
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 2.9.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HADOOP-12975v0.patch, HADOOP-12975v1.patch, 
> HADOOP-12975v2.patch
>
>
> Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike. We should add some 
> jitter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-13026) Should not wrap SocketTimeoutException into a AuthenticationException in KerberosAuthenticator

2016-04-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-13026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241762#comment-15241762
 ] 

Hadoop QA commented on HADOOP-13026:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
40s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 3s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 45s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 20s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 42s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 42s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 38s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 38s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 14s 
{color} | {color:red} hadoop-common-project/hadoop-auth: patch generated 1 new 
+ 28 unchanged - 0 fixed = 29 total (was 28) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 37s 
{color} | {color:green} hadoop-auth in the patch passed with JDK v1.8.0_77. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 0s 
{color} | {color:green} hadoop-auth in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
25s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 44m 51s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12798782/HADOOP-13026.1.patch |
| JIRA Issue | HADOOP-13026 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 97a15d49473e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
|

[jira] [Commented] (HADOOP-12975) Add jitter to CachingGetSpaceUsed's thread

2016-04-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241754#comment-15241754
 ] 

Hadoop QA commented on HADOOP-12975:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 
12s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
56s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 46s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 12s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 12s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 32s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 12s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 12m 23s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 39s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
22s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 86m 9s {color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | hadoop.fs.shell.find.TestPrint |
|   | hadoop.fs.shell.find.TestPrint0 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12798774/HADOOP-12975v2.patch |
| JIRA Issue | HADOOP-12975 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 2ab84282d052 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014

[jira] [Commented] (HADOOP-12811) Change kms server port number which conflicts with HMaster port number

2016-04-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241739#comment-15241739
 ] 

Hudson commented on HADOOP-12811:
-

FAILURE: Integrated in Hadoop-trunk-Commit #9614 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9614/])
HADOOP-12811. Change kms server port number which conflicts with HMaster (wang: 
rev a74580a4d3039ff95e7744f1d7a386b2bc7a7484)
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/key/kms/TestLoadBalancingKMSClientProvider.java
* hadoop-common-project/hadoop-kms/src/main/libexec/kms-config.sh
* hadoop-common-project/hadoop-kms/src/site/markdown/index.md.vm
* hadoop-common-project/hadoop-kms/src/main/conf/kms-env.sh


> Change kms server port number which conflicts with HMaster port number
> --
>
> Key: HADOOP-12811
> URL: https://issues.apache.org/jira/browse/HADOOP-12811
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.7.2
>Reporter: Yufeng Jiang
>Assignee: Xiao Chen
>Priority: Critical
>  Labels: incompatible, patch
> Fix For: 3.0.0
>
> Attachments: HADOOP-12811.01.patch
>
>
> The HBase's HMaster port number conflicts with Hadoop kms port number. Both 
> uses 16000.
> There might be use cases user need kms and HBase present on the same cluster. 
> The HBase is able to encrypt its HFiles but user might need KMS to encrypt 
> other HDFS directories.
> Users would have to manually override the default port of either application 
> on their cluster. It would be nice to have different default ports so kms and 
> HBase could naturally coexist. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12811) Change kms server port number which conflicts with HMaster port number

2016-04-14 Thread Xiao Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241741#comment-15241741
 ] 

Xiao Chen commented on HADOOP-12811:


Thanks a lot [~andrew.wang]!
I updated the release note, please let me know if you have any suggestions.

> Change kms server port number which conflicts with HMaster port number
> --
>
> Key: HADOOP-12811
> URL: https://issues.apache.org/jira/browse/HADOOP-12811
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.7.2
>Reporter: Yufeng Jiang
>Assignee: Xiao Chen
>Priority: Critical
>  Labels: incompatible, patch
> Fix For: 3.0.0
>
> Attachments: HADOOP-12811.01.patch
>
>
> The HBase's HMaster port number conflicts with Hadoop kms port number. Both 
> uses 16000.
> There might be use cases user need kms and HBase present on the same cluster. 
> The HBase is able to encrypt its HFiles but user might need KMS to encrypt 
> other HDFS directories.
> Users would have to manually override the default port of either application 
> on their cluster. It would be nice to have different default ports so kms and 
> HBase could naturally coexist. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12811) Change kms server port number which conflicts with HMaster port number

2016-04-14 Thread Xiao Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-12811:
---
Release Note: The default port for KMS service is now 9600. This is to 
avoid conflicts on the previous port 16000, which is also used by HMaster as 
the default port.

> Change kms server port number which conflicts with HMaster port number
> --
>
> Key: HADOOP-12811
> URL: https://issues.apache.org/jira/browse/HADOOP-12811
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.7.2
>Reporter: Yufeng Jiang
>Assignee: Xiao Chen
>Priority: Critical
>  Labels: incompatible, patch
> Fix For: 3.0.0
>
> Attachments: HADOOP-12811.01.patch
>
>
> The HBase's HMaster port number conflicts with Hadoop kms port number. Both 
> uses 16000.
> There might be use cases user need kms and HBase present on the same cluster. 
> The HBase is able to encrypt its HFiles but user might need KMS to encrypt 
> other HDFS directories.
> Users would have to manually override the default port of either application 
> on their cluster. It would be nice to have different default ports so kms and 
> HBase could naturally coexist. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12974) Create a CachingGetSpaceUsed implementation that uses df

2016-04-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241711#comment-15241711
 ] 

Hadoop QA commented on HADOOP-12974:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 55s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 40s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 37s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 39s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 8s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_77. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 30s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
22s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 58m 17s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12798775/HADOOP-12974v3.patch |
| JIRA Issue | HADOOP-12974 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux d6e2d969ba61 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / c970f1d |
| Default Java | 1.7.0_95 |
| Multi-JDK versions |

[jira] [Updated] (HADOOP-12811) Change kms server port number which conflicts with HMaster port number

2016-04-14 Thread Andrew Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HADOOP-12811:
-
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to trunk, thanks for working on this Xiao!

> Change kms server port number which conflicts with HMaster port number
> --
>
> Key: HADOOP-12811
> URL: https://issues.apache.org/jira/browse/HADOOP-12811
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.7.2
>Reporter: Yufeng Jiang
>Assignee: Xiao Chen
>Priority: Critical
>  Labels: incompatible, patch
> Fix For: 3.0.0
>
> Attachments: HADOOP-12811.01.patch
>
>
> The HBase's HMaster port number conflicts with Hadoop kms port number. Both 
> uses 16000.
> There might be use cases user need kms and HBase present on the same cluster. 
> The HBase is able to encrypt its HFiles but user might need KMS to encrypt 
> other HDFS directories.
> Users would have to manually override the default port of either application 
> on their cluster. It would be nice to have different default ports so kms and 
> HBase could naturally coexist. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12811) Change kms server port number which conflicts with HMaster port number

2016-04-14 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241690#comment-15241690
 ] 

Andrew Wang commented on HADOOP-12811:
--

LGTM +1, will commit shortly.

[~xiaochen] do you mind adding a release note? Good to have for incompatible 
changes.

> Change kms server port number which conflicts with HMaster port number
> --
>
> Key: HADOOP-12811
> URL: https://issues.apache.org/jira/browse/HADOOP-12811
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.7.2
>Reporter: Yufeng Jiang
>Assignee: Xiao Chen
>Priority: Critical
>  Labels: incompatible, patch
> Attachments: HADOOP-12811.01.patch
>
>
> The HBase's HMaster port number conflicts with Hadoop kms port number. Both 
> uses 16000.
> There might be use cases user need kms and HBase present on the same cluster. 
> The HBase is able to encrypt its HFiles but user might need KMS to encrypt 
> other HDFS directories.
> Users would have to manually override the default port of either application 
> on their cluster. It would be nice to have different default ports so kms and 
> HBase could naturally coexist. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12811) Change kms server port number which conflicts with HMaster port number

2016-04-14 Thread Andrew Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HADOOP-12811:
-
Affects Version/s: (was: 2.6.3)
   (was: 2.6.2)
   (was: 2.7.1)
   (was: 2.6.1)
   (was: 2.7.0)

> Change kms server port number which conflicts with HMaster port number
> --
>
> Key: HADOOP-12811
> URL: https://issues.apache.org/jira/browse/HADOOP-12811
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.7.2
>Reporter: Yufeng Jiang
>Assignee: Xiao Chen
>  Labels: incompatible, patch
> Attachments: HADOOP-12811.01.patch
>
>
> The HBase's HMaster port number conflicts with Hadoop kms port number. Both 
> uses 16000.
> There might be use cases user need kms and HBase present on the same cluster. 
> The HBase is able to encrypt its HFiles but user might need KMS to encrypt 
> other HDFS directories.
> Users would have to manually override the default port of either application 
> on their cluster. It would be nice to have different default ports so kms and 
> HBase could naturally coexist. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12811) Change kms server port number which conflicts with HMaster port number

2016-04-14 Thread Andrew Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HADOOP-12811:
-
Target Version/s: 3.0.0
Priority: Critical  (was: Major)

> Change kms server port number which conflicts with HMaster port number
> --
>
> Key: HADOOP-12811
> URL: https://issues.apache.org/jira/browse/HADOOP-12811
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.7.2
>Reporter: Yufeng Jiang
>Assignee: Xiao Chen
>Priority: Critical
>  Labels: incompatible, patch
> Attachments: HADOOP-12811.01.patch
>
>
> The HBase's HMaster port number conflicts with Hadoop kms port number. Both 
> uses 16000.
> There might be use cases user need kms and HBase present on the same cluster. 
> The HBase is able to encrypt its HFiles but user might need KMS to encrypt 
> other HDFS directories.
> Users would have to manually override the default port of either application 
> on their cluster. It would be nice to have different default ports so kms and 
> HBase could naturally coexist. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-13026) Should not wrap SocketTimeoutException into a AuthenticationException in KerberosAuthenticator

2016-04-14 Thread Xuan Gong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-13026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated HADOOP-13026:
---
Status: Patch Available  (was: Open)

> Should not wrap SocketTimeoutException into a AuthenticationException in 
> KerberosAuthenticator
> --
>
> Key: HADOOP-13026
> URL: https://issues.apache.org/jira/browse/HADOOP-13026
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: HADOOP-13026.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (HADOOP-13010) Refactor raw erasure coders

2016-04-14 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241679#comment-15241679
 ] 

Colin Patrick McCabe edited comment on HADOOP-13010 at 4/14/16 6:29 PM:


Thanks for this, [~drankye].  Looks good overall!

I like the idea of moving some of the utility stuff into {{CoderUtil.java}}.

{code}
  static byte[] getEmptyChunk(int leastLength) {
if (emptyChunk.length >= leastLength) {
  return emptyChunk; // In most time
}
synchronized (AbstractRawErasureCoder.class) {
  emptyChunk = new byte[leastLength];
}
return emptyChunk;
  }
{code}
This isn't safe for multiple threads, since we could be reading 
{{CoderUtil#emptyChunk}} while it's in the middle of being written.  You must 
either make this {{volatile}} or hold the lock for this entire function.

It's unfortunate that we need a function like this-- I was hoping that there 
would be some more efficient way of zeroing a ByteBuffer.  One thing that's a 
little concerning here is that a caller could modify the array returned by 
{{getEmptyChunk}}, which would cause problems for other callers.  To avoid 
this, it's probably better to make this {{private}} to {{CoderUtil.java}}.

{code}
  static ByteBuffer convertInputBuffer(byte[] input, int offset, int len) {
{code}
Hmm.  This name seems a bit confusing.  What this function does has nothing to 
do with whether the buffer is for "input" versus "output"-- it's just copying 
data from an array to a {{DirectByteBuffer}}.  It's also not so much a 
"conversion" as a "copy".  Maybe something like {{cloneAsDirectByteBuffer}} 
would be a better name?

{code}
  static  int[] getErasedOrNotToReadIndexes(T[] inputs) {
{code}
Should be named {{getNullIndexes}}?

{code}
  static  T findFirstValidInput(T[] inputs) {
if (inputs.length > 0 && inputs[0] != null) {
  return inputs[0];
}

for (T input : inputs) {
  if (input != null) {
return input;
  }
}
...
{code}
Why do we need the special case for the first element here?

{code}
  static  void makeValidIndexes(T[] inputs, int[] validIndexes) {
{code}
Should be named {{getNonNullIndexes}}?  Also, why does this one take an array 
passed in, whereas {{getNullIndexes}} returns an array?  I also don't see how 
the caller is supposed to know how many of the array slots were used by the 
function.  If the array starts as all zeros, that is identical to the function 
putting a zero in the first element of the array and then returning, right?  
Perhaps we could mandate that the caller set all the array slots to a negative 
value before calling the function, but that seems like an awkward calling 
convention-- and certainly one that should be documented via JavaDoc.

{code}
  @Override
  protected void doDecode(DecoderState decoderState, ByteBuffer[] inputs,
  int[] erasedIndexes, ByteBuffer[] outputs) {
{code}
I'm not sure why we wouldn't just store {{DecoderState}} in the {{Decoder}}?  
These are stateful objects, I assume.

Continuing my comments from earlier:
* {{AbstractRawErasureCoder}} -- why do we need this base class?  Its function 
seems to be just storing configuration values.  Perhaps we'd be better off just 
having an {{ErasureEncodingConfiguration}} class which other objects can own 
(not inherit from).  I think of a configuration as something you *own*, not 
something you *are*, which is why I think composition would make more sense 
here.  Also, is it possible for this to be immutable?  Mutable configuration is 
a huge headache (another reason I dislike {{Configured.java}})
* {{AbstractRawErasureEncoder}} /{{AbstractRawErasureDecoder}} -- why are these 
classes separate from {{RawErasureEncoder}} / {{RawErasureDecoder}}?  Do we 
expect that any encoders will implement {{RawErasureEncoder}}, but not extend 
{{AbstractRawErasureEncoder}}?  If not, it would be better just to have two 
base classes here rather than 2 classes and 2 interfaces.  Base classes are 
also easier to extend in the future than interfaces because you can add new 
methods without breaking backwards compatibility (as long as you have a default 
in the base).
* {{DummyRawDecoder}} -- {{NoOpRawDecoder}} would be a better name than 
"Dummy".  Is this intended to be used just in unit tests, or is it something 
the end-user should be able to configure?  If it is just unit tests, it should 
be under a {{test}} path, rather than a {{main}} path... i.e. 
{{hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/erasurecode/rawcoder/DummyRawDecoder.java}}


was (Author: cmccabe):
Thanks for this, [~drankye].  Looks good overall!

I like the idea of moving some of the utility stuff into {{CoderUtil.java}}.

{code}
  static byte[] getEmptyChunk(int leastLength) {
if (emptyChunk.length >= leastLength) {
  return emptyChunk; // In most time
}

[jira] [Commented] (HADOOP-13010) Refactor raw erasure coders

2016-04-14 Thread Colin Patrick McCabe (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241679#comment-15241679
 ] 

Colin Patrick McCabe commented on HADOOP-13010:
---

Thanks for this, [~drankye].  Looks good overall!

I like the idea of moving some of the utility stuff into {{CoderUtil.java}}.

{code}
  static byte[] getEmptyChunk(int leastLength) {
if (emptyChunk.length >= leastLength) {
  return emptyChunk; // In most time
}
synchronized (AbstractRawErasureCoder.class) {
  emptyChunk = new byte[leastLength];
}
return emptyChunk;
  }
{code}
This isn't safe for multiple threads, since we could be reading 
{{CoderUtil#emptyChunk}} while it's in the middle of being written.  You must 
either make this {{volatile}} or hold the lock for this entire function.

It's unfortunate that we need a function like this-- I was hoping that there 
would be some more efficient way of zeroing a ByteBuffer.  One thing that's a 
little concerning here is that a caller could modify the array returned by 
{{getEmptyChunk}}, which would cause problems for other callers.  To avoid 
this, it's probably better to make this {{private}} to {{CoderUtil.java}}.

{code}
  static ByteBuffer convertInputBuffer(byte[] input, int offset, int len) {
{code}
Hmm.  This name seems a bit confusing.  What this function does has nothing to 
do with whether the buffer is for "input" versus "output"-- it's just copying 
data from an array to a {{DirectByteBuffer}}.  It's also not so much a 
"conversion" as a "copy".  Maybe something like {{cloneAsDirectByteBuffer}} 
would be a better name?

{code}
  static  int[] getErasedOrNotToReadIndexes(T[] inputs) {
{code}
Should be named {{getNullIndexes}}?

{code}
  static  T findFirstValidInput(T[] inputs) {
if (inputs.length > 0 && inputs[0] != null) {
  return inputs[0];
}

for (T input : inputs) {
  if (input != null) {
return input;
  }
}
...
{code}
Why do we need the special case for the first element here?

{code}
  static  void makeValidIndexes(T[] inputs, int[] validIndexes) {
{code}
Should be named {{getNonNullIndexes}}?  Also, why does this one take an array 
passed in, whereas {{getNullIndexes}} returns an array?  I also don't see how 
the caller is supposed to know how many of the array slots were used by the 
function.  If the array starts as all zeros, that is identical to the function 
putting a zero in the first element of the array and then returning, right?  
Perhaps we could mandate that the caller set all the array slots to a negative 
value before calling the function, but that seems like an awkward calling 
convention-- and certainly one that should be documented via JavaDoc.

{code}
  @Override
  protected void doDecode(DecoderState decoderState, ByteBuffer[] inputs,
  int[] erasedIndexes, ByteBuffer[] outputs) {
{code}
I'm not sure why we wouldn't just store {{DecoderState}} in the {{Decoder}}?  
These are stateful objects, I assume.

Continuing my comments from earlier:
* {{AbstractRawErasureCoder}} -- why do we need this base class?  Its function 
seems to be just storing configuration values.  Perhaps we'd be better off just 
having an {{ErasureEncodingConfiguration}} class which other objects can own 
(not inherit from).  I think of a configuration as something you *own*, not 
something you *are*, which is why I think composition would make more sense 
here.  Also, is it possible for this to be immutable?  Mutable configuration is 
a huge headache (another reason I dislike {{Configured.java}})
* {{AbstractRawErasure{En,De}coder}} -- why are these classes separate from 
{{RawErasureEncoder}} / {{RawErasureDecoder}}?  Do we expect that any encoders 
will implement {{RawErasureEncoder}}, but not extend 
{{AbstractRawErasureEncoder}}?  If not, it would be better just to have two 
base classes here rather than 2 classes and 2 interfaces.  Base classes are 
also easier to extend in the future than interfaces because you can add new 
methods without breaking backwards compatibility (as long as you have a default 
in the base).
* {{DummyRawDecoder}} -- {{NoOpRawDecoder}} would be a better name than 
"Dummy".  Is this intended to be used just in unit tests, or is it something 
the end-user should be able to configure?  If it is just unit tests, it should 
be under a {{test}} path, rather than a {{main}} path... i.e. 
{{hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/erasurecode/rawcoder/DummyRawDecoder.java}}

> Refactor raw erasure coders
> ---
>
> Key: HADOOP-13010
> URL: https://issues.apache.org/jira/browse/HADOOP-13010
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: 3.0.0
>
> Attachments: HADOOP-13010-v1.patch,

[jira] [Updated] (HADOOP-13026) Should not wrap SocketTimeoutException into a AuthenticationException in KerberosAuthenticator

2016-04-14 Thread Xuan Gong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-13026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated HADOOP-13026:
---
Attachment: HADOOP-13026.1.patch

> Should not wrap SocketTimeoutException into a AuthenticationException in 
> KerberosAuthenticator
> --
>
> Key: HADOOP-13026
> URL: https://issues.apache.org/jira/browse/HADOOP-13026
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: HADOOP-13026.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (HADOOP-13026) Should not wrap SocketTimeoutException into a AuthenticationException in KerberosAuthenticator

2016-04-14 Thread Xuan Gong (JIRA)

Xuan Gong created HADOOP-13026:
--

 Summary: Should not wrap SocketTimeoutException into a 
AuthenticationException in KerberosAuthenticator
 Key: HADOOP-13026
 URL: https://issues.apache.org/jira/browse/HADOOP-13026
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Xuan Gong
Assignee: Xuan Gong






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12974) Create a CachingGetSpaceUsed implementation that uses df

2016-04-14 Thread Elliott Clark (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HADOOP-12974:
---
Attachment: HADOOP-12974v3.patch

Fixed the documentation to correctly describe how to use the class.

> Create a CachingGetSpaceUsed implementation that uses df
> 
>
> Key: HADOOP-12974
> URL: https://issues.apache.org/jira/browse/HADOOP-12974
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 2.9.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HADOOP-12974v0.patch, HADOOP-12974v1.patch, 
> HADOOP-12974v2.patch, HADOOP-12974v3.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12975) Add jitter to CachingGetSpaceUsed's thread

2016-04-14 Thread Elliott Clark (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HADOOP-12975:
---
Affects Version/s: 2.9.0
   Status: Patch Available  (was: Open)

> Add jitter to CachingGetSpaceUsed's thread
> --
>
> Key: HADOOP-12975
> URL: https://issues.apache.org/jira/browse/HADOOP-12975
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 2.9.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HADOOP-12975v0.patch, HADOOP-12975v1.patch, 
> HADOOP-12975v2.patch
>
>
> Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike. We should add some 
> jitter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12975) Add jitter to CachingGetSpaceUsed's thread

2016-04-14 Thread Elliott Clark (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HADOOP-12975:
---
Attachment: HADOOP-12975v2.patch

> Add jitter to CachingGetSpaceUsed's thread
> --
>
> Key: HADOOP-12975
> URL: https://issues.apache.org/jira/browse/HADOOP-12975
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 2.9.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HADOOP-12975v0.patch, HADOOP-12975v1.patch, 
> HADOOP-12975v2.patch
>
>
> Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike. We should add some 
> jitter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12841) Update s3-related properties in core-default.xml

2016-04-14 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241539#comment-15241539
 ] 

Wei-Chiu Chuang commented on HADOOP-12841:
--

[~eddyxu] per HADOOP-13025, should we also commit this in 2.7.x and 2.8?

> Update s3-related properties in core-default.xml
> 
>
> Key: HADOOP-12841
> URL: https://issues.apache.org/jira/browse/HADOOP-12841
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
> Fix For: 3.0.0, 2.9.0
>
> Attachments: HADOOP-12841.001.patch
>
>
> HADOOP-11670 deprecated 
> {{fs.s3a.awsAccessKeyId}}/{{fs.s3a.awsSecretAccessKey}} in favor of 
> {{fs.s3a.access.key}}/{{fs.s3a.secret.key}} in the code, but did not update 
> core-default.xml. Also, a few S3 related properties are missing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12975) Add jitter to CachingGetSpaceUsed's thread

2016-04-14 Thread Elliott Clark (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HADOOP-12975:
---
Summary: Add jitter to CachingGetSpaceUsed's thread  (was: Add jitter to 
DU's thread)

> Add jitter to CachingGetSpaceUsed's thread
> --
>
> Key: HADOOP-12975
> URL: https://issues.apache.org/jira/browse/HADOOP-12975
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HADOOP-12975v0.patch, HADOOP-12975v1.patch
>
>
> Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike. We should add some 
> jitter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Resolved] (HADOOP-13025) S3 variables on core-default.xml page are wrong

2016-04-14 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-13025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HADOOP-13025.
--
Resolution: Duplicate

HADOOP-12841 fixed it. I am going to resolve this jira as a dup.

> S3 variables on core-default.xml page are wrong
> ---
>
> Key: HADOOP-13025
> URL: https://issues.apache.org/jira/browse/HADOOP-13025
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Rui Xue
>Priority: Minor
>
> On the core-defaults.xml page (Hadoop 2.7.0 or higher): 
> https://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-common/core-default.xml
> The variables fs.s3a.awsAccessKeyId and fs.s3a.awsSecretAccessKey are wrong, 
> the second part s3a should be s3 only, say
> fs.s3.awsAccessKeyId and fs.s3.awsSecretAccessKey



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12975) Add jitter to DU's thread

2016-04-14 Thread Elliott Clark (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241536#comment-15241536
 ] 

Elliott Clark commented on HADOOP-12975:


bq. What motivates choosing 50% of the full refresh period as the jitter 
default?
50% is probably too high, I'll change it to 15%.

But a percentage is chosen as it makes the jitter scale with anyone who changes 
du periods. If it's a set number then someone with a refresh period of days 
won't get any benefit from the jitter.

> Add jitter to DU's thread
> -
>
> Key: HADOOP-12975
> URL: https://issues.apache.org/jira/browse/HADOOP-12975
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HADOOP-12975v0.patch, HADOOP-12975v1.patch
>
>
> Running DU across lots of disks is very expensive and running all of the 
> processes at the same time creates a noticeable IO spike. We should add some 
> jitter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12989) Some tests in org.apache.hadoop.fs.shell.find occasionally time out

2016-04-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241196#comment-15241196
 ] 

Hadoop QA commented on HADOOP-12989:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 8s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
46s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 34s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 4s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
35s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 48s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 39s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
52s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 59s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_77. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 14s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
21s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 62m 45s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12798724/HADOOP-12989.2.patch |
| JIRA Issue | HADOOP-12989 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux c37099212e26 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / df18b6e9 |
| Default Java | 1.7.0_95 |
| Multi-JDK versions |

[jira] [Commented] (HADOOP-12911) Upgrade Hadoop MiniKDC with Kerby

2016-04-14 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241130#comment-15241130
 ] 

Kai Zheng commented on HADOOP-12911:


Thanks [~ste...@apache.org].
bq. I could never get it to issue tickets for >1 person in the same JVM (though 
that was probably UGI's static initializers there).
It's a good idea to support tickets > 1, I believe Kerby SimpleKDC can do so 
and the new MiniKDC will do. It would be good to add tests to verify this. 
[~jiajia], how would you think of this?
In your previous experience, I guess you're right it's because of UGI. So it's 
good not to couple this with UGI (so inherent the bad), as it is and will be.

bq. This is the evolution of classic MiniKDC: it will have to move on, what 
needs to be done is do it carefully and with users of the module happy.
I agree. Based on the survey, let's see how to make all the clients happy. On 
the other hand, people already suffered with the current, I do wish the 
evolving will make their future life easier. [~jiajia], IIRC, there were some 
issues related to MiniKDC in the current codebase, maybe you would check them 
out and see how this new implementation will solve them or not? Thanks.

> Upgrade Hadoop MiniKDC with Kerby
> -
>
> Key: HADOOP-12911
> URL: https://issues.apache.org/jira/browse/HADOOP-12911
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: test
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HADOOP-12911-v1.patch, HADOOP-12911-v2.patch, 
> HADOOP-12911-v3.patch, HADOOP-12911-v4.patch, HADOOP-12911-v5.patch, 
> HADOOP-12911-v6.patch
>
>
> As discussed in the mailing list, we’d like to introduce Apache Kerby into 
> Hadoop. Initially it’s good to start with upgrading Hadoop MiniKDC with Kerby 
> offerings. Apache Kerby (https://github.com/apache/directory-kerby), as an 
> Apache Directory sub project, is a Java Kerberos binding. It provides a 
> SimpleKDC server that borrowed ideas from MiniKDC and implemented all the 
> facilities existing in MiniKDC. Currently MiniKDC depends on the old Kerberos 
> implementation in Directory Server project, but the implementation is stopped 
> being maintained. Directory community has a plan to replace the 
> implementation using Kerby. MiniKDC can use Kerby SimpleKDC directly to avoid 
> depending on the full of Directory project. Kerby also provides nice identity 
> backends such as the lightweight memory based one and the very simple json 
> one for easy development and test environments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12943) Add -w -r options in dfs -test command

2016-04-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241073#comment-15241073
 ] 

Hadoop QA commented on HADOOP-12943:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 1s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
31s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 37s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 31s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
4s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 47s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 57s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 46s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 56s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 56s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 53s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 53s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
3s {color} | {color:green} root: patch generated 0 new + 8 unchanged - 24 fixed 
= 8 total (was 32) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 46s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 7s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 53s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 9s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_77. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 72m 45s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 58s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 21s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
25s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 213m 5s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
|

[jira] [Updated] (HADOOP-12989) Some tests in org.apache.hadoop.fs.shell.find occasionally time out

2016-04-14 Thread Takashi Ohnishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takashi Ohnishi updated HADOOP-12989:
-
Attachment: HADOOP-12989.2.patch

Attached the updated patch.

> Some tests in org.apache.hadoop.fs.shell.find occasionally time out
> ---
>
> Key: HADOOP-12989
> URL: https://issues.apache.org/jira/browse/HADOOP-12989
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: test
>Reporter: Akira AJISAKA
>Assignee: Takashi Ohnishi
>  Labels: newbie
> Attachments: HADOOP-12989.1.patch, HADOOP-12989.2.patch
>
>
> An example:
> {noformat}
> java.lang.Exception: test timed out after 1000 milliseconds
>   at java.lang.ClassLoader$NativeLibrary.load(Native Method)
>   at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1965)
>   at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1890)
>   at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1872)
>   at java.lang.Runtime.loadLibrary0(Runtime.java:849)
>   at java.lang.System.loadLibrary(System.java:1088)
>   at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:67)
>   at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:47)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.NetworkInterface.(NetworkInterface.java:56)
>   at org.apache.htrace.core.TracerId.getBestIpString(TracerId.java:179)
>   at org.apache.htrace.core.TracerId.processShellVar(TracerId.java:145)
>   at org.apache.htrace.core.TracerId.(TracerId.java:116)
>   at org.apache.htrace.core.Tracer$Builder.build(Tracer.java:159)
>   at org.apache.hadoop.fs.FsTracer.get(FsTracer.java:42)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2794)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2837)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2819)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:381)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:180)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:365)
>   at org.apache.hadoop.fs.shell.PathData.(PathData.java:81)
>   at org.apache.hadoop.fs.shell.find.TestName.applyGlob(TestName.java:74)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12989) Some tests in org.apache.hadoop.fs.shell.find occasionally time out

2016-04-14 Thread Takashi Ohnishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takashi Ohnishi updated HADOOP-12989:
-
Status: Patch Available  (was: Open)

> Some tests in org.apache.hadoop.fs.shell.find occasionally time out
> ---
>
> Key: HADOOP-12989
> URL: https://issues.apache.org/jira/browse/HADOOP-12989
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: test
>Reporter: Akira AJISAKA
>Assignee: Takashi Ohnishi
>  Labels: newbie
> Attachments: HADOOP-12989.1.patch, HADOOP-12989.2.patch
>
>
> An example:
> {noformat}
> java.lang.Exception: test timed out after 1000 milliseconds
>   at java.lang.ClassLoader$NativeLibrary.load(Native Method)
>   at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1965)
>   at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1890)
>   at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1872)
>   at java.lang.Runtime.loadLibrary0(Runtime.java:849)
>   at java.lang.System.loadLibrary(System.java:1088)
>   at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:67)
>   at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:47)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.NetworkInterface.(NetworkInterface.java:56)
>   at org.apache.htrace.core.TracerId.getBestIpString(TracerId.java:179)
>   at org.apache.htrace.core.TracerId.processShellVar(TracerId.java:145)
>   at org.apache.htrace.core.TracerId.(TracerId.java:116)
>   at org.apache.htrace.core.Tracer$Builder.build(Tracer.java:159)
>   at org.apache.hadoop.fs.FsTracer.get(FsTracer.java:42)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2794)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2837)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2819)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:381)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:180)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:365)
>   at org.apache.hadoop.fs.shell.PathData.(PathData.java:81)
>   at org.apache.hadoop.fs.shell.find.TestName.applyGlob(TestName.java:74)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12989) Some tests in org.apache.hadoop.fs.shell.find occasionally time out

2016-04-14 Thread Takashi Ohnishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takashi Ohnishi updated HADOOP-12989:
-
Status: Open  (was: Patch Available)

> Some tests in org.apache.hadoop.fs.shell.find occasionally time out
> ---
>
> Key: HADOOP-12989
> URL: https://issues.apache.org/jira/browse/HADOOP-12989
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: test
>Reporter: Akira AJISAKA
>Assignee: Takashi Ohnishi
>  Labels: newbie
> Attachments: HADOOP-12989.1.patch
>
>
> An example:
> {noformat}
> java.lang.Exception: test timed out after 1000 milliseconds
>   at java.lang.ClassLoader$NativeLibrary.load(Native Method)
>   at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1965)
>   at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1890)
>   at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1872)
>   at java.lang.Runtime.loadLibrary0(Runtime.java:849)
>   at java.lang.System.loadLibrary(System.java:1088)
>   at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:67)
>   at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:47)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.NetworkInterface.(NetworkInterface.java:56)
>   at org.apache.htrace.core.TracerId.getBestIpString(TracerId.java:179)
>   at org.apache.htrace.core.TracerId.processShellVar(TracerId.java:145)
>   at org.apache.htrace.core.TracerId.(TracerId.java:116)
>   at org.apache.htrace.core.Tracer$Builder.build(Tracer.java:159)
>   at org.apache.hadoop.fs.FsTracer.get(FsTracer.java:42)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2794)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2837)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2819)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:381)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:180)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:365)
>   at org.apache.hadoop.fs.shell.PathData.(PathData.java:81)
>   at org.apache.hadoop.fs.shell.find.TestName.applyGlob(TestName.java:74)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-10768) Optimize Hadoop RPC encryption performance

2016-04-14 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241050#comment-15241050
 ] 

Kai Zheng commented on HADOOP-10768:


Thanks [~dian.fu] for working on and attacking this!
I only did a quick look at the work. So far some questions in high level:
* Would you have a design doc that describes the requirement, the approach? I 
understand this was well discussed in the past, but guess a doc like this may 
be good to summarize and bring fresh discussion.
* I guess it's all about and for performance. Do you have any number to share?
* What's the impact? Does it mean to upgrade RPC version? Can external clients 
still be able to talk with the server via SASL? How this affect downstream 
components?
* Looks like the work is mainly in SASL layer, when Kerberos is enabled, will 
it still favor the GSSAPI mechanism? If not or it's bypassed, what encryption 
key is used and how it's obtained?
* The patch looks rather large, the change covering crypto, protocol, sasl rpc 
client and server, data transfer and some misc. Would you break it down? This 
one can be the umbrella.

Thanks again!

> Optimize Hadoop RPC encryption performance
> --
>
> Key: HADOOP-10768
> URL: https://issues.apache.org/jira/browse/HADOOP-10768
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: performance, security
>Affects Versions: 3.0.0
>Reporter: Yi Liu
>Assignee: Dian Fu
> Attachments: HADOOP-10768.001.patch, HADOOP-10768.002.patch
>
>
> Hadoop RPC encryption is enabled by setting {{hadoop.rpc.protection}} to 
> "privacy". It utilized SASL {{GSSAPI}} and {{DIGEST-MD5}} mechanisms for 
> secure authentication and data protection. Even {{GSSAPI}} supports using 
> AES, but without AES-NI support by default, so the encryption is slow and 
> will become bottleneck.
> After discuss with [~atm], [~tucu00] and [~umamaheswararao], we can do the 
> same optimization as in HDFS-6606. Use AES-NI with more than *20x* speedup.
> On the other hand, RPC message is small, but RPC is frequent and there may be 
> lots of RPC calls in one connection, we needs to setup benchmark to see real 
> improvement and then make a trade-off. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12963) Allow using path style addressing for accessing the s3 endpoint

2016-04-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241040#comment-15241040
 ] 

Hudson commented on HADOOP-12963:
-

FAILURE: Integrated in Hadoop-trunk-Commit #9610 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9610/])
HADOOP-12963 Allow using path style addressing for accessing the s3 (stevel: 
rev df18b6e9849c53c51a3d317f1254298edd8b17d1)
* 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
* hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
* 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AConfiguration.java
* hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
* hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java


> Allow using path style addressing for accessing the s3 endpoint
> ---
>
> Key: HADOOP-12963
> URL: https://issues.apache.org/jira/browse/HADOOP-12963
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.7.1
>Reporter: Andrew Baptist
>Assignee: Stephen Montgomery
>Priority: Minor
>  Labels: features
> Attachments: HADOOP-12963-001.patch, HADOOP-12963-002.patch, 
> HADOOP-12963-1.patch, hdfs-8728.patch.2
>
>
> There is no ability to specify using path style access for the s3 endpoint. 
> There are numerous non-amazon implementations of storage that support the 
> amazon API's but only support path style access such as Cleversafe and Ceph. 
> Additionally in many environments it is difficult to configure DNS correctly 
> to get virtual host style addressing to work



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-5470) RunJar.unJar() should write the last modified time found in the jar entry to the uncompressed file

2016-04-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-5470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241032#comment-15241032
 ] 

Hadoop QA commented on HADOOP-5470:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 56s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 47s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
33s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 53s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 53s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 46s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 46s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 20s 
{color} | {color:red} hadoop-common-project/hadoop-common: patch generated 1 
new + 5 unchanged - 0 fixed = 6 total (was 5) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 14s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 8s {color} | 
{color:red} hadoop-common in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
22s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 59m 8s {color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | hadoop.ha.TestZKFailoverController |
| JDK v1.7.0_95 Failed junit tests | hadoop.metrics2.impl.TestGangliaMetrics |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12798710/HADOOP-5470.02.patch |
| JIRA Issue | HADOOP-5470 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 29a87bb79404 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64

[jira] [Commented] (HADOOP-12989) Some tests in org.apache.hadoop.fs.shell.find occasionally time out

2016-04-14 Thread Takashi Ohnishi (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241019#comment-15241019
 ] 

Takashi Ohnishi commented on HADOOP-12989:
--

Thank you [~ajisakaa] for commenting!

{quote}
Would you add global timeout to other tests in the same directory as well?
{quote}

Oops. I wrongly recognized that this is only for TestName from the stack trace.
I will update the patch. :)

> Some tests in org.apache.hadoop.fs.shell.find occasionally time out
> ---
>
> Key: HADOOP-12989
> URL: https://issues.apache.org/jira/browse/HADOOP-12989
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: test
>Reporter: Akira AJISAKA
>Assignee: Takashi Ohnishi
>  Labels: newbie
> Attachments: HADOOP-12989.1.patch
>
>
> An example:
> {noformat}
> java.lang.Exception: test timed out after 1000 milliseconds
>   at java.lang.ClassLoader$NativeLibrary.load(Native Method)
>   at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1965)
>   at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1890)
>   at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1872)
>   at java.lang.Runtime.loadLibrary0(Runtime.java:849)
>   at java.lang.System.loadLibrary(System.java:1088)
>   at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:67)
>   at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:47)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.NetworkInterface.(NetworkInterface.java:56)
>   at org.apache.htrace.core.TracerId.getBestIpString(TracerId.java:179)
>   at org.apache.htrace.core.TracerId.processShellVar(TracerId.java:145)
>   at org.apache.htrace.core.TracerId.(TracerId.java:116)
>   at org.apache.htrace.core.Tracer$Builder.build(Tracer.java:159)
>   at org.apache.hadoop.fs.FsTracer.get(FsTracer.java:42)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2794)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2837)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2819)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:381)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:180)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:365)
>   at org.apache.hadoop.fs.shell.PathData.(PathData.java:81)
>   at org.apache.hadoop.fs.shell.find.TestName.applyGlob(TestName.java:74)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12963) Allow using path style addressing for accessing the s3 endpoint

2016-04-14 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-12963:

Resolution: Fixed
  Assignee: Stephen Montgomery
Status: Resolved  (was: Patch Available)

+1 committed.

new tests passed. Some other S3A tests were unhappy today against trunk, but it 
seems unrelated. Anyone who can: please do a test run of the hadoop-aws module 
today and verify it works for them

> Allow using path style addressing for accessing the s3 endpoint
> ---
>
> Key: HADOOP-12963
> URL: https://issues.apache.org/jira/browse/HADOOP-12963
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.7.1
>Reporter: Andrew Baptist
>Assignee: Stephen Montgomery
>Priority: Minor
>  Labels: features
> Attachments: HADOOP-12963-001.patch, HADOOP-12963-002.patch, 
> HADOOP-12963-1.patch, hdfs-8728.patch.2
>
>
> There is no ability to specify using path style access for the s3 endpoint. 
> There are numerous non-amazon implementations of storage that support the 
> amazon API's but only support path style access such as Cleversafe and Ceph. 
> Additionally in many environments it is difficult to configure DNS correctly 
> to get virtual host style addressing to work



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (HADOOP-13025) S3 variables on core-default.xml page are wrong

2016-04-14 Thread Rui Xue (JIRA)

Rui Xue created HADOOP-13025:


 Summary: S3 variables on core-default.xml page are wrong
 Key: HADOOP-13025
 URL: https://issues.apache.org/jira/browse/HADOOP-13025
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/s3
Affects Versions: 2.7.0
Reporter: Rui Xue
Priority: Minor


On the core-defaults.xml page (Hadoop 2.7.0 or higher): 
https://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-common/core-default.xml

The variables fs.s3a.awsAccessKeyId and fs.s3a.awsSecretAccessKey are wrong, 
the second part s3a should be s3 only, say
fs.s3.awsAccessKeyId and fs.s3.awsSecretAccessKey



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-5470) RunJar.unJar() should write the last modified time found in the jar entry to the uncompressed file

2016-04-14 Thread Andras Bokor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-5470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor updated HADOOP-5470:
-
Attachment: HADOOP-5470.02.patch

Attach [^HADOOP-5470.02.patch] to eliminate Findbugs warning.

> RunJar.unJar() should write the last modified time found in the jar entry to 
> the uncompressed file
> --
>
> Key: HADOOP-5470
> URL: https://issues.apache.org/jira/browse/HADOOP-5470
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: util
>Affects Versions: 0.18.0, 0.18.1, 0.18.2, 0.18.3, 0.19.0, 0.19.1
>Reporter: Colin Evans
>Assignee: Andras Bokor
>Priority: Minor
>  Labels: newbie
> Attachments: HADOOP-5470.01.patch, HADOOP-5470.02.patch
>
>
> For tools like jruby and jython, last modified times determine if a script 
> gets recompiled.  Losing the correct last modified time causes some 
> unfortunate recompilation race conditions when a job is running.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12878) Impersonate hosts in s3a for better data locality handling

2016-04-14 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240971#comment-15240971
 ] 

Steve Loughran commented on HADOOP-12878:
-

I like your idea of picking randomly; if you really wanted to be clever you''d 
use rack topology, but I don't see that would gain much. This is about 
distribution of workload. Provided the returned values were sufficiently 
random, it would even out.

Can I propose the code for this being entirely self contained; something that 
s3a can delegate too. This would allow the test to be isolated in a simple unit 
test, and the code reusable elsewhere

> Impersonate hosts in s3a for better data locality handling
> --
>
> Key: HADOOP-12878
> URL: https://issues.apache.org/jira/browse/HADOOP-12878
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Thomas Demoor
>Assignee: Thomas Demoor
>
> Currently, {{localhost}} is passed as locality for each block, causing all 
> blocks involved in job to initially target the same node (RM), before being 
> moved by the scheduler (to a rack-local node). This reduces parallelism for 
> jobs (with short-lived mappers). 
> We should mimic Azures implementation: a config setting 
> {{fs.s3a.block.location.impersonatedhost}} where the user can enter the list 
> of hostnames in the cluster to return to {{getFileBlockLocations}}. 
> Possible optimization: for larger systems, it might be better to return N 
> (5?) random hostnames to prevent passing a huge array (the downstream code 
> assumes size = O(3)).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-11572) s3a delete() operation fails during a concurrent delete of child entries

2016-04-14 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-11572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240965#comment-15240965
 ] 

Steve Loughran commented on HADOOP-11572:
-

Abhishek: have you had a chance to look at this?

> s3a delete() operation fails during a concurrent delete of child entries
> 
>
> Key: HADOOP-11572
> URL: https://issues.apache.org/jira/browse/HADOOP-11572
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.6.0
>Reporter: Steve Loughran
>
> Reviewing the code, s3a has the problem raised in HADOOP-6688: deletion of a 
> child entry during a recursive directory delete is propagated as an 
> exception, rather than ignored as a detail which idempotent operations should 
> just ignore.
> the exception should be caught and, if a file not found problem, logged 
> rather than propagated



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

S3 related variables in core-site.xml are wrong

2016-04-14 Thread Rui Xue

In Hadoop 2.7.0 core-default page:
https://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-common/core-default.xml

Variables: fs.s3*a*.awsSecretAccessKey and fs.s3*a*.awsAccessKeyId are
wrong. *s3a* should be s3 only. So they should be:
fs.s3.awsSecretAccessKey and fs.s3.awsAccessKeyId

[jira] [Commented] (HADOOP-13020) dfs -ls s3a root should not return error when bucket is empty

2016-04-14 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-13020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240927#comment-15240927
 ] 

Steve Loughran commented on HADOOP-13020:
-

no worries: I've been known to file duplicates of bugs I've already files. 
Better to report than leave out. 

could you do a build of 2.8 and verify the command works there?

> dfs -ls s3a root should not return error when bucket is empty
> -
>
> Key: HADOOP-13020
> URL: https://issues.apache.org/jira/browse/HADOOP-13020
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Trivial
>  Labels: s3
>
> Do not expect {{hdfs dfs -ls}} s3a root to return error "No such file or 
> directory" when the s3 bucket is empty. Expect no error and empty output, 
> just like listing an empty directory.
> {code}
> $ hdfs dfs -ls s3a://jz-hdfs1/
> Found 1 items
> drwxrwxrwx   -  0 1969-12-31 16:00 s3a://jz-hdfs1/tmp
> $ hdfs dfs -rmdir s3a://jz-hdfs1/tmp
> $ hdfs dfs -ls s3a://jz-hdfs1/
> ls: `s3a://jz-hdfs1/': No such file or directory
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12969) Mark IPC.Client and IPC.Server as @Public, @Evolving

2016-04-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240887#comment-15240887
 ] 

Hudson commented on HADOOP-12969:
-

FAILURE: Integrated in Hadoop-trunk-Commit #9609 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9609/])
HADOOP-12969 Mark IPC.Client and IPC.Server as @Public, @Evolving (stevel: rev 
40211d1f0a3e4546eab076e10be8937853490e5e)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java


> Mark IPC.Client and IPC.Server as @Public, @Evolving
> 
>
> Key: HADOOP-12969
> URL: https://issues.apache.org/jira/browse/HADOOP-12969
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HADOOP-12969.000..patch, HADOOP-12969.001.patch, 
> HADOOP-12969.002.patch, HADOOP-12969.003.patch
>
>
> Per the discussion in 
> [HADOOP-12909|https://issues.apache.org/jira/browse/HADOOP-12909?focusedCommentId=15211745=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15211745],
>  this is to propose marking IPC.Client and IPC.Server as @Public, @Evolving 
> as a result of HADOOP-12909



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12969) Mark IPC.Client and IPC.Server as @Public, @Evolving

2016-04-14 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-12969:

   Resolution: Fixed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

> Mark IPC.Client and IPC.Server as @Public, @Evolving
> 
>
> Key: HADOOP-12969
> URL: https://issues.apache.org/jira/browse/HADOOP-12969
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HADOOP-12969.000..patch, HADOOP-12969.001.patch, 
> HADOOP-12969.002.patch, HADOOP-12969.003.patch
>
>
> Per the discussion in 
> [HADOOP-12909|https://issues.apache.org/jira/browse/HADOOP-12909?focusedCommentId=15211745=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15211745],
>  this is to propose marking IPC.Client and IPC.Server as @Public, @Evolving 
> as a result of HADOOP-12909



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12969) Mark IPC.Client and IPC.Server as @Public, @Evolving

2016-04-14 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240863#comment-15240863
 ] 

Steve Loughran commented on HADOOP-12969:
-

+1, committed. Thanks!

> Mark IPC.Client and IPC.Server as @Public, @Evolving
> 
>
> Key: HADOOP-12969
> URL: https://issues.apache.org/jira/browse/HADOOP-12969
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Affects Versions: 2.8.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>Priority: Minor
> Attachments: HADOOP-12969.000..patch, HADOOP-12969.001.patch, 
> HADOOP-12969.002.patch, HADOOP-12969.003.patch
>
>
> Per the discussion in 
> [HADOOP-12909|https://issues.apache.org/jira/browse/HADOOP-12909?focusedCommentId=15211745=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15211745],
>  this is to propose marking IPC.Client and IPC.Server as @Public, @Evolving 
> as a result of HADOOP-12909



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-13022) S3 MD5 check fails on Server Side Encryption with AWS and default key is used

2016-04-14 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-13022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240853#comment-15240853
 ] 

Steve Loughran commented on HADOOP-13022:
-

ok. Can you submit the patch to update the POM for this?

> S3 MD5 check fails on Server Side Encryption with AWS and default key is used
> -
>
> Key: HADOOP-13022
> URL: https://issues.apache.org/jira/browse/HADOOP-13022
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.8.0
>Reporter: Leonardo Contreras
>
> When server side encryption with "aws:kms" value and no custom key is used in 
> S3A Filesystem, the AWSClient fails when verifing Md5:
> {noformat}
> Exception in thread "main" com.amazonaws.AmazonClientException: Unable to 
> verify integrity of data upload.  Client calculated content hash (contentMD5: 
> 1B2M2Y8AsgTpgAmY7PhCfg== in base 64) didn't match hash (etag: 
> c29fcc646e17c348bce9cca8f9d205f5 in hex) calculated by Amazon S3.  You may 
> need to delete the data stored in Amazon S3. (metadata.contentMD5: null, 
> md5DigestStream: 
> com.amazonaws.services.s3.internal.MD5DigestCalculatingInputStream@65d9e72a, 
> bucketName: abuse-messages-nonprod, key: 
> venus/raw_events/checkpoint/825eb6aa-543d-46b1-801f-42de9dbc1610/)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1492)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.createEmptyObject(S3AFileSystem.java:1295)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.createFakeDirectory(S3AFileSystem.java:1272)
>   at org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:969)
>   at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1888)
>   at 
> org.apache.spark.SparkContext$$anonfun$setCheckpointDir$2.apply(SparkContext.scala:2077)
>   at 
> org.apache.spark.SparkContext$$anonfun$setCheckpointDir$2.apply(SparkContext.scala:2074)
>   at scala.Option.map(Option.scala:145)
>   at 
> org.apache.spark.SparkContext.setCheckpointDir(SparkContext.scala:2074)
>   at 
> org.apache.spark.streaming.StreamingContext.checkpoint(StreamingContext.scala:237)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-13022) S3 MD5 check fails on Server Side Encryption with AWS and default key is used

2016-04-14 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-13022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240854#comment-15240854
 ] 

Steve Loughran commented on HADOOP-13022:
-

+ how about a test for this?

> S3 MD5 check fails on Server Side Encryption with AWS and default key is used
> -
>
> Key: HADOOP-13022
> URL: https://issues.apache.org/jira/browse/HADOOP-13022
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.8.0
>Reporter: Leonardo Contreras
>
> When server side encryption with "aws:kms" value and no custom key is used in 
> S3A Filesystem, the AWSClient fails when verifing Md5:
> {noformat}
> Exception in thread "main" com.amazonaws.AmazonClientException: Unable to 
> verify integrity of data upload.  Client calculated content hash (contentMD5: 
> 1B2M2Y8AsgTpgAmY7PhCfg== in base 64) didn't match hash (etag: 
> c29fcc646e17c348bce9cca8f9d205f5 in hex) calculated by Amazon S3.  You may 
> need to delete the data stored in Amazon S3. (metadata.contentMD5: null, 
> md5DigestStream: 
> com.amazonaws.services.s3.internal.MD5DigestCalculatingInputStream@65d9e72a, 
> bucketName: abuse-messages-nonprod, key: 
> venus/raw_events/checkpoint/825eb6aa-543d-46b1-801f-42de9dbc1610/)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1492)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.createEmptyObject(S3AFileSystem.java:1295)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.createFakeDirectory(S3AFileSystem.java:1272)
>   at org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:969)
>   at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1888)
>   at 
> org.apache.spark.SparkContext$$anonfun$setCheckpointDir$2.apply(SparkContext.scala:2077)
>   at 
> org.apache.spark.SparkContext$$anonfun$setCheckpointDir$2.apply(SparkContext.scala:2074)
>   at scala.Option.map(Option.scala:145)
>   at 
> org.apache.spark.SparkContext.setCheckpointDir(SparkContext.scala:2074)
>   at 
> org.apache.spark.streaming.StreamingContext.checkpoint(StreamingContext.scala:237)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-13022) S3 MD5 check fails on Server Side Encryption with AWS and default key is used

2016-04-14 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-13022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-13022:

Component/s: fs/s3

> S3 MD5 check fails on Server Side Encryption with AWS and default key is used
> -
>
> Key: HADOOP-13022
> URL: https://issues.apache.org/jira/browse/HADOOP-13022
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.8.0
>Reporter: Leonardo Contreras
>
> When server side encryption with "aws:kms" value and no custom key is used in 
> S3A Filesystem, the AWSClient fails when verifing Md5:
> {noformat}
> Exception in thread "main" com.amazonaws.AmazonClientException: Unable to 
> verify integrity of data upload.  Client calculated content hash (contentMD5: 
> 1B2M2Y8AsgTpgAmY7PhCfg== in base 64) didn't match hash (etag: 
> c29fcc646e17c348bce9cca8f9d205f5 in hex) calculated by Amazon S3.  You may 
> need to delete the data stored in Amazon S3. (metadata.contentMD5: null, 
> md5DigestStream: 
> com.amazonaws.services.s3.internal.MD5DigestCalculatingInputStream@65d9e72a, 
> bucketName: abuse-messages-nonprod, key: 
> venus/raw_events/checkpoint/825eb6aa-543d-46b1-801f-42de9dbc1610/)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1492)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.createEmptyObject(S3AFileSystem.java:1295)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.createFakeDirectory(S3AFileSystem.java:1272)
>   at org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:969)
>   at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1888)
>   at 
> org.apache.spark.SparkContext$$anonfun$setCheckpointDir$2.apply(SparkContext.scala:2077)
>   at 
> org.apache.spark.SparkContext$$anonfun$setCheckpointDir$2.apply(SparkContext.scala:2074)
>   at scala.Option.map(Option.scala:145)
>   at 
> org.apache.spark.SparkContext.setCheckpointDir(SparkContext.scala:2074)
>   at 
> org.apache.spark.streaming.StreamingContext.checkpoint(StreamingContext.scala:237)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12943) Add -w -r options in dfs -test command

2016-04-14 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HADOOP-12943:
-
Status: Patch Available  (was: In Progress)

Submit v3 patch, resolved a minor checkstyle issue.

> Add -w -r options in dfs -test command
> --
>
> Key: HADOOP-12943
> URL: https://issues.apache.org/jira/browse/HADOOP-12943
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, scripts, tools
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Fix For: 2.8.0
>
> Attachments: HADOOP-12943.001.patch, HADOOP-12943.002.patch, 
> HADOOP-12943.003.patch
>
>
> Currently the dfs -test command only supports 
>   -d, -e, -f, -s, -z
> options. It would be helpful if we add 
>   -w, -r 
> to verify permission of r/w before actual read or write. This will help 
> script programming.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12943) Add -w -r options in dfs -test command

2016-04-14 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HADOOP-12943:
-
Attachment: HADOOP-12943.003.patch

> Add -w -r options in dfs -test command
> --
>
> Key: HADOOP-12943
> URL: https://issues.apache.org/jira/browse/HADOOP-12943
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, scripts, tools
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Fix For: 2.8.0
>
> Attachments: HADOOP-12943.001.patch, HADOOP-12943.002.patch, 
> HADOOP-12943.003.patch
>
>
> Currently the dfs -test command only supports 
>   -d, -e, -f, -s, -z
> options. It would be helpful if we add 
>   -w, -r 
> to verify permission of r/w before actual read or write. This will help 
> script programming.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12943) Add -w -r options in dfs -test command

2016-04-14 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HADOOP-12943:
-
Status: In Progress  (was: Patch Available)

> Add -w -r options in dfs -test command
> --
>
> Key: HADOOP-12943
> URL: https://issues.apache.org/jira/browse/HADOOP-12943
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, scripts, tools
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Fix For: 2.8.0
>
> Attachments: HADOOP-12943.001.patch, HADOOP-12943.002.patch
>
>
> Currently the dfs -test command only supports 
>   -d, -e, -f, -s, -z
> options. It would be helpful if we add 
>   -w, -r 
> to verify permission of r/w before actual read or write. This will help 
> script programming.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12943) Add -w -r options in dfs -test command

2016-04-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240827#comment-15240827
 ] 

Hadoop QA commented on HADOOP-12943:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 8s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 9s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 37s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
3s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 45s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 0s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 47s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 59s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 59s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 45s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 45s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 5s 
{color} | {color:red} root: patch generated 4 new + 8 unchanged - 24 fixed = 12 
total (was 32) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 58s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 10s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 5s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 25s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 40s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 48s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 27s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
26s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 210m 25s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK

[jira] [Commented] (HADOOP-12924) Add default coder key for creating raw coders

2016-04-14 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240774#comment-15240774
 ] 

Kai Zheng commented on HADOOP-12924:


Thanks Rui for the great update and nice tests! Two more comments:
* Maybe we can define RS_DEFAULT_CODEC_NAME and the like in an enum?
* Could we move the static schema definitions like RS_6_3_SCHEMA out of 
{{HdfsConstants}} maybe to {{CodecUtil}}? HdfsConstants isn't a good place for 
such.

> Add default coder key for creating raw coders
> -
>
> Key: HADOOP-12924
> URL: https://issues.apache.org/jira/browse/HADOOP-12924
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HADOOP-12924.1.patch, HADOOP-12924.2.patch, 
> HADOOP-12924.3.patch, HADOOP-12924.4.patch
>
>
> As suggested 
> [here|https://issues.apache.org/jira/browse/HADOOP-12826?focusedCommentId=15194402=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15194402].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12811) Change kms server port number which conflicts with HMaster port number

2016-04-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240753#comment-15240753
 ] 

Hadoop QA commented on HADOOP-12811:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue} 0m 0s 
{color} | {color:blue} Shelldocs was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
53s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 51s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 10s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 33s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
31s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 31s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 10s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 10s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 33s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 33s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green} 0m 
13s {color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 3s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 36s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 58s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_77. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 13s 
{color} | {color:green} hadoop-kms in the patch passed with JDK v1.8.0_77. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 48s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 27s 
{color} | {color:green} hadoop-kms in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} |

[jira] [Commented] (HADOOP-12911) Upgrade Hadoop MiniKDC with Kerby

2016-04-14 Thread Jiajia Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240736#comment-15240736
 ] 

Jiajia Li commented on HADOOP-12911:


Agree, the "test.build.dir" will be better.

> Upgrade Hadoop MiniKDC with Kerby
> -
>
> Key: HADOOP-12911
> URL: https://issues.apache.org/jira/browse/HADOOP-12911
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: test
>Reporter: Jiajia Li
>Assignee: Jiajia Li
> Attachments: HADOOP-12911-v1.patch, HADOOP-12911-v2.patch, 
> HADOOP-12911-v3.patch, HADOOP-12911-v4.patch, HADOOP-12911-v5.patch, 
> HADOOP-12911-v6.patch
>
>
> As discussed in the mailing list, we’d like to introduce Apache Kerby into 
> Hadoop. Initially it’s good to start with upgrading Hadoop MiniKDC with Kerby 
> offerings. Apache Kerby (https://github.com/apache/directory-kerby), as an 
> Apache Directory sub project, is a Java Kerberos binding. It provides a 
> SimpleKDC server that borrowed ideas from MiniKDC and implemented all the 
> facilities existing in MiniKDC. Currently MiniKDC depends on the old Kerberos 
> implementation in Directory Server project, but the implementation is stopped 
> being maintained. Directory community has a plan to replace the 
> implementation using Kerby. MiniKDC can use Kerby SimpleKDC directly to avoid 
> depending on the full of Directory project. Kerby also provides nice identity 
> backends such as the lightweight memory based one and the very simple json 
> one for easy development and test environments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-10768) Optimize Hadoop RPC encryption performance

2016-04-14 Thread Dian Fu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240732#comment-15240732
 ] 

Dian Fu commented on HADOOP-10768:
--

Updated the patch to fix the checkstyle issues. The test failures aren't relate 
to this patch. The failure of TestShortCircuitLocalRead  is cased by 
HADOOP-12994. Other test failures have passed in my local environment.

> Optimize Hadoop RPC encryption performance
> --
>
> Key: HADOOP-10768
> URL: https://issues.apache.org/jira/browse/HADOOP-10768
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: performance, security
>Affects Versions: 3.0.0
>Reporter: Yi Liu
>Assignee: Dian Fu
> Attachments: HADOOP-10768.001.patch, HADOOP-10768.002.patch
>
>
> Hadoop RPC encryption is enabled by setting {{hadoop.rpc.protection}} to 
> "privacy". It utilized SASL {{GSSAPI}} and {{DIGEST-MD5}} mechanisms for 
> secure authentication and data protection. Even {{GSSAPI}} supports using 
> AES, but without AES-NI support by default, so the encryption is slow and 
> will become bottleneck.
> After discuss with [~atm], [~tucu00] and [~umamaheswararao], we can do the 
> same optimization as in HDFS-6606. Use AES-NI with more than *20x* speedup.
> On the other hand, RPC message is small, but RPC is frequent and there may be 
> lots of RPC calls in one connection, we needs to setup benchmark to see real 
> improvement and then make a trade-off. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HADOOP-12994) Specify PositionedReadable, add contract tests, fix problems

2016-04-14 Thread Dian Fu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-12994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240725#comment-15240725
 ] 

Dian Fu commented on HADOOP-12994:
--

Test TestShortCircuitLocalRead fails after this patch is in:
{code}
java.lang.IndexOutOfBoundsException: Requested more bytes than destination 
buffer size
at 
org.apache.hadoop.fs.FSInputStream.validatePositionedReadArgs(FSInputStream.java:107)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:975)
at java.io.DataInputStream.read(DataInputStream.java:149)
at 
org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead.checkFileContent(TestShortCircuitLocalRead.java:157)
at 
org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead.doTestShortCircuitReadImpl(TestShortCircuitLocalRead.java:286)
at 
org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead.doTestShortCircuitRead(TestShortCircuitLocalRead.java:241)
at 
org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead.testSmallFileLocalRead(TestShortCircuitLocalRead.java:308)
{code}

> Specify PositionedReadable, add contract tests, fix problems
> 
>
> Key: HADOOP-12994
> URL: https://issues.apache.org/jira/browse/HADOOP-12994
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Fix For: 2.8.0
>
> Attachments: HADOOP-12994-001.patch, HADOOP-12994-002.patch, 
> HADOOP-12994-003.patch, HADOOP-12994-004.patch, HADOOP-12994-005.patch, 
> HADOOP-12994-006.patch
>
>
> Some work on S3a has shown up that there aren't tests catching regressions in 
> readFully, reviewing the documentation shows that its specification could be 
> improved.
> # review the spec
> # review the implementations
> # add tests (proposed: to the seek contract; streams which support seek 
> should support positioned readable)
> # fix code, where it differs significantly from HDFS or LocalFS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-10768) Optimize Hadoop RPC encryption performance

2016-04-14 Thread Dian Fu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-10768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu updated HADOOP-10768:
-
Attachment: HADOOP-10768.002.patch

> Optimize Hadoop RPC encryption performance
> --
>
> Key: HADOOP-10768
> URL: https://issues.apache.org/jira/browse/HADOOP-10768
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: performance, security
>Affects Versions: 3.0.0
>Reporter: Yi Liu
>Assignee: Dian Fu
> Attachments: HADOOP-10768.001.patch, HADOOP-10768.002.patch
>
>
> Hadoop RPC encryption is enabled by setting {{hadoop.rpc.protection}} to 
> "privacy". It utilized SASL {{GSSAPI}} and {{DIGEST-MD5}} mechanisms for 
> secure authentication and data protection. Even {{GSSAPI}} supports using 
> AES, but without AES-NI support by default, so the encryption is slow and 
> will become bottleneck.
> After discuss with [~atm], [~tucu00] and [~umamaheswararao], we can do the 
> same optimization as in HDFS-6606. Use AES-NI with more than *20x* speedup.
> On the other hand, RPC message is small, but RPC is frequent and there may be 
> lots of RPC calls in one connection, we needs to setup benchmark to see real 
> improvement and then make a trade-off. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HADOOP-12811) Change kms server port number which conflicts with HMaster port number

2016-04-14 Thread Xiao Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-12811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-12811:
---
Status: Patch Available  (was: Open)

> Change kms server port number which conflicts with HMaster port number
> --
>
> Key: HADOOP-12811
> URL: https://issues.apache.org/jira/browse/HADOOP-12811
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.6.3, 2.6.2, 2.7.2, 2.7.1, 2.7.0, 2.6.1
>Reporter: Yufeng Jiang
>Assignee: Xiao Chen
>  Labels: incompatible, patch
> Attachments: HADOOP-12811.01.patch
>
>
> The HBase's HMaster port number conflicts with Hadoop kms port number. Both 
> uses 16000.
> There might be use cases user need kms and HBase present on the same cluster. 
> The HBase is able to encrypt its HFiles but user might need KMS to encrypt 
> other HDFS directories.
> Users would have to manually override the default port of either application 
> on their cluster. It would be nice to have different default ports so kms and 
> HBase could naturally coexist. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

79 matches

Mail list logo