[jira] [Commented] (HADOOP-15335) Support xxxxxxx:xxx/stacks print lock info and more useful attribute of thread info

2018-03-22 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409182#comment-16409182
 ] 

genericqa commented on HADOOP-15335:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 57s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 21m  
6s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 58s{color} | {color:orange} hadoop-common-project/hadoop-common: The patch 
generated 5 new + 72 unchanged - 0 fixed = 77 total (was 72) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 20s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  9m 45s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 99m 28s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.util.TestReadWriteDiskValidator |
|   | hadoop.util.TestBasicDiskValidator |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d4cc50f |
| JIRA Issue | HADOOP-15335 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12915609/HADOOP-15335.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 660153fde6c0 3.13.0-137-generic #186-Ubuntu SMP Mon Dec 4 
19:09:19 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 8d898ab |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14371/artifact/out/diff-checkstyle-hadoop-common-project_hadoop-common.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14371/artifact/out/patch-unit-ha

[jira] [Commented] (HADOOP-15124) Slow FileSystem.Statistics counters implementation

2018-03-22 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409200#comment-16409200
 ] 

Siddharth Seth commented on HADOOP-15124:
-

Adding to [~ste...@apache.org]'s earlier comment about downstream projects 
relying on per-thread statistics: Hive-LLAP does rely on this, since it can end 
up executing different queries in the same process. It tracks per-query 
statistics by pulling from the thread statistics - 
[https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/StatsRecordingThreadPool.java].
cc [~prasanth_j] - the perf improvement here may interest you.
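
For readers who want the shape of that pattern, a minimal sketch, assuming the 
{{FileSystem.Statistics#getThreadStatistics()}} accessor; the class and method 
names below are illustrative, not Hive's actual code:

{code:java}
import org.apache.hadoop.fs.FileSystem;

// Illustrative sketch: sum the calling thread's bytes-read counters
// across all tracked filesystems. Sampling this before and after a
// task lets a shared process attribute IO to individual queries.
final class PerThreadStatsSketch {
  static long threadBytesRead() {
    long total = 0;
    for (FileSystem.Statistics stats : FileSystem.getAllStatistics()) {
      total += stats.getThreadStatistics().getBytesRead();
    }
    return total;
  }
}
{code}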

> Slow FileSystem.Statistics counters implementation
> --
>
> Key: HADOOP-15124
> URL: https://issues.apache.org/jira/browse/HADOOP-15124
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: common
>Affects Versions: 2.9.0, 2.8.3, 2.7.5, 3.0.0
>Reporter: Igor Dvorzhak
>Assignee: Igor Dvorzhak
>Priority: Major
>  Labels: common, filesystem, statistics
> Attachments: HADOOP-15124.001.patch
>
>
> While profiling a 1TB TeraGen job on a Hadoop 2.8.2 cluster (Google Dataproc, 2 
> workers, GCS connector) I saw that the FileSystem.Statistics code paths account 
> for 5.58% of wall time and 26.5% of CPU time of the total execution.
> After switching the FileSystem.Statistics implementation to LongAdder, consumed 
> wall time decreased to 0.006% and CPU time to 0.104% of total execution time.
> Total job runtime decreased from 66 minutes to 61 minutes.
> These results are not conclusive, because I didn't benchmark multiple times to 
> average the results, but regardless of the performance gains, switching to 
> LongAdder simplifies the code and reduces its complexity.
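
As a reference for the proposal above, a minimal sketch of a LongAdder-backed 
counter; the class and field names are illustrative, not the attached patch:

{code:java}
import java.util.concurrent.atomic.LongAdder;

// LongAdder stripes increments across internal cells, so concurrent
// writers rarely contend; sum() aggregates the cells on read. This is
// what makes it attractive for hot, multi-threaded IO counters.
final class CounterSketch {
  private final LongAdder bytesRead = new LongAdder();

  void incrementBytesRead(long newBytes) {
    bytesRead.add(newBytes);   // cheap even under heavy contention
  }

  long getBytesRead() {
    return bytesRead.sum();    // weakly consistent aggregate, not a snapshot
  }
}
{code}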



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-15270) yarn rmadmin -getGroups returns group from which the user has been removed

2018-03-22 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G reassigned HADOOP-15270:


Assignee: Sunil G

> yarn rmadmin -getGroups returns group from which the user has been removed
> --
>
> Key: HADOOP-15270
> URL: https://issues.apache.org/jira/browse/HADOOP-15270
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Sumana Sathish
>Assignee: Sunil G
>Priority: Critical
>
> {code:title=adding group hrt_yarn_rmadmin_test}
> sudo su - -c "groupadd hrt_yarn_rmadmin_test" root
> {code}
> {code:title=adding user hrt_yarn_rmadmin_test to group hrt_yarn_rmadmin_test}
> sudo su - -c "useradd hrt_yarn_rmadmin_test -g hrt_yarn_rmadmin_test" root
> {code}
> {code:title=adding group hrt_yarn_rmadmin_test_group2}
> sudo su - -c "groupadd hrt_yarn_rmadmin_test_group2" root
> {code}
> {code:title=adding user hrt_yarn_rmadmin_test to group hrt_yarn_rmadmin_test_group2}
> sudo su - -c "usermod -a -G hrt_yarn_rmadmin_test_group2 hrt_yarn_rmadmin_test" root
> {code}
> Refresh and getGroups
> {code}
> yarn rmadmin -refreshUserToGroupsMappings
> /usr/hdp/current/hadoop-yarn-client/bin/yarn rmadmin -getGroups 
> hrt_yarn_rmadmin_test
> hrt_yarn_rmadmin_test : hrt_yarn_rmadmin_test hrt_yarn_rmadmin_test_group2
> {code}
> Now delete group hrt_yarn_rmadmin_test_group2 from user hrt_yarn_rmadmin_test, 
> refresh, and run getGroups again. We can still see group 
> hrt_yarn_rmadmin_test_group2.
> {code:title=removing user hrt_yarn_rmadmin_test from group hrt_yarn_rmadmin_test_group2}
> sudo su - -c "gpasswd -d hrt_yarn_rmadmin_test hrt_yarn_rmadmin_test_group2" root
> {code}
> Refresh and getGroups again:
> {code}
> bash-4.2$  /usr/hdp/current/hadoop-yarn-client/bin/yarn rmadmin 
> -refreshUserToGroupsMappings
> /usr/hdp/current/hadoop-yarn-client/bin/yarn rmadmin -getGroups 
> hrt_yarn_rmadmin_test
> hrt_yarn_rmadmin_test : hrt_yarn_rmadmin_test hrt_yarn_rmadmin_test_group2
> {code}






[jira] [Commented] (HADOOP-15270) yarn rmadmin -getGroups returns group from which the user has been removed

2018-03-22 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409366#comment-16409366
 ] 

Sunil G commented on HADOOP-15270:
--

The changes for this issue are in YARN; converting this to a YARN ticket.

> yarn rmadmin -getGroups returns group from which the user has been removed
> --
>
> Key: HADOOP-15270
> URL: https://issues.apache.org/jira/browse/HADOOP-15270
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Sumana Sathish
>Assignee: Sunil G
>Priority: Critical
>






[jira] [Commented] (HADOOP-7195) RawLocalFileSystem.rename() should not try to do copy

2018-03-22 Thread Andras Bokor (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409576#comment-16409576
 ] 

Andras Bokor commented on HADOOP-7195:
--

The fallback logic is there to cover cross-volume renames. Please check 
HADOOP-13082 for the details.
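
To illustrate why the fallback exists: {{java.io.File#renameTo()}} fails across 
volumes, where only copy-plus-delete can succeed. A hedged sketch with 
illustrative names, not RawLocalFileSystem's actual code:

{code:java}
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

final class RenameSketch {
  // Fast same-volume rename, with a copy+delete fallback for
  // cross-volume moves where renameTo() cannot work.
  static boolean rename(File src, File dst) throws IOException {
    if (src.renameTo(dst)) {
      return true;                       // same-volume fast path
    }
    // cross-volume fallback: copy the bytes, then remove the source
    Files.copy(src.toPath(), dst.toPath(), StandardCopyOption.COPY_ATTRIBUTES);
    return src.delete();
  }
}
{code}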

> RawLocalFileSystem.rename() should not try to do copy
> -
>
> Key: HADOOP-7195
> URL: https://issues.apache.org/jira/browse/HADOOP-7195
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 0.20.2, 0.21.0
>Reporter: Kang Xiao
>Priority: Major
> Attachments: HADOOP-7195-v2.patch, HADOOP-7195-v2.patch, 
> HADOOP-7195.patch
>
>
> RawLocalFileSystem.rename() tries to copy the file if the call to java.io.File's 
> rename fails. It's really confusing to do a copy in a rename interface. For 
> example, rename(/a/b/c, /e/f/g) will invoke the copy if /e/f does not exist.






[jira] [Resolved] (HADOOP-7195) RawLocalFileSystem.rename() should not try to do copy

2018-03-22 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor resolved HADOOP-7195.
--
Resolution: Later

> RawLocalFileSystem.rename() should not try to do copy
> -
>
> Key: HADOOP-7195
> URL: https://issues.apache.org/jira/browse/HADOOP-7195
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 0.20.2, 0.21.0
>Reporter: Kang Xiao
>Priority: Major
> Attachments: HADOOP-7195-v2.patch, HADOOP-7195-v2.patch, 
> HADOOP-7195.patch
>
>
> RawLocalFileSystem.rename() tries to copy the file if the call to java.io.File's 
> rename fails. It's really confusing to do a copy in a rename interface. For 
> example, rename(/a/b/c, /e/f/g) will invoke the copy if /e/f does not exist.






[jira] [Resolved] (HADOOP-6672) BytesWritable.write(buf) uses much more CPU in writeInt() than write(buf)

2018-03-22 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor resolved HADOOP-6672.
--
Resolution: Duplicate

> BytesWritable.write(buf) uses much more CPU in writeInt() than write(buf)
> 
>
> Key: HADOOP-6672
> URL: https://issues.apache.org/jira/browse/HADOOP-6672
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: io
>Affects Versions: 0.20.2
>Reporter: Kang Xiao
>Priority: Major
>  Labels: BytesWritable, hadoop, io
> Attachments: BytesWritable.java.patch, screenshot-1.jpg, 
> screenshot-2.jpg
>
>
> BytesWritable.write() uses nearly 4 times as much CPU in writeInt() as in 
> writing the buffer. It may be optimized.
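
For context, a sketch of the serialization pattern at issue; the method shape 
is hypothetical, not the actual BytesWritable code:

{code:java}
import java.io.DataOutput;
import java.io.IOException;

// Sketch of the Writable wire format in question: a 4-byte length
// prefix, then the raw bytes. The report measured writeInt() costing
// ~4x the CPU of the bulk write for small records, since many
// DataOutput implementations emit the int one byte at a time.
final class BytesWriteSketch {
  static void write(DataOutput out, byte[] bytes, int size) throws IOException {
    out.writeInt(size);          // 4 single-byte writes on many streams
    out.write(bytes, 0, size);   // one bulk copy of the payload
  }
}
{code}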






[jira] [Updated] (HADOOP-6897) FileSystem#mkdirs(FileSystem, Path, FsPermission) should not call setPermission if mkdirs failed

2018-03-22 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor updated HADOOP-6897:
-
Status: Patch Available  (was: Open)

> FileSystem#mkdirs(FileSystem, Path, FsPermission) should not call 
> setPermission if mkdirs failed
> -
>
> Key: HADOOP-6897
> URL: https://issues.apache.org/jira/browse/HADOOP-6897
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
>Priority: Major
> Attachments: mkdirs.patch
>
>
> Here is the piece of code that has the bug. fs.setPermission should not be 
> called if result is false.
> {code}
>   public static boolean mkdirs(FileSystem fs, Path dir, FsPermission 
> permission)
>   throws IOException {
> // create the directory using the default permission
> boolean result = fs.mkdirs(dir);
> // set its permission to be the supplied one
> fs.setPermission(dir, permission);
> return result;
>   }
> {code}
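
A minimal sketch of the guarded variant the report asks for, assuming the fix 
is simply to skip {{setPermission()}} when mkdirs fails; the committed patch 
may differ:

{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

final class MkdirsSketch {
  // Only apply the supplied permission when mkdirs actually succeeded.
  static boolean mkdirs(FileSystem fs, Path dir, FsPermission permission)
      throws IOException {
    boolean result = fs.mkdirs(dir);
    if (result) {
      // set its permission to be the supplied one
      fs.setPermission(dir, permission);
    }
    return result;
  }
}
{code}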






[jira] [Commented] (HADOOP-6897) FileSystem#mkdirs(FileSystem, Path, FsPermission) should not call setPermission if mkdirs failed

2018-03-22 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409688#comment-16409688
 ] 

genericqa commented on HADOOP-6897:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  4s{color} 
| {color:red} HADOOP-6897 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HADOOP-6897 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12451172/mkdirs.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14372/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> FileSystem#mkdirs(FileSystem, Path, FsPermission) should not call 
> setPermission if mkdirs failed
> -
>
> Key: HADOOP-6897
> URL: https://issues.apache.org/jira/browse/HADOOP-6897
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
>Priority: Major
> Attachments: mkdirs.patch
>
>






[jira] [Commented] (HADOOP-6897) FileSystem#mkdirs(FileSystem, Path, FsPermission) should not call setPermission if mkdirs failed

2018-03-22 Thread Andras Bokor (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409686#comment-16409686
 ] 

Andras Bokor commented on HADOOP-6897:
--

It's still an issue, and the patch seems valid. We cannot remove the static 
mkdirs since it behaves differently from the member methods: it sets the 
permission without applying the umask.

> FileSystem#mkdirs(FileSystem, Path, FsPermission) should not call 
> setPermission if mkdirs failed
> -
>
> Key: HADOOP-6897
> URL: https://issues.apache.org/jira/browse/HADOOP-6897
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
>Priority: Major
> Attachments: mkdirs.patch
>
>






[jira] [Resolved] (HADOOP-6822) Provide information as to whether or not security is enabled on web interface

2018-03-22 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor resolved HADOOP-6822.
--
Resolution: Invalid

There is no JobTracker anymore, and the new YARN UI is in progress. If this 
feature is required in the new UI, a new ticket should be filed.

> Provide information as to whether or not security is enabled on web interface
> -
>
> Key: HADOOP-6822
> URL: https://issues.apache.org/jira/browse/HADOOP-6822
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Jakob Homan
>Priority: Major
>







[jira] [Commented] (HADOOP-15321) Reduce the RPC Client max retries on timeouts

2018-03-22 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409734#comment-16409734
 ] 

Kihwal Lee commented on HADOOP-15321:
-

This IPC behavior was not introduced in 0.23. If one (painfully) traces through 
the epic three-project split and the re-merge in the SVN days, it becomes 
apparent that the design is much older (pre-0.20). As Hadoop was originally 
designed for batch processing, clients were configured to retry for a long time 
before giving up. Data transfer, by contrast, is supposed to move on to other 
nodes more quickly. So if it was a dfs client, it must have been 
{{getReplicaVisibleLength()}}. Although the IPC behavior was not new, 0.23 was 
the first release in which clients made IPC calls against datanodes.

Prior to HDFS-814, which added {{getReplicaVisibleLength()}}, the dfs client did 
not make any IPC calls against datanodes. I think it broke the 
quick-recovery-for-data-reads design, as IPC connection handling is much more 
conservative, having been designed primarily for the namenode. This change was 
made to branch-0.21 in December 2009, but was not really tested in the field 
until 0.23, which was released two years later. I think we started seeing 
problems after upgrading from 1.x (formerly 0.20.205.x) to 0.23. I do not recall 
specifically, but it seems HDFS-1330 was an attempt to address this.

The IPC behavior against the NN has since changed with the introduction of HA. 
It seems the error handling in client-to-datanode IPC should be made comparable 
to that of data transfer. I thought the default connection timeout was 20 
seconds, but even so it is not desirable to retry 45 times. We need a way to 
configure datanode IPC separately in clients. Perhaps we can simply use the 
parameters for data transfer (block reads) without implicit IPC-level retries. 
{{DFSInputStream}} can retry in the same manner it does for block reads. We just 
need to be careful not to leak objects.
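
Until such a separate datanode setting exists, the only knob is the client-wide 
key cited in the description below; a hedged example of lowering it (the key is 
real, the chosen value is only illustrative):

{code:java}
import org.apache.hadoop.conf.Configuration;

final class RetryConfSketch {
  static Configuration lowRetryConf() {
    Configuration conf = new Configuration();
    // Default is 45; with a 20-second connect timeout that is 15 minutes
    // of retrying a dead datanode before the client gives up.
    conf.setInt("ipc.client.connect.max.retries.on.timeouts", 3);
    return conf;
  }
}
{code}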

> Reduce the RPC Client max retries on timeouts
> -
>
> Key: HADOOP-15321
> URL: https://issues.apache.org/jira/browse/HADOOP-15321
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Major
>
> Currently, the 
> [default|https://github.com/apache/hadoop/blob/branch-3.0.0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java#L379]
> number of retries when the IPC client catches a {{ConnectTimeoutException}} is 
> 45. This seems unreasonably high.
> Given that the IPC client timeout is 60 seconds by default, if a DN host is 
> shut down the client will retry for 45 minutes before aborting. (If the host 
> is up but the process is down, it throws connection refused immediately, 
> which is fine.)
> Creating this Jira to discuss whether we can reduce that to a reasonable 
> number.






[jira] [Commented] (HADOOP-14067) VersionInfo should load version-info.properties from its own classloader

2018-03-22 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409756#comment-16409756
 ] 

Jitendra Nath Pandey commented on HADOOP-14067:
---

+1, Thanks for addressing style/javadoc issues. I will commit shortly.

> VersionInfo should load version-info.properties from its own classloader
> 
>
> Key: HADOOP-14067
> URL: https://issues.apache.org/jira/browse/HADOOP-14067
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.8.3, 3.0.0
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>Priority: Major
> Attachments: HADOOP-14067.01.patch, HADOOP-14067.01.patch, 
> HADOOP-14067.02.patch, HADOOP-14067.03.patch
>
>
> org.apache.hadoop.util.VersionInfo loads the version-info.properties file via 
> the current thread's classloader.
> However, for applications that load Hadoop classes dynamically (e.g. 
> JDBC-based tools such as SQuirreL SQL), the current thread's classloader might 
> not be the one that loaded the Hadoop classes, including VersionInfo, and the 
> lookup would fail to find the properties file.
> The right place to look for the properties file is the classloader of the 
> VersionInfo class itself, as the right version is the one associated with the 
> rest of the loaded Hadoop classes, not necessarily the one visible to the 
> current thread's classloader.
> Created a related jira - HADOOP-14066 - to make the methods to get the version 
> via VersionInfo a public API.
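
A minimal sketch of the lookup change described above, with illustrative names 
rather than the actual VersionInfo code:

{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

final class VersionLookupSketch {
  // Resolve the resource against the classloader that defined this
  // class, not Thread.currentThread().getContextClassLoader(); the
  // context loader may not see the Hadoop jars when they were loaded
  // dynamically (e.g. by a JDBC tool with isolated classloaders).
  static Properties load(String resource) throws IOException {
    Properties props = new Properties();
    try (InputStream in = VersionLookupSketch.class.getClassLoader()
        .getResourceAsStream(resource)) {
      if (in == null) {
        throw new IOException(resource + " not found on classpath");
      }
      props.load(in);
    }
    return props;
  }
}
{code}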






[jira] [Commented] (HADOOP-15334) Upgrade Maven surefire plugin

2018-03-22 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409838#comment-16409838
 ] 

Chris Douglas commented on HADOOP-15334:


+1

> Upgrade Maven surefire plugin
> -
>
> Key: HADOOP-15334
> URL: https://issues.apache.org/jira/browse/HADOOP-15334
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 3.0.0
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
>Priority: Major
> Attachments: HADOOP-15334.01.patch
>
>
> Recent versions of the surefire plugin suppress summary test execution output 
> in quiet mode. This is now fixed in plugin version 2.21.0 (via SUREFIRE-1436).






[jira] [Updated] (HADOOP-15334) Upgrade Maven surefire plugin

2018-03-22 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HADOOP-15334:
---
Resolution: Fixed
Hadoop Flags: Reviewed
Fix Version/s: 3.0.2, 3.2.0, 2.9.1, 2.10.0, 3.1.1
Target Version/s: (was: 3.1.0, 2.10.0)
Status: Resolved (was: Patch Available)

Committed. Thanks for the reviews Chris and Bharat.

> Upgrade Maven surefire plugin
> -
>
> Key: HADOOP-15334
> URL: https://issues.apache.org/jira/browse/HADOOP-15334
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 3.0.0
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
>Priority: Major
> Fix For: 3.1.1, 2.10.0, 2.9.1, 3.2.0, 3.0.2
>
> Attachments: HADOOP-15334.01.patch
>
>
> Recent versions of the surefire plugin suppress summary test execution output 
> in quiet mode. This is now fixed in plugin version 2.21.0 (via SUREFIRE-1436).






[jira] [Commented] (HADOOP-15331) Race condition causing org.apache.hadoop.conf.Configuration: error parsing conf java.io.BufferedInputStream

2018-03-22 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16409996#comment-16409996
 ] 

Yufei Gu commented on HADOOP-15331:
---

The unit test failure is unrelated. +1. Committing.

> Race condition causing org.apache.hadoop.conf.Configuration: error parsing 
> conf java.io.BufferedInputStream
> ---
>
> Key: HADOOP-15331
> URL: https://issues.apache.org/jira/browse/HADOOP-15331
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.9.0, 2.8.3, 2.7.5, 3.0.0, 3.1.0, 2.10.0
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Major
> Attachments: HADOOP-15331.000.patch, HADOOP-15331.001.patch
>
>
> There is a race condition in the way Hadoop handles the Configuration class. 
> The scenario is the following. Let's assume that there are two threads 
> sharing the same Configuration instance. One adds some resources to the 
> configuration, while the other one clones it. Resources are loaded lazily in 
> a deferred call to {{loadResources()}}. If the cloning happens after adding 
> the resources but before parsing them, some temporary resources like input 
> stream pointers are cloned. Eventually both copies will load the input stream 
> resources pointing to the same input streams. One parses the input stream XML 
> and closes it, updating its own copy of the resource. The other one has 
> another pointer to the same input stream. When it tries to load it, it will 
> crash with a stream-closed exception.
> Here is an example unit test:
> {code:java}
> @Test
> public void testResourceRace() {
>   InputStream is =
>   new BufferedInputStream(new ByteArrayInputStream(
>   "".getBytes()));
>   Configuration conf = new Configuration();
>   // Thread 1
>   conf.addResource(is);
>   // Thread 2
>   Configuration confClone = new Configuration(conf);
>   // Thread 2
>   confClone.get("firstParse");
>   // Thread 1
>   conf.get("secondParse");
> }{code}
> Example real world stack traces:
> {code:java}
> 2018-02-28 08:23:19,589 ERROR org.apache.hadoop.conf.Configuration: error 
> parsing conf java.io.BufferedInputStream@7741d346
> com.ctc.wstx.exc.WstxIOException: Stream closed
>   at 
> com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:578)
>   at 
> com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:633)
>   at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2803)
>   at 
> org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2853)
>   at 
> org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2817)
>   at 
> org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2689)
>   at org.apache.hadoop.conf.Configuration.get(Configuration.java:1420)
>   at 
> org.apache.hadoop.security.authorize.ServiceAuthorizationManager.refreshWithLoadedConfiguration(ServiceAuthorizationManager.java:161)
>   at 
> org.apache.hadoop.ipc.Server.refreshServiceAclWithLoadedConfiguration(Server.java:607)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:586)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.startServer(AdminService.java:188)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStart(AdminService.java:165)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>   at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1231)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1421)
> {code}
> Another example:
> {code:java}
> 2018-02-28 08:23:20,702 ERROR org.apache.hadoop.conf.Configuration: error 
> parsing conf java.io.BufferedInputStream@7741d346
> com.ctc.wstx.exc.WstxIOException: Stream closed
>   at 
> com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:578)
>   at 
> com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:633)
>   at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2803)
>   at 
> org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2853)
>   at 
> org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2817)
>   at 
> org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2689)
>   at org.apache.hadoop.conf.Configuration.set(Configuration.java:1326)
>   at org.apache.hadoop.conf

[jira] [Updated] (HADOOP-15331) Fix a race condition causing parsing error of java.io.BufferedInputStream in class org.apache.hadoop.conf.Configuration

2018-03-22 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated HADOOP-15331:
--
Summary: Fix a race condition causing parsing error of 
java.io.BufferedInputStream in class org.apache.hadoop.conf.Configuration  
(was: Race condition causing org.apache.hadoop.conf.Configuration: error 
parsing conf java.io.BufferedInputStream)

> Fix a race condition causing parsing error of java.io.BufferedInputStream in 
> class org.apache.hadoop.conf.Configuration
> ---
>
> Key: HADOOP-15331
> URL: https://issues.apache.org/jira/browse/HADOOP-15331
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.9.0, 2.8.3, 2.7.5, 3.0.0, 3.1.0, 2.10.0
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Major
> Attachments: HADOOP-15331.000.patch, HADOOP-15331.001.patch
>
>

[jira] [Commented] (HADOOP-15331) Fix a race condition causing parsing error of java.io.BufferedInputStream in class org.apache.hadoop.conf.Configuration

2018-03-22 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410016#comment-16410016
 ] 

Yufei Gu commented on HADOOP-15331:
---

Committed to trunk. Thanks [~miklos.szeg...@cloudera.com] for working on this. 
Thanks [~grepas] for the review. 

> Fix a race condition causing parsing error of java.io.BufferedInputStream in 
> class org.apache.hadoop.conf.Configuration
> ---
>
> Key: HADOOP-15331
> URL: https://issues.apache.org/jira/browse/HADOOP-15331
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.9.0, 2.8.3, 2.7.5, 3.0.0, 3.1.0, 2.10.0
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HADOOP-15331.000.patch, HADOOP-15331.001.patch
>
>

[jira] [Updated] (HADOOP-15331) Fix a race condition causing parsing error of java.io.BufferedInputStream in class org.apache.hadoop.conf.Configuration

2018-03-22 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated HADOOP-15331:
--
Resolution: Fixed
Hadoop Flags: Reviewed
Fix Version/s: 3.2.0
Status: Resolved (was: Patch Available)

> Fix a race condition causing parsing error of java.io.BufferedInputStream in 
> class org.apache.hadoop.conf.Configuration
> ---
>
> Key: HADOOP-15331
> URL: https://issues.apache.org/jira/browse/HADOOP-15331
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.9.0, 2.8.3, 2.7.5, 3.0.0, 3.1.0, 2.10.0
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HADOOP-15331.000.patch, HADOOP-15331.001.patch
>
>

[jira] [Commented] (HADOOP-15321) Reduce the RPC Client max retries on timeouts

2018-03-22 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410025#comment-16410025
 ] 

Xiao Chen commented on HADOOP-15321:


Thanks a lot for the history and for tracing it back through SVN, [~kihwal]! A 
great lecture. :) You're also right that the connection timeout default was 20 
seconds.

There are also more things to share from the failure(s) I was seeing, and it's 
actually a mix of things. Apologies, I was confused initially and didn't 
distinguish between the 2 timeouts. The specific error I saw was from Impala, 
but it's really just calling through JNI into the dfs clients.

1. There is the 60-second timeout for the actual read, when setting up the TCP 
connection to the DN. This is okay because the DN will be added to the dead 
nodes and the next try will hit another DN, which should succeed.
{noformat}
W0125 23:37:35.947903 22700 DFSInputStream.java:696] Failed to connect to 
/DN:20003 for block, add to deadNodes and continue. 
org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while 
waiting for channel to be ready for connect. ch : 
java.nio.channels.SocketChannel[connection-pending remote=/DN:20003]
Java exception follows:
org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while 
waiting for channel to be ready for connect. ch : 
java.nio.channels.SocketChannel[connection-pending remote=/DN0:20003]
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533)
        at 
org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3530)
        at 
org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:840)
        at 
org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:755)
        at 
org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:376)
        at 
org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:658)
        at 
org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:895)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:972)
        at 
org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:147)
I0125 23:37:35.953243 22700 DFSInputStream.java:678] Successfully connected to 
/DN:20003 for BP-268139246-NN-1439947563913:blk_1964348624_1191034712
{noformat}
The version we saw did not have HDFS-11993, but looking at the event times and 
log patterns, I think this must be the case.

2. There are also the 45 retries, for which we do not have stack traces.
{noformat}
...
I0125 23:50:06.012015 22689 Client.java:870] Retrying connect to server: 
DATANODE:50020. Already tried 44 time(s); maxRetries=45
{noformat}
The attempts are 20 seconds apart, 45 consecutive retries in all. No stack 
trace or other interesting information was logged because debug wasn't turned on.

Regarding the fix, your advice makes sense to me. To make sure my understanding 
is correct: we can configure the client -> DN IPC to not retry, but do our own 
retries, similar to [the existing way of adding a DN to deadNodes and retrying 
on the next 
DN|https://github.com/apache/hadoop/blob/branch-3.0.0/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java#L598].

If there are no objections, I can give it a shot soon...
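
To make that shape concrete, a hedged sketch of "no IPC-level retries, retry 
across replicas instead"; all names are illustrative, not the actual HDFS 
client code:

{code:java}
import java.io.IOException;
import java.util.List;
import java.util.Set;

// Try each replica once, blacklisting failures, the way DFSInputStream
// already treats failed block reads.
final class ReplicaRetrySketch {
  interface Attempt<N, R> {
    R call(N node) throws IOException;   // a single, non-retrying IPC call
  }

  static <N, R> R firstSuccessful(List<N> replicas, Attempt<N, R> attempt,
      Set<N> deadNodes) throws IOException {
    IOException last = null;
    for (N dn : replicas) {
      if (deadNodes.contains(dn)) {
        continue;                        // already known bad, skip it
      }
      try {
        return attempt.call(dn);
      } catch (IOException e) {
        deadNodes.add(dn);               // avoid this node on later calls
        last = e;
      }
    }
    throw new IOException("all replicas failed", last);
  }
}
{code}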

> Reduce the RPC Client max retries on timeouts
> -
>
> Key: HADOOP-15321
> URL: https://issues.apache.org/jira/browse/HADOOP-15321
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Major
>
> Currently, the 
> [default|https://github.com/apache/hadoop/blob/branch-3.0.0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java#L379]
> number of retries when the IPC client catches a {{ConnectTimeoutException}} is 
> 45. This seems unreasonably high.
> Given that the IPC client timeout is 60 seconds by default, if a DN host is 
> shut down the client will retry for 45 minutes before aborting. (If the host 
> is up but the process is down, it throws connection refused immediately, 
> which is fine.)
> Creating this Jira to discuss whether we can reduce that to a reasonable 
> number.






[jira] [Commented] (HADOOP-15334) Upgrade Maven surefire plugin

2018-03-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410120#comment-16410120
 ] 

Hudson commented on HADOOP-15334:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13868 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13868/])
HADOOP-15334. Upgrade Maven surefire plugin. Contributed by Arpit (arp: rev 
dae5051828a1cf52f8d23a9126775680cd32bade)
* (edit) hadoop-project/pom.xml


> Upgrade Maven surefire plugin
> -
>
> Key: HADOOP-15334
> URL: https://issues.apache.org/jira/browse/HADOOP-15334
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 3.0.0
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
>Priority: Major
> Fix For: 2.10.0, 2.9.1, 3.2.0, 3.0.2, 3.1.1
>
> Attachments: HADOOP-15334.01.patch
>
>
> Recent versions of the surefire plugin suppress summary test execution output 
> in quiet mode. This is now fixed in plugin version 2.21.0 (via SUREFIRE-1436).






[jira] [Commented] (HADOOP-15331) Fix a race condition causing parsing error of java.io.BufferedInputStream in class org.apache.hadoop.conf.Configuration

2018-03-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410121#comment-16410121
 ] 

Hudson commented on HADOOP-15331:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13868 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13868/])
HADOOP-15331. Fix a race condition causing parsing error of (yufei: rev 
268c29a5f541449659f4b4ea1975c6f04c7b6a70)
* (edit) 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java


> Fix a race condition causing parsing error of java.io.BufferedInputStream in 
> class org.apache.hadoop.conf.Configuration
> ---
>
> Key: HADOOP-15331
> URL: https://issues.apache.org/jira/browse/HADOOP-15331
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.9.0, 2.8.3, 2.7.5, 3.0.0, 3.1.0, 2.10.0
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HADOOP-15331.000.patch, HADOOP-15331.001.patch
>
>
> There is a race condition in the way Hadoop handles the Configuration class. 
> The scenario is the following. Let's assume that there are two threads 
> sharing the same Configuration class. One adds some resources to the 
> configuration, while the other one clones it. Resources are loaded lazily in 
> a deferred call to {{loadResources()}}. If the cloning happens after adding 
> the resources but before parsing them, some temporary resources like input 
> stream pointers are cloned. Eventually both copies will load the input stream 
> resources pointing to the same input streams. One parses the input stream XML 
> and closes it updating it's own copy of the resource. The other one has 
> another pointer to the same input stream. When it tries to load it, it will 
> crash with a stream closed exception.
> Here is an example unit test:
> {code:java}
> @Test
> public void testResourceRace() {
>   InputStream is =
>   new BufferedInputStream(new ByteArrayInputStream(
>   "".getBytes()));
>   Configuration conf = new Configuration();
>   // Thread 1
>   conf.addResource(is);
>   // Thread 2
>   Configuration confClone = new Configuration(conf);
>   // Thread 2
>   confClone.get("firstParse");
>   // Thread 1
>   conf.get("secondParse");
> }{code}
> Example real world stack traces:
> {code:java}
> 2018-02-28 08:23:19,589 ERROR org.apache.hadoop.conf.Configuration: error 
> parsing conf java.io.BufferedInputStream@7741d346
> com.ctc.wstx.exc.WstxIOException: Stream closed
>   at 
> com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:578)
>   at 
> com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:633)
>   at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2803)
>   at 
> org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2853)
>   at 
> org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2817)
>   at 
> org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2689)
>   at org.apache.hadoop.conf.Configuration.get(Configuration.java:1420)
>   at 
> org.apache.hadoop.security.authorize.ServiceAuthorizationManager.refreshWithLoadedConfiguration(ServiceAuthorizationManager.java:161)
>   at 
> org.apache.hadoop.ipc.Server.refreshServiceAclWithLoadedConfiguration(Server.java:607)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:586)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.startServer(AdminService.java:188)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStart(AdminService.java:165)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>   at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1231)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1421)
> {code}
> Another example:
> {code:java}
> 2018-02-28 08:23:20,702 ERROR org.apache.hadoop.conf.Configuration: error 
> parsing conf java.io.BufferedInputStream@7741d346
> com.ctc.wstx.exc.WstxIOException: Stream closed
>   at 
> com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:578)
>   at 
> com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:633)
> {code}

[jira] [Updated] (HADOOP-15307) Improve NFS error handling: Unsupported verifier flavorAUTH_SYS

2018-03-22 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HADOOP-15307:
-
Labels: newbie  (was: )

> Improve NFS error handling: Unsupported verifier flavorAUTH_SYS
> ---
>
> Key: HADOOP-15307
> URL: https://issues.apache.org/jira/browse/HADOOP-15307
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: nfs
> Environment: CentOS 7.4, CDH5.13.1, Kerberized Hadoop cluster
>Reporter: Wei-Chiu Chuang
>Priority: Major
>  Labels: newbie
>
> When NFS gateway starts and if the portmapper request is denied by rpcbind 
> for any reason (in our case, /etc/hosts.allow did not have the localhost), 
> NFS gateway fails with the following obscure exception:
> {noformat}
> 2018-03-05 12:49:31,976 INFO org.apache.hadoop.oncrpc.SimpleUdpServer: 
> Started listening to UDP requests at port 4242 for Rpc program: mountd at 
> localhost:4242 with workerCount 1
> 2018-03-05 12:49:31,988 INFO org.apache.hadoop.oncrpc.SimpleTcpServer: 
> Started listening to TCP requests at port 4242 for Rpc program: mountd at 
> localhost:4242 with workerCount 1
> 2018-03-05 12:49:31,993 TRACE org.apache.hadoop.oncrpc.RpcCall: 
> Xid:692394656, messageType:RPC_CALL, rpcVersion:2, program:10, version:2, 
> procedure:1, credential:(AuthFlavor:AUTH_NONE), 
> verifier:(AuthFlavor:AUTH_NONE)
> 2018-03-05 12:49:31,998 FATAL org.apache.hadoop.mount.MountdBase: Failed to 
> start the server. Cause:
> java.lang.UnsupportedOperationException: Unsupported verifier flavorAUTH_SYS
> at 
> org.apache.hadoop.oncrpc.security.Verifier.readFlavorAndVerifier(Verifier.java:45)
> at org.apache.hadoop.oncrpc.RpcDeniedReply.read(RpcDeniedReply.java:50)
> at org.apache.hadoop.oncrpc.RpcReply.read(RpcReply.java:67)
> at org.apache.hadoop.oncrpc.SimpleUdpClient.run(SimpleUdpClient.java:71)
> at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:130)
> at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:101)
> at org.apache.hadoop.mount.MountdBase.start(MountdBase.java:83)
> at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startServiceInternal(Nfs3.java:56)
> at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startService(Nfs3.java:69)
> at 
> org.apache.hadoop.hdfs.nfs.nfs3.PrivilegedNfsGatewayStarter.start(PrivilegedNfsGatewayStarter.java:60)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
> 2018-03-05 12:49:32,007 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1{noformat}
>  Reading the code comment for class Verifier, I think this bug existed since 
> its inception:
> {code:java}
> /**
>  * Base class for verifier. Currently our authentication only supports 3 types
>  * of auth flavors: {@link RpcAuthInfo.AuthFlavor#AUTH_NONE}, {@link 
> RpcAuthInfo.AuthFlavor#AUTH_SYS},
>  * and {@link RpcAuthInfo.AuthFlavor#RPCSEC_GSS}. Thus for verifier we only 
> need to handle
>  * AUTH_NONE and RPCSEC_GSS
>  */
> public abstract class Verifier extends RpcAuthInfo {{code}
> The verifier should handle AUTH_SYS as well.
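A minimal sketch of what handling AUTH_SYS could look like, based on the
{{readFlavorAndVerifier}} frame in the stack trace above. The branch structure
and the reuse of {{VerifierNone}} for the AUTH_SYS flavor are assumptions for
illustration, not the committed fix:

{code:java}
// Hypothetical sketch of Verifier.readFlavorAndVerifier() accepting AUTH_SYS.
// RFC 5531 replies to AUTH_SYS credentials typically carry an empty
// (AUTH_NONE-style) verifier body, so treating AUTH_SYS like AUTH_NONE here
// is one plausible, but unverified, fix.
public static Verifier readFlavorAndVerifier(XDR xdr) {
  AuthFlavor flavor = AuthFlavor.fromValue(xdr.readInt());
  final Verifier verifier;
  if (flavor == AuthFlavor.AUTH_NONE || flavor == AuthFlavor.AUTH_SYS) {
    verifier = new VerifierNone();
  } else if (flavor == AuthFlavor.RPCSEC_GSS) {
    verifier = new VerifierGSS();
  } else {
    // Note the missing space after "flavor" that produced "flavorAUTH_SYS"
    // in the log output quoted above.
    throw new UnsupportedOperationException("Unsupported verifier flavor " + flavor);
  }
  verifier.read(xdr);
  return verifier;
}
{code}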



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15320) Remove customized getFileBlockLocations for hadoop-azure and hadoop-azure-datalake

2018-03-22 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410265#comment-16410265
 ] 

Chris Douglas commented on HADOOP-15320:


bq. Anything else we should run?
As [~ste...@apache.org] suggested, the hadoop-azure and hadoop-azure-datalake 
test suites and contract tests should pass.

> Remove customized getFileBlockLocations for hadoop-azure and 
> hadoop-azure-datalake
> --
>
> Key: HADOOP-15320
> URL: https://issues.apache.org/jira/browse/HADOOP-15320
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/adl, fs/azure
>Affects Versions: 2.7.3, 2.9.0, 3.0.0
>Reporter: shanyu zhao
>Assignee: shanyu zhao
>Priority: Major
> Attachments: HADOOP-15320.patch
>
>
> hadoop-azure and hadoop-azure-datalake have their own implementations of 
> getFileBlockLocations(), which fake a list of artificial blocks based on a 
> hard-coded block size, where each block has one host with the name "localhost". 
> Take a look at this code:
> [https://github.com/apache/hadoop/blob/release-2.9.0-RC3/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java#L3485]
> This is an unnecessary mock-up for a "remote" file system to mimic HDFS. The 
> problem with this mock is that for large (~TB) files we generate lots of 
> artificial blocks, and FileInputFormat.getSplits() is slow at calculating 
> splits based on these blocks.
> We can safely remove this customized getFileBlockLocations() implementation 
> and fall back to the default FileSystem.getFileBlockLocations() implementation, 
> which returns 1 block for any file, with the 1 host "localhost". Note that 
> this doesn't mean we will create far fewer splits, because the number of 
> splits is still limited by the blockSize in 
> FileInputFormat.computeSplitSize():
> {code:java}
> return Math.max(minSize, Math.min(goalSize, blockSize));{code}
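For reference, the default behavior the patch would fall back to is roughly the
following (a paraphrase of {{FileSystem.getFileBlockLocations()}}, not the
exact source; the datanode port string in particular is an assumption):

{code:java}
// Paraphrased default: a single artificial block spanning the whole file,
// hosted on "localhost", instead of many fabricated fixed-size blocks.
public BlockLocation[] getFileBlockLocations(FileStatus file, long start, long len) {
  if (file == null) {
    return null;
  }
  String[] name = { "localhost:50010" }; // port value assumed for illustration
  String[] hosts = { "localhost" };
  return new BlockLocation[] {
      new BlockLocation(name, hosts, 0, file.getLen()) };
}
{code}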



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14067) VersionInfo should load version-info.properties from its own classloader

2018-03-22 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HADOOP-14067:
--
   Resolution: Fixed
Fix Version/s: 3.2.0
   Status: Resolved  (was: Patch Available)

I have committed this to trunk. Thanks [~thejas]!

> VersionInfo should load version-info.properties from its own classloader
> 
>
> Key: HADOOP-14067
> URL: https://issues.apache.org/jira/browse/HADOOP-14067
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.8.3, 3.0.0
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HADOOP-14067.01.patch, HADOOP-14067.01.patch, 
> HADOOP-14067.02.patch, HADOOP-14067.03.patch
>
>
> org.apache.hadoop.util.VersionInfo loads the version-info.properties file via 
> the current thread classloader.
> However, in the case of applications that use hadoop classes dynamically 
> (e.g. jdbc-based tools such as SQuirreL SQL), the current thread might not be 
> the one that loaded the hadoop classes, including VersionInfo, and it would 
> fail to find the properties file.
> The right place to look for the properties file is the classloader of the 
> VersionInfo class, as the right version is the one associated with the rest 
> of the loaded hadoop classes, and not necessarily the one in the current 
> thread's classloader.
> Created a related jira - HADOOP-14066 to make the methods that get the 
> version via VersionInfo a public API.
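The gist of the change, as a hedged sketch; "common-version-info.properties" is
the resource hadoop-common ships, but the demo class and its output are
illustrative only, not the patch itself:

{code:java}
import java.io.InputStream;
import org.apache.hadoop.util.VersionInfo;

public class VersionInfoLookup {
  public static void main(String[] args) {
    // Thread context classloader: whatever loader the calling thread carries;
    // in an embedding app (e.g. a JDBC tool) it may not see the Hadoop jars.
    InputStream viaThread = Thread.currentThread().getContextClassLoader()
        .getResourceAsStream("common-version-info.properties");
    // VersionInfo's own classloader: always the loader that holds the Hadoop
    // classes, so the properties file it finds matches them.
    InputStream viaOwnLoader = VersionInfo.class.getClassLoader()
        .getResourceAsStream("common-version-info.properties");
    System.out.println("thread loader found: " + (viaThread != null)
        + ", VersionInfo loader found: " + (viaOwnLoader != null));
  }
}
{code}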



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-12862) LDAP Group Mapping over SSL can not specify trust store

2018-03-22 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403075#comment-16403075
 ] 

Konstantin Shvachko edited comment on HADOOP-12862 at 3/22/18 9:40 PM:
---

Sounds like testing is a longer-term issue. BTW if I look into Hadoop 
dependencies I see apacheds and ldapsdk. Maybe they would be useful for 
testing; I wouldn't know.
The testing went well in our environment.
What do you think about removing {{.ssl.truststore.password}}? I really think 
people should not use configs for passwords. Typically configs are checked in 
in git repositories, so having passwords there is even worse than printing them 
on a command line, which [as you 
suggested|https://issues.apache.org/jira/browse/HADOOP-15315?focusedCommentId=16399166&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16399166]
 is a bad practice.


was (Author: shv):
Sounds like testing is a longer-term issue. BTW if I look into Hadoop 
dependencies I see apacheds and ldapsdk. May they be useful for testing. I 
wouldn't know.
The testing went well in our environment.
What do you think about removing {{.ssl.truststore.password}}? I really think 
people should not use configs for passwords. Typically configs are checked in 
in git repositories, so having passwords there is even worth than printing them 
on a command line, which [as you 
suggested|https://issues.apache.org/jira/browse/HADOOP-15315?focusedCommentId=16399166&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16399166]
 is a bad practice.

> LDAP Group Mapping over SSL can not specify trust store
> ---
>
> Key: HADOOP-12862
> URL: https://issues.apache.org/jira/browse/HADOOP-12862
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HADOOP-12862.001.patch, HADOOP-12862.002.patch, 
> HADOOP-12862.003.patch, HADOOP-12862.004.patch, HADOOP-12862.005.patch, 
> HADOOP-12862.006.patch, HADOOP-12862.007.patch, HADOOP-12862.008.patch
>
>
> In a secure environment, SSL is used to encrypt LDAP requests for group 
> mapping resolution.
> We (+[~yoderme], +[~tgrayson]) have found that its implementation is strange.
> For information, Hadoop name node, as an LDAP client, talks to a LDAP server 
> to resolve the group mapping of a user. In the case of LDAP over SSL, a 
> typical scenario is to establish one-way authentication (the client verifies 
> the server's certificate is real) by storing the server's certificate in the 
> client's truststore.
> A rarer scenario is to establish two-way authentication: in addition to 
> storing a truststore for the client to verify the server, the server also 
> verifies that the client's certificate is real, and the client stores its own 
> certificate in its keystore.
> However, the current implementation for LDAP over SSL does not seem to be 
> correct, in that it only configures a keystore but no truststore (so the LDAP 
> server can verify Hadoop's certificate, but Hadoop may not be able to verify 
> the LDAP server's certificate).
> I think there should be an extra pair of properties to specify the 
> truststore/password for the LDAP server, and use them to configure the system 
> properties {{javax.net.ssl.trustStore}}/{{javax.net.ssl.trustStorePassword}}.
> I am a security layman so my words can be imprecise. But I hope this makes 
> sense.
> Oracle's SSL LDAP documentation: 
> http://docs.oracle.com/javase/jndi/tutorial/ldap/security/ssl.html
> JSSE reference guide: 
> http://docs.oracle.com/javase/7/docs/technotes/guides/security/jsse/JSSERefGuide.html
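A hedged sketch of the truststore wiring the description asks for. The
{{hadoop.security.group.mapping.ldap.ssl.truststore}} key names below are
assumptions for illustration, not necessarily what the patch used; only the
{{javax.net.ssl.*}} system properties are standard JSSE:

{code:java}
// Illustration only: route a configured truststore pair into JSSE so the LDAP
// client can verify the server's certificate (one-way authentication).
String store = conf.get("hadoop.security.group.mapping.ldap.ssl.truststore");
String pass = conf.get("hadoop.security.group.mapping.ldap.ssl.truststore.password");
if (store != null) {
  System.setProperty("javax.net.ssl.trustStore", store);
}
if (pass != null) {
  System.setProperty("javax.net.ssl.trustStorePassword", pass);
}
{code}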



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14067) VersionInfo should load version-info.properties from its own classloader

2018-03-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410369#comment-16410369
 ] 

Hudson commented on HADOOP-14067:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13869 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13869/])
HADOOP-14067. VersionInfo should load version-info.properties from its 
(jitendra: rev 4bea96f9a84cee89d07dfa97b892f6fb3ed1e125)
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ThreadUtil.java
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/VersionInfo.java


> VersionInfo should load version-info.properties from its own classloader
> 
>
> Key: HADOOP-14067
> URL: https://issues.apache.org/jira/browse/HADOOP-14067
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.8.3, 3.0.0
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HADOOP-14067.01.patch, HADOOP-14067.01.patch, 
> HADOOP-14067.02.patch, HADOOP-14067.03.patch
>
>
> org.apache.hadoop.util.VersionInfo loads the version-info.properties file via 
> the current thread classloader.
> However, in case of applications that are using hadoop classes dynamically  
> (eg jdbc based tools such as SQuirreL SQL) the current thread might not be 
> the one that loaded the hadoop classes including VersionInfo, and it would 
> fail to fine the properties file.
> The right place to look for the properties file is in the classloader of 
> VersionInfo class, as right version is the one that is associated with rest 
> of the loaded hadoop classes,  and not necessarily the one in current thread 
> classloader.
> Created a related jira - HADOOP-14066 to make methods to get version via 
> VersionInfo a public api.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-12862) LDAP Group Mapping over SSL can not specify trust store

2018-03-22 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410484#comment-16410484
 ] 

Konstantin Shvachko commented on HADOOP-12862:
--

So the code snippet you cited above was introduced in HADOOP-10607, which 
targeted exactly that ??"to eliminate the storage of passwords and secrets in 
clear text within configuration files or within code."??
I believe the opportunity to obtain passwords from configs was left there to 
provide backward compatibility.
I think we both agree that storing passwords in config files is a bad idea, no?
So why do we want to keep introducing (optional) password parameters, following 
the wrong pattern?

What you propose with HADOOP-15325 is adding an optional option to ignore an 
optional parameter. Why?
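For context, the alternative introduced by HADOOP-10607 is
{{Configuration.getPassword()}}, which consults any configured credential
providers before (optionally) falling back to the clear-text config entry. A
minimal sketch, assuming a {{Configuration}} object {{conf}} is in scope and
reusing the truststore key from this issue purely as an example name:

{code:java}
// getPassword() first checks providers registered under
// hadoop.security.credential.provider.path, then falls back to the config
// value itself; that clear-text fallback is what is being questioned above.
char[] password = conf.getPassword(
    "hadoop.security.group.mapping.ldap.ssl.truststore.password"); // throws IOException
if (password != null) {
  try {
    // ... use the password ...
  } finally {
    java.util.Arrays.fill(password, '\0'); // scrub the secret after use
  }
}
{code}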

> LDAP Group Mapping over SSL can not specify trust store
> ---
>
> Key: HADOOP-12862
> URL: https://issues.apache.org/jira/browse/HADOOP-12862
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HADOOP-12862.001.patch, HADOOP-12862.002.patch, 
> HADOOP-12862.003.patch, HADOOP-12862.004.patch, HADOOP-12862.005.patch, 
> HADOOP-12862.006.patch, HADOOP-12862.007.patch, HADOOP-12862.008.patch
>
>
> In a secure environment, SSL is used to encrypt LDAP requests for group 
> mapping resolution.
> We (+[~yoderme], +[~tgrayson]) have found that its implementation is strange.
> For information, Hadoop name node, as an LDAP client, talks to a LDAP server 
> to resolve the group mapping of a user. In the case of LDAP over SSL, a 
> typical scenario is to establish one-way authentication (the client verifies 
> the server's certificate is real) by storing the server's certificate in the 
> client's truststore.
> A rarer scenario is to establish two-way authentication: in addition to 
> storing a truststore for the client to verify the server, the server also 
> verifies that the client's certificate is real, and the client stores its own 
> certificate in its keystore.
> However, the current implementation for LDAP over SSL does not seem to be 
> correct, in that it only configures a keystore but no truststore (so the LDAP 
> server can verify Hadoop's certificate, but Hadoop may not be able to verify 
> the LDAP server's certificate).
> I think there should be an extra pair of properties to specify the 
> truststore/password for the LDAP server, and use them to configure the system 
> properties {{javax.net.ssl.trustStore}}/{{javax.net.ssl.trustStorePassword}}.
> I am a security layman so my words can be imprecise. But I hope this makes 
> sense.
> Oracle's SSL LDAP documentation: 
> http://docs.oracle.com/javase/jndi/tutorial/ldap/security/ssl.html
> JSSE reference guide: 
> http://docs.oracle.com/javase/7/docs/technotes/guides/security/jsse/JSSERefGuide.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15336) NPE for FsServerDefaults.getKeyProviderUri() for clientProtocol communication between 2.7 and 3.1

2018-03-22 Thread Sherwood Zheng (JIRA)
Sherwood Zheng created HADOOP-15336:
---

 Summary: NPE for FsServerDefaults.getKeyProviderUri() for 
clientProtocol communication between 2.7 and 3.1
 Key: HADOOP-15336
 URL: https://issues.apache.org/jira/browse/HADOOP-15336
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.1.0, 3.2.0
Reporter: Sherwood Zheng
Assignee: Sherwood Zheng
 Fix For: 3.1.0, 3.2.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15336) NPE for FsServerDefaults.getKeyProviderUri() for clientProtocol communication between 2.7 and 3.1

2018-03-22 Thread Sherwood Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sherwood Zheng updated HADOOP-15336:

Labels: backward-incompatible common  (was: )

> NPE for FsServerDefaults.getKeyProviderUri() for clientProtocol communication 
> between 2.7 and 3.1
> -
>
> Key: HADOOP-15336
> URL: https://issues.apache.org/jira/browse/HADOOP-15336
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.2.0
>Reporter: Sherwood Zheng
>Assignee: Sherwood Zheng
>Priority: Major
>  Labels: backward-incompatible, common
> Fix For: 3.1.0, 3.2.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15337) RawLocalFileSystem file status permissions can avoid shelling out in some cases

2018-03-22 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created HADOOP-15337:


 Summary: RawLocalFileSystem file status permissions can avoid 
shelling out in some cases
 Key: HADOOP-15337
 URL: https://issues.apache.org/jira/browse/HADOOP-15337
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles


While investigating YARN-8054, it was noticed that getting file permissions for 
RawLocalFileSystem can fail when too many files are open. Upon inspection, this 
happens because permissions are obtained by launching a shell program (ls -ld 
on Linux) and parsing the results. Since Java 7, POSIX file systems can 
accurately report file permissions without launching a shell program.
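A sketch of the NIO approach being suggested; the class below is illustrative
only, and the mapping back to Hadoop's FsPermission is not shown:

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFilePermission;
import java.util.Set;

public class PermissionProbe {
  public static void main(String[] args) throws IOException {
    Path path = Paths.get(args[0]);
    // Reads the permission bits in-process: no "ls -ld" child process, hence
    // no extra pipes or file descriptors that can exhaust the open-file limit.
    Set<PosixFilePermission> perms = Files.getPosixFilePermissions(path);
    System.out.println(perms); // e.g. [OWNER_READ, OWNER_WRITE, GROUP_READ]
  }
}
{code}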



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15337) RawLocalFileSystem file status permissions can avoid shelling out in some cases

2018-03-22 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410553#comment-16410553
 ] 

Jonathan Eagles commented on HADOOP-15337:
--

I tested the nio implementation and found that, in addition to not shelling 
out, it was also 5 times faster at getting file permissions on Linux.

> RawLocalFileSystem file status permissions can avoid shelling out in some 
> cases
> ---
>
> Key: HADOOP-15337
> URL: https://issues.apache.org/jira/browse/HADOOP-15337
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
>Priority: Major
>
> While investigating YARN-8054, it was noticed that getting file permissions 
> for RawLocalFileSystem can fail when too many files are open. Upon inspection, 
> this happens because permissions are obtained by launching a shell program 
> (ls -ld on Linux) and parsing the results. Since Java 7, POSIX file systems 
> can accurately report file permissions without launching a shell program.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15336) NPE for FsServerDefaults.getKeyProviderUri() for clientProtocol communication between 2.7 and 3.1

2018-03-22 Thread Sherwood Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sherwood Zheng updated HADOOP-15336:

Fix Version/s: (was: 3.2.0)
   (was: 3.1.0)

> NPE for FsServerDefaults.getKeyProviderUri() for clientProtocol communication 
> between 2.7 and 3.1
> -
>
> Key: HADOOP-15336
> URL: https://issues.apache.org/jira/browse/HADOOP-15336
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.2.0
>Reporter: Sherwood Zheng
>Assignee: Sherwood Zheng
>Priority: Major
>  Labels: backward-incompatible, common
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15336) NPE for FsServerDefaults.getKeyProviderUri() for clientProtocol communication between 2.7 and 3.2

2018-03-22 Thread Sherwood Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sherwood Zheng updated HADOOP-15336:

Summary: NPE for FsServerDefaults.getKeyProviderUri() for clientProtocol 
communication between 2.7 and 3.2  (was: NPE for 
FsServerDefaults.getKeyProviderUri() for clientProtocol communication between 
2.7 and 3.1)

> NPE for FsServerDefaults.getKeyProviderUri() for clientProtocol communication 
> between 2.7 and 3.2
> -
>
> Key: HADOOP-15336
> URL: https://issues.apache.org/jira/browse/HADOOP-15336
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.2.0
>Reporter: Sherwood Zheng
>Assignee: Sherwood Zheng
>Priority: Major
>  Labels: backward-incompatible, common
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15337) RawLocalFileSystem file status permissions can avoid shelling out in some cases

2018-03-22 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410558#comment-16410558
 ] 

Jonathan Eagles commented on HADOOP-15337:
--

Now I see HADOOP-14600 is related, but it is only available in 3.1.0.

> RawLocalFileSystem file status permissions can avoid shelling out in some 
> cases
> ---
>
> Key: HADOOP-15337
> URL: https://issues.apache.org/jira/browse/HADOOP-15337
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
>Priority: Major
>
> While investigating YARN-8054, it was noticed that getting file permissions 
> for RawLocalFileSystem can fail when too many files are open. Upon inspection, 
> this happens because permissions are obtained by launching a shell program 
> (ls -ld on Linux) and parsing the results. Since Java 7, POSIX file systems 
> can accurately report file permissions without launching a shell program.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14759) S3GuardTool prune to prune specific bucket entries

2018-03-22 Thread Gabor Bota (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-14759:

Attachment: HADOOP-14759.001.patch

> S3GuardTool prune to prune specific bucket entries
> --
>
> Key: HADOOP-14759
> URL: https://issues.apache.org/jira/browse/HADOOP-14759
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HADOOP-14759.001.patch
>
>
> Users may think that when you provide a URI to a bucket, you are pruning all 
> entries in the table *for that bucket*. In fact you are purging all entries 
> across all buckets in the table:
> {code}
> hadoop s3guard prune -days 7 s3a://ireland-1
> {code}
> It should be restricted to that bucket, unless you specify otherwise.
> Plus, maybe also add a hard date rather than a relative one.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14759) S3GuardTool prune to prune specific bucket entries

2018-03-22 Thread Gabor Bota (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-14759:

Status: Patch Available  (was: Open)

Test ran on us-west-2 successfully. 

> S3GuardTool prune to prune specific bucket entries
> --
>
> Key: HADOOP-14759
> URL: https://issues.apache.org/jira/browse/HADOOP-14759
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HADOOP-14759.001.patch
>
>
> Users may think that when you provide a URI to a bucket, you are pruning all 
> entries in the table *for that bucket*. In fact you are purging all entries 
> across all buckets in the table:
> {code}
> hadoop s3guard prune -days 7 s3a://ireland-1
> {code}
> It should be restricted to that bucket, unless you specify otherwise.
> Plus, maybe also add a hard date rather than a relative one.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-14759) S3GuardTool prune to prune specific bucket entries

2018-03-22 Thread Gabor Bota (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410821#comment-16410821
 ] 

Gabor Bota edited comment on HADOOP-14759 at 3/23/18 5:53 AM:
--

Tests ran on us-west-2 successfully. 


was (Author: gabor.bota):
Test ran on us-west-2 successfully. 

> S3GuardTool prune to prune specific bucket entries
> --
>
> Key: HADOOP-14759
> URL: https://issues.apache.org/jira/browse/HADOOP-14759
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HADOOP-14759.001.patch
>
>
> Users may think that when you provide a URI to a bucket, you are pruning all 
> entries in the table *for that bucket*. In fact you are purging all entries 
> across all buckets in the table:
> {code}
> hadoop s3guard prune -days 7 s3a://ireland-1
> {code}
> It should be restricted to that bucket, unless you specify otherwise.
> Plus, maybe also add a hard date rather than a relative one.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14445) Delegation tokens are not shared between KMS instances

2018-03-22 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-14445:
---
Attachment: HADOOP-14445.08.patch

> Delegation tokens are not shared between KMS instances
> --
>
> Key: HADOOP-14445
> URL: https://issues.apache.org/jira/browse/HADOOP-14445
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: kms
>Affects Versions: 2.8.0, 3.0.0-alpha1
> Environment: CDH5.7.4, Kerberized, SSL, KMS-HA, at rest encryption
>Reporter: Wei-Chiu Chuang
>Assignee: Xiao Chen
>Priority: Major
> Attachments: HADOOP-14445-branch-2.8.002.patch, 
> HADOOP-14445-branch-2.8.patch, HADOOP-14445.002.patch, 
> HADOOP-14445.003.patch, HADOOP-14445.004.patch, HADOOP-14445.05.patch, 
> HADOOP-14445.06.patch, HADOOP-14445.07.patch, HADOOP-14445.08.patch
>
>
> As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider does 
> not share delegation tokens (a client uses the KMS address/port as the key for 
> the delegation token).
> {code:title=DelegationTokenAuthenticatedURL#openConnection}
> if (!creds.getAllTokens().isEmpty()) {
> InetSocketAddress serviceAddr = new InetSocketAddress(url.getHost(),
> url.getPort());
> Text service = SecurityUtil.buildTokenService(serviceAddr);
> dToken = creds.getToken(service);
> {code}
> But KMS doc states:
> {quote}
> Delegation Tokens
> Similar to HTTP authentication, KMS uses Hadoop Authentication for delegation 
> tokens too.
> Under HA, A KMS instance must verify the delegation token given by another 
> KMS instance, by checking the shared secret used to sign the delegation 
> token. To do this, all KMS instances must be able to retrieve the shared 
> secret from ZooKeeper.
> {quote}
> We should either update the KMS documentation, or fix this code to share 
> delegation tokens.
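The mismatch is easy to demonstrate with {{SecurityUtil.buildTokenService()}},
the call shown in the snippet above; the host names below are hypothetical:

{code:java}
import java.net.InetSocketAddress;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.SecurityUtil;

public class TokenServiceMismatch {
  public static void main(String[] args) {
    // Two KMS instances behind one LoadBalancingKMSClientProvider.
    Text s1 = SecurityUtil.buildTokenService(
        new InetSocketAddress("kms1.example.com", 9600));
    Text s2 = SecurityUtil.buildTokenService(
        new InetSocketAddress("kms2.example.com", 9600));
    // The service text embeds the resolved address and port, so the two keys
    // differ and a token fetched via kms1 is never found when failing over.
    System.out.println(s1 + " equals " + s2 + "? " + s1.equals(s2)); // false
  }
}
{code}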



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14445) Delegation tokens are not shared between KMS instances

2018-03-22 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410851#comment-16410851
 ] 

Xiao Chen commented on HADOOP-14445:


Thanks for the review, Rushabh; comments addressed in patch 8. Yes, the 
duplicate tokens work the way you described; the new 
{{testTokenCompatibilityOldRenewer}} verifies that. I also verified this on a 
cluster around patch 4 (replacing the jars everywhere except the RM host), if my 
memory is correct... I'd be happy to verify the final patch again once all 
comments are settled.

> Delegation tokens are not shared between KMS instances
> --
>
> Key: HADOOP-14445
> URL: https://issues.apache.org/jira/browse/HADOOP-14445
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: kms
>Affects Versions: 2.8.0, 3.0.0-alpha1
> Environment: CDH5.7.4, Kerberized, SSL, KMS-HA, at rest encryption
>Reporter: Wei-Chiu Chuang
>Assignee: Xiao Chen
>Priority: Major
> Attachments: HADOOP-14445-branch-2.8.002.patch, 
> HADOOP-14445-branch-2.8.patch, HADOOP-14445.002.patch, 
> HADOOP-14445.003.patch, HADOOP-14445.004.patch, HADOOP-14445.05.patch, 
> HADOOP-14445.06.patch, HADOOP-14445.07.patch, HADOOP-14445.08.patch
>
>
> As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider does 
> not share delegation tokens (a client uses the KMS address/port as the key for 
> the delegation token).
> {code:title=DelegationTokenAuthenticatedURL#openConnection}
> if (!creds.getAllTokens().isEmpty()) {
> InetSocketAddress serviceAddr = new InetSocketAddress(url.getHost(),
> url.getPort());
> Text service = SecurityUtil.buildTokenService(serviceAddr);
> dToken = creds.getToken(service);
> {code}
> But KMS doc states:
> {quote}
> Delegation Tokens
> Similar to HTTP authentication, KMS uses Hadoop Authentication for delegation 
> tokens too.
> Under HA, A KMS instance must verify the delegation token given by another 
> KMS instance, by checking the shared secret used to sign the delegation 
> token. To do this, all KMS instances must be able to retrieve the shared 
> secret from ZooKeeper.
> {quote}
> We should either update the KMS documentation, or fix this code to share 
> delegation tokens.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15313) TestKMS should close providers

2018-03-22 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410855#comment-16410855
 ] 

Xiao Chen commented on HADOOP-15313:


Patch 1 to close the providers.

> TestKMS should close providers
> --
>
> Key: HADOOP-15313
> URL: https://issues.apache.org/jira/browse/HADOOP-15313
> Project: Hadoop Common
>  Issue Type: Test
>  Components: kms, test
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Major
> Attachments: HADOOP-15313.01.patch
>
>
> During the review of HADOOP-14445, [~jojochuang] found that the key providers 
> are not closed in tests. Details in [this 
> comment|https://issues.apache.org/jira/browse/HADOOP-14445?focusedCommentId=16397824&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16397824].
> We should investigate and handle that in all related tests.
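A minimal sketch of the cleanup pattern, assuming the tests construct providers
that implement {{KeyProvider.close()}} (as {{KMSClientProvider}} does); the
{{createProvider}} helper and its arguments are assumed names, not quoted from
the patch:

{code:java}
// Ensure a provider created inside a test is closed even if the test fails,
// releasing client-side threads and cached connections.
KeyProvider provider = createProvider(uri, conf); // test helper; name assumed
try {
  // ... exercise the provider ...
} finally {
  provider.close();
}
{code}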



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15313) TestKMS should close providers

2018-03-22 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-15313:
---
Status: Patch Available  (was: Open)

> TestKMS should close providers
> --
>
> Key: HADOOP-15313
> URL: https://issues.apache.org/jira/browse/HADOOP-15313
> Project: Hadoop Common
>  Issue Type: Test
>  Components: kms, test
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Major
> Attachments: HADOOP-15313.01.patch
>
>
> During the review of HADOOP-14445, [~jojochuang] found that the key providers 
> are not closed in tests. Details in [this 
> comment|https://issues.apache.org/jira/browse/HADOOP-14445?focusedCommentId=16397824&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16397824].
> We should investigate and handle that in all related tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15313) TestKMS should close providers

2018-03-22 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-15313:
---
Attachment: HADOOP-15313.01.patch

> TestKMS should close providers
> --
>
> Key: HADOOP-15313
> URL: https://issues.apache.org/jira/browse/HADOOP-15313
> Project: Hadoop Common
>  Issue Type: Test
>  Components: kms, test
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Major
> Attachments: HADOOP-15313.01.patch
>
>
> During the review of HADOOP-14445, [~jojochuang] found that the key providers 
> are not closed in tests. Details in [this 
> comment|https://issues.apache.org/jira/browse/HADOOP-14445?focusedCommentId=16397824&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16397824].
> We should investigate and handle that in all related tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-11423) [Umbrella] Fix Java 10 incompatibilities in Hadoop

2018-03-22 Thread Akira Ajisaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HADOOP-11423:
---
Summary: [Umbrella] Fix Java 10 incompatibilities in Hadoop  (was: 
[Umbrella] Support Java 10 in Hadoop)

> [Umbrella] Fix Java 10 incompatibilities in Hadoop
> --
>
> Key: HADOOP-11423
> URL: https://issues.apache.org/jira/browse/HADOOP-11423
> Project: Hadoop Common
>  Issue Type: New Feature
>Reporter: sneaky
>Priority: Minor
>
> Java 10 is coming quickly to various clusters. Making sure Hadoop seamlessly 
> works with Java 10 is important for the Apache community.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-11423) [Umbrella] Support Java 10 in Hadoop

2018-03-22 Thread Akira Ajisaka (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410860#comment-16410860
 ] 

Akira Ajisaka commented on HADOOP-11423:


Agreed. Updating the description.

> [Umbrella] Support Java 10 in Hadoop
> 
>
> Key: HADOOP-11423
> URL: https://issues.apache.org/jira/browse/HADOOP-11423
> Project: Hadoop Common
>  Issue Type: New Feature
>Reporter: sneaky
>Priority: Minor
>
> Java 10 is coming quickly to various clusters. Making sure Hadoop seamlessly 
> works with Java 10 is important for the Apache community.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-11123) Fix Java 9 incompatibilities in Hadoop

2018-03-22 Thread Akira Ajisaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HADOOP-11123:
---
Summary: Fix Java 9 incompatibilities in Hadoop  (was: Hadoop on Java 9)

> Fix Java 9 incompatibilities in Hadoop
> 
>
> Key: HADOOP-11123
> URL: https://issues.apache.org/jira/browse/HADOOP-11123
> Project: Hadoop Common
>  Issue Type: Task
>Affects Versions: 3.0.0-alpha1
> Environment: Java 9
>Reporter: Steve Loughran
>Priority: Critical
>
> JIRA to cover/track issues related to Hadoop on Java 9.
> Java 9 will have some significant changes, one of which is the removal of 
> various {{com.sun}} classes. These removals need to be handled or Hadoop will 
> not be able to run on a Java 9 JVM.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15338) Support Java 11 LTS in Hadoop

2018-03-22 Thread Akira Ajisaka (JIRA)
Akira Ajisaka created HADOOP-15338:
--

 Summary: Support Java 11 LTS in Hadoop
 Key: HADOOP-15338
 URL: https://issues.apache.org/jira/browse/HADOOP-15338
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Akira Ajisaka


Java 8 will be EoL in January 2019, so we need to support Java 11 LTS 
before that date.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org