[jira] [Comment Edited] (HADOOP-15027) Improvements for Hadoop read from AliyunOSS

2017-11-10 Thread wujinhu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248362#comment-16248362
 ] 

wujinhu edited comment on HADOOP-15027 at 11/11/17 6:40 AM:


Hi Steve Loughran,
Thanks for the comments; your suggestions are very helpful. I will follow 
your suggestions about the thread pool and the retry logic.
For random IO, it is true that my implementation will not work well.
It seems HADOOP-14535 is similar to what the OS does.
The operating system starts sequential read-ahead when one of the following 
conditions is satisfied:
* it is the first read from a file and the seek position is 0
* the current read and the previous read are contiguous in the file

Otherwise, it is random IO.
I will take a look at these two issues and continue to improve this.
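As an aside, a minimal sketch of that OS-style heuristic (hypothetical names, 
not the actual AliyunOSSInputStream code):

{code:java}
// Hypothetical sketch of the OS-style read-ahead heuristic described above.
// Not the actual AliyunOSSInputStream code; the names are illustrative only.
class ReadAheadPolicy {
  private long lastReadEnd = -1;   // end offset of the previous read; -1 = no read yet

  /** Decide whether a read at [pos, pos + len) should trigger sequential read-ahead. */
  boolean isSequential(long pos, int len) {
    boolean sequential =
        (lastReadEnd == -1 && pos == 0)   // first read from the file, seek pos is 0
        || pos == lastReadEnd;            // contiguous with the previous read
    lastReadEnd = pos + len;
    return sequential;                    // false => treat as random IO, skip read-ahead
  }
}
{code}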


was (Author: wujinhu):
Hi Steve Loughran,
Thanks for the comments; your suggestions are very helpful. I will follow 
your suggestions about the thread pool and the retry logic.
For random IO, it is true that my implementation will not work well.
It seems HADOOP-14535 is similar to what the OS does.
The operating system starts sequential read-ahead when one of the following 
conditions is satisfied:
* it is the first read from a file and the seek position is 0
* the current read and the previous read are contiguous in the file
Otherwise, it is random IO.
I will take a look at these two issues and continue to improve this.

> Improvements for Hadoop read from AliyunOSS
> ---
>
> Key: HADOOP-15027
> URL: https://issues.apache.org/jira/browse/HADOOP-15027
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/oss
>Affects Versions: 3.0.0
>Reporter: wujinhu
>Assignee: wujinhu
> Attachments: HADOOP-15027.001.patch
>
>
> Currently, read performance is poor when Hadoop reads from AliyunOSS: it 
> takes about one minute to read 1 GB from OSS.
> The AliyunOSSInputStream class uses a single thread to read data from 
> AliyunOSS, so we can refactor it to use multi-threaded pre-reads to improve this.
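A rough sketch of what such a multi-threaded pre-read could look like (an 
assumed shape with hypothetical names such as fetchRange; not the attached 
patch):

{code:java}
// Assumed shape of a multi-threaded pre-read, not the actual patch: split the
// requested range into blocks and fetch them concurrently from a bounded pool.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PreReader {
  private final ExecutorService pool = Executors.newFixedThreadPool(4);

  List<Future<byte[]>> preRead(long offset, long length, long blockSize) {
    List<Future<byte[]>> blocks = new ArrayList<>();
    for (long pos = offset; pos < offset + length; pos += blockSize) {
      final long start = pos;
      final long end = Math.min(pos + blockSize, offset + length);
      // fetchRange is a placeholder for an OSS ranged GET of [start, end)
      blocks.add(pool.submit(() -> fetchRange(start, end)));
    }
    return blocks;  // the stream then consumes the blocks in order
  }

  private byte[] fetchRange(long start, long end) {
    return new byte[(int) (end - start)];  // stub: real code would call the OSS SDK
  }
}
{code}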






[jira] [Commented] (HADOOP-15027) Improvements for Hadoop read from AliyunOSS

2017-11-10 Thread wujinhu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248362#comment-16248362
 ] 

wujinhu commented on HADOOP-15027:
--

Hi Steve Loughran,
Thanks for the comments; your suggestions are very helpful. I will follow 
your suggestions about the thread pool and the retry logic.
For random IO, it is true that my implementation will not work well.
It seems HADOOP-14535 is similar to what the OS does.
The operating system starts sequential read-ahead when one of the following 
conditions is satisfied:
* it is the first read from a file and the seek position is 0
* the current read and the previous read are contiguous in the file
Otherwise, it is random IO.
I will take a look at these two issues and continue to improve this.

> Improvements for Hadoop read from AliyunOSS
> ---
>
> Key: HADOOP-15027
> URL: https://issues.apache.org/jira/browse/HADOOP-15027
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/oss
>Affects Versions: 3.0.0
>Reporter: wujinhu
>Assignee: wujinhu
> Attachments: HADOOP-15027.001.patch
>
>
> Currently, read performance is poor when Hadoop reads from AliyunOSS: it 
> takes about one minute to read 1 GB from OSS.
> The AliyunOSSInputStream class uses a single thread to read data from 
> AliyunOSS, so we can refactor it to use multi-threaded pre-reads to improve this.






[jira] [Commented] (HADOOP-8522) ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used

2017-11-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248278#comment-16248278
 ] 

Hudson commented on HADOOP-8522:


SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13222 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13222/])
HADOOP-8522. ResetableGzipOutputStream creates invalid gzip files when 
(cdouglas: rev 796a0d3a5c661f0c3b23af9c0db2d8f3db83c322)
* (add) 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/TestGzipCodec.java
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/GzipCodec.java


> ResetableGzipOutputStream creates invalid gzip files when finish() and 
> resetState() are used
> 
>
> Key: HADOOP-8522
> URL: https://issues.apache.org/jira/browse/HADOOP-8522
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: io
>Affects Versions: 1.0.3, 2.0.0-alpha
>Reporter: Mike Percy
>Assignee: Mike Percy
>  Labels: BB2015-05-TBR
> Fix For: 3.0.0
>
> Attachments: HADOOP-8522-4.patch, HADOOP-8522.05.patch, 
> HADOOP-8522.06.patch, HADOOP-8522.07.patch
>
>
> ResetableGzipOutputStream creates invalid gzip files when finish() and 
> resetState() are used. The issue is that finish() flushes the compressor 
> buffer and writes the gzip CRC32 + data length trailer. After that, 
> resetState() does not repeat the gzip header, but simply starts writing more 
> deflate-compressed data. The resultant files are not readable by the Linux 
> "gunzip" tool. ResetableGzipOutputStream should write valid multi-member gzip 
> files.
> The gzip format is specified in [RFC 
> 1952|https://tools.ietf.org/html/rfc1952].
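A minimal sketch with plain java.util.zip (not the Hadoop codec) of what a 
valid multi-member file requires: each member gets its own header and trailer, 
and the concatenation remains readable by gunzip:

{code:java}
// Minimal sketch with plain java.util.zip (not the Hadoop codec): each member
// gets its own gzip header and CRC32 + length trailer, and the concatenation
// is a valid multi-member file that gunzip can read.
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class MultiMemberGzip {
  public static void main(String[] args) throws IOException {
    try (FileOutputStream out = new FileOutputStream("multi.gz")) {
      for (String member : new String[] {"first member\n", "second member\n"}) {
        GZIPOutputStream gz = new GZIPOutputStream(out);  // writes a fresh header
        gz.write(member.getBytes(StandardCharsets.UTF_8));
        gz.finish();  // writes the member trailer but leaves `out` open for the next member
      }
    }
  }
}
{code}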






[jira] [Updated] (HADOOP-8522) ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used

2017-11-10 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HADOOP-8522:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

I committed this. Thanks, [~mpercy]

> ResetableGzipOutputStream creates invalid gzip files when finish() and 
> resetState() are used
> 
>
> Key: HADOOP-8522
> URL: https://issues.apache.org/jira/browse/HADOOP-8522
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: io
>Affects Versions: 1.0.3, 2.0.0-alpha
>Reporter: Mike Percy
>Assignee: Mike Percy
>  Labels: BB2015-05-TBR
> Fix For: 3.0.0
>
> Attachments: HADOOP-8522-4.patch, HADOOP-8522.05.patch, 
> HADOOP-8522.06.patch, HADOOP-8522.07.patch
>
>
> ResetableGzipOutputStream creates invalid gzip files when finish() and 
> resetState() are used. The issue is that finish() flushes the compressor 
> buffer and writes the gzip CRC32 + data length trailer. After that, 
> resetState() does not repeat the gzip header, but simply starts writing more 
> deflate-compressed data. The resultant files are not readable by the Linux 
> "gunzip" tool. ResetableGzipOutputStream should write valid multi-member gzip 
> files.
> The gzip format is specified in [RFC 
> 1952|https://tools.ietf.org/html/rfc1952].






[jira] [Commented] (HADOOP-14929) Cleanup usage of decodecomponent and use QueryStringDecoder from netty

2017-11-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248238#comment-16248238
 ] 

Hudson commented on HADOOP-14929:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13220 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13220/])
HADOOP-14929. Cleanup usage of decodecomponent and use (arp: rev 
1d6f8bebe9d20c958e419c140109e3d9fec8cb46)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/TestParameterParser.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/ParameterParser.java


> Cleanup usage of decodecomponent and use QueryStringDecoder from netty
> --
>
> Key: HADOOP-14929
> URL: https://issues.apache.org/jira/browse/HADOOP-14929
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
> Fix For: 3.1.0
>
> Attachments: HADOOP-14929.00.patch, HADOOP-14929.01.patch, 
> HADOOP-14929.02.patch, HADOOP-14929.03.patch
>
>
> This is from the review of HADOOP-14910.
> There is another usage of 
> decodeComponent(param(CreateFlagParam.NAME), StandardCharsets.UTF_8);
> in ParameterParser.java, lines 147-148:
> String cf = decodeComponent(param(CreateFlagParam.NAME), 
> StandardCharsets.UTF_8);
> Use QueryStringDecoder from netty here too and clean up decodeComponent, 
> which was only added to work around a netty issue.
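For reference, a minimal sketch of the netty call in question (the URI and 
parameter value below are illustrative only):

{code:java}
// Sketch of io.netty.handler.codec.http.QueryStringDecoder usage; the URI
// and parameter value below are illustrative only.
import io.netty.handler.codec.http.QueryStringDecoder;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class QueryDecodeExample {
  public static void main(String[] args) {
    QueryStringDecoder decoder = new QueryStringDecoder(
        "/webhdfs/v1/tmp/f?op=CREATE&createflag=a%2Cb", StandardCharsets.UTF_8);
    List<String> values = decoder.parameters().get("createflag");
    System.out.println(values);  // [a,b] -- percent-decoded by netty
  }
}
{code}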






[jira] [Commented] (HADOOP-8522) ResetableGzipOutputStream creates invalid gzip files when finish() and resetState() are used

2017-11-10 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248229#comment-16248229
 ] 

Chris Douglas commented on HADOOP-8522:
---

If there's no other feedback, I'll commit this.

> ResetableGzipOutputStream creates invalid gzip files when finish() and 
> resetState() are used
> 
>
> Key: HADOOP-8522
> URL: https://issues.apache.org/jira/browse/HADOOP-8522
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: io
>Affects Versions: 1.0.3, 2.0.0-alpha
>Reporter: Mike Percy
>Assignee: Mike Percy
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-8522-4.patch, HADOOP-8522.05.patch, 
> HADOOP-8522.06.patch, HADOOP-8522.07.patch
>
>
> ResetableGzipOutputStream creates invalid gzip files when finish() and 
> resetState() are used. The issue is that finish() flushes the compressor 
> buffer and writes the gzip CRC32 + data length trailer. After that, 
> resetState() does not repeat the gzip header, but simply starts writing more 
> deflate-compressed data. The resultant files are not readable by the Linux 
> "gunzip" tool. ResetableGzipOutputStream should write valid multi-member gzip 
> files.
> The gzip format is specified in [RFC 
> 1952|https://tools.ietf.org/html/rfc1952].






[jira] [Updated] (HADOOP-14929) Cleanup usage of decodecomponent and use QueryStringDecoder from netty

2017-11-10 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HADOOP-14929:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.1.0
   Status: Resolved  (was: Patch Available)

I've committed this. Thanks [~bharatviswa].

> Cleanup usage of decodecomponent and use QueryStringDecoder from netty
> --
>
> Key: HADOOP-14929
> URL: https://issues.apache.org/jira/browse/HADOOP-14929
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
> Fix For: 3.1.0
>
> Attachments: HADOOP-14929.00.patch, HADOOP-14929.01.patch, 
> HADOOP-14929.02.patch, HADOOP-14929.03.patch
>
>
> This is from the review of HADOOP-14910.
> There is another usage of 
> decodeComponent(param(CreateFlagParam.NAME), StandardCharsets.UTF_8);
> in ParameterParser.java, lines 147-148:
> String cf = decodeComponent(param(CreateFlagParam.NAME), 
> StandardCharsets.UTF_8);
> Use QueryStringDecoder from netty here too and clean up decodeComponent, 
> which was only added to work around a netty issue.






[jira] [Updated] (HADOOP-15030) [branch-2] Include hadoop-cloud-storage-project in the main hadoop pom modules

2017-11-10 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated HADOOP-15030:

   Resolution: Fixed
Fix Version/s: 2.9.0
   Status: Resolved  (was: Patch Available)

Thanks [~vrushalic] for raising this and [~chris.douglas] for the review. I 
have committed this to branch-2/2.9/2.9.0.

> [branch-2] Include hadoop-cloud-storage-project in the main hadoop pom modules
> --
>
> Key: HADOOP-15030
> URL: https://issues.apache.org/jira/browse/HADOOP-15030
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
>Priority: Critical
> Fix For: 2.9.0
>
> Attachments: HADOOP-15030-branch-2-v1.patch
>
>
> During validation of the 2.9.0 RC, [~vrushalic] noticed that the 
> hadoop-cloud-storage-project is not included in the main hadoop pom modules, 
> so it is not being managed, including by mvn versions:set for releases. This 
> was fixed in trunk with HADOOP-14004; this issue does the same for branch-2.
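For reference, the fix amounts to one module entry in the parent pom (a 
sketch; the surrounding module list is abbreviated):

{code:xml}
<!-- Sketch of the branch-2 parent pom.xml change; surrounding modules abbreviated -->
<modules>
  ...
  <module>hadoop-cloud-storage-project</module>
</modules>
{code}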






[jira] [Commented] (HADOOP-15030) [branch-2] Include hadoop-cloud-storage-project in the main hadoop pom modules

2017-11-10 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248061#comment-16248061
 ] 

Chris Douglas commented on HADOOP-15030:


+1 good catch, [~vrushalic]. I don't think this needs to wait for Yetus.

> [branch-2] Include hadoop-cloud-storage-project in the main hadoop pom modules
> --
>
> Key: HADOOP-15030
> URL: https://issues.apache.org/jira/browse/HADOOP-15030
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
>Priority: Critical
> Attachments: HADOOP-15030-branch-2-v1.patch
>
>
> During validation of the 2.9.0 RC, [~vrushalic] noticed that the 
> hadoop-cloud-storage-project is not included in the main hadoop pom modules, 
> so it is not being managed, including by mvn versions:set for releases. This 
> was fixed in trunk with HADOOP-14004; this issue does the same for branch-2.






[jira] [Updated] (HADOOP-15003) Merge S3A committers into trunk: Yetus patch checker

2017-11-10 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15003:

Status: Patch Available  (was: Open)

> Merge S3A committers into trunk: Yetus patch checker
> 
>
> Key: HADOOP-15003
> URL: https://issues.apache.org/jira/browse/HADOOP-15003
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13786-041.patch, HADOOP-13786-042.patch, 
> HADOOP-13786-043.patch, HADOOP-13786-044.patch, HADOOP-13786-045.patch, 
> HADOOP-13786-046.patch, HADOOP-13786-047.patch
>
>
> This is a Yetus-only JIRA, created to have Yetus review the 
> HADOOP-13786/HADOOP-14971 patch as a .patch file, as the review PR 
> [https://github.com/apache/hadoop/pull/282] is stopping this happening in 
> HADOOP-14971.
> Reviews should go into the PR/other task.






[jira] [Commented] (HADOOP-15003) Merge S3A committers into trunk: Yetus patch checker

2017-11-10 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248034#comment-16248034
 ] 

Steve Loughran commented on HADOOP-15003:
-

Aaron's test failure has forced me to look at all the abort code in the 
committers. So I have.


* Found a bug in the existing S3AFileSystem.listPendingUploads(path): because 
the path never had a trailing "/", it would list pending uploads in any directory 
which began with the same prefix, e.g. /work/job1 and /work/job12. The final 
committer cleanup() logic would therefore abort adjacent work. Fixed (see the 
sketch after this list).
* Turned off that bulk cleanup to see what would happen; this helped explore the 
list-and-delete-pending code, which ultimately proved insufficient (see below). 
listPendingUploads -> abort is in fact the only way to do it reliably.
* Logging improved at debug level, with a bit more on import to deal with the 
use case "things aren't working in production, here are the logs, the VM is gone".

  
Two little issues related to cleanup.

The magic app-attempt abort would list everything in the app attempt's 
dest/__magic/$jobid_$attemptid/{*.pendingset, *.pending} and clean that up.
But this doesn't clean up after any failed attempt, which will have a different 
attemptID and so will not be found.

For the magic committer, switched to a much simpler cleanup:
* list all uploads pending under $dest, abort them
* magic: delete __magic/*
* staging: abort the wrapped job
* staging: delete the local staging dirs for the entire job

That is: no list/read/cancel of pending job & task events. Expensive *and not 
complete*.
This bulk abort of all pending uploads under a dest handles: failed tasks, failed 
attempts, and failures of incomplete jobs from other processes/work which were 
never aborted (e.g. a previous query somewhere failed, leaving pending work).
This was already being done after all the "structured" aborts; here I remove 
that work and only do the bulk list & rm for a simpler life, and made sure all of 
it executes in parallel for the fastest cleanup.

This significantly simplifies the abort, commit-failure and helper classes (no 
need to suppress exceptions), and reduces the diff between the staging & magic 
committers (so more has been pulled up to the base class).

That is: this is the first patch for a while which cuts out production code in 
any measurable way.

The staging committer *currently* does the same, but it retains the listing and 
abort of all outstanding requests first. Why? I've kept it there to handle the 
use case of >1 partitioned commit running simultaneously, where you don't want 
to abort all the outstanding stuff. That's not fully rounded off in this patch, 
because it's still doing the bulk cleanup: I want to see what Ryan thinks is 
best here. I think I'm going to opt for: list, then optional bulk abort. That 
means the default doesn't leave pending objects around. That makes the list 
operation superfluous, of course, but (a) it's going against HDFS so is less 
expensive and (b) it means that all tests cover the codepath, rather than having 
a condition which has weak coverage. That is: I don't think it's that harmful: 
there's the cost of an HDFS list(recursive=true) and an open + parse of every 
read. For the magic committer, in contrast, it's more expensive (the LIST, the 
GET), and it doesn't support that notion of partitioned work: you can't have >1 
job writing to the same directory tree.


h3. Docs:

* mention examining the _SUCCESS file in the troubleshooting section
* defaults of the various options

I'm wrapping up some details on cleanup in the committer arch doc; not included 
in this patch, as it is late on a Friday evening.

h3. Misc

* Invoke adds a new method {{ignoreIOExceptions(Log, text, path, 
lambda-operation)}}, which runs the operation and logs at info if there is a 
problem. This replaces a lot of the try/catch(IOE ignored) sequences in 
cleanup. What it makes easy is to isolate every single operation this way, so 
that if one fails, the next step still runs. This makes cleanup potentially 
more rigorous (a sketch of the shape follows below).
* S3AUtils {{list(fs, path, recursive, filter)}} replicates the classic 
{{FileSystem.list(path, filter)}} but works on the higher-performance recursive 
listFileStatus operation, and returns a list of those. Some bits of code taking 
{{List<FileStatus>}} have had to be changed to {{List<LocatedFileStatus>}}: 
although Java type erasure means these are the same at runtime, the compiler 
still complains needlessly.
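A sketch of the shape such a helper might take (the signature here is assumed, 
not taken from the patch):

{code:java}
// Assumed shape of such a helper (the real signature lives in the patch):
// run one cleanup step, log and swallow any IOException so the next step runs.
import java.io.IOException;
import org.slf4j.Logger;

interface CheckedOperation {
  void run() throws IOException;
}

final class Cleanup {
  static void ignoreIOExceptions(Logger log, String text, String path,
      CheckedOperation op) {
    try {
      op.run();
    } catch (IOException e) {
      log.info("{} on {}: {}", text, path, e.toString());
    }
  }
}
{code}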
  

h3. Testing

* The huge magic commit test is called ITestS3AHugeMagicCommits; guarantees 
excluded with the normal rules. This is important, as the test_010_ test case, 
which creates the file, skips the operation if it runs in parallel...which is 
going to fail the test with *the exact stack trace which Aaron saw*. That is, I 
think the test failure really could have been a false alarm. But it forced me 
to look at cleanup, which is a good thing.
* New test to create two parallel jobs and work each one simultaneously. Found 

[jira] [Updated] (HADOOP-15003) Merge S3A committers into trunk: Yetus patch checker

2017-11-10 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15003:

Attachment: HADOOP-13786-047.patch

> Merge S3A committers into trunk: Yetus patch checker
> 
>
> Key: HADOOP-15003
> URL: https://issues.apache.org/jira/browse/HADOOP-15003
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-13786-041.patch, HADOOP-13786-042.patch, 
> HADOOP-13786-043.patch, HADOOP-13786-044.patch, HADOOP-13786-045.patch, 
> HADOOP-13786-046.patch, HADOOP-13786-047.patch
>
>
> This is a Yetus only JIRA created to have Yetus review the 
> HADOOP-13786/HADOOP-14971 patch as a .patch file, as the review PR 
> [https://github.com/apache/hadoop/pull/282] is stopping this happening in 
> HADOOP-14971.
> Reviews should go into the PR/other task






[jira] [Updated] (HADOOP-15030) [branch-2] Include hadoop-cloud-storage-project in the main hadoop pom modules

2017-11-10 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated HADOOP-15030:

Attachment: HADOOP-15030-branch-2-v1.patch

Attaching a trivial patch that adds the module.

> [branch-2] Include hadoop-cloud-storage-project in the main hadoop pom modules
> --
>
> Key: HADOOP-15030
> URL: https://issues.apache.org/jira/browse/HADOOP-15030
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
>Priority: Critical
> Attachments: HADOOP-15030-branch-2-v1.patch
>
>
> During validation of the 2.9.0 RC, [~vrushalic] noticed that the 
> hadoop-cloud-storage-project is not included in the main hadoop pom modules, 
> so it is not being managed, including by mvn versions:set for releases. This 
> was fixed in trunk with HADOOP-14004; this issue does the same for branch-2.






[jira] [Updated] (HADOOP-15030) [branch-2] Include hadoop-cloud-storage-project in the main hadoop pom modules

2017-11-10 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated HADOOP-15030:

Status: Patch Available  (was: Open)

> [branch-2] Include hadoop-cloud-storage-project in the main hadoop pom modules
> --
>
> Key: HADOOP-15030
> URL: https://issues.apache.org/jira/browse/HADOOP-15030
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
>Priority: Critical
> Attachments: HADOOP-15030-branch-2-v1.patch
>
>
> During validation of the 2.9.0 RC, [~vrushalic] noticed that the 
> hadoop-cloud-storage-project is not included in the main hadoop pom modules, 
> so it is not being managed, including by mvn versions:set for releases. This 
> was fixed in trunk with HADOOP-14004; this issue does the same for branch-2.






[jira] [Updated] (HADOOP-15030) [branch-2] Include hadoop-cloud-storage-project in the main hadoop pom modules

2017-11-10 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated HADOOP-15030:

Description: During validation of the 2.9.0 RC, [~vrushalic] noticed that the 
hadoop-cloud-storage-project is not included in the main hadoop pom modules, so 
it is not being managed, including by mvn versions:set for releases. This was 
fixed in trunk with HADOOP-14004; this issue does the same for branch-2.  (was: 
During validation of the 2.9.0 RC, [~vrushalic] noticed that the 
hadoop-cloud-storage-project is not included in the main hadoop pom modules, so 
it is not being managed, including by mvn versions:set for releases.)

> [branch-2] Include hadoop-cloud-storage-project in the main hadoop pom modules
> --
>
> Key: HADOOP-15030
> URL: https://issues.apache.org/jira/browse/HADOOP-15030
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
>Priority: Critical
>
> During validation of the 2.9.0 RC, [~vrushalic] noticed that the 
> hadoop-cloud-storage-project is not included in the main hadoop pom modules, 
> so it is not being managed, including by mvn versions:set for releases. This 
> was fixed in trunk with HADOOP-14004; this issue does the same for branch-2.






[jira] [Updated] (HADOOP-15030) [branch-2] Include hadoop-cloud-storage-project in the main hadoop pom modules

2017-11-10 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated HADOOP-15030:

Affects Version/s: (was: 3.0.0-alpha3)
   (was: 3.0.0-alpha4)
   (was: 3.0.0-beta1)

> [branch-2] Include hadoop-cloud-storage-project in the main hadoop pom modules
> --
>
> Key: HADOOP-15030
> URL: https://issues.apache.org/jira/browse/HADOOP-15030
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
>Priority: Critical
>
> During validation of the 2.9.0 RC, [~vrushalic] noticed that the 
> hadoop-cloud-storage-project is not included in the main hadoop pom modules, 
> so it is not being managed, including by mvn versions:set for releases.






[jira] [Updated] (HADOOP-15030) [branch-2] Include hadoop-cloud-storage-project in the main hadoop pom modules

2017-11-10 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated HADOOP-15030:

Summary: [branch-2] Include hadoop-cloud-storage-project in the main hadoop 
pom modules  (was: Include hadoop-cloud-storage-project in the main hadoop pom 
modules)

> [branch-2] Include hadoop-cloud-storage-project in the main hadoop pom modules
> --
>
> Key: HADOOP-15030
> URL: https://issues.apache.org/jira/browse/HADOOP-15030
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-beta1, 3.0.0-alpha4, 3.0.0-alpha3
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
>Priority: Critical
>
> During validation of the 2.9.0 RC, [~vrushalic] noticed that the 
> hadoop-cloud-storage-project is not included in the main hadoop pom modules, 
> so it is not being managed, including by mvn versions:set for releases.






[jira] [Updated] (HADOOP-15030) Include hadoop-cloud-storage-project in the main hadoop pom modules

2017-11-10 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated HADOOP-15030:

Affects Version/s: 2.9.0
   3.0.0-beta1
   3.0.0-alpha4
   3.0.0-alpha3

> Include hadoop-cloud-storage-project in the main hadoop pom modules
> ---
>
> Key: HADOOP-15030
> URL: https://issues.apache.org/jira/browse/HADOOP-15030
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-beta1, 3.0.0-alpha4, 3.0.0-alpha3
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
>Priority: Critical
>
> During validation of the 2.9.0 RC, [~vrushalic] noticed that the 
> hadoop-cloud-storage-project is not included in the main hadoop pom modules, 
> so it is not being managed, including by mvn versions:set for releases.






[jira] [Created] (HADOOP-15030) Include hadoop-cloud-storage-project in the main hadoop pom modules

2017-11-10 Thread Subru Krishnan (JIRA)
Subru Krishnan created HADOOP-15030:
---

 Summary: Include hadoop-cloud-storage-project in the main hadoop 
pom modules
 Key: HADOOP-15030
 URL: https://issues.apache.org/jira/browse/HADOOP-15030
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Subru Krishnan
Assignee: Subru Krishnan
Priority: Critical


During validation of the 2.9.0 RC, [~vrushalic] noticed that the 
hadoop-cloud-storage-project is not included in the main hadoop pom modules, so 
it is not being managed, including by mvn versions:set for releases.






[jira] [Comment Edited] (HADOOP-9747) Reduce unnecessary UGI synchronization

2017-11-10 Thread Bharat Viswanadham (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16246681#comment-16246681
 ] 

Bharat Viswanadham edited comment on HADOOP-9747 at 11/10/17 7:52 PM:
--

Hi [~daryn],
Thank you for providing the patch.
One comment from me: this patch removed the flag 
HADOOP_TREAT_SUBJECT_EXTERNAL_KEY, so this configuration also needs to be 
removed from CommonConfigurations.java and from core-default.xml.

Also, could you rebase your patch, as it does not apply cleanly to trunk?


was (Author: bharatviswa):
Hi [~daryn],
Thank you for providing the patch.
One comment from me: this patch removed the flag 
HADOOP_TREAT_SUBJECT_EXTERNAL_KEY, so this configuration needs to be removed 
from CommonConfigurations.java and from core-default.xml.

Also, could you rebase your patch, as it does not apply cleanly to trunk?

> Reduce unnecessary UGI synchronization
> --
>
> Key: HADOOP-9747
> URL: https://issues.apache.org/jira/browse/HADOOP-9747
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0-alpha1
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HADOOP-9747.2.branch-2.patch, HADOOP-9747.2.trunk.patch, 
> HADOOP-9747.branch-2.patch, HADOOP-9747.trunk.patch
>
>
> Jstacks of heavily loaded NNs show up to dozens of threads blocking in the 
> UGI.






[jira] [Commented] (HADOOP-14155) KerberosName.replaceParameters() may throw java.lang.ArrayIndexOutOfBoundsException

2017-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247753#comment-16247753
 ] 

Hadoop QA commented on HADOOP-14155:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m  
0s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 36m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
8s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 17m 
50s{color} | {color:red} branch has errors when building and testing our client 
artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
27s{color} | {color:red} hadoop-auth in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
23s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 23s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 6s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m  
7s{color} | {color:red} hadoop-auth in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 24s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
40s{color} | {color:green} hadoop-auth in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}116m 57s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HADOOP-14155 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12856781/HADOOP-14155.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 152852a6f7d1 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 8a1bd9a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/13660/artifact/out/patch-mvninstall-hadoop-common-project_hadoop-auth.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/13660/artifact/out/patch-compile-root.txt
 |
| javac | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/13660/artifact/out/patch-compile-root.txt
 |
| mvnsite | 

[jira] [Commented] (HADOOP-15008) Metrics sinks may emit too frequently if multiple sink periods are configured

2017-11-10 Thread Erik Krogen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247725#comment-16247725
 ] 

Erik Krogen commented on HADOOP-15008:
--

[~eyang] any chance you can help review?

> Metrics sinks may emit too frequently if multiple sink periods are configured
> -
>
> Key: HADOOP-15008
> URL: https://issues.apache.org/jira/browse/HADOOP-15008
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: metrics
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Minor
> Attachments: HADOOP-15008.000.patch
>
>
> If there are multiple metrics sink periods configured, depending on what 
> those periods are, some sinks may emit too frequently. For example with the 
> following:
> {code:title=hadoop-metrics2.properties}
> namenode.sink.file10.class=org.apache.hadoop.metrics2.sink.FileSink
> namenode.sink.file5.class=org.apache.hadoop.metrics2.sink.FileSink
> namenode.sink.file10.filename=namenode-metrics_per10.out
> namenode.sink.file5.filename=namenode-metrics_per5.out
> namenode.sink.file10.period=10
> namenode.sink.file5.period=5
> {code}
> I get the following:
> {code}
> ± for f in namenode-metrics_per*.out; do echo "$f" && grep 
> "metricssystem.MetricsSystem" $f | awk '{last=curr; curr=$1} END { print 
> curr-last }'; done
> namenode-metrics_per10.out
> 5000
> namenode-metrics_per5.out
> 5000
> {code}
> i.e., for both metrics files, each record is 5000 ms apart, even though one 
> of the sinks has been configured to emit at 10s intervals.
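A minimal sketch of the expected gating (illustrative, not the metrics2 
internals): the shared timer fires at the smallest configured period, and each 
sink emits only once its own period has elapsed:

{code:java}
// Illustrative sketch of the expected behaviour, not the metrics2 internals:
// the shared timer fires at the smallest configured period, and each sink
// only emits once its own period has elapsed.
class SinkGate {
  private final long periodMillis;
  private long lastEmitted = Long.MIN_VALUE;

  SinkGate(long periodMillis) {
    this.periodMillis = periodMillis;
  }

  /** Called on every base timer tick; true if this sink should emit now. */
  boolean shouldEmit(long nowMillis) {
    if (lastEmitted == Long.MIN_VALUE || nowMillis - lastEmitted >= periodMillis) {
      lastEmitted = nowMillis;
      return true;  // e.g. the 10s sink emits on every other 5s base tick
    }
    return false;
  }
}
{code}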






[jira] [Resolved] (HADOOP-9083) Port HADOOP-9020 Add a SASL PLAIN server to branch 1

2017-11-10 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor resolved HADOOP-9083.
--
Resolution: Won't Fix

> Port HADOOP-9020 Add a SASL PLAIN server to branch 1
> 
>
> Key: HADOOP-9083
> URL: https://issues.apache.org/jira/browse/HADOOP-9083
> Project: Hadoop Common
>  Issue Type: Task
>  Components: ipc, security
>Affects Versions: 1.0.3
>Reporter: Yu Gao
>Assignee: Yu Gao
> Attachments: HADOOP-9020-branch-1.patch, test-TestSaslRPC.result, 
> test-patch.result
>
>
> It would be good if the patch of HADOOP-9020 for adding SASL PLAIN server 
> implementation could be ported to branch 1 as well.






[jira] [Resolved] (HADOOP-10743) Problem building hadoop -2.4.0 on FreeBSD 10 (without -Pnative)

2017-11-10 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor resolved HADOOP-10743.
---
Resolution: Won't Fix

2.4 is no longer supported.

> Problem building hadoop -2.4.0 on FreeBSD 10 (without -Pnative)
> ---
>
> Key: HADOOP-10743
> URL: https://issues.apache.org/jira/browse/HADOOP-10743
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.4.0
> Environment: $ uname -a
> FreeBSD kakumen 10.0-STABLE FreeBSD 10.0-STABLE #4 r267707: Sat Jun 21 
> 19:40:06 COT 2014 pfg@kakumen:/usr/obj/usr/src/sys/GENERIC  amd64
> $ javac -version 
> javac 1.6.0_32
> $
>Reporter: Pedro Giffuni
>
> mapreduce-client-core fails to compile with java 1.6 on FreeBSD 10.






[jira] [Commented] (HADOOP-14665) Support # hash prefix comment lines in auth_to_local mapping rules

2017-11-10 Thread Hari Sekhon (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247614#comment-16247614
 ] 

Hari Sekhon commented on HADOOP-14665:
--

Most of us don't hand-edit that file any more in recent years, as it's commonly 
done via Ambari, but yes, I see your point, since it is just an XML value field.

I don't think it would be difficult for Hadoop to simply strip hash comments 
out of the value field, though.

I might raise this with the Ambari project as well, but I still think it might 
be better done in core, as otherwise it won't help people who are on Cloudera 
or MapR, which don't use Ambari, and it becomes a distribution-specific 
enhancement in each vendor's management console in that case.

> Support # hash prefix comment lines in auth_to_local mapping rules
> --
>
> Key: HADOOP-14665
> URL: https://issues.apache.org/jira/browse/HADOOP-14665
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.7.3
> Environment: HDP 2.6.0 + Kerberos
>Reporter: Hari Sekhon
>
> Request to add support for # hash-prefixed comment lines in Hadoop's 
> auth_to_local mapping rules, so I can comment on what rules I've added and 
> why, inline with the rules as with code (useful when supporting 
> multi-directory mappings).
> It should be fairly easy to implement: just strip every line from # to the 
> end, trim whitespace, and then exclude all blank / whitespace-only lines; I do 
> this in tools I write all the time.
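The stripping described above is indeed a few lines; a sketch, applied to the 
raw configuration value before rule parsing:

{code:java}
// Sketch of the stripping the reporter describes, applied to the raw
// auth_to_local value before rule parsing: drop '#' to end of line,
// trim, and discard blank lines.
import java.util.Arrays;
import java.util.stream.Collectors;

class RuleComments {
  static String stripHashComments(String rules) {
    return Arrays.stream(rules.split("\n"))
        .map(line -> line.replaceFirst("#.*$", "").trim())
        .filter(line -> !line.isEmpty())
        .collect(Collectors.joining("\n"));
  }
}
{code}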






[jira] [Commented] (HADOOP-9327) Out of date code examples

2017-11-10 Thread Andras Bokor (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247605#comment-16247605
 ] 

Andras Bokor commented on HADOOP-9327:
--

All three classes are still available on trunk:
* 
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/JobConfigurationParser.java
* 
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-rumen/src/main/java/org/apache/hadoop/tools/rumen/LoggedNetworkTopology.java
* 
https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/ContextFactory.java

What is this ticket about?

> Out of date code examples
> -
>
> Key: HADOOP-9327
> URL: https://issues.apache.org/jira/browse/HADOOP-9327
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.0.3-alpha
>Reporter: Hao Zhong
>
> 1. This page contains code examples that use JobConfigurationParser
> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/tools/rumen/package-summary.html
> "JobConfigurationParser jcp = 
>   new JobConfigurationParser(interestedProperties);"
> JobConfigurationParser is deleted in 2.0.3
> 2. This page contains code examples that use ContextFactory
> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/metrics/package-summary.html
> " ContextFactory factory = ContextFactory.getFactory();
> ... examine and/or modify factory attributes ...
> MetricsContext context = factory.getContext("myContext");"
> ContextFactory is deleted in 2.0.3
> 3. This page contains code examples that use LoggedNetworkTopology
> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/tools/rumen/package-summary.html
> " do.init("topology.json", conf);
> 
>   // get the job summary using TopologyBuilder
>   LoggedNetworkTopology topology = topologyBuilder.build();"
> LoggedNetworkTopology is deleted in 2.0.3
> Please revise the documentation to reflect the code.






[jira] [Commented] (HADOOP-14389) Exception handling is incorrect in KerberosName.java

2017-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247597#comment-16247597
 ] 

Hadoop QA commented on HADOOP-14389:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  6m 
48s{color} | {color:red} Docker failed to build yetus/hadoop:5b98639. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HADOOP-14389 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12879815/HADOOP-14389.02.patch 
|
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/13661/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Exception handling is incorrect in KerberosName.java
> 
>
> Key: HADOOP-14389
> URL: https://issues.apache.org/jira/browse/HADOOP-14389
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>  Labels: supportability
> Attachments: HADOOP-14389.01.patch, HADOOP-14389.02.patch
>
>
> I found multiple inconsistencies:
> Rule: {{RULE:\[2:$1/$2\@$3\](.\*)s/.\*/hdfs/}}
> Principal: {{nn/host.dom...@realm.tld}}
> Expected exception: {{BadStringFormat: ...3 is out of range...}}
> Actual exception: {{ArrayIndexOutOfBoundsException: 3}}
> 
> Rule: {{RULE:\[:$1/$2\@$0](.\*)s/.\*/hdfs/}} (Missing num of components)
> Expected: {{IllegalArgumentException}}
> Actual: {{java.lang.NumberFormatException: For input string: ""}}
> 
> Rule: {{RULE:\[2:$-1/$2\@$3\](.\*)s/.\*/hdfs/}}
> Expected {{BadStringFormat: -1 is outside of valid range...}}
> Actual: {{java.lang.NumberFormatException: For input string: ""}}
> 
> Rule: {{RULE:\[2:$one/$2\@$3\](.\*)s/.\*/hdfs/}}
> Expected {{java.lang.NumberFormatException: For input string: "one"}}
> Actual: {{java.lang.NumberFormatException: For input string: ""}}
> 
> In addition:
> {code}[^\\]]{code}
> does not really make sense in {{ruleParser}}. Most probably it was needed 
> because we parse the whole rule string and remove the parsed rule from the 
> beginning of the string: {{KerberosName#parseRules}}. Without it, the regex 
> engine parsed incorrectly.
> In addition: in the tests, some corner cases are not covered.
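For anyone reproducing these cases, the rules are exercised roughly as below 
(a sketch; the principal, realm, and rule are illustrative, unlike the 
malformed rules above, which throw during parsing or resolution):

{code:java}
// Sketch of how auth_to_local rules are exercised; the principal, realm and
// rule here are illustrative. The malformed rules listed above throw from
// setRules()/getShortName() instead.
import org.apache.hadoop.security.authentication.util.KerberosName;

public class RuleCheck {
  public static void main(String[] args) throws Exception {
    KerberosName.setRules("RULE:[2:$1@$0](.*@REALM.TLD)s/.*/hdfs/\nDEFAULT");
    KerberosName name = new KerberosName("nn/host.domain@REALM.TLD");
    System.out.println(name.getShortName());  // the rule maps the principal to "hdfs"
  }
}
{code}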






[jira] [Resolved] (HADOOP-9282) Document Java 7 support

2017-11-10 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor resolved HADOOP-9282.
--
Resolution: Duplicate

Obsolete. 1.7 is mentioned on the wiki page.

> Document Java 7 support
> ---
>
> Key: HADOOP-9282
> URL: https://issues.apache.org/jira/browse/HADOOP-9282
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Reporter: Kevin Lyda
>
> The Hadoop Java Versions page makes no mention of Java 7.
> http://wiki.apache.org/hadoop/HadoopJavaVersions
> Java 6 is EOL as of this month ( 
> http://www.java.com/en/download/faq/java_6.xml ) and that's after extending 
> the date twice: https://blogs.oracle.com/henrik/entry/java_6_eol_h_h While 
> Oracle has recently released a number of security patches, chances are more 
> security issues will come up and we'll be left running clusters we can't 
> patch if we stay with Java 6.
> Does Hadoop support Java 7 and if so could the docs be changed to indicate 
> that?






[jira] [Resolved] (HADOOP-9324) Out of date API document

2017-11-10 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor resolved HADOOP-9324.
--
Resolution: Duplicate

I have raised HADOOP-15021, which covers the root cause of most of the issues 
above. The others are fine:

1. Covered by HADOOP-15021
2. Covered by HADOOP-15021
3. Covered by HADOOP-15021
4. JoinCollector is not deleted
5. No longer an issue
6. Covered by HADOOP-15021
7. Covered by HADOOP-15021
8. Covered by HADOOP-15021
9. Covered by HADOOP-15021
10. JobContextImpl is not deleted. It will be covered by HADOOP-15021
11. It is correct as it is
12. Covered by HADOOP-15021
13. Covered by HADOOP-15021
14. Covered by HADOOP-15021
15. Covered by HADOOP-15021
16. Package exists
17. Covered by HADOOP-15021
18. Covered by HADOOP-15021
19. No longer valid
20. Covered by HADOOP-15021


> Out of date API document
> 
>
> Key: HADOOP-9324
> URL: https://issues.apache.org/jira/browse/HADOOP-9324
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.0.3-alpha
>Reporter: Hao Zhong
>
> The documentation is out of date. Some code references are broken:
> 1. 
> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FSDataInputStream.html
> "All Implemented Interfaces:
> Closeable, DataInput, *org.apache.hadoop.fs.ByteBufferReadable*, 
> *org.apache.hadoop.fs.HasFileDescriptor*, PositionedReadable, Seekable "
> 2.http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/Cluster.html
> renewDelegationToken(*org.apache.hadoop.security.token.Token*
>  token)
>   Deprecated. Use Token.renew(*org.apache.hadoop.conf.Configuration*) 
> instead
> 3.http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/JobConf.html
> "Use MRAsyncDiskService.moveAndDeleteAllVolumes instead. "
> I cannot find the MRAsyncDiskService class in the documentation of 2.0.3. 
> 4.http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/join/CompositeRecordReader.html
>  "protected 
> *org.apache.hadoop.mapred.join.CompositeRecordReader.JoinCollector*   jc"
> Please globally search JoinCollector. It is deleted, but mentioned many times 
> in the current documentation.
> 5.http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/OutputCommitter.html
> "abortJob(JobContext context, *org.apache.hadoop.mapreduce.JobStatus.State 
> runState*)"  
> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/Job.html
> "public *org.apache.hadoop.mapreduce.JobStatus.State* getJobState()"
> 4.http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/SequenceFileOutputFormat.html
> " static *org.apache.hadoop.io.SequenceFile.CompressionType* 
> getOutputCompressionType"
> " static *org.apache.hadoop.io.SequenceFile.Reader[]* getReaders"
> 5.http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/TaskCompletionEvent.html
> "Returns enum Status.SUCESS or Status.FAILURE."->Status.SUCCEEDED? 
> 6.http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/Job.html
> " static *org.apache.hadoop.mapreduce.Job.TaskStatusFilter*   
> getTaskOutputFilter"
> "  org.apache.hadoop.mapreduce.TaskReport[]   getTaskReports(TaskType type) "
> 7.http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/Reducer.html
> "cleanup(*org.apache.hadoop.mapreduce.Reducer.Context* context) "
> 8.http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/SequenceFileOutputFormat.html
>  "static *org.apache.hadoop.io.SequenceFile.CompressionType*  
> getOutputCompressionType(JobConf conf)
>   Get the *SequenceFile.CompressionType* for the output SequenceFile."
> " static *org.apache.hadoop.io.SequenceFile.Reader[]* getReaders" 
> 9.http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/lib/partition/InputSampler.html
> "writePartitionFile(Job job, 
> *org.apache.hadoop.mapreduce.lib.partition.InputSampler.Sampler* 
> sampler) "
> 10.http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/lib/partition/TotalOrderPartitioner.html
> contain JobContextImpl.getNumReduceTasks() - 1 keys. 
> The JobContextImpl class is already deleted.
> 11. 
> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/OutputCommitter.html
> "Note that this is invoked for jobs with final runstate as 
> JobStatus.State.FAILED or JobStatus.State.KILLED."->JobStatus.FAILED 
> JobStatus.KILLED?
> 12.http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/TaskAttemptContext.html
> "All Superinterfaces:
> JobContext, *org.apache.hadoop.mapreduce.MRJobConfig*, Progressable, 
> TaskAttemptContext "
> 13.http://hadoop.apache.org/docs/current/api/org/apache/hadoop/metrics/file/FileContext.html
> "All Implemented Interfaces:
> *org.apache.hadoop.metrics.MetricsContext*"
> 

[jira] [Commented] (HADOOP-14665) Support # hash prefix comment lines in auth_to_local mapping rules

2017-11-10 Thread Andras Bokor (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247549#comment-16247549
 ] 

Andras Bokor commented on HADOOP-14665:
---

I am not sure I understand correctly. auth_to_local rules are used in the 
core-site.xml file. So it is an XML file; that is why XML comments work.

> Support # hash prefix comment lines in auth_to_local mapping rules
> --
>
> Key: HADOOP-14665
> URL: https://issues.apache.org/jira/browse/HADOOP-14665
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.7.3
> Environment: HDP 2.6.0 + Kerberos
>Reporter: Hari Sekhon
>
> Request to add support for # hash-prefixed comment lines in Hadoop's 
> auth_to_local mapping rules, so I can comment on what rules I've added and 
> why, inline with the rules as with code (useful when supporting 
> multi-directory mappings).
> It should be fairly easy to implement: just strip every line from # to the 
> end, trim whitespace, and then exclude all blank / whitespace-only lines; I do 
> this in tools I write all the time.






[jira] [Comment Edited] (HADOOP-14665) Support # hash prefix comment lines in auth_to_local mapping rules

2017-11-10 Thread Hari Sekhon (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247502#comment-16247502
 ] 

Hari Sekhon edited comment on HADOOP-14665 at 11/10/17 1:39 PM:


OK, I didn't see that documented in the top few Google hits; maybe it's just not 
widely known, and since it looks like a flat Unix config, hash comments were 
expected.

I still think supporting hash comments would be more intuitive, as in management 
consoles all the files that have free-form configuration boxes are in Unix 
config file format and use hash comments.


was (Author: harisekhon):
OK, I didn't see that documented in the top few Google hits, and since it looks 
like a flat Unix config, hash comments were expected.

I still think supporting hash comments would be more intuitive, as in management 
consoles all the files that have free-form configuration boxes are in Unix 
config file format and use hash comments.

> Support # hash prefix comment lines in auth_to_local mapping rules
> --
>
> Key: HADOOP-14665
> URL: https://issues.apache.org/jira/browse/HADOOP-14665
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.7.3
> Environment: HDP 2.6.0 + Kerberos
>Reporter: Hari Sekhon
>
> Request to add support for # hash prefixed comment lines in Hadoop's 
> auth_to_local mappings rules so I can comment what rules I've added and why 
> inline to the rules like with code (useful when supporting multi-directory 
> mappings).
> It should be fairly easy to implement, just string strip all lines from # to 
> end, trim whitespace and then exclude all blank / whitespace lines, I do this 
> in tools I write all the time.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-14665) Support # hash prefix comment lines in auth_to_local mapping rules

2017-11-10 Thread Hari Sekhon (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247502#comment-16247502
 ] 

Hari Sekhon edited comment on HADOOP-14665 at 11/10/17 1:39 PM:


OK, I didn't see that documented in the top few Google hits, and since it looks 
like a flat Unix config, hash comments were expected.

I still think supporting hash comments would be more intuitive, as in management 
consoles all the files that have free-form configuration boxes are in Unix 
config file format and use hash comments.


was (Author: harisekhon):
OK, I didn't see that documented in the top few Google hits, and since it looks 
like a flat Unix config, hash comments were expected.

I still think supporting hash comments would be more intuitive, as in management 
consoles all the files that have free-form configuration boxes are in Unix 
file format and take hashes.

> Support # hash prefix comment lines in auth_to_local mapping rules
> --
>
> Key: HADOOP-14665
> URL: https://issues.apache.org/jira/browse/HADOOP-14665
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.7.3
> Environment: HDP 2.6.0 + Kerberos
>Reporter: Hari Sekhon
>
> Request to add support for # hash prefixed comment lines in Hadoop's 
> auth_to_local mappings rules so I can comment what rules I've added and why 
> inline to the rules like with code (useful when supporting multi-directory 
> mappings).
> It should be fairly easy to implement, just string strip all lines from # to 
> end, trim whitespace and then exclude all blank / whitespace lines, I do this 
> in tools I write all the time.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14665) Support # hash prefix comment lines in auth_to_local mapping rules

2017-11-10 Thread Hari Sekhon (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247502#comment-16247502
 ] 

Hari Sekhon commented on HADOOP-14665:
--

OK, I didn't see that documented in the top few Google hits, and since it looks 
like a flat Unix config, hash comments were expected.

I still think supporting hash comments would be more intuitive, as in management 
consoles all the files that have free-form configuration boxes are in Unix 
file format and take hashes.

> Support # hash prefix comment lines in auth_to_local mapping rules
> --
>
> Key: HADOOP-14665
> URL: https://issues.apache.org/jira/browse/HADOOP-14665
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.7.3
> Environment: HDP 2.6.0 + Kerberos
>Reporter: Hari Sekhon
>
> Request to add support for # hash prefixed comment lines in Hadoop's 
> auth_to_local mappings rules so I can comment what rules I've added and why 
> inline to the rules like with code (useful when supporting multi-directory 
> mappings).
> It should be fairly easy to implement, just string strip all lines from # to 
> end, trim whitespace and then exclude all blank / whitespace lines, I do this 
> in tools I write all the time.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14665) Support # hash prefix comment lines in auth_to_local mapping rules

2017-11-10 Thread Andras Bokor (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247494#comment-16247494
 ] 

Andras Bokor commented on HADOOP-14665:
---

It does not seem like a missing feature. You can use standard XML comments; 
there is no need to implement custom comment logic.
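
For illustration, a minimal sketch of annotating rules in core-site.xml with a 
standard XML comment (the realm and rule here are made-up examples, not from 
this issue):
{code:xml}
<property>
  <name>hadoop.security.auth_to_local</name>
  <!-- Made-up example: the first rule maps NameNode service principals
       (nn/host@EXAMPLE.COM) to the local hdfs user; DEFAULT handles the rest. -->
  <value>
    RULE:[2:$1@$0](nn@.*EXAMPLE.COM)s/.*/hdfs/
    DEFAULT
  </value>
</property>
{code}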

> Support # hash prefix comment lines in auth_to_local mapping rules
> --
>
> Key: HADOOP-14665
> URL: https://issues.apache.org/jira/browse/HADOOP-14665
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.7.3
> Environment: HDP 2.6.0 + Kerberos
>Reporter: Hari Sekhon
>
> Request to add support for # hash prefixed comment lines in Hadoop's 
> auth_to_local mappings rules so I can comment what rules I've added and why 
> inline to the rules like with code (useful when supporting multi-directory 
> mappings).
> It should be fairly easy to implement, just string strip all lines from # to 
> end, trim whitespace and then exclude all blank / whitespace lines, I do this 
> in tools I write all the time.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-10538) NumberFormatException happened when hadoop 1.2.1 running on Cygwin

2017-11-10 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor resolved HADOOP-10538.
---
Resolution: Won't Fix

It's obsolete. 1.x is not supported. 

> NumberFormatException happened  when hadoop 1.2.1 running on Cygwin
> ---
>
> Key: HADOOP-10538
> URL: https://issues.apache.org/jira/browse/HADOOP-10538
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 1.2.1
> Environment: OS: windows 7 / Cygwin
>Reporter: peter xie
>
> The TaskTracker always failed to startup when it running on Cygwin. And the 
> error info logged in xxx-tasktracker-.log is :
> 2014-04-21 22:13:51,439 DEBUG org.apache.hadoop.mapred.TaskRunner: putting 
> jobToken file name into environment 
> D:/hadoop/mapred/local/taskTracker/pxie/jobcache/job_201404212205_0001/jobToken
> 2014-04-21 22:13:51,439 INFO org.apache.hadoop.mapred.JvmManager: Killing 
> JVM: jvm_201404212205_0001_m_1895177159
> 2014-04-21 22:13:51,439 WARN org.apache.hadoop.mapred.TaskRunner: 
> attempt_201404212205_0001_m_00_0 : Child Error
> java.lang.NumberFormatException: For input string: ""
>   at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>   at java.lang.Integer.parseInt(Integer.java:504)
>   at java.lang.Integer.parseInt(Integer.java:527)
>   at 
> org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.kill(JvmManager.java:552)
>   at 
> org.apache.hadoop.mapred.JvmManager$JvmManagerForType.killJvmRunner(JvmManager.java:314)
>   at 
> org.apache.hadoop.mapred.JvmManager$JvmManagerForType.reapJvm(JvmManager.java:378)
>   at 
> org.apache.hadoop.mapred.JvmManager$JvmManagerForType.access$000(JvmManager.java:189)
>   at org.apache.hadoop.mapred.JvmManager.launchJvm(JvmManager.java:122)
>   at 
> org.apache.hadoop.mapred.TaskRunner.launchJvmAndWait(TaskRunner.java:292)
>   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:251)
> 2014-04-21 22:13:51,511 DEBUG org.apache.hadoop.ipc.Server: IPC Server 
> listener on 59983: disconnecting client 127.0.0.1:60154. Number of active 
> connections: 1
> 2014-04-21 22:13:51,531 WARN org.apache.hadoop.fs.FileUtil: Failed to set 
> permissions of path: 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14098) AliyunOSS: improve the performance of object metadata operation

2017-11-10 Thread Genmao Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247399#comment-16247399
 ] 

Genmao Yu commented on HADOOP-14098:


h3. AliyunOSS FS + metastore POC

1. metastore: a common module with a {{MetadataStore}} interface to manage the 
metadata of OSS objects. Other filesystems or object stores can manage their 
metadata through their own implementations of the metastore (see the sketch 
below this list).
2. In this POC, I implemented one common metastore based on HDFS, i.e. we use 
HDFS to manage the metadata of the object store. One advantage is semantic 
consistency: we can use the HDFS-backed metastore to manage metadata just as we 
would use HDFS itself. Another advantage is that this HDFS-based metastore is 
platform independent, i.e. we can use it to manage any object store's metadata.
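
As a rough illustration, a minimal sketch of what such a {{MetadataStore}} 
interface could look like; the method names are illustrative (loosely modelled 
on the S3Guard design), not the POC's actual code:
{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

interface MetadataStore {
  FileStatus get(Path path) throws IOException;             // metadata lookup without an OSS HEAD request
  FileStatus[] listChildren(Path path) throws IOException;  // directory listing served from the store
  void put(FileStatus status) throws IOException;           // record the result of create/rename
  void delete(Path path) throws IOException;                // keep the store in sync on deletes
}
{code}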

h4. Goals:
  - Test whether the metastore can accelerate the performance of metadata 
operations.

h4. Non-Goals:
  - This test is not a pressure test of the cluster. The test results do not 
indicate any platform or product performance.

||operation||unit||oss fs + metastore||oss fs||hdfs||
|CreateOp| | | | |
| |operations/sec|0.243|0.252|1.84|
| |successes/sec|0.152|0.161|0.324|
|DeleteOp| | | | |
| |operations/sec|52.479|43.499|328.515|
| |successes/sec|19.438|16.947|162.943|
|ListOp| | | | |
| |directory entries/sec|303.461|51.949|386.441|
| |operations/sec|250.05|37.191|235.405|
| |successes/sec|247.85|37.043|233.898|
|MkdirOp| | | | |
| |operations/sec|227.853|73.738|225.388|
| |successes/sec|227.853|73.738|225.388|
|ReadOp| | | | |
| |operations/sec|2.039|1.606|47.426|
| |successes/sec|0.127|0.132|6.222|
|RenameOp| | | | |
| |operations/sec|2.542|2.986|578.972|
| |successes/sec|0.578|0.507|163.965|
|AppendOp| |No Support|No Support| |

With the metastore, we can see:
- {{ListOp}} and {{MkdirOp}} have performance similar to HDFS.
- Others like {{RenameOp}}, {{ReadOp}} and {{DeleteOp}} are still much worse 
than HDFS.

> AliyunOSS: improve the performance of object metadata operation
> ---
>
> Key: HADOOP-14098
> URL: https://issues.apache.org/jira/browse/HADOOP-14098
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs
>Affects Versions: 3.0.0-alpha2
>Reporter: Genmao Yu
>Assignee: Genmao Yu
>
> Open this JIRA to research and address the potential request performance 
> issue.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15027) Improvements for Hadoop read from AliyunOSS

2017-11-10 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247373#comment-16247373
 ] 

Steve Loughran commented on HADOOP-15027:
-

* There's an uber-JIRA to track all Aliyun OSS issues; moved this under it: 
HADOOP-13377
* and added you to the list of developers; assigned the work to you 
* Make sure that [~unclegen] reviews, tests & is happy with this: he's the 
keeper of the module right now
* All patches for the object stores require the submitter to say which endpoint 
they ran all the tests against. This ensures that you are confident you haven't 
broken anything before anyone else has a go.

Looking at the patch, I see what you are trying to do: speed up reads by 
pre-emptively fetching data ahead of the client code, so the client can keep 
consuming data while another thread is doing the slow network IO.

I see the benefits of this on a sequential read from the start to the end of a 
file, but the common high-performance column formats, ORC & Parquet, don't 
follow that IO pattern. Instead the sequence is roughly:
1. open
2. seek(EOF - some offset)
3. read(footer)
4. seek(first column + length)
5. read(some summary data)
6. either seek(first column), read(column, length), process; or seek(next 
column of that type)

or something similar: aggressive random IO, where the existing data needs to be 
discarded. If the HTTPS connection needs to be aborted, that's very expensive, 
so S3A and wasb now have random IO modes where a readFully(position, length) 
call issues a GET of position + max(min-read-size, length) bytes, and forward 
seeks discard data wherever possible.

I would focus on the performance of those data formats rather than sequential 
IO, which primarily gets used for .gz, .csv and Avro ingest before the 
Parquet/ORC data is generated & used for all the other queries (and distcp too, 
of course).

Take a look at HADOOP-13203 for the S3A work there, where we added a switch 
between sequential and random IO; added tests for random IO perf.

HADOOP-14535 did something better for wasb: it starts off sequential, but as 
soon as you do a backwards seek (operation 4 in the list above), it says "this 
is columnar data" and switches to random IO. There's a patch pending for S3 to 
do that too, as it makes it easy to mix sequential data sources with random 
ones.
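
As a rough illustration of that switch (a sketch with assumed names, not the 
wasb or S3A code):
{code:java}
// Start in sequential mode; flip to random IO on the first backwards seek.
class AdaptiveReadSketch {
  private boolean randomIO = false;  // false: sequential read-ahead is worthwhile
  private long nextReadPos = 0;      // where the next sequential read would start

  synchronized void seek(long targetPos) {
    if (targetPos < nextReadPos) {
      // A backwards seek is the signature of columnar formats (ORC/Parquet).
      randomIO = true;
    }
    nextReadPos = targetPos;
  }

  // How many bytes the next GET should ask for.
  long bytesToRequest(long length, long minReadSize, long remainingInFile) {
    // Random mode: short, bounded GETs so aborting the connection stays cheap.
    // Sequential mode: read through to the end of the file.
    return randomIO ? Math.max(minReadSize, length) : remainingInFile;
  }
}
{code}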

I would start with that, then worry about how best to prefetch data, which 
probably only matters in sequential reads.

Having a quick look at your code 

* The thread pool should be for the FileSystem itself, not per input stream. 
You can have many open input streams in a single process (especially with 
Spark and Hive); creating a thread pool for each one is slow and expensive. 
(See the first sketch below.)

* The retry logic needs to be reworked, because it just does retry-without-delay 
and retries on every exception. There are some failures (UnknownHostException, 
NoRouteToHostException, auth failures, any RuntimeException) which aren't going 
to be recoverable. Those we can recover from need some sleep & backoff policy. 
The good news: {{org.apache.hadoop.io.retry.RetryPolicy}} handles all this, 
with {{RetryPolicies.retryByException}} letting you declare a map of which 
exceptions to fail fast on and which to retry. Have a look at where other 
code is using it. (See the second sketch below.)
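
A rough sketch of the thread-pool point, with illustrative names (the classes, 
config key and pool size are assumptions, not the patch's code): the pool is 
created once per FileSystem instance and shared by every stream it opens.
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class OssFileSystemSketch {                        // stands in for AliyunOSSFileSystem
  private ExecutorService prefetchPool;            // one pool per FileSystem instance

  void initialize(/* URI uri, Configuration conf */) {
    // The size would come from a config key, e.g. "fs.oss.threads.max" (illustrative).
    prefetchPool = Executors.newFixedThreadPool(20);
  }

  OssInputStreamSketch open(/* Path path */) {
    return new OssInputStreamSketch(prefetchPool); // all streams share the pool
  }

  void close() {
    prefetchPool.shutdownNow();                    // the pool dies with the FileSystem
  }
}

class OssInputStreamSketch {
  private final ExecutorService prefetchPool;      // borrowed, never created per stream
  OssInputStreamSketch(ExecutorService pool) { this.prefetchPool = pool; }
}
{code}

And a minimal sketch of the retry point, wiring a read through 
{{RetryPolicies.retryByException}}; readOnce() is a hypothetical 
single-attempt read, and the 4-retry / 100 ms backoff is an arbitrary choice:
{code:java}
import java.io.IOException;
import java.net.NoRouteToHostException;
import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.io.retry.RetryPolicies;
import org.apache.hadoop.io.retry.RetryPolicy;

class RetryingReadSketch {
  private static RetryPolicy buildPolicy() {
    // Default: exponential backoff. Unrecoverable network failures fail fast.
    RetryPolicy backoff =
        RetryPolicies.exponentialBackoffRetry(4, 100, TimeUnit.MILLISECONDS);
    Map<Class<? extends Exception>, RetryPolicy> map = new HashMap<>();
    map.put(UnknownHostException.class, RetryPolicies.TRY_ONCE_THEN_FAIL);
    map.put(NoRouteToHostException.class, RetryPolicies.TRY_ONCE_THEN_FAIL);
    return RetryPolicies.retryByException(backoff, map);
  }

  int readWithRetries() throws Exception {
    RetryPolicy policy = buildPolicy();
    int retries = 0;
    while (true) {
      try {
        return readOnce();  // RuntimeExceptions are not caught: they propagate and fail fast
      } catch (IOException e) {
        RetryPolicy.RetryAction action = policy.shouldRetry(e, retries++, 0, true);
        if (action.action != RetryPolicy.RetryAction.RetryDecision.RETRY) {
          throw e;          // fail fast, or retries exhausted
        }
        Thread.sleep(action.delayMillis);  // sleep & back off before the next attempt
      }
    }
  }

  private int readOnce() throws IOException { return -1; }  // placeholder single GET
}
{code}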


I like the look of the overall idea, and know that read performance matters. 
But focus on seek() first. Talk to [~unclegen] and see what he suggests.

> Improvements for Hadoop read from AliyunOSS
> ---
>
> Key: HADOOP-15027
> URL: https://issues.apache.org/jira/browse/HADOOP-15027
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/oss
>Affects Versions: 3.0.0
>Reporter: wujinhu
>Assignee: wujinhu
> Attachments: HADOOP-15027.001.patch
>
>
> Currently, read performance is poor when Hadoop reads from AliyunOSS. It 
> needs about 1min to read 1GB from OSS.
> Class AliyunOSSInputStream uses single thread to read data from AliyunOSS,  
> so we can refactor this by using multi-thread pre read to improve this.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-10768) Optimize Hadoop RPC encryption performance

2017-11-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247371#comment-16247371
 ] 

Hadoop QA commented on HADOOP-10768:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
46s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 28s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
28s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 11m 
48s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 11m 48s{color} 
| {color:red} root generated 1 new + 1234 unchanged - 0 fixed = 1235 total (was 
1234) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 15s{color} | {color:orange} root: The patch generated 82 new + 873 unchanged 
- 7 fixed = 955 total (was 880) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
58s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 20 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 48s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
14s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
25s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 95m 19s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}200m  7s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Unreaped Processes | hadoop-hdfs:5 |
| Failed junit tests | hadoop.hdfs.TestFileCreationDelete |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
|   | hadoop.hdfs.TestErasureCodeBenchmarkThroughput |
|   | 

[jira] [Assigned] (HADOOP-15027) Improvements for Hadoop read from AliyunOSS

2017-11-10 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-15027:
---

Assignee: wujinhu

> Improvements for Hadoop read from AliyunOSS
> ---
>
> Key: HADOOP-15027
> URL: https://issues.apache.org/jira/browse/HADOOP-15027
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/oss
>Affects Versions: 3.0.0
>Reporter: wujinhu
>Assignee: wujinhu
> Attachments: HADOOP-15027.001.patch
>
>
> Currently, read performance is poor when Hadoop reads from AliyunOSS. It 
> needs about 1min to read 1GB from OSS.
> Class AliyunOSSInputStream uses single thread to read data from AliyunOSS,  
> so we can refactor this by using multi-thread pre read to improve this.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15027) Improvements for Hadoop read from AliyunOSS

2017-11-10 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15027:

Affects Version/s: 3.0.0

> Improvements for Hadoop read from AliyunOSS
> ---
>
> Key: HADOOP-15027
> URL: https://issues.apache.org/jira/browse/HADOOP-15027
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/oss
>Affects Versions: 3.0.0
>Reporter: wujinhu
> Attachments: HADOOP-15027.001.patch
>
>
> Currently, read performance is poor when Hadoop reads from AliyunOSS. It 
> needs about 1min to read 1GB from OSS.
> Class AliyunOSSInputStream uses single thread to read data from AliyunOSS,  
> so we can refactor this by using multi-thread pre read to improve this.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15027) Improvements for Hadoop read from AliyunOSS

2017-11-10 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15027:

Issue Type: Sub-task  (was: Improvement)
Parent: HADOOP-13377

> Improvements for Hadoop read from AliyunOSS
> ---
>
> Key: HADOOP-15027
> URL: https://issues.apache.org/jira/browse/HADOOP-15027
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/oss
>Reporter: wujinhu
> Attachments: HADOOP-15027.001.patch
>
>
> Currently, read performance is poor when Hadoop reads from AliyunOSS. It 
> needs about 1min to read 1GB from OSS.
> Class AliyunOSSInputStream uses single thread to read data from AliyunOSS,  
> so we can refactor this by using multi-thread pre read to improve this.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-9474) fs -put command doesn't work if selecting certain files from a local folder

2017-11-10 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor resolved HADOOP-9474.
--
Resolution: Not A Bug

> fs -put command doesn't work if selecting certain files from a local folder
> ---
>
> Key: HADOOP-9474
> URL: https://issues.apache.org/jira/browse/HADOOP-9474
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 1.1.2
>Reporter: Glen Mazza
>
> The following four commands (a) - (d) were run sequentially.  From (a) - (c) 
> HDFS folder "inputABC" does not yet exist.
> (a) and (b) are improperly refusing to put the files from conf/*.xml into 
> inputABC because folder inputABC doesn't yet exist.  However, in (c) when I 
> make the same request except with just "conf" (and not "conf/*.xml") HDFS 
> will correctly create inputABC and copy the folders over.  We see that 
> inputABC now exists in (d) when I subsequently try to copy the conf/*.xml 
> folders, it correctly complains that the files already exist there.
> IOW, I can put "conf" into a nonexisting HDFS folder and fs will create the 
> folder for me, but I can't do the same with "conf/*.xml" -- but the latter 
> should work equally as well.  The problem appears to be in 
> org.apache.hadoop.fs.FileUtil, line 176, which properly routes "conf" to have 
> its files copied but will have "conf/*.xml" subsequently return a 
> "nonexisting folder" error.
> {noformat}
> a) gmazza@gmazza-work:/media/work1/hadoop-1.1.2$ bin/hadoop fs -put 
> conf/*.xml inputABC
> put: `inputABC': specified destination directory doest not exist
> b) gmazza@gmazza-work:/media/work1/hadoop-1.1.2$ bin/hadoop fs -put 
> conf/*.xml inputABC
> put: `inputABC': specified destination directory doest not exist
> c) gmazza@gmazza-work:/media/work1/hadoop-1.1.2$ bin/hadoop fs -put conf 
> inputABC
> d) gmazza@gmazza-work:/media/work1/hadoop-1.1.2$ bin/hadoop fs -put 
> conf/*.xml inputABC
> put: Target inputABC/capacity-scheduler.xml already exists
> Target inputABC/core-site.xml already exists
> Target inputABC/fair-scheduler.xml already exists
> Target inputABC/hadoop-policy.xml already exists
> Target inputABC/hdfs-site.xml already exists
> Target inputABC/mapred-queue-acls.xml already exists
> Target inputABC/mapred-site.xml already exists
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-9474) fs -put command doesn't work if selecting certain files from a local folder

2017-11-10 Thread Andras Bokor (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247344#comment-16247344
 ] 

Andras Bokor commented on HADOOP-9474:
--

I don't think it is an issue. This behavior is in sync with the Unix way: you 
cannot copy files to a non-existing directory, but you can copy a directory to 
another path:
{code}$ cp mydir/* fakedir
usage: cp [-R [-H | -L | -P]] [-fi | -n] [-apvX] source_file target_file
   cp [-R [-H | -L | -P]] [-fi | -n] [-apvX] source_file ... 
target_directory
$ cp mydir/* existingdir
$ ls existingdir/
1   2
$ cp -r mydir/ fakedir; ls fakedir
1   2{code}


> fs -put command doesn't work if selecting certain files from a local folder
> ---
>
> Key: HADOOP-9474
> URL: https://issues.apache.org/jira/browse/HADOOP-9474
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 1.1.2
>Reporter: Glen Mazza
>
> The following four commands (a) - (d) were run sequentially.  From (a) - (c) 
> HDFS folder "inputABC" does not yet exist.
> (a) and (b) are improperly refusing to put the files from conf/*.xml into 
> inputABC because folder inputABC doesn't yet exist.  However, in (c) when I 
> make the same request except with just "conf" (and not "conf/*.xml") HDFS 
> will correctly create inputABC and copy the folders over.  We see that 
> inputABC now exists in (d) when I subsequently try to copy the conf/*.xml 
> folders, it correctly complains that the files already exist there.
> IOW, I can put "conf" into a nonexisting HDFS folder and fs will create the 
> folder for me, but I can't do the same with "conf/*.xml" -- but the latter 
> should work equally as well.  The problem appears to be in 
> org.apache.hadoop.fs.FileUtil, line 176, which properly routes "conf" to have 
> its files copied but will have "conf/*.xml" subsequently return a 
> "nonexisting folder" error.
> {noformat}
> a) gmazza@gmazza-work:/media/work1/hadoop-1.1.2$ bin/hadoop fs -put 
> conf/*.xml inputABC
> put: `inputABC': specified destination directory doest not exist
> b) gmazza@gmazza-work:/media/work1/hadoop-1.1.2$ bin/hadoop fs -put 
> conf/*.xml inputABC
> put: `inputABC': specified destination directory doest not exist
> c) gmazza@gmazza-work:/media/work1/hadoop-1.1.2$ bin/hadoop fs -put conf 
> inputABC
> d) gmazza@gmazza-work:/media/work1/hadoop-1.1.2$ bin/hadoop fs -put 
> conf/*.xml inputABC
> put: Target inputABC/capacity-scheduler.xml already exists
> Target inputABC/core-site.xml already exists
> Target inputABC/fair-scheduler.xml already exists
> Target inputABC/hadoop-policy.xml already exists
> Target inputABC/hdfs-site.xml already exists
> Target inputABC/mapred-queue-acls.xml already exists
> Target inputABC/mapred-site.xml already exists
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-6380) Deprecate hadoop fs -dus command.

2017-11-10 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor resolved HADOOP-6380.
--
Resolution: Duplicate

> Deprecate hadoop fs -dus command.
> -
>
> Key: HADOOP-6380
> URL: https://issues.apache.org/jira/browse/HADOOP-6380
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ravi Phulari
>
> We need to remove *hadoop fs -dus* command whose functionality is duplicated 
> by *hadoop fs -du -s*.  
> {noformat}
> [rphulari@lm]> bin/hdfs dfs -du -s 
> 48902  hdfs://localhost:9000/user/rphulari
> [rphulari@lm]> bin/hdfs dfs -dus 
> 48902  hdfs://localhost:9000/user/rphulari
> [rphulari@lm]> 
> [rphulari@lm]> bin/hdfs dfs -dus -h
> 47.8k  hdfs://localhost:9000/user/rphulari
> [rphulari@lm]> bin/hdfs dfs -du -s -h
> 47.8k  hdfs://localhost:9000/user/rphulari
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-6380) Deprecate hadoop fs -dus command.

2017-11-10 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor reopened HADOOP-6380:
--

> Deprecate hadoop fs -dus command.
> -
>
> Key: HADOOP-6380
> URL: https://issues.apache.org/jira/browse/HADOOP-6380
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ravi Phulari
>
> We need to remove *hadoop fs -dus* command whose functionality is duplicated 
> by *hadoop fs -du -s*.  
> {noformat}
> [rphulari@lm]> bin/hdfs dfs -du -s 
> 48902  hdfs://localhost:9000/user/rphulari
> [rphulari@lm]> bin/hdfs dfs -dus 
> 48902  hdfs://localhost:9000/user/rphulari
> [rphulari@lm]> 
> [rphulari@lm]> bin/hdfs dfs -dus -h
> 47.8k  hdfs://localhost:9000/user/rphulari
> [rphulari@lm]> bin/hdfs dfs -du -s -h
> 47.8k  hdfs://localhost:9000/user/rphulari
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-6380) Deprecate hadoop fs -dus command.

2017-11-10 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor resolved HADOOP-6380.
--
Resolution: Won't Fix

It's already deprecated:
{code}bin/hdfs dfs -dus /
dus: DEPRECATED: Please use 'du -s' instead.
2017-11-07 13:56:08,914 WARN util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
0  0  /{code}

> Deprecate hadoop fs -dus command.
> -
>
> Key: HADOOP-6380
> URL: https://issues.apache.org/jira/browse/HADOOP-6380
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ravi Phulari
>
> We need to remove *hadoop fs -dus* command whose functionality is duplicated 
> by *hadoop fs -du -s*.  
> {noformat}
> [rphulari@lm]> bin/hdfs dfs -du -s 
> 48902  hdfs://localhost:9000/user/rphulari
> [rphulari@lm]> bin/hdfs dfs -dus 
> 48902  hdfs://localhost:9000/user/rphulari
> [rphulari@lm]> 
> [rphulari@lm]> bin/hdfs dfs -dus -h
> 47.8k  hdfs://localhost:9000/user/rphulari
> [rphulari@lm]> bin/hdfs dfs -du -s -h
> 47.8k  hdfs://localhost:9000/user/rphulari
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15024) AliyunOSS: Provide oss client side Hadoop version information to oss server

2017-11-10 Thread Genmao Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247308#comment-16247308
 ] 

Genmao Yu commented on HADOOP-15024:


OK, I will post a test result based on your new patch.

> AliyunOSS: Provide oss client side Hadoop version information to oss server
> ---
>
> Key: HADOOP-15024
> URL: https://issues.apache.org/jira/browse/HADOOP-15024
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/oss
>Reporter: SammiChen
>Assignee: SammiChen
> Attachments: HADOOP-15024.000.patch
>
>
> Provide oss client side Hadoop version to oss server, to help build access 
> statistic metrics. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14964) AliyunOSS: backport Aliyun OSS module to branch-2

2017-11-10 Thread Genmao Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247303#comment-16247303
 ] 

Genmao Yu commented on HADOOP-14964:


Sorry for the late response. LGTM and +1. 
Maybe the best way to fix the {{httpclient}} version conflict issue is to 
downgrade the {{httpclient}} in the OSS SDK. Let us discuss offline and wait 
for feedback from the Aliyun OSS team. 






> AliyunOSS: backport Aliyun OSS module to branch-2
> -
>
> Key: HADOOP-14964
> URL: https://issues.apache.org/jira/browse/HADOOP-14964
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/oss
>Reporter: Genmao Yu
>Assignee: SammiChen
> Attachments: HADOOP-14964-branch-2.000.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15024) AliyunOSS: Provide oss client side Hadoop version information to oss server

2017-11-10 Thread SammiChen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247299#comment-16247299
 ] 

SammiChen commented on HADOOP-15024:


Thanks [~ste...@apache.org] and [~uncleGen] for reviewing the patch. I will 
upload a new patch later to add the prefix. It would be very helpful if you 
could test it offline in Aliyun OSS's environment, [~uncleGen]

> AliyunOSS: Provide oss client side Hadoop version information to oss server
> ---
>
> Key: HADOOP-15024
> URL: https://issues.apache.org/jira/browse/HADOOP-15024
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/oss
>Reporter: SammiChen
>Assignee: SammiChen
> Attachments: HADOOP-15024.000.patch
>
>
> Provide oss client side Hadoop version to oss server, to help build access 
> statistic metrics. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-15029) AliyunOSS: Add User-Agent header in HTTP requests to the OSS server

2017-11-10 Thread Genmao Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247276#comment-16247276
 ] 

Genmao Yu edited comment on HADOOP-15029 at 11/10/17 10:02 AM:
---

Sorry for making a duplicate JIRA; just closing this.


was (Author: unclegen):
duplicated, just close this.

> AliyunOSS:  Add User-Agent header in HTTP requests to the OSS server
> 
>
> Key: HADOOP-15029
> URL: https://issues.apache.org/jira/browse/HADOOP-15029
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/oss
>Affects Versions: 3.0.0-beta1
>Reporter: Genmao Yu
>Assignee: Genmao Yu
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15024) AliyunOSS: Provide oss client side Hadoop version information to oss server

2017-11-10 Thread Genmao Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247288#comment-16247288
 ] 

Genmao Yu commented on HADOOP-15024:


+1 to add a config like "fs.oss.user.agent.prefix":
{code}
USER_AGENT = USER_AGENT_PREFIX + ", " + <Hadoop version information>
{code}
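
For illustration, a hedged sketch of assembling that string (the helper class 
and the empty-prefix handling are assumptions, not the patch):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.VersionInfo;

final class UserAgentSketch {
  static String buildUserAgent(Configuration conf) {
    // Config key from the discussion above; an empty prefix yields just the Hadoop part.
    String prefix = conf.getTrimmed("fs.oss.user.agent.prefix", "");
    String hadoopPart = "Hadoop " + VersionInfo.getVersion();
    return prefix.isEmpty() ? hadoopPart : prefix + ", " + hadoopPart;
  }
}
{code}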

[~Sammi] Maybe I can help you run the unit tests against 
"oss-cn-shanghai.aliyuncs.com" offline.

> AliyunOSS: Provide oss client side Hadoop version information to oss server
> ---
>
> Key: HADOOP-15024
> URL: https://issues.apache.org/jira/browse/HADOOP-15024
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/oss
>Reporter: SammiChen
>Assignee: SammiChen
> Attachments: HADOOP-15024.000.patch
>
>
> Provide oss client side Hadoop version to oss server, to help build access 
> statistic metrics. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15029) AliyunOSS: Add User-Agent header in HTTP requests to the OSS server

2017-11-10 Thread Genmao Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247276#comment-16247276
 ] 

Genmao Yu commented on HADOOP-15029:


duplicated, just close this.

> AliyunOSS:  Add User-Agent header in HTTP requests to the OSS server
> 
>
> Key: HADOOP-15029
> URL: https://issues.apache.org/jira/browse/HADOOP-15029
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/oss
>Affects Versions: 3.0.0-beta1
>Reporter: Genmao Yu
>Assignee: Genmao Yu
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-15029) AliyunOSS: Add User-Agent header in HTTP requests to the OSS server

2017-11-10 Thread Genmao Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Genmao Yu resolved HADOOP-15029.

Resolution: Duplicate

> AliyunOSS:  Add User-Agent header in HTTP requests to the OSS server
> 
>
> Key: HADOOP-15029
> URL: https://issues.apache.org/jira/browse/HADOOP-15029
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/oss
>Affects Versions: 3.0.0-beta1
>Reporter: Genmao Yu
>Assignee: Genmao Yu
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15029) AliyunOSS: Add User-Agent header in HTTP requests to the OSS server

2017-11-10 Thread Genmao Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Genmao Yu updated HADOOP-15029:
---
Affects Version/s: 3.0.0-beta1

> AliyunOSS:  Add User-Agent header in HTTP requests to the OSS server
> 
>
> Key: HADOOP-15029
> URL: https://issues.apache.org/jira/browse/HADOOP-15029
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/oss
>Affects Versions: 3.0.0-beta1
>Reporter: Genmao Yu
>Assignee: Genmao Yu
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15029) AliyunOSS: Add User-Agent header in HTTP requests to the OSS server

2017-11-10 Thread Genmao Yu (JIRA)
Genmao Yu created HADOOP-15029:
--

 Summary: AliyunOSS:  Add User-Agent header in HTTP requests to the 
OSS server
 Key: HADOOP-15029
 URL: https://issues.apache.org/jira/browse/HADOOP-15029
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Genmao Yu
Assignee: Genmao Yu






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work started] (HADOOP-15029) AliyunOSS: Add User-Agent header in HTTP requests to the OSS server

2017-11-10 Thread Genmao Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-15029 started by Genmao Yu.
--
> AliyunOSS:  Add User-Agent header in HTTP requests to the OSS server
> 
>
> Key: HADOOP-15029
> URL: https://issues.apache.org/jira/browse/HADOOP-15029
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/oss
>Affects Versions: 3.0.0-beta1
>Reporter: Genmao Yu
>Assignee: Genmao Yu
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15027) Improvements for Hadoop read from AliyunOSS

2017-11-10 Thread wujinhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wujinhu updated HADOOP-15027:
-
Attachment: HADOOP-15027.001.patch

> Improvements for Hadoop read from AliyunOSS
> ---
>
> Key: HADOOP-15027
> URL: https://issues.apache.org/jira/browse/HADOOP-15027
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/oss
>Reporter: wujinhu
> Attachments: HADOOP-15027.001.patch
>
>
> Currently, read performance is poor when Hadoop reads from AliyunOSS. It 
> needs about 1min to read 1GB from OSS.
> Class AliyunOSSInputStream uses single thread to read data from AliyunOSS,  
> so we can refactor this by using multi-thread pre read to improve this.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15028) Got errors while running org.apache.hadoop.io.TestSequenceFileAppend

2017-11-10 Thread bd17kaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bd17kaka updated HADOOP-15028:
--
Description: 
I ran the test case org.apache.hadoop.io.TestSequenceFileAppend in 
branch-2.6.4 and got the following errors:

Running org.apache.hadoop.io.TestSequenceFileAppend
Tests run: 4, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.801 sec <<< 
FAILURE! - in org.apache.hadoop.io.TestSequenceFileAppend
testAppendBlockCompression(org.apache.hadoop.io.TestSequenceFileAppend)  Time 
elapsed: 0.117 sec  <<< ERROR!
java.io.IOException: File is corrupt!
at 
org.apache.hadoop.io.SequenceFile$Reader.readBlock(SequenceFile.java:2179)
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2583)
at 
org.apache.hadoop.io.TestSequenceFileAppend.verifyAll4Values(TestSequenceFileAppend.java:309)
at 
org.apache.hadoop.io.TestSequenceFileAppend.testAppendBlockCompression(TestSequenceFileAppend.java:205)

testAppendSort(org.apache.hadoop.io.TestSequenceFileAppend)  Time elapsed: 
0.013 sec  <<< ERROR!
java.io.IOException: File is corrupt!
at 
org.apache.hadoop.io.SequenceFile$Reader.readBlock(SequenceFile.java:2179)
at 
org.apache.hadoop.io.SequenceFile$Reader.nextRaw(SequenceFile.java:2488)
at 
org.apache.hadoop.io.SequenceFile$Sorter$SortPass.run(SequenceFile.java:2923)
at 
org.apache.hadoop.io.SequenceFile$Sorter.sortPass(SequenceFile.java:2861)
at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2809)
at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2850)
at 
org.apache.hadoop.io.TestSequenceFileAppend.testAppendSort(TestSequenceFileAppend.java:286)

But everything is OK in branch-2.6.5.

The maven command is 'mvn test -Pnative -Dtest=TestSequenceFileAppend'

  was:
I ran the test case org.apache.hadoop.io.TestSequenceFileAppend in 
branch-2.6.4 and got the following errors:

Running org.apache.hadoop.io.TestSequenceFileAppend
Tests run: 4, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.801 sec <<< 
FAILURE! - in org.apache.hadoop.io.TestSequenceFileAppend
testAppendBlockCompression(org.apache.hadoop.io.TestSequenceFileAppend)  Time 
elapsed: 0.117 sec  <<< ERROR!
java.io.IOException: File is corrupt!
at 
org.apache.hadoop.io.SequenceFile$Reader.readBlock(SequenceFile.java:2179)
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2583)
at 
org.apache.hadoop.io.TestSequenceFileAppend.verifyAll4Values(TestSequenceFileAppend.java:309)
at 
org.apache.hadoop.io.TestSequenceFileAppend.testAppendBlockCompression(TestSequenceFileAppend.java:205)

testAppendSort(org.apache.hadoop.io.TestSequenceFileAppend)  Time elapsed: 
0.013 sec  <<< ERROR!
java.io.IOException: File is corrupt!
at 
org.apache.hadoop.io.SequenceFile$Reader.readBlock(SequenceFile.java:2179)
at 
org.apache.hadoop.io.SequenceFile$Reader.nextRaw(SequenceFile.java:2488)
at 
org.apache.hadoop.io.SequenceFile$Sorter$SortPass.run(SequenceFile.java:2923)
at 
org.apache.hadoop.io.SequenceFile$Sorter.sortPass(SequenceFile.java:2861)
at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2809)
at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2850)
at 
org.apache.hadoop.io.TestSequenceFileAppend.testAppendSort(TestSequenceFileAppend.java:286)

But everything is OK in branch-2.6.5.


> Got errors while running org.apache.hadoop.io.TestSequenceFileAppend
> 
>
> Key: HADOOP-15028
> URL: https://issues.apache.org/jira/browse/HADOOP-15028
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.6.4
> Environment: Linux 2.6.32-642.el6.x86_64
>Reporter: bd17kaka
>
> I ran the test case org.apache.hadoop.io.TestSequenceFileAppend in 
> branch-2.6.4, I got the following errors:
> Running org.apache.hadoop.io.TestSequenceFileAppend
> Tests run: 4, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.801 sec <<< 
> FAILURE! - in org.apache.hadoop.io.TestSequenceFileAppend
> testAppendBlockCompression(org.apache.hadoop.io.TestSequenceFileAppend)  Time 
> elapsed: 0.117 sec  <<< ERROR!
> java.io.IOException: File is corrupt!
> at 
> org.apache.hadoop.io.SequenceFile$Reader.readBlock(SequenceFile.java:2179)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2583)
> at 
> org.apache.hadoop.io.TestSequenceFileAppend.verifyAll4Values(TestSequenceFileAppend.java:309)
> at 
> org.apache.hadoop.io.TestSequenceFileAppend.testAppendBlockCompression(TestSequenceFileAppend.java:205)
> testAppendSort(org.apache.hadoop.io.TestSequenceFileAppend)  Time elapsed: 
> 0.013 sec  <<< ERROR!
> java.io.IOException: File is corrupt!
>   

[jira] [Created] (HADOOP-15028) Got errors while running org.apache.hadoop.io.TestSequenceFileAppend

2017-11-10 Thread bd17kaka (JIRA)
bd17kaka created HADOOP-15028:
-

 Summary: Got errors while running 
org.apache.hadoop.io.TestSequenceFileAppend
 Key: HADOOP-15028
 URL: https://issues.apache.org/jira/browse/HADOOP-15028
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.6.4
 Environment: Linux 2.6.32-642.el6.x86_64
Reporter: bd17kaka


I ran the test case org.apache.hadoop.io.TestSequenceFileAppend in 
branch-2.6.4 and got the following errors:

Running org.apache.hadoop.io.TestSequenceFileAppend
Tests run: 4, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.801 sec <<< 
FAILURE! - in org.apache.hadoop.io.TestSequenceFileAppend
testAppendBlockCompression(org.apache.hadoop.io.TestSequenceFileAppend)  Time 
elapsed: 0.117 sec  <<< ERROR!
java.io.IOException: File is corrupt!
at 
org.apache.hadoop.io.SequenceFile$Reader.readBlock(SequenceFile.java:2179)
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2583)
at 
org.apache.hadoop.io.TestSequenceFileAppend.verifyAll4Values(TestSequenceFileAppend.java:309)
at 
org.apache.hadoop.io.TestSequenceFileAppend.testAppendBlockCompression(TestSequenceFileAppend.java:205)

testAppendSort(org.apache.hadoop.io.TestSequenceFileAppend)  Time elapsed: 
0.013 sec  <<< ERROR!
java.io.IOException: File is corrupt!
at 
org.apache.hadoop.io.SequenceFile$Reader.readBlock(SequenceFile.java:2179)
at 
org.apache.hadoop.io.SequenceFile$Reader.nextRaw(SequenceFile.java:2488)
at 
org.apache.hadoop.io.SequenceFile$Sorter$SortPass.run(SequenceFile.java:2923)
at 
org.apache.hadoop.io.SequenceFile$Sorter.sortPass(SequenceFile.java:2861)
at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2809)
at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2850)
at 
org.apache.hadoop.io.TestSequenceFileAppend.testAppendSort(TestSequenceFileAppend.java:286)

But everything is OK in branch-2.6.5.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org