[jira] [Commented] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads
[ https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035743#comment-16035743 ] Hadoop QA commented on HADOOP-14478: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s{color} | {color:green} the 
patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 43s{color} | {color:green} hadoop-azure in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 23m 38s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | HADOOP-14478 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12871098/HADOOP-14478.003.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 3895ea5516dd 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 73ecb19 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/12439/testReport/ | | modules | C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure | | Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/12439/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Optimize NativeAzureFsInputStream for positional reads > -- > > Key: HADOOP-14478 > URL: https://issues.apache.org/jira/browse/HADOOP-14478 > Project: Hadoop Common > Issue Type: Bug > Components: fs/azure >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: HADOOP-14478.001.patch, HADOOP-14478.002.patch, > HADOOP-14478.003.patch > > > Azure's
[jira] [Commented] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads
[ https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035724#comment-16035724 ] Rajesh Balamohan commented on HADOOP-14478: --- Thanks [~liuml07] > Optimize NativeAzureFsInputStream for positional reads > -- > > Key: HADOOP-14478 > URL: https://issues.apache.org/jira/browse/HADOOP-14478 > Project: Hadoop Common > Issue Type: Bug > Components: fs/azure >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: HADOOP-14478.001.patch, HADOOP-14478.002.patch, > HADOOP-14478.003.patch > > > Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of > the data length requested. This is beneficial for sequential reads. > However, for positional reads (seek to a specific location, read x number of > bytes, seek back to the original location) it may not be beneficial and might > even download a lot more data that is never used. > It would be good to override {{readFully(long position, byte[] buffer, int > offset, int length)}} for {{NativeAzureFsInputStream}} and make use of > {{mark(readLimit)}} as a hint to Azure's BlobInputStream. > BlobInputStream reference: > https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448 > BlobInputStream can consider this as a hint later to determine the amount of > data to be read ahead. Changes to BlobInputStream would not be addressed in > this JIRA. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
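The readFully/mark pattern proposed in the issue description can be sketched against a plain java.io stream. This is an illustrative sketch, not the actual patch: a ByteArrayInputStream stands in for Azure's BlobInputStream, and the class and method names are hypothetical. The readlimit passed to mark() plays the double role the JIRA describes: it keeps reset() valid, and a hint-aware stream could use it to cap read-ahead at the requested length instead of its default 4 MB.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;

public class PositionalReadSketch {

  // Read 'length' bytes at 'position' without disturbing the caller's
  // stream position. Assumes the stream currently sits at offset 0, so
  // skip(position) emulates an absolute seek.
  public static byte[] readAt(ByteArrayInputStream in, int position, int length)
      throws IOException {
    // Remember the caller's position; the readlimit covers everything we
    // will consume and doubles as the read-ahead hint.
    in.mark(position + length);
    byte[] buf = new byte[length];
    try {
      if (in.skip(position) != position) {      // stand-in for seek(position)
        throw new IOException("seek past end of stream");
      }
      int read = 0;
      while (read < length) {
        int n = in.read(buf, read, length - read);
        if (n < 0) {
          throw new IOException("unexpected end of stream");
        }
        read += n;
      }
    } finally {
      in.reset();                               // restore the original position
    }
    return buf;
  }

  public static void main(String[] args) throws IOException {
    ByteArrayInputStream in = new ByteArrayInputStream("0123456789".getBytes());
    System.out.println(new String(readAt(in, 4, 3)));  // prints "456"
    System.out.println((char) in.read());              // position unchanged: prints "0"
  }
}
```

In the real NativeAzureFsInputStream the seek-read-seek-back would go through the wrapped blob stream; the sketch only shows the shape of the positional-read contract.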
[jira] [Updated] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads
[ https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HADOOP-14478: -- Attachment: HADOOP-14478.003.patch Attaching .3 patch to address the checkstyle issue (removed an unused import statement) > Optimize NativeAzureFsInputStream for positional reads > -- > > Key: HADOOP-14478 > URL: https://issues.apache.org/jira/browse/HADOOP-14478 > Project: Hadoop Common > Issue Type: Bug > Components: fs/azure >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: HADOOP-14478.001.patch, HADOOP-14478.002.patch, > HADOOP-14478.003.patch > > > Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of > the data length requested. This is beneficial for sequential reads. > However, for positional reads (seek to a specific location, read x number of > bytes, seek back to the original location) it may not be beneficial and might > even download a lot more data that is never used. > It would be good to override {{readFully(long position, byte[] buffer, int > offset, int length)}} for {{NativeAzureFsInputStream}} and make use of > {{mark(readLimit)}} as a hint to Azure's BlobInputStream. > BlobInputStream reference: > https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448 > BlobInputStream can consider this as a hint later to determine the amount of > data to be read ahead. Changes to BlobInputStream would not be addressed in > this JIRA.
[jira] [Commented] (HADOOP-14476) make InconsistentAmazonS3Client usable in downstream tests
[ https://issues.apache.org/jira/browse/HADOOP-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035691#comment-16035691 ] Aaron Fabbri commented on HADOOP-14476: --- I started on this but shout if you wanted to work on it [~ste...@apache.org]. I'd like to make it configurable, but I'm not sure if we want to pollute the config space with failure-injection stuff. Maybe leave the values out of core-default.xml and just document them in testing.md? Thoughts? I'm currently thinking of adding a few knobs: 1. Delay time in milliseconds (how long the inconsistency lasts). 2. A substring for matching paths to be delayed. 3. A probability for random failure injection. > make InconsistentAmazonS3Client usable in downstream tests > -- > > Key: HADOOP-14476 > URL: https://issues.apache.org/jira/browse/HADOOP-14476 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: HADOOP-13345 >Reporter: Steve Loughran >Assignee: Aaron Fabbri > > It's important for downstream apps to be able to verify that s3guard works by > making the AWS client inconsistent (to demonstrate problems), then turning > s3guard on to verify that the problems go away. > This can be done by exposing the {{InconsistentAmazonS3Client}} > # move the factory to the production source > # make delay configurable for when you want a really long delay > # have factory code log @ warn when a non-default factory is used. > # mention in s3a testing.md > I think we could look at the name of the option, > {{fs.s3a.s3.client.factory.impl}} too. I'd like something which has > "internal" in it, and without the duplication of s3a.s3
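The three knobs proposed in the comment could be combined roughly as below. This is a sketch under assumed names (the class, fields, and method are hypothetical, not the eventual InconsistentAmazonS3Client configuration): delay duration in milliseconds, a path-substring filter, and a probability for random injection.

```java
import java.util.Random;

// Sketch of the proposed failure-injection knobs for simulating S3
// list-after-write inconsistency in tests. All names are illustrative.
public class InconsistencyInjector {
  private final long delayMs;        // knob 1: how long the inconsistency lasts
  private final String keySubstring; // knob 2: only keys containing this are affected
  private final double probability;  // knob 3: chance a matching key is delayed
  private final Random random;

  public InconsistencyInjector(long delayMs, String keySubstring,
                               double probability, long seed) {
    this.delayMs = delayMs;
    this.keySubstring = keySubstring;
    this.probability = probability;
    this.random = new Random(seed);
  }

  /** Should a key written at writeTime still look missing at readTime? */
  public boolean shouldDelay(String key, long writeTime, long readTime) {
    if (!key.contains(keySubstring)) {
      return false;                            // path does not match the filter
    }
    if (readTime - writeTime >= delayMs) {
      return false;                            // inconsistency window expired
    }
    return random.nextDouble() < probability;  // random injection
  }
}
```

Keeping these values out of core-default.xml, as suggested, would mean downstream tests set them explicitly and production configs never see them.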
[jira] [Commented] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads
[ https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035677#comment-16035677 ] Hadoop QA commented on HADOOP-14478: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 1s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s{color} | {color:green} the 
patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 11s{color} | {color:orange} hadoop-tools/hadoop-azure: The patch generated 1 new + 62 unchanged - 0 fixed = 63 total (was 62) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 21s{color} | {color:green} hadoop-azure in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 21m 35s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | HADOOP-14478 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12870981/HADOOP-14478.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux f484abf72143 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 73ecb19 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/12438/artifact/patchprocess/diff-checkstyle-hadoop-tools_hadoop-azure.txt | | Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/12438/testReport/ | | modules | C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure | | Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/12438/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Optimize NativeAzureFsInputStream for positional reads > -- > > Key: HADOOP-14478 > URL: https://issues.apache.org/jira/browse/HADOOP-14478 > Project: Hadoop Common > Issue
[jira] [Commented] (HADOOP-14459) SerializationFactory shouldn't throw a NullPointerException if the serializations list is not defined
[ https://issues.apache.org/jira/browse/HADOOP-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035678#comment-16035678 ] Daniel Templeton commented on HADOOP-14459: --- I think that works for me. The last thing that would make it really crisp would be to add a statement to the log warning saying that the default settings are being used. Other than that, LGTM. > SerializationFactory shouldn't throw a NullPointerException if the > serializations list is not defined > - > > Key: HADOOP-14459 > URL: https://issues.apache.org/jira/browse/HADOOP-14459 > Project: Hadoop Common > Issue Type: Bug >Reporter: Nandor Kollar >Assignee: Nandor Kollar >Priority: Minor > Attachments: HADOOP-14459_2.patch, HADOOP-14459.patch > > > The SerializationFactory throws an NPE if > CommonConfigurationKeys.IO_SERIALIZATIONS_KEY is not defined in the config. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
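The behaviour under review — fall back to defaults instead of throwing an NPE, and warn that defaults are in effect — might look roughly like this. It is a sketch only: a plain Map stands in for Hadoop's Configuration, and the class name and default list are illustrative, not the actual patch.

```java
import java.util.Map;

// Sketch: when the io.serializations key is absent, warn that defaults
// are being used and return them, rather than dereferencing null.
public class SerializationConfigSketch {
  // Illustrative default; the real default list lives in Hadoop's config.
  static final String[] DEFAULT_SERIALIZATIONS = {
      "org.apache.hadoop.io.serializer.WritableSerialization"
  };

  public static String[] getSerializations(Map<String, String[]> conf) {
    String[] configured = conf.get("io.serializations");
    if (configured == null || configured.length == 0) {
      // The warning states explicitly that defaults are in effect,
      // as requested in the review comment.
      System.err.println("WARN: io.serializations is not defined; "
          + "using default serializations");
      return DEFAULT_SERIALIZATIONS;
    }
    return configured;
  }
}
```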
[jira] [Commented] (HADOOP-14394) Provide Builder pattern for DistributedFileSystem.create
[ https://issues.apache.org/jira/browse/HADOOP-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035657#comment-16035657 ] Hadoop QA commented on HADOOP-14394: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 20m 45s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 37s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 27s{color} | {color:red} hadoop-common-project/hadoop-common in trunk has 19 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 5s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 59s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 51s{color} | {color:orange} root: The patch generated 2 new + 257 unchanged - 0 fixed = 259 total (was 257) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 18s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 11s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 27s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 35s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}185m 58s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.security.TestRaceWhenRelogin | | | hadoop.hdfs.web.TestWebHdfsTimeouts | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | HADOOP-14394 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12871055/HADOOP-14394.03.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux a3f78222c919 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 73ecb19 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | findbugs | https://builds.apache.org/job/PreCommit-HADOOP-Build/12437/artifact/patchprocess/branch-findbugs-hadoop-common-project_hadoop-common-warnings.html | | checkstyle |
[jira] [Commented] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads
[ https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035644#comment-16035644 ] Mingliang Liu commented on HADOOP-14478: +1 pending on Jenkins. Will commit next Monday if no more input. > Optimize NativeAzureFsInputStream for positional reads > -- > > Key: HADOOP-14478 > URL: https://issues.apache.org/jira/browse/HADOOP-14478 > Project: Hadoop Common > Issue Type: Bug > Components: fs/azure >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: HADOOP-14478.001.patch, HADOOP-14478.002.patch > > > Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of > the data length requested. This is beneficial for sequential reads. > However, for positional reads (seek to a specific location, read x number of > bytes, seek back to the original location) it may not be beneficial and might > even download a lot more data that is never used. > It would be good to override {{readFully(long position, byte[] buffer, int > offset, int length)}} for {{NativeAzureFsInputStream}} and make use of > {{mark(readLimit)}} as a hint to Azure's BlobInputStream. > BlobInputStream reference: > https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448 > BlobInputStream can consider this as a hint later to determine the amount of > data to be read ahead. Changes to BlobInputStream would not be addressed in > this JIRA.
[jira] [Assigned] (HADOOP-14476) make InconsistentAmazonS3Client usable in downstream tests
[ https://issues.apache.org/jira/browse/HADOOP-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Fabbri reassigned HADOOP-14476: - Assignee: Aaron Fabbri > make InconsistentAmazonS3Client usable in downstream tests > -- > > Key: HADOOP-14476 > URL: https://issues.apache.org/jira/browse/HADOOP-14476 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: HADOOP-13345 >Reporter: Steve Loughran >Assignee: Aaron Fabbri > > It's important for downstream apps to be able to verify that s3guard works by > making the AWS client inconsistent (to demonstrate problems), then turning > s3guard on to verify that the problems go away. > This can be done by exposing the {{InconsistentAmazonS3Client}} > # move the factory to the production source > # make delay configurable for when you want a really long delay > # have factory code log @ warn when a non-default factory is used. > # mention in s3a testing.md > I think we could look at the name of the option, > {{fs.s3a.s3.client.factory.impl}} too. I'd like something which has > "internal" in it, and without the duplication of s3a.s3
[jira] [Updated] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads
[ https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HADOOP-14478: --- Status: Patch Available (was: Open) > Optimize NativeAzureFsInputStream for positional reads > -- > > Key: HADOOP-14478 > URL: https://issues.apache.org/jira/browse/HADOOP-14478 > Project: Hadoop Common > Issue Type: Bug > Components: fs/azure >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: HADOOP-14478.001.patch, HADOOP-14478.002.patch > > > Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of > the data length requested. This is beneficial for sequential reads. > However, for positional reads (seek to a specific location, read x number of > bytes, seek back to the original location) it may not be beneficial and might > even download a lot more data that is never used. > It would be good to override {{readFully(long position, byte[] buffer, int > offset, int length)}} for {{NativeAzureFsInputStream}} and make use of > {{mark(readLimit)}} as a hint to Azure's BlobInputStream. > BlobInputStream reference: > https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448 > BlobInputStream can consider this as a hint later to determine the amount of > data to be read ahead. Changes to BlobInputStream would not be addressed in > this JIRA.
[jira] [Comment Edited] (HADOOP-12360) Create StatsD metrics2 sink
[ https://issues.apache.org/jira/browse/HADOOP-12360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035619#comment-16035619 ] Dave Marion edited comment on HADOOP-12360 at 6/2/17 11:27 PM: --- Honestly, I don't remember, it's been too long. Can you provide an example of how it's broken for the JVM metrics? was (Author: dlmarion): Honestly, I don't remember, it's been too long. > Create StatsD metrics2 sink > --- > > Key: HADOOP-12360 > URL: https://issues.apache.org/jira/browse/HADOOP-12360 > Project: Hadoop Common > Issue Type: New Feature > Components: metrics >Affects Versions: 2.7.1 >Reporter: Dave Marion >Assignee: Dave Marion >Priority: Minor > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: HADOOP-12360.001.patch, HADOOP-12360.002.patch, > HADOOP-12360.003.patch, HADOOP-12360.004.patch, HADOOP-12360.005.patch, > HADOOP-12360.006.patch, HADOOP-12360.007.patch, HADOOP-12360.008.patch, > HADOOP-12360.009.patch, HADOOP-12360.010.patch > > > Create a metrics sink that pushes to a StatsD daemon.
[jira] [Comment Edited] (HADOOP-12360) Create StatsD metrics2 sink
[ https://issues.apache.org/jira/browse/HADOOP-12360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035619#comment-16035619 ] Dave Marion edited comment on HADOOP-12360 at 6/2/17 11:26 PM: --- Honestly, I don't remember, it's been too long. was (Author: dlmarion): Honestly, I don't remember, it's been too long. Is this causing an issue? > Create StatsD metrics2 sink > --- > > Key: HADOOP-12360 > URL: https://issues.apache.org/jira/browse/HADOOP-12360 > Project: Hadoop Common > Issue Type: New Feature > Components: metrics >Affects Versions: 2.7.1 >Reporter: Dave Marion >Assignee: Dave Marion >Priority: Minor > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: HADOOP-12360.001.patch, HADOOP-12360.002.patch, > HADOOP-12360.003.patch, HADOOP-12360.004.patch, HADOOP-12360.005.patch, > HADOOP-12360.006.patch, HADOOP-12360.007.patch, HADOOP-12360.008.patch, > HADOOP-12360.009.patch, HADOOP-12360.010.patch > > > Create a metrics sink that pushes to a StatsD daemon.
[jira] [Commented] (HADOOP-12360) Create StatsD metrics2 sink
[ https://issues.apache.org/jira/browse/HADOOP-12360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035619#comment-16035619 ] Dave Marion commented on HADOOP-12360: -- Honestly, I don't remember, it's been too long. Is this causing an issue? > Create StatsD metrics2 sink > --- > > Key: HADOOP-12360 > URL: https://issues.apache.org/jira/browse/HADOOP-12360 > Project: Hadoop Common > Issue Type: New Feature > Components: metrics >Affects Versions: 2.7.1 >Reporter: Dave Marion >Assignee: Dave Marion >Priority: Minor > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: HADOOP-12360.001.patch, HADOOP-12360.002.patch, > HADOOP-12360.003.patch, HADOOP-12360.004.patch, HADOOP-12360.005.patch, > HADOOP-12360.006.patch, HADOOP-12360.007.patch, HADOOP-12360.008.patch, > HADOOP-12360.009.patch, HADOOP-12360.010.patch > > > Create a metrics sink that pushes to a StatsD daemon.
[jira] [Commented] (HADOOP-14481) Print stack trace when native bzip2 library does not load
[ https://issues.apache.org/jira/browse/HADOOP-14481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035586#comment-16035586 ] Chen Liang commented on HADOOP-14481: - Thanks [~jojochuang] for the catch! v001 patch LGTM. > Print stack trace when native bzip2 library does not load > - > > Key: HADOOP-14481 > URL: https://issues.apache.org/jira/browse/HADOOP-14481 > Project: Hadoop Common > Issue Type: Improvement > Components: io >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Minor > Attachments: HADOOP-14481.001.patch > > > When I ran hadoop checknative on my machine, it was not able to load the > system bzip2 library and printed the following message. > 17/06/02 09:25:42 WARN bzip2.Bzip2Factory: Failed to load/initialize > native-bzip2 library system-native, will use pure-Java version > Reviewing the relevant code shows that it fails because of an exception. > However, that exception is not logged. We should print the stack trace, at > least at the debug log level. > {code:title=Bzip2Factory#isNativeBzip2Loaded()} > try { > // Initialize the native library. > Bzip2Compressor.initSymbols(libname); > Bzip2Decompressor.initSymbols(libname); > nativeBzip2Loaded = true; > LOG.info("Successfully loaded & initialized native-bzip2 library " + >libname); > } catch (Throwable t) { > LOG.warn("Failed to load/initialize native-bzip2 library " + >libname + ", will use pure-Java version"); > } > {code}
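A minimal sketch of the fix described above: carry the caught Throwable into the log output instead of swallowing it, so the load failure is diagnosable. Plain JDK I/O is used here to keep the sketch self-contained; the class name is illustrative, and the actual patch presumably logs via the existing LOG object.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Sketch: render the warning together with the Throwable's stack trace,
// which the original catch block discards.
public class LogThrowableSketch {
  public static String describeFailure(String libname, Throwable t) {
    StringWriter sw = new StringWriter();
    t.printStackTrace(new PrintWriter(sw, true));
    return "Failed to load/initialize native-bzip2 library " + libname
        + ", will use pure-Java version\n" + sw;
  }
}
```

With commons-logging or log4j the same effect is simply `LOG.warn(msg, t)`, or `LOG.debug(msg, t)` if the stack trace should appear only at debug level, as the issue suggests.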
[jira] [Commented] (HADOOP-13786) Add S3Guard committer for zero-rename commits to consistent S3 endpoints
[ https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035546#comment-16035546 ] Hadoop QA commented on HADOOP-13786:

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 16m 51s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 41 new or modified test files. |
| 0 | mvndep | 0m 19s | Maven dependency ordering for branch |
| +1 | mvninstall | 19m 23s | HADOOP-13345 passed |
| +1 | compile | 14m 38s | HADOOP-13345 passed |
| +1 | checkstyle | 2m 10s | HADOOP-13345 passed |
| +1 | mvnsite | 2m 41s | HADOOP-13345 passed |
| +1 | mvneclipse | 2m 51s | HADOOP-13345 passed |
| +1 | findbugs | 4m 15s | HADOOP-13345 passed |
| +1 | javadoc | 2m 14s | HADOOP-13345 passed |
| 0 | mvndep | 0m 17s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 52s | the patch passed |
| +1 | compile | 13m 36s | the patch passed |
| -1 | javac | 13m 36s | root generated 1 new + 777 unchanged - 1 fixed = 778 total (was 778) |
| -0 | checkstyle | 2m 3s | root: The patch generated 43 new + 120 unchanged - 23 fixed = 163 total (was 143) |
| +1 | mvnsite | 2m 51s | the patch passed |
| +1 | mvneclipse | 1m 51s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 13 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | xml | 0m 3s | The patch has no ill-formed XML file. |
| +1 | findbugs | 4m 55s | the patch passed |
| +1 | javadoc | 0m 56s | hadoop-common in the patch passed. |
| +1 | javadoc | 0m 25s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-registry generated 0 new + 45 unchanged - 3 fixed = 45 total (was 48) |
| +1 | javadoc | 0m 34s | hadoop-mapreduce-client-core in the patch passed. |
| +1 | javadoc | 0m 29s | hadoop-aws in the patch passed. |
| +1 | unit | 7m 36s | hadoop-common in the patch passed. |
| +1 | unit | 0m 52s | hadoop-yarn-registry in the patch passed. |
| +1 | unit | 2m 50s | hadoop-mapreduce-client-core in the patch passed. |
| +1 | unit | 0m 56s | hadoop-aws in the patch passed. |
| +1 | asflicense | 0m 44s | The patch does not generate ASF License warnings. |
[jira] [Updated] (HADOOP-14394) Provide Builder pattern for DistributedFileSystem.create
[ https://issues.apache.org/jira/browse/HADOOP-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HADOOP-14394: --- Attachment: HADOOP-14394.03.patch Attach a new patch to fix test failures. > Provide Builder pattern for DistributedFileSystem.create > > > Key: HADOOP-14394 > URL: https://issues.apache.org/jira/browse/HADOOP-14394 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs >Affects Versions: 2.9.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HADOOP-14394.00.patch, HADOOP-14394.01.patch, > HADOOP-14394.02.patch, HADOOP-14394.03.patch > > > This JIRA continues to refine the {{FSOutputStreamBuilder}} interface > introduced in HDFS-11170. > It should also provide a spec for the Builder API. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14457) create() does not notify metadataStore of parent directories or ensure they're not existing files
[ https://issues.apache.org/jira/browse/HADOOP-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035396#comment-16035396 ] Aaron Fabbri commented on HADOOP-14457:
---
If we end up adding an "ancestor is a directory" check to create() in the future, we could accumulate the list of missing parents during the ancestor checks and pass them through the operation to finishedWrite() as a precomputed list of the things to create in the metadatastore. It widens some race conditions around other clients modifying our directory tree, but it seems like it would be optimal WRT round trips. We'd have MetadataStore reads, then writing the output stream, then close() -> finishedWrite() does MetadataStore writes.

> create() does not notify metadataStore of parent directories or ensure
> they're not existing files
> -
>
> Key: HADOOP-14457
> URL: https://issues.apache.org/jira/browse/HADOOP-14457
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Sean Mackrory
> Attachments: HADOOP-14457-HADOOP-13345.001.patch,
> HADOOP-14457-HADOOP-13345.002.patch
>
>
> Not a great test yet, but it at least reliably demonstrates the issue.
> LocalMetadataStore will sometimes erroneously report that a directory is
> empty with isAuthoritative = true when it *definitely* has children the
> metadatastore should know about. It doesn't appear to happen if the children
> are just directories. The fact that it's returning an empty listing is
> concerning, but the fact that it says it's authoritative *might* be a second
> bug.
> {code}
> diff --git a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> index 78b3970..1821d19 100644
> --- a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> +++ b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> @@ -965,7 +965,7 @@ public boolean hasMetadataStore() {
>    }
>
>    @VisibleForTesting
> -  MetadataStore getMetadataStore() {
> +  public MetadataStore getMetadataStore() {
>      return metadataStore;
>    }
>
> diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> index 4339649..881bdc9 100644
> --- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> +++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> @@ -23,6 +23,11 @@
>  import org.apache.hadoop.fs.contract.AbstractFSContract;
>  import org.apache.hadoop.fs.FileSystem;
>  import org.apache.hadoop.fs.Path;
> +import org.apache.hadoop.fs.s3a.S3AFileSystem;
> +import org.apache.hadoop.fs.s3a.Tristate;
> +import org.apache.hadoop.fs.s3a.s3guard.DirListingMetadata;
> +import org.apache.hadoop.fs.s3a.s3guard.MetadataStore;
> +import org.junit.Test;
>
>  import static org.apache.hadoop.fs.contract.ContractTestUtils.dataset;
>  import static org.apache.hadoop.fs.contract.ContractTestUtils.writeDataset;
> @@ -72,4 +77,24 @@ public void testRenameDirIntoExistingDir() throws Throwable {
>      boolean rename = fs.rename(srcDir, destDir);
>      assertFalse("s3a doesn't support rename to non-empty directory", rename);
>    }
> +
> +  @Test
> +  public void testMkdirPopulatesFileAncestors() throws Exception {
> +    final FileSystem fs = getFileSystem();
> +    final MetadataStore ms = ((S3AFileSystem) fs).getMetadataStore();
> +    final Path parent = path("testMkdirPopulatesFileAncestors/source");
> +    try {
> +      fs.mkdirs(parent);
> +      final Path nestedFile = new Path(parent, "dir1/dir2/dir3/file4");
> +      byte[] srcDataset = dataset(256, 'a', 'z');
> +      writeDataset(fs, nestedFile, srcDataset, srcDataset.length,
> +          1024, false);
> +
> +      DirListingMetadata list = ms.listChildren(parent);
> +      assertTrue("MetadataStore falsely reports authoritative empty list",
> +          list.isEmpty() == Tristate.FALSE || !list.isAuthoritative());
> +    } finally {
> +      fs.delete(parent, true);
> +    }
> +  }
> }
> {code}
[jira] [Commented] (HADOOP-13786) Add S3Guard committer for zero-rename commits to consistent S3 endpoints
[ https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035358#comment-16035358 ] Steve Loughran commented on HADOOP-13786:
-
Patch 030: evolution based on integration testing with the InconsistentAmazonS3Client enabled, s3guard on/off, in Spark, so using its workflow.

* the _SUCCESS marker contains more information & diagnostics
* various bits of tuning shown (making cleanup resilient to inconsistencies in list vs actual)
* docs

It's in sync with commit 0fbb4aa in [https://github.com/hortonworks-spark/cloud-integration]; as is [the documentation|https://github.com/hortonworks-spark/cloud-integration/blob/master/cloud-committer/src/main/site/markdown/index.md]

The core integration tests are working; more is always welcome... I plan to scale things up & create 1+ test designed to work on large clusters. This is all just querying data, but it adds validation of the data from the _SUCCESS marker, which is new.

Example printing of success marker data:

{code}
2017-06-02 19:59:19,780 [ScalaTest-main-running-S3ACommitDataframeSuite] INFO s3.S3AOperations (Logging.scala:logInfo(54)) - success data at s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS : SuccessData{committer='PartitionedStagingCommitter', hostname='HW13176.cotham.uk', description='Task committer attempt_20170602195913__m_00_0', date='Fri Jun 02 19:59:17 BST 2017', filenames=[/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/part-0-f22d488c-dad0-4fa5-8ca4-8d00b058c77c-c000.snappy.orc]}
2017-06-02 19:59:19,781 [ScalaTest-main-running-S3ACommitDataframeSuite] INFO s3.S3AOperations (Logging.scala:logInfo(54)) - Metrics:
S3guard_metadatastore_put_path_latency50thPercentileLatency = 548156
S3guard_metadatastore_put_path_latency75thPercentileLatency = 548156
S3guard_metadatastore_put_path_latency90thPercentileLatency = 548156
S3guard_metadatastore_put_path_latency95thPercentileLatency = 548156
S3guard_metadatastore_put_path_latency99thPercentileLatency = 548156
S3guard_metadatastore_put_path_latencyNumOps = 1
committer_bytes_committed = 384
committer_commits_aborted = 0
committer_commits_completed = 1
committer_commits_created = 1
committer_commits_failed = 0
committer_commits_reverted = 0
committer_jobs_completed = 1
committer_jobs_failed = 0
committer_tasks_completed = 1
committer_tasks_failed = 0
directories_created = 1
directories_deleted = 0
fake_directories_deleted = 6
files_copied = 0
files_copied_bytes = 0
files_created = 0
files_deleted = 2
ignored_errors = 1
object_continue_list_requests = 0
object_copy_requests = 0
object_delete_requests = 2
object_list_requests = 5
object_metadata_requests = 8
object_multipart_aborted = 0
object_put_bytes = 384
object_put_bytes_pending = 0
object_put_requests = 2
object_put_requests_active = 0
object_put_requests_completed = 2
op_copy_from_local_file = 0
op_exists = 2
op_get_file_status = 4
op_glob_status = 0
op_is_directory = 0
op_is_file = 0
op_list_files = 0
op_list_located_status = 0
op_list_status = 0
op_mkdirs = 0
op_rename = 0
s3guard_metadatastore_initialization = 0
s3guard_metadatastore_put_path_request = 2
stream_aborted = 0
stream_backward_seek_operations = 0
stream_bytes_backwards_on_seek = 0
stream_bytes_discarded_in_abort = 0
stream_bytes_read = 0
stream_bytes_read_in_close = 0
stream_bytes_skipped_on_seek = 0
stream_close_operations = 0
stream_closed = 0
stream_forward_seek_operations = 0
stream_opened = 0
stream_read_exceptions = 0
stream_read_fully_operations = 0
stream_read_operations = 0
stream_read_operations_incomplete = 0
stream_seek_operations = 0
stream_write_block_uploads = 0
stream_write_block_uploads_aborted = 0
stream_write_block_uploads_active = 0
stream_write_block_uploads_committed = 0
stream_write_block_uploads_data_pending = 0
stream_write_block_uploads_pending = 0
stream_write_failures = 0
stream_write_total_data = 0
stream_write_total_time = 0
2017-06-02 19:59:19,782 [ScalaTest-main-running-S3ACommitDataframeSuite] INFO s3.S3AOperations (Logging.scala:logInfo(54)) - Diagnostics:
fs.s3a.committer.magic.enabled = true
fs.s3a.metadatastore.authoritative = false
fs.s3a.metadatastore.impl = org.apache.hadoop.fs.s3a.s3guard.LocalMetadataStore
{code}

> Add S3Guard committer for zero-rename commits to consistent S3 endpoints
>
>
> Key: HADOOP-13786
> URL: https://issues.apache.org/jira/browse/HADOOP-13786
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs/s3
> Affects Versions: HADOOP-13345
>
[jira] [Updated] (HADOOP-13786) Add S3Guard committer for zero-rename commits to consistent S3 endpoints
[ https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-13786: Status: Open (was: Patch Available) > Add S3Guard committer for zero-rename commits to consistent S3 endpoints > > > Key: HADOOP-13786 > URL: https://issues.apache.org/jira/browse/HADOOP-13786 > Project: Hadoop Common > Issue Type: New Feature > Components: fs/s3 >Affects Versions: HADOOP-13345 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: HADOOP-13786-HADOOP-13345-001.patch, > HADOOP-13786-HADOOP-13345-002.patch, HADOOP-13786-HADOOP-13345-003.patch, > HADOOP-13786-HADOOP-13345-004.patch, HADOOP-13786-HADOOP-13345-005.patch, > HADOOP-13786-HADOOP-13345-006.patch, HADOOP-13786-HADOOP-13345-006.patch, > HADOOP-13786-HADOOP-13345-007.patch, HADOOP-13786-HADOOP-13345-009.patch, > HADOOP-13786-HADOOP-13345-010.patch, HADOOP-13786-HADOOP-13345-011.patch, > HADOOP-13786-HADOOP-13345-012.patch, HADOOP-13786-HADOOP-13345-013.patch, > HADOOP-13786-HADOOP-13345-015.patch, HADOOP-13786-HADOOP-13345-016.patch, > HADOOP-13786-HADOOP-13345-017.patch, HADOOP-13786-HADOOP-13345-018.patch, > HADOOP-13786-HADOOP-13345-019.patch, HADOOP-13786-HADOOP-13345-020.patch, > HADOOP-13786-HADOOP-13345-021.patch, HADOOP-13786-HADOOP-13345-022.patch, > HADOOP-13786-HADOOP-13345-023.patch, HADOOP-13786-HADOOP-13345-024.patch, > HADOOP-13786-HADOOP-13345-025.patch, HADOOP-13786-HADOOP-13345-026.patch, > HADOOP-13786-HADOOP-13345-027.patch, HADOOP-13786-HADOOP-13345-028.patch, > HADOOP-13786-HADOOP-13345-028.patch, HADOOP-13786-HADOOP-13345-029.patch, > HADOOP-13786-HADOOP-13345-030.patch, objectstore.pdf, s3committer-master.zip > > > A goal of this code is "support O(1) commits to S3 repositories in the > presence of failures". Implement it, including whatever is needed to > demonstrate the correctness of the algorithm. 
(that is, assuming that s3guard > provides a consistent view of the presence/absence of blobs, show that we can > commit directly). > I consider ourselves free to expose the blobstore-ness of the s3 output > streams (ie. not visible until the close()), if we need to use that to allow > us to abort commit operations.
[jira] [Updated] (HADOOP-13786) Add S3Guard committer for zero-rename commits to consistent S3 endpoints
[ https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-13786: Status: Patch Available (was: Open)
[jira] [Updated] (HADOOP-13786) Add S3Guard committer for zero-rename commits to consistent S3 endpoints
[ https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-13786: Attachment: HADOOP-13786-HADOOP-13345-030.patch
[jira] [Commented] (HADOOP-14481) Print stack trace when native bzip2 library does not load
[ https://issues.apache.org/jira/browse/HADOOP-14481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035330#comment-16035330 ] Hadoop QA commented on HADOOP-14481:

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 19m 35s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 13m 8s | trunk passed |
| +1 | compile | 13m 8s | trunk passed |
| +1 | checkstyle | 0m 33s | trunk passed |
| +1 | mvnsite | 1m 1s | trunk passed |
| +1 | mvneclipse | 0m 16s | trunk passed |
| -1 | findbugs | 1m 25s | hadoop-common-project/hadoop-common in trunk has 19 extant Findbugs warnings. |
| +1 | javadoc | 0m 47s | trunk passed |
| +1 | mvninstall | 0m 40s | the patch passed |
| +1 | compile | 12m 35s | the patch passed |
| +1 | javac | 12m 35s | the patch passed |
| +1 | checkstyle | 0m 31s | the patch passed |
| +1 | mvnsite | 0m 58s | the patch passed |
| +1 | mvneclipse | 0m 16s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 29s | the patch passed |
| +1 | javadoc | 0m 45s | the patch passed |
| +1 | unit | 7m 27s | hadoop-common in the patch passed. |
| +1 | asflicense | 0m 29s | The patch does not generate ASF License warnings. |
| | | 76m 51s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HADOOP-14481 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12871017/HADOOP-14481.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 39cb0683d304 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 73ecb19 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | https://builds.apache.org/job/PreCommit-HADOOP-Build/12435/artifact/patchprocess/branch-findbugs-hadoop-common-project_hadoop-common-warnings.html |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/12435/testReport/ |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/12435/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Print stack trace when native bzip2 library does not load
> -
>
> Key: HADOOP-14481
> URL: https://issues.apache.org/jira/browse/HADOOP-14481
> Project: Hadoop Common
>
[jira] [Commented] (HADOOP-14457) create() does not notify metadataStore of parent directories or ensure they're not existing files
[ https://issues.apache.org/jira/browse/HADOOP-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035327#comment-16035327 ] Sean Mackrory commented on HADOOP-14457: I filed HADOOP-14484 for the missing test case (and will fix it if it does indeed fail on Local or Dynamo). I'll look at moving this to S3Guard.java - although we should be able to save some operations by solving this problem and the file-as-parent-dir check in the same loop, rather than an S3Guard-specific one in one place and then always another check elsewhere.
[jira] [Assigned] (HADOOP-14484) Ensure deleted parent directory tombstones are overwritten when implicitly recreated
[ https://issues.apache.org/jira/browse/HADOOP-14484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory reassigned HADOOP-14484: -- Assignee: Sean Mackrory > Ensure deleted parent directory tombstones are overwritten when implicitly > recreated > > > Key: HADOOP-14484 > URL: https://issues.apache.org/jira/browse/HADOOP-14484 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Sean Mackrory >Assignee: Sean Mackrory > > As discussed on HADOOP-13998, there may be a test missing (and possibly > broken metadata store implementations) for the case where a directory is > deleted but is later implicitly recreated by creating a file inside it, where > the tombstone is not overwritten. In such a case, listing the parent > directory would result in an error. > This may also be happening because of HADOOP-14457, but we should add a test > for this other possibility anyway and fix it if it fails with any > implementations. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-14484) Ensure deleted parent directory tombstones are overwritten when implicitly recreated
Sean Mackrory created HADOOP-14484: -- Summary: Ensure deleted parent directory tombstones are overwritten when implicitly recreated Key: HADOOP-14484 URL: https://issues.apache.org/jira/browse/HADOOP-14484 Project: Hadoop Common Issue Type: Sub-task Reporter: Sean Mackrory As discussed on HADOOP-13998, there may be a test missing (and possibly broken metadata store implementations) for the case where a directory is deleted but is later implicitly recreated by creating a file inside it, where the tombstone is not overwritten. In such a case, listing the parent directory would result in an error. This may also be happening because of HADOOP-14457, but we should add a test for this other possibility anyway and fix it if it fails with any implementations. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
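The scenario this issue describes can be reduced to a toy in-memory model (purely illustrative — the real interfaces are MetadataStore and DirListingMetadata in hadoop-aws; every class and method name below is made up): deleting a directory leaves a tombstone, and create() must overwrite tombstoned ancestors with live directory entries, or listing the parent afterwards fails.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Toy model of the tombstone problem: deleting a path writes a tombstone;
 * creating a file inside a deleted directory must replace the directory's
 * tombstone with a live entry, otherwise listing the parent errors out.
 */
public class TombstoneSketch {
  enum State { LIVE_DIR, LIVE_FILE, TOMBSTONE }

  static Map<String, State> store = new HashMap<>();

  static void delete(String path) { store.put(path, State.TOMBSTONE); }

  /** create() must resurrect tombstoned ancestors as live directories. */
  static void createFile(String path) {
    store.put(path, State.LIVE_FILE);
    int slash = path.lastIndexOf('/');
    while (slash > 0) {
      String parent = path.substring(0, slash);
      store.put(parent, State.LIVE_DIR);  // overwrites any tombstone
      slash = parent.lastIndexOf('/');
    }
  }

  /** Lists direct, non-tombstoned children; fails on a tombstoned dir. */
  static List<String> listChildren(String dir) {
    if (store.get(dir) != State.LIVE_DIR) {
      throw new IllegalStateException(dir + " is deleted or not a directory");
    }
    List<String> kids = new ArrayList<>();
    for (Map.Entry<String, State> e : store.entrySet()) {
      String p = e.getKey();
      if (e.getValue() != State.TOMBSTONE
          && p.startsWith(dir + "/") && p.indexOf('/', dir.length() + 1) < 0) {
        kids.add(p);
      }
    }
    return kids;
  }

  public static void main(String[] args) {
    createFile("base/dir/file1");
    delete("base/dir/file1");
    delete("base/dir");
    createFile("base/dir/file2");  // implicitly recreates base/dir
    System.out.println(listChildren("base/dir"));  // [base/dir/file2]
  }
}
```

If createFile() skipped the ancestor loop, "base/dir" would still be a tombstone after the second create and listChildren() would throw — the listing error described above.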
[jira] [Commented] (HADOOP-14457) create() does not notify metadataStore of parent directories or ensure they're not existing files
[ https://issues.apache.org/jira/browse/HADOOP-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035313#comment-16035313 ] Aaron Fabbri commented on HADOOP-14457: --- Thanks for the detail here [~ste...@apache.org]. {quote} FWIW, I'd like the creation code to be kept out of S3AFS if possible, just because it's getting so big & complex. I've pulled writeOperationsHelper out in the committer branch, but there's still a lot of complexity in the core FS now that everything is metastore-guarded. {quote} By "creation code", I assume you mean the part where create() results in all ancestor dirs getting created in the MetadataStore. I generally agree. Can this live in S3Guard.java, [~mackrorysd], or is it awkward? > create() does not notify metadataStore of parent directories or ensure > they're not existing files > - > > Key: HADOOP-14457 > URL: https://issues.apache.org/jira/browse/HADOOP-14457 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Sean Mackrory > Attachments: HADOOP-14457-HADOOP-13345.001.patch, > HADOOP-14457-HADOOP-13345.002.patch > > > Not a great test yet, but it at least reliably demonstrates the issue. > LocalMetadataStore will sometimes erroneously report that a directory is > empty with isAuthoritative = true when it *definitely* has children the > metadatastore should know about. It doesn't appear to happen if the children > are just directories. The fact that it's returning an empty listing is > concerning, but the fact that it says it's authoritative *might* be a second > bug. 
> {code} > diff --git > a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java > > b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java > index 78b3970..1821d19 100644 > --- > a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java > +++ > b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java > @@ -965,7 +965,7 @@ public boolean hasMetadataStore() { >} > >@VisibleForTesting > - MetadataStore getMetadataStore() { > + public MetadataStore getMetadataStore() { > return metadataStore; >} > > diff --git > a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java > > b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java > index 4339649..881bdc9 100644 > --- > a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java > +++ > b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java > @@ -23,6 +23,11 @@ > import org.apache.hadoop.fs.contract.AbstractFSContract; > import org.apache.hadoop.fs.FileSystem; > import org.apache.hadoop.fs.Path; > +import org.apache.hadoop.fs.s3a.S3AFileSystem; > +import org.apache.hadoop.fs.s3a.Tristate; > +import org.apache.hadoop.fs.s3a.s3guard.DirListingMetadata; > +import org.apache.hadoop.fs.s3a.s3guard.MetadataStore; > +import org.junit.Test; > > import static org.apache.hadoop.fs.contract.ContractTestUtils.dataset; > import static org.apache.hadoop.fs.contract.ContractTestUtils.writeDataset; > @@ -72,4 +77,24 @@ public void testRenameDirIntoExistingDir() throws > Throwable { > boolean rename = fs.rename(srcDir, destDir); > assertFalse("s3a doesn't support rename to non-empty directory", rename); >} > + > + @Test > + public void testMkdirPopulatesFileAncestors() throws Exception { > +final FileSystem fs = getFileSystem(); > +final MetadataStore ms 
= ((S3AFileSystem) fs).getMetadataStore(); > +final Path parent = path("testMkdirPopulatesFileAncestors/source"); > +try { > + fs.mkdirs(parent); > + final Path nestedFile = new Path(parent, "dir1/dir2/dir3/file4"); > + byte[] srcDataset = dataset(256, 'a', 'z'); > + writeDataset(fs, nestedFile, srcDataset, srcDataset.length, > + 1024, false); > + > + DirListingMetadata list = ms.listChildren(parent); > + assertTrue("MetadataStore falsely reports authoritative empty list", > + list.isEmpty() == Tristate.FALSE || !list.isAuthoritative()); > +} finally { > + fs.delete(parent, true); > +} > + } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14445) Delegation tokens are not shared between KMS instances
[ https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035306#comment-16035306 ] Daryn Sharp commented on HADOOP-14445: -- I had a feeling "nameservice" alluded to the hdfs HA configuration, which is horrible for the reasons I've detailed and why we don't use it. I'll politely stress and repeat: *Updating configs of tens of thousands of nodes, launchers, oozie, storm, spark, etc and restarting the services is just not logistically possible*. bq. Although, IIRC, the tokens are renewed only if they are expired, so if they are renewed serially, it should not be a problem. The RM renews immediately to verify token validity and to determine the next renewal time. If they are expired, it's too late. Any kp token using just a service authority cannot determine the kp uri and is only renewable via the kp uri in the config, enforcing one and only 1 kms cluster. If the kp client can be instantiated via the service, then multi-kms setups are possible. bq. I do like the idea of using a nameservice though, as Yongjun Zhang suggested which will ensure that we will still have only 1 single entry. There must be a disconnect here. 1 single entry is the advantage of just setting the service to the provider uri. Adding an extra layer of indirection through the config creates a logistical mess with no added benefits. I'm not going to bounce all my services and RMs because I added or changed a KMS cluster. Here's the big picture we are trying to achieve: * client requests kp uri from NN * client creates kp client from kp uri * client gets tokens and sets service to kp uri * RM calls kms token renewer which uses kp uri in service to create kp client * tasks use the NN->kp uri mapping established at job submission to locate tokens It's +config-less+ other than a setting on the NN. This is what we are running internally because the current kms client design is completely broken. 
We now have the ability to enable EZs on a NN and/or change kms cluster configuration without changing configs or restarting services. We only care about this load balancing provider because we need to ensure the kp client can be instantiated from the service. > Delegation tokens are not shared between KMS instances > -- > > Key: HADOOP-14445 > URL: https://issues.apache.org/jira/browse/HADOOP-14445 > Project: Hadoop Common > Issue Type: Bug > Components: documentation, kms >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Wei-Chiu Chuang >Assignee: Rushabh S Shah > Attachments: HADOOP-14445-branch-2.8.patch > > > As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider do > not share delegation tokens. (a client uses KMS address/port as the key for > delegation token) > {code:title=DelegationTokenAuthenticatedURL#openConnection} > if (!creds.getAllTokens().isEmpty()) { > InetSocketAddress serviceAddr = new InetSocketAddress(url.getHost(), > url.getPort()); > Text service = SecurityUtil.buildTokenService(serviceAddr); > dToken = creds.getToken(service); > {code} > But KMS doc states: > {quote} > Delegation Tokens > Similar to HTTP authentication, KMS uses Hadoop Authentication for delegation > tokens too. > Under HA, A KMS instance must verify the delegation token given by another > KMS instance, by checking the shared secret used to sign the delegation > token. To do this, all KMS instances must be able to retrieve the shared > secret from ZooKeeper. > {quote} > We should either update the KMS documentation, or fix this code to share > delegation tokens. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
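The keying mismatch Daryn describes (tokens cached under one instance's host:port, while renewal and load-balanced requests need the key-provider URI) reduces to a small lookup-table model. This is an invented illustration, not the real DelegationTokenAuthenticatedURL or Credentials API, and the kms:// URI is just an example value:

```python
# Illustrative model of delegation-token lookup; names are invented,
# not the real Hadoop security API.

creds = {}  # service string -> token


def store_token(service, token):
    creds[service] = token


def lookup_token(service):
    return creds.get(service)


# Today: the client keys the token by one KMS instance's host:port,
# so a request routed to a different instance misses the token.
store_token("kms01.example.com:9600", "token-A")
assert lookup_token("kms02.example.com:9600") is None  # cache miss

# Proposed: key the token by the key-provider URI itself, so every
# instance behind the load balancer (and the RM's renewer) resolves
# the same entry and can reconstruct the kp client from the service.
store_token("kms://https@kms.example.com:9600/kms", "token-B")
assert lookup_token("kms://https@kms.example.com:9600/kms") == "token-B"
```

This is the crux of the "set the service to the provider uri" proposal: one token entry per KMS cluster, resolvable without any client-side config.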
[jira] [Created] (HADOOP-14483) increase default value of fs.s3a.multipart.size to 128M
Steve Loughran created HADOOP-14483: --- Summary: increase default value of fs.s3a.multipart.size to 128M Key: HADOOP-14483 URL: https://issues.apache.org/jira/browse/HADOOP-14483 Project: Hadoop Common Issue Type: Sub-task Components: fs/s3 Affects Versions: 2.8.0 Reporter: Steve Loughran Priority: Minor increment the default value of {{fs.s3a.multipart.size}} from "100M" to "128M". Why? AWS S3 throttles clients making too many requests; going to a larger size will reduce this. Also: document the issue -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
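As a sketch of what the proposed change amounts to for a user overriding the default today, the setting is an ordinary core-site.xml property (the suffix syntax, "128M", is the form the ticket quotes):

```xml
<!-- Proposed new default for the S3A multipart upload part size.
     Larger parts mean fewer requests per upload, reducing the chance
     of AWS S3 request throttling. -->
<property>
  <name>fs.s3a.multipart.size</name>
  <value>128M</value>
</property>
```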
[jira] [Commented] (HADOOP-14457) create() does not notify metadataStore of parent directories or ensure they're not existing files
[ https://issues.apache.org/jira/browse/HADOOP-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035283#comment-16035283 ] Steve Loughran commented on HADOOP-14457: - Update: looked at {{finishedWrite}} in more detail. It does # call {{deleteUnnecessaryFakeDirectories(p.getParent());}} # if s3guard is enabled, update the metastore with the new value of the file. we can/should still have the safety checks in the create call for parents being files, but can maybe postpone the path creation until the file is written (or do it again). FWIW, I'd like the creation code to be kept out of S3AFS if possible, just because it's getting so big & complex. I've pulled writeOperationsHelper out in the committer branch, but there's still a lot of complexity in the core FS now that everything is metastore-guarded. I think we should consider that there's another test missing here: a sequence of: # mkdir(parent) # delete(parent) # touch(child) # stat(child) # ls(parent) Similarly, do one for calling create() on a path whose parent hasn't been created or deleted, but simply doesn't exist. # touch(child) # stat(child) # ls(parent) > create() does not notify metadataStore of parent directories or ensure > they're not existing files > - > > Key: HADOOP-14457 > URL: https://issues.apache.org/jira/browse/HADOOP-14457 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Sean Mackrory > Attachments: HADOOP-14457-HADOOP-13345.001.patch, > HADOOP-14457-HADOOP-13345.002.patch > > > Not a great test yet, but it at least reliably demonstrates the issue. > LocalMetadataStore will sometimes erroneously report that a directory is > empty with isAuthoritative = true when it *definitely* has children the > metadatastore should know about. It doesn't appear to happen if the children > are just directories. The fact that it's returning an empty listing is > concerning, but the fact that it says it's authoritative *might* be a second > bug. 
> {code} > diff --git > a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java > > b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java > index 78b3970..1821d19 100644 > --- > a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java > +++ > b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java > @@ -965,7 +965,7 @@ public boolean hasMetadataStore() { >} > >@VisibleForTesting > - MetadataStore getMetadataStore() { > + public MetadataStore getMetadataStore() { > return metadataStore; >} > > diff --git > a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java > > b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java > index 4339649..881bdc9 100644 > --- > a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java > +++ > b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java > @@ -23,6 +23,11 @@ > import org.apache.hadoop.fs.contract.AbstractFSContract; > import org.apache.hadoop.fs.FileSystem; > import org.apache.hadoop.fs.Path; > +import org.apache.hadoop.fs.s3a.S3AFileSystem; > +import org.apache.hadoop.fs.s3a.Tristate; > +import org.apache.hadoop.fs.s3a.s3guard.DirListingMetadata; > +import org.apache.hadoop.fs.s3a.s3guard.MetadataStore; > +import org.junit.Test; > > import static org.apache.hadoop.fs.contract.ContractTestUtils.dataset; > import static org.apache.hadoop.fs.contract.ContractTestUtils.writeDataset; > @@ -72,4 +77,24 @@ public void testRenameDirIntoExistingDir() throws > Throwable { > boolean rename = fs.rename(srcDir, destDir); > assertFalse("s3a doesn't support rename to non-empty directory", rename); >} > + > + @Test > + public void testMkdirPopulatesFileAncestors() throws Exception { > +final FileSystem fs = getFileSystem(); > +final MetadataStore ms 
= ((S3AFileSystem) fs).getMetadataStore(); > +final Path parent = path("testMkdirPopulatesFileAncestors/source"); > +try { > + fs.mkdirs(parent); > + final Path nestedFile = new Path(parent, "dir1/dir2/dir3/file4"); > + byte[] srcDataset = dataset(256, 'a', 'z'); > + writeDataset(fs, nestedFile, srcDataset, srcDataset.length, > + 1024, false); > + > + DirListingMetadata list = ms.listChildren(parent); > + assertTrue("MetadataStore falsely reports authoritative empty list", > + list.isEmpty() == Tristate.FALSE || !list.isAuthoritative()); > +} finally { > + fs.delete(parent, true); > +} > + } > } > {code} -- This
[jira] [Comment Edited] (HADOOP-13998) initial s3guard preview
[ https://issues.apache.org/jira/browse/HADOOP-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035280#comment-16035280 ] Sean Mackrory edited comment on HADOOP-13998 at 6/2/17 7:23 PM: [~ste...@apache.org] - regarding that test issue, that would happen if a directory was deleted and a file inside it was then created without correctly overwriting or removing the tombstone of the parent directories. If you're using the DynamoDB implementation, it should definitely be replacing the tombstone for the parent directory when the file is created. If you're using the Local implementation, I wonder if that's happening as a result of HADOOP-14457. I'll take a closer look at that again and see if I can reproduce, though I thought I had added test cases for that sequence. was (Author: mackrorysd): [~ste...@apache.org] - regarding that test issue, that would happen if a directory was deleted, and a file inside it was then created. If you're using the DynamoDB implementation, it should definitely be replacing the tombstone for the parent directory when the file is created. If you're using the Local implementation, I wonder if that's happening as a result of HADOOP-14457. I'll take a closer look at that again and see if I can reproduce, though I thought I had added test cases for that sequence. > initial s3guard preview > --- > > Key: HADOOP-13998 > URL: https://issues.apache.org/jira/browse/HADOOP-13998 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Steve Loughran > > JIRA to link in all the things we think are needed for a preview/merge into > trunk -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-13998) initial s3guard preview
[ https://issues.apache.org/jira/browse/HADOOP-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035280#comment-16035280 ] Sean Mackrory commented on HADOOP-13998: [~ste...@apache.org] - regarding that test issue, that would happen if a directory was deleted, and a file inside it was then created. If you're using the DynamoDB implementation, it should definitely be replacing the tombstone for the parent directory when the file is created. If you're using the Local implementation, I wonder if that's happening as a result of HADOOP-14457. I'll take a closer look at that again and see if I can reproduce, though I thought I had added test cases for that sequence. > initial s3guard preview > --- > > Key: HADOOP-13998 > URL: https://issues.apache.org/jira/browse/HADOOP-13998 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Steve Loughran > > JIRA to link in all the things we think are needed for a preview/merge into > trunk -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-13998) initial s3guard preview
[ https://issues.apache.org/jira/browse/HADOOP-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035268#comment-16035268 ] Sean Mackrory commented on HADOOP-13998: {quote}If these tests were working before you turned s3guard on then they weren't catching inconsistencies & so were lucky (as mine were){quote} Actually I believe a few of those tests had transient failures at a fairly consistent rate (something like 1 in 4 or 1 in 6 test runs if I remember correctly) that had always been assumed to be the result of inconsistency. They stopped failing entirely once the initial work for list-after-put consistency was incorporated. > initial s3guard preview > --- > > Key: HADOOP-13998 > URL: https://issues.apache.org/jira/browse/HADOOP-13998 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Steve Loughran > > JIRA to link in all the things we think are needed for a preview/merge into > trunk -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14457) create() does not notify metadataStore of parent directories or ensure they're not existing files
[ https://issues.apache.org/jira/browse/HADOOP-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035262#comment-16035262 ] Steve Loughran commented on HADOOP-14457: - OK, I am effectively seeing this in my committer tests where the file {{s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS}} exists, but an attempt to list the parent dir fails as a delete marker is being found instead. {code} 2017-06-02 19:59:19,791 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG s3a.S3AFileSystem (S3AFileSystem.java:innerGetFileStatus(1899)) - Getting path status for s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS (cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS) 2017-06-02 19:59:19,791 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG s3guard.MetadataStore (LocalMetadataStore.java:get(151)) - get(s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS) -> file s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS 3404UNKNOWN false S3AFileStatus{path=s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS; isDirectory=false; length=3404; replication=1; blocksize=1048576; modification_time=1496429958524; access_time=0; owner=stevel; group=stevel; permission=rw-rw-rw-; isSymlink=false; hasAcl=false; isEncrypted=false; isErasureCoded=false} isEmptyDirectory=FALSE 2017-06-02 19:59:19,792 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG s3a.S3AFileSystem (S3AFileSystem.java:innerListStatus(1660)) - List status for path: 
s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc 2017-06-02 19:59:19,792 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG s3a.S3AFileSystem (S3AFileSystem.java:innerGetFileStatus(1899)) - Getting path status for s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc (cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc) 2017-06-02 19:59:19,792 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG s3guard.MetadataStore (LocalMetadataStore.java:get(151)) - get(s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc) -> file s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc 0 UNKNOWN true FileStatus{path=s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc; isDirectory=false; length=0; replication=0; blocksize=0; modification_time=1496429951655; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false; hasAcl=false; isEncrypted=false; isErasureCoded=false} 2017-06-02 19:59:19,801 [dispatcher-event-loop-6] INFO spark.MapOutputTrackerMasterEndpoint (Logging.scala:logInfo(54)) - MapOutputTrackerMasterEndpoint stopped! 2017-06-02 19:59:19,811 [dispatcher-event-loop-3] INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint (Logging.scala:logInfo(54)) - OutputCommitCoordinator stopped! 
2017-06-02 19:59:19,814 [ScalaTest-main-running-S3ACommitDataframeSuite] INFO spark.SparkContext (Logging.scala:logInfo(54)) - Successfully stopped SparkContext - Dataframe+partitioned *** FAILED *** java.io.FileNotFoundException: Path s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc is recorded as deleted by S3Guard at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:1906) at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1881) at org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:1664) at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:1640) at com.hortonworks.spark.cloud.ObjectStoreOperations$class.validateRowCount(ObjectStoreOperations.scala:340) at com.hortonworks.spark.cloud.CloudSuite.validateRowCount(CloudSuite.scala:37) at com.hortonworks.spark.cloud.s3.commit.S3ACommitDataframeSuite.testOneFormat(S3ACommitDataframeSuite.scala:111) at com.hortonworks.spark.cloud.s3.commit.S3ACommitDataframeSuite$$anonfun$1$$anonfun$apply$2.apply$mcV$sp(S3ACommitDataframeSuite.scala:71) at com.hortonworks.spark.cloud.CloudSuiteTrait$$anonfun$ctest$1.apply$mcV$sp(CloudSuiteTrait.scala:66) at com.hortonworks.spark.cloud.CloudSuiteTrait$$anonfun$ctest$1.apply(CloudSuiteTrait.scala:64) {code} The
[jira] [Updated] (HADOOP-14283) S3A may hang due to bug in AWS SDK 1.11.86
[ https://issues.apache.org/jira/browse/HADOOP-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Fabbri updated HADOOP-14283: -- Attachment: ITestS3AConcurrentRename.java Attaching related Hadoop scale test but I'm leaning towards not including it in the codebase because: 1. It is slow as heck. One of those things you need to run 4+ hours to get some confidence on. 2. The direct-to-SDK [test|https://github.com/ajfabbri/awstest] I posted in the description is an easier way to reproduce the SDK hang. > S3A may hang due to bug in AWS SDK 1.11.86 > -- > > Key: HADOOP-14283 > URL: https://issues.apache.org/jira/browse/HADOOP-14283 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 3.0.0-alpha2 >Reporter: Aaron Fabbri >Assignee: Aaron Fabbri >Priority: Critical > Attachments: HADOOP-14283.001.patch, ITestS3AConcurrentRename.java > > > We hit a hang bug when testing S3A with parallel renames. > I narrowed this down to the newer AWS Java SDK. It only happens under load, > and appears to be a failure to wake up a waiting thread on timeout/error. > I've created a github issue here: > https://github.com/aws/aws-sdk-java/issues/1102 > I can post a Hadoop scale test which reliably reproduces this after some > cleanup. I have posted an SDK-only test here which reproduces the issue > without Hadoop: > https://github.com/ajfabbri/awstest > I have a support ticket open and am working with Amazon on this bug so I'll > take this issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
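The general shape of such a scale test (many renames in flight at once, with a timeout guard so a hang surfaces as a test failure rather than a stuck multi-hour run) can be sketched generically. This is an illustration of the pattern against the local filesystem, not the attached ITestS3AConcurrentRename.java; the real test would target s3a:// paths:

```python
import concurrent.futures
import os
import shutil
import tempfile


def rename_worker(src, dst):
    # Stand-in for a FileSystem.rename() call against the object store.
    os.rename(src, dst)
    return dst


# Set up N source directories on the local FS to stand in for bucket paths.
base = tempfile.mkdtemp()
pairs = []
for i in range(8):
    src = os.path.join(base, "src-%d" % i)
    os.mkdir(src)
    pairs.append((src, os.path.join(base, "dst-%d" % i)))

with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
    futures = [pool.submit(rename_worker, s, d) for s, d in pairs]
    # A client hang shows up here as concurrent.futures.TimeoutError
    # instead of blocking the whole test run indefinitely.
    results = [f.result(timeout=60) for f in futures]

assert len(results) == 8
assert all(os.path.isdir(d) for _, d in pairs)
shutil.rmtree(base)
```

The timeout on Future.result() is the key design choice: a wedged SDK thread (the failure mode reported upstream) converts into a deterministic test failure.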
[jira] [Updated] (HADOOP-14283) S3A may hang due to bug in AWS SDK 1.11.86
[ https://issues.apache.org/jira/browse/HADOOP-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Fabbri updated HADOOP-14283: -- Attachment: HADOOP-14283.001.patch Attaching patch which bumps SDK from 1.11.86 to 1.11.134. There are now newer versions but I've done a good amount of testing on .134. I ran unit, integration, and scale tests in us-west-2. > S3A may hang due to bug in AWS SDK 1.11.86 > -- > > Key: HADOOP-14283 > URL: https://issues.apache.org/jira/browse/HADOOP-14283 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 3.0.0-alpha2 >Reporter: Aaron Fabbri >Assignee: Aaron Fabbri >Priority: Critical > Attachments: HADOOP-14283.001.patch > > > We hit a hang bug when testing S3A with parallel renames. > I narrowed this down to the newer AWS Java SDK. It only happens under load, > and appears to be a failure to wake up a waiting thread on timeout/error. > I've created a github issue here: > https://github.com/aws/aws-sdk-java/issues/1102 > I can post a Hadoop scale test which reliably reproduces this after some > cleanup. I have posted an SDK-only test here which reproduces the issue > without Hadoop: > https://github.com/ajfabbri/awstest > I have a support ticket open and am working with Amazon on this bug so I'll > take this issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14482) Update BUILDING.txt to include the correct steps to install zstd library
[ https://issues.apache.org/jira/browse/HADOOP-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HADOOP-14482: - Description: The current BUILDING.txt includes the following steps for installing zstd library: $ sudo apt-get install zstd This is incorrect. On my Ubuntu 16 machine, zstd is not a library {quote} apt-cache search zstd libzstd-dev - fast lossless compression algorithm -- development files libzstd0 - fast lossless compression algorithm zstd - fast lossless compression algorithm -- CLI tool {quote} On a Ubuntu 14 machine, I couldn't even find anything related to zstd. In fact, to build Hadoop with ZStandard library, I have to install libzstd-dev. I will also need to install the runtime to use it. libzstd0 is the older version. libzstd1 is for zstd 1.x. It's not clear to me if libzstd0 is compatible. CentOS does have libzstd1 though. Perhaps we can provide instruction to compile/install libzstd from source code. {quote} * Use -Dzstd.prefix to specify a nonstandard location for the libzstd header files and library files. You do not need this option if you have installed zstandard using a package manager. * Use -Dzstd.lib to specify a nonstandard location for the libzstd library files. Similarly to zstd.prefix, you do not need this option if you have installed using a package manager. {quote} At least for CentOS, the library installed by rpm was not located and I had to specify -Dzstd.prefix to get it installed. was: The current BUILDING.txt includes the following steps for installing zstd library: $ sudo apt-get install zstd This is incorrect. On my Ubuntu machine, zstd is not a library {quote} apt-cache search zstd libzstd-dev - fast lossless compression algorithm -- development files libzstd0 - fast lossless compression algorithm zstd - fast lossless compression algorithm -- CLI tool {quote} In fact, to build Hadoop with ZStandard library, I have to install libzstd-dev. 
I will also need to install the runtime to use it. libzstd0 is the older version. libzstd1 is for zstd 1.x. It's not clear to me if libzstd0 is compatible. CentOS does have libzstd1 though. {quote} * Use -Dzstd.prefix to specify a nonstandard location for the libzstd header files and library files. You do not need this option if you have installed zstandard using a package manager. * Use -Dzstd.lib to specify a nonstandard location for the libzstd library files. Similarly to zstd.prefix, you do not need this option if you have installed using a package manager. {quote} At least for CentOS, the library installed by rpm was not located and I had to specify -Dzstd.prefix to get it installed. > Update BUILDING.txt to include the correct steps to install zstd library > > > Key: HADOOP-14482 > URL: https://issues.apache.org/jira/browse/HADOOP-14482 > Project: Hadoop Common > Issue Type: Improvement > Components: io >Affects Versions: 3.0.0-alpha2 >Reporter: Wei-Chiu Chuang >Priority: Minor > > The current BUILDING.txt includes the following steps for installing zstd > library: > $ sudo apt-get install zstd > This is incorrect. On my Ubuntu 16 machine, zstd is not a library > {quote} > apt-cache search zstd > libzstd-dev - fast lossless compression algorithm -- development files > libzstd0 - fast lossless compression algorithm > zstd - fast lossless compression algorithm -- CLI tool > {quote} > On a Ubuntu 14 machine, I couldn't even find anything related to zstd. > In fact, to build Hadoop with ZStandard library, I have to install > libzstd-dev. > I will also need to install the runtime to use it. libzstd0 is the older > version. libzstd1 is for zstd 1.x. It's not clear to me if libzstd0 is > compatible. CentOS does have libzstd1 though. > Perhaps we can provide instruction to compile/install libzstd from source > code. > {quote} > * Use -Dzstd.prefix to specify a nonstandard location for the libzstd > header files and library files. 
You do not need this option if you have > installed zstandard using a package manager. > * Use -Dzstd.lib to specify a nonstandard location for the libzstd library > files. Similarly to zstd.prefix, you do not need this option if you have > installed using a package manager. > {quote} > At least for CentOS, the library installed by rpm was not located and I had > to specify -Dzstd.prefix to get it installed. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
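Putting the description together, a corrected recipe would look roughly like the following. The package name comes from the apt-cache output quoted above and -Dzstd.prefix is quoted from BUILDING.txt, but the -Pnative / -Drequire.zstd flags and the /usr/local path are assumptions for the example:

```shell
# Ubuntu 16.04: install the development headers, not the CLI tool.
sudo apt-get install libzstd-dev     # not "apt-get install zstd"

# CentOS: if the rpm-installed library is not auto-detected, point the
# native build at its install prefix explicitly (example path).
mvn package -Pnative -Drequire.zstd -Dzstd.prefix=/usr/local -DskipTests
```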
[jira] [Commented] (HADOOP-13998) initial s3guard preview
[ https://issues.apache.org/jira/browse/HADOOP-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035217#comment-16035217 ] Steve Loughran commented on HADOOP-13998: - regarding tests, I'm seeing something up with the combination of (s3guard and the partition committer (and only it)): a newly created file is where it should be, but the parent dir is still tagged as missing. I can GET the file, but if I try to list the parent I get rejected: {code} 2017-06-02 18:19:10,709 [ScalaTest-main-running-S3ACommitDataframeSuite] INFO s3.S3AOperations (Logging.scala:logInfo(54)) - s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/part-0-7573c876-38e5-4024-8a53-51fa1aa9c9c2-c000.snappy.orc size=384 2017-06-02 18:19:10,709 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG s3a.S3AFileSystem (S3AFileSystem.java:innerGetFileStatus(1899)) - Getting path status for s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS (cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS) 2017-06-02 18:19:10,710 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG s3guard.MetadataStore (LocalMetadataStore.java:get(151)) - get(s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS) -> file s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS 3400UNKNOWN false S3AFileStatus{path=s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS; isDirectory=false; length=3400; replication=1; blocksize=1048576; modification_time=1496423948811; access_time=0; owner=stevel; group=stevel; permission=rw-rw-rw-; isSymlink=false; hasAcl=false; isEncrypted=false; isErasureCoded=false} 
isEmptyDirectory=FALSE 2017-06-02 18:19:10,710 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG s3a.S3AFileSystem (S3AFileSystem.java:innerListStatus(1660)) - List status for path: s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc 2017-06-02 18:19:10,710 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG s3a.S3AFileSystem (S3AFileSystem.java:innerGetFileStatus(1899)) - Getting path status for s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc (cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc) 2017-06-02 18:19:10,711 [ScalaTest-main-running-S3ACommitDataframeSuite] DEBUG s3guard.MetadataStore (LocalMetadataStore.java:get(151)) - get(s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc) -> file s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc 0 UNKNOWN true FileStatus{path=s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc; isDirectory=false; length=0; replication=0; blocksize=0; modification_time=1496423936532; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false; hasAcl=false; isEncrypted=false; isErasureCoded=false} 2017-06-02 18:19:10,719 [dispatcher-event-loop-6] INFO spark.MapOutputTrackerMasterEndpoint (Logging.scala:logInfo(54)) - MapOutputTrackerMasterEndpoint stopped! 2017-06-02 18:19:10,727 [dispatcher-event-loop-3] INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint (Logging.scala:logInfo(54)) - OutputCommitCoordinator stopped! 
2017-06-02 18:19:10,729 [ScalaTest-main-running-S3ACommitDataframeSuite] INFO spark.SparkContext (Logging.scala:logInfo(54)) - Successfully stopped SparkContext
- Dataframe+partitioned *** FAILED ***
java.io.FileNotFoundException: Path s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc is recorded as deleted by S3Guard
  at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:1906)
  at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1881)
  at org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:1664)
  at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:1640)
  at com.hortonworks.spark.cloud.ObjectStoreOperations$class.validateRowCount(ObjectStoreOperations.scala:340)
  at com.hortonworks.spark.cloud.CloudSuite.validateRowCount(CloudSuite.scala:37)
  at com.hortonworks.spark.cloud.s3.commit.S3ACommitDataframeSuite.testOneFormat(S3ACommitDataframeSuite.scala:107)
  at
{code}
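The failure mode in the log above can be illustrated with a minimal sketch: the child entry is live in the metadata store, but the parent path still carries a delete tombstone, so a status check on the parent throws even though the child itself is readable. All names here (the map-backed store, the simplified getFileStatus) are illustrative stand-ins, not the real S3Guard MetadataStore API.

```java
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;

public class TombstoneShadowDemo {
  // true = live entry, false = tombstone ("recorded as deleted")
  static Map<String, Boolean> store = new HashMap<>();

  // Simplified stand-in for S3AFileSystem.innerGetFileStatus() consulting
  // only the metadata store: a tombstone short-circuits to FNFE.
  static String getFileStatus(String path) throws FileNotFoundException {
    Boolean entry = store.get(path);
    if (entry != null && !entry) {
      throw new FileNotFoundException(
          "Path " + path + " is recorded as deleted by S3Guard");
    }
    return path;
  }

  public static void main(String[] args) throws Exception {
    store.put("/orc", false);          // stale tombstone left on the parent dir
    store.put("/orc/_SUCCESS", true);  // freshly created child file
    System.out.println(getFileStatus("/orc/_SUCCESS")); // child is readable
    try {
      getFileStatus("/orc");           // listing the parent fails here first
    } catch (FileNotFoundException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

This matches the trace: the GET of _SUCCESS succeeds, while listStatus on the parent dies in innerGetFileStatus before any S3 LIST is attempted.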
[jira] [Commented] (HADOOP-14445) Delegation tokens are not shared between KMS instances
[ https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035218#comment-16035218 ] Arun Suresh commented on HADOOP-14445: -- [~daryn], agreed that duplicating the entries is bad if the RM will blindly renew all of them. Although, IIRC, the tokens are renewed only if they are expired, so if they are renewed serially, it should not be a problem. But I do agree, since RM also renews DTs on other app events as well (app recovery etc) - for which duplicate renewal, might not be preventable. bq. The cleanest way to manage a kms cluster is transparently via a cname or multi-A record True, in which case one would not even need the LoadBalancingKMSClientProvider. I do like the idea of using a nameservice though, as [~yzhangal] suggested which will ensure that we will still have only 1 single entry. > Delegation tokens are not shared between KMS instances > -- > > Key: HADOOP-14445 > URL: https://issues.apache.org/jira/browse/HADOOP-14445 > Project: Hadoop Common > Issue Type: Bug > Components: documentation, kms >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Wei-Chiu Chuang >Assignee: Rushabh S Shah > Attachments: HADOOP-14445-branch-2.8.patch > > > As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider do > not share delegation tokens. (a client uses KMS address/port as the key for > delegation token) > {code:title=DelegationTokenAuthenticatedURL#openConnection} > if (!creds.getAllTokens().isEmpty()) { > InetSocketAddress serviceAddr = new InetSocketAddress(url.getHost(), > url.getPort()); > Text service = SecurityUtil.buildTokenService(serviceAddr); > dToken = creds.getToken(service); > {code} > But KMS doc states: > {quote} > Delegation Tokens > Similar to HTTP authentication, KMS uses Hadoop Authentication for delegation > tokens too. 
> Under HA, A KMS instance must verify the delegation token given by another > KMS instance, by checking the shared secret used to sign the delegation > token. To do this, all KMS instances must be able to retrieve the shared > secret from ZooKeeper. > {quote} > We should either update the KMS documentation, or fix this code to share > delegation tokens. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
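The root cause quoted above — the client keying delegation tokens by KMS address/port — can be sketched without the Hadoop classes. Here SecurityUtil.buildTokenService is approximated as plain host:port; the class and method below are illustrative, not the real Hadoop API.

```java
import java.util.HashMap;
import java.util.Map;

public class TokenServiceKeyDemo {
  // Simplified stand-in for SecurityUtil.buildTokenService(InetSocketAddress):
  // the token's service key is derived from the concrete host and port.
  static String buildTokenService(String host, int port) {
    return host + ":" + port;
  }

  public static void main(String[] args) {
    Map<String, String> credentials = new HashMap<>();
    // A token obtained from the first KMS is stored under that host's key...
    credentials.put(buildTokenService("kms1.example.com", 9600), "token-from-kms1");
    // ...so a lookup made while the load balancer routes to the second KMS
    // finds nothing, even though both instances share the signing secret.
    String found = credentials.get(buildTokenService("kms2.example.com", 9600));
    System.out.println("token found for kms2? " + (found != null));
  }
}
```

This is why a single logical service name (a nameservice, CNAME, or multi-A record, as discussed in the comment) makes the token shareable: every instance then resolves to the same key.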
[jira] [Commented] (HADOOP-14468) S3Guard: make short-circuit getFileStatus() configurable
[ https://issues.apache.org/jira/browse/HADOOP-14468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035211#comment-16035211 ] Aaron Fabbri commented on HADOOP-14468: --- I created this JIRA to follow up on [your comment|https://issues.apache.org/jira/browse/HADOOP-13345?focusedCommentId=16019741=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16019741] and the discussion about failing fast when file is not visible in S3 in the read path. I'm not 100% convinced we want this but it could be useful for: 1. Failing fast on open() instead of when we later read the stream. 2. A "safe mode" or fallback that can be enabled. When this is set to false, we could collect stats on any time MetadataStore differs from S3 which would be interesting. I.e. "s3 / metastore length differs" or "visible in metastore but not s3" In general we do not support a mixed mode where some clients use S3Guard and others do not: It is not safe. However, if there is a well-known path where only an external process (e.g. ETL) is dropping files for ingest, it may be nice to be able to support that more narrow case. I think the existing behavior with list checking S3 + MetadataStore is sufficient without this change though. > S3Guard: make short-circuit getFileStatus() configurable > > > Key: HADOOP-14468 > URL: https://issues.apache.org/jira/browse/HADOOP-14468 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Aaron Fabbri >Assignee: Aaron Fabbri > > Currently, when S3Guard is enabled, getFileStatus() will skip S3 if it gets a > result from the MetadataStore (e.g. dynamodb) first. > I would like to add a new parameter > {{fs.s3a.metadatastore.getfilestatus.authoritative}} which, when true, keeps > the current behavior. When false, S3AFileSystem will check both S3 and the > MetadataStore. 
> I'm not sure yet if we want to have this behavior the same for all callers of > getFileStatus(), or if we only want to check both S3 and MetadataStore for > some internal callers such as open().
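The toggle proposed in this JIRA can be sketched as follows. The flag name comes from the issue description; everything else (the string-valued statuses, the divergence counter printed to stdout) is a hypothetical simplification of what would really live in S3AFileSystem.

```java
public class GetFileStatusSketch {

  /**
   * @param msStatus status from the MetadataStore, or null if no entry
   * @param s3Status status from a HEAD against S3, or null if absent
   * @param authoritative value of the proposed
   *        fs.s3a.metadatastore.getfilestatus.authoritative flag
   */
  static String getFileStatus(String msStatus, String s3Status,
                              boolean authoritative) {
    if (msStatus != null && authoritative) {
      return msStatus;            // current short-circuit: S3 never consulted
    }
    if (msStatus != null && s3Status == null) {
      // the kind of statistic the comment suggests collecting
      System.out.println("visible in metastore but not S3");
    }
    return s3Status != null ? s3Status : msStatus;
  }

  public static void main(String[] args) {
    // authoritative=true: the cached answer wins outright
    System.out.println(getFileStatus("cached", null, true));
    // authoritative=false: both sources checked, fresh S3 result preferred
    System.out.println(getFileStatus("cached", "fresh", false));
  }
}
```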
[jira] [Updated] (HADOOP-14481) Print stack trace when native bzip2 library does not load
[ https://issues.apache.org/jira/browse/HADOOP-14481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HADOOP-14481: - Status: Patch Available (was: Open) > Print stack trace when native bzip2 library does not load > - > > Key: HADOOP-14481 > URL: https://issues.apache.org/jira/browse/HADOOP-14481 > Project: Hadoop Common > Issue Type: Improvement > Components: io >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Minor > Attachments: HADOOP-14481.001.patch > > > When I ran hadoop checknative on my machine, it was not able to load system > bzip2 library and printed the following message. > 17/06/02 09:25:42 WARN bzip2.Bzip2Factory: Failed to load/initialize > native-bzip2 library system-native, will use pure-Java version > Reviewing the relevant code, it fails because of an exception. However, that > exception is not logged. We should print the stacktrace, at least at debug > log level. > {code:title=Bzip2Factory#isNativeBzip2Loaded()} > try { > // Initialize the native library. > Bzip2Compressor.initSymbols(libname); > Bzip2Decompressor.initSymbols(libname); > nativeBzip2Loaded = true; > LOG.info("Successfully loaded & initialized native-bzip2 library " + >libname); > } catch (Throwable t) { > LOG.warn("Failed to load/initialize native-bzip2 library " + >libname + ", will use pure-Java version"); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-14482) Update BUILDING.txt to include the correct steps to install zstd library
Wei-Chiu Chuang created HADOOP-14482: Summary: Update BUILDING.txt to include the correct steps to install zstd library Key: HADOOP-14482 URL: https://issues.apache.org/jira/browse/HADOOP-14482 Project: Hadoop Common Issue Type: Improvement Components: io Affects Versions: 3.0.0-alpha2 Reporter: Wei-Chiu Chuang Priority: Minor The current BUILDING.txt includes the following steps for installing zstd library: $ sudo apt-get install zstd This is incorrect. On my Ubuntu machine, zstd is not a library {quote} apt-cache search zstd libzstd-dev - fast lossless compression algorithm -- development files libzstd0 - fast lossless compression algorithm zstd - fast lossless compression algorithm -- CLI tool {quote} In fact, to build Hadoop with ZStandard library, I have to install libzstd-dev. I will also need to install the runtime to use it. libzstd0 is the older version. libzstd1 is for zstd 1.x. It's not clear to me if libzstd0 is compatible. CentOS does have libzstd1 though. {quote} * Use -Dzstd.prefix to specify a nonstandard location for the libzstd header files and library files. You do not need this option if you have installed zstandard using a package manager. * Use -Dzstd.lib to specify a nonstandard location for the libzstd library files. Similarly to zstd.prefix, you do not need this option if you have installed using a package manager. {quote} At least for CentOS, the library installed by rpm was not located and I had to specify -Dzstd.prefix to get it installed. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14445) Delegation tokens are not shared between KMS instances
[ https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035169#comment-16035169 ] Yongjun Zhang commented on HADOOP-14445: Thanks [~daryn]. When I say it's unavoidable, I mean, if we use a nameservice, we need to consult the config to know what's associated with the nameservice. For example NN nameservice, if we do "distcp hdfs://nameservice1:/xyz hdfs://nameservice2:/abc", we need to look up nameservice1/2 in config, to know the associated NNs. Similarly, if we use shared delegation token for all KMS servers, we could define a kms-nameservice to associate with the set of KMS servers, and the tokenService can just be the kms-nameservice, and from the config, we can find out the associated KMS server information. Does that make sense? > Delegation tokens are not shared between KMS instances > -- > > Key: HADOOP-14445 > URL: https://issues.apache.org/jira/browse/HADOOP-14445 > Project: Hadoop Common > Issue Type: Bug > Components: documentation, kms >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Wei-Chiu Chuang >Assignee: Rushabh S Shah > Attachments: HADOOP-14445-branch-2.8.patch > > > As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider do > not share delegation tokens. (a client uses KMS address/port as the key for > delegation token) > {code:title=DelegationTokenAuthenticatedURL#openConnection} > if (!creds.getAllTokens().isEmpty()) { > InetSocketAddress serviceAddr = new InetSocketAddress(url.getHost(), > url.getPort()); > Text service = SecurityUtil.buildTokenService(serviceAddr); > dToken = creds.getToken(service); > {code} > But KMS doc states: > {quote} > Delegation Tokens > Similar to HTTP authentication, KMS uses Hadoop Authentication for delegation > tokens too. > Under HA, A KMS instance must verify the delegation token given by another > KMS instance, by checking the shared secret used to sign the delegation > token. 
To do this, all KMS instances must be able to retrieve the shared > secret from ZooKeeper. > {quote} > We should either update the KMS documentation, or fix this code to share > delegation tokens.
[jira] [Commented] (HADOOP-14445) Delegation tokens are not shared between KMS instances
[ https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035159#comment-16035159 ] Daryn Sharp commented on HADOOP-14445: -- bq. It seems unavoidable if we want to implement ksm-nameservice. I'm not sure what this means but anything config based is not scalable. Updating configs of tens of thousands of nodes, launchers, oozie, storm, spark, etc and restarting the services is just not logistically possible. This is largely why we added the ability for the NN to tell the client the kms uri, plus it added much needed multi-kms support. bq. If user add new KMS and replace KMS, the clients need to be restarted with the new config. Another illustration of why a config-based approach is a bad idea. The cleanest way to manage a kms cluster is transparently via a cname or multi-A record. I've consulted with Rushabh on the initial design, I'll review the actual patch today. > Delegation tokens are not shared between KMS instances > -- > > Key: HADOOP-14445 > URL: https://issues.apache.org/jira/browse/HADOOP-14445 > Project: Hadoop Common > Issue Type: Bug > Components: documentation, kms >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Wei-Chiu Chuang >Assignee: Rushabh S Shah > Attachments: HADOOP-14445-branch-2.8.patch > > > As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider do > not share delegation tokens. (a client uses KMS address/port as the key for > delegation token) > {code:title=DelegationTokenAuthenticatedURL#openConnection} > if (!creds.getAllTokens().isEmpty()) { > InetSocketAddress serviceAddr = new InetSocketAddress(url.getHost(), > url.getPort()); > Text service = SecurityUtil.buildTokenService(serviceAddr); > dToken = creds.getToken(service); > {code} > But KMS doc states: > {quote} > Delegation Tokens > Similar to HTTP authentication, KMS uses Hadoop Authentication for delegation > tokens too. 
> Under HA, A KMS instance must verify the delegation token given by another > KMS instance, by checking the shared secret used to sign the delegation > token. To do this, all KMS instances must be able to retrieve the shared > secret from ZooKeeper. > {quote} > We should either update the KMS documentation, or fix this code to share > delegation tokens.
[jira] [Updated] (HADOOP-14481) Print stack trace when native bzip2 library does not load
[ https://issues.apache.org/jira/browse/HADOOP-14481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HADOOP-14481: - Attachment: HADOOP-14481.001.patch Attach a very simple fix. With this fix, I am getting the following stacktrace: 17/06/02 10:42:04 WARN bzip2.Bzip2Factory: Failed to load/initialize native-bzip2 library system-native, will use pure-Java version java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.compress.bzip2.Bzip2Compressor.initIDs(Ljava/lang/String;)V at org.apache.hadoop.io.compress.bzip2.Bzip2Compressor.initIDs(Native Method) at org.apache.hadoop.io.compress.bzip2.Bzip2Compressor.initSymbols(Bzip2Compressor.java:284) at org.apache.hadoop.io.compress.bzip2.Bzip2Factory.isNativeBzip2Loaded(Bzip2Factory.java:58) at org.apache.hadoop.util.NativeLibraryChecker.main(NativeLibraryChecker.java:74) > Print stack trace when native bzip2 library does not load > - > > Key: HADOOP-14481 > URL: https://issues.apache.org/jira/browse/HADOOP-14481 > Project: Hadoop Common > Issue Type: Improvement > Components: io >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Minor > Attachments: HADOOP-14481.001.patch > > > When I ran hadoop checknative on my machine, it was not able to load system > bzip2 library and printed the following message. > 17/06/02 09:25:42 WARN bzip2.Bzip2Factory: Failed to load/initialize > native-bzip2 library system-native, will use pure-Java version > Reviewing the relevant code, it fails because of an exception. However, that > exception is not logged. We should print the stacktrace, at least at debug > log level. > {code:title=Bzip2Factory#isNativeBzip2Loaded()} > try { > // Initialize the native library. 
> Bzip2Compressor.initSymbols(libname); > Bzip2Decompressor.initSymbols(libname); > nativeBzip2Loaded = true; > LOG.info("Successfully loaded & initialized native-bzip2 library " + >libname); > } catch (Throwable t) { > LOG.warn("Failed to load/initialize native-bzip2 library " + >libname + ", will use pure-Java version"); > } > {code}
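The fix being attached is essentially a one-liner: pass the caught Throwable to the logger so its stack trace is emitted rather than silently dropped. The sketch below mirrors the shape of Bzip2Factory#isNativeBzip2Loaded() but uses java.util.logging instead of Hadoop's commons-logging LOG, and simulates the failed native call by throwing UnsatisfiedLinkError directly.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class Bzip2LoadSketch {
  private static final Logger LOG = Logger.getLogger("Bzip2Factory");

  static boolean tryLoadNative() {
    try {
      // Stand-in for Bzip2Compressor.initSymbols(libname) failing to bind
      throw new UnsatisfiedLinkError(
          "org.apache.hadoop.io.compress.bzip2.Bzip2Compressor.initIDs(Ljava/lang/String;)V");
    } catch (Throwable t) {
      // Before the patch: the message alone was logged and t was swallowed.
      // After: include t so the stack trace reaches the log.
      LOG.log(Level.WARNING,
          "Failed to load/initialize native-bzip2 library, will use pure-Java version",
          t);
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("native loaded: " + tryLoadNative());
  }
}
```

With commons-logging the equivalent is the two-argument overload LOG.warn(message, t), optionally guarded at debug level as the issue suggests.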
[jira] [Assigned] (HADOOP-14472) Azure: TestReadAndSeekPageBlobAfterWrite fails intermittently
[ https://issues.apache.org/jira/browse/HADOOP-14472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu reassigned HADOOP-14472: -- Assignee: Mingliang Liu > Azure: TestReadAndSeekPageBlobAfterWrite fails intermittently > - > > Key: HADOOP-14472 > URL: https://issues.apache.org/jira/browse/HADOOP-14472 > Project: Hadoop Common > Issue Type: Bug > Components: fs/azure, test >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HADOOP-14472.000.patch > > > Reported by [HADOOP-14461] > {code} > testManySmallWritesWithHFlush(org.apache.hadoop.fs.azure.TestReadAndSeekPageBlobAfterWrite) > Time elapsed: 1.051 sec <<< FAILURE! > java.lang.AssertionError: hflush duration of 13, less than minimum expected > of 20 > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.hadoop.fs.azure.TestReadAndSeekPageBlobAfterWrite.writeAndReadOneFile(TestReadAndSeekPageBlobAfterWrite.java:286) > at > org.apache.hadoop.fs.azure.TestReadAndSeekPageBlobAfterWrite.testManySmallWritesWithHFlush(TestReadAndSeekPageBlobAfterWrite.java:247) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14472) Azure: TestReadAndSeekPageBlobAfterWrite fails intermittently
[ https://issues.apache.org/jira/browse/HADOOP-14472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035094#comment-16035094 ] Mingliang Liu commented on HADOOP-14472: Tested against US WEST region. Now all unit/live tests pass. {code} $ mcb; and cd hadoop-tools/hadoop-azure; and mvn test -q java version "1.8.0_65" Java(TM) SE Runtime Environment (build 1.8.0_65-b17) Java HotSpot(TM) 64-Bit Server VM (build 25.65-b01, mixed mode) --- T E S T S --- --- T E S T S --- Running org.apache.hadoop.fs.azure.contract.TestAzureNativeContractAppend Tests run: 5, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 2.225 sec - in org.apache.hadoop.fs.azure.contract.TestAzureNativeContractAppend Running org.apache.hadoop.fs.azure.contract.TestAzureNativeContractCreate Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.039 sec - in org.apache.hadoop.fs.azure.contract.TestAzureNativeContractCreate Running org.apache.hadoop.fs.azure.contract.TestAzureNativeContractDelete Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.821 sec - in org.apache.hadoop.fs.azure.contract.TestAzureNativeContractDelete Running org.apache.hadoop.fs.azure.contract.TestAzureNativeContractDistCp Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.131 sec - in org.apache.hadoop.fs.azure.contract.TestAzureNativeContractDistCp Running org.apache.hadoop.fs.azure.contract.TestAzureNativeContractGetFileStatus Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.578 sec - in org.apache.hadoop.fs.azure.contract.TestAzureNativeContractGetFileStatus Running org.apache.hadoop.fs.azure.contract.TestAzureNativeContractMkdir Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.513 sec - in org.apache.hadoop.fs.azure.contract.TestAzureNativeContractMkdir Running org.apache.hadoop.fs.azure.contract.TestAzureNativeContractOpen Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.079 sec - in 
org.apache.hadoop.fs.azure.contract.TestAzureNativeContractOpen Running org.apache.hadoop.fs.azure.contract.TestAzureNativeContractRename Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.097 sec - in org.apache.hadoop.fs.azure.contract.TestAzureNativeContractRename Running org.apache.hadoop.fs.azure.contract.TestAzureNativeContractSeek Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.192 sec - in org.apache.hadoop.fs.azure.contract.TestAzureNativeContractSeek Running org.apache.hadoop.fs.azure.metrics.TestAzureFileSystemInstrumentation Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.561 sec - in org.apache.hadoop.fs.azure.metrics.TestAzureFileSystemInstrumentation Running org.apache.hadoop.fs.azure.metrics.TestBandwidthGaugeUpdater Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.324 sec - in org.apache.hadoop.fs.azure.metrics.TestBandwidthGaugeUpdater Running org.apache.hadoop.fs.azure.metrics.TestNativeAzureFileSystemMetricsSystem Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.694 sec - in org.apache.hadoop.fs.azure.metrics.TestNativeAzureFileSystemMetricsSystem Running org.apache.hadoop.fs.azure.metrics.TestRollingWindowAverage Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.213 sec - in org.apache.hadoop.fs.azure.metrics.TestRollingWindowAverage Running org.apache.hadoop.fs.azure.TestAzureConcurrentOutOfBandIo Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.62 sec - in org.apache.hadoop.fs.azure.TestAzureConcurrentOutOfBandIo Running org.apache.hadoop.fs.azure.TestAzureFileSystemErrorConditions Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.603 sec - in org.apache.hadoop.fs.azure.TestAzureFileSystemErrorConditions Running org.apache.hadoop.fs.azure.TestBlobDataValidation Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.444 sec - in org.apache.hadoop.fs.azure.TestBlobDataValidation Running 
org.apache.hadoop.fs.azure.TestBlobMetadata Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.747 sec - in org.apache.hadoop.fs.azure.TestBlobMetadata Running org.apache.hadoop.fs.azure.TestBlobTypeSpeedDifference Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.091 sec - in org.apache.hadoop.fs.azure.TestBlobTypeSpeedDifference Running org.apache.hadoop.fs.azure.TestContainerChecks Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.787 sec - in org.apache.hadoop.fs.azure.TestContainerChecks Running org.apache.hadoop.fs.azure.TestFileSystemOperationExceptionHandling Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.164 sec - in
[jira] [Commented] (HADOOP-13998) initial s3guard preview
[ https://issues.apache.org/jira/browse/HADOOP-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035077#comment-16035077 ] Steve Loughran commented on HADOOP-13998: - bq. We've run our standard downstream Hive, Spark, MR, Impala, scale, and performance tests If these tests were working *before* you turned s3guard on then they weren't catching inconsistencies & so were lucky (as mine were). I'm running my spark committer tests with the inconsistent client turned on, and it is repeatedly failing the classic & magic committers without s3guard enabled: both depend on consistent listing. Also found a brittleness in path cleanup for the magic committer too; cleanup code *must* handle an FNFE if there's a file returned in the listing but which isn't there in the GET. This is why I'd like the factory for the inconsistent client be in src/main: it lets anyone turn on inconsistency for their test runs bq. This is a good point. Do you prefer timing-based microbenchmarks, or S3 request statistics (counts)? the instrumentation ones are way less brittle; Ming has been fixing some nanotimer-assertion in WASB which was failing intermittently. I have some tests somewhere which call listFiles(recursive) against the amazon landsat store: that's the reference example of a deep and wide directory tree. > initial s3guard preview > --- > > Key: HADOOP-13998 > URL: https://issues.apache.org/jira/browse/HADOOP-13998 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Steve Loughran > > JIRA to link in all the things we think are needed for a preview/merge into > trunk -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14468) S3Guard: make short-circuit getFileStatus() configurable
[ https://issues.apache.org/jira/browse/HADOOP-14468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035064#comment-16035064 ] Steve Loughran commented on HADOOP-14468: - What's the reason for this? To pick up changes to files which aren't going to s3guard even when auth=true? > S3Guard: make short-circuit getFileStatus() configurable > > > Key: HADOOP-14468 > URL: https://issues.apache.org/jira/browse/HADOOP-14468 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Aaron Fabbri >Assignee: Aaron Fabbri > > Currently, when S3Guard is enabled, getFileStatus() will skip S3 if it gets a > result from the MetadataStore (e.g. dynamodb) first. > I would like to add a new parameter > {{fs.s3a.metadatastore.getfilestatus.authoritative}} which, when true, keeps > the current behavior. When false, S3AFileSystem will check both S3 and the > MetadataStore. > I'm not sure yet if we want to have this behavior the same for all callers of > getFileStatus(), or if we only want to check both S3 and MetadataStore for > some internal callers such as open(). -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14457) create() does not notify metadataStore of parent directories or ensure they're not existing files
[ https://issues.apache.org/jira/browse/HADOOP-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035059#comment-16035059 ] Steve Loughran commented on HADOOP-14457: - you know we do something in BlockOutputStream when finalizing a write? Or at least I am in the committer branch > create() does not notify metadataStore of parent directories or ensure > they're not existing files > - > > Key: HADOOP-14457 > URL: https://issues.apache.org/jira/browse/HADOOP-14457 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Sean Mackrory > Attachments: HADOOP-14457-HADOOP-13345.001.patch, > HADOOP-14457-HADOOP-13345.002.patch > > > Not a great test yet, but it at least reliably demonstrates the issue. > LocalMetadataStore will sometimes erroneously report that a directory is > empty with isAuthoritative = true when it *definitely* has children the > metadatastore should know about. It doesn't appear to happen if the children > are just directory. The fact that it's returning an empty listing is > concerning, but the fact that it says it's authoritative *might* be a second > bug. 
> {code} > diff --git > a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java > > b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java > index 78b3970..1821d19 100644 > --- > a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java > +++ > b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java > @@ -965,7 +965,7 @@ public boolean hasMetadataStore() { >} > >@VisibleForTesting > - MetadataStore getMetadataStore() { > + public MetadataStore getMetadataStore() { > return metadataStore; >} > > diff --git > a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java > > b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java > index 4339649..881bdc9 100644 > --- > a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java > +++ > b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java > @@ -23,6 +23,11 @@ > import org.apache.hadoop.fs.contract.AbstractFSContract; > import org.apache.hadoop.fs.FileSystem; > import org.apache.hadoop.fs.Path; > +import org.apache.hadoop.fs.s3a.S3AFileSystem; > +import org.apache.hadoop.fs.s3a.Tristate; > +import org.apache.hadoop.fs.s3a.s3guard.DirListingMetadata; > +import org.apache.hadoop.fs.s3a.s3guard.MetadataStore; > +import org.junit.Test; > > import static org.apache.hadoop.fs.contract.ContractTestUtils.dataset; > import static org.apache.hadoop.fs.contract.ContractTestUtils.writeDataset; > @@ -72,4 +77,24 @@ public void testRenameDirIntoExistingDir() throws > Throwable { > boolean rename = fs.rename(srcDir, destDir); > assertFalse("s3a doesn't support rename to non-empty directory", rename); >} > + > + @Test > + public void testMkdirPopulatesFileAncestors() throws Exception { > +final FileSystem fs = getFileSystem(); > +final MetadataStore ms 
= ((S3AFileSystem) fs).getMetadataStore(); > +final Path parent = path("testMkdirPopulatesFileAncestors/source"); > +try { > + fs.mkdirs(parent); > + final Path nestedFile = new Path(parent, "dir1/dir2/dir3/file4"); > + byte[] srcDataset = dataset(256, 'a', 'z'); > + writeDataset(fs, nestedFile, srcDataset, srcDataset.length, > + 1024, false); > + > + DirListingMetadata list = ms.listChildren(parent); > + assertTrue("MetadataStore falsely reports authoritative empty list", > + list.isEmpty() == Tristate.FALSE || !list.isAuthoritative()); > +} finally { > + fs.delete(parent, true); > +} > + } > } > {code}
[jira] [Created] (HADOOP-14481) Print stack trace when native bzip2 library does not load
Wei-Chiu Chuang created HADOOP-14481: Summary: Print stack trace when native bzip2 library does not load Key: HADOOP-14481 URL: https://issues.apache.org/jira/browse/HADOOP-14481 Project: Hadoop Common Issue Type: Improvement Components: io Reporter: Wei-Chiu Chuang Assignee: Wei-Chiu Chuang Priority: Minor

When I ran hadoop checknative on my machine, it was not able to load the system bzip2 library and printed the following message.

17/06/02 09:25:42 WARN bzip2.Bzip2Factory: Failed to load/initialize native-bzip2 library system-native, will use pure-Java version

Reviewing the relevant code, it fails because of an exception. However, that exception is not logged. We should print the stack trace, at least at debug log level.

{code:title=Bzip2Factory#isNativeBzip2Loaded()}
try {
  // Initialize the native library.
  Bzip2Compressor.initSymbols(libname);
  Bzip2Decompressor.initSymbols(libname);
  nativeBzip2Loaded = true;
  LOG.info("Successfully loaded & initialized native-bzip2 library " +
      libname);
} catch (Throwable t) {
  LOG.warn("Failed to load/initialize native-bzip2 library " +
      libname + ", will use pure-Java version");
}
{code}
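[Editorial note] The fix described above can be sketched as follows. This is an illustrative stand-in, not the actual Hadoop patch: it uses java.util.logging in place of Hadoop's logging facade, and `tryLoadNative` is a hypothetical method mimicking `Bzip2Factory#isNativeBzip2Loaded()`. The point is that passing the caught Throwable to the logger preserves the stack trace instead of silently dropping it.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class Bzip2LoadSketch {
    private static final Logger LOG =
        Logger.getLogger(Bzip2LoadSketch.class.getName());

    static boolean tryLoadNative(String libname) {
        try {
            // Stand-in for Bzip2Compressor.initSymbols(libname) etc.,
            // which throws when the native library cannot be loaded.
            throw new UnsatisfiedLinkError("libbz2 not found: " + libname);
        } catch (Throwable t) {
            // The fix: pass the Throwable to the logger so the stack
            // trace is recorded instead of being swallowed.
            LOG.log(Level.WARNING,
                "Failed to load/initialize native-bzip2 library " + libname
                    + ", will use pure-Java version", t);
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryLoadNative("system-native"));
    }
}
```

With Hadoop's logger the equivalent one-line change would be adding `t` as the last argument of the existing `LOG.warn(...)` call (or a separate `LOG.debug(..., t)` if debug level is preferred).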
[jira] [Updated] (HADOOP-14481) Print stack trace when native bzip2 library does not load
[ https://issues.apache.org/jira/browse/HADOOP-14481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HADOOP-14481: - Description: When I ran hadoop checknative on my machine, it was not able to load system bzip2 library and printed the following message. 17/06/02 09:25:42 WARN bzip2.Bzip2Factory: Failed to load/initialize native-bzip2 library system-native, will use pure-Java version Reviewing the relevant code, it fails because of an exception. However, that exception is not logged. We should print the stacktrace, at least at debug log level. {code:title=Bzip2Factory#isNativeBzip2Loaded()} try { // Initialize the native library. Bzip2Compressor.initSymbols(libname); Bzip2Decompressor.initSymbols(libname); nativeBzip2Loaded = true; LOG.info("Successfully loaded & initialized native-bzip2 library " + libname); } catch (Throwable t) { LOG.warn("Failed to load/initialize native-bzip2 library " + libname + ", will use pure-Java version"); } {code} was: When I run hadoop checknative on my machine, it was not able to load system bzip2 library and printed the following message. 17/06/02 09:25:42 WARN bzip2.Bzip2Factory: Failed to load/initialize native-bzip2 library system-native, will use pure-Java version Reviewing the relevant code, it fails because of an exception. However, that exception is not logged. We should print the stacktrace, at least at debug log level. {code:title=Bzip2Factory#isNativeBzip2Loaded()} try { // Initialize the native library. 
Bzip2Compressor.initSymbols(libname); Bzip2Decompressor.initSymbols(libname); nativeBzip2Loaded = true; LOG.info("Successfully loaded & initialized native-bzip2 library " + libname); } catch (Throwable t) { LOG.warn("Failed to load/initialize native-bzip2 library " + libname + ", will use pure-Java version"); } {code} > Print stack trace when native bzip2 library does not load > - > > Key: HADOOP-14481 > URL: https://issues.apache.org/jira/browse/HADOOP-14481 > Project: Hadoop Common > Issue Type: Improvement > Components: io >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Minor > > When I ran hadoop checknative on my machine, it was not able to load system > bzip2 library and printed the following message. > 17/06/02 09:25:42 WARN bzip2.Bzip2Factory: Failed to load/initialize > native-bzip2 library system-native, will use pure-Java version > Reviewing the relevant code, it fails because of an exception. However, that > exception is not logged. We should print the stacktrace, at least at debug > log level. > {code:title=Bzip2Factory#isNativeBzip2Loaded()} > try { > // Initialize the native library. > Bzip2Compressor.initSymbols(libname); > Bzip2Decompressor.initSymbols(libname); > nativeBzip2Loaded = true; > LOG.info("Successfully loaded & initialized native-bzip2 library " + >libname); > } catch (Throwable t) { > LOG.warn("Failed to load/initialize native-bzip2 library " + >libname + ", will use pure-Java version"); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14474) Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures
[ https://issues.apache.org/jira/browse/HADOOP-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035041#comment-16035041 ] Xiao Chen commented on HADOOP-14474: Thanks Allen, created HADOOP-14480 for the long shot. > Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures > -- > > Key: HADOOP-14474 > URL: https://issues.apache.org/jira/browse/HADOOP-14474 > Project: Hadoop Common > Issue Type: Bug > Components: build >Affects Versions: 2.8.0, 2.7.3 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka > Fix For: 2.9.0, 2.7.4, 2.6.6, 2.8.2 > > Attachments: HADOOP-14474-branch-2.01.patch > > > Recently Oracle has changed the download link for Oracle JDK7, and that's why > oracle-java7-installer fails. Precommit jobs for branch-2* are failing > because of this failure. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14480) Remove Oracle JDK usage in Dockerfile
[ https://issues.apache.org/jira/browse/HADOOP-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HADOOP-14480: --- Description: Further to the discussions in HADOOP-14474, we should look for a long-term solution that doesn't use Oracle JDKs. > Remove Oracle JDK usage in Dockerfile > - > > Key: HADOOP-14480 > URL: https://issues.apache.org/jira/browse/HADOOP-14480 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Xiao Chen > > Further to the discussions in HADOOP-14474, we should look for a long-term > solution that doesn't use Oracle JDKs. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-14480) Remove Oracle JDK usage in Dockerfile
Xiao Chen created HADOOP-14480: -- Summary: Remove Oracle JDK usage in Dockerfile Key: HADOOP-14480 URL: https://issues.apache.org/jira/browse/HADOOP-14480 Project: Hadoop Common Issue Type: Improvement Reporter: Xiao Chen -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14474) Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures
[ https://issues.apache.org/jira/browse/HADOOP-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035034#comment-16035034 ] Allen Wittenauer commented on HADOOP-14474: --- As a sidenote, I hope people are aware this is likely "the first shot". I wouldn't be surprised to see Oracle JDK 8 eventually also require an Oracle account. We should probably consider moving off of Oracle JDKs in the Dockerfile completely. > Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures > -- > > Key: HADOOP-14474 > URL: https://issues.apache.org/jira/browse/HADOOP-14474 > Project: Hadoop Common > Issue Type: Bug > Components: build >Affects Versions: 2.8.0, 2.7.3 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka > Fix For: 2.9.0, 2.7.4, 2.6.6, 2.8.2 > > Attachments: HADOOP-14474-branch-2.01.patch > > > Recently Oracle has changed the download link for Oracle JDK7, and that's why > oracle-java7-installer fails. Precommit jobs for branch-2* are failing > because of this failure. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14474) Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures
[ https://issues.apache.org/jira/browse/HADOOP-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HADOOP-14474: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.2 2.6.6 2.7.4 2.9.0 Status: Resolved (was: Patch Available) Pushed to branch-2 and all the way down to branch-2.6, should make pre-commit happy. Thanks Akira for the fix, Allen for review, and everyone for discussion! > Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures > -- > > Key: HADOOP-14474 > URL: https://issues.apache.org/jira/browse/HADOOP-14474 > Project: Hadoop Common > Issue Type: Bug > Components: build >Affects Versions: 2.8.0, 2.7.3 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka > Fix For: 2.9.0, 2.7.4, 2.6.6, 2.8.2 > > Attachments: HADOOP-14474-branch-2.01.patch > > > Recently Oracle has changed the download link for Oracle JDK7, and that's why > oracle-java7-installer fails. Precommit jobs for branch-2* are failing > because of this failure. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14445) Delegation tokens are not shared between KMS instances
[ https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035008#comment-16035008 ] Yongjun Zhang commented on HADOOP-14445: Thanks [~daryn]. {quote} No more config-based solutions {quote} It seems unavoidable if we want to implement ksm-nameservice. For this jira, maybe we just go with what [~shahrs87] has, use key with concatenated host, if can't find match, fallback to original key format. If user add new KMS and replace KMS, the clients need to be restarted with the new config. What do you think [~shahrs87], [~asuresh] and [~daryn]? Thanks. > Delegation tokens are not shared between KMS instances > -- > > Key: HADOOP-14445 > URL: https://issues.apache.org/jira/browse/HADOOP-14445 > Project: Hadoop Common > Issue Type: Bug > Components: documentation, kms >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Wei-Chiu Chuang >Assignee: Rushabh S Shah > Attachments: HADOOP-14445-branch-2.8.patch > > > As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider do > not share delegation tokens. (a client uses KMS address/port as the key for > delegation token) > {code:title=DelegationTokenAuthenticatedURL#openConnection} > if (!creds.getAllTokens().isEmpty()) { > InetSocketAddress serviceAddr = new InetSocketAddress(url.getHost(), > url.getPort()); > Text service = SecurityUtil.buildTokenService(serviceAddr); > dToken = creds.getToken(service); > {code} > But KMS doc states: > {quote} > Delegation Tokens > Similar to HTTP authentication, KMS uses Hadoop Authentication for delegation > tokens too. > Under HA, A KMS instance must verify the delegation token given by another > KMS instance, by checking the shared secret used to sign the delegation > token. To do this, all KMS instances must be able to retrieve the shared > secret from ZooKeeper. > {quote} > We should either update the KMS documentation, or fix this code to share > delegation tokens. 
-- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
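[Editorial note] The fallback strategy proposed above (key the token with the concatenated hosts; if no match, fall back to the original single-host key) can be sketched in plain Java. The map and key formats here are illustrative stand-ins, not Hadoop's actual `Credentials`/`Text` service keys.

```java
import java.util.HashMap;
import java.util.Map;

public class KmsTokenLookupSketch {
    /**
     * Look up a delegation token first under a hypothetical service key
     * naming all KMS hosts, then fall back to the legacy single-host key
     * the client originally connected with.
     */
    static String lookupToken(Map<String, String> creds,
                              String multiHostKey, String singleHostKey) {
        String token = creds.get(multiHostKey);
        if (token == null) {
            // Fallback to the original key format so tokens issued by
            // older clients/configs are still found.
            token = creds.get(singleHostKey);
        }
        return token;
    }

    public static void main(String[] args) {
        Map<String, String> creds = new HashMap<>();
        // Only a legacy single-host entry exists.
        creds.put("kms://host1:9600", "legacy-token");
        System.out.println(lookupToken(creds,
            "kms://host1;host2:9600", "kms://host1:9600")); // legacy-token
    }
}
```

As the comment notes, this is config-based: adding or replacing a KMS instance changes the multi-host key, so clients must be restarted with the new configuration.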
[jira] [Commented] (HADOOP-14474) Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures
[ https://issues.apache.org/jira/browse/HADOOP-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034999#comment-16034999 ] Xiao Chen commented on HADOOP-14474: Thanks [~aw] for kicking off a jenkins job and reviewing. bq. This should probably work. +1 +1... will commit shortly. > Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures > -- > > Key: HADOOP-14474 > URL: https://issues.apache.org/jira/browse/HADOOP-14474 > Project: Hadoop Common > Issue Type: Bug > Components: build >Affects Versions: 2.8.0, 2.7.3 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka > Attachments: HADOOP-14474-branch-2.01.patch > > > Recently Oracle has changed the download link for Oracle JDK7, and that's why > oracle-java7-installer fails. Precommit jobs for branch-2* are failing > because of this failure. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14429) getFsAction method of FTPFileSystem always returned FsAction.NONE
[ https://issues.apache.org/jira/browse/HADOOP-14429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hongyuan Li updated HADOOP-14429: - Summary: getFsAction method of FTPFileSystem always returned FsAction.NONE (was: getFsAction method of FTPFileSystem always return FsAction.NONE) > getFsAction method of FTPFileSystem always returned FsAction.NONE > -- > > Key: HADOOP-14429 > URL: https://issues.apache.org/jira/browse/HADOOP-14429 > Project: Hadoop Common > Issue Type: Bug > Components: fs > Affects Versions: 3.0.0-alpha2 > Reporter: Hongyuan Li > Assignee: Hongyuan Li > Priority: Trivial > Attachments: HADOOP-14429-001.patch, HADOOP-14429-002.patch, HADOOP-14429-003.patch > > {code} > private FsAction getFsAction(int accessGroup, FTPFile ftpFile) { > FsAction action = FsAction.NONE; > if (ftpFile.hasPermission(accessGroup, FTPFile.READ_PERMISSION)) { > action.or(FsAction.READ); > } > if (ftpFile.hasPermission(accessGroup, FTPFile.WRITE_PERMISSION)) { > action.or(FsAction.WRITE); > } > if (ftpFile.hasPermission(accessGroup, FTPFile.EXECUTE_PERMISSION)) { > action.or(FsAction.EXECUTE); > } > return action; > } > {code} > From the code above, we can see that the getFsAction method does not modify the action initialized by FsAction action = FsAction.NONE, which means it returns FsAction.NONE all the time.
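[Editorial note] The root cause is that `or()` returns a new value rather than mutating the receiver, so its result must be reassigned. A self-contained sketch using a minimal stand-in enum (not Hadoop's actual `FsAction`, though it follows the same bitmask pattern):

```java
public class FsActionOrSketch {
    // Minimal stand-in for Hadoop's FsAction: an immutable permission set.
    enum Action {
        NONE(0), EXECUTE(1), WRITE(2), WRITE_EXECUTE(3),
        READ(4), READ_EXECUTE(5), READ_WRITE(6), ALL(7);

        final int bits;
        Action(int bits) { this.bits = bits; }

        static final Action[] BY_BITS = {
            NONE, EXECUTE, WRITE, WRITE_EXECUTE,
            READ, READ_EXECUTE, READ_WRITE, ALL};

        // Like FsAction.or(): returns a NEW value; does not mutate `this`.
        Action or(Action that) { return BY_BITS[bits | that.bits]; }
    }

    static Action buggy(boolean canRead) {
        Action action = Action.NONE;
        if (canRead) {
            action.or(Action.READ);      // BUG: return value discarded
        }
        return action;                   // always NONE
    }

    static Action fixed(boolean canRead) {
        Action action = Action.NONE;
        if (canRead) {
            action = action.or(Action.READ);  // FIX: reassign the result
        }
        return action;
    }

    public static void main(String[] args) {
        System.out.println(buggy(true));  // NONE
        System.out.println(fixed(true));  // READ
    }
}
```

The same one-line reassignment applied to each of the three `if` branches in `getFsAction` is what the attached patches need to do.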
[jira] [Comment Edited] (HADOOP-14429) getFsAction method of FTPFileSystem always return FsAction.NONE
[ https://issues.apache.org/jira/browse/HADOOP-14429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16032531#comment-16032531 ] Hongyuan Li edited comment on HADOOP-14429 at 6/2/17 4:16 PM: -- [~yzhangal] patch-002 does what you suggested in your comment. [~ste...@apache.org] Would you mind giving me a code review? was (Author: hongyuan li): [~yzhangal] patch-002 does as what you comment. > getFsAction method of FTPFileSystem always return FsAction.NONE > > > Key: HADOOP-14429 > URL: https://issues.apache.org/jira/browse/HADOOP-14429 > Project: Hadoop Common > Issue Type: Bug > Components: fs > Affects Versions: 3.0.0-alpha2 > Reporter: Hongyuan Li > Assignee: Hongyuan Li > Priority: Trivial > Attachments: HADOOP-14429-001.patch, HADOOP-14429-002.patch, HADOOP-14429-003.patch > > {code} > private FsAction getFsAction(int accessGroup, FTPFile ftpFile) { > FsAction action = FsAction.NONE; > if (ftpFile.hasPermission(accessGroup, FTPFile.READ_PERMISSION)) { > action.or(FsAction.READ); > } > if (ftpFile.hasPermission(accessGroup, FTPFile.WRITE_PERMISSION)) { > action.or(FsAction.WRITE); > } > if (ftpFile.hasPermission(accessGroup, FTPFile.EXECUTE_PERMISSION)) { > action.or(FsAction.EXECUTE); > } > return action; > } > {code} > From the code above, we can see that the getFsAction method does not modify the action initialized by FsAction action = FsAction.NONE, which means it returns FsAction.NONE all the time.
[jira] [Updated] (HADOOP-14469) the listStatus method of FTPFileSystem should filter the path "." and ".."
[ https://issues.apache.org/jira/browse/HADOOP-14469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hongyuan Li updated HADOOP-14469: - Summary: the listStatus method of FTPFileSystem should filter the path "." and ".." (was: the listStatus method of FTPFileSystem should ignore the path "." and "..") > the listStatus method of FTPFileSystem should filter the path "." and ".." > --- > > Key: HADOOP-14469 > URL: https://issues.apache.org/jira/browse/HADOOP-14469 > Project: Hadoop Common > Issue Type: Bug > Components: fs > Reporter: Hongyuan Li > Assignee: Hongyuan Li > Attachments: HADOOP-14469-001.patch, HADOOP-14469-002.patch, HADOOP-14469-003.patch > > For some FTP servers (Serv-U, for example), the listStatus method will return new Path(".") and new Path(".."), causing the list operation to loop. > We can see the logic in the code below: > {code} > private FileStatus[] listStatus(FTPClient client, Path file) > throws IOException { > …… > FileStatus[] fileStats = new FileStatus[ftpFiles.length]; > for (int i = 0; i < ftpFiles.length; i++) { > fileStats[i] = getFileStatus(ftpFiles[i], absolute); > } > return fileStats; > } > {code}
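[Editorial note] The fix amounts to skipping the "." and ".." entries before building the `FileStatus` array. A self-contained sketch with plain strings standing in for the `FTPFile` entries (the method name is illustrative, not Hadoop's):

```java
import java.util.ArrayList;
import java.util.List;

public class FtpListFilterSketch {
    // Drop the "." and ".." entries that some FTP servers (e.g. Serv-U)
    // include in directory listings, which would otherwise make a
    // recursive listing loop forever.
    static List<String> filterDotEntries(String[] names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (!".".equals(name) && !"..".equals(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] listing = {".", "..", "file1", "dir1"};
        System.out.println(filterDotEntries(listing)); // [file1, dir1]
    }
}
```

In `FTPFileSystem#listStatus` the same check would be applied to `ftpFile.getName()` inside the loop that fills `fileStats`.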
[jira] [Updated] (HADOOP-14470) the ternary operator in create method in class CommandWithDestination is redundant
[ https://issues.apache.org/jira/browse/HADOOP-14470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hongyuan Li updated HADOOP-14470: - Description: in if statementļ¼the lazyPersist is always true, thus the ternary operator is redundantļ¼ {{lazyPersist == true}} in if statment, so {{lazyPersist ? 1 : getDefaultReplication(item.path)}} is redundant. related code like below, which is in {{org.apache.hadoop.fs.shell.CommandWithDestination}} lineNumber : 504 : {code:java} FSDataOutputStream create(PathData item, boolean lazyPersist, boolean direct) throws IOException { try { if (lazyPersist) { // in if stament, lazyPersist is always true ā¦ā¦ return create(item.path, FsPermission.getFileDefault().applyUMask( FsPermission.getUMask(getConf())), createFlags, getConf().getInt(IO_FILE_BUFFER_SIZE_KEY, IO_FILE_BUFFER_SIZE_DEFAULT), lazyPersist ? 1 : getDefaultReplication(item.path), // *this is redundant* getDefaultBlockSize(), null, null); } else { return create(item.path, true); } } finally { // might have been created but stream was interrupted if (!direct) { deleteOnExit(item.path); } } } {code} was: in if statementļ¼the lazyPersist is always true, thus the ternary operator is redundantļ¼ {{lazyPersist == true}} in if statment, so {{ lazyPersist ? 1 : getDefaultReplication(item.path) }} is redundant. related code like below, which is in {{org.apache.hadoop.fs.shell.CommandWithDestination }} , lineNumber : 504 : {code:java} FSDataOutputStream create(PathData item, boolean lazyPersist, boolean direct) throws IOException { try { if (lazyPersist) { // in if stament, lazyPersist is always true ā¦ā¦ return create(item.path, FsPermission.getFileDefault().applyUMask( FsPermission.getUMask(getConf())), createFlags, getConf().getInt(IO_FILE_BUFFER_SIZE_KEY, IO_FILE_BUFFER_SIZE_DEFAULT), lazyPersist ? 
1 : getDefaultReplication(item.path), // *this is redundant* getDefaultBlockSize(), null, null); } else { return create(item.path, true); } } finally { // might have been created but stream was interrupted if (!direct) { deleteOnExit(item.path); } } } {code} > the ternary operator in create method in class CommandWithDestination is > redundant > --- > > Key: HADOOP-14470 > URL: https://issues.apache.org/jira/browse/HADOOP-14470 > Project: Hadoop Common > Issue Type: Improvement > Components: common >Reporter: Hongyuan Li >Assignee: Hongyuan Li >Priority: Trivial > Attachments: HADOOP-14470-001.patch > > > in if statementļ¼the lazyPersist is always true, thus the ternary operator is > redundantļ¼ > {{lazyPersist == true}} in if statment, so {{lazyPersist ? 1 : > getDefaultReplication(item.path)}} is redundant. > related code like below, which is in > {{org.apache.hadoop.fs.shell.CommandWithDestination}} lineNumber : 504 : > {code:java} >FSDataOutputStream create(PathData item, boolean lazyPersist, > boolean direct) > throws IOException { > try { > if (lazyPersist) { // in if stament, lazyPersist is always true > ā¦ā¦ > return create(item.path, > FsPermission.getFileDefault().applyUMask( > FsPermission.getUMask(getConf())), > createFlags, > getConf().getInt(IO_FILE_BUFFER_SIZE_KEY, > IO_FILE_BUFFER_SIZE_DEFAULT), > lazyPersist ? 1 : getDefaultReplication(item.path), > // *this is redundant* > getDefaultBlockSize(), > null, > null); > } else { > return create(item.path, true); > } > } finally { // might have been created but stream was interrupted > if (!direct) { > deleteOnExit(item.path); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail:
[jira] [Updated] (HADOOP-14470) the ternary operator in create method in class CommandWithDestination is redundant
[ https://issues.apache.org/jira/browse/HADOOP-14470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hongyuan Li updated HADOOP-14470: - Description: in if statementļ¼the lazyPersist is always true, thus the ternary operator is redundantļ¼ {{lazyPersist == true}} in if statment, so {{ lazyPersist ? 1 : getDefaultReplication(item.path) }} is redundant. related code like below, which is in {{org.apache.hadoop.fs.shell.CommandWithDestination }} , lineNumber : 504 : {code:java} FSDataOutputStream create(PathData item, boolean lazyPersist, boolean direct) throws IOException { try { if (lazyPersist) { // in if stament, lazyPersist is always true ā¦ā¦ return create(item.path, FsPermission.getFileDefault().applyUMask( FsPermission.getUMask(getConf())), createFlags, getConf().getInt(IO_FILE_BUFFER_SIZE_KEY, IO_FILE_BUFFER_SIZE_DEFAULT), lazyPersist ? 1 : getDefaultReplication(item.path), // *this is redundant* getDefaultBlockSize(), null, null); } else { return create(item.path, true); } } finally { // might have been created but stream was interrupted if (!direct) { deleteOnExit(item.path); } } } {code} was: in if statementļ¼the lazyPersist is always true, thus the ternary operator is redundantļ¼ {{lazyPersist == true}} in if statment, so {{ lazyPersist ? 1 : getDefaultReplication(item.path) }} is redundant. related code like below, which is in {{ org.apache.hadoop.fs.shell.CommandWithDestination }} , lineNumber : 504 : {code:java} FSDataOutputStream create(PathData item, boolean lazyPersist, boolean direct) throws IOException { try { if (lazyPersist) { // in if stament, lazyPersist is always true ā¦ā¦ return create(item.path, FsPermission.getFileDefault().applyUMask( FsPermission.getUMask(getConf())), createFlags, getConf().getInt(IO_FILE_BUFFER_SIZE_KEY, IO_FILE_BUFFER_SIZE_DEFAULT), lazyPersist ? 
1 : getDefaultReplication(item.path), // *this is redundant* getDefaultBlockSize(), null, null); } else { return create(item.path, true); } } finally { // might have been created but stream was interrupted if (!direct) { deleteOnExit(item.path); } } } {code} > the ternary operator in create method in class CommandWithDestination is > redundant > --- > > Key: HADOOP-14470 > URL: https://issues.apache.org/jira/browse/HADOOP-14470 > Project: Hadoop Common > Issue Type: Improvement > Components: common >Reporter: Hongyuan Li >Assignee: Hongyuan Li >Priority: Trivial > Attachments: HADOOP-14470-001.patch > > > in if statementļ¼the lazyPersist is always true, thus the ternary operator is > redundantļ¼ > {{lazyPersist == true}} in if statment, so {{ lazyPersist ? 1 : > getDefaultReplication(item.path) }} is redundant. > related code like below, which is in > {{org.apache.hadoop.fs.shell.CommandWithDestination }} > , lineNumber : 504 : > {code:java} >FSDataOutputStream create(PathData item, boolean lazyPersist, > boolean direct) > throws IOException { > try { > if (lazyPersist) { // in if stament, lazyPersist is always true > ā¦ā¦ > return create(item.path, > FsPermission.getFileDefault().applyUMask( > FsPermission.getUMask(getConf())), > createFlags, > getConf().getInt(IO_FILE_BUFFER_SIZE_KEY, > IO_FILE_BUFFER_SIZE_DEFAULT), > lazyPersist ? 1 : getDefaultReplication(item.path), > // *this is redundant* > getDefaultBlockSize(), > null, > null); > } else { > return create(item.path, true); > } > } finally { // might have been created but stream was interrupted > if (!direct) { > deleteOnExit(item.path); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail:
[jira] [Updated] (HADOOP-14470) the ternary operator in create method in class CommandWithDestination is redundant
[ https://issues.apache.org/jira/browse/HADOOP-14470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hongyuan Li updated HADOOP-14470: - Description: in if statementļ¼the lazyPersist is always true, thus the ternary operator is redundantļ¼ {{lazyPersist == true}} in if statment, so {{ lazyPersist ? 1 : getDefaultReplication(item.path) }} is redundant. related code like below, which is in {{ org.apache.hadoop.fs.shell.CommandWithDestination }} , lineNumber : 504 : {code:java} FSDataOutputStream create(PathData item, boolean lazyPersist, boolean direct) throws IOException { try { if (lazyPersist) { // in if stament, lazyPersist is always true ā¦ā¦ return create(item.path, FsPermission.getFileDefault().applyUMask( FsPermission.getUMask(getConf())), createFlags, getConf().getInt(IO_FILE_BUFFER_SIZE_KEY, IO_FILE_BUFFER_SIZE_DEFAULT), lazyPersist ? 1 : getDefaultReplication(item.path), // *this is redundant* getDefaultBlockSize(), null, null); } else { return create(item.path, true); } } finally { // might have been created but stream was interrupted if (!direct) { deleteOnExit(item.path); } } } {code} was: in if statementļ¼the lazyPersist is always true, thus the ternary operator is redundantļ¼ {{lazyPersist == true}} in if statment, so {{ lazyPersist ? 1 : getDefaultReplication(item.path), }} is redundant. related code like below, which is in {{ org.apache.hadoop.fs.shell.CommandWithDestination}}, lineNumber : 504 : {code:java} FSDataOutputStream create(PathData item, boolean lazyPersist, boolean direct) throws IOException { try { if (lazyPersist) { // in if stament, lazyPersist is always true ā¦ā¦ return create(item.path, FsPermission.getFileDefault().applyUMask( FsPermission.getUMask(getConf())), createFlags, getConf().getInt(IO_FILE_BUFFER_SIZE_KEY, IO_FILE_BUFFER_SIZE_DEFAULT), lazyPersist ? 
1 : getDefaultReplication(item.path), // *this is redundant* getDefaultBlockSize(), null, null); } else { return create(item.path, true); } } finally { // might have been created but stream was interrupted if (!direct) { deleteOnExit(item.path); } } } {code} > the ternary operator in create method in class CommandWithDestination is > redundant > --- > > Key: HADOOP-14470 > URL: https://issues.apache.org/jira/browse/HADOOP-14470 > Project: Hadoop Common > Issue Type: Improvement > Components: common >Reporter: Hongyuan Li >Assignee: Hongyuan Li >Priority: Trivial > Attachments: HADOOP-14470-001.patch > > > in if statementļ¼the lazyPersist is always true, thus the ternary operator is > redundantļ¼ > {{lazyPersist == true}} in if statment, so {{ lazyPersist ? 1 : > getDefaultReplication(item.path) }} is redundant. > related code like below, which is in {{ > org.apache.hadoop.fs.shell.CommandWithDestination }} , lineNumber : 504 : > {code:java} >FSDataOutputStream create(PathData item, boolean lazyPersist, > boolean direct) > throws IOException { > try { > if (lazyPersist) { // in if stament, lazyPersist is always true > ā¦ā¦ > return create(item.path, > FsPermission.getFileDefault().applyUMask( > FsPermission.getUMask(getConf())), > createFlags, > getConf().getInt(IO_FILE_BUFFER_SIZE_KEY, > IO_FILE_BUFFER_SIZE_DEFAULT), > lazyPersist ? 1 : getDefaultReplication(item.path), > // *this is redundant* > getDefaultBlockSize(), > null, > null); > } else { > return create(item.path, true); > } > } finally { // might have been created but stream was interrupted > if (!direct) { > deleteOnExit(item.path); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail:
[jira] [Updated] (HADOOP-14470) the ternary operator in create method in class CommandWithDestination is redundant
[ https://issues.apache.org/jira/browse/HADOOP-14470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hongyuan Li updated HADOOP-14470: - Description: in the if statement, lazyPersist is always true, thus the ternary operator is redundant: {{lazyPersist == true}} in the if statement, so {{ lazyPersist ? 1 : getDefaultReplication(item.path) }} is redundant. Related code is below, in {{ org.apache.hadoop.fs.shell.CommandWithDestination }}, lineNumber : 504 : {code:java} FSDataOutputStream create(PathData item, boolean lazyPersist, boolean direct) throws IOException { try { if (lazyPersist) { // in this if statement, lazyPersist is always true …… return create(item.path, FsPermission.getFileDefault().applyUMask( FsPermission.getUMask(getConf())), createFlags, getConf().getInt(IO_FILE_BUFFER_SIZE_KEY, IO_FILE_BUFFER_SIZE_DEFAULT), lazyPersist ? 1 : getDefaultReplication(item.path), // *this is redundant* getDefaultBlockSize(), null, null); } else { return create(item.path, true); } } finally { // might have been created but stream was interrupted if (!direct) { deleteOnExit(item.path); } } } {code} was: in the if statement, lazyPersist is always true, thus the ternary operator is redundant; related code like below: {code:java} FSDataOutputStream create(PathData item, boolean lazyPersist, boolean direct) throws IOException { try { if (lazyPersist) { // in this if statement, lazyPersist is always true …… return create(item.path, FsPermission.getFileDefault().applyUMask( FsPermission.getUMask(getConf())), createFlags, getConf().getInt(IO_FILE_BUFFER_SIZE_KEY, IO_FILE_BUFFER_SIZE_DEFAULT), lazyPersist ? 1 : getDefaultReplication(item.path), // *this is redundant* getDefaultBlockSize(), null, null); } else { return create(item.path, true); } } finally { // might have been created but stream was interrupted if (!direct) { deleteOnExit(item.path); } } } {code} > the ternary operator in create method in class CommandWithDestination is > redundant > --- > > Key: HADOOP-14470 > URL: https://issues.apache.org/jira/browse/HADOOP-14470 > Project: Hadoop Common > Issue Type: Improvement > Components: common > Reporter: Hongyuan Li > Assignee: Hongyuan Li > Priority: Trivial > Attachments: HADOOP-14470-001.patch > > in the if statement, lazyPersist is always true, thus the ternary operator is redundant: {{lazyPersist == true}} in the if statement, so {{ lazyPersist ? 1 : getDefaultReplication(item.path) }} is redundant. Related code is below, in {{ org.apache.hadoop.fs.shell.CommandWithDestination }}, lineNumber : 504 : > {code:java} > FSDataOutputStream create(PathData item, boolean lazyPersist, > boolean direct) > throws IOException { > try { > if (lazyPersist) { // in this if statement, lazyPersist is always true > …… > return create(item.path, > FsPermission.getFileDefault().applyUMask( > FsPermission.getUMask(getConf())), > createFlags, > getConf().getInt(IO_FILE_BUFFER_SIZE_KEY, > IO_FILE_BUFFER_SIZE_DEFAULT), > lazyPersist ? 1 : getDefaultReplication(item.path), > // *this is redundant* > getDefaultBlockSize(), > null, > null); > } else { > return create(item.path, true); > } > } finally { // might have been created but stream was interrupted > if (!direct) { > deleteOnExit(item.path); > } > } > } > {code}
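[Editorial note] The redundancy reported above is easy to demonstrate in isolation. This is a minimal stand-in, not the actual `CommandWithDestination` code: inside a branch guarded by `if (lazyPersist)`, the condition is known to be true, so the ternary can only ever evaluate to 1.

```java
public class LazyPersistTernarySketch {
    // Mirrors the shape of CommandWithDestination#create: inside the
    // `if (lazyPersist)` branch, `lazyPersist ? 1 : default` is always 1.
    static int replication(boolean lazyPersist, int defaultReplication) {
        if (lazyPersist) {
            return lazyPersist ? 1 : defaultReplication; // always 1
        }
        return defaultReplication;
    }

    // Equivalent behavior with the redundant ternary replaced by the
    // constant it always evaluates to.
    static int replicationSimplified(boolean lazyPersist,
                                     int defaultReplication) {
        if (lazyPersist) {
            return 1;
        }
        return defaultReplication;
    }

    public static void main(String[] args) {
        System.out.println(replication(true, 3));           // 1
        System.out.println(replicationSimplified(true, 3)); // 1
    }
}
```

So the patch can safely replace the ternary with the literal `1` without changing behavior.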
[jira] [Commented] (HADOOP-14474) Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures
[ https://issues.apache.org/jira/browse/HADOOP-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034925#comment-16034925 ] Allen Wittenauer commented on HADOOP-14474: --- This should probably work. +1 > Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures > -- > > Key: HADOOP-14474 > URL: https://issues.apache.org/jira/browse/HADOOP-14474 > Project: Hadoop Common > Issue Type: Bug > Components: build >Affects Versions: 2.8.0, 2.7.3 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka > Attachments: HADOOP-14474-branch-2.01.patch > > > Recently Oracle has changed the download link for Oracle JDK7, and that's why > oracle-java7-installer fails. Precommit jobs for branch-2* are failing > because of this failure. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14474) Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures
[ https://issues.apache.org/jira/browse/HADOOP-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034842#comment-16034842 ] Hadoop QA commented on HADOOP-14474: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m 1s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 11s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} shellcheck {color} | {color:green} 0m 0s{color} | {color:green} There were no new shellcheck issues. {color} | | {color:green}+1{color} | {color:green} shelldocs {color} | {color:green} 0m 8s{color} | {color:green} There were no new shelldocs issues. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 15m 22s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:8515d35 | | JIRA Issue | HADOOP-14474 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12870747/HADOOP-14474-branch-2.01.patch | | Optional Tests | asflicense shellcheck shelldocs | | uname | Linux a9eee9084235 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | branch-2 / 8e119f1 | | shellcheck | v0.4.6 | | modules | C: U: | | Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/12434/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures > -- > > Key: HADOOP-14474 > URL: https://issues.apache.org/jira/browse/HADOOP-14474 > Project: Hadoop Common > Issue Type: Bug > Components: build >Affects Versions: 2.8.0, 2.7.3 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka > Attachments: HADOOP-14474-branch-2.01.patch > > > Recently Oracle has changed the download link for Oracle JDK7, and that's why > oracle-java7-installer fails. Precommit jobs for branch-2* are failing > because of this failure. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14436) Remove the redundant colon in ViewFs.md
[ https://issues.apache.org/jira/browse/HADOOP-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034832#comment-16034832 ] Hudson commented on HADOOP-14436: - SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11818 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11818/]) HADOOP-14436. Remove the redundant colon in ViewFs.md. Contributed by (brahma: rev 056cc72885471d6952ff182670e4b4a38421603d) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ViewFs.md > Remove the redundant colon in ViewFs.md > --- > > Key: HADOOP-14436 > URL: https://issues.apache.org/jira/browse/HADOOP-14436 > Project: Hadoop Common > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.1, 3.0.0-alpha2 >Reporter: maobaolong >Assignee: maobaolong > Fix For: 2.9.0, 3.0.0-alpha4 > > Attachments: HADOOP-14436.patch > > > A minor mistake can lead the beginner the wrong way and far away from > us. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-12360) Create StatsD metrics2 sink
[ https://issues.apache.org/jira/browse/HADOOP-12360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034826#comment-16034826 ] Michael Moss commented on HADOOP-12360: --- Hi, I'm curious what this section of code is trying to achieve: https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/sink/StatsDSink.java#L103 It seems that in some cases, for some metrics (JVM metrics for example), the sn (serviceName) variable is overridden, which breaks the configured prefix: https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/sink/StatsDSink.java#L91 Wondering if this was intended? > Create StatsD metrics2 sink > --- > > Key: HADOOP-12360 > URL: https://issues.apache.org/jira/browse/HADOOP-12360 > Project: Hadoop Common > Issue Type: New Feature > Components: metrics >Affects Versions: 2.7.1 >Reporter: Dave Marion >Assignee: Dave Marion >Priority: Minor > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: HADOOP-12360.001.patch, HADOOP-12360.002.patch, > HADOOP-12360.003.patch, HADOOP-12360.004.patch, HADOOP-12360.005.patch, > HADOOP-12360.006.patch, HADOOP-12360.007.patch, HADOOP-12360.008.patch, > HADOOP-12360.009.patch, HADOOP-12360.010.patch > > > Create a metrics sink that pushes to a StatsD daemon. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14474) Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures
[ https://issues.apache.org/jira/browse/HADOOP-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034819#comment-16034819 ] Hadoop QA commented on HADOOP-14474: (!) A patch to the testing environment has been detected. Re-executing against the patched versions to perform further tests. The console is at https://builds.apache.org/job/PreCommit-HADOOP-Build/12434/console in case of problems. > Use OpenJDK 7 instead of Oracle JDK 7 to avoid oracle-java7-installer failures > -- > > Key: HADOOP-14474 > URL: https://issues.apache.org/jira/browse/HADOOP-14474 > Project: Hadoop Common > Issue Type: Bug > Components: build >Affects Versions: 2.8.0, 2.7.3 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka > Attachments: HADOOP-14474-branch-2.01.patch > > > Recently Oracle has changed the download link for Oracle JDK7, and that's why > oracle-java7-installer fails. Precommit jobs for branch-2* are failing > because of this failure. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14436) Remove the redundant colon in ViewFs.md
[ https://issues.apache.org/jira/browse/HADOOP-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated HADOOP-14436: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0-alpha4 2.9.0 Status: Resolved (was: Patch Available) Committed to {{trunk}} and {{branch-2}}. [~maobaolong] thanks for your contribution. > Remove the redundant colon in ViewFs.md > --- > > Key: HADOOP-14436 > URL: https://issues.apache.org/jira/browse/HADOOP-14436 > Project: Hadoop Common > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.1, 3.0.0-alpha2 >Reporter: maobaolong >Assignee: maobaolong > Fix For: 2.9.0, 3.0.0-alpha4 > > Attachments: HADOOP-14436.patch > > > A minor mistake can lead the beginner the wrong way and far away from > us. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-14479) Erasurecode testcase failures with ISA-L
Ayappan created HADOOP-14479: Summary: Erasurecode testcase failures with ISA-L Key: HADOOP-14479 URL: https://issues.apache.org/jira/browse/HADOOP-14479 Project: Hadoop Common Issue Type: Bug Components: common Affects Versions: 3.0.0-alpha3 Environment: x86_64 Ubuntu 16.04.02 LTS Reporter: Ayappan I built Hadoop with ISA-L support. I took the ISA-L code from https://github.com/01org/isa-l (tag v2.18.0) and built it. While running the UTs, the following three testcases fail: 1)TestHHXORErasureCoder Tests run: 7, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 1.106 sec <<< FAILURE! - in org.apache.hadoop.io.erasurecode.coder.TestHHXORErasureCoder testCodingDirectBuffer_10x4_erasing_p1(org.apache.hadoop.io.erasurecode.coder.TestHHXORErasureCoder) Time elapsed: 0.029 sec <<< FAILURE! java.lang.AssertionError: Decoding and comparing failed. at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.assertTrue(Assert.java:41) at org.apache.hadoop.io.erasurecode.TestCoderBase.compareAndVerify(TestCoderBase.java:170) at org.apache.hadoop.io.erasurecode.coder.TestErasureCoderBase.compareAndVerify(TestErasureCoderBase.java:141) at org.apache.hadoop.io.erasurecode.coder.TestErasureCoderBase.performTestCoding(TestErasureCoderBase.java:98) at org.apache.hadoop.io.erasurecode.coder.TestErasureCoderBase.testCoding(TestErasureCoderBase.java:69) at org.apache.hadoop.io.erasurecode.coder.TestHHXORErasureCoder.testCodingDirectBuffer_10x4_erasing_p1(TestHHXORErasureCoder.java:64) 2)TestRSErasureCoder Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.591 sec - in org.apache.hadoop.io.erasurecode.coder.TestXORCoder Running org.apache.hadoop.io.erasurecode.coder.TestRSErasureCoder # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x7f486a28a6e4, pid=8970, tid=0x7f4850927700 # # JRE version: OpenJDK Runtime Environment (8.0_121-b13) (build 1.8.0_121-8u121-b13-0ubuntu1.16.04.2-b13) # Java VM: OpenJDK 64-Bit Server VM
(25.121-b13 mixed mode linux-amd64 compressed oops) # Problematic frame: # C [libc.so.6+0x8e6e4] # # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again # # An error report file with more information is saved as: # /home/ayappan/hadoop/hadoop-common-project/hadoop-common/hs_err_pid8970.log # # If you would like to submit a bug report, please visit: # http://bugreport.java.com/bugreport/crash.jsp # The crash happened outside the Java Virtual Machine in native code. # See problematic frame for where to report the bug. # 3)TestCodecRawCoderMapping Running org.apache.hadoop.io.erasurecode.TestCodecRawCoderMapping Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.559 sec <<< FAILURE! - in org.apache.hadoop.io.erasurecode.TestCodecRawCoderMapping testRSDefaultRawCoder(org.apache.hadoop.io.erasurecode.TestCodecRawCoderMapping) Time elapsed: 0.015 sec <<< FAILURE! java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.hadoop.io.erasurecode.TestCodecRawCoderMapping.testRSDefaultRawCoder(TestCodecRawCoderMapping.java:58) -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14436) Remove the redundant colon in ViewFs.md
[ https://issues.apache.org/jira/browse/HADOOP-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated HADOOP-14436: -- Summary: Remove the redundant colon in ViewFs.md (was: The ViewFs.md's minor error about a redundant colon) > Remove the redundant colon in ViewFs.md > --- > > Key: HADOOP-14436 > URL: https://issues.apache.org/jira/browse/HADOOP-14436 > Project: Hadoop Common > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.1, 3.0.0-alpha2 >Reporter: maobaolong >Assignee: maobaolong > Attachments: HADOOP-14436.patch > > > A minor mistake can lead the beginner the wrong way and far away from > us. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14473) Optimize NativeAzureFileSystem::seek for forward seeks
[ https://issues.apache.org/jira/browse/HADOOP-14473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034683#comment-16034683 ] Rajesh Balamohan commented on HADOOP-14473: --- Since it was easier to combine this patch with HADOOP-14478, I have merged it and posted the revised patch there. In the revised patch, I have fixed an issue in seek() and shared the test results there as well. Tests were run against the "Japan West region" endpoint. {{BlobInputStream::skip()}} is more or less a no-op call. The issue was related to closing the stream and opening it again via {{store.retrieve()}}, as that would end up creating a new {{BlobInputStream}}, which internally needs an additional HTTP call to download the blob attributes. This has been avoided in the patch. I completely agree that it would be good to get instrumentation similar to s3a's, which has been very useful. Please let me know if this could be done in incremental tickets. > Optimize NativeAzureFileSystem::seek for forward seeks > -- > > Key: HADOOP-14473 > URL: https://issues.apache.org/jira/browse/HADOOP-14473 > Project: Hadoop Common > Issue Type: Bug > Components: fs/azure >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: HADOOP-14473-001.patch > > > {{NativeAzureFileSystem::seek()}} closes and re-opens the inputstream > irrespective of forward/backward seek. It would be beneficial to re-open the > stream only on backward seek. > https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java#L889 -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
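The close/reopen cost the comment above describes can be sketched with stdlib streams (a {{ByteArrayInputStream}} stands in for the remote {{BlobInputStream}}, and all names here are hypothetical illustrations, not the actual patch): a forward seek can be served by {{skip()}} on the existing stream, while closing and re-opening simulates the extra HTTP round trip of {{store.retrieve()}}.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch of the idea in the patch (hypothetical names): forward seeks are
// served by skip() on the open stream; only backward seeks pay the cost of
// a reopen, which in the real code means an extra HTTP request.
public class ForwardSeekSketch {
    static long reopens = 0; // counts simulated store.retrieve() calls

    static InputStream openAt(byte[] blob, long pos) throws IOException {
        reopens++; // each reopen would be an HTTP request against the store
        InputStream in = new ByteArrayInputStream(blob);
        in.skip(pos);
        return in;
    }

    static InputStream seek(InputStream in, byte[] blob, long current, long target)
            throws IOException {
        if (target >= current) {          // forward seek: cheap skip, no reopen
            long toSkip = target - current;
            while (toSkip > 0) {
                toSkip -= in.skip(toSkip);
            }
            return in;
        }
        in.close();                       // backward seek: reopen is unavoidable
        return openAt(blob, target);
    }

    public static void main(String[] args) throws IOException {
        byte[] blob = new byte[64];
        for (int i = 0; i < blob.length; i++) blob[i] = (byte) i;
        InputStream in = openAt(blob, 0);
        in = seek(in, blob, 0, 40);       // forward: no reopen
        int b = in.read();                // reads byte 40
        in = seek(in, blob, 41, 8);       // backward: forces a reopen
        System.out.println("byte@40=" + b + " reopens=" + reopens);
    }
}
```

In the real {{NativeAzureFsInputStream}} the trade-off is less clear-cut than in this sketch, since a very long forward skip may still read unwanted data; that is what Steve Loughran's instrumentation suggestion below this patch discussion would help measure.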
[jira] [Assigned] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads
[ https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan reassigned HADOOP-14478: - Assignee: Rajesh Balamohan > Optimize NativeAzureFsInputStream for positional reads > -- > > Key: HADOOP-14478 > URL: https://issues.apache.org/jira/browse/HADOOP-14478 > Project: Hadoop Common > Issue Type: Bug > Components: fs/azure >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: HADOOP-14478.001.patch, HADOOP-14478.002.patch > > > Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of > the data length requested. This would be beneficial for sequential reads. > However, for positional reads (seek to specific location, read x number of > bytes, seek back to original location) this may not be beneficial and might > even download a lot more data that is not used later. > It would be good to override {{readFully(long position, byte[] buffer, int > offset, int length)}} for {{NativeAzureFsInputStream}} and make use of > {{mark(readLimit)}} as a hint to Azure's BlobInputStream. > BlobInputStream reference: > https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448 > BlobInputStream can consider this as a hint later to determine the amount of > data to be read ahead. Changes to BlobInputStream would not be addressed in > this JIRA. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads
[ https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HADOOP-14478: -- Attachment: HADOOP-14478.002.patch Attaching .2 version with fixes in seek(). Also attaching test results from the hadoop-azure module. My Azure machine and endpoints are hosted in the "Japan West region" {noformat} hdiuser@hn0:~/hadoop/hadoop-tools/hadoop-azure$ mvn test ... .. Tests run: 16, Failures: 0, Errors: 0, Skipped: 16, Time elapsed: 0.421 sec - in org.apache.hadoop.fs.azure.TestFileSystemOperationExceptionHandling Running org.apache.hadoop.fs.azure.TestAzureConcurrentOutOfBandIo Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.361 sec - in org.apache.hadoop.fs.azure.TestAzureConcurrentOutOfBandIo Running org.apache.hadoop.fs.azure.TestAzureFileSystemErrorConditions Tests run: 6, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 0.939 sec - in org.apache.hadoop.fs.azure.TestAzureFileSystemErrorConditions Results : Tests run: 703, Failures: 0, Errors: 0, Skipped: 436 [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 01:50 min [INFO] Finished at: 2017-06-02T13:08:42+00:00 [INFO] Final Memory: 29M/1574M [INFO] {noformat} > Optimize NativeAzureFsInputStream for positional reads > -- > > Key: HADOOP-14478 > URL: https://issues.apache.org/jira/browse/HADOOP-14478 > Project: Hadoop Common > Issue Type: Bug > Components: fs/azure >Reporter: Rajesh Balamohan > Attachments: HADOOP-14478.001.patch, HADOOP-14478.002.patch > > > Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of > the data length requested. This would be beneficial for sequential reads. > However, for positional reads (seek to specific location, read x number of > bytes, seek back to original location) this may not be beneficial and might > even download a lot more data that is not used later.
> It would be good to override {{readFully(long position, byte[] buffer, int > offset, int length)}} for {{NativeAzureFsInputStream}} and make use of > {{mark(readLimit)}} as a hint to Azure's BlobInputStream. > BlobInputStream reference: > https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448 > BlobInputStream can consider this as a hint later to determine the amount of > data to be read ahead. Changes to BlobInputStream would not be addressed in > this JIRA. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
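The {{readFully}}/{{mark(readLimit)}} idea described in the issue can be sketched against plain {{java.io}} streams (hypothetical names; a {{ByteArrayInputStream}} stands in for Azure's {{BlobInputStream}}, and only forward positions are handled): mark the current position, skip to the target, read exactly {{length}} bytes, then reset so the caller's stream position is unchanged.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hedged sketch of the positional-read pattern described above, NOT the
// actual NativeAzureFsInputStream code. Assumes position >= currentPos and
// a stream where mark/reset is supported (markSupported() == true).
public class PositionalReadSketch {
    static void readFully(InputStream in, long position, long currentPos,
                          byte[] buffer, int offset, int length) throws IOException {
        // The mark read-limit doubles as the "hint" of how far we will read.
        in.mark((int) (position - currentPos) + length);
        try {
            long toSkip = position - currentPos;       // seek to the target offset
            while (toSkip > 0) {
                long skipped = in.skip(toSkip);
                if (skipped <= 0) throw new EOFException("seek past end of stream");
                toSkip -= skipped;
            }
            int read = 0;                              // read exactly `length` bytes
            while (read < length) {
                int n = in.read(buffer, offset + read, length - read);
                if (n < 0) throw new EOFException("unexpected end of stream");
                read += n;
            }
        } finally {
            in.reset();                                // restore the original position
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] blob = new byte[100];
        for (int i = 0; i < blob.length; i++) blob[i] = (byte) i;
        InputStream in = new ByteArrayInputStream(blob);
        byte[] buf = new byte[10];
        readFully(in, 50, 0, buf, 0, 10);  // positional read at offset 50
        // Stream position is unchanged: the next sequential read returns byte 0.
        System.out.println("buf[0]=" + buf[0] + " next=" + in.read());
    }
}
```

The real optimization would additionally avoid the 4 MB read-ahead for short positional reads; that part depends on BlobInputStream honoring the read-limit as a hint, which the issue explicitly leaves to a later change.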
[jira] [Commented] (HADOOP-14436) The ViewFs.md's minor error about a redundant colon
[ https://issues.apache.org/jira/browse/HADOOP-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034628#comment-16034628 ] Hadoop QA commented on HADOOP-14436: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 16m 6s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | HADOOP-14436 | | GITHUB PR | https://github.com/apache/hadoop/pull/223 | | Optional Tests | asflicense mvnsite | | uname | Linux 8e9a4f056b23 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 8d9084e | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/12433/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > The ViewFs.md's minor error about a redundant colon > --- > > Key: HADOOP-14436 > URL: https://issues.apache.org/jira/browse/HADOOP-14436 > Project: Hadoop Common > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.1, 3.0.0-alpha2 >Reporter: maobaolong >Assignee: maobaolong > Attachments: HADOOP-14436.patch > > > Minor mistake can led the beginner to the wrong way and getting far away from > us. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14436) The ViewFs.md's minor error about a redundant colon
[ https://issues.apache.org/jira/browse/HADOOP-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034613#comment-16034613 ] Brahma Reddy Battula commented on HADOOP-14436: --- [~maobaolong] thanks for uploading the patch. Patch LGTM; pending Jenkins. Go through the following link for more details on contributing: https://cwiki.apache.org/confluence/display/HADOOP/HowToContribute > The ViewFs.md's minor error about a redundant colon > --- > > Key: HADOOP-14436 > URL: https://issues.apache.org/jira/browse/HADOOP-14436 > Project: Hadoop Common > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.1, 3.0.0-alpha2 >Reporter: maobaolong >Assignee: maobaolong > Attachments: HADOOP-14436.patch > > > A minor mistake can lead the beginner the wrong way and far away from > us. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14436) The ViewFs.md's minor error about a redundant colon
[ https://issues.apache.org/jira/browse/HADOOP-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated HADOOP-14436: -- Status: Patch Available (was: Open) > The ViewFs.md's minor error about a redundant colon > --- > > Key: HADOOP-14436 > URL: https://issues.apache.org/jira/browse/HADOOP-14436 > Project: Hadoop Common > Issue Type: Bug > Components: documentation >Affects Versions: 3.0.0-alpha2, 2.7.1 >Reporter: maobaolong >Assignee: maobaolong > Attachments: HADOOP-14436.patch > > > A minor mistake can lead the beginner the wrong way and far away from > us. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14436) The ViewFs.md's minor error about a redundant colon
[ https://issues.apache.org/jira/browse/HADOOP-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034545#comment-16034545 ] ASF GitHub Bot commented on HADOOP-14436: - Github user maobaolong commented on the issue: https://github.com/apache/hadoop/pull/223 @brahmareddybattula Thank you in advance. I have now uploaded the patch file here; please take a look, any review comments will help me. > The ViewFs.md's minor error about a redundant colon > --- > > Key: HADOOP-14436 > URL: https://issues.apache.org/jira/browse/HADOOP-14436 > Project: Hadoop Common > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.1, 3.0.0-alpha2 >Reporter: maobaolong >Assignee: maobaolong > Attachments: HADOOP-14436.patch > > > A minor mistake can lead the beginner the wrong way and far away from > us. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14436) The ViewFs.md's minor error about a redundant colon
[ https://issues.apache.org/jira/browse/HADOOP-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034541#comment-16034541 ] maobaolong commented on HADOOP-14436: - [~brahma] Thank you in advance. I'm glad to join you. I have now uploaded the patch file here; please take a look, any review comments will help me. > The ViewFs.md's minor error about a redundant colon > --- > > Key: HADOOP-14436 > URL: https://issues.apache.org/jira/browse/HADOOP-14436 > Project: Hadoop Common > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.1, 3.0.0-alpha2 >Reporter: maobaolong >Assignee: maobaolong > Attachments: HADOOP-14436.patch > > > A minor mistake can lead the beginner the wrong way and far away from > us. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14436) The ViewFs.md's minor error about a redundant colon
[ https://issues.apache.org/jira/browse/HADOOP-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maobaolong updated HADOOP-14436: Attachment: HADOOP-14436.patch > The ViewFs.md's minor error about a redundant colon > --- > > Key: HADOOP-14436 > URL: https://issues.apache.org/jira/browse/HADOOP-14436 > Project: Hadoop Common > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.1, 3.0.0-alpha2 >Reporter: maobaolong >Assignee: maobaolong > Attachments: HADOOP-14436.patch > > > A minor mistake can lead the beginner the wrong way and far away from > us. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14475) Metrics of S3A don't print out when enable it in Hadoop metrics property file
[ https://issues.apache.org/jira/browse/HADOOP-14475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034518#comment-16034518 ] Steve Loughran commented on HADOOP-14475: - You are probably the first person to play with this. I've been primarily using them for testing, and, in code, using the new stats API, {{getStorageStatistics()}} to pick them and log/store them (in HADOOP-13786, saving into the _SUCCESS file for later retrieval) If you could help work out what I've done wrong here, that'd be great. Probably some registration issue. > Metrics of S3A don't print out when enable it in Hadoop metrics property file > -- > > Key: HADOOP-14475 > URL: https://issues.apache.org/jira/browse/HADOOP-14475 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 2.8.0 > Environment: uname -a > Linux client01 4.4.0-74-generic #95-Ubuntu SMP Wed Apr 12 09:50:34 UTC 2017 > x86_64 x86_64 x86_64 GNU/Linux > cat /etc/issue > Ubuntu 16.04.2 LTS \n \l >Reporter: Yonger > > *.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink > #*.sink.file.class=org.apache.hadoop.metrics2.sink.influxdb.InfluxdbSink > #*.sink.influxdb.url=http:/xx > #*.sink.influxdb.influxdb_port=8086 > #*.sink.influxdb.database=hadoop > #*.sink.influxdb.influxdb_username=hadoop > #*.sink.influxdb.influxdb_password=hadoop > #*.sink.ingluxdb.cluster=c1 > *.period=10 > #namenode.sink.influxdb.class=org.apache.hadoop.metrics2.sink.influxdb.InfluxdbSink > #S3AFileSystem.sink.influxdb.class=org.apache.hadoop.metrics2.sink.influxdb.InfluxdbSink > S3AFileSystem.sink.file.filename=s3afilesystem-metrics.out > I can't find the out put file even i run a MR job which should be used s3. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14472) Azure: TestReadAndSeekPageBlobAfterWrite fails intermittently
[ https://issues.apache.org/jira/browse/HADOOP-14472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034493#comment-16034493 ] Steve Loughran commented on HADOOP-14472: - Seems reasonable. Which endpoint did you test against? > Azure: TestReadAndSeekPageBlobAfterWrite fails intermittently > - > > Key: HADOOP-14472 > URL: https://issues.apache.org/jira/browse/HADOOP-14472 > Project: Hadoop Common > Issue Type: Bug > Components: fs/azure, test >Reporter: Mingliang Liu > Attachments: HADOOP-14472.000.patch > > > Reported by [HADOOP-14461] > {code} > testManySmallWritesWithHFlush(org.apache.hadoop.fs.azure.TestReadAndSeekPageBlobAfterWrite) > Time elapsed: 1.051 sec <<< FAILURE! > java.lang.AssertionError: hflush duration of 13, less than minimum expected > of 20 > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.hadoop.fs.azure.TestReadAndSeekPageBlobAfterWrite.writeAndReadOneFile(TestReadAndSeekPageBlobAfterWrite.java:286) > at > org.apache.hadoop.fs.azure.TestReadAndSeekPageBlobAfterWrite.testManySmallWritesWithHFlush(TestReadAndSeekPageBlobAfterWrite.java:247) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14473) Optimize NativeAzureFileSystem::seek for forward seeks
[ https://issues.apache.org/jira/browse/HADOOP-14473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034491#comment-16034491 ] Steve Loughran commented on HADOOP-14473: --- This is going to read forward no matter how big the file is, even if you are going to the last MB of a 20 GB file. Is this really the most optimal approach? Rajesh, you are pulling over the s3a input stream work again, aren't you? Maybe it's best here to group them into one patch. That s3a work also added stream instrumentation ({{org.apache.hadoop.fs.s3a.S3AInstrumentation.InputStreamStatistics}}), so we could actually measure what is going on, *and use it in tests*. This seek work and its relatives are the opportunity to do the same for Azure, which will benefit production monitoring too. In particular, here I'd like to track the # of bytes skipped in forward seeks, and the # of close/open pairs, so we can detect when there's a lot of skipping going on, plus make better tests. Ideally I'd like something like {{ITestS3AInputStreamPerformance}}, so as to catch any performance regressions in various read sequences (whole file vs skip forwards vs full random) > Optimize NativeAzureFileSystem::seek for forward seeks > -- > > Key: HADOOP-14473 > URL: https://issues.apache.org/jira/browse/HADOOP-14473 > Project: Hadoop Common > Issue Type: Bug > Components: fs/azure >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: HADOOP-14473-001.patch > > > {{NativeAzureFileSystem::seek()}} closes and re-opens the inputstream > irrespective of forward/backward seek. It would be beneficial to re-open the > stream only on backward seek. > https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java#L889 -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
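The kind of stream statistics Steve Loughran asks for could look roughly like the following (a hypothetical sketch loosely modeled on the idea behind s3a's {{InputStreamStatistics}}, not the actual Hadoop class or API): a small holder of counters that a wrapped input stream bumps on each seek, so tests and production monitoring can see how much skipping and reopening is happening.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of per-stream seek statistics. The field and method
// names are illustrative only; the real Hadoop instrumentation classes
// differ. Counters are AtomicLong so a monitoring thread can read them
// safely while the stream is in use.
public class AzureInputStreamStatistics {
    final AtomicLong bytesSkippedOnForwardSeek = new AtomicLong();
    final AtomicLong forwardSeeks = new AtomicLong();
    final AtomicLong backwardSeeks = new AtomicLong();
    final AtomicLong streamReopens = new AtomicLong(); // close/open pairs

    void seekForward(long skipped) {
        forwardSeeks.incrementAndGet();
        bytesSkippedOnForwardSeek.addAndGet(skipped);
    }

    void seekBackward() {
        backwardSeeks.incrementAndGet();
        streamReopens.incrementAndGet(); // a backward seek forces a reopen
    }

    @Override
    public String toString() {
        return String.format(
            "forwardSeeks=%d bytesSkipped=%d backwardSeeks=%d reopens=%d",
            forwardSeeks.get(), bytesSkippedOnForwardSeek.get(),
            backwardSeeks.get(), streamReopens.get());
    }

    public static void main(String[] args) {
        AzureInputStreamStatistics stats = new AzureInputStreamStatistics();
        stats.seekForward(1024);
        stats.seekForward(512);
        stats.seekBackward();
        System.out.println(stats);
    }
}
```

Assertions on counters like these are what would let a performance test detect "a lot of skipping going on" without timing-sensitive checks.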
[jira] [Commented] (HADOOP-14473) Optimize NativeAzureFileSystem::seek for forward seeks
[ https://issues.apache.org/jira/browse/HADOOP-14473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034485#comment-16034485 ] Steve Loughran commented on HADOOP-14473: - Which endpoint did you test against? -1 until that's declared. > Optimize NativeAzureFileSystem::seek for forward seeks > -- > > Key: HADOOP-14473 > URL: https://issues.apache.org/jira/browse/HADOOP-14473 > Project: Hadoop Common > Issue Type: Bug > Components: fs/azure >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: HADOOP-14473-001.patch > > > {{NativeAzureFileSystem::seek()}} closes and re-opens the input stream > irrespective of forward/backward seek. It would be beneficial to only re-open the > stream on a backward seek. > https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java#L889 -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads
[ https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034484#comment-16034484 ] Steve Loughran commented on HADOOP-14478: - Usual rule: which endpoint have you tested this with? > Optimize NativeAzureFsInputStream for positional reads > -- > > Key: HADOOP-14478 > URL: https://issues.apache.org/jira/browse/HADOOP-14478 > Project: Hadoop Common > Issue Type: Bug > Components: fs/azure >Reporter: Rajesh Balamohan > Attachments: HADOOP-14478.001.patch > > > Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of > the data length requested. This would be beneficial for sequential reads. > However, for positional reads (seek to a specific location, read x bytes, > seek back to the original location) this may not be beneficial and might > even download a lot more data which is not used later. > It would be good to override {{readFully(long position, byte[] buffer, int > offset, int length)}} for {{NativeAzureFsInputStream}} and make use of > {{mark(readLimit)}} as a hint to Azure's BlobInputStream. > BlobInputStream reference: > https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448 > BlobInputStream can consider this as a hint later to determine the amount of > data to be read ahead. Changes to BlobInputStream would not be addressed in > this JIRA. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
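The readFully override the description proposes (save the current offset, hint the stream via mark(length) so it need not buffer a full 4 MB, seek, read the range fully, restore the offset) can be sketched as follows. The SeekableStream interface and ByteArraySeekable class are hypothetical stand-ins so the example is self-contained, not the Hadoop or Azure SDK types, and checked IOExceptions are elided for brevity.

```java
public class PositionalRead {

  /** Minimal stand-in for a seekable input stream. */
  interface SeekableStream {
    long getPos();
    void seek(long pos);
    int read(byte[] buf, int off, int len); // returns -1 at EOF
    void mark(int readLimit);               // readahead hint, per the JIRA idea
  }

  static void readFully(SeekableStream in, long position, byte[] buffer,
                        int offset, int length) {
    long oldPos = in.getPos();     // remember where the caller was
    try {
      in.mark(length);             // hint: only ~length bytes of readahead needed
      in.seek(position);
      int done = 0;
      while (done < length) {      // read() may return fewer bytes than asked
        int n = in.read(buffer, offset + done, length - done);
        if (n < 0) {
          throw new IllegalStateException("EOF before " + length + " bytes read");
        }
        done += n;
      }
    } finally {
      in.seek(oldPos);             // restore the original stream position
    }
  }

  /** In-memory implementation, just to demonstrate the contract. */
  static class ByteArraySeekable implements SeekableStream {
    private final byte[] data;
    private long pos;
    ByteArraySeekable(byte[] data) { this.data = data; }
    public long getPos() { return pos; }
    public void seek(long p) { pos = p; }
    public int read(byte[] buf, int off, int len) {
      if (pos >= data.length) return -1;
      int n = (int) Math.min(len, data.length - pos);
      System.arraycopy(data, (int) pos, buf, off, n);
      pos += n;
      return n;
    }
    public void mark(int readLimit) { /* a real stream would size its buffer */ }
  }
}
```

The seek-back in the finally block is what makes this a positional read: the caller's sequential position is unchanged even if the read fails partway.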
[jira] [Commented] (HADOOP-14477) FileSystem Simplify / Optimize listStatus Method
[ https://issues.apache.org/jira/browse/HADOOP-14477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034480#comment-16034480 ] Steve Loughran commented on HADOOP-14477: - test failure is unrelated; the checkstyle warnings are new. As this goes into FS behaviours, could you create an HDFS JIRA for the same patch and submit it there too (& link it to this)? That will force all the hdfs tests to use this new routine as well. > FileSystem Simplify / Optimize listStatus Method > > > Key: HADOOP-14477 > URL: https://issues.apache.org/jira/browse/HADOOP-14477 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 2.7.3, 3.0.0-alpha3 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HADOOP-14477.1.patch, HADOOP-14477.2.patch > > > {code:title=org.apache.hadoop.fs.FileSystem.listStatus(ArrayList<FileStatus>, > Path, PathFilter)} > /* >* Filter files/directories in the given path using the user-supplied path >* filter. Results are added to the given array results. 
>*/ > private void listStatus(ArrayList<FileStatus> results, Path f, > PathFilter filter) throws FileNotFoundException, IOException { > FileStatus listing[] = listStatus(f); > if (listing == null) { > throw new IOException("Error accessing " + f); > } > for (int i = 0; i < listing.length; i++) { > if (filter.accept(listing[i].getPath())) { > results.add(listing[i]); > } > } > } > {code} > {code:title=org.apache.hadoop.fs.FileSystem.listStatus(Path, PathFilter)} > public FileStatus[] listStatus(Path f, PathFilter filter) >throws FileNotFoundException, IOException { > ArrayList<FileStatus> results = new ArrayList<FileStatus>(); > listStatus(results, f, filter); > return results.toArray(new FileStatus[results.size()]); > } > {code} > We can be smarter about this: > # Use enhanced for-loops > # Optimize for the case where there are zero files in a directory, save on > object instantiation > # More encapsulated design -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
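The three suggestions in that issue (enhanced for-loops, a no-allocation path for empty directories, tighter encapsulation) can be sketched as follows. FileStatus and PathFilter here are simplified stand-ins for the Hadoop types so the example is self-contained; a real patch would keep the FileNotFoundException/IOException signatures.

```java
import java.util.ArrayList;
import java.util.List;

public class ListStatusSketch {
  /** Simplified stand-ins for Hadoop's FileStatus and PathFilter. */
  static class FileStatus {
    final String path;
    FileStatus(String path) { this.path = path; }
  }
  interface PathFilter { boolean accept(String path); }

  private static final FileStatus[] EMPTY = new FileStatus[0];

  static FileStatus[] listStatus(FileStatus[] listing, PathFilter filter) {
    if (listing == null) {
      // the real method throws IOException("Error accessing " + f)
      throw new IllegalStateException("Error accessing path");
    }
    if (listing.length == 0) {
      return EMPTY;                       // zero-files case: no list allocated
    }
    List<FileStatus> results = new ArrayList<>(listing.length);
    for (FileStatus status : listing) {   // enhanced for-loop
      if (filter.accept(status.path)) {
        results.add(status);
      }
    }
    return results.toArray(EMPTY);
  }
}
```

Pre-sizing the ArrayList to listing.length and sharing one immutable EMPTY array are the micro-optimizations the issue alludes to; passing a zero-length array to toArray lets the collection allocate a correctly sized result.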
[jira] [Assigned] (HADOOP-14477) FileSystem Simplify / Optimize listStatus Method
[ https://issues.apache.org/jira/browse/HADOOP-14477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran reassigned HADOOP-14477: --- Assignee: BELUGA BEHR > FileSystem Simplify / Optimize listStatus Method > > > Key: HADOOP-14477 > URL: https://issues.apache.org/jira/browse/HADOOP-14477 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 2.7.3, 3.0.0-alpha3 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HADOOP-14477.1.patch, HADOOP-14477.2.patch > > > {code:title=org.apache.hadoop.fs.FileSystem.listStatus(ArrayList<FileStatus>, > Path, PathFilter)} > /* >* Filter files/directories in the given path using the user-supplied path >* filter. Results are added to the given array results. >*/ > private void listStatus(ArrayList<FileStatus> results, Path f, > PathFilter filter) throws FileNotFoundException, IOException { > FileStatus listing[] = listStatus(f); > if (listing == null) { > throw new IOException("Error accessing " + f); > } > for (int i = 0; i < listing.length; i++) { > if (filter.accept(listing[i].getPath())) { > results.add(listing[i]); > } > } > } > {code} > {code:title=org.apache.hadoop.fs.FileSystem.listStatus(Path, PathFilter)} > public FileStatus[] listStatus(Path f, PathFilter filter) >throws FileNotFoundException, IOException { > ArrayList<FileStatus> results = new ArrayList<FileStatus>(); > listStatus(results, f, filter); > return results.toArray(new FileStatus[results.size()]); > } > {code} > We can be smarter about this: > # Use enhanced for-loops > # Optimize for the case where there are zero files in a directory, save on > object instantiation > # More encapsulated design -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14477) FileSystem Simplify / Optimize listStatus Method
[ https://issues.apache.org/jira/browse/HADOOP-14477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-14477: Component/s: fs > FileSystem Simplify / Optimize listStatus Method > > > Key: HADOOP-14477 > URL: https://issues.apache.org/jira/browse/HADOOP-14477 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 2.7.3, 3.0.0-alpha3 >Reporter: BELUGA BEHR >Priority: Minor > Attachments: HADOOP-14477.1.patch, HADOOP-14477.2.patch > > > {code:title=org.apache.hadoop.fs.FileSystem.listStatus(ArrayList<FileStatus>, > Path, PathFilter)} > /* >* Filter files/directories in the given path using the user-supplied path >* filter. Results are added to the given array results. >*/ > private void listStatus(ArrayList<FileStatus> results, Path f, > PathFilter filter) throws FileNotFoundException, IOException { > FileStatus listing[] = listStatus(f); > if (listing == null) { > throw new IOException("Error accessing " + f); > } > for (int i = 0; i < listing.length; i++) { > if (filter.accept(listing[i].getPath())) { > results.add(listing[i]); > } > } > } > {code} > {code:title=org.apache.hadoop.fs.FileSystem.listStatus(Path, PathFilter)} > public FileStatus[] listStatus(Path f, PathFilter filter) >throws FileNotFoundException, IOException { > ArrayList<FileStatus> results = new ArrayList<FileStatus>(); > listStatus(results, f, filter); > return results.toArray(new FileStatus[results.size()]); > } > {code} > We can be smarter about this: > # Use enhanced for-loops > # Optimize for the case where there are zero files in a directory, save on > object instantiation > # More encapsulated design -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14163) Refactor existing hadoop site to use more usable static website generator
[ https://issues.apache.org/jira/browse/HADOOP-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034306#comment-16034306 ] Akira Ajisaka commented on HADOOP-14163: Cool. Now Apache Hadoop 3.0.0-alpha3 is released. Would you update the document as well? When you update the document, I'll push this to the asf-site branch. > Refactor existing hadoop site to use more usable static website generator > - > > Key: HADOOP-14163 > URL: https://issues.apache.org/jira/browse/HADOOP-14163 > Project: Hadoop Common > Issue Type: Improvement > Components: site >Reporter: Elek, Marton >Assignee: Elek, Marton > Attachments: HADOOP-14163-001.zip, HADOOP-14163-002.zip, > HADOOP-14163-003.zip, hadoop-site.tar.gz, hadop-site-rendered.tar.gz > > > From the dev mailing list: > "Publishing can be attacked via a mix of scripting and revamping the darned > website. Forrest is pretty bad compared to the newer static site generators > out there (e.g. need to write XML instead of markdown, it's hard to review a > staging site because of all the absolute links, hard to customize, did I > mention XML?), and the look and feel of the site is from the 00s. We don't > actually have that much site content, so it should be possible to migrate to > a new system." > This issue is to find a solution for migrating the old site to a modern static > site generator using a more contemporary theme. > Goals: > * existing links should work (or at least be redirected) > * It should be easy to add more content required by a release automatically > (most probably by creating separate markdown files) -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads
[ https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HADOOP-14478: -- Description: Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of the data length requested. This would be beneficial for sequential reads. However, for positional reads (seek to a specific location, read x bytes, seek back to the original location) this may not be beneficial and might even download a lot more data which is not used later. It would be good to override {{readFully(long position, byte[] buffer, int offset, int length)}} for {{NativeAzureFsInputStream}} and make use of {{mark(readLimit)}} as a hint to Azure's BlobInputStream. BlobInputStream reference: https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448 BlobInputStream can consider this as a hint later to determine the amount of data to be read ahead. Changes to BlobInputStream would not be addressed in this JIRA. was: Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of the data length requested. This would be beneficial for sequential reads. However, for positional reads (seek to a specific location, read x bytes, seek back to the original location) this may not be beneficial and might even download a lot more data which is not used later. It would be good to override {{readFully(long position, byte[] buffer, int offset, int length)}} for {{NativeAzureFsInputStream}} and make use of {{mark(readLimit)}} as a hint to Azure's BlobInputStream. BlobInputStream reference: https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448 BlobInputStream can consider this as a hint later to determine the amount of data to be read ahead. Changes to BlobInputStream would not be a part of this JIRA. 
> Optimize NativeAzureFsInputStream for positional reads > -- > > Key: HADOOP-14478 > URL: https://issues.apache.org/jira/browse/HADOOP-14478 > Project: Hadoop Common > Issue Type: Bug > Components: fs/azure >Reporter: Rajesh Balamohan > Attachments: HADOOP-14478.001.patch > > > Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of > the data length requested. This would be beneficial for sequential reads. > However, for positional reads (seek to a specific location, read x bytes, > seek back to the original location) this may not be beneficial and might > even download a lot more data which is not used later. > It would be good to override {{readFully(long position, byte[] buffer, int > offset, int length)}} for {{NativeAzureFsInputStream}} and make use of > {{mark(readLimit)}} as a hint to Azure's BlobInputStream. > BlobInputStream reference: > https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448 > BlobInputStream can consider this as a hint later to determine the amount of > data to be read ahead. Changes to BlobInputStream would not be addressed in > this JIRA. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads
[ https://issues.apache.org/jira/browse/HADOOP-14478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HADOOP-14478: -- Attachment: HADOOP-14478.001.patch Attaching .1 patch for review. This includes changes related to HADOOP-14473 as well. > Optimize NativeAzureFsInputStream for positional reads > -- > > Key: HADOOP-14478 > URL: https://issues.apache.org/jira/browse/HADOOP-14478 > Project: Hadoop Common > Issue Type: Bug > Components: fs/azure >Reporter: Rajesh Balamohan > Attachments: HADOOP-14478.001.patch > > > Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of > the data length requested. This would be beneficial for sequential reads. > However, for positional reads (seek to a specific location, read x bytes, > seek back to the original location) this may not be beneficial and might > even download a lot more data which is not used later. > It would be good to override {{readFully(long position, byte[] buffer, int > offset, int length)}} for {{NativeAzureFsInputStream}} and make use of > {{mark(readLimit)}} as a hint to Azure's BlobInputStream. > BlobInputStream reference: > https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448 > BlobInputStream can consider this as a hint later to determine the amount of > data to be read ahead. Changes to BlobInputStream would not be a part of this > JIRA. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-14478) Optimize NativeAzureFsInputStream for positional reads
Rajesh Balamohan created HADOOP-14478: - Summary: Optimize NativeAzureFsInputStream for positional reads Key: HADOOP-14478 URL: https://issues.apache.org/jira/browse/HADOOP-14478 Project: Hadoop Common Issue Type: Bug Components: fs/azure Reporter: Rajesh Balamohan Azure's {{BlobInputStream}} internally buffers 4 MB of data irrespective of the data length requested. This would be beneficial for sequential reads. However, for positional reads (seek to a specific location, read x bytes, seek back to the original location) this may not be beneficial and might even download a lot more data which is not used later. It would be good to override {{readFully(long position, byte[] buffer, int offset, int length)}} for {{NativeAzureFsInputStream}} and make use of {{mark(readLimit)}} as a hint to Azure's BlobInputStream. BlobInputStream reference: https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/blob/BlobInputStream.java#L448 BlobInputStream can consider this as a hint later to determine the amount of data to be read ahead. Changes to BlobInputStream would not be a part of this JIRA. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org