[jira] [Created] (HADOOP-18238) Hadoop 3.3.1 SFTPFileSystem.close() method has a problem
yi liu created HADOOP-18238: --- Summary: Hadoop 3.3.1 SFTPFileSystem.close() method has a problem Key: HADOOP-18238 URL: https://issues.apache.org/jira/browse/HADOOP-18238 Project: Hadoop Common Issue Type: Bug Components: common Affects Versions: 3.3.1 Reporter: yi liu
{code}
@Override
public void close() throws IOException {
  if (closed.getAndSet(true)) {
    return;
  }
  try {
    super.close();
  } finally {
    if (connectionPool != null) {
      connectionPool.shutdown();
    }
  }
}
{code}
If you execute this method, the fs can no longer execute deleteOnExit, because the fs is closed. If close() is called manually, the SFTP fs shuts down the connection pool so that the JVM can exit normally, but deleteOnExit then fails because the fs is already closed. If close() is not called, the connection pool is never released and the JVM cannot exit. https://issues.apache.org/jira/browse/HADOOP-17528 is the same problem in the 3.2.0 SFTPFileSystem. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
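The deadlock described above can be sketched with a toy class (all names are hypothetical stand-ins, not the actual SFTPFileSystem internals): once close() has flipped the closed flag and shut down the pool, any later filesystem operation, such as deleteOnExit processing, must fail.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Minimal sketch of the close-ordering problem. Names such as
 *  delete() and isPoolShutdown() are illustrative only. */
public class CloseOrderingSketch {
    private final AtomicBoolean closed = new AtomicBoolean(false);
    private boolean poolShutdown = false;

    public void close() {
        if (closed.getAndSet(true)) {
            return;            // idempotent: a second close is a no-op
        }
        poolShutdown = true;   // stands in for connectionPool.shutdown()
    }

    /** Stands in for any post-close operation, e.g. a deferred delete. */
    public boolean delete(String path) {
        if (closed.get()) {
            throw new IllegalStateException("filesystem is closed");
        }
        return true;
    }

    public boolean isClosed() { return closed.get(); }
    public boolean isPoolShutdown() { return poolShutdown; }
}
```

This mirrors the tension in the report: without close() the pool (and JVM) lingers; with close() the deferred delete can no longer run.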
[jira] [Commented] (HADOOP-13184) Add "Apache" to Hadoop project logo
[ https://issues.apache.org/jira/browse/HADOOP-13184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320013#comment-15320013 ] Yi Liu commented on HADOOP-13184: - Option 1 is more beautiful, +1. > Add "Apache" to Hadoop project logo > --- > > Key: HADOOP-13184 > URL: https://issues.apache.org/jira/browse/HADOOP-13184 > Project: Hadoop Common > Issue Type: Task >Reporter: Chris Douglas >Assignee: Abhishek > > Many ASF projects include "Apache" in their logo. We should add it to Hadoop. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-12756) Incorporate Aliyun OSS file system implementation
[ https://issues.apache.org/jira/browse/HADOOP-12756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303697#comment-15303697 ] Yi Liu commented on HADOOP-12756: - {quote} I'd recommend a wider conversation on the dev mailing lists before filing any specific requests to infra. {quote} +1 for this. Another thing about the "auth-keys.xml": currently we use the credential file instead of a normal Hadoop configuration property. I think the reason is that it's more secure, and the user can control the Linux file permissions of "auth-keys.xml". Could we allow a normal Hadoop configuration property for the credentials too? Then we could specify the credentials on the mvn build command line, which would be more easily supported by INFRA, while users can still use the "auth-keys.xml" in practice. > Incorporate Aliyun OSS file system implementation > - > > Key: HADOOP-12756 > URL: https://issues.apache.org/jira/browse/HADOOP-12756 > Project: Hadoop Common > Issue Type: New Feature > Components: fs >Affects Versions: 2.8.0 >Reporter: shimingfei >Assignee: shimingfei > Attachments: HADOOP-12756-v02.patch, HCFS User manual.md, OSS > integration.pdf, OSS integration.pdf > > > Aliyun OSS is widely used among China’s cloud users, but currently it is not > easy to access data stored on OSS from a user’s Hadoop/Spark application, > because Hadoop has no built-in support for OSS. > This work aims to integrate Aliyun OSS with Hadoop. With simple configuration, > Spark/Hadoop applications can read/write data from OSS without any code > change, narrowing the gap between the user’s application and data storage, like > what has been done for S3 in Hadoop.
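The lookup order proposed above can be sketched as follows (the key name and the method are hypothetical, not Hadoop's actual Configuration API): prefer a credential passed on the command line via -D, and fall back to one loaded from an auth-keys.xml-style file.

```java
import java.util.Properties;

/** Sketch of command-line-first credential resolution. */
public class CredentialResolver {
    public static String resolve(String key, Properties fromAuthKeysFile) {
        // e.g. mvn test -Dfs.oss.accessKeyId=... (hypothetical key name)
        String fromCommandLine = System.getProperty(key);
        if (fromCommandLine != null) {
            return fromCommandLine;
        }
        // fall back to the file-based credential, e.g. auth-keys.xml
        return fromAuthKeysFile.getProperty(key);
    }
}
```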
[jira] [Commented] (HADOOP-12756) Incorporate Aliyun OSS file system implementation
[ https://issues.apache.org/jira/browse/HADOOP-12756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303667#comment-15303667 ] Yi Liu commented on HADOOP-12756: - Agree with [~cnauroth]. The credentials need to go somewhere accessible by each Jenkins host that runs a Hadoop pre-commit build. {code} have a dedicated host (or vm) equipped with all these credentials and run all the tests daily {code} Kai, I think it's not about finding a dedicated host; instead, we need to make the auth-keys.xml available on all the Jenkins hosts that run the Hadoop pre-commit build. Not sure whether it's easy for INFRA to support this. {code} It seems these two files should not be included in source code, as what .gitignore has excluded. Maybe we can provide these two files separately? {code} [~lingzhou], please don't add the credentials in the patch. It's unexpected. > Incorporate Aliyun OSS file system implementation > - > > Key: HADOOP-12756 > URL: https://issues.apache.org/jira/browse/HADOOP-12756 > Project: Hadoop Common > Issue Type: New Feature > Components: fs >Affects Versions: 2.8.0 >Reporter: shimingfei >Assignee: shimingfei > Attachments: HADOOP-12756-v02.patch, HCFS User manual.md, OSS > integration.pdf, OSS integration.pdf > > > Aliyun OSS is widely used among China’s cloud users, but currently it is not > easy to access data stored on OSS from a user’s Hadoop/Spark application, > because Hadoop has no built-in support for OSS. > This work aims to integrate Aliyun OSS with Hadoop. With simple configuration, > Spark/Hadoop applications can read/write data from OSS without any code > change, narrowing the gap between the user’s application and data storage, like > what has been done for S3 in Hadoop.
[jira] [Commented] (HADOOP-11180) Fix warning of "token.Token: Cannot find class for token kind kms-dt" for KMS when running jobs on Encryption zones
[ https://issues.apache.org/jira/browse/HADOOP-11180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15261461#comment-15261461 ] Yi Liu commented on HADOOP-11180: - Sure, thanks Andrew and Steve. Here the log level change should be safe. > Fix warning of "token.Token: Cannot find class for token kind kms-dt" for KMS > when running jobs on Encryption zones > --- > > Key: HADOOP-11180 > URL: https://issues.apache.org/jira/browse/HADOOP-11180 > Project: Hadoop Common > Issue Type: Bug > Components: kms, security >Affects Versions: 2.6.0 >Reporter: Yi Liu >Assignee: Yi Liu > Labels: BB2015-05-TBR > Attachments: HADOOP-11180.001.patch > > > This issue is produced when running a MapReduce job with encryption zones > configured. > {quote} > 14/10/09 05:06:02 INFO security.TokenCache: Got dt for > hdfs://hnode1.sh.intel.com:9000; Kind: HDFS_DELEGATION_TOKEN, Service: > 10.239.47.8:9000, Ident: (HDFS_DELEGATION_TOKEN token 21 for user) > 14/10/09 05:06:02 WARN token.Token: Cannot find class for token kind kms-dt > 14/10/09 05:06:02 INFO security.TokenCache: Got dt for > hdfs://hnode1.sh.intel.com:9000; Kind: kms-dt, Service: 10.239.47.8:16000, > Ident: 00 04 75 73 65 72 04 79 61 72 6e 00 8a 01 48 f1 8e 85 07 8a 01 49 15 > 9b 09 07 04 02 > 14/10/09 05:06:03 INFO input.FileInputFormat: Total input paths to process : 1 > 14/10/09 05:06:03 INFO mapreduce.JobSubmitter: number of splits:1 > 14/10/09 05:06:03 INFO mapreduce.JobSubmitter: Submitting tokens for job: > job_141272197_0004 > 14/10/09 05:06:03 WARN token.Token: Cannot find class for token kind kms-dt > 14/10/09 05:06:03 WARN token.Token: Cannot find class for token kind kms-dt > {quote}
[jira] [Commented] (HADOOP-12756) Incorporate Aliyun OSS file system implementation
[ https://issues.apache.org/jira/browse/HADOOP-12756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15255941#comment-15255941 ] Yi Liu commented on HADOOP-12756: - Also, the name "oss" is an abbreviation of Object Store Service; it's too generic. I think we need to change the name to ali-oss or some other name that people can understand at first glance. > Incorporate Aliyun OSS file system implementation > - > > Key: HADOOP-12756 > URL: https://issues.apache.org/jira/browse/HADOOP-12756 > Project: Hadoop Common > Issue Type: New Feature > Components: fs >Reporter: shimingfei >Assignee: shimingfei > Attachments: 0001-OSS-filesystem-integration-with-Hadoop.patch, HCFS > User manual.md, OSS integration.pdf, OSS integration.pdf > > > Aliyun OSS is widely used among China’s cloud users, but currently it is not > easy to access data stored on OSS from a user’s Hadoop/Spark application, > because Hadoop has no built-in support for OSS. > This work aims to integrate Aliyun OSS with Hadoop. With simple configuration, > Spark/Hadoop applications can read/write data from OSS without any code > change, narrowing the gap between the user’s application and data storage, like > what has been done for S3 in Hadoop.
[jira] [Commented] (HADOOP-12756) Incorporate Aliyun OSS file system implementation
[ https://issues.apache.org/jira/browse/HADOOP-12756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15255929#comment-15255929 ] Yi Liu commented on HADOOP-12756: - Thanks to mingfei and Lei for the work. Hi [~ste...@apache.org] and [~cnauroth], regarding testability, they have talked with me offline: Aliyun created an account for testing and retained it for Hadoop, and they want to pass the username/password through "-D" on the mvn command line, so the basic functionality can be verified by unit tests. Does this make sense to you? Mingfei and Lei: about the ali-oss client, does it rely on a different version of httpclient? Could we use the version Hadoop is using? I will post my detailed comments later. > Incorporate Aliyun OSS file system implementation > - > > Key: HADOOP-12756 > URL: https://issues.apache.org/jira/browse/HADOOP-12756 > Project: Hadoop Common > Issue Type: New Feature > Components: fs >Reporter: shimingfei >Assignee: shimingfei > Attachments: 0001-OSS-filesystem-integration-with-Hadoop.patch, HCFS > User manual.md, OSS integration.pdf, OSS integration.pdf > > > Aliyun OSS is widely used among China’s cloud users, but currently it is not > easy to access data stored on OSS from a user’s Hadoop/Spark application, > because Hadoop has no built-in support for OSS. > This work aims to integrate Aliyun OSS with Hadoop. With simple configuration, > Spark/Hadoop applications can read/write data from OSS without any code > change, narrowing the gap between the user’s application and data storage, like > what has been done for S3 in Hadoop.
[jira] [Commented] (HADOOP-12040) Adjust inputs order for the decode API in raw erasure coder
[ https://issues.apache.org/jira/browse/HADOOP-12040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977915#comment-14977915 ] Yi Liu commented on HADOOP-12040: - Will commit shortly > Adjust inputs order for the decode API in raw erasure coder > --- > > Key: HADOOP-12040 > URL: https://issues.apache.org/jira/browse/HADOOP-12040 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Kai Zheng > Attachments: HADOOP-12040-HDFS-7285-v1.patch, HADOOP-12040-v2.patch, > HADOOP-12040-v3.patch, HADOOP-12040-v4.patch > > > Currently we use the parity units + data units order for the inputs, > erasedIndexes and outputs parameters in the decode call in raw erasure coder, > which was inherited from HDFS-RAID due to constraints enforced by {{GaloisField}}. As > [~zhz] pointed out and [~hitliuyi] felt, we'd better change the order to make it > natural for HDFS usage, where usually data blocks are before parity blocks in > a group. Doing this would avoid some tricky reordering logic.
[jira] [Updated] (HADOOP-12040) Adjust inputs order for the decode API in raw erasure coder
[ https://issues.apache.org/jira/browse/HADOOP-12040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-12040: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Target Version/s: 2.8.0 Status: Resolved (was: Patch Available) Committed to trunk and branch-2. > Adjust inputs order for the decode API in raw erasure coder > --- > > Key: HADOOP-12040 > URL: https://issues.apache.org/jira/browse/HADOOP-12040 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Kai Zheng > Fix For: 2.8.0 > > Attachments: HADOOP-12040-HDFS-7285-v1.patch, HADOOP-12040-v2.patch, > HADOOP-12040-v3.patch, HADOOP-12040-v4.patch > > > Currently we use the parity units + data units order for the inputs, > erasedIndexes and outputs parameters in the decode call in raw erasure coder, > which was inherited from HDFS-RAID due to constraints enforced by {{GaloisField}}. As > [~zhz] pointed out and [~hitliuyi] felt, we'd better change the order to make it > natural for HDFS usage, where usually data blocks are before parity blocks in > a group. Doing this would avoid some tricky reordering logic.
[jira] [Commented] (HADOOP-12040) Adjust inputs order for the decode API in raw erasure coder
[ https://issues.apache.org/jira/browse/HADOOP-12040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976024#comment-14976024 ] Yi Liu commented on HADOOP-12040: - Generally looks good, Kai. 1. You need to clean up the checkstyle issues. For example, some lines are longer than 80 characters. 2. Some related tests show failures, such as TestRecoverStripedFile. 3.
{code}
for (int i = 0; i < erasedIndexes.length; i++) {
  if (erasedIndexes[i] >= getNumDataUnits()) {
    erasedIndexes2[idx++] = erasedIndexes[i] - getNumDataUnits();
    numErasedParityUnits++;
  }
}
for (int i = 0; i < erasedIndexes.length; i++) {
  if (erasedIndexes[i] < getNumDataUnits()) {
    erasedIndexes2[idx++] = erasedIndexes[i] + getNumParityUnits();
    numErasedDataUnits++;
  }
}
{code}
This can be done in a single {{for}} loop. > Adjust inputs order for the decode API in raw erasure coder > --- > > Key: HADOOP-12040 > URL: https://issues.apache.org/jira/browse/HADOOP-12040 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Kai Zheng > Attachments: HADOOP-12040-HDFS-7285-v1.patch, HADOOP-12040-v2.patch, > HADOOP-12040-v3.patch > > > Currently we use the parity units + data units order for the inputs, > erasedIndexes and outputs parameters in the decode call in raw erasure coder, > which was inherited from HDFS-RAID due to constraints enforced by {{GaloisField}}. As > [~zhz] pointed out and [~hitliuyi] felt, we'd better change the order to make it > natural for HDFS usage, where usually data blocks are before parity blocks in > a group. Doing this would avoid some tricky reordering logic.
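As a sanity check, the two loops quoted above can be wrapped in a small self-contained method (the method name and plain-parameter form are mine, not from the patch): parity-unit erasures are emitted first, shifted down into parity-index space, followed by data-unit erasures shifted past the parity units.

```java
/** Sketch of the index remapping from the patch snippet: translate
 *  erased indexes from data-first order to the parity-first order
 *  the underlying coder expects. */
public class ErasedIndexRemap {
    public static int[] remap(int[] erasedIndexes, int numDataUnits, int numParityUnits) {
        int[] out = new int[erasedIndexes.length];
        int idx = 0;
        for (int e : erasedIndexes) {       // parity erasures first
            if (e >= numDataUnits) {
                out[idx++] = e - numDataUnits;
            }
        }
        for (int e : erasedIndexes) {       // then data erasures
            if (e < numDataUnits) {
                out[idx++] = e + numParityUnits;
            }
        }
        return out;
    }
}
```

With 6 data and 3 parity units, erased indexes {1, 7} remap to {1, 4}: the parity erasure 7 becomes 1 and is listed first, and the data erasure 1 becomes 4. The parity-first output grouping is also why collapsing the two loops into one pass is awkward, as noted in the follow-up comment.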
[jira] [Commented] (HADOOP-12040) Adjust inputs order for the decode API in raw erasure coder
[ https://issues.apache.org/jira/browse/HADOOP-12040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977510#comment-14977510 ] Yi Liu commented on HADOOP-12040: - For comment #3, it's not convenient to do, so no need to address it. > Adjust inputs order for the decode API in raw erasure coder > --- > > Key: HADOOP-12040 > URL: https://issues.apache.org/jira/browse/HADOOP-12040 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Kai Zheng > Attachments: HADOOP-12040-HDFS-7285-v1.patch, HADOOP-12040-v2.patch, > HADOOP-12040-v3.patch > > > Currently we use the parity units + data units order for the inputs, > erasedIndexes and outputs parameters in the decode call in raw erasure coder, > which was inherited from HDFS-RAID due to constraints enforced by {{GaloisField}}. As > [~zhz] pointed out and [~hitliuyi] felt, we'd better change the order to make it > natural for HDFS usage, where usually data blocks are before parity blocks in > a group. Doing this would avoid some tricky reordering logic.
[jira] [Commented] (HADOOP-12040) Adjust inputs order for the decode API in raw erasure coder
[ https://issues.apache.org/jira/browse/HADOOP-12040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977564#comment-14977564 ] Yi Liu commented on HADOOP-12040: - +1 pending Jenkins. > Adjust inputs order for the decode API in raw erasure coder > --- > > Key: HADOOP-12040 > URL: https://issues.apache.org/jira/browse/HADOOP-12040 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Kai Zheng > Attachments: HADOOP-12040-HDFS-7285-v1.patch, HADOOP-12040-v2.patch, > HADOOP-12040-v3.patch, HADOOP-12040-v4.patch > > > Currently we use the parity units + data units order for the inputs, > erasedIndexes and outputs parameters in the decode call in raw erasure coder, > which was inherited from HDFS-RAID due to constraints enforced by {{GaloisField}}. As > [~zhz] pointed out and [~hitliuyi] felt, we'd better change the order to make it > natural for HDFS usage, where usually data blocks are before parity blocks in > a group. Doing this would avoid some tricky reordering logic.
[jira] [Commented] (HADOOP-12483) Maintain wrapped SASL ordering for postponed IPC responses
[ https://issues.apache.org/jira/browse/HADOOP-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962749#comment-14962749 ] Yi Liu commented on HADOOP-12483: - +1, looks good to me. Thanks [~daryn], will commit shortly. > Maintain wrapped SASL ordering for postponed IPC responses > -- > > Key: HADOOP-12483 > URL: https://issues.apache.org/jira/browse/HADOOP-12483 > Project: Hadoop Common > Issue Type: Bug > Components: ipc >Affects Versions: 2.8.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Critical > Attachments: HADOOP-12483.patch > > > A SASL encryption algorithm (wrapping) may have a required ordering for > encrypted responses. The IPC layer encrypts when the response is set based > on the assumption it is being immediately sent. Postponed responses violate > that assumption.
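The ordering requirement in the description can be illustrated with a toy stateful transform (this is not the real SASL wrap API, just a stand-in keyed on a sequence counter): output only round-trips when the peer unwraps in the same sequence in which it was wrapped, so wrapping at response-set time and then sending out of order breaks decryption.

```java
/** Toy illustration of order-sensitive wrapping: each side advances
 *  a sequence counter, so wrap order must equal send/unwrap order. */
public class WrapOrderSketch {
    private int wrapSeq = 0;    // sender state, advanced by each wrap
    private int unwrapSeq = 0;  // receiver state, advanced by each unwrap

    public int wrap(int plain) { return plain ^ wrapSeq++; }
    public int unwrap(int wrapped) { return wrapped ^ unwrapSeq++; }
}
```

Wrapping 10 then 20 and unwrapping in the same order recovers both values; unwrapping the second message first yields garbage, which is the failure mode a postponed (reordered) response triggers.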
[jira] [Updated] (HADOOP-12483) Maintain wrapped SASL ordering for postponed IPC responses
[ https://issues.apache.org/jira/browse/HADOOP-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-12483: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) Committed to trunk and branch-2. > Maintain wrapped SASL ordering for postponed IPC responses > -- > > Key: HADOOP-12483 > URL: https://issues.apache.org/jira/browse/HADOOP-12483 > Project: Hadoop Common > Issue Type: Bug > Components: ipc >Affects Versions: 2.8.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp >Priority: Critical > Fix For: 2.8.0 > > Attachments: HADOOP-12483.patch > > > A SASL encryption algorithm (wrapping) may have a required ordering for > encrypted responses. The IPC layer encrypts when the response is set based > on the assumption it is being immediately sent. Postponed responses violate > that assumption.
[jira] [Updated] (HADOOP-10300) Allowed deferred sending of call responses
[ https://issues.apache.org/jira/browse/HADOOP-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-10300: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) Committed to trunk and branch-2. > Allowed deferred sending of call responses > -- > > Key: HADOOP-10300 > URL: https://issues.apache.org/jira/browse/HADOOP-10300 > Project: Hadoop Common > Issue Type: Sub-task > Components: ipc >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Labels: BB2015-05-TBR > Fix For: 2.8.0 > > Attachments: HADOOP-10300.patch, HADOOP-10300.patch, > HADOOP-10300.patch > > > RPC handlers currently do not return until the RPC call completes and the > response is sent, or a partially sent response has been queued for the > responder. It would be useful for a proxy method to notify the handler to > not yet send the call's response. > A potential use case: a namespace handler in the NN might want to return > before the edit log is synced so it can service more requests and allow > increased batching of edits per sync. Background syncing could later trigger > the sending of the call response to the client.
[jira] [Commented] (HADOOP-10300) Allowed deferred sending of call responses
[ https://issues.apache.org/jira/browse/HADOOP-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14952778#comment-14952778 ] Yi Liu commented on HADOOP-10300: - +1, thanks [~daryn]. > Allowed deferred sending of call responses > -- > > Key: HADOOP-10300 > URL: https://issues.apache.org/jira/browse/HADOOP-10300 > Project: Hadoop Common > Issue Type: Sub-task > Components: ipc >Affects Versions: 2.0.0-alpha, 3.0.0 >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Labels: BB2015-05-TBR > Attachments: HADOOP-10300.patch, HADOOP-10300.patch, > HADOOP-10300.patch > > > RPC handlers currently do not return until the RPC call completes and the > response is sent, or a partially sent response has been queued for the > responder. It would be useful for a proxy method to notify the handler to > not yet send the call's response. > A potential use case: a namespace handler in the NN might want to return > before the edit log is synced so it can service more requests and allow > increased batching of edits per sync. Background syncing could later trigger > the sending of the call response to the client.
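The deferred-response idea in the description can be sketched with a future (hypothetical class, not the actual Server.Call interface): the handler returns immediately without sending, and a later event, e.g. a background edit-log sync, completes the call and triggers the send to the client.

```java
import java.util.concurrent.CompletableFuture;

/** Sketch of a call whose response is sent after the handler returns. */
public class DeferredCall {
    private final CompletableFuture<String> response = new CompletableFuture<>();

    /** Invoked later, e.g. from the background syncing thread. */
    public void sendResponse(String value) { response.complete(value); }

    /** True once the response has actually been produced. */
    public boolean isSent() { return response.isDone(); }

    /** Client side: blocks until the deferred response arrives. */
    public String awaitResponse() { return response.join(); }
}
```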
[jira] [Commented] (HADOOP-12448) TestTextCommand: use mkdirs rather than mkdir to create test directory
[ https://issues.apache.org/jira/browse/HADOOP-12448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14936280#comment-14936280 ] Yi Liu commented on HADOOP-12448: - +1, thanks [~cmccabe] and [~cnauroth], will commit it shortly. > TestTextCommand: use mkdirs rather than mkdir to create test directory > -- > > Key: HADOOP-12448 > URL: https://issues.apache.org/jira/browse/HADOOP-12448 > Project: Hadoop Common > Issue Type: Bug > Components: test >Affects Versions: 2.8.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HADOOP-12448.001.patch, HADOOP-12448.002.patch > > > TestTextCommand should use mkdirs rather than mkdir to create the test > directory.
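The mkdirs-vs-mkdir distinction behind this fix is easy to demonstrate with java.io.File: mkdir only creates the final path component and fails when intermediate directories are missing, while mkdirs creates the whole chain, which is what a nested test directory needs.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

/** Demonstrates why mkdirs is needed for nested directories. */
public class MkdirsDemo {
    /** Returns { result of mkdir, result of mkdirs } for a nested path. */
    public static boolean[] tryBoth() {
        try {
            File base = Files.createTempDirectory("mkdirs-demo").toFile();
            File nested = new File(base, "a/b/c");
            boolean singleLevel = nested.mkdir();  // false: parents a/b absent
            boolean fullChain = nested.mkdirs();   // true: creates a, a/b, a/b/c
            return new boolean[] { singleLevel, fullChain };
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```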
[jira] [Updated] (HADOOP-12448) TestTextCommand: use mkdirs rather than mkdir to create test directory
[ https://issues.apache.org/jira/browse/HADOOP-12448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-12448: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) Committed to trunk and branch-2. > TestTextCommand: use mkdirs rather than mkdir to create test directory > -- > > Key: HADOOP-12448 > URL: https://issues.apache.org/jira/browse/HADOOP-12448 > Project: Hadoop Common > Issue Type: Bug > Components: test >Affects Versions: 2.8.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 2.8.0 > > Attachments: HADOOP-12448.001.patch, HADOOP-12448.002.patch > > > TestTextCommand should use mkdirs rather than mkdir to create the test > directory.
[jira] [Updated] (HADOOP-12367) Move TestFileUtil's test resources to resources folder
[ https://issues.apache.org/jira/browse/HADOOP-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-12367: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) Committed to trunk and branch-2, thanks [~andrew.wang]. > Move TestFileUtil's test resources to resources folder > -- > > Key: HADOOP-12367 > URL: https://issues.apache.org/jira/browse/HADOOP-12367 > Project: Hadoop Common > Issue Type: Improvement >Affects Versions: 2.7.1 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Minor > Fix For: 2.8.0 > > Attachments: HADOOP-12367.001.patch, HADOOP-12367.002.patch > > > Little cleanup. Right now we do an antrun step to copy the tar and tgz from > the source folder to target folder. We can skip this by just putting it in > the resources folder like all the other test resources.
[jira] [Commented] (HADOOP-12367) Move TestFileUtil's test resources to resources folder
[ https://issues.apache.org/jira/browse/HADOOP-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724540#comment-14724540 ] Yi Liu commented on HADOOP-12367: - +1 pending Jenkins. Thanks for the cleanup. > Move TestFileUtil's test resources to resources folder > -- > > Key: HADOOP-12367 > URL: https://issues.apache.org/jira/browse/HADOOP-12367 > Project: Hadoop Common > Issue Type: Improvement >Affects Versions: 2.7.1 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Minor > Attachments: HADOOP-12367.001.patch > > > Little cleanup. Right now we do an antrun step to copy the tar and tgz from > the source folder to target folder. We can skip this by just putting it in > the resources folder like all the other test resources.
[jira] [Commented] (HADOOP-10300) Allowed deferred sending of call responses
[ https://issues.apache.org/jira/browse/HADOOP-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702373#comment-14702373 ] Yi Liu commented on HADOOP-10300: - Yes, it's OK with me. The original patch looks good to me. Trunk has changed somewhat since then; please rebase it and let me take another look. Thanks. Allowed deferred sending of call responses -- Key: HADOOP-10300 URL: https://issues.apache.org/jira/browse/HADOOP-10300 Project: Hadoop Common Issue Type: Sub-task Components: ipc Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Labels: BB2015-05-TBR Attachments: HADOOP-10300.patch, HADOOP-10300.patch RPC handlers currently do not return until the RPC call completes and the response is sent, or a partially sent response has been queued for the responder. It would be useful for a proxy method to notify the handler to not yet send the call's response. A potential use case: a namespace handler in the NN might want to return before the edit log is synced so it can service more requests and allow increased batching of edits per sync. Background syncing could later trigger the sending of the call response to the client.
[jira] [Commented] (HADOOP-12295) Improve NetworkTopology#InnerNode#remove logic
[ https://issues.apache.org/jira/browse/HADOOP-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694910#comment-14694910 ] Yi Liu commented on HADOOP-12295: - Thanks [~vinayrpet] for the review, committed to trunk and branch-2. I can address the comments if Chris has any, thanks. Improve NetworkTopology#InnerNode#remove logic -- Key: HADOOP-12295 URL: https://issues.apache.org/jira/browse/HADOOP-12295 Project: Hadoop Common Issue Type: Improvement Reporter: Yi Liu Assignee: Yi Liu Attachments: HADOOP-12295.001.patch In {{NetworkTopology#InnerNode#remove}}, We can use {{childrenMap}} to get the parent node, no need to loop the {{children}} list. Then it is more efficient since in most cases deleting parent node doesn't happen. Another nit in current code is: {code} String parent = n.getNetworkLocation(); String currentPath = getPath(this); {code} can be in closure of {{\!isAncestor\(n\)}}
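The map-vs-list lookup being improved here can be sketched in isolation (a hypothetical stand-in for NetworkTopology's inner-node bookkeeping, not the real class): resolving a child by name through a map is a constant-time get, whereas scanning the children list is linear.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of keeping a name-to-index map alongside the children list. */
public class InnerNodeSketch {
    private final List<String> children = new ArrayList<>();
    private final Map<String, Integer> childrenMap = new HashMap<>();

    public void add(String name) {
        childrenMap.put(name, children.size());
        children.add(name);
    }

    /** O(1) lookup via the map, or null if the name is absent. */
    public Integer indexOfViaMap(String name) {
        return childrenMap.get(name);
    }

    /** The O(n) scan of the children list that the patch avoids. */
    public int indexOfViaScan(String name) {
        return children.indexOf(name);
    }
}
```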
[jira] [Updated] (HADOOP-12295) Improve NetworkTopology#InnerNode#remove logic
[ https://issues.apache.org/jira/browse/HADOOP-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-12295: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) Improve NetworkTopology#InnerNode#remove logic -- Key: HADOOP-12295 URL: https://issues.apache.org/jira/browse/HADOOP-12295 Project: Hadoop Common Issue Type: Improvement Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HADOOP-12295.001.patch In {{NetworkTopology#InnerNode#remove}}, We can use {{childrenMap}} to get the parent node, no need to loop the {{children}} list. Then it is more efficient since in most cases deleting parent node doesn't happen. Another nit in current code is: {code} String parent = n.getNetworkLocation(); String currentPath = getPath(this); {code} can be in closure of {{\!isAncestor\(n\)}}
[jira] [Updated] (HADOOP-12295) Improve NetworkTopology#InnerNode#remove logic
[ https://issues.apache.org/jira/browse/HADOOP-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-12295: Description: In {{NetworkTopology#InnerNode#remove}}, We can use {{childrenMap}} to get the parent node, no need to loop the {{children}} list. Another nit in current code is: {code} String parent = n.getNetworkLocation(); String currentPath = getPath(this); {code} can be in closure of {{\!isAncestor\(n\)}} was:In {{NetworkTopology#InnerNode#remove}}, We can use {{childrenMap}} to get the parent node, no need to loop the {{children}} list. Improve NetworkTopology#InnerNode#remove logic -- Key: HADOOP-12295 URL: https://issues.apache.org/jira/browse/HADOOP-12295 Project: Hadoop Common Issue Type: Improvement Reporter: Yi Liu Assignee: Yi Liu In {{NetworkTopology#InnerNode#remove}}, We can use {{childrenMap}} to get the parent node, no need to loop the {{children}} list. Another nit in current code is: {code} String parent = n.getNetworkLocation(); String currentPath = getPath(this); {code} can be in closure of {{\!isAncestor\(n\)}}
[jira] [Updated] (HADOOP-12295) Improve NetworkTopology#InnerNode#remove logic
[ https://issues.apache.org/jira/browse/HADOOP-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-12295: Description: In {{NetworkTopology#InnerNode#remove}}, We can use {{childrenMap}} to get the parent node, no need to loop the {{children}} list. Then it is more efficient since in most cases deleting parent node doesn't happen. Another nit in current code is: {code} String parent = n.getNetworkLocation(); String currentPath = getPath(this); {code} can be in closure of {{\!isAncestor\(n\)}} was: In {{NetworkTopology#InnerNode#remove}}, We can use {{childrenMap}} to get the parent node, no need to loop the {{children}} list. Another nit in current code is: {code} String parent = n.getNetworkLocation(); String currentPath = getPath(this); {code} can be in closure of {{\!isAncestor\(n\)}} Improve NetworkTopology#InnerNode#remove logic -- Key: HADOOP-12295 URL: https://issues.apache.org/jira/browse/HADOOP-12295 Project: Hadoop Common Issue Type: Improvement Reporter: Yi Liu Assignee: Yi Liu Attachments: HADOOP-12295.001.patch In {{NetworkTopology#InnerNode#remove}}, We can use {{childrenMap}} to get the parent node, no need to loop the {{children}} list. Then it is more efficient since in most cases deleting parent node doesn't happen. Another nit in current code is: {code} String parent = n.getNetworkLocation(); String currentPath = getPath(this); {code} can be in closure of {{\!isAncestor\(n\)}}
[jira] [Updated] (HADOOP-12295) Improve NetworkTopology#InnerNode#remove logic
[ https://issues.apache.org/jira/browse/HADOOP-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-12295: Description: In {{NetworkTopology#InnerNode#remove}}, We can use {{childrenMap}} to get the parent node, no need to loop the {{children}} list. (was: In {{ NetworkTopology#InnerNode#remove}}, We can use {{childrenMap}} to get the parent node, no need to loop the {{children}} list.) Improve NetworkTopology#InnerNode#remove logic -- Key: HADOOP-12295 URL: https://issues.apache.org/jira/browse/HADOOP-12295 Project: Hadoop Common Issue Type: Improvement Reporter: Yi Liu Assignee: Yi Liu In {{NetworkTopology#InnerNode#remove}}, We can use {{childrenMap}} to get the parent node, no need to loop the {{children}} list. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HADOOP-12295) Improve NetworkTopology#InnerNode#remove logic
Yi Liu created HADOOP-12295: --- Summary: Improve NetworkTopology#InnerNode#remove logic Key: HADOOP-12295 URL: https://issues.apache.org/jira/browse/HADOOP-12295 Project: Hadoop Common Issue Type: Improvement Reporter: Yi Liu Assignee: Yi Liu In {{ NetworkTopology#InnerNode#remove}}, We can use {{childrenMap}} to get the parent node, no need to loop the {{children}} list. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12295) Improve NetworkTopology#InnerNode#remove logic
[ https://issues.apache.org/jira/browse/HADOOP-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-12295: Attachment: HADOOP-12295.001.patch Improve NetworkTopology#InnerNode#remove logic -- Key: HADOOP-12295 URL: https://issues.apache.org/jira/browse/HADOOP-12295 Project: Hadoop Common Issue Type: Improvement Reporter: Yi Liu Assignee: Yi Liu Attachments: HADOOP-12295.001.patch In {{NetworkTopology#InnerNode#remove}}, We can use {{childrenMap}} to get the parent node, no need to loop the {{children}} list. Another nit in current code is: {code} String parent = n.getNetworkLocation(); String currentPath = getPath(this); {code} can be in closure of {{\!isAncestor\(n\)}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12295) Improve NetworkTopology#InnerNode#remove logic
[ https://issues.apache.org/jira/browse/HADOOP-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-12295: Status: Patch Available (was: Open) Improve NetworkTopology#InnerNode#remove logic -- Key: HADOOP-12295 URL: https://issues.apache.org/jira/browse/HADOOP-12295 Project: Hadoop Common Issue Type: Improvement Reporter: Yi Liu Assignee: Yi Liu Attachments: HADOOP-12295.001.patch In {{NetworkTopology#InnerNode#remove}}, We can use {{childrenMap}} to get the parent node, no need to loop the {{children}} list. Then it is more efficient since in most cases deleting parent node doesn't happen. Another nit in current code is: {code} String parent = n.getNetworkLocation(); String currentPath = getPath(this); {code} can be in closure of {{\!isAncestor\(n\)}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12178) NPE during handling of SASL setup if problem with SASL resolver class
[ https://issues.apache.org/jira/browse/HADOOP-12178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14629705#comment-14629705 ] Yi Liu commented on HADOOP-12178: - Steve, sorry for the late response. I agree that not all exceptions indicate a problem with the SASL connection, and some can be rethrown. It seems {{setupSaslConnection}} only throws IOException, which must be handled, but I am not very sure whether other exceptions can be thrown out. To be safe, could we keep the {code} } catch (Exception ex) { {code} Thanks. NPE during handling of SASL setup if problem with SASL resolver class - Key: HADOOP-12178 URL: https://issues.apache.org/jira/browse/HADOOP-12178 Project: Hadoop Common Issue Type: Bug Components: ipc Affects Versions: 2.7.1 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Attachments: HADOOP-12178-001.patch If there's any problem in the constructor of {{SaslRpcClient}}, then IPC Client throws an NPE rather than forwarding the stack trace. This is because the exception handler assumes that {{saslRpcClient}} is not null, i.e. that the exception is related to the SASL setup itself. The exception handler needs to check for {{saslRpcClient}} being null and, if so, rethrow the exception -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12178) NPE during handling of SASL setup if problem with SASL resolver class
[ https://issues.apache.org/jira/browse/HADOOP-12178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625644#comment-14625644 ] Yi Liu commented on HADOOP-12178: - Thanks Steve. {code} -} catch (Exception ex) { +} catch (IOException ex) { {code} I think changing {{Exception}} to {{IOException}} is unnecessary. If {{SaslPropertiesResolver.getInstance(conf)}} throws an RTE, then {{doAs}} will also throw an RTE; if we change the catch to IOE, the RTE can't be caught, so {{if (saslRpcClient == null)}} can't be reached, and furthermore we would need to handle other exceptions here. The rest looks good, just no need to change the exception. NPE during handling of SASL setup if problem with SASL resolver class - Key: HADOOP-12178 URL: https://issues.apache.org/jira/browse/HADOOP-12178 Project: Hadoop Common Issue Type: Bug Components: ipc Affects Versions: 2.7.1 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Attachments: HADOOP-12178-001.patch If there's any problem in the constructor of {{SaslRpcClient}}, then IPC Client throws an NPE rather than forwarding the stack trace. This is because the exception handler assumes that {{saslRpcClient}} is not null, i.e. that the exception is related to the SASL setup itself. The exception handler needs to check for {{saslRpcClient}} being null and, if so, rethrow the exception -- This message was sent by Atlassian JIRA (v6.3.4#6332)
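The pattern being discussed — catching a broad {{Exception}} so that a failure occurring before {{saslRpcClient}} is assigned can be rethrown instead of triggering an NPE — can be sketched like this. This is a simplified, hypothetical illustration; only the field name {{saslRpcClient}} mirrors the issue, everything else is invented:

```java
// Sketch of the null-check-then-rethrow pattern: if construction fails,
// the handler must not treat it as a SASL negotiation failure, because
// the field it would clean up is still null.
class SaslSketch {
    private Object saslRpcClient; // stand-in for the real SaslRpcClient field

    void setup(boolean failConstruction) throws Exception {
        try {
            if (failConstruction) {
                // e.g. an RTE thrown while resolving the SASL properties class
                throw new RuntimeException("resolver class not found");
            }
            saslRpcClient = new Object(); // construction succeeded
        } catch (Exception ex) {
            if (saslRpcClient == null) {
                throw ex; // nothing was constructed; forward the real cause
            }
            // otherwise: handle as a failure of the SASL setup itself
        }
    }
}
```

Catching {{Exception}} (rather than only {{IOException}}) is what lets the RuntimeException reach the null check and be rethrown with its original stack trace, which is the point Yi makes above.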
[jira] [Commented] (HADOOP-12201) Add tracing to FileSystem#createFileSystem and Globber#glob
[ https://issues.apache.org/jira/browse/HADOOP-12201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617788#comment-14617788 ] Yi Liu commented on HADOOP-12201: - +1, thanks Colin. Add tracing to FileSystem#createFileSystem and Globber#glob --- Key: HADOOP-12201 URL: https://issues.apache.org/jira/browse/HADOOP-12201 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HADOOP-12201.001.patch, createfilesystem.png Add tracing to FileSystem#createFileSystem and Globber#glob -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12172) FsShell mkdir -p makes an unnecessary check for the existence of the parent.
[ https://issues.apache.org/jira/browse/HADOOP-12172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14611287#comment-14611287 ] Yi Liu commented on HADOOP-12172: - +1, thanks Chris FsShell mkdir -p makes an unnecessary check for the existence of the parent. Key: HADOOP-12172 URL: https://issues.apache.org/jira/browse/HADOOP-12172 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor Attachments: HADOOP-12172.001.patch The {{mkdir}} command in {{FsShell}} checks for the existence of the parent of the directory and returns an error if it doesn't exist. The {{-p}} option suppresses the error and allows the directory creation to continue, implicitly creating all missing intermediate directories. However, the existence check still runs even with {{-p}} specified, and its result is ignored. Depending on the file system, this is a wasteful RPC call (HDFS) or HTTP request (WebHDFS/S3/Azure) imposing extra latency for the client and extra load for the server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12124) Add HTrace support for FsShell
[ https://issues.apache.org/jira/browse/HADOOP-12124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606740#comment-14606740 ] Yi Liu commented on HADOOP-12124: - +1, thanks Colin. Add HTrace support for FsShell -- Key: HADOOP-12124 URL: https://issues.apache.org/jira/browse/HADOOP-12124 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HADOOP-12124.001.patch, HADOOP-12124.002.patch Add HTrace support for FsShell -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11847) Enhance raw coder allowing to read least required inputs in decoding
[ https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558731#comment-14558731 ] Yi Liu commented on HADOOP-11847: - Kai, the patch looks good, one comment, +1 after addressing: In RSRawDecoder#doDecode {code} +for (int bufferIdx = 0, i = 0; i < erasedOrNotToReadIndexes.length; i++) { + if (adjustedDirectBufferOutputsParameter[i] == null) { +ByteBuffer buffer = checkGetDirectBuffer(bufferIdx, dataLen); +buffer.limit(dataLen); +adjustedDirectBufferOutputsParameter[i] = resetBuffer(buffer); +bufferIdx++; + } +} {code} Here, we need to set the buffer position to 0. Enhance raw coder allowing to read least required inputs in decoding Key: HADOOP-11847 URL: https://issues.apache.org/jira/browse/HADOOP-11847 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Labels: BB2015-05-TBR Attachments: HADOOP-11847-HDFS-7285-v3.patch, HADOOP-11847-HDFS-7285-v4.patch, HADOOP-11847-HDFS-7285-v5.patch, HADOOP-11847-HDFS-7285-v6.patch, HADOOP-11847-HDFS-7285-v7.patch, HADOOP-11847-HDFS-7285-v8.patch, HADOOP-11847-v1.patch, HADOOP-11847-v2.patch This is to enhance raw erasure coder to allow only reading least required inputs while decoding. It will also refine and document the relevant APIs for better understanding and usage. When using least required inputs, it may add computing overhead but will possibly outperform overall since less network traffic and disk IO are involved. This is something planned to do but just got reminded by [~zhz]' s question raised in HDFS-7678, also copied here: bq.Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2 is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should I construct the inputs to RawErasureDecoder#decode? With this work, hopefully the answer to above question would be obvious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
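The review point above — a reused buffer must have its position reset to 0, not only its limit set, before decoding writes into it — can be illustrated as follows. This is an assumption-only sketch of the {{ByteBuffer}} semantics involved, not the patch code:

```java
import java.nio.ByteBuffer;

// Sketch: preparing a pooled direct buffer for reuse. Calling limit(dataLen)
// alone is not enough — a previous use may have left the position mid-buffer,
// so clear() (position = 0, limit = capacity) must come first.
class BufferResetSketch {
    static ByteBuffer prepareForReuse(ByteBuffer buffer, int dataLen) {
        buffer.clear();        // position = 0, limit = capacity
        buffer.limit(dataLen); // restrict to this call's data length
        return buffer;
    }

    // Demo: simulate a buffer left at position 5 by a previous use,
    // then prepare it for an 8-byte reuse.
    static int[] demo() {
        ByteBuffer b = ByteBuffer.allocateDirect(16);
        b.position(5);
        ByteBuffer p = prepareForReuse(b, 8);
        return new int[] { p.position(), p.limit() }; // {0, 8}
    }
}
```

Without the position reset, subsequent relative puts would start at the stale position and corrupt the decoded output, which is exactly the hazard the comment flags.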
[jira] [Commented] (HADOOP-11847) Enhance raw coder allowing to read least required inputs in decoding
[ https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558742#comment-14558742 ] Yi Liu commented on HADOOP-11847: - +1, thanks Kai Enhance raw coder allowing to read least required inputs in decoding Key: HADOOP-11847 URL: https://issues.apache.org/jira/browse/HADOOP-11847 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Labels: BB2015-05-TBR Attachments: HADOOP-11847-HDFS-7285-v3.patch, HADOOP-11847-HDFS-7285-v4.patch, HADOOP-11847-HDFS-7285-v5.patch, HADOOP-11847-HDFS-7285-v6.patch, HADOOP-11847-HDFS-7285-v7.patch, HADOOP-11847-HDFS-7285-v8.patch, HADOOP-11847-HDFS-7285-v9.patch, HADOOP-11847-v1.patch, HADOOP-11847-v2.patch This is to enhance raw erasure coder to allow only reading least required inputs while decoding. It will also refine and document the relevant APIs for better understanding and usage. When using least required inputs, it may add computing overhead but will possibly outperform overall since less network traffic and disk IO are involved. This is something planned to do but just got reminded by [~zhz]' s question raised in HDFS-7678, also copied here: bq.Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2 is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should I construct the inputs to RawErasureDecoder#decode? With this work, hopefully the answer to above question would be obvious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11847) Enhance raw coder allowing to read least required inputs in decoding
[ https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558602#comment-14558602 ] Yi Liu commented on HADOOP-11847: - Thanks Kai for the patch. I will check it later (I was out of town again yesterday). Enhance raw coder allowing to read least required inputs in decoding Key: HADOOP-11847 URL: https://issues.apache.org/jira/browse/HADOOP-11847 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Labels: BB2015-05-TBR Attachments: HADOOP-11847-HDFS-7285-v3.patch, HADOOP-11847-HDFS-7285-v4.patch, HADOOP-11847-HDFS-7285-v5.patch, HADOOP-11847-HDFS-7285-v6.patch, HADOOP-11847-HDFS-7285-v7.patch, HADOOP-11847-HDFS-7285-v8.patch, HADOOP-11847-v1.patch, HADOOP-11847-v2.patch This is to enhance raw erasure coder to allow only reading least required inputs while decoding. It will also refine and document the relevant APIs for better understanding and usage. When using least required inputs, it may add computing overhead but will possibly outperform overall since less network traffic and disk IO are involved. This is something planned to do but just got reminded by [~zhz]' s question raised in HDFS-7678, also copied here: bq.Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2 is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should I construct the inputs to RawErasureDecoder#decode? With this work, hopefully the answer to above question would be obvious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11847) Enhance raw coder allowing to read least required inputs in decoding
[ https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14555446#comment-14555446 ] Yi Liu commented on HADOOP-11847: - *in AbstractRawErasureDecoder.java* for findFirstValidInput, still one comment not addressed: {code} +if (inputs[0] != null) { + return inputs[0]; +} + +for (int i = 1; i < inputs.length; i++) { + if (inputs[i] != null) { +return inputs[i]; + } +} {code} It can be: {code} for (int i = 0; i < inputs.length; i++) { if (inputs[i] != null) { return inputs[i]; } } {code} *In RSRawDecoder.java* {code} private void ensureBytesArrayBuffers(int dataLen) { if (bytesArrayBuffers == null || bytesArrayBuffers[0].length < dataLen) { /** * Create this set of buffers on demand, which is only needed at the first * time running into this, using bytes array. */ // Erased or not to read int maxInvalidUnits = getNumParityUnits(); adjustedByteArrayOutputsParameter = new byte[maxInvalidUnits][]; adjustedOutputOffsets = new int[maxInvalidUnits]; // These are temp buffers for both inputs and outputs bytesArrayBuffers = new byte[maxInvalidUnits * 2][]; for (int i = 0; i < bytesArrayBuffers.length; ++i) { bytesArrayBuffers[i] = new byte[dataLen]; } } } private void ensureDirectBuffers(int dataLen) { if (directBuffers == null || directBuffers[0].capacity() < dataLen) { /** * Create this set of buffers on demand, which is only needed at the first * time running into this, using DirectBuffer. */ // Erased or not to read int maxInvalidUnits = getNumParityUnits(); adjustedDirectBufferOutputsParameter = new ByteBuffer[maxInvalidUnits]; // These are temp buffers for both inputs and outputs directBuffers = new ByteBuffer[maxInvalidUnits * 2]; for (int i = 0; i < directBuffers.length; i++) { directBuffers[i] = ByteBuffer.allocateDirect(dataLen); } } } {code} 1. Do we need {{maxInvalidUnits * 2}} for bytesArrayBuffers and directBuffers? Since we don't need additional buffer for inputs. 
The correct size should be {{parityUnitNum - outputs.length}}. If next time there is not enough buffer, then allocate new ones. 2. The shared buffer size should always be the chunk size, otherwise they can't be shared, since the dataLen may be different. In {{doDecode}} {code} for (int i = 0; i < adjustedByteArrayOutputsParameter.length; i++) { adjustedByteArrayOutputsParameter[i] = resetBuffer(bytesArrayBuffers[bufferIdx++], 0, dataLen); adjustedOutputOffsets[i] = 0; // Always 0 for such temp output } int outputIdx = 0; for (int i = 0; i < erasedIndexes.length; i++, outputIdx++) { for (int j = 0; j < erasedOrNotToReadIndexes.length; j++) { // If this index is one requested by the caller via erasedIndexes, then // we use the passed output buffer to avoid copying data thereafter. if (erasedIndexes[i] == erasedOrNotToReadIndexes[j]) { adjustedByteArrayOutputsParameter[j] = resetBuffer(outputs[outputIdx], 0, dataLen); adjustedOutputOffsets[j] = outputOffsets[outputIdx]; } } } {code} 1. We should check that erasedOrNotToReadIndexes contains erasedIndexes. 2. We just need one loop: go through {{adjustedByteArrayOutputsParameter}}, and assign the buffer from outputs if it exists, otherwise from {{bytesArrayBuffers}} Enhance raw coder allowing to read least required inputs in decoding Key: HADOOP-11847 URL: https://issues.apache.org/jira/browse/HADOOP-11847 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Labels: BB2015-05-TBR Attachments: HADOOP-11847-HDFS-7285-v3.patch, HADOOP-11847-HDFS-7285-v4.patch, HADOOP-11847-HDFS-7285-v5.patch, HADOOP-11847-HDFS-7285-v6.patch, HADOOP-11847-v1.patch, HADOOP-11847-v2.patch This is to enhance raw erasure coder to allow only reading least required inputs while decoding. It will also refine and document the relevant APIs for better understanding and usage. When using least required inputs, it may add computing overhead but will possibly outperform overall since less network traffic and disk IO are involved. 
This is something planned to do but just got reminded by [~zhz]' s question raised in HDFS-7678, also copied here: bq.Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2 is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should I construct the inputs to RawErasureDecoder#decode? With this work, hopefully the answer to above question would be obvious. --
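The single-loop form Yi suggests above for finding the first valid (non-null) input can be written as follows. This is a hedged sketch only — the real method lives in AbstractRawErasureDecoder and throws HadoopIllegalArgumentException; a plain IllegalArgumentException stands in for it here:

```java
// Sketch of the simplified findFirstValidInput: one loop covers index 0
// as well, so the separate inputs[0] check adds nothing.
class FirstValidSketch {
    static <T> T findFirstValidInput(T[] inputs) {
        for (int i = 0; i < inputs.length; i++) {
            if (inputs[i] != null) {
                return inputs[i];
            }
        }
        // Mirrors the "all inputs null" failure case in the review thread.
        throw new IllegalArgumentException("Invalid inputs are found, all being null");
    }
}
```

When the first element is non-null the loop returns on its first iteration, so the simplified version costs nothing in the common case Kai was optimizing for.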
[jira] [Commented] (HADOOP-11847) Enhance raw coder allowing to read least required inputs in decoding
[ https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1414#comment-1414 ] Yi Liu commented on HADOOP-11847: - {quote} Sorry I missed to explain why the codes are like that. It was thinking that it's rarely the first units that's erased, so in most cases just checking inputs\[0\] will return the wanted result, avoiding involving into the loop. {quote} If the first element is not null, the single loop returns on its first iteration anyway, so there is no extra looping involved. {quote} How about simply having maxInvalidUnits = numParityUnits? The good is we don't have to re-allocate the shared buffers for different erasures. {quote} We don't need to allocate {{numParityUnits}} buffers; the outputs should have at least one, right? Maybe more than one. I don't think we have to re-allocate the shared buffers for different erasures: if the buffers are not enough, we allocate new ones and add them to the shared pool, which is the typical behavior. {quote} We don't have or use chunkSize now. Please note the check is: {quote} Right, we don't need to use ChunkSize now. I think {{bytesArrayBuffers\[0\].length < dataLen}} is OK. {{ensureBytesArrayBuffer}} and {{ensureDirectBuffers}} need to be renamed and rewritten per above comments. {quote} Would you check again, thanks. {quote} {code} for (int i = 0; i < adjustedByteArrayOutputsParameter.length; i++) { adjustedByteArrayOutputsParameter[i] = resetBuffer(bytesArrayBuffers[bufferIdx++], 0, dataLen); adjustedOutputOffsets[i] = 0; // Always 0 for such temp output } int outputIdx = 0; for (int i = 0; i < erasedIndexes.length; i++, outputIdx++) { for (int j = 0; j < erasedOrNotToReadIndexes.length; j++) { // If this index is one requested by the caller via erasedIndexes, then // we use the passed output buffer to avoid copying data thereafter. 
if (erasedIndexes[i] == erasedOrNotToReadIndexes[j]) { adjustedByteArrayOutputsParameter[j] = resetBuffer(outputs[outputIdx], 0, dataLen); adjustedOutputOffsets[j] = outputOffsets[outputIdx]; } } } {code} You call {{resetBuffer}} parityNum + erasedIndexes.length times in total, is that intended? Enhance raw coder allowing to read least required inputs in decoding Key: HADOOP-11847 URL: https://issues.apache.org/jira/browse/HADOOP-11847 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Labels: BB2015-05-TBR Attachments: HADOOP-11847-HDFS-7285-v3.patch, HADOOP-11847-HDFS-7285-v4.patch, HADOOP-11847-HDFS-7285-v5.patch, HADOOP-11847-HDFS-7285-v6.patch, HADOOP-11847-v1.patch, HADOOP-11847-v2.patch This is to enhance raw erasure coder to allow only reading least required inputs while decoding. It will also refine and document the relevant APIs for better understanding and usage. When using least required inputs, it may add computing overhead but will possibly outperform overall since less network traffic and disk IO are involved. This is something planned to do but just got reminded by [~zhz]' s question raised in HDFS-7678, also copied here: bq.Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2 is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should I construct the inputs to RawErasureDecoder#decode? With this work, hopefully the answer to above question would be obvious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11847) Enhance raw coder allowing to read least required inputs in decoding
[ https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553605#comment-14553605 ] Yi Liu commented on HADOOP-11847: - I will give comments later today; I was out of town yesterday. Enhance raw coder allowing to read least required inputs in decoding Key: HADOOP-11847 URL: https://issues.apache.org/jira/browse/HADOOP-11847 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Labels: BB2015-05-TBR Attachments: HADOOP-11847-HDFS-7285-v3.patch, HADOOP-11847-HDFS-7285-v4.patch, HADOOP-11847-HDFS-7285-v5.patch, HADOOP-11847-HDFS-7285-v6.patch, HADOOP-11847-v1.patch, HADOOP-11847-v2.patch This is to enhance raw erasure coder to allow only reading least required inputs while decoding. It will also refine and document the relevant APIs for better understanding and usage. When using least required inputs, it may add computing overhead but will possibly outperform overall since less network traffic and disk IO are involved. This is something planned to do but just got reminded by [~zhz]' s question raised in HDFS-7678, also copied here: bq.Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2 is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should I construct the inputs to RawErasureDecoder#decode? With this work, hopefully the answer to above question would be obvious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11847) Enhance raw coder allowing to read least required inputs in decoding
[ https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549929#comment-14549929 ] Yi Liu commented on HADOOP-11847: - Kai had an offline discussion with me. 1. In RSRawDecoder.java, for the additional input buffers, we don't need them; we can use the inputs directly and make some modifications to {{RSUtil.GF}} to check whether an input is null. Then it's more efficient and simpler. 2. For output, we have looked into the RS implementation: all the outputs have a relationship with each other, so in the current phase we still decode all outputs. For example, for 6+3, if there are 2 chunks missing, ideally we just need to reconstruct 2 chunks, but because of the relationship among the outputs, currently we still reconstruct 3 chunks. HADOOP-11871 is for further improvement of this. So for output buffers, we may need to allocate some buffer(s). Enhance raw coder allowing to read least required inputs in decoding Key: HADOOP-11847 URL: https://issues.apache.org/jira/browse/HADOOP-11847 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Labels: BB2015-05-TBR Attachments: HADOOP-11847-HDFS-7285-v3.patch, HADOOP-11847-HDFS-7285-v4.patch, HADOOP-11847-HDFS-7285-v5.patch, HADOOP-11847-v1.patch, HADOOP-11847-v2.patch This is to enhance raw erasure coder to allow only reading least required inputs while decoding. It will also refine and document the relevant APIs for better understanding and usage. When using least required inputs, it may add computing overhead but will possibly outperform overall since less network traffic and disk IO are involved. 
This is something planned to do but just got reminded by [~zhz]' s question raised in HDFS-7678, also copied here: bq.Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2 is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should I construct the inputs to RawErasureDecoder#decode? With this work, hopefully the answer to above question would be obvious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11847) Enhance raw coder allowing to read least required inputs in decoding
[ https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549697#comment-14549697 ] Yi Liu commented on HADOOP-11847: - Thanks Kai for the patch. *In AbstractRawErasureCoder.java* {code} + if (buffers[i] == null) { +if (allowNull) { + continue; +} +throw new HadoopIllegalArgumentException("Invalid buffer found, not allowing null"); + } {code} Using the following code may be simpler {code} if (buffers[i] == null && !allowNull) { throw new HadoopIllegalArgumentException("Invalid buffer found, not allowing null"); } {code} *In AbstractRawErasureDecoder.java* Rename {{findGoodInput}} to {{getFirstNullInput}}, and we can use the generic type of Java; the implementation can also be simplified: {code} /** * Find the first null input. * @param inputs * @return the first null input */ protected <T> T getFirstNullInput(T[] inputs) { for (T input : inputs) { if (input != null) { return input; } } throw new HadoopIllegalArgumentException( "Invalid inputs are found, all being null"); } {code} Look at the above, isn't it cleaner? Then you can change {code} ByteBuffer goodInput = (ByteBuffer) findGoodInput(inputs); {code} to {code} ByteBuffer firstNullInput = getFirstNullInput(inputs); {code} {code} protected int[] getErasedOrNotToReadIndexes(Object[] inputs) { int[] invalidIndexes = new int[inputs.length]; {code} We can accept an {{int erasedNum}} parameter, then we can allocate the exact array size with no need for an array copy. *In RSRawDecoder.java* {code} /** * We need a set of reusable buffers either for the bytes array * decoding version or direct buffer decoding version. Normally not both. * * For both input and output, in addition to the valid buffers from the caller * passed from above, we need to provide extra buffers for the internal * decoding implementation. 
For input, the caller should provide at least * numDataUnits valid buffers (non-NULL); for output, the caller should * provide no more than numParityUnits but at least one buffer. And the left * buffers will be borrowed from either bytesArrayBuffersForInput or * bytesArrayBuffersForOutput, for the bytes array version. * */ // Reused buffers for decoding with bytes arrays private byte[][] bytesArrayBuffers; private byte[][] adjustedByteArrayInputsParameter; private byte[][] adjustedByteArrayOutputsParameter; private int[] adjustedInputOffsets; private int[] adjustedOutputOffsets; // Reused buffers for decoding with direct ByteBuffers private ByteBuffer[] directBuffers; private ByteBuffer[] adjustedDirectBufferInputsParameter; private ByteBuffer[] adjustedDirectBufferOutputsParameter; {code} I don't think we need these. {code} @Override protected void doDecode(byte[][] inputs, int[] inputOffsets, int dataLen, int[] erasedIndexes, byte[][] outputs, int[] outputOffsets) { ensureBytesArrayBuffers(dataLen); /** * As passed parameters are friendly to callers but not to the underlying * implementations, so we have to adjust them before calling doDecoder. */ int[] erasedOrNotToReadIndexes = getErasedOrNotToReadIndexes(inputs); int bufferIdx = 0, erasedIdx; // Prepare for adjustedInputsParameter and adjustedInputOffsets System.arraycopy(inputs, 0, adjustedByteArrayInputsParameter, 0, inputs.length); System.arraycopy(inputOffsets, 0, adjustedInputOffsets, 0, inputOffsets.length); for (int i = 0; i < erasedOrNotToReadIndexes.length; i++) { // Borrow it from bytesArrayBuffersForInput for the temp usage. 
erasedIdx = erasedOrNotToReadIndexes[i]; adjustedByteArrayInputsParameter[erasedIdx] = resetBuffer(bytesArrayBuffers[bufferIdx++], 0, dataLen); adjustedInputOffsets[erasedIdx] = 0; // Always 0 for such temp input } // Prepare for adjustedOutputsParameter for (int i = 0; i < adjustedByteArrayOutputsParameter.length; i++) { adjustedByteArrayOutputsParameter[i] = resetBuffer(bytesArrayBuffers[bufferIdx++], 0, dataLen); adjustedOutputOffsets[i] = 0; // Always 0 for such temp output } for (int outputIdx = 0, i = 0; i < erasedIndexes.length; i++, outputIdx++) { for (int j = 0; j < erasedOrNotToReadIndexes.length; j++) { // If this index is one requested by the caller via erasedIndexes, then // we use the passed output buffer to avoid copying data thereafter. if (erasedIndexes[i] == erasedOrNotToReadIndexes[j]) { adjustedByteArrayOutputsParameter[j] = resetBuffer(outputs[outputIdx], 0, dataLen); adjustedOutputOffsets[j] = outputOffsets[outputIdx]; } } } doDecodeImpl(adjustedByteArrayInputsParameter,
[jira] [Comment Edited] (HADOOP-11847) Enhance raw coder allowing to read least required inputs in decoding
[ https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549697#comment-14549697 ] Yi Liu edited comment on HADOOP-11847 at 5/19/15 3:17 AM: -- Thanks Kai for the patch. *In AbstractRawErasureCoder.java* {code} + if (buffers[i] == null) { +if (allowNull) { + continue; +} +throw new HadoopIllegalArgumentException("Invalid buffer found, not allowing null"); + } {code} Using the following code may be simpler {code} if (buffers[i] == null && !allowNull) { throw new HadoopIllegalArgumentException("Invalid buffer found, not allowing null"); } {code} *In AbstractRawErasureDecoder.java* Rename {{findGoodInput}} to {{getFirstNotNullInput}}, and we can use the generic type of Java; the implementation can also be simplified: {code} /** * Find the first non-null input. * @param inputs * @return the first non-null input */ protected <T> T getFirstNotNullInput(T[] inputs) { for (T input : inputs) { if (input != null) { return input; } } throw new HadoopIllegalArgumentException( "Invalid inputs are found, all being null"); } {code} Look at the above, isn't it cleaner? Then you can change {code} ByteBuffer goodInput = (ByteBuffer) findGoodInput(inputs); {code} to {code} ByteBuffer firstNotNullInput = getFirstNotNullInput(inputs); {code} {code} protected int[] getErasedOrNotToReadIndexes(Object[] inputs) { int[] invalidIndexes = new int[inputs.length]; {code} We can accept an {{int erasedNum}} parameter, then we can allocate the exact array size with no need for an array copy. *In RSRawDecoder.java* {code} /** * We need a set of reusable buffers either for the bytes array * decoding version or direct buffer decoding version. Normally not both. * * For both input and output, in addition to the valid buffers from the caller * passed from above, we need to provide extra buffers for the internal * decoding implementation. 
For input, the caller should provide at least * numDataUnits valid buffers (non-NULL); for output, the caller should * provide no more than numParityUnits but at least one buffers. And the left * buffers will be borrowed from either bytesArrayBuffersForInput or * bytesArrayBuffersForOutput, for the bytes array version. * */ // Reused buffers for decoding with bytes arrays private byte[][] bytesArrayBuffers; private byte[][] adjustedByteArrayInputsParameter; private byte[][] adjustedByteArrayOutputsParameter; private int[] adjustedInputOffsets; private int[] adjustedOutputOffsets; // Reused buffers for decoding with direct ByteBuffers private ByteBuffer[] directBuffers; private ByteBuffer[] adjustedDirectBufferInputsParameter; private ByteBuffer[] adjustedDirectBufferOutputsParameter; {code} I don't think we need these. {code} @Override protected void doDecode(byte[][] inputs, int[] inputOffsets, int dataLen, int[] erasedIndexes, byte[][] outputs, int[] outputOffsets) { ensureBytesArrayBuffers(dataLen); /** * As passed parameters are friendly to callers but not to the underlying * implementations, so we have to adjust them before calling doDecoder. */ int[] erasedOrNotToReadIndexes = getErasedOrNotToReadIndexes(inputs); int bufferIdx = 0, erasedIdx; // Prepare for adjustedInputsParameter and adjustedInputOffsets System.arraycopy(inputs, 0, adjustedByteArrayInputsParameter, 0, inputs.length); System.arraycopy(inputOffsets, 0, adjustedInputOffsets, 0, inputOffsets.length); for (int i = 0; i erasedOrNotToReadIndexes.length; i++) { // Borrow it from bytesArrayBuffersForInput for the temp usage. 
erasedIdx = erasedOrNotToReadIndexes[i]; adjustedByteArrayInputsParameter[erasedIdx] = resetBuffer(bytesArrayBuffers[bufferIdx++], 0, dataLen); adjustedInputOffsets[erasedIdx] = 0; // Always 0 for such temp input } // Prepare for adjustedOutputsParameter for (int i = 0; i adjustedByteArrayOutputsParameter.length; i++) { adjustedByteArrayOutputsParameter[i] = resetBuffer(bytesArrayBuffers[bufferIdx++], 0, dataLen); adjustedOutputOffsets[i] = 0; // Always 0 for such temp output } for (int outputIdx = 0, i = 0; i erasedIndexes.length; i++, outputIdx++) { for (int j = 0; j erasedOrNotToReadIndexes.length; j++) { // If this index is one requested by the caller via erasedIndexes, then // we use the passed output buffer to avoid copying data thereafter. if (erasedIndexes[i] == erasedOrNotToReadIndexes[j]) { adjustedByteArrayOutputsParameter[j] = resetBuffer(outputs[outputIdx], 0, dataLen); adjustedOutputOffsets[j] = outputOffsets[outputIdx]; }
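The {{erasedNum}} suggestion above can be sketched as follows. This is a hypothetical standalone version of the helper under review (the real method lives in AbstractRawErasureDecoder and its final signature may differ): passing the known erased count lets the index array be allocated at its exact size, with no trailing copy.

```java
import java.util.Arrays;

public class ErasedIndexes {
    // Hypothetical sketch: a null input slot marks an erased or not-to-read
    // unit; erasedNum lets us size the result exactly up front.
    static int[] getErasedOrNotToReadIndexes(Object[] inputs, int erasedNum) {
        int[] invalidIndexes = new int[erasedNum]; // exact size, no copy later
        int idx = 0;
        for (int i = 0; i < inputs.length && idx < erasedNum; i++) {
            if (inputs[i] == null) {
                invalidIndexes[idx++] = i;
            }
        }
        return invalidIndexes;
    }

    public static void main(String[] args) {
        Object[] inputs = {"d0", null, "d2", null, "p0"};
        System.out.println(Arrays.toString(getErasedOrNotToReadIndexes(inputs, 2)));
    }
}
```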
[jira] [Commented] (HADOOP-11938) Fix ByteBuffer version encode/decode API of raw erasure coder
[ https://issues.apache.org/jira/browse/HADOOP-11938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545154#comment-14545154 ] Yi Liu commented on HADOOP-11938: - Looks good now; one nit, +1 after addressing. In TestRawCoderBase.java
{code}
Assert.fail("Encoding test with bad input passed");
{code}
We should write "Encoding test with bad input should fail"; you wrote the opposite. Same for a few other Assert.fail messages. Furthermore, we need to fix the Jenkins warnings (release audit/checkstyle/whitespace) if they are related to this patch.

Fix ByteBuffer version encode/decode API of raw erasure coder - Key: HADOOP-11938 URL: https://issues.apache.org/jira/browse/HADOOP-11938 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HADOOP-11938-HDFS-7285-v1.patch, HADOOP-11938-HDFS-7285-v2.patch, HADOOP-11938-HDFS-7285-v3.patch, HADOOP-11938-HDFS-7285-workaround.patch

While investigating a test failure in {{TestRecoverStripedFile}}, one issue was found in the raw erasure coder, caused by an optimization in the code below. It assumes the heap buffer backed by the bytes array available for reading or writing always starts at zero and takes the whole space.
{code}
protected static byte[][] toArrays(ByteBuffer[] buffers) {
  byte[][] bytesArr = new byte[buffers.length][];

  ByteBuffer buffer;
  for (int i = 0; i < buffers.length; i++) {
    buffer = buffers[i];
    if (buffer == null) {
      bytesArr[i] = null;
      continue;
    }

    if (buffer.hasArray()) {
      bytesArr[i] = buffer.array();
    } else {
      throw new IllegalArgumentException("Invalid ByteBuffer passed, " +
          "expecting heap buffer");
    }
  }

  return bytesArr;
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
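The flawed assumption behind the {{toArrays}} optimization can be reproduced with a sliced heap buffer: {{array()}} exposes the whole backing array, not just the buffer's window. A minimal self-contained illustration (not code from the patch):

```java
import java.nio.ByteBuffer;

public class ToArraysPitfall {
    public static void main(String[] args) {
        byte[] backing = new byte[8];
        ByteBuffer whole = ByteBuffer.wrap(backing);
        whole.position(4);                // window is bytes 4..7
        ByteBuffer slice = whole.slice();

        // array() still returns the full backing array; arrayOffset() is the
        // part that treating the array as starting at zero silently ignores.
        System.out.println(slice.array().length); // 8, not 4
        System.out.println(slice.arrayOffset());  // 4
    }
}
```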
[jira] [Commented] (HADOOP-11938) Fix ByteBuffer version encode/decode API of raw erasure coder
[ https://issues.apache.org/jira/browse/HADOOP-11938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543076#comment-14543076 ] Yi Liu commented on HADOOP-11938: - Looks better than the original. Some lines are longer than 80 chars.

*In AbstractRawErasureCoder.java*
{code}
+for (int i = pos; i < buffer.remaining(); ++i) {
+  buffer.put(i, (byte) 0);
 }
{code}
It should be {{buffer.limit()}} instead of remaining. And we can just use {{buffer.put((byte)0)}}.
{code}
@return the buffer itself, with ZERO bytes written, remaining the original
 * position
{code}
Instead of "remaining the original position", maybe "the position and limit are not changed after the call" is clearer.
Use {{HadoopIllegalArgumentException}} instead of {{IllegalArgumentException}}.

*in XORRawDecoder.java and XORRawEncoder.java*
{code}
inputs[i].position() + inputs[0].remaining()
{code}
Just use {{inputs\[i\].limit()}}.

*in RSRawDecoder.java and RSRawEncoder.java*
{code}
+int dataLen = inputs[0].remaining();
{code}
Is it necessary? I think we don't need to pass {{dataLen}} to {{RSUtil.GF.solveVandermondeSystem}}.

*in GaloisField.java*
{code}
public void solveVandermondeSystem(int[] x, ByteBuffer[] y, int len, int dataLen) {
{code}
As in the previous comment, {{dataLen}} is unnecessary, so {{idx1 < p.position() + dataLen}} can be {{idx1 < p.limit()}}.
{code}
 public void substitute(ByteBuffer[] p, ByteBuffer q, int x) {
-  int y = 1;
+  int y = 1, iIdx, oIdx;
+  int len = p[0].remaining();
   for (int i = 0; i < p.length; i++) {
     ByteBuffer pi = p[i];
-    int len = pi.remaining();
-    for (int j = 0; j < len; j++) {
-      int pij = pi.get(j) & 0x00FF;
-      q.put(j, (byte) (q.get(j) ^ mulTable[pij][y]));
+    for (iIdx = pi.position(), oIdx = q.position();
+        iIdx < pi.position() + len; iIdx++, oIdx++) {
+      int pij = pi.get(iIdx) & 0x00FF;
+      q.put(oIdx, (byte) (q.get(oIdx) ^ mulTable[pij][y]));
     }
     y = mulTable[x][y];
   }
{code}
{{len}} is unnecessary. Same for
{code}
public void remainder(ByteBuffer[] dividend, int len, int[] divisor) {
{code}

*in TestCoderBase.java*
{code}
+ private byte[] zeroChunkBytes;
..
 protected void eraseDataFromChunk(ECChunk chunk) {
   ByteBuffer chunkBuffer = chunk.getBuffer();
-  // erase the data
-  chunkBuffer.position(0);
-  for (int i = 0; i < chunkSize; i++) {
-    chunkBuffer.put((byte) 0);
-  }
+  // erase the data at the position, and restore the buffer ready for reading
+  // chunkSize bytes but all ZERO.
+  int pos = chunkBuffer.position();
+  chunkBuffer.flip();
+  chunkBuffer.position(pos);
+  chunkBuffer.limit(pos + chunkSize);
+  chunkBuffer.put(zeroChunkBytes);
   chunkBuffer.flip();
+  chunkBuffer.position(pos);
+  chunkBuffer.limit(pos + chunkSize);
{code}
{code}
- protected static ECChunk cloneChunkWithData(ECChunk chunk) {
+ protected ECChunk cloneChunkWithData(ECChunk chunk) {
   ByteBuffer srcBuffer = chunk.getBuffer();
-  ByteBuffer destBuffer;
+  ByteBuffer destBuffer = allocateOutputChunkBuffer();

-  byte[] bytesArr = new byte[srcBuffer.remaining()];
+  byte[] bytesArr = new byte[chunkSize];
   srcBuffer.mark();
   srcBuffer.get(bytesArr);
   srcBuffer.reset();

-  if (srcBuffer.hasArray()) {
-    destBuffer = ByteBuffer.wrap(bytesArr);
-  } else {
-    destBuffer = ByteBuffer.allocateDirect(srcBuffer.remaining());
-    destBuffer.put(bytesArr);
-    destBuffer.flip();
-  }
+  int pos = destBuffer.position();
+  destBuffer.put(bytesArr);
+  destBuffer.flip();
+  destBuffer.position(pos);
{code}
{{destBuffer}} is still assumed to be chunkSize. Furthermore, there are some unnecessary flips.
{code}
+ /**
+  * Convert an array of chunks to an array of byte arrays.
+  * Note the chunk buffers are not affected.
+  * @param chunks
+  * @return an array of byte array
+  */
+ protected byte[][] toArrays(ECChunk[] chunks) {
+   byte[][] bytesArr = new byte[chunks.length][];
+
+   ByteBuffer buffer;
+   for (int i = 0; i < chunks.length; i++) {
+     buffer = chunks[i].getBuffer();
+     if (buffer.hasArray() && buffer.position() == 0
+         && buffer.remaining() == chunkSize) {
+       bytesArr[i] = buffer.array();
+     } else {
+       bytesArr[i] = new byte[buffer.remaining()];
+       // Avoid affecting the original one
+       buffer.mark();
+       buffer.get(bytesArr[i]);
+       buffer.reset();
+     }
+   }
+
+   return bytesArr;
+ }
{code}
We already have this method, use {{ECChunk.toBuffers}}? Is converting to ByteBuffer enough? If not, should we have this method in the main code, not only in tests?

*In TestRawCoderBase.java*
You should add more description about your tests, for example what the negative test is for and how you test it; you also need to find a good name for it. {code} protected
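The remaining()-vs-limit() nit above matters because the absolute {{put(i, b)}} indexes from zero while {{remaining()}} is measured from the current position. A minimal standalone illustration of the buggy shape and the suggested fix (not patch code; method names are invented for the sketch):

```java
import java.nio.ByteBuffer;

public class ZeroFillBounds {
    // Buggy shape from the review: absolute index bounded by remaining(),
    // which shrinks as position grows, so the loop under-runs or never runs.
    static void zeroFillWrong(ByteBuffer buffer) {
        for (int i = buffer.position(); i < buffer.remaining(); ++i) {
            buffer.put(i, (byte) 0);
        }
    }

    // Suggested fix: absolute indices must run up to limit().
    static void zeroFill(ByteBuffer buffer) {
        for (int i = buffer.position(); i < buffer.limit(); ++i) {
            buffer.put(i, (byte) 0);
        }
    }

    public static void main(String[] args) {
        ByteBuffer b = ByteBuffer.wrap(new byte[]{1, 1, 1, 1, 1, 1});
        b.position(4);            // remaining() == 2, limit() == 6
        zeroFillWrong(b);         // loop body never executes: 4 >= 2
        System.out.println(b.get(5)); // still 1
        zeroFill(b);              // zeroes indices 4 and 5
        System.out.println(b.get(5)); // now 0
    }
}
```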
[jira] [Commented] (HADOOP-11847) Enhance raw coder allowing to read least required inputs in decoding
[ https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543105#comment-14543105 ] Yi Liu commented on HADOOP-11847: - Let's wait for HADOOP-11938, then come back to this one. Enhance raw coder allowing to read least required inputs in decoding Key: HADOOP-11847 URL: https://issues.apache.org/jira/browse/HADOOP-11847 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Labels: BB2015-05-TBR Attachments: HADOOP-11847-HDFS-7285-v3.patch, HADOOP-11847-HDFS-7285-v4.patch, HADOOP-11847-v1.patch, HADOOP-11847-v2.patch, HADOOP-11847-v5.patch This is to enhance the raw erasure coder to allow reading only the least required inputs while decoding. It will also refine and document the relevant APIs for better understanding and usage. When using the least required inputs, it may add computing overhead but will possibly outperform overall, since less network traffic and disk IO are involved. This is something planned to do but just got reminded by [~zhz]'s question raised in HDFS-7678, also copied here: bq.Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2 is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should I construct the inputs to RawErasureDecoder#decode? With this work, hopefully the answer to the above question will be obvious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11938) Fix ByteBuffer version encode/decode API of raw erasure coder
[ https://issues.apache.org/jira/browse/HADOOP-11938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543078#comment-14543078 ] Yi Liu commented on HADOOP-11938: - One more comment: add more javadoc for the XOR coder, so that others can understand it better.

Fix ByteBuffer version encode/decode API of raw erasure coder - Key: HADOOP-11938 URL: https://issues.apache.org/jira/browse/HADOOP-11938 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HADOOP-11938-HDFS-7285-v1.patch, HADOOP-11938-HDFS-7285-v2.patch, HADOOP-11938-HDFS-7285-workaround.patch

While investigating a test failure in {{TestRecoverStripedFile}}, one issue was found in the raw erasure coder, caused by an optimization in the code below. It assumes the heap buffer backed by the bytes array available for reading or writing always starts at zero and takes the whole space.
{code}
protected static byte[][] toArrays(ByteBuffer[] buffers) {
  byte[][] bytesArr = new byte[buffers.length][];

  ByteBuffer buffer;
  for (int i = 0; i < buffers.length; i++) {
    buffer = buffers[i];
    if (buffer == null) {
      bytesArr[i] = null;
      continue;
    }

    if (buffer.hasArray()) {
      bytesArr[i] = buffer.array();
    } else {
      throw new IllegalArgumentException("Invalid ByteBuffer passed, " +
          "expecting heap buffer");
    }
  }

  return bytesArr;
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HADOOP-11908) Erasure coding: Should be able to encode part of parity blocks.
[ https://issues.apache.org/jira/browse/HADOOP-11908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu resolved HADOOP-11908. - Resolution: Duplicate Erasure coding: Should be able to encode part of parity blocks. --- Key: HADOOP-11908 URL: https://issues.apache.org/jira/browse/HADOOP-11908 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Yi Liu Assignee: Kai Zheng {code} public void encode(ByteBuffer[] inputs, ByteBuffer[] outputs); {code} Currently when we do encode, the outputs are all parity blocks; we should be able to encode only part of the parity blocks. This is required when we do datanode striped block recovery: if one or more parity blocks are missed, we need to encode to recover them. Encoding only part of the parity blocks should be more efficient than encoding all of them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HADOOP-11961) Add isLinear interface to Erasure coder
Yi Liu created HADOOP-11961: --- Summary: Add isLinear interface to Erasure coder Key: HADOOP-11961 URL: https://issues.apache.org/jira/browse/HADOOP-11961 Project: Hadoop Common Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Today we had a discussion including [~zhz], [~drankye], etc., also discussed in HDFS-8347. Some coders like {{RS}} and {{XOR}} are linear; some have a coding boundary, like Hitchhiker. If the coder is linear, we can decode at any size and don't need to pad the inputs to the *chunksize*; if the coder is not linear, the inputs need to be padded to the *chunksize* before decoding. This interface is important for performance, and can save memory/disk space since the parity cells are the same size as the first data cell (less than the codec chunksize). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
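The linearity property described above can be seen with XOR, which is linear: recovering only a prefix of an erased cell needs only the same prefix of the surviving cells, with no padding to chunk size. An illustrative standalone sketch (not Hadoop code):

```java
public class LinearPrefixDecode {
    // XOR of two cells over the first len bytes; for a linear code this is
    // both the encode and the single-erasure decode operation.
    static byte[] xorDecode(byte[] a, byte[] b, int len) {
        byte[] out = new byte[len]; // decode at any size <= cell length
        for (int i = 0; i < len; i++) {
            out[i] = (byte) (a[i] ^ b[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] d0 = {1, 2, 3, 4};
        byte[] d1 = {5, 6, 7, 8};
        byte[] parity = xorDecode(d0, d1, 4);   // full-cell encode
        // Recover just the first two bytes of d1 from two-byte prefixes:
        // no padding of the inputs to the full chunk size is needed.
        byte[] prefix = xorDecode(d0, parity, 2);
        System.out.println(java.util.Arrays.toString(prefix));
    }
}
```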
[jira] [Updated] (HADOOP-11961) Add interface of whether codec has chunk boundary to Erasure coder
[ https://issues.apache.org/jira/browse/HADOOP-11961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-11961: Description: (was: Today, we have a discussion including [~zhz], [~drankye], etc., also discuss in HDFS-8347. Some coder like {{RS}} and {{XOR}} is linear, some have coding boundary like HitchHicker. If the coder is linear, we can decode at any size, and we don't need to padding inputs to *chunksize*, if the coder is not linear, the inputs need to padding to *chunksize*, then do decode. This interface is important for performance, and can save memory/disk space since the parity cells are the same as first data cell (less than codec chunksize). ) Assignee: (was: Yi Liu) Summary: Add interface of whether codec has chunk boundary to Erasure coder (was: Add isLinear interface to Erasure coder) Add interface of whether codec has chunk boundary to Erasure coder -- Key: HADOOP-11961 URL: https://issues.apache.org/jira/browse/HADOOP-11961 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Yi Liu -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11938) Fix ByteBuffer version encode/decode API of raw erasure coder
[ https://issues.apache.org/jira/browse/HADOOP-11938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14539776#comment-14539776 ] Yi Liu commented on HADOOP-11938: - OK, I see. Thanks for updating the patch; I will give comments on your updated patch tomorrow.

Fix ByteBuffer version encode/decode API of raw erasure coder - Key: HADOOP-11938 URL: https://issues.apache.org/jira/browse/HADOOP-11938 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HADOOP-11938-HDFS-7285-v1.patch, HADOOP-11938-HDFS-7285-v2.patch, HADOOP-11938-HDFS-7285-workaround.patch

While investigating a test failure in {{TestRecoverStripedFile}}, one issue was found in the raw erasure coder, caused by an optimization in the code below. It assumes the heap buffer backed by the bytes array available for reading or writing always starts at zero and takes the whole space.
{code}
protected static byte[][] toArrays(ByteBuffer[] buffers) {
  byte[][] bytesArr = new byte[buffers.length][];

  ByteBuffer buffer;
  for (int i = 0; i < buffers.length; i++) {
    buffer = buffers[i];
    if (buffer == null) {
      bytesArr[i] = null;
      continue;
    }

    if (buffer.hasArray()) {
      bytesArr[i] = buffer.array();
    } else {
      throw new IllegalArgumentException("Invalid ByteBuffer passed, " +
          "expecting heap buffer");
    }
  }

  return bytesArr;
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HADOOP-11961) Add interface of whether codec has chunk boundary to Erasure coder
[ https://issues.apache.org/jira/browse/HADOOP-11961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu resolved HADOOP-11961. - Resolution: Invalid Add interface of whether codec has chunk boundary to Erasure coder -- Key: HADOOP-11961 URL: https://issues.apache.org/jira/browse/HADOOP-11961 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Yi Liu -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11938) Fix ByteBuffer version encode/decode API of raw erasure coder
[ https://issues.apache.org/jira/browse/HADOOP-11938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14539753#comment-14539753 ] Yi Liu commented on HADOOP-11938: -
{quote}
Yes. XOR coder can only recover one erasure of unit
{quote}
Interesting. If so, why do we need the XOR coder, and what's it used for? We should remove the XOR coder; then there is no need to maintain the code anymore.

Fix ByteBuffer version encode/decode API of raw erasure coder - Key: HADOOP-11938 URL: https://issues.apache.org/jira/browse/HADOOP-11938 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HADOOP-11938-HDFS-7285-v1.patch, HADOOP-11938-HDFS-7285-v2.patch, HADOOP-11938-HDFS-7285-workaround.patch

While investigating a test failure in {{TestRecoverStripedFile}}, one issue was found in the raw erasure coder, caused by an optimization in the code below. It assumes the heap buffer backed by the bytes array available for reading or writing always starts at zero and takes the whole space.
{code}
protected static byte[][] toArrays(ByteBuffer[] buffers) {
  byte[][] bytesArr = new byte[buffers.length][];

  ByteBuffer buffer;
  for (int i = 0; i < buffers.length; i++) {
    buffer = buffers[i];
    if (buffer == null) {
      bytesArr[i] = null;
      continue;
    }

    if (buffer.hasArray()) {
      bytesArr[i] = buffer.array();
    } else {
      throw new IllegalArgumentException("Invalid ByteBuffer passed, " +
          "expecting heap buffer");
    }
  }

  return bytesArr;
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11938) Fix ByteBuffer version encode/decode API of raw erasure coder
[ https://issues.apache.org/jira/browse/HADOOP-11938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14537713#comment-14537713 ] Yi Liu commented on HADOOP-11938: - Some comments:

In *AbstractRawErasureCoder.java*
{code}
+ protected ByteBuffer resetOutputBuffer(ByteBuffer buffer) {
+   int pos = buffer.position();
+   buffer.put(zeroChunkBytes);
+   buffer.position(pos);
+
+   return buffer;
 }
{code}
The length of zeroChunkBytes could be larger than buffer.remaining(); just use a *for* loop to put 0 into the buffer.
{code}
protected ByteBuffer resetInputBuffer(ByteBuffer buffer) {
{code}
What's the reason we need to reset the input buffer? Input buffers are given by the caller.

In AbstractRawErasureDecoder.java
{code}
boolean usingDirectBuffer = !inputs[0].hasArray();
{code}
use {{inputs\[0\].isDirect();}}
{code}
@Override
public void decode(ByteBuffer[] inputs, int[] erasedIndexes, ByteBuffer[] outputs)

@Override
public void decode(byte[][] inputs, int[] erasedIndexes, byte[][] outputs) {
{code}
We should do the following checks:
1. All the inputs have the same length (besides some inputs being null), and the outputs have enough space.
2. {{erasedIndexes}} matches the {{null}} positions of the inputs.
We should also enhance the description in RawErasureDecoder#decode to describe more about decode/reconstruct.
{code}
+for (int i = 0; i < outputs.length; ++i) {
+  buffer = outputs[i];
+  // to be ready for reading dataLen bytes
+  buffer.flip();
+  buffer.position(outputOffsets[i]);
+  buffer.limit(outputOffsets[i] + dataLen);
}
{code}
This is unnecessary; remove it.

In *AbstractRawErasureEncoder.java*
{code}
boolean usingDirectBuffer = !inputs[0].hasArray();
{code}
use {{inputs\[0\].isDirect();}} All other comments are the same as in *AbstractRawErasureDecoder*.

in *RSRawDecoder.java*
{code}
assert (getNumDataUnits() + getNumParityUnits() < RSUtil.GF.getFieldSize());

this.errSignature = new int[getNumParityUnits()];
this.primitivePower = RSUtil.getPrimitivePower(getNumDataUnits(),
    getNumParityUnits());
{code}
Why not use {{numDataUnits}} and {{numParityUnits}} directly?

In *RSRawEncoder.java*
In {{initialize}}, use {{numDataUnits}} and {{numParityUnits}} directly.

Fix ByteBuffer version encode/decode API of raw erasure coder - Key: HADOOP-11938 URL: https://issues.apache.org/jira/browse/HADOOP-11938 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HADOOP-11938-HDFS-7285-v1.patch, HADOOP-11938-HDFS-7285-workaround.patch

While investigating a test failure in {{TestRecoverStripedFile}}, one issue was found in the raw erasure coder, caused by an optimization in the code below. It assumes the heap buffer backed by the bytes array available for reading or writing always starts at zero and takes the whole space.
{code}
protected static byte[][] toArrays(ByteBuffer[] buffers) {
  byte[][] bytesArr = new byte[buffers.length][];

  ByteBuffer buffer;
  for (int i = 0; i < buffers.length; i++) {
    buffer = buffers[i];
    if (buffer == null) {
      bytesArr[i] = null;
      continue;
    }

    if (buffer.hasArray()) {
      bytesArr[i] = buffer.array();
    } else {
      throw new IllegalArgumentException("Invalid ByteBuffer passed, " +
          "expecting heap buffer");
    }
  }

  return bytesArr;
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
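The zeroChunkBytes concern above can be avoided with a plain loop over the buffer's window, which is what the review suggests. A self-contained sketch using the review's method name (a sketch, not the committed Hadoop code):

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class ResetOutput {
    // Zero only [position, limit), leaving position untouched, regardless of
    // how the buffer's window compares to any preallocated zero array.
    static ByteBuffer resetOutputBuffer(ByteBuffer buffer) {
        for (int i = buffer.position(); i < buffer.limit(); ++i) {
            buffer.put(i, (byte) 0); // absolute put: position is unchanged
        }
        return buffer;
    }

    public static void main(String[] args) {
        ByteBuffer b = ByteBuffer.wrap(new byte[]{9, 9, 9, 9});
        b.position(1).limit(3);      // window covers indices 1..2 only
        resetOutputBuffer(b);
        System.out.println(Arrays.toString(b.array())); // [9, 0, 0, 9]
    }
}
```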
[jira] [Commented] (HADOOP-11938) Fix ByteBuffer version encode/decode API of raw erasure coder
[ https://issues.apache.org/jira/browse/HADOOP-11938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14537714#comment-14537714 ] Yi Liu commented on HADOOP-11938: - Will post more later.

Fix ByteBuffer version encode/decode API of raw erasure coder - Key: HADOOP-11938 URL: https://issues.apache.org/jira/browse/HADOOP-11938 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HADOOP-11938-HDFS-7285-v1.patch, HADOOP-11938-HDFS-7285-workaround.patch

While investigating a test failure in {{TestRecoverStripedFile}}, one issue was found in the raw erasure coder, caused by an optimization in the code below. It assumes the heap buffer backed by the bytes array available for reading or writing always starts at zero and takes the whole space.
{code}
protected static byte[][] toArrays(ByteBuffer[] buffers) {
  byte[][] bytesArr = new byte[buffers.length][];

  ByteBuffer buffer;
  for (int i = 0; i < buffers.length; i++) {
    buffer = buffers[i];
    if (buffer == null) {
      bytesArr[i] = null;
      continue;
    }

    if (buffer.hasArray()) {
      bytesArr[i] = buffer.array();
    } else {
      throw new IllegalArgumentException("Invalid ByteBuffer passed, " +
          "expecting heap buffer");
    }
  }

  return bytesArr;
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11938) Fix ByteBuffer version encode/decode API of raw erasure coder
[ https://issues.apache.org/jira/browse/HADOOP-11938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14539254#comment-14539254 ] Yi Liu commented on HADOOP-11938: - In *XORRawDecoder.java*
{code}
int dataLen = inputs[0].remaining();
int erasedIdx = erasedIndexes[0];

// Process the inputs.
int iPos, oPos, iIdx, oIdx;
oPos = output.position();
for (int i = 0; i < inputs.length; i++) {
  // Skip the erased location.
  if (i == erasedIdx) {
    continue;
  }

  iPos = inputs[i].position();
  for (iIdx = iPos, oIdx = oPos; iIdx < iPos + dataLen; iIdx++, oIdx++) {
    output.put(oIdx, (byte) (output.get(oIdx) ^ inputs[i].get(iIdx)));
  }
}
{code}
{{dataLen/iPos/oPos}} are not necessary; we can use buffer.limit() and position() instead.
{code}
@Override
protected void doDecode(ByteBuffer[] inputs, int[] erasedIndexes,
    ByteBuffer[] outputs) {
  ByteBuffer output = outputs[0];
  resetOutputBuffer(output);

  int dataLen = inputs[0].remaining();
  int erasedIdx = erasedIndexes[0];

  // Process the inputs.
  int iPos, oPos, iIdx, oIdx;
  oPos = output.position();
  for (int i = 0; i < inputs.length; i++) {
    // Skip the erased location.
    if (i == erasedIdx) {
      continue;
    }

    iPos = inputs[i].position();
    for (iIdx = iPos, oIdx = oPos; iIdx < iPos + dataLen; iIdx++, oIdx++) {
      output.put(oIdx, (byte) (output.get(oIdx) ^ inputs[i].get(iIdx)));
    }
  }
}
{code}
I wonder whether this works; I see it only decodes *output\[0\]*. Did we ever test this? If not, we should add more tests in this patch.

In *XORRawEncoder.java*
same comments as in XORRawDecoder

In *GaloisField.java*
{code}
+ByteBuffer p, prev, after;
+int pos1, idx1, pos2, idx2;
{code}
Besides {{idx1/idx2}}, the others are unnecessary; removing them makes the code clearer. Same in some other places: most of these variables are unnecessary, and if they are only used once or twice, we don't need to declare a separate variable.

*For the tests, I want to see more tests:*
1) The length of inputs/outputs is not equal to chunksize, and we can still decode.
2) Some negative tests, where we catch the expected exception.

Fix ByteBuffer version encode/decode API of raw erasure coder - Key: HADOOP-11938 URL: https://issues.apache.org/jira/browse/HADOOP-11938 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HADOOP-11938-HDFS-7285-v1.patch, HADOOP-11938-HDFS-7285-workaround.patch

While investigating a test failure in {{TestRecoverStripedFile}}, one issue was found in the raw erasure coder, caused by an optimization in the code below. It assumes the heap buffer backed by the bytes array available for reading or writing always starts at zero and takes the whole space.
{code}
protected static byte[][] toArrays(ByteBuffer[] buffers) {
  byte[][] bytesArr = new byte[buffers.length][];

  ByteBuffer buffer;
  for (int i = 0; i < buffers.length; i++) {
    buffer = buffers[i];
    if (buffer == null) {
      bytesArr[i] = null;
      continue;
    }

    if (buffer.hasArray()) {
      bytesArr[i] = buffer.array();
    } else {
      throw new IllegalArgumentException("Invalid ByteBuffer passed, " +
          "expecting heap buffer");
    }
  }

  return bytesArr;
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
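For reference, XOR recovery of a single erased unit is just the XOR of all surviving units, which is why the decoder above writes only output[0]. A self-contained byte-array sketch (not the ByteBuffer code under review):

```java
public class XorRecover {
    // Recover the single erased unit: XOR together all surviving inputs.
    // The slot at erasedIdx is expected to be null and is skipped.
    static byte[] decode(byte[][] inputs, int erasedIdx) {
        byte[] output = new byte[inputs[erasedIdx == 0 ? 1 : 0].length];
        for (int i = 0; i < inputs.length; i++) {
            if (i == erasedIdx) {
                continue; // skip the erased location
            }
            for (int j = 0; j < output.length; j++) {
                output[j] ^= inputs[i][j];
            }
        }
        return output;
    }

    public static void main(String[] args) {
        byte[] d0 = {1, 2, 3}, d1 = {4, 5, 6};
        // Encoding the parity is the same XOR with the parity slot "erased".
        byte[] parity = decode(new byte[][]{d0, d1, null}, 2);
        // Now "lose" d1 and recover it from d0 and the parity.
        byte[] recovered = decode(new byte[][]{d0, null, parity}, 1);
        System.out.println(java.util.Arrays.toString(recovered)); // [4, 5, 6]
    }
}
```

This also makes the limitation discussed elsewhere in the thread concrete: with a single parity unit, only one erasure can be recovered.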
[jira] [Commented] (HADOOP-11938) Fix ByteBuffer version encode/decode API of raw erasure coder
[ https://issues.apache.org/jira/browse/HADOOP-11938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533954#comment-14533954 ] Yi Liu commented on HADOOP-11938: - Thanks Kai for the catch.

Fix ByteBuffer version encode/decode API of raw erasure coder - Key: HADOOP-11938 URL: https://issues.apache.org/jira/browse/HADOOP-11938 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng

While investigating a test failure in {{TestRecoverStripedFile}}, one issue was found in the raw erasure coder: a bad optimization in the code below. It assumes the heap buffer backed by the bytes array available for reading or writing always starts at zero and takes the whole space.
{code}
protected static byte[][] toArrays(ByteBuffer[] buffers) {
  byte[][] bytesArr = new byte[buffers.length][];

  ByteBuffer buffer;
  for (int i = 0; i < buffers.length; i++) {
    buffer = buffers[i];
    if (buffer == null) {
      bytesArr[i] = null;
      continue;
    }

    if (buffer.hasArray()) {
      bytesArr[i] = buffer.array();
    } else {
      throw new IllegalArgumentException("Invalid ByteBuffer passed, " +
          "expecting heap buffer");
    }
  }

  return bytesArr;
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11920) Refactor some codes for erasure coders
[ https://issues.apache.org/jira/browse/HADOOP-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531892#comment-14531892 ] Yi Liu commented on HADOOP-11920: - Yes, the caller can pass a direct buffer, but can also use a Java heap byte buffer. Why should it be DirectByteBuffer? In RawErasureEncoder, encode declares that it accepts {{ByteBuffer}}:
{code}
public void encode(ByteBuffer[] inputs, ByteBuffer[] outputs);
{code}
If we want to accept only direct ByteBuffers in {{XORRawEncoder#doEncode}}, we should check that each buffer is a direct buffer.

Refactor some codes for erasure coders -- Key: HADOOP-11920 URL: https://issues.apache.org/jira/browse/HADOOP-11920 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HADOOP-11920-HDFS-7285-02.patch, HADOOP-11920-HDFS-7285-v4.patch, HADOOP-11920-v1.patch, HADOOP-11920-v2.patch, HADOOP-11920-v3.patch While working on native erasure coders and also HADOOP-11847, it was found that in some cases it is better to refine the code a little. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
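A defensive check of the kind suggested above could look like this. The helper name is hypothetical, not from the patch; the point is to fail fast on heap buffers if an implementation genuinely only supports direct ones.

```java
import java.nio.ByteBuffer;

public class BufferChecks {
    // If an implementation only supports direct buffers, reject heap buffers
    // explicitly instead of misbehaving on input the interface permits.
    static void ensureDirect(ByteBuffer[] buffers) {
        for (ByteBuffer b : buffers) {
            if (b != null && !b.isDirect()) {
                throw new IllegalArgumentException("Direct ByteBuffer expected");
            }
        }
    }

    public static void main(String[] args) {
        ensureDirect(new ByteBuffer[]{ByteBuffer.allocateDirect(16), null});
        System.out.println("direct buffers accepted");
        try {
            ensureDirect(new ByteBuffer[]{ByteBuffer.allocate(16)});
        } catch (IllegalArgumentException e) {
            System.out.println("heap buffer rejected");
        }
    }
}
```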
[jira] [Commented] (HADOOP-11920) Refactor some codes for erasure coders
[ https://issues.apache.org/jira/browse/HADOOP-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531858#comment-14531858 ] Yi Liu commented on HADOOP-11920: - Thanks Kai for the patch.
{quote}
resetDirectBuffer(outputs[0]);
{quote}
The name should be resetBuffer, since it could be a Java heap buffer.
For the Jenkins issues, can you run the related test cases locally? If they pass, I think it's OK to go.

Refactor some codes for erasure coders -- Key: HADOOP-11920 URL: https://issues.apache.org/jira/browse/HADOOP-11920 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HADOOP-11920-HDFS-7285-02.patch, HADOOP-11920-HDFS-7285-v4.patch, HADOOP-11920-v1.patch, HADOOP-11920-v2.patch, HADOOP-11920-v3.patch While working on native erasure coders and also HADOOP-11847, it was found that in some cases it is better to refine the code a little. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11920) Refactor some codes for erasure coders
[ https://issues.apache.org/jira/browse/HADOOP-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531994#comment-14531994 ] Yi Liu commented on HADOOP-11920: - +1 , thanks Kai. Refactor some codes for erasure coders -- Key: HADOOP-11920 URL: https://issues.apache.org/jira/browse/HADOOP-11920 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HADOOP-11920-HDFS-7285-02.patch, HADOOP-11920-HDFS-7285-v4.patch, HADOOP-11920-HDFS-7285-v5.patch, HADOOP-11920-v1.patch, HADOOP-11920-v2.patch, HADOOP-11920-v3.patch While working on native erasure coders and also HADOOP-11847, it was found in some chances better to refine a little bit of codes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11847) Enhance raw coder allowing to read least required inputs in decoding
[ https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526741#comment-14526741 ] Yi Liu commented on HADOOP-11847: - Hi Zhe, for striped block recovery, there are several situations: 1) only parity blocks missed 2) only data blocks missed 3) both parity and data blocks missed. Before this patch commit, In HDFS-7348, for #1, I use encode as workaround, but it will encode all parity blocks. For #2, I found decode only works for data blocks, and the erasureIndices needs some special handle, see the decode test, so in HDFS-7348, in the test I made parityBlkNum of data blocks missed, then it works, but we need to have full inputs and allocate more buffers. For #3, it doesn't work and there is no test. So if without this fix, in HDFS-7348, HDFS-7678, the decode is just workaround and we still need to update after this patch. Even the decode interface is the same, but there is different requirements for the input parameters, so the code logic will be different. Should we review and push this patch as soon as possible? It's a block issue. Ideally for {{decode}}, the input should be: 1) minimal input blocks (may include data or parity blocks), 2) Indices of input blocks, or some way to let decode function know, 3) output is blocks to be recovered (one or more), 4) Indices of output blocks. Enhance raw coder allowing to read least required inputs in decoding Key: HADOOP-11847 URL: https://issues.apache.org/jira/browse/HADOOP-11847 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HADOOP-11847-HDFS-7285-v3.patch, HADOOP-11847-HDFS-7285-v4.patch, HADOOP-11847-v1.patch, HADOOP-11847-v2.patch This is to enhance raw erasure coder to allow only reading least required inputs while decoding. It will also refine and document the relevant APIs for better understanding and usage. 
When using least required inputs, it may add computing overhead but will possibly outperform overall, since less network traffic and disk IO are involved. This is something already planned, but the reminder came from [~zhz]'s question raised in HDFS-7678, also copied here: bq. Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2 is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should I construct the inputs to RawErasureDecoder#decode? With this work, hopefully the answer to the above question would be obvious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
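The ideal {{decode}} contract sketched in the comment above — minimal surviving inputs plus explicit input/output indices — can be illustrated with a toy XOR coder. The interface name and shape here are illustrative assumptions, not Hadoop's actual RawErasureDecoder API:

```java
import java.nio.ByteBuffer;

// Illustrative contract: the caller supplies only the surviving units it
// read, says which unit indices those are, and which erased indices it
// wants reconstructed. (Hypothetical names, not Hadoop's API.)
interface SketchDecoder {
  void decode(ByteBuffer[] inputs, int[] inputIndexes,
              int[] erasedIndexes, ByteBuffer[] outputs);
}

// Toy (2+1) XOR scheme: parity = d0 ^ d1, so any single erased unit is
// the XOR of the two surviving units -- exactly two inputs are required,
// matching the "least required inputs" idea.
class XorSketchDecoder implements SketchDecoder {
  @Override
  public void decode(ByteBuffer[] inputs, int[] inputIndexes,
                     int[] erasedIndexes, ByteBuffer[] outputs) {
    ByteBuffer a = inputs[0].duplicate();  // duplicate() so caller positions survive
    ByteBuffer b = inputs[1].duplicate();
    ByteBuffer out = outputs[0];
    while (a.hasRemaining()) {
      out.put((byte) (a.get() ^ b.get()));
    }
    out.flip();
  }
}
```

With a real Reed-Solomon coder the same shape holds for the (6+3) question quoted above: inputs would be blocks {0, 1, 3, 4, 5, 8}, inputIndexes those six numbers, and erasedIndexes {2}.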
[jira] [Commented] (HADOOP-11887) Introduce Intel ISA-L erasure coding library for the native support
[ https://issues.apache.org/jira/browse/HADOOP-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14527736#comment-14527736 ] Yi Liu commented on HADOOP-11887: - Hi [~cmccabe] and [~andrew.wang], do you have time to review this? I'd appreciate it if you could help, since you are the native-code experts :) Introduce Intel ISA-L erasure coding library for the native support --- Key: HADOOP-11887 URL: https://issues.apache.org/jira/browse/HADOOP-11887 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HADOOP-11887-v1.patch This is to introduce Intel ISA-L erasure coding library for the native support, via dynamic loading mechanism (dynamic module, like *.so in *nix and *.dll on Windows). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11908) Erasure coding: Should be able to encode part of parity blocks.
[ https://issues.apache.org/jira/browse/HADOOP-11908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-11908: Assignee: Kai Zheng (was: Yi Liu) Erasure coding: Should be able to encode part of parity blocks. --- Key: HADOOP-11908 URL: https://issues.apache.org/jira/browse/HADOOP-11908 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Yi Liu Assignee: Kai Zheng {code} public void encode(ByteBuffer[] inputs, ByteBuffer[] outputs); {code} Currently when we do encode, the outputs are all of the parity blocks; we should be able to encode only part of the parity blocks. This is required for datanode striped block recovery: if one or more parity blocks are missing, we need to encode to recover them. Encoding only part of the parity blocks should be more efficient than encoding all of them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HADOOP-11908) Erasure coding: Should be able to encode part of parity blocks.
Yi Liu created HADOOP-11908: --- Summary: Erasure coding: Should be able to encode part of parity blocks. Key: HADOOP-11908 URL: https://issues.apache.org/jira/browse/HADOOP-11908 Project: Hadoop Common Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu {code} public void encode(ByteBuffer[] inputs, ByteBuffer[] outputs); {code} Currently when we do encode, the outputs are all of the parity blocks; we should be able to encode only part of the parity blocks. This is required for datanode striped block recovery: if one or more parity blocks are missing, we need to encode to recover them. Encoding only part of the parity blocks should be more efficient than encoding all of them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
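The "encode only part of the parity blocks" request amounts to an overload that takes the wanted parity indices. A self-contained sketch with a toy two-parity scheme (parity 0 = XOR of all inputs, parity 1 = byte-wise sum mod 256); the method shape is a hypothetical illustration, not Hadoop's RawErasureEncoder:

```java
import java.nio.ByteBuffer;

// Illustrative overload: the caller names which parity units it actually
// needs, and the coder computes only those, skipping the rest.
class PartialParityEncoder {
  void encode(ByteBuffer[] inputs, int[] parityIndexes, ByteBuffer[] outputs) {
    for (int k = 0; k < parityIndexes.length; k++) {
      ByteBuffer out = outputs[k];
      for (int pos = 0; pos < inputs[0].remaining(); pos++) {
        int acc = 0;
        for (ByteBuffer in : inputs) {
          int b = in.get(in.position() + pos) & 0xff; // absolute get, no position change
          // Toy scheme: parity 0 is XOR, parity 1 is sum mod 256.
          acc = (parityIndexes[k] == 0) ? (acc ^ b) : ((acc + b) & 0xff);
        }
        out.put((byte) acc);
      }
      out.flip();
    }
  }
}
```

During recovery where only parity block 1 is missing, the caller passes `parityIndexes = {1}` and allocates a single output buffer, instead of buffers for every parity unit.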
[jira] [Commented] (HADOOP-11887) Introduce Intel ISA-L erasure coding library for the native support
[ https://issues.apache.org/jira/browse/HADOOP-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14520689#comment-14520689 ] Yi Liu commented on HADOOP-11887: - Thanks Kai for the work, I will take a look at it in the following few days. Introduce Intel ISA-L erasure coding library for the native support --- Key: HADOOP-11887 URL: https://issues.apache.org/jira/browse/HADOOP-11887 Project: Hadoop Common Issue Type: Sub-task Components: io Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HADOOP-11887-v1.patch This is to introduce Intel ISA-L erasure coding library for the native support, via dynamic loading mechanism (dynamic module, like *.so in *nix and *.dll on Windows). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11766) Generic token authentication support for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-11766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-11766: Assignee: Kai Zheng Generic token authentication support for Hadoop --- Key: HADOOP-11766 URL: https://issues.apache.org/jira/browse/HADOOP-11766 Project: Hadoop Common Issue Type: New Feature Components: security Reporter: Kai Zheng Assignee: Kai Zheng As a major goal of Rhino project, we proposed *TokenAuth* effort in HADOOP-9392, where it's to provide a common token authentication framework to integrate multiple authentication mechanisms, by adding a new {{AuthenticationMethod}} in lieu of {{KERBEROS}} and {{SIMPLE}}. To minimize the required changes and risk, we thought of another approach to achieve the general goals based on Kerberos as Kerberos itself supports a pre-authentication framework in both spec and implementation, which was discussed in HADOOP-10959 as *TokenPreauth*. In both approaches, we had performed workable prototypes covering both command line console and Hadoop web UI. As HADOOP-9392 is rather lengthy and heavy, HADOOP-10959 is mostly focused on the concrete implementation approach based on Kerberos, we open this for more general and updated discussions about requirement, use cases, and concerns for the generic token authentication support for Hadoop. We distinguish this token from existing Hadoop tokens as the token in this discussion is majorly for the initial and primary authentication. We will refine our existing codes in HADOOP-9392 and HADOOP-10959, break them down into smaller patches based on latest trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11766) Generic token authentication support for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-11766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497640#comment-14497640 ] Yi Liu commented on HADOOP-11766: - Hi Kai, could you upload a design doc first? Then people can get a better understanding. Generic token authentication support for Hadoop --- Key: HADOOP-11766 URL: https://issues.apache.org/jira/browse/HADOOP-11766 Project: Hadoop Common Issue Type: New Feature Components: security Reporter: Kai Zheng As a major goal of Rhino project, we proposed *TokenAuth* effort in HADOOP-9392, where it's to provide a common token authentication framework to integrate multiple authentication mechanisms, by adding a new {{AuthenticationMethod}} in lieu of {{KERBEROS}} and {{SIMPLE}}. To minimize the required changes and risk, we thought of another approach to achieve the general goals based on Kerberos as Kerberos itself supports a pre-authentication framework in both spec and implementation, which was discussed in HADOOP-10959 as *TokenPreauth*. In both approaches, we had performed workable prototypes covering both command line console and Hadoop web UI. As HADOOP-9392 is rather lengthy and heavy, HADOOP-10959 is mostly focused on the concrete implementation approach based on Kerberos, we open this for more general and updated discussions about requirement, use cases, and concerns for the generic token authentication support for Hadoop. We distinguish this token from existing Hadoop tokens as the token in this discussion is majorly for the initial and primary authentication. We will refine our existing codes in HADOOP-9392 and HADOOP-10959, break them down into smaller patches based on latest trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11789) NPE in TestCryptoStreamsWithOpensslAesCtrCryptoCodec
[ https://issues.apache.org/jira/browse/HADOOP-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487399#comment-14487399 ] Yi Liu commented on HADOOP-11789: - {quote} but the NPE on Jenkins needs to be fixed on the Jenkins side. {quote} Sorry, I missed this comment from Andrew :) The new patch addresses that, thanks. NPE in TestCryptoStreamsWithOpensslAesCtrCryptoCodec - Key: HADOOP-11789 URL: https://issues.apache.org/jira/browse/HADOOP-11789 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0, 2.8.0 Environment: ASF Jenkins Reporter: Steve Loughran Assignee: Yi Liu Attachments: HADOOP-11789.001.patch, HADOOP-11789.002.patch NPE surfacing in {{TestCryptoStreamsWithOpensslAesCtrCryptoCodec}} on Jenkins -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11789) NPE in TestCryptoStreamsWithOpensslAesCtrCryptoCodec
[ https://issues.apache.org/jira/browse/HADOOP-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-11789: Status: Patch Available (was: Reopened) NPE in TestCryptoStreamsWithOpensslAesCtrCryptoCodec - Key: HADOOP-11789 URL: https://issues.apache.org/jira/browse/HADOOP-11789 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0, 2.8.0 Environment: ASF Jenkins Reporter: Steve Loughran Assignee: Yi Liu Attachments: HADOOP-11789.001.patch, HADOOP-11789.002.patch NPE surfacing in {{TestCryptoStreamsWithOpensslAesCtrCryptoCodec}} on Jenkins -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11789) NPE in TestCryptoStreamsWithOpensslAesCtrCryptoCodec
[ https://issues.apache.org/jira/browse/HADOOP-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-11789: Attachment: HADOOP-11789.002.patch [~ste...@apache.org], good idea, we should have a better message, thanks. Update the patch to give a better message. NPE in TestCryptoStreamsWithOpensslAesCtrCryptoCodec - Key: HADOOP-11789 URL: https://issues.apache.org/jira/browse/HADOOP-11789 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0, 2.8.0 Environment: ASF Jenkins Reporter: Steve Loughran Assignee: Yi Liu Attachments: HADOOP-11789.001.patch, HADOOP-11789.002.patch NPE surfacing in {{TestCryptoStreamsWithOpensslAesCtrCryptoCodec}} on Jenkins -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11789) NPE in TestCryptoStreamsWithOpensslAesCtrCryptoCodec
[ https://issues.apache.org/jira/browse/HADOOP-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14484799#comment-14484799 ] Yi Liu commented on HADOOP-11789: - Colin, if {{-Pnative}} is set but the OS doesn't have a correct version of OpenSSL, should we make the test fail? Could we test the crypto streams with OpensslAesCtrCryptoCodec only if a correct OpenSSL is loaded? Otherwise people will still see the failure in environments without a correct OpenSSL. NPE in TestCryptoStreamsWithOpensslAesCtrCryptoCodec - Key: HADOOP-11789 URL: https://issues.apache.org/jira/browse/HADOOP-11789 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0, 2.8.0 Environment: ASF Jenkins Reporter: Steve Loughran Assignee: Yi Liu Attachments: HADOOP-11789.001.patch NPE surfacing in {{TestCryptoStreamsWithOpensslAesCtrCryptoCodec}} on Jenkins -- This message was sent by Atlassian JIRA (v6.3.4#6332)
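The "only test when the codec actually loaded" idea boils down to guarding the native-dependent test path behind an availability check and skipping (rather than failing) when it is absent. A minimal self-contained sketch; the stub field stands in for whatever the JNI loader reports (in Hadoop, presumably something like a null loading-failure reason on the OpenSSL cipher class — an assumption here):

```java
// Minimal sketch of "skip instead of NPE/fail" for native-dependent tests.
// In a real JUnit test, one would pass nativeCodecAvailable() to
// Assume.assumeTrue(...) so the test is marked skipped, not failed,
// when the native OpenSSL codec is unavailable.
final class NativeCodecGuard {
  // Stand-in for the real loading check; null means the codec loaded fine.
  static String loadingFailureReason = "openssl not found";

  static boolean nativeCodecAvailable() {
    return loadingFailureReason == null;
  }
}
```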
[jira] [Commented] (HADOOP-11789) NPE in TestCryptoStreamsWithOpensslAesCtrCryptoCodec
[ https://issues.apache.org/jira/browse/HADOOP-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14486456#comment-14486456 ] Yi Liu commented on HADOOP-11789: - Colin, I'm OK to close it as WONTFIX. [~steve_l] and [~xyao], do you have comments? If not, I will close it. NPE in TestCryptoStreamsWithOpensslAesCtrCryptoCodec - Key: HADOOP-11789 URL: https://issues.apache.org/jira/browse/HADOOP-11789 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0, 2.8.0 Environment: ASF Jenkins Reporter: Steve Loughran Assignee: Yi Liu Attachments: HADOOP-11789.001.patch NPE surfacing in {{TestCryptoStreamsWithOpensslAesCtrCryptoCodec}} on Jenkins -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11789) NPE in TestCryptoStreamsWithOpensslAesCtrCryptoCodec
[ https://issues.apache.org/jira/browse/HADOOP-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-11789: Resolution: Won't Fix Status: Resolved (was: Patch Available) Thanks Colin and Andrew for the comments, close it as WON'T FIX. NPE in TestCryptoStreamsWithOpensslAesCtrCryptoCodec - Key: HADOOP-11789 URL: https://issues.apache.org/jira/browse/HADOOP-11789 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0, 2.8.0 Environment: ASF Jenkins Reporter: Steve Loughran Assignee: Yi Liu Attachments: HADOOP-11789.001.patch NPE surfacing in {{TestCryptoStreamsWithOpensslAesCtrCryptoCodec}} on Jenkins -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11789) NPE in TestCryptoStreamsWithOpensslAesCtrCryptoCodec
[ https://issues.apache.org/jira/browse/HADOOP-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-11789: Attachment: HADOOP-11789.001.patch The failure is because OpenSSL is not loaded or the test is not run with the -Pnative flag. Updated the patch. NPE in TestCryptoStreamsWithOpensslAesCtrCryptoCodec - Key: HADOOP-11789 URL: https://issues.apache.org/jira/browse/HADOOP-11789 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0, 2.8.0 Environment: ASF Jenkins Reporter: Steve Loughran Assignee: Yi Liu Attachments: HADOOP-11789.001.patch NPE surfacing in {{TestCryptoStreamsWithOpensslAesCtrCryptoCodec}} on Jenkins -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11789) NPE in TestCryptoStreamsWithOpensslAesCtrCryptoCodec
[ https://issues.apache.org/jira/browse/HADOOP-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-11789: Status: Patch Available (was: Open) NPE in TestCryptoStreamsWithOpensslAesCtrCryptoCodec - Key: HADOOP-11789 URL: https://issues.apache.org/jira/browse/HADOOP-11789 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0, 2.8.0 Environment: ASF Jenkins Reporter: Steve Loughran Assignee: Yi Liu Attachments: HADOOP-11789.001.patch NPE surfacing in {{TestCryptoStreamsWithOpensslAesCtrCryptoCodec}} on Jenkins -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HADOOP-11789) NPE in TestCryptoStreamsWithOpensslAesCtrCryptoCodec
[ https://issues.apache.org/jira/browse/HADOOP-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu reassigned HADOOP-11789: --- Assignee: Yi Liu NPE in TestCryptoStreamsWithOpensslAesCtrCryptoCodec - Key: HADOOP-11789 URL: https://issues.apache.org/jira/browse/HADOOP-11789 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0, 2.8.0 Environment: ASF Jenkins Reporter: Steve Loughran Assignee: Yi Liu NPE surfacing in {{TestCryptoStreamsWithOpensslAesCtrCryptoCodec}} on Jenkins -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-10300) Allowed deferred sending of call responses
[ https://issues.apache.org/jira/browse/HADOOP-10300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14386708#comment-14386708 ] Yi Liu commented on HADOOP-10300: - {code} public void sendResponse() throws IOException { int count = responseWaitCount.decrementAndGet(); assert count >= 0 : "response has already been sent"; if (count == 0) { if (rpcResponse == null) { // needed by postponed operations to indicate an exception has // occurred. it's too late to re-encode the response so just // drop the connection. unlikely to occur in practice but in tests connection.close(); } else { connection.sendResponse(this); } } } {code} In the real case, {{rpcResponse}} has a value before {{sendResponse}}, so it seems {{if (rpcResponse == null)}} will not happen. Can we remove {{connection.close()}} and modify the test which makes this happen? Allowed deferred sending of call responses -- Key: HADOOP-10300 URL: https://issues.apache.org/jira/browse/HADOOP-10300 Project: Hadoop Common Issue Type: Sub-task Components: ipc Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HADOOP-10300.patch, HADOOP-10300.patch RPC handlers currently do not return until the RPC call completes and the response is sent, or a partially sent response has been queued for the responder. It would be useful for a proxy method to notify the handler to not yet send the call's response. A potential use case is that a namespace handler in the NN might want to return before the edit log is synced so it can service more requests and allow increased batching of edits per sync. Background syncing could later trigger the sending of the call response to the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
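The wait-count logic quoted in the comment can be reduced to a small self-contained sketch. Field and method names follow the quoted snippet; the connection handling is simulated with plain booleans since this is an illustration, not the actual IPC server code:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the deferred-response pattern: the count starts at 1, each
// postponement adds 1, and the response is actually sent only when the
// final sendResponse() drops the count to 0.
class DeferrableCall {
  final AtomicInteger responseWaitCount = new AtomicInteger(1);
  boolean responseEncoded;   // stands in for rpcResponse != null
  boolean sent;              // stands in for connection.sendResponse(this)
  boolean connectionClosed;  // stands in for connection.close()

  void postponeResponse() {
    responseWaitCount.incrementAndGet();
  }

  void sendResponse() {
    int count = responseWaitCount.decrementAndGet();
    assert count >= 0 : "response has already been sent";
    if (count == 0) {
      if (!responseEncoded) {
        // too late to re-encode an error response; drop the connection
        connectionClosed = true;
      } else {
        sent = true;
      }
    }
  }
}
```

This makes the questioned branch easy to see: the `!responseEncoded` path only fires when the final decrement happens before any response was encoded, which the comment argues cannot occur outside tests.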
[jira] [Commented] (HADOOP-11710) Make CryptoOutputStream behave like DFSOutputStream wrt synchronization
[ https://issues.apache.org/jira/browse/HADOOP-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14362630#comment-14362630 ] Yi Liu commented on HADOOP-11710: - {quote} I cherry-picked this to branch-2.7 {quote} Oh, I missed that. Thanks for committing to branch-2.7, [~ozawa]. Make CryptoOutputStream behave like DFSOutputStream wrt synchronization --- Key: HADOOP-11710 URL: https://issues.apache.org/jira/browse/HADOOP-11710 Project: Hadoop Common Issue Type: Sub-task Components: fs Affects Versions: 2.6.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Critical Fix For: 2.7.0 Attachments: HADOOP-11710.1.patch.txt, HADOOP-11710.2.patch.txt, HADOOP-11710.3.patch.txt per discussion on parent, as an intermediate solution make CryptoOutputStream behave like DFSOutputStream -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11711) Provide a default value for AES/CTR/NoPadding CryptoCodec classes
[ https://issues.apache.org/jira/browse/HADOOP-11711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359879#comment-14359879 ] Yi Liu commented on HADOOP-11711: - +1, Thanks Andrew! Provide a default value for AES/CTR/NoPadding CryptoCodec classes - Key: HADOOP-11711 URL: https://issues.apache.org/jira/browse/HADOOP-11711 Project: Hadoop Common Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Minor Attachments: hadoop-11711.001.patch, hadoop-11711.002.patch Users can configure the desired class to use for a given codec via a property like {{hadoop.security.crypto.codec.classes.aes.ctr.nopadding}}. However, even though we provide a default value for this codec in {{core-default.xml}}, this default is not also done in the code. As a result, client deployments that do not include {{core-default.xml}} cannot resolve any codecs, and get an NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
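The NPE described here is the classic "default lives only in core-default.xml" problem, and the fix's idea is a code-level fallback. A minimal sketch with a plain Map standing in for the Hadoop Configuration; the default class list shown is an assumption about what core-default.xml ships, not a quoted value:

```java
import java.util.Map;

// Sketch: resolve the codec class list with a hard-coded fallback so a
// client deployment that lacks core-default.xml still gets a usable
// value instead of null (and the eventual NPE).
final class CodecDefaults {
  static final String KEY =
      "hadoop.security.crypto.codec.classes.aes.ctr.nopadding";
  // Assumed to mirror the core-default.xml value; illustrative only.
  static final String DEFAULT =
      "org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec,"
      + "org.apache.hadoop.crypto.JceAesCtrCryptoCodec";

  static String codecClasses(Map<String, String> conf) {
    return conf.getOrDefault(KEY, DEFAULT);
  }
}
```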
[jira] [Updated] (HADOOP-11710) Make CryptoOutputStream behave like DFSOutputStream wrt synchronization
[ https://issues.apache.org/jira/browse/HADOOP-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-11710: Resolution: Fixed Fix Version/s: 2.7.0 Target Version/s: 2.7.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk and branch-2. Make CryptoOutputStream behave like DFSOutputStream wrt synchronization --- Key: HADOOP-11710 URL: https://issues.apache.org/jira/browse/HADOOP-11710 Project: Hadoop Common Issue Type: Sub-task Components: fs Affects Versions: 2.6.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Critical Fix For: 2.7.0 Attachments: HADOOP-11710.1.patch.txt, HADOOP-11710.2.patch.txt, HADOOP-11710.3.patch.txt per discussion on parent, as an intermediate solution make CryptoOutputStream behave like DFSOutputStream -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HADOOP-11710) Make CryptoOutputStream behave like DFSOutputStream wrt synchronization
[ https://issues.apache.org/jira/browse/HADOOP-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359876#comment-14359876 ] Yi Liu edited comment on HADOOP-11710 at 3/13/15 3:25 AM: -- Committed to trunk and branch-2. The test failure is not related. was (Author: hitliuyi): Committed to trunk and branch-2. Make CryptoOutputStream behave like DFSOutputStream wrt synchronization --- Key: HADOOP-11710 URL: https://issues.apache.org/jira/browse/HADOOP-11710 Project: Hadoop Common Issue Type: Sub-task Components: fs Affects Versions: 2.6.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Critical Fix For: 2.7.0 Attachments: HADOOP-11710.1.patch.txt, HADOOP-11710.2.patch.txt, HADOOP-11710.3.patch.txt per discussion on parent, as an intermediate solution make CryptoOutputStream behave like DFSOutputStream -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11711) Provide a default value for AES/CTR/NoPadding CryptoCodec classes
[ https://issues.apache.org/jira/browse/HADOOP-11711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359839#comment-14359839 ] Yi Liu commented on HADOOP-11711: - Thanks [~andrew.wang] for the patch, it looks good to me, +1 pending Jenkins. I find a really small nit in the test, it would be better if you could address: {code} public static final String HADOOP_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY = HADOOP_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX + CipherSuite.AES_CTR_NOPADDING.getConfigSuffix(); {code} In {{CommonConfigurationKeysPublic.java}}, {{HADOOP_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY}} is defined, we could use it in TestCryptoStreamsWithJceAesCtrCryptoCodec.java and TestCryptoStreamsWithOpensslAesCtrCryptoCodec.java instead of constructing the string again. Provide a default value for AES/CTR/NoPadding CryptoCodec classes - Key: HADOOP-11711 URL: https://issues.apache.org/jira/browse/HADOOP-11711 Project: Hadoop Common Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Minor Attachments: hadoop-11711.001.patch Users can configure the desired class to use for a given codec via a property like {{hadoop.security.crypto.codec.classes.aes.ctr.nopadding}}. However, even though we provide a default value for this codec in {{core-default.xml}}, this default is not also done in the code. As a result, client deployments that do not include {{core-default.xml}} cannot resolve any codecs, and get an NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11710) Make CryptoOutputStream behave like DFSOutputStream wrt synchronization
[ https://issues.apache.org/jira/browse/HADOOP-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359815#comment-14359815 ] Yi Liu commented on HADOOP-11710: - +1 pending Jenkins. Make CryptoOutputStream behave like DFSOutputStream wrt synchronization --- Key: HADOOP-11710 URL: https://issues.apache.org/jira/browse/HADOOP-11710 Project: Hadoop Common Issue Type: Sub-task Components: fs Affects Versions: 2.6.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Critical Attachments: HADOOP-11710.1.patch.txt, HADOOP-11710.2.patch.txt, HADOOP-11710.3.patch.txt per discussion on parent, as an intermediate solution make CryptoOutputStream behave like DFSOutputStream -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11708) CryptoOutputStream synchronization differences from DFSOutputStream break HBase
[ https://issues.apache.org/jira/browse/HADOOP-11708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359718#comment-14359718 ] Yi Liu commented on HADOOP-11708: - I am also +1 for changing CryptoOutputStream to behave the same as HDFS. We cannot make DFSOutputStream or CryptoOutputStream *synchronized* for all methods; that would affect performance. In most cases applications should handle the synchronization, so it's enough that we keep the same behavior as HDFS. Sorry that I could not find time to work on HDFS-7911 in the past two days for personal reasons. Since [~busbey] has a patch in HADOOP-11710, I will mark HDFS-7911 as a duplicate. CryptoOutputStream synchronization differences from DFSOutputStream break HBase --- Key: HADOOP-11708 URL: https://issues.apache.org/jira/browse/HADOOP-11708 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.6.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Critical For the write-ahead-log, HBase writes to DFS from a single thread and sends sync/flush/hflush from a configurable number of other threads (default 5). FSDataOutputStream does not document anything about being thread safe, and it is not thread safe for concurrent writes. However, DFSOutputStream is thread safe for concurrent writes + syncs. When it is the stream FSDataOutputStream wraps, the combination is threadsafe for 1 writer and multiple syncs (the exact behavior HBase relies on). When HDFS Transparent Encryption is turned on, CryptoOutputStream is inserted between FSDataOutputStream and DFSOutputStream. It is proactively labeled as not thread safe, and this composition is not thread safe for any operations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11710) Make CryptoOutputStream behave like DFSOutputStream wrt synchronization
[ https://issues.apache.org/jira/browse/HADOOP-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359737#comment-14359737 ] Yi Liu commented on HADOOP-11710: - Sean, don't move {{closed = true;}}. {{super.close();}} will invoke flush to flush the remaining data in the buffer; if we set *closed* to true before invoking {{super.close()}}, we will get an error. I think the test failure should be related to this. Make CryptoOutputStream behave like DFSOutputStream wrt synchronization --- Key: HADOOP-11710 URL: https://issues.apache.org/jira/browse/HADOOP-11710 Project: Hadoop Common Issue Type: Sub-task Components: fs Affects Versions: 2.6.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Critical Attachments: HADOOP-11710.1.patch.txt, HADOOP-11710.2.patch.txt per discussion on parent, as an intermediate solution make CryptoOutputStream behave like DFSOutputStream -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HADOOP-11710) Make CryptoOutputStream behave like DFSOutputStream wrt synchronization
[ https://issues.apache.org/jira/browse/HADOOP-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359737#comment-14359737 ] Yi Liu edited comment on HADOOP-11710 at 3/13/15 1:20 AM: -- Sean, don't move {{closed = true;}}. {{super.close();}} will invoke flush to flush the remaining data in the buffer, if we set *closed* to true before invoking {{super.close()}}, we will get error. I think the test failure should be related to this. was (Author: hitliuyi): Sean, don't move {{closed = true;}}. {{super.close();}} will invoke flush to flush the remaining data in the buffer, if we set *closed* to true before invoking {{super.close()}}, we will get error. I think the test failure should be related to this. Make CryptoOutputStream behave like DFSOutputStream wrt synchronization --- Key: HADOOP-11710 URL: https://issues.apache.org/jira/browse/HADOOP-11710 Project: Hadoop Common Issue Type: Sub-task Components: fs Affects Versions: 2.6.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Critical Attachments: HADOOP-11710.1.patch.txt, HADOOP-11710.2.patch.txt per discussion on parent, as an intermediate solution make CryptoOutputStream behave like DFSOutputStream -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11710) Make CryptoOutputStream behave like DFSOutputStream wrt synchronization
[ https://issues.apache.org/jira/browse/HADOOP-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14359743#comment-14359743 ] Yi Liu commented on HADOOP-11710: - Oh, I just saw Steve's comments {quote} However, I would recommend one change, which is in close(), move the close=true operation up immediately after the close check, just in case something in {{freeBuffers() }} raised an exception or the parent did - it'll stop a second close() call getting into a mess. This is not really related to the rest of the patch, except in the general improve re-entrancy context {quote} I agree we should make sure {{closed}} gets set; also I think using sun's API to release the direct buffer rarely fails. Maybe we can put it in {{try... finally}}. Make CryptoOutputStream behave like DFSOutputStream wrt synchronization --- Key: HADOOP-11710 URL: https://issues.apache.org/jira/browse/HADOOP-11710 Project: Hadoop Common Issue Type: Sub-task Components: fs Affects Versions: 2.6.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Critical Attachments: HADOOP-11710.1.patch.txt, HADOOP-11710.2.patch.txt per discussion on parent, as an intermediate solution make CryptoOutputStream behave like DFSOutputStream -- This message was sent by Atlassian JIRA (v6.3.4#6332)
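The ordering constraint discussed in these comments — the flush triggered by close() must run while the stream still counts as open, yet the closed flag should be set even if cleanup throws — can be shown with a minimal stand-in stream. This is a sketch of the idea, not CryptoOutputStream's actual code:

```java
import java.io.BufferedOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Stand-in for a crypto stream with an internal buffer: super.close()
// triggers flush(), and flush() checks the closed flag -- so setting
// closed = true before super.close() would make close() itself fail.
// Putting the flag in a finally keeps the stream marked closed even if
// the flush or buffer cleanup throws.
class SketchCryptoOutputStream extends FilterOutputStream {
  private boolean closed;

  SketchCryptoOutputStream(OutputStream sink) {
    super(new BufferedOutputStream(sink));
  }

  private void checkStream() throws IOException {
    if (closed) {
      throw new IOException("Stream closed");
    }
  }

  @Override
  public void write(int b) throws IOException {
    checkStream();
    out.write(b);
  }

  @Override
  public void flush() throws IOException {
    checkStream();
    out.flush();
  }

  @Override
  public void close() throws IOException {
    if (closed) {
      return;
    }
    try {
      super.close();   // flushes remaining buffered data; needs closed == false
    } finally {
      closed = true;   // set afterwards, in a finally, per the discussion
    }
  }
}
```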
[jira] [Updated] (HADOOP-11674) oneByteBuf in CryptoInputStream and CryptoOutputStream should be non static
[ https://issues.apache.org/jira/browse/HADOOP-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-11674: Summary: oneByteBuf in CryptoInputStream and CryptoOutputStream should be non static (was: data corruption for parallel CryptoInputStream and CryptoOutputStream) oneByteBuf in CryptoInputStream and CryptoOutputStream should be non static --- Key: HADOOP-11674 URL: https://issues.apache.org/jira/browse/HADOOP-11674 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 2.6.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Critical Attachments: HADOOP-11674.1.patch A common optimization in the io classes for Input/Output Streams is to save a single length-1 byte array to use in single byte read/write calls. CryptoInputStream and CryptoOutputStream both attempt to follow this practice but mistakenly mark the array as static. That means that only a single instance of each can be present in a JVM safely. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11674) oneByteBuf in CryptoInputStream and CryptoOutputStream should be non static
[ https://issues.apache.org/jira/browse/HADOOP-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-11674: Resolution: Fixed Fix Version/s: 2.7.0 Target Version/s: 2.7.0 (was: 3.0.0, 2.7.0, 2.6.1) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk and branch-2. Thanks [~busbey] for the contribution. oneByteBuf in CryptoInputStream and CryptoOutputStream should be non static --- Key: HADOOP-11674 URL: https://issues.apache.org/jira/browse/HADOOP-11674 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 2.6.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Critical Fix For: 2.7.0 Attachments: HADOOP-11674.1.patch A common optimization in the io classes for Input/Output Streams is to save a single length-1 byte array to use in single byte read/write calls. CryptoInputStream and CryptoOutputStream both attempt to follow this practice but mistakenly mark the array as static. That means that only a single instance of each can be present in a JVM safely. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11674) oneByteBuf in CryptoInputStream and CryptoOutputStream should be non static
[ https://issues.apache.org/jira/browse/HADOOP-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14348315#comment-14348315 ] Yi Liu commented on HADOOP-11674: - +1, {{oneByteBuf}} should be non-static; otherwise there may be issues with {{read()}} across multiple threads. oneByteBuf in CryptoInputStream and CryptoOutputStream should be non static --- Key: HADOOP-11674 URL: https://issues.apache.org/jira/browse/HADOOP-11674 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 2.6.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Critical Attachments: HADOOP-11674.1.patch A common optimization in the io classes for Input/Output Streams is to save a single length-1 byte array to use in single byte read/write calls. CryptoInputStream and CryptoOutputStream both attempt to follow this practice but mistakenly mark the array as static. That means that only a single instance of each can be present in a JVM safely. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
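The non-static fix is easy to see in a stripped-down stream: the one-byte scratch array must belong to the instance, because a static array would be shared by every stream in the JVM, letting concurrent single-byte reads on different streams clobber each other. A sketch of the pattern, not the actual CryptoInputStream code:

```java
import java.io.IOException;
import java.io.InputStream;

// The scratch buffer for single-byte read() is a per-instance field.
// Declaring it static would share one array across all streams in the
// JVM, which is only safe when a single instance exists.
class OneByteBufStream extends InputStream {
  private final byte[] oneByteBuf = new byte[1]; // instance field, NOT static
  private final InputStream in;

  OneByteBufStream(InputStream in) {
    this.in = in;
  }

  @Override
  public int read() throws IOException {
    int n = in.read(oneByteBuf, 0, 1); // reuse the scratch array per call
    return (n == -1) ? -1 : (oneByteBuf[0] & 0xff);
  }
}
```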
[jira] [Updated] (HADOOP-11664) Loading predefined EC schemas from configuration
[ https://issues.apache.org/jira/browse/HADOOP-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-11664: Issue Type: Sub-task (was: Task) Parent: HADOOP-11264 Loading predefined EC schemas from configuration Key: HADOOP-11664 URL: https://issues.apache.org/jira/browse/HADOOP-11664 Project: Hadoop Common Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HDFS-7371_v1.patch A system administrator can configure multiple EC codecs in the hdfs-site.xml file, and codec instances or schemas in a new configuration file named ec-schema.xml in the conf folder. A codec instance or schema can be referenced by the codec name, and a schema can be applied, by schema name, to a folder or EC zone to enforce EC. Once a schema is used to define an EC zone, its associated parameter values will be stored as xattributes and respected thereafter.
[jira] [Moved] (HADOOP-11664) Loading predefined EC schemas from configuration
[ https://issues.apache.org/jira/browse/HADOOP-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu moved HDFS-7371 to HADOOP-11664: --- Key: HADOOP-11664 (was: HDFS-7371) Project: Hadoop Common (was: Hadoop HDFS) Loading predefined EC schemas from configuration Key: HADOOP-11664 URL: https://issues.apache.org/jira/browse/HADOOP-11664 Project: Hadoop Common Issue Type: Task Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HDFS-7371_v1.patch A system administrator can configure multiple EC codecs in the hdfs-site.xml file, and codec instances or schemas in a new configuration file named ec-schema.xml in the conf folder. A codec instance or schema can be referenced by the codec name, and a schema can be applied, by schema name, to a folder or EC zone to enforce EC. Once a schema is used to define an EC zone, its associated parameter values will be stored as xattributes and respected thereafter.
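The ec-schema.xml file described in HADOOP-11664 might look something like the fragment below. This is purely an illustrative sketch: the issue does not show the actual file format, so every element name here is hypothetical.

```xml
<!-- Hypothetical ec-schema.xml sketch; element names are illustrative only,
     not the format actually adopted by the HADOOP-11664 patch. -->
<schemas>
  <schema name="RS-6-3">
    <codec>rs</codec>
    <numDataUnits>6</numDataUnits>
    <numParityUnits>3</numParityUnits>
  </schema>
</schemas>
```

A schema defined this way would then be referenced by its name (e.g. "RS-6-3") when enforcing EC on a folder or EC zone, with its parameters persisted as xattributes.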
[jira] [Updated] (HADOOP-11647) Reed-Solomon ErasureCoder
[ https://issues.apache.org/jira/browse/HADOOP-11647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-11647: Issue Type: Sub-task (was: Task) Parent: HADOOP-11264 Reed-Solomon ErasureCoder - Key: HADOOP-11647 URL: https://issues.apache.org/jira/browse/HADOOP-11647 Project: Hadoop Common Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HDFS-7664-v1.patch This is to implement a Reed-Solomon ErasureCoder using the API defined in HDFS-7662. The concrete RawErasureCoder can be plugged in via configuration, using either the JRSErasureCoder added in HDFS-7418 or the IsaRSErasureCoder added in HDFS-7338.
[jira] [Moved] (HADOOP-11645) Erasure Codec API covering the essential aspects for an erasure code
[ https://issues.apache.org/jira/browse/HADOOP-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu moved HDFS-7699 to HADOOP-11645: --- Key: HADOOP-11645 (was: HDFS-7699) Project: Hadoop Common (was: Hadoop HDFS) Erasure Codec API covering the essential aspects for an erasure code Key: HADOOP-11645 URL: https://issues.apache.org/jira/browse/HADOOP-11645 Project: Hadoop Common Issue Type: Task Reporter: Kai Zheng Assignee: Kai Zheng This is to define the even higher-level API *ErasureCodec* to possibly consider all the essential aspects of an erasure code, as discussed in HDFS-7337 in detail. Generally, it will cover the necessary configuration about which *RawErasureCoder* to use for the code scheme, how to form and lay out the BlockGroup, etc. It will also discuss how an *ErasureCodec* will be used in both the client and the DataNode, in all the supported EC-related modes.
[jira] [Updated] (HADOOP-11645) Erasure Codec API covering the essential aspects for an erasure code
[ https://issues.apache.org/jira/browse/HADOOP-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-11645: Issue Type: Sub-task (was: Task) Parent: HADOOP-11264 Erasure Codec API covering the essential aspects for an erasure code Key: HADOOP-11645 URL: https://issues.apache.org/jira/browse/HADOOP-11645 Project: Hadoop Common Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng This is to define the even higher-level API *ErasureCodec* to possibly consider all the essential aspects of an erasure code, as discussed in HDFS-7337 in detail. Generally, it will cover the necessary configuration about which *RawErasureCoder* to use for the code scheme, how to form and lay out the BlockGroup, etc. It will also discuss how an *ErasureCodec* will be used in both the client and the DataNode, in all the supported EC-related modes.
[jira] [Moved] (HADOOP-11646) Erasure Coder API for encoding and decoding of block group
[ https://issues.apache.org/jira/browse/HADOOP-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu moved HDFS-7662 to HADOOP-11646: --- Fix Version/s: (was: HDFS-7285) HDFS-7285 Key: HADOOP-11646 (was: HDFS-7662) Project: Hadoop Common (was: Hadoop HDFS) Erasure Coder API for encoding and decoding of block group -- Key: HADOOP-11646 URL: https://issues.apache.org/jira/browse/HADOOP-11646 Project: Hadoop Common Issue Type: Task Reporter: Kai Zheng Assignee: Kai Zheng Fix For: HDFS-7285 Attachments: HDFS-7662-v1.patch, HDFS-7662-v2.patch, HDFS-7662-v3.patch This is to define the ErasureCoder API for encoding and decoding of a BlockGroup. Given a BlockGroup, the ErasureCoder extracts data chunks from the blocks and leverages the RawErasureCoder defined in HDFS-7353 to perform the concrete encoding or decoding.
[jira] [Updated] (HADOOP-11646) Erasure Coder API for encoding and decoding of block group
[ https://issues.apache.org/jira/browse/HADOOP-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-11646: Issue Type: Sub-task (was: Task) Parent: HADOOP-11264 Erasure Coder API for encoding and decoding of block group -- Key: HADOOP-11646 URL: https://issues.apache.org/jira/browse/HADOOP-11646 Project: Hadoop Common Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Fix For: HDFS-7285 Attachments: HDFS-7662-v1.patch, HDFS-7662-v2.patch, HDFS-7662-v3.patch This is to define the ErasureCoder API for encoding and decoding of a BlockGroup. Given a BlockGroup, the ErasureCoder extracts data chunks from the blocks and leverages the RawErasureCoder defined in HDFS-7353 to perform the concrete encoding or decoding.
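The encode/decode idea behind the ErasureCoder API can be illustrated with a toy coder. This is not Hadoop's actual API (the class and method names here are hypothetical), and it uses simple XOR parity rather than Reed-Solomon, but it shows the core contract: take data chunks, produce parity, and rebuild a lost chunk from the survivors.

```java
// Toy erasure coder using XOR parity: the parity chunk is the XOR of all
// data chunks, so any single lost data chunk equals the XOR of the
// surviving chunks and the parity. Real coders (e.g. Reed-Solomon) can
// tolerate multiple losses, but the encode/decode shape is the same.
class XorRawCoder {
    static byte[] encode(byte[][] dataChunks) {
        byte[] parity = new byte[dataChunks[0].length];
        for (byte[] chunk : dataChunks) {
            for (int i = 0; i < parity.length; i++) {
                parity[i] ^= chunk[i];
            }
        }
        return parity;
    }

    // Recover one lost data chunk from the surviving chunks plus parity.
    static byte[] decode(byte[][] survivingChunks, byte[] parity) {
        byte[] lost = parity.clone();
        for (byte[] chunk : survivingChunks) {
            for (int i = 0; i < lost.length; i++) {
                lost[i] ^= chunk[i];
            }
        }
        return lost;
    }
}
```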
[jira] [Moved] (HADOOP-11647) Reed-Solomon ErasureCoder
[ https://issues.apache.org/jira/browse/HADOOP-11647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu moved HDFS-7664 to HADOOP-11647: --- Key: HADOOP-11647 (was: HDFS-7664) Project: Hadoop Common (was: Hadoop HDFS) Reed-Solomon ErasureCoder - Key: HADOOP-11647 URL: https://issues.apache.org/jira/browse/HADOOP-11647 Project: Hadoop Common Issue Type: Task Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HDFS-7664-v1.patch This is to implement a Reed-Solomon ErasureCoder using the API defined in HDFS-7662. The concrete RawErasureCoder can be plugged in via configuration, using either the JRSErasureCoder added in HDFS-7418 or the IsaRSErasureCoder added in HDFS-7338.
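The "plug in a concrete RawErasureCoder via configuration" mechanism can be sketched with plain reflection. The class names and the configuration key below are hypothetical, and Hadoop's real mechanism lives in its Configuration and coder-factory classes; this only illustrates the selection pattern.

```java
import java.util.Map;

// Sketch of configuration-driven coder selection: the configuration names
// an implementation class, and the factory instantiates it reflectively.
// Key and class names here are illustrative, not Hadoop's actual ones.
class CoderFactory {
    public interface RawCoder {
        String name();
    }

    public static class JavaRsCoder implements RawCoder {
        public String name() { return "java-rs"; }
    }

    public static class NativeRsCoder implements RawCoder {
        public String name() { return "native-rs"; }
    }

    // Instantiate whichever implementation the configuration names,
    // falling back to the pure-Java coder when nothing is configured.
    static RawCoder create(Map<String, String> conf) throws Exception {
        String cls = conf.getOrDefault("io.erasurecode.rawcoder.class",
                                       JavaRsCoder.class.getName());
        return (RawCoder) Class.forName(cls)
                               .getDeclaredConstructor()
                               .newInstance();
    }
}
```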
[jira] [Updated] (HADOOP-11595) Add default implementation for AbstractFileSystem#truncate
[ https://issues.apache.org/jira/browse/HADOOP-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-11595: Resolution: Fixed Fix Version/s: 2.7.0 Status: Resolved (was: Patch Available) Committed to trunk and branch-2, thanks again, Chris. Add default implementation for AbstractFileSystem#truncate -- Key: HADOOP-11595 URL: https://issues.apache.org/jira/browse/HADOOP-11595 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.7.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.7.0 Attachments: HADOOP-11595.001.patch As [~cnauroth] commented in HADOOP-11510, we should add a default implementation for AbstractFileSystem#truncate to avoid breaking backwards compatibility.
[jira] [Commented] (HADOOP-11595) Add default implementation for AbstractFileSystem#truncate
[ https://issues.apache.org/jira/browse/HADOOP-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14327149#comment-14327149 ] Yi Liu commented on HADOOP-11595: - Chris, thank you very much for the review and verification! Sorry for the late response (I am on holiday this week for the Chinese traditional new year); I will commit it later. Add default implementation for AbstractFileSystem#truncate -- Key: HADOOP-11595 URL: https://issues.apache.org/jira/browse/HADOOP-11595 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.7.0 Reporter: Yi Liu Assignee: Yi Liu Attachments: HADOOP-11595.001.patch As [~cnauroth] commented in HADOOP-11510, we should add a default implementation for AbstractFileSystem#truncate to avoid breaking backwards compatibility.
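The backwards-compatibility pattern behind HADOOP-11595 can be sketched as follows. The class and method signatures are simplified (not Hadoop's exact AbstractFileSystem API), and the throwing-default behavior is an assumption for illustration: the idea is that adding a non-abstract default instead of a new abstract method keeps every existing subclass compiling, while file systems that support the operation override it.

```java
// Simplified sketch: a base class gains a new operation as a throwing
// default rather than a new abstract method, so pre-existing subclasses
// (like LegacyFs below) keep compiling unchanged.
abstract class BaseFs {
    public boolean truncate(String path, long newLength) {
        throw new UnsupportedOperationException(
            getClass().getSimpleName() + " doesn't support truncate");
    }
}

// A file system that actually supports truncate overrides the default.
class TruncatingFs extends BaseFs {
    @Override
    public boolean truncate(String path, long newLength) {
        return true; // real logic elided
    }
}

// A pre-existing subclass needs no change at all.
class LegacyFs extends BaseFs {
}
```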