[jira] [Commented] (HADOOP-16018) DistCp won't reassemble chunks when blocks per chunk > 0

2018-12-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16727525#comment-16727525
 ] 

Ted Yu commented on HADOOP-16018:
-

Looking at 
https://github.com/apache/hadoop/commits/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpOptionSwitch.java
, DistCpOptionSwitch.java was not touched by HADOOP-15850.

> DistCp won't reassemble chunks when blocks per chunk > 0
> 
>
> Key: HADOOP-16018
> URL: https://issues.apache.org/jira/browse/HADOOP-16018
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.2.0, 2.9.2
>Reporter: Kai X
>Priority: Major
>
> I was investigating why hadoop-distcp-2.9.2 won't reassemble chunks of the 
> same file when blocks per chunk has been set > 0.
> In CopyCommitter::commitJob, this check skips chunk reassembly whenever 
> blocks per chunk is 0:
> {code:java}
> if (blocksPerChunk > 0) {
>   concatFileChunks(conf);
> }
> {code}
> Then in CopyCommitter's ctor, blocksPerChunk is initialised from the config:
> {code:java}
> blocksPerChunk = context.getConfiguration().getInt(
> DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel(), 0);
> {code}
>  
> But here the config key DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel() 
> will always return an empty string, because the enum constant is constructed 
> without a config label:
> {code:java}
> BLOCKS_PER_CHUNK("",
> new Option("blocksperchunk", true, "If set to a positive value, files"
> + "with more blocks than this value will be split into chunks of "
> + " blocks to be transferred in parallel, and "
> + "reassembled on the destination. By default,  is "
> + "0 and the files will be transmitted in their entirety without "
> + "splitting. This switch is only applicable when the source file "
> + "system implements getBlockLocations method and the target file "
> + "system implements concat method"))
> {code}
> As a result blocksPerChunk always falls back to its default value of 0, 
> which prevents the chunks from being reassembled.
>  
>  
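
A minimal sketch of the kind of fix this implies: give BLOCKS_PER_CHUNK a 
non-empty config label so that the value set on the command line can be read 
back by CopyCommitter. The "distcp.blocks.per.chunk" label below is an 
assumption for illustration, and the help text is abbreviated:
{code:java}
// Hypothetical sketch: with a non-empty config label, getConfigLabel() no
// longer returns "" and the committer sees the real -blocksperchunk value.
BLOCKS_PER_CHUNK("distcp.blocks.per.chunk",
    new Option("blocksperchunk", true, "If set to a positive value, files "
        + "with more blocks than this value will be split into chunks and "
        + "reassembled on the destination."))
{code}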






[jira] [Created] (HADOOP-15910) Javadoc for LdapAuthenticationHandler#ENABLE_START_TLS is wrong

2018-11-08 Thread Ted Yu (JIRA)
Ted Yu created HADOOP-15910:
---

 Summary: Javadoc for LdapAuthenticationHandler#ENABLE_START_TLS is 
wrong
 Key: HADOOP-15910
 URL: https://issues.apache.org/jira/browse/HADOOP-15910
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ted Yu


In LdapAuthenticationHandler, the javadoc for ENABLE_START_TLS has the same 
contents as the javadoc for BASE_DN.
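
A minimal sketch of the kind of correction implied; the wording is only a 
suggestion, and the field initializer shown is assumed rather than copied from 
the class:
{code:java}
/**
 * Constant for the configuration property that indicates whether StartTLS
 * should be enabled when connecting to the LDAP server.
 */
public static final String ENABLE_START_TLS = TYPE + ".enablestarttls"; // initializer assumed
{code}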






[jira] [Created] (HADOOP-15876) Use keySet().removeAll() to remove multiple keys from Map in AzureBlobFileSystemStore

2018-10-23 Thread Ted Yu (JIRA)
Ted Yu created HADOOP-15876:
---

 Summary: Use keySet().removeAll() to remove multiple keys from Map 
in AzureBlobFileSystemStore
 Key: HADOOP-15876
 URL: https://issues.apache.org/jira/browse/HADOOP-15876
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Ted Yu


Looking at 
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java
 , {{removeDefaultAcl}} in particular:
{code}
for (Map.Entry defaultAclEntry : 
defaultAclEntries.entrySet()) {
  aclEntries.remove(defaultAclEntry.getKey());
}
{code}
The loop above can be rewritten as a single call:
{code}
aclEntries.keySet().removeAll(defaultAclEntries.keySet());
{code}
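
This works because {{keySet()}} is a live view of the map: removing keys from 
the view removes the corresponding entries from the map itself. A small 
standalone illustration with plain JDK maps (not the actual 
AzureBlobFileSystemStore types):
{code:java}
import java.util.HashMap;
import java.util.Map;

public class RemoveAllDemo {
  public static void main(String[] args) {
    Map<String, String> aclEntries = new HashMap<>();
    aclEntries.put("default:user::rwx", "keep-or-drop");
    aclEntries.put("user::rwx", "keep");

    Map<String, String> defaultAclEntries = new HashMap<>();
    defaultAclEntries.put("default:user::rwx", "ignored");

    // One call removes every key of defaultAclEntries from aclEntries.
    aclEntries.keySet().removeAll(defaultAclEntries.keySet());

    System.out.println(aclEntries); // prints {user::rwx=keep}
  }
}
{code}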






[jira] [Updated] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0

2018-10-19 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15850:

Priority: Critical  (was: Major)

> CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0
> --
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 3.0.4, 3.3.0, 3.1.2, 3.2.1
>
> Attachments: HADOOP-15850.branch-3.0.patch, HADOOP-15850.v2.patch, 
> HADOOP-15850.v3.patch, HADOOP-15850.v4.patch, HADOOP-15850.v5.patch, 
> HADOOP-15850.v6.patch, testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating a test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1.
> hbase's MapReduceBackupCopyJob$BackupDistCp creates the listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> their isSplit() returns false. Otherwise, the following from toString() would 
> have been logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> On the hbase side, we could specify one bulk loaded hfile per job, but that 
> defeats the purpose of using DistCp.
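
A minimal sketch of the kind of guard the summary calls for (blocksPerChunk 
here is assumed to be a field initialised from the job configuration; this is 
not the exact patch):
{code:java}
// Sketch only: skip chunk reassembly entirely unless -blocksperchunk was
// actually used, so independent files are never treated as chunks of one file.
if (blocksPerChunk > 0) {
  concatFileChunks(conf);
}
{code}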






[jira] [Updated] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0

2018-10-19 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15850:

Attachment: HADOOP-15850.v6.patch

> CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0
> --
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v2.patch, HADOOP-15850.v3.patch, 
> HADOOP-15850.v4.patch, HADOOP-15850.v5.patch, HADOOP-15850.v6.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>



[jira] [Commented] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0

2018-10-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656199#comment-16656199
 ] 

Ted Yu commented on HADOOP-15850:
-

Thanks for the review; it looks like this bug could have been discovered sooner.

> CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0
> --
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v2.patch, HADOOP-15850.v3.patch, 
> HADOOP-15850.v4.patch, HADOOP-15850.v5.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>



[jira] [Updated] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0

2018-10-18 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15850:

Attachment: HADOOP-15850.v5.patch

> CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0
> --
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v2.patch, HADOOP-15850.v3.patch, 
> HADOOP-15850.v4.patch, HADOOP-15850.v5.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>



[jira] [Updated] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0

2018-10-18 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15850:

Attachment: HADOOP-15850.v4.patch

> CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0
> --
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v2.patch, HADOOP-15850.v3.patch, 
> HADOOP-15850.v4.patch, testIncrementalBackupWithBulkLoad-output.txt
>
>



[jira] [Commented] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0

2018-10-17 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16653848#comment-16653848
 ] 

Ted Yu commented on HADOOP-15850:
-

[~ste...@apache.org] [~yzhangal] [~jojochuang] 
Mind taking a look at patch v3?

Thanks

> CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0
> --
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v2.patch, HADOOP-15850.v3.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>



[jira] [Updated] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0

2018-10-17 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15850:

Attachment: HADOOP-15850.v3.patch

> CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0
> --
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v2.patch, HADOOP-15850.v3.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>



[jira] [Updated] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0

2018-10-17 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15850:

Issue Type: Bug  (was: Task)

> CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0
> --
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v2.patch, HADOOP-15850.v3.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>



[jira] [Commented] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0

2018-10-17 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16653756#comment-16653756
 ] 

Ted Yu commented on HADOOP-15850:
-

I tested patch v2 in two ways:

* when no "-blocksperchunk" option is specified, 
TestIncrementalBackupWithBulkLoad passes
* when a positive value for the "-blocksperchunk" option is specified, 
{{concatFileChunks}} is called, resulting in the previously reported error (an 
illustrative command is sketched below).
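
For reference, an illustrative command line that exercises the second case (the 
paths below are placeholders, not taken from the test):
{code}
hadoop distcp -blocksperchunk 2 hdfs://src-cluster/data hdfs://dst-cluster/data
{code}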

> CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0
> --
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v2.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>



[jira] [Updated] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0

2018-10-17 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15850:

Status: Patch Available  (was: Open)

> CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0
> --
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v2.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>



[jira] [Updated] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0

2018-10-17 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15850:

Summary: CopyCommitter#concatFileChunks should check that the blocks per 
chunk is not 0  (was: CopyCommitter#concatFileChunks should check that the 
source file to be merged is a split)

> CopyCommitter#concatFileChunks should check that the blocks per chunk is not 0
> --
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v2.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>



[jira] [Updated] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the source file to be merged is a split

2018-10-17 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15850:

Attachment: (was: HADOOP-15850.v1.patch)

> CopyCommitter#concatFileChunks should check that the source file to be merged 
> is a split
> 
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v2.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>



[jira] [Updated] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the source file to be merged is a split

2018-10-17 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15850:

Attachment: HADOOP-15850.v2.patch

> CopyCommitter#concatFileChunks should check that the source file to be merged 
> is a split
> 
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v2.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>



[jira] [Comment Edited] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the source file to be merged is a split

2018-10-17 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16653574#comment-16653574
 ] 

Ted Yu edited comment on HADOOP-15850 at 10/17/18 3:25 PM:
---

In order to retrieve the per-chunk information in the CopyCommitter ctor,
how about using the config key
DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel()?
CopyCommitter can then selectively skip concatenation when the value for that config
is 0.
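
For illustration, a minimal sketch of that idea (the helper class below is made up,
and it assumes the BLOCKS_PER_CHUNK config label resolves to a usable key that
DistCp propagates into the job configuration):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.tools.DistCpOptionSwitch;

// Hypothetical helper, not part of any patch: shows how CopyCommitter could
// read the blocks-per-chunk value back out of the job configuration and
// decide whether chunk concatenation is needed at commit time.
public final class ChunkConcatCheck {
  private ChunkConcatCheck() {
  }

  /** Returns true only when the copy was split into chunks. */
  public static boolean concatNeeded(Configuration conf) {
    int blocksPerChunk = conf.getInt(
        DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel(), 0);
    return blocksPerChunk > 0;
  }
}
{code}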


was (Author: yuzhih...@gmail.com):
I haven't found a way to pass the per-chunk information to the CopyCommitter ctor.

How about adding a config keyed by DistCpConstants.CONF_BLOCKS_PER_CHUNK which 
carries the value of -blocksperchunk?
DistCp can use this config to inform CopyCommitter, which can then selectively skip 
concatenation.

> CopyCommitter#concatFileChunks should check that the source file to be merged 
> is a split
> 
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v1.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1
> hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> their isSplit() return false. Otherwise the following from toString should be 
> logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> From hbase side, we can specify one bulk loaded hfile per job but that 
> defeats the purpose of using DistCp.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the source file to be merged is a split

2018-10-17 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16653574#comment-16653574
 ] 

Ted Yu commented on HADOOP-15850:
-

I haven't found a way to pass the per-chunk information to the CopyCommitter ctor.

How about adding a config keyed by DistCpConstants.CONF_BLOCKS_PER_CHUNK which 
carries the value of -blocksperchunk?
DistCp can use this config to inform CopyCommitter, which can then selectively skip 
concatenation.
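
A rough sketch of the shape of that proposal (CONF_BLOCKS_PER_CHUNK and the key
string below are only placeholders for the suggested constant, which does not exist
in DistCpConstants today):
{code}
import org.apache.hadoop.conf.Configuration;

// Hypothetical sketch of the proposed config, not existing DistCp code.
public final class BlocksPerChunkConfig {
  // Placeholder for the suggested DistCpConstants.CONF_BLOCKS_PER_CHUNK key.
  public static final String CONF_BLOCKS_PER_CHUNK = "distcp.blocks.per.chunk";

  private BlocksPerChunkConfig() {
  }

  /** DistCp side: record the parsed -blocksperchunk value in the job conf. */
  public static void record(Configuration conf, int blocksPerChunk) {
    conf.setInt(CONF_BLOCKS_PER_CHUNK, blocksPerChunk);
  }

  /** CopyCommitter side: concatenation only applies when files were chunked. */
  public static boolean shouldConcat(Configuration conf) {
    return conf.getInt(CONF_BLOCKS_PER_CHUNK, 0) > 0;
  }
}
{code}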

> CopyCommitter#concatFileChunks should check that the source file to be merged 
> is a split
> 
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v1.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1
> hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> their isSplit() return false. Otherwise the following from toString should be 
> logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> From hbase side, we can specify one bulk loaded hfile per job but that 
> defeats the purpose of using DistCp.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the source file to be merged is a split

2018-10-16 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16652876#comment-16652876
 ] 

Ted Yu commented on HADOOP-15850:
-

I tried adding the '-blocksperchunk 0' option when invoking DistCp:
{code}
2018-10-17 02:33:53,708 DEBUG [Time-limited test] 
mapreduce.MapReduceBackupCopyJob(416): New DistCp options: [-async, 
-blocksperchunk, 0, 
hdfs://localhost:34344/user/hbase/test-data/78931012-3303-fc71-e289-5a9726f1bfcc/data/default/test-1539743586635/2e17accd93f78be97c0f585e68f283d6/f/46480cbed054406c9ef52ff123729938_SeqId_205_,
 
hdfs://localhost:34344/user/hbase/test-data/78931012-3303-fc71-e289-5a9726f1bfcc/data/default/test-1539743586635/2e17accd93f78be97c0f585e68f283d6/f/7e3cc96eb3f7447cb4f925df947d1fa3_SeqId_205_,
 hdfs://localhost:34344/backupUT/backup_1539743624592]
{code}
I still encountered the 'Inconsistent sequence file' error.
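
For reference, a hedged sketch of driving DistCp programmatically with an explicit
blocks-per-chunk value of 0 (paths are placeholders, and it assumes the
DistCpOptions.Builder in this Hadoop line exposes a withBlocksPerChunk setter):
{code}
import java.util.Collections;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.tools.DistCp;
import org.apache.hadoop.tools.DistCpOptions;

public class DistCpWithoutChunking {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // args[0] = source file/dir, args[1] = target dir (placeholders)
    DistCpOptions options = new DistCpOptions.Builder(
        Collections.singletonList(new Path(args[0])), new Path(args[1]))
        .withBlocksPerChunk(0)   // 0 means files are copied whole, never chunked
        .build();
    // execute() submits the copy job and waits for it to finish
    new DistCp(conf, options).execute();
  }
}
{code}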

> CopyCommitter#concatFileChunks should check that the source file to be merged 
> is a split
> 
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v1.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1
> hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> their isSplit() return false. Otherwise the following from toString should be 
> logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> From hbase side, we can specify one bulk loaded hfile per job but that 
> defeats the purpose of using DistCp.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the source file to be merged is a split

2018-10-16 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16652765#comment-16652765
 ] 

Ted Yu commented on HADOOP-15850:
-

The DistCpOptions instance for the DistCp session is not passed to 
CopyCommitter.
If we have the per-chunk information, the {{concatFileChunks}} call should depend on 
its value.

> CopyCommitter#concatFileChunks should check that the source file to be merged 
> is a split
> 
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v1.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1
> hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> their isSplit() return false. Otherwise the following from toString should be 
> logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> From hbase side, we can specify one bulk loaded hfile per job but that 
> defeats the purpose of using DistCp.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the source file to be merged is a split

2018-10-16 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16652736#comment-16652736
 ] 

Ted Yu commented on HADOOP-15850:
-

[~jojochuang]:
See the link to MapReduceBackupCopyJob.java in my first comment.
We invoke DistCp programmatically.

> CopyCommitter#concatFileChunks should check that the source file to be merged 
> is a split
> 
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v1.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1
> hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> their isSplit() return false. Otherwise the following from toString should be 
> logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> From hbase side, we can specify one bulk loaded hfile per job but that 
> defeats the purpose of using DistCp.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the source file to be merged is a split

2018-10-16 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16652726#comment-16652726
 ] 

Ted Yu commented on HADOOP-15850:
-

Running the backup test against hadoop 3.0.x / 3.1.y, this is easily 
reproducible.

I was aware of HADOOP-11794 and was wondering why the per-chunk feature kicks in.

> CopyCommitter#concatFileChunks should check that the source file to be merged 
> is a split
> 
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v1.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1
> hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> their isSplit() return false. Otherwise the following from toString should be 
> logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> From hbase side, we can specify one bulk loaded hfile per job but that 
> defeats the purpose of using DistCp.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the source file to be merged is a split

2018-10-16 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15850:

Attachment: HADOOP-15850.v1.patch

> CopyCommitter#concatFileChunks should check that the source file to be merged 
> is a split
> 
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Priority: Major
> Attachments: HADOOP-15850.v1.patch, 
> testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1
> hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> their isSplit() return false. Otherwise the following from toString should be 
> logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> From hbase side, we can specify one bulk loaded hfile per job but that 
> defeats the purpose of using DistCp.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the source file to be merged is a split

2018-10-16 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15850:

Description: 
I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
hbase against hadoop 3.1.1

hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
{code}
LOG.debug("creating input listing " + listing + " , totalRecords=" + 
totalRecords);
cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
totalRecords);
{code}
For the test case, two bulk loaded hfiles are in the listing:
{code}
2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 2 
files of 10242
{code}
Later on, CopyCommitter#concatFileChunks would throw the following exception:
{code}
2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
job_local1795473782_0004
java.io.IOException: Inconsistent sequence file: current chunk file 
org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
   
160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
 length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
   
2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
 length = 5142 aclEntries = null, xAttrs = null}
  at 
org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
  at 
org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
  at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
{code}
The above warning shouldn't happen - the two bulk loaded hfiles are independent.

From the contents of the two CopyListingFileStatus instances, we can see that 
their isSplit() returns false. Otherwise the following from toString should be 
logged:
{code}
if (isSplit()) {
  sb.append(", chunkOffset = ").append(this.getChunkOffset());
  sb.append(", chunkLength = ").append(this.getChunkLength());
}
{code}
From hbase side, we can specify one bulk loaded hfile per job but that defeats 
the purpose of using DistCp.



  was:
I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
hbase against hadoop 3.1.1

hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
{code}
LOG.debug("creating input listing " + listing + " , totalRecords=" + 
totalRecords);
cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
totalRecords);
{code}
For the test case, two bulk loaded hfiles are in the listing:
{code}
2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 2 
files of 10242
{code}
Later on, CopyCommitter#concatFileChunks would throw the following exception:
{code}
2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
job_local1795473782_0004
java.io.IOException: Inconsistent sequence file: current chunk file 
org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
   
160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
 length = 5100 

[jira] [Updated] (HADOOP-15850) CopyCommitter#concatFileChunks should check that the source file to be merged is a split

2018-10-16 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15850:

Summary: CopyCommitter#concatFileChunks should check that the source file 
to be merged is a split  (was: Allow CopyCommitter to skip concatenating source 
files specified by DistCpConstants.CONF_LABEL_LISTING_FILE_PATH)

> CopyCommitter#concatFileChunks should check that the source file to be merged 
> is a split
> 
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Priority: Major
> Attachments: testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1
> hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> their isSplit() return false. Otherwise the following from toString should be 
> logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> From hbase side, we can specify one bulk loaded hfile per job but that 
> defeats the purpose of using DistCp.
> There should be a way for DistCp to specify the skipping of source file 
> concatenation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15850) Allow CopyCommitter to skip concatenating source files specified by DistCpConstants.CONF_LABEL_LISTING_FILE_PATH

2018-10-16 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16652693#comment-16652693
 ] 

Ted Yu commented on HADOOP-15850:
-

I wonder if the check for mismatching FileStatus should be refined this way:
{code}
diff --git a/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyCommitter.java b/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyCommitter.java
index 07eacb0..6177454 100644
--- a/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyCommitter.java
+++ b/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyCommitter.java
@@ -266,9 +266,9 @@ private void concatFileChunks(Configuration conf) throws IOException {
       // Two neighboring chunks have to be consecutive ones for the same
       // file, for them to be merged
       if (!srcFileStatus.getPath().equals(lastFileStatus.getPath()) ||
-          srcFileStatus.getChunkOffset() !=
+          lastFileStatus.isSplit() && (srcFileStatus.getChunkOffset() !=
               (lastFileStatus.getChunkOffset() +
-              lastFileStatus.getChunkLength())) {
+              lastFileStatus.getChunkLength()))) {
         String emsg = "Inconsistent sequence file: current " +
             "chunk file " + srcFileStatus + " doesnt match prior " +
             "entry " + lastFileStatus;
{code}
The additional clause checks that lastFileStatus represents a split.

[~ste...@apache.org] [~yzhangal] [~jojochuang] 
What do you think?

> Allow CopyCommitter to skip concatenating source files specified by 
> DistCpConstants.CONF_LABEL_LISTING_FILE_PATH
> 
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>  Components: tools/distcp
>Affects Versions: 3.1.1
>Reporter: Ted Yu
>Priority: Major
> Attachments: testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1
> hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> their isSplit() return false. Otherwise the following from toString 

[jira] [Commented] (HADOOP-15850) Allow CopyCommitter to skip concatenating source files specified by DistCpConstants.CONF_LABEL_LISTING_FILE_PATH

2018-10-14 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649638#comment-16649638
 ] 

Ted Yu commented on HADOOP-15850:
-

CopyCommitter#concatFileChunks is private.
It is not straightforward to override the method from a DistCp user's point of view.

> Allow CopyCommitter to skip concatenating source files specified by 
> DistCpConstants.CONF_LABEL_LISTING_FILE_PATH
> 
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Ted Yu
>Priority: Major
> Attachments: testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1
> hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> their isSplit() return false. Otherwise the following from toString should be 
> logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> From hbase side, we can specify one bulk loaded hfile per job but that 
> defeats the purpose of using DistCp.
> There should be a way for DistCp to specify the skipping of source file 
> concatenation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-15850) Allow CopyCommitter to skip concatenating source files specified by DistCpConstants.CONF_LABEL_LISTING_FILE_PATH

2018-10-13 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648975#comment-16648975
 ] 

Ted Yu edited comment on HADOOP-15850 at 10/13/18 2:28 PM:
---

[~yzhangal]:
When you have a chance, can you take a look?
Maybe I missed some existing DistCp functionality.

Thanks


was (Author: yuzhih...@gmail.com):
[~yzhangal]:
When you have a chance, can you take a look?
Maybe I missed some existing DistCp functionality.

> Allow CopyCommitter to skip concatenating source files specified by 
> DistCpConstants.CONF_LABEL_LISTING_FILE_PATH
> 
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Ted Yu
>Priority: Major
> Attachments: testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1
> hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> their isSplit() return false. Otherwise the following from toString should be 
> logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> From hbase side, we can specify one bulk loaded hfile per job but that 
> defeats the purpose of using DistCp.
> There should be a way for DistCp to specify the skipping of source file 
> concatenation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15850) Allow CopyCommitter to skip concatenating source files specified by DistCpConstants.CONF_LABEL_LISTING_FILE_PATH

2018-10-13 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648975#comment-16648975
 ] 

Ted Yu commented on HADOOP-15850:
-

[~yzhangal]:
When you have a chance, can you take a look?
Maybe I missed some existing DistCp functionality.

> Allow CopyCommitter to skip concatenating source files specified by 
> DistCpConstants.CONF_LABEL_LISTING_FILE_PATH
> 
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Ted Yu
>Priority: Major
> Attachments: testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1
> hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> their isSplit() return false. Otherwise the following from toString should be 
> logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> From hbase side, we can specify one bulk loaded hfile per job but that 
> defeats the purpose of using DistCp.
> There should be a way for DistCp to specify the skipping of source file 
> concatenation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15850) Allow CopyCommitter to skip concatenating source files specified by DistCpConstants.CONF_LABEL_LISTING_FILE_PATH

2018-10-13 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648974#comment-16648974
 ] 

Ted Yu commented on HADOOP-15850:
-

The quoted test output was from testIncrementalBackupWithBulkLoad-output.txt

> Allow CopyCommitter to skip concatenating source files specified by 
> DistCpConstants.CONF_LABEL_LISTING_FILE_PATH
> 
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Ted Yu
>Priority: Major
> Attachments: testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1
> hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> their isSplit() returns false. Otherwise the following from toString should be 
> logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> From hbase side, we can specify one bulk loaded hfile per job but that 
> defeats the purpose of using DistCp.
> There should be a way for DistCp to specify the skipping of source file 
> concatenation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15850) Allow CopyCommitter to skip concatenating source files specified by DistCpConstants.CONF_LABEL_LISTING_FILE_PATH

2018-10-13 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648973#comment-16648973
 ] 

Ted Yu commented on HADOOP-15850:
-

This is hbase code:

https://github.com/apache/hbase/blob/master/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/mapreduce/MapReduceBackupCopyJob.java#L153

> Allow CopyCommitter to skip concatenating source files specified by 
> DistCpConstants.CONF_LABEL_LISTING_FILE_PATH
> 
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Ted Yu
>Priority: Major
> Attachments: testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1
> hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> their isSplit() returns false. Otherwise the following from toString should be 
> logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> From hbase side, we can specify one bulk loaded hfile per job but that 
> defeats the purpose of using DistCp.
> There should be a way for DistCp to specify the skipping of source file 
> concatenation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15850) Allow CopyCommitter to skip concatenating source files specified by DistCpConstants.CONF_LABEL_LISTING_FILE_PATH

2018-10-13 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15850:

Attachment: testIncrementalBackupWithBulkLoad-output.txt

> Allow CopyCommitter to skip concatenating source files specified by 
> DistCpConstants.CONF_LABEL_LISTING_FILE_PATH
> 
>
> Key: HADOOP-15850
> URL: https://issues.apache.org/jira/browse/HADOOP-15850
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Ted Yu
>Priority: Major
> Attachments: testIncrementalBackupWithBulkLoad-output.txt
>
>
> I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
> hbase against hadoop 3.1.1
> hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
> {code}
> LOG.debug("creating input listing " + listing + " , totalRecords=" + 
> totalRecords);
> cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
> cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
> totalRecords);
> {code}
> For the test case, two bulk loaded hfiles are in the listing:
> {code}
> 2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
> hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
> 2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 
> 2 files of 10242
> {code}
> Later on, CopyCommitter#concatFileChunks would throw the following exception:
> {code}
> 2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
> job_local1795473782_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
>
> 160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
>
> 2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> {code}
> The above warning shouldn't happen - the two bulk loaded hfiles are 
> independent.
> From the contents of the two CopyListingFileStatus instances, we can see that 
> their isSplit() returns false. Otherwise the following from toString should be 
> logged:
> {code}
> if (isSplit()) {
>   sb.append(", chunkOffset = ").append(this.getChunkOffset());
>   sb.append(", chunkLength = ").append(this.getChunkLength());
> }
> {code}
> From hbase side, we can specify one bulk loaded hfile per job but that 
> defeats the purpose of using DistCp.
> There should be a way for DistCp to specify the skipping of source file 
> concatenation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15850) Allow CopyCommitter to skip concatenating source files specified by DistCpConstants.CONF_LABEL_LISTING_FILE_PATH

2018-10-13 Thread Ted Yu (JIRA)
Ted Yu created HADOOP-15850:
---

 Summary: Allow CopyCommitter to skip concatenating source files 
specified by DistCpConstants.CONF_LABEL_LISTING_FILE_PATH
 Key: HADOOP-15850
 URL: https://issues.apache.org/jira/browse/HADOOP-15850
 Project: Hadoop Common
  Issue Type: Task
Reporter: Ted Yu


I was investigating test failure of TestIncrementalBackupWithBulkLoad from 
hbase against hadoop 3.1.1

hbase MapReduceBackupCopyJob$BackupDistCp would create listing file:
{code}
LOG.debug("creating input listing " + listing + " , totalRecords=" + 
totalRecords);
cfg.set(DistCpConstants.CONF_LABEL_LISTING_FILE_PATH, listing);
cfg.setLong(DistCpConstants.CONF_LABEL_TOTAL_NUMBER_OF_RECORDS, 
totalRecords);
{code}
For the test case, two bulk loaded hfiles are in the listing:
{code}
2018-10-13 14:09:24,123 DEBUG [Time-limited test] 
mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
mapreduce.MapReduceBackupCopyJob$BackupDistCp(195): BackupDistCp : 
hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
2018-10-13 14:09:24,125 DEBUG [Time-limited test] 
mapreduce.MapReduceBackupCopyJob$BackupDistCp(197): BackupDistCp execute for 2 
files of 10242
{code}
Later on, CopyCommitter#concatFileChunks would throw the following exception:
{code}
2018-10-13 14:09:25,351 WARN  [Thread-936] mapred.LocalJobRunner$Job(590): 
job_local1795473782_0004
java.io.IOException: Inconsistent sequence file: current chunk file 
org.apache.hadoop.tools.CopyListingFileStatus@bb8826ee{hdfs://localhost:42796/user/hbase/test-data/
   
160aeab5-6bca-9f87-465e-2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/a7599081e835440eb7bf0dd3ef4fd7a5_SeqId_205_
 length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
org.apache.hadoop.tools.CopyListingFileStatus@243d544d{hdfs://localhost:42796/user/hbase/test-data/160aeab5-6bca-9f87-465e-
   
2517a0c43119/data/default/test-1539439707496/96b5a3613d52f4df1ba87a1cef20684c/f/394e6d39a9b94b148b9089c4fb967aad_SeqId_205_
 length = 5142 aclEntries = null, xAttrs = null}
  at 
org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
  at 
org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
  at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
{code}
The above warning shouldn't happen - the two bulk loaded hfiles are independent.

From the contents of the two CopyListingFileStatus instances, we can see that 
their isSplit() returns false. Otherwise the following from toString should be 
logged:
{code}
if (isSplit()) {
  sb.append(", chunkOffset = ").append(this.getChunkOffset());
  sb.append(", chunkLength = ").append(this.getChunkLength());
}
{code}
From hbase side, we can specify one bulk loaded hfile per job but that defeats 
the purpose of using DistCp.

There should be a way for DistCp to specify the skipping of source file 
concatenation.
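
A minimal sketch of the kind of switch this request is asking for. The 
configuration key and the helper class below are hypothetical, not existing 
DistCp code; they only illustrate how the concatenation step in 
CopyCommitter#commitJob could be gated:
{code}
import org.apache.hadoop.conf.Configuration;

/** Hypothetical sketch only: a committer-side switch to skip chunk concatenation. */
public class SkipConcatSketch {
  /** Illustrative key name; no such DistCp constant exists today. */
  static final String CONF_LABEL_SKIP_CONCAT = "distcp.copy.committer.skip.concat";

  /** Mirrors the existing blocksPerChunk > 0 check in CopyCommitter#commitJob. */
  static boolean shouldConcat(Configuration conf, int blocksPerChunk) {
    return blocksPerChunk > 0
        && !conf.getBoolean(CONF_LABEL_SKIP_CONCAT, false);
  }

  public static void main(String[] args) {
    Configuration conf = new Configuration(false);
    conf.setBoolean(CONF_LABEL_SKIP_CONCAT, true);   // what hbase backup would set
    System.out.println("concat file chunks? " + shouldConcat(conf, 128)); // false
  }
}
{code}
With such a switch, a caller that builds its own listing (like 
MapReduceBackupCopyJob$BackupDistCp) could opt out of reassembly without giving 
up chunked transfers elsewhere.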



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15831) Include modificationTime in the toString method of CopyListingFileStatus

2018-10-11 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16647133#comment-16647133
 ] 

Ted Yu commented on HADOOP-15831:
-

[~ste...@apache.org]:
Is there anything else I need to do ?


> Include modificationTime in the toString method of CopyListingFileStatus
> 
>
> Key: HADOOP-15831
> URL: https://issues.apache.org/jira/browse/HADOOP-15831
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Attachments: HADOOP-15831.01.patch, HADOOP-15831.02.patch, 
> HADOOP-15831.03.patch
>
>
> I was looking at a DistCp error observed in hbase backup test:
> {code}
> 2018-10-08 18:12:03,067 WARN  [Thread-933] mapred.LocalJobRunner$Job(590): 
> job_local1175594345_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@7ac56817{hdfs://localhost:41712/user/hbase/test-data/
>
> c0f6352c-cf39-bbd1-7d10-57a9c01e7ce9/data/default/test-1539022262249/be1bf5445faddb63e45726410a07917a/f/f565f49046b04eecbf8d129eac7a7b88_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@7aa4deb2{hdfs://localhost:41712/user/hbase/test-data/c0f6352c-cf39-bbd1-7d10-
>
> 57a9c01e7ce9/data/default/test-1539022262249/be1bf5445faddb63e45726410a07917a/f/41b6cb64bae64cbcac47d1fd9aae59f4_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> 2018-10-08 18:12:03,150 INFO  [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(226): Progress: 100.0% subTask: 
> 1.0 mapProgress: 1.0
> {code}
> I noticed that modificationTime was not included in the toString of 
> CopyListingFileStatus.
> I propose including modificationTime so that it is easier to tell when the 
> respective files last changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15831) Include modificationTime in the toString method of CopyListingFileStatus

2018-10-09 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15831:

Attachment: HADOOP-15831.03.patch

> Include modificationTime in the toString method of CopyListingFileStatus
> 
>
> Key: HADOOP-15831
> URL: https://issues.apache.org/jira/browse/HADOOP-15831
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Attachments: HADOOP-15831.01.patch, HADOOP-15831.02.patch, 
> HADOOP-15831.03.patch
>
>
> I was looking at a DistCp error observed in hbase backup test:
> {code}
> 2018-10-08 18:12:03,067 WARN  [Thread-933] mapred.LocalJobRunner$Job(590): 
> job_local1175594345_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@7ac56817{hdfs://localhost:41712/user/hbase/test-data/
>
> c0f6352c-cf39-bbd1-7d10-57a9c01e7ce9/data/default/test-1539022262249/be1bf5445faddb63e45726410a07917a/f/f565f49046b04eecbf8d129eac7a7b88_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@7aa4deb2{hdfs://localhost:41712/user/hbase/test-data/c0f6352c-cf39-bbd1-7d10-
>
> 57a9c01e7ce9/data/default/test-1539022262249/be1bf5445faddb63e45726410a07917a/f/41b6cb64bae64cbcac47d1fd9aae59f4_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> 2018-10-08 18:12:03,150 INFO  [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(226): Progress: 100.0% subTask: 
> 1.0 mapProgress: 1.0
> {code}
> I noticed that modificationTime was not included in the toString of 
> CopyListingFileStatus.
> I propose including modificationTime so that it is easier to tell when the 
> respective files last changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15831) Include modificationTime in the toString method of CopyListingFileStatus

2018-10-09 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15831:

Attachment: HADOOP-15831.02.patch

> Include modificationTime in the toString method of CopyListingFileStatus
> 
>
> Key: HADOOP-15831
> URL: https://issues.apache.org/jira/browse/HADOOP-15831
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Attachments: HADOOP-15831.01.patch, HADOOP-15831.02.patch
>
>
> I was looking at a DistCp error observed in hbase backup test:
> {code}
> 2018-10-08 18:12:03,067 WARN  [Thread-933] mapred.LocalJobRunner$Job(590): 
> job_local1175594345_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@7ac56817{hdfs://localhost:41712/user/hbase/test-data/
>
> c0f6352c-cf39-bbd1-7d10-57a9c01e7ce9/data/default/test-1539022262249/be1bf5445faddb63e45726410a07917a/f/f565f49046b04eecbf8d129eac7a7b88_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@7aa4deb2{hdfs://localhost:41712/user/hbase/test-data/c0f6352c-cf39-bbd1-7d10-
>
> 57a9c01e7ce9/data/default/test-1539022262249/be1bf5445faddb63e45726410a07917a/f/41b6cb64bae64cbcac47d1fd9aae59f4_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> 2018-10-08 18:12:03,150 INFO  [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(226): Progress: 100.0% subTask: 
> 1.0 mapProgress: 1.0
> {code}
> I noticed that modificationTime was not included in the toString of 
> CopyListingFileStatus.
> I propose including modificationTime so that it is easier to tell when the 
> respective files last changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15831) Include modificationTime in the toString method of CopyListingFileStatus

2018-10-09 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15831:

Assignee: Ted Yu
  Status: Patch Available  (was: Open)

> Include modificationTime in the toString method of CopyListingFileStatus
> 
>
> Key: HADOOP-15831
> URL: https://issues.apache.org/jira/browse/HADOOP-15831
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Attachments: HADOOP-15831.01.patch
>
>
> I was looking at a DistCp error observed in hbase backup test:
> {code}
> 2018-10-08 18:12:03,067 WARN  [Thread-933] mapred.LocalJobRunner$Job(590): 
> job_local1175594345_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@7ac56817{hdfs://localhost:41712/user/hbase/test-data/
>
> c0f6352c-cf39-bbd1-7d10-57a9c01e7ce9/data/default/test-1539022262249/be1bf5445faddb63e45726410a07917a/f/f565f49046b04eecbf8d129eac7a7b88_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@7aa4deb2{hdfs://localhost:41712/user/hbase/test-data/c0f6352c-cf39-bbd1-7d10-
>
> 57a9c01e7ce9/data/default/test-1539022262249/be1bf5445faddb63e45726410a07917a/f/41b6cb64bae64cbcac47d1fd9aae59f4_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> 2018-10-08 18:12:03,150 INFO  [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(226): Progress: 100.0% subTask: 
> 1.0 mapProgress: 1.0
> {code}
> I noticed that modificationTime was not included in the toString of 
> CopyListingFileStatus.
> I propose including modificationTime so that it is easier to tell when the 
> respective files last changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15831) Include modificationTime in the toString method of CopyListingFileStatus

2018-10-09 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15831:

Attachment: HADOOP-15831.01.patch

> Include modificationTime in the toString method of CopyListingFileStatus
> 
>
> Key: HADOOP-15831
> URL: https://issues.apache.org/jira/browse/HADOOP-15831
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ted Yu
>Priority: Minor
> Attachments: HADOOP-15831.01.patch
>
>
> I was looking at a DistCp error observed in hbase backup test:
> {code}
> 2018-10-08 18:12:03,067 WARN  [Thread-933] mapred.LocalJobRunner$Job(590): 
> job_local1175594345_0004
> java.io.IOException: Inconsistent sequence file: current chunk file 
> org.apache.hadoop.tools.CopyListingFileStatus@7ac56817{hdfs://localhost:41712/user/hbase/test-data/
>
> c0f6352c-cf39-bbd1-7d10-57a9c01e7ce9/data/default/test-1539022262249/be1bf5445faddb63e45726410a07917a/f/f565f49046b04eecbf8d129eac7a7b88_SeqId_205_
>  length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
> org.apache.hadoop.tools.CopyListingFileStatus@7aa4deb2{hdfs://localhost:41712/user/hbase/test-data/c0f6352c-cf39-bbd1-7d10-
>
> 57a9c01e7ce9/data/default/test-1539022262249/be1bf5445faddb63e45726410a07917a/f/41b6cb64bae64cbcac47d1fd9aae59f4_SeqId_205_
>  length = 5142 aclEntries = null, xAttrs = null}
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
>   at 
> org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
> 2018-10-08 18:12:03,150 INFO  [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(226): Progress: 100.0% subTask: 
> 1.0 mapProgress: 1.0
> {code}
> I noticed that modificationTime was not included in the toString of 
> CopyListingFileStatus.
> I propose including modificationTime so that it is easier to tell when the 
> respective files last changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15831) Include modificationTime in the toString method of CopyListingFileStatus

2018-10-08 Thread Ted Yu (JIRA)
Ted Yu created HADOOP-15831:
---

 Summary: Include modificationTime in the toString method of 
CopyListingFileStatus
 Key: HADOOP-15831
 URL: https://issues.apache.org/jira/browse/HADOOP-15831
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Ted Yu


I was looking at a DistCp error observed in hbase backup test:
{code}
2018-10-08 18:12:03,067 WARN  [Thread-933] mapred.LocalJobRunner$Job(590): 
job_local1175594345_0004
java.io.IOException: Inconsistent sequence file: current chunk file 
org.apache.hadoop.tools.CopyListingFileStatus@7ac56817{hdfs://localhost:41712/user/hbase/test-data/
   
c0f6352c-cf39-bbd1-7d10-57a9c01e7ce9/data/default/test-1539022262249/be1bf5445faddb63e45726410a07917a/f/f565f49046b04eecbf8d129eac7a7b88_SeqId_205_
 length = 5100 aclEntries  = null, xAttrs = null} doesnt match prior entry 
org.apache.hadoop.tools.CopyListingFileStatus@7aa4deb2{hdfs://localhost:41712/user/hbase/test-data/c0f6352c-cf39-bbd1-7d10-
   
57a9c01e7ce9/data/default/test-1539022262249/be1bf5445faddb63e45726410a07917a/f/41b6cb64bae64cbcac47d1fd9aae59f4_SeqId_205_
 length = 5142 aclEntries = null, xAttrs = null}
  at 
org.apache.hadoop.tools.mapred.CopyCommitter.concatFileChunks(CopyCommitter.java:276)
  at 
org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:100)
  at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:567)
2018-10-08 18:12:03,150 INFO  [Time-limited test] 
mapreduce.MapReduceBackupCopyJob$BackupDistCp(226): Progress: 100.0% subTask: 
1.0 mapProgress: 1.0
{code}
I noticed that modificationTime was not included in the toString of 
CopyListingFileStatus.

I propose including modificationTime so that it is easier to tell when the 
respective files last changed.
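
A minimal sketch of the proposed rendering, using only public FileStatus 
getters; the helper class and field order here are illustrative, the actual 
change would edit CopyListingFileStatus#toString directly:
{code}
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

/** Sketch: render a listing entry with its modification time included. */
public class ListingToStringSketch {
  static String render(FileStatus status) {
    StringBuilder sb = new StringBuilder(status.getClass().getSimpleName());
    sb.append('{').append(status.getPath());
    sb.append(" length = ").append(status.getLen());
    sb.append(" modificationTime = ").append(status.getModificationTime());
    return sb.append('}').toString();
  }

  public static void main(String[] args) {
    // Synthetic entry roughly matching the sizes in the quoted log output.
    FileStatus status = new FileStatus(5100L, false, 3, 128L << 20,
        System.currentTimeMillis(), new Path("/backupUT/f/part-0"));
    System.out.println(render(status));
  }
}
{code}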



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-10768) Optimize Hadoop RPC encryption performance

2018-05-11 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472878#comment-16472878
 ] 

Ted Yu edited comment on HADOOP-10768 at 5/12/18 2:29 AM:
--

[~jojochuang]:
If you look in the hbase master log, there should be a clue as to why the 
master couldn't finish initialization.

Cheers


was (Author: yuzhih...@gmail.com):
[~jojochuang]:
If you look in the master log, there should be a clue as to why the master 
couldn't finish initialization.

Cheers

> Optimize Hadoop RPC encryption performance
> --
>
> Key: HADOOP-10768
> URL: https://issues.apache.org/jira/browse/HADOOP-10768
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: performance, security
>Affects Versions: 3.0.0-alpha1
>Reporter: Yi Liu
>Assignee: Dapeng Sun
>Priority: Major
> Attachments: HADOOP-10768.001.patch, HADOOP-10768.002.patch, 
> HADOOP-10768.003.patch, HADOOP-10768.004.patch, HADOOP-10768.005.patch, 
> HADOOP-10768.006.patch, HADOOP-10768.007.patch, HADOOP-10768.008.patch, 
> HADOOP-10768.009.patch, Optimize Hadoop RPC encryption performance.pdf
>
>
> Hadoop RPC encryption is enabled by setting {{hadoop.rpc.protection}} to 
> "privacy". It uses the SASL {{GSSAPI}} and {{DIGEST-MD5}} mechanisms for 
> secure authentication and data protection. {{GSSAPI}} supports AES, but 
> AES-NI is not used by default, so the encryption is slow and will become a 
> bottleneck.
> After discussing with [~atm], [~tucu00] and [~umamaheswararao], we can do the 
> same optimization as in HDFS-6606: use AES-NI for more than *20x* speedup.
> On the other hand, RPC messages are small but frequent, and there may be many 
> RPC calls in one connection, so we need to set up a benchmark to measure the 
> real improvement and then make a trade-off. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-10768) Optimize Hadoop RPC encryption performance

2018-05-11 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472878#comment-16472878
 ] 

Ted Yu commented on HADOOP-10768:
-

[~jojochuang]:
If you look in the master log, there should be a clue as to why the master 
couldn't finish initialization.

Cheers

> Optimize Hadoop RPC encryption performance
> --
>
> Key: HADOOP-10768
> URL: https://issues.apache.org/jira/browse/HADOOP-10768
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: performance, security
>Affects Versions: 3.0.0-alpha1
>Reporter: Yi Liu
>Assignee: Dapeng Sun
>Priority: Major
> Attachments: HADOOP-10768.001.patch, HADOOP-10768.002.patch, 
> HADOOP-10768.003.patch, HADOOP-10768.004.patch, HADOOP-10768.005.patch, 
> HADOOP-10768.006.patch, HADOOP-10768.007.patch, HADOOP-10768.008.patch, 
> HADOOP-10768.009.patch, Optimize Hadoop RPC encryption performance.pdf
>
>
> Hadoop RPC encryption is enabled by setting {{hadoop.rpc.protection}} to 
> "privacy". It uses the SASL {{GSSAPI}} and {{DIGEST-MD5}} mechanisms for 
> secure authentication and data protection. {{GSSAPI}} supports AES, but 
> AES-NI is not used by default, so the encryption is slow and will become a 
> bottleneck.
> After discussing with [~atm], [~tucu00] and [~umamaheswararao], we can do the 
> same optimization as in HDFS-6606: use AES-NI for more than *20x* speedup.
> On the other hand, RPC messages are small but frequent, and there may be many 
> RPC calls in one connection, so we need to set up a benchmark to measure the 
> real improvement and then make a trade-off. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export

2018-04-25 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453160#comment-16453160
 ] 

Ted Yu commented on HADOOP-15392:
-

I meant running ExportSnapshot with the DEBUG log.

> S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
> 
>
> Key: HADOOP-15392
> URL: https://issues.apache.org/jira/browse/HADOOP-15392
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Voyta
>Priority: Blocker
>
> While using the HBase S3A Export Snapshot utility we started to experience 
> memory leaks in the process after a version upgrade.
> By running code analysis we traced the cause to revision 
> 6555af81a26b0b72ec3bee7034e01f5bd84b1564 that added the following static 
> reference (singleton):
> private static MetricsSystem metricsSystem = null;
> When an application uses an S3AFileSystem instance that is not closed 
> immediately, metrics accumulate in this instance and memory grows without 
> limit.
>  
> Expectation:
>  * It would be nice to have an option to disable metrics completely as this 
> is not needed for Export Snapshot utility.
>  * Usage of S3AFileSystem should not contain any static object that can grow 
> indefinitely.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export

2018-04-25 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453144#comment-16453144
 ] 

Ted Yu commented on HADOOP-15392:
-

>From S3AInstrumentation :
{code}
  public void close() {
synchronized (metricsSystemLock) {
  metricsSystem.unregisterSource(metricsSourceName);
  int activeSources = --metricsSourceActiveCounter;
  if (activeSources == 0) {
metricsSystem.publishMetricsNow();
metricsSystem.shutdown();
metricsSystem = null;
{code}
How about adding a DEBUG log with the value of activeSources so that we know 
whether the {{activeSources == 0}} case is ever reached?
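
A sketch of the suggested logging, reusing the field names from the snippet 
above; the message text is illustrative and it assumes the existing SLF4J LOG 
field in S3AInstrumentation:
{code}
public void close() {
  synchronized (metricsSystemLock) {
    metricsSystem.unregisterSource(metricsSourceName);
    int activeSources = --metricsSourceActiveCounter;
    // proposed addition: record how many sources remain, so we can tell
    // from the logs whether the activeSources == 0 branch is ever taken
    LOG.debug("Closed metrics source {}; {} source(s) still active",
        metricsSourceName, activeSources);
    if (activeSources == 0) {
      metricsSystem.publishMetricsNow();
      metricsSystem.shutdown();
      metricsSystem = null;
    }
  }
}
{code}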

> S3A Metrics in S3AInstrumentation Cause Memory Leaks in HBase Export
> 
>
> Key: HADOOP-15392
> URL: https://issues.apache.org/jira/browse/HADOOP-15392
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Voyta
>Priority: Blocker
>
> While using the HBase S3A Export Snapshot utility we started to experience 
> memory leaks in the process after a version upgrade.
> By running code analysis we traced the cause to revision 
> 6555af81a26b0b72ec3bee7034e01f5bd84b1564 that added the following static 
> reference (singleton):
> private static MetricsSystem metricsSystem = null;
> When an application uses an S3AFileSystem instance that is not closed 
> immediately, metrics accumulate in this instance and memory grows without 
> limit.
>  
> Expectation:
>  * It would be nice to have an option to disable metrics completely as this 
> is not needed for Export Snapshot utility.
>  * Usage of S3AFileSystem should not contain any static object that can grow 
> indefinitely.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15392) S3A Metrics in S3AInstrumentation Cause Memory Leaks

2018-04-18 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16442696#comment-16442696
 ] 

Ted Yu commented on HADOOP-15392:
-

hbase export tool is located at:
hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java

I don't see any metrics call in that utility.
The code referenced in the description is in S3AInstrumentation, so it seems 
this is specific to the S3A code.

> S3A Metrics in S3AInstrumentation Cause Memory Leaks
> 
>
> Key: HADOOP-15392
> URL: https://issues.apache.org/jira/browse/HADOOP-15392
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Voyta
>Priority: Major
>
> While using the HBase S3A Export Snapshot utility we started to experience 
> memory leaks in the process after a version upgrade.
> By running code analysis we traced the cause to revision 
> 6555af81a26b0b72ec3bee7034e01f5bd84b1564 that added the following static 
> reference (singleton):
> private static MetricsSystem metricsSystem = null;
> When an application uses an S3AFileSystem instance that is not closed 
> immediately, metrics accumulate in this instance and memory grows without 
> limit.
>  
> Expectation:
>  * It would be nice to have an option to disable metrics completely as this 
> is not needed for Export Snapshot utility.
>  * Usage of S3AFileSystem should not contain any static object that can grow 
> indefinitely.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15289) FileStatus.readFields() assertion incorrect

2018-03-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386776#comment-16386776
 ] 

Ted Yu commented on HADOOP-15289:
-

Thanks for the quick fix, Steve.

lgtm

> FileStatus.readFields() assertion incorrect
> ---
>
> Key: HADOOP-15289
> URL: https://issues.apache.org/jira/browse/HADOOP-15289
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Steve Loughran
>Priority: Critical
> Attachments: HADOOP-15289-001.patch
>
>
> As covered in HBASE-20123, "Backup test fails against hadoop 3; ", I think 
> the assert at the end of {{FileStatus.readFields()}} is wrong; if you run the 
> code with assert=true against a directory, an IOE will get raised.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-15290) Imprecise assertion in FileStatus w.r.t. symlink

2018-03-05 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HADOOP-15290.
-
Resolution: Duplicate

Dup of HADOOP-15289

> Imprecise assertion in FileStatus w.r.t. symlink
> 
>
> Key: HADOOP-15290
> URL: https://issues.apache.org/jira/browse/HADOOP-15290
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> In HBASE-20123, I logged the following stack trace:
> {code}
> 2018-03-03 14:46:10,858 ERROR [Time-limited test] 
> mapreduce.MapReduceBackupCopyJob$BackupDistCp(237): java.io.IOException: Path 
> hdfs://localhost:40578/backupUT/.tmp/backup_1520088356047 is not a symbolic 
> link
> java.io.IOException: Path 
> hdfs://localhost:40578/backupUT/.tmp/backup_1520088356047 is not a symbolic 
> link
>   at org.apache.hadoop.fs.FileStatus.getSymlink(FileStatus.java:338)
>   at org.apache.hadoop.fs.FileStatus.readFields(FileStatus.java:461)
>   at 
> org.apache.hadoop.tools.CopyListingFileStatus.readFields(CopyListingFileStatus.java:155)
>   at 
> org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:2308)
>   at 
> org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:163)
>   at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:91)
>   at 
> org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:90)
>   at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:84)
>   at org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:382)
>   at 
> org.apache.hadoop.hbase.backup.mapreduce.MapReduceBackupCopyJob$BackupDistCp.createInputFileListing(MapReduceBackupCopyJob.java:297)
> {code}
> [~ste...@apache.org] pointed out that the assertion in FileStatus.java is not 
> accurate:
> {code}
> assert (isDirectory() && getSymlink() == null) || !isDirectory();
> {code}
> {quote}
> It's assuming that getSymlink() returns null if there is no symlink, but 
> instead it raises an exception.
> {quote}
> Steve proposed the following replacement:
> {code}
> assert !(isDirectory() && isSymlink());
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15290) Imprecise assertion in FileStatus w.r.t. symlink

2018-03-05 Thread Ted Yu (JIRA)
Ted Yu created HADOOP-15290:
---

 Summary: Imprecise assertion in FileStatus w.r.t. symlink
 Key: HADOOP-15290
 URL: https://issues.apache.org/jira/browse/HADOOP-15290
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ted Yu


In HBASE-20123, I logged the following stack trace:
{code}
2018-03-03 14:46:10,858 ERROR [Time-limited test] 
mapreduce.MapReduceBackupCopyJob$BackupDistCp(237): java.io.IOException: Path 
hdfs://localhost:40578/backupUT/.tmp/backup_1520088356047 is not a symbolic link
java.io.IOException: Path 
hdfs://localhost:40578/backupUT/.tmp/backup_1520088356047 is not a symbolic link
  at org.apache.hadoop.fs.FileStatus.getSymlink(FileStatus.java:338)
  at org.apache.hadoop.fs.FileStatus.readFields(FileStatus.java:461)
  at 
org.apache.hadoop.tools.CopyListingFileStatus.readFields(CopyListingFileStatus.java:155)
  at 
org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:2308)
  at 
org.apache.hadoop.tools.CopyListing.validateFinalListing(CopyListing.java:163)
  at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:91)
  at 
org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:90)
  at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:84)
  at org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:382)
  at 
org.apache.hadoop.hbase.backup.mapreduce.MapReduceBackupCopyJob$BackupDistCp.createInputFileListing(MapReduceBackupCopyJob.java:297)
{code}
[~ste...@apache.org] pointed out that the assertion in FileStatus.java is not 
accurate:
{code}
assert (isDirectory() && getSymlink() == null) || !isDirectory();
{code}
{quote}
It's assuming that getSymlink() returns null if there is no symlink, but 
instead it raises an exception.
{quote}
Steve proposed the following replacement:
{code}
assert !(isDirectory() && isSymlink());
{code}
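
For illustration, a small self-contained snippet (run with -ea) showing why the 
original form blows up on a plain directory entry while the isSymlink()-based 
form evaluates safely; the class name and values are ours:
{code}
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

public class SymlinkAssertDemo {
  public static void main(String[] args) {
    // A plain directory entry: isDirectory() == true, no symlink set.
    FileStatus dir = new FileStatus(0L, true, 1, 0L, 0L, new Path("/backupUT/.tmp"));

    // Old assertion: getSymlink() throws IOException for a non-symlink entry,
    // so the check itself fails with "Path ... is not a symbolic link":
    //   assert (dir.isDirectory() && dir.getSymlink() == null) || !dir.isDirectory();

    // Proposed form evaluates safely via isSymlink():
    assert !(dir.isDirectory() && dir.isSymlink());
    System.out.println("isSymlink = " + dir.isSymlink());
  }
}
{code}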



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15051) FSDataOutputStream returned by LocalFileSystem#createNonRecursive doesn't have hflush capability

2017-11-17 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15051:

Summary: FSDataOutputStream returned by LocalFileSystem#createNonRecursive 
doesn't have hflush capability  (was: FSDataOutputStream returned by 
LocalFileSystem#createNonRecursive doesn't support hflush)

> FSDataOutputStream returned by LocalFileSystem#createNonRecursive doesn't 
> have hflush capability
> 
>
> Key: HADOOP-15051
> URL: https://issues.apache.org/jira/browse/HADOOP-15051
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: Ted Yu
>
> See HBASE-19289 for background information.
> Here is related hbase code (fs is instance of LocalFileSystem):
> {code}
> this.output = fs.createNonRecursive(path, overwritable, bufferSize, 
> replication, blockSize,
>   null);
> // TODO Be sure to add a check for hsync if this branch includes 
> HBASE-19024
> if (!(CommonFSUtils.hasCapability(output, "hflush"))) {
>   throw new StreamLacksCapabilityException("hflush");
> {code}
> StreamCapabilities is used to poll "hflush" capability.
> [~busbey] suggested fixing this in hadoop.
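
One possible shape of a fix, sketched only to show the contract hbase is 
probing for: wrap the local stream so that hasCapability("hflush") and 
hasCapability("hsync") answer true and map them onto flush() and 
FileDescriptor.sync(). This is not the actual Hadoop patch, just an 
illustration of the StreamCapabilities/Syncable pairing:
{code}
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import org.apache.hadoop.fs.StreamCapabilities;
import org.apache.hadoop.fs.Syncable;

/** Sketch: a local output stream that advertises hflush/hsync capabilities. */
public class LocalSyncableOutputStream extends OutputStream
    implements Syncable, StreamCapabilities {
  private final FileOutputStream out;

  public LocalSyncableOutputStream(FileOutputStream out) {
    this.out = out;
  }

  @Override public void write(int b) throws IOException { out.write(b); }
  @Override public void write(byte[] b, int off, int len) throws IOException {
    out.write(b, off, len);
  }
  @Override public void flush() throws IOException { out.flush(); }
  @Override public void close() throws IOException { out.close(); }

  @Override public void hflush() throws IOException {
    out.flush();                 // push user-space buffers to the OS
  }

  @Override public void hsync() throws IOException {
    out.flush();
    out.getFD().sync();          // force the data to the local disk
  }

  @Override public boolean hasCapability(String capability) {
    return "hflush".equalsIgnoreCase(capability)
        || "hsync".equalsIgnoreCase(capability);
  }
}
{code}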



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15051) FSDataOutputStream returned by LocalFileSystem#createNonRecursive doesn't support hflush

2017-11-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16257581#comment-16257581
 ] 

Ted Yu commented on HADOOP-15051:
-

[~andrew.wang]:
FYI


> FSDataOutputStream returned by LocalFileSystem#createNonRecursive doesn't 
> support hflush
> 
>
> Key: HADOOP-15051
> URL: https://issues.apache.org/jira/browse/HADOOP-15051
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: Ted Yu
>
> See HBASE-19289 for background information.
> Here is related hbase code (fs is instance of LocalFileSystem):
> {code}
> this.output = fs.createNonRecursive(path, overwritable, bufferSize, 
> replication, blockSize,
>   null);
> // TODO Be sure to add a check for hsync if this branch includes 
> HBASE-19024
> if (!(CommonFSUtils.hasCapability(output, "hflush"))) {
>   throw new StreamLacksCapabilityException("hflush");
> {code}
> StreamCapabilities is used to poll "hflush" capability.
> [~busbey] suggested fixing this in hadoop.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15051) FSDataOutputStream returned by LocalFileSystem#createNonRecursive doesn't support hflush

2017-11-17 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15051:

Description: 
See HBASE-19289 for background information.
Here is related hbase code (fs is instance of LocalFileSystem):
{code}
this.output = fs.createNonRecursive(path, overwritable, bufferSize, 
replication, blockSize,
  null);
// TODO Be sure to add a check for hsync if this branch includes HBASE-19024
if (!(CommonFSUtils.hasCapability(output, "hflush"))) {
  throw new StreamLacksCapabilityException("hflush");
{code}
StreamCapabilities is used to poll "hflush" capability.

[~busbey] suggested fixing this in hadoop.

  was:
See HBASE-19289 for background information.

[~busbey] suggested fixing this in hadoop.


> FSDataOutputStream returned by LocalFileSystem#createNonRecursive doesn't 
> support hflush
> 
>
> Key: HADOOP-15051
> URL: https://issues.apache.org/jira/browse/HADOOP-15051
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: Ted Yu
>
> See HBASE-19289 for background information.
> Here is related hbase code (fs is instance of LocalFileSystem):
> {code}
> this.output = fs.createNonRecursive(path, overwritable, bufferSize, 
> replication, blockSize,
>   null);
> // TODO Be sure to add a check for hsync if this branch includes 
> HBASE-19024
> if (!(CommonFSUtils.hasCapability(output, "hflush"))) {
>   throw new StreamLacksCapabilityException("hflush");
> {code}
> StreamCapabilities is used to poll "hflush" capability.
> [~busbey] suggested fixing this in hadoop.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15051) FSDataOutputStream returned by LocalFileSystem#createNonRecursive doesn't support hflush

2017-11-17 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15051:

Description: 
See HBASE-19289 for background information.

[~busbey] suggested fixing this in hadoop.

> FSDataOutputStream returned by LocalFileSystem#createNonRecursive doesn't 
> support hflush
> 
>
> Key: HADOOP-15051
> URL: https://issues.apache.org/jira/browse/HADOOP-15051
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: Ted Yu
>
> See HBASE-19289 for background information.
> [~busbey] suggested fixing this in hadoop.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15051) FSDataOutputStream returned by LocalFileSystem#createNonRecursive doesn

2017-11-17 Thread Ted Yu (JIRA)
Ted Yu created HADOOP-15051:
---

 Summary: FSDataOutputStream returned by 
LocalFileSystem#createNonRecursive doesn
 Key: HADOOP-15051
 URL: https://issues.apache.org/jira/browse/HADOOP-15051
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0-beta1
Reporter: Ted Yu






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15051) FSDataOutputStream returned by LocalFileSystem#createNonRecursive doesn't support hflush

2017-11-17 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-15051:

Summary: FSDataOutputStream returned by LocalFileSystem#createNonRecursive 
doesn't support hflush  (was: FSDataOutputStream returned by 
LocalFileSystem#createNonRecursive doesn)

> FSDataOutputStream returned by LocalFileSystem#createNonRecursive doesn't 
> support hflush
> 
>
> Key: HADOOP-15051
> URL: https://issues.apache.org/jira/browse/HADOOP-15051
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: Ted Yu
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13866) Upgrade netty-all to 4.1.1.Final

2017-10-17 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-13866:

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

> Upgrade netty-all to 4.1.1.Final
> 
>
> Key: HADOOP-13866
> URL: https://issues.apache.org/jira/browse/HADOOP-13866
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13866.v1.patch, HADOOP-13866.v2.patch, 
> HADOOP-13866.v3.patch, HADOOP-13866.v4.patch, HADOOP-13866.v6.patch, 
> HADOOP-13866.v7.patch, HADOOP-13866.v8.patch, HADOOP-13866.v8.patch, 
> HADOOP-13866.v8.patch, HADOOP-13866.v9.patch
>
>
> netty-all 4.1.1.Final is a stable release which we should upgrade to.
> See bottom of HADOOP-12927 for related discussion.
> This issue was discovered since hbase 2.0 uses 4.1.1.Final of netty.
> When launching a mapreduce job from hbase, /grid/0/hadoop/yarn/local/  
> usercache/hbase/appcache/application_1479850535804_0008/container_e01_1479850535804_0008_01_05/mr-framework/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar
>  (from hdfs) is ahead of the 4.1.1.Final jar (from hbase) on the classpath, 
> resulting in the following exception:
> {code}
> 2016-12-01 20:17:26,678 WARN [Default-IPC-NioEventLoopGroup-1-1] 
> io.netty.util.concurrent.DefaultPromise: An exception was thrown by 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete()
> java.lang.NoSuchMethodError: 
> io.netty.buffer.ByteBuf.retainedDuplicate()Lio/netty/buffer/ByteBuf;
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:272)
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:262)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
> at 
> io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14930) Upgrade Jetty to 9.4 version

2017-10-13 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-14930:

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

> Upgrade Jetty to 9.4 version
> 
>
> Key: HADOOP-14930
> URL: https://issues.apache.org/jira/browse/HADOOP-14930
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Bharat Viswanadham
> Attachments: HADOOP-14930.00.patch
>
>
> Currently 9.3.19.v20170502 is used.
> In hbase 2.0+, 9.4.6.v20170531 is used.
> When starting a mini dfs cluster in hbase unit tests, we get the following:
> {code}
> java.lang.NoSuchMethodError: 
> org.eclipse.jetty.server.session.SessionHandler.getSessionManager()Lorg/eclipse/jetty/server/SessionManager;
>   at 
> org.apache.hadoop.http.HttpServer2.initializeWebServer(HttpServer2.java:548)
>   at org.apache.hadoop.http.HttpServer2.(HttpServer2.java:529)
>   at org.apache.hadoop.http.HttpServer2.(HttpServer2.java:119)
>   at org.apache.hadoop.http.HttpServer2$Builder.build(HttpServer2.java:415)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:157)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:887)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:723)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:949)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:928)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1637)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1277)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1046)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:921)
> {code}
> This issue is to upgrade Jetty to 9.4 version



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14895) Consider exposing SimpleCopyListing#computeSourceRootPath() for downstream project

2017-10-11 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200708#comment-16200708
 ] 

Ted Yu commented on HADOOP-14895:
-

The previous attempt was for Hadoop3 alpha 4.

This is w.r.t. using DistCp.

> Consider exposing SimpleCopyListing#computeSourceRootPath() for downstream 
> project
> --
>
> Key: HADOOP-14895
> URL: https://issues.apache.org/jira/browse/HADOOP-14895
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ted Yu
>
> Over in HBASE-18843, [~vrodionov] needs to override 
> SimpleCopyListing#computeSourceRootPath() .
> Since the method is private, some duplicated code appears in hbase.
> We should consider exposing SimpleCopyListing#computeSourceRootPath() so that 
> its behavior can be overridden.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-10642) Provide option to limit heap memory consumed by dynamic metrics2 metrics

2017-10-11 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HADOOP-10642.
-
Resolution: Later

> Provide option to limit heap memory consumed by dynamic metrics2 metrics
> 
>
> Key: HADOOP-10642
> URL: https://issues.apache.org/jira/browse/HADOOP-10642
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Ted Yu
>
> User sunweiei provided the following jmap output in HBase 0.96 deployment:
> {code}
>  num #instances #bytes  class name
> --
>1:  14917882 3396492464  [C
>2:   1996994 2118021808  [B
>3:  43341650 1733666000  java.util.LinkedHashMap$Entry
>4:  14453983 1156550896  [Ljava.util.HashMap$Entry;
>5:  14446577  924580928  
> org.apache.hadoop.metrics2.lib.Interns$CacheWith2Keys$2
> {code}
> Heap consumption by Interns$CacheWith2Keys$2 (and indirectly by [C) could be 
> due to calls to Interns.info() in DynamicMetricsRegistry which was cloned off 
> metrics2/lib/MetricsRegistry.java.
> This scenario would arise when large number of regions are tracked through 
> metrics2 dynamically.
> Interns class doesn't provide API to remove entries in its internal Map.
> One solution is to provide an option that allows skipping calls to 
> Interns.info() in metrics2/lib/MetricsRegistry.java
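For illustration only, a minimal sketch of the kind of opt-out described above, assuming a boolean flag fed from a new (hypothetical) config key; the factory class and names below are made up and not part of any patch:

{code:java}
import org.apache.hadoop.metrics2.MetricsInfo;
import org.apache.hadoop.metrics2.lib.Interns;

// Hypothetical helper: when interning is disabled, metric info objects are
// created directly (and become garbage-collectable once a region goes away)
// instead of being cached forever in Interns' internal map.
final class MetricsInfoFactory {
  private final boolean internNames; // would come from a new config key

  MetricsInfoFactory(boolean internNames) {
    this.internNames = internNames;
  }

  MetricsInfo info(final String name, final String description) {
    if (internNames) {
      return Interns.info(name, description); // cached for the JVM lifetime
    }
    return new MetricsInfo() { // plain, uncached implementation
      @Override public String name() { return name; }
      @Override public String description() { return description; }
    };
  }
}
{code}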



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14942) DistCp#cleanup() should check whether jobFS is null

2017-10-10 Thread Ted Yu (JIRA)
Ted Yu created HADOOP-14942:
---

 Summary: DistCp#cleanup() should check whether jobFS is null
 Key: HADOOP-14942
 URL: https://issues.apache.org/jira/browse/HADOOP-14942
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ted Yu
Priority: Minor


Over in HBASE-18975, we observed the following:
{code}
2017-10-10 17:22:53,211 DEBUG [main] mapreduce.MapReduceBackupCopyJob(313): 
Doing COPY_TYPE_DISTCP
2017-10-10 17:22:53,272 DEBUG [main] mapreduce.MapReduceBackupCopyJob(322): 
DistCp options: [hdfs://localhost:55247/backupUT/.tmp/backup_1507681285309, 
hdfs://localhost:55247/   backupUT]
2017-10-10 17:22:53,283 ERROR [main] tools.DistCp(167): Exception encountered
java.lang.reflect.InvocationTargetException
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at 
org.apache.hadoop.hbase.backup.mapreduce.MapReduceBackupCopyJob$BackupDistCp.execute(MapReduceBackupCopyJob.java:234)
  at org.apache.hadoop.tools.DistCp.run(DistCp.java:153)
  at 
org.apache.hadoop.hbase.backup.mapreduce.MapReduceBackupCopyJob.copy(MapReduceBackupCopyJob.java:331)
  at 
org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.incrementalCopyHFiles(IncrementalTableBackupClient.java:286)
...
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.tools.DistCp.cleanup(DistCp.java:460)
  ... 45 more
{code}
The NullPointerException came from the second line below:
{code}
  if (metaFolder == null) return;

  jobFS.delete(metaFolder, true);
{code}
in which case jobFS was null.
A check against null should be added.
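For illustration, a sketch of the kind of guard being proposed (the method structure below is an approximation of DistCp#cleanup(), not the actual patch):

{code:java}
private synchronized void cleanup() {
  try {
    if (metaFolder == null) {
      return;
    }
    // jobFS may never have been assigned if execution failed early
    // (as in the stack trace above), so it needs its own null check.
    if (jobFS != null) {
      jobFS.delete(metaFolder, true);
    }
  } catch (IOException e) {
    LOG.error("Unable to cleanup meta folder: " + metaFolder, e);
  }
}
{code}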



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14043) Shade netty 4 dependency in hadoop-hdfs

2017-10-09 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16197724#comment-16197724
 ] 

Ted Yu commented on HADOOP-14043:
-

Not critical for 2.9.0

> Shade netty 4 dependency in hadoop-hdfs
> ---
>
> Key: HADOOP-14043
> URL: https://issues.apache.org/jira/browse/HADOOP-14043
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 2.8.0
>Reporter: Ted Yu
>Priority: Critical
>
> During review of HADOOP-13866, [~andrew.wang] mentioned considering  shading 
> netty before putting the fix into branch-2.
> This would give users better experience when upgrading hadoop.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14930) Upgrade Jetty to 9.4 version

2017-10-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193694#comment-16193694
 ] 

Ted Yu commented on HADOOP-14930:
-

lgtm

> Upgrade Jetty to 9.4 version
> 
>
> Key: HADOOP-14930
> URL: https://issues.apache.org/jira/browse/HADOOP-14930
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Bharat Viswanadham
> Attachments: HADOOP-14930.00.patch
>
>
> Currently 9.3.19.v20170502 is used.
> In hbase 2.0+, 9.4.6.v20170531 is used.
> When starting mini dfs cluster in hbase unit tests, we get the following:
> {code}
> java.lang.NoSuchMethodError: 
> org.eclipse.jetty.server.session.SessionHandler.getSessionManager()Lorg/eclipse/jetty/server/SessionManager;
>   at 
> org.apache.hadoop.http.HttpServer2.initializeWebServer(HttpServer2.java:548)
>   at org.apache.hadoop.http.HttpServer2.<init>(HttpServer2.java:529)
>   at org.apache.hadoop.http.HttpServer2.<init>(HttpServer2.java:119)
>   at org.apache.hadoop.http.HttpServer2$Builder.build(HttpServer2.java:415)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:157)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:887)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:723)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:949)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:928)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1637)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1277)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1046)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:921)
> {code}
> This issue is to upgrade Jetty to 9.4 version



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14178) Move Mockito up to version 2.x

2017-10-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193472#comment-16193472
 ] 

Ted Yu commented on HADOOP-14178:
-

Logged HDFS-12599

> Move Mockito up to version 2.x
> --
>
> Key: HADOOP-14178
> URL: https://issues.apache.org/jira/browse/HADOOP-14178
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Akira Ajisaka
>
> I don't know when Hadoop picked up Mockito, but it has been frozen at 1.8.5 
> since the switch to maven in 2011. 
> Mockito is now at version 2.1, [with lots of Java 8 
> support|https://github.com/mockito/mockito/wiki/What%27s-new-in-Mockito-2]. 
> That' s not just defining actions as closures, but in supporting Optional 
> types, mocking methods in interfaces, etc. 
> It's only used for testing, and, *provided there aren't regressions*, cost of 
> upgrade is low. The good news: test tools usually come with good test 
> coverage. The bad: mockito does go deep into java bytecodes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14178) Move Mockito up to version 2.x

2017-10-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193304#comment-16193304
 ] 

Ted Yu commented on HADOOP-14178:
-

However mockito 1.10.19 doesn't have it.
MiniDFSCluster would use 1.10.19 in hbase tests.

HBASE-18925 is upgrading to mockito 2.1.0 :
{code}
-1.10.19
+2.1.0
{code}

> Move Mockito up to version 2.x
> --
>
> Key: HADOOP-14178
> URL: https://issues.apache.org/jira/browse/HADOOP-14178
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Akira Ajisaka
>
> I don't know when Hadoop picked up Mockito, but it has been frozen at 1.8.5 
> since the switch to maven in 2011. 
> Mockito is now at version 2.1, [with lots of Java 8 
> support|https://github.com/mockito/mockito/wiki/What%27s-new-in-Mockito-2]. 
> That' s not just defining actions as closures, but in supporting Optional 
> types, mocking methods in interfaces, etc. 
> It's only used for testing, and, *provided there aren't regressions*, cost of 
> upgrade is low. The good news: test tools usually come with good test 
> coverage. The bad: mockito does go deep into java bytecodes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-14178) Move Mockito up to version 2.x

2017-10-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193128#comment-16193128
 ] 

Ted Yu edited comment on HADOOP-14178 at 10/5/17 4:25 PM:
--

Is this going to hadoop-3 beta / GA ?

If not, how about upgrading to 1.10.19 for hadoop-3 ?

I got the following when starting hadoop-3 mini dfs cluster within hbase unit 
test:
{code}
2017-10-05 08:31:26,525 WARN  [main] hbase.HBaseTestingUtility(1077): error 
starting mini dfs
java.lang.NoClassDefFoundError: org/mockito/stubbing/Answer
  at org.apache.hadoop.hdfs.MiniDFSCluster.shouldWait(MiniDFSCluster.java:2668)
  at org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2564)
  at org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2607)
  at 
org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1667)
  at 
org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:874)
  at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:769)
  at 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniDFSCluster(HBaseTestingUtility.java:661)
  at 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:1075)
  at 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:953)
{code}
hbase uses mockito-all 1.10.19

If I downgrade to 1.8.5 (hadoop), hbase code won't compile:
{code}
[ERROR] 
/Users/tyu/trunk/hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestClientScanner.java:[146,62]
 cannot find symbol
[ERROR] symbol:   method 
getArgumentAt(int,java.lang.Class)
[ERROR] location: variable invocation of type 
org.mockito.invocation.InvocationOnMock
[ERROR] 
/Users/tyu/trunk/hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestClientScanner.java:[207,60]
 cannot find symbol
[ERROR] symbol:   method 
getArgumentAt(int,java.lang.Class)
[ERROR] location: variable invocation of type 
org.mockito.invocation.InvocationOnMock
{code}


was (Author: yuzhih...@gmail.com):
Is this going to hadoop-3 beta / GA ?

If not, how about upgrading to 1.10.19 for hadoop-3 ?

I got the following when starting hadoop-3 mini dfs cluster:
{code}
2017-10-05 08:31:26,525 WARN  [main] hbase.HBaseTestingUtility(1077): error 
starting mini dfs
java.lang.NoClassDefFoundError: org/mockito/stubbing/Answer
  at org.apache.hadoop.hdfs.MiniDFSCluster.shouldWait(MiniDFSCluster.java:2668)
  at org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2564)
  at org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2607)
  at 
org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1667)
  at 
org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:874)
  at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:769)
  at 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniDFSCluster(HBaseTestingUtility.java:661)
  at 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:1075)
  at 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:953)
{code}
hbase uses mockito-all 1.10.19

If I downgrade to 1.8.5 (hadoop), hbase code won't compile:
{code}
[ERROR] 
/Users/tyu/trunk/hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestClientScanner.java:[146,62]
 cannot find symbol
[ERROR] symbol:   method 
getArgumentAt(int,java.lang.Class)
[ERROR] location: variable invocation of type 
org.mockito.invocation.InvocationOnMock
[ERROR] 
/Users/tyu/trunk/hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestClientScanner.java:[207,60]
 cannot find symbol
[ERROR] symbol:   method 
getArgumentAt(int,java.lang.Class)
[ERROR] location: variable invocation of type 
org.mockito.invocation.InvocationOnMock
{code}

> Move Mockito up to version 2.x
> --
>
> Key: HADOOP-14178
> URL: https://issues.apache.org/jira/browse/HADOOP-14178
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Akira Ajisaka
>
> I don't know when Hadoop picked up Mockito, but it has been frozen at 1.8.5 
> since the switch to maven in 2011. 
> Mockito is now at version 2.1, [with lots of Java 8 
> support|https://github.com/mockito/mockito/wiki/What%27s-new-in-Mockito-2]. 
> That' s not just defining actions as closures, but in supporting Optional 
> types, mocking methods in interfaces, etc. 
> It's only used for testing, and, *provided there aren't regressions*, cost of 
> upgrade is low. The good news: test tools usually come with good test 
> coverage. The bad: mockito does go deep into java bytecodes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Commented] (HADOOP-14178) Move Mockito up to version 2.x

2017-10-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193128#comment-16193128
 ] 

Ted Yu commented on HADOOP-14178:
-

Is this going to hadoop-3 beta / GA ?

If not, how about upgrading to 1.10.19 for hadoop-3 ?

I got the following when starting hadoop-3 mini dfs cluster:
{code}
2017-10-05 08:31:26,525 WARN  [main] hbase.HBaseTestingUtility(1077): error 
starting mini dfs
java.lang.NoClassDefFoundError: org/mockito/stubbing/Answer
  at org.apache.hadoop.hdfs.MiniDFSCluster.shouldWait(MiniDFSCluster.java:2668)
  at org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2564)
  at org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2607)
  at 
org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1667)
  at 
org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:874)
  at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:769)
  at 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniDFSCluster(HBaseTestingUtility.java:661)
  at 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:1075)
  at 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:953)
{code}
hbase uses mockito-all 1.10.19

If I downgrade to 1.8.5 (hadoop), hbase code won't compile:
{code}
[ERROR] 
/Users/tyu/trunk/hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestClientScanner.java:[146,62]
 cannot find symbol
[ERROR] symbol:   method 
getArgumentAt(int,java.lang.Class)
[ERROR] location: variable invocation of type 
org.mockito.invocation.InvocationOnMock
[ERROR] 
/Users/tyu/trunk/hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestClientScanner.java:[207,60]
 cannot find symbol
[ERROR] symbol:   method 
getArgumentAt(int,java.lang.Class)
[ERROR] location: variable invocation of type 
org.mockito.invocation.InvocationOnMock
{code}
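For what it's worth, stubbing written against {{getArguments()}} compiles on 1.8.5, 1.10.19 and 2.x alike; a generic illustration (not taken from TestClientScanner):

{code:java}
import org.mockito.invocation.InvocationOnMock;
import org.mockito.stubbing.Answer;

// getArguments() exists in every Mockito version, unlike
// getArgumentAt(int, Class) which 1.8.5 lacks and 2.x renamed.
Answer<String> echoFirstArg = new Answer<String>() {
  @Override
  public String answer(InvocationOnMock invocation) {
    return (String) invocation.getArguments()[0];
  }
};
{code}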

> Move Mockito up to version 2.x
> --
>
> Key: HADOOP-14178
> URL: https://issues.apache.org/jira/browse/HADOOP-14178
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Akira Ajisaka
>
> I don't know when Hadoop picked up Mockito, but it has been frozen at 1.8.5 
> since the switch to maven in 2011. 
> Mockito is now at version 2.1, [with lots of Java 8 
> support|https://github.com/mockito/mockito/wiki/What%27s-new-in-Mockito-2]. 
> That' s not just defining actions as closures, but in supporting Optional 
> types, mocking methods in interfaces, etc. 
> It's only used for testing, and, *provided there aren't regressions*, cost of 
> upgrade is low. The good news: test tools usually come with good test 
> coverage. The bad: mockito does go deep into java bytecodes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14930) Upgrade Jetty to 9.4 version

2017-10-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193112#comment-16193112
 ] 

Ted Yu commented on HADOOP-14930:
-

hadoop uses Jetty 9.3.19.v20170502.

That was why NoSuchMethodError was encountered when the Jetty on the classpath 
is 9.4 (from hbase).

> Upgrade Jetty to 9.4 version
> 
>
> Key: HADOOP-14930
> URL: https://issues.apache.org/jira/browse/HADOOP-14930
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Bharat Viswanadham
>
> Currently 9.3.19.v20170502 is used.
> In hbase 2.0+, 9.4.6.v20170531 is used.
> When starting mini dfs cluster in hbase unit tests, we get the following:
> {code}
> java.lang.NoSuchMethodError: 
> org.eclipse.jetty.server.session.SessionHandler.getSessionManager()Lorg/eclipse/jetty/server/SessionManager;
>   at 
> org.apache.hadoop.http.HttpServer2.initializeWebServer(HttpServer2.java:548)
>   at org.apache.hadoop.http.HttpServer2.<init>(HttpServer2.java:529)
>   at org.apache.hadoop.http.HttpServer2.<init>(HttpServer2.java:119)
>   at org.apache.hadoop.http.HttpServer2$Builder.build(HttpServer2.java:415)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:157)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:887)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:723)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:949)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:928)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1637)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1277)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1046)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:921)
> {code}
> This issue is to upgrade Jetty to 9.4 version



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14930) Upgrade Jetty to 9.4 version

2017-10-05 Thread Ted Yu (JIRA)
Ted Yu created HADOOP-14930:
---

 Summary: Upgrade Jetty to 9.4 version
 Key: HADOOP-14930
 URL: https://issues.apache.org/jira/browse/HADOOP-14930
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Ted Yu


Currently 9.3.19.v20170502 is used.

In hbase 2.0+, 9.4.6.v20170531 is used.

When starting mini dfs cluster in hbase unit tests, we get the following:
{code}
java.lang.NoSuchMethodError: 
org.eclipse.jetty.server.session.SessionHandler.getSessionManager()Lorg/eclipse/jetty/server/SessionManager;
  at 
org.apache.hadoop.http.HttpServer2.initializeWebServer(HttpServer2.java:548)
  at org.apache.hadoop.http.HttpServer2.<init>(HttpServer2.java:529)
  at org.apache.hadoop.http.HttpServer2.<init>(HttpServer2.java:119)
  at org.apache.hadoop.http.HttpServer2$Builder.build(HttpServer2.java:415)
  at 
org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:157)
  at 
org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:887)
  at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:723)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:949)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:928)
  at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1637)
  at 
org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1277)
  at 
org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1046)
  at 
org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:921)
{code}
This issue is to upgrade Jetty to 9.4 version



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-10202) OK_JAVADOC_WARNINGS is out of date, leading to negative javadoc warning count

2017-09-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HADOOP-10202.
-
Resolution: Cannot Reproduce

> OK_JAVADOC_WARNINGS is out of date, leading to negative javadoc warning count
> -
>
> Key: HADOOP-10202
> URL: https://issues.apache.org/jira/browse/HADOOP-10202
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Ted Yu
>Priority: Minor
>
> From https://builds.apache.org/job/PreCommit-HDFS-Build/5813//testReport/ :
> {code}
> -1 javadoc. The javadoc tool appears to have generated -14 warning messages.
> {code}
> OK_JAVADOC_WARNINGS should be updated.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-10364) JsonGenerator in Configuration#dumpConfiguration() is not closed

2017-09-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-10364:

Status: Open  (was: Patch Available)

> JsonGenerator in Configuration#dumpConfiguration() is not closed
> 
>
> Key: HADOOP-10364
> URL: https://issues.apache.org/jira/browse/HADOOP-10364
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Reporter: Ted Yu
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-10364.2.patch, HADOOP-10364.patch
>
>
> {code}
> JsonGenerator dumpGenerator = dumpFactory.createJsonGenerator(out);
> {code}
> dumpGenerator is not closed in Configuration#dumpConfiguration()
> Looking at the source code of 
> org.codehaus.jackson.impl.WriterBasedGenerator#close(), there is more than 
> flushing the buffer.
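A minimal sketch of the kind of fix implied above, closing the generator in a finally block (the surrounding dump logic is elided; note that, depending on the factory's AUTO_CLOSE_TARGET setting, close() may also close the caller-supplied Writer, which a real patch would have to account for):

{code:java}
JsonGenerator dumpGenerator = dumpFactory.createJsonGenerator(out);
try {
  // ... existing logic that writes the configuration properties as JSON ...
  dumpGenerator.flush();
} finally {
  // close() releases the generator's internal buffers in
  // WriterBasedGenerator; flush() alone does not.
  dumpGenerator.close();
}
{code}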



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-11229) JobStoryProducer is not closed upon return from Gridmix#setupDistCacheEmulation()

2017-09-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-11229:

Labels: BB2015-05-TBR gridmix  (was: BB2015-05-TBR)

> JobStoryProducer is not closed upon return from 
> Gridmix#setupDistCacheEmulation()
> -
>
> Key: HADOOP-11229
> URL: https://issues.apache.org/jira/browse/HADOOP-11229
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: skrho
>Priority: Minor
>  Labels: BB2015-05-TBR, gridmix
> Attachments: HADOOP-11229_001.patch, HADOOP-11229_002.patch, 
> HADOOP-11229.v3.patch
>
>
> Here is related code:
> {code}
>   JobStoryProducer jsp = createJobStoryProducer(traceIn, conf);
>   exitCode = distCacheEmulator.setupGenerateDistCacheData(jsp);
> {code}
> jsp should be closed upon return from setupDistCacheEmulation().
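A sketch of the suggested cleanup, wrapping the call so the producer is closed even on failure (paraphrasing the snippet above; not the actual patch):

{code:java}
JobStoryProducer jsp = createJobStoryProducer(traceIn, conf);
try {
  exitCode = distCacheEmulator.setupGenerateDistCacheData(jsp);
} finally {
  jsp.close(); // release the trace input even if setup fails
}
{code}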



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-10642) Provide option to limit heap memory consumed by dynamic metrics2 metrics

2017-09-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-10642:

Description: 
User sunweiei provided the following jmap output in HBase 0.96 deployment:
{code}
 num #instances #bytes  class name
--
   1:  14917882 3396492464  [C
   2:   1996994 2118021808  [B
   3:  43341650 1733666000  java.util.LinkedHashMap$Entry
   4:  14453983 1156550896  [Ljava.util.HashMap$Entry;
   5:  14446577  924580928  
org.apache.hadoop.metrics2.lib.Interns$CacheWith2Keys$2
{code}
Heap consumption by Interns$CacheWith2Keys$2 (and indirectly by [C) could be 
due to calls to Interns.info() in DynamicMetricsRegistry which was cloned off 
metrics2/lib/MetricsRegistry.java.

This scenario would arise when large number of regions are tracked through 
metrics2 dynamically.
Interns class doesn't provide API to remove entries in its internal Map.

One solution is to provide an option that allows skipping calls to 
Interns.info() in metrics2/lib/MetricsRegistry.java

  was:
User sunweiei provided the following jmap output in HBase 0.96 deployment:

{code}
 num #instances #bytes  class name
--
   1:  14917882 3396492464  [C
   2:   1996994 2118021808  [B
   3:  43341650 1733666000  java.util.LinkedHashMap$Entry
   4:  14453983 1156550896  [Ljava.util.HashMap$Entry;
   5:  14446577  924580928  
org.apache.hadoop.metrics2.lib.Interns$CacheWith2Keys$2
{code}
Heap consumption by Interns$CacheWith2Keys$2 (and indirectly by [C) could be 
due to calls to Interns.info() in DynamicMetricsRegistry which was cloned off 
metrics2/lib/MetricsRegistry.java.

This scenario would arise when large number of regions are tracked through 
metrics2 dynamically.
Interns class doesn't provide API to remove entries in its internal Map.

One solution is to provide an option that allows skipping calls to 
Interns.info() in metrics2/lib/MetricsRegistry.java


> Provide option to limit heap memory consumed by dynamic metrics2 metrics
> 
>
> Key: HADOOP-10642
> URL: https://issues.apache.org/jira/browse/HADOOP-10642
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Ted Yu
>
> User sunweiei provided the following jmap output in HBase 0.96 deployment:
> {code}
>  num #instances #bytes  class name
> --
>1:  14917882 3396492464  [C
>2:   1996994 2118021808  [B
>3:  43341650 1733666000  java.util.LinkedHashMap$Entry
>4:  14453983 1156550896  [Ljava.util.HashMap$Entry;
>5:  14446577  924580928  
> org.apache.hadoop.metrics2.lib.Interns$CacheWith2Keys$2
> {code}
> Heap consumption by Interns$CacheWith2Keys$2 (and indirectly by [C) could be 
> due to calls to Interns.info() in DynamicMetricsRegistry which was cloned off 
> metrics2/lib/MetricsRegistry.java.
> This scenario would arise when large number of regions are tracked through 
> metrics2 dynamically.
> Interns class doesn't provide API to remove entries in its internal Map.
> One solution is to provide an option that allows skipping calls to 
> Interns.info() in metrics2/lib/MetricsRegistry.java



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-12724) Let BufferedFSInputStream implement CanUnbuffer

2017-09-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-12724:

Description: 
When trying to determine reason for test failure over in HBASE-9393, I saw the 
following exception:
{code}
testSeekTo[4](org.apache.hadoop.hbase.io.hfile.TestSeekTo)  Time elapsed: 0.033 
sec  <<< ERROR!
java.lang.UnsupportedOperationException: this stream does not support 
unbuffering.
at 
org.apache.hadoop.fs.FSDataInputStream.unbuffer(FSDataInputStream.java:229)
at 
org.apache.hadoop.fs.FSDataInputStream.unbuffer(FSDataInputStream.java:227)
at 
org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:518)
at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:562)
at 
org.apache.hadoop.hbase.io.hfile.TestSeekTo.testSeekToInternals(TestSeekTo.java:307)
at 
org.apache.hadoop.hbase.io.hfile.TestSeekTo.testSeekTo(TestSeekTo.java:298)
{code}
Here is the cause:
{code}
java.lang.ClassCastException: org.apache.hadoop.fs.BufferedFSInputStream cannot 
be cast to org.apache.hadoop.fs.CanUnbuffer
{code}
See the comments starting with 
https://issues.apache.org/jira/browse/HBASE-9393?focusedCommentId=15105939=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15105939
 for background on the HBase patch.

This issue is to make BufferedFSInputStream implement CanUnbuffer.

This would benefit hbase unit tests.

Thanks to [~cmccabe] for discussion.

  was:
When trying to determine reason for test failure over in HBASE-9393, I saw the 
following exception:

{code}
testSeekTo[4](org.apache.hadoop.hbase.io.hfile.TestSeekTo)  Time elapsed: 0.033 
sec  <<< ERROR!
java.lang.UnsupportedOperationException: this stream does not support 
unbuffering.
at 
org.apache.hadoop.fs.FSDataInputStream.unbuffer(FSDataInputStream.java:229)
at 
org.apache.hadoop.fs.FSDataInputStream.unbuffer(FSDataInputStream.java:227)
at 
org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:518)
at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:562)
at 
org.apache.hadoop.hbase.io.hfile.TestSeekTo.testSeekToInternals(TestSeekTo.java:307)
at 
org.apache.hadoop.hbase.io.hfile.TestSeekTo.testSeekTo(TestSeekTo.java:298)
{code}
Here is the cause:
{code}
java.lang.ClassCastException: org.apache.hadoop.fs.BufferedFSInputStream cannot 
be cast to org.apache.hadoop.fs.CanUnbuffer
{code}
See the comments starting with 
https://issues.apache.org/jira/browse/HBASE-9393?focusedCommentId=15105939=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15105939
 for background on the HBase patch.

This issue is to make BufferedFSInputStream implement CanUnbuffer.

This would benefit hbase unit tests.

Thanks to [~cmccabe] for discussion.


> Let BufferedFSInputStream implement CanUnbuffer
> ---
>
> Key: HADOOP-12724
> URL: https://issues.apache.org/jira/browse/HADOOP-12724
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ted Yu
>Priority: Minor
>  Labels: stream
>
> When trying to determine reason for test failure over in HBASE-9393, I saw 
> the following exception:
> {code}
> testSeekTo[4](org.apache.hadoop.hbase.io.hfile.TestSeekTo)  Time elapsed: 
> 0.033 sec  <<< ERROR!
> java.lang.UnsupportedOperationException: this stream does not support 
> unbuffering.
>   at 
> org.apache.hadoop.fs.FSDataInputStream.unbuffer(FSDataInputStream.java:229)
>   at 
> org.apache.hadoop.fs.FSDataInputStream.unbuffer(FSDataInputStream.java:227)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:518)
>   at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:562)
>   at 
> org.apache.hadoop.hbase.io.hfile.TestSeekTo.testSeekToInternals(TestSeekTo.java:307)
>   at 
> org.apache.hadoop.hbase.io.hfile.TestSeekTo.testSeekTo(TestSeekTo.java:298)
> {code}
> Here is the cause:
> {code}
> java.lang.ClassCastException: org.apache.hadoop.fs.BufferedFSInputStream 
> cannot be cast to org.apache.hadoop.fs.CanUnbuffer
> {code}
> See the comments starting with 
> https://issues.apache.org/jira/browse/HBASE-9393?focusedCommentId=15105939=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15105939
>  for background on the HBase patch.
> This issue is to make BufferedFSInputStream implement CanUnbuffer.
> This would benefit hbase unit tests.
> Thanks to [~cmccabe] for discussion.
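A rough sketch of the shape such a change could take, delegating unbuffer() to the wrapped stream when it supports it (simplified stand-in class, not the real BufferedFSInputStream):

{code:java}
import java.io.BufferedInputStream;
import org.apache.hadoop.fs.CanUnbuffer;
import org.apache.hadoop.fs.FSInputStream;

// Simplified illustration: implement CanUnbuffer by delegating to the
// wrapped FSInputStream when that stream itself supports unbuffering.
class UnbufferableBufferedStream extends BufferedInputStream
    implements CanUnbuffer {

  UnbufferableBufferedStream(FSInputStream in, int bufferSize) {
    super(in, bufferSize);
  }

  @Override
  public void unbuffer() {
    if (in instanceof CanUnbuffer) {      // 'in' is the wrapped stream
      ((CanUnbuffer) in).unbuffer();
    }
    // otherwise there is nothing to release; ignore rather than throw
  }
}
{code}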



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14895) Consider exposing SimpleCopyListing#computeSourceRootPath() for downstream project

2017-09-21 Thread Ted Yu (JIRA)
Ted Yu created HADOOP-14895:
---

 Summary: Consider exposing 
SimpleCopyListing#computeSourceRootPath() for downstream project
 Key: HADOOP-14895
 URL: https://issues.apache.org/jira/browse/HADOOP-14895
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Ted Yu


Over in HBASE-18843, [~vrodionov] needs to override 
SimpleCopyListing#computeSourceRootPath() .

Since the method is private, some duplicated code appears in hbase.

We should consider exposing SimpleCopyListing#computeSourceRootPath() so that 
its behavior can be overridden.
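For illustration, once the method is widened (e.g. to protected), a downstream override could look roughly like this; the subclass name is made up and the parameter list is assumed from the 2.x SimpleCopyListing code:

{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.tools.DistCpOptions;
import org.apache.hadoop.tools.SimpleCopyListing;

// Hypothetical downstream subclass, possible once
// computeSourceRootPath() is no longer private.
public class BackupCopyListing extends SimpleCopyListing {

  public BackupCopyListing(Configuration conf, Credentials credentials) {
    super(conf, credentials);
  }

  @Override
  protected Path computeSourceRootPath(FileStatus sourceStatus,
      DistCpOptions options) throws IOException {
    // backup-specific root path logic would replace or wrap this call
    return super.computeSourceRootPath(sourceStatus, options);
  }
}
{code}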



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13866) Upgrade netty-all to 4.1.1.Final

2017-05-16 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012773#comment-16012773
 ] 

Ted Yu commented on HADOOP-13866:
-

The shading discussion thread mentioned above hasn't led to a JIRA yet.
There will be better visibility once the discussion reaches a conclusion.

> Upgrade netty-all to 4.1.1.Final
> 
>
> Key: HADOOP-13866
> URL: https://issues.apache.org/jira/browse/HADOOP-13866
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13866.v1.patch, HADOOP-13866.v2.patch, 
> HADOOP-13866.v3.patch, HADOOP-13866.v4.patch, HADOOP-13866.v6.patch, 
> HADOOP-13866.v7.patch, HADOOP-13866.v8.patch, HADOOP-13866.v8.patch, 
> HADOOP-13866.v8.patch, HADOOP-13866.v9.patch
>
>
> netty-all 4.1.1.Final is stable release which we should upgrade to.
> See bottom of HADOOP-12927 for related discussion.
> This issue was discovered since hbase 2.0 uses 4.1.1.Final of netty.
> When launching mapreduce job from hbase, /grid/0/hadoop/yarn/local/  
> usercache/hbase/appcache/application_1479850535804_0008/container_e01_1479850535804_0008_01_05/mr-framework/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar
>  (from hdfs) is ahead of 4.1.1.Final jar (from hbase) on the classpath.
> Resulting in the following exception:
> {code}
> 2016-12-01 20:17:26,678 WARN [Default-IPC-NioEventLoopGroup-1-1] 
> io.netty.util.concurrent.DefaultPromise: An exception was thrown by 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete()
> java.lang.NoSuchMethodError: 
> io.netty.buffer.ByteBuf.retainedDuplicate()Lio/netty/buffer/ByteBuf;
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:272)
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:262)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
> at 
> io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13866) Upgrade netty-all to 4.1.1.Final

2017-04-18 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973537#comment-15973537
 ] 

Ted Yu commented on HADOOP-13866:
-

See this thread: 
http://search-hadoop.com/m/HBase/YGbbZFYYbSDQ3l1?subj=Re+DISCUSS+More+Shading

No HBase JIRA yet.

> Upgrade netty-all to 4.1.1.Final
> 
>
> Key: HADOOP-13866
> URL: https://issues.apache.org/jira/browse/HADOOP-13866
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: HADOOP-13866.v1.patch, HADOOP-13866.v2.patch, 
> HADOOP-13866.v3.patch, HADOOP-13866.v4.patch, HADOOP-13866.v6.patch, 
> HADOOP-13866.v7.patch, HADOOP-13866.v8.patch, HADOOP-13866.v8.patch, 
> HADOOP-13866.v8.patch, HADOOP-13866.v9.patch
>
>
> netty-all 4.1.1.Final is stable release which we should upgrade to.
> See bottom of HADOOP-12927 for related discussion.
> This issue was discovered since hbase 2.0 uses 4.1.1.Final of netty.
> When launching mapreduce job from hbase, /grid/0/hadoop/yarn/local/  
> usercache/hbase/appcache/application_1479850535804_0008/container_e01_1479850535804_0008_01_05/mr-framework/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar
>  (from hdfs) is ahead of 4.1.1.Final jar (from hbase) on the classpath.
> Resulting in the following exception:
> {code}
> 2016-12-01 20:17:26,678 WARN [Default-IPC-NioEventLoopGroup-1-1] 
> io.netty.util.concurrent.DefaultPromise: An exception was thrown by 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete()
> java.lang.NoSuchMethodError: 
> io.netty.buffer.ByteBuf.retainedDuplicate()Lio/netty/buffer/ByteBuf;
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:272)
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:262)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
> at 
> io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14222) Create specialized IOException subclass to represent closed filesystem

2017-03-23 Thread Ted Yu (JIRA)
Ted Yu created HADOOP-14222:
---

 Summary: Create specialized IOException subclass to represent 
closed filesystem
 Key: HADOOP-14222
 URL: https://issues.apache.org/jira/browse/HADOOP-14222
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Ted Yu


I was working on HBASE-17287, where the hbase master didn't recognize that the 
file system had been closed due to extended unavailability of hdfs.

Chatting with [~steve_l], he suggested creating an IOException subclass to 
represent a closed filesystem so that downstream projects don't have to rely on 
the specific exception message.

The string in the existing exception message can't be changed.
We should add a clear comment around that part to avoid breakage.
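A rough sketch of what such a marker type could look like (the class name is illustrative only, not a decision made on this issue):

{code:java}
import java.io.IOException;

/**
 * Illustrative only: a marker subclass that downstream projects could
 * catch instead of string-matching the "Filesystem closed" message.
 */
public class FileSystemClosedException extends IOException {
  private static final long serialVersionUID = 1L;

  public FileSystemClosedException(String message) {
    super(message); // keep the existing message text for compatibility
  }
}
{code}

A caller could then catch the type rather than parse {{e.getMessage()}}.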





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-14043) Shade netty 4 dependency in hadoop-hdfs

2017-03-15 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852837#comment-15852837
 ] 

Ted Yu edited comment on HADOOP-14043 at 3/15/17 3:13 PM:
--

Can the shading be done in a similar way to the following:
{code}
commit 70ca1f1e3a328b18eb4e27f7d0f328ae403342d5
Author: Andrew Wang 
Date:   Thu Dec 15 11:44:59 2016 -0800

HADOOP-11804. Shaded Hadoop client artifacts and minicluster. Contributed 
by Sean Busbey.
{code}


was (Author: yuzhih...@gmail.com):
Can the shading be done in a similar way to:
{code}
commit 70ca1f1e3a328b18eb4e27f7d0f328ae403342d5
Author: Andrew Wang 
Date:   Thu Dec 15 11:44:59 2016 -0800

HADOOP-11804. Shaded Hadoop client artifacts and minicluster. Contributed 
by Sean Busbey.
{code}

> Shade netty 4 dependency in hadoop-hdfs
> ---
>
> Key: HADOOP-14043
> URL: https://issues.apache.org/jira/browse/HADOOP-14043
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 2.8.0
>Reporter: Ted Yu
>Priority: Critical
>
> During review of HADOOP-13866, [~andrew.wang] mentioned considering  shading 
> netty before putting the fix into branch-2.
> This would give users better experience when upgrading hadoop.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-10642) Provide option to limit heap memory consumed by dynamic metrics2 metrics

2017-03-09 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-10642:

Description: 
User sunweiei provided the following jmap output in HBase 0.96 deployment:

{code}
 num #instances #bytes  class name
--
   1:  14917882 3396492464  [C
   2:   1996994 2118021808  [B
   3:  43341650 1733666000  java.util.LinkedHashMap$Entry
   4:  14453983 1156550896  [Ljava.util.HashMap$Entry;
   5:  14446577  924580928  
org.apache.hadoop.metrics2.lib.Interns$CacheWith2Keys$2
{code}
Heap consumption by Interns$CacheWith2Keys$2 (and indirectly by [C) could be 
due to calls to Interns.info() in DynamicMetricsRegistry which was cloned off 
metrics2/lib/MetricsRegistry.java.

This scenario would arise when large number of regions are tracked through 
metrics2 dynamically.
Interns class doesn't provide API to remove entries in its internal Map.

One solution is to provide an option that allows skipping calls to 
Interns.info() in metrics2/lib/MetricsRegistry.java

  was:
User sunweiei provided the following jmap output in HBase 0.96 deployment:
{code}
 num #instances #bytes  class name
--
   1:  14917882 3396492464  [C
   2:   1996994 2118021808  [B
   3:  43341650 1733666000  java.util.LinkedHashMap$Entry
   4:  14453983 1156550896  [Ljava.util.HashMap$Entry;
   5:  14446577  924580928  
org.apache.hadoop.metrics2.lib.Interns$CacheWith2Keys$2
{code}
Heap consumption by Interns$CacheWith2Keys$2 (and indirectly by [C) could be 
due to calls to Interns.info() in DynamicMetricsRegistry which was cloned off 
metrics2/lib/MetricsRegistry.java.

This scenario would arise when large number of regions are tracked through 
metrics2 dynamically.
Interns class doesn't provide API to remove entries in its internal Map.

One solution is to provide an option that allows skipping calls to 
Interns.info() in metrics2/lib/MetricsRegistry.java


> Provide option to limit heap memory consumed by dynamic metrics2 metrics
> 
>
> Key: HADOOP-10642
> URL: https://issues.apache.org/jira/browse/HADOOP-10642
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Ted Yu
>
> User sunweiei provided the following jmap output in HBase 0.96 deployment:
> {code}
>  num #instances #bytes  class name
> --
>1:  14917882 3396492464  [C
>2:   1996994 2118021808  [B
>3:  43341650 1733666000  java.util.LinkedHashMap$Entry
>4:  14453983 1156550896  [Ljava.util.HashMap$Entry;
>5:  14446577  924580928  
> org.apache.hadoop.metrics2.lib.Interns$CacheWith2Keys$2
> {code}
> Heap consumption by Interns$CacheWith2Keys$2 (and indirectly by [C) could be 
> due to calls to Interns.info() in DynamicMetricsRegistry which was cloned off 
> metrics2/lib/MetricsRegistry.java.
> This scenario would arise when large number of regions are tracked through 
> metrics2 dynamically.
> Interns class doesn't provide API to remove entries in its internal Map.
> One solution is to provide an option that allows skipping calls to 
> Interns.info() in metrics2/lib/MetricsRegistry.java



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-14076) Allow Configuration to be persisted given path to file

2017-03-08 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HADOOP-14076.
-
Resolution: Later

This can be done on the client side.

> Allow Configuration to be persisted given path to file
> --
>
> Key: HADOOP-14076
> URL: https://issues.apache.org/jira/browse/HADOOP-14076
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ted Yu
>
> Currently Configuration has the following methods for persistence:
> {code}
>   public void writeXml(OutputStream out) throws IOException {
>   public void writeXml(Writer out) throws IOException {
> {code}
> Adding API for persisting to file given path would be useful:
> {code}
>   public void writeXml(String path) throws IOException {
> {code}
> Background: I recently worked on exporting Configuration to a file using JNI.
> Without the proposed API, I resorted to some trick such as the following:
> http://www.kfu.com/~nsayer/Java/jni-filedesc.html
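Given the resolution above (doable on the client side), a small helper along these lines covers the use case with only the existing API; the utility class is hypothetical and not part of Hadoop:

{code:java}
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.hadoop.conf.Configuration;

// Hypothetical client-side helper: persist a Configuration to a local
// file path using the existing writeXml(OutputStream) method.
final class ConfigurationFiles {
  private ConfigurationFiles() {}

  static void writeXml(Configuration conf, String path) throws IOException {
    try (OutputStream out = Files.newOutputStream(Paths.get(path))) {
      conf.writeXml(out);
    }
  }
}
{code}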



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13866) Upgrade netty-all to 4.1.1.Final

2017-03-06 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15898377#comment-15898377
 ] 

Ted Yu commented on HADOOP-13866:
-

I suggest keeping this JIRA open for a while until the following has been done:

* build the hbase master branch (or branch-2 when it is cut)
* run the tarball produced above on a released hadoop version
* verify that a mapreduce job can succeed

> Upgrade netty-all to 4.1.1.Final
> 
>
> Key: HADOOP-13866
> URL: https://issues.apache.org/jira/browse/HADOOP-13866
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Blocker
> Attachments: HADOOP-13866.v1.patch, HADOOP-13866.v2.patch, 
> HADOOP-13866.v3.patch, HADOOP-13866.v4.patch, HADOOP-13866.v6.patch, 
> HADOOP-13866.v7.patch, HADOOP-13866.v8.patch, HADOOP-13866.v8.patch, 
> HADOOP-13866.v8.patch, HADOOP-13866.v9.patch
>
>
> netty-all 4.1.1.Final is stable release which we should upgrade to.
> See bottom of HADOOP-12927 for related discussion.
> This issue was discovered since hbase 2.0 uses 4.1.1.Final of netty.
> When launching mapreduce job from hbase, /grid/0/hadoop/yarn/local/  
> usercache/hbase/appcache/application_1479850535804_0008/container_e01_1479850535804_0008_01_05/mr-framework/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar
>  (from hdfs) is ahead of 4.1.1.Final jar (from hbase) on the classpath.
> Resulting in the following exception:
> {code}
> 2016-12-01 20:17:26,678 WARN [Default-IPC-NioEventLoopGroup-1-1] 
> io.netty.util.concurrent.DefaultPromise: An exception was thrown by 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete()
> java.lang.NoSuchMethodError: 
> io.netty.buffer.ByteBuf.retainedDuplicate()Lio/netty/buffer/ByteBuf;
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:272)
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:262)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
> at 
> io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14043) Shade netty 4 dependency in hadoop-hdfs

2017-03-02 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-14043:

Priority: Critical  (was: Major)

> Shade netty 4 dependency in hadoop-hdfs
> ---
>
> Key: HADOOP-14043
> URL: https://issues.apache.org/jira/browse/HADOOP-14043
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 2.8.0
>Reporter: Ted Yu
>Priority: Critical
>
> During review of HADOOP-13866, [~andrew.wang] mentioned considering  shading 
> netty before putting the fix into branch-2.
> This would give users better experience when upgrading hadoop.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-13866) Upgrade netty-all to 4.1.1.Final

2017-02-27 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886035#comment-15886035
 ] 

Ted Yu edited comment on HADOOP-13866 at 2/27/17 8:28 PM:
--

Checked
http://netty.io/news/2017/01/30/4-0-44-Final-4-1-8-Final.html

http://netty.io/news/2017/01/12/4-0-43-Final-4-1-7-Final.html

http://netty.io/news/2016/10/14/4-0-42-Final-4-1-6-Final.html

http://netty.io/news/2016/08/29/4-0-41-Final-4-1-5-Final.html

http://netty.io/news/2016/07/27/4-0-40-Final-4-1-4-Final.html

Didn't spot any incompatible change


was (Author: yuzhih...@gmail.com):
Not sure how long we can get consensus w.r.t. pulling in this change for 2.8.0

[~djp]:
You can go ahead generating 2.8.0 RC without this change.

> Upgrade netty-all to 4.1.1.Final
> 
>
> Key: HADOOP-13866
> URL: https://issues.apache.org/jira/browse/HADOOP-13866
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Blocker
> Attachments: HADOOP-13866.v1.patch, HADOOP-13866.v2.patch, 
> HADOOP-13866.v3.patch, HADOOP-13866.v4.patch, HADOOP-13866.v6.patch, 
> HADOOP-13866.v7.patch, HADOOP-13866.v8.patch, HADOOP-13866.v8.patch, 
> HADOOP-13866.v8.patch, HADOOP-13866.v9.patch
>
>
> netty-all 4.1.1.Final is stable release which we should upgrade to.
> See bottom of HADOOP-12927 for related discussion.
> This issue was discovered since hbase 2.0 uses 4.1.1.Final of netty.
> When launching mapreduce job from hbase, /grid/0/hadoop/yarn/local/  
> usercache/hbase/appcache/application_1479850535804_0008/container_e01_1479850535804_0008_01_05/mr-framework/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar
>  (from hdfs) is ahead of 4.1.1.Final jar (from hbase) on the classpath.
> Resulting in the following exception:
> {code}
> 2016-12-01 20:17:26,678 WARN [Default-IPC-NioEventLoopGroup-1-1] 
> io.netty.util.concurrent.DefaultPromise: An exception was thrown by 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete()
> java.lang.NoSuchMethodError: 
> io.netty.buffer.ByteBuf.retainedDuplicate()Lio/netty/buffer/ByteBuf;
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:272)
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:262)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
> at 
> io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13866) Upgrade netty-all to 4.1.1.Final

2017-02-27 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886035#comment-15886035
 ] 

Ted Yu commented on HADOOP-13866:
-

Not sure how long we can get consensus w.r.t. pulling in this change for 2.8.0

[~djp]:
You can go ahead generating 2.8.0 RC without this change.

> Upgrade netty-all to 4.1.1.Final
> 
>
> Key: HADOOP-13866
> URL: https://issues.apache.org/jira/browse/HADOOP-13866
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Blocker
> Attachments: HADOOP-13866.v1.patch, HADOOP-13866.v2.patch, 
> HADOOP-13866.v3.patch, HADOOP-13866.v4.patch, HADOOP-13866.v6.patch, 
> HADOOP-13866.v7.patch, HADOOP-13866.v8.patch, HADOOP-13866.v8.patch, 
> HADOOP-13866.v8.patch, HADOOP-13866.v9.patch
>
>
> netty-all 4.1.1.Final is stable release which we should upgrade to.
> See bottom of HADOOP-12927 for related discussion.
> This issue was discovered since hbase 2.0 uses 4.1.1.Final of netty.
> When launching mapreduce job from hbase, /grid/0/hadoop/yarn/local/  
> usercache/hbase/appcache/application_1479850535804_0008/container_e01_1479850535804_0008_01_05/mr-framework/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar
>  (from hdfs) is ahead of 4.1.1.Final jar (from hbase) on the classpath.
> Resulting in the following exception:
> {code}
> 2016-12-01 20:17:26,678 WARN [Default-IPC-NioEventLoopGroup-1-1] 
> io.netty.util.concurrent.DefaultPromise: An exception was thrown by 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete()
> java.lang.NoSuchMethodError: 
> io.netty.buffer.ByteBuf.retainedDuplicate()Lio/netty/buffer/ByteBuf;
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:272)
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:262)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
> at 
> io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-13135) Encounter response code 500 when accessing /metrics endpoint

2017-02-26 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658589#comment-15658589
 ] 

Ted Yu edited comment on HADOOP-13135 at 2/26/17 5:41 PM:
--

This issue should exist for all HBase 1.x releases.


was (Author: yuzhih...@gmail.com):
This issue should exist for hbase all hbase 1.x releases .

> Encounter response code 500 when accessing /metrics endpoint
> 
>
> Key: HADOOP-13135
> URL: https://issues.apache.org/jira/browse/HADOOP-13135
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Ted Yu
>  Labels: metrics
>
> When accessing /metrics endpoint on hbase master through hadoop 2.7.1, I got:
> {code}
> HTTP ERROR 500
> Problem accessing /metrics. Reason:
> INTERNAL_SERVER_ERROR
> Caused by:
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.http.HttpServer2.isInstrumentationAccessAllowed(HttpServer2.java:1029)
>   at 
> org.apache.hadoop.metrics.MetricsServlet.doGet(MetricsServlet.java:109)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>   at 
> org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
>   at 
> org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:113)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> {code}
> [~ajisakaa] suggested that code 500 should be 404 (NOT FOUND).
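
A minimal sketch of the suggested behavior, assuming the 500 is caused by the servlet dereferencing a missing servlet-context attribute: detect the missing piece up front and answer 404 instead. This is a hypothetical illustration against the plain Servlet API; it is not the actual MetricsServlet or HttpServer2 code, and the attribute name is invented.
{code:java}
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical servlet sketching the 404-instead-of-500 behavior.
public class MetricsLikeServlet extends HttpServlet {
  @Override
  protected void doGet(HttpServletRequest req, HttpServletResponse resp)
      throws IOException {
    // Stand-in for the context attribute whose absence currently surfaces as an NPE.
    Object metrics = getServletContext().getAttribute("example.metrics.system");
    if (metrics == null) {
      // Report NOT FOUND instead of letting an NPE bubble up as a 500.
      resp.sendError(HttpServletResponse.SC_NOT_FOUND,
          "/metrics is not available on this server");
      return;
    }
    resp.setContentType("text/plain");
    resp.getWriter().println(metrics);
  }
}
{code}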



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-13135) Encounter response code 500 when accessing /metrics endpoint

2017-02-19 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658589#comment-15658589
 ] 

Ted Yu edited comment on HADOOP-13135 at 2/19/17 3:57 PM:
--

This issue should exist for all HBase 1.x releases.


was (Author: yuzhih...@gmail.com):
This issue should exist for hbase all 1.x releases .

> Encounter response code 500 when accessing /metrics endpoint
> 
>
> Key: HADOOP-13135
> URL: https://issues.apache.org/jira/browse/HADOOP-13135
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Ted Yu
>  Labels: metrics
>
> When accessing /metrics endpoint on hbase master through hadoop 2.7.1, I got:
> {code}
> HTTP ERROR 500
> Problem accessing /metrics. Reason:
> INTERNAL_SERVER_ERROR
> Caused by:
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.http.HttpServer2.isInstrumentationAccessAllowed(HttpServer2.java:1029)
>   at 
> org.apache.hadoop.metrics.MetricsServlet.doGet(MetricsServlet.java:109)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>   at 
> org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
>   at 
> org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:113)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> {code}
> [~ajisakaa] suggested that code 500 should be 404 (NOT FOUND).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14043) Shade netty 4 dependency in hadoop-hdfs

2017-02-17 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-14043:

Component/s: build

> Shade netty 4 dependency in hadoop-hdfs
> ---
>
> Key: HADOOP-14043
> URL: https://issues.apache.org/jira/browse/HADOOP-14043
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 2.8.0
>Reporter: Ted Yu
>
> During review of HADOOP-13866, [~andrew.wang] mentioned considering  shading 
> netty before putting the fix into branch-2.
> This would give users better experience when upgrading hadoop.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14076) Allow Configuration to be persisted given path to file

2017-02-14 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865940#comment-15865940
 ] 

Ted Yu commented on HADOOP-14076:
-

Do you have a suggestion for where the helper method should reside?

> Allow Configuration to be persisted given path to file
> --
>
> Key: HADOOP-14076
> URL: https://issues.apache.org/jira/browse/HADOOP-14076
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ted Yu
>
> Currently Configuration has the following methods for persistence:
> {code}
>   public void writeXml(OutputStream out) throws IOException {
>   public void writeXml(Writer out) throws IOException {
> {code}
> Adding API for persisting to file given path would be useful:
> {code}
>   public void writeXml(String path) throws IOException {
> {code}
> Background: I recently worked on exporting Configuration to a file using JNI.
> Without the proposed API, I resorted to some trick such as the following:
> http://www.kfu.com/~nsayer/Java/jni-filedesc.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13866) Upgrade netty-all to 4.1.1.Final

2017-02-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15863817#comment-15863817
 ] 

Ted Yu commented on HADOOP-13866:
-

[~xiaochen] [~andrew.wang] [~djp]:
Can this be resolved this week?

Let me know what else I need to do.

> Upgrade netty-all to 4.1.1.Final
> 
>
> Key: HADOOP-13866
> URL: https://issues.apache.org/jira/browse/HADOOP-13866
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Blocker
> Attachments: HADOOP-13866.v1.patch, HADOOP-13866.v2.patch, 
> HADOOP-13866.v3.patch, HADOOP-13866.v4.patch, HADOOP-13866.v6.patch, 
> HADOOP-13866.v7.patch, HADOOP-13866.v8.patch, HADOOP-13866.v8.patch, 
> HADOOP-13866.v8.patch, HADOOP-13866.v9.patch
>
>
> netty-all 4.1.1.Final is stable release which we should upgrade to.
> See bottom of HADOOP-12927 for related discussion.
> This issue was discovered since hbase 2.0 uses 4.1.1.Final of netty.
> When launching mapreduce job from hbase, /grid/0/hadoop/yarn/local/  
> usercache/hbase/appcache/application_1479850535804_0008/container_e01_1479850535804_0008_01_05/mr-framework/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar
>  (from hdfs) is ahead of 4.1.1.Final jar (from hbase) on the classpath.
> Resulting in the following exception:
> {code}
> 2016-12-01 20:17:26,678 WARN [Default-IPC-NioEventLoopGroup-1-1] 
> io.netty.util.concurrent.DefaultPromise: An exception was thrown by 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete()
> java.lang.NoSuchMethodError: 
> io.netty.buffer.ByteBuf.retainedDuplicate()Lio/netty/buffer/ByteBuf;
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:272)
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:262)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
> at 
> io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14076) Allow Configuration to be persisted given path to file

2017-02-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15863813#comment-15863813
 ] 

Ted Yu commented on HADOOP-14076:
-

That's what I did using JNI:
{code}
void writeConf(jobject conf, const char *filepath)
{
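  /* Note: env is a JNIEnv* assumed to have been obtained elsewhere (not shown). */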
  jclass class_fdesc = (*env)->FindClass(env, "java/io/FileDescriptor");
  // construct a new FileDescriptor
  jmethodID const_fdesc = (*env)->GetMethodID(env, class_fdesc, "<init>", "()V");

  jobject file = (*env)->NewObject(env, class_fdesc, const_fdesc);
  jfieldID field_fd = (*env)->GetFieldID(env, class_fdesc, "fd", "I");

  int fd = open(filepath, O_RDWR | O_NONBLOCK | O_CREAT, S_IRWXU);
  if (fd < 0) {
printf("Couldn't open file %s\n", filepath);
exit(-1);
  }
  (*env)->SetIntField(env, file, field_fd, fd);

  jclass cls_outstream = (*env)->FindClass(env, "java/io/FileOutputStream");
  jmethodID ctor_stream = (*env)->GetMethodID(env, cls_outstream, "<init>",
"(Ljava/io/FileDescriptor;)V");
  if (ctor_stream == NULL) {
printf("Couldn't get ctor for FileOutputStream\n");
exit(-1);
  }
  jobject file_outstream = (*env)->NewObject(env, cls_outstream, ctor_stream, 
file);
  if (file_outstream == NULL) {
printf("Couldn't create FileOutputStream\n");
exit(-1);
  }
  jclass class_conf = (*env)->FindClass(env, HADOOP_CONF);
  jmethodID writeXmlMid = (*env)->GetMethodID(env, class_conf, "writeXml",
"(Ljava/io/OutputStream;)V");
  (*env)->CallObjectMethod(env, conf, writeXmlMid, file_outstream);
}
{code}
The code is tedious (manipulating fd field).
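
For comparison, the same export is only a few lines on the Java side. Below is a minimal sketch of the kind of plain-Java helper that the proposed writeXml(String path) overload (quoted below) would make unnecessary; ConfDump and its method are made-up names, and only the existing Configuration.writeXml(OutputStream) API is used.
{code:java}
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;

public final class ConfDump {
  private ConfDump() {}

  // Hypothetical helper: persist a Configuration to the given file path by
  // delegating to the existing writeXml(OutputStream) method.
  public static void writeXml(Configuration conf, String path) throws IOException {
    try (OutputStream out = new FileOutputStream(path)) {
      conf.writeXml(out);
    }
  }
}
{code}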

> Allow Configuration to be persisted given path to file
> --
>
> Key: HADOOP-14076
> URL: https://issues.apache.org/jira/browse/HADOOP-14076
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ted Yu
>
> Currently Configuration has the following methods for persistence:
> {code}
>   public void writeXml(OutputStream out) throws IOException {
>   public void writeXml(Writer out) throws IOException {
> {code}
> Adding API for persisting to file given path would be useful:
> {code}
>   public void writeXml(String path) throws IOException {
> {code}
> Background: I recently worked on exporting Configuration to a file using JNI.
> Without the proposed API, I resorted to some trick such as the following:
> http://www.kfu.com/~nsayer/Java/jni-filedesc.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13866) Upgrade netty-all to 4.1.1.Final

2017-02-11 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15862452#comment-15862452
 ] 

Ted Yu commented on HADOOP-13866:
-

The test failure in TestParameterParser is reproducible; TestParameterParser passes 
without the patch.

I suggest we go with patch v8.

> Upgrade netty-all to 4.1.1.Final
> 
>
> Key: HADOOP-13866
> URL: https://issues.apache.org/jira/browse/HADOOP-13866
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Blocker
> Attachments: HADOOP-13866.v1.patch, HADOOP-13866.v2.patch, 
> HADOOP-13866.v3.patch, HADOOP-13866.v4.patch, HADOOP-13866.v6.patch, 
> HADOOP-13866.v7.patch, HADOOP-13866.v8.patch, HADOOP-13866.v8.patch, 
> HADOOP-13866.v8.patch, HADOOP-13866.v9.patch
>
>
> netty-all 4.1.1.Final is stable release which we should upgrade to.
> See bottom of HADOOP-12927 for related discussion.
> This issue was discovered since hbase 2.0 uses 4.1.1.Final of netty.
> When launching mapreduce job from hbase, /grid/0/hadoop/yarn/local/  
> usercache/hbase/appcache/application_1479850535804_0008/container_e01_1479850535804_0008_01_05/mr-framework/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar
>  (from hdfs) is ahead of 4.1.1.Final jar (from hbase) on the classpath.
> Resulting in the following exception:
> {code}
> 2016-12-01 20:17:26,678 WARN [Default-IPC-NioEventLoopGroup-1-1] 
> io.netty.util.concurrent.DefaultPromise: An exception was thrown by 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete()
> java.lang.NoSuchMethodError: 
> io.netty.buffer.ByteBuf.retainedDuplicate()Lio/netty/buffer/ByteBuf;
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:272)
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:262)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
> at 
> io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13866) Upgrade netty-all to 4.1.1.Final

2017-02-11 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15862337#comment-15862337
 ] 

Ted Yu commented on HADOOP-13866:
-

bq. make HBase 2.0 to work with Hadoop 2.8.0

That would be nice.

> Upgrade netty-all to 4.1.1.Final
> 
>
> Key: HADOOP-13866
> URL: https://issues.apache.org/jira/browse/HADOOP-13866
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Blocker
> Attachments: HADOOP-13866.v1.patch, HADOOP-13866.v2.patch, 
> HADOOP-13866.v3.patch, HADOOP-13866.v4.patch, HADOOP-13866.v6.patch, 
> HADOOP-13866.v7.patch, HADOOP-13866.v8.patch, HADOOP-13866.v8.patch, 
> HADOOP-13866.v8.patch, HADOOP-13866.v9.patch
>
>
> netty-all 4.1.1.Final is stable release which we should upgrade to.
> See bottom of HADOOP-12927 for related discussion.
> This issue was discovered since hbase 2.0 uses 4.1.1.Final of netty.
> When launching mapreduce job from hbase, /grid/0/hadoop/yarn/local/  
> usercache/hbase/appcache/application_1479850535804_0008/container_e01_1479850535804_0008_01_05/mr-framework/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar
>  (from hdfs) is ahead of 4.1.1.Final jar (from hbase) on the classpath.
> Resulting in the following exception:
> {code}
> 2016-12-01 20:17:26,678 WARN [Default-IPC-NioEventLoopGroup-1-1] 
> io.netty.util.concurrent.DefaultPromise: An exception was thrown by 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete()
> java.lang.NoSuchMethodError: 
> io.netty.buffer.ByteBuf.retainedDuplicate()Lio/netty/buffer/ByteBuf;
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:272)
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:262)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
> at 
> io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13866) Upgrade netty-all to 4.1.1.Final

2017-02-11 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-13866:

Attachment: HADOOP-13866.v9.patch

Patch v9 upgrades to 4.1.8.Final

> Upgrade netty-all to 4.1.1.Final
> 
>
> Key: HADOOP-13866
> URL: https://issues.apache.org/jira/browse/HADOOP-13866
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Blocker
> Attachments: HADOOP-13866.v1.patch, HADOOP-13866.v2.patch, 
> HADOOP-13866.v3.patch, HADOOP-13866.v4.patch, HADOOP-13866.v6.patch, 
> HADOOP-13866.v7.patch, HADOOP-13866.v8.patch, HADOOP-13866.v8.patch, 
> HADOOP-13866.v8.patch, HADOOP-13866.v9.patch
>
>
> netty-all 4.1.1.Final is stable release which we should upgrade to.
> See bottom of HADOOP-12927 for related discussion.
> This issue was discovered since hbase 2.0 uses 4.1.1.Final of netty.
> When launching mapreduce job from hbase, /grid/0/hadoop/yarn/local/  
> usercache/hbase/appcache/application_1479850535804_0008/container_e01_1479850535804_0008_01_05/mr-framework/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar
>  (from hdfs) is ahead of 4.1.1.Final jar (from hbase) on the classpath.
> Resulting in the following exception:
> {code}
> 2016-12-01 20:17:26,678 WARN [Default-IPC-NioEventLoopGroup-1-1] 
> io.netty.util.concurrent.DefaultPromise: An exception was thrown by 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete()
> java.lang.NoSuchMethodError: 
> io.netty.buffer.ByteBuf.retainedDuplicate()Lio/netty/buffer/ByteBuf;
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:272)
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:262)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
> at 
> io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14076) Allow Configuration to be persisted given path to file

2017-02-10 Thread Ted Yu (JIRA)
Ted Yu created HADOOP-14076:
---

 Summary: Allow Configuration to be persisted given path to file
 Key: HADOOP-14076
 URL: https://issues.apache.org/jira/browse/HADOOP-14076
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Ted Yu


Currently Configuration has the following methods for persistence:
{code}
  public void writeXml(OutputStream out) throws IOException {

  public void writeXml(Writer out) throws IOException {
{code}
Adding API for persisting to file given path would be useful:
{code}
  public void writeXml(String path) throws IOException {
{code}

Background: I recently worked on exporting Configuration to a file using JNI.
Without the proposed API, I resorted to some trick such as the following:
http://www.kfu.com/~nsayer/Java/jni-filedesc.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13866) Upgrade netty-all to 4.1.1.Final

2017-02-07 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856858#comment-15856858
 ] 

Ted Yu commented on HADOOP-13866:
-

[~andrew.wang] [~xiaochen]:
Is there anything I need to do to move this forward?

> Upgrade netty-all to 4.1.1.Final
> 
>
> Key: HADOOP-13866
> URL: https://issues.apache.org/jira/browse/HADOOP-13866
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Blocker
> Attachments: HADOOP-13866.v1.patch, HADOOP-13866.v2.patch, 
> HADOOP-13866.v3.patch, HADOOP-13866.v4.patch, HADOOP-13866.v6.patch, 
> HADOOP-13866.v7.patch, HADOOP-13866.v8.patch, HADOOP-13866.v8.patch, 
> HADOOP-13866.v8.patch
>
>
> netty-all 4.1.1.Final is stable release which we should upgrade to.
> See bottom of HADOOP-12927 for related discussion.
> This issue was discovered since hbase 2.0 uses 4.1.1.Final of netty.
> When launching mapreduce job from hbase, /grid/0/hadoop/yarn/local/  
> usercache/hbase/appcache/application_1479850535804_0008/container_e01_1479850535804_0008_01_05/mr-framework/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar
>  (from hdfs) is ahead of 4.1.1.Final jar (from hbase) on the classpath.
> Resulting in the following exception:
> {code}
> 2016-12-01 20:17:26,678 WARN [Default-IPC-NioEventLoopGroup-1-1] 
> io.netty.util.concurrent.DefaultPromise: An exception was thrown by 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete()
> java.lang.NoSuchMethodError: 
> io.netty.buffer.ByteBuf.retainedDuplicate()Lio/netty/buffer/ByteBuf;
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:272)
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:262)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
> at 
> io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14043) Shade netty 4 dependency in hadoop-hdfs

2017-02-04 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-14043:

Summary: Shade netty 4 dependency in hadoop-hdfs  (was: Shade netty 4 
dependency)

> Shade netty 4 dependency in hadoop-hdfs
> ---
>
> Key: HADOOP-14043
> URL: https://issues.apache.org/jira/browse/HADOOP-14043
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Ted Yu
>
> During review of HADOOP-13866, [~andrew.wang] mentioned considering  shading 
> netty before putting the fix into branch-2.
> This would give users better experience when upgrading hadoop.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14043) Shade netty 4 dependency in hadoop-hdfs

2017-02-04 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852837#comment-15852837
 ] 

Ted Yu commented on HADOOP-14043:
-

Can the shading be done in a similar way to:
{code}
commit 70ca1f1e3a328b18eb4e27f7d0f328ae403342d5
Author: Andrew Wang 
Date:   Thu Dec 15 11:44:59 2016 -0800

HADOOP-11804. Shaded Hadoop client artifacts and minicluster. Contributed 
by Sean Busbey.
{code}

> Shade netty 4 dependency in hadoop-hdfs
> ---
>
> Key: HADOOP-14043
> URL: https://issues.apache.org/jira/browse/HADOOP-14043
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Ted Yu
>
> During review of HADOOP-13866, [~andrew.wang] mentioned considering  shading 
> netty before putting the fix into branch-2.
> This would give users better experience when upgrading hadoop.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13866) Upgrade netty-all to 4.1.1.Final

2017-02-02 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15850826#comment-15850826
 ] 

Ted Yu commented on HADOOP-13866:
-

The 2.9 target can be dropped temporarily.
If HADOOP-14043 gets into 2.9, we can pull this in.

> Upgrade netty-all to 4.1.1.Final
> 
>
> Key: HADOOP-13866
> URL: https://issues.apache.org/jira/browse/HADOOP-13866
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Blocker
> Attachments: HADOOP-13866.v1.patch, HADOOP-13866.v2.patch, 
> HADOOP-13866.v3.patch, HADOOP-13866.v4.patch, HADOOP-13866.v6.patch, 
> HADOOP-13866.v7.patch, HADOOP-13866.v8.patch, HADOOP-13866.v8.patch, 
> HADOOP-13866.v8.patch
>
>
> netty-all 4.1.1.Final is stable release which we should upgrade to.
> See bottom of HADOOP-12927 for related discussion.
> This issue was discovered since hbase 2.0 uses 4.1.1.Final of netty.
> When launching mapreduce job from hbase, /grid/0/hadoop/yarn/local/  
> usercache/hbase/appcache/application_1479850535804_0008/container_e01_1479850535804_0008_01_05/mr-framework/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar
>  (from hdfs) is ahead of 4.1.1.Final jar (from hbase) on the classpath.
> Resulting in the following exception:
> {code}
> 2016-12-01 20:17:26,678 WARN [Default-IPC-NioEventLoopGroup-1-1] 
> io.netty.util.concurrent.DefaultPromise: An exception was thrown by 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete()
> java.lang.NoSuchMethodError: 
> io.netty.buffer.ByteBuf.retainedDuplicate()Lio/netty/buffer/ByteBuf;
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:272)
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:262)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
> at 
> io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13866) Upgrade netty-all to 4.1.1.Final

2017-02-02 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15850808#comment-15850808
 ] 

Ted Yu commented on HADOOP-13866:
-

That's true - if downstream project(s) use a netty API that is available in 
4.1.0.Beta5 but not in 4.1.1.


> Upgrade netty-all to 4.1.1.Final
> 
>
> Key: HADOOP-13866
> URL: https://issues.apache.org/jira/browse/HADOOP-13866
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Blocker
> Attachments: HADOOP-13866.v1.patch, HADOOP-13866.v2.patch, 
> HADOOP-13866.v3.patch, HADOOP-13866.v4.patch, HADOOP-13866.v6.patch, 
> HADOOP-13866.v7.patch, HADOOP-13866.v8.patch, HADOOP-13866.v8.patch, 
> HADOOP-13866.v8.patch
>
>
> netty-all 4.1.1.Final is stable release which we should upgrade to.
> See bottom of HADOOP-12927 for related discussion.
> This issue was discovered since hbase 2.0 uses 4.1.1.Final of netty.
> When launching mapreduce job from hbase, /grid/0/hadoop/yarn/local/  
> usercache/hbase/appcache/application_1479850535804_0008/container_e01_1479850535804_0008_01_05/mr-framework/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar
>  (from hdfs) is ahead of 4.1.1.Final jar (from hbase) on the classpath.
> Resulting in the following exception:
> {code}
> 2016-12-01 20:17:26,678 WARN [Default-IPC-NioEventLoopGroup-1-1] 
> io.netty.util.concurrent.DefaultPromise: An exception was thrown by 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete()
> java.lang.NoSuchMethodError: 
> io.netty.buffer.ByteBuf.retainedDuplicate()Lio/netty/buffer/ByteBuf;
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:272)
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:262)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
> at 
> io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13866) Upgrade netty-all to 4.1.1.Final

2017-02-02 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15850706#comment-15850706
 ] 

Ted Yu commented on HADOOP-13866:
-

There is no code change needed for the upgrade.

> Upgrade netty-all to 4.1.1.Final
> 
>
> Key: HADOOP-13866
> URL: https://issues.apache.org/jira/browse/HADOOP-13866
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Blocker
> Attachments: HADOOP-13866.v1.patch, HADOOP-13866.v2.patch, 
> HADOOP-13866.v3.patch, HADOOP-13866.v4.patch, HADOOP-13866.v6.patch, 
> HADOOP-13866.v7.patch, HADOOP-13866.v8.patch, HADOOP-13866.v8.patch, 
> HADOOP-13866.v8.patch
>
>
> netty-all 4.1.1.Final is stable release which we should upgrade to.
> See bottom of HADOOP-12927 for related discussion.
> This issue was discovered since hbase 2.0 uses 4.1.1.Final of netty.
> When launching mapreduce job from hbase, /grid/0/hadoop/yarn/local/  
> usercache/hbase/appcache/application_1479850535804_0008/container_e01_1479850535804_0008_01_05/mr-framework/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar
>  (from hdfs) is ahead of 4.1.1.Final jar (from hbase) on the classpath.
> Resulting in the following exception:
> {code}
> 2016-12-01 20:17:26,678 WARN [Default-IPC-NioEventLoopGroup-1-1] 
> io.netty.util.concurrent.DefaultPromise: An exception was thrown by 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete()
> java.lang.NoSuchMethodError: 
> io.netty.buffer.ByteBuf.retainedDuplicate()Lio/netty/buffer/ByteBuf;
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:272)
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:262)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
> at 
> io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13866) Upgrade netty-all to 4.1.1.Final

2017-02-02 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15850432#comment-15850432
 ] 

Ted Yu commented on HADOOP-13866:
-

By swapping netty-all-4.1.1.Final.jar into the hadoop directories where 
netty-3.6.2.Final.jar used to reside, I was able to run 
org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList without triggering 
the NoSuchMethodError.
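
For anyone hitting the same conflict, here is a quick standalone check of which netty jar wins on the classpath, using only reflection and the class/method named in the stack trace quoted below. This is an illustrative sketch, not part of Hadoop or HBase.
{code:java}
// Prints where io.netty.buffer.ByteBuf was loaded from and whether the
// netty 4.1 method retainedDuplicate() is visible, i.e. whether an older
// netty-all jar is shadowing the 4.1.x jar on the classpath.
public class NettyClasspathCheck {
  public static void main(String[] args) throws Exception {
    Class<?> byteBuf = Class.forName("io.netty.buffer.ByteBuf");
    System.out.println("ByteBuf loaded from: "
        + byteBuf.getProtectionDomain().getCodeSource().getLocation());
    try {
      byteBuf.getMethod("retainedDuplicate");
      System.out.println("retainedDuplicate() present -> netty 4.1.x API visible");
    } catch (NoSuchMethodException e) {
      System.out.println("retainedDuplicate() missing -> an older netty-all jar"
          + " (e.g. 4.0.23.Final) is ahead on the classpath");
    }
  }
}
{code}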

> Upgrade netty-all to 4.1.1.Final
> 
>
> Key: HADOOP-13866
> URL: https://issues.apache.org/jira/browse/HADOOP-13866
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Blocker
> Attachments: HADOOP-13866.v1.patch, HADOOP-13866.v2.patch, 
> HADOOP-13866.v3.patch, HADOOP-13866.v4.patch, HADOOP-13866.v6.patch, 
> HADOOP-13866.v7.patch, HADOOP-13866.v8.patch, HADOOP-13866.v8.patch, 
> HADOOP-13866.v8.patch
>
>
> netty-all 4.1.1.Final is stable release which we should upgrade to.
> See bottom of HADOOP-12927 for related discussion.
> This issue was discovered since hbase 2.0 uses 4.1.1.Final of netty.
> When launching mapreduce job from hbase, /grid/0/hadoop/yarn/local/  
> usercache/hbase/appcache/application_1479850535804_0008/container_e01_1479850535804_0008_01_05/mr-framework/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar
>  (from hdfs) is ahead of 4.1.1.Final jar (from hbase) on the classpath.
> Resulting in the following exception:
> {code}
> 2016-12-01 20:17:26,678 WARN [Default-IPC-NioEventLoopGroup-1-1] 
> io.netty.util.concurrent.DefaultPromise: An exception was thrown by 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete()
> java.lang.NoSuchMethodError: 
> io.netty.buffer.ByteBuf.retainedDuplicate()Lio/netty/buffer/ByteBuf;
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:272)
> at 
> org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:262)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
> at 
> io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] (HADOOP-14043) Shade netty dependency

2017-01-31 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847517#comment-15847517
 ] 

Ted Yu commented on HADOOP-14043:
-

{code}
2017-01-25 01:55:48,433 WARN [Default-IPC-NioEventLoopGroup-1-1] 
io.netty.util.concurrent.DefaultPromise: An exception was thrown by 
org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete()
java.lang.NoSuchMethodError: 
io.netty.buffer.ByteBuf.retainedDuplicate()Lio/netty/buffer/ByteBuf;
  at 
org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:272)
  at 
org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:262)
  at 
io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
  at 
io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
  at 
io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
  at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
  at 
io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
  at 
io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:253)
  at 
io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:288)
  at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
  at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
  at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
  at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
  at 
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
  at 
io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
  at java.lang.Thread.run(Thread.java:745)
{code}

> Shade netty dependency
> --
>
> Key: HADOOP-14043
> URL: https://issues.apache.org/jira/browse/HADOOP-14043
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Ted Yu
>
> During review of HADOOP-13866, [~andrew.wang] mentioned considering  shading 
> netty before putting the fix into branch-2.
> This would give users better experience when upgrading hadoop.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org


