[jira] [Comment Edited] (HADOOP-14906) ITestAzureConcurrentOutOfBandIo failing with checksum errors on write

2017-09-25 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179511#comment-16179511
 ] 

Georgi Chalakov edited comment on HADOOP-14906 at 9/25/17 6:28 PM:
---

I am not sure that this is related to the block compaction change. The debug 
message shows no directories in the list for block blobs with compaction. I 
posted the code where we check whether the file is in one of those directories 
and if it is not we skip BlockBlobAppendStream.

{quote}
2017-09-25 14:47:17,484 DEBUG [JUnit-testReadOOBWrites]: 
azure.AzureNativeFileSystemStore 
(AzureNativeFileSystemStore.java:initialize(550)) - Block blobs with compaction 
directories:  
{quote}

{code:title=hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/AzureNativeFileSystemStore.java|borderStyle=solid}
  if (isBlockBlobWithCompactionKey(key)) {
    BlockBlobAppendStream blockBlobOutputStream = new BlockBlobAppendStream(
        (CloudBlockBlobWrapper) blob,
        keyEncoded,
        this.uploadBlockSizeBytes,
        true,
        getInstrumentedContext());
    outputStream = blockBlobOutputStream;
  } else {
    outputStream = openOutputStream(blob);
  }
{code}
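
The reasoning above can be illustrated with a minimal sketch. It assumes the directory check is a simple path-prefix match against the configured set; the class and method names here (CompactionDirCheck, isUnderAnyDirectory, chooseStream) are hypothetical and are not the actual WASB code. The point is only that an empty directory set, as shown in the debug message, can never select BlockBlobAppendStream.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical stand-in for the directory check; not the actual WASB code. */
final class CompactionDirCheck {

  /** Returns true when the key lies under one of the configured directories. */
  static boolean isUnderAnyDirectory(String key, Set<String> dirs) {
    for (String dir : dirs) {
      // Normalize so that "/hbase" matches "/hbase/wal" but not "/hbase2".
      String prefix = dir.endsWith("/") ? dir : dir + "/";
      if (key.startsWith(prefix)) {
        return true;
      }
    }
    // With an empty set the loop body never runs, so we always land here.
    return false;
  }

  /** Names the output stream that the branch in the comment would pick. */
  static String chooseStream(String key, Set<String> dirs) {
    return isUnderAnyDirectory(key, dirs)
        ? "BlockBlobAppendStream"
        : "openOutputStream";
  }

  public static void main(String[] args) {
    Set<String> none = Collections.<String>emptySet();
    Set<String> some = new LinkedHashSet<>();
    some.add("/hbase"); // assumed example directory

    System.out.println(chooseStream("/test/file", none));
    System.out.println(chooseStream("/hbase/wal/0001", some));
    System.out.println(chooseStream("/hbase2/file", some));
  }
}
```

Under these assumptions, a test that runs with no compaction directories configured would always take the else branch, whatever key it writes.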


> ITestAzureConcurrentOutOfBandIo failing with checksum errors on write
> -
>
> Key: HADOOP-14906
> URL: https://issues.apache.org/jira/browse/HADOOP-14906
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 2.9.0, 3.1.0
> Environment: UK BT ADSL connection, 1.8.0_121-b13, Azure storage 
> Ireland
>Reporter: Steve Loughran
>
> {{ITestAzureConcurrentOutOfBandIo}} is consistently raising an IOE with the 
> text "The MD5 value specified in the request did not match with the MD5 value 
> calculated by the server"



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-14906) ITestAzureConcurrentOutOfBandIo failing with checksum errors on write

2017-09-25 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179511#comment-16179511
 ] 

Georgi Chalakov edited comment on HADOOP-14906 at 9/25/17 6:27 PM:
---

I am not sure that this is related to the block compaction change. The debug 
message shows no directories in the list for block blobs with compaction. I 
posted the code where we check whether the file is in one of those directories 
and if it is not we skip BlockBlobAppendStream.

2017-09-25 14:47:17,484 DEBUG [JUnit-testReadOOBWrites]: 
azure.AzureNativeFileSystemStore 
(AzureNativeFileSystemStore.java:initialize(550)) - Block blobs with compaction 
directories:  

{code:title=hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/AzureNativeFileSystemStore.java|borderStyle=solid}
  if (isBlockBlobWithCompactionKey(key)) {
    BlockBlobAppendStream blockBlobOutputStream = new BlockBlobAppendStream(
        (CloudBlockBlobWrapper) blob,
        keyEncoded,
        this.uploadBlockSizeBytes,
        true,
        getInstrumentedContext());
    outputStream = blockBlobOutputStream;
  } else {
    outputStream = openOutputStream(blob);
  }
{code}





[jira] [Comment Edited] (HADOOP-14906) ITestAzureConcurrentOutOfBandIo failing with checksum errors on write

2017-09-25 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179511#comment-16179511
 ] 

Georgi Chalakov edited comment on HADOOP-14906 at 9/25/17 6:27 PM:
---

I am not sure that this is related to the block compaction change. The debug 
message shows no directories in the list for block blobs with compaction. I 
posted the code where we check whether the file is in one of those directories 
and if it is not we skip BlockBlobAppendStream.

{noformat}
2017-09-25 14:47:17,484 DEBUG [JUnit-testReadOOBWrites]: 
azure.AzureNativeFileSystemStore 
(AzureNativeFileSystemStore.java:initialize(550)) - Block blobs with compaction 
directories:  
{noformat}

{code:title=hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/AzureNativeFileSystemStore.java|borderStyle=solid}
  if (isBlockBlobWithCompactionKey(key)) {
    BlockBlobAppendStream blockBlobOutputStream = new BlockBlobAppendStream(
        (CloudBlockBlobWrapper) blob,
        keyEncoded,
        this.uploadBlockSizeBytes,
        true,
        getInstrumentedContext());
    outputStream = blockBlobOutputStream;
  } else {
    outputStream = openOutputStream(blob);
  }
{code}





[jira] [Comment Edited] (HADOOP-14906) ITestAzureConcurrentOutOfBandIo failing with checksum errors on write

2017-09-25 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179511#comment-16179511
 ] 

Georgi Chalakov edited comment on HADOOP-14906 at 9/25/17 6:26 PM:
---

I am not sure that this is related to the block compaction change. The debug 
message shows no directories in the list for block blobs with compaction. I 
posted the code where we check whether the file is in one of those directories 
and if it is not we skip BlockBlobAppendStream.

2017-09-25 14:47:17,484 DEBUG [JUnit-testReadOOBWrites]: 
azure.AzureNativeFileSystemStore 
(AzureNativeFileSystemStore.java:initialize(550)) - Block blobs with compaction 
directories:  

{code:title=hadoop-tools\hadoop-azure\src\main\java\org\apache\hadoop\fs\azure\AzureNativeFileSystemStore.java|borderStyle=solid}
  if (isBlockBlobWithCompactionKey(key)) {
    BlockBlobAppendStream blockBlobOutputStream = new BlockBlobAppendStream(
        (CloudBlockBlobWrapper) blob,
        keyEncoded,
        this.uploadBlockSizeBytes,
        true,
        getInstrumentedContext());
    outputStream = blockBlobOutputStream;
  } else {
    outputStream = openOutputStream(blob);
  }
{code}





[jira] [Comment Edited] (HADOOP-14906) ITestAzureConcurrentOutOfBandIo failing with checksum errors on write

2017-09-25 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179511#comment-16179511
 ] 

Georgi Chalakov edited comment on HADOOP-14906 at 9/25/17 6:26 PM:
---

I am not sure that this is related to the block compaction change. The debug 
message shows no directories in the list for block blobs with compaction. I 
posted the code where we check whether the file is in one of those directories 
and if it is not we skip BlockBlobAppendStream.

2017-09-25 14:47:17,484 DEBUG [JUnit-testReadOOBWrites]: 
azure.AzureNativeFileSystemStore 
(AzureNativeFileSystemStore.java:initialize(550)) - Block blobs with compaction 
directories:  

{code:title=Bar.java|borderStyle=solid}
  if (isBlockBlobWithCompactionKey(key)) {
    BlockBlobAppendStream blockBlobOutputStream = new BlockBlobAppendStream(
        (CloudBlockBlobWrapper) blob,
        keyEncoded,
        this.uploadBlockSizeBytes,
        true,
        getInstrumentedContext());
    outputStream = blockBlobOutputStream;
  } else {
    outputStream = openOutputStream(blob);
  }
{code}





[jira] [Commented] (HADOOP-14906) ITestAzureConcurrentOutOfBandIo failing with checksum errors on write

2017-09-25 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179511#comment-16179511
 ] 

Georgi Chalakov commented on HADOOP-14906:
--

I am not sure that this is related to the block compaction change. The debug 
message shows no directories in the list for block blobs with compaction. I 
posted the code where we check whether the file is in one of those directories 
and if it is not we skip BlockBlobAppendStream.

2017-09-25 14:47:17,484 DEBUG [JUnit-testReadOOBWrites]: 
azure.AzureNativeFileSystemStore 
(AzureNativeFileSystemStore.java:initialize(550)) - Block blobs with compaction 
directories:  

  if (isBlockBlobWithCompactionKey(key)) {
    BlockBlobAppendStream blockBlobOutputStream = new BlockBlobAppendStream(
        (CloudBlockBlobWrapper) blob,
        keyEncoded,
        this.uploadBlockSizeBytes,
        true,
        getInstrumentedContext());
    outputStream = blockBlobOutputStream;
  } else {
    outputStream = openOutputStream(blob);
  }




[jira] [Commented] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-09-11 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16161580#comment-16161580
 ] 

Georgi Chalakov commented on HADOOP-14520:
--

Thanks for the review Steve! 

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 2.7.4
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Fix For: 2.0.6-alpha, 3.0.0-beta1
>
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-008.patch, 
> HADOOP-14520-009.patch, HADOOP-14520-05.patch, HADOOP_14520_07.patch, 
> HADOOP_14520_08.patch, HADOOP_14520_09.patch, HADOOP_14520_10.patch, 
> hadoop-14520-branch-2-010.patch, HADOOP-14520-patch-07-08.diff, 
> HADOOP-14520-patch-07-09.diff
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4M, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP_14520_07.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 777, Failures: 0, Errors: 0, Skipped: 155
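
The selection step in the description can be sketched as follows. This is only an illustration under stated assumptions, not the code from the patch: it assumes that a "sequence" is a contiguous run of blocks, that "longest" means the run with the most blocks, that "4M" means 4 * 1024 * 1024 bytes, and that ties go to the earliest run. The class and method names are hypothetical.

```java
/** Hypothetical sketch of picking the run of blocks to compact. */
final class CompactionWindow {

  /** Assumed meaning of the "4M" limit in the description. */
  static final long LIMIT_BYTES = 4L * 1024 * 1024;

  /**
   * Returns {start, endExclusive} of the longest contiguous run of blocks
   * whose total size is strictly below the limit, or {0, 0} if no block fits.
   * Block sizes are assumed to be non-negative.
   */
  static int[] longestRunUnder(long[] sizes, long limit) {
    int bestStart = 0;
    int bestEnd = 0;
    int start = 0;
    long sum = 0;
    for (int end = 0; end < sizes.length; end++) {
      sum += sizes[end];
      // Shrink the window from the left until it is below the limit again.
      while (sum >= limit && start <= end) {
        sum -= sizes[start];
        start++;
      }
      // Keep the window only if it holds strictly more blocks than the best.
      if (end + 1 - start > bestEnd - bestStart) {
        bestStart = start;
        bestEnd = end + 1;
      }
    }
    return new int[] {bestStart, bestEnd};
  }

  public static void main(String[] args) {
    long mb = 1024 * 1024;
    long[] sizes = {mb, mb, 3 * mb, mb / 2, mb / 2, mb / 2, mb};
    int[] run = longestRunUnder(sizes, LIMIT_BYTES);
    System.out.println("compact blocks [" + run[0] + ", " + run[1] + ")");
  }
}
```

A run of fewer than two blocks would not reduce the block count, so a caller would presumably skip compaction in that case. Runs that are not chosen stay untouched and remain candidates for the next round.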






[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-09-08 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Affects Version/s: (was: 3.0.0-alpha3)
   2.7.4
   Status: Patch Available  (was: Open)

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 2.7.4
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-008.patch, 
> HADOOP-14520-009.patch, HADOOP-14520-05.patch, HADOOP_14520_07.patch, 
> HADOOP_14520_08.patch, HADOOP_14520_09.patch, HADOOP_14520_10.patch, 
> hadoop-14520-branch-2-010.patch, HADOOP-14520-patch-07-08.diff, 
> HADOOP-14520-patch-07-09.diff
>
>



[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-09-08 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Open  (was: Patch Available)

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-008.patch, 
> HADOOP-14520-009.patch, HADOOP-14520-05.patch, HADOOP_14520_07.patch, 
> HADOOP_14520_08.patch, HADOOP_14520_09.patch, HADOOP_14520_10.patch, 
> hadoop-14520-branch-2-010.patch, HADOOP-14520-patch-07-08.diff, 
> HADOOP-14520-patch-07-09.diff
>
>



[jira] [Commented] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-09-08 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16159350#comment-16159350
 ] 

Georgi Chalakov commented on HADOOP-14520:
--

Thanks for the review Steve! 

I have attached the patch for branch-2:  hadoop-14520-branch-2-010.patch
Results from endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
Tests run: 774, Failures: 0, Errors: 0, Skipped: 131


> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-008.patch, 
> HADOOP-14520-009.patch, HADOOP-14520-05.patch, HADOOP_14520_07.patch, 
> HADOOP_14520_08.patch, HADOOP_14520_09.patch, HADOOP_14520_10.patch, 
> hadoop-14520-branch-2-010.patch, HADOOP-14520-patch-07-08.diff, 
> HADOOP-14520-patch-07-09.diff
>
>



[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-09-08 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Attachment: hadoop-14520-branch-2-010.patch

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-008.patch, 
> HADOOP-14520-009.patch, HADOOP-14520-05.patch, HADOOP_14520_07.patch, 
> HADOOP_14520_08.patch, HADOOP_14520_09.patch, HADOOP_14520_10.patch, 
> hadoop-14520-branch-2-010.patch, HADOOP-14520-patch-07-08.diff, 
> HADOOP-14520-patch-07-09.diff
>
>



[jira] [Comment Edited] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-31 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149691#comment-16149691
 ] 

Georgi Chalakov edited comment on HADOOP-14520 at 8/31/17 10:08 PM:


Thank you for adding all these fixes. Stream capabilities looks like a useful 
feature.  

Re: flush()
FSDataOutputStream doesn't override flush(), and a normal flush() call at the 
application level would not execute BlockBlobAppendStream::flush(). When 
compaction is disabled, hflush/hsync are no-ops and the performance of 
BlockBlobAppendStream for all operations is the same as (or better than) before. 

Re: more than one append stream
We take a lease on the blob, which means that at any point in time you can have 
only one append stream. If we had more than one append stream open at the same 
time, we couldn't guarantee the order of write operations.

I have added an hsync() call and made isclosed volatile. 

Re: close()
I think the first exception is the best indication of what went wrong. After an 
exception, close() is just best effort. I don't know how useful it would be for 
a client to continue after an IO-related exception, but if that is necessary, 
the client can continue. If block compaction is enabled, the client can go and 
read all the data up to the last successful hflush()/hsync(). When block 
compaction is disabled, we guarantee nothing. We may or may not have the data 
stored in the service.  
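
The close() policy described above (report the first exception, treat close() as best effort) could look roughly like the sketch below. It is a hypothetical wrapper written for illustration only; FirstErrorStream and its fields are assumed names and this is not how BlockBlobAppendStream is implemented.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical wrapper that remembers and reports the first IO failure. */
final class FirstErrorStream extends OutputStream {
  private final OutputStream inner;
  private final AtomicReference<IOException> firstError = new AtomicReference<>();
  // volatile so that a close from another thread is visible to writers
  private volatile boolean closed;

  FirstErrorStream(OutputStream inner) {
    this.inner = inner;
  }

  @Override
  public void write(int b) throws IOException {
    if (closed) {
      throw new IOException("stream is closed");
    }
    rethrowFirstError();
    try {
      inner.write(b);
    } catch (IOException e) {
      // Only the first failure is kept; later ones are not recorded.
      firstError.compareAndSet(null, e);
      throw e;
    }
  }

  @Override
  public void close() throws IOException {
    if (closed) {
      return;
    }
    closed = true;
    try {
      inner.close(); // best effort: still attempted after an earlier failure
    } catch (IOException e) {
      firstError.compareAndSet(null, e);
    }
    rethrowFirstError();
  }

  private void rethrowFirstError() throws IOException {
    IOException e = firstError.get();
    if (e != null) {
      throw e;
    }
  }

  public static void main(String[] args) throws IOException {
    try (FirstErrorStream out = new FirstErrorStream(new ByteArrayOutputStream())) {
      out.write(42); // no failure, so close() completes quietly
    }
    System.out.println("closed without error");
  }
}
```

With this shape, a caller that sees an exception from close() gets the original write failure rather than a secondary close failure, which matches the idea that the first exception is the most useful one.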




was (Author: georgi):
Thank you for adding all these fixes. Stream capabilities looks like a useful 
feature.  

Re: flush()
FSDataOutputStream doesn't override flush(), and a normal flush() call at the 
application level would not execute BlockBlobAppendStream::flush(). When 
compaction is disabled, hflush/hsync are no-ops and the performance of 
BlockBlobAppendStream for all operations is the same as (or better than) before. 

Re: more than one append stream
We take a lease on the blob, which means that at any point in time you can have 
only one append stream. If we had more than one append stream open at the same 
time, we couldn't guarantee the order of write operations.

I have added an hsync() call and made isclosed volatile. 

Re: close()
I think the first exception is the best indication of what went wrong. After an 
exception, close() is just best effort. I don't know how useful it would be for 
a client to continue after an IO-related exception, but if that is necessary, 
the client can continue. If block compaction is enabled, the client can go and 
read all the data up to the last hflush()/hsync(). When block compaction is 
disabled, we guarantee nothing. We may or may not have the data stored in the 
service.  



> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-008.patch, 
> HADOOP-14520-009.patch, HADOOP-14520-05.patch, HADOOP_14520_07.patch, 
> HADOOP_14520_08.patch, HADOOP_14520_09.patch, HADOOP_14520_10.patch, 
> HADOOP-14520-patch-07-08.diff, HADOOP-14520-patch-07-09.diff
>
>



[jira] [Comment Edited] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-31 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149691#comment-16149691
 ] 

Georgi Chalakov edited comment on HADOOP-14520 at 8/31/17 10:07 PM:


Thank you for adding all these fixes. Stream capabilities looks like a useful
feature.

Re:flush()
FSDataOutputStream doesn't override flush(), and a normal flush() call at the
application level would not execute BlockBlobAppendStream::flush(). When
compaction is disabled, hflush/hsync are no-ops and the performance of
BlockBlobAppendStream for all operations is the same as (or better than) before.

Re:more than one append stream
We take a lease on the blob, which means that at any point in time you can have
only one append stream. If we had more than one append stream open at the same
time, we couldn't guarantee the order of write operations.

I have added an hsync() call and made isclosed volatile.

Re:close()
I think the first exception is the best indication of what went wrong. After an
exception, close() is just best effort. I don't know how useful it would be for
a client to continue after an IO-related exception, but if that is necessary,
the client can continue. If block compaction is enabled, the client can go and
read all the data up to the last hflush()/hsync(). When block compaction is
disabled, we guarantee nothing. We may or may not have the data stored in the
service.
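
The close() policy described above (report the first exception, treat close()
as best effort) could look roughly like the sketch below. It is an
illustration only: FirstErrorStream and its fields are hypothetical names, the
wrapped stream stands in for the real upload path, and none of this is taken
from BlockBlobAppendStream itself.

```java
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical stream; illustrates the error-reporting policy only. */
class FirstErrorStream extends OutputStream {
  private final OutputStream out;
  // volatile so that other threads observe the updated state
  private volatile boolean closed;
  private volatile IOException firstError;

  FirstErrorStream(OutputStream out) {
    this.out = out;
  }

  @Override
  public void write(int b) throws IOException {
    if (closed) {
      throw new IOException("stream is closed");
    }
    if (firstError != null) {
      throw firstError;            // keep reporting the original failure
    }
    try {
      out.write(b);
    } catch (IOException e) {
      firstError = e;              // remember the first failure
      throw e;
    }
  }

  @Override
  public void close() throws IOException {
    if (closed) {
      return;
    }
    closed = true;
    try {
      out.close();                 // best effort after an earlier failure
    } catch (IOException e) {
      if (firstError == null) {
        firstError = e;
      } else if (e != firstError) {
        firstError.addSuppressed(e);
      }
    }
    if (firstError != null) {
      throw firstError;
    }
  }
}
```

A failure during close() is attached as a suppressed exception, so the caller
still sees the original write failure first.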




was (Author: georgi):
Thank you for adding all these fixes. Stream capabilities looks like a useful
feature.

Re:flush()
FSDataOutputStream doesn't override flush(), and a normal flush() call at the
application level would not execute BlockBlobAppendStream::flush(). When
compaction is disabled, hflush/hsync are no-ops and the performance of
BlockBlobAppendStream for all operations is the same as (or better than) before.

Re:more than one append stream
We take a lease on the blob, which means that at any point in time you can have
only one append stream. If we had more than one append stream, we could not
guarantee the order of write operations.

I have added an hsync() call and made isclosed volatile.

Re:close()
I think the first exception is the best indication of what went wrong. After an
exception, close() is just best effort. I don't know how useful it would be for
a client to continue after an IO-related exception, but if that is necessary,
the client can continue. If block compaction is enabled, the client can go and
read all the data up to the last hflush()/hsync(). When block compaction is
disabled, we guarantee nothing. We may or may not have the data stored in the
service.



> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-008.patch, 
> HADOOP-14520-009.patch, HADOOP-14520-05.patch, HADOOP_14520_07.patch, 
> HADOOP_14520_08.patch, HADOOP_14520_09.patch, HADOOP_14520_10.patch, 
> HADOOP-14520-patch-07-08.diff, HADOOP-14520-patch-07-09.diff
>
>



[jira] [Comment Edited] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-31 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149691#comment-16149691
 ] 

Georgi Chalakov edited comment on HADOOP-14520 at 8/31/17 10:06 PM:


Thank you for adding all these fixes. Stream capabilities looks like a useful
feature.

Re:flush()
FSDataOutputStream doesn't override flush(), and a normal flush() call at the
application level would not execute BlockBlobAppendStream::flush(). When
compaction is disabled, hflush/hsync are no-ops and the performance of
BlockBlobAppendStream for all operations is the same as (or better than) before.

Re:more than one append stream
We take a lease on the blob, which means that at any point in time you can have
only one append stream. If we had more than one append stream, we could not
guarantee the order of write operations.

I have added an hsync() call and made isclosed volatile.

Re:close()
I think the first exception is the best indication of what went wrong. After an
exception, close() is just best effort. I don't know how useful it would be for
a client to continue after an IO-related exception, but if that is necessary,
the client can continue. If block compaction is enabled, the client can go and
read all the data up to the last hflush()/hsync(). When block compaction is
disabled, we guarantee nothing. We may or may not have the data stored in the
service.




was (Author: georgi):
Thank you for adding all these fixes. Stream capabilities looks like a useful
feature.

I will fix the space in the last patch.

Re:flush()
FSDataOutputStream doesn't override flush(), and a normal flush() call at the
application level would not execute BlockBlobAppendStream::flush(). When
compaction is disabled, hflush/hsync are no-ops and the performance of
BlockBlobAppendStream is the same as (or better than) before.

Re:more than one append stream
We take a lease on the blob, which means that at any point in time you can have
only one append stream. If we had more than one append stream, we could not
guarantee the order of write operations.

I have added an hsync() call and made isclosed volatile.

Re:close()
I think the first exception is the best indication of what went wrong. After an
exception, close() is just best effort. I don't know how useful it would be for
a client to continue after an IO-related exception, but if that is necessary,
the client can continue. If block compaction is enabled, the client can go and
read all the data up to the last hflush()/hsync(). When block compaction is
disabled, we guarantee nothing. We may or may not have the data stored in the
service.



> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-008.patch, 
> HADOOP-14520-009.patch, HADOOP-14520-05.patch, HADOOP_14520_07.patch, 
> HADOOP_14520_08.patch, HADOOP_14520_09.patch, HADOOP_14520_10.patch, 
> HADOOP-14520-patch-07-08.diff, HADOOP-14520-patch-07-09.diff
>
>



[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-31 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Patch Available  (was: Open)

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-008.patch, 
> HADOOP-14520-009.patch, HADOOP-14520-05.patch, HADOOP_14520_07.patch, 
> HADOOP_14520_08.patch, HADOOP_14520_09.patch, HADOOP_14520_10.patch, 
> HADOOP-14520-patch-07-08.diff, HADOOP-14520-patch-07-09.diff
>
>



[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-31 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Attachment: HADOOP_14520_10.patch

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-008.patch, 
> HADOOP-14520-009.patch, HADOOP-14520-05.patch, HADOOP_14520_07.patch, 
> HADOOP_14520_08.patch, HADOOP_14520_09.patch, HADOOP_14520_10.patch, 
> HADOOP-14520-patch-07-08.diff, HADOOP-14520-patch-07-09.diff
>
>



[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-31 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Open  (was: Patch Available)

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-008.patch, 
> HADOOP-14520-009.patch, HADOOP-14520-05.patch, HADOOP_14520_07.patch, 
> HADOOP_14520_08.patch, HADOOP_14520_09.patch, HADOOP-14520-patch-07-08.diff, 
> HADOOP-14520-patch-07-09.diff
>
>



[jira] [Commented] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-31 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149691#comment-16149691
 ] 

Georgi Chalakov commented on HADOOP-14520:
--

Thank you for adding all these fixes. Stream capabilities looks like a useful
feature.

I will fix the space in the last patch.

Re:flush()
FSDataOutputStream doesn't override flush(), and a normal flush() call at the
application level would not execute BlockBlobAppendStream::flush(). When
compaction is disabled, hflush/hsync are no-ops and the performance of
BlockBlobAppendStream is the same as (or better than) before.

Re:more than one append stream
We take a lease on the blob, which means that at any point in time you can have
only one append stream. If we had more than one append stream, we could not
guarantee the order of write operations.

I have added an hsync() call and made isclosed volatile.

Re:close()
I think the first exception is the best indication of what went wrong. After an
exception, close() is just best effort. I don't know how useful it would be for
a client to continue after an IO-related exception, but if that is necessary,
the client can continue. If block compaction is enabled, the client can go and
read all the data up to the last hflush()/hsync(). When block compaction is
disabled, we guarantee nothing. We may or may not have the data stored in the
service.



> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-008.patch, 
> HADOOP-14520-009.patch, HADOOP-14520-05.patch, HADOOP_14520_07.patch, 
> HADOOP_14520_08.patch, HADOOP_14520_09.patch, HADOOP-14520-patch-07-08.diff, 
> HADOOP-14520-patch-07-09.diff
>
>



[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-30 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Patch Available  (was: Open)

fixes javadoc issues.

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-05.patch, 
> HADOOP_14520_07.patch, HADOOP_14520_08.patch, HADOOP_14520_09.patch
>
>



[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-30 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Open  (was: Patch Available)

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-05.patch, 
> HADOOP_14520_07.patch, HADOOP_14520_08.patch, HADOOP_14520_09.patch
>
>



[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-30 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Attachment: HADOOP_14520_09.patch

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-05.patch, 
> HADOOP_14520_07.patch, HADOOP_14520_08.patch, HADOOP_14520_09.patch
>
>



[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-30 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Open  (was: Patch Available)

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-05.patch, 
> HADOOP_14520_07.patch, HADOOP_14520_08.patch
>
>



[jira] [Work started] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-30 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-14520 started by Georgi Chalakov.

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-05.patch, 
> HADOOP_14520_07.patch, HADOOP_14520_08.patch
>
>



[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-30 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Patch Available  (was: In Progress)

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-05.patch, 
> HADOOP_14520_07.patch, HADOOP_14520_08.patch
>
>



[jira] [Commented] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-30 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16147827#comment-16147827
 ] 

Georgi Chalakov commented on HADOOP-14520:
--

HADOOP_14520_08.patch 
whitespace fixes; javadoc fixes.

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-05.patch, 
> HADOOP_14520_07.patch, HADOOP_14520_08.patch
>
>



[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-30 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Attachment: HADOOP_14520_08.patch

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-05.patch, 
> HADOOP_14520_07.patch, HADOOP_14520_08.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4M, 
> compaction chooses the longest one. It is a greedy algorithm that preserve 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP_14520_07.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 777, Failures: 0, Errors: 0, Skipped: 155






[jira] [Comment Edited] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-29 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16146357#comment-16146357
 ] 

Georgi Chalakov edited comment on HADOOP-14520 at 8/29/17 11:50 PM:


HADOOP_14520_07.patch
Results : Tests run: 777, Failures: 0, Errors: 0, Skipped: 155

bq. if you are changing precondition check, I'd recommend StringUtils.isEmpty() 
for Preconditions.checkArgument(StringUtils.isNotEmpty(aKey));

Done.

bq. If fields aren't updated after the constructor, best to set to final 
(example, compactionEnabled ?).

Done.

bq. How long is downloadBlockList going to take in that constructor? More 
specifically: if compaction is disabled, can that step be skipped?

downloadBlockList is used for two purposes: 1) to check for block existence 2) 
to download the block list

bq. If the stream needs a byte buffer, best to use ElasticByteBufferPool as a 
pool of buffers.

Done.

bq. Use StorageErrorCodeStrings as the source of string constants to check for 
in exception error codes.

Done.

bq. Rather than throw IOException(e), I'd prefer more specific (existing ones). 
That's PathIOException and subclasses, AzureException(e), and the 
java.io/java.nio ones.

Done

bq. When wrapping a StorageException with another IOE, always include the 
toString value of the wrapped exception. That way, the log message of the top 
level log retains the underlying problem.

Done.
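
A minimal sketch of the wrapping convention requested above, assuming a stand-in exception type: FakeStorageException and the wrap() helper are hypothetical and only illustrate how the wrapped exception's toString() value ends up in the outer message while the cause is kept.

```java
import java.io.IOException;

/** Hypothetical sketch; the real code wraps the Azure SDK's StorageException. */
final class WrapSketch {
  /** Stand-in for the SDK exception, used for illustration only. */
  static final class FakeStorageException extends Exception {
    FakeStorageException(String message) {
      super(message);
    }
  }

  /**
   * Wraps cause in an IOException. String concatenation calls
   * cause.toString(), so the top-level log line keeps the underlying problem.
   */
  static IOException wrap(String path, Exception cause) {
    return new IOException("Failed to write " + path + ": " + cause, cause);
  }
}
```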

bq. BlockBlobAppendStream.WriteRequest retry logic will retry even on 
RuntimeExceptions like IllegalArgumentException. Ideally they should be split 
into recoverable vs non-recoverable ops via a RetryPolicy. Is this an issue to 
address here though? Overall, with the new operations doing retries, this may be 
time to embrace retry policies. Or at least create a JIRA entry on doing so.

add*Command() will rethrow the last exception. That means the following write() 
or close() will rethrow the stored exception. It will not happen right away, 
but it will happen before the stream is closed.
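
The split between recoverable and non-recoverable failures that the reviewer asks about could look roughly like the following. This is a sketch under stated assumptions, not the patch's WriteRequest logic: RetrySketch, IoOp and callWithRetries are hypothetical names, IOException is assumed to be the only recoverable failure, and no Hadoop or Azure SDK retry class is used.

```java
import java.io.IOException;

/** Hypothetical sketch of retrying only failures presumed to be transient. */
final class RetrySketch {
  /** An operation that may fail with a (presumed transient) IOException. */
  interface IoOp<T> {
    T run() throws IOException;
  }

  /**
   * Runs op up to maxAttempts times. IOExceptions are retried; runtime
   * exceptions such as IllegalArgumentException propagate on the first
   * attempt because retrying a programming error cannot succeed.
   */
  static <T> T callWithRetries(IoOp<T> op, int maxAttempts) throws IOException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return op.run();
      } catch (IOException e) {
        last = e; // remember the failure and try again
      }
    }
    throw last; // every attempt failed with an IOException
  }
}
```

A real policy would normally also wait between attempts; the sketch leaves that out to keep the classification visible.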

bq. I know java.io.OutputStream is marked as single-thread only, but I know of 
code (hello HBase!) which means that you must make some of the calls thread 
safe. HADOOP-11708/HADOOP-11710 covers this issue in CryptoOutputStream. At the 
very least, flush() must be synchronous with itself, close() & maybe write()

flush() is synchronous with itself through addFlushCommand(). We do not want 
flush() to be synchronous with write(): while one thread waits for a flush(), 
other threads should be able to continue writing. 

bq. I'm unsure about BlockBlobAppendStream.close() waiting for up to 15 minutes 
for things to complete, but looking @ other blobstore clients, I can see that 
they are implicitly waiting without any timeout at all. And it's in the 
existing codebase. But: why was the time limit changed from 10 min to 15? Was 
this based on test failures? If so, where is the guarantee that a 15 minute 
wait is always sufficient.

The change to 15 min was not based on test failures. I have changed the timeout 
back to 10 min and added a const. 
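
As an illustration of a bounded wait driven by a named constant, here is a minimal sketch. The constant name, the helper and the choice to fail with an IOException on timeout are assumptions; the actual behavior of BlockBlobAppendStream.close() is not shown in this thread.

```java
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a bounded wait for pending uploads. */
final class CloseTimeoutSketch {
  /** Hypothetical constant name; the 10 minute value is the one stated above. */
  static final long CLOSE_UPLOAD_TIMEOUT_MINUTES = 10;

  /** Stops accepting new work and waits for queued uploads to finish. */
  static void awaitUploads(ExecutorService pool, long timeout, TimeUnit unit)
      throws IOException {
    pool.shutdown();
    try {
      if (!pool.awaitTermination(timeout, unit)) {
        pool.shutdownNow(); // give up on whatever is still running
        throw new IOException("Timed out after " + timeout + " " + unit
            + " waiting for pending uploads");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // keep the interrupt status
      throw new IOException("Interrupted while waiting for pending uploads: " + e, e);
    }
  }
}
```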

bq. Looking at BlockBlobAppendStream thread pooling, I think having a thread 
pool per output stream is expensive, especially as it has a minimum size of 4; 
it will ramp up fast. A pool of min=1 max=4 might be less expensive. But 
really, the stream should be thinking about sharing a pool common to the FS, 
relying on callbacks to notify it of completion rather than just awaiting pool 
completion and a shared writeable field.

I ran some tests with YCSB and a pool of min=1, max=4. It is slower, and the 
difference is measurable. Considering how many output streams you usually have 
per FS, I would like to keep min=4, max=4. The shared pool is a good idea, but 
I am afraid it would need a bigger change, and in the end I am not sure we would 
get significant benefits. 
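
For reference, the two pool shapes being compared map onto java.util.concurrent.ThreadPoolExecutor as sketched below. The helper name, the keep-alive value and the unbounded queue are assumptions, not the patch's settings. One documented property of ThreadPoolExecutor matters here: with an unbounded queue, threads beyond the core size are never created, so a min=1, max=4 pool built this way would run on a single thread.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of the pool configurations discussed above. */
final class PoolSketch {
  /**
   * Builds a pool with min core threads and max total threads. The 10 second
   * keep-alive and the unbounded LinkedBlockingQueue are illustrative choices.
   */
  static ThreadPoolExecutor newPool(int min, int max) {
    return new ThreadPoolExecutor(
        min, max, 10, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
  }
}
```

Usage: `PoolSketch.newPool(4, 4)` corresponds to the min=4, max=4 setting kept in the patch, and `PoolSketch.newPool(1, 4)` to the alternative that was measured.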

bq. I think the access/use of lastException needs to be made stronger than just 
a volatile, as it means that code of the form if (lastException!=null) throw 
lastException isn't thread safe. I know, it's not that harmful provided 
lastException is never set to null, but I'd still like some isolated 
get/getAndSet/maybeThrow operations. Similarly, is lastException the best way 
to propagate failure, as it means that teardown failures are going to get 
reported ahead of earlier ones during the write itself. Overall, I propose 
using Callable

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-05.patch, 
> 
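
The isolated record/maybeThrow operations requested for lastException in the comment above could be sketched with an AtomicReference. The class and method names are hypothetical and this is not the patch's code. compareAndSet keeps the first failure, so a later teardown error cannot mask an earlier write error.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch of a thread-safe holder for the first failure. */
final class FirstFailureSketch {
  private final AtomicReference<IOException> firstError = new AtomicReference<>();

  /** Stores e only if nothing was stored before; returns true if it was stored. */
  boolean record(IOException e) {
    return firstError.compareAndSet(null, e);
  }

  /** Throws the stored failure, if any. The value is read exactly once. */
  void maybeThrow() throws IOException {
    IOException e = firstError.get();
    if (e != null) {
      throw e;
    }
  }
}
```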

[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-29 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Patch Available  (was: In Progress)

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-05.patch, 
> HADOOP_14520_07.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4M, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP_14520_07.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 777, Failures: 0, Errors: 0, Skipped: 155






[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-29 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Attachment: HADOOP_14520_07.patch

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-05.patch, 
> HADOOP_14520_07.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4M, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP_14520_07.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 777, Failures: 0, Errors: 0, Skipped: 155






[jira] [Commented] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-29 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16146357#comment-16146357
 ] 

Georgi Chalakov commented on HADOOP-14520:
--

Results : Tests run: 777, Failures: 0, Errors: 0, Skipped: 155

bq. if you are changing precondition check, I'd recommend StringUtils.isEmpty() 
for Preconditions.checkArgument(StringUtils.isNotEmpty(aKey));

Done.

bq. If fields aren't updated after the constructor, best to set to final 
(example, compactionEnabled ?).

Done.

bq. How long is downloadBlockList going to take in that constructor? More 
specifically: if compaction is disabled, can that step be skipped?

downloadBlockList is used for two purposes: 1) to check for block existence 2) 
to download the block list

bq. If the stream needs a byte buffer, best to use ElasticByteBufferPool as a 
pool of buffers.

Done.

bq. Use StorageErrorCodeStrings as the source of string constants to check for 
in exception error codes.

Done.

bq. Rather than throw IOException(e), I'd prefer more specific (existing ones). 
That's PathIOException and subclasses, AzureException(e), and the 
java.io/java.nio ones.

Done

bq. When wrapping a StorageException with another IOE, always include the 
toString value of the wrapped exception. That way, the log message of the top 
level log retains the underlying problem.

Done.

bq. BlockBlobAppendStream.WriteRequest retry logic will retry even on 
RuntimeExceptions like IllegalArgumentException. Ideally they should be split 
into recoverable vs non-recoverable ops via a RetryPolicy. Is this an issue to 
address here though? Overall, with the new operations doing retries, this may be 
time to embrace retry policies. Or at least create a JIRA entry on doing so.

add*Command() will rethrow the last exception. That means the following write() 
or close() will rethrow the stored exception. It will not happen right away, 
but it will happen before the stream is closed.

bq. I know java.io.OutputStream is marked as single-thread only, but I know of 
code (hello HBase!) which means that you must make some of the calls thread 
safe. HADOOP-11708/HADOOP-11710 covers this issue in CryptoOutputStream. At the 
very least, flush() must be synchronous with itself, close() & maybe write()

flush() is synchronous with itself through addFlushCommand(). We do not want 
flush() to be synchronous with write(): while one thread waits for a flush(), 
other threads should be able to continue writing. 

bq. I'm unsure about BlockBlobAppendStream.close() waiting for up to 15 minutes 
for things to complete, but looking @ other blobstore clients, I can see that 
they are implicitly waiting without any timeout at all. And it's in the 
existing codebase. But: why was the time limit changed from 10 min to 15? Was 
this based on test failures? If so, where is the guarantee that a 15 minute 
wait is always sufficient.

The change to 15 min was not based on test failures. I have changed the timeout 
back to 10 min and added a const. 

bq. Looking at BlockBlobAppendStream thread pooling, I think having a thread 
pool per output stream is expensive, especially as it has a minimum size of 4; 
it will ramp up fast. A pool of min=1 max=4 might be less expensive. But 
really, the stream should be thinking about sharing a pool common to the FS, 
relying on callbacks to notify it of completion rather than just awaiting pool 
completion and a shared writeable field.

I ran some tests with YCSB and a pool of min=1, max=4. It is slower, and the 
difference is measurable. Considering how many output streams you usually have 
per FS, I would like to keep min=4, max=4. The shared pool is a good idea, but 
I am afraid it would need a bigger change, and in the end I am not sure we would 
get significant benefits. 

bq. I think the access/use of lastException needs to be made stronger than just 
a volatile, as it means that code of the form if (lastException!=null) throw 
lastException isn't thread safe. I know, it's not that harmful provided 
lastException is never set to null, but I'd still like some isolated 
get/getAndSet/maybeThrow operations. Similarly, is lastException the best way 
to propagate failure, as it means that teardown failures are going to get 
reported ahead of earlier ones during the write itself. Overall, I propose 
using Callable

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-05.patch, 
> HADOOP_14520_07.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every 

[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-29 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Open  (was: Patch Available)

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-05.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4M, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP_14520_07.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 777, Failures: 0, Errors: 0, Skipped: 155






[jira] [Work started] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-29 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-14520 started by Georgi Chalakov.

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-05.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4M, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP_14520_07.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 777, Failures: 0, Errors: 0, Skipped: 155






[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-29 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Description: 
Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
call. When the number of blocks is above 32000, next hflush/hsync triggers the 
block compaction process. Block compaction replaces a sequence of blocks with 
one block. From all the sequences with total length less than 4M, compaction 
chooses the longest one. It is a greedy algorithm that preserves all potential 
candidates for the next round. Block Compaction for WASB increases data 
durability and allows using block blobs instead of page blobs. By default, 
block compaction is disabled. Similar to the configuration for page blobs, the 
client needs to specify HDFS folders where block compaction over block blobs is 
enabled. 

Results for HADOOP_14520_07.patch
tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
Tests run: 777, Failures: 0, Errors: 0, Skipped: 155



  was:
Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
call. When the number of blocks is above 32000, next hflush/hsync triggers the 
block compaction process. Block compaction replaces a sequence of blocks with 
one block. From all the sequences with total length less than 4M, compaction 
chooses the longest one. It is a greedy algorithm that preserves all potential 
candidates for the next round. Block Compaction for WASB increases data 
durability and allows using block blobs instead of page blobs. By default, 
block compaction is disabled. Similar to the configuration for page blobs, the 
client needs to specify HDFS folders where block compaction over block blobs is 
enabled. 

Results for HADOOP_14520_05.patch
tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
Tests run: 777, Failures: 0, Errors: 0, Skipped: 155




> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-05.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4M, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP_14520_07.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 777, Failures: 0, Errors: 0, Skipped: 155






[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-08-29 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Description: 
Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
call. When the number of blocks is above 32000, next hflush/hsync triggers the 
block compaction process. Block compaction replaces a sequence of blocks with 
one block. From all the sequences with total length less than 4M, compaction 
chooses the longest one. It is a greedy algorithm that preserves all potential 
candidates for the next round. Block Compaction for WASB increases data 
durability and allows using block blobs instead of page blobs. By default, 
block compaction is disabled. Similar to the configuration for page blobs, the 
client needs to specify HDFS folders where block compaction over block blobs is 
enabled. 

Results for HADOOP_14520_05.patch
tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
Tests run: 777, Failures: 0, Errors: 0, Skipped: 155



  was:
Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
call. When the number of blocks is above 32000, next hflush/hsync triggers the 
block compaction process. Block compaction replaces a sequence of blocks with 
one block. From all the sequences with total length less than 4M, compaction 
chooses the longest one. It is a greedy algorithm that preserves all potential 
candidates for the next round. Block Compaction for WASB increases data 
durability and allows using block blobs instead of page blobs. By default, 
block compaction is disabled. Similar to the configuration for page blobs, the 
client needs to specify HDFS folders where block compaction over block blobs is 
enabled. 

Results for HADOOP-14520-05.patch
tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
Tests run: 707, Failures: 0, Errors: 0, Skipped: 119




> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-006.patch, HADOOP-14520-05.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4M, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP_14520_05.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 777, Failures: 0, Errors: 0, Skipped: 155






[jira] [Updated] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-07-21 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14518:
-
Attachment: (was: HADOOP0-14518-06.patch)

> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.0.0-beta1
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
>Priority: Minor
> Attachments: HADOOP-14518-01.patch, HADOOP-14518-01-test.txt, 
> HADOOP-14518-02.patch, HADOOP-14518-03.patch, HADOOP-14518-04.patch, 
> HADOOP-14518-05.patch, HADOOP-14518-06.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. If we customize the User-Agent 
> header, then it will enable better troubleshooting and analysis by Azure 
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.prefix</name>
>     <value>MSFT</value>
>   </property>
> sets the user agent to 
>  User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 
> (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test Results :
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119
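
The relationship between the configured prefix and the resulting header can be sketched as plain string composition. This is an illustration reverse-engineered from the single example in the description: the class and method names are hypothetical, and the behavior when no prefix is configured is not stated in this thread, so the sketch assumes a non-empty prefix.

```java
/** Hypothetical sketch of how the example User-Agent value is composed. */
final class UserAgentSketch {
  /**
   * Builds "WASB/<hadoopVersion> (<prefix>) <sdkUserAgent>", where
   * sdkUserAgent is the default value set by the Azure Client SDK.
   */
  static String userAgent(String hadoopVersion, String prefix, String sdkUserAgent) {
    return "WASB/" + hadoopVersion + " (" + prefix + ") " + sdkUserAgent;
  }
}
```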






[jira] [Updated] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-07-21 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14518:
-
Attachment: HADOOP-14518-06.patch

> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.0.0-beta1
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
>Priority: Minor
> Attachments: HADOOP-14518-01.patch, HADOOP-14518-01-test.txt, 
> HADOOP-14518-02.patch, HADOOP-14518-03.patch, HADOOP-14518-04.patch, 
> HADOOP-14518-05.patch, HADOOP-14518-06.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. If we customize the User-Agent 
> header, then it will enable better troubleshooting and analysis by Azure 
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.prefix</name>
>     <value>MSFT</value>
>   </property>
> sets the user agent to 
>  User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 
> (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test Results :
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-07-21 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14518:
-
Target Version/s: 2.8.1, 3.0.0-beta1  (was: 3.0.0-beta1, 2.8.1)
  Status: Patch Available  (was: In Progress)

> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.0.0-beta1
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
>Priority: Minor
> Attachments: HADOOP-14518-01.patch, HADOOP-14518-01-test.txt, 
> HADOOP-14518-02.patch, HADOOP-14518-03.patch, HADOOP-14518-04.patch, 
> HADOOP-14518-05.patch, HADOOP-14518-06.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. If we customize the User-Agent 
> header, then it will enable better troubleshooting and analysis by Azure 
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.prefix</name>
>     <value>MSFT</value>
>   </property>
> sets the user agent to 
>  User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 
> (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test Results :
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-07-21 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14518:
-
Affects Version/s: (was: 3.0.0-alpha3)
   3.0.0-beta1

> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.0.0-beta1
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
>Priority: Minor
> Attachments: HADOOP0-14518-06.patch, HADOOP-14518-01.patch, 
> HADOOP-14518-01-test.txt, HADOOP-14518-02.patch, HADOOP-14518-03.patch, 
> HADOOP-14518-04.patch, HADOOP-14518-05.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. If we customize the User-Agent 
> header, then it will enable better troubleshooting and analysis by Azure 
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.prefix</name>
>     <value>MSFT</value>
>   </property>
> sets the user agent to 
>  User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 
> (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test Results :
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-07-21 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14518:
-
Target Version/s: 2.8.1, 3.0.0-beta1  (was: 2.8.1, 3.0.0-alpha3)

> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.0.0-beta1
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
>Priority: Minor
> Attachments: HADOOP0-14518-06.patch, HADOOP-14518-01.patch, 
> HADOOP-14518-01-test.txt, HADOOP-14518-02.patch, HADOOP-14518-03.patch, 
> HADOOP-14518-04.patch, HADOOP-14518-05.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. If we customize the User-Agent 
> header, then it will enable better troubleshooting and analysis by the Azure 
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.prefix</name>
>     <value>MSFT</value>
>   </property>
> sets the User-Agent header to 
>  User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 
> (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test Results :
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-07-21 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14518:
-
Attachment: HADOOP0-14518-06.patch

> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
>Priority: Minor
> Attachments: HADOOP0-14518-06.patch, HADOOP-14518-01.patch, 
> HADOOP-14518-01-test.txt, HADOOP-14518-02.patch, HADOOP-14518-03.patch, 
> HADOOP-14518-04.patch, HADOOP-14518-05.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. If we customize the User-Agent 
> header, then it will enable better troubleshooting and analysis by the Azure 
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.prefix</name>
>     <value>MSFT</value>
>   </property>
> sets the User-Agent header to 
>  User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 
> (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test Results :
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-07-21 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14518:
-
Target Version/s: 3.0.0-alpha3, 2.8.1  (was: 2.8.1, 3.0.0-alpha3)
  Status: In Progress  (was: Patch Available)

> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
>Priority: Minor
> Attachments: HADOOP0-14518-06.patch, HADOOP-14518-01.patch, 
> HADOOP-14518-01-test.txt, HADOOP-14518-02.patch, HADOOP-14518-03.patch, 
> HADOOP-14518-04.patch, HADOOP-14518-05.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. If we customize the User-Agent 
> header, then it will enable better troubleshooting and analysis by the Azure 
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.prefix</name>
>     <value>MSFT</value>
>   </property>
> sets the User-Agent header to 
>  User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 
> (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test Results :
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-07-09 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Attachment: (was: HADOOP-14520-01.patch)

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-05.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4 MB, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-04.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 701, Failures: 0, Errors: 0, Skipped: 119
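
The selection step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the patch: the class and method names are hypothetical, "longest" is read as "most blocks", a sequence is taken to mean a contiguous run of blocks, "4M" is read as 4 MiB, and every block size is assumed to be positive. A sliding window finds the longest contiguous run whose total size stays below the limit, and one compaction round replaces that run with a single block.

```java
import java.util.Arrays;

class BlockCompactionSketch {
    static final int BLOCK_COUNT_THRESHOLD = 32000;            // from the issue description
    static final long MAX_COMPACTED_BYTES = 4L * 1024 * 1024;  // "4M", read as 4 MiB (assumption)

    // Compaction is only considered once the block count exceeds the threshold.
    static boolean shouldCompact(int blockCount) {
        return blockCount > BLOCK_COUNT_THRESHOLD;
    }

    // Returns {start, endExclusive} of the longest contiguous run of at least
    // two blocks whose total size is below limit, or null if there is none.
    // Assumes all sizes are positive, so a sliding window is sufficient.
    static int[] findLongestRun(long[] sizes, long limit) {
        int bestStart = 0, bestLen = 0, start = 0;
        long windowSum = 0;
        for (int end = 0; end < sizes.length; end++) {
            windowSum += sizes[end];
            while (windowSum >= limit && start <= end) {
                windowSum -= sizes[start++];   // shrink until the run fits again
            }
            int len = end - start + 1;
            if (len > bestLen) {
                bestLen = len;
                bestStart = start;
            }
        }
        return bestLen >= 2 ? new int[] {bestStart, bestStart + bestLen} : null;
    }

    // One compaction round: the chosen run becomes a single block, and all
    // other blocks are kept as candidates for later rounds.
    static long[] compactOnce(long[] sizes, long limit) {
        int[] run = findLongestRun(sizes, limit);
        if (run == null) {
            return sizes.clone();
        }
        long merged = 0;
        for (int i = run[0]; i < run[1]; i++) {
            merged += sizes[i];
        }
        long[] out = new long[sizes.length - (run[1] - run[0]) + 1];
        System.arraycopy(sizes, 0, out, 0, run[0]);
        out[run[0]] = merged;
        System.arraycopy(sizes, run[1], out, run[0] + 1, sizes.length - run[1]);
        return out;
    }

    public static void main(String[] args) {
        long[] sizes = {3, 1, 1, 1, 5, 2};   // toy sizes with a toy limit of 4
        System.out.println(Arrays.toString(compactOnce(sizes, 4)));
    }
}
```

In this toy run the three blocks of size 1 are merged into one block of size 3. The real implementation works against the blob's block list and may choose runs differently.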






[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-07-09 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Attachment: (was: HADOOP-14520-03.patch)

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-05.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4 MB, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-04.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 701, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-07-09 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: In Progress  (was: Patch Available)

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt, 
> HADOOP-14520-03.patch, HADOOP-14520-4.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4 MB, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-04.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 701, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-07-09 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Attachment: (was: HADOOP-14520-4.patch)

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-05.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4 MB, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-04.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 701, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-07-09 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Attachment: HADOOP-14520-05.patch

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-05.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4 MB, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-04.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 701, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-07-09 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Attachment: (was: HADOOP-14520-01-test.txt)

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-05.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4 MB, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-04.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 701, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-07-09 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Description: 
Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
call. When the number of blocks is above 32000, the next hflush/hsync triggers the 
block compaction process. Block compaction replaces a sequence of blocks with 
one block. From all the sequences with total length less than 4 MB, compaction 
chooses the longest one. It is a greedy algorithm that preserves all potential 
candidates for the next round. Block Compaction for WASB increases data 
durability and allows using block blobs instead of page blobs. By default, 
block compaction is disabled. Similar to the configuration for page blobs, the 
client needs to specify HDFS folders where block compaction over block blobs is 
enabled. 

Results for HADOOP-14520-05.patch
tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
Tests run: 707, Failures: 0, Errors: 0, Skipped: 119



  was:
Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
call. When the number of blocks is above 32000, the next hflush/hsync triggers the 
block compaction process. Block compaction replaces a sequence of blocks with 
one block. From all the sequences with total length less than 4 MB, compaction 
chooses the longest one. It is a greedy algorithm that preserves all potential 
candidates for the next round. Block Compaction for WASB increases data 
durability and allows using block blobs instead of page blobs. By default, 
block compaction is disabled. Similar to the configuration for page blobs, the 
client needs to specify HDFS folders where block compaction over block blobs is 
enabled. 

Results for HADOOP-14520-04.patch

tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
Tests run: 701, Failures: 0, Errors: 0, Skipped: 119




> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-05.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4 MB, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-05.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 707, Failures: 0, Errors: 0, Skipped: 119
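
The per-directory opt-in mentioned in the description can be sketched as a simple path check. This is an illustration only: the class and method names are hypothetical, and the configuration key shown in the comment (fs.azure.block.blob.with.compaction.dir) is an assumption about the key name and should be checked against the patch. The sketch assumes a comma-separated list of directories and treats a file as eligible when its path lies under one of them. An empty list means no file uses compaction, which matches the statement that the feature is disabled by default.

```java
import java.util.ArrayList;
import java.util.List;

class CompactionDirSketch {
    // Assumed key name, shown for orientation only:
    //   fs.azure.block.blob.with.compaction.dir = /hbase/WALs,/data/logs
    // Parses the comma-separated value into normalised directory paths.
    static List<String> parseDirs(String configured) {
        List<String> dirs = new ArrayList<>();
        if (configured == null) {
            return dirs;
        }
        for (String entry : configured.split(",")) {
            String dir = entry.trim();
            if (dir.isEmpty()) {
                continue;
            }
            if (!dir.startsWith("/")) {
                dir = "/" + dir;                           // ensure a leading slash
            }
            while (dir.length() > 1 && dir.endsWith("/")) {
                dir = dir.substring(0, dir.length() - 1);  // drop trailing slashes
            }
            dirs.add(dir);
        }
        return dirs;
    }

    // True when path is one of the configured directories or lies below one.
    static boolean isUnderCompactionDir(String path, List<String> dirs) {
        for (String dir : dirs) {
            if (dir.equals("/") || path.equals(dir) || path.startsWith(dir + "/")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> dirs = parseDirs("/hbase/WALs, /data/logs/");
        System.out.println(isUnderCompactionDir("/hbase/WALs/region-1.log", dirs));
        System.out.println(isUnderCompactionDir("/tmp/other.log", dirs));
    }
}
```

Matching on the directory plus a trailing slash avoids treating a sibling such as /hbase/WALsOld as enabled.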






[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-07-09 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Patch Available  (was: In Progress)

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-05.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4 MB, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-04.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 701, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-07-09 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Release Note: Block Compaction for Azure Block Blobs When the number of 
blocks in a block blob is above 32000, the process of compaction replaces a 
sequence of small blocks with one big block.   (was: Block Compaction for 
WASB. When the number of blocks in a block blob is above 32000, the process of 
compaction replaces a sequence of small blocks with one big block. )

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt, 
> HADOOP-14520-03.patch, HADOOP-14520-4.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4 MB, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-04.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 701, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-07-09 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Summary: WASB: Block compaction for Azure Block Blobs  (was: Block 
compaction for WASB (Block Blobs Instead of Page Plobs))

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt, 
> HADOOP-14520-03.patch, HADOOP-14520-4.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4 MB, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-04.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 701, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) WASB: Block compaction for Azure Block Blobs

2017-07-09 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Release Note: Block Compaction for Azure Block Blobs. When the number of 
blocks in a block blob is above 32000, the process of compaction replaces a 
sequence of small blocks with one big block.   (was: Block Compaction for 
Azure Block Blobs When the number of blocks in a block blob is above 32000, the 
process of compaction replaces a sequence of small blocks with one big 
block. )

> WASB: Block compaction for Azure Block Blobs
> 
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt, 
> HADOOP-14520-03.patch, HADOOP-14520-4.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4 MB, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-04.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 701, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB (Block Blobs Instead of Page Plobs)

2017-07-09 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Release Note: Block Compaction for WASB. When the number of blocks in a 
block blob is above 32000, the process of compaction replaces a sequence of 
small blocks with one big block.   (was: Block Compaction for WASB. When 
the number of blocks in a block blob is above 32000, compaction replaces the 
longest sequence of blocks with total size less than 4 MB with just one 
block. Compaction allows block blobs to be used instead of page blobs, 
including for WAL files.)
  Status: Patch Available  (was: In Progress)

> Block compaction for WASB (Block Blobs Instead of Page Plobs)
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt, 
> HADOOP-14520-03.patch, HADOOP-14520-4.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4 MB, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-04.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 701, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB (Block Blobs Instead of Page Plobs)

2017-07-09 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Attachment: HADOOP-14520-4.patch

> Block compaction for WASB (Block Blobs Instead of Page Plobs)
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt, 
> HADOOP-14520-03.patch, HADOOP-14520-4.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4 MB, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-04.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 701, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB (Block Blobs Instead of Page Plobs)

2017-07-09 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Description: 
Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
call. When the number of blocks is above 32000, the next hflush/hsync triggers the 
block compaction process. Block compaction replaces a sequence of blocks with 
one block. From all the sequences with total length less than 4 MB, compaction 
chooses the longest one. It is a greedy algorithm that preserves all potential 
candidates for the next round. Block Compaction for WASB increases data 
durability and allows using block blobs instead of page blobs. By default, 
block compaction is disabled. Similar to the configuration for page blobs, the 
client needs to specify HDFS folders where block compaction over block blobs is 
enabled. 

Results for HADOOP-14520-04.patch

tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
Tests run: 701, Failures: 0, Errors: 0, Skipped: 119



  was:
Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
call. When the number of blocks is above 32000, the next hflush/hsync triggers the 
block compaction process. Block compaction replaces a sequence of blocks with 
one block. From all the sequences with total length less than 4 MB, compaction 
chooses the longest one. It is a greedy algorithm that preserves all potential 
candidates for the next round. Block Compaction for WASB increases data 
durability and allows using block blobs instead of page blobs. By default, 
block compaction is disabled. Similar to the configuration for page blobs, the 
client needs to specify HDFS folders where block compaction over block blobs is 
enabled. 

Results for HADOOP-14520-01.patch
tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
Tests run: 704, Failures: 0, Errors: 0, Skipped: 119




> Block compaction for WASB (Block Blobs Instead of Page Blobs)
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt, 
> HADOOP-14520-03.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4M, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-04.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 701, Failures: 0, Errors: 0, Skipped: 119






[jira] [Created] (HADOOP-14610) Block compaction for WASB (Block Blobs Instead of Page Blobs)

2017-06-28 Thread Georgi Chalakov (JIRA)
Georgi Chalakov created HADOOP-14610:


 Summary: Block compaction for WASB (Block Blobs Instead of Page 
Blobs)
 Key: HADOOP-14610
 URL: https://issues.apache.org/jira/browse/HADOOP-14610
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/azure
Affects Versions: 3.0.0-alpha3
Reporter: Georgi Chalakov
Assignee: Georgi Chalakov









[jira] [Updated] (HADOOP-14520) Block compaction for WASB (Block Blobs Instead of Page Blobs)

2017-06-28 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Summary: Block compaction for WASB (Block Blobs Instead of Page Blobs)  
(was: Block compaction for WASB)

> Block compaction for WASB (Block Blobs Instead of Page Blobs)
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt, 
> HADOOP-14520-03.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4M, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119






[jira] [Work started] (HADOOP-14520) Block compaction for WASB

2017-06-27 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-14520 started by Georgi Chalakov.

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt, 
> HADOOP-14520-03.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4M, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-27 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Open  (was: Patch Available)

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt, 
> HADOOP-14520-03.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4M, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-06-26 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14518:
-
Target Version/s: 3.0.0-alpha3, 2.8.1  (was: 2.8.1, 3.0.0-alpha3)
  Status: Patch Available  (was: Open)

The configuration key has been renamed to "fs.azure.user.agent.prefix".
A new test, "testUserAgentConfig", has been added.

> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
>Priority: Minor
> Attachments: HADOOP-14518-01.patch, HADOOP-14518-01-test.txt, 
> HADOOP-14518-02.patch, HADOOP-14518-03.patch, HADOOP-14518-04.patch, 
> HADOOP-14518-05.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. If we customize the User-Agent 
> header, then it will enable better troubleshooting and analysis by the Azure 
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.prefix</name>
>     <value>MSFT</value>
>   </property>
> sets the user agent to 
>  User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 
> (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test Results :
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-06-26 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14518:
-
Attachment: HADOOP-14518-05.patch

> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
>Priority: Minor
> Attachments: HADOOP-14518-01.patch, HADOOP-14518-01-test.txt, 
> HADOOP-14518-02.patch, HADOOP-14518-03.patch, HADOOP-14518-04.patch, 
> HADOOP-14518-05.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. If we customize the User-Agent 
> header, then it will enable better troubleshooting and analysis by the Azure 
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.prefix</name>
>     <value>MSFT</value>
>   </property>
> sets the user agent to 
>  User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 
> (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test Results :
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-06-26 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14518:
-
Target Version/s: 3.0.0-alpha3, 2.8.1  (was: 2.8.1, 3.0.0-alpha3)
  Status: Open  (was: Patch Available)

> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
>Priority: Minor
> Attachments: HADOOP-14518-01.patch, HADOOP-14518-01-test.txt, 
> HADOOP-14518-02.patch, HADOOP-14518-03.patch, HADOOP-14518-04.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. If we customize the User-Agent 
> header, then it will enable better troubleshooting and analysis by the Azure 
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.prefix</name>
>     <value>MSFT</value>
>   </property>
> sets the user agent to 
>  User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 
> (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test Results :
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-06-26 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14518:
-
Description: 
WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
default value set by the Azure Client SDK, so Hadoop traffic doesn't appear any 
different from general Blob traffic. If we customize the User-Agent header, 
then it will enable better troubleshooting and analysis by the Azure service.

The following configuration
  <property>
    <name>fs.azure.user.agent.agent</name>
    <value>MSFT</value>
  </property>

sets the user agent to 
 User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 (JavaJRE 
1.8.0_131; WindowsServer2012R2 6.3)


Test Results :
Tests run: 703, Failures: 0, Errors: 0, Skipped: 119

  was:
WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
default value set by the Azure Client SDK, so Hadoop traffic doesn't appear any 
different from general Blob traffic. If we customize the User-Agent header, 
then it will enable better troubleshooting and analysis by the Azure service.

The following configuration
  <property>
    <name>fs.azure.user.agent.id</name>
    <value>MSFT</value>
  </property>

sets the user agent to 
 User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 (JavaJRE 
1.8.0_131; WindowsServer2012R2 6.3)


Test Results :
Tests run: 703, Failures: 0, Errors: 0, Skipped: 119


> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
>Priority: Minor
> Attachments: HADOOP-14518-01.patch, HADOOP-14518-01-test.txt, 
> HADOOP-14518-02.patch, HADOOP-14518-03.patch, HADOOP-14518-04.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. If we customize the User-Agent 
> header, then it will enable better troubleshooting and analysis by the Azure 
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.agent</name>
>     <value>MSFT</value>
>   </property>
> sets the user agent to 
>  User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 
> (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test Results :
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-06-26 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14518:
-
Description: 
WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
default value set by the Azure Client SDK, so Hadoop traffic doesn't appear any 
different from general Blob traffic. If we customize the User-Agent header, 
then it will enable better troubleshooting and analysis by the Azure service.

The following configuration
  <property>
    <name>fs.azure.user.agent.prefix</name>
    <value>MSFT</value>
  </property>

sets the user agent to 
 User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 (JavaJRE 
1.8.0_131; WindowsServer2012R2 6.3)
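
As a rough illustration of how the header value above is put together, the following Java sketch concatenates the three visible parts: the WASB version token, the configured prefix in parentheses, and the default string from the Azure Storage SDK. The class and method names are hypothetical, and the behavior when no prefix is configured is an assumption; the actual patch may build the string differently.

```java
/** Hypothetical sketch of the User-Agent layout; not taken from the patch. */
public class UserAgentSketch {
    /**
     * Builds "WASB/<hadoopVersion> (<prefix>) <sdkDefault>".
     * Assumption: an empty or missing prefix simply omits the parenthesized part.
     */
    static String userAgent(String hadoopVersion, String prefix, String sdkDefault) {
        StringBuilder sb = new StringBuilder("WASB/").append(hadoopVersion);
        if (prefix != null && !prefix.isEmpty()) {
            sb.append(" (").append(prefix).append(')');
        }
        return sb.append(' ').append(sdkDefault).toString();
    }
}
```

Called with "3.0.0-alpha4-SNAPSHOT", "MSFT", and the SDK string "Azure-Storage/4.2.0 (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)", the sketch reproduces the header shown in the description.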


Test Results :
Tests run: 703, Failures: 0, Errors: 0, Skipped: 119

  was:
WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
default value set by the Azure Client SDK, so Hadoop traffic doesn't appear any 
different from general Blob traffic. If we customize the User-Agent header, 
then it will enable better troubleshooting and analysis by the Azure service.

The following configuration
  <property>
    <name>fs.azure.user.agent.agent</name>
    <value>MSFT</value>
  </property>

sets the user agent to 
 User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 (JavaJRE 
1.8.0_131; WindowsServer2012R2 6.3)


Test Results :
Tests run: 703, Failures: 0, Errors: 0, Skipped: 119


> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
>Priority: Minor
> Attachments: HADOOP-14518-01.patch, HADOOP-14518-01-test.txt, 
> HADOOP-14518-02.patch, HADOOP-14518-03.patch, HADOOP-14518-04.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. If we customize the User-Agent 
> header, then it will enable better troubleshooting and analysis by the Azure 
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.prefix</name>
>     <value>MSFT</value>
>   </property>
> sets the user agent to 
>  User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 
> (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test Results :
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-21 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Patch Available  (was: Open)

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt, 
> HADOOP-14520-03.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4M, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-21 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Attachment: HADOOP-14520-03.patch

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt, 
> HADOOP-14520-03.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4M, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-21 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Attachment: (was: HADOOP-14520-02.patch)

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4M, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-21 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Open  (was: Patch Available)

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4M, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-21 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Description: 
Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
call. When the number of blocks is above 32000, the next hflush/hsync triggers the 
block compaction process. Block compaction replaces a sequence of blocks with 
one block. From all the sequences with total length less than 4M, compaction 
chooses the longest one. It is a greedy algorithm that preserves all potential 
candidates for the next round. Block Compaction for WASB increases data 
durability and allows using block blobs instead of page blobs. By default, 
block compaction is disabled. Similar to the configuration for page blobs, the 
client needs to specify HDFS folders where block compaction over block blobs is 
enabled. 
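
The per-folder opt-in mentioned above could be checked along these lines. This is a sketch under stated assumptions rather than the patch's logic: the class and method names are hypothetical, keys and folders are treated as plain slash-separated paths, and an empty folder list stands for the disabled default.

```java
import java.util.List;

/** Hypothetical sketch of the folder opt-in check; not taken from the patch. */
public class CompactionDirSketch {
    /**
     * Returns true when the key is one of the configured folders or lies below one.
     * An empty list means compaction is disabled everywhere (the default).
     */
    static boolean isInCompactionDir(String key, List<String> dirs) {
        for (String dir : dirs) {
            // Strip trailing slashes so "/hbase/WALs" and "/hbase/WALs/" behave alike.
            String base = dir.replaceAll("/+$", "");
            if (key.equals(base) || key.startsWith(base + "/")) {
                return true;
            }
        }
        return false;
    }
}
```

Matching on the folder name plus a trailing slash avoids treating a sibling such as "/hbase/WALs2" as if it were inside "/hbase/WALs".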

Results for HADOOP-14520-01.patch
tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
Tests run: 704, Failures: 0, Errors: 0, Skipped: 119



  was:
Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
call. When the number of blocks is above 32000, the next hflush/hsync triggers the 
block compaction process. Block compaction replaces a sequence of blocks with 
one block. From all the sequences with total length less than 4M, compaction 
chooses the longest one. It is a greedy algorithm that preserves all potential 
candidates for the next round. Block Compaction for WASB increases data 
durability and allows using block blobs instead of page blobs. By default, 
block compaction is disabled. Similar to the configuration for page blobs, the 
client needs to specify HDFS folders where block compaction over block blobs is 
enabled. 

Results for HADOOP-14520-01.patch
Tests run: 704, Failures: 0, Errors: 0, Skipped: 119




> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt, 
> HADOOP-14520-02.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4M, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> tested endpoint: fs.azure.account.key.hdfs4.blob.core.windows.net
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-21 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Description: 
Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
call. When the number of blocks is above 32000, the next hflush/hsync triggers the 
block compaction process. Block compaction replaces a sequence of blocks with 
one block. From all the sequences with total length less than 4M, compaction 
chooses the longest one. It is a greedy algorithm that preserves all potential 
candidates for the next round. Block Compaction for WASB increases data 
durability and allows using block blobs instead of page blobs. By default, 
block compaction is disabled. Similar to the configuration for page blobs, the 
client needs to specify HDFS folders where block compaction over block blobs is 
enabled. 

Results for HADOOP-14520-01.patch
Tests run: 704, Failures: 0, Errors: 0, Skipped: 119



  was:
Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
call. When the number of blocks is above a predefined, configurable value, the next 
hflush/hsync triggers the block compaction process. Block compaction replaces a 
sequence of blocks with one block. From all the sequences with total length 
less than 4M, compaction chooses the longest one. It is a greedy algorithm that 
preserves all potential candidates for the next round. Block Compaction for WASB 
increases data durability and allows using block blobs instead of page blobs. 
By default, block compaction is disabled. Similar to the configuration for page 
blobs, the client needs to specify HDFS folders where block compaction over 
block blobs is enabled. 

Results for HADOOP-14520-01.patch
Tests run: 704, Failures: 0, Errors: 0, Skipped: 119




> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt, 
> HADOOP-14520-02.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above 32000, the next hflush/hsync triggers 
> the block compaction process. Block compaction replaces a sequence of blocks 
> with one block. From all the sequences with total length less than 4M, 
> compaction chooses the longest one. It is a greedy algorithm that preserves 
> all potential candidates for the next round. Block Compaction for WASB 
> increases data durability and allows using block blobs instead of page blobs. 
> By default, block compaction is disabled. Similar to the configuration for 
> page blobs, the client needs to specify HDFS folders where block compaction 
> over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119
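
The selection rule in the description (from all the sequences with total length less than 4M, choose the longest one) can be sketched as a sliding window over the block list. This is only an illustration under stated assumptions, not the code in the patch: the class and method names are hypothetical, "longest" is assumed to mean the most blocks, and a sequence is assumed to be a contiguous run of blocks.

```java
import java.util.Arrays;

final class CompactionSketch {
    // "4M" from the issue description, taken here as 4 * 1024 * 1024 bytes (assumption).
    static final long MAX_COMPACTED_BYTES = 4L * 1024 * 1024;

    /**
     * Finds the longest contiguous run of blocks whose total size is strictly
     * below the limit. Returns {startIndex, blockCount}; blockCount is 0 when
     * no block fits under the limit.
     */
    static int[] longestRun(long[] blockSizes, long limit) {
        int bestStart = 0;
        int bestCount = 0;
        long windowBytes = 0;
        int left = 0;
        for (int right = 0; right < blockSizes.length; right++) {
            windowBytes += blockSizes[right];
            // Shrink the window from the left until it fits under the limit again.
            while (windowBytes >= limit && left <= right) {
                windowBytes -= blockSizes[left];
                left++;
            }
            int count = right - left + 1;
            if (count > bestCount) {
                bestCount = count;
                bestStart = left;
            }
        }
        return new int[] {bestStart, bestCount};
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long[] sizes = {3 * mb, mb, 512 * 1024, mb, mb, 4 * mb, 100};
        System.out.println(Arrays.toString(longestRun(sizes, MAX_COMPACTED_BYTES)));
    }
}
```

With the sample sizes in main this prints [1, 4]: the four blocks starting at index 1 total 3.5 MB and would be replaced by one block. A run of fewer than two blocks leaves nothing to compact.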



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-21 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Release Note: Block Compaction for WASB. When the number of blocks in a 
block blob is above 32000, compaction replaces the longest sequence of blocks 
with a total length less than 4M with just one block. Compaction allows block 
blobs to be used instead of page blobs, including for WAL files.  (was: Block 
Compaction for WASB. When the number of blocks in a block blob is above a 
predefined, configurable number, compaction replaces the longest sequence of blocks 
with a total length less than 4M with just one block. Compaction allows block 
blobs to be used instead of page blobs, including for WAL files.)
  Status: Patch Available  (was: In Progress)

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt, 
> HADOOP-14520-02.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above a predefined, configurable value, 
> the next hflush/hsync triggers the block compaction process. Block compaction 
> replaces a sequence of blocks with one block. From all the sequences with 
> total length less than 4M, compaction chooses the longest one. It is a greedy 
> algorithm that preserves all potential candidates for the next round. Block 
> Compaction for WASB increases data durability and allows using block blobs 
> instead of page blobs. By default, block compaction is disabled. Similar to 
> the configuration for page blobs, the client needs to specify HDFS folders 
> where block compaction over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119
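
The folder-based opt-in described above can be pictured as a prefix check of a file path against a configured folder list. The following is a minimal sketch, assuming the folders arrive as a comma-separated string; the class and method names are hypothetical and this is not the store's actual implementation.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class CompactionDirsSketch {
    /** Parses a comma-separated folder list and normalises entries to "/dir" form. */
    static Set<String> parseDirs(String csv) {
        Set<String> dirs = new LinkedHashSet<>();
        if (csv == null) {
            return dirs;
        }
        for (String raw : csv.split(",")) {
            String dir = raw.trim();
            if (dir.isEmpty()) {
                continue;
            }
            if (!dir.startsWith("/")) {
                dir = "/" + dir;
            }
            // Drop trailing slashes so "/logs/" and "/logs" compare equal.
            while (dir.length() > 1 && dir.endsWith("/")) {
                dir = dir.substring(0, dir.length() - 1);
            }
            dirs.add(dir);
        }
        return dirs;
    }

    /** True when the path is one of the folders or lies below one of them. */
    static boolean isUnderCompactionDir(String path, Set<String> dirs) {
        for (String dir : dirs) {
            if (dir.equals("/") || path.equals(dir) || path.startsWith(dir + "/")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> dirs = parseDirs("/hbase/WALs, /data/logs/");
        System.out.println(isUnderCompactionDir("/hbase/WALs/wal.1", dirs));
        System.out.println(isUnderCompactionDir("/hbase/WALsX/file", dirs));
        System.out.println(isUnderCompactionDir("/tmp/file", parseDirs("")));
    }
}
```

An empty list matches nothing, which corresponds to compaction being disabled by default. Comparing against dir + "/" rather than the bare prefix keeps a sibling folder such as /hbase/WALsX from matching /hbase/WALs.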






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-21 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Attachment: HADOOP-14520-02.patch

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt, 
> HADOOP-14520-02.patch
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above a predefined, configurable value, 
> the next hflush/hsync triggers the block compaction process. Block compaction 
> replaces a sequence of blocks with one block. From all the sequences with 
> total length less than 4M, compaction chooses the longest one. It is a greedy 
> algorithm that preserves all potential candidates for the next round. Block 
> Compaction for WASB increases data durability and allows using block blobs 
> instead of page blobs. By default, block compaction is disabled. Similar to 
> the configuration for page blobs, the client needs to specify HDFS folders 
> where block compaction over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-20 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: In Progress  (was: Patch Available)

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above a predefined, configurable value, 
> the next hflush/hsync triggers the block compaction process. Block compaction 
> replaces a sequence of blocks with one block. From all the sequences with 
> total length less than 4M, compaction chooses the longest one. It is a greedy 
> algorithm that preserves all potential candidates for the next round. Block 
> Compaction for WASB increases data durability and allows using block blobs 
> instead of page blobs. By default, block compaction is disabled. Similar to 
> the configuration for page blobs, the client needs to specify HDFS folders 
> where block compaction over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119






[jira] [Issue Comment Deleted] (HADOOP-14536) Update azure-storage sdk to version 5.3.0

2017-06-20 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14536:
-
Comment: was deleted

(was: It was tested against:

 
<property>
  <name>fs.contract.test.fs.wasb</name>
  <value>wasb://testh...@hdfs4.blob.core.windows.net</value>
</property>
  )

> Update azure-storage sdk to version 5.3.0
> -
>
> Key: HADOOP-14536
> URL: https://issues.apache.org/jira/browse/HADOOP-14536
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Mingliang Liu
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14536-01.patch
>
>
> Update WASB driver to use the latest version (5.3.0) of SDK for Microsoft 
> Azure Storage Clients. We are currently using version 4.2.0 of the SDK.
> Azure Storage Clients changes between 4.2 and 5.3:
>  * Fixed a bug where the transactional MD5 check would fail when downloading 
> a range of blob or file and the recovery action is performed on a subsection 
> of the range.
>  * Fixed leaking connections for table requests.
>  * Fixed a bug where retries happened immediately when experiencing a network 
> exception uploading data or getting the response.
>  * Fixed a bug where the response stream was not being closed on nonretryable 
> exceptions.
>  * Fixed Exists() calls on Shares and Directories to now populate metadata. 
> This was already being done for Files.
>  * Changed blob constants to support up to 256 MB on put blob for block 
> blobs. The default value for put blob threshold has also been updated to half 
> of the maximum, or 128 MB currently.
>  * Fixed a bug that prevented setting content MD5 to true when creating a new 
> file.
>  * Fixed a bug where access conditions, options, and operation context were 
> not being passed when calling openWriteExisting() on a page blob or a file.
>  * Fixed a bug where an exception was being thrown on a range get of a blob 
> or file when the options disableContentMD5Validation is set to false and 
> useTransactionalContentMD5 is set to true and there is no overall MD5.
>  * Fixed a bug where retries were happening immediately if a socket exception 
> was thrown.
>  * In CloudFileShareProperties, setShareQuota() no longer asserts in bounds. 
> This check has been moved to create() and uploadProperties() in 
> CloudFileShare.
>  * Prefix support for listing files and directories.
>  * Added support for setting public access when creating a blob container
>  * The public access setting on a blob container is now a container property 
> returned from downloadProperties.
>  * Add Message now modifies the PopReceipt, Id, NextVisibleTime, 
> InsertionTime, and ExpirationTime properties of its CloudQueueMessage 
> parameter.
>  * Populate content MD5 for range gets on Blobs and Files.
>  * Added support in Page Blob for incremental copy.
>  * Added large BlockBlob upload support. Blocks can now support sizes up to 
> 100 MB.
>  * Added a new, memory-optimized upload strategy for the upload* APIs. This 
> algorithm only applies for blocks greater than 4MB and when 
> storeBlobContentMD5 and Client-Side Encryption are disabled.
>  * getQualifiedUri() has been deprecated for Blobs. Please use 
> getSnapshotQualifiedUri() instead. This new function will return the blob 
> including the snapshot (if present) and no SAS token.
>  * getQualifiedStorageUri() has been deprecated for Blobs. Please use 
> getSnapshotQualifiedStorageUri() instead. This new function will return the 
> blob including the snapshot (if present) and no SAS token.
>  * Fixed a bug where copying from a blob that included a SAS token and a 
> snapshot omitted the SAS token.
>  * Fixed a bug in client-side encryption for tables that was preventing the 
> Java client from decrypting entities encrypted with the .NET client, and vice 
> versa.
>  * Added support for server-side encryption.
>  * Added support for getBlobReferenceFromServer methods on CloudBlobContainer 
> to support retrieving a blob without knowing its type.
>  * Fixed a bug in the retry policies where 300 status codes were being 
> retried when they shouldn't be.
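
Several items in the list above concern retry behaviour: 300 status codes should not be retried, and retries should not fire immediately. A minimal sketch of such a decision follows. It is an illustration only and not the SDK's code: the class, method names, and backoff constants are hypothetical, and the exact set of retryable codes is an assumption.

```java
final class RetryDecisionSketch {
    /**
     * Decides whether a response status is worth retrying.
     * Assumption: 3xx and 4xx are final except 408 (request timeout);
     * 501 and 505 are final; other 5xx codes are retried.
     */
    static boolean shouldRetry(int statusCode) {
        if (statusCode >= 300 && statusCode < 500 && statusCode != 408) {
            return false;
        }
        if (statusCode == 501 || statusCode == 505) {
            return false;
        }
        return statusCode == 408 || statusCode >= 500;
    }

    /**
     * Exponential backoff so a retry never fires immediately:
     * base * 2^attempt, capped at maxMillis.
     */
    static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        long factor = 1L << Math.min(Math.max(attempt, 0), 20);
        return Math.min(maxMillis, baseMillis * factor);
    }

    public static void main(String[] args) {
        int[] codes = {200, 304, 404, 408, 500, 501, 503};
        for (int code : codes) {
            System.out.println(code + " -> retry=" + shouldRetry(code));
        }
        System.out.println("delay before retry 3: " + backoffMillis(3, 100, 30_000) + " ms");
    }
}
```

Keeping the classification separate from the delay calculation makes each rule easy to check on its own.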






[jira] [Commented] (HADOOP-14536) Update azure-storage sdk to version 5.3.0

2017-06-20 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16056368#comment-16056368
 ] 

Georgi Chalakov commented on HADOOP-14536:
--

The patch changes the version of the Azure Client SDK. The new SDK offers fixes and 
new functionality, but it should not change current behavior, so it doesn't 
require modification of existing tests or new tests. All tests passed when the 
change was tested against wasb://testh...@hdfs4.blob.core.windows.net 

> Update azure-storage sdk to version 5.3.0
> -
>
> Key: HADOOP-14536
> URL: https://issues.apache.org/jira/browse/HADOOP-14536
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Mingliang Liu
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14536-01.patch
>
>
> Update WASB driver to use the latest version (5.3.0) of SDK for Microsoft 
> Azure Storage Clients. We are currently using version 4.2.0 of the SDK.
> Azure Storage Clients changes between 4.2 and 5.3:
>  * Fixed a bug where the transactional MD5 check would fail when downloading 
> a range of blob or file and the recovery action is performed on a subsection 
> of the range.
>  * Fixed leaking connections for table requests.
>  * Fixed a bug where retries happened immediately when experiencing a network 
> exception uploading data or getting the response.
>  * Fixed a bug where the response stream was not being closed on nonretryable 
> exceptions.
>  * Fixed Exists() calls on Shares and Directories to now populate metadata. 
> This was already being done for Files.
>  * Changed blob constants to support up to 256 MB on put blob for block 
> blobs. The default value for put blob threshold has also been updated to half 
> of the maximum, or 128 MB currently.
>  * Fixed a bug that prevented setting content MD5 to true when creating a new 
> file.
>  * Fixed a bug where access conditions, options, and operation context were 
> not being passed when calling openWriteExisting() on a page blob or a file.
>  * Fixed a bug where an exception was being thrown on a range get of a blob 
> or file when the options disableContentMD5Validation is set to false and 
> useTransactionalContentMD5 is set to true and there is no overall MD5.
>  * Fixed a bug where retries were happening immediately if a socket exception 
> was thrown.
>  * In CloudFileShareProperties, setShareQuota() no longer asserts in bounds. 
> This check has been moved to create() and uploadProperties() in 
> CloudFileShare.
>  * Prefix support for listing files and directories.
>  * Added support for setting public access when creating a blob container
>  * The public access setting on a blob container is now a container property 
> returned from downloadProperties.
>  * Add Message now modifies the PopReceipt, Id, NextVisibleTime, 
> InsertionTime, and ExpirationTime properties of its CloudQueueMessage 
> parameter.
>  * Populate content MD5 for range gets on Blobs and Files.
>  * Added support in Page Blob for incremental copy.
>  * Added large BlockBlob upload support. Blocks can now support sizes up to 
> 100 MB.
>  * Added a new, memory-optimized upload strategy for the upload* APIs. This 
> algorithm only applies for blocks greater than 4MB and when 
> storeBlobContentMD5 and Client-Side Encryption are disabled.
>  * getQualifiedUri() has been deprecated for Blobs. Please use 
> getSnapshotQualifiedUri() instead. This new function will return the blob 
> including the snapshot (if present) and no SAS token.
>  * getQualifiedStorageUri() has been deprecated for Blobs. Please use 
> getSnapshotQualifiedStorageUri() instead. This new function will return the 
> blob including the snapshot (if present) and no SAS token.
>  * Fixed a bug where copying from a blob that included a SAS token and a 
> snapshot omitted the SAS token.
>  * Fixed a bug in client-side encryption for tables that was preventing the 
> Java client from decrypting entities encrypted with the .NET client, and vice 
> versa.
>  * Added support for server-side encryption.
>  * Added support for getBlobReferenceFromServer methods on CloudBlobContainer 
> to support retrieving a blob without knowing its type.
>  * Fixed a bug in the retry policies where 300 status codes were being 
> retried when they shouldn't be.
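
The size limits quoted in the list above (put blob up to 256 MB with a default threshold of half that, blocks up to 100 MB) imply a simple choice between a single put and a block upload. The arithmetic can be sketched as below. The class and method names are hypothetical, and the assumption that a blob exactly at the threshold still goes out as a single put is mine, not taken from the SDK.

```java
final class UploadPlanSketch {
    static final long MB = 1024L * 1024L;
    static final long MAX_SINGLE_PUT_BYTES = 256 * MB;                          // from the change list
    static final long DEFAULT_SINGLE_PUT_THRESHOLD = MAX_SINGLE_PUT_BYTES / 2;  // 128 MB
    static final long MAX_BLOCK_BYTES = 100 * MB;                               // from the change list

    /**
     * Returns 0 when the blob fits in a single put, otherwise the number of
     * blocks needed (ceiling division of the blob size by the block size).
     */
    static long blocksNeeded(long blobBytes, long thresholdBytes, long blockBytes) {
        if (thresholdBytes > MAX_SINGLE_PUT_BYTES) {
            throw new IllegalArgumentException("threshold exceeds the single put maximum");
        }
        if (blockBytes <= 0 || blockBytes > MAX_BLOCK_BYTES) {
            throw new IllegalArgumentException("block size out of range");
        }
        if (blobBytes <= thresholdBytes) {
            return 0;
        }
        return (blobBytes + blockBytes - 1) / blockBytes;
    }

    public static void main(String[] args) {
        System.out.println(blocksNeeded(128 * MB, DEFAULT_SINGLE_PUT_THRESHOLD, MAX_BLOCK_BYTES));
        System.out.println(blocksNeeded(129 * MB, DEFAULT_SINGLE_PUT_THRESHOLD, MAX_BLOCK_BYTES));
        System.out.println(blocksNeeded(1001 * MB, DEFAULT_SINGLE_PUT_THRESHOLD, MAX_BLOCK_BYTES));
    }
}
```

Under these assumptions a 128 MB blob is a single put, a 129 MB blob needs two blocks, and a 1001 MB blob needs eleven.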






[jira] [Commented] (HADOOP-14536) Update azure-storage sdk to version 5.3.0

2017-06-20 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16056362#comment-16056362
 ] 

Georgi Chalakov commented on HADOOP-14536:
--

It was tested against:

 
<property>
  <name>fs.contract.test.fs.wasb</name>
  <value>wasb://testh...@hdfs4.blob.core.windows.net</value>
</property>
  

> Update azure-storage sdk to version 5.3.0
> -
>
> Key: HADOOP-14536
> URL: https://issues.apache.org/jira/browse/HADOOP-14536
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Mingliang Liu
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14536-01.patch
>
>
> Update WASB driver to use the latest version (5.3.0) of SDK for Microsoft 
> Azure Storage Clients. We are currently using version 4.2.0 of the SDK.
> Azure Storage Clients changes between 4.2 and 5.3:
>  * Fixed a bug where the transactional MD5 check would fail when downloading 
> a range of blob or file and the recovery action is performed on a subsection 
> of the range.
>  * Fixed leaking connections for table requests.
>  * Fixed a bug where retries happened immediately when experiencing a network 
> exception uploading data or getting the response.
>  * Fixed a bug where the response stream was not being closed on nonretryable 
> exceptions.
>  * Fixed Exists() calls on Shares and Directories to now populate metadata. 
> This was already being done for Files.
>  * Changed blob constants to support up to 256 MB on put blob for block 
> blobs. The default value for put blob threshold has also been updated to half 
> of the maximum, or 128 MB currently.
>  * Fixed a bug that prevented setting content MD5 to true when creating a new 
> file.
>  * Fixed a bug where access conditions, options, and operation context were 
> not being passed when calling openWriteExisting() on a page blob or a file.
>  * Fixed a bug where an exception was being thrown on a range get of a blob 
> or file when the options disableContentMD5Validation is set to false and 
> useTransactionalContentMD5 is set to true and there is no overall MD5.
>  * Fixed a bug where retries were happening immediately if a socket exception 
> was thrown.
>  * In CloudFileShareProperties, setShareQuota() no longer asserts in bounds. 
> This check has been moved to create() and uploadProperties() in 
> CloudFileShare.
>  * Prefix support for listing files and directories.
>  * Added support for setting public access when creating a blob container
>  * The public access setting on a blob container is now a container property 
> returned from downloadProperties.
>  * Add Message now modifies the PopReceipt, Id, NextVisibleTime, 
> InsertionTime, and ExpirationTime properties of its CloudQueueMessage 
> parameter.
>  * Populate content MD5 for range gets on Blobs and Files.
>  * Added support in Page Blob for incremental copy.
>  * Added large BlockBlob upload support. Blocks can now support sizes up to 
> 100 MB.
>  * Added a new, memory-optimized upload strategy for the upload* APIs. This 
> algorithm only applies for blocks greater than 4MB and when 
> storeBlobContentMD5 and Client-Side Encryption are disabled.
>  * getQualifiedUri() has been deprecated for Blobs. Please use 
> getSnapshotQualifiedUri() instead. This new function will return the blob 
> including the snapshot (if present) and no SAS token.
>  * getQualifiedStorageUri() has been deprecated for Blobs. Please use 
> getSnapshotQualifiedStorageUri() instead. This new function will return the 
> blob including the snapshot (if present) and no SAS token.
>  * Fixed a bug where copying from a blob that included a SAS token and a 
> snapshot omitted the SAS token.
>  * Fixed a bug in client-side encryption for tables that was preventing the 
> Java client from decrypting entities encrypted with the .NET client, and vice 
> versa.
>  * Added support for server-side encryption.
>  * Added support for getBlobReferenceFromServer methods on CloudBlobContainer 
> to support retrieving a blob without knowing its type.
>  * Fixed a bug in the retry policies where 300 status codes were being 
> retried when they shouldn't be.






[jira] [Updated] (HADOOP-14536) Update azure-storage sdk to version 5.3.0

2017-06-19 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14536:
-
Release Note: The WASB FileSystem now uses version 5.3.0 of the Azure 
Storage SDK.   (was: The WASB FileSystem now uses version 5.2.0 of the Azure 
Storage SDK. )
  Status: Patch Available  (was: In Progress)

> Update azure-storage sdk to version 5.3.0
> -
>
> Key: HADOOP-14536
> URL: https://issues.apache.org/jira/browse/HADOOP-14536
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Mingliang Liu
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14536-01.patch
>
>
> Update WASB driver to use the latest version (5.3.0) of SDK for Microsoft 
> Azure Storage Clients. We are currently using version 4.2.0 of the SDK.
> Azure Storage Clients changes between 4.2 and 5.3:
>  * Fixed a bug where the transactional MD5 check would fail when downloading 
> a range of blob or file and the recovery action is performed on a subsection 
> of the range.
>  * Fixed leaking connections for table requests.
>  * Fixed a bug where retries happened immediately when experiencing a network 
> exception uploading data or getting the response.
>  * Fixed a bug where the response stream was not being closed on nonretryable 
> exceptions.
>  * Fixed Exists() calls on Shares and Directories to now populate metadata. 
> This was already being done for Files.
>  * Changed blob constants to support up to 256 MB on put blob for block 
> blobs. The default value for put blob threshold has also been updated to half 
> of the maximum, or 128 MB currently.
>  * Fixed a bug that prevented setting content MD5 to true when creating a new 
> file.
>  * Fixed a bug where access conditions, options, and operation context were 
> not being passed when calling openWriteExisting() on a page blob or a file.
>  * Fixed a bug where an exception was being thrown on a range get of a blob 
> or file when the options disableContentMD5Validation is set to false and 
> useTransactionalContentMD5 is set to true and there is no overall MD5.
>  * Fixed a bug where retries were happening immediately if a socket exception 
> was thrown.
>  * In CloudFileShareProperties, setShareQuota() no longer asserts in bounds. 
> This check has been moved to create() and uploadProperties() in 
> CloudFileShare.
>  * Prefix support for listing files and directories.
>  * Added support for setting public access when creating a blob container
>  * The public access setting on a blob container is now a container property 
> returned from downloadProperties.
>  * Add Message now modifies the PopReceipt, Id, NextVisibleTime, 
> InsertionTime, and ExpirationTime properties of its CloudQueueMessage 
> parameter.
>  * Populate content MD5 for range gets on Blobs and Files.
>  * Added support in Page Blob for incremental copy.
>  * Added large BlockBlob upload support. Blocks can now support sizes up to 
> 100 MB.
>  * Added a new, memory-optimized upload strategy for the upload* APIs. This 
> algorithm only applies for blocks greater than 4MB and when 
> storeBlobContentMD5 and Client-Side Encryption are disabled.
>  * getQualifiedUri() has been deprecated for Blobs. Please use 
> getSnapshotQualifiedUri() instead. This new function will return the blob 
> including the snapshot (if present) and no SAS token.
>  * getQualifiedStorageUri() has been deprecated for Blobs. Please use 
> getSnapshotQualifiedStorageUri() instead. This new function will return the 
> blob including the snapshot (if present) and no SAS token.
>  * Fixed a bug where copying from a blob that included a SAS token and a 
> snapshot omitted the SAS token.
>  * Fixed a bug in client-side encryption for tables that was preventing the 
> Java client from decrypting entities encrypted with the .NET client, and vice 
> versa.
>  * Added support for server-side encryption.
>  * Added support for getBlobReferenceFromServer methods on CloudBlobContainer 
> to support retrieving a blob without knowing its type.
>  * Fixed a bug in the retry policies where 300 status codes were being 
> retried when they shouldn't be.






[jira] [Commented] (HADOOP-14536) Update azure-storage sdk to version 5.3.0

2017-06-19 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16054832#comment-16054832
 ] 

Georgi Chalakov commented on HADOOP-14536:
--

---
 T E S T S
---
.
.
.
Results :

Tests run: 704, Failures: 0, Errors: 0, Skipped: 119


> Update azure-storage sdk to version 5.3.0
> -
>
> Key: HADOOP-14536
> URL: https://issues.apache.org/jira/browse/HADOOP-14536
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Mingliang Liu
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14536-01.patch
>
>
> Update WASB driver to use the latest version (5.3.0) of SDK for Microsoft 
> Azure Storage Clients. We are currently using version 4.2.0 of the SDK.
> Azure Storage Clients changes between 4.2 and 5.3:
>  * Fixed a bug where the transactional MD5 check would fail when downloading 
> a range of blob or file and the recovery action is performed on a subsection 
> of the range.
>  * Fixed leaking connections for table requests.
>  * Fixed a bug where retries happened immediately when experiencing a network 
> exception uploading data or getting the response.
>  * Fixed a bug where the response stream was not being closed on nonretryable 
> exceptions.
>  * Fixed Exists() calls on Shares and Directories to now populate metadata. 
> This was already being done for Files.
>  * Changed blob constants to support up to 256 MB on put blob for block 
> blobs. The default value for put blob threshold has also been updated to half 
> of the maximum, or 128 MB currently.
>  * Fixed a bug that prevented setting content MD5 to true when creating a new 
> file.
>  * Fixed a bug where access conditions, options, and operation context were 
> not being passed when calling openWriteExisting() on a page blob or a file.
>  * Fixed a bug where an exception was being thrown on a range get of a blob 
> or file when the options disableContentMD5Validation is set to false and 
> useTransactionalContentMD5 is set to true and there is no overall MD5.
>  * Fixed a bug where retries were happening immediately if a socket exception 
> was thrown.
>  * In CloudFileShareProperties, setShareQuota() no longer asserts in bounds. 
> This check has been moved to create() and uploadProperties() in 
> CloudFileShare.
>  * Prefix support for listing files and directories.
>  * Added support for setting public access when creating a blob container
>  * The public access setting on a blob container is now a container property 
> returned from downloadProperties.
>  * Add Message now modifies the PopReceipt, Id, NextVisibleTime, 
> InsertionTime, and ExpirationTime properties of its CloudQueueMessage 
> parameter.
>  * Populate content MD5 for range gets on Blobs and Files.
>  * Added support in Page Blob for incremental copy.
>  * Added large BlockBlob upload support. Blocks can now support sizes up to 
> 100 MB.
>  * Added a new, memory-optimized upload strategy for the upload* APIs. This 
> algorithm only applies for blocks greater than 4MB and when 
> storeBlobContentMD5 and Client-Side Encryption are disabled.
>  * getQualifiedUri() has been deprecated for Blobs. Please use 
> getSnapshotQualifiedUri() instead. This new function will return the blob 
> including the snapshot (if present) and no SAS token.
>  * getQualifiedStorageUri() has been deprecated for Blobs. Please use 
> getSnapshotQualifiedStorageUri() instead. This new function will return the 
> blob including the snapshot (if present) and no SAS token.
>  * Fixed a bug where copying from a blob that included a SAS token and a 
> snapshot omitted the SAS token.
>  * Fixed a bug in client-side encryption for tables that was preventing the 
> Java client from decrypting entities encrypted with the .NET client, and vice 
> versa.
>  * Added support for server-side encryption.
>  * Added support for getBlobReferenceFromServer methods on CloudBlobContainer 
> to support retrieving a blob without knowing its type.
>  * Fixed a bug in the retry policies where 300 status codes were being 
> retried when they shouldn't be.






[jira] [Updated] (HADOOP-14536) Update azure-storage sdk to version 5.3.0

2017-06-19 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14536:
-
Attachment: HADOOP-14536-01.patch

> Update azure-storage sdk to version 5.3.0
> -
>
> Key: HADOOP-14536
> URL: https://issues.apache.org/jira/browse/HADOOP-14536
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Mingliang Liu
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14536-01.patch
>
>
> Update WASB driver to use the latest version (5.3.0) of SDK for Microsoft 
> Azure Storage Clients. We are currently using version 4.2.0 of the SDK.
> Azure Storage Clients changes between 4.2 and 5.3:
>  * Fixed a bug where the transactional MD5 check would fail when downloading 
> a range of blob or file and the recovery action is performed on a subsection 
> of the range.
>  * Fixed leaking connections for table requests.
>  * Fixed a bug where retries happened immediately when experiencing a network 
> exception uploading data or getting the response.
>  * Fixed a bug where the response stream was not being closed on nonretryable 
> exceptions.
>  * Fixed Exists() calls on Shares and Directories to now populate metadata. 
> This was already being done for Files.
>  * Changed blob constants to support up to 256 MB on put blob for block 
> blobs. The default value for put blob threshold has also been updated to half 
> of the maximum, or 128 MB currently.
>  * Fixed a bug that prevented setting content MD5 to true when creating a new 
> file.
>  * Fixed a bug where access conditions, options, and operation context were 
> not being passed when calling openWriteExisting() on a page blob or a file.
>  * Fixed a bug where an exception was being thrown on a range get of a blob 
> or file when the options disableContentMD5Validation is set to false and 
> useTransactionalContentMD5 is set to true and there is no overall MD5.
>  * Fixed a bug where retries were happening immediately if a socket exception 
> was thrown.
>  * In CloudFileShareProperties, setShareQuota() no longer asserts in bounds. 
> This check has been moved to create() and uploadProperties() in 
> CloudFileShare.
>  * Prefix support for listing files and directories.
>  * Added support for setting public access when creating a blob container
>  * The public access setting on a blob container is now a container property 
> returned from downloadProperties.
>  * Add Message now modifies the PopReceipt, Id, NextVisibleTime, 
> InsertionTime, and ExpirationTime properties of its CloudQueueMessage 
> parameter.
>  * Populate content MD5 for range gets on Blobs and Files.
>  * Added support in Page Blob for incremental copy.
>  * Added large BlockBlob upload support. Blocks can now support sizes up to 
> 100 MB.
>  * Added a new, memory-optimized upload strategy for the upload* APIs. This 
> algorithm only applies for blocks greater than 4MB and when 
> storeBlobContentMD5 and Client-Side Encryption are disabled.
>  * getQualifiedUri() has been deprecated for Blobs. Please use 
> getSnapshotQualifiedUri() instead. This new function will return the blob 
> including the snapshot (if present) and no SAS token.
>  * getQualifiedStorageUri() has been deprecated for Blobs. Please use 
> getSnapshotQualifiedStorageUri() instead. This new function will return the 
> blob including the snapshot (if present) and no SAS token.
>  * Fixed a bug where copying from a blob that included a SAS token and a 
> snapshot omitted the SAS token.
>  * Fixed a bug in client-side encryption for tables that was preventing the 
> Java client from decrypting entities encrypted with the .NET client, and vice 
> versa.
>  * Added support for server-side encryption.
>  * Added support for getBlobReferenceFromServer methods on CloudBlobContainer 
> to support retrieving a blob without knowing its type.
>  * Fixed a bug in the retry policies where 300 status codes were being 
> retried when they shouldn't be.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work started] (HADOOP-14536) Update azure-storage sdk to version 5.3.0

2017-06-19 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-14536 started by Georgi Chalakov.

> Update azure-storage sdk to version 5.3.0
> -
>
> Key: HADOOP-14536
> URL: https://issues.apache.org/jira/browse/HADOOP-14536
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Mingliang Liu
>Assignee: Georgi Chalakov
>
> Update WASB driver to use the latest version (5.3.0) of SDK for Microsoft 
> Azure Storage Clients. We are currently using version 4.2.0 of the SDK.
> Azure Storage Clients changes between 4.2 and 5.3:
>  * Fixed a bug where the transactional MD5 check would fail when downloading 
> a range of blob or file and the recovery action is performed on a subsection 
> of the range.
>  * Fixed leaking connections for table requests.
>  * Fixed a bug where retries happened immediately when experiencing a network 
> exception uploading data or getting the response.
>  * Fixed a bug where the response stream was not being closed on nonretryable 
> exceptions.
>  * Fixed Exists() calls on Shares and Directories to now populate metadata. 
> This was already being done for Files.
>  * Changed blob constants to support up to 256 MB on put blob for block 
> blobs. The default value for put blob threshold has also been updated to half 
> of the maximum, or 128 MB currently.
>  * Fixed a bug that prevented setting content MD5 to true when creating a new 
> file.
>  * Fixed a bug where access conditions, options, and operation context were 
> not being passed when calling openWriteExisting() on a page blob or a file.
>  * Fixed a bug where an exception was being thrown on a range get of a blob 
> or file when the options disableContentMD5Validation is set to false and 
> useTransactionalContentMD5 is set to true and there is no overall MD5.
>  * Fixed a bug where retries were happening immediately if a socket exception 
> was thrown.
>  * In CloudFileShareProperties, setShareQuota() no longer asserts in bounds. 
> This check has been moved to create() and uploadProperties() in 
> CloudFileShare.
>  * Prefix support for listing files and directories.
>  * Added support for setting public access when creating a blob container
>  * The public access setting on a blob container is now a container property 
> returned from downloadProperties.
>  * Add Message now modifies the PopReceipt, Id, NextVisibleTime, 
> InsertionTime, and ExpirationTime properties of its CloudQueueMessage 
> parameter.
>  * Populate content MD5 for range gets on Blobs and Files.
>  * Added support in Page Blob for incremental copy.
>  * Added large BlockBlob upload support. Blocks can now support sizes up to 
> 100 MB.
>  * Added a new, memory-optimized upload strategy for the upload* APIs. This 
> algorithm only applies for blocks greater than 4MB and when 
> storeBlobContentMD5 and Client-Side Encryption are disabled.
>  * getQualifiedUri() has been deprecated for Blobs. Please use 
> getSnapshotQualifiedUri() instead. This new function will return the blob 
> including the snapshot (if present) and no SAS token.
>  * getQualifiedStorageUri() has been deprecated for Blobs. Please use 
> getSnapshotQualifiedStorageUri() instead. This new function will return the 
> blob including the snapshot (if present) and no SAS token.
>  * Fixed a bug where copying from a blob that included a SAS token and a 
> snapshot omitted the SAS token.
>  * Fixed a bug in client-side encryption for tables that was preventing the 
> Java client from decrypting entities encrypted with the .NET client, and vice 
> versa.
>  * Added support for server-side encryption.
>  * Added support for getBlobReferenceFromServer methods on CloudBlobContainer 
> to support retrieving a blob without knowing its type.
>  * Fixed a bug in the retry policies where 300 status codes were being 
> retried when they shouldn't be.






[jira] [Updated] (HADOOP-14536) Update azure-storage sdk to version 5.3.0

2017-06-19 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14536:
-
Summary: Update azure-storage sdk to version 5.3.0  (was: Update 
azure-storage sdk to version 5.2.0)

> Update azure-storage sdk to version 5.3.0
> -
>
> Key: HADOOP-14536
> URL: https://issues.apache.org/jira/browse/HADOOP-14536
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Mingliang Liu
>Assignee: Georgi Chalakov
>
> Update WASB driver to use the latest version (5.2.0) of SDK for Microsoft 
> Azure Storage Clients. We are currently using version 4.2.0 of the SDK.
> Azure Storage Clients changes between 4.2 and 5.2:
>  * Fixed Exists() calls on Shares and Directories to now populate metadata. 
> This was already being done for Files.
>  * Changed blob constants to support up to 256 MB on put blob for block 
> blobs. The default value for put blob threshold has also been updated to half 
> of the maximum, or 128 MB currently.
>  * Fixed a bug that prevented setting content MD5 to true when creating a new 
> file.
>  * Fixed a bug where access conditions, options, and operation context were 
> not being passed when calling openWriteExisting() on a page blob or a file.
>  * Fixed a bug where an exception was being thrown on a range get of a blob 
> or file when the options disableContentMD5Validation is set to false and 
> useTransactionalContentMD5 is set to true and there is no overall MD5.
>  * Fixed a bug where retries were happening immediately if a socket exception 
> was thrown.
>  * In CloudFileShareProperties, setShareQuota() no longer asserts in bounds. 
> This check has been moved to create() and uploadProperties() in 
> CloudFileShare.
>  * Prefix support for listing files and directories.
>  * Added support for setting public access when creating a blob container
>  * The public access setting on a blob container is now a container property 
> returned from downloadProperties.
>  * Add Message now modifies the PopReceipt, Id, NextVisibleTime, 
> InsertionTime, and ExpirationTime properties of its CloudQueueMessage 
> parameter.
>  * Populate content MD5 for range gets on Blobs and Files.
>  * Added support in Page Blob for incremental copy.
>  * Added large BlockBlob upload support. Blocks can now support sizes up to 
> 100 MB.
>  * Added a new, memory-optimized upload strategy for the upload* APIs. This 
> algorithm only applies for blocks greater than 4MB and when 
> storeBlobContentMD5 and Client-Side Encryption are disabled.
>  * getQualifiedUri() has been deprecated for Blobs. Please use 
> getSnapshotQualifiedUri() instead. This new function will return the blob 
> including the snapshot (if present) and no SAS token.
>  * getQualifiedStorageUri() has been deprecated for Blobs. Please use 
> getSnapshotQualifiedStorageUri() instead. This new function will return the 
> blob including the snapshot (if present) and no SAS token.
>  * Fixed a bug where copying from a blob that included a SAS token and a 
> snapshot omitted the SAS token.
>  * Fixed a bug in client-side encryption for tables that was preventing the 
> Java client from decrypting entities encrypted with the .NET client, and vice 
> versa.
>  * Added support for server-side encryption.
>  * Added support for getBlobReferenceFromServer methods on CloudBlobContainer 
> to support retrieving a blob without knowing its type.
>  * Fixed a bug in the retry policies where 300 status codes were being 
> retried when they shouldn't be.






[jira] [Updated] (HADOOP-14536) Update azure-storage sdk to version 5.3.0

2017-06-19 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14536:
-
Description: 
Update WASB driver to use the latest version (5.3.0) of SDK for Microsoft Azure 
Storage Clients. We are currently using version 4.2.0 of the SDK.

Azure Storage Clients changes between 4.2 and 5.3:

 * Fixed a bug where the transactional MD5 check would fail when downloading a 
range of blob or file and the recovery action is performed on a subsection of 
the range.
 * Fixed leaking connections for table requests.
 * Fixed a bug where retries happened immediately when experiencing a network 
exception uploading data or getting the response.
 * Fixed a bug where the response stream was not being closed on nonretryable 
exceptions.
 * Fixed Exists() calls on Shares and Directories to now populate metadata. 
This was already being done for Files.
 * Changed blob constants to support up to 256 MB on put blob for block blobs. 
The default value for put blob threshold has also been updated to half of the 
maximum, or 128 MB currently.
 * Fixed a bug that prevented setting content MD5 to true when creating a new 
file.
 * Fixed a bug where access conditions, options, and operation context were not 
being passed when calling openWriteExisting() on a page blob or a file.
 * Fixed a bug where an exception was being thrown on a range get of a blob or 
file when the options disableContentMD5Validation is set to false and 
useTransactionalContentMD5 is set to true and there is no overall MD5.
 * Fixed a bug where retries were happening immediately if a socket exception 
was thrown.
 * In CloudFileShareProperties, setShareQuota() no longer asserts in bounds. 
This check has been moved to create() and uploadProperties() in CloudFileShare.
 * Prefix support for listing files and directories.
 * Added support for setting public access when creating a blob container
 * The public access setting on a blob container is now a container property 
returned from downloadProperties.
 * Add Message now modifies the PopReceipt, Id, NextVisibleTime, InsertionTime, 
and ExpirationTime properties of its CloudQueueMessage parameter.
 * Populate content MD5 for range gets on Blobs and Files.
 * Added support in Page Blob for incremental copy.
 * Added large BlockBlob upload support. Blocks can now support sizes up to 100 
MB.
 * Added a new, memory-optimized upload strategy for the upload* APIs. This 
algorithm only applies for blocks greater than 4MB and when storeBlobContentMD5 
and Client-Side Encryption are disabled.
 * getQualifiedUri() has been deprecated for Blobs. Please use 
getSnapshotQualifiedUri() instead. This new function will return the blob 
including the snapshot (if present) and no SAS token.
 * getQualifiedStorageUri() has been deprecated for Blobs. Please use 
getSnapshotQualifiedStorageUri() instead. This new function will return the 
blob including the snapshot (if present) and no SAS token.
 * Fixed a bug where copying from a blob that included a SAS token and a 
snapshot omitted the SAS token.
 * Fixed a bug in client-side encryption for tables that was preventing the 
Java client from decrypting entities encrypted with the .NET client, and vice 
versa.
 * Added support for server-side encryption.
 * Added support for getBlobReferenceFromServer methods on CloudBlobContainer 
to support retrieving a blob without knowing its type.
 * Fixed a bug in the retry policies where 300 status codes were being retried 
when they shouldn't be.


  was:
Update WASB driver to use the latest version (5.2.0) of SDK for Microsoft Azure 
Storage Clients. We are currently using version 4.2.0 of the SDK.

Azure Storage Clients changes between 4.2 and 5.2:

 * Fixed Exists() calls on Shares and Directories to now populate metadata. 
This was already being done for Files.
 * Changed blob constants to support up to 256 MB on put blob for block blobs. 
The default value for put blob threshold has also been updated to half of the 
maximum, or 128 MB currently.
 * Fixed a bug that prevented setting content MD5 to true when creating a new 
file.
 * Fixed a bug where access conditions, options, and operation context were not 
being passed when calling openWriteExisting() on a page blob or a file.
 * Fixed a bug where an exception was being thrown on a range get of a blob or 
file when the options disableContentMD5Validation is set to false and 
useTransactionalContentMD5 is set to true and there is no overall MD5.
 * Fixed a bug where retries were happening immediately if a socket exception 
was thrown.
 * In CloudFileShareProperties, setShareQuota() no longer asserts in bounds. 
This check has been moved to create() and uploadProperties() in CloudFileShare.
 * Prefix support for listing files and directories.
 * Added support for setting public access when creating a blob container
 * The public access setting on a 

[jira] [Updated] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-06-17 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14518:
-
Target Version/s: 3.0.0-alpha3, 2.8.1  (was: 2.8.1, 3.0.0-alpha3)
  Status: Patch Available  (was: In Progress)

> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
>Priority: Minor
> Attachments: HADOOP-14518-01.patch, HADOOP-14518-01-test.txt, 
> HADOOP-14518-02.patch, HADOOP-14518-03.patch, HADOOP-14518-04.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. If we customize the User-Agent 
> header, then it will enable better troubleshooting and analysis by Azure 
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.id</name>
>     <value>MSFT</value>
>   </property>
> sets the user agent to 
>  User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 
> (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test Results :
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119
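
How the configured identifier ends up in the header can be illustrated with a small sketch. The layout below is inferred only from the example value in the description above; the class name, the helper and its parameters, and the "unknown" fallback are assumptions made for illustration and are not taken from the patch.

```java
// Hypothetical helper, not the WASB implementation: it only reproduces the
// header layout shown in the issue description.
class UserAgentSketch {

    /**
     * Builds "WASB/<hadoopVersion> (<userAgentId>) <sdkDefault>".
     *
     * @param hadoopVersion version string of the Hadoop build
     * @param userAgentId   value of fs.azure.user.agent.id, may be null or empty
     * @param sdkDefault    the User-Agent value the Azure Storage SDK sends by default
     */
    static String wasbUserAgent(String hadoopVersion, String userAgentId, String sdkDefault) {
        // Assumed fallback when the property is not set.
        String id = (userAgentId == null || userAgentId.isEmpty()) ? "unknown" : userAgentId;
        return String.format("WASB/%s (%s) %s", hadoopVersion, id, sdkDefault);
    }
}
```

Called with the version, identifier, and SDK default from the description, the helper yields the same single-line header value that the description shows wrapped over two lines.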






[jira] [Commented] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-06-16 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16052608#comment-16052608
 ] 

Georgi Chalakov commented on HADOOP-14518:
--

It is not clear to us what the correct sequence of steps is to get a patch 
reviewed, tested, and approved.

Can you share a link to a document, or just a paragraph, that describes the 
lifetime of a Jira issue?

Thanks
Georgi



> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
>Priority: Minor
> Attachments: HADOOP-14518-01.patch, HADOOP-14518-01-test.txt, 
> HADOOP-14518-02.patch, HADOOP-14518-03.patch, HADOOP-14518-04.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. If we customize the User-Agent 
> header, then it will enable better troubleshooting and analysis by Azure 
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.id</name>
>     <value>MSFT</value>
>   </property>
> sets the user agent to 
>  User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 
> (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test Results :
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-16 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: In Progress  (was: Patch Available)

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above a predefined, configurable value, 
> the next hflush/hsync triggers the block compaction process. Block compaction 
> replaces a sequence of blocks with one block. From all the sequences with 
> total length less than 4 MB, compaction chooses the longest one. It is a greedy 
> algorithm that preserves all potential candidates for the next round. Block 
> Compaction for WASB increases data durability and allows using block blobs 
> instead of page blobs. By default, block compaction is disabled. Similar to 
> the configuration for page blobs, the client needs to specify HDFS folders 
> where block compaction over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119
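
The selection step described above can be sketched in Java. This is a minimal illustration, not the code from HADOOP-14520-01.patch: the class and method names are hypothetical, blocks are modeled only by their sizes in bytes, and "longest" is assumed to mean the contiguous run with the most blocks (the description does not say whether it means block count or bytes). Requiring at least two blocks in a run is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of one compaction round over a list of block sizes.
class BlockCompactionSketch {

    // Upper bound on the total size of a run that may be merged (4 MB, exclusive).
    static final long MAX_COMPACTED_BYTES = 4L * 1024 * 1024;

    /**
     * Finds the longest contiguous run of blocks whose total size is below limit.
     * Returns {start, endExclusive}, or null when no run of two or more blocks fits.
     * Uses a sliding window, which works because block sizes are positive.
     */
    static int[] longestRun(List<Long> sizes, long limit) {
        int bestStart = 0;
        int bestEnd = 0;
        int start = 0;
        long sum = 0;
        for (int end = 0; end < sizes.size(); end++) {
            sum += sizes.get(end);
            // Shrink the window from the left until its total is below the limit.
            while (sum >= limit && start <= end) {
                sum -= sizes.get(start);
                start++;
            }
            // Keep the first run with the largest block count.
            if (end + 1 - start > bestEnd - bestStart) {
                bestStart = start;
                bestEnd = end + 1;
            }
        }
        return (bestEnd - bestStart >= 2) ? new int[] {bestStart, bestEnd} : null;
    }

    /** Replaces the chosen run with one block of the summed size; other blocks are kept. */
    static List<Long> compactOnce(List<Long> sizes, long limit) {
        int[] run = longestRun(sizes, limit);
        if (run == null) {
            return new ArrayList<>(sizes); // nothing to merge in this round
        }
        List<Long> result = new ArrayList<>(sizes.subList(0, run[0]));
        long merged = 0;
        for (int i = run[0]; i < run[1]; i++) {
            merged += sizes.get(i);
        }
        result.add(merged);
        result.addAll(sizes.subList(run[1], sizes.size()));
        return result;
    }
}
```

For block sizes of 3, 1, 1, 1 and 3 MB with the 4 MB limit, this sketch merges the three 1 MB blocks into a single 3 MB block and leaves the two outer 3 MB blocks as they are. Blocks outside the chosen run are never touched, which is one way to read "preserves all potential candidates for the next round".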






[jira] [Updated] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-06-16 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14518:
-
Target Version/s: 3.0.0-alpha3, 2.8.1  (was: 2.8.1, 3.0.0-alpha3)
  Status: In Progress  (was: Patch Available)

> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
>Priority: Minor
> Attachments: HADOOP-14518-01.patch, HADOOP-14518-01-test.txt, 
> HADOOP-14518-02.patch, HADOOP-14518-03.patch, HADOOP-14518-04.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. If we customize the User-Agent 
> header, then it will enable better troubleshooting and analysis by Azure 
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.id</name>
>     <value>MSFT</value>
>   </property>
> sets the user agent to 
>  User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 
> (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test Results :
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-16 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Patch Available  (was: In Progress)

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above a predefined, configurable value, 
> the next hflush/hsync triggers the block compaction process. Block compaction 
> replaces a sequence of blocks with one block. From all the sequences with 
> total length less than 4 MB, compaction chooses the longest one. It is a greedy 
> algorithm that preserves all potential candidates for the next round. Block 
> Compaction for WASB increases data durability and allows using block blobs 
> instead of page blobs. By default, block compaction is disabled. Similar to 
> the configuration for page blobs, the client needs to specify HDFS folders 
> where block compaction over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119






[jira] [Commented] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-06-16 Thread Georgi Chalakov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16052528#comment-16052528
 ] 

Georgi Chalakov commented on HADOOP-14518:
--

Thanks for reviewing the change. The 3 style errors and the two typos are fixed 
in the new patch. 

> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
>Priority: Minor
> Attachments: HADOOP-14518-01.patch, HADOOP-14518-01-test.txt, 
> HADOOP-14518-02.patch, HADOOP-14518-03.patch, HADOOP-14518-04.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. If we customize the User-Agent 
> header, then it will enable better troubleshooting and analysis by Azure 
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.id</name>
>     <value>MSFT</value>
>   </property>
> sets the user agent to 
>  User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 
> (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test Results :
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-06-16 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14518:
-
Target Version/s: 3.0.0-alpha3, 2.8.1  (was: 2.8.0, 3.0.0-alpha3)
  Status: Patch Available  (was: In Progress)

> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
>Priority: Minor
> Attachments: HADOOP-14518-01.patch, HADOOP-14518-01-test.txt, 
> HADOOP-14518-02.patch, HADOOP-14518-03.patch, HADOOP-14518-04.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. If we customize the User-Agent 
> header, then it will enable better troubleshooting and analysis by Azure 
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.id</name>
>     <value>MSFT</value>
>   </property>
> sets the user agent to 
>  User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 
> (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test Results :
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-06-16 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14518:
-
Attachment: HADOOP-14518-04.patch

> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
>Priority: Minor
> Attachments: HADOOP-14518-01.patch, HADOOP-14518-01-test.txt, 
> HADOOP-14518-02.patch, HADOOP-14518-03.patch, HADOOP-14518-04.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. If we customize the User-Agent 
> header, then it will enable better troubleshooting and analysis by Azure 
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.id</name>
>     <value>MSFT</value>
>   </property>
> sets the user agent to 
>  User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 
> (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test Results :
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-12 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Patch Available  (was: In Progress)

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above a predefined, configurable value, 
> the next hflush/hsync triggers the block compaction process. Block compaction 
> replaces a sequence of blocks with one block. From all the sequences with 
> total length less than 4 MB, compaction chooses the longest one. It is a greedy 
> algorithm that preserves all potential candidates for the next round. Block 
> Compaction for WASB increases data durability and allows using block blobs 
> instead of page blobs. By default, block compaction is disabled. Similar to 
> the configuration for page blobs, the client needs to specify HDFS folders 
> where block compaction over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-12 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: In Progress  (was: Patch Available)

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above a predefined, configurable value, 
> the next hflush/hsync triggers the block compaction process. Block compaction 
> replaces a sequence of blocks with one block. From all the sequences with 
> total length less than 4 MB, compaction chooses the longest one. It is a greedy 
> algorithm that preserves all potential candidates for the next round. Block 
> Compaction for WASB increases data durability and allows using block blobs 
> instead of page blobs. By default, block compaction is disabled. Similar to 
> the configuration for page blobs, the client needs to specify HDFS folders 
> where block compaction over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-12 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Patch Available  (was: In Progress)

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/azure
> Affects Versions: 3.0.0-alpha3
> Reporter: Georgi Chalakov
> Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above a predefined, configurable value, 
> the next hflush/hsync triggers the block compaction process. Block compaction
> replaces a sequence of blocks with one block. From all the sequences with
> total length less than 4M, compaction chooses the longest one. It is a greedy
> algorithm that preserves all potential candidates for the next round. Block
> Compaction for WASB increases data durability and allows using block blobs 
> instead of page blobs. By default, block compaction is disabled. Similar to 
> the configuration for page blobs, the client needs to specify HDFS folders 
> where block compaction over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-12 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: In Progress  (was: Patch Available)

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/azure
> Affects Versions: 3.0.0-alpha3
> Reporter: Georgi Chalakov
> Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above a predefined, configurable value, 
> the next hflush/hsync triggers the block compaction process. Block compaction
> replaces a sequence of blocks with one block. From all the sequences with
> total length less than 4M, compaction chooses the longest one. It is a greedy
> algorithm that preserves all potential candidates for the next round. Block
> Compaction for WASB increases data durability and allows using block blobs 
> instead of page blobs. By default, block compaction is disabled. Similar to 
> the configuration for page blobs, the client needs to specify HDFS folders 
> where block compaction over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14518) Customize User-Agent header sent in HTTP/HTTPS requests by WASB.

2017-06-12 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14518:
-
Target Version/s: 3.0.0-alpha3, 2.8.0  (was: 2.8.0, 3.0.0-alpha3)
  Status: In Progress  (was: Patch Available)

> Customize User-Agent header sent in HTTP/HTTPS requests by WASB.
> 
>
> Key: HADOOP-14518
> URL: https://issues.apache.org/jira/browse/HADOOP-14518
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/azure
> Affects Versions: 3.0.0-alpha3
> Reporter: Georgi Chalakov
> Assignee: Georgi Chalakov
> Priority: Minor
> Attachments: HADOOP-14518-01.patch, HADOOP-14518-01-test.txt, 
> HADOOP-14518-02.patch, HADOOP-14518-03.patch
>
>
> WASB passes a User-Agent header to the Azure back-end. Right now, it uses the 
> default value set by the Azure Client SDK, so Hadoop traffic doesn't appear 
> any different from general Blob traffic. Customizing the User-Agent
> header would enable better troubleshooting and analysis by the Azure
> service.
> The following configuration
>   <property>
>     <name>fs.azure.user.agent.id</name>
>     <value>MSFT</value>
>   </property>
> sets the User-Agent header to
>   User-Agent: WASB/3.0.0-alpha4-SNAPSHOT (MSFT) Azure-Storage/4.2.0 (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)
> Test results:
> Tests run: 703, Failures: 0, Errors: 0, Skipped: 119
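
As a rough illustration of how the header value quoted above is put together, the sketch below prepends a WASB token and the configured id to the SDK's default User-Agent string. The class and method names are hypothetical and this is not the code from the attached patches. The only thing taken from the issue is the shape of the resulting string.

```java
/** Hypothetical sketch of composing the customized User-Agent value. */
final class WasbUserAgentSketch {

    /**
     * Builds "WASB/<hadoopVersion> (<userAgentId>) <sdkUserAgent>".
     * userAgentId stands for the value of fs.azure.user.agent.id and
     * sdkUserAgent for the default value set by the Azure client SDK.
     */
    static String compose(String hadoopVersion, String userAgentId, String sdkUserAgent) {
        return String.format("WASB/%s (%s) %s", hadoopVersion, userAgentId, sdkUserAgent);
    }

    public static void main(String[] args) {
        String sdkDefault =
            "Azure-Storage/4.2.0 (JavaJRE 1.8.0_131; WindowsServer2012R2 6.3)";
        System.out.println(
            "User-Agent: " + compose("3.0.0-alpha4-SNAPSHOT", "MSFT", sdkDefault));
    }
}
```

How the patch behaves when fs.azure.user.agent.id is unset is not stated in this message, so the sketch does not model a default.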






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-12 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Patch Available  (was: Open)

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/azure
> Affects Versions: 3.0.0-alpha3
> Reporter: Georgi Chalakov
> Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above a predefined, configurable value, 
> the next hflush/hsync triggers the block compaction process. Block compaction
> replaces a sequence of blocks with one block. From all the sequences with
> total length less than 4M, compaction chooses the longest one. It is a greedy
> algorithm that preserves all potential candidates for the next round. Block
> Compaction for WASB increases data durability and allows using block blobs 
> instead of page blobs. By default, block compaction is disabled. Similar to 
> the configuration for page blobs, the client needs to specify HDFS folders 
> where block compaction over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-12 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Open  (was: Patch Available)

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/azure
> Affects Versions: 3.0.0-alpha3
> Reporter: Georgi Chalakov
> Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above a predefined, configurable value, 
> the next hflush/hsync triggers the block compaction process. Block compaction
> replaces a sequence of blocks with one block. From all the sequences with
> total length less than 4M, compaction chooses the longest one. It is a greedy
> algorithm that preserves all potential candidates for the next round. Block
> Compaction for WASB increases data durability and allows using block blobs 
> instead of page blobs. By default, block compaction is disabled. Similar to 
> the configuration for page blobs, the client needs to specify HDFS folders 
> where block compaction over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-11 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Description: 
Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
call. When the number of blocks is above a predefined, configurable value, the
next hflush/hsync triggers the block compaction process. Block compaction replaces a
sequence of blocks with one block. From all the sequences with total length
less than 4M, compaction chooses the longest one. It is a greedy algorithm that
preserves all potential candidates for the next round. Block Compaction for WASB
increases data durability and allows using block blobs instead of page blobs. 
By default, block compaction is disabled. Similar to the configuration for page 
blobs, the client needs to specify HDFS folders where block compaction over 
block blobs is enabled. 

Results for HADOOP-14520-01.patch
Tests run: 704, Failures: 0, Errors: 0, Skipped: 119



  was:
Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
call. When the number of blocks is above a predefined, configurable value, the
next hflush/hsync triggers the block compaction process. Block compaction replaces a
sequence of blocks with one block. From all the sequences with total length
less than 4M, compaction chooses the longest one. It is a greedy algorithm that
preserves all potential candidates for the next round. Block Compaction for WASB
increases data durability and allows using block blobs instead of page blobs. 
By default, block compaction is disabled. Similar to the configuration for page 
blobs, the client needs to specify HDFS folders where block compaction over 
block blobs is enabled. 




> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/azure
> Affects Versions: 3.0.0-alpha3
> Reporter: Georgi Chalakov
> Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above a predefined, configurable value, 
> the next hflush/hsync triggers the block compaction process. Block compaction
> replaces a sequence of blocks with one block. From all the sequences with
> total length less than 4M, compaction chooses the longest one. It is a greedy
> algorithm that preserves all potential candidates for the next round. Block
> Compaction for WASB increases data durability and allows using block blobs 
> instead of page blobs. By default, block compaction is disabled. Similar to 
> the configuration for page blobs, the client needs to specify HDFS folders 
> where block compaction over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119
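
The opt-in behaviour described above (disabled by default, enabled only for folders the client lists) can be sketched as a simple prefix check. The class, the method names, and the comma-separated input format are assumptions made for illustration. The patch's actual configuration key and matching logic are not shown in this message.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical sketch of a per-folder opt-in check; not the code from the patch. */
final class CompactionDirSketch {

    /** Parses a comma-separated folder list; every entry is normalised to end with "/". */
    static List<String> parseDirs(String commaSeparated) {
        if (commaSeparated == null) {
            return List.of();
        }
        return Arrays.stream(commaSeparated.split(","))
            .map(String::trim)
            .filter(s -> !s.isEmpty())
            .map(s -> s.endsWith("/") ? s : s + "/")
            .collect(Collectors.toList());
    }

    /** True if path is one of the listed folders or lies below one of them. */
    static boolean isUnderCompactionDir(String path, List<String> dirs) {
        for (String dir : dirs) {
            if (path.startsWith(dir) || (path + "/").equals(dir)) {
                return true;
            }
        }
        // An empty list means the feature stays off, matching the stated default.
        return false;
    }

    public static void main(String[] args) {
        List<String> dirs = parseDirs("/hbase/WALs, /data/logs/");
        System.out.println(isUnderCompactionDir("/hbase/WALs/region1/wal.1", dirs));
        System.out.println(isUnderCompactionDir("/tmp/file", dirs));
    }
}
```

Normalising each folder to end with "/" keeps a sibling such as /hbase/WALsX from matching /hbase/WALs.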






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-11 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Status: Patch Available  (was: Open)

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/azure
> Affects Versions: 3.0.0-alpha3
> Reporter: Georgi Chalakov
> Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above a predefined, configurable value, 
> the next hflush/hsync triggers the block compaction process. Block compaction
> replaces a sequence of blocks with one block. From all the sequences with
> total length less than 4M, compaction chooses the longest one. It is a greedy
> algorithm that preserves all potential candidates for the next round. Block
> Compaction for WASB increases data durability and allows using block blobs 
> instead of page blobs. By default, block compaction is disabled. Similar to 
> the configuration for page blobs, the client needs to specify HDFS folders 
> where block compaction over block blobs is enabled. 
> Results for HADOOP-14520-01.patch
> Tests run: 704, Failures: 0, Errors: 0, Skipped: 119






[jira] [Updated] (HADOOP-14520) Block compaction for WASB

2017-06-11 Thread Georgi Chalakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Georgi Chalakov updated HADOOP-14520:
-
Attachment: HADOOP-14520-01-test.txt
HADOOP-14520-01.patch

> Block compaction for WASB
> -
>
> Key: HADOOP-14520
> URL: https://issues.apache.org/jira/browse/HADOOP-14520
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/azure
> Affects Versions: 3.0.0-alpha3
> Reporter: Georgi Chalakov
> Assignee: Georgi Chalakov
> Attachments: HADOOP-14520-01.patch, HADOOP-14520-01-test.txt
>
>
> Block Compaction for WASB allows uploading new blocks for every hflush/hsync 
> call. When the number of blocks is above a predefined, configurable value, 
> the next hflush/hsync triggers the block compaction process. Block compaction
> replaces a sequence of blocks with one block. From all the sequences with
> total length less than 4M, compaction chooses the longest one. It is a greedy
> algorithm that preserves all potential candidates for the next round. Block
> Compaction for WASB increases data durability and allows using block blobs 
> instead of page blobs. By default, block compaction is disabled. Similar to 
> the configuration for page blobs, the client needs to specify HDFS folders 
> where block compaction over block blobs is enabled. 





