[jira] [Updated] (HADOOP-15478) WASB: hflush() and hsync() regression

2018-06-17 Thread Thomas Marquardt (JIRA)


 [ https://issues.apache.org/jira/browse/HADOOP-15478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Marquardt updated HADOOP-15478:
--
Attachment: HADOOP-15547.002.patch

> WASB: hflush() and hsync() regression
> -------------------------------------
>
> Key: HADOOP-15478
> URL: https://issues.apache.org/jira/browse/HADOOP-15478
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Affects Versions: 2.9.0, 3.0.2
>Reporter: Thomas Marquardt
>Assignee: Thomas Marquardt
>Priority: Major
> Fix For: 2.10.0, 3.1.1
>
> Attachments: HADOOP-15478-002.patch, HADOOP-15478.001.patch
>
>
> HADOOP-14520 introduced a regression in hflush() and hsync().  Previously, 
> for the default case where users upload data as block blobs, these were 
> no-ops.  Unfortunately, HADOOP-14520 accidentally implemented hflush() and 
> hsync() by default, so any data buffered in the stream is immediately 
> uploaded to storage.  This new behavior is undesirable, because block blobs 
> have a limit of 50,000 blocks.  Spark users are now seeing failures due to 
> exceeding the block limit, since Spark frequently invokes hflush().
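
A minimal sketch (not the actual WASB implementation; class and method names here are illustrative) of why the regressed behavior exhausts the quota: if every hflush() commits the buffered data as a new block, a workload that flushes frequently, such as Spark, hits the 50,000-block cap regardless of how little data each flush carries.

```java
import java.io.IOException;

// Hypothetical model of a block-blob output stream under the regressed
// behavior: each hflush() commits one new block toward the blob's quota.
public class BlockBlobFlushModel {
    // Azure block blobs allow at most 50,000 committed blocks.
    static final int MAX_BLOCKS = 50_000;
    private int committedBlocks = 0;

    // Regressed behavior: hflush() immediately uploads the buffer as a block.
    void hflush() throws IOException {
        if (committedBlocks >= MAX_BLOCKS) {
            throw new IOException("block count exceeds the 50,000 limit");
        }
        committedBlocks++;
    }

    int committedBlocks() { return committedBlocks; }

    public static void main(String[] args) throws IOException {
        BlockBlobFlushModel stream = new BlockBlobFlushModel();
        // A long-running job calling hflush() per batch reaches the cap
        // after 50,000 flushes, however small each write is.
        for (int i = 0; i < MAX_BLOCKS; i++) {
            stream.hflush();
        }
        System.out.println("blocks=" + stream.committedBlocks());
        try {
            stream.hflush(); // the 50,001st flush fails
        } catch (IOException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

With the fix, hflush() and hsync() revert to no-ops for block blobs, so no blocks are committed until the stream is closed.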



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15478) WASB: hflush() and hsync() regression

2018-06-17 Thread Thomas Marquardt (JIRA)


 [ https://issues.apache.org/jira/browse/HADOOP-15478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Marquardt updated HADOOP-15478:
--
Attachment: (was: HADOOP-15547.002.patch)







[jira] [Updated] (HADOOP-15478) WASB: hflush() and hsync() regression

2018-05-21 Thread Steve Loughran (JIRA)

 [ https://issues.apache.org/jira/browse/HADOOP-15478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-15478:

   Resolution: Fixed
Fix Version/s: 3.1.1
   2.10.0
   Status: Resolved  (was: Patch Available)

+1
Applied to branch-3.1, then cherry-picked to branch-2 and retested against 
Azure Ireland. The new test was happy.

I did see a failure of {{ITestAzureFileSystemInstrumentation}}; it failed its 
assertion about bytes written in the last second, even when I tried a standalone run of it:
{code}
[ERROR]   
ITestAzureFileSystemInstrumentation.testMetricsOnFileCreateRead:162->Assert.assertTrue:41->Assert.fail:88
 The bytes written in the last second 0 is pretty far from the expected range 
of around 1000 bytes plus a little overhead.
{code}

I think my network is just playing up today, with bandwidth/latency below what 
the tests expect. If it's recurrent, we might have to think about making the 
assertion checks tunable.
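
A rough sketch of what a tunable check could look like (names and the system property here are hypothetical, not the actual test's API): the acceptable deviation from the expected byte count is read from a property, so runs on slow networks can widen the range.

```java
// Hypothetical tunable bounds check for a throughput-style metric assertion;
// the property name "fs.azure.test.tolerance" is illustrative only.
public class TunableBoundsCheck {
    static void assertInRange(String what, long actual, long expected,
                              double tolerance) {
        long slack = (long) (expected * tolerance);
        if (actual < expected - slack || actual > expected + slack) {
            throw new AssertionError(what + " " + actual
                + " is outside " + expected + " +/- " + slack);
        }
    }

    public static void main(String[] args) {
        // Default to +/-20%; a slow network run could pass a larger value,
        // e.g. -Dfs.azure.test.tolerance=0.5
        double tolerance = Double.parseDouble(
            System.getProperty("fs.azure.test.tolerance", "0.2"));
        assertInRange("bytes written in the last second", 1050, 1000, tolerance);
        System.out.println("ok, tolerance=" + tolerance);
    }
}
```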







[jira] [Updated] (HADOOP-15478) WASB: hflush() and hsync() regression

2018-05-21 Thread Steve Loughran (JIRA)

 [ https://issues.apache.org/jira/browse/HADOOP-15478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-15478:

Attachment: HADOOP-15478-002.patch







[jira] [Updated] (HADOOP-15478) WASB: hflush() and hsync() regression

2018-05-18 Thread Thomas Marquardt (JIRA)

 [ https://issues.apache.org/jira/browse/HADOOP-15478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Marquardt updated HADOOP-15478:
--
Status: Patch Available  (was: Open)

Submitting patch HADOOP-15478.001.patch







[jira] [Updated] (HADOOP-15478) WASB: hflush() and hsync() regression

2018-05-18 Thread Thomas Marquardt (JIRA)

 [ https://issues.apache.org/jira/browse/HADOOP-15478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Marquardt updated HADOOP-15478:
--
Attachment: HADOOP-15478.001.patch



