[jira] [Updated] (HADOOP-14081) S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)

2017-04-21 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HADOOP-14081:
-
Fix Version/s: 3.0.0-alpha3

> S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)
> --
>
> Key: HADOOP-14081
> URL: https://issues.apache.org/jira/browse/HADOOP-14081
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Fix For: 2.8.0, 3.0.0-alpha3
>
> Attachments: HADOOP-14081.001.patch
>
>
> In {{S3ADataBlocks::ByteArrayBlock}}, data is copied whenever {{startUpload}} 
> is called. It might be possible to directly access the byte[] array from 
> ByteArrayOutputStream. 
> Might have to extend ByteArrayOutputStream and create a method like 
> getInputStream() which can return ByteArrayInputStream.  This would avoid 
> expensive array copy during large upload.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14081) S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)

2017-02-20 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14081:

Parent Issue: HADOOP-11694  (was: HADOOP-13204)

> S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)
> --
>
> Key: HADOOP-14081
> URL: https://issues.apache.org/jira/browse/HADOOP-14081
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HADOOP-14081.001.patch
>
>
> In {{S3ADataBlocks::ByteArrayBlock}}, data is copied whenever {{startUpload}} 
> is called. It might be possible to directly access the byte[] array from 
> ByteArrayOutputStream. 
> Might have to extend ByteArrayOutputStream and create a method like 
> getInputStream() which can return ByteArrayInputStream.  This would avoid 
> expensive array copy during large upload.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14081) S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)

2017-02-20 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14081:

Fix Version/s: (was: 2.8.1)
   2.8.0

> S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)
> --
>
> Key: HADOOP-14081
> URL: https://issues.apache.org/jira/browse/HADOOP-14081
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HADOOP-14081.001.patch
>
>
> In {{S3ADataBlocks::ByteArrayBlock}}, data is copied whenever {{startUpload}} 
> is called. It might be possible to directly access the byte[] array from 
> ByteArrayOutputStream. 
> Might have to extend ByteArrayOutputStream and create a method like 
> getInputStream() which can return ByteArrayInputStream.  This would avoid 
> expensive array copy during large upload.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14081) S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)

2017-02-20 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14081:

Fix Version/s: 2.8.1

> S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)
> --
>
> Key: HADOOP-14081
> URL: https://issues.apache.org/jira/browse/HADOOP-14081
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Fix For: 2.8.1
>
> Attachments: HADOOP-14081.001.patch
>
>
> In {{S3ADataBlocks::ByteArrayBlock}}, data is copied whenever {{startUpload}} 
> is called. It might be possible to directly access the byte[] array from 
> ByteArrayOutputStream. 
> Might have to extend ByteArrayOutputStream and create a method like 
> getInputStream() which can return ByteArrayInputStream.  This would avoid 
> expensive array copy during large upload.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14081) S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)

2017-02-20 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14081:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1 to the production side patch. I'm not applying the test change for the 
reasons stated above.

tested against s3 ireland @ scale:

{code}
mvit -Dparallel-tests -Dscale -Dfs.s3a.scale.test.huge.filesize=128M
{code}

> S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)
> --
>
> Key: HADOOP-14081
> URL: https://issues.apache.org/jira/browse/HADOOP-14081
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HADOOP-14081.001.patch
>
>
> In {{S3ADataBlocks::ByteArrayBlock}}, data is copied whenever {{startUpload}} 
> is called. It might be possible to directly access the byte[] array from 
> ByteArrayOutputStream. 
> Might have to extend ByteArrayOutputStream and create a method like 
> getInputStream() which can return ByteArrayInputStream.  This would avoid 
> expensive array copy during large upload.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14081) S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)

2017-02-15 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HADOOP-14081:
--
Status: Patch Available  (was: Open)

> S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)
> --
>
> Key: HADOOP-14081
> URL: https://issues.apache.org/jira/browse/HADOOP-14081
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HADOOP-14081.001.patch
>
>
> In {{S3ADataBlocks::ByteArrayBlock}}, data is copied whenever {{startUpload}} 
> is called. It might be possible to directly access the byte[] array from 
> ByteArrayOutputStream. 
> Might have to extend ByteArrayOutputStream and create a method like 
> getInputStream() which can return ByteArrayInputStream.  This would avoid 
> expensive array copy during large upload.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14081) S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)

2017-02-15 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HADOOP-14081:
--
Attachment: HADOOP-14081.001.patch

> S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)
> --
>
> Key: HADOOP-14081
> URL: https://issues.apache.org/jira/browse/HADOOP-14081
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HADOOP-14081.001.patch
>
>
> In {{S3ADataBlocks::ByteArrayBlock}}, data is copied whenever {{startUpload}} 
> is called. It might be possible to directly access the byte[] array from 
> ByteArrayOutputStream. 
> Might have to extend ByteArrayOutputStream and create a method like 
> getInputStream() which can return ByteArrayInputStream.  This would avoid 
> expensive array copy during large upload.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14081) S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)

2017-02-14 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HADOOP-14081:
--
Issue Type: Sub-task  (was: Improvement)
Parent: HADOOP-13204

> S3A: Consider avoiding array copy in S3ABlockOutputStream (ByteArrayBlock)
> --
>
> Key: HADOOP-14081
> URL: https://issues.apache.org/jira/browse/HADOOP-14081
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Rajesh Balamohan
>Priority: Minor
>
> In {{S3ADataBlocks::ByteArrayBlock}}, data is copied whenever {{startUpload}} 
> is called. It might be possible to directly access the byte[] array from 
> ByteArrayOutputStream. 
> Might have to extend ByteArrayOutputStream and create a method like 
> getInputStream() which can return ByteArrayInputStream.  This would avoid 
> expensive array copy during large upload.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org