[jira] [Commented] (HADOOP-13853) S3ADataBlocks.DiskBlock to lazy create dest file for faster 0-byte puts

2017-12-08 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16283305#comment-16283305
 ] 

Steve Loughran commented on HADOOP-13853:
-

In HADOOP-13786 the _SUCCESS marker is non-empty. However, a lazy create may 
still have some merit for any touch() operation.

> S3ADataBlocks.DiskBlock to lazy create dest file for faster 0-byte puts
> ---
>
> Key: HADOOP-13853
> URL: https://issues.apache.org/jira/browse/HADOOP-13853
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 2.8.0
> Reporter: Steve Loughran
> Priority: Minor
>
> Looking at traces of work, there's invariably a PUT of a _SUCCESS file at the 
> end, which, with disk output, adds the overhead of creating, writing to and 
> then reading a 0-byte file.
> With a lazy create, the creation could be postponed until the first write, 
> with special handling in the {{startUpload()}} operation to return a null 
> stream rather than reopen the file. This saves some disk IO: create, read, 
> delete.
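
The lazy-create idea in the description can be sketched roughly as below. This is a minimal illustration, not the actual S3ADataBlocks.DiskBlock code: the class name LazyDiskBlock, its fields and the hasBufferFile() helper are hypothetical, and "null stream" is interpreted here as a Java null return, which is an assumption.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of a disk-backed block that creates its file lazily. */
class LazyDiskBlock implements Closeable {
  private Path bufferFile;   // stays null until the first non-empty write
  private OutputStream out;  // stays null until the first non-empty write
  private long dataSize;

  /** Buffer data on disk, creating the temp file only when it is first needed. */
  void write(byte[] data, int offset, int length) throws IOException {
    if (length == 0) {
      return;  // nothing to buffer, so no file is created
    }
    if (out == null) {
      bufferFile = Files.createTempFile("s3ablock", ".tmp");
      out = Files.newOutputStream(bufferFile);
    }
    out.write(data, offset, length);
    dataSize += length;
  }

  /**
   * Switch from writing to uploading.
   * Returns null when nothing was written (assumed meaning of "null stream"),
   * so a 0-byte PUT needs no file create, read or delete.
   */
  InputStream startUpload() throws IOException {
    if (out == null) {
      return null;
    }
    out.close();
    return Files.newInputStream(bufferFile);
  }

  boolean hasBufferFile() {
    return bufferFile != null;
  }

  long dataSize() {
    return dataSize;
  }

  /** Release the buffer file, if one was ever created. */
  @Override
  public void close() throws IOException {
    if (out != null) {
      out.close();
    }
    if (bufferFile != null) {
      Files.deleteIfExists(bufferFile);
    }
  }
}
```

In this sketch the caller has to treat a null return from startUpload() as "upload zero bytes", and must close any returned stream before calling close() on the block.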



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13853) S3ADataBlocks.DiskBlock to lazy create dest file for faster 0-byte puts

2016-12-01 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712269#comment-15712269
 ] 

Steve Loughran commented on HADOOP-13853:
-

Note that, given all the other overheads of a commit, this is not the bottleneck.







[jira] [Commented] (HADOOP-13853) S3ADataBlocks.DiskBlock to lazy create dest file for faster 0-byte puts

2016-12-01 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712267#comment-15712267
 ] 

Steve Loughran commented on HADOOP-13853:
-

DEBUG-level logs of the work used to create a marker file.
{code}
2016-12-01 15:22:29,727 [ScalaTest-main-running-S3ANumbersSuite] DEBUG 
s3a.S3ABlockOutputStream (S3ABlockOutputStream.java:<init>(170)) - Initialized 
S3ABlockOutputStream for {bucket=hwdev-steve-new, 
key='spark-cloud/S3ANumbersSuite/numbers_rdd_tests/_SUCCESS'} output to 
FileBlock{destFile=/Users/stevel/Projects/Hortonworks/Projects/sparkwork/spark-cloud-examples/cloud-examples/target/tmp/s3ablock8507376768330281400.tmp,
 state=Writing, dataSize=0, limit=8388608}
2016-12-01 15:22:29,728 [ScalaTest-main-running-S3ANumbersSuite] DEBUG 
s3a.S3ABlockOutputStream (S3ABlockOutputStream.java:close(333)) - 
S3ABlockOutputStream{{bucket=hwdev-steve-new, 
key='spark-cloud/S3ANumbersSuite/numbers_rdd_tests/_SUCCESS'}, 
blockSize=8388608, 
activeBlock=FileBlock{destFile=/Users/stevel/Projects/Hortonworks/Projects/sparkwork/spark-cloud-examples/cloud-examples/target/tmp/s3ablock8507376768330281400.tmp,
 state=Writing, dataSize=0, limit=8388608}}: Closing block #1: current block= 
FileBlock{destFile=/Users/stevel/Projects/Hortonworks/Projects/sparkwork/spark-cloud-examples/cloud-examples/target/tmp/s3ablock8507376768330281400.tmp,
 state=Writing, dataSize=0, limit=8388608}
2016-12-01 15:22:29,728 [ScalaTest-main-running-S3ANumbersSuite] DEBUG 
s3a.S3ABlockOutputStream (S3ABlockOutputStream.java:putObject(386)) - Executing 
regular upload for {bucket=hwdev-steve-new, 
key='spark-cloud/S3ANumbersSuite/numbers_rdd_tests/_SUCCESS'}
2016-12-01 15:22:29,728 [ScalaTest-main-running-S3ANumbersSuite] DEBUG 
s3a.S3ADataBlocks (S3ADataBlocks.java:startUpload(247)) - Start datablock upload
2016-12-01 15:22:29,728 [ScalaTest-main-running-S3ANumbersSuite] DEBUG 
s3a.S3ADataBlocks (S3ADataBlocks.java:enterState(154)) - 
FileBlock{destFile=/Users/stevel/Projects/Hortonworks/Projects/sparkwork/spark-cloud-examples/cloud-examples/target/tmp/s3ablock8507376768330281400.tmp,
 state=Writing, dataSize=0, limit=8388608}: entering state Upload
2016-12-01 15:22:29,729 [ScalaTest-main-running-S3ANumbersSuite] DEBUG 
s3a.S3ABlockOutputStream (S3ABlockOutputStream.java:clearActiveBlock(212)) - 
Clearing active block
2016-12-01 15:22:29,729 [s3a-transfer-shared-pool1-t5] DEBUG s3a.S3AFileSystem 
(S3AFileSystem.java:incrementPutStartStatistics(1169)) - PUT start 0 bytes
2016-12-01 15:22:29,730 [s3a-transfer-shared-pool1-t5] DEBUG s3a.S3AFileSystem 
(S3AStorageStatistics.java:incrementCounter(60)) - object_put_requests += 1  -> 
 4
2016-12-01 15:22:29,908 [s3a-transfer-shared-pool1-t5] DEBUG s3a.S3AFileSystem 
(S3AFileSystem.java:incrementPutCompletedStatistics(1186)) - PUT completed 
success=true; 0 bytes
2016-12-01 15:22:29,908 [s3a-transfer-shared-pool1-t5] DEBUG s3a.S3AFileSystem 
(S3AStorageStatistics.java:incrementCounter(60)) - 
object_put_requests_completed += 1  ->  4
2016-12-01 15:22:29,908 [s3a-transfer-shared-pool1-t5] DEBUG s3a.S3ADataBlocks 
(S3ADataBlocks.java:enterState(154)) - 
FileBlock{destFile=/Users/stevel/Projects/Hortonworks/Projects/sparkwork/spark-cloud-examples/cloud-examples/target/tmp/s3ablock8507376768330281400.tmp,
 state=Upload, dataSize=0, limit=8388608}: entering state Closed
2016-12-01 15:22:29,909 [s3a-transfer-shared-pool1-t5] DEBUG s3a.S3ADataBlocks 
(S3ADataBlocks.java:close(269)) - Closed 
FileBlock{destFile=/Users/stevel/Projects/Hortonworks/Projects/sparkwork/spark-cloud-examples/cloud-examples/target/tmp/s3ablock8507376768330281400.tmp,
 state=Closed, dataSize=0, limit=8388608}
2016-12-01 15:22:29,909 [s3a-transfer-shared-pool1-t5] DEBUG s3a.S3ADataBlocks 
(S3ADataBlocks.java:innerClose(743)) - Closing 
FileBlock{destFile=/Users/stevel/Projects/Hortonworks/Projects/sparkwork/spark-cloud-examples/cloud-examples/target/tmp/s3ablock8507376768330281400.tmp,
 state=Closed, dataSize=0, limit=8388608}
2016-12-01 15:22:29,909 [ScalaTest-main-running-S3ANumbersSuite] DEBUG 
s3a.S3ABlockOutputStream (S3ABlockOutputStream.java:close(360)) - Upload 
complete for {bucket=hwdev-steve-new, 
key='spark-cloud/S3ANumbersSuite/numbers_rdd_tests/_SUCCESS'}
{code}
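
To put rough numbers on the trace above, the timestamps can be compared directly. The snippet below is only an illustrative sketch: the class PutTraceTiming and its millisBetween() helper are hypothetical, and the four timestamps are copied from the log lines for stream initialization, "PUT start", "PUT completed" and "Upload complete".

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper for comparing timestamps taken from the DEBUG trace. */
class PutTraceTiming {
  // Matches the timestamp layout used in the trace, e.g. "2016-12-01 15:22:29,727".
  private static final DateTimeFormatter LOG_TIME =
      DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss,SSS");

  /** Milliseconds elapsed between two log timestamps. */
  static long millisBetween(String start, String end) {
    return Duration.between(
        LocalDateTime.parse(start, LOG_TIME),
        LocalDateTime.parse(end, LOG_TIME)).toMillis();
  }

  public static void main(String[] args) {
    // Stream initialized -> "Upload complete"
    long total = millisBetween("2016-12-01 15:22:29,727", "2016-12-01 15:22:29,909");
    // "PUT start" -> "PUT completed"
    long put = millisBetween("2016-12-01 15:22:29,729", "2016-12-01 15:22:29,908");
    System.out.println("total=" + total + "ms, put=" + put + "ms");
  }
}
```

For this single trace the helper reports 182 ms in total, of which 179 ms is the PUT itself, so the local block handling accounts for only a few milliseconds here.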
