[jira] [Commented] (HADOOP-13286) add a scale test to do gunzip and linecount

Steve Loughran (JIRA) Mon, 20 Jun 2016 06:25:32 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15339477#comment-15339477
 ]


Steve Loughran commented on HADOOP-13286:
-----------------------------------------

In a test against S3 Ireland, opening the file with the sequential policy, 
the read took 9.6s:
{code}
Running org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 9.537 sec - in 
org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance
{code}

The closest equivalent test is {{testTimeToOpenAndReadWholeFileByByte}}, which, 
interestingly, takes slightly longer, at least for me. (disclaimer, this is 
{code}
Running org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.329 sec - in 
org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance
{code}

Given that decompress+line-by-line is a pattern we see in real code, I'd 
actually like to keep it and cut the {{testTimeToOpenAndReadWholeFileByByte}} test.
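
As a rough illustration of the decompress+line-by-line pattern, here is a minimal stdlib-only sketch. It is not the test in the patch: it assumes java.util.zip.GZIPInputStream and BufferedReader in place of the Hadoop codec and LineReader, reads an in-memory payload rather than an S3A stream, and the names GunzipLineCount, countLines and gzip are hypothetical, made up for this example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-in for the scale test: gunzip a stream and count lines.
class GunzipLineCount {

  /** Decompress a gzip stream and count the lines it contains. */
  static long countLines(InputStream compressed) throws IOException {
    long lines = 0;
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(
        new GZIPInputStream(compressed), StandardCharsets.UTF_8))) {
      while (reader.readLine() != null) {
        lines++;
      }
    }
    return lines;
  }

  /** Build a small in-memory gzip payload so the sketch runs without S3. */
  static byte[] gzip(String text) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
      out.write(text.getBytes(StandardCharsets.UTF_8));
    }
    return bytes.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    // Assumed sample data; a real run would open the .csv.gz test file.
    byte[] data = gzip("a,1\nb,2\nc,3\n");
    long start = System.nanoTime();
    long lines = countLines(new ByteArrayInputStream(data));
    long micros = (System.nanoTime() - start) / 1_000;
    // Simple metric printing: line count, compressed size, elapsed time.
    System.out.println("lines=" + lines + " compressedBytes=" + data.length
        + " elapsedMicros=" + micros);
  }
}
```

Swapping the ByteArrayInputStream for a stream opened from the filesystem under test would give the same shape of measurement against real storage.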

> add a scale test to do gunzip and linecount
> -------------------------------------------
>
>                 Key: HADOOP-13286
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13286
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-13286-branch-2-001.patch
>
>
> The HADOOP-13203 patch proposal showed that there were performance problems 
> downstream which weren't surfacing in the current scale tests.
> Trying to decompress the .gz test file and then go through it with LineReader 
> models a basic use case: parse a .csv.gz data source.
> Add this, with metric printing.



