[jira] [Commented] (HADOOP-13286) add a scale test to do gunzip and linecount

Chris Nauroth (JIRA) Fri, 17 Jun 2016 12:42:05 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15336786#comment-15336786 ]


Chris Nauroth commented on HADOOP-13286:
----------------------------------------

It's not clear to me that this test is distinct enough from others that it 
justifies the increased test runtime, shown here as ~2 minutes (though parallel 
execution can mask that).  Using a compression codec and line-oriented text 
formats is a common pattern, but that just adds extra pieces on top of a 
sequential file access pattern at the {{FileSystem}} layer.  In HADOOP-13203, 
the existing {{TestS3AInputStreamPerformance#testReadAheadDefault}} was 
sufficient for me to flag a performance regression on sequential reads.  Could 
the {{logStreamStatistics}} and {{NanoTimer}} usage be applied to that test or 
other pre-existing tests instead of adding a new test?

If I missed something unique about what this test is covering, please let me 
know, and I'll go ahead and review it.

> add a scale test to do gunzip and linecount
> -------------------------------------------
>
>                 Key: HADOOP-13286
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13286
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-13286-branch-2-001.patch
>
>
> The HADOOP-13203 patch proposal showed that there were performance problems 
> downstream which weren't surfacing in the current scale tests.
> Decompressing the .gz test file and then going through it with LineReader 
> models a basic use case: parsing a .csv.gz data source. 
> Add this, with metric printing.
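
For illustration only, the access pattern described above (gunzip, then count lines, then print a timing) can be sketched with plain JDK classes. This is not the code in the attached patch: GZIPInputStream and BufferedReader stand in for Hadoop's compression codec and LineReader, the in-memory buffer stands in for the .gz test file opened through the FileSystem, and the names GzipLineCount, gzip and countLines are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch: JDK stand-ins for the codec + LineReader pipeline.
class GzipLineCount {

  /** Gzip a string in memory; stands in for the .gz test file. */
  static byte[] gzip(String text) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
      out.write(text.getBytes(StandardCharsets.UTF_8));
    }
    return bytes.toByteArray();
  }

  /** Decompress the stream sequentially and count the lines it contains. */
  static long countLines(InputStream compressed) throws IOException {
    long lines = 0;
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(
        new GZIPInputStream(compressed), StandardCharsets.UTF_8))) {
      while (reader.readLine() != null) {
        lines++;
      }
    }
    return lines;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = gzip("a,b\nc,d\ne,f\n");
    long start = System.nanoTime();
    long lines = countLines(new ByteArrayInputStream(data));
    long elapsedNanos = System.nanoTime() - start;
    // Metric printing: line count plus elapsed wall-clock time.
    System.out.println(lines + " lines in " + elapsedNanos + " ns");
  }
}
```

Underneath the decompression and line splitting, the source stream is still read strictly front to back, which is the sequential pattern the comment above refers to.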



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
