[ https://issues.apache.org/jira/browse/BEAM-3060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16259421#comment-16259421 ]
ASF GitHub Bot commented on BEAM-3060:
--------------------------------------

GitHub user lgajowy opened a pull request:

    https://github.com/apache/beam/pull/4149

    [BEAM-3060] Add Compressed TextIOIT

    Follow this checklist to help us incorporate your contribution quickly and easily:

     - [ ] Make sure there is a [JIRA issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the change (usually before you start working on it). Trivial changes like typos do not require a JIRA issue. Your pull request should address just this issue, without pulling in other changes.
     - [ ] Each commit in the pull request should have a meaningful subject line and body.
     - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue.
     - [ ] Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
     - [ ] Run `mvn clean verify` to make sure basic checks pass. A more thorough check will be performed on your pull request automatically.
     - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

    ---

    This is a parametrized test for Compressed TextIO. This pull request contains only the Java code; @DariuszAniszewski is working on Perfkit support and Dataflow runner support on his separate branch.

    As the ZIP compression type is unsupported, I skipped it in the test.

    @chamikaramj could you take a look?
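For readers who want a feel for what a test parametrized over compression types looks like, here is a minimal sketch. It is not the code from the pull request and does not use Beam's `TextIO` at all: it assumes a hypothetical `Codec` enum and a hypothetical `roundTrip` helper built only on the JDK's `java.util.zip` streams, writes the same lines through each codec, reads them back, and checks that nothing changed. ZIP is left out, mirroring the pull request; only codecs available in the JDK are covered.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

public class CompressedRoundTrip {

  /** Hypothetical stand-in for the compression types a test iterates over (ZIP omitted). */
  public enum Codec {
    UNCOMPRESSED, GZIP, DEFLATE;

    /** Wraps a raw output stream in the matching compressing stream. */
    OutputStream wrap(OutputStream out) throws IOException {
      switch (this) {
        case GZIP: return new GZIPOutputStream(out);
        case DEFLATE: return new DeflaterOutputStream(out);
        default: return out;
      }
    }

    /** Wraps a raw input stream in the matching decompressing stream. */
    InputStream wrap(InputStream in) throws IOException {
      switch (this) {
        case GZIP: return new GZIPInputStream(in);
        case DEFLATE: return new InflaterInputStream(in);
        default: return in;
      }
    }
  }

  /** Writes the lines through the codec into memory, then reads them back. */
  static List<String> roundTrip(List<String> lines, Codec codec) throws IOException {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    // Closing the writer also finishes the compressed stream.
    try (Writer writer = new OutputStreamWriter(codec.wrap(sink), StandardCharsets.UTF_8)) {
      for (String line : lines) {
        writer.write(line);
        writer.write('\n');
      }
    }
    List<String> result = new ArrayList<>();
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(
        codec.wrap(new ByteArrayInputStream(sink.toByteArray())), StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        result.add(line);
      }
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    List<String> lines = Arrays.asList("alpha", "beta", "gamma");
    // The "parameter" of the test is the codec: same data, same check, every codec.
    for (Codec codec : Codec.values()) {
      if (!roundTrip(lines, codec).equals(lines)) {
        throw new AssertionError("round trip changed the data for " + codec);
      }
      System.out.println(codec + ": ok");
    }
  }
}
```

In a real Beam integration test the write and read steps would be pipeline transforms and the data would live on a file system rather than in memory; the sketch only illustrates the shape of the parametrization.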
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lgajowy/beam compressed-text-io-test

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/4149.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4149

----
commit df472abc6ee1b3c2ea021f6069beabd6a4439907
Author: Łukasz Gajowy <lukasz.gaj...@polidea.com>
Date:   2017-11-20T16:00:54Z

    [BEAM-3060] Add Compressed TextIOIT

----

> Add performance tests for commonly used file-based I/O PTransforms
> ------------------------------------------------------------------
>
>                 Key: BEAM-3060
>                 URL: https://issues.apache.org/jira/browse/BEAM-3060
>             Project: Beam
>          Issue Type: Test
>          Components: sdk-java-core
>            Reporter: Chamikara Jayalath
>            Assignee: Szymon Nieradka
>
> We recently added a performance testing framework [1] that can be used to do
> the following.
> (1) Execute Beam tests using PerfkitBenchmarker
> (2) Manage Kubernetes-based deployments of data stores.
> (3) Easily publish benchmark results.
> I think it will be useful to add performance tests for commonly used
> file-based I/O PTransforms using this framework. I suggest looking into
> the following formats initially.
> (1) AvroIO
> (2) TextIO
> (3) Compressed text using TextIO
> (4) TFRecordIO
> It should be possible to run these tests for various Beam runners (Direct,
> Dataflow, Flink, Spark, etc.) and file-systems (GCS, local, HDFS, etc.)
> easily.
> In the initial version, tests can be made manually triggerable for PRs
> through Jenkins. Later, we could make some of these tests run periodically
> and publish benchmark results (to BigQuery) through PerfkitBenchmarker.
> [1] https://beam.apache.org/documentation/io/testing/

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)