[GitHub] beam pull request #2017: [BEAM-1410] python-sdk: add stacked WindowedValues ...
GitHub user yk5 opened a pull request:

    https://github.com/apache/beam/pull/2017

[BEAM-1410] python-sdk: add stacked WindowedValues in DirectRunner.Bundle. It saves memory in the typical case where timestamp/window info is shared. This is on by default, but can be turned off by passing --no_direct_runner_use_stacked_bundle to the pipeline.

Be sure to do all of the following to help us incorporate your contribution quickly and easily:

- [ ] Make sure the PR title is formatted like: `[BEAM-] Description of pull request`
- [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes).
- [ ] Replace `` in the title with the actual Jira issue number, if there is one.
- [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).

---

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yk5/beam stacked_bundle

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2017.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2017

commit f23f1b3af251c48789ff1afb9118b817b7d6fff4
Author: Younghee Kwon
Date: 2017-02-16T01:23:34Z

    python-sdk: add stacked WindowedValues in DirectRunner.Bundle. It saves memory for the typical cases that timestamp/window info is shared.

---

If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.

---
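The memory saving described in pull request #2017 can be illustrated with a small standalone sketch. This is not the code from the pull request: `WindowedValue`, `StackedBundle`, and `stack_count` are hypothetical names chosen for illustration, and the sketch only assumes that consecutive elements often carry the same timestamp and windows, so that metadata can be stored once per run of values instead of once per element.

```python
from collections import namedtuple

# Hypothetical stand-in for a windowed element: a value plus its metadata.
WindowedValue = namedtuple("WindowedValue", ["value", "timestamp", "windows"])


class StackedBundle(object):
    """Illustrative bundle that stores shared timestamp/window info once."""

    def __init__(self):
        # Each entry is (timestamp, windows, [values sharing that metadata]).
        self._stacks = []

    def add(self, windowed_value):
        if self._stacks:
            timestamp, windows, values = self._stacks[-1]
            # Reuse the last stack when the metadata is identical.
            if (timestamp == windowed_value.timestamp
                    and windows == windowed_value.windows):
                values.append(windowed_value.value)
                return
        # Otherwise open a new stack for this timestamp/window combination.
        self._stacks.append(
            (windowed_value.timestamp, windowed_value.windows,
             [windowed_value.value]))

    def stack_count(self):
        return len(self._stacks)

    def __iter__(self):
        # Rebuild full windowed values lazily, in insertion order.
        for timestamp, windows, values in self._stacks:
            for value in values:
                yield WindowedValue(value, timestamp, windows)


bundle = StackedBundle()
for word in ["a", "b", "c"]:
    bundle.add(WindowedValue(word, 0, ("global",)))
bundle.add(WindowedValue("d", 5, ("global",)))
```

In this sketch the four elements are held in two stacks, and iterating over the bundle still yields four complete windowed values in their original order.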
[GitHub] beam pull request #1933: [BEAM-1410] Improve DirectRunner performance by tun...
GitHub user yk5 opened a pull request:

    https://github.com/apache/beam/pull/1933

[BEAM-1410] Improve DirectRunner performance by tuning BoundedReadEvaluator.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yk5/beam performance

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/1933.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1933

commit d34145adf81dd50aa00d8e968fb2843bb624b23c
Author: Younghee Kwon
Date: 2017-02-07T05:49:49Z

    Improve DirectRunner performance by tuning BoundedReadEvaluator.
[GitHub] beam pull request #1928: [BEAM-588] Add MemoryReporter to python-sdk
GitHub user yk5 opened a pull request:

    https://github.com/apache/beam/pull/1928

[BEAM-588] Add MemoryReporter to python-sdk

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yk5/beam master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/1928.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1928

commit 1c9a36599f1ae9b86205c59ba3754dba139921d2
Author: Younghee Kwon
Date: 2017-02-06T20:35:50Z

    To add sdks/python/utils/profiler a MemoryReporter that tracks heap profiles.

commit 9525392a39234af4efd808c3cbe17e930d65bf94
Author: Younghee Kwon
Date: 2017-02-06T21:55:36Z

    added comment about guppy
[GitHub] beam pull request #1749: [BEAM-1233] Create TFRecordIO, providing source/sin...
Github user yk5 closed the pull request at: https://github.com/apache/beam/pull/1749
[GitHub] beam pull request #1749: [BEAM-1233] Create TFRecordIO, providing source/sin...
GitHub user yk5 opened a pull request:

    https://github.com/apache/beam/pull/1749

[BEAM-1233] Create TFRecordIO, providing source/sink for TFRecords, which is the dedicated record format for Tensorflow. For more about TFRecords, refer to https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/api_docs/python/python_io.md

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yk5/beam tfrecord

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/1749.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1749

commit 3bbd2c1c208860c48c7a4c1909e3936a1fab4faa
Author: Younghee Kwon
Date: 2017-01-07T02:05:56Z

    Create TFRecordIO, which provides source/sink for TFRecords, the dedicated record format for Tensorflow. For more about TFRecords, refer to https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/api_docs/python/python_io.md
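For readers unfamiliar with the record format that pull request #1749 targets, the sketch below frames and parses records in the layout TensorFlow documents for TFRecord files: a little-endian uint64 length, a masked CRC-32C of the length, the payload, and a masked CRC-32C of the payload. It is not the TFRecordIO code from the pull request; `crc32c`, `masked_crc32c`, `encode_record`, and `decode_records` are hypothetical helper names, and the pure-Python CRC is an assumption made to keep the example dependency-free rather than fast.

```python
import struct


def _make_crc32c_table():
    # Table for the reflected Castagnoli polynomial used by CRC-32C.
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC32C_TABLE = _make_crc32c_table()


def crc32c(data):
    crc = 0xFFFFFFFF
    for byte in bytearray(data):
        crc = _CRC32C_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def masked_crc32c(data):
    # Rotate right by 15 bits and add a constant, modulo 2**32.
    crc = crc32c(data)
    rotated = ((crc >> 15) | (crc << 17)) & 0xFFFFFFFF
    return (rotated + 0xA282EAD8) & 0xFFFFFFFF


def encode_record(payload):
    length = struct.pack("<Q", len(payload))
    return (length
            + struct.pack("<I", masked_crc32c(length))
            + payload
            + struct.pack("<I", masked_crc32c(payload)))


def decode_records(buf):
    offset = 0
    while offset < len(buf):
        length_bytes = buf[offset:offset + 8]
        (length,) = struct.unpack("<Q", length_bytes)
        (length_crc,) = struct.unpack("<I", buf[offset + 8:offset + 12])
        if length_crc != masked_crc32c(length_bytes):
            raise ValueError("corrupt length field at offset %d" % offset)
        start = offset + 12
        payload = buf[start:start + length]
        (payload_crc,) = struct.unpack(
            "<I", buf[start + length:start + length + 4])
        if payload_crc != masked_crc32c(payload):
            raise ValueError("corrupt payload at offset %d" % offset)
        yield payload
        offset = start + length + 4


stream = encode_record(b"hello") + encode_record(b"")
records = list(decode_records(stream))
```

Each record carries 16 bytes of framing in addition to its payload, and a reader that checks both checksums can reject a damaged length or payload instead of silently returning bad data.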
[GitHub] beam pull request #1736: [Beam-1245] Use @unittest.skip to skip avroio_test ...
Github user yk5 closed the pull request at: https://github.com/apache/beam/pull/1736
[GitHub] beam pull request #1737: [BEAM-1246] Update README.md to get rid of 'incubat...
Github user yk5 closed the pull request at: https://github.com/apache/beam/pull/1737
[GitHub] beam pull request #1737: Update README.md to get rid of 'incubating' notion.
GitHub user yk5 opened a pull request:

    https://github.com/apache/beam/pull/1737

Update README.md to get rid of 'incubating' notion.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yk5/beam comments

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/1737.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1737

commit 0dfb26be95da6cd1e2932d7d1b13cfa70f4a644e
Author: Younghee Kwon
Date: 2017-01-05T04:53:45Z

    Update README.md to get rid of 'incubating' notion.
[GitHub] beam pull request #1736: [Beam-1245] Use @unittest.skip to skip avroio_test ...
GitHub user yk5 opened a pull request:

    https://github.com/apache/beam/pull/1736

[Beam-1245] Use @unittest.skip to skip avroio_test cases when python-snappy is not installed.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yk5/beam tfrecord

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/1736.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1736

commit 2dca45833f06329c30c303f257faaadf5c438211
Author: Younghee Kwon
Date: 2017-01-05T02:39:29Z

    To use @unittest.skip to skip avroio_test cases when snappy is not imported.

    Without snappy installed, test log would look like:

    WARNING:root:snappy is not installed; some tests will be skipped.
    ...
    Ran 21 tests in 13.840s
    OK (skipped=3)

    ---

    With installed:

    ...
    Ran 21 tests in 14.464s
    OK
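The skipping approach named in pull request #1736 can be sketched with the standard `unittest` decorators. This is a minimal illustration rather than the actual avroio_test change: `CodecTest` and its test methods are hypothetical, and the sketch uses `unittest.skipIf` with a module-level import guard, which is one common way to make tests conditional on an optional dependency such as python-snappy.

```python
import logging
import unittest

try:
    import snappy  # Optional dependency provided by python-snappy.
except ImportError:
    snappy = None
    logging.warning("snappy is not installed; some tests will be skipped.")


class CodecTest(unittest.TestCase):
    """Hypothetical test case with one snappy-dependent test."""

    def test_plain_round_trip(self):
        # Always runs: no optional dependency involved.
        self.assertEqual(b"abc".decode("ascii").encode("ascii"), b"abc")

    @unittest.skipIf(snappy is None, "python-snappy is not installed")
    def test_snappy_round_trip(self):
        # Runs only when the snappy module could be imported.
        data = b"hello" * 10
        self.assertEqual(snappy.decompress(snappy.compress(data)), data)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(CodecTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

With this pattern a missing optional package is reported as a skip rather than a failure or a silently passing test, which matches the `OK (skipped=3)` summary quoted in the commit message above.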
[GitHub] beam pull request #1722: [BEAM-1232] fixed comments to refer ptransform.expa...
Github user yk5 closed the pull request at: https://github.com/apache/beam/pull/1722
[GitHub] beam pull request #1691: [Beam-1232] Fixed pipeline.py comments to be confor...
Github user yk5 closed the pull request at: https://github.com/apache/beam/pull/1691
[GitHub] beam pull request #1722: [BEAM-1232] fixed comments to refer ptransform.expa...
GitHub user yk5 opened a pull request:

    https://github.com/apache/beam/pull/1722

[BEAM-1232] fixed comments to refer ptransform.expand() instead of apply()

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yk5/incubator-beam fixcomments

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/1722.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1722

commit 33afbb2e59ae71cad6cbd47dba9ed980db88b113
Author: Younghee Kwon
Date: 2016-12-30T22:32:58Z

    Updated ptransform.apply() to ptransform.expand() in the comments.
[GitHub] incubator-beam pull request #1691: Fixed comments to make example usage to b...
GitHub user yk5 opened a pull request:

    https://github.com/apache/incubator-beam/pull/1691

Fixed comments to make example usage to be conformant to python-sdk

Hi @robertwb, I just started working on the Apache Beam Python SDK (b/33761836) and found some examples that do not match the SDK syntax, so I am fixing them. Can you please take a look?

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yk5/incubator-beam python-sdk

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-beam/pull/1691.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1691

commit c49e2f63e3c28fe7dbb0fd9887a85dd23fa6a128
Author: Younghee Kwon
Date: 2016-12-22T21:57:49Z

    Fixed the example usage in pipeline.py, which was not conforming to python-sdk.