[GitHub] beam pull request #2017: [BEAM-1410] python-sdk: add stacked WindowedValues ...

2017-02-15 Thread yk5
GitHub user yk5 opened a pull request:

https://github.com/apache/beam/pull/2017

[BEAM-1410] python-sdk: add stacked WindowedValues in DirectRunner.Bundle.

It saves memory in the typical case where timestamp/window info is shared.

This is on by default, but can be turned off by passing
--no_direct_runner_use_stacked_bundle to the pipeline.
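
To illustrate the idea only, here is a minimal sketch of stacking. It is not the code in this PR: the names `WindowedValue`, `StackedValues`, and `Bundle` below are hypothetical stand-ins, timestamps are plain integers, and windows are plain strings. Elements whose timestamp/window info matches the previous element share one metadata record instead of each carrying its own.

```python
from typing import Any, Iterator, List, NamedTuple, Tuple


class WindowedValue(NamedTuple):
    """One element with its own timestamp and windows (illustrative only)."""
    value: Any
    timestamp: int
    windows: Tuple[str, ...]


class StackedValues:
    """Hypothetical container: many values sharing one timestamp/windows pair."""

    def __init__(self, timestamp: int, windows: Tuple[str, ...]) -> None:
        self.timestamp = timestamp
        self.windows = windows
        self.values: List[Any] = []

    def __iter__(self) -> Iterator[WindowedValue]:
        # Rebuild full WindowedValue objects lazily, only when consumed.
        for value in self.values:
            yield WindowedValue(value, self.timestamp, self.windows)


class Bundle:
    """Hypothetical bundle that stacks consecutive elements with equal metadata."""

    def __init__(self) -> None:
        self._stacks: List[StackedValues] = []

    def add(self, wv: WindowedValue) -> None:
        last = self._stacks[-1] if self._stacks else None
        if last is None or (last.timestamp, last.windows) != (wv.timestamp, wv.windows):
            # Metadata changed: start a new stack.
            last = StackedValues(wv.timestamp, wv.windows)
            self._stacks.append(last)
        # Only the bare value is stored; timestamp/windows live on the stack.
        last.values.append(wv.value)

    def stack_count(self) -> int:
        return len(self._stacks)

    def __iter__(self) -> Iterator[WindowedValue]:
        for stack in self._stacks:
            yield from stack
```

In this sketch, adding three elements with the same timestamp and windows creates a single stack, and iterating the bundle still yields one `WindowedValue` per element in insertion order.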

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yk5/beam stacked_bundle

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2017.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2017


commit f23f1b3af251c48789ff1afb9118b817b7d6fff4
Author: Younghee Kwon 
Date:   2017-02-16T01:23:34Z

python-sdk: add stacked WindowedValues in DirectRunner.Bundle.

It saves memory in the typical case where timestamp/window info is shared.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #1933: [BEAM-1410] Improve DirectRunner performance by tun...

2017-02-06 Thread yk5
GitHub user yk5 opened a pull request:

https://github.com/apache/beam/pull/1933

[BEAM-1410] Improve DirectRunner performance by tuning BoundedReadEvaluator.

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yk5/beam performance

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1933.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1933


commit d34145adf81dd50aa00d8e968fb2843bb624b23c
Author: Younghee Kwon 
Date:   2017-02-07T05:49:49Z

Improve DirectRunner performance by tuning BoundedReadEvaluator.






[GitHub] beam pull request #1928: [BEAM-588] Add MemoryReporter to python-sdk

2017-02-06 Thread yk5
GitHub user yk5 opened a pull request:

https://github.com/apache/beam/pull/1928

[BEAM-588] Add MemoryReporter to python-sdk

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yk5/beam master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1928.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1928


commit 1c9a36599f1ae9b86205c59ba3754dba139921d2
Author: Younghee Kwon 
Date:   2017-02-06T20:35:50Z

Add a MemoryReporter to sdks/python/utils/profiler that tracks heap
profiles.

commit 9525392a39234af4efd808c3cbe17e930d65bf94
Author: Younghee Kwon 
Date:   2017-02-06T21:55:36Z

added comment about guppy
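
As a rough illustration of what a heap-tracking reporter can look like: the sketch below is not the MemoryReporter from this PR and does not use guppy. It swaps in the standard-library `tracemalloc` module instead, and the class name `HeapReporter` and its `report()` method are hypothetical.

```python
import tracemalloc
from typing import Dict


class HeapReporter:
    """Hypothetical helper that reports traced heap usage via tracemalloc."""

    def __enter__(self) -> "HeapReporter":
        # Remember whether tracing was already on so we do not stop
        # a trace that someone else started.
        self._was_tracing = tracemalloc.is_tracing()
        if not self._was_tracing:
            tracemalloc.start()
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        if not self._was_tracing:
            tracemalloc.stop()

    def report(self) -> Dict[str, int]:
        # get_traced_memory() returns (current, peak) sizes in bytes.
        current, peak = tracemalloc.get_traced_memory()
        return {"current_bytes": current, "peak_bytes": peak}
```

Calling `report()` inside the `with` block returns the bytes currently traced and the peak seen since tracing started.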






[GitHub] beam pull request #1749: [BEAM-1233] Create TFRecordIO, providing source/sin...

2017-01-09 Thread yk5
GitHub user yk5 closed the pull request at:

https://github.com/apache/beam/pull/1749




[GitHub] beam pull request #1749: [BEAM-1233] Create TFRecordIO, providing source/sin...

2017-01-06 Thread yk5
GitHub user yk5 opened a pull request:

https://github.com/apache/beam/pull/1749

[BEAM-1233] Create TFRecordIO, providing source/sink for TFRecords,
which is the dedicated record format for TensorFlow.

For more about TFRecords, refer to 
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/api_docs/python/python_io.md

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yk5/beam tfrecord

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1749.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1749


commit 3bbd2c1c208860c48c7a4c1909e3936a1fab4faa
Author: Younghee Kwon 
Date:   2017-01-07T02:05:56Z

Create TFRecordIO, which provides source/sink for TFRecords, the dedicated
record format for TensorFlow.

For more about TFRecords, refer to 
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/api_docs/python/python_io.md






[GitHub] beam pull request #1736: [Beam-1245] Use @unittest.skip to skip avroio_test ...

2017-01-05 Thread yk5
GitHub user yk5 closed the pull request at:

https://github.com/apache/beam/pull/1736




[GitHub] beam pull request #1737: [BEAM-1246] Update README.md to get rid of 'incubat...

2017-01-05 Thread yk5
GitHub user yk5 closed the pull request at:

https://github.com/apache/beam/pull/1737




[GitHub] beam pull request #1737: Update README.md to get rid of 'incubating' notion.

2017-01-04 Thread yk5
GitHub user yk5 opened a pull request:

https://github.com/apache/beam/pull/1737

Update README.md to get rid of 'incubating' notion.

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yk5/beam comments

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1737.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1737


commit 0dfb26be95da6cd1e2932d7d1b13cfa70f4a644e
Author: Younghee Kwon 
Date:   2017-01-05T04:53:45Z

Update README.md to get rid of 'incubating' notion.






[GitHub] beam pull request #1736: [Beam-1245] Use @unittest.skip to skip avroio_test ...

2017-01-04 Thread yk5
GitHub user yk5 opened a pull request:

https://github.com/apache/beam/pull/1736

[Beam-1245] Use @unittest.skip to skip avroio_test cases
when python-snappy is not installed.
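
The general pattern can be sketched as follows. This is an illustration, not the actual avroio_test code: the test class and method names are hypothetical, and it uses the `unittest.skipIf` variant of the skip decorator together with a guarded import.

```python
import unittest

try:
    import snappy  # provided by the python-snappy package
except ImportError:
    snappy = None


class CodecTest(unittest.TestCase):
    """Hypothetical test case showing an optional-dependency skip."""

    def test_null_codec(self):
        # Runs regardless of whether snappy is available.
        self.assertEqual(b"data", bytes(b"data"))

    @unittest.skipIf(snappy is None, "snappy is not installed")
    def test_snappy_codec(self):
        # Only runs when the import above succeeded.
        payload = b"beam" * 100
        self.assertEqual(payload, snappy.decompress(snappy.compress(payload)))


def run_suite() -> unittest.TestResult:
    """Run the test case and return the result object for inspection."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CodecTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

A skipped test is still counted as run, but it is reported as skipped rather than as a failure.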

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yk5/beam tfrecord

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1736.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1736


commit 2dca45833f06329c30c303f257faaadf5c438211
Author: Younghee Kwon 
Date:   2017-01-05T02:39:29Z

Use @unittest.skip to skip avroio_test cases when snappy cannot be imported.

Without snappy installed, the test log looks like:
WARNING:root:snappy is not installed; some tests will be skipped.
...
Ran 21 tests in 13.840s

OK (skipped=3)
---

With snappy installed:
...
Ran 21 tests in 14.464s

OK






[GitHub] beam pull request #1722: [BEAM-1232] fixed comments to refer ptransform.expa...

2017-01-03 Thread yk5
GitHub user yk5 closed the pull request at:

https://github.com/apache/beam/pull/1722




[GitHub] beam pull request #1691: [Beam-1232] Fixed pipeline.py comments to be confor...

2017-01-03 Thread yk5
GitHub user yk5 closed the pull request at:

https://github.com/apache/beam/pull/1691




[GitHub] beam pull request #1722: [BEAM-1232] fixed comments to refer ptransform.expa...

2016-12-30 Thread yk5
GitHub user yk5 opened a pull request:

https://github.com/apache/beam/pull/1722

[BEAM-1232] Fixed comments to refer to ptransform.expand() instead of apply()

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yk5/incubator-beam fixcomments

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1722.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1722


commit 33afbb2e59ae71cad6cbd47dba9ed980db88b113
Author: Younghee Kwon 
Date:   2016-12-30T22:32:58Z

Updated ptransform.apply() to ptransform.expand() in the comments.






[GitHub] incubator-beam pull request #1691: Fixed comments to make example usage to b...

2016-12-22 Thread yk5
GitHub user yk5 opened a pull request:

https://github.com/apache/incubator-beam/pull/1691

Fixed comments to make the example usage conform to python-sdk

Hi @robertwb, I just started working on the apache-beam python sdk
(b/33761836) and found some examples that do not match the syntax, so I am
fixing them. Can you please take a look?

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yk5/incubator-beam python-sdk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1691.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1691


commit c49e2f63e3c28fe7dbb0fd9887a85dd23fa6a128
Author: Younghee Kwon 
Date:   2016-12-22T21:57:49Z

Fixed the example usage in pipeline.py, which did not conform to
python-sdk.



