[ https://issues.apache.org/jira/browse/BEAM-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15778940#comment-15778940 ]
ASF GitHub Bot commented on BEAM-625:
-------------------------------------
GitHub user katsiapis opened a pull request:
https://github.com/apache/incubator-beam/pull/1694
[BEAM-625] A few memory optimizations in Avro.
- Using a buffer when decompressing with Snappy in Avro in order to avoid
unnecessary copies during slicing of large strings.
- Explicitly gc-ing (possibly large) Avro Block with its associated data.
- Removing somewhat confusing cStringIO aliasing.
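
For readers unfamiliar with the first item, here is a minimal sketch of the idea, not the code from this pull request. It assumes the Avro Snappy block layout (compressed bytes followed by a 4-byte big-endian CRC32 of the uncompressed data). To stay runnable with only the standard library it swaps in zlib for Snappy and uses Python 3's memoryview where the PR's Python 2 code would use buffer. The names make_block and decode_block are hypothetical.

```python
import struct
import zlib

CRC_SIZE = 4  # assumed trailer size: 4-byte big-endian CRC32


def make_block(payload: bytes) -> bytes:
    """Build a block shaped like the assumed layout: compressed data + CRC32 trailer.

    zlib stands in for Snappy here so the sketch has no third-party dependency.
    """
    checksum = struct.pack(">I", zlib.crc32(payload) & 0xFFFFFFFF)
    return zlib.compress(payload) + checksum


def decode_block(block: bytes) -> bytes:
    """Decompress a block without copying its (possibly large) body.

    Slicing bytes, as in block[:-4], allocates a second copy of nearly the
    whole block. Slicing a memoryview only creates a new view on the same memory.
    """
    view = memoryview(block)
    compressed = view[:-CRC_SIZE]  # a view, not a copy
    (expected_crc,) = struct.unpack(">I", view[-CRC_SIZE:])
    payload = zlib.decompress(compressed)  # accepts any bytes-like object
    if zlib.crc32(payload) & 0xFFFFFFFF != expected_crc:
        raise ValueError("checksum mismatch")
    return payload
```

With a block of several hundred megabytes, the sliced copy would roughly double the memory held during decompression, which is the cost the first item aims to avoid.
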
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/katsiapis/incubator-beam memory
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-beam/pull/1694.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1694
----
commit 069ceebcb167a6c3976b362548a5d91c003a3894
Author: Gus Katsiapis <[email protected]>
Date: 2016-12-26T20:13:06Z
Using a buffer when decompressing with Snappy in Avro in order to avoid
unnecessary copies during slicing of large strings.
commit 4f5e9b51740a97df692138fa477a9f6c66833ab1
Author: Gus Katsiapis <[email protected]>
Date: 2016-12-26T20:17:57Z
Explicitly gc-ing (possibly large) Avro Block with its associated data.
commit e91662bef5773bc23ca2619373496c632693e4d5
Author: Gus Katsiapis <[email protected]>
Date: 2016-12-26T20:26:03Z
Removing somewhat confusing cStringIO aliasing.
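
As an illustration of the second commit, the sketch below shows one plausible reading of "explicitly gc-ing" a block: dropping the last reference to the current block before the next one is loaded, so two large blocks are never alive at once. This is an assumption about the intent, not the code from the commit. Block, read_records and the loader callables are hypothetical names, and the prompt release relies on CPython reference counting.

```python
class Block:
    """Hypothetical stand-in for an Avro block holding decoded records."""

    def __init__(self, records):
        self.records = records


def read_records(block_loaders):
    """Yield records block by block, releasing each block before loading the next.

    block_loaders is an iterable of zero-argument callables, each returning a Block.
    """
    for load_block in block_loaders:
        block = load_block()
        for record in block.records:
            yield record
        # Without this del, the old block would stay bound to `block` while the
        # next load_block() call runs, so both blocks would be in memory together.
        del block
```
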
----
> Make Dataflow Python Materialized PCollection representation more efficient
> ---------------------------------------------------------------------------
>
> Key: BEAM-625
> URL: https://issues.apache.org/jira/browse/BEAM-625
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py
> Reporter: Konstantinos Katsiapis
> Assignee: Frances Perry
> Fix For: 0.3.0-incubating
>
>
> This will be a multi-step process that involves adding better support
> for compression as well as Avro.