[
https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=359607&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-359607
]
ASF GitHub Bot logged work on BEAM-8564:
----------------------------------------
Author: ASF GitHub Bot
Created on: 13/Dec/19 19:30
Start Date: 13/Dec/19 19:30
Worklog Time Spent: 10m
Work Description: amoght commented on issue #10254: [BEAM-8564] Add LZO
compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-565577445
@gsteelman we used the airlift/aircompressor library only for its compression
and decompression mechanism. The Input/Output stream implementation there
introduces the transitive dependency, which can be removed and replaced with
the Apache Hadoop Common library. This significantly reduces the size as well.
So, here are the two possible options:
1) We use only the compression and decompression mechanism from
airlift/aircompressor and design the Input/Output streams for Beam accordingly.
These streams will need to be updated if those classes change on
airlift/aircompressor's end. But since we will only be using the compression
and decompression mechanism from airlift/aircompressor, such updates should be
small and quite rare, so this should not be a big issue.
2) We introduce LZO as an optional package for Beam. This gives users the
option to manage their Beam size (if it is a constraint) or to leave LZO out
if it is not required.
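For illustration, option 1 could look roughly like the sketch below. It is only
a sketch under stated assumptions: BlockCodec, DeflateCodec, BlockOutputStream
and BlockInputStream are hypothetical names, the length-prefixed block framing
is made up for the example and is not the real LZO file format, and
java.util.zip's Deflater/Inflater stand in for the airlift/aircompressor
compressor so that the example runs with the JDK alone.

```java
import java.io.*;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class BlockStreamSketch {
  /** Hypothetical seam: the only part that would be borrowed from a codec library. */
  interface BlockCodec {
    byte[] compress(byte[] input, int length);
    byte[] decompress(byte[] input, int uncompressedLength) throws IOException;
  }

  /** Stand-in codec (DEFLATE, not LZO) so the sketch has no external dependency. */
  static final class DeflateCodec implements BlockCodec {
    public byte[] compress(byte[] input, int length) {
      Deflater d = new Deflater();
      d.setInput(input, 0, length);
      d.finish();
      ByteArrayOutputStream bos = new ByteArrayOutputStream();
      byte[] tmp = new byte[256];
      while (!d.finished()) bos.write(tmp, 0, d.deflate(tmp));
      d.end();
      return bos.toByteArray();
    }
    public byte[] decompress(byte[] input, int uncompressedLength) throws IOException {
      Inflater inf = new Inflater();
      inf.setInput(input);
      byte[] out = new byte[uncompressedLength];
      try {
        int off = 0;
        while (off < out.length) {
          int n = inf.inflate(out, off, out.length - off);
          if (n == 0) throw new IOException("corrupt or truncated block");
          off += n;
        }
      } catch (DataFormatException e) {
        throw new IOException(e);
      } finally {
        inf.end();
      }
      return out;
    }
  }

  /** Buffers bytes and writes each full block as [rawLength][compressedLength][payload]. */
  static final class BlockOutputStream extends OutputStream {
    private final DataOutputStream out;
    private final BlockCodec codec;
    private final byte[] buffer;
    private int count;
    BlockOutputStream(OutputStream out, BlockCodec codec, int blockSize) {
      this.out = new DataOutputStream(out);
      this.codec = codec;
      this.buffer = new byte[blockSize];
    }
    @Override public void write(int b) throws IOException {
      buffer[count++] = (byte) b;
      if (count == buffer.length) flushBlock();
    }
    private void flushBlock() throws IOException {
      if (count == 0) return;
      byte[] compressed = codec.compress(buffer, count);
      out.writeInt(count);
      out.writeInt(compressed.length);
      out.write(compressed);
      count = 0;
    }
    @Override public void close() throws IOException {
      flushBlock();
      out.close();
    }
  }

  /** Reads the framing written above, one block at a time. */
  static final class BlockInputStream extends InputStream {
    private final DataInputStream in;
    private final BlockCodec codec;
    private byte[] block = new byte[0];
    private int pos;
    BlockInputStream(InputStream in, BlockCodec codec) {
      this.in = new DataInputStream(in);
      this.codec = codec;
    }
    @Override public int read() throws IOException {
      while (pos == block.length) {
        int rawLength;
        try {
          rawLength = in.readInt();
        } catch (EOFException e) {
          return -1; // simplification: any EOF at a block header ends the stream
        }
        byte[] compressed = new byte[in.readInt()];
        in.readFully(compressed);
        block = codec.decompress(compressed, rawLength);
        pos = 0;
      }
      return block[pos++] & 0xff;
    }
  }

  static byte[] compress(byte[] data, int blockSize) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (OutputStream os = new BlockOutputStream(sink, new DeflateCodec(), blockSize)) {
      os.write(data);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sink.toByteArray();
  }

  static byte[] decompress(byte[] framed) {
    ByteArrayOutputStream result = new ByteArrayOutputStream();
    try (InputStream is = new BlockInputStream(new ByteArrayInputStream(framed), new DeflateCodec())) {
      for (int b = is.read(); b != -1; b = is.read()) result.write(b);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return result.toByteArray();
  }
}
```

The point of the BlockCodec seam is that only that one interface would touch
the third-party library, so a change on the library side should stay confined
to a single adapter class while the stream classes remain Beam's own.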
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 359607)
Time Spent: 4.5h (was: 4h 20m)
> Add LZO compression and decompression support
> ---------------------------------------------
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
> Issue Type: New Feature
> Components: sdk-java-core
> Reporter: Amogh Tiwari
> Assignee: Amogh Tiwari
> Priority: Minor
> Time Spent: 4.5h
> Remaining Estimate: 0h
>
> LZO is a lossless data compression algorithm that focuses on compression
> and decompression speed.
> This will enable the Apache Beam SDK to compress/decompress files using the
> LZO compression algorithm.
> This will include the following functionalities:
> # compress() : for compressing files into an LZO archive
> # decompress() : for decompressing files archived using LZO compression
> Appropriate Input and Output streams will also be added to enable working
> with LZO files.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)