[ https://issues.apache.org/jira/browse/BEAM-8564?focusedWorklogId=359023&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-359023 ]
ASF GitHub Bot logged work on BEAM-8564:
----------------------------------------
Author: ASF GitHub Bot
Created on: 12/Dec/19 23:48
Start Date: 12/Dec/19 23:48
Worklog Time Spent: 10m
Work Description: gsteelman commented on issue #10254: [BEAM-8564] Add
LZO compression and decompression support
URL: https://github.com/apache/beam/pull/10254#issuecomment-565238477
> While studying the code, we found that the airlift/aircompressor library
> only requires some classes that are also present in the Apache Hadoop common
> package (~3.9 MB). Therefore, we are now thinking of making changes in the
> airlift/aircompressor package: replacing com.facebook.presto.hadoop with
> org.apache.hadoop.common and removing the other compression mechanisms
> present in the airlift/aircompressor package (like zstd, gzip, etc.) while
> keeping only the required LZO package.
> But if we go ahead with this approach, we will have to manually update
> this library whenever any changes are made to the airlift/aircompressor LZO
> package.
> @lukecwik @gsteelman please provide your thoughts on this.
Is it possible to instead add the dependencies on the `apache.hadoop.common`
package directly in these changes, and not add a dependency on
airlift/aircompressor in this change? I would prefer to stick with strict
dependencies when possible, rather than relying on transitive dependencies to
bring in the classes we need.
Relying on the transitive dependencies brought in by airlift/aircompressor
has its own set of issues, including having to update our libraries whenever
changes are made to airlift.
Issue Time Tracking
-------------------
Worklog Id: (was: 359023)
Time Spent: 4h 20m (was: 4h 10m)
> Add LZO compression and decompression support
> ---------------------------------------------
>
> Key: BEAM-8564
> URL: https://issues.apache.org/jira/browse/BEAM-8564
> Project: Beam
> Issue Type: New Feature
> Components: sdk-java-core
> Reporter: Amogh Tiwari
> Assignee: Amogh Tiwari
> Priority: Minor
> Time Spent: 4h 20m
> Remaining Estimate: 0h
>
> LZO is a lossless data compression algorithm that focuses on compression
> and decompression speed.
> This will enable the Apache Beam SDK to compress/decompress files using the
> LZO compression algorithm.
> This will include the following functionality:
> # compress() : for compressing files into an LZO archive
> # decompress() : for decompressing files archived using LZO compression
> Appropriate input and output streams will also be added to enable working
> with LZO files.
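
The stream-based compress()/decompress() pair described in the issue could look roughly like the sketch below. This is only an illustration of the shape of such an API, not the code from the pull request: the class name `StreamCodecSketch` and its helper methods are hypothetical, and because the JDK ships no LZO codec, `java.util.zip` DEFLATE streams stand in for the LZO input and output streams that the change would add.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical sketch. DEFLATE is an assumed stand-in for LZO here. */
public class StreamCodecSketch {

  /** Wraps a raw sink in a compressing stream (where an LZO output stream would go). */
  public static OutputStream compressingStream(OutputStream sink) {
    return new DeflaterOutputStream(sink);
  }

  /** Wraps a compressed source in a decompressing stream (where an LZO input stream would go). */
  public static InputStream decompressingStream(InputStream source) {
    return new InflaterInputStream(source);
  }

  /** Compresses a byte array by writing it through the compressing stream. */
  public static byte[] compress(byte[] data) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    // Closing the wrapper finishes the compressed stream before we read the sink.
    try (OutputStream out = compressingStream(sink)) {
      out.write(data);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sink.toByteArray();
  }

  /** Decompresses a byte array by draining the decompressing stream. */
  public static byte[] decompress(byte[] data) {
    ByteArrayOutputStream restored = new ByteArrayOutputStream();
    try (InputStream in = decompressingStream(new ByteArrayInputStream(data))) {
      byte[] buffer = new byte[4096];
      int read;
      while ((read = in.read(buffer)) != -1) {
        restored.write(buffer, 0, read);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return restored.toByteArray();
  }

  public static void main(String[] args) {
    byte[] original = "LZO favors speed over ratio".getBytes(StandardCharsets.UTF_8);
    byte[] roundTripped = decompress(compress(original));
    System.out.println("round trip ok: " + Arrays.equals(original, roundTripped));
  }
}
```

Keeping the codec behind the two stream-wrapping methods means that swapping the stand-in for a real LZO implementation would only touch those two methods, whichever library ends up providing it.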