[ 
https://issues.apache.org/jira/browse/BEAM-11658?focusedWorklogId=544191&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-544191
 ]

ASF GitHub Bot logged work on BEAM-11658:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 29/Jan/21 10:27
            Start Date: 29/Jan/21 10:27
    Worklog Time Spent: 10m 
      Work Description: ADBalici commented on a change in pull request #13821:
URL: https://github.com/apache/beam/pull/13821#discussion_r566722653



##########
File path: sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
##########
@@ -907,30 +1052,33 @@ public void close() throws IOException {
   }
 
   /** Writes a single output file. */
-  private void writeFile(File file, byte[] input, CompressionMode mode) throws IOException {
-    try (OutputStream os = getOutputStreamForMode(mode, new FileOutputStream(file))) {
-      os.write(input);
+  private void writeFile(File file, byte[] input, Compression compression) throws IOException {
+    if (compression == Compression.SNAPPY) {
+      try (OutputStream os =
+          getOutputStreamForModeWithDecompressedSizeInfo(

Review comment:
       Of course. Thought of this, but I didn't want to mess with the old 
method signature. Glad you've suggested this. Sorted!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 544191)
    Time Spent: 1h 20m  (was: 1h 10m)

> Add Snappy compression and decompression support
> ------------------------------------------------
>
>                 Key: BEAM-11658
>                 URL: https://issues.apache.org/jira/browse/BEAM-11658
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-java-core
>            Reporter: Andrei Balici
>            Assignee: Andrei Balici
>            Priority: P2
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Snappy is a compression/decompression library. It does not aim for maximum 
> compression, or compatibility with any other compression library; instead, it 
> aims for very high speeds and reasonable compression. For instance, compared 
> to the fastest mode of zlib, Snappy is an order of magnitude faster for most 
> inputs, but the resulting compressed files are anywhere from 20% to 100% 
> bigger.
>  
> Many data pipelines take .snappy-compressed files as input, and these 
> currently have to be read by creating custom DoFn(s).
>  
> It would be nice to see Beam support this out of the box, as it does 
> currently for LZO. Snappy usually is faster than algorithms in the same class 
> (e.g. LZO, LZF, QuickLZ, etc.) while achieving comparable compression ratios, 
> so I see no reason to leave this out.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
