Hi,

If you really want to add compression on the data path, I would encourage you to choose something as lightweight as possible. 10 GBit Ethernet is pretty much becoming a commodity in the server space these days, and it is not easy to saturate such a link even without compression.

Snappy is not a bad choice, but the fastest algorithm I’ve seen so far is LZ4:

https://code.google.com/p/lz4/
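
For illustration, here is a minimal sketch of what compressing a buffer as a
blob with LZ4 could look like on the JVM, assuming the lz4-java bindings
(net.jpountz.lz4); the class name and buffer contents are made up for this
example, it is not Flink code:

import java.util.Arrays;

import net.jpountz.lz4.LZ4Compressor;
import net.jpountz.lz4.LZ4Factory;
import net.jpountz.lz4.LZ4FastDecompressor;

public class Lz4BlobExample {
    public static void main(String[] args) {
        // Pretend this is a serialized network buffer.
        byte[] buffer = new byte[32 * 1024];
        Arrays.fill(buffer, (byte) 42);

        LZ4Factory factory = LZ4Factory.fastestInstance();
        LZ4Compressor compressor = factory.fastCompressor();

        // Compress the whole buffer as one blob.
        int maxLen = compressor.maxCompressedLength(buffer.length);
        byte[] compressed = new byte[maxLen];
        int compressedLen = compressor.compress(buffer, 0, buffer.length, compressed, 0, maxLen);

        // The fast decompressor needs the original (uncompressed) length.
        LZ4FastDecompressor decompressor = factory.fastDecompressor();
        byte[] restored = new byte[buffer.length];
        decompressor.decompress(compressed, 0, restored, 0, buffer.length);

        System.out.println(buffer.length + " bytes -> " + compressedLen + " bytes");
    }
}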

Best regards,

    Daniel


On 25.11.2014 20:53, Stephan Ewen wrote:
I would start with a simple compression of network buffers as a blob.

At some point, Flink's internal data layout may become columnar, which
should also help the blob-style compression, because more similar strings
will end up within one compression window...
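
To make the blob approach concrete, a minimal sketch, assuming the
snappy-java bindings (org.xerial.snappy); the buffer contents are
placeholder bytes, not real Flink network buffers:

import java.io.IOException;
import java.util.Arrays;

import org.xerial.snappy.Snappy;

public class SnappyBlobExample {
    public static void main(String[] args) throws IOException {
        // Stand-in for a serialized network buffer.
        byte[] networkBuffer = new byte[64 * 1024];
        Arrays.fill(networkBuffer, (byte) 7);

        // Compress and decompress the buffer as one opaque blob.
        byte[] compressed = Snappy.compress(networkBuffer);
        byte[] restored = Snappy.uncompress(compressed);

        System.out.println(networkBuffer.length + " -> " + compressed.length
                + " bytes, round trip ok: " + Arrays.equals(networkBuffer, restored));
    }
}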

On Tue, Nov 25, 2014 at 11:26 AM, Viktor Rosenfeld <viktor.rosenf...@tu-berlin.de> wrote:

Hi,

A codec like Snappy would work on an entire network buffer as one big blob,
right? I was thinking more along the lines of compressing individual tuple
fields by treating them as columns, e.g., using frame-of-reference encoding
and bit packing. Compression on tuple fields should yield much better
results than compressing the entire blob. Given that Flink controls the
serialization process, this should be transparent to other layers in the
code. Not sure it is worth the effort, though.
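
As a rough illustration of what frame-of-reference encoding plus bit packing
on a single integer field could look like (a sketch only; the class name and
values are invented, and none of this is Flink's serialization code):

public class FrameOfReferenceExample {
    public static void main(String[] args) {
        // One integer tuple field, viewed as a column.
        int[] column = {100023, 100025, 100031, 100024, 100040};

        // Frame of reference: store the minimum once, keep only offsets from it.
        int min = Integer.MAX_VALUE, max = Integer.MIN_VALUE;
        for (int v : column) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        int range = max - min;

        // Bit packing: the largest offset determines the bits per value.
        int bitsPerValue = Math.max(1, 32 - Integer.numberOfLeadingZeros(range));

        // Pack the offsets back to back into a long[].
        long totalBits = (long) bitsPerValue * column.length;
        long[] packed = new long[(int) ((totalBits + 63) / 64)];
        long bitPos = 0;
        for (int v : column) {
            long offset = v - min;
            int word = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            packed[word] |= offset << shift;
            if (shift + bitsPerValue > 64) {
                // The value spills over into the next 64-bit word.
                packed[word + 1] |= offset >>> (64 - shift);
            }
            bitPos += bitsPerValue;
        }

        System.out.println("reference=" + min + ", " + bitsPerValue + " bits per value, "
                + totalBits + " bits packed instead of " + (32L * column.length));
    }
}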

Cheers,
Viktor




