We've had success with LZ4 compression in a custom ShardHandler to reduce network overhead, getting ~25% compression with low CPU impact. LZ4 or Snappy seem like reasonable choices[1] for minimizing the combined compression + transfer + decompression time in the data center.
Would it make sense to integrate compression into javabin itself? It seems to make sense for the javabin usage in both the ShardHandler and the transaction log. We could flip on gzip in Jetty for HTTP, but GZIP may add more CPU than is desirable and wouldn't help with the transaction log.

If we did, it seems that incrementing the javabin version[2] and compressing/decompressing inside of JavaBinCodec#marshal[3] and JavaBinCodec#unmarshal[4] would allow us to retain backwards compatibility with older clients and existing files.

Thoughts?

--Gregg

[1] http://cyan4973.github.io/lz4/#tab-2
[2] https://github.com/apache/lucene-solr/blob/trunk/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L83
[3] https://github.com/apache/lucene-solr/blob/trunk/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L112:L120
[4] https://github.com/apache/lucene-solr/blob/trunk/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L129:L137
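
To make the version-byte idea concrete, here is a rough sketch of the dispatch I have in mind. It is not JavaBinCodec code: the class name VersionedFrame, both version constants, and the length-prefixed layout are made up for illustration, and it works on byte arrays rather than the streams that marshal/unmarshal actually use. It also uses java.util.zip's Deflater/Inflater purely as a stand-in so the example runs on a plain JDK; the real thing would presumably call an LZ4 or Snappy library instead.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, only meant to illustrate version-based dispatch.
final class VersionedFrame {
    // Illustrative values, not javabin's real version numbers.
    public static final byte VERSION_PLAIN = 2;
    public static final byte VERSION_COMPRESSED = 3;

    // Writes [version][payload] or [version][original length][compressed payload].
    public static byte[] marshal(byte[] payload, boolean compress) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!compress) {
            out.write(VERSION_PLAIN);
            out.write(payload, 0, payload.length);
            return out.toByteArray();
        }
        out.write(VERSION_COMPRESSED);
        // 4-byte big-endian length so the reader can size its output buffer.
        out.write(ByteBuffer.allocate(4).putInt(payload.length).array(), 0, 4);
        Deflater deflater = new Deflater(Deflater.BEST_SPEED); // stand-in for LZ4
        try {
            deflater.setInput(payload);
            deflater.finish();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
        } finally {
            deflater.end();
        }
        return out.toByteArray();
    }

    // Reads the version byte first, so old (plain) data still decodes unchanged.
    public static byte[] unmarshal(byte[] framed) {
        if (framed.length == 0) {
            throw new IllegalArgumentException("empty frame");
        }
        byte version = framed[0];
        if (version == VERSION_PLAIN) {
            return Arrays.copyOfRange(framed, 1, framed.length);
        }
        if (version != VERSION_COMPRESSED) {
            throw new IllegalArgumentException("unknown version: " + version);
        }
        if (framed.length < 5) {
            throw new IllegalArgumentException("truncated header");
        }
        int originalLength = ByteBuffer.wrap(framed, 1, 4).getInt();
        if (originalLength < 0) {
            throw new IllegalArgumentException("negative length");
        }
        byte[] result = new byte[originalLength];
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(framed, 5, framed.length - 5);
            int off = 0;
            while (off < originalLength) {
                int n = inflater.inflate(result, off, originalLength - off);
                if (n == 0) { // no progress: input ran out before the declared length
                    throw new IllegalArgumentException("truncated payload");
                }
                off += n;
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt payload", e);
        } finally {
            inflater.end();
        }
        return result;
    }
}
```

The point is just that a reader which checks the version byte before anything else can keep decoding existing uncompressed data, while new writers opt in to compression. An older client that only knows the plain version would reject the new one, so the writer side would still need some way to know what the peer supports.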