nsivabalan commented on issue #1253: [HUDI-558] Introduce ability to compress bloom filters while storing in parquet
URL: https://github.com/apache/incubator-hudi/pull/1253#issuecomment-592868794

I also played with testing the sizes. Looks like the encoding is the culprit.

test random keys
Data before compress: 4792548
Data after compress Stage 1 3630215
Data after compress Stage 2 4967662

    // added these log statements.
    byte[] compressed = bos.toByteArray();
    System.out.println("Data after compress Stage 1 " + compressed.length);
    Base64.Encoder encoder = Base64.getMimeEncoder();
    String toReturn = new String(encoder.encode(compressed), StandardCharsets.UTF_8);
    System.out.println("Data after compress Stage 2 " + toReturn.length());
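
One way to sanity-check that the growth comes from the encoding alone is to compute the expected output size of `Base64.getMimeEncoder()` from the Stage 1 byte count. The sketch below is illustrative and not part of the PR; the class and method names (`Base64SizeCheck`, `basicEncodedLength`, `mimeEncodedLength`) are hypothetical. It assumes the JDK MIME encoder's default layout: 76-character lines separated by CRLF, with no separator after the last line.

```java
import java.util.Base64;

// Hypothetical helper, not part of Hudi: predicts Base64 output sizes.
class Base64SizeCheck {

    // Padded Base64 length for n input bytes: 4 output chars per 3-byte group.
    static long basicEncodedLength(long n) {
        return 4 * ((n + 2) / 3);
    }

    // Assumed MIME layout: 76-char lines joined by CRLF, no trailing CRLF.
    static long mimeEncodedLength(long n) {
        long chars = basicEncodedLength(n);
        long lines = (chars + 75) / 76;
        long separators = lines > 0 ? lines - 1 : 0;
        return chars + 2 * separators;
    }

    public static void main(String[] args) {
        int stage1 = 3630215; // "Stage 1" size reported in the comment

        // The byte values do not affect the encoded length, so zeros suffice.
        byte[] data = new byte[stage1];
        int mimeActual = Base64.getMimeEncoder().encode(data).length;
        int basicActual = Base64.getEncoder().encode(data).length;

        System.out.println("Predicted basic length: " + basicEncodedLength(stage1));
        System.out.println("Actual basic length:    " + basicActual);
        System.out.println("Predicted MIME length:  " + mimeEncodedLength(stage1));
        System.out.println("Actual MIME length:     " + mimeActual);
    }
}
```

For the 3630215 compressed bytes above, this gives 4840288 Base64 characters plus 127374 line-separator characters, which matches the reported Stage 2 size of 4967662. Under that assumption, about 33% of the growth is inherent to Base64 and a further 2.6% or so comes from the MIME line breaks; switching to `Base64.getEncoder()` would remove only the latter.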
