Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3766: Applying LZ4 compression on buffers before spilling
......................................................................

Patch Set 2:

Didn't realise the ball was in my court on this one.

It looks like this is going to be beneficial in some circumstances but not others. We may need to think through if/when to enable this.

It would be interesting if we could benchmark this in a way that actually reads from and writes to a spinning disk. That's when we'd expect to see a benefit.

I think there are two things going on with the benchmark:
* < 1 GB of spilled data is almost certainly going to fit in the OS buffer cache, so we're not going to see much disk I/O on the critical path.
* Disks and SSDs have different characteristics and trade-offs for spilling.

To elaborate on the second point: on spinning disks, I/O is slow, so we will probably see some perf benefit. On SSDs, I suspect compression will be slower than the I/O. However, space on SSDs is scarce, so it's probably worth compressing the data just to save space even if we take a performance hit.

--
To view, visit http://gerrit.cloudera.org:8080/3478
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4d49bd8d6d7643c84cefd1274c18b52907ca1488
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: anujphadke <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: anujphadke <[email protected]>
Gerrit-HasComments: No
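For anyone following the spinning-disk versus SSD argument in the comment, here is a rough back-of-the-envelope model of it. This is only a sketch: the function names are hypothetical, the throughput figures and the 2x compression ratio are made-up assumptions rather than measurements from this patch, and the model assumes compression and the write happen back to back (no overlap) with no help from the OS buffer cache.

```python
MB = 1 << 20  # bytes per mebibyte


def spill_write_seconds(data_bytes, io_bytes_per_sec,
                        compress_bytes_per_sec=None, ratio=1.0):
    """Estimate the time to spill data_bytes, optionally compressing first.

    ratio is uncompressed size divided by compressed size.
    Assumes compression and I/O are serialized, not overlapped.
    """
    if compress_bytes_per_sec is None:
        # No compression: the cost is just the raw write.
        return data_bytes / io_bytes_per_sec
    compress_time = data_bytes / compress_bytes_per_sec
    write_time = (data_bytes / ratio) / io_bytes_per_sec
    return compress_time + write_time


def compression_helps(io_bytes_per_sec, compress_bytes_per_sec, ratio):
    """True if compressing before the write is faster than writing raw."""
    data_bytes = 1 << 30  # 1 GiB; the comparison does not depend on size
    compressed = spill_write_seconds(
        data_bytes, io_bytes_per_sec, compress_bytes_per_sec, ratio)
    raw = spill_write_seconds(data_bytes, io_bytes_per_sec)
    return compressed < raw


# Hypothetical figures, chosen only to illustrate the two regimes.
HDD_BW = 100 * MB    # assumed spinning-disk write throughput
SSD_BW = 2000 * MB   # assumed SSD write throughput
LZ4_BW = 500 * MB    # assumed single-thread compression throughput
RATIO = 2.0          # assumed compression ratio

if __name__ == "__main__":
    print("HDD: compression helps?", compression_helps(HDD_BW, LZ4_BW, RATIO))
    print("SSD: compression helps?", compression_helps(SSD_BW, LZ4_BW, RATIO))
```

Under this model, compression wins on latency only when the device throughput is below `compress_bw * (1 - 1/ratio)`, which is 250 MB/s for the assumed numbers. That is consistent with expecting a benefit on spinning disks and a slowdown on SSDs, where the remaining argument for compressing is the space saving.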
