Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3766: Applying LZ4 compression on buffers before spilling
......................................................................


Patch Set 2:

Didn't realise the ball was in my court on this one. It looks like this is 
going to be beneficial in some circumstances but not others. We may need to 
think through if/when to enable this.

It would be interesting to benchmark this in a setup where we are actually 
reading from and writing to a spinning disk. That's when we'd expect to see a 
benefit. I think there are two things going on with the benchmark:

* Less than 1 GB of spilled data is almost certainly going to fit in the OS 
buffer cache, so we're not going to see much disk I/O on the critical path.
* Disks and SSDs have different characteristics and trade-offs for spilling.
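
On the buffer cache point, here is a rough sketch of one way a benchmark could 
push the read back onto the device. It is not from the patch. It assumes Linux, 
a scratch directory on a real disk (not tmpfs), and that the kernel honours the 
advisory posix_fadvise(POSIX_FADV_DONTNEED) hint. The function name 
timed_read_after_eviction is made up for illustration.

```python
import os
import tempfile
import time


def timed_read_after_eviction(payload, directory=None):
    """Write payload, ask the kernel to drop it from the page cache,
    then time reading it back. Returns (elapsed_seconds, data_read)."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.flush()
            # Dirty pages are not dropped, so force them to the device first.
            os.fsync(f.fileno())
            # Advisory only: the kernel may keep some pages cached anyway.
            if hasattr(os, "posix_fadvise"):
                os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
        start = time.perf_counter()
        with open(path, "rb") as f:
            data = f.read()
        elapsed = time.perf_counter() - start
        return elapsed, data
    finally:
        os.remove(path)
```

Spilling noticeably more data than the machine has free memory would be a 
blunter way to get the same effect.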

To elaborate on the second point: on spinning disks, I/O is slow, so we will 
probably see some perf benefit. On SSDs, I suspect compression will be slower 
than the I/O. However, space on SSDs is scarce, so it's probably worth 
compressing the data just to save space even if we take a performance hit.
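
To make the disk vs. SSD trade-off concrete, here is a back-of-envelope model. 
It is a sketch under stated assumptions, not a measurement: it only covers the 
write path, it assumes compression and I/O do not overlap, and every throughput 
and ratio below is a placeholder number rather than anything measured on 
Impala.

```python
def spill_seconds(data_bytes, disk_bps, compress_bps=None, ratio=1.0):
    """Estimated time to spill data_bytes, optionally compressing first.

    ratio is compressed size / original size. Compression and I/O are
    assumed to run back to back, which is pessimistic for compression.
    """
    if compress_bps is None:
        return data_bytes / disk_bps
    return data_bytes / compress_bps + (data_bytes * ratio) / disk_bps


def compression_helps(disk_bps, compress_bps, ratio):
    # Compressing wins when 1/c + r/d < 1/d, i.e. c > d / (1 - r).
    return ratio < 1.0 and compress_bps > disk_bps / (1.0 - ratio)


# Placeholder numbers, chosen only to illustrate the two regimes.
MB = 1e6
print(compression_helps(100 * MB, 500 * MB, 0.5))   # slow spinning disk
print(compression_helps(2000 * MB, 500 * MB, 0.5))  # fast SSD
```

With these placeholder inputs the model favours compression on the slow device 
and not on the fast one, which matches the intuition above. The space saving on 
SSDs is a separate consideration that this model does not capture.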

-- 
To view, visit http://gerrit.cloudera.org:8080/3478
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4d49bd8d6d7643c84cefd1274c18b52907ca1488
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: anujphadke <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: anujphadke <[email protected]>
Gerrit-HasComments: No
