Did some (unscientific) tests... The test scenario is 1m (wide) rows (50m cells) and then scanning along all cells.
The difference in runtime is within the noise. When I measure GC stats with jstat I see a ~3% reduction is young collections, and a ~10% reduction in overall GC time. At the same time I set a counting breakpoint on KeyValue.getRow that fires when rowcache is found not-null. I found this triggered about every 16 key values, which suggests the optimization saves a lot of copying of the row key. It is not entirely clear under what circumstances the rowCache would a be win, outweighing the extra static memory by every KV. So it looks like it is not worth making the change, although I suppose anything reducing GC pressure is a win. -- Lars ----- Original Message ----- From: lars hofhansl <[email protected]> To: Stack <[email protected]>; "[email protected]" <[email protected]> Cc: Sent: Thursday, November 24, 2011 11:57 AM Subject: Re: Size of KeyValue Hmm... Might be hard to prove whether removing that would be a net win or net loss in the current code base. I'll do some tests and report back. ________________________________ From: Stack <[email protected]> To: [email protected]; lars hofhansl <[email protected]> Sent: Wednesday, November 23, 2011 8:41 PM Subject: Re: Size of KeyValue On Wed, Nov 23, 2011 at 3:40 PM, lars hofhansl <[email protected]> wrote: > Looking at KeyValue I see three variable purely used for caching: > timestampCache(long), rowCache(byte[]), and keyLength(int). > > From a quick glance over the code I do not see many spots where we repeatedly > get the TS, rowKey, of keyLength from the same KV. > Together these consume 24 bytes (almost 1/2 of KeyValue's constant memory > overhead) on every key value created, and we create > a *lot* KVs (real and "fake" ones) during scanning and seeking. > > Were these added to address specific performance concerns? If not, we might > consider removing these. > IIRC, I added them after watching stuff in a profiler (a long time ago). Things change. Thats a lot of static mem to give up. St.Ack
