If I recall correctly, we put this in more for the benefit of the client side, with the assumption that the server side would never call this API. Then, we ended up writing some bad code somewhere in the server which calls this function.
-Todd On Mon, Nov 28, 2011 at 5:42 PM, lars hofhansl <[email protected]> wrote: > Did some (unscientific) tests... > The test scenario is 1m (wide) rows (50m cells) and then scanning along all > cells. > > The difference in runtime is within the noise. When I measure GC stats with > jstat I see a ~3% reduction is young collections, and a ~10% reduction in > overall GC time. > > At the same time I set a counting breakpoint on KeyValue.getRow that fires > when rowcache is found not-null. I found this triggered about every 16 key > values, which suggests > the optimization saves a lot of copying of the row key. > > It is not entirely clear under what circumstances the rowCache would a be > win, outweighing the extra static memory by every KV. > > So it looks like it is not worth making the change, although I suppose > anything reducing GC pressure is a win. > > -- Lars > > > ----- Original Message ----- > From: lars hofhansl <[email protected]> > To: Stack <[email protected]>; "[email protected]" <[email protected]> > Cc: > Sent: Thursday, November 24, 2011 11:57 AM > Subject: Re: Size of KeyValue > > Hmm... Might be hard to prove whether removing that would be a net win or net > loss in the current code base. > I'll do some tests and report back. > > > > ________________________________ > From: Stack <[email protected]> > To: [email protected]; lars hofhansl <[email protected]> > Sent: Wednesday, November 23, 2011 8:41 PM > Subject: Re: Size of KeyValue > > On Wed, Nov 23, 2011 at 3:40 PM, lars hofhansl <[email protected]> wrote: >> Looking at KeyValue I see three variable purely used for caching: >> timestampCache(long), rowCache(byte[]), and keyLength(int). >> >> From a quick glance over the code I do not see many spots where we >> repeatedly get the TS, rowKey, of keyLength from the same KV. >> Together these consume 24 bytes (almost 1/2 of KeyValue's constant memory >> overhead) on every key value created, and we create >> a *lot* KVs (real and "fake" ones) during scanning and seeking. >> >> Were these added to address specific performance concerns? If not, we might >> consider removing these. >> > > IIRC, I added them after watching stuff in a profiler (a long time > ago). Things change. Thats a lot of static mem to give up. > > St.Ack > -- Todd Lipcon Software Engineer, Cloudera
