I put the tail of this discussion into HBASE-32. - Andy
--- On Mon, 7/21/08, stack <[EMAIL PROTECTED]> wrote:

> From: stack <[EMAIL PROTECTED]>
> Subject: Re: any chance to get the size of a table?
> To: [email protected]
> Date: Monday, July 21, 2008, 10:46 AM
>
> Andrew Purtell wrote:
> > ..
> > Maybe a map of MapFile to row count estimations can be
> > stored in the FS next to the MapFiles and can be updated
> > appropriately during compactions. Then a client can iterate
> > over the regions of a table, ask the regionservers involved
> > for row count estimations, the regionservers can consult
> > the estimation-map and send the largest count found there
> > for the table plus the largest memcache count for the
> > table, and finally the client can total all of the results.
> >
> I like this idea. Suggest sticking it in the issue. Each
> store already has an accompanying 'meta' file under the
> sympathetic 'info' dir. Could stuff estimates in here.
> Estimate of rows would also help sizing bloom filters when the
> 'enable-bloomfilters' switch is thrown. We'd have to
> be clear this count is an estimate, particularly with rows of
> sparsely populated columns.
>
> St.Ack
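
For anyone following along, here is a rough sketch of the estimation scheme quoted above, in Python rather than HBase's Java just to keep it short. Everything named here (`Region`, `mapfile_estimates`, `memcache_counts`, `estimate_table_rows`) is hypothetical and is not an existing HBase API. It also assumes each region reports its own estimate, which is one possible reading of "the largest count found there ... plus the largest memcache count".

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class Region:
    """Hypothetical stand-in for one region as seen by its regionserver."""

    # MapFile name -> estimated row count, as it might be stored next to
    # the MapFiles and refreshed during compactions (assumption).
    mapfile_estimates: Dict[str, int] = field(default_factory=dict)
    # Row counts currently held in each store's memcache (assumption).
    memcache_counts: List[int] = field(default_factory=list)

    def estimate_rows(self) -> int:
        # Largest on-disk estimate plus largest in-memory count.
        # This is only an estimate: rows with sparsely populated columns
        # may be missing from any single MapFile or memcache.
        largest_on_disk = max(self.mapfile_estimates.values(), default=0)
        largest_in_memory = max(self.memcache_counts, default=0)
        return largest_on_disk + largest_in_memory


def estimate_table_rows(regions: Iterable[Region]) -> int:
    """Client side: total the per-region estimates for a table."""
    return sum(region.estimate_rows() for region in regions)
```

With two regions, one holding MapFile estimates of 1000 and 1200 plus memcache counts of 50 and 30, and another holding a single MapFile estimate of 400 with an empty memcache, the client would total 1250 + 400.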
