> I don't follow what the Factory adds.

The Factory part of the name just means that it can make an object of type T from a byte[] (the column value). This is the type that we keep in the set and sort on.
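
To make the Factory idea concrete, here is a rough sketch. All names in it (ColumnValueFactory, LongValueFactory, SortedColumnIndex) are made up for illustration and are not existing HBase classes, and it assumes the column value is an 8-byte big-endian long and that row keys are plain strings.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical interface: builds a sortable T from the raw column value.
interface ColumnValueFactory<T extends Comparable<T>> {
    T fromBytes(byte[] columnValue);
}

// Example factory (assumption: the column holds an 8-byte big-endian long).
class LongValueFactory implements ColumnValueFactory<Long> {
    @Override
    public Long fromBytes(byte[] columnValue) {
        return ByteBuffer.wrap(columnValue).getLong();
    }
}

// Hypothetical per-column index: keeps (value, rowKey) pairs sorted by value.
class SortedColumnIndex<T extends Comparable<T>> {
    private static final class Entry<V> {
        final V value;
        final String rowKey;
        Entry(V value, String rowKey) {
            this.value = value;
            this.rowKey = rowKey;
        }
    }

    private final ColumnValueFactory<T> factory;
    // Ties on value are broken by row key so that rows with equal values
    // are not collapsed into a single set element.
    private final SortedSet<Entry<T>> entries = new TreeSet<>((a, b) -> {
        int c = a.value.compareTo(b.value);
        return c != 0 ? c : a.rowKey.compareTo(b.rowKey);
    });

    SortedColumnIndex(ColumnValueFactory<T> factory) {
        this.factory = factory;
    }

    // Called for each put: decode the bytes with the factory, then index the row.
    void add(String rowKey, byte[] columnValue) {
        entries.add(new Entry<>(factory.fromBytes(columnValue), rowKey));
    }

    // Row keys in ascending order of the decoded column value.
    List<String> rowKeysInOrder() {
        List<String> keys = new ArrayList<>();
        for (Entry<T> e : entries) {
            keys.add(e.rowKey);
        }
        return keys;
    }
}

class FactorySketchDemo {
    public static void main(String[] args) {
        SortedColumnIndex<Long> index = new SortedColumnIndex<>(new LongValueFactory());
        index.add("row-a", ByteBuffer.allocate(8).putLong(30L).array());
        index.add("row-b", ByteBuffer.allocate(8).putLong(10L).array());
        System.out.println(index.rowKeysInOrder());
    }
}
```

The only thing the factory contributes is the byte[]-to-T step; the sorting itself comes from T being Comparable.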

> We're talking about getting HBASE-82 into 0.2. Does that interfere with
> this proposal?

I don't think it would get in the way.

> Would be sweet if you could leverage the HBase memcache code and flusher to
> do the above.

Agreed.

> This Map would be global for the table? Or per Region?

Was thinking one per region. Most of our order-by queries should hit just a few regions due to key prefix.

> A lucene index wouldn't work for you because you want ordering?

Thought about Lucene a bit. Looks like it can provide ordering, but only for basic types. Anyways, we could still probably make it work, but it seems more heavyweight, especially if we just want ordering. Am still open to this though.

> You'd be random reading rows. You're OK w/ current performance? (For sure
> it will only improve but....).

Yeah, random reads are a concern, but this should still be an improvement over our current approach to ordered-by.

> This scanner would have a significant client-side component to do the
> arbitrage between all regions to figure the lowest column value? If you had
> a new type of 'region' -- one denoted by lowest and upper column then the
> client-side logic would fade away and your scanner would look like current
> scanners.

Is this essentially the same as Bryan's suggestion of just maintaining another HBase table?

> Splits would not be row-based and run as they currently do, but rather
> sorted-column based?

No, I just meant that when we split a region, it is easy to split the column's SortedSet for each daughter.

> How are you thinking of adding in this new functionality? Subclassing
> HRegionServer?

Was thinking the easiest / most isolated approach may be to subclass HRegionServer to intercept puts, maintain SortedSets, and add a new interface for obtaining an ordered-by scanner.

But a more complete solution may be to modify/subclass HRegion: intercept puts here, store sorted column mapfiles alongside other region data, handle splitting, scanning...
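
Going back to the split question, this is roughly what I mean by splitting the column's SortedSet for each daughter. It is only a sketch under assumptions: IndexEntry and SortedSetSplitter are hypothetical names, the column value is taken to be a long, row keys are plain strings, and the split row key itself is assumed to go to the upper daughter.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical index entry: a decoded column value plus the row it came from.
final class IndexEntry {
    final long value;
    final String rowKey;

    IndexEntry(long value, String rowKey) {
        this.value = value;
        this.rowKey = rowKey;
    }
}

final class SortedSetSplitter {
    // Sort by column value first, then by row key to keep equal values distinct.
    static final Comparator<IndexEntry> BY_VALUE_THEN_ROW = (a, b) -> {
        int c = Long.compare(a.value, b.value);
        return c != 0 ? c : a.rowKey.compareTo(b.rowKey);
    };

    // One pass over the parent's set. Returns [lowerDaughter, upperDaughter].
    // Assumption: rows with key < splitRowKey go to the lower daughter and
    // all other rows go to the upper daughter.
    static List<SortedSet<IndexEntry>> split(SortedSet<IndexEntry> parent, String splitRowKey) {
        SortedSet<IndexEntry> lower = new TreeSet<>(BY_VALUE_THEN_ROW);
        SortedSet<IndexEntry> upper = new TreeSet<>(BY_VALUE_THEN_ROW);
        for (IndexEntry e : parent) {
            if (e.rowKey.compareTo(splitRowKey) < 0) {
                lower.add(e);
            } else {
                upper.add(e);
            }
        }
        List<SortedSet<IndexEntry>> daughters = new ArrayList<>();
        daughters.add(lower);
        daughters.add(upper);
        return daughters;
    }
}
```

Since the set is ordered by column value rather than by row key, the split is a partition pass over the entries, not a headSet/tailSet call. Each daughter's set still comes out sorted by column value.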

> St.Ack
>
> > Cheers,
> > -clint
