Seems about right.
On Apr 1, 2008, at 9:41 AM, stack wrote:
Chatting on list, it was thought that an InMemoryMapFile would not be
too hard to do: a couple of days, maybe. In HBaseMapFile, we could
read all of the data into memory, into arrays (since it is already
sorted), as we do now when reading in the index. The
HBaseMapFile.Reader would be modified to get entries from memory
rather than from disk.
It would be ugly, since flags would have to go down through multiple
levels of inheritance (down through BloomFilterMapFile and
HalfMapFile), though that should get cleaned up when we do our own
MapFile.
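
To make the read path described above concrete, here is a rough sketch
of serving lookups from sorted in-memory arrays. It is not the actual
HBaseMapFile.Reader: the class name InMemorySortedFile, its methods, and
the use of String keys with natural ordering are all assumptions made
for illustration only.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical sketch only; not HBase code. Holds every entry of an
 * already-sorted file in two parallel arrays and answers reads with a
 * binary search instead of a disk seek.
 */
final class InMemorySortedFile {
    private final String[] keys;   // ascending, natural String order (assumption)
    private final byte[][] values; // values[i] belongs to keys[i]

    /** Bulk-load entries that are already in key order. */
    InMemorySortedFile(TreeMap<String, byte[]> sortedEntries) {
        keys = new String[sortedEntries.size()];
        values = new byte[sortedEntries.size()][];
        int i = 0;
        for (Map.Entry<String, byte[]> e : sortedEntries.entrySet()) {
            keys[i] = e.getKey();
            values[i] = e.getValue();
            i++;
        }
    }

    /** Exact-match lookup; returns null when the key is absent. */
    byte[] get(String key) {
        int idx = Arrays.binarySearch(keys, key);
        return idx >= 0 ? values[idx] : null;
    }

    /** Smallest stored key at or after the given key, or null if none. */
    String ceilingKey(String key) {
        int idx = Arrays.binarySearch(keys, key);
        if (idx < 0) {
            idx = -idx - 1; // binarySearch encodes the insertion point
        }
        return idx < keys.length ? keys[idx] : null;
    }

    int size() {
        return keys.length;
    }
}
```

Parallel arrays mirror the "read it all into arrays, as we do for the
index" idea; the whole file's contents stay on the heap, which is
exactly where the memory-pressure concern comes from.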
But it was thought that, without more intelligent rebalancing of
regions across the cluster so they are evenly distributed, our
current lumpy assignment would make it easy for a regionserver to
have its memory overstrained: it would go down with an OOME, its
regions would be redistributed unevenly, another regionserver would
go down, and a downward spiral would follow. It was thought this had
to be addressed first.
(Is that a fair summary, Bryan?)
St.Ack
Bryan Duxbury wrote:
I'm not thinking of hot cell caching. I'm talking about going the
whole way and putting all the data in-memory. So yes, store file
contents would be loaded into memory, though not the memcache,
because that would get really complicated, I think.
InMemoryStoreFile would really be what I was going for, I'd guess.
The table wouldn't be read-only. Writes would go through to disk
but reads would come straight from memory.
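
A minimal sketch of the write-through behaviour described above,
assuming a simple key/value model: each write is persisted first and
then made visible in memory, and reads never touch disk. The class
WriteThroughStore and the OutputStream standing in for the on-disk
store file are hypothetical, not an existing HBase API.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentSkipListMap;

/** Hypothetical sketch only; not HBase code. */
final class WriteThroughStore {
    // Sorted in-memory copy of the data; serves every read.
    private final ConcurrentSkipListMap<String, String> inMemory =
            new ConcurrentSkipListMap<>();
    // Stand-in for the durable, disk-backed store (assumption).
    private final OutputStream disk;

    WriteThroughStore(OutputStream disk) {
        this.disk = disk;
    }

    /** Write to disk first, then publish the value to readers. */
    synchronized void put(String row, String value) {
        try {
            disk.write((row + "\t" + value + "\n").getBytes(StandardCharsets.UTF_8));
            disk.flush();
        } catch (IOException e) {
            // The in-memory copy is left untouched if the disk write fails.
            throw new UncheckedIOException(e);
        }
        inMemory.put(row, value);
    }

    /** Reads come straight from memory; returns null if the row is absent. */
    String get(String row) {
        return inMemory.get(row);
    }
}
```

Ordering the disk write before the in-memory update means a reader
never sees a value that was not persisted, at the cost of every write
paying the disk latency.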
On Mar 31, 2008, at 8:42 PM, stack wrote:
A Reference-cache of hot cells would take a day at the outside,
I'd guess. The bulk of the work is done.
If you're talking about something else, let's discuss. What would
it look like? Would Store MapFiles be floated in memory or
copied to MemCache? Would we need a special In-Memory MapFile?
Would we do a bulk memcopy from HDFS up into memory and then serve
from there? Would the table have to be read-only?
St.Ack
Bryan Duxbury wrote:
Quick poll for us devs - if you had to guess, how long do you
think it would take for the in-memory option of HBase to
actually be implemented to work reliably?
-Bryan