Also, the Hadoop FS abstraction layer supports distributed
filesystems other than Hadoop DFS, such as KFS.
(Are the IBM Almaden folks running HBase on KFS?) So query
response times will depend on the random read performance of
the underlying filesystem, and may be better than what is
achievable when running on top of HDFS.

   - Andy

> From: stack <[EMAIL PROTECTED]>
> Subject: Re: HBase for small data sets
> To: [email protected]
> Date: Wednesday, October 8, 2008, 2:35 PM
> Naama Kraus wrote:
> > Hi,
> >
> > Will HBase work reasonably for small data sets? E.g.
> > for 10s or 100s of gigabytes of data? Would it make sense
> > to use HBase to store and access them?
> >   
> > I was thinking HDFS and M/R have an overhead and thus
> > won't perform well for small amounts of data. But say I
> > use HBase w/o MapReduce (get, set, scan only) and use
> > the local file system underneath, will I get reasonable
> > performance?
>
> It won't look 'reasonable' if stacked against an RDBMS.
> Might be 'fast enough' though?
> 
> Scanning has recently been improved (4-fold is what I'm seeing)
> in trunk.  The coming batching facility should improve writes
> similarly. Random reads continue to suffer but might be OK
> going against the local filesystem?
> 
> Thanks Naama,
> St.Ack


      
