Comments inline below.

---
Jim Kellerman, Senior Engineer; Powerset


> -----Original Message-----
> From: Naama Kraus [mailto:[EMAIL PROTECTED]
> Sent: Sunday, June 15, 2008 3:39 AM
> To: [email protected]
> Subject: HBase and locality issues
>
> Hi,
>
> I have some questions regarding HBase and locality issues -
> I'd appreciate some explanations and clarifications.
>
> I understand HBase is built on top of HDFS.
> Say an HRegionServer creates a HStoreFile where it puts some
> column family content. Does HDFS split the file to multiple
> HDFS blocks and distributes them around bunch of machines ?

Yes. HStoreFile is currently implemented using org.apache.hadoop.io.MapFile,
so its contents are ordinary HDFS files that get split into blocks like any
other.
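
To make the block splitting concrete, here is a rough sketch of how a file's
byte range divides into fixed-size blocks. This is a toy model, not HDFS code:
the function name, the 64 MB block size, and the 150 MB file size are all
assumptions picked for illustration.

```python
# Toy model of splitting a file into fixed-size blocks.
# Not HDFS code; block_ranges and the sizes below are illustrative assumptions.

MB = 1024 * 1024
BLOCK_SIZE = 64 * MB  # assumed block size, configurable in a real cluster


def block_ranges(file_length, block_size):
    """Yield (offset, length) for each block of a file of file_length bytes."""
    offset = 0
    while offset < file_length:
        # The last block may be shorter than block_size.
        length = min(block_size, file_length - offset)
        yield (offset, length)
        offset += length


# A hypothetical 150 MB store file ends up as two full blocks and one short one.
ranges = list(block_ranges(150 * MB, BLOCK_SIZE))
for offset, length in ranges:
    print(offset // MB, length // MB)
```

Each of those blocks can then be replicated to different machines, which is
why a region server may not have all of its store file's data on local disk.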

> If that's the case, when the region server needs to actually
> access the files, does HDFS underneath communicates remote
> machines to read the various blocks ?

Sometimes. If a replica of the requested block is stored on the local machine,
HDFS will try to read that one instead of going over the network.
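
Roughly, the preference works like the sketch below. This is a simplified
model and not the actual HDFS client logic: choose_replica, the block ids, and
the host names are all made up for illustration, and the fallback to "first
remote replica" is an assumption.

```python
# Toy model of preferring a local block replica over a remote one.
# Not the HDFS client code; all names here are hypothetical.

def choose_replica(replica_hosts, local_host):
    """Return the host to read a block from, preferring a local replica."""
    if not replica_hosts:
        raise ValueError("block has no known replicas")
    if local_host in replica_hosts:
        return local_host      # local read, no network hop
    return replica_hosts[0]    # assumed fallback: first remote replica


# Hypothetical block -> replica hosts map for one store file.
blocks = {
    "blk_1": ["rs1", "dn4", "dn7"],  # one replica lives on the region server
    "blk_2": ["dn2", "dn5", "dn9"],  # no local replica
}

# The region server runs on host "rs1".
reads = {blk: choose_replica(hosts, "rs1") for blk, hosts in blocks.items()}
print(reads)
```

So a read of blk_1 stays local, while blk_2 has to be fetched from a remote
machine.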

> Doesn't it hurt performance since there is no locality in data access
> (region server actually works on remote blocks).

Somewhat. We have other areas that we have identified as larger performance
bottlenecks that need to be addressed first.

> Or is the HStoreFile implemented in some other way which
> writes it to the local disks of the region server node
> machine that owns it ?

No. Blocks are placed according to HDFS's own block placement strategy.

> If so, then how ? Does this code overrides the HDFS behavior ?

It doesn't.

> Another related question is about Map Reduce and HBase. When
> a MapReduce job  runs on top of HBase - i.e. gets  a table as
> an input. How does the MapReduce  framework know how to
> schedule  map tasks near data ? Does it have any knowledge of
> the actual location of the data pieces composing the table to
> be processed ?

No. It is on our list of things to do. See HBASE-57.
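
To illustrate what is missing: a locality-aware scheduler needs each input
split to carry a hint about which hosts hold its data (in Hadoop this is what
InputSplit.getLocations() is for). The sketch below is a toy model, not the
MapReduce scheduler; pick_split, the region names, and the host names are
hypothetical.

```python
# Toy model of locality-aware task assignment.
# Not Hadoop code; pick_split and all names below are illustrative assumptions.

def pick_split(pending, node):
    """Pick a pending split for `node`.

    `pending` maps split name -> list of hosts believed to hold its data.
    Returns (split_name, is_local).
    """
    if not pending:
        raise ValueError("no pending splits")
    for name, hosts in pending.items():
        if node in hosts:
            return name, True      # data-local assignment
    # No split is local to this node, so take the first pending one.
    return next(iter(pending)), False


# With location hints, a free node can be handed a split it hosts.
with_hints = {"region-a": ["rs1"], "region-b": ["rs2"]}
# Without hints, every assignment is effectively arbitrary.
without_hints = {"region-a": [], "region-b": []}

print(pick_split(with_hints, "rs2"))
print(pick_split(without_hints, "rs2"))
```

Without usable hints the framework falls into the second case, which is the
situation described above.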

> I'd be also glad to get pointers to the related source code (classes).
>
> Thanks for any information,
> Naama
>
> --
> oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00
> oo 00 oo 00 oo 00 oo 00 oo "If you want your children to be
> intelligent, read them fairy tales. If you want them to be
> more intelligent, read them more fairy tales." (Albert
> Einstein)
