I'm trying to do some processing on our company's HBase dataset, but I'm
pretty new to the HBase and Hadoop ecosystem.
I would like to get some feedback from this community to see whether my
understanding of HBase and MapReduce operations on it is correct.
Some background:
1. We have a
Thanks for the confirmation. There is also a good, detailed discussion thread
on this issue at
http://apache-hbase.679495.n3.nabble.com/Shared-HDFS-for-HBase-and-MapReduce-td4018856.html
Michael
s to export computation to the data and not import data to the
> computation. If I were to segregate HBase and MapReduce clusters, then when
> using MapReduce on HBase data would I not have to transfer large amounts of
> data from HBase/HDFS cluster to MapReduce/HDFS cluster?
>
> Cloud
On Wed, Jun 6, 2012 at 2:20 AM, Amandeep Khurana wrote:
>> These are general recommendations and definitely change based on the
>> access patterns and the way you will be using HBase and MapReduce. In
>> general, if you are building a latency sensitive application on top of
github.com/lfrancke/puppet-cdh but there are others needed
too
On Wed, Jun 6, 2012 at 2:20 AM, Amandeep Khurana wrote:
> Atif,
>
> These are general recommendations and definitely change based on the
> access patterns and the way you will be using HBase and MapReduce. In
> gener
Atif,
These are general recommendations and definitely change based on the access
patterns and the way you will be using HBase and MapReduce. In general, if you
are building a latency-sensitive application on top of HBase, running a
MapReduce job at the same time will impact performance due to
>isolate a MapReduce/HDFS cluster from an HBase/HDFS cluster as the two when
>sharing the same HDFS cluster could lead to performance problems. I am not
>sure if this is entirely true given the fact that the main concept behind
>Hadoop is to export computation to the data and not im
in concept behind
Hadoop is to export computation to the data and not import data to the
computation. If I were to segregate HBase and MapReduce clusters, then when
using MapReduce on HBase data would I not have to transfer large amounts of
data from HBase/HDFS cluster to MapReduce/HDFS cluster?
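
To put rough numbers on this question, here is a minimal sketch in Python. It
is only a back-of-the-envelope model, not anything HBase or Hadoop provides:
the function name remote_bytes, the 10 TiB table size, and the 0.9 locality
figure are all hypothetical. It assumes one full scan of the table and that
any block without a local replica is read over the network exactly once.

```python
def remote_bytes(total_bytes: int, local_fraction: float) -> int:
    """Estimate bytes read over the network for one full table scan.

    total_bytes    -- size of the data the job scans
    local_fraction -- share of that data with a replica on the node
                      running the task (0.0 = none, 1.0 = all)
    """
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return round(total_bytes * (1.0 - local_fraction))


# Hypothetical 10 TiB table, for illustration only.
TABLE_BYTES = 10 * 1024**4

# Shared cluster: assume most blocks are local to the tasks reading them.
shared = remote_bytes(TABLE_BYTES, 0.9)

# Segregated clusters: no block is local, so every byte crosses the network.
segregated = remote_bytes(TABLE_BYTES, 0.0)

print(f"shared cluster:      {shared / 1024**4:.1f} TiB over the network")
print(f"segregated clusters: {segregated / 1024**4:.1f} TiB over the network")
```

Under these assumptions the segregated setup moves the whole table per scan,
which is the cost the question is pointing at. Whether that outweighs the
interference a shared cluster causes depends on the workload.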
C
>
> 1. HBase guarantees data locality of store files and Regionserver only if
> it stays up for long. If there are too many region movements or the server
> has been recycled recently, there is a high probability that store file
> blocks are not local to the region server. But the getSplits comman
I have a couple of questions related to MapReduce over HBase:
1. HBase guarantees data locality of store files and Regionserver only if
it stays up for long. If there are too many region movements or the server
has been recycled recently, there is a high probability that store file
blocks are not