For a query to run in a fully distributed manner, MapReduce may still
be required (running atop HBase, for example). There has been ongoing
work on the HBase side to help with this as well, but you'll get
better answers about it on the HBase mailing lists.
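
If you do go the MapReduce route, the usual pattern is to feed the job
from the table via the org.apache.hadoop.hbase.mapreduce package, so
the scan is split across regions and runs fully distributed. A rough,
untested sketch (the table name "mytable" and the output path are
placeholders; assumes the 0.90-era client API):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class HBaseScanJob {

  // Called once per table row; each mapper scans one region's slice,
  // which is what makes the query distributed. This one only counts
  // rows -- a real query would inspect the Result and filter/emit.
  static class RowCountMapper extends TableMapper<Text, LongWritable> {
    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context ctx)
        throws IOException, InterruptedException {
      ctx.write(new Text("rows"), new LongWritable(1));
    }
  }

  // Sums the per-mapper counts into a single total.
  static class SumReducer
      extends Reducer<Text, LongWritable, Text, LongWritable> {
    @Override
    protected void reduce(Text key, Iterable<LongWritable> vals, Context ctx)
        throws IOException, InterruptedException {
      long n = 0;
      for (LongWritable v : vals) n += v.get();
      ctx.write(key, new LongWritable(n));
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "hbase-scan-sketch");
    job.setJarByClass(HBaseScanJob.class);

    Scan scan = new Scan();
    scan.setCaching(500);        // fewer RPC round trips per mapper
    scan.setCacheBlocks(false);  // don't churn the block cache from MR

    TableMapReduceUtil.initTableMapperJob("mytable", scan,
        RowCountMapper.class, Text.class, LongWritable.class, job);
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(LongWritable.class);
    FileOutputFormat.setOutputPath(job, new Path("/tmp/hbase-scan-out"));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Pig and Hive (mentioned further down the thread) generate jobs of
roughly this shape for you, which is why they also need MapReduce.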

On Tue, Jul 12, 2011 at 3:31 PM, Rita <[email protected]> wrote:
> This is encouraging.
>
> "Make sure HDFS is running first. Start and stop the Hadoop HDFS daemons by
> running bin/start-hdfs.sh over in the HADOOP_HOME directory. You can ensure
> it started properly by testing the *put* and *get* of files into the Hadoop
> filesystem. HBase does not normally use the mapreduce daemons. These do not
> need to be started."
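
An aside on that snippet: in stock Hadoop the script is
bin/start-dfs.sh (start-hdfs.sh looks like a typo in the docs). The
put/get check it describes is normally done with `hadoop fs -put` and
`hadoop fs -get`; the same round trip through the FileSystem API, as a
rough untested sketch with placeholder paths, would be:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsSmokeTest {
  public static void main(String[] args) throws Exception {
    // Reads core-site.xml/hdfs-site.xml from the classpath, so this
    // talks to whatever HDFS the client is configured for.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    Path remote = new Path("/tmp/hdfs-smoke-test.txt");       // placeholder
    fs.copyFromLocalFile(new Path("/etc/hosts"), remote);     // the "put"
    fs.copyToLocalFile(remote, new Path("/tmp/hosts.copy"));  // the "get"

    System.out.println("HDFS round trip ok: " + fs.exists(remote));
  }
}
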
>
> On Mon, Jul 11, 2011 at 1:40 PM, Bharath Mundlapudi
> <[email protected]>wrote:
>
>> Another option to look at is Pig or Hive. Both need MapReduce.
>>
>>
>> -Bharath
>>
>>
>>
>> ________________________________
>> From: Rita <[email protected]>
>> To: "<[email protected]>" <[email protected]>
>> Sent: Monday, July 11, 2011 4:31 AM
>> Subject: large data and hbase
>>
>> I have a dataset which is several terabytes in size. I would like to query
>> this data using HBase (with SQL-like queries). Would I need to set up
>> MapReduce to use HBase? Currently the data is stored in HDFS and I am using
>> `hdfs -cat ` to read the data and pipe it to stdin.
>>
>>
>> --
>> --- Get your facts first, then you can distort them as you please.--
>>
>
>
>
> --
> --- Get your facts first, then you can distort them as you please.--
>



-- 
Harsh J
