Re: Joins Benchmark

Siva Wed, 03 Jun 2015 10:46:04 -0700

I agree with you, Anil!

On Tue, Jun 2, 2015 at 9:06 PM, anil gupta <anilgupt...@gmail.com> wrote:


> Hi Siva/Jaime,
>
> In my opinion:
> HBase is meant for quick key/value lookups or short range-based scans, and
> Hive is meant for analytical/data-warehouse workloads. Full table scans
> are not what HBase is known for, and joins are not really a sweet spot
> for HBase if they require full table scans.
> If you do need a full table scan in HBase, you can also try running a
> MapReduce job over an HBase snapshot. Or you could just use Hive for
> OLAP-type workloads.
>
> Thanks,
> Anil Gupta
>
> On Tue, Jun 2, 2015 at 4:43 PM, Siva <sbhavan...@gmail.com> wrote:
>
>> Hi Jaime,
>>
>> When we ran queries with complex joins (involving ~10 tables) on
>> Phoenix against tables holding large amounts of data, we initially saw a
>> lot of issues, and queries failed with errors. We started tuning both
>> HBase and Phoenix; now a few queries run fine, but queries over larger
>> data sets still have the same issues. We are still working on tuning
>> them. The failures could be due to our small cluster, which is limited by
>> memory and IO.
>>
>> On the other hand, the same queries with the same data size on Hive 14
>> (with Tez + ORC format + SNAPPY compression) finished within 70~100
>> seconds. It would be good if Phoenix could publish performance results
>> for join queries.
>>
>> Thanks,
>> Siva.
>>
>> On Tue, Jun 2, 2015 at 1:47 PM, Jaime Solano <jdjsol...@gmail.com> wrote:
>>
>>> Hi guys,
>>>
>>> Are there benchmarks or numbers showing how Phoenix performs during the
>>> join of two or more huge tables? I'm not familiar with the join
>>> implementation, so I'm not sure whether there are limitations regarding
>>> the number of regions, memory, disk, etc.
>>>
>>> Any thoughts?
>>>
>>> Thanks,
>>> -Jaime
>>>
>>
>>
>
>
> --
> Thanks & Regards,
> Anil Gupta
>
