Hive provides SQL-like functionality over Hadoop, but NoSQL stores do not support all SQL capabilities very well. As the number of joins increases, performance decreases. Instead, you should try to model your data in one table to avoid joins. You can try Apache Accumulo, which gives you full control over the data structure; you also don't have to define column families in advance, as you do in HBase. It is fast, and it is one of the most scalable tested datastores built on top of Hadoop.
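
To make the "one table" idea concrete, here is a rough sketch in plain Python. It does not use the Accumulo API; the customers/orders data, the denormalize() helper, and the (family, qualifier) column keys are all made up for illustration. The point is only that the join is done once at write time, so a read becomes a single row lookup.

```python
# Hypothetical sample data: two "tables" that would otherwise be joined at query time.
customers = {
    "c1": {"name": "Asha", "city": "Pune"},
    "c2": {"name": "Ravi", "city": "Delhi"},
}
orders = [
    {"order_id": "o1", "customer_id": "c1", "amount": 250},
    {"order_id": "o2", "customer_id": "c2", "amount": 100},
    {"order_id": "o3", "customer_id": "c1", "amount": 75},
]


def denormalize(customers, orders):
    """Do the join once at write time: build one wide row per order."""
    table = {}
    for order in orders:
        customer = customers[order["customer_id"]]
        # Columns are (family, qualifier) pairs created on the fly,
        # loosely mimicking a store with no predeclared column families.
        table[order["order_id"]] = {
            ("order", "amount"): order["amount"],
            ("customer", "id"): order["customer_id"],
            ("customer", "name"): customer["name"],
            ("customer", "city"): customer["city"],
        }
    return table


wide_table = denormalize(customers, orders)

# A read is now a single row lookup; no join is needed at query time.
row = wide_table["o3"]
print(row[("customer", "name")], row[("order", "amount")])
```

The trade-off is that the customer fields are copied into every order row, so a change to a customer means rewriting all of that customer's rows.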

-Mohit Kaushik

On 01/18/2016 10:32 AM, Divya Gehlot wrote:
Hi,
Which data storage is best for multiple joins at run time in Hadoop?
I tried Hive, but performance is poor.
Pointers/guidance appreciated.


Thanks,
Regards,
Divya
