Another option would be to try Facebook's Presto https://prestodb.io/

Like Impala, Presto is designed for fast interactive querying over Hive
tables, but it can also query data from many other sources
(MySQL, PostgreSQL, Kafka, Cassandra, ...):
https://prestodb.io/docs/current/connector.html
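
As a rough illustration of what querying across sources could look like,
here is a small sketch that only builds the SQL text for a join between a
Hive table and a MySQL table. Presto addresses tables as
catalog.schema.table; the catalog, schema, table and column names below are
made up, and nothing here connects to a real cluster.

```python
def federated_query(hive_table: str, mysql_table: str, key: str) -> str:
    """Build the text of a Presto join across two catalogs.

    Table names are expected to be fully qualified (catalog.schema.table).
    """
    return (
        f"SELECT h.{key}, count(*) AS events\n"
        f"FROM {hive_table} AS h\n"
        f"JOIN {mysql_table} AS m ON h.{key} = m.{key}\n"
        f"GROUP BY h.{key}"
    )


# Hypothetical names, for illustration only.
sql = federated_query("hive.web.page_views", "mysql.crm.customers", "customer_id")
print(sql)
```

The resulting statement could then be pasted into the Presto CLI or sent
through whichever client you already use.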

In terms of performance on small queries, it seems to be as fast as
Impala, a league above Spark SQL, and of course two leagues above Hive.

Unlike Impala, Presto can also read the ORC file format and make the
most of it (e.g. read pre-aggregated values from the column statistics
stored in ORC files).

It can also make use of Hive's bucketing feature, while Impala still cannot:
https://github.com/prestodb/presto/issues/6666
https://issues.apache.org/jira/browse/IMPALA-3118
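
To make the ORC and bucketing points a bit more concrete, here is a sketch
that generates Hive DDL for a bucketed, ZLIB-compressed ORC table. The
table and column names and the bucket count are placeholders I picked for
the example, and the helper only produces the statement text; you would
still run it yourself in Hive and adapt it to your schema.

```python
def orc_bucketed_ddl(table, columns, bucket_col, buckets, codec="ZLIB"):
    """Return Hive DDL text for a bucketed ORC table.

    columns is a list of (name, hive_type) pairs; codec is the value used
    for the 'orc.compress' table property (e.g. ZLIB or SNAPPY).
    """
    cols = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE TABLE {table} (\n  {cols}\n)\n"
        f"CLUSTERED BY ({bucket_col}) INTO {buckets} BUCKETS\n"
        f"STORED AS ORC\n"
        f"TBLPROPERTIES ('orc.compress'='{codec}')"
    )


# Hypothetical table layout, for illustration only.
ddl = orc_bucketed_ddl(
    "web.page_views",
    [("customer_id", "BIGINT"), ("url", "STRING"), ("ts", "TIMESTAMP")],
    bucket_col="customer_id",
    buckets=32,
)
print(ddl)
```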

Regards,

Furcy





On Tue, Jun 20, 2017 at 5:36 AM, Sruthi Kumar Annamneedu <
sruthikumar...@gmail.com> wrote:

> Try using Parquet with Snappy compression and Impala will work with this
> combination.
>
> On Sun, Jun 18, 2017 at 3:35 AM, rakesh sharma <rakeshsharm...@hotmail.com
> > wrote:
>
>> We are facing a format issue. We would like to run BI-style queries
>> on Hive data using Impala, which supports Parquet, but we would also like
>> the data to be compressed to the best ratio, as with ORC. However, Impala
>> cannot query ORC. What design should we consider for this? Any help is
>> appreciated.
>>
>> Thanks
>> Rakesh
>
