This very much depends on your use case. Is it for internal analytics,
perhaps with large, long-running queries? Then Spark or Presto should be
fine, and both open up lots of other options.

But if you need low-latency queries that involve precise lookups, keep
in mind that response time in Impala will always be better than in Spark or
Presto, simply because there is no JVM startup time. By the same token,
Impala is better suited to highly concurrent use cases.

For the record, we use Impala and Kudu for our client-facing analytics and
are very satisfied with the performance and stability. HDFS is there but
mostly dormant, aside from its use in dimension staging now and then. The
only Hive component is the metastore. Overall, despite how it might look on
paper, it's an extremely simple setup.

Cliff

On Tue, Dec 10, 2019, 10:41 AM Yariv Moshe <ya...@frontline-pcb.com> wrote:

> Hi,
>
>
>
> I’m using Apache Kudu as a database and Impala as a query engine.
>
> I’m trying to reduce the number of technologies.
>
> As far as I know, Impala requires Hive, which requires HDFS.
>
>
>
> So now I have a question.
>
> 1.      Is there any option to use Impala on top of Kudu without
> Hive/HDFS?
>
> 2.      I looked at Presto, which is a query engine that also runs on top
> of Kudu, but its performance was slower than Impala's. Is there any other
> recommended query engine?
>
>
>
> Thanks,
>
> Yariv
>
>
>
