Hello Long Zhou,

Thanks for reaching out. I'm a developer on Lens and will try to answer your
questions with respect to Lens.

On Thu, Feb 26, 2015 at 9:09 PM, Long Zhou <[email protected]> wrote:

> [delivery to user@kylin failed, resend to dev@kylin]
>
> Hi Kylin and Lens communities,
>
>     I am working on a big data analysis project and consider using Kylin
> or Lens. Do you have some guidelines/recommendations on how to choose the
> right solution? We are particularly interested in the performance
> characteristics of these two solutions on terabytes of sparse data.
>

We don't have guidelines, recommendations, or performance characteristics
documented anywhere as of now, but the user documentation should help you
with some details of the system. Lens itself does not add any overhead to
query execution: the query is handed to the underlying engine, so the
performance numbers published for the underlying systems should be sufficient.


>     I just started learning the two projects. It seems Kylin is more like
> MOLAP while Lens is more like ROLAP, is that correct? Does the differences
> between MOLAP and ROLAP apply here?
>

I agree that Lens is a ROLAP-like system. We can say Lens can become
HOLAP (http://en.wikipedia.org/wiki/ROLAP,
http://en.wikipedia.org/wiki/HOLAP,
http://www.1keydata.com/datawarehousing/molap-rolap.html). As with any
ROLAP system, the performance of Lens depends on the underlying execution
engines. If the data is not aggregated, Lens picks the detailed tables to
answer the query, but if aggregated data is available through an ETL
process, it makes use of that instead.
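
As a rough illustration only (this is not Lens code; the table names, the
Table type, the pick_table function, and the use of row count as a cost
measure are all made up for this example), the idea of preferring an
aggregate when one can answer the query, and falling back to the detailed
table otherwise, could be sketched like this:

```python
# Hypothetical sketch, not the Lens implementation: choose the smallest
# table whose columns cover everything the query asks for.
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    name: str
    columns: frozenset  # columns available in this table
    rows: int           # approximate row count, used here as a cost proxy


def pick_table(tables, needed_columns):
    """Return the cheapest table that can answer the query, or None."""
    needed = frozenset(needed_columns)
    candidates = [t for t in tables if needed <= t.columns]
    return min(candidates, key=lambda t: t.rows, default=None)


# Made-up tables: one detailed table and one aggregate built by an ETL job.
detail = Table("sales_raw", frozenset({"day", "city", "product", "amount"}), 10**9)
by_city = Table("sales_by_city_daily", frozenset({"day", "city", "amount"}), 10**6)

chosen_agg = pick_table([detail, by_city], {"day", "city", "amount"})
chosen_raw = pick_table([detail, by_city], {"product", "amount"})
print(chosen_agg.name)  # the aggregate covers the query, so it is preferred
print(chosen_raw.name)  # only the detailed table has "product"
```

In this sketch the query on day, city and amount goes to the aggregate,
while a query that needs product can only be served by the detailed table.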

>     When using Hive as storage, it seems Kylin might perform better since
> data is pre-aggregated and cached. How does Kylin handle sparse tables and
> avoid empty cells in cache? Does Lens have cache on top of Hive?
>

No, Lens does not have any cache on top of Hive.


>     Lens supports columnar data warehouses like Redshift. How much
> performance could we gain by loading data to Redshift? Where can I find
> performance benchmark data for the two projects?
>

It would be the same as how fast Redshift can answer queries. Lens comes with
JDBCDriver for reaching systems that understand JDBC. At InMobi, we are
using it in production with a columnar data warehouse, InfoBright
(https://www.infobright.com/). It should work with Redshift as well, but it
has not yet been tested with Redshift.

Thanks
Amareshwari
