Re: Need advice for kylin newbie

Adunuthula, Seshu Fri, 27 Feb 2015 10:16:38 -0800

Vikram,

Thank you for your honest and direct feedback on Kylin. As you rightly
pointed out, the sweet spot for Kylin is the ability to do MOLAP on
10-100 billion rows with sub-second query responses. So we believe Kylin
is the right tool for your requirements below.


> So given these requirements, is Kylin the right solution to replace our
> on-premise MOLAP cubes?  As long as our users can pivot/slice & dice the
> measures quickly from client tools like Excel and Tableau by dragging and
> dropping dimensions into rows/columns w/o the need to join to fact table,



Docker is useful for single-machine developer deployments, and I have
found that a certain level of Docker expertise is needed before you can
successfully deploy them.

You are doing a certain set of firsts that could be making your setup a
nightmare. Using Azure as the managed Hadoop system would certainly be a
first for the Kylin team, and you might be running into issues related to
that.

That said, we (the Kylin team) are interested in making your POC
successful. As with any open source project, there is some assembly
required. Is your team set up for development activities? If so, we can
have team-to-team meetings to determine what it takes to make the POC
successful.

On 2/27/15, 8:51 AM, "Vikram Kone" <[email protected]> wrote:

>Hi,
>I'm a newbie when it comes to Kylin and the Hadoop ecosystem in general.
>Our team has been predominantly a Microsoft shop that uses the MS stack
>for most of its BI needs. So we are talking SQL Server for storing
>relational data and SQL Server Analysis Services for building MOLAP cubes
>for sub-second query analysis.
>Lately, we have been hitting degradation in our cube query response times
>as our data sizes grew considerably over the past year. We are talking
>fact tables in the 10-100 billion row range and a few dimensions in the
>10-100's of millions of rows. We tried vertically scaling up our SSAS
>server but queries are still taking a few minutes. In light of this, I was
>entrusted with the task of figuring out an open source solution that would
>scale to our current and future needs for data analysis.
>I looked at a bunch of open source tools like Apache Drill, Druid,
>AtScale, Spark, Storm, Kylin etc. and settled on exploring Kylin as the
>first step given its recent rise in popularity and the growing ecosystem
>around it.
>I started to build out a POC for our MOLAP cubes using Kylin with
>HDFS/Hive as the data source to see how it scales for our
>queries/measures in real time with real data. The setup has been a
>nightmare so far. Configuration of the cluster takes too long. I tried the
>Docker version and it fails with cryptic errors. Then I tried installing
>it using the build-from-root option on a Hadoop cluster and am seeing more
>issues related to cube building. Same with the binary package
>installation. It's just taking too long to set up. There should be an
>easier way to do this :(
>Roughly, these are the requirements for our team
>1. Should be able to create facts, dimensions and measures from our data
>sets in an easier way.
>2. Cubes should be queryable from Excel and Tableau.
>3. Easily scale out by adding new nodes when data grows.
>4. Very low maintenance and highly stable for production-level workloads.
>5. Sub-second query latencies for COUNT DISTINCT measures (since the
>majority of our expensive measures are of this type). We are OK with
>approximate distinct counts for better perf.
>
>So given these requirements, is Kylin the right solution to replace our
>on-premise MOLAP cubes?  As long as our users can pivot/slice & dice the
>measures quickly from client tools like Excel and Tableau by dragging and
>dropping dimensions into rows/columns w/o the need to join to the fact
>table, we are OK with however the data is laid out. Doesn't have to be a
>cube. It can be a flat file in HDFS for all we care. I would love to chat
>with someone who has successfully done this kind of migration from SSAS
>OLAP cubes to Kylin in their team or company and learn about the pros and
>cons before I spend more time configuring this stuff.
>
>This is it for now. Looking forward to a great discussion.
>
>P.S. We have decided on using Azure as our managed Hadoop system in the
>cloud.
