Re: Some newbie questions about kylin

Li Yang Wed, 16 Sep 2015 00:57:13 -0700

> a). SELECT PART_DT FROM kylin_sales;

Well, this is a special case that exists for historical reasons. Basically
Kylin can only answer queries with "group by". However, so many people use
"select * from fact" as a health check after a cube build that we added a
hack for such queries to return some (incorrect) records. The hack has not
been well maintained since a certain version, so right now it returns an
empty result. It's something we will fix.


Please focus on cases like b) where table or column does not match any cube.
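
For case b), the catch-and-reroute idea from earlier in this thread could be
sketched roughly as below. This is only an illustration in Python, not Kylin
code: `NoMatchingCubeError`, `query_kylin` and `query_hive` are hypothetical
placeholders for however your frontend detects the "no matching cube" error
response and talks to each engine.

```python
from typing import Callable, List, Tuple

Rows = List[tuple]


class NoMatchingCubeError(Exception):
    """Hypothetical error: Kylin reported that no cube matches the query."""


def run_with_fallback(
    sql: str,
    query_kylin: Callable[[str], Rows],  # placeholder client, not a Kylin API
    query_hive: Callable[[str], Rows],   # placeholder client, not a Hive API
) -> Tuple[str, Rows]:
    """Try Kylin first; reroute to Hive only when no cube matches."""
    try:
        return "kylin", query_kylin(sql)
    except NoMatchingCubeError:
        # Case b): the table or column is not covered by any cube.
        return "hive", query_hive(sql)
```

Any other error is left to propagate, so only the "no matching cube" case
ends up on the slow path. Returning the engine name lets the frontend warn
the user that the Hive answer may take much longer.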

On Tue, Sep 15, 2015 at 5:12 PM, Luke Han <[email protected]> wrote:

> I have to say building a cube is NOT "too complex" with basic knowledge of
> data warehousing and business intelligence.
> All concepts of Kylin are the same as in traditional products.
>
> Try this page first:
> http://kylin.incubator.apache.org/docs/gettingstarted/terminology.html
>
> and we will be introducing a new step-by-step tutorial soon :)
>
> Thanks.
>
>
> Best Regards!
> ---------------------
>
> Luke Han
>
> On Mon, Sep 14, 2015 at 6:06 PM, Leon Zhang <[email protected]> wrote:
>
> > Thank you Li Yang.
> >
> > 1. OK. Cool. Looking forward to discussing this feature later.
> >
> > 2. I am trying example queries. The exception is not thrown as I
> > expected. I tried the following queries:
> >
> > a). SELECT PART_DT FROM kylin_sales;
> > b). SELECT * FROM sample_07;    # the table sample_07 was imported into kylin
> > c). SELECT PART_DT FROM kylin_sales GROUP BY PART_DT;
> >
> > Cases b) and c) were handled as I expected: b) threw an exception,
> > and c) returned the correct result set. But case a) returned an *empty
> > result set* without any exception.
> >
> > So, I am not sure if it is a feature or a bug.
> >
> > Thanks
> >
> >
> >
> > On Mon, Sep 14, 2015 at 5:10 PM, Li Yang <[email protected]> wrote:
> >
> > > 1. Creating a cube right from a SQL is charming. We had discussed this
> > > idea but never got around to implementing it. A big issue is that users
> > > still have to learn cube concepts, otherwise the auto-generated cube is
> > > either too big or the query is too slow. Still, this is a good idea to
> > > bootstrap new users.
> > >
> > > 2. Currently Kylin throws an exception when there is no matching cube.
> > > You can catch the exception in the frontend and guide the user to a hive
> > > query GUI. That's perfectly doable, and I believe, very useful too. You
> > > may start by calling the Kylin Restful API from javascript [1], and detect
> > > the error response that is caused by no matching cube.
> > >
> > > [1] http://kylin.incubator.apache.org/docs/howto/howto_use_restapi_in_js.html
> > >
> > >
> > > On Mon, Sep 14, 2015 at 4:50 PM, Leon Zhang <[email protected]> wrote:
> > >
> > > > Thank you Li Yang for your quick reply.
> > > >
> > > > 1. From a newbie's view, creating a cube is too complex. How about if
> > > > the user provides a SQL, and kylin automatically generates the model
> > > > json and cube json?
> > > >
> > > > 2. IMHO, SparkSQL can be faster, but it may still take hours. Can kylin
> > > > provide an API that *tests* whether the query can be handled by any
> > > > cube? If not, the user can *directly* call hive-server2 for the slow
> > > > query.
> > > >
> > > > I am willing to hear detailed advice, so that I can hack Kylin to
> > > > implement these potential features.
> > > >
> > > > Thanks.
> > > >
> > > >
> > > > On Mon, Sep 14, 2015 at 4:33 PM, Li Yang <[email protected]> wrote:
> > > >
> > > > > 1. You can create a view using the SQL, then let the view be the
> > > > > Kylin fact table and build a cube from it.
> > > > >
> > > > > 2. Technically yes, but the routing feature was disabled long ago. We
> > > > > had the same re-route idea in the early PoC of Kylin. Later it turned
> > > > > out that hive is too slow, and when the two are mixed, a query
> > > > > sometimes returns in seconds, sometimes in hours, so the user
> > > > > experience is very bad. The feature was then disabled and has not been
> > > > > maintained since. The idea is still valid; we may implement it when
> > > > > hive speed catches up, or route to some faster SQL engine, like
> > > > > SparkSQL. KYLIN-742 <https://issues.apache.org/jira/browse/KYLIN-742>
> > > > >
> > > > >
> > > > > On Mon, Sep 14, 2015 at 2:11 PM, Leon Zhang <[email protected]> wrote:
> > > > >
> > > > > > Hi, Kylin Developers,
> > > > > >
> > > > > >    I am a newbie to the Kylin system. While investigating this
> > > > > > awesome system, these questions came to me:
> > > > > >
> > > > > > 1.  Can I build a cube from a SQL? For the learn_kylin example, the
> > > > > > SQL for "kylin_cube_sales" is:
> > > > > >
> > > > > > ``` sql
> > > > > >  SELECT
> > > > > >  FACT_TABLE.PART_DT
> > > > > > ,FACT_TABLE.LEAF_CATEG_ID
> > > > > > ,FACT_TABLE.LSTG_SITE_ID
> > > > > > ,LOOKUP_2.META_CATEG_NAME
> > > > > > ,LOOKUP_2.CATEG_LVL2_NAME
> > > > > > ,LOOKUP_2.CATEG_LVL3_NAME
> > > > > > ,FACT_TABLE.LSTG_FORMAT_NAME
> > > > > > ,FACT_TABLE.PRICE
> > > > > > ,FACT_TABLE.SELLER_ID
> > > > > > FROM DEFAULT.KYLIN_SALES as FACT_TABLE
> > > > > > INNER JOIN DEFAULT.KYLIN_CAL_DT as LOOKUP_1
> > > > > > ON FACT_TABLE.PART_DT = LOOKUP_1.CAL_DT
> > > > > > INNER JOIN DEFAULT.KYLIN_CATEGORY_GROUPINGS as LOOKUP_2
> > > > > > ON FACT_TABLE.LEAF_CATEG_ID = LOOKUP_2.LEAF_CATEG_ID AND
> > > > > > FACT_TABLE.LSTG_SITE_ID = LOOKUP_2.SITE_ID
> > > > > > ```
> > > > > >
> > > > > > 2. Kylin is super fast; can I route an *unmatched* query to the slow
> > > > > > hive engine? For example, a simple query like "select * from
> > > > > > kylin_sales" returns an empty result set. Can I route a query like
> > > > > > this to the hive engine?
> > > > > >
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > >
> > > >
> > >
> >
>
