Re: Reply: a simple question about kylin input and output

Li Yang Wed, 08 Jun 2016 02:03:59 -0700

For your data model, the 2nd approach sounds better to me.

Because the dim table is big, it's better to have it in the cube so that the
join and aggregation are pre-calculated.
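
A rough sketch of the difference in plain Python, not Kylin itself. The toy
data, the column names, and every function name below are made up for
illustration, and the real storage layout is more involved. With a derived
dimension the build aggregates by id only, so a query on cat1 still has to map
ids through the dim table and re-aggregate. With the pre-joined model that work
happens once at build time.

```python
from collections import defaultdict

# Hypothetical toy data: dim table keyed by id, fact rows of (dim_id, amount).
DIM = {
    1: {"cat1": "food", "brand": "acme"},
    2: {"cat1": "food", "brand": "zeta"},
    3: {"cat1": "toys", "brand": "acme"},
}
FACT = [(1, 10.0), (2, 5.0), (3, 7.5), (1, 2.5)]


def build_cube_by_id(fact):
    """Approach 1, build time: aggregate on the id column only."""
    totals = defaultdict(float)
    for dim_id, amount in fact:
        totals[dim_id] += amount
    return dict(totals)


def query_cat1_derived(cube_by_id, dim):
    """Approach 1, query time: map each id to cat1, then aggregate again."""
    totals = defaultdict(float)
    for dim_id, amount in cube_by_id.items():
        totals[dim[dim_id]["cat1"]] += amount
    return dict(totals)


def build_cube_by_cat1(fact, dim):
    """Approach 2, build time: join and aggregate by cat1 up front."""
    totals = defaultdict(float)
    for dim_id, amount in fact:
        totals[dim[dim_id]["cat1"]] += amount
    return dict(totals)


cube_by_id = build_cube_by_id(FACT)
cube_by_cat1 = build_cube_by_cat1(FACT, DIM)

# Both approaches give the same answer; only the query-time work differs.
derived_result = query_cat1_derived(cube_by_id, DIM)  # join + aggregate per query
prebuilt_result = cube_by_cat1                        # plain read per query
```

The query-time cost of the first approach grows with the number of distinct
ids, which is why a 7,000,000-row dim table makes it slow.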



On Wed, Jun 8, 2016 at 9:28 AM, [email protected] <[email protected]>
wrote:

> I ask this question because we have a big dim table with 7,000,000 records.
>
> The dim table is defined as below
>
> Id    cat1 (category 1)    cat2 (category 2)    cat3 (category 3)    brand
>
> There are actually 4 dimensions in this table.
>
> We have tested two approaches:
>
> 1. Don't join the fact table and the dim table before building the cube;
> join the tables in the query phase. Query time is about 10-40 seconds.
>
> We define the cube with a derived dimension (the id column) plus the other
> dimensions in the fact table.
>
> I guess Kylin stores the rowkey with the id column, not cat1, cat2, cat3,
> brand. Am I right here?
>
> When we query by or group by cat1, cat2, cat3, there are scans of the dim
> table and calculations for the measures.
>
> Advantage: fewer dimensions, only one cube
> Disadvantage: query time is high
>
> 2. Join the tables before building the cube.
> No need to join in the query phase. Query time is about 1-3 seconds.
>
> Advantage: query time is low
> Disadvantage: more dimensions; need to split one cube into several if necessary
>
>
> From: Yang [via Apache Kylin] [mailto:[email protected]]
> Sent: June 7, 2016 17:53
> To: yubo-ds1 (于渤, Big Data Center, Big Data Platform Department)
> Subject: Re: a simple question about kylin input and output
>
> I'm more curious about the purpose of such a requirement. It's not obvious
> to me.
>
> On Thu, Jun 2, 2016 at 6:55 PM, lidong <[hidden email]> wrote:
>
> > Does Hive view meet your need?
> >
> >
> > Create a Hive view C based on A join B. Then use C as the input, and you
> > get C as the output.
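
A minimal sketch of the view idea, using Python's built-in sqlite3 with an
in-memory database as a stand-in for Hive so that it runs anywhere. The columns
(id, amount, cat1, brand) and the sample rows are assumptions for illustration.
HiveQL also has a CREATE VIEW ... AS SELECT form, but the exact statement
should be checked against the Hive version in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- A: hypothetical fact table, B: hypothetical dim table
    CREATE TABLE A (id INTEGER, amount REAL);
    CREATE TABLE B (id INTEGER PRIMARY KEY, cat1 TEXT, brand TEXT);
    INSERT INTO A VALUES (1, 10.0), (2, 5.0), (1, 2.5);
    INSERT INTO B VALUES (1, 'food', 'acme'), (2, 'toys', 'zeta');

    -- C: a single flat view over A join B
    CREATE VIEW C AS
        SELECT A.id, A.amount, B.cat1, B.brand
        FROM A JOIN B ON A.id = B.id;
""")

# Queries now see one table, C, with the dim columns already attached.
rows = conn.execute(
    "SELECT cat1, SUM(amount) FROM C GROUP BY cat1 ORDER BY cat1"
).fetchall()
conn.close()
```
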
> >
> >
> > Thanks,
> > Dong
> >
> >
> > Original Message
> > Sender: [hidden email]-[hidden email]
> > Recipient: [hidden email]
> > Date: Thursday, Jun 2, 2016 18:28
> > Subject: a simple question about kylin input and output
> >
> >
> > Hi all:
> >
> > We define a fact table and a dim table as input in a cube. After the cube
> > build finishes, we find a fact table and a dim table in the "insight" tab
> > as output, with the same names as we defined.
> >
> > Example: input 2 tables: A, B; output 2 tables: A, B
> >
> > My question is: is there a way to produce only one table, which has joined
> > the fact table and the dim table?
> >
> > Example: input 2 tables: A, B; output 1 table: C
> >
> > Thanks in advance for any hints.
> >
> > Yubo
> >
> > --
> > View this message in context:
> > http://apache-kylin.74782.x6.nabble.com/a-simple-question-about-kylin-input-and-output-tp4784.html
> > Sent from the Apache Kylin mailing list archive at Nabble.com.
> >
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/a-simple-question-about-kylin-input-and-output-tp4784p4848.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
>
