Thank you -- very helpful.

Regarding the limit on the number of dimensions: what are the
compute and storage constraints? For a given query:
* Where is the data stored?
* Which nodes does the computation run on?

I am trying to figure out which part of the cloud-based Kylin deployment
needs to be scaled up if we have a large number of dimensions (I'm doing
the setup from the kylin4_on_cloud branch).
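
To put the dimension question in numbers, here is a minimal sketch. It
assumes a full cube with no pruning, where every subset of the dimensions
is materialized as one cuboid, so n dimensions give 2^n cuboids. The
function name is hypothetical and this is not Kylin code; real cube
designs usually prune combinations (for example with aggregation groups),
so treat the result as a worst-case upper bound.

```python
def full_cube_cuboid_count(num_dimensions: int) -> int:
    """Upper bound on cuboids for an unpruned cube: one per subset of dimensions."""
    if num_dimensions < 0:
        raise ValueError("num_dimensions must be non-negative")
    return 2 ** num_dimensions


# 63 is the dimension limit mentioned in the quoted reply below.
for n in (10, 20, 63):
    print(f"{n} dimensions -> {full_cube_cuboid_count(n)} cuboids (worst case)")
```

The count doubles with every added dimension, which is why the number of
dimensions, rather than the number of measures, tends to drive build time
and storage.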

Thanks, WILL

On Tue, Oct 11, 2022 at 1:20 AM Xiaoxiang Yu <x...@apache.org> wrote:

> 1) The criteria for filtering (e.g. selecting sex='male') and grouping (e.g.
> group by state) should be dimensions - is this correct?
> Yes. Note that Kylin allows at most 63 dimensions, and you should
> be aware of 'the curse of dimensionality'.
>
> 2.1) Items that I would like to sum should be measures, is that right?
> Yes.
>
> 2.2) Is there a limit to the number of measures?
> No, there is no such limit.
>
> 3) Does Kylin support sum(expression)?
> From the MySQL docs
> https://dev.mysql.com/doc/refman/5.7/en/aggregate-functions.html#function_sum
> we know MySQL supports it.
> Kylin should support it in Kylin 3.x and in the future version 5.x.
> Unfortunately, Kylin 4.x, which is the version you are using, does not
> support sum(expression).
>
> 4) Does Kylin support MEDIAN?
>
> Yes, Kylin should support it, but I haven't tested it. Kylin has a
> PERCENTILE measure, and I think the 50th percentile is equal to the
> median, am I right?
>
> --
> *Best wishes to you ! *
> *From :**Xiaoxiang Yu*
>
>
>
> At 2022-10-11 14:03:14, "Will Glass-Husain" <wgl...@forio.com> wrote:
> >Hi,
> >
> >Thanks for the recent help as I set up my first Kylin system. I have a
> >question regarding the proper design of a cube to run some demographic
> >queries. I want to make this accessible in a webapp, with reasonable
> >response time.
> >
> >I have a CSV file with about 80 columns on sex, race, state, age, internet
> >access, job, etc.
> >
> >Can you advise regarding proper cube design?
> >
> >1) The criteria for filtering (e.g. selecting sex='male') and grouping
> >(e.g. group by state) should be dimensions - is this correct?
> >
> >2) Items that I would like to sum should be measures, is that right?   Is
> >there a limit to the number of measures?  I want to report out up to 300
> >different measures aggregated by the dimensions.
> >
> >3)
> >In MySQL, I am querying for different values like this
> >
> >select SUM((married=1) * weight) as MARRIED_1, SUM((married=2) * weight) as
> >MARRIED_2 from data group by state;
> >
> >This returns the total number of weighted records for records where married
> >is 1 and where married is 2.
> >
> >Question: is there a way to do this in a Kylin query, or do I need to
> >pre-compute my weights and create columns MARRIED_1 and MARRIED_2 in the
> >source data, then sum them in Kylin?
> >
> >4) This is a tricky one.  Does Kylin support MEDIAN?   In MySQL, there's no
> >MEDIAN function but we can calculate it by counting all the records, then
> >selecting the record at an offset of half the records.   I want to
> >calculate "median" (not mean) for age and some other variables.
> >
> >Thanks for any tips.
> >
> >Best, WILL
> >
> >--
> >William Glass-Husain   /forio  |  +1 (415) 440 7500 x802  |  forio.com
> ><http://www.forio.com/>
>
>
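
The pre-computation option raised in question 3 of the quoted message
(creating MARRIED_1 and MARRIED_2 in the source data, then summing them as
plain measures) could be sketched as follows. This is only an illustration
under assumptions: the column names married, weight and state come from
the query in the thread, the helper name add_weighted_indicators and the
sample rows are hypothetical, and the output would still need to be loaded
into whatever table the cube is built from.

```python
import csv
import io


def add_weighted_indicators(rows, column, values, weight_column="weight"):
    """Yield each row with one extra column per value.

    The new column holds the row's weight when row[column] equals the
    value and 0.0 otherwise, mirroring SUM((married=1) * weight) in MySQL.
    """
    for row in rows:
        out = dict(row)
        weight = float(row[weight_column])
        for value in values:
            name = f"{column.upper()}_{value}"
            out[name] = weight if row[column] == str(value) else 0.0
        yield out


# Hypothetical sample data standing in for the real CSV file.
sample = io.StringIO("state,married,weight\nCA,1,2.5\nCA,2,1.0\nNY,1,4.0\n")
rows = list(add_weighted_indicators(csv.DictReader(sample), "married", [1, 2]))

for row in rows:
    print(row["state"], row["MARRIED_1"], row["MARRIED_2"])
```

With the indicator columns in place, SUM(MARRIED_1) and SUM(MARRIED_2)
become ordinary sum measures, which avoids the need for sum(expression)
at query time.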

-- 
William Glass-Husain   /forio  |  +1 (415) 440 7500 x802  |  forio.com
<http://www.forio.com/>
