Thanks Jim. While performance tuning of the queries will definitely help, I
would also like to know if there is a general practice for querying metrics
with multiple dimensions. For example, for a given metric like LTV, I have
about 65 different ways to segment it, based on combinations of product/sku
and time interval.
On Tue, Oct 18, 2016 at 9:37 AM, Jim Apple <jbap...@cloudera.com> wrote:
> This might help:
> On Tue, Oct 18, 2016 at 12:30 AM, Buntu Dev <buntu...@gmail.com> wrote:
> > I have a table of user purchases and subscriptions with various product skus
> > along with user attributes in a single table (~1g and 20M rows).
> > Due to the number of combinations for slicing and dicing the data, it takes
> > a while to query for churn, retention, etc. on the dataset for the various
> > periods and product skus selected, which makes it not ideal for the frontend.
> > Generating a precomputed table with all the combinations is pretty
> > exhausting, so I'm looking to see if there are any best practices in
> > designing a schema to overcome these issues.
> > Thanks!