Re: Data Model Suggestion

Skanda Tue, 23 Jun 2015 06:28:51 -0700

Hi Pari

For your use case, keeping the hour as part of the row key (your first
schema) should be a better design than creating 24 columns per day.
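
As a rough sketch of that layout in Phoenix (the table name AD_IMPRESSIONS,
the HOUR_ID spelling, and the column types are assumptions for illustration,
not taken from your mail):

```sql
-- Hypothetical table: one row per (ad, country, hour).
-- The three key columns together form the HBase row key, in this order.
CREATE TABLE IF NOT EXISTS AD_IMPRESSIONS (
    ADID     BIGINT  NOT NULL,
    COUNTRY  VARCHAR NOT NULL,
    HOUR_ID  BIGINT  NOT NULL,   -- e.g. 2015062301 (yyyyMMddHH)
    CF.IMP   BIGINT,             -- impression count in column family CF
    CONSTRAINT PK PRIMARY KEY (ADID, COUNTRY, HOUR_ID)
);

-- Hourly counts per country for ad 1 on 2015-06-23.
SELECT COUNTRY, HOUR_ID, CF.IMP
FROM AD_IMPRESSIONS
WHERE ADID = 1
  AND HOUR_ID >= 2015062300 AND HOUR_ID < 2015062400;

-- Daily count per country for the same ad and day.
SELECT COUNTRY, SUM(CF.IMP) AS DAILY_IMP
FROM AD_IMPRESSIONS
WHERE ADID = 1
  AND HOUR_ID >= 2015062300 AND HOUR_ID < 2015062400
GROUP BY COUNTRY;
```

With the three sample rows from your mail, the daily query would give 6421
for US and 1212 for UK. A weekly report is the same query with a wider
HOUR_ID range.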


Regards
Skanda

On Tue, Jun 23, 2015 at 6:49 PM, Pariksheet Barapatre <[email protected]>
wrote:

> Hello All,
>
> This is more of an HBase question, but as I am planning to use Phoenix as
> an access layer, I hope Phoenix users can help me.
>
> I would like to create time series data to get on-the-fly analytics.
>
> This use case is for adTech.
>
> Report - what are the hourly, daily, and weekly impression counts at
> country level for a given advertisement ID (ADID)?
>
> I am doing hourly aggregation and loading into a Phoenix table.
>
> Primary Key - *ADID          | COUNTRY       | HOUR ID*
>
>
> ------------------------------------------
> ADID | COUNTRY | HOUR ID    | CF.IMP
> ------------------------------------------
> 1    | US      | 2015062301 | 3000
> 1    | US      | 2015062302 | 3421
> 1    | UK      | 2015062302 | 1212
> ------------------------------------------
>
> Is this a good schema design, or should I create an alternate schema as
> below?
> Primary Key - *ADID          | COUNTRY       | DAY ID*
>
> ------------------------------------------------------
> ADID | COUNTRY | DAY ID   | CF.IMP01 | CF.IMP02 | ...
> ------------------------------------------------------
> 1    | US      | 20150623 | 3000     | 3421     |
> 1    | UK      | 20150623 | NULL     | 1212     |
> ------------------------------------------------------
> Here, I have taken the hour part from the hour ID and created 24 columns.
>
> I have gone through many time-series NoSQL blog posts, and most of the
> authors suggest going with wider rows as above. This will reduce the number
> of rows scanned, but I don't see much difference between the two data
> models in terms of scan latency.
>
> Can anybody please suggest a good approach for my use case?
>
>
> Cheers,
> Pari
>
