> Hallo,
> 
> I have a large amount of time based events to aggregate, with a finite
> list of possible events.
> the events are logged one at the time:
> 
> timestamp_1 : event_1
> timestamp_2 : event_2
> timestamp_3 : event_1
> ...
> 
> My idea is to prepare the data before to import them in order to get
> one separate column per event
> along with the desired time aggregation
> 
> 
> timestamp : houroffset : is_event_1 :is_event_2 : is_event_3
> 
> t1                 132         True        null         null
> t2                 132         null        True         null
> t3                 133         True        null         null
> t4                 134         null        null         True
> 
> 
> 
> ....
> 
> With this model,  I can easily count my events per time periode:
> 
> 
> select 
>   count(is_event_1) as C1,
>   count(is_event_2) as C2,
>   count(is_event_3) as C3
> from foo
> group by houroffset.
> 
> (the real model is more compex as they are different categories and
> volatile properties associated with the events)
> 
> and now my questions:
> ---------------------
> 
> - Will the above query acces the "is_event columns", or get the
> informmation only from the nulls bitmap within the row headers ?
> 
> 
> - How does the splitting of the "event" information affect the row
> headers ?  Should  I rather define a single event column and aggregate
> my data using "case when else end" clauses ?
> 
> 
> - is a where clause "where is_event_x is not null" more performant
> than "where is_event_x is true" as the null bitmap can thoretically be
> used.
> 
> 
> - in my model, I couldn't find a way to build an index that could be
> used to query a single event:
> 
>    select ...
>    from foo
>    where is_event_1=true (|| is_event_1 is not null)
> 
>   I first imagined an index like create index foo_i on foo
> ((is_event_1 IS NOT NULL),  (is_event_2  IS NOT NULL),...
>   But it is apparently not used  when I have a single event in my
> where clause.
>   
>   A solution would be to add a further column in my table containing
> the event_id and to index it:
>   
>   timestamp : houroffset : is_event_1 :is_event_2 : is_event_3 :
> event_id
>   
>   t1                 132         True        null         null
> 1
>   t2                 132         null        True         null
> 2
>   t3                 133         True        null         null
> 1
>   t4                 134         null        null         True
> 3
>   
>   but this extra column would be redundant with the nulls bitmap. Is
> there a way to avoid this duplication of information ?
>   
> 
> - is there a performance gain at all when columns that are seldom used
> are placed at the end of the rows ?
>   
>   
> - I 'll also be thankful for any comments and critics on my model.  
> 
> 
> 
> cheers,
> 
> Marc

Reply via email to