> Hallo, > > I have a large amount of time based events to aggregate, with a finite > list of possible events. > the events are logged one at the time: > > timestamp_1 : event_1 > timestamp_2 : event_2 > timestamp_3 : event_1 > ... > > My idea is to prepare the data before to import them in order to get > one separate column per event > along with the desired time aggregation > > > timestamp : houroffset : is_event_1 :is_event_2 : is_event_3 > > t1 132 True null null > t2 132 null True null > t3 133 True null null > t4 134 null null True > > > > .... > > With this model, I can easily count my events per time periode: > > > select > count(is_event_1) as C1, > count(is_event_2) as C2, > count(is_event_3) as C3 > from foo > group by houroffset. > > (the real model is more compex as they are different categories and > volatile properties associated with the events) > > and now my questions: > --------------------- > > - Will the above query acces the "is_event columns", or get the > informmation only from the nulls bitmap within the row headers ? > > > - How does the splitting of the "event" information affect the row > headers ? Should I rather define a single event column and aggregate > my data using "case when else end" clauses ? > > > - is a where clause "where is_event_x is not null" more performant > than "where is_event_x is true" as the null bitmap can thoretically be > used. > > > - in my model, I couldn't find a way to build an index that could be > used to query a single event: > > select ... > from foo > where is_event_1=true (|| is_event_1 is not null) > > I first imagined an index like create index foo_i on foo > ((is_event_1 IS NOT NULL), (is_event_2 IS NOT NULL),... > But it is apparently not used when I have a single event in my > where clause. > > A solution would be to add a further column in my table containing > the event_id and to index it: > > timestamp : houroffset : is_event_1 :is_event_2 : is_event_3 : > event_id > > t1 132 True null null > 1 > t2 132 null True null > 2 > t3 133 True null null > 1 > t4 134 null null True > 3 > > but this extra column would be redundant with the nulls bitmap. Is > there a way to avoid this duplication of information ? > > > - is there a performance gain at all when columns that are seldom used > are placed at the end of the rows ? > > > - I 'll also be thankful for any comments and critics on my model. > > > > cheers, > > Marc