PRIMARY KEY(dt,sid)) won't work well for your query. PRIMARY KEY(sid, dt)) is much better for time range queries for a particular sensor. In a latter case this query will be translated into efficient range scan. Do not use bigint for timestamp, try to fit it into long or use stringified version in a format suitable for byte-by-byte comparison.
"2015 06/08 05:23:25.345" -Vlad On Mon, Jun 8, 2015 at 2:48 AM, Yiannis Gkoufas <[email protected]> wrote: > Hi there, > > I am investigating Phoenix as a potential data-store for time-series data > on sensors. > What I am really interested in, as a first milestone, is to have efficient > time range queries for a particular sensor. From those queries the results > would consist of 1 or 2 columns (so small rows). > I was looking on some advice about the schema design and indexes. > What I have done so far: > > CREATE TABLE VALUES (dt bigint not null,sid varchar not null, us double, > gen double CONSTRAINT PK PRIMARY KEY(dt,sid)) COMPRESSION='GZ', > SALT_BUCKETS=120 > > loaded the data and performed queries like: > > SELECT DT,US,GEN FROM VALUES WHERE DT>=1369676800000 AND DT <= > 1370000800000 AND SID='ID1' > > Is this the optimal way to go? > > Thanks a lot >
