Re: Effective partition key for time series data, which allows range queries?

2017-04-05 Thread Jim Ancona
That's an interesting refinement! I'll keep it in mind the next time this sort of thing comes up.

Jim

On Wed, Apr 5, 2017 at 9:22 AM, Eric Stevens wrote:
> Jim's basic model is similar to how we've solved this exact kind of
> problem many times. From my own experience, I

Re: Effective partition key for time series data, which allows range queries?

2017-04-05 Thread Eric Stevens
Jim's basic model is similar to how we've solved this exact kind of problem many times. From my own experience, I strongly recommend that you put a `bucket` field in the partition key and a `time` field in the clustering key. Make both of these columns of type `timestamp`. Then use application
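
A minimal CQL sketch of the layout described here, assuming day-sized buckets and borrowing the placeholder column names from the original question (user id, foo, bar); the table name and types are guesses for illustration:

    CREATE TABLE user_metrics (
        user_id text,
        bucket  timestamp,   -- start of the bucket interval (here one day), computed by the application
        time    timestamp,   -- actual event time
        foo     double,
        bar     double,
        PRIMARY KEY ((user_id, bucket), time)
    ) WITH CLUSTERING ORDER BY (time DESC);

    -- A range query names every bucket that overlaps the requested window,
    -- then range-restricts the clustering column:
    SELECT time, foo, bar FROM user_metrics
     WHERE user_id = 'u1'
       AND bucket IN ('2017-03-27', '2017-03-28')
       AND time >= '2017-03-27 12:00:00'
       AND time <  '2017-03-28 12:00:00';

Because the partition key can only be restricted by equality or IN, the application has to work out which bucket values a query window spans and list them explicitly.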

Re: Effective partition key for time series data, which allows range queries?

2017-04-04 Thread Jim Ancona
The typical recommendation for maximum partition size is on the order of 100 MB and/or 100,000 rows. That's not a hard limit, but you may be setting yourself up for issues as you approach or exceed those numbers. If you need to reduce partition size, the typical way to do this is by "bucketing,"
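
As a rough check against those numbers for the workload in the original question, assuming one row per user per 15-minute interval: that is 4 × 24 = 96 rows per user per day, or roughly 2,900 per month, so even month-sized buckets stay far below the ~100,000-row guideline; what bucketing mainly buys is a cap on how large a single partition can ever grow. A hypothetical write against a table shaped like the sketch above, with the application truncating the event time to the first of the month to form the bucket:

    INSERT INTO user_metrics (user_id, bucket, time, foo, bar)
    VALUES ('u1', '2017-03-01', '2017-03-27 18:15:00', 1.0, 2.0);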

Re: Effective partition key for time series data, which allows range queries?

2017-03-27 Thread Noorul Islam Kamal Malmiyoda
Have you looked at the KairosDB schema? https://kairosdb.github.io/

Regards,
Noorul

On Tue, Mar 28, 2017 at 6:17 AM, Ali Akhtar wrote:
> I have a use case where the data for individual users is being tracked, and
> every 15 minutes or so, the data for the past 15 minutes is

Effective partition key for time series data, which allows range queries?

2017-03-27 Thread Ali Akhtar
I have a use case where the data for individual users is being tracked, and every 15 minutes or so, the data for the past 15 minutes is inserted into the table. The table schema looks like: user id, timestamp, foo, bar, etc., where foo, bar, etc. are the items being tracked, and their values over
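
For reference, the table as described would look something like the following in CQL (names and types are guesses, since the actual schema isn't shown). With only the user id in the partition key, every row a user ever writes lands in the same partition, which is the unbounded growth that the bucketing replies above address:

    CREATE TABLE user_metrics_raw (
        user_id text,
        ts      timestamp,   -- the "timestamp" column from the description
        foo     double,
        bar     double,
        PRIMARY KEY (user_id, ts)
    );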