Hi. Look at https://github.com/dreyk/litetsdb.
It's based on riak_core and has the following features:
- Saves sequential data such as time series.
- Batch load operations.
- Range data scans.
- Data replication like riak_kv, but simpler (no vector clocks or other additional metadata on your data); last write wins.
- Range-based partitioning.
- Uses LevelDB as the backend.
- Automatic data expiration option.

To store your data you must define your own bucket module (for example https://github.com/dreyk/etsdb/blob/master/src/etsdb_tkb.erl), in which you must define:
- A partitioning function.
- Serialization and deserialization functions.
- A data scan function.

In etsdb_tkb, data is partitioned by time interval: all data with the same Time div TimeRange is stored in the same place. You can choose another strategy.

We use this database in production to store data of the form id (long) - time (long) - value (byte array). The average load in production is 1000 samples per second. Our benchmark (we used 3 nodes) shows a maximum throughput of nearly 200000 per second; for comparison, Cassandra in our benchmark had a throughput of 250000 per second but became unreachable after the first minute. Litetsdb was more stable under high load.

If you're interested, I will help you with configuration and testing.

----- Original Message -----
From: "Carter Charbonneau" <[email protected]>
To: [email protected]
Sent: Wednesday, 1 May 2013 19:34:38
Subject: Time series system on top of riak-core.

I want to store lots of time series data in a database. The data can be organized into measurements which have values at specific times. There will be a *lot* of data, but it doesn't need to be accessed very often. The time value will most likely be something like the seconds since the Unix epoch. However, more precision would be useful. The time will be of a fixed size.

I want to store the data in such a way that losing a node with r=1 will only lose data at specific intervals. I'm thinking of doing this by choosing the vnode to store the data on by time % num_vnodes and building on riak-core.
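The two partitioning schemes mentioned in this thread (Time div TimeRange in etsdb_tkb, and time % num_vnodes in the original question) behave quite differently, which a minimal Python sketch may make clearer. The function names and the values 3600 and 64 are made up for illustration; this is not code from either project, and a real riak_core application would hash a key onto the ring rather than use the numbers directly.

```python
def interval_key(timestamp: int, time_range: int) -> int:
    """Time div TimeRange: samples in the same interval share one key."""
    return timestamp // time_range


def modulo_vnode(timestamp: int, num_vnodes: int) -> int:
    """time % num_vnodes: consecutive timestamps land on different vnodes."""
    return timestamp % num_vnodes


# Four consecutive one-second samples (hypothetical values).
samples = [1367419556 + i for i in range(4)]

# With a one-hour interval (assumed) all four samples share a key,
# so a range scan over them touches one place.
interval_keys = [interval_key(t, 3600) for t in samples]

# With 64 vnodes (assumed) the same samples are spread over four vnodes,
# so losing one vnode loses every 64th second rather than a contiguous range.
vnodes = [modulo_vnode(t, 64) for t in samples]

print(interval_keys)
print(vnodes)
```

The trade-off is roughly this: the interval scheme keeps a time range together, which suits range scans, while the modulo scheme spreads the loss from a failed node thinly across the whole series.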
I have an idea for the actual storage format. Each measurement would be stored separately. The data would be grouped into files of a specific size. The data would be grouped into blocks so that each block is a good size for compression. Each block would be compressed with

The directory would look like:
  1367419556.keys -> Keys file for time 1367419556 until the next file (1367464288).
  1367419556.data -> Data file for time 1367419556 until the next file (1367464288).
  1367419556.updates -> Updates/deletes/random inserts for the data. Searched first.
  1367464288.keys
  1367464288.data

The data file would be | time | value_size | value | repeated for each time/value and grouped into compressed blocks. The keys file would contain | time of first item in block | offset | repeated for each block.

To get a value at a given time, first the file it is contained in would be found. Next, the update file would be searched if it exists. Then the keys file would be read. For each block start time in the keys file, if it is greater than the requested time, the previous block would be opened and read until the time is found. Range support will be necessary to achieve reasonable speed.

To insert a key at the end, it would be buffered until a full block's worth of items has been buffered. Then the data would be written as a block to the current keys and data files.

To update, delete, or insert a key that is not at the end, it would be written to an update file for the file in which it is stored or should be placed. The update file can be processed periodically and integrated into the keys and data files. Downsampling of the data and keys files (i.e. increasing the block size) could also be done at the same time the updates are integrated.

This may be silly, but it seems to me like it would be efficient for storing lots of time series data. Do you think this is realistic for storing time series data?
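The lookup path described above (updates first, then the keys file, then one block of the data file) can be sketched in Python roughly as follows. Several things here are assumptions rather than part of the proposal: the field widths (8-byte time, 4-byte value_size), the use of zlib for block compression, keeping the "files" in memory, representing the update file as a dict, and using a binary search over the keys instead of the linear scan described in the text (the block it finds is the same).

```python
import bisect
import struct
import zlib

# | time | value_size | header; the 8-byte/4-byte widths are assumptions.
HEADER = struct.Struct(">QI")


def encode_block(records):
    """Pack (time, value) pairs as | time | value_size | value | and compress."""
    raw = b"".join(HEADER.pack(t, len(v)) + v for t, v in records)
    return zlib.compress(raw)  # zlib is a stand-in compressor


def decode_block(blob):
    """Inverse of encode_block: return the list of (time, value) pairs."""
    raw = zlib.decompress(blob)
    records, pos = [], 0
    while pos < len(raw):
        t, size = HEADER.unpack_from(raw, pos)
        pos += HEADER.size
        records.append((t, raw[pos:pos + size]))
        pos += size
    return records


def build_files(records, block_size):
    """Return (keys, data): keys holds (first_time, offset) per block."""
    keys, data = [], bytearray()
    for i in range(0, len(records), block_size):
        block = records[i:i + block_size]
        keys.append((block[0][0], len(data)))
        data += encode_block(block)
    return keys, bytes(data)


def lookup(keys, data, updates, t):
    """Find the value at time t, or None if it is absent."""
    if t in updates:  # the update file is searched first
        return updates[t]
    first_times = [first for first, _ in keys]
    i = bisect.bisect_right(first_times, t) - 1  # last block starting <= t
    if i < 0:
        return None
    start = keys[i][1]
    end = keys[i + 1][1] if i + 1 < len(keys) else len(data)
    for rt, value in decode_block(data[start:end]):
        if rt == t:
            return value
        if rt > t:
            break
    return None


records = [(1367419556 + i, str(i).encode()) for i in range(10)]
keys, data = build_files(records, block_size=4)
print(lookup(keys, data, {}, 1367419561))
```

Only one block is decompressed per lookup, which is the main point of keeping a separate keys file; a range query would decode each block whose time span overlaps the range.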
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
