Hello, Yousuf. Thanks for your reply.
We have several million items, with about 10,000 unique 'from' values (about 1,000 items for each). Usually we need to get items for about 500 'from' identifiers, subject to a 'time' limit (about 5% of items match).

On Wed, Jul 25, 2012 at 1:02 PM, Yousuf Fauzan <[email protected]> wrote:

> Hi Andrew,
>
> First of all, the correct answer to your question is the proverbial "it
> depends". Having said that, here is what I could do in your case
>
> 1. If there are enough data points with the same "from" field, I will make
> it a bucket and then index on time.
> 2. If the above is not true, I will index on "from" and "time" field.
>     a. If number of records where "time" is greater than the one you
> require is small, I will run a map/reduce with the initial input as those
> records.
>     b. If number of records having a particular "from" is small, I will do
> the above with the initial input as records having that "from" field. This
> could be a problem as Riak only supports range and exact queries so if you
> want to query multiple identifiers, you will have to run multiple queries.
>     In both the above cases, I will use secondary indexes to get the
> initial records.
>     Note that we are using M/R as Riak does not support querying by
> multiple indexes.
>
> What I would also suggest is to partition your data into different
> buckets. You will need to understand the queries that you will be
> supporting and partition it accordingly.
>
> --
> Yousuf
>
> On Wed, Jul 25, 2012 at 2:50 PM, Andrew Kondratovich <
> [email protected]> wrote:
>
>> Good afternoon.
>>
>> I am considering several storage solutions for my project, and now I look
>> at Riak.
>> We work with the following pattern of data:
>> {
>>  time: unixtime
>>  from: int
>>  data: binary
>>  ...
>> }
>>
>> The amount of data is about several millions items for now, but it's
>> growing. It is necessary to handle the following requests: for a list of
>> identifiers (about 500 items) return all records where id = from and time
>> greater than a certain value.
>>
>> How to store such data and to effectively handle such requests with the
>> Riak?
>>
>> Thanks.
>>
>> --
>> Andrew Kondratovich
>>
>>
>> _______________________________________________
>> riak-users mailing list
>> [email protected]
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>>
>

--
Andrew Kondratovich
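
For concreteness, the access pattern discussed in this thread can be sketched as below. This is only an in-memory Python illustration, not Riak client code: the class `FromTimeIndex` and the function `fetch` are hypothetical names standing in for "an index on time within each 'from'" and "one range query per identifier, merged by the caller". It assumes each item is a dict with the 'from', 'time' and 'data' fields from the original message.

```python
import bisect
from collections import defaultdict


class FromTimeIndex:
    """Toy stand-in for a per-'from' index ordered by 'time' (not a Riak API)."""

    def __init__(self):
        self._times = defaultdict(list)  # from_id -> sorted list of times
        self._items = defaultdict(list)  # from_id -> items, same order as _times

    def add(self, item):
        # Keep both lists sorted by time so range lookups can use bisect.
        times = self._times[item["from"]]
        pos = bisect.bisect_right(times, item["time"])
        times.insert(pos, item["time"])
        self._items[item["from"]].insert(pos, item)

    def newer_than(self, from_id, min_time):
        """Range query for one identifier: items with time > min_time."""
        times = self._times.get(from_id, [])
        start = bisect.bisect_right(times, min_time)
        return self._items.get(from_id, [])[start:]


def fetch(index, from_ids, min_time):
    """Issue one range query per identifier and merge the results."""
    result = []
    for from_id in from_ids:
        result.extend(index.newer_than(from_id, min_time))
    return result


# Small usage example with made-up data.
index = FromTimeIndex()
for from_id, time in [(1, 100), (1, 200), (2, 150), (3, 300), (1, 50)]:
    index.add({"from": from_id, "time": time, "data": b""})

matches = fetch(index, from_ids=[1, 3], min_time=100)
```

With about 500 identifiers per request this means about 500 range lookups, which is the "multiple queries" cost mentioned in the quoted reply.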
