So, the short answer is: you can't. The long answer is that I was figuring the
same problem and I have found the following solution.
Every time a user vote, the application saves a new document having doc.type ==
'vote'. The 'vote' document has a property called 'recorded', that is really
important.
The application uses a design document called 'votes', where is stored a view
called 'notRecorded', that contains just a map function like this one:
function($doc) {
if ((doc.type == 'vote') and (!doc.recorded))
emit(doc.timestamp, doc);
}
As you can see, there is not a reduce function, since we need to mark those
documents as recorded, we'll reduce them later on our own.
There is a cron job that runs, in my case, a PHP script. You can choose to run
the script every minute, every five minutes, etc. The script queries the view
defined above to get all the 'not recorded' votes.
Once we have all the 'not recorded' votes, we use a simple function to reduce
them. In my case the function reduces them by postId. The function returns an
associative array with the sum of the 'not recorded' votes, using as keys the
post identifiers.
Finally the same script queries another view of another design document
scores/perPost, using as keys the same post identifiers. This is the map
function:
function(doc)) {
if (doc.type == 'score')
emit(doc.postId, doc);
}
In the end, the script updates every single document of type 'score' retrieved
before, adding to the score.total the total of new voted counted. Then the
script updates every single 'vote' document marking it as recorded. So we'll
never count them anymore. There is no race condition here, no conflicts at all.
To get the posts ordered by score you just need to query another view, let's
call it posts/byScore (ddoc/viewName) like this:
function(doc)) {
if (doc.type == 'score')
emit(score.total, doc.postId);
}
I suggest to use _bulk_docs API, within all_or_nothing option, to be sure to
change all the documents as an atomic operation.
I don't know if you are using PHP, but my library has some interesting
features. First of all, you can process a query result chunk by chunk (I use
this approach to reduce votes), and you can also returns null rows in case some
post doesn't still have a score, because you never registered votes. This
simplify a lot your job.
This solution counts just the new votes. In a new vote is generated in the
meantime, nothing bad is gonna happen, no conflict is generated. You just
operate on a strict set of posts, just the one have new votes. You'll just
change the score related to those documents. That's all. :-)
So 3 views and 3 document types: post, vote, score. In your case you'll have
user instead of post, but the mechanic remains the same.
Bye
-Filippo
On Oct 11, 2013, at 5:09 PM, Tom van Neerijnen wrote:
> Hi all
>
> I'm developing a ranking service I'd like to throw scores for multiple
> users at and later be able to retrieve a list of users ranked/ordered by
> their scores. Problem is if I make it time series data and accumulate the
> total score in a reduce I can't put the score in the key to sort by.
>
> Does anyone have suggestions as to how I could re-jig my documents and/or
> M/R to accommodate this kind of output or should I simply consider storing
> a total document along side all my time series documents? My concern with
> storing a total document is I can never be sure the total in the document
> exactly matches my M/R over my time series data because I'd have to have an
> app query the view and, using the result of the view query, update the
> total document, so that's a last resort.
>
> I considered sorting in a list function but we're potentially looking at
> millions of users so that's not really an option.
>
> I'm imagining time series documents like so:
> "abcdefghi" : {
> "userId": "jim",
> "scoreChange": 10,
> "country": "GB",
> "game": "solitaire",
> "date": "2013-10-08T13:50:19"
> }
>
> "jklmnopqr" : {
> "userId": "jim",
> "scoreChange": -5,
> "country": "GB",
> "game": "solitaire",
> "date": "2013-10-08T13:51:19"
> }
>
> "jklmnopqr" : {
> "userId": "james",
> "scoreChange": 4,
> "country": "GB",
> "game": "freecell",
> "date": "2013-10-08T13:52:19"
> }
>
> I could then have this M/R for, as an example, ranks by game then user:
> "by_game": {
> "map": "function(doc) { emit([doc.game, doc.userId], doc.scoreChange)}",
> "reduce": "function(keys, values, rereduce) { return sum(values); }"
> }
>
> and get this list from it:
> {"key":["freecell","james"],"value":4}
> {"key":["solitaire","jim"],"value":5},
>
> Thanks