On Apr 14, 2008, at 17:19, Jan Lehnardt wrote:
On Apr 14, 2008, at 02:34, Ralf Nieuwenhuijsen wrote:
Well, that doesn't really apply. I am not looking for a way to create
unique documents.
I'm looking for a way to get a view with only unique documents.
Imagine some portion of all the documents having the key 'adres'.
Then I want a list of unique addresses: a view with only the 'adres'
values of the documents that have that key, and then only unique entries.
It seems I can currently solve this problem in two ways:
- Creating a separate 'adres' document that stores an array of all unique
  addresses. But without any sane default merging behavior, this breaks at
  replication.
- Creating a separate document for _each_ address, using PUT with the MD5
  of the address as the doc ID. This seems like an enormous waste of space,
  especially since I will be doing this with almost every key in every
  document.
In the future this should be doable with the reduce/combinator behavior, I
expect. But even there, I think the suggested approach is too limiting. The
reducer is going to return one JSON object. I would rather have it emit
(key, value) pairs and use the default view operations on them for things
like pagination.
Using the above example, and assuming the reducer is implemented: how do I
get the X most used addresses? The value of X needs to be hard-coded with the
suggested implementation, whereas using emit(key, value) in the reducer as
well would allow for pagination.
I might be totally off here, but the reduce function actually returns only
one key-value pair per distinct key in the view:
map: /* _id = md5(address) */
function(doc) {
  emit(doc._id, 1);
}
produces:
abc | 1
abc | 1
def | 1
xyz | 1
yyy | 1
yyy | 1
yyy | 1
for fictional _id values.
reduce:
function(keys, values) {
  // Sum the 1s emitted by the map function for this key.
  var sum = 0;
  for (var i = 0; i < values.length; i++) {
    sum += values[i];
  }
  return sum;
}
produces:
abc | 2
def | 1
xyz | 1
yyy | 3
as the output of the view, which can be paginated just as easily as
the list that map alone produces. This gives you a count for all
addresses but not yet a sorted list. I've got to think about that one a
bit more.
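To make the grouping step concrete, here is a minimal sketch in plain
JavaScript that simulates the map and reduce functions above outside of
CouchDB. The names runView, mapFn and reduceFn, the emit callback, and the
sample documents are all made up for illustration; this only mimics the
idea and says nothing about how CouchDB implements views internally.

```javascript
// Hypothetical in-memory simulation of a map/reduce view; not CouchDB's API.

// Same map as above, with emit passed in explicitly (an assumption of this sketch).
function mapFn(doc, emit) {
  emit(doc._id, 1);
}

// Same reduce as above: sum the values emitted for one key.
function reduceFn(keys, values) {
  var sum = 0;
  for (var i = 0; i < values.length; i++) {
    sum += values[i];
  }
  return sum;
}

function runView(docs) {
  // Map step: collect every emitted (key, value) row.
  var rows = [];
  docs.forEach(function (doc) {
    mapFn(doc, function (key, value) {
      rows.push({ key: key, value: value });
    });
  });

  // Sort rows by key so that equal keys end up next to each other.
  rows.sort(function (a, b) {
    return a.key < b.key ? -1 : a.key > b.key ? 1 : 0;
  });

  // Group adjacent rows with the same key.
  var groups = [];
  rows.forEach(function (row) {
    var last = groups[groups.length - 1];
    if (last && last.key === row.key) {
      last.values.push(row.value);
    } else {
      groups.push({ key: row.key, values: [row.value] });
    }
  });

  // Reduce step: one output row per distinct key.
  return groups.map(function (group) {
    var keys = group.values.map(function () { return group.key; });
    return { key: group.key, value: reduceFn(keys, group.values) };
  });
}

// Fictional _id values, repeated as in the listing above.
var docs = [
  { _id: "abc" }, { _id: "yyy" }, { _id: "def" }, { _id: "abc" },
  { _id: "xyz" }, { _id: "yyy" }, { _id: "yyy" }
];

var result = runView(docs);
result.forEach(function (row) {
  console.log(row.key + " | " + row.value);
});
```

The result has one row per distinct key, ordered by key, which is why it
can be paginated like any other list of rows.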
I checked back with Damien and we can't do that now. You'd need to
collate that reduce result in your application or use Lucene or some
other technology to do that for you.
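Collating in the application could look roughly like the sketch below. It
assumes the reduce output has already been fetched and parsed into an array
of {key, value} rows; the function name topKeys and the sample rows are
hypothetical, and how you fetch the rows is left out on purpose.

```javascript
// Hypothetical client-side collation; rows stand in for parsed reduce output.
function topKeys(rows, x) {
  // Copy first so the caller's array keeps its key order.
  return rows.slice().sort(function (a, b) {
    // Highest count first; break ties by key so the order is stable.
    if (b.value !== a.value) {
      return b.value - a.value;
    }
    return a.key < b.key ? -1 : a.key > b.key ? 1 : 0;
  }).slice(0, x);
}

var rows = [
  { key: "abc", value: 2 },
  { key: "def", value: 1 },
  { key: "xyz", value: 1 },
  { key: "yyy", value: 3 }
];

// The two most used keys: yyy (3), then abc (2).
var top2 = topKeys(rows, 2);
top2.forEach(function (row) {
  console.log(row.key + " | " + row.value);
});
```

Because X is just an argument here, it does not need to be hard-coded, but
the whole reduce result has to be pulled into the application before it can
be sorted.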
Cheers
Jan