On Apr 14, 2008, at 02:34, Ralf Nieuwenhuijsen wrote:
Well, that doesn't really apply. I am not looking for a way to create
unique documents. I'm looking for a way to get a view with only unique
documents.
Imagine some portion of all the documents having the key 'adres'. Then
I want a list of unique addresses: a view with only the adres keys for
documents that have it, and then only unique entries.
It seems I can currently solve this problem in two ways:
- creating a separate adres document that stores an array of all unique
  addresses. But without any sane default merging behavior, this breaks
  at replication.
- creating a separate document for _each_ adres, using PUT with the md5
  of the adres as the doc-id. This seems like an enormous waste of
  space, especially since I will be doing this with almost every key in
  every document.
In the future this should be doable with the reduce/combinator behavior,
I expect. But even there, I think the suggested approach is too
limiting. The reducer is going to return one JSON object. I would rather
have it emit (key, value) and use default view operations on it for
stuff like pagination.
Using the above example and assuming the reducer is implemented, how do
I get the X most used addresses? The value of X needs to be hard-coded
with the suggested implementation, whereas using emit(key, value) in the
reducer as well would allow for pagination.
I might be totally off here, but the reduce function actually does
return only one key-value pair per key for the view:
map: /* _id = md5(address) */
  function(doc) {
    emit(doc._id, 1);
  }
produces:
abc | 1
abc | 1
def | 1
xyz | 1
yyy | 1
yyy | 1
yyy | 1
for fictional _id values.
reduce:
  function(keys, values) {
    var sum = 0;
    for (var i = 0; i < values.length; i++) {
      sum += values[i];
    }
    return sum;
  }
produces:
abc | 2
def | 1
xyz | 1
yyy | 3
as the output of the view, which can be paginated just as easily as the
list that map alone produces. This gives you a count for all addresses
but not yet a sorted list. I've got to think about that one a bit more.
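
To try the map and reduce functions above without a CouchDB instance,
here is a minimal local sketch in plain JavaScript. The runView and topX
helpers and the explicit emit parameter are assumptions made for
illustration, not CouchDB APIs. Sorting the reduced rows by count on the
client is just one possible way to get the X most used addresses; it is
not something the thread settles on.

```javascript
// Local simulation only: CouchDB itself is not involved here.
// The repeated _id values mirror the fictional rows from the message.
var docs = [
  { _id: "abc" }, { _id: "abc" }, { _id: "def" }, { _id: "xyz" },
  { _id: "yyy" }, { _id: "yyy" }, { _id: "yyy" }
];

// The map function from the message, with emit passed in explicitly
// (an assumption, so that it can run outside the view server).
function map(doc, emit) {
  emit(doc._id, 1);
}

// The reduce function from the message: sums the values for one key.
function reduce(keys, values) {
  var sum = 0;
  for (var i = 0; i < values.length; i++) {
    sum += values[i];
  }
  return sum;
}

// Hypothetical helper: run map over all docs, group the emitted rows
// by key, and reduce each group to a single { key, value } row.
function runView(allDocs) {
  var groups = {};
  allDocs.forEach(function (doc) {
    map(doc, function (key, value) {
      if (!Object.prototype.hasOwnProperty.call(groups, key)) {
        groups[key] = [];
      }
      groups[key].push(value);
    });
  });
  return Object.keys(groups).sort().map(function (key) {
    return { key: key, value: reduce([key], groups[key]) };
  });
}

// Hypothetical helper: sort rows by count (descending, ties by key)
// and keep the first x, where x is chosen by the caller at query time.
function topX(rows, x) {
  return rows.slice().sort(function (a, b) {
    if (b.value !== a.value) return b.value - a.value;
    return a.key < b.key ? -1 : a.key > b.key ? 1 : 0;
  }).slice(0, x);
}

var rows = runView(docs);
console.log(rows);          // one row per unique key, with its count
console.log(topX(rows, 2)); // the two most frequent keys
```

Because x is an argument to topX rather than part of the reduce
function, it does not have to be hard-coded in the view.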
Cheers
Jan
--
Greetings,
Ralf
2008/4/13, Chris Anderson <[EMAIL PROTECTED]>:
Ralf,
If you use an algorithm to generate a deterministic _id for records
before PUT-ing them to CouchDB, you can ensure that each unique record
only appears once in the database. This discussion might be relevant
for you:
http://mail-archives.apache.org/mod_mbox/incubator-couchdb-user/200803.mbox/[EMAIL PROTECTED]
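
A minimal sketch of the deterministic _id idea, assuming Node.js and
its built-in crypto module. The addressId and addressDoc helpers, the
trim/lowercase normalization, and the choice of md5 (taken from Ralf's
message) are all assumptions for illustration; any stable hash of the
record's identifying content would serve the same purpose.

```javascript
var nodeCrypto = require("crypto");

// Hypothetical helper: derive a stable _id from the address text.
// The normalization (trim + lowercase) is an assumption, so that
// trivially different spellings map to the same id.
function addressId(address) {
  var normalized = address.trim().toLowerCase();
  return nodeCrypto.createHash("md5").update(normalized, "utf8").digest("hex");
}

// Hypothetical helper: build the document that would be PUT to the
// database under /<db>/<_id>. No HTTP request is made in this sketch.
function addressDoc(address) {
  return { _id: addressId(address), adres: address };
}

var first = addressDoc("Main Street 1");
var second = addressDoc("  main street 1 ");
var other = addressDoc("Side Street 2");

console.log(first._id === second._id); // same address, same _id
console.log(first._id === other._id);  // different address, different _id
```

Since both spellings of the same address produce the same _id, a
second PUT would target the existing document instead of creating a
duplicate.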
Chris
On Sat, Apr 12, 2008 at 8:43 PM, Ralf Nieuwenhuijsen <[EMAIL PROTECTED]> wrote:
Is it possible to create a view with only unique records?
I assume it would be possible using the future reduce/combinator
behavior?
In what time frame is the reduce behavior planned?
Greetings,
Ralf
--
Chris Anderson
http://jchris.mfdz.com