Re: Create a view with only unique records

Jan Lehnardt Mon, 14 Apr 2008 08:20:40 -0700


On Apr 14, 2008, at 02:34, Ralf Nieuwenhuijsen wrote:

Well, that doesn't really apply. I am not looking for a way to create unique
documents.
I'm looking for a way to get a view with only unique documents.
Imagine some portion of all the documents having the key 'adres'.
Then I want a list of unique addresses; a view with only the 'adres' keys for
documents that have it, and then only unique entries.

It seems currently I can solve this problem in two ways:
- creating a separate adres document that stores an array of all unique addresses. But without any sane default merging behavior, this breaks at
replication.
- creating a separate document for _each_ adres using PUT and the md5 of the adres as doc-id. This seems like an enormous waste of space. Especially
since I will be doing this with almost every key in every document.
In the future this should be doable with the reduce/combinator behavior, I expect. But even there, I think the suggested approach is too limiting. The reducer is going to return one JSON object. I would rather have it emit(key,value) and use default view operations on it for stuff like pagination.
Using the above example and assuming the reducer is implemented: how to get the X most used addresses? The value of X needs to be hard-coded with the suggested implementation. Whereas using emit(key,value) in the reducer as
well would allow for pagination.

I might be totally off here, but the reduce function actually does only return one key-value pair for the view:


map: /* _id = md5(address) */
function(doc) {
  emit(doc._id, 1);
}

produces:

abc | 1
abc | 1
def | 1
xyz | 1
yyy | 1
yyy | 1
yyy | 1

for fictional _id values.

reduce:
function(keys, values) {
  var sum = 0;
  for(var i in values) {
    sum += values[i];
  }

  return sum;
}

produces:

abc | 2
def | 1
xyz | 1
yyy | 3

as the output of the view, which can be paginated just as easily as the list that map alone produces. This gives you a count for all addresses but not yet a sorted list. Got to think about that one a bit more.
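
To make the counting step concrete, here is a minimal sketch in plain JavaScript that simulates it outside CouchDB. It is only an illustration under assumptions: the rows are the fictional ones from above, and countByKey and topAddresses are hypothetical helper names, not part of any CouchDB API. It also shows one possible way to get the X most used addresses by sorting the counts on the client side.

```javascript
// Plain JavaScript simulation; this does not call CouchDB.
// Rows as the map step above would emit them: [key, value].
var rows = [
  ["abc", 1], ["abc", 1], ["def", 1], ["xyz", 1],
  ["yyy", 1], ["yyy", 1], ["yyy", 1]
];

// Same summing logic as the reduce function above.
function reduce(keys, values) {
  var sum = 0;
  for (var i = 0; i < values.length; i++) {
    sum += values[i];
  }
  return sum;
}

// Hypothetical helper: group values by key, then reduce each group.
function countByKey(rows) {
  var groups = {};
  rows.forEach(function (row) {
    (groups[row[0]] = groups[row[0]] || []).push(row[1]);
  });
  return Object.keys(groups).map(function (key) {
    return [key, reduce(null, groups[key])];
  });
}

// Hypothetical helper: sort by count, highest first, and keep x entries.
function topAddresses(rows, x) {
  return countByKey(rows)
    .sort(function (a, b) { return b[1] - a[1]; })
    .slice(0, x);
}
```

Calling countByKey(rows) gives one [key, count] pair per distinct key, matching the table above, and topAddresses(rows, 2) keeps only the two highest counts. The sorting here happens after the view output is fetched, so X does not have to be hard-coded in the reduce function.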


Cheers
Jan
--

Greetings,
Ralf

2008/4/13, Chris Anderson <[EMAIL PROTECTED]>:


Ralf,

If you use an algorithm to generate a deterministic _id for records
before PUT-ing them to CouchDB, you can ensure that each unique record
only appears once in the database. This discussion might be relevant
for you:


http://mail-archives.apache.org/mod_mbox/incubator-couchdb-user/200803.mbox/[EMAIL
 PROTECTED]
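
As a rough sketch of the deterministic _id idea, assuming Node.js and its built-in crypto module (the thread does not say which environment is used), an md5-based id could be derived like this. The function name addressId is hypothetical.

```javascript
// Assumption: running under Node.js, which ships the "crypto" module.
var crypto = require("crypto");

// Hypothetical helper: the same address always yields the same id,
// so PUT-ing it twice targets the same document instead of creating two.
function addressId(address) {
  return crypto.createHash("md5").update(address, "utf8").digest("hex");
}
```

The returned hex string would then be used as the document's _id when PUT-ing the record.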

Chris


On Sat, Apr 12, 2008 at 8:43 PM, Ralf Nieuwenhuijsen
<[EMAIL PROTECTED]> wrote:

Is it possible to create a view with only unique-records?
I assume it would be possible using the future reduce/combinator
behavior?

What time frame is planned for the reduce behavior?

Greetings,
Ralf





--
Chris Anderson
http://jchris.mfdz.com
