Thanks, will implement something like that. -Patrick
On 25/11/2009 7:46 PM, "Brian Candler" <[email protected]> wrote:

On Mon, Nov 23, 2009 at 01:10:26PM +1100, Patrick Barnes wrote:
> The external data is delivered as ...

Sounds like you need a merge. Taking users as an example:

- have a couchdb view which emits users keyed by username
- sort the incoming feed so that it is also keyed by username
- take the first record from the view and the first record from the feed

Then repeat the following:

- if they have identical usernames, skip to the next record in both the view and the feed
- if the view username < feed username, mark the view record as 'inactive' and advance to the next view record
- if the view username > feed username, create a new user in the database and advance to the next feed record

This solution uses constant RAM and scales indefinitely. Even though a couchdb view generates a single JSON object, you can "stream" it easily, because each record within it is delimited by a newline.

OTOH, if your 200K users can be stored in an 'acceptable' amount of memory, and you don't expect it to grow much larger, you could just read the whole lot into RAM and process it there. At 1K per user you'd use 200MB of RAM, which might be acceptable.
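The merge loop Brian describes could be sketched roughly like this (a minimal Python sketch; the record shape and the `mark_inactive`/`create_user` callbacks are hypothetical stand-ins for the actual CouchDB updates, and both inputs are assumed to be iterables already sorted by username):

```python
def sync(view, feed, mark_inactive, create_user):
    """Sorted-merge sync: `view` is the existing users (from a CouchDB view
    keyed by username), `feed` is the incoming external data, also sorted
    by username. Streams both sides, so memory use is constant."""
    vi, fi = iter(view), iter(feed)
    v, f = next(vi, None), next(fi, None)          # None = end of stream
    while v is not None or f is not None:
        if f is None or (v is not None and v["username"] < f["username"]):
            mark_inactive(v)                       # user vanished from the feed
            v = next(vi, None)
        elif v is None or f["username"] < v["username"]:
            create_user(f)                         # new user in the feed
            f = next(fi, None)
        else:
            # identical usernames: already in sync, advance both sides
            v, f = next(vi, None), next(fi, None)
```

The only extra wrinkle over the three rules above is handling the tails once one stream runs out, which the `None` sentinel covers.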
