Re: replication usage? creating dupes?

Jan Lehnardt Wed, 16 Jul 2008 11:01:42 -0700


On Jul 16, 2008, at 19:06, Damien Katz wrote:

On Jul 16, 2008, at 12:05 PM, Paul Davis wrote:
I haven't really gotten into replication yet, but did I read that
right? The browser request for compaction isn't expected to return
until replication has completed? On the surface of things that seems
fairly ungood. What happens in the future when I have a multigigabyte
database I want to replicate from scratch to a new node over a slow
connection?
If I'm not completely off my rocker, perhaps a better solution is that
the browser request for replication returns immediately and then
couchdb would provide a method for checking on the status of the
replication.
That's an option for the future, but then we'd have to create a replication task monitor infrastructure that can be queried from HTTP. That's additional complexity and overhead.
If you want to fire and forget, then that's an easy option. But if you want to fire it off, and monitor it, and shut it down, then all that has to be written and tested and documented.
Instead, we can just have a synchronous HTTP request and get most of that for free. If the request is alive, you know the replication is still running. If you want to kill it, then terminate the connection. If you want to know when it's done, then simply wait for the completion. We could even send updates about progress of the replication over chunked HTTP.
So for now I think we stick with the simpler design and enhance it to do what we want, until we hit the wall and need to build something bigger.
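
To make the synchronous design concrete, here is a minimal client-side sketch in Python. It assumes the server exposes a `_replicate` resource that accepts a JSON body with `source` and `target` fields, as later CouchDB releases document; the server URL and database names are hypothetical. The caller simply blocks until the server answers, so "request still open" doubles as "replication still running", and closing the connection is the way to abandon it.

```python
import json
import urllib.request


def build_replication_request(server, source, target):
    """Build a POST to the (assumed) _replicate resource of a CouchDB server."""
    body = json.dumps({"source": source, "target": target}).encode("utf-8")
    return urllib.request.Request(
        url=server.rstrip("/") + "/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def replicate_blocking(server, source, target):
    """Send the request and wait for the reply, however long it takes.

    timeout=None puts the socket in blocking mode with no client-side
    timeout, so this call only returns once the server responds or the
    connection is dropped.
    """
    request = build_replication_request(server, source, target)
    with urllib.request.urlopen(request, timeout=None) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `replicate_blocking("http://localhost:5984", "db_a", "http://example.org:5984/db_b")` would then return only when the server has replied. Unlike a browser, a script like this has no built-in request timeout to trip over.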


I agree with Chris and Damien :)

For now it might be good to put a disclaimer into Futon that
says you need to wait for the replication to finish, or not to
use Futon at all for longer replications and use curl et al. instead.

Cmlenz? :)

Cheers
Jan
--

-Damien
Feel free to ignore me if I have this completely wrong.

Paul
On Wed, Jul 16, 2008 at 11:58 AM, Damien Katz <[EMAIL PROTECTED]> wrote:
That problem is likely due to the fact the user HTTP request is timing out while waiting for the replication to complete, which in turn kills the underlying replication process. Restarting the replication will usually help as CouchDB avoids sending the same document twice, but if the replication is exceptionally long it might not get past the point where it is finishing examining the documents.
The problem is it only saves off the replication record once it completes successfully, so until it completes it always examines the same number of documents to see if they exist on the target replica. The fix I need to implement is to have it save off the replication record every x seconds during replication, then if it dies unexpectedly it will pick back up from the last replication record, reducing the number of documents needing to be reexamined.
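
The effect of saving the replication record periodically can be illustrated with a small simulation. This is only a sketch under stated assumptions, not CouchDB's replicator: `source_log`, `record`, `save_every` and `die_after` are hypothetical names, the source is modelled as a list of `(seq, doc_id, doc)` tuples in update order, and the record is saved every N documents rather than every x seconds so that the run is deterministic.

```python
def replicate(source_log, target, record, save_every=3, die_after=None):
    """Copy documents to target, resuming from record["last_seq"].

    source_log: list of (seq, doc_id, doc) tuples sorted by seq
                (hypothetical stand-in for the source's update log).
    target:     dict mapping doc_id to doc (stand-in for the target replica).
    record:     dict with a "last_seq" key (stand-in for the saved
                replication record).
    die_after:  if set, stop after examining this many documents without
                saving a final record, simulating a killed replication.
    Returns the number of documents examined in this run.
    """
    examined = 0
    for seq, doc_id, doc in source_log:
        if seq <= record["last_seq"]:
            continue  # already covered by a previously saved record
        if die_after is not None and examined == die_after:
            return examined  # simulated crash: progress since last save is lost
        examined += 1
        if target.get(doc_id) != doc:
            target[doc_id] = doc  # only send documents the target lacks
        if examined % save_every == 0:
            record["last_seq"] = seq  # periodic save of the replication record
    if source_log:
        record["last_seq"] = max(record["last_seq"], source_log[-1][0])
    return examined
```

With ten documents, `save_every=3` and a first run killed after seven documents, the record is left at sequence 6, so a restart examines four documents instead of all ten. If the record were saved only on successful completion, every restart would begin again from sequence 0.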
Then what we need to solve is the current problem of using a synchronous HTTP request to perform the replication. In Futon, the browser doesn't do the replication, it just sends a single replication request to the CouchDB server. A CouchDB Erlang process then performs the replication, accessing databases either locally or via HTTP on other Erlang servers. Right now, the browser can time out the HTTP request during a long replication, which in turn kills the replication process.
There are two potential solutions here. The first is to send a browser ping to keep the connection alive. Easy to do with HTTP 1.1 I think, just send an empty HTTP chunk. The second is to make it impossible for the broken HTTP request to kill the replication request. They aren't mutually exclusive, but the more I think about it, the more I dislike the second solution.
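
The keep-alive idea can be sketched with plain byte framing. One caveat: in HTTP/1.1 chunked transfer coding a zero-length chunk marks the end of the body, so this sketch sends a one-byte newline chunk as the heartbeat instead of a literally empty one. The function names and the threading approach are illustrative assumptions, not how CouchDB implements it.

```python
import threading


def encode_chunk(data: bytes) -> bytes:
    """Frame data as one HTTP/1.1 chunk: hex length, CRLF, data, CRLF."""
    return b"%x\r\n%s\r\n" % (len(data), data)


# A zero-length chunk followed by an empty trailer terminates the body.
LAST_CHUNK = b"0\r\n\r\n"


def stream_with_keepalive(work, interval=0.05):
    """Yield the bytes of a chunked response body.

    While work() runs in a background thread, emit a newline chunk every
    `interval` seconds so the connection never sits idle. When work()
    returns its bytes, emit them as a chunk and then end the body.
    """
    result = {}

    def run():
        result["body"] = work()

    worker = threading.Thread(target=run)
    worker.start()
    while True:
        worker.join(timeout=interval)
        if not worker.is_alive():
            break
        yield encode_chunk(b"\n")  # heartbeat: keeps the connection active
    yield encode_chunk(result["body"])
    yield LAST_CHUNK
```

Because JSON permits leading whitespace, a client that reads the whole body and then parses it would still see a valid document after any number of newline heartbeats.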

-Damien


On Jul 16, 2008, at 11:13 AM, Chris Anderson wrote:
On Wed, Jul 16, 2008 at 2:18 AM, Jan Lehnardt <[EMAIL PROTECTED]> wrote:
I'm surprised that this wasn't reported earlier. CouchDB replication
is supposed to be reliable (when we got all the bugs out), so an
external replication thing should not be necessary. I would have
guessed that reporting this is easier than writing code to circumvent
the problem. This should be fixed in CouchDB and not worked
around.
My experience with replication has been that it works flawlessly for smaller datasets, and as the dataset grows, it either starts to take so long it may as well be broken (but shows no errors in the log) or occasionally does the =ERROR REPORT==== thing in the log. The latter is a new symptom in my experience.

I haven't had a chance to bring my install up to latest trunk, so I
hesitated to report it. Today's my only sane day for a couple of weeks
on each side, so I'll see what progress I can make.

Chris


--
Chris Anderson
http://jchris.mfdz.com
