Hi Nick, all,

After further investigation, I found that the problem was on my side.
I had to port a CouchDB patch from 1.6 to 3.1, and I overlooked the name
of the module used to read a configuration parameter (now called
"config" instead of "couch_config").

With that fixed and CouchDB rebuilt, I can now properly replicate data in
both directions.
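
In case it helps anyone porting a similar patch, here is a rough sketch of
how one might look for leftover calls into the old module. The helper name
`find_stale_config_calls` and the directory argument are hypothetical; only
the "couch_config" to "config" rename comes from the fix described above.

```shell
#!/bin/sh
# Hypothetical helper: list Erlang source lines that still call the old
# couch_config module. The directory to scan is an assumption; pass the
# path of your own patched tree (defaults to the current directory).
find_stale_config_calls() {
    dir="${1:-.}"
    # /dev/null is passed as an extra file so grep always prints the
    # file name, even when find hands it a single source file.
    find "$dir" -name '*.erl' -exec grep -n 'couch_config:' /dev/null {} +
}
```

Each line of output has the form `file:line:matching text`, so any remaining
call sites can be reviewed by hand before rebuilding.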

This brought me to a new authorization error, though, with an undefined
user GETting data. It likely deserves its own thread.

Thanks,
Alan.

On Mon, Feb 28, 2022 at 4:11 PM Alan Malta <alanma...@gmail.com> wrote:

> Hi Nick,
>
> Thank you for the follow-up investigation and questions.
>
> I am in the process of rebuilding my software stack and will try to
> replicate data using the very same CouchDB 3.1.2 version + Erlang 22.
>
> > ... If you get a chance to find and run a remsh script check the output of :
>
> Regarding the Erlang commands that you suggested running, this is the
> output on CouchDB 3.1 + Erlang 22:
> Eshell V10.7.2.11  (abort with ^G)
> 1> crypto:info_lib().
> [{<<"OpenSSL">>,268443839,
>   <<"OpenSSL 1.0.2k-fips  26 Jan 2017">>}]
> 2> ssl:versions().
> [{ssl_app,"9.2"},
>  {supported,['tlsv1.2']},
>  {supported_dtls,['dtlsv1.2']},
>  {available,['tlsv1.3','tlsv1.2','tlsv1.1',tlsv1,sslv3]},
>  {available_dtls,['dtlsv1.2',dtlsv1]}]
>
> while the "old" and fully functional CouchDB 1.6.1 + Erlang 16 gives me
> this:
> Eshell V5.10.4  (abort with ^G)
> 3> ssl:versions().
> [{ssl_app,"5.3.3"},
>  {supported,['tlsv1.2','tlsv1.1',tlsv1,sslv3]},
>  {available,['tlsv1.2','tlsv1.1',tlsv1,sslv3]}]
> 4> crypto:info_lib().
> [{<<"OpenSSL">>,268443839,
>   <<"OpenSSL 1.0.2k-fips  26 Jan 2017">>}]
>
> >  * Another potential issue is that the curl script quotes the parameters with a single quote in:
>
> This shouldn't be a problem. I actually started running those commands
> without the env variable, but decided to update my notes when I was
> copying them over to the gist.
>
> > The logs don't show any stack traces besides the one you indicated in the initial email? Anything with a module name and a line number
>
> There is absolutely nothing around that single error line. Changing the
> log level to debug doesn't help either.
>
> Thank you for these suggestions. I should be back later with some news on
> 3.1.2 <--> 3.1.2 bidirectional replication.
>
> Thanks,
> Alan.
>
> On Sun, Feb 27, 2022 at 4:40 PM Nick Vatamaniuc <vatam...@gmail.com>
> wrote:
>
>> Thanks for the script, Alan.
>>
>> I tried setting up a basic replication between localhost endpoints
>> on Erlang 22 with the 3.2.1 release, and that seems to work:
>> https://gist.github.com/nickva/5a89198c62fdd3ec97693c87833d5738
>>
>> Looking at the differences between our setups I noticed a few things:
>>
>>  * I haven't tried TLS on the endpoints. I wonder if that's the cause.
>> Would you be able to try it locally (or via a VPN) without TLS?
>> Sometimes it's possible to install or build an Erlang release without
>> crypto support, and that only shows up at runtime when something tries
>> to use that functionality. If you get a chance to find and run a remsh
>> script, check the output of:
>>
>> crypto:info_lib().
>> [{<<"OpenSSL">>,269488239,
>>   <<"OpenSSL 1.1.1f  31 Mar 2020">>}]
>>
>> ssl:versions().
>> [{ssl_app,"9.2"},
>>  {supported,['tlsv1.2']},
>>  {supported_dtls,['dtlsv1.2']},
>>  {available,['tlsv1.3','tlsv1.2','tlsv1.1',tlsv1,sslv3]},
>>  {available_dtls,['dtlsv1.2',dtlsv1]}]
>>
>> That would indicate you have the crypto application installed and
>> linked to the openssl library.
>>
>>  * Another potential issue is that the curl script quotes the
>> parameters with a single quote in:
>>
>> curl -X POST http://$USERPASS@localhost:5984/_replicator -d
>> '{"source":"http://$USERPASS@localhost:5984/workqueue_inbox/",
>> "target":"https://$REMOTEHOST/couchdb/workqueue/", "continuous":true}'
>> -H "Content-Type: application/json"
>>
>> That would make the target the literal
>> `https://$REMOTEHOST/couchdb/workqueue/` string without substituting
>> the $REMOTEHOST with its value. That's probably not the reason here
>> but thought I'd mention it just in case.
>>
>>   * `https://$REMOTEHOST/couchdb/workqueue`. I could see the db / url
>> parser being confused by the url path there as the path
>> $REMOTEHOST/couchdb/workqueue could be split up as a database
>> path=$REMOTEHOST/couchdb and then workqueue would be the document, but
>> in this case the workqueue is the database actually. Would you be able
>> to test a setup where the URL path looks like
>> http://domain.name.ext/dbname for an endpoint?
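
A minimal sketch of the single-quote point raised above, under stated
assumptions: the `REMOTEHOST` value and the `build_replication_doc` helper
are placeholders and are not taken from the original script, and credentials
are left out of the source URL. The values are assumed to contain no quotes
or backslashes, since no JSON escaping is done.

```shell
#!/bin/sh
# Placeholder value for illustration only.
REMOTEHOST="remote.example.org"

# Single quotes: the shell passes $REMOTEHOST through literally.
literal='{"target":"https://$REMOTEHOST/couchdb/workqueue/"}'

# Hypothetical helper: printf substitutes its arguments into a
# single-quoted template, so the template holds no shell variables.
build_replication_doc() {
    printf '{"source":"%s","target":"https://%s/couchdb/workqueue/","continuous":true}\n' "$1" "$2"
}

expanded=$(build_replication_doc "http://localhost:5984/workqueue_inbox/" "$REMOTEHOST")

printf '%s\n' "$literal"
printf '%s\n' "$expanded"
```

The first line printed still contains the text `$REMOTEHOST`, while the
second carries the substituted host name. A body built this way could then
be passed to curl with `-d "$expanded"`.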
>>
>> The logs don't show any stack traces besides the one you indicated in
>> the initial email? Anything with a module name and a line number
>> perhaps.
>>
>> Thanks,
>> -Nick
>>
>>
>>
>>
>>
>> On Sat, Feb 26, 2022 at 4:24 PM Alan Malta <alanma...@gmail.com> wrote:
>> >
>> > Hi Nick,
>> >
>> > Thank you for your prompt response.
>> >
>> > Yes, I confirm that CouchDB 3.1.2 is running with Erlang 22, and that
>> > the user and password only contain basic chars a-z.
>> >
>> > I wiped out my whole setup, started from scratch and managed to
>> > reproduce this replication issue with the following set of commands:
>> > https://gist.github.com/amaltaro/67bd133c519300fb82dd0cad372cf1a0
>> >
>> > While reproducing it, I defined only a one-way replication. However, my
>> > previous setup had it bidirectional, and both directions were in a
>> > failed state. I also added some extra checks and information to the
>> > gist above, in case it turns out to be helpful.
>> >
>> > I haven't yet tried to replicate data between two instances running the
>> > same version. The reason is that, during this migration, I believe it
>> > will be impossible to swap all my services to the new CouchDB version
>> > at once, so there will be a period of time (around a month) where I
>> > need to keep this hybrid setup.
>> >
>> > Thank you again!
>> > Alan.
>> >
>> > On Sat, Feb 26, 2022 at 12:26 PM Nick Vatamaniuc <vatam...@gmail.com>
>> wrote:
>> >
>> > > Hi Alan,
>> > >
>> > > Thanks for reaching out.
>> > >
>> > > It looks like CouchDB had failed to parse the replication document,
>> > > and couldn't turn it into a proper replication job.
>> > >
>> > > The 'undef' error could suggest running on an unsupported version of
>> > > Erlang. It's a generic "this function doesn't exist" error in Erlang.
>> > > Are you running on at least Erlang 20?
>> > >
>> > > Does the target url have any unusual characters in it, or something
>> > > that might cause parsing errors (say, ':' or '@' characters for
>> > > example).
>> > >
>> > > Would it be possible to have an example script which fails? Ideally, a
>> > > set of curl commands creating dbs, then the replication job using
>> > > parameters similar to the ones you had.
>> > >
>> > > Cheers,
>> > > -Nick
>> > >
>> > > On Sat, Feb 26, 2022 at 9:29 AM Alan Malta <alanma...@gmail.com>
>> wrote:
>> > > >
>> > > > Hi everyone,
>> > > >
>> > > > After a delay of many years in migrating to (almost) the latest
>> > > > CouchDB version, I started working with CouchDB 3.1.2.
>> > > >
>> > > > My tests with replication to/from the same node/localhost have been
>> > > > successful. But now that I am trying multiple push/pull replications
>> > > > with a remote host, they get into a "failed" state.
>> > > >
>> > > > I just learned about the "_scheduler/jobs" API - and I am likely
>> > > > missing some crucial knowledge here - and when I compare it against
>> > > > the documents in the "_replicator" database, I see an inconsistent
>> > > > definition for either the source or the target database.
>> > > > For instance, the "_scheduler/jobs" gives me the following output
>> > > > for one of the replications:
>> > > >
>> > > > {"database":"_replicator","doc_id":"87463eb82b3e1dcd7a3178276800026e","id":null,"source":"http://admin:*****@localhost:5984/my_db_name/","target":null,"state":"failed","error_count":1,"info":{"error":"{error,undef}"},"start_time":"2022-02-26T13:43:42Z","last_updated":"2022-02-26T13:43:42Z"},
>> > > >
>> > > > while the "_replicator" db lists this document as:
>> > > >
>> > > > {"id":"87463eb82b3e1dcd7a3178276800026e","key":"87463eb82b3e1dcd7a3178276800026e","value":{"rev":"2-590d4eadf029c21303ce77116d2f3f92"},"doc":{"_id":"87463eb82b3e1dcd7a3178276800026e","_rev":"2-590d4eadf029c21303ce77116d2f3f92","source":"http://admin:*****@localhost:5984/my_db_name","target":"https://alanblah.blah.blah/couchdb/wmstats","continuous":true,"filter":"WMStatsAgent/repfilter","owner":"admin","_replication_state":"failed","_replication_state_time":"2022-02-26T13:43:42Z","_replication_state_reason":"{error,undef}"}},
>> > > >
>> > > > In short, the "target" parameter is defined as null in the "jobs"
>> > > > output. Is it because the replication failed somehow?
>> > > >
>> > > > Just in case, this is the only error I see in the couch log
>> > > > regarding that replication - on the node that triggered the
>> > > > replication:
>> > > >
>> > > > [error] 2022-02-26T13:43:42.016495Z couchdb@127.0.0.1 <0.534.0> -------- Error processing replication doc `87463eb82b3e1dcd7a3178276800026e` from `shards/00000000-7fffffff/_replicator.1645882154`: {error,undef}
>> > > >
>> > > > I also wonder if the replication protocol is compatible among
>> > > > different releases of CouchDB? In my case, the target is still on the
>> > > > super old version 1.6.1 while the source is on 3.1.2.
>> > > >
>> > > > Thank you very much for any help that you can provide.
>> > > > Best,
>> > > > Alan.
>> > >
>>
>
