Re: CouchDB 2.0: breaking the backward compatibility

Paul Davis Tue, 15 Jul 2014 00:25:36 -0700

+1 to zlib/base64 strings as long as we make sure that the string
format is trivially and directly parseable (i.e., no regex required
for parsing).
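
To illustrate what "trivially parseable" could look like, here is a rough
Python sketch against the readable format quoted below
("node,range,seq" entries joined by ":"). Everything in it is an
assumption for discussion, not a proposed implementation: the function
names parse_seq, pack_seq and unpack_seq are made up, it assumes node
names never contain "," or ":", and it picks zlib plus standard base64
as one possible encoding.

```python
import base64
import zlib


def parse_seq(seq):
    """Split a readable seq string into (node, start, end, update_seq) tuples.

    Uses only str.split, no regex. Assumes (hypothetically) that node
    names never contain ',' or ':'.
    """
    shards = []
    for part in seq.split(":"):
        node, shard_range, update_seq = part.split(",")
        start, end = shard_range.split("-")
        shards.append((node, start, end, int(update_seq)))
    return shards


def pack_seq(seq):
    """zlib-compress the readable string, then base64 it to ASCII text."""
    return base64.b64encode(zlib.compress(seq.encode("utf-8"))).decode("ascii")


def unpack_seq(packed):
    """Reverse pack_seq: base64-decode, then zlib-decompress."""
    return zlib.decompress(base64.b64decode(packed)).decode("utf-8")


if __name__ == "__main__":
    readable = "couchdb@foo,000-ccc,12:couchdb@bar,d00-fff,10"
    print(parse_seq(readable))
    print(unpack_seq(pack_seq(readable)) == readable)
```

Any client with a zlib and a base64 library could unpack this for
diagnostics, which is the property term_to_binary does not give us.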


On Mon, Jul 14, 2014 at 4:30 PM, Robert Samuel Newson
<[email protected]> wrote:
> Hrm, no, I think those would remain as non_neg_integer(), but the fate of 
> single-node is not yet determined.
>
> B.
>
> On 14 Jul 2014, at 19:36, Joan Touzet <[email protected]> wrote:
>
>> Yes to all 3 of your questions.
>>
>> Would we mandate this format even for single-node? We could "fake it" as if 
>> it was a q=1 db, i.e.
>>
>> "seq":"couchdb@foo,000-fff,12"
>>
>> If you want to allow replication from 1.x hosts you'd have to be more 
>> lenient ("should" not "must").
>>
>> ----- Original Message -----
>> From: "Robert Samuel Newson" <[email protected]>
>> To: [email protected]
>> Sent: Monday, July 14, 2014 10:34:21 AM
>> Subject: Re: CouchDB 2.0: breaking the backward compatibility
>>
>>
>> Another thought occurs; BigCouch has a different format for update_seq that 
>> is notoriously ugly.
>>
>> We obviously need to encode more information for a sharded cluster (the 
>> update_seq of each shard, the range that it’s for and the node it resides 
>> on) but BigCouch also had to be compatible with CouchDB 1.x installations, 
>> so we add the sum of update sequences on the front, to trick the old 
>> replicator into working.
>>
>> While it is not necessary for a human to be able to read an update_seq value 
>> (i.e., you should treat it as opaque JSON), it’s often useful to unpack these 
>> for diagnostic purposes. Our use of term_to_binary prevents non-Erlang 
>> libraries from doing so.
>>
>> I propose we fix that in 2.0 which would require that the replicator 
>> checkpoints every N updates, and not when it sees the current update_seq 
>> exceed some delta from the update_seq of the last checkpoint.
>>
>> An example readable format would be;
>>
>> "seq":"couchdb@foo,000-ccc,12:couchdb@bar,d00-fff,10"
>>
>> that is, a formatted string.
>>
>> A few questions;
>>
>> 1) Should we obscure hostnames? (We could run them through sha1 or 
>> even pbkdf2, akin to how known_hosts is protected by ssh.)
>> 2) Should we gzip encode the result?
>> 3) Should we base64 the result?
>>
>> I think "yes" to all questions (and we would obviously have to base64 if we 
>> gzipped).
>>
>> Thoughts?
>>
>> B.
>>
>>
>> On 13 Jul 2014, at 21:23, Paul Davis <[email protected]> wrote:
>>
>>> Changing the default responses for conflicts to include all versions
>>> (or no version).
>>>
>>> Fix the list API (inside couchjs) so that it's a pure callback like
>>> everything else. Not sure if we should necessarily completely revamp
>>> the whole query server protocol for 2.0. Given that it's not user
>>> facing I'm less inclined to think it needs to be in a major release,
>>> i.e., we could add a new protocol in a minor release after 2.0.
>>>
>>> We should rename _rev to _mvcc (or _token or _lock or anything not
>>> _rev) finally.
>>>
>>> Removing all metadata from document bodies has been an oft requested change.
>>>
>>> Seems like there was a list of these things floating around a long time ago.
>>>
>>> On Sun, Jul 13, 2014 at 3:17 PM, Robert Samuel Newson
>>> <[email protected]> wrote:
>>>>
>>>> Since we follow semantic versioning, the only meaning behind naming our 
>>>> next release 2.0 and not 1.7 is that it contains backwards incompatible 
>>>> changes.
>>>>
>>>> It’s for the CouchDB community as a whole to determine what is and isn’t 
>>>> in a release. Certainly merging in bigcouch and rcouch are a huge part of 
>>>> the 2.0 release, but they aren’t necessarily the only things. If they 
>>>> hadn’t changed the API in incompatible ways, they wouldn’t cause a major 
>>>> version bump.
>>>>
>>>> With that said then, I’m interested in hearing what else, besides the two 
>>>> merges, we feel we want to take on in our first major revision bump in 
>>>> approximately forever? At minimum, I would like to see a change that 
>>>> allows us to use versions of spidermonkey released after 1.8.5, whatever 
>>>> that change might be.
>>>>
>>>> B.
>>>>
>>>> On 13 Jul 2014, at 20:31, Joan Touzet <[email protected]> wrote:
>>>>
>>>>> Improving the view server protocol is a great idea, but is it appropriate
>>>>> for a 2.0 timeframe? I would think it would make more sense in a 3.0
>>>>> timeframe, given 2.0 is all about merging forks, not writing new features
>>>>> entirely from scratch.
>>>>>
>>>>> -Joan
>>>>>
>>>>> ----- Original Message -----
>>>>> From: "Robert Samuel Newson" <[email protected]>
>>>>> To: [email protected]
>>>>> Sent: Sunday, July 13, 2014 8:52:40 AM
>>>>> Subject: Re: CouchDB 2.0: breaking the backward compatibility
>>>>>
>>>>>
>>>>> Adding mvcc for _security is a great idea (happily, Cloudant have done so 
>>>>> very recently, so I will be pulling that work over soon).
>>>>>
>>>>> A better view server protocol is also a great idea.
>>>>>
>>>>>
>>>>> On 13 Jul 2014, at 13:13, Samuel Williams 
>>>>> <[email protected]> wrote:
>>>>>
>>>>>>
>>>>>> On 13/07/14 23:47, Alexander Shorin wrote:
>>>>>>> Our view server compiles functions on each view index update
>>>>>>> instead of reusing its inner cache. This is because of the outdated
>>>>>>> protocol: other design functions work differently from views. While
>>>>>>> it would be good to change and improve the query server protocol
>>>>>>> completely, that task requires more time. We should at least have a
>>>>>>> plan B to take small steps in a good direction.
>>>>>> As already suggested, here is my proposal for 2.0 view/query server:
>>>>>>
>>>>>> https://docs.google.com/document/d/1JtfvCpNB9pRQyLhS5KkkEdJ-ghSCv89xnw5HDMTCsp8/edit
>>>>>>
>>>>>> I welcome people to suggest improvements/changes/ideas.
>>>>>>
>>>>>> Kind regards,
>>>>>> Samuel
>>>>>
>>>>
>>
>
