Re: [DISCUSS] length restrictions in 4.0

2020-05-12 Thread Robert Samuel Newson
deal.

> On 12 May 2020, at 22:50, Nick Vatamaniuc  wrote:
> 
> Ok, fine, let's not worry about the 238 limit then, I'll yield :-)
> 
> But, if we are going by file system file name limits as a guide, in
> general, most of them are 255 not 256, so let's go with that.
> 
> https://en.wikipedia.org/wiki/Comparison_of_file_systems
> 
> On Tue, May 12, 2020 at 5:29 PM Robert Samuel Newson  
> wrote:
>> 
>> You'd have to replicate "back" and adjust the target db name to fit. It 
>> doesn't feel like a terrible hardship.
>> 
>>> On 12 May 2020, at 21:54, Joan Touzet  wrote:
>>> 
>>> I presume the workaround would be "Replicate back to CouchDB 3.x, but 
>>> truncate to 236 characters in the process?" You'd lose fidelity in the db 
>>> name that way.
>>> 
>>> -Joan
>>> 
>>> On 2020-05-12 4:05 p.m., Robert Newson wrote:
>>>> I still don’t understand how the internal shard database name format has 
>>>> any bearing on our public interface, present or future.
>> 



[DISCUSS] _changes feed on database partitions

2020-05-12 Thread Adam Kocoloski
Hi all,

When we introduced partitioned databases in 3.0 we declined to add a 
partition-specific _changes endpoint, because we didn’t have a prebuilt index 
that could support it. It sounds like the lack of that endpoint is a bit of a 
drag. I wanted to start this thread to consider adding it.

Note: this isn’t a fully-formed proposal coming from my team with a plan to 
staff the development of it. Just a discussion :)

In the simplest case, a _changes feed could be implemented by scanning the 
by_seq index of the shard that hosts the named partition. We already get some 
efficiencies here: we don’t need to touch any of the other shards of the 
database, and we have enough information in the by_seq btree to filter out 
documents from other partitions without actually retrieving them from disk, so 
we can push the filter down quite nicely without a lot of extra processing. 
It’s just a very cheap binary prefix pattern match on the docid.

Most consumers of the _changes feed work incrementally, and we can support that 
here as well. It’s not like we need to do a full table scan on every 
incremental request.
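
A minimal sketch of the scan described above, not CouchDB's actual code. It assumes the shard's by_seq index can be modelled as two parallel lists sorted by sequence number, and that documents in a partitioned database have ids of the form "partition:key". The name partition_changes and its parameters are hypothetical.

```python
import bisect
from typing import Iterator, List, Tuple


def partition_changes(
    seqs: List[int],
    doc_ids: List[str],
    partition: str,
    since: int = 0,
) -> Iterator[Tuple[int, str]]:
    """Yield (seq, doc_id) for one partition, newer than `since`.

    `seqs` and `doc_ids` stand in for the by_seq btree of a single shard:
    seqs[i] is the update sequence of doc_ids[i], sorted ascending.
    """
    prefix = partition + ":"
    # Incremental requests jump straight past everything already seen,
    # so an old `since` is the only case that scans the whole index.
    start = bisect.bisect_right(seqs, since)
    for i in range(start, len(seqs)):
        # The filter is a plain prefix match on the doc id; no document
        # body has to be read to decide whether the row belongs here.
        if doc_ids[i].startswith(prefix):
            yield seqs[i], doc_ids[i]


seqs = [1, 2, 3, 4, 5]
doc_ids = ["a:1", "b:1", "a:2", "ab:1", "a:3"]
print(list(partition_changes(seqs, doc_ids, "a", since=3)))
# [(5, 'a:3')]
```

Including the ":" separator in the prefix keeps partition "a" from matching ids in partition "ab".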

If the shard is hosting so many partitions that this filter is becoming a 
bottleneck, resharding (also new in 3.0) is probably a good option. Partitioned 
databases are particularly amenable to increasing the shard count. Global 
indexes on the database become more expensive to query, but those ought to be a 
smaller percentage of queries in this data model.

Finally, if the overhead of filtering out non-matching partitions is just too 
high, we could support the use of user-created indexes, e.g. by having a user 
create a Mango index on _local_seq. If such an index exists, our “query 
planner” uses it for the partitioned _changes feed. If not, resort to the scan 
on the shard’s by_seq index as above.

I’d like to do some basic benchmarking, but I have a feeling the by_seq scan 
will work quite well in the majority of cases, and the user-defined index is a 
good “escape valve” if we need it. WDYT?

Adam

Re: [DISCUSS] length restrictions in 4.0

2020-05-12 Thread Nick Vatamaniuc
Ok, fine, let's not worry about the 238 limit then, I'll yield :-)

But, if we are going by file system file name limits as a guide, in
general, most of them are 255 not 256, so let's go with that.

https://en.wikipedia.org/wiki/Comparison_of_file_systems

On Tue, May 12, 2020 at 5:29 PM Robert Samuel Newson  wrote:
>
> You'd have to replicate "back" and adjust the target db name to fit. It 
> doesn't feel like a terrible hardship.
>
> > On 12 May 2020, at 21:54, Joan Touzet  wrote:
> >
> > I presume the workaround would be "Replicate back to CouchDB 3.x, but 
> > truncate to 236 characters in the process?" You'd lose fidelity in the db 
> > name that way.
> >
> > -Joan
> >
> > On 2020-05-12 4:05 p.m., Robert Newson wrote:
> >> I still don’t understand how the internal shard database name format has 
> >> any bearing on our public interface, present or future.
>


Re: [DISCUSS] length restrictions in 4.0

2020-05-12 Thread Robert Samuel Newson
You'd have to replicate "back" and adjust the target db name to fit. It doesn't 
feel like a terrible hardship.

> On 12 May 2020, at 21:54, Joan Touzet  wrote:
> 
> I presume the workaround would be "Replicate back to CouchDB 3.x, but 
> truncate to 236 characters in the process?" You'd lose fidelity in the db 
> name that way.
> 
> -Joan
> 
> On 2020-05-12 4:05 p.m., Robert Newson wrote:
>> I still don’t understand how the internal shard database name format has any 
>> bearing on our public interface, present or future.



Re: [DISCUSS] length restrictions in 4.0

2020-05-12 Thread Joan Touzet
I presume the workaround would be "Replicate back to CouchDB 3.x, but 
truncate to 236 characters in the process?" You'd lose fidelity in the 
db name that way.


-Joan

On 2020-05-12 4:05 p.m., Robert Newson wrote:

> I still don’t understand how the internal shard database name format has any 
> bearing on our public interface, present or future.



Re: [DISCUSS] length restrictions in 4.0

2020-05-12 Thread Robert Samuel Newson
Taking another swing at this.

The 238 choice would mean that any valid 4.0 db name can be replicated back to 
< 4.0. A longer name could not be because, to succeed, those versions would 
have to make internal shard names that exceed common filesystem lengths.

Fine, I accept that. Backward compatibility is a tricky balance. 

My position is that 256 is a sensible limit for db name. It matches many 
filesystem limits, it is quite generous, and it is a small fraction of the fdb 
key / value limit. 

On compatibility, with 256 limit, you can replicate a < 4.0 db to 4.0 with the 
same name and back again (since any existing < 4.0 db must have successfully 
made internal shards). I think that is enough backward compatibility.

I don't have a final say here, so please everybody that cares chime in now.

My vote is for a 256-character limit for 4.0 onward, non-configurable (all 
CouchDB 4.0 installs, and anything claiming compatibility with it, supporting 
and enforcing this limit for onward replication compatibility).
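
A minimal sketch of what a fixed, non-configurable limit could look like, assuming the limit counts characters rather than encoded bytes (the thread uses both words). The names DB_NAME_MAX_CHARS and validate_db_name are hypothetical and not part of CouchDB.

```python
# Fixed limit from the vote above. It is a module constant on purpose:
# there is no config lookup, so every installation enforces the same value.
DB_NAME_MAX_CHARS = 256


def validate_db_name(name: str) -> str:
    """Return `name` unchanged, or raise ValueError if it is too long.

    Assumption: length is measured in characters (Python code points),
    not in bytes of any particular encoding.
    """
    if len(name) > DB_NAME_MAX_CHARS:
        raise ValueError(
            f"database name is {len(name)} characters; "
            f"the limit is {DB_NAME_MAX_CHARS}"
        )
    return name


validate_db_name("a" * 256)      # accepted
try:
    validate_db_name("a" * 257)  # rejected
except ValueError as err:
    print(err)
```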

B.

> On 12 May 2020, at 21:05, Robert Newson  wrote:
> 
> I still don’t understand how the internal shard database name format has any 
> bearing on our public interface, present or future. 
> 
> -- 
>  Robert Samuel Newson
>  rnew...@apache.org
> 
> On Tue, 12 May 2020, at 19:52, Nick Vatamaniuc wrote:
>> I still like it. It's only 18 bytes difference but it introduces one
>> more compatibility issue. At least for 4.x, it would be nice to have
>> less of those and we can always increase it later. But if other
>> participants think it's too nitpick-y and odd I am happy to go with
>> 256.
>> 
>> -Nick
>> 
>> On Tue, May 12, 2020 at 9:24 AM Robert Samuel Newson  
>> wrote:
>>> 
>>> Sorry to let this thread drop.
>>> 
>>> Nick, are you still preferring 238?
>>> 
>>> B.
>>> 
>>>> On 4 May 2020, at 21:06, Robert Samuel Newson  wrote:
>>>> 
>>>> Ah, ok, understood. I don't think that's a compelling reason to fix our 
>>>> maximum database name length at 238.
>>>> 
>>>> CouchDB 4.0 will be the first version of CouchDB where we're not coupled 
>>>> to the filesystem for this list. 256 is very common for a filesystem 
>>>> filename length limit (though not universal) so I don't think our history 
>>>> should dictate an odd (fine, _even_) choice of 238.
>>>> 
>>>> B.
>>>> 
 
> On 4 May 2020, at 20:41, Nick Vatamaniuc  wrote:
> 
> It will prevent replicating from db created in 4.0 which has a name
> longer than 238 (say 250) back to 2.x/3.x if the user intends to keep
> the same database name on both systems, that's what I meant.
> 
> On Mon, May 4, 2020 at 3:15 PM Robert Samuel Newson  
> wrote:
>> 
>> The 'timestamp in filename' is only on the internal shards, which would 
>> not be part of a replication between 2.x/3.x and 4.x.
>> 
>> In any case, Nick is suggesting lowering from 256 chars to 238 chars to 
>> leave room for these things that won't be there. I confess I don't 
>> understand the reasoning.
>> 
>> B.
>> 
>>> On 4 May 2020, at 20:04, Joan Touzet  wrote:
>>> 
>>> I suspect he means when replicating back to a 3.x or 2.x cluster.
>>> 
>>> On 2020-05-04 3:03 p.m., Robert Samuel Newson wrote:
>>>> But we don't need to add a file extension or a timestamp to database 
>>>> names.
>>>> B.
> On 4 May 2020, at 18:42, Nick Vatamaniuc  wrote:
> 
> Hello everyone,
> 
> Good idea, +1 with one minor tweak: database name length in versions
> <4.0 was restricted by the maximum file name on whatever file system
> the server was running on. In practice that was 255, then there is an
> extension and a timestamp in the filename which made the db name limit
> be 238 so I suggest to use that instead.
> 
> -Nick
> 
> On Mon, May 4, 2020 at 11:51 AM Robert Samuel Newson 
>  wrote:
>> 
>> Hi,
>> 
>> I think I speak for many in accepting the risk that we're excluding 
>> doc ids formed from 4096-bit RSA signatures.
>> 
>> I don't think I made it clear but I think these should be fixed 
>> limits (i.e, not configurable) in order to ensure inter-replication 
>> between couchdb installations wherever they are.
>> 
>> B.
>> 
>>> On 4 May 2020, at 10:52, Ilya Khlopotov  wrote:
>>> 
>>> Hello,
>>> 
>>> Thank you Robert for starting this important discussion. I think 
>>> that the values you propose make sense.
>>> I can see a case when user would use hashes as document ids. All 
>>> existent hash functions I am aware of should return data which fit 
>>> into 512 characters. There is only one case which doesn't fit into 
>>> 512 limit. If user would decide to use RSA signatures as document 
>>> ids and they use 4096-bit keys the signature size would be 

Re: [DISCUSS] length restrictions in 4.0

2020-05-12 Thread Robert Newson
I still don’t understand how the internal shard database name format has any 
bearing on our public interface, present or future. 

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Tue, 12 May 2020, at 19:52, Nick Vatamaniuc wrote:
> I still like it. It's only 18 bytes difference but it introduces one
> more compatibility issue. At least for 4.x, it would be nice to have
> less of those and we can always increase it later. But if other
> participants think it's too nitpick-y and odd I am happy to go with
> 256.
> 
> -Nick
> 
> On Tue, May 12, 2020 at 9:24 AM Robert Samuel Newson  
> wrote:
> >
> > Sorry to let this thread drop.
> >
> > Nick, are you still preferring 238?
> >
> > B.
> >
> > > On 4 May 2020, at 21:06, Robert Samuel Newson  wrote:
> > >
> > > Ah, ok, understood. I don't think that's a compelling reason to fix our 
> > > maximum database name length at 238.
> > >
> > > CouchDB 4.0 will be the first version of CouchDB where we're not coupled 
> > > to the filesystem for this list. 256 is very common for a filesystem 
> > > filename length limit (though not universal) so I don't think our history 
> > > should dictate an odd (fine, _even_) choice of 238.
> > >
> > > B.
> > >
> > >
> > >> On 4 May 2020, at 20:41, Nick Vatamaniuc  wrote:
> > >>
> > >> It will prevent replicating from db created in 4.0 which has a name
> > >> longer than 238 (say 250) back to 2.x/3.x if the user intends to keep
> > >> the same database name on both systems, that's what I meant.
> > >>
> > >> On Mon, May 4, 2020 at 3:15 PM Robert Samuel Newson  
> > >> wrote:
> > >>>
> > >>> The 'timestamp in filename' is only on the internal shards, which would 
> > >>> not be part of a replication between 2.x/3.x and 4.x.
> > >>>
> > >>> In any case, Nick is suggesting lowering from 256 chars to 238 chars 
> > >>> to leave room for these things that won't be there. I confess I don't 
> > >>> understand the reasoning.
> > >>>
> > >>> B.
> > >>>
> >  On 4 May 2020, at 20:04, Joan Touzet  wrote:
> > 
> >  I suspect he means when replicating back to a 3.x or 2.x cluster.
> > 
> >  On 2020-05-04 3:03 p.m., Robert Samuel Newson wrote:
> > > But we don't need to add a file extension or a timestamp to database 
> > > names.
> > > B.
> > >> On 4 May 2020, at 18:42, Nick Vatamaniuc  wrote:
> > >>
> > >> Hello everyone,
> > >>
> > >> Good idea, +1 with one minor tweak: database name length in versions
> > >> <4.0 was restricted by the maximum file name on whatever file system
> > >> the server was running on. In practice that was 255, then there is an
> > >> extension and a timestamp in the filename which made the db name 
> > >> limit
> > >> be 238 so I suggest to use that instead.
> > >>
> > >> -Nick
> > >>
> > >> On Mon, May 4, 2020 at 11:51 AM Robert Samuel Newson 
> > >>  wrote:
> > >>>
> > >>> Hi,
> > >>>
> > >>> I think I speak for many in accepting the risk that we're excluding 
> > >>> doc ids formed from 4096-bit RSA signatures.
> > >>>
> > >>> I don't think I made it clear but I think these should be fixed 
> > >>> limits (i.e, not configurable) in order to ensure inter-replication 
> > >>> between couchdb installations wherever they are.
> > >>>
> > >>> B.
> > >>>
> >  On 4 May 2020, at 10:52, Ilya Khlopotov  wrote:
> > 
> >  Hello,
> > 
> >  Thank you Robert for starting this important discussion. I think 
> >  that the values you propose make sense.
> >  I can see a case when user would use hashes as document ids. All 
> >  existent hash functions I am aware of should return data which fit 
> >  into 512 characters. There is only one case which doesn't fit into 
> >  512 limit. If user would decide to use RSA signatures as document 
> >  ids and they use 4096-bit keys the signature size would be 
> >  684 bytes.
> > 
> >  However in this case users can easily replace signatures with 
> >  hashes of signatures. So I wouldn't worry about it too much. 512 
> >  sounds plenty to me.
> > 
> >  +1 to set hard limits on db name size and doc id size with 
> >  proposed values.
> > 
> >  Best regards,
> >  iilyak
> > 
> >  On 2020/05/01 18:36:45, Robert Samuel Newson  
> >  wrote:
> > > Hello,
> > >
> > > There are other threads related to doc size (etc) limits for 
> > > CouchDB 4.0, motivated by restrictions in FoundationDB, but we 
> > > haven't discussed database name length and doc id length limits. 
> > > These are encoded into FoundationDB keys and so we would be wise 
> > > to forcibly limit their length from the start.
> > >
> > > I propose 256 character limit for database name and 512 character 
> > > 

Re: [DISCUSS] length restrictions in 4.0

2020-05-12 Thread Nick Vatamaniuc
I still like it. It's only 18 bytes difference but it introduces one
more compatibility issue. At least for 4.x, it would be nice to have
less of those and we can always increase it later. But if other
participants think it's too nitpick-y and odd I am happy to go with
256.
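
The arithmetic behind 238 and the 18-byte difference can be sketched as follows. This assumes the shard file name in < 4.0 has the form "<dbname>.<timestamp>.couch" with a 10-digit timestamp; the timestamp value and the helper name legacy_db_name_limit are illustrative only.

```python
FS_NAME_MAX = 255                 # common file name limit on most file systems
EXTENSION = ".couch"              # 6 characters
TIMESTAMP_SUFFIX = ".1589300000"  # "." plus a 10-digit timestamp, 11 characters


def legacy_db_name_limit(fs_name_max: int = FS_NAME_MAX) -> int:
    """Longest db name whose shard file name still fits in `fs_name_max`."""
    return fs_name_max - len(TIMESTAMP_SUFFIX) - len(EXTENSION)


print(legacy_db_name_limit())        # 238
print(256 - legacy_db_name_limit())  # 18
```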

-Nick

On Tue, May 12, 2020 at 9:24 AM Robert Samuel Newson  wrote:
>
> Sorry to let this thread drop.
>
> Nick, are you still preferring 238?
>
> B.
>
> > On 4 May 2020, at 21:06, Robert Samuel Newson  wrote:
> >
> > Ah, ok, understood. I don't think that's a compelling reason to fix our 
> > maximum database name length at 238.
> >
> > CouchDB 4.0 will be the first version of CouchDB where we're not coupled to 
> > the filesystem for this list. 256 is very common for a filesystem filename 
> > length limit (though not universal) so I don't think our history should 
> > dictate an odd (fine, _even_) choice of 238.
> >
> > B.
> >
> >
> >> On 4 May 2020, at 20:41, Nick Vatamaniuc  wrote:
> >>
> >> It will prevent replicating from db created in 4.0 which has a name
> >> longer than 238 (say 250) back to 2.x/3.x if the user intends to keep
> >> the same database name on both systems, that's what I meant.
> >>
> >> On Mon, May 4, 2020 at 3:15 PM Robert Samuel Newson  
> >> wrote:
> >>>
> >>> The 'timestamp in filename' is only on the internal shards, which would 
> >>> not be part of a replication between 2.x/3.x and 4.x.
> >>>
> >>> In any case, Nick is suggesting lowering from 256 chars to 238 chars to 
> >>> leave room for these things that won't be there. I confess I don't 
> >>> understand the reasoning.
> >>>
> >>> B.
> >>>
>  On 4 May 2020, at 20:04, Joan Touzet  wrote:
> 
>  I suspect he means when replicating back to a 3.x or 2.x cluster.
> 
>  On 2020-05-04 3:03 p.m., Robert Samuel Newson wrote:
> > But we don't need to add a file extension or a timestamp to database 
> > names.
> > B.
> >> On 4 May 2020, at 18:42, Nick Vatamaniuc  wrote:
> >>
> >> Hello everyone,
> >>
> >> Good idea, +1 with one minor tweak: database name length in versions
> >> <4.0 was restricted by the maximum file name on whatever file system
> >> the server was running on. In practice that was 255, then there is an
> >> extension and a timestamp in the filename which made the db name limit
> >> be 238 so I suggest to use that instead.
> >>
> >> -Nick
> >>
> >> On Mon, May 4, 2020 at 11:51 AM Robert Samuel Newson 
> >>  wrote:
> >>>
> >>> Hi,
> >>>
> >>> I think I speak for many in accepting the risk that we're excluding 
> >>> doc ids formed from 4096-bit RSA signatures.
> >>>
> >>> I don't think I made it clear but I think these should be fixed 
> >>> limits (i.e, not configurable) in order to ensure inter-replication 
> >>> between couchdb installations wherever they are.
> >>>
> >>> B.
> >>>
>  On 4 May 2020, at 10:52, Ilya Khlopotov  wrote:
> 
>  Hello,
> 
>  Thank you Robert for starting this important discussion. I think 
>  that the values you propose make sense.
>  I can see a case when user would use hashes as document ids. All 
>  existent hash functions I am aware of should return data which fit 
>  into 512 characters. There is only one case which doesn't fit into 
>  512 limit. If user would decide to use RSA signatures as document 
>  ids and they use 4096-bit keys the signature size would be 
>  684 bytes.
> 
>  However in this case users can easily replace signatures with hashes 
>  of signatures. So I wouldn't worry about it too much. 512 sounds 
>  plenty to me.
> 
>  +1 to set hard limits on db name size and doc id size with proposed 
>  values.
> 
>  Best regards,
>  iilyak
> 
>  On 2020/05/01 18:36:45, Robert Samuel Newson  
>  wrote:
> > Hello,
> >
> > There are other threads related to doc size (etc) limits for 
> > CouchDB 4.0, motivated by restrictions in FoundationDB, but we 
> > haven't discussed database name length and doc id length limits. 
> > These are encoded into FoundationDB keys and so we would be wise to 
> > forcibly limit their length from the start.
> >
> > I propose 256 character limit for database name and 512 character 
> > limit for doc ids.
> >
> > If you can't uniquely identify your database or document within 
> > those limits I argue that you're doing something wrong, and the 
> > limits here, while making FDB happy, are an aid to sensible 
> > application design.
> >
> > Does anyone want higher or lower limits? Comments pls.
> >
> > B.
> >
> >
> >>>
> >>>
> >
>


Re: moving email lists to GitHub Discussions (Was: [DISCUSS] moving email lists to Discourse)

2020-05-12 Thread Joan Touzet

On 2020-05-12 5:46 a.m., Ilya Khlopotov wrote:
> I would be +1 as long as it works and we have options to migrate archive 
> elsewhere if/when we need to.
> You are proposing to mirror email traffic which means that mail archive would 
> have a complete history and spare the project from total vendor lock in.

Yup, that'd be a requirement from the ASF's perspective, regardless of 
technology we select.

-Joan

> Best regards,
> ILYA
> 
> On 2020/05/11 19:04:53, Joan Touzet  wrote:
>> On 2020-03-15 9:36, Dave Cottlehuber wrote:
>>> On Fri, 13 Mar 2020, at 14:35, Naomi Slater wrote:
>>>> apparently GitHub has discussions now. it's still in beta, but you can
>>>> specifically request it if you want it if you contact support, I think
>>>> 
>>>> e.g., https://github.com/zeit/next.js/discussions
>>>> 
>>> 
>>> interesting.
>>> 
>>>> I'm interested to know what we think about this and how this
>>>> might/could fit into our plans for user support, discussion, etc.
>> 
>> Given that we already have email integration with GitHub, this will
>> probably be easier to get through the ASF bureaucracy than something
>> brand new.
>> 
>> I'm willing to take this through Infra if people agree to it. It doesn't
>> look like there are any separate "boards" or tags yet, so the proposal
>> would likely be that discussions there would get emailed onto user@. The
>> hard part will be getting replies to the thread on user@ to go back into
>> the discussion on GH; we might be able to get an "asf-bot" to do this
>> for us.
>> 
>> I also looked at Infra's JIRA database, and no one has put in this
>> request there yet. So, we'd be the first, with all the difficulties that
>> entails.
>> 
>> Can I get an informal "vote" on this approach and go-ahead? Since it's
>> informal, anyone is encouraged to respond.
>> 
>> -Joan "adopt, adapt, improve" Touzet



Re: moving email lists to GitHub Discussions (Was: [DISCUSS] moving email lists to Discourse)

2020-05-12 Thread Jan Lehnardt
I’d be willing to give this a go. +1 :)

Best
Jan
—

> On 11. May 2020, at 21:04, Joan Touzet  wrote:
> 
> On 2020-03-15 9:36, Dave Cottlehuber wrote:
>> On Fri, 13 Mar 2020, at 14:35, Naomi Slater wrote:
>>> apparently GitHub has discussions now. it's still in beta, but you can
>>> specifically request it if you want it if you contact support, I think
>>> 
>>> e.g., https://github.com/zeit/next.js/discussions
>>> 
>> interesting.
>>> I'm interested to know what we think about this and how this
>>> might/could fit into our plans for user support, discussion, etc.
> 
> Given that we already have email integration with GitHub, this will probably 
> be easier to get through the ASF bureaucracy than something brand new.
> 
> I'm willing to take this through Infra if people agree to it. It doesn't look 
> like there are any separate "boards" or tags yet, so the proposal would 
> likely be that discussions there would get emailed onto user@. The hard part 
> will be getting replies to the thread on user@ to go back into the discussion 
> on GH; we might be able to get an "asf-bot" to do this for us.
> 
> I also looked at Infra's JIRA database, and no one has put in this request 
> there yet. So, we'd be the first, with all the difficulties that entails.
> 
> Can I get an informal "vote" on this approach and go-ahead? Since it's 
> informal, anyone is encouraged to respond.
> 
> -Joan "adopt, adapt, improve" Touzet



Re: [DISCUSS] length restrictions in 4.0

2020-05-12 Thread Robert Samuel Newson
Sorry to let this thread drop.

Nick, are you still preferring 238?

B.

> On 4 May 2020, at 21:06, Robert Samuel Newson  wrote:
> 
> Ah, ok, understood. I don't think that's a compelling reason to fix our 
> maximum database name length at 238.
> 
> CouchDB 4.0 will be the first version of CouchDB where we're not coupled to 
> the filesystem for this list. 256 is very common for a filesystem filename 
> length limit (though not universal) so I don't think our history should 
> dictate an odd (fine, _even_) choice of 238.
> 
> B.
> 
> 
>> On 4 May 2020, at 20:41, Nick Vatamaniuc  wrote:
>> 
>> It will prevent replicating from db created in 4.0 which has a name
>> longer than 238 (say 250) back to 2.x/3.x if the user intends to keep
>> the same database name on both systems, that's what I meant.
>> 
>> On Mon, May 4, 2020 at 3:15 PM Robert Samuel Newson  
>> wrote:
>>> 
>>> The 'timestamp in filename' is only on the internal shards, which would not 
>>> be part of a replication between 2.x/3.x and 4.x.
>>> 
>>> In any case, Nick is suggesting lowering from 256 chars to 238 chars to 
>>> leave room for these things that won't be there. I confess I don't 
>>> understand the reasoning.
>>> 
>>> B.
>>> 
 On 4 May 2020, at 20:04, Joan Touzet  wrote:
 
 I suspect he means when replicating back to a 3.x or 2.x cluster.
 
 On 2020-05-04 3:03 p.m., Robert Samuel Newson wrote:
> But we don't need to add a file extension or a timestamp to database 
> names.
> B.
>> On 4 May 2020, at 18:42, Nick Vatamaniuc  wrote:
>> 
>> Hello everyone,
>> 
>> Good idea, +1 with one minor tweak: database name length in versions
>> <4.0 was restricted by the maximum file name on whatever file system
>> the server was running on. In practice that was 255, then there is an
>> extension and a timestamp in the filename which made the db name limit
>> be 238 so I suggest to use that instead.
>> 
>> -Nick
>> 
>> On Mon, May 4, 2020 at 11:51 AM Robert Samuel Newson 
>>  wrote:
>>> 
>>> Hi,
>>> 
>>> I think I speak for many in accepting the risk that we're excluding doc 
>>> ids formed from 4096-bit RSA signatures.
>>> 
>>> I don't think I made it clear but I think these should be fixed limits 
>>> (i.e, not configurable) in order to ensure inter-replication between 
>>> couchdb installations wherever they are.
>>> 
>>> B.
>>> 
 On 4 May 2020, at 10:52, Ilya Khlopotov  wrote:
 
 Hello,
 
 Thank you Robert for starting this important discussion. I think that 
 the values you propose make sense.
 I can see a case when user would use hashes as document ids. All 
 existent hash functions I am aware of should return data which fit 
 into 512 characters. There is only one case which doesn't fit into 512 
 limit. If user would decide to use RSA signatures as document ids and 
 they use 4096-bit keys the signature size would be 684 bytes.
 
 However in this case users can easily replace signatures with hashes 
 of signatures. So I wouldn't worry about it too much. 512 sounds plenty 
 to me.
 
 +1 to set hard limits on db name size and doc id size with proposed 
 values.
 
 Best regards,
 iilyak
 
 On 2020/05/01 18:36:45, Robert Samuel Newson  
 wrote:
> Hello,
> 
> There are other threads related to doc size (etc) limits for CouchDB 
> 4.0, motivated by restrictions in FoundationDB, but we haven't 
> discussed database name length and doc id length limits. These are 
> encoded into FoundationDB keys and so we would be wise to forcibly 
> limit their length from the start.
> 
> I propose 256 character limit for database name and 512 character 
> limit for doc ids.
> 
> If you can't uniquely identify your database or document within those 
> limits I argue that you're doing something wrong, and the limits 
> here, while making FDB happy, are an aid to sensible application 
> design.
> 
> Does anyone want higher or lower limits? Comments pls.
> 
> B.
> 
> 
>>> 
>>> 
> 



Re: moving email lists to GitHub Discussions (Was: [DISCUSS] moving email lists to Discourse)

2020-05-12 Thread Ilya Khlopotov
I would be +1 as long as it works and we have options to migrate archive 
elsewhere if/when we need to.
You are proposing to mirror email traffic which means that mail archive would 
have a complete history and spare the project from total vendor lock in.

Best regards,
ILYA   

On 2020/05/11 19:04:53, Joan Touzet  wrote: 
> On 2020-03-15 9:36, Dave Cottlehuber wrote:
> > On Fri, 13 Mar 2020, at 14:35, Naomi Slater wrote:
> >> apparently GitHub has discussions now. it's still in beta, but you can
> >> specifically request it if you want it if you contact support, I think
> >>
> >> e.g., https://github.com/zeit/next.js/discussions
> >> 
> > 
> > interesting.
> > 
> >> I'm interested to know what we think about this and how this
> >> might/could fit into our plans for user support, discussion, etc.
> 
> Given that we already have email integration with GitHub, this will 
> probably be easier to get through the ASF bureaucracy than something 
> brand new.
> 
> I'm willing to take this through Infra if people agree to it. It doesn't 
> look like there are any separate "boards" or tags yet, so the proposal 
> would likely be that discussions there would get emailed onto user@. The 
> hard part will be getting replies to the thread on user@ to go back into 
> the discussion on GH; we might be able to get an "asf-bot" to do this 
> for us.
> 
> I also looked at Infra's JIRA database, and no one has put in this 
> request there yet. So, we'd be the first, with all the difficulties that 
> entails.
> 
> Can I get an informal "vote" on this approach and go-ahead? Since it's 
> informal, anyone is encouraged to respond.
> 
> -Joan "adopt, adapt, improve" Touzet
>