Re: Clusterwide vs Client configuration for metadata format version

2018-12-19 Thread Enrico Olivelli
On Wed, Dec 19, 2018 at 2:41 PM Enrico Olivelli wrote:
>
> +1 for holding on the release.

Sorry, I was not clear.
+1 on postponing that change in order to speed up the 4.9 release.

Enrico

>
> Off topic: we did not choose a release manager for 4.9 yet.
> Ivan and Sijie contributed most of the changes, but having a release
> Manager from SF would be awesome
>
> Enrico
>
> On Wed, Dec 19, 2018 at 2:39 PM Sijie Guo wrote:
> >
> > On Wed, Dec 19, 2018 at 9:11 PM Ivan Kelly  wrote:
> >
> > > > If it is client level configuration, in theory it is possible to have
> > > latest client create v3 ledger while bookies are still running in the 
> > > older
> > > version right?
> > >
> > > Yes, autorecovery would likely just break in this case.
> > >
> > > > If we go with cluster level, I think using it part of LAYOUT_ZNODE is 
> > > > not
> > > clean.
> > > > I think we need to have a form of "cluster version number", or even
> > > better
> > > with a combination of  capability/feature bit-map which can dictate
> > > the cluster behavior.
> > >
> > > I used the LAYOUT znode because that is what already exists. If we
> > > create another znode for this, /ledgers/CLUSTER for example, then, for
> > > consistency, the contents of the layout znode should really be moved
> > > into this new znode. But this creates a lot more BC issues than just
> > > using the LAYOUT znode. Old versions of the software ignore anything
> > > other than the first two lines in LAYOUT. So, it's not clean nor
> > > ideal, but it does work well within the constraints of BC.
> > >
> > > > I am assuming that the tool Ivan is talking about is used for existing
> > > clusters to update the cluster version number.
> > > > Otherwise the maxLedgerMetadataFormat is used only for new clusters;
> > > that is fine.
> > >
> > > The maxLedgerMetadataFormat is only written when writing a new LAYOUT
> > > node, so either during metaformat, or when using the proposed tool.
> > > When it is absent from the layout node, it defaults to version 2,
> > > which matches current behaviour.
> > >
> > > The important thing for the 4.9 release is that the client can read
> > > binary metadata, so that in 4.10 or 4.11, if we add a field to the
> > > metadata, then we are able to use it with 4.9 clients and newer. It is
> > > only that that point that maxLedgerMetadataFormat comes into play.
> > >
> > > So, for the sake of getting 4.9 out the door, I propose that we:
> > >
> >
> >
> >
> > > a. Rollback the 2 changes around max metadata format version.
> > > b. Pin serde to use V2 for now.
> > > c. Continue this discussion to find the long term solution.
> > >
> >
> > +1
> >
> >
> > >
> > > -Ivan
> > >


Re: Clusterwide vs Client configuration for metadata format version

2018-12-19 Thread Enrico Olivelli
+1 for holding on the release.

Off topic: we have not chosen a release manager for 4.9 yet.
Ivan and Sijie contributed most of the changes, but having a release
manager from SF would be awesome.

Enrico

On Wed, Dec 19, 2018 at 2:39 PM Sijie Guo wrote:
>
> On Wed, Dec 19, 2018 at 9:11 PM Ivan Kelly  wrote:
>
> > > If it is client level configuration, in theory it is possible to have
> > latest client create v3 ledger while bookies are still running in the older
> > version right?
> >
> > Yes, autorecovery would likely just break in this case.
> >
> > > If we go with cluster level, I think using it part of LAYOUT_ZNODE is not
> > clean.
> > > I think we need to have a form of "cluster version number", or even
> > better
> > with a combination of  capability/feature bit-map which can dictate
> > the cluster behavior.
> >
> > I used the LAYOUT znode because that is what already exists. If we
> > create another znode for this, /ledgers/CLUSTER for example, then, for
> > consistency, the contents of the layout znode should really be moved
> > into this new znode. But this creates a lot more BC issues than just
> > using the LAYOUT znode. Old versions of the software ignore anything
> > other than the first two lines in LAYOUT. So, it's not clean nor
> > ideal, but it does work well within the constraints of BC.
> >
> > > I am assuming that the tool Ivan is talking about is used for existing
> > clusters to update the cluster version number.
> > > Otherwise the maxLedgerMetadataFormat is used only for new clusters;
> > that is fine.
> >
> > The maxLedgerMetadataFormat is only written when writing a new LAYOUT
> > node, so either during metaformat, or when using the proposed tool.
> > When it is absent from the layout node, it defaults to version 2,
> > which matches current behaviour.
> >
> > The important thing for the 4.9 release is that the client can read
> > binary metadata, so that in 4.10 or 4.11, if we add a field to the
> > metadata, then we are able to use it with 4.9 clients and newer. It is
> > only that that point that maxLedgerMetadataFormat comes into play.
> >
> > So, for the sake of getting 4.9 out the door, I propose that we:
> >
>
>
>
> > a. Rollback the 2 changes around max metadata format version.
> > b. Pin serde to use V2 for now.
> > c. Continue this discussion to find the long term solution.
> >
>
> +1
>
>
> >
> > -Ivan
> >


Re: Clusterwide vs Client configuration for metadata format version

2018-12-19 Thread Sijie Guo
On Wed, Dec 19, 2018 at 9:11 PM Ivan Kelly  wrote:

> > If it is client level configuration, in theory it is possible to have
> latest client create v3 ledger while bookies are still running in the older
> version right?
>
> Yes, autorecovery would likely just break in this case.
>
> > If we go with cluster level, I think using it part of LAYOUT_ZNODE is not
> clean.
> > I think we need to have a form of "cluster version number", or even
> better
> with a combination of  capability/feature bit-map which can dictate
> the cluster behavior.
>
> I used the LAYOUT znode because that is what already exists. If we
> create another znode for this, /ledgers/CLUSTER for example, then, for
> consistency, the contents of the layout znode should really be moved
> into this new znode. But this creates a lot more BC issues than just
> using the LAYOUT znode. Old versions of the software ignore anything
> other than the first two lines in LAYOUT. So, it's not clean nor
> ideal, but it does work well within the constraints of BC.
>
> > I am assuming that the tool Ivan is talking about is used for existing
> clusters to update the cluster version number.
> > Otherwise the maxLedgerMetadataFormat is used only for new clusters;
> that is fine.
>
> The maxLedgerMetadataFormat is only written when writing a new LAYOUT
> node, so either during metaformat, or when using the proposed tool.
> When it is absent from the layout node, it defaults to version 2,
> which matches current behaviour.
>
> The important thing for the 4.9 release is that the client can read
> binary metadata, so that in 4.10 or 4.11, if we add a field to the
> metadata, then we are able to use it with 4.9 clients and newer. It is
> only that that point that maxLedgerMetadataFormat comes into play.
>
> So, for the sake of getting 4.9 out the door, I propose that we:
>



> a. Rollback the 2 changes around max metadata format version.
> b. Pin serde to use V2 for now.
> c. Continue this discussion to find the long term solution.
>

+1


>
> -Ivan
>


Re: Clusterwide vs Client configuration for metadata format version

2018-12-19 Thread Ivan Kelly
> If it is client level configuration, in theory it is possible to have latest 
> client create v3 ledger while bookies are still running in the older version 
> right?

Yes, autorecovery would likely just break in this case.

> If we go with cluster level, I think using it part of LAYOUT_ZNODE is not
> clean.
> I think we need to have a form of "cluster version number", or even better
> with a combination of capability/feature bit-map which can dictate
> the cluster behavior.

I used the LAYOUT znode because that is what already exists. If we
create another znode for this, /ledgers/CLUSTER for example, then, for
consistency, the contents of the layout znode should really be moved
into this new znode. But this creates a lot more BC issues than just
using the LAYOUT znode. Old versions of the software ignore anything
other than the first two lines in LAYOUT. So, it's neither clean nor
ideal, but it does work well within the constraints of BC.

> I am assuming that the tool Ivan is talking about is used for existing 
> clusters to update the cluster version number.
> Otherwise the maxLedgerMetadataFormat is used only for new clusters; that is 
> fine.

The maxLedgerMetadataFormat is only written when writing a new LAYOUT
node, so either during metaformat, or when using the proposed tool.
When it is absent from the layout node, it defaults to version 2,
which matches current behaviour.
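
A minimal sketch of the compatibility argument above, not the actual BookKeeper parser. It assumes the layout is stored as lines of text and that the new field is appended after the first two lines as a `maxLedgerMetadataFormat:<n>` entry; the real encoding may differ. The function names and the sample layout contents are hypothetical. An old reader that looks only at the first two lines sees the same thing either way, and a new reader falls back to 2 when the field is absent.

```python
DEFAULT_MAX_LEDGER_METADATA_FORMAT = 2  # default named in the thread
FIELD = "maxLedgerMetadataFormat"       # assumed key for the extra line


def parse_layout_old(text):
    """Old reader: uses the first two lines and ignores anything after them."""
    lines = text.splitlines()
    return lines[0], lines[1]


def parse_layout_new(text):
    """New reader: same two lines, plus the optional max format field."""
    lines = text.splitlines()
    max_format = DEFAULT_MAX_LEDGER_METADATA_FORMAT
    for extra in lines[2:]:
        key, sep, value = extra.partition(":")
        if sep and key.strip() == FIELD:
            max_format = int(value)
    return lines[0], lines[1], max_format


# Hypothetical layout contents, for illustration only.
OLD_LAYOUT = "1\nexample.LedgerManagerFactory:1"
NEW_LAYOUT = OLD_LAYOUT + "\n" + FIELD + ":3"
```

Under these assumptions, writing the extra line changes nothing for old software, which is the backward-compatibility property the LAYOUT znode approach relies on.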

The important thing for the 4.9 release is that the client can read
binary metadata, so that in 4.10 or 4.11, if we add a field to the
metadata, then we are able to use it with 4.9 clients and newer. It is
only at that point that maxLedgerMetadataFormat comes into play.

So, for the sake of getting 4.9 out the door, I propose that we:
a. Rollback the 2 changes around max metadata format version.
b. Pin serde to use V2 for now.
c. Continue this discussion to find the long term solution.

-Ivan


Re: Clusterwide vs Client configuration for metadata format version

2018-12-18 Thread Venkateswara Rao Jujjuri
If it is a client-level configuration, in theory it is possible to have the
latest client create a v3 ledger while bookies are still running the older
version, right? Who can stop that? If we let that happen, what happens to
the replication logic? How can it handle the new ledger format?

If we go with cluster level, I think making it part of LAYOUT_ZNODE is not
clean.
I think we need to have a form of "cluster version number" or, even better,
a capability/feature bit-map which can dictate the cluster behavior.

I am assuming that the tool Ivan is talking about is used for existing
clusters to update the cluster version number.
Otherwise the maxLedgerMetadataFormat is used only for new clusters; that
is fine.
But this comes with strict operational guidelines: the
maxLedgerMetadataFormat needs to be updated only after a successful upgrade
of the entire cluster to the new bits. In that case at least we have a
barrier ensuring that all bookies have been updated and can understand
maxLedgerMetadataFormat, and we support backward compatibility anyway.
But I don't like overloading LAYOUT_ZNODE, which doesn't make sense
as this is not a layout change.
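
A rough sketch of the two ideas in this message: a capability/feature bit-map for the cluster, and the operational barrier that a capability is switched on only after every bookie has been upgraded. Everything here is hypothetical; the flag names, the function names, and the representation of versions as tuples are assumptions for illustration and do not come from BookKeeper.

```python
from enum import IntFlag


class ClusterFeature(IntFlag):
    """Hypothetical feature bits; none of these names come from BookKeeper."""
    NONE = 0
    BINARY_LEDGER_METADATA = 1


def all_bookies_upgraded(bookie_versions, minimum):
    """The barrier: every bookie must run at least the minimum version."""
    return all(version >= minimum for version in bookie_versions)


def enable_feature(current, feature, bookie_versions, minimum):
    """Set a feature bit only once the whole cluster has been upgraded."""
    if not all_bookies_upgraded(bookie_versions, minimum):
        raise RuntimeError("cluster not fully upgraded; refusing to enable feature")
    return current | feature


def may_write_binary_metadata(features):
    """Writers consult the bit-map instead of a single version number."""
    return bool(features & ClusterFeature.BINARY_LEDGER_METADATA)
```

With a mixed cluster such as `[(4, 8), (4, 9)]` and a minimum of `(4, 9)`, `enable_feature` refuses to set the bit; once all bookies report `(4, 9)`, the bit can be set and writers may start using the binary format.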

JV


On Tue, Dec 18, 2018 at 11:37 AM Sam Just  wrote:

> I think both approaches are viable, but I think that the max allowable
> version is more naturally a bk cluster property rather than a bk client
> property.  Controlling this from the client means that the same client
> version deployed to two different clusters might need different settings
> depending on the other clients deployed to those clusters.  Placing it in
> the metadata means that the clients simply pick up the correct version for
> the environment from the ledger metadata without needing additional
> configuration.  However, client config management is likely to be managed
> on a per-cluster basis anyway, so in practice there may be little
> difference.
> -Sam
>
> On Tue, Dec 18, 2018 at 10:01 AM Sam Just  wrote:
>
> > I'll take a look.
> >
> > On Tue, Dec 18, 2018 at 1:39 AM Ivan Kelly  wrote:
> >
> >> JV, Sam, Charan, Andrey, could one of you chime in on this? It's
> >> holding up 4.9 release.
> >>
> >> -Ivan
> >>
> >> On Thu, Dec 13, 2018 at 5:38 PM Ivan Kelly  wrote:
> >> >
> >> > I'd be interested to see the opinion of the salesforce folks on this.
> >> > On Thu, Dec 13, 2018 at 5:35 PM Ivan Kelly  wrote:
> >> > >
> >> > > > I am not sure about this. If clients don't react the changes of
> >> ledger
> >> > > > layout,
> >> > > > the information in ledger layout is just informative, you still
> >> need to
> >> > > > coordinate
> >> > > > both readers and writers. so IMO the version in ledger layout is
> >> not really
> >> > > > useful.
> >> > >
> >> > > The clients react the next time they initialize the ledger manager.
> >> > > Which is exactly the same as would occur with a configuration
> setting.
> >> > >
> >> > > -Ivan
> >>


-- 
Jvrao


Re: Clusterwide vs Client configuration for metadata format version

2018-12-18 Thread Sam Just
I think both approaches are viable, but I think that the max allowable
version is more naturally a bk cluster property rather than a bk client
property.  Controlling this from the client means that the same client
version deployed to two different clusters might need different settings
depending on the other clients deployed to those clusters.  Placing it in
the metadata means that the clients simply pick up the correct version for
the environment from the ledger metadata without needing additional
configuration.  However, client config management is likely to be managed
on a per-cluster basis anyway, so in practice there may be little
difference.
-Sam

On Tue, Dec 18, 2018 at 10:01 AM Sam Just  wrote:

> I'll take a look.
>
> On Tue, Dec 18, 2018 at 1:39 AM Ivan Kelly  wrote:
>
>> JV, Sam, Charan, Andrey, could one of you chime in on this? It's
>> holding up 4.9 release.
>>
>> -Ivan
>>
>> On Thu, Dec 13, 2018 at 5:38 PM Ivan Kelly  wrote:
>> >
>> > I'd be interested to see the opinion of the salesforce folks on this.
>> > On Thu, Dec 13, 2018 at 5:35 PM Ivan Kelly  wrote:
>> > >
>> > > > I am not sure about this. If clients don't react the changes of
>> ledger
>> > > > layout,
>> > > > the information in ledger layout is just informative, you still
>> need to
>> > > > coordinate
>> > > > both readers and writers. so IMO the version in ledger layout is
>> not really
>> > > > useful.
>> > >
>> > > The clients react the next time they initialize the ledger manager.
>> > > Which is exactly the same as would occur with a configuration setting.
>> > >
>> > > -Ivan
>>



Re: Clusterwide vs Client configuration for metadata format version

2018-12-18 Thread Sam Just
I'll take a look.

On Tue, Dec 18, 2018 at 1:39 AM Ivan Kelly  wrote:

> JV, Sam, Charan, Andrey, could one of you chime in on this? It's
> holding up 4.9 release.
>
> -Ivan
>
> On Thu, Dec 13, 2018 at 5:38 PM Ivan Kelly  wrote:
> >
> > I'd be interested to see the opinion of the salesforce folks on this.
> > On Thu, Dec 13, 2018 at 5:35 PM Ivan Kelly  wrote:
> > >
> > > > I am not sure about this. If clients don't react the changes of
> ledger
> > > > layout,
> > > > the information in ledger layout is just informative, you still need
> to
> > > > coordinate
> > > > both readers and writers. so IMO the version in ledger layout is not
> really
> > > > useful.
> > >
> > > The clients react the next time they initialize the ledger manager.
> > > Which is exactly the same as would occur with a configuration setting.
> > >
> > > -Ivan
>



Re: Clusterwide vs Client configuration for metadata format version

2018-12-18 Thread Ivan Kelly
JV, Sam, Charan, Andrey, could one of you chime in on this? It's
holding up 4.9 release.

-Ivan

On Thu, Dec 13, 2018 at 5:38 PM Ivan Kelly  wrote:
>
> I'd be interested to see the opinion of the salesforce folks on this.
> On Thu, Dec 13, 2018 at 5:35 PM Ivan Kelly  wrote:
> >
> > > I am not sure about this. If clients don't react the changes of ledger
> > > layout,
> > > the information in ledger layout is just informative, you still need to
> > > coordinate
> > > both readers and writers. so IMO the version in ledger layout is not 
> > > really
> > > useful.
> >
> > The clients react the next time they initialize the ledger manager.
> > Which is exactly the same as would occur with a configuration setting.
> >
> > -Ivan


Re: Clusterwide vs Client configuration for metadata format version

2018-12-13 Thread Ivan Kelly
I'd be interested to see the opinion of the salesforce folks on this.
On Thu, Dec 13, 2018 at 5:35 PM Ivan Kelly  wrote:
>
> > I am not sure about this. If clients don't react the changes of ledger
> > layout,
> > the information in ledger layout is just informative, you still need to
> > coordinate
> > both readers and writers. so IMO the version in ledger layout is not really
> > useful.
>
> The clients react the next time they initialize the ledger manager.
> Which is exactly the same as would occur with a configuration setting.
>
> -Ivan


Re: Clusterwide vs Client configuration for metadata format version

2018-12-13 Thread Ivan Kelly
> I am not sure about this. If clients don't react the changes of ledger
> layout,
> the information in ledger layout is just informative, you still need to
> coordinate
> both readers and writers. so IMO the version in ledger layout is not really
> useful.

The clients react the next time they initialize the ledger manager.
Which is exactly the same as would occur with a configuration setting.

-Ivan


Re: Clusterwide vs Client configuration for metadata format version

2018-12-13 Thread Sijie Guo
On Thu, Dec 13, 2018 at 7:24 PM Ivan Kelly  wrote:

> > I don't fully understand how the cluster-wide version work here,
> specially
> > how do clients react when people use the tool to bump the version in
> ledger
> > layout.
>
> Clients don't have to react immediately. The cluster-wide setting is
> the max _allowable_ format version. When it gets bumped, for example
> from 2 to 3, clients that started when the value was 2 can continue to
> write metadata in format 2, and all clients will will be able to read
> it. Clients start after the bump can start to write in format 3. There
> is currently nothing to motivate moving to version 3, but when we do
> add something to the metadata protobuf, we will be able to have
> clients read all the fields (even if it doesn't recognise it all).
>
> > IMO a client setting is probably good enough and more flexible for people
> > to control the upgrade stories and there will no surprises, since the
> > version is controlled by the bookkeeper "writers".
>
> I don't have a strong opinion either way. Client conf based gives more
> power to users, but also requires more coordination among all users.
>

I am not sure about this. If clients don't react to changes in the ledger
layout, the information in the ledger layout is purely informative; you
still need to coordinate both readers and writers. So IMO the version in
the ledger layout is not really useful.

So I would prefer using a simple configuration setting rather than storing
it in the ledger layout.


> Clusterwide allows it to be set at a central authority, but that gives
> users less freedom. Both have merits. How common is it for users from
> different organisations to share ledgers?
>
> -Ivan
>


Re: Clusterwide vs Client configuration for metadata format version

2018-12-13 Thread Ivan Kelly
> I don't fully understand how the cluster-wide version work here, specially
> how do clients react when people use the tool to bump the version in ledger
> layout.

Clients don't have to react immediately. The cluster-wide setting is
the max _allowable_ format version. When it gets bumped, for example
from 2 to 3, clients that started when the value was 2 can continue to
write metadata in format 2, and all clients will be able to read
it. Clients started after the bump can start to write in format 3. There
is currently nothing to motivate moving to version 3, but when we do
add something to the metadata protobuf, we will be able to have
clients read all the fields (even if they don't recognise them all).
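
A small simulation of the behaviour described above, under stated assumptions: the `Cluster` and `Client` classes are hypothetical stand-ins, every client is assumed to be able to read formats up to 3, and a client reads the cluster-wide maximum once, when it starts.

```python
class Cluster:
    """Stand-in for the cluster-wide setting (hypothetical)."""

    def __init__(self, max_format=2):
        self.max_format = max_format


class Client:
    """Stand-in for a client that can read and write formats up to 3."""

    SUPPORTED_MAX = 3

    def __init__(self, cluster):
        # The max allowable version is read once, at start-up.
        self.write_format = min(self.SUPPORTED_MAX, cluster.max_format)

    def write_metadata(self):
        return {"format": self.write_format}

    def can_read(self, metadata):
        return metadata["format"] <= self.SUPPORTED_MAX


cluster = Cluster(max_format=2)
early = Client(cluster)   # started while the value was 2
cluster.max_format = 3    # the bump
late = Client(cluster)    # started after the bump
```

Here `early` keeps writing format 2 and `late` writes format 3, while each can read what the other wrote. The bump is an upper bound on what may be written, not a signal that running clients must act on.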

> IMO a client setting is probably good enough and more flexible for people
> to control the upgrade stories and there will no surprises, since the
> version is controlled by the bookkeeper "writers".

I don't have a strong opinion either way. Client conf based gives more
power to users, but also requires more coordination among all users.
Clusterwide allows it to be set at a central authority, but that gives
users less freedom. Both have merits. How common is it for users from
different organisations to share ledgers?

-Ivan


Re: Clusterwide vs Client configuration for metadata format version

2018-12-12 Thread Sijie Guo
On Wed, Dec 12, 2018 at 11:31 PM Ivan Kelly  wrote:

> Hi folks,
>
> A discussion has arisen about on [1] about the ledger layout changes
> I've made recently [2].
> The change[2] adds a field maxLedgerMetadataFormat to the cluster-wide
> managed ledger layout.  When a new ledger is created, this is the
> maximum format version which will be used to write it. Currently, the
> default is 2 (text protobuf), though it could be changed to 3 (binary
> protobuf) in future. I also had a plan to create a tool to bump to 3.
> This field exists to allow clients that don't understand the binary
> format to coexist with clients that do.
> Another option would be to not change the ledger layout, but instead,
> have a per client configuration for maxLedgerMetadataFormat, which
> could default to 2. It would work the same way, but there would be no
> central point to bump to 3. Each client/application would have to do
> so.
>

I don't fully understand how the cluster-wide version works here, especially
how clients react when people use the tool to bump the version in the ledger
layout.

IMO a client setting is probably good enough and more flexible for people
to control their upgrade stories, and there will be no surprises, since the
version is controlled by the bookkeeper "writers".


>
> What are folks thoughts on this? The cluster-wide is already
> implemented, though per client has its advantages too. In any case,
> this needs to be resolved before 4.9. Once resolved I'll push a BP for
> the whole feature.
>
> Cheers,
> Ivan
>
>
>
> [1] https://github.com/apache/bookkeeper/issues/1863
> [2] https://github.com/apache/bookkeeper/pull/1858
>


Re: Clusterwide vs Client configuration for metadata format version

2018-12-12 Thread Enrico Olivelli
On Wed, Dec 12, 2018, 16:31 Ivan Kelly wrote:

> Hi folks,
>
> A discussion has arisen about on [1] about the ledger layout changes
> I've made recently [2].
> The change[2] adds a field maxLedgerMetadataFormat to the cluster-wide
> managed ledger layout.  When a new ledger is created, this is the
> maximum format version which will be used to write it. Currently, the
> default is 2 (text protobuf), though it could be changed to 3 (binary
> protobuf) in future. I also had a plan to create a tool to bump to 3.
> This field exists to allow clients that don't understand the binary
> format to coexist with clients that do.
> Another option would be to not change the ledger layout, but instead,
> have a per client configuration for maxLedgerMetadataFormat, which
> could default to 2. It would work the same way, but there would be no
> central point to bump to 3. Each client/application would have to do
> so.
>

Having such a value in ZK will enable cluster-wide upgrades, and we can
easily move the default metadata version forward together with BK releases.
Otherwise we will be stuck on v2.
I have clusters in which different versions of the BK client work against
the same bookies/cluster. This is because I (very often) have several
applications sharing the same cluster, but I can't force every application
to be updated to the latest and greatest BK major version.

Another important issue is that a client configuration option must be
handled by applications, which usually build their own ClientConfiguration
and read cluster-wide configuration from ZK.

So +1 for cluster-wide configuration.

Enrico

>
> What are folks thoughts on this? The cluster-wide is already
> implemented, though per client has its advantages too. In any case,
> this needs to be resolved before 4.9. Once resolved I'll push a BP for
> the whole feature.
>
> Cheers,
> Ivan
>
>
>
> [1] https://github.com/apache/bookkeeper/issues/1863
> [2] https://github.com/apache/bookkeeper/pull/1858
>

-- Enrico Olivelli


Clusterwide vs Client configuration for metadata format version

2018-12-12 Thread Ivan Kelly
Hi folks,

A discussion has arisen on [1] about the ledger layout changes
I've made recently [2].
The change [2] adds a field maxLedgerMetadataFormat to the cluster-wide
managed ledger layout.  When a new ledger is created, this is the
maximum format version which will be used to write it. Currently, the
default is 2 (text protobuf), though it could be changed to 3 (binary
protobuf) in future. I also had a plan to create a tool to bump to 3.
This field exists to allow clients that don't understand the binary
format to coexist with clients that do.
Another option would be to not change the ledger layout, but instead,
have a per client configuration for maxLedgerMetadataFormat, which
could default to 2. It would work the same way, but there would be no
central point to bump to 3. Each client/application would have to do
so.
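
To make the difference between the two options concrete, here is a hedged sketch in which plain dictionaries stand in for the cluster-wide layout and for each application's client configuration. The helper names and the application names are hypothetical, and the per-client setting reusing the key `maxLedgerMetadataFormat` is an assumption.

```python
DEFAULT_MAX_FORMAT = 2  # default proposed for both options


def max_format_cluster_wide(cluster_layout):
    """Option 1: every client reads the same value from the shared layout."""
    return cluster_layout.get("maxLedgerMetadataFormat", DEFAULT_MAX_FORMAT)


def max_format_per_client(client_conf):
    """Option 2: each client reads the value from its own configuration."""
    return client_conf.get("maxLedgerMetadataFormat", DEFAULT_MAX_FORMAT)


apps = ["app-a", "app-b"]

# Cluster-wide: one shared value, one central place to bump it.
cluster_layout = {}
before = [max_format_cluster_wide(cluster_layout) for _ in apps]
cluster_layout["maxLedgerMetadataFormat"] = 3
after = [max_format_cluster_wide(cluster_layout) for _ in apps]

# Per client: each application has to change its own setting.
client_confs = {name: {} for name in apps}
client_confs["app-a"]["maxLedgerMetadataFormat"] = 3
per_client = {name: max_format_per_client(conf)
              for name, conf in client_confs.items()}
```

In the cluster-wide case a single write moves both applications from 2 to 3; in the per-client case `app-b` stays on 2 until its own configuration is changed.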

What are folks' thoughts on this? The cluster-wide option is already
implemented, though per client has its advantages too. In any case,
this needs to be resolved before 4.9. Once resolved I'll push a BP for
the whole feature.

Cheers,
Ivan



[1] https://github.com/apache/bookkeeper/issues/1863
[2] https://github.com/apache/bookkeeper/pull/1858