Hi Tim,

Thanks for the fix!

Here’s my random thought on this, FWIW.

We don’t have to emulate the old, broken behavior — that is what you have 
generously fixed and we want to keep.

The problem we *do* want to solve is to make the transition seamless for users. 
Let’s focus on ZK.

The most important data in ZK is the option value itself. All other data is 
derived and should never have been stored in ZK in the first place. Consider 
scope: only system options are stored in ZK, so there is no need to store the 
scope in which the option is set. Only one scope is ever stored in ZK.

Consider other information: data type and metadata. We need a data type to know 
how to interpret the value. But all other metadata can be obtained from the new 
metadata structure you created.

Even the data type does not absolutely have to be stored in ZK: we know the 
type from the metadata. (Ignoring, here, the corner case of someone changing 
the type of an existing option from one release to the next. We don’t handle 
that anyway.)

So, when retrieving the old data, we can throw away everything except the 
value. That is, we parse the old JSON, extract the value into the new 
structure, and discard the rest. We then write new values into ZK using the new 
structure (which, I would hope, has a form optimized for external storage as we 
discussed above.)
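
A minimal sketch of that read-old, keep-only-the-value step, in Python for brevity (Drill itself is Java). The old field names used here (name, kind, type, num_val, string_val, bool_val, float_val) and the shape of the new record are assumptions for illustration, not something confirmed in this thread; convert_old_option is a hypothetical helper.

```python
import json

# Assumed names of the value fields in the old persisted format.
OLD_VALUE_FIELDS = ("num_val", "string_val", "bool_val", "float_val")


def convert_old_option(old_json: str) -> dict:
    """Parse an old-format record and keep only the option name and value."""
    old = json.loads(old_json)
    for field in OLD_VALUE_FIELDS:
        # Compare against None so that False and 0 still count as values.
        if old.get(field) is not None:
            # Scope, type and other metadata are discarded on purpose:
            # they can be derived from the option metadata instead.
            return {"name": old["name"], "value": old[field]}
    raise ValueError(f"no value found in old option record: {old_json!r}")


# Hypothetical old-format record, as it might have been stored.
old_record = (
    '{"kind": "BOOLEAN", "type": "SYSTEM", '
    '"name": "planner.enable_hashjoin", "bool_val": false}'
)
new_record = convert_old_option(old_record)
print(json.dumps(new_record))
```

The point of the design is that the converter never looks at the old "type" field, so its inconsistent meaning cannot leak into the new format.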

Although you are focusing on ZK, Drill has a “persistence” abstraction: data 
can also be stored in local files, by default in /tmp/drill. So, you can use an 
old Drill to create a test set of options persisted to a file. Save that as a 
test input. Then, in the latest version, create a test for your old-to-new 
converter using the saved test data.

Doing that removes ZK from the picture, making unit testing far easier.
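
Roughly, such a file-based test could look like the sketch below (again Python for brevity). The ".sys.drill" file suffix, the one-file-per-option layout and the old field names are assumptions; load_old_values is a hypothetical helper, and the temporary directory stands in for the test input saved from an old Drill.

```python
import json
import tempfile
from pathlib import Path


def load_old_values(store_dir: Path) -> dict:
    """Read every saved old-format option file and keep only name -> value."""
    values = {}
    for path in sorted(store_dir.glob("*.sys.drill")):
        old = json.loads(path.read_text())
        # Assumed convention: the value lives in the one non-null "*_val" field.
        found = [v for k, v in old.items() if k.endswith("_val") and v is not None]
        if not found:
            raise ValueError(f"no value in {path.name}")
        values[old["name"]] = found[0]
    return values


# Stand-in for the saved test input; a real test would read checked-in files.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    (store / "planner.enable_hashjoin.sys.drill").write_text(
        json.dumps({"kind": "BOOLEAN", "type": "SYSTEM",
                    "name": "planner.enable_hashjoin", "bool_val": False})
    )
    (store / "planner.width.max_per_node.sys.drill").write_text(
        json.dumps({"kind": "LONG", "type": "SYSTEM",
                    "name": "planner.width.max_per_node", "num_val": 4})
    )
    values = load_old_values(store)

# No ZK connection is involved anywhere in this check.
assert values == {
    "planner.enable_hashjoin": False,
    "planner.width.max_per_node": 4,
}
```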

The issue of table schema is a whole other topic which we can discuss later. I 
still wonder, does anyone actually depend on the current schema? If not, the 
point is moot and we’re in good shape already.

Thanks,

- Paul

> On Sep 22, 2017, at 12:12 AM, Timothy Farkas <[email protected]> wrote:
> 
> Hi Paul,
> 
> I've implemented the ZK fix. But I'm uncertain about how to make the options 
> table backwards compatible. The main issue with the options table is that 
> formerly there was a "type" column. The "type" column meant different things 
> in different parts of the code, and the meaning was overall inconsistent. Now 
> the "type" column has been renamed to "accessibleScopes" and was assigned 
> consistent meaning throughout the code base. As part of that change the 
> possible values for "accessibleScopes" have changed and how it is assigned 
> those values has changed as well. Making it truly backward compatible would 
> require emulating the previously bad behavior, and I'm not sure how useful 
> that would be. It would be possible to revert the name for the column back to 
> "type", but I think that would have limited usefulness since the value stored 
> for "type" will be different in almost every case from what it was before, so 
> that would also break any scripts that intimately depended on it as well. 
> What are your thoughts on how to handle this?
> 
> Thanks,
> Tim
> 
> 
> ________________________________
> From: Paul Rogers <[email protected]>
> Sent: Thursday, September 21, 2017 5:50:26 PM
> To: [email protected]
> Subject: Re: Backwards Compatibility Policy
> 
> Hi Tim,
> 
> Unfortunately, Drill has no version compatibility guidelines. We break 
> compatibility all the time — often by accident. As Drill matures, we should 
> consider defining such a policy.
> 
> Experience with other systems suggests that we provide an automatic way to 
> handle the inevitable evolution of APIs, data formats and the like. This is 
> done for user convenience, and (in commercial products) to avoid support 
> calls.
> 
> Would have been great if we had a version number in ZK. Would solve problems 
> not just with options, but with storage plugins when we change their classes 
> (and hence the JSON stored in ZK.) But, we don’t…
> 
> Vitalii recently added versioning for the Parquet metadata file to avoid the 
> need for users to delete all the metadata each time we make a change. Would 
> be great if we could follow that example for other areas.
> 
> In the meantime, perhaps you can implement a way to read the old format ZK 
> but write the new format.
> 
> I believe you also changed the layout of the system table for options. These 
> were long-needed improvements. Still, I wonder if anyone has a script that 
> depends on the old format? Do we need a way to support the old format, while 
> offering a new table with the new format? Jyothsna recently did that as part 
> of her work in options; I wonder if something like that is needed here also? 
> In fact, you may be able to simply alter the table that she added: it hasn’t 
> seen the light of a Drill release yet.
> 
> Thanks,
> 
> - Paul
> 
>> On Sep 21, 2017, at 4:16 PM, Timothy Farkas <[email protected]> wrote:
>> 
>> Makes sense Abhishek, I'll work on making it backwards compatible then.
>> 
>> Thanks,
>> Tim
>> 
>> ________________________________
>> From: Abhishek Girish <[email protected]>
>> Sent: Thursday, September 21, 2017 3:58:55 PM
>> To: [email protected]
>> Subject: Re: Backwards Compatibility Policy
>> 
>> Hey Tim,
>> 
>> Requiring users to purge Drill's ZK data is not advisable and we might not
>> want to go that route. We need to have a seamless upgrade path - for
>> instance modifying values found to be in an older format to the new one,
>> without explicit user interaction.
>> 
>> Regards,
>> Abhishek
>> 
>> On Thu, Sep 21, 2017 at 3:46 PM, Timothy Farkas <[email protected]> wrote:
>> 
>>> Hi All,
>>> 
>>> I recently made a change to the option system which impacted the fields
>>> contained in OptionValues and hence the format of option information we are
>>> storing in zookeeper. So it is currently not backward compatible with old
>>> system options stored in zookeeper. Two ways to resolve the issue are to
>>> require old data to be purged from zookeeper when upgrading the cluster or
>>> to attempt to allow backward compatibility by modifying the deserializer
>>> for OptionValue. So my question is what is our stance on backwards
>>> compatibility?
>>> 
>>> Thanks,
>>> Tim
>>> 
> 
