Sean Mackrory commented on HADOOP-14335:

Had an impromptu brainstorming session about this with [~gabor.bota] the other 
day. I feel like the current logic of requiring equality between the table's 
version and the current code's schema version is fragile. If we ever introduce 
a change that old code can safely work with but that new code needs to be able 
to recognize, we're hosed unless the old code already has a way to recognize a 
different yet compatible schema version. There are two caveats:

* We could add a NEW version field that only the new code looks at. Keep the 
old field the same and the old code works. New code looks at the new schema 
field. This is hacky and messy.
* We couldn't come up with such a scenario. When we added delete tracking, old 
code shouldn't have used the same table. If we add an authoritative bit, 
there's no need for new code to recognize the new table; it simply uses the new 
field when it's in a row, and defaults to false when it's not. We talked about 
maybe adding a TTL bit in the future - but the same logic applies. So maybe 
this is just never going to be needed.
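To make the authoritative-bit case concrete, here is a minimal sketch of reading an optional field that defaults to false when a row lacks it. The class name, the attribute name `is_authoritative`, and the use of a plain Map to stand in for a table row are all assumptions for illustration, not the actual S3Guard code.

```java
import java.util.HashMap;
import java.util.Map;

public class OptionalFieldSketch {
    // Hypothetical attribute name; the real schema may use something else.
    static final String IS_AUTHORITATIVE = "is_authoritative";

    // Rows written by old code have no such attribute, so a missing
    // (or non-boolean) value is treated as false.
    static boolean isAuthoritative(Map<String, Object> row) {
        Object value = row.get(IS_AUTHORITATIVE);
        return value instanceof Boolean && (Boolean) value;
    }

    public static void main(String[] args) {
        // A row as old code would have written it: no authoritative bit.
        Map<String, Object> oldRow = new HashMap<>();
        oldRow.put("path", "/bucket/dir");

        // A row as new code might write it, with the bit set.
        Map<String, Object> newRow = new HashMap<>(oldRow);
        newRow.put(IS_AUTHORITATIVE, Boolean.TRUE);

        System.out.println(isAuthoritative(oldRow));
        System.out.println(isAuthoritative(newRow));
    }
}
```

With this shape, new code never has to inspect the table's version to handle rows written by old code, which is why no schema version bump seems to be needed for this kind of change.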

Can anyone think of such a scenario? Unless you're already on a version of 
Hadoop that has the compatibility logic (which doesn't exist today), you 
wouldn't be able to upgrade nicely to a version of Hadoop that requires it. 
And really, such a change shouldn't come with just any version upgrade, only a 
major one. But maybe this will just never happen and it's not worth worrying 
about.
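
For reference, the difference between the current strict-equality check and a looser compatibility check could be sketched roughly as follows. Everything here is hypothetical: the class, the constant, and especially the idea of the table recording a separate "minimum reader version" are assumptions for illustration, not something that exists in S3Guard today.

```java
public class SchemaVersionSketch {
    // Hypothetical schema version this build of the code was written against.
    static final int CODE_VERSION = 100;

    // Current behaviour as described in this issue: the table's version
    // must equal the version hard-coded in the build.
    static boolean strictCheck(int tableVersion) {
        return tableVersion == CODE_VERSION;
    }

    // Looser alternative (an assumption, not existing logic): the table
    // also records the oldest code version that can safely use it, and
    // any code at or above that version is accepted.
    static boolean compatibleCheck(int tableMinReaderVersion) {
        return CODE_VERSION >= tableMinReaderVersion;
    }

    public static void main(String[] args) {
        // Table bumped to 101 by a change that old code can tolerate.
        System.out.println(strictCheck(101));
        System.out.println(compatibleCheck(100));
        // Table bumped by a change that old code cannot tolerate.
        System.out.println(compatibleCheck(200));
    }
}
```

The catch is the one raised above: the looser check only helps if the old code already ships with it before the schema change is introduced.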

> Improve DynamoDB schema update story
> ------------------------------------
>                 Key: HADOOP-14335
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14335
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0-beta1
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>            Priority: Major
> On HADOOP-13760 I'm realizing that changes to the DynamoDB schema aren't 
> great to deal with. Currently a build of Hadoop is hard-coded to a specific 
> schema version. So if you upgrade from one to the next you have to upgrade 
> everything (and then update the version in the table - which we don't have a 
> tool or document for) before you can keep using S3Guard. We could possibly 
> also make the definition of compatibility a bit more flexible, but it's going 
> to be very tough to do that without knowing what kind of future schema 
> changes we might want ahead of time.

This message was sent by Atlassian JIRA

