[
https://issues.apache.org/jira/browse/HDFS-11026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858821#comment-15858821
]
Chris Douglas commented on HDFS-11026:
--------------------------------------
[~ehiggs], a comment in the .proto explaining the constraint on field numbers
would also be a good idea.
bq. I'd prefer an equality check for a definitive magic byte that can't occur,
or represents something so large it can't/won't occur. Perhaps something like
-1; I haven't checked if that's impossible for the varint.
The current patch relies on the record including at least one field with an ID
less than 16
([required|https://developers.google.com/protocol-buffers/docs/encoding#order]
to be written sequentially, ordered by field ID). We could require the first
field, and match on its ID and type explicitly, rather than allowing it to
float.
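For reference, here's a minimal sketch of the arithmetic behind that gap, assuming
the legacy format begins with {{WritableUtils.writeVLong(expiryTime)}} and the PB
format begins with a tag byte for a field numbered below 16; the class and method
names are illustrative, not from the patch:
{code:java}
import java.io.DataInput;
import java.io.IOException;

// Illustrative only: shows why the first byte separates the two encodings.
public class EncodingGapSketch {
  // Protobuf tag byte: (fieldNumber << 3) | wireType. For fieldNumber <= 15 and
  // wireType <= 7 this is at most (15 << 3) | 7 = 127, non-negative as a signed byte.
  static int protobufTagByte(int fieldNumber, int wireType) {
    return (fieldNumber << 3) | wireType;
  }

  // WritableUtils.writeVLong emits the value itself as a single byte only for
  // values in [-112, 127]; a realistic expiryTime (millis since the epoch) is far
  // larger, so its first byte is a negative length marker.
  static boolean looksLikeLegacy(byte firstByte) {
    return firstByte < 0;
  }

  // Hypothetical dispatcher: read one byte, then pick the decoder.
  void readFields(DataInput in) throws IOException {
    byte first = in.readByte();
    if (looksLikeLegacy(first)) {
      // readFieldsLegacy(in, first);   // Writable path (2.x tokens)
    } else {
      // readFieldsProtobuf(in, first); // protobuf path (3.x tokens)
    }
  }
}
{code}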
An explicit check could be removed as part of deprecating the field, but...
this is becoming harder to maintain, for a pretty marginal gain over the
guarantees we get by exploiting the gap between encodings. If we moved to an
equality check, we couldn't deprecate the first field for as long as we support
backwards compatibility with *3.x*. That's true even if we add the magic bytes
as a constant, optional PB field. I agree that, in a way, that's morally
equivalent to not deprecating 15 fields, but we're more likely to get the latter
right by accident after this JIRA is forgotten.
There's one more level of defensible paranoia: add code to {{writeLegacy}} that
refuses to write an expiryTime that would confuse the switch, and a unit test
affirming the same. That should at least cause noisy failures in the server
attempting to issue confusing tokens, but if we change the semantics of
expiryTime then we've already broken backwards compatibility with 2.x. Though
someone may read this later and curse me, checking this for each block token
seems like unnecessary overhead.
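To make that concrete, a sketch of the kind of guard meant here, assuming
{{writeLegacy}} begins by writing the vlong expiryTime; the exception and the
exact threshold are assumptions, not the committed patch:
{code:java}
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableUtils;

public class WriteLegacyGuardSketch {
  void writeLegacy(DataOutput out, long expiryTime) throws IOException {
    // A vlong in [0, 127] is written as a single non-negative byte, the same range
    // protobuf tag bytes occupy for field numbers below 16. Such a value would
    // confuse the readFields() switch, so fail loudly on the issuing server.
    if (expiryTime >= 0 && expiryTime <= 127) {
      throw new IOException("expiryTime " + expiryTime
          + " is ambiguous with a protobuf-encoded token");
    }
    WritableUtils.writeVLong(out, expiryTime);
    // ... remaining legacy fields follow as before ...
  }
}
{code}
The matching unit test would just assert that {{writeLegacy}} rejects an
expiryTime in that range.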
Thoughts? We're down to debating the merits of approaches that are unlikely to
be distinguished in practice, so with all the above being said, I'd be fine
with any of the three approaches in this order:
# *current patch* Rely on the gap between encodings (expiryTime > 128, PB field
number \| type < 16)
# Fix the first field and match its number and type exactly (see the sketch
after this list)
# Define a new PB field, {{magic_number}}, that has a constant encoding and
match on it (adding a version field for a PB record, really)
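For comparison, a sketch of option 2, assuming (hypothetically) that the first
serialized field is field number 1 with varint wire type; pinning it like this is
what ties our hands on deprecating that field while we keep *3.x* compatibility:
{code:java}
public class FirstFieldMatchSketch {
  private static final int WIRETYPE_VARINT = 0;
  // Exact tag byte for the assumed first field: number 1, varint-encoded => 0x08.
  private static final int EXPECTED_FIRST_TAG = (1 << 3) | WIRETYPE_VARINT;

  // Anything other than the pinned first field's tag is treated as a legacy token.
  static boolean isProtobufEncoded(byte firstByte) {
    return (firstByte & 0xFF) == EXPECTED_FIRST_TAG;
  }
}
{code}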
> Convert BlockTokenIdentifier to use Protobuf
> --------------------------------------------
>
> Key: HDFS-11026
> URL: https://issues.apache.org/jira/browse/HDFS-11026
> Project: Hadoop HDFS
> Issue Type: Task
> Components: hdfs, hdfs-client
> Affects Versions: 2.9.0, 3.0.0-alpha1
> Reporter: Ewan Higgs
> Assignee: Ewan Higgs
> Fix For: 3.0.0-alpha3
>
> Attachments: blocktokenidentifier-protobuf.patch,
> HDFS-11026.002.patch, HDFS-11026.003.patch, HDFS-11026.004.patch,
> HDFS-11026.005.patch
>
>
> {{BlockTokenIdentifier}} currently uses a {{DataInput}}/{{DataOutput}}
> (basically a {{byte[]}}) and manual serialization to get data into and out of
> the encrypted buffer (in {{BlockKeyProto}}). Other TokenIdentifiers (e.g.
> {{ContainerTokenIdentifier}}, {{AMRMTokenIdentifier}}) use Protobuf. The
> {{BlockTokenIdentifier}} should use Protobuf as well so it can be expanded
> more easily and will be consistent with the rest of the system.
> NB: Release of this will require a version update since 2.8.x won't be able
> to decipher {{BlockKeyProto.keyBytes}} from 2.8.y.