[
https://issues.apache.org/jira/browse/FLINK-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617980#comment-16617980
]
ASF GitHub Bot commented on FLINK-10356:
----------------------------------------
NicoK opened a new pull request #6705: [FLINK-10356][network] add sanity checks
to SpillingAdaptiveSpanningRecordDeserializer
URL: https://github.com/apache/flink/pull/6705
## What is the purpose of the change
`SpillingAdaptiveSpanningRecordDeserializer` doesn't have many
consistency checks for correct usage or for serializers behaving properly,
e.g. reading only as many bytes as available/promised for that record. At
least the following checks should be added:
1. Check that buffers have not been read from yet before adding them (this
is an invariant `SpillingAdaptiveSpanningRecordDeserializer` relies on and,
from what I can see, it is currently upheld).
2. Check that after deserialization, we actually consumed `recordLength`
bytes.
- If not, the spanning deserializer currently just skips the remaining
bytes,
- but the non-spanning deserializer continues reading from the wrong
offset.
3. Protect against `setNextBuffer()` being called before draining all
available records
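The three checks above can be sketched in a simplified, self-contained form. Note this is only an illustration of the idea, not Flink's actual code: `CheckedDeserializer`, its fields, and its length-prefix framing are all invented here, and the real `SpillingAdaptiveSpanningRecordDeserializer` is considerably more involved (spanning records, spilling, etc.).

```java
import java.nio.ByteBuffer;

// Hypothetical toy deserializer illustrating the three sanity checks.
// Records are framed as: int length prefix, followed by `length` payload bytes.
public class CheckedDeserializer {
    private ByteBuffer current;   // buffer currently being drained

    public void setNextBuffer(ByteBuffer buffer) {
        // Check 3: reject a new buffer while records are still pending.
        if (current != null && current.hasRemaining()) {
            throw new IllegalStateException("Remaining data in current buffer");
        }
        // Check 1: a newly added buffer must not have been read from yet.
        if (buffer.position() != 0) {
            throw new IllegalStateException("Buffer was already read from");
        }
        current = buffer;
    }

    public byte[] getNextRecord() {
        int recordLength = current.getInt();
        int before = current.position();
        byte[] record = new byte[recordLength];
        current.get(record);   // stand-in for the real record deserialization
        // Check 2: the record must consume exactly recordLength bytes.
        int consumed = current.position() - before;
        if (consumed != recordLength) {
            throw new IllegalStateException(
                "Deserializer consumed " + consumed
                    + " bytes, expected " + recordLength);
        }
        return record;
    }
}
```

In this toy version check 2 cannot fire because `get(record)` always reads exactly `recordLength` bytes; with a user-provided serializer in the loop (as in Flink), a misbehaving implementation could read more or fewer bytes, which is exactly what the check guards against.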
## Brief change log
- add the aforementioned checks
- also include `[FLINK-9812][network][tests] fix SpanningRecordSerializationTest
not using deserializer correctly`, which was needed so that existing tests do
not fail on the added checks
- fix not resetting all fields correctly in `SpanningWrapper` (to safeguard
further usages / future changes)
This PR is based on #6693
## Verifying this change
This change is already covered by existing tests for successful
de/serialization, e.g. via `SpanningRecordSerializationTest`.
This change also added tests and can be verified as follows:
- added misbehaving serialization tests under
`SpanningRecordSerializationTest`
- adapted and fixed `CustomSerializationITCase`, working with
`ExpectedException` now (please also note that `testIncorrectSerializer4`
was not actually verifying behaviour before - it silently passed although no
exception was thrown)
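A "misbehaving serialization" test in the spirit of the added cases can be sketched as follows. Everything here is hypothetical and self-contained (no Flink or JUnit classes): a deserializer that is announced one record length but consumes fewer bytes should trip the new sanity check, and the test asserts that the exception is actually thrown, analogous to what the `ExpectedException` rule does in the real `CustomSerializationITCase`.

```java
import java.nio.ByteBuffer;

// Hypothetical test: a deserializer that reads one byte too few must be detected.
// Names and framing are invented for illustration; this is not Flink's test code.
public class MisbehavingSerializerTest {
    static byte[] deserialize(ByteBuffer buf) {
        int announced = buf.getInt();
        int before = buf.position();
        byte[] out = new byte[announced - 1];   // misbehaves: reads too few bytes
        buf.get(out);
        int consumed = buf.position() - before;
        if (consumed != announced) {            // the sanity check under test
            throw new IllegalStateException(
                "Deserializer consumed " + consumed + " of " + announced + " bytes");
        }
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.putInt(4);
        buf.put(new byte[]{1, 2, 3, 4});
        buf.flip();
        boolean threw = false;
        try {
            deserialize(buf);
        } catch (IllegalStateException expected) {
            threw = true;   // the check fired, as the test expects
        }
        if (!threw) {
            throw new AssertionError("sanity check did not fire");
        }
    }
}
```

The key point the real tests verify is the same: without the check, the mismatch would go unnoticed and later records would be read from a wrong offset; with it, the failure is surfaced immediately.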
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): **no**
- The public API, i.e., is any changed class annotated with
`@Public(Evolving)`: **no**
- The serializers: **yes** (network only!)
- The runtime per-record code paths (performance sensitive): **yes**
- Anything that affects deployment or recovery: JobManager (and its
components), Checkpointing, Yarn/Mesos, ZooKeeper: **no**
- The S3 file system connector: **no**
## Documentation
- Does this pull request introduce a new feature? **no**
- If yes, how is the feature documented? **not applicable**
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
> Add sanity checks to SpillingAdaptiveSpanningRecordDeserializer
> ---------------------------------------------------------------
>
> Key: FLINK-10356
> URL: https://issues.apache.org/jira/browse/FLINK-10356
> Project: Flink
> Issue Type: Improvement
> Components: Network
> Affects Versions: 1.5.0, 1.5.1, 1.5.2, 1.5.3, 1.6.0, 1.6.1, 1.7.0, 1.5.4
> Reporter: Nico Kruber
> Assignee: Nico Kruber
> Priority: Major
> Labels: pull-request-available
>
> {{SpillingAdaptiveSpanningRecordDeserializer}} doesn't have any consistency
> checks for usage calls or serializers behaving properly, e.g. to read only as
> many bytes as available/promised for that record. At least these checks
> should be added:
> # Check that buffers have not been read from yet before adding them (this is
> an invariant {{SpillingAdaptiveSpanningRecordDeserializer}} relies on and,
> from what I can see, it is currently upheld).
> # Check that after deserialization, we actually consumed {{recordLength}}
> bytes
> ** If not, in the spanning deserializer, we currently simply skip the
> remaining bytes.
> ** But in the non-spanning deserializer, we currently continue from the
> wrong offset.
> # Protect against {{setNextBuffer}} being called before draining all
> available records