[ 
https://issues.apache.org/jira/browse/FLINK-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16693072#comment-16693072
 ] 

ASF GitHub Bot commented on FLINK-10356:
----------------------------------------

NicoK commented on issue #6705: [FLINK-10356][network] add sanity checks to 
SpillingAdaptiveSpanningRecordDeserializer
URL: https://github.com/apache/flink/pull/6705#issuecomment-440237825
 
 
   Thanks for the new reviews @zhijiangW and @pnowojski.
   
   @zhijiangW:
   1. True, this would be easier to grasp if we could do it in 
`SpanningWrapper` or `NonSpanningWrapper`/its created input view. However, 
these work on a byte level and are called by the user's de/serializer, so we 
don't know when the user has finished reading (which we would need in order to 
check whether too few bytes have been consumed). I could hack in some 
`finishRecordDeserialization()` method, but this would still miss a few things 
we are adding now: the record length and how many bytes have been read after 
the record length. We could hack these in as well, but we would then also have 
to wrap any exception in its read methods to throw ours instead (for reading 
too many bytes).
   -> I do agree that this whole deserialization code should eventually be 
simplified, but not as part of this PR.
   2. I could actually move the error-checking code up into 
`SpillingAdaptiveSpanningRecordDeserializer#getNextRecord()` as well. I think 
it makes sense to have it together in one place - done.
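
To illustrate the "one place" idea from point 2, here is a minimal sketch of a check that runs after the user's deserializer returns and detects both too few and too many consumed bytes. It is not Flink's actual code: `RecordSanityCheckSketch`, `RecordReader`, and `readRecord` are hypothetical names, the record is assumed to be framed as a 4-byte length followed by the payload, and the payload is copied purely to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for illustration; not Flink's actual deserializer. */
final class RecordSanityCheckSketch {

    /** Hypothetical user deserializer that reads one record from a byte-level view. */
    interface RecordReader<T> {
        T read(DataInputStream view) throws IOException;
    }

    /**
     * Expects a 4-byte big-endian length followed by the payload, lets the
     * reader consume the payload, and then runs both checks in one place.
     */
    static <T> T readRecord(byte[] framed, RecordReader<T> reader) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(framed));
            int recordLength = in.readInt();
            if (recordLength < 0 || recordLength > in.available()) {
                throw new IllegalStateException("Invalid record length: " + recordLength);
            }
            // Copying the payload keeps the sketch short; real code would avoid the copy.
            byte[] payload = new byte[recordLength];
            in.readFully(payload);

            ByteArrayInputStream bytes = new ByteArrayInputStream(payload);
            T record;
            try {
                record = reader.read(new DataInputStream(bytes));
            } catch (EOFException e) {
                // The reader asked for more bytes than the record holds.
                throw new IllegalStateException(
                        "Deserializer read past the record end (recordLength=" + recordLength + ")", e);
            }

            // The reader stopped early: report how much of the record it consumed.
            int remaining = bytes.available();
            if (remaining != 0) {
                throw new IllegalStateException(
                        "Deserializer consumed " + (recordLength - remaining)
                                + " of " + recordLength + " bytes");
            }
            return record;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a 4-byte payload, a reader calling `readInt()` passes, one calling `readShort()` fails the "too few bytes" check, and one calling `readLong()` fails the "too many bytes" check. The reader itself never has to signal that it is done.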



> Add sanity checks to SpillingAdaptiveSpanningRecordDeserializer
> ---------------------------------------------------------------
>
>                 Key: FLINK-10356
>                 URL: https://issues.apache.org/jira/browse/FLINK-10356
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Network
>    Affects Versions: 1.5.0, 1.5.1, 1.5.2, 1.5.3, 1.5.4, 1.6.0, 1.6.1, 1.7.0
>            Reporter: Nico Kruber
>            Assignee: Nico Kruber
>            Priority: Major
>              Labels: pull-request-available
>
> {{SpillingAdaptiveSpanningRecordDeserializer}} doesn't have any consistency 
> checks for usage calls or serializers behaving properly, e.g. to read only as 
> many bytes as available/promised for that record. At least these checks 
> should be added:
>  # Check that buffers have not been read from yet before adding them (this is 
> an invariant {{SpillingAdaptiveSpanningRecordDeserializer}} works with and, 
> from what I can see, it is followed now).
>  # Check that after deserialization, we actually consumed {{recordLength}} 
> bytes
>  ** If not, in the spanning deserializer, we currently simply skip the 
> remaining bytes.
>  ** But in the non-spanning deserializer, we currently continue from the 
> wrong offset.
>  # Protect against {{setNextBuffer}} being called before draining all 
> available records
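
The first and third checks in the list above could look roughly like the following sketch. It is only an illustration under simplifying assumptions, not the Flink implementation: `GuardedDeserializerSketch` is a hypothetical class, it uses `java.nio.ByteBuffer` instead of Flink's buffer types, and it assumes every buffer holds only whole length-prefixed records (no spanning records).

```java
import java.nio.ByteBuffer;

/** Hypothetical, simplified deserializer; records never span buffers here. */
final class GuardedDeserializerSketch {

    private ByteBuffer current;

    /** Accepts a new buffer only if it is unread and the previous one is drained. */
    void setNextBuffer(ByteBuffer buffer) {
        // Check 1: the buffer must not have been read from yet.
        if (buffer.position() != 0) {
            throw new IllegalArgumentException(
                    "Buffer has already been read from: position=" + buffer.position());
        }
        // Check 3: all records of the previous buffer must have been drained.
        if (current != null && current.hasRemaining()) {
            throw new IllegalStateException(
                    "Previous buffer still holds " + current.remaining() + " unread bytes");
        }
        current = buffer;
    }

    /** Returns the next length-prefixed record, or null once the buffer is drained. */
    byte[] getNextRecord() {
        if (current == null || !current.hasRemaining()) {
            return null;
        }
        if (current.remaining() < Integer.BYTES) {
            throw new IllegalStateException("Truncated record length");
        }
        int recordLength = current.getInt();
        if (recordLength < 0 || recordLength > current.remaining()) {
            throw new IllegalStateException("Invalid record length: " + recordLength);
        }
        byte[] record = new byte[recordLength];
        current.get(record);
        return record;
    }
}
```

Calling `setNextBuffer` a second time before `getNextRecord()` has returned `null` fails fast instead of silently dropping the unread records.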



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
