[ 
https://issues.apache.org/jira/browse/BEAM-5412?focusedWorklogId=145882&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145882
 ]

ASF GitHub Bot logged work on BEAM-5412:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Sep/18 00:12
            Start Date: 20/Sep/18 00:12
    Worklog Time Spent: 10m 
      Work Description: rangadi commented on a change in pull request #6440: 
[BEAM-5412][BEAM-5408] Fixes a bug that limited the size of TFRecords
URL: https://github.com/apache/beam/pull/6440#discussion_r218998852
 
 

 ##########
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java
 ##########
 @@ -625,7 +625,19 @@ public int recordLength(byte[] data) {
       checkState(hashLong(length) == maskedCrc32OfLength, "Mismatch of length mask");
 
       ByteBuffer data = ByteBuffer.allocate((int) length);
-      checkState(inChannel.read(data) == length, "Invalid data");
+      long totalRead = 0;
+      while (true) {
 
 Review comment:
   `while (data.remaining() > 0) {` 
   
   In fact, the whole block can be replaced with `while (data.remaining() > 0 
&& channel.read(data) >= 0) {}`. Also see the note about read() returning 0. 
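
A minimal, self-contained sketch of the loop suggested above, for illustration only; it is not the code from the pull request. The class name `ReadFullySketch` and the helpers `readFully` and `shortReads` are hypothetical. `shortReads` wraps a byte array in a channel that deliberately returns at most a fixed number of bytes per `read()` call, so the short-read behaviour can be reproduced without a real file. Unlike the one-line form, this variant throws on end-of-stream instead of silently leaving the buffer partly filled.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

public class ReadFullySketch {

  /** Hypothetical helper: keeps reading until the buffer is full, or throws at end-of-stream. */
  public static void readFully(ReadableByteChannel in, ByteBuffer data) throws IOException {
    while (data.hasRemaining()) {
      int n = in.read(data);
      if (n < 0) {
        throw new EOFException("Stream ended with " + data.remaining() + " bytes still missing");
      }
      // n == 0 is allowed by the ReadableByteChannel contract (for example on a
      // non-blocking channel); this loop simply retries in that case.
    }
  }

  /** Hypothetical test channel: hands out at most maxPerRead (>= 1) bytes per read() call. */
  public static ReadableByteChannel shortReads(byte[] source, int maxPerRead) {
    ReadableByteChannel delegate = Channels.newChannel(new ByteArrayInputStream(source));
    return new ReadableByteChannel() {
      @Override
      public int read(ByteBuffer dst) throws IOException {
        if (!dst.hasRemaining()) {
          return 0;
        }
        // Read through a duplicate whose limit is capped, then advance the real buffer.
        ByteBuffer slice = dst.duplicate();
        slice.limit(Math.min(dst.limit(), dst.position() + maxPerRead));
        int n = delegate.read(slice);
        if (n > 0) {
          dst.position(dst.position() + n);
        }
        return n;
      }

      @Override
      public boolean isOpen() {
        return delegate.isOpen();
      }

      @Override
      public void close() throws IOException {
        delegate.close();
      }
    };
  }

  public static void main(String[] args) throws IOException {
    byte[] record = new byte[20_000];
    ReadableByteChannel in = shortReads(record, 8192); // cap chosen to echo the issue title
    ByteBuffer data = ByteBuffer.allocate(record.length);

    int first = in.read(data); // a single call does not fill the buffer
    System.out.println("first read returned " + first + " of " + record.length + " bytes");

    readFully(in, data); // the loop picks up the rest
    System.out.println("bytes still missing after loop: " + data.remaining());
  }
}
```

With a blocking channel, the retry on a 0 return should be rare. With a non-blocking channel the same loop would spin, so a real implementation may want to handle that case differently.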

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 145882)
    Time Spent: 1h 10m  (was: 1h)

> TFRecordIO fails with records larger than 8K
> --------------------------------------------
>
>                 Key: BEAM-5412
>                 URL: https://issues.apache.org/jira/browse/BEAM-5412
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-text
>    Affects Versions: 2.4.0
>            Reporter: Raghu Angadi
>            Assignee: Chamikara Jayalath
>            Priority: Major
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> This was reported on 
> [Stackoverflow|https://stackoverflow.com/questions/52284639/beam-java-sdk-with-tfrecord-and-compression-gzip].
>  The TFRecordIO reader assumes that a single call to {{channel.read()}} returns as 
> much as can fit in the input buffer, but {{read()}} can return fewer bytes than 
> requested. Assertion failure: 
> https://github.com/apache/beam/blob/release-2.4.0/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java#L642



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
