TheNeuralBit commented on a change in pull request #11699:
URL: https://github.com/apache/beam/pull/11699#discussion_r424788208



##########
File path: sdks/python/apache_beam/io/parquetio.py
##########
@@ -567,5 +567,6 @@ def _flush_buffer(self):
     size = 0
     for x in arrays:
       for b in x.buffers():
-        size = size + b.size
+        if b is not None:
+          size = size + b.size

Review comment:
Thanks for tracking this down!

I wonder what changed in 0.17 to reveal this. Maybe the parquet writer (which we use to generate test data) wasn't eliding the null buffer before?
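
For reference, a minimal sketch of the guarded size computation in the diff above. It assumes that `buffers()` can return `None` entries (for example, in place of a validity bitmap when an array has no nulls). `FakeBuffer`, `FakeArray`, and `total_buffer_size` are hypothetical stand-ins so the snippet runs without pyarrow; they are not part of Beam or pyarrow.

```python
from collections import namedtuple

# Hypothetical stand-in for a buffer object exposing a `size` in bytes.
FakeBuffer = namedtuple("FakeBuffer", ["size"])


class FakeArray:
    """Hypothetical stand-in for an array exposing a `buffers()` method."""

    def __init__(self, buffers):
        self._buffers = buffers

    def buffers(self):
        return self._buffers


def total_buffer_size(arrays):
    """Sum the byte sizes of all buffers, skipping absent (None) ones."""
    return sum(
        b.size
        for x in arrays
        for b in x.buffers()
        if b is not None
    )


# Assumption: an array without nulls has no validity bitmap, so its first
# buffer slot is None; an array with nulls carries a real bitmap buffer.
no_nulls = FakeArray([None, FakeBuffer(size=24)])
with_nulls = FakeArray([FakeBuffer(size=1), FakeBuffer(size=24)])

total = total_buffer_size([no_nulls, with_nulls])
print(total)
```

Without the `is not None` check, the first array would raise `AttributeError` on `b.size`, which matches the failure the patch guards against.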




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

