bkietz commented on a change in pull request #7623:
URL: https://github.com/apache/arrow/pull/7623#discussion_r451809431



##########
File path: python/pyarrow/_dataset.pyx
##########
@@ -881,10 +881,15 @@ cdef class RowGroupInfo:
             name = frombytes(c_statistics.type.get().field(i).get().name())
             c_minmax = <CStructScalar*> c_statistics.value[i].get()
 
-            statistics[name] = {
-                'min': pyarrow_wrap_scalar(c_minmax.value[0]).as_py(),
-                'max': pyarrow_wrap_scalar(c_minmax.value[1]).as_py(),
-            }
+            try:
+                statistics[name] = {
+                    'min': pyarrow_wrap_scalar(c_minmax.value[0]).as_py(),
+                    'max': pyarrow_wrap_scalar(c_minmax.value[1]).as_py(),
+                }
+            except ValueError:
+                # Don't treat failure to parse/convert a single Scalar as a
+                # failure. The min/max will simply be missing for this field.

Review comment:
       IMHO it'd be better to raise an exception. If conversion fails in the
future for a statistics scalar that we don't yet support, a raised error will
be easier to recognize than figuring out why the statistics disappeared.
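
As a minimal sketch of the alternative suggested here (not the code in this PR), the conversion error could be re-raised with the field name attached rather than swallowed. `IntScalar`, `UnsupportedScalar`, and `collect_statistics` are hypothetical stand-ins written in plain Python so the example runs without pyarrow; only the `as_py()` method name mirrors the diff above.

```python
# Hypothetical stand-ins for illustration; none of these names are pyarrow APIs.
class IntScalar:
    """Mimics a scalar that converts cleanly to a Python object."""

    def __init__(self, value):
        self.value = value

    def as_py(self):
        return self.value


class UnsupportedScalar:
    """Mimics a scalar whose conversion to a Python object fails."""

    def as_py(self):
        raise ValueError("conversion not implemented for this type")


def collect_statistics(minmax_by_field):
    """Convert {name: (min_scalar, max_scalar)} to plain Python values.

    A conversion failure is re-raised with the field name attached
    instead of silently dropping that field's statistics.
    """
    statistics = {}
    for name, (lo, hi) in minmax_by_field.items():
        try:
            statistics[name] = {'min': lo.as_py(), 'max': hi.as_py()}
        except ValueError as exc:
            # Chain the original error so the root cause stays visible.
            raise ValueError(
                f"cannot convert statistics for field {name!r}: {exc}"
            ) from exc
    return statistics
```

With this shape, an unsupported scalar type surfaces as an error that names the offending field, whereas the `except ValueError` branch in the diff would leave that field absent from `statistics` with no indication why.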





