zabetak commented on code in PR #1402:
URL: https://github.com/apache/orc/pull/1402#discussion_r1102569128


##########
java/core/src/java/org/apache/orc/impl/ReaderImpl.java:
##########
@@ -1035,11 +1037,15 @@ private static List<OrcProto.StripeStatistics> deserializeStripeStats(
       long offset,
       int length,
       InStream.StreamOptions options) throws IOException {
-    InStream stream = InStream.create("stripe stats", tailBuffer, offset,
-        length, options);
-    OrcProto.Metadata meta = OrcProto.Metadata.parseFrom(
-        InStream.createCodedInputStream(stream));
-    return meta.getStripeStatsList();
+    try (InStream stream = InStream.create("stripe stats", tailBuffer, offset,
+        length, options)) {
+      OrcProto.Metadata meta = OrcProto.Metadata.parseFrom(
+          InStream.createCodedInputStream(stream));
+      return meta.getStripeStatsList();
+    } catch (InvalidProtocolBufferException e) {
+      LOG.warn("Failed to parse stripe statistics; check ORC-1361 for more details.", e);

Review Comment:
   The details in the exception are probably enough for the user to understand 
that the stats are too big or corrupted. What may be harder to understand is 
why the stats are big, why they were corrupted, and what the user can do about it. 
   
   Providing some tips, such as checking the stripe size, the number of columns, 
the size of string stats, the total size of the file, etc., would be useful, and 
that is what I was thinking could go in ORC-1361 or another resource.
   
   I am fine with whatever you decide, but I would strongly suggest logging at 
least the exception that was caught.
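
   To make the suggestion concrete, here is a minimal, self-contained sketch of the pattern: log the caught exception together with a hint message, then rethrow. Everything in it is illustrative. `StripeStatsLogSketch`, `Parser`, `buildHint`, and `parseOrWarn` are hypothetical names, `java.util.logging` stands in for the project's logger, a plain `IOException` stands in for `InvalidProtocolBufferException`, and rethrowing after the warning is an assumption, not necessarily what the PR does.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class StripeStatsLogSketch {
  private static final Logger LOG =
      Logger.getLogger(StripeStatsLogSketch.class.getName());

  /** Hypothetical stand-in for the protobuf parse call. */
  interface Parser<T> {
    T parse() throws IOException;
  }

  /** Builds the hint text; the wording and the list of checks are illustrative. */
  static String buildHint(long offset, int length) {
    return "Failed to parse stripe statistics (offset=" + offset
        + ", length=" + length + "). Things to check: stripe size, number of"
        + " columns, size of string statistics, total file size."
        + " See ORC-1361 for more details.";
  }

  /** Runs the parser; on failure logs the hint plus the exception, then rethrows. */
  static <T> T parseOrWarn(Parser<T> parser, long offset, int length)
      throws IOException {
    try {
      return parser.parse();
    } catch (IOException e) {
      // Pass the exception itself so its message and stack trace are kept.
      LOG.log(Level.WARNING, buildHint(offset, length), e);
      throw e; // Assumption: the caller still needs to see the failure.
    }
  }
}
```

   Keeping the hint in a separate method makes it easy to adjust the wording, or to point at a different resource, without touching the parsing path.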
   
   Let me know what you prefer and I will make the appropriate changes.



