[GitHub] [spark] cxzl25 commented on a diff in pull request #36787: [SPARK-39387][FOLLOWUP][TESTS] Add a test case for HIVE-25190

GitBox Wed, 22 Jun 2022 03:35:54 -0700


cxzl25 commented on code in PR #36787:
URL: https://github.com/apache/spark/pull/36787#discussion_r903574621



##########
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcQuerySuite.scala:
##########
@@ -832,6 +832,18 @@ abstract class OrcQuerySuite extends OrcQueryTest with SharedSparkSession {
       }
     }
   }
+
+  test("SPARK-39387: BytesColumnVector should not throw RuntimeException due to overflow") {

Review Comment:
   This UT requires a large heap because it writes ORC end to end in order to cover the overflow case.
   
   Maybe we can mark it with `ignore` for now.
   
   Based on this comment 
https://github.com/apache/spark/pull/36772#issuecomment-1147811164, Hive also 
increased `xmx` (the JVM maximum heap size) to make its UT pass.
   
   I tried to reduce the size of each batch write, but the Spark-to-ORC write 
path still allocates some temporary buffers, so the test may still OOM.
   And if the OOM error is intercepted, the Spark process will exit abnormally 
and the UT will fail.
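
   For illustration, a minimal sketch of what "use ignore first" could look like in a ScalaTest `AnyFunSuite`-style suite. The suite name, the `HeapCheck` helper, and the 4 GiB threshold are hypothetical and are not taken from the PR. The second test shows a possible alternative that the comment does not propose: canceling the test with `assume` when the JVM heap is too small.

```scala
import org.scalatest.funsuite.AnyFunSuite

// Hypothetical helper (not part of Spark): reports whether the JVM's
// maximum heap is at least `requiredBytes`.
object HeapCheck {
  def maxHeapAtLeast(requiredBytes: Long): Boolean =
    Runtime.getRuntime.maxMemory() >= requiredBytes
}

// Hypothetical suite name, for illustration only.
class LargeHeapExampleSuite extends AnyFunSuite {

  // Assumed threshold; the real heap requirement is not stated in the thread.
  private val requiredHeapBytes: Long = 4L * 1024 * 1024 * 1024

  // Option 1: `ignore` registers the test but never runs it.
  ignore("SPARK-39387: BytesColumnVector should not throw RuntimeException due to overflow") {
    // The end-to-end ORC write would go here.
  }

  // Option 2 (an alternative, not from the comment): `assume` cancels the
  // test instead of failing it when the heap is smaller than the threshold.
  test("large-heap test gated on available memory") {
    assume(HeapCheck.maxHeapAtLeast(requiredHeapBytes), "JVM heap is too small for this test")
    // The end-to-end ORC write would go here.
  }
}
```

   With `ignore`, the test stays in the code base and shows up as ignored in the report, so it can be re-enabled by changing one word once the heap issue is settled.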



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

