cloud-fan commented on code in PR #41782:
URL: https://github.com/apache/spark/pull/41782#discussion_r1298336899
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -487,6 +487,25 @@ object SQLConf {
       .intConf
       .createWithDefault(10000)
+  val VECTORIZED_HUGE_VECTOR_RESERVE_RATIO =
+    buildConf("spark.sql.inMemoryColumnarStorage.hugeVectorReserveRatio")
+      .doc("spark will reserve requiredCapacity * this ratio memory next time. This is only " +
+        "effective when spark.sql.inMemoryColumnarStorage.hugeVectorThreshold > 0 and required " +
+        "memory larger than that threshold.")
+      .version("3.5.0")
+      .doubleConf
+      .createWithDefault(1.2)
+
+  val VECTORIZED_HUGE_VECTOR_THRESHOLD =
+    buildConf("spark.sql.inMemoryColumnarStorage.hugeVectorThreshold")
+      .doc("When the in memory column vector is larger than this, spark will reserve " +
+        s"requiredCapacity * ${VECTORIZED_HUGE_VECTOR_RESERVE_RATIO.key} memory next time and " +
+        "free this column vector before reading next batch data. -1 means disabling the " +
+        "optimization.")
+      .version("3.5.0")
+      .bytesConf(ByteUnit.BYTE)
+      .createWithDefault(-1)
Review Comment:
Can we set this to `1` and see if there are any test failures? If not, we can
change it back to `-1` and merge it.
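
For readers following the thread, here is a minimal sketch of the sizing rule that the two config docs describe, and of why a threshold of `1` would push nearly every vector onto the new code path during a test run. It is not the PR's implementation: `HugeVectorReserveSketch` and `capacityToReserve` are hypothetical names, and the doubling used for non-huge vectors is an assumption made only for illustration.

```scala
// Hypothetical sketch, not code from the PR: it only illustrates the rule
// stated in the docs of hugeVectorThreshold and hugeVectorReserveRatio.
object HugeVectorReserveSketch {

  /** Bytes to reserve for a vector that needs `requiredCapacity` bytes.
    *
    * @param hugeThreshold value of hugeVectorThreshold; values <= 0 disable the rule
    * @param reserveRatio  value of hugeVectorReserveRatio
    */
  def capacityToReserve(
      requiredCapacity: Long,
      hugeThreshold: Long,
      reserveRatio: Double): Long = {
    val isHuge = hugeThreshold > 0 && requiredCapacity > hugeThreshold
    if (isHuge) {
      // Huge vector: grow by the configured ratio only.
      (requiredCapacity * reserveRatio).toLong
    } else {
      // Assumption for this sketch: ordinary vectors grow by doubling.
      requiredCapacity * 2L
    }
  }

  def main(args: Array[String]): Unit = {
    // Threshold -1 (the proposed default): the rule is disabled.
    println(capacityToReserve(1000L, -1L, 1.5))
    // Threshold 1: any vector larger than one byte counts as huge.
    println(capacityToReserve(1000L, 1L, 1.5))
  }
}
```

With the threshold at `-1` the first branch is never taken, so existing tests would not exercise the new behavior; temporarily defaulting to `1` makes almost every allocation take it.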
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]