vicennial commented on code in PR #42321:
URL: https://github.com/apache/spark/pull/42321#discussion_r1286332936
##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala:
##########
@@ -48,13 +48,12 @@ object Connect {
   val CONNECT_GRPC_ARROW_MAX_BATCH_SIZE =
     ConfigBuilder("spark.connect.grpc.arrow.maxBatchSize")
-      .doc(
-        "When using Apache Arrow, limit the maximum size of one arrow batch that " +
-          "can be sent from server side to client side. Currently, we conservatively use 70% " +
-          "of it because the size is not accurate but estimated.")
+      .doc("When using Apache Arrow, limit the maximum size of one arrow batch, in bytes unless " +
+        "otherwise specified, that can be sent from server side to client side. Currently, we " +
+        "conservatively use 70% of it because the size is not accurate but estimated.")
       .version("3.4.0")
-      .bytesConf(ByteUnit.MiB)
-      .createWithDefaultString("4m")
+      .bytesConf(ByteUnit.BYTE)
Review Comment:
@HyukjinKwon I've updated the conf to be byte-based instead of MiB-based, since the current implementation already assumed it was byte-based.
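For context, a minimal sketch of why the default unit matters, using Spark's `JavaUtils` byte-string helpers (the same parsing that backs `ConfigBuilder.bytesConf`; assumes `spark-network-common` is on the classpath). The default `ByteUnit` only changes how a suffix-less value is interpreted; explicit suffixes like `4m` are unambiguous either way:

```scala
import org.apache.spark.network.util.{ByteUnit, JavaUtils}

// With bytesConf(ByteUnit.MiB), a bare "4" is read as 4 MiB:
val asMib: Long = JavaUtils.byteStringAs("4", ByteUnit.MiB)    // 4 (MiB)

// With bytesConf(ByteUnit.BYTE), the same bare "4" means 4 bytes:
val asBytes: Long = JavaUtils.byteStringAs("4", ByteUnit.BYTE) // 4 (bytes)

// An explicit suffix resolves to the same byte count under either unit:
val fourMib: Long = JavaUtils.byteStringAsBytes("4m")          // 4194304
```

So with the conf switched to `ByteUnit.BYTE`, values set without a suffix are treated as bytes, matching what the server-side batching code already assumed.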
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]