allisonwang-db commented on code in PR #52467:
URL: https://github.com/apache/spark/pull/52467#discussion_r2389381290
##########
docs/sql-ref-datatypes.md:
##########
@@ -131,7 +131,7 @@ from pyspark.sql.types import *
 |**StringType**|str|StringType()|
 |**CharType(length)**|str|CharType(length)|
 |**VarcharType(length)**|str|VarcharType(length)|
-|**BinaryType**|bytearray|BinaryType()|
+|**BinaryType**|bytearray<br/>**Note:** When Arrow is enabled (`spark.sql.execution.arrow.pyspark.enabled=true`), BinaryType maps to `bytes` instead of `bytearray`.|BinaryType()|
 |**BooleanType**|bool|BooleanType()|

Review Comment:
   Doesn't this also require the config `spark.sql.execution.arrow.pyspark.binaryAsBytes` to be enabled? I think the note as written can be confusing. If both `spark.sql.execution.arrow.pyspark.enabled` and `spark.sql.execution.arrow.pyspark.binaryAsBytes` are enabled by default in Spark 4.1, can we directly change the documentation here to `bytes` and mention that this is a behavior change in Spark 4.1?

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
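For context on why the `bytes`-vs-`bytearray` mapping discussed above is a user-visible behavior change (and not just a docs nit), here is a minimal plain-Python sketch of how the two types differ for downstream code. This does not require Spark; it only illustrates the semantics a PySpark user would observe on collected `BinaryType` values.

```python
# bytes is immutable and hashable; bytearray is mutable and unhashable.
# Code that uses binary column values as dict keys or set members works
# with bytes but raises TypeError with bytearray.
b = bytes(b"\x00\x01")
ba = bytearray(b"\x00\x01")

# The two compare equal by content, so equality-based logic is unaffected.
assert b == ba

# bytes can serve as a dict key...
lookup = {b: "ok"}
assert lookup[bytes(b"\x00\x01")] == "ok"

# ...while bytearray cannot be hashed at all.
try:
    hash(ba)
    hashable = True
except TypeError:
    hashable = False
assert not hashable

# bytearray supports in-place mutation; bytes does not.
ba[0] = 0xFF
assert ba == bytearray(b"\xff\x01")
```

This is why flipping the default mapping (via `spark.sql.execution.arrow.pyspark.binaryAsBytes`) can silently fix some user code paths and break others, which supports documenting it explicitly as a behavior change.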
