Tagar commented on issue #25749: [SPARK-29041][PYTHON] Allows createDataFrame to accept bytes as binary type
URL: https://github.com/apache/spark/pull/25749#issuecomment-545637887
 
 
   I have tested this fix with Python 2.7 and Python 3.6 on Spark 2.3.
   
   With this run-time "patch" applied, `createDataFrame` accepts `bytes` for a binary column in both Python 2 and Python 3:
   ```python
   >>> import pyspark.sql.types
   >>> pyspark.sql.types._acceptable_types[pyspark.sql.types.BinaryType] = (bytearray, bytes)
   >>>
   
   ```
   
   Output for Python 3 
   
   ```
   Using Python version 3.6.6 (default, Jun 28 2018 17:14:51)
   SparkSession available as 'spark'.
   >>> spark.createDataFrame([[b"abcd"]], "col binary")
   Traceback (most recent call last):
   . . .
   TypeError: field col: BinaryType can not accept object b'abcd' in type <class 'bytes'>
   >>> import pyspark.sql.types
   >>> pyspark.sql.types._acceptable_types[pyspark.sql.types.BinaryType] = (bytearray, bytes)
   >>>
   >>> spark.createDataFrame([[b"abcd"]], "col binary")
   DataFrame[col: binary]
   >>>
   
   ```
   
   Output for Python 2 
   
   ```
   Using Python version 2.7.15 (default, May  1 2018 23:32:55)
   SparkSession available as 'spark'.
   >>>
   >>> spark.createDataFrame([[b"abcd"]], "col binary")
   Traceback (most recent call last):
    . . .
   TypeError: field col: BinaryType can not accept object 'abcd' in type <type 'str'>
   >>>
   >>> import pyspark.sql.types
   >>> pyspark.sql.types._acceptable_types[pyspark.sql.types.BinaryType] = (bytearray, bytes)
   >>>
   >>> spark.createDataFrame([[b"abcd"]], "col binary")
   DataFrame[col: binary]
   >>>
   
   ```
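
For anyone who wants to apply the same run-time patch from application code, here is a minimal sketch of a slightly more defensive version. The helper name `allow_bytes_for_binary` is hypothetical and not part of PySpark. It assumes `pyspark.sql.types._acceptable_types` is the private dict used in the sessions above, which may change between Spark versions. The helper keeps whatever types are already registered and only appends the missing ones, so calling it twice is harmless.

```python
def allow_bytes_for_binary(acceptable_types, binary_type):
    """Add bytearray and bytes to the types accepted for binary_type.

    acceptable_types: a dict mapping a Spark data type class to a tuple of
    Python types (assumed shape of pyspark.sql.types._acceptable_types).
    """
    current = tuple(acceptable_types.get(binary_type, ()))
    # Append only what is missing, so repeated calls do not grow the tuple.
    missing = tuple(t for t in (bytearray, bytes) if t not in current)
    acceptable_types[binary_type] = current + missing
    return acceptable_types[binary_type]


try:
    import pyspark.sql.types as T
except ImportError:
    T = None  # PySpark not installed; the helper can still be used on a plain dict.

if T is not None:
    # Private API: same dict that the one-line patch above assigns to.
    allow_bytes_for_binary(T._acceptable_types, T.BinaryType)
```

This has to run before `spark.createDataFrame` is called with `bytes` values, the same as the one-line assignment shown in the sessions above.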
   
   
   cc @BryanCutler and @HyukjinKwon: could you please re-evaluate whether this can be backported to Spark 2.x?
   
   thanks!!
   

