bhaskarkvvsr opened a new issue, #14037:
URL: https://github.com/apache/arrow/issues/14037

   I am trying to convert a Spark DataFrame to a pandas DataFrame with these two configuration flags enabled:
   
   ```
   'spark.sql.execution.arrow.pyspark.enabled'
   'spark.sql.execution.arrow.pyspark.fallback.enabled'
   ```
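   
   For reference, a minimal sketch of how I set them (assuming a plain `SparkSession` built in the driver, with both flags set to `"true"`):
   
   ```
   from pyspark.sql import SparkSession
   
   # Build a session with both Arrow options enabled
   spark = (
       SparkSession.builder
       .config("spark.sql.execution.arrow.pyspark.enabled", "true")
       .config("spark.sql.execution.arrow.pyspark.fallback.enabled", "true")
       .getOrCreate()
   )
   
   # Any small DataFrame reproduces it; the failure is inside toPandas() itself
   df = spark.range(10)
   pdf = df.toPandas()  # raises the Py4JError below
   ```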
   
   But calling `toPandas()` fails with the following error:
   
   ```
   File /opt/conda/envs/python385/lib/python3.8/site-packages/pyspark/sql/pandas/conversion.py:108, in PandasConversionMixin.toPandas(self)
       106 # Rename columns to avoid duplicated column names.
       107 tmp_column_names = ['col_{}'.format(i) for i in range(len(self.columns))]
   --> 108 self_destruct = self.sql_ctx._conf.arrowPySparkSelfDestructEnabled()
       109 batches = self.toDF(*tmp_column_names)._collect_as_arrow(
       110     split_batches=self_destruct)
       111 if len(batches) > 0:
   
   Py4JError: An error occurred while calling o1723.arrowPySparkSelfDestructEnabled. Trace:
   py4j.Py4JException: Method arrowPySparkSelfDestructEnabled([]) does not exist
   ```
   
   I installed `pyarrow` from conda-forge.
   
   Python version: 3.8.5
   PySpark version: 3.2.1
   PyArrow version: 8.0.0
   OS details:
   
   ```
   NAME="Ubuntu"
   VERSION="20.04.4 LTS (Focal Fossa)"
   ID=ubuntu
   ID_LIKE=debian
   PRETTY_NAME="Ubuntu 20.04.4 LTS"
   VERSION_ID="20.04"
   ```
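   
   Possibly relevant: if I understand correctly, a `py4j.Py4JException: Method ... does not exist` usually means the Python-side PySpark version does not match the Spark version running on the JVM (the self-destruct config behind this method was only added around Spark 3.2). A quick sketch of how the two can be compared (assuming `spark` is the active session):
   
   ```
   import pyspark
   
   # Version of the installed Python package vs. the version the JVM reports;
   # if these differ, toPandas() can call JVM methods that do not exist yet
   print("pyspark package:", pyspark.__version__)
   print("JVM Spark:", spark.version)
   ```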

