Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20567

Yup, I also agree with adding a configuration to control this. I will work on it for master only later.

For https://github.com/apache/spark/pull/20567#issuecomment-364994740, yup, I agree with that, but to do it we should do something like:

```python
if ...:  # 'spark.sql.execution.arrow.enabled' true?
    require_minimum_pyarrow_version()
    try:
        to_arrow_schema(self.schema)
        # return the one with Arrow
    except Exception as e:
        raise Exception("'spark.sql.execution.arrow.enabled' blah blah ...")
else:
    ...  # return the one without Arrow
```

The diff and complexity are pretty similar to the fallback one:

```python
if ...:  # 'spark.sql.execution.arrow.enabled' true?
    should_fall_back = False
    try:
        require_minimum_pyarrow_version()
        to_arrow_schema(self.schema)
    except Exception as e:
        should_fall_back = True
    if not should_fall_back:
        ...  # return the one with Arrow
...  # return the one without Arrow
```

Note that, in the case of `spark.sql.codegen.fallback`, it's `true` by default, if I didn't misunderstand. Also, the latter way lets us match the behaviour of `createDataFrame` with Pandas input for now. I have been thinking this feature is in transition, and I am trying to fix and match the behaviour first before the release.
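To make the fallback shape above concrete, here is a minimal, self-contained sketch of the pattern. The names `to_pandas_like`, `arrow_path`, and `non_arrow_path` are illustrative stand-ins, not the actual PySpark internals; the real code would call `require_minimum_pyarrow_version()` and `to_arrow_schema(self.schema)` where the Arrow path is attempted.

```python
def to_pandas_like(use_arrow, arrow_path, non_arrow_path):
    """Hypothetical sketch of the fallback pattern discussed above.

    use_arrow stands in for 'spark.sql.execution.arrow.enabled';
    arrow_path/non_arrow_path stand in for the two conversion routes.
    """
    if use_arrow:
        should_fall_back = False
        try:
            # In the real code this is where require_minimum_pyarrow_version()
            # and to_arrow_schema(self.schema) would run; any failure here
            # triggers the fallback instead of raising to the user.
            result = arrow_path()
        except Exception:
            should_fall_back = True
        if not should_fall_back:
            return result  # the one with Arrow
    return non_arrow_path()  # the one without Arrow
```

The key design point, as noted above, is that the non-Arrow branch is shared between "config disabled" and "Arrow attempt failed", so the diff over the raise-an-error variant stays small.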