itholic opened a new pull request, #48964:
URL: https://github.com/apache/spark/pull/48964

   
   ### What changes were proposed in this pull request?
   
   We disabled the DataFrameQueryContext in 
https://github.com/apache/spark/pull/48827, and we also need a corresponding 
flag for PySpark for the same performance reason.
   
   ### Why are the changes needed?
   
   To avoid the performance slowdown that occurs when too many 
DataFrameQueryContext entries are stacked.
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   No API changes, but the DataFrameQueryContext is no longer displayed when 
the flag is disabled.
   
   
   ### How was this patch tested?
   
   Manually tested:
   
   1. **FLAG ON (almost 25 sec)**
   ```python
   >>> spark.conf.get("spark.sql.dataFrameQueryContext.enabled")
   'true'
   >>> import time
   >>> import pyspark.sql.functions as F
   >>>
   >>> c = F.col("name")
   >>> start = time.time()
   >>> for i in range(10000):
   ...   _ = c.alias("a")
   ...
   >>> print(time.time() - start)
   24.78217577934265
   ```
   
   2. **FLAG OFF (only 1 sec)**
   ```python
   >>> spark.conf.set("spark.sql.dataFrameQueryContext.enabled", "false")
   >>> spark.conf.get("spark.sql.dataFrameQueryContext.enabled")
   'false'
   >>> import time
   >>> import pyspark.sql.functions as F
   >>>
   >>> c = F.col("name")
   >>> start = time.time()
   >>> for i in range(10000):
   ...   _ = c.alias("a")
   ...
   >>> print(time.time() - start)
   1.0222370624542236
   ```
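
The two manual runs above repeat the same timing loop. A small helper along these lines could make the comparison easier to rerun. `time_repeated` is a hypothetical name, and the demo call uses a plain string method so the block runs without a Spark session.

```python
import time
from typing import Callable


def time_repeated(fn: Callable[[], object], n: int = 10000) -> float:
    """Call fn() n times and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    for _ in range(n):
        fn()
    return time.perf_counter() - start


# Stand-in workload so the sketch runs without Spark.
elapsed = time_repeated(lambda: "name".upper(), n=1000)
print(f"{elapsed:.6f} sec")
```

In a PySpark shell, the same helper could be called as `time_repeated(lambda: c.alias("a"))` once per flag setting, with `c = F.col("name")` as in the runs above.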
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

