This is an automated email from the ASF dual-hosted git repository.
maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new c89221b02bb3 [SPARK-45224][PYTHON] Add examples w/ map and array as parameters of `sql()`
c89221b02bb3 is described below
commit c89221b02bb3000f707a31322e6d40b561e527bd
Author: Max Gekk <[email protected]>
AuthorDate: Wed Sep 20 11:09:01 2023 +0300
[SPARK-45224][PYTHON] Add examples w/ map and array as parameters of `sql()`
### What changes were proposed in this pull request?
In the PR, I propose to add a few more examples with array and map parameters for the `sql()` method in the PySpark API.
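For context, the two new doctest examples combine the `{df}` string-formatting substitution with parameterized literals. Here is a self-contained sketch of both (assuming a Spark build that includes #42752 and #42470; the contents of `mydf` are hypothetical, chosen so the output matches the expected rows in the docstring):
```
# Hypothetical stand-in for the docstring's `mydf`; requires a Spark build
# that includes the map/array parameter support from #42752 and #42470.
from pyspark.sql import SparkSession
from pyspark.sql.functions import array, create_map, lit

spark = SparkSession.builder.getOrCreate()
mydf = spark.createDataFrame([(1, 4), (2, 5), (3, 6)], ["A", "B"])

# Named parameters: `:minB` binds an int, `:m` binds a map literal
# built with create_map().
spark.sql(
    "SELECT *, element_at(:m, 'a') AS C FROM {df} WHERE {df[B]} > :minB",
    {"minB": 5, "m": create_map(lit("a"), lit(1))},
    df=mydf,
).show()

# Positional parameters: the first `?` binds an array literal built
# with array(); the remaining two bind plain ints.
spark.sql(
    "SELECT *, element_at(?, 1) AS C FROM {df} WHERE {df[B]} > ? and ? < {df[A]}",
    args=[array(lit(1), lit(2), lit(3)), 5, 2],
    df=mydf,
).show()
```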
### Why are the changes needed?
To inform users about the recent changes introduced by #42752 and #42470, and to verify that the changes actually work.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
By running new examples:
```
$ python/run-tests --parallelism=1 --testnames 'pyspark.sql.session SparkSession.sql'
```
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #42996 from MaxGekk/map-sql-parameterized-python-connect.
Authored-by: Max Gekk <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
---
python/pyspark/sql/session.py | 30 +++++++++++++++++-------------
1 file changed, 17 insertions(+), 13 deletions(-)
diff --git a/python/pyspark/sql/session.py b/python/pyspark/sql/session.py
index dc4f8f321a59..de2e8d0cda2a 100644
--- a/python/pyspark/sql/session.py
+++ b/python/pyspark/sql/session.py
@@ -1599,23 +1599,27 @@ class SparkSession(SparkConversionMixin):
And substitute named parameters with the `:` prefix by SQL literals.
- >>> spark.sql("SELECT * FROM {df} WHERE {df[B]} > :minB", {"minB" : 5}, df=mydf).show()
- +---+---+
- | A| B|
- +---+---+
- | 3| 6|
- +---+---+
+ >>> from pyspark.sql.functions import create_map
+ >>> spark.sql(
+ ... "SELECT *, element_at(:m, 'a') AS C FROM {df} WHERE {df[B]} >
:minB",
+ ... {"minB" : 5, "m" : create_map(lit('a'), lit(1))}, df=mydf).show()
+ +---+---+---+
+ | A| B| C|
+ +---+---+---+
+ | 3| 6| 1|
+ +---+---+---+
Or positional parameters marked by `?` in the SQL query by SQL literals.
+ >>> from pyspark.sql.functions import array
>>> spark.sql(
- ... "SELECT * FROM {df} WHERE {df[B]} > ? and ? < {df[A]}",
- ... args=[5, 2], df=mydf).show()
- +---+---+
- | A| B|
- +---+---+
- | 3| 6|
- +---+---+
+ ... "SELECT *, element_at(?, 1) AS C FROM {df} WHERE {df[B]} > ? and
? < {df[A]}",
+ ... args=[array(lit(1), lit(2), lit(3)), 5, 2], df=mydf).show()
+ +---+---+---+
+ | A| B| C|
+ +---+---+---+
+ | 3| 6| 1|
+ +---+---+---+
"""
formatter = SQLStringFormatter(self)
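One note for readers copying the new hunk standalone: the first added example imports only `create_map` yet also calls `lit`, which an earlier example in the same docstring presumably imports already. Outside that docstring, both imports are needed (a minimal sketch, assuming a standalone context):
```
# Hypothetical standalone usage: import both helpers explicitly,
# since `lit` is not imported within the new doctest itself.
from pyspark.sql.functions import create_map, lit
```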