Haejoon Lee created SPARK-43611:
-----------------------------------

             Summary: Fix unexpected `AnalysisException` from Spark Connect 
client
                 Key: SPARK-43611
                 URL: https://issues.apache.org/jira/browse/SPARK-43611
             Project: Spark
          Issue Type: Sub-task
          Components: Connect, Pandas API on Spark
    Affects Versions: 3.5.0
            Reporter: Haejoon Lee


Calling {{DataFrame.append}} (which delegates to {{ps.concat}}) on two pandas-on-Spark DataFrames with different columns raises an unexpected {{AnalysisException}} ("fail to find subplan with plan_id=...") from the Spark Connect client instead of returning the concatenated result.

Reproducible example:
{code:python}
>>> import pyspark.pandas as ps
>>> psdf1 = ps.DataFrame({"A": [1, 2, 3]})
>>> psdf2 = ps.DataFrame({"B": [1, 2, 3]})
>>> psdf1.append(psdf2)
/Users/haejoon.lee/Desktop/git_store/spark/python/pyspark/pandas/frame.py:8897: FutureWarning: The DataFrame.append method is deprecated and will be removed in a future version. Use pyspark.pandas.concat instead.
  warnings.warn(
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/haejoon.lee/Desktop/git_store/spark/python/pyspark/pandas/frame.py", line 8930, in append
    return cast(DataFrame, concat([self, other], ignore_index=ignore_index))
  File "/Users/haejoon.lee/Desktop/git_store/spark/python/pyspark/pandas/namespace.py", line 2703, in concat
    psdfs[0]._internal.copy(
  File "/Users/haejoon.lee/Desktop/git_store/spark/python/pyspark/pandas/internal.py", line 1508, in copy
    return InternalFrame(
  File "/Users/haejoon.lee/Desktop/git_store/spark/python/pyspark/pandas/internal.py", line 753, in __init__
    schema = spark_frame.select(data_spark_columns).schema
  File "/Users/haejoon.lee/Desktop/git_store/spark/python/pyspark/sql/connect/dataframe.py", line 1650, in schema
    return self._session.client.schema(query)
  File "/Users/haejoon.lee/Desktop/git_store/spark/python/pyspark/sql/connect/client.py", line 777, in schema
    schema = self._analyze(method="schema", plan=plan).schema
  File "/Users/haejoon.lee/Desktop/git_store/spark/python/pyspark/sql/connect/client.py", line 958, in _analyze
    self._handle_error(error)
  File "/Users/haejoon.lee/Desktop/git_store/spark/python/pyspark/sql/connect/client.py", line 1195, in _handle_error
    self._handle_rpc_error(error)
  File "/Users/haejoon.lee/Desktop/git_store/spark/python/pyspark/sql/connect/client.py", line 1231, in _handle_rpc_error
    raise convert_exception(info, status.message) from None
pyspark.errors.exceptions.connect.AnalysisException: When resolving 'A, fail to find subplan with plan_id=16 in 'Project ['A, 'B]
+- Project [__index_level_0__#1101L, A#1102L, B#1157L, monotonically_increasing_id() AS __natural_order__#1163L]
   +- Union false, false
      :- Project [__index_level_0__#1101L, A#1102L, cast(B#1116 as bigint) AS B#1157L]
      :  +- Project [__index_level_0__#1101L, A#1102L, B#1116]
      :     +- Project [__index_level_0__#1101L, A#1102L, __natural_order__#1108L, null AS B#1116]
      :        +- Project [__index_level_0__#1101L, A#1102L, __natural_order__#1108L]
      :           +- Project [__index_level_0__#1101L, A#1102L, monotonically_increasing_id() AS __natural_order__#1108L]
      :              +- Project [__index_level_0__#1097L AS __index_level_0__#1101L, A#1098L AS A#1102L]
      :                 +- LocalRelation [__index_level_0__#1097L, A#1098L]
      +- Project [__index_level_0__#1137L, cast(A#1152 as bigint) AS A#1158L, B#1138L]
         +- Project [__index_level_0__#1137L, A#1152, B#1138L]
            +- Project [__index_level_0__#1137L, B#1138L, __natural_order__#1144L, null AS A#1152]
               +- Project [__index_level_0__#1137L, B#1138L, __natural_order__#1144L]
                  +- Project [__index_level_0__#1137L, B#1138L, monotonically_increasing_id() AS __natural_order__#1144L]
                     +- Project [__index_level_0__#1133L AS __index_level_0__#1137L, B#1134L AS B#1138L]
                        +- LocalRelation [__index_level_0__#1133L, B#1134L]
{code}
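For comparison, the same operation in plain pandas (whose semantics pandas-on-Spark is expected to mirror) succeeds: concatenating frames with disjoint columns yields the union of columns with the non-shared column filled with NaN in each half. A minimal sketch of the expected result, using plain pandas only:

{code:python}
import pandas as pd

# Same inputs as the reproducer above, but in plain pandas.
pdf1 = pd.DataFrame({"A": [1, 2, 3]})
pdf2 = pd.DataFrame({"B": [1, 2, 3]})

# pd.concat takes the union of columns; missing values become NaN.
result = pd.concat([pdf1, pdf2])

print(result.shape)  # (6, 2): 3 rows from each input, columns A and B
{code}

So `psdf1.append(psdf2)` should return a 6x2 pandas-on-Spark frame with null-filled gaps rather than raising an AnalysisException.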



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
