[GitHub] [hudi] jasondavindev opened a new issue #4122: [SUPPORT] UPDATE command does not work in Spark SQL

GitBox Thu, 25 Nov 2021 12:55:52 -0800


jasondavindev opened a new issue #4122:
URL: https://github.com/apache/hudi/issues/4122



   I've tried to use Spark SQL to update rows in my table, but I'm receiving the error below:
   
   ```
   183073 [Thread-3] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.stats.jdbc.timeout does not exist
   183075 [Thread-3] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.stats.retries.wait does not exist
   184478 [Thread-3] WARN  org.apache.hadoop.hive.metastore.ObjectStore  - Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
   184478 [Thread-3] WARN  org.apache.hadoop.hive.metastore.ObjectStore  - setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore [email protected]
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/opt/spark/python/pyspark/sql/session.py", line 723, in sql
       return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
     File "/opt/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
     File "/opt/spark/python/pyspark/sql/utils.py", line 111, in deco
       return f(*a, **kw)
     File "/opt/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 328, in get_return_value
   py4j.protocol.Py4JJavaError: An error occurred while calling o27.sql.
   : java.lang.UnsupportedOperationException: UPDATE TABLE is not supported temporarily.
        at org.apache.spark.sql.execution.SparkStrategies$BasicOperators$.apply(SparkStrategies.scala:716)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$1(QueryPlanner.scala:63)
        at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:484)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:489)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
        at org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:67)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$3(QueryPlanner.scala:78)
   ```
   
   **To Reproduce**
   
   I saved a DataFrame in Hudi format and then registered it as a Hudi table:
   
   ```python
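   # Note: Hudi's documentation spells this option preCombineField;
   # the key below is kept exactly as originally reported.
   spark.sql(
       'create table events using hudi '
       'options (primaryKey = "id", preCombinedField = "updated_at", type = "cow") '
       'location "/tmp/data/delta/events"'
   )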
   ```
   
   Then I tried to update a row:
   
   ```python
   spark.sql('update events set name = "eita" where id = 244603')
   ```
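
One thing worth checking when reproducing this: the exception is raised from Spark's own `SparkStrategies`, which is what Spark does with an `UPDATE` plan that no extension has rewritten. Hudi's quick-start enables its SQL commands through the `spark.sql.extensions` session setting, so a session created without it is a plausible cause. The sketch below is not the reporter's setup. The helper names `missing_hudi_conf` and `build_session` and the app name are hypothetical, and it assumes the Hudi Spark bundle is already on the classpath. Only the two setting keys and values are taken from Hudi's documentation.

```python
# Session settings listed in Hudi's Spark quick-start for SQL support.
HUDI_SESSION_CONF = {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.sql.extensions": "org.apache.spark.sql.hudi.HoodieSparkSessionExtension",
}


def missing_hudi_conf(current_conf):
    """Return the expected settings that are not present in current_conf (a dict)."""
    missing = {}
    for key, expected in HUDI_SESSION_CONF.items():
        actual = current_conf.get(key, "")
        # spark.sql.extensions may hold several comma-separated class names.
        if expected not in [part.strip() for part in actual.split(",")]:
            missing[key] = expected
    return missing


def build_session(app_name="hudi-update-repro"):
    """Create (or reuse) a SparkSession with the Hudi settings applied."""
    # Imported lazily so the check above can be used without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in HUDI_SESSION_CONF.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

In an existing session, `missing_hudi_conf(dict(spark.sparkContext.getConf().getAll()))` reports which of these settings are absent. Because `spark.sql.extensions` is read when the session is created, it has to be set before the first `getOrCreate()` call rather than afterwards.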
   
   **Environment Description**
   
   * Hudi version : 0.9.0
   
   * Spark version : 3.1.2
   
   * Storage (HDFS/S3/GCS..) : Local
   
   * Running on Docker? (yes/no) : yes
   
   My setup: https://github.com/jasondavindev/delta-lake-dms-cdc/blob/main/apps/hudi.py
   

