zhanghaicheng1 opened a new issue #3016:
URL: https://github.com/apache/iceberg/issues/3016
spark version: 3.1.2
org.apache.iceberg:iceberg-hive: 0.11.1
scala.version: 2.12.8
hadoop.version: 3.0.0-cdh6.1.1
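For completeness, a minimal sbt sketch of the dependencies this setup assumes (versions are the ones listed above; the exact artifact names are my assumption and may need adjusting to the actual build):
```
// Hypothetical sbt dependency sketch; versions match those listed above.
// Artifact names are assumptions -- adjust to the build actually in use.
libraryDependencies ++= Seq(
  "org.apache.spark"   %% "spark-sql"              % "3.1.2" % "provided",
  "org.apache.iceberg" %  "iceberg-spark3-runtime" % "0.11.1",
  "org.apache.iceberg" %  "iceberg-hive"           % "0.11.1"
)
```
The code that reproduces the problem: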
```
package com.aisino.delete

import org.apache.iceberg.spark.SparkCatalog
import org.apache.spark.sql.SparkSession

object DeleteByContition {

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .config("spark.sql.catalog.hadoop_prod.type", "hadoop") // use a Hadoop catalog
      .config("spark.sql.catalog.hadoop_prod", classOf[SparkCatalog].getName)
      // root directory (warehouse location) of the Hadoop catalog
      .config("spark.sql.catalog.hadoop_prod.warehouse",
        "hdfs://centos4:8020/doit/iceberg/warehouse/")
      .config("spark.sql.sources.partitionOverwriteMode", "dynamic")
      .appName(this.getClass.getSimpleName)
      .master("local[*]")
      .getOrCreate()

    val deleteSingleDataSQL = "DELETE FROM hadoop_prod.logging.tb_user1 WHERE id = 3"

    spark.table("hadoop_prod.logging.tb_user1").show
    spark.sql(deleteSingleDataSQL)
    spark.table("hadoop_prod.logging.tb_user1").show
  }
}
```
When this code runs, the job fails with the following exception (full log below):
```
Using Spark's default log4j profile:
org/apache/spark/log4j-defaults.properties
21/08/24 10:14:42 WARN Utils: Your hostname, ocean resolves to a loopback
address: 127.0.0.1; using 192.168.2.162 instead (on interface en0)
21/08/24 10:14:42 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to
another address
21/08/24 10:14:43 INFO SparkContext: Running Spark version 3.1.2
21/08/24 10:14:43 INFO ResourceUtils:
==============================================================
21/08/24 10:14:43 INFO ResourceUtils: No custom resources configured for
spark.driver.
21/08/24 10:14:43 INFO ResourceUtils:
==============================================================
21/08/24 10:14:43 INFO SparkContext: Submitted application:
DeleteByContition$
21/08/24 10:14:43 INFO ResourceProfile: Default ResourceProfile created,
executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: ,
memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name:
offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name:
cpus, amount: 1.0)
21/08/24 10:14:43 INFO ResourceProfile: Limiting resource is cpu
21/08/24 10:14:43 INFO ResourceProfileManager: Added ResourceProfile id: 0
21/08/24 10:14:43 INFO SecurityManager: Changing view acls to: zhc
21/08/24 10:14:43 INFO SecurityManager: Changing modify acls to: zhc
21/08/24 10:14:43 INFO SecurityManager: Changing view acls groups to:
21/08/24 10:14:43 INFO SecurityManager: Changing modify acls groups to:
21/08/24 10:14:43 INFO SecurityManager: SecurityManager: authentication
disabled; ui acls disabled; users with view permissions: Set(zhc); groups with
view permissions: Set(); users with modify permissions: Set(zhc); groups with
modify permissions: Set()
21/08/24 10:14:43 INFO Utils: Successfully started service 'sparkDriver' on
port 64863.
21/08/24 10:14:43 INFO SparkEnv: Registering MapOutputTracker
21/08/24 10:14:43 INFO SparkEnv: Registering BlockManagerMaster
21/08/24 10:14:43 INFO BlockManagerMasterEndpoint: Using
org.apache.spark.storage.DefaultTopologyMapper for getting topology information
21/08/24 10:14:43 INFO BlockManagerMasterEndpoint:
BlockManagerMasterEndpoint up
21/08/24 10:14:43 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
21/08/24 10:14:43 INFO DiskBlockManager: Created local directory at
/private/var/folders/2g/f4g3ss0n0jv5r434ngcjbbhr0000gn/T/blockmgr-7e501f6e-2327-4c37-8777-900e8c25f7ea
21/08/24 10:14:43 INFO MemoryStore: MemoryStore started with capacity 4.1 GiB
21/08/24 10:14:43 INFO SparkEnv: Registering OutputCommitCoordinator
21/08/24 10:14:44 INFO Utils: Successfully started service 'SparkUI' on port
4040.
21/08/24 10:14:44 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at
http://ocean.lan:4040
21/08/24 10:14:44 INFO Executor: Starting executor ID driver on host
ocean.lan
21/08/24 10:14:44 INFO Utils: Successfully started service
'org.apache.spark.network.netty.NettyBlockTransferService' on port 64867.
21/08/24 10:14:44 INFO NettyBlockTransferService: Server created on
ocean.lan:64867
21/08/24 10:14:44 INFO BlockManager: Using
org.apache.spark.storage.RandomBlockReplicationPolicy for block replication
policy
21/08/24 10:14:44 INFO BlockManagerMaster: Registering BlockManager
BlockManagerId(driver, ocean.lan, 64867, None)
21/08/24 10:14:44 INFO BlockManagerMasterEndpoint: Registering block manager
ocean.lan:64867 with 4.1 GiB RAM, BlockManagerId(driver, ocean.lan, 64867, None)
21/08/24 10:14:44 INFO BlockManagerMaster: Registered BlockManager
BlockManagerId(driver, ocean.lan, 64867, None)
21/08/24 10:14:44 INFO BlockManager: Initialized BlockManager:
BlockManagerId(driver, ocean.lan, 64867, None)
21/08/24 10:14:44 INFO SharedState: spark.sql.warehouse.dir is not set, but
hive.metastore.warehouse.dir is set. Setting spark.sql.warehouse.dir to the
value of hive.metastore.warehouse.dir ('/user/hive/warehouse').
21/08/24 10:14:44 INFO SharedState: Warehouse path is '/user/hive/warehouse'.
21/08/24 10:14:46 INFO BaseMetastoreCatalog: Table loaded by catalog:
hadoop_prod.logging.tb_user1
21/08/24 10:14:47 INFO MemoryStore: Block broadcast_0 stored as values in
memory (estimated size 438.4 KiB, free 4.1 GiB)
21/08/24 10:14:47 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes
in memory (estimated size 42.7 KiB, free 4.1 GiB)
21/08/24 10:14:47 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory
on ocean.lan:64867 (size: 42.7 KiB, free: 4.1 GiB)
21/08/24 10:14:47 INFO SparkContext: Created broadcast 0 from broadcast at
SparkScanBuilder.java:171
21/08/24 10:14:47 INFO MemoryStore: Block broadcast_1 stored as values in
memory (estimated size 40.0 B, free 4.1 GiB)
21/08/24 10:14:47 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes
in memory (estimated size 116.0 B, free 4.1 GiB)
21/08/24 10:14:47 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory
on ocean.lan:64867 (size: 116.0 B, free: 4.1 GiB)
21/08/24 10:14:47 INFO SparkContext: Created broadcast 1 from broadcast at
SparkScanBuilder.java:172
21/08/24 10:14:47 INFO V2ScanRelationPushDown:
Pushing operators to hadoop_prod.logging.tb_user1
Pushed Filters:
Post-Scan Filters:
Output: id#0, name#1, age#2
21/08/24 10:14:47 INFO BaseTableScan: Scanning table
hadoop_prod.logging.tb_user1 snapshot 4602380977344673234 created at 2021-08-19
11:14:20.938 with filter true
21/08/24 10:14:49 INFO CodeGenerator: Code generated in 323.058098 ms
21/08/24 10:14:49 INFO SparkContext: Starting job: show at
DeleteByContition.scala:24
21/08/24 10:14:49 INFO DAGScheduler: Got job 0 (show at
DeleteByContition.scala:24) with 1 output partitions
21/08/24 10:14:49 INFO DAGScheduler: Final stage: ResultStage 0 (show at
DeleteByContition.scala:24)
21/08/24 10:14:49 INFO DAGScheduler: Parents of final stage: List()
21/08/24 10:14:49 INFO DAGScheduler: Missing parents: List()
21/08/24 10:14:49 INFO DAGScheduler: Submitting ResultStage 0
(MapPartitionsRDD[3] at show at DeleteByContition.scala:24), which has no
missing parents
21/08/24 10:14:49 INFO MemoryStore: Block broadcast_2 stored as values in
memory (estimated size 8.0 KiB, free 4.1 GiB)
21/08/24 10:14:49 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes
in memory (estimated size 3.8 KiB, free 4.1 GiB)
21/08/24 10:14:49 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory
on ocean.lan:64867 (size: 3.8 KiB, free: 4.1 GiB)
21/08/24 10:14:49 INFO SparkContext: Created broadcast 2 from broadcast at
DAGScheduler.scala:1388
21/08/24 10:14:49 INFO DAGScheduler: Submitting 1 missing tasks from
ResultStage 0 (MapPartitionsRDD[3] at show at DeleteByContition.scala:24)
(first 15 tasks are for partitions Vector(0))
21/08/24 10:14:49 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
resource profile 0
21/08/24 10:14:49 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID
0) (ocean.lan, executor driver, partition 0, ANY, 8662 bytes)
taskResourceAssignments Map()
21/08/24 10:14:49 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
21/08/24 10:14:49 INFO ZlibFactory: Successfully loaded & initialized
native-zlib library
21/08/24 10:14:49 INFO CodecPool: Got brand-new decompressor [.gz]
21/08/24 10:14:50 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0).
1609 bytes result sent to driver
21/08/24 10:14:50 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID
0) in 884 ms on ocean.lan (executor driver) (1/1)
21/08/24 10:14:50 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks
have all completed, from pool
21/08/24 10:14:50 INFO DAGScheduler: ResultStage 0 (show at
DeleteByContition.scala:24) finished in 0.958 s
21/08/24 10:14:50 INFO DAGScheduler: Job 0 is finished. Cancelling potential
speculative or zombie tasks for this job
21/08/24 10:14:50 INFO TaskSchedulerImpl: Killing all running tasks in stage
0: Stage finished
21/08/24 10:14:50 INFO DAGScheduler: Job 0 finished: show at
DeleteByContition.scala:24, took 0.990896 s
21/08/24 10:14:50 INFO CodeGenerator: Code generated in 21.044329 ms
+---+-------------+---+
| id| name|age|
+---+-------------+---+
| 1|zhanghaicheng| 20|
| 2| xutao| 18|
| 3| sunpengcheng| 19|
+---+-------------+---+
21/08/24 10:14:50 INFO MemoryStore: Block broadcast_3 stored as values in
memory (estimated size 438.4 KiB, free 4.1 GiB)
21/08/24 10:14:50 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes
in memory (estimated size 42.7 KiB, free 4.1 GiB)
21/08/24 10:14:50 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory
on ocean.lan:64867 (size: 42.7 KiB, free: 4.1 GiB)
21/08/24 10:14:50 INFO SparkContext: Created broadcast 3 from broadcast at
SparkScanBuilder.java:171
21/08/24 10:14:50 INFO MemoryStore: Block broadcast_4 stored as values in
memory (estimated size 40.0 B, free 4.1 GiB)
21/08/24 10:14:50 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes
in memory (estimated size 116.0 B, free 4.1 GiB)
21/08/24 10:14:50 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory
on ocean.lan:64867 (size: 116.0 B, free: 4.1 GiB)
21/08/24 10:14:50 INFO SparkContext: Created broadcast 4 from broadcast at
SparkScanBuilder.java:172
21/08/24 10:14:50 INFO V2ScanRelationPushDown:
Pushing operators to hadoop_prod.logging.tb_user1
Pushed Filters:
Post-Scan Filters:
Output: id#22, name#23, age#24
Exception in thread "main" org.apache.spark.sql.AnalysisException: Cannot
delete from table hadoop_prod.logging.tb_user1 where [EqualTo(id,3)]
at
org.apache.spark.sql.execution.datasources.v2.DataSourceV2Strategy.apply(DataSourceV2Strategy.scala:251)
at
org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$1(QueryPlanner.scala:63)
at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:484)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:489)
at
org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
at
org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:67)
at
org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$3(QueryPlanner.scala:78)
at
scala.collection.TraversableOnce.$anonfun$foldLeft$1(TraversableOnce.scala:160)
at
scala.collection.TraversableOnce.$anonfun$foldLeft$1$adapted(TraversableOnce.scala:160)
at scala.collection.Iterator.foreach(Iterator.scala:941)
at scala.collection.Iterator.foreach$(Iterator.scala:941)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
at scala.collection.TraversableOnce.foldLeft(TraversableOnce.scala:160)
at scala.collection.TraversableOnce.foldLeft$(TraversableOnce.scala:158)
at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1429)
at
org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$2(QueryPlanner.scala:75)
at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:484)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490)
at
org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
at
org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:67)
at
org.apache.spark.sql.execution.QueryExecution$.createSparkPlan(QueryExecution.scala:391)
at
org.apache.spark.sql.execution.QueryExecution.$anonfun$sparkPlan$1(QueryExecution.scala:104)
at
org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
at
org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:143)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
at
org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:143)
at
org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:104)
at
org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:97)
at
org.apache.spark.sql.execution.QueryExecution.$anonfun$executedPlan$1(QueryExecution.scala:117)
at
org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
at
org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:143)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
at
org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:143)
at
org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:117)
at
org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:110)
at
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:101)
at
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
at
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
at
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3685)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:228)
at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
at
org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:618)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613)
at com.aisino.delete.DeleteByContition$.main(DeleteByContition.scala:25)
at com.aisino.delete.DeleteByContition.main(DeleteByContition.scala)
21/08/24 10:14:50 INFO SparkContext: Invoking stop() from shutdown hook
21/08/24 10:14:50 INFO SparkUI: Stopped Spark web UI at http://ocean.lan:4040
21/08/24 10:14:50 INFO MapOutputTrackerMasterEndpoint:
MapOutputTrackerMasterEndpoint stopped!
21/08/24 10:14:50 INFO MemoryStore: MemoryStore cleared
21/08/24 10:14:50 INFO BlockManager: BlockManager stopped
21/08/24 10:14:50 INFO BlockManagerMaster: BlockManagerMaster stopped
21/08/24 10:14:50 INFO
OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:
OutputCommitCoordinator stopped!
21/08/24 10:14:50 INFO SparkContext: Successfully stopped SparkContext
21/08/24 10:14:50 INFO ShutdownHookManager: Shutdown hook called
21/08/24 10:14:50 INFO ShutdownHookManager: Deleting directory
/private/var/folders/2g/f4g3ss0n0jv5r434ngcjbbhr0000gn/T/spark-481ebd46-87e2-48ea-bb1a-1e2a7c9cc762
Process finished with exit code 1
```
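For reference, the Iceberg documentation also registers its SQL extensions on the session when using SQL commands beyond plain reads and writes. Below is a minimal sketch of that variant of the configuration above (the extensions class name is taken from the Iceberg docs; everything else mirrors my code, and I have not confirmed whether this is related to the error):
```
// Sketch only: same catalog settings as above, plus Iceberg's SQL extensions.
// Not verified to change the behavior of DELETE FROM in this setup.
val spark = SparkSession.builder()
  .config("spark.sql.extensions",
    "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
  .config("spark.sql.catalog.hadoop_prod", classOf[SparkCatalog].getName)
  .config("spark.sql.catalog.hadoop_prod.type", "hadoop")
  .config("spark.sql.catalog.hadoop_prod.warehouse",
    "hdfs://centos4:8020/doit/iceberg/warehouse/")
  .master("local[*]")
  .getOrCreate()
```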
Looking forward to your reply! Thank you!