kingkongpoon edited a comment on issue #2557:
URL: https://github.com/apache/hudi/issues/2557#issuecomment-783209847
> To help investigate better
>
> * Can you post the configs you used to write to Hudi?
> * Can you post a screenshot of the Spark stages, so that we know where it's
failing and can relate it to some of the configs used.
> * Can you give some rough idea of your dataset's record keys? Is it
completely random, or does it have some ordering to it? What is it made of?
> * I assume you are using regular bloom as the index type.
My Spark write configuration:
```scala
input
  .write.format("org.apache.hudi")
  .option("hoodie.cleaner.commits.retained", 1)
  .option("hoodie.keep.min.commits", 2)
  .option("hoodie.keep.max.commits", 3)
  .option("hoodie.insert.shuffle.parallelism", 30)
  .option("hoodie.upsert.shuffle.parallelism", 30)
  .option(DataSourceWriteOptions.OPERATION_OPT_KEY, DataSourceWriteOptions.UPSERT_OPERATION_OPT_VAL)
  .option(DataSourceWriteOptions.TABLE_TYPE_OPT_KEY, DataSourceWriteOptions.COW_TABLE_TYPE_OPT_VAL)
  .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY, "uuid")
  .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY, "etl_modify_time")
  .option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY, "created_year,created_month,created_day,brand_id")
  .option(DataSourceWriteOptions.PAYLOAD_CLASS_OPT_KEY, classOf[DefaultHoodieRecordPayload].getName)
  .option(HoodiePayloadProps.PAYLOAD_ORDERING_FIELD_PROP, "etl_modify_time")
  .option("hoodie.table.name", "std_order")
  .option(DataSourceWriteOptions.HIVE_URL_OPT_KEY, hiveserver2)
  .option(DataSourceWriteOptions.HIVE_DATABASE_OPT_KEY, "dwd_std")
  .option(DataSourceWriteOptions.HIVE_TABLE_OPT_KEY, "std_order")
  .option(DataSourceWriteOptions.KEYGENERATOR_CLASS_OPT_KEY, classOf[ComplexKeyGenerator].getName)
  .option(DataSourceWriteOptions.HIVE_PARTITION_FIELDS_OPT_KEY, "created_year,created_month,created_day,brand_id")
  .option(DataSourceWriteOptions.HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY, classOf[MultiPartKeysValueExtractor].getName)
  .option(DataSourceWriteOptions.HIVE_SYNC_ENABLED_OPT_KEY, "true")
  .option(HoodieIndexConfig.BLOOM_INDEX_UPDATE_PARTITION_PATH, "true")
  .option(HoodieIndexConfig.INDEX_TYPE_PROP, HoodieIndex.IndexType.GLOBAL_BLOOM.name())
  .mode(SaveMode.Overwrite)
  // .mode(SaveMode.Append)
  .save(basePath)
```
```shell
spark-submit --master yarn --driver-memory 4G --executor-memory 8G \
  --executor-cores 4 --num-executors 10 \
  --conf spark.executor.memoryOverhead=4G \
  --conf spark.yarn.max.executor.failures=100 \
  --class com.qmtec.peony.newcrm.hudi.process \
  --jars hudi-hadoop-mr-bundle-0.7.0.jar \
  --jars hudi-hive-sync-bundle-0.7.0.jar \
  --jars hudi-spark-bundle_2.11-0.7.0.jar \
  qmtec-peony-etl-hudi-1.0.jar
```
`uuid` is the `tid` from the order table data, and it is unique. When I first write
the data to HDFS with `.mode(SaveMode.Overwrite)`, the Hive table is created
successfully, and the file in HDFS is about 520 MB.
But when I run the same code, configuration, and data with `.mode(SaveMode.Append)`,
the process throws these errors:
```
21/02/22 15:57:43 ERROR [dispatcher-event-loop-5] YarnScheduler: Lost
executor 4 on emr-worker-2.cluster-47763: Container from a bad node:
container_e10_1610102487810_52481_01_000005 on host:
emr-worker-2.cluster-47763. Exit status: 137. Diagnostics: Container killed on
request. Exit code is 137
Container exited with a non-zero exit code 137
Killed by external signal
.
21/02/22 15:57:45 ERROR [dispatcher-event-loop-7] YarnScheduler: Lost
executor 5 on emr-worker-4.cluster-47763: Container from a bad node:
container_e10_1610102487810_52481_01_000006 on host:
emr-worker-4.cluster-47763. Exit status: 137. Diagnostics: Container killed on
request. Exit code is 137
Container exited with a non-zero exit code 137
Killed by external signal
.
21/02/22 15:58:12 ERROR [dispatcher-event-loop-2] YarnScheduler: Lost
executor 7 on emr-worker-4.cluster-47763: Container from a bad node:
container_e10_1610102487810_52481_01_000009 on host:
emr-worker-4.cluster-47763. Exit status: 137. Diagnostics: Container killed on
request. Exit code is 137
Container exited with a non-zero exit code 137
Killed by external signal
.
21/02/22 15:58:31 ERROR [dispatcher-event-loop-4] YarnScheduler: Lost
executor 8 on emr-worker-4.cluster-47763: Container from a bad node:
container_e10_1610102487810_52481_01_000010 on host:
emr-worker-4.cluster-47763. Exit status: 1. Diagnostics: Exception from
container-launch.
Container id: container_e10_1610102487810_52481_01_000010
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:972)
at org.apache.hadoop.util.Shell.run(Shell.java:869)
at
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170)
at
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:235)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Container exited with a non-zero exit code 1
```
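(A note on the log above: exit status 137 is 128 + 9, i.e. the container process was terminated with SIGKILL, which YARN or the kernel OOM killer typically sends when a container exceeds its memory limit. The mapping can be reproduced in any POSIX shell:)

```shell
# A process killed by SIGKILL (signal 9) reports exit status 128 + 9 = 137,
# matching the "Exit status: 137" lines in the YARN log above.
sh -c 'kill -9 $$'
echo $?   # prints 137
```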
But sometimes it runs successfully; in that case there are two parquet files,
each also about 520 MB.
The table root path also contains a `.hoodie` folder, and it grows larger
every time I run the job.
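(Editorial aside: the `.hoodie` folder holds the table's timeline metadata, and each commit, clean, and archive action writes new files into it, so some growth per run is expected. A quick way to inspect it, with an illustrative path standing in for the real `basePath`:)

```shell
# List the Hudi timeline under the table base path (illustrative path);
# instant files such as <ts>.commit and <ts>.clean accumulate here,
# which is why the .hoodie folder grows with every run.
hdfs dfs -ls /path/to/basePath/.hoodie
```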
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]