[ https://issues.apache.org/jira/browse/HUDI-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17461899#comment-17461899 ]

sivabalan narayanan commented on HUDI-3059:
-------------------------------------------

So I created two savepoints, as below:

c1, c2, c3, sp1, c4, sp2, c5.
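
For reference, the hudi-cli commands were roughly as follows (a sketch of the session; the instant times are placeholders, not the actual values):
{code:java}
# hudi-cli session (sketch; instant times below are placeholders)
connect --path /tmp/hudi_trips_cow
savepoint create --commit <c3_instant>          # sp1
savepoint create --commit <c4_instant>          # sp2
savepoint rollback --savepoint <sp2_instant>    # succeeded, but left trailing rollback meta files
savepoint rollback --savepoint <sp1_instant>    # failed
{code}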

I tried a savepoint rollback for sp2 and it worked, but it left trailing rollback meta files.

I then tried a savepoint rollback for sp1 and it failed; the stack trace does not contain sufficient info.
{code:java}
21/12/18 06:20:00 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[==>20211218061954430__rollback__REQUESTED]}
21/12/18 06:20:00 INFO BaseRollbackPlanActionExecutor: Requesting Rollback with instant time [==>20211218061954430__rollback__REQUESTED]
21/12/18 06:20:00 INFO ContextCleaner: Cleaned accumulator 66
.
.
.
21/12/18 06:20:00 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[==>20211218061954430__rollback__REQUESTED]}
21/12/18 06:20:00 INFO BlockManagerInfo: Removed broadcast_4_piece0 on 192.168.1.4:54359 in memory (size: 34.6 KB, free: 366.3 MB)
.
.
.
21/12/18 06:20:00 INFO HoodieActiveTimeline: Loaded instants upto : Option{val=[==>20211218061954430__rollback__REQUESTED]}
21/12/18 06:20:00 WARN SparkMain: The commit "20211217183516921" failed to roll back.
21/12/18 06:20:00 INFO SparkUI: Stopped Spark web UI at http://192.168.1.4:4042
21/12/18 06:20:00 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
21/12/18 06:20:00 INFO MemoryStore: MemoryStore cleared
21/12/18 06:20:00 INFO BlockManager: BlockManager stopped
21/12/18 06:20:00 INFO BlockManagerMaster: BlockManagerMaster stopped
21/12/18 06:20:00 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
21/12/18 06:20:00 INFO SparkContext: Successfully stopped SparkContext
21/12/18 06:20:00 INFO ShutdownHookManager: Shutdown hook called
21/12/18 06:20:00 INFO ShutdownHookManager: Deleting directory /private/var/folders/ym/8yjkm3n90kq8tk4gfmvk7y140000gn/T/spark-983167c8-60f0-493c-9d31-9d69131ddcc1
21/12/18 06:20:00 INFO ShutdownHookManager: Deleting directory /private/var/folders/ym/8yjkm3n90kq8tk4gfmvk7y140000gn/T/spark-a11f663a-03a7-47a7-87f1-39859094f0cf
hudi:hudi_trips_cow->37488835 [Spring Shell] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Loading HoodieTableMetaClient from /tmp/hudi_trips_cow
37488867 [Spring Shell] INFO  org.apache.hudi.common.table.HoodieTableConfig  - Loading table properties from /tmp/hudi_trips_cow/.hoodie/hoodie.properties
37488867 [Spring Shell] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /tmp/hudi_trips_cow
Savepoint "20211217183516921" failed to roll back {code}
 

And the timeline is left in an inflight state:
{code:java}
-rw-r--r--  1 nsb  wheel     0 Dec 17 19:57 20211217195708258.commit.requested
-rw-r--r--  1 nsb  wheel  2594 Dec 17 19:57 20211217195708258.inflight
-rw-r--r--  1 nsb  wheel  4425 Dec 17 19:57 20211217195708258.commit
-rw-r--r--  1 nsb  wheel     0 Dec 17 19:57 20211217195708258.savepoint.inflight
-rw-r--r--  1 nsb  wheel  1168 Dec 17 19:57 20211217195708258.savepoint
-rw-r--r--  1 nsb  wheel     0 Dec 17 20:00 20211217200028051.restore.inflight
-rw-r--r--  1 nsb  wheel  1703 Dec 17 20:00 20211217200028099.rollback.requested
-rw-r--r--  1 nsb  wheel  1703 Dec 17 20:00 20211217200028099.rollback.inflight
-rw-r--r--  1 nsb  wheel  2770 Dec 17 20:00 20211217200028051.restore

-rw-r--r--  1 nsb  wheel     0 Dec 18 06:19 20211218061954381.restore.inflight
-rw-r--r--  1 nsb  wheel  1703 Dec 18 06:20 20211218061954430.rollback.requested {code}
 

> save point rollback not working with hudi-cli
> ---------------------------------------------
>
>                 Key: HUDI-3059
>                 URL: https://issues.apache.org/jira/browse/HUDI-3059
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Usability
>            Reporter: sivabalan narayanan
>            Assignee: Ethan Guo
>            Priority: Major
>              Labels: sev:critical
>
> Ref issue:
> [https://github.com/apache/hudi/issues/3870]
>  
>  # create Hudi dataset
>  # add some data so there are multiple commits
>  # create a savepoint
>  # try to rollback savepoint
>  
> I tried locally.
> 1. Savepoint creation fails because the Spark master is not recognized, even
> after setting the Spark master.
> {code:java}
> hudi:hudi_trips_cow->set --conf SPARK_MASTER=local[2]
> hudi:hudi_trips_cow->savepoint create --commit 20211217183516921
> 254601 [Spring Shell] INFO  
> org.apache.hudi.common.table.timeline.HoodieActiveTimeline  - Loaded instants 
> upto : Option{val=[20211217183516921__commit__COMPLETED]}
> 21/12/17 19:36:45 WARN Utils: Your hostname, Sivabalans-MacBook-Pro.local 
> resolves to a loopback address: 127.0.0.1; using 192.168.70.90 instead (on 
> interface en0)
> 21/12/17 19:36:45 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to 
> another address
> 21/12/17 19:36:45 WARN NativeCodeLoader: Unable to load native-hadoop library 
> for your platform... using builtin-java classes where applicable
> .
> .
> .
> 21/12/17 19:36:18 ERROR SparkContext: Error initializing SparkContext.
> org.apache.spark.SparkException: Could not parse Master URL: ''
>     at 
> org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2784)
>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:493)
>     at 
> org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
>     at 
> org.apache.hudi.cli.utils.SparkUtil.initJavaSparkConf(SparkUtil.java:115)
>     at 
> org.apache.hudi.cli.utils.SparkUtil.initJavaSparkConf(SparkUtil.java:110)
>     at org.apache.hudi.cli.commands.SparkMain.main(SparkMain.java:88)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>     at 
> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
>     at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
>     at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
>     at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
>     at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> 21/12/17 19:36:18 INFO SparkUI: Stopped Spark web UI at 
> http://192.168.70.90:4042
> 21/12/17 19:36:18 INFO MapOutputTrackerMasterEndpoint: 
> MapOutputTrackerMasterEndpoint stopped!
> 21/12/17 19:36:18 INFO MemoryStore: MemoryStore cleared
> 21/12/17 19:36:18 INFO BlockManager: BlockManager stopped
> 21/12/17 19:36:18 INFO BlockManagerMaster: BlockManagerMaster stopped
> 21/12/17 19:36:18 WARN MetricsSystem: Stopping a MetricsSystem that is not 
> running
> 21/12/17 19:36:18 INFO 
> OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
> OutputCommitCoordinator stopped!
> 21/12/17 19:36:18 INFO SparkContext: Successfully stopped SparkContext
> Exception in thread "main" org.apache.spark.SparkException: Could not parse 
> Master URL: ''
>     at 
> org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2784)
>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:493)
>     at 
> org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
>     at 
> org.apache.hudi.cli.utils.SparkUtil.initJavaSparkConf(SparkUtil.java:115)
>     at 
> org.apache.hudi.cli.utils.SparkUtil.initJavaSparkConf(SparkUtil.java:110)
>     at org.apache.hudi.cli.commands.SparkMain.main(SparkMain.java:88)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>     at 
> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
>     at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
>     at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
>     at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
>     at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> 21/12/17 19:36:18 INFO ShutdownHookManager: Shutdown hook called
> 21/12/17 19:36:18 INFO ShutdownHookManager: Deleting directory 
> /private/var/folders/ym/8yjkm3n90kq8tk4gfmvk7y140000gn/T/spark-db7da71a-bb1c-453b-b43b-640c080aaf2a
> 21/12/17 19:36:18 INFO ShutdownHookManager: Deleting directory 
> /private/var/folders/ym/8yjkm3n90kq8tk4gfmvk7y140000gn/T/spark-ef4048e5-6071-458a-9a54-05bf70157db2
>  {code}
> I made a fix locally for now to get past the issue, but we need to fix it
> properly.
> {code:java}
> diff --git 
> a/hudi-cli/src/main/java/org/apache/hudi/cli/commands/SparkMain.java 
> b/hudi-cli/src/main/java/org/apache/hudi/cli/commands/SparkMain.java
> index d1ee109f5..f925bdb0c 100644
> --- a/hudi-cli/src/main/java/org/apache/hudi/cli/commands/SparkMain.java
> +++ b/hudi-cli/src/main/java/org/apache/hudi/cli/commands/SparkMain.java
> @@ -82,11 +82,11 @@ public class SparkMain {
>    public static void main(String[] args) throws Exception {
>      ValidationUtils.checkArgument(args.length >= 4);
>      final String commandString = args[0];
> -    LOG.info("Invoking SparkMain: " + commandString);
> +    LOG.warn("Invoking SparkMain: " + commandString);
>      final SparkCommand cmd = SparkCommand.valueOf(commandString);
>  
>      JavaSparkContext jsc = SparkUtil.initJavaSparkConf("hoodie-cli-" + 
> commandString,
> -        Option.of(args[1]), Option.of(args[2]));
> +        Option.of("local[2]"), Option.of(args[2]));
>  
>      int returnCode = 0;
>      try { {code}
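> The hardcoded {{local[2]}} above is only a stopgap. A more durable fix would fall back to a default master only when the CLI passes an empty master string, and keep any explicitly configured master. A minimal sketch of that idea (the helper name and default value are assumptions for illustration, not existing Hudi code):
> {code:java}
> // Hypothetical helper; not part of the Hudi codebase.
> public class MasterResolver {
>   // Assumed default for local CLI runs.
>   static final String DEFAULT_MASTER = "local[2]";
>
>   // Use the caller-supplied master when present, else fall back to the default.
>   public static String resolveMaster(String argMaster) {
>     return (argMaster == null || argMaster.trim().isEmpty()) ? DEFAULT_MASTER : argMaster;
>   }
> }
> {code}
> With that, the call site in SparkMain would pass {{Option.of(resolveMaster(args[1]))}} instead of hardcoding the master.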



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
