khajaasmath786 commented on issue #10356:
URL: https://github.com/apache/hudi/issues/10356#issuecomment-1862075648
I will try the solution below and see whether it works.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col
# Initialize Spark Session
spark = SparkSession.builder \
    .appName("Hudi Rollback") \
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer") \
    .getOrCreate()
# Set the base path for the Hudi dataset
basePath = "<your-hudi-table-base-path>"
# Load the Hudi dataset
hudi_df = spark.read.format("hudi").load(basePath)
# Display commit times
commit_times = (
    hudi_df.select("_hoodie_commit_time")
    .distinct()
    .orderBy("_hoodie_commit_time")
    .collect()
)
print("Commit times in the dataset:")
for commit in commit_times:
    print(commit["_hoodie_commit_time"])
# Specify the commit time you want to roll back to
target_commit_time = "20231214220739609"
# Identify commits newer than the target commit
newer_commits = [commit["_hoodie_commit_time"] for commit in commit_times
                 if commit["_hoodie_commit_time"] > target_commit_time]
# Rollback newer commits in reverse order
for commit in reversed(newer_commits):
    print(f"Rolling back commit: {commit}")
    # Perform the rollback
    # This is a placeholder, replace with actual Hudi rollback command
    # spark.sql(f"CALL hudi_rollback('{commit}')")
    # Note: The actual rollback command may vary based on Hudi version and setup
spark.stop()
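
The ordering step can also be checked on its own, without Spark. The following is only a sketch: the helper name commits_to_roll_back and the sample commit times other than the target are hypothetical, and it assumes all instant times are fixed-width timestamp strings, so that plain string comparison matches chronological order.

```python
def commits_to_roll_back(commit_times, target_commit_time):
    """Return the commits strictly newer than the target, newest first.

    Assumes every commit time is a fixed-width timestamp string, so
    lexicographic order equals chronological order (an assumption).
    """
    newer = [c for c in commit_times if c > target_commit_time]
    # Newest first, because the exception asks for greater commits
    # to be rolled back before older ones.
    return sorted(newer, reverse=True)


# Hypothetical commit times; only the target comes from the log report.
sample_commits = [
    "20231213101500000",
    "20231214220739609",
    "20231215083000123",
    "20231216120000456",
]
for commit in commits_to_roll_back(sample_commits, "20231214220739609"):
    print(f"Would roll back commit: {commit}")
```

The target commit itself is not in the returned list, so it would be rolled back last, as a separate step, if that is the intent.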
On Mon, Dec 18, 2023 at 9:53 PM Danny Chan ***@***.***> wrote:
> See the log report:
>
> Caused by: org.apache.hudi.exception.HoodieRollbackException: Found
> commits after time :20231214220739609, please rollback greater commits
> first