[GitHub] [incubator-hudi] nandini57 edited a comment on issue #1582: [SUPPORT] PreCombineAndUpdate in Payload

GitBox Wed, 06 May 2020 19:16:20 -0700


nandini57 edited a comment on issue #1582:
URL: https://github.com/apache/incubator-hudi/issues/1582#issuecomment-624789029



   Would switching to the parquet format instead of hudi, and doing a 
spark.read.parquet(partitionPath).dropDuplicates() filtered on 
_hoodie_commit_time = X, be an option? The following works when I want to go 
back to commit X and get a view of the data. However, the same read with the 
hudi format doesn't give me the right view as of commit X.
   
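    import org.apache.spark.sql.SparkSession

    // Show the distinct rows stamped with one commit, reading the
    // partition's parquet files directly instead of the hudi format.
    def audit(spark: SparkSession, partitionPath: String, tablePath: String,
              commitTime: String): Unit = {
      // inferSchema is dropped: parquet files already carry their schema
      val hoodieROViewDF = spark.read.parquet(tablePath + "/" + partitionPath)
      hoodieROViewDF.createOrReplaceTempView("hoodie_ro")
      // _hoodie_commit_time is a string column, so quote the literal
      spark.sql(s"select * from hoodie_ro where _hoodie_commit_time = '$commitTime'")
        .dropDuplicates().show()
    }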
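
   For comparison, the same filter can be written with the DataFrame Column 
API, which avoids building the SQL string by hand and returns the result 
instead of only printing it. This is a minimal sketch, not something from the 
Hudi codebase: the object name CommitAudit and the helper rowsAtCommit are 
hypothetical, and it assumes the partition directory holds plain parquet files 
that carry the _hoodie_commit_time metadata column. Like the snippet above, it 
keeps only rows whose commit time equals X; it does not reconstruct rows 
written by earlier commits.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical helper object, named here only for illustration.
object CommitAudit {

  // Keep only the rows stamped with exactly this commit time.
  // The commit time is compared as a string, matching the column type.
  def rowsAtCommit(df: DataFrame, commitTime: String): DataFrame =
    df.filter(col("_hoodie_commit_time") === commitTime).dropDuplicates()

  // Read the partition's parquet files directly (not via the hudi format)
  // and return the de-duplicated rows written by the given commit.
  def audit(spark: SparkSession, partitionPath: String, tablePath: String,
            commitTime: String): DataFrame =
    rowsAtCommit(spark.read.parquet(s"$tablePath/$partitionPath"), commitTime)
}
```

   Returning a DataFrame leaves it to the caller to call show(), count(), or 
write the result out.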
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
