abhibhat98 edited a comment on issue #1675:
URL: https://github.com/apache/hudi/issues/1675#issuecomment-635675628
Thanks @vinothchandar for the detailed peek into the design. I tried this:
`spark.sql("select * from test_123 where _hoodie_record_key = 'L1'").show`
but that only returned the latest commit. When I instead do this:
```scala
spark.read.format("org.apache.hudi").
  option(DataSourceReadOptions.VIEW_TYPE_OPT_KEY,
    DataSourceReadOptions.VIEW_TYPE_INCREMENTAL_OPT_VAL).
  option(BEGIN_INSTANTTIME_OPT_KEY, beginTime).
  option(END_INSTANTTIME_OPT_KEY, endTime).
  load("s3://dip-abhatia-test/hudi_test1/data")
```
I do get the earlier records, but I have to supply a begin and/or end time. If I don't care about performance (it's a one-off job that fixes things or pulls all the data), is there a way to get everything? I see that the CLI supports `fromCommitTime=0` and `maxCommits=-1`, as you mentioned, but is the same possible via Spark?
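To be concrete, this untested sketch is roughly what I'm hoping for: an incremental read starting from a "zero" begin instant and with no end instant, so that every commit is returned (I'm assuming here that a begin instant earlier than the first commit acts like `fromCommitTime=0`, and that omitting the end instant reads up to the latest commit — please correct me if that's wrong):

```scala
// Hypothetical "read everything incrementally" sketch (untested assumption,
// not confirmed behavior): begin instant "000" predates any real commit
// timestamp, and no END_INSTANTTIME_OPT_KEY means "up to latest".
val allCommits = spark.read.format("org.apache.hudi").
  option(DataSourceReadOptions.VIEW_TYPE_OPT_KEY,
    DataSourceReadOptions.VIEW_TYPE_INCREMENTAL_OPT_VAL).
  option(DataSourceReadOptions.BEGIN_INSTANTTIME_OPT_KEY, "000").
  load("s3://dip-abhatia-test/hudi_test1/data")

allCommits.filter("_hoodie_record_key = 'L1'").show
```

If that's not the intended way to do it, is there another datasource option that mirrors the CLI's `fromCommitTime=0` / `maxCommits=-1` behavior?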
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]