matthiasdg commented on issue #8713:
URL: https://github.com/apache/hudi/issues/8713#issuecomment-1551439622
@ad1happy2go I have no issue with the incremental query per the page you
linked. However, that example requires passing in the root path of the table. I
was wondering if there is an option like
`spark.read.option(...).table("name_of_table_in_hive")` -> that does not
seem to work. The main reason I'd want this is that I then wouldn't have to
care about the path of the table. I can of course get the path from the Hive
table using
`org.apache.spark.sql.catalyst.catalog.SessionCatalog.getTableMetadata` and
then pass that in, but I was wondering if there is a more direct way.
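For reference, the workaround I mean looks roughly like the sketch below: a small helper assembles Hudi's documented incremental-read options, and the commented-out wiring resolves the base path from the Hive metastore instead of hard-coding it. The helper name, table name, and instant time are illustrative, not from any Hudi API:

```scala
// Sketch only. The option keys are Hudi's documented datasource options;
// incrementalOptions is a hypothetical helper, not part of Hudi or Spark.
def incrementalOptions(beginInstant: String): Map[String, String] = Map(
  "hoodie.datasource.query.type"             -> "incremental",
  "hoodie.datasource.read.begin.instanttime" -> beginInstant
)

// Wiring it up (requires a live SparkSession with Hive support):
//   val tableId  = org.apache.spark.sql.catalyst.TableIdentifier("name_of_table_in_hive")
//   val basePath = spark.sessionState.catalog
//     .getTableMetadata(tableId).location.toString   // resolve path from the metastore
//   val df = spark.read.format("hudi")
//     .options(incrementalOptions("20230501000000")) // illustrative instant time
//     .load(basePath)
```

This still goes catalog-lookup-then-path under the hood; the question is whether Hudi can do that resolution itself via `.table(...)`.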
I understand it's about getting data above a certain instant time, but I would
have thought that is already satisfied if you load the table with an
incremental query whose start instant is `${min.commit.time}`. Are there
cases where this additional filtering is necessary?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]