[GitHub] [iceberg] zhangdove commented on issue #1230: How to read/write iceberg in Spark Structed Streaming

GitBox Wed, 21 Oct 2020 00:35:32 -0700


zhangdove commented on issue #1230:
URL: https://github.com/apache/iceberg/issues/1230#issuecomment-713372597



   While analyzing Iceberg's catalog code, I found that one issue is still 
open here, and I have made some new discoveries:
   
   `spark.read.format("iceberg").load("hdfs://nn:8020/path/to/table")`: 
loading a table this way does not go through the Iceberg catalog, so the 
table's metadata is not cached. Instead, the table is obtained directly by 
calling `IcebergSource.findTable(options,conf)`.
   
   However, when a table is loaded with `spark.table("prod.db.table")`, 
`CachingCatalog` (`cache-enabled` defaults to true) automatically looks the 
table up in its cache (a Caffeine cache).
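
To make the difference between the two load paths concrete, here is a toy model in plain Python. It is only a sketch: `TableRef`, `MetadataStore`, and `ToyCachingCatalog` are hypothetical names invented for illustration, not Iceberg or Spark classes, and the real `CachingCatalog` is considerably more involved. The model assumes only what is described above: a path-based load reads current metadata every time, while a catalog load with caching enabled returns whatever reference was cached first.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TableRef:
    """Toy stand-in for a loaded table: a name plus the snapshot it was loaded at."""
    name: str
    snapshot_id: int


class MetadataStore:
    """Toy stand-in for table metadata on storage (hypothetical, not an Iceberg class)."""

    def __init__(self) -> None:
        self._current: Dict[str, int] = {}

    def commit(self, name: str) -> None:
        # Each commit by any writer advances the table's current snapshot.
        self._current[name] = self._current.get(name, 0) + 1

    def load(self, name: str) -> TableRef:
        # Always reads the latest state, like a path-based load that skips the catalog.
        return TableRef(name, self._current[name])


class ToyCachingCatalog:
    """Toy catalog that caches table references when cache_enabled is true."""

    def __init__(self, store: MetadataStore, cache_enabled: bool = True) -> None:
        self._store = store
        self._cache_enabled = cache_enabled
        self._cache: Dict[str, TableRef] = {}

    def load_table(self, name: str) -> TableRef:
        if not self._cache_enabled:
            return self._store.load(name)
        if name not in self._cache:
            # First lookup populates the cache; later lookups reuse this reference.
            self._cache[name] = self._store.load(name)
        return self._cache[name]


def demo() -> Dict[str, int]:
    store = MetadataStore()
    store.commit("db.table")                # initial commit: snapshot 1
    catalog = ToyCachingCatalog(store)
    catalog.load_table("db.table")          # cache now holds snapshot 1
    store.commit("db.table")                # another writer commits: snapshot 2
    return {
        "via_catalog": catalog.load_table("db.table").snapshot_id,
        "via_path": store.load("db.table").snapshot_id,
    }


if __name__ == "__main__":
    print(demo())
```

In this model the catalog lookup still reports snapshot 1 after the second commit, while the direct load reports snapshot 2. Constructing the toy catalog with `cache_enabled=False` makes both paths agree.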
   
   Given this, is the description in the documentation 
[here](https://github.com/apache/iceberg/blob/master/site/docs/spark.md#querying-with-dataframes) 
incorrect?
   
   Shouldn't the correct description be the following?
   `Using spark.table("prod.db.table") loads an isolated table reference that 
is not refreshed when other queries update the table.`
   
   @rdblue What do you think of this description? Should we update it?
   


