kbendick commented on issue #5515:
URL: https://github.com/apache/iceberg/issues/5515#issuecomment-1215361026

   You mentioned that you can get the location and the `currentSnapshot`.
   
   I don't think there's anything wrong with your table. If you look at the error message, it says that one of the file groups failed to be rewritten. This _usually_ happens when a concurrent write or other operation on the table prevents the data rewrite from proceeding (because committing it would break ACID guarantees).
   
   However, I do see that the final error message says the table `hive.wrk.my_table` was not found.
   
   When you use `spark.sql("show tables in hive")` or `spark.sql("show tables 
in hive.wrk")`, are you able to see the table?
   
   Given that you didn't set the `uri` property shown in the first example at https://iceberg.apache.org/docs/latest/spark-configuration/#catalogs, I think you need to verify that the table is actually registered as `hive.wrk.my_table`.
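   
   As a minimal sketch of that configuration (the catalog name matches your `hive.` prefix, but the app name and metastore URI here are placeholder assumptions, not values from your setup):
   
   ```scala
   import org.apache.spark.sql.SparkSession
   
   // Hypothetical values -- substitute your real metastore URI.
   val spark = SparkSession.builder()
     .appName("rewrite-data-files")
     .config("spark.sql.catalog.hive", "org.apache.iceberg.spark.SparkCatalog")
     .config("spark.sql.catalog.hive.type", "hive")
     .config("spark.sql.catalog.hive.uri", "thrift://metastore-host:9083")
     .getOrCreate()
   ```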
   
   Additionally, I don't see where in the code you configured the Spark session. `SparkActions.get()` with no arguments falls back to the [current active Spark session](https://github.com/apache/iceberg/blob/ce5128f09cc697455e76af08ce6ce3c9c5b08b70/spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/SparkActions.java#L46-L48), so that session needs to be properly initialized, including the catalog configuration, before you call it.
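   
   For example, a sketch of passing the session explicitly instead (assuming the `spark` session from the sketch above; the table name is illustrative):
   
   ```scala
   import org.apache.iceberg.spark.Spark3Util
   import org.apache.iceberg.spark.actions.SparkActions
   
   // Load the Iceberg table through the configured catalog.
   val table = Spark3Util.loadIcebergTable(spark, "hive.wrk.my_table")
   
   // Pass the session explicitly rather than relying on SparkSession.active().
   SparkActions.get(spark)
     .rewriteDataFiles(table)
     .execute()
   ```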
   
   So I'd suggest:
   1. Ensure that the program, as written, can see the table when you use 
`spark.sql("SHOW TABLES IN ....")`.
   2. Consider passing the URI directly to the catalog properties when 
configuring it.
   3. Make sure that `wrk` is really the namespace, and that the table isn't 
_named_ `wrk.my_table`.
   
   If you share how you configured and submitted the program, check the output of `spark.sql("SHOW TABLES IN ....")`, and then run `DESCRIBE TABLE EXTENDED ...` on that table, that should help narrow this down.
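   
   Something like the following (the namespace and table names are placeholders for your own):
   
   ```scala
   // Confirm the catalog can see the namespace and the table.
   spark.sql("SHOW TABLES IN hive.wrk").show(truncate = false)
   
   // Inspect the table's location, provider, and other metadata.
   spark.sql("DESCRIBE TABLE EXTENDED hive.wrk.my_table").show(truncate = false)
   ```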
   
   I think it's likely that you just need to initialize the Spark session properly for the `spark` object (in order to use the `SparkActions` provider), i.e. `SparkSession.builder().....getOrCreate()` with your catalog configuration set.

