RussellSpitzer commented on code in PR #15239:
URL: https://github.com/apache/iceberg/pull/15239#discussion_r2886420394


##########
docs/docs/spark-configuration.md:
##########
@@ -220,6 +220,7 @@ spark.read
 | stream-from-timestamp | (none) | A timestamp in milliseconds to stream from; if before the oldest known ancestor snapshot, the oldest will be used |
 | streaming-max-files-per-micro-batch | INT_MAX | Maximum number of files per microbatch |
 | streaming-max-rows-per-micro-batch | INT_MAX | "Soft maximum" number of rows per microbatch; always includes all rows in the next unprocessed file, but excludes additional files if their inclusion would exceed the soft max limit |
+| streaming-checkpoint-use-hadoop | false | Use Hadoop FileSystem for streaming checkpoint operations instead of the table's FileIO implementation |
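For illustration, here is a minimal sketch of how the streaming read options in the table above could be passed to a PySpark streaming reader. The option names come from the table; the `spark` session, the table name `db.table`, and the chosen values are placeholder assumptions, not part of the PR.

```python
# Hypothetical option values for the streaming read options documented above.
# All values are illustrative; names match the docs table in this PR.
read_options = {
    "stream-from-timestamp": "1700000000000",       # ms timestamp to start streaming from
    "streaming-max-files-per-micro-batch": "100",   # cap on files per microbatch
    "streaming-max-rows-per-micro-batch": "50000",  # soft cap on rows per microbatch
}

def build_stream_reader(spark, table_name, options):
    """Attach each read option to a streaming DataFrame reader for an Iceberg table."""
    reader = spark.readStream.format("iceberg")
    for key, value in options.items():
        reader = reader.option(key, value)
    return reader.load(table_name)

# Usage (assuming an active SparkSession named `spark`):
# df = build_stream_reader(spark, "db.table", read_options)
```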

Review Comment:
   @danielcweeks ^ What do you think about just always using HadoopFS via HadoopFileIO? I think it's clear from the code I linked that this is what Spark requires in order to work, so the user must already have Hadoop configured correctly, while the table's FileIO can be anything.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

