You can read https://iceberg.apache.org/custom-catalog/#custom-file-io-implementation for more details on loading a custom FileIO implementation, and see http://iceberg.apache.org/aws/#s3-fileio for an example.

-Jack
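
To make the per-scheme question quoted below a bit more concrete, here is a minimal sketch of routing a fully qualified location to a handler based on its URI scheme. Everything in it is an assumption for illustration: `SchemeRoutingSketch` and `PathHandler` are hypothetical names, and `PathHandler` is only a stand-in for a FileIO-like component, not Iceberg's actual FileIO interface. A real custom FileIO would wrap this kind of lookup behind the interface described in the custom-catalog page linked above.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: none of these types come from Iceberg.
public class SchemeRoutingSketch {

    /** Stand-in for a FileIO-like component (hypothetical, not Iceberg's FileIO). */
    public interface PathHandler {
        String describe(String location);
    }

    private final Map<String, PathHandler> handlersByScheme = new HashMap<>();
    private final PathHandler fallback;

    public SchemeRoutingSketch(PathHandler fallback) {
        this.fallback = fallback;
    }

    /** Registers a handler for one URI scheme, e.g. "hdfs" or "wasb". */
    public void register(String scheme, PathHandler handler) {
        handlersByScheme.put(scheme.toLowerCase(Locale.ROOT), handler);
    }

    /** Extracts the scheme; rejects locations that are not fully qualified. */
    public static String schemeOf(String location) {
        String scheme = URI.create(location).getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException(
                "Location is not fully qualified: " + location);
        }
        return scheme.toLowerCase(Locale.ROOT);
    }

    /** Returns the handler registered for the location's scheme, else the fallback. */
    public PathHandler handlerFor(String location) {
        return handlersByScheme.getOrDefault(schemeOf(location), fallback);
    }

    public static void main(String[] args) {
        SchemeRoutingSketch router =
            new SchemeRoutingSketch(loc -> "default handles " + loc);
        router.register("hdfs", loc -> "custom-hdfs handles " + loc);

        String hdfsFile = "hdfs://namenode/warehouse/db/t/data/f1.parquet";
        String wasbFile = "wasb://container@account/warehouse/db/t/data/f2.parquet";

        // hdfs:// goes to the custom handler; wasb:// falls through to the default.
        System.out.println(router.handlerFor(hdfsFile).describe(hdfsFile));
        System.out.println(router.handlerFor(wasbFile).describe(wasbFile));
    }
}
```

The sketch rejects locations without a scheme, which lines up with the note in the quoted thread that paths currently need to be fully qualified.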
On Tue, May 18, 2021 at 10:16 AM Vivekanand Vellanki <[email protected]> wrote:

> Is it possible to make the FileIO implementation extensible for a schema?
>
> For e.g. for schema hdfs://, can I ensure that Iceberg uses my custom
> implementation of FileIO at run time?
>
> On Tue, May 18, 2021 at 9:45 PM Daniel Weeks <[email protected]> wrote:
>
>> Hey Vivek,
>>
>> The file_path per spec is technically just a string, but the
>> representation is expected to be a URI.
>>
>> How this URI is interpreted is really up to the FileIO implementation.
>> So for example, the most common FileIO implementation is probably
>> HadoopFileIO, which is going to use whatever file system scheme mapping
>> you've defined in your configuration (typically via core-site.xml).
>>
>> For the Azure case (I'm not very familiar with this), it looks like
>> AdlFileSystem is the Hadoop FileSystem implementation. So, if you map wasb
>> -> AdlFileSystem, then you would want to use the URI format you described.
>>
>> There are more custom FileIO implementations (like S3FileIO), that are
>> more specific about URI representations, but HadoopFileIO approach is
>> probably more common at this point and relies on how Hadoop will resolve
>> the URI.
>>
>> The only other thing I would note is that at this point the paths still
>> need to be fully qualified (though there are some discussions ongoing about
>> relative paths).
>>
>> Hope that helps,
>> -Dan
>>
>> On Thu, May 13, 2021 at 5:30 AM Vivekanand Vellanki <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> We are trying to create Iceberg tables on ADLS. What is the format for
>>> referencing data files in ADLS from Manifest files?
>>>
>>> We are seeing Spark use something like:
>>> wasb://<container>@account/<file path>
>>>
>>> Is there a standard for how data files should be referenced within
>>> manifest files?
>>>
>>> Thanks
>>> Vivek
