Hi Spark developers,

My team has an internal storage format, and we already have a Data Source V2 implementation for it. Now we want to add catalog support: each partition should be stored in this format, and the Spark catalog should manage the partition columns, just as it does for ORC and Parquet. After reading the logic of DataSource.resolveRelation, I wonder whether introducing another FileFormat for my storage spec is the only way to support catalog-managed partitions. Could any expert help confirm?

Another question concerns the comment below, which says "now catalog for data source V2 is under development". Does anyone know the progress or design of this feature?

    lazy val providingClass: Class[_] = {
      val cls = DataSource.lookupDataSource(className, sparkSession.sessionState.conf)
      // `providingClass` is used for resolving data source relation for catalog tables.
      // As now catalog for data source V2 is under development, here we fall back all the
      // [[FileDataSourceV2]] to [[FileFormat]] to guarantee the current catalog works.
      // [[FileDataSourceV2]] will still be used if we call the load()/save() method in
      // [[DataFrameReader]]/[[DataFrameWriter]], since they use method `lookupDataSource`
      // instead of `providingClass`.
      cls.newInstance() match {
        case f: FileDataSourceV2 => f.fallbackFileFormat
        case _ => cls
      }
    }

Thanks,
Kun
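P.S. For concreteness, here is a rough sketch of the FileFormat skeleton I imagine we would need if that is indeed the required path. MyFileFormat and the "myformat" short name are placeholders for our internal format, not real classes; inferSchema and prepareWrite are the two abstract methods of FileFormat that any implementation has to provide:

    import org.apache.hadoop.fs.FileStatus
    import org.apache.hadoop.mapreduce.Job
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.execution.datasources.{FileFormat, OutputWriterFactory}
    import org.apache.spark.sql.sources.DataSourceRegister
    import org.apache.spark.sql.types.StructType

    // Placeholder implementation for our internal storage format.
    class MyFileFormat extends FileFormat with DataSourceRegister {

      // Name used in `USING myformat` / .format("myformat").
      override def shortName(): String = "myformat"

      // Derive the data schema from existing files, as ORC/Parquet do.
      override def inferSchema(
          sparkSession: SparkSession,
          options: Map[String, String],
          files: Seq[FileStatus]): Option[StructType] = ???

      // Build the per-task writer factory used when inserting into the table.
      override def prepareWrite(
          sparkSession: SparkSession,
          job: Job,
          options: Map[String, String],
          dataSchema: StructType): OutputWriterFactory = ???
    }

If the fallback mechanism quoted above is the intended integration point, I assume our existing V2 source would then extend FileDataSourceV2 and return classOf[MyFileFormat] from fallbackFileFormat, so that catalog tables resolve to the V1 FileFormat while load()/save() keep using the V2 path. Please correct me if that understanding is wrong.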
Now we want to adapt catalog support for it. I expect each partition can be stored in this format and spark catalog can manage partition columns which is just like using ORC and Parquet. After checking the logic of DataSource.resolveRelation, I wonder if introducing another FileFormat for my storage spec is the only way to support catalog managed partition. Could any expert help to confirm? Another question is the following comments "now catalog for data source V2 is under development". Anyone knows the progress or design of feature? lazy val providingClass: Class[_] = { val cls = DataSource.lookupDataSource(className, sparkSession.sessionState.conf) // `providingClass` is used for resolving data source relation for catalog tables. // As now catalog for data source V2 is under development, here we fall back all the // [[FileDataSourceV2]] to [[FileFormat]] to guarantee the current catalog works. // [[FileDataSourceV2]] will still be used if we call the load()/save() method in // [[DataFrameReader]]/[[DataFrameWriter]], since they use method `lookupDataSource` // instead of `providingClass`. cls.newInstance() match { case f: FileDataSourceV2 => f.fallbackFileFormat case _ => cls } } Thanks, Kun