Thanks for the suggestion. We need to evaluate the cost of converting the format: those Hive tables have been around for many years, so petabytes of data would need to be reformatted.
Also, do you think it would be feasible to develop support for a new file format? How costly would that be?

Sent from my iPhone

> On Sep 29, 2021, at 9:34 PM, Russell Spitzer <russell.spit...@gmail.com> wrote:
>
> There is no plan that I am aware of to use RCFiles directly in Iceberg. While we
> could work to support other file formats, I don't think RCFile is very widely
> used compared to ORC and Parquet (Iceberg has native support for these
> formats).
>
> My suggestion for conversion would be to do a CTAS statement in Spark and
> have the table completely converted over to Parquet (or ORC). This is
> probably the simplest way.
>
>> On Sep 29, 2021, at 7:01 AM, yuan youjun <yuanyou...@gmail.com> wrote:
>>
>> Hi community,
>>
>> I am exploring ways to evolve existing Hive tables (RCFile) into a data
>> lake. However, I found out that Iceberg (like Hudi and Delta Lake) does not
>> support RCFile. So my questions are:
>> 1. Is there any plan (or is it possible) to support RCFile in the future, so
>> that we can manage those existing data files without reformatting?
>> 2. If there is no such plan, do you have any suggestions for migrating
>> RCFiles into Iceberg?
>>
>> Thanks
>> Youjun
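
For reference, the CTAS conversion suggested in this thread could look roughly like the sketch below. It is only an illustration under stated assumptions, not a tested migration procedure: the catalog and table names (`spark_catalog.db.events_rc`, `iceberg_cat.db.events`) and the helper names `build_ctas` and `convert_table` are hypothetical, and the `spark` argument is assumed to be a SparkSession that already has an Iceberg catalog configured.

```python
def build_ctas(source_table, target_table, file_format="parquet"):
    """Build a Spark SQL CTAS statement that copies a table into Iceberg.

    Table names are hypothetical placeholders supplied by the caller.
    """
    fmt = file_format.lower()
    # Only the two formats named in this thread are accepted here.
    if fmt not in ("parquet", "orc"):
        raise ValueError(f"unsupported file format: {file_format}")
    return (
        f"CREATE TABLE {target_table} "
        "USING iceberg "
        f"TBLPROPERTIES ('write.format.default' = '{fmt}') "
        f"AS SELECT * FROM {source_table}"
    )


def convert_table(spark, source_table, target_table, file_format="parquet"):
    """Run the CTAS through an existing SparkSession (assumed to be
    configured with an Iceberg catalog) and return the SQL that was run."""
    statement = build_ctas(source_table, target_table, file_format)
    spark.sql(statement)
    return statement
```

With a real session this would be called as `convert_table(spark, "spark_catalog.db.events_rc", "iceberg_cat.db.events")`. Reading the existing RCFile table this way assumes Spark was started with Hive support enabled, so that the Hive table is visible to `spark.sql`. The CTAS rewrites every row, so the conversion cost raised above still applies.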