Thanks for the suggestion. We need to evaluate the cost of converting the 
format: those Hive tables have been around for many years, so petabytes of 
data would need to be reformatted.
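
For reference while estimating that cost, the CTAS conversion suggested in the quoted reply below might look roughly like this. It is only a sketch: the table names are hypothetical placeholders, `build_ctas` and `convert` are helper names made up here, and it assumes a SparkSession that already has an Iceberg catalog configured. `write.format.default` is the Iceberg table property that selects the data file format.

```python
# Sketch only: table names and helper names are hypothetical.
ALLOWED_FORMATS = {"parquet", "orc"}


def build_ctas(source_table: str, target_table: str,
               file_format: str = "parquet") -> str:
    """Build a Spark SQL CTAS statement that copies a Hive table into a new
    Iceberg table whose data files use the given format."""
    fmt = file_format.lower()
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported target format: {file_format!r}")
    return (
        f"CREATE TABLE {target_table} USING iceberg "
        f"TBLPROPERTIES ('write.format.default'='{fmt}') "
        f"AS SELECT * FROM {source_table}"
    )


def convert(spark, source_table: str, target_table: str,
            file_format: str = "parquet") -> None:
    """Run the CTAS through an existing SparkSession.

    Assumption: `spark` can read the source Hive table and has an Iceberg
    catalog configured for the target table name.
    """
    spark.sql(build_ctas(source_table, target_table, file_format))


if __name__ == "__main__":
    # Print the statement instead of running it, so no cluster is needed.
    print(build_ctas("db.events_rc", "iceberg_cat.db.events", "parquet"))
```

Since the statement reads and rewrites every row, the cost should scale with the total table size, so timing it on one small table first would give a rough per-terabyte estimate.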

Also, do you think it would be possible to develop support for a new file 
format? How costly would that be?

Sent from my iPhone

> On Sep 29, 2021, at 9:34 PM, Russell Spitzer <russell.spit...@gmail.com> wrote:
> 
> There is no plan that I am aware of to use RCFiles directly in Iceberg. 
> While we could work to support other file formats, I don't think RCFile is 
> very widely used compared to ORC and Parquet (Iceberg has native support for 
> these formats).
> 
> My suggestion for conversion would be to do a CTAS statement in Spark and 
> have the table completely converted over to Parquet (or ORC). This is 
> probably the simplest way.
> 
>> On Sep 29, 2021, at 7:01 AM, yuan youjun <yuanyou...@gmail.com> wrote:
>> 
>> Hi community,
>> 
>> I am exploring ways to evolve existing Hive tables (RCFile) into a data 
>> lake. However, I found that Iceberg (like Hudi and Delta Lake) does not 
>> support RCFile. So my questions are:
>> 1. Is there any plan (or is it possible) to support RCFile in the future, so 
>> that we can manage those existing data files without reformatting?
>> 2. If there is no such plan, do you have any suggestions for migrating 
>> RCFiles into Iceberg?
>> 
>> Thanks
>> Youjun

