zhangjun0x01 commented on issue #1437:
URL: https://github.com/apache/iceberg/issues/1437#issuecomment-699720661


   Hi @openinx,
   
   No matter which mode is used to submit a Flink job (SQL client, standalone, YARN session, per-job, or application mode), Flink uses `FlinkCatalogFactory#createCatalog` to create catalogs, so the difficulty lies in making `FlinkCatalogFactory` compatible with all submission modes. If the Hive configuration file is packaged inside the user jar, then `FlinkCatalogFactory` needs to parse the user jar to obtain the Hive configuration; for the other submission modes, it only needs a local Hive configuration path. But I think it may be difficult for `FlinkCatalogFactory` to know which submission mode the current job uses in order to choose between these loading methods.
   
   In addition, suppose that for the other submission modes we provide a local Hive configuration path, and the user executes the DDL that creates the catalog in code, for example:

   ```java
   tenv.executeSql("CREATE CATALOG hive_catalog WITH (\n" +
           "  'type'='iceberg',\n" +
           "  'catalog-type'='hive',\n" +
           "  'uri'='thrift://localhost:9083',\n" +
           "  'warehouse'='hdfs://localhost/user/hive/warehouse'\n" +
           ")");
   ```
   
   If that code is packaged into a jar and executed in application mode, should we then use the Hive configuration from the user's DDL or the hive-site.xml contained in the jar? If the two configurations differ, the user will be confused: the same DDL can be executed in the SQL client, so why can't it be executed from the program?
   
   So my idea is to provide a Hive conf path that can be either a local path or an HDFS path. If the user configures an HDFS path, we first download it to the local filesystem and then load it. If the user submits the Flink jar in application mode but the configured local path cannot be found, we give the user a prompt telling them that application mode requires an HDFS path.
   
   I think this approach is compatible with all the user program submission modes. What do you think?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
