[ 
https://issues.apache.org/jira/browse/FLINK-36594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slankka updated FLINK-36594:
----------------------------
    Description: 
Recently I was using Hudi to sync to HMS. When HiveCatalog is used, it sets 
HiveConf.hiveSiteLocation to null, so Hudi can no longer pick up the 
hive-site.xml provided on the classpath. The root cause is that HiveCatalog 
changes this static state of HiveConf.

HiveCatalog can load hive-site.xml itself without this variable, but code that 
runs afterwards still assumes that HiveConf searches for hive-site.xml on the 
classpath.

In other words, once HiveCatalog turns this lookup off, no HiveConf instance 
created afterwards (for example, by Hudi) will load the hive-site.xml that the 
user put on the classpath or that YARN provides. The only ways around it are to 
call addResource explicitly, or to have Hive find the file inside the user's 
uber jar, which takes extra effort.

 

Example
{code:java}
// First, HiveConf's static initializer searches for hive-site.xml, and only once:

static {
  hiveSiteURL = findConfigFile(classLoader, "hive-site.xml", true);
}{code}
 
{code:java}
String name            = "myhive";
String defaultDatabase = "mydatabase";
String hiveConfDir     = "/opt/hive-conf";

HiveCatalog hive = new HiveCatalog(name, defaultDatabase, hiveConfDir);
tableEnv.registerCatalog("myhive", hive);

// set the HiveCatalog as the current catalog of the session
tableEnv.useCatalog("myhive"); {code}
After running the code above:
{code:java}
// Another framework that uses Hive natively:

HiveConf hiveConf = new HiveConf(hadoopConf, HiveConf.class); 

// or directly:

HiveConf defaultHiveConf = new HiveConf(); {code}
In both cases the resulting HiveConf does NOT load hive-site.xml from the 
classpath, which causes configuration loading to fail.
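
To make the mechanism concrete, here is a minimal, self-contained model of the problem. It does not use Hive or Flink: ModelConf and ModelCatalog are hypothetical stand-ins for HiveConf and HiveCatalog, and the location is a plain string rather than a URL. It only illustrates how a static field that is cleared once affects every instance created later in the same JVM.

```java
// Simplified stand-in for a configuration class with a JVM-wide static
// "site file" location. This is NOT the real HiveConf; all names are hypothetical.
class ModelConf {
    // Discovered once, at class initialization, like a classpath lookup.
    private static String siteLocation = "classpath:/hive-site.xml";

    private final boolean siteLoaded;

    ModelConf() {
        // Every new instance consults the shared static field.
        this.siteLoaded = (siteLocation != null);
    }

    static String getSiteLocation() { return siteLocation; }
    static void setSiteLocation(String location) { siteLocation = location; }
    boolean isSiteLoaded() { return siteLoaded; }
}

// Stand-in for a catalog that loads its own config file and, as a side
// effect, switches the classpath lookup off for the whole JVM.
class ModelCatalog {
    final ModelConf conf;

    ModelCatalog() {
        ModelConf.setSiteLocation(null); // never restored
        this.conf = new ModelConf();
    }
}

class StaticLocationDemo {
    public static void main(String[] args) {
        System.out.println(new ModelConf().isSiteLoaded()); // true: before the catalog exists
        new ModelCatalog();
        System.out.println(new ModelConf().isSiteLoaded()); // false: after the catalog was created
    }
}
```

The second instance is created by unrelated code, yet it is affected because the field is shared by every user of the class.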

 

Example code from HiveSyncConfig of Apache Hudi:

```

 public HiveSyncConfig(Properties props, Configuration hadoopConf) {
    super(props, hadoopConf);
    HiveConf hiveConf = new HiveConf();
    // HiveConf needs to load Hadoop conf to allow instantiation via AWSGlueClientFactory
    hiveConf.addResource(hadoopConf);
    setHadoopConf(hiveConf);
    validateParameters();
  }

```
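
The change asked for in the issue title (setting the location back) could take the shape of a save-and-restore around the catalog's own configuration loading. The sketch below is self-contained and uses hypothetical stand-in classes (SharedSiteConf, RestoringCatalog) instead of Hive and Flink; it is not necessarily what the linked pull request does. With the real classes, the corresponding static accessors would presumably be HiveConf.getHiveSiteLocation() and HiveConf.setHiveSiteLocation(URL), which is an assumption to verify against the Hive version in use.

```java
// Simplified stand-in for a configuration class with JVM-wide static state.
// This is NOT the real HiveConf; every name here is hypothetical.
class SharedSiteConf {
    private static String siteLocation = "classpath:/hive-site.xml";

    private final boolean siteLoaded;

    SharedSiteConf() {
        // Each instance reads the shared static field at construction time.
        this.siteLoaded = (siteLocation != null);
    }

    static String getSiteLocation() { return siteLocation; }
    static void setSiteLocation(String location) { siteLocation = location; }
    boolean isSiteLoaded() { return siteLoaded; }
}

class RestoringCatalog {
    // Builds the catalog's private configuration with the classpath lookup
    // disabled, then puts the previous static value back.
    static SharedSiteConf createConf() {
        String saved = SharedSiteConf.getSiteLocation();
        try {
            SharedSiteConf.setSiteLocation(null);
            return new SharedSiteConf();
        } finally {
            // Restore even if construction throws.
            SharedSiteConf.setSiteLocation(saved);
        }
    }
}

class RestoreDemo {
    public static void main(String[] args) {
        SharedSiteConf catalogConf = RestoringCatalog.createConf();
        System.out.println(catalogConf.isSiteLoaded());          // false: lookup was off for the catalog
        System.out.println(new SharedSiteConf().isSiteLoaded()); // true: later instances are unaffected
    }
}
```

This pattern is not thread-safe on its own: another thread that creates a configuration while the field is temporarily null would still miss the site file, so a real fix may need synchronization around the window.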

 

 

  was:
Recently I was using Hudi to sync to HMS. When HiveCatalog is used, it sets 
HiveConf.hiveSiteLocation to null, so Hudi can no longer pick up the 
hive-site.xml provided on the classpath. The root cause is that HiveCatalog 
changes this static state of HiveConf.

HiveCatalog can load hive-site.xml itself without this variable, but code that 
runs afterwards still assumes that HiveConf searches for hive-site.xml on the 
classpath.

In other words, once HiveCatalog turns this lookup off, no HiveConf instance 
created afterwards (for example, by Hudi) will load the hive-site.xml that the 
user put on the classpath or that YARN provides. The only ways around it are to 
call addResource explicitly, or to have Hive find the file inside the user's 
uber jar, which takes extra effort.

 

Example
{code:java}
// First, HiveConf's static initializer searches for hive-site.xml, and only once:

static {
  hiveSiteURL = findConfigFile(classLoader, "hive-site.xml", true);
}{code}
 
{code:java}
String name            = "myhive";
String defaultDatabase = "mydatabase";
String hiveConfDir     = "/opt/hive-conf";

HiveCatalog hive = new HiveCatalog(name, defaultDatabase, hiveConfDir);
tableEnv.registerCatalog("myhive", hive);

// set the HiveCatalog as the current catalog of the session
tableEnv.useCatalog("myhive"); {code}
After running the code above:
{code:java}
// Another framework that uses Hive:

HiveConf hiveConf = new HiveConf(hadoopConf, HiveConf.class); {code}
The resulting hiveConf does NOT load hive-site.xml from the classpath, which 
causes configuration loading to fail.

 

 


> HiveCatalog should set HiveConf.hiveSiteLocation back
> -----------------------------------------------------
>
>                 Key: FLINK-36594
>                 URL: https://issues.apache.org/jira/browse/FLINK-36594
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Hive
>    Affects Versions: 1.20.1
>            Reporter: slankka
>            Priority: Minor
>              Labels: pull-request-available



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
