Yuchen Fan created IMPALA-11910:
-----------------------------------
Summary: Java UDF with resource file may cause catalogd start
failed
Key: IMPALA-11910
URL: https://issues.apache.org/jira/browse/IMPALA-11910
Project: IMPALA
Issue Type: Bug
Components: Catalog, Frontend
Affects Versions: Impala 4.1.0
Reporter: Yuchen Fan
Attachments: image-2023-02-09-19-15-10-624.png
IMPALA-10997 refactored the UDF implementation to support GenericUDF.
Starting with Impala 4.1.0, catalogd can fail to start when it loads irregular
UDFs that read files or load libraries during UDF class initialization. For
example, a user's UDF may be declared like this:
{code:java}
public class TestUdf extends UDF {
  TestUdf() {}
  static {
    // 'resource.db' is attached in UDF jar
    InputStream is =
        XXX.class.getClassLoader().getResourceAsStream("resource.db");
  }
  ...
  // some UDF implementation
  ...
}
{code}
When catalogd loads such a UDF, it crashes and reports an error:
!image-2023-02-09-19-15-10-624.png!
We found that the UDF loading code deletes the local UDF jar file too early.
The related code is shown below:
{code:java}
public static HiveUdfLoader createWithLocalPath(String localLibPath, Function fn) {
  ...
  // copy jar to local
  FileSystemUtil.copyToLocal(new Path(uri), localJarPath);
  ...
  // construct HiveUdfLoader
  ...
  // delete local jar file
  FileSystemUtil.deleteIfExists(localJarPath);
}

// Now local jar file is deleted
private UDF instantiateUDFInstance(Class<?> udfClass) {
  ...
  try {
    Constructor<?> ctor = udfClass.getConstructor();
    // the static block of the UDF class that opens a jar resource fails here
    return (UDF) ctor.newInstance();
  } catch (...Exception e) {
    ...
  }
}
{code}
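Why the failure shows up at ctor.newInstance() rather than at class loading can be illustrated with a small standalone sketch. It assumes the UDF class was loaded without being initialized, which the report does not state explicitly. LazyInitDemo and FakeUdf are hypothetical names and are not part of Impala.

```java
import java.util.ArrayList;
import java.util.List;

final class LazyInitDemo {
  // Records the order in which loading, static init and construction happen.
  static final List<String> EVENTS = new ArrayList<>();

  // Hypothetical stand-in for a UDF whose static block would open a jar resource.
  static class FakeUdf {
    static { EVENTS.add("static-init"); }
    public FakeUdf() { EVENTS.add("constructor"); }
  }

  static List<String> run() {
    try {
      // Load the class without initializing it (initialize = false).
      Class<?> cls = Class.forName(
          FakeUdf.class.getName(), false, LazyInitDemo.class.getClassLoader());
      EVENTS.add("loaded");
      // The first instantiation runs the static block, then the constructor.
      cls.getDeclaredConstructor().newInstance();
      return EVENTS;
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

Calling LazyInitDemo.run() once returns [loaded, static-init, constructor]. Any resource the static block needs must therefore still be available at instantiation time.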
To fix this, we should delay deleting the local jar until the UDF class has
been instantiated.
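The delayed cleanup could look roughly like the sketch below. It is not the actual Impala patch: it uses plain java.nio.file instead of FileSystemUtil, and DelayedJarCleanup, withLocalCopy and the instantiate callback are hypothetical names chosen for illustration. The point is only that the delete happens in a finally block after instantiation has finished.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.function.Function;

final class DelayedJarCleanup {
  // Copies the jar to a local temp file, runs the instantiation step while the
  // local copy still exists, and deletes the copy only afterwards.
  static <T> T withLocalCopy(Path sourceJar, Function<Path, T> instantiate)
      throws IOException {
    Path localJar = Files.createTempFile("udf-", ".jar");
    try {
      Files.copy(sourceJar, localJar, StandardCopyOption.REPLACE_EXISTING);
      // Static initializers that read from the jar can still find it here.
      return instantiate.apply(localJar);
    } finally {
      // Cleanup is deferred until instantiation has succeeded or failed.
      Files.deleteIfExists(localJar);
    }
  }
}
```

A caller would pass the class loading and constructor call as the instantiate callback, so the local jar outlives the static initializer but is still removed on every code path.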
--
This message was sent by Atlassian Jira
(v8.20.10#820010)