[
https://issues.apache.org/jira/browse/IMPALA-11910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yuchen Fan updated IMPALA-11910:
--------------------------------
Description:
In IMPALA-10997, the UDF implementation was refactored to support GenericUDF.
Since Impala 4.1.0, catalogd can fail to start because of irregular UDFs that
load files or libraries during UDF class initialization. For example, a user's
UDF may be declared like this:
{code:java}
public class TestUdf extends UDF {
  TestUdf() {}

  static {
    // 'resource.db' is bundled in the UDF jar
    InputStream is =
        XXX.class.getClassLoader().getResourceAsStream("resource.db");
  }
  ...
  // some UDF implementation
  ...
}
{code}
When loading these UDFs, catalogd crashes and reports an error.
!image-2023-02-09-19-15-10-624.png!
We found that the UDF loading code deletes the UDF jar file too early. The
related code is shown below:
{code:java}
public static HiveUdfLoader createWithLocalPath(String localLibPath, Function fn) {
  ...
  // Copy the jar to a local path
  FileSystemUtil.copyToLocal(new Path(uri), localJarPath);
  ...
  // Construct the HiveUdfLoader
  ...
  // Delete the local jar file
  FileSystemUtil.deleteIfExists(localJarPath);
}

// At this point the local jar file has already been deleted
private UDF instantiateUDFInstance(Class<?> udfClass) {
  ...
  try {
    Constructor<?> ctor = udfClass.getConstructor();
    // A static block in the UDF class that opens a resource from the jar fails here
    return (UDF) ctor.newInstance();
  } catch (...Exception e) {
    ...
  }
}
{code}
By the time 'ctor.newInstance()' executes, the jar file has already been
deleted, so UDF initialization fails. To solve this, we should delay deleting
the jar until the UDF class has been instantiated.
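
The proposed ordering (copy, instantiate, then delete) can be illustrated with a small standalone sketch. This is not the Impala patch: the class name DelayedJarDeleteSketch, the method loadWithLocalCopy, and the use of java.nio in place of FileSystemUtil are all assumptions made for illustration. The only point it demonstrates is that the local copy is removed in a finally block after the instantiation step has run, rather than before it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.function.Function;

// Hypothetical sketch; names here are illustrative and not part of Impala.
class DelayedJarDeleteSketch {

  /**
   * Copies sourceJar to a temporary local path, runs the instantiation step
   * while the local copy still exists, and deletes the copy only afterwards.
   */
  static <T> T loadWithLocalCopy(Path sourceJar, Function<Path, T> instantiate)
      throws IOException {
    Path localJarPath = Files.createTempFile("udf-", ".jar");
    try {
      Files.copy(sourceJar, localJarPath, StandardCopyOption.REPLACE_EXISTING);
      // Static initializers that read from the jar would run inside this call.
      return instantiate.apply(localJarPath);
    } finally {
      // Deletion is deferred until instantiation has finished or failed.
      Files.deleteIfExists(localJarPath);
    }
  }

  public static void main(String[] args) throws IOException {
    // Stand-in for the UDF jar; a real jar would be fetched from the UDF's URI.
    Path sourceJar = Files.createTempFile("source-", ".jar");
    Files.write(sourceJar, "resource.db contents".getBytes(StandardCharsets.UTF_8));

    Path[] seen = new Path[1];
    boolean existedDuringInit = loadWithLocalCopy(sourceJar, p -> {
      seen[0] = p;
      return Files.exists(p);
    });

    System.out.println("existed during init: " + existedDuringInit);
    System.out.println("exists after load: " + Files.exists(seen[0]));
    Files.deleteIfExists(sourceJar);
  }
}
```

Using try/finally means the local jar is cleaned up even if instantiation throws, so the fix would not leak temporary files on failure. Whether the actual change should delete the file in the loader or hand the path to the caller is a design choice this sketch does not settle.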
> Java UDF with resource file may cause catalogd start failed
> -----------------------------------------------------------
>
> Key: IMPALA-11910
> URL: https://issues.apache.org/jira/browse/IMPALA-11910
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog, Frontend
> Affects Versions: Impala 4.1.0
> Reporter: Yuchen Fan
> Priority: Major
> Attachments: image-2023-02-09-19-15-10-624.png
>
>