This is an automated email from the ASF dual-hosted git repository.
achennaka pushed a commit to branch branch-1.18.x
in repository https://gitbox.apache.org/repos/asf/kudu.git
The following commit(s) were added to refs/heads/branch-1.18.x by this push:
new 6291ebc54 [Java] Resolve ClassNotFoundException: kudu.DefaultSource
6291ebc54 is described below
commit 6291ebc543abfd39ce6f53437b1f5adaafe17fe2
Author: Abhishek Chennaka <[email protected]>
AuthorDate: Mon Sep 15 12:02:22 2025 -0700
[Java] Resolve ClassNotFoundException: kudu.DefaultSource
In the kudu-spark-tools module, the minimize() call in the shadowJar
configuration removes a required class,
org/apache/kudu/spark/kudu/DefaultSource.class.
In addition, both kudu-spark-tools and kudu-spark exclude
'META-INF/services/**'. Without
META-INF/services/org.apache.spark.sql.sources.DataSourceRegister
Spark cannot discover DefaultSource, even when the class itself is present.
This commit removes the minimize() call and the 'META-INF/services/**'
exclusions. The fix has been tested manually.
Without the fix, the failure looks like this:
ERROR yarn.Client: [main]: Application diagnostics message: User class
threw exception: org.apache.spark.SparkClassNotFoundException:
[DATA_SOURCE_NOT_FOUND] Failed to find the data source: kudu. Please find
packages at `https://spark.apache.org/third-party-projects.html`.
    at org.apache.spark.sql.errors.QueryExecutionErrors$.dataSourceNotFoundError(QueryExecutionErrors.scala:725)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:649)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:699)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:208)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:172)
    at org.apache.kudu.spark.tools.Verifier$.run(IntegrationTestBigLinkedList.scala:371)
    at org.apache.kudu.spark.tools.Verifier$.main(IntegrationTestBigLinkedList.scala:451)
    at org.apache.kudu.spark.tools.IntegrationTestBigLinkedList$.main(IntegrationTestBigLinkedList.scala:107)
    at org.apache.kudu.spark.tools.IntegrationTestBigLinkedList.main(IntegrationTestBigLinkedList.scala)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:748)
Caused by: java.lang.ClassNotFoundException: kudu.DefaultSource
    at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:445)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:592)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)
    at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$5(DataSource.scala:635)
    at scala.util.Try$.apply(Try.scala:213)
    at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$4(DataSource.scala:635)
    at scala.util.Failure.orElse(Try.scala:224)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:635)
    ... 12 more
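
The two conditions named in the commit message (the DefaultSource class and the DataSourceRegister service file both being present in the shaded JAR) can be checked directly against a built artifact. The following is a minimal sketch and not part of the Kudu build: the class name ShadedJarCheck and its methods are hypothetical, and it only tests that the two entries exist, not that the service file lists the right class.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Kudu: inspects a shaded JAR for the two
// entries that the commit message says Spark needs to resolve "kudu".
public class ShadedJarCheck {
    // Entry names taken verbatim from the commit message.
    static final String SERVICE_ENTRY =
        "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister";
    static final String CLASS_ENTRY =
        "org/apache/kudu/spark/kudu/DefaultSource.class";

    // Returns true if the JAR contains an entry with exactly this name.
    static boolean hasEntry(Path jar, String name) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.getEntry(name) != null;
        }
    }

    // Returns true only if both the service file and the class are present.
    // This checks presence only; it does not read the service file contents.
    static boolean hasBothEntries(Path jar) throws IOException {
        return hasEntry(jar, SERVICE_ENTRY) && hasEntry(jar, CLASS_ENTRY);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ShadedJarCheck <path-to-jar>");
            System.exit(2);
        }
        Path jar = Path.of(args[0]);
        System.out.println(SERVICE_ENTRY + ": " + hasEntry(jar, SERVICE_ENTRY));
        System.out.println(CLASS_ENTRY + ": " + hasEntry(jar, CLASS_ENTRY));
        // Non-zero exit code if either entry is missing.
        System.exit(hasBothEntries(jar) ? 0 : 1);
    }
}
```

Run against the kudu-spark-tools shaded JAR, a `false` on either line would correspond to one of the two problems this commit describes: a class dropped by minimize(), or a service file dropped by the 'META-INF/services/**' exclusion.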
Change-Id: Ic0b8f07ea46759dc92d5ed2105a5480a0cf56464
Reviewed-on: http://gerrit.cloudera.org:8080/23605
Tested-by: Abhishek Chennaka <[email protected]>
Reviewed-by: Marton Greber <[email protected]>
Reviewed-by: Alexey Serbin <[email protected]>
(cherry picked from commit 27d5373ce4df12ea393de301a17dcef664e62acd)
Reviewed-on: http://gerrit.cloudera.org:8080/23682
---
java/kudu-spark-tools/build.gradle | 2 --
java/kudu-spark/build.gradle | 3 +--
2 files changed, 1 insertion(+), 4 deletions(-)
diff --git a/java/kudu-spark-tools/build.gradle b/java/kudu-spark-tools/build.gradle
index 325b8d62f..5b593854f 100644
--- a/java/kudu-spark-tools/build.gradle
+++ b/java/kudu-spark-tools/build.gradle
@@ -78,7 +78,6 @@ shadowJar {
exclude '**/*.dylib'
exclude '**/*.html'
exclude '**/*.md'
- exclude 'META-INF/services/**'
exclude 'codegen/**'
exclude 'javax/**'
exclude 'org/threeten/**'
@@ -86,7 +85,6 @@ shadowJar {
exclude 'org/apache/orc/**'
exclude 'org/jetbrains/**'
- minimize()
}
// Adjust the artifact name to match the maven build.
diff --git a/java/kudu-spark/build.gradle b/java/kudu-spark/build.gradle
index b190e185c..f73cbecd7 100644
--- a/java/kudu-spark/build.gradle
+++ b/java/kudu-spark/build.gradle
@@ -72,7 +72,6 @@ shadowJar {
exclude '**/*.dylib'
exclude '**/*.html'
exclude '**/*.md'
- exclude 'META-INF/services/**'
exclude 'codegen/**'
exclude 'org/threeten/**'
exclude 'org/apache/arrow/**'
@@ -90,4 +89,4 @@ archivesBaseName = "kudu-spark${versions.sparkBase}_${versions.scalaBase}"
tasks.withType(com.github.spotbugs.snom.SpotBugsTask) {
// This class causes SpotBugs runtime errors, so we completely ignore it from analysis.
classes = classes.filter { !it.path.contains("SparkSQLTest") }
-}
\ No newline at end of file
+}