chenhu commented on code in PR #1876:
URL:
https://github.com/apache/incubator-seatunnel/pull/1876#discussion_r874320466
##########
seatunnel-connectors/seatunnel-connectors-spark/seatunnel-connector-spark-tidb/pom.xml:
##########
@@ -60,6 +60,10 @@
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
</exclusion>
+ <exclusion>
+ <groupId>org.apache.spark</groupId>
+ <artifactId>spark-sql_${scala.binary.version}</artifactId>
+ </exclusion>
Review Comment:
> To my knowledge, whether or not you exclude `spark-sql_` here will not
> affect the version of `spark-sql_` in tidb.

When the spark-sql version pulled in indirectly through tispark-assembly
differs from the spark-sql version in the root pom, reading parquet data
in some plugins (e.g. the hudi source) throws the exception below:
----------------
Caused by: org.apache.spark.sql.AnalysisException: Multiple sources found
for parquet
(org.apache.spark.sql.execution.datasources.v2.parquet.ParquetDataSourceV2,
org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat), please
specify the fully qualified class name.
-----------------
This issue is caused by tispark-assembly's spark-sql providing
ParquetDataSourceV2 while the root pom's spark-sql provides ParquetFileFormat.
When reading a parquet data source, Spark cannot decide which Parquet
implementation to use.
This is what happened in my case.
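To illustrate why two spark-sql versions on the classpath lead to this error, here is a minimal sketch in Python. It is a simplified, hypothetical model of short-name resolution, not Spark's actual lookup code. The two class names are taken from the exception above; `resolve_source`, `AmbiguousSourceError`, and the `registered` mapping are names made up for this example.

```python
# Simplified, hypothetical model of short-name resolution; not Spark's actual code.
from typing import Dict

# Class names copied from the exception message in this comment.
PARQUET_V2 = "org.apache.spark.sql.execution.datasources.v2.parquet.ParquetDataSourceV2"
PARQUET_V1 = "org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat"


class AmbiguousSourceError(Exception):
    """Stands in for org.apache.spark.sql.AnalysisException in this sketch."""


def resolve_source(name: str, registered: Dict[str, str]) -> str:
    """Resolve `name` to exactly one registered class.

    `registered` maps fully qualified class name -> short name, mimicking
    what ends up on the classpath when several jars register sources.
    """
    if name in registered:
        # A fully qualified class name is unambiguous.
        return name
    matches = sorted(cls for cls, alias in registered.items() if alias == name)
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise LookupError(f"No source found for {name}")
    raise AmbiguousSourceError(
        f"Multiple sources found for {name} ({', '.join(matches)}), "
        "please specify the fully qualified class name."
    )


# Only the root pom's spark-sql is on the classpath: one candidate.
root_only = {PARQUET_V1: "parquet"}
# Root spark-sql plus the one pulled in via tispark-assembly: two candidates.
mixed = {PARQUET_V1: "parquet", PARQUET_V2: "parquet"}
```

In this model, `resolve_source("parquet", root_only)` succeeds, `resolve_source("parquet", mixed)` raises, and passing the fully qualified class name works in both cases. Excluding the mismatched spark-sql corresponds to going from `mixed` back to `root_only`.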