jackxu2011 commented on code in PR #4110:
URL: https://github.com/apache/linkis/pull/4110#discussion_r1092643449


##########
linkis-engineconn-plugins/spark/pom.xml:
##########
@@ -202,13 +207,103 @@
       <artifactId>linkis-rpc</artifactId>
       <version>${project.version}</version>
     </dependency>
-
+    <dependency>
+      <groupId>org.apache.linkis</groupId>
+      <artifactId>linkis-hadoop-hdfs-client-shade</artifactId>
+      <version>${project.version}</version>

Review Comment:
   > I have also tried to use relocation, but found Spark has forced dependencies on some hadoop-common classes (shaded class types will be rejected). @jackxu2011 do you have better ideas?
   > 
   > ```scala
   >   def hadoopFile[K, V](
   >       path: scala.Predef.String,
   >       inputFormatClass: scala.Predef.Class[_ <: org.apache.hadoop.mapred.InputFormat[K, V]],
   >       keyClass: scala.Predef.Class[K],
   >       valueClass: scala.Predef.Class[V],
   >       minPartitions: scala.Int = { /* compiled code */ }): org.apache.spark.rdd.RDD[scala.Tuple2[K, V]] = { /* compiled code */ }
   > ```
   > 
   > ```scala
   >   def transfer(sc: SparkContext, path: String, encoding: String): RDD[String] = {
   >     sc.hadoopFile(path, classOf[TextInputFormat], classOf[LongWritable], classOf[Text], 1)
   >       .map(p => new String(p._2.getBytes, 0, p._2.getLength, encoding))
   >   }
   > ```
   
   If you rebuild Spark 2.4 with Hadoop 3, then in my opinion the shade is useless, because Spark 2.4 can run directly against Hadoop 3.
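
   For context, the relocation attempt described in the quoted comment would typically be a maven-shade-plugin configuration along these lines (a minimal sketch; the `shadedPattern` prefix is illustrative and not taken from this PR):

   ```xml
   <!-- Hypothetical relocation sketch; the shaded package prefix is illustrative only -->
   <plugin>
     <groupId>org.apache.maven.plugins</groupId>
     <artifactId>maven-shade-plugin</artifactId>
     <configuration>
       <relocations>
         <relocation>
           <pattern>org.apache.hadoop</pattern>
           <shadedPattern>shaded.org.apache.hadoop</shadedPattern>
         </relocation>
       </relocations>
     </configuration>
   </plugin>
   ```

   After such a relocation, `TextInputFormat` becomes `shaded.org.apache.hadoop.mapred.TextInputFormat`, which no longer satisfies the `Class[_ <: org.apache.hadoop.mapred.InputFormat[K, V]]` bound in Spark's `hadoopFile` signature quoted above, which is why the shaded class type is rejected.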



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

