yangBottle commented on a change in pull request #29178:
URL: https://github.com/apache/spark/pull/29178#discussion_r553872869
##########
File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala
##########
@@ -299,7 +299,9 @@ class HadoopTableReader(
*/
   private def createHadoopRDD(localTableDesc: TableDesc, inputPathStr: String): RDD[Writable] = {
     val inputFormatClazz = localTableDesc.getInputFileFormatClass
-    if (classOf[newInputClass[_, _]].isAssignableFrom(inputFormatClazz)) {
+    if (!inputFormatClazz.getName.
+      equalsIgnoreCase("org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat")
Review comment:
It looks like the new MapReduce API (`org.apache.hadoop.mapreduce`) is used
when creating a `NewHadoopRDD`, but the `getSplits` method of
`HiveHBaseTableInputFormat` is implemented against the old
`org.apache.hadoop.mapred` API, so some initialization steps (Table,
Connection) never run and the resulting `table` variable is null. When
`createOldHadoopRDD` is used instead, the `org.apache.hadoop.mapred` API is
used and those initialization steps (Table, Connection) do run, so it works
correctly.
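
For readers following along, here is a minimal sketch (not the exact diff
above) of how the dispatch in `createHadoopRDD` could special-case
`HiveHBaseTableInputFormat` by class name so that the old
`org.apache.hadoop.mapred` path runs its Table/Connection setup. It assumes
the surrounding `HadoopTableReader` class and its existing
`createNewHadoopRDD`/`createOldHadoopRDD` helpers:

```scala
// Illustrative sketch only: route HiveHBaseTableInputFormat to the old
// mapred-based code path even though it is assignable to the new
// org.apache.hadoop.mapreduce.InputFormat.
private def createHadoopRDD(localTableDesc: TableDesc, inputPathStr: String): RDD[Writable] = {
  val inputFormatClazz = localTableDesc.getInputFileFormatClass
  // HiveHBaseTableInputFormat only wires up getSplits through the old
  // org.apache.hadoop.mapred API, so its Table/Connection initialization is
  // skipped on the NewHadoopRDD path; fall back to createOldHadoopRDD for it.
  val isHiveHBaseInput = inputFormatClazz.getName
    .equalsIgnoreCase("org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat")
  if (!isHiveHBaseInput &&
      classOf[org.apache.hadoop.mapreduce.InputFormat[_, _]].isAssignableFrom(inputFormatClazz)) {
    createNewHadoopRDD(localTableDesc, inputPathStr)
  } else {
    createOldHadoopRDD(localTableDesc, inputPathStr)
  }
}
```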
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]