yangBottle commented on a change in pull request #29178:
URL: https://github.com/apache/spark/pull/29178#discussion_r553872869
##########
File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala
##########
@@ -299,7 +299,9 @@ class HadoopTableReader(
*/
   private def createHadoopRDD(localTableDesc: TableDesc, inputPathStr: String): RDD[Writable] = {
     val inputFormatClazz = localTableDesc.getInputFileFormatClass
-    if (classOf[newInputClass[_, _]].isAssignableFrom(inputFormatClazz)) {
+    if (!inputFormatClazz.getName.
+      equalsIgnoreCase("org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat")
Review comment:
It looks like the new MapReduce API (`org.apache.hadoop.mapreduce`) is used
when creating a `NewHadoopRDD`, but the `getSplits` method of
`HiveHBaseTableInputFormat` is implemented against the old
`org.apache.hadoop.mapred` API, so some initialization steps (Table,
Connection) never run and the resulting `table` variable is null. When
`createOldHadoopRDD` is used instead, the `org.apache.hadoop.mapred` API is
used and those initialization steps (Table, Connection) do run, so it works
correctly.
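
For readers following along, here is a minimal sketch (not the exact diff
above) of how the dispatch in `createHadoopRDD` could special-case
`HiveHBaseTableInputFormat` by class name so that the old
`org.apache.hadoop.mapred` path runs its Table/Connection setup. It assumes
the surrounding `HadoopTableReader` class and its existing
`createNewHadoopRDD`/`createOldHadoopRDD` helpers:

```scala
// Illustrative sketch only: route HiveHBaseTableInputFormat to the old
// mapred-based code path even though it is assignable to the new
// org.apache.hadoop.mapreduce.InputFormat.
private def createHadoopRDD(localTableDesc: TableDesc, inputPathStr: String): RDD[Writable] = {
  val inputFormatClazz = localTableDesc.getInputFileFormatClass
  // HiveHBaseTableInputFormat only wires up getSplits through the old
  // org.apache.hadoop.mapred API, so its Table/Connection initialization is
  // skipped on the NewHadoopRDD path; fall back to createOldHadoopRDD for it.
  val isHiveHBaseInput = inputFormatClazz.getName
    .equalsIgnoreCase("org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat")
  if (!isHiveHBaseInput &&
      classOf[org.apache.hadoop.mapreduce.InputFormat[_, _]].isAssignableFrom(inputFormatClazz)) {
    createNewHadoopRDD(localTableDesc, inputPathStr)
  } else {
    createOldHadoopRDD(localTableDesc, inputPathStr)
  }
}
```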
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]