sujith71955 commented on a change in pull request #22758: [SPARK-25332][SQL] 
select broadcast join instead of sortMergeJoin for the small size table even 
query fired via new session/context 
URL: https://github.com/apache/spark/pull/22758#discussion_r266304441
 
 

 ##########
 File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala
 ##########
 @@ -193,6 +193,16 @@ private[hive] class HiveMetastoreCatalog(sparkSession: 
SparkSession) extends Log
           None)
         val logicalRelation = cached.getOrElse {
           val updatedTable = inferIfNeeded(relation, options, fileFormat)
+          // Initialize the catalogTable stats if they are not defined. An initial value has to be
+          // defined so that the Hive statistics are updated after each insert command.
+          val withStats = {
+            if (updatedTable.stats == None) {
 
 Review comment:
   > @wangyum, so it is basically a subset of #22721? It's funny that Hive tables 
should set the initial stats alone here, which is supposed to be set somewhere 
else.
   This is a fairly old PR :). I will go through the problem once again and share 
more concise details. I remember struggling a lot with handling this issue :) 
Please let me know if this way of handling it is wrong.
