sunchao commented on pull request #33382:
URL: https://github.com/apache/spark/pull/33382#issuecomment-883134666


   I feel that adding one more config on top of the existing `tryDirectSql` 
may make it too complex. What if we introduce a new config and use that to 
decide whether Spark should fall back to calling `getAllPartitionsMethod`? We 
could default it to true, with a helpful message telling users that they can 
switch it off to fail the query instead if they really want to.
   
   This should only affect queries in a few scenarios where they used to fail 
but can now succeed, which IMO is a better outcome. 
   
   For instance:
   ```scala
           val tryDirectSqlConfVar = HiveConf.ConfVars.METASTORE_TRY_DIRECT_SQL
           val shouldFallback = SQLConf.get.metastorePartitionPruningFallbackOnException
           try {
             getPartitionsByFilterMethod.invoke(hive, table, filter)
               .asInstanceOf[JArrayList[Partition]]
           } catch {
             // Fall back only when Hive raised a MetaException and the new config allows it.
             case ex: InvocationTargetException
                 if ex.getCause.isInstanceOf[MetaException] && shouldFallback =>
               logWarning("Caught Hive MetaException attempting to get partition metadata by " +
                 "filter from Hive. Falling back to fetching all partition metadata, which will " +
                 "degrade performance. Modifying your Hive metastore configuration to set " +
                 s"${tryDirectSqlConfVar.varname} to true (if it is not true already) may resolve " +
                 "this problem. Otherwise, you can set " +
                 s"${SQLConf.HIVE_METASTORE_PARTITION_PRUNING_FALLBACK_ON_EXCEPTION.key} " +
                 "to false and let the query fail instead.", ex)
               getAllPartitionsMethod.invoke(hive, table).asInstanceOf[JSet[Partition]]
           }
   ```
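
   To show the control flow in isolation, here is a minimal, dependency-free 
sketch of the same pattern. Everything in it is an assumption made for 
illustration: `MetaException` is a local stand-in for the Hive class, and 
`PartitionFallbackSketch` and `partitionsWithFallback` are hypothetical names, 
not Spark or Hive APIs. Partitions are modelled as plain strings.

   ```scala
   import java.lang.reflect.InvocationTargetException

   // Local stand-in for Hive's MetaException (assumption: keeps the sketch Hive-free).
   class MetaException(msg: String) extends Exception(msg)

   object PartitionFallbackSketch {
     // Hypothetical helper: try the filtered lookup first. If it fails with a
     // wrapped MetaException and fallback is enabled, fetch all partitions instead.
     // Any other failure, or a disabled fallback, lets the exception propagate.
     def partitionsWithFallback(shouldFallback: Boolean)(
         byFilter: => Seq[String])(all: => Seq[String]): Seq[String] =
       try byFilter
       catch {
         case ex: InvocationTargetException
             if ex.getCause.isInstanceOf[MetaException] && shouldFallback =>
           Console.err.println(
             s"Filtered lookup failed (${ex.getCause.getMessage}); fetching all partitions.")
           all
       }

     def main(args: Array[String]): Unit = {
       val all = Seq("dt=2021-07-01", "dt=2021-07-02")
       // Simulates a reflective call that fails inside the metastore client.
       def failing: Seq[String] =
         throw new InvocationTargetException(new MetaException("direct SQL failed"))

       // Fallback enabled: the query succeeds with the full partition list.
       println(partitionsWithFallback(shouldFallback = true)(failing)(all))

       // Fallback disabled: the original exception reaches the caller.
       try partitionsWithFallback(shouldFallback = false)(failing)(all)
       catch {
         case e: InvocationTargetException =>
           println(s"query failed: ${e.getCause.getMessage}")
       }
     }
   }
   ```

   Because the flag is part of the pattern guard, switching it off means the 
`case` simply does not match, so the query fails with the original exception 
rather than a new one.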




