Github user kmanamcheri commented on the issue:
https://github.com/apache/spark/pull/22614
> Based on my understanding, the FB team's solution is to retry the following call multiple times:
>
> ```
> getPartitionsByFilterMethod.invoke(hive, table, filter).asInstanceOf[JArrayList[Partition]]
> ```
@gatorsmile Hmm, my understanding was different. I thought they were retrying the `fetchAllPartitions` method. Maybe @tejasapatil can clarify here?
> This really depends on what the actual errors are that fail `getPartitionsByFilterMethod`. When many concurrent users share the same metastore, `exponential backoff with retries` is very reasonable, since most errors are likely caused by timeouts or similar reasons.
Doesn't this apply to every other HMS API as well? If so, shouldn't we build a complete solution around this in HiveShim that does `exponential backoff with retries` on every single HMS call?
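To make the suggestion concrete, here is a minimal sketch of what such a generic wrapper could look like. This is an illustration only: `retryWithBackoff` and its parameters are hypothetical names, not an existing Spark or HiveShim API.

```scala
import scala.util.control.NonFatal

// Hedged sketch: a generic retry-with-exponential-backoff wrapper that a
// shim layer could apply uniformly to metastore calls. Hypothetical names.
def retryWithBackoff[T](maxRetries: Int = 3, initialDelayMs: Long = 100)(call: => T): T = {
  var attempt = 0
  var delayMs = initialDelayMs
  while (true) {
    try {
      return call  // success: return the HMS result
    } catch {
      case NonFatal(e) if attempt < maxRetries =>
        attempt += 1          // transient failure: back off and retry
        Thread.sleep(delayMs)
        delayMs *= 2          // double the delay each attempt
      // after maxRetries, the exception propagates (fail fast)
    }
  }
  sys.error("unreachable")  // while (true) never falls through
}
```

Every HMS call in the shim could then be routed through one such helper, so the retry policy lives in a single place instead of being duplicated per call site.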
> If it still fails, I would suggest failing fast or deciding based on the conf value of `spark.sql.hive.metastorePartitionPruning.fallback.enabled`

OK, I agree.
I think we need clarification from @tejasapatil on which call they retry.
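For reference, the fallback behavior being discussed could be sketched roughly as follows. Only the conf key comes from the thread; every other name here is made up for illustration.

```scala
import scala.util.control.NonFatal

// Hedged sketch of the proposed fallback: try pruned partition fetching
// first; on failure, either fetch all partitions (when the conf is enabled)
// or rethrow to fail fast. Hypothetical names throughout.
def getPartitionsWithFallback[P](fallbackEnabled: Boolean)(
    prunedFetch: => Seq[P], allFetch: => Seq[P]): Seq[P] = {
  try {
    prunedFetch  // e.g. the getPartitionsByFilter call via reflection
  } catch {
    case NonFatal(e) =>
      if (fallbackEnabled) allFetch  // fetch everything, filter client-side
      else throw e                   // fail fast when fallback is disabled
  }
}
```

The trade-off is that the fallback path fetches all partitions, which can be very expensive on large tables, so gating it behind a conf (as suggested above) keeps the failure mode explicit.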