Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/9895#issuecomment-159239968
This has been broken for quite a while, ever since we introduced the
isolated Hive client in 1.4. My theory about why people seldom noticed
it is that:
1. Commands executed by the execution Hive client are mostly transient;
they don't touch data stored in the real metastore, so logically it doesn't
matter which Hive client executes them.
1. Even if the remote Hive metastore runs a version lower than that of
Spark SQL's execution Hive client, things still work as long as the Thrift
protocols used by the involved commands are backward compatible.
1. Although we've already upgraded to Hive 1.2.1, we haven't yet implemented
many of the advanced features that only exist in newer Hive versions, so most
commands issued by the execution Hive client are indeed backward compatible
with lower versions.
Unfortunately, the only reliable way I found to verify this change is to
inspect the internal `HiveMetaStoreClient` instance of the execution Hive
client via remote debugging, because we need a remote Hive metastore here. For
example, we can start the Thrift server with remote debugging enabled:
```sh
$SPARK_HOME/sbin/start-thriftserver.sh \
  --driver-java-options \
  "-agentlib:jdwp=transport=dt_socket,server=y,address=localhost:5005,suspend=y"
```
Then attach a debugger to the endpoint `localhost:5005`. (The remote
debugging facilities in IntelliJ IDEA are quite handy here.)
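If an IDE isn't available, the JDK's command-line debugger `jdb` can attach to the same JDWP endpoint. A minimal sketch (the breakpoint class below assumes Hive's usual metastore package layout; adjust it to match the actual Hive version on the classpath):

```sh
# Attach jdb to the JDWP socket opened by the -agentlib:jdwp option above.
jdb -attach localhost:5005

# At the jdb prompt, break where the metastore client is constructed,
# then resume the suspended JVM and inspect the instance when it hits:
#   stop in org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>
#   run
```

This gives the same visibility into which `HiveMetaStoreClient` the execution Hive client actually instantiates, without an IDE.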
Also, please refer to the JIRA ticket for more information about how to
reproduce this issue locally.