[
https://issues.apache.org/jira/browse/AIRFLOW-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aizhamal Nurmamat kyzy resolved AIRFLOW-2514.
---------------------------------------------
Resolution: Fixed
Resolving reopened issues for component refactor.
> HiveServer2Hook doesn't work on Python2 due to thrift version conflict
> ----------------------------------------------------------------------
>
> Key: AIRFLOW-2514
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2514
> Project: Apache Airflow
> Issue Type: Bug
> Components: hive_hooks, hooks
> Reporter: Kengo Seki
> Priority: Major
> Labels: hive, hive-hooks
>
> impyla, on which HiveServer2Hook depends, doesn't work with Thrift 0.10.0+ on
> Python 2. Example:
> {code}
> $ pip show thrift
> Name: thrift
> Version: 0.11.0
> (snip)
> $ ipython
> (snip)
> In [1]: from airflow.hooks.hive_hooks import HiveServer2Hook
> In [2]: HiveServer2Hook().get_conn().cursor()
> [2018-05-23 10:21:02,117] {base_hook.py:83} INFO - Using connection to: localhost
> ---------------------------------------------------------------------------
> TypeError Traceback (most recent call last)
> <ipython-input-2-f76a25f124cf> in <module>()
> ----> 1 HiveServer2Hook().get_conn().cursor()
> (snip)
> /home/sekikn/.virtualenvs/a/local/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.pyc in write(self, oprot)
> 1067 def write(self, oprot):
> 1068 if oprot.__class__ == TBinaryProtocol.TBinaryProtocolAccelerated and self.thrift_spec is not None and fastbinary is not None:
> -> 1069 oprot.trans.write(fastbinary.encode_binary(self, (self.__class__, self.thrift_spec)))
> 1070 return
> 1071 oprot.writeStructBegin('OpenSession_args')
> TypeError: expecting list of size 2 for struct args
> {code}
> [This problem has already been
> reported|https://github.com/cloudera/impyla/issues/286], and therefore [impyla
> pins its Thrift version to
> 0.9.3|https://github.com/cloudera/impyla/commit/94a8eff9cda0cdb16b180c7079961449c8385997].
> On the other hand, hmsclient (introduced by AIRFLOW-2336) needs Thrift
> 0.11.0+.
> With a lower version, importing hmsclient fails as follows:
> {code}
> $ pip show thrift
> Name: thrift
> Version: 0.10.0
> (snip)
> $ python -m airflow.hooks.hive_hooks
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
>     "__main__", fname, loader, pkg_name)
>   File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
>     exec code in run_globals
>   File "/home/sekikn/dev/incubator-airflow/airflow/hooks/hive_hooks.py", line 33, in <module>
>     import hmsclient
>   File "/home/sekikn/.virtualenvs/a/local/lib/python2.7/site-packages/hmsclient/__init__.py", line 2, in <module>
>     from .hmsclient import HMSClient
>   File "/home/sekikn/.virtualenvs/a/local/lib/python2.7/site-packages/hmsclient/hmsclient.py", line 23, in <module>
>     from .genthrift.hive_metastore import ThriftHiveMetastore
>   File "/home/sekikn/.virtualenvs/a/local/lib/python2.7/site-packages/hmsclient/genthrift/hive_metastore/ThriftHiveMetastore.py", line 11, in <module>
>     from thrift.TRecursive import fix_spec
> ImportError: No module named TRecursive
> {code}
> As a result, no single Thrift version satisfies both impyla and hmsclient on
> Python 2, so HiveServer2Hook is currently unusable there.
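
The conflict described in the report can be illustrated with a small sketch. The helper names `parse_version` and `check_thrift` are hypothetical and not part of Airflow, impyla, or hmsclient. The thresholds come from the report itself (impyla breaks with Thrift 0.10.0+ on Python 2, hmsclient needs Thrift 0.11.0+), not from either package's metadata. The report says nothing about Python 3, so treating it as unaffected is an assumption.

```python
import sys


def parse_version(text):
    """Turn a dotted version such as '0.11.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".")[:3])


def check_thrift(thrift_version, python_major=sys.version_info[0]):
    """Return (impyla_ok, hmsclient_ok) for a given Thrift version.

    Thresholds are taken from the report, not from package metadata:
    impyla fails with Thrift 0.10.0+ on Python 2, and hmsclient needs
    Thrift 0.11.0+. Python 3 is assumed to be unaffected for impyla.
    """
    version = parse_version(thrift_version)
    impyla_ok = python_major != 2 or version < (0, 10, 0)
    hmsclient_ok = version >= (0, 11, 0)
    return impyla_ok, hmsclient_ok


if __name__ == "__main__":
    # Try the three versions mentioned in the report on Python 2.
    for candidate in ("0.9.3", "0.10.0", "0.11.0"):
        print(candidate, check_thrift(candidate, python_major=2))
```

Under these assumptions, none of the three versions makes both flags true on Python 2, which matches the report's conclusion.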