[
https://issues.apache.org/jira/browse/IMPALA-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Knupp updated IMPALA-9489:
--------------------------------
Comment: was deleted
(was: Review available at https://gerrit.cloudera.org/c/15417/)
> Setup impala-shell.sh env separately, and use thrift-0.11.0 by default
> ----------------------------------------------------------------------
>
> Key: IMPALA-9489
> URL: https://issues.apache.org/jira/browse/IMPALA-9489
> Project: IMPALA
> Issue Type: Improvement
> Components: Infrastructure
> Affects Versions: Impala 3.4.0
> Reporter: David Knupp
> Assignee: David Knupp
> Priority: Major
>
> [Note: this JIRA was filed in relation to the ongoing effort to make the
> impala-shell compatible with python 3]
> The impala python development environment is a fairly convoluted affair -- a
> number of packages are installed in infra/python/env, some come from the
> toolchain, and some are generated and live in the shell directory.
> Generally speaking, if you launch impala-python and import a module, it's not
> necessarily easy to predict where the module might live.
> {noformat}
> $ python
> Python 2.7.10 (default, Aug 17 2018, 19:45:58)
> [GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.0.42)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sasl
> >>> sasl
> <module 'sasl' from
> '/home/systest/Impala/shell/ext-py/sasl-0.1.1/dist/sasl-0.1.1-py2.7-linux-x86_64.egg/sasl/__init__.pyc'>
> >>> import requests
> >>> requests
> <module 'requests' from
> '/home/systest/Impala/infra/python/env/local/lib/python2.7/site-packages/requests/__init__.pyc'>
> >>> import Logging
> >>> Logging
> <module 'Logging' from
> '/home/systest/Impala/shell/gen-py/Logging/__init__.pyc'>
> >>> import thrift
> >>> thrift
> <module 'thrift' from
> '/home/systest/Impala/toolchain/thrift-0.9.3-p7/python/lib/python2.7/site-packages/thrift/__init__.pyc'>
> {noformat}
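One way to make the lookup in the session above less of a guessing game is to ask the import system where each module would come from, without importing it. This is a minimal sketch that assumes a Python 3 interpreter (`importlib.util.find_spec` is not available on Python 2.7); the helper names `module_origin` and `report_origins` are made up for illustration and are not part of the Impala tree.

```python
import importlib.util


def module_origin(name):
    """Return where a top-level module would be loaded from, or None.

    None is returned when the module cannot be found. Namespace packages
    also report no single origin file.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # spec.origin is a file path for ordinary modules and "built-in" or
    # "frozen" for modules compiled into the interpreter.
    return spec.origin


def report_origins(names):
    """Map each module name to its origin without importing the module."""
    return {name: module_origin(name) for name in names}


if __name__ == "__main__":
    # Module names taken from the interpreter session in the description.
    for name, origin in report_origins(["sasl", "requests", "thrift"]).items():
        print("{:<10} {}".format(name, origin or "<not importable>"))
```

Run under each invocation context, the output shows which directory wins for each module in that context.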
> Really, there is no one coherent environment -- there's just whatever
> collection of modules happens to be available at a given time for a given
> type of invocation, all of which is accomplished behind the scenes by calling
> scripts like {{bin/set-pythonpath.sh}} and {{bin/impala-python-common.sh}}
> that are responsible for cobbling together a PYTHONPATH based on known
> locations and current env variables.
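A rough sketch of the kind of path assembly described here, written in Python for illustration. It is not the contents of either script: the function name `build_pythonpath`, its parameters, and the exact directory layout are assumptions modeled on the paths visible in the interpreter session.

```python
import os


def build_pythonpath(impala_home, thrift_version, existing=""):
    """Join known module locations into a PYTHONPATH string.

    The directory layout is illustrative (an assumption), modeled on the
    paths shown in the issue description, not copied from the real scripts.
    """
    entries = [
        os.path.join(impala_home, "shell", "gen-py"),
        os.path.join(impala_home, "toolchain", "thrift-" + thrift_version,
                     "python", "lib", "python2.7", "site-packages"),
    ]
    # Keep whatever the caller already had, after the known locations.
    entries.extend(p for p in existing.split(os.pathsep) if p)
    # Drop duplicates while preserving order: the first match wins on import.
    seen = set()
    ordered = []
    for entry in entries:
        if entry not in seen:
            seen.add(entry)
            ordered.append(entry)
    return os.pathsep.join(ordered)
```

Because the first matching entry wins at import time, the order in which such a function lists its entries decides which copy of a module is actually used.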
> As far as I can tell, there are three important contexts where python comes
> into play...
> * during the build process (used during data load, e.g.,
> testdata/bin/load_nested.py)
> * when running the py.test-based e2e tests
> * whenever the impala-shell is invoked
> As noted by IMPALA-7825 (and also in a conversation I had with
> [~stakiar_impala_496e]), we're dependent on thrift 0.9.3 during the build
> process. This seems to come into play during the loading of test data
> (specifically, when calling testdata/bin/load_nested.py) mainly because at
> one point there was some well-intentioned but probably misguided attempt at
> code reuse from the test framework. The test code that gets re-used involves
> impyla and/or thrift-sasl, which currently still relies on thrift 0.9.3. So
> our test framework, and by extension the build, both inherit the same
> limitation.
> The impala-shell, on the other hand, luckily doesn't directly reuse any of
> the same test modules, and there really is no need to keep it pinned to
> 0.9.3. However, since calling impala-shell.sh winds up invoking
> {{set-pythonpath.sh}}, the same script that sets up the environment
> during building or testing, thrift 0.9.3 just kind of leaks over by default.
> As it turns out, thrift 0.9.3 is also one of the many limitations restricting
> the impala-shell to python 2. Luckily, with IMPALA-7924 resolved,
> thrift-0.11.0 is available -- we just have to use it. And the way to
> accomplish that is by decoupling the impala-shell from relying on either
> {{set-pythonpath.sh}} or {{impala-python-common.sh}}.
> As a first pass, we can address the dev environment by just having
> {{impala-shell.sh}} itself do whatever is required to find python
> dependencies, and we can specify thrift-0.11.0 there. Also, thrift 0.11.0
> should be used by both of the scripts used to create the tarballs that
> package the impala-shell for customer environments. Neither of these changes
> should adversely affect building Impala or running the py.test test framework.
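To guard against thrift 0.9.3 leaking back into the shell's environment, a small check could inspect the search path and report which thrift directory comes first. This is only a sketch: it assumes that toolchain directory names embed the version (as in `thrift-0.9.3-p7` from the description), and the function names are hypothetical.

```python
import os
import re

# Matches version-bearing directory names such as "thrift-0.9.3-p7".
_THRIFT_DIR = re.compile(r"thrift-(\d+\.\d+\.\d+)")


def thrift_versions_on_path(pythonpath):
    """List thrift versions named in PYTHONPATH entries, in search order."""
    versions = []
    for entry in pythonpath.split(os.pathsep):
        match = _THRIFT_DIR.search(entry)
        if match:
            versions.append(match.group(1))
    return versions


def shell_uses_expected_thrift(pythonpath, expected="0.11.0"):
    """True when the first thrift directory on the path is the expected one."""
    versions = thrift_versions_on_path(pythonpath)
    return bool(versions) and versions[0] == expected
```

A check like this could be run against `os.environ.get("PYTHONPATH", "")` from within the shell's launcher; it only inspects directory names, so it says nothing about a thrift installed into site-packages by other means.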
--
This message was sent by Atlassian Jira
(v8.3.4#803005)