Hello Quanlong Huang, Laszlo Gaal, Jim Apple, Joe McDonnell, Csaba Ringhofer, 
Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11731

to look at the new patch set (#6).

Change subject: IMPALA-7698: Add centos support to bootstrap_system.
......................................................................

IMPALA-7698: Add centos support to bootstrap_system.

Largely, the changes involve conditionalizing some invocations to
account for differences between RH and Ubuntu. The trickiest bits were
timezone-related test errors (see below), postgresql permissions (need
to accept md5 passwords from localhost) and default ulimits (1024 user
processes/threads is not enough).

To test this, I built using test-with-docker. In additional to the
ulimit issue, I ran into the fact that /tmp needed 1777 permissions for
the postgresql socket, and entrypoint.sh had a few places that needed
special cases. At the moment, the data load ran fine, as did most of the
tests. I observed a test that relied on a python2.7-ism fail, which is
part of the point of this.

In the course of development, I encountered a handful of tests fail with
"Encounter parse error: failed to open /usr/share/zoneinfo/GMT-08:00 -
No such file or directory.", which was reproduced as follows:

    [localhost:21000] default> use functional_orc_def; select * from alltypes;
    ...
    WARNINGS: Encounter parse error: failed to open 
/usr/share/zoneinfo/GMT-08:00 - No such file or directory.

With Quanlong's help, I learned what was happening. test-with-docker was
translating my time zone (America/Los_Angeles) to US/Pacific-New,
because realpath(/etc/localtime) = US/Pacific-New. This timezone exists
in centos:6, so that wasn't a problem. However, this timezone does not
exist in the package "tzdata-java", which is the copy of the timezone
information used by Java. (There are bugs here that may have been fixed
in centos:7.) As a result, when ORC asks (by using
TimeZone.getDefault().getID()) the JDK
(src/solaris/native/java/util/TimeZone_md.c) for the default timezone,
it can't find the same name as /etc/localtime points to in its
repository and defaults to "GMT-08:00". This string then gets written
into the ORC files generated by Hive as part of data load, and then the
C++ library can't read them. This is fixed by changing "realpath"
to "readlink" in test-with-docker.py.

centos:7 is not addressed by this change. The move to systemd makes
"service sshd start" (and the same for postgresql) not work, and
additional care needs to be done to work around that.

This change is a joint effort with Laszlo Gaal.

Change-Id: Id54294d7607f51de87a9de373dcfc4a33f4bedf5
---
M bin/bootstrap_system.sh
M docker/entrypoint.sh
M docker/test-with-docker.py
3 files changed, 166 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/11731/6
--
To view, visit http://gerrit.cloudera.org:8080/11731
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id54294d7607f51de87a9de373dcfc4a33f4bedf5
Gerrit-Change-Number: 11731
Gerrit-PatchSet: 6
Gerrit-Owner: Philip Zeyliger <phi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jbapple-imp...@apache.org>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <laszlo.g...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>

Reply via email to