Re: Time for graduation?

2017-10-16 Thread Henry Robinson
+1, would be terrific to graduate.

On 16 October 2017 at 00:47, Tom White  wrote:

> +1 from me.
>
> Tom
>
> On Mon, Oct 16, 2017 at 12:59 AM, Michael Brown 
> wrote:
> > I am in favor of graduation.
> >
> > On Thu, Oct 12, 2017 at 2:17 PM, Todd Lipcon  wrote:
> >
> >> Hey Impala community,
> >>
> >> It's been a while since all of the Impala infrastructure was moved
> >> over, and the community appears to be functioning healthily, generating
> >> new releases on a regular cadence as well as adding new committers and
> >> PPMC members. All of the branding stuff seems great, and the user
> >> mailing list has a healthy amount of traffic and a good track record of
> >> answering questions when they come up.
> >>
> >> As a mentor I think it's probably time to discuss graduation. The
> >> project is already functioning in the same way as your typical Apache
> >> TLP and it seems like it's time to become one.
> >>
> >> Any thoughts? If everyone is on board, the next steps would be:
> >>
> >> 1. Pick the initial PMC chair for the TLP. According to the published
> >> Impala Bylaws it seems that this is meant to rotate annually, so no
> >> need to stress too much about it.
> >>
> >> A couple of obvious choices here would be Marcel (as the original
> >> founder of the project) or perhaps Jim (who has done yeoman's work on a
> >> lot of the incubation process, podling reports, etc). Others could
> >> certainly volunteer or be nominated as well.
> >>
> >> 2. Draft a Resolution for the PPMC and IPMC to vote upon.
> >> -- the resolution would include the above-decided chair as well as the
> >> list of initial PMC, etc.
> >> -- the Initial PMC could be just the current list of PPMC, or you could
> >> consider adding others at this point as well.
> >>
> >>
> >> I can help with the above process but figured I'd solicit opinions
> >> first on whether the community feels it's ready to graduate.
> >>
> >> Thanks
> >> Todd
> >>
>


Re: expr-test stuck in getJNIEnv

2017-09-10 Thread Henry Robinson
I've seen this deadlock before although not in expr-test. I can't remember
exactly how I cleared it but I believe it was either:

1. make fe && . bin/set-classpath.sh
2. bin/create-test-configuration.sh

Sailesh knows about the upstream HDFS bug which I think has been fixed but
not incorporated into Impala's dependencies.

On Sun, Sep 10, 2017 at 1:42 PM Jim Apple  wrote:

> When I run expr-test, it gets stuck in getJNIEnv(). Here's the full stack
> trace:
>
> #0  __lll_lock_wait () at
> ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
> #1  0x74885dbd in __GI___pthread_mutex_lock (mutex=0x45fe5a0
> ) at ../nptl/pthread_mutex_lock.c:80
> #2  0x02cf79f6 in mutexLock (m=<optimized out>) at
>
> /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/os/posix/mutexes.c:28
> #3  0x02cf01b7 in setTLSExceptionStrings (rootCause=0x0,
> stackTrace=0x0) at
>
> /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:581
> #4  0x02cf7f77 in printExceptionAndFreeV (env=0x4f221e8,
> exc=0x4eb6a00, noPrintFlags=<optimized out>, fmt=0x33d7994
> "loadFileSystems", ap=0x7fff9da0)
> at
> /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/exception.c:183
> #5  0x02cf81dd in printExceptionAndFree (env=<optimized out>,
> exc=<optimized out>, noPrintFlags=<optimized out>, fmt=<optimized out>)
> at
> /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/exception.c:213
> #6  0x02cf0faf in getGlobalJNIEnv () at
>
> /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:463
> #7  getJNIEnv () at
>
> /data/2/jenkins/workspace/impala-hadoop-dependency/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c:528
> #8  0x01a0116a in impala::JniUtil::Init () at
> be/src/util/jni-util.cc:105
> #9  0x014fd881 in impala::InitCommonRuntime (argc=1,
> argv=0x7fffa628, init_jvm=true,
> test_mode=impala::TestInfo::BE_TEST) at be/src/common/init.cc:236
> #10 0x0143da3f in main (argc=1, argv=0x7fffa628) at
> be/src/exprs/expr-test.cc:7420
>
> I've tried git fetch, bin/clean.sh, running with the minicluster on,
> running with the minicluster off, running with the impala cluster on,
> running with it off, running in release mode, debug mode, in gdb, and
> out of gdb.
>
> Has anyone else seen this and escaped from its clutches?
>


Re: Encountering failure during build on docker

2017-09-08 Thread Henry Robinson
Yeah, it's likely you're out of memory. These messages:

"Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report."

come from gcc failing with an internal error, which in my experience is
almost always the OOM killer getting involved. You could try building with
less parallelism, or increasing the available memory. You can run
something like "dmesg | egrep -i 'killed process'" to check the OOM killer's
activity.
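
The check above can be wrapped in a small helper. This is only a sketch: the
function name oom_kills is made up for illustration, and it assumes the
kernel log reports OOM kills with the usual "Killed process" wording. It
reads from stdin so it works on live dmesg output or on a saved log.

```shell
#!/usr/bin/env bash
# Print kernel log lines that record an OOM kill.
# Reads from stdin; prints nothing (and still succeeds) when there is no match.
oom_kills() {
  grep -Ei 'killed process' || true
}

# Typical use (dmesg may need root on some systems):
#   dmesg | oom_kills
```

If this prints a line naming cc1plus or another compiler process, the build
very likely ran out of memory rather than hitting a real compiler bug.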

On 8 September 2017 at 10:46, Philip Zeyliger  wrote:

> It's a little bit of a shot in the dark, but your CPU/memory ratio may be
> significantly different than what most folks are using. I've seen gcc and
> friends fail kind of opaquely when you're out of memory. (Or out of disk
> space, for that matter.)
>
> I've been personally following the Docker instructions successfully on an
> Ubuntu16 host in Google Compute Engine. I've run into
> https://issues.apache.org/jira/browse/IMPALA-5765, but not reliably.
>
> -- Philip
>
> On Fri, Sep 8, 2017 at 9:44 AM, Jim Apple  wrote:
>
> > Hm. Haven't seen this before. Does "MBP" stand for "Mac Book Pro"?
> > This could be an issue with the Docker instructions in
> > bootstrap_development.sh not accounting for some transparency in
> > Docker exposing the host to the container.
> >
> > If possible, can you send the full stdout and stderr from that last
> > command?
> >
> > On Fri, Sep 8, 2017 at 8:57 AM, Manaswini Maharana
> >  wrote:
> > > Here you go -
> > >
> > >
> > > > 1. The command you used to start the docker container
> > > > mmaharana-MBP:~ mmaharana$ docker pull ubuntu:16.04
> > > > mmaharana-MBP:~ mmaharana$ docker run --privileged --interactive --tty
> > > > --name impala-dev ubuntu:16.04 bash
> > > >
> > > > 2. The output of gcc --version inside your docker container
> > > > impdev@ea385187b032:~$ gcc --version
> > > > gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
> > > > Copyright (C) 2015 Free Software Foundation, Inc.
> > > > This is free software; see the source for copying conditions.  There is NO
> > > > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR
> > > > PURPOSE.
> > > >
> > > >
> > > > 3. The output of lsb_release -a both in the host and inside the docker
> > > > container
> > > > On Host:
> > > > mmaharana-MBP:~ mmaharana$ lsb_release -a
> > > > -bash: lsb_release: command not found
> > > >
> > > > On Container:
> > > > impdev@ea385187b032:~$ lsb_release -a
> > > > No LSB modules are available.
> > > > Distributor ID: Ubuntu
> > > > Description: Ubuntu 16.04.3 LTS
> > > > Release: 16.04
> > > > Codename: xenial
> > > >
> > > > 4. The commands you ran inside the container
> > > > root@ea385187b032:/# apt-get update
> > > > root@ea385187b032:/# apt-get install sudo
> > > > root@ea385187b032:/# adduser --disabled-password --gecos '' impdev
> > > > root@ea385187b032:/# echo 'impdev ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
> > > > root@ea385187b032:/# su - impdev
> > > > impdev@ea385187b032:~$ sudo apt-get --yes install git
> > > > impdev@ea385187b032:~$ git clone
> > > > https://git-wip-us.apache.org/repos/asf/incubator-impala.git ~/Impala
> > > > impdev@ea385187b032:~$ source ~/Impala/bin/bootstrap_development.sh
> > >
> > >
> > > Thanks!
> > > Mansi
> > >
> > >
> > >
> > >
> > >
> > > On Fri, Sep 8, 2017 at 10:11 AM, Jim Apple 
> wrote:
> > >
> > >> Can you provide:
> > >>
> > >> 1. The command you used to start the docker container
> > >>
> > >> 2. The output of gcc --version inside your docker container
> > >>
> > >> 3. The output of lsb_release -a both in the host and inside the docker
> > >> container
> > >>
> > >> 4. The commands you ran inside the container
> > >>
> > >> Thank you!
> > >>
> > >> On Fri, Sep 8, 2017 at 7:58 AM, Manaswini Maharana
> > >>  wrote:
> > >> > Hello team,
> > >> >
> > >> > I'm trying to setup docker for development and encountering the
> below
> > >> issue
> > >> > during bootstrap_development.sh sourcing. Any pointers on how to
> > resolve
> > >> > this? If you need more stack trace to backtrack or any other kind of
> > >> > information to debug let me know.
> > >> >
> > >> >
> > > >> > Please submit a full bug report,
> > > >> > with preprocessed source if appropriate.
> > > >> > Please include the complete backtrace with any bug report.
> > > >> > See  for instructions.
> > > >> >
> > > >> > be/src/service/CMakeFiles/Service.dir/build.make:123: recipe for target
> > > >> > 'be/src/service/CMakeFiles/Service.dir/impala-server.cc.o' failed
> > > >> >
> > > >> > make[2]: *** [be/src/service/CMakeFiles/Service.dir/impala-server.cc.o]
> > > >> > Error 4
> > > >> >
> > > >> > CMakeFiles/Makefile2:5694: recipe for target
> > >> > 

Re: [DISCUSS] 2.10.0 release

2017-08-24 Thread Henry Robinson
Amazing job, Tim!

On 24 August 2017 at 17:55, Tim Armstrong  wrote:

> All of the IMPALA-3200 work is now in master!
>
> On Wed, Aug 23, 2017 at 12:21 PM, Tim Armstrong 
> wrote:
>
> > I was looking through open JIRAs to make sure I didn't drop the ball on
> > any buffer pool changes and discovered we have 100+ open JIRAs targeted
> > for 2.10: https://issues.apache.org/jira/issues/?filter=12341748
> >
> > It would be great to clean those up. I tried to clean up the ones that I
> > know something about but most of them I'm not familiar with. It looks
> > like a lot aren't being actively worked on so probably belong in the
> > backlog - the target version seems to just be expressing a hope that
> > someone else will fix it soon.
> >
> > You can check your own 2.10 JIRAs with this filter:
> > https://issues.apache.org/jira/issues/?filter=12341563
> >
> > There are also a bunch of unassigned ones:
> > https://issues.apache.org/jira/issues/?filter=12341750
> >
> >
> >
> > On Mon, Aug 14, 2017 at 11:09 AM, Bharath Vissapragada <
> > bhara...@cloudera.com> wrote:
> >
> >> Agreed Tim.
> >>
> >> On Mon, Aug 14, 2017 at 9:13 AM, Tim Armstrong  >
> >> wrote:
> >>
> >> > Sounds good to me. We should coordinate to make sure that all of
> >> > https://issues.apache.org/jira/browse/IMPALA-3200 (the buffer pool
> >> > changes)
> >> > and related fixes make it into the release.
> >> >
> >> > - Tim
> >> >
> >> > On Mon, Aug 14, 2017 at 5:52 AM, Jim Apple 
> >> wrote:
> >> >
> >> > > This sounds like a good idea to me. Thank you for volunteering!
> >> > >
> >> > > On Mon, Aug 14, 2017 at 12:37 AM, Bharath Vissapragada
> >> > >  wrote:
> >> > > > Folks,
> >> > > >
> >> > > > It has been almost 2 months since we released Apache Impala
> >> > > > (incubating) 2.9.0 and there have been new feature improvements
> >> > > > and a good number of bug fixes checked in since then.
> >> > > >
> >> > > > I propose that we release 2.10.0 soon and I volunteer to be its
> >> > > > release manager. Please speak up and let the community know if
> >> > > > anyone has any objections to this.
> >> > > >
> >> > > > Thanks,
> >> > > > Bharath
> >> > >
> >> >
> >>
> >
> >
>


Small change to all-build-options job

2017-08-12 Thread Henry Robinson
[You can probably skip this mail unless you're interested. TLDR: I slightly
changed the all-build-options job in a very minor way; let me know if you
see issues with it]

Impala will shortly have a dependency on libkrb5-dev, which provides
Kerberos headers and libraries for security on the KRPC branch. As of Jim's
recent work, the bootstrap_development.sh script installs that on an Ubuntu
machine as a pre-requisite. I have added it to bootstrap_build.sh myself.

However, the all-build-options job doesn't seem to use either script, and
does not appear to install any dependencies. This blocks the patch that
introduces the libkrb5 dependency from passing GVO. So for now, I've added
the apt-get line from our bootstrap commands to the job script directly.
This is a temporary change, and over the next few days I'll file a JIRA to
sort out the bootstrap scripts so we can call it directly (if all GVO jobs
are succeeding).

My test jobs have passed the apt-get statement (having installed
libkrb5-dev successfully), and seem to be proceeding fine, so there will
probably be no visible effect from this change. But if you see suspicious
failures in all-build-options, let me know.

Thanks,
Henry


Re: problem about buildall.sh

2017-08-03 Thread Henry Robinson
Glad to hear it - and sorry about this annoying edge case.

On 3 August 2017 at 04:14, yu feng <olaptes...@gmail.com> wrote:

> Great! It finished successfully after removing glog-0.3.4-p2 and pulling
> the newest commit. Thanks to all of you.
>
> 2017-08-03 13:30 GMT+08:00 Henry Robinson <he...@apache.org>:
>
> > I think you're hitting the problem where you're not picking up the most
> > recent version of gflags. IMPALA-5659 added .so building to gflags, but
> > didn't bump the patch level or version number, and the toolchain doesn't
> > have a great way of dealing with that situation.
> >
> > Do the following:
> >
> > rm -rf ${IMPALA_TOOLCHAIN}/gflags-${IMPALA_GFLAGS_VERSION}
> > . bin/impala-config.sh
> > bin/bootstrap_toolchain.py
> >
> > and then try rebuilding.
> >
> >
> > On 2 August 2017 at 22:24, yu feng <olaptes...@gmail.com> wrote:
> >
> > > I will try deleting toolchain/glog-0.3.4-p2/ and rebuilding. However, I
> > > have removed all toolchains before. I will try again now.
> > >
> > > 2017-08-03 12:49 GMT+08:00 Tim Armstrong <tarmstr...@cloudera.com>:
> > >
> > > > It looks like the error is:
> > > >
> > > > > -- --> Adding thirdparty library glog. <--
> > > > > -- Header files: /home/hzfengyu/impala/apache-
> > impala/incubator-impala/
> > > > > toolchain/glog-0.3.4-p2/include
> > > > > -- Added shared library dependency glog:
> > /home/hzfengyu/impala/apache-
> > > > > impala/incubator-impala/toolchain/glog-0.3.4-p2/lib/libglog.so
> > > > > CMake Error at CMakeLists.txt:178 (IMPALA_ADD_THIRDPARTY_LIB):
> > > > >   IMPALA_ADD_THIRDPARTY_LIB Function invoked with incorrect
> arguments
> > > for
> > > > >   function named: IMPALA_ADD_THIRDPARTY_LIB
> > > >
> > > > Has anyone seen this before?
> > > >
> > > > One thing you can try is deleting glog then rebuilding. E.g.
> > > >
> > > > rm -r toolchain/glog-0.3.4-p2/
> > > >
> > > > On Wed, Aug 2, 2017 at 9:34 PM, yu feng <olaptes...@gmail.com>
> wrote:
> > > >
> > > > > CMakeOutput.log :
> > > > >
> > > > >
> > > > >
> > > > > The system is: Linux - 3.16.0-4-amd64 - x86_64
> > > > > Compiling the C compiler identification source file
> > > "CMakeCCompilerId.c"
> > > > > succeeded.
> > > > > Compiler:
> > > > > /home/hzfengyu/impala/apache-impala/incubator-impala/
> > > > > toolchain/gcc-4.9.2/bin/gcc
> > > > > Build flags:
> > > > > Id flags:
> > > > >
> > > > > The output was:
> > > > > 0
> > > > >
> > > > >
> > > > > Compilation of the C compiler identification source
> > > "CMakeCCompilerId.c"
> > > > > produced "a.out"
> > > > >
> > > > > The C compiler identification is GNU, found in
> > > > > "/home/hzfengyu/impala/apache-impala/incubator-impala/
> > > > > CMakeFiles/3.2.3/CompilerIdC/a.out"
> > > > >
> > > > > Compiling the CXX compiler identification source file
> > > > > "CMakeCXXCompilerId.cpp" succeeded.
> > > > > Compiler:
> > > > > /home/hzfengyu/impala/apache-impala/incubator-impala/
> > > > > toolchain/gcc-4.9.2/bin/g++
> > > > > Build flags:
> > > > > Id flags:
> > > > >
> > > > > The output was:
> > > > > 0
> > > > >
> > > > >
> > > > > Compilation of the CXX compiler identification source
> > > > > "CMakeCXXCompilerId.cpp" produced "a.out"
> > > > >
> > > > > The CXX compiler identification is GNU, found in
> > > > > "/home/hzfengyu/impala/apache-impala/incubator-impala/
> > > CMakeFiles/3.2.3/
> > > > > CompilerIdCXX/a.out"
> > > > >
> > > > > Determining if the C compiler works passed with the following
> output:
> > > > > Change Dir:
> > > > > /home/hzfengyu/impala/apache-impala/incubator-impala/
> > > CMakeFiles/CMakeTmp
> > > > >
> > > > > Run Build Command:"/usr/bin/make" "cmTryCompileExec2174981

Re: Impala Java API

2017-08-03 Thread Henry Robinson
I always forget about JDBC; that would be a much better option.

On Thu, Aug 3, 2017 at 8:47 AM Silvius Rus <s...@cloudera.com> wrote:

> Can you use JDBC?
>
> Silvius
>
> > On Aug 2, 2017, at 9:24 PM, Henry Robinson <he...@apache.org> wrote:
> >
> > Impala's clients all communicate with Impala via Apache Thrift, which is
> > a serialization and RPC format that has bindings for multiple languages.
> > In fact, Impala generates Thrift stubs for Java when the frontend is built
> > (see for example
> > ${IMPALA_HOME}/fe/generated-sources/gen-java/org/apache/impala/thrift/ImpalaHiveServer2Service.java).
> >
> > Your best bet is to write a client to the HiveServer2Service.
> >
> > Henry
> >
> >> On 2 August 2017 at 19:44, zhangwenyang <zhangweny...@neusoft.com>
> wrote:
> >>
> >> Hi,
> >>
> >> Our team wants to use Impala for in-time queries.
> >> But we can't find a Java API, for "refresh" for example.
> >> If we want to refresh and get data from a Java backend service, what
> >> should we do?
> >>
> >> Thanks
> >>
> >>
> >>
> >>
> >> zhangwenyang
> >>
> >>
>


Re: problem about buildall.sh

2017-08-02 Thread Henry Robinson
I think you're hitting the problem where you're not picking up the most
recent version of gflags. IMPALA-5659 added .so building to gflags, but
didn't bump the patch level or version number, and the toolchain doesn't
have a great way of dealing with that situation.

Do the following:

rm -rf ${IMPALA_TOOLCHAIN}/gflags-${IMPALA_GFLAGS_VERSION}
. bin/impala-config.sh
bin/bootstrap_toolchain.py

and then try rebuilding.
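
Since the first step is an rm -rf built from two environment variables, a
guarded version may be safer. This is a sketch only: the function name
remove_stale_gflags is made up for illustration, and it assumes
IMPALA_TOOLCHAIN and IMPALA_GFLAGS_VERSION are already exported in your
shell (they normally come from bin/impala-config.sh).

```shell
#!/usr/bin/env bash
# Remove the stale gflags directory from the toolchain so that the next
# bootstrap fetches it again. The ":?" expansions abort with an error if a
# variable is unset or empty, instead of letting the path collapse to
# something like "/gflags-".
remove_stale_gflags() {
  local dir="${IMPALA_TOOLCHAIN:?not set; source bin/impala-config.sh first}/gflags-${IMPALA_GFLAGS_VERSION:?not set}"
  rm -rf -- "$dir"
  echo "removed $dir"
}

# Afterwards, continue with the remaining steps from this mail:
#   . bin/impala-config.sh
#   bin/bootstrap_toolchain.py
```

The guard matters because in a fresh shell where the Impala environment has
not been sourced, the unguarded command would expand to "rm -rf /gflags-".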


On 2 August 2017 at 22:24, yu feng  wrote:

> I will try deleting toolchain/glog-0.3.4-p2/ and rebuilding. However, I have
> removed all toolchains before. I will try again now.
>
> 2017-08-03 12:49 GMT+08:00 Tim Armstrong :
>
> > It looks like the error is:
> >
> > > -- --> Adding thirdparty library glog. <--
> > > -- Header files: /home/hzfengyu/impala/apache-impala/incubator-impala/
> > > toolchain/glog-0.3.4-p2/include
> > > -- Added shared library dependency glog: /home/hzfengyu/impala/apache-
> > > impala/incubator-impala/toolchain/glog-0.3.4-p2/lib/libglog.so
> > > CMake Error at CMakeLists.txt:178 (IMPALA_ADD_THIRDPARTY_LIB):
> > >   IMPALA_ADD_THIRDPARTY_LIB Function invoked with incorrect arguments
> for
> > >   function named: IMPALA_ADD_THIRDPARTY_LIB
> >
> > Has anyone seen this before?
> >
> > One thing you can try is deleting glog then rebuilding. E.g.
> >
> > rm -r toolchain/glog-0.3.4-p2/
> >
> > On Wed, Aug 2, 2017 at 9:34 PM, yu feng  wrote:
> >
> > > CMakeOutput.log :
> > >
> > >
> > >
> > > The system is: Linux - 3.16.0-4-amd64 - x86_64
> > > Compiling the C compiler identification source file
> "CMakeCCompilerId.c"
> > > succeeded.
> > > Compiler:
> > > /home/hzfengyu/impala/apache-impala/incubator-impala/
> > > toolchain/gcc-4.9.2/bin/gcc
> > > Build flags:
> > > Id flags:
> > >
> > > The output was:
> > > 0
> > >
> > >
> > > Compilation of the C compiler identification source
> "CMakeCCompilerId.c"
> > > produced "a.out"
> > >
> > > The C compiler identification is GNU, found in
> > > "/home/hzfengyu/impala/apache-impala/incubator-impala/
> > > CMakeFiles/3.2.3/CompilerIdC/a.out"
> > >
> > > Compiling the CXX compiler identification source file
> > > "CMakeCXXCompilerId.cpp" succeeded.
> > > Compiler:
> > > /home/hzfengyu/impala/apache-impala/incubator-impala/
> > > toolchain/gcc-4.9.2/bin/g++
> > > Build flags:
> > > Id flags:
> > >
> > > The output was:
> > > 0
> > >
> > >
> > > Compilation of the CXX compiler identification source
> > > "CMakeCXXCompilerId.cpp" produced "a.out"
> > >
> > > The CXX compiler identification is GNU, found in
> > > "/home/hzfengyu/impala/apache-impala/incubator-impala/
> CMakeFiles/3.2.3/
> > > CompilerIdCXX/a.out"
> > >
> > > Determining if the C compiler works passed with the following output:
> > > Change Dir:
> > > /home/hzfengyu/impala/apache-impala/incubator-impala/
> CMakeFiles/CMakeTmp
> > >
> > > Run Build Command:"/usr/bin/make" "cmTryCompileExec2174981562/fast"
> > > /usr/bin/make -f CMakeFiles/cmTryCompileExec2174981562.dir/build.make
> > > CMakeFiles/cmTryCompileExec2174981562.dir/build
> > > make[1]: Entering directory
> > > '/home/hzfengyu/impala/apache-impala/incubator-impala/
> > CMakeFiles/CMakeTmp'
> > > /home/hzfengyu/impala/apache-impala/incubator-impala/
> > > toolchain/cmake-3.2.3-p1/bin/cmake
> > > -E cmake_progress_report
> > > /home/hzfengyu/impala/apache-impala/incubator-impala/
> > > CMakeFiles/CMakeTmp/CMakeFiles
> > > 1
> > > Building C object
> > > CMakeFiles/cmTryCompileExec2174981562.dir/testCCompiler.c.o
> > > /home/hzfengyu/impala/apache-impala/incubator-impala/
> > > toolchain/gcc-4.9.2/bin/gcc
> > >-o CMakeFiles/cmTryCompileExec2174981562.dir/testCCompiler.c.o   -c
> > > /home/hzfengyu/impala/apache-impala/incubator-impala/
> > CMakeFiles/CMakeTmp/
> > > testCCompiler.c
> > > Linking C executable cmTryCompileExec2174981562
> > > /home/hzfengyu/impala/apache-impala/incubator-impala/
> > > toolchain/cmake-3.2.3-p1/bin/cmake
> > > -E cmake_link_script CMakeFiles/cmTryCompileExec2174981562.
> dir/link.txt
> > > --verbose=1
> > > /home/hzfengyu/impala/apache-impala/incubator-impala/
> > > toolchain/gcc-4.9.2/bin/gcc
> > >   CMakeFiles/cmTryCompileExec2174981562.dir/testCCompiler.c.o  -o
> > > cmTryCompileExec2174981562 -rdynamic
> > > make[1]: Leaving directory
> > > '/home/hzfengyu/impala/apache-impala/incubator-impala/
> > CMakeFiles/CMakeTmp'
> > >
> > >
> > > Detecting C compiler ABI info compiled with the following output:
> > > Change Dir:
> > > /home/hzfengyu/impala/apache-impala/incubator-impala/
> CMakeFiles/CMakeTmp
> > >
> > > Run Build Command:"/usr/bin/make" "cmTryCompileExec3739174323/fast"
> > > /usr/bin/make -f CMakeFiles/cmTryCompileExec3739174323.dir/build.make
> > > CMakeFiles/cmTryCompileExec3739174323.dir/build
> > > make[1]: Entering directory
> > > '/home/hzfengyu/impala/apache-impala/incubator-impala/
> > CMakeFiles/CMakeTmp'
> > > /home/hzfengyu/impala/apache-impala/incubator-impala/
> > > 

Re: Impala Java API

2017-08-02 Thread Henry Robinson
Impala's clients all communicate with Impala via Apache Thrift, which is a
serialization and RPC format that has bindings for multiple languages. In
fact, Impala generates Thrift stubs for Java when the frontend is built
(see for example
${IMPALA_HOME}/fe/generated-sources/gen-java/org/apache/impala/thrift/ImpalaHiveServer2Service.java).

Your best bet is to write a client to the HiveServer2Service.

Henry

On 2 August 2017 at 19:44, zhangwenyang  wrote:

> Hi,
>
> Our team wants to use Impala for in-time queries.
> But we can't find a Java API, for "refresh" for example.
> If we want to refresh and get data from a Java backend service, what
> should we do?
>
> Thanks
>
>
>
>
> zhangwenyang
>
>


Re: problem about buildall.sh

2017-08-02 Thread Henry Robinson
I don't think attachments come through on this mailing list (at least I
can't see anything attached to your mail). Could you paste the output in
pastebin or somewhere similar?

On Wed, Aug 2, 2017 at 8:44 PM yu feng  wrote:

> attach command output and CmakeOutput.log, Thanks a lot.
>
> 2017-08-02 23:52 GMT+08:00 Tim Armstrong :
>
>> Hi,
>>   I don't see an error in the output you pasted. Maybe it would help to
>> include the full output of buildall.sh and CmakeOutput.log
>>
>> On Wed, Aug 2, 2017 at 5:32 AM, yu feng  wrote:
>>
>> > Hi, I cloned Impala from
>> > https://git-wip-us.apache.org/repos/asf/incubator-impala.git, and tried to
>> > run 'bash buildall.sh -noclean -skiptests -build_shared_libs -format'. I
>> > finished building impalad two months ago. However, after I pulled the
>> > newest code, an error happened:
>> >
>> >
>> > -- Found JNI:
>> > /home/hzfengyu/impala/deploy/jdk1.7.0_79/jre/lib/amd64/
>> > libjawt.so;/home/hzfengyu/impala/deploy/jdk1.7.0_79/jre/
>> > lib/amd64/libjsig.so;/home/hzfengyu/impala/deploy/jdk1.7.
>> > 0_79/jre/lib/amd64/server/libjvm.so
>> >
>> > -- --> Adding thirdparty library java_jvm. <--
>> > -- Header files:
>> > /home/hzfengyu/impala/deploy/jdk1.7.0_79/include;/home/
>> > hzfengyu/impala/deploy/jdk1.7.0_79/include/linux;/home/
>> > hzfengyu/impala/deploy/jdk1.7.0_79/include
>> > -- Added static library dependency java_jvm:
>> > /home/hzfengyu/impala/deploy/jdk1.7.0_79/jre/lib/amd64/server/libjvm.so
>> > -- --> Adding thirdparty library breakpad_client. <--
>> > -- Header files:
>> > /home/hzfengyu/impala/apache-impala/incubator-impala/toolchain/breakpad-
>> > ffe3e478657dc7126fca6329dfcedc49f4c726d9-p2/include/breakpad
>> > -- Added static library dependency breakpad_client:
>> > /home/hzfengyu/impala/apache-impala/incubator-impala/toolchain/breakpad-
>> > ffe3e478657dc7126fca6329dfcedc49f4c726d9-p2/lib/libbreakpad_client.a
>> > -- Added shared library dependency rt:
>> /usr/lib/x86_64-linux-gnu/librt.so
>> > -- Added shared library dependency dl:
>> /usr/lib/x86_64-linux-gnu/libdl.so
>> > Using Thrift compiler:
>> > /home/hzfengyu/impala/apache-impala/incubator-impala/
>> > toolchain/thrift-0.9.0-p9/bin/thrift
>> > Found output dir:
>> > /home/hzfengyu/impala/apache-impala/incubator-impala/shell/
>> > Using FlatBuffers compiler:
>> > /home/hzfengyu/impala/apache-impala/incubator-impala/
>> > toolchain/flatbuffers-1.6.0/bin/flatc
>> > --java-o/home/hzfengyu/impala/apache-impala/incubator-
>> > impala/fe/generated-sources/gen-java-b
>> > --cpp-o/home/hzfengyu/impala/apache-impala/incubator-
>> > impala/be/generated-sources/gen-cpp-b
>> > -- Could NOT find Doxygen (missing:  DOXYGEN_EXECUTABLE)
>> > -- WARNING: Doxygen not found - Docs will not be created
>> > -- Looking for sched_getcpu
>> > -- Looking for sched_getcpu - found
>> > -- Looking for pipe2
>> > -- Looking for pipe2 - found
>> > -- Looking for fallocate
>> > -- Looking for fallocate - found
>> > -- Looking for preadv
>> > -- Looking for preadv - found
>> > -- Looking for include file linux/magic.h
>> > -- Looking for include file linux/magic.h - found
>> > -- Compiler Flags:  -Wall -Wno-sign-compare -Wno-unknown-pragmas
>> -pthread
>> > -fno-strict-aliasing -std=c++14 -Wno-deprecated -Wno-vla
>> > -DBOOST_DATE_TIME_POSIX_TIME_STD_CONFIG -DBOOST_SYSTEM_NO_DEPRECATED -B
>> > /home/hzfengyu/impala/apache-impala/incubator-impala/
>> > toolchain/binutils-2.26.1/bin/
>> > -fuse-ld=gold -g -Wno-unused-local-typedefs -ggdb -gdwarf-2 -Werror
>> > -fverbose-asm -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS
>> -D__STDC_FORMAT_MACROS
>> > -D__STDC_LIMIT_MACROS
>> > -- Common
>> > /home/hzfengyu/impala/apache-impala/incubator-impala/be/build/debug/
>> > -- Configuring incomplete, errors occurred!
>> > See also
>> > "/home/hzfengyu/impala/apache-impala/incubator-impala/
>> > CMakeFiles/CMakeOutput.log".
>> > Error in
>> > /home/hzfengyu/impala/apache-impala/incubator-impala/bin/make_impala.sh
>> at
>> > line 160: cmake . ${CMAKE_ARGS[@]}
>> >
>> >
>> >
>> > I tried to find something in CMakeOutput.log, and I could not find any
>> > error. Is there something I missed? Thanks a lot.
>> >
>>
>
>


Re: material for impala newbie

2017-08-02 Thread Henry Robinson
We don't have a lot of in-depth documentation, partly because the
implementation details change frequently.

Have you read the Impala paper?
http://cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf
(here's a summary:
https://blog.acolyer.org/2015/02/05/impala-a-modern-open-source-sql-engine-for-hadoop/
)

There's also an old paper on code generation:
https://pdfs.semanticscholar.org/bac4/169d6b6f713c76271b5ccf3d45293351f785.pdf

But the very best thing to read is the source code...

On 2 August 2017 at 09:59, 俊杰陈  wrote:

> Hi
>
> I’m learning the Impala code now. Does anyone have any Impala doc/PPT on the
> computing workflow (such as order by), vectorization, and codegen? Thanks
> in advance.
>
> --
> Thanks & Best Regards
>


Re: Hold off on GVOs, please

2017-07-28 Thread Henry Robinson
We've solved the issue, for now, by switching to an instance type with a
bit more memory. Merge jobs are good to go! Thanks to Taras and Matt for
jumping on this.

Henry

On 27 July 2017 at 15:40, Henry Robinson <he...@apache.org> wrote:

> Merge jobs are currently broken due to IMPALA-5733
> <https://issues.apache.org/jira/browse/IMPALA-5733> - where Kudu tablet
> servers are getting killed because of memory pressure on the machine.
> Please refrain from submitting new merge jobs until someone gives the all
> clear.
>
> We don't currently have an ETA for a fix. Looks like this only started
> happening recently, so hopefully we'll be able to identify the culprit in
> short order.
>
> Thanks for your patience!
>
> Henry
>
>
>


Hold off on GVOs, please

2017-07-27 Thread Henry Robinson
Merge jobs are currently broken due to IMPALA-5733
<https://issues.apache.org/jira/browse/IMPALA-5733> - where Kudu tablet
servers are getting killed because of memory pressure on the machine.
Please refrain from submitting new merge jobs until someone gives the all
clear.

We don't currently have an ETA for a fix. Looks like this only started
happening recently, so hopefully we'll be able to identify the culprit in
short order.

Thanks for your patience!

Henry


Re: IMPALA-5702 - disable shared linking on jenkins?

2017-07-26 Thread Henry Robinson
Ok, I've made that change, and am running a test GVO. I'll revert the
change if I see any issues.

On 26 July 2017 at 11:34, Henry Robinson <he...@apache.org> wrote:

> Are there any further objections to changing ubuntu-14.04-from-scratch to
> use static linking? If not, I'll make the change later today.
>
>
> On 25 July 2017 at 12:41, Henry Robinson <he...@apache.org> wrote:
>
>>
>>
>> On 24 July 2017 at 18:54, Jim Apple <jbap...@cloudera.com> wrote:
>>
>>> Got it - thanks for the clarification!
>>>
>>> Also, I think I was unclear in my stated concern for new contributors. It
>>> seems to me that new contributors could choose to use the -so flag, even
>>> if the official pre-merge job doesn't, but that there is a cost to
>>> diverging from the pre-merge job in that it is hard to know what is to
>>> blame if your pre-merge job fails.
>>>
>>>
>> I understand the concern. My response would be that a) errors usually
>> show up in the shared linking case rather than the static case, so in
>> practice I think it's less likely to see errors during GVO that don't show
>> up in local development and b) if there are divergences at GVO time,
>> there's usually an engaged committer available to help newcomers out.
>>
>> If we see this become an issue, we could revisit this decision at the
>> time.
>>
>>
>>> On Mon, Jul 24, 2017 at 5:46 PM, Henry Robinson <he...@apache.org>
>>> wrote:
>>>
>>> > On 24 July 2017 at 17:43, Jim Apple <jbap...@cloudera.com> wrote:
>>> >
>>> > > On Mon, Jul 24, 2017 at 5:08 PM, Henry Robinson <he...@apache.org>
>>> > wrote:
>>> > >
>>> > > > On 24 July 2017 at 17:04, Jim Apple <jbap...@cloudera.com> wrote:
>>> > > >
>>> > > > > I had anticipated that shared linking would save time and disk
>>> space,
>>> > > but
>>> > > > > it sounds like, from your testing, it doesn't save much time.
>>> Does it
>>> > > > save
>>> > > > > disk space?
>>> > > > >
>>> > > >
>>> > > > I haven't measured but I would expect not. Do we need to be very
>>> > careful
>>> > > > about disk space in the current configuration?
>>> > > >
>>> > >
>>> > > I don't think so, but since we are trying to entice new community
>>> members
>>> > > to commit patches, I am concerned about the cost on developer
>>> machines.
>>> > >
>>> > >
>>> > > >
>>> > > >
>>> > > > >
>>> > > > > Does static linking save time when compiling incremental changes?
>>> > > > >
>>> > > >
>>> > > > Again, I haven't measured.
>>> > > >
>>> > >
>>> > >
>>> > > I'm confused. You said, "Static linking doesn't take much longer in
>>> my
>>> > > unscientific measurements".
>>> > >
>>> >
>>> > I am also confused. I spoke about end-to-end builds on
>>> > ubuntu-14.04-from-scratch. I haven't measured incremental changes,
>>> unless
>>> > they're covered by that build.
>>> >
>>>
>>
>


Re: IMPALA-5702 - disable shared linking on jenkins?

2017-07-26 Thread Henry Robinson
Are there any further objections to changing ubuntu-14.04-from-scratch to
use static linking? If not, I'll make the change later today.

On 25 July 2017 at 12:41, Henry Robinson <he...@apache.org> wrote:

>
>
> On 24 July 2017 at 18:54, Jim Apple <jbap...@cloudera.com> wrote:
>
>> Got it - thanks for the clarification!
>>
>> Also, I think I was unclear in my stated concern for new contributors. It
>> seems to me that new contributors could choose to use the -so flag, even
>> if
>> the official pre-merge job doesn't, but that there is a cost to diverging
>> from the pre-merge job in that it is hard to know what is to blame if your
>> pre-merge job fails.
>>
>>
> I understand the concern. My response would be that a) errors usually show
> up in the shared linking case rather than the static case, so in practice I
> think it's less likely to see errors during GVO that don't show up in local
> development and b) if there are divergences at GVO time, there's usually an
> engaged committer available to help newcomers out.
>
> If we see this become an issue, we could revisit this decision at the time.
>
>
>> On Mon, Jul 24, 2017 at 5:46 PM, Henry Robinson <he...@apache.org> wrote:
>>
>> > On 24 July 2017 at 17:43, Jim Apple <jbap...@cloudera.com> wrote:
>> >
>> > > On Mon, Jul 24, 2017 at 5:08 PM, Henry Robinson <he...@apache.org>
>> > wrote:
>> > >
>> > > > On 24 July 2017 at 17:04, Jim Apple <jbap...@cloudera.com> wrote:
>> > > >
>> > > > > I had anticipated that shared linking would save time and disk
>> space,
>> > > but
>> > > > > it sounds like, from your testing, it doesn't save much time.
>> Does it
>> > > > save
>> > > > > disk space?
>> > > > >
>> > > >
>> > > > I haven't measured but I would expect not. Do we need to be very
>> > careful
>> > > > about disk space in the current configuration?
>> > > >
>> > >
>> > > I don't think so, but since we are trying to entice new community
>> members
>> > > to commit patches, I am concerned about the cost on developer
>> machines.
>> > >
>> > >
>> > > >
>> > > >
>> > > > >
>> > > > > Does static linking save time when compiling incremental changes?
>> > > > >
>> > > >
>> > > > Again, I haven't measured.
>> > > >
>> > >
>> > >
>> > > I'm confused. You said, "Static linking doesn't take much longer in my
>> > > unscientific measurements".
>> > >
>> >
>> > I am also confused. I spoke about end-to-end builds on
>> > ubuntu-14.04-from-scratch. I haven't measured incremental changes,
>> unless
>> > they're covered by that build.
>> >
>>
>


Re: IMPALA-5702 - disable shared linking on jenkins?

2017-07-25 Thread Henry Robinson
On 24 July 2017 at 18:54, Jim Apple <jbap...@cloudera.com> wrote:

> Got it - thanks for the clarification!
>
> Also, I think I was unclear in my stated concern for new contributors. It
> seems to me that new contributors could choose to use the -so flag, even if
> the official pre-merge job doesn't, but that there is a cost to diverging
> from the pre-merge job in that it is hard to know what is to blame if your
> pre-merge job fails.
>
>
I understand the concern. My response would be that a) errors usually show
up in the shared linking case rather than the static case, so in practice I
think it's less likely to see errors during GVO that don't show up in local
development and b) if there are divergences at GVO time, there's usually an
engaged committer available to help newcomers out.

If we see this become an issue, we could revisit this decision at the time.


> On Mon, Jul 24, 2017 at 5:46 PM, Henry Robinson <he...@apache.org> wrote:
>
> > On 24 July 2017 at 17:43, Jim Apple <jbap...@cloudera.com> wrote:
> >
> > > On Mon, Jul 24, 2017 at 5:08 PM, Henry Robinson <he...@apache.org>
> > wrote:
> > >
> > > > On 24 July 2017 at 17:04, Jim Apple <jbap...@cloudera.com> wrote:
> > > >
> > > > > I had anticipated that shared linking would save time and disk
> space,
> > > but
> > > > > it sounds like, from your testing, it doesn't save much time. Does
> it
> > > > save
> > > > > disk space?
> > > > >
> > > >
> > > > I haven't measured but I would expect not. Do we need to be very
> > careful
> > > > about disk space in the current configuration?
> > > >
> > >
> > > I don't think so, but since we are trying to entice new community
> members
> > > to commit patches, I am concerned about the cost on developer machines.
> > >
> > >
> > > >
> > > >
> > > > >
> > > > > Does static linking save time when compiling incremental changes?
> > > > >
> > > >
> > > > Again, I haven't measured.
> > > >
> > >
> > >
> > > I'm confused. You said, "Static linking doesn't take much longer in my
> > > unscientific measurements".
> > >
> >
> > I am also confused. I spoke about end-to-end builds on
> > ubuntu-14.04-from-scratch. I haven't measured incremental changes, unless
> > they're covered by that build.
> >
>


Re: IMPALA-5702 - disable shared linking on jenkins?

2017-07-24 Thread Henry Robinson
Could you point me to the failing job? I couldn't see it obviously on
https://jenkins.impala.io/.

On 24 July 2017 at 17:42, Jim Apple <jbap...@cloudera.com> wrote:

> Yes, ASAN in the current 1404 job fails with something about linking. I
> haven't got around to investigating in detail.
>
> On Mon, Jul 24, 2017 at 1:39 PM, Todd Lipcon <t...@cloudera.com> wrote:
>
> > Is it possible that the issue here is due to a "one definition rule"
> > violation? eg something like
> > https://github.com/google/sanitizers/wiki/AddressSanitizerOneDefinitionRuleViolation
> > Another similar thing is described here:
> > https://github.com/google/sanitizers/wiki/AddressSanitizerInitializationOrderFiasco
> >
> > ASAN with the appropriate flags might help expose if one of the above is
> > related.
> >
> > I wonder whether it is a kind of coincidence that it is fine in a static
> > build but causes problems in dynamic, and at some point the static link
> > order may slightly shift, causing another new subtle bug.
> >
> >
> >
> > On Mon, Jul 24, 2017 at 1:22 PM, Henry Robinson <he...@apache.org>
> wrote:
> >
> > > We've started seeing isolated incidences of IMPALA-5702 during GVOs,
> > where
> > > a custom cluster test fails by throwing an exception during locale
> > > handling.
> > >
> > > I've been able to reproduce this locally, but only with shared linking
> > > enabled (which makes sense since the issue is symptomatic of a global
> > c'tor
> > > not getting called the right number of times).
> > >
> > > It's probable that my patch for IMPALA-5659 exposed this (since it
> > forced a
> > > more correct linking strategy for thirdparty libraries when dynamic
> > linking
> > > was enabled), but it looks to me at first glance like there were latent
> > > dynamic linking bugs that we weren't getting hit by. Fixing IMPALA-5702
> > > will probably take a while, and I don't think we should hold up GVOs or
> > put
> > > them at risk.
> > >
> > > So there are two options:
> > >
> > > 1. Revert IMPALA-5659
> > >
> > > 2. Switch GVO to static linking
> > >
> > > IMPALA-5659 is important to commit the kudu util library, which is
> needed
> > > for the KRPC work. Without it, shared linking doesn't work *at all*
> when
> > > the kudu util library is committed.
> > >
> > > Static linking doesn't take much longer in my unscientific
> measurements,
> > > and is closer to how Impala is actually used. In the interest of
> forward
> > > progress I'd like to try switching ubuntu-14.04-from-scratch to use
> > static
> > > linking while I work on IMPALA-5702.
> > >
> > > What does everyone else think?
> > >
> > > Henry
> > >
> >
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
> >
>


Re: IMPALA-5702 - disable shared linking on jenkins?

2017-07-24 Thread Henry Robinson
On 24 July 2017 at 17:43, Jim Apple <jbap...@cloudera.com> wrote:

> On Mon, Jul 24, 2017 at 5:08 PM, Henry Robinson <he...@apache.org> wrote:
>
> > On 24 July 2017 at 17:04, Jim Apple <jbap...@cloudera.com> wrote:
> >
> > > I had anticipated that shared linking would save time and disk space,
> but
> > > it sounds like, from your testing, it doesn't save much time. Does it
> > save
> > > disk space?
> > >
> >
> > I haven't measured but I would expect not. Do we need to be very careful
> > about disk space in the current configuration?
> >
>
> I don't think so, but since we are trying to entice new community members
> to commit patches, I am concerned about the cost on developer machines.
>
>
> >
> >
> > >
> > > Does static linking save time when compiling incremental changes?
> > >
> >
> > Again, I haven't measured.
> >
>
>
> I'm confused. You said, "Static linking doesn't take much longer in my
> unscientific measurements".
>

I am also confused. I spoke about end-to-end builds on
ubuntu-14.04-from-scratch. I haven't measured incremental changes, unless
they're covered by that build.


Re: IMPALA-5702 - disable shared linking on jenkins?

2017-07-24 Thread Henry Robinson
On 24 July 2017 at 17:08, Henry Robinson <he...@apache.org> wrote:

>
>
> On 24 July 2017 at 17:04, Jim Apple <jbap...@cloudera.com> wrote:
>
>> I had anticipated that shared linking would save time and disk space, but
>> it sounds like, from your testing, it doesn't save much time. Does it save
>> disk space?
>>
>
> I haven't measured but I would expect not. Do we need to be very careful
> about disk space in the current configuration?
>

I just saw the disk space report at the end of ubuntu-14.04-from-scratch.
Static linking costs about 50% more disk space (about 30GB), which is a
large amount, but the executors have plenty of room left.

dynamic:


*00:08:20* /dev/xvda1 161129 61335 93144 40% /


static:


*09:20:31* /dev/xvda1 161129 92138 62341 60% /
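
To make the "about 50% more" figure concrete, here is a small sketch of the arithmetic on the two report lines above. It assumes the fused first column reads as /dev/xvda1 followed by a total of 161129, and that the columns are in df's usual order (total, used, available, use%). The block unit is not stated in the report; 1MB blocks are an assumption consistent with the "about 30GB" estimate.

```python
# Rows copied from the disk space report above, with the device name and
# total size separated (an assumption about the original layout).
DF_ROWS = {
    "dynamic": "/dev/xvda1 161129 61335 93144 40% /",
    "static": "/dev/xvda1 161129 92138 62341 60% /",
}


def used_blocks(df_row):
    """Return the 'used' column of a df-style row."""
    # Columns: filesystem, total, used, available, use%, mount point.
    return int(df_row.split()[2])


dynamic_used = used_blocks(DF_ROWS["dynamic"])
static_used = used_blocks(DF_ROWS["static"])

extra = static_used - dynamic_used
increase_pct = 100.0 * extra / dynamic_used

# Prints: static uses 30803 more blocks (50% more than dynamic)
print(f"static uses {extra} more blocks ({increase_pct:.0f}% more than dynamic)")
```

If the blocks are 1MB each, 30803 extra blocks is roughly 30GB, which matches the estimate above.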



>
>
>>
>> Does static linking save time when compiling incremental changes?
>>
>
> Again, I haven't measured.
>
>
>>
>> On Mon, Jul 24, 2017 at 4:51 PM, Henry Robinson <he...@apache.org> wrote:
>>
>> > :) I agree - we should also track any known breaks to shared linking in
>> a
>> > best effort fashion because it's so useful to some dev workflows.
>> >
>> > On 24 July 2017 at 16:49, Tim Armstrong <tarmstr...@cloudera.com>
>> wrote:
>> >
>> > > I vote for changing Jenkins' linking strategy now and not changing it
>> > back
>> > > :). Static linking is the blessed configuration so I think we should
>> be
>> > > running tests with that primarily.
>> > >
>> > > On Mon, Jul 24, 2017 at 4:34 PM, Henry Robinson <he...@apache.org>
>> > wrote:
>> > >
>> > > > On 24 July 2017 at 13:58, Todd Lipcon <t...@cloudera.com> wrote:
>> > > >
>> > > > > On Mon, Jul 24, 2017 at 1:47 PM, Henry Robinson <he...@apache.org
>> >
>> > > > wrote:
>> > > > >
>> > > > > > Thanks for the asan pointer - I'll give it a go.
>> > > > > >
>> > > > > > My understanding of linking isn't deep, but my working theory
>> has
>> > > been
>> > > > > that
>> > > > > > the complications have been caused by glog getting linked twice
>> -
>> > > once
>> > > > > > statically (possibly into libkudu.so), and once dynamically (via
>> > > > everyone
>> > > > > > else).
>> > > > > >
>> > > > >
>> > > > > In libkudu_client.so, we use a linker script to ensure that we
>> don't
>> > > leak
>> > > > > glog/gflags/etc symbols. Those are all listed as 'local' in
>> > > > > src/kudu/client/symbols.map. We also have a unit test
>> > > > > 'client_symbol-test.sh' which uses nm to dump the list of symbols
>> and
>> > > > make
>> > > > > sure that all non-local non-weak symbols are under the
>> 'kudu::'
>> > > > > namespace.
>> > > > >
>> > > > > So it's possible that something's getting linked twice but I'd be
>> > > > somewhat
>> > > > > surprised if it's from the Kudu client.
>> > > > >
>> > > > >
>> > > > Good to know, thanks.
>> > > >
>> > > > ASAN hasn't turned up anything yet - so does anyone have an opinion
>> > about
>> > > > changing Jenkins' linking strategy for now?
>> > > >
>> > > >
>> > > > > -Todd
>> > > > >
>> > > > >
>> > > > > >
>> > > > > > I would think that could lead to one or both of the issues you
>> > linked
>> > > > to.
>> > > > > >
>> > > > > >
>> > > > > > On 24 July 2017 at 13:39, Todd Lipcon <t...@cloudera.com>
>> wrote:
>> > > > > >
>> > > > > > > Is it possible that the issue here is due to a "one definition
>> > > rule"
>> > > > > > > violation? eg something like
>> > > > > > > > https://github.com/google/sanitizers/wiki/AddressSanitizerOneDefinitionRuleViolation
>> > > > > > > Another similar thing is described here:
>> > > > > > > https://github.com/google/sanitizers/wiki/AddressSanitizerIn
>> > > > &

Re: IMPALA-5702 - disable shared linking on jenkins?

2017-07-24 Thread Henry Robinson
On 24 July 2017 at 17:04, Jim Apple <jbap...@cloudera.com> wrote:

> I had anticipated that shared linking would save time and disk space, but
> it sounds like, from your testing, it doesn't save much time. Does it save
> disk space?
>

I haven't measured but I would expect not. Do we need to be very careful
about disk space in the current configuration?


>
> Does static linking save time when compiling incremental changes?
>

Again, I haven't measured.


>
> On Mon, Jul 24, 2017 at 4:51 PM, Henry Robinson <he...@apache.org> wrote:
>
> > :) I agree - we should also track any known breaks to shared linking in a
> > best effort fashion because it's so useful to some dev workflows.
> >
> > On 24 July 2017 at 16:49, Tim Armstrong <tarmstr...@cloudera.com> wrote:
> >
> > > I vote for changing Jenkins' linking strategy now and not changing it
> > back
> > > :). Static linking is the blessed configuration so I think we should be
> > > running tests with that primarily.
> > >
> > > On Mon, Jul 24, 2017 at 4:34 PM, Henry Robinson <he...@apache.org>
> > wrote:
> > >
> > > > On 24 July 2017 at 13:58, Todd Lipcon <t...@cloudera.com> wrote:
> > > >
> > > > > On Mon, Jul 24, 2017 at 1:47 PM, Henry Robinson <he...@apache.org>
> > > > wrote:
> > > > >
> > > > > > Thanks for the asan pointer - I'll give it a go.
> > > > > >
> > > > > > My understanding of linking isn't deep, but my working theory has
> > > been
> > > > > that
> > > > > > the complications have been caused by glog getting linked twice -
> > > once
> > > > > > statically (possibly into libkudu.so), and once dynamically (via
> > > > everyone
> > > > > > else).
> > > > > >
> > > > >
> > > > > In libkudu_client.so, we use a linker script to ensure that we
> don't
> > > leak
> > > > > glog/gflags/etc symbols. Those are all listed as 'local' in
> > > > > src/kudu/client/symbols.map. We also have a unit test
> > > > > 'client_symbol-test.sh' which uses nm to dump the list of symbols
> and
> > > > make
> > > > > > sure that all non-local non-weak symbols are under the
> 'kudu::'
> > > > > namespace.
> > > > >
> > > > > So it's possible that something's getting linked twice but I'd be
> > > > somewhat
> > > > > surprised if it's from the Kudu client.
> > > > >
> > > > >
> > > > Good to know, thanks.
> > > >
> > > > ASAN hasn't turned up anything yet - so does anyone have an opinion
> > about
> > > > changing Jenkins' linking strategy for now?
> > > >
> > > >
> > > > > -Todd
> > > > >
> > > > >
> > > > > >
> > > > > > I would think that could lead to one or both of the issues you
> > linked
> > > > to.
> > > > > >
> > > > > >
> > > > > > On 24 July 2017 at 13:39, Todd Lipcon <t...@cloudera.com> wrote:
> > > > > >
> > > > > > > Is it possible that the issue here is due to a "one definition
> > > rule"
> > > > > > > violation? eg something like
> > > > > > > > https://github.com/google/sanitizers/wiki/AddressSanitizerOneDefinitionRuleViolation
> > > > > > > > Another similar thing is described here:
> > > > > > > > https://github.com/google/sanitizers/wiki/AddressSanitizerInitializationOrderFiasco
> > > > > > >
> > > > > > > ASAN with the appropriate flags might help expose if one of the
> > > above
> > > > > is
> > > > > > > related.
> > > > > > >
> > > > > > > I wonder whether it is a kind of coincidence that it is fine
> in a
> > > > > static
> > > > > > > build but causes problems in dynamic, and at some point the
> > static
> > > > link
> > > > > > > order may slightly shift, causing another new subtle bug.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Jul 24, 2017 at 1:22 PM, Henry Robinson <
> >

Re: Thrift version used by Impala

2017-07-24 Thread Henry Robinson
Just to follow up on this - I spent some time looking at what would be
required for a Thrift 0.9.3 upgrade (since some relevant changes have
affected this since the last time I looked). The short answer is that it's
not a small change. See https://issues.apache.org/jira/browse/IMPALA-5690
for my notes so far.

Henry

On 20 June 2017 at 15:10, Henry Robinson <he...@apache.org> wrote:

> The main reason I haven't done it yet is because Thrift 0.9.3 introduces a
> Bison dependency for compilation, and I hadn't got round to getting that
> working on all the platforms I care about. No particular technical reason.
>
> On 20 June 2017 at 15:06, Alexander Kolbasov <ak...@cloudera.com> wrote:
>
>> As part of our investigation of Impala/Sentry integration issues it turned
>> out that Impala uses a version of Thrift that is older than 0.9.3 that's
>> used by Sentry (and many other components). Is there a fundamental reason
>> Impala can't move to Thrift 0.9.3? There were some security
>> vulnerabilities
>> in earlier versions.
>>
>> - Alex
>>
>
>


Re: IMPALA-5702 - disable shared linking on jenkins?

2017-07-24 Thread Henry Robinson
:) I agree - we should also track any known breaks to shared linking in a
best effort fashion because it's so useful to some dev workflows.

On 24 July 2017 at 16:49, Tim Armstrong <tarmstr...@cloudera.com> wrote:

> I vote for changing Jenkins' linking strategy now and not changing it back
> :). Static linking is the blessed configuration so I think we should be
> running tests with that primarily.
>
> On Mon, Jul 24, 2017 at 4:34 PM, Henry Robinson <he...@apache.org> wrote:
>
> > On 24 July 2017 at 13:58, Todd Lipcon <t...@cloudera.com> wrote:
> >
> > > On Mon, Jul 24, 2017 at 1:47 PM, Henry Robinson <he...@apache.org>
> > wrote:
> > >
> > > > Thanks for the asan pointer - I'll give it a go.
> > > >
> > > > My understanding of linking isn't deep, but my working theory has
> been
> > > that
> > > > the complications have been caused by glog getting linked twice -
> once
> > > > statically (possibly into libkudu.so), and once dynamically (via
> > everyone
> > > > else).
> > > >
> > >
> > > In libkudu_client.so, we use a linker script to ensure that we don't
> leak
> > > glog/gflags/etc symbols. Those are all listed as 'local' in
> > > src/kudu/client/symbols.map. We also have a unit test
> > > 'client_symbol-test.sh' which uses nm to dump the list of symbols and
> > make
> > > > sure that all non-local non-weak symbols are under the 'kudu::'
> > > namespace.
> > >
> > > So it's possible that something's getting linked twice but I'd be
> > somewhat
> > > surprised if it's from the Kudu client.
> > >
> > >
> > Good to know, thanks.
> >
> > ASAN hasn't turned up anything yet - so does anyone have an opinion about
> > changing Jenkins' linking strategy for now?
> >
> >
> > > -Todd
> > >
> > >
> > > >
> > > > I would think that could lead to one or both of the issues you linked
> > to.
> > > >
> > > >
> > > > On 24 July 2017 at 13:39, Todd Lipcon <t...@cloudera.com> wrote:
> > > >
> > > > > Is it possible that the issue here is due to a "one definition
> rule"
> > > > > violation? eg something like
> > > > > > https://github.com/google/sanitizers/wiki/AddressSanitizerOneDefinitionRuleViolation
> > > > > > Another similar thing is described here:
> > > > > > https://github.com/google/sanitizers/wiki/AddressSanitizerInitializationOrderFiasco
> > > > >
> > > > > ASAN with the appropriate flags might help expose if one of the
> above
> > > is
> > > > > related.
> > > > >
> > > > > I wonder whether it is a kind of coincidence that it is fine in a
> > > static
> > > > > build but causes problems in dynamic, and at some point the static
> > link
> > > > > order may slightly shift, causing another new subtle bug.
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Jul 24, 2017 at 1:22 PM, Henry Robinson <he...@apache.org>
> > > > wrote:
> > > > >
> > > > > > We've started seeing isolated incidences of IMPALA-5702 during
> > GVOs,
> > > > > where
> > > > > > a custom cluster test fails by throwing an exception during
> locale
> > > > > > handling.
> > > > > >
> > > > > > I've been able to reproduce this locally, but only with shared
> > > linking
> > > > > > enabled (which makes sense since the issue is symptomatic of a
> > global
> > > > > c'tor
> > > > > > not getting called the right number of times).
> > > > > >
> > > > > > It's probable that my patch for IMPALA-5659 exposed this (since
> it
> > > > > forced a
> > > > > > more correct linking strategy for thirdparty libraries when
> dynamic
> > > > > linking
> > > > > > was enabled), but it looks to me at first glance like there were
> > > latent
> > > > > > dynamic linking bugs that we weren't getting hit by. Fixing
> > > IMPALA-5702
> > > > > > will probably take a while, and I don't think we should hold up
> > GVOs
> > > or
> > > > > put
> > > > > > them at risk.
> > > > > >
> > > > > > So there are two options:
> > > > > >
> > > > > > 1. Revert IMPALA-5659
> > > > > >
> > > > > > 2. Switch GVO to static linking
> > > > > >
> > > > > > IMPALA-5659 is important to commit the kudu util library, which
> is
> > > > needed
> > > > > > for the KRPC work. Without it, shared linking doesn't work *at
> all*
> > > > when
> > > > > > the kudu util library is committed.
> > > > > >
> > > > > > Static linking doesn't take much longer in my unscientific
> > > > measurements,
> > > > > > and is closer to how Impala is actually used. In the interest of
> > > > forward
> > > > > > progress I'd like to try switching ubuntu-14.04-from-scratch to
> use
> > > > > static
> > > > > > linking while I work on IMPALA-5702.
> > > > > >
> > > > > > What does everyone else think?
> > > > > >
> > > > > > Henry
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Todd Lipcon
> > > > > Software Engineer, Cloudera
> > > >
> > >
> > >
> > >
> > > --
> > > Todd Lipcon
> > > Software Engineer, Cloudera
> > >
> >
>


Re: IMPALA-5702 - disable shared linking on jenkins?

2017-07-24 Thread Henry Robinson
On 24 July 2017 at 13:58, Todd Lipcon <t...@cloudera.com> wrote:

> On Mon, Jul 24, 2017 at 1:47 PM, Henry Robinson <he...@apache.org> wrote:
>
> > Thanks for the asan pointer - I'll give it a go.
> >
> > My understanding of linking isn't deep, but my working theory has been
> that
> > the complications have been caused by glog getting linked twice - once
> > statically (possibly into libkudu.so), and once dynamically (via everyone
> > else).
> >
>
> In libkudu_client.so, we use a linker script to ensure that we don't leak
> glog/gflags/etc symbols. Those are all listed as 'local' in
> src/kudu/client/symbols.map. We also have a unit test
> 'client_symbol-test.sh' which uses nm to dump the list of symbols and make
> sure that all non-local non-weak symbols are under the 'kudu::'
> namespace.
>
> So it's possible that something's getting linked twice but I'd be somewhat
> surprised if it's from the Kudu client.
>
>
Good to know, thanks.

ASAN hasn't turned up anything yet - so does anyone have an opinion about
changing Jenkins' linking strategy for now?
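
For anyone following along, the kind of symbol check Todd describes can be sketched roughly as below. This is not the actual client_symbol-test.sh from Kudu, only an illustration of the idea, and it parses a made-up sample of demangled nm-style output rather than running nm on a real library. The sample symbols and the helper name leaked_symbols are hypothetical.

```python
import re

# Hypothetical sample in the style of demangled nm output:
# optional address, one-letter symbol type, symbol name.
SAMPLE_NM_OUTPUT = """
0000000000012340 T kudu::client::KuduClient::OpenTable(std::string const&)
0000000000045670 t google::LogMessage::Flush()
0000000000078900 W std::vector<int>::size() const
                 U malloc
00000000000abcd0 T google::InitGoogleLogging(char const*)
"""

NM_LINE = re.compile(r"^(?:[0-9a-fA-F]+\s+)?([A-Za-z])\s+(.+)$")


def leaked_symbols(nm_output, allowed_prefix="kudu::"):
    """Return defined, global, non-weak symbols outside allowed_prefix."""
    leaked = []
    for line in nm_output.splitlines():
        match = NM_LINE.match(line.strip())
        if not match:
            continue  # blank or unrecognised line
        sym_type, name = match.groups()
        if sym_type.islower():
            continue  # lowercase type letter: local symbol
        if sym_type in "WV":
            continue  # weak symbol
        if sym_type == "U":
            continue  # undefined here, resolved from another library
        if not name.startswith(allowed_prefix):
            leaked.append(name)
    return leaked


print(leaked_symbols(SAMPLE_NM_OUTPUT))
```

On this sample only the exported google:: symbol is reported. The local and weak symbols and the undefined reference to malloc are skipped, which matches the "non-local non-weak" rule in the description. A real check would probably need a few more exceptions than this sketch has.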


> -Todd
>
>
> >
> > I would think that could lead to one or both of the issues you linked to.
> >
> >
> > On 24 July 2017 at 13:39, Todd Lipcon <t...@cloudera.com> wrote:
> >
> > > Is it possible that the issue here is due to a "one definition rule"
> > > violation? eg something like
> > > > https://github.com/google/sanitizers/wiki/AddressSanitizerOneDefinitionRuleViolation
> > > > Another similar thing is described here:
> > > > https://github.com/google/sanitizers/wiki/AddressSanitizerInitializationOrderFiasco
> > >
> > > ASAN with the appropriate flags might help expose if one of the above
> is
> > > related.
> > >
> > > I wonder whether it is a kind of coincidence that it is fine in a
> static
> > > build but causes problems in dynamic, and at some point the static link
> > > order may slightly shift, causing another new subtle bug.
> > >
> > >
> > >
> > > On Mon, Jul 24, 2017 at 1:22 PM, Henry Robinson <he...@apache.org>
> > wrote:
> > >
> > > > We've started seeing isolated incidences of IMPALA-5702 during GVOs,
> > > where
> > > > a custom cluster test fails by throwing an exception during locale
> > > > handling.
> > > >
> > > > I've been able to reproduce this locally, but only with shared
> linking
> > > > enabled (which makes sense since the issue is symptomatic of a global
> > > c'tor
> > > > not getting called the right number of times).
> > > >
> > > > It's probable that my patch for IMPALA-5659 exposed this (since it
> > > forced a
> > > > more correct linking strategy for thirdparty libraries when dynamic
> > > linking
> > > > was enabled), but it looks to me at first glance like there were
> latent
> > > > dynamic linking bugs that we weren't getting hit by. Fixing
> IMPALA-5702
> > > > will probably take a while, and I don't think we should hold up GVOs
> or
> > > put
> > > > them at risk.
> > > >
> > > > So there are two options:
> > > >
> > > > 1. Revert IMPALA-5659
> > > >
> > > > 2. Switch GVO to static linking
> > > >
> > > > IMPALA-5659 is important to commit the kudu util library, which is
> > needed
> > > > for the KRPC work. Without it, shared linking doesn't work *at all*
> > when
> > > > the kudu util library is committed.
> > > >
> > > > Static linking doesn't take much longer in my unscientific
> > measurements,
> > > > and is closer to how Impala is actually used. In the interest of
> > forward
> > > > progress I'd like to try switching ubuntu-14.04-from-scratch to use
> > > static
> > > > linking while I work on IMPALA-5702.
> > > >
> > > > What does everyone else think?
> > > >
> > > > Henry
> > > >
> > >
> > >
> > >
> > > --
> > > Todd Lipcon
> > > Software Engineer, Cloudera
> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>


Re: IMPALA-5702 - disable shared linking on jenkins?

2017-07-24 Thread Henry Robinson
Thanks for the asan pointer - I'll give it a go.

My understanding of linking isn't deep, but my working theory has been that
the complications have been caused by glog getting linked twice - once
statically (possibly into libkudu.so), and once dynamically (via everyone
else).

I would think that could lead to one or both of the issues you linked to.
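
As a rough sketch of what "ASAN with the appropriate flags" could look like: the two wiki pages linked below describe the runtime options detect_odr_violation, check_initialization_order and strict_init_order, which are passed to an ASAN-built binary through the ASAN_OPTIONS environment variable. Which values are actually useful for IMPALA-5702 is an assumption, and the helper names and the binary path in the comment are hypothetical.

```python
import os
import subprocess


def asan_options(odr_level=2, strict_init_order=True):
    """Build a colon-separated ASAN_OPTIONS string.

    The option names come from the AddressSanitizer wiki pages on ODR
    violations and the initialization order fiasco; the chosen values
    are illustrative only.
    """
    opts = [
        f"detect_odr_violation={odr_level}",
        "check_initialization_order=true",
    ]
    if strict_init_order:
        opts.append("strict_init_order=true")
    return ":".join(opts)


def run_under_asan(cmd):
    """Run an ASAN-instrumented command with the options above."""
    env = dict(os.environ, ASAN_OPTIONS=asan_options())
    return subprocess.run(cmd, env=env).returncode


# Example (hypothetical path to an ASAN build of a test binary):
#   run_under_asan(["./path/to/asan-built-binary"])
print(asan_options())
```

The options only take effect when the binary was compiled with -fsanitize=address. With a regular build the environment variable is ignored.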


On 24 July 2017 at 13:39, Todd Lipcon <t...@cloudera.com> wrote:

> Is it possible that the issue here is due to a "one definition rule"
> violation? eg something like
> https://github.com/google/sanitizers/wiki/AddressSanitizerOneDefinitionRuleViolation
> Another similar thing is described here:
> https://github.com/google/sanitizers/wiki/AddressSanitizerInitializationOrderFiasco
>
> ASAN with the appropriate flags might help expose if one of the above is
> related.
>
> I wonder whether it is a kind of coincidence that it is fine in a static
> build but causes problems in dynamic, and at some point the static link
> order may slightly shift, causing another new subtle bug.
>
>
>
> On Mon, Jul 24, 2017 at 1:22 PM, Henry Robinson <he...@apache.org> wrote:
>
> > We've started seeing isolated incidences of IMPALA-5702 during GVOs,
> where
> > a custom cluster test fails by throwing an exception during locale
> > handling.
> >
> > I've been able to reproduce this locally, but only with shared linking
> > enabled (which makes sense since the issue is symptomatic of a global
> c'tor
> > not getting called the right number of times).
> >
> > It's probable that my patch for IMPALA-5659 exposed this (since it
> forced a
> > more correct linking strategy for thirdparty libraries when dynamic
> linking
> > was enabled), but it looks to me at first glance like there were latent
> > dynamic linking bugs that we weren't getting hit by. Fixing IMPALA-5702
> > will probably take a while, and I don't think we should hold up GVOs or
> put
> > them at risk.
> >
> > So there are two options:
> >
> > 1. Revert IMPALA-5659
> >
> > 2. Switch GVO to static linking
> >
> > IMPALA-5659 is important to commit the kudu util library, which is needed
> > for the KRPC work. Without it, shared linking doesn't work *at all* when
> > the kudu util library is committed.
> >
> > Static linking doesn't take much longer in my unscientific measurements,
> > and is closer to how Impala is actually used. In the interest of forward
> > progress I'd like to try switching ubuntu-14.04-from-scratch to use
> static
> > linking while I work on IMPALA-5702.
> >
> > What does everyone else think?
> >
> > Henry
> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera


IMPALA-5702 - disable shared linking on jenkins?

2017-07-24 Thread Henry Robinson
We've started seeing isolated incidences of IMPALA-5702 during GVOs, where
a custom cluster test fails by throwing an exception during locale handling.

I've been able to reproduce this locally, but only with shared linking
enabled (which makes sense since the issue is symptomatic of a global c'tor
not getting called the right number of times).

It's probable that my patch for IMPALA-5659 exposed this (since it forced a
more correct linking strategy for thirdparty libraries when dynamic linking
was enabled), but it looks to me at first glance like there were latent
dynamic linking bugs that we weren't getting hit by. Fixing IMPALA-5702
will probably take a while, and I don't think we should hold up GVOs or put
them at risk.

So there are two options:

1. Revert IMPALA-5659

2. Switch GVO to static linking

IMPALA-5659 is important to commit the kudu util library, which is needed
for the KRPC work. Without it, shared linking doesn't work *at all* when
the kudu util library is committed.

Static linking doesn't take much longer in my unscientific measurements,
and is closer to how Impala is actually used. In the interest of forward
progress I'd like to try switching ubuntu-14.04-from-scratch to use static
linking while I work on IMPALA-5702.

What does everyone else think?

Henry


Re: Slow/unusable apache JIRA?

2017-07-24 Thread Henry Robinson
There's a link here: https://www.apache.org/dev/infra-contact

On 24 July 2017 at 09:24, Matthew Jacobs  wrote:

> Thanks.
> @Henry, where is the infra hipchat channel?
>
> On Mon, Jul 24, 2017 at 9:22 AM, Jeszy  wrote:
>
> > Hey,
> >
> > No problems usually, but now it's down for me as well. According to
> > status.apache.org, the service seems to be struggling.
> >
> > On 24 July 2017 at 18:15, Matthew Jacobs  wrote:
> > > Hey,
> > >
> > > I've been noticing a lot of slowness/timeouts on the Apache JIRA. Has
> > > anyone else noticed this? Sometimes it's just annoying, but today I've
> > > found a lot of pages are just timing out.
> > >
> > > Just got this error when attempting to load
> > > https://issues.apache.org/jira/browse/IMPALA-5275
> > >
> > >
> > > Communications Breakdown
> > >
> > > The call to the JIRA server did not complete within the timeout period.
> > We
> > > are unsure of the result of this operation.
> > >
> > > Close this dialog and press refresh in your browser
> >
>


Re: Slow/unusable apache JIRA?

2017-07-24 Thread Henry Robinson
Yep, it's super slow for me as well. Was just in the infra hipchat channel,
and they are now aware of the issue.

On 24 July 2017 at 09:15, Matthew Jacobs  wrote:

> Hey,
>
> I've been noticing a lot of slowness/timeouts on the Apache JIRA. Has
> anyone else noticed this? Sometimes it's just annoying, but today I've
> found a lot of pages are just timing out.
>
> Just got this error when attempting to load
> https://issues.apache.org/jira/browse/IMPALA-5275
>
>
> Communications Breakdown
>
> The call to the JIRA server did not complete within the timeout period. We
> are unsure of the result of this operation.
>
> Close this dialog and press refresh in your browser
>


Re: Problem running Impala built with dynamic linking

2017-07-19 Thread Henry Robinson
Can you try:

cd $IMPALA_HOME
rm CMakeCache.txt
cmake .


If that doesn't work, can you send me the output of rm CMakeCache.txt &&
cmake . from IMPALA_HOME?
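
A rough sketch of the same reset as a small script, which also keeps a copy of the cmake output so it can be attached to a reply. The function name reset_cmake_cache, the log file name cmake-output.log and the CMAKE_BIN override are assumptions for illustration only; the steps themselves are just the ones listed above.

```shell
# Sketch only: reset_cmake_cache, cmake-output.log and CMAKE_BIN are
# illustrative names, not part of the Impala build scripts.
reset_cmake_cache() (
  src_dir="$1"
  cd "$src_dir" || exit 1
  # Drop the cached configuration so cmake re-detects the toolchain libraries.
  rm -f CMakeCache.txt
  # Re-run cmake in place and keep a copy of everything it prints.
  "${CMAKE_BIN:-cmake}" . 2>&1 | tee cmake-output.log
)

# Usage (assumes IMPALA_HOME is set in your environment):
#   reset_cmake_cache "$IMPALA_HOME"
```

The function body runs in a subshell, so the cd does not change the caller's working directory.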

Thanks,
Henry

On 19 July 2017 at 17:03, Henry Robinson <he...@cloudera.com> wrote:

> Sorry, I read too quickly - you've done that already! Let me take a look.
>
> On 19 July 2017 at 17:01, Henry Robinson <he...@apache.org> wrote:
>
>> Yep, you need to remove the downloaded version of gflags and replace it
>> with a recent toolchain version. See my mail from yesterday for
>> instructions:
>> https://lists.apache.org/api/source.lua/a154f43ef76e3da1271efed2740d447bbaeca09509156c7b728f2dd0@%3Cdev.impala.apache.org%3E
>>
>> On 19 July 2017 at 16:56, Bikramjeet Vig <bikramjeet@cloudera.com>
>> wrote:
>>
>>> After fetching latest from asf-gerrit (that has the toolchain commit
>>> related to gflags) and doing a manual toolchain refresh, I am unable to
>>> run impala when I build with "make_debug" or "buildall -so", both
>>> statestore and catalogd show the following error:
>>>
>>> ERROR: something wrong with flag 'flagfile' in file
>>> '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'.
>>> One possibility: file
>>> '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'
>>> is being linked both statically and dynamically into this executable.
>>>
>>> I am only able to make it work if I go with static linking by building it
>>> with "buildall" without the "-so"
>>>
>>> Anyone facing the same issue?
>>>
>>
>>
>
>
> --
> Henry Robinson
> Software Engineer
> Cloudera
> 415-994-6679
>


Re: Problem running Impala built with dynamic linking

2017-07-19 Thread Henry Robinson
Sorry, I read too quickly - you've done that already! Let me take a look.

On 19 July 2017 at 17:01, Henry Robinson <he...@apache.org> wrote:

> Yep, you need to remove the downloaded version of gflags and replace it
> with a recent toolchain version. See my mail from yesterday for
> instructions:
> https://lists.apache.org/api/source.lua/a154f43ef76e3da1271efed2740d447bbaeca09509156c7b728f2dd0@%3Cdev.impala.apache.org%3E
>
> On 19 July 2017 at 16:56, Bikramjeet Vig <bikramjeet@cloudera.com>
> wrote:
>
>> After fetching latest from asf-gerrit (that has the toolchain commit
>> related to gflags) and doing a manual toolchain refresh, I am unable to
>> run impala when I build with "make_debug" or "buildall -so", both
>> statestore and catalogd show the following error:
>>
>> ERROR: something wrong with flag 'flagfile' in file
>> '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'.
>> One possibility: file
>> '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'
>> is being linked both statically and dynamically into this executable.
>>
>> I am only able to make it work if I go with static linking by building it
>> with "buildall" without the "-so"
>>
>> Anyone facing the same issue?
>>
>
>


-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Re: Problem running Impala built with dynamic linking

2017-07-19 Thread Henry Robinson
Yep, you need to remove the downloaded version of gflags and replace it
with a recent toolchain version. See my mail from yesterday for
instructions:
https://lists.apache.org/api/source.lua/a154f43ef76e3da1271efed2740d447bbaeca09509156c7b728f2dd0@%3Cdev.impala.apache.org%3E

On 19 July 2017 at 16:56, Bikramjeet Vig 
wrote:

> After fetching latest from asf-gerrit (that has the toolchain commit
> related to gflags) and doing a manual toolchain refresh, I am unable to run
> impala when I build with "make_debug" or "buildall -so", both statestore
> and catalogd show the following error:
>
> ERROR: something wrong with flag 'flagfile' in file
> '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'.
> One possibility: file
> '/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gflags/gflags-2.2.0-p1/src/gflags.cc'
> is being linked both statically and dynamically into this executable.
>
> I am only able to make it work if I go with static linking by building it
> with "buildall" without the "-so"
>
> Anyone facing the same issue?
>


Re: Disabling all clang-tidy checks

2017-07-18 Thread Henry Robinson
Does this exclude source files? The issue I have is that I want to exclude
a whole directory of .cc files.

On 18 July 2017 at 17:09, Todd Lipcon <t...@cloudera.com> wrote:

> FYI I have a patch up against clang-tidy to add this feature:
> https://reviews.llvm.org/D34654
> Hoping it will be reviewed/committed soon.
>
> -Todd
>
> On Thu, Jul 13, 2017 at 10:58 AM, Henry Robinson <he...@cloudera.com>
> wrote:
>
> > To close this loop - I took the blunt instrument approach and removed the
> > directory (which is the imported kudu code) from tidying consideration
> the
> > same way we do gutil in run_clang_tidy.sh. Nothing else seemed to work as
> > one might think it should.
> >
> > On 12 July 2017 at 17:46, Jim Apple <jbap...@cloudera.com> wrote:
> >
> > > The clang-diagnostics are, IIRC, also enabled by the -W flags. You
> could
> > > try turning all warnings off via compiler flags.
> > >
> > > There is also a tool that auto-fixes clang-tidy warnings, but only some
> > of
> > > them, and I never got even that much to work :-/
> > >
> > > On Wed, Jul 12, 2017 at 5:24 PM, Henry Robinson <he...@apache.org>
> > wrote:
> > >
> > > > That does not, for whatever reason, actually disable
> > clang-diagnostic-*.
> > > I
> > > > don't know why either :/
> > > >
> > > > On 12 July 2017 at 17:15, Jim Apple <jbap...@cloudera.com> wrote:
> > > >
> > > > > > What about "diagnostic-henry-thinks-will-never-fire,-*,-clang-diagnostic-*"?
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Jul 12, 2017 at 5:01 PM, Henry Robinson <he...@apache.org>
> > > > wrote:
> > > > >
> > > > > > Has anyone found a way to disable all clang-tidy checks for a
> > > > directory?
> > > > > >
> > > > > > I've tried a directory-specific .clang-tidy file with
> > > > > >
> > > > > > ---
> > > > > > Checks: "-*"
> > > > > >
> > > > > > but that causes clang-tidy to exit with an error (because I
> didn't
> > > > > > configure any checks). So I tried adding one check that I thought
> > > would
> > > > > > never fire. But that silently re-enables a bunch of
> > clang-diagnostic*
> > > > > > checks that I don't want.
> > > > > >
> > > > > > This happens when running:
> > > > > >
> > > > > > git diff HEAD~1 |
> > > > > >  "${IMPALA_TOOLCHAIN}/llvm-${IMPALA_LLVM_VERSION}/share/clan
> > > > > > g/clang-tidy-diff.py"
> > > > > > -clang-tidy-binary
> > > > > > "${IMPALA_TOOLCHAIN}/llvm-${IMPALA_LLVM_VERSION}/bin/clang-tidy"
> > -p
> > > 1
> > > > > >
> > > > > > per
> > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?
> > > > > pageId=65868536
> > > > > >
> > > > > > Any ideas? Am I running clang-tidy wrong?
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > Henry Robinson
> > Software Engineer
> > Cloudera
> > 415-994-6679
> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>


Heads up: manual toolchain refresh needed if building w/shared libs

2017-07-18 Thread Henry Robinson
If you have BUILD_SHARED_LIBS set in your cmake environment when building
Impala, you will need to manually clear your local toolchain version of
gflags when you have this commit:

https://github.com/apache/incubator-impala/commit/d79e01ef9fec559d4ebe57d41539f4e4164ae78f

in your local branch.

This is because the following toolchain commit:

https://github.com/cloudera/native-toolchain/commit/f32e122eaa9932f52b7c3f4c205045f3522e88dd

enabled shared library building for gflags (needed by the above Impala
commit), but did not change the gflags version or patch level since there
were no code changes.

To do this, simply remove the toolchain gflags dir:

rm -rf ${IMPALA_TOOLCHAIN}/gflags-${IMPALA_GFLAGS_VERSION}
${IMPALA_HOME}/bin/bootstrap_toolchain.py

Then re-run cmake however you usually do (whether by buildall.sh or make or
otherwise), and you should be good to go.
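
For anyone scripting this, a minimal sketch that wraps the two commands above and refuses to run if any of the variables is unset, so the rm path can never collapse to something unintended. The wrapper name refresh_gflags is an assumption for illustration; it is not part of the Impala tree.

```shell
# Sketch only: refresh_gflags is an illustrative wrapper around the two
# commands from this mail; it does not exist in the Impala repository.
refresh_gflags() {
  # Abort with a message if any required variable is unset or empty.
  : "${IMPALA_TOOLCHAIN:?must be set}"
  : "${IMPALA_GFLAGS_VERSION:?must be set}"
  : "${IMPALA_HOME:?must be set}"

  # Remove the stale gflags build, then let the bootstrap script fetch it again.
  rm -rf "${IMPALA_TOOLCHAIN}/gflags-${IMPALA_GFLAGS_VERSION}"
  "${IMPALA_HOME}/bin/bootstrap_toolchain.py"
}
```

After it finishes, re-run cmake as described in this mail.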

Let me know if you have questions!

Henry


Re: Disabling all clang-tidy checks

2017-07-13 Thread Henry Robinson
To close this loop - I took the blunt instrument approach and removed the
directory (which is the imported kudu code) from tidying consideration the
same way we do gutil in run_clang_tidy.sh. Nothing else seemed to work as
one might think it should.
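
One way to picture the "remove the directory from consideration" approach is to filter the list of changed files before anything reaches clang-tidy. The sketch below is not the actual contents of run_clang_tidy.sh; the helper name exclude_dir and the be/src/kudu path are assumptions for illustration.

```shell
# Sketch only: exclude_dir and the be/src/kudu path are illustrative
# assumptions, not the real run_clang_tidy.sh logic.
# Reads file paths on stdin and prints those NOT under the given directory.
exclude_dir() {
  excluded="${1%/}/"          # normalise to exactly one trailing slash
  while IFS= read -r path; do
    case "$path" in
      "$excluded"*) ;;        # under the excluded directory: drop it
      *) printf '%s\n' "$path" ;;
    esac
  done
}

# Usage:
#   git diff --name-only HEAD~1 | exclude_dir be/src/kudu
```

Matching on the directory prefix with a trailing slash means a sibling such as be/src/kudu-extra is still kept.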

On 12 July 2017 at 17:46, Jim Apple <jbap...@cloudera.com> wrote:

> The clang-diagnostics are, IIRC, also enabled by the -W flags. You could
> try turning all warnings off via compiler flags.
>
> There is also a tool that auto-fixes clang-tidy warnings, but only some of
> them, and I never got even that much to work :-/
>
> On Wed, Jul 12, 2017 at 5:24 PM, Henry Robinson <he...@apache.org> wrote:
>
> > That does not, for whatever reason, actually disable clang-diagnostic-*.
> I
> > don't know why either :/
> >
> > On 12 July 2017 at 17:15, Jim Apple <jbap...@cloudera.com> wrote:
> >
> > > > What about "diagnostic-henry-thinks-will-never-fire,-*,-clang-diagnostic-*"?
> > >
> > >
> > >
> > > On Wed, Jul 12, 2017 at 5:01 PM, Henry Robinson <he...@apache.org>
> > wrote:
> > >
> > > > Has anyone found a way to disable all clang-tidy checks for a
> > directory?
> > > >
> > > > I've tried a directory-specific .clang-tidy file with
> > > >
> > > > ---
> > > > Checks: "-*"
> > > >
> > > > but that causes clang-tidy to exit with an error (because I didn't
> > > > configure any checks). So I tried adding one check that I thought
> would
> > > > never fire. But that silently re-enables a bunch of clang-diagnostic*
> > > > checks that I don't want.
> > > >
> > > > This happens when running:
> > > >
> > > > git diff HEAD~1 |
> > > >  "${IMPALA_TOOLCHAIN}/llvm-${IMPALA_LLVM_VERSION}/share/clan
> > > > g/clang-tidy-diff.py"
> > > > -clang-tidy-binary
> > > > "${IMPALA_TOOLCHAIN}/llvm-${IMPALA_LLVM_VERSION}/bin/clang-tidy" -p
> 1
> > > >
> > > > per
> > > > https://cwiki.apache.org/confluence/pages/viewpage.action?
> > > pageId=65868536
> > > >
> > > > Any ideas? Am I running clang-tidy wrong?
> > > >
> > >
> >
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Re: Disabling all clang-tidy checks

2017-07-12 Thread Henry Robinson
That does not, for whatever reason, actually disable clang-diagnostic-*. I
don't know why either :/

On 12 July 2017 at 17:15, Jim Apple <jbap...@cloudera.com> wrote:

> What about "diagnostic-henry-thinks-will-never-fire,-*,-clang-diagnostic-*"?
>
>
>
> On Wed, Jul 12, 2017 at 5:01 PM, Henry Robinson <he...@apache.org> wrote:
>
> > Has anyone found a way to disable all clang-tidy checks for a directory?
> >
> > I've tried a directory-specific .clang-tidy file with
> >
> > ---
> > Checks: "-*"
> >
> > but that causes clang-tidy to exit with an error (because I didn't
> > configure any checks). So I tried adding one check that I thought would
> > never fire. But that silently re-enables a bunch of clang-diagnostic*
> > checks that I don't want.
> >
> > This happens when running:
> >
> > git diff HEAD~1 |
> >   "${IMPALA_TOOLCHAIN}/llvm-${IMPALA_LLVM_VERSION}/share/clang/clang-tidy-diff.py" \
> >     -clang-tidy-binary "${IMPALA_TOOLCHAIN}/llvm-${IMPALA_LLVM_VERSION}/bin/clang-tidy" \
> >     -p 1
> >
> > per
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868536
> >
> > Any ideas? Am I running clang-tidy wrong?
> >
>


Disabling all clang-tidy checks

2017-07-12 Thread Henry Robinson
Has anyone found a way to disable all clang-tidy checks for a directory?

I've tried a directory-specific .clang-tidy file with

---
Checks: "-*"

but that causes clang-tidy to exit with an error (because I didn't
configure any checks). So I tried adding one check that I thought would
never fire. But that silently re-enables a bunch of clang-diagnostic*
checks that I don't want.

This happens when running:

git diff HEAD~1 |
  "${IMPALA_TOOLCHAIN}/llvm-${IMPALA_LLVM_VERSION}/share/clang/clang-tidy-diff.py" \
    -clang-tidy-binary "${IMPALA_TOOLCHAIN}/llvm-${IMPALA_LLVM_VERSION}/bin/clang-tidy" \
    -p 1

per
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868536

Any ideas? Am I running clang-tidy wrong?


Re: Add total number of started threads to /threadz

2017-07-10 Thread Henry Robinson
This seems valuable to me.

On 28 June 2017 at 21:26, Lars Volker  wrote:

> Hi All,
>
> While investigating IMPALA-5598 I added a counter with the total number of
> threads to /threadz. See below for what it looks like (I hope the ASF
> mailer won't eat the format). Does this look helpful? If someone thinks it
> does, I'll create a JIRA and push the change.
>
> Thanks, Lars
>
>
> Thread Groups
> All threads
> DataStreamSender: (running: 0, total created: 2500)
> common: (running: 2, total created: 2)
> coordinator-fragment-rpc: (running: 12, total created: 12)
> disk-io-mgr: (running: 34, total created: 34)
> fragment-mgr: (running: 0, total created: 2550)
> hdfs-scan-node: (running: 0, total created: 2500)
> hdfs-worker-pool: (running: 16, total created: 16)
> impala-server: (running: 8, total created: 8)
> plan-fragment-executor: (running: 0, total created: 2550)
> query-exec-state: (running: 0, total created: 50)
> rpc-pool: (running: 8, total created: 8)
> scheduling: (running: 1, total created: 1)
> setup-server: (running: 2, total created: 2)
> statestore-subscriber: (running: 1, total created: 1)
> thrift-server: (running: 248, total created: 248)
>
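
As a quick way to read counters like the ones quoted above, here is a small awk sketch that totals the running and created counts across groups. It assumes one group per line in the form "name: (running: N, total created: M)"; the function name sum_thread_counters is made up for illustration and nothing like it ships with Impala.

```shell
# Sketch only: sum_thread_counters is an illustrative name. It expects
# stdin lines shaped like "group: (running: N, total created: M)".
sum_thread_counters() {
  # Splitting on ':', ',' and ')' puts the running count in field 3 and
  # the total-created count in field 5.
  awk -F'[:,)]' '
    /total created/ { running += $3; created += $5 }
    END { printf "running=%d total_created=%d\n", running, created }
  '
}

# Usage: save the /threadz text to a file, one group per line, then
#   sum_thread_counters < threadz.txt
```

A large gap between the two totals, as with DataStreamSender above, points at groups that create and destroy threads frequently.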


Re: Re: Impala make install

2017-06-20 Thread Henry Robinson
I don't think there's any plan for this work. The CMake documentation would
be where I'd start looking for ideas:

https://cmake.org/cmake/help/v3.2/command/install.html

Best,
Henry

On 20 June 2017 at 18:31, sky  wrote:

> Hi Tim,
>    Is there a plan for this work? Could you provide an example of the
> manual copy? Thanks.
>
> At 2017-06-21 01:41:33, "Tim Armstrong"  wrote:
> >Hi Sky,
> >  We have not implemented an install target yet - for deployment we rely
> on
> >copying out the artifacts manually. I believe CMake has some support for
> >implementing install targets but nobody has picked up that work yet.
> >
> >- Tim
> >
> >On Mon, Jun 19, 2017 at 8:45 PM, sky  wrote:
> >
> >> Hi all,
> >> I am using cdh5.11.1-release. The compilation command is provided in
> >> the documentation (./buildall.sh -notests -so), but there is no command
> >> similar to 'make install'. After compiling, the directory structure is
> >> too large and I do not need most of the files. Could you provide an
> >> "install" command to extract the compiled files to another directory for
> >> easier management?
>


Re: Broken build from Sentry

2017-06-20 Thread Henry Robinson
Yes, I did. AFAICT it worked fine.

On 20 June 2017 at 09:19, Alexander Behm <alex.b...@cloudera.com> wrote:

> Henry, did you try the revert on top of Tim's already-checked-in change?
>
> On Tue, Jun 20, 2017 at 9:18 AM, Alexander Behm <alex.b...@cloudera.com>
> wrote:
>
> > Let's revert the version to buy us some time. That solution is a ticking
> > time bomb though since that version will disappear soon.
> >
> > On Tue, Jun 20, 2017 at 8:56 AM, Henry Robinson <he...@apache.org>
> wrote:
> >
> >> I was able to run a build with EE and FE tests with Sentry reverted to
> >> 5.12
> >> - unless there are objections I'm going to post a patch to revert the
> >> version bump.
> >>
> >> On 20 June 2017 at 06:53, Thomas Tauber-Marshall <
> tmarsh...@cloudera.com>
> >> wrote:
> >>
> >> > So we've had a successful run of the nightlies now, and I've uploaded
> >> the
> >> > new jars to the s3 bucket, but Sentry still fails for some reason.
> >> >
> >> > I filed: https://issues.apache.org/jira/browse/IMPALA-5540 to track
> >> this
> >> >
> >> > On Tue, Jun 20, 2017 at 1:25 AM Alexander Kolbasov <
> ak...@cloudera.com>
> >> > wrote:
> >> >
> >> > > Note that Apache upstream story is more complicated - there was a
> >> change
> >> > > done upstream that refactored a bunch of Sentry code that will cause
> >> > > similar issue (I think it is SENTRY-1205). The change is present in
> >> > Sentry
> >> > > master but not in upstream sentry HA branch.
> >> > >
> >> > > On Mon, Jun 19, 2017 at 11:02 PM, Dimitris Tsirogiannis <
> >> > > dtsirogian...@cloudera.com> wrote:
> >> > >
> >> > > > +Sasha, who I believe has more up-to-date information on this.
> >> > > >
> >> > > > On Mon, Jun 19, 2017 at 10:56 PM, Henry Robinson <
> he...@apache.org>
> >> > > wrote:
> >> > > >
> >> > > >> FWIW, I've been able to start Sentry by setting:
> >> > > >>
> >> > > >> export IMPALA_SENTRY_VERSION=1.5.1-cdh5.12.0-SNAPSHOT
> >> > > >>
> >> > > >> (i.e. rolling back to the previous version of Sentry). I haven't
> >> yet
> >> > > tried
> >> > > >> to run tests - does anyone know an ETA for a fix coming out of
> >> > Cloudera
> >> > > >> for
> >> > > >> the 5.13-SNAPSHOT? If it might be a while, we should consider
> >> > regressing
> >> > > >> the Sentry version to unblock checkins.
> >> > > >>
> >> > > >> On 19 June 2017 at 15:31, Tim Armstrong <tarmstr...@cloudera.com
> >
> >> > > wrote:
> >> > > >>
> >> > > >> > It's unfortunately not that simple. The API change has been in
> >> > Apache
> >> > > >> > sentry
> >> > > >> >
> >> > > >> > So rolling back the API change temporarily solves the problem
> for
> >> > > >> Cloudera,
> >> > > >> > but we're going to have to deal with it at some point and get
> >> Impala
> >> > > >> > building against both versions of the API.
> >> > > >> >
> >> > > >> > On Mon, Jun 19, 2017 at 2:55 PM, Thomas Tauber-Marshall <
> >> > > >> > tmarsh...@cloudera.com> wrote:
> >> > > >> >
> >> > > >> > > Yes, the Sentry team has been contacted and they're going to
> be
> >> > > >> rolling
> >> > > >> > it
> >> > > >> > > back.
> >> > > >> > >
> >> > > >> > > On Mon, Jun 19, 2017 at 4:53 PM Todd Lipcon <
> t...@cloudera.com
> >> >
> >> > > >> wrote:
> >> > > >> > >
> >> > > >> > > > Quick question from a bystander: it seems like Sentry
> >> committed
> >> > an
> >> > > >> > > > API-incompatible change. Instead of fixing on the Impala
> >> side,
> >> > > >> should
> >> > > >> > the
> >> > > >> > > > Sentry project be notified that they may

Re: Broken build from Sentry

2017-06-20 Thread Henry Robinson
I was able to run a build with EE and FE tests with Sentry reverted to 5.12
- unless there are objections I'm going to post a patch to revert the
version bump.

On 20 June 2017 at 06:53, Thomas Tauber-Marshall <tmarsh...@cloudera.com>
wrote:

> So we've had a successful run of the nightlies now, and I've uploaded the
> new jars to the s3 bucket, but Sentry still fails for some reason.
>
> I filed: https://issues.apache.org/jira/browse/IMPALA-5540 to track this
>
> On Tue, Jun 20, 2017 at 1:25 AM Alexander Kolbasov <ak...@cloudera.com>
> wrote:
>
> > Note that Apache upstream story is more complicated - there was a change
> > done upstream that refactored a bunch of Sentry code that will cause
> > similar issue (I think it is SENTRY-1205). The change is present in
> Sentry
> > master but not in upstream sentry HA branch.
> >
> > On Mon, Jun 19, 2017 at 11:02 PM, Dimitris Tsirogiannis <
> > dtsirogian...@cloudera.com> wrote:
> >
> > > +Sasha, who I believe has more up-to-date information on this.
> > >
> > > On Mon, Jun 19, 2017 at 10:56 PM, Henry Robinson <he...@apache.org>
> > wrote:
> > >
> > >> FWIW, I've been able to start Sentry by setting:
> > >>
> > >> export IMPALA_SENTRY_VERSION=1.5.1-cdh5.12.0-SNAPSHOT
> > >>
> > >> (i.e. rolling back to the previous version of Sentry). I haven't yet
> > tried
> > >> to run tests - does anyone know an ETA for a fix coming out of
> Cloudera
> > >> for
> > >> the 5.13-SNAPSHOT? If it might be a while, we should consider
> regressing
> > >> the Sentry version to unblock checkins.
> > >>
> > >> On 19 June 2017 at 15:31, Tim Armstrong <tarmstr...@cloudera.com>
> > wrote:
> > >>
> > >> > It's unfortunately not that simple. The API change has been in
> Apache
> > >> > sentry
> > >> >
> > >> > So rolling back the API change temporarily solves the problem for
> > >> Cloudera,
> > >> > but we're going to have to deal with it at some point and get Impala
> > >> > building against both versions of the API.
> > >> >
> > >> > On Mon, Jun 19, 2017 at 2:55 PM, Thomas Tauber-Marshall <
> > >> > tmarsh...@cloudera.com> wrote:
> > >> >
> > >> > > Yes, the Sentry team has been contacted and they're going to be
> > >> rolling
> > >> > it
> > >> > > back.
> > >> > >
> > >> > > On Mon, Jun 19, 2017 at 4:53 PM Todd Lipcon <t...@cloudera.com>
> > >> wrote:
> > >> > >
> > >> > > > Quick question from a bystander: it seems like Sentry committed
> an
> > >> > > > API-incompatible change. Instead of fixing on the Impala side,
> > >> should
> > >> > the
> > >> > > > Sentry project be notified that they may want to roll back such
> a
> > >> > change?
> > >> > > > It seems like an error on their part to do such a thing within a
> > >> minor
> > >> > > > version.
> > >> > > >
> > >> > > > On Mon, Jun 19, 2017 at 1:56 PM, Thomas Tauber-Marshall <
> > >> > > > tmarsh...@cloudera.com> wrote:
> > >> > > >
> > >> > > > > I'm working on getting the s3 jars updated, which presumably
> > will
> > >> fix
> > >> > > > that.
> > >> > > > >
> > >> > > > > The problem (to my understanding) is that the nightlies
> haven't
> > >> > passed
> > >> > > > > since the change went into Sentry and so the Jenkins job that
> > >> > normally
> > >> > > > > produces the new jars is still pulling in old bits.
> > >> > > > >
> > >> > > > > I've been talking with releng and they expect the new jars to
> be
> > >> > > > available
> > >> > > > > later today.
> > >> > > > >
> > >> > > > > On Mon, Jun 19, 2017 at 3:48 PM Tim Armstrong <
> > >> > tarmstr...@cloudera.com
> > >> > > >
> > >> > > > > wrote:
> > >> > > > >
> > >> > > > > > Looks like the build still breaks when starting up sentry
> > after
> > >> my

Re: Broken build from Sentry

2017-06-19 Thread Henry Robinson
FWIW, I've been able to start Sentry by setting:

export IMPALA_SENTRY_VERSION=1.5.1-cdh5.12.0-SNAPSHOT

(i.e. rolling back to the previous version of Sentry). I haven't yet tried
to run tests - does anyone know an ETA for a fix coming out of Cloudera for
the 5.13-SNAPSHOT? If it might be a while, we should consider regressing
the Sentry version to unblock checkins.
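
A minimal sketch of the same pin written so that an explicit choice still wins: the default only applies when IMPALA_SENTRY_VERSION is not already set. Where this line belongs in your own setup is an assumption; only the variable name and version string come from this thread.

```shell
# Sketch only: where this snippet lives in your environment is an assumption.
# Pin Sentry to the previous snapshot unless the caller already chose a version.
export IMPALA_SENTRY_VERSION="${IMPALA_SENTRY_VERSION:-1.5.1-cdh5.12.0-SNAPSHOT}"
echo "Using Sentry ${IMPALA_SENTRY_VERSION}"
```

Once the 5.13 snapshot is fixed, the override can be dropped again.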

On 19 June 2017 at 15:31, Tim Armstrong <tarmstr...@cloudera.com> wrote:

> It's unfortunately not that simple. The API change has been in Apache
> sentry
>
> So rolling back the API change temporarily solves the problem for Cloudera,
> but we're going to have to deal with it at some point and get Impala
> building against both versions of the API.
>
> On Mon, Jun 19, 2017 at 2:55 PM, Thomas Tauber-Marshall <
> tmarsh...@cloudera.com> wrote:
>
> > Yes, the Sentry team has been contacted and they're going to be rolling
> it
> > back.
> >
> > On Mon, Jun 19, 2017 at 4:53 PM Todd Lipcon <t...@cloudera.com> wrote:
> >
> > > Quick question from a bystander: it seems like Sentry committed an
> > > API-incompatible change. Instead of fixing on the Impala side, should
> the
> > > Sentry project be notified that they may want to roll back such a
> change?
> > > It seems like an error on their part to do such a thing within a minor
> > > version.
> > >
> > > On Mon, Jun 19, 2017 at 1:56 PM, Thomas Tauber-Marshall <
> > > tmarsh...@cloudera.com> wrote:
> > >
> > > > I'm working on getting the s3 jars updated, which presumably will fix
> > > that.
> > > >
> > > > The problem (to my understanding) is that the nightlies haven't
> passed
> > > > since the change went into Sentry and so the Jenkins job that
> normally
> > > > produces the new jars is still pulling in old bits.
> > > >
> > > > I've been talking with releng and they expect the new jars to be
> > > available
> > > > later today.
> > > >
> > > > On Mon, Jun 19, 2017 at 3:48 PM Tim Armstrong <
> tarmstr...@cloudera.com
> > >
> > > > wrote:
> > > >
> > > > > > Looks like the build still breaks when starting up sentry after my fix:
> > > > > >
> > > > > > http://jenkins.impala.io:8080/job/ubuntu-14.04-from-scratch/1547/console
> > > > > >
> > > > > > *20:08:54*  --> Starting the Sentry Policy Server
> > > > > > *20:08:59* Error in /home/ubuntu/Impala/testdata/bin/run-all.sh at line 58: $IMPALA_HOME/testdata/bin/run-sentry-service.sh > \
> > > > > > *20:08:59* + onexit
> > > > > > *20:08:59* + df -m
> > > > > > *20:08:59* Filesystem     1M-blocks  Used Available Use% Mounted on
> > > > > > *20:08:59* udev               15070     1     15070   1% /dev
> > > > > > *20:08:59* tmpfs               3015     1      3015   1% /run
> > > > > > *20:08:59* /dev/xvda1        161129 22275    132204  15% /
> > > > > > *20:08:59* none                   1     0         1   0% /sys/fs/cgroup
> > > > > > *20:08:59* none                   5     0         5   0% /run/lock
> > > > > > *20:08:59* none               15075     1     15075   1% /run/shm
> > > > > > *20:08:59* none                 100     0       100   0% /run/user
> > > > > > *20:08:59* + free -m
> > > > > > *20:08:59*              total   used   free shared buffers cached
> > > > > > *20:08:59* Mem:         30148  19597  10550     11      91  14323
> > > > > > *20:08:59* -/+ buffers/cache:   5182  24965
> > > > > > *20:08:59* Swap:            0      0      0
> > > > > > *20:08:59* + uptime -p
> > > > > > *20:08:59* up 45 minutes
> > > > > > *20:08:59* + rm -rf /home/ubuntu/Impala/logs_static
> > > > > > *20:08:59* + mkdir -p /home/ubuntu/Impala/logs_static
> > > > > > *20:08:59* + cp -r -L /home/ubuntu/Impala/logs /home/ubuntu/Impala/logs_static
> > > > > > *20:08:59* Build step 'Execute shell' marked build as failure
> > > > > > *20:08:59* Set build name.
> > > > > > *20:08:59* New build name is '#1547 refs/changes/22/7222/3'
> > > > > > *20:08:59* Variable with name 'BUILD_DISPLAY_NAME' already exists, current value: '#1547 refs/changes/22/7222/3', new value: '#1547 refs/changes/22/7222/3'
> > > > > > *20:09:12* Archiving artifacts
> > > > > > *20:09:21* Finished: FAILURE
> > > > >
> > > > >
> > >

Re: Broken build from Sentry

2017-06-19 Thread Henry Robinson
Presumably this will break GVO jobs as well - should we commit Tim's patch
to get us moving again while Alex works on the root cause?

On 19 June 2017 at 09:23, Alexander Behm  wrote:

> Meanwhile, I'll work on fixing the root cause:
> https://issues.apache.org/jira/browse/IMPALA-5530
>
> On Mon, Jun 19, 2017 at 9:20 AM, Tim Armstrong 
> wrote:
>
> > You may have noticed that Impala doesn't build this morning because of a
> > sentry exception class no longer existing. I was able to unblock myself
> > with this change, if you want to cherry-pick it:
> > https://gerrit.cloudera.org/#/c/7222/
> >
>


Re: [DISCUSS] Release 2.9.0 soon?

2017-05-25 Thread Henry Robinson
That seems like a great idea, +1. We can cherry-pick any future fixes back
before the release is done (I know IMPALA-4890 is going to land soon and is
important).

Thanks Taras!

On 25 May 2017 at 17:11, Taras Bobrovytsky <taras...@apache.org> wrote:

> Actually I think the better place to cut the branch is the following
> commit:
>
> commit 7763b8cc8d0e2a35a5ccf8dd37768750a11b5193
> > Author: poojanilangekar <nilangekar.po...@gmail.com>
> > Date:   Wed May 24 15:23:36 2017 -0700
> > IMPALA-5232: Parquet reader error message prints memory address
> > instead of value
>
>
> It's the last stable point, right before the big ADLS patch.
>
> Thoughts?
>
> On Thu, May 25, 2017 at 4:50 PM, Henry Robinson <he...@apache.org> wrote:
>
> > Why not the current HEAD, out of interest? (That's
> > https://github.com/apache/incubator-impala/commit/014c5603f867907963f3821948f90d526e9a4789
> > right now).
> >
> > On 25 May 2017 at 16:48, Taras Bobrovytsky <taras...@apache.org> wrote:
> >
> > > I propose that we branch Impala 2.9.0 from the following commit:
> > >
> > > commit b2782774491b5e280c4b51e39c393810deaa1e22
> > > > Author: Tim Armstrong <tarmstr...@cloudera.com>
> > > > Date:   Sun May 21 15:04:21 2017 -0700
> > > > IMPALA-5347: Parquet scanner microoptimizations
> > >
> > >
> > > Does anyone have any objections?
> > >
> > > On Mon, May 22, 2017 at 3:29 AM, Lars Volker <l...@cloudera.com> wrote:
> > >
> > > > Impala has not released 2.9.0 yet. We follow the ASF release process
> > which
> > > > is outlined here: http://www.apache.org/dev/release-publishing.html
> > > >
> > > > The feedback to Taras' proposal seems positive, so he'll likely
> > provide a
> > > > first release candidate soon.
> > > >
> > > > For an overview you can search for JIRAs with fixVersion set to
> "Impala
> > > > 2.9.0", but the list often has inaccuracies and will change until the
> > > final
> > > > release is published.
> > > >
> > > > On Mon, May 22, 2017 at 11:51 AM, yu feng <olaptes...@gmail.com>
> > wrote:
> > > >
> > > > > where can I find the release notes about 2.9.0?  thanks.
> > > > >
> > > > > 2017-05-20 6:59 GMT+08:00 Tim Armstrong <tarmstr...@cloudera.com>:
> > > > >
> > > > > > +1 Thanks for volunteering. It would be great to get a release
> > done -
> > > > > it's
> > > > > > been quite a while since the last one.
> > > > > >
> > > > > > On Fri, May 19, 2017 at 5:53 PM, Bharath Vissapragada <
> > > > > > bhara...@cloudera.com
> > > > > > > wrote:
> > > > > >
> > > > > > > +1. Good to have a new release with all the latest
> improvements.
> > > > > > >
> > > > > > > On Fri, May 19, 2017 at 3:51 PM, Alexander Behm <
> > > > > alex.b...@cloudera.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > +1 for doing a release
> > > > > > > >
> > > > > > > > On Fri, May 19, 2017 at 3:41 PM, Taras Bobrovytsky <
> > > > > > taras...@apache.org>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > This is not a [VOTE] thread. Everyone is encourage to
> > > > participate.
> > > > > > > > >
> > > > > > > > > I am volunteering to be a release manager for Impala 2.9.0.
> > Are
> > > > > there
> > > > > > > any
> > > > > > > > > objections to releasing 2.9.0 soon?
> > > > > > > > > Keep in mind this is NOT your last chance to speak - there
> > will
> > > > be
> > > > > at
> > > > > > > > least
> > > > > > > > > two votes, one for PPMC releasing and one for IPMC
> releasing.
> > > > > > > > >
> > > > > > > > > See
> > > > > > > > > https://cwiki.apache.org/confluence/display/IMPALA/
> > > > > > > > DRAFT%3A+How+to+Release
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] Release 2.9.0 soon?

2017-05-25 Thread Henry Robinson
Why not the current HEAD, out of interest? (That's
https://github.com/apache/incubator-impala/commit/014c5603f867907963f3821948f90d526e9a4789
right now).

On 25 May 2017 at 16:48, Taras Bobrovytsky  wrote:

> I propose that we branch Impala 2.9.0 from the following commit:
>
> commit b2782774491b5e280c4b51e39c393810deaa1e22
> > Author: Tim Armstrong 
> > Date:   Sun May 21 15:04:21 2017 -0700
> > IMPALA-5347: Parquet scanner microoptimizations
>
>
> Does anyone have any objections?
>
> On Mon, May 22, 2017 at 3:29 AM, Lars Volker  wrote:
>
> > Impala has not released 2.9.0 yet. We follow the ASF release process which
> > is outlined here: http://www.apache.org/dev/release-publishing.html
> >
> > The feedback to Taras' proposal seems positive, so he'll likely provide a
> > first release candidate soon.
> >
> > For an overview you can search for JIRAs with fixVersion set to "Impala
> > 2.9.0", but the list often has inaccuracies and will change until the
> final
> > release is published.
> >
> > On Mon, May 22, 2017 at 11:51 AM, yu feng  wrote:
> >
> > > where can I find the release notes about 2.9.0?  thanks.
> > >
> > > 2017-05-20 6:59 GMT+08:00 Tim Armstrong :
> > >
> > > > +1 Thanks for volunteering. It would be great to get a release done -
> > > it's
> > > > been quite a while since the last one.
> > > >
> > > > On Fri, May 19, 2017 at 5:53 PM, Bharath Vissapragada <
> > > > bhara...@cloudera.com
> > > > > wrote:
> > > >
> > > > > +1. Good to have a new release with all the latest improvements.
> > > > >
> > > > > On Fri, May 19, 2017 at 3:51 PM, Alexander Behm <
> > > alex.b...@cloudera.com>
> > > > > wrote:
> > > > >
> > > > > > +1 for doing a release
> > > > > >
> > > > > > On Fri, May 19, 2017 at 3:41 PM, Taras Bobrovytsky <
> > > > taras...@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > This is not a [VOTE] thread. Everyone is encouraged to
> > participate.
> > > > > > >
> > > > > > > I am volunteering to be a release manager for Impala 2.9.0. Are
> > > there
> > > > > any
> > > > > > > objections to releasing 2.9.0 soon?
> > > > > > > Keep in mind this is NOT your last chance to speak - there will
> > be
> > > at
> > > > > > least
> > > > > > > two votes, one for PPMC releasing and one for IPMC releasing.
> > > > > > >
> > > > > > > See
> > > > > > > https://cwiki.apache.org/confluence/display/IMPALA/
> > > > > > DRAFT%3A+How+to+Release
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Build failure: FlagRegisterer not found by linker

2017-05-22 Thread Henry Robinson
You need to remove $IMPALA_TOOLCHAIN/glog-0.3.4-p2, then run
bin/bootstrap_toolchain.py again. Make sure you've sourced impala-config.sh
to get the toolchain version bumped up as well.
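
The steps above can be sketched as a small shell helper. This is only a sketch under stated assumptions: the function name purge_cached_glog is hypothetical (it is not part of the Impala tree), and it assumes IMPALA_TOOLCHAIN points at your local toolchain directory once impala-config.sh has been sourced.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Impala tree): delete the cached glog
# build so that the next bootstrap has to fetch it again.
purge_cached_glog() {
  local toolchain_dir="$1"
  # Refuse to run with an empty path, so we never build a path like
  # "/glog-0.3.4-p2" by accident.
  if [ -z "${toolchain_dir}" ]; then
    echo "usage: purge_cached_glog <toolchain dir>" >&2
    return 1
  fi
  rm -rf "${toolchain_dir}/glog-0.3.4-p2"
}

# Typical use from the top of an Impala checkout (left commented out here
# because it needs a real checkout):
#   . bin/impala-config.sh
#   purge_cached_glog "${IMPALA_TOOLCHAIN}"
#   bin/bootstrap_toolchain.py
```

Only the glog directory is removed; other cached packages stay in place, so the re-bootstrap should only need to fetch glog again.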

On 22 May 2017 at 13:41, Timothy Wood  wrote:

> Please excuse, in advance, my naivete, inexperience, off-topicness, etc. :)
>
> I'm having trouble completing a build of Impala.  The linker fails to
> build the tests (atomic-test, instruction-counter-test ) with several
> reports of:
> ../../../toolchain/glog-0.3.4-p2/lib/libglog.a(libglog_la-logging.o):logging.cc:function
> _GLOBAL__sub_I_logging.cc: error: undefined reference to
> 'google::FlagRegisterer::FlagRegisterer(char const*, char const*,
> char const*, bool*, bool*)'
>
> I Googled references to ...FlagRegisterer and found postings about this
> problem; some recommended checking that gflags was built and available.  I
> found that I have two gflag versions (2.2.0, 2.2.0-p1) in my loader search
> path.  I tried moving one and then the other out of the way and rebuilding
> each time, but I get the same errors in all cases.  I'm not sure what to
> try next to resolve this dependency.
>
> Thanks,
> TW
>


Re: Heads-up - manual toolchain update required soon

2017-05-17 Thread Henry Robinson
Ok, this is done, and the patch is committed to master. As a reminder,
you'll need to remove the cached glog library in your toolchain, as per
below.

IMPALA-5174: Bump gflags to 2.2.0-p1

This gflags patch adds DEFINE_int32_hidden() etc. macros, which suppress
flags from appearing in /varz, --help and other flag enumerations.

Our toolchain glog is statically linked against gflags, and therefore
had to be rebuilt; however, its version number did not change. You will
likely need to do the following:

rm -rf ${IMPALA_TOOLCHAIN}/glog-0.3.4-p2/

before running bin/bootstrap_toolchain.py, otherwise building Impala may
fail with a linking error.


On 15 May 2017 at 11:09, Henry Robinson <he...@cloudera.com> wrote:

> Let's do it individually. I'll push an update patch for my change now.
>
> On 14 May 2017 at 06:09, Lars Volker <l...@cloudera.com> wrote:
>
>> I recently bumped the Breakpad version in the toolchain repo and now would
>> like to pull that into master. The change to do so is here:
>> https://gerrit.cloudera.org/#/c/6883/
>>
>> Should I wait until gflags has been pulled into master and rebase? Or
>> would
>> you like me to bump gflags in my change, too?
>>
>> On Tue, May 9, 2017 at 12:21 AM, Henry Robinson <he...@apache.org> wrote:
>>
>> > I'm about to start the process of getting IMPALA-5174 committed to the
>> > toolchain. This patch changes gflags to allow 'hidden' flags that won't
>> > show up on /varz etc.
>> >
>> > The toolchain glog has a dependency on gflags, meaning that the
>> installed
>> > glog library needs to be built against the installed gflag library. So
>> when
>> > the new gflag gets pulled in, you will need the new glog as well.
>> >
>> > However, the toolchain scripts won't detect that anything has changed
>> for
>> > glog, because there's no version number change (changing the toolchain
>> > build ID doesn't cause the toolchain scripts to invalidate
>> dependencies).
>> >
>> > Rather than introduce a spurious version bump with an empty patch file
>> or
>> > something, I figured in this case it's easiest to ask developers to
>> > manually delete their local glog, and then bin/bootstrap_toolchain.py
>> will
>> > download the most recent glog that's built against gflag. This is a
>> > one-time thing.
>> >
>> > I'll send out instructions about how to do this when the toolchain is
>> > updated.
>> >
>>
>
>
>
> --
> Henry Robinson
> Software Engineer
> Cloudera
> 415-994-6679
>


Re: Heads-up - manual toolchain update required soon

2017-05-15 Thread Henry Robinson
Let's do it individually. I'll push an update patch for my change now.

On 14 May 2017 at 06:09, Lars Volker <l...@cloudera.com> wrote:

> I recently bumped the Breakpad version in the toolchain repo and now would
> like to pull that into master. The change to do so is here:
> https://gerrit.cloudera.org/#/c/6883/
>
> Should I wait until gflags has been pulled into master and rebase? Or would
> you like me to bump gflags in my change, too?
>
> On Tue, May 9, 2017 at 12:21 AM, Henry Robinson <he...@apache.org> wrote:
>
> > I'm about to start the process of getting IMPALA-5174 committed to the
> > toolchain. This patch changes gflags to allow 'hidden' flags that won't
> > show up on /varz etc.
> >
> > The toolchain glog has a dependency on gflags, meaning that the installed
> > glog library needs to be built against the installed gflag library. So
> when
> > the new gflag gets pulled in, you will need the new glog as well.
> >
> > However, the toolchain scripts won't detect that anything has changed for
> > glog, because there's no version number change (changing the toolchain
> > build ID doesn't cause the toolchain scripts to invalidate dependencies).
> >
> > Rather than introduce a spurious version bump with an empty patch file or
> > something, I figured in this case it's easiest to ask developers to
> > manually delete their local glog, and then bin/bootstrap_toolchain.py
> will
> > download the most recent glog that's built against gflag. This is a
> > one-time thing.
> >
> > I'll send out instructions about how to do this when the toolchain is
> > updated.
> >
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Re: Should we change tests so they don't use single letter table names?

2017-05-12 Thread Henry Robinson
Is there any way to scope the tests to their own, unique, database
namespace?

On 12 May 2017 at 09:26, Lars Volker  wrote:

> Looking at AnalyzeDDLTest alone it's full of "t", "p", "tbl", "test",
> "foo", "bar" and the like. Fixing them often means overflowing a line and
> fixing line breaks, so seems a bit more effort. Maybe better to postpone
> until after the release.
>
> On Fri, May 12, 2017 at 6:11 PM, Alexander Behm 
> wrote:
>
> > Tim, I think Michael was not suggesting to drop your tables, but instead
> > create/drop new unique tables in each test like we do in EE tests.
> >
> > Yes, I think we should tackle this. I frequently run into this problem
> with
> > a "foo" table :)
> >
> > On Fri, May 12, 2017 at 8:59 AM, Lars Volker  wrote:
> >
> > > Yes, they are in the default db. I think the easiest way to go about
> this
> > > is to create 26 tables in default with a script and then rename tables
> in
> > > the FE tests until we catch all of them. Or try to grep for the
> offending
> > > tests. :)
> > >
> > > There seems to be some consensus that we should tackle this, so I'll
> > open a
> > > JIRA.
> > >
> > > On Fri, May 12, 2017 at 5:49 PM, Tim Armstrong <
> tarmstr...@cloudera.com>
> > > wrote:
> > >
> > > > Personally I'd prefer the frontend test to fail instead of dropping
> my
> > > > table without warning. I assume these tables are in the default
> > database,
> > > > right?
> > > >
> > > > On Fri, May 12, 2017 at 8:43 AM, Alexander Behm <
> > alex.b...@cloudera.com>
> > > > wrote:
> > > >
> > > > > Michael, to keep them fast and self-contained the FE tests do not
> > > > require a
> > > > > running Impala cluster, and as such cannot really execute any
> > > statements
> > > > > (e.g. DROP/ADD).
> > > > >
> > > > > The FE has limited mechanisms for setting up temporary tables which
> > > might
> > > > > suffice in most but not all cases.
> > > > >
> > > > > I agree with Lars that we should address this issue. We need to
> look
> > > at a
> > > > > few cases and see if there's a sledgehammer solution we can apply.
> > > > >
> > > > > On Fri, May 12, 2017 at 7:21 AM, Michael Brown  >
> > > > wrote:
> > > > >
> > > > > > Why not alter the frontend test to drop t if exists? Tests should
> > > > > generally
> > > > > > strive to set themselves up. Is there some trait of the frontend
> > > tests
> > > > > that
> > > > > > prevents that?
> > > > > >
> > > > > > On Fri, May 12, 2017 at 4:38 AM, Lars Volker 
> > > wrote:
> > > > > >
> > > > > > > Hi All,
> > > > > > >
> > > > > > > I frequently create test tables on my local system with names
> > like
> > > > "t"
> > > > > or
> > > > > > > "p". A couple of frontend tests use the same names and then
> fail
> > > with
> > > > > > > "Table already exists".
> > > > > > >
> > > > > > > Does anyone else hit this from time to time? Can we change the
> > > table
> > > > > > names
> > > > > > > in the tests to avoid single letter names? If there are no
> > > > objections,
> > > > > > I'll
> > > > > > > open a JIRA.
> > > > > > >
> > > > > > > Thanks, Lars
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Heads-up - manual toolchain update required soon

2017-05-08 Thread Henry Robinson
I'm about to start the process of getting IMPALA-5174 committed to the
toolchain. This patch changes gflags to allow 'hidden' flags that won't
show up on /varz etc.

The toolchain glog has a dependency on gflags, meaning that the installed
glog library needs to be built against the installed gflags library. So when
the new gflags gets pulled in, you will need the new glog as well.

However, the toolchain scripts won't detect that anything has changed for
glog, because there's no version number change (changing the toolchain
build ID doesn't cause the toolchain scripts to invalidate dependencies).

Rather than introduce a spurious version bump with an empty patch file or
something, I figured in this case it's easiest to ask developers to
manually delete their local glog, and then bin/bootstrap_toolchain.py will
download the most recent glog that's built against gflags. This is a
one-time thing.

I'll send out instructions about how to do this when the toolchain is
updated.
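
To illustrate why a rebuild with an unchanged version number goes unnoticed, here is a deliberately simplified sketch. It is NOT the real bin/bootstrap_toolchain.py logic; it only assumes that a package is treated as present when a directory named after its name and version exists locally. The function name needs_download is hypothetical.

```shell
#!/usr/bin/env bash
# Simplified illustration only; NOT the real bin/bootstrap_toolchain.py logic.
# Assumption: a package counts as "already installed" when a directory
# named <name>-<version> exists under the toolchain directory.
needs_download() {
  local toolchain_dir="$1"
  local name="$2"
  local version="$3"
  # Succeeds (exit status 0) only when the versioned directory is missing.
  [ ! -d "${toolchain_dir}/${name}-${version}" ]
}

# With a check like this, a rebuilt glog-0.3.4-p2 looks identical to the
# cached one, so nothing is fetched until the directory is deleted by hand.
```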


FLAGS_use_statestore

2017-05-03 Thread Henry Robinson
Does anyone ever run Impala with --use_statestore set to false? I don't
think it works (catalog dissemination can't work), and keeping it around
means we have a few extra c'tors in the scheduler code that are proving a
bit of a pain to maintain.

I propose deprecating --use_statestore in 2.9. Any objections?

Thanks,
Henry


Re: Upstreaming ppc64le patches for native-toolchain

2017-04-17 Thread Henry Robinson
+1

On Mon, Apr 17, 2017 at 9:09 AM Tim Armstrong <tarmstr...@cloudera.com>
wrote:

> I feel like we shouldn't make PPC part of pre-commit at least initially -
> it's an unreasonable barrier if contributors/committers have to debug issues on
> a platform they don't have easy access to. Having the testing infra is
> still important because we don't want to have code in there that will
> gradually bit-rot without us noticing.
>
> On Mon, Apr 17, 2017 at 8:51 AM, Silvius Rus <s...@cloudera.com> wrote:
>
> > Would it make sense to _not_ run PPC tests as part of presubmit?  Instead
> > Valencia could set up nightly tests using in-house infrastructure.  And
> > share the test results, e.g., by sending them to a new email list
> > te...@impala.incubator.apache.org (that we'd need to create) so everyone
> > can see when there are failures or if coverage stops for whatever reason.
> > GCC has been doing something like this for a long time,
> > https://gcc.gnu.org/ml/gcc-testresults/2017-04/.
> >
> > On Tue, Apr 11, 2017 at 9:44 AM, Jim Apple <jbap...@cloudera.com> wrote:
> >
> > > >
> > > > Locally, I work on native-toolchain using a VM configured with
> > > > Ubuntu16.04ppc64le, 4GB RAM and 50GB of HDD. If  we provide you a VM
> > with
> > > > this config, will it be sufficient ?
> > > >
> > >
> > > What hypervisor/emulator will it use?
> > >
> > > What are the requirements of the host OS and host hardware?
> > >
> > > Why is the config you have it set to so important that you mention it
> in
> > > your email - will the config be locked down into that config or can it
> be
> > > reconfigured later?
> > >
> > > How is the VM controlled from the host OS? Keep in mind that a GUI
> cannot
> > > be the only option for automated tests.
> > >
> > > FWIW, Impala's test suite probably cannot fully complete without at
> least
> > > 8, and I suspect 16, GB of RAM, and we might need more disk space, too,
> > but
> > > these should be reconfigurable with most hypervisors/emulators.
> > >
> >
>
-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Review request

2017-03-28 Thread Henry Robinson
If someone could review all three lines of
https://gerrit.cloudera.org/#/c/6497/1 when you have a moment, I'd be
grateful, as the gutil patches (which are ready to go) depend upon it.


Re: Guidance required to submit ppc64le patch for native-toolchain for gerrit review

2017-03-23 Thread Henry Robinson
I see the problem. Please push to the native-toolchain project, not the
Toolchain one. I recently renamed that project, and did not send
instructions to this list - my apologies!

You need to do this:

cd 
git remote set-url  ssh://@gerrit.cloudera.org:29418/native-toolchain

and try pushing again. Please make sure you rebase on the upstream
native-toolchain repo, and that the only commits different from that are
the ones you want to have reviewed.
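
As a rough sketch of the remote update described above: the helper name native_toolchain_url is hypothetical, and the remote name "origin" and the GERRIT_USER variable in the usage comment are assumptions about your local setup. The host, port and project name are the ones given in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the SSH push URL for the renamed Gerrit project.
native_toolchain_url() {
  local gerrit_user="$1"
  echo "ssh://${gerrit_user}@gerrit.cloudera.org:29418/native-toolchain"
}

# Example, run inside your native-toolchain clone ("origin" is an assumed
# remote name; substitute whatever `git remote -v` shows):
#   git remote set-url origin "$(native_toolchain_url "${GERRIT_USER}")"
#   git remote -v   # confirm the new URL before pushing
```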

Let me know if you have problems -

Henry


On 23 March 2017 at 21:56, Valencia Serrao  wrote:

> Hi Jim,
>
> Thanks for the quick response.
>
> I have set the GERRIT_USER to my gerrit username. Here is the required
> info.
>
> *Command: *ssh -p 29418 $gerrit_u...@gerrit.cloudera.org
> *Output*:
> Enter passphrase for key '/home/test/.ssh/id_rsa':
>
>  Welcome to Gerrit Code Review 
>
> Hi Valencia Edna Serrao, you have successfully connected over SSH.
>
> Unfortunately, interactive shells are disabled.
> To clone a hosted Git repository, use:
>
> git clone ssh://vser...@gerrit.cloudera.org:29418/REPOSITORY_NAME.git
>
> Connection to gerrit.cloudera.org closed.
>
> *Command:* git log | grep -c 'ibm.com'
> *Output:* 1
>
>
> Please let me know if any other info is required.
>
> Regards,
> Valencia
>
> From: Jim Apple 
> To: "dev@impala" 
> Cc: Valencia Serrao/Austin/Contr/IBM@IBMUS, Nishidha
> Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan Jagadale/Austin/Contr/IBM@IBMUS,
> Manish Patil/Austin/Contr/IBM@IBMUS
> Date: 03/23/2017 07:02 PM
> Subject: Re: Guidance required to submit ppc64le patch for
> native-toolchain for gerrit review
> --
>
>
>
> First, we forgot to include GERRIT_USER instructions. I have set those now:
>
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion
> .action?pageId=65147133=16=15
>
> What do you see when you try
>
> ssh -p 29418 $gerrit_u...@gerrit.cloudera.org
>
> Also:
>
> > remote: (W) b5db155: commit subject >65 characters; use shorter first
> > paragraph
> > remote: (W) f414949: too many commit message lines longer than 70
> > characters; manually wrap lines
>
> These are odd when your git log shows a single commit without this
> problem. Did you squash all of your commits down into one? What does
> "git log | grep -c 'ibm.com'" show for you? I see 1.
>


Kudu library dependency reviews

2017-03-23 Thread Henry Robinson
I just pushed gutil, krpc, kudu_util and security library patches to
Gerrit. They are now ready to review.

Each update is split into two changes: the first is the source code import,
the second is the build changes that make it compile and integrate with
Impala. Reviewers should only need to focus on the latter change. I'm on
hand to answer any and all questions!


Re: Need to upgrade my one year old build for potential join perf improvement?

2017-03-22 Thread Henry Robinson
If your build does not have runtime filters (which are a little more than a
year old), then the answer is yes: you should upgrade.

Performance in general has been improving steadily over the last year.

On 22 March 2017 at 20:36, 吴朱华  wrote:

> The question is: has any big join-perf-related improvement been
> implemented? Thx
>
> 2017-03-23 11:34 GMT+08:00 吴朱华 :
>
> > Hi guys:
> >
> > I am currently using my one-year-old build (I guess 2016.3 CDH5-Trunk).
> > I am now facing a lot of join queries (joins of two big tables). Do I
> > need to upgrade my one-year-old build for a potential join perf
> > improvement?
> > Thx in advance ^_^
> >
>


Re: Impala JIRA migration to https://issues.apache.org/jira is complete

2017-03-14 Thread Henry Robinson
It seems like they're re-assigned to you already. I'm not sure what the
role set-up looks like on the ASF JIRA, but I have permissions to make a
bulk change if you'd like anything done to all your issues at once.

Henry

On 14 March 2017 at 10:14, Jim Apple <jbap...@cloudera.com> wrote:

> You might ask the admins of the server to change that username, but I don't
> know if they are willing to provide that level of individualized service.
>
> https://www.apache.org/dev/infra-contact
>
> On Tue, Mar 14, 2017 at 8:33 AM, Zachary Amsden <zams...@cloudera.com>
> wrote:
>
> > As one of those affected, what should I do?  I already created a new
> > account as zamsden, but all of my issues are assigned to
> > zamsden_impala_ad21.  Should I reassign and transfer things over or am I
> > forever doomed to be zamsden_impala_ad21?
> >
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Re: How counters are actually filled in a source

2017-03-08 Thread Henry Robinson
The timer is added to the runtime profile by ADD_TIMER(), and incremented
by CANCEL_SAFE_SCOPED_TIMER(), as Andrey said. What's the issue you're
seeing?

On 7 March 2017 at 23:13, Jim Apple <jbap...@cloudera.com> wrote:

> I don't see any other user of this either. Feel free to file a JIRA on
> this, and, if you like, a patch:
> https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala
>
> On Fri, Feb 24, 2017 at 3:12 AM, Andrey Morskoy <m...@ciklum.com> wrote:
>
> > Seems that's it:
> > DataStreamRecvr::SenderQueue::AddBatch:
> > if (timer_lock) {
> > CANCEL_SAFE_SCOPED_TIMER(recvr_->buffer_full_wall_timer_,
> > _cancelled_);
> > 
> >
> > On Fri, Feb 24, 2017 at 12:35 PM, Andrey Morskoy <m...@ciklum.com>
> wrote:
> >
> > > Could you please explain the code pattern for this situation:
> > > In data-stream-recvr.cc there is this line:
> > >   buffer_full_wall_timer_ = ADD_TIMER(profile_, "SendersBlockedTimer");
> > >
> > >
> > > But I could not find the code responsible for incrementing this
> > > timer. Thanks
> > >
> > >
> >
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Heads up: LZ4 1.7.5 bits have changed in toolchain

2017-03-01 Thread Henry Robinson
As a result of IMPALA-4983, the 1.7.5 version of LZ4 in the toolchain has
been rebuilt as of toolchain build 349-1b15a6c8f4 to include optimization
flags.

On another thread we're talking about ways to get
bin/bootstrap_toolchain.py to notice this change automatically. For the
time being, if you want to be sure you have the latest LZ4 version
(important for performance), do something like the following:

rm -rf ${IMPALA_TOOLCHAIN}/lz4-1.7.5/
cd ${IMPALA_HOME} && . bin/impala-config.sh
bin/bootstrap_toolchain.py


Re: Toolchain - versioning dependencies with the same version number

2017-02-28 Thread Henry Robinson
On 28 February 2017 at 12:57, Marcel Kornacker <mar...@cloudera.com> wrote:

> Yes, I too am particularly concerned about maintaining the ability to
> build offline, and downloading the same things over and over again.
>
> I don't quite understand the case against versioning - if gc'ing
> obsolete versions in order to reduce storage space is a concern, then
> it's probably fine to a) blow away and re-download everything, or b)
> throw away old versions manually, if you happen to be in a situation
> where a) isn't possible.
>

The issue I have with versioning is that there's no way to understand the
link between the version number and what actually changed. It's a kludge
to deal with the fact that the toolchain can't handle this kind of
situation.

That said, my immediate goal is to make sure that everyone picks up the new
LZ4 build. So I'll add a new version for now, and we can revisit this some
other time.


>
> On Tue, Feb 28, 2017 at 12:20 PM, Tim Armstrong <tarmstr...@cloudera.com>
> wrote:
> > I agree it's not too bad if you have a fat pipe to S3, but it's a pretty
> > bad regression in usability to make it the default and particularly
> provide
> > no way to opt out.
> >
> > The toolchain is almost 1GB though, which is pretty problematic to
> download
> > if a developer is on coffee-shop wifi, cellular wireless, airplane wifi,
> > etc. It'd also be pretty easy for a developer working offline to switch
> > branches, run buildall.sh, have gcc, etc, automatically deleted and then
> be
> > stuck unable to build anything.
> >
> >
> > On Tue, Feb 28, 2017 at 9:07 AM, Henry Robinson <he...@apache.org>
> wrote:
> >
> >> I'd prefer not to do that because it's something of a hack and generates
> >> too many artifacts if we make incremental build changes, not to mention
> the
> >> extra complexity required to make such a change because new tarballs
> might
> >> need to be uploaded.
> >>
> >>
> >>
> >>
> >> On Tue, Feb 28, 2017 at 8:55 AM Lars Volker <l...@cloudera.com> wrote:
> >>
> >> > Can we add another version string component like -1 or -impala1, or
> add a
> >> > dummy patch to the affected packages to allow for new versions with
> the
> >> > same upstream version? I think this is what Linux distributions
> commonly
> >> do
> >> > to have several versions of the same upstream version.
> >> >
> >> > On Feb 27, 2017 21:15, "Henry Robinson" <he...@cloudera.com> wrote:
> >> >
> >> > Yes, it would force re-downloading. At my office, downloading a
> toolchain
> >> > takes a matter of a few seconds, so I'm not sure the cost is that
> great.
> >> > And if it turned out to be problematic, one could always change the
> >> > toolchain directory for different branches. Having something locally
> that
> >> > set IMPALA_TOOLCHAIN_DIR=${IMPALA_HOME}/${IMPALA_TOOLCHAIN_BUILD_ID}/
> >> would
> >> > work.
> >> >
> >> > However I wouldn't want to force behaviour that into the toolchain
> >> scripts
> >> > because of the need for garbage collection it would raise - it
> wouldn't
> >> be
> >> > clear when to delete old toolchains programatically.
> >> >
> >> > On 27 February 2017 at 20:51, Tim Armstrong <tarmstr...@cloudera.com>
> >> > wrote:
> >> >
> >> > > Maybe I'm misunderstanding, but wouldn't that force re-downloading
> of
> >> the
> >> > > entire toolchain every time a developer switches between branches
> with
> >> > > different build IDs?
> >> > >
> >> > > I know some developers do that frequently, e.g. to try and reproduce
> >> bugs
> >> > > on older versions or backport patches.
> >> > >
> >> > > I agree it would be good to fix this, since I've run into this
> problem
> >> > > before, I'm just not quite sure what the best solution is. In the
> other
> >> > > case where I had this issue with LLVM I changed the version number
> (by
> >> > > appending noasserts-) to it, but that's really just a hack.
> >> > >
> >> > > -Tim
> >> > >
> >> > > On Mon, Feb 27, 2017 at 4:35 PM, Henry Robinson <he...@cloudera.com
> >
> >> > > wrote:
> >> > >
> >> > > > As Matt said, I have a patch that implements build ID-based
> >> versioning

Re: Toolchain - versioning dependencies with the same version number

2017-02-28 Thread Henry Robinson
I'd prefer not to do that because it's something of a hack and generates
too many artifacts if we make incremental build changes, not to mention the
extra complexity required to make such a change because new tarballs might
need to be uploaded.




On Tue, Feb 28, 2017 at 8:55 AM Lars Volker <l...@cloudera.com> wrote:

> Can we add another version string component like -1 or -impala1, or add a
> dummy patch to the affected packages to allow for new versions with the
> same upstream version? I think this is what Linux distributions commonly do
> to have several versions of the same upstream version.
>
> On Feb 27, 2017 21:15, "Henry Robinson" <he...@cloudera.com> wrote:
>
> Yes, it would force re-downloading. At my office, downloading a toolchain
> takes a matter of a few seconds, so I'm not sure the cost is that great.
> And if it turned out to be problematic, one could always change the
> toolchain directory for different branches. Having something locally that
> set IMPALA_TOOLCHAIN_DIR=${IMPALA_HOME}/${IMPALA_TOOLCHAIN_BUILD_ID}/ would
> work.
>
> However I wouldn't want to force behaviour that into the toolchain scripts
> because of the need for garbage collection it would raise - it wouldn't be
> clear when to delete old toolchains programatically.
>
> On 27 February 2017 at 20:51, Tim Armstrong <tarmstr...@cloudera.com>
> wrote:
>
> > Maybe I'm misunderstanding, but wouldn't that force re-downloading of the
> > entire toolchain every time a developer switches between branches with
> > different build IDs?
> >
> > I know some developers do that frequently, e.g. to try and reproduce bugs
> > on older versions or backport patches.
> >
> > I agree it would be good to fix this, since I've run into this problem
> > before, I'm just not quite sure what the best solution is. In the other
> > case where I had this issue with LLVM I changed the version number (by
> > appending noasserts-) to it, but that's really just a hack.
> >
> > -Tim
> >
> > On Mon, Feb 27, 2017 at 4:35 PM, Henry Robinson <he...@cloudera.com>
> > wrote:
> >
> > > As Matt said, I have a patch that implements build ID-based versioning
> at
> > > https://gerrit.cloudera.org/#/c/6166/2.
> > >
> > > Does anyone want to take a look? If we could get this in soon it would
> > help
> > > smooth over the LZ4 change which is going in shortly.
> > >
> > > On 27 February 2017 at 14:21, Henry Robinson <he...@cloudera.com>
> wrote:
> > >
> > > > I agree that that might be useful, and that it's a separately
> > addressable
> > > > problem.
> > > >
> > > > On 27 February 2017 at 14:18, Matthew Jacobs <m...@cloudera.com>
> wrote:
> > > >
> > > >> Just catching up to this e-mail, though I had seen your code reviews
> > > >> and I think this approach makes sense. An additional concern would
> be
> > > >> how to identify how a toolchain package was built, and AFAIK this is
> > > >> tricky now if only the 'toolchain ID' is known. Before I saw this
> > > >> e-mail I was thinking about this problem (which I think we can
> address
> > > >> separately), and that we might want to write the native-toolchain
> git
> > > >> hash with every toolchain build so that the exact build scripts are
> > > >> associated with those build artifacts. I filed
> > > >> https://issues.cloudera.org/browse/IMPALA-5002 for this related
> > > >> problem.
> > > >>
> > > >> On Sat, Feb 25, 2017 at 10:22 PM, Henry Robinson <he...@apache.org>
> > > >> wrote:
> > > >> > As written, the toolchain can't apparently deal with the
> possibility
> > > of
> > > >> > build flags changing, but a dependency version remaining the same.
> > > >> >
> > > >> > LZ4 has never (afaict) been built with optimization enabled. I
> have
> > a
> > > >> > commit that enables -O3, but that continues to produce artifacts
> for
> > > >> > lz4-1.7.5 with no version change. This is a problem because
> > > >> bootstrapping
> > > >> > the toolchain will fail to pick up the new binaries - because the
> > > >> > previously downloaded version is still in the local cache, and
> won't
> > > be
> > > >> > overwritten because the version number is unchanged.
> > > >> >
> > > >> > I think the simplest way to fix this is to write the

Re: Toolchain - versioning dependencies with the same version number

2017-02-27 Thread Henry Robinson
Yes, it would force re-downloading. At my office, downloading a toolchain
takes a matter of a few seconds, so I'm not sure the cost is that great.
And if it turned out to be problematic, one could always change the
toolchain directory for different branches. Having something locally that
set IMPALA_TOOLCHAIN_DIR=${IMPALA_HOME}/${IMPALA_TOOLCHAIN_BUILD_ID}/ would
work.

However I wouldn't want to force that behaviour into the toolchain scripts
because of the need for garbage collection it would raise - it wouldn't be
clear when to delete old toolchains programmatically.
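
For illustration, a minimal Python sketch of the per-build-ID directory
idea. The helper name toolchain_dir is hypothetical and is not part of the
toolchain scripts; only IMPALA_HOME and IMPALA_TOOLCHAIN_BUILD_ID come from
this thread.

```python
import os


def toolchain_dir(impala_home, build_id):
    """Return a toolchain directory unique to one toolchain build ID.

    Hypothetical helper that mirrors the shell assignment
    IMPALA_TOOLCHAIN_DIR=${IMPALA_HOME}/${IMPALA_TOOLCHAIN_BUILD_ID}/
    """
    if not build_id:
        raise ValueError("build_id must be non-empty")
    return os.path.join(impala_home, build_id)


if __name__ == "__main__":
    # Read the two variables named in the thread; the fallbacks are placeholders.
    home = os.environ.get("IMPALA_HOME", ".")
    build = os.environ.get("IMPALA_TOOLCHAIN_BUILD_ID", "example-build-id")
    print(toolchain_dir(home, build))
```

Nothing in this sketch ever deletes a directory belonging to an older build
ID, which is exactly the garbage-collection question raised above.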

On 27 February 2017 at 20:51, Tim Armstrong <tarmstr...@cloudera.com> wrote:

> Maybe I'm misunderstanding, but wouldn't that force re-downloading of the
> entire toolchain every time a developer switches between branches with
> different build IDs?
>
> I know some developers do that frequently, e.g. to try and reproduce bugs
> on older versions or backport patches.
>
> I agree it would be good to fix this, since I've run into this problem
> before, I'm just not quite sure what the best solution is. In the other
> case where I had this issue with LLVM I changed the version number (by
> appending noasserts-) to it, but that's really just a hack.
>
> -Tim
>
> On Mon, Feb 27, 2017 at 4:35 PM, Henry Robinson <he...@cloudera.com>
> wrote:
>
> > As Matt said, I have a patch that implements build ID-based versioning at
> > https://gerrit.cloudera.org/#/c/6166/2.
> >
> > Does anyone want to take a look? If we could get this in soon it would
> help
> > smooth over the LZ4 change which is going in shortly.
> >
> > On 27 February 2017 at 14:21, Henry Robinson <he...@cloudera.com> wrote:
> >
> > > I agree that that might be useful, and that it's a separately
> addressable
> > > problem.
> > >
> > > On 27 February 2017 at 14:18, Matthew Jacobs <m...@cloudera.com> wrote:
> > >
> > >> Just catching up to this e-mail, though I had seen your code reviews
> > >> and I think this approach makes sense. An additional concern would be
> > >> how to identify how a toolchain package was built, and AFAIK this is
> > >> tricky now if only the 'toolchain ID' is known. Before I saw this
> > >> e-mail I was thinking about this problem (which I think we can address
> > >> separately), and that we might want to write the native-toolchain git
> > >> hash with every toolchain build so that the exact build scripts are
> > >> associated with those build artifacts. I filed
> > >> https://issues.cloudera.org/browse/IMPALA-5002 for this related
> > >> problem.
> > >>
> > >> On Sat, Feb 25, 2017 at 10:22 PM, Henry Robinson <he...@apache.org>
> > >> wrote:
> > >> > As written, the toolchain can't apparently deal with the possibility
> > of
> > >> > build flags changing, but a dependency version remaining the same.
> > >> >
> > >> > LZ4 has never (afaict) been built with optimization enabled. I have
> a
> > >> > commit that enables -O3, but that continues to produce artifacts for
> > >> > lz4-1.7.5 with no version change. This is a problem because
> > >> bootstrapping
> > >> > the toolchain will fail to pick up the new binaries - because the
> > >> > previously downloaded version is still in the local cache, and won't
> > be
> > >> > overwritten because of the version change.
> > >> >
> > >> > I think the simplest way to fix this is to write the toolchain build
> > ID
> > >> to
> > >> > the dependency version file (that's in the local cache only) when
> it's
> > >> > downloaded. If that ID changes, the dependency will be
> re-downloaded.
> > >> >
> > >> > This has the disadvantage that any bump in IMPALA_TOOLCHAIN_BUILD_ID
> > >> will
> > >> > invalidate all dependencies, and bin/bootstrap_toolchain.py will
> > >> > re-download all of them. My feeling is that that cost is better than
> > >> trying
> > >> > to individually determine whether a dependency has changed between
> > >> > toolchain builds.
> > >> >
> > >> > Any thoughts on whether this is the right way to go?
> > >> >
> > >> > Henry
> > >>
> > >
> > >
> > >
> > > --
> > > Henry Robinson
> > > Software Engineer
> > > Cloudera
> > > 415-994-6679 <(415)%20994-6679>
> > >
> >
> >
> >
> > --
> > Henry Robinson
> > Software Engineer
> > Cloudera
> > 415-994-6679
> >
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Re: Toolchain - versioning dependencies with the same version number

2017-02-27 Thread Henry Robinson
As Matt said, I have a patch that implements build ID-based versioning at
https://gerrit.cloudera.org/#/c/6166/2.

Does anyone want to take a look? If we could get this in soon it would help
smooth over the LZ4 change which is going in shortly.

On 27 February 2017 at 14:21, Henry Robinson <he...@cloudera.com> wrote:

> I agree that that might be useful, and that it's a separately addressable
> problem.
>
> On 27 February 2017 at 14:18, Matthew Jacobs <m...@cloudera.com> wrote:
>
>> Just catching up to this e-mail, though I had seen your code reviews
>> and I think this approach makes sense. An additional concern would be
>> how to identify how a toolchain package was built, and AFAIK this is
>> tricky now if only the 'toolchain ID' is known. Before I saw this
>> e-mail I was thinking about this problem (which I think we can address
>> separately), and that we might want to write the native-toolchain git
>> hash with every toolchain build so that the exact build scripts are
>> associated with those build artifacts. I filed
>> https://issues.cloudera.org/browse/IMPALA-5002 for this related
>> problem.
>>
>> On Sat, Feb 25, 2017 at 10:22 PM, Henry Robinson <he...@apache.org>
>> wrote:
>> > As written, the toolchain can't apparently deal with the possibility of
>> > build flags changing, but a dependency version remaining the same.
>> >
>> > LZ4 has never (afaict) been built with optimization enabled. I have a
>> > commit that enables -O3, but that continues to produce artifacts for
>> > lz4-1.7.5 with no version change. This is a problem because
>> bootstrapping
>> > the toolchain will fail to pick up the new binaries - because the
>> > previously downloaded version is still in the local cache, and won't be
>> > overwritten because of the version change.
>> >
>> > I think the simplest way to fix this is to write the toolchain build ID
>> to
>> > the dependency version file (that's in the local cache only) when it's
>> > downloaded. If that ID changes, the dependency will be re-downloaded.
>> >
>> > This has the disadvantage that any bump in IMPALA_TOOLCHAIN_BUILD_ID
>> will
>> > invalidate all dependencies, and bin/bootstrap_toolchain.py will
>> > re-download all of them. My feeling is that that cost is better than
>> trying
>> > to individually determine whether a dependency has changed between
>> > toolchain builds.
>> >
>> > Any thoughts on whether this is the right way to go?
>> >
>> > Henry
>>
>
>
>
> --
> Henry Robinson
> Software Engineer
> Cloudera
> 415-994-6679
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Re: Toolchain - versioning dependencies with the same version number

2017-02-27 Thread Henry Robinson
I agree that that might be useful, and that it's a separately addressable
problem.
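
As a rough sketch of Matt's suggestion quoted below (recording the
native-toolchain git hash alongside each toolchain build), something like
the following could work. The function names, the metadata file, and its
JSON keys are all assumptions made for illustration; the only external call
is the standard "git rev-parse HEAD".

```python
import json
import subprocess


def current_git_hash(repo_dir):
    """Return the commit hash checked out in repo_dir via `git rev-parse HEAD`."""
    out = subprocess.check_output(["git", "rev-parse", "HEAD"], cwd=repo_dir)
    return out.decode("utf-8").strip()


def build_metadata(build_id, git_hash):
    """Describe one toolchain build; the key names are hypothetical."""
    return {
        "toolchain_build_id": build_id,
        "native_toolchain_git_hash": git_hash,
    }


def write_metadata(path, build_id, git_hash):
    """Store the metadata next to the build artifacts as JSON."""
    with open(path, "w") as f:
        json.dump(build_metadata(build_id, git_hash), f, indent=2, sort_keys=True)
```

With a file like this published next to the artifacts, a toolchain ID could
be mapped back to the exact build scripts that produced it.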

On 27 February 2017 at 14:18, Matthew Jacobs <m...@cloudera.com> wrote:

> Just catching up to this e-mail, though I had seen your code reviews
> and I think this approach makes sense. An additional concern would be
> how to identify how a toolchain package was built, and AFAIK this is
> tricky now if only the 'toolchain ID' is known. Before I saw this
> e-mail I was thinking about this problem (which I think we can address
> separately), and that we might want to write the native-toolchain git
> hash with every toolchain build so that the exact build scripts are
> associated with those build artifacts. I filed
> https://issues.cloudera.org/browse/IMPALA-5002 for this related
> problem.
>
> On Sat, Feb 25, 2017 at 10:22 PM, Henry Robinson <he...@apache.org> wrote:
> > As written, the toolchain can't apparently deal with the possibility of
> > build flags changing, but a dependency version remaining the same.
> >
> > LZ4 has never (afaict) been built with optimization enabled. I have a
> > commit that enables -O3, but that continues to produce artifacts for
> > lz4-1.7.5 with no version change. This is a problem because bootstrapping
> > the toolchain will fail to pick up the new binaries - because the
> > previously downloaded version is still in the local cache, and won't be
> > overwritten because of the version change.
> >
> > I think the simplest way to fix this is to write the toolchain build ID
> to
> > the dependency version file (that's in the local cache only) when it's
> > downloaded. If that ID changes, the dependency will be re-downloaded.
> >
> > This has the disadvantage that any bump in IMPALA_TOOLCHAIN_BUILD_ID will
> > invalidate all dependencies, and bin/bootstrap_toolchain.py will
> > re-download all of them. My feeling is that that cost is better than
> trying
> > to individually determine whether a dependency has changed between
> > toolchain builds.
> >
> > Any thoughts on whether this is the right way to go?
> >
> > Henry
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


[Toolchain-CR] Add a script to build Kudu using existing toolchain artifacts

2017-02-27 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: Add a script to build Kudu using existing toolchain artifacts
..


Patch Set 1:

Matt - could you move this to the native-toolchain project please?

-- 
To view, visit http://gerrit.cloudera.org:8080/6014
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I237580e1545033467a92285ca8bb8db1cf8c804e
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <m...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: No


[Toolchain-CR] IMPALA-4983: Compile LZ4 in release mode

2017-02-27 Thread Henry Robinson (Code Review)
Henry Robinson has abandoned this change.

Change subject: IMPALA-4983: Compile LZ4 in release mode
..


Abandoned

Committed to native-toolchain project.

-- 
To view, visit http://gerrit.cloudera.org:8080/6145
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I8bd113822dfc4df2d76c4393c4b3b3550066dd18
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>


Heads up: restarting Gerrit in a few minutes

2017-02-27 Thread Henry Robinson



Re: [DISCUSS] C++ code sharing amongst Apache {Arrow, Kudu, Impala, Parquet}

2017-02-25 Thread Henry Robinson
Thanks for bringing this up, Wes.

On 25 February 2017 at 14:18, Wes McKinney  wrote:

> Dear Apache Kudu and Apache Impala (incubating) communities,
>
> (I'm not sure the best way to have a cross-list discussion, so I
> apologize if this does not work well)
>
> On the recent Apache Parquet sync call, we discussed C++ code sharing
> between the codebases in Apache Arrow and Apache Parquet, and
> opportunities for more code sharing with Kudu and Impala as well.
>
> As context
>
> * We have an RC out for the 1.0.0 release of apache-parquet-cpp, the
> first C++ release within Apache Parquet. I got involved with this
> project a little over a year ago and was faced with the unpleasant
> decision to copy and paste a significant amount of code out of
> Impala's codebase to bootstrap the project.
>
> * In parallel, we began the Apache Arrow project, which is designed to
> be a complementary library for file formats (like Parquet), storage
> engines (like Kudu), and compute engines (like Impala and pandas).
>
> * As Arrow and parquet-cpp matured, an increasing amount of code
> overlap crept up surrounding buffer memory management and IO
> interface. We recently decided in PARQUET-818
> (https://github.com/apache/parquet-cpp/commit/
> 2154e873d5aa7280314189a2683fb1e12a590c02)
> to remove some of the obvious code overlap in Parquet and make
> libarrow.a/so a hard compile and link-time dependency for
> libparquet.a/so.
>
> * There is still quite a bit of code in parquet-cpp that would better
> fit in Arrow: SIMD hash utilities, RLE encoding, dictionary encoding,
> compression, bit utilities, and so forth. Much of this code originated
> from Impala
>
> This brings me to a next set of points:
>
> * parquet-cpp contains quite a bit of code that was extracted from
> Impala. This is mostly self-contained in
> https://github.com/apache/parquet-cpp/tree/master/src/parquet/util
>
> * My understanding is that Kudu extracted certain computational
> utilities from Impala in its early days, but these tools have likely
> diverged as the needs of the projects have evolved.
>
> Since all of these projects are quite different in their end goals
> (runtime systems vs. libraries), touching code that is tightly coupled
> to either Kudu or Impala's runtimes is probably not worth discussing.
> However, I think there is a strong basis for collaboration on
> computational utilities and vectorized array processing. Some obvious
> areas that come to mind:
>
> * SIMD utilities (for hashing or processing of preallocated contiguous
> memory)
> * Array encoding utilities: RLE / Dictionary, etc.
> * Bit manipulation (packing and unpacking, e.g. Daniel Lemire
> contributed a patch to parquet-cpp around this)
> * Date and time utilities
> * Compression utilities
>

Between Kudu and Impala (at least) there are many more opportunities for
sharing. Threads, logging, metrics, concurrent primitives - the list is
quite long.


>
> I hope the benefits are obvious: consolidating efforts on unit
> testing, benchmarking, performance optimizations, continuous
> integration, and platform compatibility.
>
> Logistically speaking, one possible avenue might be to use Apache
> Arrow as the place to assemble this code. Its thirdparty toolchain is
> small, and it builds and installs fast. It is intended as a library to
> have its headers used and linked against other applications. (As an
> aside, I'm very interested in building optional support for Arrow
> columnar messages into the kudu client).
>

In principle I'm in favour of code sharing, and it seems very much in
keeping with the Apache way. However, practically speaking I'm of the
opinion that it only makes sense to house shared support code in a
separate, dedicated project.

Embedding the shared libraries in, e.g., Arrow naturally limits the scope
of sharing to utilities that Arrow is interested in. It would make no sense
to add a threading library to Arrow if it was never used natively. Muddying
the waters of the project's charter seems likely to lead to user, and
developer, confusion. Similarly, we should not necessarily couple Arrow's
design goals to those it inherits from Kudu and Impala's source code.

I think I'd rather see a new Apache project than re-use a current one for
two independent purposes.


>
> The downside of code sharing, which may have prevented it so far, is
> the logistics of coordinating ASF release cycles and keeping build
> toolchains in sync. It's taken us the past year to stabilize the
> design of Arrow for its intended use cases, so at this point if we
> went down this road I would be OK with helping the community commit to
> a regular release cadence that would be faster than Impala, Kudu, and
> Parquet's respective release cadences. Since members of the Kudu and
> Impala PMC are also on the Arrow PMC, I trust we would be able to
> collaborate to each other's mutual benefit and success.
>
> Note that Arrow does not throw C++ exceptions and similarly follows

Toolchain - versioning dependencies with the same version number

2017-02-25 Thread Henry Robinson
As written, the toolchain apparently can't deal with the possibility of
build flags changing while a dependency version remains the same.

LZ4 has never (afaict) been built with optimization enabled. I have a
commit that enables -O3, but that continues to produce artifacts for
lz4-1.7.5 with no version change. This is a problem because bootstrapping
the toolchain will fail to pick up the new binaries: the previously
downloaded version is still in the local cache, and won't be overwritten
because the version number has not changed.

I think the simplest way to fix this is to write the toolchain build ID to
the dependency version file (that's in the local cache only) when it's
downloaded. If that ID changes, the dependency will be re-downloaded.

This has the disadvantage that any bump in IMPALA_TOOLCHAIN_BUILD_ID will
invalidate all dependencies, and bin/bootstrap_toolchain.py will
re-download all of them. My feeling is that that cost is better than trying
to individually determine whether a dependency has changed between
toolchain builds.
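
To make the proposal concrete, here is a minimal Python sketch of build
ID-based invalidation. It is not the code in bin/bootstrap_toolchain.py or
in the Gerrit patch; the function names and the "version-buildid" stamp
format are assumptions chosen for illustration.

```python
import os


def cache_stamp(version, build_id):
    """Combine dependency version and toolchain build ID into one stamp."""
    return "{0}-{1}".format(version, build_id)


def read_stamp(version_file):
    """Return the stamp stored in the local cache, or None if there is none."""
    if not os.path.exists(version_file):
        return None
    with open(version_file) as f:
        return f.read().strip()


def write_stamp(version_file, version, build_id):
    """Record the stamp after a successful download."""
    with open(version_file, "w") as f:
        f.write(cache_stamp(version, build_id) + "\n")


def needs_download(cached_stamp, version, build_id):
    """True if nothing is cached, or if the version or build ID has changed."""
    return cached_stamp != cache_stamp(version, build_id)
```

Under this scheme a rebuilt lz4-1.7.5 is fetched again as soon as the build
ID is bumped, and so is every other dependency, which is the cost described
above.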

Any thoughts on whether this is the right way to go?

Henry


[Toolchain-CR] IMPALA-4983: Compile LZ4 in release mode

2017-02-24 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4983: Compile LZ4 in release mode
..


Patch Set 1:

Running a build on this branch now - if it succeeds it will publish the 
artifacts and then I'll commit this to master.

-- 
To view, visit http://gerrit.cloudera.org:8080/6145
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bd113822dfc4df2d76c4393c4b3b3550066dd18
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-HasComments: No


[Toolchain-CR] IMPALA-4983: Compile LZ4 in release mode

2017-02-24 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6145

Change subject: IMPALA-4983: Compile LZ4 in release mode
..

IMPALA-4983: Compile LZ4 in release mode

LZ4 was not compiled (apparently ever?) with optimization enabled. In
recent versions this led to a regression in compression time that was
noticeable vs previous LZ4 versions. With optimization enabled, the
regression appears to vanish.

Change-Id: I8bd113822dfc4df2d76c4393c4b3b3550066dd18
---
M source/lz4/build.sh
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Toolchain refs/changes/45/6145/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6145
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8bd113822dfc4df2d76c4393c4b3b3550066dd18
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>


[Toolchain-CR] Add missing word to comment

2017-02-24 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: Add missing word to comment
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I099ef9001c16b5839cfbd5d99c0c5d1a241c1698
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-HasComments: No


[Toolchain-CR] Add missing word to comment

2017-02-24 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6142

Change subject: Add missing word to comment
..

Add missing word to comment

Replication test commit

Change-Id: I099ef9001c16b5839cfbd5d99c0c5d1a241c1698
---
M buildall.sh
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Toolchain refs/changes/42/6142/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I099ef9001c16b5839cfbd5d99c0c5d1a241c1698
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>


Re: status-benchmark.cc compilation time

2017-02-23 Thread Henry Robinson
I think the main problem I want to avoid is paying the cost of linking,
which is expensive for Impala as it often generates multi-hundred-MB
binaries per benchmark or test.

Building the benchmarks during GVO seems the best solution to that to me.

On 23 February 2017 at 10:23, Todd Lipcon <t...@cloudera.com> wrote:

> One thing we've found useful in Kudu to prevent bitrot of benchmarks is to
> actually use gtest and gflags for the benchmark programs.
>
> We set some flag like --benchmark_num_rows or --benchmark_num_iterations
> with a default that's low enough to only run for a second or two, and run
> it as part of our normal test suite. Rarely catches any bugs, but serves to
> make sure that the code keeps working. Then, when a developer wants to
> actually test a change for performance, they can run it with
> --num_iterations=.
>
> Doesn't help the weird case of status-benchmark where *compiling* takes 10
> minutes... but I think the manual unrolling of 1000 status calls in there
> is probably unrealistic anyway regarding how the different options perform
> in a whole-program setting.
>
> -Todd
>
> On Thu, Feb 23, 2017 at 10:20 AM, Zachary Amsden <zams...@cloudera.com>
> wrote:
>
> > Yes.  If you take a look at the benchmark, you'll notice the JNI call to
> > initialize the frontend doesn't even have the right signature anymore.
> > That's one easy way to bitrot while still compiling.
> >
> > Even fixing that isn't enough to get it off the ground.
> >
> >  - Zach
> >
> > On Tue, Feb 21, 2017 at 11:44 AM, Henry Robinson <he...@cloudera.com>
> > wrote:
> >
> > > Did you run . bin/set-classpath.sh before running expr-benchmark?
> > >
> > > On 21 February 2017 at 11:30, Zachary Amsden <zams...@cloudera.com>
> > wrote:
> > >
> > > > Unfortunately some of the benchmarks have actually bit-rotted.  For
> > > > example, expr-benchmark compiles but immediately throws JNI
> exceptions.
> > > >
> > > > On Tue, Feb 21, 2017 at 10:55 AM, Marcel Kornacker <
> > mar...@cloudera.com>
> > > > wrote:
> > > >
> > > > > I'm also in favor of not compiling it on the standard commandline.
> > > > >
> > > > > However, I'm very much against allowing the benchmarks to bitrot.
> As
> > > > > was pointed out, those benchmarks can be valuable tools during
> > > > > development, and keeping them in working order shouldn't really
> > impact
> > > > > the development process.
> > > > >
> > > > > In other words, let's compile them as part of gvo.
> > > > >
> > > > > On Tue, Feb 21, 2017 at 10:50 AM, Alex Behm <
> alex.b...@cloudera.com>
> > > > > wrote:
> > > > > > +1 for not compiling the benchmarks in -notests
> > > > > >
> > > > > > On Mon, Feb 20, 2017 at 7:55 PM, Jim Apple <jbap...@cloudera.com
> >
> > > > wrote:
> > > > > >
> > > > > >> > On which note, would anyone object if we disabled benchmark
> > > > > compilation
> > > > > >> by
> > > > > >> > default when building the BE tests? I mean separating out
> > -notests
> > > > > into
> > > > > >> > -notests and -build_benchmarks (the latter false by default).
> > > > > >>
> > > > > >> I think this is a great idea.
> > > > > >>
> > > > > >> > I don't mind if the benchmarks bitrot as a result, because we
> > > don't
> > > > > run
> > > > > >> > them regularly or pay attention to their output except when
> > > > > developing a
> > > > > >> > feature. Of course, maybe an 'exhaustive' run should build the
> > > > > benchmarks
> > > > > >> > as well just to keep us honest, but I'd be happy if 95% of
> > Jenkins
> > > > > builds
> > > > > >> > didn't bother.
> > > > > >>
> > > > > >> The pre-merge (aka GVM aka GVO) testing builds
> > > > > >> http://jenkins.impala.io:8080/job/all-build-options, which
> builds
> > > > > >> without the "-notests" flag.
> > > > > >>
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Henry Robinson
> > > Software Engineer
> > > Cloudera
> > > 415-994-6679
> > >
> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Re: status-benchmark.cc compilation time

2017-02-23 Thread Henry Robinson
Fair enough, always worth checking the easy things first :)

On 23 February 2017 at 10:20, Zachary Amsden <zams...@cloudera.com> wrote:

> Yes.  If you take a look at the benchmark, you'll notice the JNI call to
> initialize the frontend doesn't even have the right signature anymore.
> That's one easy way to bitrot while still compiling.
>
> Even fixing that isn't enough to get it off the ground.
>
>  - Zach
>
> On Tue, Feb 21, 2017 at 11:44 AM, Henry Robinson <he...@cloudera.com>
> wrote:
>
> > Did you run . bin/set-classpath.sh before running expr-benchmark?
> >
> > On 21 February 2017 at 11:30, Zachary Amsden <zams...@cloudera.com>
> wrote:
> >
> > > Unfortunately some of the benchmarks have actually bit-rotted.  For
> > > example, expr-benchmark compiles but immediately throws JNI exceptions.
> > >
> > > On Tue, Feb 21, 2017 at 10:55 AM, Marcel Kornacker <
> mar...@cloudera.com>
> > > wrote:
> > >
> > > > I'm also in favor of not compiling it on the standard commandline.
> > > >
> > > > However, I'm very much against allowing the benchmarks to bitrot. As
> > > > was pointed out, those benchmarks can be valuable tools during
> > > > development, and keeping them in working order shouldn't really
> impact
> > > > the development process.
> > > >
> > > > In other words, let's compile them as part of gvo.
> > > >
> > > > On Tue, Feb 21, 2017 at 10:50 AM, Alex Behm <alex.b...@cloudera.com>
> > > > wrote:
> > > > > +1 for not compiling the benchmarks in -notests
> > > > >
> > > > > On Mon, Feb 20, 2017 at 7:55 PM, Jim Apple <jbap...@cloudera.com>
> > > wrote:
> > > > >
> > > > >> > On which note, would anyone object if we disabled benchmark
> > > > compilation
> > > > >> by
> > > > >> > default when building the BE tests? I mean separating out
> -notests
> > > > into
> > > > >> > -notests and -build_benchmarks (the latter false by default).
> > > > >>
> > > > >> I think this is a great idea.
> > > > >>
> > > > >> > I don't mind if the benchmarks bitrot as a result, because we
> > don't
> > > > run
> > > > >> > them regularly or pay attention to their output except when
> > > > developing a
> > > > >> > feature. Of course, maybe an 'exhaustive' run should build the
> > > > benchmarks
> > > > >> > as well just to keep us honest, but I'd be happy if 95% of
> Jenkins
> > > > builds
> > > > >> > didn't bother.
> > > > >>
> > > > >> The pre-merge (aka GVM aka GVO) testing builds
> > > > >> http://jenkins.impala.io:8080/job/all-build-options, which builds
> > > > >> without the "-notests" flag.
> > > > >>
> > > >
> > >
> >
> >
> >
> > --
> > Henry Robinson
> > Software Engineer
> > Cloudera
> > 415-994-6679
> >
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


[Toolchain-CR] IMPALA-4966: Add flatbuffers 1.6.0 to toolchain

2017-02-22 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4966: Add flatbuffers 1.6.0 to toolchain
..


Patch Set 1: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6124/1/buildall.sh
File buildall.sh:

Line 252: 
remove extra blank line


-- 
To view, visit http://gerrit.cloudera.org:8080/6124
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I17e01dd9703d2519bb1985f0246c38e6f2f57b92
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: Yes


Re: status-benchmark.cc compilation time

2017-02-21 Thread Henry Robinson
Did you run . bin/set-classpath.sh before running expr-benchmark?

On 21 February 2017 at 11:30, Zachary Amsden <zams...@cloudera.com> wrote:

> Unfortunately some of the benchmarks have actually bit-rotted.  For
> example, expr-benchmark compiles but immediately throws JNI exceptions.
>
> On Tue, Feb 21, 2017 at 10:55 AM, Marcel Kornacker <mar...@cloudera.com>
> wrote:
>
> > I'm also in favor of not compiling it on the standard commandline.
> >
> > However, I'm very much against allowing the benchmarks to bitrot. As
> > was pointed out, those benchmarks can be valuable tools during
> > development, and keeping them in working order shouldn't really impact
> > the development process.
> >
> > In other words, let's compile them as part of gvo.
> >
> > On Tue, Feb 21, 2017 at 10:50 AM, Alex Behm <alex.b...@cloudera.com>
> > wrote:
> > > +1 for not compiling the benchmarks in -notests
> > >
> > > On Mon, Feb 20, 2017 at 7:55 PM, Jim Apple <jbap...@cloudera.com>
> wrote:
> > >
> > >> > On which note, would anyone object if we disabled benchmark
> > compilation
> > >> by
> > >> > default when building the BE tests? I mean separating out -notests
> > into
> > >> > -notests and -build_benchmarks (the latter false by default).
> > >>
> > >> I think this is a great idea.
> > >>
> > >> > I don't mind if the benchmarks bitrot as a result, because we don't
> > run
> > >> > them regularly or pay attention to their output except when
> > developing a
> > >> > feature. Of course, maybe an 'exhaustive' run should build the
> > benchmarks
> > >> > as well just to keep us honest, but I'd be happy if 95% of Jenkins
> > builds
> > >> > didn't bother.
> > >>
> > >> The pre-merge (aka GVM aka GVO) testing builds
> > >> http://jenkins.impala.io:8080/job/all-build-options, which builds
> > >> without the "-notests" flag.
> > >>
> >
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Re: status-benchmark.cc compilation time

2017-02-20 Thread Henry Robinson
On which note, would anyone object if we disabled benchmark compilation by
default when building the BE tests? I mean separating out -notests into
-notests and -build_benchmarks (the latter false by default).

I don't mind if the benchmarks bitrot as a result, because we don't run
them regularly or pay attention to their output except when developing a
feature. Of course, maybe an 'exhaustive' run should build the benchmarks
as well just to keep us honest, but I'd be happy if 95% of Jenkins builds
didn't bother.
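
A sketch of the proposed flag split, written in Python with argparse purely
for illustration (the real build driver is a shell script, and this is not
its implementation). The two flag names are the ones proposed above; the
function name and the returned keys are hypothetical, and the sketch
assumes the two flags are independent of each other.

```python
import argparse


def parse_build_flags(argv):
    """Map the two proposed flags onto what would get compiled."""
    parser = argparse.ArgumentParser(description="Illustrative build flags")
    parser.add_argument("-notests", action="store_true",
                        help="skip building the tests")
    parser.add_argument("-build_benchmarks", action="store_true",
                        help="also build the benchmarks (off by default)")
    args = parser.parse_args(argv)
    return {
        "build_tests": not args.notests,
        "build_benchmarks": args.build_benchmarks,
    }
```

A default invocation would then build the tests but skip the benchmarks,
while an 'exhaustive' job could pass -build_benchmarks explicitly.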

On 20 February 2017 at 19:18, Jim Apple <jbap...@cloudera.com> wrote:

> https://issues.cloudera.org/browse/IMPALA-3784
>
> I'd prefer it removed from git if it is removed from CMakeLists.txt,
> but I'm OK with either that or removing the unrolling.
>
> Also, as a reminder, "-notests" skips building the tests.
>
> On Mon, Feb 20, 2017 at 7:12 PM, Todd Lipcon <t...@cloudera.com> wrote:
> > Hi all,
> >
> > I've noticed that status-benchmark.cc takes 8+ minutes to compile on my
> dev
> > box. It seems this is due to use of macro expansions to unroll a loop
> 1000x
> > for the benchmarks.
> >
> > Whether that's actually a useful benchmark that reflects real-life usage
> of
> > Status, I'm not sure. But I'd wager that no one is looking at the results
> > of this benchmark on a regular basis, and maybe we could remove or
> disable
> > it for a normal build?
> >
> > Any objections to a patch which either disables the unrolling (subject to
> > an #ifdef that could be easily re-enabled) or removes the benchmark from
> > CMakeLists.txt (also easy to re-enable)? This is by far the long pole in
> my
> > compilation.
> >
> > -Todd
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


[Toolchain-CR] Force LZ4 to build static libs

2017-02-15 Thread Henry Robinson (Code Review)
Henry Robinson has submitted this change and it was merged.

Change subject: Force LZ4 to build static libs
..


Force LZ4 to build static libs

Change-Id: I8960a84343c3c0b75090dbae252e642ef932cba4
---
M source/lz4/build.sh
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Henry Robinson: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/6015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I8960a84343c3c0b75090dbae252e642ef932cba4
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>


[Toolchain-CR] Force LZ4 to build static libs

2017-02-15 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: Force LZ4 to build static libs
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8960a84343c3c0b75090dbae252e642ef932cba4
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: No


[Toolchain-CR] Force LZ4 to build static libs

2017-02-15 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6015

Change subject: Force LZ4 to build static libs
..

Force LZ4 to build static libs

Change-Id: I8960a84343c3c0b75090dbae252e642ef932cba4
---
M source/lz4/build.sh
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Toolchain refs/changes/15/6015/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8960a84343c3c0b75090dbae252e642ef932cba4
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>


[Toolchain-CR] IMPALA-4926: Bump LZ4 to 1.7.5

2017-02-14 Thread Henry Robinson (Code Review)
Henry Robinson has submitted this change and it was merged.

Change subject: IMPALA-4926: Bump LZ4 to 1.7.5
..


IMPALA-4926: Bump LZ4 to 1.7.5

The previous 'svn' version of LZ4 was directly checked in to the
repository. This patch removes that (and adds the tarball to the
toolchain's S3 bucket), and also adds LZ4 1.7.5 as the most recent
version.

Change-Id: I1b882a47b438c996fe0339c02c5c5b9bc5885d17
---
M buildall.sh
M source/lz4/build.sh
D source/lz4/lz4-svn/CMakeLists.txt
D source/lz4/lz4-svn/lz4.c
D source/lz4/lz4-svn/lz4.h
D source/lz4/lz4-svn/lz4hc.c
D source/lz4/lz4-svn/lz4hc.h
7 files changed, 8 insertions(+), 1,854 deletions(-)

Approvals:
  Henry Robinson: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5990
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I1b882a47b438c996fe0339c02c5c5b9bc5885d17
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>


[Toolchain-CR] IMPALA-4926: Bump LZ4 to 1.7.5

2017-02-14 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4926: Bump LZ4 to 1.7.5
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5990
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1b882a47b438c996fe0339c02c5c5b9bc5885d17
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: No


[Toolchain-CR] IMPALA-4926: Bump LZ4 to 1.7.5

2017-02-14 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4926: Bump LZ4 to 1.7.5
..


Patch Set 1:

This passed a full toolchain build, and Impala compiles against the newer 
version with no code changes. Anyone want to +2?

-- 
To view, visit http://gerrit.cloudera.org:8080/5990
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1b882a47b438c996fe0339c02c5c5b9bc5885d17
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: No


[Toolchain-CR] IMPALA-4926: Bump LZ4 to 1.7.5

2017-02-13 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5990

Change subject: IMPALA-4926: Bump LZ4 to 1.7.5
..

IMPALA-4926: Bump LZ4 to 1.7.5

The previous 'svn' version of LZ4 was directly checked in to the
repository. This patch removes that (and adds the tarball to the
toolchain's S3 bucket), and also adds LZ4 1.7.5 as the most recent
version.

Change-Id: I1b882a47b438c996fe0339c02c5c5b9bc5885d17
---
M buildall.sh
M source/lz4/build.sh
D source/lz4/lz4-svn/CMakeLists.txt
D source/lz4/lz4-svn/lz4.c
D source/lz4/lz4-svn/lz4.h
D source/lz4/lz4-svn/lz4hc.c
D source/lz4/lz4-svn/lz4hc.h
7 files changed, 8 insertions(+), 1,854 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Toolchain refs/changes/90/5990/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5990
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I1b882a47b438c996fe0339c02c5c5b9bc5885d17
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>


Re: [GitHub] incubator-impala pull request #1: Branch 2.8.0

2017-01-31 Thread Henry Robinson
Ah, good catch, thanks.

On 31 January 2017 at 10:08, Jim Apple <jbap...@cloudera.com> wrote:

> I answered on the PR with the wiki page instructions on how to file
> bugs and patches. The PR didn't actually have any new code and the
> verb tense made me think this might be a feature request, not a patch.
>
> On Tue, Jan 31, 2017 at 9:57 AM, Henry Robinson <he...@cloudera.com>
> wrote:
> > Have you reached out to the author with suggestions for how to contribute
> > their patch? Looks like a patch we might want to consider.
> >
> > On 31 January 2017 at 09:34, Jim Apple <jbap...@cloudera.com> wrote:
> >
> > After replying, I asked on http://infra.chat about how to close this.
> > The right answer is apparently to ask in infra.chat or file an Apache
> > infra ticket.
> >
> > On Tue, Jan 31, 2017 at 9:33 AM, Humbedooh <g...@git.apache.org> wrote:
> >> Github user Humbedooh closed the pull request at:
> >>
> >> https://github.com/apache/incubator-impala/pull/1
> >>
> >>
> >> ---
> >> If your project is set up for it, you can reply to this email and have your
> >> reply appear on GitHub as well. If your project does not have this feature
> >> enabled and wishes so, or if the feature is enabled but not working, please
> >> contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
> >> with INFRA.
> >> ---
> >
> >
> >
> >
> > --
> > Henry Robinson
> > Software Engineer
> > Cloudera
> > 415-994-6679
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Re: [GitHub] incubator-impala pull request #1: Branch 2.8.0

2017-01-31 Thread Henry Robinson
Have you reached out to the author with suggestions for how to contribute
their patch? Looks like a patch we might want to consider.

On 31 January 2017 at 09:34, Jim Apple <jbap...@cloudera.com> wrote:

After replying, I asked on http://infra.chat about how to close this.
The right answer is apparently to ask in infra.chat or file an Apache
infra ticket.

On Tue, Jan 31, 2017 at 9:33 AM, Humbedooh <g...@git.apache.org> wrote:
> Github user Humbedooh closed the pull request at:
>
> https://github.com/apache/incubator-impala/pull/1
>
>
> ---
> If your project is set up for it, you can reply to this email and have your
> reply appear on GitHub as well. If your project does not have this feature
> enabled and wishes so, or if the feature is enabled but not working, please
> contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
> with INFRA.
> ---




-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Re: Tweet about 2.8.0 release?

2017-01-23 Thread Henry Robinson
Done! Congratulations on the release everyone.

On 22 January 2017 at 23:09, Jim Apple <jbap...@cloudera.com> wrote:

> 2.8.0 was just released. Can someone tweet it from
> <https://twitter.com/apacheimpala>?
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Does anyone use 'mini-impala-cluster'?

2017-01-13 Thread Henry Robinson
I'm doing some cleaning up and I don't think the 'mini-impala-cluster'
binary is used anywhere. If I can delete it, I can also get rid of
InProcessStatestore which is a pain to maintain.

Does anyone use the 'mini-impala-cluster' binary?

Henry


Re: run-all.sh stuck at 'Starting kms'

2017-01-11 Thread Henry Robinson
I think that's actually stuck at synchronizing NTP, a prerequisite for
starting Kudu. Is your system clock in sync?
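
A minimal sketch of checking clock synchronization before rerunning the script, assuming `ntpstat` is installed and exits 0 only when the clock is synchronized. The helper name `wait_for_clock_sync` and its arguments are hypothetical and are not part of run-all.sh:

```shell
# Hypothetical helper (not part of run-all.sh): retry a sync-check command
# until it succeeds or the attempts run out.
# Usage: wait_for_clock_sync [check-command] [attempts] [delay-seconds]
wait_for_clock_sync() {
  wfcs_cmd="${1:-ntpstat}"   # assumption: exits 0 only when synchronized
  wfcs_attempts="${2:-5}"
  wfcs_delay="${3:-2}"
  wfcs_i=1
  while [ "$wfcs_i" -le "$wfcs_attempts" ]; do
    if "$wfcs_cmd" >/dev/null 2>&1; then
      echo "clock synchronized"
      return 0
    fi
    wfcs_i=$((wfcs_i + 1))
    sleep "$wfcs_delay"
  done
  echo "clock not synchronized after $wfcs_attempts attempts" >&2
  return 1
}
```

Called with no arguments, it polls `ntpstat` five times, two seconds apart, and returns non-zero if the clock never reports as synchronized.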

On 11 January 2017 at 16:06, Marcel Kornacker <mar...@cloudera.com> wrote:

> Has anyone else seen that? What log files should I be looking at for
> diagnostics?
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Re: "target version" and "fix version" fields for 2.8/2.9

2017-01-11 Thread Henry Robinson
The easiest way to do that is to 'release' 2.8 on JIRA, which as part of
the workflow asks what you want to do with the 2.8 items that aren't fixed.
Usually we bump them all to the next version, then ask people to triage
them out if they're a) assigned to them and b) not going to be worked on
soon.

On 11 January 2017 at 08:52, Tim Armstrong <tarmstr...@cloudera.com> wrote:

> I'll respond to the second part - I'm not sure about the first. I think we
> just need to go through and move JIRAs with a target version of 2.8 to 2.9
> (or to backlog if that fits better).
>
> On Wed, Jan 11, 2017 at 7:16 AM, Lars Volker <l...@cloudera.com> wrote:
>
> > Hi all,
> >
> > Do we have an automated way of checking "fix version" fields for
> > correctness? Out of habit I had put "Impala 2.8.0" there, until I realized
> > that the 2.8.0-rc had been cut already. I went through my changes, manually
> > checking whether the fix was included in the rc branch, and set the "fix
> > version" to "Impala 2.9.0" if it wasn't. Do we have tooling to validate and
> > adjust the "fix version" automatically?
> >
> > Similarly I noticed that some issues were created with "target version" =
> > "Impala 2.8.0", but now their fixes did not make it into 2.8. Is there a
> > policy on what to do with those?
> >
> > Thanks, Lars
> >
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Upcoming RPC changes - sequencing of patches

2017-01-09 Thread Henry Robinson
Hi -

I've been making a lot of good progress on IMPALA-2567 - replacing our
Thrift RPC stack with one based on Apache Kudu's. You can see the progress
I'm making on my personal branch here:

http://github.com/HenryR/Impala/tree/krpc

It's time to start merging some of these changes to master, to prevent
divergence and to make some of the scalability benefits available to all.
Here are my thoughts on the sequencing:

1. Commit thirdparty dependency changes that bring in protobuf, crcutil and
libev

2. Commit Kudu dependencies (including gutil upgrade). They will be unused,
but will compile as part of the usual build.

3. Commit change to add new RpcMgr abstraction, and port of Statestore
services to KRPC. We hope to have security finished by this point so as not
to disable security in trunk at any time.

4. Commit port of catalog service to KRPC.

5. Commit port of ImpalaInternalService to KRPC. This is a large change, as
some of the query execution paths change significantly now that we don't
need application-level thread pools etc.

These steps can happen concurrently, but that's the order I want to start
them in. I plan to submit the changes for step 1 very soon, and step 2 (at
least) later this week. The Impala-side changes remain a work in progress -
particularly the ImpalaInternalService ones - but have solidified enough,
and have undergone sufficient testing, that I think they are nearing
readiness for review.

Let me know if you have questions!

Cheers,
Henry


Re: [DISCUSS] Release 2.8.0 soon?

2017-01-05 Thread Henry Robinson
Same deal with me - I don't support reverting in master.

I think it's fine to branch from before that change and cherry pick what
you can. As the RM you decide what goes in and what doesn't - if the cherry
picks are too hard, it's fine to skip them. Patching is also fine.
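
A minimal sketch of that approach, assuming a local clone: branch from the parent of the problematic commit, then cherry-pick later fixes and skip any that do not apply cleanly. The helper name `branch_before` and its arguments are hypothetical; only the `git` subcommands are real:

```shell
# Hypothetical helper: create a release branch from the parent of a
# problematic commit, then cherry-pick later fixes onto it.
# Usage: branch_before <new-branch> <bad-commit> [commit-to-pick ...]
branch_before() {
  bb_branch="$1"
  bb_bad="$2"
  shift 2
  # "<commit>^" names the first parent, i.e. the tree just before the change.
  git checkout -q -b "$bb_branch" "${bb_bad}^" || return 1
  for bb_pick in "$@"; do
    # -x appends "(cherry picked from commit ...)" to the new commit message.
    if ! git cherry-pick -x "$bb_pick" >/dev/null 2>&1; then
      # Too hard to apply cleanly: undo the attempt and skip this commit.
      git cherry-pick --abort >/dev/null 2>&1 || true
      echo "skipped $bb_pick" >&2
    fi
  done
}
```

The release manager would pass the hash of the change being excluded as the second argument, followed by whichever later commits should still go into the release.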

On Thu, Jan 5, 2017 at 12:12 PM Tim Armstrong <tarmstr...@cloudera.com>
wrote:

> Oh yes my reading comprehension was bad. I don't think it makes sense to
> revert it on master - I thought you meant reverting it on the branch.
>
> The bugfix is small and straightforward - maybe it's easiest if I just put
> that together and put it up for review.
>
> - Tim
>
> On Thu, Jan 5, 2017 at 12:02 PM, Jim Apple <jbap...@cloudera.com> wrote:
>
> > Just to clarify: when I said reverting it, I meant reverting it in
> > master, too, then cherry picking that change to the branch. I'd rather
> > keep the branch free from as many non-master commits as possible.
> >
> > On Thu, Jan 5, 2017 at 11:55 AM, Tim Armstrong <tarmstr...@cloudera.com>
> > wrote:
> > > +1 for reverting it. It doesn't add any new functionality so I don't see
> > > the value in including it in the release.
> > >
> > > On Thu, Jan 5, 2017 at 10:58 AM, Henry Robinson <he...@cloudera.com>
> > > wrote:
> > >
> > >> +1 for reverting it. It's a recent, major change and it's still
> > >> settling.
> > >>
> > >> On 5 January 2017 at 10:49, Jim Apple <jbap...@cloudera.com> wrote:
> > >>
> > >> > Yes, that is in the branch:
> > >> > https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git;a=shortlog;h=refs/heads/branch-2.8.0
> > >> >
> > >> > Here are some options:
> > >> >
> > >> > 1. Burn this branch, make a new one without the commit but with
> > >> > everything else. Pros: no blocker. Cons: cherry-picking hell.
> > >> >
> > >> > 2. Take the branch before this commit. Pros: no blocker. Cons: missing
> > >> > other bug fixes
> > >> >
> > >> > 3. Wait for a fix. Pros: no blocker. Cons: delay
> > >> >
> > >> > 4. Commit to master a git revert of that patch. Pros: no blocker;
> > >> > fixes blocker on branch and master. Cons: add noise to commit history
> > >> >
> > >> > I'd like to git revert it. What do you all think?
> > >> >
> > >> > On Thu, Jan 5, 2017 at 10:39 AM, Tim Armstrong <tarmstr...@cloudera.com>
> > >> > wrote:
> > >> > > This one:
> > >> > >
> > >> > > http://gerrit.cloudera.org:8080/4418
> > >> > >
> > >> > > On 5 Jan 2017 10:15 AM, "Jim Apple" <jbap...@cloudera.com> wrote:
> > >> > >
> > >> > >> Which commit introduced it?
> > >> > >>
> > >> > >> On Thu, Jan 5, 2017 at 10:12 AM, Tim Armstrong <tarmstr...@cloudera.com>
> > >> > >> wrote:
> > >> > >> > I think we have some open blockers for 2.8. Or at least one that was
> > >> > >> > introduced in a recent commit.
> > >> > >> > https://issues.cloudera.org/browse/IMPALA-4707. Do we plan to include a
> > >> > >> > fix or just exclude the commit that introduced it?
> > >> > >> >
> > >> > >> > On 5 Jan 2017 9:09 AM, "Jim Apple" <jbap...@cloudera.com> wrote:
> > >> > >> >
> > >> > >> > I have now also tested the docs build:
> > >> > >> > http://jenkins.impala.io:8080/view/Utility/job/docs-build/92/
> > >> > >> >
> > >> > >> > On Thu, Jan 5, 2017 at 8:28 AM, Jim Apple <jbap...@cloudera.com> wrote:
> > >> > >> >> I have now tested this hash (4fa9270e647b9c097295dcc13d97136cca3139ad)

Re: [DISCUSS] Release 2.8.0 soon?

2017-01-05 Thread Henry Robinson
+1 for reverting it. It's a recent, major change and it's still settling.

On 5 January 2017 at 10:49, Jim Apple <jbap...@cloudera.com> wrote:

> Yes, that is in the branch:
> https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git;a=shortlog;h=refs/heads/branch-2.8.0
>
> Here are some options:
>
> 1. Burn this branch, make a new one without the commit but with
> everything else. Pros: no blocker. Cons: cherry-picking hell.
>
> 2. Take the branch before this commit. Pros: no blocker. Cons: missing
> other bug fixes
>
> 3. Wait for a fix. Pros: no blocker. Cons: delay
>
> 4. Commit to master a git revert of that patch. Pros: no blocker;
> fixes blocker on branch and master. Cons: add noise to commit history
>
> I'd like to git revert it. What do you all think?
>
> On Thu, Jan 5, 2017 at 10:39 AM, Tim Armstrong <tarmstr...@cloudera.com>
> wrote:
> > This one:
> >
> > http://gerrit.cloudera.org:8080/4418
> >
> > On 5 Jan 2017 10:15 AM, "Jim Apple" <jbap...@cloudera.com> wrote:
> >
> >> Which commit introduced it?
> >>
> >> On Thu, Jan 5, 2017 at 10:12 AM, Tim Armstrong <tarmstr...@cloudera.com>
> >> wrote:
> >> > I think we have some open blockers for 2.8. Or at least one that was
> >> > introduced in a recent commit.
> >> > https://issues.cloudera.org/browse/IMPALA-4707. Do we plan to include a
> >> > fix or just exclude the commit that introduced it?
> >> >
> >> > On 5 Jan 2017 9:09 AM, "Jim Apple" <jbap...@cloudera.com> wrote:
> >> >
> >> > I have now also tested the docs build:
> >> > http://jenkins.impala.io:8080/view/Utility/job/docs-build/92/
> >> >
> >> > On Thu, Jan 5, 2017 at 8:28 AM, Jim Apple <jbap...@cloudera.com> wrote:
> >> >> I have now tested this hash (4fa9270e647b9c097295dcc13d97136cca3139ad)
> >> >> on public Jenkins:
> >> >>
> >> >> http://jenkins.impala.io:8080/view/Utility/job/parallel-all-tests/130/
> >> >> http://jenkins.impala.io:8080/view/Utility/job/ubuntu-14.04-from-scratch/434/
> >> >> http://jenkins.impala.io:8080/view/Utility/job/ubuntu-14.04-from-scratch/435/
> >> >> http://jenkins.impala.io:8080/view/Utility/job/ubuntu-14.04-from-scratch/436/
> >> >>
> >> >> That covers RAT (the tool for checking copyright compliance), various
> >> >> build options (including ninja, release, asan, shared libs), loading
> >> >> the data from scratch and running all tests in core and in exhaustive,
> >> >> clang-tidy, and the build we instruct IPMC release testers to run
> >> >> (bin/bootstrap_build.sh).
> >> >>
> >> >> I have also created a git branch:
> >> >> https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git;a=shortlog;h=refs/heads/branch-2.8.0
> >> >>
> >> >> I am working on a commit to add a disclaimer to the docs
> >> >> (https://gerrit.cloudera.org/#/c/5610/) and then I will upload a
> >> >> release candidate tarball.
> >> >>
> >> >> Please prepare yourself to vote. Instructions are here:
> >> >>
> >> >> https://cwiki.apache.org/confluence/display/IMPALA/DRAFT%3A+How+to+Release#DRAFT:HowtoRelease-HowtoVoteonaReleaseCandidate
> >> >>
> >> >> On Wed, Jan 4, 2017 at 7:17 PM, Jim Apple <jbap...@cloudera.com> wrote:
> >> >>>> I'd figure out a way to add a big caveat to the docs. Maybe on the
> >> >>>> landing page? Even better if there's a template we can add a caveat
> >> >>>> to that appears on every page.
> >> >>>
> >> >>> I like this idea. I'll prepare a patch for the landing page.
> >> >>>
> >> >>> I don't think there is a simple way to do it on every page. John,
> >> >>> Laurel, am I wrong about that?
> >>
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Re: [DISCUSS] Release 2.8.0 soon?

2017-01-04 Thread Henry Robinson
+1 to releasing 2.8.0 - it's been a long time since 2.7.0 and a lot has
been done.

I'd also like to release 2.7.1 in the near future, and will try to find
time to be the RM for that.

On 4 January 2017 at 14:15, Jim Apple  wrote:

> Oh, and a specific topic for discussion:
>
> Should we include the docs in the source tarball? My feeling is yes,
> but they are not really cleaned up yet, and so contain a lot of
> Cloudera-specific info.
>
> Pros of including the docs in the tarball: Users get a docs tarball
> from which they can build the hundreds of pages of usable
> documentation.
>
> Cons: 1. Confused users about who runs the project. (Correct answer:
> Apache Impala PPMC. Confused answer: Cloudera), 2. Possible failed
> IPMC release vote that slows down the release.
>

I'd figure out a way to add a big caveat to the docs. Maybe on the landing
page? Even better if there's a template we can add a caveat to that appears
on every page.



>
> On Wed, Jan 4, 2017 at 2:09 PM, Jim Apple  wrote:
> > This is not a [VOTE] thread. Everyone is encouraged to participate.
> >
> > I am volunteering to be a release manager for 2.8.0. I am provisionally
> > testing
> >
> > https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git;a=tree;h=4fa9270e647b9c097295dcc13d97136cca3139ad;hb=4fa9270e647b9c097295dcc13d97136cca3139ad
> >
> > to branch from.
> >
> > Are there any objections to releasing 2.8.0 soon? Keep in mind this is
> > not your last chance to speak - there will be at least two votes, one
> > for PPMC releasing and one for IPMC releasing. See
> >
> > https://cwiki.apache.org/confluence/display/IMPALA/DRAFT%3A+How+to+Release
>


[Toolchain-CR] IMPALA-4652: Add crcutil to toolchain

2016-12-18 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4652: Add crcutil to toolchain
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5522
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf8b39082914b1b2932b4ce7efbd2cc4f5f69743
Gerrit-PatchSet: 3
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-HasComments: No


[Toolchain-CR] Add autotools packages as build prerequisites

2016-12-18 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: Add autotools packages as build prerequisites
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5524
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I43218d1bdd4d660f1e50774a90de1733b4be10ee
Gerrit-PatchSet: 3
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-HasComments: No


[Toolchain-CR] Add autotools packages as build prerequisites

2016-12-17 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: Add autotools packages as build prerequisites
..


Patch Set 3:

Changed to only enable autotools for packages that require them, rather than 
everywhere. This passed a full toolchain build on all platforms.

-- 
To view, visit http://gerrit.cloudera.org:8080/5524
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I43218d1bdd4d660f1e50774a90de1733b4be10ee
Gerrit-PatchSet: 3
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-HasComments: No


Re: Any autotools experts?

2016-12-17 Thread Henry Robinson
That's what I wound up doing. I'd rather not, because it seems like there's
a setup problem with the toolchain autotools, but only enabling toolchain
autotools for certain builds worked well and doesn't seem like too much of
a hack.
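
A rough sketch of what per-build enabling could look like, assuming each toolchain package installs under `<root>/<package>/bin`. That layout and the helper name `with_toolchain_autotools` are assumptions for illustration; this is not the `enable_toolchain_autotools()` function from the change under review:

```shell
# Hypothetical sketch, not the real functions.sh implementation.
# Run one build command with the toolchain's autotools first on PATH.
# The function body is a subshell, so the caller's PATH is left untouched
# for every other package.
# Usage: with_toolchain_autotools <toolchain-root> <command> [args ...]
with_toolchain_autotools() (
  root="$1"   # assumed layout: <root>/<package>/bin
  shift
  for pkg in autoconf automake libtool; do
    PATH="$root/$pkg/bin:$PATH"
  done
  export PATH
  "$@"
)
```

Packages that need autoreconf would wrap their build step in this call, while everything else keeps resolving the system autotools.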

On 16 December 2016 at 14:04, Tim Armstrong <tarmstr...@cloudera.com> wrote:

> Maybe we should sidestep the problem for now and only use the toolchain
> autotools for the packages that need it?
>
> I'm guessing somehow the autotools build isn't set up right but it's hard
> to know where to look to fix it.
>
> On Fri, Dec 16, 2016 at 1:46 PM, Henry Robinson <he...@cloudera.com>
> wrote:
>
> > The problem is that it's the toolchain Kudu's snappy that doesn't build,
> > and they claim that the autoreconf step is needed for their build. It would
> > be good not to rebuild the components that we both depend on, but right now
> > I'd like to avoid shaving that particular yak.
> >
> > On 16 December 2016 at 13:16, Tim Armstrong <tarmstr...@cloudera.com>
> > wrote:
> >
> > > I had a bit of a look. It doesn't make a lot of sense to me. It seems like
> > > we can build snappy fine if we don't run autoreconf.
> > >
> > > On Fri, Dec 16, 2016 at 10:56 AM, Henry Robinson <he...@apache.org>
> > > wrote:
> > >
> > > > I'm trying to add auto[make|conf] and libtool to our toolchain. Everything
> > > > almost works, except for when the Kudu build calls autoreconf -fvi for
> > > > snappy. The error occurs when autoreconf calls autoconf --force.
> > > > I've discovered that removing the toolchain auto*make* from the path fixes
> > > > the issue, which is kind of weird.
> > > >
> > > > What is strange is that the toolchain version is exactly the same as the
> > > > system one. The error suggests that the AC_DEFINE macro (which is in
> > > > general.m4) can't be found. What I don't know is how to change where it's
> > > > looked for. (I've edited ACLOCAL_PATH to no effect).
> > > >
> > > > Any ideas?
> > > >
> > > > autoreconf: Entering directory `.'
> > > > autoreconf: configure.ac: not using Gettext
> > > > autoreconf: running: aclocal --force -I m4
> > > > autoreconf: configure.ac: tracing
> > > > autoreconf: running: libtoolize --copy --force
> > > > libtoolize: putting auxiliary files in `.'.
> > > > libtoolize: copying file `./ltmain.sh'
> > > > libtoolize: putting macros in AC_CONFIG_MACRO_DIR, `m4'.
> > > > libtoolize: copying file `m4/libtool.m4'
> > > > libtoolize: copying file `m4/ltoptions.m4'
> > > > libtoolize: copying file `m4/ltsugar.m4'
> > > > libtoolize: copying file `m4/ltversion.m4'
> > > > libtoolize: copying file `m4/lt~obsolete.m4'
> > > > autoreconf: running:
> > > > /data/henry/src/cloudera/native-toolchain/build/autoconf-2.69/bin/autoconf
> > > > --force
> > > > configure.ac:42: error: possibly undefined macro: AC_DEFINE
> > > >   If this token and others are legitimate, please use m4_pattern_allow.
> > > > See the Autoconf documentation.
> > > > configure.ac:44: error: possibly undefined macro: AC_MSG_FAILURE
> > > > autoreconf:
> > > > /data/henry/src/cloudera/native-toolchain/build/autoconf-2.69/bin/autoconf
> > > > failed with exit status: 1
> > > >
> > >
> >
> >
> >
> > --
> > Henry Robinson
> > Software Engineer
> > Cloudera
> > 415-994-6679
> >
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


[Toolchain-CR] Add autotools packages as build prerequisites

2016-12-17 Thread Henry Robinson (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5524

to look at the new patch set (#3).

Change subject: Add autotools packages as build prerequisites
..

Add autotools packages as build prerequisites

* Add autoconf, automake and libtool.
* Versions are those installed in Ubuntu 14.04.
* Autotools are only enabled for those packages that require them. See
  enable_toolchain_autotools().

Change-Id: I43218d1bdd4d660f1e50774a90de1733b4be10ee
---
M functions.sh
M init.sh
A source/autoconf/build.sh
A source/automake/build.sh
A source/libtool/build.sh
5 files changed, 141 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Toolchain refs/changes/24/5524/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5524
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I43218d1bdd4d660f1e50774a90de1733b4be10ee
Gerrit-PatchSet: 3
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>


[Toolchain-CR] IMPALA-4652: Add crcutil to toolchain

2016-12-17 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new patch set (#3).

Change subject: IMPALA-4652: Add crcutil to toolchain
..

IMPALA-4652: Add crcutil to toolchain

Note: crcutil does not have any release tarballs, so we borrow the one
built for Apache Kudu. As a result, the build version is a git hash,
rather than e.g. 1.0.

Change-Id: Ibf8b39082914b1b2932b4ce7efbd2cc4f5f69743
---
M buildall.sh
A source/crcutil/build.sh
2 files changed, 46 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Toolchain refs/changes/22/5522/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5522
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibf8b39082914b1b2932b4ce7efbd2cc4f5f69743
Gerrit-PatchSet: 3
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>


Any autotools experts?

2016-12-16 Thread Henry Robinson
I'm trying to add auto[make|conf] and libtool to our toolchain. Everything
almost works, except for when the Kudu build calls autoreconf -fvi for
snappy. The error occurs when autoreconf calls autoconf --force.
I've discovered that removing the toolchain auto*make* from the path fixes
the issue, which is kind of weird.

What is strange is that the toolchain version is exactly the same as the
system one. The error suggests that the AC_DEFINE macro (which is in
general.m4) can't be found. What I don't know is how to change where it's
looked for. (I've edited ACLOCAL_PATH to no effect).

Any ideas?

autoreconf: Entering directory `.'
autoreconf: configure.ac: not using Gettext
autoreconf: running: aclocal --force -I m4
autoreconf: configure.ac: tracing
autoreconf: running: libtoolize --copy --force
libtoolize: putting auxiliary files in `.'.
libtoolize: copying file `./ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIR, `m4'.
libtoolize: copying file `m4/libtool.m4'
libtoolize: copying file `m4/ltoptions.m4'
libtoolize: copying file `m4/ltsugar.m4'
libtoolize: copying file `m4/ltversion.m4'
libtoolize: copying file `m4/lt~obsolete.m4'
autoreconf: running:
/data/henry/src/cloudera/native-toolchain/build/autoconf-2.69/bin/autoconf
--force
configure.ac:42: error: possibly undefined macro: AC_DEFINE
  If this token and others are legitimate, please use m4_pattern_allow.
See the Autoconf documentation.
configure.ac:44: error: possibly undefined macro: AC_MSG_FAILURE
autoreconf:
/data/henry/src/cloudera/native-toolchain/build/autoconf-2.69/bin/autoconf
failed with exit status: 1
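
Since removing one tool from the path changes the outcome, a first diagnostic step might be to confirm which binary each autotools command actually resolves to. This is a minimal sketch with a hypothetical helper name, `report_autotools`; it only reports what the shell finds on PATH and does not by itself explain the AC_DEFINE failure:

```shell
# Hypothetical diagnostic helper: print which binary each autotools
# command resolves to on the current PATH.
report_autotools() {
  for ra_tool in autoconf autoreconf automake aclocal libtoolize; do
    ra_path=$(command -v "$ra_tool" 2>/dev/null) || ra_path="(not found)"
    echo "$ra_tool: $ra_path"
  done
}
```

Running it once with the toolchain directories on PATH and once without shows whether a mixed set of system and toolchain tools is being picked up.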

