Re: [VOTE] 2.10.0 release candidate 2 (RC2)

2017-08-31 Thread Alexander Behm
Thanks for testing, Jim. I looked into your build but could not determine
what happened. I found no ASAN output in the logs. It's interesting that
this test also fails with a crash:

ERROR at teardown of TestGrantRevoke.test_role_update[exec_option:
{'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0,
'disable_codegen': False, 'abort_on_error': 1,
'exec_single_node_rows_threshold': 0} | table_format: text/none]

Looks like these two completely unrelated tests both failed in a strange
way.

It's also interesting that many FE unit tests failed with NoClassDefFound
(very unusual).

Could it be that two jobs were scheduled on the same worker? Alternatively,
maybe another job did not clean up after itself and your run landed on an
unclean workspace, leading to problems? This smells like an infra problem to
me. Maybe do another run?





On Thu, Aug 31, 2017 at 8:02 PM, Jim Apple  wrote:

> This ASAN testing failed:
>
> https://jenkins.impala.io/view/Utility/job/ubuntu-16.04-from-scratch/218/consoleFull
> failed in query_test/test_udfs.py::TestUdfExecution::test_ir_functions[exec_option:
> {'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'exec_single_node_rows_threshold': 0, 'enable_expr_rewrites': False} |
> table_format: text/none]. Looks like a crash to me.
>
> I didn't see a corresponding bug. Has anyone else seen something like
> this before?
>
> On Thu, Aug 31, 2017 at 1:15 PM, Jim Apple  wrote:
> > BTW, this Jenkins job includes the log of what it tested, which
> > follows the Release Guide, so you should be able to follow along OK.
> > All committers should have access to run that job, too, if you don't
> > trust my result.
> >
> > I am also testing exhaustive (not just core) tests at
> > https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/, builds
> > 218-220 (once the instances come up).
> >
> > I tested with ImpalaLZO at this commit:
> > https://github.com/cloudera/impala-lzo/tree/62c4b94ed6e89f0ce2068280864546ebccfb0729
> >
> > On Thu, Aug 31, 2017 at 6:19 AM, Jim Apple  wrote:
> >> +1
> >>
> >> https://jenkins.impala.io/job/release-test/20/console
> >>
> >> This tested following
> >> https://cwiki.apache.org/confluence/display/IMPALA/How+to+load+and+run+Impala+tests
> >> and https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release#HowtoRelease-HowtoVoteonaReleaseCandidate.
> >>
> >> On Wed, Aug 30, 2017 at 11:35 PM, Bharath Vissapragada
> >>  wrote:
> >>> This is a vote to release Impala 2.10.0.
> >>>
> >>> - The artefacts for testing can be downloaded from <
> >>> https://dist.apache.org/repos/dist/dev/incubator/impala/2.10.0/RC2/>
> >>>
> >>> - The git tag for this release candidate is 2.10.0-rc2 and treehash is
> >>> visible at
> >>> <https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git;a=tree;hb=23d79462da5d0108709e8b1399c97606f4ebdf92>
> >>>
> >>> Please vote +1 or -1. -1 votes should be accompanied by an explanation of
> >>> the reason. Only PPMC members and mentors have binding votes, but other
> >>> community members are encouraged to cast non-binding votes. This vote will
> >>> pass if there are 3 binding +1 votes and more binding +1 votes than -1
> >>> votes.
> >>>
> >>> This wiki page describes how to check the release before you vote:
> >>> https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release#HowtoRelease-HowtoVoteonaReleaseCandidate
> >>>
> >>> The vote will be open until the end of day, September 5th, Pacific time
> >>> zone (UTC-08:00).
> >>> Once the vote passes the Impala PPMC vote, it still must pass the incubator
> >>> PMC vote before a release is made.
>


Re: [VOTE] 2.10.0 release candidate 2 (RC2)

2017-08-31 Thread Jim Apple
This ASAN testing failed:

https://jenkins.impala.io/view/Utility/job/ubuntu-16.04-from-scratch/218/consoleFull
failed in 
query_test/test_udfs.py::TestUdfExecution::test_ir_functions[exec_option:
{'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
'exec_single_node_rows_threshold': 0, 'enable_expr_rewrites': False} |
table_format: text/none]. Looks like a crash to me.

I didn't see a corresponding bug. Has anyone else seen something like
this before?

On Thu, Aug 31, 2017 at 1:15 PM, Jim Apple  wrote:
> BTW, this Jenkins job includes the log of what it tested, which
> follows the Release Guide, so you should be able to follow along OK.
> All committers should have access to run that job, too, if you don't
> trust my result.
>
> I am also testing exhaustive (not just core) tests at
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/, builds
> 218-220 (once the instances come up).
>
> I tested with ImpalaLZO at this commit:
> https://github.com/cloudera/impala-lzo/tree/62c4b94ed6e89f0ce2068280864546ebccfb0729
>
> On Thu, Aug 31, 2017 at 6:19 AM, Jim Apple  wrote:
>> +1
>>
>> https://jenkins.impala.io/job/release-test/20/console
>>
>> This tested following
>> https://cwiki.apache.org/confluence/display/IMPALA/How+to+load+and+run+Impala+tests
>> and 
>> https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release#HowtoRelease-HowtoVoteonaReleaseCandidate.
>>
>> On Wed, Aug 30, 2017 at 11:35 PM, Bharath Vissapragada
>>  wrote:
>>> This is a vote to release Impala 2.10.0.
>>>
>>> - The artefacts for testing can be downloaded from <
>>> https://dist.apache.org/repos/dist/dev/incubator/impala/2.10.0/RC2/>
>>>
>>> - The git tag for this release candidate is 2.10.0-rc2 and treehash is
>>> visible at
>>> <https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git;a=tree;hb=23d79462da5d0108709e8b1399c97606f4ebdf92>
>>>
>>> Please vote +1 or -1. -1 votes should be accompanied by an explanation of
>>> the reason. Only PPMC members and mentors have binding votes, but other
>>> community members are encouraged to cast non-binding votes. This vote will
>>> pass if there are 3 binding +1 votes and more binding +1 votes than -1
>>> votes.
>>>
>>> This wiki page describes how to check the release before you vote:
>>> *https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release#HowtoRelease-HowtoVoteonaReleaseCandidate
>>> *
>>>
>>> The vote will be open until the end of day, September 5th, Pacific time
>>> zone (UTC-08:00).
>>> Once the vote passes the Impala PPMC vote, it still must pass the incubator
>>> PMC vote before a release is made.


Re: jenkins.impala.io pre-existing workspace

2017-08-31 Thread Jim Apple
Also, to be clear, I don't have the cycles to lead the fix-the-cleanup
task at the moment.

On Wed, Aug 30, 2017 at 4:45 PM, Jim Apple  wrote:
> The workspace cleanup isn't working - see the last bit of any recent
> ub1604 job: 
> https://jenkins.impala.io/view/Utility/job/ubuntu-16.04-from-scratch/206/console
>
> 03:56:40.920 [WS-CLEANUP] Deleting project workspace...Cannot delete
> workspace :remote file operation failed: /home/ubuntu at
> hudson.remoting.Channel@4384d5b9:ubuntu-16.04 (i-032d527b9c801df4c):
> java.io.IOException: Unable to delete '/home/ubuntu'. Tried 3 times
> (of a maximum of 3) waiting 0.1 sec between attempts.
> 03:56:48.161 ERROR: Step ‘Delete workspace when build is done’ failed:
> Cannot delete workspace: remote file operation failed: /home/ubuntu at
> hudson.remoting.Channel@4384d5b9:ubuntu-16.04 (i-032d527b9c801df4c):
> java.io.IOException: Unable to delete '/home/ubuntu'. Tried 3 times
> (of a maximum of 3) waiting 0.1 sec between attempts.
>
> The workspace is $HOME, so you can't just delete it without being root.
>
> This could be changed to
>
> 1. A post-build script to "rm -rf ~/*". This doesn't reset everything,
> though - the job makes changes to other parts of the filesystem.
>
> 2. A post-build script to "sudo shutdown -h now" to make sure ec2
> instances are not re-used. I'm not sure how Jenkins would feel about
> this. :-)
>
> 3. A post-build script to move $HOME to some archived location on the
> disk, to preserve debuggability.
>
> 4. A bash trap in the script to do one of the above.
>
> 5. Run the whole thing in a docker in the build machine, then delete
> the container when the script is done. Or don't, if there's enough
> disk space to not worry about that.
>
> 6. Do all of the work in a workspace inside $HOME. This would require
> some changes to bootstrap_development.sh.
>
> #5 is the most hermetic, I'd guess.
>
> On Thu, Aug 24, 2017 at 8:29 AM, Michael Brown  wrote:
>> Looks like someone has done this.
>>
>> On Wed, Aug 23, 2017 at 8:16 PM, Alexander Behm 
>> wrote:
>>
>>> Yes, let's please add the post-build action for sanity and consistency with
>>> our other jobs.
>>>
>>> On Wed, Aug 23, 2017 at 7:42 PM, Tim Armstrong 
>>> wrote:
>>>
>>> > Maybe the workspace just got left in a weird state - I think in most cases
>>> > "git init" followed by checking out a branch and doing a clean would work.
>>> >
>>> > Should we add the delete workspace post-build action?
>>> >
>>> > On Wed, Aug 23, 2017 at 5:32 PM, Michael Brown 
>>> wrote:
>>> >
>>> > > Not a known issue. I noticed ubuntu-16.04-from-scratch is not set to clean
>>> > > up its workspace, and its config has not been touched since Aug 11. It
>>> > > seems strange we only saw this now
>>> > >
>>> > > On Wed, Aug 23, 2017 at 5:25 PM, Tim Armstrong <
>>> tarmstr...@cloudera.com>
>>> > > wrote:
>>> > >
>>> > > > Is this a known problem? My job failed because the Impala repo already
>>> > > > existed on the machine:
>>> > > >
>>> > > > https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/164/
>>> > > >
>>> > > > *23:00:24* + /usr/bin/git init /home/ubuntu/Impala
>>> > > > *23:00:24* Reinitialized existing Git repository in /home/ubuntu/Impala/.git/
>>> > > >
>>> > > > *23:02:18* + for ITER in '$(seq 1 10)'
>>> > > > *23:02:18* + echo 'ATTEMPT: 1'
>>> > > > *23:02:18* ATTEMPT: 1
>>> > > > *23:02:18* + /usr/bin/git checkout FETCH_HEAD
>>> > > > *23:02:18* + cat /home/ubuntu/Impala/tmp.3tYBn0GUga
>>> > > > *23:02:18* 23:02:18.712300 git.c:344   trace: built-in: git 'checkout' 'FETCH_HEAD'
>>> > > > *23:02:18* error: The following untracked working tree files would be overwritten by checkout:
>>> > > > *23:02:18*  .clang-format
>>> > > > *23:02:18*  .clang-tidy
>>> > > > *23:02:18*  .gitignore
>>> > > > *23:02:18*  CMakeLists.txt
>>> > > > *23:02:18*  DISCLAIMER
>>> > > > *23:02:18*  EXPORT_CONTROL.md
>>> > > > *23:02:18*  LICENSE.txt
>>> > > > *23:02:18*  LOGS.md
>>> > > > *23:02:18*  NOTICE.txt
>>> > > > *23:02:18*  README.md
>>> > > > *23:02:18*  be/.gitignore
>>> > > > *23:02:18*  be/.impala.doxy
>>> > > > *23:02:18*  be/CMakeLists.txt
>>> > > > *23:02:18*  be/src/benchmarks/CMakeLists.txt
>>> > > > *23:02:18*  be/src/benchmarks/atod-benchmark.cc
>>> > > > *23:02:18*  be/src/benchmarks/atof-benchmark.cc
>>> > > > *23:02:18*  be/src/benchmarks/atoi-benchmark.cc
>>> > > > *23:02:18*  be/src/benchmarks/bit-packing-benchmark.cc
>>> > > > *23:02:18*  be/src/benchmarks/bitmap-benchmark.cc
>>> > > > ...
>>> > > >
>>> > >
>>> >
>>>
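
Options 4 and 6 from the list above could be combined. The following is a minimal Python sketch, not the bash that the actual job uses: it creates a fresh per-build directory under a parent directory and removes it when the build step exits, which is the role a bash trap would play. The name build_workspace, the "impala-build-" prefix and the keep_on_failure flag are hypothetical and are not part of bootstrap_development.sh or the Jenkins configuration.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def build_workspace(parent, keep_on_failure=False):
    """Yield a fresh per-build directory under `parent`; remove it on exit.

    Hypothetical helper for illustration only.
    """
    parent = Path(parent)
    parent.mkdir(parents=True, exist_ok=True)
    # A unique directory per build, so a leftover from an earlier run
    # can never be mistaken for the current workspace.
    workspace = Path(tempfile.mkdtemp(prefix="impala-build-", dir=parent))
    failed = False
    try:
        yield workspace
    except BaseException:
        failed = True
        raise
    finally:
        # Runs on success, failure, or interrupt, like a bash EXIT trap.
        # Optionally keep the directory after a failure for debugging.
        if not (failed and keep_on_failure):
            shutil.rmtree(workspace, ignore_errors=True)
```

Because only a subdirectory is deleted, the cleanup does not need root, unlike deleting $HOME itself. Changes the job makes outside the workspace would still not be reset.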


Re: [VOTE] 2.10.0 release candidate 2 (RC2)

2017-08-31 Thread Jim Apple
BTW, this Jenkins job includes the log of what it tested, which
follows the Release Guide, so you should be able to follow along OK.
All committers should have access to run that job, too, if you don't
trust my result.

I am also testing exhaustive (not just core) tests at
https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/, builds
218-220 (once the instances come up).

I tested with ImpalaLZO at this commit:
https://github.com/cloudera/impala-lzo/tree/62c4b94ed6e89f0ce2068280864546ebccfb0729

On Thu, Aug 31, 2017 at 6:19 AM, Jim Apple  wrote:
> +1
>
> https://jenkins.impala.io/job/release-test/20/console
>
> This tested following
> https://cwiki.apache.org/confluence/display/IMPALA/How+to+load+and+run+Impala+tests
> and 
> https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release#HowtoRelease-HowtoVoteonaReleaseCandidate.
>
> On Wed, Aug 30, 2017 at 11:35 PM, Bharath Vissapragada
>  wrote:
>> This is a vote to release Impala 2.10.0.
>>
>> - The artefacts for testing can be downloaded from <
>> https://dist.apache.org/repos/dist/dev/incubator/impala/2.10.0/RC2/>
>>
>> - The git tag for this release candidate is 2.10.0-rc2 and treehash is
>> visible at
>> <
>> https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git;a=tree;hb=23d79462da5d0108709e8b1399c97606f4ebdf92
>>>
>>
>> Please vote +1 or -1. -1 votes should be accompanied by an explanation of
>> the reason. Only PPMC members and mentors have binding votes, but other
>> community members are encouraged to cast non-binding votes. This vote will
>> pass if there are 3 binding +1 votes and more binding +1 votes than -1
>> votes.
>>
>> This wiki page describes how to check the release before you vote:
>> *https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release#HowtoRelease-HowtoVoteonaReleaseCandidate
>> *
>>
>> The vote will be open until the end of day, September 5th, Pacific time
>> zone (UTC-08:00).
>> Once the vote passes the Impala PPMC vote, it still must pass the incubator
>> PMC vote before a release is made.


Re: Question about the multi-thread scan node model

2017-08-31 Thread Tim Armstrong
I spoke to Alex Behm off-list about that JIRA a while ago. I don't think
it's a true ramp-up task. The code change is easy but I think we would want
to do performance validation and testing to make sure that the new
multithreaded scanners have similar performance and stability before making
them the default.

On Thu, Aug 31, 2017 at 12:34 AM, huangquanl...@gmail.com <
huangquanl...@gmail.com> wrote:

> Yeah, "compute stats" is really CPU bound. That sounds great!
>
> I noticed that one of the subtasks of the multithreading work is labeled
> "ramp up": https://issues.apache.org/jira/browse/IMPALA-5802
> Is it in progress? If not, could you assign it to me so I can get familiar
> with the latest framework?
>
> Thanks,
> Quanlong
>
> On 2017-08-31 07:16, Tim Armstrong  wrote:
> > Hi,
> >   The new scanner model is part of the multithreading work to support
> > running multiple instances of each fragment on each Impala daemon. The idea
> > there is that parallelisation is done at the fragment level so that all
> > execution including aggregations, sorts, joins is parallelised - not just
> > scans. This is enabled by setting mt_dop > 0. Currently it doesn't work for
> > plans including joins and HDFS inserts.
> >
> > We find that a lot of queries are compute bound, particularly by
> > aggregations and joins. In those cases we get big speedups from the newer
> > multithreading model. E.g. "compute stats" is a lot faster.
> >
> > On Wed, Aug 30, 2017 at 3:50 PM, 黄权隆  wrote:
> >
> > > Hi all,
> > >
> > >
> > > I’m working on applying our orc-support patch to the latest code base
> > > (IMPALA-5717). Since our patch is based on cdh-5.7.3-release, which was
> > > released one year ago, there’s a lot of work to merge it.
> > >
> > >
> > > One of the biggest changes since cdh-5.7.3-release that I notice is the
> > > new scan node & scanner model introduced in IMPALA-3902. I think it was
> > > inspired by the investigation task in IMPALA-2849, but I cannot find any
> > > performance report in that issue. Could you share a report on this
> > > multi-thread refactor?
> > >
> > >
> > > I’m wondering how much this can improve performance, since the old
> > > single-threaded scan node & multi-threaded scanners model already supplied
> > > concurrent IO for reading, and most OLAP queries are IO bound.
> > >
> > >
> > > Thanks,
> > >
> > > Quanlong
> > >
> >
>
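
For illustration, the mt_dop option described in the quoted message can be set for a session before running a query. This is only a sketch under assumptions: it takes any Python DB-API 2.0 cursor connected to Impala (for example one obtained from impyla), it assumes the SET statement is accepted over that connection, and the helper name run_with_mt_dop is hypothetical. It resets the option to 0 afterwards, which per the description above leaves the multithreaded mode off.

```python
def run_with_mt_dop(cursor, sql, mt_dop):
    """Run a row-returning statement with MT_DOP set, then set it back to 0.

    `cursor` is assumed to be a DB-API 2.0 cursor on an Impala connection.
    Hypothetical helper for illustration only.
    """
    if not isinstance(mt_dop, int) or mt_dop < 0:
        raise ValueError("mt_dop must be a non-negative integer")
    # Per the thread, mt_dop > 0 enables fragment-level parallelism.
    cursor.execute("SET MT_DOP={}".format(mt_dop))
    try:
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        # Assumption: 0 means the option is disabled again for the session.
        cursor.execute("SET MT_DOP=0")
```

Note the restriction mentioned above: at the time of this thread, plans including joins and HDFS inserts do not work with mt_dop > 0.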


Re: [VOTE] 2.10.0 release candidate 2 (RC2)

2017-08-31 Thread Jim Apple
Clarification: +1 (binding)

As a reminder, binding votes in the PPMC vote are not binding in the
IPMC vote unless the voter is also an IPMC member.

Since I am not, this is a PPMC "+1 (binding)" only.

On Thu, Aug 31, 2017 at 6:19 AM, Jim Apple  wrote:
> +1
>
> https://jenkins.impala.io/job/release-test/20/console
>
> This tested following
> https://cwiki.apache.org/confluence/display/IMPALA/How+to+load+and+run+Impala+tests
> and 
> https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release#HowtoRelease-HowtoVoteonaReleaseCandidate.
>
> On Wed, Aug 30, 2017 at 11:35 PM, Bharath Vissapragada
>  wrote:
>> This is a vote to release Impala 2.10.0.
>>
>> - The artefacts for testing can be downloaded from <
>> https://dist.apache.org/repos/dist/dev/incubator/impala/2.10.0/RC2/>
>>
>> - The git tag for this release candidate is 2.10.0-rc2 and treehash is
>> visible at
>> <
>> https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git;a=tree;hb=23d79462da5d0108709e8b1399c97606f4ebdf92
>>>
>>
>> Please vote +1 or -1. -1 votes should be accompanied by an explanation of
>> the reason. Only PPMC members and mentors have binding votes, but other
>> community members are encouraged to cast non-binding votes. This vote will
>> pass if there are 3 binding +1 votes and more binding +1 votes than -1
>> votes.
>>
>> This wiki page describes how to check the release before you vote:
>> *https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release#HowtoRelease-HowtoVoteonaReleaseCandidate
>> *
>>
>> The vote will be open until the end of day, September 5th, Pacific time
>> zone (UTC-08:00).
>> Once the vote passes the Impala PPMC vote, it still must pass the incubator
>> PMC vote before a release is made.


Re: [VOTE] 2.10.0 release candidate 2 (RC2)

2017-08-31 Thread Jim Apple
+1

https://jenkins.impala.io/job/release-test/20/console

This tested following
https://cwiki.apache.org/confluence/display/IMPALA/How+to+load+and+run+Impala+tests
and 
https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release#HowtoRelease-HowtoVoteonaReleaseCandidate.

On Wed, Aug 30, 2017 at 11:35 PM, Bharath Vissapragada
 wrote:
> This is a vote to release Impala 2.10.0.
>
> - The artefacts for testing can be downloaded from <
> https://dist.apache.org/repos/dist/dev/incubator/impala/2.10.0/RC2/>
>
> - The git tag for this release candidate is 2.10.0-rc2 and treehash is
> visible at
> <
> https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git;a=tree;hb=23d79462da5d0108709e8b1399c97606f4ebdf92
>>
>
> Please vote +1 or -1. -1 votes should be accompanied by an explanation of
> the reason. Only PPMC members and mentors have binding votes, but other
> community members are encouraged to cast non-binding votes. This vote will
> pass if there are 3 binding +1 votes and more binding +1 votes than -1
> votes.
>
> This wiki page describes how to check the release before you vote:
> *https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release#HowtoRelease-HowtoVoteonaReleaseCandidate
> *
>
> The vote will be open until the end of day, September 5th, Pacific time
> zone (UTC-08:00).
> Once the vote passes the Impala PPMC vote, it still must pass the incubator
> PMC vote before a release is made.


Re: jenkins.impala.io pre-existing workspace

2017-08-31 Thread Laszlo Gaal
The layout of #6 has served me well for a year now.
It allows me to keep several such workspaces around
to work with different Impala versions; I just have
to carefully control which workspace the minicluster
runs from -- but that's not an issue here.

On Thu, Aug 31, 2017 at 2:44 AM, Philip Zeyliger 
wrote:

> On Wed, Aug 30, 2017 at 4:45 PM, Jim Apple  wrote:
>
> > The workspace cleanup isn't working - see the last bit of any recent
> > ub1604 job:
> > https://jenkins.impala.io/view/Utility/job/ubuntu-16.04-from-scratch/206/console
> >
> > 03:56:40.920 [WS-CLEANUP] Deleting project workspace...Cannot delete
> > workspace :remote file operation failed: /home/ubuntu at
> > hudson.remoting.Channel@4384d5b9:ubuntu-16.04 (i-032d527b9c801df4c):
> > java.io.IOException: Unable to delete '/home/ubuntu'. Tried 3 times
> > (of a maximum of 3) waiting 0.1 sec between attempts.
> > 03:56:48.161 ERROR: Step ‘Delete workspace when build is done’ failed:
> > Cannot delete workspace: remote file operation failed: /home/ubuntu at
> > hudson.remoting.Channel@4384d5b9:ubuntu-16.04 (i-032d527b9c801df4c):
> > java.io.IOException: Unable to delete '/home/ubuntu'. Tried 3 times
> > (of a maximum of 3) waiting 0.1 sec between attempts.
> >
> > The workspace is $HOME, so you can't just delete it without being root.
> >
> > This could be changed to
> >
> > 1. A post-build script to "rm -rf ~/*". This doesn't reset everything,
> > though - the job makes changes to other parts of the filesystem.
> >
> > 2. A post-build script to "sudo shutdown -h now" to make sure ec2
> > instances are not re-used. I'm not sure how Jenkins would feel about
> > this. :-)
> >
> > 3. A post-build script to move $HOME to some archived location on the
> > disk, to preserve debuggability.
> >
> > 4. A bash trap in the script to do one of the above.
> >
> > 5. Run the whole thing in a docker in the build machine, then delete
> > the container when the script is done. Or don't, if there's enough
> > disk space to not worry about that.
> >
> > 6. Do all of the work in a workspace inside $HOME. This would require
> > some changes to bootstrap_development.sh.
> >
> > #5 is the most hermetic, I'd guess.
> >
>
> I like #6 in the short term: I don't think anything is too bound to $HOME
> except a few "~/" in that script which are easily approached. If you do #5,
> you have to teach Jenkins about how to save the build output logs from
> inside of Docker, which is, I think, more work.
>
> -- Philip
>
>
> >
> > On Thu, Aug 24, 2017 at 8:29 AM, Michael Brown 
> wrote:
> > > Looks like someone has done this.
> > >
> > > On Wed, Aug 23, 2017 at 8:16 PM, Alexander Behm <
> alex.b...@cloudera.com>
> > > wrote:
> > >
> > >> Yes, let's please add the post-build action for sanity and consistency with
> > >> our other jobs.
> > >>
> > >> On Wed, Aug 23, 2017 at 7:42 PM, Tim Armstrong <
> tarmstr...@cloudera.com
> > >
> > >> wrote:
> > >>
> > >> > Maybe the workspace just got left in a weird state - I think in most cases
> > >> > "git init" followed by checking out a branch and doing a clean would work.
> > >> >
> > >> > Should we add the delete workspace post-build action?
> > >> >
> > >> > On Wed, Aug 23, 2017 at 5:32 PM, Michael Brown 
> > >> wrote:
> > >> >
> > >> > > Not a known issue. I noticed ubuntu-16.04-from-scratch is not set to clean
> > >> > > up its workspace, and its config has not been touched since Aug 11. It
> > >> > > seems strange we only saw this now
> > >> > >
> > >> > > On Wed, Aug 23, 2017 at 5:25 PM, Tim Armstrong <
> > >> tarmstr...@cloudera.com>
> > >> > > wrote:
> > >> > >
> > >> > > > Is this a known problem? My job failed because the Impala repo
> > >> already
> > >> > > > existed on the machine:
> > >> > > >
> > >> > > > https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/164/
> > >> > > >
> > >> > > > *23:00:24* + /usr/bin/git init /home/ubuntu/Impala
> > >> > > > *23:00:24* Reinitialized existing Git repository in /home/ubuntu/Impala/.git/
> > >> > > >
> > >> > > > *23:02:18* + for ITER in '$(seq 1 10)'
> > >> > > > *23:02:18* + echo 'ATTEMPT: 1'
> > >> > > > *23:02:18* ATTEMPT: 1
> > >> > > > *23:02:18* + /usr/bin/git checkout FETCH_HEAD
> > >> > > > *23:02:18* + cat /home/ubuntu/Impala/tmp.3tYBn0GUga
> > >> > > > *23:02:18* 23:02:18.712300 git.c:344   trace: built-in: git 'checkout' 'FETCH_HEAD'
> > >> > > > *23:02:18* error: The following untracked working tree files would be overwritten by checkout:
> > >> > > > *23:02:18*  .clang-format
> > >> > > > *23:02:18*  .clang-tidy
> > >> > > > *23:02:18*  .gitignore
> > >> > > > *23:02:18*  CMakeLists.txt
> > >> > > > *23:02:18*  DISCLAIMER
> > >> > > > *23:02:18*  EXPORT_CONTROL.md
> > >> > > > *23:02:18*  LICENSE.txt
> > >> > > > *23:02:18*  LOGS.md
> > >> > > > *23:02:18*  NOTICE.txt
> > >> > > > *23:02:18*  README.md
> > >> > > > *23:02:18*  be/.gitignore
> 

Re: Re: Impala Show Tables

2017-08-31 Thread Dimitris Tsirogiannis
Hi Sky,

You don't have many options if you really need SQL to access the tables
because Impala doesn't have an information schema (see
https://issues.apache.org/jira/browse/IMPALA-1761). What I was talking
about is a programmatic way to get that same information.

Dimitris

On Wed, Aug 30, 2017 at 7:09 PM, sky  wrote:

> Hi Dimitris,
> Do you mean querying the database that stores the Impala metadata, i.e. the
> Hive metastore database (MySQL, PostgreSQL, Derby and so on)?
>
> At 2017-08-30 16:41:40, "Dimitris Tsirogiannis" <
> dtsirogian...@cloudera.com> wrote:
> >Hi sky,
> >
> >You could use the HiveServer2 API (
> >https://github.com/apache/hive/blob/master/service-rpc/if/TCLIService.thrift)
> >to list the tables (see GetTables). Depending on your preference on
> >programming language, a number of clients exist that use this API (e.g.
> >https://github.com/cloudera/impyla).
> >
> >Dimitris
> >
> >On Wed, Aug 30, 2017 at 12:03 AM, sky  wrote:
> >
> >> Hi all,
> >> In addition to the "show tables" command, are there any other ways to show
> >> all the tables in Impala?
> >> I need a way to handle the collection of all tables through SQL, but "show
> >> tables" cannot be combined with SQL.
>
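
As a rough illustration of the programmatic route discussed in this thread, the sketch below collects every (database, table) pair through a Python DB-API 2.0 cursor, which is the interface that clients such as impyla expose. It issues SHOW statements rather than the HiveServer2 GetTables call, and the function name list_all_tables is hypothetical. It assumes the first column of each result row is the database or table name.

```python
def list_all_tables(cursor):
    """Return sorted (database, table) pairs using a DB-API 2.0 cursor.

    Hypothetical helper. With impyla, a cursor could be obtained roughly as
        from impala.dbapi import connect
        cursor = connect(host="impalad-host", port=21050).cursor()
    where "impalad-host" is a placeholder for one of your Impala daemons.
    """
    cursor.execute("SHOW DATABASES")
    # Assumption: the database name is the first column of each row.
    databases = [row[0] for row in cursor.fetchall()]

    tables = []
    for db in databases:
        # Backticks guard against database names that clash with keywords.
        cursor.execute("SHOW TABLES IN `{}`".format(db))
        tables.extend((db, row[0]) for row in cursor.fetchall())
    return sorted(tables)
```

The returned list can then be filtered or joined in client code, which is the part that "show tables" alone cannot do inside a SQL statement.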


Re: Question about the multi-thread scan node model

2017-08-31 Thread huangquanl...@gmail.com
Yeah, "compute stats" is really CPU bound. That sounds great!

I noticed that one of the subtasks of the multithreading work is labeled
"ramp up": https://issues.apache.org/jira/browse/IMPALA-5802
Is it in progress? If not, could you assign it to me so I can get familiar
with the latest framework?

Thanks,
Quanlong

On 2017-08-31 07:16, Tim Armstrong  wrote: 
> Hi,
>   The new scanner model is part of the multithreading work to support
> running multiple instances of each fragment on each Impala daemon. The idea
> there is that parallelisation is done at the fragment level so that all
> execution including aggregations, sorts, joins is parallelised - not just
> scans. This is enabled by setting mt_dop > 0. Currently it doesn't work for
> plans including joins and HDFS inserts.
> 
> We find that a lot of queries are compute bound, particularly by
> aggregations and joins. In those cases we get big speedups from the newer
> multithreading model. E.g. "compute stats" is a lot faster.
> 
> On Wed, Aug 30, 2017 at 3:50 PM, 黄权隆  wrote:
> 
> > Hi all,
> >
> >
> > I’m working on applying our orc-support patch to the latest code base
> > (IMPALA-5717). Since our patch is based on cdh-5.7.3-release, which was
> > released one year ago, there’s a lot of work to merge it.
> >
> >
> > One of the biggest changes since cdh-5.7.3-release that I notice is the
> > new scan node & scanner model introduced in IMPALA-3902. I think it was
> > inspired by the investigation task in IMPALA-2849, but I cannot find any
> > performance report in that issue. Could you share a report on this
> > multi-thread refactor?
> >
> >
> > I’m wondering how much this can improve performance, since the old
> > single-threaded scan node & multi-threaded scanners model already supplied
> > concurrent IO for reading, and most OLAP queries are IO bound.
> >
> >
> > Thanks,
> >
> > Quanlong
> >
> 


[VOTE] 2.10.0 release candidate 2 (RC2)

2017-08-31 Thread Bharath Vissapragada
This is a vote to release Impala 2.10.0.

- The artefacts for testing can be downloaded from <
https://dist.apache.org/repos/dist/dev/incubator/impala/2.10.0/RC2/>

- The git tag for this release candidate is 2.10.0-rc2 and treehash is
visible at
<
https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git;a=tree;hb=23d79462da5d0108709e8b1399c97606f4ebdf92
>

Please vote +1 or -1. -1 votes should be accompanied by an explanation of
the reason. Only PPMC members and mentors have binding votes, but other
community members are encouraged to cast non-binding votes. This vote will
pass if there are 3 binding +1 votes and more binding +1 votes than -1
votes.

This wiki page describes how to check the release before you vote:
*https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release#HowtoRelease-HowtoVoteonaReleaseCandidate
*

The vote will be open until the end of day, September 5th, Pacific time
zone (UTC-08:00).
Once the vote passes the Impala PPMC vote, it still must pass the incubator
PMC vote before a release is made.
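
The passing rule stated above can be written down as a small check. This is only a sketch of the rule as worded in this message (at least 3 binding +1 votes, and more binding +1 votes than -1 votes); the function name vote_passes is hypothetical. The message does not say whether the -1 count is limited to binding votes, so the sketch assumes binding -1 votes.

```python
def vote_passes(binding_plus_ones, binding_minus_ones):
    """Apply the rule from the vote email to the PPMC vote tally.

    Assumption: only binding -1 votes are counted against the +1 votes.
    """
    return binding_plus_ones >= 3 and binding_plus_ones > binding_minus_ones
```

For example, three binding +1 votes with no -1 votes would pass, while three binding +1 votes against three -1 votes would not.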