RE: A question about AnalysisException

2018-06-05 Thread Mike Labman
Just use column number in your group by.

select datediff(day,now()) from test_table where day>='2018-06-01' group by 1

-Original Message-
From: skyyws [mailto:sky...@163.com]
Sent: Tuesday, June 05, 2018 9:49 AM
To: dev
Subject: Re: A question about AnalysisException

Here is the correct result of the SQL below:
select datediff(day,now()) from test_table where day>='2018-06-01' group by 
datediff(day,now());

| datediff(day, now()) |
+----------------------+
| -4                   |
| 0                    |
| -3                   |
| -1                   |
| -2                   |




2018-06-05
skyyws



From: skyyws
Sent: 2018-06-05 21:44
Subject: A question about AnalysisException
To: "dev@impala.apache.org"
Cc:

Hello all,
Recently, I found a problem when I used Impala to do ad-hoc analysis. When I
executed the SQL below:
select datediff(day,now()) from test_table where day>=(now() - interval 5 days) 
group by datediff(day,now());
I got an exception like this:
-
Status: AnalysisException: select list expression not produced by aggregation 
output (missing from GROUP BY clause?): datediff(day, TIMESTAMP '2018-06-05 
21:24:28.403393000')
-
and if I execute this sql:
select datediff(day,now()) from test_table where day>='2018-06-01' group by 
datediff(day,now());
I got the correct result like this:

This situation happened on both versions 2.10.0 and 3.0.0.
I'm not sure whether it's a bug or it's designed like this. Can anyone give me
some advice? Thanks.
(test_table is stored as Parquet, and day is the partition column, of string type.)

2018-06-05
skyyws



Confidentiality Note: This e-mail, and any attachment to it, contains 
privileged and confidential information intended only for the use of the 
individual(s) or entity named on the e-mail. If the reader of this e-mail is 
not the intended recipient, or the employee or agent responsible for delivering 
it to the intended recipient, you are hereby notified that reading it is 
strictly prohibited. If you have received this e-mail in error, please 
immediately return it to the sender and delete it from your system. Thank you



A question about AnalysisException

2018-06-05 Thread skyyws
Hello all,
Recently, I found a problem when I used Impala to do ad-hoc analysis. When I
executed the SQL below:
select datediff(day,now()) from test_table where day>=(now() - interval 5 days) 
group by datediff(day,now());
I got an exception like this:
-
Status: AnalysisException: select list expression not produced by aggregation 
output (missing from GROUP BY clause?): datediff(day, TIMESTAMP '2018-06-05 
21:24:28.403393000')
-
and if I execute this sql:
select datediff(day,now()) from test_table where day>='2018-06-01' group by 
datediff(day,now());
I got the correct result like this:
This situation happened on both versions 2.10.0 and 3.0.0.
I'm not sure whether it's a bug or it's designed like this. Can anyone give me
some advice? Thanks.
(test_table is stored as Parquet, and day is the partition column, of string type.)

2018-06-05
skyyws 

Re: Broken/Flaky Tests

2018-06-05 Thread Tim Armstrong
Things are starting to look healthier now.

I went through the broken-build JIRAs and downgraded some of the infrequent
infrastructure issues to critical so we have a clearer idea of what's
actually breaking the build now versus what's an occasional infra issue:
https://issues.apache.org/jira/issues/?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20%3D%20broken-build%20ORDER%20BY%20priority%20DESC

I'd like to see the fixes for these three issues go in:
https://issues.apache.org/jira/browse/IMPALA-7101
https://issues.apache.org/jira/browse/IMPALA-6956
https://issues.apache.org/jira/browse/IMPALA-7008

We still need to fix any flaky infrastructure issues but that should be
able to proceed in parallel with other things.


On Fri, Jun 1, 2018 at 11:18 AM, Thomas Tauber-Marshall <
tmarsh...@cloudera.com> wrote:

> So while it's definitely better, there are still a large number of failing
> builds. We've been hit by at least: IMPALA-6642, IMPALA-6956, IMPALA-7101
> and IMPALA-3040, all within the last day, along with some mysterious crashes
> that I haven't filed anything for with Apache yet as there's very little info
> about what's actually going on. There are still multiple builds that haven't
> been green in over a month.
> 
>
> Of course, if we hold commits for too long, there's a danger that when we
> open things back up a bunch of changes will all land at the same time and
> destabilize the builds again, putting us back in the same situation. So, I
> would say at a minimum that any changes that are relatively minor and low
> risk can go in now.
>
> My preference would be to hold off on major changes until we have more
> stability.
>
> On Fri, Jun 1, 2018 at 10:30 AM Lars Volker  wrote:
>
> > Hi Thomas,
> >
> > Can you give an update on where we are with the builds?
> >
> > We currently have ~15 changes with a +2:
> >
> > https://gerrit.cloudera.org/#/q/status:open+project:Impala-ASF+branch:master+label:Code-Review%253D2
> >
> > Thanks, Lars
> >
> > On Fri, May 25, 2018 at 5:20 PM, Henry Robinson wrote:
> >
> > > +1 - thanks for worrying about build health.
> > >
> > > On 25 May 2018 at 17:18, Jim Apple  wrote:
> > >
> > > > Sounds good to me. Thanks for taking ownership!
> > > >
> > > > On Fri, May 25, 2018 at 5:10 PM Thomas Tauber-Marshall <
> > > > tmarsh...@cloudera.com> wrote:
> > > >
> > > > > Hey Impala community,
> > > > >
> > > > > > There seems to have been an unusually large number of flaky or broken
> > > > > > tests
> > > > > > <https://issues.apache.org/jira/browse/IMPALA-7073?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(flaky%2C%20broken-build)>
> > > > > > cropping up in the last few weeks. I'd like to suggest that we hold off
> > > > > > on merging new changes that aren't related to fixing those testing issues
> > > > > > for at least a few days until things become more stable.
> > > > > >
> > > > > > Does anyone have any objections? If not, I'll send out another email
> > > > > > when more of the issues have been addressed.
> > > > >
> > > > > Thanks,
> > > > > Thomas Tauber-Marshall
> > > > >
> > > >
> > >
> >
>


return value is wrong

2018-06-05 Thread 周胜为
I developed a UDA function as follows:
Input: string
Output: string
I want to calculate the MD5 code of every name and then combine the MD5 codes
with XOR, so that I get one value that combines all of the MD5 codes.

The MD5 code is stored as unsigned char[16]. As far as I know, StringVal has a
member called ptr (uint8_t *), so I set the MD5 code (the calculated result) as
the StringVal's ptr.

Every time I get a null value. Why is that?
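
A minimal C++ sketch of the XOR aggregation described above, offered as an illustration rather than a diagnosis of the null result. It does not use the Impala headers: FakeStringVal is a hypothetical stand-in that mirrors only the is_null, len and ptr fields, and XorInit, XorUpdate and XorFinalize are made-up names. The sketch assumes that setting ptr alone is not enough, and that is_null and len must be filled in as well, with ptr pointing at memory that outlives the call.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for impala_udf::StringVal so that this sketch
// compiles without the Impala headers. Only three fields are modeled.
struct FakeStringVal {
  bool is_null;
  int len;
  uint8_t* ptr;
};

constexpr int kDigestLen = 16;  // an MD5 digest is 16 bytes

// Init step: start from an all-zero accumulator, since x XOR 0 == x.
void XorInit(uint8_t* acc) {
  std::memset(acc, 0, kDigestLen);
}

// Update/Merge step: fold one 16-byte digest into the accumulator.
void XorUpdate(uint8_t* acc, const uint8_t* digest) {
  assert(acc != nullptr && digest != nullptr);
  for (int i = 0; i < kDigestLen; ++i) acc[i] ^= digest[i];
}

// Finalize step: copy the accumulator into storage owned by the caller
// (assumed to outlive this call) and fill in all three fields.
FakeStringVal XorFinalize(const uint8_t* acc, uint8_t* out) {
  std::memcpy(out, acc, kDigestLen);
  return FakeStringVal{false, kDigestLen, out};
}
```

In a real UDA the output buffer would normally be obtained from the UDF framework rather than from the caller. Since the XOR result is raw binary that may contain zero bytes, hex-encoding it before returning may make the value easier to inspect.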








-- Original Message --
From: "skyyws";
Sent: June 6, 2018 (Wednesday) 10:08
To: "dev@impala.apache.org";

Subject: Re: A question about AnalysisException



Thanks for your reply. I know that both a column number and an alias work, like
this:
--
select datediff(day,now()) from test_table where day>='2018-06-01' group by 1
select datediff(day,now()) d from test_table where day>='2018-06-01' group by d
--
I just wonder why using a built-in function in the WHERE clause instead of a
constant results in this exception. Is this Impala syntax?
On 06/5/2018 21:54, Mike Labman wrote:
Just use column number in your group by.

select datediff(day,now()) from test_table where day>='2018-06-01' group by 1

-Original Message-
From: skyyws [mailto:sky...@163.com]
Sent: Tuesday, June 05, 2018 9:49 AM
To: dev
Subject: Re: A question about AnalysisException

Here is the correct result of the SQL below:
select datediff(day,now()) from test_table where day>='2018-06-01' group by 
datediff(day,now());

| datediff(day, now()) |
+----------------------+
| -4                   |
| 0                    |
| -3                   |
| -1                   |
| -2                   |




2018-06-05
skyyws



From: skyyws
Sent: 2018-06-05 21:44
Subject: A question about AnalysisException
To: "dev@impala.apache.org"
Cc:

Hello all,
Recently, I found a problem when I used Impala to do ad-hoc analysis. When I
executed the SQL below:
select datediff(day,now()) from test_table where day>=(now() - interval 5 days) 
group by datediff(day,now());
I got an exception like this:
-
Status: AnalysisException: select list expression not produced by aggregation 
output (missing from GROUP BY clause?): datediff(day, TIMESTAMP '2018-06-05 
21:24:28.403393000')
-
and if I execute this sql:
select datediff(day,now()) from test_table where day>='2018-06-01' group by 
datediff(day,now());
I got the correct result like this:

This situation happened on both versions 2.10.0 and 3.0.0.
I'm not sure whether it's a bug or it's designed like this. Can anyone give me
some advice? Thanks.
(test_table is stored as Parquet, and day is the partition column, of string type.)

2018-06-05
skyyws




UDA debugging, was Re: Broken/Flaky Tests

2018-06-05 Thread Jim Apple
Hi 周胜为,

I notice you are replying to other threads about different subjects when
you ask your questions. I think you will be more likely to get help if you
start new threads with relevant subjects and if you are as specific as
possible with your questions.

The Impala wiki has some advice for debugging:
https://cwiki.apache.org/confluence/display/IMPALA/Impala+Debugging+Tips


On Tue, Jun 5, 2018 at 6:21 PM 周胜为 <865392...@qq.com> wrote:

> One: I want to know how to debug an Impala UDA function.
> Two: I would like to return a StringVal value through the finalize function,
> but I get a null value every time. Why is that?
>
>
>
>
> -- Original Message --
> From: "Tim Armstrong";
> Sent: Wednesday, June 6, 2018, 9:08 AM
> To: "dev@impala";
>
> Subject: Re: Broken/Flaky Tests
>
>
>
> Ok, so 2/3 of those fixes are merged and the other is being merged.
>
> We still have a long list of flaky issues but I went through and we've
> either mitigated them or we're blocked on being able to repro them.
>
> I'll see how things look tomorrow, but if you have some low-risk changes in
> mind, let me know and I can consider whether to merge them.
>
>
>
> On Tue, Jun 5, 2018 at 10:11 AM, Tim Armstrong wrote:
>
> > Things are starting to look healthier now.
> >
> > I went through the broken-build JIRAs and downgraded some of the
> > infrequent infrastructure issues to critical so we have a clearer idea of
> > what's actually breaking the build now versus what's an occasional infra
> > issue:
> > https://issues.apache.org/jira/issues/?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20%3D%20broken-build%20ORDER%20BY%20priority%20DESC
> >
> > I'd like to see the fixes for these three issues go in:
> > https://issues.apache.org/jira/browse/IMPALA-7101
> > https://issues.apache.org/jira/browse/IMPALA-6956
> > https://issues.apache.org/jira/browse/IMPALA-7008
> >
> > We still need to fix any flaky infrastructure issues but that should be
> > able to proceed in parallel with other things.
> >
> >
> > On Fri, Jun 1, 2018 at 11:18 AM, Thomas Tauber-Marshall <
> > tmarsh...@cloudera.com> wrote:
> >
> >> So while it's definitely better, there are still a large number of failing
> >> builds. We've been hit by at least: IMPALA-6642, IMPALA-6956, IMPALA-7101
> >> and IMPALA-3040, all within the last day, along with some mysterious
> >> crashes that I haven't filed anything for with Apache yet as there's very
> >> little info about what's actually going on. There are still multiple
> >> builds that haven't been green in over a month.
> >> 
> >>
> >> Of course, if we hold commits for too long, there's a danger that when we
> >> open things back up a bunch of changes will all land at the same time and
> >> destabilize the builds again, putting us back in the same situation. So, I
> >> would say at a minimum that any changes that are relatively minor and low
> >> risk can go in now.
> >>
> >> My preference would be to hold off on major changes until we have more
> >> stability.
> >>
> >> On Fri, Jun 1, 2018 at 10:30 AM Lars Volker  wrote:
> >>
> >> > Hi Thomas,
> >> >
> >> > Can you give an update on where we are with the builds?
> >> >
> >> > We currently have ~15 changes with a +2:
> >> >
> >> > https://gerrit.cloudera.org/#/q/status:open+project:Impala-ASF+branch:master+label:Code-Review%253D2
> >> >
> >> > Thanks, Lars
> >> >
> >> > On Fri, May 25, 2018 at 5:20 PM, Henry Robinson wrote:
> >> >
> >> > > +1 - thanks for worrying about build health.
> >> > >
> >> > > On 25 May 2018 at 17:18, Jim Apple  wrote:
> >> > >
> >> > > > Sounds good to me. Thanks for taking ownership!
> >> > > >
> >> > > > On Fri, May 25, 2018 at 5:10 PM Thomas Tauber-Marshall <
> >> > > > tmarsh...@cloudera.com> wrote:
> >> > > >
> >> > > > > Hey Impala community,
> >> > > > >
> >> > > > > There seems to have been an unusually large number of flaky or broken
> >> > > > > tests
> >> > > > > <https://issues.apache.org/jira/browse/IMPALA-7073?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(flaky%2C%20broken-build)>
> >> > > > > cropping up in the last few weeks. I'd like to suggest that we hold off
> >> > > > > on merging new changes that aren't related to fixing those testing issues
> >> > > > > for at least a few days until things become more stable.
> >> > > > >
> >> > > > > Does anyone have any objections? If not, I'll send out another email
> >> > > > > when more of the issues have been addressed.
> 

Re: UDA debugging, was Re: Broken/Flaky Tests

2018-06-05 Thread Tim Armstrong
We're happy to give you pointers. If you could share your UDA code and your
"create function" statement, that would help us help you.

On Tue., 5 Jun. 2018, 19:31 Jim Apple,  wrote:

> Hi 周胜为,
>
> I notice you are replying to other threads about different subjects when
> you ask your questions. I think you will be more likely to get help if you
> start new threads with relevant subjects and if you are as specific as
> possible with your questions.
>
> The Impala wiki has some advice for debugging:
> https://cwiki.apache.org/confluence/display/IMPALA/Impala+Debugging+Tips
>
>
> On Tue, Jun 5, 2018 at 6:21 PM 周胜为 <865392...@qq.com> wrote:
>
> > One: I want to know how to debug an Impala UDA function.
> > Two: I would like to return a StringVal value through the finalize function,
> > but I get a null value every time. Why is that?
> >
> >
> >
> >
> > -- Original Message --
> > From: "Tim Armstrong";
> > Sent: Wednesday, June 6, 2018, 9:08 AM
> > To: "dev@impala";
> >
> > Subject: Re: Broken/Flaky Tests
> >
> >
> >
> > Ok, so 2/3 of those fixes are merged and the other is being merged.
> >
> > We still have a long list of flaky issues but I went through and we've
> > either mitigated them or we're blocked on being able to repro them.
> >
> > I'll see how things look tomorrow, but if you have some low-risk changes in
> > mind, let me know and I can consider whether to merge them.
> >
> >
> >
> > On Tue, Jun 5, 2018 at 10:11 AM, Tim Armstrong wrote:
> >
> > > Things are starting to look healthier now.
> > >
> > > I went through the broken-build JIRAs and downgraded some of the
> > > infrequent infrastructure issues to critical so we have a clearer idea of
> > > what's actually breaking the build now versus what's an occasional infra
> > > issue:
> > > https://issues.apache.org/jira/issues/?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20%3D%20broken-build%20ORDER%20BY%20priority%20DESC
> > >
> > > I'd like to see the fixes for these three issues go in:
> > > https://issues.apache.org/jira/browse/IMPALA-7101
> > > https://issues.apache.org/jira/browse/IMPALA-6956
> > > https://issues.apache.org/jira/browse/IMPALA-7008
> > >
> > > We still need to fix any flaky infrastructure issues but that should be
> > > able to proceed in parallel with other things.
> > >
> > >
> > > On Fri, Jun 1, 2018 at 11:18 AM, Thomas Tauber-Marshall <
> > > tmarsh...@cloudera.com> wrote:
> > >
> > >> So while it's definitely better, there are still a large number of
> > >> failing builds. We've been hit by at least: IMPALA-6642, IMPALA-6956,
> > >> IMPALA-7101 and IMPALA-3040, all within the last day, along with some
> > >> mysterious crashes that I haven't filed anything for with Apache yet as
> > >> there's very little info about what's actually going on. There are still
> > >> multiple builds that haven't been green in over a month.
> > >> 
> > >>
> > >> Of course, if we hold commits for too long, there's a danger that when we
> > >> open things back up a bunch of changes will all land at the same time and
> > >> destabilize the builds again, putting us back in the same situation. So,
> > >> I would say at a minimum that any changes that are relatively minor and
> > >> low risk can go in now.
> > >>
> > >> My preference would be to hold off on major changes until we have more
> > >> stability.
> > >>
> > >> On Fri, Jun 1, 2018 at 10:30 AM Lars Volker  wrote:
> > >>
> > >> > Hi Thomas,
> > >> >
> > >> > Can you give an update on where we are with the builds?
> > >> >
> > >> > We currently have ~15 changes with a +2:
> > >> >
> > >> > https://gerrit.cloudera.org/#/q/status:open+project:Impala-ASF+branch:master+label:Code-Review%253D2
> > >> >
> > >> > Thanks, Lars
> > >> >
> > >> > On Fri, May 25, 2018 at 5:20 PM, Henry Robinson wrote:
> > >> >
> > >> > > +1 - thanks for worrying about build health.
> > >> > >
> > >> > > On 25 May 2018 at 17:18, Jim Apple  wrote:
> > >> > >
> > >> > > > Sounds good to me. Thanks for taking ownership!
> > >> > > >
> > >> > > > On Fri, May 25, 2018 at 5:10 PM Thomas Tauber-Marshall <
> > >> > > > tmarsh...@cloudera.com> wrote:
> > >> > > >
> > >> > > > > Hey Impala community,
> > >> > > > >
> > >> > > > > There seems to have been an unusually large number of flaky or broken
> > >> > > > > tests
> > >> > > > > <https://issues.apache.org/jira/browse/IMPALA-7073?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(flaky%2C%20broken-build)>
> > >> > > > > cropping up 

Re: A question about AnalysisException

2018-06-05 Thread skyyws
Thanks for your reply. I know that both a column number and an alias work, like
this:
--
select datediff(day,now()) from test_table where day>='2018-06-01' group by 1
select datediff(day,now()) d from test_table where day>='2018-06-01' group by d
--
I just wonder why using a built-in function in the WHERE clause instead of a
constant results in this exception. Is this Impala syntax?
On 06/5/2018 21:54,Mike Labman wrote:
Just use column number in your group by.

select datediff(day,now()) from test_table where day>='2018-06-01' group by 1

-Original Message-
From: skyyws [mailto:sky...@163.com]
Sent: Tuesday, June 05, 2018 9:49 AM
To: dev
Subject: Re: A question about AnalysisException

Here is the correct result of the SQL below:
select datediff(day,now()) from test_table where day>='2018-06-01' group by 
datediff(day,now());

| datediff(day, now()) |
+----------------------+
| -4                   |
| 0                    |
| -3                   |
| -1                   |
| -2                   |




2018-06-05
skyyws



From: skyyws
Sent: 2018-06-05 21:44
Subject: A question about AnalysisException
To: "dev@impala.apache.org"
Cc:

Hello all,
Recently, I found a problem when I used Impala to do ad-hoc analysis. When I
executed the SQL below:
select datediff(day,now()) from test_table where day>=(now() - interval 5 days) 
group by datediff(day,now());
I got an exception like this:
-
Status: AnalysisException: select list expression not produced by aggregation 
output (missing from GROUP BY clause?): datediff(day, TIMESTAMP '2018-06-05 
21:24:28.403393000')
-
and if I execute this sql:
select datediff(day,now()) from test_table where day>='2018-06-01' group by 
datediff(day,now());
I got the correct result like this:

This situation happened on both versions 2.10.0 and 3.0.0.
I'm not sure whether it's a bug or it's designed like this. Can anyone give me
some advice? Thanks.
(test_table is stored as Parquet, and day is the partition column, of string type.)

2018-06-05
skyyws


