[jira] [Created] (DRILL-7420) window function improve ROWS clause/frame possibilities

2019-10-23 Thread benj (Jira)
benj created DRILL-7420:
---

 Summary: window function improve ROWS clause/frame possibilities
 Key: DRILL-7420
 URL: https://issues.apache.org/jira/browse/DRILL-7420
 Project: Apache Drill
  Issue Type: New Feature
Affects Versions: 1.16.0
Reporter: benj


The window frame possibilities are currently limited in Apache Drill.

The ROWS clause is only possible with "BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW".
It would be useful to also support:
 * "BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING"
 * "BETWEEN x PRECEDING AND y FOLLOWING"

{code:sql}
/* ROWS clause is only possible with "BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW" */
apache drill> SELECT *, sum(a) OVER(ORDER BY b ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM (SELECT 1 a, 1 b, 1 c);
+---+---+---+--------+
| a | b | c | EXPR$3 |
+---+---+---+--------+
| 1 | 1 | 1 | 1      |
+---+---+---+--------+
1 row selected (1.357 seconds)

/* ROWS is currently not possible with "BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING"
   (it is possible with RANGE, but with a single ORDER BY only) */
apache drill> SELECT *, sum(a) OVER(ORDER BY b, c ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) FROM (SELECT 1 a, 1 b, 1 c);
Error: UNSUPPORTED_OPERATION ERROR: This type of window frame is currently not supported
See Apache Drill JIRA: DRILL-3188

/* ROWS is currently not possible with "BETWEEN x PRECEDING AND y FOLLOWING" */
apache drill> SELECT *, sum(a) OVER(ORDER BY b ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) FROM (SELECT 1 a, 1 b, 1 c);
Error: UNSUPPORTED_OPERATION ERROR: This type of window frame is currently not supported
See Apache Drill JIRA: DRILL-3188
{code}
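
For reference, a short sketch of how the requested framings behave on an engine that already supports them (PostgreSQL syntax here; the inline VALUES data is purely illustrative):

{code:sql}
/* Moving sum over a 3-row window and a whole-partition sum,
   using the frame bounds requested above. */
SELECT b,
       sum(a) OVER (ORDER BY b ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS moving_sum,
       sum(a) OVER (ORDER BY b ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS total_sum
FROM (VALUES (10, 1), (20, 2), (30, 3)) AS t(a, b);
/* expected: moving_sum = 30, 60, 50 and total_sum = 60, 60, 60 */
{code}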



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (DRILL-7404) window function RANGE with compound ORDER BY

2019-10-14 Thread benj (Jira)
benj created DRILL-7404:
---

 Summary: window function RANGE with compound ORDER BY
 Key: DRILL-7404
 URL: https://issues.apache.org/jira/browse/DRILL-7404
 Project: Apache Drill
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.16.0
Reporter: benj


While creating ticket CALCITE-3402 (asking for improved window function support), 
it appeared that the Drill documentation is not up to date:

[https://drill.apache.org/docs/aggregate-window-functions/]
{code:java}
frame_clause
If an ORDER BY clause is used for an aggregate function, an explicit frame 
clause is required. The frame clause refines the set of rows in a function's 
window, including or excluding sets of rows within the ordered result. The 
frame clause consists of the ROWS or RANGE keyword and associated specifiers.
{code}
But it is currently (1.16) possible to write an ORDER BY clause in a window function 
+without+ specifying an explicit frame clause.

In this case, an +implicit+ frame clause is used.

And normally the default/implicit framing option is {{RANGE UNBOUNDED PRECEDING}}, 
which is the same as {{RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW}} (and this 
should perhaps also be stated more explicitly in the documentation).
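
A minimal sketch of that equivalence (generic SQL; the inline VALUES data is illustrative only):

{code:sql}
/* With ORDER BY and no explicit frame, the default frame
   RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW applies,
   so the two columns below are expected to be identical. */
SELECT b,
       sum(a) OVER (ORDER BY b) AS implicit_frame,
       sum(a) OVER (ORDER BY b RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS explicit_frame
FROM (VALUES (10, 1), (20, 2), (30, 3)) AS t(a, b);
/* expected: implicit_frame = explicit_frame = 10, 30, 60 */
{code}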

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (DRILL-7034) Window function over a malformed CSV file crashes the JVM

2019-03-11 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva resolved DRILL-7034.
-
Resolution: Fixed

> Window function over a malformed CSV file crashes the JVM 
> --
>
> Key: DRILL-7034
> URL: https://issues.apache.org/jira/browse/DRILL-7034
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.15.0
>Reporter: Boaz Ben-Zvi
>Priority: Major
> Fix For: 1.16.0
>
> Attachments: hs_err_pid23450.log, janino8470007454663483217.java
>
>
> The JVM crashes executing window functions over (an ordered) CSV file with a 
> small format issue - an empty line.
> To create: Take the following simple `a.csvh` file:
> {noformat}
> amount
> 10
> 11
> {noformat}
> And execute a simple window function like
> {code:sql}
> select max(amount) over(order by amount) FROM dfs.`/data/a.csvh`;
> {code}
> Then add an empty line between the `10` and the `11`:
> {noformat}
> amount
> 10
>
> 11
> {noformat}
>  and try again:
> {noformat}
> 0: jdbc:drill:zk=local> select max(amount) over(order by amount) FROM 
> dfs.`/data/a.csvh`;
> +---------+
> | EXPR$0  |
> +---------+
> | 10      |
> | 11      |
> +---------+
> 2 rows selected (3.554 seconds)
> 0: jdbc:drill:zk=local> select max(amount) over(order by amount) FROM 
> dfs.`/data/a.csvh`;
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x0001064aeae7, pid=23450, tid=0x6103
> #
> # JRE version: Java(TM) SE Runtime Environment (8.0_181-b13) (build 
> 1.8.0_181-b13)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.181-b13 mixed mode bsd-amd64 
> compressed oops)
> # Problematic frame:
> # J 6719% C2 
> org.apache.drill.exec.expr.fn.impl.ByteFunctionHelpers.memcmp(JIIJII)I (188 
> bytes) @ 0x0001064aeae7 [0x0001064ae920+0x1c7]
> #
> # Core dump written. Default location: /cores/core or core.23450
> #
> # An error report file with more information is saved as:
> # /Users/boazben-zvi/IdeaProjects/drill/hs_err_pid23450.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> #
> Abort trap: 6 (core dumped)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-7034) Window function over a malformed CSV file crashes the JVM

2019-02-08 Thread Boaz Ben-Zvi (JIRA)
Boaz Ben-Zvi created DRILL-7034:
---

 Summary: Window function over a malformed CSV file crashes the JVM 
 Key: DRILL-7034
 URL: https://issues.apache.org/jira/browse/DRILL-7034
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 1.15.0
Reporter: Boaz Ben-Zvi


The JVM crashes executing window functions over (an ordered) CSV file with a 
small format issue - an empty line.

To create: Take the following simple `a.csvh` file:
{noformat}
amount
10
11
{noformat}

And execute a simple window function like
{code:sql}
select max(amount) over(order by amount) FROM dfs.`/data/a.csvh`;
{code}

Then add an empty line between the `10` and the `11`:
{noformat}
amount
10

11
{noformat}

 and try again:
{noformat}
0: jdbc:drill:zk=local> select max(amount) over(order by amount) FROM 
dfs.`/data/a.csvh`;
+---------+
| EXPR$0  |
+---------+
| 10      |
| 11      |
+---------+
2 rows selected (3.554 seconds)
0: jdbc:drill:zk=local> select max(amount) over(order by amount) FROM 
dfs.`/data/a.csvh`;
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0001064aeae7, pid=23450, tid=0x6103
#
# JRE version: Java(TM) SE Runtime Environment (8.0_181-b13) (build 
1.8.0_181-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.181-b13 mixed mode bsd-amd64 
compressed oops)
# Problematic frame:
# J 6719% C2 
org.apache.drill.exec.expr.fn.impl.ByteFunctionHelpers.memcmp(JIIJII)I (188 
bytes) @ 0x0001064aeae7 [0x0001064ae920+0x1c7]
#
# Core dump written. Default location: /cores/core or core.23450
#
# An error report file with more information is saved as:
# /Users/boazben-zvi/IdeaProjects/drill/hs_err_pid23450.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
Abort trap: 6 (core dumped)
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-6395) Value Window Function - LEAD and LAG on VarChar result in "No applicable constructor/method found" error

2018-05-09 Thread Raymond Wong (JIRA)
Raymond Wong created DRILL-6395:
---

 Summary: Value Window Function - LEAD and LAG on VarChar result in 
 "No applicable constructor/method found" error
 Key: DRILL-6395
 URL: https://issues.apache.org/jira/browse/DRILL-6395
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.13.0
 Environment: windows 10, apache drill 1.13.0, 32GB Ram
Reporter: Raymond Wong


{code:sql}
SELECT 
col2,
LEAD(col1, 1) OVER (ORDER BY col2) AS nxtCol1
FROM (
SELECT 'A' AS col1, 1 AS col2
UNION 
SELECT 'B' AS col1, 2 AS col2
UNION 
SELECT 'C' AS col1, 3 AS col2
) AS A;
{code}
Causes error 
{code:java}
SQL Error: SYSTEM ERROR: CompileException: Line 37, Column 40: 
No applicable constructor/method found for actual parameters "int, int, int, 
io.netty.buffer.DrillBuf"; 
candidates are: 
"public void 
org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(int, 
org.apache.drill.exec.expr.holders.VarCharHolder)", 
"public void 
org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(int, 
org.apache.drill.exec.expr.holders.NullableVarCharHolder)", 
"public void 
org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(int, byte[], 
int, int)", 
"public void 
org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(int, 
java.nio.ByteBuffer, int, int)", 
"public void 
org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(int, int, 
int, int, io.netty.buffer.DrillBuf)"

{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-5916) Drill document window function example on LAST_VALUE is incorrect

2017-10-31 Thread Raymond Wong (JIRA)
Raymond Wong created DRILL-5916:
---

 Summary: Drill document window function example on LAST_VALUE is 
incorrect
 Key: DRILL-5916
 URL: https://issues.apache.org/jira/browse/DRILL-5916
 Project: Apache Drill
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.11.0
Reporter: Raymond Wong
Priority: Minor


The result of the top and bottom review count example query shows incorrect 
values in the LAST_VALUE column. 
([https://drill.apache.org/docs/analyzing-data-using-window-functions/])

The LAST_VALUE column should have the same value as the review count of each 
row because the default Window Frame is RANGE BETWEEN UNBOUNDED PRECEDING AND 
*CURRENT ROW*.
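
A minimal illustration of that default-frame behaviour (generic SQL; the inline VALUES data is purely illustrative):

{code:sql}
/* With only ORDER BY, the default frame RANGE BETWEEN UNBOUNDED PRECEDING AND
   CURRENT ROW applies, so LAST_VALUE returns the current row's value (strictly,
   the value of its last ORDER BY peer). An explicit frame reaching to
   UNBOUNDED FOLLOWING returns the last value of the whole partition. */
SELECT v,
       LAST_VALUE(v) OVER (ORDER BY v DESC) AS last_default_frame,
       LAST_VALUE(v) OVER (ORDER BY v DESC
                           RANGE BETWEEN UNBOUNDED PRECEDING
                                     AND UNBOUNDED FOLLOWING) AS last_whole_partition
FROM (VALUES (3), (2), (1)) AS t(v);
/* expected: last_default_frame = 3, 2, 1 and last_whole_partition = 1, 1, 1 */
{code}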

Query result using the 2017 yelp data set:

{quote}
SELECT name, city, review_count,
  FIRST_VALUE(review_count)
OVER(PARTITION BY city ORDER BY review_count DESC) AS top_review_count,
  LAST_VALUE(review_count)
OVER(PARTITION BY city ORDER BY review_count DESC) AS bottom_review_count
FROM dfs.yelp.`yelp_academic_dataset_business.json`
LIMIT 30

||name||city||review_count||top_review_count||bottom_review_count||
|Lululemon Athletica| |5|5|5|
|Aberdour Castle|Aberdour|4|4|4|
|Cupz N' Crepes|Ahwatukee|236|236|236|
|My Wine Cellar|Ahwatukee|158|236|158|
|Florencia Pizza Bistro|Ahwatukee|129|236|129|
|Barro's Pizza|Ahwatukee|62|236|62|
|Kathy's Alterations|Ahwatukee|30|236|30|
|Hertz Rent A Car|Ahwatukee|26|236|26|
|Active Kids Pediatrics|Ahwatukee|18|236|18|
|Dental by Design|Ahwatukee|18|236|18|
|Desert Dog Pet Care|Ahwatukee|10|236|10|
|McDonald's|Ahwatukee|7|236|7|
|U-Haul|Ahwatukee|6|236|6|
|Sprinkler Detective|Ahwatukee|5|236|5|
|Hi-Health|Ahwatukee|4|236|4|
|Healthy and Clean Living Environments|Ahwatukee|4|236|4|
|Designs By Christa|Ahwatukee|4|236|4|
{quote}

Changing LAST_VALUE's window frame to RANGE BETWEEN UNBOUNDED PRECEDING AND 
*UNBOUNDED FOLLOWING* gives the expected bottom_review_count:

{quote}
SELECT name, city, review_count,
  FIRST_VALUE(review_count)
OVER(PARTITION BY city ORDER BY review_count DESC) AS top_review_count,
  LAST_VALUE(review_count)
OVER(PARTITION BY city ORDER BY review_count DESC RANGE BETWEEN UNBOUNDED 
PRECEDING AND UNBOUNDED FOLLOWING) AS bottom_review_count
FROM dfs.yelp.`yelp_academic_dataset_business.json`
LIMIT 30
;

||name||city||review_count||top_review_count||bottom_review_count||
|Lululemon Athletica| |5|5|5|
|Aberdour Castle|Aberdour|4|4|4|
|Cupz N' Crepes|Ahwatukee|236|236|4|
|My Wine Cellar|Ahwatukee|158|236|4|
|Florencia Pizza Bistro|Ahwatukee|129|236|4|
|Barro's Pizza|Ahwatukee|62|236|4|
|Kathy's Alterations|Ahwatukee|30|236|4|
|Hertz Rent A Car|Ahwatukee|26|236|4|

[jira] [Resolved] (DRILL-3241) Query with window function runs out of direct memory and does not report back to client that it did

2017-09-18 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz resolved DRILL-3241.
---
Resolution: Fixed

> Query with window function runs out of direct memory and does not report back 
> to client that it did
> ---
>
> Key: DRILL-3241
> URL: https://issues.apache.org/jira/browse/DRILL-3241
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.0.0
>Reporter: Victoria Markman
>Assignee: Deneche A. Hakim
>Priority: Critical
> Fix For: 1.12.0
>
>
> Even though the query ran out of memory and was cancelled on the server, the client 
> (sqlline) was never notified of the event, so to the user the query appears to 
> be hung. 
> Configuration:
> Single drillbit configured with:
> DRILL_MAX_DIRECT_MEMORY="2G"
> DRILL_HEAP="1G"
> TPCDS100 parquet files
> Query:
> {code}
> select 
>   sum(ss_quantity) over(partition by ss_store_sk order by ss_sold_date_sk) 
> from store_sales;
> {code}
> drillbit.log
> {code}
> 2015-06-01 21:42:29,514 [BitServer-5] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
> Connection: /10.10.88.133:31012 <--> /10.10.88.133:38887 (data server).  
> Closing connection.
> io.netty.handler.codec.DecoderException: java.lang.OutOfMemoryError: Direct 
> buffer memory
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233)
>  ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:618)
>  [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at 
> io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:329) 
> [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) 
> [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> Caused by: java.lang.OutOfMemoryError: Direct buffer memory
> at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_71]
> at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) 
> ~[na:1.7.0_71]
> at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) 
> ~[na:1.7.0_71]
> at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:437) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.PoolArena.allocate(PoolArena.java:168) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.PoolArena.reallocate(PoolArena.java:280) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.PooledByteBuf.capacity(PooledByteBuf.java:110) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:251) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:849) 
> ~[netty

Re: Window function

2016-12-04 Thread Nitin Pawar
Thanks Deneche for the explanation.

Thanks Khurram for the ticket.

Let me see if I can pick it up and close it soon.

On Sun, Dec 4, 2016 at 12:11 AM, Khurram Faraaz <kfar...@maprtech.com>
wrote:

>1. DRILL-5099 <https://issues.apache.org/jira/browse/DRILL-5099> is
>created to track this.
>
>
> On Sat, Dec 3, 2016 at 5:46 PM, Khurram Faraaz <kfar...@maprtech.com>
> wrote:
>
> > Hakim, thanks for sharing those details and the explanation.
> > I will file a JIRA and anyone interested can pick it up and provide the
> > fix to support OFFSET values greater than one for LAG window function.
> >
> > Regards,
> > Khurram
> >
> > On Fri, Dec 2, 2016 at 10:18 PM, deneche abdelhakim <adene...@gmail.com>
> > wrote:
> >
> >> Hello Nitin,
> >>
> >> It's definitely possible to support offsets other than 1 for Lead and
> Lag,
> >> the main reason I didn't do it is just lack of time :P
> >>
> >> Things that need to be done to make Lag (or Lead) support offsets other
> >> than 1:
> >> - WindowFunction.Lead should extract the offset value from its
> >> FunctionCall
> >> argument, you can look at WindowFunctionNtile.numTilesFromExpression()
> >> for
> >> an example of how to do that.
> >> - make sure calls to copyNext() and copyPrev() in NoFrameSupportTemplate
> >> use the offset and not the hard coded value (you already figured that
> out)
> >> - finally make sure you update UnsupportedOperatorsVisitor to no longer
> >> throw an exception when we pass an offset value other than 1 to Lead or
> >> Lag. Just search for DRILL-3596 in that class and you will find the if
> >> block that needs to be removed
> >>
> >> I think this should be enough to get it to work in the general case, do
> >> you
> >> want to volunteer and get this done ? that would be an awesome
> >> contribution
> >> to the project.
> >>
> >> Thanks
> >>
> >> On Thu, Dec 1, 2016 at 10:10 PM Nitin Pawar <nitinpawar...@gmail.com>
> >> wrote:
> >>
> >> > any help on this ?
> >> >
> >> > from  class NoFrameSupportTemplate, I see that
> >> >
> >> > inIndex is hard coded to point  to previous row in case of lag and
> >> > next row in case of lead.
> >> >
> >> > Is there a way I can modify this and pass it as a parameter to pick the
> >> > appropriate row?
> >> >
> >> >
> >> > On Fri, Nov 25, 2016 at 2:57 PM, Nitin Pawar <nitinpawar...@gmail.com
> >
> >> > wrote:
> >> >
> >> > > adding dev list for comments
> >> > >
> >> > > On Wed, Nov 23, 2016 at 7:04 PM, Nitin Pawar <
> nitinpawar...@gmail.com
> >> >
> >> > > wrote:
> >> > >
> >> > >> Hi,
> >> > >>
> >> > >> according to DRILL-3596
> >> > >> <https://issues.apache.org/jira/browse/DRILL-3596>, lead or lag
> >> > function
> >> > >> are limited to use offset as 1.
> >> > >>
> >> > >> according to documentation on postgres
> >> > >> lag(value any [, offset integer [, default any ]]) same type as
> value
> >> > >> returns value evaluated at the row that is offset rows before the
> >> > >> current row within the partition; if there is no such row, instead
> >> > return
> >> > >> default. Both offset and default are evaluated with respect to the
> >> > >> current row. If omitted, offset defaults to 1 and default to null
> >> > >>
> >> > >>
> >> > >> is there any plan to allow offset according to needs but not
> restrict
> >> > >> equal to 1
> >> > >>
> >> > >> usecase :
> >> > >>
> >> > >> I have daily data for a month.
> >> > >> every day I want to do a delta with last week same day like compare
> >> > >> monday with monday and tuesday with tuesday so basically do a
> >> lag(col,
> >> > 7)
> >> > >>
> >> > >> --
> >> > >> Nitin Pawar
> >> > >>
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > Nitin Pawar
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > Nitin Pawar
> >> >
> >>
> >
> >
>



-- 
Nitin Pawar


Re: Window function

2016-12-03 Thread Khurram Faraaz
Hakim, thanks for sharing those details and the explanation.
I will file a JIRA and anyone interested can pick it up and provide the fix
to support OFFSET values greater than one for LAG window function.

Regards,
Khurram

On Fri, Dec 2, 2016 at 10:18 PM, deneche abdelhakim <adene...@gmail.com>
wrote:

> Hello Nitin,
>
> It's definitely possible to support offsets other than 1 for Lead and Lag,
> the main reason I didn't do it is just lack of time :P
>
> Things that need to be done to make Lag (or Lead) support offsets other
> than 1:
> - WindowFunction.Lead should extract the offset value from its FunctionCall
> argument, you can look at WindowFunctionNtile.numTilesFromExpression() for
> an example of how to do that.
> - make sure calls to copyNext() and copyPrev() in NoFrameSupportTemplate
> use the offset and not the hard coded value (you already figured that out)
> - finally make sure you update UnsupportedOperatorsVisitor to no longer
> throw an exception when we pass an offset value other than 1 to Lead or
> Lag. Just search for DRILL-3596 in that class and you will find the if
> block that needs to be removed
>
> I think this should be enough to get it to work in the general case, do you
> want to volunteer and get this done ? that would be an awesome contribution
> to the project.
>
> Thanks
>
> On Thu, Dec 1, 2016 at 10:10 PM Nitin Pawar <nitinpawar...@gmail.com>
> wrote:
>
> > any help on this ?
> >
> > from  class NoFrameSupportTemplate, I see that
> >
> > inIndex is hard coded to point  to previous row in case of lag and
> > next row in case of lead.
> >
> > Is there a way I can modify this and pass it as a parameter to pick the
> > appropriate row?
> >
> >
> > On Fri, Nov 25, 2016 at 2:57 PM, Nitin Pawar <nitinpawar...@gmail.com>
> > wrote:
> >
> > > adding dev list for comments
> > >
> > > On Wed, Nov 23, 2016 at 7:04 PM, Nitin Pawar <nitinpawar...@gmail.com>
> > > wrote:
> > >
> > >> Hi,
> > >>
> > >> according to DRILL-3596
> > >> <https://issues.apache.org/jira/browse/DRILL-3596>, lead or lag
> > function
> > >> are limited to use offset as 1.
> > >>
> > >> according to documentation on postgres
> > >> lag(value any [, offset integer [, default any ]]) same type as value
> > >> returns value evaluated at the row that is offset rows before the
> > >> current row within the partition; if there is no such row, instead
> > return
> > >> default. Both offset and default are evaluated with respect to the
> > >> current row. If omitted, offset defaults to 1 and default to null
> > >>
> > >>
> > >> is there any plan to allow offset according to needs but not restrict
> > >> equal to 1
> > >>
> > >> usecase :
> > >>
> > >> I have daily data for a month.
> > >> every day I want to do a delta with last week same day like compare
> > >> monday with monday and tuesday with tuesday so basically do a lag(col,
> > 7)
> > >>
> > >> --
> > >> Nitin Pawar
> > >>
> > >
> > >
> > >
> > > --
> > > Nitin Pawar
> > >
> >
> >
> >
> > --
> > Nitin Pawar
> >
>


Re: Window function

2016-12-01 Thread Nitin Pawar
Any help on this?

From class NoFrameSupportTemplate, I see that inIndex is hard coded to point
to the previous row in the case of lag and to the next row in the case of lead.

Is there a way I can modify this and pass it as a parameter to pick the
appropriate row?


On Fri, Nov 25, 2016 at 2:57 PM, Nitin Pawar 
wrote:

> adding dev list for comments
>
> On Wed, Nov 23, 2016 at 7:04 PM, Nitin Pawar 
> wrote:
>
>> Hi,
>>
>> according to DRILL-3596
>> , lead or lag function
>> are limited to use offset as 1.
>>
>> according to documentation on postgres
>> lag(value any [, offset integer [, default any ]]) same type as value
>> returns value evaluated at the row that is offset rows before the
>> current row within the partition; if there is no such row, instead return
>> default. Both offset and default are evaluated with respect to the
>> current row. If omitted, offset defaults to 1 and default to null
>>
>>
>> is there any plan to allow offset according to needs but not restrict
>> equal to 1
>>
>> usecase :
>>
>> I have daily data for a month.
>> every day I want to do a delta with last week same day like compare
>> monday with monday and tuesday with tuesday so basically do a lag(col, 7)
>>
>> --
>> Nitin Pawar
>>
>
>
>
> --
> Nitin Pawar
>



-- 
Nitin Pawar


Re: Window function

2016-11-25 Thread Nitin Pawar
adding dev list for comments

On Wed, Nov 23, 2016 at 7:04 PM, Nitin Pawar 
wrote:

> Hi,
>
> According to DRILL-3596 <https://issues.apache.org/jira/browse/DRILL-3596>,
> the lead and lag functions are limited to an offset of 1.
>
> According to the Postgres documentation:
> lag(value any [, offset integer [, default any ]]) same type as value
> returns value evaluated at the row that is offset rows before the current
> row within the partition; if there is no such row, instead return default.
> Both offset and default are evaluated with respect to the current row. If
> omitted, offset defaults to 1 and default to null
>
>
> Is there any plan to allow an arbitrary offset instead of restricting it
> to 1?
>
> Use case:
>
> I have daily data for a month. Every day I want to compute a delta against
> the same day of the previous week (compare Monday with Monday, Tuesday with
> Tuesday), so basically do a lag(col, 7).
>
> --
> Nitin Pawar
>
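
For that use case, the desired query would look roughly like the sketch below
(the daily_sales table and its columns are hypothetical; Drill currently
rejects any LAG/LEAD offset other than 1, per DRILL-3596):

SELECT sale_date,
       amount,
       amount - LAG(amount, 7) OVER (ORDER BY sale_date) AS delta_vs_same_day_last_week
FROM daily_sales;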



-- 
Nitin Pawar


[jira] [Created] (DRILL-4847) Window function query results in OOM Exception.

2016-08-16 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-4847:
-

 Summary: Window function query results in OOM Exception.
 Key: DRILL-4847
 URL: https://issues.apache.org/jira/browse/DRILL-4847
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.8.0
 Environment: 4 node cluster CentOS
Reporter: Khurram Faraaz
Priority: Critical


Window function query results in OOM Exception.

Drill version 1.8.0-SNAPSHOT git commit ID: 38ce31ca
MapRBuildVersion 5.1.0.37549.GA

{noformat}
0: jdbc:drill:schema=dfs.tmp> SELECT clientname, audiencekey, spendprofileid, 
postalcd, provincecd, provincename, postalcode_json, country_json, 
province_json, town_json, dma_json, msa_json, ROW_NUMBER() OVER (PARTITION BY 
spendprofileid  ORDER BY (CASE WHEN postalcd IS NULL THEN 9 ELSE 0 END) ASC, 
provincecd ASC) as rn FROM `MD593.parquet` limit 3;
Error: RESOURCE ERROR: One or more nodes ran out of memory while executing the 
query.

Failure while allocating buffer.
Fragment 0:0

[Error Id: 2287fe71-f0cb-469a-a563-11580fceb1c5 on centos-01.qa.lab:31010] 
(state=,code=0)
{noformat}

Stack trace from drillbit.log

{noformat}
2016-08-16 07:25:44,590 [284d4006-9f9d-b893-9352-4f54f9b1d52a:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - Query text for query id 
284d4006-9f9d-b893-9352-4f54f9b1d52a: SELECT clientname, audiencekey, 
spendprofileid, postalcd, provincecd, provincename, postalcode_json, 
country_json, province_json, town_json, dma_json, msa_json, ROW_NUMBER() OVER 
(PARTITION BY spendprofileid  ORDER BY (CASE WHEN postalcd IS NULL THEN 9 ELSE 
0 END) ASC, provincecd ASC) as rn FROM `MD593.parquet` limit 3
...
2016-08-16 07:25:46,273 [284d4006-9f9d-b893-9352-4f54f9b1d52a:frag:0:0] INFO  
o.a.d.e.p.i.xsort.ExternalSortBatch - Completed spilling to 
/tmp/drill/spill/284d4006-9f9d-b893-9352-4f54f9b1d52a_majorfragment0_minorfragment0_operator8/2
2016-08-16 07:25:46,283 [284d4006-9f9d-b893-9352-4f54f9b1d52a:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - User Error Occurred
org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more 
nodes ran out of memory while executing the query.

Failure while allocating buffer.

[Error Id: 2287fe71-f0cb-469a-a563-11580fceb1c5 ]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543)
 ~[drill-common-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:242)
 [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) 
[drill-common-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_101]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_101]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_101]
Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Failure while 
allocating buffer.
at 
org.apache.drill.exec.vector.NullableVarCharVector.allocateNew(NullableVarCharVector.java:187)
 ~[vector-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.RepeatedMapVector$RepeatedMapTransferPair.<init>(RepeatedMapVector.java:331)
 ~[vector-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.RepeatedMapVector$RepeatedMapTransferPair.<init>(RepeatedMapVector.java:307)
 ~[vector-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.RepeatedMapVector.getTransferPair(RepeatedMapVector.java:161)
 ~[vector-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.SimpleVectorWrapper.cloneAndTransfer(SimpleVectorWrapper.java:66)
 ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.VectorContainer.cloneAndTransfer(VectorContainer.java:204)
 ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.VectorContainer.getTransferClone(VectorContainer.java:157)
 ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.mergeAndSpill(ExternalSortBatch.java:569)
 ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:414)
 ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
 ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
 ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.n

[jira] [Created] (DRILL-4810) CalciteContextException - SELECT star, window function

2016-07-27 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-4810:
-

 Summary: CalciteContextException - SELECT star, window function
 Key: DRILL-4810
 URL: https://issues.apache.org/jira/browse/DRILL-4810
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.8.0
Reporter: Khurram Faraaz


This is a problem seen with regular window functions (not specific to nested 
aggregates).

{noformat}
SELECT *,  OVER...
does not work on Drill
{noformat}

The same query works on Postgres over the same data.

{noformat}
postgres=# select *, MIN(col9) over(partition by col7 order by col8) from 
fewrwspqq_101 GROUP BY col0,col1,col2,col3,col4,col5,col6,col7,col8,col9;
 col0 | col1  | col2 | col3   | col4        | col5                    | col6       | col7 | col8 | col9 | min
------+-------+------+--------+-------------+-------------------------+------------+------+------+------+------
    1 | 65534 |  256 | 1234.9 | 20:26:18.58 | 2014-03-02 00:28:02.338 | 1952-08-14 | f    | CA   | AXCZ | AXCZ
   13 |   200 |    1 | -65534 | 19:20:30.5  | 2004-06-02 00:28:02.418 | 1972-04-03 | f    | GA   | PXCD | AXCZ
    7 |    17 |  -33 |   33.9 | 13:13:13.13 | 2006-05-02 00:28:02.748 | 1992-12-12 | f    | I
...
{noformat}

Drill throws error
{noformat}
0: jdbc:drill:schema=dfs.tmp> select *, MIN(col9) over(partition by col7 order 
by col8) from `allTypsUniq.parquet` GROUP BY 
col0,col1,col2,col3,col4,col5,col6,col7,col8,col9;
Error: VALIDATION ERROR: At line 1, column 8: Expression 
'allTypsUniq.parquet.*' is not being grouped

SQL Query null

[Error Id: f83e438f-5cdf-42f2-a548-5dc17b49da07 on centos-01.qa.lab:31010] 
(state=,code=0)
{noformat}

Upon expanding the star in the projection, Drill returns results:
{noformat}
0: jdbc:drill:schema=dfs.tmp> select 
col0,col1,col2,col3,col4,col5,col6,col7,col8,col9, MIN(col9) over(partition by 
col7 order by col8) from `allTypsUniq.parquet` GROUP BY 
col0,col1,col2,col3,col4,col5,col6,col7,col8,col9;
+------+-------+-------+----------+---------------+--------------------------+-------------+--------+------+-------+----------+
| col0 | col1  | col2  | col3     | col4          | col5                     | col6        | col7   | col8 | col9  | EXPR$10  |
+------+-------+-------+----------+---------------+--------------------------+-------------+--------+------+-------+----------+
| 1    | 65534 | 256.0 | 1234.9   | 20:26:18.580  | 2014-03-02 00:28:02.338  | 1952-08-14  | false  | CA   | AXCZ  | AXCZ     |
| 13   | 200   | 1.0   | -65534.0 | 19:20:30.500  | 2004-06-02 00:28:02.418  | 1972-04-03  | false  | GA   | PXCD  | AXCZ     |
| 7    | 17    | -33.0 | 33.9     | 13:13:13.130  | 2006-05-02 00:28:02.748  | 1992-12-12  | false  | IA   | UXCB  | AXCZ     |
...
{noformat}

Stack trace from drillbit.log

{noformat}
2016-07-27 13:52:45,103 [28674351-99f0-2691-02f8-e6d576c85be1:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - Query text for query id 
28674351-99f0-2691-02f8-e6d576c85be1: select *, MIN(col9) over(partition by 
col7 order by col8) from `allTypsUniq.parquet` GROUP BY 
col0,col1,col2,col3,col4,col5,col6,col7,col8,col9
2016-07-27 13:52:45,132 [28674351-99f0-2691-02f8-e6d576c85be1:foreman] INFO  
o.a.d.exec.planner.sql.SqlConverter - User Error Occurred
org.apache.drill.common.exceptions.UserException: VALIDATION ERROR: At line 1, 
column 8: Expression 'allTypsUniq.parquet.*' is not being grouped

SQL Query null

[Error Id: c6758b36-8cb6-4965-861f-2aaf573c4002 ]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543)
 ~[drill-co

[jira] [Created] (DRILL-4808) CTE query with window function results in AssertionError

2016-07-26 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-4808:
-

 Summary: CTE query with window function results in AssertionError
 Key: DRILL-4808
 URL: https://issues.apache.org/jira/browse/DRILL-4808
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.8.0
Reporter: Khurram Faraaz


The query below, which uses a CTE and window functions, results in an AssertionError.
The same query over the same data works on Postgres.
MapR Drill 1.8.0 commit ID: 34ca63ba

{noformat}
0: jdbc:drill:schema=dfs.tmp> WITH v1 ( a, b, c, d ) AS
. . . . . . . . . . . . . . > (
. . . . . . . . . . . . . . > SELECT col0, col8, MAX(MIN(col8)) over 
(partition by col7 order by col8) as max_col8, col7 from `allTypsUniq.parquet` 
GROUP BY col0,col7,col8
. . . . . . . . . . . . . . > )
. . . . . . . . . . . . . . > select * from ( select a, b, c, d from v1 where c 
> 'IN' GROUP BY a,b,c,d order by a,b,c,d);
Error: SYSTEM ERROR: AssertionError: Internal error: Type 'RecordType(ANY col0, 
ANY col8, ANY max_col8, ANY col7)' has no field 'a'


[Error Id: 5c058176-741a-42cd-8433-0cd81115776b on centos-01.qa.lab:31010] 
(state=,code=0)
{noformat}

Stack trace from drillbit.log for above failing query

{noformat}
2016-07-26 16:57:04,627 [2868699e-ae56-66f4-9439-8db2132ef265:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - Query text for query id 
2868699e-ae56-66f4-9439-8db2132ef265: WITH v1 ( a, b, c, d ) AS
(
SELECT col0, col8, MAX(MIN(col8)) over (partition by col7 order by col8) as 
max_col8, col7 from `allTypsUniq.parquet` GROUP BY col0,col7,col8
)
select * from ( select a, b, c, d from v1 where c > 'IN' GROUP BY a,b,c,d order 
by a,b,c,d)
2016-07-26 16:57:04,666 [2868699e-ae56-66f4-9439-8db2132ef265:foreman] ERROR 
o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: AssertionError: Internal 
error: Type 'RecordType(ANY col0, ANY col8, ANY max_col8, ANY col7)' has no 
field 'a'


[Error Id: 5c058176-741a-42cd-8433-0cd81115776b on centos-01.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: AssertionError: 
Internal error: Type 'RecordType(ANY col0, ANY col8, ANY max_col8, ANY col7)' 
has no field 'a'


[Error Id: 5c058176-741a-42cd-8433-0cd81115776b on centos-01.qa.lab:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543)
 ~[drill-common-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:791)
 [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:901) 
[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:271) 
[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_101]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_101]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_101]
Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
exception during fragment initialization: Internal error: Type 'RecordType(ANY 
col0, ANY col8, ANY max_col8, ANY col7)' has no field 'a'
... 4 common frames omitted
Caused by: java.lang.AssertionError: Internal error: Type 'RecordType(ANY col0, 
ANY col8, ANY max_col8, ANY col7)' has no field 'a'
at org.apache.calcite.util.Util.newInternal(Util.java:777) 
~[calcite-core-1.4.0-drill-r14.jar:1.4.0-drill-r14]
at 
org.apache.calcite.rex.RexBuilder.makeFieldAccess(RexBuilder.java:167) 
~[calcite-core-1.4.0-drill-r14.jar:1.4.0-drill-r14]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:3225)
 ~[calcite-core-1.4.0-drill-r14.jar:1.4.0-drill-r14]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.access$1500(SqlToRelConverter.java:185)
 ~[calcite-core-1.4.0-drill-r14.jar:1.4.0-drill-r14]
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4181)
 ~[calcite-core-1.4.0-drill-r14.jar:1.4.0-drill-r14]
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:3603)
 ~[calcite-core-1.4.0-drill-r14.jar:1.4.0-drill-r14]
at org.apache.calcite.sql.SqlIdentifier.accept(SqlIdentifier.java:274) 
~[calcite-core-1.4.0-drill-r14.jar:1.4.0-drill-r14]
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4062)
 ~[calcite-core-1.4.0-drill-r14.jar:1.4.0-drill-r14]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectList(SqlToRelConverter.java:3433)
 ~[calcite-core-1.4.0-drill-r14.jar:1.4.0-drill-r14]
at 
org.apache.calcite.sql2rel.SqlToRelCo

[jira] [Created] (DRILL-4512) Revisit the changes for DRILL-3404 (using SUM0 for window function)

2016-03-15 Thread Aman Sinha (JIRA)
Aman Sinha created DRILL-4512:
-

 Summary: Revisit the changes for DRILL-3404 (using SUM0 for window 
function)
 Key: DRILL-4512
 URL: https://issues.apache.org/jira/browse/DRILL-4512
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.6.0
Reporter: Aman Sinha
Assignee: Aman Sinha


DRILL-3404 was an incorrect-results issue related to the SUM0 window function over a 
nullable column containing null values.  The change made in Calcite for that 
issue should be reverted: on the latest master, after reverting the 
Calcite change, I still get the correct result.  The Explain plan also shows 
that the new plan is different from the old one.  It seems there may have been 
nullability-related fix(es) in Calcite. 
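
For context, the rewrite that both plans below implement (visible in the Window's [COUNT($0), $SUM0($0)] aggregates and the CASE in the Project) can be sketched in plain SQL as follows; COALESCE(SUM(...), 0) stands in for the internal $SUM0 aggregate here, and the inline data is illustrative:

{code:sql}
/* SUM(c1) OVER w is planned as:
     CASE WHEN COUNT(c1) OVER w > 0 THEN $SUM0(c1) OVER w ELSE NULL END
   where $SUM0 returns 0 rather than NULL when the frame has no non-null input. */
SELECT c1, c2,
       CASE WHEN COUNT(c1) OVER w > 0
            THEN COALESCE(SUM(c1) OVER w, 0)
            ELSE NULL
       END AS w_sum
FROM (VALUES (10, 1), (NULL, 1), (NULL, 2)) AS t(c1, c2)
WINDOW w AS (PARTITION BY c2 ORDER BY c1);
/* expected: w_sum = 10 for both c2 = 1 rows and NULL for the c2 = 2 row */
{code}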

New plan after reverting the change for DRILL-3404:

{noformat}
00-00Screen
00-01  Project(c1=[$0], c2=[$1], w_sum=[$2])
00-02Project(c1=[$0], c2=[$1], w_sum=[CASE(>($2, 0), $3, null)])
00-03  SelectionVectorRemover
00-04Filter(condition=[>($2, 0)])
00-05  Window(window#0=[window(partition {1} order by [0 
ASC-nulls-first] range between UNBOUNDED PRECEDING and CURRENT ROW aggs 
[COUNT($0), $SUM0($0)])])
00-06SelectionVectorRemover
00-07  Sort(sort0=[$1], sort1=[$0], dir0=[ASC], 
dir1=[ASC-nulls-first])
00-08Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath 
[path=file:/Users/asinha/incubator-drill/exec/java-exec/src/test/resources/window/table_with_nulls.parquet]],
 
selectionRoot=file:/Users/asinha/incubator-drill/exec/java-exec/src/test/resources/window/table_with_nulls.parquet,
 numFiles=1, usedMetadataFile=false, columns=[`c1`, `c2`]]])

{noformat}

For reference, here's the old plan copied from DRILL-3404:

{noformat}
| 00-00Screen
00-01  Project(c1=[$0], c2=[$1], w_sum=[$2])
00-02Project(c1=[$0], c2=[$1], w_sum=[CASE(>($2, 0), $3, null)])
00-03  Window(window#0=[window(partition {1} order by [0 
ASC-nulls-first] range between UNBOUNDED PRECEDING and CURRENT ROW aggs 
[COUNT($0), $SUM0($0)])])
00-04SelectionVectorRemover
00-05  Sort(sort0=[$1], sort1=[$0], dir0=[ASC], 
dir1=[ASC-nulls-first])
00-06Project(c1=[$1], c2=[$0])
00-07  Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=maprfs:///tmp/tblWnulls]], 
selectionRoot=/tmp/tblWnulls, numFiles=1, columns=[`c1`, `c2`]]])
{noformat}

Notice the two plans are different due to the extra filter condition present in 
the new plan.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-4457) Difference in results returned by window function over BIGINT data

2016-03-07 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim resolved DRILL-4457.
-
Resolution: Fixed

Fixed in a2fec78695df979e240231cb9d32c7f18274a333

> Difference in results returned by window function over BIGINT data
> --
>
> Key: DRILL-4457
> URL: https://issues.apache.org/jira/browse/DRILL-4457
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.6.0
> Environment: 4 node cluster
>Reporter: Khurram Faraaz
>Assignee: Deneche A. Hakim
>  Labels: window_function
> Fix For: 1.6.0
>
>
> Difference in results returned by window function query over same data on 
> Drill vs on Postgres.
> Drill 1.6.0 commit ID 6d5f4983
> {noformat}
> Verification Failures:
> /root/public_framework/drill-test-framework/framework/resources/Functional/window_functions/frameclause/RBCRACR/RBCRACR_bgint_6.q
> Query:
> SELECT FIRST_VALUE(c3) OVER(PARTITION BY c8 ORDER BY c1 RANGE BETWEEN CURRENT 
> ROW AND CURRENT ROW) FROM `t_alltype.parquet`
>  Expected number of rows: 145
> Actual number of rows from Drill: 145
>  Number of matching rows: 143
>   Number of rows missing: 2
>Number of rows unexpected: 2
> These rows are not expected (first 10):
> 36022570792
> 21011901540311080
> These rows are missing (first 10):
> null (2 time(s))
> {noformat}
> Here is the difference in results: Drill 1.6.0 returns 36022570792 where 
> Postgres returns null, and Drill returns 21011901540311080 where Postgres 
> again returns null.
> {noformat}
> [root@centos-01 drill-output]# diff -cb 
> RBCRACR_RBCRACR_bgint_6.output_Tue_Mar_01_10\:36\:42_UTC_2016 
> ../resources/Functional/window_functions/frameclause/RBCRACR/RBCRACR_bgint_6.e
> *** RBCRACR_RBCRACR_bgint_6.output_Tue_Mar_01_10:36:42_UTC_2016   
> 2016-03-01 10:36:43.012382649 +
> --- 
> ../resources/Functional/window_functions/frameclause/RBCRACR/RBCRACR_bgint_6.e
> 2016-03-01 10:32:56.605677914 +
> ***
> *** 55,61 
>   5424751352
>   3734160392
>   36022570792
> ! 36022570792
>   584831936
>   37102817894137256
>   61958708627376736
> --- 55,61 
>   5424751352
>   3734160392
>   36022570792
> ! null
>   584831936
>   37102817894137256
>   61958708627376736
> ***
> *** 64,70 
>   29537626363643852
>   52598911986023288
>   21011901540311080
> ! 21011901540311080
>   17990322900862228
>   61608051272
>   3136812789494
> --- 64,70 
>   29537626363643852
>   52598911986023288
>   21011901540311080
> ! null
>   17990322900862228
>   61608051272
>   3136812789494
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4453) Difference in results over char data, window function query

2016-02-29 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-4453:
-

 Summary: Difference in results over char data, window function 
query
 Key: DRILL-4453
 URL: https://issues.apache.org/jira/browse/DRILL-4453
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.6.0
 Environment: 4 node cluster
Reporter: Khurram Faraaz


A window function query with a frame clause returns results that differ from 
those returned by the same query over the same data on Postgres 9.3.

The difference is that Drill returns two extra nulls where Postgres does not. 
Note that the table has the same number of nulls in both Drill and Postgres.

{noformat}
postgres=# \d t_alltype
 Table "public.t_alltype"
 Column |            Type             | Modifiers
--------+-----------------------------+-----------
 c1     | integer                     |
 c2     | integer                     |
 c3     | bigint                      |
 c4     | character(256)              |
 c5     | character varying(256)      |
 c6     | timestamp without time zone |
 c7     | date                        |
 c8     | boolean                     |
 c9     | double precision            |
postgres=# select c4 from t_alltype where c4 is null;
 c4
----



(3 rows)

{noformat}

{noformat}
postgres=# SELECT MIN(c4) OVER(PARTITION BY c8 ORDER BY c1 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM t_alltype;

   min
---------------
 gwfrW
 ZAFOcferhjkcl
 ZAFOcferhjkcl
 ZAFOcferhjkcl
 ZAFOcferhjkcl
 ...
 ...
 
 ApKK
 ApKK















(145 rows)
{noformat}

Parquet schema details

{noformat}
[root@centos-01 parquet-tools]# ./parquet-schema 
./Datasources/window_functions/t_alltype.parquet
message root {
  optional int32 c1;
  optional int32 c2;
  optional int64 c3;
  optional binary c4 (UTF8);
  optional binary c5 (UTF8);
  optional int64 c6 (TIMESTAMP_MILLIS);
  optional int32 c7 (DATE);
  optional boolean c8;
  optional double c9;
}
{noformat}

On Drill 1.6.0 

{noformat}
0: jdbc:drill:schema=dfs.tmp> SELECT MIN(c4) OVER(PARTITION BY c8 ORDER BY c1 
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM dfs.tmp.`t_alltype`;
+----------------+
|     EXPR$0     |
+----------------+
| gwfrW          |
| ZAFOcferhjkcl  |
| ZAFOcferhjkcl  |
| ZAFOcferhjkcl  |
| ZAFOcferhjkcl  |
...
...
| ApKK           |
| ApKK           |
|                |
|                |
|                |
|                |
|                |
|                |
|                |
|                |
|                |
|                |
| null           |
| null           |
|                |
|                |
|                |
+----------------+
145 rows selected (0.409 seconds)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4444) Window function query results in IllegalStateException

2016-02-26 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-4444:
-

 Summary: Window function query results in IllegalStateException
 Key: DRILL-4444
 URL: https://issues.apache.org/jira/browse/DRILL-4444
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.6.0
Reporter: Khurram Faraaz


Window function query results in IllegalStateException
Drill 1.6.0 commit ID: 6d5f4983

{noformat}
0: jdbc:drill:schema=dfs.tmp> SELECT
. . . . . . . . . . . . . . > RANK() OVER (ORDER BY c1 DESC),
. . . . . . . . . . . . . . > AVG(c3) OVER (PARTITION BY c8 ORDER BY 
MIN(c3) DESC NULLS FIRST RANGE BETWEEN CURRENT ROW AND CURRENT ROW)
. . . . . . . . . . . . . . > FROM dfs.tmp.`t_alltype`
. . . . . . . . . . . . . . > WINDOW w AS (PARTITION BY c8 ORDER BY c2 DESC 
NULLS FIRST RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING);
Error: SYSTEM ERROR: IllegalStateException: This generator does not support 
mappings beyond

Fragment 0:0

[Error Id: a273f3c1-47a7-450b-b9d7-65c2608089d5 on centos-03.qa.lab:31010] 
(state=,code=0)
{noformat}

Stack trace from drillbit.log

{noformat}
2016-02-26 11:25:32,925 [292fc9d3-28f5-6eb0-ec6a-99d5b90ec968:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - Query text for query id 
292fc9d3-28f5-6eb0-ec6a-99d5b90ec968: SELECT
RANK() OVER (ORDER BY c1 DESC),
AVG(c3) OVER (PARTITION BY c8 ORDER BY MIN(c3) DESC NULLS FIRST RANGE 
BETWEEN CURRENT ROW AND CURRENT ROW)
FROM dfs.tmp.`t_alltype`
WINDOW w AS (PARTITION BY c8 ORDER BY c2 DESC NULLS FIRST RANGE BETWEEN 
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
2016-02-26 11:25:33,056 [292fc9d3-28f5-6eb0-ec6a-99d5b90ec968:foreman] INFO  
o.a.d.exec.store.parquet.Metadata - Took 1 ms to get file statuses
2016-02-26 11:25:33,059 [292fc9d3-28f5-6eb0-ec6a-99d5b90ec968:foreman] INFO  
o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of 1 
using 1 threads. Time: 2ms total, 2.395101ms avg, 2ms max.
2016-02-26 11:25:33,059 [292fc9d3-28f5-6eb0-ec6a-99d5b90ec968:foreman] INFO  
o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of 1 
using 1 threads. Earliest start: 1.325000 μs, Latest start: 1.325000 μs, 
Average start: 1.325000 μs .
2016-02-26 11:25:33,059 [292fc9d3-28f5-6eb0-ec6a-99d5b90ec968:foreman] INFO  
o.a.d.exec.store.parquet.Metadata - Took 2 ms to read file metadata
2016-02-26 11:25:33,130 [292fc9d3-28f5-6eb0-ec6a-99d5b90ec968:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 292fc9d3-28f5-6eb0-ec6a-99d5b90ec968:0:0: 
State change requested AWAITING_ALLOCATION --> RUNNING
2016-02-26 11:25:33,130 [292fc9d3-28f5-6eb0-ec6a-99d5b90ec968:frag:0:0] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 292fc9d3-28f5-6eb0-ec6a-99d5b90ec968:0:0: 
State to report: RUNNING
2016-02-26 11:25:33,151 [292fc9d3-28f5-6eb0-ec6a-99d5b90ec968:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 292fc9d3-28f5-6eb0-ec6a-99d5b90ec968:0:0: 
State change requested RUNNING --> FAILED
2016-02-26 11:25:33,151 [292fc9d3-28f5-6eb0-ec6a-99d5b90ec968:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 292fc9d3-28f5-6eb0-ec6a-99d5b90ec968:0:0: 
State change requested FAILED --> FINISHED
2016-02-26 11:25:33,152 [292fc9d3-28f5-6eb0-ec6a-99d5b90ec968:frag:0:0] ERROR 
o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: This 
generator does not support mappings beyond

Fragment 0:0

[Error Id: a273f3c1-47a7-450b-b9d7-65c2608089d5 on centos-03.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
IllegalStateException: This generator does not support mappings beyond

Fragment 0:0

[Error Id: a273f3c1-47a7-450b-b9d7-65c2608089d5 on centos-03.qa.lab:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543)
 ~[drill-common-1.6.0-SNAPSHOT.jar:1.6.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:318)
 [drill-java-exec-1.6.0-SNAPSHOT.jar:1.6.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:185)
 [drill-java-exec-1.6.0-SNAPSHOT.jar:1.6.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:287)
 [drill-java-exec-1.6.0-SNAPSHOT.jar:1.6.0-SNAPSHOT]
at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) 
[drill-common-1.6.0-SNAPSHOT.jar:1.6.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_65]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_65]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_65]
Caused by: java.lang.IllegalStateException: This generator does not support 
mappings beyond
at 
org.apache.drill.exec.compile.s

[jira] [Resolved] (DRILL-4148) NullPointerException in planning when running window function query

2016-02-01 Thread Sean Hsuan-Yi Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Hsuan-Yi Chu resolved DRILL-4148.
--
Resolution: Cannot Reproduce

> NullPointerException in planning when running window function query
> ---
>
> Key: DRILL-4148
> URL: https://issues.apache.org/jira/browse/DRILL-4148
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.4.0
> Environment: 4 node cluster CentOS
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>
> Failing test case : Functional/window_functions/lag_func/lag_Fn_4.q
> Tests were run on JDK8 with assertions enabled.
> {noformat}
> [root@centos-01 lag_func]# java -version
> openjdk version "1.8.0_65"
> OpenJDK Runtime Environment (build 1.8.0_65-b17)
> OpenJDK 64-Bit Server VM (build 25.65-b01, mixed mode)
> [root@centos-01 lag_func]# uname -a
> Linux centos-01 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 03:15:09 UTC 2013 
> x86_64 x86_64 x86_64 GNU/Linux
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 2015-12-01 03:31:50,133 [29a2eb59-d5a3-f0f2-d2df-9ad4e5d17109:foreman] ERROR 
> o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: NullPointerException
> [Error Id: ab046c45-4a2d-428d-8c72-592a02ea53e5 on centos-02.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> NullPointerException
> [Error Id: ab046c45-4a2d-428d-8c72-592a02ea53e5 on centos-02.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534)
>  ~[drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:742)
>  [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:841)
>  [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:786)
>  [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
> at 
> org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) 
> [drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:788)
>  [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:894) 
> [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:255) 
> [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_65]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_65]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_65]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: null
> ... 4 common frames omitted
> Caused by: java.lang.NullPointerException: null
> at 
> com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191) 
> ~[guava-14.0.1.jar:na]
> at 
> org.apache.calcite.rel.metadata.CachingRelMetadataProvider$CachingInvocationHandler.<init>(CachingRelMetadataProvider.java:104)
>  ~[calcite-core-1.4.0-drill-r9.jar:1.4.0-drill-r9]
> at 
> org.apache.calcite.rel.metadata.CachingRelMetadataProvider$2.apply(CachingRelMetadataProvider.java:78)
>  ~[calcite-core-1.4.0-drill-r9.jar:1.4.0-drill-r9]
> at 
> org.apache.calcite.rel.metadata.CachingRelMetadataProvider$2.apply(CachingRelMetadataProvider.java:75)
>  ~[calcite-core-1.4.0-drill-r9.jar:1.4.0-drill-r9]
> at 
> org.apache.calcite.plan.hep.HepRelMetadataProvider$1.apply(HepRelMetadataProvider.java:45)
>  ~[calcite-core-1.4.0-drill-r9.jar:1.4.0-drill-r9]
> at 
> org.apache.calcite.plan.hep.HepRelMetadataProvider$1.apply(HepRelMetadataProvider.java:34)
>  ~[calcite-core-1.4.0-drill-r9.jar:1.4.0-drill-r9]
> at 
> org.apache.calcite.rel.metadata.CachingRelMetadataProvider$2.apply(CachingRelMetadataProvider.java:77)
>  ~[calcite-core-1.4.0-drill-r9.jar:1.4.0-drill-r9]
> at 
> org.apache.calcite.rel.metadata.CachingRelMetadataProvider$2.apply(CachingRelMetadataProvider.java:75)
>  ~[calcite-core-1.4.0-drill-r9.jar:1.4.0-drill-r9]
> 

[jira] [Created] (DRILL-4320) Difference in query plan on JDK8 for window function query

2016-01-27 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-4320:
-

 Summary: Difference in query plan on JDK8 for window function query
 Key: DRILL-4320
 URL: https://issues.apache.org/jira/browse/DRILL-4320
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.4.0
 Environment: 4 node cluster CentOS
Reporter: Khurram Faraaz


A difference in the query plan is seen for a window function query on JDK8 with the 
test environment below: a Project is missing after the initial Scan, and the new plan 
looks more optimized. Should we update the expected query plan, or is further 
investigation required?

Java 8
MapR Drill 1.4.0 GA
JDK8
MapR FS 5.0.0 GA

Functional/window_functions/optimization/plan/pp_03.sql

{noformat}
Actual plan 

00-00Screen
00-01  Project(EXPR$0=[$0])
00-02Project($0=[$2])
00-03  Window(window#0=[window(partition {1} order by [] range between 
UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($0)])])
00-04SelectionVectorRemover
00-05  Sort(sort0=[$1], dir0=[ASC])
00-06Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/t1]], 
selectionRoot=maprfs:/drill/testdata/subqueries/t1, numFiles=1, 
usedMetadataFile=false, columns=[`a1`, `c1`]]])

Expected plan 

 Screen
 .*Project.*
   .*Project.*
 .*Window.*range between UNBOUNDED PRECEDING and UNBOUNDED 
FOLLOWING aggs.*
   .*SelectionVectorRemover.*
 .*Sort.*
   .*Project.*
 .*Scan.*
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4267) Multiple window function operators instead of one

2016-01-12 Thread Deneche A. Hakim (JIRA)
Deneche A. Hakim created DRILL-4267:
---

 Summary: Multiple window function operators instead of one
 Key: DRILL-4267
 URL: https://issues.apache.org/jira/browse/DRILL-4267
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.5.0
Reporter: Deneche A. Hakim
Priority: Minor


Changing the order of window functions in a query changes the number of window 
function operators in the plan.

The following query generates a plan with a single window function operator:
{noformat}
0: jdbc:drill:zk=local> EXPLAIN PLAN FOR SELECT ROW_NUMBER() OVER w, COUNT(*) 
OVER w FROM cp.`employee.json` WINDOW w AS (PARTITION BY position_id ORDER BY 
salary);
+--+--+
| text | json |
+--+--+
| 00-00Screen
00-01  Project(EXPR$0=[$0], EXPR$1=[$1])
00-02Project(EXPR$0=[$0], EXPR$1=[$1])
00-03  Project($0=[$2], $1=[$3])
00-04Window(window#0=[window(partition {0} order by [1] rows 
between UNBOUNDED PRECEDING and CURRENT ROW aggs [ROW_NUMBER(), COUNT()])])
00-05  SelectionVectorRemover
00-06Sort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])
00-07  Scan(groupscan=[EasyGroupScan 
[selectionRoot=classpath:/employee.json, numFiles=1, columns=[`position_id`, 
`salary`], files=[classpath:/employee.json]]])
{noformat}

But when we permute the window functions in the query, we get 2 window function 
operators in the plan:
{noformat}
0: jdbc:drill:zk=local> EXPLAIN PLAN FOR SELECT COUNT(*) OVER w, ROW_NUMBER() 
OVER w FROM cp.`employee.json` WINDOW w AS (PARTITION BY position_id ORDER BY 
salary);
+--+--+
| text | json |
+--+--+
| 00-00Screen
00-01  Project(EXPR$0=[$0], EXPR$1=[$1])
00-02Project(EXPR$0=[$0], EXPR$1=[$1])
00-03  Project($0=[$2], $1=[$3])
00-04Window(window#0=[window(partition {0} order by [1] rows 
between UNBOUNDED PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])])
00-05  Window(window#0=[window(partition {0} order by [1] range 
between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT()])])
00-06SelectionVectorRemover
00-07  Sort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])
00-08Scan(groupscan=[EasyGroupScan 
[selectionRoot=classpath:/employee.json, numFiles=1, columns=[`position_id`, 
`salary`], files=[classpath:/employee.json]]])
{noformat}
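
A plausible explanation (not confirmed in the ticket) is that ROW_NUMBER() is always 
planned with a ROWS frame while an aggregate such as COUNT(*) with an ORDER BY defaults 
to a RANGE frame, and only window functions whose frames match end up sharing one window 
operator. Below is an untested sketch of a possible workaround, assuming Drill accepts 
this explicit ROWS frame, and keeping in mind that switching COUNT(*) from RANGE to ROWS 
changes how peer rows (ties on salary) are counted:

{code:sql}
-- Untested sketch: spell out the same ROWS frame for the aggregate so that both
-- functions can potentially be planned in a single window group.
EXPLAIN PLAN FOR
SELECT COUNT(*)     OVER (PARTITION BY position_id ORDER BY salary
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
       ROW_NUMBER() OVER (PARTITION BY position_id ORDER BY salary)
FROM cp.`employee.json`;
{code}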




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3786) Query with window function fails with IllegalFormatConversionException

2016-01-09 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim resolved DRILL-3786.
-
Resolution: Fixed

Fixed by DRILL-4174

> Query with window function fails with IllegalFormatConversionException
> --
>
> Key: DRILL-3786
> URL: https://issues.apache.org/jira/browse/DRILL-3786
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 10 Performance Nodes
> DRILL_MAX_DIRECT_MEMORY=100g
> DRILL_INIT_HEAP="8g"
> DRILL_MAX_HEAP="8g"
> planner.memory.query_max_memory_per_node bumped up to 20 GB
> TPC-DS SF 1000 dataset (Parquet)
>Reporter: Abhishek Girish
>Assignee: Deneche A. Hakim
>Priority: Critical
> Fix For: 1.4.0
>
> Attachments: drillbit.log.txt, query_profile.json
>
>
> Query fails with Runtime exception:
> {code:sql}
> SELECT sum(s.ss_quantity) OVER (PARTITION BY s.ss_store_sk, s.ss_customer_sk 
> ORDER BY s.ss_store_sk) FROM store_sales s LIMIT 20;
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> IllegalFormatConversionException: d != java.lang.Character
> Fragment 1:0
> [Error Id: 12b51c0c-4992-4ceb-89c4-c99307529c7e on ucs-node8.perf.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Query logs and profile attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4216) Aggregate Window Function COUNT() With GROUP BY Clause expected: range(0, 32768)

2015-12-21 Thread PIPELINE (JIRA)
PIPELINE created DRILL-4216:
---

 Summary: Aggregate Window Function COUNT() With GROUP BY Clause 
expected: range(0, 32768)
 Key: DRILL-4216
 URL: https://issues.apache.org/jira/browse/DRILL-4216
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.3.0
 Environment: Hadoop-2.5.2
Hbase-0.9.15
Java-1.7.0_85
Reporter: PIPELINE


*When the column is row_key, it works well!*
0: jdbc:drill:> select count(row_key) over() from hbase.web_initial_20151222 wi 
group by row_key limit 3;
+-+
| EXPR$0  |
+-+
| 102906  |
| 102906  |
| 102906  |
+-+
3 rows selected (1.645 seconds)

*When the column is Hbase.TableName.ColumnFamily.Qualifier and count(column) is less 
than 32768, it works well!*
0: jdbc:drill:> select count(wi.cf1.q5) over() from hbase.web_initial_20151214 
wi group by wi.cf1.q5 limit 3;
+-+
| EXPR$0  |
+-+
| 10383   |
| 10383   |
| 10383   |
+-+
3 rows selected (1.044 seconds)

{color:red}
When the column is Hbase.TableName.ColumnFamily.Qualifier and count(column) is more 
than 32768, an IndexOutOfBoundsException occurs
{color}

0: jdbc:drill:> select count(wi.cf1.q5) over() from hbase.web_initial_20151222 
wi group by wi.cf1.q5 limit 3;
Error: SYSTEM ERROR: IndexOutOfBoundsException: index: 0, length: 62784 
(expected: range(0, 32768))

Fragment 0:0

[Error Id: 77406a8a-8389-4f1b-af6c-d26d811379b7 on slave4.hadoop:31010] 
(state=,code=0)
java.sql.SQLException: SYSTEM ERROR: IndexOutOfBoundsException: index: 0, 
length: 62784 (expected: range(0, 32768))

Fragment 0:0

[Error Id: 77406a8a-8389-4f1b-af6c-d26d811379b7 on slave4.hadoop:31010]
at 
org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:247)
at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:320)
at 
net.hydromatic.avatica.AvaticaResultSet.next(AvaticaResultSet.java:187)
at 
org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:160)
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:62)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
at sqlline.SqlLine.print(SqlLine.java:1593)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:746)
at sqlline.SqlLine.begin(SqlLine.java:621)
at sqlline.SqlLine.start(SqlLine.java:375)
at sqlline.SqlLine.main(SqlLine.java:268)
Caused by: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM 
ERROR: IndexOutOfBoundsException: index: 0, length: 62784 (expected: range(0, 
32768))

Fragment 0:0

[Error Id: 77406a8a-8389-4f1b-af6c-d26d811379b7 on slave4.hadoop:31010]
at 
org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:118)
at 
org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:112)
at 
org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:47)
at 
org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:32)
at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:69)
at org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:400)
at 
org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:105)
at 
org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:264)
at 
org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:142)
at 
org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:298)
at 
org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:269)
at 
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext

[GitHub] drill pull request: DRILL-3786: Query with window function fails w...

2015-11-13 Thread parthchandra
Github user parthchandra commented on the pull request:

https://github.com/apache/drill/pull/239#issuecomment-156506328
  
+1. Looks good.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3786: Query with window function fails w...

2015-11-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/239


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (DRILL-4061) Incorrect results returned by window function query.

2015-11-11 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz resolved DRILL-4061.
---
Resolution: Not A Problem

> Incorrect results returned by window function query.
> 
>
> Key: DRILL-4061
> URL: https://issues.apache.org/jira/browse/DRILL-4061
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.3.0
> Environment: 4 node cluster CentOS
>Reporter: Khurram Faraaz
>     Attachments: 0_0_0.parquet
>
>
> Window function query that uses lag function returns incorrect results.
> sys.version => 3a73f098
> Drill 1.3
> Test parquet file is attached here.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE testrepro AS SELECT 
> CAST(columns[0] AS INT) col0, CAST(columns[1] AS INT) col1 FROM 
> `testRepro.csv`;
> +---++
> | Fragment  | Number of records written  |
> +---++
> | 0_0   | 11 |
> +---++
> 1 row selected (0.542 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select col1, 1 / (col1 - lag(col1) OVER (ORDER 
> BY col0)) from testrepro;
> +---+-+
> | col1  | EXPR$1  |
> +---+-+
> | 11| null|
> | 9 | 0   |
> | 0 | 0   |
> | 10| 0   |
> | 19| 0   |
> | 13| 0   |
> | 17| 0   |
> | -1| 0   |
> | 1 | 0   |
> | 20| 0   |
> | 100   | 0   |
> +---+-+
> 11 rows selected (0.451 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4061) Incorrect results returned by window function query.

2015-11-10 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-4061:
-

 Summary: Incorrect results returned by window function query.
 Key: DRILL-4061
 URL: https://issues.apache.org/jira/browse/DRILL-4061
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.3.0
 Environment: 4 node cluster CentOS
Reporter: Khurram Faraaz


Window function query that uses lag function returns incorrect results.
sys.version => 3a73f098
Drill 1.3
Test parquet file is attached here.

{code}
0: jdbc:drill:schema=dfs.tmp> CREATE TABLE testrepro AS SELECT CAST(columns[0] 
AS INT) col0, CAST(columns[1] AS INT) col1 FROM `testRepro.csv`;
+---++
| Fragment  | Number of records written  |
+---++
| 0_0   | 11 |
+---++
1 row selected (0.542 seconds)
0: jdbc:drill:schema=dfs.tmp> select col1, 1 / (col1 - lag(col1) OVER (ORDER BY 
col0)) from testrepro;
+---+-+
| col1  | EXPR$1  |
+---+-+
| 11| null|
| 9 | 0   |
| 0 | 0   |
| 10| 0   |
| 19| 0   |
| 13| 0   |
| 17| 0   |
| -1| 0   |
| 1 | 0   |
| 20| 0   |
| 100   | 0   |
+---+-+
11 rows selected (0.451 seconds)
{code}
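
This ticket was later resolved as "Not A Problem" (see the resolution above), which is 
consistent with integer division semantics: both operands of 1 / (col1 - lag(col1) ...) 
are INT, so the quotient truncates to 0 whenever the absolute difference exceeds 1. A 
minimal sketch of how to obtain fractional results instead, assuming the same testrepro 
table:

{code:sql}
-- Sketch: use a non-integer numerator so the division is not truncated to 0.
SELECT col1,
       1.0 / (col1 - LAG(col1) OVER (ORDER BY col0)) AS inv_diff
FROM testrepro;
{code}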



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3770) Query with window function having just ORDER BY clause runs out of memory on large datasets

2015-11-10 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim resolved DRILL-3770.
-
   Resolution: Fixed
Fix Version/s: (was: 1.4.0)
   1.3.0

Fixed by DRILL-3952

> Query with window function having just ORDER BY clause runs out of memory on 
> large datasets
> ---
>
> Key: DRILL-3770
> URL: https://issues.apache.org/jira/browse/DRILL-3770
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
> Environment: 10 Performance Nodes
> DRILL_MAX_DIRECT_MEMORY=100g
> DRILL_INIT_HEAP="8g"
> DRILL_MAX_HEAP="8g"
> planner.memory.query_max_memory_per_node bumped up to 20 GB
> TPC-DS SF 1000 dataset (Parquet)
>Reporter: Abhishek Girish
>Assignee: Deneche A. Hakim
>Priority: Critical
>  Labels: window_function
> Fix For: 1.3.0
>
> Attachments: drillbit.log.txt, profile.json
>
>
> The following query runs out of memory:
> {code:sql}
> SELECT SUM(ss.ss_net_paid_inc_tax) OVER (ORDER BY ss.ss_store_sk) FROM 
> store_sales ss LIMIT 20;
> java.lang.RuntimeException: java.sql.SQLException: RESOURCE ERROR: One or 
> more nodes ran out of memory while executing the query.
> Fragment 0:0
> [Error Id: 9c441211-65ec-4206-9e6b-d6ae9c2903be on ucs-node6.perf.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3786) Query with window function fails with IllegalFormatConversionException

2015-09-15 Thread Abhishek Girish (JIRA)
Abhishek Girish created DRILL-3786:
--

 Summary: Query with window function fails with 
IllegalFormatConversionException
 Key: DRILL-3786
 URL: https://issues.apache.org/jira/browse/DRILL-3786
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.2.0
 Environment: 10 Performance Nodes
DRILL_MAX_DIRECT_MEMORY=100g
DRILL_INIT_HEAP="8g"
DRILL_MAX_HEAP="8g"
planner.memory.query_max_memory_per_node bumped up to 20 GB
TPC-DS SF 1000 dataset (Parquet)
Reporter: Abhishek Girish
Assignee: Jinfeng Ni


Query fails with Runtime exception:

{code:sql}
SELECT sum(s.ss_quantity) OVER (PARTITION BY s.ss_store_sk, s.ss_customer_sk 
ORDER BY s.ss_store_sk) FROM store_sales s LIMIT 20;
java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
IllegalFormatConversionException: d != java.lang.Character

Fragment 1:0

[Error Id: 12b51c0c-4992-4ceb-89c4-c99307529c7e on ucs-node8.perf.lab:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)
{code}

Query logs and profile attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3677) Wrong result with LEAD window function when used in multiple windows in the same query

2015-08-20 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman resolved DRILL-3677.
-
Resolution: Invalid

 Wrong result with LEAD window function when used in multiple windows in the 
 same query
 --

 Key: DRILL-3677
 URL: https://issues.apache.org/jira/browse/DRILL-3677
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.2.0
Reporter: Victoria Markman
Assignee: Deneche A. Hakim
Priority: Critical
 Fix For: 1.2.0

 Attachments: j1.tar, q46.out, q46.res


 Query produces wrong result:
 {code}
 select
 c_integer,
 lead(c_integer) over (order by c_integer),
 lead(c_integer) over (partition by c_time order by c_date)
 from
 j1
 ;
 {code}
 Attached: q46.res (result generated from postgres)
 q46.out (output from Drill)
 j1.tar - table used in the query
 I tried to reproduce the same error on a smaller data set without any 
 success. Tried with the table with the same data as j1, but single parquet 
 file: same behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3677) Wrong result with LEAD window function when used in multiple windows in the same query

2015-08-20 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3677:
---

 Summary: Wrong result with LEAD window function when used in 
multiple windows in the same query
 Key: DRILL-3677
 URL: https://issues.apache.org/jira/browse/DRILL-3677
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.2.0
Reporter: Victoria Markman
Assignee: Deneche A. Hakim
Priority: Critical
 Fix For: 1.2.0


Query produces wrong result:
{code}
select
c_integer,
lead(c_integer) over (order by c_integer),
lead(c_integer) over (partition by c_time order by c_date)
from
j1
;
{code}

Attached: q46.res (result generated from postgres)
q46.out (output from Drill)
j1.tar - table used in the query

I tried to reproduce the same error on a smaller data set without any success. I also 
tried a table with the same data as j1, but as a single parquet file: same error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3680) window function query returns Incorrect results

2015-08-20 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-3680:
-

 Summary: window function query returns Incorrect results 
 Key: DRILL-3680
 URL: https://issues.apache.org/jira/browse/DRILL-3680
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.2.0
 Environment: private-branch 
https://github.com/adeneche/incubator-drill/tree/new-window-funcs
Reporter: Khurram Faraaz
Assignee: Chris Westin
Priority: Critical



Query plan from Drill for the query that returns wrong results
{code}
0: jdbc:drill:schema=dfs.tmp> explain plan for select c1 , c2 , lead(c2) OVER ( 
PARTITION BY c2 ORDER BY c1) lead_c2 FROM (SELECT c1 , c2, ntile(3) 
over(PARTITION BY c2 ORDER BY c1) FROM `tblWnulls.parquet`);
+--+--+
| text | json |
+--+--+
| 00-00Screen
00-01  Project(c1=[$0], c2=[$1], lead_c2=[$2])
00-02Project(c1=[$0], c2=[$1], lead_c2=[$2])
00-03  Project(c1=[$0], c2=[$1], $2=[$3])
00-04Window(window#0=[window(partition {1} order by [0] range 
between UNBOUNDED PRECEDING and CURRENT ROW aggs [LEAD($1)])])
00-05  Window(window#0=[window(partition {1} order by [0] range 
between UNBOUNDED PRECEDING and CURRENT ROW aggs [NTILE($2)])])
00-06SelectionVectorRemover
00-07  Sort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC])
00-08Project(c1=[$1], c2=[$0])
00-09  Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=maprfs:///tmp/tblWnulls.parquet]], 
selectionRoot=maprfs:/tmp/tblWnulls.parquet, numFiles=1, columns=[`c1`, `c2`]]])
{code}

Results returned by Drill.
{code}
0: jdbc:drill:schema=dfs.tmp> select c1 , c2 , lead(c2) OVER ( PARTITION BY c2 
ORDER BY c1) lead_c2 FROM (SELECT c1 , c2, ntile(3) over(PARTITION BY c2 ORDER 
BY c1) FROM `tblWnulls.parquet`);
+-+---+--+
| c1  |  c2   | lead_c2  |
+-+---+--+
| 0   | a | null |
| 1   | a | null |
| 5   | a | null |
| 10  | a | null |
| 11  | a | null |
| 14  | a | null |
| 1   | a | null |
| 2   | b | null |
| 9   | b | null |
| 13  | b | null |
| 17  | b | null |
| 4   | c | null |
| 6   | c | null |
| 8   | c | null |
| 12  | c | null |
| 13  | c | null |
| 13  | c | null |
| null| c | null |
| 10  | d | null |
| 11  | d | null |
| 2147483647  | d | null |
| 2147483647  | d | null |
| null| d | null |
| null| d | null |
| -1  | e | null |
| 15  | e | null |
| 19  | null  | null |
| 65536   | null  | null |
| 100 | null  | null |
| null| null  | null |
+-+---+--+
30 rows selected (0.339 seconds)
{code}

Results returned by Postgres

{code}
postgres=# select c1 , c2 , lead(c2) OVER ( PARTITION BY c2 ORDER BY c1) 
lead_c2 FROM (SELECT c1 , c2, ntile(3) over(PARTITION BY c2 ORDER BY c1) FROM 
t222) sub_query;
 c1 | c2 | lead_c2 
++-
  0 | a  | a
  1 | a  | a
  5 | a  | a
 10 | a  | a
 11 | a  | a
 14 | a  | a
  1 | a  | 
  2 | b  | b
  9 | b  | b
 13 | b  | b
 17 | b  | 
  4 | c  | c
  6 | c  | c
  8 | c  | c
 12 | c  | c
 13 | c  | c
 13 | c  | c
| c  | 
 10 | d  | d
 11 | d  | d
 2147483647 | d  | d
 2147483647 | d  | d
| d  | d
| d  | 
 -1 | e  | e
 15 | e  | 
 19 || 
  65536 || 
100 || 
|| 
(30 rows)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3619) Add support for NTILE window function

2015-08-18 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim resolved DRILL-3619.
-
Resolution: Fixed

Fixed in b55e2328d929df5d361c038f63fdeffadb0e544c

 Add support for NTILE window function
 -

 Key: DRILL-3619
 URL: https://issues.apache.org/jira/browse/DRILL-3619
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Execution - Relational Operators
Reporter: Deneche A. Hakim
Assignee: Deneche A. Hakim
  Labels: window_function
 Fix For: 1.2.0






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 37515: DRILL-3657: Wrong result with SUM(1) window function when multiple partitions are present

2015-08-17 Thread Sean Hsuan-Yi Chu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37515/
---

(Updated Aug. 17, 2015, 4:54 p.m.)


Review request for drill and Aman Sinha.


Changes
---

Modified unit test


Bugs: DRILL-3657
https://issues.apache.org/jira/browse/DRILL-3657


Repository: drill-git


Description
---

When constants are referred in Window Prel, ensure the indices are shifted 
properly


Diffs (updated)
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
 f6d2b67 
  exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
9e09106 

Diff: https://reviews.apache.org/r/37515/diff/


Testing
---

All required


Thanks,

Sean Hsuan-Yi Chu



Re: Review Request 37515: DRILL-3657: Wrong result with SUM(1) window function when multiple partitions are present

2015-08-17 Thread Sean Hsuan-Yi Chu


 On Aug. 17, 2015, 4:08 p.m., Aman Sinha wrote:
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java,
   line 159
  https://reviews.apache.org/r/37515/diff/1/?file=1041599#file1041599line159
 
  Would this work if there was non-window-agg functions present...such as 
  RANK() (in combination with window aggs) ?

Rank works too. Also, I modified the unit test to have more general coverage 
there.


- Sean Hsuan-Yi


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37515/#review95592
---


On Aug. 17, 2015, 4:11 a.m., Sean Hsuan-Yi Chu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/37515/
 ---
 
 (Updated Aug. 17, 2015, 4:11 a.m.)
 
 
 Review request for drill and Aman Sinha.
 
 
 Bugs: DRILL-3657
 https://issues.apache.org/jira/browse/DRILL-3657
 
 
 Repository: drill-git
 
 
 Description
 ---
 
 When constants are referred in Window Prel, ensure the indices are shifted 
 properly
 
 
 Diffs
 -
 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
  f6d2b67 
   exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
 9e09106 
 
 Diff: https://reviews.apache.org/r/37515/diff/
 
 
 Testing
 ---
 
 All required
 
 
 Thanks,
 
 Sean Hsuan-Yi Chu
 




Re: Review Request 37515: DRILL-3657: Wrong result with SUM(1) window function when multiple partitions are present

2015-08-17 Thread Aman Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37515/#review95592
---



exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
 (line 158)
https://reviews.apache.org/r/37515/#comment150628

Would this work if there was non-window-agg functions present...such as 
RANK() (in combination with window aggs) ?


- Aman Sinha


On Aug. 17, 2015, 4:11 a.m., Sean Hsuan-Yi Chu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/37515/
 ---
 
 (Updated Aug. 17, 2015, 4:11 a.m.)
 
 
 Review request for drill and Aman Sinha.
 
 
 Bugs: DRILL-3657
 https://issues.apache.org/jira/browse/DRILL-3657
 
 
 Repository: drill-git
 
 
 Description
 ---
 
 When constants are referred in Window Prel, ensure the indices are shifted 
 properly
 
 
 Diffs
 -
 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
  f6d2b67 
   exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
 9e09106 
 
 Diff: https://reviews.apache.org/r/37515/diff/
 
 
 Testing
 ---
 
 All required
 
 
 Thanks,
 
 Sean Hsuan-Yi Chu
 




Review Request 37515: DRILL-3657: Wrong result with SUM(1) window function when multiple partitions are present

2015-08-16 Thread Sean Hsuan-Yi Chu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37515/
---

Review request for drill and Aman Sinha.


Bugs: DRILL-3657
https://issues.apache.org/jira/browse/DRILL-3657


Repository: drill-git


Description
---

When constants are referred in Window Prel, ensure the indices are shifted 
properly


Diffs
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
 f6d2b67 
  exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
9e09106 

Diff: https://reviews.apache.org/r/37515/diff/


Testing
---

All required


Thanks,

Sean Hsuan-Yi Chu
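
Per the ticket title and the review discussion, the query shape this fix targets is a 
constant argument such as SUM(1) combined with other window functions over multiple 
partitions. A purely hypothetical illustration (the actual reproducing query is in 
DRILL-3657):

{code:sql}
-- Hypothetical illustration only: a constant SUM(1) argument plus other window
-- functions over different partitions.
SELECT SUM(1) OVER (PARTITION BY position_id),
       SUM(1) OVER (PARTITION BY salary),
       RANK() OVER (PARTITION BY position_id ORDER BY salary)
FROM cp.`employee.json`;
{code}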



[jira] [Created] (DRILL-3651) Window function should not be allowed in order by clause of over clause of window function

2015-08-14 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3651:
---

 Summary: Window function should not be allowed in order by clause 
of over clause of window function
 Key: DRILL-3651
 URL: https://issues.apache.org/jira/browse/DRILL-3651
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Reporter: Victoria Markman
Assignee: Jinfeng Ni
Priority: Minor


This query should not parse; Drill should throw an error according to the SQL standard.
ISO/IEC 9075-2:2011(E), 7.11 <window clause>:
d) If WDX has a window ordering clause, then WDEF shall not specify a <window 
order clause> (hope I'm reading it correctly)

{code}
SELECT rank() OVER (ORDER BY rank() OVER (ORDER BY c1)) from t1;
{code}

Instead, Drill returns a result:
{code}
0: jdbc:drill:schema=dfs> SELECT rank() OVER (ORDER BY rank() OVER (ORDER BY 
c1)) from t1;
+-+
| EXPR$0  |
+-+
| 1   |
| 2   |
| 3   |
| 4   |
| 5   |
| 6   |
| 7   |
| 8   |
| 9   |
| 10  |
+-+
10 rows selected (0.336 seconds)
{code}

Postgres throws an error in this case:
{code}
postgres=# SELECT rank() OVER (ORDER BY rank() OVER (ORDER BY c1)) from t1;
ERROR:  window functions are not allowed in window definitions
LINE 1: SELECT rank() OVER (ORDER BY rank() OVER (ORDER BY c1)) from...
{code}

Courtesy of postgres test suite.
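
A sketch of a standard-compliant way to express the same intent, assuming the same t1 
table: compute the inner ranking in a derived table, then rank over its result in the 
outer query.

{code:sql}
-- Sketch: move the inner RANK() into a derived table so no window function
-- appears inside another window definition.
SELECT RANK() OVER (ORDER BY r)
FROM (SELECT RANK() OVER (ORDER BY c1) AS r FROM t1) sub;
{code}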



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3647) Handle null as input to window function NTILE

2015-08-14 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-3647:
-

 Summary: Handle null as input to window function NTILE 
 Key: DRILL-3647
 URL: https://issues.apache.org/jira/browse/DRILL-3647
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.2.0
 Environment: private-branch 
https://github.com/adeneche/incubator-drill/tree/new-window-funcs
Reporter: Khurram Faraaz
Assignee: Chris Westin


We need to handle null as input to window functions. The NTILE function must return 
null as output when its input is null.

{code}
0: jdbc:drill:schema=dfs.tmp> select col7 , col0 , ntile(null) over(partition 
by col7 order by col0) lead_col0 from FEWRWSPQQ_101;
Error: PARSE ERROR: From line 1, column 22 to line 1, column 37: Argument to 
function 'NTILE' must not be NULL


[Error Id: e5e69582-8502-4a99-8ba1-dffdfb8ac028 on centos-04.qa.lab:31010] 
(state=,code=0)
{code}

{code}
0: jdbc:drill:schema=dfs.tmp> select col7 , col0 , lead(null) over(partition by 
col7 order by col0) lead_col0 from FEWRWSPQQ_101;
Error: PARSE ERROR: From line 1, column 27 to line 1, column 30: Illegal use of 
'NULL'


[Error Id: 6824ca01-e3f1-4338-b4c8-5535e7a42e13 on centos-04.qa.lab:31010] 
(state=,code=0)
{code}
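
For the LEAD case, the parse error about an illegal bare NULL can presumably be avoided 
by giving the null a type, as in the untested sketch below (this does not address the 
NTILE case, where the argument is the number of buckets and a null value has no obvious 
meaning):

{code:sql}
-- Untested sketch: a typed null argument instead of a bare NULL literal.
select col7, col0,
       lead(cast(null as integer)) over (partition by col7 order by col0) lead_col0
from FEWRWSPQQ_101;
{code}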




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [jira] [Created] (DRILL-3619) Add support for NTILE window function

2015-08-10 Thread Ted Dunning
Let me know if you want some suggestions for approximate ntile
implementations.

On Mon, Aug 10, 2015 at 10:42 AM, Deneche A. Hakim (JIRA) j...@apache.org
wrote:

 Deneche A. Hakim created DRILL-3619:
 ---

  Summary: Add support for NTILE window function
  Key: DRILL-3619
  URL: https://issues.apache.org/jira/browse/DRILL-3619
  Project: Apache Drill
   Issue Type: Sub-task
 Reporter: Deneche A. Hakim
 Assignee: Deneche A. Hakim






 --
 This message was sent by Atlassian JIRA
 (v6.3.4#6332)



[jira] [Created] (DRILL-3619) Add support for NTILE window function

2015-08-10 Thread Deneche A. Hakim (JIRA)
Deneche A. Hakim created DRILL-3619:
---

 Summary: Add support for NTILE window function
 Key: DRILL-3619
 URL: https://issues.apache.org/jira/browse/DRILL-3619
 Project: Apache Drill
  Issue Type: Sub-task
Reporter: Deneche A. Hakim
Assignee: Deneche A. Hakim






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [jira] [Created] (DRILL-3619) Add support for NTILE window function

2015-08-10 Thread Abdel Hakim Deneche
I got a basic implementation running, but it's still in testing.
It seems to be giving correct results, but it's tied to Drill's Window
operator because it still needs to compute all window functions if necessary.

All suggestions are welcome; it's always interesting to learn about
alternative ways to implement the same feature.

Thanks

On Mon, Aug 10, 2015 at 11:15 AM, Ted Dunning ted.dunn...@gmail.com wrote:

 Let me know if you want some suggestions for approximate ntile
 implementations.

 On Mon, Aug 10, 2015 at 10:42 AM, Deneche A. Hakim (JIRA) j...@apache.org
 
 wrote:

  Deneche A. Hakim created DRILL-3619:
  ---
 
   Summary: Add support for NTILE window function
   Key: DRILL-3619
   URL: https://issues.apache.org/jira/browse/DRILL-3619
   Project: Apache Drill
Issue Type: Sub-task
  Reporter: Deneche A. Hakim
  Assignee: Deneche A. Hakim
 
 
 
 
 
 
  --
  This message was sent by Atlassian JIRA
  (v6.3.4#6332)
 




-- 

Abdelhakim Deneche

Software Engineer

  http://www.mapr.com/




[jira] [Created] (DRILL-3580) wrong plan for window function queries containing function(col1 + colb)

2015-07-30 Thread Deneche A. Hakim (JIRA)
Deneche A. Hakim created DRILL-3580:
---

 Summary: wrong plan for window function queries containing 
function(col1 + colb)
 Key: DRILL-3580
 URL: https://issues.apache.org/jira/browse/DRILL-3580
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.1.0
Reporter: Deneche A. Hakim
Assignee: Jinfeng Ni
Priority: Critical


The following query has a wrong plan:
{noformat}
explain plan for select position_id, salary, sum(salary) over (partition by 
position_id), sum(position_id + salary) over (partition by position_id) from 
cp.`employee.json` limit 20;
+--+--+
| text | json |
+--+--+
| 00-00Screen
00-01  ProjectAllowDup(position_id=[$0], salary=[$1], EXPR$2=[$2], 
EXPR$3=[$3])
00-02SelectionVectorRemover
00-03  Limit(fetch=[20])
00-04Project(position_id=[$0], salary=[$1], w0$o0=[$2], w0$o00=[$4])
00-05  Window(window#0=[window(partition {0} order by [] range 
between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($3)])])
00-06Project(position_id=[$1], salary=[$2], w0$o0=[$3], 
$3=[+($1, $2)])
00-07  Window(window#0=[window(partition {1} order by [] range 
between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($2)])])
00-08SelectionVectorRemover
00-09  Sort(sort0=[$1], dir0=[ASC])
00-10Project(T13¦¦*=[$0], position_id=[$1], salary=[$2])
00-11  Scan(groupscan=[EasyGroupScan 
[selectionRoot=classpath:/employee.json, numFiles=1, columns=[`*`], 
files=[classpath:/employee.json]]])
{noformat}

The plan contains 2 window operators which shouldn't be possible according to 
DRILL-3196. 

The results are also incorrect.

Depending on which aggregation or window function is used, we get wrong results or 
an IndexOutOfBounds exception.
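
A possible workaround sketch (untested, an assumption only): materialize the column 
expression in a derived table so that both window SUMs aggregate plain columns over 
the same partition, which should leave no reason for a second window operator.

{code:sql}
-- Untested sketch: compute position_id + salary before the window aggregation.
select position_id, salary,
       sum(salary) over (partition by position_id),
       sum(pos_plus_sal) over (partition by position_id)
from (select position_id, salary, position_id + salary as pos_plus_sal
      from cp.`employee.json`) t
limit 20;
{code}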



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3582) Assert with the window function and group by when aggregate is used in order by clause

2015-07-30 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3582:
---

 Summary: Assert with the window function and group by when 
aggregate is used in order by clause
 Key: DRILL-3582
 URL: https://issues.apache.org/jira/browse/DRILL-3582
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.1.0
Reporter: Victoria Markman
Assignee: Jinfeng Ni


{code}
0: jdbc:drill:drillbit=localhost> select sum(a1), rank() over (order by 
sum(a1)) from t1 group by c1;
Error: SYSTEM ERROR: AssertionError: Internal error: invariant violated: 
conversion result not null
[Error Id: 5ebc8f0e-3ca5-4916-aa3b-c272bdd9f585 on 172.16.1.129:31010] 
(state=,code=0)
{code}
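
A possible workaround sketch (assuming t1 has columns a1 and c1): compute the GROUP BY 
aggregate in a derived table first, then apply the window function to the aggregated 
column in the outer query.

{code:sql}
-- Sketch: aggregate first, then rank over the aggregated value.
select s, rank() over (order by s)
from (select sum(a1) as s from t1 group by c1) agg;
{code}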

drillbit.log
{code}
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: AssertionError: 
Internal error: invariant violated: conversion result not null


[Error Id: d276c7cc-b5fb-47ab-8058-098b5f227cf3 on 172.16.1.129:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
 ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:737)
 [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:839)
 [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:781)
 [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) 
[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:783)
 [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:892) 
[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:253) 
[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_71]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_71]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
exception during fragment initialization: Internal error: invariant violated: 
conversion result not null
... 4 common frames omitted
Caused by: java.lang.AssertionError: Internal error: invariant violated: 
conversion result not null
at org.apache.calcite.util.Util.newInternal(Util.java:775) 
~[calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
at org.apache.calcite.util.Util.permAssert(Util.java:883) 
~[calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:3964)
 ~[calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertSortExpression(SqlToRelConverter.java:3981)
 ~[calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertOver(SqlToRelConverter.java:1756)
 ~[calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.access$2(SqlToRelConverter.java:1726)
 ~[calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:3956)
 ~[calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.createAggImpl(SqlToRelConverter.java:2540)
 ~[calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertAgg(SqlToRelConverter.java:2360)
 ~[calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:604)
 ~[calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:564)
 ~[calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:2759)
 ~[calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:522)
 ~[calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
at org.apache.calcite.prepare.PlannerImpl.convert(PlannerImpl.java:198) 
~[calcite-core-1.1.0-drill-r14.jar:1.1.0-drill-r14]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRel

[jira] [Created] (DRILL-3574) Wrong result with SUM window function in the query with multiple window definitions

2015-07-28 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3574:
---

 Summary: Wrong result with SUM window function in the query with 
multiple window definitions
 Key: DRILL-3574
 URL: https://issues.apache.org/jira/browse/DRILL-3574
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
 Environment: private-branch-with-multiple-partitions-enabled
Reporter: Victoria Markman
Assignee: Jinfeng Ni
Priority: Critical


Incorrect result:

{code}
0: jdbc:drill:drillbit=localhost> select * from t1;
+---++-+
|  a1   |   b1   | c1  |
+---++-+
| 1 | a  | 2015-01-01  |
| 2 | b  | 2015-01-02  |
| 3 | c  | 2015-01-03  |
| 4 | null   | 2015-01-04  |
| 5 | e  | 2015-01-05  |
| 6 | f  | 2015-01-06  |
| 7 | g  | 2015-01-07  |
| null  | h  | 2015-01-08  |
| 9 | i  | null|
| 10| j  | 2015-01-10  |
+---++-+
10 rows selected (0.087 seconds)

0: jdbc:drill:drillbit=localhost> select
. . . . . . . . . . . . . . . .  a1,
. . . . . . . . . . . . . . . .  sum(a1) over(partition by b1, c1),
. . . . . . . . . . . . . . . .  sum(a1) over()
. . . . . . . . . . . . . . . .  from
. . . . . . . . . . . . . . . .  t1
. . . . . . . . . . . . . . . .  order by
. . . . . . . . . . . . . . . .  a1;
+---+-+-+
|  a1   | EXPR$1  | EXPR$2  |
+---+-+-+
| 1 | 1   | 6   |
| 2 | 2   | 6   |
| 3 | 3   | 6   |
| 4 | 4   | 19  |
| 5 | 5   | 22  |
| 6 | 6   | 19  |
| 7 | 7   | 22  |
| 9 | 9   | 19  |
| 10| 10  | 22  |
| null  | null| 6   |
+---+-+-+
10 rows selected (0.165 seconds)

0: jdbc:drill:drillbit=localhost> explain plan for select
. . . . . . . . . . . . . . . .  a1,
. . . . . . . . . . . . . . . .  sum(a1) over(partition by b1, c1),
. . . . . . . . . . . . . . . .  sum(a1) over()
. . . . . . . . . . . . . . . .  from
. . . . . . . . . . . . . . . .  t1
. . . . . . . . . . . . . . . .  order by
. . . . . . . . . . . . . . . .  a1;
+--+--+
| text | json |
+--+--+
| 00-00Screen
00-01  ProjectAllowDup(a1=[$0], EXPR$1=[$1], EXPR$2=[$2])
00-02SingleMergeExchange(sort0=[0 ASC])
01-01  SelectionVectorRemover
01-02Sort(sort0=[$0], dir0=[ASC])
01-03  Project(a1=[$0], w0$o0=[$1], w1$o0=[$2])
01-04HashToRandomExchange(dist0=[[$0]])
02-01  UnorderedMuxExchange
03-01Project(a1=[$0], w0$o0=[$1], w1$o0=[$2], 
E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))])
03-02  Project(a1=[$1], w0$o0=[$4], w1$o0=[$5])
03-03Window(window#0=[window(partition {} order by [] 
range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1)])])
03-04  Window(window#0=[window(partition {2, 3} order 
by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs 
[SUM($1)])])
03-05SelectionVectorRemover
03-06  Sort(sort0=[$2], sort1=[$3], dir0=[ASC], 
dir1=[ASC])
03-07Project(T154¦¦*=[$0], a1=[$1], b1=[$2], 
c1=[$3])
03-08  HashToRandomExchange(dist0=[[$2]], 
dist1=[[$3]])
04-01UnorderedMuxExchange
05-01  Project(T154¦¦*=[$0], a1=[$1], 
b1=[$2], c1=[$3], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($3, 
hash64AsDouble($2)))])
05-02Project(T154¦¦*=[$0], a1=[$1], 
b1=[$2], c1=[$3])
05-03  Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath 
[path=file:/Users/vmarkman/drill/testdata/subqueries/t1]], 
selectionRoot=file:/Users/vmarkman/drill/testdata/subqueries/t1, numFiles=1, 
columns=[`*`]]])
{code}

Correct result:
{code}
0: jdbc:drill:drillbit=localhost> select
. . . . . . . . . . . . . . . .  a1,
. . . . . . . . . . . . . . . .  sum(a1) over(partition by b1, c1),
. . . . . . . . . . . . . . . .  sum(a1) over()
. . . . . . . . . . . . . . . .  from
. . . . . . . . . . . . . . . .  t1
. . . . . . . . . . . . . . . .  order by
. . . . . . . . . . . . . . . .  a1;
+---+-+-+
|  a1   | EXPR$1  | EXPR$2  |
+---+-+-+
| 1 | 1   | 47  |
| 2 | 2   | 47  |
| 3 | 3   | 47  |
| 4 | 4   | 47  |
| 5 | 5   | 47  |
| 6 | 6   | 47  |
| 7 | 7   | 47  |
| 9 | 9   | 47  |
| 10| 10

Re: Review Request 36278: DRILL-3189: Disable DISALLOW PARTIAL in window function grammar

2015-07-10 Thread Sean Hsuan-Yi Chu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36278/
---

(Updated July 10, 2015, 11:33 p.m.)


Review request for drill, Aman Sinha and Jinfeng Ni.


Changes
---

new patch


Bugs: DRILL-3189
https://issues.apache.org/jira/browse/DRILL-3189


Repository: drill-git


Description
---

Disable disallow partial in Over-Clause


Diffs (updated)
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/parser/UnsupportedOperatorsVisitor.java
 9bbd537 
  exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
7071bea 

Diff: https://reviews.apache.org/r/36278/diff/


Testing
---

All requested


Thanks,

Sean Hsuan-Yi Chu



Re: Review Request 36278: DRILL-3189: Disable DISALLOW PARTIAL in window function grammar

2015-07-10 Thread Aman Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36278/#review91375
---

Ship it!


Ship It!

- Aman Sinha


On July 10, 2015, 11:33 p.m., Sean Hsuan-Yi Chu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/36278/
 ---
 
 (Updated July 10, 2015, 11:33 p.m.)
 
 
 Review request for drill, Aman Sinha and Jinfeng Ni.
 
 
 Bugs: DRILL-3189
 https://issues.apache.org/jira/browse/DRILL-3189
 
 
 Repository: drill-git
 
 
 Description
 ---
 
 Disable disallow partial in Over-Clause
 
 
 Diffs
 -
 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/parser/UnsupportedOperatorsVisitor.java
  9bbd537 
   exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
 7071bea 
 
 Diff: https://reviews.apache.org/r/36278/diff/
 
 
 Testing
 ---
 
 All requested
 
 
 Thanks,
 
 Sean Hsuan-Yi Chu
 




Re: Review Request 36278: DRILL-3189: Disable DISALLOW PARTIAL in window function grammar

2015-07-09 Thread Sean Hsuan-Yi Chu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36278/
---

(Updated July 9, 2015, 5:33 p.m.)


Review request for drill, Aman Sinha and Jinfeng Ni.


Changes
---

Addressed the comment


Bugs: DRILL-3189
https://issues.apache.org/jira/browse/DRILL-3189


Repository: drill-git


Description
---

Disable disallow partial in Over-Clause


Diffs (updated)
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/parser/UnsupportedOperatorsVisitor.java
 9bbd537 
  exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
7071bea 

Diff: https://reviews.apache.org/r/36278/diff/


Testing
---

All requested


Thanks,

Sean Hsuan-Yi Chu



[jira] [Created] (DRILL-3481) Query with Window Function fails with SYSTEM ERROR: RpcException: Data not accepted downstream.

2015-07-09 Thread Abhishek Girish (JIRA)
Abhishek Girish created DRILL-3481:
--

 Summary: Query with Window Function fails with SYSTEM ERROR: 
RpcException: Data not accepted downstream.
 Key: DRILL-3481
 URL: https://issues.apache.org/jira/browse/DRILL-3481
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.2.0
Reporter: Abhishek Girish
Assignee: Deneche A. Hakim


I'm seeing an error when executing a simple WF query on the latest master. 

Dataset: TPC-DS SF1000 - Parquet

Git.Commit.ID: b6577fe (Jul 8 15)

Options:
{code}
alter session set `planner.memory.max_query_memory_per_node` = 21474836480;
{code}

Query:
{code:sql}
SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk ORDER BY 
ss.ss_store_sk)  FROM store_sales ss WHERE ss.ss_store_sk is not NULL LIMIT 20;
{code}

Error:
{code}
java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: RpcException: 
Data not accepted downstream.

Fragment 2:1

[Error Id: 49edceeb-8f10-474d-9cf2-adb4baa13bf4 on ucs-node7.perf.lab:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)
{code}

Log:
{code}
2015-07-08 12:17:44,764 ucs-node7.perf.lab [BitClient-2] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 2a628c9e-3ff4-cc1b-77d2-bab6d00a2d1d:2:1: 
State change requested RUNNING --> FAILED

2015-07-08 12:17:44,765 ucs-node7.perf.lab [BitClient-2] ERROR 
o.a.drill.exec.ops.StatusHandler - Data not accepted downstream. Stopping 
future sends.

2015-07-08 12:17:44,765 ucs-node7.perf.lab [BitClient-2] ERROR 
o.a.drill.exec.ops.StatusHandler - Data not accepted downstream. Stopping 
future sends.

2015-07-08 12:17:44,765 ucs-node7.perf.lab [BitClient-2] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 2a628c9e-3ff4-cc1b-77d2-bab6d00a2d1d:2:1: 
State change requested FAILED --> FAILED

2015-07-08 12:17:44,768 ucs-node7.perf.lab [BitClient-2] ERROR 
o.a.drill.exec.ops.StatusHandler - Data not accepted downstream. Stopping 
future sends.

2015-07-08 12:17:44,768 ucs-node7.perf.lab [BitClient-2] ERROR 
o.a.drill.exec.ops.StatusHandler - Data not accepted downstream. Stopping 
future sends.

2015-07-08 12:17:44,769 ucs-node7.perf.lab [BitClient-2] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 2a628c9e-3ff4-cc1b-77d2-bab6d00a2d1d:2:1: 
State change requested FAILED --> FAILED

2015-07-08 12:17:44,771 ucs-node7.perf.lab [BitClient-2] ERROR 
o.a.drill.exec.ops.StatusHandler - Data not accepted downstream. Stopping 
future sends.

2015-07-08 12:17:44,771 ucs-node7.perf.lab [BitClient-2] ERROR 
o.a.drill.exec.ops.StatusHandler - Data not accepted downstream. Stopping 
future sends.

2015-07-08 12:17:44,772 ucs-node7.perf.lab [BitClient-2] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 2a628c9e-3ff4-cc1b-77d2-bab6d00a2d1d:2:1: 
State change requested FAILED --> FAILED

2015-07-08 12:17:44,790 ucs-node7.perf.lab 
[2a628c9e-3ff4-cc1b-77d2-bab6d00a2d1d:frag:2:1] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 2a628c9e-3ff4-cc1b-77d2-bab6d00a2d1d:2:1: 
State change requested FAILED --> FINISHED

2015-07-08 12:17:44,814 ucs-node7.perf.lab 
[2a628c9e-3ff4-cc1b-77d2-bab6d00a2d1d:frag:2:1] ERROR 
o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: RpcException: Data not 
accepted downstream.



Fragment 2:1



[Error Id: 49edceeb-8f10-474d-9cf2-adb4baa13bf4 on ucs-node7.perf.lab:31010]

org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: RpcException: 
Data not accepted downstream.



Fragment 2:1



[Error Id: 49edceeb-8f10-474d-9cf2-adb4baa13bf4 on ucs-node7.perf.lab:31010]

at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
 ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]

at 
org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323)
 [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]

at 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:178)
 [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]

at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:292)
 [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]

at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) 
[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]

at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_65]

at 
java.util.concurrent.ThreadPoolExecutor

Re: Review Request 36070: DRILL-3243: Need a better error message - Use of alias in window function definition

2015-07-01 Thread Hanifi Gunes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36070/#review90194
---

Ship it!


- Hanifi Gunes


On June 30, 2015, 11:20 p.m., abdelhakim deneche wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/36070/
 ---
 
 (Updated June 30, 2015, 11:20 p.m.)
 
 
 Review request for drill and Hanifi Gunes.
 
 
 Bugs: DRILL-3243
 https://issues.apache.org/jira/browse/DRILL-3243
 
 
 Repository: drill-git
 
 
 Description
 ---
 
 improved error message and added a unit test
 
 
 Diffs
 -
 
   common/src/main/java/org/apache/drill/common/exceptions/UserException.java 
 13c17bd 
   
 common/src/main/java/org/apache/drill/common/exceptions/UserRemoteException.java
  1b3fa42 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/CompliantTextRecordReader.java
  254e0d8 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/RepeatedVarCharOutput.java
  40276f4 
   
 exec/java-exec/src/test/java/org/apache/drill/exec/store/text/TestNewTextReader.java
  76674f9 
 
 Diff: https://reviews.apache.org/r/36070/diff/
 
 
 Testing
 ---
 
 all unit tests are passing along with functional and tpch100
 
 
 Thanks,
 
 abdelhakim deneche
 




Review Request 36070: DRILL-3243: Need a better error message - Use of alias in window function definition

2015-06-30 Thread abdelhakim deneche

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36070/
---

Review request for drill and Hanifi Gunes.


Bugs: DRILL-3243
https://issues.apache.org/jira/browse/DRILL-3243


Repository: drill-git


Description
---

improved error message and added a unit test


Diffs
-

  common/src/main/java/org/apache/drill/common/exceptions/UserException.java 
13c17bd 
  
common/src/main/java/org/apache/drill/common/exceptions/UserRemoteException.java
 1b3fa42 
  
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/CompliantTextRecordReader.java
 254e0d8 
  
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/RepeatedVarCharOutput.java
 40276f4 
  
exec/java-exec/src/test/java/org/apache/drill/exec/store/text/TestNewTextReader.java
 76674f9 

Diff: https://reviews.apache.org/r/36070/diff/


Testing
---

all unit tests are passing along with functional and tpch100


Thanks,

abdelhakim deneche
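
For readers without the ticket in front of them, the pattern this error message 
improvement targets is presumably a query that refers to a select-list alias inside 
the window definition, something like the hypothetical example below (table and column 
names are made up for illustration):

{code:sql}
-- Hypothetical example: the OVER clause refers to the select-list alias b
-- rather than an actual column of t.
SELECT a + 1 AS b,
       SUM(c) OVER (PARTITION BY b)
FROM t;
{code}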



[jira] [Created] (DRILL-3414) Window function on a null producing column of an outer join results in the wrong result

2015-06-29 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3414:
---

 Summary: Window function on a null producing column of an outer 
join results in the wrong result
 Key: DRILL-3414
 URL: https://issues.apache.org/jira/browse/DRILL-3414
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.0.0
Reporter: Victoria Markman
Assignee: Jinfeng Ni
Priority: Critical


{code:sql}
select
j4.c_boolean,
j4.c_date,
j4.c_integer,
sum(j4.c_integer) over (partition by j4.c_boolean order by j4.c_date, 
j4.c_integer)
fromj1
   left outer join
   j4 on j1.c_integer = j4.c_integer
order by 1,2,3;
{code}


If the window function is on the left side, the query returns the correct result.
This works:
{code:sql}
select
j1.c_boolean,
j1.c_date,
sum(j1.c_integer) over (partition by j1.c_boolean order by j1.c_date)
from
j1
left outer join
j4 on j1.c_integer = j4.c_integer
order by
1, 2;
{code}

Attached:

1. query.tar (q2.sql , q2.res (postgres output), q2.out (drill output) )
2. tables : j1.tar, j4.parquet



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3414) Window function on a null producing column of an outer join results in the wrong result

2015-06-29 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman resolved DRILL-3414.
-
Resolution: Invalid

 Window function on a null producing column of an outer join results in the 
 wrong result
 ---

 Key: DRILL-3414
 URL: https://issues.apache.org/jira/browse/DRILL-3414
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.0.0
Reporter: Victoria Markman
Assignee: Jinfeng Ni
Priority: Critical
  Labels: window_funcion
 Attachments: j1.tar, j4.parquet, query.tar


 {code:sql}
 select
 j4.c_boolean,
 j4.c_date,
 j4.c_integer,
 sum(j4.c_integer) over (partition by j4.c_boolean order by j4.c_date, 
 j4.c_integer)
 fromj1
left outer join
j4 on j1.c_integer = j4.c_integer
 order by 1,2,3;
 {code}
 If window function is on left side, query returns correct result.
 This works:
 {code:sql}
 select
 j1.c_boolean,
 j1.c_date,
 sum(j1.c_integer) over (partition by j1.c_boolean order by j1.c_date)
 from
 j1
 left outer join
 j4 on j1.c_integer = j4.c_integer
 order by
 1, 2;
 {code}
 Attached:
 1. query.tar (q2.sql , q2.res (postgres output), q2.out (drill output) )
 2. tables : j1.tar, j4.parquet



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3211) Assert in a query with window function and group by clause

2015-06-27 Thread Sean Hsuan-Yi Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Hsuan-Yi Chu resolved DRILL-3211.
--
Resolution: Duplicate

 Assert in a query with window function and group by clause 
 ---

 Key: DRILL-3211
 URL: https://issues.apache.org/jira/browse/DRILL-3211
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.0.0
Reporter: Victoria Markman
Assignee: Sean Hsuan-Yi Chu
  Labels: window_function
 Fix For: 1.1.0


 {code}
 0: jdbc:drill:schema=dfs> select sum(a1) over (partition by b1)  from t1 
 group by b1;
 Error: SYSTEM ERROR: java.lang.AssertionError: Internal error: while 
 converting SUM(`t1`.`a1`)
 [Error Id: 21872cfa-6f09-4e92-aee6-5dd8698cf9e7 on atsqa4-133.qa.lab:31010] 
 (state=,code=0)
 {code}
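
 Per standard SQL, when GROUP BY is present a window function may only reference grouping columns or wrap an aggregate, which is why the query above is rejected. A hypothetical valid formulation (same table and columns; not verified against this build) nests the aggregate inside the window function:
 {code:sql}
 -- Sketch only: aggregate a1 per b1 group first, then window over the grouped rows.
 select sum(sum(a1)) over (partition by b1)
 from t1
 group by b1;
 {code}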
 drillbit.log
 {code}
 Caused by: java.lang.AssertionError: Internal error: while converting 
 SUM(`t1`.`a1`)
 at org.apache.calcite.util.Util.newInternal(Util.java:790) 
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
 at 
 org.apache.calcite.sql2rel.ReflectiveConvertletTable$2.convertCall(ReflectiveConvertletTable.java:152)
  ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
 at 
 org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:60)
  ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
 at 
 org.apache.calcite.sql2rel.SqlToRelConverter.convertOver(SqlToRelConverter.java:1762)
  ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
 at 
 org.apache.calcite.sql2rel.SqlToRelConverter.access$1000(SqlToRelConverter.java:180)
  ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
 at 
 org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:3937)
  ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
 at 
 org.apache.calcite.sql2rel.SqlToRelConverter.createAggImpl(SqlToRelConverter.java:2521)
  ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
 at 
 org.apache.calcite.sql2rel.SqlToRelConverter.convertAgg(SqlToRelConverter.java:2342)
  ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
 at 
 org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:604)
  ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
 at 
 org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:564)
  ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
 at 
 org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:2741)
  ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
 at 
 org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:522)
  ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
 at 
 org.apache.calcite.prepare.PlannerImpl.convert(PlannerImpl.java:198) 
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
 at 
 org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRel(DefaultSqlHandler.java:246)
  ~[drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
 at 
 org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:182)
  ~[drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
 at 
 org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:177)
  ~[drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
 at 
 org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:902) 
 [drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
 at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:240) 
 [drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
 ... 3 common frames omitted
 Caused by: java.lang.reflect.InvocationTargetException: null
 at sun.reflect.GeneratedMethodAccessor120.invoke(Unknown Source) 
 ~[na:na]
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  ~[na:1.7.0_71]
 at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_71]
 at 
 org.apache.calcite.sql2rel.ReflectiveConvertletTable$2.convertCall(ReflectiveConvertletTable.java:142)
  ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
 ... 19 common frames omitted
 Caused by: java.lang.AssertionError: null
 at 
 org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.getRootField(SqlToRelConverter.java:3810)
  ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
 at 
 org.apache.calcite.sql2rel.SqlToRelConverter.adjustInputRef(SqlToRelConverter.java:3139)
  ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
 at 
 org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:3114)
  ~[calcite

[jira] [Resolved] (DRILL-3183) When Group By clause is present, the argument in window function should not refer to any column outside Group By

2015-06-26 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni resolved DRILL-3183.
---
Resolution: Fixed

fixed in commit:  da17f28678f1dee5c43bdb69582586a79f8b667c

 When Group By clause is present, the argument in window function should not 
 refer to any column outside Group By
 

 Key: DRILL-3183
 URL: https://issues.apache.org/jira/browse/DRILL-3183
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.0.0
 Environment: faec150598840c40827e6493992d81209aa936da
Reporter: Khurram Faraaz
Assignee: Sean Hsuan-Yi Chu
  Labels: window_function
 Fix For: 1.1.0


 Test was run on a 4-node cluster on CentOS. We see an NPE for a query that uses 
 window functions. Data was from a CSV file (headers were ignored).
 To enable window functions: 
 alter session set `window.enable`=true;
 These two queries work 
 {code}
 SELECT count(salary) OVER w, count(salary) OVER w FROM cp.`employee.json` t 
 WINDOW w AS (PARTITION BY store_id ORDER BY position_id DESC);
 SELECT count(columns[0]) OVER(PARTITION BY columns[1] ORDER BY columns[0] 
 DESC), count(columns[0]) OVER(PARTITION BY columns[1] ORDER BY columns[0] 
 DESC) FROM `airports.csv`;
 {code}
 These two queries do not work, and in the second query we see an NPE
 {code}
 0: jdbc:drill:schema=dfs.tmp> SELECT count(*) OVER w, count(*) OVER w FROM 
 `airports.csv` WINDOW w AS (PARTITION BY columns[1] ORDER BY columns[0] DESC);
 Error: PARSE ERROR: From line 1, column 87 to line 1, column 93: Table 
 'columns' not found
 [Error Id: 51d080bc-580f-44cc-a9be-d29ae60900c3 on centos-03.qa.lab:31010] 
 (state=,code=0)
 {code}
 Query that returns NPE.
 {code}
 0: jdbc:drill:schema=dfs.tmp> SELECT count(*) OVER w, count(*) OVER w FROM 
 `airports.csv` t WINDOW w AS (PARTITION BY t.columns[1] ORDER BY t.columns[0] 
 DESC);
 Error: PARSE ERROR: java.lang.NullPointerException
 [Error Id: 27e933bf-1382-4aae-bfef-36444a69acc9 on centos-03.qa.lab:31010] 
 (state=,code=0)
 {code}
 Stack trace from drillbit.log
 {code}
 2015-05-26 19:07:33,104 [2a9b3b8a-4d0b-ba7b-f0ff-f8038f9f9dbd:foreman] INFO  
 o.a.drill.exec.work.foreman.Foreman - State change requested.  PENDING -- 
 FAILED
 org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
 during fragment initialization: PARSE ERROR: java.lang.NullPointerException
 [Error Id: 16e17855-32f7-4687-9502-5b4880bb11a4 ]
 at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:251) 
 [drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  [na:1.7.0_45]
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  [na:1.7.0_45]
 at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
 Caused by: org.apache.drill.common.exceptions.UserException: PARSE ERROR: 
 java.lang.NullPointerException
 [Error Id: 16e17855-32f7-4687-9502-5b4880bb11a4 ]
 at 
 org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:522)
  ~[drill-common-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
 at 
 org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:180)
  ~[drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
 at 
 org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:902) 
 [drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
 at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:240) 
 [drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
 ... 3 common frames omitted
 Caused by: org.apache.calcite.tools.ValidationException: 
 java.lang.NullPointerException
 at 
 org.apache.calcite.prepare.PlannerImpl.validate(PlannerImpl.java:176) 
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
 at 
 org.apache.calcite.prepare.PlannerImpl.validateAndGetType(PlannerImpl.java:185)
  ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
 at 
 org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode(DefaultSqlHandler.java:226)
  ~[drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
 at 
 org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:178)
  ~[drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
 at 
 org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:177)
  ~[drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
 ... 5 common frames omitted
 Caused by: java.lang.NullPointerException: null
 at 
 org.apache.calcite.rel.type.RelDataTypeImpl.getField(RelDataTypeImpl.java:82

[jira] [Created] (DRILL-3404) Filter on window function does not appear in query plan

2015-06-26 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-3404:
-

 Summary: Filter on window function does not appear in query plan
 Key: DRILL-3404
 URL: https://issues.apache.org/jira/browse/DRILL-3404
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.1.0
 Environment: 4 node cluster on CentOS
Reporter: Khurram Faraaz
Assignee: Jinfeng Ni
Priority: Critical


Filter is missing in the query plan for the below query in Drill, and hence 
wrong results are returned.

Results from Drill
{code}
0: jdbc:drill:schema=dfs.tmp> select c1, c2, w_sum from ( select c1, c2, sum ( 
c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from 
`tblWnulls` ) sub_query where w_sum is not null;
+-+---+-+
| c1  |  c2   |w_sum|
+-+---+-+
| 0   | a | 0   |
| 1   | a | 1   |
| 5   | a | 6   |
| 10  | a | 16  |
| 11  | a | 27  |
| 14  | a | 41  |
| 1   | a | 11152   |
| 2   | b | 2   |
| 9   | b | 11  |
| 13  | b | 24  |
| 17  | b | 41  |
| null| c | null|
| 4   | c | 4   |
| 6   | c | 10  |
| 8   | c | 18  |
| 12  | c | 30  |
| 13  | c | 56  |
| 13  | c | 56  |
| null| d | null|
| null| d | null|
| 10  | d | 10  |
| 11  | d | 21  |
| 2147483647  | d | 4294967315  |
| 2147483647  | d | 4294967315  |
| -1  | e | -1  |
| 15  | e | 14  |
| null| null  | null|
| 19  | null  | 19  |
| 65536   | null  | 6   |
| 100 | null  | 106 |
+-+---+-+
30 rows selected (0.337 seconds)
{code}

Explain plan for the above query from Drill
{code}
0: jdbc:drill:schema=dfs.tmp> explain plan for select c1, c2, w_sum from ( 
select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first ) 
w_sum from `tblWnulls` ) sub_query where w_sum is not null;
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(c1=[$0], c2=[$1], w_sum=[$2])
00-02        Project(c1=[$0], c2=[$1], w_sum=[CASE(>($2, 0), $3, null)])
00-03          Window(window#0=[window(partition {1} order by [0 ASC-nulls-first] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT($0), $SUM0($0)])])
00-04            SelectionVectorRemover
00-05              Sort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC-nulls-first])
00-06                Project(c1=[$1], c2=[$0])
00-07                  Scan
{code}

Re: Review Request 35960: DRILL-3307: Query with window function runs out of memory

2015-06-26 Thread Steven Phillips

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35960/#review89600
---

Ship it!


Ship It!

- Steven Phillips


On June 27, 2015, 1:12 a.m., abdelhakim deneche wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/35960/
 ---
 
 (Updated June 27, 2015, 1:12 a.m.)
 
 
 Review request for drill and Steven Phillips.
 
 
 Bugs: DRILL-3307
 https://issues.apache.org/jira/browse/DRILL-3307
 
 
 Repository: drill-git
 
 
 Description
 ---
 
 Fixed sort to only use copier allocator when spilling to disk
 
 
 Diffs
 -
 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/ExternalSortBatch.java
  02a1c08 
 
 Diff: https://reviews.apache.org/r/35960/diff/
 
 
 Testing
 ---
 
 ongoing...
 
 
 Thanks,
 
 abdelhakim deneche
 




Re: Review Request 35609: DRILL-3243: Need a better error message - Use of alias in window function definition

2015-06-25 Thread abdelhakim deneche

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35609/
---

(Updated June 25, 2015, 3:52 p.m.)


Review request for drill and Hanifi Gunes.


Changes
---

rebased on top of master


Bugs: DRILL-3243
https://issues.apache.org/jira/browse/DRILL-3243


Repository: drill-git


Description
---

changed RepeatedVarCharOutput to display the column name as part of the 
exception's message
changed CompliantTextRecordReader to throw a DATA_READ user exception
added new unit test to TestNewTextReader


Diffs (updated)
-

  common/src/main/java/org/apache/drill/common/exceptions/UserException.java 
13c17bd 
  
common/src/main/java/org/apache/drill/common/exceptions/UserRemoteException.java
 1b3fa42 
  
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/CompliantTextRecordReader.java
 254e0d8 
  
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/RepeatedVarCharOutput.java
 40276f4 
  
exec/java-exec/src/test/java/org/apache/drill/exec/store/text/TestNewTextReader.java
 76674f9 

Diff: https://reviews.apache.org/r/35609/diff/


Testing
---

all unit tests are passing along with functional and tpch100


Thanks,

abdelhakim deneche



[jira] [Created] (DRILL-3352) Extra re-distribution when evaluating window function after GROUP BY

2015-06-24 Thread Aman Sinha (JIRA)
Aman Sinha created DRILL-3352:
-

 Summary: Extra re-distribution when evaluating window function 
after GROUP BY
 Key: DRILL-3352
 URL: https://issues.apache.org/jira/browse/DRILL-3352
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.0.0
Reporter: Aman Sinha
Assignee: Aman Sinha


Consider the following query and plan: 
{code}
explain plan for select min(l_partkey) over (partition by l_suppkey) from 
lineitem group by l_partkey, l_suppkey limit 1;

00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        SelectionVectorRemover
00-03          Limit(fetch=[1])
00-04            UnionExchange
01-01              Project($0=[$3])
01-02                Window(window#0=[window(partition {1} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [MIN($0)])])
01-03                  SelectionVectorRemover
01-04                    Sort(sort0=[$1], dir0=[ASC])
01-05                      Project(l_partkey=[$0], l_suppkey=[$1], $f2=[$2])
01-06                        HashToRandomExchange(dist0=[[$1]])
02-01                          UnorderedMuxExchange
03-01                            Project(l_partkey=[$0], l_suppkey=[$1], $f2=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($1))])
03-02                              HashAgg(group=[{0, 1}], agg#0=[MIN($2)])
03-03                                Project(l_partkey=[$0], l_suppkey=[$1], $f2=[$2])
03-04                                  HashToRandomExchange(dist0=[[$0]], dist1=[[$1]])
04-01                                    UnorderedMuxExchange
05-01                                      Project(l_partkey=[$0], l_suppkey=[$1], $f2=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($1, hash64AsDouble($0)))])
05-02                                        HashAgg(group=[{0, 1}], agg#0=[MIN($0)])
05-03                                          Project(l_partkey=[$1], l_suppkey=[$0])
05-04                                            Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=file:/Users/asinha/data/tpch-sf1/lineitem]], selectionRoot=/Users/asinha/data/tpch-sf1/lineitem, numFiles=1, columns=[`l_partkey`, `l_suppkey`]]])
{code}

Here, we do a distribution for the HashAgg on 2 columns: {l_partkey, 
l_suppkey}.  Subsequently, we re-distribute on {l_suppkey} only since the 
window function has a partition-by l_suppkey.  The second re-distribute could 
be avoided if the first distribution for the HashAgg was done on l_suppkey 
only.   The reason we distribute on all grouping columns is to avoid skew 
problems.   However, in many cases, especially when a window function is 
involved, it may make sense to distribute on only one column. 




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3358) CUME_DIST window function provides wrong result when only ORDER BY clause is specified

2015-06-24 Thread Abhishek Girish (JIRA)
Abhishek Girish created DRILL-3358:
--

 Summary: CUME_DIST window function provides wrong result when only 
ORDER BY clause is specified
 Key: DRILL-3358
 URL: https://issues.apache.org/jira/browse/DRILL-3358
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.1.0
Reporter: Abhishek Girish
Assignee: Deneche A. Hakim


*Drill:*
{code:sql}
 SELECT CUME_DIST() OVER (ORDER BY ss.ss_store_sk) FROM store_sales ss ORDER 
 BY 1 LIMIT 20;
+-+
|   EXPR$0|
+-+
| 0.9923989432198661  |
| 0.9923989432198661  |
| 0.9923989432198661  |
| 0.9923989432198661  |
| 0.9923989432198661  |
| 0.9923989432198661  |
| 0.9923989432198661  |
| 0.9923989432198661  |
| 0.9923989432198661  |
| 0.9923989432198661  |
| 0.9923989432198661  |
| 0.9923989432198661  |
| 0.9923989432198661  |
| 0.9923989432198661  |
| 0.9923989432198661  |
| 0.9923989432198661  |
| 0.9923989432198661  |
| 0.9923989432198661  |
| 0.9923989432198661  |
| 0.9923989432198661  |
+-+
20 rows selected (17.317 seconds)
{code}

*Postgres*
{code:sql}
# SELECT CUME_DIST() OVER (ORDER BY ss.ss_store_sk) FROM store_sales ss ORDER 
BY 1 LIMIT 20;
 cume_dist
-------------------
 0.158622193275665
 0.158622193275665
 0.158622193275665
 0.158622193275665
 0.158622193275665
 0.158622193275665
 0.158622193275665
 0.158622193275665
 0.158622193275665
 0.158622193275665
 0.158622193275665
 0.158622193275665
 0.158622193275665
 0.158622193275665
 0.158622193275665
 0.158622193275665
 0.158622193275665
 0.158622193275665
 0.158622193275665
 0.158622193275665
(20 rows)
{code}
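
For reference, CUME_DIST() is defined as the number of rows ordered at or before the current row (peers included) divided by the total number of rows in the partition, which is what the Postgres output reflects. A hypothetical cross-check expressed with COUNT window functions (same table and column as above; support on this Drill build not verified):
{code:sql}
-- Sketch only: the numerator uses the default frame (RANGE UNBOUNDED PRECEDING),
-- which includes all peers of the current sort key; the denominator is the row count.
SELECT CAST(COUNT(*) OVER (ORDER BY ss.ss_store_sk) AS DOUBLE)
       / COUNT(*) OVER ()
FROM store_sales ss
ORDER BY 1 LIMIT 20;
{code}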



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3359) Drill should throw an error when window function defined using WINDOW AS uses ROWS UNBOUNDED PRECEDING

2015-06-24 Thread Deneche A. Hakim (JIRA)
Deneche A. Hakim created DRILL-3359:
---

 Summary: Drill should throw an error when window function defined 
using WINDOW AS uses ROWS UNBOUNDED PRECEDING
 Key: DRILL-3359
 URL: https://issues.apache.org/jira/browse/DRILL-3359
 Project: Apache Drill
  Issue Type: Bug
Reporter: Deneche A. Hakim
Assignee: Sean Hsuan-Yi Chu


as part of DRILL-3188, the following query is not supported and Drill displays 
the proper error message:
{noformat}
0: jdbc:drill:zk=local> select sum(salary) over(partition by position_id order 
by salary rows unbounded preceding) from cp.`employee.json` limit 20;
Error: UNSUPPORTED_OPERATION ERROR: This type of window frame is currently not 
supported 
See Apache Drill JIRA: DRILL-3188
{noformat}

But when defining the same window using a WINDOW AS, Drill doesn't throw any 
error:
{noformat}
0: jdbc:drill:zk=local> select sum(salary) over w from cp.`employee.json` 
window w as (partition by position_id order by salary rows unbounded preceding) 
limit 20;
+---+
|  EXPR$0   |
+---+
| 8.0   |
| 3.0   |
| 135000.0  |
| 135000.0  |
| 135000.0  |
| 215000.0  |
| 215000.0  |
| 25000.0   |
| 15000.0   |
| 5.0   |
| 6700.0|
| 14700.0   |
| 34700.0   |
| 34700.0   |
| 5000.0|
| 13500.0   |
| 58500.0   |
| 5000.0|
| 11700.0   |
| 2.0   |
+---+
20 rows selected (0.348 seconds)
{noformat}

The results are, of course, incorrect



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3358) CUME_DIST window function provides wrong result when only ORDER BY clause is specified

2015-06-24 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim resolved DRILL-3358.
-
Resolution: Duplicate

 CUME_DIST window function provides wrong result when only ORDER BY clause is 
 specified
 --

 Key: DRILL-3358
 URL: https://issues.apache.org/jira/browse/DRILL-3358
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.1.0
Reporter: Abhishek Girish
Assignee: Deneche A. Hakim
  Labels: window_function
 Fix For: 1.1.0


 *Drill:*
 {code:sql}
  SELECT CUME_DIST() OVER (ORDER BY ss.ss_store_sk) FROM store_sales ss ORDER 
  BY 1 LIMIT 20;
 +-+
 |   EXPR$0|
 +-+
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 | 0.9923989432198661  |
 +-+
 20 rows selected (17.317 seconds)
 {code}
 *Postgres*
 {code:sql}
 # SELECT CUME_DIST() OVER (ORDER BY ss.ss_store_sk) FROM store_sales ss ORDER 
 BY 1 LIMIT 20;
  cume_dist
 -------------------
  0.158622193275665
  0.158622193275665
  0.158622193275665
  0.158622193275665
  0.158622193275665
  0.158622193275665
  0.158622193275665
  0.158622193275665
  0.158622193275665
  0.158622193275665
  0.158622193275665
  0.158622193275665
  0.158622193275665
  0.158622193275665
  0.158622193275665
  0.158622193275665
  0.158622193275665
  0.158622193275665
  0.158622193275665
  0.158622193275665
 (20 rows)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3365) Query with window function on large dataset fails with IOException: Mkdirs failed to create spill directory

2015-06-24 Thread Abhishek Girish (JIRA)
Abhishek Girish created DRILL-3365:
--

 Summary: Query with window function on large dataset fails with 
IOException: Mkdirs failed to create spill directory
 Key: DRILL-3365
 URL: https://issues.apache.org/jira/browse/DRILL-3365
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.1.0
Reporter: Abhishek Girish
Assignee: Chris Westin
Priority: Minor


Dataset: TPC-DS SF100 Parquet

Query: 
{code:sql}
SELECT sum(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk ORDER BY 
ss.ss_customer_sk) AS PartialSum FROM store_sales ss GROUP BY 
ss.ss_net_paid_inc_tax, ss.ss_store_sk, ss.ss_customer_sk  LIMIT 20;
java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
java.io.IOException: Mkdirs failed to create 
/tmp/drill/spill/2a74ac18-0679-ab99-26c6-af41b9af7f4e/major_fragment_1/minor_fragment_17/operator_4
 (exists=false, cwd=file:///opt/mapr/drill/drill-1.1.0/bin)
Fragment 1:17
[Error Id: 4905b400-fc0f-4287-beba-d1ca18359986 on abhi5.qa.lab:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)
{code}

I was unable to find the corresponding logs. This was consistently seen via a JDBC 
program and sqlline. 

After I restarted the drillbits, the issue seems to have been resolved, but I wanted 
to report this anyway. A possible explanation is DRILL-2917 (one or more 
drillbits were in an inconsistent state).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 35609: DRILL-3243: Need a better error message - Use of alias in window function definition

2015-06-23 Thread Hanifi Gunes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35609/#review89036
---

Ship it!


Ship It!

- Hanifi Gunes


On June 18, 2015, 3:14 p.m., abdelhakim deneche wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/35609/
 ---
 
 (Updated June 18, 2015, 3:14 p.m.)
 
 
 Review request for drill and Hanifi Gunes.
 
 
 Bugs: DRILL-3243
 https://issues.apache.org/jira/browse/DRILL-3243
 
 
 Repository: drill-git
 
 
 Description
 ---
 
 changed RepeatedVarCharOutput to display the column name as part of the 
 exception's message
 changed CompliantTextRecordReader to throw a DATA_READ user exception 
 added new unit test to TestNewTextReader
 
 
 Diffs
 -
 
   common/src/main/java/org/apache/drill/common/exceptions/UserException.java 
 6f28a2b 
   
 common/src/main/java/org/apache/drill/common/exceptions/UserRemoteException.java
  1b3fa42 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/CompliantTextRecordReader.java
  254e0d8 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/RepeatedVarCharOutput.java
  40276f4 
   
 exec/java-exec/src/test/java/org/apache/drill/exec/store/text/TestNewTextReader.java
  76674f9 
 
 Diff: https://reviews.apache.org/r/35609/diff/
 
 
 Testing
 ---
 
 all unit tests are passing along with functional and tpch100
 
 
 Thanks,
 
 abdelhakim deneche
 




[jira] [Created] (DRILL-3337) Queries with Window Function DENSE_RANK fail with SchemaChangeException

2015-06-22 Thread Abhishek Girish (JIRA)
Abhishek Girish created DRILL-3337:
--

 Summary: Queries with Window Function DENSE_RANK fail with 
SchemaChangeException
 Key: DRILL-3337
 URL: https://issues.apache.org/jira/browse/DRILL-3337
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.1.0
Reporter: Abhishek Girish
Assignee: Aman Sinha


Example queries which result in exceptions:

DENSE_RANK WF with ORDER BY 1 column and GROUP BY, ORDER BY on the main query
{code:sql}
SELECT DENSE_RANK() OVER (ORDER BY ss.ss_store_sk) FROM store_sales ss GROUP BY 
ss.ss_store_sk, ss.ss_net_paid_inc_tax ORDER BY 1 LIMIT 20;
Error: SYSTEM ERROR: org.apache.drill.exec.exception.SchemaChangeException: 
Failure while materializing expression. 
Error in expression at index 0.  Error: Missing function implementation: 
[dense_rank(INT-REQUIRED)].  Full expression: null.
Fragment 4:10
[Error Id: 4b9187db-e770-4e7f-afe4-0d4dfc045088 on abhi6.qa.lab:31010] 
(state=,code=0)
{code}

DENSE_RANK WF with PARTITION BY 2 columns and ORDER BY 2 column and GROUP BY, 
ORDER BY on the main query
{code:sql}
SELECT DENSE_RANK() OVER (PARTITION BY s.ss_store_sk, s.ss_customer_sk ORDER BY 
s.ss_store_sk, s.ss_customer_sk) FROM store_sales s GROUP BY s.ss_store_sk, 
s.ss_customer_sk, s.ss_quantity ORDER BY 1  LIMIT 20;
Error: SYSTEM ERROR: org.apache.drill.exec.exception.SchemaChangeException: 
Failure while materializing expression. 
Error in expression at index 0.  Error: Missing function implementation: 
[dense_rank(INT-REQUIRED)].  Full expression: null.
Fragment 5:22
[Error Id: 3ac6e4ce-5bb3-4058-b806-3d0becbbd0d1 on abhi6.qa.lab:31010] 
(state=,code=0)
{code}


*The queries execute fine on Postgres*
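
A possible rewrite for cross-checking (sketch only, same tables and columns as above; not verified against this build) is to perform the GROUP BY in a derived table and apply DENSE_RANK over its output:
{code:sql}
-- Sketch: aggregate first, then rank the grouped rows.
SELECT DENSE_RANK() OVER (ORDER BY g.ss_store_sk)
FROM (
  SELECT ss.ss_store_sk, ss.ss_net_paid_inc_tax
  FROM store_sales ss
  GROUP BY ss.ss_store_sk, ss.ss_net_paid_inc_tax
) g
ORDER BY 1
LIMIT 20;
{code}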



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3293) CTAS with window function fails with UnsupportedOperationException

2015-06-22 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni resolved DRILL-3293.
---
Resolution: Fixed

fixed in commit: ffae1691c0cd526ed1095fbabbc0855d016790d7

 CTAS with window function fails with UnsupportedOperationException
 --

 Key: DRILL-3293
 URL: https://issues.apache.org/jira/browse/DRILL-3293
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.0.0
Reporter: Victoria Markman
Assignee: Jinfeng Ni
  Labels: window_function
 Fix For: 1.1.0

 Attachments: t1_parquet


 {code}
 0: jdbc:drill:schema=dfs> create table wf_t1 as select sum(a1) over(partition 
 by a1) from t1;
 Error: SYSTEM ERROR:
 Fragment 0:0
 [Error Id: 96897b46-70c0-4373-9d85-ca7501cb1479 on atsqa4-133.qa.lab:31010] 
 (state=,code=0)
 {code}
 drillbit.log
 {code}
 [Error Id: bde0d90b-7eaa-4772-9316-9c58a46b01d2 on atsqa4-133.qa.lab:31010]
 org.apache.drill.common.exceptions.UserException: SYSTEM ERROR:
 Fragment 0:0
 [Error Id: bde0d90b-7eaa-4772-9316-9c58a46b01d2 on atsqa4-133.qa.lab:31010]
 at 
 org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:522)
  ~[drill-common-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:325)
  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:181)
  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:294)
  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
  [drill-common-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  [na:1.7.0_71]
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  [na:1.7.0_71]
 at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
 Caused by: java.lang.UnsupportedOperationException: null
 at 
 org.apache.drill.exec.expr.TypeHelper.getValueVectorClass(TypeHelper.java:674)
  ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.VectorContainer.addOrGet(VectorContainer.java:82)
  ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema(ProjectRecordBatch.java:421)
  ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:78)
  ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:129)
  ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:146)
  ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:105)
  ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:95)
  ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
  ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:129)
  ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:146)
  ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:105)
  ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:95)
  ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext(WriterRecordBatch.java:92)
  ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:146)
  ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT

[jira] [Resolved] (DRILL-3265) Query with window function and sort below that spills to disk runs out of memory

2015-06-22 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman resolved DRILL-3265.
-
Resolution: Fixed

 Query with window function and sort below that spills to disk runs out of 
 memory
 

 Key: DRILL-3265
 URL: https://issues.apache.org/jira/browse/DRILL-3265
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 1.0.0
Reporter: Victoria Markman
Assignee: Deneche A. Hakim
  Labels: window_function
 Fix For: 1.1.0

 Attachments: drill-3265.log


 Failure:
 {code}
 0: jdbc:drill:schema=dfs> select
 . . . . . . . . . . . . > sum(ss_quantity) over(partition by ss_store_sk 
 order by ss_sold_date_sk)
 . . . . . . . . . . . . >   from store_sales;
 java.lang.RuntimeException: java.sql.SQLException: RESOURCE ERROR: One or 
 more nodes ran out of memory while executing the query.
 Fragment 1:13
 [Error Id: 72609220-e431-41fc-8505-8f9740c96153 on atsqa4-133.qa.lab:31010]
 at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
 at 
 sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
 at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
 at sqlline.SqlLine.print(SqlLine.java:1583)
 at sqlline.Commands.execute(Commands.java:852)
 at sqlline.Commands.sql(Commands.java:751)
 at sqlline.SqlLine.dispatch(SqlLine.java:738)
 at sqlline.SqlLine.begin(SqlLine.java:612)
 at sqlline.SqlLine.start(SqlLine.java:366)
 at sqlline.SqlLine.main(SqlLine.java:259)
 {code}
 Failure looks legitimate from the customer's point of view (granted that we know 
 that we can run out of memory in some cases).
 However, there are a couple of problems with this scenario:
 1. It looks like we are running out of memory during the disk-based sort
 {code}
 2015-06-09 16:55:45,693 [2a88e633-b714-49a1-c36a-509a9c817c77:frag:1:19] WARN 
  o.a.d.e.p.i.xsort.ExternalSortBatch - Starting to merge. 3311 batch groups. 
 Current allocated memory: 17456644
 2015-06-09 16:55:45,763 [2a88e633-b714-49a1-c36a-509a9c817c77:frag:1:13] WARN 
  o.a.d.exec.memory.BufferAllocator - Unable to allocate buffer of size 20833 
 due to memory limit. Current allocation: 19989451
 java.lang.Exception: null
 at 
 org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.buffer(TopLevelAllocator.java:253)
  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.buffer(TopLevelAllocator.java:273)
  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.vector.UInt1Vector.allocateNew(UInt1Vector.java:171) 
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.vector.NullableIntVector.allocateNew(NullableIntVector.java:204)
  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.vector.AllocationHelper.allocateNew(AllocationHelper.java:56)
  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.test.generated.PriorityQueueCopierGen139.allocateVectors(PriorityQueueCopierTemplate.java:123)
  [na:na]
 at 
 org.apache.drill.exec.test.generated.PriorityQueueCopierGen139.next(PriorityQueueCopierTemplate.java:66)
  [na:na]
 at 
 org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:224)
  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:146)
  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:105)
  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:95)
  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:92)
  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:146)
  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:105)
  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT

[jira] [Resolved] (DRILL-3294) False schema change exception in CTAS with AVG window function

2015-06-22 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni resolved DRILL-3294.
---
Resolution: Fixed

Fixed in commit: ffae1691c0cd526ed1095fbabbc0855d016790d7. 



 False schema change exception in CTAS with AVG window function
 --

 Key: DRILL-3294
 URL: https://issues.apache.org/jira/browse/DRILL-3294
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.0.0
Reporter: Victoria Markman
Assignee: Jinfeng Ni
  Labels: window_function
 Fix For: 1.1.0

 Attachments: t1_parquet


 This bug could be related to DRILL-3293, but since it's a different function 
 and a different symptom, I'm filing a new one.
 {code}
 0: jdbc:drill:schema=dfs> create table wf_t1(a1) as select avg(a1) 
 over(partition by a1) from t1;
 Error: SYSTEM ERROR: org.apache.drill.exec.exception.SchemaChangeException: 
 Failure while trying to materialize incoming schema.  Errors:
  
 Error in expression at index -1.  Error: Missing function implementation: 
 [castTINYINT(NULL-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--..
 Fragment 0:0
 [Error Id: 1ca5af3a-0ea7-4b75-b493-74f6404d4894 on atsqa4-133.qa.lab:31010] 
 (state=,code=0)
 {code}
 Query works correctly by itself:
 {code}
 0: jdbc:drill:schema=dfs> select avg(a1) over(partition by a1) from t1;
 +-+
 | EXPR$0  |
 +-+
 | 1   |
 | 2   |
 | 3   |
 | 4   |
 | 5   |
 | 6   |
 | 7   |
 | 9   |
 | 10  |
 | null|
 +-+
 10 rows selected (0.181 seconds)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 35584: DRILL-3298: fix wrong result for window function query when no partition-by is present

2015-06-18 Thread Aman Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35584/
---

(Updated June 18, 2015, 7:07 p.m.)


Review request for drill and Jinfeng Ni.


Changes
---

New patch after incorporating review comments and adding unit test.


Bugs: DRILL-3298
https://issues.apache.org/jira/browse/DRILL-3298


Repository: drill-git


Description
---

The JIRA DRILL-3298 has relevant discussion on this.  The fix involves creating 
a single stream as input to the Window by inserting a SingleMergeExchange if 
only the ORDER-BY clause is present in a Window.  This ensures that the Sort 
below the Window is done in parallel followed by a Merge.
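
For context, the class of queries affected is a window with an ORDER BY but no PARTITION BY, for example (illustrative sketch only; the table and column names are placeholders, not taken from the JIRA):
{code:sql}
select l_orderkey,
       sum(l_quantity) over (order by l_orderkey) as running_qty
from lineitem;
{code}
With the change, each fragment sorts its slice in parallel and the inserted SingleMergeExchange merges the sorted streams into the single ordered input that the Window operator requires.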


Diffs (updated)
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
 f7728c8 
  exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
2ec2481 

Diff: https://reviews.apache.org/r/35584/diff/


Testing
---

Manual testing of the query in DRILL-3298.  Ran unit tests.  Functional tests 
are in progress.


Thanks,

Aman Sinha



Re: Review Request 35584: DRILL-3298: fix wrong result for window function query when no partition-by is present

2015-06-18 Thread Aman Sinha


 On June 18, 2015, 1:11 a.m., Jinfeng Ni wrote:
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java,
   line 64
  https://reviews.apache.org/r/35584/diff/1/?file=986627#file986627line64
 
  This might not be related to this JIRA, but I feel this for loop is 
  problematic.  Are we going to support the following:
  
  avg(c_1) over (partition by c_2 order by c_3), sum(c_2) over (partition 
  by c_4 order by c_5)
  
  Seems each different group will re-set the traits based on different 
  partition/order keys, which would cause a correctness problem.
  
  If we do not support it, then we should probably add an assertion check to 
  make sure there is only one group in the window.
 
 Aman Sinha wrote:
 Currently your query will fail with an unsupported operation error: 
 
 Error: UNSUPPORTED_OPERATION ERROR: Multiple window definitions in a 
 single SELECT list is not currently supported
 See Apache Drill JIRA: DRILL-3196
 
 I am not sure if resetting the traits in the loop would cause incorrect 
 results; it would be suboptimal. There is a comment at the beginning: 
   // TODO: Order window based on existing partition by
 //input.getTraitSet().subsumes()
 
 Jinfeng Ni wrote:
 Right. The UNSUPPORTED_OPERATION ERROR is what I expect. If we do not 
 allow multiple window definitions, then it seems this for loop in WindowPrule is 
 not necessary. Instead, we could simply check if window.groups has size = 1. 
 Something like:
 
  Preconditions.checkArgument(window.groups.size() == 1, "Only one window definition is allowed.");
 
 Window.Group windowBase = window.groups.get(0);

I discussed this offline with Jinfeng and we agreed that the current logic in 
the for loop will produce valid plans.  In fact, please see a discussion of 
this in DRILL-3196 where both the plan and results were validated for multiple 
window partitions.  The reason we are currently blocking this functionality is 
a lack of sufficient testing, not the implementation.


- Aman


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35584/#review88321
---


On June 18, 2015, 7:07 p.m., Aman Sinha wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/35584/
 ---
 
 (Updated June 18, 2015, 7:07 p.m.)
 
 
 Review request for drill and Jinfeng Ni.
 
 
 Bugs: DRILL-3298
 https://issues.apache.org/jira/browse/DRILL-3298
 
 
 Repository: drill-git
 
 
 Description
 ---
 
 The JIRA DRILL-3298 has relevant discussion on this.  The fix involves 
 creating a single stream as input to the Window by inserting a 
 SingleMergeExchange if only the ORDER-BY clause is present in a Window.  This 
 ensures that the Sort below the Window is done in parallel followed by a 
 Merge.
 
 
 Diffs
 -
 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
  f7728c8 
   exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
 2ec2481 
 
 Diff: https://reviews.apache.org/r/35584/diff/
 
 
 Testing
 ---
 
 Manual testing of the query in DRILL-3298.  Ran unit tests.  Functional tests 
 are in progress.
 
 
 Thanks,
 
 Aman Sinha
 




Re: Review Request 35584: DRILL-3298: fix wrong result for window function query when no partition-by is present

2015-06-18 Thread Jinfeng Ni

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35584/#review88432
---

Ship it!


Ship It!

- Jinfeng Ni


On June 18, 2015, 12:07 p.m., Aman Sinha wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/35584/
 ---
 
 (Updated June 18, 2015, 12:07 p.m.)
 
 
 Review request for drill and Jinfeng Ni.
 
 
 Bugs: DRILL-3298
 https://issues.apache.org/jira/browse/DRILL-3298
 
 
 Repository: drill-git
 
 
 Description
 ---
 
 The JIRA DRILL-3298 has relevant discussion on this.  The fix involves 
 creating a single stream as input to the Window by inserting a 
 SingleMergeExchange if only the ORDER-BY clause is present in a Window.  This 
 ensures that the Sort below the Window is done in parallel followed by a 
 Merge.
 
 
 Diffs
 -
 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
  f7728c8 
   exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
 2ec2481 
 
 Diff: https://reviews.apache.org/r/35584/diff/
 
 
 Testing
 ---
 
 Manual testing of the query in DRILL-3298.  Ran unit tests.  Functional tests 
 are in progress.
 
 
 Thanks,
 
 Aman Sinha
 




[jira] [Created] (DRILL-3307) Query with window function runs out of memory

2015-06-17 Thread Abhishek Girish (JIRA)
Abhishek Girish created DRILL-3307:
--

 Summary: Query with window function runs out of memory
 Key: DRILL-3307
 URL: https://issues.apache.org/jira/browse/DRILL-3307
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.1.0
 Environment: Data set: TPC-DS SF 100 Parquet
Number of Nodes: 4
Reporter: Abhishek Girish
Assignee: Deneche A. Hakim
 Attachments: drillbit.log.txt

Query with window function runs out of memory:
{code:sql}
 SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) AS 
TotalSpend FROM store_sales ss ORDER BY 1 LIMIT 20;
java.lang.RuntimeException: java.sql.SQLException: RESOURCE ERROR: One or more 
nodes ran out of memory while executing the query.

Fragment 3:0

[Error Id: 9af19064-9175-46a4-b557-714d1c77cd76 on abhi6.qa.lab:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)
{code}

Plan:
{code}
00-00Screen : rowType = RecordType(ANY TotalSpend): rowcount = 
2.87997024E8, cumulative cost = {4.3487550824E9 rows, 5.7539970079068695E10 
cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 memory}, id = 142297
00-01  SelectionVectorRemover : rowType = RecordType(ANY TotalSpend): 
rowcount = 2.87997024E8, cumulative cost = {4.31995538E9 rows, 
5.751117037666869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
memory}, id = 142296
00-02Limit(fetch=[20]) : rowType = RecordType(ANY TotalSpend): rowcount 
= 2.87997024E8, cumulative cost = {4.031958356E9 rows, 5.722317335266869E10 
cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 memory}, id = 142295
00-03  SingleMergeExchange(sort0=[0 ASC]) : rowType = RecordType(ANY 
TotalSpend): rowcount = 2.87997024E8, cumulative cost = {4.031958336E9 rows, 
5.722317327266869E10 cpu, 0.0 io, 7.077814861824E12 network, 4.607952384E9 
memory}, id = 142294
01-01SelectionVectorRemover : rowType = RecordType(ANY TotalSpend): 
rowcount = 2.87997024E8, cumulative cost = {3.743961312E9 rows, 
5.261522088866869E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
memory}, id = 142293
01-02  TopN(limit=[20]) : rowType = RecordType(ANY TotalSpend): 
rowcount = 2.87997024E8, cumulative cost = {3.455964288E9 rows, 
5.232722386466869E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
memory}, id = 142292
01-03Project(TotalSpend=[$0]) : rowType = RecordType(ANY 
TotalSpend): rowcount = 2.87997024E8, cumulative cost = {3.167967264E9 rows, 
4.734841414759049E10 cpu, 0.0 io, 5.89817905152E12 network, 4.607952384E9 
memory}, id = 142291
01-04  HashToRandomExchange(dist0=[[$0]]) : rowType = 
RecordType(ANY TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 
2.87997024E8, cumulative cost = {3.167967264E9 rows, 4.734841414759049E10 cpu, 
0.0 io, 5.89817905152E12 network, 4.607952384E9 memory}, id = 142290
02-01UnorderedMuxExchange : rowType = RecordType(ANY 
TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 2.87997024E8, cumulative 
cost = {2.87997024E9 rows, 4.274046176359049E10 cpu, 0.0 io, 3.538907430912E12 
network, 4.607952384E9 memory}, id = 142289
03-01  Project(TotalSpend=[$0], 
E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))]) : rowType = 
RecordType(ANY TotalSpend, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 
2.87997024E8, cumulative cost = {2.591973216E9 rows, 4.245246473959049E10 cpu, 
0.0 io, 3.538907430912E12 network, 4.607952384E9 memory}, id = 142288
03-02Project(TotalSpend=[CASE(>($2, 0), CAST($3):ANY, 
null)]) : rowType = RecordType(ANY TotalSpend): rowcount = 2.87997024E8, 
cumulative cost = {2.303976192E9 rows, 4.130047664359049E10 cpu, 0.0 io, 
3.538907430912E12 network, 4.607952384E9 memory}, id = 142287
03-03  Window(window#0=[window(partition {1} order by 
[] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [COUNT($0), 
$SUM0($0)])]) : rowType = RecordType(ANY ss_net_paid_inc_tax, ANY ss_store_sk, 
BIGINT w0$o0, ANY w0$o1): rowcount = 2.87997024E8, cumulative cost = 
{2.015979168E9 rows, 4.014848854759049E10 cpu, 0.0 io, 3.538907430912E12 
network, 4.607952384E9 memory}, id = 142286
03-04SelectionVectorRemover : rowType = 
RecordType(ANY ss_net_paid_inc_tax

Review Request 35584: DRILL-3298: fix wrong result for window function query when no partition-by is present

2015-06-17 Thread Aman Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35584/
---

Review request for drill and Jinfeng Ni.


Bugs: DRILL-3298
https://issues.apache.org/jira/browse/DRILL-3298


Repository: drill-git


Description
---

The JIRA DRILL-3298 has relevant discussion on this.  The fix involves creating 
a single stream as input to the Window by inserting a SingleMergeExchange if 
only the ORDER-BY clause is present in a Window.  This ensures that the Sort 
below the Window is done in parallel followed by a Merge.


Diffs
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
 f7728c8 

Diff: https://reviews.apache.org/r/35584/diff/


Testing
---

Manual testing of the query in DRILL-3298.  Ran unit tests.  Functional tests 
are in progress.


Thanks,

Aman Sinha



Re: Review Request 35584: DRILL-3298: fix wrong result for window function query when no partition-by is present

2015-06-17 Thread Aman Sinha


 On June 18, 2015, 1:11 a.m., Jinfeng Ni wrote:
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java,
   line 64
  https://reviews.apache.org/r/35584/diff/1/?file=986627#file986627line64
 
  This might not be related to this JIRA, but I feel this for loop is 
  problematic.  Are we going to support the following:
  
  avg(c_1) over (partition by c_2 order by c_3), sum(c_2) over (partition 
  by c_4 order by c_5)
  
  Seems each different group will re-set the traits based on different 
  partition/order keys, which would cause a correctness problem.
  
  If we do not support it, then we should probably add an assertion check to 
  make sure there is only one group in the window.

Currently your query will fail with an unsupported operation error: 

Error: UNSUPPORTED_OPERATION ERROR: Multiple window definitions in a single 
SELECT list is not currently supported
See Apache Drill JIRA: DRILL-3196

I am not sure if resetting the traits in the loop would cause incorrect 
results; it would be suboptimal. There is a comment at the beginning: 
  // TODO: Order window based on existing partition by
//input.getTraitSet().subsumes()


 On June 18, 2015, 1:11 a.m., Jinfeng Ni wrote:
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java,
   line 99
  https://reviews.apache.org/r/35584/diff/1/?file=986627#file986627line99
 
  Line 99 uses windowBase.orderKeys, while line 181 uses 
  window.collation(). It took me a while to figure out they refer to the same 
  thing. Is it better to use one form, to avoid confusion?

Yes, I will use window.collation() to be consistent.


 On June 18, 2015, 1:11 a.m., Jinfeng Ni wrote:
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java,
   line 100
  https://reviews.apache.org/r/35584/diff/1/?file=986627#file986627line100
 
  Why do you need to call convert() over exch? The input has been converted, 
  and the SingleMergeExchangePrel is explicitly created in this rule. It sounds like 
  it's not necessary to call convert() again.

Agreed, the convert() is not necessary in this case. Will remove it and assign 
convertedInput = SingleMergeExchange.


On June 18, 2015, 1:11 a.m., Aman Sinha wrote:
  Also, you may consider adding one unit test case for the failed query 
  reported in the JIRA.

I am trying to create a simpler repro than the one in the JIRA (that one has 
many parquet files) and will upload a new patch with a unit test and above 
changes.


- Aman


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35584/#review88321
---


On June 17, 2015, 11:59 p.m., Aman Sinha wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/35584/
 ---
 
 (Updated June 17, 2015, 11:59 p.m.)
 
 
 Review request for drill and Jinfeng Ni.
 
 
 Bugs: DRILL-3298
 https://issues.apache.org/jira/browse/DRILL-3298
 
 
 Repository: drill-git
 
 
 Description
 ---
 
 The JIRA DRILL-3298 has relevant discussion on this.  The fix involves 
 creating a single stream as input to the Window by inserting a 
 SingleMergeExchange if only the ORDER-BY clause is present in a Window.  This 
 ensures that the Sort below the Window is done in parallel followed by a 
 Merge.
 
 
 Diffs
 -
 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
  f7728c8 
 
 Diff: https://reviews.apache.org/r/35584/diff/
 
 
 Testing
 ---
 
 Manual testing of the query in DRILL-3298.  Ran unit tests.  Functional tests 
 are in progress.
 
 
 Thanks,
 
 Aman Sinha
 




Re: Review Request 35584: DRILL-3298: fix wrong result for window function query when no partition-by is present

2015-06-17 Thread Jinfeng Ni

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35584/#review88321
---



exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
 (line 64)
https://reviews.apache.org/r/35584/#comment140783

This might not be related to this JIRA, but I feel this for loop is 
problematic.  Are we going to support the following:

avg(c_1) over (partition by c_2 order by c_3), sum(c_2) over (partition by 
c_4 order by c_5)

Seems each different group will re-set the traits based on different 
partition/order keys, which would cause a correctness problem.

If we do not support it, then we should probably add an assertion check to make 
sure there is only one group in the window.



exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
 (line 99)
https://reviews.apache.org/r/35584/#comment140778

Line 99 uses windowBase.orderKeys, while line 181 uses window.collation(). 
It took me a while to figure out they refer to the same thing. Is it better to 
use one form, to avoid confusion?



exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
 (line 100)
https://reviews.apache.org/r/35584/#comment140780

Why do you need to call convert() over exch? The input has been converted, and 
the SingleMergeExchangePrel is explicitly created in this rule. It sounds like it's not 
necessary to call convert() again.


Also, you may consider adding one unit test case for the failed query reported 
in the JIRA.

- Jinfeng Ni


On June 17, 2015, 4:59 p.m., Aman Sinha wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/35584/
 ---
 
 (Updated June 17, 2015, 4:59 p.m.)
 
 
 Review request for drill and Jinfeng Ni.
 
 
 Bugs: DRILL-3298
 https://issues.apache.org/jira/browse/DRILL-3298
 
 
 Repository: drill-git
 
 
 Description
 ---
 
 The JIRA DRILL-3298 has relevant discussion on this.  The fix involves 
 creating a single stream as input to the Window by inserting a 
 SingleMergeExchange if only the ORDER-BY clause is present in a Window.  This 
 ensures that the Sort below the Window is done in parallel followed by a 
 Merge.
 
 
 Diffs
 -
 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
  f7728c8 
 
 Diff: https://reviews.apache.org/r/35584/diff/
 
 
 Testing
 ---
 
 Manual testing of the query in DRILL-3298.  Ran unit tests.  Functional tests 
 are in progress.
 
 
 Thanks,
 
 Aman Sinha
 




[jira] [Created] (DRILL-3294) False schema change exception in CTAS with AVG window function

2015-06-15 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3294:
---

 Summary: False schema change exception in CTAS with AVG window 
function
 Key: DRILL-3294
 URL: https://issues.apache.org/jira/browse/DRILL-3294
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 1.0.0
Reporter: Victoria Markman
Assignee: Chris Westin


This bug could be related to DRILL-3293, but since it's a different function 
and different symptom, I'm filing a new one.

{code}
0: jdbc:drill:schema=dfs create table wf_t1(a1) as select avg(a1) 
over(partition by a1) from t1;
Error: SYSTEM ERROR: org.apache.drill.exec.exception.SchemaChangeException: 
Failure while trying to materialize incoming schema.  Errors:
 
Error in expression at index -1.  Error: Missing function implementation: 
[castTINYINT(NULL-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 1ca5af3a-0ea7-4b75-b493-74f6404d4894 on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}

Query works correctly by itself:
{code}
0: jdbc:drill:schema=dfs select avg(a1) over(partition by a1) from t1;
+-+
| EXPR$0  |
+-+
| 1   |
| 2   |
| 3   |
| 4   |
| 5   |
| 6   |
| 7   |
| 9   |
| 10  |
| null|
+-+
10 rows selected (0.181 seconds)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3293) CTAS with window function fails with NPE

2015-06-15 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3293:
---

 Summary: CTAS with window function fails with NPE
 Key: DRILL-3293
 URL: https://issues.apache.org/jira/browse/DRILL-3293
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.0.0
Reporter: Victoria Markman
Assignee: Chris Westin


{code}
0: jdbc:drill:schema=dfs create table wf_t1 as select sum(a1) over(partition 
by a1) from t1;
Error: SYSTEM ERROR:

Fragment 0:0

[Error Id: 96897b46-70c0-4373-9d85-ca7501cb1479 on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}

drillbit.log
{code}
[Error Id: bde0d90b-7eaa-4772-9316-9c58a46b01d2 on atsqa4-133.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR:

Fragment 0:0

[Error Id: bde0d90b-7eaa-4772-9316-9c58a46b01d2 on atsqa4-133.qa.lab:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:522)
 ~[drill-common-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:325)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:181)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:294)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) 
[drill-common-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_71]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_71]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
Caused by: java.lang.UnsupportedOperationException: null
at 
org.apache.drill.exec.expr.TypeHelper.getValueVectorClass(TypeHelper.java:674) 
~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.VectorContainer.addOrGet(VectorContainer.java:82) 
~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema(ProjectRecordBatch.java:421)
 ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:78)
 ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:129)
 ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:146)
 ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:105)
 ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:95)
 ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
 ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:129)
 ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:146)
 ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:105)
 ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:95)
 ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext(WriterRecordBatch.java:92)
 ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:146)
 ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:105)
 ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:95)
 ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
 ~[drill-java-exec-1.1.0

[jira] [Created] (DRILL-3298) Wrong result with SUM window function and order by without partition by in the OVER clause

2015-06-15 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3298:
---

 Summary: Wrong result with SUM window function and order by 
without partition by in the OVER clause
 Key: DRILL-3298
 URL: https://issues.apache.org/jira/browse/DRILL-3298
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.0.0
Reporter: Victoria Markman
Assignee: Chris Westin
Priority: Critical


This query returns incorrect result when planner.slice_target = 1
{code}
select
j1.c_integer,
sum(j1.c_integer) over w
from j1
window  w as (order by c_integer desc)
order by
1, 2;
{code}

Query plan with planner.slice_target = 1
{code}
00-01  Project(c_integer=[$0], EXPR$1=[$1])
00-02SingleMergeExchange(sort0=[0 ASC], sort1=[1 ASC])
01-01  SelectionVectorRemover
01-02Sort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])
01-03  Project(c_integer=[$0], EXPR$1=[$1])
01-04HashToRandomExchange(dist0=[[$0]], dist1=[[$1]])
02-01  UnorderedMuxExchange
03-01Project(c_integer=[$0], EXPR$1=[$1], 
E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($1, hash64AsDouble($0)))])
03-02  Project(c_integer=[$0], EXPR$1=[CASE(($1, 0), 
CAST($2):ANY, null)])
03-03Window(window#0=[window(partition {} order by [0 
DESC] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT($0), 
$SUM0($0)])])
03-04  SelectionVectorRemover
03-05Sort(sort0=[$0], dir0=[DESC])
03-06  Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/j1]], 
selectionRoot=/drill/testdata/subqueries/j1, numFiles=1, 
columns=[`c_integer`]]])
{code}

Query plan with planner.slice_target = 10;
{code}
00-01  Project(c_integer=[$0], EXPR$1=[$1])
00-02SelectionVectorRemover
00-03  Sort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])
00-04Project(c_integer=[$0], EXPR$1=[CASE(($1, 0), CAST($2):ANY, 
null)])
00-05  Window(window#0=[window(partition {} order by [0 DESC] range 
between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT($0), $SUM0($0)])])
00-06SelectionVectorRemover
00-07  Sort(sort0=[$0], dir0=[DESC])
00-08Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/j1]], 
selectionRoot=/drill/testdata/subqueries/j1, numFiles=1, 
columns=[`c_integer`]]])
{code}

Attached:
* table j1
* test.res - result generated with postgres



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3280) Missing OVER clause in window function query results in AssertionError

2015-06-11 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-3280:
-

 Summary: Missing OVER clause in window function query results in 
AssertionError
 Key: DRILL-3280
 URL: https://issues.apache.org/jira/browse/DRILL-3280
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.0.0
Reporter: Khurram Faraaz
Assignee: Chris Westin


A missing OVER clause results in an AssertionError.
Instead, we need an error message that says the window function call 
requires an OVER clause.

{code}
0: jdbc:drill:schema=dfs.tmp select rank(), cume_dist() over w from 
`allDataInPrq/0_0_0.parquet` window w as (partition by col_chr order by 
col_dbl);
Error: SYSTEM ERROR: org.apache.drill.exec.work.foreman.ForemanException: 
Unexpected exception during fragment initialization: null


[Error Id: f8675256-eea9-4ca6-859c-4c0b714f27a0 on centos-02.qa.lab:31010] 
(state=,code=0)
{code}
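
For reference, the form the reporter presumably intended, with the named window attached to rank() as well, would be something like the following (a sketch, not run against this data set):

{code}
SELECT RANK() OVER w, CUME_DIST() OVER w
FROM `allDataInPrq/0_0_0.parquet`
WINDOW w AS (PARTITION BY col_chr ORDER BY col_dbl);
{code}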

Stack trace from drillbit.log

{code}
2015-06-11 20:50:42,054 [2a860b5d-dd87-087f-3730-bf47a10f5d97:foreman] ERROR 
o.a.d.c.exceptions.UserException - SYSTEM ERROR: 
org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
during fragment initialization: null


[Error Id: f8675256-eea9-4ca6-859c-4c0b714f27a0 on centos-02.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
during fragment initialization: null


[Error Id: f8675256-eea9-4ca6-859c-4c0b714f27a0 on centos-02.qa.lab:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:522)
 ~[drill-common-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:738)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:840)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:782)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) 
[drill-common-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:784)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:893) 
[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:253) 
[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_45]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_45]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
exception during fragment initialization: null
... 4 common frames omitted
Caused by: java.lang.AssertionError: null
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.getRootField(SqlToRelConverter.java:3810)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.adjustInputRef(SqlToRelConverter.java:3139)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:3114)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.access$1400(SqlToRelConverter.java:180)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4061)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:3489)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at org.apache.calcite.sql.SqlIdentifier.accept(SqlIdentifier.java:274) 
~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:3944)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertSortExpression(SqlToRelConverter.java:3962)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertOver(SqlToRelConverter.java:1756)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.access$1000(SqlToRelConverter.java:180)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0

[jira] [Created] (DRILL-3265) Query with window function and sort below that spills to disk runs out of memory

2015-06-09 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3265:
---

 Summary: Query with window function and sort below that spills to 
disk runs out of memory
 Key: DRILL-3265
 URL: https://issues.apache.org/jira/browse/DRILL-3265
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 1.0.0
Reporter: Victoria Markman
Assignee: Chris Westin


Failure:

{code}
0: jdbc:drill:schema=dfs select
. . . . . . . . . . . .sum(ss_quantity) over(partition by ss_store_sk 
order by ss_sold_date_sk)
. . . . . . . . . . . .  from store_sales;
java.lang.RuntimeException: java.sql.SQLException: RESOURCE ERROR: One or more 
nodes ran out of memory while executing the query.

Fragment 1:13

[Error Id: 72609220-e431-41fc-8505-8f9740c96153 on atsqa4-133.qa.lab:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)
{code}

The failure looks legitimate from the customer's point of view (granted that we know 
we can run out of memory in some cases).

However, there are a couple of problems with this scenario:

1. It looks like we are running out of memory during the disk-based sort
{code}
2015-06-09 16:55:45,693 [2a88e633-b714-49a1-c36a-509a9c817c77:frag:1:19] WARN  
o.a.d.e.p.i.xsort.ExternalSortBatch - Starting to merge. 3311 batch groups. 
Current allocated memory: 17456644
2015-06-09 16:55:45,763 [2a88e633-b714-49a1-c36a-509a9c817c77:frag:1:13] WARN  
o.a.d.exec.memory.BufferAllocator - Unable to allocate buffer of size 20833 due 
to memory limit. Current allocation: 19989451
java.lang.Exception: null
at 
org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.buffer(TopLevelAllocator.java:253)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.buffer(TopLevelAllocator.java:273)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.UInt1Vector.allocateNew(UInt1Vector.java:171) 
[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.NullableIntVector.allocateNew(NullableIntVector.java:204)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.AllocationHelper.allocateNew(AllocationHelper.java:56)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.test.generated.PriorityQueueCopierGen139.allocateVectors(PriorityQueueCopierTemplate.java:123)
 [na:na]
at 
org.apache.drill.exec.test.generated.PriorityQueueCopierGen139.next(PriorityQueueCopierTemplate.java:66)
 [na:na]
at 
org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:224)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:146)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:105)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:95)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:92)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:146)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:105)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:95)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.window.WindowFrameRecordBatch.innerNext(WindowFrameRecordBatch.java:123)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT

Re: Window function query takes too long to complete and return results

2015-06-09 Thread Abdel Hakim Deneche
Please open a JIRA issue, and please provide the test file (compressed) or a
script to generate similar data.

Thanks!

On Tue, Jun 9, 2015 at 6:55 PM, Khurram Faraaz kfar...@maprtech.com wrote:

 Query that uses window functions takes too long to complete and return
 results. It returns close to a million records, for which it took 533.8
 seconds ~8 minutes
 Input CSV file has two columns, one integer and another varchar type
 column. Please let me know if this needs to be investigated and I can
 report a JIRA to track this if required ?

 Size of the input CSV file

 root@centos-01 ~]# hadoop fs -ls /tmp/manyDuplicates.csv

 -rwxr-xr-x   3 root root   27889455 2015-06-10 01:26
 /tmp/manyDuplicates.csv

 {code}

 select count(*) over(partition by cast(columns[1] as varchar(25)) order by
 cast(columns[0] as bigint)) from `manyDuplicates.csv`;

 ...

 1,000,007 rows selected (533.857 seconds)
 {code}

 There are five distinct values in columns[1] in the CSV file. = [FIVE
 PARTITIONS]

 {code}
 0: jdbc:drill:schema=dfs.tmp select distinct columns[1] from `manyDuplicates.csv`;
 +---------+
 | EXPR$0  |
 +---------+
 |         |
 |         |
 |         |
 |         |
 |         |
 +---------+
 5 rows selected (1.906 seconds)
 {code}

 Here is the count for each of those values in columns[1]

 {code}
 0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from `manyDuplicates.csv` where columns[1] = '';
 +---------+
 | EXPR$0  |
 +---------+
 | 200484  |
 +---------+
 1 row selected (0.961 seconds)
 {code}

 {code}
 0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from `manyDuplicates.csv` where columns[1] = '';
 +---------+
 | EXPR$0  |
 +---------+
 | 199353  |
 +---------+
 1 row selected (0.86 seconds)
 {code}

 {code}
 0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from `manyDuplicates.csv` where columns[1] = '';
 +---------+
 | EXPR$0  |
 +---------+
 | 200702  |
 +---------+
 1 row selected (0.826 seconds)
 {code}

 {code}
 0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from `manyDuplicates.csv` where columns[1] = '';
 +---------+
 | EXPR$0  |
 +---------+
 | 199916  |
 +---------+
 1 row selected (0.851 seconds)
 {code}

 {code}
 0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from `manyDuplicates.csv` where columns[1] = '';
 +---------+
 | EXPR$0  |
 +---------+
 | 199552  |
 +---------+
 1 row selected (0.827 seconds)
 {code}

 Thanks,
 Khurram




-- 

Abdelhakim Deneche

Software Engineer

  http://www.mapr.com/


Now Available - Free Hadoop On-Demand Training
http://www.mapr.com/training?utm_source=Emailutm_medium=Signatureutm_campaign=Free%20available


Window function query takes too long to complete and return results

2015-06-09 Thread Khurram Faraaz
A query that uses window functions takes too long to complete and return
results. It returns close to a million records, which took 533.8 seconds
(~8 minutes).
The input CSV file has two columns, one integer and one varchar.
Please let me know if this needs to be investigated, and I can report a JIRA
to track this if required.

Size of the input CSV file

root@centos-01 ~]# hadoop fs -ls /tmp/manyDuplicates.csv

-rwxr-xr-x   3 root root   27889455 2015-06-10 01:26 /tmp/manyDuplicates.csv

{code}

select count(*) over(partition by cast(columns[1] as varchar(25)) order by
cast(columns[0] as bigint)) from `manyDuplicates.csv`;

...

1,000,007 rows selected (533.857 seconds)
{code}

There are five distinct values in columns[1] in the CSV file. = [FIVE
PARTITIONS]

{code}
0: jdbc:drill:schema=dfs.tmp select distinct columns[1] from `manyDuplicates.csv`;
+---------+
| EXPR$0  |
+---------+
|         |
|         |
|         |
|         |
|         |
+---------+
5 rows selected (1.906 seconds)
{code}

Here is the count for each of those values in columns[1]

{code}
0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from `manyDuplicates.csv` where columns[1] = '';
+---------+
| EXPR$0  |
+---------+
| 200484  |
+---------+
1 row selected (0.961 seconds)
{code}

{code}
0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from `manyDuplicates.csv` where columns[1] = '';
+---------+
| EXPR$0  |
+---------+
| 199353  |
+---------+
1 row selected (0.86 seconds)
{code}

{code}
0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from `manyDuplicates.csv` where columns[1] = '';
+---------+
| EXPR$0  |
+---------+
| 200702  |
+---------+
1 row selected (0.826 seconds)
{code}

{code}
0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from `manyDuplicates.csv` where columns[1] = '';
+---------+
| EXPR$0  |
+---------+
| 199916  |
+---------+
1 row selected (0.851 seconds)
{code}

{code}
0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from `manyDuplicates.csv` where columns[1] = '';
+---------+
| EXPR$0  |
+---------+
| 199552  |
+---------+
1 row selected (0.827 seconds)
{code}

Thanks,
Khurram


[jira] [Created] (DRILL-3269) Window function query takes too long to complete and return results

2015-06-09 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-3269:
-

 Summary: Window function query takes too long to complete and 
return results
 Key: DRILL-3269
 URL: https://issues.apache.org/jira/browse/DRILL-3269
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.0.0
 Environment: 1de6aed93efce8a524964371d96673b8ef192d89
Reporter: Khurram Faraaz
Assignee: Chris Westin
Priority: Minor


A query that uses window functions takes too long to complete and return results. 
It returns close to a million records, which took 533.8 seconds (~8 minutes).
The input CSV file has two columns, one integer and one varchar. 
Please take a look.

Size of the input CSV file
root@centos-01 ~]# hadoop fs -ls /tmp/manyDuplicates.csv
-rwxr-xr-x   3 root root   27889455 2015-06-10 01:26 /tmp/manyDuplicates.csv

{code}
select count(*) over(partition by cast(columns[1] as varchar(25)) order by 
cast(columns[0] as bigint)) from `manyDuplicates.csv`;
...
1,000,007 rows selected (533.857 seconds)
{code}

There are five distinct values in columns[1] in the CSV file. = [FIVE 
PARTITIONS]

{code}
0: jdbc:drill:schema=dfs.tmp select distinct columns[1] from 
`manyDuplicates.csv`;
+---+
|EXPR$0 |
+---+
|   |
|   |
|   |
|   |
|   |
+---+
5 rows selected (1.906 seconds)
{code}

Here is the count for each of those values in columns[1]

{code}
0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from 
`manyDuplicates.csv` where columns[1] = '';
+-+
| EXPR$0  |
+-+
| 200484  |
+-+
1 row selected (0.961 seconds)
{code}

{code}
0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from 
`manyDuplicates.csv` where columns[1] = '';
+-+
| EXPR$0  |
+-+
| 199353  |
+-+
1 row selected (0.86 seconds)
{code}

{code}
0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from 
`manyDuplicates.csv` where columns[1] = '';
+-+
| EXPR$0  |
+-+
| 200702  |
+-+
1 row selected (0.826 seconds)
{code}

{code}
0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from 
`manyDuplicates.csv` where columns[1] = '';
+-+
| EXPR$0  |
+-+
| 199916  |
+-+
1 row selected (0.851 seconds)
{code}

{code}
0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from 
`manyDuplicates.csv` where columns[1] = '';
+-+
| EXPR$0  |
+-+
| 199552  |
+-+
1 row selected (0.827 seconds)
{code}

Query plan for the long running query
{code}
| 00-00Screen
00-01  UnionExchange
01-01Project(EXPR$0=[$0])
01-02  Project($0=[$2])
01-03Window(window#0=[window(partition {1} order by [0] range 
between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT()])])
01-04  SelectionVectorRemover
01-05Sort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC])
01-06  Project($0=[$0], $1=[$1])
01-07HashToRandomExchange(dist0=[[$1]])
02-01  UnorderedMuxExchange
03-01Project($0=[$0], $1=[$1], 
E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($1))])
03-02  Project($0=[CAST(ITEM($0, 0)):BIGINT], 
$1=[CAST(ITEM($0, 1)):VARCHAR(25) CHARACTER SET ISO-8859-1 COLLATE 
ISO-8859-1$en_US$primary])
03-03Scan(groupscan=[EasyGroupScan 
[selectionRoot=/tmp/manyDuplicates.csv, numFiles=1, columns=[`columns`[0], 
`columns`[1]], files=[maprfs:///tmp/manyDuplicates.csv]]])
{code}

Python script to generate data in CSV format
{code}
import random

f = open('/Users/kungfo/manyDuplicates.csv', 'a')
# Note: the loop bound and the five second-column string literals were garbled
# by the mail archive and are left as they appear; the generated file had
# roughly 1,000,007 rows.
for i in range(1,00):
    # first column: random integer in 1..99, second column: one of five string values
    f.write(str(random.choice(xrange(1,100)))+','+str(random.choice(['','','','','']))+'\n')
    f.flush()
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Window function query takes too long to complete and return results

2015-06-09 Thread Steven Phillips
In cases like this where you are printing millions of records in SQLLine,
you should pipe the output to /dev/null or to a file, and measure the
performance that way. I'm guessing that most of the time in this case is
spent printing the output to the console, and is thus really unrelated to
Drill performance. If piping the data to a file or /dev/null causes the
query to run much faster, then it probably isn't a real issue.

Also, any time you are investigating a performance-related issue, you should
always check the profile. In this case, I suspect you might see that most
of the time is spent in the WAIT time of the SCREEN operator. That would
indicate that client-side processing is slowing the query down.

On Tue, Jun 9, 2015 at 7:09 PM, Abdel Hakim Deneche adene...@maprtech.com
wrote:

 please open a JIRA issue. please provide the test file (compressed) or a
 script to generate similar data.

 Thanks!

 On Tue, Jun 9, 2015 at 6:55 PM, Khurram Faraaz kfar...@maprtech.com
 wrote:

  Query that uses window functions takes too long to complete and return
  results. It returns close to a million records, for which it took 533.8
  seconds ~8 minutes
  Input CSV file has two columns, one integer and another varchar type
  column. Please let me know if this needs to be investigated and I can
  report a JIRA to track this if required ?
 
  Size of the input CSV file
 
  root@centos-01 ~]# hadoop fs -ls /tmp/manyDuplicates.csv
 
  -rwxr-xr-x   3 root root   27889455 2015-06-10 01:26
  /tmp/manyDuplicates.csv
 
  {code}
 
  select count(*) over(partition by cast(columns[1] as varchar(25)) order
 by
  cast(columns[0] as bigint)) from `manyDuplicates.csv`;
 
  ...
 
  1,000,007 rows selected (533.857 seconds)
  {code}
 
  There are five distinct values in columns[1] in the CSV file. = [FIVE
  PARTITIONS]
 
  {code}
  0: jdbc:drill:schema=dfs.tmp select distinct columns[1] from `manyDuplicates.csv`;
  +---------+
  | EXPR$0  |
  +---------+
  |         |
  |         |
  |         |
  |         |
  |         |
  +---------+
  5 rows selected (1.906 seconds)
  {code}

  Here is the count for each of those values in columns[1]

  {code}
  0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from `manyDuplicates.csv` where columns[1] = '';
  +---------+
  | EXPR$0  |
  +---------+
  | 200484  |
  +---------+
  1 row selected (0.961 seconds)
  {code}

  {code}
  0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from `manyDuplicates.csv` where columns[1] = '';
  +---------+
  | EXPR$0  |
  +---------+
  | 199353  |
  +---------+
  1 row selected (0.86 seconds)
  {code}

  {code}
  0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from `manyDuplicates.csv` where columns[1] = '';
  +---------+
  | EXPR$0  |
  +---------+
  | 200702  |
  +---------+
  1 row selected (0.826 seconds)
  {code}

  {code}
  0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from `manyDuplicates.csv` where columns[1] = '';
  +---------+
  | EXPR$0  |
  +---------+
  | 199916  |
  +---------+
  1 row selected (0.851 seconds)
  {code}

  {code}
  0: jdbc:drill:schema=dfs.tmp select count(columns[1]) from `manyDuplicates.csv` where columns[1] = '';
  +---------+
  | EXPR$0  |
  +---------+
  | 199552  |
  +---------+
  1 row selected (0.827 seconds)
  {code}
 
  Thanks,
  Khurram
 



 --

 Abdelhakim Deneche

 Software Engineer

   http://www.mapr.com/


 Now Available - Free Hadoop On-Demand Training
 
 http://www.mapr.com/training?utm_source=Emailutm_medium=Signatureutm_campaign=Free%20available
 




-- 
 Steven Phillips
 Software Engineer

 mapr.com


[jira] [Created] (DRILL-3241) Query with window function runs out of direct memory and does not report back to client that it did

2015-06-01 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3241:
---

 Summary: Query with window function runs out of direct memory and 
does not report back to client that it did
 Key: DRILL-3241
 URL: https://issues.apache.org/jira/browse/DRILL-3241
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 1.0.0
Reporter: Victoria Markman
Assignee: Chris Westin


Even though the query ran out of memory and was cancelled on the server, the client 
(sqlline) was never notified of the event, and it appears to the user that the query 
is hung.

Configuration:
Single drillbit configured with:
DRILL_MAX_DIRECT_MEMORY=2G
DRILL_HEAP=1G
TPCDS100 parquet files

Query:
{code}
select 
  sum(ss_quantity) over(partition by ss_store_sk order by ss_sold_date_sk) 
from store_sales;
{code}

drillbit.log
{code}
2015-06-01 21:42:29,514 [BitServer-5] ERROR o.a.d.exec.rpc.RpcExceptionHandler 
- Exception in RPC communication.  Connection: /10.10.88.133:31012 -- 
/10.10.88.133:38887 (data server).  Closing connection.
io.netty.handler.codec.DecoderException: java.lang.OutOfMemoryError: Direct 
buffer memory
at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233)
 ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
 [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
 [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
 [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
 [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
 [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
 [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:618)
 [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
at 
io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:329) 
[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) 
[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
at 
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
 [netty-common-4.0.27.Final.jar:4.0.27.Final]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_71]
at java.nio.DirectByteBuffer.init(DirectByteBuffer.java:123) 
~[na:1.7.0_71]
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) 
~[na:1.7.0_71]
at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:437) 
~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179) 
~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
at io.netty.buffer.PoolArena.allocate(PoolArena.java:168) 
~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
at io.netty.buffer.PoolArena.reallocate(PoolArena.java:280) 
~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
at io.netty.buffer.PooledByteBuf.capacity(PooledByteBuf.java:110) 
~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:251) 
~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:849) 
~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:841) 
~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:831) 
~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
at io.netty.buffer.WrappedByteBuf.writeBytes(WrappedByteBuf.java:600) 
~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.buffer.UnsafeDirectLittleEndian.writeBytes(UnsafeDirectLittleEndian.java:28)
 ~[drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:4.0.27.Final]
at 
io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:92)
 ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:227

[jira] [Created] (DRILL-3243) Need a better error message - Use of alias in window function definition

2015-06-01 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-3243:
-

 Summary: Need a better error message - Use of alias in window 
function definition
 Key: DRILL-3243
 URL: https://issues.apache.org/jira/browse/DRILL-3243
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.0.0
Reporter: Khurram Faraaz
Assignee: Chris Westin


We need a better error message when an alias for a window function expression is used 
in a query. For example, with OVER(PARTITION BY columns[0] ORDER BY columns[1]) tmp, 
if the alias tmp is then used in the predicate, we need a message that says column tmp 
does not exist; that is how it is reported in Postgres 9.3.

Postgres 9.3

{code}
postgres=# select count(*) OVER(partition by type order by id) `tmp` from 
airports where tmp is not null;
ERROR:  column tmp does not exist
LINE 1: ...ect count(*) OVER(partition by type order by id) `tmp` from ...
 ^
{code}

Drill 1.0
{code}
0: jdbc:drill:schema=dfs.tmp select count(*) OVER(partition by columns[2] 
order by columns[0]) tmp from `airports.csv` where tmp is not null;
Error: SYSTEM ERROR: java.lang.IllegalArgumentException: Selected column(s) 
must have name 'columns' or must be plain '*'

Fragment 0:0

[Error Id: 66987b81-fe50-422d-95e4-9ce61c873584 on centos-02.qa.lab:31010] 
(state=,code=0)
{code}
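
For comparison, the usual rewrite accepted by both engines is to compute the window expression under an alias in a subquery and filter on that alias in the outer query; a sketch against the same file (it mirrors the original predicate, and the col0 alias is made up):

{code}
SELECT *
FROM (
  SELECT columns[0] AS col0,
         COUNT(*) OVER (PARTITION BY columns[2] ORDER BY columns[0]) AS tmp
  FROM `airports.csv`
) t
WHERE t.tmp IS NOT NULL;
{code}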



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3218) Window function usage throws CompileException

2015-05-29 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-3218:
-

 Summary: Window function usage throws CompileException
 Key: DRILL-3218
 URL: https://issues.apache.org/jira/browse/DRILL-3218
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.0.0
 Environment: faec150598840c40827e6493992d81209aa936da
Reporter: Khurram Faraaz
Assignee: Chris Westin



PARTITION BY date ORDER BY timestamp

{code}
0: jdbc:drill:schema=dfs.tmp SELECT MAX(columns[0]) OVER (PARTITION BY 
columns[6] ORDER BY columns[4]) FROM `allTypData2.csv`;
Error: SYSTEM ERROR: org.codehaus.commons.compiler.CompileException: Line 330, 
Column 31: Unknown variable or type incoming

Fragment 0:0

[Error Id: 285af8f1-ddb4-4d3e-a2d7-bfaef20df5e0 on centos-02.qa.lab:31010] 
(state=,code=0)
{code}

I will add more details in a bit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3219) Filter is not pushed into subquery with window function

2015-05-29 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3219:
---

 Summary: Filter is not pushed into subquery  with window function
 Key: DRILL-3219
 URL: https://issues.apache.org/jira/browse/DRILL-3219
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning  Optimization
Affects Versions: 1.0.0
Reporter: Victoria Markman
Assignee: Jinfeng Ni


{code}
0: jdbc:drill:schema=dfs explain plan for select * from ( select a1, b1, c1, 
sum(a1) over(partition by b1) from t1 ) where c1 is not null;
+--+--+
| text | json |
+--+--+
| 00-00Screen
00-01  Project(a1=[$0], b1=[$1], c1=[$2], EXPR$3=[$3])
00-02Project(a1=[$0], b1=[$1], c1=[$2], EXPR$3=[CASE(($3, 0), 
CAST($4):ANY, null)])
00-03  SelectionVectorRemover
00-04Filter(condition=[IS NOT NULL($2)])
00-05  Window(window#0=[window(partition {1} order by [] range 
between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [COUNT($0), 
$SUM0($0)])])
00-06SelectionVectorRemover
00-07  Sort(sort0=[$1], dir0=[ASC])
00-08Project(a1=[$2], b1=[$1], c1=[$0])
00-09  Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/t1]], 
selectionRoot=/drill/testdata/subqueries/t1, numFiles=1, columns=[`a1`, `b1`, 
`c1`]]])
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


select * + window function gives strange column names

2015-05-28 Thread Abdel Hakim Deneche
I have a small json file that contains the following columns:
employee_id, position_id, salary, sub and count

when I run the following query:

select *, count(*) over(partition by position_id) from `myfile.json`;

I get this list of columns:

T1¦¦employee_id  |  T1¦¦position_id  |  T1¦¦sub  |  T1¦¦salary  |
  position_id  |  w0$o0  |  w0$o00  |  EXPR$1 |


is this correct ?


-- 

Abdelhakim Deneche

Software Engineer

  http://www.mapr.com/


Now Available - Free Hadoop On-Demand Training
http://www.mapr.com/training?utm_source=Emailutm_medium=Signatureutm_campaign=Free%20available


Re: select * + window function gives strange column names

2015-05-28 Thread Abdel Hakim Deneche
Never mind, DRILL-3210 https://issues.apache.org/jira/browse/DRILL-3210
was just filed for a similar problem.

On Thu, May 28, 2015 at 3:51 PM, Abdel Hakim Deneche adene...@maprtech.com
wrote:

 I have a small json file that contains the following columns:
 employee_id, position_id, salary, sub and count

 when I run the following query:

 select *, count(*) over(partition by position_id) from `myfile.json`;

 I get this list of columns:

 T1¦¦employee_id  |  T1¦¦position_id  |  T1¦¦sub  |  T1¦¦salary  |
  position_id  |  w0$o0  |  w0$o00  |  EXPR$1 |


 is this correct ?


 --

 Abdelhakim Deneche

 Software Engineer

   http://www.mapr.com/


 Now Available - Free Hadoop On-Demand Training
 http://www.mapr.com/training?utm_source=Emailutm_medium=Signatureutm_campaign=Free%20available




-- 

Abdelhakim Deneche

Software Engineer

  http://www.mapr.com/


Now Available - Free Hadoop On-Demand Training
http://www.mapr.com/training?utm_source=Emailutm_medium=Signatureutm_campaign=Free%20available


[jira] [Created] (DRILL-3211) Assert in a query with window function and group by clause

2015-05-28 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3211:
---

 Summary: Assert in a query with window function and group by 
clause 
 Key: DRILL-3211
 URL: https://issues.apache.org/jira/browse/DRILL-3211
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning  Optimization
Affects Versions: 1.0.0
Reporter: Victoria Markman
Assignee: Jinfeng Ni


{code}
0: jdbc:drill:schema=dfs select sum(a1) over (partition by b1)  from t1 group 
by b1;
Error: SYSTEM ERROR: java.lang.AssertionError: Internal error: while converting 
SUM(`t1`.`a1`)
[Error Id: 21872cfa-6f09-4e92-aee6-5dd8698cf9e7 on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}
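
For reference, one standard-legal way to express what this query appears to intend is to aggregate first in a subquery and then apply the window over the grouped rows; a sketch, assuming t1 has columns a1 and b1:

{code}
SELECT SUM(s) OVER (PARTITION BY b1)
FROM (SELECT b1, SUM(a1) AS s FROM t1 GROUP BY b1) t;
{code}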

drillbit.log
{code}
Caused by: java.lang.AssertionError: Internal error: while converting 
SUM(`t1`.`a1`)
at org.apache.calcite.util.Util.newInternal(Util.java:790) 
~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.ReflectiveConvertletTable$2.convertCall(ReflectiveConvertletTable.java:152)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:60)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertOver(SqlToRelConverter.java:1762)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.access$1000(SqlToRelConverter.java:180)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:3937)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.createAggImpl(SqlToRelConverter.java:2521)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertAgg(SqlToRelConverter.java:2342)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:604)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:564)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:2741)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:522)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at org.apache.calcite.prepare.PlannerImpl.convert(PlannerImpl.java:198) 
~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRel(DefaultSqlHandler.java:246)
 ~[drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:182)
 ~[drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
at 
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:177)
 ~[drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:902) 
[drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:240) 
[drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
... 3 common frames omitted
Caused by: java.lang.reflect.InvocationTargetException: null
at sun.reflect.GeneratedMethodAccessor120.invoke(Unknown Source) 
~[na:na]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[na:1.7.0_71]
at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_71]
at 
org.apache.calcite.sql2rel.ReflectiveConvertletTable$2.convertCall(ReflectiveConvertletTable.java:142)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
... 19 common frames omitted
Caused by: java.lang.AssertionError: null
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.getRootField(SqlToRelConverter.java:3810)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.adjustInputRef(SqlToRelConverter.java:3139)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:3114)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.access$1400(SqlToRelConverter.java:180)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7]
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4061)
 ~[calcite-core-1.1.0-drill-r7.jar:1.1.0-drill-r7

[jira] [Created] (DRILL-3181) Change error message to a more clear one when window range is specified for window function

2015-05-26 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3181:
---

 Summary: Change error message to a more clear one when window 
range is specified for window function
 Key: DRILL-3181
 URL: https://issues.apache.org/jira/browse/DRILL-3181
 Project: Apache Drill
  Issue Type: Bug
Reporter: Victoria Markman


This error message makes me think that some data types are supported in the 
frame clause, when in fact at this point we only support unbounded 
preceding/unbounded following and current row. We should change the error message 
to say that.

{code}
0: jdbc:drill:schema=dfs select a2, count(b2) over(partition by a2 order by a2 
range between 10 preceding and 10 following) from t2;
Error: PARSE ERROR: From line 1, column 69 to line 1, column 70: Data type of 
ORDER BY prohibits use of RANGE clause
[Error Id: 67f94be3-d4a0-4d4c-94f5-d88b14fadbf3 on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2405) Generate test data for window function instead of downloading it from S3

2015-03-09 Thread Deneche A. Hakim (JIRA)
Deneche A. Hakim created DRILL-2405:
---

 Summary: Generate test data for window function instead of 
downloading it from S3
 Key: DRILL-2405
 URL: https://issues.apache.org/jira/browse/DRILL-2405
 Project: Apache Drill
  Issue Type: Improvement
Affects Versions: 0.9.0
Reporter: Deneche A. Hakim
Assignee: Deneche A. Hakim
Priority: Minor


Currently, the unit tests for the window functions use pre-generated JSON files 
that are hosted on S3. Update the unit tests to generate the files on the fly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 30051: DRILL-1908: new window function implementation

2015-03-03 Thread abdelhakim deneche

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30051/
---

(Updated March 3, 2015, 10 p.m.)


Review request for drill and Jacques Nadeau.


Changes
---

patch rebased to master


Bugs: DRILL-1908
https://issues.apache.org/jira/browse/DRILL-1908


Repository: drill-git


Description
---

In order to fix DRILL-1487, a complete rewrite of the 
StreamingWindowFrameRecordBatch was needed. This patch adds a new 
WindowFrameRecordBatch that correctly handles window functions with or without 
ORDER BY clauses.
This code still lacks support for frame clauses and could be optimized to reduce 
unneeded frame computations.
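
For reference, a minimal sketch of the two query shapes this covers (the table t and columns a, b, c are hypothetical):

{code}
-- without ORDER BY: one aggregate value per partition
SELECT SUM(a) OVER (PARTITION BY b) FROM t;

-- with ORDER BY: a running aggregate within each partition
SELECT SUM(a) OVER (PARTITION BY b ORDER BY c) FROM t;
{code}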


Diffs (updated)
-

  
common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 
28424a5 
  common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c 
  contrib/data/pom.xml 86075f2 
  contrib/data/window-test-data/pom.xml PRE-CREATION 
  exec/java-exec/pom.xml 28eca2b 
  exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 5efcce8 
  exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 
5288f5d 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java
 17738ee 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/OverFinder.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java
 9b8929f 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java
 87209eb 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java
 e2c7e9e 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java
 9588cef 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java
 3b7adca 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java
 f1a8bc0 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java
 00c20b2 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrel.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java
 232778a 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
 3d3e96f 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java
 a9d2ef8 
  
exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java 
6eff6db 

Diff: https://reviews.apache.org/r/30051/diff/


Testing
---

Unit tests are available to test window functions in multiple scenarios:
- b1.p1: single batch with a single partition
- b1.p2: 2 batches, each containing a different partition
- b2.p4: 2 batches and 4 partitions, one partition has rows in both batches
- b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd batch 
and has rows in 3 batches
- b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an 
edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has 
enough saved batches to call its framer.doWork() without the need to call 
next(incoming)

All tests, except the last one, come in 2 variations: with and without an order 
by clause

all unit tests pass. functional, sf100 and customer tests don't add any new 
failures


Thanks,

abdelhakim deneche



Re: Review Request 30051: DRILL-1908: new window function implementation

2015-02-12 Thread abdelhakim deneche

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30051/
---

(Updated Feb. 12, 2015, 6:20 p.m.)


Review request for drill and Jacques Nadeau.


Changes
---

rebased the patch.


Bugs: DRILL-1908
https://issues.apache.org/jira/browse/DRILL-1908


Repository: drill-git


Description
---

In order to fix DRILL-1487 a complete rewrite of the 
StreamingWindowFrameRecordBatch was needed. This patch adds a new 
WindowFrameRecordBatch that correctly handles window functions with or without 
order by clauses.
This code still lacks support for frame clauses and may be optimized to reduce 
unneeded frame computations.


Diffs (updated)
-

  
common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 
28424a5 
  common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c 
  contrib/data/pom.xml 86075f2 
  contrib/data/window-test-data/pom.xml PRE-CREATION 
  exec/java-exec/pom.xml 06f60fb 
  exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 5efcce8 
  exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 
5288f5d 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java
 17738ee 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/OverFinder.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java
 9b8929f 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java
 26d23f2 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java
 e2c7e9e 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java
 9588cef 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java
 3b7adca 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java
 f1a8bc0 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java
 00c20b2 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrel.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java
 6b3d301 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
 aa0a5ad 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java
 a9d2ef8 
  
exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java 
6eff6db 

Diff: https://reviews.apache.org/r/30051/diff/


Testing (updated)
---

Unit tests are available to test window functions in multiple scenarios:
- b1.p1: single batch with a single partition
- b1.p2: 2 batches, each containing a different partition
- b2.p4: 2 batches and 4 partitions, one partition has rows in both batches
- b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd batch 
and has rows in 3 batches
- b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an 
edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has 
enough saved batches to call its framer.doWork() without the need to call 
next(incoming)

All tests, except the last one, come in 2 variations: with and without an order 
by clause

all unit tests pass. functional, sf100 and customer tests don't add any new 
failures


Thanks,

abdelhakim deneche



Re: Review Request 30051: DRILL-1908: new window function implementation

2015-02-02 Thread Chris Westin


 On Jan. 30, 2015, 1:12 a.m., Chris Westin wrote:
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/GenerateTestData.java,
   line 121
  https://reviews.apache.org/r/30051/diff/2/?file=826885#file826885line121
 
  Make the file prefix a command-line argument so that people besides 
  yourself can run this.
 
 abdelhakim deneche wrote:
 Argh, I forgot to remove GenerateTestData again!
 I just used this to generate the data used in the tests, it was never 
 intended to be part of the final code.
 Sorry about that

I would check it in. We might need it again in the future. What if something 
changes and we have to re-generate the test data?


 On Jan. 30, 2015, 1:12 a.m., Chris Westin wrote:
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java,
   line 32
  https://reviews.apache.org/r/30051/diff/2/?file=826892#file826892line32
 
  logger should be private.
 
 Jacques Nadeau wrote:
 I disagree.  Our current standard is package private.  If you think we 
 should change this throughout the code, we should have a discussion but my 
 preference is to maintain the current standard until we decide upon a new one.

Loggers identify their source in log messages thanks to the class argument 
given to getLogger(). They're meant to be associated with a class in a 
one-to-one manner -- why else would getLogger() have this parameter?

There are no uses of the pattern otherclass.logger..., so there's no reason 
for them to be package private. However, I have come across a few uses where a 
derived class uses the logger from its base class, and this is confusing. This 
has sent me looking in the wrong place for the source of the message, so we 
shouldn't do it. I've assumed it was accidental, and slipped by because the 
base class's logger wasn't private, so the author was able to use it without 
realizing it. Making them private will prevent that, and ensure that log 
messages correctly identify their real source.

Because we have no written standard that describes this, and because 
package-private loggers go against common best practice elsewhere, I've been 
converting these to private wherever I've come across them. In only a couple of 
cases has this made me add new loggers where a derived class was accidentally 
using its base class's logger.
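
For readers following along, the pattern being argued for here is the standard 
SLF4J idiom: every class declares its own private logger, so a subclass cannot 
silently reuse its base class's logger and log messages name their real source. 
A minimal sketch with made-up class names:

```
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class BaseBatch {
  // private: a subclass cannot accidentally reuse this logger
  private static final Logger logger = LoggerFactory.getLogger(BaseBatch.class);

  void setup() {
    logger.debug("setup");    // attributed to BaseBatch in the log output
  }
}

class WindowBatch extends BaseBatch {
  // each class gets its own logger, so messages identify their real source
  private static final Logger logger = LoggerFactory.getLogger(WindowBatch.class);

  void doWork() {
    logger.debug("doWork");   // attributed to WindowBatch, not BaseBatch
  }
}
```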


- Chris


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30051/#review70296
---


On Jan. 28, 2015, 7:50 p.m., abdelhakim deneche wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30051/
 ---
 
 (Updated Jan. 28, 2015, 7:50 p.m.)
 
 
 Review request for drill.
 
 
 Bugs: DRILL-1908
 https://issues.apache.org/jira/browse/DRILL-1908
 
 
 Repository: drill-git
 
 
 Description
 ---
 
 In order to fix DRILL-1487 a complete rewrite of the 
 StreamingWindowFrameRecordBatch was needed. This patch adds a new 
 WindowFrameRecordBatch that correctly handles window functions with or 
 without order by clauses.
 This code still lacks support for frame clauses and may be optimized to 
 reduce unneeded frame computations.
 
 
 Diffs
 -
 
   
 common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java
  28424a5 
   common/src/main/java/org/apache/drill/common/logical/data/Window.java 
 6dba77c 
   contrib/data/pom.xml 86075f2 
   contrib/data/window-test-data/pom.xml PRE-CREATION 
   exec/java-exec/pom.xml 90734a5 
   exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 
 190c13f 
   exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 
 5288f5d 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java
  17738ee 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/OverFinder.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java
  9b8929f 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java
  26d23f2 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java
  e2c7e9e 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java
  9588cef 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java
  PRE-CREATION 
   
 

Re: Review Request 30051: DRILL-1908: new window function implementation

2015-01-30 Thread Chris Westin


 On Jan. 24, 2015, 12:20 a.m., abdelhakim deneche wrote:
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java,
   line 85
  https://reviews.apache.org/r/30051/diff/3/?file=828946#file828946line85
 
  Yes, especially if the user uses over(), then we'll have one single 
  partition containing all rows.

Then it seems like we should have some way of limiting this and aborting, and 
should submit a ticket to possibly find another way to store the required data 
(spool to disk, like sort?) if this happens in the future.
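
As a rough sketch of the kind of guard suggested here (hypothetical names, 
plain Java; Drill's real operators would go through its memory allocator and, 
eventually, spill-to-disk machinery, neither of which is shown), the idea is 
simply to track how much data the operator has buffered and fail fast once a 
configured limit is crossed:

```
import java.util.ArrayDeque;
import java.util.Deque;

public class BufferedPartitionGuard {
  private final long maxBufferedBytes;
  private long bufferedBytes;
  private final Deque<byte[]> savedBatches = new ArrayDeque<>();

  BufferedPartitionGuard(long maxBufferedBytes) {
    this.maxBufferedBytes = maxBufferedBytes;
  }

  void save(byte[] batch) {
    bufferedBytes += batch.length;
    if (bufferedBytes > maxBufferedBytes) {
      // A real implementation would spill to disk here (like the sort operator)
      // instead of failing; failing fast is the stop-gap discussed above.
      throw new IllegalStateException(
          "window partition exceeds memory limit of " + maxBufferedBytes + " bytes");
    }
    savedBatches.addLast(batch);
  }

  public static void main(String[] args) {
    BufferedPartitionGuard guard = new BufferedPartitionGuard(16);
    guard.save(new byte[10]);
    guard.save(new byte[10]);  // exceeds 16 bytes -> IllegalStateException
  }
}
```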


- Chris


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30051/#review69506
---


On Jan. 28, 2015, 7:50 p.m., abdelhakim deneche wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30051/
 ---
 
 (Updated Jan. 28, 2015, 7:50 p.m.)
 
 
 Review request for drill.
 
 
 Bugs: DRILL-1908
 https://issues.apache.org/jira/browse/DRILL-1908
 
 
 Repository: drill-git
 
 
 Description
 ---
 
 In order to fix DRILL-1487 a complete rewrite of the 
 StreamingWindowFrameRecordBatch was needed. This patch adds a new 
 WindowFrameRecordBatch that correctly handles window functions with or 
 without order by clauses.
 This code still lacks support for frame clauses and may be optimized to 
 reduce unneeded frame computations.
 
 
 Diffs
 -
 
   
 common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java
  28424a5 
   common/src/main/java/org/apache/drill/common/logical/data/Window.java 
 6dba77c 
   contrib/data/pom.xml 86075f2 
   contrib/data/window-test-data/pom.xml PRE-CREATION 
   exec/java-exec/pom.xml 90734a5 
   exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 
 190c13f 
   exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 
 5288f5d 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java
  17738ee 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/OverFinder.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java
  9b8929f 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java
  26d23f2 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java
  e2c7e9e 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java
  9588cef 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java
  3b7adca 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java
  f1a8bc0 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java
  00c20b2 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrel.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java
  79603eb 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
  f20627d 
   
 exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java
  a9d2ef8 
   
 exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java
  6eff6db 
 
 Diff: https://reviews.apache.org/r/30051/diff/
 
 
 Testing
 ---
 
 Unit tests are available to test window functions in multiple scenarios:
 - b1.p1: single batch with a single partition
 - b1.p2: 2 batches, each containing a different partition
 - b2.p4: 2 batches and 4 partitions, one partition has rows in both batches
 - b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd 
 batch and has rows in 3 batches
 - b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an 
 edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has 
 enough saved batches to call its framer.doWork() without the need to call 
 next(incoming)
 
 All tests, except the last one, come in 2 variations: with and without 

Re: Review Request 30051: DRILL-1908: new window function implementation

2015-01-28 Thread abdelhakim deneche


 On Jan. 26, 2015, 5:51 p.m., Jacques Nadeau wrote:
  exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java,
   line 38
  https://reviews.apache.org/r/30051/diff/3/?file=828952#file828952line38
 
  Please add a negative test case when using multiple partitions.  In 
  that case, the failure should happen in the planning phase, not execution.

WindowPrel.getPhysicalOperator(PhysicalPlanCreator creator) has the following:
```
checkState(windows.size() == 1, "Only one window is expected in WindowPrel");
```
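
For context, that check is presumably Guava's Preconditions.checkState, which 
throws an IllegalStateException during planning, well before any execution 
starts. A standalone illustration with made-up data:

```
import static com.google.common.base.Preconditions.checkState;

import java.util.Arrays;
import java.util.List;

public class PlanTimeGuardExample {
  public static void main(String[] args) {
    // Pretend two window groups survived planning, which the operator does not support.
    List<String> windows = Arrays.asList("w1", "w2");
    try {
      checkState(windows.size() == 1, "Only one window is expected in WindowPrel");
    } catch (IllegalStateException e) {
      System.out.println("rejected at plan time: " + e.getMessage());
    }
  }
}
```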


- abdelhakim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30051/#review69617
---


On Jan. 28, 2015, 7:50 p.m., abdelhakim deneche wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30051/
 ---
 
 (Updated Jan. 28, 2015, 7:50 p.m.)
 
 
 Review request for drill.
 
 
 Bugs: DRILL-1908
 https://issues.apache.org/jira/browse/DRILL-1908
 
 
 Repository: drill-git
 
 
 Description
 ---
 
 In order to fix DRILL-1487 a complete rewrite of the 
 StreamingWindowFrameRecordBatch was needed. This patch adds a new 
 WindowFrameRecordBatch that correctly handles window functions with or 
 without order by clauses.
 This code still lacks support for frame clauses and may be optimized to 
 reduce unneeded frame computations.
 
 
 Diffs
 -
 
   
 common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java
  28424a5 
   common/src/main/java/org/apache/drill/common/logical/data/Window.java 
 6dba77c 
   contrib/data/pom.xml 86075f2 
   contrib/data/window-test-data/pom.xml PRE-CREATION 
   exec/java-exec/pom.xml 90734a5 
   exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 
 190c13f 
   exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 
 5288f5d 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java
  17738ee 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/OverFinder.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java
  9b8929f 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java
  26d23f2 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java
  e2c7e9e 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java
  9588cef 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java
  3b7adca 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java
  f1a8bc0 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java
  00c20b2 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrel.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java
  79603eb 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
  f20627d 
   
 exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java
  a9d2ef8 
   
 exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java
  6eff6db 
 
 Diff: https://reviews.apache.org/r/30051/diff/
 
 
 Testing
 ---
 
 Unit tests are available to test window functions in multiple scenarios:
 - b1.p1: single batch with a single partition
 - b1.p2: 2 batches, each containing a different partition
 - b2.p4: 2 batches and 4 partitions, one partition has rows in both batches
 - b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd 
 batch and has rows in 3 batches
 - b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an 
 edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has 
 enough saved batches to call its framer.doWork() without the need to call 
 next(incoming)
 
 All tests, except the last one, come in 2 variations: with and without order 

Re: Review Request 30051: DRILL-1908: new window function implementation

2015-01-28 Thread abdelhakim deneche

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30051/
---

(Updated Jan. 28, 2015, 7:50 p.m.)


Review request for drill.


Changes
---

- updated GeneratorMapping and MappingSet variables to follow the same convention 
used in HashJoinBatch and ChainedHashTable
- check and disable window function in sql parsing
- fail when schema changes
- use fail(Exception e) to report errors
- rename StreamingWindowPrel/StreamingWindowPrule to WindowPrel/WindowPrule


Bugs: DRILL-1908
https://issues.apache.org/jira/browse/DRILL-1908


Repository: drill-git


Description
---

In order to fix DRILL-1487 a complete rewrite of the 
StreamingWindowFrameRecordBatch was needed. This patch adds a new 
WindowFrameRecordBatch that correctly handles window functions with or without 
order by clauses.
This code still lacks support for frame clauses and may be optimized to reduce 
unneeded frame computations.


Diffs (updated)
-

  
common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java 
28424a5 
  common/src/main/java/org/apache/drill/common/logical/data/Window.java 6dba77c 
  contrib/data/pom.xml 86075f2 
  contrib/data/window-test-data/pom.xml PRE-CREATION 
  exec/java-exec/pom.xml 90734a5 
  exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 190c13f 
  exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 
5288f5d 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java
 17738ee 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/OverFinder.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java
 9b8929f 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java
 26d23f2 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java
 e2c7e9e 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java
 9588cef 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java
 3b7adca 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java
 f1a8bc0 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java
 00c20b2 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrel.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WindowPrule.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java
 79603eb 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
 f20627d 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java
 a9d2ef8 
  
exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java 
6eff6db 

Diff: https://reviews.apache.org/r/30051/diff/


Testing
---

Unit tests are available to test window functions in multiple scenarios:
- b1.p1: single batch with a single partition
- b1.p2: 2 batches, each containing a different partition
- b2.p4: 2 batches and 4 partitions, one partition has rows in both batches
- b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd batch 
and has rows in 3 batches
- b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an 
edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has 
enough saved batches to call its framer.doWork() without the need to call 
next(incoming)

All tests, except the last one, come in 2 variations: with and without order 
by clause


Thanks,

abdelhakim deneche



Re: Review Request 30051: DRILL-1908: new window function implementation

2015-01-26 Thread Jacques Nadeau


 On Jan. 23, 2015, 11:39 p.m., Chris Westin wrote:
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java,
   line 273
  https://reviews.apache.org/r/30051/diff/3/?file=828946#file828946line273
 
  Shouldn't these all be static?

No. We should actually fix the casing on the names. We originally had these as 
static, but we actually maintain state inside the mappings, which means they 
can't be static. In most of the code they were originally static, but then we 
realized the mistake. We fixed the static part, but I don't think we did a good 
job of fixing the case of all the declarations. He is being consistent with 
what we have elsewhere, but what is elsewhere isn't right stylistically.
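
To make that concrete with a made-up example (this is not Drill's 
GeneratorMapping): if an object that gets mutated while generating code for one 
operator instance were held in a static field, two operator instances in the 
same JVM would share and corrupt each other's state, so the field has to be 
per-instance even though the constant-style name now looks odd:

```
// Hypothetical illustration of why a stateful "mapping" cannot be a static field.
class Mapping {
  private int vectorIndex;                 // mutated while generating code for one operator
  void advance() { vectorIndex++; }
  int current() { return vectorIndex; }
}

class WindowOperator {
  // Must be per-instance: two operators generating code concurrently would otherwise
  // interleave their counters if this were 'static final'.
  private final Mapping EVAL_MAPPING = new Mapping();  // casing kept only for consistency

  int generate() {
    EVAL_MAPPING.advance();
    return EVAL_MAPPING.current();
  }
}

public class StaticStateDemo {
  public static void main(String[] args) {
    WindowOperator a = new WindowOperator();
    WindowOperator b = new WindowOperator();
    // With instance fields each operator counts independently: prints "1 1".
    System.out.println(a.generate() + " " + b.generate());
  }
}
```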


 On Jan. 23, 2015, 11:39 p.m., Chris Westin wrote:
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java,
   line 299
  https://reviews.apache.org/r/30051/diff/3/?file=828946#file828946line299
 
  static?

same as above.


- Jacques


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30051/#review69499
---


On Jan. 21, 2015, 11:05 p.m., abdelhakim deneche wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30051/
 ---
 
 (Updated Jan. 21, 2015, 11:05 p.m.)
 
 
 Review request for drill.
 
 
 Bugs: DRILL-1908
 https://issues.apache.org/jira/browse/DRILL-1908
 
 
 Repository: drill-git
 
 
 Description
 ---
 
 In order to fix DRILL-1487 a complete rewrite of the 
 StreamingWindowFrameRecordBatch was needed. This patch adds a new 
 WindowFrameRecordBatch that correctly handles window functions with or 
 without order by clauses.
 This code still lacks support for frame clauses and may be optimized to 
 reduce unneeded frame computations.
 
 
 Diffs
 -
 
   
 common/src/main/java/org/apache/drill/common/logical/data/AbstractBuilder.java
  28424a5 
   common/src/main/java/org/apache/drill/common/logical/data/Window.java 
 6dba77c 
   contrib/data/pom.xml 86075f2 
   contrib/data/window-test-data/pom.xml PRE-CREATION 
   exec/java-exec/pom.xml 90734a5 
   exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 
 190c13f 
   exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java 
 5288f5d 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/WindowPOP.java
  17738ee 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameBatchCreator.java
  9b8929f 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameRecordBatch.java
  a3e7940 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFrameTemplate.java
  b4e3fed 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/StreamingWindowFramer.java
  9588cef 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameBatchCreator.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrel.java
  f1a8bc0 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/StreamingWindowPrule.java
  00c20b2 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
  f20627d 
   
 exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java
  7c04477 
   
 exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestWindowFunctions.java
  6eff6db 
 
 Diff: https://reviews.apache.org/r/30051/diff/
 
 
 Testing
 ---
 
 Unit tests are available to test window functions in multiple scenarios:
 - b1.p1: single batch with a single partition
 - b1.p2: 2 batches, each containing a different partition
 - b2.p4: 2 batches and 4 partitions, one partition has rows in both batches
 - b3.p2: 3 batches and 2 partitions, one partition includes the whole 2nd 
 batch and has rows in 3 batches
 - b4.p4: 4 batches and 4 partitions, the partitions are arranged to test an 
 edge case: the 2nd time innerNext() is called, WindowFrameRecordBatch has 
 enough saved batches to call its framer.doWork() without the need to call 
 next(incoming)
 
 All tests, except the last one, come in 2 variations: with and without order 
 by clause
 
 
 Thanks,
 
 abdelhakim deneche
 



