[jira] [Assigned] (DRILL-7211) Batch Sizing in MergingReceiver
[ https://issues.apache.org/jira/browse/DRILL-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthikeyan Manivannan reassigned DRILL-7211:
---------------------------------------------

    Assignee: Karthikeyan Manivannan

> Batch Sizing in MergingReceiver
> -------------------------------
>
> Key: DRILL-7211
> URL: https://issues.apache.org/jira/browse/DRILL-7211
> Project: Apache Drill
> Issue Type: Sub-task
> Reporter: Karthikeyan Manivannan
> Assignee: Karthikeyan Manivannan
> Priority: Major
>
> Changes required to MergingReceiver for doing output batch sizing.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7211) Batch Sizing in MergingReceiver
Karthikeyan Manivannan created DRILL-7211:
------------------------------------------

Summary: Batch Sizing in MergingReceiver
Key: DRILL-7211
URL: https://issues.apache.org/jira/browse/DRILL-7211
Project: Apache Drill
Issue Type: Sub-task
Reporter: Karthikeyan Manivannan

Changes required to MergingReceiver for doing output batch sizing.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7210) Batch Sizing in HashPartitionSender
Karthikeyan Manivannan created DRILL-7210:
------------------------------------------

Summary: Batch Sizing in HashPartitionSender
Key: DRILL-7210
URL: https://issues.apache.org/jira/browse/DRILL-7210
Project: Apache Drill
Issue Type: Sub-task
Reporter: Karthikeyan Manivannan

Jira to track changes required in HashPartitionSender for performing Batch Sizing.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (DRILL-7210) Batch Sizing in HashPartitionSender
[ https://issues.apache.org/jira/browse/DRILL-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthikeyan Manivannan reassigned DRILL-7210:
---------------------------------------------

    Assignee: Karthikeyan Manivannan

> Batch Sizing in HashPartitionSender
> -----------------------------------
>
> Key: DRILL-7210
> URL: https://issues.apache.org/jira/browse/DRILL-7210
> Project: Apache Drill
> Issue Type: Sub-task
> Reporter: Karthikeyan Manivannan
> Assignee: Karthikeyan Manivannan
> Priority: Major
>
> Jira to track changes required in HashPartitionSender for performing Batch
> Sizing.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (DRILL-7107) Unable to connect to Drill 1.15 through ZK
[ https://issues.apache.org/jira/browse/DRILL-7107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan reopened DRILL-7107: --- > Unable to connect to Drill 1.15 through ZK > -- > > Key: DRILL-7107 > URL: https://issues.apache.org/jira/browse/DRILL-7107 > Project: Apache Drill > Issue Type: Bug > Components: Client - JDBC >Affects Versions: 1.15.0 >Reporter: Karthikeyan Manivannan >Assignee: Karthikeyan Manivannan >Priority: Major > Labels: ready-to-commit > Fix For: 1.16.0 > > > After upgrading to Drill 1.15, users are seeing they are no longer able to > connect to Drill using ZK quorum. They are getting the following "Unable to > setup ZK for client" error. > [~]$ sqlline -u "jdbc:drill:zk=172.16.2.165:5181;auth=maprsasl" > Error: Failure in connecting to Drill: > org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client. > (state=,code=0) > java.sql.SQLNonTransientConnectionException: Failure in connecting to Drill: > org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client. 
> at > org.apache.drill.jdbc.impl.DrillConnectionImpl.(DrillConnectionImpl.java:174) > at > org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67) > at > org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:67) > at > org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:138) > at org.apache.drill.jdbc.Driver.connect(Driver.java:72) > at sqlline.DatabaseConnection.connect(DatabaseConnection.java:130) > at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:179) > at sqlline.Commands.connect(Commands.java:1247) > at sqlline.Commands.connect(Commands.java:1139) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38) > at sqlline.SqlLine.dispatch(SqlLine.java:722) > at sqlline.SqlLine.initArgs(SqlLine.java:416) > at sqlline.SqlLine.begin(SqlLine.java:514) > at sqlline.SqlLine.start(SqlLine.java:264) > at sqlline.SqlLine.main(SqlLine.java:195) > Caused by: org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for > client. > at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:340) > at > org.apache.drill.jdbc.impl.DrillConnectionImpl.(DrillConnectionImpl.java:165) > ... 
18 more > Caused by: java.lang.NullPointerException > at > org.apache.drill.exec.coord.zk.ZKACLProviderFactory.findACLProvider(ZKACLProviderFactory.java:68) > at > org.apache.drill.exec.coord.zk.ZKACLProviderFactory.getACLProvider(ZKACLProviderFactory.java:47) > at > org.apache.drill.exec.coord.zk.ZKClusterCoordinator.(ZKClusterCoordinator.java:114) > at > org.apache.drill.exec.coord.zk.ZKClusterCoordinator.(ZKClusterCoordinator.java:86) > at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:337) > ... 19 more > Apache Drill 1.15.0.0 > "This isn't your grandfather's SQL." > sqlline> > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7137) Implement unit test case to test Drill client <-> interaction in a secure setup
Karthikeyan Manivannan created DRILL-7137:
------------------------------------------

Summary: Implement unit test case to test Drill client <-> interaction in a secure setup
Key: DRILL-7137
URL: https://issues.apache.org/jira/browse/DRILL-7137
Project: Apache Drill
Issue Type: Improvement
Reporter: Karthikeyan Manivannan

Implement a unit test case for DRILL-7101.

From the PR https://github.com/apache/drill/pull/1702:

"Writing a test where the Drillbits (inside ClusterFixture) are set up with ZK_APPLY_SECURE_ACL=false (to avoid the need to set up a secure ZK server within the unit test) and the ClientFixture is set up with ZK_APPLY_SECURE_ACL=true (to simulate the failure). Starting a test with different values for the same property turns out to be quite hard because the ClusterFixture internally instantiates a ClientFixture. Changing this behavior might affect other tests."

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-7107) Unable to connect to Drill 1.15 through ZK
[ https://issues.apache.org/jira/browse/DRILL-7107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793926#comment-16793926 ]

Karthikeyan Manivannan commented on DRILL-7107:
-----------------------------------------------

This problem happens because the zk.apply_secure_acl config parameter, shared between client and server, is expected to have different values for the client (false) and the server (true). The DrillClient and the Drillbit both use the same logic to read the config file. So if they are both run from the same installation and zk.apply_secure_acl is set to 'true', then the DrillClient also sees the value 'true' and passes it on to the ZKACLProviderFactory. This causes the ZKACLProviderFactory to crash when it is called by client code, because in that context it has no way to discover ACLProviders.

Ideally, we should fix this by having different config files for client and server. But for now, I am identifying the client codepath and not calling into the ZKACLProviderFactory. I am testing this fix now.

> Unable to connect to Drill 1.15 through ZK
> ------------------------------------------
>
> Key: DRILL-7107
> URL: https://issues.apache.org/jira/browse/DRILL-7107
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Karthikeyan Manivannan
> Assignee: Karthikeyan Manivannan
> Priority: Major
>
> After upgrading to Drill 1.15, users are seeing they are no longer able to
> connect to Drill using ZK quorum. They are getting the following "Unable to
> setup ZK for client" error.
> [~]$ sqlline -u "jdbc:drill:zk=172.16.2.165:5181;auth=maprsasl"
> Error: Failure in connecting to Drill:
> org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client.
> (state=,code=0)
> java.sql.SQLNonTransientConnectionException: Failure in connecting to Drill:
> org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client.
> at > org.apache.drill.jdbc.impl.DrillConnectionImpl.(DrillConnectionImpl.java:174) > at > org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67) > at > org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:67) > at > org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:138) > at org.apache.drill.jdbc.Driver.connect(Driver.java:72) > at sqlline.DatabaseConnection.connect(DatabaseConnection.java:130) > at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:179) > at sqlline.Commands.connect(Commands.java:1247) > at sqlline.Commands.connect(Commands.java:1139) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38) > at sqlline.SqlLine.dispatch(SqlLine.java:722) > at sqlline.SqlLine.initArgs(SqlLine.java:416) > at sqlline.SqlLine.begin(SqlLine.java:514) > at sqlline.SqlLine.start(SqlLine.java:264) > at sqlline.SqlLine.main(SqlLine.java:195) > Caused by: org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for > client. > at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:340) > at > org.apache.drill.jdbc.impl.DrillConnectionImpl.(DrillConnectionImpl.java:165) > ... 
18 more > Caused by: java.lang.NullPointerException > at > org.apache.drill.exec.coord.zk.ZKACLProviderFactory.findACLProvider(ZKACLProviderFactory.java:68) > at > org.apache.drill.exec.coord.zk.ZKACLProviderFactory.getACLProvider(ZKACLProviderFactory.java:47) > at > org.apache.drill.exec.coord.zk.ZKClusterCoordinator.(ZKClusterCoordinator.java:114) > at > org.apache.drill.exec.coord.zk.ZKClusterCoordinator.(ZKClusterCoordinator.java:86) > at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:337) > ... 19 more > Apache Drill 1.15.0.0 > "This isn't your grandfather's SQL." > sqlline> > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
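The workaround described in the comment on this issue (detect the client codepath and skip the ACL-provider lookup even when the shared config enables secure ACLs) can be sketched as below. All names here (`ZkAclSketch`, `resolveAcl`, `isClientContext`, the returned ACL strings) are hypothetical stand-ins for illustration, not Drill's actual `ZKACLProviderFactory` API:

```java
import java.util.Map;

// Hypothetical sketch: client and server read the same config, but only the
// server may consult ACL-provider discovery; the client codepath skips it.
class ZkAclSketch {

    // Stand-in for the server-side ACL-provider discovery, which has no
    // usable context (and fails) when invoked from a client.
    static String findAclProvider() {
        return "creator-all";
    }

    static String resolveAcl(Map<String, String> config, boolean isClientContext) {
        boolean applySecureAcl = Boolean.parseBoolean(
                config.getOrDefault("zk.apply_secure_acl", "false"));
        if (!applySecureAcl || isClientContext) {
            return "open";          // client never needs server-side ACLs
        }
        return findAclProvider();   // server-only path
    }
}
```

The key design point, as the comment notes, is that the guard distinguishes the codepath rather than the config value, since both processes see the same configuration.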
[jira] [Created] (DRILL-7107) Unable to connect to Drill 1.15 through ZK
Karthikeyan Manivannan created DRILL-7107: - Summary: Unable to connect to Drill 1.15 through ZK Key: DRILL-7107 URL: https://issues.apache.org/jira/browse/DRILL-7107 Project: Apache Drill Issue Type: Bug Reporter: Karthikeyan Manivannan Assignee: Karthikeyan Manivannan After upgrading to Drill 1.15, users are seeing they are no longer able to connect to Drill using ZK quorum. They are getting the following "Unable to setup ZK for client" error. [~]$ sqlline -u "jdbc:drill:zk=172.16.2.165:5181;auth=maprsasl" Error: Failure in connecting to Drill: org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client. (state=,code=0) java.sql.SQLNonTransientConnectionException: Failure in connecting to Drill: org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client. at org.apache.drill.jdbc.impl.DrillConnectionImpl.(DrillConnectionImpl.java:174) at org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67) at org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:67) at org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:138) at org.apache.drill.jdbc.Driver.connect(Driver.java:72) at sqlline.DatabaseConnection.connect(DatabaseConnection.java:130) at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:179) at sqlline.Commands.connect(Commands.java:1247) at sqlline.Commands.connect(Commands.java:1139) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38) at sqlline.SqlLine.dispatch(SqlLine.java:722) at sqlline.SqlLine.initArgs(SqlLine.java:416) at sqlline.SqlLine.begin(SqlLine.java:514) at sqlline.SqlLine.start(SqlLine.java:264) at 
sqlline.SqlLine.main(SqlLine.java:195) Caused by: org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client. at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:340) at org.apache.drill.jdbc.impl.DrillConnectionImpl.(DrillConnectionImpl.java:165) ... 18 more Caused by: java.lang.NullPointerException at org.apache.drill.exec.coord.zk.ZKACLProviderFactory.findACLProvider(ZKACLProviderFactory.java:68) at org.apache.drill.exec.coord.zk.ZKACLProviderFactory.getACLProvider(ZKACLProviderFactory.java:47) at org.apache.drill.exec.coord.zk.ZKClusterCoordinator.(ZKClusterCoordinator.java:114) at org.apache.drill.exec.coord.zk.ZKClusterCoordinator.(ZKClusterCoordinator.java:86) at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:337) ... 19 more Apache Drill 1.15.0.0 "This isn't your grandfather's SQL." sqlline> -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-7093) Batch Sizing in SingleSender
[ https://issues.apache.org/jira/browse/DRILL-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthikeyan Manivannan updated DRILL-7093:
------------------------------------------

    Issue Type: Sub-task (was: Bug)
        Parent: DRILL-7099

> Batch Sizing in SingleSender
> ----------------------------
>
> Key: DRILL-7093
> URL: https://issues.apache.org/jira/browse/DRILL-7093
> Project: Apache Drill
> Issue Type: Sub-task
> Reporter: Karthikeyan Manivannan
> Assignee: Karthikeyan Manivannan
> Priority: Major
>
> SingleSender batch sizing: SingleSender does not have a mechanism to control
> the size of batches sent to the receiver. This results in excessive memory
> use.
> This bug captures the changes required to SingleSender to control batch size
> by using the RecordBatchSizer.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7099) Resource Management in Exchange Operators
Karthikeyan Manivannan created DRILL-7099:
------------------------------------------

Summary: Resource Management in Exchange Operators
Key: DRILL-7099
URL: https://issues.apache.org/jira/browse/DRILL-7099
Project: Apache Drill
Issue Type: Bug
Reporter: Karthikeyan Manivannan
Assignee: Karthikeyan Manivannan

This Jira will be used to track the changes required for implementing Resource Management in Exchange operators. The design can be found here: https://docs.google.com/document/d/1N9OXfCWcp68jsxYVmSt9tPgnZRV_zk8rwwFh0BxXZeE/edit?usp=sharing

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7093) Batch Sizing in SingleSender
Karthikeyan Manivannan created DRILL-7093:
------------------------------------------

Summary: Batch Sizing in SingleSender
Key: DRILL-7093
URL: https://issues.apache.org/jira/browse/DRILL-7093
Project: Apache Drill
Issue Type: Bug
Reporter: Karthikeyan Manivannan
Assignee: Karthikeyan Manivannan

SingleSender batch sizing: SingleSender does not have a mechanism to control the size of batches sent to the receiver. This results in excessive memory use. This bug captures the changes required to SingleSender to control batch size by using the RecordBatchSizer.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
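This issue and the MergingReceiver/HashPartitionSender sub-tasks above revolve around the same idea: measure the net row width of incoming data (Drill's RecordBatchSizer does this) and cap the row count of each outgoing batch against a memory budget. A minimal sketch of that calculation follows; `BatchSizingSketch` and its constants are illustrative assumptions (the 64K ceiling matches Drill's value-vector row-count convention, the rest is invented for the example):

```java
// Hypothetical sketch of output batch sizing: given a memory budget and the
// net row width measured by a sizer, cap the rows per outgoing batch.
class BatchSizingSketch {
    static final int MIN_ROWS = 1;
    static final int MAX_ROWS = 65536; // value-vector row-count ceiling

    static int rowsPerBatch(long batchMemoryBudget, int netRowWidthBytes) {
        if (netRowWidthBytes <= 0) {
            return MAX_ROWS; // zero-width rows: only the row-count cap applies
        }
        long rows = batchMemoryBudget / netRowWidthBytes;
        // Clamp to [MIN_ROWS, MAX_ROWS] so a huge budget or a huge row
        // width still yields a legal batch size.
        return (int) Math.max(MIN_ROWS, Math.min(MAX_ROWS, rows));
    }
}
```

For example, a 16 MiB budget with 512-byte rows yields 32768 rows per batch, while a single 4 KiB row under a 1 KiB budget still ships one row at a time rather than stalling.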
[jira] [Updated] (DRILL-6593) unordered receivers for broadcast senders don't report memory consumption
[ https://issues.apache.org/jira/browse/DRILL-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthikeyan Manivannan updated DRILL-6593:
------------------------------------------

    Fix Version/s: (was: 1.16.0)
                   1.17.0

> unordered receivers for broadcast senders don't report memory consumption
> -------------------------------------------------------------------------
>
> Key: DRILL-6593
> URL: https://issues.apache.org/jira/browse/DRILL-6593
> Project: Apache Drill
> Issue Type: Bug
> Components: Functions - Drill
> Affects Versions: 1.14.0
> Environment: RHEL 7
> Reporter: Dechang Gu
> Assignee: Karthikeyan Manivannan
> Priority: Major
> Fix For: 1.17.0
>
> Attachments: TPCDS_78_3_id_24c43954-65b4-07b6-7e53-a37ad47fa963.json
>
> In my regression test on TPCDS SF100 dataset, query 78 profile shows the
> following:
> {code}
> 05-xx-02 PROJECT                  0.000s 0.001s 0.003s 0.022s 0.000s 0.000s 0.000s 0.02% 0.00% 64,787,488 3MB 3MB
> 05-xx-03 HASH_JOIN                0.000s 0.000s 0.774s 1.002s 0.000s 0.000s 0.000s 6.87% 0.32% 69,186,507 8MB 10MB
> 05-xx-04 UNORDERED_RECEIVER       0.000s 0.000s 0.000s 0.000s 0.000s 0.000s 0.000s 0.00% 0.00% 4,382,940  -   -
> 05-xx-05 PROJECT                  0.000s 0.001s 0.002s 0.015s 0.000s 0.000s 0.000s 0.02% 0.00% 64,803,567 3MB 3MB
> 05-xx-06 SELECTION_VECTOR_REMOVER 0.000s 0.000s 0.333s 0.566s 0.000s 0.000s 0.000s 2.95% 0.14% 64,803,567 5MB 5MB
> {code}
> Note 05-xx-04 did not show memory usage.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7027) TPCH Queries hit IOB when planner.enable_demux_exchange = true
Karthikeyan Manivannan created DRILL-7027:
------------------------------------------

Summary: TPCH Queries hit IOB when planner.enable_demux_exchange = true
Key: DRILL-7027
URL: https://issues.apache.org/jira/browse/DRILL-7027
Project: Apache Drill
Issue Type: Bug
Components: Execution - Flow
Reporter: Karthikeyan Manivannan
Assignee: Karthikeyan Manivannan

Running TPCH queries on the SF100 dataset, a few queries (13, 14, and 19) hit an IOB exception:

{code}
java.sql.SQLException: SYSTEM ERROR: IndexOutOfBoundsException: index 154

Fragment 7:0

Please, refer to logs for more information.

[Error Id: e312dc77-0cad-4bc0-b90e-fb0d477ef272 on ucs-node2.perf.lab:31010]
  at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:528)
  at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:632)
  at org.apache.calcite.avatica.AvaticaResultSet.next(AvaticaResultSet.java:217)
  at org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:151)
  at PipSQueak.fetchRows(PipSQueak.java:346)
  at PipSQueak.runTest(PipSQueak.java:113)
  at PipSQueak.main(PipSQueak.java:477)
Caused by: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: IndexOutOfBoundsException: index 154

Fragment 7:0

Please, refer to logs for more information.
[Error Id: e312dc77-0cad-4bc0-b90e-fb0d477ef272 on ucs-node2.perf.lab:31010] at org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123) at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422) at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96) at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273) at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243) at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312) at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286) at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911) at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131) at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) at io.netty.channel.nio.NioE
[jira] [Created] (DRILL-6896) Extraneous columns being projected in Drill 1.15
Karthikeyan Manivannan created DRILL-6896:
------------------------------------------

Summary: Extraneous columns being projected in Drill 1.15
Key: DRILL-6896
URL: https://issues.apache.org/jira/browse/DRILL-6896
Project: Apache Drill
Issue Type: Improvement
Affects Versions: 1.15.0
Reporter: Karthikeyan Manivannan
Assignee: Aman Sinha

[~rhou] noted that TPCH13 on Drill 1.15 was running slower than on Drill 1.14. Analysis revealed that an extra column was being projected in 1.15, and the slowdown was because the extra column was being unnecessarily pushed across an exchange.

Here is a simplified query written by [~amansinha100] that exhibits the same problem. In the first plan (on 1.15.0), o_custkey and o_comment are both extraneous projections. In the second plan (on 1.14.0), there is also an extraneous projection, o_custkey, but not o_comment.

On 1.15.0:

explain plan without implementation for
select c.c_custkey
from cp.`tpch/customer.parquet` c
left outer join cp.`tpch/orders.parquet` o
  on c.c_custkey = o.o_custkey
  and o.o_comment not like '%special%requests%';

DrillScreenRel
  DrillProjectRel(c_custkey=[$0])
    DrillProjectRel(c_custkey=[$2], o_custkey=[$0], o_comment=[$1])
      DrillJoinRel(condition=[=($2, $0)], joinType=[right])
        DrillFilterRel(condition=[NOT(LIKE($1, '%special%requests%'))])
          DrillScanRel(table=[[cp, tpch/orders.parquet]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/orders.parquet]], selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`o_custkey`, `o_comment`]]])
        DrillScanRel(table=[[cp, tpch/customer.parquet]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/customer.parquet]], selectionRoot=classpath:/tpch/customer.parquet, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`c_custkey`]]])

On 1.14.0:

DrillScreenRel
  DrillProjectRel(c_custkey=[$0])
    DrillProjectRel(c_custkey=[$1], o_custkey=[$0])
      DrillJoinRel(condition=[=($1, $0)], joinType=[right])
        DrillProjectRel(o_custkey=[$0])
          DrillFilterRel(condition=[NOT(LIKE($1, '%special%requests%'))])
            DrillScanRel(table=[[cp, tpch/orders.parquet]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/orders.parquet]], selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`o_custkey`, `o_comment`]]])
        DrillScanRel(table=[[cp, tpch/customer.parquet]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/customer.parquet]], selectionRoot=classpath:/tpch/customer.parquet, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`c_custkey`]]])

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6823) Tests that require ZK client-specific configuration fail when they are not run standalone
[ https://issues.apache.org/jira/browse/DRILL-6823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthikeyan Manivannan updated DRILL-6823:
------------------------------------------

    Summary: Tests that require ZK client-specific configuration fail when they are not run standalone (was: Tests that require ZK client-specific configuration have to be run standalone)

> Tests that require ZK client-specific configuration fail when they are not
> run standalone
> --------------------------------------------------------------------------
>
> Key: DRILL-6823
> URL: https://issues.apache.org/jira/browse/DRILL-6823
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Karthikeyan Manivannan
> Priority: Major
>
> The ZK libraries support only one client instance per machine per server, and
> it is cached. Tests that require client-specific configuration will fail when
> run after other ZK tests that set up the client in a way that causes those
> tests to fail.
> Some investigation is necessary to see if the ZK ACL tests, and any other such
> tests, can be run standalone in our test framework.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6823) Tests that require ZK client-specific configuration have to be run standalone
[ https://issues.apache.org/jira/browse/DRILL-6823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthikeyan Manivannan updated DRILL-6823:
------------------------------------------

    Issue Type: Bug (was: Improvement)

> Tests that require ZK client-specific configuration have to be run standalone
> -----------------------------------------------------------------------------
>
> Key: DRILL-6823
> URL: https://issues.apache.org/jira/browse/DRILL-6823
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Karthikeyan Manivannan
> Priority: Major
>
> The ZK libraries support only one client instance per machine per server, and
> it is cached. Tests that require client-specific configuration will fail when
> run after other ZK tests that set up the client in a way that causes those
> tests to fail.
> Some investigation is necessary to see if the ZK ACL tests, and any other such
> tests, can be run standalone in our test framework.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6823) Tests that require ZK client-specific configuration have to be run standalone
Karthikeyan Manivannan created DRILL-6823:
------------------------------------------

Summary: Tests that require ZK client-specific configuration have to be run standalone
Key: DRILL-6823
URL: https://issues.apache.org/jira/browse/DRILL-6823
Project: Apache Drill
Issue Type: Improvement
Reporter: Karthikeyan Manivannan

The ZK libraries support only one client instance per machine per server, and it is cached. Tests that require client-specific configuration will fail when run after other ZK tests that set up the client in a way that causes those tests to fail.

Some investigation is necessary to see if the ZK ACL tests, and any other such tests, can be run standalone in our test framework.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (DRILL-4897) NumberFormatException in Drill SQL while casting to BIGINT when its actually a number
[ https://issues.apache.org/jira/browse/DRILL-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthikeyan Manivannan resolved DRILL-4897.
-------------------------------------------

    Resolution: Not A Problem

> NumberFormatException in Drill SQL while casting to BIGINT when it's actually
> a number
> ------------------------------------------------------------------------------
>
> Key: DRILL-4897
> URL: https://issues.apache.org/jira/browse/DRILL-4897
> Project: Apache Drill
> Issue Type: Bug
> Components: Functions - Drill
> Reporter: Srihari Karanth
> Assignee: Karthikeyan Manivannan
> Priority: Blocker
> Fix For: 1.15.0
>
> In the following SQL, Drill cribs when trying to convert a number that is in
> a varchar:
> select cast (case IsNumeric(Delta_Radio_Delay)
>        when 0 then 0 else Delta_Radio_Delay end as BIGINT)
> from datasource.`./sometable`
> where Delta_Radio_Delay='4294967294';
> BIGINT should be able to hold a very large number. I don't understand how it
> throws the error below:
> 0: jdbc:drill:> select cast (case IsNumeric(Delta_Radio_Delay)
>        when 0 then 0 else Delta_Radio_Delay end as BIGINT)
>        from datasource.`./sometable`
>        where Delta_Radio_Delay='4294967294';
> Error: SYSTEM ERROR: NumberFormatException: 4294967294
> Fragment 1:29
> [Error Id: a63bb113-271f-4d8b-8194-2c9728543200 on cluster-3:31010]
> (state=,code=0)
> How can I modify the SQL to fix this?

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-4897) NumberFormatException in Drill SQL while casting to BIGINT when its actually a number
[ https://issues.apache.org/jira/browse/DRILL-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669292#comment-16669292 ]

Karthikeyan Manivannan commented on DRILL-4897:
-----------------------------------------------

This works as expected. When codegen is done for the IfExpression corresponding to the CASE, it finds that the types of the IfExpression (Int) and the ElseExpression (Varchar) differ. ResolverTypePrecedence.precedenceMap specifies the precedence order for type conversions, and it indicates that Varchar -> Int is a valid conversion but not the other way around. So, as part of the if-else type resolution, codegen consults the precedenceMap and casts the else-type (Varchar) to the if-type (Int).

The user can avoid this error by explicitly casting the ElseExpression to BigInt:

_SELECT CAST(case isnumeric(columns[0]) WHEN 0 THEN 0 ELSE CAST(columns[0] as BigInt) END AS BIGINT) from ..._

> NumberFormatException in Drill SQL while casting to BIGINT when it's actually
> a number
> ------------------------------------------------------------------------------
>
> Key: DRILL-4897
> URL: https://issues.apache.org/jira/browse/DRILL-4897
> Project: Apache Drill
> Issue Type: Bug
> Components: Functions - Drill
> Reporter: Srihari Karanth
> Assignee: Karthikeyan Manivannan
> Priority: Blocker
> Fix For: 1.15.0
>
> In the following SQL, Drill cribs when trying to convert a number that is in
> a varchar:
> select cast (case IsNumeric(Delta_Radio_Delay)
>        when 0 then 0 else Delta_Radio_Delay end as BIGINT)
> from datasource.`./sometable`
> where Delta_Radio_Delay='4294967294';
> BIGINT should be able to hold a very large number. I don't understand how it
> throws the error below:
> 0: jdbc:drill:> select cast (case IsNumeric(Delta_Radio_Delay)
>        when 0 then 0 else Delta_Radio_Delay end as BIGINT)
>        from datasource.`./sometable`
>        where Delta_Radio_Delay='4294967294';
> Error: SYSTEM ERROR: NumberFormatException: 4294967294
> Fragment 1:29
> [Error Id: a63bb113-271f-4d8b-8194-2c9728543200 on cluster-3:31010]
> (state=,code=0)
> How can I modify the SQL to fix this?

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
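The type-resolution behavior described in the comment above can be illustrated with a small sketch. The precedence ranks below are invented for illustration (they are not the real `ResolverTypePrecedence.precedenceMap` values, and `CasePrecedenceSketch` is not a Drill class); the point is that the Varchar branch is resolved to the Int branch's type, and a value such as 4294967294 then overflows a 32-bit int at parse time:

```java
import java.util.Map;

// Hypothetical illustration of if/else type resolution by precedence rank:
// the branch with the lower rank is cast to the higher-ranked type.
class CasePrecedenceSketch {
    static final Map<String, Integer> PRECEDENCE =
            Map.of("VARCHAR", 1, "INT", 2, "BIGINT", 3);

    // Resolve the output type of CASE ... THEN <ifType> ELSE <elseType> END.
    static String resolveCaseType(String ifType, String elseType) {
        return PRECEDENCE.get(ifType) >= PRECEDENCE.get(elseType) ? ifType : elseType;
    }

    // Why the query fails: 4294967294 exceeds Integer.MAX_VALUE (2147483647),
    // so parsing the varchar as a 32-bit int throws NumberFormatException.
    static boolean fitsInInt(String s) {
        try {
            Integer.parseInt(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

With an explicit `CAST(... AS BigInt)` on the else branch, both branches resolve to BIGINT and no narrowing parse ever happens, which is exactly the workaround given in the comment.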
[jira] [Commented] (DRILL-6763) Codegen optimization of SQL functions with constant values
[ https://issues.apache.org/jira/browse/DRILL-6763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629264#comment-16629264 ] Karthikeyan Manivannan commented on DRILL-6763: --- This is an interesting change. I will take a look by the end of this week. > Codegen optimization of SQL functions with constant values > -- > > Key: DRILL-6763 > URL: https://issues.apache.org/jira/browse/DRILL-6763 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Codegen >Affects Versions: 1.14.0 >Reporter: shuifeng lu >Assignee: shuifeng lu >Priority: Major > Fix For: 1.15.0 > > Attachments: Query1.java, Query2.java, code_compare.png, > compilation_time.png > > > Codegen class compilation takes tens to hundreds of milliseconds, a class > cache is hit when generifiedCode of code generator is exactly the same. > It works fine when UDF only takes columns or symbols, but not efficient when > one or more parameters in UDF is always distinct from the other. > Take face recognition for example, the face images are almost distinct from > each other according to lighting, facial expressions and details. > It is important to reduce redundant class compilation especially for those > low latency queries. > Cache miss rate and metaspace gc can also be reduced by eliminating the > redundant classes. > Here is the query to get the persons whose last name is Brunner and hire from > 1st Jan 1990: > SELECT full_name, hire_date FROM cp.`employee.json` where last_name = > 'Brunner' and hire_date >= '1990-01-01 00:00:00.0'; > Now get the persons whose last name is Bernard and hire from 1st Jan 1990. > SELECT full_name, hire_date FROM cp.`employee.json` where last_name = > 'Bernard' and hire_date >= '1990-01-01 00:00:00.0'; > Figure !compilation_time.png! shows the compilation time of the generated > code by the above query in FilterRecordBatch on my laptop > Figure !code_compare.png! 
shows that the only difference in the generated code > between the attachments is the last_name value at line 156. > The redundant class compilation can therefore be eliminated > by making string12 a member of the class and setting its value when the > instance is created. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
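The idea described above can be sketched in a few lines of plain Java. This is a hypothetical illustration, not Drill's actual codegen machinery: the class names (`GeneratedFilter`, `compile`) and the string cache are stand-ins. The point is that when the constant is hoisted into an instance field, both queries key the class cache with the same template source and only one "compilation" occurs.

```java
import java.util.HashSet;
import java.util.Set;

public class ConstantHoisting {

  // Stand-in for a generated filter class: the constant is an instance
  // field set at construction, not a literal baked into the source.
  static final class GeneratedFilter {
    private final String constant; // e.g. the last_name value

    GeneratedFilter(String constant) { this.constant = constant; }

    boolean matches(String columnValue) { return constant.equals(columnValue); }
  }

  // Stand-in for the codegen class cache, keyed by generated source with
  // the constants abstracted away.
  static final Set<String> compiledSources = new HashSet<>();

  static GeneratedFilter compile(String templateSource, String constant) {
    compiledSources.add(templateSource); // a real compiler runs only on a cache miss
    return new GeneratedFilter(constant);
  }

  public static void main(String[] args) {
    String template = "last_name = <CONSTANT>"; // constant hoisted out of the source
    GeneratedFilter brunner = compile(template, "Brunner");
    GeneratedFilter bernard = compile(template, "Bernard");
    // Both queries share one cache entry, so only one class is compiled.
    System.out.println(compiledSources.size());                                  // 1
    System.out.println(brunner.matches("Brunner") && bernard.matches("Bernard")); // true
  }
}
```

With the constant embedded in the source instead, the two queries would produce two distinct cache keys and two compilations, which is the redundancy the issue targets.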
[jira] [Created] (DRILL-6754) Add a field to SV2 to indicate if the SV2 reorders the Record Batch
Karthikeyan Manivannan created DRILL-6754: - Summary: Add a field to SV2 to indicate if the SV2 reorders the Record Batch Key: DRILL-6754 URL: https://issues.apache.org/jira/browse/DRILL-6754 Project: Apache Drill Issue Type: Improvement Environment: The optimization in DRILL-6687 is not correct if an SV2 is used to re-order rows in the record batch. Currently, this is not a problem because none of the reordering operators (SORT, TOPN) use an SV2. SORT has code for SV2 but it is disabled. Adding a field to SV2 to indicate if the SV2 reorders the Record Batch would allow the safe application of the DRILL-6687 optimization. Reporter: Karthikeyan Manivannan Assignee: Karthikeyan Manivannan -- This message was sent by Atlassian JIRA (v7.6.3#76005)
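A minimal sketch of the proposed field, assuming illustrative names only (`SelectionVector2` here is a toy class, not Drill's actual one, and a real implementation would have the producing operator set the flag rather than derive it). An SV2 whose indices are monotonically increasing merely filters rows; any decrease means the batch is reordered and the DRILL-6687 optimization must be skipped.

```java
public class Sv2Sketch {

  // Toy stand-in for an SV2 carrying the proposed "reorders the batch" flag.
  static final class SelectionVector2 {
    private final int[] indices;
    private final boolean reordersBatch; // the field proposed in this issue

    SelectionVector2(int[] indices) {
      this.indices = indices.clone();
      // Derived here for illustration; in practice the operator that builds
      // the SV2 (e.g. a sort) would set this explicitly.
      boolean monotonic = true;
      for (int i = 1; i < indices.length; i++) {
        if (indices[i] < indices[i - 1]) { monotonic = false; break; }
      }
      this.reordersBatch = !monotonic;
    }

    boolean reordersBatch() { return reordersBatch; }
  }

  public static void main(String[] args) {
    SelectionVector2 filterOnly = new SelectionVector2(new int[] {0, 2, 5});
    SelectionVector2 sorted = new SelectionVector2(new int[] {5, 0, 2});
    System.out.println(filterOnly.reordersBatch()); // false: safe to optimize
    System.out.println(sorted.reordersBatch());     // true: optimization unsafe
  }
}
```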
[jira] [Commented] (DRILL-6688) Data batches for Project operator exceed the maximum specified
[ https://issues.apache.org/jira/browse/DRILL-6688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16590294#comment-16590294 ] Karthikeyan Manivannan commented on DRILL-6688: --- [~ben-zvi] please review the PR. > Data batches for Project operator exceed the maximum specified > -- > > Key: DRILL-6688 > URL: https://issues.apache.org/jira/browse/DRILL-6688 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Robert Hou >Assignee: Karthikeyan Manivannan >Priority: Major > Fix For: 1.15.0 > > > I ran this query: > alter session set `drill.exec.memory.operator.project.output_batch_size` = > 131072; > alter session set `planner.width.max_per_node` = 1; > alter session set `planner.width.max_per_query` = 1; > select > chr(101) CharacterValuea, > chr(102) CharacterValueb, > chr(103) CharacterValuec, > chr(104) CharacterValued, > chr(105) CharacterValuee > from dfs.`/drill/testdata/batch_memory/character5_1MB.parquet`; > The output has 1024 identical lines: > e f g h i > There is one incoming batch: > 2018-08-09 15:50:14,794 [24933ad8-a5e2-73f1-90dd-947fc2938e54:frag:0:0] DEBUG > o.a.d.e.p.i.p.ProjectMemoryManager - BATCH_STATS, incoming: Batch size: > { Records: 6, Total size: 0, Data size: 30, Gross row width: 0, Net > row width: 5, Density: 0% } > Batch schema & sizes: > { `_DEFAULT_COL_TO_READ_`(type: OPTIONAL INT, count: 6, Per entry: std > data size: 4, std net size: 5, actual data size: 4, actual net size: 5 > Totals: data size: 24, net size: 30) } > } > There are four outgoing batches. All are too large. 
The first three look like > this: > 2018-08-09 15:50:14,799 [24933ad8-a5e2-73f1-90dd-947fc2938e54:frag:0:0] DEBUG > o.a.d.e.p.i.p.ProjectRecordBatch - BATCH_STATS, outgoing: Batch size: > { Records: 16383, Total size: 0, Data size: 409575, Gross row width: 0, Net > row width: 25, Density: 0% } > Batch schema & sizes: > { CharacterValuea(type: REQUIRED VARCHAR, count: 16383, Per entry: std data > size: 50, std net size: 54, actual data size: 1, actual net size: 5 Totals: > data size: 16383, net size: 81915) } > CharacterValueb(type: REQUIRED VARCHAR, count: 16383, Per entry: std data > size: 50, std net size: 54, actual data size: 1, actual net size: 5 Totals: > data size: 16383, net size: 81915) } > CharacterValuec(type: REQUIRED VARCHAR, count: 16383, Per entry: std data > size: 50, std net size: 54, actual data size: 1, actual net size: 5 Totals: > data size: 16383, net size: 81915) } > CharacterValued(type: REQUIRED VARCHAR, count: 16383, Per entry: std data > size: 50, std net size: 54, actual data size: 1, actual net size: 5 Totals: > data size: 16383, net size: 81915) } > CharacterValuee(type: REQUIRED VARCHAR, count: 16383, Per entry: std data > size: 50, std net size: 54, actual data size: 1, actual net size: 5 Totals: > data size: 16383, net size: 81915) } > } > The last batch is smaller because it holds only the remaining records. > The data size (409575) exceeds the maximum batch size (131072). > character415.q -- This message was sent by Atlassian JIRA (v7.6.3#76005)
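The arithmetic behind the report above can be checked directly. This is a back-of-the-envelope sketch (the method name is illustrative, not Drill's `ProjectMemoryManager` API): the output row count should be capped so that rows times net row width stays within the configured output batch size.

```java
public class BatchSizing {

  // Cap the outgoing row count so rowCount * netRowWidth fits the budget.
  static int outputRowLimit(int batchSizeLimitBytes, int netRowWidthBytes, int maxRows) {
    int limit = batchSizeLimitBytes / netRowWidthBytes;
    return Math.min(limit, maxRows);
  }

  public static void main(String[] args) {
    int batchLimit = 131072; // drill.exec.memory.operator.project.output_batch_size
    int netRowWidth = 25;    // net row width from the outgoing batch stats above

    int limit = outputRowLimit(batchLimit, netRowWidth, 65536);
    System.out.println(limit);               // 5242 rows would respect the limit
    System.out.println(limit * netRowWidth); // 131050 bytes, within 131072

    // The reported batches instead carried 16383 rows:
    System.out.println(16383 * netRowWidth); // 409575 bytes, matching the bug report
  }
}
```

So with a 25-byte net row width the operator should have emitted roughly 5242 rows per batch, not 16383.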
[jira] [Updated] (DRILL-6629) BitVector split and transfer does not work correctly for transfer length < 8
[ https://issues.apache.org/jira/browse/DRILL-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan updated DRILL-6629: -- Labels: ready-to-commit (was: ) > BitVector split and transfer does not work correctly for transfer length < 8 > > > Key: DRILL-6629 > URL: https://issues.apache.org/jira/browse/DRILL-6629 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Data Types > Environment: BitVector split and transfer does not work correctly for > transfer length < 8. >Reporter: Karthikeyan Manivannan >Assignee: Karthikeyan Manivannan >Priority: Major > Labels: ready-to-commit > Fix For: 1.15.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6629) BitVector split and transfer does not work correctly for transfer length < 8
Karthikeyan Manivannan created DRILL-6629: - Summary: BitVector split and transfer does not work correctly for transfer length < 8 Key: DRILL-6629 URL: https://issues.apache.org/jira/browse/DRILL-6629 Project: Apache Drill Issue Type: Improvement Components: Execution - Data Types Environment: BitVector split and transfer does not work correctly for transfer length < 8. Reporter: Karthikeyan Manivannan Assignee: Karthikeyan Manivannan -- This message was sent by Atlassian JIRA (v7.6.3#76005)
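The failure mode described above comes from bit arithmetic on sub-byte lengths. The sketch below is illustrative only (it is not Drill's BitVector code and assumes LSB-first bit ordering): copying whole bytes drags along bits beyond the requested length, so a correct transfer must build the result bit by bit or mask the final byte to `length % 8` bits.

```java
public class BitTransferSketch {

  // Copy `length` bits from `src`, starting at bit index `firstBit`
  // (LSB-first within each byte), into a freshly allocated byte array.
  // Bits of the destination past `length` are left zero.
  static byte[] transferBits(byte[] src, int firstBit, int length) {
    byte[] dst = new byte[(length + 7) / 8];
    for (int i = 0; i < length; i++) {
      int bit = (src[(firstBit + i) / 8] >> ((firstBit + i) % 8)) & 1;
      dst[i / 8] |= bit << (i % 8);
    }
    return dst;
  }

  public static void main(String[] args) {
    byte[] src = { (byte) 0b10110101 }; // bit 0 is the least significant bit
    byte[] out = transferBits(src, 2, 3); // take bits 2..4: values 1, 0, 1
    System.out.println(out[0]); // 5 (0b101); a naive shifted byte copy would
                                // also carry bits 5..7 into the result
  }
}
```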
[jira] [Updated] (DRILL-6373) Refactor the Result Set Loader to prepare for Union, List support
[ https://issues.apache.org/jira/browse/DRILL-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan updated DRILL-6373: -- Attachment: 6373_Functional_Fail_07_13_1300.txt > Refactor the Result Set Loader to prepare for Union, List support > - > > Key: DRILL-6373 > URL: https://issues.apache.org/jira/browse/DRILL-6373 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.13.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Major > Attachments: 6373_Functional_Fail_07_13_1300.txt, > drill-6373-with-6585-fix-functional-failure.txt > > > As the next step in merging the "batch sizing" enhancements, refactor the > {{ResultSetLoader}} and related classes to prepare for Union and List > support. This fix follows the refactoring of the column accessors for the > same purpose. Actual Union and List support is to follow in a separate PR. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6373) Refactor the Result Set Loader to prepare for Union, List support
[ https://issues.apache.org/jira/browse/DRILL-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16543642#comment-16543642 ] Karthikeyan Manivannan commented on DRILL-6373: --- [~paul-rogers] The functional test failed with some plan verification failures but I doubt it is because of your change. The log is attached [^6373_Functional_Fail_07_13_1300.txt] > Refactor the Result Set Loader to prepare for Union, List support > - > > Key: DRILL-6373 > URL: https://issues.apache.org/jira/browse/DRILL-6373 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.13.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Major > Attachments: 6373_Functional_Fail_07_13_1300.txt, > drill-6373-with-6585-fix-functional-failure.txt > > > As the next step in merging the "batch sizing" enhancements, refactor the > {{ResultSetLoader}} and related classes to prepare for Union and List > support. This fix follows the refactoring of the column accessors for the > same purpose. Actual Union and List support is to follow in a separate PR. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (DRILL-6601) LargeFileCompilation testProject times out
[ https://issues.apache.org/jira/browse/DRILL-6601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan reassigned DRILL-6601: - Assignee: Karthikeyan Manivannan > LargeFileCompilation testProject times out > - > > Key: DRILL-6601 > URL: https://issues.apache.org/jira/browse/DRILL-6601 > Project: Apache Drill > Issue Type: Improvement >Reporter: Karthikeyan Manivannan >Assignee: Karthikeyan Manivannan >Priority: Major > > The number of columns projected by testProject was bumped up from 5K to 10K > in DRILL-6529. Changing this back to 5K should reduce the stress on this test > yet stay within the threshold to test constant pool constraints. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6601) LargeFileCompilation testProject times out
Karthikeyan Manivannan created DRILL-6601: - Summary: LargeFileCompilation testProject times out Key: DRILL-6601 URL: https://issues.apache.org/jira/browse/DRILL-6601 Project: Apache Drill Issue Type: Improvement Reporter: Karthikeyan Manivannan The number of columns projected by testProject was bumped up from 5K to 10K in DRILL-6529. Changing this back to 5K should reduce the stress on this test yet stay within the threshold to test constant pool constraints. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6373) Refactor the Result Set Loader to prepare for Union, List support
[ https://issues.apache.org/jira/browse/DRILL-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537679#comment-16537679 ] Karthikeyan Manivannan commented on DRILL-6373: --- The functional test failed. The logs are attached [^drill-6373-with-6585-fix-functional-failure.txt] > Refactor the Result Set Loader to prepare for Union, List support > - > > Key: DRILL-6373 > URL: https://issues.apache.org/jira/browse/DRILL-6373 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.13.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Major > Attachments: drill-6373-with-6585-fix-functional-failure.txt > > > As the next step in merging the "batch sizing" enhancements, refactor the > {{ResultSetLoader}} and related classes to prepare for Union and List > support. This fix follows the refactoring of the column accessors for the > same purpose. Actual Union and List support is to follow in a separate PR. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6373) Refactor the Result Set Loader to prepare for Union, List support
[ https://issues.apache.org/jira/browse/DRILL-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan updated DRILL-6373: -- Attachment: drill-6373-with-6585-fix-functional-failure.txt > Refactor the Result Set Loader to prepare for Union, List support > - > > Key: DRILL-6373 > URL: https://issues.apache.org/jira/browse/DRILL-6373 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.13.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Major > Attachments: drill-6373-with-6585-fix-functional-failure.txt > > > As the next step in merging the "batch sizing" enhancements, refactor the > {{ResultSetLoader}} and related classes to prepare for Union and List > support. This fix follows the refactoring of the column accessors for the > same purpose. Actual Union and List support is to follow in a separate PR. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6529) Project Batch Sizing causes two LargeFileCompilation tests to timeout
[ https://issues.apache.org/jira/browse/DRILL-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan updated DRILL-6529: -- Summary: Project Batch Sizing causes two LargeFileCompilation tests to timeout (was: Project Batch Sizing causes two LargeFileCompilation tests to fail) > Project Batch Sizing causes two LargeFileCompilation tests to timeout > - > > Key: DRILL-6529 > URL: https://issues.apache.org/jira/browse/DRILL-6529 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Relational Operators >Reporter: Karthikeyan Manivannan >Assignee: Karthikeyan Manivannan >Priority: Major > Fix For: 1.14.0 > > > Timeout failures are seen in TestLargeFileCompilation testExternal_Sort and > testTop_N_Sort. These tests are stress tests for compilation where the > queries cover projections over 5000 columns and sort over 500 columns. These > tests pass if they are run stand-alone. Something triggers the timeouts when > the tests are run in parallel as part of a unit test run. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6529) Project Batch Sizing causes two LargeFileCompilation tests to fail
Karthikeyan Manivannan created DRILL-6529: - Summary: Project Batch Sizing causes two LargeFileCompilation tests to fail Key: DRILL-6529 URL: https://issues.apache.org/jira/browse/DRILL-6529 Project: Apache Drill Issue Type: Improvement Components: Execution - Relational Operators Reporter: Karthikeyan Manivannan Assignee: Karthikeyan Manivannan Timeout failures are seen in TestLargeFileCompilation testExternal_Sort and testTop_N_Sort. These tests are stress tests for compilation where the queries cover projections over 5000 columns and sort over 500 columns. These tests pass if they are run stand-alone. Something triggers the timeouts when the tests are run in parallel as part of a unit test run. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6340) Output Batch Control in Project using the RecordBatchSizer
[ https://issues.apache.org/jira/browse/DRILL-6340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan updated DRILL-6340: -- Labels: doc-impacting ready-to-commit (was: doc-impacting) > Output Batch Control in Project using the RecordBatchSizer > -- > > Key: DRILL-6340 > URL: https://issues.apache.org/jira/browse/DRILL-6340 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Relational Operators >Reporter: Karthikeyan Manivannan >Assignee: Karthikeyan Manivannan >Priority: Major > Labels: doc-impacting, ready-to-commit > Fix For: 1.14.0 > > > This bug is for tracking the changes required to implement Output Batch > Sizing in Project using the RecordBatchSizer. The challenge in doing this > mainly lies in dealing with expressions that produce variable-length columns. > The following doc talks about some of the design approaches for dealing with > such variable-length columns. > [https://docs.google.com/document/d/1h0WsQsen6xqqAyyYSrtiAniQpVZGmQNQqC1I2DJaxAA/edit?usp=sharing] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6493) Replace BitVector with UInt1Vector
Karthikeyan Manivannan created DRILL-6493: - Summary: Replace BitVector with UInt1Vector Key: DRILL-6493 URL: https://issues.apache.org/jira/browse/DRILL-6493 Project: Apache Drill Issue Type: Improvement Reporter: Karthikeyan Manivannan BitVector stores each value in a single bit of storage space. UInt1Vector is an alternate implementation that uses a full byte to store each value. Recently discovered bugs in BitVector and anecdotal evidence of performance issues suggest that this code is slow and buggy. I am opening this bug to analyze the impact of replacing BitVector with UInt1Vector. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (DRILL-6486) BitVector split and transfer does not work correctly for non byte-multiple transfer lengths
[ https://issues.apache.org/jira/browse/DRILL-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan resolved DRILL-6486. --- Resolution: Fixed > BitVector split and transfer does not work correctly for non byte-multiple > transfer lengths > --- > > Key: DRILL-6486 > URL: https://issues.apache.org/jira/browse/DRILL-6486 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Data Types >Affects Versions: 1.13.0 >Reporter: Karthikeyan Manivannan >Assignee: Karthikeyan Manivannan >Priority: Major > Fix For: 1.14.0 > > Attachments: TestSplitAndTransfer.java > > Original Estimate: 24h > Remaining Estimate: 24h > > BitVector splitAndTransfer does not correctly handle transfers where the > transfer-length is not a multiple of 8. The attached bitVector tests will > expose this problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6486) BitVector split and transfer does not work correctly for non byte-multiple transfer lengths
Karthikeyan Manivannan created DRILL-6486: - Summary: BitVector split and transfer does not work correctly for non byte-multiple transfer lengths Key: DRILL-6486 URL: https://issues.apache.org/jira/browse/DRILL-6486 Project: Apache Drill Issue Type: Bug Components: Execution - Data Types Affects Versions: 1.13.0 Reporter: Karthikeyan Manivannan Assignee: Karthikeyan Manivannan Fix For: 1.14.0 Attachments: TestSplitAndTransfer.java BitVector splitAndTransfer does not correctly handle transfers where the transfer-length is not a multiple of 8. The attached bitVector tests will expose this problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6486) BitVector split and transfer does not work correctly for non byte-multiple transfer lengths
[ https://issues.apache.org/jira/browse/DRILL-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan updated DRILL-6486: -- Attachment: TestSplitAndTransfer.java > BitVector split and transfer does not work correctly for non byte-multiple > transfer lengths > --- > > Key: DRILL-6486 > URL: https://issues.apache.org/jira/browse/DRILL-6486 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Data Types >Affects Versions: 1.13.0 >Reporter: Karthikeyan Manivannan >Assignee: Karthikeyan Manivannan >Priority: Major > Fix For: 1.14.0 > > Attachments: TestSplitAndTransfer.java > > Original Estimate: 24h > Remaining Estimate: 24h > > BitVector splitAndTransfer does not correctly handle transfers where the > transfer-length is not a multiple of 8. The attached bitVector tests will > expose this problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6340) Output Batch Control in Project using the RecordBatchSizer
Karthikeyan Manivannan created DRILL-6340: - Summary: Output Batch Control in Project using the RecordBatchSizer Key: DRILL-6340 URL: https://issues.apache.org/jira/browse/DRILL-6340 Project: Apache Drill Issue Type: Improvement Components: Execution - Relational Operators Reporter: Karthikeyan Manivannan Assignee: Karthikeyan Manivannan This bug is for tracking the changes required to implement Output Batch Sizing in Project using the RecordBatchSizer. The challenge in doing this mainly lies in dealing with expressions that produce variable-length columns. The following doc talks about some of the design approaches for dealing with such variable-length columns. [https://docs.google.com/document/d/1h0WsQsen6xqqAyyYSrtiAniQpVZGmQNQqC1I2DJaxAA/edit?usp=sharing] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-4897) NumberFormatException in Drill SQL while casting to BIGINT when its actually a number
[ https://issues.apache.org/jira/browse/DRILL-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402463#comment-16402463 ] Karthikeyan Manivannan commented on DRILL-4897: --- This seems to be happening because of the "WHEN 0 THEN 0" in the query. I think the "THEN 0" causes PROJECT to assume that the result column is INT instead of BIGINT, and the query throws the exception when a number larger than what INT can hold is processed. The query runs fine if it is changed to "...WHEN 0 THEN 2147483648..." but fails when it is changed to "...WHEN 0 THEN 2147483647...". 0: jdbc:drill:zk=local> select CAST(case isnumeric(columns[0]) WHEN 0 THEN 2147483647 ELSE columns[0] END AS BIGINT) from dfs.`/Users/karthik/work/bugs/DRILL-4897/pw2.csv`; Error: SYSTEM ERROR: NumberFormatException: 2147483648 Fragment 0:0 [Error Id: d29ec48e-e659-41b4-a722-9c546ef8c9c9 on 172.30.8.179:31010] (state=,code=0) 0: jdbc:drill:zk=local> select CAST(case isnumeric(columns[0]) WHEN 0 THEN 2147483648 ELSE columns[0] END AS BIGINT) from dfs.`/Users/karthik/work/bugs/DRILL-4897/pw2.csv`; +-+ | EXPR$0 | +-+ | 1 | | 2 | ... ... | 2147483648 | | 4294967296 | +-+ The planner seems to be doing the same thing in both cases: Failed Case < Project(EXPR$0=[CAST(CASE(=(ISNUMERIC(ITEM($0, 0)), 0), 2147483647, ITEM($0, 0))):BIGINT]) : rowType = RecordType(BIGINT EXPR$0): rowcount = 2.0, cumulative cost = \{4.0 rows, 10.0 cpu, 0.0 io, 0.0 network, 0.0 memory} --- Successful > Project(EXPR$0=[CAST(CASE(=(ISNUMERIC(ITEM($0, 0)), 0), 2147483648, ITEM($0, > 0))):BIGINT]) : rowType = RecordType(BIGINT EXPR$0): rowcount = 2.0, > cumulative cost = \{4.0 rows, 10.0 cpu, 0.0 io, 0.0 network, 0.0 memory} So, I guess the problem is in the way the expression is handled in PROJECT. I will investigate this further. 
> NumberFormatException in Drill SQL while casting to BIGINT when its actually > a number > - > > Key: DRILL-4897 > URL: https://issues.apache.org/jira/browse/DRILL-4897 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill >Reporter: Srihari Karanth >Assignee: Karthikeyan Manivannan >Priority: Blocker > > In the following SQL, drill cribs when trying to convert a number which is in > varchar >select cast (case IsNumeric(Delta_Radio_Delay) > when 0 then 0 else Delta_Radio_Delay end as BIGINT) > from datasource.`./sometable` > where Delta_Radio_Delay='4294967294'; > BIGINT should be able to take very large number. I dont understand how it > throws the below error: > 0: jdbc:drill:> select cast (case IsNumeric(Delta_Radio_Delay) > when 0 then 0 else Delta_Radio_Delay end as BIGINT) > from datasource.`./sometable` > where Delta_Radio_Delay='4294967294'; > Error: SYSTEM ERROR: NumberFormatException: 4294967294 > Fragment 1:29 > [Error Id: a63bb113-271f-4d8b-8194-2c9728543200 on cluster-3:31010] > (state=,code=0) > How can i modify SQL to fix this? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (DRILL-3928) OutOfMemoryException should not be derived from FragmentSetupException
[ https://issues.apache.org/jira/browse/DRILL-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan resolved DRILL-3928. --- Resolution: Not A Problem OutOfMemoryException is not derived from FragmentSetupException > OutOfMemoryException should not be derived from FragmentSetupException > -- > > Key: DRILL-3928 > URL: https://issues.apache.org/jira/browse/DRILL-3928 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Flow >Affects Versions: 1.2.0 >Reporter: Chris Westin >Assignee: Karthikeyan Manivannan >Priority: Major > > Discovered while working on DRILL-3927. > The client and server both use the same direct memory allocator code. But the > allocator's OutOfMemoryException is derived from FragmentSetupException > (which is derived from ForemanException). > Firstly, OOM situations don't only happen during setup. > Secondly, Fragment and Foreman classes shouldn't exist on the client side. > (This is causing unnecessary dependencies on the jdbc-all jar on server-only > code). > There's nothing special in those base classes that OutOfMemoryException > depends on. This looks like it was just a cheap way to avoid extra catch > clauses in Foreman and FragmentExecutor by catching the base classes only. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-520) ceiling/ceil and floor functions return decimal value instead of an integer
[ https://issues.apache.org/jira/browse/DRILL-520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401459#comment-16401459 ] Karthikeyan Manivannan commented on DRILL-520: -- This cannot be fixed simply by changing the return type of the builtin functions (as suggested in the attached patch) because floating point types (Float4, Float8) have a range that is beyond what can be represented in any integer type, even in BigInt. This is what happens if the patch (change the outputType of ceil/floor to Int and use an (int) cast to return the value of java.lang.Math.ceil/floor) is applied: 0: jdbc:drill:zk=local> select floor(cast('34028.5' as float)); | 34028 | 1 row selected (0.157 seconds) 0: jdbc:drill:zk=local> select floor(cast('340280.5' as float)); <--- too big to represent as Int | 2147483647 | <- Int max A similar overflow will happen even if a BigInt output type and a 'double' cast is used. I looked up how PostgreSQL implements ceil/floor: |Name|Return Type|Description|Example|Result| |{{ceil(dp}} or {{numeric}})|(same as input)|nearest integer greater than or equal to argument|{{ceil(-42.8)}}|{{-42}}| |{{floor(dp}} or {{numeric}})|(same as input)|nearest integer less than or equal to argument|{{floor(-42.8)}}|{{-43}}| where _*dp*_ is a double-precision 8-byte floating-point number and _*numeric*_ is numeric [ (p, s) ], an exact numeric of selectable precision. I am not sure how the dp/numeric return type is returned as an Int. I think the way to fix this in Drill would be to use BigDecimal as the return type. 
> ceiling/ceil and floor functions return decimal value instead of an integer > --- > > Key: DRILL-520 > URL: https://issues.apache.org/jira/browse/DRILL-520 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill >Affects Versions: 1.0.0 >Reporter: Krystal >Assignee: Karthikeyan Manivannan >Priority: Critical > Fix For: Future > > Attachments: DRILL-520.patch > > > Ran the following queries in drill: > 0: jdbc:drill:schema=dfs> select ceiling(55.8) from dfs.`student` where > rownum=11; > ++ > | EXPR$0 | > ++ > | 56.0 | > ++ > 0: jdbc:drill:schema=dfs> select floor(55.8) from dfs.`student` where > rownum=11; > ++ > | EXPR$0 | > ++ > | 55.0 | > ++ > The same queries executed from oracle, postgres and mysql returned integer > values of 56 and 55. > Found the following description of the two functions from > http://users.atw.hu/sqlnut/sqlnut2-chp-4-sect-4.html : > Ceil/Ceiling: > Rounds a noninteger value upwards to the next greatest integer. Returns an > integer value unchanged. > Floor: > Rounds a noninteger value downwards to the next least integer. Returns an > integer value unchanged. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
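The overflow described in the comment above can be reproduced in plain Java: a narrowing cast from double to int saturates at Integer.MAX_VALUE (JLS 5.1.3), which is exactly the 2147483647 seen in the Drill output, and the same happens for long at Long.MAX_VALUE. A BigDecimal result, as suggested, preserves the value.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class FloorOverflow {
  public static void main(String[] args) {
    double tooBig = 1e10; // beyond Integer.MAX_VALUE
    // Narrowing double -> int saturates rather than wrapping:
    System.out.println((int) Math.floor(tooBig));  // 2147483647 (Integer.MAX_VALUE)
    // The same patch with a BigInt output type and a (long) cast only moves
    // the cliff; narrowing double -> long saturates at Long.MAX_VALUE:
    System.out.println((long) Math.floor(1e19));   // 9223372036854775807
    // A BigDecimal return type keeps the exact value:
    System.out.println(BigDecimal.valueOf(tooBig).setScale(0, RoundingMode.FLOOR));
  }
}
```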
[jira] [Resolved] (DRILL-520) ceiling/ceil and floor functions return decimal value instead of an integer
[ https://issues.apache.org/jira/browse/DRILL-520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan resolved DRILL-520. -- Resolution: Later > ceiling/ceil and floor functions return decimal value instead of an integer > --- > > Key: DRILL-520 > URL: https://issues.apache.org/jira/browse/DRILL-520 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill >Affects Versions: 1.0.0 >Reporter: Krystal >Assignee: Karthikeyan Manivannan >Priority: Critical > Fix For: Future > > Attachments: DRILL-520.patch > > > Ran the following queries in drill: > 0: jdbc:drill:schema=dfs> select ceiling(55.8) from dfs.`student` where > rownum=11; > ++ > | EXPR$0 | > ++ > | 56.0 | > ++ > 0: jdbc:drill:schema=dfs> select floor(55.8) from dfs.`student` where > rownum=11; > ++ > | EXPR$0 | > ++ > | 55.0 | > ++ > The same queries executed from oracle, postgres and mysql returned integer > values of 56 and 55. > Found the following description of the two functions from > http://users.atw.hu/sqlnut/sqlnut2-chp-4-sect-4.html : > Ceil/Ceiling: > Rounds a noninteger value upwards to the next greatest integer. Returns an > integer value unchanged. > Floor: > Rounds a noninteger value downwards to the next least integer. Returns an > integer value unchanged. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (DRILL-6083) RestClientFixture does not connect to the correct webserver port
[ https://issues.apache.org/jira/browse/DRILL-6083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan resolved DRILL-6083. --- Resolution: Not A Problem > RestClientFixture does not connect to the correct webserver port > > > Key: DRILL-6083 > URL: https://issues.apache.org/jira/browse/DRILL-6083 > Project: Apache Drill > Issue Type: Bug > Components: Tools, Build & Test >Affects Versions: Future >Reporter: Karthikeyan Manivannan >Assignee: Karthikeyan Manivannan >Priority: Major > Fix For: 1.13.0 > > > RestClientFixture always connects to the default http port (8047) instead of > connecting to the webserver-port of the cluster. The cluster's webserver port > won't be 8047 if there are other Drillbits running when the cluster is > launched. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6083) RestClientFixture does not connect to the correct webserver port
[ https://issues.apache.org/jira/browse/DRILL-6083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323240#comment-16323240 ] Karthikeyan Manivannan commented on DRILL-6083: --- The following change seems to fix the problem. diff --git a/exec/java-exec/src/test/java/org/apache/drill/test/RestClientFixture.java b/exec/java-exec/src/test/java/org/apache/drill/test/RestClientFixture.java index 988e7723d..09c6ba96f 100644 --- a/exec/java-exec/src/test/java/org/apache/drill/test/RestClientFixture.java +++ b/exec/java-exec/src/test/java/org/apache/drill/test/RestClientFixture.java @@ -55,7 +55,8 @@ public class RestClientFixture implements AutoCloseable { private final Client client; private RestClientFixture(ClusterFixture cluster) { -int port = cluster.config.getInt(ExecConstants.HTTP_PORT); +int port = cluster.drillbit().getWebServerPort(); +//int port = cluster.config.getInt(ExecConstants.HTTP_PORT); String address = cluster.drillbits().iterator().next().getContext().getEndpoint().getAddress(); String baseURL = "http://" + address + ":" + port; [~timothyfarkas], is this the correct approach? > RestClientFixture does not connect to the correct webserver port > > > Key: DRILL-6083 > URL: https://issues.apache.org/jira/browse/DRILL-6083 > Project: Apache Drill > Issue Type: Bug > Components: Tools, Build & Test >Affects Versions: Future >Reporter: Karthikeyan Manivannan >Assignee: Karthikeyan Manivannan > Fix For: 1.13.0 > > Original Estimate: 2h > Remaining Estimate: 2h > > RestClientFixture always connects to the default http port (8047) instead of > connecting to the webserver-port of the cluster. The cluster's webserver port > won't be 8047 if there are other Drillbits running when the cluster is > launched. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-6083) RestClientFixture does not connect to the correct webserver port
Karthikeyan Manivannan created DRILL-6083: - Summary: RestClientFixture does not connect to the correct webserver port Key: DRILL-6083 URL: https://issues.apache.org/jira/browse/DRILL-6083 Project: Apache Drill Issue Type: Bug Components: Tools, Build & Test Affects Versions: Future Reporter: Karthikeyan Manivannan Assignee: Karthikeyan Manivannan Fix For: 1.13.0 RestClientFixture always connects to the default http port (8047) instead of connecting to the webserver-port of the cluster. The cluster's webserver port won't be 8047 if there are other Drillbits running when the cluster is launched. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (DRILL-6017) Fix for SHUTDOWN button being visible for non Admin users
[ https://issues.apache.org/jira/browse/DRILL-6017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan reassigned DRILL-6017: - Assignee: Karthikeyan Manivannan > Fix for SHUTDOWN button being visible for non Admin users > - > > Key: DRILL-6017 > URL: https://issues.apache.org/jira/browse/DRILL-6017 > Project: Apache Drill > Issue Type: Bug >Reporter: Arina Ielchiieva >Assignee: Karthikeyan Manivannan >Priority: Blocker > > DRILL-4286 introduces shutdown button on index page but when authorization is > enabled it should be visible only to admin users. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-4708) connection closed unexpectedly
[ https://issues.apache.org/jira/browse/DRILL-4708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252801#comment-16252801 ] Karthikeyan Manivannan commented on DRILL-4708: --- [~cch...@maprtech.com] Do you have a repeatable repro for this ? It would be very useful to have a deterministic repro on a small cluster. > connection closed unexpectedly > -- > > Key: DRILL-4708 > URL: https://issues.apache.org/jira/browse/DRILL-4708 > Project: Apache Drill > Issue Type: Bug > Components: Execution - RPC >Affects Versions: 1.7.0 >Reporter: Chun Chang >Assignee: Karthikeyan Manivannan >Priority: Critical > Attachments: data.tgz > > > Running DRILL functional automation, we often see query failed randomly due > to the following unexpected connection close error. > {noformat} > Execution Failures: > /root/drillAutomation/framework/framework/resources/Functional/ctas/ctas_flatten/10rows/filter5.q > Query: > select * from dfs.ctas_flatten.`filter5_10rows_ctas` > Failed with exception > java.sql.SQLException: CONNECTION ERROR: Connection /10.10.100.171:36185 <--> > drillats4.qa.lab/10.10.100.174:31010 (user client) closed unexpectedly. > Drillbit down? 
> [Error Id: 3d5dad8e-80d0-4c7f-9012-013bf01ce2b7 ] > at > org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:247) > at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:321) > at > oadd.net.hydromatic.avatica.AvaticaResultSet.next(AvaticaResultSet.java:187) > at > org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:172) > at > org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:210) > at > org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:99) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > Caused by: oadd.org.apache.drill.common.exceptions.UserException: CONNECTION > ERROR: Connection /10.10.100.171:36185 <--> > drillats4.qa.lab/10.10.100.174:31010 (user client) closed unexpectedly. > Drillbit down? 
> [Error Id: 3d5dad8e-80d0-4c7f-9012-013bf01ce2b7 ] > at > oadd.org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543) > at > oadd.org.apache.drill.exec.rpc.user.QueryResultHandler$ChannelClosedHandler$1.operationComplete(QueryResultHandler.java:373) > at > oadd.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) > at > oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603) > at > oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563) > at > oadd.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406) > at > oadd.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82) > at > oadd.io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:943) > at > oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:592) > at > oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:584) > at > oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.closeOnRead(AbstractNioByteChannel.java:71) > at > oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:89) > at > oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:162) > at > oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) > at > oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) > at > oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) > at oadd.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) > at > oadd.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) > ... 1 more > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (DRILL-4708) connection closed unexpectedly
[ https://issues.apache.org/jira/browse/DRILL-4708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan resolved DRILL-4708. --- Resolution: Works for Me > connection closed unexpectedly > -- > > Key: DRILL-4708 > URL: https://issues.apache.org/jira/browse/DRILL-4708 > Project: Apache Drill > Issue Type: Bug > Components: Execution - RPC >Affects Versions: 1.7.0 >Reporter: Chun Chang >Assignee: Karthikeyan Manivannan >Priority: Critical > Attachments: data.tgz > > > Running DRILL functional automation, we often see query failed randomly due > to the following unexpected connection close error. > {noformat} > Execution Failures: > /root/drillAutomation/framework/framework/resources/Functional/ctas/ctas_flatten/10rows/filter5.q > Query: > select * from dfs.ctas_flatten.`filter5_10rows_ctas` > Failed with exception > java.sql.SQLException: CONNECTION ERROR: Connection /10.10.100.171:36185 <--> > drillats4.qa.lab/10.10.100.174:31010 (user client) closed unexpectedly. > Drillbit down? 
> [Error Id: 3d5dad8e-80d0-4c7f-9012-013bf01ce2b7 ] > at > org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:247) > at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:321) > at > oadd.net.hydromatic.avatica.AvaticaResultSet.next(AvaticaResultSet.java:187) > at > org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:172) > at > org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:210) > at > org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:99) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > Caused by: oadd.org.apache.drill.common.exceptions.UserException: CONNECTION > ERROR: Connection /10.10.100.171:36185 <--> > drillats4.qa.lab/10.10.100.174:31010 (user client) closed unexpectedly. > Drillbit down? 
> [Error Id: 3d5dad8e-80d0-4c7f-9012-013bf01ce2b7 ] > at > oadd.org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543) > at > oadd.org.apache.drill.exec.rpc.user.QueryResultHandler$ChannelClosedHandler$1.operationComplete(QueryResultHandler.java:373) > at > oadd.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) > at > oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603) > at > oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563) > at > oadd.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406) > at > oadd.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82) > at > oadd.io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:943) > at > oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:592) > at > oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:584) > at > oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.closeOnRead(AbstractNioByteChannel.java:71) > at > oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:89) > at > oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:162) > at > oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) > at > oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) > at > oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) > at oadd.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) > at > oadd.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) > ... 1 more > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-4708) connection closed unexpectedly
[ https://issues.apache.org/jira/browse/DRILL-4708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16250490#comment-16250490 ] Karthikeyan Manivannan commented on DRILL-4708: --- I don't see a failure with the Apache master(7a2fc87ee20f706d85cb5c90cc441e6b44b71592). I ran this test in an embedded drillbit. 0: jdbc:drill:zk=local> select * from dfs.`/Users/karthik/work/bugs/DRILL-4708/data` order by `time` asc; ... ... | signal=od2lkz4xzq | pi8tly75osv1aq | no-thing | 207369 | 0.046113 | | signal=mgj912yejm | pi8tly75osv1aq | no-thing | 207369 | 0.023779 | ++-+---+-+---+ 4,354,749 rows selected (157.454 seconds) 0: jdbc:drill:zk=local> select count(*) from dfs.`/Users/karthik/work/bugs/DRILL-4708/data`; +--+ | EXPR$0 | +--+ | 4354749 | +--+ > connection closed unexpectedly > -- > > Key: DRILL-4708 > URL: https://issues.apache.org/jira/browse/DRILL-4708 > Project: Apache Drill > Issue Type: Bug > Components: Execution - RPC >Affects Versions: 1.7.0 >Reporter: Chun Chang >Assignee: Karthikeyan Manivannan >Priority: Critical > Attachments: data.tgz > > > Running DRILL functional automation, we often see query failed randomly due > to the following unexpected connection close error. > {noformat} > Execution Failures: > /root/drillAutomation/framework/framework/resources/Functional/ctas/ctas_flatten/10rows/filter5.q > Query: > select * from dfs.ctas_flatten.`filter5_10rows_ctas` > Failed with exception > java.sql.SQLException: CONNECTION ERROR: Connection /10.10.100.171:36185 <--> > drillats4.qa.lab/10.10.100.174:31010 (user client) closed unexpectedly. > Drillbit down? 
> [Error Id: 3d5dad8e-80d0-4c7f-9012-013bf01ce2b7 ] > at > org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:247) > at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:321) > at > oadd.net.hydromatic.avatica.AvaticaResultSet.next(AvaticaResultSet.java:187) > at > org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:172) > at > org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:210) > at > org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:99) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > Caused by: oadd.org.apache.drill.common.exceptions.UserException: CONNECTION > ERROR: Connection /10.10.100.171:36185 <--> > drillats4.qa.lab/10.10.100.174:31010 (user client) closed unexpectedly. > Drillbit down? 
> [Error Id: 3d5dad8e-80d0-4c7f-9012-013bf01ce2b7 ] > at > oadd.org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543) > at > oadd.org.apache.drill.exec.rpc.user.QueryResultHandler$ChannelClosedHandler$1.operationComplete(QueryResultHandler.java:373) > at > oadd.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) > at > oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603) > at > oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563) > at > oadd.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406) > at > oadd.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82) > at > oadd.io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:943) > at > oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:592) > at > oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:584) > at > oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.closeOnRead(AbstractNioByteChannel.java:71) > at > oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:89) > at > oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:162) > at > oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) > at > oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) > at > oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) > at oadd.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) >
[jira] [Updated] (DRILL-5582) [Threat Modeling] Drillbit may be spoofed by an attacker and this may lead to data being written to the attacker's target instead of Drillbit
[ https://issues.apache.org/jira/browse/DRILL-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan updated DRILL-5582: -- Labels: doc-impacting (was: ) > [Threat Modeling] Drillbit may be spoofed by an attacker and this may lead to > data being written to the attacker's target instead of Drillbit > - > > Key: DRILL-5582 > URL: https://issues.apache.org/jira/browse/DRILL-5582 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: Rob Wu >Priority: Minor > Labels: doc-impacting > > *Consider the scenario:* > Alice has a drillbit (my.drillbit.co) with plain and kerberos authentication > enabled containing important data. Bob, the attacker, attempts to spoof the > connection and redirect it to his own drillbit (fake.drillbit.co) with no > authentication setup. > When Alice is under attack and attempts to connect to her secure drillbit, > she is actually authenticating against Bob's drillbit. At this point, the > connection should have failed due to unmatched configuration. However, the > current implementation will return SUCCESS as long as the (spoofing) drillbit > has no authentication requirement set. > Currently, the drillbit <- to -> drill client connection accepts the lowest > authentication configuration set on the server. This leaves unsuspecting user > vulnerable to spoofing. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5697) Improve performance of filter operator for pattern matching
[ https://issues.apache.org/jira/browse/DRILL-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16128070#comment-16128070 ] Karthikeyan Manivannan commented on DRILL-5697: --- Where should I be looking if I want to see what the Baseline code does? > Improve performance of filter operator for pattern matching > --- > > Key: DRILL-5697 > URL: https://issues.apache.org/jira/browse/DRILL-5697 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Flow >Affects Versions: 1.11.0 >Reporter: Padma Penumarthy >Assignee: Padma Penumarthy > > Queries using filter with sql like operator use Java regex library for > pattern matching. However, for cases like %abc (ends with abc), abc% (starts > with abc), %abc% (contains abc), it is observed that implementing these cases > with simple code instead of using regex library provides good performance > boost (4-6x). Idea is to use special case code for simple, common cases and > fall back to Java regex library for complicated ones. That will provide good > performance benefit for most common cases. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
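The optimization DRILL-5697 describes — handling %abc (ends with), abc% (starts with), and %abc% (contains) with simple string operations and falling back to the regex library only for complicated patterns — can be sketched in plain Java. The class and method names below are illustrative, not Drill's actual filter-operator code:

```java
import java.util.regex.Pattern;

// Sketch of special-casing simple SQL LIKE patterns before falling
// back to regex, per the idea in DRILL-5697. Names are hypothetical;
// escaping of regex metacharacters in the pattern is ignored here.
public class SimpleLikeMatcher {
    // Returns true if 'input' matches the SQL LIKE 'pattern'.
    public static boolean like(String input, String pattern) {
        if (pattern.equals("%")) {
            return true; // matches everything
        }
        boolean leading = pattern.startsWith("%");
        boolean trailing = pattern.endsWith("%");
        String core = pattern.substring(leading ? 1 : 0,
                pattern.length() - (trailing ? 1 : 0));
        // Fast paths: no wildcards left inside the core pattern.
        if (!core.contains("%") && !core.contains("_")) {
            if (leading && trailing) return input.contains(core);   // %abc%
            if (trailing)            return input.startsWith(core); // abc%
            if (leading)             return input.endsWith(core);   // %abc
            return input.equals(core);                              // abc
        }
        // Fallback: translate LIKE wildcards to a regex and match fully.
        String regex = pattern.replace("%", ".*").replace("_", ".");
        return Pattern.matches(regex, input);
    }
}
```

The fast paths avoid regex compilation and backtracking entirely, which is where the reported 4-6x speedup for the common cases would come from.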
[jira] [Updated] (DRILL-5671) Set secure ACLs (Access Control List) for Drill ZK nodes in a secure cluster
[ https://issues.apache.org/jira/browse/DRILL-5671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan updated DRILL-5671: -- Summary: Set secure ACLs (Access Control List) for Drill ZK nodes in a secure cluster (was: Set a secure ACL (Access Control List) for Drill ZK nodes in a secure cluster) > Set secure ACLs (Access Control List) for Drill ZK nodes in a secure cluster > > > Key: DRILL-5671 > URL: https://issues.apache.org/jira/browse/DRILL-5671 > Project: Apache Drill > Issue Type: New Feature > Components: Server >Reporter: Karthikeyan Manivannan >Assignee: Karthikeyan Manivannan > > All Drill ZK nodes, currently, are assigned a default [world:all] ACL i.e. > anyone gets to do CDRWA(create, delete, read, write, admin access). This > means that even on a secure cluster anyone can perform all CRDWA actions on > the znodes. > This should be changed such that: > - In a non-secure cluster, Drill will continue using the current default > [world:all] ACL > - In a secure cluster, all nodes should have an [authid: all] ACL i.e. the > authenticated user that created the znode gets full access. The discovery > znodes i.e. the znodes with the list of Drillbits will have an additional > [world:read] ACL, i.e. the list of Drillbits will be readable by anyone. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-5671) Set a secure ACL (Access Control List) for Drill ZK nodes in a secure cluster
Karthikeyan Manivannan created DRILL-5671: - Summary: Set a secure ACL (Access Control List) for Drill ZK nodes in a secure cluster Key: DRILL-5671 URL: https://issues.apache.org/jira/browse/DRILL-5671 Project: Apache Drill Issue Type: New Feature Components: Server Reporter: Karthikeyan Manivannan Assignee: Karthikeyan Manivannan All Drill ZK nodes, currently, are assigned a default [world:all] ACL i.e. anyone gets to do CDRWA(create, delete, read, write, admin access). This means that even on a secure cluster anyone can perform all CRDWA actions on the znodes. This should be changed such that: - In a non-secure cluster, Drill will continue using the current default [world:all] ACL - In a secure cluster, all nodes should have an [authid: all] ACL i.e. the authenticated user that created the znode gets full access. The discovery znodes i.e. the znodes with the list of Drillbits will have an additional [world:read] ACL, i.e. the list of Drillbits will be readable by anyone. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
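The ACL policy described in DRILL-5671 amounts to a small selection rule. The sketch below models ACL entries as "scheme:id:perms" strings purely for illustration; a real implementation would build org.apache.zookeeper.data.ACL objects via Curator or the ZooKeeper client:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the znode ACL policy from DRILL-5671. ACL entries are
// modeled as "scheme:id:perms" strings for illustration only.
public class ZkAclPolicy {
    public static List<String> aclsFor(boolean secureCluster,
                                       boolean discoveryZnode,
                                       String authenticatedUser) {
        List<String> acls = new ArrayList<>();
        if (!secureCluster) {
            // Non-secure cluster: keep the default open [world:all] ACL.
            acls.add("world:anyone:cdrwa");
            return acls;
        }
        // Secure cluster: the authenticated creator gets full CDRWA access.
        acls.add("auth:" + authenticatedUser + ":cdrwa");
        if (discoveryZnode) {
            // Discovery znodes stay world-readable so any client can
            // still find the list of running Drillbits.
            acls.add("world:anyone:r");
        }
        return acls;
    }
}
```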
[jira] [Resolved] (DRILL-5567) Review changes for DRILL 5514
[ https://issues.apache.org/jira/browse/DRILL-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan resolved DRILL-5567. --- Resolution: Done > Review changes for DRILL 5514 > - > > Key: DRILL-5567 > URL: https://issues.apache.org/jira/browse/DRILL-5567 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Karthikeyan Manivannan >Assignee: Karthikeyan Manivannan > Fix For: 1.11.0 > > Original Estimate: 2h > Remaining Estimate: 2h > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (DRILL-5567) Review changes for DRILL 5514
[ https://issues.apache.org/jira/browse/DRILL-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan updated DRILL-5567: -- Remaining Estimate: 2h Original Estimate: 2h > Review changes for DRILL 5514 > - > > Key: DRILL-5567 > URL: https://issues.apache.org/jira/browse/DRILL-5567 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Karthikeyan Manivannan >Assignee: Karthikeyan Manivannan > Fix For: 1.11.0 > > Original Estimate: 2h > Remaining Estimate: 2h > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (DRILL-5514) Enhance VectorContainer to merge two row sets
[ https://issues.apache.org/jira/browse/DRILL-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan updated DRILL-5514: -- Reviewer: Karthikeyan Manivannan (was: Sorabh Hamirwasia) > Enhance VectorContainer to merge two row sets > - > > Key: DRILL-5514 > URL: https://issues.apache.org/jira/browse/DRILL-5514 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.10.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Minor > Fix For: 1.11.0 > > > Consider the concept of a "record batch" in Drill. On the one hand, one can > envision a record batch as a stack of records: > {code} > | a1 | b1 | c1 | > > | a2 | b2 | c2 | > {code} > But, Drill is columnar. So a record batch is really a "bundle" of vectors: > {code} > | a1 || b1 || c1 | > | a2 || b2 || c2 | > {code} > There are times when it is handy to build up a record batch as a merge of two > different vector bundles: > {code} > -- bundle 1 ---- bundle 2 -- > | a1 || b1 || c1 | > | a2 || b2 || c2 | > {code} > For example, consider a reader. The reader implementation might read columns > (a, b) from a file, say. Then, the "{{ScanBatch}}" might add (c) as an > implicit vector (the file name, say.) The merged set of vectors comprises the > final schema: (a, b, c). > This ticket asks for the code to do the merge: > * Merge two schemas A = (a, b), B = (c) to create schema C = (a, b, c). > * Merge two vector containers C1 and C2 to create a new container, C3, that > holds the merger of the vectors from the first two. > Clearly, the merge only makes sense if: > * The two input containers have the same row count, and > * The columns in each input container are distinct. > Because this feature is also useful for tests, add the merge to the "row set" > tools also. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
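The merge DRILL-5514 asks for, together with its two preconditions (equal row counts, distinct column names), can be sketched with plain collections standing in for vector containers. Each "container" below is an ordered map of column name to column values; this shows the contract only, not Drill's actual VectorContainer API:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of merging two "vector bundles" into one, modeling each
// container as an ordered map of column name -> column values.
// Illustrative only; Drill's real merge operates on value vectors.
public class ContainerMerge {
    public static Map<String, List<Object>> merge(Map<String, List<Object>> c1,
                                                  Map<String, List<Object>> c2) {
        // Precondition 1: both inputs carry the same number of rows.
        int rows1 = rowCount(c1), rows2 = rowCount(c2);
        if (rows1 != rows2) {
            throw new IllegalArgumentException("row counts differ: " + rows1 + " vs " + rows2);
        }
        Map<String, List<Object>> merged = new LinkedHashMap<>(c1);
        for (Map.Entry<String, List<Object>> col : c2.entrySet()) {
            // Precondition 2: column names must be distinct across inputs
            // (putIfAbsent returns the prior value on a name collision).
            if (merged.putIfAbsent(col.getKey(), col.getValue()) != null) {
                throw new IllegalArgumentException("duplicate column: " + col.getKey());
            }
        }
        return merged;
    }

    private static int rowCount(Map<String, List<Object>> c) {
        return c.isEmpty() ? 0 : c.values().iterator().next().size();
    }
}
```

In the ScanBatch example from the ticket, c1 would hold the columns read from the file (a, b) and c2 the implicit column (c), yielding the merged schema (a, b, c).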
[jira] [Created] (DRILL-5567) Review changes for DRILL 5514
Karthikeyan Manivannan created DRILL-5567: - Summary: Review changes for DRILL 5514 Key: DRILL-5567 URL: https://issues.apache.org/jira/browse/DRILL-5567 Project: Apache Drill Issue Type: Sub-task Reporter: Karthikeyan Manivannan Assignee: Karthikeyan Manivannan -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (DRILL-5547) Drill config options and session options do not work as intended
Karthikeyan Manivannan created DRILL-5547: - Summary: Drill config options and session options do not work as intended Key: DRILL-5547 URL: https://issues.apache.org/jira/browse/DRILL-5547 Project: Apache Drill Issue Type: Bug Components: Server Affects Versions: 1.10.0 Reporter: Karthikeyan Manivannan Assignee: Venkata Jyothsna Donapati Fix For: Future In Drill, session options should take precedence over config options. But several of these session options are assigned hard-coded default values when the option validators are initialized. Because of this, config options will never be read and honored, even if the user did not specify the session option. ClassCompilerSelector.JAVA_COMPILER_VALIDATOR uses CompilerPolicy.DEFAULT as the default value. This default value gets into the session options map via the initialization of validators in SystemOptionManager. Now any piece of code that checks whether a session option is set will never see a null, so it will always use that value and never look at the config options. For example, in the following piece of code from ClassCompilerSelector, the policy will never be read from the config file. policy = CompilerPolicy.valueOf((value != null) ? value.string_val.toUpperCase() : config.getString(JAVA_COMPILER_CONFIG).toUpperCase()); -- This message was sent by Atlassian JIRA (v6.3.15#6346)
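The intended precedence — use the session option only if it was explicitly set, otherwise fall back to the config value — amounts to a null check, which the hard-coded validator defaults defeat by making the session value always non-null. A minimal sketch of the intended lookup (class and option names hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the option-resolution order DRILL-5547 says is broken:
// an explicitly set session option wins, and only a truly unset
// session option falls back to the config file value. Seeding the
// session map with validator defaults makes the fallback unreachable.
public class OptionResolver {
    private final Map<String, String> sessionOptions = new HashMap<>();
    private final Map<String, String> configOptions = new HashMap<>();

    public void setSessionOption(String name, String value) { sessionOptions.put(name, value); }
    public void setConfigOption(String name, String value) { configOptions.put(name, value); }

    // Intended behavior: session overrides config only when actually set.
    public String resolve(String name) {
        String sessionValue = sessionOptions.get(name);
        return (sessionValue != null) ? sessionValue : configOptions.get(name);
    }
}
```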
[jira] [Commented] (DRILL-4609) Select true,true,true from ... does not always output true,true,true
[ https://issues.apache.org/jira/browse/DRILL-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16003332#comment-16003332 ] Karthikeyan Manivannan commented on DRILL-4609: --- It is unlikely that this is a JVM bug. JDK 7's runtime/JIT-compilation policies might just be hiding a Drill bug. JVM bugs( like mis-compiles, incorrect class loading) usually result in a crash. All that said, given that this is a 1-bit error, it is possible that this is a JVM bug. You can eliminate the role of the JIT compiler by trying to repro the bug with the -Xint( interpreter only, will slow down execution significantly) JVM option. If you hit the bug with -Xint then you can be fairly certain that this is not a JVM bug. > Select true,true,true from ... does not always output true,true,true > > > Key: DRILL-4609 > URL: https://issues.apache.org/jira/browse/DRILL-4609 > Project: Apache Drill > Issue Type: Bug > Components: Client - CLI, Query Planning & Optimization, Storage - > Writer >Affects Versions: 1.5.0, 1.6.0 > Environment: Linux Redhat > tested in cluster (hdfs) and embedded mode >Reporter: F Méthot > > Doing a simple "select true, true, true from table" won't output > true,true,true on all generated rows. > Step to reproduce. > generate a simple CSV files: > {code:sql} > for i in {1..100}; do echo "Allo"; done > /users/fmethot/test.csv > {code} > Open a new fresh drill CLI. > Just to help for validation, switch output to CSV: > {code:sql} > alter session set `store.format`='csv' > {code} > generate a table like this: > {code:sql} >create table TEST_OUT as (select true,true,true,true from > dfs.`/users/fmethot/test.csv') > {code} > Check content of /users/fmethot/test.csv > You will find false values in there! > If you generate another table, on the same session, the same way, chances are > the value will be fine (all true). We can only reproduce this on the first > CTAS run. 
> We came to test this select pattern after we realize our custom boolean UDF > (as well as the one provided in Drill like "ilike") were not outputting > consistent deterministic results (same input were implausibly generating > random boolean output). We hope that fixing this ticket will also fix our > issue with boolean UDFs. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (DRILL-5448) ODBC client crashed when user does not have access to text formatted hive table
[ https://issues.apache.org/jira/browse/DRILL-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15985780#comment-15985780 ] Karthikeyan Manivannan commented on DRILL-5448: --- Sorabh and I took an initial stab at this. The failure is because the error message buffer in Drill::ErrorMessages::getMessage can only handle 10240 characters. The error message, in this case, is about 32k. This seems to be causing the segfault. No idea why this only occurs if the drillbit is busy. (lldb) process launch --stop-at-entry -- -program_arg querySubmitter query='select * from hive.voter_text' logLevel=trace type=sql connectStr=local=10.10.102.81:31010 api=sync saslPluginPath='/Users/karthik/lib' user=karthik password=maprmapr1234 ... ... lldb) br s -n 'Drill::DrillClientError::getErrorObject(exec::shared::DrillPBError const&)' Breakpoint 1: where = libdrillClient.dylib`Drill::DrillClientError::getErrorObject(exec::shared::DrillPBError const&) + 15 at drillClient.cpp:33, address = 0x00010017021f (lldb) r There is a running process, kill it and restart?: [Y/n] Y Process 28268 exited with status = 9 (0x0009) Process 36577 launched: '/Users/karthik/git-sources/drill-fork/contrib/native/client/build/querySubmitter' (x86_64) Unknown option:-program_arg. Ignoring Unknown option:querySubmitter. Ignoring Connected! 
Process 36577 stopped * thread #2: tid = 0x1c2a096, 0x00010017021f libdrillClient.dylib`Drill::DrillClientError::getErrorObject(e=0x0001008013d0) + 15 at drillClient.cpp:33, stop reason = breakpoint 1.1 frame #0: 0x00010017021f libdrillClient.dylib`Drill::DrillClientError::getErrorObject(e=0x0001008013d0) + 15 at drillClient.cpp:33 30 namespace Drill{ 31 32 DrillClientError* DrillClientError::getErrorObject(const exec::shared::DrillPBError& e){ -> 33 std::string s=Drill::getMessage(ERR_QRY_FAILURE, e.message().c_str()); 34 DrillClientError* err=NULL; 35 err=new DrillClientError(QRY_FAILURE, QRY_ERROR_START+QRY_FAILURE, s); 36 return err; (lldb) step Process 36577 stopped * thread #2: tid = 0x1c2a096, 0x00010017047c libdrillClient.dylib`exec::shared::DrillPBError::message(this=0x0001008013d0) const + 12 at UserBitShared.pb.h:3409, stop reason = step in frame #0: 0x00010017047c libdrillClient.dylib`exec::shared::DrillPBError::message(this=0x0001008013d0) const + 12 at UserBitShared.pb.h:3409 3406 clear_has_message(); 3407 } 3408 inline const ::std::string& DrillPBError::message() const { -> 3409 return *message_; 3410 } 3411 inline void DrillPBError::set_message(const ::std::string& value) { 3412 set_has_message(); (lldb) p *message warning: could not load any Objective-C class information. This will significantly reduce the quality of type information available. error: reference to non-static member function must be called; did you mean to call it with no arguments? 
error: indirection requires pointer operand ('const string' (aka 'const std::__1::basic_string, std::__1::allocator >') invalid) (lldb) p *message_ (std::__1::string) $6 = "SYSTEM ERROR: AccessControlException: User karthik(user id 503) does not have access to maprfs:///user/hive/warehouse/voter_text\n\n\n[Error Id: 7efda71b-f55b-45b4-a040-a59b0fd059a7 on qa102-81.qa.lab:31010]\n\n (org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception during fragment initialization: Internal error: Error while applying rule DrillPushProjIntoScan, args [rel#15120:LogicalProject.NONE.ANY([]).[](input=rel#15119:Subset#0.ENUMERABLE.ANY([]).[],voter_id=$0,name=$1,age=$2,registration=$3,contributions=$4,voterzone=$5,create_timestamp=$6,create_date=$7), rel#15110:EnumerableTableScan.ENUMERABLE.ANY([]).[](table=[hive, voter_text])]\norg.apache.drill.exec.work.foreman.Foreman.run():298\n java.util.concurrent.ThreadPoolExecutor.runWorker():1142\n java.util.concurrent.ThreadPoolExecutor$Worker.run():617\n java.lang.Thread.run():745\n Caused By (java.lang.AssertionError) Internal error: Error while applying rule DrillPushProjIntoScan, args [rel#15120:LogicalProject.NONE.ANY([]).[]"... (lldb) p message_.size() (std::__1::basic_string, std::__1::allocator >::size_type) $7 = 32737 Fix-it applied, fixed expression was: message_->size() (lldb) p message_->size() (std::__1::basic_string, std::__1::allocator >::size_type) $8 = 32737 (lldb) step Process 36577 stopped * thread #2: tid = 0x1c2a096, 0x000100170228 libdrillClient.dylib`Drill::DrillClientError::getErrorObject(e=0x0001008013d0) + 24 at drillClient.cpp:33, stop reason = step in frame #0: 0x000100170228 libdrillClient.dylib`Drill::DrillClientError::getErrorObject(e=0x0001008013d0) + 24 at drillClient.cpp:33 30 namespace Drill{ 31 32 DrillClientError* DrillClientError::getErrorObject(const exec
[jira] [Updated] (DRILL-5034) Select timestamp from hive generated parquet always return in UTC
[ https://issues.apache.org/jira/browse/DRILL-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan updated DRILL-5034: -- Labels: ready-to-commit (was: ) > Select timestamp from hive generated parquet always return in UTC > - > > Key: DRILL-5034 > URL: https://issues.apache.org/jira/browse/DRILL-5034 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Parquet >Affects Versions: 1.9.0 >Reporter: Krystal >Assignee: Vitalii Diravka > Labels: ready-to-commit > > commit id: 5cea9afa6278e21574c6a982ae5c3d82085ef904 > Reading timestamp data against a hive parquet table from drill automatically > converts the timestamp data to UTC. > {code} > SELECT TIMEOFDAY() FROM (VALUES(1)); > +--+ > |EXPR$0| > +--+ > | 2016-11-10 12:33:26.547 America/Los_Angeles | > +--+ > {code} > data schema: > {code} > message hive_schema { > optional int32 voter_id; > optional binary name (UTF8); > optional int32 age; > optional binary registration (UTF8); > optional fixed_len_byte_array(3) contributions (DECIMAL(6,2)); > optional int32 voterzone; > optional int96 create_timestamp; > optional int32 create_date (DATE); > } > {code} > Using drill-1.8, the returned timestamps match the table data: > {code} > select convert_from(create_timestamp, 'TIMESTAMP_IMPALA') from > `/user/hive/warehouse/voter_hive_parquet` limit 5; > ++ > | EXPR$0 | > ++ > | 2016-10-23 20:03:58.0 | > | null | > | 2016-09-09 12:01:18.0 | > | 2017-03-06 20:35:55.0 | > | 2017-01-20 22:32:43.0 | > ++ > 5 rows selected (1.032 seconds) > {code} > If the user timzone is changed to UTC, then the timestamp data is returned in > UTC time. > Using drill-1.9, the returned timestamps got converted to UTC eventhough the > user timezone is in PST. 
> {code} > select convert_from(create_timestamp, 'TIMESTAMP_IMPALA') from > dfs.`/user/hive/warehouse/voter_hive_parquet` limit 5; > ++ > | EXPR$0 | > ++ > | 2016-10-24 03:03:58.0 | > | null | > | 2016-09-09 19:01:18.0 | > | 2017-03-07 04:35:55.0 | > | 2017-01-21 06:32:43.0 | > ++ > {code} > {code} > alter session set `store.parquet.reader.int96_as_timestamp`=true; > +---+---+ > | ok | summary | > +---+---+ > | true | store.parquet.reader.int96_as_timestamp updated. | > +---+---+ > select create_timestamp from dfs.`/user/hive/warehouse/voter_hive_parquet` > limit 5; > ++ > |create_timestamp| > ++ > | 2016-10-24 03:03:58.0 | > | null | > | 2016-09-09 19:01:18.0 | > | 2017-03-07 04:35:55.0 | > | 2017-01-21 06:32:43.0 | > ++ > {code} > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
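The UTC shift reported above comes down to how the parquet INT96 value is decoded. As a hedged illustration (this is not Drill's reader code; the class and method names here are hypothetical), the sketch below decodes the conventional Impala/Hive INT96 timestamp layout: the first 8 bytes are little-endian nanoseconds within the day, and the last 4 bytes are a little-endian Julian day number. Decoding yields a UTC instant; whether the user sees local time depends on a separate, later timezone adjustment, which is what changed between drill-1.8 and drill-1.9 here.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class Int96Timestamp {
    // Julian day number of the Unix epoch, 1970-01-01.
    static final long JULIAN_EPOCH_DAY = 2440588L;
    static final long MILLIS_PER_DAY = 86_400_000L;

    // Decode a 12-byte Impala-style INT96 timestamp into epoch milliseconds (UTC).
    static long toEpochMillis(byte[] int96) {
        ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
        long nanosOfDay = buf.getLong();               // first 8 bytes
        long julianDay = buf.getInt() & 0xFFFFFFFFL;   // last 4 bytes, unsigned
        return (julianDay - JULIAN_EPOCH_DAY) * MILLIS_PER_DAY
             + nanosOfDay / 1_000_000L;
    }
}
```

Rendering that UTC instant in the user's session timezone (as drill-1.8 effectively did) versus printing it as-is (the drill-1.9 behavior described above) is what produces the 7-8 hour difference for a PST user.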
[jira] [Updated] (DRILL-5097) Using store.parquet.reader.int96_as_timestamp gives IOOB whereas convert_from works
[ https://issues.apache.org/jira/browse/DRILL-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan updated DRILL-5097: -- Labels: ready-to-commit (was: ) > Using store.parquet.reader.int96_as_timestamp gives IOOB whereas convert_from > works > --- > > Key: DRILL-5097 > URL: https://issues.apache.org/jira/browse/DRILL-5097 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Data Types, Storage - Parquet >Affects Versions: 1.9.0 >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka > Labels: ready-to-commit > Fix For: Future > > Attachments: data.snappy.parquet > > > Using store.parquet.reader.int96_as_timestamp gives IOOB whereas convert_from > works. > The below query succeeds: > {code} > select c, convert_from(d, 'TIMESTAMP_IMPALA') from > dfs.`/drill/testdata/parquet_timestamp/spark_generated/d3`; > {code} > The below query fails: > {code} > 0: jdbc:drill:zk=10.10.100.190:5181> alter session set > `store.parquet.reader.int96_as_timestamp` = true; > +---+---+ > | ok | summary | > +---+---+ > | true | store.parquet.reader.int96_as_timestamp updated. | > +---+---+ > 1 row selected (0.231 seconds) > 0: jdbc:drill:zk=10.10.100.190:5181> select c, d from > dfs.`/drill/testdata/parquet_timestamp/spark_generated/d3`; > Error: SYSTEM ERROR: IndexOutOfBoundsException: readerIndex: 0, writerIndex: > 131076 (expected: 0 <= readerIndex <= writerIndex <= capacity(131072)) > Fragment 0:0 > [Error Id: bd94f477-7c01-420f-8920-06263212177b on qa-node190.qa.lab:31010] > (state=,code=0) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
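The IOOB message above is the invariant that Netty-style buffers enforce: 0 <= readerIndex <= writerIndex <= capacity. A minimal model of that check (this is illustrative only, not Drill's DrillBuf) shows why writing 12-byte INT96 values into a buffer sized for a smaller width trips it: the writer index is advanced past capacity.

```java
// Minimal model of the reader/writer-index invariant from the error message.
class SlicedBuffer {
    final byte[] data;
    int readerIndex, writerIndex;

    SlicedBuffer(int capacity) { data = new byte[capacity]; }

    // Reject any writer index outside [readerIndex, capacity], mirroring the
    // "expected: 0 <= readerIndex <= writerIndex <= capacity(...)" message.
    void setWriterIndex(int w) {
        if (w < readerIndex || w > data.length) {
            throw new IndexOutOfBoundsException(
                "readerIndex: " + readerIndex + ", writerIndex: " + w
                + " (expected: 0 <= readerIndex <= writerIndex <= capacity("
                + data.length + "))");
        }
        writerIndex = w;
    }
}
```

With capacity 131072, setting the writer index to 131076 (the numbers from the stack trace) throws exactly this exception, while 131072 is still legal.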
[jira] [Updated] (DRILL-4868) Hive functions should update writerIndex accordingly when return binary type
[ https://issues.apache.org/jira/browse/DRILL-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan updated DRILL-4868: -- Labels: ready-to-commit (was: ) > Hive functions should update writerIndex accordingly when return binary type > > > Key: DRILL-4868 > URL: https://issues.apache.org/jira/browse/DRILL-4868 > Project: Apache Drill > Issue Type: Bug >Reporter: Chunhui Shi >Assignee: Chunhui Shi > Labels: ready-to-commit > > unhex is a Hive function. The returned binary buffer cannot be consumed by > convert_from as shown below. > 0: jdbc:drill:zk=10.10.88.128:5181> select > convert_from(unhex('0a5f710b'),'int_be') from (values(1)); > Error: SYSTEM ERROR: IndexOutOfBoundsException: readerIndex(0) + length(4) > exceeds writerIndex(0): DrillBuf[31], udle: [25 0..1024] > Fragment 0:0 > [Error Id: 5e72ce4a-6164-4260-8317-ca2bb6325013 on atsqa4-128.qa.lab:31010] > (state=,code=0) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
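The `exceeds writerIndex(0)` part of the error is the tell: the Hive function copied bytes into the buffer but never advanced the writer index, so a downstream reader sees zero readable bytes. A hedged sketch of that failure mode (illustrative only; this models the buffer contract, not Hive's or Drill's actual classes):

```java
// Model of a result buffer whose readers trust writerIndex, as Netty-style
// buffers do. Copying data without advancing writerIndex makes it invisible.
class BinaryResult {
    final byte[] data = new byte[1024];
    int writerIndex = 0;  // count of bytes actually published to readers

    void writeBytes(byte[] src) {
        System.arraycopy(src, 0, data, writerIndex, src.length);
        writerIndex += src.length;  // the step DRILL-4868 says was missing
    }

    // Big-endian int read, like convert_from(..., 'int_be') in the report.
    int readInt(int readerIndex) {
        if (readerIndex + 4 > writerIndex) {
            throw new IndexOutOfBoundsException("readerIndex(" + readerIndex
                + ") + length(4) exceeds writerIndex(" + writerIndex + ")");
        }
        return ((data[readerIndex] & 0xFF) << 24)
             | ((data[readerIndex + 1] & 0xFF) << 16)
             | ((data[readerIndex + 2] & 0xFF) << 8)
             | (data[readerIndex + 3] & 0xFF);
    }
}
```

Writing the four unhex'd bytes through `writeBytes` lets `readInt(0)` return 0x0a5f710b; copying them into `data` directly while leaving `writerIndex` at 0 reproduces the reported exception.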
[jira] [Created] (DRILL-5121) A memory leak is observed when exact case is not specified for a column in a filter condition
Karthikeyan Manivannan created DRILL-5121: - Summary: A memory leak is observed when exact case is not specified for a column in a filter condition Key: DRILL-5121 URL: https://issues.apache.org/jira/browse/DRILL-5121 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.8.0, 1.6.0 Reporter: Karthikeyan Manivannan Assignee: Karthikeyan Manivannan Fix For: Future When the query SELECT XYZ from dfs.`/tmp/foo` where xYZ like 'abc' is executed on a setup where /tmp/foo has 2 Parquet files, 1.parquet and 2.parquet, and 1.parquet has the column XYZ but 2.parquet does not, there is a memory leak. This seems to happen because xYZ is treated as a new column. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
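SQL identifiers are normally matched case-insensitively, so "xYZ" in the filter and "XYZ" in the Parquet schema should resolve to the same column slot rather than spawning a new one. A minimal sketch of such a resolver (the class and method names are hypothetical, for illustration only, not Drill's schema code), using a map ordered by `String.CASE_INSENSITIVE_ORDER`:

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch: case-insensitive column lookup, so "xYZ" and "XYZ" share one slot.
class ColumnResolver {
    private final Map<String, Integer> columns =
        new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    void addColumn(String name, int index) { columns.put(name, index); }

    // Returns the column's slot, or null only when the column truly does not
    // exist under any casing.
    Integer resolve(String name) { return columns.get(name); }
}
```

With a case-sensitive map in the same place, `resolve("xYZ")` would miss, and any vector allocated for the "new" column would need explicit cleanup — which matches the leak symptom described above.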
[jira] [Created] (DRILL-4974) NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions
Karthikeyan Manivannan created DRILL-4974: - Summary: NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions Key: DRILL-4974 URL: https://issues.apache.org/jira/browse/DRILL-4974 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.8.0, 1.7.0, 1.6.0 Reporter: Karthikeyan Manivannan Assignee: Karthikeyan Manivannan Fix For: 1.8.0 The following query can cause an NPE in FindPartitionConditions.analyzeCall() if the fileSize column is a partitioned column. SELECT fileSize FROM dfs.`/drill-data/data/` WHERE compoundId LIKE 'FOO-1234567%' This is because the LIKE is treated as a holistic expression in FindPartitionConditions.analyzeCall(), causing opStack to be empty, thus causing opStack.peek() to return a NULL value. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
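The NPE mechanism is generic Java behavior: `Deque.peek()` returns null on an empty deque instead of throwing, so any code that dereferences the result without a guard fails later with an NPE. A small sketch of the guard (the method and class names are illustrative, not Drill's actual fix):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: peek() on an empty Deque returns null, so guard before using it.
class OpStackGuard {
    static String parentOp(Deque<String> opStack) {
        String top = opStack.peek();           // null when the stack is empty
        return (top == null) ? "NONE" : top;   // guard instead of top.xxx()
    }
}
```

In the scenario above, the holistic LIKE expression leaves opStack empty, so an unguarded `opStack.peek().something()` is exactly the null dereference reported.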
[jira] [Commented] (DRILL-4373) Drill and Hive have incompatible timestamp representations in parquet
[ https://issues.apache.org/jira/browse/DRILL-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15610055#comment-15610055 ] Karthikeyan Manivannan commented on DRILL-4373: --- +1 > Drill and Hive have incompatible timestamp representations in parquet > - > > Key: DRILL-4373 > URL: https://issues.apache.org/jira/browse/DRILL-4373 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Hive, Storage - Parquet >Affects Versions: 1.8.0 >Reporter: Rahul Challapalli >Assignee: Karthikeyan Manivannan > Labels: doc-impacting > Fix For: 1.9.0 > > > git.commit.id.abbrev=83d460c > I created a parquet file with a timestamp type using Drill. Now if I define a > hive table on top of the parquet file and use "timestamp" as the column type, > drill fails to read the hive table through the hive storage plugin. > Implementation: > Added an int96-to-timestamp converter for both parquet readers, controlled by > the system / session option "store.parquet.int96_as_timestamp". > The option is false by default so that old query scripts that use the > "convert_from TIMESTAMP_IMPALA" function continue to work. > When the option is true, using that function is unnecessary and can cause > the query to fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4373) Drill and Hive have incompatible timestamp representations in parquet
[ https://issues.apache.org/jira/browse/DRILL-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15610053#comment-15610053 ] Karthikeyan Manivannan commented on DRILL-4373: --- Looks good to me. > Drill and Hive have incompatible timestamp representations in parquet > - > > Key: DRILL-4373 > URL: https://issues.apache.org/jira/browse/DRILL-4373 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Hive, Storage - Parquet >Affects Versions: 1.8.0 >Reporter: Rahul Challapalli >Assignee: Karthikeyan Manivannan > Labels: doc-impacting > Fix For: 1.9.0 > > > git.commit.id.abbrev=83d460c > I created a parquet file with a timestamp type using Drill. Now if I define a > hive table on top of the parquet file and use "timestamp" as the column type, > drill fails to read the hive table through the hive storage plugin. > Implementation: > Added an int96-to-timestamp converter for both parquet readers, controlled by > the system / session option "store.parquet.int96_as_timestamp". > The option is false by default so that old query scripts that use the > "convert_from TIMESTAMP_IMPALA" function continue to work. > When the option is true, using that function is unnecessary and can cause > the query to fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)