[GitHub] drill pull request #1098: Initialize POM
Github user QubitPi closed the pull request at: https://github.com/apache/drill/pull/1098 ---
[GitHub] drill pull request #1098: Initialize POM
GitHub user QubitPi opened a pull request: https://github.com/apache/drill/pull/1098 Initialize POM You can merge this pull request into a Git repository by running: $ git pull https://github.com/QubitPi/drill implement-druid-storage-plugin Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1098.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1098 commit 6c3bb6242128c9ea1eecd191628eb547f722ea55 Author: Jiaqi Liu <2257440489@...> Date: 2018-01-23T06:14:13Z Initialize POM ---
Drill Hangout Jan 23, 2018
We will have our bi-weekly hangout tomorrow at 10 am PST. Please reply to this post with proposed topics to discuss. Hangout link: https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc As a reference, please see the minutes of the last hangout here: https://lists.apache.org/thread.html/24a0e0c1f42b67d9782375a515660e763031b618022322c5f168d8ce@%3Cdev.drill.apache.org%3E Thanks, Pritesh
[GitHub] drill pull request #1059: DRILL-5851: Empty table during a join operation wi...
Github user HanumathRao commented on a diff in the pull request: https://github.com/apache/drill/pull/1059#discussion_r163122636 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/HashJoinProbeTemplate.java --- @@ -136,7 +136,9 @@ public void executeProbePhase() throws SchemaChangeException { case OK_NEW_SCHEMA: if (probeBatch.getSchema().equals(probeSchema)) { doSetup(outgoingJoinBatch.getContext(), buildBatch, probeBatch, outgoingJoinBatch); - hashTable.updateBatches(); + if (hashTable != null) { + hashTable.updateBatches(); + } --- End diff -- @Ben-Zvi Thanks Boaz for the quick review. I made the required changes and had a clean run of the precommit tests. Just to keep track of these failures, here are the clean runs on the current commit:
PASS (5.431 s) /root/drillAutomation/framework-master/framework/resources/Functional/schema_change_empty_batch/maprdb/binary_maprdb/emptyMaprDBLeftJoin.sql (connection: 1289554899) (queryID: 25997883-c4c3-afee-bd18-4907353109cd)
PASS (6.403 s) /root/drillAutomation/framework-master/framework/resources/Functional/schema_change_empty_batch/hbase/emptyHbaseLeftJoin.sql (connection: 1037784189) (queryID: 2599792c-e9c0-304e-a82e-ff9438ed3f5b)
PASS (5.392 s) /root/drillAutomation/framework-master/framework/resources/Functional/schema_change_empty_batch/hbase/emptyHbaseRightJoin.sql (connection: 1697676429) (queryID: 259979de-0f7a-f69e-9fcb-3b70f8eca3df)
PASS (5.515 s) /root/drillAutomation/framework-master/framework/resources/Functional/schema_change_empty_batch/maprdb/binary_hbase/emptyMaprDBRightJoin.sql (connection: 1143786034) (queryID: 2599790d-5a7f-d737-a3dd-83baf15dcf94)
---
[GitHub] drill issue #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout(int)
Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/1024 Did a rebase on the latest master to resolve the merge conflict from DRILL-3993 [commit](https://github.com/apache/drill/commit/9fabb612f16f6f541b3bde68ad7d734cad26df33#diff-f5de5223afdaaec6d009c4e06015e34d) that upgraded to Calcite 1.13. ---
[GitHub] drill issue #1096: DRILL-6099 : Push limit past flatten(project) without pus...
Github user chunhui-shi commented on the issue: https://github.com/apache/drill/pull/1096 Once all tests are done, I think it is fine to add the 'ready-to-commit' label to the JIRA. ---
[GitHub] drill issue #1096: DRILL-6099 : Push limit past flatten(project) without pus...
Github user chunhui-shi commented on the issue: https://github.com/apache/drill/pull/1096 +1 ---
[GitHub] drill pull request #1090: DRILL-6080: Sort incorrectly limits batch size to ...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1090#discussion_r163093794 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/xsort/managed/TestSortImpl.java --- @@ -466,10 +469,10 @@ public void runLargeSortTest(OperatorFixture fixture, DataGenerator dataGen, public void runJumboBatchTest(OperatorFixture fixture, int rowCount) { timer.reset(); -DataGenerator dataGen = new DataGenerator(fixture, rowCount, Character.MAX_VALUE); -DataValidator validator = new DataValidator(rowCount, Character.MAX_VALUE); +DataGenerator dataGen = new DataGenerator(fixture, rowCount, ValueVector.MAX_ROW_COUNT); +DataValidator validator = new DataValidator(rowCount, ValueVector.MAX_ROW_COUNT); runLargeSortTest(fixture, dataGen, validator); -System.out.println(timer.elapsed(TimeUnit.MILLISECONDS)); +//System.out.println(timer.elapsed(TimeUnit.MILLISECONDS)); --- End diff -- Please remove or convert to logging. ---
[GitHub] drill pull request #1090: DRILL-6080: Sort incorrectly limits batch size to ...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1090#discussion_r162771968 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/selection/SelectionVector4.java --- @@ -31,8 +31,9 @@ private int length; public SelectionVector4(ByteBuf vector, int recordCount, int batchRecordCount) throws SchemaChangeException { -if (recordCount > Integer.MAX_VALUE /4) { - throw new SchemaChangeException(String.format("Currently, Drill can only support allocations up to 2gb in size. You requested an allocation of %d bytes.", recordCount * 4)); +if (recordCount > Integer.MAX_VALUE / 4) { + throw new SchemaChangeException(String.format("Currently, Drill can only support allocations up to 2gb in size. " + + "You requested an allocation of %d bytes.", recordCount * 4)); --- End diff -- My understanding is that Java will use `int` to compute `recordCount * 4` and overflow. ---
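vrozov's overflow point can be reproduced directly: in Java, `recordCount * 4` is evaluated in 32-bit `int` arithmetic before any widening to `long`, so a count just past `Integer.MAX_VALUE / 4` wraps negative. A minimal, self-contained sketch (the method names are illustrative, not Drill's actual fields):

```java
public class OverflowDemo {
    // Byte size computed the buggy way: 32-bit multiply, then widened to long.
    static long buggyByteCount(int recordCount) {
        return recordCount * 4;
    }

    // Byte size computed safely: promote to long before multiplying.
    static long safeByteCount(int recordCount) {
        return recordCount * 4L;
    }

    public static void main(String[] args) {
        int recordCount = Integer.MAX_VALUE / 4 + 1;     // just past the guard
        System.out.println(buggyByteCount(recordCount)); // -2147483648 (overflowed)
        System.out.println(safeByteCount(recordCount));  // 2147483648 (true byte count)
    }
}
```

This also means the `%d` in the exception message itself prints the overflowed value; using `recordCount * 4L` there would report the intended byte count.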
[GitHub] drill pull request #1090: DRILL-6080: Sort incorrectly limits batch size to ...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1090#discussion_r162772482 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/selection/SelectionVector4.java --- @@ -100,8 +101,8 @@ public boolean next() { return false; } -start = start+length; -int newEnd = Math.min(start+length, recordCount); +start = start + length; --- End diff -- This code does not look right to me. It tries to enforce the invariant that `start + length <= recordCount`, but based on the check on line 96 the invariant is not enforced in other places, so it is not clear why it needs to be enforced here. If the invariant does need to be enforced, would it be better to use: ``` start += length; length = Math.min(length, recordCount - start); ``` ---
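The clamping variant vrozov suggests can be sanity-checked in isolation. This is a toy walk-through, not Drill's actual `SelectionVector4`: it pages through `recordCount` records and shows the final window shrinking so `start + length` never exceeds `recordCount`.

```java
public class Sv4WindowDemo {
    // Advance one window using the suggested form:
    //   start += length;
    //   length = Math.min(length, recordCount - start);
    // Returns {newStart, newLength}; newLength == 0 means iteration is done.
    static int[] next(int start, int length, int recordCount) {
        start += length;
        length = Math.min(length, recordCount - start);
        return new int[] {start, length};
    }

    public static void main(String[] args) {
        int recordCount = 10, start = 0, length = 4;
        while (length > 0) {
            System.out.println("window [" + start + ", " + (start + length) + ")");
            int[] w = next(start, length, recordCount);
            start = w[0];
            length = w[1];
        }
        // Windows are [0, 4), [4, 8), [8, 10): the last is clamped to 2 records.
    }
}
```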
[GitHub] drill pull request #1090: DRILL-6080: Sort incorrectly limits batch size to ...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1090#discussion_r163093639 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/xsort/managed/TestSortImpl.java --- @@ -466,10 +469,10 @@ public void runLargeSortTest(OperatorFixture fixture, DataGenerator dataGen, public void runJumboBatchTest(OperatorFixture fixture, int rowCount) { timer.reset(); -DataGenerator dataGen = new DataGenerator(fixture, rowCount, Character.MAX_VALUE); -DataValidator validator = new DataValidator(rowCount, Character.MAX_VALUE); +DataGenerator dataGen = new DataGenerator(fixture, rowCount, ValueVector.MAX_ROW_COUNT); --- End diff -- Would it be better to use ``` DataGenerator dataGen = new DataGenerator(fixture, rowCount, Integer.MAX_VALUE); DataValidator validator = new DataValidator(rowCount, Integer.MAX_VALUE); ``` My understanding is that the idea is to use the max batch size in the test. ---
[jira] [Created] (DRILL-6103) lsb_release: command not found
Chunhui Shi created DRILL-6103: -- Summary: lsb_release: command not found Key: DRILL-6103 URL: https://issues.apache.org/jira/browse/DRILL-6103 Project: Apache Drill Issue Type: Bug Reporter: Chunhui Shi Got this error when running drillbit.sh:
$ bin/drillbit.sh restart
bin/drill-config.sh: line 317: lsb_release: command not found
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
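The error means `drill-config.sh` invokes `lsb_release` unconditionally on a host where the lsb-release package is not installed. A hedged sketch of the usual guard, not the actual Drill fix (the real script's logic at line 317 may differ): probe for the command first and fall back to `/etc/os-release`.

```shell
#!/usr/bin/env bash
# Guarded distro lookup: only call lsb_release when it exists on the PATH;
# otherwise fall back to /etc/os-release (NAME/VERSION_ID per the os-release
# spec); otherwise report "unknown" instead of failing the whole script.
detect_distro() {
  if command -v lsb_release >/dev/null 2>&1; then
    lsb_release -irs | tr '\n' ' '
  elif [ -r /etc/os-release ]; then
    . /etc/os-release
    printf '%s %s' "$NAME" "$VERSION_ID"
  else
    printf 'unknown'
  fi
}

detect_distro
```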
[GitHub] drill issue #1087: DRILL-6079: Attempt to fix memory leak in Parquet
Github user parthchandra commented on the issue: https://github.com/apache/drill/pull/1087 +1 Thanks Salim. ---
[jira] [Resolved] (DRILL-5998) Queue information of queries which fail due to queue time out not shown
[ https://issues.apache.org/jira/browse/DRILL-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasad Nagaraj Subramanya resolved DRILL-5998.
--
Resolution: Cannot Reproduce

> Queue information of queries which fail due to queue time out not shown
> ---
>
> Key: DRILL-5998
> URL: https://issues.apache.org/jira/browse/DRILL-5998
> Project: Apache Drill
> Issue Type: Bug
> Components: Client - HTTP
> Affects Versions: 1.11.0
> Reporter: Prasad Nagaraj Subramanya
> Assignee: Prasad Nagaraj Subramanya
> Priority: Major
>
> When a query fails because of queue time out, the queue information is not
> shown in the web UI

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] drill pull request #1054: DRILL-5998: Set queue name even when exception occ...
Github user prasadns14 closed the pull request at: https://github.com/apache/drill/pull/1054 ---
[GitHub] drill issue #1054: DRILL-5998: Set queue name even when exception occurs
Github user prasadns14 commented on the issue: https://github.com/apache/drill/pull/1054 @arina-ielchiieva , yes this is resolved by DRILL-5963 ---
[jira] [Created] (DRILL-6102) CurrentModification Exception in BaseAllocator when debugging
Timothy Farkas created DRILL-6102: - Summary: CurrentModification Exception in BaseAllocator when debugging Key: DRILL-6102 URL: https://issues.apache.org/jira/browse/DRILL-6102 Project: Apache Drill Issue Type: Improvement Reporter: Timothy Farkas Assignee: Timothy Farkas The following ConcurrentModificationException was observed:
{code:java}
Running org.apache.drill.TestTpchDistributed#tpch19_1
11:45:47.482 [main] ERROR o.a.d.exec.server.BootStrapContext - Pool did not terminate
11:45:47.486 [main] ERROR o.a.d.exec.server.BootStrapContext - Error while closing
java.util.ConcurrentModificationException: null
	at java.util.IdentityHashMap$IdentityHashMapIterator.nextIndex(IdentityHashMap.java:732) ~[na:1.7.0_141]
	at java.util.IdentityHashMap$KeyIterator.next(IdentityHashMap.java:822) ~[na:1.7.0_141]
	at org.apache.drill.exec.memory.BaseAllocator.print(BaseAllocator.java:751) ~[drill-memory-base-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
	at org.apache.drill.exec.memory.BaseAllocator.print(BaseAllocator.java:747) ~[drill-memory-base-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
	at org.apache.drill.exec.memory.BaseAllocator.print(BaseAllocator.java:747) ~[drill-memory-base-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
	at org.apache.drill.exec.memory.BaseAllocator.toString(BaseAllocator.java:542) ~[drill-memory-base-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
	at org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:495) ~[drill-memory-base-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
	at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) [drill-common-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
	at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) [drill-common-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
	at org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:256) ~[classes/:na]
	at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) [drill-common-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
	at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) [drill-common-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
	at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:254) [classes/:na]
	at org.apache.drill.test.BaseTestQuery.closeClient(BaseTestQuery.java:311) [test-classes/:1.13.0-SNAPSHOT]
	at sun.reflect.GeneratedMethodAccessor262.invoke(Unknown Source) ~[na:na]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_141]
	at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_141]
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) [junit-4.11.jar:na]
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) [junit-4.11.jar:na]
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) [junit-4.11.jar:na]
	at mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.invokeExplosively(JUnit4TestRunnerDecorator.java:44) [jmockit-1.3.jar:na]
	at mockit.integration.junit4.internal.MockFrameworkMethod.invokeExplosively(MockFrameworkMethod.java:29) [jmockit-1.3.jar:na]
	at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source) ~[na:na]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_141]
	at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_141]
	at mockit.internal.util.MethodReflection.invokeWithCheckedThrows(MethodReflection.java:95) [jmockit-1.3.jar:na]
	at mockit.internal.annotations.MockMethodBridge.callMock(MockMethodBridge.java:76) [jmockit-1.3.jar:na]
	at mockit.internal.annotations.MockMethodBridge.invoke(MockMethodBridge.java:41) [jmockit-1.3.jar:na]
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java) [junit-4.11.jar:na]
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33) [junit-4.11.jar:na]
	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) [junit-4.11.jar:na]
	at org.junit.rules.RunRules.evaluate(RunRules.java:20) [junit-4.11.jar:na]
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309) [junit-4.11.jar:na]
	at org.junit.runners.Suite.runChild(Suite.java:127) [junit-4.11.jar:na]
	at org.junit.runners.Suite.runChild(Suite.java:26) [junit-4.11.jar:na]
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) [junit-4.11.jar:na]
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) [junit-4.11.jar:na]
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) [junit-4.11.jar:na]
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) [junit-4.11.jar:na]
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) [junit-4.11.jar:na]
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309) [junit-4.11.jar:na]
	at org.junit.runner.JUnitCore.run(JUnitCore.java:160) [junit-4.11.jar:n
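The trace points at `BaseAllocator.print` walking an `IdentityHashMap` of child allocators while the map is structurally modified. That failure mode is easy to reproduce in isolation; the sketch below is a generic illustration (the `children` map and its keys are made up, not Drill's allocator state), along with the conventional snapshot-the-keys fix.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.IdentityHashMap;
import java.util.Map;

public class CmeDemo {
    // Removing from the map while its own iterator is live triggers the
    // same fail-fast ConcurrentModificationException seen in the JIRA.
    static boolean mutateWhileIterating(Map<String, Integer> children) {
        try {
            for (String child : children.keySet()) {
                children.remove(child);   // structural change invalidates the iterator
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> children = new IdentityHashMap<>();
        children.put(new String("child-1"), 1);
        children.put(new String("child-2"), 2);
        children.put(new String("child-3"), 3);
        System.out.println(mutateWhileIterating(children));

        // One conventional fix: iterate over a snapshot of the keys.
        children.put(new String("child-4"), 4);
        for (String child : new ArrayList<>(children.keySet())) {
            children.remove(child);       // safe: the copy is not backed by the map
        }
        System.out.println(children.isEmpty());   // true
    }
}
```

In the allocator itself the mutation presumably comes from a child closing on another path while `print` iterates, so the fix would be snapshotting (or synchronizing) rather than removing inside the loop.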
[GitHub] drill pull request #1096: DRILL-6099 : Push limit past flatten(project) with...
Github user chunhui-shi commented on a diff in the pull request: https://github.com/apache/drill/pull/1096#discussion_r163048747 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/PlannerPhase.java --- @@ -341,7 +346,7 @@ static RuleSet getPruneScanRules(OptimizerRulesContext optimizerRulesContext) { ParquetPruneScanRule.getFilterOnProjectParquet(optimizerRulesContext), ParquetPruneScanRule.getFilterOnScanParquet(optimizerRulesContext), DrillPushLimitToScanRule.LIMIT_ON_SCAN, -DrillPushLimitToScanRule.LIMIT_ON_PROJECT +DrillPushLimitToScanRule.LIMIT_ON_PROJECT_SCAN --- End diff -- Not sure if we still need "limit_on_project_scan". In theory, limit_on_project and limit_on_scan should already cover all the cases. Have you tested with "limit_on_project_scan" disabled? ---
[GitHub] drill pull request #1059: DRILL-5851: Empty table during a join operation wi...
Github user Ben-Zvi commented on a diff in the pull request: https://github.com/apache/drill/pull/1059#discussion_r163045737 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/HashJoinProbeTemplate.java --- @@ -136,7 +136,9 @@ public void executeProbePhase() throws SchemaChangeException { case OK_NEW_SCHEMA: if (probeBatch.getSchema().equals(probeSchema)) { doSetup(outgoingJoinBatch.getContext(), buildBatch, probeBatch, outgoingJoinBatch); - hashTable.updateBatches(); + if (hashTable != null) { +hashTable.updateBatches(); + } --- End diff -- Fine. ---
[jira] [Created] (DRILL-6101) Optimize Implicit Columns Processing
salim achouche created DRILL-6101: - Summary: Optimize Implicit Columns Processing Key: DRILL-6101 URL: https://issues.apache.org/jira/browse/DRILL-6101 Project: Apache Drill Issue Type: Improvement Components: Execution - Relational Operators Reporter: salim achouche Assignee: salim achouche
Problem Description -
* Apache Drill allows users to specify columns even for SELECT STAR queries
* From my discussion with [~paul-rogers], Apache Calcite has a limitation where the extra columns are not provided
* The workaround has been to always include all implicit columns for SELECT STAR queries
* Unfortunately, the current implementation is very inefficient as implicit column values get duplicated; this leads to substantial performance degradation when the number of rows is large
Suggested Optimization -
* The NullableVarChar vector should be enhanced to efficiently store duplicate values
* This will not only address the current Calcite limitation (for SELECT STAR queries) but also optimize all queries with implicit columns
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
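The duplicate-value storage the JIRA proposes amounts to run-length encoding: an implicit column such as the source filename repeats one value for millions of rows, so storing (value, count) runs collapses it to almost nothing. The sketch below is a toy list-based model of that idea, not Drill's vector code (the real enhancement would operate on NullableVarChar's underlying buffers):

```java
import java.util.ArrayList;
import java.util.List;

public class RleColumnDemo {
    // Toy run-length encoded column: parallel lists of run values and counts.
    private final List<String> values = new ArrayList<>();
    private final List<Integer> counts = new ArrayList<>();
    private int rowCount;

    void append(String value) {
        int last = values.size() - 1;
        if (last >= 0 && values.get(last).equals(value)) {
            counts.set(last, counts.get(last) + 1);   // extend the current run
        } else {
            values.add(value);                        // start a new run
            counts.add(1);
        }
        rowCount++;
    }

    String get(int row) {
        for (int i = 0, seen = 0; i < values.size(); i++) {
            seen += counts.get(i);
            if (row < seen) {
                return values.get(i);
            }
        }
        throw new IndexOutOfBoundsException("row " + row);
    }

    int runCount() { return values.size(); }
    int rowCount() { return rowCount; }

    public static void main(String[] args) {
        RleColumnDemo filename = new RleColumnDemo();
        // An implicit column repeats the same value for every row of one file;
        // a million appends collapse into a single stored run.
        for (int i = 0; i < 1_000_000; i++) {
            filename.append("part-0001.parquet");
        }
        System.out.println(filename.rowCount() + " rows in " + filename.runCount() + " run(s)");
    }
}
```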
[GitHub] drill pull request #1057: DRILL-5993 Add Generic Copiers With Append Methods
Github user ilooner commented on a diff in the pull request: https://github.com/apache/drill/pull/1057#discussion_r163028998 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/svremover/GenericSV4Copier.java --- @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.physical.impl.svremover; + +import org.apache.drill.exec.exception.SchemaChangeException; +import org.apache.drill.exec.ops.FragmentContext; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.record.VectorWrapper; +import org.apache.drill.exec.vector.SchemaChangeCallBack; +import org.apache.drill.exec.vector.ValueVector; + +public class GenericSV4Copier extends AbstractSV4Copier { + + @Override + public void setup(RecordBatch incoming, VectorContainer outgoing) throws SchemaChangeException { +super.setup(incoming, outgoing); --- End diff -- The code is similar but not identical. For **GenericSV4Copier** **vvIn** is a 2D array of type **ValueVector[][]**. For **GenericSV2Copier** **vvIn** is a 1D array of type **ValueVector[]**. 
Since the types are different, I can't move the code into **AbstractCopier**. But I will move the setup code in **GenericSV2Copier** to **AbstractSV2Copier**, since that code will likely be the same for different implementations. I'll also do the same with the **GenericSV4Copier** and **AbstractSV4Copier** classes. ---
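ilooner's point — the 1D vs 2D shapes of `vvIn` force the shared setup into the per-mode abstract classes rather than a common `AbstractCopier` — can be illustrated with simplified, hypothetical types (the `Vec` stand-in and method signatures below are invented for the sketch, not Drill's actual API):

```java
public class CopierHierarchyDemo {
    // Stand-in for Drill's ValueVector; purely illustrative.
    static class Vec { }

    // Shape-independent behaviour can live at the top of the hierarchy.
    abstract static class AbstractCopier {
        abstract int columnCount();
    }

    // SV2 mode: one vector per column, so vvIn is 1D.
    abstract static class AbstractSV2Copier extends AbstractCopier {
        protected Vec[] vvIn;
        void setup(int columns) {               // shared by all SV2 implementations
            vvIn = new Vec[columns];
        }
        int columnCount() { return vvIn.length; }
    }

    // SV4 mode: a set of batches per column, so vvIn is 2D.
    abstract static class AbstractSV4Copier extends AbstractCopier {
        protected Vec[][] vvIn;
        void setup(int columns, int batches) {  // shared by all SV4 implementations
            vvIn = new Vec[columns][batches];
        }
        int columnCount() { return vvIn.length; }
    }

    // Concrete copiers inherit the shared setup instead of duplicating it.
    static class GenericSV2Copier extends AbstractSV2Copier { }
    static class GenericSV4Copier extends AbstractSV4Copier { }

    public static void main(String[] args) {
        GenericSV2Copier sv2 = new GenericSV2Copier();
        sv2.setup(3);
        GenericSV4Copier sv4 = new GenericSV4Copier();
        sv4.setup(3, 16);
        System.out.println(sv2.columnCount() + " " + sv4.columnCount());
    }
}
```

Because the field types `Vec[]` and `Vec[][]` differ, the allocation cannot be hoisted any higher than the per-mode abstract class, which is exactly the refactor described in the comment.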
[GitHub] drill issue #1059: DRILL-5851: Empty table during a join operation with a no...
Github user HanumathRao commented on the issue: https://github.com/apache/drill/pull/1059 @amansinha100 These issues might not be because of this PR's changes; I think they are random issues showing up in my branch. I heard from @Ben-Zvi that he also had some issues with an empty table in his private branch. Anyway, I have made a change to check that hashTable is not null, and now all the tests are passing. I will update this PR once I talk to @Ben-Zvi about the issue he saw in his private branch. ---
[GitHub] drill pull request #1057: DRILL-5993 Add Generic Copiers With Append Methods
Github user ilooner commented on a diff in the pull request: https://github.com/apache/drill/pull/1057#discussion_r163025549 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/BatchSchema.java --- @@ -60,6 +60,10 @@ public SelectionVectorMode getSelectionVectorMode() { return selectionVectorMode; } + public BatchSchema toMode(SelectionVectorMode mode) { --- End diff -- removed ---
[GitHub] drill issue #1085: DRILL-6049: Misc. hygiene and code cleanup changes
Github user amansinha100 commented on the issue: https://github.com/apache/drill/pull/1085 @paul-rogers since the DRILL-3993 (Calcite related changes) went into master last week, this PR would need to be rebased. I can merge it in soon after that. ---
[GitHub] drill issue #1084: DRILL-5868: Support SQL syntax highlighting of queries
Github user amansinha100 commented on the issue: https://github.com/apache/drill/pull/1084 @kkhatua I saw some check style errors when applying the patch:
.git/rebase-apply/patch:215: trailing whitespace.
.git/rebase-apply/patch:396: trailing whitespace.
.git/rebase-apply/patch:478: trailing whitespace. //[Cannot supported due to space]
warning: 3 lines add whitespace errors.
Can you rebase on the latest master and fix up? ---