[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1082#discussion_r160606654 --- Diff: distribution/src/resources/drill-config.sh --- @@ -180,18 +251,46 @@ else fi fi -# Default memory settings if none provided by the environment or +# Checking if being executed in context of Drillbit and not SQLLine +if [ "$DRILLBIT_CONTEXT" == "1" ]; then + # *-auto.sh allows for distrib/user specific checks to be done + distribAuto="$DRILL_CONF_DIR/distrib-auto.sh" + if [ ! -r "$distribAuto" ]; then distribAuto="$DRILL_HOME/conf/distrib-auto.sh"; fi + if [ ! -r "$distribAuto" ]; then distribAuto=""; fi + drillAuto="$DRILL_CONF_DIR/drill-auto.sh" + if [ ! -r "$drillAuto" ]; then drillAuto="$DRILL_HOME/conf/drill-auto.sh"; fi --- End diff -- Agreed. The drill-config file follows this for `drill-env.sh`, but checks for both locations in case of `distrib-env.sh`. ---
[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1082#discussion_r160606258 --- Diff: distribution/src/resources/distrib-auto.sh --- @@ -0,0 +1,223 @@ +#!/usr/bin/env bash --- End diff -- Wasn't sure which would be a suitable place. I reasoned the "Apache" distribution is what we are defining it for, hence the choice. ---
[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1082#discussion_r160606013 --- Diff: distribution/src/assemble/bin.xml --- @@ -345,6 +345,16 @@ 0755 conf + + src/resources/drill-auto.sh --- End diff -- AutoConfig was the intent, but looked a bit verbose. How about `drill-autocheck.sh` ? :) ---
[GitHub] drill issue #1087: Attempt to fix memory leak in Parquet
Github user sachouche commented on the issue: https://github.com/apache/drill/pull/1087 @parthchandra can you please review this fix? ---
[GitHub] drill pull request #1087: Attempt to fix memory leak in Parquet
GitHub user sachouche opened a pull request: https://github.com/apache/drill/pull/1087 Attempt to fix memory leak in Parquet
**Problem Description**
This is an extremely rare leak which I was able to emulate by putting a sleep in the AsyncPageReader right after reading the page and before enqueuing the result in the result queue. This is how this issue could manifest itself in a real-life scenario:
- AsyncPageReader reads a page into a buffer but has not yet enqueued the result (thread got preempted)
- Parquet Scan thread blocked waiting on the task (Future object dequeued)
- Cancel received and Scan thread interrupted
- Future.get() returns (Future object is lost)
- Scan thread executes release logic
- Scan thread is not able to interrupt the AsyncPageReader thread since the Future object is lost
- AsyncPageReader thread resumes and enqueues the DrillBuf in the result queue
- This results in a leak since this buffer is not properly released
**Fix Description**
- The fix is straightforward: we peek the Future object during the blocking get() method
- This way, an exception (such as an interrupt) will leave the Future object in the task queue
- The cleanup logic will be able to guarantee the DrillBuf object is either GCed by the AsyncPageReader or ParquetScan thread
You can merge this pull request into a Git repository by running: $ git pull https://github.com/sachouche/drill DRILL-6079 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1087.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1087 commit 52030d1d9cc3b8992a10ade8c7126d66e785043a Author: Salim Achouche Date: 2017-12-22T19:50:56Z Attempt to fix memory leak in Parquet ---
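The peek-before-get idea described in the fix above can be sketched in plain Java. This is a hypothetical illustration (the class and method names are invented and are not Drill's actual AsyncPageReader code): the Future stays in the task queue while the consumer blocks on `get()`, so an interrupt during the wait leaves the Future visible to the cleanup path, which can then still cancel the task and release its buffer.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PeekThenGet {
    // Block on the head task WITHOUT removing it first; dequeue only after
    // get() returns normally. If get() throws (e.g. InterruptedException),
    // the Future remains queued for the cleanup logic to find and cancel.
    public static <T> T takeResult(Queue<Future<T>> tasks)
            throws InterruptedException, ExecutionException {
        Future<T> next = tasks.peek();   // do NOT dequeue yet
        T result = next.get();           // may throw; Future stays in the queue
        tasks.remove();                  // dequeue only on success
        return result;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Queue<Future<Integer>> tasks = new ArrayDeque<>();
        tasks.add(pool.submit(() -> 42));
        System.out.println(takeResult(tasks));  // prints 42; queue is now empty
        pool.shutdown();
    }
}
```

The contrast with the buggy sequence is that a plain `tasks.poll().get()` loses the Future the moment `get()` throws, which is exactly the window the PR description identifies.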
[GitHub] drill issue #1001: JIRA DRILL-5879: Like operator performance improvements
Github user priteshm commented on the issue: https://github.com/apache/drill/pull/1001 @sachouche can you update this PR? ---
Re: Parquet MR version
Support for byte buffers aka "PARQUET-77: ByteBuffer use in read and write paths" is included in both the 1.8.2 and 1.9.0 parquet-mr releases. Based on the following discussion https://lists.apache.org/thread.html/e1a3d1e49d3b775cb5e1417c882dad7bd2f038e00237ff2a9057c608@1454714357@%3Cdev.drill.apache.org%3E Drill should move to the released version of parquet-mr. I'll file a JIRA and summarize my findings. Thank you, Vlad
On 1/5/18 02:23, salim achouche wrote: I don’t think it will be a lot of work to do so as the main changes (from the Parquet branch) are related to passing direct buffers vs using the Java heap. FYI - I would also think that we should contribute to the Parquet code base so that an abstraction is used on top of byte arrays (e.g., ByteBuffer) so that Drill will not need to use a private Parquet branch. I guess the key is to figure out why such an approach was not followed in the first place. Regards, Salim
On Jan 5, 2018, at 2:37 AM, Vlad Rozov wrote: Hi everyone, With parquet-mr 1.9.0 released more than 1 year ago and parquet-mr 1.8.2 almost a year ago, should the drill dependency on parquet-mr be updated from the current custom 1.8.1-drill-r0? As far as I can tell both 1.8.2 and 1.9.0 have fixes from the 1.8.1-drill-r0 patch. Will the community prefer 1.9.0 over 1.8.2 even if 1.9.0 may introduce some incompatibilities? Thank you, Vlad
[GitHub] drill issue #367: DRILL-4364: Image Metadata Format Plugin
Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/367 @nagix Did you get a chance to rebase this commit? ---
[GitHub] drill issue #456: DRILL-4566: TDigest, median, and quantile functions
Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/456 Is anyone reviewing this? ---
[jira] [Created] (DRILL-6079) Memory leak caused by ParquetRowGroupScan
salim achouche created DRILL-6079: - Summary: Memory leak caused by ParquetRowGroupScan Key: DRILL-6079 URL: https://issues.apache.org/jira/browse/DRILL-6079 Project: Apache Drill Issue Type: Bug Components: Storage - Parquet Reporter: salim achouche Assignee: salim achouche Priority: Minor Concurrency tests with assertion enabled indicate a memory leak in the Parquet scanner code: 2017-10-25 17:28:52,149 [260ed3bc-dbdf-8b4a-66d6-b2bd804e8c74:frag:11:63] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (2097152) Allocator(op:11:63:6:ParquetRowGroupScan) 100/2097152/7348224/100 (res/actual/peak/limit) Fragment 11:63 [Error Id: 5927becb-f000-43db-95b5-93b33064a6fd on mperf113.qa.lab:31010] org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (2097152) Allocator(op:11:63:6:ParquetRowGroupScan) 100/2097152/7348224/100 (res/actual/peak/limit) Fragment 11:63 [Error Id: 5927becb-f000-43db-95b5-93b33064a6fd on mperf113.qa.lab:31010] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586) ~[drill-common-1.11.0-mapr.jar:1.11.0-mapr] at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:301) [drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr] at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) [drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr] at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267) [drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr] at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.11.0-mapr.jar:1.11.0-mapr] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_121] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121] at 
java.lang.Thread.run(Thread.java:745) [na:1.8.0_121] Caused by: java.lang.IllegalStateException: Memory was leaked by query. Memory leaked: (2097152) Allocator(op:11:63:6:ParquetRowGroupScan) 100/2097152/7348224/100 (res/actual/peak/limit) at org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:520) ~[drill-memory-base-1.11.0-mapr.jar:1.11.0-mapr] at org.apache.drill.exec.ops.AbstractOperatorExecContext.close(AbstractOperatorExecContext.java:86) ~[drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr] at org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:108) ~[drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr] at org.apache.drill.exec.ops.FragmentContext.suppressingClose(FragmentContext.java:435) ~[drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr] at org.apache.drill.exec.ops.FragmentContext.close(FragmentContext.java:424) ~[drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr] at org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:324) [drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr] at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:155) [drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr] ... 5 common frames omitted 2017-10-25 17:28:52,149 [260ed3bc-dbdf-8b4a-66d6-b2bd804e8c74:frag:6:0] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (2097152) Allocator(op:6:0:5:ParquetRowGroupScan) 100/2097152/19947520/100 (res/actual/peak/limit) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] drill pull request #1059: DRILL-5851: Empty table during a join operation wi...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1059#discussion_r160542798 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestNestedLoopJoin.java --- @@ -333,4 +339,55 @@ public void testNlJoinWithLargeRightInputSuccess() throws Exception { test(RESET_JOIN_OPTIMIZATION); } } + + private static void buildFile(String fileName, String[] data, File testDir) throws IOException { +try (PrintWriter out = new PrintWriter(new FileWriter(new File(testDir, fileName)))) { + for (String line : data) { +out.println(line); + } +} + } + + public static void testWithEmptyJoin(File testDir, String joinType, +String joinPattern, long result) throws Exception { +buildFile("dept.json", new String[0], testDir); +String query = String.format(testEmptyJoin, joinType); +testPlanMatchingPatterns(query, new String[]{joinPattern}, new String[]{}); +testBuilder() +.sqlQuery(query) +.unOrdered() +.baselineColumns("cnt") +.baselineValues(result) +.build().run(); + } + + @Test + public void testNestedLeftJoinWithEmptyTable() throws Exception { +try { + test(DISABLE_NLJ_SCALAR); + testWithEmptyJoin(dirTestWatcher.getRootDir(), "left outer", nlpattern, 1155L); +} finally { + test(RESET_HJ); --- End diff -- Please use `client.resetSessionOption(...)`. ---
[GitHub] drill pull request #1059: DRILL-5851: Empty table during a join operation wi...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1059#discussion_r160542923 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestNestedLoopJoin.java --- @@ -333,4 +339,55 @@ public void testNlJoinWithLargeRightInputSuccess() throws Exception { test(RESET_JOIN_OPTIMIZATION); } } + + private static void buildFile(String fileName, String[] data, File testDir) throws IOException { +try (PrintWriter out = new PrintWriter(new FileWriter(new File(testDir, fileName)))) { + for (String line : data) { +out.println(line); + } +} + } + + public static void testWithEmptyJoin(File testDir, String joinType, +String joinPattern, long result) throws Exception { +buildFile("dept.json", new String[0], testDir); +String query = String.format(testEmptyJoin, joinType); +testPlanMatchingPatterns(query, new String[]{joinPattern}, new String[]{}); +testBuilder() +.sqlQuery(query) +.unOrdered() +.baselineColumns("cnt") +.baselineValues(result) +.build().run(); + } + + @Test + public void testNestedLeftJoinWithEmptyTable() throws Exception { +try { + test(DISABLE_NLJ_SCALAR); --- End diff -- Please use `client.setSessionOption(...)` to make it clear what we are doing here. ---
[GitHub] drill pull request #1059: DRILL-5851: Empty table during a join operation wi...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1059#discussion_r160540854 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/AbstractRecordBatch.java --- @@ -228,4 +228,20 @@ public WritableBatch getWritableBatch() { public VectorContainer getOutgoingContainer() { throw new UnsupportedOperationException(String.format(" You should not call getOutgoingContainer() for class %s", this.getClass().getCanonicalName())); } + + public void drainStream(IterOutcome stream, int input, RecordBatch batch) { +if (stream == IterOutcome.OK_NEW_SCHEMA || stream == IterOutcome.OK) { + for (final VectorWrapper wrapper : batch) { +wrapper.getValueVector().clear(); + } + batch.kill(true); + stream = next(input, batch); + while (stream == IterOutcome.OK_NEW_SCHEMA || stream == IterOutcome.OK) { +for (final VectorWrapper wrapper : batch) { + wrapper.getValueVector().clear(); +} +stream = next(input, batch); + } --- End diff -- In one-on-one discussions it became clear that fixing this issue is out of the scope of this particular task. That's fine. Please just file a JIRA to track the potential problem so we don't forget about it. ---
[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1082#discussion_r160537845 --- Diff: distribution/src/resources/distrib-auto.sh --- @@ -0,0 +1,223 @@ +#!/usr/bin/env bash --- End diff -- In general, "distrib" files are meant to be added by Drill distributions, not by Drill itself. If we want to provide the memory checking in Drill, then it should go into some other file other than "distrib." ---
[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1082#discussion_r160538214 --- Diff: distribution/src/assemble/bin.xml --- @@ -345,6 +345,16 @@ 0755 conf + + src/resources/drill-auto.sh --- End diff -- "-auto"? Is this for the self-driving car feature of Drill? :-) Is this meant to be for startup-time environment checks? For other add-on processing? Perhaps we can pick a name that makes that a bit clearer. ---
[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1082#discussion_r160539584 --- Diff: distribution/src/resources/drill-config.sh --- @@ -180,18 +251,46 @@ else fi fi -# Default memory settings if none provided by the environment or +# Checking if being executed in context of Drillbit and not SQLLine +if [ "$DRILLBIT_CONTEXT" == "1" ]; then + # *-auto.sh allows for distrib/user specific checks to be done + distribAuto="$DRILL_CONF_DIR/distrib-auto.sh" + if [ ! -r "$distribAuto" ]; then distribAuto="$DRILL_HOME/conf/distrib-auto.sh"; fi + if [ ! -r "$distribAuto" ]; then distribAuto=""; fi + drillAuto="$DRILL_CONF_DIR/drill-auto.sh" + if [ ! -r "$drillAuto" ]; then drillAuto="$DRILL_HOME/conf/drill-auto.sh"; fi --- End diff -- This part is just a bit tricky. Drill- and distribution- specific files do, in fact, reside in `$DRILL_HOME/conf`. But, user files reside in a number of places. So, please use: - `$DRILL_HOME/conf` for `distrib-auto.sh` - `$DRILL_CONF_DIR` for user-supplied files (such as `drill-auto.sh`) ---
[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1082#discussion_r160539726 --- Diff: distribution/src/resources/drill-config.sh --- @@ -180,18 +251,46 @@ else fi fi -# Default memory settings if none provided by the environment or +# Checking if being executed in context of Drillbit and not SQLLine +if [ "$DRILLBIT_CONTEXT" == "1" ]; then + # *-auto.sh allows for distrib/user specific checks to be done + distribAuto="$DRILL_CONF_DIR/distrib-auto.sh" + if [ ! -r "$distribAuto" ]; then distribAuto="$DRILL_HOME/conf/distrib-auto.sh"; fi + if [ ! -r "$distribAuto" ]; then distribAuto=""; fi + drillAuto="$DRILL_CONF_DIR/drill-auto.sh" + if [ ! -r "$drillAuto" ]; then drillAuto="$DRILL_HOME/conf/drill-auto.sh"; fi + if [ ! -r "$drillAuto" ]; then drillAuto=""; fi + + # Enforcing checks in order (distrib-auto.sh , drill-auto.sh) + # (NOTE: A script is executed only if it has relevant executable lines) + if [ -n "$distribAuto" ] && [ $(executableLineCount $distribAuto) -gt 0 ]; then +. "$distribAuto" +if [ $? -gt 0 ]; then fatal_error "Aborting Drill Startup due failed checks from $distribAuto"; fi + fi + if [ -n "$drillAuto" ] && [ $(executableLineCount $drillAuto) -gt 0 ]; then --- End diff -- To make the code a bit simpler, touch each file once: * Check if it exists. * Invoke it if so. * Handle any errors that are reported. ---
[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...
Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1083#discussion_r160525590 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestParquetGroupScan.java --- @@ -56,65 +56,50 @@ private void prepareTables(final String tableName, boolean refreshMetadata) thro public void testFix4376() throws Exception { prepareTables("4376_1", true); -testBuilder() - .sqlQuery("SELECT COUNT(*) AS `count` FROM dfs.tmp.`4376_1/60*`") - .ordered() - .baselineColumns("count").baselineValues(1984L) - .go(); +int actualRecordCount = testSql("SELECT * FROM dfs.tmp.`4376_1/60*`"); +int expectedRecordCount = 1984; +assertEquals(String.format("Received unexpected number of rows in output: expected=%d, received=%s", --- End diff -- `expected=%d` -> `expected = %d`. Please correct here and in other cases. ---
[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...
Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1083#discussion_r160525146 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/BaseTestQuery.java --- @@ -119,6 +125,15 @@ public static void setupDefaultTestCluster() throws Exception { // turns on the verbose errors in tests // sever side stacktraces are added to the message before sending back to the client test("ALTER SESSION SET `exec.errors.verbose` = true"); +emptyDirCreating(); + } + + /** + * Creates an empty directory under dfs.root schema. + */ + private static void emptyDirCreating() { --- End diff -- Please rename -> `createEmptyDirectory`. ---
[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...
Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1083#discussion_r160526940 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetFormatPlugin.java --- @@ -250,20 +250,12 @@ private boolean metaDataFileExists(FileSystem fs, FileStatus dir) throws IOExcep } boolean isDirReadable(DrillFileSystem fs, FileStatus dir) { - Path p = new Path(dir.getPath(), ParquetFileWriter.PARQUET_METADATA_FILE); try { -if (fs.exists(p)) { - return true; -} else { - - if (metaDataFileExists(fs, dir)) { -return true; - } - List<FileStatus> statuses = DrillFileSystemUtil.listFiles(fs, dir.getPath(), false); - return !statuses.isEmpty() && super.isFileReadable(fs, statuses.get(0)); -} +// There should be at least one file, which is readable by Drill +List<FileStatus> statuses = DrillFileSystemUtil.listFiles(fs, dir.getPath(), false); +return !statuses.isEmpty() && super.isFileReadable(fs, statuses.get(0)); --- End diff -- Why did you remove the `metaDataFileExists(fs, dir)` check? `DrillFileSystemUtil` won't list dot files. Plus it seems you break the assumptions highlighted in the `isReadable` method above. ---
[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...
Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1083#discussion_r160524662 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/SchemalessScan.java --- @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to you under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.physical.base; + +import com.fasterxml.jackson.annotation.JsonTypeName; +import com.google.common.base.Preconditions; +import org.apache.drill.common.exceptions.ExecutionSetupException; +import org.apache.drill.common.expression.SchemaPath; +import org.apache.drill.exec.physical.PhysicalOperatorSetupException; +import org.apache.drill.exec.planner.logical.DynamicDrillTable; +import org.apache.drill.exec.proto.CoordinationProtos; + +import java.util.List; + +/** + * The type of scan operator, which allows to scan schemaless tables ({@link DynamicDrillTable} with null selection) + */ +@JsonTypeName("schemaless-scan") +public class SchemalessScan extends AbstractGroupScan implements SubScan { + + public SchemalessScan(String userName) { +super(userName); + } + + public SchemalessScan(AbstractGroupScan that) { +super(that); + } + + @Override + public void applyAssignments(List<CoordinationProtos.DrillbitEndpoint> endpoints) throws PhysicalOperatorSetupException { + } + + @Override + public SubScan getSpecificScan(int minorFragmentId) throws ExecutionSetupException { +return this; + } + + @Override + public int getMaxParallelizationWidth() { +return 1; + } + + @Override + public String getDigest() { +return toString(); --- End diff -- What value will `toString` return? ---
[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...
Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1083#discussion_r160525926 --- Diff: exec/java-exec/src/test/java/org/apache/drill/TestUnionAll.java --- @@ -1197,4 +1197,64 @@ public void testFieldWithDots() throws Exception { .baselineValues("1", "2", "1", null, "a") .go(); } -} \ No newline at end of file + + @Test + public void testUnionAllRightEmptyDir() throws Exception { --- End diff -- 1. Does union (aka union distinct) also work fine? 2. What about the case when both directories in union / union all are empty? ---
[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...
Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1083#discussion_r160525739 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/TestEmptyInputSql.java --- @@ -177,4 +177,33 @@ public void testQueryEmptyCsv() throws Exception { .run(); } + @Test + public void testEmptyDirectory() throws Exception { +final BatchSchema expectedSchema = new SchemaBuilder().build(); + +testBuilder() +.sqlQuery("select * from dfs.`%s`", EMPTY_DIR_NAME) +.schemaBaseLine(expectedSchema) +.build() +.run(); + } + + @Test + public void testEmptyDirectoryAndFieldInQuery() throws Exception { +final List<Pair<SchemaPath, TypeProtos.MajorType>> expectedSchema = Lists.newArrayList(); +final TypeProtos.MajorType majorType = TypeProtos.MajorType.newBuilder() +.setMinorType(TypeProtos.MinorType.INT) // field "key" is absent in schema-less table --- End diff -- schema-less -> schemaless ---
[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...
Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1083#discussion_r160524828 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/RefreshMetadataHandler.java --- @@ -78,19 +78,24 @@ public PhysicalPlan getPlan(SqlNode sqlNode) throws ValidationException, RelConv final Table table = schema.getTable(tableName); - if(table == null){ + if(table == null) { --- End diff -- Please add space before `if`, here and below. ---
[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...
Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1083#discussion_r160525303 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/BaseTestQuery.java --- @@ -119,6 +125,15 @@ public static void setupDefaultTestCluster() throws Exception { // turns on the verbose errors in tests // sever side stacktraces are added to the message before sending back to the client test("ALTER SESSION SET `exec.errors.verbose` = true"); +emptyDirCreating(); --- End diff -- Why do we create an empty directory for all tests that extend this class? I guess we need to create it only for those tests that need it. ---
[GitHub] drill issue #652: DRILL-4990:Use new HDFS API access instead of listStatus t...
Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/652 @ppadma can you rebase this with the current master, and include the workaround? It does not make sense to hold up this commit for so long if a workaround for the Windows platform is sufficient. ---
[GitHub] drill issue #440: Elasticsearch storage plugin
Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/440 @hamdanuk this PR has too many commits, which makes it nearly impossible to review. Would you like to rebase your work on the latest Drill 1.12.0 and try again? ---
[GitHub] drill pull request #1060: DRILL-5846: Improve parquet performance for Flat D...
Github user sachouche commented on a diff in the pull request: https://github.com/apache/drill/pull/1060#discussion_r160514755 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VLAbstractEntryReader.java --- @@ -0,0 +1,215 @@ +/*** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + **/ +package org.apache.drill.exec.store.parquet.columnreaders; + +import java.nio.ByteBuffer; + +import org.apache.drill.exec.store.parquet.columnreaders.VLColumnBulkInput.ColumnPrecisionInfo; +import org.apache.drill.exec.store.parquet.columnreaders.VLColumnBulkInput.PageDataInfo; +import org.apache.drill.exec.util.MemoryUtils; + +/** Abstract class for sub-classes implementing several algorithms for loading a Bulk Entry */ +abstract class VLAbstractEntryReader { + + /** byte buffer used for buffering page data */ + protected final ByteBuffer buffer; + /** Page Data Information */ + protected final PageDataInfo pageInfo; + /** expected precision type: fixed or variable length */ + protected final ColumnPrecisionInfo columnPrecInfo; + /** Bulk entry */ + protected final VLColumnBulkEntry entry; + + /** + * CTOR. 
+ * @param _buffer byte buffer for data buffering (within CPU cache) + * @param _pageInfo page being processed information + * @param _columnPrecInfo column precision information + * @param _entry reusable bulk entry object + */ + VLAbstractEntryReader(ByteBuffer _buffer, +PageDataInfo _pageInfo, +ColumnPrecisionInfo _columnPrecInfo, +VLColumnBulkEntry _entry) { + +this.buffer = _buffer; +this.pageInfo = _pageInfo; +this.columnPrecInfo = _columnPrecInfo; +this.entry = _entry; + } + + /** + * @param valuesToRead maximum values to read within the current page + * @return a bulk entry object + */ + abstract VLColumnBulkEntry getEntry(int valsToReadWithinPage); + + /** + * Indicates whether to use bulk processing + */ + protected final boolean bulkProcess() { +return columnPrecInfo.bulkProcess; + } + + /** + * Loads new data into the buffer if empty or the force flag is set. + * + * @param force flag to force loading new data into the buffer + */ + protected final boolean load(boolean force) { + +if (!force && buffer.hasRemaining()) { + return true; // NOOP +} + +// We're here either because the buffer is empty or we want to force a new load operation. +// In the case of force, there might be unprocessed data (still in the buffer) which is fine +// since the caller updates the page data buffer's offset only for the data it has consumed; this +// means unread data will be loaded again but this time will be positioned in the beginning of the +// buffer. This can happen only for the last entry in the buffer when either of its length or value +// is incomplete. +buffer.clear(); + +int remaining = remainingPageData(); +int toCopy= remaining > buffer.capacity() ? 
buffer.capacity() : remaining; + +if (toCopy == 0) { + return false; +} + +pageInfo.pageData.getBytes(pageInfo.pageDataOff, buffer.array(), buffer.position(), toCopy); --- End diff -- @parthchandra I know this is counter intuitive but this path makes it faster and I had VTune to prove it; let me explain it: - The logic is to fetch small chunks of data from main memory to the L1 cache (using 4k buffers) for processing - This logic doesn't introduce any extra memory bandwidth penalty since in all cases data has to move from main memory to the CPU and back to main memory (if modified) - Note that the CPU has optimizations that disable
[GitHub] drill issue #1060: DRILL-5846: Improve parquet performance for Flat Data Typ...
Github user sachouche commented on the issue: https://github.com/apache/drill/pull/1060 Before I reply to the provided comments I want first to thank both Parth and Paul for taking time to review this Pull Request. @parthchandra Regarding the High Level Comments - FS Access Manager The attached document listed several possible improvements in terms of CPU, Partitioning (changes to the query plan), and IO layer. I didn't want to club all the changes in a single pull request; instead I started with the Pull Request with the highest priority (in terms of performance improvement) which is CPU related enhancements. - Drill Memory Package I understand where you're coming from; I have analyzed this task and came up with a solution which I felt was the cleanest. Let me take you through my thinking process and the alternative solutions that I have considered: a) Why did I use this low level APIs? The Bulk Parquet reader uses the Drill Buffer APIs to copy data chunks to the CPU cache; this means that memory sanity checks are always performed. During code profiling I have noticed the Hotspot was using too many instructions (byte based assembly was used); I have researched ways to improve this and found some Hadoop and Compression java code which utilized 64bit access instructions when manipulating byte arrays (e.g., [Snappy](https://github.com/airlift/aircompressor/blob/master/src/main/java/io/airlift/compress/snappy/SnappyRawCompressor.java)). This makes a huge difference b) Safety As indicated previously, native data is always accessed through the Drill Buffer APIs (that is, accessed data is sanitized). During bulk processing, the new utility has additional checks which are triggered when assertions are enabled. This means all test-suite and QA tests (which run with assertions On) will be able to catch any faulty memory access; note also the checks are simpler since they only pertain to byte arrays boundary checks. 
c) Abstraction: I agree that the Drill code base should follow a common code path when accessing native memory (to minimize the chances of introducing bugs), though I felt the Drill Memory package was the natural abstraction, since it is centralized and thus easier to test.

d) Alternatives: I considered using Netty's io.netty.util.internal.PlatformDependent0 to access the Java intrinsics APIs, but this class has package scope, and the public wrapper class io.netty.util.internal.PlatformDependent doesn't expose this low-level functionality. The workaround would have been to create a MemoryUtils class under the io.netty.util.internal package. I didn't think this was a cleaner approach than using Java's intrinsic APIs directly (note also that Java 9 will expose such intrinsics under a public JDK package instead of sun.misc). Please let me know if you would prefer the io.netty.util.internal route instead.

- Changes to the Vector Code: I had a few discussions with Paul about the need to reconcile DRILL-5657; the agreement was to proceed with the current implementation and then apply refactoring (in future releases), as my changes are backward compatible (new public APIs added).

- Arrow Integration: I will create a task to analyze the impact of the Arrow integration on the Parquet reader.

- UserBitShared.proto: This class was updated to deal with the implicit-columns optimization. I added two fields to the NullableVarChar class to capture the logical number of rows and the fact that this feature was active. This allowed me to use different Accessor and Mutator implementations.
You and I had a discussion about this, and the agreement was: a) The new optimization is configurable (the current default is off), though we might change the default at some point. b) Detect C++-based clients and turn off this optimization (that is, the implicit-columns enhancements); I haven't implemented this task yet, as I forgot the name of the property you indicated (it should be quite fast to implement, though).

- Testing: Yes, QA helped me run the test suite with the two new options enabled. The new changes didn't modify any of the encoding/compression logic, but I'll be happy to add any tests you deem necessary.

- Code Style: Thank you for the pointers; I'll follow your feedback in future changes to ease the review process. ---
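The word-wide versus byte-wide copy argument in (a) above can be sketched in plain Java. This is a hypothetical illustration, not Drill's actual bulk-reader code: `ByteBuffer` absolute gets/puts stand in for the sun.misc intrinsics, and the boundary `assert` mirrors the assertion-time checks described in (b).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class WordCopy {
  /** Copy len bytes from src to dst using 8-byte loads/stores where possible. */
  static void copyLongWise(byte[] src, byte[] dst, int len) {
    // Byte-array boundary check, enabled only when assertions are on.
    assert len <= src.length && len <= dst.length;
    ByteBuffer in  = ByteBuffer.wrap(src).order(ByteOrder.LITTLE_ENDIAN);
    ByteBuffer out = ByteBuffer.wrap(dst).order(ByteOrder.LITTLE_ENDIAN);
    int i = 0;
    for (; i + Long.BYTES <= len; i += Long.BYTES) {
      out.putLong(i, in.getLong(i)); // one word-wide move per 8 bytes
    }
    for (; i < len; i++) {
      dst[i] = src[i];               // remaining tail, byte by byte
    }
  }

  public static void main(String[] args) {
    byte[] src = new byte[20];
    for (int i = 0; i < src.length; i++) src[i] = (byte) i;
    byte[] dst = new byte[20];
    copyLongWise(src, dst, 20);
    System.out.println(Arrays.equals(src, dst)); // true
  }
}
```

Moving eight bytes per load/store is what lets the JIT emit word-sized instructions instead of one byte-sized move per element; the tail loop covers the remainder.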
[GitHub] drill issue #458: DRILL-4573: Zero copy LIKE, REGEXP_MATCHES, SUBSTR
Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/458 @jcmcote Did you get a chance to follow up on @jinfengni's comment on this PR? ---
[GitHub] drill issue #814: Add clickable localhost URL
Github user ebuildy commented on the issue: https://github.com/apache/drill/pull/814 Hello, nope, there is no JIRA ticket associated. ---
[GitHub] drill issue #814: Add clickable localhost URL
Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/814 @ebuildy is there a JIRA associated with this PR? ---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r160492087 --- Diff: exec/vector/src/main/codegen/templates/VariableLengthVectors.java --- @@ -539,7 +553,12 @@ public void setValueLengthSafe(int index, int length) { } -public void setSafe(int index, int start, int end, DrillBuf buffer){ +<#if type.minor == "VarDecimal"> --- End diff -- I suppose the condition should be negated, or the if and else branches should be swapped. ---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r160489866 --- Diff: exec/vector/src/main/codegen/templates/ComplexWriters.java --- @@ -99,7 +99,7 @@ public void write(Nullable${minor.class?cap_first}Holder h) { <#if !(minor.class == "Decimal9" || minor.class == "Decimal18" || minor.class == "Decimal28Sparse" || minor.class == "Decimal38Sparse" || minor.class == "Decimal28Dense" || minor.class == "Decimal38Dense")> public void write${minor.class}(<#list fields as field>${field.type} ${field.name}<#if field_has_next>, ) { -mutator.addSafe(idx(), <#list fields as field>${field.name}<#if field_has_next>, ); +mutator.addSafe(idx(), <#list fields as field><#if field.name == "scale"><#break>${field.name}<#if field_has_next && fields[field_index+1].name != "scale" >, ); --- End diff -- I think it would be better to replace these checks with a check for `minor.class`. ---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r160480509 --- Diff: exec/java-exec/src/main/codegen/templates/SqlAccessors.java --- @@ -127,6 +127,25 @@ public String getString(int index) { } <#break> +<#case "VarDecimal"> + +@Override +public String getString(int index){ +<#if mode=="Nullable"> +if(ac.isNull(index)){ + return null; +} + +try { --- End diff -- I don't see the necessity of the try-catch here. Could you please explain? ---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r160481004 --- Diff: exec/java-exec/src/main/codegen/templates/StringOutputRecordWriter.java --- @@ -146,7 +146,7 @@ public void writeField() throws IOException { // TODO: error check addField(fieldId, reader.readObject().toString()); - <#elseif minor.class == "VarChar" || minor.class == "Var16Char" || minor.class == "VarBinary"> + <#elseif minor.class == "VarChar" || minor.class == "Var16Char" || minor.class == "VarBinary" || minor.class == "VarDecimal"> --- End diff -- May this type be moved into the previous `elseif`? ---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r160458242 --- Diff: exec/java-exec/src/main/codegen/templates/Decimal/CastDecimalVarchar.java --- @@ -150,6 +150,14 @@ public void setup() { public void eval() { +<#if type.from.contains("VarDecimal")> +java.math.BigDecimal bigDecimal = org.apache.drill.exec.util.DecimalUtility.getBigDecimalFromDrillBuf(in.buffer, in.start, in.end - in.start, in.scale); +String str = bigDecimal.toString(); +out.buffer = buffer; +out.start = 0; +out.end = Math.min((int)len.value, str.length()); +out.buffer.setBytes(0, str.getBytes()); --- End diff -- What about the case when `str` has a length greater than `len`? I think it would be better to use `setBytes(int index, byte[] src, int srcIndex, int length)` method here. ---
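The reviewer's point above can be illustrated with a small, self-contained sketch (hypothetical names; a plain `byte[]` stands in for `DrillBuf`): the number of bytes actually copied must be bounded by the same length that ends up in `out.end`, which is what a four-argument `setBytes(index, src, srcIndex, length)` overload provides.

```java
public class BoundedCopy {
  /** Copy at most maxLen bytes of s into buf; return the end offset written. */
  static int setVarchar(byte[] buf, String s, int maxLen) {
    byte[] src = s.getBytes();
    int end = Math.min(maxLen, src.length);
    // Equivalent of a bounded setBytes(0, src, 0, end): never writes more
    // than `end` bytes, even when the string is longer than the target width.
    System.arraycopy(src, 0, buf, 0, end);
    return end;
  }

  public static void main(String[] args) {
    byte[] buf = new byte[4];
    int end = setVarchar(buf, "123.456", 4);      // truncates to width 4
    System.out.println(new String(buf, 0, end));  // 123.
  }
}
```

With the unbounded two-argument form, a string longer than the target width would write past `out.end`, which is the overflow the review comment is flagging.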
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r160492409 --- Diff: exec/vector/src/main/java/org/apache/drill/exec/util/DecimalUtility.java --- @@ -159,9 +159,20 @@ public static BigDecimal getBigDecimalFromSparse(DrillBuf data, int startIndex, } public static BigDecimal getBigDecimalFromDrillBuf(DrillBuf bytebuf, int start, int length, int scale) { + if (length <= 0) { +// if the length is somehow non-positive, interpret this as zero +//System.out.println("getBigDecimal forces 0 with start " + start + " len " + length); + try { + throw new Exception("hi there"); --- End diff -- Please remove ---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r160476071 --- Diff: exec/java-exec/src/main/codegen/templates/Decimal/DecimalFunctions.java --- @@ -102,7 +111,578 @@ <#-- For each DECIMAL... type (in DecimalTypes.tdd) ... --> <#list comparisonTypesDecimal.decimalTypes as type> -<#if type.name.endsWith("Sparse")> +<#if type.name.endsWith("VarDecimal")> + +<@pp.changeOutputFile name="/org/apache/drill/exec/expr/fn/impl/${type.name}Functions.java" /> + +<#include "/@includes/license.ftl" /> + +package org.apache.drill.exec.expr.fn.impl; + +<#include "/@includes/vv_imports.ftl" /> + +import org.apache.drill.exec.expr.DrillSimpleFunc; +import org.apache.drill.exec.expr.annotations.FunctionTemplate; +import org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling; +import org.apache.drill.exec.expr.annotations.Output; +import org.apache.drill.exec.expr.annotations.Param; +import org.apache.drill.exec.expr.annotations.Workspace; +import org.apache.drill.exec.expr.fn.FunctionGenerationHelper; +import org.apache.drill.exec.expr.holders.*; +import org.apache.drill.exec.record.RecordBatch; + +import io.netty.buffer.ByteBuf; +import io.netty.buffer.DrillBuf; + +import java.nio.ByteBuffer; + +@SuppressWarnings("unused") +public class ${type.name}Functions { +private static void initBuffer(DrillBuf buffer) { +// for VarDecimal, this method of setting initial size is actually only a very rough heuristic. +int size = (${type.storage} * (org.apache.drill.exec.util.DecimalUtility.INTEGER_SIZE)); +buffer = buffer.reallocIfNeeded(size); + } + +@FunctionTemplate(name = "subtract", scope = FunctionTemplate.FunctionScope.DECIMAL_ADD_SCALE, nulls = NullHandling.NULL_IF_NULL) --- End diff -- Please specify `FunctionTemplate` correct scope, returnType and checkPrecisionRange for all functions in this class. ---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r160473858 --- Diff: exec/java-exec/src/main/codegen/templates/Decimal/DecimalAggrTypeFunctions2.java --- @@ -108,9 +108,12 @@ public void output() { out.buffer = buffer; out.start = 0; out.scale = outputScale.value; -out.precision = 38; java.math.BigDecimal average = ((java.math.BigDecimal)(value.obj)).divide(java.math.BigDecimal.valueOf(count.value, 0), out.scale, java.math.BigDecimal.ROUND_HALF_UP); +<#if type.inputType.contains("VarDecimal")> --- End diff -- It would be better to swap the if and else branches and negate the if condition, for better readability. ---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r160490947 --- Diff: exec/vector/src/main/codegen/templates/NullableValueVectors.java --- @@ -327,13 +327,17 @@ public Mutator getMutator(){ return v; } - public void copyFrom(int fromIndex, int thisIndex, Nullable${minor.class}Vector from){ + protected void copyFromUnsafe(int fromIndex, int thisIndex, Nullable${minor.class}Vector from){ --- End diff -- Are changes in this class necessary? ---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r160465652 --- Diff: exec/java-exec/src/main/codegen/templates/Decimal/CastIntDecimal.java --- @@ -68,15 +68,31 @@ public void setup() { public void eval() { out.scale = (int) scale.value; + +<#if !type.to.endsWith("VarDecimal")> out.precision = (int) precision.value; + -<#if type.to == "Decimal9" || type.to == "Decimal18"> +<#if type.to.endsWith("VarDecimal")> +out.start = 0; +out.buffer = buffer; +String s = Long.toString((long)in.value); +for (int i = 0; i < out.scale; ++i) { // add 0's to get unscaled integer +s += "0"; --- End diff -- I think using `StringBuilder` would be more suitable here. ---
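A minimal sketch of the `StringBuilder` suggestion (a hypothetical helper, not the generated template code): the loop in the diff appends `"0"` to a `String`, allocating a new string on every iteration, whereas `StringBuilder` appends in place.

```java
public class ScalePad {
  /** Return the unscaled string form of value with `scale` trailing zeros. */
  static String unscaled(long value, int scale) {
    StringBuilder sb = new StringBuilder(Long.toString(value));
    for (int i = 0; i < scale; i++) {
      sb.append('0'); // add 0's to get the unscaled integer, no reallocation per pass
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(unscaled(123, 2)); // 12300
  }
}
```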
[GitHub] drill pull request #1084: DRILL-5868: Support SQL syntax highlighting of que...
Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1084#discussion_r160491764 --- Diff: exec/java-exec/src/main/resources/rest/static/js/ace-code-editor/mode-sql.js --- @@ -1 +1,128 @@ -ace.define("ace/mode/sql_highlight_rules",["require","exports","module","ace/lib/oop","ace/mode/text_highlight_rules"],function(e,t,n){"use strict";var r=e("../lib/oop"),i=e("./text_highlight_rules").TextHighlightRules,s=function(){var e="select|insert|update|delete|from|where|and|or|group|by|order|limit|offset|having|as|case|when|else|end|type|left|right|join|on|outer|desc|asc|union|create|table|primary|key|if|foreign|not|references|default|null|inner|cross|natural|database|drop|grant",t="true|false",n="avg|count|first|last|max|min|sum|ucase|lcase|mid|len|round|rank|now|format|coalesce|ifnull|isnull|nvl",r="int|numeric|decimal|date|varchar|char|bigint|float|double|bit|binary|text|set|timestamp|money|real|number|integer",i=this.createKeywordMapper({"support.function":n,keyword:e,"constant.language":t,"storage.type":r},"identifier",!0);this.$rules={start:[{token:"comment",regex:"--.*$"},{token:"comment",start:"/\\*",end:"\\*/"},{token:"string",regex:'".*?"'},{token:"string",regex :"'.*?'"},{token:"string",regex:"`.*?`"},{token:"constant.numeric",regex:"[+-]?\\d+(?:(?:\\.\\d*)?(?:[eE][+-]?\\d+)?)?\\b"},{token:i,regex:"[a-zA-Z_$][a-zA-Z0-9_$]*\\b"},{token:"keyword.operator",regex:"\\+|\\-|\\/|\\/\\/|%|<@>|@>|<@|&|\\^|~|<|>|<=|=>|==|!=|<>|="},{token:"paren.lparen",regex:"[\\(]"},{token:"paren.rparen",regex:"[\\)]"},{token:"text",regex:"\\s+"}]},this.normalizeRules()};r.inherits(s,i),t.SqlHighlightRules=s}),ace.define("ace/mode/sql",["require","exports","module","ace/lib/oop","ace/mode/text","ace/mode/sql_highlight_rules"],function(e,t,n){"use strict";var 
r=e("../lib/oop"),i=e("./text").Mode,s=e("./sql_highlight_rules").SqlHighlightRules,o=function(){this.HighlightRules=s,this.$behaviour=this.$defaultBehaviour};r.inherits(o,i),function(){this.lineCommentStart="--",this.$id="ace/mode/sql"}.call(o.prototype),t.Mode=o}) \ No newline at end of file +/** + * Drill SQL Definition (Forked from SqlServer definition) + */ + +ace.define("ace/mode/sql_highlight_rules",["require","exports","module","ace/lib/oop","ace/mode/text_highlight_rules"], function(require, exports, module) { +"use strict"; + +var oop = require("../lib/oop"); +var TextHighlightRules = require("./text_highlight_rules").TextHighlightRules; + +var SqlHighlightRules = function() { + +//TODO: https://drill.apache.org/docs/reserved-keywords/ + //e.g. Cubing operators like ROLLUP are not listed +var keywords = ( + "select|insert|update|delete|from|where|and|or|group|by|order|limit|offset|having|as|case|" + + "when|else|end|type|left|right|join|on|outer|desc|asc|union|create|table|key|if|" + +"not|default|null|inner|database|drop|" + + "flatten|kvgen|columns" +); + //Confirmed to be UnSupported as of Drill-1.12.0 + /* cross|natural|primary|foreign|references|grant */ + +var builtinConstants = ( +"true|false" +); + +//Drill-specific +var builtinFunctions = ( +//Math and Trignometric + "abs|cbrt|ceil|ceiling|degrees|e|exp|floor|log|log|log10|lshift|mod|negative|pi|pow|radians|rand|round|round|rshift|sign|sqrt|trunc|trunc|" + --- End diff -- Sounds good, could you please then just file a Jira for this improvement? ---
[GitHub] drill pull request #1084: DRILL-5868: Support SQL syntax highlighting of que...
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1084#discussion_r160491309 --- Diff: exec/java-exec/src/main/resources/rest/static/js/ace-code-editor/mode-sql.js --- @@ -1 +1,128 @@ -ace.define("ace/mode/sql_highlight_rules",["require","exports","module","ace/lib/oop","ace/mode/text_highlight_rules"],function(e,t,n){"use strict";var r=e("../lib/oop"),i=e("./text_highlight_rules").TextHighlightRules,s=function(){var e="select|insert|update|delete|from|where|and|or|group|by|order|limit|offset|having|as|case|when|else|end|type|left|right|join|on|outer|desc|asc|union|create|table|primary|key|if|foreign|not|references|default|null|inner|cross|natural|database|drop|grant",t="true|false",n="avg|count|first|last|max|min|sum|ucase|lcase|mid|len|round|rank|now|format|coalesce|ifnull|isnull|nvl",r="int|numeric|decimal|date|varchar|char|bigint|float|double|bit|binary|text|set|timestamp|money|real|number|integer",i=this.createKeywordMapper({"support.function":n,keyword:e,"constant.language":t,"storage.type":r},"identifier",!0);this.$rules={start:[{token:"comment",regex:"--.*$"},{token:"comment",start:"/\\*",end:"\\*/"},{token:"string",regex:'".*?"'},{token:"string",regex :"'.*?'"},{token:"string",regex:"`.*?`"},{token:"constant.numeric",regex:"[+-]?\\d+(?:(?:\\.\\d*)?(?:[eE][+-]?\\d+)?)?\\b"},{token:i,regex:"[a-zA-Z_$][a-zA-Z0-9_$]*\\b"},{token:"keyword.operator",regex:"\\+|\\-|\\/|\\/\\/|%|<@>|@>|<@|&|\\^|~|<|>|<=|=>|==|!=|<>|="},{token:"paren.lparen",regex:"[\\(]"},{token:"paren.rparen",regex:"[\\)]"},{token:"text",regex:"\\s+"}]},this.normalizeRules()};r.inherits(s,i),t.SqlHighlightRules=s}),ace.define("ace/mode/sql",["require","exports","module","ace/lib/oop","ace/mode/text","ace/mode/sql_highlight_rules"],function(e,t,n){"use strict";var 
r=e("../lib/oop"),i=e("./text").Mode,s=e("./sql_highlight_rules").SqlHighlightRules,o=function(){this.HighlightRules=s,this.$behaviour=this.$defaultBehaviour};r.inherits(o,i),function(){this.lineCommentStart="--",this.$id="ace/mode/sql"}.call(o.prototype),t.Mode=o}) \ No newline at end of file +/** + * Drill SQL Definition (Forked from SqlServer definition) + */ + +ace.define("ace/mode/sql_highlight_rules",["require","exports","module","ace/lib/oop","ace/mode/text_highlight_rules"], function(require, exports, module) { +"use strict"; + +var oop = require("../lib/oop"); +var TextHighlightRules = require("./text_highlight_rules").TextHighlightRules; + +var SqlHighlightRules = function() { + +//TODO: https://drill.apache.org/docs/reserved-keywords/ + //e.g. Cubing operators like ROLLUP are not listed +var keywords = ( + "select|insert|update|delete|from|where|and|or|group|by|order|limit|offset|having|as|case|" + + "when|else|end|type|left|right|join|on|outer|desc|asc|union|create|table|key|if|" + +"not|default|null|inner|database|drop|" + + "flatten|kvgen|columns" --- End diff -- There is a small challenge with this list. The 300+ keywords are a mix of reserved SQL keywords, functions and other words. I could be wrong, but I think some of these are not even supported (e.g. TRIGGER). I would need to separate the functions and SQL keywords for the colour highlighting to appear correctly. Let me walk through the list and at least identify the SQL keywords. Functions were already covered separately. ---
[GitHub] drill pull request #1084: DRILL-5868: Support SQL syntax highlighting of que...
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1084#discussion_r160485223 --- Diff: exec/java-exec/src/main/resources/rest/static/js/ace-code-editor/snippets/sql.js --- @@ -0,0 +1,46 @@ +/** + * Drill SQL Syntax Snippets + */ + +ace.define("ace/snippets/sql",["require","exports","module"], function(require, exports, module) { --- End diff -- +1 I did see this initially, but ignored it as I made progress. Guess I got trained to ignore the contents at the end :) Will fix this. ---
[GitHub] drill pull request #1084: DRILL-5868: Support SQL syntax highlighting of que...
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1084#discussion_r160482817 --- Diff: exec/java-exec/src/main/resources/rest/static/js/ace-code-editor/mode-sql.js --- @@ -1 +1,128 @@ -ace.define("ace/mode/sql_highlight_rules",["require","exports","module","ace/lib/oop","ace/mode/text_highlight_rules"],function(e,t,n){"use strict";var r=e("../lib/oop"),i=e("./text_highlight_rules").TextHighlightRules,s=function(){var e="select|insert|update|delete|from|where|and|or|group|by|order|limit|offset|having|as|case|when|else|end|type|left|right|join|on|outer|desc|asc|union|create|table|primary|key|if|foreign|not|references|default|null|inner|cross|natural|database|drop|grant",t="true|false",n="avg|count|first|last|max|min|sum|ucase|lcase|mid|len|round|rank|now|format|coalesce|ifnull|isnull|nvl",r="int|numeric|decimal|date|varchar|char|bigint|float|double|bit|binary|text|set|timestamp|money|real|number|integer",i=this.createKeywordMapper({"support.function":n,keyword:e,"constant.language":t,"storage.type":r},"identifier",!0);this.$rules={start:[{token:"comment",regex:"--.*$"},{token:"comment",start:"/\\*",end:"\\*/"},{token:"string",regex:'".*?"'},{token:"string",regex :"'.*?'"},{token:"string",regex:"`.*?`"},{token:"constant.numeric",regex:"[+-]?\\d+(?:(?:\\.\\d*)?(?:[eE][+-]?\\d+)?)?\\b"},{token:i,regex:"[a-zA-Z_$][a-zA-Z0-9_$]*\\b"},{token:"keyword.operator",regex:"\\+|\\-|\\/|\\/\\/|%|<@>|@>|<@|&|\\^|~|<|>|<=|=>|==|!=|<>|="},{token:"paren.lparen",regex:"[\\(]"},{token:"paren.rparen",regex:"[\\)]"},{token:"text",regex:"\\s+"}]},this.normalizeRules()};r.inherits(s,i),t.SqlHighlightRules=s}),ace.define("ace/mode/sql",["require","exports","module","ace/lib/oop","ace/mode/text","ace/mode/sql_highlight_rules"],function(e,t,n){"use strict";var 
r=e("../lib/oop"),i=e("./text").Mode,s=e("./sql_highlight_rules").SqlHighlightRules,o=function(){this.HighlightRules=s,this.$behaviour=this.$defaultBehaviour};r.inherits(o,i),function(){this.lineCommentStart="--",this.$id="ace/mode/sql"}.call(o.prototype),t.Mode=o}) \ No newline at end of file +/** + * Drill SQL Definition (Forked from SqlServer definition) + */ + +ace.define("ace/mode/sql_highlight_rules",["require","exports","module","ace/lib/oop","ace/mode/text_highlight_rules"], function(require, exports, module) { +"use strict"; + +var oop = require("../lib/oop"); +var TextHighlightRules = require("./text_highlight_rules").TextHighlightRules; + +var SqlHighlightRules = function() { + +//TODO: https://drill.apache.org/docs/reserved-keywords/ + //e.g. Cubing operators like ROLLUP are not listed +var keywords = ( + "select|insert|update|delete|from|where|and|or|group|by|order|limit|offset|having|as|case|" + + "when|else|end|type|left|right|join|on|outer|desc|asc|union|create|table|key|if|" + +"not|default|null|inner|database|drop|" + + "flatten|kvgen|columns" +); + //Confirmed to be UnSupported as of Drill-1.12.0 + /* cross|natural|primary|foreign|references|grant */ + +var builtinConstants = ( +"true|false" +); + +//Drill-specific +var builtinFunctions = ( +//Math and Trignometric + "abs|cbrt|ceil|ceiling|degrees|e|exp|floor|log|log|log10|lshift|mod|negative|pi|pow|radians|rand|round|round|rshift|sign|sqrt|trunc|trunc|" + --- End diff -- I tried looking at that option, but it is quite difficult to figure out how the libraries read these files. We could do a dynamic generation of this list, when the {sys.functions} table does get implemented. ---
[GitHub] drill pull request #1084: DRILL-5868: Support SQL syntax highlighting of que...
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1084#discussion_r160482221 --- Diff: exec/java-exec/src/main/resources/rest/static/js/ace-code-editor/mode-sql.js --- @@ -1 +1,128 @@ -ace.define("ace/mode/sql_highlight_rules",["require","exports","module","ace/lib/oop","ace/mode/text_highlight_rules"],function(e,t,n){"use strict";var r=e("../lib/oop"),i=e("./text_highlight_rules").TextHighlightRules,s=function(){var e="select|insert|update|delete|from|where|and|or|group|by|order|limit|offset|having|as|case|when|else|end|type|left|right|join|on|outer|desc|asc|union|create|table|primary|key|if|foreign|not|references|default|null|inner|cross|natural|database|drop|grant",t="true|false",n="avg|count|first|last|max|min|sum|ucase|lcase|mid|len|round|rank|now|format|coalesce|ifnull|isnull|nvl",r="int|numeric|decimal|date|varchar|char|bigint|float|double|bit|binary|text|set|timestamp|money|real|number|integer",i=this.createKeywordMapper({"support.function":n,keyword:e,"constant.language":t,"storage.type":r},"identifier",!0);this.$rules={start:[{token:"comment",regex:"--.*$"},{token:"comment",start:"/\\*",end:"\\*/"},{token:"string",regex:'".*?"'},{token:"string",regex :"'.*?'"},{token:"string",regex:"`.*?`"},{token:"constant.numeric",regex:"[+-]?\\d+(?:(?:\\.\\d*)?(?:[eE][+-]?\\d+)?)?\\b"},{token:i,regex:"[a-zA-Z_$][a-zA-Z0-9_$]*\\b"},{token:"keyword.operator",regex:"\\+|\\-|\\/|\\/\\/|%|<@>|@>|<@|&|\\^|~|<|>|<=|=>|==|!=|<>|="},{token:"paren.lparen",regex:"[\\(]"},{token:"paren.rparen",regex:"[\\)]"},{token:"text",regex:"\\s+"}]},this.normalizeRules()};r.inherits(s,i),t.SqlHighlightRules=s}),ace.define("ace/mode/sql",["require","exports","module","ace/lib/oop","ace/mode/text","ace/mode/sql_highlight_rules"],function(e,t,n){"use strict";var 
r=e("../lib/oop"),i=e("./text").Mode,s=e("./sql_highlight_rules").SqlHighlightRules,o=function(){this.HighlightRules=s,this.$behaviour=this.$defaultBehaviour};r.inherits(o,i),function(){this.lineCommentStart="--",this.$id="ace/mode/sql"}.call(o.prototype),t.Mode=o}) \ No newline at end of file +/** + * Drill SQL Definition (Forked from SqlServer definition) + */ + +ace.define("ace/mode/sql_highlight_rules",["require","exports","module","ace/lib/oop","ace/mode/text_highlight_rules"], function(require, exports, module) { +"use strict"; + +var oop = require("../lib/oop"); +var TextHighlightRules = require("./text_highlight_rules").TextHighlightRules; + +var SqlHighlightRules = function() { + +//TODO: https://drill.apache.org/docs/reserved-keywords/ + //e.g. Cubing operators like ROLLUP are not listed +var keywords = ( + "select|insert|update|delete|from|where|and|or|group|by|order|limit|offset|having|as|case|" + + "when|else|end|type|left|right|join|on|outer|desc|asc|union|create|table|key|if|" + +"not|default|null|inner|database|drop|" + + "flatten|kvgen|columns" --- End diff -- This list I could add. Adding everything else is a bit harder. I'll update the list in an add-on comit and attach to this PR. Thnx! ---
Re: Drill Hangout Jan 9, 2018
Should we also talk about Java 9 and Hadoop 3.x support? Thank you, Vlad On 1/9/18 08:36, Arina Yelchiyeva wrote: Mark Java 8 support as critical for the 1.13.0 release? For example, in Calcite 1.16 support for Java 7 will be dropped. On Tue, Jan 9, 2018 at 9:34 AM, Aditya Allamraju wrote: We will have our first bi-weekly hangout of 2018 tomorrow at 10 am. Please reply to this post with proposed topics to discuss. Hangout link: https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc Thank you, Vlad
Re: Drill Hangout Jan 9, 2018
Hi Aditya, Yes, it is open to everyone, with the exception that hangouts allow only 10 connections. Thank you, Vlad On 1/8/18 23:34, Aditya Allamraju wrote: Hi Vlad, If this is open to everyone in this group, I would like to join this hangout. Please let me know. Thanks Aditya On Tue, Jan 9, 2018 at 1:22 AM, Vlad Rozov wrote: We will have our first bi-weekly hangout of 2018 tomorrow at 10 am. Please reply to this post with proposed topics to discuss. Hangout link: https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc Thank you, Vlad
Re: Drill Hangout Jan 9, 2018
Mark Java 8 support as critical for the 1.13.0 release? For example, in Calcite 1.16 support for Java 7 will be dropped. On Tue, Jan 9, 2018 at 9:34 AM, Aditya Allamraju wrote: > Hi Vlad, > > If this is open to everyone in this group, I would like to join this > hangout. > Please let me know. > > Thanks > Aditya > > On Tue, Jan 9, 2018 at 1:22 AM, Vlad Rozov wrote: > > > We will have our first bi-weekly hangout of 2018 tomorrow at 10 am. > > Please reply to this post with proposed topics to discuss. > > > > Hangout link: https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc > > > > Thank you, > > > > Vlad > > >
[GitHub] drill pull request #1084: DRILL-5868: Support SQL syntax highlighting of que...
Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1084#discussion_r160457025 --- Diff: exec/java-exec/src/main/resources/rest/static/js/ace-code-editor/snippets/sql.js --- @@ -0,0 +1,46 @@ +/** + * Drill SQL Syntax Snippets + */ + +ace.define("ace/snippets/sql",["require","exports","module"], function(require, exports, module) { +"use strict"; + +exports.snippetText = "snippet info\n\ + select * from INFORMATION_SCHEMA.${1:};\n\ +snippet sysmem\n\ --- End diff -- Also snippets are case-sensitive. Is it common practice? ---
Re: PCAP Issues
The pcap format is there almost entirely just to give us a record structure. The stream could give us that just as easily. So that is one thing that could be done. Another thought is that data could be buffered for a (very) short time and groups of packets could be sent as a single message. That would decrease streaming overhead. I would vote against that change at first, but it would make the pcap format relevant again. On Tue, Jan 9, 2018 at 6:45 AM, John Omernik wrote: > I've been following this and also now playing with it more. A couple > of tidbits that I think are worth re-mentioning: > > 1. pcapng: the ability to read this too will be handy. > > 2. The flags are really cool, and the work here is great. > > The other thing I'd like to consider for this is, with the recent work on the > Kafka plugin, and I am guessing a similar plugin for MapR Streams, what if > we were able to have a tool like daemonlogger or tcpdump read from a > network interface and write directly to a Kafka/MapR Streams topic? > > My thought is this: you write to the topic, you partition based on a hash > of known data components (like the 5-tuple of protocol, sport, dport, > saddr, and daddr). From there, you could have things subscribe to the topic > and process, you could ETL, you could query. There would be some concern > at high scale, but even that has methods to address it. One of the challenges > we have in infosec is getting processing close enough to the capture, OR > capturing to files, moving them around, and having processes created to read and process > the files. > > It would be cool to combine the pcap plugin directly with the kafka/stream > plugin in drill and be able to read live pcaps! > > One big challenge is the pcap format has a header, but it seems to be a > simple header that we "may" be able to forge upon dropping the needle on a > stream of packet records. 
I am just brainstorming aloud right now, (I > should consider waiting until at least 3 cups of coffee before I make this > inner dialog public) > > Curious on thoughts here. > > > > > On Fri, Jan 5, 2018 at 10:01 AM, Charles Givre wrote: > > > I assumed (incorrectly as it turned out) that I was not reading the > > offsets correctly or something. > > > > > > > On Jan 5, 2018, at 10:59, Ted Dunning wrote: > > > > > > Yeah... I got the same result. > > > > > > I will push my latest shortly (without the boolean stuff) > > > > > > On Fri, Jan 5, 2018 at 7:52 AM, Charles Givre > wrote: > > > > > >> Ted, > > >> I’m wondering if we’ve uncovered an unsupported, undocumented feature > in > > >> Drill with respect to the BitWriter. I made the following changes: > > >> > > >> > > >> To PcapRecordReader.java, I added: > > >> > > >> > > >> case "ack_flag": > > >> if (packet.isTcpPacket()) { > > >>int flag = (packet.getAckFlag()) ? 1 : 0; > > >>setBooleanColumnValue(flag, pci, count); > > >> } > > >> break; > > >> ... > > >> > > >> private void setBooleanColumnValue(final int data, final > > >> ProjectedColumnInfo pci, final int count) { > > >> ((NullableBitVector.Mutator) pci.vv.getMutator()) > > >>.setSafe(count, data); > > >> } > > >> > > >> Then I added this to PcapFormatUtils > > >> > > >> public static int getBit(final byte[] buf, final int posByte, final > int > > >> posBit ) { > > >> byte valByte = buf[posByte]; > > >> return valByte >> (8 - (posBit + 1)) & 0x0001; > > >> } > > >> > > >> I added a column in the Schema.java and a function in Packet.java to > get > > >> the bit. I was getting NPEs and I’m wondering now if the cause wasn’t > > my > > >> code but rather a problem with the bitwriter. I’m going to play with > > this > > >> and see if I can get Drill to write a True/False value at all, and > > report > > >> back. If the BitWriter is throwing NPE, I’ll create a JIRA for it. > > >> > > >> Thanks for your work on the PCAP format reader. 
> > >> —C > > >> > > >> > > >> > > >>> On Jan 3, 2018, at 17:33, Ted Dunning wrote: > > >>> > > >>> Don't think that will work. > > >>> > > >>> I tried what seemed obvious and got a NPE. Joys of Drill. > > >>> > > >>> > > >>> > > >>> On Wed, Jan 3, 2018 at 1:31 PM, Charles Givre > > wrote: > > >>> > > This isn’t the most elegant example, but should do the trick. > > > > https://github.com/cgivre/drill-network-functions/blob/ > > master/src/main/java/org/apache/drill/contrib/function/ > > IsPrivateIP.java > > >> < > > https://github.com/cgivre/drill-network-functions/blob/ > > master/src/main/java/org/apache/drill/contrib/function/ > > >> IsPrivateIP.java> > > > > > > > > > > > On Jan 3, 2018, at 16:09, Ted Dunning > wrote: > > > > > > On Wed, Jan 3, 2018 at 12:17 PM, Charles Givre > > >> wrote: > > > > > >> HI Ted, > > >> This is really looking amazing. Thank you so much for
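The `getBit` helper quoted above is easy to sanity-check in isolation. Here is a minimal, standalone sketch (the class name `BitFlagDemo` and the example values are mine, not from the PR; bit positions assume the standard TCP flags byte read MSB-first):

```java
public class BitFlagDemo {

    // Same logic as the getBit helper quoted in the thread: extract bit
    // posBit (0 = most significant) from byte posByte of the buffer.
    public static int getBit(final byte[] buf, final int posByte, final int posBit) {
        byte valByte = buf[posByte];
        return valByte >> (8 - (posBit + 1)) & 0x0001;
    }

    public static void main(String[] args) {
        // 0x12 = 00010010: a TCP flags byte with ACK and SYN set (a SYN-ACK)
        byte[] flags = { (byte) 0x12 };
        System.out.println(getBit(flags, 0, 3)); // ACK bit, prints 1
        System.out.println(getBit(flags, 0, 6)); // SYN bit, prints 1
        System.out.println(getBit(flags, 0, 7)); // FIN bit, prints 0
    }
}
```

Since the helper returns 0/1 rather than a boolean, its result can feed straight into the `NullableBitVector.Mutator.setSafe(count, data)` call from the snippet above.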
[GitHub] drill pull request #1084: DRILL-5868: Support SQL syntax highlighting of que...
Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1084#discussion_r160447919 --- Diff: exec/java-exec/src/main/resources/rest/static/js/ace-code-editor/mode-sql.js --- @@ -1 +1,128 @@ -ace.define("ace/mode/sql_highlight_rules",["require","exports","module","ace/lib/oop","ace/mode/text_highlight_rules"],function(e,t,n){"use strict";var r=e("../lib/oop"),i=e("./text_highlight_rules").TextHighlightRules,s=function(){var e="select|insert|update|delete|from|where|and|or|group|by|order|limit|offset|having|as|case|when|else|end|type|left|right|join|on|outer|desc|asc|union|create|table|primary|key|if|foreign|not|references|default|null|inner|cross|natural|database|drop|grant",t="true|false",n="avg|count|first|last|max|min|sum|ucase|lcase|mid|len|round|rank|now|format|coalesce|ifnull|isnull|nvl",r="int|numeric|decimal|date|varchar|char|bigint|float|double|bit|binary|text|set|timestamp|money|real|number|integer",i=this.createKeywordMapper({"support.function":n,keyword:e,"constant.language":t,"storage.type":r},"identifier",!0);this.$rules={start:[{token:"comment",regex:"--.*$"},{token:"comment",start:"/\\*",end:"\\*/"},{token:"string",regex:'".*?"'},{token:"string",regex :"'.*?'"},{token:"string",regex:"`.*?`"},{token:"constant.numeric",regex:"[+-]?\\d+(?:(?:\\.\\d*)?(?:[eE][+-]?\\d+)?)?\\b"},{token:i,regex:"[a-zA-Z_$][a-zA-Z0-9_$]*\\b"},{token:"keyword.operator",regex:"\\+|\\-|\\/|\\/\\/|%|<@>|@>|<@|&|\\^|~|<|>|<=|=>|==|!=|<>|="},{token:"paren.lparen",regex:"[\\(]"},{token:"paren.rparen",regex:"[\\)]"},{token:"text",regex:"\\s+"}]},this.normalizeRules()};r.inherits(s,i),t.SqlHighlightRules=s}),ace.define("ace/mode/sql",["require","exports","module","ace/lib/oop","ace/mode/text","ace/mode/sql_highlight_rules"],function(e,t,n){"use strict";var 
r=e("../lib/oop"),i=e("./text").Mode,s=e("./sql_highlight_rules").SqlHighlightRules,o=function(){this.HighlightRules=s,this.$behaviour=this.$defaultBehaviour};r.inherits(o,i),function(){this.lineCommentStart="--",this.$id="ace/mode/sql"}.call(o.prototype),t.Mode=o}) \ No newline at end of file +/** + * Drill SQL Definition (Forked from SqlServer definition) + */ + +ace.define("ace/mode/sql_highlight_rules",["require","exports","module","ace/lib/oop","ace/mode/text_highlight_rules"], function(require, exports, module) { +"use strict"; + +var oop = require("../lib/oop"); +var TextHighlightRules = require("./text_highlight_rules").TextHighlightRules; + +var SqlHighlightRules = function() { + +//TODO: https://drill.apache.org/docs/reserved-keywords/ + //e.g. Cubing operators like ROLLUP are not listed +var keywords = ( + "select|insert|update|delete|from|where|and|or|group|by|order|limit|offset|having|as|case|" + + "when|else|end|type|left|right|join|on|outer|desc|asc|union|create|table|key|if|" + +"not|default|null|inner|database|drop|" + + "flatten|kvgen|columns" --- End diff -- We have more keywords which would be nice to highlight and which are in https://drill.apache.org/docs/reserved-keywords/ list, for example: alter, function, explain, session, system etc. maybe we can add them in this PR? ---
[GitHub] drill pull request #1084: DRILL-5868: Support SQL syntax highlighting of que...
Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1084#discussion_r160452751 --- Diff: exec/java-exec/src/main/resources/rest/static/js/ace-code-editor/mode-sql.js --- @@ -1 +1,128 @@ -ace.define("ace/mode/sql_highlight_rules",["require","exports","module","ace/lib/oop","ace/mode/text_highlight_rules"],function(e,t,n){"use strict";var r=e("../lib/oop"),i=e("./text_highlight_rules").TextHighlightRules,s=function(){var e="select|insert|update|delete|from|where|and|or|group|by|order|limit|offset|having|as|case|when|else|end|type|left|right|join|on|outer|desc|asc|union|create|table|primary|key|if|foreign|not|references|default|null|inner|cross|natural|database|drop|grant",t="true|false",n="avg|count|first|last|max|min|sum|ucase|lcase|mid|len|round|rank|now|format|coalesce|ifnull|isnull|nvl",r="int|numeric|decimal|date|varchar|char|bigint|float|double|bit|binary|text|set|timestamp|money|real|number|integer",i=this.createKeywordMapper({"support.function":n,keyword:e,"constant.language":t,"storage.type":r},"identifier",!0);this.$rules={start:[{token:"comment",regex:"--.*$"},{token:"comment",start:"/\\*",end:"\\*/"},{token:"string",regex:'".*?"'},{token:"string",regex :"'.*?'"},{token:"string",regex:"`.*?`"},{token:"constant.numeric",regex:"[+-]?\\d+(?:(?:\\.\\d*)?(?:[eE][+-]?\\d+)?)?\\b"},{token:i,regex:"[a-zA-Z_$][a-zA-Z0-9_$]*\\b"},{token:"keyword.operator",regex:"\\+|\\-|\\/|\\/\\/|%|<@>|@>|<@|&|\\^|~|<|>|<=|=>|==|!=|<>|="},{token:"paren.lparen",regex:"[\\(]"},{token:"paren.rparen",regex:"[\\)]"},{token:"text",regex:"\\s+"}]},this.normalizeRules()};r.inherits(s,i),t.SqlHighlightRules=s}),ace.define("ace/mode/sql",["require","exports","module","ace/lib/oop","ace/mode/text","ace/mode/sql_highlight_rules"],function(e,t,n){"use strict";var 
r=e("../lib/oop"),i=e("./text").Mode,s=e("./sql_highlight_rules").SqlHighlightRules,o=function(){this.HighlightRules=s,this.$behaviour=this.$defaultBehaviour};r.inherits(o,i),function(){this.lineCommentStart="--",this.$id="ace/mode/sql"}.call(o.prototype),t.Mode=o}) \ No newline at end of file +/** + * Drill SQL Definition (Forked from SqlServer definition) + */ + +ace.define("ace/mode/sql_highlight_rules",["require","exports","module","ace/lib/oop","ace/mode/text_highlight_rules"], function(require, exports, module) { +"use strict"; + +var oop = require("../lib/oop"); +var TextHighlightRules = require("./text_highlight_rules").TextHighlightRules; + +var SqlHighlightRules = function() { + +//TODO: https://drill.apache.org/docs/reserved-keywords/ + //e.g. Cubing operators like ROLLUP are not listed +var keywords = ( + "select|insert|update|delete|from|where|and|or|group|by|order|limit|offset|having|as|case|" + + "when|else|end|type|left|right|join|on|outer|desc|asc|union|create|table|key|if|" + +"not|default|null|inner|database|drop|" + + "flatten|kvgen|columns" +); + //Confirmed to be UnSupported as of Drill-1.12.0 + /* cross|natural|primary|foreign|references|grant */ + +var builtinConstants = ( +"true|false" +); + +//Drill-specific +var builtinFunctions = ( +//Math and Trignometric + "abs|cbrt|ceil|ceiling|degrees|e|exp|floor|log|log|log10|lshift|mod|negative|pi|pow|radians|rand|round|round|rshift|sign|sqrt|trunc|trunc|" + --- End diff -- I believe n future we'll have system table with Drill functions. Will be there a possibility to pass list of functions to the js script instead of hard-coding them? ---
[GitHub] drill pull request #1084: DRILL-5868: Support SQL syntax highlighting of que...
Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1084#discussion_r160449936 --- Diff: exec/java-exec/src/main/resources/rest/static/js/ace-code-editor/snippets/sql.js --- @@ -0,0 +1,46 @@ +/** + * Drill SQL Syntax Snippets + */ + +ace.define("ace/snippets/sql",["require","exports","module"], function(require, exports, module) { --- End diff -- 1. Could you please add information about the snippets to the Jira description? We might want to add it to the Drill documentation later so users know which snippets can be used. Let's say, it's hard to guess the snippets without looking into the code :) 2. When using snippets, I saw the following warning. ![image](https://user-images.githubusercontent.com/15086720/34730558-6a34fde4-f568-11e7-81e9-3e6d62091189.png) Could you please check? ---
Re: PCAP Issues
I've been following this and also now playing with them more. A couple of tidbits that I think are worth re-mentioning: 1. pcapng: the ability to read this too will be handy. 2. The flags are really cool, and the work here is great. The other thing I'd like to consider for this is with recent work on the Kafka plugin, and I am guessing a similar plugin with MapR Streams, what if we were able to have a tool like daemonlogger, or tcpdump read from a network interface and write directly to a Kafka/MapR Stream topic? My thought is this: you write to the topic, you partition based on a hash of known data components (like the 5-tuple of protocol, sport, dport, saddr, and daddr). From here, you could have things subscribe to the topic and process, you could ETL, you could query. There would be some concern at high scale, but even that has methods to address it. One of the challenges we have in infosec is getting processing close enough to capture, OR capture to file, move the files around, and have processes to read and process them. It would be cool to combine the pcap plugin directly with the kafka/stream plugin in drill and be able to read live pcaps! One big challenge is the pcap format has a header, but it seems to be a simple header that we "may" be able to forge upon dropping the needle on a stream of packet records. I am just brainstorming aloud right now (I should consider waiting until at least 3 cups of coffee before I make this inner dialog public). Curious on thoughts here. On Fri, Jan 5, 2018 at 10:01 AM, Charles Givre wrote: > I assumed (incorrectly as it turned out) that I was not reading the > offsets correctly or something. > > > > On Jan 5, 2018, at 10:59, Ted Dunning wrote: > > > > Yeah... I got the same result. 
> > > > I will push my latest shortly (without the boolean stuff) > > > > On Fri, Jan 5, 2018 at 7:52 AM, Charles Givre wrote: > > > >> Ted, > >> I’m wondering if we’ve uncovered an unsupported, undocumented feature in > >> Drill with respect to the BitWriter. I made the following changes: > >> > >> > >> To PcapRecordReader.java, I added: > >> > >> > >> case "ack_flag": > >> if (packet.isTcpPacket()) { > >>int flag = (packet.getAckFlag()) ? 1 : 0; > >>setBooleanColumnValue(flag, pci, count); > >> } > >> break; > >> ... > >> > >> private void setBooleanColumnValue(final int data, final > >> ProjectedColumnInfo pci, final int count) { > >> ((NullableBitVector.Mutator) pci.vv.getMutator()) > >>.setSafe(count, data); > >> } > >> > >> Then I added this to PcapFormatUtils > >> > >> public static int getBit(final byte[] buf, final int posByte, final int > >> posBit ) { > >> byte valByte = buf[posByte]; > >> return valByte >> (8 - (posBit + 1)) & 0x0001; > >> } > >> > >> I added a column in the Schema.java and a function in Packet.java to get > >> the bit. I was getting NPEs and I’m wondering now if the cause wasn’t > my > >> code but rather a problem with the bitwriter. I’m going to play with > this > >> and see if I can get Drill to write a True/False value at all, and > report > >> back. If the BitWriter is throwing NPE, I’ll create a JIRA for it. > >> > >> Thanks for your work on the PCAP format reader. > >> —C > >> > >> > >> > >>> On Jan 3, 2018, at 17:33, Ted Dunning wrote: > >>> > >>> Don't think that will work. > >>> > >>> I tried what seemed obvious and got a NPE. Joys of Drill. > >>> > >>> > >>> > >>> On Wed, Jan 3, 2018 at 1:31 PM, Charles Givre > wrote: > >>> > This isn’t the most elegant example, but should do the trick. 
> > https://github.com/cgivre/drill-network-functions/blob/ > master/src/main/java/org/apache/drill/contrib/function/ > IsPrivateIP.java > >> < > https://github.com/cgivre/drill-network-functions/blob/ > master/src/main/java/org/apache/drill/contrib/function/ > >> IsPrivateIP.java> > > > > > > On Jan 3, 2018, at 16:09, Ted Dunning wrote: > > > > On Wed, Jan 3, 2018 at 12:17 PM, Charles Givre > >> wrote: > > > >> HI Ted, > >> This is really looking amazing. Thank you so much for working on > this. I > >> wanted to ask whether you’ve tried the BitWriter to write a Boolean > value? > >> I’ve done that in UDFs. > > > > > > Point me to an example. I think booleans would be vastly better. > > > >> > >> > >
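On the idea of "forging" a header when dropping the needle on a stream of packet records: the classic libpcap global header is a fixed 24 bytes, so synthesizing one in front of raw packet records is plausible. A hedged sketch follows (class and method names are mine; the field layout follows the published libpcap file format, and the caller-supplied values are assumptions):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PcapHeaderDemo {

    // Builds the 24-byte libpcap global header that would be prepended
    // ("forged") in front of a raw stream of packet records.
    public static byte[] globalHeader(int snapLen, int linkType) {
        ByteBuffer buf = ByteBuffer.allocate(24).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(0xa1b2c3d4);   // magic number (microsecond-resolution timestamps)
        buf.putShort((short) 2);  // version_major
        buf.putShort((short) 4);  // version_minor
        buf.putInt(0);            // thiszone: GMT-to-local offset
        buf.putInt(0);            // sigfigs: timestamp accuracy
        buf.putInt(snapLen);      // snaplen: max captured length per packet
        buf.putInt(linkType);     // network: 1 = LINKTYPE_ETHERNET
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] header = globalHeader(65535, 1);
        System.out.println(header.length); // prints 24
    }
}
```

Each packet record would still need its own 16-byte per-record header (timestamp, captured length, original length), which whatever writes to the Kafka/MapR Stream topic would have to emit alongside the raw bytes.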
[GitHub] drill pull request #1047: DRILL-5970: DrillParquetReader always builds the s...
Github user vdiravka commented on a diff in the pull request: https://github.com/apache/drill/pull/1047#discussion_r160419416 --- Diff: exec/vector/src/main/codegen/templates/BaseWriter.java --- @@ -106,37 +114,37 @@ MapOrListWriter list(String name); boolean isMapWriter(); boolean isListWriter(); -UInt1Writer uInt1(String name); -UInt2Writer uInt2(String name); -UInt4Writer uInt4(String name); -UInt8Writer uInt8(String name); -VarCharWriter varChar(String name); -Var16CharWriter var16Char(String name); -TinyIntWriter tinyInt(String name); -SmallIntWriter smallInt(String name); -IntWriter integer(String name); -BigIntWriter bigInt(String name); -Float4Writer float4(String name); -Float8Writer float8(String name); -BitWriter bit(String name); -VarBinaryWriter varBinary(String name); +UInt1Writer uInt1(String name, TypeProtos.DataMode dataMode); --- End diff -- > The mode passed here cannot be REPEATED: just won't work. So, can we pass REQUIRED? Yes, we can with my improvements - for OPTIONAL dataMode we will pass Nullable vectors, and the regular ones for REQUIRED. The proposed alternative is interesting; I will think about how to implement it in the best way. Maybe replacing SingleMapWriter with OptionalMapWriter and RequiredMapWriter could be done. Note: some good code refactoring was done here, so this PR shouldn't be closed. ---
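The OPTIONAL-vs-REQUIRED dispatch described in that comment can be illustrated with a toy sketch. All names below are stand-ins, not Drill's actual generated code; the local `DataMode` enum only mirrors the values of `TypeProtos.DataMode`:

```java
public class DataModeDemo {

    // Local stand-in for Drill's TypeProtos.DataMode
    enum DataMode { REQUIRED, OPTIONAL, REPEATED }

    // Sketch of the dispatch discussed above: OPTIONAL selects the nullable
    // writer/vector variant, REQUIRED the plain one, and REPEATED is rejected
    // because the scalar (String, DataMode) entry points can't express it.
    static String writerClassFor(DataMode mode) {
        switch (mode) {
            case OPTIONAL: return "NullableUInt1Writer";
            case REQUIRED: return "UInt1Writer";
            default:
                throw new IllegalArgumentException("REPEATED mode is not supported here");
        }
    }

    public static void main(String[] args) {
        System.out.println(writerClassFor(DataMode.OPTIONAL)); // prints NullableUInt1Writer
    }
}
```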
[GitHub] drill issue #1066: DRILL-3993: Changes to support Calcite 1.15
Github user vvysotskyi commented on the issue: https://github.com/apache/drill/pull/1066 @chunhui-shi, I have made additional fixes in new commits (the commits after "DRILL-3993: Changes after code review." 3120762). Could you please take a look? Also, I have created a pull request on incubator-calcite: https://github.com/mapr/incubator-calcite/pull/16 ---
[GitHub] drill issue #1049: DRILL-5971: Fix INT64, INT32 logical types in complex par...
Github user vvysotskyi commented on the issue: https://github.com/apache/drill/pull/1049 @parthchandra, thanks for the pull request! LGTM, +1. ---
[GitHub] drill issue #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 Stabali...
Github user ilooner commented on the issue: https://github.com/apache/drill/pull/1023 @paul-rogers For some reason this wasn't included in a batch commit. After rebasing onto master I noticed some tests were failing sporadically, so I've included a one-line fix for DRILL-6003 in the last commit. Can you please take a look and +1 again? Hopefully the change will go in this time. ---