[GitHub] [drill] jnturton commented on pull request #2743: DRILL-8391: Disable auto complete on the password field of web UI login forms

2023-01-20 Thread GitBox


jnturton commented on PR #2743:
URL: https://github.com/apache/drill/pull/2743#issuecomment-1398480919

   The dang squash and merge mangled the commit message!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre merged pull request #2743: DRILL-8391: Disable auto complete on the password field of web UI login forms

2023-01-20 Thread GitBox


cgivre merged PR #2743:
URL: https://github.com/apache/drill/pull/2743





[GitHub] [drill] jnturton commented on pull request #2636: DRILL-8290: Short cut recursive file listings for LIMIT 0 queries.

2023-01-20 Thread GitBox


jnturton commented on PR #2636:
URL: https://github.com/apache/drill/pull/2636#issuecomment-1398472306

   > For such queries the same QueryComputationHints will be used for both 
inputs, so it will cause incorrect results.
   
   @vvysotskyi the idea here was that only a LIMIT 0 on the _root_ SELECT is 
detected, in which case the single file optimisation can be done on _all_ 
inputs so a single flag is sufficient.
   
   However, I'm trying to implement a better approach that optimises LIMIT 0s at any level. Since files are first listed very early, during validation (so even before partition pruning), no RelNode trees are available and the detection will have to be done on the SqlNode tree.
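
   The root-only check and the more general SqlNode-tree walk can be contrasted with a small sketch. This uses simplified stand-in node classes, not Calcite's real SqlNode API; the class and method names here are illustrative only.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for a parsed query tree; illustrative only.
class SqlNodeSketch {
  final String kind;                  // e.g. "SELECT"
  final Long fetch;                   // LIMIT value if present, else null
  final List<SqlNodeSketch> children; // nested inputs (subqueries, joins)

  SqlNodeSketch(String kind, Long fetch, SqlNodeSketch... children) {
    this.kind = kind;
    this.fetch = fetch;
    this.children = Arrays.asList(children);
  }

  // Root-only detection: a LIMIT 0 on the root SELECT lets a single flag
  // apply the single-file shortcut to all inputs.
  static boolean isRootLimitZero(SqlNodeSketch root) {
    return root.fetch != null && root.fetch == 0L;
  }

  // General detection: find a LIMIT 0 at any level, which is what a walk
  // over the syntax tree (before any RelNode exists) makes possible.
  static boolean hasLimitZeroAnywhere(SqlNodeSketch node) {
    if (node.fetch != null && node.fetch == 0L) {
      return true;
    }
    for (SqlNodeSketch child : node.children) {
      if (hasLimitZeroAnywhere(child)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    SqlNodeSketch nested = new SqlNodeSketch("SELECT", 0L);
    SqlNodeSketch root = new SqlNodeSketch("SELECT", null, nested);
    System.out.println(isRootLimitZero(root));      // false: the LIMIT 0 is nested
    System.out.println(hasLimitZeroAnywhere(root)); // true
  }
}
```

   The point of the second method is only that a recursive walk sees nested LIMIT 0s that a root-level flag cannot.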





[GitHub] [drill] jnturton opened a new pull request, #2743: DRILL-8391: Disable auto complete on the password field of web UI login forms

2023-01-20 Thread GitBox


jnturton opened a new pull request, #2743:
URL: https://github.com/apache/drill/pull/2743

   # [DRILL-8391](https://issues.apache.org/jira/browse/DRILL-8391): Disable 
auto complete on the password field of web UI login forms
   
   ## Description
   
   In order to avoid triggering security scanners, it is necessary to set autocomplete="off" on the password field in the web UI login forms. This change probably has no real-world security benefit because
   
   > Even without a master password, in-browser password management is 
generally seen as a net gain for security. Since users do not have to remember 
passwords that the browser stores for them, they are able to choose stronger 
passwords than they would otherwise.
   > 
   > For this reason, many modern browsers do not support autocomplete="off" 
for login fields:
   > 
   > - If a site sets autocomplete="off" for a form, and the form includes 
username and password input fields, then the browser still offers to remember 
this login, and if the user agrees, the browser will autofill those fields the 
next time the user visits the page.
   > - If a site sets autocomplete="off" for username and password input 
fields, then the browser still offers to remember this login, and if the user 
agrees, the browser will autofill those fields the next time the user visits 
the page.
   
   Excerpt taken from [this Mozilla Developer Network 
page](https://developer.mozilla.org/en-US/docs/Web/Security/Securing_your_site/Turning_off_form_autocompletion).
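
   The change itself amounts to a single attribute assignment on the login form's password input. A minimal sketch of such a form follows; the field names and action URL here are hypothetical placeholders, not Drill's actual template markup.

```html
<!-- Hypothetical login form fragment. Only the autocomplete="off"
     assignment on the password input reflects this change; the field
     names and form action are placeholders. -->
<form action="/j_security_check" method="post">
  <input type="text" name="j_username" placeholder="Username">
  <input type="password" name="j_password" autocomplete="off" placeholder="Password">
  <input type="submit" value="Login">
</form>
```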
   
   ## Documentation
   N/A
   
   ## Testing
   Confirm that the attribute assignment `autocomplete="off"` is present on the password field of the web UI login form.
   





[GitHub] [drill] cgivre merged pull request #2742: DRILL-8390: Minor Improvements to PDF Reader

2023-01-19 Thread GitBox


cgivre merged PR #2742:
URL: https://github.com/apache/drill/pull/2742





[GitHub] [drill] cgivre opened a new pull request, #2742: DRILL-8390: Minor Improvements to PDF Reader

2023-01-18 Thread GitBox


cgivre opened a new pull request, #2742:
URL: https://github.com/apache/drill/pull/2742

   # [DRILL-8390](https://issues.apache.org/jira/browse/DRILL-8390): Minor 
Improvements to PDF Reader
   
   
   ## Description
   This PR makes some minor improvements to the PDF reader, including:
   - Fixes a minor bug where, in certain configurations, the first row of data was skipped.
   - Fixes a minor bug where empty tables caused crashes when the spreadsheet extraction algorithm was used.
   - Adds a `_table_count` metadata field.
   - Adds a `_table_index` metadata field to reflect the current table.
   
   ## Documentation
   See above.  Updated README.
   
   ## Testing
   Ran existing unit tests.  Manually tested against customer data.





[GitHub] [drill] cgivre commented on issue #2721: select * from hive ;report refcnt = 0 error

2023-01-18 Thread GitBox


cgivre commented on issue #2721:
URL: https://github.com/apache/drill/issues/2721#issuecomment-1387248414

   I'm just realizing something here.  Are you attempting to run an INSERT 
query into Hive via Drill? 





[GitHub] [drill] cgivre merged pull request #2741: [MINOR UPDATE]: Remove travis.yml

2023-01-17 Thread GitBox


cgivre merged PR #2741:
URL: https://github.com/apache/drill/pull/2741





[GitHub] [drill] cgivre closed pull request #2731: DRILL-5033: Query on JSON That Has Null as Value For Each Key

2023-01-17 Thread GitBox


cgivre closed pull request #2731: DRILL-5033: Query on JSON That Has Null as 
Value For Each Key
URL: https://github.com/apache/drill/pull/2731





[GitHub] [drill] cgivre commented on pull request #2731: DRILL-5033: Query on JSON That Has Null as Value For Each Key

2023-01-17 Thread GitBox


cgivre commented on PR #2731:
URL: https://github.com/apache/drill/pull/2731#issuecomment-1386304077

   I'm going to close this PR.  If there is any objection, we can revisit. 





[GitHub] [drill] cgivre opened a new pull request, #2741: [MINOR UPDATE]: Remove travis.yml

2023-01-17 Thread GitBox


cgivre opened a new pull request, #2741:
URL: https://github.com/apache/drill/pull/2741

   # [MINOR UPDATE]: Remove Travis.yml
   
   ## Description
   Per an INFRA request, the Apache Software Foundation is moving away from Travis CI and has requested that all projects deactivate it.
   
   ## Documentation
   N/A
   
   ## Testing
   N/A





[GitHub] [drill] cgivre merged pull request #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias

2023-01-17 Thread GitBox


cgivre merged PR #2733:
URL: https://github.com/apache/drill/pull/2733





[GitHub] [drill] vvysotskyi commented on pull request #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias

2023-01-16 Thread GitBox


vvysotskyi commented on PR #2733:
URL: https://github.com/apache/drill/pull/2733#issuecomment-1384890894

   Yes, it can be merged before Calcite is released.





[GitHub] [drill] cgivre merged pull request #2740: [MINOR UPDATE]: Clear Results after Splunk Unit Tests

2023-01-16 Thread GitBox


cgivre merged PR #2740:
URL: https://github.com/apache/drill/pull/2740





[GitHub] [drill] cgivre commented on pull request #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias

2023-01-16 Thread GitBox


cgivre commented on PR #2733:
URL: https://github.com/apache/drill/pull/2733#issuecomment-1384683174

   @vvysotskyi Can we merge this or should we wait for Calcite 1.33 to be 
released?





[GitHub] [drill] cgivre opened a new pull request, #2740: [MINOR UPDATE]: Clear Results after Splunk Unit Tests

2023-01-16 Thread GitBox


cgivre opened a new pull request, #2740:
URL: https://github.com/apache/drill/pull/2740

   ## Description
   This minor modification to the Splunk unit tests for user translation explicitly clears the result sets. During some other work, I found that these tests would occasionally fail. This fixes that.
   
   ## Documentation
   No user facing changes.
   
   ## Testing
   Existing unit tests.





[GitHub] [drill] cgivre commented on pull request #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias

2023-01-16 Thread GitBox


cgivre commented on PR #2733:
URL: https://github.com/apache/drill/pull/2733#issuecomment-1384307651

   One thing I noticed is that the Splunk tests sometimes fail locally as they don't have `results.clear()` at the end. This is inconsistent behavior, but I added that in the ES PR that I'm working on.
   Best,
   -- C
   





[GitHub] [drill] vvysotskyi commented on pull request #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias

2023-01-16 Thread GitBox


vvysotskyi commented on PR #2733:
URL: https://github.com/apache/drill/pull/2733#issuecomment-1384305483

   I think the Splunk tests have a condition somewhere to fail if I'm the author of the commit.
   Here is a CI run for another of my commits in the master branch that has the same error: https://github.com/apache/drill/actions/runs/3874925191/jobs/6620797212





[GitHub] [drill] jnturton commented on pull request #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias

2023-01-16 Thread GitBox


jnturton commented on PR #2733:
URL: https://github.com/apache/drill/pull/2733#issuecomment-1383868835

   @vvysotskyi given that most of the CI runs had passed, I ran (8, default-hadoop) again but the same failure turned up. I wouldn't expect a failure such as the following to affect only JDK 8, though, so I'm a little mystified.
   
   ```
   Error:SplunkWriterTest.testBasicCTASWithScalarDataTypes:136 Schemas 
don't match.
   Expected: [TupleSchema [PrimitiveColumnMetadata [`int_field` 
(VARCHAR:OPTIONAL)]], [PrimitiveColumnMetadata [`bigint_field` 
(VARCHAR:OPTIONAL)]], [PrimitiveColumnMetadata [`float4_field` 
(VARCHAR:OPTIONAL)]], [PrimitiveColumnMetadata [`float8_field` 
(VARCHAR:OPTIONAL)]], [PrimitiveColumnMetadata [`varchar_field` 
(VARCHAR:OPTIONAL)]], [PrimitiveColumnMetadata [`date_field` 
(VARCHAR:OPTIONAL)]], [PrimitiveColumnMetadata [`time_field` 
(VARCHAR:OPTIONAL)]], [PrimitiveColumnMetadata [`timestamp_field` 
(VARCHAR:OPTIONAL)]], [PrimitiveColumnMetadata [`boolean_field` 
(VARCHAR:OPTIONAL)]]]
   Actual:   [TupleSchema [PrimitiveColumnMetadata [`int_field` 
(INT:OPTIONAL)]], [PrimitiveColumnMetadata [`bigint_field` (INT:OPTIONAL)]], 
[PrimitiveColumnMetadata [`float4_field` (INT:OPTIONAL)]], 
[PrimitiveColumnMetadata [`float8_field` (INT:OPTIONAL)]], 
[PrimitiveColumnMetadata [`varchar_field` (INT:OPTIONAL)]], 
[PrimitiveColumnMetadata [`date_field` (INT:OPTIONAL)]], 
[PrimitiveColumnMetadata [`time_field` (INT:OPTIONAL)]], 
[PrimitiveColumnMetadata [`timestamp_field` (INT:OPTIONAL)]], 
[PrimitiveColumnMetadata [`boolean_field` (INT:OPTIONAL)]]]
   ```





[GitHub] [drill] cgivre merged pull request #2737: DRILL-8384: Add Format Plugin for Microsoft Access

2023-01-15 Thread GitBox


cgivre merged PR #2737:
URL: https://github.com/apache/drill/pull/2737





[GitHub] [drill] vvysotskyi commented on a diff in pull request #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias

2023-01-15 Thread GitBox


vvysotskyi commented on code in PR #2733:
URL: https://github.com/apache/drill/pull/2733#discussion_r1070571396


##
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/WorkspaceSchemaFactory.java:
##
@@ -403,8 +404,24 @@ private View getView(DotDrillFile f) throws IOException {
   return f.getView(mapper);
 }
 
+private String getTemporaryName(String name) {
+  if (isTemporaryWorkspace()) {
+String tableName = DrillStringUtils.removeLeadingSlash(name);
+return schemaConfig.getTemporaryTableName(tableName);
+  }
+  return null;
+}
+
+private boolean isTemporaryWorkspace() {

Review Comment:
   I think it is unlikely that it would be reused.



##
exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/MetadataProvider.java:
##
@@ -607,6 +608,16 @@ public String getQueryUserName() {
   @Override public UserCredentials getQueryUserCredentials() {
 return session.getCredentials();
   }
+
+  @Override
+  public String getTemporaryTableName(String table) {

Review Comment:
   I agree it is not good to have such interfaces with unsupported methods. 
Ideally, we should split them into several interfaces instead and use broader 
ones in places where it is required.



##
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/conversion/DrillCalciteCatalogReader.java:
##
@@ -135,14 +112,15 @@ public Prepare.PreparingTable getTable(List<String> names) {
   }
 
   private void checkTemporaryTable(List<String> names) {
-if (allowTemporaryTables) {
+if (allowTemporaryTables || !needsTemporaryTableCheck(names, 
session.getDefaultSchemaPath(), drillConfig)) {
   return;
 }
-String originalTableName = 
session.getOriginalTableNameFromTemporaryTable(names.get(names.size() - 1));
+String tableName = names.get(names.size() - 1);
+String originalTableName = session.resolveTemporaryTableName(tableName);
 if (originalTableName != null) {
   throw UserException
   .validationError()
-  .message("Temporary tables usage is disallowed. Used temporary table 
name: [%s].", originalTableName)
+  .message("Temporary tables usage is disallowed. Used temporary table 
name: [%s].", tableName)

Review Comment:
   Thanks, replaced it.



##
exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/MetadataProvider.java:
##
@@ -607,6 +608,16 @@ public String getQueryUserName() {
   @Override public UserCredentials getQueryUserCredentials() {
 return session.getCredentials();
   }
+
+  @Override
+  public String getTemporaryTableName(String table) {
+return session.resolveTemporaryTableName(table);
+  }
+
+  @Override
+  public String getTemporaryWorkspace() {
+return config.getString(ExecConstants.DEFAULT_TEMPORARY_WORKSPACE);

Review Comment:
   Yes, the config is the only source for this property. But I think it is better to have an interface that provides only schema config information rather than allowing callers to access the config themselves. The current approach helps to encapsulate it, so I would prefer to leave it as it is.






[GitHub] [drill] Leon-WTF commented on a diff in pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

2023-01-14 Thread GitBox


Leon-WTF commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1070523002


##
exec/java-exec/src/test/java/org/apache/drill/TestSetOp.java:
##
@@ -0,0 +1,1093 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill;
+
+import org.apache.drill.exec.planner.physical.PlannerSettings;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchemaBuilder;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.drill.categories.OperatorTest;
+import org.apache.drill.categories.SqlTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.io.BufferedWriter;
+import java.io.File;
+import java.io.FileWriter;
+import java.nio.file.Paths;
+import java.util.List;
+
+@Category({SqlTest.class, OperatorTest.class})
+public class TestSetOp extends ClusterTest {

Review Comment:
   The first empty batch will be skipped in `sniffNonEmptyBatch`.






[GitHub] [drill] cgivre opened a new pull request, #2739: DRILL-8387: Add Support for User Translation to ElasticSearch Plugin

2023-01-13 Thread GitBox


cgivre opened a new pull request, #2739:
URL: https://github.com/apache/drill/pull/2739

   # [DRILL-8387](https://issues.apache.org/jira/browse/DRILL-8387): Add 
Support for User Translation to ElasticSearch Plugin
   
   ## Description
   This PR adds support for user translation to the ElasticSearch plugin.
   
   ## Documentation
   Updated README.
   
   
   ## Testing
   Working on unit tests.





[GitHub] [drill] cgivre merged pull request #2738: DRILL-8386: Add Support for User Translation for Cassandra

2023-01-12 Thread GitBox


cgivre merged PR #2738:
URL: https://github.com/apache/drill/pull/2738





[GitHub] [drill] cgivre commented on pull request #2738: DRILL-8386: Add Support for User Translation for Cassandra

2023-01-12 Thread GitBox


cgivre commented on PR #2738:
URL: https://github.com/apache/drill/pull/2738#issuecomment-1380565243

   @jnturton  Thanks for the review!





[GitHub] [drill] cgivre commented on a diff in pull request #2738: DRILL-8386: Add Support for User Translation for Cassandra

2023-01-12 Thread GitBox


cgivre commented on code in PR #2738:
URL: https://github.com/apache/drill/pull/2738#discussion_r1068265466


##
contrib/storage-splunk/README.md:
##
@@ -42,6 +42,10 @@ Sometimes Splunk has issue in connection to it:
 https://github.com/splunk/splunk-sdk-java/issues/62 
 To bypass it by Drill please specify "reconnectRetries": 3. It allows you to 
retry the connection several times.
 
+### User Translation
+The Splunk plugin supports user translation.  Simply set the `authMode` 
parameter to `USER_TRANSLATION` and use either the plain or vault credential 
provider for credentials.

Review Comment:
   That's probably a good idea.  I'd give it a +1.






[GitHub] [drill] jnturton commented on a diff in pull request #2738: DRILL-8386: Add Support for User Translation for Cassandra

2023-01-12 Thread GitBox


jnturton commented on code in PR #2738:
URL: https://github.com/apache/drill/pull/2738#discussion_r1068257539


##
contrib/storage-cassandra/src/test/java/org/apache/drill/exec/store/cassandra/CassandraUserTranslationTest.java:
##
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.cassandra;
+
+import org.apache.drill.categories.SlowTest;
+import org.apache.drill.common.config.DrillProperties;
+import org.apache.drill.common.exceptions.UserRemoteException;
+import org.apache.drill.exec.physical.rowSet.RowSet;
+import org.apache.drill.test.ClientFixture;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import static 
org.apache.drill.exec.rpc.user.security.testing.UserAuthenticatorTestImpl.ADMIN_USER;
+import static 
org.apache.drill.exec.rpc.user.security.testing.UserAuthenticatorTestImpl.ADMIN_USER_PASSWORD;
+import static 
org.apache.drill.exec.rpc.user.security.testing.UserAuthenticatorTestImpl.TEST_USER_1;
+import static 
org.apache.drill.exec.rpc.user.security.testing.UserAuthenticatorTestImpl.TEST_USER_1_PASSWORD;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+import static org.junit.jupiter.api.Assertions.fail;
+
+@Category({SlowTest.class})
+public class CassandraUserTranslationTest extends BaseCassandraTest {
+  @Test
+  public void testInfoSchemaQueryWithMissingCredentials() throws Exception {
+// This test validates that the correct credentials are sent down to 
Cassandra.
+// This user should not see the ut_cassandra because they do not have 
valid credentials.
+ClientFixture client = cluster
+.clientBuilder()
+.property(DrillProperties.USER, ADMIN_USER)
+.property(DrillProperties.PASSWORD, ADMIN_USER_PASSWORD)
+.build();
+
+String sql = "SHOW DATABASES WHERE schema_name LIKE '%cassandra%'";
+
+RowSet results = client.queryBuilder().sql(sql).rowSet();
+assertEquals(1, results.rowCount());
+results.clear();
+  }
+
+  @Test
+  public void testInfoSchemaQueryWithValidCredentials() throws Exception {
+// This test validates that the cassandra connection with user translation 
appears whne the user is

Review Comment:
   ```suggestion
   // This test validates that the cassandra connection with user 
translation appears when the user is
   ```



##
contrib/storage-splunk/README.md:
##
@@ -42,6 +42,10 @@ Sometimes Splunk has issue in connection to it:
 https://github.com/splunk/splunk-sdk-java/issues/62 
 To bypass it by Drill please specify "reconnectRetries": 3. It allows you to 
retry the connection several times.
 
+### User Translation
+The Splunk plugin supports user translation.  Simply set the `authMode` 
parameter to `USER_TRANSLATION` and use either the plain or vault credential 
provider for credentials.

Review Comment:
   Do you think I should make authMode values case-insensitive before 1.21 is released? USER_TRANSLATION looks odd compared to other Drill config values and is only in caps due to the coincidence that the Java enum names are in caps.
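
   Making the values case-insensitive could be as small as normalising the string before the enum lookup. A sketch with a stand-in enum (only USER_TRANSLATION is taken from the discussion above; the other constant and the helper name are hypothetical):

```java
import java.util.Locale;

public class AuthModeParseSketch {
  // Stand-in enum; USER_TRANSLATION is the value discussed above,
  // SHARED_USER is a hypothetical sibling added for illustration.
  enum AuthMode { SHARED_USER, USER_TRANSLATION }

  // Normalise case so "user_translation", "User_Translation" and
  // "USER_TRANSLATION" all resolve to the same enum constant.
  static AuthMode parse(String value) {
    return AuthMode.valueOf(value.trim().toUpperCase(Locale.ROOT));
  }

  public static void main(String[] args) {
    System.out.println(parse("user_translation")); // USER_TRANSLATION
  }
}
```

   Using Locale.ROOT avoids locale-dependent surprises in the upper-casing (e.g. the Turkish dotless i).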






[GitHub] [drill] jnturton commented on a diff in pull request #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias

2023-01-12 Thread GitBox


jnturton commented on code in PR #2733:
URL: https://github.com/apache/drill/pull/2733#discussion_r1068252216


##
exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/MetadataProvider.java:
##
@@ -607,6 +608,16 @@ public String getQueryUserName() {
   @Override public UserCredentials getQueryUserCredentials() {
 return session.getCredentials();
   }
+
+  @Override
+  public String getTemporaryTableName(String table) {

Review Comment:
   This looks like another case where we wouldn't need to keep expanding interfaces like SchemaConfigInfoProvider, adding partial implementations where some methods throw UnsupportedOperationException, if we just had a good way of accessing the UserSession from most layers of Drill. It's not something for this PR, but I wanted to remark on it to get your opinion, since I remember having to work the same way when I was trying to expose UserCredentials (visible above) for user translation in plugins.



##
exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/MetadataProvider.java:
##
@@ -607,6 +608,16 @@ public String getQueryUserName() {
   @Override public UserCredentials getQueryUserCredentials() {
 return session.getCredentials();
   }
+
+  @Override
+  public String getTemporaryTableName(String table) {
+return session.resolveTemporaryTableName(table);
+  }
+
+  @Override
+  public String getTemporaryWorkspace() {
+return config.getString(ExecConstants.DEFAULT_TEMPORARY_WORKSPACE);

Review Comment:
   Have I got it right that this config option value is the only value returned 
by implementations of getTemporaryWorkspace? If so, do we need this method or 
could its callers look up the config value themselves instead?



##
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/conversion/DrillCalciteCatalogReader.java:
##
@@ -135,14 +112,15 @@ public Prepare.PreparingTable getTable(List<String> names) {
   }
 
   private void checkTemporaryTable(List<String> names) {
-if (allowTemporaryTables) {
+if (allowTemporaryTables || !needsTemporaryTableCheck(names, 
session.getDefaultSchemaPath(), drillConfig)) {
   return;
 }
-String originalTableName = 
session.getOriginalTableNameFromTemporaryTable(names.get(names.size() - 1));
+String tableName = names.get(names.size() - 1);
+String originalTableName = session.resolveTemporaryTableName(tableName);
 if (originalTableName != null) {
   throw UserException
   .validationError()
-  .message("Temporary tables usage is disallowed. Used temporary table 
name: [%s].", originalTableName)
+  .message("Temporary tables usage is disallowed. Used temporary table 
name: [%s].", tableName)

Review Comment:
   ```suggestion
 .message("A reference to temporary table [%s] was made in a 
context where temporary table references are not allowed.", tableName)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] jnturton commented on a diff in pull request #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias

2023-01-11 Thread GitBox


jnturton commented on code in PR #2733:
URL: https://github.com/apache/drill/pull/2733#discussion_r1067718395


##
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/WorkspaceSchemaFactory.java:
##
@@ -403,8 +404,24 @@ private View getView(DotDrillFile f) throws IOException {
   return f.getView(mapper);
 }
 
+private String getTemporaryName(String name) {
+  if (isTemporaryWorkspace()) {
+String tableName = DrillStringUtils.removeLeadingSlash(name);
+return schemaConfig.getTemporaryTableName(tableName);
+  }
+  return null;
+}
+
+private boolean isTemporaryWorkspace() {

Review Comment:
   Could this utility method move to SchemaConfig or SchemaUtilities so that 
it's available for reuse elsewhere or is that unlikely?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre opened a new pull request, #2738: DRILL-8386: Add Support for User Translation for Cassandra

2023-01-11 Thread GitBox


cgivre opened a new pull request, #2738:
URL: https://github.com/apache/drill/pull/2738

   # [DRILL-8386](https://issues.apache.org/jira/browse/DRILL-8386): Add 
Support for User Translation for Cassandra
   
   ## Description
   Adds support for user translation for Apache Cassandra.
   
   ## Documentation
   Updated README.
   
   ## Testing
   Added unit tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] LYCJeff commented on issue #2735: Use some configuration items to specify the parameters as filters that allow them to be passed to headers and post body through SQL dynamically

2023-01-11 Thread GitBox


LYCJeff commented on issue #2735:
URL: https://github.com/apache/drill/issues/2735#issuecomment-1378369362

   > @LYCJeff Drill already does this. Take a look at the docs 
(https://github.com/apache/drill/tree/master/contrib/storage-http#method) for 
the `postBodyLocation` parameter.
   > 
   > I actually like your design better however.
   
   What about headers? Some APIs require a digital signature in the headers to be 
generated at access time.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre merged pull request #2729: DRILL-8376: Add Distribution UDFs

2023-01-10 Thread GitBox


cgivre merged PR #2729:
URL: https://github.com/apache/drill/pull/2729


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre commented on a diff in pull request #2729: DRILL-8376: Add Distribution UDFs

2023-01-10 Thread GitBox


cgivre commented on code in PR #2729:
URL: https://github.com/apache/drill/pull/2729#discussion_r1065940539


##
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/DistributionFunctions.java:
##
@@ -51,31 +51,29 @@ public static class WidthBucketFunction implements 
DrillSimpleFunc {
 @Workspace
 double binWidth;
 
+@Workspace
+int bucketCount;
+
 @Output
 IntHolder bucket;
 
 @Override
 public void setup() {
   double max = MaxRangeValueHolder.value;
   double min = MinRangeValueHolder.value;
-  int bucketCount = bucketCountHolder.value;
+  bucketCount = bucketCountHolder.value;
   binWidth = (max - min) / bucketCount;
 }
 
 @Override
 public void eval() {
-  // There is probably a more elegant way of doing this...
-  double binFloor = MinRangeValueHolder.value;
-  double binCeiling = binFloor + binWidth;
-
-  for (int i = 1; i <= bucketCountHolder.value; i++) {
-if (inputValue.value <= binCeiling && inputValue.value > binFloor) {
-   bucket.value = i;
-   break;
-} else {
-  binFloor = binCeiling;
-  binCeiling = binWidth * (i + 1);
-}
+  if (inputValue.value < MinRangeValueHolder.value) {
+bucket.value = 0;
+  } else if (inputValue.value > MaxRangeValueHolder.value) {
+bucket.value = bucketCount + 1;
+  } else {
+double f = (1 + (inputValue.value - MinRangeValueHolder.value) / 
binWidth);

Review Comment:
   Oops... That was a test variable.  Removed. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre opened a new pull request, #2737: DRILL-8384: Add Format Plugin for Microsoft Access

2023-01-10 Thread GitBox


cgivre opened a new pull request, #2737:
URL: https://github.com/apache/drill/pull/2737

   # [DRILL-8384](https://issues.apache.org/jira/browse/DRILL-8384): Add Format 
Plugin for Microsoft Access
   
   ## Description
   Added format plugin to enable Drill to read MS Access files. 
   
   ## Documentation
   See README.
   
   ## Testing
   Added unit tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre commented on issue #2735: Use some configuration items to specify the parameters as filters that allow them to be passed to headers and post body through SQL dynamically

2023-01-10 Thread GitBox


cgivre commented on issue #2735:
URL: https://github.com/apache/drill/issues/2735#issuecomment-1377178966

   @LYCJeff  Drill already does this.  Take a look at the docs 
(https://github.com/apache/drill/tree/master/contrib/storage-http#method) for 
the `postBodyLocation` parameter.  
   
   I actually like your design better however. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] LYCJeff closed issue #2736: Use some configuration items to specify the parameters as filters that allow them to be passed to headers and post body through SQL dynamically

2023-01-10 Thread GitBox


LYCJeff closed issue #2736: Use some configuration items to specify the 
parameters as filters that allow them to be passed to headers and post body 
through SQL dynamically
URL: https://github.com/apache/drill/issues/2736


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] LYCJeff opened a new issue, #2736: Use some configuration items to specify the parameters as filters that allow them to be passed to headers and post body through SQL dynamically

2023-01-10 Thread GitBox


LYCJeff opened a new issue, #2736:
URL: https://github.com/apache/drill/issues/2736

   Some APIs require information to be sent as headers or a post body dynamically, 
so I'm wondering if we can pass it in through a filter statement.
   
   Perhaps we could design it like the `params` field in the connection 
parameters. For example:
   
   {
  "url": "https://api.sunrise-sunset.org/json",
 "requireTail": false,
 "bodyParams": ["lat", "lng", "date"]
   }
   
   SQL Query:
   SELECT *
   FROM api.sunrise
   WHERE `body.lat` = 36.7201600
 AND `body.lng` = -4.4203400
 AND `body.date` = '2019-10-02';
   
   Then, the post body would be:
   {
  "lat": 36.7201600,
  "lng": -4.4203400,
  "date": "2019-10-02"
  }
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] LYCJeff opened a new issue, #2735: Use some configuration items to specify the parameters as filters that allow them to be passed to headers and post body through SQL dynamically

2023-01-10 Thread GitBox


LYCJeff opened a new issue, #2735:
URL: https://github.com/apache/drill/issues/2735

   Some APIs require information to be sent as headers or a post body dynamically, 
so I'm wondering if we can pass it in through a filter statement.
   
   Perhaps we could design it like the `params` field in the connection 
parameters. For example:
   
   {
  "url": "https://api.sunrise-sunset.org/json",
 "requireTail": false,
 "bodyParams": ["lat", "lng", "date"]
   }
   
   SQL Query:
   SELECT *
   FROM api.sunrise
   WHERE `body.lat` = 36.7201600
 AND `body.lng` = -4.4203400
 AND `body.date` = '2019-10-02';
   
   Then, the post body would be:
   {
  "lat": 36.7201600,
  "lng": -4.4203400,
  "date": "2019-10-02"
  }
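   The mapping the proposal implies — lifting `body.`-prefixed equality 
predicates out of the WHERE clause into a POST body — could be sketched along 
these lines (all class and method names here are hypothetical illustrations, 
not existing Drill APIs):
   
   ```java
   import java.util.LinkedHashMap;
   import java.util.Map;
   
   // Hypothetical illustration: given equality predicates keyed like
   // "body.lat", collect the ones whose suffix appears in the plugin's
   // configured bodyParams list and emit them as the POST body map.
   class BodyParamsSketch {
     static Map<String, Object> extractBodyParams(
         Map<String, Object> predicates, java.util.List<String> bodyParams) {
       Map<String, Object> body = new LinkedHashMap<>();
       for (Map.Entry<String, Object> e : predicates.entrySet()) {
         String key = e.getKey();
         // Strip the "body." prefix and keep only configured parameters.
         if (key.startsWith("body.") && bodyParams.contains(key.substring(5))) {
           body.put(key.substring(5), e.getValue());
         }
       }
       return body;
     }
   
     public static void main(String[] args) {
       Map<String, Object> predicates = new LinkedHashMap<>();
       predicates.put("body.lat", 36.72016);
       predicates.put("body.lng", -4.42034);
       predicates.put("body.date", "2019-10-02");
       System.out.println(extractBodyParams(
           predicates, java.util.List.of("lat", "lng", "date")));
       // prints {lat=36.72016, lng=-4.42034, date=2019-10-02}
     }
   }
   ```
   
   Predicates that are not listed in `bodyParams` would remain ordinary filters.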
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] jnturton commented on a diff in pull request #2729: DRILL-8376: Add Distribution UDFs

2023-01-09 Thread GitBox


jnturton commented on code in PR #2729:
URL: https://github.com/apache/drill/pull/2729#discussion_r1065424637


##
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/DistributionFunctions.java:
##
@@ -51,31 +51,29 @@ public static class WidthBucketFunction implements 
DrillSimpleFunc {
 @Workspace
 double binWidth;
 
+@Workspace
+int bucketCount;
+
 @Output
 IntHolder bucket;
 
 @Override
 public void setup() {
   double max = MaxRangeValueHolder.value;
   double min = MinRangeValueHolder.value;
-  int bucketCount = bucketCountHolder.value;
+  bucketCount = bucketCountHolder.value;
   binWidth = (max - min) / bucketCount;
 }
 
 @Override
 public void eval() {
-  // There is probably a more elegant way of doing this...
-  double binFloor = MinRangeValueHolder.value;
-  double binCeiling = binFloor + binWidth;
-
-  for (int i = 1; i <= bucketCountHolder.value; i++) {
-if (inputValue.value <= binCeiling && inputValue.value > binFloor) {
-   bucket.value = i;
-   break;
-} else {
-  binFloor = binCeiling;
-  binCeiling = binWidth * (i + 1);
-}
+  if (inputValue.value < MinRangeValueHolder.value) {
+bucket.value = 0;
+  } else if (inputValue.value > MaxRangeValueHolder.value) {
+bucket.value = bucketCount + 1;
+  } else {
+double f = (1 + (inputValue.value - MinRangeValueHolder.value) / 
binWidth);

Review Comment:
   It looks like `f` is recomputed rather than used in what follows.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre merged pull request #2734: DRILL-8381: Add support for filtered aggregate calls

2023-01-09 Thread GitBox


cgivre merged PR #2734:
URL: https://github.com/apache/drill/pull/2734


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre commented on pull request #2729: DRILL-8376: Add Distribution UDFs

2023-01-09 Thread GitBox


cgivre commented on PR #2729:
URL: https://github.com/apache/drill/pull/2729#issuecomment-1375739889

   @jnturton Thanks for the review.  I believe I've addressed your comments.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre commented on a diff in pull request #2729: DRILL-8376: Add Distribution UDFs

2023-01-09 Thread GitBox


cgivre commented on code in PR #2729:
URL: https://github.com/apache/drill/pull/2729#discussion_r1064725653


##
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/DistributionFunctions.java:
##
@@ -0,0 +1,335 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.exec.expr.DrillAggFunc;
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.FunctionScope;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.annotations.Workspace;
+import org.apache.drill.exec.expr.holders.Float8Holder;
+import org.apache.drill.exec.expr.holders.IntHolder;
+
+public class DistributionFunctions {
+
+  @FunctionTemplate(names = {"width_bucket", "widthBucket"},
+  scope = FunctionScope.SIMPLE,
+  nulls = NullHandling.NULL_IF_NULL)
+  public static class WidthBucketFunction implements DrillSimpleFunc {
+
+@Param
+Float8Holder inputValue;
+
+@Param
+Float8Holder MinRangeValueHolder;
+
+@Param
+Float8Holder MaxRangeValueHolder;
+
+@Param
+IntHolder bucketCountHolder;
+
+@Workspace
+double binWidth;
+
+@Output
+IntHolder bucket;
+
+@Override
+public void setup() {
+  double max = MaxRangeValueHolder.value;
+  double min = MinRangeValueHolder.value;
+  int bucketCount = bucketCountHolder.value;
+  binWidth = (max - min) / bucketCount;
+}
+
+@Override
+public void eval() {
+  // There is probably a more elegant way of doing this...
+  double binFloor = MinRangeValueHolder.value;
+  double binCeiling = binFloor + binWidth;
+
+  for (int i = 1; i <= bucketCountHolder.value; i++) {
+if (inputValue.value <= binCeiling && inputValue.value > binFloor) {
+   bucket.value = i;
+   break;
+} else {
+  binFloor = binCeiling;
+  binCeiling = binWidth * (i + 1);
+}
+  }

Review Comment:
   @jnturton I looked at the PostgreSQL docs (this function is modeled after 
PostgreSQL's `width_bucket`), and saw that in PostgreSQL, if the value is less 
than the minimum of the range, it goes into bucket `0`, and if it is larger than 
the maximum, it goes into bucket `n+1`. 
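   The semantics described above can be sketched as a standalone Java method (an 
illustrative sketch of PostgreSQL-style behaviour, not Drill's UDF code; note 
that the `value == max` edge case simply follows the linear formula here):
   
   ```java
   // Sketch of PostgreSQL-style width_bucket semantics: values below min map
   // to bucket 0, values above max to bucket n+1, and in-range values map
   // linearly onto buckets 1..n.
   class WidthBucketSketch {
     static int widthBucket(double value, double min, double max, int bucketCount) {
       if (value < min) {
         return 0;                 // underflow bucket
       } else if (value > max) {
         return bucketCount + 1;   // overflow bucket
       }
       double binWidth = (max - min) / bucketCount;
       return (int) (1 + (value - min) / binWidth);
     }
   
     public static void main(String[] args) {
       // Matches the example in the PostgreSQL docs:
       // width_bucket(5.35, 0.024, 10.06, 5) -> 3
       System.out.println(widthBucket(5.35, 0.024, 10.06, 5)); // prints 3
     }
   }
   ```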
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre commented on a diff in pull request #2729: DRILL-8376: Add Distribution UDFs

2023-01-09 Thread GitBox


cgivre commented on code in PR #2729:
URL: https://github.com/apache/drill/pull/2729#discussion_r1064724101


##
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/DistributionFunctions.java:
##
@@ -0,0 +1,335 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.exec.expr.DrillAggFunc;
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.FunctionScope;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.annotations.Workspace;
+import org.apache.drill.exec.expr.holders.Float8Holder;
+import org.apache.drill.exec.expr.holders.IntHolder;
+
+public class DistributionFunctions {
+
+  @FunctionTemplate(names = {"width_bucket", "widthBucket"},
+  scope = FunctionScope.SIMPLE,
+  nulls = NullHandling.NULL_IF_NULL)
+  public static class WidthBucketFunction implements DrillSimpleFunc {
+
+@Param
+Float8Holder inputValue;
+
+@Param
+Float8Holder MinRangeValueHolder;
+
+@Param
+Float8Holder MaxRangeValueHolder;
+
+@Param
+IntHolder bucketCountHolder;
+
+@Workspace
+double binWidth;
+
+@Output
+IntHolder bucket;
+
+@Override
+public void setup() {
+  double max = MaxRangeValueHolder.value;
+  double min = MinRangeValueHolder.value;
+  int bucketCount = bucketCountHolder.value;
+  binWidth = (max - min) / bucketCount;
+}
+
+@Override
+public void eval() {
+  // There is probably a more elegant way of doing this...
+  double binFloor = MinRangeValueHolder.value;
+  double binCeiling = binFloor + binWidth;
+
+  for (int i = 1; i <= bucketCountHolder.value; i++) {
+if (inputValue.value <= binCeiling && inputValue.value > binFloor) {
+   bucket.value = i;
+   break;
+} else {
+  binFloor = binCeiling;
+  binCeiling = binWidth * (i + 1);
+}
+  }
+}
+  }
+
+  @FunctionTemplate(
+  names = {"kendall_correlation","kendallCorrelation", "kendallTau", 
"kendall_tau"},
+  scope = FunctionScope.POINT_AGGREGATE,
+  nulls = NullHandling.INTERNAL
+  )
+  public static class KendallTauFunction implements DrillAggFunc {
+@Param
+Float8Holder xInput;
+
+@Param
+Float8Holder yInput;
+
+@Workspace
+Float8Holder prevXValue;
+
+@Workspace
+Float8Holder prevYValue;
+
+@Workspace
+IntHolder concordantPairs;
+
+@Workspace
+IntHolder discordantPairs;
+
+@Workspace
+IntHolder n;
+
+@Output
+Float8Holder tau;
+
+@Override
+public void add() {
+  double xValue = xInput.value;
+  double yValue = yInput.value;
+
+  if (n.value > 0) {
+if ((xValue > prevXValue.value && yValue > prevYValue.value) || 
(xValue < prevXValue.value && yValue < prevYValue.value)) {
+  concordantPairs.value = concordantPairs.value + 1;
+} else if ((xValue > prevXValue.value && yValue < prevYValue.value) || 
(xValue < prevXValue.value && yValue > prevYValue.value)) {
+  discordantPairs.value = discordantPairs.value + 1;
+} else {
+  //Tie...
+}
+
+prevXValue.value = xInput.value;
+prevYValue.value = yInput.value;
+n.value = n.value + 1;

Review Comment:
   Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] vvysotskyi opened a new pull request, #2734: DRILL-8381: Add support for filtered aggregate calls

2023-01-09 Thread GitBox


vvysotskyi opened a new pull request, #2734:
URL: https://github.com/apache/drill/pull/2734

   # [DRILL-8381](https://issues.apache.org/jira/browse/DRILL-8381): Add 
support for filtered aggregate calls
   
   ## Description
   For the case when a filtering expression is specified, Drill will generate an 
`if` expression to obtain the field value that will be used in the aggregate 
function only when the filter predicate is true. The filter expression specified 
within an aggregate function is present in the underlying project, so it is 
enough to get a reference to it to use it as a condition.
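   Conceptually, a filtered aggregate call such as `SUM(x) FILTER (WHERE y > 0)` 
behaves like the following hedged Java model (an illustration of the semantics, 
not the code Drill generates):
   
   ```java
   import java.util.List;
   
   // Model of SUM(x) FILTER (WHERE y > 0): the aggregate consumes the column
   // value only for rows where the filter predicate is true, mirroring the
   // generated "if" expression described above. Each row is {x, y}.
   class FilteredSumSketch {
     static double filteredSum(List<double[]> rows) {
       double sum = 0;
       for (double[] row : rows) {
         double x = row[0];
         boolean predicate = row[1] > 0;  // FILTER (WHERE y > 0)
         if (predicate) {                 // the injected "if" expression
           sum += x;
         }
       }
       return sum;
     }
   
     public static void main(String[] args) {
       List<double[]> rows = List.of(
           new double[] {10, 1}, new double[] {20, -1}, new double[] {30, 2});
       System.out.println(filteredSum(rows)); // prints 40.0
     }
   }
   ```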
   
   ## Documentation
   NA
   
   ## Testing
   Added UT.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] vvysotskyi opened a new pull request, #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias

2023-01-07 Thread GitBox


vvysotskyi opened a new pull request, #2733:
URL: https://github.com/apache/drill/pull/2733

   # [DRILL-8380](https://issues.apache.org/jira/browse/DRILL-8380): Remove 
customised SqlValidatorImpl.deriveAlias
   
   ## Description
   As pointed out in CALCITE-5463, `SqlValidatorImpl.deriveAlias` isn't 
intended to be customized, and it is not used in the latest version. It causes 
issues with table and storage aliases and temporary tables functionality.
   
   Unfortunately, there is no way to preserve the existing temporary table 
behavior. After these changes, temporary tables will be accessible only within 
their workspace, like regular tables. 
   
   ## Documentation
   Yes.
   
   ## Testing
   All UT pass.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] jnturton commented on issue #2732: Any chance of INFORMATION SCHEMA updates like storing it in any database instead of in-memory data

2023-01-06 Thread GitBox


jnturton commented on issue #2732:
URL: https://github.com/apache/drill/issues/2732#issuecomment-1373448803

   Using as specific a WHERE clause as possible in your information schema 
query will usually help.
   
   On 06 January 2023 06:48:23 SAST, Porika Venkatesh ***@***.***> wrote:
   >I have connected to the Hive Metastore, but my application depends mainly on 
metadata, so we use Drill's **INFORMATION SCHEMA** as it is a virtual datastore 
and we rely on its metadata. Queries against **INFORMATION SCHEMA** are too slow. 
Any suggestions would be a great help. Thank you
   >
   >-- 
   >Reply to this email directly or view it on GitHub:
   >https://github.com/apache/drill/issues/2732#issuecomment-1373140572
   >You are receiving this because you commented.
   >
   >Message ID: ***@***.***>
   -- 
   Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] porika-v commented on issue #2732: Any chance of INFORMATION SCHEMA updates like storing it in any database instead of in-memory data

2023-01-05 Thread GitBox


porika-v commented on issue #2732:
URL: https://github.com/apache/drill/issues/2732#issuecomment-1373140572

   I have connected to the Hive Metastore, but my application depends mainly on 
metadata, so we use Drill's **INFORMATION SCHEMA** as it is a virtual datastore 
and we rely on its metadata. Queries against **INFORMATION SCHEMA** are too slow. 
Any suggestions would be a great help. Thank you


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] jnturton commented on a diff in pull request #2729: DRILL-8376: Add Distribution UDFs

2023-01-05 Thread GitBox


jnturton commented on code in PR #2729:
URL: https://github.com/apache/drill/pull/2729#discussion_r1062569553


##
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/DistributionFunctions.java:
##
@@ -0,0 +1,335 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.exec.expr.DrillAggFunc;
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.FunctionScope;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.annotations.Workspace;
+import org.apache.drill.exec.expr.holders.Float8Holder;
+import org.apache.drill.exec.expr.holders.IntHolder;
+
+public class DistributionFunctions {
+
+  @FunctionTemplate(names = {"width_bucket", "widthBucket"},
+  scope = FunctionScope.SIMPLE,
+  nulls = NullHandling.NULL_IF_NULL)
+  public static class WidthBucketFunction implements DrillSimpleFunc {
+
+@Param
+Float8Holder inputValue;
+
+@Param
+Float8Holder MinRangeValueHolder;
+
+@Param
+Float8Holder MaxRangeValueHolder;
+
+@Param
+IntHolder bucketCountHolder;
+
+@Workspace
+double binWidth;
+
+@Output
+IntHolder bucket;
+
+@Override
+public void setup() {
+  double max = MaxRangeValueHolder.value;
+  double min = MinRangeValueHolder.value;
+  int bucketCount = bucketCountHolder.value;
+  binWidth = (max - min) / bucketCount;
+}
+
+@Override
+public void eval() {
+  // There is probably a more elegant way of doing this...
+  double binFloor = MinRangeValueHolder.value;
+  double binCeiling = binFloor + binWidth;
+
+  for (int i = 1; i <= bucketCountHolder.value; i++) {
+if (inputValue.value <= binCeiling && inputValue.value > binFloor) {
+   bucket.value = i;
+   break;
+} else {
+  binFloor = binCeiling;
+  binCeiling = binWidth * (i + 1);
+}
+  }
+}
+  }
+
+  @FunctionTemplate(
+  names = {"kendall_correlation","kendallCorrelation", "kendallTau", 
"kendall_tau"},
+  scope = FunctionScope.POINT_AGGREGATE,
+  nulls = NullHandling.INTERNAL
+  )
+  public static class KendallTauFunction implements DrillAggFunc {
+@Param
+Float8Holder xInput;
+
+@Param
+Float8Holder yInput;
+
+@Workspace
+Float8Holder prevXValue;
+
+@Workspace
+Float8Holder prevYValue;
+
+@Workspace
+IntHolder concordantPairs;
+
+@Workspace
+IntHolder discordantPairs;
+
+@Workspace
+IntHolder n;
+
+@Output
+Float8Holder tau;
+
+@Override
+public void add() {
+  double xValue = xInput.value;
+  double yValue = yInput.value;
+
+  if (n.value > 0) {
+if ((xValue > prevXValue.value && yValue > prevYValue.value) || 
(xValue < prevXValue.value && yValue < prevYValue.value)) {
+  concordantPairs.value = concordantPairs.value + 1;
+} else if ((xValue > prevXValue.value && yValue < prevYValue.value) || 
(xValue < prevXValue.value && yValue > prevYValue.value)) {
+  discordantPairs.value = discordantPairs.value + 1;
+} else {
+  //Tie...
+}
+
+prevXValue.value = xInput.value;
+prevYValue.value = yInput.value;
+n.value = n.value + 1;

Review Comment:
   Given that xValue = xInput.value and yValue = yInput.value, I think this 
code is common to both branches of the parent if statement.
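A runnable sketch of the simplification suggested in this comment, with plain fields standing in for Drill's `@Workspace` holders (the class and field names are illustrative, not the actual Drill types):

```java
// Sketch only: plain fields replace Drill's @Workspace holders so the
// branch structure of add() can be shown in isolation.
public class KendallAddSketch {
    double prevX, prevY;          // previous row's values
    long concordant, discordant;  // pair counters
    long n;                       // rows seen so far

    void add(double x, double y) {
        if (n > 0) {
            if ((x > prevX && y > prevY) || (x < prevX && y < prevY)) {
                concordant++;
            } else if ((x > prevX && y < prevY) || (x < prevX && y > prevY)) {
                discordant++;
            }
            // ties fall through: neither counter moves
        }
        // Identical in every path above, so hoisted out of the branches.
        prevX = x;
        prevY = y;
        n++;
    }

    public static void main(String[] args) {
        KendallAddSketch k = new KendallAddSketch();
        k.add(1.0, 1.0);
        k.add(2.0, 2.0);  // both increased: concordant
        k.add(3.0, 1.5);  // x up, y down: discordant
        System.out.println(k.concordant + " " + k.discordant + " " + k.n);
        // prints "1 1 3"
    }
}
```

Hoisting the shared assignments also removes the risk of the two branches drifting apart in later edits.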



##
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/DistributionFunctions.java:
##
@@ -0,0 +1,335 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You 

[GitHub] [drill] jnturton commented on issue #2732: Any chance of INFORMATION SCHEMA updates like storing it in any database instead of in-memory data

2023-01-05 Thread GitBox


jnturton commented on issue #2732:
URL: https://github.com/apache/drill/issues/2732#issuecomment-1372171275

   Have you looked at the Hive metastore?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] porika-v commented on issue #2732: Any chance of INFORMATION SCHEMA updates like storing it in any database instead of in-memory data

2023-01-05 Thread GitBox


porika-v commented on issue #2732:
URL: https://github.com/apache/drill/issues/2732#issuecomment-1372024894

   this works only with Parquet data; I can't use it with **HIVE**





[GitHub] [drill] jnturton commented on issue #2732: Any chance of INFORMATION SCHEMA updates like storing it in any database instead of in-memory data

2023-01-04 Thread GitBox


jnturton commented on issue #2732:
URL: https://github.com/apache/drill/issues/2732#issuecomment-1371793916

   Have you looked at the Drill metastore?
   
   https://drill.apache.org/docs/using-drill-metastore/
   https://drill.apache.org/docs/rdbms-metastore/





[GitHub] [drill] porika-v opened a new issue, #2732: Any chance of INFORMATION SCHEMA updates like storing it in any database instead of in-memory data

2023-01-04 Thread GitBox


porika-v opened a new issue, #2732:
URL: https://github.com/apache/drill/issues/2732

   **Is your feature request related to a problem? Please describe.**
   A clear and concise description of what the problem is. Ex. I'm always 
frustrated when [...]
   
   **Describe the solution you'd like**
   A clear and concise description of what you want to happen.
   
   **Describe alternatives you've considered**
   A clear and concise description of any alternative solutions or features 
you've considered.
   
   **Additional context**
   Add any other context or screenshots about the feature request here.
   





[GitHub] [drill] cgivre commented on pull request #2731: DRILL-5033: Query on JSON That Has Null as Value For Each Key

2022-12-30 Thread GitBox


cgivre commented on PR #2731:
URL: https://github.com/apache/drill/pull/2731#issuecomment-1367977377

   > Thanks @cgivre for the clarification, but suppose the assumption that 
considering nulls as strings would solve the issue, were the changes i made 
(over the class JSONReader.java) adequate (should the methods be changed as i 
did)? i see that some tests didn't pass.
   
   For us to merge a pull request, all the unit tests have to pass.  (Or be 
modified with an explanation of why they are being modified)  Drill is a very 
complex beast with a lot of dependencies so even small changes can break things 
you didn't intend to.  Believe me... I know from experience ;-)
   
   One other thing to note is that there is another option 
`drill.exec.functions.cast_empty_string_to_null`.  Setting this to true forces 
empty strings to be treated as `null`.  This can have some unintended side 
effects,  but might also help you out. 





[GitHub] [drill] unical1988 commented on pull request #2731: DRILL-5033: Query on JSON That Has Null as Value For Each Key

2022-12-30 Thread GitBox


unical1988 commented on PR #2731:
URL: https://github.com/apache/drill/pull/2731#issuecomment-1367969342

   Thanks @cgivre for the clarification, but suppose the assumption that 
considering nulls as strings would solve the issue, were the changes i made 
(over the class JSONReader.java) adequate (should the methods be changed as i 
did)? i see that some tests didn't pass.





[GitHub] [drill] cgivre commented on pull request #2731: DRILL-5033: Query on JSON That Has Null as Value For Each Key

2022-12-29 Thread GitBox


cgivre commented on PR #2731:
URL: https://github.com/apache/drill/pull/2731#issuecomment-1367677557

   @unical1988 
   You actually don't have to modify the code to get this data to read 
properly.  As I mentioned on the user group, the easiest way would probably be 
to provide a schema.  The good news is that you can do this at query time.  
Take a look here: 
https://drill.apache.org/docs/plugin-configuration-basics/#specifying-the-schema-as-table-function-parameter
   
   An example query might be:
   
   ```sql
   select * from table(dfs.tmp.`file.json`(
   schema => 'inline=(col0 varchar, col1 date properties {`drill.format` = 
`yyyy-MM-dd`})
   properties {`drill.strict` = `false`}'))
   ```
   
   





[GitHub] [drill] unical1988 commented on pull request #2731: DRILL-5033: Query on JSON That Has Null as Value For Each Key

2022-12-29 Thread GitBox


unical1988 commented on PR #2731:
URL: https://github.com/apache/drill/pull/2731#issuecomment-1367674759

   @vvysotskyi My attempt to deal with this bug is just a quick workaround, 
since the solution, as stated by @cgivre, might simply be to set the schema of 
the dataset being queried from the start (which requires non-trivial updates 
to the code).





[GitHub] [drill] cgivre commented on pull request #2731: DRILL-5033: Query on JSON That Has Null as Value For Each Key

2022-12-29 Thread GitBox


cgivre commented on PR #2731:
URL: https://github.com/apache/drill/pull/2731#issuecomment-1367526640

   I copied the JIRA into the PR description.





[GitHub] [drill] unical1988 opened a new pull request, #2731: DRILL-8033

2022-12-29 Thread GitBox


unical1988 opened a new pull request, #2731:
URL: https://github.com/apache/drill/pull/2731

   # [DRILL-8033](https://issues.apache.org/jira/browse/DRILL-8033): PR Title
   
   (Please replace `PR Title` with actual PR Title)
   
   ## Description
   
   (Please describe the change. If more than one ticket is fixed, include a 
reference to those tickets.)
   
   ## Documentation
   (Please describe user-visible changes similar to what should appear in the 
Drill documentation.)
   
   ## Testing
   (Please describe how this PR has been tested.)
   





[GitHub] [drill] cgivre merged pull request #2730: DRILL-8378: Support doing Maven releases using modern JDKs

2022-12-28 Thread GitBox


cgivre merged PR #2730:
URL: https://github.com/apache/drill/pull/2730





[GitHub] [drill] jnturton opened a new pull request, #2730: DRILL-8378: Support doing Maven releases using modern JDKs

2022-12-28 Thread GitBox


jnturton opened a new pull request, #2730:
URL: https://github.com/apache/drill/pull/2730

   # [DRILL-8378](https://issues.apache.org/jira/browse/DRILL-8378): Support 
doing Maven releases using modern JDKs
   
   ## Description
   
   While [DRILL-8113](https://issues.apache.org/jira/browse/DRILL-8113) enabled 
the building of Drill using a modern JDK, more work is required to enable a 
Maven release of Drill using a modern JDK. Presently, the Maven Release Plugin 
will fail on Javadoc generation when run with a newer JDK while it succeeds 
with JDK 8. The failures are due to dependencies missing from the Maven Javadoc 
Plugin's config which I assume get treated with a more lenient "warn and skip" 
policy in the javadoc tool shipped with JDK 8 but cause errors in newer JDKs 
(in my case OpenJDK 17).
   
   In particular, the presence of the `sourcepath` property in the javadoc 
plugin's config in the root pom causes the default javadoc:javadoc goal to try 
to generate docs for our src/test packages. Unlike the javadoc:test-javadoc, 
the javadoc:javadoc goal does not inherit dependencies declared with `test` 
scope so it fails to resolve those.
   
   ## Documentation
   N/A
   
   ## Testing
   Successfully run maven release:prepare using OpenJDK 17.
   Successfully run mvn javadoc:javadoc in the Drill root module using OpenJDK 
17 with HTML output generated under each module's target/site/apidocs directory.
   





[GitHub] [drill] weijunlu commented on issue #2723: Failed to execute an insert statement across the database

2022-12-28 Thread GitBox


weijunlu commented on issue #2723:
URL: https://github.com/apache/drill/issues/2723#issuecomment-1366577871

   2022-12-28 19:03:07,401 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.a.d.e.p.s.h.DefaultSqlHandler - Drill Plan : 
   {
 "head" : {
   "version" : 1,
   "generator" : {
 "type" : "InsertHandler",
 "info" : ""
   },
   "type" : "APACHE_DRILL_PHYSICAL",
   "options" : [ ],
   "queue" : 0,
   "hasResourcePlan" : false,
   "scannedPluginNames" : [ "mysql", "pg" ],
   "resultMode" : "EXEC"
 },
 "graph" : [ {
   "pop" : "jdbc-scan",
   "@id" : 1,
   "sql" : "INSERT INTO `public`.`t1` (`c1`, `c2`)\r\n(SELECT *\r\nFROM 
`test`.`t1`)",
   "columns" : [ "`ROWCOUNT`" ],
   "config" : {
 "type" : "jdbc",
 "driver" : "com.mysql.jdbc.Driver",
 "url" : "jdbc:mysql://localhost:3316",
 "username" : "root",
 "caseInsensitiveTableNames" : true,
 "writable" : true,
 "authMode" : "SHARED_USER",
 "writerBatchSize" : 1,
 "enabled" : true
   },
   "userName" : "anonymous",
   "cost" : {
 "memoryCost" : 1.6777216E7,
 "outputRowCount" : 1.0E9
   }
 }, {
   "pop" : "screen",
   "@id" : 0,
   "child" : 1,
   "initialAllocation" : 100,
   "maxAllocation" : 100,
   "cost" : {
 "memoryCost" : 1.6777216E7,
 "outputRowCount" : 1.0E9
   }
 } ]
   }
   2022-12-28 19:03:07,402 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.a.d.e.p.f.SimpleParallelizer - Root fragment:
handle {
 query_id {
   part1: 2041218684968999600
   part2: -1153457303194072660
 }
 major_fragment_id: 0
 minor_fragment_id: 0
   }
   leaf_fragment: true
   assignment {
 address: "DESKTOP-PHHB7LC"
 user_port: 31010
 control_port: 31011
 data_port: 31012
 version: "2.0.0-SNAPSHOT"
 state: STARTUP
   }
   foreman {
 address: "DESKTOP-PHHB7LC"
 user_port: 31010
 control_port: 31011
 data_port: 31012
 version: "2.0.0-SNAPSHOT"
 state: STARTUP
   }
   mem_initial: 100
   mem_max: 100
   credentials {
 user_name: "anonymous"
   }
   context {
 query_start_time: 1672225387214
 time_zone: 299
 default_schema_name: ""
 session_id: "0b3af775-337f-4db3-8ce4-52e20d5c50ee"
   }
   
   2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.drill.exec.work.foreman.Foreman - PlanFragments for query part1: 
2041218684968999600
   part2: -1153457303194072660

   
   2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.a.d.e.w.f.QueryStateProcessor - 1c53dd94-4277-9ab0-effe-18b1ab8989ac: State 
change requested PLANNING --> ENQUEUED
   2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.a.d.e.w.f.QueryStateProcessor - 1c53dd94-4277-9ab0-effe-18b1ab8989ac: State 
change requested ENQUEUED --> STARTING
   2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.a.d.exec.rpc.control.WorkEventBus - Adding fragment status listener for 
queryId 1c53dd94-4277-9ab0-effe-18b1ab8989ac.
   2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.a.d.e.work.foreman.FragmentsRunner - Submitting fragments to run.
   2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.a.d.exec.ops.FragmentContextImpl - Getting initial memory allocation of 
100
   2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.a.d.exec.ops.FragmentContextImpl - Fragment max allocation: 100
   2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.a.d.e.work.batch.IncomingBuffers - Came up with a list of 0 required 
fragments.  Fragments {}
   2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.a.d.e.work.foreman.FragmentsRunner - Fragments running.
   2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.a.d.e.w.f.QueryStateProcessor - 1c53dd94-4277-9ab0-effe-18b1ab8989ac: State 
change requested STARTING --> RUNNING
   2022-12-28 19:03:07,421 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:frag:0:0] 
DEBUG o.a.d.e.physical.impl.BaseRootExec - BaseRootExec(60762332) operators: 
org.apache.drill.exec.physical.impl.protocol.OperatorRecordBatch 654876346
   2022-12-28 19:03:07,421 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:frag:0:0] 
DEBUG o.a.d.exec.physical.impl.ImplCreator - Took 17 ms to create RecordBatch 
tree
   2022-12-28 19:03:07,421 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:frag:0:0] INFO 
 o.a.d.e.w.fragment.FragmentExecutor - 
1c53dd94-4277-9ab0-effe-18b1ab8989ac:0:0: State change requested 
AWAITING_ALLOCATION --> RUNNING
   2022-12-28 19:03:07,421 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:frag:0:0] INFO 
 o.a.d.e.w.f.FragmentStatusReporter - 1c53dd94-4277-9ab0-effe-18b1ab8989ac:0:0: 

[GitHub] [drill] weijunlu commented on issue #2723: Failed to execute an insert statement across the database

2022-12-28 Thread GitBox


weijunlu commented on issue #2723:
URL: https://github.com/apache/drill/issues/2723#issuecomment-1366576817

   2022-12-28 19:03:07,330 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.apache.calcite.plan.RelOptPlanner - Rule queue:
   rule 
[JdbcTableModificationRule(in:NONE,out:JDBC.mysql)(in:NONE,out:JDBC.mysql)] 
rels [#112]
   rule [JdbcTableModificationRule(in:NONE,out:JDBC.pg)(in:NONE,out:JDBC.pg)] 
rels [#112]
   rule [ExpandConversionRule] rels [#115]
   rule [JDBC_PREL_ConverterJDBC.mysql] rels [#118,#89]
   rule 
[JdbcTableModificationRule(in:NONE,out:JDBC.mysql)(in:LOGICAL,out:JDBC.mysql)] 
rels [#120]
   rule 
[JdbcTableModificationRule(in:NONE,out:JDBC.pg)(in:LOGICAL,out:JDBC.pg)] rels 
[#120]
   2022-12-28 19:03:07,330 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.apache.calcite.plan.RelOptPlanner - Pop match: rule 
[JdbcTableModificationRule(in:NONE,out:JDBC.mysql)(in:NONE,out:JDBC.mysql)] 
rels [#112]
   2022-12-28 19:03:07,330 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.apache.calcite.plan.RelOptPlanner - call#202: Apply rule 
[JdbcTableModificationRule(in:NONE,out:JDBC.mysql)(in:NONE,out:JDBC.mysql)] to 
[rel#112:LogicalTableModify]
   2022-12-28 19:03:07,337 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.apache.calcite.plan.RelOptPlanner - Transform to: rel#121 via 
JdbcTableModificationRule(in:NONE,out:JDBC.mysql)(in:NONE,out:JDBC.mysql)
   2022-12-28 19:03:07,338 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.apache.calcite.plan.RelOptPlanner - call#202: Full plan for rule input 
[rel#112:LogicalTableModify]: 
   LogicalTableModify(table=[[pg, public, t1]], operation=[INSERT], 
flattened=[true])
 JdbcTableScan(subset=[rel#116:RelSubset#0.NONE.ANY([]).[]], table=[[mysql, 
test, t1]])
   
   2022-12-28 19:03:07,338 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.apache.calcite.plan.RelOptPlanner - call#202: Rule 
[JdbcTableModificationRule(in:NONE,out:JDBC.mysql)(in:NONE,out:JDBC.mysql)] 
produced [rel#121:JdbcTableModify]
   2022-12-28 19:03:07,338 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.apache.calcite.plan.RelOptPlanner - call#202: Full plan for 
[rel#121:JdbcTableModify]:
   JdbcTableModify(table=[[pg, public, t1]], operation=[INSERT], 
flattened=[true])
 JdbcTableScan(subset=[rel#109:RelSubset#0.JDBC.mysql.ANY([]).[]], 
table=[[mysql, test, t1]])
   
   2022-12-28 19:03:07,338 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.apache.calcite.plan.RelOptPlanner - Subset cost changed: subset 
[rel#122:RelSubset#2.JDBC.mysql.ANY([]).[]] cost was {inf} now {101.0 rows, 
102.0 cpu, 0.0 io, 0.0 network, 0.0 memory}
   2022-12-28 19:03:07,338 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.apache.calcite.plan.RelOptPlanner - Register 
rel#121:JdbcTableModify.JDBC.mysql.ANY([]).[](input=RelSubset#109,table=[pg, 
public, t1],operation=INSERT,flattened=true) in 
rel#122:RelSubset#2.JDBC.mysql.ANY([]).[]
   2022-12-28 19:03:07,338 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.apache.calcite.plan.RelOptPlanner - Rule-match queued: rule 
[VertexDrelConverterRuleJDBC.mysql(in:JDBC.mysql,out:LOGICAL)] rels [#121]
   2022-12-28 19:03:07,338 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.apache.calcite.plan.RelOptPlanner - call#202 generated 1 successors: 
[rel#121:JdbcTableModify.JDBC.mysql.ANY([]).[](input=RelSubset#109,table=[pg, 
public, t1],operation=INSERT,flattened=true)]
   2022-12-28 19:03:07,338 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.apache.calcite.plan.RelOptPlanner - Best cost before rule match: 
{1.0101E10 rows, 2.0101E8 cpu, 1.1E10 io, 0.0 network, 0.0 
memory}
   2022-12-28 19:03:07,339 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.apache.calcite.plan.RelOptPlanner - Root: 
rel#114:RelSubset#2.LOGICAL.ANY([]).[]
   Original rel:
   DrillTableModify(subset=[rel#114:RelSubset#2.LOGICAL.ANY([]).[]], 
table=[[pg, public, t1]], operation=[INSERT], flattened=[true]): rowcount = 
1.0E9, cumulative cost = {1.0E10 rows, 0.0 cpu, 1.1E10 io, 0.0 network, 
0.0 memory}, id = 120
 VertexDrel(subset=[rel#119:RelSubset#0.LOGICAL.ANY([]).[]]): rowcount = 
1.0E9, cumulative cost = {1.0E8 rows, 2.0E8 cpu, 0.0 io, 0.0 network, 0.0 
memory}, id = 118
   JdbcTableScan(subset=[rel#109:RelSubset#0.JDBC.mysql.ANY([]).[]], 
table=[[mysql, test, t1]]): rowcount = 1.0E9, cumulative cost = {100.0 rows, 
101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 89
   
   Sets:
   Set#0, type: RecordType(INTEGER c1, INTEGER c2)
rel#109:RelSubset#0.JDBC.mysql.ANY([]).[], best=rel#89
rel#89:JdbcTableScan.JDBC.mysql.ANY([]).[](table=[mysql, test, 
t1]), rowcount=1.0E9, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io, 0.0 
network, 0.0 memory}
rel#116:RelSubset#0.NONE.ANY([]).[], best=null
rel#119:RelSubset#0.LOGICAL.ANY([]).[], best=rel#118
rel#118:VertexDrel.LOGICAL.ANY([]).[](input=RelSubset#109), 

[GitHub] [drill] weijunlu commented on issue #2723: Failed to execute an insert statement across the database

2022-12-28 Thread GitBox


weijunlu commented on issue #2723:
URL: https://github.com/apache/drill/issues/2723#issuecomment-1366575973

   2022-12-28 19:03:07,204 [main] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
Adding to open-statements registry: 
org.apache.drill.jdbc.impl.DrillStatementImpl@71df3d2b
   2022-12-28 19:03:07,204 [main] DEBUG o.a.d.j.i.DrillCursor$ResultsListener - 
[#2] Query listener created.
   2022-12-28 19:03:07,204 [main] DEBUG o.apache.drill.jdbc.impl.DrillCursor - 
Setting timeout as 0
   2022-12-28 19:03:07,206 [UserServer-1] DEBUG 
o.a.d.e.r.u.UserServerRequestHandler - Received query to run.  Returning query 
handle.
   2022-12-28 19:03:07,215 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG 
o.a.d.e.w.f.QueryStateProcessor - 1c53dd94-4277-9ab0-effe-18b1ab8989ac: State 
change requested PREPARING --> PLANNING
   2022-12-28 19:03:07,215 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - Query text for query with id 
1c53dd94-4277-9ab0-effe-18b1ab8989ac issued by anonymous: insert into 
pg.public.t1 select c1, c2 from mysql.test.t1
   2022-12-28 19:03:07,215 [Client-1] DEBUG 
o.a.d.j.i.DrillCursor$ResultsListener - [#2] Received query ID: 
1c53dd94-4277-9ab0-effe-18b1ab8989ac.
   2022-12-28 19:03:07,215 [Client-1] DEBUG o.a.d.e.rpc.user.QueryResultHandler 
- Received QueryId 1c53dd94-4277-9ab0-effe-18b1ab8989ac successfully. Adding 
results listener 
org.apache.drill.jdbc.impl.DrillCursor$ResultsListener@29741514.
   2022-12-28 19:03:07,222 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
org.apache.calcite.sql.parser - After unconditional rewrite: INSERT INTO 
`pg`.`public`.`t1`
   (SELECT `c1`, `c2`
   FROM `mysql`.`test`.`t1`)
   2022-12-28 19:03:07,308 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
org.apache.calcite.sql.parser - After validation: INSERT INTO `pg`.`public`.`t1`
   (SELECT `t1`.`c1`, `t1`.`c2`
   FROM `mysql`.`test`.`t1` AS `t1`)
   2022-12-28 19:03:07,308 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is 'INSERT INTO'; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is 'pg'; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is '.'; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is 'public'; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is '.'; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is 't1'; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is ''; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is '('; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is 'SELECT'; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is 't1'; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is '.'; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is 'c1'; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is ''; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is ','; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is 't1'; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is '.'; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is 'c2'; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is ''; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is ''; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 
o.a.c.sql.pretty.SqlPrettyWriter - Token is 'FROM'; result is false
   2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE 

[GitHub] [drill] weijunlu commented on issue #2723: Failed to execute an insert statement across the database

2022-12-28 Thread GitBox


weijunlu commented on issue #2723:
URL: https://github.com/apache/drill/issues/2723#issuecomment-1366570466

   I enabled trace logging, including the Calcite logger.
   Log configurations are as follows:
   [logback XML configuration omitted: the tags were stripped by the list 
archiver]





[GitHub] [drill] cgivre commented on issue #2723: Failed to execute an insert statement across the database

2022-12-26 Thread GitBox


cgivre commented on issue #2723:
URL: https://github.com/apache/drill/issues/2723#issuecomment-136578

   Can you please enable verbose logging and post the stack trace?  Without 
that, we really can't debug this.





[GitHub] [drill] vvysotskyi commented on a diff in pull request #2599: DRILL-4232 Support for EXCEPT and INTERSECT set operator

2022-12-26 Thread GitBox


vvysotskyi commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1057326323


##
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashTable.java:
##
@@ -98,6 +98,10 @@ void setup(HashTableConfig htConfig, BufferAllocator 
allocator, VectorContainer
*/
   int probeForKey(int incomingRowIdx, int hashCode) throws 
SchemaChangeException;
 
+  int getNum(int currentIndex);

Review Comment:
   Please rename this method to clarify that it holds the count of records for 
a specific key and add JavaDoc.
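One possible shape for the requested rename and JavaDoc; the interface and method names here are hypothetical stand-ins rather than the names used in Drill's `HashTable`:

```java
// Illustrative stand-in for the HashTable method under review; only the
// naming and documentation pattern are the point here.
@FunctionalInterface
public interface KeyRecordCounter {
  /**
   * Returns the number of records stored in the hash table for the key
   * located at the given index.
   *
   * @param currentIndex index of the key entry to inspect
   * @return count of records that share that key
   */
  int getRecordNumForKey(int currentIndex);
}
```

A name that mentions both "record" and "key" makes the count's meaning clear at call sites without a trip to the JavaDoc.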



##
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggBatch.java:
##
@@ -188,10 +192,13 @@ public HashAggBatch(HashAggregate popConfig, RecordBatch 
incoming, FragmentConte
 wasKilled = false;
 
 final int numGrpByExprs = popConfig.getGroupByExprs().size();
-comparators = Lists.newArrayListWithExpectedSize(numGrpByExprs);
-for (int i=0; ihttp://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill;
+
+import org.apache.drill.exec.planner.physical.PlannerSettings;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchemaBuilder;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.drill.categories.OperatorTest;
+import org.apache.drill.categories.SqlTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.io.BufferedWriter;
+import java.io.File;
+import java.io.FileWriter;
+import java.nio.file.Paths;
+import java.util.List;
+
+@Category({SqlTest.class, OperatorTest.class})
+public class TestSetOp extends ClusterTest {

Review Comment:
   Could you please add more tests that check several batches? It could be done 
using the `UNION ALL` operator. Also, it would be interesting to see cases when 
the first batch of one side is empty, and so on.
   One more scenario is to check how it behaves with complex types. It is fine 
if not supported, but we should be sure that we have the correct error message 
and error handling.



##
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillSetOpRel.java:
##
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.linq4j.Ord;
+import org.apache.calcite.plan.RelOptCluster;
+import org.apache.calcite.plan.RelTraitSet;
+import org.apache.calcite.rel.InvalidRelException;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.drill.common.logical.data.LogicalOperator;
+import org.apache.drill.common.logical.data.Union;
+import org.apache.drill.exec.planner.common.DrillSetOpRelBase;
+
+import java.util.List;
+
+/**
+ * SetOp implemented in Drill.
+ */
+public class DrillSetOpRel extends DrillSetOpRelBase implements DrillRel {
+  private boolean isAggAdded;
+
+  public DrillSetOpRel(RelOptCluster cluster, RelTraitSet traits,
+   List<RelNode> inputs, SqlKind kind, boolean all, boolean checkCompatibility, boolean isAggAdded) throws InvalidRelException {
+super(cluster, traits, inputs, kind, all, checkCompatibility);
+this.isAggAdded = isAggAdded;
+  }
+
+  public DrillSetOpRel(RelOptCluster cluster, RelTraitSet traits,
+   List<RelNode> inputs, SqlKind kind, boolean 

[GitHub] [drill] jnturton merged pull request #2727: DRILL-8374: Set the Drill development version to 1.21.0-SNAPSHOT

2022-12-25 Thread GitBox


jnturton merged PR #2727:
URL: https://github.com/apache/drill/pull/2727


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre opened a new pull request, #2729: DRILL-8376: Add Distribution UDFs

2022-12-24 Thread GitBox


cgivre opened a new pull request, #2729:
URL: https://github.com/apache/drill/pull/2729

   # [DRILL-8376](https://issues.apache.org/jira/browse/DRILL-8376): Add 
Distribution UDFs
   
   ## Description
   This PR adds several new UDFs to help with statistical analysis. The first is 
`width_bucket`, which mirrors the functionality of the PostgreSQL function of the 
same name 
(https://www.oreilly.com/library/view/sql-in-a/9780596155322/re91.html). This 
function is useful for building histograms of data.
   
   This PR also adds the `kendall_correlation` and `pearson_correlation` functions, 
which calculate correlation coefficients of two columns.
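   The two coefficients can be sketched in Python (plain tau-a and the sample 
Pearson coefficient; the Drill UDFs may handle ties, nulls and streaming input 
differently):

```python
from itertools import combinations
from math import sqrt

def pearson_correlation(xs, ys):
    """Sample Pearson correlation of two equal-length columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def kendall_correlation(xs, ys):
    """Kendall tau-a: (concordant - discordant pairs) / total pairs; ties ignored."""
    pairs = list(combinations(zip(xs, ys), 2))
    concordant = sum(1 for (x1, y1), (x2, y2) in pairs if (x1 - x2) * (y1 - y2) > 0)
    discordant = sum(1 for (x1, y1), (x2, y2) in pairs if (x1 - x2) * (y1 - y2) < 0)
    return (concordant - discordant) / len(pairs)

# Perfectly linear columns correlate at 1.0 under both measures
assert abs(pearson_correlation([1, 2, 3, 4], [2, 4, 6, 8]) - 1.0) < 1e-9
assert kendall_correlation([1, 2, 3, 4], [2, 4, 6, 8]) == 1.0
```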
   
   ## Documentation
   Updated README.
   
   ## Testing
   Added unit tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] jnturton merged pull request #2726: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 4

2022-12-24 Thread GitBox


jnturton merged PR #2726:
URL: https://github.com/apache/drill/pull/2726


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] kingswanwho commented on pull request #2726: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 4

2022-12-23 Thread GitBox


kingswanwho commented on PR #2726:
URL: https://github.com/apache/drill/pull/2726#issuecomment-1364474789

   > Thanks @kingswanwho, this looks good with the only issue I see being [the 
protobuf 
upgrade](https://github.com/apache/drill/pull/2726/commits/439958bc56eb2d24b7206e83a75e491ff23c89a6).
   > 
   > In addition to the dependency version number change, a lot of generated 
protobuf code needs an update. In the master branch I can see that @vvysotskyi 
had to add this to Dependabot's PR manually. Did you try cherry picking [his 
commit](https://github.com/apache/drill/pull/2671/commits/a97c7e16f01c36e5d683b561517fe8bad59cfce8)
 which includes the updates to the generated code?
   
   Hi @jnturton, yes I cherry-picked the commit into this PR, I will drop this 
commit and submit PR again.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] jnturton commented on pull request #2726: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 4

2022-12-23 Thread GitBox


jnturton commented on PR #2726:
URL: https://github.com/apache/drill/pull/2726#issuecomment-1364471208

   Thanks @kingswanwho, this looks good with the only issue I see being [the 
protobuf 
upgrade](https://github.com/apache/drill/pull/2726/commits/439958bc56eb2d24b7206e83a75e491ff23c89a6).
   
   In addition to the dependency version number change, a lot of generated 
protobuf code needs an update. In the master branch I can see that @vvysotskyi 
had to add this to Dependabot's PR manually. Did you try cherry picking [his 
commit](https://github.com/apache/drill/pull/2671/commits/a97c7e16f01c36e5d683b561517fe8bad59cfce8)
 which includes the updates to the generated code?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] kingswanwho closed pull request #2726: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 4

2022-12-22 Thread GitBox


kingswanwho closed pull request #2726: [BACKPORT-TO-STABLE] Bugfix Release 
1.20.3 Phase 4
URL: https://github.com/apache/drill/pull/2726


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] weijunlu commented on issue #2723: Failed to execute an insert statement across the database

2022-12-22 Thread GitBox


weijunlu commented on issue #2723:
URL: https://github.com/apache/drill/issues/2723#issuecomment-1363564470

   @cgivre yes, I used the master version.
   apache drill> select version, commit_message, commit_time from sys.version;
   +----------------+-------------------------------------------------------------------------------------------------+---------------------------+
   | version        | commit_message                                                                                  | commit_time               |
   +----------------+-------------------------------------------------------------------------------------------------+---------------------------+
   | 2.0.0-SNAPSHOT | DRILL-8314: Add support for automatically retrying and disabling broken storage plugins (#2655) | 18.10.2022 @ 18:15:31 CST |
   +----------------+-------------------------------------------------------------------------------------------------+---------------------------+


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] jnturton commented on pull request #2726: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 4

2022-12-22 Thread GitBox


jnturton commented on PR #2726:
URL: https://github.com/apache/drill/pull/2726#issuecomment-1363032480

   @kingswanwho note that the test failures on [this PR's last CI 
run](https://github.com/apache/drill/actions/runs/3757899347/jobs/6385612311) 
are showing up everywhere at the moment and are not caused by your commits here.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] jnturton opened a new pull request, #2728: DRILL-8372: Unfreed buffers when running a LIMIT 0 query over delimited text

2022-12-22 Thread GitBox


jnturton opened a new pull request, #2728:
URL: https://github.com/apache/drill/pull/2728

   # [DRILL-8372](https://issues.apache.org/jira/browse/DRILL-8372): Unfreed 
buffers when running a LIMIT 0 query over delimited text
   
   ## Description
   
   TODO
   
   ## Documentation
   N/A
   
   ## Testing
   TODO
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre commented on issue #2723: Failed to execute an insert statement across the database

2022-12-22 Thread GitBox


cgivre commented on issue #2723:
URL: https://github.com/apache/drill/issues/2723#issuecomment-1362913019

   What version of Drill are you using?  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] jnturton opened a new pull request, #2727: DRILL-8374: Set the Drill development version to 1.21.0-SNAPSHOT

2022-12-22 Thread GitBox


jnturton opened a new pull request, #2727:
URL: https://github.com/apache/drill/pull/2727

   # [DRILL-8374](https://issues.apache.org/jira/browse/DRILL-8374): Set the 
Drill development version to 1.21.0-SNAPSHOT
   
   ## Description
   Changes the Maven version numbers in the Drill master branch from 2.0.0 to 
1.21.0. Discussion on the Drill mailing list established that the project would 
rather do a release in the near future than wait to build up a changeset for 
which a version jump to 2.0 would be appropriate.
   
   ## Documentation
   N/A
   
   ## Testing
   N/A
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] jnturton merged pull request #2724: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 3

2022-12-22 Thread GitBox


jnturton merged PR #2724:
URL: https://github.com/apache/drill/pull/2724


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] jnturton commented on pull request #2724: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 3

2022-12-22 Thread GitBox


jnturton commented on PR #2724:
URL: https://github.com/apache/drill/pull/2724#issuecomment-1362814276

   Thanks for the review @kingswanwho.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] kingswanwho opened a new pull request, #2726: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 4

2022-12-22 Thread GitBox


kingswanwho opened a new pull request, #2726:
URL: https://github.com/apache/drill/pull/2726

   # [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 4
   
   ## Description
   
   Merges the following backport-to-stable commits into the 1.20 branch:
   
   * https://github.com/apache/drill/pull/2666
   * https://github.com/apache/drill/pull/2669
   * https://github.com/apache/drill/pull/2671
   * https://github.com/apache/drill/pull/2674
   * https://github.com/apache/drill/pull/2675
   * https://github.com/apache/drill/pull/2676
   * https://github.com/apache/drill/pull/2677
   * https://github.com/apache/drill/pull/2678
   * https://github.com/apache/drill/pull/2682
   
   
   ## Documentation
   N/A
   
   ## Testing
   UT
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre merged pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin

2022-12-22 Thread GitBox


cgivre merged PR #2722:
URL: https://github.com/apache/drill/pull/2722


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] jnturton commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin

2022-12-22 Thread GitBox


jnturton commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1055192926


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -98,27 +100,69 @@ public void updateSchema(VectorAccessible batch) {
   @Override
   public void startRecord() {
 logger.debug("Starting record");
-// Ensure that the new record is empty. This is not strictly necessary, 
but it is a belt and suspenders approach.
-splunkEvent.clear();
+// Ensure that the new record is empty.
+splunkEvent = new JSONObject();
   }
 
   @Override
-  public void endRecord() throws IOException {
+  public void endRecord() {
 logger.debug("Ending record");
+recordCount++;
+
+// Put event in buffer
+eventBuffer.add(splunkEvent);
+
 // Write the event to the Splunk index
-destinationIndex.submit(eventArgs, splunkEvent.toJSONString());
-// Clear out the splunk event.
-splunkEvent.clear();
+if (recordCount >= config.getPluginConfig().getWriterBatchSize()) {
+  try {
+writeEvents();
+  } catch (IOException e) {
+throw  UserException.dataWriteError(e)
+.message("Error writing data to Splunk: " + e.getMessage())
+.build(logger);
+  }
+
+  // Reset record count
+  recordCount = 0;
+}
   }
 
+
+  /*
+  args – Optional arguments for this stream. Valid parameters are: "host", 
"host_regex", "source", and "sourcetype".
+   */
   @Override
   public void abort() {
+logger.debug("Aborting writing records to Splunk.");
 // No op
   }
 
   @Override
   public void cleanup() {
-// No op
+try {
+  writeEvents();
+} catch (IOException e) {
+  throw  UserException.dataWriteError(e)
+  .message("Error writing data to Splunk: " + e.getMessage())
+  .build(logger);
+}
+  }
+
+  private void writeEvents() throws IOException {
+// Open the socket and stream, set up a timestamp
+destinationIndex.attachWith(new ReceiverBehavior() {

Review Comment:
   This results in a dedicated TCP socket being opened and closed for every 
writer batch.



##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -98,27 +100,69 @@ public void updateSchema(VectorAccessible batch) {
   @Override
   public void startRecord() {
 logger.debug("Starting record");
-// Ensure that the new record is empty. This is not strictly necessary, 
but it is a belt and suspenders approach.
-splunkEvent.clear();
+// Ensure that the new record is empty.
+splunkEvent = new JSONObject();
   }
 
   @Override
-  public void endRecord() throws IOException {
+  public void endRecord() {
 logger.debug("Ending record");
+recordCount++;
+
+// Put event in buffer
+eventBuffer.add(splunkEvent);
+
 // Write the event to the Splunk index
-destinationIndex.submit(eventArgs, splunkEvent.toJSONString());
-// Clear out the splunk event.
-splunkEvent.clear();
+if (recordCount >= config.getPluginConfig().getWriterBatchSize()) {
+  try {
+writeEvents();
+  } catch (IOException e) {
+throw  UserException.dataWriteError(e)
+.message("Error writing data to Splunk: " + e.getMessage())
+.build(logger);
+  }
+
+  // Reset record count
+  recordCount = 0;
+}
   }
 
+
+  /*
+  args – Optional arguments for this stream. Valid parameters are: "host", 
"host_regex", "source", and "sourcetype".
+   */
   @Override
   public void abort() {
+logger.debug("Aborting writing records to Splunk.");

Review Comment:
   Would there be any use in clearing eventBuffer here?
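   For reference, the batch-buffering flow under review — accumulate events per 
record, flush whenever the configured batch size is reached, flush the remainder 
in `cleanup()`, and (per the question above) clear the buffer on `abort()` — can 
be sketched as follows (hypothetical names, plain Python, not the Splunk SDK):

```python
class BufferedEventWriter:
    """Sketch of the writer's buffering strategy: events are collected as
    records end and written to the sink in batches of batch_size."""

    def __init__(self, sink, batch_size):
        self.sink = sink            # callable that receives a list of events
        self.batch_size = batch_size
        self.event_buffer = []

    def end_record(self, event):
        self.event_buffer.append(event)
        if len(self.event_buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.event_buffer:
            self.sink(list(self.event_buffer))
            self.event_buffer.clear()

    def abort(self):
        # Clearing here keeps cleanup() from flushing a partial batch
        # after a failed query.
        self.event_buffer.clear()

    def cleanup(self):
        self.flush()

batches = []
writer = BufferedEventWriter(batches.append, batch_size=2)
for event in range(5):
    writer.end_record(event)
writer.cleanup()
# batches == [[0, 1], [2, 3], [4]]
```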



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] kingswanwho commented on pull request #2724: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 3

2022-12-21 Thread GitBox


kingswanwho commented on PR #2724:
URL: https://github.com/apache/drill/pull/2724#issuecomment-1362496955

   Looks perfect to me +1. That's quite a lot of work in a short time.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin

2022-12-20 Thread GitBox


cgivre commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1053981884


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -0,0 +1,309 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+
+import com.splunk.Args;
+import com.splunk.Index;
+import com.splunk.IndexCollection;
+import com.splunk.Service;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.proto.UserBitShared.UserCredentials;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.store.AbstractRecordWriter;
+import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+public class SplunkBatchWriter extends AbstractRecordWriter {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(SplunkBatchWriter.class);
+  private static final String DEFAULT_SOURCETYPE = "drill";
+  private final UserCredentials userCredentials;
+  private final List<String> tableIdentifier;
+  private final SplunkWriter config;
+  private final Args eventArgs;
+  protected final Service splunkService;
+  private final JSONObject splunkEvent;
+  protected Index destinationIndex;
+
+
+  public SplunkBatchWriter(UserCredentials userCredentials, List<String> 
tableIdentifier, SplunkWriter config) {
+this.config = config;
+this.tableIdentifier = tableIdentifier;
+this.userCredentials = userCredentials;
+this.splunkEvent = new JSONObject();
+SplunkConnection connection = new 
SplunkConnection(config.getPluginConfig(), userCredentials.getUserName());
+this.splunkService = connection.connect();
+
+// Populate event arguments
+this.eventArgs = new Args();
+eventArgs.put("sourcetype", DEFAULT_SOURCETYPE);
+  }
+
+  @Override
+  public void init(Map<String, String> writerOptions) throws IOException {
+// No op
+  }
+
+  /**
+   * Update the schema in RecordWriter. Called before starting writing the 
records. In this case,
+   * we add the index to Splunk here. Splunk's API is a little sparse and 
doesn't really do much in the way
+   * of error checking or providing feedback if the operation fails.
+   *
+   * @param batch {@link VectorAccessible} The incoming batch
+   */
+  @Override
+  public void updateSchema(VectorAccessible batch) {
+logger.debug("Updating schema for Splunk");
+
+//Get the collection of indexes
+IndexCollection indexes = splunkService.getIndexes();
+try {
+  String indexName = tableIdentifier.get(0);
+  indexes.create(indexName);
+  destinationIndex = splunkService.getIndexes().get(indexName);
+} catch (Exception e) {
+  // We have to catch a generic exception here, as Splunk's SDK does not 
really provide any kind of
+  // failure messaging.
+  throw UserException.systemError(e)
+.message("Error creating new index in Splunk plugin: " + 
e.getMessage())
+.build(logger);
+}
+  }
+
+
+  @Override
+  public void startRecord() {
+logger.debug("Starting record");
+// Ensure that the new record is empty. This is not strictly necessary, 
but it is a belt and suspenders approach.
+splunkEvent.clear();

Review Comment:
   I removed this from the `endRecord` method.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin

2022-12-20 Thread GitBox


cgivre commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1053979483


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -0,0 +1,308 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+
+import com.splunk.Args;
+import com.splunk.Index;
+import com.splunk.IndexCollection;
+import com.splunk.Service;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.proto.UserBitShared.UserCredentials;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.store.AbstractRecordWriter;
+import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+public class SplunkBatchWriter extends AbstractRecordWriter {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(SplunkBatchWriter.class);
+  private static final String DEFAULT_SOURCETYPE = "drill";
+  private final UserCredentials userCredentials;
+  private final List<String> tableIdentifier;
+  private final SplunkWriter config;
+  private final Args eventArgs;
+  protected final Service splunkService;
+  private JSONObject splunkEvent;
+  protected Index destinationIndex;
+
+
+  public SplunkBatchWriter(UserCredentials userCredentials, List<String> 
tableIdentifier, SplunkWriter config) {
+this.config = config;
+this.tableIdentifier = tableIdentifier;
+this.userCredentials = userCredentials;
+
+SplunkConnection connection = new 
SplunkConnection(config.getPluginConfig(), userCredentials.getUserName());
+this.splunkService = connection.connect();
+
+// Populate event arguments
+this.eventArgs = new Args();
+eventArgs.put("sourcetype", DEFAULT_SOURCETYPE);
+  }
+
+  @Override
+  public void init(Map<String, String> writerOptions) throws IOException {
+// No op
+  }
+
+  /**
+   * Update the schema in RecordWriter. Called at least once before starting 
writing the records. In this case,
+   * we add the index to Splunk here. Splunk's API is a little sparse and 
doesn't really do much in the way
+   * of error checking or providing feedback if the operation fails.
+   *
+   * @param batch {@link VectorAccessible} The incoming batch
+   */
+  @Override
+  public void updateSchema(VectorAccessible batch) {
+logger.debug("Updating schema for Splunk");
+
+//Get the collection of indexes
+IndexCollection indexes = splunkService.getIndexes();
+try {
+  String indexName = tableIdentifier.get(0);
+  indexes.create(indexName);
+  destinationIndex = splunkService.getIndexes().get(indexName);
+} catch (Exception e) {
+  // We have to catch a generic exception here, as Splunk's SDK does not 
really provide any kind of
+  // failure messaging.
+  throw UserException.systemError(e)
+.message("Error creating new index in Splunk plugin: " + 
e.getMessage())
+.build(logger);
+}
+  }
+
+
+  @Override
+  public void startRecord() {
+logger.debug("Starting record");
+splunkEvent = new JSONObject();
+  }
+
+  @Override
+  public void endRecord() throws IOException {
+logger.debug("Ending record");
+// Write the event to the Splunk index
+destinationIndex.submit(eventArgs, splunkEvent.toJSONString());

Review Comment:
   @jnturton I figured this out. Using Splunk's sample code from their SDK 
documentation resulted in Splunk not parsing the fields correctly, which broke 
all the unit tests. I did some experiments and found that removing the date 
actually solved the issue.
   
   Splunk's SDK provides a method for writing to a socket which does all the 
error handling. I used that because it was what the docs recommended; however, 
that method does not allow you to set some of the properties that the other 
insert methods do. But I'm not debugging Splunk's SDK for free. 



-- 
This is an automated message from the Apache Git Service.
To respond to 

[GitHub] [drill] cgivre merged pull request #2725: DRILL-8179: Convert LTSV Format Plugin to EVF2

2022-12-20 Thread GitBox


cgivre merged PR #2725:
URL: https://github.com/apache/drill/pull/2725


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre commented on a diff in pull request #2725: DRILL-8179: Convert LTSV Format Plugin to EVF2

2022-12-20 Thread GitBox


cgivre commented on code in PR #2725:
URL: https://github.com/apache/drill/pull/2725#discussion_r1053498604


##
contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVBatchReader.java:
##
@@ -0,0 +1,264 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.ltsv;
+
+import com.github.lolo.ltsv.LtsvParser;
+import com.github.lolo.ltsv.LtsvParser.Builder;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.drill.common.AutoCloseables;
+import org.apache.drill.common.exceptions.CustomErrorContext;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.physical.impl.scan.v3.ManagedReader;
+import org.apache.drill.exec.physical.impl.scan.v3.file.FileDescrip;
+import org.apache.drill.exec.physical.impl.scan.v3.file.FileSchemaNegotiator;
+import org.apache.drill.exec.physical.resultSet.ResultSetLoader;
+import org.apache.drill.exec.physical.resultSet.RowSetLoader;
+import org.apache.drill.exec.record.metadata.ColumnMetadata;
+import org.apache.drill.exec.record.metadata.MetadataUtils;
+import org.apache.drill.exec.record.metadata.TupleMetadata;
+import org.apache.drill.exec.vector.accessor.ScalarWriter;
+import org.apache.drill.shaded.guava.com.google.common.base.Strings;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.text.ParseException;
+import java.text.SimpleDateFormat;
+import java.time.Instant;
+import java.time.LocalDate;
+import java.time.LocalTime;
+import java.time.format.DateTimeFormatter;
+import java.util.Date;
+import java.util.Iterator;
+import java.util.Map;
+
+public class LTSVBatchReader implements ManagedReader {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(LTSVBatchReader.class);
+  private final LTSVFormatPluginConfig config;
+  private final FileDescrip file;
+  private final CustomErrorContext errorContext;
+  private final LtsvParser ltsvParser;
+  private final RowSetLoader rowWriter;
+  private final FileSchemaNegotiator negotiator;
+  private InputStream fsStream;
+  private Iterator<Map<String, String>> rowIterator;
+
+
+  public LTSVBatchReader(LTSVFormatPluginConfig config, FileSchemaNegotiator 
negotiator) {
+this.config = config;
+this.negotiator = negotiator;
+file = negotiator.file();
+errorContext = negotiator.parentErrorContext();
+ltsvParser = buildParser();
+
+openFile();
+
+// If there is a provided schema, import it
+if (negotiator.providedSchema() != null) {
+  TupleMetadata schema = negotiator.providedSchema();
+  negotiator.tableSchema(schema, false);
+}
+ResultSetLoader loader = negotiator.build();
+rowWriter = loader.writer();
+
+  }
+
+  private void openFile() {
+try {
+  fsStream = 
file.fileSystem().openPossiblyCompressedStream(file.split().getPath());
+} catch (IOException e) {
+  throw UserException
+  .dataReadError(e)
+  .message("Unable to open LTSV File %s", file.split().getPath() + " " 
+ e.getMessage())
+  .addContext(errorContext)
+  .build(logger);
+}
+rowIterator = ltsvParser.parse(fsStream);
+  }
+
+  @Override
+  public boolean next() {
+while (!rowWriter.isFull()) {
+  if (!processNextRow()) {
+return false;
+  }
+}
+return true;
+  }
+
+  private LtsvParser buildParser() {
+Builder builder = LtsvParser.builder();
+builder.trimKeys();
+builder.trimValues();
+builder.skipNullValues();
+
+if (config.getParseMode().contentEquals("strict")) {
+  builder.strict();
+} else {
+  builder.lenient();
+}
+
+if (StringUtils.isNotEmpty(config.getEscapeCharacter())) {
+  builder.withEscapeChar(config.getEscapeCharacter().charAt(0));
+}
+
+if (StringUtils.isNotEmpty(config.getKvDelimiter())) {
+  builder.withKvDelimiter(config.getKvDelimiter().charAt(0));
+}
+
+if (StringUtils.isNotEmpty(config.getEntryDelimiter())) {
+  builder.withEntryDelimiter(config.getEntryDelimiter().charAt(0));
+}
+
+if 

[GitHub] [drill] jnturton commented on a diff in pull request #2725: DRILL-8179: Convert LTSV Format Plugin to EVF2

2022-12-20 Thread GitBox


jnturton commented on code in PR #2725:
URL: https://github.com/apache/drill/pull/2725#discussion_r1053465233


##
contrib/format-ltsv/src/test/java/org/apache/drill/exec/store/ltsv/TestLTSVRecordReader.java:
##
@@ -37,34 +42,77 @@ public static void setup() throws Exception {
 
   @Test
   public void testWildcard() throws Exception {
-testBuilder()
-  .sqlQuery("SELECT * FROM cp.`simple.ltsv`")
-  .unOrdered()
-  .baselineColumns("host", "forwardedfor", "req", "status", "size", 
"referer", "ua", "reqtime", "apptime", "vhost")
-  .baselineValues("xxx.xxx.xxx.xxx", "-", "GET /v1/xxx HTTP/1.1", "200", 
"4968", "-", "Java/1.8.0_131", "2.532", "2.532", "api.example.com")
-  .baselineValues("xxx.xxx.xxx.xxx", "-", "GET /v1/yyy HTTP/1.1", "200", 
"412", "-", "Java/1.8.0_201", "3.580", "3.580", "api.example.com")
-  .go();
+String sql = "SELECT * FROM cp.`simple.ltsv`";

Review Comment:
   Let's rename this class to TestLTSVQueries or similar now that LTSVRecordReader 
is gone?



##
contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVBatchReader.java:
##
@@ -0,0 +1,264 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.ltsv;
+
+import com.github.lolo.ltsv.LtsvParser;
+import com.github.lolo.ltsv.LtsvParser.Builder;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.drill.common.AutoCloseables;
+import org.apache.drill.common.exceptions.CustomErrorContext;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.physical.impl.scan.v3.ManagedReader;
+import org.apache.drill.exec.physical.impl.scan.v3.file.FileDescrip;
+import org.apache.drill.exec.physical.impl.scan.v3.file.FileSchemaNegotiator;
+import org.apache.drill.exec.physical.resultSet.ResultSetLoader;
+import org.apache.drill.exec.physical.resultSet.RowSetLoader;
+import org.apache.drill.exec.record.metadata.ColumnMetadata;
+import org.apache.drill.exec.record.metadata.MetadataUtils;
+import org.apache.drill.exec.record.metadata.TupleMetadata;
+import org.apache.drill.exec.vector.accessor.ScalarWriter;
+import org.apache.drill.shaded.guava.com.google.common.base.Strings;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.text.ParseException;
+import java.text.SimpleDateFormat;
+import java.time.Instant;
+import java.time.LocalDate;
+import java.time.LocalTime;
+import java.time.format.DateTimeFormatter;
+import java.util.Date;
+import java.util.Iterator;
+import java.util.Map;
+
+public class LTSVBatchReader implements ManagedReader {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(LTSVBatchReader.class);
+  private final LTSVFormatPluginConfig config;
+  private final FileDescrip file;
+  private final CustomErrorContext errorContext;
+  private final LtsvParser ltsvParser;
+  private final RowSetLoader rowWriter;
+  private final FileSchemaNegotiator negotiator;
+  private InputStream fsStream;
+  private Iterator<Map<String, String>> rowIterator;
+
+
+  public LTSVBatchReader(LTSVFormatPluginConfig config, FileSchemaNegotiator 
negotiator) {
+this.config = config;
+this.negotiator = negotiator;
+file = negotiator.file();
+errorContext = negotiator.parentErrorContext();
+ltsvParser = buildParser();
+
+openFile();
+
+// If there is a provided schema, import it
+if (negotiator.providedSchema() != null) {
+  TupleMetadata schema = negotiator.providedSchema();
+  negotiator.tableSchema(schema, false);
+}
+ResultSetLoader loader = negotiator.build();
+rowWriter = loader.writer();
+
+  }
+
+  private void openFile() {
+try {
+  fsStream = 
file.fileSystem().openPossiblyCompressedStream(file.split().getPath());
+} catch (IOException e) {
+  throw UserException
+  .dataReadError(e)
+  .message("Unable to open LTSV File %s", file.split().getPath() + " " 
+ e.getMessage())
+  .addContext(errorContext)
+  .build(logger);
+}
+rowIterator = 

[GitHub] [drill] jnturton commented on pull request #2668: DRILL-8328: HTTP UDF Not Resolving Storage Aliases

2022-12-20 Thread GitBox


jnturton commented on PR #2668:
URL: https://github.com/apache/drill/pull/2668#issuecomment-1359539411

   I've just removed the backport-to-stable tag since these UDFs arrived after 
Drill 1.20. Thanks to @kingswanwho for spotting this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre opened a new pull request, #2725: DRILL-8179: Convert LTSV Format Plugin to EVF2

2022-12-19 Thread GitBox


cgivre opened a new pull request, #2725:
URL: https://github.com/apache/drill/pull/2725

   # [DRILL-8179](https://issues.apache.org/jira/browse/DRILL-8179): Convert 
LTSV Format Plugin to EVF2
   
   ## Description
   With this PR, all format plugins are now using the EVF readers.   This is 
part of [DRILL-8132](https://issues.apache.org/jira/browse/DRILL-8312).  
   
   ## Documentation
   In addition to refactoring the plugin to use EVF V2, this code replaces the 
homegrown LTSV reader with a third-party parsing library, and introduces new 
configuration options.  These options are all noted in the updated README.  
However, they are all optional, so most users are not likely to notice any real 
difference. 
   
   One exception is the option which controls error tolerance (the strict vs. 
lenient parse mode).  
   
   ## Testing
   Ran existing unit tests.
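   For readers unfamiliar with the format: LTSV (Labeled Tab-Separated Values) 
encodes each record as tab-separated `key:value` entries on one line. The plugin 
delegates the real parsing to the `com.github.lolo.ltsv` library; the following is 
only a minimal, dependency-free sketch of the format itself (the `parseLine` 
helper and its lenient skip-malformed-entries behavior are illustrative 
assumptions, not the plugin's code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LtsvSketch {

  // Parse one LTSV line: tab-separated "key:value" pairs.
  // Entries without a ':' separator are skipped (lenient behavior).
  static Map<String, String> parseLine(String line) {
    Map<String, String> row = new LinkedHashMap<>();
    for (String entry : line.split("\t")) {
      int sep = entry.indexOf(':');
      if (sep < 0) {
        continue; // lenient mode: skip malformed entries
      }
      row.put(entry.substring(0, sep).trim(), entry.substring(sep + 1).trim());
    }
    return row;
  }

  public static void main(String[] args) {
    Map<String, String> row =
        parseLine("host:xxx.xxx.xxx.xxx\tstatus:200\tua:Java/1.8.0_131");
    System.out.println(row.get("status")); // prints 200
  }
}
```

   A strict parser would instead raise an error on a malformed entry, which is 
the trade-off the plugin's `strict`/`lenient` parse-mode option exposes.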





[GitHub] [drill] weijunlu commented on issue #2693: Order by expression failed to execute in mysql plugin

2022-12-19 Thread GitBox


weijunlu commented on issue #2693:
URL: https://github.com/apache/drill/issues/2693#issuecomment-1358751539

   @vvysotskyi @cgivre if the only_full_group_by SQL mode is disabled in MySQL, 
the SQL can be executed.
   
   Jupiter (mysql.test)> select
   2..semicolon> extract(year from o_orderdate) as o_year
   3..semicolon> from orders
   4..semicolon> group by o_year
   5..semicolon> order by o_year;
   ++
   | o_year |
   ++
   | 1992   |
   | 1993   |
   | 1994   |
   | 1995   |
   | 1996   |
   | 1997   |
   | 1998   |
   ++
   7 rows selected (4.079 seconds)





[GitHub] [drill] jnturton commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin

2022-12-19 Thread GitBox


jnturton commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1052404918


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -0,0 +1,308 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+
+import com.splunk.Args;
+import com.splunk.Index;
+import com.splunk.IndexCollection;
+import com.splunk.Service;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.proto.UserBitShared.UserCredentials;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.store.AbstractRecordWriter;
+import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+public class SplunkBatchWriter extends AbstractRecordWriter {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(SplunkBatchWriter.class);
+  private static final String DEFAULT_SOURCETYPE = "drill";
+  private final UserCredentials userCredentials;
+  private final List tableIdentifier;
+  private final SplunkWriter config;
+  private final Args eventArgs;
+  protected final Service splunkService;
+  private JSONObject splunkEvent;
+  protected Index destinationIndex;
+
+
+  public SplunkBatchWriter(UserCredentials userCredentials, List 
tableIdentifier, SplunkWriter config) {
+this.config = config;
+this.tableIdentifier = tableIdentifier;
+this.userCredentials = userCredentials;
+
+SplunkConnection connection = new 
SplunkConnection(config.getPluginConfig(), userCredentials.getUserName());
+this.splunkService = connection.connect();
+
+// Populate event arguments
+this.eventArgs = new Args();
+eventArgs.put("sourcetype", DEFAULT_SOURCETYPE);
+  }
+
+  @Override
+  public void init(Map writerOptions) throws IOException {
+// No op
+  }
+
+  /**
+   * Update the schema in RecordWriter. Called at least once before starting 
writing the records. In this case,
+   * we add the index to Splunk here. Splunk's API is a little sparse and 
doesn't really do much in the way
+   * of error checking or providing feedback if the operation fails.
+   *
+   * @param batch {@link VectorAccessible} The incoming batch
+   */
+  @Override
+  public void updateSchema(VectorAccessible batch) {
+logger.debug("Updating schema for Splunk");
+
+//Get the collection of indexes
+IndexCollection indexes = splunkService.getIndexes();
+try {
+  String indexName = tableIdentifier.get(0);
+  indexes.create(indexName);
+  destinationIndex = splunkService.getIndexes().get(indexName);
+} catch (Exception e) {
+  // We have to catch a generic exception here, as Splunk's SDK does not 
really provide any kind of
+  // failure messaging.
+  throw UserException.systemError(e)
+.message("Error creating new index in Splunk plugin: " + 
e.getMessage())
+.build(logger);
+}
+  }
+
+
+  @Override
+  public void startRecord() {
+logger.debug("Starting record");
+splunkEvent = new JSONObject();
+  }
+
+  @Override
+  public void endRecord() throws IOException {
+logger.debug("Ending record");
+// Write the event to the Splunk index
+destinationIndex.submit(eventArgs, splunkEvent.toJSONString());

Review Comment:
   @cgivre can we leave a comment explaining this to readers then?



##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -0,0 +1,309 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * 

[GitHub] [drill] cgivre commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin

2022-12-19 Thread GitBox


cgivre commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1052374395


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -0,0 +1,308 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+
+import com.splunk.Args;
+import com.splunk.Index;
+import com.splunk.IndexCollection;
+import com.splunk.Service;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.proto.UserBitShared.UserCredentials;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.store.AbstractRecordWriter;
+import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+public class SplunkBatchWriter extends AbstractRecordWriter {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(SplunkBatchWriter.class);
+  private static final String DEFAULT_SOURCETYPE = "drill";
+  private final UserCredentials userCredentials;
+  private final List tableIdentifier;
+  private final SplunkWriter config;
+  private final Args eventArgs;
+  protected final Service splunkService;
+  private JSONObject splunkEvent;
+  protected Index destinationIndex;
+
+
+  public SplunkBatchWriter(UserCredentials userCredentials, List 
tableIdentifier, SplunkWriter config) {
+this.config = config;
+this.tableIdentifier = tableIdentifier;
+this.userCredentials = userCredentials;
+
+SplunkConnection connection = new 
SplunkConnection(config.getPluginConfig(), userCredentials.getUserName());
+this.splunkService = connection.connect();
+
+// Populate event arguments
+this.eventArgs = new Args();
+eventArgs.put("sourcetype", DEFAULT_SOURCETYPE);
+  }
+
+  @Override
+  public void init(Map writerOptions) throws IOException {
+// No op
+  }
+
+  /**
+   * Update the schema in RecordWriter. Called at least once before starting 
writing the records. In this case,
+   * we add the index to Splunk here. Splunk's API is a little sparse and 
doesn't really do much in the way
+   * of error checking or providing feedback if the operation fails.
+   *
+   * @param batch {@link VectorAccessible} The incoming batch
+   */
+  @Override
+  public void updateSchema(VectorAccessible batch) {
+logger.debug("Updating schema for Splunk");
+
+//Get the collection of indexes
+IndexCollection indexes = splunkService.getIndexes();
+try {
+  String indexName = tableIdentifier.get(0);
+  indexes.create(indexName);
+  destinationIndex = splunkService.getIndexes().get(indexName);
+} catch (Exception e) {
+  // We have to catch a generic exception here, as Splunk's SDK does not 
really provide any kind of
+  // failure messaging.
+  throw UserException.systemError(e)
+.message("Error creating new index in Splunk plugin: " + 
e.getMessage())
+.build(logger);
+}
+  }
+
+
+  @Override
+  public void startRecord() {
+logger.debug("Starting record");
+splunkEvent = new JSONObject();
+  }
+
+  @Override
+  public void endRecord() throws IOException {
+logger.debug("Ending record");
+// Write the event to the Splunk index
+destinationIndex.submit(eventArgs, splunkEvent.toJSONString());
+// Clear out the splunk event.
+splunkEvent = new JSONObject();

Review Comment:
   Yes. This line clears out the event so that we start fresh on every row.   I 
discovered there is a `clear` method, so I called that rather than creating a 
new object every time. 
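   The "fresh event per row" pattern discussed here can be sketched 
independently of the Splunk SDK: one reusable map is cleared at the start of 
each record and serialized out at the end, which is what `startRecord()` and 
`endRecord()` accomplish in the writer. A minimal illustration (hypothetical 
names, a plain `Map` standing in for `JSONObject`, and a list standing in for 
`destinationIndex.submit(...)`):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EventPerRowSketch {
  // One reusable event map; cleared per row instead of reallocated.
  private final Map<String, Object> event = new LinkedHashMap<>();
  // Stand-in for destinationIndex.submit(...): collect serialized events.
  final List<String> submitted = new ArrayList<>();

  void startRecord() { event.clear(); }
  void setField(String key, Object value) { event.put(key, value); }
  void endRecord() { submitted.add(event.toString()); }

  public static void main(String[] args) {
    EventPerRowSketch w = new EventPerRowSketch();
    w.startRecord(); w.setField("status", 200); w.endRecord();
    w.startRecord(); w.setField("status", 404); w.endRecord();
    // Fields from row 1 do not leak into row 2 because of the clear() call.
    System.out.println(w.submitted);
  }
}
```

   Clearing and reusing one object avoids a per-row allocation, at the cost of 
having to remember the `clear()` call; forgetting it would leak fields from one 
row into the next.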






[GitHub] [drill] cgivre commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin

2022-12-19 Thread GitBox


cgivre commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1052368638


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -0,0 +1,308 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+
+import com.splunk.Args;
+import com.splunk.Index;
+import com.splunk.IndexCollection;
+import com.splunk.Service;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.proto.UserBitShared.UserCredentials;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.store.AbstractRecordWriter;
+import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+public class SplunkBatchWriter extends AbstractRecordWriter {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(SplunkBatchWriter.class);
+  private static final String DEFAULT_SOURCETYPE = "drill";
+  private final UserCredentials userCredentials;
+  private final List tableIdentifier;
+  private final SplunkWriter config;
+  private final Args eventArgs;
+  protected final Service splunkService;
+  private JSONObject splunkEvent;
+  protected Index destinationIndex;
+
+
+  public SplunkBatchWriter(UserCredentials userCredentials, List 
tableIdentifier, SplunkWriter config) {
+this.config = config;
+this.tableIdentifier = tableIdentifier;
+this.userCredentials = userCredentials;
+
+SplunkConnection connection = new 
SplunkConnection(config.getPluginConfig(), userCredentials.getUserName());
+this.splunkService = connection.connect();
+
+// Populate event arguments
+this.eventArgs = new Args();
+eventArgs.put("sourcetype", DEFAULT_SOURCETYPE);
+  }
+
+  @Override
+  public void init(Map writerOptions) throws IOException {
+// No op
+  }
+
+  /**
+   * Update the schema in RecordWriter. Called at least once before starting 
writing the records. In this case,

Review Comment:
   Sorry, I clarified the comment: this is called once before the records are 
written.






[GitHub] [drill] cgivre commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin

2022-12-19 Thread GitBox


cgivre commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1052367809


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkInsertWriter.java:
##
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+
+import java.util.List;
+
+public class SplunkInsertWriter extends SplunkWriter {
+  public static final String OPERATOR_TYPE = "SPLUNK_INSERT_WRITER";
+
+  private final SplunkStoragePlugin plugin;
+  private final List tableIdentifier;
+
+  @JsonCreator
+  public SplunkInsertWriter(
+  @JsonProperty("child") PhysicalOperator child,
+  @JsonProperty("tableIdentifier") List tableIdentifier,
+  @JsonProperty("storage") SplunkPluginConfig storageConfig,
+  @JacksonInject StoragePluginRegistry engineRegistry) {
+super(child, tableIdentifier, engineRegistry.resolve(storageConfig, 
SplunkStoragePlugin.class));

Review Comment:
   Fixed.






[GitHub] [drill] cgivre commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin

2022-12-19 Thread GitBox


cgivre commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1052363847


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -0,0 +1,308 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+
+import com.splunk.Args;
+import com.splunk.Index;
+import com.splunk.IndexCollection;
+import com.splunk.Service;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.proto.UserBitShared.UserCredentials;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.store.AbstractRecordWriter;
+import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+public class SplunkBatchWriter extends AbstractRecordWriter {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(SplunkBatchWriter.class);
+  private static final String DEFAULT_SOURCETYPE = "drill";
+  private final UserCredentials userCredentials;
+  private final List tableIdentifier;
+  private final SplunkWriter config;
+  private final Args eventArgs;
+  protected final Service splunkService;
+  private JSONObject splunkEvent;
+  protected Index destinationIndex;
+
+
+  public SplunkBatchWriter(UserCredentials userCredentials, List 
tableIdentifier, SplunkWriter config) {
+this.config = config;
+this.tableIdentifier = tableIdentifier;
+this.userCredentials = userCredentials;
+
+SplunkConnection connection = new 
SplunkConnection(config.getPluginConfig(), userCredentials.getUserName());
+this.splunkService = connection.connect();
+
+// Populate event arguments
+this.eventArgs = new Args();
+eventArgs.put("sourcetype", DEFAULT_SOURCETYPE);
+  }
+
+  @Override
+  public void init(Map writerOptions) throws IOException {
+// No op
+  }
+
+  /**
+   * Update the schema in RecordWriter. Called at least once before starting 
writing the records. In this case,
+   * we add the index to Splunk here. Splunk's API is a little sparse and 
doesn't really do much in the way
+   * of error checking or providing feedback if the operation fails.
+   *
+   * @param batch {@link VectorAccessible} The incoming batch
+   */
+  @Override
+  public void updateSchema(VectorAccessible batch) {
+logger.debug("Updating schema for Splunk");
+
+//Get the collection of indexes
+IndexCollection indexes = splunkService.getIndexes();
+try {
+  String indexName = tableIdentifier.get(0);
+  indexes.create(indexName);
+  destinationIndex = splunkService.getIndexes().get(indexName);
+} catch (Exception e) {
+  // We have to catch a generic exception here, as Splunk's SDK does not 
really provide any kind of
+  // failure messaging.
+  throw UserException.systemError(e)
+.message("Error creating new index in Splunk plugin: " + 
e.getMessage())
+.build(logger);
+}
+  }
+
+
+  @Override
+  public void startRecord() {
+logger.debug("Starting record");
+splunkEvent = new JSONObject();
+  }
+
+  @Override
+  public void endRecord() throws IOException {
+logger.debug("Ending record");
+// Write the event to the Splunk index
+destinationIndex.submit(eventArgs, splunkEvent.toJSONString());

Review Comment:
   I think there may be a bug in the Splunk SDK.  






[GitHub] [drill] cgivre commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin

2022-12-19 Thread GitBox


cgivre commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1052356461


##
contrib/storage-splunk/src/test/java/org/apache/drill/exec/store/splunk/SplunkWriterTest.java:
##
@@ -0,0 +1,191 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+import org.apache.drill.categories.SlowTest;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.physical.rowSet.DirectRowSet;
+import org.apache.drill.exec.physical.rowSet.RowSet;
+import org.apache.drill.exec.physical.rowSet.RowSetBuilder;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.exec.record.metadata.TupleMetadata;
+import org.apache.drill.test.QueryBuilder.QuerySummary;
+import org.apache.drill.test.rowSet.RowSetUtilities;
+import org.junit.FixMethodOrder;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.runners.MethodSorters;
+
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+
+@FixMethodOrder(MethodSorters.JVM)
+@Category({SlowTest.class})
+public class SplunkWriterTest extends SplunkBaseTest {
+
+  @Test
+  public void testBasicCTAS() throws Exception {
+
+// Verify that there is no index called t1 in Splunk
+String sql = "SELECT * FROM INFORMATION_SCHEMA.`TABLES` WHERE TABLE_SCHEMA = 'splunk' AND TABLE_NAME LIKE 't1'";
+RowSet results = client.queryBuilder().sql(sql).rowSet();
+assertEquals(0, results.rowCount());
+results.clear();
+
+// Now create the table
+sql = "CREATE TABLE `splunk`.`t1` AS SELECT * FROM cp.`test_data.csvh`";
+QuerySummary summary = client.queryBuilder().sql(sql).run();
+assertTrue(summary.succeeded());
+
+// Verify that an index was created called t1 in Splunk
+sql = "SELECT * FROM INFORMATION_SCHEMA.`TABLES` WHERE TABLE_SCHEMA = 'splunk' AND TABLE_NAME LIKE 't1'";
+results = client.queryBuilder().sql(sql).rowSet();
+assertEquals(1, results.rowCount());
+results.clear();
+
+// There seems to be some delay between the Drill query writing the data and the data being made
+// accessible.
+Thread.sleep(3);

Review Comment:
   Yeah.. There seems to be a processing delay between inserting data and it actually being queryable.   I don't think this is a Drill issue. 
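   A fixed `Thread.sleep()` either wastes time or still flakes when the delay varies; a bounded polling loop is a common alternative. A minimal sketch with hypothetical names and no Splunk dependency — in the test, the condition could be the row-count query returning 1:

   ```java
   import java.util.function.BooleanSupplier;

   public class AwaitCondition {
     // Polls until the condition holds or the timeout elapses; returns whether it held.
     static boolean await(BooleanSupplier condition, long timeoutMillis, long pollMillis)
         throws InterruptedException {
       long deadline = System.currentTimeMillis() + timeoutMillis;
       while (System.currentTimeMillis() < deadline) {
         if (condition.getAsBoolean()) {
           return true;
         }
         Thread.sleep(pollMillis);
       }
       // One final check at the deadline to avoid a race with the last poll.
       return condition.getAsBoolean();
     }

     public static void main(String[] args) throws InterruptedException {
       long start = System.currentTimeMillis();
       // Condition becomes true after ~200 ms; await returns as soon as it does.
       boolean ok = await(() -> System.currentTimeMillis() - start > 200, 5000, 50);
       System.out.println(ok); // prints true
     }
   }
   ```

   This keeps the test's worst case bounded by the timeout while returning as soon as the data is visible, instead of always paying the full sleep.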



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin

2022-12-19 Thread GitBox


cgivre commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1052354949


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -0,0 +1,308 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+
+import com.splunk.Args;
+import com.splunk.Index;
+import com.splunk.IndexCollection;
+import com.splunk.Service;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.proto.UserBitShared.UserCredentials;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.store.AbstractRecordWriter;
+import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+public class SplunkBatchWriter extends AbstractRecordWriter {
+
+  private static final Logger logger = LoggerFactory.getLogger(SplunkBatchWriter.class);
+  private static final String DEFAULT_SOURCETYPE = "drill";
+  private final UserCredentials userCredentials;
+  private final List<String> tableIdentifier;
+  private final SplunkWriter config;
+  private final Args eventArgs;
+  protected final Service splunkService;
+  private JSONObject splunkEvent;
+  protected Index destinationIndex;
+
+
+  public SplunkBatchWriter(UserCredentials userCredentials, List<String> tableIdentifier, SplunkWriter config) {
+this.config = config;
+this.tableIdentifier = tableIdentifier;
+this.userCredentials = userCredentials;
+
+SplunkConnection connection = new SplunkConnection(config.getPluginConfig(), userCredentials.getUserName());
+this.splunkService = connection.connect();
+
+// Populate event arguments
+this.eventArgs = new Args();
+eventArgs.put("sourcetype", DEFAULT_SOURCETYPE);
+  }
+
+  @Override
+  public void init(Map<String, String> writerOptions) throws IOException {
+// No op
+  }
+
+  /**
+   * Update the schema in RecordWriter. Called at least once before starting writing the records. In this case,
+   * we add the index to Splunk here. Splunk's API is a little sparse and doesn't really do much in the way
+   * of error checking or providing feedback if the operation fails.
+   *
+   * @param batch {@link VectorAccessible} The incoming batch
+   */
+  @Override
+  public void updateSchema(VectorAccessible batch) {
+logger.debug("Updating schema for Splunk");
+
+//Get the collection of indexes
+IndexCollection indexes = splunkService.getIndexes();
+try {
+  String indexName = tableIdentifier.get(0);
+  indexes.create(indexName);
+  destinationIndex = splunkService.getIndexes().get(indexName);
+} catch (Exception e) {
+  // We have to catch a generic exception here, as Splunk's SDK does not really provide any kind of
+  // failure messaging.
+  throw UserException.systemError(e)
+.message("Error creating new index in Splunk plugin: " + e.getMessage())
+.build(logger);
+}
+  }
+
+
+  @Override
+  public void startRecord() {
+logger.debug("Starting record");
+splunkEvent = new JSONObject();
+  }
+
+  @Override
+  public void endRecord() throws IOException {
+logger.debug("Ending record");
+// Write the event to the Splunk index
+destinationIndex.submit(eventArgs, splunkEvent.toJSONString());

Review Comment:
   @jnturton 
   I actually tried this first and I couldn't get Splunk to actually write any data.  I literally cut/pasted their code into Drill to no avail. 
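   One workaround sometimes used when per-event submission appears to drop data is to buffer events and submit them as a single payload at cleanup. A Splunk-free sketch of that buffering pattern — all names here are hypothetical; with the real SDK the flush would be the `destinationIndex.submit(eventArgs, payload)` call shown in the diff above:

   ```java
   import java.util.ArrayList;
   import java.util.List;

   public class BufferingWriter {
     private final List<String> buffer = new ArrayList<>();
     private final StringBuilder sink = new StringBuilder(); // stands in for the Splunk index

     // Called per record: defer the network call instead of submitting each event.
     void endRecord(String eventJson) {
       buffer.add(eventJson);
     }

     // Called once at the end of the batch: one submit for the whole payload.
     void cleanup() {
       String payload = String.join("\n", buffer);
       sink.append(payload);
       buffer.clear();
     }

     String submitted() {
       return sink.toString();
     }

     public static void main(String[] args) {
       BufferingWriter w = new BufferingWriter();
       w.endRecord("{\"a\":1}");
       w.endRecord("{\"a\":2}");
       w.cleanup();
       System.out.println(w.submitted()); // the two events separated by a newline
     }
   }
   ```

   Whether this helps depends on why the per-event submits were dropped, but it also reduces HTTP round trips, which matters for large CTAS writes.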





