[GitHub] [drill] jnturton commented on pull request #2743: DRILL-8391: Disable auto complete on the password field of web UI login forms
jnturton commented on PR #2743: URL: https://github.com/apache/drill/pull/2743#issuecomment-1398480919

The dang squash and merge mangled the commit message!
[GitHub] [drill] cgivre merged pull request #2743: DRILL-8391: Disable auto complete on the password field of web UI login forms
cgivre merged PR #2743: URL: https://github.com/apache/drill/pull/2743
[GitHub] [drill] jnturton commented on pull request #2636: DRILL-8290: Short cut recursive file listings for LIMIT 0 queries.
jnturton commented on PR #2636: URL: https://github.com/apache/drill/pull/2636#issuecomment-1398472306

> For such queries the same QueryComputationHints will be used for both inputs, so it will cause incorrect results.

@vvysotskyi the idea here was that only a LIMIT 0 on the _root_ SELECT is detected, in which case the single-file optimisation can be done on _all_ inputs, so a single flag is sufficient. However, I'm trying to implement a better approach that optimises LIMIT 0s at any level. Since files are first listed very early, during validation (so even before partition pruning 🙁), no RelNode trees are available and the detection will have to be done on the SqlNode tree.
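For illustration, a minimal sketch of what root-level LIMIT 0 detection on a Calcite SqlNode tree could look like; the class and method names are hypothetical, not Drill's actual implementation, and it only inspects the outermost node:

```java
import org.apache.calcite.sql.SqlLiteral;
import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.sql.SqlOrderBy;
import org.apache.calcite.sql.SqlSelect;

public class LimitZeroDetector {

  /** Returns true when the root of a parsed statement carries LIMIT/FETCH 0. */
  public static boolean isRootLimitZero(SqlNode root) {
    SqlNode fetch = null;
    if (root instanceof SqlOrderBy) {
      // ORDER BY ... LIMIT n parses into a SqlOrderBy wrapper holding the fetch operand.
      fetch = ((SqlOrderBy) root).fetch;
    } else if (root instanceof SqlSelect) {
      fetch = ((SqlSelect) root).getFetch();
    }
    // A dynamic parameter (LIMIT ?) is not a literal, so it is conservatively ignored.
    return fetch instanceof SqlLiteral && ((SqlLiteral) fetch).longValue(true) == 0L;
  }
}
```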
[GitHub] [drill] jnturton opened a new pull request, #2743: DRILL-8391: Disable auto complete on the password field of web UI login forms
jnturton opened a new pull request, #2743: URL: https://github.com/apache/drill/pull/2743

# [DRILL-8391](https://issues.apache.org/jira/browse/DRILL-8391): Disable auto complete on the password field of web UI login forms

## Description
To avoid triggering security scanners it is necessary to set `autocomplete="off"` on the password field in the web UI login forms. This change probably has no real-world security benefit because

> Even without a master password, in-browser password management is generally seen as a net gain for security. Since users do not have to remember passwords that the browser stores for them, they are able to choose stronger passwords than they would otherwise.
>
> For this reason, many modern browsers do not support autocomplete="off" for login fields:
>
> - If a site sets autocomplete="off" for a form, and the form includes username and password input fields, then the browser still offers to remember this login, and if the user agrees, the browser will autofill those fields the next time the user visits the page.
> - If a site sets autocomplete="off" for username and password input fields, then the browser still offers to remember this login, and if the user agrees, the browser will autofill those fields the next time the user visits the page.

Excerpt taken from [this Mozilla Developer Network page](https://developer.mozilla.org/en-US/docs/Web/Security/Securing_your_site/Turning_off_form_autocompletion).

## Documentation
N/A

## Testing
Confirm that the attribute assignment `autocomplete="off"` is present on the password field of the web UI login form.
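The change itself is a single attribute on the login form's password input. Purely as an illustrative sketch (the form action and field names below are assumptions, not copied from Drill's actual template):

```html
<form method="POST" action="/j_security_check">
  <input type="text" name="j_username" placeholder="Username">
  <!-- The attribute this PR adds: -->
  <input type="password" name="j_password" autocomplete="off" placeholder="Password">
  <input type="submit" value="Login">
</form>
```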
[GitHub] [drill] cgivre merged pull request #2742: DRILL-8390: Minor Improvements to PDF Reader
cgivre merged PR #2742: URL: https://github.com/apache/drill/pull/2742
[GitHub] [drill] cgivre opened a new pull request, #2742: DRILL-8390: Minor Improvements to PDF Reader
cgivre opened a new pull request, #2742: URL: https://github.com/apache/drill/pull/2742

# [DRILL-8390](https://issues.apache.org/jira/browse/DRILL-8390): Minor Improvements to PDF Reader

## Description
This PR makes some minor improvements to the PDF reader:

- Fixes a minor bug where, in certain configurations, the first row of data was skipped.
- Fixes a minor bug where empty tables caused crashes when the spreadsheet extraction algorithm was used.
- Adds a `_table_count` metadata field.
- Adds a `_table_index` metadata field to reflect the current table.

## Documentation
See above. Updated README.

## Testing
Ran existing unit tests. Manually tested against customer data.
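A hedged sketch of how the new metadata fields might appear in a query (the file path and data column are hypothetical):

```sql
-- _table_index identifies which table in the PDF a row came from;
-- _table_count reports how many tables were extracted in total.
SELECT _table_index, _table_count, first_name
FROM dfs.`reports/example.pdf`;
```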
[GitHub] [drill] cgivre commented on issue #2721: select * from hive ;report refcnt = 0 error
cgivre commented on issue #2721: URL: https://github.com/apache/drill/issues/2721#issuecomment-1387248414

I'm just realizing something here. Are you attempting to run an INSERT query into Hive via Drill?
[GitHub] [drill] cgivre merged pull request #2741: [MINOR UPDATE]: Remove travis.yml
cgivre merged PR #2741: URL: https://github.com/apache/drill/pull/2741
[GitHub] [drill] cgivre closed pull request #2731: DRILL-5033: Query on JSON That Has Null as Value For Each Key
cgivre closed pull request #2731: DRILL-5033: Query on JSON That Has Null as Value For Each Key URL: https://github.com/apache/drill/pull/2731
[GitHub] [drill] cgivre commented on pull request #2731: DRILL-5033: Query on JSON That Has Null as Value For Each Key
cgivre commented on PR #2731: URL: https://github.com/apache/drill/pull/2731#issuecomment-1386304077

I'm going to close this PR. If there is any objection, we can revisit.
[GitHub] [drill] cgivre opened a new pull request, #2741: [MINOR UPDATE]: Remove travis.yml
cgivre opened a new pull request, #2741: URL: https://github.com/apache/drill/pull/2741

# [MINOR UPDATE]: Remove travis.yml

## Description
Per INFRA request, the Apache Software Foundation is moving away from Travis CI and has requested that all projects deactivate it.

## Documentation
N/A

## Testing
N/A
[GitHub] [drill] cgivre merged pull request #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias
cgivre merged PR #2733: URL: https://github.com/apache/drill/pull/2733
[GitHub] [drill] vvysotskyi commented on pull request #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias
vvysotskyi commented on PR #2733: URL: https://github.com/apache/drill/pull/2733#issuecomment-1384890894

Yes, it can be merged before Calcite is released.
[GitHub] [drill] cgivre merged pull request #2740: [MINOR UPDATE]: Clear Results after Splunk Unit Tests
cgivre merged PR #2740: URL: https://github.com/apache/drill/pull/2740
[GitHub] [drill] cgivre commented on pull request #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias
cgivre commented on PR #2733: URL: https://github.com/apache/drill/pull/2733#issuecomment-1384683174

@vvysotskyi Can we merge this or should we wait for Calcite 1.33 to be released?
[GitHub] [drill] cgivre opened a new pull request, #2740: [MINOR UPDATE]: Clear Results after Splunk Unit Tests
cgivre opened a new pull request, #2740: URL: https://github.com/apache/drill/pull/2740

## Description
This minor modification to the Splunk unit tests for user translation explicitly clears the result sets. During some other work, I found that these tests would occasionally fail; this fixes that.

## Documentation
No user-facing changes.

## Testing
Existing unit tests.
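The pattern in question is the one Drill's plugin tests use throughout this digest; a minimal self-contained sketch of it (querying the bundled employee.json classpath sample rather than a live Splunk instance):

```java
import static org.junit.Assert.assertEquals;

import org.apache.drill.exec.physical.rowSet.RowSet;
import org.apache.drill.test.ClusterFixture;
import org.apache.drill.test.ClusterTest;
import org.junit.BeforeClass;
import org.junit.Test;

public class ExampleClearResultsTest extends ClusterTest {

  @BeforeClass
  public static void setup() throws Exception {
    startCluster(ClusterFixture.builder(dirTestWatcher));
  }

  @Test
  public void testRowSetIsCleared() throws Exception {
    String sql = "SELECT full_name FROM cp.`employee.json` LIMIT 1";
    RowSet results = client.queryBuilder().sql(sql).rowSet();
    assertEquals(1, results.rowCount());
    // Release the direct-memory buffers backing the row set. Row sets left
    // unreleased are what caused the intermittent failures in later tests.
    results.clear();
  }
}
```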
[GitHub] [drill] cgivre commented on pull request #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias
cgivre commented on PR #2733: URL: https://github.com/apache/drill/pull/2733#issuecomment-1384307651

One thing I noticed is that the Splunk tests sometimes fail locally as they don't have results.clear() at the end. This is inconsistent behavior, but I added that in the ES PR that I'm working on.

Best,
-- C
[GitHub] [drill] vvysotskyi commented on pull request #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias
vvysotskyi commented on PR #2733: URL: https://github.com/apache/drill/pull/2733#issuecomment-1384305483

I think the Splunk tests have a condition somewhere to fail if I'm the author of the commit 😅 Here is the CI run for another of my commits in the master branch that has the same error: https://github.com/apache/drill/actions/runs/3874925191/jobs/6620797212
[GitHub] [drill] jnturton commented on pull request #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias
jnturton commented on PR #2733: URL: https://github.com/apache/drill/pull/2733#issuecomment-1383868835

@vvysotskyi given that most of the CI runs had passed, I ran (8, default-hadoop) again but the same failure turned up. I wouldn't expect a failure such as the following to affect only JDK 8 though, so I'm a little mystified.

```
Error: SplunkWriterTest.testBasicCTASWithScalarDataTypes:136 Schemas don't match.
Expected: [TupleSchema [PrimitiveColumnMetadata [`int_field` (VARCHAR:OPTIONAL)]], [PrimitiveColumnMetadata [`bigint_field` (VARCHAR:OPTIONAL)]], [PrimitiveColumnMetadata [`float4_field` (VARCHAR:OPTIONAL)]], [PrimitiveColumnMetadata [`float8_field` (VARCHAR:OPTIONAL)]], [PrimitiveColumnMetadata [`varchar_field` (VARCHAR:OPTIONAL)]], [PrimitiveColumnMetadata [`date_field` (VARCHAR:OPTIONAL)]], [PrimitiveColumnMetadata [`time_field` (VARCHAR:OPTIONAL)]], [PrimitiveColumnMetadata [`timestamp_field` (VARCHAR:OPTIONAL)]], [PrimitiveColumnMetadata [`boolean_field` (VARCHAR:OPTIONAL)]]]
Actual: [TupleSchema [PrimitiveColumnMetadata [`int_field` (INT:OPTIONAL)]], [PrimitiveColumnMetadata [`bigint_field` (INT:OPTIONAL)]], [PrimitiveColumnMetadata [`float4_field` (INT:OPTIONAL)]], [PrimitiveColumnMetadata [`float8_field` (INT:OPTIONAL)]], [PrimitiveColumnMetadata [`varchar_field` (INT:OPTIONAL)]], [PrimitiveColumnMetadata [`date_field` (INT:OPTIONAL)]], [PrimitiveColumnMetadata [`time_field` (INT:OPTIONAL)]], [PrimitiveColumnMetadata [`timestamp_field` (INT:OPTIONAL)]], [PrimitiveColumnMetadata [`boolean_field` (INT:OPTIONAL)]]]
```
[GitHub] [drill] cgivre merged pull request #2737: DRILL-8384: Add Format Plugin for Microsoft Access
cgivre merged PR #2737: URL: https://github.com/apache/drill/pull/2737
[GitHub] [drill] vvysotskyi commented on a diff in pull request #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias
vvysotskyi commented on code in PR #2733: URL: https://github.com/apache/drill/pull/2733#discussion_r1070571396

exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/WorkspaceSchemaFactory.java:
```
@@ -403,8 +404,24 @@ private View getView(DotDrillFile f) throws IOException {
   return f.getView(mapper);
 }

+private String getTemporaryName(String name) {
+  if (isTemporaryWorkspace()) {
+    String tableName = DrillStringUtils.removeLeadingSlash(name);
+    return schemaConfig.getTemporaryTableName(tableName);
+  }
+  return null;
+}
+
+private boolean isTemporaryWorkspace() {
```
Review Comment: I think it is unlikely that it would be reused.

exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/MetadataProvider.java:
```
@@ -607,6 +608,16 @@ public String getQueryUserName() {
   @Override
   public UserCredentials getQueryUserCredentials() { return session.getCredentials(); }
+
+  @Override
+  public String getTemporaryTableName(String table) {
```
Review Comment: I agree it is not good to have such interfaces with unsupported methods. Ideally, we should split them into several interfaces instead and use broader ones in places where it is required.

exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/conversion/DrillCalciteCatalogReader.java:
```
@@ -135,14 +112,15 @@ public Prepare.PreparingTable getTable(List<String> names) {
 }

 private void checkTemporaryTable(List<String> names) {
-  if (allowTemporaryTables) {
+  if (allowTemporaryTables || !needsTemporaryTableCheck(names, session.getDefaultSchemaPath(), drillConfig)) {
     return;
   }
-  String originalTableName = session.getOriginalTableNameFromTemporaryTable(names.get(names.size() - 1));
+  String tableName = names.get(names.size() - 1);
+  String originalTableName = session.resolveTemporaryTableName(tableName);
   if (originalTableName != null) {
     throw UserException
       .validationError()
-      .message("Temporary tables usage is disallowed. Used temporary table name: [%s].", originalTableName)
+      .message("Temporary tables usage is disallowed. Used temporary table name: [%s].", tableName)
```
Review Comment: Thanks, replaced it.

exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/MetadataProvider.java:
```
@@ -607,6 +608,16 @@ public String getQueryUserName() {
   @Override
   public UserCredentials getQueryUserCredentials() { return session.getCredentials(); }
+
+  @Override
+  public String getTemporaryTableName(String table) {
+    return session.resolveTemporaryTableName(table);
+  }
+
+  @Override
+  public String getTemporaryWorkspace() {
+    return config.getString(ExecConstants.DEFAULT_TEMPORARY_WORKSPACE);
```
Review Comment: Yes, config is the only source for this property. But I think it is better to have an interface that provides only information related to schema config info rather than allow callers to access config by themselves. The current approach helps to encapsulate it, so I would prefer to leave it as it is.
[GitHub] [drill] Leon-WTF commented on a diff in pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator
Leon-WTF commented on code in PR #2599: URL: https://github.com/apache/drill/pull/2599#discussion_r1070523002

exec/java-exec/src/test/java/org/apache/drill/TestSetOp.java:
```
@@ -0,0 +1,1093 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill;
+
+import org.apache.drill.exec.planner.physical.PlannerSettings;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchemaBuilder;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.drill.categories.OperatorTest;
+import org.apache.drill.categories.SqlTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.io.BufferedWriter;
+import java.io.File;
+import java.io.FileWriter;
+import java.nio.file.Paths;
+import java.util.List;
+
+@Category({SqlTest.class, OperatorTest.class})
+public class TestSetOp extends ClusterTest {
```
Review Comment: The first empty batch will be skipped in `sniffNonEmptyBatch`.
[GitHub] [drill] cgivre opened a new pull request, #2739: DRILL-8387: Add Support for User Translation to ElasticSearch Plugin
cgivre opened a new pull request, #2739: URL: https://github.com/apache/drill/pull/2739

# [DRILL-8387](https://issues.apache.org/jira/browse/DRILL-8387): Add Support for User Translation to ElasticSearch Plugin

## Description
This PR adds support for user translation to the ElasticSearch plugin.

## Documentation
Updated README.

## Testing
Working on unit tests.
[GitHub] [drill] cgivre merged pull request #2738: DRILL-8386: Add Support for User Translation for Cassandra
cgivre merged PR #2738: URL: https://github.com/apache/drill/pull/2738
[GitHub] [drill] cgivre commented on pull request #2738: DRILL-8386: Add Support for User Translation for Cassandra
cgivre commented on PR #2738: URL: https://github.com/apache/drill/pull/2738#issuecomment-1380565243

@jnturton Thanks for the review!
[GitHub] [drill] cgivre commented on a diff in pull request #2738: DRILL-8386: Add Support for User Translation for Cassandra
cgivre commented on code in PR #2738: URL: https://github.com/apache/drill/pull/2738#discussion_r1068265466

contrib/storage-splunk/README.md:
```
@@ -42,6 +42,10 @@ Sometimes Splunk has issue in connection to it:
 https://github.com/splunk/splunk-sdk-java/issues/62
 To bypass it by Drill please specify "reconnectRetries": 3. It allows you to retry the connection several times.

+### User Translation
+The Splunk plugin supports user translation. Simply set the `authMode` parameter to `USER_TRANSLATION` and use either the plain or vault credential provider for credentials.
```
Review Comment: That's probably a good idea. I'd give it a +1.
[GitHub] [drill] jnturton commented on a diff in pull request #2738: DRILL-8386: Add Support for User Translation for Cassandra
jnturton commented on code in PR #2738: URL: https://github.com/apache/drill/pull/2738#discussion_r1068257539

contrib/storage-cassandra/src/test/java/org/apache/drill/exec/store/cassandra/CassandraUserTranslationTest.java:
```
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.cassandra;
+
+import org.apache.drill.categories.SlowTest;
+import org.apache.drill.common.config.DrillProperties;
+import org.apache.drill.common.exceptions.UserRemoteException;
+import org.apache.drill.exec.physical.rowSet.RowSet;
+import org.apache.drill.test.ClientFixture;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import static org.apache.drill.exec.rpc.user.security.testing.UserAuthenticatorTestImpl.ADMIN_USER;
+import static org.apache.drill.exec.rpc.user.security.testing.UserAuthenticatorTestImpl.ADMIN_USER_PASSWORD;
+import static org.apache.drill.exec.rpc.user.security.testing.UserAuthenticatorTestImpl.TEST_USER_1;
+import static org.apache.drill.exec.rpc.user.security.testing.UserAuthenticatorTestImpl.TEST_USER_1_PASSWORD;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+import static org.junit.jupiter.api.Assertions.fail;
+
+@Category({SlowTest.class})
+public class CassandraUserTranslationTest extends BaseCassandraTest {
+  @Test
+  public void testInfoSchemaQueryWithMissingCredentials() throws Exception {
+    // This test validates that the correct credentials are sent down to Cassandra.
+    // This user should not see the ut_cassandra because they do not have valid credentials.
+    ClientFixture client = cluster
+        .clientBuilder()
+        .property(DrillProperties.USER, ADMIN_USER)
+        .property(DrillProperties.PASSWORD, ADMIN_USER_PASSWORD)
+        .build();
+
+    String sql = "SHOW DATABASES WHERE schema_name LIKE '%cassandra%'";
+
+    RowSet results = client.queryBuilder().sql(sql).rowSet();
+    assertEquals(1, results.rowCount());
+    results.clear();
+  }
+
+  @Test
+  public void testInfoSchemaQueryWithValidCredentials() throws Exception {
+    // This test validates that the cassandra connection with user translation appears whne the user is
```
Review Comment:
```suggestion
    // This test validates that the cassandra connection with user translation appears when the user is
```

contrib/storage-splunk/README.md:
```
@@ -42,6 +42,10 @@ Sometimes Splunk has issue in connection to it:
 https://github.com/splunk/splunk-sdk-java/issues/62
 To bypass it by Drill please specify "reconnectRetries": 3. It allows you to retry the connection several times.

+### User Translation
+The Splunk plugin supports user translation. Simply set the `authMode` parameter to `USER_TRANSLATION` and use either the plain or vault credential provider for credentials.
```
Review Comment: Do you think I should make authMode values case-insensitive before 1.21 is released? USER_TRANSLATION looks odd compared to other Drill config values and is only in caps due to the coincidence that the Java Enum names are in caps.
[GitHub] [drill] jnturton commented on a diff in pull request #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias
jnturton commented on code in PR #2733: URL: https://github.com/apache/drill/pull/2733#discussion_r1068252216

exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/MetadataProvider.java:
```
@@ -607,6 +608,16 @@ public String getQueryUserName() {
   @Override
   public UserCredentials getQueryUserCredentials() { return session.getCredentials(); }
+
+  @Override
+  public String getTemporaryTableName(String table) {
```
Review Comment: This looks like another case where we wouldn't need to keep expanding interfaces like SchemaConfigInfoProvider and adding partial implementations where some throw UnsupportedOperationExceptions if we just had a good way of accessing the UserSession from most layers of Drill. It's not something for this PR for sure, but I wanted to remark to get your opinion, since I remember having to work the same way when I was trying to expose UserCredentials (visible above) for user translation in plugins.

exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/MetadataProvider.java:
```
@@ -607,6 +608,16 @@ public String getQueryUserName() {
   @Override
   public UserCredentials getQueryUserCredentials() { return session.getCredentials(); }
+
+  @Override
+  public String getTemporaryTableName(String table) {
+    return session.resolveTemporaryTableName(table);
+  }
+
+  @Override
+  public String getTemporaryWorkspace() {
+    return config.getString(ExecConstants.DEFAULT_TEMPORARY_WORKSPACE);
```
Review Comment: Have I got it right that this config option value is the only value returned by implementations of getTemporaryWorkspace? If so, do we need this method or could its callers look up the config value themselves instead?

exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/conversion/DrillCalciteCatalogReader.java:
```
@@ -135,14 +112,15 @@ public Prepare.PreparingTable getTable(List<String> names) {
 }

 private void checkTemporaryTable(List<String> names) {
-  if (allowTemporaryTables) {
+  if (allowTemporaryTables || !needsTemporaryTableCheck(names, session.getDefaultSchemaPath(), drillConfig)) {
     return;
   }
-  String originalTableName = session.getOriginalTableNameFromTemporaryTable(names.get(names.size() - 1));
+  String tableName = names.get(names.size() - 1);
+  String originalTableName = session.resolveTemporaryTableName(tableName);
   if (originalTableName != null) {
     throw UserException
       .validationError()
-      .message("Temporary tables usage is disallowed. Used temporary table name: [%s].", originalTableName)
+      .message("Temporary tables usage is disallowed. Used temporary table name: [%s].", tableName)
```
Review Comment:
```suggestion
      .message("A reference to temporary table [%s] was made in a context where temporary table references are not allowed.", tableName)
```
[GitHub] [drill] jnturton commented on a diff in pull request #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias
jnturton commented on code in PR #2733: URL: https://github.com/apache/drill/pull/2733#discussion_r1067718395

exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/WorkspaceSchemaFactory.java:
```
@@ -403,8 +404,24 @@ private View getView(DotDrillFile f) throws IOException {
   return f.getView(mapper);
 }

+private String getTemporaryName(String name) {
+  if (isTemporaryWorkspace()) {
+    String tableName = DrillStringUtils.removeLeadingSlash(name);
+    return schemaConfig.getTemporaryTableName(tableName);
+  }
+  return null;
+}
+
+private boolean isTemporaryWorkspace() {
```
Review Comment: Could this utility method move to SchemaConfig or SchemaUtilities so that it's available for reuse elsewhere, or is that unlikely?
[GitHub] [drill] cgivre opened a new pull request, #2738: DRILL-8386: Add Support for User Translation for Cassandra
cgivre opened a new pull request, #2738: URL: https://github.com/apache/drill/pull/2738

# [DRILL-8386](https://issues.apache.org/jira/browse/DRILL-8386): Add Support for User Translation for Cassandra

## Description
Adds support for user translation for Apache Cassandra.

## Documentation
Updated README.

## Testing
Added unit tests.
[GitHub] [drill] LYCJeff commented on issue #2735: Use some configuration items to specify the parameters as filters that allow them to be passed to headers and post body through SQL dynamically
LYCJeff commented on issue #2735: URL: https://github.com/apache/drill/issues/2735#issuecomment-1378369362

> @LYCJeff Drill already does this. Take a look at the docs (https://github.com/apache/drill/tree/master/contrib/storage-http#method) for the `postBodyLocation` parameter.
>
> I actually like your design better however.

What about headers? Some APIs require a digital signature in the headers to be generated at access time.
[GitHub] [drill] cgivre merged pull request #2729: DRILL-8376: Add Distribution UDFs
cgivre merged PR #2729: URL: https://github.com/apache/drill/pull/2729
[GitHub] [drill] cgivre commented on a diff in pull request #2729: DRILL-8376: Add Distribution UDFs
cgivre commented on code in PR #2729: URL: https://github.com/apache/drill/pull/2729#discussion_r1065940539

contrib/udfs/src/main/java/org/apache/drill/exec/udfs/DistributionFunctions.java:
```
@@ -51,31 +51,29 @@ public static class WidthBucketFunction implements DrillSimpleFunc {
   @Workspace
   double binWidth;

+  @Workspace
+  int bucketCount;
+
   @Output
   IntHolder bucket;

   @Override
   public void setup() {
     double max = MaxRangeValueHolder.value;
     double min = MinRangeValueHolder.value;
-    int bucketCount = bucketCountHolder.value;
+    bucketCount = bucketCountHolder.value;
     binWidth = (max - min) / bucketCount;
   }

   @Override
   public void eval() {
-    // There is probably a more elegant way of doing this...
-    double binFloor = MinRangeValueHolder.value;
-    double binCeiling = binFloor + binWidth;
-
-    for (int i = 1; i <= bucketCountHolder.value; i++) {
-      if (inputValue.value <= binCeiling && inputValue.value > binFloor) {
-        bucket.value = i;
-        break;
-      } else {
-        binFloor = binCeiling;
-        binCeiling = binWidth * (i + 1);
-      }
-    }
+    if (inputValue.value < MinRangeValueHolder.value) {
+      bucket.value = 0;
+    } else if (inputValue.value > MaxRangeValueHolder.value) {
+      bucket.value = bucketCount + 1;
+    } else {
+      double f = (1 + (inputValue.value - MinRangeValueHolder.value) / binWidth);
```
Review Comment: Oops... That was a test variable. Removed.
[GitHub] [drill] cgivre opened a new pull request, #2737: DRILL-8384: Add Format Plugin for Microsoft Access
cgivre opened a new pull request, #2737: URL: https://github.com/apache/drill/pull/2737

# [DRILL-8384](https://issues.apache.org/jira/browse/DRILL-8384): Add Format Plugin for Microsoft Access

## Description
Added format plugin to enable Drill to read MS Access files.

## Documentation
See README.

## Testing
Added unit tests.
[GitHub] [drill] cgivre commented on issue #2735: Use some configuration items to specify the parameters as filters that allow them to be passed to headers and post body through SQL dynamically
cgivre commented on issue #2735: URL: https://github.com/apache/drill/issues/2735#issuecomment-1377178966

@LYCJeff Drill already does this. Take a look at the docs (https://github.com/apache/drill/tree/master/contrib/storage-http#method) for the `postBodyLocation` parameter.

I actually like your design better, however.
[GitHub] [drill] LYCJeff closed issue #2736: Use some configuration items to specify the parameters as filters that allow them to be passed to headers and post body through SQL dynamically
LYCJeff closed issue #2736: Use some configuration items to specify the parameters as filters that allow them to be passed to headers and post body through SQL dynamically URL: https://github.com/apache/drill/issues/2736
[GitHub] [drill] LYCJeff opened a new issue, #2736: Use some configuration items to specify the parameters as filters that allow them to be passed to headers and post body through SQL dynamically
LYCJeff opened a new issue, #2736: URL: https://github.com/apache/drill/issues/2736

Some APIs require information to be sent as headers or a POST body dynamically, so I'm wondering if we can pass it in through a filter statement. Perhaps we could design it like the `params` field in the `connections` parameter. For example:

```json
{
  "url": "https://api.sunrise-sunset.org/json",
  "requireTail": false,
  "bodyParams": ["lat", "lng", "date"]
}
```

SQL query:

```sql
SELECT * FROM api.sunrise
WHERE `body.lat` = 36.7201600 AND `body.lng` = -4.4203400 AND `body.date` = '2019-10-02';
```

Then, the POST body would be:

```json
{
  "lat": 36.7201600,
  "lng": -4.4203400,
  "date": "2019-10-02"
}
```
[GitHub] [drill] LYCJeff opened a new issue, #2735: Use some configuration items to specify the parameters as filters that allow them to be passed to headers and post body through SQL dynamically
LYCJeff opened a new issue, #2735: URL: https://github.com/apache/drill/issues/2735

Some APIs require information to be sent as headers or a POST body dynamically, so I'm wondering if we can pass it in through a filter statement. Perhaps we could design it like the `params` field in the `connections` parameter. For example:

```json
{
  "url": "https://api.sunrise-sunset.org/json",
  "requireTail": false,
  "bodyParams": ["lat", "lng", "date"]
}
```

SQL query:

```sql
SELECT * FROM api.sunrise
WHERE `body.lat` = 36.7201600 AND `body.lng` = -4.4203400 AND `body.date` = '2019-10-02';
```

Then, the POST body would be:

```json
{
  "lat": 36.7201600,
  "lng": -4.4203400,
  "date": "2019-10-02"
}
```
[GitHub] [drill] jnturton commented on a diff in pull request #2729: DRILL-8376: Add Distribution UDFs
jnturton commented on code in PR #2729: URL: https://github.com/apache/drill/pull/2729#discussion_r1065424637

contrib/udfs/src/main/java/org/apache/drill/exec/udfs/DistributionFunctions.java:
```
@@ -51,31 +51,29 @@ public static class WidthBucketFunction implements DrillSimpleFunc {
   @Workspace
   double binWidth;

+  @Workspace
+  int bucketCount;
+
   @Output
   IntHolder bucket;

   @Override
   public void setup() {
     double max = MaxRangeValueHolder.value;
     double min = MinRangeValueHolder.value;
-    int bucketCount = bucketCountHolder.value;
+    bucketCount = bucketCountHolder.value;
     binWidth = (max - min) / bucketCount;
   }

   @Override
   public void eval() {
-    // There is probably a more elegant way of doing this...
-    double binFloor = MinRangeValueHolder.value;
-    double binCeiling = binFloor + binWidth;
-
-    for (int i = 1; i <= bucketCountHolder.value; i++) {
-      if (inputValue.value <= binCeiling && inputValue.value > binFloor) {
-        bucket.value = i;
-        break;
-      } else {
-        binFloor = binCeiling;
-        binCeiling = binWidth * (i + 1);
-      }
-    }
+    if (inputValue.value < MinRangeValueHolder.value) {
+      bucket.value = 0;
+    } else if (inputValue.value > MaxRangeValueHolder.value) {
+      bucket.value = bucketCount + 1;
+    } else {
+      double f = (1 + (inputValue.value - MinRangeValueHolder.value) / binWidth);
```
Review Comment: It looks like `f` is recomputed rather than used in what follows.
[GitHub] [drill] cgivre merged pull request #2734: DRILL-8381: Add support for filtered aggregate calls
cgivre merged PR #2734: URL: https://github.com/apache/drill/pull/2734
[GitHub] [drill] cgivre commented on pull request #2729: DRILL-8376: Add Distribution UDFs
cgivre commented on PR #2729: URL: https://github.com/apache/drill/pull/2729#issuecomment-1375739889

@jnturton Thanks for the review. I believe I've addressed your comments.
[GitHub] [drill] cgivre commented on a diff in pull request #2729: DRILL-8376: Add Distribution UDFs
cgivre commented on code in PR #2729: URL: https://github.com/apache/drill/pull/2729#discussion_r1064725653

contrib/udfs/src/main/java/org/apache/drill/exec/udfs/DistributionFunctions.java:
```
@@ -0,0 +1,335 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.exec.expr.DrillAggFunc;
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.FunctionScope;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.annotations.Workspace;
+import org.apache.drill.exec.expr.holders.Float8Holder;
+import org.apache.drill.exec.expr.holders.IntHolder;
+
+public class DistributionFunctions {
+
+  @FunctionTemplate(names = {"width_bucket", "widthBucket"},
+      scope = FunctionScope.SIMPLE,
+      nulls = NullHandling.NULL_IF_NULL)
+  public static class WidthBucketFunction implements DrillSimpleFunc {
+
+    @Param
+    Float8Holder inputValue;
+
+    @Param
+    Float8Holder MinRangeValueHolder;
+
+    @Param
+    Float8Holder MaxRangeValueHolder;
+
+    @Param
+    IntHolder bucketCountHolder;
+
+    @Workspace
+    double binWidth;
+
+    @Output
+    IntHolder bucket;
+
+    @Override
+    public void setup() {
+      double max = MaxRangeValueHolder.value;
+      double min = MinRangeValueHolder.value;
+      int bucketCount = bucketCountHolder.value;
+      binWidth = (max - min) / bucketCount;
+    }
+
+    @Override
+    public void eval() {
+      // There is probably a more elegant way of doing this...
+      double binFloor = MinRangeValueHolder.value;
+      double binCeiling = binFloor + binWidth;
+
+      for (int i = 1; i <= bucketCountHolder.value; i++) {
+        if (inputValue.value <= binCeiling && inputValue.value > binFloor) {
+          bucket.value = i;
+          break;
+        } else {
+          binFloor = binCeiling;
+          binCeiling = binWidth * (i + 1);
+        }
+      }
```
Review Comment: @jnturton I looked at the docs for PostgreSQL (which is what this function is modeled on) and saw that in PostgreSQL, if the input is below the range it goes into bucket `0`, and if it is above the range it goes into bucket `n+1`.
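To make the boundary behavior concrete, a few hedged examples of the semantics described above (five buckets over the range 0 to 10, so each bin is 2 wide):

```sql
SELECT width_bucket(-1.0, 0.0, 10.0, 5); -- 0: below the range
SELECT width_bucket(3.0, 0.0, 10.0, 5);  -- 2: falls in the second bin
SELECT width_bucket(11.0, 0.0, 10.0, 5); -- 6: above the range, i.e. n + 1
```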
[GitHub] [drill] cgivre commented on a diff in pull request #2729: DRILL-8376: Add Distribution UDFs
cgivre commented on code in PR #2729: URL: https://github.com/apache/drill/pull/2729#discussion_r1064724101

contrib/udfs/src/main/java/org/apache/drill/exec/udfs/DistributionFunctions.java:
```
@@ -0,0 +1,335 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.exec.expr.DrillAggFunc;
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.FunctionScope;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.annotations.Workspace;
+import org.apache.drill.exec.expr.holders.Float8Holder;
+import org.apache.drill.exec.expr.holders.IntHolder;
+
+public class DistributionFunctions {
+
+  @FunctionTemplate(names = {"width_bucket", "widthBucket"},
+      scope = FunctionScope.SIMPLE,
+      nulls = NullHandling.NULL_IF_NULL)
+  public static class WidthBucketFunction implements DrillSimpleFunc {
+
+    @Param
+    Float8Holder inputValue;
+
+    @Param
+    Float8Holder MinRangeValueHolder;
+
+    @Param
+    Float8Holder MaxRangeValueHolder;
+
+    @Param
+    IntHolder bucketCountHolder;
+
+    @Workspace
+    double binWidth;
+
+    @Output
+    IntHolder bucket;
+
+    @Override
+    public void setup() {
+      double max = MaxRangeValueHolder.value;
+      double min = MinRangeValueHolder.value;
+      int bucketCount = bucketCountHolder.value;
+      binWidth = (max - min) / bucketCount;
+    }
+
+    @Override
+    public void eval() {
+      // There is probably a more elegant way of doing this...
+      double binFloor = MinRangeValueHolder.value;
+      double binCeiling = binFloor + binWidth;
+
+      for (int i = 1; i <= bucketCountHolder.value; i++) {
+        if (inputValue.value <= binCeiling && inputValue.value > binFloor) {
+          bucket.value = i;
+          break;
+        } else {
+          binFloor = binCeiling;
+          binCeiling = binWidth * (i + 1);
+        }
+      }
+    }
+  }
+
+  @FunctionTemplate(
+      names = {"kendall_correlation", "kendallCorrelation", "kendallTau", "kendall_tau"},
+      scope = FunctionScope.POINT_AGGREGATE,
+      nulls = NullHandling.INTERNAL
+  )
+  public static class KendallTauFunction implements DrillAggFunc {
+    @Param
+    Float8Holder xInput;
+
+    @Param
+    Float8Holder yInput;
+
+    @Workspace
+    Float8Holder prevXValue;
+
+    @Workspace
+    Float8Holder prevYValue;
+
+    @Workspace
+    IntHolder concordantPairs;
+
+    @Workspace
+    IntHolder discordantPairs;
+
+    @Workspace
+    IntHolder n;
+
+    @Output
+    Float8Holder tau;
+
+    @Override
+    public void add() {
+      double xValue = xInput.value;
+      double yValue = yInput.value;
+
+      if (n.value > 0) {
+        if ((xValue > prevXValue.value && yValue > prevYValue.value) || (xValue < prevXValue.value && yValue < prevYValue.value)) {
+          concordantPairs.value = concordantPairs.value + 1;
+        } else if ((xValue > prevXValue.value && yValue < prevYValue.value) || (xValue < prevXValue.value && yValue > prevYValue.value)) {
+          discordantPairs.value = discordantPairs.value + 1;
+        } else {
+          // Tie...
+        }
+
+        prevXValue.value = xInput.value;
+        prevYValue.value = yInput.value;
+        n.value = n.value + 1;
```
Review Comment: Fixed.
[GitHub] [drill] vvysotskyi opened a new pull request, #2734: DRILL-8381: Add support for filtered aggregate calls
vvysotskyi opened a new pull request, #2734: URL: https://github.com/apache/drill/pull/2734

# [DRILL-8381](https://issues.apache.org/jira/browse/DRILL-8381): Add support for filtered aggregate calls

## Description
For the case when a filtering expression is specified, Drill will generate an `if` expression so that the field value is passed to the aggregate function only when the filter predicate is true. The filter expression specified within an aggregate function is present in the underlying project, so it is enough to take a reference to it to use it as the condition.

## Documentation
NA

## Testing
Added UT.
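For reference, a filtered aggregate call attaches a standard SQL `FILTER` clause to an individual aggregate function. A hedged example of the syntax this enables (table and column names are hypothetical):

```sql
SELECT region,
       COUNT(*) AS all_orders,
       SUM(amount) FILTER (WHERE status = 'PAID') AS paid_total
FROM orders
GROUP BY region;
```

Under the approach described above, `amount` would be wrapped in an `if` over a reference to the `status = 'PAID'` expression taken from the underlying project.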
[GitHub] [drill] vvysotskyi opened a new pull request, #2733: DRILL-8380: Remove customised SqlValidatorImpl.deriveAlias
vvysotskyi opened a new pull request, #2733: URL: https://github.com/apache/drill/pull/2733

# [DRILL-8380](https://issues.apache.org/jira/browse/DRILL-8380): Remove customised SqlValidatorImpl.deriveAlias

## Description
As pointed out in CALCITE-5463, `SqlValidatorImpl.deriveAlias` isn't intended to be customized, and it is not used in the latest version. It causes issues with table and storage aliases and with the temporary tables functionality. Unfortunately, there is no way to preserve the existing temporary table behavior: after these changes, a temporary table will be accessible only within its workspace, like regular tables.

## Documentation
Yes.

## Testing
All UT pass.
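A sketch of the user-visible effect (this assumes the default temporary workspace `dfs.tmp`; the table name is hypothetical):

```sql
CREATE TEMPORARY TABLE sales_tmp AS SELECT 1 AS amount;

-- Previously the bare name resolved regardless of the session's default schema:
SELECT * FROM sales_tmp;

-- After this change the table is addressed through its workspace, like a regular table:
SELECT * FROM dfs.tmp.sales_tmp;
```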
[GitHub] [drill] jnturton commented on issue #2732: Any chance of INFORMATION SCHEMA updates like storing it in any database instead of in-memory data
jnturton commented on issue #2732: URL: https://github.com/apache/drill/issues/2732#issuecomment-1373448803

Using as specific a WHERE clause as possible in your information schema query will usually help.
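For example, a query that pins down the schema up front spares Drill from enumerating every storage plugin and workspace (the names below are hypothetical):

```sql
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.`TABLES`
WHERE TABLE_SCHEMA = 'hive.default' AND TABLE_NAME LIKE 'sales%';
```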
[GitHub] [drill] porika-v commented on issue #2732: Any chance of INFORMATION SCHEMA updates like storing it in any database instead of in-memory data
porika-v commented on issue #2732: URL: https://github.com/apache/drill/issues/2732#issuecomment-1373140572 I have connected to the HIVE Metastore, but my application depends mainly on metadata, so we use Drill's **INFORMATION SCHEMA** since it is a virtual datastore and we need the metadata. Queries are too slow on **INFORMATION SCHEMA**. Any suggestions would help me greatly. Thank you. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] jnturton commented on a diff in pull request #2729: DRILL-8376: Add Distribution UDFs
jnturton commented on code in PR #2729: URL: https://github.com/apache/drill/pull/2729#discussion_r1062569553

## contrib/udfs/src/main/java/org/apache/drill/exec/udfs/DistributionFunctions.java:
## @@ -0,0 +1,335 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.exec.expr.DrillAggFunc;
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.FunctionScope;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.annotations.Workspace;
+import org.apache.drill.exec.expr.holders.Float8Holder;
+import org.apache.drill.exec.expr.holders.IntHolder;
+
+public class DistributionFunctions {
+
+  @FunctionTemplate(names = {"width_bucket", "widthBucket"},
+    scope = FunctionScope.SIMPLE,
+    nulls = NullHandling.NULL_IF_NULL)
+  public static class WidthBucketFunction implements DrillSimpleFunc {
+
+    @Param
+    Float8Holder inputValue;
+
+    @Param
+    Float8Holder MinRangeValueHolder;
+
+    @Param
+    Float8Holder MaxRangeValueHolder;
+
+    @Param
+    IntHolder bucketCountHolder;
+
+    @Workspace
+    double binWidth;
+
+    @Output
+    IntHolder bucket;
+
+    @Override
+    public void setup() {
+      double max = MaxRangeValueHolder.value;
+      double min = MinRangeValueHolder.value;
+      int bucketCount = bucketCountHolder.value;
+      binWidth = (max - min) / bucketCount;
+    }
+
+    @Override
+    public void eval() {
+      // There is probably a more elegant way of doing this...
+      double binFloor = MinRangeValueHolder.value;
+      double binCeiling = binFloor + binWidth;
+
+      for (int i = 1; i <= bucketCountHolder.value; i++) {
+        if (inputValue.value <= binCeiling && inputValue.value > binFloor) {
+          bucket.value = i;
+          break;
+        } else {
+          binFloor = binCeiling;
+          binCeiling = binWidth * (i + 1);
+        }
+      }
+    }
+  }
+
+  @FunctionTemplate(
+    names = {"kendall_correlation","kendallCorrelation", "kendallTau", "kendall_tau"},
+    scope = FunctionScope.POINT_AGGREGATE,
+    nulls = NullHandling.INTERNAL
+  )
+  public static class KendallTauFunction implements DrillAggFunc {
+    @Param
+    Float8Holder xInput;
+
+    @Param
+    Float8Holder yInput;
+
+    @Workspace
+    Float8Holder prevXValue;
+
+    @Workspace
+    Float8Holder prevYValue;
+
+    @Workspace
+    IntHolder concordantPairs;
+
+    @Workspace
+    IntHolder discordantPairs;
+
+    @Workspace
+    IntHolder n;
+
+    @Output
+    Float8Holder tau;
+
+    @Override
+    public void add() {
+      double xValue = xInput.value;
+      double yValue = yInput.value;
+
+      if (n.value > 0) {
+        if ((xValue > prevXValue.value && yValue > prevYValue.value) || (xValue < prevXValue.value && yValue < prevYValue.value)) {
+          concordantPairs.value = concordantPairs.value + 1;
+        } else if ((xValue > prevXValue.value && yValue < prevYValue.value) || (xValue < prevXValue.value && yValue > prevYValue.value)) {
+          discordantPairs.value = discordantPairs.value + 1;
+        } else {
+          //Tie...
+        }
+
+        prevXValue.value = xInput.value;
+        prevYValue.value = yInput.value;
+        n.value = n.value + 1;

Review Comment:
   Given that xValue = xInput.value and yValue = yInput.value, I think this code is common to both branches of the parent if statement.

## contrib/udfs/src/main/java/org/apache/drill/exec/udfs/DistributionFunctions.java:
## @@ -0,0 +1,335 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You ma
[GitHub] [drill] jnturton commented on issue #2732: Any chance of INFORMATION SCHEMA updates like storing it in any database instead of in-memory data
jnturton commented on issue #2732: URL: https://github.com/apache/drill/issues/2732#issuecomment-1372171275 Have you looked at the Hive metastore? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] porika-v commented on issue #2732: Any chance of INFORMATION SCHEMA updates like storing it in any database instead of in-memory data
porika-v commented on issue #2732: URL: https://github.com/apache/drill/issues/2732#issuecomment-1372024894 This works only with Parquet data; I can't use it with **HIVE**. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] jnturton commented on issue #2732: Any chance of INFORMATION SCHEMA updates like storing it in any database instead of in-memory data
jnturton commented on issue #2732: URL: https://github.com/apache/drill/issues/2732#issuecomment-1371793916 Have you looked at the Drill metastore? https://drill.apache.org/docs/using-drill-metastore/ https://drill.apache.org/docs/rdbms-metastore/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] porika-v opened a new issue, #2732: Any chance of INFORMATION SCHEMA updates like storing it in any database instead of in-memory data
porika-v opened a new issue, #2732: URL: https://github.com/apache/drill/issues/2732 **Is your feature request related to a problem? Please describe.** A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] **Describe the solution you'd like** A clear and concise description of what you want to happen. **Describe alternatives you've considered** A clear and concise description of any alternative solutions or features you've considered. **Additional context** Add any other context or screenshots about the feature request here. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] cgivre commented on pull request #2731: DRILL-5033: Query on JSON That Has Null as Value For Each Key
cgivre commented on PR #2731: URL: https://github.com/apache/drill/pull/2731#issuecomment-1367977377 > Thanks @cgivre for the clarification, but suppose the assumption that considering nulls as strings would solve the issue, were the changes i made (over the class JSONReader.java) adequate (should the methods be changed as i did)? i see that some tests didn't pass. For us to merge a pull request, all the unit tests have to pass. (Or be modified with an explanation of why they are being modified) Drill is a very complex beast with a lot of dependencies so even small changes can break things you didn't intend to. Believe me... I know from experience ;-) One other thing to note is that there is another option `drill.exec.functions.cast_empty_string_to_null`. Setting this to true forces empty strings to be treated as `null`. This can have some unintended side effects, but might also help you out. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
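For reference, session-level options like the one mentioned above are toggled with ALTER SESSION; a minimal sketch:

```sql
-- Treat empty strings as NULL in casts for the current session only.
ALTER SESSION SET `drill.exec.functions.cast_empty_string_to_null` = true;
```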
[GitHub] [drill] unical1988 commented on pull request #2731: DRILL-5033: Query on JSON That Has Null as Value For Each Key
unical1988 commented on PR #2731: URL: https://github.com/apache/drill/pull/2731#issuecomment-1367969342 Thanks @cgivre for the clarification, but supposing that treating nulls as strings would solve the issue, were the changes I made (to the class JSONReader.java) adequate (should the methods be changed as I did)? I see that some tests didn't pass. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] cgivre commented on pull request #2731: DRILL-5033: Query on JSON That Has Null as Value For Each Key
cgivre commented on PR #2731: URL: https://github.com/apache/drill/pull/2731#issuecomment-1367677557 @unical1988 You actually don't have to modify the code to get this data to read properly. As I mentioned on the user group, the easiest way would probably be to provide a schema. The good news is that you can do this at query time. Take a look here: https://drill.apache.org/docs/plugin-configuration-basics/#specifying-the-schema-as-table-function-parameter An example query might be:

```sql
select * from table(dfs.tmp.`file.json`(
  schema => 'inline=(col0 varchar, col1 date properties {`drill.format` = `yyyy-MM-dd`})
             properties {`drill.strict` = `false`}'))
```

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] unical1988 commented on pull request #2731: DRILL-5033: Query on JSON That Has Null as Value For Each Key
unical1988 commented on PR #2731: URL: https://github.com/apache/drill/pull/2731#issuecomment-1367674759 @vvysotskyi My attempt to deal with this bug is just a quick workaround, since the solution as stated by @cgivre might simply be to set the schema of the queried dataset from the start (which requires non-trivial updates to the code). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] cgivre commented on pull request #2731: DRILL-5033: Query on JSON That Has Null as Value For Each Key
cgivre commented on PR #2731: URL: https://github.com/apache/drill/pull/2731#issuecomment-1367526640 I copied the JIRA into the PR description. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] unical1988 opened a new pull request, #2731: DRILL-8033
unical1988 opened a new pull request, #2731: URL: https://github.com/apache/drill/pull/2731 # [DRILL-8033](https://issues.apache.org/jira/browse/DRILL-8033): PR Title (Please replace `PR Title` with actual PR Title) ## Description (Please describe the change. If more than one ticket is fixed, include a reference to those tickets.) ## Documentation (Please describe user-visible changes similar to what should appear in the Drill documentation.) ## Testing (Please describe how this PR has been tested.) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] cgivre merged pull request #2730: DRILL-8378: Support doing Maven releases using modern JDKs
cgivre merged PR #2730: URL: https://github.com/apache/drill/pull/2730 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] jnturton opened a new pull request, #2730: DRILL-8378: Support doing Maven releases using modern JDKs
jnturton opened a new pull request, #2730: URL: https://github.com/apache/drill/pull/2730 # [DRILL-8378](https://issues.apache.org/jira/browse/DRILL-8378): Support doing Maven releases using modern JDKs ## Description While [DRILL-8113](https://issues.apache.org/jira/browse/DRILL-8113) enabled the building of Drill using a modern JDK, more work is required to enable a Maven release of Drill using a modern JDK. Presently, the Maven Release Plugin will fail on Javadoc generation when run with a newer JDK while it succeeds with JDK 8. The failures are due to dependencies missing from the Maven Javadoc Plugin's config, which I assume get treated with a more lenient "warn and skip" policy in the javadoc tool shipped with JDK 8 but cause errors in newer JDKs (in my case OpenJDK 17). In particular, the presence of the `sourcepath` property in the javadoc plugin's config in the root pom causes the default javadoc:javadoc goal to try to generate docs for our src/test packages. Unlike the javadoc:test-javadoc goal, the javadoc:javadoc goal does not inherit dependencies declared with `test` scope, so it fails to resolve those. ## Documentation N/A ## Testing Successfully ran maven release:prepare using OpenJDK 17. Successfully ran mvn javadoc:javadoc in the Drill root module using OpenJDK 17 with HTML output generated under each module's target/site/apidocs directory. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] weijunlu commented on issue #2723: Failed to execute an insert statement across the database
weijunlu commented on issue #2723: URL: https://github.com/apache/drill/issues/2723#issuecomment-1366577871

2022-12-28 19:03:07,401 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.a.d.e.p.s.h.DefaultSqlHandler - Drill Plan :
{
  "head" : {
    "version" : 1,
    "generator" : {
      "type" : "InsertHandler",
      "info" : ""
    },
    "type" : "APACHE_DRILL_PHYSICAL",
    "options" : [ ],
    "queue" : 0,
    "hasResourcePlan" : false,
    "scannedPluginNames" : [ "mysql", "pg" ],
    "resultMode" : "EXEC"
  },
  "graph" : [ {
    "pop" : "jdbc-scan",
    "@id" : 1,
    "sql" : "INSERT INTO `public`.`t1` (`c1`, `c2`)\r\n(SELECT *\r\nFROM `test`.`t1`)",
    "columns" : [ "`ROWCOUNT`" ],
    "config" : {
      "type" : "jdbc",
      "driver" : "com.mysql.jdbc.Driver",
      "url" : "jdbc:mysql://localhost:3316",
      "username" : "root",
      "caseInsensitiveTableNames" : true,
      "writable" : true,
      "authMode" : "SHARED_USER",
      "writerBatchSize" : 1,
      "enabled" : true
    },
    "userName" : "anonymous",
    "cost" : {
      "memoryCost" : 1.6777216E7,
      "outputRowCount" : 1.0E9
    }
  }, {
    "pop" : "screen",
    "@id" : 0,
    "child" : 1,
    "initialAllocation" : 100,
    "maxAllocation" : 100,
    "cost" : {
      "memoryCost" : 1.6777216E7,
      "outputRowCount" : 1.0E9
    }
  } ]
}
2022-12-28 19:03:07,402 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.a.d.e.p.f.SimpleParallelizer - Root fragment: handle { query_id { part1: 2041218684968999600 part2: -1153457303194072660 } major_fragment_id: 0 minor_fragment_id: 0 } leaf_fragment: true assignment { address: "DESKTOP-PHHB7LC" user_port: 31010 control_port: 31011 data_port: 31012 version: "2.0.0-SNAPSHOT" state: STARTUP } foreman { address: "DESKTOP-PHHB7LC" user_port: 31010 control_port: 31011 data_port: 31012 version: "2.0.0-SNAPSHOT" state: STARTUP } mem_initial: 100 mem_max: 100 credentials { user_name: "anonymous" } context { query_start_time: 1672225387214 time_zone: 299 default_schema_name: "" session_id: "0b3af775-337f-4db3-8ce4-52e20d5c50ee" }
2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.drill.exec.work.foreman.Foreman - PlanFragments for query part1: 2041218684968999600 part2: -1153457303194072660
2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.a.d.e.w.f.QueryStateProcessor - 1c53dd94-4277-9ab0-effe-18b1ab8989ac: State change requested PLANNING --> ENQUEUED
2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.a.d.e.w.f.QueryStateProcessor - 1c53dd94-4277-9ab0-effe-18b1ab8989ac: State change requested ENQUEUED --> STARTING
2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.a.d.exec.rpc.control.WorkEventBus - Adding fragment status listener for queryId 1c53dd94-4277-9ab0-effe-18b1ab8989ac.
2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.a.d.e.work.foreman.FragmentsRunner - Submitting fragments to run.
2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.a.d.exec.ops.FragmentContextImpl - Getting initial memory allocation of 100
2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.a.d.exec.ops.FragmentContextImpl - Fragment max allocation: 100
2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.a.d.e.work.batch.IncomingBuffers - Came up with a list of 0 required fragments. Fragments {}
2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.a.d.e.work.foreman.FragmentsRunner - Fragments running.
2022-12-28 19:03:07,403 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.a.d.e.w.f.QueryStateProcessor - 1c53dd94-4277-9ab0-effe-18b1ab8989ac: State change requested STARTING --> RUNNING
2022-12-28 19:03:07,421 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:frag:0:0] DEBUG o.a.d.e.physical.impl.BaseRootExec - BaseRootExec(60762332) operators: org.apache.drill.exec.physical.impl.protocol.OperatorRecordBatch 654876346
2022-12-28 19:03:07,421 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:frag:0:0] DEBUG o.a.d.exec.physical.impl.ImplCreator - Took 17 ms to create RecordBatch tree
2022-12-28 19:03:07,421 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 1c53dd94-4277-9ab0-effe-18b1ab8989ac:0:0: State change requested AWAITING_ALLOCATION --> RUNNING
2022-12-28 19:03:07,421 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:frag:0:0] INFO o.a.d.e.w.f.FragmentStatusReporter - 1c53dd94-4277-9ab0-effe-18b1ab8989ac:0:0: St
[GitHub] [drill] weijunlu commented on issue #2723: Failed to execute an insert statement across the database
weijunlu commented on issue #2723: URL: https://github.com/apache/drill/issues/2723#issuecomment-1366576817

2022-12-28 19:03:07,330 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.apache.calcite.plan.RelOptPlanner - Rule queue:
  rule [JdbcTableModificationRule(in:NONE,out:JDBC.mysql)(in:NONE,out:JDBC.mysql)] rels [#112]
  rule [JdbcTableModificationRule(in:NONE,out:JDBC.pg)(in:NONE,out:JDBC.pg)] rels [#112]
  rule [ExpandConversionRule] rels [#115]
  rule [JDBC_PREL_ConverterJDBC.mysql] rels [#118,#89]
  rule [JdbcTableModificationRule(in:NONE,out:JDBC.mysql)(in:LOGICAL,out:JDBC.mysql)] rels [#120]
  rule [JdbcTableModificationRule(in:NONE,out:JDBC.pg)(in:LOGICAL,out:JDBC.pg)] rels [#120]
2022-12-28 19:03:07,330 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.apache.calcite.plan.RelOptPlanner - Pop match: rule [JdbcTableModificationRule(in:NONE,out:JDBC.mysql)(in:NONE,out:JDBC.mysql)] rels [#112]
2022-12-28 19:03:07,330 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.apache.calcite.plan.RelOptPlanner - call#202: Apply rule [JdbcTableModificationRule(in:NONE,out:JDBC.mysql)(in:NONE,out:JDBC.mysql)] to [rel#112:LogicalTableModify]
2022-12-28 19:03:07,337 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.apache.calcite.plan.RelOptPlanner - Transform to: rel#121 via JdbcTableModificationRule(in:NONE,out:JDBC.mysql)(in:NONE,out:JDBC.mysql)
2022-12-28 19:03:07,338 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.apache.calcite.plan.RelOptPlanner - call#202: Full plan for rule input [rel#112:LogicalTableModify]:
  LogicalTableModify(table=[[pg, public, t1]], operation=[INSERT], flattened=[true])
    JdbcTableScan(subset=[rel#116:RelSubset#0.NONE.ANY([]).[]], table=[[mysql, test, t1]])
2022-12-28 19:03:07,338 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.apache.calcite.plan.RelOptPlanner - call#202: Rule [JdbcTableModificationRule(in:NONE,out:JDBC.mysql)(in:NONE,out:JDBC.mysql)] produced [rel#121:JdbcTableModify]
2022-12-28 19:03:07,338 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.apache.calcite.plan.RelOptPlanner - call#202: Full plan for [rel#121:JdbcTableModify]:
  JdbcTableModify(table=[[pg, public, t1]], operation=[INSERT], flattened=[true])
    JdbcTableScan(subset=[rel#109:RelSubset#0.JDBC.mysql.ANY([]).[]], table=[[mysql, test, t1]])
2022-12-28 19:03:07,338 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.apache.calcite.plan.RelOptPlanner - Subset cost changed: subset [rel#122:RelSubset#2.JDBC.mysql.ANY([]).[]] cost was {inf} now {101.0 rows, 102.0 cpu, 0.0 io, 0.0 network, 0.0 memory}
2022-12-28 19:03:07,338 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.apache.calcite.plan.RelOptPlanner - Register rel#121:JdbcTableModify.JDBC.mysql.ANY([]).[](input=RelSubset#109,table=[pg, public, t1],operation=INSERT,flattened=true) in rel#122:RelSubset#2.JDBC.mysql.ANY([]).[]
2022-12-28 19:03:07,338 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.apache.calcite.plan.RelOptPlanner - Rule-match queued: rule [VertexDrelConverterRuleJDBC.mysql(in:JDBC.mysql,out:LOGICAL)] rels [#121]
2022-12-28 19:03:07,338 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.apache.calcite.plan.RelOptPlanner - call#202 generated 1 successors: [rel#121:JdbcTableModify.JDBC.mysql.ANY([]).[](input=RelSubset#109,table=[pg, public, t1],operation=INSERT,flattened=true)]
2022-12-28 19:03:07,338 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.apache.calcite.plan.RelOptPlanner - Best cost before rule match: {1.0101E10 rows, 2.0101E8 cpu, 1.1E10 io, 0.0 network, 0.0 memory}
2022-12-28 19:03:07,339 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.apache.calcite.plan.RelOptPlanner - Root: rel#114:RelSubset#2.LOGICAL.ANY([]).[]
Original rel:
  DrillTableModify(subset=[rel#114:RelSubset#2.LOGICAL.ANY([]).[]], table=[[pg, public, t1]], operation=[INSERT], flattened=[true]): rowcount = 1.0E9, cumulative cost = {1.0E10 rows, 0.0 cpu, 1.1E10 io, 0.0 network, 0.0 memory}, id = 120
    VertexDrel(subset=[rel#119:RelSubset#0.LOGICAL.ANY([]).[]]): rowcount = 1.0E9, cumulative cost = {1.0E8 rows, 2.0E8 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 118
      JdbcTableScan(subset=[rel#109:RelSubset#0.JDBC.mysql.ANY([]).[]], table=[[mysql, test, t1]]): rowcount = 1.0E9, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 89
Sets:
Set#0, type: RecordType(INTEGER c1, INTEGER c2)
  rel#109:RelSubset#0.JDBC.mysql.ANY([]).[], best=rel#89
    rel#89:JdbcTableScan.JDBC.mysql.ANY([]).[](table=[mysql, test, t1]), rowcount=1.0E9, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}
  rel#116:RelSubset#0.NONE.ANY([]).[], best=null
  rel#119:RelSubset#0.LOGICAL.ANY([]).[], best=rel#118
    rel#118:VertexDrel.LOGICAL.ANY([]).[](input=RelSubset#109), ro
[GitHub] [drill] weijunlu commented on issue #2723: Failed to execute an insert statement across the database
weijunlu commented on issue #2723: URL: https://github.com/apache/drill/issues/2723#issuecomment-1366575973

2022-12-28 19:03:07,204 [main] DEBUG o.a.d.j.impl.DrillStatementRegistry - Adding to open-statements registry: org.apache.drill.jdbc.impl.DrillStatementImpl@71df3d2b
2022-12-28 19:03:07,204 [main] DEBUG o.a.d.j.i.DrillCursor$ResultsListener - [#2] Query listener created.
2022-12-28 19:03:07,204 [main] DEBUG o.apache.drill.jdbc.impl.DrillCursor - Setting timeout as 0
2022-12-28 19:03:07,206 [UserServer-1] DEBUG o.a.d.e.r.u.UserServerRequestHandler - Received query to run. Returning query handle.
2022-12-28 19:03:07,215 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] DEBUG o.a.d.e.w.f.QueryStateProcessor - 1c53dd94-4277-9ab0-effe-18b1ab8989ac: State change requested PREPARING --> PLANNING
2022-12-28 19:03:07,215 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] INFO o.a.drill.exec.work.foreman.Foreman - Query text for query with id 1c53dd94-4277-9ab0-effe-18b1ab8989ac issued by anonymous: insert into pg.public.t1 select c1, c2 from mysql.test.t1
2022-12-28 19:03:07,215 [Client-1] DEBUG o.a.d.j.i.DrillCursor$ResultsListener - [#2] Received query ID: 1c53dd94-4277-9ab0-effe-18b1ab8989ac.
2022-12-28 19:03:07,215 [Client-1] DEBUG o.a.d.e.rpc.user.QueryResultHandler - Received QueryId 1c53dd94-4277-9ab0-effe-18b1ab8989ac successfully. Adding results listener org.apache.drill.jdbc.impl.DrillCursor$ResultsListener@29741514.
2022-12-28 19:03:07,222 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE org.apache.calcite.sql.parser - After unconditional rewrite: INSERT INTO `pg`.`public`.`t1` (SELECT `c1`, `c2` FROM `mysql`.`test`.`t1`)
2022-12-28 19:03:07,308 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE org.apache.calcite.sql.parser - After validation: INSERT INTO `pg`.`public`.`t1` (SELECT `t1`.`c1`, `t1`.`c2` FROM `mysql`.`test`.`t1` AS `t1`)
2022-12-28 19:03:07,308 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is 'INSERT INTO'; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is 'pg'; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is '.'; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is 'public'; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is '.'; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is 't1'; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is ''; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is '('; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is 'SELECT'; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is 't1'; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is '.'; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is 'c1'; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is ''; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is ','; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is 't1'; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is '.'; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is 'c2'; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is ''; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is ''; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWriter - Token is 'FROM'; result is false
2022-12-28 19:03:07,309 [1c53dd94-4277-9ab0-effe-18b1ab8989ac:foreman] TRACE o.a.c.sql.pretty.SqlPrettyWrite
[GitHub] [drill] weijunlu commented on issue #2723: Failed to execute an insert statement across the database
weijunlu commented on issue #2723: URL: https://github.com/apache/drill/issues/2723#issuecomment-1366570466 I enabled the trace log, including the Calcite log. The log configuration is as follows: -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] cgivre commented on issue #2723: Failed to execute an insert statement across the database
cgivre commented on issue #2723: URL: https://github.com/apache/drill/issues/2723#issuecomment-136578 Can you please enable verbose logging and post the stack trace? Without that, we really can't debug this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] vvysotskyi commented on a diff in pull request #2599: DRILL-4232 Support for EXCEPT and INTERSECT set operator
vvysotskyi commented on code in PR #2599: URL: https://github.com/apache/drill/pull/2599#discussion_r1057326323

## exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashTable.java:
## @@ -98,6 +98,10 @@ void setup(HashTableConfig htConfig, BufferAllocator allocator, VectorContainer
   */
  int probeForKey(int incomingRowIdx, int hashCode) throws SchemaChangeException;

+  int getNum(int currentIndex);

Review Comment:
   Please rename this method to clarify that it holds the count of records for a specific key and add JavaDoc.

## exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggBatch.java:
## @@ -188,10 +192,13 @@ public HashAggBatch(HashAggregate popConfig, RecordBatch incoming, FragmentConte
     wasKilled = false;

     final int numGrpByExprs = popConfig.getGroupByExprs().size();
-    comparators = Lists.newArrayListWithExpectedSize(numGrpByExprs);
-    for (int i=0; i
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill;
+
+import org.apache.drill.exec.planner.physical.PlannerSettings;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchemaBuilder;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.drill.categories.OperatorTest;
+import org.apache.drill.categories.SqlTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.io.BufferedWriter;
+import java.io.File;
+import java.io.FileWriter;
+import java.nio.file.Paths;
+import java.util.List;
+
+@Category({SqlTest.class, OperatorTest.class})
+public class TestSetOp extends ClusterTest {

Review Comment:
   Could you please add more tests that check several batches? It could be done using the `UNION ALL` operator. Also, it would be interesting to see cases when the first batch of one side is empty, and so on. One more scenario is to check how it behaves with complex types. It is fine if not supported, but we should be sure that we have the correct error message and error handling. Example query shapes appear in the sketch after this message.

## exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillSetOpRel.java:
## @@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.linq4j.Ord;
+import org.apache.calcite.plan.RelOptCluster;
+import org.apache.calcite.plan.RelTraitSet;
+import org.apache.calcite.rel.InvalidRelException;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.drill.common.logical.data.LogicalOperator;
+import org.apache.drill.common.logical.data.Union;
+import org.apache.drill.exec.planner.common.DrillSetOpRelBase;
+
+import java.util.List;
+
+/**
+ * SetOp implemented in Drill.
+ */
+public class DrillSetOpRel extends DrillSetOpRelBase implements DrillRel {
+  private boolean isAggAdded;
+
+  public DrillSetOpRel(RelOptCluster cluster, RelTraitSet traits,
+      List<RelNode> inputs, SqlKind kind, boolean all, boolean checkCompatibility, boolean isAggAdded) throws InvalidRelException {
+    super(cluster, traits, inputs, kind, all, checkCompatibility);
+    this.isAggAdded = isAggAdded;
+  }
+
+  public DrillSetOpRel(RelOptCluster cluster, RelTraitSet traits,
+      List<RelNode> inputs, SqlKind kind, boolean all,
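For context, the set operators this PR adds take roughly the following shape (the file paths are hypothetical), and the reviewer's multi-batch scenario can be provoked by composing one input from a `UNION ALL`:

```sql
-- Hypothetical queries using the set operators added by DRILL-4232.
SELECT c1 FROM dfs.tmp.`left.csv`
INTERSECT
SELECT c1 FROM dfs.tmp.`right.csv`;

-- Forcing several incoming batches on one side via UNION ALL,
-- along the lines of the review suggestion above:
(SELECT c1 FROM dfs.tmp.`left.csv`
 UNION ALL
 SELECT c1 FROM dfs.tmp.`left.csv`)
EXCEPT
SELECT c1 FROM dfs.tmp.`right.csv`;
```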
[GitHub] [drill] jnturton merged pull request #2727: DRILL-8374: Set the Drill development version to 1.21.0-SNAPSHOT
jnturton merged PR #2727: URL: https://github.com/apache/drill/pull/2727 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] cgivre opened a new pull request, #2729: DRILL-8376: Add Distribution UDFs
cgivre opened a new pull request, #2729: URL: https://github.com/apache/drill/pull/2729 # [DRILL-8376](https://issues.apache.org/jira/browse/DRILL-8376): Add Distribution UDFs ## Description This PR adds several new UDFs to help with statistical analysis. The first is `width_bucket`, which mirrors the functionality of the PostgreSQL function of the same name (https://www.oreilly.com/library/view/sql-in-a/9780596155322/re91.html) and is useful for building histograms of data. This PR also adds the `kendall_correlation` and `pearson_correlation` functions, two functions for calculating correlation coefficients of two columns. ## Documentation Updated README. ## Testing Added unit tests. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
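Usage might look roughly like the following (table and column names are hypothetical); `width_bucket(value, min, max, bucket_count)` assigns each value to one of `bucket_count` equal-width bins, which makes histogram queries straightforward:

```sql
-- Hypothetical histogram built with the new width_bucket UDF.
SELECT width_bucket(price, 0, 100, 10) AS bucket, COUNT(*) AS cnt
FROM dfs.tmp.`prices.csv`
GROUP BY width_bucket(price, 0, 100, 10)
ORDER BY bucket;

-- Hypothetical correlation of two numeric columns with the new
-- aggregate UDFs.
SELECT kendall_correlation(x, y) AS tau,
       pearson_correlation(x, y) AS r
FROM dfs.tmp.`points.csv`;
```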
[GitHub] [drill] jnturton merged pull request #2726: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 4
jnturton merged PR #2726: URL: https://github.com/apache/drill/pull/2726 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] kingswanwho commented on pull request #2726: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 4
kingswanwho commented on PR #2726: URL: https://github.com/apache/drill/pull/2726#issuecomment-1364474789 > Thanks @kingswanwho, this looks good with the only issue I see being [the protobuf upgrade](https://github.com/apache/drill/pull/2726/commits/439958bc56eb2d24b7206e83a75e491ff23c89a6). > > In addition to the dependency version number change, a lot of generated protobuf code needs an update. In the master branch I can see that @vvysotskyi had to add this to Dependabot's PR manually. Did you try cherry picking [his commit](https://github.com/apache/drill/pull/2671/commits/a97c7e16f01c36e5d683b561517fe8bad59cfce8) which includes the updates to the generated code? Hi @jnturton, yes, I cherry-picked the commit into this PR; I will drop this commit and submit the PR again. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] jnturton commented on pull request #2726: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 4
jnturton commented on PR #2726: URL: https://github.com/apache/drill/pull/2726#issuecomment-1364471208 Thanks @kingswanwho, this looks good with the only issue I see being [the protobuf upgrade](https://github.com/apache/drill/pull/2726/commits/439958bc56eb2d24b7206e83a75e491ff23c89a6). In addition to the dependency version number change, a lot of generated protobuf code needs an update. In the master branch I can see that @vvysotskyi had to add this to Dependabot's PR manually. Did you try cherry picking [his commit](https://github.com/apache/drill/pull/2671/commits/a97c7e16f01c36e5d683b561517fe8bad59cfce8) which includes the updates to generated code? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] kingswanwho closed pull request #2726: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 4
kingswanwho closed pull request #2726: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 4 URL: https://github.com/apache/drill/pull/2726 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] weijunlu commented on issue #2723: Failed to execute an insert statement across the database
weijunlu commented on issue #2723: URL: https://github.com/apache/drill/issues/2723#issuecomment-1363564470 @cgivre yes, I used the master version.

apache drill> select version, commit_message, commit_time from sys.version;
+----------------+---------------------------------------------------------------------------------------------------+---------------------------+
| version        | commit_message                                                                                    | commit_time               |
+----------------+---------------------------------------------------------------------------------------------------+---------------------------+
| 2.0.0-SNAPSHOT | DRILL-8314: Add support for automatically retrying and disabling broken storage plugins (#2655)  | 18.10.2022 @ 18:15:31 CST |
+----------------+---------------------------------------------------------------------------------------------------+---------------------------+

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] jnturton commented on pull request #2726: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 4
jnturton commented on PR #2726: URL: https://github.com/apache/drill/pull/2726#issuecomment-1363032480 @kingswanwho note that the test failures on [this PR's last CI run](https://github.com/apache/drill/actions/runs/3757899347/jobs/6385612311) are showing up everywhere at the moment and are not caused by your commits here. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] jnturton opened a new pull request, #2728: DRILL-8372: Unfreed buffers when running a LIMIT 0 query over delimited text
jnturton opened a new pull request, #2728: URL: https://github.com/apache/drill/pull/2728 # [DRILL-8372](https://issues.apache.org/jira/browse/DRILL-8372): Unfreed buffers when running a LIMIT 0 query over delimited text ## Description TODO ## Documentation N/A ## Testing TODO -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] cgivre commented on issue #2723: Failed to execute an insert statement across the database
cgivre commented on issue #2723: URL: https://github.com/apache/drill/issues/2723#issuecomment-1362913019 What version of Drill are you using? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] jnturton opened a new pull request, #2727: DRILL-8374: Set the Drill development version to 1.21.0-SNAPSHOT
jnturton opened a new pull request, #2727: URL: https://github.com/apache/drill/pull/2727 # [DRILL-8374](https://issues.apache.org/jira/browse/DRILL-8374): Set the Drill development version to 1.21.0-SNAPSHOT ## Description Changes the Maven version numbers in the Drill master branch from 2.0.0 to 1.21.0. Discussion in the Drill mailing list established that the project would prefer to do a release in the near future rather than wait to build up a changeset for which a version jump to 2.0 would be appropriate. ## Documentation N/A ## Testing N/A -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] jnturton merged pull request #2724: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 3
jnturton merged PR #2724: URL: https://github.com/apache/drill/pull/2724 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] jnturton commented on pull request #2724: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 3
jnturton commented on PR #2724: URL: https://github.com/apache/drill/pull/2724#issuecomment-1362814276 Thanks for the review @kingswanwho. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] kingswanwho opened a new pull request, #2726: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 4
kingswanwho opened a new pull request, #2726: URL: https://github.com/apache/drill/pull/2726 # [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 4 ## Description Merges the following backport-to-stable commits into the 1.20 branch: * https://github.com/apache/drill/pull/2666 * https://github.com/apache/drill/pull/2669 * https://github.com/apache/drill/pull/2671 * https://github.com/apache/drill/pull/2674 * https://github.com/apache/drill/pull/2675 * https://github.com/apache/drill/pull/2676 * https://github.com/apache/drill/pull/2677 * https://github.com/apache/drill/pull/2678 * https://github.com/apache/drill/pull/2682 ## Documentation N/A ## Testing UT -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] cgivre merged pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin
cgivre merged PR #2722: URL: https://github.com/apache/drill/pull/2722 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] jnturton commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin
jnturton commented on code in PR #2722: URL: https://github.com/apache/drill/pull/2722#discussion_r1055192926

## contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
## @@ -98,27 +100,69 @@ public void updateSchema(VectorAccessible batch) {

   @Override
   public void startRecord() {
     logger.debug("Starting record");
-    // Ensure that the new record is empty. This is not strictly necessary, but it is a belt and suspenders approach.
-    splunkEvent.clear();
+    // Ensure that the new record is empty.
+    splunkEvent = new JSONObject();
   }

   @Override
-  public void endRecord() throws IOException {
+  public void endRecord() {
     logger.debug("Ending record");
+    recordCount++;
+
+    // Put event in buffer
+    eventBuffer.add(splunkEvent);
+
     // Write the event to the Splunk index
-    destinationIndex.submit(eventArgs, splunkEvent.toJSONString());
-    // Clear out the splunk event.
-    splunkEvent.clear();
+    if (recordCount >= config.getPluginConfig().getWriterBatchSize()) {
+      try {
+        writeEvents();
+      } catch (IOException e) {
+        throw UserException.dataWriteError(e)
+          .message("Error writing data to Splunk: " + e.getMessage())
+          .build(logger);
+      }
+
+      // Reset record count
+      recordCount = 0;
+    }
   }

+
+  /*
+  args – Optional arguments for this stream. Valid parameters are: "host", "host_regex", "source", and "sourcetype".
+  */
   @Override
   public void abort() {
+    logger.debug("Aborting writing records to Splunk.");
     // No op
   }

   @Override
   public void cleanup() {
-    // No op
+    try {
+      writeEvents();
+    } catch (IOException e) {
+      throw UserException.dataWriteError(e)
+        .message("Error writing data to Splunk: " + e.getMessage())
+        .build(logger);
+    }
+  }
+
+  private void writeEvents() throws IOException {
+    // Open the socket and stream, set up a timestamp
+    destinationIndex.attachWith(new ReceiverBehavior() {

Review Comment:
   This results in a dedicated TCP socket being opened and closed for every writer batch.

## contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
## @@ -98,27 +100,69 @@ public void updateSchema(VectorAccessible batch) {

   @Override
   public void startRecord() {
     logger.debug("Starting record");
-    // Ensure that the new record is empty. This is not strictly necessary, but it is a belt and suspenders approach.
-    splunkEvent.clear();
+    // Ensure that the new record is empty.
+    splunkEvent = new JSONObject();
   }

   @Override
-  public void endRecord() throws IOException {
+  public void endRecord() {
     logger.debug("Ending record");
+    recordCount++;
+
+    // Put event in buffer
+    eventBuffer.add(splunkEvent);
+
     // Write the event to the Splunk index
-    destinationIndex.submit(eventArgs, splunkEvent.toJSONString());
-    // Clear out the splunk event.
-    splunkEvent.clear();
+    if (recordCount >= config.getPluginConfig().getWriterBatchSize()) {
+      try {
+        writeEvents();
+      } catch (IOException e) {
+        throw UserException.dataWriteError(e)
+          .message("Error writing data to Splunk: " + e.getMessage())
+          .build(logger);
+      }
+
+      // Reset record count
+      recordCount = 0;
+    }
   }

+
+  /*
+  args – Optional arguments for this stream. Valid parameters are: "host", "host_regex", "source", and "sourcetype".
+  */
   @Override
   public void abort() {
+    logger.debug("Aborting writing records to Splunk.");

Review Comment:
   Would there be any use in clearing eventBuffer here?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
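For context, the write path under review is exercised by INSERT statements against the Splunk plugin, roughly as follows (the plugin, index, and file names are hypothetical):

```sql
-- Hypothetical INSERT exercising the Splunk write path under review.
-- Each row becomes a JSON event submitted to the target index.
INSERT INTO splunk.`my_index`
SELECT event_time, message
FROM dfs.tmp.`events.csv`;
```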
[GitHub] [drill] kingswanwho commented on pull request #2724: [BACKPORT-TO-STABLE] Bugfix Release 1.20.3 Phase 3
kingswanwho commented on PR #2724: URL: https://github.com/apache/drill/pull/2724#issuecomment-1362496955 Looks perfect to me +1. That's quite a lot of work in a short time. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] cgivre commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin
cgivre commented on code in PR #2722: URL: https://github.com/apache/drill/pull/2722#discussion_r1053981884

## contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
## @@ -0,0 +1,309 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+
+import com.splunk.Args;
+import com.splunk.Index;
+import com.splunk.IndexCollection;
+import com.splunk.Service;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.proto.UserBitShared.UserCredentials;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.store.AbstractRecordWriter;
+import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+public class SplunkBatchWriter extends AbstractRecordWriter {
+
+  private static final Logger logger = LoggerFactory.getLogger(SplunkBatchWriter.class);
+  private static final String DEFAULT_SOURCETYPE = "drill";
+  private final UserCredentials userCredentials;
+  private final List<String> tableIdentifier;
+  private final SplunkWriter config;
+  private final Args eventArgs;
+  protected final Service splunkService;
+  private final JSONObject splunkEvent;
+  protected Index destinationIndex;
+
+
+  public SplunkBatchWriter(UserCredentials userCredentials, List<String> tableIdentifier, SplunkWriter config) {
+    this.config = config;
+    this.tableIdentifier = tableIdentifier;
+    this.userCredentials = userCredentials;
+    this.splunkEvent = new JSONObject();
+    SplunkConnection connection = new SplunkConnection(config.getPluginConfig(), userCredentials.getUserName());
+    this.splunkService = connection.connect();
+
+    // Populate event arguments
+    this.eventArgs = new Args();
+    eventArgs.put("sourcetype", DEFAULT_SOURCETYPE);
+  }
+
+  @Override
+  public void init(Map<String, String> writerOptions) throws IOException {
+    // No op
+  }
+
+  /**
+   * Update the schema in RecordWriter. Called before starting writing the records. In this case,
+   * we add the index to Splunk here. Splunk's API is a little sparse and doesn't really do much in the way
+   * of error checking or providing feedback if the operation fails.
+   *
+   * @param batch {@link VectorAccessible} The incoming batch
+   */
+  @Override
+  public void updateSchema(VectorAccessible batch) {
+    logger.debug("Updating schema for Splunk");
+
+    //Get the collection of indexes
+    IndexCollection indexes = splunkService.getIndexes();
+    try {
+      String indexName = tableIdentifier.get(0);
+      indexes.create(indexName);
+      destinationIndex = splunkService.getIndexes().get(indexName);
+    } catch (Exception e) {
+      // We have to catch a generic exception here, as Splunk's SDK does not really provide any kind of
+      // failure messaging.
+      throw UserException.systemError(e)
+        .message("Error creating new index in Splunk plugin: " + e.getMessage())
+        .build(logger);
+    }
+  }
+
+
+  @Override
+  public void startRecord() {
+    logger.debug("Starting record");
+    // Ensure that the new record is empty. This is not strictly necessary, but it is a belt and suspenders approach.
+    splunkEvent.clear();

Review Comment:
   I removed this from the `endRecord` method.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] cgivre commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin
cgivre commented on code in PR #2722: URL: https://github.com/apache/drill/pull/2722#discussion_r1053979483 ## contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java: ## @@ -0,0 +1,308 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.store.splunk; + + +import com.splunk.Args; +import com.splunk.Index; +import com.splunk.IndexCollection; +import com.splunk.Service; +import org.apache.drill.common.exceptions.UserException; +import org.apache.drill.exec.proto.UserBitShared.UserCredentials; +import org.apache.drill.exec.record.VectorAccessible; +import org.apache.drill.exec.store.AbstractRecordWriter; +import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter; +import org.apache.drill.exec.vector.complex.reader.FieldReader; +import org.json.simple.JSONObject; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.List; +import java.util.Map; + +public class SplunkBatchWriter extends AbstractRecordWriter { + + private static final Logger logger = LoggerFactory.getLogger(SplunkBatchWriter.class); + private static final String DEFAULT_SOURCETYPE = "drill"; + private final UserCredentials userCredentials; + private final List tableIdentifier; + private final SplunkWriter config; + private final Args eventArgs; + protected final Service splunkService; + private JSONObject splunkEvent; + protected Index destinationIndex; + + + public SplunkBatchWriter(UserCredentials userCredentials, List tableIdentifier, SplunkWriter config) { +this.config = config; +this.tableIdentifier = tableIdentifier; +this.userCredentials = userCredentials; + +SplunkConnection connection = new SplunkConnection(config.getPluginConfig(), userCredentials.getUserName()); +this.splunkService = connection.connect(); + +// Populate event arguments +this.eventArgs = new Args(); +eventArgs.put("sourcetype", DEFAULT_SOURCETYPE); + } + + @Override + public void init(Map writerOptions) throws IOException { +// No op + } + + /** + * Update the schema in RecordWriter. Called at least once before starting writing the records. In this case, + * we add the index to Splunk here. Splunk's API is a little sparse and doesn't really do much in the way + * of error checking or providing feedback if the operation fails. 
+ * + * @param batch {@link VectorAccessible} The incoming batch + */ + @Override + public void updateSchema(VectorAccessible batch) { +logger.debug("Updating schema for Splunk"); + +//Get the collection of indexes +IndexCollection indexes = splunkService.getIndexes(); +try { + String indexName = tableIdentifier.get(0); + indexes.create(indexName); + destinationIndex = splunkService.getIndexes().get(indexName); +} catch (Exception e) { + // We have to catch a generic exception here, as Splunk's SDK does not really provide any kind of + // failure messaging. + throw UserException.systemError(e) +.message("Error creating new index in Splunk plugin: " + e.getMessage()) +.build(logger); +} + } + + + @Override + public void startRecord() { +logger.debug("Starting record"); +splunkEvent = new JSONObject(); + } + + @Override + public void endRecord() throws IOException { +logger.debug("Ending record"); +// Write the event to the Splunk index +destinationIndex.submit(eventArgs, splunkEvent.toJSONString()); Review Comment: @jnturton I figured this out. Using Splunk's sample code from their SDK documentation resulted in Splunk not parsing the fields correctly which broke all the unit tests, and didn't work. I did some experiments and found that removing the date actually solved the issue. Splunk's SDK provides a method for writing to a socket which does all the error handling. I used that because that was what the docs recommended, however that method does not allow you to set some of the properties that the other insert methods do. But I'm not debugging Splunk's SDK for free. -- This is an automated message from the Apache Git Service. To respond to th
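For readers comparing the two submission routes discussed in this comment: the SDK's one-shot `submit()` accepts per-event metadata through `Args`, while the socket-based `attach()` route from the SDK docs handles more of the transport but exposes fewer per-event properties. A minimal sketch of both, assuming an `Index` handle obtained as in the quoted code (the JSON payload and sourcetype value are illustrative):

```java
import com.splunk.Args;
import com.splunk.Index;

import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class SplunkSubmitSketch {
  // One-shot HTTP submission: per-event metadata such as sourcetype can be
  // passed through Args. This is the route the PR settled on.
  static void submitEvent(Index index, String json) {
    Args eventArgs = new Args();
    eventArgs.put("sourcetype", "drill");
    index.submit(eventArgs, json);
  }

  // Streaming submission over a raw socket, per the SDK documentation's
  // sample pattern; the SDK manages the connection but offers fewer knobs.
  static void streamEvent(Index index, String json) throws Exception {
    try (Socket socket = index.attach();
         Writer out = new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8)) {
      out.write(json);
      out.write("\r\n");
      out.flush();
    }
  }
}
```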
[GitHub] [drill] cgivre merged pull request #2725: DRILL-8179: Convert LTSV Format Plugin to EVF2
cgivre merged PR #2725: URL: https://github.com/apache/drill/pull/2725 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] cgivre commented on a diff in pull request #2725: DRILL-8179: Convert LTSV Format Plugin to EVF2
cgivre commented on code in PR #2725: URL: https://github.com/apache/drill/pull/2725#discussion_r1053498604 ## contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVBatchReader.java: ## @@ -0,0 +1,264 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.store.ltsv; + +import com.github.lolo.ltsv.LtsvParser; +import com.github.lolo.ltsv.LtsvParser.Builder; +import org.apache.commons.lang3.StringUtils; +import org.apache.drill.common.AutoCloseables; +import org.apache.drill.common.exceptions.CustomErrorContext; +import org.apache.drill.common.exceptions.UserException; +import org.apache.drill.common.types.TypeProtos; +import org.apache.drill.common.types.TypeProtos.MinorType; +import org.apache.drill.exec.physical.impl.scan.v3.ManagedReader; +import org.apache.drill.exec.physical.impl.scan.v3.file.FileDescrip; +import org.apache.drill.exec.physical.impl.scan.v3.file.FileSchemaNegotiator; +import org.apache.drill.exec.physical.resultSet.ResultSetLoader; +import org.apache.drill.exec.physical.resultSet.RowSetLoader; +import org.apache.drill.exec.record.metadata.ColumnMetadata; +import org.apache.drill.exec.record.metadata.MetadataUtils; +import org.apache.drill.exec.record.metadata.TupleMetadata; +import org.apache.drill.exec.vector.accessor.ScalarWriter; +import org.apache.drill.shaded.guava.com.google.common.base.Strings; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.io.InputStream; +import java.text.ParseException; +import java.text.SimpleDateFormat; +import java.time.Instant; +import java.time.LocalDate; +import java.time.LocalTime; +import java.time.format.DateTimeFormatter; +import java.util.Date; +import java.util.Iterator; +import java.util.Map; + +public class LTSVBatchReader implements ManagedReader { + + private static final Logger logger = LoggerFactory.getLogger(LTSVBatchReader.class); + private final LTSVFormatPluginConfig config; + private final FileDescrip file; + private final CustomErrorContext errorContext; + private final LtsvParser ltsvParser; + private final RowSetLoader rowWriter; + private final FileSchemaNegotiator negotiator; + private InputStream fsStream; + private Iterator> rowIterator; + + + public LTSVBatchReader(LTSVFormatPluginConfig config, FileSchemaNegotiator negotiator) { +this.config = config; +this.negotiator = negotiator; +file = negotiator.file(); +errorContext = negotiator.parentErrorContext(); +ltsvParser = buildParser(); + +openFile(); + +// If there is a provided schema, import it +if (negotiator.providedSchema() != null) { + TupleMetadata schema = negotiator.providedSchema(); + negotiator.tableSchema(schema, false); +} +ResultSetLoader loader = negotiator.build(); +rowWriter = 
loader.writer(); + + } + + private void openFile() { +try { + fsStream = file.fileSystem().openPossiblyCompressedStream(file.split().getPath()); +} catch (IOException e) { + throw UserException + .dataReadError(e) + .message("Unable to open LTSV File %s", file.split().getPath() + " " + e.getMessage()) + .addContext(errorContext) + .build(logger); +} +rowIterator = ltsvParser.parse(fsStream); + } + + @Override + public boolean next() { +while (!rowWriter.isFull()) { + if (!processNextRow()) { +return false; + } +} +return true; + } + + private LtsvParser buildParser() { +Builder builder = LtsvParser.builder(); +builder.trimKeys(); +builder.trimValues(); +builder.skipNullValues(); + +if (config.getParseMode().contentEquals("strict")) { + builder.strict(); +} else { + builder.lenient(); +} + +if (StringUtils.isNotEmpty(config.getEscapeCharacter())) { + builder.withEscapeChar(config.getEscapeCharacter().charAt(0)); +} + +if (StringUtils.isNotEmpty(config.getKvDelimiter())) { + builder.withKvDelimiter(config.getKvDelimiter().charAt(0)); +} + +if (StringUtils.isNotEmpty(config.getEntryDelimiter())) { + builder.withEntryDelimiter(config.getEntryDelimiter().charAt(0)); +} + +if
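To make the control flow in the quoted `next()` method concrete: EVF2 readers fill a batch until the row writer reports full, with a per-row step that starts a row, writes each parsed field, and saves. A simplified sketch of what a `processNextRow()` like the one referenced above might do; the column-lookup helper is hypothetical, and the real reader resolves writers through the `RowSetLoader`'s column metadata:

```java
// Per-row step of an EVF2 batch reader, simplified. rowIterator yields
// Map<String, Object> records from the LTSV parser, as in the quoted code.
private boolean processNextRow() {
  if (!rowIterator.hasNext()) {
    return false;                       // end of input: tell EVF2 to stop
  }
  Map<String, Object> record = rowIterator.next();
  rowWriter.start();                    // open a new row in the batch
  for (Map.Entry<String, Object> field : record.entrySet()) {
    // columnWriterFor() is a stand-in for looking up (or lazily adding)
    // the column's ScalarWriter on the rowWriter.
    ScalarWriter writer = columnWriterFor(field.getKey());
    writer.setString(String.valueOf(field.getValue()));
  }
  rowWriter.save();                     // commit the row to the batch
  return true;
}
```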
[GitHub] [drill] jnturton commented on a diff in pull request #2725: DRILL-8179: Convert LTSV Format Plugin to EVF2
jnturton commented on code in PR #2725: URL: https://github.com/apache/drill/pull/2725#discussion_r1053465233 ## contrib/format-ltsv/src/test/java/org/apache/drill/exec/store/ltsv/TestLTSVRecordReader.java: ## @@ -37,34 +42,77 @@ public static void setup() throws Exception { @Test public void testWildcard() throws Exception { -testBuilder() - .sqlQuery("SELECT * FROM cp.`simple.ltsv`") - .unOrdered() - .baselineColumns("host", "forwardedfor", "req", "status", "size", "referer", "ua", "reqtime", "apptime", "vhost") - .baselineValues("xxx.xxx.xxx.xxx", "-", "GET /v1/xxx HTTP/1.1", "200", "4968", "-", "Java/1.8.0_131", "2.532", "2.532", "api.example.com") - .baselineValues("xxx.xxx.xxx.xxx", "-", "GET /v1/yyy HTTP/1.1", "200", "412", "-", "Java/1.8.0_201", "3.580", "3.580", "api.example.com") - .go(); +String sql = "SELECT * FROM cp.`simple.ltsv`"; Review Comment: Let's rename this class TestLTSVQueries or similar now that LTSVRecordReader is gone? ## contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVBatchReader.java: ## @@ -0,0 +1,264 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.drill.exec.store.ltsv; + +import com.github.lolo.ltsv.LtsvParser; +import com.github.lolo.ltsv.LtsvParser.Builder; +import org.apache.commons.lang3.StringUtils; +import org.apache.drill.common.AutoCloseables; +import org.apache.drill.common.exceptions.CustomErrorContext; +import org.apache.drill.common.exceptions.UserException; +import org.apache.drill.common.types.TypeProtos; +import org.apache.drill.common.types.TypeProtos.MinorType; +import org.apache.drill.exec.physical.impl.scan.v3.ManagedReader; +import org.apache.drill.exec.physical.impl.scan.v3.file.FileDescrip; +import org.apache.drill.exec.physical.impl.scan.v3.file.FileSchemaNegotiator; +import org.apache.drill.exec.physical.resultSet.ResultSetLoader; +import org.apache.drill.exec.physical.resultSet.RowSetLoader; +import org.apache.drill.exec.record.metadata.ColumnMetadata; +import org.apache.drill.exec.record.metadata.MetadataUtils; +import org.apache.drill.exec.record.metadata.TupleMetadata; +import org.apache.drill.exec.vector.accessor.ScalarWriter; +import org.apache.drill.shaded.guava.com.google.common.base.Strings; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.io.InputStream; +import java.text.ParseException; +import java.text.SimpleDateFormat; +import java.time.Instant; +import java.time.LocalDate; +import java.time.LocalTime; +import java.time.format.DateTimeFormatter; +import java.util.Date; +import java.util.Iterator; +import java.util.Map; + +public class LTSVBatchReader implements ManagedReader { + + private static final Logger logger = LoggerFactory.getLogger(LTSVBatchReader.class); + private final LTSVFormatPluginConfig config; + private final FileDescrip file; + private final CustomErrorContext errorContext; + private final LtsvParser ltsvParser; + private final RowSetLoader rowWriter; + private final FileSchemaNegotiator negotiator; + private InputStream fsStream; + private Iterator> rowIterator; + + + public LTSVBatchReader(LTSVFormatPluginConfig config, FileSchemaNegotiator negotiator) { +this.config = config; +this.negotiator = negotiator; +file = negotiator.file(); +errorContext = negotiator.parentErrorContext(); +ltsvParser = buildParser(); + +openFile(); + +// If there is a provided schema, import it +if (negotiator.providedSchema() != null) { + TupleMetadata schema = negotiator.providedSchema(); + negotiator.tableSchema(schema, false); +} +ResultSetLoader loader = negotiator.build(); +rowWriter = loader.writer(); + + } + + private void openFile() { +try { + fsStream = file.fileSystem().openPossiblyCompressedStream(file.split().getPath()); +} catch (IOException e) { + throw UserException + .dataReadError(e) + .message("Unable to open LTSV File %s", file.split().getPath() + " " + e.getMessage()) + .addContext(errorContext) + .build(logger); +} +rowIterator = ltsvParser.pa
[GitHub] [drill] jnturton commented on pull request #2668: DRILL-8328: HTTP UDF Not Resolving Storage Aliases
jnturton commented on PR #2668: URL: https://github.com/apache/drill/pull/2668#issuecomment-1359539411 I've just removed the backport-to-stable tag since these UDFs arrived after Drill 1.20. Thanks to @kingswanwho for spotting this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] cgivre opened a new pull request, #2725: DRILL-8179: Convert LTSV Format Plugin to EVF2
cgivre opened a new pull request, #2725: URL: https://github.com/apache/drill/pull/2725 # [DRILL-8179](https://issues.apache.org/jira/browse/DRILL-8179): Convert LTSV Format Plugin to EVF2 ## Description With this PR, all format plugins are now using the EVF readers. This is part of [DRILL-8312](https://issues.apache.org/jira/browse/DRILL-8312). ## Documentation In addition to refactoring the plugin to use EVF V2, this code replaces the homegrown LTSV reader with an external parsing library and introduces new configuration variables. These variables are all noted in the updated README. They are all optional, so most users are unlikely to notice any difference; the one exception is the new variable that controls error tolerance. ## Testing Ran existing unit tests. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] weijunlu commented on issue #2693: Order by expression failed to execute in mysql plugin
weijunlu commented on issue #2693: URL: https://github.com/apache/drill/issues/2693#issuecomment-1358751539 @vvysotskyi @cgivre. If MySQL disables only_full_group_by, the sql can be executed.

Jupiter (mysql.test)> select
2..semicolon> extract(year from o_orderdate) as o_year
3..semicolon> from orders
4..semicolon> group by o_year
5..semicolon> order by o_year;
+--------+
| o_year |
+--------+
| 1992   |
| 1993   |
| 1994   |
| 1995   |
| 1996   |
| 1997   |
| 1998   |
+--------+
7 rows selected (4.079 seconds)

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] jnturton commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin
jnturton commented on code in PR #2722: URL: https://github.com/apache/drill/pull/2722#discussion_r1052404918 ## contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java: ## @@ -0,0 +1,308 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.store.splunk; + + +import com.splunk.Args; +import com.splunk.Index; +import com.splunk.IndexCollection; +import com.splunk.Service; +import org.apache.drill.common.exceptions.UserException; +import org.apache.drill.exec.proto.UserBitShared.UserCredentials; +import org.apache.drill.exec.record.VectorAccessible; +import org.apache.drill.exec.store.AbstractRecordWriter; +import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter; +import org.apache.drill.exec.vector.complex.reader.FieldReader; +import org.json.simple.JSONObject; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.List; +import java.util.Map; + +public class SplunkBatchWriter extends AbstractRecordWriter { + + private static final Logger logger = LoggerFactory.getLogger(SplunkBatchWriter.class); + private static final String DEFAULT_SOURCETYPE = "drill"; + private final UserCredentials userCredentials; + private final List tableIdentifier; + private final SplunkWriter config; + private final Args eventArgs; + protected final Service splunkService; + private JSONObject splunkEvent; + protected Index destinationIndex; + + + public SplunkBatchWriter(UserCredentials userCredentials, List tableIdentifier, SplunkWriter config) { +this.config = config; +this.tableIdentifier = tableIdentifier; +this.userCredentials = userCredentials; + +SplunkConnection connection = new SplunkConnection(config.getPluginConfig(), userCredentials.getUserName()); +this.splunkService = connection.connect(); + +// Populate event arguments +this.eventArgs = new Args(); +eventArgs.put("sourcetype", DEFAULT_SOURCETYPE); + } + + @Override + public void init(Map writerOptions) throws IOException { +// No op + } + + /** + * Update the schema in RecordWriter. Called at least once before starting writing the records. In this case, + * we add the index to Splunk here. Splunk's API is a little sparse and doesn't really do much in the way + * of error checking or providing feedback if the operation fails. 
+ * + * @param batch {@link VectorAccessible} The incoming batch + */ + @Override + public void updateSchema(VectorAccessible batch) { +logger.debug("Updating schema for Splunk"); + +//Get the collection of indexes +IndexCollection indexes = splunkService.getIndexes(); +try { + String indexName = tableIdentifier.get(0); + indexes.create(indexName); + destinationIndex = splunkService.getIndexes().get(indexName); +} catch (Exception e) { + // We have to catch a generic exception here, as Splunk's SDK does not really provide any kind of + // failure messaging. + throw UserException.systemError(e) +.message("Error creating new index in Splunk plugin: " + e.getMessage()) +.build(logger); +} + } + + + @Override + public void startRecord() { +logger.debug("Starting record"); +splunkEvent = new JSONObject(); + } + + @Override + public void endRecord() throws IOException { +logger.debug("Ending record"); +// Write the event to the Splunk index +destinationIndex.submit(eventArgs, splunkEvent.toJSONString()); Review Comment: @cgivre can we leave a comment explaining this to readers then? ## contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java: ## @@ -0,0 +1,309 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.
[GitHub] [drill] cgivre commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin
cgivre commented on code in PR #2722: URL: https://github.com/apache/drill/pull/2722#discussion_r1052374395 ## contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java: ## @@ -0,0 +1,308 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.store.splunk; + + +import com.splunk.Args; +import com.splunk.Index; +import com.splunk.IndexCollection; +import com.splunk.Service; +import org.apache.drill.common.exceptions.UserException; +import org.apache.drill.exec.proto.UserBitShared.UserCredentials; +import org.apache.drill.exec.record.VectorAccessible; +import org.apache.drill.exec.store.AbstractRecordWriter; +import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter; +import org.apache.drill.exec.vector.complex.reader.FieldReader; +import org.json.simple.JSONObject; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.List; +import java.util.Map; + +public class SplunkBatchWriter extends AbstractRecordWriter { + + private static final Logger logger = LoggerFactory.getLogger(SplunkBatchWriter.class); + private static final String DEFAULT_SOURCETYPE = "drill"; + private final UserCredentials userCredentials; + private final List tableIdentifier; + private final SplunkWriter config; + private final Args eventArgs; + protected final Service splunkService; + private JSONObject splunkEvent; + protected Index destinationIndex; + + + public SplunkBatchWriter(UserCredentials userCredentials, List tableIdentifier, SplunkWriter config) { +this.config = config; +this.tableIdentifier = tableIdentifier; +this.userCredentials = userCredentials; + +SplunkConnection connection = new SplunkConnection(config.getPluginConfig(), userCredentials.getUserName()); +this.splunkService = connection.connect(); + +// Populate event arguments +this.eventArgs = new Args(); +eventArgs.put("sourcetype", DEFAULT_SOURCETYPE); + } + + @Override + public void init(Map writerOptions) throws IOException { +// No op + } + + /** + * Update the schema in RecordWriter. Called at least once before starting writing the records. In this case, + * we add the index to Splunk here. Splunk's API is a little sparse and doesn't really do much in the way + * of error checking or providing feedback if the operation fails. 
+ * + * @param batch {@link VectorAccessible} The incoming batch + */ + @Override + public void updateSchema(VectorAccessible batch) { +logger.debug("Updating schema for Splunk"); + +//Get the collection of indexes +IndexCollection indexes = splunkService.getIndexes(); +try { + String indexName = tableIdentifier.get(0); + indexes.create(indexName); + destinationIndex = splunkService.getIndexes().get(indexName); +} catch (Exception e) { + // We have to catch a generic exception here, as Splunk's SDK does not really provide any kind of + // failure messaging. + throw UserException.systemError(e) +.message("Error creating new index in Splunk plugin: " + e.getMessage()) +.build(logger); +} + } + + + @Override + public void startRecord() { +logger.debug("Starting record"); +splunkEvent = new JSONObject(); + } + + @Override + public void endRecord() throws IOException { +logger.debug("Ending record"); +// Write the event to the Splunk index +destinationIndex.submit(eventArgs, splunkEvent.toJSONString()); +// Clear out the splunk event. +splunkEvent = new JSONObject(); Review Comment: Yes. This line clears out the event so every row we start fresh. I discovered there is a `clear` method so I called that rather than creating a new object every time. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
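The allocation-versus-reuse point being settled here is small but worth spelling out: json-simple's `JSONObject` extends `HashMap`, so a single event object can be cleared per record instead of reallocated. A minimal illustration, in which everything except `splunkEvent`, `clear()`, and `toJSONString()` is illustrative:

```java
import org.json.simple.JSONObject;

import java.util.Map;

public class EventReuseSketch {
  private final JSONObject splunkEvent = new JSONObject();

  @SuppressWarnings("unchecked") // json-simple's JSONObject is a raw HashMap
  String buildPayload(Map<String, Object> fields) {
    splunkEvent.clear();        // start each record fresh, no new allocation
    splunkEvent.putAll(fields); // copy the row's converted field values in
    return splunkEvent.toJSONString();
  }
}
```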
[GitHub] [drill] cgivre commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin
cgivre commented on code in PR #2722: URL: https://github.com/apache/drill/pull/2722#discussion_r1052368638 ## contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java: ## @@ -0,0 +1,308 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.store.splunk; + + +import com.splunk.Args; +import com.splunk.Index; +import com.splunk.IndexCollection; +import com.splunk.Service; +import org.apache.drill.common.exceptions.UserException; +import org.apache.drill.exec.proto.UserBitShared.UserCredentials; +import org.apache.drill.exec.record.VectorAccessible; +import org.apache.drill.exec.store.AbstractRecordWriter; +import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter; +import org.apache.drill.exec.vector.complex.reader.FieldReader; +import org.json.simple.JSONObject; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.List; +import java.util.Map; + +public class SplunkBatchWriter extends AbstractRecordWriter { + + private static final Logger logger = LoggerFactory.getLogger(SplunkBatchWriter.class); + private static final String DEFAULT_SOURCETYPE = "drill"; + private final UserCredentials userCredentials; + private final List tableIdentifier; + private final SplunkWriter config; + private final Args eventArgs; + protected final Service splunkService; + private JSONObject splunkEvent; + protected Index destinationIndex; + + + public SplunkBatchWriter(UserCredentials userCredentials, List tableIdentifier, SplunkWriter config) { +this.config = config; +this.tableIdentifier = tableIdentifier; +this.userCredentials = userCredentials; + +SplunkConnection connection = new SplunkConnection(config.getPluginConfig(), userCredentials.getUserName()); +this.splunkService = connection.connect(); + +// Populate event arguments +this.eventArgs = new Args(); +eventArgs.put("sourcetype", DEFAULT_SOURCETYPE); + } + + @Override + public void init(Map writerOptions) throws IOException { +// No op + } + + /** + * Update the schema in RecordWriter. Called at least once before starting writing the records. In this case, Review Comment: Sorry.. I clarified the comment. This is called once before the records are written. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
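For context on why the wording of that Javadoc matters: Drill's `RecordWriter` contract fixes a call order that implementations like this one rely on. Sketched below with method names from the `RecordWriter` interface; the loop body is schematic, not the PR's exact code:

```java
// Lifecycle a RecordWriter implementation can assume (schematic):
//
//   init(writerOptions);     // once, before any records arrive
//   updateSchema(batch);     // before writing starts -- the Splunk writer
//                            // creates its destination index here
//   while (rowsRemain) {
//     startRecord();         // reset per-row state (the JSON event above)
//     // field converters copy each column value into the event
//     endRecord();           // submit the completed event to Splunk
//   }
//   cleanup();               // once, when the fragment finishes
```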
[GitHub] [drill] cgivre commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin
cgivre commented on code in PR #2722: URL: https://github.com/apache/drill/pull/2722#discussion_r1052367809 ## contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkInsertWriter.java: ## @@ -0,0 +1,72 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.store.splunk; + +import com.fasterxml.jackson.annotation.JacksonInject; +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonIgnore; +import com.fasterxml.jackson.annotation.JsonProperty; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.store.StoragePluginRegistry; + +import java.util.List; + +public class SplunkInsertWriter extends SplunkWriter { + public static final String OPERATOR_TYPE = "SPLUNK_INSERT_WRITER"; + + private final SplunkStoragePlugin plugin; + private final List tableIdentifier; + + @JsonCreator + public SplunkInsertWriter( + @JsonProperty("child") PhysicalOperator child, + @JsonProperty("tableIdentifier") List tableIdentifier, + @JsonProperty("storage") SplunkPluginConfig storageConfig, + @JacksonInject StoragePluginRegistry engineRegistry) { +super(child, tableIdentifier, engineRegistry.resolve(storageConfig, SplunkStoragePlugin.class)); Review Comment: Fixed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
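The constructor pattern at issue in this file deserves a brief gloss: Drill physical operators are serialized into the query plan as JSON and reconstructed on each Drillbit, so the writer takes its child operator and config via `@JsonProperty` and receives the `StoragePluginRegistry` via `@JacksonInject` to resolve the config back to a live plugin. A commented restatement of the quoted constructor, with generic types restored and the remaining field assignments abbreviated:

```java
@JsonCreator
public SplunkInsertWriter(
    @JsonProperty("child") PhysicalOperator child,           // upstream operator in the plan
    @JsonProperty("tableIdentifier") List<String> tableIdentifier,
    @JsonProperty("storage") SplunkPluginConfig storageConfig,
    @JacksonInject StoragePluginRegistry engineRegistry) {   // injected, not serialized
  // resolve() maps the serialized config back to the registered plugin
  // instance, so the deserialized operator shares the live plugin state.
  super(child, tableIdentifier, engineRegistry.resolve(storageConfig, SplunkStoragePlugin.class));
  this.tableIdentifier = tableIdentifier;
  // ... remaining field assignments as in the quoted diff
}
```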
[GitHub] [drill] cgivre commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin
cgivre commented on code in PR #2722: URL: https://github.com/apache/drill/pull/2722#discussion_r1052363847 ## contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java: ## @@ -0,0 +1,308 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.store.splunk; + + +import com.splunk.Args; +import com.splunk.Index; +import com.splunk.IndexCollection; +import com.splunk.Service; +import org.apache.drill.common.exceptions.UserException; +import org.apache.drill.exec.proto.UserBitShared.UserCredentials; +import org.apache.drill.exec.record.VectorAccessible; +import org.apache.drill.exec.store.AbstractRecordWriter; +import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter; +import org.apache.drill.exec.vector.complex.reader.FieldReader; +import org.json.simple.JSONObject; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.List; +import java.util.Map; + +public class SplunkBatchWriter extends AbstractRecordWriter { + + private static final Logger logger = LoggerFactory.getLogger(SplunkBatchWriter.class); + private static final String DEFAULT_SOURCETYPE = "drill"; + private final UserCredentials userCredentials; + private final List tableIdentifier; + private final SplunkWriter config; + private final Args eventArgs; + protected final Service splunkService; + private JSONObject splunkEvent; + protected Index destinationIndex; + + + public SplunkBatchWriter(UserCredentials userCredentials, List tableIdentifier, SplunkWriter config) { +this.config = config; +this.tableIdentifier = tableIdentifier; +this.userCredentials = userCredentials; + +SplunkConnection connection = new SplunkConnection(config.getPluginConfig(), userCredentials.getUserName()); +this.splunkService = connection.connect(); + +// Populate event arguments +this.eventArgs = new Args(); +eventArgs.put("sourcetype", DEFAULT_SOURCETYPE); + } + + @Override + public void init(Map writerOptions) throws IOException { +// No op + } + + /** + * Update the schema in RecordWriter. Called at least once before starting writing the records. In this case, + * we add the index to Splunk here. Splunk's API is a little sparse and doesn't really do much in the way + * of error checking or providing feedback if the operation fails. 
+ * + * @param batch {@link VectorAccessible} The incoming batch + */ + @Override + public void updateSchema(VectorAccessible batch) { +logger.debug("Updating schema for Splunk"); + +//Get the collection of indexes +IndexCollection indexes = splunkService.getIndexes(); +try { + String indexName = tableIdentifier.get(0); + indexes.create(indexName); + destinationIndex = splunkService.getIndexes().get(indexName); +} catch (Exception e) { + // We have to catch a generic exception here, as Splunk's SDK does not really provide any kind of + // failure messaging. + throw UserException.systemError(e) +.message("Error creating new index in Splunk plugin: " + e.getMessage()) +.build(logger); +} + } + + + @Override + public void startRecord() { +logger.debug("Starting record"); +splunkEvent = new JSONObject(); + } + + @Override + public void endRecord() throws IOException { +logger.debug("Ending record"); +// Write the event to the Splunk index +destinationIndex.submit(eventArgs, splunkEvent.toJSONString()); Review Comment: I think there may be some bug in the Splunk SDK. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] cgivre commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin
cgivre commented on code in PR #2722: URL: https://github.com/apache/drill/pull/2722#discussion_r1052356461 ## contrib/storage-splunk/src/test/java/org/apache/drill/exec/store/splunk/SplunkWriterTest.java: ## @@ -0,0 +1,191 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.store.splunk; + +import org.apache.drill.categories.SlowTest; +import org.apache.drill.common.types.TypeProtos; +import org.apache.drill.common.types.TypeProtos.MinorType; +import org.apache.drill.exec.physical.rowSet.DirectRowSet; +import org.apache.drill.exec.physical.rowSet.RowSet; +import org.apache.drill.exec.physical.rowSet.RowSetBuilder; +import org.apache.drill.exec.record.metadata.SchemaBuilder; +import org.apache.drill.exec.record.metadata.TupleMetadata; +import org.apache.drill.test.QueryBuilder.QuerySummary; +import org.apache.drill.test.rowSet.RowSetUtilities; +import org.junit.FixMethodOrder; +import org.junit.Test; +import org.junit.experimental.categories.Category; +import org.junit.runners.MethodSorters; + + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertTrue; + +@FixMethodOrder(MethodSorters.JVM) +@Category({SlowTest.class}) +public class SplunkWriterTest extends SplunkBaseTest { + + @Test + public void testBasicCTAS() throws Exception { + +// Verify that there is no index called t1 in Splunk +String sql = "SELECT * FROM INFORMATION_SCHEMA.`TABLES` WHERE TABLE_SCHEMA = 'splunk' AND TABLE_NAME LIKE 't1'"; +RowSet results = client.queryBuilder().sql(sql).rowSet(); +assertEquals(0, results.rowCount()); +results.clear(); + +// Now create the table +sql = "CREATE TABLE `splunk`.`t1` AS SELECT * FROM cp.`test_data.csvh`"; +QuerySummary summary = client.queryBuilder().sql(sql).run(); +assertTrue(summary.succeeded()); + +// Verify that an index was created called t1 in Splunk +sql = "SELECT * FROM INFORMATION_SCHEMA.`TABLES` WHERE TABLE_SCHEMA = 'splunk' AND TABLE_NAME LIKE 't1'"; +results = client.queryBuilder().sql(sql).rowSet(); +assertEquals(1, results.rowCount()); +results.clear(); + +// There seems to be some delay between the Drill query writing the data and the data being made +// accessible. +Thread.sleep(3); Review Comment: Yeah.. There seems to be a processing delay between inserting data and it actually being queryable. I don't think this is a Drill issue. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
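Since the fixed `Thread.sleep()` in this test is papering over Splunk's indexing delay, a bounded polling loop is a common alternative: it proceeds as soon as the data shows up and fails loudly if it never does. A sketch, with `countRows()` as a hypothetical stand-in for the `client.queryBuilder()` call used in the test:

```java
import java.time.Duration;
import java.time.Instant;

public class AwaitRowsSketch {
  // Poll until the query returns at least `expected` rows or the timeout
  // elapses; avoids both flaky short sleeps and needlessly long ones.
  static void awaitRows(String sql, int expected, Duration timeout) throws Exception {
    Instant deadline = Instant.now().plus(timeout);
    while (Instant.now().isBefore(deadline)) {
      if (countRows(sql) >= expected) {
        return;                    // data is visible; proceed with asserts
      }
      Thread.sleep(250);           // brief back-off between polls
    }
    throw new AssertionError("Timed out waiting for rows from: " + sql);
  }

  // Hypothetical helper: run the query and return its row count.
  static long countRows(String sql) throws Exception {
    throw new UnsupportedOperationException("wire this to the test client");
  }
}
```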
[GitHub] [drill] cgivre commented on a diff in pull request #2722: DRILL-8371: Add Write/Insert Capability to Splunk Plugin
cgivre commented on code in PR #2722: URL: https://github.com/apache/drill/pull/2722#discussion_r1052354949 ## contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java: ## @@ -0,0 +1,308 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.store.splunk; + + +import com.splunk.Args; +import com.splunk.Index; +import com.splunk.IndexCollection; +import com.splunk.Service; +import org.apache.drill.common.exceptions.UserException; +import org.apache.drill.exec.proto.UserBitShared.UserCredentials; +import org.apache.drill.exec.record.VectorAccessible; +import org.apache.drill.exec.store.AbstractRecordWriter; +import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter; +import org.apache.drill.exec.vector.complex.reader.FieldReader; +import org.json.simple.JSONObject; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.List; +import java.util.Map; + +public class SplunkBatchWriter extends AbstractRecordWriter { + + private static final Logger logger = LoggerFactory.getLogger(SplunkBatchWriter.class); + private static final String DEFAULT_SOURCETYPE = "drill"; + private final UserCredentials userCredentials; + private final List tableIdentifier; + private final SplunkWriter config; + private final Args eventArgs; + protected final Service splunkService; + private JSONObject splunkEvent; + protected Index destinationIndex; + + + public SplunkBatchWriter(UserCredentials userCredentials, List tableIdentifier, SplunkWriter config) { +this.config = config; +this.tableIdentifier = tableIdentifier; +this.userCredentials = userCredentials; + +SplunkConnection connection = new SplunkConnection(config.getPluginConfig(), userCredentials.getUserName()); +this.splunkService = connection.connect(); + +// Populate event arguments +this.eventArgs = new Args(); +eventArgs.put("sourcetype", DEFAULT_SOURCETYPE); + } + + @Override + public void init(Map writerOptions) throws IOException { +// No op + } + + /** + * Update the schema in RecordWriter. Called at least once before starting writing the records. In this case, + * we add the index to Splunk here. Splunk's API is a little sparse and doesn't really do much in the way + * of error checking or providing feedback if the operation fails. 
+ * + * @param batch {@link VectorAccessible} The incoming batch + */ + @Override + public void updateSchema(VectorAccessible batch) { +logger.debug("Updating schema for Splunk"); + +//Get the collection of indexes +IndexCollection indexes = splunkService.getIndexes(); +try { + String indexName = tableIdentifier.get(0); + indexes.create(indexName); + destinationIndex = splunkService.getIndexes().get(indexName); +} catch (Exception e) { + // We have to catch a generic exception here, as Splunk's SDK does not really provide any kind of + // failure messaging. + throw UserException.systemError(e) +.message("Error creating new index in Splunk plugin: " + e.getMessage()) +.build(logger); +} + } + + + @Override + public void startRecord() { +logger.debug("Starting record"); +splunkEvent = new JSONObject(); + } + + @Override + public void endRecord() throws IOException { +logger.debug("Ending record"); +// Write the event to the Splunk index +destinationIndex.submit(eventArgs, splunkEvent.toJSONString()); Review Comment: @jnturton I actually tried this first and I couldn't get Splunk to actually write any data. I literally cut/pasted their code into Drill to no avail. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org