[GitHub] [drill] luocooong opened a new pull request #2501: [MINOR UPDATE] Add license to CodeQL YAML
luocooong opened a new pull request #2501: URL: https://github.com/apache/drill/pull/2501 [MINOR UPDATE] Add license to CodeQL YAML ## Description Fixing the failure of the `checkstyle and generate protobufs` on the CI. ## Documentation N/A ## Testing Use the CI. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] cgivre commented on pull request #2483: DRILL-8149: large xlsx configs
cgivre commented on pull request #2483: URL: https://github.com/apache/drill/pull/2483#issuecomment-1077725853 > No worries! It looks good to me other than adding some doc updates. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] pjfanning commented on pull request #2483: DRILL-8149: large xlsx configs
pjfanning commented on pull request #2483: URL: https://github.com/apache/drill/pull/2483#issuecomment-1077722983 @cgivre @luocooong thanks for the feedback. I've been busy on other stuff and am not prioritising completing this. Before it is ready to merge, it will need doc updates and test coverage. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] cgivre commented on a change in pull request #2483: DRILL-8149: large xlsx configs
cgivre commented on a change in pull request #2483: URL: https://github.com/apache/drill/pull/2483#discussion_r834400430 ## File path: contrib/format-excel/src/main/java/org/apache/drill/exec/store/excel/ExcelBatchReader.java ## @@ -183,6 +189,9 @@ public String getFieldName() { lastColumn = plugin.getConfig().getLastColumn(); allTextMode = plugin.getConfig().getAllTextMode(); sheetName = plugin.getConfig().getSheetName(); + maxArraySize = plugin.getConfig().getMaxArraySize(); Review comment: @pjfanning Thanks for the update! Can you update the `README` file in this folder to explain what these options do? Do you have recommended values? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] cgivre merged pull request #2485: DRILL-8086: Convert the CSV (AKA "compliant text") reader to EVF V2
cgivre merged pull request #2485: URL: https://github.com/apache/drill/pull/2485 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] cgivre commented on pull request #2485: DRILL-8086: Convert the CSV (AKA "compliant text") reader to EVF V2
cgivre commented on pull request #2485: URL: https://github.com/apache/drill/pull/2485#issuecomment-1077578456 @jnturton I added a codeql check to the repo. CodeQL suggested CPP, Java, and JavaScript as languages. I thought we'd try it and see if it adds any value or not. In any event, I removed cpp. I'm fine with merging at this point. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] luocooong commented on a change in pull request #2493: DRILL-8164: Upgrade metadata-extractor because of CVE-2022-24613
luocooong commented on a change in pull request #2493: URL: https://github.com/apache/drill/pull/2493#discussion_r834253110 ## File path: metastore/rdbms-metastore/src/main/java/org/apache/drill/metastore/rdbms/RdbmsMetastore.java ## @@ -117,6 +117,8 @@ private HikariDataSource dataSource(DrillConfig config) { private void initTables(DataSource dataSource) { try (Connection connection = dataSource.getConnection()) { JdbcConnection jdbcConnection = new JdbcConnection(connection); + // TODO Replace the following steps with new function once this issue is resolved. Review comment: Thanks, let's hold-off the merge progress. Actually, I think this is a security issue with only a 50% probability, but in the latest responses from the reporting group, they point out that it can be exploited and attacked. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (DRILL-8174) Convert Avro format to EVF2
Cong Luo created DRILL-8174: --- Summary: Convert Avro format to EVF2 Key: DRILL-8174 URL: https://issues.apache.org/jira/browse/DRILL-8174 Project: Apache Drill Issue Type: Improvement Reporter: Cong Luo Assignee: Cong Luo Second refactor of the Avro format. We may need to resolve multiple bugs related to the Map Vector in this pull request. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[GitHub] [drill] luocooong commented on pull request #2485: DRILL-8086: Convert the CSV (AKA "compliant text") reader to EVF V2
luocooong commented on pull request #2485: URL: https://github.com/apache/drill/pull/2485#issuecomment-1077346954 @paul-rogers @jnturton Don't worry, I think Charles accidentally pushed the local file: Pushed a CodeQL commit. Let's wait for him for a few hours. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] jnturton commented on pull request #2485: DRILL-8086: Convert the CSV (AKA "compliant text") reader to EVF V2
jnturton commented on pull request #2485: URL: https://github.com/apache/drill/pull/2485#issuecomment-1077330375 > Rebased on master. It is now failing on the C++ code check, which is a bit odd because I don't recall seeing much C++ code in Drill... What are these new CodeQL actions here, a rebranding of LGTM? And what prompted this new analysis of our C++, we previously only analysed Java, JS and Python IIRC. (These are more questions for us, @cgivre @luocooong @vdiravka, than for you @paul-rogers) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] lgtm-com[bot] commented on pull request #2485: DRILL-8086: Convert the CSV (AKA "compliant text") reader to EVF V2
lgtm-com[bot] commented on pull request #2485: URL: https://github.com/apache/drill/pull/2485#issuecomment-1077317155 This pull request **introduces 3 alerts** and **fixes 3** when merging 946692198b84c7828781401fb120b51e552898b1 into ed3b1d395787534532e6fa42510c30b7a399befd - [view on LGTM.com](https://lgtm.com/projects/g/apache/drill/rev/pr-fa822892d7133370d7e14a4f4a6d046d755d9c94) **new alerts:** * 3 for Dereferenced variable may be null **fixed alerts:** * 3 for Dereferenced variable may be null -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] paul-rogers commented on pull request #2485: DRILL-8086: Convert the CSV (AKA "compliant text") reader to EVF V2
paul-rogers commented on pull request #2485: URL: https://github.com/apache/drill/pull/2485#issuecomment-1077311337 Rebased on master. It is now failing on the C++ code check, which is a bit odd because I don't recall seeing much C++ code in Drill... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] paul-rogers commented on a change in pull request #2499: DRILL-8117: Clean up deprecated Apache code in Drill
paul-rogers commented on a change in pull request #2499: URL: https://github.com/apache/drill/pull/2499#discussion_r833963288 ## File path: exec/java-exec/src/test/java/org/apache/drill/exec/impersonation/TestInboundImpersonation.java ## @@ -156,22 +159,25 @@ public void unauthorizedTarget() throws Exception { @Test public void invalidPolicy() throws Exception { -thrownException.expect(new UserExceptionMatcher(UserBitShared.DrillPBError.ErrorType.VALIDATION, -"Invalid impersonation policies.")); +String query = "ALTER SYSTEM SET `%s`='%s'"; Review comment: Just picking one test file at random, here's an example of the idea: https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/exec/compile/TestLargeFileCompilation.java#L273 It is not about the exception handling, which is fine. It's just about the busy-work of formatting the `ALTER SYSTEM SET` commands over and over. Again, without looking closely, I don't know if this test happens to use one of the two clients that have these methods. If not, then you have to do things the tedious way. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] paul-rogers commented on a change in pull request #2485: DRILL-8086: Convert the CSV (AKA "compliant text") reader to EVF V2
paul-rogers commented on a change in pull request #2485: URL: https://github.com/apache/drill/pull/2485#discussion_r833958591 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/TextFormatPlugin.java ## @@ -81,9 +68,8 @@ * to allow tight control of the size of produced batches (as well * as to support provided schema.) */ -public class TextFormatPlugin extends EasyFormatPlugin { - - private final static String PLUGIN_NAME = "text"; +public class TextFormatPlugin extends EasyFormatPlugin { + final static String PLUGIN_NAME = "text"; Review comment: On the `field_x` vs `columns[]` issue, that part is possible in the current code, as an option. There is a "columns" class in EVF that handles the `columns[]` column. I suspect that a parallel "fields" version could be created that generates the numbered fields instead of array indexes. An option on the plugin would choose one or the other for the no-headers case. In fact, with this approach, the individual `field_n` columns could be typed, based on first-row type inference: something not possible with `columns[]`. Anyone up for giving it a shot? I'd be happy to offer pointers along the way. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] paul-rogers commented on pull request #2485: DRILL-8086: Convert the CSV (AKA "compliant text") reader to EVF V2
paul-rogers commented on pull request #2485: URL: https://github.com/apache/drill/pull/2485#issuecomment-1077279388 @jnturton you make good points. I somewhat worry that folks who build large distributed systems are a bit under-represented in our current wonderful group of contributors, so I do see a drift toward small-scale concerns. (For example, I've never seen any shop query 1000s of Excel files, but folks do that every day for JSON, Parquet and CSV.) Similarly, the data science folks want Drill do to it all in SQL. But, at scale, people build out data pipelines so that the files which Drill queries are the result of a process to produce a query-optimized format such as well-partitioned Parquet. The idea of having two forks (or modes) was an attempt to preserve Drill's distributed system heritage, while also allowing the current contributors to go wild on the data science use cases. Maybe a simpler solution is to just have different "bootstrap" files: the at-scale edition, the data science edition, etc. Each addition includes plugins and defaults optimized for the target use case. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [drill] paul-rogers commented on a change in pull request #2485: DRILL-8086: Convert the CSV (AKA "compliant text") reader to EVF V2
paul-rogers commented on a change in pull request #2485: URL: https://github.com/apache/drill/pull/2485#discussion_r833941830 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/TextFormatPlugin.java ## @@ -81,9 +68,8 @@ * to allow tight control of the size of produced batches (as well * as to support provided schema.) */ -public class TextFormatPlugin extends EasyFormatPlugin { - - private final static String PLUGIN_NAME = "text"; +public class TextFormatPlugin extends EasyFormatPlugin { + final static String PLUGIN_NAME = "text"; Review comment: The team can certainly many things in the 2.0 branch. But, this is not a 2.0 PR. The goal of this PR is to preserve functionality and simply upgrade the underlying reader mechanism to use EVF 2. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org