[jira] [Closed] (DRILL-7717) Support Mongo extended types in V2 JSON loader
[ https://issues.apache.org/jira/browse/DRILL-7717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers closed DRILL-7717. -- > Support Mongo extended types in V2 JSON loader > -- > > Key: DRILL-7717 > URL: https://issues.apache.org/jira/browse/DRILL-7717 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.18.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Major > Fix For: 1.18.0 > > > Drill supports Mongo's extended types in the V1 JSON reader. Add similar > support to the V2 version. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (DRILL-7734) Revise the result set reader
[ https://issues.apache.org/jira/browse/DRILL-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17100372#comment-17100372 ] ASF GitHub Bot commented on DRILL-7734: --- paul-rogers opened a new pull request #2077: URL: https://github.com/apache/drill/pull/2077 # [DRILL-7734](https://issues.apache.org/jira/browse/DRILL-7734): Revise the result set reader ## Description The "result set reader" uses the column accessors to iterate over rows from multiple batches, similar to how the "result set loader" creates batches in a scan. This PR refactors the code to clarify the two ways that the reader is used. First, the "pull" reader in one operator reads ("pulls") batches from an upstream operator. To implement JSON streaming, we need a second model, a "push" version where a caller provides batches. ## Documentation N/A ## Testing Added tests. Reran all unit tests. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Revise the result set reader > > > Key: DRILL-7734 > URL: https://issues.apache.org/jira/browse/DRILL-7734 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.17.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Major > Fix For: 1.18.0 > > > Updates to the {{ResultSetReader}} abstractions to make them usable in more > cases. -- This message was sent by Atlassian Jira (v8.3.4#803005)
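The difference between the "pull" and "push" models described in the PR above can be illustrated with a small, language-neutral sketch. It is written in Python purely for illustration; Drill itself is Java, and the names `PullReader`, `PushReader`, `rows` and `accept` are hypothetical and are not Drill's `ResultSetReader` API. A batch is modelled as a plain list of row dictionaries.

```python
from typing import Callable, Dict, Iterator, List

# Hypothetical stand-in for a record batch: a list of rows.
Batch = List[Dict[str, object]]


class PullReader:
    """Pull model: the reader itself asks an upstream source for batches."""

    def __init__(self, upstream: Iterator[Batch]) -> None:
        self._upstream = upstream

    def rows(self) -> Iterator[Dict[str, object]]:
        # The reader drives the loop and fetches each batch when it needs one.
        for batch in self._upstream:
            yield from batch


class PushReader:
    """Push model: a caller hands batches to the reader as they arrive."""

    def __init__(self, on_row: Callable[[Dict[str, object]], None]) -> None:
        self._on_row = on_row

    def accept(self, batch: Batch) -> None:
        # The caller drives the loop; the reader only reacts to each batch.
        for row in batch:
            self._on_row(row)
```

In the pull case the reader controls when the next batch is read; in the push case the caller does, which suits a consumer that is invoked once per arriving batch.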
[jira] [Created] (DRILL-7734) Revise the result set reader
Paul Rogers created DRILL-7734: -- Summary: Revise the result set reader Key: DRILL-7734 URL: https://issues.apache.org/jira/browse/DRILL-7734 Project: Apache Drill Issue Type: Improvement Affects Versions: 1.17.0 Reporter: Paul Rogers Assignee: Paul Rogers Fix For: 1.18.0 Updates to the {{ResultSetReader}} abstractions to make them usable in more cases.
[jira] [Commented] (DRILL-7717) Support Mongo extended types in V2 JSON loader
[ https://issues.apache.org/jira/browse/DRILL-7717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17100326#comment-17100326 ] ASF GitHub Bot commented on DRILL-7717: --- paul-rogers commented on pull request #2068: URL: https://github.com/apache/drill/pull/2068#issuecomment-624363287 Rebased on latest master. Squashed commits. > Support Mongo extended types in V2 JSON loader > -- > > Key: DRILL-7717 > URL: https://issues.apache.org/jira/browse/DRILL-7717 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.18.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Major > Fix For: 1.18.0 > > > Drill supports Mongo's extended types in the V1 JSON reader. Add similar > support to the V2 version.
[jira] [Created] (DRILL-7733) Use streaming for REST JSON queries
Paul Rogers created DRILL-7733: -- Summary: Use streaming for REST JSON queries Key: DRILL-7733 URL: https://issues.apache.org/jira/browse/DRILL-7733 Project: Apache Drill Issue Type: Improvement Affects Versions: 1.17.0 Reporter: Paul Rogers Assignee: Paul Rogers Fix For: 1.18.0 Several users on the user and dev mailing lists have complained about the memory overhead when running a REST JSON query: {{http://node:8047/query.json}}. The current implementation buffers the entire result set in memory, then lets Jersey/Jetty convert the results to JSON. The result is very heavy heap use for larger query result sets. This ticket requests a change to use streaming. As each batch arrives at the Screen operator, convert that batch to JSON and stream the results directly to the client network connection, much as is done for the native client connection. For backward compatibility, the form of the JSON must be the same as in the current API.
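The buffering-versus-streaming idea in the ticket above can be sketched as follows. This is a minimal illustration in Python, not Drill's implementation (Drill is Java and uses Jersey/Jetty). The function name `stream_query_json`, the `Batch` type, and the simplified `{"columns": [...], "rows": [...]}` envelope are assumptions made for the example. The point is that each batch is serialized and emitted as it arrives, so only one batch is held in memory at a time, while the concatenated output is still a single valid JSON document.

```python
import json
from typing import Dict, Iterable, Iterator, List

# Hypothetical stand-in for a record batch: a list of rows.
Batch = List[Dict[str, object]]


def stream_query_json(columns: List[str], batches: Iterable[Batch]) -> Iterator[str]:
    """Yield the JSON response piece by piece instead of building it in memory."""
    # Opening of the (assumed, simplified) response envelope.
    yield '{"columns": ' + json.dumps(columns) + ', "rows": ['
    first = True
    for batch in batches:
        for row in batch:
            # Emit a separator before every row except the first one.
            yield ("" if first else ", ") + json.dumps(row)
            first = False
    # Close the rows array and the envelope.
    yield "]}"
```

A caller would write each yielded chunk to the client connection as soon as it is produced, for example `for chunk in stream_query_json(cols, batches): out.write(chunk)`, where `out` is whatever writable stream the server provides.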
[jira] [Resolved] (DRILL-7726) Boost requirement is incorrect
[ https://issues.apache.org/jira/browse/DRILL-7726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laurent Goujon resolved DRILL-7726. --- Fix Version/s: 1.18.0 Resolution: Fixed > Boost requirement is incorrect > -- > > Key: DRILL-7726 > URL: https://issues.apache.org/jira/browse/DRILL-7726 > Project: Apache Drill > Issue Type: Bug > Components: Client - C++ >Reporter: Laurent Goujon >Assignee: Laurent Goujon >Priority: Major > Fix For: 1.18.0 > > > The Drill C++ connector documentation states that Boost 1.53 is required, but because > support for tlsv12 is present, Boost 1.54 is actually required. Trying to > compile with Boost 1.53 results in a compilation error for an undefined > constant.
[jira] [Updated] (DRILL-7698) RDBMS Plugin Not Returning Results from Presto
[ https://issues.apache.org/jira/browse/DRILL-7698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Givre updated DRILL-7698: - Priority: Blocker (was: Major) > RDBMS Plugin Not Returning Results from Presto > -- > > Key: DRILL-7698 > URL: https://issues.apache.org/jira/browse/DRILL-7698 > Project: Apache Drill > Issue Type: Bug > Components: Storage - JDBC >Affects Versions: 1.18.0 >Reporter: Charles Givre >Priority: Blocker > Attachments: Screen Shot 2020-04-12 at 2.43.33 PM.png, Screen Shot > 2020-04-12 at 2.56.10 PM.png, Screen Shot 2020-04-12 at 3.00.00 PM.png, > Screen Shot 2020-04-12 at 3.01.37 PM.png > > > Using the RDBMS storage plugin, Drill is unable to connect to Presto. More > specifically, Drill seems to be connecting and sending queries to Presto, but > then nothing happens with the query results. > I verified the configuration using DBeaver and was able to successfully > query Presto. See screenshot below for config. > !Screen Shot 2020-04-12 at 2.43.33 PM.png! > Presto ships with a few sample databases as shown below, and these should be > visible in Drill but are not. > !Screen Shot 2020-04-12 at 2.56.10 PM.png! > From the logs below, Presto is clearly receiving the queries from Drill, and > the queries are returning results, but Drill seems to be dropping the > results. While querying Presto from Drill may seem like a silly exercise, > the fact that it didn't work makes me think we may have a bug in the > JDBC Storage Plugin. > !Screen Shot 2020-04-12 at 3.01.37 PM.png! > !Screen Shot 2020-04-12 at 3.00.00 PM.png!
> h2. Steps to Reproduce > 1. Download and start a Docker container with Presto. > 2. Download the Presto JDBC driver (https://prestodb.io/download.html) and copy > it to the Drill classpath. > 3. Create an RDBMS storage plugin instance using the default config below: > {code:java} > { > "type": "jdbc", > "driver": "io.prestosql.jdbc.PrestoDriver", > "url": "jdbc:presto://localhost:8080/tpch/sf1", > "username": "user", > "password": null, > "caseInsensitiveTableNames": true, > "sourceParameters": {}, > "enabled": true > } > {code} > 4. Execute a SHOW DATABASES query and you will see that no Presto-related > results are returned. Various queries to the INFORMATION_SCHEMA reveal the > same thing.
[jira] [Updated] (DRILL-7732) Group by on UINT_32 Parquet field causing: SYSTEM ERROR: CompileException
[ https://issues.apache.org/jira/browse/DRILL-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Challis updated DRILL-7732: Description: When executing the following query in Drill (via the web UI): {noformat} SELECT P.portfolio_id FROM dfs.`/data/portfolio.parquet` AS P GROUP BY P.portfolio_id {noformat} the query fails with the following error message: {noformat} org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: CompileException: Line 61, Column 30: "value" is neither a method, a field, nor a member class of "org.apache.drill.exec.vector.UntypedNullHolder" Fragment 0:0 Please, refer to logs for more information. [Error Id: be0b9712-c9c6-4733-9d17-5c370d5dfce0 on e531a6492cf4:31010] {noformat} The schema for the Parquet file in question is: {noformat} $ parquet-tool schema portfolio.parquet message parquet_go_root { required int32 portfolio_id (UINT_32) = 0; required binary name (UTF8) = 0; optional binary meta (UTF8) = 0; } {noformat} The contents of the file is a single row: {noformat} $ parquet-tools cat portfolio.parquet portfolio_id = 0 name = Bose Corporation meta = null {noformat} I've attached this file to this ticket, and also included some Drill logs around the query, in case it helps. was: When executing the following query in Drill (via the web UI): {noformat} SELECT P.portfolio_id FROM dfs.`/data/portfolio.parquet` AS P GROUP BY P.portfolio_id {noformat} the query fails with the following error message: {noformat} org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: CompileException: Line 61, Column 30: "value" is neither a method, a field, nor a member class of "org.apache.drill.exec.vector.UntypedNullHolder" Fragment 0:0 Please, refer to logs for more information. 
[Error Id: be0b9712-c9c6-4733-9d17-5c370d5dfce0 on e531a6492cf4:31010] {noformat} The schema for the Parquet file in question is: {noformat} $ parquet-tool schema portfolio.parquet message parquet_go_root { required int32 portfolio_id (UINT_32) = 0; required binary name (UTF8) = 0; optional binary meta (UTF8) = 0; } {noformat} The contents of the file is a single row: {noformat} $ parquet-tools cat portfolio.parquet portfolio_id = 0 name = Bose Corporation meta = null {noformat} I've attached this file to this ticket, and also included some Drill log about around the query, in case it helps. > Group by on UINT_32 Parquet field causing: SYSTEM ERROR: CompileException > - > > Key: DRILL-7732 > URL: https://issues.apache.org/jira/browse/DRILL-7732 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.17.0 > Environment: AWS Linux instance, 64Gb RAM. >Reporter: Dave Challis >Priority: Major > Attachments: drill.log, portfolio.parquet > > > When executing the following query in Drill (via the web UI): > {noformat} > SELECT P.portfolio_id > FROM dfs.`/data/portfolio.parquet` AS P > GROUP BY P.portfolio_id > {noformat} > the query fails with the following error message: > {noformat} > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > CompileException: Line 61, Column 30: "value" is neither a method, a field, > nor a member class of "org.apache.drill.exec.vector.UntypedNullHolder" > Fragment 0:0 Please, refer to logs for more information. 
> [Error Id: be0b9712-c9c6-4733-9d17-5c370d5dfce0 on e531a6492cf4:31010] > {noformat} > > The schema for the Parquet file in question is: > {noformat} > $ parquet-tool schema portfolio.parquet > message parquet_go_root { > required int32 portfolio_id (UINT_32) = 0; > required binary name (UTF8) = 0; > optional binary meta (UTF8) = 0; > } > {noformat} > The contents of the file is a single row: > {noformat} > $ parquet-tools cat portfolio.parquet > portfolio_id = 0 > name = Bose Corporation > meta = null > {noformat} > I've attached this file to this ticket, and also included some Drill logs > around the query, in case it helps. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (DRILL-7732) Group by on UINT_32 Parquet field causing: SYSTEM ERROR: CompileException
[ https://issues.apache.org/jira/browse/DRILL-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Challis updated DRILL-7732: Summary: Group by on UINT_32 Parquet field causing: SYSTEM ERROR: CompileException (was: Group by on Parquet field causing: SYSTEM ERROR: CompileException) > Group by on UINT_32 Parquet field causing: SYSTEM ERROR: CompileException > - > > Key: DRILL-7732 > URL: https://issues.apache.org/jira/browse/DRILL-7732 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.17.0 > Environment: AWS Linux instance, 64Gb RAM. >Reporter: Dave Challis >Priority: Major > Attachments: drill.log, portfolio.parquet > > > When executing the following query in Drill (via the web UI): > {noformat} > SELECT P.portfolio_id > FROM dfs.`/data/portfolio.parquet` AS P > GROUP BY P.portfolio_id > {noformat} > the query fails with the following error message: > {noformat} > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > CompileException: Line 61, Column 30: "value" is neither a method, a field, > nor a member class of "org.apache.drill.exec.vector.UntypedNullHolder" > Fragment 0:0 Please, refer to logs for more information. > [Error Id: be0b9712-c9c6-4733-9d17-5c370d5dfce0 on e531a6492cf4:31010] > {noformat} > > The schema for the Parquet file in question is: > {noformat} > $ parquet-tool schema portfolio.parquet > message parquet_go_root { > required int32 portfolio_id (UINT_32) = 0; > required binary name (UTF8) = 0; > optional binary meta (UTF8) = 0; > } > {noformat} > The contents of the file is a single row: > {noformat} > $ parquet-tools cat portfolio.parquet > portfolio_id = 0 > name = Bose Corporation > meta = null > {noformat} > I've attached this file to this ticket, and also included some Drill log > about around the query, in case it helps. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (DRILL-7732) Group by on Parquet field causing: SYSTEM ERROR: CompileException
[ https://issues.apache.org/jira/browse/DRILL-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Challis updated DRILL-7732: Description: When executing the following query in Drill (via the web UI): {noformat} SELECT P.portfolio_id FROM dfs.`/data/portfolio.parquet` AS P GROUP BY P.portfolio_id {noformat} the query fails with the following error message: {noformat} org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: CompileException: Line 61, Column 30: "value" is neither a method, a field, nor a member class of "org.apache.drill.exec.vector.UntypedNullHolder" Fragment 0:0 Please, refer to logs for more information. [Error Id: be0b9712-c9c6-4733-9d17-5c370d5dfce0 on e531a6492cf4:31010] {noformat} The schema for the Parquet file in question is: {noformat} $ parquet-tool schema portfolio.parquet message parquet_go_root { required int32 portfolio_id (UINT_32) = 0; required binary name (UTF8) = 0; optional binary meta (UTF8) = 0; } {noformat} The contents of the file is a single row: {noformat} $ parquet-tools cat portfolio.parquet portfolio_id = 0 name = Bose Corporation meta = null {noformat} I've attached this file to this ticket, and also included some Drill log about around the query, in case it helps. was: When executing the following query in Drill (via the web UI): {noformat} SELECT P.portfolio_id FROM dfs.`/data/portfolio.parquet` AS P GROUP BY P.portfolio_id {noformat} the query fails with the following error message: {noformat} org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: CompileException: Line 61, Column 30: "value" is neither a method, a field, nor a member class of "org.apache.drill.exec.vector.UntypedNullHolder" Fragment 0:0 Please, refer to logs for more information. 
[Error Id: be0b9712-c9c6-4733-9d17-5c370d5dfce0 on e531a6492cf4:31010] {noformat} The schema for the Parquet file in question is: {noformat} $ parquet-tool schema portfolio.parquet message parquet_go_root { required int32 portfolio_id (UINT_32) = 0; required binary name (UTF8) = 0; optional binary meta (UTF8) = 0; } {noformat} The contents of the file is a single row: {noformat} $ parquet-tools cat portfolio.parquet portfolio_id = 0 name = Bose Corporation meta = null {noformat} I've attached this file to this ticket. > Group by on Parquet field causing: SYSTEM ERROR: CompileException > - > > Key: DRILL-7732 > URL: https://issues.apache.org/jira/browse/DRILL-7732 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.17.0 > Environment: AWS Linux instance, 64Gb RAM. >Reporter: Dave Challis >Priority: Major > Attachments: drill.log, portfolio.parquet > > > When executing the following query in Drill (via the web UI): > {noformat} > SELECT P.portfolio_id > FROM dfs.`/data/portfolio.parquet` AS P > GROUP BY P.portfolio_id > {noformat} > the query fails with the following error message: > {noformat} > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > CompileException: Line 61, Column 30: "value" is neither a method, a field, > nor a member class of "org.apache.drill.exec.vector.UntypedNullHolder" > Fragment 0:0 Please, refer to logs for more information. 
> [Error Id: be0b9712-c9c6-4733-9d17-5c370d5dfce0 on e531a6492cf4:31010] > {noformat} > > The schema for the Parquet file in question is: > {noformat} > $ parquet-tool schema portfolio.parquet > message parquet_go_root { > required int32 portfolio_id (UINT_32) = 0; > required binary name (UTF8) = 0; > optional binary meta (UTF8) = 0; > } > {noformat} > The contents of the file is a single row: > {noformat} > $ parquet-tools cat portfolio.parquet > portfolio_id = 0 > name = Bose Corporation > meta = null > {noformat} > I've attached this file to this ticket, and also included some Drill log > about around the query, in case it helps. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (DRILL-7732) Group by on Parquet field causing: SYSTEM ERROR: CompileException
[ https://issues.apache.org/jira/browse/DRILL-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Challis updated DRILL-7732: Attachment: drill.log > Group by on Parquet field causing: SYSTEM ERROR: CompileException > - > > Key: DRILL-7732 > URL: https://issues.apache.org/jira/browse/DRILL-7732 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.17.0 > Environment: AWS Linux instance, 64Gb RAM. >Reporter: Dave Challis >Priority: Major > Attachments: drill.log, portfolio.parquet > > > When executing the following query in Drill (via the web UI): > {noformat} > SELECT P.portfolio_id > FROM dfs.`/data/portfolio.parquet` AS P > GROUP BY P.portfolio_id > {noformat} > the query fails with the following error message: > {noformat} > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > CompileException: Line 61, Column 30: "value" is neither a method, a field, > nor a member class of "org.apache.drill.exec.vector.UntypedNullHolder" > Fragment 0:0 Please, refer to logs for more information. > [Error Id: be0b9712-c9c6-4733-9d17-5c370d5dfce0 on e531a6492cf4:31010] > {noformat} > > The schema for the Parquet file in question is: > {noformat} > $ parquet-tool schema portfolio.parquet > message parquet_go_root { > required int32 portfolio_id (UINT_32) = 0; > required binary name (UTF8) = 0; > optional binary meta (UTF8) = 0; > } > {noformat} > The contents of the file is a single row: > {noformat} > $ parquet-tools cat portfolio.parquet > portfolio_id = 0 > name = Bose Corporation > meta = null > {noformat} > I've attached this file to this ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (DRILL-7732) Group by on Parquet field causing: SYSTEM ERROR: CompileException
Dave Challis created DRILL-7732: --- Summary: Group by on Parquet field causing: SYSTEM ERROR: CompileException Key: DRILL-7732 URL: https://issues.apache.org/jira/browse/DRILL-7732 Project: Apache Drill Issue Type: Bug Affects Versions: 1.17.0 Environment: AWS Linux instance, 64Gb RAM. Reporter: Dave Challis Attachments: portfolio.parquet When executing the following query in Drill (via the web UI): {noformat} SELECT P.portfolio_id FROM dfs.`/data/portfolio.parquet` AS P GROUP BY P.portfolio_id {noformat} the query fails with the following error message: {noformat} org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: CompileException: Line 61, Column 30: "value" is neither a method, a field, nor a member class of "org.apache.drill.exec.vector.UntypedNullHolder" Fragment 0:0 Please, refer to logs for more information. [Error Id: be0b9712-c9c6-4733-9d17-5c370d5dfce0 on e531a6492cf4:31010] {noformat} The schema for the Parquet file in question is: {noformat} $ parquet-tool schema portfolio.parquet message parquet_go_root { required int32 portfolio_id (UINT_32) = 0; required binary name (UTF8) = 0; optional binary meta (UTF8) = 0; } {noformat} The contents of the file is a single row: {noformat} $ parquet-tools cat portfolio.parquet portfolio_id = 0 name = Bose Corporation meta = null {noformat} I've attached this file to this ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005)