[jira] [Created] (DRILL-8494) HTTP Caching Not Saving Pages
Charles Givre created DRILL-8494: Summary: HTTP Caching Not Saving Pages Key: DRILL-8494 URL: https://issues.apache.org/jira/browse/DRILL-8494 Project: Apache Drill Issue Type: Bug Components: Storage - HTTP Affects Versions: 1.21.1 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.21.2 A minor bugfix, but the HTTP storage plugin was not actually caching results even when caching was set to true. This bug was introduced in DRILL-8329. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8493) Drill Unable to Read XML Files with Namespaces
Charles Givre created DRILL-8493: Summary: Drill Unable to Read XML Files with Namespaces Key: DRILL-8493 URL: https://issues.apache.org/jira/browse/DRILL-8493 Project: Apache Drill Issue Type: Bug Components: Format - XML Affects Versions: 1.21.1 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.21.2 This is a bug fix whereby Drill ignores all data when an XML file has a namespace. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8474) Add Daffodil Format Plugin
Charles Givre created DRILL-8474: Summary: Add Daffodil Format Plugin Key: DRILL-8474 URL: https://issues.apache.org/jira/browse/DRILL-8474 Project: Apache Drill Issue Type: New Feature Affects Versions: 1.21.1 Reporter: Charles Givre Fix For: 1.22.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8472) Bump Image Metadata Library to Latest Version
Charles Givre created DRILL-8472: Summary: Bump Image Metadata Library to Latest Version Key: DRILL-8472 URL: https://issues.apache.org/jira/browse/DRILL-8472 Project: Apache Drill Issue Type: Task Affects Versions: 1.21.1 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.21.2 Bump Metadata Extractor dependency to latest version. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8471) Bump DeltaLake Driver to Version 3.0.0
Charles Givre created DRILL-8471: Summary: Bump DeltaLake Driver to Version 3.0.0 Key: DRILL-8471 URL: https://issues.apache.org/jira/browse/DRILL-8471 Project: Apache Drill Issue Type: Task Components: Format - DeltaLake Reporter: Charles Givre Bump DeltaLake Driver to Version 3.0.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8470) Bump MongoDB Driver to Latest Version
Charles Givre created DRILL-8470: Summary: Bump MongoDB Driver to Latest Version Key: DRILL-8470 URL: https://issues.apache.org/jira/browse/DRILL-8470 Project: Apache Drill Issue Type: Task Components: Storage - MongoDB Affects Versions: 1.21.1 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.21.2 Bump mongoDB driver to latest version. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8461) Prevent XXE Attacks in XML Format Plugin
Charles Givre created DRILL-8461: Summary: Prevent XXE Attacks in XML Format Plugin Key: DRILL-8461 URL: https://issues.apache.org/jira/browse/DRILL-8461 Project: Apache Drill Issue Type: Bug Components: Format - XML Affects Versions: 1.21.1 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.22.0 Drill's XML reader would allow a maliciously crafted XML file to perform an _XML eXternal Entity injection_ (XXE) attack. This fix disables DTD parsing in the XML format plugin and prevents XXE attacks. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8453) Add XSD Support to XML Reader (Part 1)
Charles Givre created DRILL-8453: Summary: Add XSD Support to XML Reader (Part 1) Key: DRILL-8453 URL: https://issues.apache.org/jira/browse/DRILL-8453 Project: Apache Drill Issue Type: Improvement Components: Format - XML Affects Versions: 1.21.1 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.21.2 This PR is a part of a series to add better support for reading XML data to Drill. One of the main challenges is that XML data does not have a way of inferring data types, nor does it have a way of detecting arrays. The only way to do this really well is to have a schema. Some XML files link a schema definition file to the data. This PR adds the capability for Drill to map XSD schema files into Drill schemas. The current plan is as follows: Part 1 of this PR simply adds the reader but adds no new user detectable functionality. Part 2 will include the actual integration with the XML reader. Part 3 will include the ability to read arrays. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8450) Add Data Type Inference to XML Format Plugin
Charles Givre created DRILL-8450: Summary: Add Data Type Inference to XML Format Plugin Key: DRILL-8450 URL: https://issues.apache.org/jira/browse/DRILL-8450 Project: Apache Drill Issue Type: Improvement Components: Format - XML Affects Versions: 1.21.1 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.22.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8438) Bump YAUAA to 7.19.2
Charles Givre created DRILL-8438: Summary: Bump YAUAA to 7.19.2 Key: DRILL-8438 URL: https://issues.apache.org/jira/browse/DRILL-8438 Project: Apache Drill Issue Type: Task Components: Functions - Drill Reporter: Charles Givre Assignee: Niels Basjes Bump YAUAA to latest version. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8437) Add Header Index Pagination
Charles Givre created DRILL-8437: Summary: Add Header Index Pagination Key: DRILL-8437 URL: https://issues.apache.org/jira/browse/DRILL-8437 Project: Apache Drill Issue Type: Improvement Components: Storage - HTTP Affects Versions: 1.21.1 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.22.0 Some APIs include pagination fields in the HTTP response headers. This PR adds a new pagination method called Header Index which supports that. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8434) Add Median Function
Charles Givre created DRILL-8434: Summary: Add Median Function Key: DRILL-8434 URL: https://issues.apache.org/jira/browse/DRILL-8434 Project: Apache Drill Issue Type: Improvement Components: Functions - Drill Affects Versions: 1.21.1 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.22.0 Adds a median function to Drill. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8433) Add Percent Change UDF to Drill
Charles Givre created DRILL-8433: Summary: Add Percent Change UDF to Drill Key: DRILL-8433 URL: https://issues.apache.org/jira/browse/DRILL-8433 Project: Apache Drill Issue Type: Improvement Components: Functions - Drill Affects Versions: 1.21.1 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.22.0 Adds a function to calculate the percent change between two columns. Doing this without a custom function is cumbersome because you have to include a check for division by zero. -- This message was sent by Atlassian Jira (v8.20.10#820010)
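The division-by-zero check mentioned in DRILL-8433 can be sketched as below. This is only an illustration of the arithmetic, not the UDF's actual implementation: the function name, the argument order (old value first), returning `None` for a zero or missing baseline, and scaling the result to a percentage are all assumptions.

```python
from typing import Optional


def percent_change(old: Optional[float], new: Optional[float]) -> Optional[float]:
    """Hypothetical helper: percent change from `old` to `new`.

    Assumption: a null input or a zero baseline yields None rather than an
    error. The behaviour of Drill's real UDF in these cases is not stated
    in the ticket.
    """
    if old is None or new is None or old == 0:
        return None
    # Divide by the magnitude of the baseline so the sign reflects direction.
    return (new - old) / abs(old) * 100.0
```

Without such a helper, every query has to repeat the zero check inline, which is the cumbersome part the ticket refers to.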
[jira] [Created] (DRILL-8428) ElasticSearch Config Missing Getters
Charles Givre created DRILL-8428: Summary: ElasticSearch Config Missing Getters Key: DRILL-8428 URL: https://issues.apache.org/jira/browse/DRILL-8428 Project: Apache Drill Issue Type: Bug Reporter: Charles Givre Assignee: Charles Givre The ElasticSearch config was missing some getters and as a result, prevented users from setting certain config variables. This PR fixes this. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (DRILL-4223) PIVOT and UNPIVOT to rotate table valued expressions
[ https://issues.apache.org/jira/browse/DRILL-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Givre resolved DRILL-4223. -- Resolution: Fixed Added in Drill 1.21. > PIVOT and UNPIVOT to rotate table valued expressions > > > Key: DRILL-4223 > URL: https://issues.apache.org/jira/browse/DRILL-4223 > Project: Apache Drill > Issue Type: New Feature > Components: Execution - Codegen, SQL Parser >Reporter: Ashwin Aravind >Priority: Major > Fix For: 1.21.0 > > > Capability to PIVOT and UNPIVOT table values expressions which are results of > a SELECT query -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8417) Allow Excel Reader to Ignore Formula Errors
Charles Givre created DRILL-8417: Summary: Allow Excel Reader to Ignore Formula Errors Key: DRILL-8417 URL: https://issues.apache.org/jira/browse/DRILL-8417 Project: Apache Drill Issue Type: Improvement Components: Storage - Excel Affects Versions: 1.21.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.21.1 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8414) Index Paginator Not Working When Provided URL
Charles Givre created DRILL-8414: Summary: Index Paginator Not Working When Provided URL Key: DRILL-8414 URL: https://issues.apache.org/jira/browse/DRILL-8414 Project: Apache Drill Issue Type: Bug Components: Storage - HTTP Affects Versions: 1.21.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.21.1 The index paginator offers two options: one where the API returns an index or offset, and one where it returns a URL. The second was not fully implemented. This PR also adds handling for the case where the API returns a path rather than a full URL; in that case, the path replaces the pre-existing path segments. -- This message was sent by Atlassian Jira (v8.20.10#820010)
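A rough sketch of the URL-versus-path handling described in DRILL-8414, using Python's standard `urllib.parse`. The function name and the example host are hypothetical, and carrying over the query string from the returned path is an assumption; the ticket only says that the path replaces the pre-existing path segments.

```python
from urllib.parse import urlsplit, urlunsplit


def next_page_url(current_url: str, next_value: str) -> str:
    """Hypothetical helper: build the next request URL from an API response."""
    nxt = urlsplit(next_value)
    if nxt.scheme and nxt.netloc:
        # The API returned a complete URL: use it as-is.
        return next_value
    # The API returned only a path: keep the current scheme and host,
    # and replace the path (and, by assumption, the query string).
    cur = urlsplit(current_url)
    return urlunsplit((cur.scheme, cur.netloc, nxt.path, nxt.query, ""))
```

For example, with a current URL of `https://api.example.com/v1/items?page=1` and a returned value of `/v1/items?page=2`, the sketch keeps the host and swaps in the new path and query.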
[jira] [Created] (DRILL-8413) Add DNS Lookup Functions
Charles Givre created DRILL-8413: Summary: Add DNS Lookup Functions Key: DRILL-8413 URL: https://issues.apache.org/jira/browse/DRILL-8413 Project: Apache Drill Issue Type: New Feature Components: Functions - Drill Affects Versions: 1.21.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.22 This PR adds additional DNS lookup functions to Drill: -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8411) GoogleSheets Reader Will Not Read More than 1K Rows
Charles Givre created DRILL-8411: Summary: GoogleSheets Reader Will Not Read More than 1K Rows Key: DRILL-8411 URL: https://issues.apache.org/jira/browse/DRILL-8411 Project: Apache Drill Issue Type: Bug Components: Storage - GoogleSheets Affects Versions: 1.21.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.21.1 The GoogleSheets reader hits the batch limit from the GoogleSheets SDK of 1000 rows and stops. This PR fixes that. It also fixes a minor but annoying issue whereby the GoogleSheets reader determines a column is a date/time, but is then unable to parse it because it is in a non-standard format. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8408) Allow Implicit Casts on Join
Charles Givre created DRILL-8408: Summary: Allow Implicit Casts on Join Key: DRILL-8408 URL: https://issues.apache.org/jira/browse/DRILL-8408 Project: Apache Drill Issue Type: Improvement Components: Execution - Data Types Affects Versions: 1.21.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.21.1 Currently, Drill does not allow implicit casts on joins. With DRILL-8136, this has been significantly improved, and it might make sense to do so. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8407) Add Support for SFTP File Systems
Charles Givre created DRILL-8407: Summary: Add Support for SFTP File Systems Key: DRILL-8407 URL: https://issues.apache.org/jira/browse/DRILL-8407 Project: Apache Drill Issue Type: Improvement Components: Storage - File Affects Versions: 1.20.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: Future Add support for SFTP File Systems. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8402) Add REGEXP_EXTRACT Function
Charles Givre created DRILL-8402: Summary: Add REGEXP_EXTRACT Function Key: DRILL-8402 URL: https://issues.apache.org/jira/browse/DRILL-8402 Project: Apache Drill Issue Type: Improvement Components: Functions - Drill Affects Versions: 1.21.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.21.0 This PR adds two UDFs to Drill: regexp_extract(, ) which returns an array of strings which were captured by capturing groups in the regex. regexp_extract(, , ) returns the text captured by a specific capturing group. -- This message was sent by Atlassian Jira (v8.20.10#820010)
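The two behaviours described in DRILL-8402 (all captured groups versus one specific group) can be illustrated with Python's `re` module. This is only a sketch: the function names, the argument order, the use of the first match only, and the handling of unmatched input are assumptions, and Drill's UDFs use Java regular expressions rather than Python's.

```python
import re
from typing import List, Optional


def regexp_extract_all(text: str, pattern: str) -> List[str]:
    """Hypothetical: return the text of every capturing group of the first match."""
    match = re.search(pattern, text)
    if match is None:
        return []
    # Groups that did not participate in the match are reported as "".
    return [g if g is not None else "" for g in match.groups()]


def regexp_extract_group(text: str, pattern: str, index: int) -> Optional[str]:
    """Hypothetical: return one capturing group (0 = the whole match)."""
    match = re.search(pattern, text)
    if match is None or not 0 <= index <= match.re.groups:
        return None
    return match.group(index)
```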
[jira] [Created] (DRILL-8399) MS Access Reader Misinterprets Data Types
Charles Givre created DRILL-8399: Summary: MS Access Reader Misinterprets Data Types Key: DRILL-8399 URL: https://issues.apache.org/jira/browse/DRILL-8399 Project: Apache Drill Issue Type: Bug Components: Format - MS Access Affects Versions: 1.21.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.21.0 The MS Access reader was assigning certain data types incorrectly, resulting in various errors. This minor PR fixes that. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8395) Add Support for INSERT and Drop Table to GoogleSheets Plugin
Charles Givre created DRILL-8395: Summary: Add Support for INSERT and Drop Table to GoogleSheets Plugin Key: DRILL-8395 URL: https://issues.apache.org/jira/browse/DRILL-8395 Project: Apache Drill Issue Type: Improvement Components: Storage - GoogleSheets Affects Versions: 1.20.3 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.21.0 This PR adds support for INSERT queries which allow a user to append data to an existing GoogleSheets tab. It also: * Adds support for DROP TABLE queries which were not implemented * Modifies CTAS queries so that if a user executes a CTAS query with a file token, Drill will add a new tab to an existing document, but if the user executes a CTAS with a file name, it will create an entirely new document. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8392) Empty Tables Causes Index Out of Bounds Exception on PDF Reader
Charles Givre created DRILL-8392: Summary: Empty Tables Causes Index Out of Bounds Exception on PDF Reader Key: DRILL-8392 URL: https://issues.apache.org/jira/browse/DRILL-8392 Project: Apache Drill Issue Type: Bug Components: Format - PDF Affects Versions: 1.20.3 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.21.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8390) Minor Improvements to PDF Reader
Charles Givre created DRILL-8390: Summary: Minor Improvements to PDF Reader Key: DRILL-8390 URL: https://issues.apache.org/jira/browse/DRILL-8390 Project: Apache Drill Issue Type: Improvement Components: Format - PDF Reporter: Charles Givre Assignee: Charles Givre This PR makes some minor improvements to the PDF reader including: * Fixes a minor bug where, in certain configurations, the first row of data was skipped * Fixes a minor bug where empty tables caused crashes when the spreadsheet extraction algorithm was used * Adds a table_count metadata field * Adds a table_index metadata field to reflect the current table. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8387) Add Support for User Translation to ElasticSearch Plugin
Charles Givre created DRILL-8387: Summary: Add Support for User Translation to ElasticSearch Plugin Key: DRILL-8387 URL: https://issues.apache.org/jira/browse/DRILL-8387 Project: Apache Drill Issue Type: Improvement Components: Storage - ElasticSearch Affects Versions: 1.20.3 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.21.0 Add support for user translation to ElasticSearch. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8386) Add Support for User Translation for Cassandra
Charles Givre created DRILL-8386: Summary: Add Support for User Translation for Cassandra Key: DRILL-8386 URL: https://issues.apache.org/jira/browse/DRILL-8386 Project: Apache Drill Issue Type: Improvement Components: Storage - Cassandra Affects Versions: 1.20.3 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.21.0 Adds support for user translation to the Cassandra plugin. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8384) Add Format Plugin for Microsoft Access
Charles Givre created DRILL-8384: Summary: Add Format Plugin for Microsoft Access Key: DRILL-8384 URL: https://issues.apache.org/jira/browse/DRILL-8384 Project: Apache Drill Issue Type: Improvement Components: Format - MS Access Affects Versions: 1.21.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.21.0 Shockingly, MS Access is still in widespread use. This plugin enables Drill to read MS Access files. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8376) Add Distribution UDFs
Charles Givre created DRILL-8376: Summary: Add Distribution UDFs Key: DRILL-8376 URL: https://issues.apache.org/jira/browse/DRILL-8376 Project: Apache Drill Issue Type: Improvement Components: Functions - Drill Affects Versions: 1.21 Reporter: Charles Givre Assignee: Charles Givre Add `width_bucket`, `pearson_correlation` and `kendall_correlation` to Drill -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (DRILL-8198) XML EVF2 reader provideSchema usage
[ https://issues.apache.org/jira/browse/DRILL-8198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Givre resolved DRILL-8198. -- Resolution: Fixed > XML EVF2 reader provideSchema usage > --- > > Key: DRILL-8198 > URL: https://issues.apache.org/jira/browse/DRILL-8198 > Project: Apache Drill > Issue Type: Sub-task > Components: Storage - XML >Affects Versions: 1.20.0 >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka >Priority: Major > Fix For: 2.0.0 > > > XMLBatchReader is converted to EVF2 reader, but not used provideSchema for > Schema Provision feature -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8371) Add Write/Append Capability to Splunk Plugin
Charles Givre created DRILL-8371: Summary: Add Write/Append Capability to Splunk Plugin Key: DRILL-8371 URL: https://issues.apache.org/jira/browse/DRILL-8371 Project: Apache Drill Issue Type: Improvement Components: Storage - Splunk Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 While Drill can currently read from Splunk indexes, it cannot write to them or create them. This proposed PR adds support for CTAS queries for Splunk as well as INSERT and DROP TABLE. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8365) HTTP Plugin Places Parameters in Wrong Place
Charles Givre created DRILL-8365: Summary: HTTP Plugin Places Parameters in Wrong Place Key: DRILL-8365 URL: https://issues.apache.org/jira/browse/DRILL-8365 Project: Apache Drill Issue Type: Bug Components: Storage - HTTP Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.20.3 When the requireTail option is set to true, and pagination is enabled, the HTTP plugin puts the required parameters in the wrong place in the URL. This PR fixes that. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8364) Add Support for OAuth Enabled File Systems
Charles Givre created DRILL-8364: Summary: Add Support for OAuth Enabled File Systems Key: DRILL-8364 URL: https://issues.apache.org/jira/browse/DRILL-8364 Project: Apache Drill Issue Type: Improvement Components: Storage - File Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 Currently Drill supports reading from file systems such as HDFS, S3 and others that use token based authentication. This PR extends Drill's plugin architecture so that Drill can connect with other file systems which use OAuth 2.0 for authentication. This PR also adds support for Drill to query Box. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8360) Add Provided Schema for XML Reader
Charles Givre created DRILL-8360: Summary: Add Provided Schema for XML Reader Key: DRILL-8360 URL: https://issues.apache.org/jira/browse/DRILL-8360 Project: Apache Drill Issue Type: Improvement Components: Format - XML Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 The XML reader does not support provisioned schema. This PR adds that support. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8356) Add File Name to GoogleSheets Plugin
Charles Givre created DRILL-8356: Summary: Add File Name to GoogleSheets Plugin Key: DRILL-8356 URL: https://issues.apache.org/jira/browse/DRILL-8356 Project: Apache Drill Issue Type: Improvement Components: Storage - GoogleSheets Affects Versions: 2.0.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 GoogleSheets uses tokens to identify the individual files. These tokens are not human readable and will make it difficult for a user to know which file they are accessing. This PR adds a metadata field called `_title` which identifies the document they are working with. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8354) Add IS_EMPTY Function.
Charles Givre created DRILL-8354: Summary: Add IS_EMPTY Function. Key: DRILL-8354 URL: https://issues.apache.org/jira/browse/DRILL-8354 Project: Apache Drill Issue Type: Improvement Components: Functions - Drill Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 When analyzing data, there is currently no single function to evaluate whether a given field is empty. With scalar fields, this can be accomplished with the `IS NOT NULL` operator, but with complex fields, this is more challenging as complex fields are never null. This PR adds a UDF called IS_EMPTY() which accepts any type of field and returns true if the field does not contain data. For scalar fields, the function returns true if the field is `null`. Complex fields can never be `null`: for lists, the function returns true if the list is empty; for maps, it returns true if all of the map's fields are unpopulated. -- This message was sent by Atlassian Jira (v8.20.10#820010)
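The rules in DRILL-8354 can be modelled roughly as follows, with Python's `None`, `list` and `dict` standing in for Drill's null, list and map types. This is a sketch of the described semantics only. Checking map fields recursively and treating an empty string as non-empty are assumptions that the ticket does not confirm.

```python
from typing import Any


def is_empty(value: Any) -> bool:
    """Hypothetical model of IS_EMPTY() on Python stand-ins for Drill types."""
    if value is None:
        # Scalar case: a null field is empty.
        return True
    if isinstance(value, (list, tuple)):
        # List case: empty when it has no elements.
        return len(value) == 0
    if isinstance(value, dict):
        # Map case: empty when every field is itself unpopulated
        # (recursion into nested values is an assumption).
        return all(is_empty(v) for v in value.values())
    # Any other non-null scalar counts as populated (assumption).
    return False
```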
[jira] [Created] (DRILL-8350) Convert PCAP Format Plugin to EVF2
Charles Givre created DRILL-8350: Summary: Convert PCAP Format Plugin to EVF2 Key: DRILL-8350 URL: https://issues.apache.org/jira/browse/DRILL-8350 Project: Apache Drill Issue Type: Task Components: Format - PCAP Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 Convert the PCAP format plugin to EVF2 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8349) GoogleSheets Not Registering Schemas with Non Default Name
Charles Givre created DRILL-8349: Summary: GoogleSheets Not Registering Schemas with Non Default Name Key: DRILL-8349 URL: https://issues.apache.org/jira/browse/DRILL-8349 Project: Apache Drill Issue Type: Bug Components: Storage - GoogleSheets Affects Versions: 2.0.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 GoogleSheets plugin fails to register plugin instances with names other than `GoogleSheets`. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8342) Add Automatic Retry for Rate Limited APIs
Charles Givre created DRILL-8342: Summary: Add Automatic Retry for Rate Limited APIs Key: DRILL-8342 URL: https://issues.apache.org/jira/browse/DRILL-8342 Project: Apache Drill Issue Type: Improvement Components: Storage - HTTP Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 Many APIs have a burst limit on the number of requests. This PR adds a retry capability to the HTTP Storage Plugin, whereby if a 429 response code is received, Drill will wait a configurable amount of time and retry the request once. To prevent runaway pagination, this retry will only happen once per request. This PR adds a new configuration option called retryDelay, which is the number of milliseconds that Drill should wait between retries. -- This message was sent by Atlassian Jira (v8.20.10#820010)
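The retry-once behaviour in DRILL-8342 can be sketched as below. This is not the plugin's code: `send` is a hypothetical callable standing in for whatever issues the HTTP request, and only the 429 status code, the single retry, and a delay in milliseconds (the `retryDelay` option) come from the ticket.

```python
import time
from typing import Callable, Tuple


def fetch_with_single_retry(
    send: Callable[[], Tuple[int, str]],
    retry_delay_ms: int,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Hypothetical helper: call `send`, retrying once after a 429 response.

    `send` returns a (status_code, body) pair. `sleep` is injectable so the
    wait can be replaced in tests.
    """
    status, body = send()
    if status == 429:
        # Wait the configured delay (milliseconds), then retry exactly once.
        sleep(retry_delay_ms / 1000.0)
        status, body = send()
    return status, body
```

If the second attempt is also rate limited, the 429 is returned to the caller rather than retried again, matching the once-per-request limit described above.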
[jira] [Created] (DRILL-8341) Add Scanned Plugin List to Sys Profiles Table
Charles Givre created DRILL-8341: Summary: Add Scanned Plugin List to Sys Profiles Table Key: DRILL-8341 URL: https://issues.apache.org/jira/browse/DRILL-8341 Project: Apache Drill Issue Type: Improvement Components: Execution - Monitoring Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 In DRILL-8322, [~dzamo] added the list of scanned plugins to the query profiles. This information is extremely useful in query analysis. This minor PR adds this same information to the sys.profiles table. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8340) Add Additional Date Manipulation Functions (Part 1)
Charles Givre created DRILL-8340: Summary: Add Additional Date Manipulation Functions (Part 1) Key: DRILL-8340 URL: https://issues.apache.org/jira/browse/DRILL-8340 Project: Apache Drill Issue Type: Improvement Components: Functions - Drill Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 This PR adds several utility functions to facilitate working with dates and times. These are modeled after the date/time functionality in MySQL. Specifically this adds: * YEARWEEK(): Returns the year and week as an int (e.g., 202002). * TIME_STAMP(): Converts most anything that looks like a date string into a timestamp. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8335) Add Ability to Query GoogleSheets Tabs by Index
Charles Givre created DRILL-8335: Summary: Add Ability to Query GoogleSheets Tabs by Index Key: DRILL-8335 URL: https://issues.apache.org/jira/browse/DRILL-8335 Project: Apache Drill Issue Type: Improvement Components: Storage - GoogleSheets Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 The GoogleSheets plugin does not provide a way for a user to query data if they do not know the available tab names. This adds the ability to query by index of the tabs. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8333) Fix Resource Leaks in HTTP Plugin
Charles Givre created DRILL-8333: Summary: Fix Resource Leaks in HTTP Plugin Key: DRILL-8333 URL: https://issues.apache.org/jira/browse/DRILL-8333 Project: Apache Drill Issue Type: Bug Components: Storage - HTTP Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.20.3 The HTTP plugin has several methods which collect a `ResponseBody` object but do not close these objects. This is causing a resource leak and will cause Drill to fail in the event that queries fire off many API calls. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8330) Convert ESRI Shape File Reader to EVF2
Charles Givre created DRILL-8330: Summary: Convert ESRI Shape File Reader to EVF2 Key: DRILL-8330 URL: https://issues.apache.org/jira/browse/DRILL-8330 Project: Apache Drill Issue Type: Task Components: Format - ESRI Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 Converts the ESRI Shape File reader to EVF V2. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8329) Close HTTP Caching Resources
Charles Givre created DRILL-8329: Summary: Close HTTP Caching Resources Key: DRILL-8329 URL: https://issues.apache.org/jira/browse/DRILL-8329 Project: Apache Drill Issue Type: Bug Components: Storage - HTTP Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.20.3 The HTTP plugin has the ability to cache API responses. However, the storage plugin was not closing the connection to the file cache. This minor PR fixes that. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8328) HTTP UDF Not Resolving Storage Aliases
Charles Givre created DRILL-8328: Summary: HTTP UDF Not Resolving Storage Aliases Key: DRILL-8328 URL: https://issues.apache.org/jira/browse/DRILL-8328 Project: Apache Drill Issue Type: Bug Components: Storage - HTTP Affects Versions: 1.20.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.20.3 The http_request function currently does not resolve plugin aliases correctly. This PR fixes that issue. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8327) GoogleSheets not Reporting Schemata to Info_Schema
Charles Givre created DRILL-8327: Summary: GoogleSheets not Reporting Schemata to Info_Schema Key: DRILL-8327 URL: https://issues.apache.org/jira/browse/DRILL-8327 Project: Apache Drill Issue Type: Bug Components: Storage - GoogleSheets Affects Versions: 2.0.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 The GoogleSheets (GS) plugin was not reporting the available documents to the info schema. This PR makes some modifications so that users can determine which documents are available via the information schema. The GS plugin does not report the tabs as tables to the information schema because that can cause Drill to exceed Google's rate quota. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8325) Convert PDF Format Plugin to EVF V2
Charles Givre created DRILL-8325: Summary: Convert PDF Format Plugin to EVF V2 Key: DRILL-8325 URL: https://issues.apache.org/jira/browse/DRILL-8325 Project: Apache Drill Issue Type: Task Components: Format - PDF Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 Converts the PDF Format Reader to EVF V2. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8320) Prevent Infinite Pagination for Index Paginator
Charles Givre created DRILL-8320: Summary: Prevent Infinite Pagination for Index Paginator Key: DRILL-8320 URL: https://issues.apache.org/jira/browse/DRILL-8320 Project: Apache Drill Issue Type: Bug Components: Storage - HTTP Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 In some cases that use keyset/index pagination, if the API does not have a boolean column that indicates when to stop, Drill will send requests until the API stops returning data. This PR fixes this by making the boolean parameter optional. If that parameter is not present, pagination ends when the index result is blank or the same as in the previous request. Note that if the pagination parameters are buried in nested objects, this cannot be configured with a dataPath. If the user uses a dataPath, pagination will stop at the first page. -- This message was sent by Atlassian Jira (v8.20.10#820010)
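The stopping rule in DRILL-8320 can be sketched as a small generator. The names `fetch`, `index_key` and `has_more_key` are hypothetical stand-ins for the plugin's configuration, and pages are modelled as plain dictionaries. Only the rule itself is taken from the ticket: the boolean field is optional, and without it pagination ends on a blank or repeated index.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def paginate(
    fetch: Callable[[Optional[str]], Dict[str, Any]],
    index_key: str = "next",
    has_more_key: str = "has_more",
) -> Iterator[Dict[str, Any]]:
    """Hypothetical sketch: yield pages until the API signals the end."""
    index: Optional[str] = None
    while True:
        page = fetch(index)
        yield page
        next_index = page.get(index_key)
        if has_more_key in page:
            # The optional boolean is present: it decides when to stop.
            if not page[has_more_key]:
                return
        elif next_index in (None, "") or next_index == index:
            # No boolean: stop on a blank index or one equal to the
            # index used for the previous request.
            return
        index = next_index
```

A caller would pass a `fetch` function that performs one request for the given index and returns the decoded response.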
[jira] [Resolved] (DRILL-8317) Convert LogRegex Format Plugin to EVF V2
[ https://issues.apache.org/jira/browse/DRILL-8317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Givre resolved DRILL-8317. -- Resolution: Done > Convert LogRegex Format Plugin to EVF V2 > > > Key: DRILL-8317 > URL: https://issues.apache.org/jira/browse/DRILL-8317 > Project: Apache Drill > Issue Type: Task > Components: Format - Log Reader >Affects Versions: 1.20.2 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > Fix For: 2.0.0 > > > Converts the existing logRegex reader to EVF V2. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8317) Convert LogRegex Format Plugin to EVF V2
Charles Givre created DRILL-8317: Summary: Convert LogRegex Format Plugin to EVF V2 Key: DRILL-8317 URL: https://issues.apache.org/jira/browse/DRILL-8317 Project: Apache Drill Issue Type: Task Components: Format - Log Reader Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 Converts the existing logRegex reader to EVF V2. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8316) Convert Druid Storage Plugin to EVF & V2 JSON Reader
Charles Givre created DRILL-8316: Summary: Convert Druid Storage Plugin to EVF & V2 JSON Reader Key: DRILL-8316 URL: https://issues.apache.org/jira/browse/DRILL-8316 Project: Apache Drill Issue Type: Improvement Components: Storage - Druid Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8315) Convert SAS Format Plugin to EVF V2
Charles Givre created DRILL-8315: Summary: Convert SAS Format Plugin to EVF V2 Key: DRILL-8315 URL: https://issues.apache.org/jira/browse/DRILL-8315 Project: Apache Drill Issue Type: Improvement Components: Format - SAS Affects Versions: 1.20.2, 1.20.1 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 Convert the SAS Format Plugin to EVF V2. No user facing changes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8312) Convert Format Plugins to EVF V2
Charles Givre created DRILL-8312: Summary: Convert Format Plugins to EVF V2 Key: DRILL-8312 URL: https://issues.apache.org/jira/browse/DRILL-8312 Project: Apache Drill Issue Type: Improvement Affects Versions: 1.20.2 Reporter: Charles Givre Fix For: 2.0.0 This is a blanket ticket to convert all format plugins to EVF V2. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (DRILL-8159) Upgrade HTTPD reader to use EVF V2
[ https://issues.apache.org/jira/browse/DRILL-8159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Givre resolved DRILL-8159. -- Resolution: Done > Upgrade HTTPD reader to use EVF V2 > -- > > Key: DRILL-8159 > URL: https://issues.apache.org/jira/browse/DRILL-8159 > Project: Apache Drill > Issue Type: New Feature >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Major > > Continuation of work originally in the DRILL-8085 PR. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8311) Convert SPSS Format Plugin to EVF V2
Charles Givre created DRILL-8311: Summary: Convert SPSS Format Plugin to EVF V2 Key: DRILL-8311 URL: https://issues.apache.org/jira/browse/DRILL-8311 Project: Apache Drill Issue Type: Improvement Components: Storage - SPSS Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 This PR converts the SPSS format plugin to use EVF V2. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8310) Convert Syslog Format to EVF V2
Charles Givre created DRILL-8310: Summary: Convert Syslog Format to EVF V2 Key: DRILL-8310 URL: https://issues.apache.org/jira/browse/DRILL-8310 Project: Apache Drill Issue Type: Improvement Components: Storage - Syslog Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 This PR proposes to convert the syslog to use EVF V2. No user facing changes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (DRILL-8289) Add Threat Hunting Functions
[ https://issues.apache.org/jira/browse/DRILL-8289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Givre resolved DRILL-8289. -- Resolution: Done > Add Threat Hunting Functions > > > Key: DRILL-8289 > URL: https://issues.apache.org/jira/browse/DRILL-8289 > Project: Apache Drill > Issue Type: New Feature > Components: Functions - Drill >Affects Versions: 2.0.0 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > Fix For: 2.0.0 > > > # Threat Hunting Functions > These functions are useful for doing threat hunting with Apache Drill. These > were inspired by huntlib.[1] > The functions are: > * `punctuation_pattern()`: Extracts the pattern of punctuation in > text. > * `entropy()`: This function calculates the Shannon Entropy of a > given string of text. > * `entropyPerByte()`: This function calculates the Shannon Entropy of > a given string of text, normed for the string length. > [1]: https://github.com/target/huntlib -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8305) Add Implicit Fields to Google Sheets Reader
Charles Givre created DRILL-8305: Summary: Add Implicit Fields to Google Sheets Reader Key: DRILL-8305 URL: https://issues.apache.org/jira/browse/DRILL-8305 Project: Apache Drill Issue Type: Improvement Components: Storage - GoogleSheets Affects Versions: 2.0.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 GoogleSheets needs additional metadata fields to access the available data. This PR adds a framework for implicit metadata fields. It also adds the _sheets field, which lists the available tabs within a Google Sheets document. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8291) Allow case sensitive Filters in HTTP Plugin
Charles Givre created DRILL-8291: Summary: Allow case sensitive Filters in HTTP Plugin Key: DRILL-8291 URL: https://issues.apache.org/jira/browse/DRILL-8291 Project: Apache Drill Issue Type: Bug Components: Storage - HTTP Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.20.3 Some APIs will reject filter pushdowns if they are not in the correct case. This PR adds a config option `caseSensitiveFilters` to the API config and when set to true, preserves the case of the filters pushed down. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8289) Add Threat Hunting Functions
Charles Givre created DRILL-8289: Summary: Add Threat Hunting Functions Key: DRILL-8289 URL: https://issues.apache.org/jira/browse/DRILL-8289 Project: Apache Drill Issue Type: New Feature Components: Functions - Drill Affects Versions: 2.0.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0

# Threat Hunting Functions

These functions are useful for doing threat hunting with Apache Drill. These were inspired by huntlib.[1]

The functions are:

* `punctuation_pattern()`: Extracts the pattern of punctuation in text.
* `entropy()`: This function calculates the Shannon Entropy of a given string of text.
* `entropyPerByte()`: This function calculates the Shannon Entropy of a given string of text, normed for the string length.

[1]: https://github.com/target/huntlib

-- This message was sent by Atlassian Jira (v8.20.10#820010)
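To make the two entropy functions concrete, here is a minimal Java sketch of Shannon entropy over the characters of a string. It is not the Drill UDF source. The class name is hypothetical, the sketch counts Java `char` values rather than bytes, and it assumes that "normed for the string length" means dividing the entropy by the number of characters.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; not Drill's implementation of entropy() or entropyPerByte().
final class EntropySketch {

  /** Shannon entropy in bits, computed over the characters of the text. */
  static double entropy(String text) {
    if (text == null || text.isEmpty()) {
      return 0.0;
    }
    // Count how often each character occurs.
    Map<Character, Integer> counts = new HashMap<>();
    for (char c : text.toCharArray()) {
      counts.merge(c, 1, Integer::sum);
    }
    // H = -sum(p * log2(p)) over the observed characters.
    double h = 0.0;
    for (int n : counts.values()) {
      double p = (double) n / text.length();
      h -= p * (Math.log(p) / Math.log(2));
    }
    return h;
  }

  /** Entropy divided by the string length (assumed meaning of "normed"). */
  static double entropyPerByte(String text) {
    if (text == null || text.isEmpty()) {
      return 0.0;
    }
    return entropy(text) / text.length();
  }
}
```

A string made of a single repeated character has entropy 0, while a string with two equally frequent characters has entropy of 1 bit. Random-looking strings, such as generated domain names, tend to score higher, which is why the measure is useful for hunting.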
[jira] [Created] (DRILL-8288) Null Columns not being Written to GoogleSheets
Charles Givre created DRILL-8288: Summary: Null Columns not being Written to GoogleSheets Key: DRILL-8288 URL: https://issues.apache.org/jira/browse/DRILL-8288 Project: Apache Drill Issue Type: Bug Components: Storage - GoogleSheets Affects Versions: 2.0.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 When writing to GoogleSheets, null columns are not written, which results in incorrect data. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8287) Add Support for Keyset Based Pagination
Charles Givre created DRILL-8287: Summary: Add Support for Keyset Based Pagination Key: DRILL-8287 URL: https://issues.apache.org/jira/browse/DRILL-8287 Project: Apache Drill Issue Type: New Feature Components: Storage - HTTP Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 Some APIs, such as HubSpot, use values in the result set to indicate whether there are additional pages. This PR adds support for this kind of pagination. Note that the current implementation only works for JSON-based APIs. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8286) GoogleSheets StoragePlugin displaying ClientID and ClientSecret in Config
Charles Givre created DRILL-8286: Summary: GoogleSheets StoragePlugin displaying ClientID and ClientSecret in Config Key: DRILL-8286 URL: https://issues.apache.org/jira/browse/DRILL-8286 Project: Apache Drill Issue Type: Bug Reporter: Charles Givre Assignee: Charles Givre The GoogleSheets storage plugin is rendering the `clientID` and `clientSecret` in the config body instead of in the credential provider. This minor PR fixes that. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8276) Add Support for User Translation for Splunk
Charles Givre created DRILL-8276: Summary: Add Support for User Translation for Splunk Key: DRILL-8276 URL: https://issues.apache.org/jira/browse/DRILL-8276 Project: Apache Drill Issue Type: Task Components: Storage - Other Affects Versions: 1.20.2 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 This PR adds support for user translation to Splunk. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8271) Make Storage and Format Config Case Insensitive
Charles Givre created DRILL-8271: Summary: Make Storage and Format Config Case Insensitive Key: DRILL-8271 URL: https://issues.apache.org/jira/browse/DRILL-8271 Project: Apache Drill Issue Type: Task Reporter: Charles Givre -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8270) Delete absolete zookeeper patch (tech debt)
Charles Givre created DRILL-8270: Summary: Delete absolete zookeeper patch (tech debt) Key: DRILL-8270 URL: https://issues.apache.org/jira/browse/DRILL-8270 Project: Apache Drill Issue Type: Task Reporter: Charles Givre -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (DRILL-8244) HTTP_Request Not Passing Down Config Variable
Charles Givre created DRILL-8244: Summary: HTTP_Request Not Passing Down Config Variable Key: DRILL-8244 URL: https://issues.apache.org/jira/browse/DRILL-8244 Project: Apache Drill Issue Type: Bug Components: Storage - Other Affects Versions: 1.20.1 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 The http_request UDF was not passing the provided schema and other config parameters down to the jsonLoader. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (DRILL-8243) Move JSON Config Options Out of HTTP Plugin
Charles Givre created DRILL-8243: Summary: Move JSON Config Options Out of HTTP Plugin Key: DRILL-8243 URL: https://issues.apache.org/jira/browse/DRILL-8243 Project: Apache Drill Issue Type: Improvement Components: Storage - JSON Affects Versions: 1.20.1 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 As part of DRILL-8241, this PR moves the json configuration options out of the HTTP plugin and creates a file which can be used for other plugins that consume JSON data. The idea being that all such plugins, like Druid, ES, Mongo, can set the same JSON options for each plugin instance w/o having to duplicate config code. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (DRILL-8241) Remove Deprecated JSON Reader
Charles Givre created DRILL-8241: Summary: Remove Deprecated JSON Reader Key: DRILL-8241 URL: https://issues.apache.org/jira/browse/DRILL-8241 Project: Apache Drill Issue Type: Improvement Components: Storage - JSON Affects Versions: 1.20.1 Reporter: Charles Givre Fix For: 2.0.0

This is a master ticket to remove the deprecated v1 JSON reader from Drill. This JSON reader is used in several places and removing it will ensure consistent behavior across all data sources. The V2, EVF-based JSON reader has several advantages, including the possibility of schema provisioning, limit pushdowns and others.

Here are the tasks which need to be completed to fully remove the v1 JSON reader.

* Convert the convert_fromJSON functions to V2
* Convert the Druid Storage Plugin to V2
* Convert MongoDB Storage Plugin to V2. (Note the MongoDB plugin uses an EVF-based BSON reader as well as the V1 JSON reader)
* Remove all V1-based unit tests
* Migrate the JsonOptions from the HTTP Storage Plugin to a global location to allow other plugins and users of JSON to set JSON configuration at a more granular level.
* Remove extraneous configuration options.

-- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (DRILL-8239) Convert JSON UDF to EVF
Charles Givre created DRILL-8239: Summary: Convert JSON UDF to EVF Key: DRILL-8239 URL: https://issues.apache.org/jira/browse/DRILL-8239 Project: Apache Drill Issue Type: Improvement Components: Execution - Data Types Affects Versions: 1.20.1 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 In an effort to fully deprecate the old JsonReader, this PR converts the convert_from JSON UDF to EVF. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (DRILL-8235) Add Storage Plugin for Google Sheets
Charles Givre created DRILL-8235: Summary: Add Storage Plugin for Google Sheets Key: DRILL-8235 URL: https://issues.apache.org/jira/browse/DRILL-8235 Project: Apache Drill Issue Type: Improvement Components: Storage - Other Affects Versions: 1.20.1 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 Google Sheets is a very commonly used data source among business users. Presto and other query engines do include integrations with Google Sheets and so it would be useful for Drill to add this functionality. The proposed plugin supports both reading and writing to Google Sheets. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (DRILL-8229) Add Parameter to Skip Malformed Records to HTTP UDF
Charles Givre created DRILL-8229: Summary: Add Parameter to Skip Malformed Records to HTTP UDF Key: DRILL-8229 URL: https://issues.apache.org/jira/browse/DRILL-8229 Project: Apache Drill Issue Type: Improvement Components: Functions - Drill Affects Versions: 1.20.1 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.20.2 The http_get and http_request UDFs were not using the JSON parameter to skip malformed records. This PR fixes that. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (DRILL-8220) Add User Translation Support for OAuth Enabled Plugins
Charles Givre created DRILL-8220: Summary: Add User Translation Support for OAuth Enabled Plugins Key: DRILL-8220 URL: https://issues.apache.org/jira/browse/DRILL-8220 Project: Apache Drill Issue Type: Improvement Components: Storage - Other Affects Versions: 1.20.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 This PR adds support for individual users to provide their own credentials for plugins that use OAuth 2.0 as a means of authorization and authentication. Currently, only the HTTP storage plugin supports OAuth. However, this PR moves some of the core features out of the HTTP plugin so that other plugins can use them. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (DRILL-8217) Credential Resources Page Throws an Error With Empty Lists.
Charles Givre created DRILL-8217: Summary: Credential Resources Page Throws an Error With Empty Lists. Key: DRILL-8217 URL: https://issues.apache.org/jira/browse/DRILL-8217 Project: Apache Drill Issue Type: Bug Components: Web Server Affects Versions: 1.20.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 If a user does not have any plugins enabled with USER_TRANSLATION on, the Credentials page will throw an exception. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (DRILL-8215) Remove SecurityContext from PluginConfigWrapper
Charles Givre created DRILL-8215: Summary: Remove SecurityContext from PluginConfigWrapper Key: DRILL-8215 URL: https://issues.apache.org/jira/browse/DRILL-8215 Project: Apache Drill Issue Type: Bug Components: Web Server Affects Versions: 1.20.0 Reporter: Charles Givre Drill-8155 introduced a bug in the PluginConfigWrapper by including the SecurityContext in it. This seemed to cause SerDe issues. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (DRILL-8207) Fix Username Typo in JDBC SerDe
Charles Givre created DRILL-8207: Summary: Fix Username Typo in JDBC SerDe Key: DRILL-8207 URL: https://issues.apache.org/jira/browse/DRILL-8207 Project: Apache Drill Issue Type: Bug Reporter: Charles Givre Assignee: Charles Givre Fixes SerDe error with default JDBC plugin. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (DRILL-8205) Inline Schema Not Being Passed to HTTP Reader.
Charles Givre created DRILL-8205: Summary: Inline Schema Not Being Passed to HTTP Reader. Key: DRILL-8205 URL: https://issues.apache.org/jira/browse/DRILL-8205 Project: Apache Drill Issue Type: Bug Reporter: Charles Givre -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (DRILL-8204) Allow Provided Schema for HTTP Plugin in JSON Mode
Charles Givre created DRILL-8204: Summary: Allow Provided Schema for HTTP Plugin in JSON Mode Key: DRILL-8204 URL: https://issues.apache.org/jira/browse/DRILL-8204 Project: Apache Drill Issue Type: Improvement Components: Storage - Other Affects Versions: 1.20.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0

One of the challenges of querying APIs is inconsistent data. Drill allows you to provide a schema for individual endpoints. You can do this in one of two ways: either by listing the fields in `providedSchema`, or by providing a serialized TupleMetadata of the desired schema in `jsonSchema`. Providing a serialized TupleMetadata is advanced functionality and should only be used by advanced Drill users.

The schema provisioning currently supports complex types of Arrays and Maps at any nesting level.

### Example Schema Provisioning:

```json
"jsonOptions": {
  "providedSchema": [
    {
      "fieldName": "int_field",
      "fieldType": "bigint"
    }, {
      "fieldName": "jsonField",
      "fieldType": "varchar",
      "properties": {
        "drill.json-mode":"json"
      }
    }, {
      // Array field
      "fieldName": "stringField",
      "fieldType": "varchar",
      "isArray": true
    }, {
      // Map field
      "fieldName": "mapField",
      "fieldType": "map",
      "fields": [
        {
          "fieldName": "nestedField",
          "fieldType": "int"
        }, {
          "fieldName": "nestedField2",
          "fieldType": "varchar"
        }
      ]
    }
  ]
}
```

### Example Provisioning the Schema with a JSON String

```json
"jsonOptions": {
  "jsonSchema": "{\"type\":\"tuple_schema\",\"columns\":[{\"name\":\"outer_map\",\"type\":\"STRUCT<`int_field` BIGINT, `int_array` ARRAY>\",\"mode\":\"REQUIRED\"}]}"
}
```

You can print out a JSON string of a schema with the Java code below.

```java
// Build a schema with two nullable columns.
TupleMetadata schema = new SchemaBuilder()
    .addNullable("a", MinorType.BIGINT)
    .addNullable("m", MinorType.VARCHAR)
    .build();
// Mark column "m" to be read in JSON mode.
ColumnMetadata m = schema.metadata("m");
m.setProperty(JsonLoader.JSON_MODE, JsonLoader.JSON_LITERAL_MODE);
System.out.println(schema.jsonString());
```

This will generate something like the JSON string below:

```json
{
  "type":"tuple_schema",
  "columns":[
    {"name":"a","type":"BIGINT","mode":"OPTIONAL"},
    {"name":"m","type":"VARCHAR","mode":"OPTIONAL","properties":{"drill.json-mode":"json"}}
  ]
}
```

## Dealing With Inconsistent Schemas

One of the major challenges of interacting with JSON data is when the schema is inconsistent. Drill has a `UNION` data type which is marked as experimental. At the time of writing, the HTTP plugin does not support the `UNION` type; however, supplying a schema can solve a lot of those issues.

### Json Mode

Drill offers the option of reading all JSON values as a string. While this can complicate downstream analytics, it can also be a more memory-efficient way of reading data with inconsistent schema. Unfortunately, at the time of writing, JSON mode is only available with a provided schema. However, future work will allow this mode to be enabled for any JSON data.

Enabling JSON Mode:

You can enable JSON mode simply by adding the `drill.json-mode` property with a value of `json` to a field, as shown below:

```json
{
  "fieldName": "jsonField",
  "fieldType": "varchar",
  "properties": {
    "drill.json-mode": "json"
  }
}
```

-- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (DRILL-8202) Add Options to Skip Malformed JSON Records to HTTP Plugin
Charles Givre created DRILL-8202: Summary: Add Options to Skip Malformed JSON Records to HTTP Plugin Key: DRILL-8202 URL: https://issues.apache.org/jira/browse/DRILL-8202 Project: Apache Drill Issue Type: Improvement Components: Storage - Other Affects Versions: 1.20.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 The JSON reader has the possibility of skipping malformed records and documents, but this is a global setting. This PR adds this configuration to the HTTP plugin so that it can be set individually for each endpoint. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (DRILL-8193) Incorrect Annotation used for HttpJsonOptions
Charles Givre created DRILL-8193: Summary: Incorrect Annotation used for HttpJsonOptions Key: DRILL-8193 URL: https://issues.apache.org/jira/browse/DRILL-8193 Project: Apache Drill Issue Type: Bug Components: Storage - Other Affects Versions: 1.20.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (DRILL-8191) HTTP Request Function Not Detecting JSON Config
Charles Givre created DRILL-8191: Summary: HTTP Request Function Not Detecting JSON Config Key: DRILL-8191 URL: https://issues.apache.org/jira/browse/DRILL-8191 Project: Apache Drill Issue Type: Bug Components: Storage - Other Affects Versions: 1.20.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 This is a minor fix. The http_request function was not detecting the input format option and throwing an exception. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (DRILL-8180) Add Icons to Storage Plugin List
Charles Givre created DRILL-8180: Summary: Add Icons to Storage Plugin List Key: DRILL-8180 URL: https://issues.apache.org/jira/browse/DRILL-8180 Project: Apache Drill Issue Type: Task Components: Storage - Other, Web Server Affects Versions: 1.20.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (DRILL-8178) Bump S3 SDK to Lastest Version
Charles Givre created DRILL-8178: Summary: Bump S3 SDK to Lastest Version Key: DRILL-8178 URL: https://issues.apache.org/jira/browse/DRILL-8178 Project: Apache Drill Issue Type: Task Components: Storage - Other Affects Versions: 1.20.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (DRILL-8169) Add UDFs to HTTP Plugin to Facilitate Joins
Charles Givre created DRILL-8169: Summary: Add UDFs to HTTP Plugin to Facilitate Joins Key: DRILL-8169 URL: https://issues.apache.org/jira/browse/DRILL-8169 Project: Apache Drill Issue Type: Improvement Components: Storage - Other Affects Versions: 1.20.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 2.0.0

There are some situations where a user might want to join data with an API result and the pushdowns prevent that from happening. The main situation where this happens is when an API has parameters which are part of the URL AND these parameters are dynamically populated via a join.

In this case, there are two functions, `http_get_url` and `http_get`, which you can use to facilitate these joins.

* `http_get('', )`: This function accepts a storage plugin as input and an optional list of parameters to include in a URL.
* `http_get_url(, )`: This function works in the same way except that it does not pull any configuration information from existing storage plugins.

### Example Queries

Let's say that you have a storage plugin called `github` with an endpoint called `repos` which points to the URL: https://github.com/orgs/{org}/repos. It is easy enough to write a query like this:

```sql
SELECT *
FROM github.repos
WHERE org='apache'
```

However, if you had a file with organizations and wanted to join this with the API, the query would fail. Using the functions listed above you could get this data as follows:

```sql
SELECT http_get('github.repos', `org`)
FROM dfs.`some_data.csvh`
```

or

```sql
SELECT http_get('https://github.com/orgs/{org}/repos', `org`)
FROM dfs.`some_data.csvh`
```

**WARNING: This functionality will execute an HTTP Request FOR EVERY ROW IN YOUR DATA. Use with caution.**

-- This message was sent by Atlassian Jira (v8.20.1#820001)
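The URL parameter substitution that these functions rely on can be pictured with a short Java sketch. This is an illustration under assumptions, not the UDF implementation: the class and method names are hypothetical, placeholders are assumed to be filled positionally in the order the arguments are given, and no URL encoding is applied.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of filling {placeholders} in a URL template.
final class UrlTemplateSketch {

  // Matches a placeholder such as {org}.
  private static final Pattern PLACEHOLDER = Pattern.compile("\\{[^}]+\\}");

  /** Replaces each placeholder, left to right, with the next supplied value. */
  static String fill(String template, String... values) {
    Matcher m = PLACEHOLDER.matcher(template);
    StringBuilder out = new StringBuilder();
    int last = 0;
    int i = 0;
    while (m.find()) {
      if (i >= values.length) {
        throw new IllegalArgumentException("No value supplied for " + m.group());
      }
      // Copy the literal text before the placeholder, then the value.
      out.append(template, last, m.start()).append(values[i++]);
      last = m.end();
    }
    // Copy whatever follows the final placeholder.
    out.append(template, last, template.length());
    return out.toString();
  }
}
```

With the example endpoint, `fill("https://github.com/orgs/{org}/repos", "apache")` yields `https://github.com/orgs/apache/repos`. One such URL is built and requested per input row, which is why the warning about per-row requests matters.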
[jira] [Created] (DRILL-8167) Add JSON Config Options to Format Config
Charles Givre created DRILL-8167: Summary: Add JSON Config Options to Format Config Key: DRILL-8167 URL: https://issues.apache.org/jira/browse/DRILL-8167 Project: Apache Drill Issue Type: Improvement Components: Storage - JSON Affects Versions: 1.20.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: Future Almost all Drill format plugins allow the user to configure various options for that plugin as part of the format config. The one glaring exception is the JSON reader, which has several configuration options that can only be set globally. This PR moves these to the format config so that users can set these options when they configure a storage plugin. This PR does not eliminate the global settings for JSON. It simply adds another place where a user can update the settings. If the settings in the config file are not defined (`null`), Drill will use the global settings. -- This message was sent by Atlassian Jira (v8.20.1#820001)
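The fallback rule in DRILL-8167 (an undefined format-config setting defers to the global setting) amounts to a null check. The sketch below is only an illustration: the class and method names are hypothetical and it models a single boolean option.

```java
// Hypothetical illustration of "format config wins unless it is null".
final class JsonOptionFallback {

  /**
   * @param formatConfigValue value from the format config, or null when not defined
   * @param globalValue       value of the corresponding global option
   */
  static boolean resolve(Boolean formatConfigValue, boolean globalValue) {
    // A defined format-config value overrides the global one; null defers to it.
    return formatConfigValue != null ? formatConfigValue : globalValue;
  }
}
```

Using the boxed `Boolean` type is what allows "not defined" to be told apart from an explicit `false`.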
[jira] [Created] (DRILL-8166) Add List of Supported File Format Extensions
Charles Givre created DRILL-8166: Summary: Add List of Supported File Format Extensions Key: DRILL-8166 URL: https://issues.apache.org/jira/browse/DRILL-8166 Project: Apache Drill Issue Type: Improvement Components: Web Server Affects Versions: 1.20.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: Future Drill does not currently give users a way of knowing what file extensions are supported. This PR adds two REST endpoints which return a list of supported file extensions. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (DRILL-8161) Add Global Credentials to HTTP Storage Plugin
Charles Givre created DRILL-8161: Summary: Add Global Credentials to HTTP Storage Plugin Key: DRILL-8161 URL: https://issues.apache.org/jira/browse/DRILL-8161 Project: Apache Drill Issue Type: Improvement Components: Storage - Other Affects Versions: 1.20.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: Future Currently, Drill forces you to set usernames and passwords individually for every API endpoint in an HTTP storage plugin. This PR allows you to set global credentials which will be used for all endpoints in a given HTTP storage plugin instance. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (DRILL-8155) Add Impersonation Support for Non-Hadoop Based Storage Plugins
Charles Givre created DRILL-8155: Summary: Add Impersonation Support for Non-Hadoop Based Storage Plugins Key: DRILL-8155 URL: https://issues.apache.org/jira/browse/DRILL-8155 Project: Apache Drill Issue Type: Improvement Components: Security Affects Versions: 1.20.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: Future Drill's current implementation of user impersonation does not allow non-Hadoop based plugins to impersonate the user. This creates security issues as it requires an organization to create service accounts for users to access storage such as a relational database or Splunk, ES and the like from Drill. This PR proposes to add the framework to support individual credentials for non-Hadoop based plugins. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (DRILL-8153) Convert OAuth REST APIs to JSON
Charles Givre created DRILL-8153: Summary: Convert OAuth REST APIs to JSON Key: DRILL-8153 URL: https://issues.apache.org/jira/browse/DRILL-8153 Project: Apache Drill Issue Type: Improvement Components: Storage - Other Affects Versions: 1.20.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: Future This PR converts the OAuth REST endpoints to accept JSON for the sake of consistency. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (DRILL-8148) Add REST Endpoints to Update OAuth Tokens
Charles Givre created DRILL-8148: Summary: Add REST Endpoints to Update OAuth Tokens Key: DRILL-8148 URL: https://issues.apache.org/jira/browse/DRILL-8148 Project: Apache Drill Issue Type: Improvement Affects Versions: 1.20.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: Future See attached PR -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (DRILL-8142) SAS Reader Returns NPE
Charles Givre created DRILL-8142: Summary: SAS Reader Returns NPE Key: DRILL-8142 URL: https://issues.apache.org/jira/browse/DRILL-8142 Project: Apache Drill Issue Type: Bug Components: Storage - Text CSV Affects Versions: 1.19.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: Future The SAS reader uses the first row of data to infer the data types. If the first row has null values, the SAS reader was throwing an NPE. This PR fixes that. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (DRILL-8140) Add JSON Post Body to HTTP Rest Storage Plugin
Charles Givre created DRILL-8140: Summary: Add JSON Post Body to HTTP Rest Storage Plugin Key: DRILL-8140 URL: https://issues.apache.org/jira/browse/DRILL-8140 Project: Apache Drill Issue Type: Improvement Components: Storage - Other Affects Versions: 1.19.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: Future Some APIs require information be sent as a JSON post body. This PR enables that. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (DRILL-8126) Ignore OAuth Parameter in Storage Plugin
Charles Givre created DRILL-8126: Summary: Ignore OAuth Parameter in Storage Plugin Key: DRILL-8126 URL: https://issues.apache.org/jira/browse/DRILL-8126 Project: Apache Drill Issue Type: Bug Components: Web Server Affects Versions: 1.19.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.20.0 During certain REST calls, the REST interface was throwing a 400 error due to the `oauth` parameter. This minor fix makes that parameter ignorable. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (DRILL-8121) Add Partial Support for Per-User Credentials
Charles Givre created DRILL-8121: Summary: Add Partial Support for Per-User Credentials Key: DRILL-8121 URL: https://issues.apache.org/jira/browse/DRILL-8121 Project: Apache Drill Issue Type: Improvement Components: Storage - Other Affects Versions: 1.19.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: Future See pull request -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (DRILL-8118) Add Option to Allow Disk Use on Mongo Queries
Charles Givre created DRILL-8118: Summary: Add Option to Allow Disk Use on Mongo Queries Key: DRILL-8118 URL: https://issues.apache.org/jira/browse/DRILL-8118 Project: Apache Drill Issue Type: Bug Components: Storage - MongoDB Affects Versions: 1.19.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.20.0 MongoDB has a strange feature (?) whereby queries which use more than 100MB of memory will by default fail. Mongo allows the user to specify whether they want the query to spill to disk which allows larger queries but at a performance cost. This minor PR adds the ability for a user to specify whether they want this option included in Mongo queries. This only affects aggregate queries in Mongo. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (DRILL-8112) Excel Reader Ignores HeaderRow Config Param
Charles Givre created DRILL-8112: Summary: Excel Reader Ignores HeaderRow Config Param Key: DRILL-8112 URL: https://issues.apache.org/jira/browse/DRILL-8112 Project: Apache Drill Issue Type: Bug Components: Storage - Other Affects Versions: 1.19.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.20.0 Excel reader was ignoring the `headerRow` parameter. This minor bug fix corrects that. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (DRILL-8108) Excel Reader Fails with Duplicate Columns
Charles Givre created DRILL-8108: Summary: Excel Reader Fails with Duplicate Columns Key: DRILL-8108 URL: https://issues.apache.org/jira/browse/DRILL-8108 Project: Apache Drill Issue Type: Bug Components: Storage - Other Reporter: Charles Givre Assignee: Charles Givre In its current implementation, if Drill encounters an Excel file which contains duplicate column names, it will fail to read the data. This PR fixes this issue by appending `_n` after the duplicate column. -- This message was sent by Atlassian Jira (v8.20.1#820001)
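The renaming approach in DRILL-8108 can be sketched as follows. This is not the Excel reader's code: the class name is hypothetical, and the sketch assumes that `n` is a counter starting at 1 for the first repeat of a name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of making duplicate column names unique with an _n suffix.
final class DuplicateColumnNames {

  static List<String> dedupe(List<String> names) {
    Map<String, Integer> lastSuffix = new HashMap<>(); // highest suffix tried per base name
    Set<String> used = new HashSet<>();                // names already emitted
    List<String> result = new ArrayList<>();
    for (String name : names) {
      String candidate = name;
      int n = lastSuffix.getOrDefault(name, 0);
      // Set.add returns false when the candidate is already taken.
      while (!used.add(candidate)) {
        n++;
        candidate = name + "_" + n;
      }
      lastSuffix.put(name, n);
      result.add(candidate);
    }
    return result;
  }
}
```

Checking candidates against the set of names already emitted also covers a sheet that already contains a literal column such as `a_1`.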
[jira] [Created] (DRILL-8092) Add Auto Pagination to HTTP Storage Plugin
Charles Givre created DRILL-8092: Summary: Add Auto Pagination to HTTP Storage Plugin Key: DRILL-8092 URL: https://issues.apache.org/jira/browse/DRILL-8092 Project: Apache Drill Issue Type: Improvement Components: Storage - Other Affects Versions: 1.19.0 Reporter: Charles Givre Assignee: Charles Givre Fix For: 1.20.0 See github -- This message was sent by Atlassian Jira (v8.20.1#820001)