This is an automated email from the ASF dual-hosted git repository.

cgivre pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/drill.git


The following commit(s) were added to refs/heads/master by this push:
     new c4cfe5acfd DRILL-8360: Add Provided Schema for XML Reader (#2710)
c4cfe5acfd is described below

commit c4cfe5acfdede85f4e31bc3398c14dfb2e8a312b
Author: Charles S. Givre <[email protected]>
AuthorDate: Mon Nov 28 12:59:20 2022 -0500

    DRILL-8360: Add Provided Schema for XML Reader (#2710)
---
 .../drill/exec/store/pdf/PdfBatchReader.java       |   4 +-
 contrib/format-xml/.gitignore                      |   2 +
 contrib/format-xml/README.md                       |  35 ++++--
 .../drill/exec/store/xml/XMLBatchReader.java       |   7 ++
 .../org/apache/drill/exec/store/xml/XMLReader.java |  85 +++++++++++++-
 .../apache/drill/exec/store/xml/TestXMLReader.java |  35 ++++++
 .../src/test/resources/xml/simple_array.xml        |  44 +++++++
 .../test/resources/xml/simple_with_datatypes.xml   |  47 ++++++++
 contrib/storage-http/README.md                     |  54 ++++-----
 contrib/storage-http/XML_Options.md                |  39 +++++++
 .../drill/exec/store/http/HttpApiConfig.java       |  34 +++++-
 .../drill/exec/store/http/HttpXMLBatchReader.java  |  53 ++++++++-
 .../drill/exec/store/http/HttpXmlOptions.java      | 120 +++++++++++++++++++
 .../drill/exec/store/http/util/SimpleHttp.java     |  15 +++
 .../drill/exec/store/http/TestHttpPlugin.java      | 128 ++++++++++++++++++++-
 .../drill/exec/store/http/TestPagination.java      |   8 +-
 .../src/test/resources/data/response.xml           |  20 ++--
 17 files changed, 668 insertions(+), 62 deletions(-)
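
For readers skimming the diff below: the DATE branch added to `XMLReader` falls back to ISO-8601 parsing unless the column's provided schema carries a `drill.format` pattern property. A minimal standalone sketch of that logic, for illustration only (the class and method names here are hypothetical, not Drill API):

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class DateParseSketch {
  // Mirrors the DATE case added to XMLReader: parse as ISO-8601 unless the
  // provided schema supplies a "drill.format" pattern for the column.
  static LocalDate parseDate(String fieldValue, String dateFormat) {
    if (dateFormat == null || dateFormat.isEmpty()) {
      return LocalDate.parse(fieldValue);  // ISO-8601, e.g. 2022-01-01
    }
    return LocalDate.parse(fieldValue, DateTimeFormatter.ofPattern(dateFormat));
  }

  public static void main(String[] args) {
    System.out.println(parseDate("2022-01-01", null));         // 2022-01-01
    System.out.println(parseDate("03/02/2022", "MM/dd/yyyy")); // 2022-03-02
  }
}
```

The same pattern repeats for TIME and TIMESTAMP in the diff, with TIMESTAMP falling back to `SimpleDateFormat` when a pattern is given.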

diff --git a/contrib/format-pdf/src/main/java/org/apache/drill/exec/store/pdf/PdfBatchReader.java b/contrib/format-pdf/src/main/java/org/apache/drill/exec/store/pdf/PdfBatchReader.java
index 94b4caf3bd..fd6cec92e6 100644
--- a/contrib/format-pdf/src/main/java/org/apache/drill/exec/store/pdf/PdfBatchReader.java
+++ b/contrib/format-pdf/src/main/java/org/apache/drill/exec/store/pdf/PdfBatchReader.java
@@ -486,7 +486,9 @@ public class PdfBatchReader implements ManagedReader {
           Date parsedDate = simpleDateFormat.parse(cell.getText());
           timestamp = Instant.ofEpochMilli(parsedDate.getTime());
         } catch (ParseException e) {
-          logger.error("Error parsing timestamp: " + e.getMessage());
+          throw UserException.parseError(e)
+            .message("Cannot parse " + cell.getText() + " as a timestamp. You can specify a format string in the provided schema to correct this.")
+            .build(logger);
         }
       }
       writer.setTimestamp(timestamp);
diff --git a/contrib/format-xml/.gitignore b/contrib/format-xml/.gitignore
new file mode 100644
index 0000000000..9341ff44dc
--- /dev/null
+++ b/contrib/format-xml/.gitignore
@@ -0,0 +1,2 @@
+# Directory to store oauth tokens for testing Googlesheets Storage plugin
+/src/test/resources/logback-test.xml
diff --git a/contrib/format-xml/README.md b/contrib/format-xml/README.md
index 3c50ce2956..ca32715ee6 100644
--- a/contrib/format-xml/README.md
+++ b/contrib/format-xml/README.md
@@ -1,10 +1,10 @@
 # XML Format Reader
-This plugin enables Drill to read XML files without defining any kind of schema. 
+This plugin enables Drill to read XML files without defining any kind of schema.
 
 ## Configuration
 Aside from the file extension, there is one configuration option:
 
-* `dataLevel`: XML data often contains a considerable amount of nesting which is not necesarily useful for data analysis. This parameter allows you to set the nesting level 
+* `dataLevel`: XML data often contains a considerable amount of nesting which is not necesarily useful for data analysis. This parameter allows you to set the nesting level
   where the data actually starts.  The levels start at `1`.
 
 The default configuration is shown below:
@@ -22,6 +22,21 @@ The default configuration is shown below:
 ## Data Types
 All fields are read as strings.  Nested fields are read as maps.  Future functionality could include support for lists.
 
+## Provided Schema
+The XML Format Reader supports provided inline schemas.  An example query might be:
+
+```sql
+SELECT * FROM table(cp.`xml/simple_with_datatypes.xml`(type => 'xml',
+    schema => 'inline=(`int_field` INT, `bigint_field` BIGINT,
+    `float_field` FLOAT, `double_field` DOUBLE,
+    `boolean_field` BOOLEAN, `date_field` DATE,
+    `time_field` TIME, `timestamp_field` TIMESTAMP,
+    `string_field` VARCHAR,
+    `date2_field` DATE properties {`drill.format` = `MM/dd/yyyy`})'));
+```
+
+Current implementation only supports provided schema for scalar data types.
+
 ### Attributes
 XML events can have attributes which can also be useful.
 ```xml
@@ -33,8 +48,8 @@ XML events can have attributes which can also be useful.
 </book>
 ```
 
-In the example above, the `title` field contains two attributes, the `binding` and `subcategory`.  In order to access these fields, Drill creates a map called `attributes` and 
-adds an entry for each attribute with the field name and then the attribute name.  Every XML file will have a field called `atttributes` regardless of whether the data actually 
+In the example above, the `title` field contains two attributes, the `binding` and `subcategory`.  In order to access these fields, Drill creates a map called `attributes` and
+adds an entry for each attribute with the field name and then the attribute name.  Every XML file will have a field called `atttributes` regardless of whether the data actually
 has attributes or not.
 
 ```xml
@@ -65,7 +80,7 @@ has attributes or not.
 If you queried this data in Drill you'd get the table below:
 
 ```sql
-SELECT * 
+SELECT *
 FROM <path>.`attributes.xml`
 ```
 
@@ -82,7 +97,7 @@ apache drill> select * from dfs.test.`attributes.xml`;
 
 ## Limitations:  Malformed XML
 Drill can read properly formatted XML.  If the XML is not properly formatted, Drill will throw errors. Some issues include illegal characters in field names, or attribute names.
-Future functionality will include some degree of data cleaning and fault tolerance. 
+Future functionality will include some degree of data cleaning and fault tolerance.
 
 ## Limitations: Schema Ambiguity
 XML is a challenging format to process as the structure does not give any hints about the schema.  For example, a JSON file might have the following record:
@@ -126,13 +141,13 @@ This is no problem to parse this data. But consider what would happen if we enco
   </otherField>
 </record>
 ```
-In this example, there is no way for Drill to know whether `listField` is a `list` or a `map` because it only has one entry. 
+In this example, there is no way for Drill to know whether `listField` is a `list` or a `map` because it only has one entry.
 
 ## Future Functionality
 
 * **Build schema from XSD file or link**:  One of the major challenges of this reader is having to infer the schema of the data. XML files do provide a schema although this is not
- required.  In the future, if there is interest, we can extend this reader to use an XSD file to build the schema which will be used to parse the actual XML file. 
-  
+ required.  In the future, if there is interest, we can extend this reader to use an XSD file to build the schema which will be used to parse the actual XML file.
+
 * **Infer Date Fields**: It may be possible to add the ability to infer data fields.
 
-* **List Support**:  Future functionality may include the ability to infer lists from data structures.  
\ No newline at end of file
+* **List Support**:  Future functionality may include the ability to infer lists from data structures.
diff --git a/contrib/format-xml/src/main/java/org/apache/drill/exec/store/xml/XMLBatchReader.java b/contrib/format-xml/src/main/java/org/apache/drill/exec/store/xml/XMLBatchReader.java
index 52a2b6d903..579652a1df 100644
--- a/contrib/format-xml/src/main/java/org/apache/drill/exec/store/xml/XMLBatchReader.java
+++ b/contrib/format-xml/src/main/java/org/apache/drill/exec/store/xml/XMLBatchReader.java
@@ -28,6 +28,7 @@ import org.apache.drill.exec.physical.impl.scan.v3.file.FileDescrip;
 import org.apache.drill.exec.physical.impl.scan.v3.file.FileSchemaNegotiator;
 import org.apache.drill.exec.physical.resultSet.ResultSetLoader;
 import org.apache.drill.exec.physical.resultSet.RowSetLoader;
+import org.apache.drill.exec.record.metadata.TupleMetadata;
 import org.apache.drill.exec.store.dfs.easy.EasySubScan;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -59,6 +60,12 @@ public class XMLBatchReader implements ManagedReader {
     dataLevel = readerConfig.dataLevel;
     file = negotiator.file();
 
+    // Add schema if provided
+    if (negotiator.providedSchema() != null) {
+      TupleMetadata schema = negotiator.providedSchema();
+      negotiator.tableSchema(schema, false);
+    }
+
     ResultSetLoader loader = negotiator.build();
     rootRowWriter = loader.writer();
 
diff --git a/contrib/format-xml/src/main/java/org/apache/drill/exec/store/xml/XMLReader.java b/contrib/format-xml/src/main/java/org/apache/drill/exec/store/xml/XMLReader.java
index b3af9d2ea3..8b23ac7621 100644
--- a/contrib/format-xml/src/main/java/org/apache/drill/exec/store/xml/XMLReader.java
+++ b/contrib/format-xml/src/main/java/org/apache/drill/exec/store/xml/XMLReader.java
@@ -31,6 +31,7 @@ import org.apache.drill.exec.record.metadata.SchemaBuilder;
 import org.apache.drill.exec.store.ImplicitColumnUtils.ImplicitColumns;
 import org.apache.drill.exec.vector.accessor.ScalarWriter;
 import org.apache.drill.exec.vector.accessor.TupleWriter;
+import org.apache.drill.shaded.guava.com.google.common.base.Strings;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
@@ -43,6 +44,13 @@ import javax.xml.stream.events.StartElement;
 import javax.xml.stream.events.XMLEvent;
 import java.io.Closeable;
 import java.io.InputStream;
+import java.text.ParseException;
+import java.text.SimpleDateFormat;
+import java.time.Instant;
+import java.time.LocalDate;
+import java.time.LocalTime;
+import java.time.format.DateTimeFormatter;
+import java.util.Date;
 import java.util.HashMap;
 import java.util.Iterator;
 import java.util.Map;
@@ -177,7 +185,7 @@ public class XMLReader implements Closeable {
         currentEvent = nextEvent;
 
         // Process the event
-        processEvent(currentEvent, lastEvent);
+        processEvent(currentEvent, lastEvent, reader.peek());
       } catch (XMLStreamException e) {
         throw UserException
           .dataReadError(e)
@@ -195,7 +203,7 @@ public class XMLReader implements Closeable {
   * the self-closing events can cause schema issues with Drill specifically, if a self-closing event
   * is detected prior to a non-self-closing event, and that populated event contains a map or other nested data
    * Drill will throw a schema change exception.
-   *
+   * <p>
   * Since Drill uses Java's streaming XML parser, unfortunately, it does not provide a means of identifying
   * self-closing tags.  This function does that by comparing the event with the previous event and looking for
   * a condition where one event is a start and the other is an ending event.  Additionally, the column number and
@@ -229,7 +237,7 @@ public class XMLReader implements Closeable {
    * @param lastEvent The previous event which was processed
    */
   private void processEvent(XMLEvent currentEvent,
-                            XMLEvent lastEvent) {
+                            XMLEvent lastEvent, XMLEvent nextEvent) {
     String mapName;
     switch (currentEvent.getEventType()) {
 
@@ -282,7 +290,6 @@ public class XMLReader implements Closeable {
             attributePrefix = XMLUtils.addField(attributePrefix, fieldName);
           }
 
-          @SuppressWarnings("unchecked")
           Iterator<Attribute> attributes = startElement.getAttributes();
           if (attributes != null && attributes.hasNext()) {
             writeAttributes(attributePrefix, attributes);
@@ -428,8 +435,70 @@ public class XMLReader implements Closeable {
       index = writer.addColumn(colSchema);
     }
     ScalarWriter colWriter = writer.scalar(index);
+    ColumnMetadata columnMetadata = writer.tupleSchema().metadata(index);
+    MinorType dataType = columnMetadata.schema().getType().getMinorType();
+    String dateFormat;
+
+    // Write the values depending on their data type.  This only applies to scalar fields.
     if (fieldValue != null && (currentState != xmlState.ROW_ENDED && currentState != xmlState.FIELD_ENDED)) {
-      colWriter.setString(fieldValue);
+      switch (dataType) {
+        case BIT:
+          colWriter.setBoolean(Boolean.parseBoolean(fieldValue));
+          break;
+        case TINYINT:
+        case SMALLINT:
+        case INT:
+          colWriter.setInt(Integer.parseInt(fieldValue));
+          break;
+        case BIGINT:
+          colWriter.setLong(Long.parseLong(fieldValue));
+          break;
+        case FLOAT4:
+        case FLOAT8:
+          colWriter.setDouble(Double.parseDouble(fieldValue));
+          break;
+        case DATE:
+          dateFormat = columnMetadata.property("drill.format");
+          LocalDate localDate;
+          if (Strings.isNullOrEmpty(dateFormat)) {
+            localDate = LocalDate.parse(fieldValue);
+          } else {
+            localDate = LocalDate.parse(fieldValue, DateTimeFormatter.ofPattern(dateFormat));
+          }
+          colWriter.setDate(localDate);
+          break;
+        case TIME:
+          dateFormat = columnMetadata.property("drill.format");
+          LocalTime localTime;
+          if (Strings.isNullOrEmpty(dateFormat)) {
+            localTime = LocalTime.parse(fieldValue);
+          } else {
+            localTime = LocalTime.parse(fieldValue, DateTimeFormatter.ofPattern(dateFormat));
+          }
+          colWriter.setTime(localTime);
+          break;
+        case TIMESTAMP:
+          dateFormat = columnMetadata.property("drill.format");
+          Instant timestamp;
+          if (Strings.isNullOrEmpty(dateFormat)) {
+            timestamp = Instant.parse(fieldValue);
+          } else {
+            try {
+              SimpleDateFormat simpleDateFormat = new SimpleDateFormat(dateFormat);
+              Date parsedDate = simpleDateFormat.parse(fieldValue);
+              timestamp = Instant.ofEpochMilli(parsedDate.getTime());
+            } catch (ParseException e) {
+              throw UserException.parseError(e)
+                .message("Cannot parse " + fieldValue + " as a timestamp. You can specify a format string in the provided schema to correct this.")
+                .addContext(errorContext)
+                .build(logger);
+            }
+          }
+          colWriter.setTimestamp(timestamp);
+          break;
+      default:
+          colWriter.setString(fieldValue);
+      }
       changeState(xmlState.FIELD_ENDED);
     }
   }
@@ -491,7 +560,11 @@ public class XMLReader implements Closeable {
   }
 
   private TupleWriter getAttributeWriter() {
-    int attributeIndex = rootRowWriter.addColumn(SchemaBuilder.columnSchema(ATTRIBUTE_MAP_NAME, MinorType.MAP, DataMode.REQUIRED));
+    int attributeIndex = rootRowWriter.tupleSchema().index(ATTRIBUTE_MAP_NAME);
+
+    if (attributeIndex == -1) {
+      attributeIndex = rootRowWriter.addColumn(SchemaBuilder.columnSchema(ATTRIBUTE_MAP_NAME, MinorType.MAP, DataMode.REQUIRED));
+    }
     return rootRowWriter.tuple(attributeIndex);
   }
 
diff --git a/contrib/format-xml/src/test/java/org/apache/drill/exec/store/xml/TestXMLReader.java b/contrib/format-xml/src/test/java/org/apache/drill/exec/store/xml/TestXMLReader.java
index 6a9fc11bf4..260fe9c3cb 100644
--- a/contrib/format-xml/src/test/java/org/apache/drill/exec/store/xml/TestXMLReader.java
+++ b/contrib/format-xml/src/test/java/org/apache/drill/exec/store/xml/TestXMLReader.java
@@ -32,6 +32,9 @@ import org.junit.Test;
 import org.junit.experimental.categories.Category;
 
 import java.nio.file.Paths;
+import java.time.Instant;
+import java.time.LocalDate;
+import java.time.LocalTime;
 
 import static org.apache.drill.test.QueryTestUtil.generateCompressedFile;
 import static org.apache.drill.test.rowSet.RowSetUtilities.mapArray;
@@ -83,6 +86,38 @@ public class TestXMLReader extends ClusterTest {
     new RowSetComparison(expected).verifyAndClearAll(results);
   }
 
+  @Test
+  public void testSimpleProvidedSchema() throws Exception {
+    String sql = "SELECT * FROM table(cp.`xml/simple_with_datatypes.xml` (type => 'xml', schema " +
+      "=> 'inline=(`int_field` INT, `bigint_field` BIGINT, `float_field` FLOAT, `double_field` DOUBLE, `boolean_field` " +
+      "BOOLEAN, `date_field` DATE, `time_field` TIME, `timestamp_field` TIMESTAMP, `string_field`" +
+      " VARCHAR, `date2_field` DATE properties {`drill.format` = `MM/dd/yyyy`})'))";
+    RowSet results = client.queryBuilder().sql(sql).rowSet();
+    assertEquals(2, results.rowCount());
+
+    TupleMetadata expectedSchema = new SchemaBuilder()
+      .addNullable("int_field", MinorType.INT)
+      .addNullable("bigint_field", MinorType.BIGINT)
+      .addNullable("float_field", MinorType.FLOAT4)
+      .addNullable("double_field", MinorType.FLOAT8)
+      .addNullable("boolean_field", MinorType.BIT)
+      .addNullable("date_field", MinorType.DATE)
+      .addNullable("time_field", MinorType.TIME)
+      .addNullable("timestamp_field", MinorType.TIMESTAMP)
+      .addNullable("string_field", MinorType.VARCHAR)
+      .addNullable("date2_field", MinorType.DATE)
+      .add("attributes", MinorType.MAP)
+      .buildSchema();
+
+    RowSet expected = client.rowSetBuilder(expectedSchema)
+      .addRow(1, 1000L, 1.2999999523162842, 3.3, true, LocalDate.parse("2022-01-01"), LocalTime.parse("12:04:34"), Instant.parse("2022-01-06T12:30:30Z"), "string", LocalDate.parse("2022-03-02"), mapArray())
+      .addRow(2, 2000L, 2.299999952316284, 4.3, false, LocalDate.parse("2022-02-01"), LocalTime.parse("13:04:34"), Instant.parse("2022-03-06T12:30:30Z"), null, LocalDate.parse("2022-03-01"), mapArray())
+      .build();
+
+    new RowSetComparison(expected).verifyAndClearAll(results);
+  }
+
+
   @Test
   public void testSelfClosingTags() throws Exception {
     String sql = "SELECT * FROM cp.`xml/weather.xml`";
diff --git a/contrib/format-xml/src/test/resources/xml/simple_array.xml b/contrib/format-xml/src/test/resources/xml/simple_array.xml
new file mode 100644
index 0000000000..c734f3a559
--- /dev/null
+++ b/contrib/format-xml/src/test/resources/xml/simple_array.xml
@@ -0,0 +1,44 @@
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+
+-->
+<dependencies>
+  <dependency>
+    <groupId>org.apache.drill.exec</groupId>
+    <array_field>
+      <value>1</value>
+      <value>2</value>
+      <value>3</value>
+    </array_field>
+  </dependency>
+
+  <dependency>
+    <groupId>org.apache.drill.exec</groupId>
+    <scope>test</scope>
+    <array_field>
+      <value>4</value>
+      <value>5</value>
+      <value>6</value>
+    </array_field>
+  </dependency>
+
+  <dependency>
+    <groupId>org.apache.drill</groupId>
+    <scope>test</scope>
+  </dependency>
+</dependencies>
diff --git a/contrib/format-xml/src/test/resources/xml/simple_with_datatypes.xml b/contrib/format-xml/src/test/resources/xml/simple_with_datatypes.xml
new file mode 100644
index 0000000000..92f6296040
--- /dev/null
+++ b/contrib/format-xml/src/test/resources/xml/simple_with_datatypes.xml
@@ -0,0 +1,47 @@
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+
+-->
+
+<data>
+  <row>
+    <int_field>1</int_field>
+    <bigint_field>1000</bigint_field>
+    <float_field>1.3</float_field>
+    <double_field>3.3</double_field>
+    <boolean_field>true</boolean_field>
+    <date_field>2022-01-01</date_field>
+    <date2_field>03/02/2022</date2_field>
+    <time_field>12:04:34</time_field>
+    <timestamp_field>2022-01-06T12:30:30Z</timestamp_field>
+    <string_field>string</string_field>
+  </row>
+
+  <row>
+    <int_field>2</int_field>
+    <bigint_field>2000</bigint_field>
+    <float_field>2.3</float_field>
+    <double_field>4.3</double_field>
+    <boolean_field>false</boolean_field>
+    <date_field>2022-02-01</date_field>
+    <date2_field>03/01/2022</date2_field>
+    <time_field>13:04:34</time_field>
+    <timestamp_field>2022-03-06T12:30:30Z</timestamp_field>
+  </row>
+
+</data>
diff --git a/contrib/storage-http/README.md b/contrib/storage-http/README.md
index d384bedd11..797ab5f407 100644
--- a/contrib/storage-http/README.md
+++ b/contrib/storage-http/README.md
@@ -49,23 +49,23 @@ The `connection` property can accept the following options.
 Many APIs require parameters to be passed directly in the URL instead of as query arguments.  For example, github's API allows you to query an organization's repositories with the following
 URL:  https://github.com/orgs/{org}/repos
 
-As of Drill 1.20.0, you can simply set the URL in the connection using the curly braces.  If your API includes URL parameters you must include them in the `WHERE` clause in your 
+As of Drill 1.20.0, you can simply set the URL in the connection using the curly braces.  If your API includes URL parameters you must include them in the `WHERE` clause in your
 query, or specify a default value in the configuration.
 
 As an example, the API above, you would have to query as shown below:
 
 ```sql
-SELECT * 
+SELECT *
 FROM api.github
 WHERE org = 'apache'
 ```
 
 This query would replace the `org`in the URL with the value from the `WHERE` clause, in this case `apache`.  You can specify a default value as follows:  `https://someapi.com/
-{param1}/{param2=default}`.  In this case, the default would be used if and only if there isn't a parameter supplied in the query. 
+{param1}/{param2=default}`.  In this case, the default would be used if and only if there isn't a parameter supplied in the query.
 
 #### Limitations on URL Parameters
-* Drill does not support boolean expressions of URL parameters in queries.  For instance, for the above example, if you were to include `WHERE org='apache' OR org='linux'`, 
-  these parameters could not be pushed down in the current state. 
+* Drill does not support boolean expressions of URL parameters in queries.  For instance, for the above example, if you were to include `WHERE org='apache' OR org='linux'`,
+  these parameters could not be pushed down in the current state.
 * All URL parameter clauses must be equality only.
 
 ### Passing Parameters in the Query
@@ -141,6 +141,7 @@ key2=value2"
 * `query_string`:  Parameters from the query are pushed down to the query string.  Static parameters are pushed to the post body.
 * `post_body`:  Both static and parameters from the query are pushed to the post body as key/value pairs
 * `json_body`:  Both static and parameters from the query are pushed to the post body as json.
+* `xml_body`:  Both static and parameters from the query are pushed to the post body as XML.
 
 #### Headers
 
@@ -245,13 +246,14 @@ as that shown above. Drill assumes that the server will uses HTTP status codes t
 indicate a bad request or other error.
 
 #### Input Type
-The REST plugin accepts three different types of input: `json`, `csv` and `xml`.  The default is `json`.  If you are using `XML` as a data type, there is an additional 
-configuration option called `xmlDataLevel` which reduces the level of unneeded nesting found in XML files.  You can find more information in the documentation for Drill's XML 
-format plugin. 
+The REST plugin accepts three different types of input: `json`, `csv` and `xml`.  The default is `json`.
 
 #### JSON Configuration
 [Read the documentation for configuring json options, including schema provisioning.](JSON_Options.md)
 
+#### XML Configuration
+[Read the documentation for configuring XML options, including schema provisioning.](XML_Options.md)
+
 #### Authorization
 
 `authType`: If your API requires authentication, specify the authentication
@@ -263,8 +265,8 @@ If the `authType` is set to `basic`, `username` and `password` must be set in th
 `password`: The password for basic authentication.
 
 ##### Global Credentials
-If you have an HTTP plugin with multiple endpoints that all use the same credentials, you can set the `authType` to `basic` and set global 
-credentials in the storage plugin configuration. 
+If you have an HTTP plugin with multiple endpoints that all use the same credentials, you can set the `authType` to `basic` and set global
+credentials in the storage plugin configuration.
 
 Simply add the following to the storage plugin configuration:
 ```json
@@ -280,12 +282,12 @@ Note that the `authType` still must be set to `basic` and that any endpoint cred
 
 #### Limiting Results
 Some APIs support a query parameter which is used to limit the number of results returned by the API.  In this case you can set the `limitQueryParam` config variable to the query parameter name and Drill will automatically include this in your query.  For instance, if you have an API which supports a limit query parameter called `maxRecords` and you set the abovementioned config variable then execute the following query:
-  
+
 ```sql
 SELECT <fields>
 FROM api.limitedApi
-LIMIT 10 
-```  
+LIMIT 10
+```
 Drill will send the following request to your API:
 ```
 https://<api>?maxRecords=10
@@ -298,12 +300,12 @@ If the API which you are querying requires OAuth2.0 for authentication [read the
 If you want to use automatic pagination in Drill, [click here to read the documentation for pagination](Pagination.md).
 
 #### errorOn400
-When a user makes HTTP calls, the response code will be from 100-599.  400 series error codes can contain useful information and in some cases you would not want Drill to throw 
-errors on 400 series errors.  This option allows you to define Drill's behavior on 400 series error codes.  When set to `true`, Drill will throw an exception and halt execution 
+When a user makes HTTP calls, the response code will be from 100-599.  400 series error codes can contain useful information and in some cases you would not want Drill to throw
+errors on 400 series errors.  This option allows you to define Drill's behavior on 400 series error codes.  When set to `true`, Drill will throw an exception and halt execution
 on 400 series errors, `false` will return an empty result set (with implicit fields populated).
 
 #### verifySSLCert
-Default is `true`, but when set to false, Drill will trust all SSL certificates.  Useful for debugging or on internal corporate networks using self-signed certificates or 
+Default is `true`, but when set to false, Drill will trust all SSL certificates.  Useful for debugging or on internal corporate networks using self-signed certificates or
 private certificate authorities.
 
 #### caseSensitiveFilters
@@ -447,7 +449,7 @@ To query this API, set the configuration as follows:
       "authType": "none",
       "userName": null,
       "password": null,
-      "postBody": null, 
+      "postBody": null,
       "inputType": "json",
        "errorOn400": true
     }
@@ -495,7 +497,7 @@ body. Set the configuration as follows:
       "authType": "none",
       "userName": null,
       "password": null,
-      "postBody": null, 
+      "postBody": null,
        "errorOn400": true
     }
   }
@@ -641,24 +643,24 @@ The HTTP plugin includes four implicit fields which can be used for debugging.
 * `_response_code`: The response code from the HTTP request.  This field is an `INT`.
 * `_response_message`:  The response message.
 * `_response_protocol`:  The response protocol.
-* `_response_url`:  The actual URL sent to the API. 
+* `_response_url`:  The actual URL sent to the API.
 
 ## Joining Data
-There are some situations where a user might want to join data with an API result and the pushdowns prevent that from happening.  The main situation where this happens is when 
-an API has parameters which are part of the URL AND these parameters are dynamically populated via a join. 
+There are some situations where a user might want to join data with an API result and the pushdowns prevent that from happening.  The main situation where this happens is when
+an API has parameters which are part of the URL AND these parameters are dynamically populated via a join.
 
-In this case, there are two functions `http_get_url` and `http_get` which you can use to faciliate these joins. 
+In this case, there are two functions `http_get_url` and `http_get` which you can use to faciliate these joins.
 
 * `http_request('<storage_plugin_name>', <params>)`:  This function accepts a storage plugin as input and an optional list of parameters to include in a URL.
-* `http_get(<url>, <params>)`:  This function works in the same way except that it does not pull any configuration information from existing storage plugins.  The input url for 
-  the `http_get` function must be a valid URL. 
+* `http_get(<url>, <params>)`:  This function works in the same way except that it does not pull any configuration information from existing storage plugins.  The input url for
+  the `http_get` function must be a valid URL.
 
 ### Example Queries
-Let's say that you have a storage plugin called `github` with an endpoint called `repos` which points to the url: https://github.com/orgs/{org}/repos.  It is easy enough to 
+Let's say that you have a storage plugin called `github` with an endpoint called `repos` which points to the url: https://github.com/orgs/{org}/repos.  It is easy enough to
 write a query like this:
 
 ```sql
-SELECT * 
+SELECT *
 FROM github.repos
 WHERE org='apache'
 ```
diff --git a/contrib/storage-http/XML_Options.md b/contrib/storage-http/XML_Options.md
new file mode 100644
index 0000000000..e53e1e8e99
--- /dev/null
+++ b/contrib/storage-http/XML_Options.md
@@ -0,0 +1,39 @@
+# XML Options
+Drill has a several XML configuration options to allow you to configure how Drill interprets XML files.
+
+## DataLevel
+XML data often contains a considerable amount of nesting which is not necessarily useful for data analysis. This parameter allows you to set the nesting level
+  where the data actually starts.  The levels start at `1`.
+
+## Schema Provisioning
+One of the challenges of querying APIs is inconsistent data.  Drill allows you to provide a schema for individual endpoints.  You can do this in one of two ways:
+
+1. By providing a schema inline [See: Specifying Schema as Table Function 
Parameter](https://drill.apache.org/docs/plugin-configuration-basics/#specifying-the-schema-as-table-function-parameter)
+2. By providing a schema in the configuration for the endpoint.
+
+Note: At the time of writing, Drill's XML reader only supports provided schemas with scalar data types.
+
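+As a sketch of the first option, an inline schema can be passed as a table function parameter. The endpoint name `local.api` here is hypothetical:
+
+```sql
+SELECT *
+FROM table(local.api(schema => 'inline=(`custom_field` VARCHAR)'))
+```
+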
+## Example Configuration:
+You can set either of these options on a per-endpoint basis as shown below:
+
+```json
+"xmlOptions": {
+  "dataLevel": 1
+}
+```
+
+Or,
+```json
+"xmlOptions": {
+  "dataLevel": 2,
+  "schema": {
+    "type": "tuple_schema",
+    "columns": [
+      {
+        "name": "custom_field",
+        "type": "VARCHAR"
+      }
+    ]
+  }
+}
+```
diff --git 
a/contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpApiConfig.java
 
b/contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpApiConfig.java
index 91af33e36f..32efdbf559 100644
--- 
a/contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpApiConfig.java
+++ 
b/contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpApiConfig.java
@@ -95,6 +95,7 @@ public class HttpApiConfig {
   @JsonProperty
   private final String inputType;
   @JsonProperty
+  @Deprecated
   private final int xmlDataLevel;
   @JsonProperty
   private final String limitQueryParam;
@@ -111,6 +112,9 @@ public class HttpApiConfig {
   @JsonProperty
   private final HttpJsonOptions jsonOptions;
 
+  @JsonProperty
+  private final HttpXmlOptions xmlOptions;
+
   @JsonInclude
   @JsonProperty
   private final boolean verifySSLCert;
@@ -164,6 +168,7 @@ public class HttpApiConfig {
     return this.caseSensitiveFilters;
   }
 
+  @Deprecated
   public int xmlDataLevel() {
     return this.xmlDataLevel;
   }
@@ -179,6 +184,9 @@ public class HttpApiConfig {
   public HttpJsonOptions jsonOptions() {
     return this.jsonOptions;
   }
+  public HttpXmlOptions xmlOptions() {
+    return this.xmlOptions;
+  }
 
   public boolean verifySSLCert() {
     return this.verifySSLCert;
@@ -202,7 +210,6 @@ public class HttpApiConfig {
     }
     HttpApiConfig that = (HttpApiConfig) o;
     return requireTail == that.requireTail
-      && xmlDataLevel == that.xmlDataLevel
       && errorOn400 == that.errorOn400
       && verifySSLCert == that.verifySSLCert
       && directCredentials == that.directCredentials
@@ -218,6 +225,7 @@ public class HttpApiConfig {
       && Objects.equals(inputType, that.inputType)
       && Objects.equals(limitQueryParam, that.limitQueryParam)
       && Objects.equals(jsonOptions, that.jsonOptions)
+      && Objects.equals(xmlOptions, that.xmlOptions)
       && Objects.equals(credentialsProvider, that.credentialsProvider)
       && Objects.equals(paginator, that.paginator);
   }
@@ -225,7 +233,7 @@ public class HttpApiConfig {
   @Override
   public int hashCode() {
     return Objects.hash(url, requireTail, method, postBody, headers, params, 
dataPath,
-      authType, inputType, xmlDataLevel, limitQueryParam, errorOn400, 
jsonOptions, verifySSLCert,
+      authType, inputType, limitQueryParam, errorOn400, jsonOptions, 
xmlOptions, verifySSLCert,
       credentialsProvider, paginator, directCredentials, 
postParameterLocation, caseSensitiveFilters);
   }
 
@@ -243,10 +251,10 @@ public class HttpApiConfig {
       .field("caseSensitiveFilters", caseSensitiveFilters)
       .field("authType", authType)
       .field("inputType", inputType)
-      .field("xmlDataLevel", xmlDataLevel)
       .field("limitQueryParam", limitQueryParam)
       .field("errorOn400", errorOn400)
       .field("jsonOptions", jsonOptions)
+      .field("xmlOptions", xmlOptions)
       .field("verifySSLCert", verifySSLCert)
       .field("credentialsProvider", credentialsProvider)
       .field("paginator", paginator)
@@ -272,7 +280,12 @@ public class HttpApiConfig {
      * All POST parameters, both static and from the query, are pushed to the 
POST body
      * as a JSON object.
      */
-    JSON_BODY
+    JSON_BODY,
+    /**
+     * All POST parameters, both static and from the query, are pushed to the 
POST body
+     * as an XML request.
+     */
+    XML_BODY
   }
 
   public enum HttpMethod {
@@ -292,6 +305,7 @@ public class HttpApiConfig {
         ? HttpMethod.GET.toString() : builder.method.trim().toUpperCase();
     this.url = builder.url;
     this.jsonOptions = builder.jsonOptions;
+    this.xmlOptions = builder.xmlOptions;
 
     HttpMethod httpMethod = HttpMethod.valueOf(this.method);
     // Get the request method.  Only accept GET and POST requests.  Anything 
else will default to GET.
@@ -438,6 +452,7 @@ public class HttpApiConfig {
     private boolean errorOn400;
 
     private HttpJsonOptions jsonOptions;
+    private HttpXmlOptions xmlOptions;
 
     private CredentialsProvider credentialsProvider;
 
@@ -479,6 +494,11 @@ public class HttpApiConfig {
       return this;
     }
 
+    public HttpApiConfigBuilder xmlOptions(HttpXmlOptions options) {
+      this.xmlOptions = options;
+      return this;
+    }
+
     public HttpApiConfigBuilder requireTail(boolean requireTail) {
       this.requireTail = requireTail;
       return this;
@@ -539,6 +559,12 @@ public class HttpApiConfig {
       return this;
     }
 
+    /**
+     * Do not use.  Use {@link #xmlOptions(HttpXmlOptions)} instead to set the XML data level.
+     * @param xmlDataLevel the nesting level at which the data begins
+     * @return this builder
+     */
+    @Deprecated
     public HttpApiConfigBuilder xmlDataLevel(int xmlDataLevel) {
       this.xmlDataLevel = xmlDataLevel;
       return this;
diff --git 
a/contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpXMLBatchReader.java
 
b/contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpXMLBatchReader.java
index d5dc5b5fe0..5aec7ffdf8 100644
--- 
a/contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpXMLBatchReader.java
+++ 
b/contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpXMLBatchReader.java
@@ -26,8 +26,10 @@ import org.apache.drill.common.exceptions.CustomErrorContext;
 import org.apache.drill.common.exceptions.UserException;
 import org.apache.drill.exec.ExecConstants;
 import org.apache.drill.exec.physical.impl.scan.framework.SchemaNegotiator;
+import org.apache.drill.exec.physical.impl.scan.v3.FixedReceiver;
 import org.apache.drill.exec.physical.resultSet.ResultSetLoader;
 import org.apache.drill.exec.physical.resultSet.RowSetLoader;
+import org.apache.drill.exec.record.metadata.TupleMetadata;
 import org.apache.drill.exec.store.ImplicitColumnUtils.ImplicitColumns;
 import org.apache.drill.exec.store.http.paginator.Paginator;
 import org.apache.drill.exec.store.http.util.SimpleHttp;
@@ -53,7 +55,13 @@ public class HttpXMLBatchReader extends HttpBatchReader {
     super(subScan);
     this.subScan = subScan;
     this.maxRecords = subScan.maxRecords();
-    this.dataLevel = subScan.tableSpec().connectionConfig().xmlDataLevel();
+
+    // TODO Remove the xmlDataLevel parameter.  For now, check both.
+    if (subScan.tableSpec().connectionConfig().xmlOptions() == null) {
+      this.dataLevel = subScan.tableSpec().connectionConfig().xmlDataLevel();
+    } else {
+      this.dataLevel = 
subScan.tableSpec().connectionConfig().xmlOptions().getDataLevel();
+    }
   }
 
 
@@ -61,7 +69,12 @@ public class HttpXMLBatchReader extends HttpBatchReader {
     super(subScan, paginator);
     this.subScan = subScan;
     this.maxRecords = subScan.maxRecords();
-    this.dataLevel = subScan.tableSpec().connectionConfig().xmlDataLevel();
+
+    if (subScan.tableSpec().connectionConfig().xmlOptions() == null) {
+      this.dataLevel = subScan.tableSpec().connectionConfig().xmlDataLevel();
+    } else {
+      this.dataLevel = 
subScan.tableSpec().connectionConfig().xmlOptions().getDataLevel();
+    }
   }
 
   @Override
@@ -96,6 +109,12 @@ public class HttpXMLBatchReader extends HttpBatchReader {
     inStream = http.getInputStream();
    // Initialize the XMLReader
     try {
+      // Add schema if provided
+      TupleMetadata finalSchema = getSchema(negotiator);
+      if (finalSchema != null) {
+        negotiator.tableSchema(finalSchema, false);
+      }
+
       xmlReader = new XMLReader(inStream, dataLevel);
       resultLoader = negotiator.build();
 
@@ -121,6 +140,36 @@ public class HttpXMLBatchReader extends HttpBatchReader {
     return true;
   }
 
+  /**
+   * This function obtains the correct schema for the {@link XMLReader}.  
There are four possibilities:
+   * 1.  The schema is provided in the configuration only.  In this case, that 
schema will be returned.
+   * 2.  The schema is provided in both the configuration and inline.  These 
two schemas will be merged together.
+   * 3.  The schema is provided inline in a query.  In this case, that schema 
will be returned.
+   * 4.  No schema is provided.  Function returns null.
+   * @param negotiator {@link SchemaNegotiator} The schema negotiator with all 
the connection information
+   * @return The built {@link TupleMetadata} of the provided schema, null if 
none provided.
+   */
+  private TupleMetadata getSchema(SchemaNegotiator negotiator) {
+    if (subScan.tableSpec().connectionConfig().xmlOptions() != null &&
+      subScan.tableSpec().connectionConfig().xmlOptions().schema() != null) {
+      TupleMetadata configuredSchema = 
subScan.tableSpec().connectionConfig().xmlOptions().schema();
+
+      // If a schema is provided both inline and in the config, merge the two; otherwise, return the config schema.
+      if (negotiator.hasProvidedSchema()) {
+        TupleMetadata inlineSchema = negotiator.providedSchema();
+        return FixedReceiver.Builder.mergeSchemas(configuredSchema, 
inlineSchema);
+      } else {
+        return configuredSchema;
+      }
+    } else {
+      if (negotiator.hasProvidedSchema()) {
+        return negotiator.providedSchema();
+      }
+    }
+    return null;
+  }
+
+
   @Override
   public boolean next() {
     boolean result;
diff --git 
a/contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpXmlOptions.java
 
b/contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpXmlOptions.java
new file mode 100644
index 0000000000..d73e576778
--- /dev/null
+++ 
b/contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpXmlOptions.java
@@ -0,0 +1,120 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.http;
+
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonInclude;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.databind.annotation.JsonDeserialize;
+import com.fasterxml.jackson.databind.annotation.JsonPOJOBuilder;
+import org.apache.drill.common.PlanStringBuilder;
+import org.apache.drill.exec.record.metadata.TupleMetadata;
+
+import java.util.Objects;
+
+@JsonInclude(JsonInclude.Include.NON_DEFAULT)
+@JsonDeserialize(builder = HttpXmlOptions.HttpXmlOptionsBuilder.class)
+public class HttpXmlOptions {
+
+  @JsonProperty
+  private final int dataLevel;
+
+  @JsonProperty
+  private final TupleMetadata schema;
+
+  @JsonCreator
+  public HttpXmlOptions(@JsonProperty("dataLevel") Integer dataLevel,
+                        @JsonProperty("schema") TupleMetadata schema) {
+    this.schema = schema;
+    if (dataLevel == null || dataLevel < 1) {
+      this.dataLevel = 1;
+    } else {
+      this.dataLevel = dataLevel;
+    }
+  }
+
+  public HttpXmlOptions(HttpXmlOptionsBuilder builder) {
+    this.dataLevel = builder.dataLevel;
+    this.schema = builder.schema;
+  }
+
+
+  public static HttpXmlOptionsBuilder builder() {
+    return new HttpXmlOptionsBuilder();
+  }
+
+  @JsonProperty("dataLevel")
+  public int getDataLevel() {
+    return this.dataLevel;
+  }
+
+  @JsonProperty("schema")
+  public TupleMetadata schema() {
+    return this.schema;
+  }
+
+
+  @Override
+  public boolean equals(Object o) {
+    if (this == o) {
+      return true;
+    }
+    if (o == null || getClass() != o.getClass()) {
+      return false;
+    }
+    HttpXmlOptions that = (HttpXmlOptions) o;
+    return Objects.equals(dataLevel, that.dataLevel)
+      && Objects.equals(schema, that.schema);
+  }
+
+  @Override
+  public int hashCode() {
+    return Objects.hash(dataLevel, schema);
+  }
+
+  @Override
+  public String toString() {
+    return new PlanStringBuilder(this)
+      .field("dataLevel", dataLevel)
+      .field("schema", schema)
+      .toString();
+  }
+
+  @JsonPOJOBuilder(withPrefix = "")
+  public static class HttpXmlOptionsBuilder {
+
+    private int dataLevel;
+    private TupleMetadata schema;
+
+    public HttpXmlOptions.HttpXmlOptionsBuilder dataLevel(int dataLevel) {
+      this.dataLevel = dataLevel;
+      return this;
+    }
+
+    public HttpXmlOptions.HttpXmlOptionsBuilder schema(TupleMetadata schema) {
+      this.schema = schema;
+      return this;
+    }
+
+    public HttpXmlOptions build() {
+      return new HttpXmlOptions(this);
+    }
+  }
+}
diff --git 
a/contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
 
b/contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
index d0f12f26c2..3568fe9213 100644
--- 
a/contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
+++ 
b/contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
@@ -104,6 +104,7 @@ public class SimpleHttp implements AutoCloseable {
   private static final int DEFAULT_TIMEOUT = 1;
   private static final Pattern URL_PARAM_REGEX = 
Pattern.compile("\\{(\\w+)(?:=(\\w*))?}");
   public static final MediaType JSON_MEDIA_TYPE = 
MediaType.get("application/json; charset=utf-8");
+  public static final MediaType XML_MEDIA_TYPE = 
MediaType.get("application/xml");
   private static final OkHttpClient SIMPLE_CLIENT = new OkHttpClient.Builder()
     .connectTimeout(DEFAULT_TIMEOUT, TimeUnit.SECONDS)
     .writeTimeout(DEFAULT_TIMEOUT, TimeUnit.SECONDS)
@@ -365,6 +366,20 @@ public class SimpleHttp implements AutoCloseable {
 
         RequestBody requestBody = RequestBody.create(json.toJSONString(), 
JSON_MEDIA_TYPE);
         requestBuilder.post(requestBody);
+      } else if (apiConfig.getPostLocation() == PostLocation.XML_BODY) {
+        StringBuilder xmlRequest = new StringBuilder();
+        xmlRequest.append("<request>");
+        if (filters != null) {
+          for (Map.Entry<String, String> filter : filters.entrySet()) {
+            xmlRequest.append("<").append(filter.getKey()).append(">");
+            xmlRequest.append(filter.getValue());
+            xmlRequest.append("</").append(filter.getKey()).append(">");
+          }
+        }
+        xmlRequest.append("</request>");
+        RequestBody requestBody = RequestBody.create(xmlRequest.toString(), 
XML_MEDIA_TYPE);
+        requestBuilder.post(requestBody);
+
       } else {
         formBodyBuilder = buildPostBody(apiConfig.postBody());
         requestBuilder.post(formBodyBuilder.build());
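The `XML_BODY` branch above concatenates each filter into a flat `<request>` element. As a simplified standalone re-creation for illustration (the class and method names `XmlBodySketch`/`buildXmlBody` are hypothetical, not part of the plugin), note that keys and values are not XML-escaped, so a filter value containing `<`, `>`, or `&` would produce malformed XML:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class XmlBodySketch {
  // Builds a flat <request> body from filter name/value pairs,
  // mirroring the string concatenation in SimpleHttp's XML_BODY branch.
  static String buildXmlBody(Map<String, String> filters) {
    StringBuilder xml = new StringBuilder("<request>");
    if (filters != null) {
      for (Map.Entry<String, String> filter : filters.entrySet()) {
        xml.append('<').append(filter.getKey()).append('>')
           .append(filter.getValue())
           .append("</").append(filter.getKey()).append('>');
      }
    }
    return xml.append("</request>").toString();
  }

  public static void main(String[] args) {
    // LinkedHashMap keeps insertion order, so the output is deterministic here.
    Map<String, String> filters = new LinkedHashMap<>();
    filters.put("type_of_data", "Budget");
    filters.put("max_records", "5");
    System.out.println(buildXmlBody(filters));
    // prints <request><type_of_data>Budget</type_of_data><max_records>5</max_records></request>
  }
}
```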
diff --git 
a/contrib/storage-http/src/test/java/org/apache/drill/exec/store/http/TestHttpPlugin.java
 
b/contrib/storage-http/src/test/java/org/apache/drill/exec/store/http/TestHttpPlugin.java
index c09e3a5503..32e31fca57 100644
--- 
a/contrib/storage-http/src/test/java/org/apache/drill/exec/store/http/TestHttpPlugin.java
+++ 
b/contrib/storage-http/src/test/java/org/apache/drill/exec/store/http/TestHttpPlugin.java
@@ -131,11 +131,26 @@ public class TestHttpPlugin extends ClusterTest {
       .requireTail(false)
       .build();
 
+    HttpXmlOptions nycXmlOptions = HttpXmlOptions.builder()
+      .dataLevel(5)
+      .build();
+
+    HttpApiConfig nycConfig = HttpApiConfig.builder()
+      .url("https://www.checkbooknyc.com/api")
+      .method("post")
+      .inputType("xml")
+      .requireTail(false)
+      .params(Arrays.asList("type_of_data", "records_from", "max_records"))
+      .postParameterLocation("xml_body")
+      .xmlOptions(nycXmlOptions)
+      .build();
+
     Map<String, HttpApiConfig> configs = new HashMap<>();
     configs.put("stock", stockConfig);
     configs.put("sunrise", sunriseConfig);
     configs.put("sunrise2", sunriseWithParamsConfig);
     configs.put("pokemon", pokemonConfig);
+    configs.put("nyc", nycConfig);
 
     HttpStoragePluginConfig mockStorageConfigWithWorkspace =
         new HttpStoragePluginConfig(false, configs, 10, 1000, null, null, "", 
80, "", "", "", null, PlainCredentialsProvider.EMPTY_CREDENTIALS_PROVIDER,
@@ -286,6 +301,26 @@ public class TestHttpPlugin extends ClusterTest {
       .dataPath("results")
       .build();
 
+    HttpXmlOptions xmlOptions = new HttpXmlOptions.HttpXmlOptionsBuilder()
+      .dataLevel(2)
+      .build();
+
+    TupleMetadata testSchema = new SchemaBuilder()
+      .add("attributes", MinorType.MAP)
+      .addNullable("COMMON", MinorType.VARCHAR)
+      .addNullable("BOTANICAL", MinorType.VARCHAR)
+      .addNullable("ZONE", MinorType.INT)
+      .addNullable("LIGHT", MinorType.VARCHAR)
+      .addNullable("PRICE", MinorType.VARCHAR)
+      .addNullable("AVAILABILITY", MinorType.VARCHAR)
+      .buildSchema();
+
+    HttpXmlOptions xmlOptionsWithSchema = new HttpXmlOptions.HttpXmlOptionsBuilder()
+      .dataLevel(2)
+      .schema(testSchema)
+      .build();
+
+
     HttpApiConfig mockXmlConfig = HttpApiConfig.builder()
       .url(makeUrl("http://localhost:%d/xml"))
       .method("GET")
@@ -295,9 +330,22 @@
       .password("pass")
       .dataPath("results")
       .inputType("xml")
-      .xmlDataLevel(2)
+      .xmlOptions(xmlOptions)
       .build();
 
+    HttpApiConfig mockXmlConfigWithSchema = HttpApiConfig.builder()
+      .url(makeUrl("http://localhost:%d/xml"))
+      .method("GET")
+      .headers(headers)
+      .authType("basic")
+      .userName("user")
+      .password("pass")
+      .dataPath("results")
+      .inputType("xml")
+      .xmlOptions(xmlOptionsWithSchema)
+      .build();
+
+
     HttpApiConfig mockGithubWithParam = HttpApiConfig.builder()
      .url(makeUrl("http://localhost:%d/orgs/{org}/repos"))
       .method("GET")
@@ -349,6 +397,7 @@ public class TestHttpPlugin extends ClusterTest {
     configs.put("mockPostPushdownWithStaticParams", 
mockPostPushdownWithStaticParams);
     configs.put("mockcsv", mockCsvConfig);
     configs.put("mockxml", mockXmlConfig);
+    configs.put("mockxml_with_schema", mockXmlConfigWithSchema);
     configs.put("github", mockGithubWithParam);
     configs.put("github2", mockGithubWithDuplicateParam);
     configs.put("github3", mockGithubWithParamInQuery);
@@ -385,6 +434,7 @@ public class TestHttpPlugin extends ClusterTest {
         .addRow("local.mockcsv", "http")
         .addRow("local.mockpost", "http")
         .addRow("local.mockxml", "http")
+        .addRow("local.mockxml_with_schema", "http")
         .addRow("local.nullpost", "http")
         .addRow("local.sunrise", "http")
         .build();
@@ -505,6 +555,35 @@ public class TestHttpPlugin extends ClusterTest {
     doSimpleSpecificQuery(sql);
   }
 
+  @Test
+  @Ignore("Requires Remote Server")
+  public void simpleStarQueryWithXMLParams() throws Exception {
+    String sql = "SELECT year, department, expense_category, budget_code, 
budget_name, modified, adopted " +
+      "FROM live.nyc WHERE type_of_data='Budget' AND records_from=1 AND 
max_records=5 AND year IS NOT null";
+
+    RowSet results = client.queryBuilder().sql(sql).rowSet();
+
+    TupleMetadata expectedSchema = new SchemaBuilder()
+      .add("year", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
+      .add("department", TypeProtos.MinorType.VARCHAR, 
TypeProtos.DataMode.OPTIONAL)
+      .add("expense_category", TypeProtos.MinorType.VARCHAR, 
TypeProtos.DataMode.OPTIONAL)
+      .add("budget_code", TypeProtos.MinorType.VARCHAR, 
TypeProtos.DataMode.OPTIONAL)
+      .add("budget_name", TypeProtos.MinorType.VARCHAR, 
TypeProtos.DataMode.OPTIONAL)
+      .add("modified", TypeProtos.MinorType.VARCHAR, 
TypeProtos.DataMode.OPTIONAL)
+      .add("adopted", TypeProtos.MinorType.VARCHAR, 
TypeProtos.DataMode.OPTIONAL)
+      .build();
+
+    RowSet expected = new RowSetBuilder(client.allocator(), expectedSchema)
+      .addRow("2022", "MEDICAL ASSISTANCE - OTPS", "MEDICAL ASSISTANCE", 
"9564", "MMIS MEDICAL ASSISTANCE", "5972433142", "5584533142")
+      .addRow("2020", "MEDICAL ASSISTANCE - OTPS", "MEDICAL ASSISTANCE", 
"9564", "MMIS MEDICAL ASSISTANCE", "5819588142", "4953233142")
+      .addRow("2014", "MEDICAL ASSISTANCE - OTPS", "MEDICAL ASSISTANCE", 
"9564", "MMIS MEDICAL ASSISTANCE", "5708101276", "5231324567")
+      .addRow("2015", "MEDICAL ASSISTANCE - OTPS", "MEDICAL ASSISTANCE", 
"9564", "MMIS MEDICAL ASSISTANCE", "5663673673", "5312507361")
+      .build();
+
+    RowSetUtilities.verify(expected, results);
+  }
+
+
   private void doSimpleSpecificQuery(String sql) throws Exception {
 
     RowSet results = client.queryBuilder().sql(sql).rowSet();
@@ -758,6 +837,22 @@ public class TestHttpPlugin extends ClusterTest {
     }
   }
 
+  @Test
+  public void testSerDeXML() throws Exception {
+    try (MockWebServer server = startServer()) {
+
+      server.enqueue(
+        new MockResponse().setResponseCode(200)
+          .setBody(TEST_XML_RESPONSE)
+      );
+
+      String sql = "SELECT COUNT(*) FROM local.mockxml.`xml?arg1=4` ";
+      String plan = queryBuilder().sql(sql).explainJson();
+      long cnt = queryBuilder().physical(plan).singletonLong();
+      assertEquals("Counts should match", 36L, cnt);
+    }
+  }
+
   @Test
    public void testSerDeCSV() throws Exception {
     try (MockWebServer server = startServer()) {
@@ -874,6 +969,37 @@ public class TestHttpPlugin extends ClusterTest {
     }
   }
 
+  @Test
+  public void testXmlWithSchemaResponse() throws Exception {
+    String sql = "SELECT * FROM local.mockxml_with_schema.`?arg1=4` LIMIT 5";
+    try (MockWebServer server = startServer()) {
+
+      server.enqueue(new 
MockResponse().setResponseCode(200).setBody(TEST_XML_RESPONSE));
+
+      RowSet results = client.queryBuilder().sql(sql).rowSet();
+
+      TupleMetadata expectedSchema = new SchemaBuilder()
+        .add("attributes", MinorType.MAP)
+        .addNullable("COMMON", MinorType.VARCHAR)
+        .addNullable("BOTANICAL", MinorType.VARCHAR)
+        .addNullable("ZONE", MinorType.INT)
+        .addNullable("LIGHT", MinorType.VARCHAR)
+        .addNullable("PRICE", MinorType.VARCHAR)
+        .addNullable("AVAILABILITY", MinorType.VARCHAR)
+        .buildSchema();
+
+      RowSet expected = new RowSetBuilder(client.allocator(), expectedSchema)
+        .addRow(mapArray(), "Bloodroot", "Sanguinaria canadensis", 4, "Mostly 
Shady", "$2.44", "031599")
+        .addRow(mapArray(),"Columbine", "Aquilegia canadensis", 3, "Mostly 
Shady", "$9.37", "030699")
+        .addRow(mapArray(),"Marsh Marigold", "Caltha palustris", 4, "Mostly 
Sunny", "$6.81", "051799")
+        .addRow(mapArray(), "Cowslip", "Caltha palustris", 4, "Mostly Shady", 
"$9.90", "030699")
+        .addRow(mapArray(), "Dutchman's-Breeches", "Dicentra cucullaria", 3, 
"Mostly Shady", "$6.44", "012099")
+        .build();
+
+      RowSetUtilities.verify(expected, results);
+    }
+  }
+
   @Test
   public void testImplicitFieldsWithJSON() throws Exception {
     String sql = "SELECT _response_code, _response_message, 
_response_protocol, _response_url FROM 
local.sunrise.`?lat=36.7201600&lng=-4.4203400&date=2019-10-02`";
diff --git 
a/contrib/storage-http/src/test/java/org/apache/drill/exec/store/http/TestPagination.java
 
b/contrib/storage-http/src/test/java/org/apache/drill/exec/store/http/TestPagination.java
index 2334315a3e..5931e0d032 100644
--- 
a/contrib/storage-http/src/test/java/org/apache/drill/exec/store/http/TestPagination.java
+++ 
b/contrib/storage-http/src/test/java/org/apache/drill/exec/store/http/TestPagination.java
@@ -226,6 +226,10 @@ public class TestPagination extends ClusterTest {
     List<String> params = new ArrayList<>();
     params.add("foo");
 
+    HttpXmlOptions xmlOptions = HttpXmlOptions.builder()
+      .dataLevel(2)
+      .build();
+
     HttpApiConfig mockXmlConfigWithPaginator = HttpApiConfig.builder()
      .url("http://localhost:8092/xml")
       .method("GET")
@@ -233,7 +237,7 @@ public class TestPagination extends ClusterTest {
       .params(params)
       .paginator(pagePaginatorForXML)
       .inputType("xml")
-      .xmlDataLevel(2)
+      .xmlOptions(xmlOptions)
       .build();
 
     HttpApiConfig mockXmlConfigWithPaginatorAndUrlParams = 
HttpApiConfig.builder()
@@ -243,7 +247,7 @@ public class TestPagination extends ClusterTest {
       .params(params)
       .paginator(pagePaginatorForXML)
       .inputType("xml")
-      .xmlDataLevel(2)
+      .xmlOptions(xmlOptions)
       .build();
 
 
diff --git a/contrib/storage-http/src/test/resources/data/response.xml 
b/contrib/storage-http/src/test/resources/data/response.xml
index d9dc3f5c1e..6681266a51 100644
--- a/contrib/storage-http/src/test/resources/data/response.xml
+++ b/contrib/storage-http/src/test/resources/data/response.xml
@@ -197,7 +197,7 @@
   <PLANT>
     <COMMON>Black-Eyed Susan</COMMON>
     <BOTANICAL>Rudbeckia hirta</BOTANICAL>
-    <ZONE>Annual</ZONE>
+    <ZONE>8</ZONE>
     <LIGHT>Sunny</LIGHT>
     <PRICE>$9.80</PRICE>
     <AVAILABILITY>061899</AVAILABILITY>
@@ -221,7 +221,7 @@
   <PLANT>
     <COMMON>Butterfly Weed</COMMON>
     <BOTANICAL>Asclepias tuberosa</BOTANICAL>
-    <ZONE>Annual</ZONE>
+    <ZONE>8</ZONE>
     <LIGHT>Sunny</LIGHT>
     <PRICE>$2.78</PRICE>
     <AVAILABILITY>063099</AVAILABILITY>
@@ -229,7 +229,7 @@
   <PLANT>
     <COMMON>Cinquefoil</COMMON>
     <BOTANICAL>Potentilla</BOTANICAL>
-    <ZONE>Annual</ZONE>
+    <ZONE>8</ZONE>
     <LIGHT>Shade</LIGHT>
     <PRICE>$7.06</PRICE>
     <AVAILABILITY>052599</AVAILABILITY>
@@ -237,7 +237,7 @@
   <PLANT>
     <COMMON>Primrose</COMMON>
     <BOTANICAL>Oenothera</BOTANICAL>
-    <ZONE>3 - 5</ZONE>
+    <ZONE>3</ZONE>
     <LIGHT>Sunny</LIGHT>
     <PRICE>$6.56</PRICE>
     <AVAILABILITY>013099</AVAILABILITY>
@@ -261,7 +261,7 @@
   <PLANT>
     <COMMON>Jacob's Ladder</COMMON>
     <BOTANICAL>Polemonium caeruleum</BOTANICAL>
-    <ZONE>Annual</ZONE>
+    <ZONE>8</ZONE>
     <LIGHT>Shade</LIGHT>
     <PRICE>$9.26</PRICE>
     <AVAILABILITY>022199</AVAILABILITY>
@@ -269,7 +269,7 @@
   <PLANT>
     <COMMON>Greek Valerian</COMMON>
     <BOTANICAL>Polemonium caeruleum</BOTANICAL>
-    <ZONE>Annual</ZONE>
+    <ZONE>8</ZONE>
     <LIGHT>Shade</LIGHT>
     <PRICE>$4.36</PRICE>
     <AVAILABILITY>071499</AVAILABILITY>
@@ -277,7 +277,7 @@
   <PLANT>
     <COMMON>California Poppy</COMMON>
     <BOTANICAL>Eschscholzia californica</BOTANICAL>
-    <ZONE>Annual</ZONE>
+    <ZONE>8</ZONE>
     <LIGHT>Sun</LIGHT>
     <PRICE>$7.89</PRICE>
     <AVAILABILITY>032799</AVAILABILITY>
@@ -285,7 +285,7 @@
   <PLANT>
     <COMMON>Shooting Star</COMMON>
     <BOTANICAL>Dodecatheon</BOTANICAL>
-    <ZONE>Annual</ZONE>
+    <ZONE>8</ZONE>
     <LIGHT>Mostly Shady</LIGHT>
     <PRICE>$8.60</PRICE>
     <AVAILABILITY>051399</AVAILABILITY>
@@ -293,7 +293,7 @@
   <PLANT>
     <COMMON>Snakeroot</COMMON>
     <BOTANICAL>Cimicifuga</BOTANICAL>
-    <ZONE>Annual</ZONE>
+    <ZONE>8</ZONE>
     <LIGHT>Shade</LIGHT>
     <PRICE>$5.63</PRICE>
     <AVAILABILITY>071199</AVAILABILITY>
@@ -306,4 +306,4 @@
     <PRICE>$3.02</PRICE>
     <AVAILABILITY>022299</AVAILABILITY>
   </PLANT>
-</CATALOG>
\ No newline at end of file
+</CATALOG>
