[jira] [Commented] (DRILL-7143) Enforce column-level constraints when using a schema

ASF GitHub Bot (JIRA) Sun, 07 Apr 2019 06:05:00 -0700


    [ https://issues.apache.org/jira/browse/DRILL-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16811874#comment-16811874 ]


ASF GitHub Bot commented on DRILL-7143:
---------------------------------------

arina-ielchiieva commented on pull request #1726: DRILL-7143: Support default value for empty columns
URL: https://github.com/apache/drill/pull/1726#discussion_r272833495
 
 

 ##########
 File path: exec/java-exec/src/test/java/org/apache/drill/exec/store/easy/text/compliant/TestCsvWithSchema.java
 ##########
 @@ -1069,4 +1070,203 @@ public void testMissingColsReqDefault() throws Exception {
       resetSchema();
     }
   }
+
+  public static final String missingColContents[] = {
+    "id,amount,start_date",
+    "1,20,2019-01-01",
+    "2",
+    "3,30"
+  };
+
+  /**
+   * Demonstrate that CSV works for a schema with nullable types when columns
+   * are missing (there is no comma to introduce an empty field in the data.)
+   */
+  @Test
+  public void testMissingColsNullable() throws Exception {
+    String tableName = "missingColsNullable";
+    String tablePath = buildTable(tableName, missingColContents);
+
+    try {
+      enableV3(true);
+      enableSchema(true);
+      String sql = "create or replace schema (" +
+          "id int not null, amount int, start_date date" +
+          ") for table %s";
+      run(sql, tablePath);
+      sql = "SELECT * FROM " + tablePath + " ORDER BY id";
+      RowSet actual = client.queryBuilder().sql(sql).rowSet();
+
+      TupleMetadata expectedSchema = new SchemaBuilder()
+          .add("id", MinorType.INT)
+          .addNullable("amount", MinorType.INT)
+          .addNullable("start_date", MinorType.DATE)
+          .buildSchema();
+      RowSet expected = new RowSetBuilder(client.allocator(), expectedSchema)
+          .addRow(1, 20, new LocalDate(2019, 1, 1))
+          .addRow(2, null, null)
+          .addRow(3, 30, null)
+          .build();
+      RowSetUtilities.verify(expected, actual);
+    } finally {
+      resetV3();
+      resetSchema();
+    }
+  }
+
+  public static final String blankColContents[] = {
+    "id,amount,start_date",
+    "1,20,2019-01-01",
+    "2,,",
+    "3,30,"
+  };
+
+  /**
+   * Demonstrate that CSV uses a comma to introduce a column,
+   * even if that column has no data. In this case, CSV assumes the
+   * value of the column is a blank string.
+   * <p>
+   * Such a value cannot be converted to a number or a date, even for a
+   * nullable column, because a blank string is neither a valid number nor
+   * a valid date.
+   */
+
+  @Test
+  public void testBlankCols() throws Exception {
+    String tableName = "blankCols";
+    String tablePath = buildTable(tableName, blankColContents);
+
+    try {
+      enableV3(true);
+      enableSchema(true);
+      String sql = "SELECT * FROM " + tablePath + " ORDER BY id";
+      RowSet actual = client.queryBuilder().sql(sql).rowSet();
+
+      TupleMetadata expectedSchema = new SchemaBuilder()
+          .add("id", MinorType.VARCHAR)
+          .add("amount", MinorType.VARCHAR)
+          .add("start_date", MinorType.VARCHAR)
+          .buildSchema();
+      RowSet expected = new RowSetBuilder(client.allocator(), expectedSchema)
+          .addRow("1", "20", "2019-01-01")
+          .addRow("2", "", "")
+          .addRow("3", "30", "")
+          .build();
+      RowSetUtilities.verify(expected, actual);
+    } finally {
+      resetV3();
+      resetSchema();
+    }
+  }
+
+  /**
+   * Use the same data set as above tests, but use a schema to do type
+   * conversion. Blank columns become 0 for numeric non-nullable, nulls for
+   * nullable non-numeric.
+   *
+   * @throws Exception
 
 Review comment:
   Please remove the @throws tag or add a description to it to avoid warnings in the IDE.
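
   For readers following the diff, the two data sets differ only in whether a comma introduces the field: "2" has no second or third field at all, while "2,," has two blank ones. A minimal sketch of that distinction in plain Java follows. It is not Drill's text reader: the class and method names are hypothetical, and the naive split ignores quoting and escaping.

```java
public class CsvFieldSketch {

  /**
   * Splits one CSV line on commas. The limit of -1 keeps trailing
   * empty strings, so "2,," yields three fields rather than one.
   * Hypothetical helper for illustration; no quote handling.
   */
  static String[] fields(String line) {
    return line.split(",", -1);
  }

  public static void main(String[] args) {
    // Missing columns: no comma, so only the first field exists.
    System.out.println(fields("2").length);
    // Blank columns: each comma introduces a field holding "".
    System.out.println(fields("2,,").length);
    // Mixed case from blankColContents: the last field is blank.
    System.out.println(fields("3,30,").length);
  }
}
```

   Under this reading, a missing field carries no value, while a blank field carries the empty string, which is why the blank case needs a conversion rule before it can land in a numeric or date column.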
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Enforce column-level constraints when using a schema
> ----------------------------------------------------
>
>                 Key: DRILL-7143
>                 URL: https://issues.apache.org/jira/browse/DRILL-7143
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.16.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Major
>             Fix For: 1.16.0
>
>
> The recently added schema framework enforces schema constraints at the table 
> level. We now wish to add additional constraints at the column level.
> * If a column is marked as "strict", then the reader will use the exact type 
> and mode from the column schema, or fail if it is not possible to do so.
> * If a column is marked as required, and provides a default value, then that 
> value is used instead of 0 if a row is missing a value for that column.
> This PR may also contain other fixes to the base functionality revealed through 
> additional testing.
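
The default-value rule in the description can be sketched in plain Java. This is an illustration under stated assumptions, not Drill's schema API: the IntColumn class, its fields, and the resolve method are hypothetical names, and only the "value is missing" case from the description is modeled.

```java
public class DefaultValueSketch {

  /** Hypothetical description of an INT column; not a Drill class. */
  static final class IntColumn {
    final boolean nullable;
    final Integer defaultValue; // null means no default was declared

    IntColumn(boolean nullable, Integer defaultValue) {
      this.nullable = nullable;
      this.defaultValue = defaultValue;
    }

    /**
     * Resolves the value for one row. A null argument stands for
     * "the row has no value for this column".
     */
    Integer resolve(String raw) {
      if (raw != null) {
        return Integer.valueOf(raw.trim());
      }
      if (nullable) {
        return null; // nullable columns stay null
      }
      if (defaultValue != null) {
        return defaultValue; // required column with a declared default
      }
      return 0; // required column without a default falls back to 0
    }
  }

  public static void main(String[] args) {
    IntColumn required = new IntColumn(false, null);
    IntColumn withDefault = new IntColumn(false, 20);
    IntColumn nullable = new IntColumn(true, null);

    System.out.println(required.resolve(null));
    System.out.println(withDefault.resolve(null));
    System.out.println(nullable.resolve(null));
    System.out.println(withDefault.resolve("30"));
  }
}
```

The ordering of the checks carries the design choice: a value present in the row always wins, nullability is consulted next, and the declared default only replaces the 0 fallback for required columns.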



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
