[jira] [Commented] (DRILL-7261) Simplify Easy format config for new scan framework

ASF GitHub Bot (JIRA) Mon, 03 Jun 2019 00:16:26 -0700


    [ https://issues.apache.org/jira/browse/DRILL-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16854294#comment-16854294 ]


ASF GitHub Bot commented on DRILL-7261:
---------------------------------------

arina-ielchiieva commented on pull request #1796: DRILL-7261: Simplify Easy framework config
URL: https://github.com/apache/drill/pull/1796#discussion_r289711757
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/TextFormatPlugin.java
 ##########
 @@ -336,6 +268,53 @@ public RecordReader getRecordReader(FragmentContext context,
     }
   }
 
+  @Override
+  protected FileScanBuilder frameworkBuilder(
+      OptionManager options, EasySubScan scan) throws ExecutionSetupException {
+    ColumnsScanBuilder builder = new ColumnsScanBuilder();
+    builder.setReaderFactory(new ColumnsReaderFactory(this));
+
+    // If this format has no headers, or wants to skip them,
+    // then we must use the columns column to hold the data.
+
+    builder.requireColumnsArray(
+        ! getConfig().isHeaderExtractionEnabled());
+
+    // Text files handle nulls in an unusual way. Missing columns
+    // are set to required Varchar and filled with blanks. Yes, this
+    // means that the SQL statement or code cannot differentiate missing
+    // columns from empty columns, but that is how CSV and other text
+    // files have been defined within Drill.
+
+    builder.setNullType(Types.required(MinorType.VARCHAR));
+
+    // CSV maps blank columns to nulls (for nullable non-string columns),
+    // or to the default value (for non-nullable non-string columns.)
+
+    builder.setConversionProperty(AbstractConvertFromString.BLANK_ACTION_PROP,
+        AbstractConvertFromString.BLANK_AS_NULL);
+
+    // The text readers use required Varchar columns to represent null columns.
+
+    builder.allowRequiredNullColumns(true);
+
+    // Provide custom error context
+    builder.setContext(
+        new CustomErrorContext() {
+          @Override
+          public void addContext(UserException.Builder builder) {
+            builder.addContext("Format plugin:", PLUGIN_NAME);
+            builder.addContext("Plugin config name:", getName());
+            builder.addContext("Extract headers:",
+                Boolean.toString(getConfig().isHeaderExtractionEnabled()));
+            builder.addContext("Skip headers:",
 
 Review comment:
   Should this label be "Skip lines:" rather than "Skip headers:"?
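
   For illustration only, the labelling question can be shown with a
   minimal, self-contained sketch of the error-context pattern used in
   the hunk above. Everything in it is hypothetical: the class
   ErrorContextSketch, the ContextSource interface, and the skipLines
   parameter are stand-ins invented for this example and are not Drill
   classes or config properties. A plain map replaces
   UserException.Builder so the sketch runs on its own. It only shows a
   label that names the setting it reports.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from Drill.
final class ErrorContextSketch {

  // Stand-in for a custom error context: adds label/value pairs to a target.
  interface ContextSource {
    void addContext(Map<String, String> target);
  }

  // Builds a context whose labels match the settings they describe.
  // The skipLines parameter is an assumed name for the line-skipping option.
  static ContextSource forTextFormat(
      String pluginName, boolean extractHeaders, boolean skipLines) {
    return target -> {
      target.put("Format plugin:", pluginName);
      target.put("Extract headers:", Boolean.toString(extractHeaders));
      // Label says "lines", not "headers", because the option skips lines.
      target.put("Skip lines:", Boolean.toString(skipLines));
    };
  }

  // Collects the entries in insertion order, as an error message would list them.
  static Map<String, String> describe(ContextSource source) {
    Map<String, String> entries = new LinkedHashMap<>();
    source.addContext(entries);
    return entries;
  }
}
```

   Keeping each label aligned with the name of the option it reports
   means a user reading the error can map it back to the plugin config
   without guessing.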
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Simplify Easy format config for new scan framework
> --------------------------------------------------
>
>                 Key: DRILL-7261
>                 URL: https://issues.apache.org/jira/browse/DRILL-7261
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.16.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Major
>             Fix For: 1.17.0
>
>
> Rollup of related CSV V3 fixes along with supporting row set framework fixes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
