paul-rogers commented on a change in pull request #1807: DRILL-7293: Convert
the regex ("log") plugin to use EVF
URL: https://github.com/apache/drill/pull/1807#discussion_r294615119
##########
File path:
exec/java-exec/src/main/java/org/apache/drill/exec/store/log/README.md
##########
@@ -129,19 +129,62 @@ Drill 1.16 introduced the `CREATE SCHEMA` command to allow you to define the
schema for your table. This plugin was created earlier. Here is how the two schema
systems interact.
+### Plugin Config Provides Regex and Field Names
+
+The first way to use the provided schema is just to define column types.
+In this use case, the plugin config provides the physical layout (pattern
+and column names), the provided schema provides data types and default
+values (for missing columns.)
+
+In this case:
+
* The plugin config must provide the regex.
-* The plugin config should provide the list of column names. (If not provided,
+* The plugin config provides the list of column names. (If not provided,
the names will be `field_1`, `field_2`, etc.)
-* The plugin config can provide a type for each field. Text data from the regex
-is converted to a nullable column of the specified type.
-* The table can provide a schema via `CREATE SCHEMA`. If so, the column names
-in the schema must match those in the plugin config. The types in the provided
-schema are used instead of those specified in the plugin config. The schema
+* The plugin config should not provide column types.
+* The table provides a schema via `CREATE SCHEMA`. Column names
+in the schema must match those in the plugin config by name. The types in the
+provided schema are used instead of those specified in the plugin config. The schema
allows you to specify the data type, and either nullable or `not null` cardinality.
-You may find it helpful to specify the regex and column names via the plugin
-config, types via the `CREATE SCHEMA` command.
+### Provided Schema Provides The Regex
+
+Another way to use the provided schema is to define an empty plugin config; don't
+even provide the regex. Use table properties to define the regex (and the maximum
+error count, if desired.)
+
+In this case:
+
+* Set the table property `drill.regex.regex` to the desired pattern.
Review comment:
Agreed, it is pretty awkward. The saving grace is that I did, I believe,
change "regex" to "logRegex" as you suggested. That is, the second item is the
plugin "type" name.
When we worked on the text reader, I first tried to choose good names
for the third item. You rightly pointed out that the names might be easier to
remember if we simply reuse the existing config field names, which is what I
did here.
So, even if the names are awkward, the pattern we've evolved is:
```
drill.<plugin type name>.<config field name>
```
That said, I'm open to suggestions if there is a better way to handle these
names. Now is the time to make improvements, before folks deploy schema files
that use the names.
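To make the naming pattern concrete, here is a minimal sketch of how such a property name could be assembled. The class `TablePropertyNames` and the method `propertyName` are hypothetical names used only for illustration; they are not taken from the Drill code base.

```java
// Hypothetical helper, for illustration only: not part of Drill.
public final class TablePropertyNames {
  // Prefix shared by all such table properties, per the pattern above.
  private static final String PREFIX = "drill";

  private TablePropertyNames() { }

  /** Joins the pieces as drill.<plugin type name>.<config field name>. */
  public static String propertyName(String pluginType, String configField) {
    if (pluginType == null || pluginType.isEmpty()
        || configField == null || configField.isEmpty()) {
      throw new IllegalArgumentException(
          "plugin type and config field are required");
    }
    return String.join(".", PREFIX, pluginType, configField);
  }

  public static void main(String[] args) {
    // Matches the property name shown in the README hunk above.
    System.out.println(propertyName("regex", "regex"));
  }
}
```

With the plugin type `regex` and the config field `regex`, this yields `drill.regex.regex`, the name that appears in the README hunk. A different plugin type name would only change the middle segment.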