paul-rogers commented on a change in pull request #2160:
URL: https://github.com/apache/drill/pull/2160#discussion_r568948930



##########
File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/log/TestLogReader.java
##########
@@ -177,6 +180,15 @@ private static LogFormatConfig firewallConfig() {
         regex, "ssdlog", null, schema);
   }
 
+  // DRILL-7853
+  private static LogFormatConfig issue7853Config() {
+    String regex = "([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*):([0-9]*) ([^ 
]*)[:-]([0-9]*) ([-.0-9]*) ([-.0-9]*) ([-.0-9]*) (|[-0-9]*) (-|[-0-9]*) 
([-0-9]*) ([-0-9]*) \\\"([^ ]*) ([^ ]*) (- |[^ ]*)\\\" \\\"([^\\\"]*)\\\" 
([A-Z0-9-]+) ([A-Za-z0-9.-]*) ([^ ]*) \\\"([^\\\"]*)\\\" \\\"([^\\\"]*)\\\" 
\\\"([^\\\"]*)\\\" ([-.0-9]*) ([^ ]*) \\\"([^\\\"]*)\\\"($| \\\"[^ ]*\\\")(.*)";
+    List<LogFormatField> schema = Lists.newArrayList(
+        new LogFormatField("type", "VARCHAR"),
+        new LogFormatField("time", "TIMESTAMP", 
"yyyy-MM-dd''T''HH:mm:ss.SSSSSSZ"));

Review comment:
       The original source of the error is likely that the user's log format 
included a Joda-specific time format that must be changed to use a Java time 
format. Makes sense to better report the error. I suppose we should also add 
this change to the docs. I had forgotten that the log reader allowed Joda 
formats. No tests caught this issue. Thanks for adding this test.

##########
File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/log/TestLogReader.java
##########
@@ -765,4 +777,16 @@ public void testFirewallSchema() throws RpcException {
     RowSetUtilities.verify(expected, result);
     result.clear();
   }
-}
+
+  @Test
+  public void testIssue7853() throws Exception {
+    thrownException.expect(UserRemoteException.class);
+    thrownException.expectMessage("is not valid for type TIMESTAMP");
+    String sql = "SELECT * FROM cp.`regex/issue7853.log3`";
+    QueryBuilder builder = client.queryBuilder().sql(sql);
+    RowSet sets = builder.rowSet();
+
+    assertEquals(2, sets.rowCount());
+    sets.clear();
+  }
+}

Review comment:
       Missing newline

##########
File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/log/TestLogReader.java
##########
@@ -177,6 +180,15 @@ private static LogFormatConfig firewallConfig() {
         regex, "ssdlog", null, schema);
   }
 
+  // DRILL-7853
+  private static LogFormatConfig issue7853Config() {
+    String regex = "([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*):([0-9]*) ([^ 
]*)[:-]([0-9]*) ([-.0-9]*) ([-.0-9]*) ([-.0-9]*) (|[-0-9]*) (-|[-0-9]*) 
([-0-9]*) ([-0-9]*) \\\"([^ ]*) ([^ ]*) (- |[^ ]*)\\\" \\\"([^\\\"]*)\\\" 
([A-Z0-9-]+) ([A-Za-z0-9.-]*) ([^ ]*) \\\"([^\\\"]*)\\\" \\\"([^\\\"]*)\\\" 
\\\"([^\\\"]*)\\\" ([-.0-9]*) ([^ ]*) \\\"([^\\\"]*)\\\"($| \\\"[^ ]*\\\")(.*)";

Review comment:
       Even better, whittle this test, and the required log file, down to the 
one field which failed: the date field. Then, the code here, and the test code 
below, can simply verify that the log reader is correctly parsing the date 
field (using Java date/time formats.)
   
   If you make the log file itself simpler, then you can write the log file in 
the test. See the various `TestCsv*` tests for an example of how to do this.

##########
File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/scan/convert/StandardConversions.java
##########
@@ -181,6 +181,10 @@ public DirectConverter newInstance(
       final Constructor<? extends DirectConverter> ctor = 
conversionClass.getDeclaredConstructor(ScalarWriter.class, Map.class);
       return ctor.newInstance(baseWriter, mergeProperties(properties));
     } catch (final ReflectiveOperationException e) {
+      // There is no need to continue

Review comment:
       Better comment might be:
   
   ```
     // Not a real reflection error: pass along underlying cause.
   ```

##########
File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/log/TestLogReader.java
##########
@@ -177,6 +180,15 @@ private static LogFormatConfig firewallConfig() {
         regex, "ssdlog", null, schema);
   }
 
+  // DRILL-7853
+  private static LogFormatConfig issue7853Config() {
+    String regex = "([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*):([0-9]*) ([^ 
]*)[:-]([0-9]*) ([-.0-9]*) ([-.0-9]*) ([-.0-9]*) (|[-0-9]*) (-|[-0-9]*) 
([-0-9]*) ([-0-9]*) \\\"([^ ]*) ([^ ]*) (- |[^ ]*)\\\" \\\"([^\\\"]*)\\\" 
([A-Z0-9-]+) ([A-Za-z0-9.-]*) ([^ ]*) \\\"([^\\\"]*)\\\" \\\"([^\\\"]*)\\\" 
\\\"([^\\\"]*)\\\" ([-.0-9]*) ([^ ]*) \\\"([^\\\"]*)\\\"($| \\\"[^ ]*\\\")(.*)";

Review comment:
       Nit: to make the pattern more readable, use `\d*` in place of `[0-9]*` 
and `\W*` in place of `[(^ ]*)`.
   
   One of the problems with the log format reader is that the above long regex 
is nearly unreadable. Better, in code, to provide the poor reader some hints 
(as we do elsewhere):
   
   ```
      regex = "\\W*"  // field 1 name
                + " \\W*" // field 2 name
                ...
   ```
   
   Or, even better:
   
   ```
       String plain = "\\W*";
       String integer = "\\d*";
       String sep = " ";
       ...
       regex = plain // field1
                 + sep + plain // field 2
        ...
   ```
   
   Not pretty, but makes the code more readable. And, as you create the format, 
you can comment out all but a few fields.
   
   For production use, there should be some simple way to provide this in the 
config since huge amount of time will be wasted trying to get the long regex 
correct in the current design.

##########
File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/log/TestLogReader.java
##########
@@ -765,4 +777,16 @@ public void testFirewallSchema() throws RpcException {
     RowSetUtilities.verify(expected, result);
     result.clear();
   }
-}
+
+  @Test
+  public void testIssue7853() throws Exception {
+    thrownException.expect(UserRemoteException.class);
+    thrownException.expectMessage("is not valid for type TIMESTAMP");
+    String sql = "SELECT * FROM cp.`regex/issue7853.log3`";
+    QueryBuilder builder = client.queryBuilder().sql(sql);
+    RowSet sets = builder.rowSet();
+
+    assertEquals(2, sets.rowCount());
+    sets.clear();

Review comment:
       This is fine. There is a simpler solution:
   
   ```
      QuerySummary result = client.queryBuilder().sql(sql).run();
      assertEquals(2, result.records);
   ```
   
   However, all this test does is assert not crash. What you really want to 
know is that the format worked. So, we want to build a result set with expected 
results as done in other tests. To avoid having to crate a zillion fields, 
change your query to select only the fields of interest.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to