Hello All,
First, a big thank you, Paul, for updating the log regex reader to the new EVF
framework. I am having a little trouble getting it to work, however.
Here is my config:
,
"ssdlog": {
"type": "logRegex",
"regex":
"(\\w{3}\\s\\d{1,2}\\s\\d{4}\\s\\d{2}:\\d{2}:\\d{2})\\s+(\\w+)\\[(\\d+)\\]:\\s(.*?(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}).*?)",
"extension": "ssdlog",
"maxErrors": 10,
"schema": [
{"fieldName":"eventDate"}
]
},
This works if I leave the schema null; however, if I attempt to populate it, I
get JSON errors. This is what I originally had:
"schema" : [ {
"fieldName" : "eventDate",
"fieldType" : "TIMESTAMP",
"format" : "MMM dd yyyy hh:mm:ss"
}, {
"fieldName" : "process_name"
}, {
"fieldName" : "pid",
"fieldType" : "INT"
}, {
"fieldName" : "message"
}, {
"fieldName" : "src_ip"
} ]
which worked.
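For reference, the regex can be checked on its own, outside Drill. This is a minimal sketch: the class name and the sample log line are made up for illustration (not taken from my real data), and it uses a full-line `matches()`, which is only my assumption about how the pattern is applied. It shows that the five capture groups line up with the five schema fields above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SsdLogRegexCheck {
  // Same pattern as the "regex" entry in the config, written as a Java string literal.
  static final Pattern LOG_PATTERN = Pattern.compile(
      "(\\w{3}\\s\\d{1,2}\\s\\d{4}\\s\\d{2}:\\d{2}:\\d{2})\\s+(\\w+)\\[(\\d+)\\]:\\s"
      + "(.*?(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}).*?)");

  // Returns the five captured groups in order, or null if the line does not match.
  static String[] parse(String line) {
    Matcher m = LOG_PATTERN.matcher(line);
    if (!m.matches()) {
      return null;
    }
    String[] fields = new String[m.groupCount()];
    for (int i = 0; i < fields.length; i++) {
      fields[i] = m.group(i + 1);
    }
    return fields;
  }

  public static void main(String[] args) {
    // Hypothetical sample line, invented for this check.
    String sample =
        "Dec 18 2019 12:34:56 sshd[1234]: Accepted password for root from 10.1.2.3 port 22";
    String[] names = {"eventDate", "process_name", "pid", "message", "src_ip"};
    String[] fields = parse(sample);
    for (int i = 0; i < names.length; i++) {
      System.out.println(names[i] + " = " + fields[i]);
    }
  }
}
```

Note that the fifth group (the IP address) is nested inside the fourth (the message), so the address appears in both fields.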
Also, I am working on updating a few format plugins and kept getting the
following error when I try to run unit tests:
at org.apache.drill.test.ClusterFixture.<init>(ClusterFixture.java:152)
at org.apache.drill.test.ClusterFixtureBuilder.build(ClusterFixtureBuilder.java:283)
at org.apache.drill.test.ClusterTest.startCluster(ClusterTest.java:83)
at org.apache.drill.exec.store.excel.TestExcelFormat.setup(TestExcelFormat.java:49)
Caused by: com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'drill.exec.grace_period_ms'
at com.typesafe.config.impl.SimpleConfig.findKey(SimpleConfig.java:115)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:136)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:142)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:142)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:150)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:155)
at com.typesafe.config.impl.SimpleConfig.getConfigNumber(SimpleConfig.java:170)
at com.typesafe.config.impl.SimpleConfig.getInt(SimpleConfig.java:181)
at org.apache.drill.common.config.NestedConfig.getInt(NestedConfig.java:96)
at org.apache.drill.common.config.DrillConfig.getInt(DrillConfig.java:44)
at org.apache.drill.common.config.NestedConfig.getInt(NestedConfig.java:96)
at org.apache.drill.common.config.DrillConfig.getInt(DrillConfig.java:44)
at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:160)
at org.apache.drill.exec.server.Drillbit.<init>(Drillbit.java:138)
at org.apache.drill.test.ClusterFixture.startDrillbits(ClusterFixture.java:228)
at org.apache.drill.test.ClusterFixture.<init>(ClusterFixture.java:146)
... 3 more
Process finished with exit code 255
I understand that I have to set the variable drill.exec.grace_period_ms, but
I'm not sure how/where to do this. Here is the beginning of my unit test code:
@ClassRule
public static final BaseDirTestWatcher dirTestWatcher = new BaseDirTestWatcher();

@BeforeClass
public static void setup() throws Exception {
  ClusterTest.startCluster(ClusterFixture.builder(dirTestWatcher).maxParallelization(1));
  definePlugin();
}

private static void definePlugin() throws ExecutionSetupException {
  ExcelFormatConfig sampleConfig = new ExcelFormatConfig();

  // Define a temporary plugin for the "cp" storage plugin.
  Drillbit drillbit = cluster.drillbit();
  final StoragePluginRegistry pluginRegistry = drillbit.getContext().getStorage();
  final FileSystemPlugin plugin = (FileSystemPlugin) pluginRegistry.getPlugin("cp");
  final FileSystemConfig pluginConfig = (FileSystemConfig) plugin.getConfig();
  pluginConfig.getFormats().put("sample", sampleConfig);
  pluginRegistry.createOrUpdate("cp", pluginConfig, false);
}

@Test
public void testStarQuery() throws RpcException {
  String sql = "SELECT * FROM cp.`excel/test_data.xlsx` LIMIT 5";
  RowSet results = client.queryBuilder().sql(sql).rowSet();

  TupleMetadata expectedSchema = new SchemaBuilder()
    .add("id", TypeProtos.MinorType.FLOAT8, TypeProtos.DataMode.OPTIONAL)
    .add("first__name", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
    .add("last__name", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
    .add("email", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
    .add("gender", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
    .add("birthdate", TypeProtos.MinorType.VARCHAR, TypeProtos.DataMode.OPTIONAL)
    .add("balance", TypeProtos.MinorType.FLOAT8, TypeProtos.DataMode.OPTIONAL)
    .add("order__count", TypeProtos.MinorType.FLOAT8, TypeProtos.DataMode.OPTIONAL)
    .add("average__order", TypeProtos.MinorType.FLOAT8, TypeProtos.DataMode.OPTIONAL)
    .buildSchema();

  RowSet expected = new RowSetBuilder(client.allocator(), expectedSchema)
    .addRow(1.0, "Cornelia", "Matej", "[email protected]", "Female", "10/31/1974", 735.29, 22.0, 33.42227273)
    .addRow(2.0, "Nydia", "Heintsch", "[email protected]", "Female", "12/10/1966", 784.14, 22.0, 35.64272727)
    .addRow(3.0, "Waiter", "Sherel", "[email protected]", "Male", "3/12/1961", 172.36, 17.0, 10.13882353)
    .addRow(4.0, "Cicely", "Lyver", "[email protected]", "Female", "5/4/2000", 987.39, 6.0, 164.565)
    .addRow(5.0, "Dorie", "Doe", "[email protected]", "Female", "12/28/1955", 852.48, 17.0, 50.14588235)
    .build();

  new RowSetComparison(expected).verifyAndClearAll(results);
}
Thanks!
-C