Tamas Nemeth created GOBBLIN-231:
------------------------------------

             Summary: Grok to Json Converter
                 Key: GOBBLIN-231
                 URL: https://issues.apache.org/jira/browse/GOBBLIN-231
             Project: Apache Gobblin
          Issue Type: New Feature
          Components: gobblin-core
            Reporter: Tamas Nemeth
            Assignee: Abhishek Tiwari
            Priority: Minor

This converter turns text records into JSON based on a Grok pattern.

GrokToJsonConverter accepts an already deserialized text row, where each record is a String, and converts it to JSON according to the configured Grok pattern. The schema is represented as a JsonArray, the same interface used by CsvToJsonConverter. The converter only supports Grok patterns in which every group is named, because it uses the group names as column names.

The following config properties can be set:

The Grok pattern to use for the conversion:

    converter.grok_to_json.pattern=^%{IPORHOST:clientip} (?:-|%{USER:ident}) (?:-|%{USER:auth}) \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|-)\" %{NUMBER:response} (?:-|%{NUMBER:bytes})

Path to the Grok patterns (if not set, the default patterns are used):

    converter.grok_to_json.patterns=/tmp/grok_patterns

Treat an empty string as a null value:

    converter.grok_to_json.empty_as_null=true

Specify the null string:

    converter.grok_to_json.null_string=null

Example of schema:

    [
      {
        "columnName": "Day",
        "comment": "",
        "isNullable": "true",
        "dataType": {
          "type": "string"
        }
      },
      {
        "columnName": "Pageviews",
        "comment": "",
        "isNullable": "true",
        "dataType": {
          "type": "long"
        }
      }
    ]
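
To illustrate how named groups can serve as column names, here is a minimal Python sketch. It is not the Gobblin implementation (which is Java) and it does not use a real Grok library. The small PATTERNS table and the names compile_grok, convert, and SCHEMA are hypothetical and exist only for this example. The way empty_as_null and null_string are applied here is an assumption about how those properties are meant to behave, and the isNullable flag is ignored for brevity.

```python
import json
import re

# Hypothetical, heavily simplified stand-in for a Grok pattern file.
PATTERNS = {
    "WORD": r"\w+",
    "INT": r"-?\d+",
    "NOTSPACE": r"\S+",
}

# Matches references of the form %{PATTERN:field}; unnamed groups are not supported.
GROK_REF = re.compile(r"%\{(\w+):(\w+)\}")

# Schema in the same shape as the example above.
SCHEMA = [
    {"columnName": "Day", "comment": "", "isNullable": "true",
     "dataType": {"type": "string"}},
    {"columnName": "Pageviews", "comment": "", "isNullable": "true",
     "dataType": {"type": "long"}},
]


def compile_grok(pattern):
    """Expand every %{PATTERN:field} reference into a Python named group."""
    def expand(match):
        pattern_name, field = match.group(1), match.group(2)
        return "(?P<%s>%s)" % (field, PATTERNS[pattern_name])
    return re.compile(GROK_REF.sub(expand, pattern))


def convert(line, regex, schema, empty_as_null=True, null_string="null"):
    """Turn one text record into a dict keyed by the schema's column names."""
    match = regex.match(line)
    if match is None:
        raise ValueError("record does not match the pattern: %r" % line)
    captured = match.groupdict()  # group name -> captured text, or None
    record = {}
    for column in schema:
        name = column["columnName"]
        value = captured.get(name)
        # Assumed null handling: unmatched group, the null string, or "".
        if value is None or value == null_string or (empty_as_null and value == ""):
            record[name] = None
        elif column["dataType"]["type"] == "long":
            record[name] = int(value)
        else:
            record[name] = value
    return record


regex = compile_grok(r"^%{WORD:Day} (?:-|%{INT:Pageviews})$")
print(json.dumps(convert("Monday 42", regex, SCHEMA)))
```

Because the output keys come straight from the group names, a pattern with an unnamed group would have no column to map to, which is consistent with the restriction described above.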