Tushar created GRIFFIN-313:
------------------------------
Summary: No source information provided into metric output
Key: GRIFFIN-313
URL: https://issues.apache.org/jira/browse/GRIFFIN-313
Project: Griffin
Issue Type: Improvement
Components: accuracy-batch, completeness-batch, timeliness-batch
Reporter: Tushar
Users can provide multiple sources in a Griffin config and apply a different
constraint to each source. However, the metric output gives no indication of
which constraint was applied to which source. The input config is shown below:
{
  "name": "accu_batch",
  "process.type": "batch",
  "data.sources": [
    {
      "name": "source",
      "baseline": true,
      "connectors": [
        {
          "type": "avro",
          "version": "1.7",
          "config": {
            "file.name": "/griffin-master/measure/src/test/resources/users_info_src/users_info_src.avro"
          }
        }
      ]
    },
    {
      "name": "source_tgt",
      "baseline": true,
      "connectors": [
        {
          "type": "avro",
          "version": "1.7",
          "config": {
            "file.name": "/griffin-master/measure/src/test/resources/users_info_src/users_info_src.avro"
          }
        }
      ]
    }
  ],
  "evaluate.rule": {
    "rules": [
      {
        "dsl.type": "griffin-dsl",
        "dq.type": "completeness",
        "out.dataframe.name": "comp",
        "rule": "email, post_code, first_name",
        "details": { "source": "source" },
        "out": [
          { "type": "metric", "name": "completeness", "flatten": "map" }
        ]
      },
      {
        "dsl.type": "griffin-dsl",
        "dq.type": "completeness",
        "out.dataframe.name": "comp",
        "rule": "email, post_code, first_name",
        "details": { "source": "source_tgt" },
        "out": [
          { "type": "metric", "name": "completeness_tgt", "flatten": "map" }
        ]
      }
    ]
  },
  "sinks": ["CONSOLE", "HDFS"]
}
Output JSON file:
{
  "name": "accu_batch",
  "tmst": 1576833946344,
  "value": {
    "completeness": [
      { "total": 50, "incomplete": 1, "complete": 49 }
    ],
    "completeness_tgt": [
      { "total": 50, "incomplete": 1, "complete": 49 }
    ]
  },
  "metadata": {
    "applicationId": "local-1576833941125"
  }
}
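
To illustrate the missing information, here is a minimal Python sketch that rebuilds the metric-to-source mapping from the config and attaches it to the output. It is not Griffin code and does not reflect how Griffin would implement this. The helper names metric_sources and annotate, and the "sources" key under "metadata", are hypothetical and exist only for this example. The config and output are trimmed copies of the ones above.

```python
import json

# Trimmed copy of the input config: only the fields this sketch reads.
config = {
    "evaluate.rule": {
        "rules": [
            {"dq.type": "completeness", "details": {"source": "source"},
             "out": [{"type": "metric", "name": "completeness", "flatten": "map"}]},
            {"dq.type": "completeness", "details": {"source": "source_tgt"},
             "out": [{"type": "metric", "name": "completeness_tgt", "flatten": "map"}]},
        ]
    }
}

# Metric output as reported in this issue.
metric_output = {
    "name": "accu_batch",
    "tmst": 1576833946344,
    "value": {
        "completeness": [{"total": 50, "incomplete": 1, "complete": 49}],
        "completeness_tgt": [{"total": 50, "incomplete": 1, "complete": 49}],
    },
    "metadata": {"applicationId": "local-1576833941125"},
}


def metric_sources(cfg):
    """Map each metric output name to the source named in its rule's details."""
    mapping = {}
    for rule in cfg.get("evaluate.rule", {}).get("rules", []):
        source = rule.get("details", {}).get("source")
        for out in rule.get("out", []):
            if out.get("type") == "metric":
                mapping[out["name"]] = source
    return mapping


def annotate(output, cfg):
    """Return a copy of the output with a hypothetical "sources" metadata entry."""
    annotated = json.loads(json.dumps(output))  # deep copy via a JSON round trip
    annotated.setdefault("metadata", {})["sources"] = metric_sources(cfg)
    return annotated


annotated = annotate(metric_output, config)
print(json.dumps(annotated["metadata"], indent=2))
```

With a mapping like this in the output, a reader could tell that "completeness" was computed on "source" and "completeness_tgt" on "source_tgt" without going back to the config.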
--
This message was sent by Atlassian Jira
(v8.3.4#803005)