Repository: incubator-griffin
Updated Branches:
  refs/heads/master 93930b816 -> d597575a3

update document of measure batch sample

Author: Lionel Liu <[email protected]>

Closes #274 from bhlx3lyx7/tmst.

Project: http://git-wip-us.apache.org/repos/asf/incubator-griffin/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-griffin/commit/d597575a
Tree: http://git-wip-us.apache.org/repos/asf/incubator-griffin/tree/d597575a
Diff: http://git-wip-us.apache.org/repos/asf/incubator-griffin/diff/d597575a

Branch: refs/heads/master
Commit: d597575a3ea5d7681dff7568d02b7095c4319da9
Parents: 93930b8
Author: Lionel Liu <[email protected]>
Authored: Tue May 1 16:09:02 2018 +0800
Committer: Lionel Liu <[email protected]>
Committed: Tue May 1 16:09:02 2018 +0800

----------------------------------------------------------------------
 griffin-doc/measure/measure-batch-sample.md | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-griffin/blob/d597575a/griffin-doc/measure/measure-batch-sample.md
----------------------------------------------------------------------
diff --git a/griffin-doc/measure/measure-batch-sample.md b/griffin-doc/measure/measure-batch-sample.md
index 544adc7..af5d43a 100644
--- a/griffin-doc/measure/measure-batch-sample.md
+++ b/griffin-doc/measure/measure-batch-sample.md
@@ -101,10 +101,11 @@ The miss records of source will be persisted as record.
       "name": "source",
       "connectors": [
         {
-          "type": "avro",
-          "version": "1.7",
+          "type": "hive",
+          "version": "1.2",
           "config": {
-            "file.name": "src/test/resources/users_info_src.avro"
+            "database": "default",
+            "table.name": "src"
           }
         }
       ]
@@ -117,7 +118,7 @@ The miss records of source will be persisted as record.
         "dsl.type": "griffin-dsl",
         "dq.type": "profiling",
         "name": "prof",
-        "rule": "select count(*) as `cnt`, count(distinct `post_code`) as `dis-cnt`, max(user_id) as `max` from source",
+        "rule": "select max(age) as `max_age`, min(age) as `min_age` from source",
         "metric": {
           "name": "prof"
         }
@@ -125,10 +126,10 @@ The miss records of source will be persisted as record.
       {
         "dsl.type": "griffin-dsl",
         "dq.type": "profiling",
-        "name": "grp",
-        "rule": "select post_code as `pc`, count(*) as `cnt` from source group by post_code",
+        "name": "name_grp",
+        "rule": "select name, count(*) as cnt from source group by name",
         "metric": {
-          "name": "post_group",
+          "name": "name_grp",
           "collect.type": "array"
         }
       }
@@ -142,5 +143,5 @@ Above is the configure file of batch profiling job.
 In this sample, we use hive table as source.
 
 ### Evaluate rule
-In this profiling sample, the rule describes the profiling request: `country, country.count() as cnt group by country order by cnt desc limit 3`.
-The profiling metrics will be persisted as metric, listing the most 3 groups of items in same country.
\ No newline at end of file
+In this profiling sample, the rule describes the profiling request: `select max(age) as max_age, min(age) as min_age from source` and `select name, count(*) as cnt from source group by name`.
+The profiling metrics will be persisted as metric, with the max and min value of age, and count group by name, like this: `{"max_age": 53, "min_age": 11, "name_grp": [{"name": "Adam", "cnt": 13}, {"name": "Fred", "cnt": 2}]}`.
\ No newline at end of file
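
To make the shape of the resulting metric concrete, the two profiling rules in the updated document can be mimicked on in-memory rows. This is only a minimal sketch in plain Python, not Griffin's implementation: the `rows` data and the `profile` helper are hypothetical, and the reading that `"collect.type": "array"` gathers the grouped rows into a list under the metric name is inferred from the sample output in the doc.

```python
from collections import Counter

# Hypothetical sample rows standing in for the Hive table `default.src`;
# only the `name` and `age` columns used by the rules are modeled.
rows = [
    {"name": "Adam", "age": 53},
    {"name": "Adam", "age": 30},
    {"name": "Fred", "age": 11},
]


def profile(rows):
    """Mimic the two profiling rules on in-memory rows (not Griffin code)."""
    ages = [r["age"] for r in rows]
    # Rule "prof": select max(age) as max_age, min(age) as min_age from source
    metric = {"max_age": max(ages), "min_age": min(ages)}
    # Rule "name_grp": select name, count(*) as cnt from source group by name
    counts = Counter(r["name"] for r in rows)
    # Assumption: "collect.type": "array" collects the grouped rows as a list
    metric["name_grp"] = [{"name": n, "cnt": c} for n, c in counts.items()]
    return metric


print(profile(rows))
```

The single-row aggregate contributes flat keys (`max_age`, `min_age`), while the grouped rule contributes one list entry per distinct `name`, matching the structure of the sample metric quoted in the doc.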
