That’s cool!

Thanks
Lionel, Liu

From: Vikram Jain
Sent: October 11, 2018 22:52
To: Lionel Liu
Cc: [email protected]
Subject: Re: Metrics not persisted when writing a query in SPARK-SQL instead of 
Griffin DSL

Thank you Lionel for your help. 
We figured it out just before your mail arrived :) 

Regards,
Vikram



On 11-Oct-2018, at 8:20 PM, Lionel Liu <[email protected]> wrote:


Hi Vikram,

In your JSON body, I notice that the rule in the "rules" field has no "out" 
field, which means the griffin measure application will only calculate the 
result without emitting any output. You might have simply changed the 
"dsl.type" from "griffin-dsl" to "spark-sql". For a "griffin-dsl" rule with 
"dq.type" set to "profiling", we create an output for it in the transform 
phase: 
https://github.com/apache/incubator-griffin/blob/griffin-0.3.0-incubating-rc1/measure/src/main/scala/org/apache/griffin/measure/step/builder/dsl/transform/ProfilingExpr2DQSteps.scala#L97
But for a "spark-sql" rule, we don't parse the SQL, so we don't know what its 
output should be; you need to configure the "out" field manually to enable it.

You can refer to this document to configure the output field: 
https://github.com/apache/incubator-griffin/blob/master/griffin-doc/measure/measure-configuration-guide.md#rule
 
Or simply refer to the demo JSON for spark-sql profiling rules:
https://github.com/apache/incubator-griffin/blob/griffin-0.3.0-incubating-rc1/measure/src/test/resources/_profiling-batch-sparksql.json
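
For example, your rule with an "out" field could look something like this 
(just a sketch modeled on that demo, not a config I have run; "prof" is only a 
placeholder metric name):

{
  "dsl.type": "spark-sql",
  "out.dataframe.name": "id_count_2",
  "rule": "SELECT count(id) AS cnt, max(age) AS Max_Age FROM demo_src",
  "out": [
    {
      "type": "metric",
      "name": "prof"
    }
  ]
}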
 

Hope this helps.

--
Regards,
Lionel, Liu


At 2018-10-11 17:30:29, "Vikram Jain" <[email protected]> wrote:
>Hello,
>
>I was trying to create a measure and write the rule directly in Spark-SQL 
>instead of Griffin-DSL. I used Postman to create the measure. The measure is 
>created successfully, and the job is created and executed successfully.
>
>However, the output metrics of the job executions are not persisted in 
>Elasticsearch. The entry is created in Elasticsearch, but the "metricValues" 
>array is NULL.
>
>The same SQL query works fine directly on Spark-Shell.
>
>I am not using Docker; I built the environment (Griffin 0.3.0) on my local 
>machine. All the measures created using the UI execute well, and measures 
>created using Postman with a griffin-dsl rule also work well.
>
>Below is the JSON body that I am passing to the add-measure API call from 
>Postman. Please help me understand what is going wrong.
>
>
>{
>   "name": "custom_profiling_measure_2",
>   "measure.type": "griffin",
>   "dq.type": "PROFILING",
>   "rule.description": {
>     "details": [
>       {
>         "name": "id",
>         "infos": "Total Count"
>       }
>     ]
>   },
>   "process.type": "BATCH",
>   "owner": "test",
>   "description": "custom_profiling_measure_2",
>   "data.sources": [
>     {
>       "name": "source",
>       "connectors": [
>         {
>           "name": "source123",
>           "type": "HIVE",
>           "version": "1.2",
>           "data.unit": "1day",
>           "data.time.zone": "",
>           "config": {
>             "database": "default",
>             "table.name": "demo_src",
>             "where": ""
>           }
>         }
>       ]
>     }
>   ],
>   "evaluate.rule": {
>     "out.dataframe.name": "profiling_2",
>     "rules": [
>       {
>         "dsl.type": "spark-sql",
>         "dq.type": "PROFILING",
>         "rule": "SELECT count(id) AS cnt, max(age) AS Max_Age from demo_src",
>         "out.dataframe.name": "id_count_2"
>       }
>     ]
>   }
>}
>
>
>
>
>
>Regards,
>
>Vikram
>
