Thank you, Lionel, for your help. 
We figured it out just before your mail arrived :) 

Regards,
Vikram


> On 11-Oct-2018, at 8:20 PM, Lionel Liu <[email protected]> wrote:
> 
> 
> Hi Vikram,
> 
> In your JSON body, I notice that the "rules" field has no "out" field, which 
> means the griffin measure application will only calculate the result without 
> emitting any output. You might have just changed the "dsl.type" from 
> "griffin-dsl" to "spark-sql". For a "griffin-dsl" rule with "dq.type" 
> "profiling", we create an output for it in the transform phase: 
> https://github.com/apache/incubator-griffin/blob/griffin-0.3.0-incubating-rc1/measure/src/main/scala/org/apache/griffin/measure/step/builder/dsl/transform/ProfilingExpr2DQSteps.scala#L97
> But for a "spark-sql" rule, we don't parse the SQL, so we don't know how it 
> would be used; you need to manually configure the "out" field to enable the 
> output.
> 
> You can refer to this document to configure the "out" field: 
> https://github.com/apache/incubator-griffin/blob/master/griffin-doc/measure/measure-configuration-guide.md#rule
> Or simply refer to the demo json for spark-sql profiling rules:
> https://github.com/apache/incubator-griffin/blob/griffin-0.3.0-incubating-rc1/measure/src/test/resources/_profiling-batch-sparksql.json
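> 
> For example, a rough sketch of your rule with an "out" entry added (I have 
> not run this; the metric name "prof_sql" and the "flatten" choice are just 
> placeholders, please check them against the configuration guide above):
> 
>   {
>     "dsl.type": "spark-sql",
>     "dq.type": "PROFILING",
>     "out.dataframe.name": "id_count_2",
>     "rule": "SELECT count(id) AS cnt, max(age) AS Max_Age FROM demo_src",
>     "out": [
>       {
>         "type": "metric",
>         "name": "prof_sql",
>         "flatten": "map"
>       }
>     ]
>   }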
> 
> Hope this helps.
> 
> --
> Regards,
> Lionel, Liu
> 
> 
> At 2018-10-11 17:30:29, "Vikram Jain" <[email protected]> wrote:
> >Hello,
> >
> >I was trying to create a measure and write the rule in Spark-SQL directly 
> >instead of Griffin-DSL. I use Postman to create the measure. The measure is 
> >created successfully, and the job is created and executes successfully.
> >
> >However, the output metrics of execution of jobs are not persisted in 
> >ElasticSearch. The entry is created in Elastic but the "metricValues" array 
> >is NULL.
> >
> >The same SQL query works fine directly in the Spark shell.
> >
> >I am not using Docker; I am building the environment (Griffin 0.3.0) on my 
> >local machine. All the measures created using the UI execute well, and 
> >measures created using Postman with a griffin-dsl rule also work well.
> >
> >Below is the body of the JSON which I am passing to the add-measure API 
> >call from Postman. Please help me understand what is going wrong.
> >
> >
> >{
> >   "name": "custom_profiling_measure_2",
> >   "measure.type": "griffin",
> >   "dq.type": "PROFILING",
> >   "rule.description": {
> >     "details": [
> >       {
> >         "name": "id",
> >         "infos": "Total Count"
> >       }
> >     ]
> >   },
> >   "process.type": "BATCH",
> >   "owner": "test",
> >   "description": "custom_profiling_measure_2",
> >   "data.sources": [
> >     {
> >       "name": "source",
> >       "connectors": [
> >         {
> >           "name": "source123",
> >           "type": "HIVE",
> >           "version": "1.2",
> >           "data.unit": "1day",
> >           "data.time.zone": "",
> >           "config": {
> >             "database": "default",
> >             "table.name": "demo_src",
> >             "where": ""
> >           }
> >         }
> >       ]
> >     }
> >   ],
> >   "evaluate.rule": {
> >     "out.dataframe.name": "profiling_2",
> >     "rules": [
> >       {
> >         "dsl.type": "spark-sql",
> >         "dq.type": "PROFILING",
> >         "rule": "SELECT count(id) AS cnt, max(age) AS Max_Age from demo_src",
> >         "out.dataframe.name": "id_count_2"
> >       }
> >     ]
> >   }
> >}
> >
> >
> >
> >
> >
> >Regards,
> >
> >Vikram
> >
> 
> 
