Yes, Griffin leverages Livy to post Spark jobs to the Spark cluster.

If you submit a job to Spark manually, Griffin cannot automatically refresh
metrics from ES.
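
For reference, Griffin's scheduler submits the measure by POSTing to Livy's
batches REST API. A minimal sketch of such a request is below; the host, jar
path, and the two config-file arguments are placeholders that depend on your
deployment:

```
curl -s -X POST http://<livy-host>:8998/batches \
  -H "Content-Type: application/json" \
  -d '{
        "file": "hdfs:///<path-to>/griffin-measure.jar",
        "className": "org.apache.griffin.measure.Application",
        "args": ["<env config json>", "<dq config json>"]
      }'
```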


On Fri, Sep 11, 2020 at 9:03 PM Sunil Muniyal <[email protected]>
wrote:

> As of now I am doing the test with Hive tables.
> Spark jobs aren't submitted by me yet.
>
> A question... so if I understand correctly, the metrics will be generated
> only after a Spark job is executed and an accuracy check is performed
> between the two Hive tables via the submitted Spark job? Is that correct? If
> yes, I guess Livy is needed so that Griffin can submit Spark jobs by itself;
> otherwise I will have to manually submit a Spark job first. If the latter is
> an option, how do we do that w.r.t. Griffin?
>
> Thanks and Regards,
> Sunil Muniyal
>
>
> On Fri, Sep 11, 2020 at 6:30 PM William Guo <[email protected]> wrote:
>
>> check this
>> https://spark.apache.org/docs/2.2.1/monitoring.html
>>
>>
>>
>> On Fri, Sep 11, 2020 at 8:58 PM William Guo <[email protected]> wrote:
>>
>>> For the griffin log, please search in your Spark cluster env, usually in
>>> the worker log dir.
>>> One odd thing: how did you submit a job to Spark if you disabled Livy?
>>>
>>> On Fri, Sep 11, 2020 at 8:46 PM Sunil Muniyal <
>>> [email protected]> wrote:
>>>
>>>> Could you please help with the location where the measure log would get
>>>> created, or tell me where I can check for it?
>>>>
>>>> Thanks and Regards,
>>>> Sunil Muniyal
>>>>
>>>>
>>>> On Fri, Sep 11, 2020 at 6:14 PM William Guo <[email protected]> wrote:
>>>>
>>>>> Livy is only used to post jobs to your cluster; I don't think this
>>>>> issue is related to Livy.
>>>>>
>>>>> Could you also share the measure log in your cluster?
>>>>>
>>>>>
>>>>> On Fri, Sep 11, 2020 at 8:03 PM Sunil Muniyal <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Got the below message as output of /api/v1/metrics:
>>>>>>
>>>>>> {"Test_Measure":[{"name":"Test_Job","type":"ACCURACY","owner":"test","metricValues":[]}]}
>>>>>>
>>>>>> metricValues seems empty. So is it that Griffin is not getting data
>>>>>> from ES, whereas ES does have the data, which we verified previously? By
>>>>>> any chance, do you think not having Livy could be a problem?
>>>>>>
>>>>>> These are the latest logs from service.out:
>>>>>> [EL Fine]: sql: 2020-09-11
>>>>>> 11:59:11.662--ServerSession(400064818)--Connection(754936662)--SELECT
>>>>>> DISTINCT ID, APPID, APPURI, CREATEDDATE, DELETED, expire_timestamp,
>>>>>> MODIFIEDDATE, predicate_job_deleted, predicate_group_name,
>>>>>> predicate_job_name, SESSIONID, STATE, timestamp, TYPE, job_id FROM
>>>>>> JOBINSTANCEBEAN WHERE (STATE IN (?,?,?,?,?,?))
>>>>>>         bind => [6 parameters bound]
>>>>>> [EL Fine]: sql: 2020-09-11
>>>>>> 11:59:51.044--ServerSession(400064818)--Connection(353930083)--SELECT ID,
>>>>>> type, CREATEDDATE, CRONEXPRESSION, DELETED, quartz_group_name, JOBNAME,
>>>>>> MEASUREID, METRICNAME, MODIFIEDDATE, quartz_job_name, PREDICATECONFIG,
>>>>>> TIMEZONE FROM job WHERE (DELETED = ?)
>>>>>>         bind => [1 parameter bound]
>>>>>> [EL Fine]: sql: 2020-09-11
>>>>>> 11:59:51.046--ServerSession(400064818)--Connection(1245663749)--SELECT
>>>>>> DISTINCT DTYPE FROM MEASURE WHERE (DELETED = ?)
>>>>>>         bind => [1 parameter bound]
>>>>>> [EL Fine]: sql: 2020-09-11
>>>>>> 11:59:51.046--ServerSession(400064818)--Connection(674248356)--SELECT
>>>>>> t0.ID, t0.DTYPE, t0.CREATEDDATE, t0.DELETED, t0.DESCRIPTION, t0.DQTYPE,
>>>>>> t0.MODIFIEDDATE, t0.NAME, t0.ORGANIZATION, t0.OWNER, t0.SINKS, t1.ID,
>>>>>> t1.PROCESSTYPE, t1.RULEDESCRIPTION, t1.evaluate_rule_id FROM MEASURE t0,
>>>>>> GRIFFINMEASURE t1 WHERE ((t0.DELETED = ?) AND ((t1.ID = t0.ID) AND
>>>>>> (t0.DTYPE = ?)))
>>>>>>         bind => [2 parameters bound]
>>>>>> [EL Fine]: sql: 2020-09-11
>>>>>> 12:00:00.019--ClientSession(294162678)--Connection(98503327)--INSERT INTO
>>>>>> JOBINSTANCEBEAN (ID, APPID, APPURI, CREATEDDATE, DELETED, expire_timestamp,
>>>>>> MODIFIEDDATE, predicate_job_deleted, predicate_group_name,
>>>>>> predicate_job_name, SESSIONID, STATE, timestamp, TYPE, job_id) VALUES (?,
>>>>>> ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
>>>>>>         bind => [15 parameters bound]
>>>>>> [EL Fine]: sql: 2020-09-11
>>>>>> 12:00:00.09--ServerSession(400064818)--Connection(491395630)--SELECT ID,
>>>>>> APPID, APPURI, CREATEDDATE, DELETED, expire_timestamp, MODIFIEDDATE,
>>>>>> predicate_job_deleted, predicate_group_name, predicate_job_name, SESSIONID,
>>>>>> STATE, timestamp, TYPE, job_id FROM JOBINSTANCEBEAN WHERE
>>>>>> (predicate_job_name = ?)
>>>>>>         bind => [1 parameter bound]
>>>>>> 2020-09-11 12:00:00.117  INFO 10980 --- [ryBean_Worker-3]
>>>>>> o.a.g.c.j.SparkSubmitJob                 : {
>>>>>>   "measure.type" : "griffin",
>>>>>>   "id" : 201,
>>>>>>   "name" : "Test_Job",
>>>>>>   "owner" : "test",
>>>>>>   "description" : "Measure to check %age of id field values are same",
>>>>>>   "deleted" : false,
>>>>>>   "timestamp" : 1599822000000,
>>>>>>   "dq.type" : "ACCURACY",
>>>>>>   "sinks" : [ "ELASTICSEARCH", "HDFS" ],
>>>>>>   "process.type" : "BATCH",
>>>>>>   "data.sources" : [ {
>>>>>>     "id" : 204,
>>>>>>     "name" : "source",
>>>>>>     "connectors" : [ {
>>>>>>       "id" : 205,
>>>>>>       "name" : "source1599568886803",
>>>>>>       "type" : "HIVE",
>>>>>>       "version" : "1.2",
>>>>>>       "predicates" : [ ],
>>>>>>       "data.unit" : "1hour",
>>>>>>       "data.time.zone" : "",
>>>>>>       "config" : {
>>>>>>         "database" : "default",
>>>>>>         "table.name" : "demo_src",
>>>>>>         "where" : "dt=20200911 AND hour=11"
>>>>>>       }
>>>>>>     } ],
>>>>>>     "baseline" : false
>>>>>>   }, {
>>>>>>     "id" : 206,
>>>>>>     "name" : "target",
>>>>>>     "connectors" : [ {
>>>>>>       "id" : 207,
>>>>>>       "name" : "target1599568896874",
>>>>>>       "type" : "HIVE",
>>>>>>       "version" : "1.2",
>>>>>>       "predicates" : [ ],
>>>>>>       "data.unit" : "1hour",
>>>>>>       "data.time.zone" : "",
>>>>>>       "config" : {
>>>>>>         "database" : "default",
>>>>>>         "table.name" : "demo_tgt",
>>>>>>         "where" : "dt=20200911 AND hour=11"
>>>>>>       }
>>>>>>     } ],
>>>>>>     "baseline" : false
>>>>>>   } ],
>>>>>>   "evaluate.rule" : {
>>>>>>     "id" : 202,
>>>>>>     "rules" : [ {
>>>>>>       "id" : 203,
>>>>>>       "rule" : "source.id=target.id",
>>>>>>       "dsl.type" : "griffin-dsl",
>>>>>>       "dq.type" : "ACCURACY",
>>>>>>       "out.dataframe.name" : "accuracy"
>>>>>>     } ]
>>>>>>   },
>>>>>>   "measure.type" : "griffin"
>>>>>> }
>>>>>> 2020-09-11 12:00:00.119 ERROR 10980 --- [ryBean_Worker-3]
>>>>>> o.a.g.c.j.SparkSubmitJob                 : Post to livy ERROR. I/O error on
>>>>>> POST request for "http://localhost:8998/batches": Connection refused
>>>>>> (Connection refused); nested exception is java.net.ConnectException:
>>>>>> Connection refused (Connection refused)
>>>>>> 2020-09-11 12:00:00.131  INFO 10980 --- [ryBean_Worker-3]
>>>>>> o.a.g.c.j.SparkSubmitJob                 : Delete predicate
>>>>>> job(PG,Test_Job_predicate_1599825600016) SUCCESS.
>>>>>> [EL Fine]: sql: 2020-09-11
>>>>>> 12:00:00.133--ClientSession(273634815)--Connection(296858203)--UPDATE
>>>>>> JOBINSTANCEBEAN SET predicate_job_deleted = ?, STATE = ? WHERE (ID = ?)
>>>>>>         bind => [3 parameters bound]
>>>>>> [EL Fine]: sql: 2020-09-11
>>>>>> 12:00:11.664--ServerSession(400064818)--Connection(1735064739)--SELECT
>>>>>> DISTINCT ID, APPID, APPURI, CREATEDDATE, DELETED, expire_timestamp,
>>>>>> MODIFIEDDATE, predicate_job_deleted, predicate_group_name,
>>>>>> predicate_job_name, SESSIONID, STATE, timestamp, TYPE, job_id FROM
>>>>>> JOBINSTANCEBEAN WHERE (STATE IN (?,?,?,?,?,?))
>>>>>>         bind => [6 parameters bound]
>>>>>>
>>>>>>
>>>>>> Thanks and Regards,
>>>>>> Sunil Muniyal
>>>>>>
>>>>>>
>>>>>> On Fri, Sep 11, 2020 at 3:42 PM William Guo <[email protected]> wrote:
>>>>>>
>>>>>>> From the log, I didn't find any information related to metrics
>>>>>>> fetching.
>>>>>>>
>>>>>>> Could you try to call /api/v1/metrics, and show us the latest log
>>>>>>> again?
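>>>>>>>
>>>>>>> (A quick way to call that endpoint from the shell; port 8080 is an
>>>>>>> assumption here, as the usual Griffin service port:)
>>>>>>>
>>>>>>> ```
>>>>>>> curl -s http://<griffin-host>:8080/api/v1/metrics
>>>>>>> ```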
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Sep 11, 2020 at 5:48 PM Sunil Muniyal <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> 1: I guess it is related to your login user and super user.
>>>>>>>> I am less worried about this unless it could be the cause of metrics
>>>>>>>> not being displayed.
>>>>>>>>
>>>>>>>> 2: Could you share with us your griffin log? I suspect some
>>>>>>>> exception happened when trying to connect with ES.
>>>>>>>> Attached is the service.out file. I see an error while
>>>>>>>> submitting Spark jobs via Livy. Since Livy is not configured / deployed,
>>>>>>>> this is expected. I believe this should not be the reason, since we are
>>>>>>>> getting data from Hive (as part of batch processing). Please correct me
>>>>>>>> if my understanding is incorrect.
>>>>>>>>
>>>>>>>> Thanks and Regards,
>>>>>>>> Sunil Muniyal
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Sep 11, 2020 at 3:09 PM William Guo <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> 1: I guess it is related to your login user and super user.
>>>>>>>>> 2: Could you share with us your griffin log? I suspect some
>>>>>>>>> exception happened when trying to connect with ES.
>>>>>>>>>
>>>>>>>>> On Fri, Sep 11, 2020 at 5:14 PM Sunil Muniyal <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hello William,
>>>>>>>>>>
>>>>>>>>>> Tried as suggested.
>>>>>>>>>>
>>>>>>>>>> 1. Ingested data into Hive tables using the provided script.
>>>>>>>>>> The ownership still shows as before (source owned by Admin and target
>>>>>>>>>> by Root).
>>>>>>>>>>
>>>>>>>>>> 2. Updated env-batch.json and env-streaming.json files with IP
>>>>>>>>>> address for ES and rebuilt Griffin.
>>>>>>>>>> Still no metrics for the jobs executed.
>>>>>>>>>> ES does have data as confirmed yesterday.
>>>>>>>>>>
>>>>>>>>>> Please help.
>>>>>>>>>>
>>>>>>>>>> Thanks and Regards,
>>>>>>>>>> Sunil Muniyal
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Sep 10, 2020 at 7:41 PM William Guo <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> please enter ip directly.
>>>>>>>>>>> not sure whether hostname can be resolved correctly or not.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Sep 10, 2020 at 10:06 PM Sunil Muniyal <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi William,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for the reply.
>>>>>>>>>>>>
>>>>>>>>>>>> Regarding points 2 and 3: could you possibly share some more details?
>>>>>>>>>>>> I believe the env_batch.json is configured as expected. What exactly
>>>>>>>>>>>> needs to be updated? The ES hostname, or shall I enter the IP, or
>>>>>>>>>>>> something else? Please help.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>>> Sunil Muniyal
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Sep 10, 2020 at 7:30 PM William Guo <[email protected]>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> 1 OK, we will fix this issue soon.
>>>>>>>>>>>>> 2 Could you try to ping ES from your Spark environment and input the
>>>>>>>>>>>>> ES endpoint correctly in env_batch.json?
>>>>>>>>>>>>> 3 Please put your ES endpoint in env_batch.json.
>>>>>>>>>>>>> 6 Please try the following script to build your env.
>>>>>>>>>>>>> ```
>>>>>>>>>>>>> #!/bin/bash
>>>>>>>>>>>>> #create table
>>>>>>>>>>>>> hive -f create-table.hql
>>>>>>>>>>>>> echo "create table done"
>>>>>>>>>>>>>
>>>>>>>>>>>>> #current hour
>>>>>>>>>>>>> sudo ./gen_demo_data.sh
>>>>>>>>>>>>> cur_date=`date +%Y%m%d%H`
>>>>>>>>>>>>> dt=${cur_date:0:8}
>>>>>>>>>>>>> hour=${cur_date:8:2}
>>>>>>>>>>>>> partition_date="dt='$dt',hour='$hour'"
>>>>>>>>>>>>> sed s/PARTITION_DATE/$partition_date/ ./insert-data.hql.template > insert-data.hql
>>>>>>>>>>>>> hive -f insert-data.hql
>>>>>>>>>>>>> src_done_path=/griffin/data/batch/demo_src/dt=${dt}/hour=${hour}/_DONE
>>>>>>>>>>>>> tgt_done_path=/griffin/data/batch/demo_tgt/dt=${dt}/hour=${hour}/_DONE
>>>>>>>>>>>>> hadoop fs -mkdir -p /griffin/data/batch/demo_src/dt=${dt}/hour=${hour}
>>>>>>>>>>>>> hadoop fs -mkdir -p /griffin/data/batch/demo_tgt/dt=${dt}/hour=${hour}
>>>>>>>>>>>>> hadoop fs -touchz ${src_done_path}
>>>>>>>>>>>>> hadoop fs -touchz ${tgt_done_path}
>>>>>>>>>>>>> echo "insert data [$partition_date] done"
>>>>>>>>>>>>>
>>>>>>>>>>>>> #last hour
>>>>>>>>>>>>> sudo ./gen_demo_data.sh
>>>>>>>>>>>>> cur_date=`date -d '1 hour ago' +%Y%m%d%H`
>>>>>>>>>>>>> dt=${cur_date:0:8}
>>>>>>>>>>>>> hour=${cur_date:8:2}
>>>>>>>>>>>>> partition_date="dt='$dt',hour='$hour'"
>>>>>>>>>>>>> sed s/PARTITION_DATE/$partition_date/ ./insert-data.hql.template > insert-data.hql
>>>>>>>>>>>>> hive -f insert-data.hql
>>>>>>>>>>>>> src_done_path=/griffin/data/batch/demo_src/dt=${dt}/hour=${hour}/_DONE
>>>>>>>>>>>>> tgt_done_path=/griffin/data/batch/demo_tgt/dt=${dt}/hour=${hour}/_DONE
>>>>>>>>>>>>> hadoop fs -mkdir -p /griffin/data/batch/demo_src/dt=${dt}/hour=${hour}
>>>>>>>>>>>>> hadoop fs -mkdir -p /griffin/data/batch/demo_tgt/dt=${dt}/hour=${hour}
>>>>>>>>>>>>> hadoop fs -touchz ${src_done_path}
>>>>>>>>>>>>> hadoop fs -touchz ${tgt_done_path}
>>>>>>>>>>>>> echo "insert data [$partition_date] done"
>>>>>>>>>>>>>
>>>>>>>>>>>>> #next hours
>>>>>>>>>>>>> set +e
>>>>>>>>>>>>> while true
>>>>>>>>>>>>> do
>>>>>>>>>>>>>   sudo ./gen_demo_data.sh
>>>>>>>>>>>>>   cur_date=`date +%Y%m%d%H`
>>>>>>>>>>>>>   next_date=`date -d "+1hour" '+%Y%m%d%H'`
>>>>>>>>>>>>>   dt=${next_date:0:8}
>>>>>>>>>>>>>   hour=${next_date:8:2}
>>>>>>>>>>>>>   partition_date="dt='$dt',hour='$hour'"
>>>>>>>>>>>>>   sed s/PARTITION_DATE/$partition_date/ ./insert-data.hql.template > insert-data.hql
>>>>>>>>>>>>>   hive -f insert-data.hql
>>>>>>>>>>>>>   src_done_path=/griffin/data/batch/demo_src/dt=${dt}/hour=${hour}/_DONE
>>>>>>>>>>>>>   tgt_done_path=/griffin/data/batch/demo_tgt/dt=${dt}/hour=${hour}/_DONE
>>>>>>>>>>>>>   hadoop fs -mkdir -p /griffin/data/batch/demo_src/dt=${dt}/hour=${hour}
>>>>>>>>>>>>>   hadoop fs -mkdir -p /griffin/data/batch/demo_tgt/dt=${dt}/hour=${hour}
>>>>>>>>>>>>>   hadoop fs -touchz ${src_done_path}
>>>>>>>>>>>>>   hadoop fs -touchz ${tgt_done_path}
>>>>>>>>>>>>>   echo "insert data [$partition_date] done"
>>>>>>>>>>>>>   sleep 3600
>>>>>>>>>>>>> done
>>>>>>>>>>>>> set -e
>>>>>>>>>>>>> ```
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> William
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Sep 10, 2020 at 4:58 PM Sunil Muniyal <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. Since I was able to get ElasticSearch 6.8.x integrated,
>>>>>>>>>>>>>> does it mean that only ES up to 6.8.x is supported for Griffin as of
>>>>>>>>>>>>>> now? If yes, what are the further plans? Is there a page from which
>>>>>>>>>>>>>> I could get updates?
>>>>>>>>>>>>>> --please file a jira ticket for us to make our code ES
>>>>>>>>>>>>>> compatible.
>>>>>>>>>>>>>> [SM] GRIFFIN-346 - Support for Elastic Search latest version
>>>>>>>>>>>>>> (7.9.1) <https://issues.apache.org/jira/browse/GRIFFIN-346> has been
>>>>>>>>>>>>>> submitted.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2. I still do not see the metrics available (please refer to the
>>>>>>>>>>>>>> screenshots below). The measure is now listed in the drop-down of the
>>>>>>>>>>>>>> *DQ Metrics* tab, but when I selected the test measure, nothing
>>>>>>>>>>>>>> came up.
>>>>>>>>>>>>>> --could you check in ES whether metrics have been injected
>>>>>>>>>>>>>> or not.
>>>>>>>>>>>>>> [SM] I used the link below and got the index that is created
>>>>>>>>>>>>>> in ES. I believe the data is loaded. However, please correct me if I
>>>>>>>>>>>>>> understood incorrectly.
>>>>>>>>>>>>>> "http://<ES Public IP>:9200/_cat/indices?v"
>>>>>>>>>>>>>> --------------> The POC env is on a public cloud, so I am using the Public IP.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
>>>>>>>>>>>>>> yellow open   griffin ur_Kd3XFQBCsPzIM84j87Q   5   2          0            0      1.2kb          1.2kb
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Docs in the index: "http://<ES Public IP>:9200/griffin/_search"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> {"took":44,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Index Mapping: "http://<ES Public IP>:9200/griffin"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> {"griffin":{"aliases":{},"mappings":{"accuracy":{"properties":{"name":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"tmst":{"type":"date"}}}},"settings":{"index":{"creation_date":"1599567930578","number_of_shards":"5","number_of_replicas":"2","uuid":"ur_Kd3XFQBCsPzIM84j87Q","version":{"created":"6081299"},"provided_name":"griffin"}}}}
>>>>>>>>>>>>>>
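>>>>>>>>>>>>>> (A direct way to check whether any accuracy metric documents have
>>>>>>>>>>>>>> landed in the index is to search the accuracy type itself; the host
>>>>>>>>>>>>>> below is a placeholder:)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>> curl -s "http://<ES Public IP>:9200/griffin/accuracy/_search?pretty"
>>>>>>>>>>>>>> ```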
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 3. At a step in the deployment guide it is suggested to check the
>>>>>>>>>>>>>> URL "http://<ES HOST IP>:9200/griffin/accuracy". When
>>>>>>>>>>>>>> navigated to this URL, I get the below error. Please advise.
>>>>>>>>>>>>>> {"error":"Incorrect HTTP method for uri [/griffin/accuracy]
>>>>>>>>>>>>>> and method [GET], allowed: [POST]","status":405}
>>>>>>>>>>>>>> -- it seems you need to use the POST method.
>>>>>>>>>>>>>> [SM] I am using the POST method as suggested in the article.
>>>>>>>>>>>>>> Below is the relevant section of env_batch.json:
>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>       "type": "ELASTICSEARCH",
>>>>>>>>>>>>>>       "config": {
>>>>>>>>>>>>>>         "method": "post",
>>>>>>>>>>>>>>         "api": "http://<ES Host Name>:9200/griffin/accuracy", ---------> do we need the IP here?
>>>>>>>>>>>>>>         "connection.timeout": "1m",
>>>>>>>>>>>>>>         "retry": 10
>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>     }
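>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (To rule out connectivity, one can exercise the same sink endpoint
>>>>>>>>>>>>>> by hand: in ES 6.x a POST to /index/type indexes a document, so the
>>>>>>>>>>>>>> sketch below, with a placeholder host and a throwaway document,
>>>>>>>>>>>>>> should return a "created" result if the endpoint configured above is
>>>>>>>>>>>>>> reachable:)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>> curl -s -X POST "http://<ES Host Name>:9200/griffin/accuracy" \
>>>>>>>>>>>>>>   -H "Content-Type: application/json" \
>>>>>>>>>>>>>>   -d '{"name":"probe","tmst":1599822000000}'
>>>>>>>>>>>>>> ```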
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 6. I also noticed that in Data Assets, *demo_src* is owned
>>>>>>>>>>>>>> by Admin whereas *demo-tgt* is owned by root. Would that make any
>>>>>>>>>>>>>> difference? If yes, how do I correct it? Reload the Hive data?
>>>>>>>>>>>>>> -- could you show me your script for dataset setup?
>>>>>>>>>>>>>> [SM] Attached are the 3 scripts. gen-hive-data.sh is the
>>>>>>>>>>>>>> master script, which triggers demo_data, and that in turn triggers delta_src.
>>>>>>>>>>>>>> I have done it as instructed in the GitHub article, and
>>>>>>>>>>>>>> gen-hive-data.sh is triggered as root in the terminal.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please advise.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>>>>> Sunil Muniyal
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Sep 9, 2020 at 8:41 PM William Guo <[email protected]>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *Request you to please advise further on below points:*
>>>>>>>>>>>>>>> 1. Since I was able to get ElasticSearch 6.8.x integrated,
>>>>>>>>>>>>>>> does it mean that only ES up to 6.8.x is supported for Griffin as of
>>>>>>>>>>>>>>> now? If yes, what are the further plans? Is there a page from which
>>>>>>>>>>>>>>> I could get updates?
>>>>>>>>>>>>>>> --please file a jira ticket for us to make our code ES
>>>>>>>>>>>>>>> compatible.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2. I still do not see the metrics available (please refer to the
>>>>>>>>>>>>>>> screenshots below). The measure is now listed in the drop-down of the
>>>>>>>>>>>>>>> *DQ Metrics* tab, but when I selected the test measure, nothing
>>>>>>>>>>>>>>> came up.
>>>>>>>>>>>>>>> --could you check in ES whether metrics have been injected
>>>>>>>>>>>>>>> or not.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 3. At a step in the deployment guide it is suggested to check the
>>>>>>>>>>>>>>> URL: http://<ES HOST IP>:9200/griffin/accuracy. When
>>>>>>>>>>>>>>> navigated to this URL, I get the below error. Please advise.
>>>>>>>>>>>>>>> {"error":"Incorrect HTTP method for uri [/griffin/accuracy]
>>>>>>>>>>>>>>> and method [GET], allowed: [POST]","status":405}
>>>>>>>>>>>>>>> *-- it seems you need to use the POST method.*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 6. I also noticed that in Data Assets, *demo_src* is owned
>>>>>>>>>>>>>>> by Admin whereas *demo-tgt* is owned by root. Would that make any
>>>>>>>>>>>>>>> difference? If yes, how do I correct it? Reload the Hive data?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -- could you show me your script for dataset setup?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Sep 8, 2020 at 9:02 PM Sunil Muniyal <
>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi William,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I was finally able to get Griffin up and ElasticSearch
>>>>>>>>>>>>>>>> integrated along with Hadoop. Thanks a lot for your help and 
>>>>>>>>>>>>>>>> guidance so
>>>>>>>>>>>>>>>> far.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have created a test measure and a job which gets
>>>>>>>>>>>>>>>> triggered every 4 minutes automatically (I have referred to the user guide
>>>>>>>>>>>>>>>> available on GitHub at this link
>>>>>>>>>>>>>>>> <https://github.com/apache/griffin/blob/master/griffin-doc/ui/user-guide.md>).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> *Request you to please advise further on below points:*
>>>>>>>>>>>>>>>> 1. Since I was able to get ElasticSearch 6.8.x integrated,
>>>>>>>>>>>>>>>> does it mean that only ES up to 6.8.x is supported for Griffin as of
>>>>>>>>>>>>>>>> now? If yes, what are the further plans? Is there a page from which
>>>>>>>>>>>>>>>> I could get updates?
>>>>>>>>>>>>>>>> 2. I still do not see the metrics available (please refer to the
>>>>>>>>>>>>>>>> screenshots below). The measure is now listed in the drop-down of the
>>>>>>>>>>>>>>>> *DQ Metrics* tab, but when I selected the test measure,
>>>>>>>>>>>>>>>> nothing came up.
>>>>>>>>>>>>>>>> 3. At a step in the deployment guide it is suggested to check the
>>>>>>>>>>>>>>>> URL: http://<ES HOST IP>:9200/griffin/accuracy. When
>>>>>>>>>>>>>>>> navigated to this URL, I get the below error. Please advise.
>>>>>>>>>>>>>>>> {"error":"Incorrect HTTP method for uri
>>>>>>>>>>>>>>>> [/griffin/accuracy] and method [GET], allowed: [POST]","status":405}
>>>>>>>>>>>>>>>> 6. I also noticed that in Data Assets, *demo_src* is owned
>>>>>>>>>>>>>>>> by Admin whereas *demo-tgt* is owned by root. Would that make any
>>>>>>>>>>>>>>>> difference? If yes, how do I correct it? Reload the Hive data?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> *Screenshots:*
>>>>>>>>>>>>>>>> *Data Assets:*
>>>>>>>>>>>>>>>> [image: image.png]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> *DQ Metrics (Test Measure selected):*
>>>>>>>>>>>>>>>> [image: image.png]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> *Job Triggered multiple times:*
>>>>>>>>>>>>>>>> [image: image.png]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> *Metrics page from job directly:*
>>>>>>>>>>>>>>>> [image: image.png]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>>>>>>> Sunil Muniyal
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Sep 8, 2020 at 4:38 PM Sunil Muniyal <
>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I am unable to get repos for 6.4.1; instead I found 6.8.x.
>>>>>>>>>>>>>>>>> Will try with this version of Elastic Search in some time.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In the meantime, would it be possible to confirm whether 6.4.x
>>>>>>>>>>>>>>>>> or 6.8.x is the only supported version for Griffin? The reason I
>>>>>>>>>>>>>>>>> am asking is that the GitHub article for griffin deployment
>>>>>>>>>>>>>>>>> points to the latest version of ES.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>>>>>>>> Sunil Muniyal
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Sep 8, 2020 at 4:06 PM Sunil Muniyal <
>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I will need to redeploy ElasticSearch, correct?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>>>>>>>>> Sunil Muniyal
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, Sep 8, 2020 at 4:05 PM William Guo <
>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Could you try with this version?
>>>>>>>>>>>>>>>>>>> <elasticsearch.version>6.4.1</elasticsearch.version>
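>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> (If that version is exposed as a Maven property in the build, a
>>>>>>>>>>>>>>>>>>> sketch of overriding it at package time without editing the pom
>>>>>>>>>>>>>>>>>>> by hand:)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>>>> mvn clean package -DskipTests -Delasticsearch.version=6.4.1
>>>>>>>>>>>>>>>>>>> ```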
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> William
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Tue, Sep 8, 2020 at 5:59 PM Sunil Muniyal <
>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi William / Dev group,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I have deployed ES 7.9 - latest version (single node)
>>>>>>>>>>>>>>>>>>>> and the same is configured. I also get the default page 
>>>>>>>>>>>>>>>>>>>> when hitting
>>>>>>>>>>>>>>>>>>>> http://<ES HOST IP>:9200/
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Upon creating the Griffin index using the JSON
>>>>>>>>>>>>>>>>>>>> string given below,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> curl -k -H "Content-Type: application/json" -X PUT 
>>>>>>>>>>>>>>>>>>>> http://<replaced with my ES host IP>:9200/griffin \
>>>>>>>>>>>>>>>>>>>>  -d '{
>>>>>>>>>>>>>>>>>>>>     "aliases": {},
>>>>>>>>>>>>>>>>>>>>     "mappings": {
>>>>>>>>>>>>>>>>>>>>         "accuracy": {
>>>>>>>>>>>>>>>>>>>>             "properties": {
>>>>>>>>>>>>>>>>>>>>                 "name": {
>>>>>>>>>>>>>>>>>>>>                     "fields": {
>>>>>>>>>>>>>>>>>>>>                         "keyword": {
>>>>>>>>>>>>>>>>>>>>                             "ignore_above": 256,
>>>>>>>>>>>>>>>>>>>>                             "type": "keyword"
>>>>>>>>>>>>>>>>>>>>                         }
>>>>>>>>>>>>>>>>>>>>                     },
>>>>>>>>>>>>>>>>>>>>                     "type": "text"
>>>>>>>>>>>>>>>>>>>>                 },
>>>>>>>>>>>>>>>>>>>>                 "tmst": {
>>>>>>>>>>>>>>>>>>>>                     "type": "date"
>>>>>>>>>>>>>>>>>>>>                 }
>>>>>>>>>>>>>>>>>>>>             }
>>>>>>>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>>>>>>>     },
>>>>>>>>>>>>>>>>>>>>     "settings": {
>>>>>>>>>>>>>>>>>>>>         "index": {
>>>>>>>>>>>>>>>>>>>>             "number_of_replicas": "2",
>>>>>>>>>>>>>>>>>>>>             "number_of_shards": "5"
>>>>>>>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>> }'
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I get the below error:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> {"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"Root
>>>>>>>>>>>>>>>>>>>> mapping definition has unsupported parameters:  [accuracy :
>>>>>>>>>>>>>>>>>>>> {properties={name={fields={keyword={ignore_above=256, type=keyword}},
>>>>>>>>>>>>>>>>>>>> type=text},
>>>>>>>>>>>>>>>>>>>> tmst={type=date}}}]"}],"type":"mapper_parsing_exception","reason":"Failed
>>>>>>>>>>>>>>>>>>>> to parse mapping [_doc]: Root mapping definition has unsupported
>>>>>>>>>>>>>>>>>>>> parameters:  [accuracy :
>>>>>>>>>>>>>>>>>>>> {properties={name={fields={keyword={ignore_above=256, type=keyword}},
>>>>>>>>>>>>>>>>>>>> type=text},
>>>>>>>>>>>>>>>>>>>> tmst={type=date}}}]","caused_by":{"type":"mapper_parsing_exception","reason":"Root
>>>>>>>>>>>>>>>>>>>> mapping definition has unsupported parameters:  [accuracy :
>>>>>>>>>>>>>>>>>>>> {properties={name={fields={keyword={ignore_above=256, type=keyword}},
>>>>>>>>>>>>>>>>>>>> type=text}, tmst={type=date}}}]"}},"status":400}
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Seems like the JSON string is missing some values or is
>>>>>>>>>>>>>>>>>>>> incorrectly provided.
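>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> (For reference: this is the error ES 7.x returns because custom
>>>>>>>>>>>>>>>>>>>> mapping types such as "accuracy" were removed in 7.x. A typeless
>>>>>>>>>>>>>>>>>>>> variant of the same mapping, which is what 7.x expects, would
>>>>>>>>>>>>>>>>>>>> look roughly like the sketch below; note that Griffin itself may
>>>>>>>>>>>>>>>>>>>> still write with an explicit type, so this alone does not
>>>>>>>>>>>>>>>>>>>> guarantee 7.x compatibility:)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>>>>> curl -k -H "Content-Type: application/json" -X PUT http://<ES host IP>:9200/griffin \
>>>>>>>>>>>>>>>>>>>>  -d '{
>>>>>>>>>>>>>>>>>>>>     "mappings": {
>>>>>>>>>>>>>>>>>>>>         "properties": {
>>>>>>>>>>>>>>>>>>>>             "name": {
>>>>>>>>>>>>>>>>>>>>                 "type": "text",
>>>>>>>>>>>>>>>>>>>>                 "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
>>>>>>>>>>>>>>>>>>>>             },
>>>>>>>>>>>>>>>>>>>>             "tmst": { "type": "date" }
>>>>>>>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>>>>>>>     },
>>>>>>>>>>>>>>>>>>>>     "settings": {
>>>>>>>>>>>>>>>>>>>>         "index": { "number_of_shards": "5", "number_of_replicas": "2" }
>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>> }'
>>>>>>>>>>>>>>>>>>>> ```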
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Would be great if you could please help.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>>>>>>>>>>> Sunil Muniyal
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Sep 7, 2020 at 8:16 PM Sunil Muniyal <
>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thank you for the response, William.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I have started preparing for ES deployment and should
>>>>>>>>>>>>>>>>>>>>> attempt the same tomorrow.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> In the meantime, I will also wait for the Dev team in
>>>>>>>>>>>>>>>>>>>>> case they have any additional inputs.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>>>>>>>>>>>> Sunil Muniyal
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 7, 2020 at 8:06 PM William Guo <
>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> If dev confirms it to be mandatory, then as I understand
>>>>>>>>>>>>>>>>>>>>>> correctly, I will need to:
>>>>>>>>>>>>>>>>>>>>>> 1. Deploy and Configure ES
>>>>>>>>>>>>>>>>>>>>>> 2. Update application.properties to include ES
>>>>>>>>>>>>>>>>>>>>>> details and create ES index
>>>>>>>>>>>>>>>>>>>>>> 3. Rebuild Maven package and rerun the Griffin service
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> *Right, you need to package es env configuration into
>>>>>>>>>>>>>>>>>>>>>> your jar.*
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> There is no need to reload the data into Hadoop
>>>>>>>>>>>>>>>>>>>>>> (Hive), correct?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> *No*
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On a side note, is there any other documentation of
>>>>>>>>>>>>>>>>>>>>>> Griffin available or underway which would help to get 
>>>>>>>>>>>>>>>>>>>>>> below details while
>>>>>>>>>>>>>>>>>>>>>> integrating it with Cloudera Hadoop?
>>>>>>>>>>>>>>>>>>>>>> 1. What are the exact port requirements (internal
>>>>>>>>>>>>>>>>>>>>>> and external)?
>>>>>>>>>>>>>>>>>>>>>> *check the log and make sure all extra connections in the
>>>>>>>>>>>>>>>>>>>>>> properties are accessible*
>>>>>>>>>>>>>>>>>>>>>> 2. Which packages will be required?
>>>>>>>>>>>>>>>>>>>>>> *none*
>>>>>>>>>>>>>>>>>>>>>> 3. Any Java dependencies?
>>>>>>>>>>>>>>>>>>>>>> *java 1.8*
>>>>>>>>>>>>>>>>>>>>>> 4. If we have the Cloudera Hadoop cluster kerberized
>>>>>>>>>>>>>>>>>>>>>> (secured), what are the dependencies or additional
>>>>>>>>>>>>>>>>>>>>>> configurations needed?
>>>>>>>>>>>>>>>>>>>>>> *There should be no extra dependencies, except those
>>>>>>>>>>>>>>>>>>>>>> transitive dependencies incurred by Spark and Hadoop.*
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 7, 2020 at 6:42 PM Sunil Muniyal <
>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Ohh ok.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> If dev confirms it to be mandatory, then as I understand
>>>>>>>>>>>>>>>>>>>>>>> correctly, I will need to:
>>>>>>>>>>>>>>>>>>>>>>> 1. Deploy and Configure ES
>>>>>>>>>>>>>>>>>>>>>>> 2. Update application.properties to include ES
>>>>>>>>>>>>>>>>>>>>>>> details and create ES index
>>>>>>>>>>>>>>>>>>>>>>> 3. Rebuild Maven package and rerun the Griffin
>>>>>>>>>>>>>>>>>>>>>>> service
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> There is no need to reload the data into Hadoop
>>>>>>>>>>>>>>>>>>>>>>> (Hive), correct?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On a side note, is there any other documentation of
>>>>>>>>>>>>>>>>>>>>>>> Griffin available or underway which would help to get 
>>>>>>>>>>>>>>>>>>>>>>> below details while
>>>>>>>>>>>>>>>>>>>>>>> integrating it with Cloudera Hadoop?
>>>>>>>>>>>>>>>>>>>>>>> 1. What are the exact port requirements (internal
>>>>>>>>>>>>>>>>>>>>>>> and external)?
>>>>>>>>>>>>>>>>>>>>>>> 2. Which packages will be required?
>>>>>>>>>>>>>>>>>>>>>>> 3. Any Java dependencies?
>>>>>>>>>>>>>>>>>>>>>>> 4. If we have Cloudera Hadoop cluster kerberized
>>>>>>>>>>>>>>>>>>>>>>> (secured), what are the dependencies or additional 
>>>>>>>>>>>>>>>>>>>>>>> configurations needed?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I know some of the above information can be fetched
>>>>>>>>>>>>>>>>>>>>>>> from the deployment guide on GitHub. However, I am checking
>>>>>>>>>>>>>>>>>>>>>>> whether any other formal documentation has been made
>>>>>>>>>>>>>>>>>>>>>>> available for the same.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>>>>>>>>>>>>>> Sunil Muniyal
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 7, 2020 at 4:05 PM William Guo <
>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> cc dev for double checking.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> The measure will emit metrics and store them in
>>>>>>>>>>>>>>>>>>>>>>>> elastic; the UI fetches those metrics from elastic.
>>>>>>>>>>>>>>>>>>>>>>>> So elastic should be mandatory.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>> William
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 7, 2020 at 6:32 PM Sunil Muniyal <
>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Thank you for the quick response, William.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I have not configured ElasticSearch since it is
>>>>>>>>>>>>>>>>>>>>>>>>> not deployed.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> In the application.properties, I added dummy
>>>>>>>>>>>>>>>>>>>>>>>>> information (as below) just to pass the validation test
>>>>>>>>>>>>>>>>>>>>>>>>> and get Griffin up and running.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> # elasticsearch
>>>>>>>>>>>>>>>>>>>>>>>>> # elasticsearch.host = <IP>
>>>>>>>>>>>>>>>>>>>>>>>>> # elasticsearch.port = <elasticsearch rest port>
>>>>>>>>>>>>>>>>>>>>>>>>> # elasticsearch.user = user
>>>>>>>>>>>>>>>>>>>>>>>>> # elasticsearch.password = password
>>>>>>>>>>>>>>>>>>>>>>>>> elasticsearch.host=localhost
>>>>>>>>>>>>>>>>>>>>>>>>> elasticsearch.port=9200
>>>>>>>>>>>>>>>>>>>>>>>>> elasticsearch.scheme=http
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Is ElasticSearch a mandatory requirement to use
>>>>>>>>>>>>>>>>>>>>>>>>> Griffin?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>>>>>>>>>>>>>>>> Sunil Muniyal
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 7, 2020 at 3:58 PM William Guo <
>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Could you check whether ES has been injected with
>>>>>>>>>>>>>>>>>>>>>>>>>> those metrics or not?
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 7, 2020 at 6:23 PM Sunil Muniyal <
>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Hello William,
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> I was able to bypass this error by entering the
>>>>>>>>>>>>>>>>>>>>>>>>>>> default field values for LDAP, ElasticSearch and Livy in
>>>>>>>>>>>>>>>>>>>>>>>>>>> application.properties, and successfully got Griffin
>>>>>>>>>>>>>>>>>>>>>>>>>>> running.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> By following the below article, I have created a
>>>>>>>>>>>>>>>>>>>>>>>>>>> test measure and then a job which triggers that 
>>>>>>>>>>>>>>>>>>>>>>>>>>> measure.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/griffin/blob/master/griffin-doc/ui/user-guide.md
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> I have allowed the job to get triggered multiple
>>>>>>>>>>>>>>>>>>>>>>>>>>> times; however, I still can't see anything in
>>>>>>>>>>>>>>>>>>>>>>>>>>> metrics related to the job.
>>>>>>>>>>>>>>>>>>>>>>>>>>> Nor do I see anything in the *health* or
>>>>>>>>>>>>>>>>>>>>>>>>>>> *mydashboard* tabs. Also, if you notice in the
>>>>>>>>>>>>>>>>>>>>>>>>>>> screenshot below, in the *DQ Metrics* tab I still do
>>>>>>>>>>>>>>>>>>>>>>>>>>> not see the created measure in the drop-down list.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> [image: image.png]
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> *Test job executed multiple times:*
>>>>>>>>>>>>>>>>>>>>>>>>>>> [image: image.png]
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Please advise if anything is misconfigured.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>> Sunil Muniyal
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 7, 2020 at 12:40 PM Sunil Muniyal <
>>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hello William,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you for the reply.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> This helped; actually, I had missed adding the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> property in application.properties.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Now the other challenge is that, along with ES and
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Livy, I am also not using LDAP, and it is hitting
>>>>>>>>>>>>>>>>>>>>>>>>>>>> the error *unable to resolve ldap.url property*.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Of course it will, since the property is not configured.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Please suggest.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sunil Muniyal
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sun, Sep 6, 2020 at 7:26 PM William Guo <
>>>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> hi Sunil Muniyal,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Could you check this property in your griffin
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> properties file?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> internal.event.listeners
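>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (A sketch of what the entry looks like in the service
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> module's application.properties; the listener value shown
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> is an assumption and may differ in your checkout:)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> # comma-separated event listeners loaded at startup
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> internal.event.listeners=GriffinJobEventHook
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ```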
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> William
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Sep 3, 2020 at 11:05 PM Sunil Muniyal <
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am attempting to integrate Griffin with
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cloudera Hadoop by following the below article:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/griffin/blob/master/griffin-doc/deploy/deploy-guide.md
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I have followed everything as instructed, apart
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> from the below things:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> from below things:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. Using Cloudera Hadoop 5.15 and relevant
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configurations instead of Apache Hadoop
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. Not using Elastic search as it is not
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> applicable
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3. Did not use Livy as it is not applicable.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The Maven build is successful and produced 2 jars,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at service/target and measure/target, which I have
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> uploaded to HDFS.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, *starting griffin-service.jar using the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> nohup command* is failing with the below error:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Caused by:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> java.lang.IllegalArgumentException: Could not resolve placeholder
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 'internal.event.listeners' in string value
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "#{'${internal.event.listeners}'.split(',')}"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> org.springframework.util.PropertyPlaceholderHelper.parseStringValue(PropertyPlaceholderHelper.java:174)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ~[spring-core-4.3.6.RELEASE.jar!/:4.3.6.RELEASE]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> org.springframework.util.PropertyPlaceholderHelper.replacePlaceholders(PropertyPlaceholderHelper.java:126)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ~[spring-core-4.3.6.RELEASE.jar!/:4.3.6.RELEASE]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>         at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> org.springframework.core.env.AbstractPropertyResolver.doResolvePlaceholders(AbstractPropertyResolver.java:236)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ~[spring-core-4.3.6.RELEASE.jar!/:4.3.6.RELEASE]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I have searched through a lot of articles with
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> no luck.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Would be great if someone could help me to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> fix this.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Also, attached is the output of the nohup command
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that was written to service.out.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sunil Muniyal
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
