Yes, Griffin leverages Livy to post Spark jobs to the Spark cluster. If you manually submit a job to Spark, Griffin cannot automatically refresh metrics from ES.
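For reference, what Griffin sends to Livy is an ordinary HTTP POST to the /batches endpoint, so the same submission can be reproduced by hand. In the sketch below the jar path, main class, and config-file arguments are illustrative assumptions about a typical Griffin deployment, not values taken from your environment; only the Livy URL matches the one in the error log later in this thread:

```shell
# Hypothetical values: adjust LIVY_URL, the measure jar location, and the
# two config files (environment json + dq definition json) to your setup.
LIVY_URL="http://localhost:8998/batches"
BODY='{
  "file": "hdfs:///griffin/griffin-measure.jar",
  "className": "org.apache.griffin.measure.Application",
  "args": ["env-batch.json", "dq.json"],
  "name": "griffin-manual-batch"
}'

# Print the payload; the actual submission needs a running Livy server:
echo "$BODY"
# curl -s -H "Content-Type: application/json" -X POST -d "$BODY" "$LIVY_URL"
```

A run submitted this way still writes to whatever sinks the environment config names, but as noted above the Griffin service only tracks, and refreshes metrics for, jobs it posted through Livy itself.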
On Fri, Sep 11, 2020 at 9:03 PM Sunil Muniyal <[email protected]> wrote: > As of now I am doing the test with Hive tables. > Spark jobs aren't submitted by me yet. > > A question... so if I understand correctly, the metrics will be generated > only after a spark job is executed and an accuracy check is performed > between two hive tables via submitted spark job? is that correct? if yes, i > guess Livy is needed then so that Griffin can submit spark jobs by itself > else I will have to manually submit a Spark job first. If later is an > option, how do we do that w.r.t. Griffin? > > Thanks and Regards, > Sunil Muniyal > > > On Fri, Sep 11, 2020 at 6:30 PM William Guo <[email protected]> wrote: > >> check this >> https://spark.apache.org/docs/2.2.1/monitoring.html >> >> >> >> On Fri, Sep 11, 2020 at 8:58 PM William Guo <[email protected]> wrote: >> >>> for griffin log, please search in your spark cluster env, usually in >>> worker log dir. >>> One weird thing is how you submit a job to spark, if you disabled livy? >>> >>> On Fri, Sep 11, 2020 at 8:46 PM Sunil Muniyal < >>> [email protected]> wrote: >>> >>>> possible to please help the location where measure log would get >>>> created or from where can i check the location? >>>> >>>> Thanks and Regards, >>>> Sunil Muniyal >>>> >>>> >>>> On Fri, Sep 11, 2020 at 6:14 PM William Guo <[email protected]> wrote: >>>> >>>>> Livy is used to post jobs to your cluster, I don't think it is related >>>>> to livy. >>>>> >>>>> Could you also share the measure log in your cluster? >>>>> >>>>> >>>>> On Fri, Sep 11, 2020 at 8:03 PM Sunil Muniyal < >>>>> [email protected]> wrote: >>>>> >>>>>> Got below message as output of >>>>>> >>>>>> {"Test_Measure":[{"name":"Test_Job","type":"ACCURACY","owner":"test","metricValues":[]}]} >>>>>> >>>>>> metricValues seems empty. So is it like Griffin is not getting data >>>>>> from ES? whereas ES does have the data which we verified previously. 
By >>>>>> any >>>>>> chance, do you think not having Livy could be a problem? >>>>>> >>>>>> These are the latest logs from service.out: >>>>>> *[EL Fine]: sql: 2020-09-11 >>>>>> 11:59:11.662--ServerSession(400064818)--Connection(754936662)--SELECT >>>>>> DISTINCT ID, APPID, APPURI, CREATEDDATE, DELETED, expire_timestamp, >>>>>> MODIFIEDDATE, predicate_job_deleted, predicate_group_name, >>>>>> predicate_job_name, SESSIONID, STATE, timestamp, TYPE, job_id FROM >>>>>> JOBINSTANCEBEAN WHERE (STATE IN (?,?,?,?,?,?))* >>>>>> * bind => [6 parameters bound]* >>>>>> *[EL Fine]: sql: 2020-09-11 >>>>>> 11:59:51.044--ServerSession(400064818)--Connection(353930083)--SELECT ID, >>>>>> type, CREATEDDATE, CRONEXPRESSION, DELETED, quartz_group_name, JOBNAME, >>>>>> MEASUREID, METRICNAME, MODIFIEDDATE, quartz_job_name, PREDICATECONFIG, >>>>>> TIMEZONE FROM job WHERE (DELETED = ?)* >>>>>> * bind => [1 parameter bound]* >>>>>> *[EL Fine]: sql: 2020-09-11 >>>>>> 11:59:51.046--ServerSession(400064818)--Connection(1245663749)--SELECT >>>>>> DISTINCT DTYPE FROM MEASURE WHERE (DELETED = ?)* >>>>>> * bind => [1 parameter bound]* >>>>>> *[EL Fine]: sql: 2020-09-11 >>>>>> 11:59:51.046--ServerSession(400064818)--Connection(674248356)--SELECT >>>>>> t0.ID, t0.DTYPE, t0.CREATEDDATE, t0.DELETED, t0.DESCRIPTION, t0.DQTYPE, >>>>>> t0.MODIFIEDDATE, t0.NAME, t0.ORGANIZATION, t0.OWNER, t0.SINKS, t1.ID, >>>>>> t1.PROCESSTYPE, t1.RULEDESCRIPTION, t1.evaluate_rule_id FROM MEASURE t0, >>>>>> GRIFFINMEASURE t1 WHERE ((t0.DELETED = ?) 
AND ((t1.ID = t0.ID) AND (t0.DTYPE = ?)))
>>>>>> bind => [2 parameters bound]
>>>>>> [EL Fine]: sql: 2020-09-11 12:00:00.019--ClientSession(294162678)--Connection(98503327)--INSERT INTO JOBINSTANCEBEAN (ID, APPID, APPURI, CREATEDDATE, DELETED, expire_timestamp, MODIFIEDDATE, predicate_job_deleted, predicate_group_name, predicate_job_name, SESSIONID, STATE, timestamp, TYPE, job_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
>>>>>> bind => [15 parameters bound]
>>>>>> [EL Fine]: sql: 2020-09-11 12:00:00.09--ServerSession(400064818)--Connection(491395630)--SELECT ID, APPID, APPURI, CREATEDDATE, DELETED, expire_timestamp, MODIFIEDDATE, predicate_job_deleted, predicate_group_name, predicate_job_name, SESSIONID, STATE, timestamp, TYPE, job_id FROM JOBINSTANCEBEAN WHERE (predicate_job_name = ?)
>>>>>> bind => [1 parameter bound]
>>>>>> 2020-09-11 12:00:00.117 INFO 10980 --- [ryBean_Worker-3] o.a.g.c.j.SparkSubmitJob : {
>>>>>>   "measure.type" : "griffin",
>>>>>>   "id" : 201,
>>>>>>   "name" : "Test_Job",
>>>>>>   "owner" : "test",
>>>>>>   "description" : "Measure to check %age of id field values are same",
>>>>>>   "deleted" : false,
>>>>>>   "timestamp" : 1599822000000,
>>>>>>   "dq.type" : "ACCURACY",
>>>>>>   "sinks" : [ "ELASTICSEARCH", "HDFS" ],
>>>>>>   "process.type" : "BATCH",
>>>>>>   "data.sources" : [ {
>>>>>>     "id" : 204,
>>>>>>     "name" : "source",
>>>>>>     "connectors" : [ {
>>>>>>       "id" : 205,
>>>>>>       "name" : "source1599568886803",
>>>>>>       "type" : "HIVE",
>>>>>>       "version" : "1.2",
>>>>>>       "predicates" : [ ],
>>>>>>       "data.unit" : "1hour",
>>>>>>       "data.time.zone" : "",
>>>>>>       "config" : {
>>>>>>         "database" : "default",
>>>>>>         "table.name" : "demo_src",
>>>>>>         "where" : "dt=20200911 AND hour=11"
>>>>>>       }
>>>>>>     } ],
>>>>>>     "baseline" : false
>>>>>>   }, {
>>>>>>     "id" : 206,
>>>>>>     "name" : "target",
>>>>>>     "connectors" : [ {
>>>>>>       "id" : 207,
>>>>>>       "name" : "target1599568896874",
>>>>>>       "type" : "HIVE",
>>>>>>       "version" : "1.2",
>>>>>>       "predicates" : [ ],
>>>>>>       "data.unit" : "1hour",
>>>>>>       "data.time.zone" : "",
>>>>>>       "config" : {
>>>>>>         "database" : "default",
>>>>>>         "table.name" : "demo_tgt",
>>>>>>         "where" : "dt=20200911 AND hour=11"
>>>>>>       }
>>>>>>     } ],
>>>>>>     "baseline" : false
>>>>>>   } ],
>>>>>>   "evaluate.rule" : {
>>>>>>     "id" : 202,
>>>>>>     "rules" : [ {
>>>>>>       "id" : 203,
>>>>>>       "rule" : "source.id=target.id",
>>>>>>       "dsl.type" : "griffin-dsl",
>>>>>>       "dq.type" : "ACCURACY",
>>>>>>       "out.dataframe.name" : "accuracy"
>>>>>>     } ]
>>>>>>   },
>>>>>>   "measure.type" : "griffin"
>>>>>> }
>>>>>> 2020-09-11 12:00:00.119 ERROR 10980 --- [ryBean_Worker-3] o.a.g.c.j.SparkSubmitJob : Post to livy ERROR. I/O error on POST request for "http://localhost:8998/batches": Connection refused (Connection refused); nested exception is java.net.ConnectException: Connection refused (Connection refused)
>>>>>> 2020-09-11 12:00:00.131 INFO 10980 --- [ryBean_Worker-3] o.a.g.c.j.SparkSubmitJob : Delete predicate job(PG,Test_Job_predicate_1599825600016) SUCCESS.
>>>>>> [EL Fine]: sql: 2020-09-11 12:00:00.133--ClientSession(273634815)--Connection(296858203)--UPDATE JOBINSTANCEBEAN SET predicate_job_deleted = ?, STATE = ? WHERE (ID = ?)
>>>>>> bind => [3 parameters bound]
>>>>>> [EL Fine]: sql: 2020-09-11 12:00:11.664--ServerSession(400064818)--Connection(1735064739)--SELECT DISTINCT ID, APPID, APPURI, CREATEDDATE, DELETED, expire_timestamp, MODIFIEDDATE, predicate_job_deleted, predicate_group_name, predicate_job_name, SESSIONID, STATE, timestamp, TYPE, job_id FROM JOBINSTANCEBEAN WHERE (STATE IN (?,?,?,?,?,?))
>>>>>> bind => [6 parameters bound]
>>>>>>
>>>>>> Thanks and Regards,
>>>>>> Sunil Muniyal
>>>>>>
>>>>>> On Fri, Sep 11, 2020 at 3:42 PM William Guo <[email protected]> wrote:
>>>>>>
>>>>>>> From the log, I didn't find any information related to metrics fetching.
>>>>>>>
>>>>>>> Could you try to call /api/v1/metrics, and show us the latest log again?
>>>>>>>
>>>>>>> On Fri, Sep 11, 2020 at 5:48 PM Sunil Muniyal <[email protected]> wrote:
>>>>>>>
>>>>>>>> 1: I guess it is related to your login user and super user.
>>>>>>>> I am less worried about this unless it could be the cause of the metrics not being displayed.
>>>>>>>>
>>>>>>>> 2: Could you share with us your griffin log, I suspect some exception happened when trying to connect with ES.
>>>>>>>> Attached is the service.out file. I see an error while submitting Spark jobs via Livy. Since Livy is not configured / deployed, this is expected. I believe this should not be the reason, since we are getting data from Hive (as part of batch processing). Please correct me if my understanding is incorrect.
>>>>>>>>
>>>>>>>> Thanks and Regards,
>>>>>>>> Sunil Muniyal
>>>>>>>>
>>>>>>>> On Fri, Sep 11, 2020 at 3:09 PM William Guo <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> 1: I guess it is related to your login user and super user.
>>>>>>>>> 2: Could you share with us your griffin log, I suspect some exception happened when trying to connect with ES.
>>>>>>>>> >>>>>>>>> On Fri, Sep 11, 2020 at 5:14 PM Sunil Muniyal < >>>>>>>>> [email protected]> wrote: >>>>>>>>> >>>>>>>>>> Hello William, >>>>>>>>>> >>>>>>>>>> Tried as suggested. >>>>>>>>>> >>>>>>>>>> 1. Ingested data into Hive tables using the provided script. >>>>>>>>>> The ownership still show as is (Source with Admin and Target with >>>>>>>>>> Root) >>>>>>>>>> >>>>>>>>>> 2. Updated env-batch.json and env-streaming.json files with IP >>>>>>>>>> address for ES and rebuilt Griffin. >>>>>>>>>> Still no metrics for the jobs executed. >>>>>>>>>> ES does have data as confirmed yesterday. >>>>>>>>>> >>>>>>>>>> Please help. >>>>>>>>>> >>>>>>>>>> Thanks and Regards, >>>>>>>>>> Sunil Muniyal >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thu, Sep 10, 2020 at 7:41 PM William Guo <[email protected]> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> please enter ip directly. >>>>>>>>>>> not sure whether hostname can be resolved correctly or not. >>>>>>>>>>> >>>>>>>>>>> On Thu, Sep 10, 2020 at 10:06 PM Sunil Muniyal < >>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi William, >>>>>>>>>>>> >>>>>>>>>>>> Thank you for the reply. >>>>>>>>>>>> >>>>>>>>>>>> Regarding points 2 and 3. Possible to share some more details. >>>>>>>>>>>> I believe the env_batch.json is configured as it is expected. What >>>>>>>>>>>> exactly >>>>>>>>>>>> needs to be updated correctly? ES Hostname or shall I enter IP or >>>>>>>>>>>> something >>>>>>>>>>>> else? Please help. >>>>>>>>>>>> >>>>>>>>>>>> Thanks and Regards, >>>>>>>>>>>> Sunil Muniyal >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Sep 10, 2020 at 7:30 PM William Guo <[email protected]> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> 1 OK, We will fix this issue soon. >>>>>>>>>>>>> 2 Could you try ping es from your spark environment and input >>>>>>>>>>>>> ES endpoint correctly in env_batch.json >>>>>>>>>>>>> 3 Please put your es endpoint in env_batch.json >>>>>>>>>>>>> 6 Please try the following script to build your env. 
>>>>>>>>>>>>> ```
>>>>>>>>>>>>> #!/bin/bash
>>>>>>>>>>>>>
>>>>>>>>>>>>> #create table
>>>>>>>>>>>>> hive -f create-table.hql
>>>>>>>>>>>>> echo "create table done"
>>>>>>>>>>>>>
>>>>>>>>>>>>> #current hour
>>>>>>>>>>>>> sudo ./gen_demo_data.sh
>>>>>>>>>>>>> cur_date=`date +%Y%m%d%H`
>>>>>>>>>>>>> dt=${cur_date:0:8}
>>>>>>>>>>>>> hour=${cur_date:8:2}
>>>>>>>>>>>>> partition_date="dt='$dt',hour='$hour'"
>>>>>>>>>>>>> sed s/PARTITION_DATE/$partition_date/ ./insert-data.hql.template > insert-data.hql
>>>>>>>>>>>>> hive -f insert-data.hql
>>>>>>>>>>>>> src_done_path=/griffin/data/batch/demo_src/dt=${dt}/hour=${hour}/_DONE
>>>>>>>>>>>>> tgt_done_path=/griffin/data/batch/demo_tgt/dt=${dt}/hour=${hour}/_DONE
>>>>>>>>>>>>> hadoop fs -mkdir -p /griffin/data/batch/demo_src/dt=${dt}/hour=${hour}
>>>>>>>>>>>>> hadoop fs -mkdir -p /griffin/data/batch/demo_tgt/dt=${dt}/hour=${hour}
>>>>>>>>>>>>> hadoop fs -touchz ${src_done_path}
>>>>>>>>>>>>> hadoop fs -touchz ${tgt_done_path}
>>>>>>>>>>>>> echo "insert data [$partition_date] done"
>>>>>>>>>>>>>
>>>>>>>>>>>>> #last hour
>>>>>>>>>>>>> sudo ./gen_demo_data.sh
>>>>>>>>>>>>> cur_date=`date -d '1 hour ago' +%Y%m%d%H`
>>>>>>>>>>>>> dt=${cur_date:0:8}
>>>>>>>>>>>>> hour=${cur_date:8:2}
>>>>>>>>>>>>> partition_date="dt='$dt',hour='$hour'"
>>>>>>>>>>>>> sed s/PARTITION_DATE/$partition_date/ ./insert-data.hql.template > insert-data.hql
>>>>>>>>>>>>> hive -f insert-data.hql
>>>>>>>>>>>>> src_done_path=/griffin/data/batch/demo_src/dt=${dt}/hour=${hour}/_DONE
>>>>>>>>>>>>> tgt_done_path=/griffin/data/batch/demo_tgt/dt=${dt}/hour=${hour}/_DONE
>>>>>>>>>>>>> hadoop fs -mkdir -p /griffin/data/batch/demo_src/dt=${dt}/hour=${hour}
>>>>>>>>>>>>> hadoop fs -mkdir -p /griffin/data/batch/demo_tgt/dt=${dt}/hour=${hour}
>>>>>>>>>>>>> hadoop fs -touchz ${src_done_path}
>>>>>>>>>>>>> hadoop fs -touchz ${tgt_done_path}
>>>>>>>>>>>>> echo "insert data [$partition_date] done"
>>>>>>>>>>>>>
>>>>>>>>>>>>> #next hours
>>>>>>>>>>>>> set +e
>>>>>>>>>>>>> while true
>>>>>>>>>>>>> do
>>>>>>>>>>>>>   sudo ./gen_demo_data.sh
>>>>>>>>>>>>>   cur_date=`date +%Y%m%d%H`
>>>>>>>>>>>>>   next_date=`date -d "+1hour" '+%Y%m%d%H'`
>>>>>>>>>>>>>   dt=${next_date:0:8}
>>>>>>>>>>>>>   hour=${next_date:8:2}
>>>>>>>>>>>>>   partition_date="dt='$dt',hour='$hour'"
>>>>>>>>>>>>>   sed s/PARTITION_DATE/$partition_date/ ./insert-data.hql.template > insert-data.hql
>>>>>>>>>>>>>   hive -f insert-data.hql
>>>>>>>>>>>>>   src_done_path=/griffin/data/batch/demo_src/dt=${dt}/hour=${hour}/_DONE
>>>>>>>>>>>>>   tgt_done_path=/griffin/data/batch/demo_tgt/dt=${dt}/hour=${hour}/_DONE
>>>>>>>>>>>>>   hadoop fs -mkdir -p /griffin/data/batch/demo_src/dt=${dt}/hour=${hour}
>>>>>>>>>>>>>   hadoop fs -mkdir -p /griffin/data/batch/demo_tgt/dt=${dt}/hour=${hour}
>>>>>>>>>>>>>   hadoop fs -touchz ${src_done_path}
>>>>>>>>>>>>>   hadoop fs -touchz ${tgt_done_path}
>>>>>>>>>>>>>   echo "insert data [$partition_date] done"
>>>>>>>>>>>>>   sleep 3600
>>>>>>>>>>>>> done
>>>>>>>>>>>>> set -e
>>>>>>>>>>>>> ```
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> William
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Sep 10, 2020 at 4:58 PM Sunil Muniyal <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. Since I was able to get ElasticSearch 6.8.x integrated, does it mean that only ES up to 6.8.x is supported for Griffin as of now? If yes, what are the plans further? Is there a page from which I could get updates?
>>>>>>>>>>>>>> -- please file a jira ticket for us to make our code ES compatible.
>>>>>>>>>>>>>> [SM] GRIFFIN-346 - Support for Elastic Search latest version (7.9.1) <https://issues.apache.org/jira/browse/GRIFFIN-346> is submitted
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2. I still do not see the metrics available (please refer to the screenshots below). The measure is now listed in the drop down of the *DQ Metrics* tab, but when I selected the test measure, nothing came up.
>>>>>>>>>>>>>> --could you check the ES whether metrics have been injected >>>>>>>>>>>>>> or not. >>>>>>>>>>>>>> [SM] I used the link below and got the index that is created >>>>>>>>>>>>>> in ES. I believe the data is loaded. However, please correct if I >>>>>>>>>>>>>> understood incorrectly >>>>>>>>>>>>>> *"http://<ES Public IP>:9200/_cat/indices?v"* >>>>>>>>>>>>>> --------------> POC env is on public cloud so using Public IP. >>>>>>>>>>>>>> >>>>>>>>>>>>>> health status index uuid pri rep docs.count >>>>>>>>>>>>>> docs.deleted store.size pri.store.size >>>>>>>>>>>>>> yellow open griffin ur_Kd3XFQBCsPzIM84j87Q 5 2 0 >>>>>>>>>>>>>> 0 1.2kb 1.2kb >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Docs in the index:* "http://<ES Public >>>>>>>>>>>>>> IP>:9200/griffin/_search"* >>>>>>>>>>>>>> >>>>>>>>>>>>>> {"took":44,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}} >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Index Mapping: *"http://<ES Public IP>:9200/griffin"* >>>>>>>>>>>>>> >>>>>>>>>>>>>> {"griffin":{"aliases":{},"mappings":{"accuracy":{"properties":{"name":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"tmst":{"type":"date"}}}},"settings":{"index":{"creation_date":"1599567930578","number_of_shards":"5","number_of_replicas":"2","uuid":"ur_Kd3XFQBCsPzIM84j87Q","version":{"created":"6081299"},"provided_name":"griffin"}}}} >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 3. At a step in deployment guide it is suggested to check >>>>>>>>>>>>>> URL: "*http://<ES HOST IP>:9200/griffin/accuracy"* When >>>>>>>>>>>>>> navigated to this URL, I get below error. Please advise >>>>>>>>>>>>>> *{"error":"Incorrect HTTP method for uri [/griffin/accuracy] >>>>>>>>>>>>>> and method [GET], allowed: [POST]","status":405}* >>>>>>>>>>>>>> *-- it seems you need to use POST method.* >>>>>>>>>>>>>> [SM] I am using the POST method as suggested in the article. 
>>>>>>>>>>>>>> Below is the JSON of *env_batch.JSON* >>>>>>>>>>>>>> * {* >>>>>>>>>>>>>> * "type": "ELASTICSEARCH",* >>>>>>>>>>>>>> * "config": {* >>>>>>>>>>>>>> * "method": "post",* >>>>>>>>>>>>>> * "api": "http://<ES Host >>>>>>>>>>>>>> Name>:9200/griffin/accuracy", ---------> *do we need IP here? >>>>>>>>>>>>>> * "connection.timeout": "1m",* >>>>>>>>>>>>>> * "retry": 10* >>>>>>>>>>>>>> * }* >>>>>>>>>>>>>> * }* >>>>>>>>>>>>>> >>>>>>>>>>>>>> 6. I also noticed that in Data Assets, *demo_src* is owned >>>>>>>>>>>>>> by Admin whereas, *demo-tgt* by root. Would that make any >>>>>>>>>>>>>> difference? If yes, how to correct it? Reload HIVE data? >>>>>>>>>>>>>> -- could you show me your script for dataset setup? >>>>>>>>>>>>>> <https://issues.apache.org/jira/browse/GRIFFIN-346> >>>>>>>>>>>>>> [SM] Attached are the 3 scripts. gen-hive-data.sh is the >>>>>>>>>>>>>> master script which triggers demo_data and it further triggers >>>>>>>>>>>>>> delta_src. >>>>>>>>>>>>>> Have done it as it is instructed in the Github article and >>>>>>>>>>>>>> gen-hive-data.sh is triggered as root in the terminal. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please advise. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks and Regards, >>>>>>>>>>>>>> Sunil Muniyal >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, Sep 9, 2020 at 8:41 PM William Guo <[email protected]> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> *Request you to please advise further on below points:* >>>>>>>>>>>>>>> 1. Since I was able to get ElasticSearch 6.8.x integrated, >>>>>>>>>>>>>>> does it mean that only ES upto 6.8.x is supported for Griffin >>>>>>>>>>>>>>> as of now? If >>>>>>>>>>>>>>> yes, what are the plans further? Is there a page from which I >>>>>>>>>>>>>>> could get >>>>>>>>>>>>>>> updates? >>>>>>>>>>>>>>> --please file a jira ticket for us to make our code ES >>>>>>>>>>>>>>> compatible. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2. I still do not see the metrics available (please refer >>>>>>>>>>>>>>> below screenshots). 
Though the measure is now listed in the >>>>>>>>>>>>>>> drop down of *DQ >>>>>>>>>>>>>>> Metrics* tab. But when I selected the test measure, nothing >>>>>>>>>>>>>>> came up. >>>>>>>>>>>>>>> --could you check the ES whether metrics have been injected >>>>>>>>>>>>>>> or not. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 3. At a step in deployment guide it is suggested to check >>>>>>>>>>>>>>> URL: http://<ES HOST IP>:9200/griffin/accuracy >>>>>>>>>>>>>>> <http://13.126.127.141:9200/griffin/accuracy> When >>>>>>>>>>>>>>> navigated to this URL, I get below error. Please advise >>>>>>>>>>>>>>> *{"error":"Incorrect HTTP method for uri [/griffin/accuracy] >>>>>>>>>>>>>>> and method [GET], allowed: [POST]","status":405}* >>>>>>>>>>>>>>> *-- it seems you need to use POST method.* >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 6. I also noticed that in Data Assets, *demo_src* is owned >>>>>>>>>>>>>>> by Admin whereas, *demo-tgt* by root. Would that make any >>>>>>>>>>>>>>> difference? If yes, how to correct it? Reload HIVE data? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- could you show me your script for dataset setup? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Tue, Sep 8, 2020 at 9:02 PM Sunil Muniyal < >>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi William, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I was finally able to get Griffin up and ElasticSearch >>>>>>>>>>>>>>>> integrated along with Hadoop. Thanks a lot for your help and >>>>>>>>>>>>>>>> guidance so >>>>>>>>>>>>>>>> far. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I have created a test measure and a job which gets >>>>>>>>>>>>>>>> triggered at every 4 mins automatically (have referred to the >>>>>>>>>>>>>>>> user guide >>>>>>>>>>>>>>>> available on GitHub at this link >>>>>>>>>>>>>>>> <https://github.com/apache/griffin/blob/master/griffin-doc/ui/user-guide.md> >>>>>>>>>>>>>>>> .) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> *Request you to please advise further on below points:* >>>>>>>>>>>>>>>> 1. 
Since I was able to get ElasticSearch 6.8.x integrated, >>>>>>>>>>>>>>>> does it mean that only ES upto 6.8.x is supported for Griffin >>>>>>>>>>>>>>>> as of now? If >>>>>>>>>>>>>>>> yes, what are the plans further? Is there a page from which I >>>>>>>>>>>>>>>> could get >>>>>>>>>>>>>>>> updates? >>>>>>>>>>>>>>>> 2. I still do not see the metrics available (please refer >>>>>>>>>>>>>>>> below screenshots). Though the measure is now listed in the >>>>>>>>>>>>>>>> drop down of *DQ >>>>>>>>>>>>>>>> Metrics* tab. But when I selected the test measure, >>>>>>>>>>>>>>>> nothing came up. >>>>>>>>>>>>>>>> 3. At a step in deployment guide it is suggested to check >>>>>>>>>>>>>>>> URL: http://<ES HOST IP>:9200/griffin/accuracy >>>>>>>>>>>>>>>> <http://13.126.127.141:9200/griffin/accuracy> When >>>>>>>>>>>>>>>> navigated to this URL, I get below error. Please advise >>>>>>>>>>>>>>>> *{"error":"Incorrect HTTP method for uri >>>>>>>>>>>>>>>> [/griffin/accuracy] and method [GET], allowed: >>>>>>>>>>>>>>>> [POST]","status":405}* >>>>>>>>>>>>>>>> 6. I also noticed that in Data Assets, *demo_src* is owned >>>>>>>>>>>>>>>> by Admin whereas, *demo-tgt* by root. Would that make any >>>>>>>>>>>>>>>> difference? If yes, how to correct it? Reload HIVE data? 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> *Screenshots:* >>>>>>>>>>>>>>>> *Data Assets:* >>>>>>>>>>>>>>>> [image: image.png] >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> *DQ Metrics (Test Measure selected):* >>>>>>>>>>>>>>>> [image: image.png] >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> *Job Triggered multiple times:* >>>>>>>>>>>>>>>> [image: image.png] >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> *Metrics page from job directly:* >>>>>>>>>>>>>>>> [image: image.png] >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks and Regards, >>>>>>>>>>>>>>>> Sunil Muniyal >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Tue, Sep 8, 2020 at 4:38 PM Sunil Muniyal < >>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I am unable to get repos for 6.4.1 instead I found 6.8.x. >>>>>>>>>>>>>>>>> Will try with this version of Elastic Search in sometime. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> In the meantime, would it be possible to confirm if 6.4.x >>>>>>>>>>>>>>>>> or 6.8.x is the only supported version for Griffin? Reason I >>>>>>>>>>>>>>>>> am asking is, >>>>>>>>>>>>>>>>> the GitHub article for griffin deployment points to the >>>>>>>>>>>>>>>>> latest version of >>>>>>>>>>>>>>>>> ES. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks and Regards, >>>>>>>>>>>>>>>>> Sunil Muniyal >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Tue, Sep 8, 2020 at 4:06 PM Sunil Muniyal < >>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I will need to redeploy ElasticSearch, correct? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks and Regards, >>>>>>>>>>>>>>>>>> Sunil Muniyal >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Tue, Sep 8, 2020 at 4:05 PM William Guo < >>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Could you try with this version? 
>>>>>>>>>>>>>>>>>>> <elasticsearch.version>6.4.1</elasticsearch.version> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> William >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Tue, Sep 8, 2020 at 5:59 PM Sunil Muniyal < >>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Hi William / Dev group, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I have deployed ES 7.9 - latest version (single node) >>>>>>>>>>>>>>>>>>>> and the same is configured. I also get the default page >>>>>>>>>>>>>>>>>>>> when hitting >>>>>>>>>>>>>>>>>>>> http://<ES HOST IP>:9200/ >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Upon creating the griffin configurations using the JSON >>>>>>>>>>>>>>>>>>>> string given >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> curl -k -H "Content-Type: application/json" -X PUT >>>>>>>>>>>>>>>>>>>> http://<replaced with my ES host IP>:9200/griffin \ >>>>>>>>>>>>>>>>>>>> -d '{ >>>>>>>>>>>>>>>>>>>> "aliases": {}, >>>>>>>>>>>>>>>>>>>> "mappings": { >>>>>>>>>>>>>>>>>>>> "accuracy": { >>>>>>>>>>>>>>>>>>>> "properties": { >>>>>>>>>>>>>>>>>>>> "name": { >>>>>>>>>>>>>>>>>>>> "fields": { >>>>>>>>>>>>>>>>>>>> "keyword": { >>>>>>>>>>>>>>>>>>>> "ignore_above": 256, >>>>>>>>>>>>>>>>>>>> "type": "keyword" >>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>> }, >>>>>>>>>>>>>>>>>>>> "type": "text" >>>>>>>>>>>>>>>>>>>> }, >>>>>>>>>>>>>>>>>>>> "tmst": { >>>>>>>>>>>>>>>>>>>> "type": "date" >>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>> }, >>>>>>>>>>>>>>>>>>>> "settings": { >>>>>>>>>>>>>>>>>>>> "index": { >>>>>>>>>>>>>>>>>>>> "number_of_replicas": "2", >>>>>>>>>>>>>>>>>>>> "number_of_shards": "5" >>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>> }' >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> *I get below error:* >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> *{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"Root >>>>>>>>>>>>>>>>>>>> mapping 
definition has unsupported parameters: [accuracy : >>>>>>>>>>>>>>>>>>>> {properties={name={fields={keyword={ignore_above=256, >>>>>>>>>>>>>>>>>>>> type=keyword}}, >>>>>>>>>>>>>>>>>>>> type=text}, >>>>>>>>>>>>>>>>>>>> tmst={type=date}}}]"}],"type":"mapper_parsing_exception","reason":"Failed >>>>>>>>>>>>>>>>>>>> to parse mapping [_doc]: Root mapping definition has >>>>>>>>>>>>>>>>>>>> unsupported >>>>>>>>>>>>>>>>>>>> parameters: [accuracy : >>>>>>>>>>>>>>>>>>>> {properties={name={fields={keyword={ignore_above=256, >>>>>>>>>>>>>>>>>>>> type=keyword}}, >>>>>>>>>>>>>>>>>>>> type=text}, >>>>>>>>>>>>>>>>>>>> tmst={type=date}}}]","caused_by":{"type":"mapper_parsing_exception","reason":"Root >>>>>>>>>>>>>>>>>>>> mapping definition has unsupported parameters: [accuracy : >>>>>>>>>>>>>>>>>>>> {properties={name={fields={keyword={ignore_above=256, >>>>>>>>>>>>>>>>>>>> type=keyword}}, >>>>>>>>>>>>>>>>>>>> type=text}, tmst={type=date}}}]"}},"status":400}* >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Seems like the JSON string is missing some values or is >>>>>>>>>>>>>>>>>>>> incorrectly provided. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Would be great if you could please help. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks and Regards, >>>>>>>>>>>>>>>>>>>> Sunil Muniyal >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Mon, Sep 7, 2020 at 8:16 PM Sunil Muniyal < >>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thank you for the response, William. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I have started preparing for ES deployment and should >>>>>>>>>>>>>>>>>>>>> attempt the same tomorrow. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> In the meantime, I will also wait for the Dev team in >>>>>>>>>>>>>>>>>>>>> case they have any additional inputs. 
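The 400 above is expected from a 7.x cluster: mapping types were removed in Elasticsearch 7.0, so the 6.x-style body that nests the properties under an "accuracy" type is rejected. A minimal sketch of the typeless 7.x equivalent follows; note this only creates the index, and whether the Griffin release in use can write to and read from a typeless index is exactly what GRIFFIN-346 is tracking:

```shell
# 7.x-compatible variant of the index body quoted above: the "accuracy"
# type level is dropped, everything else is unchanged.
MAPPING='{
  "aliases": {},
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
      },
      "tmst": { "type": "date" }
    }
  },
  "settings": {
    "index": { "number_of_shards": "5", "number_of_replicas": "2" }
  }
}'

echo "$MAPPING"
# Create the index (needs a reachable ES 7.x node):
# curl -k -H "Content-Type: application/json" -X PUT "http://<ES host>:9200/griffin" -d "$MAPPING"
```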
>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks and Regards, >>>>>>>>>>>>>>>>>>>>> Sunil Muniyal >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Mon, Sep 7, 2020 at 8:06 PM William Guo < >>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> If dev confirms it to be mandatory, as I understand >>>>>>>>>>>>>>>>>>>>>> correct, I will need to: >>>>>>>>>>>>>>>>>>>>>> 1. Deploy and Configure ES >>>>>>>>>>>>>>>>>>>>>> 2. Update application.properties to include ES >>>>>>>>>>>>>>>>>>>>>> details and create ES index >>>>>>>>>>>>>>>>>>>>>> 3. Rebuild Maven package and rerun the Griffin service >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> *Right, you need to package es env configuration into >>>>>>>>>>>>>>>>>>>>>> your jar.* >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> There is no need to reload the data into Hadoop >>>>>>>>>>>>>>>>>>>>>> (Hive), correct? >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> *No* >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On a side note, is there any other documentation of >>>>>>>>>>>>>>>>>>>>>> Griffin available or underway which would help to get >>>>>>>>>>>>>>>>>>>>>> below details while >>>>>>>>>>>>>>>>>>>>>> integrating it with Cloudera Hadoop? >>>>>>>>>>>>>>>>>>>>>> 1. What are the exact ports requirements (internal >>>>>>>>>>>>>>>>>>>>>> and external)? >>>>>>>>>>>>>>>>>>>>>> *check log and make sure all extra connections in >>>>>>>>>>>>>>>>>>>>>> properties can accessible* >>>>>>>>>>>>>>>>>>>>>> 2. Which all packages will be required? >>>>>>>>>>>>>>>>>>>>>> *no* >>>>>>>>>>>>>>>>>>>>>> 3. Any Java dependencies? >>>>>>>>>>>>>>>>>>>>>> *java 1.8* >>>>>>>>>>>>>>>>>>>>>> 4. If we have Cloudera Hadoop cluster kerberized >>>>>>>>>>>>>>>>>>>>>> (secured), what are the dependencies or additional >>>>>>>>>>>>>>>>>>>>>> configurations needed? 
>>>>>>>>>>>>>>>>>>>>>> *Should no extra dependencies, except >>>>>>>>>>>>>>>>>>>>>> those transitive dependencies incurred by spark and >>>>>>>>>>>>>>>>>>>>>> hadoop.* >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 7, 2020 at 6:42 PM Sunil Muniyal < >>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Ohh ok. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> If dev confirms it to be mandatory, as I understand >>>>>>>>>>>>>>>>>>>>>>> correct, I will need to: >>>>>>>>>>>>>>>>>>>>>>> 1. Deploy and Configure ES >>>>>>>>>>>>>>>>>>>>>>> 2. Update application.properties to include ES >>>>>>>>>>>>>>>>>>>>>>> details and create ES index >>>>>>>>>>>>>>>>>>>>>>> 3. Rebuild Maven package and rerun the Griffin >>>>>>>>>>>>>>>>>>>>>>> service >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> There is no need to reload the data into Hadoop >>>>>>>>>>>>>>>>>>>>>>> (Hive), correct? >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On a side note, is there any other documentation of >>>>>>>>>>>>>>>>>>>>>>> Griffin available or underway which would help to get >>>>>>>>>>>>>>>>>>>>>>> below details while >>>>>>>>>>>>>>>>>>>>>>> integrating it with Cloudera Hadoop? >>>>>>>>>>>>>>>>>>>>>>> 1. What are the exact ports requirements (internal >>>>>>>>>>>>>>>>>>>>>>> and external)? >>>>>>>>>>>>>>>>>>>>>>> 2. Which all packages will be required? >>>>>>>>>>>>>>>>>>>>>>> 3. Any Java dependencies? >>>>>>>>>>>>>>>>>>>>>>> 4. If we have Cloudera Hadoop cluster kerberized >>>>>>>>>>>>>>>>>>>>>>> (secured), what are the dependencies or additional >>>>>>>>>>>>>>>>>>>>>>> configurations needed? >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I know some of the above information can be fetched >>>>>>>>>>>>>>>>>>>>>>> from the deployment guide on Github. However, checking >>>>>>>>>>>>>>>>>>>>>>> if any other formal >>>>>>>>>>>>>>>>>>>>>>> documentation has been made available for the same? 
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thanks and Regards, >>>>>>>>>>>>>>>>>>>>>>> Sunil Muniyal >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 7, 2020 at 4:05 PM William Guo < >>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> cc dev for double checking. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Measure will emit metrics and store them in >>>>>>>>>>>>>>>>>>>>>>>> elastic, UI fetch those metrics from elastic. >>>>>>>>>>>>>>>>>>>>>>>> So elastic should be mandatory. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>> William >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 7, 2020 at 6:32 PM Sunil Muniyal < >>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Thank you for the quick response, William. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I have not configured ElasticSearch since it is >>>>>>>>>>>>>>>>>>>>>>>>> not deployed. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> In the application.properties, I just added the >>>>>>>>>>>>>>>>>>>>>>>>> dummy information (as below) just to pass the >>>>>>>>>>>>>>>>>>>>>>>>> validation test and get >>>>>>>>>>>>>>>>>>>>>>>>> Griffin up and running. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> # elasticsearch >>>>>>>>>>>>>>>>>>>>>>>>> # elasticsearch.host = <IP> >>>>>>>>>>>>>>>>>>>>>>>>> # elasticsearch.port = <elasticsearch rest port> >>>>>>>>>>>>>>>>>>>>>>>>> # elasticsearch.user = user >>>>>>>>>>>>>>>>>>>>>>>>> # elasticsearch.password = password >>>>>>>>>>>>>>>>>>>>>>>>> elasticsearch.host=localhost >>>>>>>>>>>>>>>>>>>>>>>>> elasticsearch.port=9200 >>>>>>>>>>>>>>>>>>>>>>>>> elasticsearch.scheme=http >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Is ElasticSearch a mandatory requirement to use >>>>>>>>>>>>>>>>>>>>>>>>> Griffin? 
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>>>>>>>>>>>>>>>> Sunil Muniyal
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 7, 2020 at 3:58 PM William Guo <
>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Could you check whether ES has been injected
>>>>>>>>>>>>>>>>>>>>>>>>>> with those metrics or not?
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 7, 2020 at 6:23 PM Sunil Muniyal <
>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Hello William,
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> I was able to bypass this error by entering the
>>>>>>>>>>>>>>>>>>>>>>>>>>> default field values for LDAP, Elasticsearch
>>>>>>>>>>>>>>>>>>>>>>>>>>> and Livy in application.properties and
>>>>>>>>>>>>>>>>>>>>>>>>>>> successfully got Griffin running.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> By following the article below, I have created
>>>>>>>>>>>>>>>>>>>>>>>>>>> a test measure and then a job that triggers
>>>>>>>>>>>>>>>>>>>>>>>>>>> that measure.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/griffin/blob/master/griffin-doc/ui/user-guide.md
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> I have allowed the job to be triggered multiple
>>>>>>>>>>>>>>>>>>>>>>>>>>> times; however, I still can't see anything in
>>>>>>>>>>>>>>>>>>>>>>>>>>> metrics related to the job. Neither do I see
>>>>>>>>>>>>>>>>>>>>>>>>>>> anything in the *health* or *mydashboard* tabs.
>>>>>>>>>>>>>>>>>>>>>>>>>>> Also, if you notice in the screenshot below, in
>>>>>>>>>>>>>>>>>>>>>>>>>>> the *DQ Metrics* tab I still do not see the
>>>>>>>>>>>>>>>>>>>>>>>>>>> created measure in the drop-down list.
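[Editor's note: William's check above ("whether ES has been injected with those metrics") can be scripted. A minimal sketch, assuming the deploy guide's defaults — index "griffin", type "accuracy", timestamp field "tmst", ES on localhost:9200 — all of which are placeholders to adjust for your cluster.]

```python
# Sketch: count Griffin metric documents in Elasticsearch.
# URL, index/type names, and the "tmst" sort field are assumptions
# taken from the deploy guide's defaults; adjust for your cluster.
import json
import urllib.request

ES_URL = "http://localhost:9200/griffin/accuracy/_search?sort=tmst:desc&size=5"

def count_hits(search_response: dict) -> int:
    """Number of matching documents reported by an ES search response."""
    total = search_response.get("hits", {}).get("total", 0)
    # ES 7+ nests the count as {"value": N, "relation": "eq"}
    return total["value"] if isinstance(total, dict) else total

def fetch_metric_count(url: str = ES_URL) -> int:
    """Query ES and return how many metric documents matched."""
    with urllib.request.urlopen(url) as resp:
        return count_hits(json.load(resp))

if __name__ == "__main__":
    print("metric documents in ES:", fetch_metric_count())
```

A count of 0 while the job keeps completing would mean the measure never wrote to ES at all, pointing at the measure's sink configuration rather than the UI.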
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> [image: image.png]
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> *Test job executed multiple times:*
>>>>>>>>>>>>>>>>>>>>>>>>>>> [image: image.png]
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Please advise if anything is misconfigured.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>> Sunil Muniyal
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 7, 2020 at 12:40 PM Sunil Muniyal <
>>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hello William,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you for the reply.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> This helped; I had actually missed adding the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> property in application.properties.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Now the other challenge is that, along with ES
>>>>>>>>>>>>>>>>>>>>>>>>>>>> and Livy, I am also not using LDAP, and it is
>>>>>>>>>>>>>>>>>>>>>>>>>>>> hitting the error *unable to resolve ldap.url
>>>>>>>>>>>>>>>>>>>>>>>>>>>> property*. Of course it will, since the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> property is not configured.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Please suggest.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sunil Muniyal
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sun, Sep 6, 2020 at 7:26 PM William Guo <
>>>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> hi Sunil Muniyal,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Could you check this property in your griffin
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> properties file?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> internal.event.listeners
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> William
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Sep 3, 2020 at 11:05 PM Sunil Muniyal <
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am attempting to integrate Griffin with
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cloudera Hadoop by following the article
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> below:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/apache/griffin/blob/master/griffin-doc/deploy/deploy-guide.md
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I have followed everything as instructed,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> apart from the following:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. Using Cloudera Hadoop 5.15 and the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> relevant configurations instead of Apache
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hadoop
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. Not using Elasticsearch, as it is not
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> applicable
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3. Did not use Livy, as it is not applicable
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The Maven build is successful and produced
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2 jars at service/target and measure/target,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> which I have uploaded to HDFS.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, *starting griffin-service.jar using
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the nohup command* is failing with the error
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> below:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Caused by: java.lang.IllegalArgumentException:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Could not resolve placeholder
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 'internal.event.listeners' in string value
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "#{'${internal.event.listeners}'.split(',')}"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at org.springframework.util.PropertyPlaceholderHelper.parseStringValue(PropertyPlaceholderHelper.java:174) ~[spring-core-4.3.6.RELEASE.jar!/:4.3.6.RELEASE]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at org.springframework.util.PropertyPlaceholderHelper.replacePlaceholders(PropertyPlaceholderHelper.java:126) ~[spring-core-4.3.6.RELEASE.jar!/:4.3.6.RELEASE]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at org.springframework.core.env.AbstractPropertyResolver.doResolvePlaceholders(AbstractPropertyResolver.java:236) ~[spring-core-4.3.6.RELEASE.jar!/:4.3.6.RELEASE]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I have tried searching a lot of articles with
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> no luck.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It would be great if someone could help me
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> fix this.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Also attached is the output of the nohup
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> command that was written to service.out.
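[Editor's note: as William's reply and the later follow-up in this thread indicate, the placeholder fails because the property is simply absent from application.properties. The fix is to define it; the value below is what I understand to be the default shipped in Griffin's bundled configuration — verify it against your Griffin version.]

```properties
# Comma-separated list of job event listener hooks. The Spring expression
# #{'${internal.event.listeners}'.split(',')} cannot resolve if this key
# is missing, producing the IllegalArgumentException above.
internal.event.listeners=GriffinJobEventHook
```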
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sunil Muniyal
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> To unsubscribe, e-mail: [email protected]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For additional commands, e-mail: [email protected]
