<livy.core.version>0.3.0</livy.core.version>
<scala.version>2.11.8</scala.version>
<spark.version>2.2.1</spark.version>
On Fri, Sep 11, 2020 at 9:37 PM Sunil Muniyal <[email protected]> wrote:

Hi William,

Getting below error while starting Livy server. Any suggestions? If you notice, I already have Spark 1.6 and the Livy documentation (http://livy.incubator.apache.org/get-started/) states that it is supported: "Livy requires at least Spark 1.6 and supports both Scala 2.10 and 2.11 builds of Spark."

20/09/11 13:34:57 INFO server.AccessManager: AccessControlManager acls disabled; users with view permission: ; users with modify permission: ; users with super permission: ; other allowed users: *
20/09/11 13:34:57 INFO utils.LineBufferedStream: Welcome to [Spark ASCII banner] version 1.6.0
20/09/11 13:34:57 INFO utils.LineBufferedStream: Type --help for more information.
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: Unsupported Spark version (1,6)
    at scala.Predef$.require(Predef.scala:224)
    at org.apache.livy.utils.LivySparkUtils$.testSparkVersion(LivySparkUtils.scala:79)
    at org.apache.livy.server.LivyServer.start(LivyServer.scala:81)
    at org.apache.livy.server.LivyServer$.main(LivyServer.scala:423)
    at org.apache.livy.server.LivyServer.main(LivyServer.scala)

Thanks and Regards,
Sunil Muniyal

On Fri, Sep 11, 2020 at 6:40 PM Sunil Muniyal <[email protected]> wrote:

I guess that could be the reason then. I will deploy Livy and rebuild Griffin then... will post back the results.

In the meantime, if you get any other info that could help, please do let me know.
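For anyone hitting the same wall: the failure above is Livy's own Spark version check, so the two things worth confirming locally are which Spark install Livy actually picks up, and whether anything is listening on the Livy port Griffin will call. A minimal sketch, assuming a Spark 2.x install under /opt/spark-2.2.1 and the default Livy port 8998 (both locations are assumptions, adjust to your environment):

```shell
#!/bin/bash
# Sanity checks before starting Livy / pointing Griffin at it.
# SPARK_HOME and LIVY_URL defaults are assumptions.
SPARK_HOME="${SPARK_HOME:-/opt/spark-2.2.1}"
LIVY_URL="${LIVY_URL:-http://localhost:8998}"

# Extract "major.minor" from spark-submit's "version X.Y.Z" banner line.
spark_major_minor() {
  printf '%s\n' "$1" | sed -n 's/.*version \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

if [ -x "$SPARK_HOME/bin/spark-submit" ]; then
  banner=$("$SPARK_HOME/bin/spark-submit" --version 2>&1)
  echo "Livy would launch Spark $(spark_major_minor "$banner")"
else
  echo "No spark-submit found under $SPARK_HOME"
fi

# Is anything answering on the Livy REST port?
if curl -sf "$LIVY_URL/batches" >/dev/null 2>&1; then
  echo "Livy reachable at $LIVY_URL"
else
  echo "Livy NOT reachable at $LIVY_URL"
fi
```

The version string parsed here is the same "version 1.6.0" banner that appears in the log above; Livy reads SPARK_HOME from its conf/livy-env.sh, so pointing that at a 2.x install is what makes the check pass.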
Thanks and Regards,
Sunil Muniyal

On Fri, Sep 11, 2020 at 6:38 PM William Guo <[email protected]> wrote:

yes, griffin leverages livy to post spark jobs to spark cluster.

if you manually submit a job to spark, griffin cannot automatically refresh metrics from es.

On Fri, Sep 11, 2020 at 9:03 PM Sunil Muniyal <[email protected]> wrote:

As of now I am doing the test with Hive tables. Spark jobs aren't submitted by me yet.

A question... so if I understand correctly, the metrics will be generated only after a spark job is executed and an accuracy check is performed between two hive tables via the submitted spark job? Is that correct? If yes, I guess Livy is needed then so that Griffin can submit spark jobs by itself, else I will have to manually submit a Spark job first. If the latter is an option, how do we do that w.r.t. Griffin?

On Fri, Sep 11, 2020 at 6:30 PM William Guo <[email protected]> wrote:

check this
https://spark.apache.org/docs/2.2.1/monitoring.html

On Fri, Sep 11, 2020 at 8:58 PM William Guo <[email protected]> wrote:

for griffin log, please search in your spark cluster env, usually in worker log dir.
One weird thing is how you submit a job to spark, if you disabled livy?

On Fri, Sep 11, 2020 at 8:46 PM Sunil Muniyal <[email protected]> wrote:

possible to please help the location where measure log would get created or from where can i check the location?

On Fri, Sep 11, 2020 at 6:14 PM William Guo <[email protected]> wrote:

Livy is used to post jobs to your cluster, I don't think it is related to livy.
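For completeness, "manually submit a job to spark" here means a plain spark-submit of Griffin's measure jar. A sketch of what that looks like; the main class follows Griffin's measure module, while the jar and config paths below are assumptions:

```shell
#!/bin/bash
# Sketch: manual submission of a Griffin measure without Livy.
# Jar/config locations are assumptions; adjust to your deployment.
MEASURE_JAR=/opt/griffin/measure/griffin-measure.jar
ENV_JSON=hdfs:///griffin/json/env.json
DQ_JSON=hdfs:///griffin/json/dq.json

CMD="spark-submit --class org.apache.griffin.measure.Application \
--master yarn --deploy-mode client \
$MEASURE_JAR $ENV_JSON $DQ_JSON"

# Print instead of executing, so the sketch runs anywhere. Note that, as
# William says above, a manual run computes metrics but the Griffin UI
# will not automatically refresh them.
echo "$CMD"
```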
Could you also share the measure log in your cluster?

On Fri, Sep 11, 2020 at 8:03 PM Sunil Muniyal <[email protected]> wrote:

Got below message as output of:

{"Test_Measure":[{"name":"Test_Job","type":"ACCURACY","owner":"test","metricValues":[]}]}

metricValues seems empty. So is it like Griffin is not getting data from ES? Whereas ES does have the data, which we verified previously. By any chance, do you think not having Livy could be a problem?

These are the latest logs from service.out:

[EL Fine]: sql: 2020-09-11 11:59:11.662--ServerSession(400064818)--Connection(754936662)--SELECT DISTINCT ID, APPID, APPURI, CREATEDDATE, DELETED, expire_timestamp, MODIFIEDDATE, predicate_job_deleted, predicate_group_name, predicate_job_name, SESSIONID, STATE, timestamp, TYPE, job_id FROM JOBINSTANCEBEAN WHERE (STATE IN (?,?,?,?,?,?))
    bind => [6 parameters bound]
[EL Fine]: sql: 2020-09-11 11:59:51.044--ServerSession(400064818)--Connection(353930083)--SELECT ID, type, CREATEDDATE, CRONEXPRESSION, DELETED, quartz_group_name, JOBNAME, MEASUREID, METRICNAME, MODIFIEDDATE, quartz_job_name, PREDICATECONFIG, TIMEZONE FROM job WHERE (DELETED = ?)
    bind => [1 parameter bound]
[EL Fine]: sql: 2020-09-11 11:59:51.046--ServerSession(400064818)--Connection(1245663749)--SELECT DISTINCT DTYPE FROM MEASURE WHERE (DELETED = ?)
    bind => [1 parameter bound]
[EL Fine]: sql: 2020-09-11 11:59:51.046--ServerSession(400064818)--Connection(674248356)--SELECT t0.ID, t0.DTYPE, t0.CREATEDDATE, t0.DELETED, t0.DESCRIPTION, t0.DQTYPE, t0.MODIFIEDDATE, t0.NAME, t0.ORGANIZATION, t0.OWNER, t0.SINKS, t1.ID, t1.PROCESSTYPE, t1.RULEDESCRIPTION, t1.evaluate_rule_id FROM MEASURE t0, GRIFFINMEASURE t1 WHERE ((t0.DELETED = ?) AND ((t1.ID = t0.ID) AND (t0.DTYPE = ?)))
    bind => [2 parameters bound]
[EL Fine]: sql: 2020-09-11 12:00:00.019--ClientSession(294162678)--Connection(98503327)--INSERT INTO JOBINSTANCEBEAN (ID, APPID, APPURI, CREATEDDATE, DELETED, expire_timestamp, MODIFIEDDATE, predicate_job_deleted, predicate_group_name, predicate_job_name, SESSIONID, STATE, timestamp, TYPE, job_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
    bind => [15 parameters bound]
[EL Fine]: sql: 2020-09-11 12:00:00.09--ServerSession(400064818)--Connection(491395630)--SELECT ID, APPID, APPURI, CREATEDDATE, DELETED, expire_timestamp, MODIFIEDDATE, predicate_job_deleted, predicate_group_name, predicate_job_name, SESSIONID, STATE, timestamp, TYPE, job_id FROM JOBINSTANCEBEAN WHERE (predicate_job_name = ?)
    bind => [1 parameter bound]
2020-09-11 12:00:00.117 INFO 10980 --- [ryBean_Worker-3] o.a.g.c.j.SparkSubmitJob : {
  "measure.type" : "griffin",
  "id" : 201,
  "name" : "Test_Job",
  "owner" : "test",
  "description" : "Measure to check %age of id field values are same",
  "deleted" : false,
  "timestamp" : 1599822000000,
  "dq.type" : "ACCURACY",
  "sinks" : [ "ELASTICSEARCH", "HDFS" ],
  "process.type" : "BATCH",
  "data.sources" : [ {
    "id" : 204,
    "name" : "source",
    "connectors" : [ {
      "id" : 205,
      "name" : "source1599568886803",
      "type" : "HIVE",
      "version" : "1.2",
      "predicates" : [ ],
      "data.unit" : "1hour",
      "data.time.zone" : "",
      "config" : {
        "database" : "default",
        "table.name" : "demo_src",
        "where" : "dt=20200911 AND hour=11"
      }
    } ],
    "baseline" : false
  }, {
    "id" : 206,
    "name" : "target",
    "connectors" : [ {
      "id" : 207,
      "name" : "target1599568896874",
      "type" : "HIVE",
      "version" : "1.2",
      "predicates" : [ ],
      "data.unit" : "1hour",
      "data.time.zone" : "",
      "config" : {
        "database" : "default",
        "table.name" : "demo_tgt",
        "where" : "dt=20200911 AND hour=11"
      }
    } ],
    "baseline" : false
  } ],
  "evaluate.rule" : {
    "id" : 202,
    "rules" : [ {
      "id" : 203,
      "rule" : "source.id=target.id",
      "dsl.type" : "griffin-dsl",
      "dq.type" : "ACCURACY",
      "out.dataframe.name" : "accuracy"
    } ]
  },
  "measure.type" : "griffin"
}
2020-09-11 12:00:00.119 ERROR 10980 --- [ryBean_Worker-3] o.a.g.c.j.SparkSubmitJob : Post to livy ERROR. I/O error on POST request for "http://localhost:8998/batches": Connection refused (Connection refused); nested exception is java.net.ConnectException: Connection refused (Connection refused)
2020-09-11 12:00:00.131 INFO 10980 --- [ryBean_Worker-3] o.a.g.c.j.SparkSubmitJob : Delete predicate job(PG,Test_Job_predicate_1599825600016) SUCCESS.
[EL Fine]: sql: 2020-09-11 12:00:00.133--ClientSession(273634815)--Connection(296858203)--UPDATE JOBINSTANCEBEAN SET predicate_job_deleted = ?, STATE = ? WHERE (ID = ?)
    bind => [3 parameters bound]
[EL Fine]: sql: 2020-09-11 12:00:11.664--ServerSession(400064818)--Connection(1735064739)--SELECT DISTINCT ID, APPID, APPURI, CREATEDDATE, DELETED, expire_timestamp, MODIFIEDDATE, predicate_job_deleted, predicate_group_name, predicate_job_name, SESSIONID, STATE, timestamp, TYPE, job_id FROM JOBINSTANCEBEAN WHERE (STATE IN (?,?,?,?,?,?))
    bind => [6 parameters bound]

Thanks and Regards,
Sunil Muniyal

On Fri, Sep 11, 2020 at 3:42 PM William Guo <[email protected]> wrote:

From the log, I didn't find any information related to metrics fetching.

Could you try to call /api/v1/metrics, and show us the latest log again?

On Fri, Sep 11, 2020 at 5:48 PM Sunil Muniyal <[email protected]> wrote:

> 1: I guess it is related to your login user and super user.
I am less worried about this unless it could be the cause of metrics not being displayed.
> 2: Could you share with us your griffin log, I suspect some exception happened when trying to connect with ES.
Attached is the service.out file. I see an error while submitting Spark jobs via Livy. Since Livy is not configured / deployed, this is expected. I believe this should not be the reason, since we are getting data from hive (as part of batch processing). Please correct if my understanding is incorrect.

Thanks and Regards,
Sunil Muniyal

On Fri, Sep 11, 2020 at 3:09 PM William Guo <[email protected]> wrote:

1: I guess it is related to your login user and super user.
2: Could you share with us your griffin log, I suspect some exception happened when trying to connect with ES.

On Fri, Sep 11, 2020 at 5:14 PM Sunil Muniyal <[email protected]> wrote:

Hello William,

Tried as suggested.

1. Ingested data into Hive tables using the provided script. The ownership still shows as is (Source with Admin and Target with Root).

2. Updated env-batch.json and env-streaming.json files with IP address for ES and rebuilt Griffin. Still no metrics for the jobs executed. ES does have data as confirmed yesterday.

Please help.

Thanks and Regards,
Sunil Muniyal

On Thu, Sep 10, 2020 at 7:41 PM William Guo <[email protected]> wrote:

please enter ip directly.
not sure whether hostname can be resolved correctly or not.
Thanks and Regards,
Sunil Muniyal

On Thu, Sep 10, 2020 at 10:06 PM Sunil Muniyal <[email protected]> wrote:

Hi William,

Thank you for the reply.

Regarding points 2 and 3, possible to share some more details? I believe the env_batch.json is configured as it is expected. What exactly needs to be updated? The ES hostname, or shall I enter an IP, or something else? Please help.

Thanks and Regards,
Sunil Muniyal

On Thu, Sep 10, 2020 at 7:30 PM William Guo <[email protected]> wrote:

1 OK, We will fix this issue soon.
2 Could you try ping es from your spark environment and input ES endpoint correctly in env_batch.json
3 Please put your es endpoint in env_batch.json
6 Please try the following script to build your env.
```
#!/bin/bash

# create table
hive -f create-table.hql
echo "create table done"

# current hour
sudo ./gen_demo_data.sh
cur_date=`date +%Y%m%d%H`
dt=${cur_date:0:8}
hour=${cur_date:8:2}
partition_date="dt='$dt',hour='$hour'"
sed s/PARTITION_DATE/$partition_date/ ./insert-data.hql.template > insert-data.hql
hive -f insert-data.hql
src_done_path=/griffin/data/batch/demo_src/dt=${dt}/hour=${hour}/_DONE
tgt_done_path=/griffin/data/batch/demo_tgt/dt=${dt}/hour=${hour}/_DONE
hadoop fs -mkdir -p /griffin/data/batch/demo_src/dt=${dt}/hour=${hour}
hadoop fs -mkdir -p /griffin/data/batch/demo_tgt/dt=${dt}/hour=${hour}
hadoop fs -touchz ${src_done_path}
hadoop fs -touchz ${tgt_done_path}
echo "insert data [$partition_date] done"

# last hour
sudo ./gen_demo_data.sh
cur_date=`date -d '1 hour ago' +%Y%m%d%H`
dt=${cur_date:0:8}
hour=${cur_date:8:2}
partition_date="dt='$dt',hour='$hour'"
sed s/PARTITION_DATE/$partition_date/ ./insert-data.hql.template > insert-data.hql
hive -f insert-data.hql
src_done_path=/griffin/data/batch/demo_src/dt=${dt}/hour=${hour}/_DONE
tgt_done_path=/griffin/data/batch/demo_tgt/dt=${dt}/hour=${hour}/_DONE
hadoop fs -mkdir -p /griffin/data/batch/demo_src/dt=${dt}/hour=${hour}
hadoop fs -mkdir -p /griffin/data/batch/demo_tgt/dt=${dt}/hour=${hour}
hadoop fs -touchz ${src_done_path}
hadoop fs -touchz ${tgt_done_path}
echo "insert data [$partition_date] done"

# next hours
set +e
while true
do
  sudo ./gen_demo_data.sh
  cur_date=`date +%Y%m%d%H`
  next_date=`date -d "+1hour" '+%Y%m%d%H'`
  dt=${next_date:0:8}
  hour=${next_date:8:2}
  partition_date="dt='$dt',hour='$hour'"
  sed s/PARTITION_DATE/$partition_date/ ./insert-data.hql.template > insert-data.hql
  hive -f insert-data.hql
  src_done_path=/griffin/data/batch/demo_src/dt=${dt}/hour=${hour}/_DONE
  tgt_done_path=/griffin/data/batch/demo_tgt/dt=${dt}/hour=${hour}/_DONE
  hadoop fs -mkdir -p /griffin/data/batch/demo_src/dt=${dt}/hour=${hour}
  hadoop fs -mkdir -p /griffin/data/batch/demo_tgt/dt=${dt}/hour=${hour}
  hadoop fs -touchz ${src_done_path}
  hadoop fs -touchz ${tgt_done_path}
  echo "insert data [$partition_date] done"
  sleep 3600
done
set -e
```

Thanks,
William

On Thu, Sep 10, 2020 at 4:58 PM Sunil Muniyal <[email protected]> wrote:

1. Since I was able to get ElasticSearch 6.8.x integrated, does it mean that only ES up to 6.8.x is supported for Griffin as of now? If yes, what are the plans further? Is there a page from which I could get updates?
-- please file a jira ticket for us to make our code ES compatible.
[SM] GRIFFIN-346 - Support for Elastic Search latest version (7.9.1) <https://issues.apache.org/jira/browse/GRIFFIN-346> is submitted

2.
I still do not see the metrics available (please refer below screenshots). Though the measure is now listed in the drop down of DQ Metrics tab, when I selected the test measure, nothing came up.
-- could you check the ES whether metrics have been injected or not.
[SM] I used the link below and got the index that is created in ES. I believe the data is loaded. However, please correct if I understood incorrectly.
"http://<ES Public IP>:9200/_cat/indices?v" --------> POC env is on public cloud so using Public IP.

health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   griffin ur_Kd3XFQBCsPzIM84j87Q   5   2          0            0      1.2kb          1.2kb

Docs in the index: "http://<ES Public IP>:9200/griffin/_search"

{"took":44,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}

Index Mapping: "http://<ES Public IP>:9200/griffin"

{"griffin":{"aliases":{},"mappings":{"accuracy":{"properties":{"name":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"tmst":{"type":"date"}}}},"settings":{"index":{"creation_date":"1599567930578","number_of_shards":"5","number_of_replicas":"2","uuid":"ur_Kd3XFQBCsPzIM84j87Q","version":{"created":"6081299"},"provided_name":"griffin"}}}}

3.
At a step in deployment guide it is suggested to check URL: "http://<ES HOST IP>:9200/griffin/accuracy". When navigated to this URL, I get below error. Please advise.
{"error":"Incorrect HTTP method for uri [/griffin/accuracy] and method [GET], allowed: [POST]","status":405}
-- it seems you need to use POST method.
[SM] I am using the POST method as suggested in the article. Below is the JSON of env_batch.json:

{
  "type": "ELASTICSEARCH",
  "config": {
    "method": "post",
    "api": "http://<ES Host Name>:9200/griffin/accuracy",   --------> do we need IP here?
    "connection.timeout": "1m",
    "retry": 10
  }
}

6. I also noticed that in Data Assets, demo_src is owned by Admin whereas demo_tgt by root. Would that make any difference? If yes, how to correct it? Reload HIVE data?
-- could you show me your script for dataset setup?
[SM] Attached are the 3 scripts. gen-hive-data.sh is the master script which triggers demo_data and it further triggers delta_src. Have done it as it is instructed in the GitHub article, and gen-hive-data.sh is triggered as root in the terminal.

Please advise.
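To exercise the sink config above without running a measure, one can replay the POST that Griffin's ELASTICSEARCH sink would make. The probe document below is hypothetical (just the name/tmst fields from the index mapping), and ES_API must be filled in; per William's advice elsewhere in this thread, a plain IP is safer than a hostname here:

```shell
#!/bin/bash
# Sketch: emulate the sink POST Griffin makes to the "api" URL in env_batch.json.
# ES_API must be set to the real endpoint; the probe document is hypothetical.
ES_API="${ES_API:-}"
DOC='{"name": "probe_metric", "tmst": 1599822000000}'

if [ -n "$ES_API" ]; then
  curl -sk -H "Content-Type: application/json" -X POST "$ES_API" -d "$DOC"
else
  echo "Set ES_API (e.g. http://<ES IP>:9200/griffin/accuracy) to run the probe"
  echo "$DOC"
fi
```

If this POST succeeds but the index stays at docs.count 0 (as in the _cat/indices output above), the problem is on Griffin's side of the connection rather than ES.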
Thanks and Regards,
Sunil Muniyal

On Wed, Sep 9, 2020 at 8:41 PM William Guo <[email protected]> wrote:

> Request you to please advise further on below points:
> 1. Since I was able to get ElasticSearch 6.8.x integrated, does it mean that only ES up to 6.8.x is supported for Griffin as of now? If yes, what are the plans further? Is there a page from which I could get updates?
-- please file a jira ticket for us to make our code ES compatible.

> 2. I still do not see the metrics available (please refer below screenshots). Though the measure is now listed in the drop down of DQ Metrics tab. But when I selected the test measure, nothing came up.
-- could you check the ES whether metrics have been injected or not.

> 3. At a step in deployment guide it is suggested to check URL: http://<ES HOST IP>:9200/griffin/accuracy. When navigated to this URL, I get below error. Please advise.
> {"error":"Incorrect HTTP method for uri [/griffin/accuracy] and method [GET], allowed: [POST]","status":405}
-- it seems you need to use POST method.

> 6. I also noticed that in Data Assets, demo_src is owned by Admin whereas demo_tgt by root. Would that make any difference? If yes, how to correct it? Reload HIVE data?
-- could you show me your script for dataset setup?

On Tue, Sep 8, 2020 at 9:02 PM Sunil Muniyal <[email protected]> wrote:

Hi William,

I was finally able to get Griffin up and ElasticSearch integrated along with Hadoop. Thanks a lot for your help and guidance so far.

I have created a test measure and a job which gets triggered at every 4 mins automatically (have referred to the user guide available on GitHub at this link <https://github.com/apache/griffin/blob/master/griffin-doc/ui/user-guide.md>).

Request you to please advise further on below points:
1. Since I was able to get ElasticSearch 6.8.x integrated, does it mean that only ES up to 6.8.x is supported for Griffin as of now? If yes, what are the plans further? Is there a page from which I could get updates?
2. I still do not see the metrics available (please refer below screenshots). Though the measure is now listed in the drop down of DQ Metrics tab. But when I selected the test measure, nothing came up.
3. At a step in deployment guide it is suggested to check URL: http://<ES HOST IP>:9200/griffin/accuracy. When navigated to this URL, I get below error. Please advise.
{"error":"Incorrect HTTP method for uri [/griffin/accuracy] and method [GET], allowed: [POST]","status":405}
6. I also noticed that in Data Assets, demo_src is owned by Admin whereas demo_tgt by root. Would that make any difference? If yes, how to correct it? Reload HIVE data?

Screenshots:
Data Assets: [image: image.png]
DQ Metrics (Test Measure selected): [image: image.png]
Job Triggered multiple times: [image: image.png]
Metrics page from job directly: [image: image.png]

Thanks and Regards,
Sunil Muniyal

On Tue, Sep 8, 2020 at 4:38 PM Sunil Muniyal <[email protected]> wrote:

I am unable to get repos for 6.4.1; instead I found 6.8.x. Will try with this version of Elastic Search in sometime.

In the meantime, would it be possible to confirm if 6.4.x or 6.8.x is the only supported version for Griffin? Reason I am asking is, the GitHub article for griffin deployment points to the latest version of ES.
Thanks and Regards,
Sunil Muniyal

On Tue, Sep 8, 2020 at 4:06 PM Sunil Muniyal <[email protected]> wrote:

I will need to redeploy ElasticSearch, correct?

Thanks and Regards,
Sunil Muniyal

On Tue, Sep 8, 2020 at 4:05 PM William Guo <[email protected]> wrote:

Could you try with this version?
<elasticsearch.version>6.4.1</elasticsearch.version>

Thanks,
William

On Tue, Sep 8, 2020 at 5:59 PM Sunil Muniyal <[email protected]> wrote:

Hi William / Dev group,

I have deployed ES 7.9 - latest version (single node) and the same is configured.
I also get the default page when hitting http://<ES HOST IP>:9200/

Upon creating the griffin configurations using the JSON string given:

curl -k -H "Content-Type: application/json" -X PUT http://<replaced with my ES host IP>:9200/griffin \
  -d '{
    "aliases": {},
    "mappings": {
      "accuracy": {
        "properties": {
          "name": {
            "fields": {
              "keyword": {
                "ignore_above": 256,
                "type": "keyword"
              }
            },
            "type": "text"
          },
          "tmst": {
            "type": "date"
          }
        }
      }
    },
    "settings": {
      "index": {
        "number_of_replicas": "2",
        "number_of_shards": "5"
      }
    }
  }'

I get below error:

{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"Root mapping definition has unsupported parameters: [accuracy : {properties={name={fields={keyword={ignore_above=256, type=keyword}}, type=text}, tmst={type=date}}}]"}],"type":"mapper_parsing_exception","reason":"Failed to parse mapping [_doc]: Root mapping definition has unsupported parameters: [accuracy : {properties={name={fields={keyword={ignore_above=256, type=keyword}}, type=text}, tmst={type=date}}}]","caused_by":{"type":"mapper_parsing_exception","reason":"Root mapping definition has unsupported parameters: [accuracy : {properties={name={fields={keyword={ignore_above=256, type=keyword}}, type=text}, tmst={type=date}}}]"}},"status":400}

Seems like the JSON string is missing some values or is incorrectly provided.

Would be great if you could please help.

Thanks and Regards,
Sunil Muniyal

On Mon, Sep 7, 2020 at 8:16 PM Sunil Muniyal <[email protected]> wrote:

Thank you for the response, William.

I have started preparing for ES deployment and should attempt the same tomorrow.

In the meantime, I will also wait for the Dev team in case they have any additional inputs.
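For reference, the 400 above is not caused by a malformed JSON string: Elasticsearch 7.x removed mapping types, so the 6.x-style body with an "accuracy" type under "mappings" is rejected by an ES 7.9 node. A typeless sketch that ES 7 would accept is below (index name and field layout taken from the thread; whether the Griffin build in use can then write to a typeless index is a separate question, tracked in GRIFFIN-346 mentioned earlier, which is why the thread eventually settles on ES 6.8.x):

```shell
#!/bin/bash
# Sketch: ES 7-compatible (typeless) version of the index creation above.
# The "accuracy" mapping *type* is what ES 7 rejects; its fields move up a level.
ES_URL="${ES_URL:-}"   # e.g. http://<ES HOST IP>:9200 ; left empty for a dry run
BODY='{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
      },
      "tmst": { "type": "date" }
    }
  },
  "settings": { "index": { "number_of_shards": "5", "number_of_replicas": "2" } }
}'

if [ -n "$ES_URL" ]; then
  curl -k -H "Content-Type: application/json" -X PUT "$ES_URL/griffin" -d "$BODY"
else
  echo "$BODY"   # dry run when no ES endpoint is set
fi
```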
Thanks and Regards,
Sunil Muniyal


On Mon, Sep 7, 2020 at 8:06 PM William Guo <[email protected]> wrote:

If dev confirms it to be mandatory, as I understand correctly, I will need to:
1. Deploy and configure ES
2. Update application.properties to include the ES details and create the ES index
3. Rebuild the Maven package and rerun the Griffin service

*Right, you need to package the ES env configuration into your jar.*

There is no need to reload the data into Hadoop (Hive), correct?

*No.*

On a side note, is there any other documentation for Griffin, available or underway, that would help with the details below while integrating it with Cloudera Hadoop?
1. What are the exact port requirements (internal and external)?
*Check the log and make sure all extra connections in the properties are accessible.*
2. Which packages will be required?
*None.*
3. Any Java dependencies?
*Java 1.8.*
4. If we have a kerberized (secured) Cloudera Hadoop cluster, what dependencies or additional configurations are needed?
*There should be no extra dependencies, except the transitive dependencies pulled in by Spark and Hadoop.*

On Mon, Sep 7, 2020 at 6:42 PM Sunil Muniyal <[email protected]> wrote:

Ohh ok.

If dev confirms it to be mandatory, as I understand correctly, I will need to:
1. Deploy and configure ES
2. Update application.properties to include the ES details and create the ES index
3. Rebuild the Maven package and rerun the Griffin service

There is no need to reload the data into Hadoop (Hive), correct?

On a side note, is there any other documentation for Griffin, available or underway, that would help with the details below while integrating it with Cloudera Hadoop?
1. What are the exact port requirements (internal and external)?
2. Which packages will be required?
3. Any Java dependencies?
4. If we have a kerberized (secured) Cloudera Hadoop cluster, what dependencies or additional configurations are needed?

I know some of the above information can be fetched from the deployment guide on GitHub.
However, I wanted to check whether any other formal documentation has been made available for the same.

Thanks and Regards,
Sunil Muniyal


On Mon, Sep 7, 2020 at 4:05 PM William Guo <[email protected]> wrote:

cc dev for double checking.

Measure will emit metrics and store them in elastic; the UI fetches those metrics from elastic. So elastic should be mandatory.

Thanks,
William

On Mon, Sep 7, 2020 at 6:32 PM Sunil Muniyal <[email protected]> wrote:

Thank you for the quick response, William.

I have not configured Elasticsearch since it is not deployed.

In application.properties, I just added dummy information (as below) to pass the validation test and get Griffin up and running.

# elasticsearch
# elasticsearch.host = <IP>
# elasticsearch.port = <elasticsearch rest port>
# elasticsearch.user = user
# elasticsearch.password = password
elasticsearch.host=localhost
elasticsearch.port=9200
elasticsearch.scheme=http

Is Elasticsearch a mandatory requirement to use Griffin?

Thanks and Regards,
Sunil Muniyal


On Mon, Sep 7, 2020 at 3:58 PM William Guo <[email protected]> wrote:

Could you check whether ES has been injected with those metrics or not?

On Mon, Sep 7, 2020 at 6:23 PM Sunil Muniyal <[email protected]> wrote:

Hello William,

I was able to bypass this error by entering the default field values for LDAP, Elasticsearch and Livy in application.properties, and successfully got Griffin running.

By following the article below, I have created a test measure and then a job which triggers that measure.
https://github.com/apache/griffin/blob/master/griffin-doc/ui/user-guide.md

I have allowed the job to be triggered multiple times; however, I still can't see anything in metrics related to the job, nor do I see anything in the *health* or *mydashboard* tabs. Also, if you notice in the screenshot below, while on the *DQ Metrics* tab I still do not see the created measure in the drop-down list.

[image: image.png]

*Test job executed multiple times:*
[image: image.png]

Please advise if anything is misconfigured.

Thanks and Regards,
Sunil Muniyal


On Mon, Sep 7, 2020 at 12:40 PM Sunil Muniyal <[email protected]> wrote:

Hello William,

Thank you for the reply.

This helped; I had actually missed adding the property in application.properties.
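One way to act on the suggestion to "check whether ES has been injected with those metrics" is to query the griffin index directly. A hedged sketch (the index name `griffin` is taken from the thread's index-creation command; replace the host placeholder with your ES host):

```
curl -XGET "http://<ES HOST IP>:9200/griffin/_search?pretty&size=10"
```

If `hits.total` in the response is 0, the measure jobs are not writing any metrics into Elasticsearch, which would also explain the empty DQ Metrics, health, and mydashboard views.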
Now the other challenge is that, along with ES and Livy, I am also not using LDAP, and it is hitting the error *unable to resolve ldap.url property*. Of course it will, since the property is not configured.

Please suggest.

Thanks and Regards,
Sunil Muniyal


On Sun, Sep 6, 2020 at 7:26 PM William Guo <[email protected]> wrote:

hi Sunil Muniyal,

Could you check this property in your griffin properties file?

internal.event.listeners

Thanks,
William


On Thu, Sep 3, 2020 at 11:05 PM Sunil Muniyal <[email protected]> wrote:

Hello,

I am attempting to integrate Griffin with Cloudera Hadoop by following the article below:

https://github.com/apache/griffin/blob/master/griffin-doc/deploy/deploy-guide.md

I have followed everything as instructed, apart from the following:
1. Using Cloudera Hadoop 5.15 and the relevant configurations instead of Apache Hadoop
2. Not using Elasticsearch as it is not applicable
3. Did not use Livy as it is not applicable

The Maven build is successful and produced 2 jars under service/target and measure/target, which I have uploaded to HDFS.
However, *starting griffin-service.jar using the nohup command* is failing with the below error:

*Caused by: java.lang.IllegalArgumentException: Could not resolve placeholder 'internal.event.listeners' in string value "#{'${internal.event.listeners}'.split(',')}"*
* at org.springframework.util.PropertyPlaceholderHelper.parseStringValue(PropertyPlaceholderHelper.java:174) ~[spring-core-4.3.6.RELEASE.jar!/:4.3.6.RELEASE]*
* at org.springframework.util.PropertyPlaceholderHelper.replacePlaceholders(PropertyPlaceholderHelper.java:126) ~[spring-core-4.3.6.RELEASE.jar!/:4.3.6.RELEASE]*
* at org.springframework.core.env.AbstractPropertyResolver.doResolvePlaceholders(AbstractPropertyResolver.java:236) ~[spring-core-4.3.6.RELEASE.jar!/:4.3.6.RELEASE]*

I have searched through a lot of articles with no luck.

It would be great if someone could help me fix this.

Also, attached is the output of the nohup command that was written to service.out.
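The "Could not resolve placeholder" failure means the key referenced by the Spring expression, internal.event.listeners, is absent from application.properties; defining it, even with an empty value, lets the placeholder resolve (whether an empty listener list is acceptable to Griffin at runtime is a separate question). A minimal preflight sketch that flags missing keys before starting the service; the `check_props` helper and its key list are hypothetical, for illustration only:

```shell
# check_props: report required Griffin keys missing from a properties file
# (hypothetical preflight helper; the key list below is illustrative only)
check_props() {
  props=$1
  for key in internal.event.listeners elasticsearch.host elasticsearch.port; do
    grep -q "^${key}=" "$props" || echo "missing: $key"
  done
}

# example: a file that defines the ES keys but omits internal.event.listeners
cat > /tmp/app.properties <<'EOF'
elasticsearch.host=localhost
elasticsearch.port=9200
EOF
check_props /tmp/app.properties   # prints: missing: internal.event.listeners
```

Running a check like this before `nohup java -jar griffin-service.jar` turns a Spring startup stack trace into a one-line, actionable message.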
Thanks and Regards,
Sunil Muniyal

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
