Hello Lionel,

As suggested, I kept only one configuration:

sparkJob.jars = hdfs:///livy/datanucleus-api-jdo-3.2.6.jar;\
  hdfs:///livy/datanucleus-core-3.2.10.jar;\
  hdfs:///livy/datanucleus-rdbms-3.2.9.jar

* I kept env.json, griffin-measure.jar, and hive-site.xml in hdfs:///griffin/griffin-measure.jar.
* I created an accuracy measure and created a job with a 1-minute cron expression.

Attached is the log file. It seems it can't get to HDFS, although my Cloudera HDFS is up.

Regards
Ashwini

From: Lionel Liu <lionel...@apache.org>
Sent: 13 April 2018 15:14
To: Ashwini Kumar Gupta <ashwini.gu...@enquero.com>
Cc: dev@griffin.incubator.apache.org
Subject: Re: Re:RE: Can't get output after creating measure

Hi Ashwini,

I've read your document; here are my answers:

· Do I keep both of them?
You should keep only the effective "sparkJob.jars" parameter.

· Do I have to copy hive-site.xml to HDFS and give the HDFS path in spark.yarn.dist.files?
You'd better copy hive-site.xml to HDFS, because Livy can only submit Spark applications in cluster mode, so hive-site.xml needs to be accessible from every node.

About the Livy log: it indicates that your sparkJob.properties configuration doesn't take effect; Livy is trying to find hdfs:///griffin/griffin-measure.jar, not hdfs:///user/griffin/griffin-measure.jar. Please correct sparkJob.properties, rebuild the service module, and try again.

Thanks,
Lionel

On Fri, Apr 13, 2018 at 4:16 PM, Ashwini Kumar Gupta <ashwini.gu...@enquero.com> wrote:

Hello Lionel,

Apologies for the delayed reply. I was trying all my options before raising an issue. I'm attaching my installation steps. Please let me know what's wrong with them.
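For reference, here is a minimal sketch of how the settings discussed above might sit together in sparkJob.properties. The property names are taken from this thread; the exact keys and the hdfs:///griffin/ location for hive-site.xml are assumptions and can differ between Griffin versions, so treat this as an illustration rather than the definitive file:

```properties
# Sketch only: keys as discussed in this thread; verify against your Griffin version.
# Keep a single effective sparkJob.jars entry ("\" continues the line, ";" separates jars):
sparkJob.jars = hdfs:///livy/datanucleus-api-jdo-3.2.6.jar;\
  hdfs:///livy/datanucleus-core-3.2.10.jar;\
  hdfs:///livy/datanucleus-rdbms-3.2.9.jar

# hive-site.xml should live on HDFS so every node can read it in cluster mode
# (the hdfs:///griffin/ path below is an assumed location, not from the thread):
spark.yarn.dist.files = hdfs:///griffin/hive-site.xml
```

Since Livy submits the measure application in cluster mode, any path that only exists on the local filesystem of the service host will not be visible to the executors; that is why both the jars and hive-site.xml are referenced via hdfs:/// URIs.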
Regards
Ashwin

From: bhlx3l...@163.com <bhlx3l...@163.com> On Behalf Of Lionel Liu
Sent: 10 April 2018 18:11
To: dev@griffin.incubator.apache.org; Ashwini Kumar Gupta <ashwini.gu...@enquero.com>
Subject: Re:RE: Can't get output after creating measure

Hi Ashwini,

It works the same on a Linux OS. We need to check the log to figure out what happened; it might be a configuration mistake or an input mistake. I recommend you try our docker image first, by following this doc:
https://github.com/apache/incubator-griffin/blob/master/griffin-doc/docker/griffin-docker-guide.md

--
Regards,
Lionel, Liu

At 2018-04-10 19:25:01, "Ashwini Kumar Gupta" <ashwini.gu...@enquero.com> wrote:
>Hello Lionel,
>
>I'm running this in a Cloudera VM. Will that change anything?
>
>Regards
>Ashwin
>
>From: Lionel Liu <lionel...@apache.org>
>Sent: 10 April 2018 15:26
>To: dev@griffin.incubator.apache.org; Ashwini Kumar Gupta <ashwini.gu...@enquero.com>
>Subject: Re: Can't get output after creating measure
>
>Hi Ashwini,
>
>First, check the log of the Griffin service to see whether it has triggered the job instance.
>Then, the Griffin service submits a Spark application with its configuration to Livy; check the Livy log to verify that the submission succeeded.
>After that, check the Spark cluster to verify the application has been accepted; if it runs, you can get the application log through YARN.
>
>An error at any of these steps can block the result.
>
>Thanks,
>Lionel
>
>On Tue, Apr 10, 2018 at 5:01 PM, William Guo <gu...@apache.org> wrote:
>hi Ashwin,
>
>Could you show us your log here?
>
>Thanks,
>William
>
>On Tue, Apr 10, 2018 at 3:35 PM, Ashwini Kumar Gupta <ashwini.gu...@enquero.com> wrote:
>
>> Hello Team,
>>
>> I have been trying to install and use Griffin, but I cannot get output when
>> I click on DQ Matrix.
>>
>> I created a measure, then created a job to run it.
>> The sequence in which I run all the services is:
>>
>> 1. Elasticsearch
>> 2. Jar file
>>
>> I also noticed that Griffin is not creating the mapping in ES.
>>
>> Can you please tell me where I'm going wrong?
>>
>> Thanks
>> Ashwin
>>
Hibernate: select predicates0_.data_connector_id as data_con6_11_0_, predicates0_.id as id1_11_0_, predicates0_.id as id1_11_1_, predicates0_.created_date as created_2_11_1_, predicates0_.modified_date as modified3_11_1_, predicates0_.config as config4_11_1_, predicates0_.type as type5_11_1_ from segment_predicate predicates0_ where predicates0_.data_connector_id=?

Hibernate: select rules0_.evaluate_rule_id as evaluat11_10_0_, rules0_.id as id1_10_0_, rules0_.id as id1_10_1_, rules0_.created_date as created_2_10_1_, rules0_.modified_date as modified3_10_1_, rules0_.details as details4_10_1_, rules0_.dq_type as dq_type5_10_1_, rules0_.dsl_type as dsl_type6_10_1_, rules0_.metric as metric7_10_1_, rules0_.name as name8_10_1_, rules0_.record as record9_10_1_, rules0_.rule as rule10_10_1_ from rule rules0_ where rules0_.evaluate_rule_id=?

Hibernate: select predicates0_.data_connector_id as data_con6_11_0_, predicates0_.id as id1_11_0_, predicates0_.id as id1_11_1_, predicates0_.created_date as created_2_11_1_, predicates0_.modified_date as modified3_11_1_, predicates0_.config as config4_11_1_, predicates0_.type as type5_11_1_ from segment_predicate predicates0_ where predicates0_.data_connector_id=?
Hibernate: select griffinjob0_.id as id2_5_1_, griffinjob0_.created_date as created_3_5_1_, griffinjob0_.modified_date as modified4_5_1_, griffinjob0_.deleted as deleted5_5_1_, griffinjob0_.job_name as job_name6_5_1_, griffinjob0_.measure_id as measure_7_5_1_, griffinjob0_.metric_name as metric_n8_5_1_, griffinjob0_.quartz_group_name as quartz_g9_5_1_, griffinjob0_.quartz_job_name as quartz_10_5_1_, jobinstanc1_.job_id as job_id13_7_3_, jobinstanc1_.id as id1_7_3_, jobinstanc1_.id as id1_7_0_, jobinstanc1_.created_date as created_2_7_0_, jobinstanc1_.modified_date as modified3_7_0_, jobinstanc1_.app_id as app_id4_7_0_, jobinstanc1_.app_uri as app_uri5_7_0_, jobinstanc1_.predicate_job_deleted as predicat6_7_0_, jobinstanc1_.expire_timestamp as expire_t7_7_0_, jobinstanc1_.predicate_group_name as predicat8_7_0_, jobinstanc1_.predicate_job_name as predicat9_7_0_, jobinstanc1_.session_id as session10_7_0_, jobinstanc1_.state as state11_7_0_, jobinstanc1_.timestamp as timesta12_7_0_ from job griffinjob0_ left outer join job_instance_bean jobinstanc1_ on griffinjob0_.id=jobinstanc1_.job_id where griffinjob0_.id=? and griffinjob0_.type='griffin_job'

Hibernate: insert into job_instance_bean (created_date, modified_date, app_id, app_uri, predicate_job_deleted, expire_timestamp, predicate_group_name, predicate_job_name, session_id, state, timestamp) values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)

Hibernate: update job_instance_bean set job_id=? where id=?
Hibernate: select jobinstanc0_.id as id1_7_, jobinstanc0_.created_date as created_2_7_, jobinstanc0_.modified_date as modified3_7_, jobinstanc0_.app_id as app_id4_7_, jobinstanc0_.app_uri as app_uri5_7_, jobinstanc0_.predicate_job_deleted as predicat6_7_, jobinstanc0_.expire_timestamp as expire_t7_7_, jobinstanc0_.predicate_group_name as predicat8_7_, jobinstanc0_.predicate_job_name as predicat9_7_, jobinstanc0_.session_id as session10_7_, jobinstanc0_.state as state11_7_, jobinstanc0_.timestamp as timesta12_7_ from job_instance_bean jobinstanc0_ where jobinstanc0_.predicate_job_name=?

2018-04-14 23:03:00.354 INFO 21565 --- [ryBean_Worker-3] o.a.griffin.core.job.SparkSubmitJob : { "measure.type" : "griffin", "id" : 1, "name" : "test_job", "owner" : "test", "description" : null, "organization" : null, "deleted" : false, "timestamp" : 1523772120000, "dq.type" : "accuracy", "process.type" : "batch", "data.sources" : [ { "id" : 1, "name" : "source", "connectors" : [ { "id" : 1, "name" : "source1523771608446", "type" : "HIVE", "version" : "1.2", "predicates" : [ { "id" : 1, "type" : "file.exist", "config" : { "root.path" : "hdfs://quickstart.cloudera:8020/user/hive/warehouse/griffin.db/emp_src", "path" : "dt=20180415 AND hour=06/_DONE" } } ], "data.unit" : "1minute", "data.time.zone" : "UTC+5:30(IDT)", "config" : { "database" : "griffin", "table.name" : "emp_src", "where" : "dt=20180415 AND hour=06" } } ] }, { "id" : 2, "name" : "target", "connectors" : [ { "id" : 2, "name" : "target1523771619816", "type" : "HIVE", "version" : "1.2", "predicates" : [ { "id" : 2, "type" : "file.exist", "config" : { "root.path" : "hdfs://quickstart.cloudera:8020/user/hive/warehouse/griffin.db/emp_tgt", "path" : "dt=20180415 AND hour=06/_DONE1" } } ], "data.unit" : "1minute", "data.time.zone" : "UTC+5:30(IDT)", "config" : { "database" : "griffin", "table.name" : "emp_tgt", "where" : "dt=20180415 AND hour=06" } } ] } ], "evaluate.rule" : { "id" : 1, "rules" : [ { "id" : 1, "rule" : "source.id=target.id AND source.name=target.name AND source.city=target.city", "name" : "accuracy", "dsl.type" : "griffin-dsl", "dq.type" : "accuracy" } ] }, "measure.type" : "griffin" }

2018-04-14 23:03:00.354 INFO 21565 --- [ryBean_Worker-3] o.a.g.core.job.FileExistPredicator : Predicate path: hdfs://quickstart.cloudera:8020/user/hive/warehouse/griffin.db/emp_srcdt=20180415 AND hour=06/_DONE
2018-04-14 23:03:00.487 INFO 21565 --- [ryBean_Worker-3] org.apache.griffin.core.util.FSUtil : Setting fs.defaultFS:hdfs://hdfs-default-name
2018-04-14 23:03:00.487 INFO 21565 --- [ryBean_Worker-3] org.apache.griffin.core.util.FSUtil : Setting fs.hdfs.impl:org.apache.hadoop.hdfs.DistributedFileSystem
2018-04-14 23:03:00.487 INFO 21565 --- [ryBean_Worker-3] org.apache.griffin.core.util.FSUtil : Setting fs.file.impl:org.apache.hadoop.fs.LocalFileSystem
2018-04-14 23:03:01.604 ERROR 21565 --- [ryBean_Worker-3] org.apache.griffin.core.util.FSUtil : Can not get hdfs file system.
java.net.UnknownHostException: hdfs-default-name
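The final `java.net.UnknownHostException: hdfs-default-name` in the log above shows the service is still resolving the placeholder value `hdfs://hdfs-default-name` for `fs.defaultFS` rather than a real namenode address. A hedged sketch of the likely fix follows; the property key comes from the `Setting fs.defaultFS:` log line itself, the namenode URI is assumed from the predicate paths in the same log, and which configuration file holds this key may differ by Griffin version:

```properties
# Assumption: replace the "hdfs-default-name" placeholder with the actual
# namenode address; quickstart.cloudera:8020 appears in the predicate paths
# logged by FileExistPredicator above.
fs.defaultFS = hdfs://quickstart.cloudera:8020
```

Separately, note that the logged predicate path reads `...griffin.db/emp_srcdt=20180415...` with no slash between `emp_src` and `dt=`, which suggests `root.path` and `path` are concatenated directly; a trailing `/` on `root.path` (or leading `/` on `path`) may also be needed for the file.exist predicate to point at the intended partition.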
[Attachment: hive-site.xml]