[
https://issues.apache.org/jira/browse/GRIFFIN-189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16603881#comment-16603881
]
Michael Kisly commented on GRIFFIN-189:
---------------------------------------
Hello, I've been working directly with Cory on this.
# We are not using the Docker container; we installed Griffin ourselves.
# Yes, our Livy URL is simply [http://localhost:8998/batches]
# Yes, we have been able to submit requests to Livy, and they function
correctly.
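For reference, the batch submission we expect Griffin to post to that endpoint looks roughly like the following. This is a minimal sketch, not Griffin's actual code: the file, class name, and jar paths are copied from our sparkJob.properties below, and the second program argument (args_2) is presumably filled in by Griffin at submit time, so it is omitted here.

```python
import json


def build_livy_batch_payload():
    # JSON body for POST http://localhost:8998/batches
    # (Content-Type: application/json). Values mirror sparkJob.properties;
    # args_2 (the measure definition) is assumed to be injected by Griffin
    # at runtime and is intentionally left out.
    return {
        "file": "hdfs://localhost:9000/griffin/griffin-measure.jar",
        "className": "org.apache.griffin.measure.Application",
        "args": [
            "hdfs://localhost:9000/env/env.json",  # sparkJob.args_1
            "hdfs,raw",                            # sparkJob.args_3
        ],
        "jars": [
            "hdfs://localhost:9000/livy/datanucleus-api-jdo-3.2.6.jar",
            "hdfs://localhost:9000/livy/datanucleus-core-3.2.10.jar",
            "hdfs://localhost:9000/livy/datanucleus-rdbms-3.2.9.jar",
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_livy_batch_payload(), indent=2))
```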
On a couple of occasions we have seen a dead job created in Livy, with this
error:
Exception in thread "main" java.io.IOException: Incomplete HDFS URI, no host:
hdfs:///livy/datanucleus-api-jdo-3.2.6.jar
This is puzzling because we have the host specified in the HDFS addresses in
sparkJob.properties, yet that value doesn't appear to be used when locating
the jars. Perhaps we are using the wrong parameters?
#spark required
sparkJob.file=hdfs://localhost:9000/griffin/griffin-measure.jar
sparkJob.className=org.apache.griffin.measure.Application
sparkJob.args_1=hdfs://localhost:9000/env/env.json
sparkJob.args_3=hdfs,raw
sparkJob.jars_1 = hdfs://localhost:9000/livy/datanucleus-api-jdo-3.2.6.jar
sparkJob.jars_2 = hdfs://localhost:9000/livy/datanucleus-core-3.2.10.jar
sparkJob.jars_3 = hdfs://localhost:9000/livy/datanucleus-rdbms-3.2.9.jar
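The "no host" symptom can be seen by parsing the two URI forms: `hdfs:///` has an empty authority (no host:port), and in that case Hadoop has to fall back to `fs.defaultFS`; if the process submitting the job (here, presumably Livy's Spark session, an assumption on our part) doesn't have an HDFS default filesystem configured, it raises exactly the IOException above. A quick sketch:

```python
from urllib.parse import urlparse

# The URI from the error message: empty authority, so no host is available
# and Hadoop must resolve fs.defaultFS instead.
incomplete = urlparse("hdfs:///livy/datanucleus-api-jdo-3.2.6.jar")

# The fully qualified form from sparkJob.properties: authority carries
# the host and port, so no fallback is needed.
complete = urlparse("hdfs://localhost:9000/livy/datanucleus-api-jdo-3.2.6.jar")

print(repr(incomplete.netloc))  # ''
print(repr(complete.netloc))    # 'localhost:9000'
```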
I've attached the full [^sparkJob.properties] file. We are using the default
ports for all Apache utilities except the Spark UI, which we switched from
8080 to 8089 so that 8080 remained available for Griffin.
Appreciate the help.
> Griffin - Livy error
> --------------------
>
> Key: GRIFFIN-189
> URL: https://issues.apache.org/jira/browse/GRIFFIN-189
> Project: Griffin (Incubating)
> Issue Type: Bug
> Affects Versions: 0.2.0-incubating
> Reporter: Cory Woytasik
> Priority: Major
> Labels: beginner, newbie, usability
> Attachments: sparkJob.properties, sparkJob.properties
>
>
> We are trying to get Griffin set up, and after creating measures and jobs
> and letting them run, we have noticed the results are not available via the
> DQ metrics link or the metric link from the job itself. When the job gets
> submitted, the following Spark context and error message are generated. We
> assume we must have a setting in one of the directories set incorrectly.
> Thoughts?
>
> INFO 20972 --- [ryBean_Worker-2] o.a.g.c.j.SparkSubmitJob : {
> "measure.type" : "griffin",
> "id" : 13,
> "name" : "LineageAccuracy",
> "owner" : "test",
> "description" : "AccuracyTest",
> "organization" : null,
> "deleted" : false,
> "timestamp" : 1535998320000,
> "dq.type" : "accuracy",
> "process.type" : "batch",
> "data.sources" : [ {
> "id" : 16,
> "name" : "source",
> "connectors" : [ {
> "id" : 17,
> "name" : "source1535741016027",
> "type" : "HIVE",
> "version" : "1.2",
> "predicates" : [ ],
> "data.unit" : "1day",
> "config" : {
> "database" : "default",
> "table.name" : "lineage"
> }
> } ]
> }, {
> "id" : 18,
> "name" : "target",
> "connectors" : [ {
> "id" : 19,
> "name" : "target1535741022277",
> "type" : "HIVE",
> "version" : "1.2",
> "predicates" : [ ],
> "data.unit" : "1day",
> "config" : {
> "database" : "default",
> "table.name" : "lineageload"
> }
> } ]
> } ],
> "evaluate.rule" : {
> "id" : 14,
> "rules" : [ {
> "id" : 15,
> "rule" : "source.asset=target.asset AND source.element=target.element
> AND source.elementtype=target.elementtype AND source.object=target.object AND
> source.objecttype=target.objecttype AND source.objectfield=target.objectfield
> AND source.sourceelement=target.sourceelement AND
> source.sourceobject=target.sourceobject AND
> source.sourcefield=target.sourcefield AND
> source.sourcefieldname=target.sourcefieldname AND
> source.transformationtext=target.transformationtext AND
> source.displayindicator=target.displayindicator",
> "name" : "accuracy",
> "dsl.type" : "griffin-dsl",
> "dq.type" : "accuracy"
> } ]
> },
> "measure.type" : "griffin"
> }
> {color:#FF0000}2018-09-04 13:12:00.752 ERROR 20972 --- [ryBean_Worker-2]
> o.a.g.c.j.SparkSubmitJob : Post to livy error. 500 Internal
> Server Error{color}
> [EL Fine]: sql: 2018-09-04
> 13:12:00.754--ClientSession(787879814)--Connection(1389579691)--UPDATE
> JOBINSTANCEBEAN SET predicate_job_deleted = ?, STATE = ? WHERE (ID = ?)
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)