[ https://issues.apache.org/jira/browse/GRIFFIN-189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16603881#comment-16603881 ]

Michael Kisly commented on GRIFFIN-189:
---------------------------------------

Hello, I've been working directly with Cory on this.
 # We are not using the docker container; we installed everything ourselves.
 # Yes, our Livy URL is simply [http://localhost:8998/batches]
 # Yes, we have been able to submit requests to Livy directly and they run correctly (see the sample batch body right below this list).
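
Just to illustrate what "submitting to Livy directly" means on our side, this is roughly the body we POST to [http://localhost:8998/batches] when testing by hand. The field names are the standard Livy batch API; the paths are simply the ones from our sparkJob.properties, and I've left out the second argument that Griffin fills in per job at submit time (as far as we can tell), so treat this as a sketch rather than exactly what Griffin sends:

{
  "file" : "hdfs://localhost:9000/griffin/griffin-measure.jar",
  "className" : "org.apache.griffin.measure.Application",
  "jars" : [ "hdfs://localhost:9000/livy/datanucleus-api-jdo-3.2.6.jar",
             "hdfs://localhost:9000/livy/datanucleus-core-3.2.10.jar",
             "hdfs://localhost:9000/livy/datanucleus-rdbms-3.2.9.jar" ],
  "args" : [ "hdfs://localhost:9000/env/env.json", "hdfs,raw" ]
}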

On a couple of occasions we have seen a dead job created in Livy, and the error it reports is:

Exception in thread "main" java.io.IOException: Incomplete HDFS URI, no host: hdfs:///livy/datanucleus-api-jdo-3.2.6.jar

This is puzzling because in sparkJob.properties we specify the host in every HDFS address, yet that value doesn't appear to be used when resolving the jars; perhaps we're using the wrong parameters? (See the core-site.xml note after the properties below.)

#spark required
sparkJob.file=hdfs://localhost:9000/griffin/griffin-measure.jar
sparkJob.className=org.apache.griffin.measure.Application
sparkJob.args_1=hdfs://localhost:9000/env/env.json
sparkJob.args_3=hdfs,raw
sparkJob.jars_1 = hdfs://localhost:9000/livy/datanucleus-api-jdo-3.2.6.jar
sparkJob.jars_2 = hdfs://localhost:9000/livy/datanucleus-core-3.2.10.jar
sparkJob.jars_3 = hdfs://localhost:9000/livy/datanucleus-rdbms-3.2.9.jar
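
One thing we've been sanity-checking on our side: the failing URI has no authority (hdfs:///livy/...), and as far as we understand such a path gets resolved against fs.defaultFS rather than against the host written in the properties above. So we've been double-checking that the Hadoop client config visible to Livy/Spark carries the host, roughly like this (localhost:9000 is just our setup):

<!-- core-site.xml on the node running Livy / Spark -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

If fs.defaultFS already carries the host, then presumably something other than the properties above is still handing Livy the host-less hdfs:///livy/... paths.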

I've attached the full [^sparkJob.properties] file. We are using the default ports for all Apache utilities, except for the Spark UI URL, which we switched from 8080 to 8089 so that 8080 was available for Griffin.
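
For reference, the port change is just the usual standalone-master setting, i.e. something along these lines in spark-env.sh (assuming the Spark standalone master, which is what we run):

# spark-env.sh: move the standalone master web UI off 8080 so Griffin can keep 8080
export SPARK_MASTER_WEBUI_PORT=8089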

Appreciate the help

> Griffin - Livy error
> --------------------
>
>                 Key: GRIFFIN-189
>                 URL: https://issues.apache.org/jira/browse/GRIFFIN-189
>             Project: Griffin (Incubating)
>          Issue Type: Bug
>    Affects Versions: 0.2.0-incubating
>            Reporter: Cory Woytasik
>            Priority: Major
>              Labels: beginner, newbie, usability
>         Attachments: sparkJob.properties, sparkJob.properties
>
>
> We are trying to get Griffin set up, and after creating measures and jobs and 
> letting them run, we have noticed the results are not available via the DQ 
> metrics link or the metric link from the job itself.  We have noticed that when 
> a job gets submitted, the following Spark context and error message are 
> generated.  We assume we must have a setting in one of the directories set 
> incorrectly.  Thoughts?
>  
> INFO 20972 --- [ryBean_Worker-2] o.a.g.c.j.SparkSubmitJob                 : {
>   "measure.type" : "griffin",
>   "id" : 13,
>   "name" : "LineageAccuracy",
>   "owner" : "test",
>   "description" : "AccuracyTest",
>   "organization" : null,
>   "deleted" : false,
>   "timestamp" : 1535998320000,
>   "dq.type" : "accuracy",
>   "process.type" : "batch",
>   "data.sources" : [ {
>     "id" : 16,
>     "name" : "source",
>     "connectors" : [ {
>       "id" : 17,
>       "name" : "source1535741016027",
>       "type" : "HIVE",
>       "version" : "1.2",
>       "predicates" : [ ],
>       "data.unit" : "1day",
>       "config" : {
>         "database" : "default",
>         "table.name" : "lineage"
>       }
>     } ]
>   }, {
>     "id" : 18,
>     "name" : "target",
>     "connectors" : [ {
>       "id" : 19,
>       "name" : "target1535741022277",
>       "type" : "HIVE",
>       "version" : "1.2",
>       "predicates" : [ ],
>       "data.unit" : "1day",
>       "config" : {
>         "database" : "default",
>         "table.name" : "lineageload"
>       }
>     } ]
>   } ],
>   "evaluate.rule" : {
>     "id" : 14,
>     "rules" : [ {
>       "id" : 15,
>       "rule" : "source.asset=target.asset AND source.element=target.element 
> AND source.elementtype=target.elementtype AND source.object=target.object AND 
> source.objecttype=target.objecttype AND source.objectfield=target.objectfield 
> AND source.sourceelement=target.sourceelement AND 
> source.sourceobject=target.sourceobject AND 
> source.sourcefield=target.sourcefield AND 
> source.sourcefieldname=target.sourcefieldname AND 
> source.transformationtext=target.transformationtext AND 
> source.displayindicator=target.displayindicator",
>       "name" : "accuracy",
>       "dsl.type" : "griffin-dsl",
>       "dq.type" : "accuracy"
>     } ]
>   },
>   "measure.type" : "griffin"
> }
> 2018-09-04 13:12:00.752 ERROR 20972 --- [ryBean_Worker-2] 
> o.a.g.c.j.SparkSubmitJob                 : Post to livy error. 500 Internal 
> Server Error
> [EL Fine]: sql: 2018-09-04 
> 13:12:00.754--ClientSession(787879814)--Connection(1389579691)--UPDATE 
> JOBINSTANCEBEAN SET predicate_job_deleted = ?, STATE = ? WHERE (ID = ?)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
