Yu LIU created GRIFFIN-316:
------------------------------
Summary: Spark runtime exception cannot be caught while running a
dq application
Key: GRIFFIN-316
URL: https://issues.apache.org/jira/browse/GRIFFIN-316
Project: Griffin
Issue Type: Bug
Affects Versions: 0.4.0, 0.5.0, 0.6.0
Reporter: Yu LIU
Fix For: 0.6.0
If an invalid rule is configured for a batch job (which happens quite often,
since rules are evaluated at runtime via Spark SQL), the exception thrown by
SparkSession is not caught and passed on to the user via a "Try" instance.
Instead, the job finishes as if it had succeeded and returns a "Success".
The reason is that "Try" is applied only at the outermost level, around the
Boolean result returned by DQApp.run, so an exception thrown deeper in the
call stack is not caught and never reaches that "Try".
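A minimal sketch of how this can happen, and of one possible way to surface the failure. It assumes (this is an illustration, not Griffin's actual code) that an inner step handles the exception itself while the outer method still returns a Boolean. All names here (TryPropagationSketch, evaluateRule, runSwallowing, runPropagating) are hypothetical, and evaluateRule only stands in for the Spark SQL evaluation.
{noformat}
import scala.util.{Failure, Success, Try}

object TryPropagationSketch {
  // Hypothetical stand-in for rule evaluation via Spark SQL:
  // it throws for the invalid rule "xxx" used in the config below.
  def evaluateRule(rule: String): Unit =
    if (rule == "xxx") throw new IllegalArgumentException(s"cannot parse rule: $rule")

  // Assumed problematic shape: the inner step handles the exception itself,
  // so the method still returns a Boolean and the failure is lost.
  def runSwallowing(rule: String): Boolean = {
    try evaluateRule(rule)
    catch { case e: Exception => println(s"step failed: ${e.getMessage}") }
    true
  }

  // Possible alternative: wrap the failing call itself in Try,
  // so the exception travels up to the caller as a Failure.
  def runPropagating(rule: String): Try[Boolean] =
    Try(evaluateRule(rule)).map(_ => true)

  def main(args: Array[String]): Unit = {
    // The outermost Try only sees a Boolean, so this is a Success.
    println(Try(runSwallowing("xxx")))
    // Here the exception is carried by the Try, so this is a Failure.
    println(runPropagating("xxx"))
  }
}
{noformat}
With the first shape, wrapping the call in "Try" at the top level cannot help, because by then there is no exception left to catch. With the second, callers can pattern match on Success and Failure, or chain further steps with flatMap.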
Here is an example config file to reproduce the issue:
{noformat}
{
  "name": "prof_batch",
  "process.type": "batch",
  "timestamp": 123456,
  "data.sources": [
    {
      "name": "source",
      "connectors": [
        {
          "type": "avro",
          "version": "1.7",
          "dataframe.name": "this_table",
          "config": {
            "file.name": "src/test/resources/users_info_src.avro"
          },
          "pre.proc": [
            {
              "dsl.type": "spark-sql",
              "rule": "select * from this_table where user_id < 10014"
            }
          ]
        }
      ]
    }
  ],
  "evaluate.rule": {
    "rules": [
      {
        "dsl.type": "griffin-dsl",
        "dq.type": "profiling",
        "out.dataframe.name": "prof",
        "rule": "xxx",
        "out": [
          {
            "type": "metric",
            "name": "prof",
            "flatten": "array"
          }
        ]
      }
    ]
  },
  "sinks": ["CONSOLE"]
}
{noformat}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)