Yu LIU created GRIFFIN-316:
------------------------------

             Summary: Spark runtime exception cannot be caught while running a 
dq application
                 Key: GRIFFIN-316
                 URL: https://issues.apache.org/jira/browse/GRIFFIN-316
             Project: Griffin
          Issue Type: Bug
    Affects Versions: 0.4.0, 0.5.0, 0.6.0
            Reporter: Yu LIU
             Fix For: 0.6.0


If we put an invalid rule in a batch job (which happens quite often, given that 
rules are evaluated at runtime via Spark SQL), the exception thrown by 
SparkSession is not caught and propagated to the user through the "Try" 
instance; instead, the job succeeds with a "Success" returned.

The reason is that we only wrap the returned Boolean result by applying 
"Try" at the outermost level, around DQApp.run, so exceptions thrown deeper 
in the call stack are not surfaced through it.
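A minimal Scala sketch of the pattern described above (the object and method names here are hypothetical, not Griffin's actual API): if an inner step catches and merely logs a runtime failure before returning a Boolean, a "Try" wrapped only around the outermost call sees no exception and reports Success.

```scala
import scala.util.{Try, Success, Failure}

object TryWrappingSketch {
  // Stands in for a rule step that fails at runtime
  // (e.g. an invalid spark-sql rule) but swallows the error itself.
  def executeRule(rule: String): Boolean = {
    try {
      if (rule == "xxx") throw new RuntimeException("cannot resolve 'xxx'")
      true
    } catch {
      case e: Exception =>
        // Only logged, never re-thrown: the caller cannot observe the failure.
        println(s"rule failed: ${e.getMessage}")
        false
    }
  }

  // Stands in for DQApp.run returning a plain Boolean.
  def run(rule: String): Boolean = executeRule(rule)

  def main(args: Array[String]): Unit = {
    // Wrapping only the outermost call yields Success(false),
    // not Failure(...), so the job appears to have succeeded.
    val result: Try[Boolean] = Try(run("xxx"))
    println(result)
  }
}
```

Under this pattern, `Try(run("xxx"))` evaluates to `Success(false)` rather than a `Failure`, which matches the observed behavior: the job exits as if it succeeded even though the rule could not be executed.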

 

Here is an example config file to reproduce the issue:
{noformat}
{
  "name": "prof_batch",

  "process.type": "batch",

  "timestamp": 123456,

  "data.sources": [
    {
      "name": "source",
      "connectors": [
        {
          "type": "avro",
          "version": "1.7",
          "dataframe.name" : "this_table",
          "config": {
            "file.name": "src/test/resources/users_info_src.avro"
          },
          "pre.proc": [
            {
              "dsl.type": "spark-sql",
              "rule": "select * from this_table where user_id < 10014"
            }
          ]
        }
      ]
    }
  ],

  "evaluate.rule": {
    "rules": [
      {
        "dsl.type": "griffin-dsl",
        "dq.type": "profiling",
        "out.dataframe.name": "prof",
        "rule": "xxx",
        "out":[
          {
            "type": "metric",
            "name": "prof",
            "flatten": "array"
          }
        ]
      }
    ]
  },

  "sinks": ["CONSOLE"]
}{noformat}
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
