Hi Lionel,
I created a custom config.json and defined a custom rule (dsl.type is
"spark-sql"), then submitted it through spark-submit. The job runs fine
without any issues. On HDFS, I can see the directory with the custom rule
name, but I am unable to find the _METRIC file where the results should be
persisted; I only see a _START file. What am I missing here?
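For reference, I submit the job roughly like the following (the main class,
jar name, and file paths here are representative of a standard Griffin
measure setup rather than my exact values, and the class name can differ by
Griffin version):

# assumed Griffin batch entry point; env.json and config.json are the Griffin env and DQ configs
spark-submit --class org.apache.griffin.measure.Application \
  --master yarn --deploy-mode client \
  griffin-measure.jar env.json config.json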
HDFS -> ../persist/CheckAlphaNumeric/1527143022495/_START
Config.json ->
{
  "name": "CheckAlphaNumeric",
  "process.type": "batch",
  "data.sources": [
    {
      "name": "src",
      "connectors": [
        {
          "type": "hive",
          "version": "1.2",
          "config": {
            "database": "griffined",
            "table.name": "check_table"
          }
        }
      ]
    }
  ],
  "evaluateRule": {
    "rules": [
      {
        "dsl.type": "spark-sql",
        "dq.type": "profiling",
        "name": "checkalphnumeric",
        "rule": "SELECT count(name) FROM src WHERE name REGEXP '^[a-zA-Z0-9]+$'",
        "metric": {
          "name": "check_rules"
        }
      }
    ]
  }
}
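In case it is relevant, the rule's SQL itself can be checked on its own
against the Hive table (an illustrative one-off check using the database and
table from the config above, with "src" replaced by the real table name):

# one-off check of the rule query outside Griffin, via the Spark SQL CLI
spark-sql -e "SELECT count(name) FROM griffined.check_table WHERE name REGEXP '^[a-zA-Z0-9]+$'"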