Github user ashutakGG commented on a diff in the pull request:
https://github.com/apache/incubator-griffin/pull/434#discussion_r224424475
--- Diff:
measure/src/main/scala/org/apache/griffin/measure/step/builder/dsl/transform/AccuracyExpr2DQSteps.scala
---
@@ -125,14 +126,26 @@ case class AccuracyExpr2DQSteps(context: DQContext,
// 4. accuracy metric
val accuracyTableName = ruleParam.getOutDfName()
val matchedColName = details.getStringOrKey(_matched)
+ val matchedFractionColName = details.getStringOrKey(_matchedFraction)
val accuracyMetricSql = procType match {
case BatchProcessType =>
s"""
- |SELECT `${totalCountTableName}`.`${totalColName}` AS `${totalColName}`,
- |coalesce(`${missCountTableName}`.`${missColName}`, 0) AS `${missColName}`,
- |(`${totalCountTableName}`.`${totalColName}` - coalesce(`${missCountTableName}`.`${missColName}`, 0)) AS `${matchedColName}`
- |FROM `${totalCountTableName}` LEFT JOIN `${missCountTableName}`
- """.stripMargin
+ SELECT `${totalColName}`,
+ `${missColName}`,
+ `${matchedColName}`,
+ coalesce(`${matchedColName}` / `${totalColName}`, 1.0) AS `${matchedFractionColName}`
+
+ FROM (
+ SELECT `${totalColName}`,
--- End diff --
Good point. For a second I even thought you were right (and that this was a
potential bug). But the following query works as expected:
`spark.sql("SELECT age FROM (SELECT count(age) as age FROM person)").show()`
It shows 2 as the result.
So it looks like the outer select does not have access to the tables used in
the inner select.
Still, I will probably use this approach to improve readability.
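
For anyone who wants to reproduce the check above, here is a minimal sketch. The `person` rows, the object name `SubqueryScopeDemo`, and the helper `outerAge` are made up for illustration and are not part of the PR; the only assumption about the data is that there are two rows with a non-null `age`.

```scala
import org.apache.spark.sql.SparkSession

object SubqueryScopeDemo {
  // Runs the query from the comment and returns the single value
  // produced by the outer SELECT.
  def outerAge(spark: SparkSession): Long = {
    import spark.implicits._

    // Hypothetical sample data: two people, both with a non-null age.
    Seq(("alice", 30), ("bob", 25))
      .toDF("name", "age")
      .createOrReplaceTempView("person")

    // The outer `age` resolves to the subquery's output column (the count),
    // not to `person`.`age`, since `person` is only visible inside the subquery.
    spark
      .sql("SELECT age FROM (SELECT count(age) AS age FROM person)")
      .head()
      .getLong(0)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("subquery-scope-demo")
      .getOrCreate()
    try println(outerAge(spark)) // prints 2 for the two sample rows
    finally spark.stop()
  }
}
```

With this sample data the outer select returns the count (2) rather than either of the raw ages, which matches the behaviour described above.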
---