[jira] [Updated] (GRIFFIN-333) JDBC Connector: Ability to Use "group by" clause

Obaidul Karim (Jira) Sat, 11 Jul 2020 20:37:08 -0700


     [ https://issues.apache.org/jira/browse/GRIFFIN-333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Obaidul Karim updated GRIFFIN-333:
----------------------------------
    Description: 
Refer to [https://issues.apache.org/jira/projects/GRIFFIN/issues/GRIFFIN-332].

If we have the ability to select specific columns, it opens the door to 
SQL-based aggregation, further reducing the volume of data read from JDBC 
sources.

So, I propose a feature that allows the JDBC connector to perform SQL-based 
aggregations using a `groupby` clause.

 

Let's say we have source and target tables that have data like below.

src:
{code:java}
------------------------
|employee_id   |country|
------------------------
|1             | NZ    |
|2             | DE    |
|3             | DE    |
|4             | NZ    |
|5             | DE    |
....
....
------------------------
{code}
tgt:
{code:java}
------------------------
|total_employee|country|
------------------------
|10            | NZ    |
|11            | DE    |
------------------------
{code}
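
As a rough illustration of the aggregation the source side would need, the sketch below loads only the five src rows that are visible above into an in-memory SQLite database and groups them by country. SQLite and the table name `src` are assumptions made purely so the example is runnable; the real source would be whatever JDBC database is configured. The counts it produces (3 and 2) differ from the tgt table's 10 and 11, which presumably include the elided rows.

```python
import sqlite3

# In-memory database standing in for the JDBC source (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (employee_id INTEGER, country TEXT)")

# Only the five rows visible in the src table above; the elided rows are not guessed.
visible_rows = [(1, "NZ"), (2, "DE"), (3, "DE"), (4, "NZ"), (5, "DE")]
conn.executemany("INSERT INTO src VALUES (?, ?)", visible_rows)

# The same shape of query that "columns" plus "groupby" would express.
aggregated = conn.execute(
    "SELECT count(*) AS total_employee, country "
    "FROM src GROUP BY country ORDER BY country"
).fetchall()

for total_employee, country in aggregated:
    print(total_employee, country)
```

The point of pushing the grouping into the database is that only one row per country crosses the JDBC connection, instead of one row per employee.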
Then we can perform an `accuracy` check directly, as shown below, using the 
`columns` and `groupby` clauses for the source table:
{code:java}
{
   "name":"src",
   "connector":{
      "type":"jdbc",
      "config":{
         "database":"mydatabase",
         "tablename":"mytable",
         "columns":"count(*) total_employee, country",
         "groupby":"country",
         "url":"jdbc:sqlserver://myhost:1433;databaseName=mydatabase",
         "user":"user",
         "password":"password",
         "driver":"com.microsoft.sqlserver.jdbc.SQLServerDriver",
         "where":""
      }
   }
}
{code}
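
One way to picture how such a config could translate into a query is the minimal sketch below. It is not Griffin's implementation: the function name `build_query`, the fallback to `*` when `columns` is empty, and the `database.tablename` qualification are all assumptions made for illustration. Only the config keys themselves come from the example above.

```python
def build_query(config: dict) -> str:
    """Assemble a SELECT statement from connector-style config keys.

    Hypothetical helper, not part of Griffin. Empty or missing "columns",
    "where" and "groupby" values are treated as "not set".
    """
    columns = config.get("columns", "").strip() or "*"
    # Assumption: the table is addressed as <database>.<tablename>.
    table = f'{config["database"]}.{config["tablename"]}'
    query = f"SELECT {columns} FROM {table}"

    where = config.get("where", "").strip()
    if where:
        query += f" WHERE {where}"

    groupby = config.get("groupby", "").strip()
    if groupby:
        # GROUP BY must come after WHERE in SQL, so it is appended last.
        query += f" GROUP BY {groupby}"
    return query


config = {
    "database": "mydatabase",
    "tablename": "mytable",
    "columns": "count(*) total_employee, country",
    "groupby": "country",
    "where": "",
}
print(build_query(config))
```

Because the clause values are concatenated into the statement as-is, a sketch like this is only suitable for trusted configuration files, not for user-supplied input.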

  was:
Refer to https://issues.apache.org/jira/projects/GRIFFIN/issues/GRIFFIN-332.

If we have the ability to select specific columns, it will open the door to use 
sql base aggregation, further reducing the volume of data from JDBC sources.

So, I propose the feature to allow JDBC connector to able to use sql based 
aggregations using clause `groupby`

------------------------
|employee_id |country|
------------------------
|1 | NZ |
|2 | DE |
|3 | DE |
|4 | NZ |
|5 | DE |
....
....
------------------------

Let's say we have source and target tables that have data like below.

src:
{code:java}
------------------------
|employee_id   |country|
------------------------
|1             | NZ    |
|2             | DE    |
|3             | DE    |
|4             | NZ    |
|5             | DE    |
....
....
------------------------
{code}
tgt:
{code:java}
------------------------
|total_employee|country|
------------------------
|10            | NZ    |
|11            | DE    |
------------------------
{code}
Then we can perform `accuracy` check directly  like below using `columns` and 
`groupby` clauses for source table:
{code:java}
{   "name":"src",
   "connector":{      "type":"jdbc",
      "config":{         "database":"mydatabase",
         "tablename":"mytable",
         "columns":"count(*) total_employee, country",
         "groupby":"country",
         "url":"jdbc:sqlserver://myhost:1433;databaseName=mydatabase",
         "user":"user",
         "password":"password",
         "driver":"com.microsoft.sqlserver.jdbc.SQLServerDriver",
         "where":""
      }
   }
}
{code}


> JDBC Connector: Ability to Use "group by" clause
> ------------------------------------------------
>
>                 Key: GRIFFIN-333
>                 URL: https://issues.apache.org/jira/browse/GRIFFIN-333
>             Project: Griffin
>          Issue Type: Improvement
>          Components: accuracy-batch
>    Affects Versions: 0.6.0
>            Reporter: Obaidul Karim
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
