[
https://issues.apache.org/jira/browse/GRIFFIN-332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172322#comment-17172322
]
ishan verma commented on GRIFFIN-332:
-------------------------------------
Hi [~obaid],
I am currently working on a data quality POC using Griffin.
So far everything works fine with Hive as the data source, but there is a new
requirement to add MySQL as a source.
I have tried every way I could find to use *mysql* as a custom data connector,
but it is not working. The measure gets created and the job finishes
successfully, but Griffin shows *NO CONTENT* on the UI. Below is my
configuration:
"data.sources": [
  {
    "name": "source",
    "connectors": [
      {
        "name": "source1595488803031",
        "type": "CUSTOM",
        "data.unit": "1day",
        "data.time.zone": "",
        "config": {
          "class": "org.apache.griffin.measure.datasource.connector.batch.MySqlDataConnector",
          "database": "griffin_poc",
          "tablename": "person_src",
          "url": "jdbc:mysql://griffin:3306/griffin_poc",
          "user": "test_u",
          "password": "test_p",
          "driver": "com.mysql.jdbc.Driver"
        }
      }
Can you please suggest how to use *mysql/jdbc* as my data source? It is
critical for my POC and the issue is urgent.
Is there anything I am missing here to link with MySQL? Please guide me
through this. It would be great if you could provide a sample.
I have also tried the SQL and JDBC connector classes from the latest 0.6
commit, but the UI still shows no content.
I have also set up MySQL on an EC2 instance using the Griffin Docker image.
[~obaid], I would appreciate your input on this if you have done a similar
setup. Any leads will be appreciated.
Thanks :)
> JDBC Connector: Ability to Select Specific Columns Instead of All the Columns
> -----------------------------------------------------------------------------
>
> Key: GRIFFIN-332
> URL: https://issues.apache.org/jira/browse/GRIFFIN-332
> Project: Griffin
> Issue Type: Improvement
> Components: accuracy-batch
> Affects Versions: 0.6.0
> Reporter: Obaidul Karim
> Priority: Major
> Labels: columns, jdbc
>
> *Background:*
> Thanks to https://issues.apache.org/jira/browse/GRIFFIN-315, we already have
> a JDBC connector.
> However, it currently pulls all the columns using `"SELECT * FROM
> $fullTableName"`.
> This causes some issues for larger JDBC tables:
> - memory overhead for the Spark data frame
> - longer execution time
> - resource overhead for the RDBMS
> *Proposed Improvement:*
> So, I propose a feature that allows the JDBC connector to select only the
> required columns.
> *Example:*
> We have a rule `"rule":"src.id = tgt.id and src.country = tgt.country "`.
> Then we only need two columns, `id` and `country`.
> So, in the connector we can add an additional `columns` clause to select only
> the required columns, like below:
>
> {code:java}
> {
>   "name":"src",
>   "connector":{
>     "type":"jdbc",
>     "config":{
>       "database":"mydatabase",
>       "tablename":"mytable",
>       "columns":"id, country",
>       "url":"jdbc:sqlserver://myhost:1433;databaseName=mydatabase",
>       "user":"user",
>       "password":"password",
>       "driver":"com.microsoft.sqlserver.jdbc.SQLServerDriver",
>       "where":""
>     }
>   }
> }
> {code}
> We can implement it like this: if there is a `columns` clause, use it;
> otherwise use `*` as the default.
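
The fallback proposed in the issue description can be sketched roughly as follows. This is only an illustration in Python, not the Griffin connector code: `build_select_query` is a hypothetical helper, and forming the full table name as `database.tablename` is an assumption made for the example.

```python
from typing import Optional


def build_select_query(full_table_name: str,
                       columns: Optional[str] = None,
                       where: Optional[str] = None) -> str:
    """Build the SELECT statement for a JDBC source (hypothetical helper)."""
    # Fall back to "*" when `columns` is missing or blank.
    column_list = columns.strip() if columns and columns.strip() else "*"
    query = f"SELECT {column_list} FROM {full_table_name}"
    # Append the filter only when a non-empty `where` clause is configured.
    if where and where.strip():
        query += f" WHERE {where.strip()}"
    return query


# Config values taken from the example in the issue description.
config = {"database": "mydatabase", "tablename": "mytable",
          "columns": "id, country", "where": ""}
# Assumption: the full table name is "<database>.<tablename>".
full_table_name = f"{config['database']}.{config['tablename']}"
print(build_select_query(full_table_name,
                         config.get("columns"),
                         config.get("where")))
```

With the `columns` key removed from the config, the same call falls back to `SELECT * FROM mydatabase.mytable`. The column list is inserted verbatim here, so a real implementation would likely want to validate or quote the column names.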