[
https://issues.apache.org/jira/browse/HIVE-24938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Naveen Gangam reassigned HIVE-24938:
------------------------------------
> [Evaluate] Dataconnector URL validation on create
> -------------------------------------------------
>
> Key: HIVE-24938
> URL: https://issues.apache.org/jira/browse/HIVE-24938
> Project: Hive
> Issue Type: Sub-task
> Reporter: Naveen Gangam
> Assignee: Naveen Gangam
> Priority: Major
>
> From the review feedback, there was a comment about validating URL specified
> in the connector URL when it is created. Currently, there is no validation
> except for checking for empty/null value. This is by-design and the desired
> behavior, IMHO. But filing this to be discussed with wider audience.
> {noformat}
> I tried creating a connector without the mysql JDBC URL specified properly
> and it went through,
> please see below,
> CREATE CONNECTOR mysql_test_2
> TYPE 'mysql'
> URL 'jdbc://'
> COMMENT 'test connector'
> WITH DCPROPERTIES (
> "hive.sql.dbcp.username"="hive1",
> "hive.sql.dbcp.password"="hive1");
> CREATE CONNECTOR mysql_test_3
> TYPE 'mysql'
> URL 'jdbc:derby://nightly1.apache.org:3306/hive1'
> COMMENT 'test connector'
> WITH DCPROPERTIES (
> "hive.sql.dbcp.username"="hive1",
> "hive.sql.dbcp.password"="hive1");
> I am not saying they are wrong, but we should probably call this out in the
> documentation. Document that URLs are not verified.
> Another thing I noticed is that the password is displayed in plain
> text on the command line. This used be considered a security problem
> in a product I worked in a past life. But I notice that an external
> table can be created with this semantics. I guess it is acceptable
> here.
> It is also stored in plain text in the metastore, please see below,
> CREATE TABLE DATACONNECTOR_PARAMS (
> NAME VARCHAR(128) NOT NULL,
> PARAM_KEY VARCHAR(180) NOT NULL,
> PARAM_VALUE VARCHAR(4000),
> PRIMARY KEY (NAME, PARAM_KEY),
> CONSTRAINT DATACONNECTOR_NAME_FK1 FOREIGN KEY (NAME) REFERENCES
> DATACONNECTORS (NAME) ON DELETE CASCADE
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
> Again I am not saying this is a problem, but I thought I can call this out to
> you.
>
> @nrg4878 nrg4878 24 minutes ago Author Member
> We check for null/empty values for URL. We error out in those cases. Other
> than that, any non-empty value is accepted. I dont think we should check for
> correctness of the URL or even can for that matter.
> a) The URL is meant to be a freeform value against dozens of datasource types
> (mysql, postgres, hive, AWS Glue, Redshift etc). For each such source type,
> there could be dozens of variations of the url (includes properties and other
> params specific to the source). So I dont think we can meaningfully detect
> incorrect URLs.
> For example, MySQL though the URL might look fine syntactically, we cannot
> confirm dbName1 or dbName2 exist without actually attempting to connect to
> the DB.
> jdbc:mysql://:3306/
> jdbc:mysql://:3306/
> b) The format for the URLs could be changing overtime as well. It is
> unnecessary burden for maintaining new formats in hive. We want to be able to
> plugin a new datasource type by simply adding a provider.
> c) To be able to validate the URL, we have to establish the connection to the
> datasource at the time of creation. We are trying to delay making that
> connection as long as possible. When actual show tables is called. We avoid
> using up extra resources and leak connections.
> d) Users can do "create connector" .. followed by "alter connector set url".
> So any incorrect URLS can be modified using alter. Also in this case, we
> would be checking the URL twice. Better to have the onus of configuring it
> correctly on the end user.
> Passwords can be secured using jceks files as described in the "Securing
> Password" section of the doc below.
> https://cwiki.apache.org/confluence/display/Hive/JDBC+Storage+Handler
> So users have an option of using non-CTVs
> {noformat}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)