[
https://issues.apache.org/jira/browse/SPARK-21519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Luca Canali closed SPARK-21519.
-------------------------------
> Add an option to the JDBC data source to initialize the environment of the
> remote database session
> --------------------------------------------------------------------------------------------------
>
> Key: SPARK-21519
> URL: https://issues.apache.org/jira/browse/SPARK-21519
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 2.1.0, 2.1.1, 2.2.0
> Reporter: Luca Canali
> Assignee: Luca Canali
> Priority: Minor
> Fix For: 2.3.0
>
>
> This proposes an option to the JDBC datasource, tentatively called
> "sessionInitStatement" to implement the functionality of session
> initialization present for example in the Sqoop connector for Oracle (see
> https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_oraoop_oracle_session_initialization_statements)
> . After each database session is opened to the remote DB, and before
> starting to read data, this option executes a custom SQL statement (or a
> PL/SQL block in the case of Oracle).
> Example of usage, relevant to Oracle JDBC:
> {code}
> val preambleSQL="""
> begin
> execute immediate 'alter session set tracefile_identifier=sparkora';
> execute immediate 'alter session set "_serial_direct_read"=true';
> execute immediate 'alter session set time_zone=''+02:00''';
> end;
> """
> bin/spark-shell --jars ojdb6.jar
> val df = spark.read
> .format("jdbc")
> .option("url",
> "jdbc:oracle:thin:@ORACLEDBSERVER:1521/service_name")
> .option("driver", "oracle.jdbc.driver.OracleDriver")
> .option("dbtable", "(select 1, sysdate, systimestamp,
> current_timestamp, localtimestamp from dual)")
> .option("user", "MYUSER")
> .option("password", "MYPASSWORD").option("fetchsize",1000)
> .option("sessionInitStatement", preambleSQL)
> .load()
> df.show(5,false)
> {code}
> *Comments:* This proposal has been developed and tested for connecting the
> Spark JDBC data source to Oracle databases, however I believe it can be
> useful for other target DBs too, as it is quite generic.
> The code executed by the option "sessionInitStatement" is just the
> user-provided string fed through the execute method of the JDBC connection,
> so it can use the features of the target database language/syntax. When using
> sessionInitStatement for querying Oracle, for example, the user-provided
> command can be a SQL statement or a PL/SQL block grouping multiple commands
> and logic.
> Note the proposed code allows to inject SQL into the target database. This is
> not a security concern as such, as it requires password authentication,
> however beware of the possibilities of injecting user-provided SQL (and
> PL/SQL) that this opens.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]