[ https://issues.apache.org/jira/browse/SPARK-21519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Luca Canali updated SPARK-21519:
--------------------------------

Description:

This proposes an option to the JDBC data source, tentatively called "sessionInitStatement", to implement the session-initialization functionality present, for example, in the Sqoop connector for Oracle (see https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_oraoop_oracle_session_initialization_statements). After each database session is opened to the remote DB, and before starting to read data, this option executes a custom SQL statement (or a PL/SQL block in the case of Oracle).

Example of usage, relevant to Oracle JDBC. First start the shell with the Oracle JDBC driver on the classpath:

{code}
bin/spark-shell --jars ojdbc6.jar
{code}

Then, in the shell:

{code}
val preambleSQL = """
begin
  execute immediate 'alter session set tracefile_identifier=sparkora';
  execute immediate 'alter session set "_serial_direct_read"=true';
  execute immediate 'alter session set time_zone=''+02:00''';
end;
"""

val df = spark.read.format("jdbc")
  .option("url", "jdbc:oracle:thin:@ORACLEDBSERVER:1521/service_name")
  .option("driver", "oracle.jdbc.driver.OracleDriver")
  .option("dbtable", "(select 1, sysdate, systimestamp, current_timestamp, localtimestamp from dual)")
  .option("user", "MYUSER")
  .option("password", "MYPASSWORD")
  .option("fetchsize", 1000)
  .option("sessionInitStatement", preambleSQL)
  .load()

df.show(5, false)
{code}

Comments: This proposal has been developed and tested for connecting the Spark JDBC data source to Oracle databases; however, I believe it can be useful for other target DBs too, as it is quite generic.

Note that the proposed code allows injecting SQL into the target database. This is not a security concern as such, since it requires password authentication; however, beware of the possibilities this opens for injecting user-provided SQL (and PL/SQL).
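The ordering guarantee described above (init statement runs once per session, before the first data-reading query) can be sketched in plain Scala. This is a hedged illustration of the proposed behaviour, not Spark's actual internals; `runSessionInit` and its parameters are hypothetical names introduced only for this sketch:

```scala
// Sketch of the sessionInitStatement contract: if an init statement is
// configured, execute it on the freshly opened session first, and only
// then run the data-reading query. The executor and reader are injected
// so the ordering is visible without a real database connection.
def runSessionInit[A](executeSql: String => Unit,
                      sessionInitStatement: Option[String],
                      readData: () => Seq[A]): Seq[A] = {
  // Init statement runs at most once, before any data is fetched.
  sessionInitStatement.foreach(executeSql)
  readData()
}
```

In the real data source the `executeSql` role would be played by a JDBC `Statement.execute` call on the session's connection, and `readData` by the partition's SELECT.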
> Add an option to the JDBC data source to initialize the environment of the
> remote database session
> --------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-21519
>                 URL: https://issues.apache.org/jira/browse/SPARK-21519
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.1.0, 2.1.1, 2.2.0
>            Reporter: Luca Canali
>            Assignee: Luca Canali
>            Priority: Minor
>             Fix For: 2.3.0
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)