maropu commented on a change in pull request #28953:
URL: https://github.com/apache/spark/pull/28953#discussion_r451339072



##########
File path: docs/sql-data-sources-jdbc.md
##########
@@ -156,6 +156,20 @@ the following case-insensitive options:
      </td>
   </tr>
 
+  <tr>
+     <td><code>preActions</code></td>
+     <td>
+       Custom queries which you want to run before reading data from JDBC or 
writing data to JDBC. Only DDL or DML (insert/update/delete) are allowed. It is 
called per DataFrame, not per session. You can specify multiple queries 
separated by semicolon. When exceptions occur in preActions, the queries in 
preActions will be rollbacked. 

Review comment:
       >> It is called per DataFrame, not per session.
   
   I think this statement is ambiguous. What does `per DataFrame` mean? 
(`per session` presumably refers to `sessionInitStatement`.) In read paths, 
`postAction` seems to be executed during the resolution phase, so, for example:
   
   ```
   $ ./bin/spark-shell
   scala> :paste
   sql("""
   CREATE TABLE jdbcTable
   USING org.apache.spark.sql.jdbc
   OPTIONS (
     driver "org.postgresql.Driver",
     url "jdbc:postgresql:postgres",
     dbtable "t1",
     user 'maropu',
     password ''
   )
   """)
   <-- `postAction` executed
   :quit
   
   // Re-launch spark-shell
   $ ./bin/spark-shell
   scala> sql("select * from jdbcTable").show()
   <-- `postAction` executed
   scala> sql("select * from jdbcTable").show()
   <-- `postAction` not executed
   ```
   Is this the expected behaviour? Also, in write paths, are 
`preAction`/`postAction` executed per session? I'm afraid that users will be 
confused by the current execution timing.
   
   One more question: is there any use case that `preActions` can handle but 
`sessionInitStatement` cannot?
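   
   For reference, a minimal sketch of how the existing `sessionInitStatement` 
option is set on the read path, so the two options can be compared. It assumes 
the same PostgreSQL table `t1` as above; the `SET search_path` statement is 
only an illustrative placeholder and is not taken from this PR:
   
   ```scala
   // Run inside spark-shell, where `spark` is already defined.
   val df = spark.read
     .format("jdbc")
     .option("driver", "org.postgresql.Driver")
     .option("url", "jdbc:postgresql:postgres")
     .option("dbtable", "t1")
     .option("user", "maropu")
     .option("password", "")
     // Per the Spark JDBC docs, this runs after each database session is
     // opened and before data is read. The statement is an assumed example.
     .option("sessionInitStatement", "SET search_path TO public")
     .load()

   df.show()
   ```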



