Re: [PR] Always use the executemany method when inserting rows in DbApiHook as it's way much faster [airflow]

via GitHub Thu, 11 Apr 2024 03:45:42 -0700


potiuk commented on code in PR #38715:
URL: https://github.com/apache/airflow/pull/38715#discussion_r1560811973



##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -550,47 +557,47 @@ def insert_rows(
         :param commit_every: The maximum number of rows to insert in one
             transaction. Set to 0 to insert all rows in one transaction.
         :param replace: Whether to replace instead of insert
-        :param executemany: Insert all rows at once in chunks defined by the 
commit_every parameter, only
+        :param executemany: Inserts all rows at once in chunks defined by the 
commit_every parameter, only
             works if all rows have same number of column names but leads to 
better performance
         """
-        i = 0
-        with closing(self.get_conn()) as conn:
-            if self.supports_autocommit:
-                self.set_autocommit(conn, False)
+        if executemany:
+            warnings.warn(
+                "executemany parameter is deprecated, override 
supports_executemany instead.",
+                AirflowProviderDeprecationWarning,
+                stacklevel=2,
+            )
 
+        with self._closing_supporting_autocommit() as conn:
             conn.commit()

Review Comment:
   Added in 173th change in the repo in 2015 and nobody asked that question for 
9 years https://github.com/apache/airflow/pull/173/files 
   
   My guess is to avoid side effects where there is an open transaction and 
some changes done locally already. 
   
   I can imagine that it was:
   
   * If there are no local changes - commit does nothing
   * If there are local changes, better to commit them before and not in a 
single transaction, someone forgot to do it, inserting rows marks end of 
transaction anyway - this potentially avoids side effects of too big 
transaction or even possibly some conflicting changes 
   
   In order to avoid any potential behavioral changes - better to leave it I 
guess. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] Always use the executemany method when inserting rows in DbApiHook as it's way much faster [airflow]

Reply via email to