MaksYermak commented on code in PR #40916:
URL: https://github.com/apache/airflow/pull/40916#discussion_r1689578355


##########
airflow/dag_processing/processor.py:
##########
@@ -837,13 +877,17 @@ def process_file(
         :return: number of dags found, count of import errors, last number of 
db queries
         """
         self.log.info("Processing file %s for tasks to queue", file_path)
+        try:
+            if InternalApiConfig.get_use_internal_api():
+                dagbag = DagFileProcessor._get_dagbag(file_path)
+            else:
+                with create_session() as session:

Review Comment:
   @potiuk thank for your investigation! For me your message makes sense. You 
are right the original idea was to count all queries during DAG file 
processing. And maybe now I understand why with different event handlers the 
number of queries are different. I have mentioned it here: 
https://github.com/apache/airflow/pull/40323#discussion_r1650979649
   
   I have made investigations and recorded queries for event handlers with 
`session` object and `engine` object. Here's the result:
   ```
   Recorded query locations session:
        processor.py:process_file:883 > processor.py:save_dag_to_db:919 > 
dagbag.py:_sync_to_db:707 > dagbag.py:_serialize_dag_capturing_errors:672 > 
serialized_dag.py:write_dag:158:  5
        processor.py:process_file:883 > processor.py:save_dag_to_db:919 > 
dagbag.py:_sync_to_db:707 > dagbag.py:_serialize_dag_capturing_errors:672 > 
serialized_dag.py:write_dag:168:  5
        processor.py:process_file:883 > processor.py:save_dag_to_db:919 > 
dagbag.py:_sync_to_db:707 > dagbag.py:_serialize_dag_capturing_errors:672 > 
serialized_dag.py:write_dag:181:  5
        dagbag.py:_sync_perm_for_dag:735 > override.py:sync_perm_for_dag:1101 > 
override.py:create_permission:1836 > override.py:get_permission:1807 > 
override.py:get_action:1676:     15
        dagbag.py:_sync_perm_for_dag:735 > override.py:sync_perm_for_dag:1101 > 
override.py:create_permission:1836 > override.py:get_permission:1808 > 
override.py:get_resource:1737:   15
        dagbag.py:_sync_perm_for_dag:735 > override.py:sync_perm_for_dag:1101 > 
override.py:create_permission:1839 > override.py:create_resource:1745 > 
override.py:get_resource:1737:  15
        dagbag.py:_sync_perm_for_dag:735 > override.py:sync_perm_for_dag:1101 > 
override.py:create_permission:1840 > override.py:create_action:1685 > 
override.py:get_action:1676:      15
        dagbag.py:_serialize_dag_capturing_errors:679 > 
dagbag.py:_sync_perm_for_dag:735 > override.py:sync_perm_for_dag:1101 > 
override.py:create_permission:1836 > override.py:get_permission:1811:   10
        test_processor.py:_process_file:114 > processor.py:process_file:883 > 
processor.py:save_dag_to_db:919 > dagbag.py:_sync_to_db:710 > 
dag.py:bulk_write_to_db:3171:       1
        test_processor.py:_process_file:114 > processor.py:process_file:883 > 
processor.py:save_dag_to_db:919 > dagbag.py:_sync_to_db:710 > 
dag.py:bulk_write_to_db:3191:       1
        processor.py:process_file:883 > processor.py:save_dag_to_db:919 > 
dagbag.py:_sync_to_db:710 > dag.py:bulk_write_to_db:3194 > 
dagrun.py:active_runs_of_dags:380: 1
        processor.py:process_file:883 > processor.py:save_dag_to_db:919 > 
dagbag.py:_sync_to_db:710 > dag.py:bulk_write_to_db:3263 > 
dagcode.py:bulk_sync_to_db:82:     1
        python.py:pytest_pyfunc_call:162 > 
test_processor.py:test_counter_for_last_num_of_db_queries:1022 > 
test_processor.py:_process_file:114 > processor.py:process_file:893 > 
processor.py:update_import_errors:618:        2
        python.py:pytest_pyfunc_call:162 > 
test_processor.py:test_counter_for_last_num_of_db_queries:1022 > 
test_processor.py:_process_file:114 > processor.py:process_file:893 > 
processor.py:update_import_errors:625:        1
        test_processor.py:_process_file:114 > processor.py:process_file:904 > 
processor.py:update_dag_warnings:689 > processor.py:_validate_task_pools:668 > 
pool.py:get_pools:69:      1
        python.py:pytest_pyfunc_call:162 > 
test_processor.py:test_counter_for_last_num_of_db_queries:1022 > 
test_processor.py:_process_file:114 > processor.py:process_file:904 > 
processor.py:update_dag_warnings:691: 1
   
   Recorded query locations engine:
        processor.py:process_file:883 > processor.py:save_dag_to_db:919 > 
dagbag.py:_sync_to_db:707 > dagbag.py:_serialize_dag_capturing_errors:672 > 
serialized_dag.py:write_dag:158:  5
        processor.py:process_file:883 > processor.py:save_dag_to_db:919 > 
dagbag.py:_sync_to_db:707 > dagbag.py:_serialize_dag_capturing_errors:672 > 
serialized_dag.py:write_dag:168:  5
        processor.py:process_file:883 > processor.py:save_dag_to_db:919 > 
dagbag.py:_sync_to_db:707 > dagbag.py:_serialize_dag_capturing_errors:672 > 
serialized_dag.py:write_dag:181:  5
        dagbag.py:_sync_perm_for_dag:735 > override.py:sync_perm_for_dag:1101 > 
override.py:create_permission:1836 > override.py:get_permission:1807 > 
override.py:get_action:1676:     15
        dagbag.py:_sync_perm_for_dag:735 > override.py:sync_perm_for_dag:1101 > 
override.py:create_permission:1836 > override.py:get_permission:1808 > 
override.py:get_resource:1737:   15
        dagbag.py:_sync_perm_for_dag:735 > override.py:sync_perm_for_dag:1101 > 
override.py:create_permission:1839 > override.py:create_resource:1745 > 
override.py:get_resource:1737:  15
        dagbag.py:_serialize_dag_capturing_errors:679 > 
dagbag.py:_sync_perm_for_dag:735 > override.py:sync_perm_for_dag:1101 > 
override.py:create_permission:1839 > override.py:create_resource:1751:  10 -> 
new
        dagbag.py:_sync_perm_for_dag:735 > override.py:sync_perm_for_dag:1101 > 
override.py:create_permission:1840 > override.py:create_action:1685 > 
override.py:get_action:1676:      15
        dagbag.py:_sync_to_db:707 > 
dagbag.py:_serialize_dag_capturing_errors:679 > 
dagbag.py:_sync_perm_for_dag:735 > override.py:sync_perm_for_dag:1101 > 
override.py:create_permission:1845: 15 -> new
        dagbag.py:_serialize_dag_capturing_errors:679 > 
dagbag.py:_sync_perm_for_dag:735 > override.py:sync_perm_for_dag:1101 > 
override.py:create_permission:1836 > override.py:get_permission:1811:   10
        test_processor.py:_process_file:114 > processor.py:process_file:883 > 
processor.py:save_dag_to_db:919 > dagbag.py:_sync_to_db:710 > 
dag.py:bulk_write_to_db:3171:       1
        test_processor.py:_process_file:114 > processor.py:process_file:883 > 
processor.py:save_dag_to_db:919 > dagbag.py:_sync_to_db:710 > 
dag.py:bulk_write_to_db:3191:       1
        processor.py:process_file:883 > processor.py:save_dag_to_db:919 > 
dagbag.py:_sync_to_db:710 > dag.py:bulk_write_to_db:3194 > 
dagrun.py:active_runs_of_dags:380: 1
        processor.py:process_file:883 > processor.py:save_dag_to_db:919 > 
dagbag.py:_sync_to_db:710 > dag.py:bulk_write_to_db:3263 > 
dagcode.py:bulk_sync_to_db:82:     1
        processor.py:process_file:883 > processor.py:save_dag_to_db:919 > 
dagbag.py:_sync_to_db:710 > dag.py:bulk_write_to_db:3324 > 
manager.py:create_datasets:56:     2 -> new
        python.py:pytest_pyfunc_call:162 > 
test_processor.py:test_counter_for_last_num_of_db_queries:1022 > 
test_processor.py:_process_file:114 > processor.py:process_file:893 > 
processor.py:update_import_errors:618:        1
        python.py:pytest_pyfunc_call:162 > 
test_processor.py:test_counter_for_last_num_of_db_queries:1022 > 
test_processor.py:_process_file:114 > processor.py:process_file:893 > 
processor.py:update_import_errors:625:        1
        test_processor.py:_process_file:114 > processor.py:process_file:904 > 
processor.py:update_dag_warnings:689 > processor.py:_validate_task_pools:668 > 
pool.py:get_pools:69:      1
        python.py:pytest_pyfunc_call:162 > 
test_processor.py:test_counter_for_last_num_of_db_queries:1022 > 
test_processor.py:_process_file:114 > processor.py:process_file:904 > 
processor.py:update_dag_warnings:691: 1
   
   ```
   As I see the mostly new queries come from `dagbag`. Maybe as a solution we 
can change the event handler from `session` to `engine`  and it will make our 
counter more accurate.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to