Re: [PR] Add Rendered k8s pod spec tab to ti details view [airflow]
dirrao commented on code in PR #39141: URL: https://github.com/apache/airflow/pull/39141#discussion_r1573147650

## airflow/www/views.py:

@@ -1533,6 +1533,48 @@ def rendered_k8s(self, *, session: Session = NEW_SESSION):
             title=title,
         )

+    @expose("/object/rendered-k8s")
+    @auth.has_access_dag("GET", DagAccessEntity.TASK_INSTANCE)
+    @provide_session
+    def rendered_k8s_data(self, *, session: Session = NEW_SESSION):
+        """Get rendered k8s yaml."""
+        if not settings.IS_K8S_OR_K8SCELERY_EXECUTOR:
+            return {"error": "Not a k8s or k8s_celery executor"}, 404
+        # This part is only used for k8s executor so providers.cncf.kubernetes must be installed
+        # with the get_rendered_k8s_spec method
+        from airflow.providers.cncf.kubernetes.template_rendering import get_rendered_k8s_spec
+
+        dag_id = request.args.get("dag_id")
+        task_id = request.args.get("task_id")
+        if task_id is None:
+            return {"error": "Task id not passed in the request"}, 404
+        run_id = request.args.get("run_id")
+        map_index = request.args.get("map_index", -1, type=int)
+        logger.info("Retrieving rendered k8s data.")
+
+        dag: DAG = get_airflow_app().dag_bag.get_dag(dag_id)
+        task = dag.get_task(task_id)
+        dag_run = dag.get_dagrun(run_id=run_id, session=session)
+        ti = dag_run.get_task_instance(task_id=task.task_id, map_index=map_index, session=session)
+
+        if not ti:
+            return {"error": f"can't find task instance {task.task_id}"}, 404
+        pod_spec = None
+        if not isinstance(ti, TaskInstance):
+            return {"error": f"{task.task_id} is not a task instance"}, 500
+        try:
+            pod_spec = get_rendered_k8s_spec(ti, session=session)

Review Comment: Does this regenerate the pod spec or load from DB?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Fix stacklevel for TaskContextLogger [airflow]
dirrao commented on code in PR #39142: URL: https://github.com/apache/airflow/pull/39142#discussion_r1573142983

## airflow/utils/log/task_context_logger.py:

@@ -101,7 +101,7 @@ def _log(self, level: int, msg: str, *args, ti: TaskInstance):
         task_handler.set_context(ti, identifier=self.component_name)
         if hasattr(task_handler, "mark_end_on_close"):
             task_handler.mark_end_on_close = False
-        filename, lineno, func, stackinfo = logger.findCaller()
+        filename, lineno, func, stackinfo = logger.findCaller(stacklevel=3)

Review Comment: LGTM. Can you add a test case to reflect the same?
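The one-line fix above relies on the stdlib behavior that `stacklevel` skips wrapper frames when logging attributes a record to a call site. A small self-contained sketch of that mechanism (plain stdlib `logging`, not Airflow code; all names here are made up):

```python
import logging

captured = []

class ListHandler(logging.Handler):
    # Collects emitted records so we can inspect their attributes.
    def emit(self, record):
        captured.append(record)

logger = logging.getLogger("stacklevel_demo")
logger.addHandler(ListHandler())
logger.setLevel(logging.INFO)

def log_helper(msg):
    # stacklevel=2 skips this wrapper frame, so the record is attributed
    # to log_helper's caller instead of log_helper itself.
    logger.info(msg, stacklevel=2)

def business_logic():
    log_helper("task failed")

business_logic()
print(captured[0].funcName)  # → business_logic
```

With the default `stacklevel=1`, `funcName` would be `log_helper`; bumping the level is exactly how a logging wrapper like `TaskContextLogger` makes messages point at the real call site.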
Re: [I] trigger_rule=TriggerRule.ONE_FAILED doesn't work properly with task_groups [airflow]
red-crown commented on issue #30333: URL: https://github.com/apache/airflow/issues/30333#issuecomment-2067540565 Is there any kind of workaround for this issue?
Re: [PR] Add color to log lines in UI for error and warnings based on keywords [airflow]
tirkarthi commented on PR #39006: URL: https://github.com/apache/airflow/pull/39006#issuecomment-2067480912 The legacy log view has no code that processes each line; it just holds the whole log as a single string. Splitting it, adding ANSI codes, and joining it back would have been extra work that I left out, since there is an open issue to remove the legacy log view, especially with full-screen mode and other improvements being made to the new UI.
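For reference, the per-line keyword coloring being discussed boils down to something like the following sketch (Python here rather than the PR's TypeScript; the keyword lists are illustrative, in the PR they come from Airflow config):

```python
# Illustrative keyword lists; the real ones are configurable.
ERROR_KEYWORDS = ["error", "exception"]
WARNING_KEYWORDS = ["warn"]

BOLD, RED, YELLOW, RESET = "\x1b[1m", "\x1b[31m", "\x1b[33m", "\x1b[0m"

def colorize_line(line: str) -> str:
    lowered = line.lower()
    # Error keywords win: bold red, no need to also scan for warnings.
    if any(keyword in lowered for keyword in ERROR_KEYWORDS):
        return f"{BOLD}{RED}{line}{RESET}"
    if any(keyword in lowered for keyword in WARNING_KEYWORDS):
        return f"{YELLOW}{line}{RESET}"
    return line

def colorize_log(log: str) -> str:
    # The legacy view holds the whole log as one string, so it must be
    # split, colored per line, and joined back; that is the extra work
    # the comment above refers to.
    return "\n".join(colorize_line(line) for line in log.splitlines())

print(colorize_log("all good\nTask ERROR: boom\nminor warning"))
```

The early return on an error match is also the optimisation discussed later in this thread: at most one keyword scan pays off per line.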
Re: [PR] Added JOB_STATE_CANCELLED and pool_sleep GCP Dataflow Operators [airflow]
github-actions[bot] commented on PR #37364: URL: https://github.com/apache/airflow/pull/37364#issuecomment-2067413940 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.
Re: [PR] feat/log null bytes file [airflow]
github-actions[bot] commented on PR #37894: URL: https://github.com/apache/airflow/pull/37894#issuecomment-2067413926 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.
Re: [PR] [FEAT] added notebook error in databricks deferrable handler [airflow]
gaurav7261 commented on PR #39110: URL: https://github.com/apache/airflow/pull/39110#issuecomment-2067391203 Hi @Lee-W, is there any issue with the PR?
[PR] Cleanup PagerdutyNotifier __init__ [airflow]
DavidTraina opened a new pull request, #39145: URL: https://github.com/apache/airflow/pull/39145

- fix duplicate initialization of `self.class_type` and `self.custom_details`
- allow `pagerduty_events_conn_id` to be `None` since that's allowed in `PagerdutyEventsHook`
Re: [PR] Cleanup PagerdutyNotifier __init__ [airflow]
boring-cyborg[bot] commented on PR #39145: URL: https://github.com/apache/airflow/pull/39145#issuecomment-2067385930

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst) Here are some useful points:

- Pay attention to the quality of your code (ruff, mypy and type annotations). Our [pre-commits](https://github.com/apache/airflow/blob/main/contributing-docs/08_static_code_checks.rst#prerequisites-for-pre-commit-hooks) will help you with that.
- In case of a new feature, add useful documentation (in docstrings or in the `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it.
- Consider using [Breeze environment](https://github.com/apache/airflow/blob/main/dev/breeze/doc/README.rst) for testing locally; it's a heavy docker image, but it ships with a working Airflow and a lot of integrations.
- Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
- Please follow the [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
- Be sure to read the [Airflow Coding style](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#coding-style-and-best-practices).
- Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.

Apache Airflow is a community-driven project and together we are making it better.
In case of doubts contact the developers at: Mailing List: d...@airflow.apache.org Slack: https://s.apache.org/airflow-slack
[I] Custom FAB actions no longer work [airflow]
jedcunningham opened a new issue, #39144: URL: https://github.com/apache/airflow/issues/39144

### Apache Airflow version

2.9.0

### If "Other Airflow 2 version" selected, which one?

_No response_

### What happened?

Before the auth manager/FAB transition in 2.9.0, plugins were able to include custom actions. This no longer works:

```
[2024-04-19T21:24:50.050+] {app.py:1744} ERROR - Exception on /testappbuilderbaseview/ [GET]
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.12/site-packages/flask/app.py", line 2529, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/airflow/.local/lib/python3.12/site-packages/flask/app.py", line 1825, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/airflow/.local/lib/python3.12/site-packages/flask/app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/airflow/.local/lib/python3.12/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/fab/auth_manager/decorators/auth.py", line 118, in decorated
    is_authorized=appbuilder.sm.check_authorization(permissions, dag_id),
  File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/fab/auth_manager/security_manager/override.py", line 2305, in check_authorization
    elif not self.has_access(*perm):
  File "/home/airflow/.local/lib/python3.12/site-packages/airflow/www/security_manager.py", line 142, in has_access
    return is_authorized_method(action_name, resource_pk, user)
  File "/home/airflow/.local/lib/python3.12/site-packages/airflow/www/security_manager.py", line 340, in
    method=get_method_from_fab_action_map()[action],
KeyError: 'can_do'
```

### What you think should happen instead?

Custom actions should continue to function.

### How to reproduce

This is a simple plugin that reproduces the issue:

```
from airflow.plugins_manager
```
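For illustration, the `KeyError` arises because the FAB action-to-HTTP-method map only knows the built-in actions, so any custom action name falls through. A simplified stand-in (not Airflow's actual map; names abbreviated):

```python
# Simplified stand-in for get_method_from_fab_action_map(): only the
# built-in FAB actions are present, so a custom action like "can_do"
# raises KeyError, as in the traceback above.
FAB_ACTION_METHOD_MAP = {
    "can_read": "GET",
    "can_edit": "PUT",
    "can_create": "POST",
    "can_delete": "DELETE",
}

def method_for_action(action: str) -> str:
    # Direct dict lookup: no fallback for unknown (custom) actions.
    return FAB_ACTION_METHOD_MAP[action]

print(method_for_action("can_read"))  # → GET
# method_for_action("can_do") raises KeyError: 'can_do'
```

A fix would need either a fallback branch for unknown actions or a registration hook so plugins can extend the map.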
Re: [PR] Added Feature: search dags by task id with suggestions [airflow]
jscheffl commented on PR #37436: URL: https://github.com/apache/airflow/pull/37436#issuecomment-2067295317 @bbovenzi or @eladkal Can you make a 2nd pass review? Then I think we should merge this.
Re: [I] SSHClickhouseOperator [airflow]
jscheffl commented on issue #39140: URL: https://github.com/apache/airflow/issues/39140#issuecomment-2067294313 It probably takes a bit of reading in here: https://github.com/apache/airflow/blob/main/contributing-docs/11_provider_packages.rst#developing-community-managed-provider-packages
Re: [PR] Add task failed dependencies to details page. [airflow]
jscheffl commented on PR #38449: URL: https://github.com/apache/airflow/pull/38449#issuecomment-2067289934 I really like this extension to the view! Looking forward to the fixes and resolution of the merge conflict... then I would like to review.
Re: [PR] Make audit log before/after filterable [airflow]
jscheffl commented on PR #39120: URL: https://github.com/apache/airflow/pull/39120#issuecomment-2067288506

I tried to "use" the audit log, and either I am too stupid (I have rarely clicked there, maybe I'm a noob at this), but:

- When clicking in one of the date fields and picking a date, the filter is not applied.
- Clicking or pressing enter in a different area clears the previously selected data.
- I tried multiple combinations of the include/exclude filters; using include always generated empty results, and exclude had no effect.

Somehow I was not able to filter, and I did not understand what I can filter on. Might it be that something is broken with these changes? Or is my browser failing? Ubuntu 22.04 x64, Firefox.
Re: [PR] Improve task filtering UX [airflow]
jscheffl commented on code in PR #39119: URL: https://github.com/apache/airflow/pull/39119#discussion_r1572952793

## airflow/www/static/js/dag/details/Header.tsx:

@@ -124,7 +124,7 @@ const Header = ({ mapIndex }: Props) => {
         )}
-        {mapIndex !== undefined && (
+        {mapIndex !== undefined && mapIndex !== -1 && (

Review Comment: Aaaah, also found this; thanks for fixing it within this PR!
Re: [I] missing conf as a filter field in dag_run browsing [airflow]
raphaelauv commented on issue #39137: URL: https://github.com/apache/airflow/issues/39137#issuecomment-2067268019 Okay, but we can already order by conf, and with fewer than 10,000 dag_runs it renders almost instantly. So maybe a disclaimer in the frontend when a filter is used on this field would be a good tradeoff.
Re: [I] front - admin menu - drop-down non deterministic [airflow]
jscheffl commented on issue #39135: URL: https://github.com/apache/airflow/issues/39135#issuecomment-2067265929 Also have seen this a couple of times since 2.9.0 in our environment. Still have no clue about the root cause and how to fix. Ideas welcome.
Re: [I] missing conf as a filter field in dag_run browsing [airflow]
jscheffl commented on issue #39137: URL: https://github.com/apache/airflow/issues/39137#issuecomment-2067264129 I have also been missing this option many, many times. But you need to know that `conf` is not a scalar database field but a serialized object. You cannot "easily" set a filter criterion in the database engine and leverage an index to filter. Filtering on `conf` would be very heavy: besides filtering on the other fields, all rows must be loaded and deserialized for filtering. On a large DAG run table this could be a performance breaker, and multiple such queries could be a DoS on the system.
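The cost described above can be illustrated with a toy example (JSON used for simplicity; the real column is a serialized blob, and the row values here are invented): the predicate can only run after every candidate row has been fetched and deserialized in Python, not inside the database engine.

```python
import json

# Hypothetical dag_run rows with a serialized ``conf`` column.
rows = [
    {"run_id": "manual__1", "conf": json.dumps({"env": "prod"})},
    {"run_id": "manual__2", "conf": json.dumps({"env": "dev"})},
    {"run_id": "manual__3", "conf": json.dumps({"env": "prod"})},
]

# The database cannot index into the blob, so every candidate row must be
# loaded and deserialized before the predicate can be applied: O(n)
# deserializations per query on a large dag_run table.
matches = [
    row["run_id"]
    for row in rows
    if json.loads(row["conf"]).get("env") == "prod"
]
print(matches)  # → ['manual__1', 'manual__3']
```

By contrast, ordering by the raw serialized string (as the reply above notes is already possible) needs no deserialization at all, which is why it stays cheap.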
Re: [I] SSHClickhouseOperator [airflow]
RNHTTR commented on issue #39140: URL: https://github.com/apache/airflow/issues/39140#issuecomment-2067261562 This would likely require a new [community maintained provider](https://airflow.apache.org/docs/apache-airflow-providers/#community-maintained-providers), which would probably require a discussion in the [Airflow dev list](https://airflow.apache.org/community/) as to who would actually maintain such a provider. If you're interested in maintaining the provider, then sure, this might be a great idea.
Re: [I] SSHClickhouseOperator [airflow]
jscheffl commented on issue #39140: URL: https://github.com/apache/airflow/issues/39140#issuecomment-2067261323 Thanks for bringing up this idea. I propose that this would need to be added to a provider package; even the SSHOperator is in a provider package (apache-airflow-providers-ssh). For Clickhouse there is currently no provider package. Starting a new provider package follows some rules and needs a strong community to support it. Be aware that "just dropping code" might be rejected or strongly challenged.
[PR] Fix stacklevel for TaskContextLogger [airflow]
dstandish opened a new pull request, #39142: URL: https://github.com/apache/airflow/pull/39142 Previously it would show the message as coming from the context logger module and not the actual call site.
[PR] Add Rendered k8s pod spec tab to ti details view [airflow]
bbovenzi opened a new pull request, #39141: URL: https://github.com/apache/airflow/pull/39141

https://github.com/apache/airflow/assets/4600967/b0138b68-3674-48f1-8fd5-820ec351eb20

---

**^ Add meaningful description above** Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#pull-request-guidelines)** for more information. In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
[I] SSHClickhouseOperator [airflow]
vargacypher opened a new issue, #39140: URL: https://github.com/apache/airflow/issues/39140

### Description

Calls to Clickhouse over the SSH protocol can be made with the SSHOperator or BashOperator, but could we abstract it further into something more specific to Clickhouse? I don't know if it makes sense or is just something that we missed.

### Use case/motivation

Today we use a lot of SSH calls to connect to Clickhouse servers and execute commands there using **clickhouse-client**, or we use an intermediate server (ETL server) that has access to multiple Clickhouse servers via the native protocol and executes the **clickhouse-client** commands from there using the Clickhouse host. I'd like to have a specific SSHClickhouseOperator and hook to interact with Clickhouse over SSH.

### Related issues

_No response_

### Are you willing to submit a PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
Re: [PR] Add color to log lines in UI for error and warnings based on keywords [airflow]
jscheffl commented on code in PR #39006: URL: https://github.com/apache/airflow/pull/39006#discussion_r1572806934

## airflow/www/static/js/utils/index.ts:

@@ -185,6 +185,34 @@ const toSentenceCase = (camelCase: string): string => {
   return "";
 };

+const addColorKeyword = (
+  parsedLine: string,
+  errorKeywords: string[],
+  warningKeywords: string[]
+): string => {
+  const lowerParsedLine = parsedLine.toLowerCase();
+  const containsError = errorKeywords.some((keyword) =>
+    lowerParsedLine.includes(keyword)
+  );
+  const bold = (line: string) => `\x1b[1m${line}\x1b[0m`;
+  const red = (line: string) => `\x1b[31m${line}\x1b[39m`;

Review Comment: Okay, but you are still measuring in ms... most of the time is probably spent by the browser anyway.
Re: [PR] Fix templated env_vars field in `KubernetesPodOperator` to allow for compatibility with `XComArgs` [airflow]
boring-cyborg[bot] commented on PR #39139: URL: https://github.com/apache/airflow/pull/39139#issuecomment-2067136213 Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
[PR] Fix templated env_vars field in `KubernetesPodOperator` to allow for compatibility with `XComArgs` [airflow]
nyoungstudios opened a new pull request, #39139: URL: https://github.com/apache/airflow/pull/39139

Allows usage of `XComArgs` for the `env_vars` field in `KubernetesPodOperator` by moving the raise-error conditional to after the jinja template strings are rendered. By extension, this also allows passing a jinja template string to the `env_vars` field when `render_as_native_obj=True` converts it to a dictionary.

related discussion: https://github.com/apache/airflow/discussions/38522

The example dag in the related discussion now works.
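The fix is an order-of-operations change: validate the type after template rendering, not before. A hypothetical sketch of the idea (all names invented, not the provider's actual code; `render_template` stands in for Airflow's template machinery):

```python
def render_then_validate(env_vars, render_template):
    """Validate ``env_vars`` only after templates have been rendered.

    Before the fix, a template string or XComArg passed as ``env_vars``
    failed the type check before it ever had the chance to render into
    a dict or list.
    """
    rendered = render_template(env_vars)
    # The check now runs against the *rendered* value.
    if not isinstance(rendered, (dict, list)):
        raise TypeError(
            f"env_vars must render to a dict or list, got {type(rendered).__name__}"
        )
    return rendered

# A template string that only becomes a dict at render time now passes:
result = render_then_validate(
    "{{ ti.xcom_pull('make_env') }}",  # hypothetical template input
    lambda _: {"MY_VAR": "1"},         # stand-in renderer
)
print(result)  # → {'MY_VAR': '1'}
```

Validating pre-render values would reject anything that is still a string at parse time, which is exactly what templated fields are until execution.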
Re: [PR] Add logic to handle on_kill for BigQueryInsertJobOperator when deferrable=True [airflow]
tirkarthi commented on PR #38912: URL: https://github.com/apache/airflow/pull/38912#issuecomment-2067009288 @sunank200 As I understand it, this will kill the job when the triggerer is restarted (for deployment, maintenance, etc.) or marked unhealthy in HA mode, even though the trigger would be picked up by another triggerer: the task is cancelled by asyncio, and the `CancelledError` is caught in the except clause, which kills the job. Is it expected that triggerer restarts should kill the job?
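The scenario raised here can be sketched with plain asyncio (a toy stand-in, not the BigQuery trigger itself; names are invented): cancelling the task that awaits the job runs the `except asyncio.CancelledError` branch, which is exactly where an on-kill handler would cancel the remote job, whether or not the user asked for it.

```python
import asyncio

cancelled_jobs = []

async def wait_for_job(job_id: str):
    # Stand-in for a deferrable trigger polling a remote job.
    try:
        while True:
            await asyncio.sleep(0.01)  # poll interval
    except asyncio.CancelledError:
        # A triggerer shutdown/restart cancels the asyncio task, so this
        # cleanup branch runs even though the user never asked to kill
        # the job; this is the concern in the question above.
        cancelled_jobs.append(job_id)
        raise

async def main():
    task = asyncio.create_task(wait_for_job("bq-job-1"))
    await asyncio.sleep(0.05)
    task.cancel()  # what a triggerer restart effectively does to the coroutine
    try:
        await task
    except asyncio.CancelledError:
        pass

asyncio.run(main())
print(cancelled_jobs)  # → ['bq-job-1']
```

Distinguishing "user killed the task" from "triggerer is merely restarting" cannot be done from the `CancelledError` alone, which is why the question matters.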
Re: [PR] Add color to log lines in UI for error and warnings based on keywords [airflow]
tirkarthi commented on code in PR #39006: URL: https://github.com/apache/airflow/pull/39006#discussion_r1572653543

## airflow/www/static/js/utils/index.ts:

@@ -185,6 +185,34 @@ const toSentenceCase = (camelCase: string): string => {
   return "";
 };

+const addColorKeyword = (
+  parsedLine: string,
+  errorKeywords: string[],
+  warningKeywords: string[]
+): string => {
+  const lowerParsedLine = parsedLine.toLowerCase();
+  const containsError = errorKeywords.some((keyword) =>
+    lowerParsedLine.includes(keyword)
+  );
+  const bold = (line: string) => `\x1b[1m${line}\x1b[0m`;
+  const red = (line: string) => `\x1b[31m${line}\x1b[39m`;
+  const yellow = (line: string) => `\x1b[33m${line}\x1b[39m`;
+
+  if (containsError) {
+    return bold(red(parsedLine));
+  }
+
+  const containsWarning = warningKeywords.some((keyword) =>

Review Comment: I kept them apart as an optimisation: if an error keyword is found, I want to return the line red-colored without also searching it for warning keywords. Keeping the two checks next to each other would make two searches for every line even though only one result is used when an error keyword is present.
Re: [PR] Add color to log lines in UI for error and warnings based on keywords [airflow]
tirkarthi commented on PR #39006: URL: https://github.com/apache/airflow/pull/39006#issuecomment-2066933356

@dirrao @jscheffl I used the DAG below, which generated around a 30 MB log file; the function call added around 15 ms, with total render time taking 9-10 seconds. I did it with dev builds using "yarn dev", since the function name was not searchable in production mode, possibly due to missing source maps. Please let me know if I missed anything. Thanks.

```python
from __future__ import annotations

from datetime import datetime

from airflow import DAG
from airflow.decorators import task

with DAG(
    dag_id="perf_39006",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    schedule_interval=None,
) as dag:

    @task
    def log_line_generator():
        import random

        lorem_ipsum_words = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor".split()
        for _ in range(200_000):
            random.shuffle(lorem_ipsum_words)
            line = " ".join(lorem_ipsum_words)
            rand = random.random()
            if rand > 0.9:
                print(f"{line} error")
            elif rand > 0.8:
                print(f"{line} warn")
            else:
                print(line)

    log_line_generator()
```

Without patch:

![before_color](https://github.com/apache/airflow/assets/3972343/dd418987-ebf7-4590-a7e0-b95cf541a257)

With patch:

![after_color](https://github.com/apache/airflow/assets/3972343/62403c96-d030-4476-a0aa-e2672ad46961)
Re: [I] Add CloudRunCreateServiceOperator to Google Provider [airflow]
eladkal commented on issue #38760: URL: https://github.com/apache/airflow/issues/38760#issuecomment-2066927617

> @eladkal Hi, Could i take this issue?

Assigned
Re: [PR] Add color to log lines in UI for error and warnings based on keywords [airflow]
tirkarthi commented on code in PR #39006: URL: https://github.com/apache/airflow/pull/39006#discussion_r1572643904

## airflow/www/static/js/utils/index.ts:
## @@ -185,6 +185,34 @@ const toSentenceCase = (camelCase: string): string => { return ""; };

+const addColorKeyword = (
+ parsedLine: string,
+ errorKeywords: string[],
+ warningKeywords: string[]
+): string => {
+ const lowerParsedLine = parsedLine.toLowerCase();
+ const containsError = errorKeywords.some((keyword) =>
+lowerParsedLine.includes(keyword)
+ );
+ const bold = (line: string) => `\x1b[1m${line}\x1b[0m`;
+ const red = (line: string) => `\x1b[31m${line}\x1b[39m`;

Review Comment: Thanks, that took the time from 15 ms down to 2 ms. I also tried inlining (formatting and returning the string directly), but for some reason the time went back up to 15 ms; my assumption was that inlining the code would increase performance, but that didn't happen when directly returning the string.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
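The keyword-based highlighting idea in the snippet above can be sketched in Python as follows (function and keyword names here are illustrative, not the PR's actual TypeScript):

```python
def highlight_by_keywords(line: str, error_keywords: list[str], warning_keywords: list[str]) -> str:
    """Wrap a log line in ANSI escapes when it contains a known keyword."""
    bold = lambda s: f"\x1b[1m{s}\x1b[0m"      # SGR 1 = bold, 0 = reset all
    red = lambda s: f"\x1b[31m{s}\x1b[39m"     # SGR 31 = red fg, 39 = default fg
    yellow = lambda s: f"\x1b[33m{s}\x1b[39m"  # SGR 33 = yellow fg

    lower = line.lower()  # case-insensitive matching, as in the PR
    if any(k in lower for k in error_keywords):
        return bold(red(line))
    if any(k in lower for k in warning_keywords):
        return bold(yellow(line))
    return line
```

Error keywords win over warning keywords because they are checked first, which mirrors the order of checks in the reviewed function.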
Re: [PR] Add color to log lines in UI for error and warnings based on keywords [airflow]
tirkarthi commented on code in PR #39006: URL: https://github.com/apache/airflow/pull/39006#discussion_r1572642067

## airflow/config_templates/config.yml:
## @@ -979,6 +979,22 @@ logging:
 type: boolean
 example: ~
 default: "True"
+color_log_error_keywords:

Review Comment: I don't want to bake the color name "red" into the config option in case it changes before release. I guess for me `color_log_error_keywords` is good enough.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Add color to log lines in UI for error and warnings based on keywords [airflow]
tirkarthi commented on code in PR #39006: URL: https://github.com/apache/airflow/pull/39006#discussion_r1572640444

## airflow/www/static/js/utils/index.ts:
## @@ -185,6 +185,34 @@ const toSentenceCase = (camelCase: string): string => { return ""; };

+const addColorKeyword = (

Review Comment: Thanks, added tests and renamed the function to `highlightByKeywords`.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [AIRFLOW-2193] Add ROperator for using R [airflow]
paid-geek commented on PR #3115: URL: https://github.com/apache/airflow/pull/3115#issuecomment-2066850447 @potiuk I want to pick up this effort on the ROperator as I need to develop one. Anyway, what is the process for restarting work on a closed issue like this one? To be clear, I will be undertaking this work anyway, so I want to make sure my efforts contribute to the Airflow software that I use daily. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
(airflow) branch constraints-main updated: Updating constraints. Github run id:8753451927
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch constraints-main in repository https://gitbox.apache.org/repos/asf/airflow.git

The following commit(s) were added to refs/heads/constraints-main by this push:
     new 0c46b7681d Updating constraints. Github run id:8753451927

0c46b7681d is described below

commit 0c46b7681d60459cf82e66a2a8dad04ead849967
Author: Automated GitHub Actions commit
AuthorDate: Fri Apr 19 15:44:20 2024 +

    Updating constraints. Github run id:8753451927

    This update in constraints is automatically committed by the CI 'constraints-push' step based on 'refs/heads/main' in the 'apache/airflow' repository with commit sha 90acbfbba1a3e6535b87376aeaf089805b7d3303.

    The action that built those constraints can be found at https://github.com/apache/airflow/actions/runs/8753451927/

    The image tag used for that build was: 90acbfbba1a3e6535b87376aeaf089805b7d3303. You can enter Breeze environment with this image by running 'breeze shell --image-tag 90acbfbba1a3e6535b87376aeaf089805b7d3303'

    All tests passed in this build so we determined we can push the updated constraints. See https://github.com/apache/airflow/blob/main/README.md#installing-from-pypi for details.
---
 constraints-3.10.txt                  | 30 +++---
 constraints-3.11.txt                  | 28 ++--
 constraints-3.12.txt                  | 29 +++--
 constraints-3.8.txt                   | 30 +++---
 constraints-3.9.txt                   | 30 +++---
 constraints-no-providers-3.10.txt     |  9 +
 constraints-no-providers-3.11.txt     |  7 ---
 constraints-no-providers-3.12.txt     |  7 ---
 constraints-no-providers-3.8.txt      |  9 +
 constraints-no-providers-3.9.txt      |  9 +
 constraints-source-providers-3.10.txt | 30 +++---
 constraints-source-providers-3.11.txt | 28 ++--
 constraints-source-providers-3.12.txt | 29 +++--
 constraints-source-providers-3.8.txt  | 30 +++---
 constraints-source-providers-3.9.txt  | 30 +++---
 15 files changed, 171 insertions(+), 164 deletions(-)

diff --git a/constraints-3.10.txt b/constraints-3.10.txt
index 25fc6b1b16..01146d0f0b 100644
--- a/constraints-3.10.txt
+++ b/constraints-3.10.txt
@@ -1,6 +1,6 @@
 #
-# This constraints file was automatically generated on 2024-04-17T19:33:55.728687
+# This constraints file was automatically generated on 2024-04-19T12:42:43.369729
 # via "eager-upgrade" mechanism of PIP. For the "main" branch of Airflow.
 # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs
 # the providers from PIP-released packages at the moment of the constraint generation.
@@ -237,7 +237,7 @@ cachelib==0.9.0
 cachetools==5.3.3
 cassandra-driver==3.29.1
 cattrs==23.2.3
-celery==5.3.6
+celery==5.4.0
 certifi==2024.2.2
 cffi==1.16.0
 cfgv==3.4.0
@@ -266,7 +266,7 @@ cron-descriptor==1.4.3
 croniter==2.0.3
 cryptography==41.0.7
 curlify==2.2.1
-databricks-sql-connector==2.9.5
+databricks-sql-connector==2.9.6
 datadog==0.49.1
 db-dtypes==1.2.0
 debugpy==1.8.1
@@ -291,7 +291,7 @@ entrypoints==0.4
 eralchemy2==1.3.8
 et-xmlfile==1.1.0
 eventlet==0.36.1
-exceptiongroup==1.2.0
+exceptiongroup==1.2.1
 execnet==2.1.1
 executing==2.0.1
 facebook_business==19.0.3
@@ -323,7 +323,7 @@ google-cloud-audit-log==0.2.5
 google-cloud-automl==2.13.3
 google-cloud-batch==0.17.18
 google-cloud-bigquery-datatransfer==3.15.2
-google-cloud-bigquery==3.20.1
+google-cloud-bigquery==3.21.0
 google-cloud-bigtable==2.23.1
 google-cloud-build==3.24.0
 google-cloud-compute==1.18.0
@@ -348,7 +348,7 @@ google-cloud-redis==2.15.3
 google-cloud-resource-manager==1.12.3
 google-cloud-run==0.10.5
 google-cloud-secret-manager==2.19.0
-google-cloud-spanner==3.44.0
+google-cloud-spanner==3.45.0
 google-cloud-speech==2.26.0
 google-cloud-storage-transfer==1.11.3
 google-cloud-storage==2.16.0
@@ -368,9 +368,9 @@ greenlet==3.0.3
 grpc-google-iam-v1==0.13.0
 grpc-interceptor==0.15.4
 grpcio-gcp==0.2.2
-grpcio-status==1.62.1
-grpcio-tools==1.62.1
-grpcio==1.62.1
+grpcio-status==1.62.2
+grpcio-tools==1.62.2
+grpcio==1.62.2
 gssapi==1.8.3
 gunicorn==22.0.0
 h11==0.14.0
@@ -406,7 +406,7 @@ isodate==0.6.1
 itsdangerous==2.2.0
 jaraco.classes==3.4.0
 jaraco.context==5.3.0
-jaraco.functools==4.0.0
+jaraco.functools==4.0.1
 jedi==0.19.1
 jeepney==0.8.0
 jmespath==0.10.0
@@ -480,7 +480,7 @@ nodeenv==1.8.0
 numpy==1.26.4
 oauthlib==3.2.2
 objsize==0.7.0
-openai==1.21.0
+openai==1.23.1
 openapi-schema-validator==0.6.2
 openapi-spec-validator==0.7.1
 openlineage-integration-common==1.12.0
@@ -634,7 +634,7 @@ smbprotocol==1.13.0
 smmap==5.0.1
 sniffio==1.3.1
 snowballstemmer==2.2.0
[PR] Add dag re-parsing request endpoint [airflow]
utkarsharma2 opened a new pull request, #39138: URL: https://github.com/apache/airflow/pull/39138

API endpoint to request reparsing of a DAG. The API is useful when users have lots of DAGs and want to prioritize the parsing of a particular one. We record the requests in a DB table and rearrange the `file_path_queue` variable such that the requested DAG is prioritized.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
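The prioritisation described above can be sketched roughly like this (a toy model with illustrative names, not the actual DagFileProcessorManager code):

```python
from collections import deque

def prioritize_requested_files(file_path_queue: deque, requested: list[str]) -> deque:
    """Move files with a pending reparse request to the front of the queue,
    keeping the relative order of everything else unchanged."""
    requested_set = set(requested)
    front = [p for p in file_path_queue if p in requested_set]
    rest = [p for p in file_path_queue if p not in requested_set]
    return deque(front + rest)
```

For example, requesting `c.py` out of a queue `[a.py, b.py, c.py]` would move it to the front while `a.py` and `b.py` keep their order.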
Re: [PR] openlineage, snowflake: do not run external queries for Snowflake when [airflow]
mobuchowski commented on code in PR #39113: URL: https://github.com/apache/airflow/pull/39113#discussion_r1572516872

## tests/providers/snowflake/operators/test_snowflake_sql.py:
## @@ -163,7 +162,9 @@ def test_exec_success(sql, return_last, split_statement, hook_results, hook_desc
 )

-def test_execute_openlineage_events():
+@mock.patch("airflow.providers.openlineage.utils.utils.should_use_external_connection")
+def test_execute_openlineage_events(should_use_external_connection):
+    should_use_external_connection.return_value = False

Review Comment: `test_execute_openlineage_events` in `test_redshift_sql` fulfills that role now

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
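The pattern in the diff above, patching a module-level helper so the test never hits an external service, looks roughly like this in isolation (module and function names here are illustrative, not the provider's real API):

```python
import sys
from unittest import mock

# stand-in for a module-level helper that the code under test calls
def should_use_external_connection(hook):
    return True  # pretend the real check would reach out over the network

def code_under_test(hook):
    if should_use_external_connection(hook):
        return "external"
    return "local"

# patch the name where it is looked up (this module), not where it is defined
with mock.patch.object(sys.modules[__name__], "should_use_external_connection", return_value=False):
    result = code_under_test(hook=None)  # takes the "local" branch under the mock
```

Outside the `with` block the original function is restored, so other tests are unaffected.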
Re: [I] Kubernetes Executor Task Leak [airflow]
karunpoudel-chr commented on issue #36998: URL: https://github.com/apache/airflow/issues/36998#issuecomment-2066762929

I am seeing the issue in a single namespace.

airflow==2.8.4
apache-airflow-providers-cncf-kubernetes==7.14.0
kubernetes==23.6.0

`KubernetesJobWatcher` failed a couple of times, but it was able to restart. In the logs below, the watcher running on PID 2034 failed. On the next sync of the executor, it was able to start back up with PID 3740.

```
[2024-04-18T23:29:34.285+] [2034:139691425343296] {airflow/providers/cncf/kubernetes/executors/kubernetes_executor_utils.py:121} ERROR - Unknown error in KubernetesJobWatcher. Failing
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/urllib3/response.py", line 710, in _error_catcher
    yield
  File "/usr/local/lib/python3.11/site-packages/urllib3/response.py", line 1073, in read_chunked
    self._update_chunk_length()
  File "/usr/local/lib/python3.11/site-packages/urllib3/response.py", line 1008, in _update_chunk_length
    raise InvalidChunkLength(self, line) from None
urllib3.exceptions.InvalidChunkLength: InvalidChunkLength(got length b'', 0 bytes read)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/executors/kubernetes_executor_utils.py", line 112, in run
    self.resource_version = self._run(
                            ^^
  File "/usr/local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/executors/kubernetes_executor_utils.py", line 168, in _run
    for event in self._pod_events(kube_client=kube_client, query_kwargs=kwargs):
  File "/usr/local/lib/python3.11/site-packages/kubernetes/watch/watch.py", line 165, in stream
    for line in iter_resp_lines(resp):
  File "/usr/local/lib/python3.11/site-packages/kubernetes/watch/watch.py", line 56, in iter_resp_lines
    for seg in resp.stream(amt=None, decode_content=False):
  File "/usr/local/lib/python3.11/site-packages/urllib3/response.py", line 933, in stream
    yield from self.read_chunked(amt, decode_content=decode_content)
  File "/usr/local/lib/python3.11/site-packages/urllib3/response.py", line 1061, in read_chunked
    with self._error_catcher():
  File "/usr/local/lib/python3.11/contextlib.py", line 158, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/usr/local/lib/python3.11/site-packages/urllib3/response.py", line 727, in _error_catcher
    raise ProtocolError(f"Connection broken: {e!r}", e) from e
urllib3.exceptions.ProtocolError: ("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read))

Process KubernetesJobWatcher-5:
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/urllib3/response.py", line 710, in _error_catcher
    yield
  File "/usr/local/lib/python3.11/site-packages/urllib3/response.py", line 1073, in read_chunked
    self._update_chunk_length()
  File "/usr/local/lib/python3.11/site-packages/urllib3/response.py", line 1008, in _update_chunk_length
    raise InvalidChunkLength(self, line) from None
urllib3.exceptions.InvalidChunkLength: InvalidChunkLength(got length b'', 0 bytes read)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/executors/kubernetes_executor_utils.py", line 112, in run
    self.resource_version = self._run(
                            ^^
  File "/usr/local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/executors/kubernetes_executor_utils.py", line 168, in _run
    for event in self._pod_events(kube_client=kube_client, query_kwargs=kwargs):
  File "/usr/local/lib/python3.11/site-packages/kubernetes/watch/watch.py", line 165, in stream
    for line in iter_resp_lines(resp):
  File "/usr/local/lib/python3.11/site-packages/kubernetes/watch/watch.py", line 56, in iter_resp_lines
    for seg in resp.stream(amt=None, decode_content=False):
  File "/usr/local/lib/python3.11/site-packages/urllib3/response.py", line 933, in stream
    yield from self.read_chunked(amt, decode_content=decode_content)
  File "/usr/local/lib/python3.11/site-packages/urllib3/response.py", line 1061, in read_chunked
    with self._error_catcher():
  File "/usr/local/lib/python3.11/contextlib.py", line 158, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/usr/local/lib/python3.11/site-packages/urllib3/response.py", line 727, in _error_catcher
    raise
```
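The recovery behaviour described in that comment, where the executor notices a dead watcher process on its next sync and spawns a fresh one with a new PID, can be sketched like this (a simplified model with illustrative names, not the actual KubernetesExecutor code):

```python
class FakeProcess:
    """Stand-in for multiprocessing.Process, just enough for the sketch."""

    _next_pid = 1000

    def __init__(self):
        FakeProcess._next_pid += 1
        self.pid = FakeProcess._next_pid
        self.alive = True

    def is_alive(self):
        return self.alive


class WatcherSupervisor:
    """Mimics the executor's per-sync health check: if the watcher process
    has died (e.g. the watch connection broke), start a fresh one."""

    def __init__(self, process_factory):
        self.process_factory = process_factory
        self.process = None

    def sync(self):
        # called once per executor sync cycle
        if self.process is None or not self.process.is_alive():
            # replacement process gets a new PID, like 2034 -> 3740 in the logs
            self.process = self.process_factory()
```

A healthy watcher is left alone across syncs; only a dead one is replaced.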
[I] missing conf as a filter field in dag_run browsing [airflow]
raphaelauv opened a new issue, #39137: URL: https://github.com/apache/airflow/issues/39137

### Apache Airflow version

2.9.0

### If "Other Airflow 2 version" selected, which one?

_No response_

### What happened?

I can't use `conf` as a filter field to search a dag_run in the webserver

![Screenshot from 2024-04-19 16-43-51](https://github.com/apache/airflow/assets/10202690/2acee905-0af4-4b70-b2b1-11efe50430a6)

### What you think should happen instead?

_No response_

### How to reproduce

---

### Operating System

---

### Versions of Apache Airflow Providers

_No response_

### Deployment

Official Apache Airflow Helm Chart

### Deployment details

_No response_

### Anything else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
(airflow) branch constraints-2-9 updated: Updating constraints. Github run id:8753677815
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch constraints-2-9 in repository https://gitbox.apache.org/repos/asf/airflow.git

The following commit(s) were added to refs/heads/constraints-2-9 by this push:
     new 2d9dc7f892 Updating constraints. Github run id:8753677815

2d9dc7f892 is described below

commit 2d9dc7f892d3438e9acf52a9c1d00cd237c7f1b1
Author: Automated GitHub Actions commit
AuthorDate: Fri Apr 19 14:42:12 2024 +

    Updating constraints. Github run id:8753677815

    This update in constraints is automatically committed by the CI 'constraints-push' step based on 'refs/heads/v2-9-test' in the 'apache/airflow' repository with commit sha e61cb8fa41f34bc5e3140a2c22b24dd110b4c421.

    The action that built those constraints can be found at https://github.com/apache/airflow/actions/runs/8753677815/

    The image tag used for that build was: e61cb8fa41f34bc5e3140a2c22b24dd110b4c421. You can enter Breeze environment with this image by running 'breeze shell --image-tag e61cb8fa41f34bc5e3140a2c22b24dd110b4c421'

    All tests passed in this build so we determined we can push the updated constraints. See https://github.com/apache/airflow/blob/main/README.md#installing-from-pypi for details.
---
 constraints-3.10.txt                  | 28 ++--
 constraints-3.11.txt                  | 24 
 constraints-3.12.txt                  | 24 
 constraints-3.8.txt                   | 26 +-
 constraints-3.9.txt                   | 26 +-
 constraints-no-providers-3.10.txt     |  6 +++---
 constraints-no-providers-3.11.txt     |  4 ++--
 constraints-no-providers-3.12.txt     |  4 ++--
 constraints-no-providers-3.8.txt      |  6 +++---
 constraints-no-providers-3.9.txt      |  6 +++---
 constraints-source-providers-3.10.txt | 28 ++--
 constraints-source-providers-3.11.txt | 24 
 constraints-source-providers-3.12.txt | 24 
 constraints-source-providers-3.8.txt  | 26 +-
 constraints-source-providers-3.9.txt  | 26 +-
 15 files changed, 141 insertions(+), 141 deletions(-)

diff --git a/constraints-3.10.txt b/constraints-3.10.txt
index eb280b6a38..092d430a0f 100644
--- a/constraints-3.10.txt
+++ b/constraints-3.10.txt
@@ -1,6 +1,6 @@
 #
-# This constraints file was automatically generated on 2024-04-18T09:58:55.445040
+# This constraints file was automatically generated on 2024-04-19T12:56:27.417773
 # via "eager-upgrade" mechanism of PIP. For the "v2-9-test" branch of Airflow.
 # This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs
 # the providers from PIP-released packages at the moment of the constraint generation.
@@ -266,7 +266,7 @@ cron-descriptor==1.4.3
 croniter==2.0.3
 cryptography==41.0.7
 curlify==2.2.1
-databricks-sql-connector==2.9.5
+databricks-sql-connector==2.9.6
 datadog==0.49.1
 db-dtypes==1.2.0
 debugpy==1.8.1
@@ -290,7 +290,7 @@ entrypoints==0.4
 eralchemy2==1.3.8
 et-xmlfile==1.1.0
 eventlet==0.36.1
-exceptiongroup==1.2.0
+exceptiongroup==1.2.1
 execnet==2.1.1
 executing==2.0.1
 facebook_business==19.0.3
@@ -322,7 +322,7 @@ google-cloud-audit-log==0.2.5
 google-cloud-automl==2.13.3
 google-cloud-batch==0.17.18
 google-cloud-bigquery-datatransfer==3.15.2
-google-cloud-bigquery==3.20.1
+google-cloud-bigquery==3.21.0
 google-cloud-bigtable==2.23.1
 google-cloud-build==3.24.0
 google-cloud-compute==1.18.0
@@ -347,7 +347,7 @@ google-cloud-redis==2.15.3
 google-cloud-resource-manager==1.12.3
 google-cloud-run==0.10.5
 google-cloud-secret-manager==2.19.0
-google-cloud-spanner==3.44.0
+google-cloud-spanner==3.45.0
 google-cloud-speech==2.26.0
 google-cloud-storage-transfer==1.11.3
 google-cloud-storage==2.16.0
@@ -367,9 +367,9 @@ greenlet==3.0.3
 grpc-google-iam-v1==0.13.0
 grpc-interceptor==0.15.4
 grpcio-gcp==0.2.2
-grpcio-status==1.62.1
-grpcio-tools==1.62.1
-grpcio==1.62.1
+grpcio-status==1.62.2
+grpcio-tools==1.62.2
+grpcio==1.62.2
 gssapi==1.8.3
 gunicorn==22.0.0
 h11==0.14.0
@@ -405,7 +405,7 @@ isodate==0.6.1
 itsdangerous==2.2.0
 jaraco.classes==3.4.0
 jaraco.context==5.3.0
-jaraco.functools==4.0.0
+jaraco.functools==4.0.1
 jedi==0.19.1
 jeepney==0.8.0
 jmespath==0.10.0
@@ -474,7 +474,7 @@ nodeenv==1.8.0
 numpy==1.26.4
 oauthlib==3.2.2
 objsize==0.7.0
-openai==1.21.2
+openai==1.23.1
 openapi-schema-validator==0.6.2
 openapi-spec-validator==0.7.1
 openlineage-integration-common==1.12.0
@@ -584,7 +584,7 @@ python-telegram-bot==20.2
 python3-saml==1.16.0
 pytz==2024.1
 pywinrm==0.4.3
-pyzmq==26.0.0
+pyzmq==26.0.1
 qdrant-client==1.8.2
 reactivex==4.0.4
 readme_renderer==43.0
@@ -628,7 +628,7 @@ smbprotocol==1.13.0
 smmap==5.0.1
 sniffio==1.3.1
 snowballstemmer==2.2.0
-snowflake-connector-python==3.8.1
[PR] [OpenLineage] Add more debug logs to facilitate debugging [airflow]
kacpermuda opened a new pull request, #39136: URL: https://github.com/apache/airflow/pull/39136

After debugging some user issues, I feel like we could add some more DEBUG-level logs, so that we can deduce more information about what's happening inside OpenLineage. We should be able to trace the whole flow, from loading the plugin, through configuration, to the event being emitted. We should not have to ask users for their configuration for simple debugging. I will follow up with another PR for the OpenLineage Python client, so that it's also as descriptive as possible.

--- **^ Add meaningful description above** Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#pull-request-guidelines)** for more information. In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [I] Rendering custom map index before task is run. [airflow]
raphaelauv commented on issue #39118: URL: https://github.com/apache/airflow/issues/39118#issuecomment-2066707639 related : https://github.com/apache/airflow/issues/39092 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[I] front - admin menu - drop-down non deterministic [airflow]
raphaelauv opened a new issue, #39135: URL: https://github.com/apache/airflow/issues/39135

### Apache Airflow version

2.9.0

### If "Other Airflow 2 version" selected, which one?

_No response_

### What happened?

My user is Admin in airflow, but most of the time I can't see all of the admin menu drop-down, only Plugins

![Screenshot from 2024-04-19 16-21-32](https://github.com/apache/airflow/assets/10202690/89ed609f-315f-4f0a-9499-52a0b5debb95)

after multiple F5 I can see the whole admin menu

file:///home/raphael/Pictures/Screenshots/Screenshot%20from%202024-04-19%2016-21-44.png

and if I F5 again it goes back to only Plugins

### What you think should happen instead?

_No response_

### How to reproduce

I can't reproduce it with the simple docker compose of the airflow project; it's maybe related to the fact that the problematic airflow deployment is set up with an LDAP

### Operating System

--

### Versions of Apache Airflow Providers

_No response_

### Deployment

Official Apache Airflow Helm Chart

### Deployment details

_No response_

### Anything else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[PR] Use `declarative_base` from `sqlalchemy.orm` instead of `sqlalchemy.ext.declarative` [airflow]
Taragolis opened a new pull request, #39134: URL: https://github.com/apache/airflow/pull/39134 --- **^ Add meaningful description above** Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#pull-request-guidelines)** for more information. In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
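For context, the change is only the import location: `declarative_base` has been available from `sqlalchemy.orm` since SQLAlchemy 1.4, and the `sqlalchemy.ext.declarative` path is deprecated. A minimal before/after sketch (the model below is illustrative, not Airflow's actual one):

```python
# Old (deprecated since SQLAlchemy 1.4):
#   from sqlalchemy.ext.declarative import declarative_base
# New:
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class ExampleTag(Base):
    """Illustrative declarative model; behaves identically under either import."""
    __tablename__ = "example_tag"
    id = Column(Integer, primary_key=True)
    name = Column(String(100))
```

Since the returned `Base` class is the same either way, swapping the import is a pure cleanup with no behavioural change.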
[PR] Chart: add priorityClassName to jobs [airflow]
Aakcht opened a new pull request, #39133: URL: https://github.com/apache/airflow/pull/39133 Add possibility to set priorityClassName to helm chart jobs --- **^ Add meaningful description above** Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#pull-request-guidelines)** for more information. In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
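Assuming the option is exposed per job in the same way as the chart's other pod-level settings, usage might look like the fragment below. The key names are a guess at the shape of the feature, not confirmed by this PR:

```yaml
# hypothetical values.yaml fragment
migrateDatabaseJob:
  priorityClassName: airflow-high-priority
createUserJob:
  priorityClassName: airflow-high-priority
```

A `PriorityClass` with the referenced name would have to exist in the cluster for the job pods to schedule.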
Re: [PR] Starts execution directly from triggerer without going to worker [airflow]
Lee-W commented on code in PR #38674: URL: https://github.com/apache/airflow/pull/38674#discussion_r1572436018

## airflow/decorators/base.py:
## @@ -509,6 +509,9 @@ def _expand(self, expand_input: ExpandInput, *, strict: bool) -> XComArg:
 # task's expand() contribute to the op_kwargs operator argument, not
 # the operator arguments themselves, and should expand against it.
 expand_input_attr="op_kwargs_expand_input",
+# start with trigger is not supported in dynamic task mapping

Review Comment: Yes, we're able to make it work with dynamic task mapping if `_start_trigger` and `_next_method` are defined as class attributes. However, if they're specified in `__init__`, it seems we do not actually run `__init__` before `schedule_tis`, but only validate the kwargs: https://github.com/apache/airflow/blob/90acbfbba1a3e6535b87376aeaf089805b7d3303/airflow/models/mappedoperator.py#L176. Is there anything I missed? If my understanding is correct, do you have suggestions on how this should be handled? Thanks!

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
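The distinction being discussed, attributes visible without ever running `__init__` versus ones that only exist after instantiation, boils down to plain Python attribute lookup (the operator names below are illustrative):

```python
class ClassAttrOperator:
    _start_trigger = True  # lives on the class object itself

class InitAttrOperator:
    def __init__(self):
        self._start_trigger = True  # only exists on instances, after __init__ runs

# Code that defers __init__ (as mapped-operator scheduling does) can still
# read class attributes, but not instance attributes:
print(getattr(ClassAttrOperator, "_start_trigger", None))  # True
print(getattr(InitAttrOperator, "_start_trigger", None))   # None
```

This is why defining `_start_trigger` and `_next_method` as class attributes works with dynamic task mapping, while setting them in `__init__` does not.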
Re: [PR] Starts execution directly from triggerer without going to worker [airflow]
Lee-W commented on PR #38674: URL: https://github.com/apache/airflow/pull/38674#issuecomment-2066635766

> > For some operators such as S3KeySensor with deferrable set to True, running execute method in workers might not be necessary.
>
> How would that work in case user uses `soft_fail`? For example in case of timeout?

I don't think it has anything to do with `soft_fail` 🤔 It just changes how a task starts, but not how it ends. Or is there anything I missed?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Fix/helm chart: workers.command for KubernetesExecutor [airflow]
boring-cyborg[bot] commented on PR #39132: URL: https://github.com/apache/airflow/pull/39132#issuecomment-2066637133

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst). Here are some useful points:

- Pay attention to the quality of your code (ruff, mypy and type annotations). Our [pre-commits](https://github.com/apache/airflow/blob/main/contributing-docs/08_static_code_checks.rst#prerequisites-for-pre-commit-hooks) will help you with that.
- In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst). Consider adding an example DAG that shows how users should use it.
- Consider using [Breeze environment](https://github.com/apache/airflow/blob/main/dev/breeze/doc/README.rst) for testing locally; it's a heavy docker image but it ships with a working Airflow and a lot of integrations.
- Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
- Please follow the [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
- Be sure to read the [Airflow Coding style](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#coding-style-and-best-practices).
- Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.

Apache Airflow is a community-driven project and together we are making it better.
In case of doubts contact the developers at: Mailing List: d...@airflow.apache.org Slack: https://s.apache.org/airflow-slack -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[PR] Fix/helm chart: workers.command for KubernetesExecutor [airflow]
nyirit opened a new pull request, #39132: URL: https://github.com/apache/airflow/pull/39132

**Helm-chart version used:** 1.13.1 (latest)
**Airflow version used:** 2.8.4

The `workers.command` value override is ignored if `KubernetesExecutor` is used. I wanted to add a custom `command` parameter to the base container in the pods that are created to execute the tasks, but I failed to do so with the following values.yaml override:

```yaml
airflow:
  executor: KubernetesExecutor
  workers:
    command: ['my-test-command']
```

I have found that this value is only used in `worker-deployment.yaml`, but not in `pod-template-file.kubernetes-helm-yaml`, which seems to be used for `KubernetesExecutor`. Based on the `workers.command` [documentation](https://airflow.apache.org/docs/helm-chart/stable/parameters-ref.html#workers) this does seem to be a bug.

### Potential workaround

To make this work and prove that this is actually needed in my case, I have used the following override in the dag definition:

```python
from datetime import datetime

from airflow import DAG
from kubernetes.client import models as k8s

default_args = {
    'executor_config': {
        'pod_override': k8s.V1Pod(
            spec=k8s.V1PodSpec(
                containers=[
                    k8s.V1Container(
                        name='base',
                        command=['my', 'command', 'to', 'run'],
                    )
                ],
            )
        ),
    }
}

with DAG(dag_id='test', default_args=default_args, start_date=datetime(2024, 1, 1), schedule=None) as dag:
    ...  # dag body omitted
```

This gets the job done, but it would be a lot nicer to be able to do this from the helm chart, especially as this should be the default behaviour in my case for each dag.

---

I have checked open issues, but did not find one that mentions this problem. Also, this is my first PR here, so even against my best intentions, I might have missed something in the contributing guidelines, so all feedback and help is welcome :) Branch is freshly rebased!
---

**^ Add meaningful description above**

Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#pull-request-guidelines)** for more information. In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
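To illustrate where the fix would likely land, here is a hypothetical sketch of rendering `workers.command` into the pod template's base container, mirroring what `worker-deployment.yaml` already does. The surrounding structure and template expression are assumptions, not the actual contents of `pod-template-file.kubernetes-helm-yaml`; only the value name `.Values.workers.command` comes from the documented parameter.

```yaml
# Hypothetical excerpt from pod-template-file.kubernetes-helm-yaml (a sketch,
# not the chart's real template): render workers.command into the base
# container so KubernetesExecutor task pods honor the override too.
spec:
  containers:
    - name: base
      {{- if .Values.workers.command }}
      command: {{ tpl (toYaml .Values.workers.command) . | nindent 8 }}
      {{- end }}
```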
Re: [PR] Move render map index method and apply to dry run [airflow]
anteverse closed pull request #39087: Move render map index method and apply to dry run URL: https://github.com/apache/airflow/pull/39087
Re: [I] Add ability to filter DAGs in DAGs View by tags using "AND" instead of default behavior "OR" [airflow]
RNHTTR commented on issue #38147: URL: https://github.com/apache/airflow/issues/38147#issuecomment-2066580280

> @RNHTTR I'd love to but unfortunately this is a really inopportune time for me :(

No problem! Come back to the project when you have some more time!

> Hello, could a colleague (@TiDeane) and I be assigned to this issue? We are working for a university project and would be grateful to implement this feature. Thank you in advance.

Done!
(airflow) branch v2-9-test updated (eee047821d -> e61cb8fa41)
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch v2-9-test in repository https://gitbox.apache.org/repos/asf/airflow.git

from eee047821d Upgrade to latest hatchling 1.24.1 (again).
add e61cb8fa41 Apply PROVIDE_PROJECT_ID mypy workaround across Google provider (#39129)

No new revisions were added by this update.

Summary of changes:

```
airflow/providers/google/cloud/hooks/bigquery.py | 65 --
airflow/providers/google/cloud/hooks/cloud_sql.py | 9 ++-
.../cloud/hooks/cloud_storage_transfer_service.py | 8 ++-
.../providers/google/cloud/hooks/compute_ssh.py | 3 +-
airflow/providers/google/cloud/hooks/dataplex.py | 8 ++-
airflow/providers/google/cloud/hooks/dlp.py | 28 +-
airflow/providers/google/cloud/hooks/gcs.py | 8 ++-
airflow/providers/google/cloud/hooks/gdm.py | 4 +-
.../google/cloud/hooks/kubernetes_engine.py | 4 +-
airflow/providers/google/cloud/hooks/mlengine.py | 12 ++--
airflow/providers/google/cloud/hooks/pubsub.py | 2 +-
.../providers/google/cloud/hooks/secret_manager.py | 4 +-
.../providers/google/cloud/log/gcs_task_handler.py | 3 +-
.../providers/google/cloud/operators/bigquery.py | 27 -
.../google/cloud/operators/bigquery_dts.py | 7 ++-
.../providers/google/cloud/operators/bigtable.py | 13 +++--
.../google/cloud/operators/cloud_build.py | 23 
.../google/cloud/operators/cloud_memorystore.py | 33 +--
.../providers/google/cloud/operators/cloud_sql.py | 20 +++
.../operators/cloud_storage_transfer_service.py | 15 ++---
.../providers/google/cloud/operators/compute.py | 23 
.../google/cloud/operators/datacatalog.py | 41 +++---
.../providers/google/cloud/operators/dataflow.py | 15 ++---
.../providers/google/cloud/operators/datafusion.py | 21 +++
.../google/cloud/operators/datapipeline.py | 5 +-
.../providers/google/cloud/operators/dataprep.py | 9 +--
.../providers/google/cloud/operators/dataproc.py | 33 +--
.../providers/google/cloud/operators/datastore.py | 15 ++---
airflow/providers/google/cloud/operators/dlp.py | 61 ++--
.../providers/google/cloud/operators/functions.py | 7 ++-
airflow/providers/google/cloud/operators/gcs.py | 4 +-
.../google/cloud/operators/kubernetes_engine.py | 21 +++
.../google/cloud/operators/life_sciences.py | 3 +-
.../providers/google/cloud/operators/mlengine.py | 21 +++
airflow/providers/google/cloud/operators/pubsub.py | 11 ++--
.../providers/google/cloud/operators/spanner.py | 13 +++--
.../google/cloud/operators/speech_to_text.py | 3 +-
.../google/cloud/operators/stackdriver.py | 21 +++
airflow/providers/google/cloud/operators/tasks.py | 27 -
.../google/cloud/operators/text_to_speech.py | 3 +-
.../google/cloud/operators/translate_speech.py | 3 +-
airflow/providers/google/cloud/operators/vision.py | 25 +
.../providers/google/cloud/operators/workflows.py | 19 ---
.../google/cloud/secrets/secret_manager.py | 3 +-
.../providers/google/cloud/sensors/bigquery_dts.py | 3 +-
airflow/providers/google/cloud/sensors/bigtable.py | 3 +-
.../sensors/cloud_storage_transfer_service.py | 3 +-
airflow/providers/google/cloud/sensors/dataflow.py | 9 +--
.../providers/google/cloud/sensors/datafusion.py | 3 +-
airflow/providers/google/cloud/sensors/dataproc.py | 5 +-
airflow/providers/google/cloud/sensors/tasks.py | 3 +-
.../providers/google/cloud/sensors/workflows.py | 3 +-
.../google/cloud/transfers/bigquery_to_gcs.py | 3 +-
.../google/cloud/transfers/gcs_to_bigquery.py | 3 +-
.../providers/google/cloud/triggers/bigquery.py | 6 +-
.../providers/google/cloud/triggers/cloud_sql.py | 3 +-
.../triggers/cloud_storage_transfer_service.py | 5 +-
.../providers/google/cloud/triggers/dataproc.py | 3 +-
.../providers/google/cloud/triggers/mlengine.py | 3 +-
.../providers/google/common/hooks/base_google.py | 4 +-
.../providers/google/firebase/hooks/firestore.py | 4 +-
.../google/firebase/operators/firestore.py | 3 +-
62 files changed, 425 insertions(+), 347 deletions(-)
```
Re: [PR] Apply PROVIDE_PROJECT_ID mypy workaround across Google provider [airflow]
potiuk merged PR #39129: URL: https://github.com/apache/airflow/pull/39129
Re: [PR] Apply PROVIDE_PROJECT_ID mypy workaround across Google provider [airflow]
potiuk commented on PR #39129: URL: https://github.com/apache/airflow/pull/39129#issuecomment-2066457221

Random/flaky failure. Merging. All looks good. In the future (provided we keep the approach) it should not cause unexpected mypy issues.
Re: [PR] OpenAI Chat & Assistant hook functions [airflow]
Lee-W merged PR #38736: URL: https://github.com/apache/airflow/pull/38736
(airflow) branch main updated (a6f612d899 -> 2674a69780)
This is an automated email from the ASF dual-hosted git repository.

weilee pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git

from a6f612d899 Rename model `ImportError` to `ParseImportError` for avoid shadowing with builtin exception (#39116)
add 2674a69780 OpenAI Chat & Assistant hook functions (#38736)

No new revisions were added by this update.

Summary of changes:

```
airflow/providers/openai/hooks/openai.py    | 176 -
airflow/providers/openai/provider.yaml      | 2 +-
generated/provider_dependencies.json        | 2 +-
tests/providers/openai/hooks/test_openai.py | 233 
4 files changed, 410 insertions(+), 3 deletions(-)
```
Re: [PR] Starts execution directly from triggerer without going to worker [airflow]
Lee-W commented on code in PR #38674: URL: https://github.com/apache/airflow/pull/38674#discussion_r1572255021

## airflow/models/baseoperator.py:

```diff
@@ -1072,6 +1072,8 @@ def __init__(
         if SetupTeardownContext.active:
             SetupTeardownContext.update_context_map(self)
+        self._start_trigger: BaseTrigger | None = getattr(self, "_start_trigger", None)
```

Review Comment: Yep, sounds reasonable. Just updated
[PR] Improve DataprocCreateClusterOperator in Triggers for Enhanced Error Handling and Resource Cleanup [airflow]
sunank200 opened a new pull request, #39130: URL: https://github.com/apache/airflow/pull/39130

This PR introduces improvements to the `DataprocCreateClusterOperator`, specifically addressing deficiencies in handling clusters that are created in an ERROR state when operating in deferrable mode.

Previously, when the `DataprocCreateClusterOperator` was operating in deferrable mode, it did not correctly handle scenarios where clusters were initially created in an ERROR state. The lack of appropriate error handling and resource cleanup led to subsequent retries encountering these errored clusters, attempting deletion, and resulting in pipeline failures.
Re: [PR] Apply PROVIDE_PROJECT_ID mypy workaround across Google provider [airflow]
potiuk commented on PR #39129: URL: https://github.com/apache/airflow/pull/39129#issuecomment-2066346998

cc: @moiseenkov
Re: [PR] Apply PROVIDE_PROJECT_ID mypy workaround across Google provider [airflow]
potiuk commented on PR #39129: URL: https://github.com/apache/airflow/pull/39129#issuecomment-2066344960

cc: @VladaZakharova
Re: [I] Run the same trigger in 2+ triggerer in the same time. [airflow]
MaksYermak commented on issue #37180: URL: https://github.com/apache/airflow/issues/37180#issuecomment-2066343207

Hello Team, I have checked this bug one more time on Airflow 2.6.3 with different `apache-airflow-providers-cncf-kubernetes` versions. As a result, I can say that this bug reproduces only when the `apache-airflow-providers-cncf-kubernetes` package version is less than `8.0.0`. Starting from version `8.0.0` the bug is no longer reproducible. In my opinion this bug was fixed in `apache-airflow-providers-cncf-kubernetes==8.0.0`. I think that happened because the community refactored deferrable mode for `KubernetesPodOperator` in that version of the package.
[PR] Apply PROVIDE_PROJECT_ID mypy workaround across Google provider [airflow]
potiuk opened a new pull request, #39129: URL: https://github.com/apache/airflow/pull/39129

There is a simple workaround, implemented several years ago, for the Google provider's `project_id` default value being PROVIDE_PROJECT_ID, which satisfies mypy checks for `project_id` being set. The way `fallback_to_default_project_id` works is that across all the providers the `project_id` is actually set, even though technically its default value is None. This is a similar typing workaround to the one we use for NEW_SESSION in the core of Airflow.

The workaround has not been applied consistently across all the Google provider code, and occasionally it causes mypy to complain when a newer version of a Google library introduces stricter type checking and expects the `project_id` to be set. This PR applies the workaround across all the Google provider code.

This is, generally speaking, a no-op: nothing changes, except mypy being aware that the `project_id` is actually going to be set even if it is technically set to None.
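For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of the sentinel trick described above. The `PROVIDE_PROJECT_ID = cast(str, None)` idiom mirrors the approach the PR describes, but the decorator body and the default-project lookup below are simplified stand-ins, not the Google provider's actual implementation:

```python
from functools import wraps
from typing import cast

# The sentinel: at runtime it is really None, but mypy sees a `str`,
# so call sites type-check even when they omit project_id.
PROVIDE_PROJECT_ID: str = cast(str, None)

DEFAULT_PROJECT_ID = "my-default-project"  # hypothetical default from the connection


def fallback_to_default_project_id(func):
    """Substitute the default project when the sentinel/None is passed (sketch)."""

    @wraps(func)
    def wrapper(project_id: str = PROVIDE_PROJECT_ID, **kwargs):
        if project_id is None:  # the sentinel really is None at runtime
            project_id = DEFAULT_PROJECT_ID
        return func(project_id=project_id, **kwargs)

    return wrapper


@fallback_to_default_project_id
def get_dataset(project_id: str = PROVIDE_PROJECT_ID) -> str:
    # By the time the body runs, project_id is guaranteed to be set.
    return f"dataset in {project_id}"
```

The net effect is exactly what the PR description says: nothing changes at runtime, but the type checker no longer needs `Optional[str]` plumbing at every call site.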
Re: [PR] Starts execution directly from triggerer without going to worker [airflow]
uranusjr commented on code in PR #38674: URL: https://github.com/apache/airflow/pull/38674#discussion_r1572203632

## airflow/models/baseoperator.py:

```diff
@@ -1072,6 +1072,8 @@ def __init__(
         if SetupTeardownContext.active:
             SetupTeardownContext.update_context_map(self)
+        self._start_trigger: BaseTrigger | None = getattr(self, "_start_trigger", None)
```

Review Comment: Probably easier to provide the default class-wide

```python
class BaseOperator(...):
    _start_trigger: BaseTrigger | None = None
    _next_method: str | None = None
```
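The class-wide default suggested in the review thread can be sketched in isolation. The class names below are stand-ins (the real `BaseOperator` and `BaseTrigger` live in Airflow's codebase); the point is only the mechanics: a plain class attribute gives every instance a `None` default without any `getattr` dance in `__init__`, and subclasses opt in by overriding the attribute.

```python
from typing import Optional


class BaseTrigger:
    """Stand-in for airflow.triggers.base.BaseTrigger; only the name matters here."""


class BaseOperator:
    # Class-wide defaults, as suggested in the review: instances that never
    # set these attributes transparently fall back to the class values.
    _start_trigger: Optional[BaseTrigger] = None
    _next_method: Optional[str] = None


class StartFromTriggerOperator(BaseOperator):
    # A subclass that should start execution from the triggerer simply
    # overrides the class attributes (hypothetical example).
    _start_trigger = BaseTrigger()
    _next_method = "execute_complete"
```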
Re: [PR] Replace dill package to use cloudpickle [airflow]
moiseenkov commented on PR #38531: URL: https://github.com/apache/airflow/pull/38531#issuecomment-2066306300

Thank you!
Re: [I] Fix all deprecations for SQLAlchemy 2.0 [airflow]
Taragolis commented on issue #28723: URL: https://github.com/apache/airflow/issues/28723#issuecomment-2066283481

I've added collection of these warnings during the CI step; however, our warning collection system only collects warnings while tests run, and errors might happen during initial configuration, so there may be other incompatibilities. I plan to extend our plugin to collect warnings in the other steps as well, but that requires some time. I have added all currently found warnings to the issue details, so feel free to fix them.
Re: [I] log_id field is missing from log lines (ES remote logging) [airflow]
marcomancuso commented on issue #10406: URL: https://github.com/apache/airflow/issues/10406#issuecomment-2066246589

Ah, thanks for that @eyalzek. I did some digging around the `parser` and my understanding was that in this particular case, it was removing the key `log` and expanding its JSON content as keys of the record itself: https://docs.fluentd.org/filter/parser#remove_key_name_field
Re: [PR] Replace dill package to use cloudpickle [airflow]
potiuk commented on PR #38531: URL: https://github.com/apache/airflow/pull/38531#issuecomment-2066247355

Yes. PythonOperator and related operators are part of the Airflow core.
Re: [I] log_id field is missing from log lines (ES remote logging) [airflow]
eyalzek commented on issue #10406: URL: https://github.com/apache/airflow/issues/10406#issuecomment-2066214663

@marcomancuso it's been a while since I've used this, but AFAICR the first filter parses the JSON record and removes the `kubernetes` prefix, then the second filter handles the record to add the `log_id` field if missing.
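The "add `log_id` if missing" step the thread discusses boils down to assembling an identifier from fields already present on the log record. This is a sketch of that mechanism, not the fluentd filter itself; the template shown is an assumption based on commonly documented defaults, and the `[elasticsearch] log_id_template` setting in your `airflow.cfg` is the authoritative format:

```python
# Reconstruct a missing log_id from the fields of a parsed log record.
# The default template here is an assumption; substitute your configured
# log_id_template if it differs.
def build_log_id(record: dict, template: str = "{dag_id}-{task_id}-{run_id}-{try_number}") -> str:
    # str.format ignores extra keys, so the full record can be passed as-is.
    return template.format(**record)


record = {
    "dag_id": "tutorial_dag",
    "task_id": "load",
    "run_id": "manual__2024-04-18T12:03:17",
    "try_number": 1,
    "message": "task started",  # extra fields are harmless
}
```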
Re: [PR] Starts execution directly from triggerer without going to worker [airflow]
eladkal commented on PR #38674: URL: https://github.com/apache/airflow/pull/38674#issuecomment-2066199167

> For some operators such as S3KeySensor with deferrable set to True, running execute method in workers might not be necessary.

How would that work if the user uses `soft_fail`? For example, in case of a timeout?
Re: [I] Incorrect encoding of the plus sign in URLs during user authentication [airflow]
boring-cyborg[bot] commented on issue #39128: URL: https://github.com/apache/airflow/issues/39128#issuecomment-2066195514

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue please do so, no need to wait for approval.
[I] Incorrect encoding of the plus sign in URLs during user authentication [airflow]
e-galan opened a new issue, #39128: URL: https://github.com/apache/airflow/issues/39128

### Apache Airflow version

main (development)

### If "Other Airflow 2 version" selected, which one?

_No response_

### What happened?

A URL encoding error of the `+` sign happens during user authentication, if a user tries to access the dag run page link (`/dags//grid?dag_run_id==graph`) while not being logged in. After several redirects the user gets forwarded to the dag page instead (`/dags//grid?=graph`).

### What you think should happen instead?

The `+` sign in the URL should be properly encoded and decoded during the redirects. The user should be redirected to the proper dag run page (`/dags//grid?dag_run_id==graph`).

### How to reproduce

Here is what happens to the URL during the redirects. For this example I am using the latest development version of Airflow (*2.10.0.dev0*) and *tutorial_dag* from *airflow/example_dags*.

1. We make a **GET** request to `/dags/tutorial_dag/grid?dag_run_id=manual__2024-04-18T12%3A03%3A17.474232%2B00%3A00=graph` while being logged out from Airflow.
2. The Airflow server checks authentication and returns a response **302 FOUND** with one of the headers being `Location: /login/?next=http%3A%2F%2F127.0.0.1%3A28080%2Fdags%2Ftutorial_dag%2Fgrid%3Fdag_run_id%3Dmanual__2024-04-18T12%253A03%253A17.474232%2B00%253A00%26tab%3Dgraph`. Here we can see that during encoding all non-ASCII chars get prefixed with **%25** (%) except for a single **%2B** (+) which stays unchanged. This is the first error.
3. Then the **GET** request is redirected to `/login/?next=http%3A%2F%2F127.0.0.1%3A28080%2Fdags%2Ftutorial_dag%2Fgrid%3Fdag_run_id%3Dmanual__2024-04-18T12%253A03%253A17.474232%2B00%253A00%26tab%3Dgraph`, where we can see the same un-encoded **%2B** in the **next** parameter.
4. Once the user logs in, a **POST** request with the same **next** parameter value is made: `/login/?next=http%3A%2F%2F127.0.0.1%3A28080%2Fdags%2Ftutorial_dag%2Fgrid%3Fdag_run_id%3Dmanual__2024-04-18T12%253A03%253A17.474232%2B00%253A00%26tab%3Dgraph`.
5. Again we get a **302 FOUND** response, but its headers now contain `Location: /dags/tutorial_dag/grid?dag_run_id=manual__2024-04-18T12%3A03%3A17.474232+00%3A00=graph`. Here we can see that during decoding the **%25** prefixes are removed, but the lonely **%2B** gets decoded into a **+** sign. This is the second error.
6. Then we make another request to Airflow, this time using the URL from step 5. Airflow can't find the badly encoded dag_run_id and instead sends us to the dag page (`/dags/tutorial_dag/grid?tab=graph`).

### Operating System

Debian GNU/Linux rodete

### Versions of Apache Airflow Providers

_No response_

### Deployment

Other Docker-based deployment

### Deployment details

_No response_

### Anything else?

Happens every time. The same problem exists in Apache Airflow 2.7.3. In Apache Airflow 2.6.3 and 2.5.3 it results in an uncaught server error, although it should be mentioned that the dag run page URL looks different for these versions (`/dags//graph?run_id=0_date=` vs `/dags//grid?dag_run_id==graph`). A kind of similar problem was mentioned for Apache Airflow 2.5.3 in #30898, where *Werkzeug 2.3.0* was pointed to as the culprit. However, I have tested the bug in environments with *Werkzeug 2.2.3*, and it was still reproduced, which leads me to believe that it is probably not the reason in my case.

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
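The inconsistent-escaping mechanism the issue describes can be reproduced in isolation with the standard library. This is a sketch of the mechanism only, not Airflow's actual redirect code: the point is that a `%2B` left un-escaped while every other `%` becomes `%25` cannot survive the later single-pass decode.

```python
from urllib.parse import quote, unquote

# The query fragment from the issue, with '+' already percent-encoded as %2B.
inner = "dag_run_id=manual__2024-04-18T12%3A03%3A17.474232%2B00%3A00&tab=graph"

# Correct round trip: escape every '%' to '%25' when embedding in `next=`.
good = quote(inner, safe="")

# The buggy behavior described in step 2: every '%' re-escaped except the
# one in '%2B', which is left as-is.
bad = good.replace("%252B", "%2B")

# One decode later, '%2B' collapses to a literal '+' (step 5), which a
# subsequent form-style parse treats as a space -- the dag_run_id no
# longer matches anything in the database.
```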
Re: [PR] Replace dill package to use cloudpickle [airflow]
moiseenkov commented on PR #38531: URL: https://github.com/apache/airflow/pull/38531#issuecomment-2066188528

@potiuk, hi! Thanks for reviewing and merging this. We have only one short question, if you don't mind: because these changes are not contained in any provider package, does that mean they will be released with Airflow itself?
Re: [PR] [DRAFT] Switch to Connexion 3 framework [airflow]
potiuk commented on code in PR #39055: URL: https://github.com/apache/airflow/pull/39055#discussion_r1567021838

## airflow/cli/commands/webserver_command.py:

```diff
@@ -356,11 +356,11 @@ def webserver(args):
     print(f"Starting the web server on port {args.port} and host {args.hostname}.")
     app = create_app(testing=conf.getboolean("core", "unit_test_mode"))
     app.run(
-        debug=True,
-        use_reloader=not app.config["TESTING"],
+        log_level="debug",
```

Review Comment: TODO: We need to figure out our reloading here
Re: [PR] Rename model `ImportError` to `ParseImportError` for avoid shadowing with builtin exception [airflow]
hussein-awala merged PR #39116: URL: https://github.com/apache/airflow/pull/39116
(airflow) branch main updated: Rename model `ImportError` to `ParseImportError` for avoid shadowing with builtin exception (#39116)
This is an automated email from the ASF dual-hosted git repository.

husseinawala pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/airflow.git

The following commit(s) were added to refs/heads/main by this push:
new a6f612d899 Rename model `ImportError` to `ParseImportError` for avoid shadowing with builtin exception (#39116)

a6f612d899 is described below

commit a6f612d89942f141eb8a7affbbea46d033923d1a
Author: Andrey Anshin
AuthorDate: Fri Apr 19 12:51:21 2024 +0400

Rename model `ImportError` to `ParseImportError` for avoid shadowing with builtin exception (#39116)

---

```
airflow/api/common/delete_dag.py                 | 5 ++--
.../endpoints/import_error_endpoint.py           | 12 
airflow/api_connexion/schemas/error_schema.py    | 6 ++--
airflow/dag_processing/manager.py                | 8 +++---
airflow/dag_processing/processor.py              | 13 +
airflow/models/__init__.py                       | 30 ++--
airflow/models/errors.py                         | 19 -
airflow/www/utils.py                             | 4 +--
airflow/www/views.py                             | 7 +++--
pyproject.toml                                   | 2 ++
.../endpoints/test_import_error_endpoint.py      | 30 ++--
tests/api_connexion/schemas/test_error_schema.py | 8 +++---
tests/api_experimental/common/test_delete_dag.py | 2 +-
tests/dag_processing/test_job_runner.py          | 11 
tests/dag_processing/test_processor.py           | 33 +++---
tests/test_utils/db.py                           | 4 +--
16 files changed, 116 insertions(+), 78 deletions(-)
```

```diff
diff --git a/airflow/api/common/delete_dag.py b/airflow/api/common/delete_dag.py
index 4452e2726f..1cf7ffec8b 100644
--- a/airflow/api/common/delete_dag.py
+++ b/airflow/api/common/delete_dag.py
@@ -27,6 +27,7 @@ from sqlalchemy import and_, delete, or_, select
 from airflow import models
 from airflow.exceptions import AirflowException, DagNotFound
 from airflow.models import DagModel, TaskFail
+from airflow.models.errors import ParseImportError
 from airflow.models.serialized_dag import SerializedDagModel
 from airflow.utils.db import get_sqla_model_classes
 from airflow.utils.session import NEW_SESSION, provide_session
@@ -99,8 +100,8 @@ def delete_dag(dag_id: str, keep_records_in_log: bool = True, session: Session =
     # Delete entries in Import Errors table for a deleted DAG
     # This handles the case when the dag_id is changed in the file
     session.execute(
-        delete(models.ImportError)
-        .where(models.ImportError.filename == dag.fileloc)
+        delete(ParseImportError)
+        .where(ParseImportError.filename == dag.fileloc)
         .execution_options(synchronize_session="fetch")
     )
diff --git a/airflow/api_connexion/endpoints/import_error_endpoint.py b/airflow/api_connexion/endpoints/import_error_endpoint.py
index 274d842d18..76b706eac1 100644
--- a/airflow/api_connexion/endpoints/import_error_endpoint.py
+++ b/airflow/api_connexion/endpoints/import_error_endpoint.py
@@ -30,7 +30,7 @@ from airflow.api_connexion.schemas.error_schema import (
 )
 from airflow.auth.managers.models.resource_details import AccessView, DagDetails
 from airflow.models.dag import DagModel
-from airflow.models.errors import ImportError as ImportErrorModel
+from airflow.models.errors import ParseImportError
 from airflow.utils.session import NEW_SESSION, provide_session
 from airflow.www.extensions.init_auth_manager import get_auth_manager
@@ -45,7 +45,7 @@ if TYPE_CHECKING:
 @provide_session
 def get_import_error(*, import_error_id: int, session: Session = NEW_SESSION) -> APIResponse:
     """Get an import error."""
-    error = session.get(ImportErrorModel, import_error_id)
+    error = session.get(ParseImportError, import_error_id)
     if error is None:
         raise NotFound(
             "Import error not found",
@@ -85,8 +85,8 @@ def get_import_errors(
     """Get all import errors."""
     to_replace = {"import_error_id": "id"}
     allowed_sort_attrs = ["import_error_id", "timestamp", "filename"]
-    count_query = select(func.count(ImportErrorModel.id))
-    query = select(ImportErrorModel)
+    count_query = select(func.count(ParseImportError.id))
+    query = select(ParseImportError)
     query = apply_sorting(query, order_by, to_replace, allowed_sort_attrs)
     can_read_all_dags = get_auth_manager().is_authorized_dag(method="GET")
@@ -95,8 +95,8 @@ def get_import_errors(
     # if the user doesn't have access to all DAGs, only display errors from visible DAGs
     readable_dag_ids = security.get_readable_dags()
     dagfiles_stmt = select(DagModel.fileloc).distinct().where(DagModel.dag_id.in_(readable_dag_ids))
-    query = query.where(ImportErrorModel.filename.in_(dagfiles_stmt))
-    count_query =
```
Re: [PR] Starts execution directly from triggerer without going to worker [airflow]
Lee-W commented on code in PR #38674: URL: https://github.com/apache/airflow/pull/38674#discussion_r1572048134

## airflow/decorators/base.py:

@@ -509,6 +509,9 @@ def _expand(self, expand_input: ExpandInput, *, strict: bool) -> XComArg:
             # task's expand() contribute to the op_kwargs operator argument, not
             # the operator arguments themselves, and should expand against it.
             expand_input_attr="op_kwargs_expand_input",
+            # start with trigger is not supported in dynamic task mapping

Review Comment: I thought it might not make sense for dynamic task mapping, but on second thought, it might actually make sense. Let me see how I can do that. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
(airflow) branch main updated (fd8a05739f -> eee17f0a26)
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git

    from fd8a05739f Bugfix to correct GCSHook being called even when not required with BeamRunPythonPipelineOperator (#38716)
     add eee17f0a26 fix: BigQueryCheckOperator skipped value and error check in deferrable mode (#38408)

No new revisions were added by this update.

Summary of changes:
 .../providers/google/cloud/operators/bigquery.py   | 30 ++---
 .../google/cloud/operators/test_bigquery.py        | 77 ++
 2 files changed, 86 insertions(+), 21 deletions(-)
Re: [PR] fix: BigQueryCheckOperator skip value and error check in deferrable mode [airflow]
potiuk merged PR #38408: URL: https://github.com/apache/airflow/pull/38408 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [I] BigQueryCheckOperator skips value and error check when in deferrable mode but not deferred [airflow]
potiuk closed issue #37885: BigQueryCheckOperator skips value and error check when in deferrable mode but not deferred URL: https://github.com/apache/airflow/issues/37885 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
(airflow) branch main updated: Bugfix to correct GCSHook being called even when not required with BeamRunPythonPipelineOperator (#38716)
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git

The following commit(s) were added to refs/heads/main by this push:
     new fd8a05739f Bugfix to correct GCSHook being called even when not required with BeamRunPythonPipelineOperator (#38716)
fd8a05739f is described below

commit fd8a05739f945643b5023db15d51a97459109a02
Author: Zack Strathe <59071005+zstra...@users.noreply.github.com>
AuthorDate: Fri Apr 19 03:40:19 2024 -0500

    Bugfix to correct GCSHook being called even when not required with BeamRunPythonPipelineOperator (#38716)

    * Bugfix to correct GCSHook being called even when not required with BeamRunPythonPipelineOperator

    * remove unneccary check for GCSHook and add unit test for BeamRunPythonPipelineOperator to ensure that GCSHook is only called when necessary

    * Split out unit tests for TestBeamRunPythonPipelineOperator with GCSHook 'gs://' arg prefixes

    * Fix formatting
---
 airflow/providers/apache/beam/operators/beam.py    |  3 +-
 tests/providers/apache/beam/operators/test_beam.py | 73 ++
 2 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/airflow/providers/apache/beam/operators/beam.py b/airflow/providers/apache/beam/operators/beam.py
index e88923bc05..62f650f19a 100644
--- a/airflow/providers/apache/beam/operators/beam.py
+++ b/airflow/providers/apache/beam/operators/beam.py
@@ -364,11 +364,12 @@ class BeamRunPythonPipelineOperator(BeamBasePipelineOperator):
     def execute_sync(self, context: Context):
         with ExitStack() as exit_stack:
-            gcs_hook = GCSHook(gcp_conn_id=self.gcp_conn_id)
             if self.py_file.lower().startswith("gs://"):
+                gcs_hook = GCSHook(gcp_conn_id=self.gcp_conn_id)
                 tmp_gcs_file = exit_stack.enter_context(gcs_hook.provide_file(object_url=self.py_file))
                 self.py_file = tmp_gcs_file.name
             if self.snake_case_pipeline_options.get("requirements_file", "").startswith("gs://"):
+                gcs_hook = GCSHook(gcp_conn_id=self.gcp_conn_id)
                 tmp_req_file = exit_stack.enter_context(
                     gcs_hook.provide_file(object_url=self.snake_case_pipeline_options["requirements_file"])
                 )
diff --git a/tests/providers/apache/beam/operators/test_beam.py b/tests/providers/apache/beam/operators/test_beam.py
index f7ca9649fb..a6a4c31c77 100644
--- a/tests/providers/apache/beam/operators/test_beam.py
+++ b/tests/providers/apache/beam/operators/test_beam.py
@@ -256,6 +256,79 @@ class TestBeamRunPythonPipelineOperator:
         op.on_kill()
         dataflow_cancel_job.assert_not_called()
+    @mock.patch(BEAM_OPERATOR_PATH.format("BeamHook"))
+    @mock.patch(BEAM_OPERATOR_PATH.format("GCSHook"))
+    def test_execute_gcs_hook_not_called_without_gs_prefix(self, mock_gcs_hook, _):
+        """
+        Test that execute method does not call GCSHook when neither py_file nor requirements_file
+        starts with 'gs://'. (i.e., running pipeline entirely locally)
+        """
+        local_test_op_args = {
+            "task_id": TASK_ID,
+            "py_file": "local_file.py",
+            "py_options": ["-m"],
+            "default_pipeline_options": {
+                "project": TEST_PROJECT,
+                "requirements_file": "local_requirements.txt",
+            },
+            "pipeline_options": {"output": "test_local/output", "labels": {"foo": "bar"}},
+        }
+
+        op = BeamRunPythonPipelineOperator(**local_test_op_args)
+        context_mock = mock.MagicMock()
+
+        op.execute(context_mock)
+        mock_gcs_hook.assert_not_called()
+
+    @mock.patch(BEAM_OPERATOR_PATH.format("BeamHook"))
+    @mock.patch(BEAM_OPERATOR_PATH.format("GCSHook"))
+    def test_execute_gcs_hook_called_with_gs_prefix_py_file(self, mock_gcs_hook, _):
+        """
+        Test that execute method calls GCSHook when only 'py_file' starts with 'gs://'.
+        """
+        local_test_op_args = {
+            "task_id": TASK_ID,
+            "py_file": "gs://gcs_file.py",
+            "py_options": ["-m"],
+            "default_pipeline_options": {
+                "project": TEST_PROJECT,
+                "requirements_file": "local_requirements.txt",
+            },
+            "pipeline_options": {"output": "test_local/output", "labels": {"foo": "bar"}},
+        }
+        op = BeamRunPythonPipelineOperator(**local_test_op_args)
+        context_mock = mock.MagicMock()
+
+        op.execute(context_mock)
+        mock_gcs_hook.assert_called_once()
+
+    @mock.patch(BEAM_OPERATOR_PATH.format("BeamHook"))
+    @mock.patch(BEAM_OPERATOR_PATH.format("GCSHook"))
+    def test_execute_gcs_hook_called_with_gs_prefix_pipeline_requirements(self, mock_gcs_hook, _):
+        """
+        Test that execute method calls GCSHook when only pipeline_options
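The pattern behind this bugfix can be sketched in isolation: construct the (credential-requiring) hook lazily, only when an input actually lives on GCS, so purely local runs never touch GCP. Everything here is a hypothetical stand-in, not the real `GCSHook` or Beam operator.

```python
# Sketch (assumptions: FakeGCSHook and resolve_gcs_hook are hypothetical
# stand-ins; the real fix moves GCSHook construction into each gs:// branch
# of BeamRunPythonPipelineOperator.execute_sync).

class FakeGCSHook:
    """Counts constructions, playing the role of mock_gcs_hook.assert_not_called()."""

    instances = 0

    def __init__(self):
        FakeGCSHook.instances += 1


def resolve_gcs_hook(py_file: str, requirements_file: str):
    """Build a hook only if some input is a gs:// URL; return None for local-only runs."""
    hook = None
    if py_file.lower().startswith("gs://"):
        hook = FakeGCSHook()  # needed to download py_file
    if requirements_file.startswith("gs://"):
        hook = FakeGCSHook()  # needed to download requirements_file
    return hook


print(resolve_gcs_hook("local_file.py", "local_requirements.txt"))  # None
print(FakeGCSHook.instances)  # 0 — no hook built for local-only inputs
```

As in the merged tests, a local-only invocation leaves the hook class untouched, while a `gs://` prefix on either argument constructs it.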
Re: [PR] Bugfix to correct GCSHook being called even when not required with BeamRunPythonPipelineOperator [airflow]
boring-cyborg[bot] commented on PR #38716: URL: https://github.com/apache/airflow/pull/38716#issuecomment-2066107379 Awesome work, congrats on your first merged pull request! You are invited to check our [Issue Tracker](https://github.com/apache/airflow/issues) for additional contributions. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [I] BeamRunPythonPipelineOperator: cannot run local input files with DirectRunner [airflow]
potiuk closed issue #38713: BeamRunPythonPipelineOperator: cannot run local input files with DirectRunner URL: https://github.com/apache/airflow/issues/38713 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Bugfix to correct GCSHook being called even when not required with BeamRunPythonPipelineOperator [airflow]
potiuk merged PR #38716: URL: https://github.com/apache/airflow/pull/38716 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Bugfix to correct GCSHook being called even when not required with BeamRunPythonPipelineOperator [airflow]
potiuk commented on PR #38716: URL: https://github.com/apache/airflow/pull/38716#issuecomment-2066107147 Nothing to be sorry about :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Capture warnings during collect DAGs [airflow]
Taragolis commented on code in PR #39109: URL: https://github.com/apache/airflow/pull/39109#discussion_r1572032793

## airflow/models/dagbag.py:

@@ -67,13 +68,23 @@ class FileLoadStat(NamedTuple):
-    """Information about single file."""
+    """
+    Information about single file.
+
+    :param file: Loaded file.
+    :param duration: Time spent on process file.
+    :param dag_num: Total number of DAGs loaded in this file.
+    :param task_num: Total number of Tasks loaded in this file.
+    :param dags: DAGs names loaded in this file.
+    :param warnings: Total warnings captured during the process file.

Review Comment: Why not? Let me also change the name of the variable. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
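The docstring style under review works because `NamedTuple` fields cannot carry individual docstrings, so the `:param:` entries live in the class docstring. A trimmed-down sketch (assumption: a reduced field set, and `warning_num` as a plausible renamed variable per the comment above, not the final PR code):

```python
# Sketch of documenting NamedTuple fields via :param: entries in the
# class docstring, as proposed for FileLoadStat (field set reduced here;
# `warning_num` is an assumed name).
from typing import NamedTuple


class FileLoadStat(NamedTuple):
    """
    Information about a single parsed file.

    :param file: Loaded file path.
    :param duration: Time spent processing the file, in seconds.
    :param warning_num: Total warnings captured while processing the file.
    """

    file: str
    duration: float
    warning_num: int


stat = FileLoadStat("dags/example.py", 0.42, 2)
print(stat.warning_num)  # 2
```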
Re: [I] log_id field is missing from log lines (ES remote logging) [airflow]
marcomancuso commented on issue #10406: URL: https://github.com/apache/airflow/issues/10406#issuecomment-2066088920 Is the second filter `` correct? Should it not be ``? What's the purpose of the first filter anyway? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[I] let BigQueryGetData operator take a list of fields for the "order by" clause [airflow]
lopezvit opened a new issue, #39127: URL: https://github.com/apache/airflow/issues/39127

### Description
Sometimes you just need the latest value of a field (e.g. `updatedAt`) so further operators downstream can use that value in their own queries. This can be done with `SELECT MAX(updatedAt) [...]`, but that would require a lot of rewriting, when simply adding a new param `ordering_fields` could solve the same issue, allowing a query similar to:

`SELECT updatedAt FROM [...] ORDER BY updatedAt DESC LIMIT 1`

Example implementation (not tested; note that in BigQuery standard SQL the `ORDER BY` clause must precede `LIMIT`):

    def generate_query(self, hook: BigQueryHook) -> str:
        """Generate a SELECT query for the given dataset and table ID."""
        query = "select "
        if self.selected_fields:
            query += self.selected_fields
        else:
            query += "*"
        query += (
            f" from `{self.table_project_id or hook.project_id}.{self.dataset_id}"
            f".{self.table_id}`"
        )
        if self.ordering_fields:
            query += f" ORDER BY {self.ordering_fields}"
        query += f" limit {self.max_results}"
        return query

### Use case/motivation
The `BigQueryGetData` operator should have one more param, `ordering_fields`, so the generated query also includes an `ORDER BY` clause.

### Related issues
https://github.com/apache/airflow/issues/24460

### Are you willing to submit a PR?
- [ ] Yes I am willing to submit a PR!

### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Starts execution directly from triggerer without going to worker [airflow]
Lee-W commented on code in PR #38674: URL: https://github.com/apache/airflow/pull/38674#discussion_r1571990864

## airflow/models/baseoperator.py:

@@ -1072,6 +1072,8 @@ def __init__(
         if SetupTeardownContext.active:
             SetupTeardownContext.update_context_map(self)
+        self._start_trigger: BaseTrigger | None = getattr(self, "_start_trigger", None)

Review Comment:
```python
class Test2Operator(BaseOperator):
    def __init__(self, *args, **kwargs):
        self._start_trigger = trigger
        self._next_method = "execute_complete"
        super().__init__(*args, **kwargs)

    def execute_complete(self):
        pass
```
I'm unsure whether users will do something like this or whether we should allow them to. I just want to prevent users from doing it in a non-ideal (or even wrong?) way. I can change it back to `= _start_trigger` if we do not want users to do something like this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
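The `getattr(self, "_start_trigger", None)` initialization being discussed can be illustrated with simplified stand-ins (assumption: `FakeBaseOperator` and the subclasses below are hypothetical, not Airflow classes): a subclass may set `_start_trigger` *before* calling `super().__init__()`, and the `getattr` pattern preserves that value instead of clobbering it with `None`.

```python
# Sketch of why BaseOperator.__init__ uses getattr with a default rather
# than a plain assignment. All classes are hypothetical stand-ins.

class FakeBaseOperator:
    def __init__(self):
        # Keep a value a subclass may have set before calling super().__init__();
        # otherwise default to None.
        self._start_trigger = getattr(self, "_start_trigger", None)

    @property
    def start_trigger(self):
        return self._start_trigger


class PlainOperator(FakeBaseOperator):
    """Never sets _start_trigger; falls back to None."""


class EagerTriggerOperator(FakeBaseOperator):
    """Sets _start_trigger before delegating to the base __init__."""

    def __init__(self):
        self._start_trigger = "my-trigger"  # hypothetical trigger placeholder
        super().__init__()


print(PlainOperator().start_trigger)         # None
print(EagerTriggerOperator().start_trigger)  # my-trigger
```

A plain `self._start_trigger = None` in the base `__init__` would silently discard the subclass's value, which is exactly the user pattern the comment worries about.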
Re: [PR] Fix SFTPSensor.newer_than not working with jinja logical ds/ts expression [airflow]
grrolland commented on code in PR #39056: URL: https://github.com/apache/airflow/pull/39056#discussion_r1571992193

## tests/providers/sftp/sensors/test_sftp.py:

@@ -97,11 +97,12 @@ def test_file_not_new_enough(self, sftp_hook_mock):
         sftp_hook_mock.return_value.get_mod_time.assert_called_once_with("/path/to/file/1970-01-01.txt")
         assert not output
+    @pytest.mark.parametrize("newer_than", (datetime(2020, 1, 2), "2020-01-02 00:00:00+00:00"))

Review Comment: Renamed the test and added test case parameters. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Fix SFTPSensor.newer_than not working with jinja logical ds/ts expression [airflow]
grrolland commented on code in PR #39056: URL: https://github.com/apache/airflow/pull/39056#discussion_r1571991298

## airflow/providers/sftp/sensors/sftp.py:

@@ -21,6 +21,7 @@
 import os
 from datetime import datetime, timedelta
+from dateutil.parser import parse as parse_date

Review Comment: Use [airflow.utils.timezone.parse](https://github.com/apache/airflow/blob/0a74928894fb57b0160208262ccacad12da23fc7/airflow/utils/timezone.py#L194-L204) now. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
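The problem the PR fixes is that a Jinja-rendered `newer_than` arrives as a string, not a `datetime`. A minimal sketch of the normalization step (assumptions: `normalize_newer_than` is a hypothetical helper, not the actual `SFTPSensor` code; the stdlib `datetime.fromisoformat` stands in for `airflow.utils.timezone.parse` to stay self-contained):

```python
# Sketch: accept newer_than as either a datetime or a rendered string such as
# "2020-01-02 00:00:00+00:00", and return an aware datetime (UTC assumed for
# naive input — an assumption for this sketch, not Airflow's documented rule).
from datetime import datetime, timezone


def normalize_newer_than(newer_than: "datetime | str") -> datetime:
    if isinstance(newer_than, str):
        # fromisoformat handles both the space separator and the +00:00 offset
        newer_than = datetime.fromisoformat(newer_than)
    if newer_than.tzinfo is None:
        newer_than = newer_than.replace(tzinfo=timezone.utc)
    return newer_than


print(normalize_newer_than("2020-01-02 00:00:00+00:00"))
print(normalize_newer_than(datetime(2020, 1, 2)))
```

Both parametrized test inputs from the PR's test (`datetime(2020, 1, 2)` and `"2020-01-02 00:00:00+00:00"`) normalize to the same comparable instant.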
Re: [PR] Starts execution directly from triggerer without going to worker [airflow]
Lee-W commented on code in PR #38674: URL: https://github.com/apache/airflow/pull/38674#discussion_r1571987975

## airflow/models/baseoperator.py:

@@ -1691,6 +1693,16 @@ def inherits_from_empty_operator(self):
         # of its subclasses (which don't inherit from anything but BaseOperator).
         return getattr(self, "_is_empty", False)
+    @property
+    def start_trigger(self) -> BaseTrigger | None:
+        """Trigger when deferring task."""
+        return getattr(self, "_start_trigger", None)

Review Comment: Yep, this is no longer needed after the initialization has been moved to `__init__`. I'll update it. Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Starts execution directly from triggerer without going to worker [airflow]
Lee-W commented on code in PR #38674: URL: https://github.com/apache/airflow/pull/38674#discussion_r1571987070

## airflow/models/baseoperator.py:

@@ -1691,6 +1693,16 @@ def inherits_from_empty_operator(self):
         # of its subclasses (which don't inherit from anything but BaseOperator).
         return getattr(self, "_is_empty", False)
+    @property
+    def start_trigger(self) -> BaseTrigger | None:
+        """Trigger when deferring task."""
+        return getattr(self, "_start_trigger", None)
+
+    @property
+    def next_method(self) -> str | None:
+        """Method to execute after finish deferring."""
+        return getattr(self, "_next_method", None)

Review Comment: This is not yet set in `__init__`, but yep, this could be moved to `__init__` and changed into `return self._next_method` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [Core] Add deprecation module with decorator and custom warnings [airflow]
kacpermuda commented on PR #36952: URL: https://github.com/apache/airflow/pull/36952#issuecomment-2066025589 Adding another example of deprecations from Ray: https://github.com/ray-project/ray/blob/master/python/ray/_private/utils.py#L1090 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [I] Support Generic OIDC Providers [airflow]
potiuk closed issue #39124: Support Generic OIDC Providers URL: https://github.com/apache/airflow/issues/39124 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [I] Support Generic OIDC Providers [airflow]
potiuk commented on issue #39124: URL: https://github.com/apache/airflow/issues/39124#issuecomment-2066019243

As explained in the other tickets - this is a request to Flask AppBuilder, not to Airflow. Airflow uses Flask AppBuilder to implement those features: https://github.com/dpgaspar/Flask-AppBuilder . Airflow periodically syncs with what gets implemented in FAB, so I think you can open your issue there.

It might be that in the future we will have a generic AuthManager (Keycloak based) replacing the FAB Auth Manager we have today. The interface is already implemented - Airflow is looking for a contribution of an implementation (possibly based on Keycloak: https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-56+Extensible+user+management).

Converting to a Discussion if more is needed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [I] allow setting a default view for the run [airflow]
boring-cyborg[bot] commented on issue #39125: URL: https://github.com/apache/airflow/issues/39125#issuecomment-2065990182 Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[I] allow setting a default view for the run [airflow]
kiaradlf opened a new issue, #39125: URL: https://github.com/apache/airflow/issues/39125

### Description
`dag_default_view` lets one specify a preferred default view for the DAG from among its tabs. Runs similarly have a number of tabs, and I find myself continuously switching to the log tab as soon as I can.

### Use case/motivation
It would thus be nice if we similarly had a setting for specifying the default tab for runs.

### Related issues
n/a

### Are you willing to submit a PR?
- [ ] Yes I am willing to submit a PR!

### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org