[GitHub] eladkal commented on issue #4317: [AIRFLOW-2629] Change reference of hive_hooks to hive_hook everywhere
eladkal commented on issue #4317: [AIRFLOW-2629] Change reference of hive_hooks to hive_hook everywhere URL: https://github.com/apache/incubator-airflow/pull/4317#issuecomment-447622291 Hi guys, I picked it up because it was an open ticket. I see now that this task wasn't actually discussed. Is there something that needs to be done here, or shall I close this PR? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] Fokko commented on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors
Fokko commented on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors URL: https://github.com/apache/incubator-airflow/pull/3596#issuecomment-447619213 @ashb @seelmann To simplify things, I'm all in for setting the re-schedule method as the default scheduling method for sensors, and get rid of the blocking sensors. This will also enable the use of sensors on the `SequentialExecutor` (apart from the reschedule state). But more important from my perspective, it will greatly simplify the logic since we don't have to maintain two branches of the sensor execution (legacy, and re-scheduling). Curious what you guys think.
[jira] [Created] (AIRFLOW-3523) Try number displays incorrect values in the web UI
Murali Sathenapalli created AIRFLOW-3523: Summary: Try number displays incorrect values in the web UI Key: AIRFLOW-3523 URL: https://issues.apache.org/jira/browse/AIRFLOW-3523 Project: Apache Airflow Issue Type: Bug Affects Versions: 1.10.0 Environment: centos7.5/rhel7.5 Reporter: Murali Sathenapalli Attachments: airflow-bug.jpg This was confusing us a lot in our task runs: in the alerts that are sent by Airflow, a task that ran shows as 2 tries. What does "2 out of 1" mean? However, when we view it in the UI or the log file, it shows correctly as 1 try. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (AIRFLOW-3523) Try number displays incorrect values in failure alert
[ https://issues.apache.org/jira/browse/AIRFLOW-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Murali Sathenapalli updated AIRFLOW-3523:
    Summary: Try number displays incorrect values in failure alert  (was: Try number displays incorrect values in the web UI)

> Try number displays incorrect values in failure alert
>
>                 Key: AIRFLOW-3523
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3523
>             Project: Apache Airflow
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>         Environment: centos7.5/rhel7.5
>            Reporter: Murali Sathenapalli
>            Priority: Major
>         Attachments: airflow-bug.jpg
>
> This was confusing us a lot in our task runs: in the alerts that are sent by
> Airflow, a task that ran shows as 2 tries. What does "2 out of 1" mean?
> However, when we view it in the UI or the log file, it shows correctly as 1
> try.
[jira] [Commented] (AIRFLOW-2143) Try number displays incorrect values in the web UI
[ https://issues.apache.org/jira/browse/AIRFLOW-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16722385#comment-16722385 ]

Murali Sathenapalli commented on AIRFLOW-2143:

We are getting this too in Airflow 1.10.0.

> Try number displays incorrect values in the web UI
>
>                 Key: AIRFLOW-2143
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2143
>             Project: Apache Airflow
>          Issue Type: Bug
>    Affects Versions: 1.9.0
>            Reporter: James Davidheiser
>            Priority: Minor
>         Attachments: adhoc_query.png, task_instance_page.png
>
> This was confusing us a lot in our task runs - in the database, a task that
> ran is marked as 1 try. However, when we view it in the UI, it shows as 2
> tries in several places. These include:
> * Task Instance Details (i.e. https://airflow/task?execution_date=xxx&dag_id=xxx&task_id=xxx)
> * Task instance browser (/admin/taskinstance/)
> * Task Tries graph (/admin/airflow/tries)
> Notably, it is correctly shown as 1 try in the log filenames, on the log
> viewer page (admin/airflow/log?execution_date=), and some other places.
[jira] [Updated] (AIRFLOW-3500) Make task duration display user friendly
[ https://issues.apache.org/jira/browse/AIRFLOW-3500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaxil Naik updated AIRFLOW-3500: Fix Version/s: (was: 1.10.2) 2.0.0 > Make task duration display user friendly > > > Key: AIRFLOW-3500 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3500 > Project: Apache Airflow > Issue Type: Improvement > Components: DAG >Affects Versions: 1.10.1 >Reporter: Ofer Zelig >Assignee: Ofer Zelig >Priority: Major > Fix For: 2.0.0 > > > When hovering over a task (in Graph mode), the duration it took is displayed > as a plain number, which doesn't say what the number is (it's actually > seconds). > When you see something like 2716 it's impractical to know how much time it > actually took, unless you can quickly do the math in your head. > Change the display to read days/hours/minutes/seconds. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-3500) Make task duration display user friendly
[ https://issues.apache.org/jira/browse/AIRFLOW-3500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16722371#comment-16722371 ]

ASF GitHub Bot commented on AIRFLOW-3500:

kaxil closed pull request #4304: [AIRFLOW-3500] Make task duration display user friendly
URL: https://github.com/apache/incubator-airflow/pull/4304

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/www/templates/airflow/graph.html b/airflow/www/templates/airflow/graph.html
index 33cdd9737b..1fede359a6 100644
--- a/airflow/www/templates/airflow/graph.html
+++ b/airflow/www/templates/airflow/graph.html
@@ -282,13 +282,24 @@
                 tt += "Operator: " + task.task_type + "";
                 tt += "Started: " + ti.start_date + "";
                 tt += "Ended: " + ti.end_date + "";
-                tt += "Duration: " + ti.duration + "";
+                tt += "Duration: " + secondsToString(ti.duration) + "";
                 tt += "State: " + ti.state + "";
                 return tt;
               });
             });
     }

+function secondsToString(seconds) {
+    var numdays = Math.floor((seconds % 31536000) / 86400);
+    var numhours = Math.floor(((seconds % 31536000) % 86400) / 3600);
+    var numminutes = Math.floor((((seconds % 31536000) % 86400) % 3600) / 60);
+    var numseconds = Math.floor((((seconds % 31536000) % 86400) % 3600) % 60);
+    return (numdays > 0? numdays+ (numdays=== 1 ? " day ": " days "): "") +
+           (numhours > 0 ? numhours + (numhours === 1 ? " hour " : " hours ") : "") +
+           (numminutes > 0 ? numminutes + (numminutes === 1 ? " minute " : " minutes ") : "") +
+           (numseconds > 0 ? numseconds + (numseconds === 1 ? " second" : " seconds") : "");
+}
+
     function clearFocus(){
         d3.selectAll("g.node")
             .transition(duration)
diff --git a/airflow/www_rbac/static/js/datetime-utils.js b/airflow/www_rbac/static/js/datetime-utils.js
index 08e9469479..31a944f3a1 100644
--- a/airflow/www_rbac/static/js/datetime-utils.js
+++ b/airflow/www_rbac/static/js/datetime-utils.js
@@ -58,3 +58,14 @@ export const converAndFormatUTC = (datetime, tz) => {
   if (tz) dateTimeObj = dateTimeObj.tz(tz);
   return dateTimeObj.format(defaultFormatWithTZ)
 }
+
+export const secondsToString = (seconds) => {
+  let numdays= Math.floor((seconds % 31536000) / 86400);
+  let numhours = Math.floor(((seconds % 31536000) % 86400) / 3600);
+  let numminutes = Math.floor((((seconds % 31536000) % 86400) % 3600) / 60);
+  let numseconds = Math.floor((((seconds % 31536000) % 86400) % 3600) % 60);
+  return (numdays > 0? numdays+ (numdays=== 1 ? " day ": " days "): "") +
+         (numhours > 0 ? numhours + (numhours === 1 ? " hour " : " hours ") : "") +
+         (numminutes > 0 ? numminutes + (numminutes === 1 ? " minute " : " minutes ") : "") +
+         (numseconds > 0 ? numseconds + (numseconds === 1 ? " second" : " seconds") : "");
+}
\ No newline at end of file
diff --git a/airflow/www_rbac/static/js/graph.js b/airflow/www_rbac/static/js/graph.js
index 3ccc72cc6f..689623beab 100644
--- a/airflow/www_rbac/static/js/graph.js
+++ b/airflow/www_rbac/static/js/graph.js
@@ -17,7 +17,7 @@
  * under the License.
  */

-import { generateTooltipDateTime, converAndFormatUTC } from './datetime-utils';
+import { generateTooltipDateTime, converAndFormatUTC, secondsToString } from './datetime-utils';

 // Assigning css classes based on state to nodes
 // Initiating the tooltips
@@ -38,7 +38,7 @@ function update_nodes_states(task_instances) {
         tt += "run_id: " + ti.run_id + "";
       }
       tt += "Operator: " + task.task_type + "";
-      tt += "Duration: " + ti.duration + "";
+      tt += "Duration: " + secondsToString(ti.duration) + "";
       tt += "State: " + ti.state + "";
       tt += generateTooltipDateTime(ti.start_date, ti.end_date, dagTZ); // dagTZ has been defined in dag.html
       return tt;

> Make task duration display user friendly
>
>                 Key: AIRFLOW-3500
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3500
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: DAG
>    Affects Versions: 1.10.1
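For readers who want to check the arithmetic in the patch above, here is a minimal Python sketch of the same days/hours/minutes/seconds breakdown. It is not part of PR #4304: the name `seconds_to_string` and the use of `divmod` are assumptions made for illustration, and the output is not byte-for-byte identical to the JavaScript (this version joins parts with single spaces and leaves no trailing space).

```python
def seconds_to_string(seconds):
    """Format a duration in seconds as days/hours/minutes/seconds.

    Hypothetical Python port of the secondsToString helper in the diff;
    like the patch, it wraps the input at one 365-day year (31536000 s).
    """
    total = int(seconds) % 31536000
    days, rem = divmod(total, 86400)     # 86400 seconds per day
    hours, rem = divmod(rem, 3600)       # 3600 seconds per hour
    minutes, secs = divmod(rem, 60)
    parts = []
    for value, unit in ((days, "day"), (hours, "hour"),
                        (minutes, "minute"), (secs, "second")):
        if value > 0:
            # Pluralise every unit except when the count is exactly 1.
            parts.append("{} {}{}".format(value, unit, "" if value == 1 else "s"))
    return " ".join(parts)


print(seconds_to_string(2716))  # the example value from the Jira issue
```

As in the patch, a duration of zero seconds produces an empty string, which may be worth handling separately in a tooltip.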
[jira] [Commented] (AIRFLOW-3327) BiqQuery job checking doesn't include location, which api requires outside US/EU
[ https://issues.apache.org/jira/browse/AIRFLOW-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16722366#comment-16722366 ]

ASF GitHub Bot commented on AIRFLOW-3327:

kaxil opened a new pull request #4324: [AIRFLOW-3327] Add location in BigQueryHook
URL: https://github.com/apache/incubator-airflow/pull/4324

Make sure you have checked _all_ steps below.

### Jira

- [x] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
  - https://issues.apache.org/jira/browse/AIRFLOW-3327

### Description

- [x] Here are some details about my PR, including screenshots of any UI changes:

The geographic location of the job is required except for US and EU. See details at https://cloud.google.com/bigquery/docs/locations#specifying_your_location.

### Tests

- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:

### Commits

- [x] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
  1. Subject is separated from body by a blank line
  1. Subject is limited to 50 characters (not including Jira issue reference)
  1. Subject does not end with a period
  1. Subject uses the imperative mood ("add", not "adding")
  1. Body wraps at 72 characters
  1. Body explains "what" and "why", not "how"

### Documentation

- [x] In case of new functionality, my PR adds documentation that describes how to use it.
  - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.
  - All the public functions and the classes in the PR contain docstrings that explain what it does

### Code Quality

- [x] Passes `flake8`

> BiqQuery job checking doesn't include location, which api requires outside US/EU
>
>                 Key: AIRFLOW-3327
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3327
>             Project: Apache Airflow
>          Issue Type: Bug
>            Reporter: Daniel Swiegers
>            Assignee: Kaxil Naik
>            Priority: Minor
>              Labels: google-cloud-bigquery
>             Fix For: 1.10.2
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We use this API but don't set or pass through the geographic location,
> which is required in areas other than US and EU.
> This can be seen in contrib/hooks/big_query_hook.py poll_job_complete
> [https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get]
> |The geographic location of the job. Required except for US and EU. See
> details at
> https://cloud.google.com/bigquery/docs/locations#specifying_your_location.|
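To make the issue concrete, the sketch below builds the request URL for the BigQuery `jobs.get` REST endpoint with and without the optional `location` query parameter. It is an illustration only and is not the code from PR #4324: the helper name `build_jobs_get_url`, the `BASE_URL` constant, and the sample project, job, and region values are all assumptions, and no request is actually sent.

```python
from urllib.parse import quote, urlencode

# Assumed base URL of the BigQuery v2 REST API (illustrative constant).
BASE_URL = "https://bigquery.googleapis.com/bigquery/v2"


def build_jobs_get_url(project_id, job_id, location=None):
    """Return the jobs.get URL, adding ?location=... only when one is given.

    Hypothetical helper: jobs created outside the US and EU multi-regions
    need the location to be passed when they are looked up.
    """
    url = "{}/projects/{}/jobs/{}".format(
        BASE_URL, quote(project_id, safe=""), quote(job_id, safe=""))
    if location:
        url += "?" + urlencode({"location": location})
    return url


# Without a location (sufficient for US/EU jobs) and with an explicit region.
print(build_jobs_get_url("my-project", "job_123"))
print(build_jobs_get_url("my-project", "job_123", location="asia-northeast1"))
```

Polling code that omits the parameter would work for US and EU jobs and fail to find jobs elsewhere, which matches the behaviour described in the issue.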
[GitHub] kaxil commented on issue #4324: [AIRFLOW-3327] Add location in BigQueryHook
kaxil commented on issue #4324: [AIRFLOW-3327] Add location in BigQueryHook URL: https://github.com/apache/incubator-airflow/pull/4324#issuecomment-447614094 cc @fenglu-g
[GitHub] oferze commented on issue #4304: [AIRFLOW-3500] Make task duration display user friendly
oferze commented on issue #4304: [AIRFLOW-3500] Make task duration display user friendly URL: https://github.com/apache/incubator-airflow/pull/4304#issuecomment-447608798 @kaxil I originally ticked the box in the PR template: "My PR does not need testing for this extremely good reason: JS display change". I didn't see any tests for the other functions in `datetime-utils.js`. If there were any, I believe it wouldn't be too hard to add another one for the new function.
[GitHub] kaxil closed pull request #4323: [AIRFLOW-3447] Add 2 options for ts_nodash Macro
kaxil closed pull request #4323: [AIRFLOW-3447] Add 2 options for ts_nodash Macro
URL: https://github.com/apache/incubator-airflow/pull/4323

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/UPDATING.md b/UPDATING.md
index 814e2e107d..986d3a23c1 100644
--- a/UPDATING.md
+++ b/UPDATING.md
@@ -24,6 +24,13 @@ assists users migrating to a new version.

 ## Airflow Master

+### Modification to `ts_nodash` macro
+`ts_nodash` previously contained TimeZone information alongwith execution date. For Example: `20150101T000000+0000`. This is not user-friendly for file or folder names which was a popular use case for `ts_nodash`. Hence this behavior has been changed and using `ts_nodash` will no longer contain TimeZone information, restoring the pre-1.10 behavior of this macro. And a new macro `ts_nodash_with_tz` has been added which can be used to get a string with execution date and timezone info without dashes.
+
+Examples:
+  * `ts_nodash`: `20150101T000000`
+  * `ts_nodash_with_tz`: `20150101T000000+0000`
+
 ### New `dag_processor_manager_log_location` config option

 The DAG parsing manager log now by default will be log into a file, where its location is
diff --git a/airflow/models.py b/airflow/models.py
index 06a79b8af5..1089970b65 100755
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -1869,7 +1869,8 @@ def get_template_context(self, session=None):
             prev_ds_nodash = prev_ds.replace('-', '')

         ds_nodash = ds.replace('-', '')
-        ts_nodash = ts.replace('-', '').replace(':', '')
+        ts_nodash = self.execution_date.strftime('%Y%m%dT%H%M%S')
+        ts_nodash_with_tz = ts.replace('-', '').replace(':', '')
         yesterday_ds_nodash = yesterday_ds.replace('-', '')
         tomorrow_ds_nodash = tomorrow_ds.replace('-', '')

@@ -1939,6 +1940,7 @@ def __repr__(self):
             'ds_nodash': ds_nodash,
             'ts': ts,
             'ts_nodash': ts_nodash,
+            'ts_nodash_with_tz': ts_nodash_with_tz,
             'yesterday_ds': yesterday_ds,
             'yesterday_ds_nodash': yesterday_ds_nodash,
             'tomorrow_ds': tomorrow_ds,
diff --git a/docs/code.rst b/docs/code.rst
index 996f702a0e..61414ecbd6 100644
--- a/docs/code.rst
+++ b/docs/code.rst
@@ -269,7 +269,7 @@ Sensors

 .. _macros:

 Macros
----------
+------

 Here's a list of variables and macros that can be used in templates

@@ -284,19 +284,20 @@ Variable                            Description
 ``{{ ds }}``                        the execution date as ``YYYY-MM-DD``
 ``{{ ds_nodash }}``                 the execution date as ``YYYYMMDD``
 ``{{ prev_ds }}``                   the previous execution date as ``YYYY-MM-DD``
-                                    if ``{{ ds }}`` is ``2016-01-08`` and ``schedule_interval`` is ``@weekly``,
+                                    if ``{{ ds }}`` is ``2018-01-08`` and ``schedule_interval`` is ``@weekly``,
                                     ``{{ prev_ds }}`` will be ``2016-01-01``
 ``{{ prev_ds_nodash }}``            the previous execution date as ``YYYYMMDD`` if exists, else ``None`
 ``{{ next_ds }}``                   the next execution date as ``YYYY-MM-DD``
-                                    if ``{{ ds }}`` is ``2016-01-01`` and ``schedule_interval`` is ``@weekly``,
-                                    ``{{ next_ds }}`` will be ``2016-01-08``
+                                    if ``{{ ds }}`` is ``2018-01-01`` and ``schedule_interval`` is ``@weekly``,
+                                    ``{{ next_ds }}`` will be ``2018-01-08``
 ``{{ next_ds_nodash }}``            the next execution date as ``YYYYMMDD`` if exists, else ``None`
 ``{{ yesterday_ds }}``              the day before the execution date as ``YYYY-MM-DD``
 ``{{ yesterday_ds_nodash }}``       the day before the execution date as ``YYYYMMDD``
 ``{{ tomorrow_ds }}``               the day after the execution date as ``YYYY-MM-DD``
 ``{{ tomorrow_ds_nodash }}``        the day after the execution date as ``YYYYMMDD``
-``{{ ts }}``                        same as ``execution_date.isoformat()``
-``{{ ts_nodash }}``                 same as ``ts`` without ``-`` and ``:``
+``{{ ts }}``                        same as ``execution_date.isoformat()``. Example: ``2018-01-01T00:00:00+00:00``
+``{{ ts_nodash }}``                 same as ``ts`` without ``-``, ``:`` and TimeZone info. Example: ``20180101T000000``
+``{{ ts_nodash_with_tz }}``         same as ``ts`` without ``-`` and ``:``. Example: ``20180101T000000+0000``
 ``{{ execution_date }}``            the execution_date, (datetime.datetime)
 ``{{ prev_execution_date }}``       the previous execution date (if available) (datetime.datetime)
 ``{{ next_execution_date }}``       the next execution dat
[jira] [Resolved] (AIRFLOW-3447) Intended usage of ts_nodash macro broken with migration to new time system.
[ https://issues.apache.org/jira/browse/AIRFLOW-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kaxil Naik resolved AIRFLOW-3447.
    Resolution: Fixed

Resolved by https://github.com/apache/incubator-airflow/pull/4323

> Intended usage of ts_nodash macro broken with migration to new time system.
>
>                 Key: AIRFLOW-3447
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3447
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: core
>            Reporter: Luka Draksler
>            Assignee: Kaxil Naik
>            Priority: Minor
>              Labels: easyfix
>             Fix For: 1.10.2
>
> Migration to timezone-aware times broke the intended usage of the ts_nodash macro.
> ts_nodash is used in certain placeholders to create different names (table
> names, cluster names...). As such, it is alphanumeric only; it contains no
> characters that could be deemed illegal by various naming restrictions.
> Migration to the new time system changed that.
> As an example, this would be returned currently:
> {{20181205T125657.169324+0000}}
> {{before:}}
> {{20181204T030000}}
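The difference between the two macros after the fix can be reproduced outside Airflow. The sketch below applies the two expressions from the merged `models.py` change to a fixed timestamp. It assumes a standard-library `datetime` in UTC as a stand-in for the task's `execution_date`, so treat it as an approximation of the template context rather than Airflow's own code.

```python
from datetime import datetime, timezone

# Stand-in for the task instance's execution_date (assumption: UTC, stdlib datetime).
execution_date = datetime(2018, 1, 1, tzinfo=timezone.utc)

# `ts` is the ISO 8601 form, which includes the UTC offset.
ts = execution_date.isoformat()

# New behaviour: format the date directly, so no offset (and no "+") appears.
ts_nodash = execution_date.strftime('%Y%m%dT%H%M%S')

# Old behaviour, kept under a new name: strip "-" and ":" from `ts`.
ts_nodash_with_tz = ts.replace('-', '').replace(':', '')

print(ts)
print(ts_nodash)
print(ts_nodash_with_tz)
```

Only `ts_nodash` stays purely alphanumeric here, which is the property the issue reporter relied on for table and cluster names.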
[GitHub] codecov-io edited a comment on issue #4323: [AIRFLOW-3447] Add 2 options for ts_nodash Macro
codecov-io edited a comment on issue #4323: [AIRFLOW-3447] Add 2 options for ts_nodash Macro
URL: https://github.com/apache/incubator-airflow/pull/4323#issuecomment-447584517

# [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=h1) Report
> Merging [#4323](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/9d87552cebe513fcea5f3d93e18dff2df06a66a7?src=pr&el=desc) will **increase** coverage by `<.01%`.
> The diff coverage is `100%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/4323/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=tree)

```diff
@@            Coverage Diff            @@
##           master   #4323      +/-   ##
=========================================
+ Coverage   78.09%   78.1%   +<.01%
=========================================
  Files         201     201
  Lines       16470   16471       +1
=========================================
+ Hits        12862   12864       +2
+ Misses       3608    3607       -1
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/4323/diff?src=pr&el=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=) | `92.35% <100%> (+0.04%)` | :arrow_up: |

--

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=footer). Last update [9d87552...554a187](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[jira] [Assigned] (AIRFLOW-3522) Support Slack Attachments for SlackWebhookHook
[ https://issues.apache.org/jira/browse/AIRFLOW-3522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Holtzscher reassigned AIRFLOW-3522: --- Assignee: Michael Holtzscher > Support Slack Attachments for SlackWebhookHook > -- > > Key: AIRFLOW-3522 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3522 > Project: Apache Airflow > Issue Type: Improvement > Components: contrib >Reporter: Michael Holtzscher >Assignee: Michael Holtzscher >Priority: Minor > > The SlackWebhookHook and SlackWebhookOperator do not support sending > attachments. Adding support for attachments would allow for a much more full > featured Slack messaging experience. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] kaxil commented on a change in pull request #4323: [AIRFLOW-3447] Add 2 options for ts_nodash Macro
kaxil commented on a change in pull request #4323: [AIRFLOW-3447] Add 2 options for ts_nodash Macro URL: https://github.com/apache/incubator-airflow/pull/4323#discussion_r241962520 ## File path: UPDATING.md ## @@ -24,6 +24,13 @@ assists users migrating to a new version. ## Airflow Master +### Modification to `ts_nodash` macro +`ts_nodash` previously contained TimeZone information alongwith execution date. For Example: `20150101T00+`. This is not user-friendly for file or folder names which was a popular use case for `ts_nodash`. Hence this behavior has been changed and using `ts_nodash` will no longer contain TimeZone information. And a new macro `ts_nodash_with_tz` has been added which can be used to get a string with execution date and timezone info without dashes. Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
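For illustration, the `ts_nodash` change described in the UPDATING.md note above can be sketched in plain Python. This is a minimal sketch and not the code from the PR: the function names mirror the macro names, but the function bodies and the sample date are assumptions made for the example.

```python
from datetime import datetime, timezone


def ts_nodash(execution_date):
    # Assumed new behaviour: date and time only, no timezone suffix.
    return execution_date.strftime("%Y%m%dT%H%M%S")


def ts_nodash_with_tz(execution_date):
    # Assumed behaviour of the new macro: the ISO 8601 string with
    # dashes and colons removed, timezone offset kept.
    return execution_date.isoformat().replace("-", "").replace(":", "")


# Sample execution date chosen for the example (UTC).
execution_date = datetime(2015, 1, 1, tzinfo=timezone.utc)
print(ts_nodash(execution_date))
print(ts_nodash_with_tz(execution_date))
```

One caveat of this naive string replacement: for a negative UTC offset the minus sign of the offset would be removed along with the date dashes.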
[GitHub] seelmann edited a comment on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors
seelmann edited a comment on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors URL: https://github.com/apache/incubator-airflow/pull/3596#issuecomment-447601423 Another thing that should be improved: The start date, end date, and duration in the tooltip when hovering over the task only shows the values of the last run, should be from first to last. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] seelmann commented on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors
seelmann commented on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors URL: https://github.com/apache/incubator-airflow/pull/3596#issuecomment-447601423 Another thing that should be improved: The start date, end date, and duration in the tooltip when hovering over that task only shows the values of the last run, should be from first to last. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] seelmann commented on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors
seelmann commented on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors URL: https://github.com/apache/incubator-airflow/pull/3596#issuecomment-447601182 Screenshots are attached at the Jira. The Gantt view should show a white bar when the task is rescheduled and thus in "None" state, see https://issues.apache.org/jira/browse/AIRFLOW-2747#comment-16616842 In a previous version all rescheduled executions were visible, but that ended up in too many small bars: https://issues.apache.org/jira/browse/AIRFLOW-2747#comment-16541539 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (AIRFLOW-3001) Accumulative tis slow allocation of new schedule
[ https://issues.apache.org/jira/browse/AIRFLOW-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ash Berlin-Taylor updated AIRFLOW-3001:
---
    Fix Version/s: (was: 2.0.0)
                   1.10.2

> Accumulative tis slow allocation of new schedule
> ---
>
> Key: AIRFLOW-3001
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3001
> Project: Apache Airflow
> Issue Type: Improvement
> Components: scheduler
> Affects Versions: 1.10.0
> Reporter: Jason Kim
> Assignee: Jason Kim
> Priority: Major
> Fix For: 1.10.2
>
>
> I created a very long-running schedule with a short interval (2~3 years at a 10-minute interval).
> So the DAG gets bigger and bigger as scheduling goes on.
> Finally, at a critical point (I don't know exactly when it is), the allocation of new task_instances gets slow and then almost stops.
> I found that at this point many slow query logs had occurred. (I was using MySQL as the meta repository.)
> Queries like this:
> "SELECT * FROM task_instance WHERE dag_id = 'some_dag_id' AND execution_date = '2018-09-01 00:00:00'"
> I could resolve this issue by adding a new index consisting of dag_id and execution_date.
> So I would like the 1.10 branch to be modified to create the task_instance table with this index.
> Thanks.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
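The effect of the index proposed in AIRFLOW-3001 above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the reporter used MySQL, the table here is a simplified stand-in with a handful of columns, and the index name `ti_dag_date` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-in for the task_instance table (assumed columns).
conn.execute(
    "CREATE TABLE task_instance ("
    " task_id TEXT, dag_id TEXT, execution_date TEXT, state TEXT)"
)

# Composite index on the two columns named in the report.
# The index name is hypothetical.
conn.execute(
    "CREATE INDEX ti_dag_date ON task_instance (dag_id, execution_date)"
)

# Ask SQLite how it would run the slow query from the report.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM task_instance WHERE dag_id = ? AND execution_date = ?",
    ("some_dag_id", "2018-09-01 00:00:00"),
).fetchall()

# The last column of each plan row is the human-readable detail.
plan_text = " ".join(row[-1] for row in plan)
print(plan_text)
```

With the index in place the plan reports an index search rather than a full table scan. The exact wording of the plan differs between SQLite versions, and MySQL's `EXPLAIN` output looks different again.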
[jira] [Updated] (AIRFLOW-2747) Explicit re-schedule of sensors
[ https://issues.apache.org/jira/browse/AIRFLOW-2747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ash Berlin-Taylor updated AIRFLOW-2747:
---
    Fix Version/s: (was: 2.0.0)
                   1.10.2

> Explicit re-schedule of sensors
> ---
>
> Key: AIRFLOW-2747
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2747
> Project: Apache Airflow
> Issue Type: Improvement
> Components: core, operators
> Affects Versions: 1.9.0, 1.10.0
> Reporter: Stefan Seelmann
> Assignee: Stefan Seelmann
> Priority: Major
> Fix For: 1.10.2
>
> Attachments: Screenshot_2018-07-12_14-10-24.png, Screenshot_2018-09-16_20-09-28.png, Screenshot_2018-09-16_20-19-23.png, google_apis-23_r01.zip
>
>
> By default sensors block a worker and just sleep between pokes. This is very inefficient, especially when there are many long-running sensors.
> There is a hacky workaround by setting a small timeout value and a high retry number. But that has drawbacks:
> * Errors raised by sensors are hidden and the sensor retries too often
> * The sensor is retried in a fixed time interval (with optional exponential backoff)
> * There are many attempts and many log files are generated
> I'd like to propose an explicit reschedule mechanism:
> * A new "reschedule" flag for sensors, if set to True it will raise an AirflowRescheduleException that causes a reschedule.
> * AirflowRescheduleException contains the (earliest) re-schedule date.
> * Reschedule requests are recorded in new `task_reschedule` table and visualized in the Gantt view.
> * A new TI dependency that checks if a sensor task is ready to be re-scheduled.
> Advantages:
> * This change is backward compatible. Existing sensors behave like before. But it's possible to set the "reschedule" flag.
> * The poke_interval, timeout, and soft_fail parameters are still respected and used to calculate the next schedule time.
> * Custom sensor implementations can even define the next sensible schedule date by raising AirflowRescheduleException themselves.
> * Existing TimeSensor and TimeDeltaSensor can also be changed to be rescheduled when the time is reached.
> * This mechanism can also be used by non-sensor operators (but then the new ReadyToRescheduleDep has to be added to deps or BaseOperator).
> Design decisions and caveats:
> * When handling AirflowRescheduleException the `try_number` is decremented. That means that subsequent runs use the same try number and write to the same log file.
> * Sensor TI dependency check now depends on `task_reschedule` table. However only the BaseSensorOperator includes the new ReadyToRescheduleDep for now.
> Open questions and TODOs:
> * Should a dedicated state `UP_FOR_RESCHEDULE` be used instead of setting the state back to `NONE`? This would require more changes in scheduler code and especially in the UI, but the state of a task would be more explicit and more transparent to the user.
> * Add example/test for a non-sensor operator
> * Document the new feature

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
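The reschedule mechanism proposed in AIRFLOW-2747 above can be sketched as a small standalone simulation. This is a toy model and not Airflow code: `RescheduleRequest` is a hypothetical stand-in for `AirflowRescheduleException`, and `run_sensor_once` and `simulate` are invented for illustration. It only shows the idea that a failed poke hands back an earliest re-run date instead of sleeping on a worker.

```python
from datetime import datetime, timedelta


class RescheduleRequest(Exception):
    """Hypothetical stand-in for AirflowRescheduleException."""

    def __init__(self, reschedule_date):
        super().__init__(reschedule_date)
        self.reschedule_date = reschedule_date


def run_sensor_once(poke, now, poke_interval, reschedule):
    """Run a single poke and return True if the condition is met.

    In reschedule mode a failed poke raises RescheduleRequest, so the
    worker slot is released instead of being held while sleeping.
    """
    if poke(now):
        return True
    if reschedule:
        raise RescheduleRequest(now + poke_interval)
    return False  # blocking mode: the caller would sleep and poke again


def simulate(target, start, poke_interval):
    """Drive the sensor like a scheduler might, recording each request."""
    now, requests = start, []
    while True:
        try:
            run_sensor_once(lambda t: t >= target, now, poke_interval, True)
            return now, requests
        except RescheduleRequest as req:
            requests.append(req.reschedule_date)
            # The "scheduler" re-runs the task at the earliest allowed date.
            now = req.reschedule_date


start = datetime(2018, 12, 15, 18, 50)
finished_at, requests = simulate(
    start + timedelta(minutes=12), start, timedelta(minutes=5)
)
print(finished_at, len(requests))
```

In the proposal each such request would be recorded in the `task_reschedule` table, and a TI dependency would hold the task back until the recorded date is reached. The list `requests` plays that role here.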
[jira] [Resolved] (AIRFLOW-3392) Add index on dag_id in sla_miss table
[ https://issues.apache.org/jira/browse/AIRFLOW-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ash Berlin-Taylor resolved AIRFLOW-3392. Resolution: Fixed Fix Version/s: (was: 2.0.0) 1.10.2 > Add index on dag_id in sla_miss table > - > > Key: AIRFLOW-3392 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3392 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > Fix For: 1.10.2 > > > The select queries on sla_miss table produce a great % of DB traffic and thus > made the DB CPU usage unnecessarily high. It would be a low hanging fruit to > add an index and reduce the load. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (AIRFLOW-3392) Add index on dag_id in sla_miss table
[ https://issues.apache.org/jira/browse/AIRFLOW-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ash Berlin-Taylor reopened AIRFLOW-3392: Changing fix version > Add index on dag_id in sla_miss table > - > > Key: AIRFLOW-3392 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3392 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > Fix For: 1.10.2 > > > The select queries on sla_miss table produce a great % of DB traffic and thus > made the DB CPU usage unnecessarily high. It would be a low hanging fruit to > add an index and reduce the load. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] ashb commented on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors
ashb commented on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors URL: https://github.com/apache/incubator-airflow/pull/3596#issuecomment-447599864 @seelmann Thanks, that would be ace! This feature is very nice. I'm mostly concerned with the visualisation on the Graph, Tree, and Task Instance Detail pages. Gantt isn't one I look at very often, but looking at it now I don't see anything indicating retries? (That or I cherry-picked it on to 1-10-test wrong) Could you give some screenshots? As for the log? Hmmm, there are only a few executors, so updating them to be aware of it and not log something incorrect shouldn't be too much work? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (AIRFLOW-3522) Support Slack Attachments for SlackWebhookHook
Michael Holtzscher created AIRFLOW-3522: --- Summary: Support Slack Attachments for SlackWebhookHook Key: AIRFLOW-3522 URL: https://issues.apache.org/jira/browse/AIRFLOW-3522 Project: Apache Airflow Issue Type: Improvement Components: contrib Reporter: Michael Holtzscher The SlackWebhookHook and SlackWebhookOperator do not support sending attachments. Adding support for attachments would allow for a much more full featured Slack messaging experience. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] dmvieira edited a comment on issue #2748: Fixing temp path
dmvieira edited a comment on issue #2748: Fixing temp path URL: https://github.com/apache/incubator-airflow/pull/2748#issuecomment-447598599 We can live without it This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] dmvieira closed pull request #2748: Fixing temp path
dmvieira closed pull request #2748: Fixing temp path URL: https://github.com/apache/incubator-airflow/pull/2748 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/airflow/operators/bash_operator.py b/airflow/operators/bash_operator.py index a7eeb03ad6..d29aaee96b 100644 --- a/airflow/operators/bash_operator.py +++ b/airflow/operators/bash_operator.py @@ -73,8 +73,7 @@ def execute(self, context): f.write(bytes(bash_command, 'utf_8')) f.flush() -fname = f.name -script_location = tmp_dir + "/" + fname +script_location = f.name self.log.info( "Temporary script location: %s", script_location @@ -87,7 +86,7 @@ def pre_exec(): os.setsid() self.log.info("Running command: %s", bash_command) sp = Popen( -['bash', fname], +['bash', script_location], stdout=PIPE, stderr=STDOUT, cwd=tmp_dir, env=self.env, preexec_fn=pre_exec) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
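The reasoning behind the diff above can be checked with the standard library alone. This is a minimal sketch, assuming the operator creates its script with `tempfile.NamedTemporaryFile(dir=tmp_dir)`; the prefixes and variable names are made up for the example. It shows that `f.name` is already an absolute path inside `tmp_dir`, so prepending `tmp_dir` again yields a path that does not exist.

```python
import os
import tempfile

with tempfile.TemporaryDirectory(prefix="airflowtmp") as tmp_dir:
    with tempfile.NamedTemporaryFile(dir=tmp_dir, prefix="task") as f:
        f.write(b"echo hello")
        f.flush()

        # What the diff switches to: use the file's own name directly.
        script_location = f.name
        # What the old code built: the directory prepended a second time.
        old_location = tmp_dir + "/" + f.name

        name_is_absolute = os.path.isabs(script_location)
        script_exists = os.path.exists(script_location)
        script_in_tmp_dir = os.path.dirname(script_location) == tmp_dir
        old_exists = os.path.exists(old_location)

print(name_is_absolute, script_exists, script_in_tmp_dir, old_exists)
```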
[GitHub] dmvieira commented on issue #2748: Fixing temp path
dmvieira commented on issue #2748: Fixing temp path URL: https://github.com/apache/incubator-airflow/pull/2748#issuecomment-447598599 We can live with it This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] seelmann commented on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors
seelmann commented on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors URL: https://github.com/apache/incubator-airflow/pull/3596#issuecomment-447596460 @ashb I can work on those during the holidays. Regarding 1 (the "None" state): I agree it's not optimal that there is no indication about what's going on. Do you want to have a new dedicated state (in state.py)? Or is it just about the visualization in the tree view (other views also make sense IMHO)? Adding a new state was discussed but decided against. Changing the visualization in views should be possible, in Gantt view it's already done. Regarding 2 (the success log): What do you expect should be logged instead of success? If I look into `sequential_executor.py` it always sets success or failed, depending on whether the command execution was successful or not. Should we make all executors aware of the reschedule state? PS: We run 1.10.0 with this patch successfully in production since November :) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] codecov-io edited a comment on issue #4323: [AIRFLOW-3447] Add 2 options for ts_nodash Macro
codecov-io edited a comment on issue #4323: [AIRFLOW-3447] Add 2 options for ts_nodash Macro
URL: https://github.com/apache/incubator-airflow/pull/4323#issuecomment-447584517

# [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=h1) Report
> Merging [#4323](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/9d87552cebe513fcea5f3d93e18dff2df06a66a7?src=pr&el=desc) will **increase** coverage by `<.01%`.
> The diff coverage is `100%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/4323/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=tree)

```diff
@@            Coverage Diff            @@
##           master   #4323     +/-   ##
========================================
+ Coverage   78.09%   78.09%  +<.01%
========================================
  Files         201     201
  Lines       16470   16471       +1
========================================
+ Hits        12862   12863       +1
  Misses       3608    3608
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/4323/diff?src=pr&el=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=) | `92.31% <100%> (ø)` | :arrow_up: |

------

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=footer). Last update [9d87552...c87e243](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] ashb commented on a change in pull request #4323: [AIRFLOW-3447] Add 2 options for ts_nodash Macro
ashb commented on a change in pull request #4323: [AIRFLOW-3447] Add 2 options for ts_nodash Macro URL: https://github.com/apache/incubator-airflow/pull/4323#discussion_r241958823 ## File path: UPDATING.md ## @@ -24,6 +24,13 @@ assists users migrating to a new version. ## Airflow Master +### Modification to `ts_nodash` macro +`ts_nodash` previously contained TimeZone information alongwith execution date. For Example: `20150101T00+`. This is not user-friendly for file or folder names which was a popular use case for `ts_nodash`. Hence this behavior has been changed and using `ts_nodash` will no longer contain TimeZone information. And a new macro `ts_nodash_with_tz` has been added which can be used to get a string with execution date and timezone info without dashes. Review comment: Something to say "this restores the pre-1.10 behaviour of this macro"? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Reopened] (AIRFLOW-2747) Explicit re-schedule of sensors
[ https://issues.apache.org/jira/browse/AIRFLOW-2747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ash Berlin-Taylor reopened AIRFLOW-2747:
---

Some more work needed on this issue - the logs are wrong and the visibility in the UI is now worse.

> Explicit re-schedule of sensors
> ---
>
> Key: AIRFLOW-2747
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2747
> Project: Apache Airflow
> Issue Type: Improvement
> Components: core, operators
> Affects Versions: 1.9.0, 1.10.0
> Reporter: Stefan Seelmann
> Assignee: Stefan Seelmann
> Priority: Major
> Fix For: 2.0.0
>
> Attachments: Screenshot_2018-07-12_14-10-24.png, Screenshot_2018-09-16_20-09-28.png, Screenshot_2018-09-16_20-19-23.png, google_apis-23_r01.zip
>
>
> By default sensors block a worker and just sleep between pokes. This is very inefficient, especially when there are many long-running sensors.
> There is a hacky workaround by setting a small timeout value and a high retry number. But that has drawbacks:
> * Errors raised by sensors are hidden and the sensor retries too often
> * The sensor is retried in a fixed time interval (with optional exponential backoff)
> * There are many attempts and many log files are generated
> I'd like to propose an explicit reschedule mechanism:
> * A new "reschedule" flag for sensors, if set to True it will raise an AirflowRescheduleException that causes a reschedule.
> * AirflowRescheduleException contains the (earliest) re-schedule date.
> * Reschedule requests are recorded in new `task_reschedule` table and visualized in the Gantt view.
> * A new TI dependency that checks if a sensor task is ready to be re-scheduled.
> Advantages:
> * This change is backward compatible. Existing sensors behave like before. But it's possible to set the "reschedule" flag.
> * The poke_interval, timeout, and soft_fail parameters are still respected and used to calculate the next schedule time.
> * Custom sensor implementations can even define the next sensible schedule date by raising AirflowRescheduleException themselves.
> * Existing TimeSensor and TimeDeltaSensor can also be changed to be rescheduled when the time is reached.
> * This mechanism can also be used by non-sensor operators (but then the new ReadyToRescheduleDep has to be added to deps or BaseOperator).
> Design decisions and caveats:
> * When handling AirflowRescheduleException the `try_number` is decremented. That means that subsequent runs use the same try number and write to the same log file.
> * Sensor TI dependency check now depends on `task_reschedule` table. However only the BaseSensorOperator includes the new ReadyToRescheduleDep for now.
> Open questions and TODOs:
> * Should a dedicated state `UP_FOR_RESCHEDULE` be used instead of setting the state back to `NONE`? This would require more changes in scheduler code and especially in the UI, but the state of a task would be more explicit and more transparent to the user.
> * Add example/test for a non-sensor operator
> * Document the new feature

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] ashb edited a comment on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors
ashb edited a comment on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors
URL: https://github.com/apache/incubator-airflow/pull/3596#issuecomment-447590657

I think this feature needs some work

1. the tasks that are in "reschedule" just show up as "None" in the tree view:

   ![screen shot 2018-12-15 at 18 59 07](https://user-images.githubusercontent.com/34150/50046454-8868a100-009b-11e9-8468-4e4fbae371b1.png)

   This makes it hard-to-impossible to know what is going on and they probably need to show up as a new state, or just make them show up as running, queued or something? Showing as None isn't helpful. (I even knew what was going on and was still confused!)

2. The scheduler (when using SequentialExecutor, but that isn't relevant) logs this task as Success!

```
[2018-12-15 18:59:13,635] {jobs.py:1100} INFO - 1 tasks up for execution:
[2018-12-15 18:59:13,649] {jobs.py:1135} INFO - Figuring out tasks to run in Pool(name=None) with 128 open slots and 1 task instances in queue
[2018-12-15 18:59:13,656] {jobs.py:1171} INFO - DAG hello_world has 0/16 running and queued tasks
[2018-12-15 18:59:13,656] {jobs.py:1209} INFO - Setting the follow tasks to queued state:
[2018-12-15 18:59:13,698] {jobs.py:1293} INFO - Setting the following 1 tasks to queued state:
[2018-12-15 18:59:13,699] {jobs.py:1335} INFO - Sending ('hello_world', 'wait', datetime.datetime(2018, 12, 15, 18, 50, tzinfo=), 1) to executor with priority 2 and queue default
[2018-12-15 18:59:13,701] {base_executor.py:56} INFO - Adding to queue: airflow run hello_world wait 2018-12-15T18:50:00+00:00 --local -sd /Users/ash/airflow/dags/foo.py
[2018-12-15 18:59:13,742] {sequential_executor.py:45} INFO - Executing command: airflow run hello_world wait 2018-12-15T18:50:00+00:00 --local -sd /Users/ash/airflow/dags/foo.py
[2018-12-15 18:59:15,558] {__init__.py:51} INFO - Using executor SequentialExecutor
[2018-12-15 18:59:15,755] {models.py:273} INFO - Filling up the DagBag from /Users/ash/airflow/dags/foo.py
[2018-12-15 18:59:15,833] {cli.py:530} INFO - Running on host themisto.localdomain
[2018-12-15 18:59:21,427] {jobs.py:1439} INFO - Executor reports hello_world.wait execution_date=2018-12-15 18:50:00+00:00 as success for try_number 1
```

The dag I was testing this with was:

```
from airflow import DAG
import datetime
import airflow
from airflow.operators.dummy_operator import DummyOperator
from airflow.sensors.time_sensor import TimeSensor

start = (
    airflow.utils.dates.days_ago(2)
)

dag2 = DAG('hello_world', schedule_interval='*/5 * * * *', start_date=start, catchup=False)

with dag2:
    (
        TimeSensor(
            task_id='wait',
            target_time=datetime.time(20),
            mode='reschedule',
        )
        >> DummyOperator(task_id='dummy')
    )
```

@seelmann Are you able to work on fixing these two issues?

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] kaxil commented on issue #4323: [AIRFLOW-3447] Add 2 options for ts_nodash Macro
kaxil commented on issue #4323: [AIRFLOW-3447] Add 2 options for ts_nodash Macro
URL: https://github.com/apache/incubator-airflow/pull/4323#issuecomment-447590637

I don't think it affects the log filename as the log filename uses `ts`. I will add details to UPDATING.md

```
# Log filename format
log_filename_template = {{ ti.dag_id }}/{{ ti.task_id }}/{{ ts }}/{{ try_number }}.log
log_processor_filename_template = {{ filename }}.log
dag_processor_manager_log_location = {AIRFLOW_HOME}/logs/dag_processor_manager/dag_processor_manager.log
```
[GitHub] ashb commented on issue #4276: [AIRFLOW-1552] Airflow Filter_by_owner not working with password_auth
ashb commented on issue #4276: [AIRFLOW-1552] Airflow Filter_by_owner not working with password_auth
URL: https://github.com/apache/incubator-airflow/pull/4276#issuecomment-447587937

I tried cherry-picking this into v1-10-test but it failed because the down rev specified there was not found, so we might have to cherry-pick #4235 (which in turn depends on #3596 and a few more on down). Both of those seem like things that would be good to include anyway. I'll try and see how long the chain goes.
[GitHub] codecov-io edited a comment on issue #4323: [AIRFLOW-3447] Add 2 options for ts_nodash Macro
codecov-io edited a comment on issue #4323: [AIRFLOW-3447] Add 2 options for ts_nodash Macro
URL: https://github.com/apache/incubator-airflow/pull/4323#issuecomment-447584517

# [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=h1) Report
> Merging [#4323](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/9d87552cebe513fcea5f3d93e18dff2df06a66a7?src=pr&el=desc) will **increase** coverage by `0.49%`.
> The diff coverage is `100%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/4323/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=tree)

```diff
@@            Coverage Diff             @@
##           master    #4323      +/-   ##
==========================================
+ Coverage   77.59%   78.09%   +0.49%
==========================================
  Files         201      201
  Lines       16470    16471       +1
==========================================
+ Hits        12780    12863      +83
+ Misses       3690     3608      -82
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/4323/diff?src=pr&el=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=) | `92.31% <100%> (ø)` | :arrow_up: |
| [airflow/www/views.py](https://codecov.io/gh/apache/incubator-airflow/pull/4323/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvdmlld3MucHk=) | `69.37% <0%> (+0.12%)` | :arrow_up: |
| [airflow/hooks/dbapi\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/4323/diff?src=pr&el=tree#diff-YWlyZmxvdy9ob29rcy9kYmFwaV9ob29rLnB5) | `83.06% <0%> (+1.61%)` | :arrow_up: |
| [airflow/hooks/hive\_hooks.py](https://codecov.io/gh/apache/incubator-airflow/pull/4323/diff?src=pr&el=tree#diff-YWlyZmxvdy9ob29rcy9oaXZlX2hvb2tzLnB5) | `75.26% <0%> (+1.84%)` | :arrow_up: |
| [airflow/utils/sqlalchemy.py](https://codecov.io/gh/apache/incubator-airflow/pull/4323/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9zcWxhbGNoZW15LnB5) | `78.57% <0%> (+5.71%)` | :arrow_up: |
| [airflow/operators/mysql\_to\_hive.py](https://codecov.io/gh/apache/incubator-airflow/pull/4323/diff?src=pr&el=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvbXlzcWxfdG9faGl2ZS5weQ==) | `100% <0%> (+100%)` | :arrow_up: |
| [airflow/operators/mysql\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/4323/diff?src=pr&el=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvbXlzcWxfb3BlcmF0b3IucHk=) | `100% <0%> (+100%)` | :arrow_up: |

------

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=footer). Last update [9d87552...f3d0f9e](https://codecov.io/gh/apache/incubator-airflow/pull/4323?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] stale[bot] commented on issue #2367: [AIRFLOW-1077] Warn about subdag deadlock case
stale[bot] commented on issue #2367: [AIRFLOW-1077] Warn about subdag deadlock case
URL: https://github.com/apache/incubator-airflow/pull/2367#issuecomment-447585342

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
[GitHub] ashb commented on issue #4323: [AIRFLOW-3447] Add 2 options for ts_nodash Macro
ashb commented on issue #4323: [AIRFLOW-3447] Add 2 options for ts_nodash Macro
URL: https://github.com/apache/incubator-airflow/pull/4323#issuecomment-447584724

Actually - what does this do to the filename that logs are written to?
[jira] [Commented] (AIRFLOW-3447) Intended usage of ts_nodash macro broken with migration to new time system.
[ https://issues.apache.org/jira/browse/AIRFLOW-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16722218#comment-16722218 ]

ASF GitHub Bot commented on AIRFLOW-3447:

kaxil opened a new pull request #4323: [AIRFLOW-3447] Add 2 options for ts_nodash Macro
URL: https://github.com/apache/incubator-airflow/pull/4323

Make sure you have checked _all_ steps below.

### Jira
- [x] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
  - https://issues.apache.org/jira/browse/AIRFLOW-3447

### Description
- [x] Here are some details about my PR, including screenshots of any UI changes:

Add the `ts_nodash` and `ts_nodash_with_tz` macros so that users can choose which one to use, for example in filenames. Users have previously been using `ts_nodash` for file and folder names.

### Tests
- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:

Added tests

### Commits
- [x] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
  1. Subject is separated from body by a blank line
  1. Subject is limited to 50 characters (not including Jira issue reference)
  1. Subject does not end with a period
  1. Subject uses the imperative mood ("add", not "adding")
  1. Body wraps at 72 characters
  1. Body explains "what" and "why", not "how"

### Documentation
- [x] In case of new functionality, my PR adds documentation that describes how to use it.
  - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.
  - All the public functions and the classes in the PR contain docstrings that explain what it does

Updated the docs

### Code Quality
- [x] Passes `flake8`

> Intended usage of ts_nodash macro broken with migration to new time system.
> ---
>
> Key: AIRFLOW-3447
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3447
> Project: Apache Airflow
> Issue Type: Bug
> Components: core
> Reporter: Luka Draksler
> Assignee: Kaxil Naik
> Priority: Minor
> Labels: easyfix
> Fix For: 1.10.2
>
> Migration to timezone aware times broke the intended usage of ts_nodash macro.
> ts_nodash is used in certain placeholders to create different names (table names, cluster names...). As such it is alphanumeric only, it contains no characters that could be deemed illegal by various naming restrictions.
> Migration to new time system changed that.
> As an example, this would be returned currently:
> {{20181205T125657.169324+}}
> {{before:}}
> {{20181204T03}}
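The breakage described in AIRFLOW-3447 can be illustrated with the standard library alone. This is only a sketch under stated assumptions, not Airflow's macro code: the helper names `strip_iso_separators` and `compact_timestamp` are hypothetical, and it assumes the timestamp is produced by removing `-` and `:` from an ISO 8601 string. Under that assumption, a naive datetime without microseconds stays alphanumeric, while a timezone-aware one with microseconds keeps a `.` and a `+`.

```python
from datetime import datetime, timezone


def strip_iso_separators(dt):
    """Hypothetical helper: drop '-' and ':' from the ISO 8601 form of dt."""
    return dt.isoformat().replace("-", "").replace(":", "")


def compact_timestamp(dt):
    """Hypothetical helper: format dt using digits and a literal 'T' only."""
    return dt.strftime("%Y%m%dT%H%M%S")


# A naive datetime, as used before the move to timezone-aware times.
naive = datetime(2018, 12, 4, 3, 0, 0)
# A timezone-aware datetime with microseconds.
aware = datetime(2018, 12, 5, 12, 56, 57, 169324, tzinfo=timezone.utc)

for label, value in [
    ("naive, separators stripped", strip_iso_separators(naive)),
    ("aware, separators stripped", strip_iso_separators(aware)),
    ("aware, explicit strftime", compact_timestamp(aware)),
]:
    # str.isalnum() is False as soon as '.' or '+' survives in the string.
    print(label, value, value.isalnum())
```

Formatting with an explicit `strftime` pattern is one way to keep the value safe for table, cluster, file and folder names regardless of whether the datetime carries a timezone.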
[GitHub] kaxil commented on issue #4317: [AIRFLOW-2629] Change reference of hive_hooks to hive_hook everywhere
kaxil commented on issue #4317: [AIRFLOW-2629] Change reference of hive_hooks to hive_hook everywhere
URL: https://github.com/apache/incubator-airflow/pull/4317#issuecomment-447579180

Definitely, agree with most of the points. I had raised a PR a year back :D regarding a similar thing, i.e. grouping related items for GCP Storage under one module rather than having multiple files here and there.

Regarding `contrib`, I had different plans: as hooks/operators for AWS and GCP evolve more rapidly than the core and we didn't have frequent Airflow releases, I thought it would be better to have them in separate repos, similar to Terraform Providers, but that needs a lot more work and a solid plan. But it looks like we are having more frequent releases now, and once we are a TLP we can actually make the process faster. Hence, maybe we can remove contrib completely.
[GitHub] kaxil commented on issue #4304: [AIRFLOW-3500] Make task duration display user friendly
kaxil commented on issue #4304: [AIRFLOW-3500] Make task duration display user friendly
URL: https://github.com/apache/incubator-airflow/pull/4304#issuecomment-447578458

@oferze Can you add tests to check these functions, please?
[GitHub] ashb commented on issue #4050: [AIRFLOW-3178] Don't bake ENV and _cmd into tmp config for non-sudo
ashb commented on issue #4050: [AIRFLOW-3178] Don't bake ENV and _cmd into tmp config for non-sudo
URL: https://github.com/apache/incubator-airflow/pull/4050#issuecomment-447577199

@Fokko updated (and rebased). PTAL.
[GitHub] ashb commented on issue #4317: [AIRFLOW-2629] Change reference of hive_hooks to hive_hook everywhere
ashb commented on issue #4317: [AIRFLOW-2629] Change reference of hive_hooks to hive_hook everywhere
URL: https://github.com/apache/incubator-airflow/pull/4317#issuecomment-447577053

I don't feel strongly either way about this PR. It would be possible to rename (if we decide that's what we want to do) and to have a deprecated import to make the upgrade path easier. It's easy to maintain this shim for a release or two past 2.0 (say to 2.1 or 2.2).

A random point: I think the naming of things is a little bit all over the place, now that you mention it, and I think we should have something like this (as a non-representative sample). (I am *not* saying we should make this change, it's just my immediate thought.)

* Remove the `_hook` suffix from the module name

  We don't need "hook" or "hooks" in the module, as it's already there from the parent

  ```python
  from airflow.hooks.hive import HiveServer2Hook, HiveCliHook
  ```

* Have related operators in the same module:

  Rather than

  ```python
  from airflow.operators.emr_add_step_operator import EmrAddStepOperator
  from airflow.operators.emr_create_job_flow_operator import EmrCreateJobFlowOperator
  ```

  I think we should group closely related operators together like this.

  ```python
  from airflow.operators.emr import EmrAddStepOperator, EmrCreateJobFlowOperator
  ```

* Entirely drop the `contrib` folder module.

  We as a project have a copyright grant on all code, and have to maintain it all equally. I think the contrib folder made more sense when Airflow belonged to AirB'n'B but isn't useful now.

This should probably be an AIP. 😀
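The "deprecated import" shim mentioned in the comment above could look roughly like the following. This is a minimal sketch, not code from the PR: the module names `demo_hive` and `demo_hive_hooks`, the stand-in `HiveCliHook` class and the helper `install_deprecated_alias` are all hypothetical, and it relies on module-level `__getattr__` (PEP 562, Python 3.7+). In a real package the shim would more likely be the old module file itself, emitting the warning and re-importing names from the new location.

```python
import sys
import types
import warnings


class HiveCliHook:
    """Stand-in class; the real hook is not imported in this sketch."""


# Hypothetical module names, used only for this demonstration.
NEW_NAME = "demo_hive"
OLD_NAME = "demo_hive_hooks"

# Build the "new" module in memory and register it so it can be imported.
new_module = types.ModuleType(NEW_NAME)
new_module.HiveCliHook = HiveCliHook
sys.modules[NEW_NAME] = new_module


def install_deprecated_alias(old_name, new_mod):
    """Register old_name as a module that forwards lookups to new_mod with a warning."""
    shim = types.ModuleType(old_name)

    def module_getattr(name):
        # Leave dunder lookups (e.g. __path__ probed by the import system) alone.
        if name.startswith("__"):
            raise AttributeError(name)
        warnings.warn(
            "{} is deprecated; import from {} instead".format(old_name, new_mod.__name__),
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(new_mod, name)

    # PEP 562: a module-level __getattr__ is called for attributes that are missing.
    shim.__getattr__ = module_getattr
    sys.modules[old_name] = shim
    return shim


shim = install_deprecated_alias(OLD_NAME, new_module)
```

With the alias installed, `from demo_hive_hooks import HiveCliHook` keeps resolving to the same class while emitting a `DeprecationWarning`, so existing DAG files would continue to work during the transition period.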
[jira] [Resolved] (AIRFLOW-1552) Airflow Filter_by_owner not working with password_auth
[ https://issues.apache.org/jira/browse/AIRFLOW-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ash Berlin-Taylor resolved AIRFLOW-1552.
Resolution: Fixed

> Airflow Filter_by_owner not working with password_auth
>
> Key: AIRFLOW-1552
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1552
> Project: Apache Airflow
> Issue Type: Bug
> Components: configuration
> Affects Versions: 1.8.0
> Environment: CentOS, python 2.7
> Reporter: raghu ram reddy
> Assignee: Thomas Brockmeier
> Priority: Major
> Fix For: 1.10.2
>
> Airflow Filter_by_owner parameter is not working with password_auth.
> I created a sample user using the below code from the airflow documentation and enabled password_auth.
> I'm able to log in as the user created, but by default this user is a superuser and there is no way to modify it; by default, all users created by PasswordUser are superusers.
>
> import airflow
> from airflow import models, settings
> from airflow.contrib.auth.backends.password_auth import PasswordUser
> user = PasswordUser(models.User())
> user.username = 'test1'
> user.password = 'test1'
> user.is_superuser()
> session = settings.Session()
> session.add(user)
> session.commit()
> session.close()
> exit()
[GitHub] ashb closed pull request #4276: [AIRFLOW-1552] Airflow Filter_by_owner not working with password_auth
ashb closed pull request #4276: [AIRFLOW-1552] Airflow Filter_by_owner not working with password_auth
URL: https://github.com/apache/incubator-airflow/pull/4276

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/UPDATING.md b/UPDATING.md
index 88dc78c810..814e2e107d 100644
--- a/UPDATING.md
+++ b/UPDATING.md
@@ -76,6 +76,24 @@ To delete a user:
 airflow users --delete --username jondoe
 ```

+### User model changes
+This patch changes the `User.superuser` field from a hardcoded boolean to a `Boolean()` database column. `User.superuser` will default to `False`, which means that this privilege will have to be granted manually to any users that may require it.
+
+For example, open a Python shell and
+```python
+from airflow import models, settings
+
+session = settings.Session()
+users = session.query(models.User).all()  # [admin, regular_user]
+
+users[1].superuser  # False
+
+admin = users[0]
+admin.superuser = True
+session.add(admin)
+session.commit()
+```
+
 ## Airflow 1.10.1

 ### StatsD Metrics
diff --git a/airflow/contrib/auth/backends/password_auth.py b/airflow/contrib/auth/backends/password_auth.py
index 55f5daf8fd..dcdb1d1225 100644
--- a/airflow/contrib/auth/backends/password_auth.py
+++ b/airflow/contrib/auth/backends/password_auth.py
@@ -95,8 +95,7 @@ def data_profiling(self):
         return True

     def is_superuser(self):
-        """Access all the things"""
-        return True
+        return hasattr(self, 'user') and self.user.is_superuser()


 @login_manager.user_loader
diff --git a/airflow/migrations/versions/41f5f12752f8_add_superuser_field.py b/airflow/migrations/versions/41f5f12752f8_add_superuser_field.py
new file mode 100644
index 00..6e02582b7e
--- /dev/null
+++ b/airflow/migrations/versions/41f5f12752f8_add_superuser_field.py
@@ -0,0 +1,42 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""add superuser field
+
+Revision ID: 41f5f12752f8
+Revises: 03bc53e68815
+Create Date: 2018-12-04 15:50:04.456875
+
+"""
+
+from alembic import op
+import sqlalchemy as sa
+
+
+# revision identifiers, used by Alembic.
+revision = '41f5f12752f8'
+down_revision = '03bc53e68815'
+branch_labels = None
+depends_on = None
+
+
+def upgrade():
+    op.add_column('users', sa.Column('superuser', sa.Boolean(), default=False))
+
+
+def downgrade():
+    op.drop_column('users', 'superuser')
diff --git a/airflow/models.py b/airflow/models.py
index 1bca27cbc8..a6d0ebbd73 100755
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -601,7 +601,7 @@ class User(Base):
     id = Column(Integer, primary_key=True)
     username = Column(String(ID_LEN), unique=True)
     email = Column(String(500))
-    superuser = False
+    superuser = Column(Boolean(), default=False)

     def __repr__(self):
         return self.username
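The behavioural change in the diff above (the auth wrapper delegates to the wrapped user instead of always answering `True`) can be sketched without a database. The two classes below are simplified stand-ins, not the Airflow models: the plain `superuser` attribute takes the place of the new Boolean column, and the `is_superuser()` method on the stand-in `User` is an assumption made so the delegation has something to call.

```python
class User(object):
    """Stand-in for the ORM model; `superuser` plays the role of the new column."""

    def __init__(self, username, superuser=False):
        self.username = username
        self.superuser = superuser  # defaults to False, as in the patch

    def is_superuser(self):
        return self.superuser


class PasswordUser(object):
    """Stand-in for the auth wrapper: delegates rather than hardcoding True."""

    def __init__(self, user):
        self.user = user

    def is_superuser(self):
        # Same expression as the patched backend method.
        return hasattr(self, 'user') and self.user.is_superuser()


regular = PasswordUser(User("test1"))
admin = PasswordUser(User("admin", superuser=True))
```

Under these assumptions a freshly created user is no longer a superuser unless the flag is set explicitly, which matches the upgrade note in `UPDATING.md`.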
[GitHub] ashb commented on issue #4276: [AIRFLOW-1552] Airflow Filter_by_owner not working with password_auth
ashb commented on issue #4276: [AIRFLOW-1552] Airflow Filter_by_owner not working with password_auth URL: https://github.com/apache/incubator-airflow/pull/4276#issuecomment-447576172 LGTM
[jira] [Commented] (AIRFLOW-3520) RBAC UI seems to have bug in master branch
[ https://issues.apache.org/jira/browse/AIRFLOW-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16722193#comment-16722193 ]

Ash Berlin-Taylor commented on AIRFLOW-3520:

This seems to bite a few people on master. I wonder if it's worth us putting a check somewhere in the webserver start-up code that checks if the expected compiled assets exist?

> RBAC UI seems to have bug in master branch
>
> Key: AIRFLOW-3520
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3520
> Project: Apache Airflow
> Issue Type: Bug
> Reporter: Tao Feng
> Priority: Major
> Attachments: Screen Shot 2018-12-14 at 10.58.07 PM.png
>
> !Screen Shot 2018-12-14 at 10.58.07 PM.png!
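A start-up check of the kind suggested in the comment above might be sketched as follows. Everything here is hypothetical: the function names, the idea of passing a list of expected relative paths, and the demo directory are assumptions for illustration, not the webserver's actual layout or code.

```python
import os
import tempfile


def missing_assets(static_dir, expected):
    """Return the expected asset paths (relative to static_dir) that are not files."""
    return [
        rel for rel in expected
        if not os.path.isfile(os.path.join(static_dir, rel))
    ]


def check_assets(static_dir, expected):
    """Fail fast with a readable message instead of serving a broken UI."""
    missing = missing_assets(static_dir, expected)
    if missing:
        raise RuntimeError(
            "Compiled web assets are missing: {}. "
            "Have the front-end assets been built?".format(", ".join(missing))
        )


# Small demonstration with a throwaway directory (hypothetical file names).
demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, "present.js"), "w").close()
demo_missing = missing_assets(demo_dir, ["present.js", "absent.css"])
```

Calling something like `check_assets(...)` once during start-up would turn a confusing unstyled page into an explicit error for people running from a source checkout.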
[jira] [Resolved] (AIRFLOW-3518) Toposort is very slow
[ https://issues.apache.org/jira/browse/AIRFLOW-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ash Berlin-Taylor resolved AIRFLOW-3518.
Resolution: Fixed
Fix Version/s: 1.10.2

> Toposort is very slow
>
> Key: AIRFLOW-3518
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3518
> Project: Apache Airflow
> Issue Type: New Feature
> Components: scheduler
> Reporter: Niels Zeilemaker
> Assignee: Niels Zeilemaker
> Priority: Major
> Fix For: 1.10.2
>
> At a client we've discovered that for larger DAGs toposort is very slow.
[GitHub] ashb closed pull request #4322: [AIRFLOW-3518] Performance fixes for topological_sort
ashb closed pull request #4322: [AIRFLOW-3518] Performance fixes for topological_sort
URL: https://github.com/apache/incubator-airflow/pull/4322

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/models.py b/airflow/models.py
index 74555902a5..47ebd9713c 100755
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -23,7 +23,7 @@
 from __future__ import unicode_literals

 import copy
-from collections import defaultdict, namedtuple
+from collections import defaultdict, namedtuple, OrderedDict

 from builtins import ImportError as BuiltinImportError, bytes, object, str
 from future.standard_library import install_aliases
@@ -2662,10 +2662,10 @@ def __init__(
         }

     def __eq__(self, other):
-        return (
-            type(self) == type(other) and
-            all(self.__dict__.get(c, None) == other.__dict__.get(c, None)
-                for c in self._comps))
+        if (type(self) == type(other) and
+                self.task_id == other.task_id):
+            return all(self.__dict__.get(c, None) == other.__dict__.get(c, None) for c in self._comps)
+        return False

     def __ne__(self, other):
         return not self == other
@@ -3443,12 +3443,13 @@ def __repr__(self):
         return "".format(self=self)

     def __eq__(self, other):
-        return (
-            type(self) == type(other) and
+        if (type(self) == type(other) and
+                self.dag_id == other.dag_id):
+
             # Use getattr() instead of __dict__ as __dict__ doesn't return
             # correct values for properties.
-            all(getattr(self, c, None) == getattr(other, c, None)
-                for c in self._comps))
+            return all(getattr(self, c, None) == getattr(other, c, None) for c in self._comps)
+        return False

     def __ne__(self, other):
         return not self == other
@@ -3904,8 +3905,8 @@ def topological_sort(self):
         :return: list of tasks in topological order
         """

-        # copy the the tasks so we leave it unmodified
-        graph_unsorted = self.tasks[:]
+        # convert into an OrderedDict to speedup lookup while keeping order the same
+        graph_unsorted = OrderedDict((task.task_id, task) for task in self.tasks)

         graph_sorted = []

@@ -3928,14 +3929,14 @@ def topological_sort(self):
             # not, we need to bail out as the graph therefore can't be
             # sorted.
             acyclic = False
-            for node in list(graph_unsorted):
+            for node in list(graph_unsorted.values()):
                 for edge in node.upstream_list:
-                    if edge in graph_unsorted:
+                    if edge.task_id in graph_unsorted:
                         break
                 # no edges in upstream tasks
                 else:
                     acyclic = True
-                    graph_unsorted.remove(node)
+                    del graph_unsorted[node.task_id]
                     graph_sorted.append(node)

             if not acyclic:
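The core of the speed-up in the diff above is that `edge in some_list` scans the list and calls `__eq__` on each element, while `edge.task_id in some_dict` is a single hash lookup. A stripped-down, standalone version of the same loop is sketched below; the minimal `Task` class and the use of `ValueError` for cycles are assumptions for this example and are not the Airflow classes.

```python
from collections import OrderedDict


class Task(object):
    """Minimal stand-in for an operator: an id plus its upstream tasks."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.upstream_list = []


def topological_sort(tasks):
    """Return tasks ordered so that every task comes after its upstreams."""
    # Keyed by task_id, so "is this upstream still unsorted?" is a hash lookup
    # rather than a linear scan; OrderedDict keeps the original task order.
    graph_unsorted = OrderedDict((t.task_id, t) for t in tasks)
    graph_sorted = []

    while graph_unsorted:
        acyclic = False
        for node in list(graph_unsorted.values()):
            for edge in node.upstream_list:
                if edge.task_id in graph_unsorted:
                    break
            else:
                # No unsorted upstream tasks remain, so the node can be emitted.
                acyclic = True
                del graph_unsorted[node.task_id]
                graph_sorted.append(node)

        if not acyclic:
            raise ValueError("A cyclic dependency occurred")

    return graph_sorted


a, b, c = Task("a"), Task("b"), Task("c")
b.upstream_list = [a]
c.upstream_list = [a, b]
order = [t.task_id for t in topological_sort([c, b, a])]
```

Here `order` comes out as `a`, `b`, `c`. The outer loop can still make many passes over the remaining nodes, so this removes the per-membership-test scan rather than changing the overall shape of the algorithm.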
[GitHub] ashb commented on issue #4322: [AIRFLOW-3518] Performance fixes for topological_sort
ashb commented on issue #4322: [AIRFLOW-3518] Performance fixes for topological_sort URL: https://github.com/apache/incubator-airflow/pull/4322#issuecomment-447575229 Maybe I should offer bad code reviews as a service? ;)
[GitHub] ashb commented on a change in pull request #4322: [AIRFLOW-3518] Performance fixes for topological_sort
ashb commented on a change in pull request #4322: [AIRFLOW-3518] Performance fixes for topological_sort
URL: https://github.com/apache/incubator-airflow/pull/4322#discussion_r241952239

## File path: airflow/models.py ##
@@ -2662,10 +2662,15 @@ def __init__(
         }

     def __eq__(self, other):
-        return (
-            type(self) == type(other) and
-            all(self.__dict__.get(c, None) == other.__dict__.get(c, None)
-                for c in self._comps))
+        if (type(self) == type(other) and
+                self.task_id == other.task_id):

Review comment:
Oh sorry yes! If they don't match then we return false. Right
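The point settled in the review comment above is that checking the cheap identifier first only short-circuits the comparison and does not change its result, provided the identifier is itself one of the compared fields. A small sketch with a hypothetical `Op` class and a made-up `_comps` tuple (assumed here to include `task_id`):

```python
class Op(object):
    # Hypothetical list of attributes that define equality for this sketch.
    _comps = ("task_id", "owner", "retries")

    def __init__(self, task_id, owner="airflow", retries=0):
        self.task_id = task_id
        self.owner = owner
        self.retries = retries

    def __eq__(self, other):
        # Cheap checks first: different type or id means not equal,
        # so the full attribute comparison is skipped in the common case.
        if type(self) == type(other) and self.task_id == other.task_id:
            return all(
                self.__dict__.get(c, None) == other.__dict__.get(c, None)
                for c in self._comps
            )
        return False

    def __ne__(self, other):
        return not self == other


same = Op("extract") == Op("extract")
different_id = Op("extract") == Op("load")
different_attr = Op("extract", retries=1) == Op("extract")
```

Objects that differ only in another attribute still compare unequal, because the full comparison runs whenever the ids match.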
[GitHub] kaxil commented on a change in pull request #4298: [AIRFLOW-3478] Make sure that the session is closed
kaxil commented on a change in pull request #4298: [AIRFLOW-3478] Make sure that the session is closed
URL: https://github.com/apache/incubator-airflow/pull/4298#discussion_r241947328

## File path: airflow/jobs.py ##
@@ -544,7 +544,6 @@ def __init__(
             subdir=settings.DAGS_FOLDER,
             num_runs=-1,
             processor_poll_interval=1.0,
-            run_duration=None,

Review comment:
Is this a part of this PR? This is already a part of https://github.com/apache/incubator-airflow/pull/4320
[GitHub] kaxil commented on a change in pull request #4298: [AIRFLOW-3478] Make sure that the session is closed
kaxil commented on a change in pull request #4298: [AIRFLOW-3478] Make sure that the session is closed
URL: https://github.com/apache/incubator-airflow/pull/4298#discussion_r241947339

## File path: UPDATING.md ##
@@ -24,6 +24,10 @@ assists users migrating to a new version.

 ## Airflow Master

+ Remove run_duration
+
+We should not use the `run_duration` option anymore. This used to be for restarting the scheduler from time to time, but right now the scheduler is getting more stable and therefore using this setting is considered bad and might cause an inconsistent state.
+

Review comment:
Is this a part of this PR? This is already a part of https://github.com/apache/incubator-airflow/pull/4320
[GitHub] kaxil commented on a change in pull request #4298: [AIRFLOW-3478] Make sure that the session is closed
kaxil commented on a change in pull request #4298: [AIRFLOW-3478] Make sure that the session is closed
URL: https://github.com/apache/incubator-airflow/pull/4298#discussion_r241947334

## File path: airflow/jobs.py ##
@@ -562,8 +561,6 @@ def __init__(
         :param processor_poll_interval: The number of seconds to wait between
             polls of running processors
         :type processor_poll_interval: int
-        :param run_duration: how long to run (in seconds) before exiting

Review comment:
Same. Is this a part of this PR? This is already a part of https://github.com/apache/incubator-airflow/pull/4320
[jira] [Resolved] (AIRFLOW-3521) Airflow Jira Compare script is limited to 50 items
[ https://issues.apache.org/jira/browse/AIRFLOW-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kaxil Naik resolved AIRFLOW-3521.
Resolution: Fixed
Fix Version/s: 1.10.2

> Airflow Jira Compare script is limited to 50 items
>
> Key: AIRFLOW-3521
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3521
> Project: Apache Airflow
> Issue Type: Improvement
> Reporter: Kaxil Naik
> Priority: Minor
> Fix For: 1.10.2
[jira] [Created] (AIRFLOW-3521) Airflow Jira Compare script is limited to 50 items
Kaxil Naik created AIRFLOW-3521:

Summary: Airflow Jira Compare script is limited to 50 items
Key: AIRFLOW-3521
URL: https://issues.apache.org/jira/browse/AIRFLOW-3521
Project: Apache Airflow
Issue Type: Improvement
Reporter: Kaxil Naik
[GitHub] kaxil closed pull request #4300: [AIRFLOW-3521] Fetch more than 50 items in `airflow-jira compare` script
kaxil closed pull request #4300: [AIRFLOW-3521] Fetch more than 50 items in `airflow-jira compare` script
URL: https://github.com/apache/incubator-airflow/pull/4300

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/dev/airflow-jira b/dev/airflow-jira
index 27ec991914..5625654c19 100755
--- a/dev/airflow-jira
+++ b/dev/airflow-jira
@@ -65,9 +65,22 @@ GIT_LOG_FORMAT = '%x1f'.join(GIT_LOG_FORMAT) + '%x1e'

 def get_jiras_for_version(version):
     asf_jira = jira.client.JIRA({'server': JIRA_API_BASE})
-    return asf_jira.search_issues(
-        'PROJECT={} and fixVersion={}'.format(PROJECT, version)
-    )
+    start_at = 0
+    page_size = 50
+    while True:
+        results = asf_jira.search_issues(
+            'PROJECT={} and fixVersion={}'.format(PROJECT, version),
+            maxResults=page_size,
+            startAt=start_at,
+        )
+
+        for r in results:
+            yield r
+
+        if len(results) < page_size:
+            break
+
+        start_at += page_size


 def get_merged_issues(version):
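The paging loop in the diff above can be tried out without a Jira server. In the sketch below, `fetch_all`, `fake_search` and the `ISSUES` list are hypothetical names invented for this example: the search call is replaced by a function that slices an in-memory list, which is only a rough stand-in for a `startAt`/`maxResults` style API.

```python
def fetch_all(search, page_size=50):
    """Yield every result from search(start_at, max_results), page by page."""
    start_at = 0
    while True:
        results = search(start_at, page_size)

        for r in results:
            yield r

        # A short (or empty) page means there is nothing left to fetch.
        if len(results) < page_size:
            break

        start_at += page_size


# Hypothetical backend: 120 fake issue keys served in slices.
ISSUES = ["AIRFLOW-{}".format(i) for i in range(120)]


def fake_search(start_at, max_results):
    return ISSUES[start_at:start_at + max_results]


all_issues = list(fetch_all(fake_search))
```

When the total is an exact multiple of the page size, the loop makes one extra request that returns an empty page before stopping. Because the function is a generator, callers that need `len()` or indexing have to wrap the result in `list(...)`.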
[GitHub] codecov-io commented on issue #4314: [AIRFLOW-3398] Google Cloud Spanner instance database query operator
codecov-io commented on issue #4314: [AIRFLOW-3398] Google Cloud Spanner instance database query operator
URL: https://github.com/apache/incubator-airflow/pull/4314#issuecomment-447562637

# [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4314?src=pr&el=h1) Report
> Merging [#4314](https://codecov.io/gh/apache/incubator-airflow/pull/4314?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/457ad83e4eb02b7348e5ce00292ca9bd27032651?src=pr&el=desc) will **increase** coverage by `0.07%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/4314/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4314?src=pr&el=tree)

```diff
@@            Coverage Diff             @@
##           master    #4314      +/-   ##
==========================================
+ Coverage   78.02%   78.09%   +0.07%
==========================================
  Files         201      201
  Lines       16466    16466
==========================================
+ Hits        12847    12859      +12
+ Misses       3619     3607      -12
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/4314?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [airflow/configuration.py](https://codecov.io/gh/apache/incubator-airflow/pull/4314/diff?src=pr&el=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5) | `89.24% <0%> (+4.3%)` | :arrow_up: |

--

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4314?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4314?src=pr&el=footer). Last update [457ad83...9c7ae48](https://codecov.io/gh/apache/incubator-airflow/pull/4314?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] kaxil commented on a change in pull request #4299: [AIRFLOW-3490] BigQueryHook's Ability to Patch Table/View
kaxil commented on a change in pull request #4299: [AIRFLOW-3490] BigQueryHook's Ability to Patch Table/View
URL: https://github.com/apache/incubator-airflow/pull/4299#discussion_r241947032

## File path: airflow/contrib/hooks/bigquery_hook.py ##
@@ -495,6 +495,78 @@ def create_external_table(self,
                 'BigQuery job failed. Error was: {}'.format(err.content)
             )

+    def patch_table(self,
+                    project_id,
+                    dataset_id,
+                    table_id,
+                    schema=None,
+                    view=None):
+        """
+        Patch information in an existing table/view.
+        Schema changes can only be applied to tables, not views.

Review comment:
This needs to support more than just updating the schema, as the API itself does: https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/patch.
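One way to read the review comment above is that the method should build its patch body from whichever optional fields the caller supplies, not from `schema` and `view` alone. The helper below is a hypothetical sketch of that idea, not the hook's code; the field names are taken from the BigQuery table resource as best understood and should be checked against the API reference linked in the comment.

```python
# Assumed subset of patchable table resource fields (verify against the API docs).
PATCHABLE_FIELDS = (
    "description",
    "expirationTime",
    "friendlyName",
    "labels",
    "schema",
    "timePartitioning",
    "view",
)


def build_patch_body(**fields):
    """Return a request body holding only the fields the caller actually set."""
    unknown = sorted(set(fields) - set(PATCHABLE_FIELDS))
    if unknown:
        raise ValueError("Unsupported table fields: {}".format(", ".join(unknown)))
    # A patch request only needs the keys that change, so None values are dropped.
    return {name: value for name, value in fields.items() if value is not None}


body = build_patch_body(
    description="Daily events",
    labels={"team": "data"},
    schema=None,  # not supplied, so it is left out of the body
)
```

A side effect of dropping `None` values is that this sketch cannot express "clear this field", so a real implementation would need a separate way to signal that.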
[GitHub] kaxil commented on issue #4317: [AIRFLOW-2629] Change reference of hive_hooks to hive_hook everywhere
kaxil commented on issue #4317: [AIRFLOW-2629] Change reference of hive_hooks to hive_hook everywhere
URL: https://github.com/apache/incubator-airflow/pull/4317#issuecomment-447561722

We can keep it as it is. Most of the other hook files in that directory have one hook, whereas hive has three. Accepting this will mean many more PRs changing the naming convention here and there. And I know we don't have a clear rule for `contrib` vs `core`, but hive has been a part of core, and I don't want users to have to change module imports because of this change. Maybe we can address the naming convention when we have a 2.0 branch, with a single massive PR that addresses it across the whole codebase. What do you think, @ashb?
[GitHub] kaxil commented on a change in pull request #4320: [AIRFLOW-3515] Remove the run_duration option
kaxil commented on a change in pull request #4320: [AIRFLOW-3515] Remove the run_duration option
URL: https://github.com/apache/incubator-airflow/pull/4320#discussion_r241946571

## File path: airflow/bin/cli.py ##
@@ -971,7 +971,6 @@ def scheduler(args):
     job = jobs.SchedulerJob(
         dag_id=args.dag_id,
         subdir=process_subdir(args.subdir),
-        run_duration=args.run_duration,
         num_runs=args.num_runs,

Review comment: Should we remove `num_runs` as well, then?
[jira] [Resolved] (AIRFLOW-1919) Add option to query for DAG runs given a DAG ID
[ https://issues.apache.org/jira/browse/AIRFLOW-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaxil Naik resolved AIRFLOW-1919. - Resolution: Fixed Fix Version/s: (was: 2.0.0) 1.10.2 > Add option to query for DAG runs given a DAG ID > --- > > Key: AIRFLOW-1919 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1919 > Project: Apache Airflow > Issue Type: Improvement > Components: cli >Affects Versions: 1.8.0 >Reporter: Steen Manniche >Assignee: Tao Feng >Priority: Trivial > Fix For: 1.10.2 > > > Having a way to list all DAG runs for a given DAG identifier would be useful > when trying to get a programmatic overview of running DAGs. Something along > the lines of > {code} > airflow list_runs $DAG_ID > {code} > Which would return the running DAGs for {{$DAG_ID}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] kaxil commented on issue #4276: [AIRFLOW-1552] Airflow Filter_by_owner not working with password_auth
kaxil commented on issue #4276: [AIRFLOW-1552] Airflow Filter_by_owner not working with password_auth
URL: https://github.com/apache/incubator-airflow/pull/4276#issuecomment-447561172

@ashb Any more comments?
[jira] [Reopened] (AIRFLOW-1919) Add option to query for DAG runs given a DAG ID
[ https://issues.apache.org/jira/browse/AIRFLOW-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaxil Naik reopened AIRFLOW-1919: - > Add option to query for DAG runs given a DAG ID > --- > > Key: AIRFLOW-1919 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1919 > Project: Apache Airflow > Issue Type: Improvement > Components: cli >Affects Versions: 1.8.0 >Reporter: Steen Manniche >Assignee: Tao Feng >Priority: Trivial > Fix For: 1.10.2 > > > Having a way to list all DAG runs for a given DAG identifier would be useful > when trying to get a programmatic overview of running DAGs. Something along > the lines of > {code} > airflow list_runs $DAG_ID > {code} > Which would return the running DAGs for {{$DAG_ID}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-1919) Add option to query for DAG runs given a DAG ID
[ https://issues.apache.org/jira/browse/AIRFLOW-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16722114#comment-16722114 ] Kaxil Naik commented on AIRFLOW-1919: - This feature will be available in 1.10.2. Currently it is only available in the master > Add option to query for DAG runs given a DAG ID > --- > > Key: AIRFLOW-1919 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1919 > Project: Apache Airflow > Issue Type: Improvement > Components: cli >Affects Versions: 1.8.0 >Reporter: Steen Manniche >Assignee: Tao Feng >Priority: Trivial > Fix For: 1.10.2 > > > Having a way to list all DAG runs for a given DAG identifier would be useful > when trying to get a programmatic overview of running DAGs. Something along > the lines of > {code} > airflow list_runs $DAG_ID > {code} > Which would return the running DAGs for {{$DAG_ID}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-1919) Add option to query for DAG runs given a DAG ID
[ https://issues.apache.org/jira/browse/AIRFLOW-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16722113#comment-16722113 ] Kaxil Naik commented on AIRFLOW-1919: - [~villasv]: This is already available in the documentation at https://airflow.readthedocs.io/en/latest/cli.html#list_dag_runs > Add option to query for DAG runs given a DAG ID > --- > > Key: AIRFLOW-1919 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1919 > Project: Apache Airflow > Issue Type: Improvement > Components: cli >Affects Versions: 1.8.0 >Reporter: Steen Manniche >Assignee: Tao Feng >Priority: Trivial > Fix For: 2.0.0 > > > Having a way to list all DAG runs for a given DAG identifier would be useful > when trying to get a programmatic overview of running DAGs. Something along > the lines of > {code} > airflow list_runs $DAG_ID > {code} > Which would return the running DAGs for {{$DAG_ID}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] kaxil commented on a change in pull request #3880: [AIRFLOW-461] Support autodetected schemas in BigQuery run_load
kaxil commented on a change in pull request #3880: [AIRFLOW-461] Support autodetected schemas in BigQuery run_load
URL: https://github.com/apache/incubator-airflow/pull/3880#discussion_r241946183

## File path: airflow/contrib/operators/gcs_to_bq.py ##
@@ -190,20 +191,24 @@ def __init__(self,
         self.src_fmt_configs = src_fmt_configs
         self.time_partitioning = time_partitioning
         self.cluster_fields = cluster_fields
+        self.autodetect = autodetect

     def execute(self, context):
         bq_hook = BigQueryHook(bigquery_conn_id=self.bigquery_conn_id,
                                delegate_to=self.delegate_to)

-        if not self.schema_fields and \
-                self.schema_object and \
-                self.source_format != 'DATASTORE_BACKUP':
-            gcs_hook = GoogleCloudStorageHook(
-                google_cloud_storage_conn_id=self.google_cloud_storage_conn_id,
-                delegate_to=self.delegate_to)
-            schema_fields = json.loads(gcs_hook.download(
-                self.bucket,
-                self.schema_object).decode("utf-8"))
+        if not self.schema_fields:
+            if self.schema_object and self.source_format != 'DATASTORE_BACKUP':
+                gcs_hook = GoogleCloudStorageHook(
+                    google_cloud_storage_conn_id=self.google_cloud_storage_conn_id,
+                    delegate_to=self.delegate_to)
+                schema_fields = json.loads(gcs_hook.download(

Review comment: Check lines 212 and 213:

```
else:
    schema_fields = self.schema_fields
```
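The concern behind this comment appears to be that, once the condition is split into nested `if` statements, every path must still assign `schema_fields`. A minimal sketch of that resolution order follows, written as a standalone function so it can be run without Airflow. The helper name `resolve_schema_fields` and the `download` callable are hypothetical stand-ins for the operator attributes and `gcs_hook.download`; returning `None` when no schema is available is an assumption about how autodetection would be signalled.

```python
import json


def resolve_schema_fields(schema_fields, schema_object, source_format, download):
    """Pick the schema to load with; every branch returns a value.

    Hypothetical helper for illustration; `download` stands in for a
    callable that returns the schema file's contents as bytes.
    """
    if schema_fields:
        # An explicitly passed schema always wins.
        return schema_fields
    if schema_object and source_format != "DATASTORE_BACKUP":
        # Otherwise read the schema from the JSON object in storage.
        return json.loads(download(schema_object).decode("utf-8"))
    # Nothing supplied: leave the schema unset (assumed to mean autodetect).
    return None


def fake_download(name):
    # Stand-in for a storage download; returns a fixed JSON payload.
    return b'[{"name": "id", "type": "INTEGER"}]'


from_object = resolve_schema_fields(None, "schemas/orders.json", "CSV", fake_download)
```

Returning from each branch, rather than assigning inside nested blocks, makes it harder to leave the variable undefined on one path.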
[GitHub] NielsZeilemaker commented on issue #4322: [AIRFLOW-3518] Performance fixes for topological_sort
NielsZeilemaker commented on issue #4322: [AIRFLOW-3518] Performance fixes for topological_sort
URL: https://github.com/apache/incubator-airflow/pull/4322#issuecomment-447553072

@feng-tao that's correct, I've modified the condition to check for `task_id`, and in the other equality check to check for `dag_id`, as a performance optimization. The original check (using `self._comps`) is still there. During our performance test we saw a lot of calls to both `__eq__` methods, hence the small modification to make them perform better.
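The optimization described above can be sketched as a cheap identifier comparison placed in front of the full attribute comparison. This is an illustration only, not the code from the pull request: the `Task` class, its constructor, and the contents of `_comps` are hypothetical; only `task_id`, `_comps`, and `__eq__` are named in the comment.

```python
class Task:
    # Hypothetical attribute list; stands in for the `_comps` named in the comment.
    _comps = ("task_id", "owner", "retries")

    def __init__(self, task_id, owner="airflow", retries=0):
        self.task_id = task_id
        self.owner = owner
        self.retries = retries

    def __eq__(self, other):
        if not isinstance(other, Task):
            return NotImplemented
        # Fast path: differing ids settle the answer with a single comparison.
        if self.task_id != other.task_id:
            return False
        # Slow path: the original attribute-by-attribute check is kept.
        return all(getattr(self, c) == getattr(other, c) for c in self._comps)

    def __hash__(self):
        # Defining __eq__ removes the default hash, so restore one based on the id.
        return hash(self.task_id)


same = Task("extract") == Task("extract")
different_id = Task("extract") == Task("load")
different_attr = Task("extract", retries=1) == Task("extract")
```

In a sort that compares many distinct tasks, most comparisons involve different ids and so would exit on the fast path without touching the remaining attributes.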