[jira] [Updated] (AIRFLOW-3414) reload_module not working with custom logging class
[ https://issues.apache.org/jira/browse/AIRFLOW-3414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang updated AIRFLOW-3414: Issue Type: Bug (was: Improvement) > reload_module not working with custom logging class > --- > > Key: AIRFLOW-3414 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3414 > Project: Apache Airflow > Issue Type: Bug >Affects Versions: 1.10.2 >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > When using a custom logging class, the reload_module call in dag_processing.py will > fail because it tries to reload the default logging class, which was not loaded > in the first place. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-3414) reload_module not working with custom logging class
Kevin Yang created AIRFLOW-3414: --- Summary: reload_module not working with custom logging class Key: AIRFLOW-3414 URL: https://issues.apache.org/jira/browse/AIRFLOW-3414 Project: Apache Airflow Issue Type: Improvement Affects Versions: 1.10.2 Reporter: Kevin Yang Assignee: Kevin Yang When using a custom logging class, the reload_module call in dag_processing.py will fail because it tries to reload the default logging class, which was not loaded in the first place. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
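The failure mode is easy to reproduce with the standard importlib machinery: importlib.reload() raises if the target module was never imported. A minimal, hypothetical guard (safe_reload is not an Airflow function) looks like:

```python
import importlib
import sys

def safe_reload(module_name):
    """Reload module_name if already imported; otherwise import it fresh.

    Hypothetical helper: importlib.reload() requires the module to be in
    sys.modules already, which is exactly what goes wrong when a custom
    logging class was loaded instead of the default one.
    """
    module = sys.modules.get(module_name)
    if module is None:
        # The default module was never imported -- reload would fail here,
        # so fall back to a plain import.
        return importlib.import_module(module_name)
    return importlib.reload(module)
```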
[jira] [Created] (AIRFLOW-3393) Fix bug in usage of reload_module
Kevin Yang created AIRFLOW-3393: --- Summary: Fix bug in usage of reload_module Key: AIRFLOW-3393 URL: https://issues.apache.org/jira/browse/AIRFLOW-3393 Project: Apache Airflow Issue Type: Improvement Reporter: Kevin Yang Assignee: Kevin Yang The [reload_module usage|https://github.com/apache/incubator-airflow/blob/master/airflow/utils/dag_processing.py#L479] is incorrect: the last section of the package string needs to be removed before reloading. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
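The fix described above amounts to a one-liner: drop the final dotted component (the class or attribute name) so only the importable module path is handed to the reload call. The helper name here is illustrative:

```python
def containing_module(dotted_path):
    """Strip the last section of a dotted package string (e.g. a trailing
    class name), leaving the importable module path. Illustrative helper,
    not an Airflow API."""
    return dotted_path.rsplit(".", 1)[0]
```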
[jira] [Created] (AIRFLOW-3392) Add index on dag_id in sla_miss table
Kevin Yang created AIRFLOW-3392: --- Summary: Add index on dag_id in sla_miss table Key: AIRFLOW-3392 URL: https://issues.apache.org/jira/browse/AIRFLOW-3392 Project: Apache Airflow Issue Type: Improvement Reporter: Kevin Yang Assignee: Kevin Yang The select queries on the sla_miss table produce a large share of DB traffic and thus make DB CPU usage unnecessarily high. Adding an index would be low-hanging fruit to reduce the load. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
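The effect of such an index can be sketched with SQLite (schema simplified; the real sla_miss table and the actual migration differ): once an index on dag_id exists, the hot SELECT ... WHERE dag_id = ? becomes an index search rather than a full-table scan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sla_miss (
        task_id TEXT NOT NULL,
        dag_id TEXT NOT NULL,
        execution_date TEXT NOT NULL,
        PRIMARY KEY (task_id, dag_id, execution_date)
    )
""")
# The proposed improvement: an index covering the dag_id filter.
conn.execute("CREATE INDEX idx_sla_miss_dag_id ON sla_miss (dag_id)")

# EXPLAIN QUERY PLAN confirms the index is used for the hot query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM sla_miss WHERE dag_id = 'example_dag'"
).fetchall()
```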
[jira] [Updated] (AIRFLOW-3194) Refactor session creation to use with block
[ https://issues.apache.org/jira/browse/AIRFLOW-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang updated AIRFLOW-3194: Description: There are many usages of session = settings.Session() in the code base and it would be nice to refactor them all to use a with create_session() as session block. (was: There are a lot usage of session = settings.Session() in the code base and would be nice to refactor them all to use with settings.Session() as session block.) > Refactor session creation to use with block > --- > > Key: AIRFLOW-3194 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3194 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Kevin Yang >Priority: Minor > > There are many usages of session = settings.Session() in the code base and > it would be nice to refactor them all to use a with create_session() as session > block. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-3194) Refactor session creation to use with block
Kevin Yang created AIRFLOW-3194: --- Summary: Refactor session creation to use with block Key: AIRFLOW-3194 URL: https://issues.apache.org/jira/browse/AIRFLOW-3194 Project: Apache Airflow Issue Type: Improvement Reporter: Kevin Yang There are many usages of session = settings.Session() in the code base and it would be nice to refactor them all to use a with settings.Session() as session block. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
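The with-block pattern being proposed typically wraps the session factory in a context manager so commit, rollback, and close live in one place. A simplified stand-in (the real airflow.utils.db.create_session takes no factory argument; this signature is for illustration) might look like:

```python
from contextlib import contextmanager

@contextmanager
def create_session(session_factory):
    """Simplified sketch of the proposed with-block helper: commit on
    success, roll back on error, always close."""
    session = session_factory()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()
```

Call sites then shrink from manual try/finally blocks to `with create_session(Session) as session: ...`.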
[jira] [Commented] (AIRFLOW-1268) Celery bug can cause tasks to be delayed indefinitely
[ https://issues.apache.org/jira/browse/AIRFLOW-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640403#comment-16640403 ] Kevin Yang commented on AIRFLOW-1268: - [~lbodeen] Thank you! That's great context. I do agree with you if that's the case. [~saguziel] While this issue is no longer valid, I still think we need some sort of requeue, at least optional, to make Airflow more robust to surprises on the celery side. Wanna create a new issue for that? > Celery bug can cause tasks to be delayed indefinitely > - > > Key: AIRFLOW-1268 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1268 > Project: Apache Airflow > Issue Type: Bug > Components: celery > Environment: With celery_executor with redis >Reporter: Alex Guziel >Priority: Critical > > With celery, tasks can get delayed indefinitely (or default 1 hour) due to a > bug with celery, see https://github.com/celery/celery/issues/3765 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-1268) Celery bug can cause tasks to be delayed indefinitely
[ https://issues.apache.org/jira/browse/AIRFLOW-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640385#comment-16640385 ] Kevin Yang commented on AIRFLOW-1268: - Hi [~lbodeen], did you go ahead and verify it in 4.2? From the original celery issue it seems like it was supposed to be verified in 4.2 but that hasn't happened yet. > Celery bug can cause tasks to be delayed indefinitely > - > > Key: AIRFLOW-1268 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1268 > Project: Apache Airflow > Issue Type: Bug > Components: celery > Environment: With celery_executor with redis >Reporter: Alex Guziel >Priority: Critical > > With celery, tasks can get delayed indefinitely (or default 1 hour) due to a > bug with celery, see https://github.com/celery/celery/issues/3765 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2761) Parallelize Celery Executor enqueuing
[ https://issues.apache.org/jira/browse/AIRFLOW-2761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639409#comment-16639409 ] Kevin Yang commented on AIRFLOW-2761: - [~xnuinside] Hi Luliia, I actually have [an open PR|https://github.com/KevinYang21/incubator-airflow/pull/4] for it, but it is currently blocked by [this PR|https://github.com/apache/incubator-airflow/pull/3873]. The committers seem quite busy, so I'm not sure when I can get unblocked. > Parallelize Celery Executor enqueuing > - > > Key: AIRFLOW-2761 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2761 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Kevin Yang >Priority: Major > > Currently the celery executor enqueues in an async fashion but still does so > in a single-process loop. This can slow down the scheduler loop and create > scheduling delay if we have a large # of tasks to schedule in a short time, e.g. > at UTC midnight we need to schedule a large # of sensors in a short period. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2442) Airflow run command leaves database connections open
[ https://issues.apache.org/jira/browse/AIRFLOW-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626515#comment-16626515 ] Kevin Yang commented on AIRFLOW-2442: - I suppose this issue is resolved. Let's resolve the ticket. > Airflow run command leaves database connections open > > > Key: AIRFLOW-2442 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2442 > Project: Apache Airflow > Issue Type: Bug > Components: cli >Affects Versions: 1.8.0 >Reporter: Alejandro Fernandez >Assignee: Alejandro Fernandez >Priority: Major > Fix For: 2.0.0 > > Attachments: connection_duration_1_hour.png, db_connections.png, > fixed_before_and_after.jpg, monthly_db_connections.png, running_tasks.png > > > *Summary* > The "airflow run" command creates a connection to the database and leaves it > open (until killed by SQLAlchemy later). The number of these connections can > skyrocket whenever hundreds/thousands of tasks are launched simultaneously, > and potentially hit the database connection limit. > The problem is that in cli.py, the run() method first calls > {code:java} > settings.configure_orm(disable_connection_pool=True){code} > correctly > to use a NullPool, but then parses any custom configs and again calls > {code:java} > settings.configure_orm(){code} > , thereby overriding the desired behavior by instead using a QueuePool. > The QueuePool uses the default configs for SQL_ALCHEMY_POOL_SIZE and > SQL_ALCHEMY_POOL_RECYCLE. This means that while the task is running and the > executor is sending heartbeats, the sleeping connection is idle until it is > killed by SQLAlchemy. 
> This fixes a bug introduced by > [https://github.com/apache/incubator-airflow/pull/1934] in > [https://github.com/apache/incubator-airflow/pull/1934/commits/b380013634b02bb4c1b9d1cc587ccd12383820b6#diff-1c2404a3a60f829127232842250ff406R344] > > which is present in branches 1-8-stable, 1-9-stable, and 1-10-test > NOTE: Will create a PR once I've done more testing since I'm on an older > branch. For now, attaching a patch file [^AIRFLOW-2442.patch] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
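The override described in the report above is a pure ordering bug and can be shown with a stdlib-only sketch (names mirror the report, but this is not Airflow's actual settings module): the second, argument-less configure call silently discards the NullPool choice made by the first call.

```python
class Settings:
    """Toy stand-in for airflow.settings, tracking only the pool choice."""

    def __init__(self):
        self.pool_disabled = False

    def configure_orm(self, disable_connection_pool=False):
        # Each call rebuilds the engine; the pool choice made by an
        # earlier call is not remembered.
        self.pool_disabled = disable_connection_pool

settings = Settings()
settings.configure_orm(disable_connection_pool=True)  # NullPool, as intended
settings.configure_orm()  # re-run after parsing custom configs: pooling is back
```

The fix is correspondingly simple: either make the second call carry the same flag, or configure the ORM only once after all configs are parsed.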
[jira] [Assigned] (AIRFLOW-2760) DAG parsing loop coupled with scheduler loop
[ https://issues.apache.org/jira/browse/AIRFLOW-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang reassigned AIRFLOW-2760: --- Assignee: Kevin Yang > DAG parsing loop coupled with scheduler loop > > > Key: AIRFLOW-2760 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2760 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > Currently DAG parsing loop is coupled with scheduler loop, meaning that if > scheduler loop became slow, we will parse DAG slower. > As a simple producer and consumer pattern, we shall have them decoupled and > completely remove the scheduling bottleneck placed by DAG parsing--which is > identified in Airbnb as the current biggest bottleneck. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (AIRFLOW-2760) DAG parsing loop coupled with scheduler loop
[ https://issues.apache.org/jira/browse/AIRFLOW-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on AIRFLOW-2760 started by Kevin Yang. --- > DAG parsing loop coupled with scheduler loop > > > Key: AIRFLOW-2760 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2760 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > Currently DAG parsing loop is coupled with scheduler loop, meaning that if > scheduler loop became slow, we will parse DAG slower. > As a simple producer and consumer pattern, we shall have them decoupled and > completely remove the scheduling bottleneck placed by DAG parsing--which is > identified in Airbnb as the current biggest bottleneck. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2762) Parallelize DAG parsing in webserver
[ https://issues.apache.org/jira/browse/AIRFLOW-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548352#comment-16548352 ] Kevin Yang commented on AIRFLOW-2762: - [~ashb] Good idea. Though I am a bit concerned about the parsing time--we have a couple of framework DAGs that take tens of seconds to parse. I think in this case caching beforehand during start up may be even better than caching lazily. This might also create two sources for the webserver to find DAGs, and potentially create inconsistency within the webserver if the files on the scheduler and webservers are not synced. I think parsing the DAG into a simple DAG would be a relatively safer way to approach this. > Parallelize DAG parsing in webserver > > > Key: AIRFLOW-2762 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2762 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Kevin Yang >Priority: Major > > Currently the webserver parses the DagBag in a single-threaded fashion, which causes > start up to be slow when we have a large # of DAG files. Webservers > should not need the actual DAG object, and this should be parallelized. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (AIRFLOW-2762) Parallelize DAG parsing in webserver
[ https://issues.apache.org/jira/browse/AIRFLOW-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547113#comment-16547113 ] Kevin Yang edited comment on AIRFLOW-2762 at 7/17/18 9:13 PM: -- [~ashb] Ty a lot for providing your opinions. I think that is a good idea, since it will also provide some consistency between the scheduler and webserver. Though to do that, we need to store more info that the webserver needs in the DagModel, e.g. the dependencies. I am also not very sure how much extra load that would place on the DB. If we go this route, we might want to build a DAG parsing component that parses DAGs for both the scheduler and webserver. Before we decide to do that, we can try parallelizing the parsing on the webserver--the work can be reused when we have the DAG parsing service, since the webserver will be using the serializable info of the DAG instead of the DAG object in both cases. was (Author: yrqls21): [~ashb] Ty for the opinions. I think that is good idea, since it will also provide some sort of consistency between scheduler and webserver. Though to be able to do that, we need to store more info in the DagModel that webserver needs, e.g. the dependency. I am also not very sure about how much extra load that would place on the DB. I think if we go this route, we might want to build a DAG parsing component that parses DAG for both scheduler and webserver. I think before we decided to do that, we can try parallelize the parsing on webserver--the work can be reused when we have the DAG parsing service since the webserver will be using the serializable info of the DAG instead of the the DAG object in both cases. 
> Parallelize DAG parsing in webserver > > > Key: AIRFLOW-2762 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2762 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Kevin Yang >Priority: Major > > Currently the webserver parses DagBag in a single thread fashion and causes > the start up time to be slow when we have large # of DAG files. Webservers > should not need the actual DAG object and this should be parallelized. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2762) Parallelize DAG parsing in webserver
[ https://issues.apache.org/jira/browse/AIRFLOW-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547113#comment-16547113 ] Kevin Yang commented on AIRFLOW-2762: - [~ashb] Ty for the opinions. I think that is a good idea, since it will also provide some sort of consistency between the scheduler and webserver. Though to do that, we need to store more info that the webserver needs in the DagModel, e.g. the dependencies. I am also not very sure how much extra load that would place on the DB. If we go this route, we might want to build a DAG parsing component that parses DAGs for both the scheduler and webserver. Before we decide to do that, we can try parallelizing the parsing on the webserver--the work can be reused when we have the DAG parsing service, since the webserver will be using the serializable info of the DAG instead of the DAG object in both cases. > Parallelize DAG parsing in webserver > > > Key: AIRFLOW-2762 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2762 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Kevin Yang >Priority: Major > > Currently the webserver parses the DagBag in a single-threaded fashion, which causes > start up to be slow when we have a large # of DAG files. Webservers > should not need the actual DAG object, and this should be parallelized. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2762) Parallelize DAG parsing in webserver
Kevin Yang created AIRFLOW-2762: --- Summary: Parallelize DAG parsing in webserver Key: AIRFLOW-2762 URL: https://issues.apache.org/jira/browse/AIRFLOW-2762 Project: Apache Airflow Issue Type: Improvement Reporter: Kevin Yang Currently the webserver parses the DagBag in a single-threaded fashion, which causes start up to be slow when we have a large # of DAG files. Webservers should not need the actual DAG object, and this should be parallelized. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2761) Parallelize Celery Executor enqueuing
Kevin Yang created AIRFLOW-2761: --- Summary: Parallelize Celery Executor enqueuing Key: AIRFLOW-2761 URL: https://issues.apache.org/jira/browse/AIRFLOW-2761 Project: Apache Airflow Issue Type: Improvement Reporter: Kevin Yang Currently the celery executor enqueues in an async fashion but still does so in a single-process loop. This can slow down the scheduler loop and create scheduling delay if we have a large # of tasks to schedule in a short time, e.g. at UTC midnight we need to schedule a large # of sensors in a short period. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
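A hedged sketch of what "parallelize the enqueuing" could mean (not the actual executor change): fan the per-task send calls out over a thread pool instead of looping through them in a single process.

```python
from concurrent.futures import ThreadPoolExecutor

def enqueue_all(task_messages, send, max_workers=8):
    """Send every task message using a pool of worker threads.

    `send` is a stand-in for the per-task celery enqueue call; the pool
    keeps a burst of tasks (e.g. many sensors at UTC midnight) from
    serializing through one loop. Names are illustrative.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(send, task_messages))
```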
[jira] [Created] (AIRFLOW-2760) DAG parsing loop coupled with scheduler loop
Kevin Yang created AIRFLOW-2760: --- Summary: DAG parsing loop coupled with scheduler loop Key: AIRFLOW-2760 URL: https://issues.apache.org/jira/browse/AIRFLOW-2760 Project: Apache Airflow Issue Type: Improvement Reporter: Kevin Yang Currently the DAG parsing loop is coupled with the scheduler loop, meaning that if the scheduler loop becomes slow, DAGs are parsed more slowly. As a simple producer and consumer pattern, we should have them decoupled and completely remove the scheduling bottleneck placed by DAG parsing--which is identified at Airbnb as the current biggest bottleneck. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
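The producer/consumer decoupling mentioned above can be sketched with a queue between the two loops (illustrative only, not Airflow's implementation): the parser keeps producing regardless of how slowly the scheduler consumes.

```python
import queue
import threading

parsed_dags = queue.Queue()
DONE = object()  # sentinel marking the end of parsing

def parsing_loop(dag_files):
    """Producer: parse each file and hand the result to the scheduler."""
    for path in dag_files:
        parsed_dags.put(f"parsed:{path}")  # stand-in for real DAG parsing
    parsed_dags.put(DONE)

def scheduling_loop():
    """Consumer: drain parse results at its own pace."""
    results = []
    while (item := parsed_dags.get()) is not DONE:
        results.append(item)
    return results

producer = threading.Thread(target=parsing_loop, args=(["a.py", "b.py"],))
producer.start()
scheduled = scheduling_loop()
producer.join()
```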
[jira] [Assigned] (AIRFLOW-2756) Marking DAG run does not set start_time and end_time correctly
[ https://issues.apache.org/jira/browse/AIRFLOW-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang reassigned AIRFLOW-2756: --- Assignee: Kevin Yang > Marking DAG run does not set start_time and end_time correctly > -- > > Key: AIRFLOW-2756 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2756 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > Marking DAG run right now always set end_time while it should set start_time > when marking RUNNING and otherwise end_time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2756) Marking DAG run does not set start_time and end_time correctly
Kevin Yang created AIRFLOW-2756: --- Summary: Marking DAG run does not set start_time and end_time correctly Key: AIRFLOW-2756 URL: https://issues.apache.org/jira/browse/AIRFLOW-2756 Project: Apache Airflow Issue Type: Bug Reporter: Kevin Yang Marking a DAG run currently always sets end_time, while it should set start_time when marking RUNNING and end_time otherwise. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (AIRFLOW-2648) Mapred job name in HiveOperator hard to parse and order can be improved
[ https://issues.apache.org/jira/browse/AIRFLOW-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang updated AIRFLOW-2648: Description: Existing format: "Airflow HiveOperator task for {hostname}.{dag_id}.{task_id}.{execution_date}". Proposing to make it configurable since it is a bit hard to parse. was: Existing format: "Airflow HiveOperator task for {hostname}.{dag_id}.{task_id}.{execution_date}". Proposing "{dag_id}.{task_id}.{execution_date}.{hostname}" > Mapred job name in HiveOperator hard to parse and order can be improved > --- > > Key: AIRFLOW-2648 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2648 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > Existing format: "Airflow HiveOperator task for > {hostname}.{dag_id}.{task_id}.{execution_date}". > Proposing to make it configurable since it is a bit hard to parse. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (AIRFLOW-2648) Mapred job name in HiveOperator hard to parse and order can be improved
[ https://issues.apache.org/jira/browse/AIRFLOW-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on AIRFLOW-2648 started by Kevin Yang. --- > Mapred job name in HiveOperator hard to parse and order can be improved > --- > > Key: AIRFLOW-2648 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2648 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > Existing format: "Airflow HiveOperator task for > {hostname}.{dag_id}.{task_id}.{execution_date}". > Proposing to make it configurable since it is a bit hard to parse. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (AIRFLOW-2648) Mapred job name in HiveOperator hard to parse and order can be improved
[ https://issues.apache.org/jira/browse/AIRFLOW-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang reassigned AIRFLOW-2648: --- Assignee: Kevin Yang > Mapred job name in HiveOperator hard to parse and order can be improved > --- > > Key: AIRFLOW-2648 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2648 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > Existing format: "Airflow HiveOperator task for > {hostname}.{dag_id}.{task_id}.{execution_date}". > Proposing "{dag_id}.{task_id}.{execution_date}.{hostname}" -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2648) Mapred job name in HiveOperator hard to parse and order can be improved
Kevin Yang created AIRFLOW-2648: --- Summary: Mapred job name in HiveOperator hard to parse and order can be improved Key: AIRFLOW-2648 URL: https://issues.apache.org/jira/browse/AIRFLOW-2648 Project: Apache Airflow Issue Type: Improvement Reporter: Kevin Yang Existing format: "Airflow HiveOperator task for {hostname}.{dag_id}.{task_id}.{execution_date}". Proposing "{dag_id}.{task_id}.{execution_date}.{hostname}" -- This message was sent by Atlassian JIRA (v7.6.3#76005)
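The two naming schemes side by side, as plain format strings (the field values are made up): the proposed order leads with dag_id, which groups and splits on "." more naturally than the host-first prefix.

```python
def mapred_job_names(hostname, dag_id, task_id, execution_date):
    """Return (existing, proposed) mapred job names for comparison."""
    existing = (f"Airflow HiveOperator task for "
                f"{hostname}.{dag_id}.{task_id}.{execution_date}")
    proposed = f"{dag_id}.{task_id}.{execution_date}.{hostname}"
    return existing, proposed
```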
[jira] [Created] (AIRFLOW-2624) Airflow webserver broken out of the box
Kevin Yang created AIRFLOW-2624: --- Summary: Airflow webserver broken out of the box Key: AIRFLOW-2624 URL: https://issues.apache.org/jira/browse/AIRFLOW-2624 Project: Apache Airflow Issue Type: Bug Reporter: Kevin Yang Assignee: Kevin Yang Run `airflow webserver` and then click on any DAG; I get:
```
File "/Users/kevin_yang/ext_repos/incubator-airflow/airflow/www/utils.py", line 364, in view_func
  return f(*args, **kwargs)
File "/Users/kevin_yang/ext_repos/incubator-airflow/airflow/www/utils.py", line 251, in wrapper
  user = current_user.user.username
AttributeError: 'NoneType' object has no attribute 'username'
```
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (AIRFLOW-2615) Webserver parent not using cached app
[ https://issues.apache.org/jira/browse/AIRFLOW-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang reassigned AIRFLOW-2615: --- Assignee: Kevin Yang > Webserver parent not using cached app > - > > Key: AIRFLOW-2615 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2615 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > From what I can tell, the app cached > [here|https://github.com/apache/incubator-airflow/blob/master/airflow/bin/cli.py#L790] > attempts to cache the app for later use--likely for the expensive > DagBag() creation. Before I dive into the webserver parsing everything in one > process problem, I was hoping this cached app would save me some time. However > it seems to me that every subprocess spun up by gunicorn creates > the DagBag() right after being spawned--which makes sense to me since we > didn't share the cached app with the subprocesses (I doubt we can). If what I > observed is true, why do we cache the app at all in the parent process? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (AIRFLOW-2615) Webserver parent not using cached app
[ https://issues.apache.org/jira/browse/AIRFLOW-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang updated AIRFLOW-2615: Summary: Webserver parent not using cached app (was: Webserver not using cached app) > Webserver parent not using cached app > - > > Key: AIRFLOW-2615 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2615 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Priority: Major > > From what I can tell, the app cached > [here|https://github.com/apache/incubator-airflow/blob/master/airflow/bin/cli.py#L790] > attempt to cache the app for later use-likely to be for the expensive > DagBag() creation. Before I dive into the webserver parsing everything in one > process problem, I was hoping this cached app would save me sometime. However > it seems to me that every subprocess spun up by gunicorn is trying to create > the DagBag() right after they've been created--make sense to me since we > didn't share the cached app to the subprocess( doubt we can). If what I > observed is true, why do we cache the app at all? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (AIRFLOW-2615) Webserver parent not using cached app
[ https://issues.apache.org/jira/browse/AIRFLOW-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang updated AIRFLOW-2615: Description: From what I can tell, the app cached [here|https://github.com/apache/incubator-airflow/blob/master/airflow/bin/cli.py#L790] attempt to cache the app for later use-likely to be for the expensive DagBag() creation. Before I dive into the webserver parsing everything in one process problem, I was hoping this cached app would save me sometime. However it seems to me that every subprocess spun up by gunicorn is trying to create the DagBag() right after they've been created--make sense to me since we didn't share the cached app to the subprocess( doubt we can). If what I observed is true, why do we cache the app at all in the parent process? (was: From what I can tell, the app cached [here|https://github.com/apache/incubator-airflow/blob/master/airflow/bin/cli.py#L790] attempt to cache the app for later use-likely to be for the expensive DagBag() creation. Before I dive into the webserver parsing everything in one process problem, I was hoping this cached app would save me sometime. However it seems to me that every subprocess spun up by gunicorn is trying to create the DagBag() right after they've been created--make sense to me since we didn't share the cached app to the subprocess( doubt we can). If what I observed is true, why do we cache the app at all?) > Webserver parent not using cached app > - > > Key: AIRFLOW-2615 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2615 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Priority: Major > > From what I can tell, the app cached > [here|https://github.com/apache/incubator-airflow/blob/master/airflow/bin/cli.py#L790] > attempt to cache the app for later use-likely to be for the expensive > DagBag() creation. 
Before I dive into the webserver parsing everything in one > process problem, I was hoping this cached app would save me sometime. However > it seems to me that every subprocess spun up by gunicorn is trying to create > the DagBag() right after they've been created--make sense to me since we > didn't share the cached app to the subprocess( doubt we can). If what I > observed is true, why do we cache the app at all in the parent process? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2615) Webserver not using cached app
[ https://issues.apache.org/jira/browse/AIRFLOW-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512059#comment-16512059 ] Kevin Yang commented on AIRFLOW-2615: - Adding a little bit of context here: Airbnb has ~2000 DAG files in our centralized DAG repo and it takes a long time to parse the entire repo; this extra app creation basically doubles the time we need to refresh a webserver worker. > Webserver not using cached app > -- > > Key: AIRFLOW-2615 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2615 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Priority: Major > > From what I can tell, the app cached > [here|https://github.com/apache/incubator-airflow/blob/master/airflow/bin/cli.py#L790] > attempts to cache the app for later use--likely for the expensive > DagBag() creation. Before I dive into the webserver parsing everything in one > process problem, I was hoping this cached app would save me some time. However > it seems to me that every subprocess spun up by gunicorn creates > the DagBag() right after being spawned--which makes sense to me since we > didn't share the cached app with the subprocesses (I doubt we can). If what I > observed is true, why do we cache the app at all? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (AIRFLOW-2615) Webserver not using cached app
[ https://issues.apache.org/jira/browse/AIRFLOW-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang updated AIRFLOW-2615: Description: From what I can tell, the app cached [here|https://github.com/apache/incubator-airflow/blob/master/airflow/bin/cli.py#L790] attempt to cache the app for later use-likely to be for the expensive DagBag() creation. Before I dive into the webserver parsing everything in one process problem, I was hoping this cached app would save me sometime. However it seems to me that every subprocess spun up by gunicorn is trying to create the DagBag() right after they've been created--make sense to me since we didn't share the cached app to the subprocess( doubt we can). If what I observed is true, why do we cache the app at all? (was: From what I can tell, the app cached here attempt to cache the app for later use-likely to be for the expensive DagBag() creation. Before I dive into the webserver parsing everything in one process problem, I was hoping this cached app would save me sometime. However it seems to me that every subprocess spun up by gunicorn is trying to create the DagBag() right after they've been created--make sense to me since we didn't share the cached app to the subprocess( doubt we can). If what I observed is true, why do we cache the app at all?) > Webserver not using cached app > -- > > Key: AIRFLOW-2615 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2615 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Priority: Major > > From what I can tell, the app cached > [here|https://github.com/apache/incubator-airflow/blob/master/airflow/bin/cli.py#L790] > attempt to cache the app for later use-likely to be for the expensive > DagBag() creation. Before I dive into the webserver parsing everything in one > process problem, I was hoping this cached app would save me sometime. 
However > it seems to me that every subprocess spun up by gunicorn is trying to create > the DagBag() right after they've been created--make sense to me since we > didn't share the cached app to the subprocess( doubt we can). If what I > observed is true, why do we cache the app at all? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (AIRFLOW-2615) Webserver not using cached app
[ https://issues.apache.org/jira/browse/AIRFLOW-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512057#comment-16512057 ] Kevin Yang edited comment on AIRFLOW-2615 at 6/14/18 7:10 AM: -- [~joygao] Not very confident in the webserver area, would you kindly provide your opinion here please? Thank you! was (Author: yrqls21): [~joygao] Not very confident in the webserver area, would you kindly provide you opinion here please? Thank you! > Webserver not using cached app > -- > > Key: AIRFLOW-2615 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2615 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Priority: Major > > From what I can tell, the app cached here attempt to cache the app for later > use-likely to be for the expensive DagBag() creation. Before I dive into the > webserver parsing everything in one process problem, I was hoping this cached > app would save me sometime. However it seems to me that every subprocess spun > up by gunicorn is trying to create the DagBag() right after they've been > created--make sense to me since we didn't share the cached app to the > subprocess( doubt we can). If what I observed is true, why do we cache the > app at all? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2615) Webserver not using cached app
[ https://issues.apache.org/jira/browse/AIRFLOW-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512057#comment-16512057 ] Kevin Yang commented on AIRFLOW-2615: - [~joygao] Not very confident in the webserver area, would you kindly provide your opinion here please? Thank you! > Webserver not using cached app > -- > > Key: AIRFLOW-2615 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2615 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Priority: Major > > From what I can tell, the app cached here attempts to cache the app for later > use, likely to avoid the expensive DagBag() creation. Before I dive into the > problem of the webserver parsing everything in one process, I was hoping this cached > app would save me some time. However, it seems to me that every subprocess spun > up by gunicorn tries to create the DagBag() right after it is > created--this makes sense, since we didn't share the cached app with the > subprocesses (and I doubt we can). If what I observed is true, why do we cache the > app at all? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2615) Webserver not using cached app
Kevin Yang created AIRFLOW-2615: --- Summary: Webserver not using cached app Key: AIRFLOW-2615 URL: https://issues.apache.org/jira/browse/AIRFLOW-2615 Project: Apache Airflow Issue Type: Bug Reporter: Kevin Yang From what I can tell, the app cached here attempts to cache the app for later use, likely to avoid the expensive DagBag() creation. Before I dive into the problem of the webserver parsing everything in one process, I was hoping this cached app would save me some time. However, it seems to me that every subprocess spun up by gunicorn tries to create the DagBag() right after it is created--this makes sense, since we don't share the cached app with the subprocesses (and I doubt we can). If what I observed is true, why do we cache the app at all? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
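Within a single process the caching itself works; the question above is that the cache does not survive into gunicorn's worker processes. A minimal sketch of the per-process memoization pattern (the app factory body here is a stand-in, not the real `cli.py` source):

```python
# Per-process cache in the style of airflow/bin/cli.py's cached_app().
# A module-level cache lives inside one process only: each gunicorn
# worker is a separate process with its own copy of this module, so
# every worker still pays the expensive DagBag() creation cost once.

_app = None  # per-process cache


def create_app():
    # Stand-in for the expensive app construction (DagBag parsing etc.).
    return object()


def cached_app():
    global _app
    if _app is None:
        _app = create_app()
    return _app
```

Repeated calls inside the same process return the same object; a freshly forked or spawned worker starts with `_app = None` again, which matches the behavior observed in the ticket.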
[jira] [Created] (AIRFLOW-2605) MySqlHook().run() will not commit if autocommit is set to True.
Kevin Yang created AIRFLOW-2605: --- Summary: MySqlHook().run() will not commit if autocommit is set to True. Key: AIRFLOW-2605 URL: https://issues.apache.org/jira/browse/AIRFLOW-2605 Project: Apache Airflow Issue Type: Bug Reporter: Kevin Yang Assignee: Kevin Yang MySQLdb [sets autocommit in a different way|https://github.com/PyMySQL/mysqlclient-python/blob/master/MySQLdb/connections.py#L249-L256]: it exposes autocommit as a method on the connection. Thus setting it via `conn.autocommit = True`, as we currently do, will not set autocommit correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
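The pitfall can be illustrated with a stand-in class (not a real MySQLdb connection): mysqlclient exposes `autocommit` as a *method*, so assigning `conn.autocommit = True` merely shadows that method with a boolean and never changes server state.

```python
# FakeMySQLConnection mimics mysqlclient's method-based autocommit;
# autocommit_state stands in for the server-side flag.

class FakeMySQLConnection:
    def __init__(self):
        self.autocommit_state = False

    def autocommit(self, on):
        # The real driver issues a SET AUTOCOMMIT statement here.
        self.autocommit_state = bool(on)


buggy = FakeMySQLConnection()
buggy.autocommit = True      # shadows the method; server state unchanged

fixed = FakeMySQLConnection()
fixed.autocommit(True)       # actually enables autocommit
```

After the buggy assignment the connection still has autocommit off, which is exactly why `MySqlHook().run()` never committed.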
[jira] [Work started] (AIRFLOW-2597) dbapi hook not committing when autocommit is set to false
[ https://issues.apache.org/jira/browse/AIRFLOW-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on AIRFLOW-2597 started by Kevin Yang. --- > dbapi hook not committing when autocommit is set to false > - > > Key: AIRFLOW-2597 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2597 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > dbapi.run() right now commits only when autocommit is set to true or db does > not support autocommit. > This is breaking CI now. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2597) dbapi hook not committing when autocommit is set to false
Kevin Yang created AIRFLOW-2597: --- Summary: dbapi hook not committing when autocommit is set to false Key: AIRFLOW-2597 URL: https://issues.apache.org/jira/browse/AIRFLOW-2597 Project: Apache Airflow Issue Type: Bug Reporter: Kevin Yang Assignee: Kevin Yang dbapi.run() right now commits only when autocommit is set to true or db does not support autocommit. This is breaking CI now. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2590) dbapi hook not committing when conn does not support auto commit
Kevin Yang created AIRFLOW-2590: --- Summary: dbapi hook not committing when conn does not support auto commit Key: AIRFLOW-2590 URL: https://issues.apache.org/jira/browse/AIRFLOW-2590 Project: Apache Airflow Issue Type: Bug Reporter: Kevin Yang Assignee: Kevin Yang After this commit, DbApiHook.run() will only commit when the autocommit field is set for the connection. For connections that don't support autocommit (e.g. sqlite_hook), the queries won't be committed. This is currently breaking CI (tests/core.py:CoreTest.test_check_operators). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
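The intended commit logic can be sketched as follows (a simplified illustration, not the actual Airflow source): commit whenever the driver is not actually autocommitting, so drivers without autocommit support, like sqlite, still persist their writes.

```python
import sqlite3


def run(conn, statements, autocommit=False, supports_autocommit=False):
    # Simplified sketch of DbApiHook.run()'s commit logic. The buggy
    # version committed only when autocommit was set on the connection,
    # so drivers without autocommit support (e.g. the sqlite hook)
    # never persisted writes. Fix: commit unless the driver is
    # genuinely autocommitting.
    cur = conn.cursor()
    for statement in statements:
        cur.execute(statement)
    cur.close()
    if not supports_autocommit or not autocommit:
        conn.commit()


conn = sqlite3.connect(":memory:")
run(conn, ["CREATE TABLE t (x INTEGER)", "INSERT INTO t VALUES (1)"])
```

With `supports_autocommit=False` the final `conn.commit()` always runs, which is the behavior the CI test expected.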
[jira] [Updated] (AIRFLOW-2586) Stop getting AIRFLOW_HOME value from config file in bash operator
[ https://issues.apache.org/jira/browse/AIRFLOW-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang updated AIRFLOW-2586: Description: Before [this commit|https://github.com/apache/incubator-airflow/commit/a0deb506c070637abc3c426bc7d060e3fe6c854d#diff-30054b6fa334216ba6e66c9f07025cd2R35] subprocess created by bash operator inherits env vars from the parent process. However, it does not inherit the proper env var from the `airflow worker` process because we had a bug in the `sudo airflow run --raw` command. The commit was created to address the bug for bash operator. The bug was later on fixed in [this commit|https://github.com/apache/incubator-airflow/commit/354492bc597130f43c76e7bec4bc894fb6deb7fe] and thus bash operator does not need and should not get AIRFLOW_HOME value from the config (otherwise there might be discrepancy between the AIRFLOW_HOME value in the parent process and the child process). (was: Before [this commit|[https://github.com/apache/incubator-airflow/commit/a0deb506c070637abc3c426bc7d060e3fe6c854d#diff-30054b6fa334216ba6e66c9f07025cd2R35]] subprocess created by bash operator inherits env vars from the parent process. However, it does not inherit the proper env var from the `airflow worker` process because we had a bug in the `sudo airflow run --raw` command. The commit was created to address the bug for bash operator. The bug was later on fixed in [this commit|[https://github.com/apache/incubator-airflow/commit/354492bc597130f43c76e7bec4bc894fb6deb7fe]] and thus bash operator does not need and should not get AIRFLOW_HOME value from the config (otherwise there might be discrepancy between the AIRFLOW_HOME value in the parent process and the child process).) 
> Stop getting AIRFLOW_HOME value from config file in bash operator > - > > Key: AIRFLOW-2586 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2586 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > Before [this > commit|https://github.com/apache/incubator-airflow/commit/a0deb506c070637abc3c426bc7d060e3fe6c854d#diff-30054b6fa334216ba6e66c9f07025cd2R35] > subprocess created by bash operator inherits env vars from the parent > process. However, it does not inherit the proper env var from the `airflow > worker` process because we had a bug in the `sudo airflow run --raw` command. > The commit was created to address the bug for bash operator. The bug was > later on fixed in [this > commit|https://github.com/apache/incubator-airflow/commit/354492bc597130f43c76e7bec4bc894fb6deb7fe] > and thus bash operator does not need and should not get AIRFLOW_HOME value > from the config (otherwise there might be discrepancy between the > AIRFLOW_HOME value in the parent process and the child process). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2586) Stop getting AIRFLOW_HOME value from config file in bash operator
Kevin Yang created AIRFLOW-2586: --- Summary: Stop getting AIRFLOW_HOME value from config file in bash operator Key: AIRFLOW-2586 URL: https://issues.apache.org/jira/browse/AIRFLOW-2586 Project: Apache Airflow Issue Type: Bug Reporter: Kevin Yang Assignee: Kevin Yang Before [this commit|https://github.com/apache/incubator-airflow/commit/a0deb506c070637abc3c426bc7d060e3fe6c854d#diff-30054b6fa334216ba6e66c9f07025cd2R35], the subprocess created by the bash operator inherited env vars from the parent process. However, it did not inherit the proper env var from the `airflow worker` process because we had a bug in the `sudo airflow run --raw` command. The commit was created to address that bug for the bash operator. The bug was later fixed in [this commit|https://github.com/apache/incubator-airflow/commit/354492bc597130f43c76e7bec4bc894fb6deb7fe], and thus the bash operator does not need, and should not, get the AIRFLOW_HOME value from the config (otherwise there might be a discrepancy between the AIRFLOW_HOME value in the parent process and the child process). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
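The reasoning above rests on default environment inheritance: a child process receives the parent's environment as-is, so once `airflow run --raw` propagates env vars correctly there is no need to re-read AIRFLOW_HOME from the config file. A small sketch (the path below is purely illustrative):

```python
# A child process launched with an explicit env mapping sees exactly
# the AIRFLOW_HOME the parent set -- no config-file lookup required.
import os
import subprocess
import sys

env = os.environ.copy()
env["AIRFLOW_HOME"] = "/tmp/airflow_home_example"  # hypothetical value

out = subprocess.check_output(
    [sys.executable, "-c", "import os; print(os.environ['AIRFLOW_HOME'])"],
    env=env,
)
```

Re-reading the value from `airflow.cfg` in the child instead would reintroduce the parent/child discrepancy the ticket warns about.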
[jira] [Closed] (AIRFLOW-2497) Cgroup task runner doesn't pass down correct env vars
[ https://issues.apache.org/jira/browse/AIRFLOW-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang closed AIRFLOW-2497. --- Resolution: Fixed > Cgroup task runner doesn't pass down correct env vars > - > > Key: AIRFLOW-2497 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2497 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > From > [https://github.com/apache/incubator-airflow/blob/master/airflow/task/task_runner/base_task_runner.py#L79-L84,] > only PYTHONPATH is propagated to the child process, which make the behavior > of bash task runner and cgroup task runner different as bash task runner > would issue a `bash -c` command that automatically pass all the env var from > the parent process to the subprocess. Cgroup task runner should not behave > different. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2497) Cgroup task runner doesn't pass down correct env vars
[ https://issues.apache.org/jira/browse/AIRFLOW-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483547#comment-16483547 ] Kevin Yang commented on AIRFLOW-2497: - This is actually resolved in https://issues.apache.org/jira/browse/AIRFLOW-2162 > Cgroup task runner doesn't pass down correct env vars > - > > Key: AIRFLOW-2497 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2497 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > From > [https://github.com/apache/incubator-airflow/blob/master/airflow/task/task_runner/base_task_runner.py#L79-L84,] > only PYTHONPATH is propagated to the child process, which make the behavior > of bash task runner and cgroup task runner different as bash task runner > would issue a `bash -c` command that automatically pass all the env var from > the parent process to the subprocess. Cgroup task runner should not behave > different. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (AIRFLOW-2497) Cgroup task runner doesn't pass down correct env vars
[ https://issues.apache.org/jira/browse/AIRFLOW-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang reassigned AIRFLOW-2497: --- Assignee: Kevin Yang > Cgroup task runner doesn't pass down correct env vars > - > > Key: AIRFLOW-2497 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2497 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > From > [https://github.com/apache/incubator-airflow/blob/master/airflow/task/task_runner/base_task_runner.py#L79-L84,] > only PYTHONPATH is propagated to the child process, which make the behavior > of bash task runner and cgroup task runner different as bash task runner > would issue a `bash -c` command that automatically pass all the env var from > the parent process to the subprocess. Cgroup task runner should not behave > different. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2497) Cgroup task runner doesn't pass down correct env vars
Kevin Yang created AIRFLOW-2497: --- Summary: Cgroup task runner doesn't pass down correct env vars Key: AIRFLOW-2497 URL: https://issues.apache.org/jira/browse/AIRFLOW-2497 Project: Apache Airflow Issue Type: Bug Reporter: Kevin Yang From [https://github.com/apache/incubator-airflow/blob/master/airflow/task/task_runner/base_task_runner.py#L79-L84], only PYTHONPATH is propagated to the child process, which makes the behavior of the bash task runner and the cgroup task runner differ: the bash task runner issues a `bash -c` command that automatically passes all the env vars from the parent process to the subprocess. The cgroup task runner should not behave differently. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
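The mismatch can be demonstrated directly (a sketch, not the task-runner source; `EXAMPLE_VAR` is a hypothetical variable used only for the demonstration): a whitelist environment of just PYTHONPATH hides everything else from the child, while passing the full environment matches what `bash -c` inherits.

```python
# Compare a PYTHONPATH-only child environment (old cgroup-runner
# style) with full inheritance (bash-runner style).
import os
import subprocess
import sys

os.environ["EXAMPLE_VAR"] = "visible"

probe = [sys.executable, "-c",
         "import os; print('EXAMPLE_VAR' in os.environ)"]

# Whitelist only PYTHONPATH: the child cannot see EXAMPLE_VAR.
narrow = subprocess.check_output(
    probe, env={"PYTHONPATH": os.environ.get("PYTHONPATH", "")}
).strip()

# Full inheritance: the child sees everything the parent exports.
full = subprocess.check_output(probe, env=os.environ.copy()).strip()
```

Passing `os.environ.copy()` (plus any runner-specific additions) to the cgroup runner's subprocess would make the two runners behave alike.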
[jira] [Created] (AIRFLOW-2463) Make task instance context available for hive queries
Kevin Yang created AIRFLOW-2463: --- Summary: Make task instance context available for hive queries Key: AIRFLOW-2463 URL: https://issues.apache.org/jira/browse/AIRFLOW-2463 Project: Apache Airflow Issue Type: Improvement Reporter: Kevin Yang Assignee: Kevin Yang Currently, hive queries run through HiveOperator() receive the task_instance context as hive_conf. But the context is not available when HiveCliHook()/HiveServer2Hook() is called through PythonOperator(), nor when the hive CLI is called in BashOperator(), nor when HiveServer2Hook() is called in any operator. Having the context available would give users the capability to audit hive queries. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
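One way the context could reach CLI-based hive calls is to render each context entry as a `-hiveconf` argument on the invocation. This is an illustrative sketch only; the helper name is an assumption, not the actual HiveCliHook API (the `airflow.ctx.*` key style mirrors the convention used for hive conf variables).

```python
# Hypothetical helper: turn a task-instance context dict into
# -hiveconf CLI arguments so the query (and audit tooling) can read it.

def context_to_hiveconf_args(context):
    args = []
    for key, value in sorted(context.items()):
        args += ["-hiveconf", "{}={}".format(key, value)]
    return args


args = context_to_hiveconf_args({
    "airflow.ctx.dag_id": "example_dag",
    "airflow.ctx.task_id": "example_task",
})
```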
[jira] [Assigned] (AIRFLOW-2402) Airflow 1.10 Logs UI throws oops error
[ https://issues.apache.org/jira/browse/AIRFLOW-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang reassigned AIRFLOW-2402: --- Assignee: Kevin Yang (was: Ramki Subramanian) > Airflow 1.10 Logs UI throws oops error > -- > > Key: AIRFLOW-2402 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2402 > Project: Apache Airflow > Issue Type: Bug > Components: authentication, ui >Affects Versions: 1.10.0 >Reporter: Ramki Subramanian >Assignee: Kevin Yang >Priority: Major > Fix For: 1.10.0 > > > Hi, > I am getting an error at > [incubator-airflow/airflow/www_rbac/views.py|https://github.com/apache/incubator-airflow/blob/4d64ad4928f0188f7532936e8da6612f5ec7170d/airflow/www_rbac/views.py#L454] > Line 454 in > [4d64ad4|https://github.com/apache/incubator-airflow/commit/4d64ad4928f0188f7532936e8da6612f5ec7170d] > | |logs[i] = log.decode('utf-8')| > > {{/home/user/lib/python2.7/site-packages/airflow/www_rbac/views.py", line > 454, in log logs[i] = log.decode('utf-8') AttributeError: 'list' object has > no attribute 'decode' }} > Not sure if someone is already looking into this, or I am missing some config? > Branch : 1.10_test > More Info here: > [https://github.com/apache/incubator-airflow/commit/05e1861e24de42f9a2c649cd93041c5c744504e1#diff-77df5adb32d964f37748c4557ffb3c4c] > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2402) Airflow 1.10 Logs UI throws oops error
[ https://issues.apache.org/jira/browse/AIRFLOW-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463393#comment-16463393 ] Kevin Yang commented on AIRFLOW-2402: - Hi [~rsubra13], I believe the problem was introduced by my previous [PR|https://github.com/apache/incubator-airflow/pull/3214]. I think you can copy/paste the changes in that PR on www/views.py into www_rbac/views.py. I don't have rbac set up on my side, but if you want me to do it you can assign this ticket to me and I'll follow through. Otherwise you can tag me in the PR and I'll review it with high priority. Cheers, Kevin Y > Airflow 1.10 Logs UI throws oops error > -- > > Key: AIRFLOW-2402 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2402 > Project: Apache Airflow > Issue Type: Bug > Components: authentication, ui >Affects Versions: 1.10.0 >Reporter: Ramki Subramanian >Assignee: Ramki Subramanian >Priority: Major > Fix For: 1.10.0 > > > Hi, > I am getting an error at > [incubator-airflow/airflow/www_rbac/views.py|https://github.com/apache/incubator-airflow/blob/4d64ad4928f0188f7532936e8da6612f5ec7170d/airflow/www_rbac/views.py#L454] > Line 454 in > [4d64ad4|https://github.com/apache/incubator-airflow/commit/4d64ad4928f0188f7532936e8da6612f5ec7170d] > | |logs[i] = log.decode('utf-8')| > > {{/home/user/lib/python2.7/site-packages/airflow/www_rbac/views.py", line > 454, in log logs[i] = log.decode('utf-8') AttributeError: 'list' object has > no attribute 'decode' }} > Not sure if someone is already looking into this, or I am missing some config? > Branch : 1.10_test > More Info here: > [https://github.com/apache/incubator-airflow/commit/05e1861e24de42f9a2c649cd93041c5c744504e1#diff-77df5adb32d964f37748c4557ffb3c4c] > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
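The AttributeError above comes from calling `.decode()` on a value that is now a list of log chunks rather than raw bytes. A defensive sketch of the kind of fix being discussed (illustrative, mirroring rather than copying the www/views.py change):

```python
# Normalize a log value that may arrive as bytes, a list of chunks,
# or already-decoded text, instead of calling .decode() unconditionally.

def normalize_log(log):
    if isinstance(log, bytes):
        return log.decode("utf-8")
    if isinstance(log, list):
        return "\n".join(
            chunk.decode("utf-8") if isinstance(chunk, bytes) else str(chunk)
            for chunk in log
        )
    return str(log)
```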
[jira] [Commented] (AIRFLOW-2363) S3 remote logging appending tuple instead of str
[ https://issues.apache.org/jira/browse/AIRFLOW-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454925#comment-16454925 ] Kevin Yang commented on AIRFLOW-2363: - [~b11c] If it is the case that set_context is not called properly, then this will not resolve 2379. The entry point of the task handler's set_context() method should be here: [https://github.com/apache/incubator-airflow/blob/master/airflow/bin/cli.py#L460], which is called in every `airflow run` command. Not sure how it can be missed. I suspect the reason the log is not being uploaded is the bug being fixed in this issue, but I might need your help to confirm that. > S3 remote logging appending tuple instead of str > > > Key: AIRFLOW-2363 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2363 > Project: Apache Airflow > Issue Type: Bug > Components: logging >Reporter: Kyle Hamlin >Priority: Major > Fix For: 1.10.0 > > > A recent merge into master that added support for Elasticsearch logging seems > to have broken S3 logging by returning a tuple instead of a string. 
> [https://github.com/apache/incubator-airflow/commit/ec38ba9594395de04ec932481212a86fbe9ae107#diff-0442332ecbe42ebbf426911c68d8cd4aR128] > > following errors thrown: > > *Session NoneType error* > Traceback (most recent call last): > File > "/usr/local/lib/python3.6/site-packages/airflow/utils/log/s3_task_handler.py", > line 171, in s3_write > encrypt=configuration.conf.getboolean('core', 'ENCRYPT_S3_LOGS'), > File "/usr/local/lib/python3.6/site-packages/airflow/hooks/S3_hook.py", > line 274, in load_string > encrypt=encrypt) > File "/usr/local/lib/python3.6/site-packages/airflow/hooks/S3_hook.py", > line 313, in load_bytes > client = self.get_conn() > File "/usr/local/lib/python3.6/site-packages/airflow/hooks/S3_hook.py", > line 34, in get_conn > return self.get_client_type('s3') > File > "/usr/local/lib/python3.6/site-packages/airflow/contrib/hooks/aws_hook.py", > line 151, in get_client_type > session, endpoint_url = self._get_credentials(region_name) > File > "/usr/local/lib/python3.6/site-packages/airflow/contrib/hooks/aws_hook.py", > line 97, in _get_credentials > connection_object = self.get_connection(self.aws_conn_id) > File "/usr/local/lib/python3.6/site-packages/airflow/hooks/base_hook.py", > line 82, in get_connection > conn = random.choice(cls.get_connections(conn_id)) > File "/usr/local/lib/python3.6/site-packages/airflow/hooks/base_hook.py", > line 77, in get_connections > conns = cls._get_connections_from_db(conn_id) > File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line > 72, in wrapper > with create_session() as session: > File "/usr/local/lib/python3.6/contextlib.py", line 81, in __enter__ > return next(self.gen) > File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line > 41, in create_session > session = settings.Session() > TypeError: 'NoneType' object is not callable > > *TypeError must be str not tuple* > [2018-04-16 18:37:28,200] ERROR in app: Exception on > /admin/airflow/get_logs_with_metadata [GET] > 
Traceback (most recent call last): > File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1982, in > wsgi_app > response = self.full_dispatch_request() > File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1614, in > full_dispatch_request > rv = self.handle_user_exception(e) > File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1517, in > handle_user_exception > reraise(exc_type, exc_value, tb) > File "/usr/local/lib/python3.6/site-packages/flask/_compat.py", line 33, > in reraise > raise value > File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1612, in > full_dispatch_request > rv = self.dispatch_request() > File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1598, in > dispatch_request > return self.view_functions[rule.endpoint](**req.view_args) > File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line > 69, in inner > return self._run_view(f, *args, **kwargs) > File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line > 368, in _run_view > return fn(self, *args, **kwargs) > File "/usr/local/lib/python3.6/site-packages/flask_login.py", line 755, in > decorated_view > return func(*args, **kwargs) > File "/usr/local/lib/python3.6/site-packages/airflow/www/utils.py", line > 269, in wrapper > return f(*args, **kwargs) > File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line > 74, in wrapper > return func(*args, **kwargs) > File
[jira] [Closed] (AIRFLOW-2383) Escape colon in partition name when poking inside NamedHivePartitionSensor
[ https://issues.apache.org/jira/browse/AIRFLOW-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang closed AIRFLOW-2383. --- Resolution: Invalid > Escape colon in partition name when poking inside NamedHivePartitionSensor > -- > > Key: AIRFLOW-2383 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2383 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > colon in NamedHivePartitionSensor is not escaping colon in the partition name > causing different behavior than HivePartitionSensor if there's colon in the > partition name. Need to escape it to `%3A` -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2383) Escape colon in partition name when poking inside NamedHivePartitionSensor
[ https://issues.apache.org/jira/browse/AIRFLOW-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454840#comment-16454840 ] Kevin Yang commented on AIRFLOW-2383: - When using partition names, users are supposed to specify escaped values, closing the jira. > Escape colon in partition name when poking inside NamedHivePartitionSensor > -- > > Key: AIRFLOW-2383 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2383 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > colon in NamedHivePartitionSensor is not escaping colon in the partition name > causing different behavior than HivePartitionSensor if there's colon in the > partition name. Need to escape it to `%3A` -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (AIRFLOW-2383) Escape colon in partition name when poking inside NamedHivePartitionSensor
[ https://issues.apache.org/jira/browse/AIRFLOW-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang reassigned AIRFLOW-2383: --- Assignee: Kevin Yang > Escape colon in partition name when poking inside NamedHivePartitionSensor > -- > > Key: AIRFLOW-2383 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2383 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > colon in NamedHivePartitionSensor is not escaping colon in the partition name > causing different behavior than HivePartitionSensor if there's colon in the > partition name. Need to escape it to `%3A` -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2383) Escape colon in partition name when poking inside NamedHivePartitionSensor
Kevin Yang created AIRFLOW-2383: --- Summary: Escape colon in partition name when poking inside NamedHivePartitionSensor Key: AIRFLOW-2383 URL: https://issues.apache.org/jira/browse/AIRFLOW-2383 Project: Apache Airflow Issue Type: Bug Reporter: Kevin Yang NamedHivePartitionSensor does not escape colons in the partition name, causing different behavior from HivePartitionSensor when the partition name contains a colon. The colon needs to be escaped as `%3A`. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
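The escaping described above can be sketched with the standard library: percent-encoding the partition value turns each colon into `%3A` (the partition string below is illustrative).

```python
# Percent-encode the value part of a partition spec so a colon in a
# timestamp-like value is sent as %3A.
from urllib.parse import quote

raw = "dt=2018-01-01T00:00:00"
key, _, value = raw.partition("=")
escaped = key + "=" + quote(value, safe="")
```

With `safe=""`, every reserved character including `:` is encoded, while unreserved characters such as digits and `-` pass through unchanged.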
[jira] [Commented] (AIRFLOW-2374) Airflow fails to show logs
[ https://issues.apache.org/jira/browse/AIRFLOW-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454663#comment-16454663 ] Kevin Yang commented on AIRFLOW-2374: - Hi [~b11c], I think the BUG is handled in the following jira. https://issues.apache.org/jira/browse/AIRFLOW-2363 > Airflow fails to show logs > -- > > Key: AIRFLOW-2374 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2374 > Project: Apache Airflow > Issue Type: Bug >Reporter: Berislav Lopac >Assignee: Berislav Lopac >Priority: Blocker > > When viewing a log in the webserver, the page shows a loading gif and the log > never appears. Looking in the Javascript console, the problem appears to be > error 500 when loading the {{get_logs_with_metadata}} endpoint, givving the > following trace: > {code:java} > / ( () ) \___ > /( ( ( ) _)) ) )\ >(( ( )() ) ( ) ) > ((/ ( _( ) ( _) ) ( () ) ) > ( ( ( (_) ((( ) .((_ ) . )_ >( ( )( ( )) ) . ) ( ) > ( ( ( ( ) ( _ ( _) ). ) . ) ) ( ) > ( ( ( ) ( ) ( )) ) _)( ) ) ) > ( ( ( \ ) ((_ ( ) ( ) ) ) ) )) ( ) > ( ( ( ( (_ ( ) ( _) ) ( ) ) ) > ( ( ( ( ( ) (_ ) ) ) _) ) _( ( ) > (( ( )(( _) _) _(_ ( (_ ) >(_((__(_(__(( ( ( | ) ) ) )_))__))_)___) >((__)\\||lll|l||/// \_)) > ( /(/ ( ) ) )\ ) > (( ( ( | | ) ) )\ ) >( /(| / ( )) ) ) )) ) > ( ( _(|)_) ) > ( ||\(|(|)|/|| ) > (|(||(||)) > ( //|/l|||)|\\ \ ) > (/ / // /|//\\ \ \ \ _) > --- > Node: airflow-nods-dev > --- > Traceback (most recent call last): > File > "/opt/airflow/src/apache-airflow/airflow/utils/log/gcs_task_handler.py", line > 113, in _read > remote_log = self.gcs_read(remote_loc) > File > "/opt/airflow/src/apache-airflow/airflow/utils/log/gcs_task_handler.py", line > 131, in gcs_read > return self.hook.download(bkt, blob).decode() > File "/opt/airflow/src/apache-airflow/airflow/contrib/hooks/gcs_hook.py", > line 107, in download > .get_media(bucket=bucket, object=object) \ > File "/usr/local/lib/python3.6/dist-packages/oauth2client/_helpers.py", > line 133, in positional_wrapper > return 
wrapped(*args, **kwargs) > File "/usr/local/lib/python3.6/dist-packages/googleapiclient/http.py", line > 841, in execute > raise HttpError(resp, content, uri=self.uri) > googleapiclient.errors.HttpError: https://www.googleapis.com/storage/v1/b/bucket-af/o/test-logs%2Fgeneric_transfer_single%2Ftransfer_file%2F2018-04-25T13%3A00%3A51.250983%2B00%3A00%2F1.log?alt=media > returned "Not Found"> > During handling of the above exception, another exception occurred: > Traceback (most recent call last): > File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1982, in > wsgi_app > response = self.full_dispatch_request() > File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1614, in > full_dispatch_request > rv = self.handle_user_exception(e) > File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1517, in > handle_user_exception > reraise(exc_type, exc_value, tb) > File "/usr/local/lib/python3.6/dist-packages/flask/_compat.py", line 33, in > reraise > raise value > File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1612, in > full_dispatch_request > rv = self.dispatch_request() > File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1598, in > dispatch_request > return self.view_functions[rule.endpoint](**req.view_args) > File "/usr/local/lib/python3.6/dist-packages/flask_admin/base.py", line 69, > in inner > return self._run_view(f, *args, **kwargs) > File "/usr/local/lib/python3.6/dist-packages/flask_admin/base.py", line > 368, in _run_view > return fn(self, *args, **kwargs) > File "/usr/local/lib/python3.6/dist-packages/flask_login.py", line 758, in > decorated_view > return func(*args, **kwargs) > File "/opt/airflow/src/apache-airflow/airflow/www/utils.py", line 269, in > wrapper > return f(*args, **kwargs) > File
[jira] [Commented] (AIRFLOW-2363) S3 remote logging appending tuple instead of str
[ https://issues.apache.org/jira/browse/AIRFLOW-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454406#comment-16454406 ] Kevin Yang commented on AIRFLOW-2363: - [~hamlinkn] Do you mind try out the change in this PR [https://github.com/apache/incubator-airflow/pull/3259] and see if the issue was resolved? > S3 remote logging appending tuple instead of str > > > Key: AIRFLOW-2363 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2363 > Project: Apache Airflow > Issue Type: Bug > Components: logging >Reporter: Kyle Hamlin >Assignee: Kevin Yang >Priority: Major > Fix For: 1.10.0 > > > A recent merge into master that added support for Elasticsearch logging seems > to have broken S3 logging by returning a tuple instead of a string. > [https://github.com/apache/incubator-airflow/commit/ec38ba9594395de04ec932481212a86fbe9ae107#diff-0442332ecbe42ebbf426911c68d8cd4aR128] > > following errors thrown: > > *Session NoneType error* > Traceback (most recent call last): > File > "/usr/local/lib/python3.6/site-packages/airflow/utils/log/s3_task_handler.py", > line 171, in s3_write > encrypt=configuration.conf.getboolean('core', 'ENCRYPT_S3_LOGS'), > File "/usr/local/lib/python3.6/site-packages/airflow/hooks/S3_hook.py", > line 274, in load_string > encrypt=encrypt) > File "/usr/local/lib/python3.6/site-packages/airflow/hooks/S3_hook.py", > line 313, in load_bytes > client = self.get_conn() > File "/usr/local/lib/python3.6/site-packages/airflow/hooks/S3_hook.py", > line 34, in get_conn > return self.get_client_type('s3') > File > "/usr/local/lib/python3.6/site-packages/airflow/contrib/hooks/aws_hook.py", > line 151, in get_client_type > session, endpoint_url = self._get_credentials(region_name) > File > "/usr/local/lib/python3.6/site-packages/airflow/contrib/hooks/aws_hook.py", > line 97, in _get_credentials > connection_object = self.get_connection(self.aws_conn_id) > File "/usr/local/lib/python3.6/site-packages/airflow/hooks/base_hook.py", > line 
82, in get_connection > conn = random.choice(cls.get_connections(conn_id)) > File "/usr/local/lib/python3.6/site-packages/airflow/hooks/base_hook.py", > line 77, in get_connections > conns = cls._get_connections_from_db(conn_id) > File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line > 72, in wrapper > with create_session() as session: > File "/usr/local/lib/python3.6/contextlib.py", line 81, in __enter__ > return next(self.gen) > File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line > 41, in create_session > session = settings.Session() > TypeError: 'NoneType' object is not callable > > *TypeError must be str not tuple* > [2018-04-16 18:37:28,200] ERROR in app: Exception on > /admin/airflow/get_logs_with_metadata [GET] > Traceback (most recent call last): > File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1982, in > wsgi_app > response = self.full_dispatch_request() > File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1614, in > full_dispatch_request > rv = self.handle_user_exception(e) > File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1517, in > handle_user_exception > reraise(exc_type, exc_value, tb) > File "/usr/local/lib/python3.6/site-packages/flask/_compat.py", line 33, > in reraise > raise value > File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1612, in > full_dispatch_request > rv = self.dispatch_request() > File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1598, in > dispatch_request > return self.view_functions[rule.endpoint](**req.view_args) > File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line > 69, in inner > return self._run_view(f, *args, **kwargs) > File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line > 368, in _run_view > return fn(self, *args, **kwargs) > File "/usr/local/lib/python3.6/site-packages/flask_login.py", line 755, in > decorated_view > return func(*args, **kwargs) > File 
"/usr/local/lib/python3.6/site-packages/airflow/www/utils.py", line > 269, in wrapper > return f(*args, **kwargs) > File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line > 74, in wrapper > return func(*args, **kwargs) > File "/usr/local/lib/python3.6/site-packages/airflow/www/views.py", line > 770, in get_logs_with_metadata > logs, metadatas = handler.read(ti, try_number, metadata=metadata) > File > "/usr/local/lib/python3.6/site-packages/airflow/utils/log/file_task_handler.py", > line 165, in read > logs[i] += log > TypeError: must be str, not tuple
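The `TypeError: must be str, not tuple` at the bottom of the second trace comes from `logs[i] += log` in `file_task_handler.py` once handlers started returning `(log, metadata)` pairs after the Elasticsearch logging change. A minimal standalone sketch of the mismatch and one possible fix; the function names here are illustrative, not Airflow's actual code:

```python
# Minimal sketch (illustrative names, not Airflow's actual code) of the
# failure: after the Elasticsearch logging change, handlers return
# (log, metadata) tuples, but the caller still appends the raw return
# value to a list of strings with `logs[i] += log`.

def read_log_old():
    # Pre-change behavior: the handler returns a plain string.
    return "task log line\n"

def read_log_new():
    # Post-change behavior: the handler returns a (log, metadata) tuple,
    # which is what makes `str += tuple` raise
    # "TypeError: must be str, not tuple".
    return ("task log line\n", {"end_of_log": True})

def render_logs(reader):
    logs = [""]
    chunk = reader()
    # One possible fix: unpack the tuple before appending.
    if isinstance(chunk, tuple):
        log, _metadata = chunk
    else:
        log = chunk
    logs[0] += log
    return logs

print(render_logs(read_log_new))
```

The unpacking guard keeps old string-returning handlers working while tolerating the new tuple-returning ones.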
[jira] [Commented] (AIRFLOW-2363) S3 remote logging appending tuple instead of str
[ https://issues.apache.org/jira/browse/AIRFLOW-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16453663#comment-16453663 ] Kevin Yang commented on AIRFLOW-2363: - [~jdavidh], it was a tricky one to debug. I actually think the NoneType error existed even before 5cb530b455be54e6b58eae19c8c10ef8f5cf955d was merged (at least in my naive setup with S3). That error blocks one of the upload attempts (there are actually multiple attempts, one whenever the S3 task handler is closed); the attempt that was not blocked got removed by 5cb530b455be54e6b58eae19c8c10ef8f5cf955d, and I made a fix for it in the PR. My working theory is that the upload invoked in atexit() was killed when the subprocess ended, so it could never finish (according to my debugging logs). But I believe there is more to this task-handler closing issue, and it needs more work to be perfect. I'm going to stop here due to a priority change, but I would be very curious to hear all the details if you decide to dig to the bottom of it. > S3 remote logging appending tuple instead of str > > > Key: AIRFLOW-2363 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2363 > Project: Apache Airflow > Issue Type: Bug > Components: logging >Reporter: Kyle Hamlin >Assignee: Kevin Yang >Priority: Major > Fix For: 1.10.0 > > > A recent merge into master that added support for Elasticsearch logging seems > to have broken S3 logging by returning a tuple instead of a string.
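The first traceback fails earlier, at `settings.Session()`: `'NoneType' object is not callable` means the module-level session factory was never configured in that process. A simplified sketch of that failure mode and a guard; all names here are stand-ins, not Airflow's real `settings` module:

```python
# Simplified sketch of the first traceback's root condition: the module-level
# session factory is None until configure_orm() runs, so any code path that
# reaches create_session() in a process that never configured the ORM ends up
# calling None. Stand-in names, not Airflow's real settings module.

class settings:
    Session = None  # set by configure_orm(); None in an unconfigured process

def configure_orm():
    # Stand-in for building the real SQLAlchemy sessionmaker.
    settings.Session = lambda: {"open": True}

def create_session():
    if settings.Session is None:
        # Guard with a clear error instead of the opaque
        # "TypeError: 'NoneType' object is not callable".
        raise RuntimeError("ORM not configured: call configure_orm() first")
    return settings.Session()

configure_orm()
print(create_session())
```

This matches the comment's theory: a forked or atexit-time context that never ran the ORM setup hits the None factory.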
[jira] [Commented] (AIRFLOW-2363) S3 remote logging appending tuple instead of str
[ https://issues.apache.org/jira/browse/AIRFLOW-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16453189#comment-16453189 ] Kevin Yang commented on AIRFLOW-2363: - It seems the ORM is somehow not configured. The root cause is not obvious from the trace, so I'll set up an S3 environment on my side to debug. Sorry for any trouble the bug might cause. > S3 remote logging appending tuple instead of str > > > Key: AIRFLOW-2363 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2363 > Project: Apache Airflow > Issue Type: Bug > Components: logging >Reporter: Kyle Hamlin >Assignee: Kevin Yang >Priority: Major > Fix For: 1.10.0 > > > A recent merge into master that added support for Elasticsearch logging seems > to have broken S3 logging by returning a tuple instead of a string.
[jira] [Created] (AIRFLOW-2373) Do not run tasks when DagRun state is not running
Kevin Yang created AIRFLOW-2373: --- Summary: Do not run tasks when DagRun state is not running Key: AIRFLOW-2373 URL: https://issues.apache.org/jira/browse/AIRFLOW-2373 Project: Apache Airflow Issue Type: Improvement Reporter: Kevin Yang Logically it might make sense to stop tasks from being started when the DagRun is not running. Note that this will affect the ability to run a task from the UI; it might make sense to add an additional ignore option in the UI when running tasks manually. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
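The proposed behavior could be sketched as a simple gate with an explicit override flag for manual UI runs; the names here are hypothetical, not Airflow's actual dependency-check classes:

```python
# Hypothetical sketch of the proposed gate (not Airflow's actual dependency
# checks): a task may start only when its DagRun is in the RUNNING state,
# unless the user explicitly overrides, e.g. when running a task manually
# from the UI.

RUNNING, FAILED, SUCCESS = "running", "failed", "success"

def can_run_task(dag_run_state, ignore_dag_run_state=False):
    if ignore_dag_run_state:
        return True  # the proposed UI "ignore" option
    return dag_run_state == RUNNING

print(can_run_task(FAILED))
print(can_run_task(FAILED, ignore_dag_run_state=True))
```

The override flag preserves the ability to run single tasks by hand, which the issue notes would otherwise be lost.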
[jira] [Commented] (AIRFLOW-2363) S3 remote logging appending tuple instead of str
[ https://issues.apache.org/jira/browse/AIRFLOW-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16449156#comment-16449156 ] Kevin Yang commented on AIRFLOW-2363: - Hi [~hamlinkn], I've created a PR to fix it, but I think I didn't tag you correctly on GitHub: [https://github.com/apache/incubator-airflow/pull/3259] Would it be possible for you to test it end to end in your infra? It would take some extra work to set up S3 on my side, and I think you might benefit from a faster merge. Thank you! > S3 remote logging appending tuple instead of str > > > Key: AIRFLOW-2363 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2363 > Project: Apache Airflow > Issue Type: Bug > Components: logging >Reporter: Kyle Hamlin >Assignee: Kevin Yang >Priority: Major > Fix For: 1.10.0 > > > A recent merge into master that added support for Elasticsearch logging seems > to have broken S3 logging by returning a tuple instead of a string.
[jira] [Assigned] (AIRFLOW-2363) S3 remote logging appending tuple instead of str
[ https://issues.apache.org/jira/browse/AIRFLOW-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang reassigned AIRFLOW-2363: --- Assignee: Kevin Yang > S3 remote logging appending tuple instead of str > > > Key: AIRFLOW-2363 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2363 > Project: Apache Airflow > Issue Type: Bug > Components: logging >Reporter: Kyle Hamlin >Assignee: Kevin Yang >Priority: Major > Fix For: 1.10.0 > > > A recent merge into master that added support for Elasticsearch logging seems > to have broken S3 logging by returning a tuple instead of a string. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (AIRFLOW-1819) Fix slack operator unittest bug
[ https://issues.apache.org/jira/browse/AIRFLOW-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang resolved AIRFLOW-1819. - Resolution: Fixed > Fix slack operator unittest bug > --- > > Key: AIRFLOW-1819 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1819 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > slack_operator.py unittest is failing and is not covering code paths for > passing in api_params. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (AIRFLOW-1805) Allow to supply Slack token through connection
[ https://issues.apache.org/jira/browse/AIRFLOW-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang resolved AIRFLOW-1805. - Resolution: Fixed > Allow to supply Slack token through connection > -- > > Key: AIRFLOW-1805 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1805 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > To prevent passing in Slack token directly in plain text, it is safer to pass > in the token as 'password' through connection. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (AIRFLOW-1787) Fix batch clear RUNNING task instance and inconsistent timestamp format bugs
[ https://issues.apache.org/jira/browse/AIRFLOW-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang resolved AIRFLOW-1787. - Resolution: Fixed > Fix batch clear RUNNING task instance and inconsistent timestamp format bugs > > > Key: AIRFLOW-1787 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1787 > Project: Apache Airflow > Issue Type: Bug > Components: webserver >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > Fix For: 1.10.0 > > > * Batch clear in CRUD is not working for task instances in RUNNING state, > need to be fixed > * Batch clear and set status are not working for manually triggered task > instances because manually triggered task instances have different execution > date format. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2359) Add set failed for DagRun and TaskInstance in tree view
Kevin Yang created AIRFLOW-2359: --- Summary: Add set failed for DagRun and TaskInstance in tree view Key: AIRFLOW-2359 URL: https://issues.apache.org/jira/browse/AIRFLOW-2359 Project: Apache Airflow Issue Type: Improvement Reporter: Kevin Yang Assignee: Kevin Yang Users have been requesting the ability to set DagRun and TaskInstance state to failed from the tree view. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (AIRFLOW-2202) Support filter in HiveMetastoreHook().max_partition()
[ https://issues.apache.org/jira/browse/AIRFLOW-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang resolved AIRFLOW-2202. - Resolution: Fixed Fixed by https://github.com/apache/incubator-airflow/pull/3117 > Support filter in HiveMetastoreHook().max_partition() > -- > > Key: AIRFLOW-2202 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2202 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Priority: Major > > The change made in https://issues.apache.org/jira/browse/AIRFLOW-2150 removed > support for filters in max_partition(), which is a valid use case, so we're > adding it back. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
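The combined idea from AIRFLOW-2150 and AIRFLOW-2202 (cheap partition names instead of full partition objects, plus filter support) can be sketched as follows; this helper and the `filter_map` argument name are illustrative assumptions, not the real `HiveMetastoreHook` API:

```python
# Self-contained sketch: compute the max partition from cheap partition
# *name* strings rather than full partition objects, while still supporting
# a filter over the other partition keys. Illustrative only, not the real
# HiveMetastoreHook.max_partition() implementation.

def max_partition(partition_names, field, filter_map=None):
    best = None
    for name in partition_names:
        # A Hive partition name looks like "ds=2018-01-01/hour=3".
        parts = dict(kv.split("=", 1) for kv in name.split("/"))
        if filter_map and any(parts.get(k) != v for k, v in filter_map.items()):
            continue  # partition does not match the requested filter
        value = parts.get(field)
        if value is not None and (best is None or value > best):
            best = value
    return best

names = ["ds=2018-01-01/hour=3", "ds=2018-01-02/hour=1", "ds=2018-01-03/hour=2"]
print(max_partition(names, "ds"))                 # latest ds overall
print(max_partition(names, "ds", {"hour": "1"}))  # latest ds where hour=1
```

Working over name strings avoids fetching full partition metadata, which is what made get_partitions() so expensive on large tables.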
[jira] [Resolved] (AIRFLOW-2150) Use get_partition_names() instead of get_partitions() in HiveMetastoreHook().max_partition()
[ https://issues.apache.org/jira/browse/AIRFLOW-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang resolved AIRFLOW-2150. - Resolution: Fixed > Use get_partition_names() instead of get_partitions() in > HiveMetastoreHook().max_partition() > > > Key: AIRFLOW-2150 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2150 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > get_partitions() is extremely expensive for large tables, max_partition() > should be using get_partition_names() instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (AIRFLOW-2150) Use get_partition_names() instead of get_partitions() in HiveMetastoreHook().max_partition()
[ https://issues.apache.org/jira/browse/AIRFLOW-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on AIRFLOW-2150 started by Kevin Yang. --- > Use get_partition_names() instead of get_partitions() in > HiveMetastoreHook().max_partition() > > > Key: AIRFLOW-2150 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2150 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > get_partitions() is extremely expensive for large tables, max_partition() > should be using get_partition_names() instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2150) Use get_partition_names() instead of get_partitions() in HiveMetastoreHook().max_partition()
Kevin Yang created AIRFLOW-2150: --- Summary: Use get_partition_names() instead of get_partitions() in HiveMetastoreHook().max_partition() Key: AIRFLOW-2150 URL: https://issues.apache.org/jira/browse/AIRFLOW-2150 Project: Apache Airflow Issue Type: Bug Reporter: Kevin Yang Assignee: Kevin Yang get_partitions() is extremely expensive for large tables, max_partition() should be using get_partition_names() instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-1805) Allow to supply Slack token through connection
[ https://issues.apache.org/jira/browse/AIRFLOW-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254572#comment-16254572 ] Kevin Yang commented on AIRFLOW-1805: - The bug in this issue is fixed in AIRFLOW-1819. > Allow to supply Slack token through connection > -- > > Key: AIRFLOW-1805 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1805 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Kevin Yang >Assignee: Kevin Yang > > To prevent passing in Slack token directly in plain text, it is safer to pass > in the token as 'password' through connection. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (AIRFLOW-1819) Fix slack operator unittest bug
[ https://issues.apache.org/jira/browse/AIRFLOW-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254570#comment-16254570 ] Kevin Yang commented on AIRFLOW-1819: - This JIRA fixes a bug in AIRFLOW-1805. > Fix slack operator unittest bug > --- > > Key: AIRFLOW-1819 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1819 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Assignee: Kevin Yang > > slack_operator.py unittest is failing and is not covering code paths for > passing in api_params. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Work started] (AIRFLOW-1805) Allow to supply Slack token through connection
[ https://issues.apache.org/jira/browse/AIRFLOW-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on AIRFLOW-1805 started by Kevin Yang. --- > Allow to supply Slack token through connection > -- > > Key: AIRFLOW-1805 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1805 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Kevin Yang >Assignee: Kevin Yang > > To prevent passing in Slack token directly in plain text, it is safer to pass > in the token as 'password' through connection. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Work started] (AIRFLOW-1819) Fix slack operator unittest bug
[ https://issues.apache.org/jira/browse/AIRFLOW-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on AIRFLOW-1819 started by Kevin Yang. --- > Fix slack operator unittest bug > --- > > Key: AIRFLOW-1819 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1819 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Assignee: Kevin Yang > > slack_operator.py unittest is failing and is not covering code paths for > passing in api_params. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (AIRFLOW-1819) Fix slack operator unittest bug
[ https://issues.apache.org/jira/browse/AIRFLOW-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang reassigned AIRFLOW-1819: --- Assignee: Kevin Yang > Fix slack operator unittest bug > --- > > Key: AIRFLOW-1819 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1819 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Assignee: Kevin Yang > > slack_operator.py unittest is failing and is not covering code paths for > passing in api_params. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (AIRFLOW-1819) Fix slack operator unittest bug
Kevin Yang created AIRFLOW-1819: --- Summary: Fix slack operator unittest bug Key: AIRFLOW-1819 URL: https://issues.apache.org/jira/browse/AIRFLOW-1819 Project: Apache Airflow Issue Type: Bug Reporter: Kevin Yang slack_operator.py unittest is failing and is not covering code paths for passing in api_params. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (AIRFLOW-1805) Allow to supply Slack token through connection
[ https://issues.apache.org/jira/browse/AIRFLOW-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang reassigned AIRFLOW-1805: --- Assignee: Kevin Yang > Allow to supply Slack token through connection > -- > > Key: AIRFLOW-1805 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1805 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Kevin Yang >Assignee: Kevin Yang > > To prevent passing in Slack token directly in plain text, it is safer to pass > in the token as 'password' through connection. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (AIRFLOW-1805) Allow to supply Slack token through connection
Kevin Yang created AIRFLOW-1805: --- Summary: Allow to supply Slack token through connection Key: AIRFLOW-1805 URL: https://issues.apache.org/jira/browse/AIRFLOW-1805 Project: Apache Airflow Issue Type: Improvement Reporter: Kevin Yang To prevent passing in Slack token directly in plain text, it is safer to pass in the token as 'password' through connection. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
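The pattern being proposed can be sketched with simplified stand-ins for Airflow's Connection model; the connection id `slack_default` and the helper below are illustrative, not the actual operator API:

```python
# Sketch of the pattern described above: the Slack token lives in a
# connection's `password` field and is resolved at runtime, so DAG files
# never carry it in plain text. Simplified stand-ins, not Airflow's real
# Connection class or Slack operator.

class Connection:
    def __init__(self, conn_id, password):
        self.conn_id = conn_id
        self.password = password

# Stand-in for the connections normally stored in Airflow's metadata DB.
CONNECTIONS = {"slack_default": Connection("slack_default", "xoxb-secret-token")}

def resolve_token(token=None, slack_conn_id=None):
    if token is not None:
        return token  # direct token still supported for backward compatibility
    if slack_conn_id is not None:
        return CONNECTIONS[slack_conn_id].password
    raise ValueError("either token or slack_conn_id must be supplied")

print(resolve_token(slack_conn_id="slack_default"))
```

Keeping the explicit-token path makes the change backward compatible while steering users toward the connection-based secret.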
[jira] [Created] (AIRFLOW-1787) Fix batch clear RUNNING task instance and inconsistent timestamp format bugs
Kevin Yang created AIRFLOW-1787: --- Summary: Fix batch clear RUNNING task instance and inconsistent timestamp format bugs Key: AIRFLOW-1787 URL: https://issues.apache.org/jira/browse/AIRFLOW-1787 Project: Apache Airflow Issue Type: Bug Components: webserver Reporter: Kevin Yang Assignee: Kevin Yang * Batch clear in CRUD is not working for task instances in RUNNING state, need to be fixed * Batch clear and set status are not working for manually triggered task instances because manually triggered task instances have different execution date format. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
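The timestamp half of the bug can be illustrated with a small sketch; the two formats shown are assumptions about how scheduled vs. manually triggered execution dates differ, and parsing before comparing is one way to make batch UI actions format-agnostic:

```python
# Sketch of the inconsistency described above: if scheduled runs carry
# execution dates like "2017-11-09T00:00:00" while manually triggered runs
# include microseconds, raw string matching silently misses one of them.
# Parsing to datetime sidesteps the format difference. Formats are assumed.

from datetime import datetime

def parse_execution_date(s):
    for fmt in ("%Y-%m-%dT%H:%M:%S.%f", "%Y-%m-%dT%H:%M:%S"):
        try:
            return datetime.strptime(s, fmt)
        except ValueError:
            continue
    raise ValueError("unrecognized execution date: %r" % s)

# Same instant, two textual representations:
print(parse_execution_date("2017-11-09T00:00:00")
      == parse_execution_date("2017-11-09T00:00:00.000000"))
```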
[jira] [Assigned] (AIRFLOW-1681) Create way to batch retry task instances in the CRUD
[ https://issues.apache.org/jira/browse/AIRFLOW-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang reassigned AIRFLOW-1681: --- Assignee: Kevin Yang > Create way to batch retry task instances in the CRUD > > > Key: AIRFLOW-1681 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1681 > Project: Apache Airflow > Issue Type: Bug > Components: webserver >Reporter: Dan Davydov >Assignee: Kevin Yang > > The old way to batch retry tasks was to select them on the Task Instances > page on the webserver and do a With Selected -> Delete. > This no longer works as you will get overlapping task instance logs (e.g. the > first retry log will be placed in the same location as the first try log). We > need an option in the crud called With Selected -> Retry that does the same > thing as With Selected -> Delete but follows the logic for task clearing > (sets state to none, increases max_tries). Once this feature is stable With > Selected -> Delete should probably be removed as it leads to bad states > with the logs. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
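The clearing semantics the issue asks "With Selected -> Retry" to follow ("sets state to none, increases max_tries") might look roughly like this; the classes below are a hypothetical simplified model, not Airflow's actual TaskInstance:

```python
# Hypothetical simplified model of the clearing semantics described above:
# "delete" drops the row, so the next run reuses try number 1 and overwrites
# the earlier attempt's log, whereas "retry" resets the state to None and
# increases max_tries so the next attempt gets a fresh try number.

class TaskInstance:
    def __init__(self, try_number=1, max_tries=1):
        self.state = "failed"
        self.try_number = try_number
        self.max_tries = max_tries

def clear_for_retry(ti):
    ti.state = None     # a None state lets the scheduler pick the task up again
    ti.max_tries += 1   # extend the retry budget past the attempts already used
    return ti

ti = clear_for_retry(TaskInstance(try_number=2, max_tries=2))
print(ti.state, ti.max_tries)
```

Because the row survives, the try number keeps incrementing and each attempt's log lands in its own location, which is exactly what deletion broke.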