[ https://issues.apache.org/jira/browse/AIRFLOW-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980031#comment-16980031 ]

ASF GitHub Bot commented on AIRFLOW-6033:
-----------------------------------------

drexpp commented on pull request #6634: This commit fixes [AIRFLOW-6033] UI 
crashes on "Landing Times"
URL: https://github.com/apache/airflow/pull/6634
 
 
   ### Adding our company to _"Who uses Apache Airflow?"_
   
   Hello everyone, could our company be added to the **_"Who uses 
Apache Airflow?"_** section? We are a small team working at 
[Endesa](https://www.endesa.com/), a large Spanish electricity distributor and 
part of [Enel](https://www.enel.com/). I wrote some pipelines to automate the 
ETL processes we run with Hadoop / Spark, so I believe it would be great for 
our team.
   
   [Endesa](https://www.endesa.com/) [@drexpp]
   ___
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [X] My PR addresses the following Airflow Jira Issue 
[AIRFLOW-6033](https://issues.apache.org/jira/browse/AIRFLOW-6033) and 
references it in the PR title.
    
   ### Description
   
   I targeted v1-10-stable since I believe that is what @ashb recommended in 
my last PR.
   
   - [X] Here are some details about my PR:
   
   
   The Airflow UI crashes in the browser, returning an "Oops" message and the 
traceback of the error.
   
   This is caused by changing the capitalization of a task_id. Here are some 
examples that will cause Airflow to crash:
   
   - task_id = "DUMMY_TASK" to task_id = "dUMMY_TASK"
   - task_id = "Dummy_Task" to task_id = "dummy_Task" or "Dummy_task", ...
   - task_id = "Dummy_task" to task_id = "Dummy_tASk"
   ___
   #### File causing the problem:  
https://github.com/apache/airflow/blob/master/airflow/www/views.py (lines 1643 
- 1654)
   ```python
   for task in dag.tasks:
       y[task.task_id] = []
       x[task.task_id] = []
   
       for ti in task.get_task_instances(start_date=min_date, end_date=base_date):
           ts = ti.execution_date
           if dag.schedule_interval and dag.following_schedule(ts):
               ts = dag.following_schedule(ts)
           if ti.end_date:
               dttm = wwwutils.epoch(ti.execution_date)
               secs = (ti.end_date - ts).total_seconds()
               x[ti.task_id].append(dttm)
               y[ti.task_id].append(secs)
   ```
   ___
   
   In the first two lines inside the outer for loop, we can see how the 
dictionaries x and y are filled with keys taken from the task_id attributes of 
the tasks in the current DAG.
   
   The problem appears in the inner for loop, which fetches the task instances 
of the DAG. I am not sure about the next part and would appreciate it if 
someone could clarify my question.
   
   I believe the task instances (ti) returned by the get_task_instances() 
function come from the information stored in the database. That would explain 
the crash when you access the "Landing Times" page: x and y were filled with 
the task_id names from the current DAG, while the stored task instances carry 
a differently-cased task_id, so the dictionary lookup fails.
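   
   To make the failure concrete, here is a minimal, self-contained sketch of 
the lookup; fill_landing_times is a hypothetical stand-in for the view code, 
using plain lists of task_id strings instead of real task instances:

```python
# Minimal sketch reproducing the "Landing Times" KeyError.
# fill_landing_times is a hypothetical helper, not Airflow code: the dict is
# keyed by the task_ids defined in the DAG file, while the appends use the
# task_ids that came back with the stored task instances.
def fill_landing_times(dag_task_ids, instance_task_ids):
    x = {task_id: [] for task_id in dag_task_ids}
    for ti_task_id in instance_task_ids:
        x[ti_task_id].append(1.0)  # mirrors x[ti.task_id].append(dttm)
    return x

fill_landing_times(["run"], ["run"])       # fine: keys match

try:
    fill_landing_times(["Run"], ["run"])   # task_id capitalization changed
except KeyError as exc:
    print(f"KeyError: {exc}")              # prints KeyError: 'run'
```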
   
   One of my main questions is how, after renaming a task (for example from 
"run" to "Run"), get_task_instances() keeps returning past task instances 
under the old name: asking for the instances of "Run" returns task instances 
(ti) whose task_id is "run".
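   
   One plausible answer (my assumption, not verified against the Airflow code) 
is the metadata database's collation: MySQL's default collations are 
case-insensitive, so a query filtering on task_id = 'Run' would also match 
rows stored as 'run'. The effect can be imitated in SQLite with COLLATE NOCASE:

```python
import sqlite3

# Standalone sketch of case-insensitive matching; the table imitates
# Airflow's task_instance table, but this is not Airflow code.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_instance (task_id TEXT)")
conn.execute("INSERT INTO task_instance VALUES ('run')")

# SQLite compares case-sensitively by default: no match for 'Run'.
exact = conn.execute(
    "SELECT task_id FROM task_instance WHERE task_id = 'Run'").fetchall()
print(exact)  # []

# With a case-insensitive collation (as with MySQL's defaults), 'Run'
# matches the row stored as 'run'.
ci = conn.execute(
    "SELECT task_id FROM task_instance "
    "WHERE task_id = 'Run' COLLATE NOCASE").fetchall()
print(ci)  # [('run',)]
```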
   
   ### Error screenshot
   
   How to replicate: 
   
     - Launch `airflow webserver -p 8080`
     - Go to the Airflow UI
     - Create an example DAG with a task_id of your choice in lowercase 
letters (e.g. "run")
     - Launch the DAG and wait for its execution to finish
     - Change the first letter of the task_id in the DAG to a capital letter 
(e.g. "Run")
     - Refresh the DAG
     - Go to "Landing Times" in the DAG menu in the UI
     - You will get an "Oops" message with the traceback.
   
   ![error_img](https://i.imgur.com/0tXRDuH.png)
   ____
   ### Tests
   
   - [X] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
     - I wasn't sure how to unit test this; if you have any advice I will add 
a test for it. Apart from that, I manually checked that the behaviour was as 
expected:
       
       - [X] Create a DAG and access Landing Times
       - [X] Rename a task in the created DAG to a completely new name and 
access Landing Times
       - [X] Change the capitalization of a task_id and access Landing Times
       - [X] Switch back to the original name and access Landing Times
   
   
   ### Commits
   
   - [X] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [X] In case of new functionality, my PR adds documentation that describes 
how to use it.
     - All the public functions and the classes in the PR contain docstrings 
that explain what it does
     - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
   ### Code Quality
   
   - [X] Passes `flake8`
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> UI crashes at "Landing Time" after switching task_id caps/small letters
> -----------------------------------------------------------------------
>
>                 Key: AIRFLOW-6033
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6033
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DAG, ui
>    Affects Versions: 1.10.6
>            Reporter: ivan de los santos
>            Priority: Minor
>
> The Airflow UI crashes in the browser, returning an "Oops" message and the 
> traceback of the error.
> This is caused by changing the capitalization of a task_id. Here are some 
> examples that will cause Airflow to crash:
>  - task_id = "DUMMY_TASK" to task_id = "dUMMY_TASK"
>  - task_id = "Dummy_Task" to task_id = "dummy_Task" or "Dummy_task", ...
>  - task_id = "Dummy_task" to task_id = "Dummy_tASk"
> _____________________________________
> If you change the name of the task_id to something entirely different, such 
> as, in our example:
>  - task_id = "Dummy_Task" to task_id = "DummyTask" or "Dummytask"
> it won't fail, since the renamed task is recognized as a new task, which is 
> the expected behaviour.
> If we switch the modified name back to the original it won't crash either, 
> since the correct task instances are looked up. I will explain in the next 
> paragraphs where this error is located.
> _____________________________________________
>  *How to replicate*: 
>  # Launch `airflow webserver -p 8080`
>  # Go to the Airflow UI
>  # Create an example DAG with a task_id of your choice in lowercase letters 
> (e.g. "run")
>  # Launch the DAG and wait for its execution to finish
>  # Change the first letter of the task_id in the DAG to a capital letter 
> (e.g. "Run")
>  # Refresh the DAG
>  # Go to "Landing Times" in the DAG menu in the UI
>  # You will get an "Oops" message with the traceback.
>  
> *File causing the problem*:  
> [https://github.com/apache/airflow/blob/master/airflow/www/views.py] (lines 
> 1643 - 1654)
>  
> *Reasons of the problem*:
>  # KeyError: 'run', meaning a dictionary does not contain the task_id "run"; 
> I will go into more detail below about where this comes from.
> {code:python}
> Traceback (most recent call last):
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 2446, in wsgi_app
>     response = self.full_dispatch_request()
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1951, in full_dispatch_request
>     rv = self.handle_user_exception(e)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1820, in handle_user_exception
>     reraise(exc_type, exc_value, tb)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
>     raise value
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1949, in full_dispatch_request
>     rv = self.dispatch_request()
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1935, in dispatch_request
>     return self.view_functions[rule.endpoint](**req.view_args)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 69, in inner
>     return self._run_view(f, *args, **kwargs)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 368, in _run_view
>     return fn(self, *args, **kwargs)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask_login/utils.py", line 258, in decorated_view
>     return func(*args, **kwargs)
>   File "/home/rde/.local/lib/python3.6/site-packages/airflow/www/utils.py", line 295, in wrapper
>     return f(*args, **kwargs)
>   File "/home/rde/.local/lib/python3.6/site-packages/airflow/utils/db.py", line 74, in wrapper
>     return func(*args, **kwargs)
>   File "/home/rde/.local/lib/python3.6/site-packages/airflow/www/views.py", line 1921, in landing_times
>     x[ti.task_id].append(dttm)
> KeyError: 'run'
> {code}
> _____________________________
> h2. Code
> {code:python}
> for task in dag.tasks:
>     y[task.task_id] = []
>     x[task.task_id] = []
>     for ti in task.get_task_instances(start_date=min_date, end_date=base_date):
>         ts = ti.execution_date
>         if dag.schedule_interval and dag.following_schedule(ts):
>             ts = dag.following_schedule(ts)
>         if ti.end_date:
>             dttm = wwwutils.epoch(ti.execution_date)
>             secs = (ti.end_date - ts).total_seconds()
>             x[ti.task_id].append(dttm)
>             y[ti.task_id].append(secs)
> {code}
>  
> In the first two lines inside the outer for loop, we can see how the 
> dictionaries x and y are filled with keys taken from the task_id attributes 
> of the tasks in the current DAG.
> *The problem appears in the inner for loop*, which fetches the task 
> instances of the DAG. I am not sure about the next part and would appreciate 
> it if someone could clarify my question.
> I believe the task instances (ti) returned by the get_task_instances() 
> function come from the information stored in the database. That would 
> explain the crash when you access the "Landing Times" page: x and y were 
> filled with the task_id names from the current DAG, while the stored task 
> instances carry a differently-cased task_id, so the dictionary lookup fails.
> One of my main questions is how, after renaming a task (for example from 
> "run" to "Run"), get_task_instances() keeps returning past task instances 
> under the old name: asking for the instances of "Run" returns task instances 
> (ti) whose task_id is "run".
> ________________________
> *Proposed solution*: I propose saving the DAG task's task_id in a variable 
> in the first dag.tasks for loop, and reusing it both to create the 
> dictionary entries and to append the task instance time values (dttm and 
> secs). This way the dictionary key is always the task_id of the task that 
> actually requested its instances via get_task_instances().
>  
> {code:python}
> for task in dag.tasks:
>     # Change proposed is HERE
>     task_id = task.task_id
>     y[task_id] = []
>     x[task_id] = []
>     for ti in task.get_task_instances(start_date=min_date, end_date=base_date):
>         ts = ti.execution_date
>         if dag.schedule_interval and dag.following_schedule(ts):
>             ts = dag.following_schedule(ts)
>         if ti.end_date:
>             dttm = wwwutils.epoch(ti.execution_date)
>             secs = (ti.end_date - ts).total_seconds()
>             # And HERE
>             x[task_id].append(dttm)
>             y[task_id].append(secs)
> {code}
>  
> This fixed the problem for me and my team.
>  
> I am willing to dig deeper into this issue if the problem requires it, or to 
> apply my solution.
>  
> Best regards,
> Iván



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
