[jira] [Resolved] (AIRFLOW-2563) Pig Hook Doesn't work for Python 3

2018-07-27 Thread Arthur Wiedmer (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-2563.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

Fixed by PR #3594

> Pig Hook Doesn't work for Python 3
> --
>
> Key: AIRFLOW-2563
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2563
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Murium Iqbal
>Assignee: Jasper Kahn
>Priority: Major
> Fix For: 2.0.0
>
>
> The Pig hook doesn't work in Python 3 because of differences in how strings and 
> bytes are handled, as described in this Stack Overflow post:
> https://stackoverflow.com/questions/50652034/pig-hook-in-airflow-doesnt-work-for-python3
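
The string/bytes difference mentioned above can be illustrated with a small sketch. It is not the change made in PR #3594; `run_and_collect` is a hypothetical helper, and the child process is a stand-in for the pig command line. On Python 3, a pipe opened in binary mode yields bytes, so each line has to be decoded before it is treated as text.

```python
import subprocess
import sys


def run_and_collect(cmd):
    """Run cmd and return its output lines as text (str), not bytes."""
    lines = []
    with subprocess.Popen(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT) as proc:
        # The sentinel must be b"" (bytes): readline() returns bytes here.
        for raw in iter(proc.stdout.readline, b""):
            lines.append(raw.decode("utf-8").rstrip("\r\n"))
    return lines


if __name__ == "__main__":
    # Stand-in child process; a real hook would launch the pig CLI instead.
    print(run_and_collect([sys.executable, "-c", "print('hello from child')"]))
```

Using `""` instead of `b""` as the sentinel, or concatenating the raw line to a `str`, works on Python 2 but fails or loops on Python 3.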



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2510) Introduce new macros: prev_ds and next_ds

2018-05-22 Thread Arthur Wiedmer (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484511#comment-16484511
 ] 

Arthur Wiedmer commented on AIRFLOW-2510:
-

Have you tried using yesterday_ds and tomorrow_ds?

 

https://github.com/apache/incubator-airflow/blob/1f0a717b65e0ea7e0127708b084baff0697f0946/airflow/models.py#L1755
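
As a rough illustration of what those two macros evaluate to, here is a plain-Python sketch. `neighbouring_ds` is a hypothetical helper, not part of Airflow; it assumes the macros are the calendar day before and after the execution date, formatted as YYYY-MM-DD.

```python
from datetime import datetime, timedelta


def neighbouring_ds(ds):
    """Return (yesterday_ds, tomorrow_ds) for a YYYY-MM-DD date string."""
    day = datetime.strptime(ds, "%Y-%m-%d").date()
    one_day = timedelta(days=1)
    return (day - one_day).isoformat(), (day + one_day).isoformat()


if __name__ == "__main__":
    print(neighbouring_ds("2018-05-22"))
```

Note that these are calendar neighbours. For a DAG that is not scheduled daily they would differ from the previous and next execution dates, which may be why the issue asks for separate macros.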

> Introduce new macros: prev_ds and next_ds
> -
>
> Key: AIRFLOW-2510
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2510
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Chao-Han Tsai
>Assignee: Chao-Han Tsai
>Priority: Major
>
> Introduce new macros {{ prev_ds }} and {{ next_ds }}.
> {{ prev_ds }}: the previous execution date as {{ YYYY-MM-DD }}
> {{ next_ds }}: the next execution date as {{ YYYY-MM-DD }}





[jira] [Resolved] (AIRFLOW-2393) UI tree view struggles with large dags (60 tasks)x25 dag histories

2018-05-09 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-2393.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

Issue resolved by pull request #3279
[https://github.com/apache/incubator-airflow/pull/3279]

> UI tree view struggles with large dags (60 tasks)x25 dag histories
> --
>
> Key: AIRFLOW-2393
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2393
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ui
>Affects Versions: 1.9.0
>Reporter: Badger
>Priority: Major
> Fix For: 2.0.0
>
>
> Hi, 
> We are noticing that the tree view takes a long time to render as our DAG has 
> become more complex. We will need to start breaking our DAG apart in order to 
> continue to use the user interface. 
> The basic problem is that a reasonably complex DAG (60 operators) x the 
> standard 25 DAG run histories on the tree view causes a 350MB JSON response 
> (compressed to 8MB) to be downloaded, which the browser then has to render.
> On quick observation, this appears to be because the response contains all 
> metadata for each task.
> Is this something others think is a problem? We occasionally have to refresh 
> due to memory errors and have already increased the RAM allocated to the box.
> A suggestion might be to load specific instance history when a user hovers 
> over the task, rather than exporting all of the history on page load. I'd 
> look at contributing a PR but haven't had a chance to take a look at this area 
> of the code base.
> Thanks
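
A quick back-of-the-envelope check of the figures quoted above, assuming the 350MB is spread evenly over the task instances and taking 1 MB as 1000 KB (`per_instance_kb` is only an illustrative helper):

```python
def per_instance_kb(total_mb, tasks, dag_runs):
    """Average payload per task instance in KB, taking 1 MB = 1000 KB."""
    return total_mb * 1000 / (tasks * dag_runs)


if __name__ == "__main__":
    # 60 operators x 25 DAG runs = 1500 task instances in the tree view.
    print(round(per_instance_kb(350, 60, 25)), "KB per task instance")
```

Roughly 233 KB per task instance is far more than a state and a few timestamps would need, which fits the reporter's observation that full metadata is sent for every task.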





[jira] [Resolved] (AIRFLOW-2086) The tree view page is too slow when display big dag.

2018-05-09 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-2086.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

Issue resolved by pull request #3279
[https://github.com/apache/incubator-airflow/pull/3279]

> The tree view page is too slow when display big dag.
> 
>
> Key: AIRFLOW-2086
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2086
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: webserver
>Reporter: Lintao LI
>Priority: Major
> Fix For: 2.0.0
>
>
> The tree view page is too slow for a big (actually not that big) DAG. 
> The page size increases dramatically, to hundreds of MB.
> Please refer to 
> [here|https://stackoverflow.com/questions/48656221/apache-airflow-webui-tree-view-is-too-slow]
>  for details.
> I think the page contains a lot of redundant data. It's either a bug or a 
> design flaw.





[jira] [Commented] (AIRFLOW-2385) Airflow task is not stopped when execution timeout gets triggered

2018-04-26 Thread Arthur Wiedmer (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16455717#comment-16455717
 ] 

Arthur Wiedmer commented on AIRFLOW-2385:
-

Hi Yohei,

Unless I am mistaken, it looks like your operator is executing a Spark job (I 
seem to recognize the progress bar from the logs). execution_timeout will only 
raise an exception in the Python process, but it might not kill the job.

You probably want to implement the on_kill method for your operator so that it 
knows how to clean up your process. It has been implemented in a few operators 
already in the code base.
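
A minimal sketch of the idea, kept independent of Airflow so it can be run on its own. `SubprocessOperatorSketch` is a hypothetical stand-in for a BaseOperator subclass, and it assumes Airflow calls `on_kill` when it stops the task. The point is only that the operator keeps a handle on what it launched so that `on_kill` can clean it up.

```python
import subprocess
import sys


class SubprocessOperatorSketch:
    """Stand-in for a BaseOperator subclass; not an Airflow class."""

    def __init__(self, cmd):
        self.cmd = cmd
        self.proc = None

    def start(self):
        # Keep a handle on the child so on_kill can reach it later.
        self.proc = subprocess.Popen(self.cmd)

    def execute(self, context=None):
        self.start()
        return self.proc.wait()

    def on_kill(self):
        # Terminate the child if it is still running, then reap it.
        if self.proc is not None and self.proc.poll() is None:
            self.proc.terminate()
            self.proc.wait()


if __name__ == "__main__":
    op = SubprocessOperatorSketch(
        [sys.executable, "-c", "import time; time.sleep(60)"])
    op.start()
    op.on_kill()
    print("child exited:", op.proc.poll() is not None)
```

For a job submitted to a cluster, terminating the local process may not be enough; `on_kill` would also need to cancel the remote job through whatever client the operator uses.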

Good luck!

> Airflow task is not stopped when execution timeout gets triggered
> -
>
> Key: AIRFLOW-2385
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2385
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG
>Affects Versions: 1.9.0
>Reporter: Yohei Onishi
>Priority: Major
>
> I have my own custom operator that extends BaseOperator, as shown below. I 
> tried to kill a task if it runs for more than 30 minutes. According to the 
> log, the timeout is triggered, but the task still continues.
> Am I missing something? I checked the official documentation but do not know 
> what is wrong: [https://airflow.apache.org/code.html#baseoperator]
> My operator is as follows.
> {code:java}
> from airflow.models import BaseOperator
> from airflow.utils.decorators import apply_defaults
>
> class MyOperator(BaseOperator):
>     @apply_defaults
>     def __init__(
>             self,
>             some_parameters_here,
>             *args,
>             **kwargs):
>         super(MyOperator, self).__init__(*args, **kwargs)
>         # some initialization here
>
>     def execute(self, context):
>         # some code here
>         pass
> {code}
>  
> My task is as follows.
> {code:java}
> t = MyOperator(
>     task_id='task',
>     dag=scheduled_dag,
>     execution_timeout=timedelta(minutes=30),
> )
> {code}
>  
> I found this error but the task continued.
> {code:java}
> [2018-04-12 03:30:28,353] {base_task_runner.py:98} INFO - Subtask: [Stage 
> 6:==(1380 + -160) / 
> 1224][2018-04- 12 03:30:28,353] {timeout.py:36} ERROR - Process timed out
> {code}





[jira] [Resolved] (AIRFLOW-2380) Add support for environment variables in Spark submit operator

2018-04-26 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-2380.
-
   Resolution: Fixed
Fix Version/s: 1.10.0

Issue resolved by pull request #3268
[https://github.com/apache/incubator-airflow/pull/3268]

> Add support for environment variables in Spark submit operator
> --
>
> Key: AIRFLOW-2380
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2380
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, operators
>Reporter: Cristòfol Torrens
>Assignee: Cristòfol Torrens
>Priority: Minor
> Fix For: 1.10.0
>
>
> Add support for environment variables in the Spark submit operator.
> For example, to pass *HADOOP_CONF_DIR* when using the same Spark cluster 
> with multiple HDFS.
> The idea is to pass them as a dict, and resolve it when using 
> *yarn-*_client/cluster_*,* and *standalone-*_client_ mode.
> In *standalone-*_cluster_ mode it is not possible to do this.
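
One way the three cases above could be resolved is sketched below. This is not taken from PR #3268: `build_submit_env` is a hypothetical helper, and raising an error for standalone cluster mode is an assumption. The `spark.yarn.appMasterEnv.<NAME>` property is the standard Spark setting for environment variables on YARN.

```python
import os


def build_submit_env(master, deploy_mode, env_vars):
    """Return (extra_args, process_env) for a spark-submit invocation.

    extra_args are appended to the command line; process_env, when not
    None, is the environment to launch the spark-submit process with.
    """
    extra_args = []
    process_env = None
    if master.startswith("yarn"):
        # On YARN, pass each variable as a Spark configuration property.
        for key, value in sorted(env_vars.items()):
            extra_args += ["--conf",
                           "spark.yarn.appMasterEnv.{}={}".format(key, value)]
    elif deploy_mode == "client":
        # Standalone client mode: the driver runs locally, so export the
        # variables into the environment of the launched process.
        process_env = dict(os.environ)
        process_env.update(env_vars)
    else:
        # Assumption: fail loudly rather than silently drop the variables.
        raise ValueError("env vars are not supported in standalone cluster mode")
    return extra_args, process_env


if __name__ == "__main__":
    print(build_submit_env("yarn", "cluster", {"HADOOP_CONF_DIR": "/etc/hadoop"})[0])
```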





[jira] [Resolved] (AIRFLOW-74) SubdagOperators can consume all celeryd worker processes

2018-04-24 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-74?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-74.
---
   Resolution: Fixed
Fix Version/s: 1.10.0

Issue resolved by pull request #3251
[https://github.com/apache/incubator-airflow/pull/3251]

> SubdagOperators can consume all celeryd worker processes
> 
>
> Key: AIRFLOW-74
> URL: https://issues.apache.org/jira/browse/AIRFLOW-74
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: celery
>Affects Versions: Airflow 1.7.1, Airflow 1.7.0, Airflow 1.6.2
> Environment: Airflow 1.7.1rc3 with CeleryExecutor
> 1  webserver
> 1 scheduler
> 2 workers 
>Reporter: Steven Yvinec-Kruyk
>Assignee: zgl
>Priority: Major
> Fix For: 1.10.0
>
>
> If the number of concurrently running ```SubdagOperator```s is >= the number 
> of celery worker processes, tasks are unable to run. All SDOs come to a 
> complete halt. Furthermore, the performance of a DAG is drastically reduced 
> even before full saturation of the workers, as fewer workers are gradually 
> available for actual tasks. A workaround for this is to specify that 
> ```SequentialExecutor``` be used by the ```SubdagOperator```:
> ```
> from datetime import timedelta, datetime
> from airflow.models import DAG, Pool
> from airflow.operators import BashOperator, SubDagOperator, DummyOperator
> from airflow.executors import SequentialExecutor
> import airflow
>
> # ----------------
> # DEFINE THE POOLS
> # ----------------
> session = airflow.settings.Session()
> for p in ['test_pool_1', 'test_pool_2', 'test_pool_3']:
>     pool = (
>         session.query(Pool)
>         .filter(Pool.pool == p)
>         .first())
>     if not pool:
>         session.add(Pool(pool=p, slots=8))
>         session.commit()
>
> # --------------
> # DEFINE THE DAG
> # --------------
> # Define the Dag Name. This must be unique.
> dag_name = 'hanging_subdags_n16_sqe'
> # Default args are passed to each task
> default_args = {
>     'owner': 'Airflow',
>     'depends_on_past': False,
>     'start_date': datetime(2016, 4, 10),
>     'retries': 0,
>     'retry_interval': timedelta(minutes=5),
>     'email': ['y...@email.com'],
>     'email_on_failure': True,
>     'email_on_retry': True,
>     'wait_for_downstream': False,
> }
> # Create the dag object
> dag = DAG(dag_name,
>           default_args=default_args,
>           schedule_interval='0 0 * * *'
>           )
>
> # ----------------
> # DEFINE THE TASKS
> # ----------------
> def get_subdag(dag, sd_id, pool=None):
>     subdag = DAG(
>         dag_id='{parent_dag}.{sd_id}'.format(
>             parent_dag=dag.dag_id,
>             sd_id=sd_id),
>         params=dag.params,
>         default_args=dag.default_args,
>         template_searchpath=dag.template_searchpath,
>         user_defined_macros=dag.user_defined_macros,
>     )
>     t1 = BashOperator(
>         task_id='{sd_id}_step_1'.format(
>             sd_id=sd_id
>         ),
>         bash_command='echo "hello" && sleep 60',
>         dag=subdag,
>         pool=pool,
>         executor=SequentialExecutor
>     )
>     t2 = BashOperator(
>         task_id='{sd_id}_step_two'.format(
>             sd_id=sd_id
>         ),
>         bash_command='echo "hello" && sleep 15',
>         dag=subdag,
>         pool=pool,
>         executor=SequentialExecutor
>     )
>     t2.set_upstream(t1)
>     sdo = SubDagOperator(
>         task_id=sd_id,
>         subdag=subdag,
>         retries=0,
>         retry_delay=timedelta(seconds=5),
>         dag=dag,
>         depends_on_past=True,
>     )
>     return sdo
>
> start_task = DummyOperator(
>     task_id='start',
>     dag=dag
> )
> for n in range(1, 17):
>     sd_i = get_subdag(dag=dag, sd_id='level_1_{n}'.format(n=n),
>                       pool='test_pool_1')
>     sd_ii = get_subdag(dag=dag, sd_id='level_2_{n}'.format(n=n),
>                        pool='test_pool_2')
>     sd_iii = get_subdag(dag=dag, sd_id='level_3_{n}'.format(n=n),
>                         pool='test_pool_3')
>     sd_i.set_upstream(start_task)
>     sd_ii.set_upstream(sd_i)
>     sd_iii.set_upstream(sd_ii)
> ```
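
The starvation described in the report can be put in numbers with a tiny sketch. It assumes, as the report suggests, that each running SubDagOperator holds one worker slot for its whole duration. `slots_left_for_tasks` and the slot counts are illustrative only.

```python
def slots_left_for_tasks(worker_slots, running_subdag_operators):
    """Worker slots still free for ordinary tasks while SDOs hold theirs."""
    return max(worker_slots - running_subdag_operators, 0)


if __name__ == "__main__":
    # Hypothetical cluster with 16 worker slots in total.
    for running in (0, 8, 16):
        print(running, "SDOs running ->",
              slots_left_for_tasks(16, running), "slots for tasks")
```

With 16 subdags started at once, as in the DAG above, a cluster with 16 slots or fewer has nothing left to run the tasks inside the subdags, so every SubDagOperator waits forever.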





[jira] [Assigned] (AIRFLOW-2365) Fix autocommit test issue with SQLite

2018-04-23 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer reassigned AIRFLOW-2365:
---

Assignee: Arthur Wiedmer

> Fix autocommit test issue with SQLite
> -
>
> Key: AIRFLOW-2365
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2365
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Arthur Wiedmer
>Assignee: Arthur Wiedmer
>Priority: Major
>
> In a previous PR, I added a check for an autocommit attribute which fails for 
> SQLite. Correcting the tests now.





[jira] [Created] (AIRFLOW-2365) Fix autocommit test issue with SQLite

2018-04-23 Thread Arthur Wiedmer (JIRA)
Arthur Wiedmer created AIRFLOW-2365:
---

 Summary: Fix autocommit test issue with SQLite
 Key: AIRFLOW-2365
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2365
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Arthur Wiedmer


In a previous PR, I added a check for an autocommit attribute which fails for 
SQLite. Correcting the tests now.





[jira] [Created] (AIRFLOW-2364) The autocommit flag can be set on a connection which does not support it.

2018-04-23 Thread Arthur Wiedmer (JIRA)
Arthur Wiedmer created AIRFLOW-2364:
---

 Summary: The autocommit flag can be set on a connection which does 
not support it.
 Key: AIRFLOW-2364
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2364
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Arthur Wiedmer
Assignee: Arthur Wiedmer


We could just add a logging warning when the method is invoked.
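
A rough sketch of that suggestion, outside of Airflow. `set_autocommit`, its `supports_autocommit` argument, and `FakeConnection` are all hypothetical names used for illustration, not the hook's actual interface.

```python
import logging

log = logging.getLogger(__name__)


def set_autocommit(conn, autocommit, supports_autocommit):
    """Set conn.autocommit only when the connection type supports it.

    Returns True if the flag was applied, False if it was skipped.
    """
    if not supports_autocommit:
        if autocommit:
            log.warning(
                "%s does not support autocommit; ignoring autocommit=True",
                type(conn).__name__,
            )
        return False
    conn.autocommit = autocommit
    return True


class FakeConnection:
    """Hypothetical stand-in for a DB-API connection object."""
    autocommit = False


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    set_autocommit(FakeConnection(), True, supports_autocommit=False)
```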





[jira] [Resolved] (AIRFLOW-2240) Add TLS/SSL to Dask Executor

2018-04-18 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-2240.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

Issue resolved by pull request #2683
[https://github.com/apache/incubator-airflow/pull/2683]

> Add TLS/SSL to Dask Executor
> 
>
> Key: AIRFLOW-2240
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2240
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: executor
>Affects Versions: Airflow 1.8
>Reporter: Marius Van Niekerk
>Assignee: Marius Van Niekerk
>Priority: Minor
> Fix For: 2.0.0
>
>
> As of distributed 0.17 dask distributed supports tls / ssl for communication.
>  
> We should allow this configuration to be used with airflow.





[jira] [Resolved] (AIRFLOW-2335) Issue downloading oracle jdk8 is preventing travis builds from running

2018-04-17 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-2335.
-
   Resolution: Fixed
Fix Version/s: 1.10.0

Issue resolved by pull request #3236
[https://github.com/apache/incubator-airflow/pull/3236]

> Issue downloading oracle jdk8 is preventing travis builds from running
> --
>
> Key: AIRFLOW-2335
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2335
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Daniel Imberman
>Assignee: Daniel Imberman
>Priority: Major
> Fix For: 1.10.0
>
>
> Currently, all Airflow builds are dying after ~1 minute due to an issue with 
> how Travis pulls jdk8.





[jira] [Resolved] (AIRFLOW-1235) Odd behaviour when all gunicorn workers die

2018-03-22 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-1235.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

Issue resolved by pull request #2330
[https://github.com/apache/incubator-airflow/pull/2330]

> Odd behaviour when all gunicorn workers die
> ---
>
> Key: AIRFLOW-1235
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1235
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: webserver
>Affects Versions: 1.8.0
>Reporter: Erik Forsberg
>Assignee: Kengo Seki
>Priority: Major
> Fix For: 2.0.0
>
>
> The webserver has sometimes stopped responding to port 443, and today I found 
> the issue - I had a misconfigured resolv.conf that made it unable to talk to 
> my postgresql. This was the root cause, but the way airflow webserver behaved 
> was a bit odd.
> It seems that when all gunicorn workers failed to start, the gunicorn master 
> shut down. However, the main process (the one that starts gunicorn master) 
> did not shut down, so there was no way of detecting the failed status of 
> webserver from e.g. systemd or init script.
> Full traceback leading to stale webserver process:
> {noformat}
> May 21 09:51:57 airmaster01 airflow[26451]: [2017-05-21 09:51:57 +] 
> [23794] [ERROR] Exception in worker process:
> May 21 09:51:57 airmaster01 airflow[26451]: Traceback (most recent call last):
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 1122, in _do_get
> May 21 09:51:57 airmaster01 airflow[26451]: return self._pool.get(wait, 
> self._timeout)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/util/queue.py",
>  line 145, in get
> May 21 09:51:57 airmaster01 airflow[26451]: raise Empty
> May 21 09:51:57 airmaster01 airflow[26451]: sqlalchemy.util.queue.Empty
> May 21 09:51:57 airmaster01 airflow[26451]: During handling of the above 
> exception, another exception occurred:
> May 21 09:51:57 airmaster01 airflow[26451]: Traceback (most recent call last):
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/engine/base.py",
>  line 2147, in _wrap_pool_connect
> May 21 09:51:57 airmaster01 airflow[26451]: return fn()
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 387, in connect
> May 21 09:51:57 airmaster01 airflow[26451]: return 
> _ConnectionFairy._checkout(self)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 766, in _checkout
> May 21 09:51:57 airmaster01 airflow[26451]: fairy = 
> _ConnectionRecord.checkout(pool)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 516, in checkout
> May 21 09:51:57 airmaster01 airflow[26451]: rec = pool._do_get()
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 1138, in _do_get
> May 21 09:51:57 airmaster01 airflow[26451]: self._dec_overflow()
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/util/langhelpers.py",
>  line 66, in __exit__
> May 21 09:51:57 airmaster01 airflow[26451]: compat.reraise(exc_type, 
> exc_value, exc_tb)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/util/compat.py",
>  line 187, in reraise
> May 21 09:51:57 airmaster01 airflow[26451]: raise value
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 1135, in _do_get
> May 21 09:51:57 airmaster01 airflow[26451]: return self._create_connection()
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 333, in _create_connection
> May 21 09:51:57 airmaster01 airflow[26451]: return _ConnectionRecord(self)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 461, in __init__
> May 21 09:51:57 airmaster01 airflow[26451]: 
> self.__connect(first_connect_check=True)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 651, in __connect
> May 21 09:51:57 airmaster01 airflow[26451]: connection = 
> pool._invoke_creator(self)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> 

[jira] [Commented] (AIRFLOW-1165) airflow webservice crashes on ubuntu16 - python3

2017-07-10 Thread Arthur Wiedmer (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080698#comment-16080698
 ] 

Arthur Wiedmer commented on AIRFLOW-1165:
-

A short-term fix until the version is upgraded is the following.

At the prompt:
# Generating an RSA public/private-key pair
openssl genrsa -out private.pem 2048
# Generating a self-signed certificate
openssl req -new -x509 -key private.pem -out cacert.pem -days 1095

# In your airflow.cfg under [webserver]
web_server_ssl_cert = path/to/cacert.pem
web_server_ssl_key = path/to/private.pem
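
The traceback in the issue suggests that SSL wrapping was attempted although no certificate was configured. A guard along the following lines would avoid that. This is a sketch under that assumption, not the change that went into master, and `build_ssl_context` is a hypothetical helper. Werkzeug's development server accepts a `(cert_file, key_file)` tuple as `ssl_context`, or `None` for plain HTTP.

```python
def build_ssl_context(ssl_cert, ssl_key):
    """Return a (cert, key) tuple, or None to serve plain HTTP."""
    if ssl_cert and ssl_key:
        return (ssl_cert, ssl_key)
    # Either value missing or empty: do not enable SSL at all.
    return None


if __name__ == "__main__":
    print(build_ssl_context("path/to/cacert.pem", "path/to/private.pem"))
    print(build_ssl_context(None, None))
```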

> airflow webservice crashes on ubuntu16 - python3 
> -
>
> Key: AIRFLOW-1165
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1165
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Hamed
>Assignee: Arthur Wiedmer
> Fix For: 1.9.0
>
>
> I am trying to run the airflow webserver on ubuntu16 with python3 and ran 
> into this issue. Any idea?
> {code}
> [2017-05-02 16:36:34,789] [24096] {_internal.py:87} WARNING -  * Debugger is 
> active!
> [2017-05-02 16:36:34,790] [24096] {_internal.py:87} INFO -  * Debugger PIN: 
> 294-518-137
> Exception in thread Thread-1:
> Traceback (most recent call last):
>   File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
> self.run()
>   File "/usr/lib/python3.5/threading.py", line 862, in run
> self._target(*self._args, **self._kwargs)
>   File "/usr/local/lib/python3.5/dist-packages/werkzeug/serving.py", line 
> 696, in inner
> fd=fd)
>   File "/usr/local/lib/python3.5/dist-packages/werkzeug/serving.py", line 
> 590, in make_server
> passthrough_errors, ssl_context, fd=fd)
>   File "/usr/local/lib/python3.5/dist-packages/werkzeug/serving.py", line 
> 525, in __init__
> self.socket = ssl_context.wrap_socket(sock, server_side=True)
>   File "/usr/local/lib/python3.5/dist-packages/werkzeug/serving.py", line 
> 447, in wrap_socket
> ssl_version=self._protocol, **kwargs)
>   File "/usr/lib/python3.5/ssl.py", line 1069, in wrap_socket
> ciphers=ciphers)
>   File "/usr/lib/python3.5/ssl.py", line 680, in __init__
> raise ValueError("certfile must be specified for server-side "
> ValueError: certfile must be specified for server-side operations
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (AIRFLOW-1165) airflow webservice crashes on ubuntu16 - python3

2017-05-02 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer reassigned AIRFLOW-1165:
---

Assignee: Arthur Wiedmer

> airflow webservice crashes on ubuntu16 - python3 
> -
>
> Key: AIRFLOW-1165
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1165
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Hamed
>Assignee: Arthur Wiedmer
> Fix For: 1.8.1
>
>
> I am trying to run the airflow webserver on ubuntu16 with python3 and ran 
> into this issue. Any idea?
> {code}
> [2017-05-02 16:36:34,789] [24096] {_internal.py:87} WARNING -  * Debugger is 
> active!
> [2017-05-02 16:36:34,790] [24096] {_internal.py:87} INFO -  * Debugger PIN: 
> 294-518-137
> Exception in thread Thread-1:
> Traceback (most recent call last):
>   File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
> self.run()
>   File "/usr/lib/python3.5/threading.py", line 862, in run
> self._target(*self._args, **self._kwargs)
>   File "/usr/local/lib/python3.5/dist-packages/werkzeug/serving.py", line 
> 696, in inner
> fd=fd)
>   File "/usr/local/lib/python3.5/dist-packages/werkzeug/serving.py", line 
> 590, in make_server
> passthrough_errors, ssl_context, fd=fd)
>   File "/usr/local/lib/python3.5/dist-packages/werkzeug/serving.py", line 
> 525, in __init__
> self.socket = ssl_context.wrap_socket(sock, server_side=True)
>   File "/usr/local/lib/python3.5/dist-packages/werkzeug/serving.py", line 
> 447, in wrap_socket
> ssl_version=self._protocol, **kwargs)
>   File "/usr/lib/python3.5/ssl.py", line 1069, in wrap_socket
> ciphers=ciphers)
>   File "/usr/lib/python3.5/ssl.py", line 680, in __init__
> raise ValueError("certfile must be specified for server-side "
> ValueError: certfile must be specified for server-side operations
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (AIRFLOW-1165) airflow webservice crashes on ubuntu16 - python3

2017-05-02 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-1165.
-
   Resolution: Fixed
Fix Version/s: 1.8.1

Resolved in master and the fix is in the current RC being voted on.

> airflow webservice crashes on ubuntu16 - python3 
> -
>
> Key: AIRFLOW-1165
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1165
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Hamed
> Fix For: 1.8.1
>
>
> I am trying to run the airflow webserver on ubuntu16 with python3 and ran 
> into this issue. Any idea?
> {code}
> [2017-05-02 16:36:34,789] [24096] {_internal.py:87} WARNING -  * Debugger is 
> active!
> [2017-05-02 16:36:34,790] [24096] {_internal.py:87} INFO -  * Debugger PIN: 
> 294-518-137
> Exception in thread Thread-1:
> Traceback (most recent call last):
>   File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
> self.run()
>   File "/usr/lib/python3.5/threading.py", line 862, in run
> self._target(*self._args, **self._kwargs)
>   File "/usr/local/lib/python3.5/dist-packages/werkzeug/serving.py", line 
> 696, in inner
> fd=fd)
>   File "/usr/local/lib/python3.5/dist-packages/werkzeug/serving.py", line 
> 590, in make_server
> passthrough_errors, ssl_context, fd=fd)
>   File "/usr/local/lib/python3.5/dist-packages/werkzeug/serving.py", line 
> 525, in __init__
> self.socket = ssl_context.wrap_socket(sock, server_side=True)
>   File "/usr/local/lib/python3.5/dist-packages/werkzeug/serving.py", line 
> 447, in wrap_socket
> ssl_version=self._protocol, **kwargs)
>   File "/usr/lib/python3.5/ssl.py", line 1069, in wrap_socket
> ciphers=ciphers)
>   File "/usr/lib/python3.5/ssl.py", line 680, in __init__
> raise ValueError("certfile must be specified for server-side "
> ValueError: certfile must be specified for server-side operations
> {code}





[jira] [Commented] (AIRFLOW-1165) airflow webservice crashes on ubuntu16 - python3

2017-05-02 Thread Arthur Wiedmer (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993681#comment-15993681
 ] 

Arthur Wiedmer commented on AIRFLOW-1165:
-

This is a duplicate of https://issues.apache.org/jira/browse/AIRFLOW-832

It is fixed in the current master, and will be fixed in the next release.

The short term fix is the commands outlined here:
http://stackoverflow.com/a/40857607


> airflow webservice crashes on ubuntu16 - python3 
> -
>
> Key: AIRFLOW-1165
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1165
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Hamed
>
> I am trying to run the airflow webserver on ubuntu16 with python3 and ran 
> into this issue. Any idea?
> {code}
> [2017-05-02 16:36:34,789] [24096] {_internal.py:87} WARNING -  * Debugger is 
> active!
> [2017-05-02 16:36:34,790] [24096] {_internal.py:87} INFO -  * Debugger PIN: 
> 294-518-137
> Exception in thread Thread-1:
> Traceback (most recent call last):
>   File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
> self.run()
>   File "/usr/lib/python3.5/threading.py", line 862, in run
> self._target(*self._args, **self._kwargs)
>   File "/usr/local/lib/python3.5/dist-packages/werkzeug/serving.py", line 
> 696, in inner
> fd=fd)
>   File "/usr/local/lib/python3.5/dist-packages/werkzeug/serving.py", line 
> 590, in make_server
> passthrough_errors, ssl_context, fd=fd)
>   File "/usr/local/lib/python3.5/dist-packages/werkzeug/serving.py", line 
> 525, in __init__
> self.socket = ssl_context.wrap_socket(sock, server_side=True)
>   File "/usr/local/lib/python3.5/dist-packages/werkzeug/serving.py", line 
> 447, in wrap_socket
> ssl_version=self._protocol, **kwargs)
>   File "/usr/lib/python3.5/ssl.py", line 1069, in wrap_socket
> ciphers=ciphers)
>   File "/usr/lib/python3.5/ssl.py", line 680, in __init__
> raise ValueError("certfile must be specified for server-side "
> ValueError: certfile must be specified for server-side operations
> {code}





[jira] [Resolved] (AIRFLOW-1028) Databricks Operator for Airflow

2017-04-06 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-1028.
-
   Resolution: Fixed
Fix Version/s: 1.9.0

Issue resolved by pull request #2202
[https://github.com/apache/incubator-airflow/pull/2202]

> Databricks Operator for Airflow
> ---
>
> Key: AIRFLOW-1028
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1028
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Andrew Chen
>Assignee: Andrew Chen
> Fix For: 1.9.0
>
>
> It would be nice to have a Databricks Operator/Hook in Airflow so users of 
> Databricks can more easily integrate with Airflow.
> The operator would submit a spark job to our new /jobs/runs/submit endpoint. 
> This endpoint is similar to 
> https://docs.databricks.com/api/latest/jobs.html#jobscreatejob but does not 
> include the email_notifications, max_retries, min_retry_interval_millis, 
> retry_on_timeout, schedule, max_concurrent_runs fields. (The submit docs are 
> not out because it's still a private endpoint.)
> Our proposed design for the operator, then, is to match this REST API endpoint. 
> Each argument to the operator is named after one of the fields of the REST 
> API request, and the value of the argument will match the type expected by the 
> REST API. We will also merge extra keys from kwargs, which should not be 
> passed to the BaseOperator, into our API call in order to be flexible to 
> updates.
> In the case that this interface is not very user friendly, we can later add 
> more operators which extend this operator.
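
The proposed design could look roughly like the sketch below. `build_submit_payload` is a hypothetical helper, and the field names `run_name`, `new_cluster` and `notebook_task` are assumed for illustration, since the submit endpoint's fields are not listed in the issue.

```python
def build_submit_payload(run_name=None, new_cluster=None, notebook_task=None,
                         **extra_api_fields):
    """Build the JSON body for the submit call.

    Named arguments mirror request fields one to one. Any other keyword
    is merged in unchanged, so new API fields work without a code change.
    """
    payload = dict(extra_api_fields)
    named = {
        "run_name": run_name,
        "new_cluster": new_cluster,
        "notebook_task": notebook_task,
    }
    # Leave out fields the caller did not set.
    payload.update({k: v for k, v in named.items() if v is not None})
    return payload


if __name__ == "__main__":
    print(build_submit_payload(
        run_name="nightly",
        notebook_task={"notebook_path": "/Users/someone/report"},
        some_future_field=True,  # hypothetical field, passed through as-is
    ))
```

In the operator itself, keys that BaseOperator understands (such as `task_id`) would have to be separated out before this merge. That step is left out here.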





[jira] [Resolved] (AIRFLOW-1016) Allow HTTP HEAD request method on HTTPSensor

2017-04-05 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-1016.
-
   Resolution: Fixed
Fix Version/s: 1.9.0

Issue resolved by pull request #2175
[https://github.com/apache/incubator-airflow/pull/2175]

> Allow HTTP HEAD request method on HTTPSensor
> 
>
> Key: AIRFLOW-1016
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1016
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: msempere
>Assignee: msempere
>Priority: Minor
>  Labels: features
> Fix For: 1.9.0
>
>
> HTTPSensor hardcodes the HTTP request method to `GET`, but there can be cases 
> where the `HEAD` method is needed for the sensor.
> This is useful when we just need to retrieve some metadata and not the 
> complete body for that particular request, and that metadata is 
> enough for our sensor.
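
To show the difference outside of Airflow, here is a standard-library sketch of a poke that uses `HEAD`. `build_probe` and `poke` are hypothetical names, and this is not the HTTPSensor code from PR #2175.

```python
from urllib.error import URLError
from urllib.request import Request, urlopen


def build_probe(url, method="HEAD"):
    """Build a request with a configurable HTTP method."""
    return Request(url, method=method)


def poke(url, method="HEAD", timeout=10):
    """Return True if the endpoint answers with a 2xx status."""
    try:
        with urlopen(build_probe(url, method), timeout=timeout) as response:
            return 200 <= response.status < 300
    except URLError:
        # Covers connection failures and HTTP error statuses alike.
        return False


if __name__ == "__main__":
    print(build_probe("http://example.com/data.csv").get_method())
```

With `HEAD` the server returns only the status line and headers, so a sensor that polls a large file every few seconds does not download it each time.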





[jira] [Resolved] (AIRFLOW-947) Make PrestoHook surface better messages when the Presto Cluster is unavailable.

2017-04-04 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-947.

   Resolution: Fixed
Fix Version/s: 1.9.0

Issue resolved by pull request #2128
[https://github.com/apache/incubator-airflow/pull/2128]

> Make PrestoHook surface better messages when the Presto Cluster is 
> unavailable.
> ---
>
> Key: AIRFLOW-947
> URL: https://issues.apache.org/jira/browse/AIRFLOW-947
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Arthur Wiedmer
>Assignee: Arthur Wiedmer
>Priority: Minor
> Fix For: 1.9.0
>
>






[jira] [Commented] (AIRFLOW-1067) Should not use airf...@airflow.com in examples

2017-04-04 Thread Arthur Wiedmer (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-1067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955329#comment-15955329
 ] 

Arthur Wiedmer commented on AIRFLOW-1067:
-

Duplicate of https://issues.apache.org/jira/browse/AIRFLOW-1066 We had the same 
idea.

> Should not use airf...@airflow.com in examples
> --
>
> Key: AIRFLOW-1067
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1067
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>Priority: Minor
>
> airflow.com is owned by a company named Airflow (selling fans, etc). We 
> should use airf...@example.com in all examples.





[jira] [Created] (AIRFLOW-1066) Replace instances of airf...@airflow.com with airf...@example.com

2017-04-04 Thread Arthur Wiedmer (JIRA)
Arthur Wiedmer created AIRFLOW-1066:
---

 Summary: Replace instances of airf...@airflow.com with 
airf...@example.com
 Key: AIRFLOW-1066
 URL: https://issues.apache.org/jira/browse/AIRFLOW-1066
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Arthur Wiedmer
Assignee: Arthur Wiedmer
Priority: Trivial


airflow.com is a website registered to a company selling fans :) We can use 
example.com as a domain name.





[jira] [Resolved] (AIRFLOW-1038) Specify celery serializers explicitly and pin version

2017-04-03 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-1038.
-
   Resolution: Fixed
Fix Version/s: 1.9.0

Issue resolved by pull request #2185
[https://github.com/apache/incubator-airflow/pull/2185]

> Specify celery serializers explicitly and pin version
> -
>
> Key: AIRFLOW-1038
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1038
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Alex Guziel
>Assignee: Alex Guziel
> Fix For: 1.9.0
>
>
> Celery 3->4 upgrade changes the default task and result serializer from 
> pickle to json. Pickle is faster and supports more types 
> http://docs.celeryproject.org/en/latest/userguide/calling.html
> This also causes issues when different versions of celery are running on 
> different hosts.
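
As a rough illustration of what "explicit" means here, the serializers can be pinned in the Celery configuration instead of relying on version-dependent defaults. The dict name `CELERY_CONFIG` is an assumption; the keys are the Celery 4 lowercase setting names, and this is a sketch rather than the configuration merged in the pull request.

```python
# Celery 4 style (lowercase) setting names. Celery 3 used the uppercase
# CELERY_TASK_SERIALIZER / CELERY_RESULT_SERIALIZER / CELERY_ACCEPT_CONTENT forms.
CELERY_CONFIG = {
    "task_serializer": "pickle",    # how task messages are encoded
    "result_serializer": "pickle",  # how task results are encoded
    "accept_content": ["pickle"],   # content types workers will accept
}
```

Setting these on every host keeps workers and schedulers in agreement even if their Celery versions differ.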





[jira] [Resolved] (AIRFLOW-1007) Jinja sandbox is vulnerable to RCE

2017-04-03 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-1007.
-
   Resolution: Fixed
Fix Version/s: 1.9.0

Issue resolved by pull request #2184
[https://github.com/apache/incubator-airflow/pull/2184]

> Jinja sandbox is vulnerable to RCE
> --
>
> Key: AIRFLOW-1007
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1007
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Alex Guziel
>Assignee: Alex Guziel
> Fix For: 1.9.0
>
>
> Right now, the jinja template functionality in chart_data takes arbitrary 
> strings and executes them. We should use the sandbox functionality to prevent 
> this.





[jira] [Resolved] (AIRFLOW-999) Support for Redis database

2017-03-20 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-999.

   Resolution: Fixed
Fix Version/s: 1.9.0

Issue resolved by pull request #2165
[https://github.com/apache/incubator-airflow/pull/2165]

> Support for Redis database
> --
>
> Key: AIRFLOW-999
> URL: https://issues.apache.org/jira/browse/AIRFLOW-999
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: db
>Reporter: msempere
>Assignee: msempere
>Priority: Minor
>  Labels: features
> Fix For: 1.9.0
>
>
> Currently Airflow doesn't offer support for Redis DB.
> The idea is to create a Hook to connect to it and offer minimal 
> functionality.
> So the proposal is to create a sensor that monitors for the existence of a Redis key.
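
The sensor's check could be sketched as below, assuming a client object that exposes an `exists(key)` method in the style of redis-py. The class name `RedisKeyProbe` is hypothetical and this is not the code from the pull request.

```python
class RedisKeyProbe:
    """Report whether a given key currently exists in Redis."""

    def __init__(self, client, key):
        self.client = client  # any object with an exists(key) method
        self.key = key

    def poke(self):
        # redis-py returns an integer count from exists(); coerce it to bool.
        return bool(self.client.exists(self.key))
```

A sensor would call `poke()` repeatedly until it returns `True`.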





[jira] [Resolved] (AIRFLOW-997) Change setup.cfg to point to Apache instead of Max

2017-03-16 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-997.

   Resolution: Fixed
Fix Version/s: 1.9.0

Issue resolved by pull request #2162
[https://github.com/apache/incubator-airflow/pull/2162]

> Change setup.cfg to point to Apache instead of Max
> --
>
> Key: AIRFLOW-997
> URL: https://issues.apache.org/jira/browse/AIRFLOW-997
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Arthur Wiedmer
>Assignee: Arthur Wiedmer
>Priority: Minor
> Fix For: 1.9.0
>
>






[jira] [Created] (AIRFLOW-997) Change setup.cfg to point to Apache instead of Max

2017-03-16 Thread Arthur Wiedmer (JIRA)
Arthur Wiedmer created AIRFLOW-997:
--

 Summary: Change setup.cfg to point to Apache instead of Max
 Key: AIRFLOW-997
 URL: https://issues.apache.org/jira/browse/AIRFLOW-997
 Project: Apache Airflow
  Issue Type: Improvement
Reporter: Arthur Wiedmer
Assignee: Arthur Wiedmer
Priority: Minor








[jira] [Resolved] (AIRFLOW-960) Add support for .editorconfig

2017-03-09 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-960.

   Resolution: Fixed
Fix Version/s: 1.9.0

Issue resolved by pull request #2137
[https://github.com/apache/incubator-airflow/pull/2137]

> Add support for .editorconfig
> -
>
> Key: AIRFLOW-960
> URL: https://issues.apache.org/jira/browse/AIRFLOW-960
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: George Leslie-Waksman
>Assignee: George Leslie-Waksman
>Priority: Trivial
> Fix For: 1.9.0
>
>






[jira] [Resolved] (AIRFLOW-959) .gitignore file is disorganized and incomplete

2017-03-09 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-959.

   Resolution: Fixed
Fix Version/s: 1.9.0

Issue resolved by pull request #2136
[https://github.com/apache/incubator-airflow/pull/2136]

> .gitignore file is disorganized and incomplete
> --
>
> Key: AIRFLOW-959
> URL: https://issues.apache.org/jira/browse/AIRFLOW-959
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: George Leslie-Waksman
>Assignee: George Leslie-Waksman
>Priority: Trivial
> Fix For: 1.9.0
>
>






[jira] [Commented] (AIRFLOW-959) .gitignore file is disorganized and incomplete

2017-03-09 Thread Arthur Wiedmer (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903695#comment-15903695
 ] 

Arthur Wiedmer commented on AIRFLOW-959:


+1

> .gitignore file is disorganized and incomplete
> --
>
> Key: AIRFLOW-959
> URL: https://issues.apache.org/jira/browse/AIRFLOW-959
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: George Leslie-Waksman
>Assignee: George Leslie-Waksman
>Priority: Trivial
>






[jira] [Created] (AIRFLOW-947) Make PrestoHook surface better messages when the Presto Cluster is unavailable.

2017-03-06 Thread Arthur Wiedmer (JIRA)
Arthur Wiedmer created AIRFLOW-947:
--

 Summary: Make PrestoHook surface better messages when the Presto 
Cluster is unavailable.
 Key: AIRFLOW-947
 URL: https://issues.apache.org/jira/browse/AIRFLOW-947
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Arthur Wiedmer
Assignee: Arthur Wiedmer
Priority: Minor








[jira] [Commented] (AIRFLOW-846) Release schedule, latest tag is too old

2017-03-03 Thread Arthur Wiedmer (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15894536#comment-15894536
 ] 

Arthur Wiedmer commented on AIRFLOW-846:


Hi [~ultrabug],

We are on RC5 now, and will release to PyPI once the current blockers are 
cleared and a new vote on the release is taken. All of this combined might 
take another couple of weeks.

Best,
Arthur

> Release schedule, latest tag is too old
> ---
>
> Key: AIRFLOW-846
> URL: https://issues.apache.org/jira/browse/AIRFLOW-846
> Project: Apache Airflow
>  Issue Type: Task
>Reporter: Ultrabug
>Priority: Blocker
>  Labels: release, tagging
>
> To my understanding, there is no clear point about the release schedule of 
> the project.
> The latest tag is 1.7.1.3 from June 2016, which is not well suited for 
> production nowadays.
> For example, the latest available release is still affected by AIRFLOW-178 
> which means that we have to patch the sources on production to work with ZIP 
> files.
> Could you please share your thoughts and position on the release planning of 
> the project ?
> Would it be possible to get a newer tag sometime soon?
> Thank you





[jira] [Commented] (AIRFLOW-916) Fix ConfigParser deprecation warning

2017-03-01 Thread Arthur Wiedmer (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15891058#comment-15891058
 ] 

Arthur Wiedmer commented on AIRFLOW-916:


This was breaking things for me on 2.7.13 on a local fresh install.

Let's revert.

> Fix ConfigParser deprecation warning 
> -
>
> Key: AIRFLOW-916
> URL: https://issues.apache.org/jira/browse/AIRFLOW-916
> Project: Apache Airflow
>  Issue Type: Improvement
>Affects Versions: 1.8.0
>Reporter: Jeremiah Lowin
>Assignee: Jeremiah Lowin
>Priority: Trivial
> Fix For: 1.9.0
>
>
> ConfigParser.readfp() is deprecated in favor of ConfigParser.read_file(), 
> according to warning messages
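
One way to avoid both the deprecation warning and a break on Python 2.7 (where `read_file` does not exist) is to pick whichever method the parser offers. This is a sketch of that idea under the assumption that a small wrapper is acceptable; `read_config` is a hypothetical helper, not the change that was merged or reverted.

```python
try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2


def read_config(parser, fp):
    """Load config from a file object using read_file when it is available."""
    # Python 3 parsers have read_file(); Python 2 parsers only have readfp().
    reader = getattr(parser, "read_file", None) or parser.readfp
    reader(fp)
    return parser
```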





[jira] [Created] (AIRFLOW-885) Add Change.org to the list of Airflow users

2017-02-17 Thread Arthur Wiedmer (JIRA)
Arthur Wiedmer created AIRFLOW-885:
--

 Summary: Add Change.org to the list of Airflow users
 Key: AIRFLOW-885
 URL: https://issues.apache.org/jira/browse/AIRFLOW-885
 Project: Apache Airflow
  Issue Type: Task
Reporter: Arthur Wiedmer








[jira] [Created] (AIRFLOW-731) NamedHivePartitionSensor chokes on partition predicate with periods.

2017-01-04 Thread Arthur Wiedmer (JIRA)
Arthur Wiedmer created AIRFLOW-731:
--

 Summary: NamedHivePartitionSensor chokes on partition predicate 
with periods.
 Key: AIRFLOW-731
 URL: https://issues.apache.org/jira/browse/AIRFLOW-731
 Project: Apache Airflow
  Issue Type: Bug
Affects Versions: Airflow 1.7.1, Airflow 1.7.0
Reporter: Arthur Wiedmer
Assignee: Arthur Wiedmer
Priority: Trivial


The partition parsing function did not limit splitting to the first period, 
which caused issues when the partition predicate itself contains periods.
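
A minimal sketch of the fix, assuming partition names of the form `schema.table/partition_spec`; the helper name `parse_partition_name` is hypothetical. Passing a split limit of 1 means only the first period separates schema from table, so periods inside the predicate survive.

```python
def parse_partition_name(partition):
    """Split 'schema.table/partition_spec' using only the first period."""
    # maxsplit=1 keeps any later periods inside the remainder.
    schema, remainder = partition.split(".", 1)
    table, spec = remainder.split("/", 1)
    return schema, table, spec
```

An unbounded `partition.split(".")` would instead break a value such as `ds=2017.01.04` into several pieces.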



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-682) Bump MAX_PERIODS

2016-12-07 Thread Arthur Wiedmer (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729673#comment-15729673
 ] 

Arthur Wiedmer commented on AIRFLOW-682:


+1. Very useful for large-ish DAGs (> 1k tasks), as this limit also applies to 
the max number of tasks when marking upstream or downstream success.

> Bump MAX_PERIODS
> 
>
> Key: AIRFLOW-682
> URL: https://issues.apache.org/jira/browse/AIRFLOW-682
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Dan Davydov
>Assignee: Dan Davydov
>
> It is not possible to mark success on some large DAGs due to the MAX_PERIODS 
> being set to 1000. We should temporarily bump it up until work can be done to 
> scale the mark success endpoint much higher.





[jira] [Resolved] (AIRFLOW-575) Improve tutorial information about default_args

2016-10-17 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-575.

Resolution: Fixed

> Improve tutorial information about default_args
> ---
>
> Key: AIRFLOW-575
> URL: https://issues.apache.org/jira/browse/AIRFLOW-575
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Laura Lorenz
>Assignee: Laura Lorenz
>Priority: Minor
>






[jira] [Created] (AIRFLOW-575) Improve tutorial information about default_args

2016-10-16 Thread Arthur Wiedmer (JIRA)
Arthur Wiedmer created AIRFLOW-575:
--

 Summary: Improve tutorial information about default_args
 Key: AIRFLOW-575
 URL: https://issues.apache.org/jira/browse/AIRFLOW-575
 Project: Apache Airflow
  Issue Type: Improvement
  Components: Documentation
Reporter: Laura Lorenz
Assignee: Laura Lorenz
Priority: Minor








[jira] [Commented] (AIRFLOW-497) Release plans & info

2016-09-09 Thread Arthur Wiedmer (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15478185#comment-15478185
 ] 

Arthur Wiedmer commented on AIRFLOW-497:


Hi Alexander,

I think I can leave a quick update here. While the committers and various 
contributors have worked on several improvements, we have been blocked on 
navigating our first Apache release (a decent number of contributors are new to 
this process and it takes a little getting used to).

The main issues that the next release will address are licensing issues, 
stripping out components that were not compatible with the Apache License as 
well as a few bug fixes. We hope to be able to release more often in the future 
once we document the release process internally and make sure we are starting 
with the right base to be a successful project under the Apache umbrella.

A general idea of the improvement roadmap can be found on the wiki : 
https://cwiki.apache.org/confluence/display/AIRFLOW/Roadmap

Feel free to ping the dev mailing list also if you have more questions or want 
to start a conversation about releases.

Best,
Arthur

> Release plans & info
> 
>
> Key: AIRFLOW-497
> URL: https://issues.apache.org/jira/browse/AIRFLOW-497
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: core, docs
>Reporter: Alexander Kachkaev
>Priority: Minor
>  Labels: build, newbie, release
>
> I did a couple of experiments with airflow several months ago and returned to 
> explore it properly this week. After a few days of quite intensive reading 
> and hacking it still remains unclear to me what's going on with the project 
> ATM.
> The latest release is 1.7.1.3, which dates back to 2016-06-13 (three months 
> ago). The docs on pythonhosted sometimes refer to 1.8 and git blame 
> reveals that these mentions have been there since at least April 2016. 
> JIRA's dashboard has references to versions 1.8 and 2.0, but those only 
> contain lists with issues - no deadline etc.
> I imagine that core developers have a clear picture about the situation and 
> it is probably possible to figure things out from the mailing list and 
> Gitter. However, it would be good to see the roadmap etc. in a slightly more 
> accessible way.
> More frequent releases will help a lot as well. I'm seeing some issues when 
> running 1.7.1.3 via docker-airflow / celery, but it's totally unclear whether 
> these still exist on airflow's master branch or even something's wrong with 
> the docker wrapper I'm using. Opening an issue in JIRA seems somewhat stupid 
> in this situation.
> Could anyone please increase the clarity of meta?





[jira] [Commented] (AIRFLOW-323) Should be able to prevent tasks from overlapping across multiple DAG Runs

2016-07-11 Thread Arthur Wiedmer (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371291#comment-15371291
 ] 

Arthur Wiedmer commented on AIRFLOW-323:


Hi Isaac, it sounds like there are a couple of things that could help you:
1) You can set max_active_runs for the DAG to 1 to ensure that only one DAG run 
is active at a time.
2) You can set depends_on_past to True such that this task will not execute 
unless the previous one completes.
3) Finally, you can make this task use a pool with one slot, such that this task 
basically takes a lock on this particular resource.

Though ideally, if several tasks are competing for the same resource, you might 
not want to schedule them at a cadence that will introduce contention...
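
The three options above map onto arguments that Airflow accepts on the DAG and on individual tasks (the task-level parameter is spelled `depends_on_past`). The sketch below only collects them as keyword arguments so it runs without Airflow installed; the pool name is hypothetical and the pool would have to be created with a single slot.

```python
# Option 1: DAG-level argument, at most one active DAG run at a time.
dag_kwargs = {"max_active_runs": 1}

# Options 2 and 3: task-level arguments.
task_kwargs = {
    "depends_on_past": True,     # wait for the previous run of this task
    "pool": "single_slot_pool",  # hypothetical pool configured with 1 slot
}

# Usage sketch (requires Airflow, shown as comments only):
#   dag = DAG("my_dag", start_date=some_date, **dag_kwargs)
#   task = SomeOperator(task_id="my_task", dag=dag, **task_kwargs)
```

Any one of the three is usually enough; combining them is possible but makes it harder to see which setting is holding a task back.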

> Should be able to prevent tasks from overlapping across multiple DAG Runs
> -
>
> Key: AIRFLOW-323
> URL: https://issues.apache.org/jira/browse/AIRFLOW-323
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: Airflow 1.7.1.2
> Environment: 1.7.1.2
>Reporter: Isaac Steele
>Assignee: Isaac Steele
>
> As an Airflow administrator,
> If a task from a previous DAG Run is still running when the next scheduled 
> run triggers the same task, there should be a way to prevent the tasks from 
> overlapping.
> Otherwise the same code could end up running multiple times simultaneously.
> To reproduce:
> 1) Create a DAG with a short scheduled interval
> 2) Create a task in that DAG to run longer than the interval
> Result: Both tasks end up running at the same time.
> This can cause tasks to compete for resources as well as duplicating or 
> overwriting what the other task is doing.





[jira] [Updated] (AIRFLOW-264) Adding support for Hive queues.

2016-07-06 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer updated AIRFLOW-264:
---
Fix Version/s: Airflow 1.8

> Adding support for Hive queues.
> ---
>
> Key: AIRFLOW-264
> URL: https://issues.apache.org/jira/browse/AIRFLOW-264
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: hive_hooks
>Reporter: Arthur Wiedmer
>Assignee: Arthur Wiedmer
>Priority: Minor
> Fix For: Airflow 1.8
>
>
> Hive allows for queues to be set for workload management. We have started 
> using them for multi-tenant management on our Hive cluster. 





[jira] [Resolved] (AIRFLOW-264) Adding support for Hive queues.

2016-07-06 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-264.

Resolution: Fixed

> Adding support for Hive queues.
> ---
>
> Key: AIRFLOW-264
> URL: https://issues.apache.org/jira/browse/AIRFLOW-264
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: hive_hooks
>Reporter: Arthur Wiedmer
>Assignee: Arthur Wiedmer
>Priority: Minor
>
> Hive allows for queues to be set for workload management. We have started 
> using them for multi-tenant management on our Hive cluster. 





[jira] [Resolved] (AIRFLOW-263) Backtick file introduced by Highcharts refactor

2016-06-21 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-263.

Resolution: Fixed

> Backtick file introduced by Highcharts refactor
> ---
>
> Key: AIRFLOW-263
> URL: https://issues.apache.org/jira/browse/AIRFLOW-263
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Arthur Wiedmer
>Assignee: Arthur Wiedmer
>Priority: Minor
>
> A file named "`" was introduced during the Highcharts removal. See
> https://github.com/apache/incubator-airflow/commit/0a460081bc7cba2d05434148f092b87d35aa8cd3
> My best assessment is that this was a temporary file created by mistake.





[jira] [Commented] (AIRFLOW-263) Backtick file introduced by Highcharts refactor

2016-06-20 Thread Arthur Wiedmer (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340606#comment-15340606
 ] 

Arthur Wiedmer commented on AIRFLOW-263:


PR here : https://github.com/apache/incubator-airflow/pull/1613

> Backtick file introduced by Highcharts refactor
> ---
>
> Key: AIRFLOW-263
> URL: https://issues.apache.org/jira/browse/AIRFLOW-263
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Arthur Wiedmer
>Assignee: Arthur Wiedmer
>Priority: Minor
>
> A file named "`" was introduced during the Highcharts removal. See
> https://github.com/apache/incubator-airflow/commit/0a460081bc7cba2d05434148f092b87d35aa8cd3
> My best assessment is that this was a temporary file created by mistake.





[jira] [Commented] (AIRFLOW-263) Backtick file introduced by Highcharts refactor

2016-06-20 Thread Arthur Wiedmer (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340552#comment-15340552
 ] 

Arthur Wiedmer commented on AIRFLOW-263:


[~bolke], you are the best to assess if this file is needed, but it looks like 
a temp file to me.

> Backtick file introduced by Highcharts refactor
> ---
>
> Key: AIRFLOW-263
> URL: https://issues.apache.org/jira/browse/AIRFLOW-263
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Arthur Wiedmer
>Assignee: Arthur Wiedmer
>Priority: Minor
>
> A file named "`" was introduced during the Highcharts removal. See
> https://github.com/apache/incubator-airflow/commit/0a460081bc7cba2d05434148f092b87d35aa8cd3
> My best assessment is that this was a temporary file created by mistake.





[jira] [Created] (AIRFLOW-263) Backtick file introduced by Highcharts refactor

2016-06-20 Thread Arthur Wiedmer (JIRA)
Arthur Wiedmer created AIRFLOW-263:
--

 Summary: Backtick file introduced by Highcharts refactor
 Key: AIRFLOW-263
 URL: https://issues.apache.org/jira/browse/AIRFLOW-263
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Arthur Wiedmer
Assignee: Arthur Wiedmer
Priority: Minor


A file named "`" was introduced during the Highcharts removal. See
https://github.com/apache/incubator-airflow/commit/0a460081bc7cba2d05434148f092b87d35aa8cd3

My best assessment is that this was a temporary file created by mistake.







[jira] [Commented] (AIRFLOW-184) Add clear/mark success to CLI

2016-06-06 Thread Arthur Wiedmer (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317196#comment-15317196
 ] 

Arthur Wiedmer commented on AIRFLOW-184:


Sounds good to me.

Ideally, this should not need to be queued. Should the mark_success command 
just be a wrapper around a more general set_state?

Marking large swaths of tasks as success is a pain in the UI, and the backfill 
with regex matching was useful for this. But I agree that it does not make 
sense anymore and should be refactored into something more useful that does 
not go through the scheduler, as it is a waste of slots.
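
The wrapper relationship suggested above could look like this sketch. Both function names come from the comment, but their signatures and the in-memory `task_states` mapping are assumptions for illustration only; a real implementation would update task instances in the metadata database.

```python
from functools import partial


def set_state(task_states, task_ids, state):
    """Set the given state on each listed task directly, with no scheduling."""
    for task_id in task_ids:
        task_states[task_id] = state
    return task_states


# mark_success is just set_state with the state fixed to "success".
mark_success = partial(set_state, state="success")
```

A `clear` command could be expressed the same way with a different fixed state.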

> Add clear/mark success to CLI
> -
>
> Key: AIRFLOW-184
> URL: https://issues.apache.org/jira/browse/AIRFLOW-184
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: cli
>Reporter: Chris Riccomini
>Assignee: Joy Gao
>
> AIRFLOW-177 pointed out that the current CLI does not allow us to clear or 
> mark success a task (including upstream, downstream, past, future, and 
> recursive) the way that the UI widget does. Given a goal of keeping parity 
> between the UI and CLI, it seems like we should support this.





[jira] [Created] (AIRFLOW-186) conn.literal is specific to MySQLdb, and should be factored out of the dbapi_hook

2016-05-27 Thread Arthur Wiedmer (JIRA)
Arthur Wiedmer created AIRFLOW-186:
--

 Summary: conn.literal is specific to MySQLdb, and should be 
factored out of the dbapi_hook
 Key: AIRFLOW-186
 URL: https://issues.apache.org/jira/browse/AIRFLOW-186
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Arthur Wiedmer
Assignee: Arthur Wiedmer








[jira] [Assigned] (AIRFLOW-115) Migrate and Refactor AWS integration to use boto3 and better structured hooks

2016-05-13 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer reassigned AIRFLOW-115:
--

Assignee: Arthur Wiedmer

> Migrate and Refactor AWS integration to use boto3 and better structured hooks
> -
>
> Key: AIRFLOW-115
> URL: https://issues.apache.org/jira/browse/AIRFLOW-115
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: AWS, boto3, hooks
>Reporter: Arthur Wiedmer
>Assignee: Arthur Wiedmer
>Priority: Minor
>
> h2. Current State
> The current AWS integration is mostly done through the S3Hook, which uses 
> non-standard credentials parsing on top of using boto instead of boto3, which 
> is the currently supported AWS SDK for Python.
> h2. Proposal
> An AWSHook should be provided that maps Airflow connections to the boto3 API. 
> Hooks working with S3, as well as other AWS services, would then inherit 
> from this hook but extend the functionality with service-specific methods 
> like get_key for S3, start_cluster for EMR, enqueue for SQS, send_email for 
> SES, etc.
> * AWSHook
> ** S3Hook
> ** EMRHook
> ** SQSHook
> ** SESHook
> ...
>  
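
The proposed hierarchy might be sketched as follows. `AWSHook`, `S3Hook` and `get_key` are named in the proposal; the `client_factory` argument, the `get_client` method and the lazy `boto3` import are assumptions added so the sketch can run without AWS credentials. Mapping Airflow connections to credentials is left out.

```python
class AWSHook:
    """Base hook: obtains a boto3 client for one AWS service."""

    service_name = None  # set by subclasses, e.g. "s3"

    def __init__(self, client_factory=None):
        # client_factory is injectable for testing; defaults to boto3.client.
        self._client_factory = client_factory
        self._client = None

    def get_client(self):
        if self._client is None:
            factory = self._client_factory
            if factory is None:
                import boto3  # imported lazily so the sketch loads without boto3
                factory = boto3.client
            self._client = factory(self.service_name)
        return self._client


class S3Hook(AWSHook):
    """Service-specific hook adding S3 helpers on top of the base hook."""

    service_name = "s3"

    def get_key(self, bucket, key):
        return self.get_client().get_object(Bucket=bucket, Key=key)
```

EMR, SQS and SES hooks would follow the same pattern with their own `service_name` and helper methods.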





[jira] [Created] (AIRFLOW-110) Point people to the appropriate process to submit PRs in the repository's CONTRIBUTING.md

2016-05-12 Thread Arthur Wiedmer (JIRA)
Arthur Wiedmer created AIRFLOW-110:
--

 Summary: Point people to the appropriate process to submit PRs in 
the repository's CONTRIBUTING.md
 Key: AIRFLOW-110
 URL: https://issues.apache.org/jira/browse/AIRFLOW-110
 Project: Apache Airflow
  Issue Type: Task
  Components: docs
Reporter: Arthur Wiedmer
Priority: Trivial


The current process to contribute code could be made more accessible. I am 
assuming that the entry point to the project is GitHub and the repository. We 
could modify CONTRIBUTING.md as well as the README to point to the proper 
way to do this. 





[jira] [Created] (AIRFLOW-109) PrestoHook get_pandas_df executes a method that can raise outside of the try catch statement.

2016-05-12 Thread Arthur Wiedmer (JIRA)
Arthur Wiedmer created AIRFLOW-109:
--

 Summary: PrestoHook get_pandas_df executes a method that can raise 
outside of the try catch statement.
 Key: AIRFLOW-109
 URL: https://issues.apache.org/jira/browse/AIRFLOW-109
 Project: Apache Airflow
  Issue Type: Bug
  Components: hooks
Affects Versions: Airflow 1.8, Airflow 1.7.1, Airflow 1.6.2
Reporter: Arthur Wiedmer
Assignee: Arthur Wiedmer
Priority: Minor


This issue occurs when a malformed SQL statement is passed to the get_pandas_df 
method of the Presto hook. PyHive raises a DatabaseError outside of the 
try/except block, leading to the wrong kind of error being raised.
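
The shape of the fix can be sketched as below: every call that may raise sits inside the `try`, so the driver error is always translated. The exception classes and the `fetch_rows` helper are stand-ins defined here for illustration; they are not PyHive's or Airflow's actual classes.

```python
class DatabaseError(Exception):
    """Stand-in for the database driver's DatabaseError."""


class PrestoQueryError(Exception):
    """Stand-in for the hook's own, friendlier exception type."""


def fetch_rows(cursor, sql):
    """Run a query, translating driver errors raised at any step."""
    try:
        cursor.execute(sql)       # may raise on malformed SQL
        return cursor.fetchall()  # may also raise, so it stays inside the try
    except DatabaseError as exc:
        raise PrestoQueryError("Presto query failed: {}".format(exc))
```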





[jira] [Commented] (AIRFLOW-17) Master Travis CI build is broken

2016-04-28 Thread Arthur Wiedmer (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-17?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262573#comment-15262573
 ] 

Arthur Wiedmer commented on AIRFLOW-17:
---

Unfortunately, I have very little knowledge of how the license check actually 
works.

The code here seems very simplistic : 
https://github.com/airbnb/airflow/blob/master/scripts/ci/check-license.sh#L98
Maybe we can disable this check in the case of a revert.


> Master Travis CI build is broken
> 
>
> Key: AIRFLOW-17
> URL: https://issues.apache.org/jira/browse/AIRFLOW-17
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Chris Riccomini
>
> It looks like master is broken:
> https://travis-ci.org/airbnb/airflow/branches
> This build seems to be the first one that broke:
> https://travis-ci.org/airbnb/airflow/builds/126014622


