You are right!! This is an interesting puzzle to follow, but we are unravelling it! -- Craig
On Thu, May 24, 2018 at 8:39 AM Ash Berlin-Taylor < ash_airflowl...@firemirror.com> wrote: > Kombu (a library Celery uses) 4.1.0 added it back in > https://github.com/celery/kombu/blob/master/Changelog#L75-L99 < > https://github.com/celery/kombu/blob/master/Changelog#L75-L99> - I > _thought_ that means it's supported in Celery again...? > > > > On 24 May 2018, at 16:34, Craig Rodrigues <rodr...@crodrigues.org> > wrote: > > > > Removal of sqla as a backend is mentioned in these release notes for > celery > > 4.0: > > > > > http://docs.celeryproject.org/en/latest/whatsnew-4.0.html#features-removed-for-lack-of-funding > > > > -- > > Craig > > > > > > On Thu, May 24, 2018 at 8:32 AM Craig Rodrigues <rodr...@crodrigues.org> > > wrote: > > > >> It looks like in Celery, the documentation for sqla broker was removed: > >> > >> > >> > https://github.com/celery/celery/commit/79810a26a116e9881c42a14d856fa94c40fefcd8#diff-29ccf8c96d521253467909a652e6ded2 > >> > >> I cannot find the pull request or release notes which document this. > >> > >> -- > >> Craig > >> > >> > >> On Thu, May 24, 2018 at 8:19 AM Ash Berlin-Taylor < > >> ash_airflowl...@firemirror.com> wrote: > >> > >>> Sounds like > >>> > https://github.com/apache/incubator-airflow/blob/v1-10-test/airflow/config_templates/default_celery.py#L31 > >>> < > >>> > https://github.com/apache/incubator-airflow/blob/v1-10-test/airflow/config_templates/default_celery.py#L31 > > > >>> should be guarded in some way to only do that for a redis:// and sqs:// > >>> backends. > >>> > >>> > >>>> On 24 May 2018, at 16:13, Craig Rodrigues <rodr...@crodrigues.org> > >>> wrote: > >>>> > >>>> Ash, > >>>> > >>>> According to this: > >>>> > >>> > http://docs.celeryproject.org/en/latest/userguide/configuration.html#broker-settings > >>>> visibility_timeout is supported by Redis and SQS. > >>>> > >>>> -- > >>>> Craig > >>>> > >>>> > >>>> On Thu, May 24, 2018 at 8:07 AM Craig Rodrigues < > rodr...@crodrigues.org > >>>> > >>>> wrote: > >>>> > >>>>> Ash, > >>>>> > >>>>> Thanks again. You are leading me on the right path! > >>>>> > >>>>> I can prepare a patch to move the ssl_ options into the celery > section. > >>>>> > >>>>> What about visbility_timeout? The error I am getting is: > >>>>> > >>>>> File > >>> "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/strategies.py", > >>>>> line 160, in create > >>>>> engineclass.__name__)) > >>>>> TypeError: Invalid argument(s) > >>>>> 'ssl_key','ssl_cert','ssl_active','visibility_timeout','ssl_cacert' > >>> sent to > >>>>> create_engine(), using configuration > >>>>> MySQLDialect_mysqldb/QueuePool/Engine. Please check that the keyword > >>>>> arguments are appropriate for this combination of components. > >>>>> > >>>>> > >>>>> So it looks like visibility_timeout does not work with sqla as well. > >>>>> -- > >>>>> Craig > >>>>> > >>>>> > >>>>> On Thu, May 24, 2018 at 2:17 AM Ash Berlin-Taylor < > >>>>> ash_airflowl...@firemirror.com> wrote: > >>>>> > >>>>>> Yes, you would need to duplicate a chunk of the default_celery in > your > >>>>>> copy right now. But you can just make it have the values you want - > >>> so it > >>>>>> would be about 10 lines in total. > >>>>>> > >>>>>> It seems that between AIRFLOW-966 and AIRFLOW-1840 things got a > little > >>>>>> bit out of sync in the default .cfg and the celery .py - the .py is > >>> looking > >>>>>> for celery->ssl_* but the default config puts it in > >>>>>> celery_broker_transport_options->ssl_*. Looking at the celery config > >>> option > >>>>>> it looks like they aren't actually options for the other transports, > >>> it's > >>>>>> just that they don't complain about the extra options. I think the > >>> fix is > >>>>>> just to move them up to the celery section. > >>>>>> > >>>>>> -ash > >>>>>> > >>>>>> > >>>>>>> On 24 May 2018, at 07:21, Craig Rodrigues <rodr...@crodrigues.org> > >>>>>> wrote: > >>>>>>> > >>>>>>> Ash, > >>>>>>> > >>>>>>> Thanks! You put me on the right track. > >>>>>>> Unfortunately, there is a lot of logic in > >>>>>>> airflow/config_templates/default_celery.py that I need, > >>>>>>> and if I was to come up with my own class to replace: > >>>>>>> > >>>>>>> celery_config_options = > >>>>>>> airflow.config_templates.default_celery.DEFAULT_CELERY_CONFIG > >>>>>>> > >>>>>>> then I would basically have to rewrite my own version of > >>>>>> default_celery.py > >>>>>>> > >>>>>>> Also, in airflow/config_templates/default_airflow.cfg, there is > this: > >>>>>>> > >>>>>>> [celery_broker_transport_options] > >>>>>>> # The visibility timeout defines the number of seconds to wait for > >>> the > >>>>>>> worker > >>>>>>> # to acknowledge the task before the message is redelivered to > >>> another > >>>>>>> worker. > >>>>>>> # Make sure to increase the visibility timeout to match the time of > >>> the > >>>>>>> longest > >>>>>>> # ETA you're planning to use. Especially important in case of using > >>>>>> Redis > >>>>>>> or SQS > >>>>>>> visibility_timeout = 21600 > >>>>>>> > >>>>>>> # In case of using SSL > >>>>>>> ssl_active = False > >>>>>>> ssl_key = > >>>>>>> ssl_cert = > >>>>>>> ssl_cacert = > >>>>>>> > >>>>>>> > >>>>>>> None of those options work if using an sqla backend for celery > >>>>>> broker_url. > >>>>>>> > >>>>>>> I need to think about this, but this needs to be cleaned up before > >>>>>>> Airflow 1.10 is released. In > >>>>>> airflow/config_templates/default_airflow.cfg > >>>>>>> there is this: > >>>>>>> > >>>>>>> broker_url = sqla+mysql://airflow:airflow@localhost:3306/airflow > >>>>>>> > >>>>>>> > >>>>>>> So sqla is specified as the default broker_url for celery. > >>>>>>> A lot of people (including where I work) have used this template to > >>> set > >>>>>> up > >>>>>>> Airflow + Celery, and even though sqla is an "experimental" > >>> broker_url, > >>>>>>> it actually works pretty well. > >>>>>>> > >>>>>>> Now in airflow 1.10, something that was very easy to set up is now > >>>>>> really > >>>>>>> complicated and unintuitive. > >>>>>>> > >>>>>>> Would it be OK to change the airflow code so that > >>>>>>> in airflow/config_templates/default_airflow.cfg, all the options in > >>>>>>> [celery_broker_transport_options] are commented out? > >>>>>>> And if someone is running Redis, they would have to add those > >>>>>>> options in their own airflow.cfg file? > >>>>>>> > >>>>>>> Bolke, do you have any comments? > >>>>>>> > >>>>>>> -- > >>>>>>> Craig > >>>>>>> > >>>>>>> On Tue, May 22, 2018 at 1:50 AM Ash Berlin-Taylor < > >>>>>>> ash_airflowl...@firemirror.com> wrote: > >>>>>>> > >>>>>>>> To use with the SQLA backend to celery you need to override the > >>> options > >>>>>>>> Airflow passes to Celery. Those come from > >>>>>>>> > >>>>>> > >>> > https://github.com/apache/incubator-airflow/blob/v1-10-test/airflow/config_templates/default_celery.py > >>>>>>>> > >>>>>>>> Since you don't want most/all of those options (and there is no > way > >>> in > >>>>>> the > >>>>>>>> config file to _remove_ a setting) you will have to point airflow > >>> to a > >>>>>>>> different file for the celery config: > >>>>>>>> > >>>>>>>> This line in the config is what you will need to change: > >>>>>>>> > >>>>>>>> # Import path for celery configuration options > >>>>>>>> celery_config_options = > >>>>>>>> airflow.config_templates.default_celery.DEFAULT_CELERY_CONFIG > >>>>>>>> > >>>>>>>> If you create something like config/celery_config.py containing: > >>>>>>>> > >>>>>>>> > >>>>>>>> CELERY_CONFIG = { > >>>>>>>> # Just the options you want to set > >>>>>>>> } > >>>>>>>> > >>>>>>>> > >>>>>>>> (config/ should exist along side your dags/ folder, and I think it > >>>>>> should > >>>>>>>> be added to the python path already). You can then set this in the > >>>>>> config: > >>>>>>>> > >>>>>>>> celery_config_options = celery_config.CELERY_CONFIG > >>>>>>>> > >>>>>>>> That should give you complete control > >>>>>>>> > >>>>>>>> > >>>>>> > >>>>>> > >>> > >>> > >