> On Jul 26, 2017, at 2:25 PM, Bolke de Bruin <[email protected]> wrote:
>
> Can you explain what is solved by the patch? It only seems to set an empty
> default value?
>
> If that solves it, it also seems to be a bug with celery. Or did you set some
> options? That would be great to share.
>
Apologies - forgot to mention that the option I set via BROKER_TRANSPORT_OPTIONS
was visibility_timeout, which I configured to more than 1d (in seconds).

So I am not sure whether that should be set in the config template with a sane
default (say 1d), or whether it should just be documented and the config left
with an empty broker_transport_option json.

Cheers,
J

> Bolke
>
> Sent from my iPhone
>
>> On 26 Jul 2017, at 19:44, Jawahar Panchal <[email protected]> wrote:
>>
>> Howdy - wanted to provide an update - the following patch applied manually
>> to a clean 1.8.1 install addressed the issue:
>>
>> https://github.com/apache/incubator-airflow/pull/2143
>>
>> We have confirmed/verified this with jobs running over multiple hours on our
>> instance - so the above pull request can close both of the below:
>>
>> https://issues.apache.org/jira/browse/AIRFLOW-966
>> https://issues.apache.org/jira/browse/AIRFLOW-1258
>>
>> As I have seen further discussion on 1.8.2 with the final build not being
>> tagged yet (iirc), would it be possible to merge this for the 1.8.2 release?
>> As it currently is, someone using Airflow+Celery will have a broken
>> configuration for long-running jobs - would the dev team consider this
>> major/critical enough to include?
>>
>> Cheers,
>> J
>>
>>> On Jul 17, 2017, at 3:15 PM, Jawahar Panchal <[email protected]> wrote:
>>>
>>> Hi again!
>>>
>>>
>>>> On Jul 16, 2017, at 3:22 AM, Alex Guziel <[email protected]> wrote:
>>>>
>>>> I think this may be related to a celery bug. I'll follow up with more
>>>> details later.
>>>>
>>>
>>> Just replying back to the note earlier in the thread - apologies for the
>>> earlier top-posting, got a bit excited that I might have found the issue,
>>> and of course lack of sleep results in one doing terrible, terrible things…
>>> :)
>>>
>>> Any idea if my suspicion around the 1h default visibility timeout between
>>> celery/redis is the culprit?
>>>
>>>> On Sun, Jul 16, 2017 at 12:56 AM Jawahar Panchal <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi!
>>>>>
>>>>> I am currently running a couple of long-running tasks on a
>>>>> database/dataset at school for a project that results in behavior/log
>>>>> output similar to what was flagged in this bug:
>>>>> https://issues.apache.org/jira/browse/AIRFLOW-1258
>>>>>
>>>>> Wasn’t sure if anyone on the list had seen anything similar, or would know
>>>>> what I can do to possibly debug further/patch. As it takes 1hr to test a
>>>>> change, needless to say any pointers from the dev team on the right
>>>>> direction to look within the codebase would be much appreciated! :)
>>>>>
>>>>> Thanks in advance for everyone’s/anyone's time and help - am not an
>>>>> Airflow expert, but am hopefully learning quickly enough to help resolve
>>>>> this issue (if I am ‘barking up the right tree’ with this bug number…)
>>>>>
>>>>> Cheers,
>>>>> J
>>>>>
>>>>>
>>>
>>> Cheers,
>>> J
>>
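For anyone finding this thread later, here is a rough sketch of the two options discussed at the top of the thread: leaving the transport options empty, or filling in a 1d visibility_timeout default. The helper name and the constant are hypothetical and are not taken from the Airflow codebase or from the pull request. Only the visibility_timeout key and the broker transport options setting itself are Celery names. It assumes the options arrive as a JSON string from the config.

```python
import json

# Hypothetical constant: one day expressed in seconds, the "sane default"
# floated in this thread. Celery's Redis transport defaults to 1 hour.
ONE_DAY_SECONDS = 24 * 60 * 60


def build_broker_transport_options(raw_json="{}", default_timeout=None):
    """Parse a JSON string of broker transport options (hypothetical helper).

    If default_timeout is given and the parsed options do not already set
    visibility_timeout, the default is filled in. An explicit value in the
    JSON always wins.
    """
    options = json.loads(raw_json) if raw_json else {}
    if not isinstance(options, dict):
        raise ValueError("broker transport options must be a JSON object")
    if default_timeout is not None:
        options.setdefault("visibility_timeout", default_timeout)
    return options


# Option 1: documented only, config left empty.
empty_options = build_broker_transport_options("{}")

# Option 2: config template ships a 1d default.
defaulted_options = build_broker_transport_options("{}", ONE_DAY_SECONDS)

# The resulting dict would then be handed to Celery as its broker transport
# options setting (BROKER_TRANSPORT_OPTIONS in older Celery releases).
```

With the second option, a user who sets their own visibility_timeout in the JSON keeps that value, so the default only affects installs that never touched the setting.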
