Thanks - adding set -e works like you said - much nicer.
PS the script is below - it's just to copy some files from a GCS bucket to a
BQ table. Might it be the "until" block that is causing this? (just curious)
Cheers,
Andy
##################
#!/bin/sh
set -e
lob=$1
lobshort=$2
modelname=$3
begindate=$4
enddate=$5
currentdate=$begindate
loopenddate=$(date --date "$enddate 1 day" +%Y%m%d)
today=$(date --date="-0 days" +%Y%m%d)
until [ "$currentdate" = "$loopenddate" ] # POSIX sh uses =, not ==
do
printf '\n##################################\n'
printf 'DISPLAY INPUTS'
printf '\n##################################\n'
echo 'currentdate is '${currentdate}
echo 'loopenddate is '${loopenddate}
# get prediction timestamp as latest prediction run for a given date
prediction_timestamp=$(gsutil ls gs://pmc-ml/clickmodel/${lobshort}/pred/${currentdate}* \
  | grep -o '/[0-9].*/' | sed -r 's/[/]//g' | sort -r | head -n 1)
SOURCE=gs://pmc-ml/${modelname}/${lobshort}/pred/${prediction_timestamp}
TARGET_TABLE=${lob}.${modelname}_predictions_${currentdate}
TARGET_TABLE_MAP=${lob}.${modelname}_predictions_map_${currentdate}
echo '-----------------------------------'
echo 'lob: '$lob
echo 'lobshort: '$lobshort
echo 'prediction_timestamp: '$prediction_timestamp
echo 'SOURCE: '$SOURCE
echo 'TARGET_TABLE: '$TARGET_TABLE
echo '-----------------------------------'
echo 'Most recent prediction output folders:'
gsutil ls gs://pmc-ml/${modelname}/${lobshort}/pred | grep -o '/[0-9].*/' \
  | sed -r 's/[/]//g' | sort -r | head -n 10
echo '-----------------------------------'
printf '\n##################################\n'
printf 'LOAD PREDICTIONS'
printf '\n##################################\n'
# load predictions
bq load --source_format=NEWLINE_DELIMITED_JSON --autodetect --replace \
$TARGET_TABLE \
$SOURCE/prediction.results*
# view predictions table
bq show $TARGET_TABLE
printf '\n##################################\n'
printf 'LOAD PREDICTION MAP'
printf '\n##################################\n'
# load predictions map
bq load --source_format=NEWLINE_DELIMITED_JSON --autodetect --replace \
$TARGET_TABLE_MAP \
$SOURCE/prediction_map*
# view predictions map table
bq show $TARGET_TABLE_MAP
# increment date by 1 day
currentdate=$(date --date "$currentdate 1 day" +%Y%m%d)
done
##################
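One guess while I'm here: the failing URI in the log has a double slash (pred//prediction.results*), which is what you'd see if prediction_timestamp came back empty. A minimal guard along those lines - the function name is just a sketch, only the variable names mirror the script above:

```shell
#!/bin/sh
# Sketch: fail fast if the timestamp lookup returns nothing, so bq never
# sees a malformed gs://.../pred//prediction.results* URI.
check_timestamp() {
  if [ -z "$1" ]; then
    echo "no prediction folder found for $2" >&2
    return 1
  fi
}

# Inside the loop, right after computing prediction_timestamp:
#   check_timestamp "$prediction_timestamp" "$currentdate" || exit 1
check_timestamp "20171114T1200" "20171114" && echo "timestamp ok"
```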
On Tue, Nov 14, 2017 at 3:44 PM Alek Storm <[email protected]> wrote:
> That snippet shouldn’t be necessary; I assume the bq command was not the
> last one of your script. Add set -e to the top to exit on the first failure
> (this is generally what people expect from experience with scripting
> languages).
>
> Alek
>
>
> On Tue, Nov 14, 2017 at 9:41 AM, Andrew Maguire <[email protected]>
> wrote:
>
> > Thanks,
> >
> > I've added this into my script and it now triggers an error in Airflow.
> >
> > # capture status of last command and exit if error
> > status=$?
> > if [ $status -ne 0 ]; then
> > echo "Return code was not zero but $status"
> > exit $status
> > fi
> >
> > Now it triggers the output below.
> >
> > I'll try to figure out what's going on, as it would be a bit ugly to have
> > to add this everywhere (although for now it will work fine if I add it to
> > just the last step).
> >
> > [2017-11-14 15:33:57,545] {base_task_runner.py:95} INFO - Subtask:
> > [2017-11-14 15:33:57,545] {bash_operator.py:94} INFO -
> > ##################################
> > [2017-11-14 15:33:57,546] {base_task_runner.py:95} INFO - Subtask:
> > [2017-11-14 15:33:57,545] {bash_operator.py:94} INFO - LOAD PREDICTIONS
> > [2017-11-14 15:33:57,546] {base_task_runner.py:95} INFO - Subtask:
> > [2017-11-14 15:33:57,545] {bash_operator.py:94} INFO -
> > ##################################
> > [2017-11-14 15:34:00,147] {base_task_runner.py:95} INFO - Subtask:
> > [2017-11-14 15:34:00,146] {bash_operator.py:94} INFO - BigQuery error in
> > load operation: Error processing job
> > [2017-11-14 15:34:00,147] {base_task_runner.py:95} INFO - Subtask:
> > [2017-11-14 15:34:00,147] {bash_operator.py:94} INFO -
> > 'pmc-analytical-data-mart:bqjob_r57a28bce8e342beb_0000015fbb2a633b_1': Not
> > [2017-11-14 15:34:00,147] {base_task_runner.py:95} INFO - Subtask:
> > [2017-11-14 15:34:00,147] {bash_operator.py:94} INFO - found: Uris
> > gs://pmc-ml/clickmodel/vy/pred//prediction.results*
> > [2017-11-14 15:34:00,209] {base_task_runner.py:95} INFO - Subtask:
> > [2017-11-14 15:34:00,208] {bash_operator.py:94} INFO - Return code was not
> > zero but 2
> > [2017-11-14 15:34:00,209] {base_task_runner.py:95} INFO - Subtask:
> > [2017-11-14 15:34:00,209] {bash_operator.py:97} INFO - Command exited with
> > return code 2
> > [2017-11-14 15:34:00,210] {base_task_runner.py:95} INFO - Subtask:
> > [2017-11-14 15:34:00,209] {models.py:1433} ERROR - Bash command failed
> > [2017-11-14 15:34:00,210] {base_task_runner.py:95} INFO - Subtask:
> > Traceback (most recent call last):
> > [2017-11-14 15:34:00,210] {base_task_runner.py:95} INFO - Subtask: File
> > "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 1390, in
> > run
> > [2017-11-14 15:34:00,210] {base_task_runner.py:95} INFO - Subtask:
> > result = task_copy.execute(context=context)
> > [2017-11-14 15:34:00,210] {base_task_runner.py:95} INFO - Subtask: File
> > "/usr/local/lib/python2.7/dist-packages/airflow/operators/bash_operator.py",
> > line 100, in execute
> > [2017-11-14 15:34:00,211] {base_task_runner.py:95} INFO - Subtask:
> > raise AirflowException("Bash command failed")
> > [2017-11-14 15:34:00,211] {base_task_runner.py:95} INFO - Subtask:
> > AirflowException: Bash command failed
> > [2017-11-14 15:34:00,211] {base_task_runner.py:95} INFO - Subtask:
> > [2017-11-14 15:34:00,210] {models.py:1449} INFO - Marking task as
> > UP_FOR_RETRY
> > [2017-11-14 15:34:00,474] {base_task_runner.py:95} INFO - Subtask:
> > [2017-11-14 15:34:00,474] {email.py:109} INFO - Sent an alert email to ['
> > [email protected]']
> > [2017-11-14 15:34:02,181] {base_task_runner.py:95} INFO - Subtask:
> > [2017-11-14 15:34:02,181] {models.py:1478} ERROR - Bash command failed
> > [2017-11-14 15:34:02,182] {base_task_runner.py:95} INFO - Subtask:
> > Traceback (most recent call last):
> > [2017-11-14 15:34:02,183] {base_task_runner.py:95} INFO - Subtask: File
> > "/usr/local/bin/airflow", line 28, in <module>
> > [2017-11-14 15:34:02,183] {base_task_runner.py:95} INFO - Subtask:
> > args.func(args)
> > [2017-11-14 15:34:02,183] {base_task_runner.py:95} INFO - Subtask: File
> > "/usr/local/lib/python2.7/dist-packages/airflow/bin/cli.py", line 422, in
> > run
> > [2017-11-14 15:34:02,184] {base_task_runner.py:95} INFO - Subtask:
> > pool=args.pool,
> > [2017-11-14 15:34:02,184] {base_task_runner.py:95} INFO - Subtask: File
> > "/usr/local/lib/python2.7/dist-packages/airflow/utils/db.py", line 53, in
> > wrapper
> > [2017-11-14 15:34:02,185] {base_task_runner.py:95} INFO - Subtask:
> > result = func(*args, **kwargs)
> > [2017-11-14 15:34:02,185] {base_task_runner.py:95} INFO - Subtask: File
> > "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 1390, in
> > run
> > [2017-11-14 15:34:02,186] {base_task_runner.py:95} INFO - Subtask:
> > result = task_copy.execute(context=context)
> > [2017-11-14 15:34:02,187] {base_task_runner.py:95} INFO - Subtask: File
> > "/usr/local/lib/python2.7/dist-packages/airflow/operators/bash_operator.py",
> > line 100, in execute
> > [2017-11-14 15:34:02,187] {base_task_runner.py:95} INFO - Subtask:
> > raise AirflowException("Bash command failed")
> > [2017-11-14 15:34:02,188] {base_task_runner.py:95} INFO - Subtask:
> > airflow.exceptions.AirflowException: Bash command failed
> > [2017-11-14 15:34:06,994] {jobs.py:2125} INFO - Task exited with return
> > code 1
> >
> > On Tue, Nov 14, 2017 at 3:33 PM Bolke de Bruin <[email protected]>
> > wrote:
> >
> > > Hi Andrew,
> > >
> > > Your task is exiting with “code 0” which means success. I would verify
> > > that you are not swallowing the error/return code somewhere.
> > >
> > > Cheers
> > > Bolke
> > >
> > > Sent from my iPad
> > >
> > > > On 14 Nov 2017 at 16:25, Andrew Maguire <[email protected]> wrote:
> > > >
> > > > Oh, good to know.
> > > >
> > > > It was just an image of this log info:
> > > >
> > > > [2017-11-14 15:09:34,595] {base_task_runner.py:95} INFO - Subtask:
> > > > [2017-11-14 15:09:34,594] {bash_operator.py:94} INFO -
> > > > ##################################
> > > > [2017-11-14 15:09:34,595] {base_task_runner.py:95} INFO - Subtask:
> > > > [2017-11-14 15:09:34,594] {bash_operator.py:94} INFO - LOAD PREDICTION MAP
> > > > [2017-11-14 15:09:34,595] {base_task_runner.py:95} INFO - Subtask:
> > > > [2017-11-14 15:09:34,594] {bash_operator.py:94} INFO -
> > > > ##################################
> > > > *[2017-11-14 15:09:36,695] {base_task_runner.py:95} INFO - Subtask:
> > > > [2017-11-14 15:09:36,694] {bash_operator.py:94} INFO - BigQuery error in
> > > > load operation: Error processing job*
> > > > *[2017-11-14 15:09:36,695] {base_task_runner.py:95} INFO - Subtask:
> > > > [2017-11-14 15:09:36,695] {bash_operator.py:94} INFO -
> > > > 'pmc-analytical-data-mart:bqjob_r4352c5357a8f26c6_0000015fbb14107a_1': Not*
> > > > *[2017-11-14 15:09:36,695] {base_task_runner.py:95} INFO - Subtask:
> > > > [2017-11-14 15:09:36,695] {bash_operator.py:94} INFO - found: Uris
> > > > gs://pmc-ml/clickmodel/vy/pred//prediction_map**
> > > > *[2017-11-14 15:09:38,277] {base_task_runner.py:95} INFO - Subtask:
> > > > [2017-11-14 15:09:38,276] {bash_operator.py:94} INFO - BigQuery error in
> > > > show operation: Not found: Table*
> > > > *[2017-11-14 15:09:38,277] {base_task_runner.py:95} INFO - Subtask:
> > > > [2017-11-14 15:09:38,277] {bash_operator.py:94} INFO -
> > > > pmc-analytical-data-mart:variety.clickmodel_predictions_map_20171112*
> > > > [2017-11-14 15:09:38,332] {base_task_runner.py:95} INFO - Subtask:
> > > > [2017-11-14 15:09:38,331] {bash_operator.py:97} INFO - Command exited with
> > > > return code 0
> > > > [2017-11-14 15:09:40,783] {jobs.py:2125} INFO - Task exited with return
> > > > code 0
> > > >
> > > > So you can see the lines in bold are failed bq commands, but for some
> > > > reason (maybe how the bq CLI operates) Airflow still thinks the task
> > > > was successful.
> > > >
> > > > I think if I were to put something like the below into the bash script,
> > > > then that would be enough to trigger a failure in Airflow - thoughts?
> > > >
> > > > # capture status of last command and exit if error
> > > > status=$?
> > > > if [ $status -ne 0 ]; then
> > > > echo "Return code was not zero but $status"
> > > > exit $status
> > > > fi
> > > >
> > > >> On Tue, Nov 14, 2017 at 3:21 PM Alek Storm <[email protected]>
> > > >> wrote:
> > > >>
> > > >> Hi Andy,
> > > >>
> > > >> The list doesn't allow inline images to be posted - can you paste
> your
> > > >> script content as text?
> > > >>
> > > >> Alek
> > > >>
> > > >>> On Nov 14, 2017 9:16 AM, "Andrew Maguire" <[email protected]>
> > > >>> wrote:
> > > >>>
> > > >>> Hi,
> > > >>>
> > > >>> I have some bash operators that are failing, but Airflow is not
> > > >>> picking the failure up.
> > > >>>
> > > >>> Here is an example:
> > > >>>
> > > >>> [image: image.png]
> > > >>>
> > > >>> This is a bash script that runs some "bq" and "gcloud" CLI
> > > >>> commands.
> > > >>>
> > > >>> I've used $? to get the status of such failed CLI commands in the
> > > >>> past and then done something with it.
> > > >>>
> > > >>> I was just wondering - how could I use the $? from the failed bq
> > > >>> command to in turn pass an error to Airflow?
> > > >>>
> > > >>> Cheers,
> > > >>> Andy
> > > >>>
> > > >>
> > >
> >
>