Hi Damien / Cham,

Thank you for the explanation and the tip to query BQ when the preview is not yet populated. Very helpful!

Eila

On Tue, Aug 14, 2018 at 11:18 AM, Chamikara Jayalath <[email protected]> wrote:

> BQ sink (batch) writes data to GCS and loads data to BQ using load jobs.
>
> Regarding writing to the sink, you basically have to write a PCollection of
> dictionaries where each dictionary maps to a row in the BQ table.
>
> From a ParDo or a FlatMap you have to return an iterator of records. So
> you either have to return a list of records or use yield, where each record
> is a dictionary.
>
> See the following example that reads from and writes to BigQuery for more
> clarity on the syntax.
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/cookbook/bigquery_tornadoes.py
>
> Thanks,
> Cham
>
> On Mon, Aug 13, 2018 at 8:51 PM Damien Hawes <[email protected]> wrote:
>
>> Hi Eila,
>>
>> To my knowledge the BigQuerySink makes use of BigQuery's streaming insert
>> functionality. This means that if your data is successfully written to
>> BigQuery it will not be immediately previewable (as you already know), but
>> it should be immediately queryable. If you look at the table details, you
>> should see records in the streaming buffer.
>>
>> Kind Regards,
>>
>> Damien
>>
>> On Mon, 13 Aug 2018, 20:00 OrielResearch Eila Arich-Landkof <[email protected]> wrote:
>>
>>> [the previous email was sent too early by mistake]
>>> update:
>>>
>>> I tried the following options:
>>>
>>> 1. Return a dict from the DoFn. An error was fired:
>>>        newRowDictlist = newRowDict #[newRowDict]
>>>        return(newRowDictlist)
>>>    along with the following warning:
>>>
>>>    *Returning a dict from a ParDo or FlatMap is discouraged. Please use
>>>    list("...*
>>>
>>> 2. Return a list with the dict in it:
>>>        newRowDictlist = [newRowDict]
>>>        return(newRowDictlist)
>>>
>>>    No error was generated. I see the table but the data hasn't been
>>>    populated yet. That is the normal BQ delay as far as I know.
>>>
>>> Since I cannot see the full BQ result, could you please let me know
>>> if I am writing the data in the right format for BQ? (I had no issues
>>> writing it to other types of outputs.)
>>>
>>> Thanks for any help,
>>> Eila
>>>
>>> On Mon, Aug 13, 2018 at 1:55 PM, OrielResearch Eila Arich-Landkof <[email protected]> wrote:
>>>
>>>> update:
>>>>
>>>> I tried the following options:
>>>>
>>>> 1. Return a dict from the DoFn. An error was fired:
>>>>        newRowDictlist = newRowDict #[newRowDict]
>>>>        return(newRowDictlist)
>>>>
>>>> 2. Return a list with the dict in it:
>>>>        newRowDictlist = [newRowDict]
>>>>        return(newRowDictlist)
>>>>
>>>> On Mon, Aug 13, 2018 at 12:51 PM, OrielResearch Eila Arich-Landkof <[email protected]> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I am generating data to be written to a new BQ table with a specific
>>>>> schema. The data is generated in a DoFn.
>>>>>
>>>>> My question is: what is the recommended format for the data that I should
>>>>> return from the DoFn (getValuesStrFn below)? Is it a dictionary? A list?
>>>>> Something else?
>>>>> I tried list and str and it fired an error.
>>>>>
>>>>> The pipeline is:
>>>>> p = beam.Pipeline(options=options)
>>>>> (p | 'Read From Data Frame' >> beam.Create(cellLinesTable.values.tolist())
>>>>>    | 'call Get Value Str' >> beam.ParDo(getValuesStrFn(colList))
>>>>>    | 'write to BQ' >> beam.io.Write(beam.io.BigQuerySink(dataset='dataset_cell_lines',table='cell_lines_table',schema=schema_bq)))
>>>>>
>>>>> Thanks,
>>>>> --
>>>>> Eila
>>>>> www.orielresearch.org
>>>>> https://www.meetup.com/Deep-Learning-In-Production/

--
Eila
www.orielresearch.org
https://www.meetup.com/Deep-Learning-In-Production/
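
For anyone finding this thread later, here is a minimal sketch of the record format Cham describes, written without the Beam dependency so it runs anywhere. The class only mimics the `process` method of a DoFn; in a real pipeline it would subclass `beam.DoFn` and be passed to `beam.ParDo` as in the original question. The column names and sample rows are hypothetical, since the thread never shows `colList` or `schema_bq`. The last part illustrates what is most likely behind the "Returning a dict ... is discouraged" warning: the return value is consumed as an iterable of records, and iterating a bare dict gives only its keys.

```python
# Hypothetical column names and rows; the thread never shows colList or schema_bq.
COL_LIST = ["cell_line", "tissue", "score"]
SAMPLE_ROWS = [["A549", "lung", 0.5], ["HeLa", "cervix", 0.9]]


class GetValuesStrFn(object):
    """Stand-in for the thread's getValuesStrFn.

    In a real pipeline this would subclass beam.DoFn; only process() matters here.
    """

    def __init__(self, col_list):
        self.col_list = col_list

    def process(self, element):
        # One dictionary per BigQuery row, keyed by column name.
        new_row_dict = dict(zip(self.col_list, element))
        # Yield (or return [new_row_dict]) so the caller receives an
        # iterable of records rather than a bare dict.
        yield new_row_dict


fn = GetValuesStrFn(COL_LIST)
# Mimic how the output is consumed: iterate over whatever process() returns.
records = [record for element in SAMPLE_ROWS for record in fn.process(element)]

# Iterating a bare dict yields only its keys, so returning the dict itself
# would not hand a row to the next step.
keys_only = sorted({"cell_line": "A549", "tissue": "lung"})

print(records)
print(keys_only)
```

With Beam installed, the same `process` body should carry over unchanged to a `beam.DoFn` subclass feeding the 'write to BQ' step, provided the dictionary keys match the field names in the table schema.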
