Re: Generating data to beam.io.Write(beam.io.BigQuerySink(

2018-08-13 Thread Damien Hawes
Hi Eila,

To my knowledge the BigQuerySink makes use of BigQuery's streaming insert
functionality. This means that if your data is successfully written to
BigQuery it will not be immediately previewable (as you already know), but
it should be immediately queryable. If you look at the table details, you
should see records in the streaming buffer.
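
On the format question itself: as far as I understand it, the runner iterates over whatever the DoFn's process method returns, and each item becomes one output element. Iterating over a bare dict gives its keys, which would explain the warning; wrapping the dict in a list (or yielding it) emits the whole row. Below is a minimal plain-Python sketch of that behaviour. It does not import Beam, and the helper names (to_bq_row, process, emitted_elements) and the column names are hypothetical, used only for illustration.

```python
from typing import Any, Dict, Iterable, Iterator, List


def to_bq_row(values: List[Any], col_list: List[str]) -> Dict[str, Any]:
    """Pair each value with its column name to form one row dict."""
    return dict(zip(col_list, values))


def process(element: List[Any], col_list: List[str]) -> Iterator[Dict[str, Any]]:
    """Stand-in for a DoFn's process method: yield one dict per output row."""
    yield to_bq_row(element, col_list)


def emitted_elements(process_result: Iterable[Any]) -> List[Any]:
    """Mimic how the result of process is consumed: by iterating over it."""
    return list(process_result)


col_list = ["cell_line", "tissue"]  # hypothetical column names
element = ["HeLa", "cervix"]        # hypothetical input row

row = to_bq_row(element, col_list)

# Returning the bare dict: iteration yields the keys, not the row.
as_bare_dict = emitted_elements(row)

# Returning a list holding the dict: iteration yields the row itself.
as_list = emitted_elements([row])

# Yielding the dict from a generator behaves the same as the list.
as_generator = emitted_elements(process(element, col_list))

print(as_bare_dict)
print(as_list)
print(as_generator)
```

So your option 2 (a list containing one dict per row, with keys matching the schema field names) looks like the right shape to me.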

Kind Regards,

Damien

On Mon, 13 Aug 2018, 20:00 OrielResearch Eila Arich-Landkof, <
e...@orielresearch.org> wrote:

> [the previous email was sent too early by mistake]
> update:
>
> I tried the following options:
>
> 1. return a dict from the DoFn; an error was fired:
> newRowDictlist = newRowDict  #[newRowDict]
> return(newRowDictlist)
> and the following warning:
>
> *Returning a dict from a ParDo or FlatMap is discouraged. Please use 
> list("...*
>
>
> 2. return list with dict in it
> newRowDictlist = [newRowDict]
> return(newRowDictlist)
>
> No error was generated. I can see the table, but the data has not been
> populated yet. As far as I know, this is the normal BQ delay.
>
> Since I cannot see the full BQ result, could you please let me know whether
> I am writing the data to BQ in the right format? (I had no issues writing it
> to other types of output.)
>
> Thanks for any help,
> Eila
>
> On Mon, Aug 13, 2018 at 1:55 PM, OrielResearch Eila Arich-Landkof <
> e...@orielresearch.org> wrote:
>
>> update:
>>
>> I tried the following options:
>>
>> 1. return a dict from the DoFn; an error was fired:
>> newRowDictlist = newRowDict  #[newRowDict]
>> return(newRowDictlist)
>>
>> 2. return list with dict in it
>> newRowDictlist = [newRowDict]
>> return(newRowDictlist)
>>
>>
>>
>> On Mon, Aug 13, 2018 at 12:51 PM, OrielResearch Eila Arich-Landkof <
>> e...@orielresearch.org> wrote:
>>
>>> Hello,
>>>
>>> I am generating data to be written to a new BQ table with a specific
>>> schema. The data is generated in a DoFn.
>>>
>>> My question is: what is the recommended format for the data that I should
>>> return from the DoFn (getValuesStrFn below)? Is it a dictionary? A list?
>>> Something else?
>>> I tried list and str, and it fired an error.
>>>
>>>
>>> The pipeline is:
>>> p = beam.Pipeline(options=options)
>>> (p | 'Read From Data Frame' >> beam.Create(cellLinesTable.values.tolist())
>>>    | 'call Get Value Str' >> beam.ParDo(getValuesStrFn(colList))
>>>    | 'write to BQ' >> beam.io.Write(beam.io.BigQuerySink(
>>>          dataset='dataset_cell_lines', table='cell_lines_table',
>>>          schema=schema_bq)))
>>> Thanks,
>>> --
>>> Eila
>>> www.orielresearch.org
>>> https://www.meetup.com/Deep-Learning-In-Production/
>>> 
>>>
>>>
>>>
>>
>>
>> --
>> Eila
>> www.orielresearch.org
>> https://www.meetup.com/Deep-Learning-In-Production/
>> 
>>
>>
>>
>
>
> --
> Eila
> www.orielresearch.org
> https://www.meetup.com/Deep-Learning-In-Production/
> 
>
>
>


Re: Beam Slack Channel Invitation Request

2017-11-06 Thread Damien Hawes
I too would like an invite to the slack channel.

On 5 Nov 2017 05:11, "Mingmin Xu" wrote:

> sent, welcome!
>
> On Sat, Nov 4, 2017 at 4:42 PM, Tristan Shephard <
> tristanasheph...@gmail.com> wrote:
>
>> Hello,
>>
>> Can someone please add me to the Beam slack channel?
>>
>> Thanks in advance,
>> Tristan
>>
>
>
>
> --
> 
> Mingmin
>