Re: Beam/Python to BigTable

Pierre Oberholzer Wed, 13 Oct 2021 09:15:27 -0700

Hi Brian,

Yes, I do call run() at the end, and I can see the Dataflow job completing in
the GUI (link <https://console.cloud.google.com/dataflow/jobs>). Thanks for
asking ;)
Is there maybe a commit() missing, as referred to here
<https://googleapis.dev/python/bigtable/latest/row.html#google.cloud.bigtable.row.DirectRow>,
and if so, where should it go in the pipeline?


On Wed, 13 Oct 2021 at 18:08, Brian Hulette <[email protected]> wrote:

> Hey Pierre,
> Sorry for the silly question but I have to ask - are you actually running
> the pipeline? In your initial snippet you created the pipeline in a context
> (with beam.Pipeline() as p:), which will run the pipeline when you exit.
> But your latest snippet doesn't show the context, or a call to p.run(). Are
> they missing, or just not shown?
>
> Otherwise I don't see anything obviously wrong with your code. You might
> try contacting GCP support, since you're working with two GCP products.
>
> Brian
>
> On Tue, Oct 12, 2021 at 10:22 PM Pierre Oberholzer <
> [email protected]> wrote:
>
>> Dear Community,
>>
>> Glad to get your support here !
>> Issue: empty BigTable when using the Python/Beam connector.
>>
>> Thanks !
>>
>> On Sun, 10 Oct 2021 at 14:34, Pierre Oberholzer <
>> [email protected]> wrote:
>>
>>> Thanks Israel, this helped. No error anymore, but the table remains
>>> empty with this code
>>> <https://stackoverflow.com/questions/63035772/streaming-pipeline-in-dataflow-to-bigtable-python>
>>> .
>>>
>>> *Code*
>>>
>>> import datetime
>>>
>>> import apache_beam as beam
>>> from apache_beam.io.gcp.bigtableio import WriteToBigTable
>>> from google.cloud.bigtable import row
>>>
>>> class CreateRowFn(beam.DoFn):
>>>
>>>     def process(self, key):
>>>         # Build one Bigtable row mutation per input key.
>>>         direct_row = row.DirectRow(row_key=key)
>>>         direct_row.set_cell(
>>>             "stats_summary",
>>>             b"os_build",
>>>             b"android",
>>>             datetime.datetime.now())
>>>         return [direct_row]
>>>
>>> _ = (p
>>>      | beam.Create(["phone#4c410523#20190501", "phone#4c410523#20190502"])
>>>      | beam.ParDo(CreateRowFn())
>>>      | WriteToBigTable(project_id=pipeline_options.bigtable_project,
>>>                        instance_id=pipeline_options.bigtable_instance,
>>>                        table_id=pipeline_options.bigtable_table))
>>>
>>> *Issue*
>>>
>>> Empty table
>>> (checked with happybase and check = [(key,row) for key, row in
>>> table.scan()])
>>>
>>> Thanks !
>>>
>>> On Sat, 9 Oct 2021 at 21:37, Israel Herraiz <[email protected]> wrote:
>>>
>>>> You have to write DirectRows to Bigtable, not strings. For more info,
>>>> please see
>>>> https://googleapis.dev/python/bigtable/latest/row.html#google.cloud.bigtable.row.DirectRow
>>>>
>>>
>>>
>>> --
>>> Pierre
>>>
>>
>>
>> --
>> Pierre
>>
>

-- 
Pierre
