Hi,

I found the issue and can now write from Beam/Python to BigTable.
You just need to create the column family first, before writing (here with
cbt):

`cbt -instance test-instance createfamily test-table cf1`

What is confusing is that no error is thrown when the column family does
not exist. There seems to be a similar issue with cbt [1]. It would be great
to fix this.
Let me know if I should raise another bug.
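
For anyone who would rather do this check from Python than with cbt, here is
a minimal sketch, assuming the google-cloud-bigtable admin client. The helper
names (ensure_column_family, open_admin_table) are hypothetical, not part of
Beam or of the Bigtable library; the project, instance and table IDs are
whatever you pass in.

```python
def ensure_column_family(table, family_id):
    """Create `family_id` on `table` if it is missing.

    `table` is expected to behave like google.cloud.bigtable.table.Table.
    Returns True if the family was created, False if it already existed.
    """
    # list_column_families() returns a dict keyed by column family ID.
    existing = table.list_column_families()
    if family_id in existing:
        return False
    # column_family() only builds a local object; create() sends the request.
    table.column_family(family_id).create()
    return True


def open_admin_table(project_id, instance_id, table_id):
    """Hypothetical helper: open a table handle with admin rights."""
    # Imported lazily so ensure_column_family can be used and tested
    # without the library installed.
    from google.cloud import bigtable

    # admin=True is needed for schema operations such as creating families.
    client = bigtable.Client(project=project_id, admin=True)
    return client.instance(instance_id).table(table_id)
```

Calling ensure_column_family(open_admin_table(project_id, "test-instance",
"test-table"), "cf1") before launching the pipeline should have the same
effect as the cbt command above. I have not checked how it behaves with
restricted IAM roles.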

Thanks !

[1] https://issuetracker.google.com/issues/186053077

Pierre

On Wed, Oct 20, 2021 at 04:18, Pierre Oberholzer <[email protected]>
wrote:

> Hi Brian,
>
> Thanks again for your reply last week.
> I’ve raised a ticket here:
>
> https://issuetracker.google.com/issues/202977204
>
> Is that what you mean by GCP support?
> Any idea how responsive it is?
> Is there any alternative I could use in the meantime (Java I/O in Python)?
>
> Thanks for your support !
>
> Best regards, Pierre
>
> ---------- Forwarded message ---------
> From: Pierre Oberholzer <[email protected]>
> Date: Sat, Oct 16, 2021 at 08:47
> Subject: Re: Beam/Python to BigTable
> To: <[email protected]>, <[email protected]>
>
>
> Hi Everyone,
>
> I have raised a bug on GCP for this.
> But... am I the only one trying to write from Beam to BigTable in Python?
> Is that a warning sign showing that this combo is not mature ?
> Is there any attempt using the Java connector in Python ?
>
> Glad to hear about your experience and advice - and of course about other
> ideas to solve this "bug".
>
> Thanks !
>
> On Wed, Oct 13, 2021 at 18:14, Pierre Oberholzer <
> [email protected]> wrote:
>
>> Hi Brian,
>>
>> Yes, I do execute a run() at the end, and I see the Dataflow job
>> completing in the GUI (link <https://console.cloud.google.com/dataflow/jobs>).
>> Thanks for asking ;)
>> Is there maybe a commit() missing, as referred to here
>> <https://googleapis.dev/python/bigtable/latest/row.html#google.cloud.bigtable.row.DirectRow>,
>> and if so, where should it go in the pipeline?
>>
>> On Wed, Oct 13, 2021 at 18:08, Brian Hulette <[email protected]>
>> wrote:
>>
>>> Hey Pierre,
>>> Sorry for the silly question but I have to ask - are you actually
>>> running the pipeline? In your initial snippet you created the pipeline in a
>>> context (with beam.Pipeline() as p:), which will run the pipeline when you
>>> exit. But your latest snippet doesn't show the context, or a call to
>>> p.run(). Are they missing, or just not shown?
>>>
>>> Otherwise I don't see anything obviously wrong with your code. You might
>>> try contacting GCP support, since you're working with two GCP products.
>>>
>>> Brian
>>>
>>> On Tue, Oct 12, 2021 at 10:22 PM Pierre Oberholzer <
>>> [email protected]> wrote:
>>>
>>>> Dear Community,
>>>>
>>>> Glad to get your support here !
>>>> Issue: empty BigTable when using the Python/Beam connector.
>>>>
>>>> Thanks !
>>>>
>>>> On Sun, Oct 10, 2021 at 14:34, Pierre Oberholzer <
>>>> [email protected]> wrote:
>>>>
>>>>> Thanks Israel, this helped. No error anymore, but the table remains
>>>>> empty with this code
>>>>> <https://stackoverflow.com/questions/63035772/streaming-pipeline-in-dataflow-to-bigtable-python>
>>>>> .
>>>>>
>>>>> *Code*
>>>>>
>>>>> import datetime
>>>>>
>>>>> import apache_beam as beam
>>>>> from apache_beam.io.gcp.bigtableio import WriteToBigTable
>>>>> from google.cloud.bigtable import row
>>>>>
>>>>> class CreateRowFn(beam.DoFn):
>>>>>
>>>>>     def process(self, key):
>>>>>         # Build one DirectRow per input key, with a single cell.
>>>>>         direct_row = row.DirectRow(row_key=key)
>>>>>         direct_row.set_cell(
>>>>>             "stats_summary",
>>>>>             b"os_build",
>>>>>             b"android",
>>>>>             datetime.datetime.now())
>>>>>         return [direct_row]
>>>>>
>>>>> _ = (p
>>>>>      | beam.Create(["phone#4c410523#20190501", "phone#4c410523#20190502"])
>>>>>      | beam.ParDo(CreateRowFn())
>>>>>      | WriteToBigTable(
>>>>>          project_id=pipeline_options.bigtable_project,
>>>>>          instance_id=pipeline_options.bigtable_instance,
>>>>>          table_id=pipeline_options.bigtable_table))
>>>>>
>>>>> *Issue*
>>>>>
>>>>> Empty table
>>>>> (checked with happybase and check = [(key,row) for key, row in
>>>>> table.scan()])
>>>>>
>>>>> Thanks !
>>>>>
>>>>> On Sat, Oct 9, 2021 at 21:37, Israel Herraiz <[email protected]> wrote:
>>>>>
>>>>>> You have to write DirectRows to Bigtable, not strings. For more info,
>>>>>> please see
>>>>>> https://googleapis.dev/python/bigtable/latest/row.html#google.cloud.bigtable.row.DirectRow
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Pierre
>>>>>
>>>>
>>>>
>>>> --
>>>> Pierre
>>>>
>>>
>>
>> --
>> Pierre
>>
>
>
> --
> Pierre
>


-- 
Pierre
