Hi Bart,

Thanks for checking this and making me try again! It does work now if I disable 
the hop – don't know what else I did that caused this to fail.

cheers

Fabian

> On 19.07.2022 at 15:02, Bart Maertens <[email protected]> wrote:
> 
> Hi Fabian, 
> 
> I just tried to reproduce. I can run a pipeline with a disabled hop to Beam 
> BigQuery output. 
> If I don't disable or delete that Hop, I get a situation that is very similar 
> to what you describe. 
> 
> The line below indicates that Hop tries to initialize the BigQuery Output 
> transform: 
> 2022/07/19 14:13:42 - General - Handled transform (BQ OUTPUT) : Beam BigQuery 
> Output, gets data from Select values
> 
> Can you confirm you have this issue, even when the hop before your Beam 
> BigQuery Output transform is disabled? 
> 
> Regards, 
> Bart 
> 
> On Tue, Jul 19, 2022 at 2:18 PM Fabian Peters <[email protected]> wrote:
> Hi Bart,
> 
> I didn't try this before, because what I'm interested in is seeing the 
> intermediate steps' output.
> 
> However, with the Beam-Direct runner, the pipeline just hangs:
> 
> 2022/07/19 14:13:34 - Hop - Pipeline opened.
> 2022/07/19 14:13:34 - Hop - Launching pipeline [sites]...
> 2022/07/19 14:13:34 - Hop - Started the pipeline execution.
> 2022/07/19 14:13:41 - General - Created Apache Beam pipeline with name 'sites'
> 2022/07/19 14:13:41 - General - Handled generic transform (TRANSFORM) : Get 
> file names, gets data from 0 previous transform(s), targets=0, infos=0
> 2022/07/19 14:13:41 - General - Handled generic transform (TRANSFORM) : Last 
> modified, gets data from 1 previous transform(s), targets=0, infos=0
> 2022/07/19 14:13:41 - General - Handled generic transform (TRANSFORM) : Avro 
> File Input, gets data from 1 previous transform(s), targets=0, infos=0
> 2022/07/19 14:13:41 - General - Handled generic transform (TRANSFORM) : Avro 
> to site, gets data from 1 previous transform(s), targets=0, infos=0
> 2022/07/19 14:13:41 - General - Handled generic transform (TRANSFORM) : 
> Filter rows, gets data from 1 previous transform(s), targets=1, infos=0
> 2022/07/19 14:13:41 - General - Transform Select values reading from previous 
> transform targeting this one using : Filter rows - TARGET - Select values
> 2022/07/19 14:13:41 - General - Handled generic transform (TRANSFORM) : 
> Select values, gets data from 1 previous transform(s), targets=0, infos=0
> 2022/07/19 14:13:42 - General - Handled transform (BQ OUTPUT) : Beam BigQuery 
> Output, gets data from Select values
> 2022/07/19 14:13:42 - sites - Executing this pipeline using the Beam Pipeline 
> Engine with run configuration 'Beam-Direct'
> … nothing more happens
> 
> If I remove the "Beam BigQuery Output" transform:
> 
> 2022/07/19 14:14:32 - Hop - Pipeline opened.
> 2022/07/19 14:14:32 - Hop - Launching pipeline [sites]...
> 2022/07/19 14:14:32 - Hop - Started the pipeline execution.
> 2022/07/19 14:14:40 - General - Created Apache Beam pipeline with name 'sites'
> 2022/07/19 14:14:40 - General - Handled generic transform (TRANSFORM) : Get 
> file names, gets data from 0 previous transform(s), targets=0, infos=0
> 2022/07/19 14:14:40 - General - Handled generic transform (TRANSFORM) : Last 
> modified, gets data from 1 previous transform(s), targets=0, infos=0
> 2022/07/19 14:14:40 - General - Handled generic transform (TRANSFORM) : Avro 
> File Input, gets data from 1 previous transform(s), targets=0, infos=0
> 2022/07/19 14:14:40 - General - Handled generic transform (TRANSFORM) : Avro 
> to site, gets data from 1 previous transform(s), targets=0, infos=0
> 2022/07/19 14:14:40 - General - Handled generic transform (TRANSFORM) : 
> Filter rows, gets data from 1 previous transform(s), targets=1, infos=0
> 2022/07/19 14:14:40 - General - Transform Select values reading from previous 
> transform targeting this one using : Filter rows - TARGET - Select values
> 2022/07/19 14:14:40 - General - Handled generic transform (TRANSFORM) : 
> Select values, gets data from 1 previous transform(s), targets=0, infos=0
> 2022/07/19 14:14:40 - sites - Executing this pipeline using the Beam Pipeline 
> Engine with run configuration 'Beam-Direct'
> 2022/07/19 14:14:44 - sites - Beam pipeline execution has finished.
> 
> The pipeline ran successfully via the DataFlow runner.
> 
> Fabian
> 
> 
>> On 19.07.2022 at 13:11, Bart Maertens <[email protected]> wrote:
>> 
>> Hi Fabian,
>> 
>> Do you have this issue with the Beam Direct run configuration as well? 
>> The Beam BigQuery Output transform is Beam only, so this won't work with the 
>> native (local or remote) run configuration. 
>> 
>> If the issue exists with the direct runner, can you share any errors you 
>> get? 
>> 
>> Regards,
>> Bart 
>> 
>> On Tue, Jul 19, 2022 at 12:12 PM Fabian Peters <[email protected]> wrote:
>> Hi all!
>> 
>> I'm developing a number of pipelines that write data to BigQuery. This works 
>> fine. However, during development I find I have to remove the "Beam 
>> BigQuery Output" transform entirely to be able to run the pipeline locally; 
>> disabling or deleting the hop to it did not help. Is there a way to keep the 
>> transform around while debugging the pipeline?
>> 
>> cheers
>> 
>> Fabian
> 
