Hi Bart,

Thanks for checking this and making me try again! It does work now if I disable the hop. I don't know what else I did that caused this to fail.

cheers
Fabian

> On 19.07.2022 at 15:02, Bart Maertens <[email protected]> wrote:
>
> Hi Fabian,
>
> I just tried to reproduce. I can run a pipeline with a disabled hop to Beam
> BigQuery Output.
> If I don't disable or delete that hop, I get a situation that is very similar
> to what you describe.
>
> The line below indicates that Hop tries to initialize the BigQuery Output
> transform:
> 2022/07/19 14:13:42 - General - Handled transform (BQ OUTPUT) : Beam BigQuery Output, gets data from Select values
>
> Can you confirm you have this issue, even when the hop before your Beam
> BigQuery Output transform is disabled?
>
> Regards,
> Bart
>
> On Tue, Jul 19, 2022 at 2:18 PM Fabian Peters <[email protected]> wrote:
> Hi Bart,
>
> I didn't try this before, because what I'm interested in is seeing the
> intermediate steps' output.
>
> However, with the Beam-Direct runner, the pipeline just hangs:
>
> 2022/07/19 14:13:34 - Hop - Pipeline opened.
> 2022/07/19 14:13:34 - Hop - Launching pipeline [sites]...
> 2022/07/19 14:13:34 - Hop - Started the pipeline execution.
> 2022/07/19 14:13:41 - General - Created Apache Beam pipeline with name 'sites'
> 2022/07/19 14:13:41 - General - Handled generic transform (TRANSFORM) : Get file names, gets data from 0 previous transform(s), targets=0, infos=0
> 2022/07/19 14:13:41 - General - Handled generic transform (TRANSFORM) : Last modified, gets data from 1 previous transform(s), targets=0, infos=0
> 2022/07/19 14:13:41 - General - Handled generic transform (TRANSFORM) : Avro File Input, gets data from 1 previous transform(s), targets=0, infos=0
> 2022/07/19 14:13:41 - General - Handled generic transform (TRANSFORM) : Avro to site, gets data from 1 previous transform(s), targets=0, infos=0
> 2022/07/19 14:13:41 - General - Handled generic transform (TRANSFORM) : Filter rows, gets data from 1 previous transform(s), targets=1, infos=0
> 2022/07/19 14:13:41 - General - Transform Select values reading from previous transform targeting this one using : Filter rows - TARGET - Select values
> 2022/07/19 14:13:41 - General - Handled generic transform (TRANSFORM) : Select values, gets data from 1 previous transform(s), targets=0, infos=0
> 2022/07/19 14:13:42 - General - Handled transform (BQ OUTPUT) : Beam BigQuery Output, gets data from Select values
> 2022/07/19 14:13:42 - sites - Executing this pipeline using the Beam Pipeline Engine with run configuration 'Beam-Direct'
> … nothing more happens
>
> If I remove the "Beam BigQuery Output" transform:
>
> 2022/07/19 14:14:32 - Hop - Pipeline opened.
> 2022/07/19 14:14:32 - Hop - Launching pipeline [sites]...
> 2022/07/19 14:14:32 - Hop - Started the pipeline execution.
> 2022/07/19 14:14:40 - General - Created Apache Beam pipeline with name 'sites'
> 2022/07/19 14:14:40 - General - Handled generic transform (TRANSFORM) : Get file names, gets data from 0 previous transform(s), targets=0, infos=0
> 2022/07/19 14:14:40 - General - Handled generic transform (TRANSFORM) : Last modified, gets data from 1 previous transform(s), targets=0, infos=0
> 2022/07/19 14:14:40 - General - Handled generic transform (TRANSFORM) : Avro File Input, gets data from 1 previous transform(s), targets=0, infos=0
> 2022/07/19 14:14:40 - General - Handled generic transform (TRANSFORM) : Avro to site, gets data from 1 previous transform(s), targets=0, infos=0
> 2022/07/19 14:14:40 - General - Handled generic transform (TRANSFORM) : Filter rows, gets data from 1 previous transform(s), targets=1, infos=0
> 2022/07/19 14:14:40 - General - Transform Select values reading from previous transform targeting this one using : Filter rows - TARGET - Select values
> 2022/07/19 14:14:40 - General - Handled generic transform (TRANSFORM) : Select values, gets data from 1 previous transform(s), targets=0, infos=0
> 2022/07/19 14:14:40 - sites - Executing this pipeline using the Beam Pipeline Engine with run configuration 'Beam-Direct'
> 2022/07/19 14:14:44 - sites - Beam pipeline execution has finished.
>
> The pipeline ran successfully via the Dataflow runner.
>
> Fabian
>
>
>> On 19.07.2022 at 13:11, Bart Maertens <[email protected]> wrote:
>>
>> Hi Fabian,
>>
>> Do you have this issue with the Beam Direct run configuration as well?
>> The Beam BigQuery Output transform is Beam only, so this won't work with the
>> native (local or remote) run configuration.
>>
>> If the issue exists with the direct runner, can you share any errors you
>> get?
>>
>> Regards,
>> Bart
>>
>> On Tue, Jul 19, 2022 at 12:12 PM Fabian Peters <[email protected]> wrote:
>> Hi all!
>>
>> I'm developing a number of pipelines that write data to BigQuery. This works
>> fine. Alas, during development I find I have to entirely remove the "Beam
>> BigQuery Output" transform to be able to run the pipeline locally; disabling
>> or deleting the hop to it did not help. Is there a way to keep the transform
>> around while debugging the pipeline?
>>
>> cheers
>>
>> Fabian
>
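
Bart's check in this thread relies on the log line "Handled transform (BQ OUTPUT)" appearing when Hop wires the Beam BigQuery Output transform into the Beam pipeline. As a rough sketch, assuming only the log format quoted in this thread, a saved log could be scanned for that line as shown below. The names `handled_transforms` and `bigquery_output_was_built` are made up for illustration and are not part of Apache Hop.

```python
import re

# Matches the "Handled ..." lines quoted in this thread, e.g.
#   Handled transform (BQ OUTPUT) : Beam BigQuery Output, gets data from Select values
# The format is inferred from the thread only; other Hop versions may log differently.
HANDLED = re.compile(
    r"Handled (?:generic )?transform \(([^)]*)\) : (.+?), gets data from"
)


def handled_transforms(log_text):
    """Return (kind, name) pairs for every transform the log reports as handled."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [(m.group(1), m.group(2)) for m in HANDLED.finditer(flat)]


def bigquery_output_was_built(log_text):
    """True if the log shows a BQ OUTPUT transform being added to the pipeline."""
    return any(kind == "BQ OUTPUT" for kind, _ in handled_transforms(log_text))


# Two entries copied from the hanging run above, wrapped as in the mail.
sample = """2022/07/19 14:13:41 - General - Handled generic transform (TRANSFORM) :
Select values, gets data from 1 previous transform(s), targets=0, infos=0
2022/07/19 14:13:42 - General - Handled transform (BQ OUTPUT) : Beam BigQuery
Output, gets data from Select values"""

if __name__ == "__main__":
    print(handled_transforms(sample))
    print(bigquery_output_was_built(sample))
```

If the hop into the output transform is really disabled, the BQ OUTPUT line should be absent from the log, which matches what Bart saw when reproducing.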
