By "cannot run normally" do you mean you get an exception? We recently had
a bug on master in which streaming pipelines containing `ParDo` with
multiple outputs ran into `NullPointerException`. This was fixed here:
https://issues.apache.org/jira/browse/BEAM-2029
Is this what you're facing? If so
Hi, I have a problem with additional outputs on SparkRunner.
Here is my code. When I use DirectRunner, everything runs OK, but if I replace
DirectRunner with SparkRunner, the code can't run normally.
public class UnifiedDataExtraction {
private static TupleTag rawDataTag = new
Hi Dan
Thank you for your prompt reply.
Regards,
Prabeesh K.
On 3 May 2017 at 19:23, Dan Halperin wrote:
> Hi Prabeesh,
>
> The underlying Beam primitive you use for Join is CoGroupByKey – this
> takes N different collections KV<K, V1>, KV<K, V2>, ..., KV<K, VN> and
>
Hi Prabeesh,
The underlying Beam primitive you use for Join is CoGroupByKey – this takes
N different collections KV<K, V1>, KV<K, V2>, ..., KV<K, VN> and produces
one collection KV<K, CoGbkResult>. This
is a compressed representation of a Join result, in that you can
Hi Dan,
Sorry for the late response.
I agree with you on the use cases that you mentioned.
Please advise, and share any sample code you have for joining two data
sets in Beam that share some common keys.
Regards,
Prabeesh K.
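
For the join question above, here is a minimal plain-Java sketch of the
grouping that CoGroupByKey was described as doing earlier in this thread. It
does not use the Beam API at all: the class CoGroupSketch, the Grouped holder
and the methods coGroup and innerJoin are hypothetical names made up for this
illustration, and String keys and values are an assumption. It only shows how
two keyed inputs can be collapsed into one entry per key and then expanded
into inner-join rows.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CoGroupSketch {

    /** Values from both inputs that share one key (a stand-in for a grouped join result). */
    static final class Grouped {
        final List<String> left = new ArrayList<>();
        final List<String> right = new ArrayList<>();
    }

    /** Groups two keyed inputs by key, keeping the values of each input separate. */
    static Map<String, Grouped> coGroup(List<Map.Entry<String, String>> left,
                                        List<Map.Entry<String, String>> right) {
        Map<String, Grouped> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : left) {
            result.computeIfAbsent(e.getKey(), k -> new Grouped()).left.add(e.getValue());
        }
        for (Map.Entry<String, String> e : right) {
            result.computeIfAbsent(e.getKey(), k -> new Grouped()).right.add(e.getValue());
        }
        return result;
    }

    /** Expands the grouped form into inner-join rows: one row per (left, right) pair per key. */
    static List<String> innerJoin(Map<String, Grouped> grouped) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, Grouped> e : grouped.entrySet()) {
            for (String l : e.getValue().left) {
                for (String r : e.getValue().right) {
                    rows.add(e.getKey() + "," + l + "," + r);
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample data; only "amy" appears in both inputs.
        List<Map.Entry<String, String>> emails = List.of(
            Map.entry("amy", "amy@example.com"),
            Map.entry("bob", "bob@example.com"));
        List<Map.Entry<String, String>> phones = List.of(
            Map.entry("amy", "111-222"),
            Map.entry("carl", "333-444"));
        System.out.println(innerJoin(coGroup(emails, phones)));
    }
}
```

Keys that appear in only one input still get an entry in the grouped map, with
the other side left empty, so an outer join could be produced from the same
grouped form by emitting a row for those keys as well.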
On 6 February 2017 at 10:38, Dan Halperin
Hi,
How can we read a BigQuery table that is backed by a Google Sheet?
I am getting the following error.
"error": {
"errors": [
{
"domain": "global",
"reason": "accessDenied",
"message": "Access Denied: BigQuery BigQuery: Permission denied while
globbing file pattern.",
Thanks for your input and sorry for the late reply.
Lukasz, you may be right that running the reprocessing as a batch job will
be better and faster. I'm still experimenting with approach 3 where I
publish all messages and then start the job to let the watermark progress
through the data. It seems