I am using the BigQueryIO sink, so the example is not the same. The
example covers bad data on the read side; my problems are when writing. There
can be multiple errors when writing to BigQuery, and if a write fails there
is no way to catch the error, so the whole pipeline fails.
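For the streaming-insert path, recent Beam SDK versions expose failed rows on the returned WriteResult, so something like the following sketch (API details vary by SDK version; imports and the table name are illustrative) can dead-letter bad rows instead of failing the pipeline:

```java
WriteResult result = rows.apply(
    BigQueryIO.writeTableRows()
        .to("my-project:dataset.table")
        .withMethod(BigQueryIO.Write.Method.STREAMING_INSERTS)
        // Retry only transient errors; permanently failing rows surface
        // on getFailedInserts() instead of failing the pipeline.
        .withFailedInsertRetryPolicy(InsertRetryPolicy.retryTransientErrors()));

result.getFailedInserts()
    .apply("HandleFailedRows", ParDo.of(new DoFn<TableRow, Void>() {
      @ProcessElement
      public void processElement(ProcessContext c) {
        // Log, count, or write the bad row to a dead-letter table.
        System.err.println("Failed insert: " + c.element());
      }
    }));
```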
chaim
On Thu, Sep 14,
It's being worked on. Turns out there are some modifications still needed
to the NexMark queries.
Reuven
On Thu, Sep 14, 2017 at 9:33 PM, Pei HE wrote:
> Could any Googlers help to run NexMark on Dataflow streaming and share the
> numbers with the community?
> --
> Pei
>
> On
Could any Googlers help to run NexMark on Dataflow streaming and share the
numbers with the community?
--
Pei
On Fri, Aug 25, 2017 at 11:28 PM, Lukasz Cwik
wrote:
> Etienne, cut some JIRAs for improvements like ValidatesRunner for the
> Nexmark suite that you think are
Cool, thanks for the shove in the right direction!
I'll probably have some more time to spend on this early next week,
hopefully I'll have a PR to submit after that. :)
On Thu, Sep 14, 2017 at 4:37 PM, Eugene Kirpichov <
kirpic...@google.com.invalid> wrote:
> On Thu, Sep 14, 2017 at 1:10 PM
Oh, I see what you mean. Yeah, I agree that having BigQueryIO use TableRow
as the native format was a suboptimal decision in retrospect, and I agree
that it would be reasonable to provide the ability to go through Avro
GenericRecord instead. I'm just not sure how to provide it in an
API-compatible way
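For comparison, on the read side recent SDKs already let you bypass TableRow by parsing the underlying Avro record yourself; one could imagine something analogous for writes. A sketch (MyRecord and convert() are hypothetical; imports elided):

```java
PCollection<MyRecord> rows = p.apply(
    BigQueryIO.read(
        (SchemaAndRecord input) -> {
          GenericRecord record = input.getRecord();  // Avro, not TableRow
          return convert(record);                    // hypothetical conversion
        })
    .from("my-project:dataset.table"));
```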
TL;DR: GroupingState should be just coding work to automate, while merging
ValueState probably needs a design doc
You can automatically merge GroupingState, though the merge is only
implicit: if all the user can do is add(datum), then the definition of merged
state is necessarily equivalent to calling
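As a toy illustration of why this is mechanical (plain Java, not the Beam API): if a bag-like state's only mutator is add(), then merging several instances is just replaying every element into the target.

```java
import java.util.*;

public class MergeBagState {
    // Toy stand-in for a write-only grouping state: a bag of elements.
    static class Bag<T> {
        final List<T> contents = new ArrayList<>();
        void add(T datum) { contents.add(datum); }
    }

    // Since add() is the only mutator, the merged state is fully determined:
    // replay every element of every source into a fresh target.
    static <T> Bag<T> merge(List<Bag<T>> sources) {
        Bag<T> merged = new Bag<>();
        for (Bag<T> source : sources) {
            for (T datum : source.contents) {
                merged.add(datum);
            }
        }
        return merged;
    }
}
```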
For doing something before starting the pipeline, can you do it in the main
program? The only disadvantage I can see is that it wouldn't be amenable to
using templates (ValueProvider's) - is that the blocker?
For doing something after a transform finishes processing a window of a
PCollection - we
I responded on the issue.
> On 12. Sep 2017, at 22:25, Lukasz Cwik wrote:
>
> Filed https://issues.apache.org/jira/browse/BEAM-2948
>
> On Tue, Sep 12, 2017 at 2:10 AM, Pawel Bartoszek wrote:
>
>> Hi,
>>
>> I am running a flink v1.2.1
Hi,
In one pipeline I have multiple PCollections. If I have an error on
one, then the whole pipeline is canceled. Is there a way to catch the
error and log it, and let all the other PCollections continue?
chaim
My use case is that I have generic code to transfer tables, for
example from MongoDB to BigQuery. I iterate over all tables in Mongo and
create a PCollection for each. But there are things that need to be
checked before running, and each table should run only if it validates.
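A sketch of that driver-side validation (plain Java; the table names, the validate() rule, and the Beam wiring in the comments are all hypothetical):

```java
import java.util.*;
import java.util.stream.*;

public class TableTransfer {
    // Hypothetical validation rule: only transfer tables with a known schema.
    static final Set<String> KNOWN_SCHEMAS =
        new HashSet<>(Arrays.asList("users", "orders"));

    static boolean validate(String table) {
        return KNOWN_SCHEMAS.contains(table);
    }

    // Only tables that pass validation get transforms added to the pipeline.
    static List<String> tablesToRun(List<String> allTables) {
        return allTables.stream()
            .filter(TableTransfer::validate)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> tables = Arrays.asList("users", "orders", "audit_log");
        for (String t : tablesToRun(tables)) {
            // p.apply("Read " + t, MongoDbIO.read()...)
            //  .apply("Write " + t, BigQueryIO.writeTableRows()...);
            System.out.println("would add transforms for " + t);
        }
        // p.run();
    }
}
```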
I tried the visitor but there is no way to
Hi,
I don't think it makes sense on a transform (as it expects a PCollection).
However, why not introduce a specific hook for that?
I think you can work around it using a Pipeline Visitor, but that would be
at the runner level.
Regards
JB
On 09/14/2017 08:21 AM, Chaim Turkel wrote:
Hi,
I have a
Hi,
I have a few scenarios where I would like to have code that runs
before the PBegin and after the PDone.
This is usually for monitoring purposes.
It would be nice to be able to transform from PBegin to PBegin, and
PDone to PDone, so that the code can run before and after, and not in
the driver
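Until such hooks exist, the usual workaround is to do the monitoring in the driver program around run()/waitUntilFinish(). A sketch (recordStartMetrics/recordEndMetrics are hypothetical; imports and transforms elided):

```java
public static void main(String[] args) {
  recordStartMetrics();                  // runs "before PBegin"

  Pipeline p = Pipeline.create(options);
  // ... apply transforms ...

  PipelineResult result = p.run();
  result.waitUntilFinish();              // block until the pipeline terminates

  recordEndMetrics(result.getState());   // runs "after PDone"
}
```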