Re: Data guarantees PubSub to GCS

2018-01-04 Thread Derek Hao Hu
>> What happens if the Beam job is not running successfully, and maybe >> throwing exceptions? Will the data still be available in PubSub when I >> cancel (not drain) the job? Does a drain work successfully if the data >> cannot be written to GCS because of the exceptions? >> >> Thanks, >> Andrew >> > -- Derek Hao Hu Software Engineer | Snapchat Snap Inc.

Re: @DoFn.Setup not called

2017-11-16 Thread Derek Hao Hu
Setup doesn't hint at > anything, so I'm suspecting Dataflow bug? > > Jacob > -- Derek Hao Hu Software Engineer | Snapchat Snap Inc.

Re: How does Beam set up the bundle size in streaming mode (like Pub/Sub)?

2017-10-29 Thread Derek Hao Hu
for filing a JIRA with some hyperlinks to the pages that say that? > > Kenn > > On Sun, Oct 22, 2017 at 9:54 AM, Derek Hao Hu <phoenixin...@gmail.com> > wrote: > >> Thanks Kenneth! I sort of feel the notions of bundles and windows are a >> bit confusing in Beam

Re: Infinite retry in streaming - is there a workaround?

2017-10-25 Thread Derek Hao Hu
channel. I'm also cc'ing Thomas Groh who might be able to help. > > > > On 20 October 2017 at 11:35, Derek Hao Hu <phoenixin...@gmail.com> wrote: > >> Kindly ping as I'm really curious about this. :p >> >> Derek >> >> On Thu, Oct 19, 2017 at 2:15

Re: How does Beam set up the bundle size in streaming mode (like Pub/Sub)?

2017-10-22 Thread Derek Hao Hu
yield > correct results for any bundling - you should be implementing > per-element logic, where @StartBundle/@FinishBundle are implementation > details. > > Kenn > > On Tue, Oct 17, 2017 at 5:37 PM, Derek Hao Hu <phoenixin...@gmail.com> > wrote: > >> Hi,

Re: Infinite retry in streaming - is there a workaround?

2017-10-20 Thread Derek Hao Hu
Kindly ping as I'm really curious about this. :p Derek On Thu, Oct 19, 2017 at 2:15 PM, Derek Hao Hu <phoenixin...@gmail.com> wrote: > Hi, > > We are trying to use Dataflow in Prod and right now one of our main > concerns is this "infinite retry" behavior

How does Beam set up the bundle size in streaming mode (like Pub/Sub)?

2017-10-17 Thread Derek Hao Hu
strategy as well. :( Could someone kindly provide some pointers? Thanks! -- Derek Hao Hu Software Engineer | Snapchat Snap Inc.

Re: How to catch exceptions while using DatastoreV1 API

2017-10-16 Thread Derek Hao Hu
/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L1263 > > > > On Mon, Oct 16, 2017 at 10:47 AM, Derek Hao Hu <phoenixin...@gmail.com> > wrote: > >> I see. Thanks Lukasz. >> >> In that case, do you think there is an easy / clean way to implement the

Re: How to catch exceptions while using DatastoreV1 API

2017-10-16 Thread Derek Hao Hu
that makes sense? Thanks, Derek On Mon, Oct 16, 2017 at 10:31 AM, Lukasz Cwik <lc...@google.com> wrote: > That source is not available to you as it is part of the Dataflow service. > > On Mon, Oct 16, 2017 at 10:25 AM, Derek Hao Hu <phoenixin...@gmail.com>

How to catch exceptions while using DatastoreV1 API

2017-10-16 Thread Derek Hao Hu
own DatastoreIO. Thanks, -- Derek Hao Hu Software Engineer | Snapchat Snap Inc.

Re: DoFn setup/teardown sequence

2017-10-16 Thread Derek Hao Hu
worker execute multiple instances of a DoFn? (I believe yes) >> >> Thank you, >> >> Jacob >> > > -- > Jean-Baptiste Onofré > jbono...@apache.org > http://blog.nanthrax.net > Talend - http://www.talend.com > -- Derek Hao Hu Software Engineer | Snapchat Snap Inc.

Is there a way to access PipelineOptions in DoFn.Setup?

2017-10-10 Thread Derek Hao Hu
or FinishBundleContext in Setup. Thanks, -- Derek Hao Hu Software Engineer | Snapchat Snap Inc.

Re: [Call for Speakers] Apache Beam Meetup @ San Francisco, CA on Oct. 24th

2017-10-10 Thread Derek Hao Hu
also look into organizing something in Seattle later on :) > > On Oct 9, 2017 10:59 PM, "Derek Hao Hu" <phoenixin...@gmail.com> wrote: > >> It's just that I'm in Seattle (and I guess a lot of us are not in SF). >> The talks seem pretty interesting. :) >>

Re: [Call for Speakers] Apache Beam Meetup @ San Francisco, CA on Oct. 24th

2017-10-09 Thread Derek Hao Hu
at but so far we're not planning on it. > Would you like to watch them later? > > On 9 October 2017 at 16:36, Derek Hao Hu <phoenixin...@gmail.com> wrote: > >> Hi Griselda, >> >> Will the talks be recorded? >> >> Thanks, >> >> Derek >>

Re: How to get the PublishTime of PubsubMessages?

2017-10-02 Thread Derek Hao Hu
p/pubsub/PubsubClient.java#L101-L115> > github.com > beam - Mirror of Apache Beam > > > -- > *From:* Derek Hao Hu <phoenixin...@gmail.com> > *Sent:* 02 October 2017 18:01:10 > *To:* user@beam.apache.org > *Subject:* Re: How to get the PublishTime of

Re: Problem with autoscaling

2017-09-04 Thread Derek Hao Hu
https://cloud.google.com/dataflow/service/dataflow-service-desc#autoscaling > > Does anyone know if this is true? I know this is not the forum for > Dataflow questions in general but I thought someone else here might have > experience that supports or contradicts this. > > Thanks,

Read binary file from GCS buckets?

2017-07-16 Thread Derek Hao Hu
Hi, I'm trying to read a binary file from GCS. I've seen that `TextIO` can read directly from GCS buckets but based on the documentation it would split lines based on carriage returns. Is there a way to read a binary file directly from GCS buckets? Thanks, -- Derek Hao Hu Software Engineer