Re: Slack

2017-03-13 Thread Amit Sela
Done. Welcome!

On Mon, Mar 13, 2017 at 2:29 PM Alexander Gallego 
wrote:

> same for me please.
>
>
>
>
> .alex
>
>
>
> On Fri, Mar 10, 2017 at 3:01 PM, Amit Sela  wrote:
>
> Done
>
> On Fri, Mar 10, 2017, 21:59 Devon Meunier 
> wrote:
>
> Hi!
>
> Sorry for the noise but could someone invite me to the slack channel?
>
> Thanks,
>
> Devon
>
>
>


Re: Slack

2017-03-13 Thread Alexander Gallego
same for me please.




.alex



On Fri, Mar 10, 2017 at 3:01 PM, Amit Sela  wrote:

> Done
>
> On Fri, Mar 10, 2017, 21:59 Devon Meunier 
> wrote:
>
>> Hi!
>>
>> Sorry for the noise but could someone invite me to the slack channel?
>>
>> Thanks,
>>
>> Devon
>>
>


Re: Slack

2017-03-13 Thread Tobias Feldhaus
Same for me please :)
Tobi

On 13.03.17, 13:30, "Amit Sela" wrote:

Done. Welcome!

On Mon, Mar 13, 2017 at 2:29 PM Alexander Gallego wrote:
same for me please.




.alex


On Fri, Mar 10, 2017 at 3:01 PM, Amit Sela wrote:

Done

On Fri, Mar 10, 2017, 21:59 Devon Meunier wrote:
Hi!

Sorry for the noise but could someone invite me to the slack channel?

Thanks,

Devon



Re: Slack

2017-03-13 Thread Amit Sela
I'm so well trained, I do it on my phone now!

On Mon, Mar 13, 2017, 15:24 Tobias Feldhaus 
wrote:

> Same for me please :)
>
> Tobi
>
>
>
> On 13.03.17, 13:30, "Amit Sela"  wrote:
>
>
>
> Done. Welcome!
>
>
>
> On Mon, Mar 13, 2017 at 2:29 PM Alexander Gallego 
> wrote:
>
> same for me please.
>
>
>
>
>
>
>
> .alex
>
>
>
> On Fri, Mar 10, 2017 at 3:01 PM, Amit Sela  wrote:
>
> Done
>
>
>
> On Fri, Mar 10, 2017, 21:59 Devon Meunier 
> wrote:
>
> Hi!
>
>
>
> Sorry for the noise but could someone invite me to the slack channel?
>
>
>
> Thanks,
>
>
>
> Devon
>
>
>
>


Re: grpc IO?

2017-03-13 Thread Borisa Zivkovic
Thanks Eugene,

I thought about this over the weekend, and I think the best approach is for
me to spend some time mocking up some code to see how grpcIO would help us.
I propose that you give me a few weeks and we keep the JIRA open. After that
I will either close the JIRA or propose what the API would look like.

makes sense?
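
For comparison with any mock-up, the ParDo-based option from the quoted discussion below (doing the RPC directly inside a ParDo) might look roughly like this. It is only a sketch: RpcClient and RpcClientFactory are hypothetical stand-ins for a generated gRPC stub and are not part of Beam or gRPC. Only DoFn and its annotations come from the Beam Java SDK.

```java
import java.io.Serializable;
import org.apache.beam.sdk.transforms.DoFn;

public class RpcPerElementExample {

  /** Hypothetical client interface; stands in for a generated gRPC stub. */
  public interface RpcClient {
    String call(String request);

    void close();
  }

  /** Hypothetical serializable factory, so the DoFn itself stays serializable. */
  public interface RpcClientFactory extends Serializable {
    RpcClient create();
  }

  /** Calls the remote service once per input element and emits the response. */
  public static class CallRpcFn extends DoFn<String, String> {
    private final RpcClientFactory factory;

    // The client is created on the worker, so it is not serialized with the DoFn.
    private transient RpcClient client;

    public CallRpcFn(RpcClientFactory factory) {
      this.factory = factory;
    }

    @Setup
    public void setup() {
      client = factory.create();
    }

    @ProcessElement
    public void processElement(ProcessContext c) {
      c.output(client.call(c.element()));
    }

    @Teardown
    public void teardown() {
      if (client != null) {
        client.close();
      }
    }
  }
}
```

It would be applied with something like input.apply(ParDo.of(new CallRpcFn(factory))). Creating the client in @Setup rather than per element lets one connection be reused for every element the DoFn instance processes.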

On Fri, 10 Mar 2017 at 14:20 Eugene Kirpichov  wrote:

> Hi,
>
> I want to clarify: I am not pushing back on your proposal, merely trying
> to understand your use case, and what part of it you want Beam to do for
> you. E.g. obviously you, as the user, will be specifying what service you
> want to contact, what RPC method to call, you will construct arguments to
> this call, and you will handle the results of the call - this is all
> application-specific logic and Beam can not decide any of this for you. It
> vaguely seems like not very much remains for Beam to help with, but I may
> be mistaken.
>
> So - can you tell more about your use case? Are you just calling an RPC
> for every element of a collection? Are you applying an RPC to every element
> of a collection and collecting the results into another collection? Are you
> calling an RPC that returns a large number (or a stream) of results and
> you're putting these results into a collection? Is it something else?
>
> Thanks.
>
> On Fri, Mar 10, 2017 at 1:40 PM Borisa Zivkovic <
> borisha.zivko...@gmail.com> wrote:
>
> Hi Eugene,
>
> did not try that - I thought that having one IO for this would be better
> solution.
> Of course, in case you think it would not be beneficial to other to have
> this IO in Beam code no problem,
> we can always use custom solution here...
>
> On Fri, 10 Mar 2017 at 13:01 Eugene Kirpichov 
> wrote:
>
> Your first option would be to simply do RPCs from your ParDo's.
> Have you tried that / is it not working for some reason / is the code
> turning out unnecessarily complex or not performant enough etc.?
> On Fri, Mar 10, 2017 at 12:36 PM Borisa Zivkovic <
> borisha.zivko...@gmail.com> wrote:
>
> well what we are looking into is using grpc with linkerd or envoy ... so
> if beam supported grpc out of the box it would be great help...
>
> On Fri, 10 Mar 2017 at 11:58 Jean-Baptiste Onofré  wrote:
>
> Absolutely !
>
> Thanks !
>
> Regards
> JB
>
> On 03/10/2017 11:18 AM, Borisa Zivkovic wrote:
> > ok thanks JB,
> >
> > I guess I can create JIRA and see if I can contribute it?
> >
> > we need grpc IO...
> >
> >
> >
> > On Fri, 10 Mar 2017 at 10:08 Jean-Baptiste Onofré wrote:
> >
> > Hi Borisa,
> >
> > Not on my side, but definitely a good idea.
> >
> > Regards
> > JB
> >
> > On 03/10/2017 10:58 AM, Borisa Zivkovic wrote:
> > > Hi guys,
> > >
> > > any reason why there is no grpc IO available? Is it maybe planned?
> > >
> > > thanks
> >
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org 
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
>


Re: Slack

2017-03-13 Thread Mingmin Xu
added

PS: resent, as it seems the previous message was blocked.

On Mon, Mar 13, 2017 at 1:07 PM, Sunil K Sahu  wrote:

> Could someone add me to slack channel as well.
>
> Thanks,
> Sunil
>
> Sunil Kumar Sahu
> CS Dept - Graduate Student
> BU - Watson School of Engineering
>
> On Mon, Mar 13, 2017 at 9:28 AM, Amit Sela  wrote:
>
>> I'm so well trained, I do it on my phone now!
>>
>> On Mon, Mar 13, 2017, 15:24 Tobias Feldhaus <
>> tobias.feldh...@localsearch.ch> wrote:
>>
>>> Same for me please :)
>>>
>>> Tobi
>>>
>>>
>>>
>>> On 13.03.17, 13:30, "Amit Sela"  wrote:
>>>
>>>
>>>
>>> Done. Welcome!
>>>
>>>
>>>
>>> On Mon, Mar 13, 2017 at 2:29 PM Alexander Gallego <
>>> gallego.al...@gmail.com> wrote:
>>>
>>> same for me please.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> .alex
>>>
>>>
>>>
>>> On Fri, Mar 10, 2017 at 3:01 PM, Amit Sela  wrote:
>>>
>>> Done
>>>
>>>
>>>
>>> On Fri, Mar 10, 2017, 21:59 Devon Meunier 
>>> wrote:
>>>
>>> Hi!
>>>
>>>
>>>
>>> Sorry for the noise but could someone invite me to the slack channel?
>>>
>>>
>>>
>>> Thanks,
>>>
>>>
>>>
>>> Devon
>>>
>>>
>>>
>>>
>


-- 

Mingmin


Re: Slack

2017-03-13 Thread Sunil K Sahu
I can access the channels now.

Thanks

Sunil Kumar Sahu
CS Dept - Graduate Student
BU - Watson School of Engineering

On Mon, Mar 13, 2017 at 7:03 PM, Mingmin Xu  wrote:

> added
>
> ps, resent as it seems the previous one is blocked,
>
> On Mon, Mar 13, 2017 at 1:07 PM, Sunil K Sahu 
> wrote:
>
>> Could someone add me to slack channel as well.
>>
>> Thanks,
>> Sunil
>>
>> Sunil Kumar Sahu
>> CS Dept - Graduate Student
>> BU - Watson School of Engineering
>>
>> On Mon, Mar 13, 2017 at 9:28 AM, Amit Sela  wrote:
>>
>>> I'm so well trained, I do it on my phone now!
>>>
>>> On Mon, Mar 13, 2017, 15:24 Tobias Feldhaus <
>>> tobias.feldh...@localsearch.ch> wrote:
>>>
>>>> Same for me please :)
>>>>
>>>> Tobi
>>>>
>>>> On 13.03.17, 13:30, "Amit Sela" wrote:
>>>>
>>>> Done. Welcome!
>>>>
>>>> On Mon, Mar 13, 2017 at 2:29 PM Alexander Gallego <
>>>> gallego.al...@gmail.com> wrote:
>>>>
>>>> same for me please.
>>>>
>>>> .alex
>>>>
>>>> On Fri, Mar 10, 2017 at 3:01 PM, Amit Sela wrote:
>>>>
>>>> Done
>>>>
>>>> On Fri, Mar 10, 2017, 21:59 Devon Meunier wrote:
>>>>
>>>> Hi!
>>>>
>>>> Sorry for the noise but could someone invite me to the slack channel?
>>>>
>>>> Thanks,
>>>>
>>>> Devon
>>>>
>>
>
>
> --
> 
> Mingmin
>


Re: Batch loading for streaming pipelines

2017-03-13 Thread Arpan Jain
Thanks for the reply! We are using these pipelines to read structured log
lines from Kafka and store them in BigQuery.

.withMaxNumRecords() and .withMaxReadTime() aren't that useful for us
because they do not remember how much was read in the previous run.


On Mon, Mar 13, 2017 at 9:42 PM, Kenneth Knowles 
wrote:

> This seems like a good topic for user@ so I've moved it there (dev@ to
> BCC).
>
> You can get a bounded PCollection from KafkaIO via either of
> .withMaxNumRecords() or .withMaxReadTime().
>
> Whether or not that will meet your use case would depend on more details of
> what you are computing. Periodic batch jobs are harder to get right. In
> particular, the time you stop reading and the end of a window (esp.
> sessions) are not likely to coincide, so you'll need to deal with that.
>
> Kenn
>
> On Mon, Mar 13, 2017 at 6:09 PM, Arpan Jain  wrote:
>
> > Hi,
> >
> > We run multiple streaming pipelines using cloud dataflow that read from
> > Kafka and write to BigQuery. We don't mind a few hours delay and are
> > thinking of avoiding the costs associated with streaming data into
> > BigQuery. Is there already a support (or a future plan) for such a
> > scenario? If not then I guess I will implement one of the following
> option:
> > * A BoundedSource implementation for Kafka so that we can run this in
> > batch mode.
> > * The streaming job writes to GCS and then a BQ load job writes to
> > BigQuery.
> >
> > Thanks!
> >
>


Re: Batch loading for streaming pipelines

2017-03-13 Thread Kenneth Knowles
This seems like a good topic for user@ so I've moved it there (dev@ to BCC).

You can get a bounded PCollection from KafkaIO via either of
.withMaxNumRecords() or .withMaxReadTime().

Whether or not that will meet your use case would depend on more details of
what you are computing. Periodic batch jobs are harder to get right. In
particular, the time you stop reading and the end of a window (esp.
sessions) are not likely to coincide, so you'll need to deal with that.

Kenn
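
A rough sketch of the bounded read described above, assuming the KafkaIO builder methods of the Beam 2.x Java SDK (older releases configured key and value decoding differently). The class name, method name, and parameters are made up for illustration.

```java
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.transforms.PTransform;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PBegin;
import org.apache.beam.sdk.values.PCollection;
import org.apache.kafka.common.serialization.StringDeserializer;

public class BoundedKafkaReadExample {

  /**
   * Builds a Kafka read that stops after maxRecords records, which makes the
   * resulting PCollection bounded. All argument values are supplied by the
   * caller; nothing here is taken from the original thread.
   */
  public static PTransform<PBegin, PCollection<KV<String, String>>> boundedRead(
      String bootstrapServers, String topic, long maxRecords) {
    return KafkaIO.<String, String>read()
        .withBootstrapServers(bootstrapServers)
        .withTopic(topic)
        .withKeyDeserializer(StringDeserializer.class)
        .withValueDeserializer(StringDeserializer.class)
        // Alternative limit: .withMaxReadTime(...) with an org.joda.time.Duration.
        .withMaxNumRecords(maxRecords)
        // Drop Kafka metadata and keep only the key/value pairs.
        .withoutMetadata();
  }
}
```

The transform would be applied at the root of a pipeline, for example pipeline.apply(BoundedKafkaReadExample.boundedRead(servers, topic, 100000L)). The limit only bounds how much one run reads.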

On Mon, Mar 13, 2017 at 6:09 PM, Arpan Jain  wrote:

> Hi,
>
> We run multiple streaming pipelines using cloud dataflow that read from
> Kafka and write to BigQuery. We don't mind a few hours delay and are
> thinking of avoiding the costs associated with streaming data into
> BigQuery. Is there already a support (or a future plan) for such a
> scenario? If not then I guess I will implement one of the following option:
> * A BoundedSource implementation for Kafka so that we can run this in
> batch mode.
> * The streaming job writes to GCS and then a BQ load job writes to
> BigQuery.
>
> Thanks!
>


Re: Cannot provide coder

2017-03-13 Thread Antony Mayi
To answer my own question:
I should have said that I am creating the PCollection using Create.of, which
apparently leads to this getDefaultCoder() implementation:
CoderRegistry.getDefaultCoder(T exampleValue), which doesn't try inspecting the
annotation (as opposed to CoderRegistry.getDefaultCoder(Class clazz)). So I
guess I have to explicitly use Create.of(...).withCoder(...).
antony. 
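
The workaround mentioned above could look roughly like the following sketch. The Data class here is a stand-in with a hypothetical field, and the class and method names are made up for illustration. Create.of, withCoder, and SerializableCoder.of are the Beam Java SDK calls being relied on.

```java
import java.io.Serializable;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.coders.SerializableCoder;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.values.PCollection;

public class CreateWithCoderExample {

  /** Stand-in for the custom class from the question; the field is hypothetical. */
  public static class Data implements Serializable {
    private final String value;

    public Data(String value) {
      this.value = value;
    }

    public String getValue() {
      return value;
    }
  }

  /** Builds the Create transform with the coder set explicitly instead of inferred. */
  public static Create.Values<Data> dataValues() {
    return Create.of(new Data("a"), new Data("b"))
        .withCoder(SerializableCoder.of(Data.class));
  }

  /** Applies the transform to a pipeline supplied by the caller. */
  public static PCollection<Data> createData(Pipeline pipeline) {
    return pipeline.apply(dataValues());
  }
}
```

Setting the coder on the Create transform sidesteps the value-based coder lookup entirely, so the @DefaultCoder annotation is not needed for this PCollection.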

On Monday, 13 March 2017, 13:37, Antony Mayi  wrote:
 

 Hi,
I am trying to create a PCollection of my custom data class but it keeps
failing with a CannotProvideCoderException.
My class is declared as follows:
@DefaultCoder(SerializableCoder.class) public class Data implements Serializable {
and it fails like this:
Caused by: org.apache.beam.sdk.coders.CannotProvideCoderException: Cannot 
provide coder based on value with class my.project.Data: No CoderFactory has 
been registered for the class.
Why doesn't it pick up the coder?
Thanks, Antony.

   

Re: Slack

2017-03-13 Thread Sunil K Sahu
Could someone add me to the Slack channel as well?

Thanks,
Sunil

Sunil Kumar Sahu
CS Dept - Graduate Student
BU - Watson School of Engineering

On Mon, Mar 13, 2017 at 9:28 AM, Amit Sela  wrote:

> I'm so well trained, I do it on my phone now!
>
> On Mon, Mar 13, 2017, 15:24 Tobias Feldhaus wrote:
>
>> Same for me please :)
>>
>> Tobi
>>
>>
>>
>> On 13.03.17, 13:30, "Amit Sela"  wrote:
>>
>>
>>
>> Done. Welcome!
>>
>>
>>
>> On Mon, Mar 13, 2017 at 2:29 PM Alexander Gallego <
>> gallego.al...@gmail.com> wrote:
>>
>> same for me please.
>>
>>
>>
>>
>>
>>
>>
>> .alex
>>
>>
>>
>> On Fri, Mar 10, 2017 at 3:01 PM, Amit Sela  wrote:
>>
>> Done
>>
>>
>>
>> On Fri, Mar 10, 2017, 21:59 Devon Meunier 
>> wrote:
>>
>> Hi!
>>
>>
>>
>> Sorry for the noise but could someone invite me to the slack channel?
>>
>>
>>
>> Thanks,
>>
>>
>>
>> Devon
>>
>>
>>
>>