Re: Proposal : build single linux binary

2017-12-07 Thread Ning Wang
Sounds good. Is there any concern about the system libs used in the cpp
binaries?

On Thu, Dec 7, 2017 at 8:53 PM, Sanjeev Kulkarni 
wrote:

> sounds good
>
> On Thu, Dec 7, 2017 at 8:45 PM, Karthik Ramasamy 
> wrote:
>
> > Sounds good.
> >
> > On Thu, Dec 7, 2017 at 8:06 PM Ali Ahmed  wrote:
> >
> > > > I am considering changing bazel to build a single binary for Linux, as
> > > > opposed to splitting builds between Ubuntu and CentOS. Do people have
> > > > concerns or technical objections to this?
> > >
> > >
> >
>


Re: Slurm Scheduler

2017-12-03 Thread Ning Wang
+1. I think your idea of moving them into a different directory makes sense
in case some people need it.

On Sat, Dec 2, 2017 at 11:44 PM, Sanjeev Kulkarni 
wrote:

> +1
>
> On Sat, Dec 2, 2017 at 10:57 PM Ali Ahmed  wrote:
>
> > I support deprecating it from the codebase.
> >
> > -Ali
> >
> > > On Dec 2, 2017, at 7:38 PM, Jerry Peng 
> > wrote:
> > >
> > > Hello Everyone,
> > >
> > > > In the context of the ongoing discussion about minimizing the number
> > > > of supported schedulers, is anyone using the Slurm scheduler in Heron?
> > > or can we deprecate/remove it? I am not sure if the Slurm scheduler
> > > even works anymore.  Can anyone familiar with the Slurm scheduler
> > > comment on this?  Thanks in advance!
> > >
> > > Best,
> > >
> > > Jerry
> >
> >
>


Re: Stateful updating and deterministic routing

2018-05-05 Thread Ning Wang
Currently I think each Instance serializes the state object into a byte
array and checkpoint manager saves the byte array into a file. The file is
referenced by topology name + component name + instance id.
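
A minimal sketch of the scheme described above, not Heron's actual checkpoint manager code. The `CheckpointRef` class, the `save_state`/`load_state` helpers, the path layout, and the use of `pickle` and an in-memory dict as the store are all assumptions made for illustration.

```python
import pickle
from typing import Any, Dict, NamedTuple


class CheckpointRef(NamedTuple):
    """Hypothetical reference: topology name + component name + instance id."""
    topology: str
    component: str
    instance_id: int

    def path(self) -> str:
        # Assumed layout; the real file naming is not given in this thread.
        return f"{self.topology}/{self.component}/{self.instance_id}"


def save_state(store: Dict[str, bytes], ref: CheckpointRef, state: Any) -> None:
    # The whole state object becomes one opaque byte array under one reference.
    store[ref.path()] = pickle.dumps(state)


def load_state(store: Dict[str, bytes], ref: CheckpointRef) -> Any:
    return pickle.loads(store[ref.path()])
```

Because the byte array is opaque and keyed by instance id, it cannot be split or merged if an update changes which keys an instance owns.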

On Fri, May 4, 2018 at 11:10 PM, Karthik Ramasamy 
wrote:

> I am not sure I understand why the state is tied to an instance?
>
> cheers
> /karthik
>
> On Fri, May 4, 2018 at 4:36 PM, Thomas Cooper 
> wrote:
>
> > Yeah, state recovery is a bit more difficult with Heron's architecture.
> In
> > Storm, the task IDs are not just values used for routing they actually
> > equate to a task instance within the executor. An executor which
> currently
> > processes the keys 4-8 actually contains 5 task instances of the same
> > component. So for each task, they just save its state attached to the
> > single task ID and reassemble executors with the new task instances.
> >
> > We don't want or have to do that with Heron instances but we would need
> to
> > have some way to have a state change tied to the task (or routing key if
> we
> > go to the key range idea). For something like a word count you might save
> > counts using a nested map like: { routing key : {word : count }}. The
> > routing key could be included in the Tuple instance. However, whether
> this
> > pattern would work for more generic state cases I don't know?
> >
> > Tom Cooper
> > W: www.tomcooper.org.uk  | Twitter: @tomncooper
> > 
> >
> >
> > On Fri, 4 May 2018 at 15:54, Neng Lu  wrote:
> >
> > > +1 for this idea. As long as the predefined key space is large enough,
> it
> > > should work for most of the cases.
> > >
> > > Based on my experience with topologies, I never saw one component has
> > more
> > > than 1000 instances in a topology.
> > >
> > > For recovering states from an update, there will be some problems
> though.
> > > Since the states stored in heron are strongly connected with each
> > instance,
> > > we either need to have
> > > some resolver does the state repartitioning or stores states with the
> key
> > > instead of with each instance.
> > >
> > >
> > >
> > > On Fri, May 4, 2018 at 3:01 PM, Karthik Ramasamy 
> > > wrote:
> > >
> > > > Thanks for sharing. I like the Storm approach
> > > >
> > > > - keeps the implementation simpler
> > > > - state is deterministic across restarts
> > > > - makes it easy to reason and debug
> > > >
> > > > The hard limit is not a problem at all since most of the topologies
> > will
> > > > be never that big.
> > > > If you can handle Twitter topologies cleanly, it is more that
> > sufficient
> > > I
> > > > believe.
> > > >
> > > > cheers
> > > > /karthik
> > > >
> > > > > On May 4, 2018, at 2:31 PM, Thomas Cooper 
> > > > wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > A while ago I emailed about the issue of how fields (key) grouped
> > > routing
> > > > > in Heron was not consistent across an update and how this makes
> > > > preserving
> > > > > state across an update very difficult and also makes it
> > > > > difficult/impossible to analyse or predict tuple flows through a
> > > > > current/proposed topology physical plan.
> > > > >
> > > > > I suggested adopting Storms approach of pre-defining a routing key
> > > > > space for each component (eg 0-999), so that instead of an instance
> > > > having
> > > > > a single task id that gets reset at every update (eg 10) it has a
> > range
> > > > of
> > > > > id's (eg 10-16) that changes depending on the parallelism of the
> > > > component.
> > > > > This has the advantage that a key will always hash to the same task
> > ID
> > > > for
> > > > > the lifetime of the topology. Meaning recovering state for an
> > instance
> > > > > after a crash or update is just a case of pulling the state linked
> to
> > > the
> > > > > keys in its task ID range.
> > > > >
> > > > > I know the above proposal has issues, not least of all placing a
> hard
> > > > upper
> > > > > limit on the scale out of a component, and that some alternative
> > ideas
> > > > are
> > > > > being floated for solving the stateful update issue. However, I
> just
> > > > wanted
> > > > > to throw some more weight behind the Storm approach. There was a
> > recent
> > > > > paper about high-performance network load balancing
> > > > >  > > > datacenter-load-balancing-with-beamer/>that
> > > > > describes an approach using a fixed key space similar to Storm's
> (see
> > > the
> > > > > section called Stable Hashing - they assign a range 100x the
> expected
> > > > > connection pool size - which we could do with heron to prevent ever
> > > > hitting
> > > > > the upper scaling limit). Also, this new load balancer, Beamer,
> > claims
> > > to
> > > > > be twice as fast as Google's Maglev
> > > > >  > > > 

Re: Stateful updating and deterministic routing

2018-05-05 Thread Ning Wang
If we go this way, we need a key -> state map for each component so that the
state data can be repartitioned.
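
A rough sketch of what such a key -> state map and its repartitioning could look like, using the nested word-count map suggested in the quoted discussion below. The function names, the contiguous-range split, and the in-memory dicts are assumptions for illustration, not an existing Heron API.

```python
from typing import Dict, List, Tuple

# routing key -> {word: count}, as in the word-count example from the thread
KeyState = Dict[int, Dict[str, int]]


def key_ranges(key_space: int, parallelism: int) -> List[Tuple[int, int]]:
    """Split [0, key_space) into `parallelism` contiguous half-open ranges."""
    base, extra = divmod(key_space, parallelism)
    ranges, start = [], 0
    for i in range(parallelism):
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def repartition(states: List[KeyState], key_space: int,
                new_parallelism: int) -> List[KeyState]:
    """Redistribute per-key state across a new number of instances."""
    merged: KeyState = {}
    for state in states:
        # Assumes each routing key was owned by exactly one old instance.
        merged.update(state)
    return [
        {k: v for k, v in merged.items() if lo <= k < hi}
        for lo, hi in key_ranges(key_space, new_parallelism)
    ]
```

With state stored per routing key, an update only has to regroup map entries; nothing inside a per-key entry needs to be understood or split.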

On Fri, May 4, 2018 at 11:44 PM, Karthik Ramasamy <kart...@streaml.io>
wrote:

> Instead - if it references
>
> topology name + component name + key range
>
> will it be better?
>
> cheers
> /karthik
>
>
> On Fri, May 4, 2018 at 11:23 PM, Ning Wang <wangnin...@gmail.com> wrote:
>
> > Currently I think each Instance serializes the state object into a byte
> > array and checkpoint manager saves the byte array into a file. The file
> is
> > referenced by topology name + component name + instance id.
> >
> > On Fri, May 4, 2018 at 11:10 PM, Karthik Ramasamy <kart...@streaml.io>
> > wrote:
> >
> > > I am not sure I understand why the state is tied to an instance?
> > >
> > > cheers
> > > /karthik
> > >
> > > On Fri, May 4, 2018 at 4:36 PM, Thomas Cooper <tom.n.coo...@gmail.com>
> > > wrote:
> > >
> > > > Yeah, state recovery is a bit more difficult with Heron's
> architecture.
> > > In
> > > > Storm, the task IDs are not just values used for routing they
> actually
> > > > equate to a task instance within the executor. An executor which
> > > currently
> > > > processes the keys 4-8 actually contains 5 task instances of the same
> > > > component. So for each task, they just save its state attached to the
> > > > single task ID and reassemble executors with the new task instances.
> > > >
> > > > We don't want or have to do that with Heron instances but we would
> need
> > > to
> > > > have some way to have a state change tied to the task (or routing key
> > if
> > > we
> > > > go to the key range idea). For something like a word count you might
> > save
> > > > counts using a nested map like: { routing key : {word : count }}. The
> > > > routing key could be included in the Tuple instance. However, whether
> > > this
> > > > pattern would work for more generic state cases I don't know?
> > > >
> > > > Tom Cooper
> > > > W: www.tomcooper.org.uk  | Twitter: @tomncooper
> > > > <https://twitter.com/tomncooper>
> > > >
> > > >
> > > > On Fri, 4 May 2018 at 15:54, Neng Lu <freen...@gmail.com> wrote:
> > > >
> > > > > +1 for this idea. As long as the predefined key space is large
> > enough,
> > > it
> > > > > should work for most of the cases.
> > > > >
> > > > > Based on my experience with topologies, I never saw one component
> has
> > > > more
> > > > > than 1000 instances in a topology.
> > > > >
> > > > > For recovering states from an update, there will be some problems
> > > though.
> > > > > Since the states stored in heron are strongly connected with each
> > > > instance,
> > > > > we either need to have
> > > > > some resolver does the state repartitioning or stores states with
> the
> > > key
> > > > > instead of with each instance.
> > > > >
> > > > >
> > > > >
> > > > > On Fri, May 4, 2018 at 3:01 PM, Karthik Ramasamy <
> > kramas...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Thanks for sharing. I like the Storm approach
> > > > > >
> > > > > > - keeps the implementation simpler
> > > > > > - state is deterministic across restarts
> > > > > > - makes it easy to reason and debug
> > > > > >
> > > > > > The hard limit is not a problem at all since most of the
> topologies
> > > > will
> > > > > > be never that big.
> > > > > > If you can handle Twitter topologies cleanly, it is more that
> > > > sufficient
> > > > > I
> > > > > > believe.
> > > > > >
> > > > > > cheers
> > > > > > /karthik
> > > > > >
> > > > > > > On May 4, 2018, at 2:31 PM, Thomas Cooper <
> > tom.n.coo...@gmail.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > A while ago I emailed about the issue of how fields (key)
> grouped

Re: Stateful updating and deterministic routing

2018-05-04 Thread Ning Wang
Interesting. Thanks for sharing~

On Fri, May 4, 2018 at 2:31 PM, Thomas Cooper 
wrote:

> Hi all,
>
> A while ago I emailed about the issue of how fields (key) grouped routing
> in Heron was not consistent across an update and how this makes preserving
> state across an update very difficult and also makes it
> difficult/impossible to analyse or predict tuple flows through a
> current/proposed topology physical plan.
>
> I suggested adopting Storm's approach of pre-defining a routing key
> space for each component (e.g. 0-999), so that instead of an instance having
> a single task ID that gets reset at every update (e.g. 10) it has a range of
> IDs (e.g. 10-16) that changes depending on the parallelism of the component.
> This has the advantage that a key will always hash to the same task ID for
> the lifetime of the topology. This means recovering state for an instance
> after a crash or update is just a case of pulling the state linked to the
> keys in its task ID range.
>
> I know the above proposal has issues, not least of all placing a hard upper
> limit on the scale out of a component, and that some alternative ideas are
> being floated for solving the stateful update issue. However, I just wanted
> to throw some more weight behind the Storm approach. There was a recent
> paper about high-performance network load balancing
>  with-beamer/>that
> describes an approach using a fixed key space similar to Storm's (see the
> section called Stable Hashing - they assign a range 100x the expected
> connection pool size - which we could do with heron to prevent ever hitting
> the upper scaling limit). Also, this new load balancer, Beamer, claims to
> be twice as fast as Google's Maglev
>  reliable-software-network-load-balancer/>
> which again uses a pre-defined keyspace and ID ranges to create look-up
> tables deterministically.
>
> I know a load balancer is a different beast to a stream grouping but there
> are some interesting ideas in those papers (The links point to summary blog
> posts so you don't have to read the whole paper).
>
> Anyway, I just thought I would put those papers out there and see what people
> think.
>
> Tom Cooper
> W: www.tomcooper.org.uk  | Twitter: @tomncooper
> 
>
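
The fixed key space idea quoted above can be sketched roughly as follows. This is an illustration under stated assumptions, not Storm's or Heron's routing code: the key space size of 1000, the use of CRC32 as the hash, and the helper names are all hypothetical.

```python
import bisect
import zlib
from typing import List

KEY_SPACE = 1000  # assumed fixed for the lifetime of the topology (0-999)


def routing_key(field_value: str, key_space: int = KEY_SPACE) -> int:
    # CRC32 is deterministic across processes, unlike Python's built-in hash().
    # The result depends only on the value and the fixed key space,
    # never on the current parallelism.
    return zlib.crc32(field_value.encode("utf-8")) % key_space


def range_starts(key_space: int, parallelism: int) -> List[int]:
    """First routing key owned by each instance, for contiguous ranges."""
    return [i * key_space // parallelism for i in range(parallelism)]


def instance_for(key: int, starts: List[int]) -> int:
    """Index of the instance whose range contains `key`."""
    return bisect.bisect_right(starts, key) - 1
```

Changing the parallelism only changes `range_starts`; the routing key of a given field value stays the same, which is what lets state be looked up by key after an update.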


Re: Stateful updating and deterministic routing

2018-05-07 Thread Ning Wang
Thanks Karthik. The doc is not exactly the same but close enough.

It seems my doc is an internal one so let's use your doc as reference. I
will see if there are any major differences and comment.

On Sun, May 6, 2018 at 1:22 PM, Karthik Ramasamy <kart...@streaml.io> wrote:

> Here it is
>
> https://docs.google.com/document/d/1YDFNvLTX6Sg3WDrNFKiWLaJvuEtK4
> eyxEaA0w9cVlG4/edit#heading=h.d6uy2uxfs2xq
>
> cheers
> /karthik
>
>
> On Sun, May 6, 2018 at 8:20 AM, Bill Graham <billgra...@gmail.com> wrote:
>
>> Can you share the doc please?
>>
>> On Sat, May 5, 2018 at 4:18 PM Ning Wang <wangnin...@gmail.com> wrote:
>>
>> > Thanks.
>> >
>> > Yeah I have read the design doc. It has a section for scaling and covers
>> > some designs but not reaching this level of details I am afraid.
>> >
>> > On Sat, May 5, 2018 at 9:45 AM, Bill Graham <billgra...@gmail.com>
>> wrote:
>> >
>> >> The stateful processing design included a large section on scaling,
>> which
>> >> was intended to be done as a future phase. It's very similar to what's
>> >> being described. Sanjeev and I worked on it about 1.5 years ago with
>> >> Maosong and it was in a Google doc. Sanjeev, do you have that design doc? I
>> >> can't seem to locate it.
>> >>
>> >> On Sat, May 5, 2018 at 12:03 AM, Ning Wang <wangnin...@gmail.com>
>> wrote:
>> >>
>> >> > If we go this way, we need key -> state map for each component so
>> that
>> >> the
>> >> > state data can be repartitioned.
>> >> >
>> >> > On Fri, May 4, 2018 at 11:44 PM, Karthik Ramasamy <
>> kart...@streaml.io>
>> >> > wrote:
>> >> >
>> >> > > Instead - if it references
>> >> > >
>> >> > > topology name + component name + key range
>> >> > >
>> >> > > will it be better?
>> >> > >
>> >> > > cheers
>> >> > > /karthik
>> >> > >
>> >> > >
>> >> > > On Fri, May 4, 2018 at 11:23 PM, Ning Wang <wangnin...@gmail.com>
>> >> wrote:
>> >> > >
>> >> > > > Currently I think each Instance serializes the state object into
>> a
>> >> byte
>> >> > > > array and checkpoint manager saves the byte array into a file.
>> The
>> >> file
>> >> > > is
>> >> > > > referenced by topology name + component name + instance id.
>> >> > > >
>> >> > > > On Fri, May 4, 2018 at 11:10 PM, Karthik Ramasamy <
>> >> kart...@streaml.io>
>> >> > > > wrote:
>> >> > > >
>> >> > > > > I am not sure I understand why the state is tied to an
>> instance?
>> >> > > > >
>> >> > > > > cheers
>> >> > > > > /karthik
>> >> > > > >
>> >> > > > > On Fri, May 4, 2018 at 4:36 PM, Thomas Cooper <
>> >> > tom.n.coo...@gmail.com>
>> >> > > > > wrote:
>> >> > > > >
>> >> > > > > > Yeah, state recovery is a bit more difficult with Heron's
>> >> > > architecture.
>> >> > > > > In
>> >> > > > > > Storm, the task IDs are not just values used for routing they
>> >> > > actually
>> >> > > > > > equate to a task instance within the executor. An executor
>> which
>> >> > > > > currently
>> >> > > > > > processes the keys 4-8 actually contains 5 task instances of
>> the
>> >> > same
>> >> > > > > > component. So for each task, they just save its state
>> attached
>> >> to
>> >> > the
>> >> > > > > > single task ID and reassemble executors with the new task
>> >> > instances.
>> >> > > > > >
>> >> > > > > > We don't want or have to do that with Heron instances but we
>> >> would
>> >> > > need
>> >> > > > > to
>> >> > > > > > have some way to have a state change tied to the task (or
>> >> routing
>> >> > key

Re: Heron OSS Sync

2018-05-08 Thread Ning Wang
Got it. Thanks!

Yeah. Candidate is our goal.


On Tue, May 8, 2018 at 4:21 PM, Dave Fisher <dave2w...@comcast.net> wrote:

> Thanks for the quick update!
>
> > On May 8, 2018, at 4:08 PM, Ning Wang <wangnin...@gmail.com> wrote:
> >
> > And here is a brief notes:
> >
> >
> >
> > - Our current focus is to have our first Apache release by the end of
> >   this week (we should be pretty much ready for it)
>
> If you mean have the first release candidate ready for voting by the end
> of the week this is achievable. What’s not achievable is completing the
> VOTE process and making the first release. You should consider this an
> indeterminate process until the community gets things correct in the Apache
> Way.
>
> (1) The project will need to take at least 72 hours to review the
> candidate.
> (2) Then the IPMC needs to VOTE on general@. That takes another minimum
> of 72 hours. Often more.
>
> To pass requires 3 +1 IPMC votes. We review the source and binaries to
> make sure that all files have license text and that the NOTICE and LICENSE
> is correct for the source and binary.
>
> Regards,
> Dave
>
> > - Heron webpage needs some update and reorg
> > - Oracle can’t host the meetup next week. Rescheduling.
> > - We need more blogs. Karthik will send out some ideas.
> > - Please review Saikat’s machine learning support proposal:
> >   https://docs.google.com/document/d/1LrO7XRcMxJoMM83wjRd-Ov74VAaomA_mXOAhCStgGng/edit
> > - Stateful processing is in progress. Found two issues and working on them
> >   (state data removal and hadoop config)
> > - Investigating stuck stmgr issue
> > - Working on the model to predict BP when traffic increases
> > - Got a streamlet bug report from a user about missing acks.
> > - New Ubuntu 18.04 has Python 3 only. We need to migrate
> > - A hands-on session will be scheduled by the end of June (sree)
> > - Security concern in downloader/extractor java code. Sree and karthik
> >   to sync up.
> > - CDCI
> >
> > The google doc is here:
> > https://docs.google.com/document/d/1cTIBq3jOVRTSR0Zd5OKK20OqwT2l90xXiY_HssVo8mE/edit?ts=5aa84932#heading=h.nq0bjo3oqwfy
> > Please feel free to reply/comment if anything is missing.
> >
> >
> >
> >
> > On Tue, May 8, 2018 at 1:15 PM, Ning Wang <wangnin...@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> The heron OSS sync meeting will be happening today at 2.00 pm PST.
> Please
> >> use the following hangout link:
> >> https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-
> >> sync?authuser=0
> >>
> >>
> >> See you all then.
> >>
> >>
>
>


Heron OSS Sync

2018-05-08 Thread Ning Wang
Hi,

The heron OSS sync meeting will be happening today at 2.00 pm PST. Please
use the following hangout link:
https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0


See you all then.


Re: Heron OSS Sync

2018-05-08 Thread Ning Wang
Agreed. :)

On Tue, May 8, 2018 at 4:49 PM, P. Taylor Goetz <ptgo...@gmail.com> wrote:

> +1 to everything Dave said.
>
> Your first Apache release will likely be the hardest. At times it may even
> seem like hazing. You may get a lot of feedback that seems like nitpicking.
>
> That is not the case. The goal is to make sure Apache Heron
> knows how to make compliant releases and will continue to do so after
> graduation.
>
> Don’t get discouraged. Lean on your mentors/advisors.
>
> -Taylor
>
> > On May 8, 2018, at 7:21 PM, Dave Fisher <dave2w...@comcast.net> wrote:
> >
> > Thanks for the quick update!
> >
> >> On May 8, 2018, at 4:08 PM, Ning Wang <wangnin...@gmail.com> wrote:
> >>
> >> And here is a brief notes:
> >>
> >>
> >>
> >> - Our current focus is to have our first Apache release by the end of
> >>   this week (we should be pretty much ready for it)
> >
> > If you mean have the first release candidate ready for voting by the end
> of the week this is achievable. What’s not achievable is completing the
> VOTE process and making the first release. You should consider this an
> indeterminate process until the community gets things correct in the Apache
> Way.
> >
> > (1) The project will need to take at least 72 hours to review the
> candidate.
> > (2) Then the IPMC needs to VOTE on general@. That takes another minimum
> of 72 hours. Often more.
> >
> > To pass requires 3 +1 IPMC votes. We review the source and binaries to
> make sure that all files have license text and that the NOTICE and LICENSE
> is correct for the source and binary.
> >
> > Regards,
> > Dave
> >
> >> - Heron webpage needs some update and reorg
> >> - Oracle can’t host the meetup next week. Rescheduling.
> >> - We need more blogs. Karthik will send out some ideas.
> >> - Please review Saikat’s machine learning support proposal:
> >>   https://docs.google.com/document/d/1LrO7XRcMxJoMM83wjRd-Ov74VAaomA_mXOAhCStgGng/edit
> >> - Stateful processing is in progress. Found two issues and working on them
> >>   (state data removal and hadoop config)
> >> - Investigating stuck stmgr issue
> >> - Working on the model to predict BP when traffic increases
> >> - Got a streamlet bug report from a user about missing acks.
> >> - New Ubuntu 18.04 has Python 3 only. We need to migrate
> >> - A hands-on session will be scheduled by the end of June (sree)
> >> - Security concern in downloader/extractor java code. Sree and karthik
> >>   to sync up.
> >> - CDCI
> >>
> >> The google doc is here:
> >> https://docs.google.com/document/d/1cTIBq3jOVRTSR0Zd5OKK20OqwT2l90xXiY_HssVo8mE/edit?ts=5aa84932#heading=h.nq0bjo3oqwfy
> >> Please feel free to reply/comment if anything is missing.
> >>
> >>
> >>
> >>
> >>> On Tue, May 8, 2018 at 1:15 PM, Ning Wang <wangnin...@gmail.com>
> wrote:
> >>>
> >>> Hi,
> >>>
> >>> The heron OSS sync meeting will be happening today at 2.00 pm PST.
> Please
> >>> use the following hangout link:
> >>> https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-
> >>> sync?authuser=0
> >>>
> >>>
> >>> See you all then.
> >>>
> >>>
> >
>


Re: Stateful updating and deterministic routing

2018-05-07 Thread Ning Wang
I see. Then the doc I was reading might not be it either.

I will ask maosong then.

Thanks for the info!

On Mon, May 7, 2018 at 9:23 AM, Bill Graham <billgra...@gmail.com> wrote:

> Yeah, that's not it. The stateful scaling part of that doc got lengthy
> enough that we broke it into a doc of its own, per Sanjeev's suggestion
> IIRC. The fact that I can't locate it makes me think it was a twitter doc
> of mine (although it was not Twitter-specific), which I'm sure was shared
> with Sanjeev, Maosong and probably Karthik. If you can find it, please
> share.
>
> On Mon, May 7, 2018 at 12:41 AM, Ning Wang <wangnin...@gmail.com> wrote:
>
>> Thanks Karthik. The doc is not exactly the same but close enough.
>>
>> It seems my doc is an internal one so let's use your doc as reference. I
>> will see if there are any major differences and comment.
>>
>> On Sun, May 6, 2018 at 1:22 PM, Karthik Ramasamy <kart...@streaml.io>
>> wrote:
>>
>>> Here it is
>>>
>>> https://docs.google.com/document/d/1YDFNvLTX6Sg3WDrNFKiWLaJv
>>> uEtK4eyxEaA0w9cVlG4/edit#heading=h.d6uy2uxfs2xq
>>>
>>> cheers
>>> /karthik
>>>
>>>
>>> On Sun, May 6, 2018 at 8:20 AM, Bill Graham <billgra...@gmail.com>
>>> wrote:
>>>
>>>> Can you share the doc please?
>>>>
>>>> On Sat, May 5, 2018 at 4:18 PM Ning Wang <wangnin...@gmail.com> wrote:
>>>>
>>>> > Thanks.
>>>> >
>>>> > Yeah I have read the design doc. It has a section for scaling and
>>>> covers
>>>> > some designs but not reaching this level of details I am afraid.
>>>> >
>>>> > On Sat, May 5, 2018 at 9:45 AM, Bill Graham <billgra...@gmail.com>
>>>> wrote:
>>>> >
>>>> >> The stateful processing design included a large section on scaling,
>>>> which
>>>> >> was intended to be done as a future phase. It's very similar to
>>>> what's
>>>> >> being described. Sanjeev and I worked on it about 1.5 years ago with
>>>> >> Maosong and it was in a Google doc. Sanjeev, do you have that design doc? I
>>>> >> can't seem to locate it.
>>>> >>
>>>> >> On Sat, May 5, 2018 at 12:03 AM, Ning Wang <wangnin...@gmail.com>
>>>> wrote:
>>>> >>
>>>> >> > If we go this way, we need key -> state map for each component so
>>>> that
>>>> >> the
>>>> >> > state data can be repartitioned.
>>>> >> >
>>>> >> > On Fri, May 4, 2018 at 11:44 PM, Karthik Ramasamy <
>>>> kart...@streaml.io>
>>>> >> > wrote:
>>>> >> >
>>>> >> > > Instead - if it references
>>>> >> > >
>>>> >> > > topology name + component name + key range
>>>> >> > >
>>>> >> > > will it be better?
>>>> >> > >
>>>> >> > > cheers
>>>> >> > > /karthik
>>>> >> > >
>>>> >> > >
>>>> >> > > On Fri, May 4, 2018 at 11:23 PM, Ning Wang <wangnin...@gmail.com
>>>> >
>>>> >> wrote:
>>>> >> > >
>>>> >> > > > Currently I think each Instance serializes the state object
>>>> into a
>>>> >> byte
>>>> >> > > > array and checkpoint manager saves the byte array into a file.
>>>> The
>>>> >> file
>>>> >> > > is
>>>> >> > > > referenced by topology name + component name + instance id.
>>>> >> > > >
>>>> >> > > > On Fri, May 4, 2018 at 11:10 PM, Karthik Ramasamy <
>>>> >> kart...@streaml.io>
>>>> >> > > > wrote:
>>>> >> > > >
>>>> >> > > > > I am not sure I understand why the state is tied to an
>>>> instance?
>>>> >> > > > >
>>>> >> > > > > cheers
>>>> >> > > > > /karthik
>>>> >> > > > >
>>>> >> > > > > On Fri, May 4, 2018 at 4:36 PM, Thomas Cooper <

Re: Stateful updating and deterministic routing

2018-05-05 Thread Ning Wang
Thanks.

Yeah, I have read the design doc. It has a section on scaling and covers
some designs, but not at this level of detail, I am afraid.

On Sat, May 5, 2018 at 9:45 AM, Bill Graham <billgra...@gmail.com> wrote:

> The stateful processing design included a large section on scaling, which
> was intended to be done as a future phase. It's very similar to what's
> being described. Sanjeev and I worked on it about 1.5 years ago with
> Maosong and it was in a Google doc. Sanjeev, do you have that design doc? I
> can't seem to locate it.
>
> On Sat, May 5, 2018 at 12:03 AM, Ning Wang <wangnin...@gmail.com> wrote:
>
> > If we go this way, we need key -> state map for each component so that
> the
> > state data can be repartitioned.
> >
> > On Fri, May 4, 2018 at 11:44 PM, Karthik Ramasamy <kart...@streaml.io>
> > wrote:
> >
> > > Instead - if it references
> > >
> > > topology name + component name + key range
> > >
> > > will it be better?
> > >
> > > cheers
> > > /karthik
> > >
> > >
> > > On Fri, May 4, 2018 at 11:23 PM, Ning Wang <wangnin...@gmail.com>
> wrote:
> > >
> > > > Currently I think each Instance serializes the state object into a
> byte
> > > > array and checkpoint manager saves the byte array into a file. The
> file
> > > is
> > > > referenced by topology name + component name + instance id.
> > > >
> > > > On Fri, May 4, 2018 at 11:10 PM, Karthik Ramasamy <
> kart...@streaml.io>
> > > > wrote:
> > > >
> > > > > I am not sure I understand why the state is tied to an instance?
> > > > >
> > > > > cheers
> > > > > /karthik
> > > > >
> > > > > On Fri, May 4, 2018 at 4:36 PM, Thomas Cooper <
> > tom.n.coo...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Yeah, state recovery is a bit more difficult with Heron's
> > > architecture.
> > > > > In
> > > > > > Storm, the task IDs are not just values used for routing they
> > > actually
> > > > > > equate to a task instance within the executor. An executor which
> > > > > currently
> > > > > > processes the keys 4-8 actually contains 5 task instances of the
> > same
> > > > > > component. So for each task, they just save its state attached to
> > the
> > > > > > single task ID and reassemble executors with the new task
> > instances.
> > > > > >
> > > > > > We don't want or have to do that with Heron instances but we
> would
> > > need
> > > > > to
> > > > > > have some way to have a state change tied to the task (or routing
> > key
> > > > if
> > > > > we
> > > > > > go to the key range idea). For something like a word count you
> > might
> > > > save
> > > > > > counts using a nested map like: { routing key : {word : count }}.
> > The
> > > > > > routing key could be included in the Tuple instance. However,
> > whether
> > > > > this
> > > > > > pattern would work for more generic state cases I don't know?
> > > > > >
> > > > > > Tom Cooper
> > > > > > W: www.tomcooper.org.uk  | Twitter: @tomncooper
> > > > > > <https://twitter.com/tomncooper>
> > > > > >
> > > > > >
> > > > > > On Fri, 4 May 2018 at 15:54, Neng Lu <freen...@gmail.com> wrote:
> > > > > >
> > > > > > > +1 for this idea. As long as the predefined key space is large
> > > > enough,
> > > > > it
> > > > > > > should work for most of the cases.
> > > > > > >
> > > > > > > Based on my experience with topologies, I never saw one
> component
> > > has
> > > > > > more
> > > > > > > than 1000 instances in a topology.
> > > > > > >
> > > > > > > For recovering states from an update, there will be some
> problems
> > > > > though.
> > > > > > > Since the states stored in heron are strongly connected with
> each
> > > > > > instance,
> > > > > > > we either need to have
> > > > > > > some resolver does th

Re: Heron github emails [VOTE RESULTS]

2018-05-18 Thread Ning Wang
+1!

On Fri, May 18, 2018 at 7:28 PM, Karthik Ramasamy 
wrote:

> +1 as well.
>
> On Fri, May 18, 2018 at 7:18 PM Josh Fischer  wrote:
>
> > We have 7 positive votes and no vetoes. I will take the necessary steps to
> > get this change completed.
> >
> > Thank you for the guidance Dave.
> >
> > - Josh
> >
> > On Fri, May 18, 2018 at 4:18 PM, Dave Fisher 
> > wrote:
> >
> > > Hi Josh,
> > >
> > > Reply to this thread with [VOTE RESULTS] in the message subject.
> > >
> > > You would then fill out an INFRA JIRA requesting the change. Use
> > > lists.apache.org to get permalinks to the first email and the results
> > > email. Include those in the JIRA.
> > >
> > > Regards,
> > > Dave
> > >
> > >
> > > Sent from my iPhone
> > >
> > > > On May 18, 2018, at 11:21 AM, Josh Fischer 
> > wrote:
> > > >
> > > > To All,
> > > >
> > > > What needs to be done to prevent the duplicate gitbox emails sent
> every
> > > > time there is activity on github?  I would be happy to complete this
> > > task.
> > > >
> > > > -Josh
> > > >
> > > > On Wed, May 2, 2018 at 12:53 PM, Eren Avsarogullari <
> > > > erenavsarogull...@gmail.com> wrote:
> > > >
> > > >> +1
> > > >>
> > > >>> On Wed, May 2, 2018, 18:50 Neng Lu  wrote:
> > > >>>
> > > >>> +1
> > > >>>
> > > >>> On Tue, May 1, 2018 at 7:41 PM, Jake Farrell 
> > > >> wrote:
> > > >>>
> > >  +1
> > > 
> > >  -Jake
> > > 
> > >  On Tue, May 1, 2018 at 10:00 PM, Chris Kellogg <
> cckell...@gmail.com
> > >
> > >  wrote:
> > > 
> > > > +1
> > > >
> > > > On Tue, May 1, 2018 at 4:45 PM, Karthik Ramasamy <
> > kart...@streaml.io
> > > >>>
> > > > wrote:
> > > >
> > > >> +1 as well.
> > > >>
> > > >> On Tue, May 1, 2018 at 3:53 PM Josh Fischer <
> j...@joshfischer.io>
> > >  wrote:
> > > >>
> > > >>> I’m on board with sending git commits to commits@.
> > > >>>
> > > >>> +1
> > > >>>
> > > >>> On Tue, May 1, 2018 at 5:20 PM Bill Graham <
> billgra...@gmail.com
> > > >>>
> > > > wrote:
> > > >>>
> > >  Hi,
> > > 
> > >  Ever since migrating to Apache's gitbox, the dev list has been
> > >  dominated by gitbox comment emails. See
> > >  https://lists.apache.org/list.html?d...@heron.apache.org
> > > 
> > >  It seems a bit heavy-handed (i.e. spammy) to have every comment
> > >  emailed to the dev list, especially when git has features to let
> > >  people control their notifications. What do people think about
> > >  disabling the git comments to the dev list? Or we could have the
> > >  git comments sent to commits@ instead.
> > > 
> > >  Thoughts?
> > > 
> > >  Bill
> > > 
> > > >>> --
> > > >>> Sent from A Mobile Device
> > > >>>
> > > >>
> > > >
> > > 
> > > >>>
> > > >>
> > >
> > >
> >
>


Re: Heron OSS Sync

2018-05-23 Thread Ning Wang
Regarding the Apache release, Sijie responded with the following valuable
information:

The general guideline of release:
http://www.apache.org/dev/release-publishing.html
An example release process (bookkeeper):
http://bookkeeper.apache.org/community/release_guide/

For heron release, I think you need to take care of the following 3 parts:
- For java artifacts (e.g. heron api), you need to publish the java
artifacts to apache artifactory. So you need to enable Nexus access for
heron. File a JIRA to INFRA - example: [INFRA-14694 Enable Nexus Access For
Pulsar - ASF JIRA](https://issues.apache.org/jira/browse/INFRA-14694)
- You need to have a location on apache dist to host those src/binary
packages. File a JIRA to INFRA - example: [INFRA-13024 Setup dists for
Apache DistributedLog - ASF JIRA](
https://issues.apache.org/jira/browse/INFRA-13024)
- website/documentation. That’s not related to releases, but you
eventually need to put the content under http://heron.incubator.apache.org/
, which you need to enable gitpubsub. So when you put the generated website
content under `asf-site` branch, the website is automatically published to
http://heron.incubator.apache.org/ . File a JIRA to INFRA - example:
[INFRA-14587 Enable Gitpubsub for Apache Pulsar (incubating) - ASF JIRA](
https://issues.apache.org/jira/browse/INFRA-14587)


Here are the Apache Infra tickets I just created accordingly:
Nexus access: https://issues.apache.org/jira/browse/INFRA-16560
Dists: https://issues.apache.org/jira/browse/INFRA-16561
gitpubsub: https://issues.apache.org/jira/browse/INFRA-16562

Please feel free to comment if I missed anything or there are any mistakes.








On Tue, May 22, 2018 at 3:20 PM, Ning Wang <wangnin...@gmail.com> wrote:

> Brief notes:
>
>
>
>
>
> - Not much progress in the new release so far. Twitter folks will reach
>   out to Bill/Sijie for suggestions.
> - Productionize stateful process in Twitter. Found an issue with local
>   checkpoint expiration.
> - Downloader has been refactored to be more flexible.
> - Discussed zombie aurora container (causing duplicated stmgr issue).
> - Async ack/emit/fail PR is green to merge. To document the
>   requirements/limitations.
> - Traffic prediction model is ongoing.
> - There are many old issues/PRs to clean up.
>
> On Tue, May 22, 2018 at 1:21 PM, Ning Wang <wangnin...@gmail.com> wrote:
>
>> Hi,
>>
>> The heron OSS sync meeting will be happening today at 2.00 pm PDT.
>> Please use the following hangout link:
>> https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync
>> ?authuser=0
>>
>>
>> See you all then.
>>
>>
>


Heron OSS Sync

2018-05-22 Thread Ning Wang
Hi,

The heron OSS sync meeting will be happening today at 2.00 pm PDT. Please
use the following hangout link:
https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0


See you all then.


Re: Heron OSS Sync

2018-05-22 Thread Ning Wang
Brief notes:





- Not much progress in the new release so far. Twitter folks will reach
  out to Bill/Sijie for suggestions.
- Productionize stateful process in Twitter. Found an issue with local
  checkpoint expiration.
- Downloader has been refactored to be more flexible.
- Discussed zombie aurora container (causing duplicated stmgr issue).
- Async ack/emit/fail PR is green to merge. To document the
  requirements/limitations.
- Traffic prediction model is ongoing.
- There are many old issues/PRs to clean up.

On Tue, May 22, 2018 at 1:21 PM, Ning Wang <wangnin...@gmail.com> wrote:

> Hi,
>
> The heron OSS sync meeting will be happening today at 2.00 pm PDT. Please
> use the following hangout link:
> https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync
> ?authuser=0
>
>
> See you all then.
>
>


ML in Heron weekly meeting

2018-06-08 Thread Ning Wang
Brief notes for today's meeting:

- Review DD:
https://docs.google.com/document/d/1LrO7XRcMxJoMM83wjRd-Ov74VAaomA_mXOAhCStgGng/edit
- We want to understand better about the bigger picture of ML in stream
processing systems.
  -- talk to ML users
  -- doc of related systems to read:
---
https://mapr.com/blog/monitoring-real-time-uber-data-using-spark-machine-learning-streaming-and-kafka-api-part-2/
---
https://databricks.com/blog/2018/06/05/introducing-mlflow-an-open-source-machine-learning-platform.html
--- https://eng.uber.com/michelangelo/


Heron OSS Sync

2018-06-18 Thread Ning Wang
Hi,

The heron OSS sync meeting will be happening tomorrow at 2.00 pm PDT.
Please use the following hangout link:
https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0


See you all then.


Re: Heron OSS Sync

2018-06-18 Thread Ning Wang
Thanks! You are right!!! Let me send it again.



On Mon, Jun 18, 2018 at 4:55 PM, Jerry Peng 
wrote:

> Ning,
>
> I think you mean tomorrow correct?
>
> On Mon, Jun 18, 2018 at 4:54 PM Ning Wang  wrote:
>
> > Hi,
> >
> > The heron OSS sync meeting will be happening today at 2.00 pm PDT. Please
> > use the following hangout link:
> > https://hangouts.google.com/hangouts/_/streaml.io/oss-
> heron-sync?authuser=0
> >
> >
> > See you all then.
> >
>


Heron OSS Sync

2018-06-18 Thread Ning Wang
Hi,

The heron OSS sync meeting will be happening today at 2.00 pm PDT. Please
use the following hangout link:
https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0


See you all then.


Re: Heron OSS Sync

2018-06-19 Thread Ning Wang
Brief notes for today's sync up meeting:


   - Apache release is still in progress. Jerry, Ali and Ning will sync
   up about the previous release process first.
   - Summingbird team reviewed streamlet API. Will discuss more about
   the window operation and stateful storage.
   - Huijun adding toggleable switch into Dhalion.
   - Yao working on new integration test for topology structure.
   - Fario started to take the traffic modeling project from Thomas.



On Mon, Jun 18, 2018 at 8:58 PM, Ning Wang  wrote:

> Thanks for asking.
>
> Normally we talk about these two items in the meeting:
> - Updates
> - There could be something we want to discuss briefly or schedule further
> discussions.
>
>
>
> On Mon, Jun 18, 2018 at 5:00 PM, Dave Fisher 
> wrote:
>
>> Hi -
>>
>> Is there an agenda?
>>
>> Regards,
>> Dave
>>
>> Sent from my iPhone
>>
>> > On Jun 18, 2018, at 4:56 PM, Ning Wang  wrote:
>> >
>> > Hi,
>> >
>> > The heron OSS sync meeting will be happening tomorrow at 2.00 pm PDT.
>> > Please use the following hangout link:
>> > https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-
>> sync?authuser=0
>> >
>> >
>> > See you all then.
>>
>
>


Re: Heron OSS Sync

2018-06-05 Thread Ning Wang
Also, thanks a lot for the information Dave! That is very helpful.

We were a bit lost and not sure what to ask (the related information is
quite overwhelming and a bit sparse), so we asked people with previous
experience for some info. Sorry if that is not the recommended way. We
will try to follow the standard procedure.

On Mon, Jun 4, 2018 at 1:31 PM, Dave Fisher  wrote:

> Hi Herons,
>
> Your Mentors should be available to answer these type of release
> questions. You need to ask these straight up on the dev@heron mailing
> list and not go to people within one of the companies working on related
> projects. The Incubator has additional rules beyond Apache’s Release Policy
> [1]
>
> Typically you would put [MENTORS] at the beginning of the subject. Justin
> Mclean gave a talk at Apachecon NA last year. [2]
>
> Do you all understand? If it doesn’t happen on the mailing list it didn’t
> happen is for a reason.
>
> I also suggest that if you are going to continue these sync ups that you
> provide an agenda at least 24 hours in advance. This can actually serve as
> calls for discussion. It is important to do this. You might find that your
> questions will be answered before the “sync up” by participating in the
> global asynchronous advantage of emails!
>
> Please think about this!
>
> Regards,
> Dave
>
> [1] https://incubator.apache.org/guides/releasemanagement.html
> [2] https://www.youtube.com/watch?v=I0-lp1t9ee0
>
>
> On May 22, 2018, at 11:22 PM, Ning Wang  wrote:
>
> Regarding the Apache release. Sijie responded with the following valuable
> information:
>
> The general guideline of release:
> http://www.apache.org/dev/release-publishing.html
> An example release process (bookkeeper):
> http://bookkeeper.apache.org/community/release_guide/
>
> For heron release, I think you need to take care of following 3 parts:
>- For java artifacts (e.g heron api), you need to publish the java
> artifacts to apache artifactory. So you need to enable Nexus access for
> heron. File a JIRA to INFRA - example: [INFRA-14694 Enable Nexus Access For
> Pulsar - ASF JIRA](https://issues.apache.org/jira/browse/INFRA-14694)
>- You need to have a location on apache dist to host those src/binary
> packages. File a JIRA to INFA - example: [INFRA-13024 Setup dists for
> Apache DistributedLog - ASF JIRA](
> https://issues.apache.org/jira/browse/INFRA-13024)
>- website/documentation. That’s not related to releases, but you
> eventually need to put the content under http://heron.incubator.apache.
> org/
> , which you need to enable gitpubsub. So when you put the generated website
> content under `asf-site` branch, the website is automatically published to
> http://heron.incubator.apache.org/ . File a JIRA to INFRA - example:
> [INFRA-14587 Enable Gitpubsub for Apache Pulsar (incubating) - ASF JIRA](
> https://issues.apache.org/jira/browse/INFRA-14587)
>
>
> Here are the Apache Infra tickets I just created accordingly:
> Nexus access: https://issues.apache.org/jira/browse/INFRA-16560
> Dists: https://issues.apache.org/jira/browse/INFRA-16561
> gitpubsub: https://issues.apache.org/jira/browse/INFRA-16562
>
> Please feel free to comment if I missed anything or there is any mistakes.
>
>
>
>
>
>
>
>
> On Tue, May 22, 2018 at 3:20 PM, Ning Wang  wrote:
>
> Brief notes:
>
>
>
>
>
> - Not much progress in the new release so far. Twitter folks will reach
>   out to Bill/Sijie for suggestions.
> - Productionize stateful process in Twitter. Found an issue with local
>   checkpoint expiration.
> - Downloader has been refactored to be more flexible.
> - Discussed zombie aurora container (causing duplicated stmgr issue).
> - Async ack/emit/fail PR is green to merge. To document the
>   requirements/limitations.
> - Traffic prediction model is ongoing.
> - There are many old issues/PRs to clean up.
>
> On Tue, May 22, 2018 at 1:21 PM, Ning Wang  wrote:
>
> Hi,
>
> The heron OSS sync meeting will be happening today at 2.00 pm PDT.
> Please use the following hangout link:
> https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync
> ?authuser=0
>
>
> See you all then.
>
>
>
>
>


Re: Heron OSS Sync

2018-06-04 Thread Ning Wang
Thanks!

On Mon, Jun 4, 2018 at 1:31 PM, Dave Fisher  wrote:

> Hi Herons,
>
> Your Mentors should be available to answer these type of release
> questions. You need to ask these straight up on the dev@heron mailing
> list and not go to people within one of the companies working on related
> projects. The Incubator has additional rules beyond Apache’s Release Policy
> [1]
>
> Typically you would put [MENTORS] at the beginning of the subject. Justin
> Mclean gave a talk at Apachecon NA last year. [2]
>
> Do you all understand? If it doesn’t happen on the mailing list it didn’t
> happen is for a reason.
>
> I also suggest that if you are going to continue these sync ups that you
> provide an agenda at least 24 hours in advance. This can actually serve as
> calls for discussion. It is important to do this. You might find that your
> questions will be answered before the “sync up” by participating in the
> global asynchronous advantage of emails!
>
> Please think about this!
>
> Regards,
> Dave
>
> [1] https://incubator.apache.org/guides/releasemanagement.html
> [2] https://www.youtube.com/watch?v=I0-lp1t9ee0
>
>
> On May 22, 2018, at 11:22 PM, Ning Wang  wrote:
>
> Regarding the Apache release. Sijie responded with the following valuable
> information:
>
> The general guideline of release:
> http://www.apache.org/dev/release-publishing.html
> An example release process (bookkeeper):
> http://bookkeeper.apache.org/community/release_guide/
>
> For heron release, I think you need to take care of following 3 parts:
>- For java artifacts (e.g heron api), you need to publish the java
> artifacts to apache artifactory. So you need to enable Nexus access for
> heron. File a JIRA to INFRA - example: [INFRA-14694 Enable Nexus Access For
> Pulsar - ASF JIRA](https://issues.apache.org/jira/browse/INFRA-14694)
>- You need to have a location on apache dist to host those src/binary
> packages. File a JIRA to INFA - example: [INFRA-13024 Setup dists for
> Apache DistributedLog - ASF JIRA](
> https://issues.apache.org/jira/browse/INFRA-13024)
>- website/documentation. That’s not related to releases, but you
> eventually need to put the content under http://heron.incubator.apache.
> org/
> , which you need to enable gitpubsub. So when you put the generated website
> content under `asf-site` branch, the website is automatically published to
> http://heron.incubator.apache.org/ . File a JIRA to INFRA - example:
> [INFRA-14587 Enable Gitpubsub for Apache Pulsar (incubating) - ASF JIRA](
> https://issues.apache.org/jira/browse/INFRA-14587)
>
>
> Here are the Apache Infra tickets I just created accordingly:
> Nexus access: https://issues.apache.org/jira/browse/INFRA-16560
> Dists: https://issues.apache.org/jira/browse/INFRA-16561
> gitpubsub: https://issues.apache.org/jira/browse/INFRA-16562
>
> Please feel free to comment if I missed anything or there is any mistakes.
>
>
>
>
>
>
>
>
> On Tue, May 22, 2018 at 3:20 PM, Ning Wang  wrote:
>
> Brief notes:
>
>
>
>
>
> - Not much progress in the new release so far. Twitter folks will reach
>   out to Bill/Sijie for suggestions.
> - Productionize stateful process in Twitter. Found an issue with local
>   checkpoint expiration.
> - Downloader has been refactored to be more flexible.
> - Discussed zombie aurora container (causing duplicated stmgr issue).
> - Async ack/emit/fail PR is green to merge. To document the
>   requirements/limitations.
> - Traffic prediction model is ongoing.
> - There are many old issues/PRs to clean up.
>
> On Tue, May 22, 2018 at 1:21 PM, Ning Wang  wrote:
>
> Hi,
>
> The heron OSS sync meeting will be happening today at 2.00 pm PDT.
> Please use the following hangout link:
> https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync
> ?authuser=0
>
>
> See you all then.
>
>
>
>
>


Heron OSS Sync

2018-06-04 Thread Ning Wang
Hi,

The heron OSS sync meeting will be happening tomorrow at 2.00 pm PDT.
Please use the following hangout link:
https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0


See you all then.


Re: Heron OSS Sync

2018-06-05 Thread Ning Wang
Brief notes for today's meeting:


   - We made some progress on the infra side of heron Apache release. Ning
   will put some time on it this week.
   - Thomas is wrapping up the traffic modeling project and aiming at
   releasing the code. It can do short-time pressure prediction now.
   - Last update would be a useful feature to add.
   - It could be useful for us to do a Heron/Flink benchmark and comparison.
   - windowing depends on user's event time extraction currently. We might
   consider making event time a first class citizen.
   - Jerry working on Nomad scheduler
   - Users asked about Streamlet interface and acking mechanism.
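To illustrate the windowing note above: when event time is not a first-class
concept, the user has to supply a function that extracts a timestamp from each
tuple, and the window assignment is computed from that value. The following is
only a minimal sketch of that idea in plain Python, not Heron's windowing API;
the names tumbling_windows and extract_event_time_ms are hypothetical and exist
only for this example.

```python
from collections import defaultdict


def tumbling_windows(events, extract_event_time_ms, window_ms):
    """Group events into fixed-size, non-overlapping windows by event time.

    extract_event_time_ms is the user-supplied extractor (hypothetical name):
    it maps an event to its event-time timestamp in milliseconds.
    Returns a dict mapping window start time -> list of events in that window,
    ordered by window start.
    """
    windows = defaultdict(list)
    for event in events:
        ts = extract_event_time_ms(event)
        # Align the timestamp down to the start of its window.
        window_start = ts - (ts % window_ms)
        windows[window_start].append(event)
    return dict(sorted(windows.items()))
```

If event time were a first-class field of the tuple, the extractor argument
could presumably be dropped and the framework would read the timestamp
directly.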



On Mon, Jun 4, 2018 at 5:05 PM, Ning Wang  wrote:

> Hi,
>
> The heron OSS sync meeting will be happening tomorrow at 2.00 pm PDT.
> Please use the following hangout link:
> https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync
> ?authuser=0
>
>
> See you all then.
>
>


Re: [DISCUSS] A design proposal for incorporating machine learning algorithms into heron

2018-06-22 Thread Ning Wang
My thoughts:

1. sounds good!
2. I feel it might be better to be separated so we can focus on one problem
each time.
3. depending on how hard it is to add in future I feel.
4. not sure.


On Wed, May 9, 2018 at 7:39 AM, Saikat Kanjilal  wrote:

> FYI for those who don't know about Michelangelo:
> https://eng.uber.com/michelangelo/
>
> 
> From: Saikat Kanjilal 
> Sent: Wednesday, May 9, 2018 7:35 AM
> To: dev@heron.incubator.apache.org; Karthik Ramasamy
> Subject: Re: [DISCUSS] A design proposal for incorporating machine
> learning algorithms into heron
>
> Hi Folks,
>
> I was thinking about how to drive this initiative and had some ideas
> around execution, would love some feedback:
>
> 1) While the discussion is happening around the design I was thinking of
> building a little prototype with one of the algorithms , the prototype will
> be a first cut representation of the design where we represent one
> algorithm into a storm topology, when I look at the list of algorithms that
> we're thinking about bringing over from samoa
> (https://samoa.incubator.apache.org/documentation/SAMOA-and-Machine-Learning.html)
> the distributed stream clustering looks the most valuable for a prototype,
> thoughts?
>
> 2) I would like to leverage some of the ideas in Michelangelo as well as
> my previous experience in building a tool that versions, deploys and
> associates ML models with newly arriving windows of data, in actuality I
> feel like this is a completely orthogonal initiative that we also need to
> design out, should this be part of the design doc at this point, thoughts?
>
> 3) Should we address security in streaming machine learning models for the
> first release?
>
> 4) The design doc mentions a GenericMLOutputModelSink, I was thinking this
> is like a factory method in that it has underlying representations of
> various sinks that already exist that I'm hoping to leverage, see here:
> https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_storm-component-guide/content/ch_storm-connectors.html
>
>
>
> @Karthik Ramasamy et al., would love to get
> thoughts on how we proceed with this initiative at this point, in the
> meantime I will get started with 1 to test out the feasibility of this
> design.
>
> Regards
>
> 
> From: Saikat Kanjilal 
> Sent: Monday, May 7, 2018 2:31 PM
> To: dev@heron.incubator.apache.org
> Subject: [DISCUSS] A design proposal for incorporating machine learning
> algorithms into heron
>
>
> Hello Dev community,
>
> I have created the initial API design documentation around building storm
> topologies around a set of machine learning streaming algorithms here:
> https://docs.google.com/document/d/1LrO7XRcMxJoMM83wjRd-Ov74VAaomA_mXOAhCStgGng/edit?usp=sharing
> This is very much a work in progress, but I wanted to start getting early
> feedback from the community as it's a lot of complex operations
> representing a streaming ml pipeline
> using heron.   This design leverages apache samoa to figure out which
> algorithms to focus on in bringing into heron.
>
> Thank you Karthik Ramasamy for your mentoring on this, the goal will be to
> 

Re: Building error

2018-06-24 Thread Ning Wang
Thanks! :D

On Sun, Jun 24, 2018 at 6:38 AM, Oliver Bristow 
wrote:

> The scala rules repo has a new release which uses HTTP instead, so at least
> builds can pass for now with the new hash (no new tagged release). I put in
> a PR to update to it https://github.com/apache/incubator-heron/pull/2937
>
> I emailed Lightbend about their cert and hosting on GitHub, so at least
> they should be aware of it once the email reaches someone who knows what to
> do, so hopefully it will be sorted some time next week, but that may not be
> soon enough, and hope driven development isn't great.
>
> On 24 June 2018 at 04:30, Ning Wang  wrote:
>
> > Got it. Thanks~
> >
> > On Sat, Jun 23, 2018 at 4:28 PM, Oliver Bristow <
> > oli...@oliverbristow.co.uk>
> > wrote:
> >
> > > There's an issue <https://github.com/bazelbuild/rules_scala/issues/532
> >
> > on
> > > the remote repo for it, hopefully they'll fix their cert soon - it
> > expired
> > > 24 hours ago. Longer term may be for them to get that archive on and
> from
> > > GitHub
> > >
> > > On 24 June 2018 at 00:13, Ning Wang  wrote:
> > >
> > > > It seems bazel build is failing (
> > > > https://travis-ci.org/apache/incubator-heron/builds/
> > > > 395945899?utm_source=github_status_medium=notification)
> > > > currently because of the following error:
> > > >
> > > > (22:58:33) ERROR:
> > > > /home/travis/.cache/bazel/_bazel_travis/
> be6dac4936703c7eedcb4f5cf38cdd
> > > > 65/external/io_bazel_rules_scala/scala/scala.bzl:1094:3:
> > > > no such package '@scala//': Error downloading
> > > > [https://downloads.lightbend.com/scala/2.11.11/scala-2.11.11.tgz] to
> > > > /home/travis/.cache/bazel/_bazel_travis/
> be6dac4936703c7eedcb4f5cf38cdd
> > > > 65/external/scala/scala-2.11.11.tgz:
> > > > sun.security.validator.ValidatorException: PKIX path validation
> > > > failed: java.security.cert.CertPathValidatorException: validity
> check
> > > > failed and referenced by
> > > > '//external:io_bazel_rules_scala/dependency/scala/scala_library'
> > > >
> > > >
> > > > The website hosing the library (https://downloads.lightbend.
> com/scala)
> > > > seems to be inaccessible, maybe this is related to the error.
> > > >
> > > > Is there a replacement we can use?
> > > >
> > >
> >
>


ML in Heron weekly meeting

2018-06-23 Thread Ning Wang
Brief notes for the meeting on June 22nd:

- still studying the documents.
--- https://mapr.com/blog/monitoring-real-time-uber-data-using-spark-machine-learning-streaming-and-kafka-api-part-2/
--- https://databricks.com/blog/2018/06/05/introducing-mlflow-an-open-source-machine-learning-platform.html
--- https://eng.uber.com/michelangelo/
- stateful storage might need to be improved (data size) to support big
state object which could be required by ML jobs.


Building error

2018-06-23 Thread Ning Wang
It seems bazel build is failing (
https://travis-ci.org/apache/incubator-heron/builds/395945899?utm_source=github_status&utm_medium=notification)
currently because of the following error:

(22:58:33) ERROR:
/home/travis/.cache/bazel/_bazel_travis/be6dac4936703c7eedcb4f5cf38cdd65/external/io_bazel_rules_scala/scala/scala.bzl:1094:3:
no such package '@scala//': Error downloading
[https://downloads.lightbend.com/scala/2.11.11/scala-2.11.11.tgz] to
/home/travis/.cache/bazel/_bazel_travis/be6dac4936703c7eedcb4f5cf38cdd65/external/scala/scala-2.11.11.tgz:
sun.security.validator.ValidatorException: PKIX path validation
failed: java.security.cert.CertPathValidatorException: validity check
failed and referenced by
'//external:io_bazel_rules_scala/dependency/scala/scala_library'


The website hosting the library (https://downloads.lightbend.com/scala)
seems to be inaccessible; maybe this is related to the error.

Is there a replacement we can use?
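
A PKIX "validity check failed" error like the one above generally means the
server certificate is outside its validity window (expired or not yet valid).
One way to confirm that from a workstation is sketched below in Python, using
only the standard ssl and socket modules. This is a rough diagnostic under
stated assumptions, not part of the Heron build: the function names are
hypothetical, and any dates passed to cert_is_expired are up to the caller.

```python
import socket
import ssl
import time


def cert_is_expired(not_after, now=None):
    """Return True if a certificate's notAfter timestamp is in the past.

    not_after uses the format returned in ssl.getpeercert()["notAfter"],
    for example "Jun 23 00:00:00 2018 GMT". now is seconds since the epoch
    and defaults to the current time.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return now > expires


def describe_certificate(host, port=443, timeout=10.0):
    """Connect with normal verification and report what the handshake says.

    Returns the notAfter date when verification succeeds, or the verifier's
    message (for example an expiry notice) when it fails.
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return "valid until " + tls.getpeercert()["notAfter"]
    except ssl.SSLCertVerificationError as err:
        return "verification failed: " + err.verify_message
```

Calling describe_certificate("downloads.lightbend.com") would then show
whether the handshake fails on the certificate itself rather than on the
network path.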


Re: Apache Releases

2018-06-19 Thread Ning Wang
Here is a quick PR to clean up Twitter related content in the
governance/community pages. There are still a few things to be updated
later.

https://github.com/apache/incubator-heron/pull/2926



On Tue, Jun 19, 2018 at 8:57 PM, Ning Wang  wrote:

> Thank you very much!
>
> On Tue, Jun 19, 2018 at 4:22 PM, Dave Fisher 
> wrote:
>
>> Hi -
>>
>> Here are some resources and notes about Apache Releases:
>>
>> Special rules for incubating projects.
>> https://incubator.apache.org/guides/releasemanagement.html
>>
>> General policies.
>> http://www.apache.org/dev/#releases
>>
>> Construction of LICENSE and NOTICE are important. Your Mentors can help,
>> but you have to ask on list.
>> Source release is critical. This is OSS. The only official releases from
>> Apache are completely source code.
>> Download page will be required where the source release is available to
>> users.
>> The Apache process is very much different from how Heron is distributed
>> now.  https://apache.github.io/incubator-heron/docs/getting-started/
>> I recommend learning how to do an Apache Release before understanding how
>> to properly do several binary distributions.
>>
>> No press releases, but you can announce. Sally and the Marketing team at
>> pr...@apache.org can be used to get help with the rules about
>> announcements.
>>
>> About Governance: https://apache.github.io/incubator-heron/docs/
>> contributors/governance/
>> Please update this page on priority as this is NOT Apache governance.
>> Twitter’s CLA does not matter. Apache’s ICLA does. An ICLA is not required
>> for contribution, but is for Committers and PPMC members.
>>
>> Regards,
>> Dave
>>
>
>


Re: Apache Releases

2018-06-19 Thread Ning Wang
Thank you very much!

On Tue, Jun 19, 2018 at 4:22 PM, Dave Fisher  wrote:

> Hi -
>
> Here are some resources and notes about Apache Releases:
>
> Special rules for incubating projects.
> https://incubator.apache.org/guides/releasemanagement.html
>
> General policies.
> http://www.apache.org/dev/#releases
>
> Construction of LICENSE and NOTICE are important. Your Mentors can help,
> but you have to ask on list.
> Source release is critical. This is OSS. The only official releases from
> Apache are completely source code.
> Download page will be required where the source release is available to
> users.
> The Apache process is very much different from how Heron is distributed
> now.  https://apache.github.io/incubator-heron/docs/getting-started/
> I recommend learning how to do an Apache Release before understanding how
> to properly do several binary distributions.
>
> No press releases, but you can announce. Sally and the Marketing team at
> pr...@apache.org can be used to get help with the rules about
> announcements.
>
> About Governance: https://apache.github.io/incubator-heron/
> docs/contributors/governance/
> Please update this page on priority as this is NOT Apache governance.
> Twitter’s CLA does not matter. Apache’s ICLA does. An ICLA is not required
> for contribution, but is for Committers and PPMC members.
>
> Regards,
> Dave
>


Re: ML in Heron weekly meeting

2018-06-30 Thread Ning Wang
Brief notes for the meeting on June 29:

- We need to hook up heron with Apache samoa. Saikat to create new issues
in github.
- Create a slack channel: #machine-learning
- Let's add potential use cases in the design doc:
https://docs.google.com/document/d/1LrO7XRcMxJoMM83wjRd-Ov74VAaomA_mXOAhCStgGng/edit


On Sat, Jun 23, 2018 at 3:44 PM, Ning Wang  wrote:

> Brief notes for the meeting on June 22nd:
>
> - still studying the documents.
> --- https://mapr.com/blog/monitoring-real-time-uber-data-using-
> spark-machine-learning-streaming-and-kafka-api-part-2/
> --- https://databricks.com/blog/2018/06/05/introducing-mlflow-an
> -open-source-machine-learning-platform.html
> --- https://eng.uber.com/michelangelo/
> - stateful storage might need to be improved (data size) to support big
> state object which could be required by ML jobs.
>


Re: Copyright Violations

2018-05-01 Thread Ning Wang
Thanks for bringing it up!

Too many things going on and we haven't got time to update the license text
yet. We will work on it.

This is the first update on the NOTICE file. Does it look ok this way?
https://github.com/apache/incubator-heron/pull/2878

Thanks in advance.



On Tue, May 1, 2018 at 6:53 PM, Dave Fisher  wrote:

> See https://www.apache.org/legal/src-headers.html#faq-moveothercopyright
>
> Copyrights go in the NOTICE and not the source.
>
> Regards,
> Dave
>
> Sent from my iPhone
>
> > On May 1, 2018, at 6:47 PM, Karthik Ramasamy  wrote:
> >
> > Taylor -
> >
> > As Ali pointed out these are pending task items. One of the major task
> that
> > we finished is converting the namespace from com.twitter.heron to
> > org.apache.heron.
> >
> > Sree is working on converting the copyrights to use Apache copyrights.
> One
> > of the question he had is whether the copyright can use one line or it
> > should be multi line.
> >
> > It will be great if you could provide advice on this - so that we can
> > change the copyright accordingly.
> >
> > cheers
> > /karthik
> >
> >
> >> On Tue, May 1, 2018 at 5:42 PM, Ali Ahmed  wrote:
> >>
> >> Hi Taylor ,
> >>
> >> There are some tasks pending in this regard , the goal is to remove all
> >> twitter copyright headers soon , some of the commits are in and some are
> >> remaining .
> >>
> >> -Ali
> >>
> >>> On May 1, 2018, at 5:37 PM, P. Taylor Goetz  wrote:
> >>>
> >>> Heron PPMC,
> >>>
> >>> I’ve mentioned this before, but IMO, this practice needs to stop.
> >>>
> >>> The community default seems to be to apply the Apache license header
> >> with a twitter copyright assertion on all files, even ones copied from
> >> other ALv2-licensed projects.
> >>>
> >>> Twitter can’t assert copyright on code their employees didn’t create,
> >> though, at least in this project, they continue to do so. I find this
> >> practice unacceptable.
> >>>
> >>> Please reconsider even including “Copyright Twitter $date” at all in
> >> license headers. That’s more suitable for the NOTICE file, and removed
> from
> >> the source header.
> >>>
> >>> Aside from having a twitter account, I have no affiliation with
> twitter.
> >> As a mentor I was surprised and disappointed to see a Twitter copyright
> >> applied to my own (implicitly copyrighted) work. That’s not cool, nor
> >> really (IANAL) legal.
> >>>
> >>> I would appreciate if this could be corrected. This kind of thing is
> >> something podlings need to know how to address proactively.
> >>>
> >>> -Taylor
> >>
> >>
>


Re: Copyright Violations

2018-05-02 Thread Ning Wang
Agreed and thanks for the suggestions!

On Wed, May 2, 2018 at 7:52 PM, P. Taylor Goetz <ptgo...@gmail.com> wrote:

> Thank you all for stepping up to correct this.
>
> With my mentor hat on...
>
> Trademark, licensing, and copyright hygiene are very important for ASF
> projects.
>
> I’d encourage everyone to proactively research branding, release, and
> other policies.
>
> ASF documentation can seem pretty scattered, but it is well indexed by
> google (e.g. “Apache X policy”). It’s also open source, so there’s an
> opportunity to improve it.
>
> It’s important that everyone strive to adopt the Apache Way.
>
> -Taylor
>
> > On May 2, 2018, at 1:59 PM, Ning Wang <wangnin...@gmail.com> wrote:
> >
> > Prepared two PRs to update license text so far:
> >
> > https://github.com/apache/incubator-heron/pull/2881
> > https://github.com/apache/incubator-heron/pull/2882
> >
> > Other files to come.
> >
> >
> > On Wed, May 2, 2018 at 10:40 AM, Karthik Ramasamy <kart...@streaml.io>
> > wrote:
> >
> >> Dave -
> >>
> >> Tasks that we are working on are:
> >>
> >> - Convert namespace com.twitter.heron to org.apache.heron - completed
> >> - Copyright on the source files - ongoing
> >> - Binaries are being removed from the code base - ongoing
> >>
> >> Meetup last week was great. We had around 80 people attending and we had
> >> three talks.
> >> Another meetup is being organized by Sree in South Bay.
> >>
> >> Regarding meeting minutes, Ning sends out an update after every sync up.
> >>
> >> cheers
> >> /karthik
> >>
> >>
> >>
> >>
> >>> On Tue, May 1, 2018 at 6:50 PM, Dave Fisher <dave2w...@comcast.net>
> wrote:
> >>>
> >>> Hi Ali,
> >>>
> >>> Where are pending tasks and plans discussed and recorded so that folks
> >>> can find out and participate?
> >>>
> >>> What I am saying is you are having sync ups and not publishing minutes.
> >>> You also announce the sync up 5 minutes beforehand. Decisions need to
> >> come
> >>> to this mailing list.
> >>>
> >>> How was the Meetup last week?
> >>>
> >>> Apache projects are global, which means that asynchronous
> >>> communication and letting the world turn for three days is a common
> >>> rule of the Apache Way.
> >>>
> >>> Regards,
> >>> Dave
> >>>
> >>> Sent from my iPhone
> >>>
> >>>> On May 1, 2018, at 5:42 PM, Ali Ahmed <a.ah...@streaml.io> wrote:
> >>>>
> >>>> Hi Taylor,
> >>>>
> >>>> There are some tasks pending in this regard. The goal is to remove all
> >>>> Twitter copyright headers soon; some of the commits are in and some
> >>>> are remaining.
> >>>>
> >>>> -Ali
> >>>>
> >>>>> On May 1, 2018, at 5:37 PM, P. Taylor Goetz <ptgo...@gmail.com>
> >> wrote:
> >>>>>
> >>>>> Heron PPMC,
> >>>>>
> >>>>> I’ve mentioned this before, but IMO, this practice needs to stop.
> >>>>>
> >>>>> The community default seems to be to apply the Apache license header
> >>> with a twitter copyright assertion on all files, even ones copied from
> >>> other ALv2-licensed projects.
> >>>>>
> >>>>> Twitter can’t assert copyright on code their employees didn’t create,
> >>> though, at least in this project, they continue to do so. I find this
> >>> practice unacceptable.
> >>>>>
> >>>>> Please reconsider even including “Copyright Twitter $date” at all in
> >>> license headers. That’s more suitable for the NOTICE file, and removed
> >> from
> >>> the source header.
> >>>>>
> >>>>> Aside from having a twitter account, I have no affiliation with
> >>> twitter. As a mentor I was surprised and disappointed to see a Twitter
> >>> copyright applied to my own (implicitly copyrighted) work. That’s not
> >> cool,
> >>> nor really (IANAL) legal.
> >>>>>
> >>>>> I would appreciate if this could be corrected. This kind of thing is
> >>> something podlings need to know how to address proactively.
> >>>>>
> >>>>> -Taylor
> >>>>
> >>>
> >>>
> >>
>


Re: Heron OSS Sync

2018-07-03 Thread Ning Wang
Today's brief notes:

- We are testing the DSL API at Twitter. Jerry mentioned some potential
improvements to the builder.
- Found a Dhalion issue (2927).
- Shipped new integration tests for topology structures.
- The current queuing model (M/M/1) might not be useful. Trying new ones.
- Added support for transferring stateful data via disk for big stateful
data.
- CkptMgr memory size is currently hardcoded; we might add a new config for it.
- Currently the windowing process waits for all data first; this might be
improved.
- Heron Lite is under consideration.
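
For readers unfamiliar with the M/M/1 model mentioned in the notes above, here is a minimal sketch of its standard textbook steady-state formulas. It is not Heron or Dhalion code; the function name and the example rates are made up for illustration, and the notes do not say how the model is actually applied.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict:
    """Steady-state metrics of an M/M/1 queue (textbook formulas).

    arrival_rate (lambda) and service_rate (mu) are illustrative inputs,
    e.g. tuples per second arriving at and processed by one instance.
    """
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("rates must be non-negative, service_rate positive")
    if arrival_rate >= service_rate:
        # The queue grows without bound; no steady state exists.
        raise ValueError("unstable queue: arrival_rate >= service_rate")

    rho = arrival_rate / service_rate  # server utilization
    return {
        "utilization": rho,
        # Mean number of items in the system: L = rho / (1 - rho)
        "mean_in_system": rho / (1.0 - rho),
        # Mean time in the system: W = 1 / (mu - lambda)
        "mean_time_in_system": 1.0 / (service_rate - arrival_rate),
        # Mean waiting time in the queue: Wq = rho / (mu - lambda)
        "mean_wait_in_queue": rho / (service_rate - arrival_rate),
    }


if __name__ == "__main__":
    print(mm1_metrics(arrival_rate=8.0, service_rate=10.0))
```

The model assumes Poisson arrivals, exponential service times, and a single server, which may be one reason it fits a stream processing pipeline poorly.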





On Tue, Jul 3, 2018 at 9:42 AM, Ning Wang  wrote:

> Hi,
>
> Sorry for the late notice. The heron OSS sync meeting will be
> happening today at 2.00 pm PDT. Please use the following hangout link:
> https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0
>
>
> See you all then.
>
>


Re: Heron OSS Sync - 01/30/18

2018-01-30 Thread Ning Wang
Please look for another message from Karthik, with subject "Heron Sync up
Meeting Notes". Also please feel free to comment.

On Tue, Jan 30, 2018 at 5:28 PM, Saikat Kanjilal 
wrote:

> All,
> Apologies, but I missed this. Were there any notes or output from the
> conversation that you can share with the community? It would be good to be
> in the loop.
> Thanks
>
> Sent from my iPhone
>
> > On Jan 30, 2018, at 1:59 PM, Karthik Ramasamy 
> wrote:
> >
> > All --
> >
> > Heron OSS Sync will be happening at 2.00 pm PST. Please use the Google
> > Hangout -
> >
> > https://plus.google.com/hangouts/_/streaml.io/oss-heron-sync?hceid=
> a2FydGhpa0BzdHJlYW1sLmlv.218m8hgr4ekqsf1o8of2olo1a0=0
> >
> > cheers
> > /karthik
>


Re: About stream manager's quitting logic on connection failures

2018-02-05 Thread Ning Wang
Yeah. That is an option too. In fact it was my first try:
https://github.com/twitter/heron/pull/2693 (just an initial attempt, not
completed; a count map should be used instead of a single total count)

In most cases, I think both solutions should have the same result. A few
reasons I changed to a tmaster check:
- With tmaster, there is only one source of truth, and tmaster is more
critical anyway. If the tmaster link is not healthy, stmgrs won't work
correctly: the topology may have created replacement nodes, but the
disconnected nodes could keep going by themselves.
- It is more straightforward. The logic is the same as the current one. On
the other hand, if we use an array for all remote stmgrs, we could have
smarter logic (which is good), but it could make stmgrs more complicated and
less straightforward (bad). I left the stmgr counters there, so if in future
we decide to add this feature, it should be easy to add. There is a gap
between "errors from all" and "errors from a few", and this is not a
simple/quick question.
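
To make the "count map" alternative discussed in this thread concrete, here is a minimal sketch of tracking errors per peer stmgr and telling "errors from all" apart from "errors from a few". It is an illustration only: Heron's stream manager is C++, and the class name, method names, and threshold below are hypothetical, not taken from either PR.

```python
from collections import Counter


class PeerErrorTracker:
    """Hypothetical per-peer connection error bookkeeping.

    peer_ids: ids of all remote stmgrs this stmgr connects to.
    error_threshold: errors after which a peer counts as failing
                     (an assumed knob, not a real Heron config).
    """

    def __init__(self, peer_ids, error_threshold=3):
        self._peers = set(peer_ids)
        self._threshold = error_threshold
        self._errors = Counter()

    def record_error(self, peer_id):
        if peer_id not in self._peers:
            raise KeyError(f"unknown peer: {peer_id}")
        self._errors[peer_id] += 1

    def record_success(self, peer_id):
        # A successful connection clears that peer's error count.
        self._errors.pop(peer_id, None)

    def failing_peers(self):
        return {p for p in self._peers if self._errors[p] >= self._threshold}

    def classify(self):
        """Return 'none', 'few', or 'all' depending on how many peers fail."""
        failing = self.failing_peers()
        if not failing:
            return "none"
        if failing == self._peers:
            return "all"
        return "few"
```

What to do in the "few" case is exactly the open question raised above; the sketch only shows that the map makes the distinction available.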




On Sun, Feb 4, 2018 at 6:48 PM, Sanjeev Kulkarni <sanjee...@gmail.com>
wrote:

> I couldn't add comments to the document, so I am posting my comments to the
> mailing list.
> One more approach could be to do the current measurement as it is, but
> instead of leaving the quitting decision to the stmgrclient, have
> stmgrclientmgr make the decision. Every time a stmgr client detects
> connection issues, it informs stmgrclientmgr, which keeps a map of
> peerstmgrid to error count. It is then able to tell whether connection
> errors are coming from all stmgrs or only a few of them, and it can make
> better decisions.
>
> On Sat, Feb 3, 2018 at 8:11 PM, Ning Wang <wangnin...@gmail.com> wrote:
>
> > Hi, heron devs~
> >
> > I think the current stream manager's quitting logic on connection
> > failures is problematic. We saw a few internal cases at Twitter where
> > this logic could cause extra issues.
> >
> > Here is a doc with more details:
> >
> > https://docs.google.com/document/d/1WHNc2NEp2gVL9ge2QVKp9t4Hpd4U9sAbzBqCu4-iDUM/edit#
> >
> > Comments and feedback are welcome!
> >
> > Thanks.
> > --ning
> >
>


Re: About stream manager's quitting logic on connection failures

2018-02-05 Thread Ning Wang
PR is here: https://github.com/twitter/heron/pull/2711

It should be quite simple; most changes are in config files.


On Mon, Feb 5, 2018 at 1:40 PM, Ning Wang <wangnin...@gmail.com> wrote:

> Cool. Thanks!
>
> On Mon, Feb 5, 2018 at 11:01 AM, Karthik Ramasamy <kramas...@gmail.com>
> wrote:
>
>> Ning - let us get this rolled out soon.
>>
>> Cheers
>> /karthik
>>
>> > On Feb 5, 2018, at 10:57 AM, Sanjeev Kulkarni <sanjee...@gmail.com>
>> wrote:
>> >
>> > This sounds good to me!
>> >
>> > On Mon, Feb 5, 2018 at 1:08 AM, Ning Wang <wangnin...@gmail.com> wrote:
>> >
>> >> Yeah. That is an option too. In fact it was my first try:
>> >> https://github.com/twitter/heron/pull/2693 (just an initiative, not
>> >> completed, a count map should be used instead of a single total count)
>> >>
>> >> In most cases, I think both solutions should have the same result. A
>> few
>> >> reasons I changed to a tmaster check:
>> >> - with tmaster, there is only one source of truth and tmaster is more
>> >> critical anyway. If the tmaster link is not healthy, stmgrs won't work
>> >> correctly: topology may have created replacement nodes but the
>> disconnected
>> >> nodes could keep going by themselves.
>> >> - it is more straightforward. The logic is the same as the current
>> one. One
>> >> the other side, if we use an array for all remote stmgrs, we could
>> have a
>> >> smarter logic (which is good) but it could make stmgrs more
>> complicated and
>> >> less straightforward (bad). I left the stmgr counters there so if in
>> future
>> >> we decide to add this feature, it should be easy to add. There is a gap
>> >> between "errors from all" and "errors from a few" and this is not a
>> >> simple/quick question.
>> >>
>> >>
>> >>
>> >>
>> >> On Sun, Feb 4, 2018 at 6:48 PM, Sanjeev Kulkarni <sanjee...@gmail.com>
>> >> wrote:
>> >>
>> >>> I could't add comments to the document, thus am posting my comments to
>> >> the
>> >>> mailing list
>> >>> One more approach could be to do the current measurement as it is, but
>> >>> instead of leaving the quitting decision to the stmgtclient, have
>> >>> stmgrclientmgr do the decision. Thus everytime a stmgr client detects
>> >>> connection issues, inform that to stmgrclientmgr which keeps a map of
>> >>> peerstmgrid to error count. Thus it is able to decide things like am i
>> >>> seeing connection errors from all stmgrs or if only a few of them are
>> >>> having issues. Then it can take the decisions better.
>> >>>
>> >>> On Sat, Feb 3, 2018 at 8:11 PM, Ning Wang <wangnin...@gmail.com>
>> wrote:
>> >>>
>> >>>> Hi, heron devs~
>> >>>>
>> >>>> I think the current stream manager's quitting logic on connection
>> >>> failures
>> >>>> is problematic. We saw a few internal cases in Twitter that this
>> logic
>> >>>> could cause extra issue.
>> >>>>
>> >>>> Here is a doc with more details:
>> >>>>
>> >>>> https://docs.google.com/document/d/1WHNc2NEp2gVL9ge2QVKp9t4Hpd4U9
>> >>>> sAbzBqCu4-iDUM/edit#
>> >>>>
>> >>>> Comments and feedbacks are welcome!
>> >>>>
>> >>>> Thanks.
>> >>>> --ning
>> >>>>
>> >>>
>> >>
>>
>>
>


Re: Our Meetup group

2018-02-13 Thread Ning Wang
Great!

On Tue, Feb 13, 2018 at 9:37 PM, Karthik Ramasamy 
wrote:

> Thanks Sree for creating a group. Look forward to our first event.
>
> cheers
> /karthik
>
> > On Feb 13, 2018, at 9:35 PM, Sree V 
> wrote:
> >
> > Yup, it was the automatic URL unfolding feature in Yahoo Mail. Here is the
> > link:
> > https://www.meetup.com/Apache-Heron-Bay-Area/
> >
> > thank you, dave.
> >
> > Thanking you.
> > With Regards
> > Sree
> >
> >On Tuesday, February 13, 2018, 9:14:33 PM PST, Dave Fisher <
> dave2w...@comcast.net> wrote:
> >
> > Hi Sree,
> >
> > That’s good news. It looks like something was in your email that somehow
> got mangled in the transmission.
> >
> > Is there a URL to join the group?
> >
> > Regards,
> > Dave
> >
> >> On Feb 13, 2018, at 9:07 PM, Sree V 
> wrote:
> >>
> >> Hi Herons,
> >> I am excited to share that we have a meetup group now. Please join and
> >> stay tuned for our very first meeting.
> >>
> >> Apache Heron - Bay Area (Sunnyvale, CA)
> >>
> >> A realtime, distributed, fault-tolerant stream processing engine.
> >> https://twitter.github.io/heron/http://heron.in...
> >>
> >> Thanking you.
> >> With Regards
> >> Sree
>
>


Re: About stream manager's quitting logic on connection failures

2018-02-05 Thread Ning Wang
Cool. Thanks!

On Mon, Feb 5, 2018 at 11:01 AM, Karthik Ramasamy <kramas...@gmail.com>
wrote:

> Ning - let us get this rolled out soon.
>
> Cheers
> /karthik
>
> > On Feb 5, 2018, at 10:57 AM, Sanjeev Kulkarni <sanjee...@gmail.com>
> wrote:
> >
> > This sounds good to me!
> >
> > On Mon, Feb 5, 2018 at 1:08 AM, Ning Wang <wangnin...@gmail.com> wrote:
> >
> >> Yeah. That is an option too. In fact it was my first try:
> >> https://github.com/twitter/heron/pull/2693 (just an initiative, not
> >> completed, a count map should be used instead of a single total count)
> >>
> >> In most cases, I think both solutions should have the same result. A few
> >> reasons I changed to a tmaster check:
> >> - with tmaster, there is only one source of truth and tmaster is more
> >> critical anyway. If the tmaster link is not healthy, stmgrs won't work
> >> correctly: topology may have created replacement nodes but the
> disconnected
> >> nodes could keep going by themselves.
> >> - it is more straightforward. The logic is the same as the current one.
> One
> >> the other side, if we use an array for all remote stmgrs, we could have
> a
> >> smarter logic (which is good) but it could make stmgrs more complicated
> and
> >> less straightforward (bad). I left the stmgr counters there so if in
> future
> >> we decide to add this feature, it should be easy to add. There is a gap
> >> between "errors from all" and "errors from a few" and this is not a
> >> simple/quick question.
> >>
> >>
> >>
> >>
> >> On Sun, Feb 4, 2018 at 6:48 PM, Sanjeev Kulkarni <sanjee...@gmail.com>
> >> wrote:
> >>
> >>> I could't add comments to the document, thus am posting my comments to
> >> the
> >>> mailing list
> >>> One more approach could be to do the current measurement as it is, but
> >>> instead of leaving the quitting decision to the stmgtclient, have
> >>> stmgrclientmgr do the decision. Thus everytime a stmgr client detects
> >>> connection issues, inform that to stmgrclientmgr which keeps a map of
> >>> peerstmgrid to error count. Thus it is able to decide things like am i
> >>> seeing connection errors from all stmgrs or if only a few of them are
> >>> having issues. Then it can take the decisions better.
> >>>
> >>> On Sat, Feb 3, 2018 at 8:11 PM, Ning Wang <wangnin...@gmail.com>
> wrote:
> >>>
> >>>> Hi, heron devs~
> >>>>
> >>>> I think the current stream manager's quitting logic on connection
> >>> failures
> >>>> is problematic. We saw a few internal cases in Twitter that this logic
> >>>> could cause extra issue.
> >>>>
> >>>> Here is a doc with more details:
> >>>>
> >>>> https://docs.google.com/document/d/1WHNc2NEp2gVL9ge2QVKp9t4Hpd4U9
> >>>> sAbzBqCu4-iDUM/edit#
> >>>>
> >>>> Comments and feedbacks are welcome!
> >>>>
> >>>> Thanks.
> >>>> --ning
> >>>>
> >>>
> >>
>
>


Re: Proposing Changes To Heron

2018-02-26 Thread Ning Wang
+1 for Heron SQL

On Sun, Feb 25, 2018 at 9:28 PM, Jerry Peng 
wrote:

> Thanks Josh for taking the initiative to get this started!  SQL on Heron
> will be a great feature! The plan sounds great to me.  Let's first get
> an initial version of Heron SQL out and then we can worry about
> custom / user defined sources and sinks.  We can even start talking
> about UDFs (User defined functions) at that point!
>
> Best,
>
> Jerry
>
> On Sun, Feb 25, 2018 at 9:05 PM, Josh Fischer  wrote:
> > Please see this google drive link for adding comments.  I will copy and
> > paste the drive doc below as well.
> >
> > https://docs.google.com/document/d/1PxLCyR_H-mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit?usp=sharing
> >
> >
> > Proposal Below
> >
> >
> >
> >
> >
> >
> >
> > I am writing this document to propose changes and to start conversations
> > on adding functionality similar to Storm SQL to Heron.  We would call it
> > Heron SQL.  After reviewing how the code is structured in Storm I have
> > some suggestions and questions relating to the implementation into the
> > Heron code base.
> >
> > - High Level Overview Of Code Workflow (Keeping Similar to Storm)
> >   - We would parse the sql with calcite to create the logical and
> >     physical plans
> >   - We would then convert the logical and physical plans to a Heron
> >     Topology
> >   - We would then submit the Heron Topology into the Heron System
> >
> > - Some thoughts on code structure and overall functionality
> >   - I think we should place the Heron SQL code base as a top level
> >     directory in the repo.
> >   - I will have to add the command “sql” to the Heron command line code
> >     in python.
> >   - As a first pass implementation users can interact with Heron SQL via
> >     the following command
> >     - heron sql
> >   - We will also support the explain command for displaying the query
> >     plan, this will not deploy the topology
> >     - heron sql  --explain
> >   - After the first pass implementation is working smoothly, we can then
> >     add an interactive command line interface to accept sql on the fly by
> >     omitting the sql file argument
> >     - Heron sql
> >   - We would support all of the existing functionality in Storm SQL today
> >     with the exception of being dependent on trident.  We would use Storm
> >     SQL as a way to deploy topologies into Heron.  Similar to how you
> >     deploy topologies with the Streamlet, Topology, and ECO APIs
> >
> > - Questions
> >   - Do we see any issue with this plan to implement?
> >   - I believe we would have to supply an external jar at times to connect
> >     to external data sources, such as reuse of kafka libraries or database
> >     drivers.  I see that Storm has few external connectors for mongo,
> >     kafka, redis and hdfs.  Do we want to limit users to what we decide to
> >     build as connectors or do we want to give them the ability to load
> >     external jars at submit time? I don’t think heron offers the ability
> >     to pass extra jars to via the “--jars” or “--artifacts” flags like
> >     Storm does today.  Would this be the correct way to pull in external
> >     jars?  Does anyone have a different idea?  I’m thinking that this
> >     might be a v2 feature after we get Heron sql working well.  Ideas,
> >     thoughts or concerns?
> >   - Is there anything I missed?
>


Re: Heron OSS Sync

2018-06-18 Thread Ning Wang
Thanks for asking.

Normally we talk about these two items in the meeting:
- Updates
- There could be something we want to discuss briefly or schedule further
discussions.



On Mon, Jun 18, 2018 at 5:00 PM, Dave Fisher  wrote:

> Hi -
>
> Is there an agenda?
>
> Regards,
> Dave
>
> Sent from my iPhone
>
> > On Jun 18, 2018, at 4:56 PM, Ning Wang  wrote:
> >
> > Hi,
> >
> > The heron OSS sync meeting will be happening tomorrow at 2.00 pm PDT.
> > Please use the following hangout link:
> > https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0
> >
> >
> > See you all then.
>


Re: Heron OSS Sync

2018-08-14 Thread Ning Wang
Brief notes for today's sync up meeting:


   - Apache release is in progress (Neng & Ning). Licenses for third-party
   libraries have been created and heron folders have been created in apache
   dist.
   - K8s deployment improvement is ongoing (Karthik).
   - A Samoa integration branch has been created for devs to discuss and work
   on it (Saikat).
   - The example topology running on Nomad uses a lot of resources, which is
   not good for new users. Jerry is taking a look at the topology and configs.
   - A room has been found (Palo Alto) for monthly meetups (Sree).


On Tue, Aug 14, 2018 at 11:51 AM, Josh Fischer  wrote:

> I will be on for the first 30 minutes.
>
> Josh
>
> On Tue, Aug 14, 2018 at 1:50 PM Jerry Peng 
> wrote:
>
> > Sounds good.  I will attend the OSS sync meeting today
> >
> > On Tue, Aug 14, 2018 at 11:22 AM Ning Wang  wrote:
> >
> > > Hi,
> > >
> > > Sorry for the late notice. The heron OSS sync meeting will be happening
> > > today at 2.00 pm PDT. Please use the following hangout link:
> > >
> > > https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0
> > >
> > >
> > > See you all then.
> > >
> >
> --
> Sent from A Mobile Device
>


Re: Rough Draft to Podling Report

2018-08-07 Thread Ning Wang
nice!

On Tue, Aug 7, 2018 at 8:06 AM, Josh Fischer  wrote:

> The Heron Incubator PMC report has been submitted.
>
> - Josh
>
> On Tue, Aug 7, 2018 at 1:06 AM, Josh Fischer  wrote:
>
> > Thank you both for the feedback.  Taylor, I appreciate the suggestion of
> > looking back at the ASF policies and practices.  I’ll definitely take
> > some time to better understand the Apache Way.
> >
> > Karthik, I’ll add your points and get the report submitted tomorrow.
> >
> > Talk soon,
> >
> > Josh
> >
> >
> > On Mon, Aug 6, 2018 at 11:53 PM Karthik Ramasamy 
> > wrote:
> >
> >> Josh - Thanks for taking the lead on the report. Three most important
> >> issues to address before the release
> >>
> >> - Removal of binaries from code base (this is already completed)
> >> - Moving from com.twitter.heron to org.apache.heron namespace (this is
> >> already completed)
> >> - How to publish the binary artifacts into the Apache Maven repo
> >>
> >> Otherwise, the rest looks OK.
> >>
> >> On Mon, Aug 6, 2018 at 7:19 PM, P. Taylor Goetz 
> >> wrote:
> >>
> >> > Hi Josh,
> >> >
> >> > Thanks for taking the time to put together an initial report! That’s
> >> > a good indicator that you are dedicated to the project by dealing with
> >> > administrivia that may seem like a drag, but is actually really
> >> > important for ASF projects.
> >> >
> >> > As far as issues to address for graduation, I’d redirect the question
> >> back
> >> > to you, as well as the rest of the PPMC:
> >> >
> >> > What do you, given your current understanding of ASF policies,
> >> practices,
> >> > etc. think are the most important issues you need to address? (This
> is a
> >> > question for the entire PPMC — I don’t mean to single out Josh in any
> >> way.)
> >> >
> >> > I’ll admit that learning how Apache works can be a bit like learning
> >> > a rapidly changing, not-so-well-documented API. But haven’t we all
> >> > dealt with and used poorly documented libraries simply because they
> >> > work well?
> >> >
> >> > The ASF can be difficult to understand and navigate at times. Feel
> >> > free to lean on your mentors for help at any time. This is what we are
> >> > here for.
> >> >
> >> > -Taylor
> >> >
> >> > > On Aug 6, 2018, at 8:27 PM, Josh Fischer 
> wrote:
> >> > >
> >> > > All,
> >> > >
> >> > > Please review the report below.  I've added answers to most of the
> >> > > questions, some I have not.  Please make changes / give feedback as
> >> > > needed.  I will get this submitted as soon as I get the "OK" from
> the
> >> > > community and the answers  to:
> >> > >
> >> > > *"Three most important issues to address in the move towards
> >> > graduation:"*
> >> > >
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Josh
> >> > >
> >> > > #
> >> > >
> >> > >
> >> > > Heron
> >> > >
> >> > > A real-time, distributed, fault-tolerant stream processing engine.
> >> > >
> >> > > Heron has been incubating since 2017-06-23.
> >> > >
> >> > > Three most important issues to address in the move towards
> graduation:
> >> > >
> >> > >  1.
> >> > >  2.
> >> > >  3.
> >> > >
> >> > > Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to
> be
> >> > > aware of?
> >> > >
> >> > > We had an issue with scheduling to get the report done on time.  We
> >> > > will work to make this better for the next period.
> >> > >
> >> > >
> >> > >
> >> > > How has the community developed since the last report?
> >> > >
> >> > > The community has been growing well.  The growing community has been
> >> > > asking questions through the mailing list and Slack channels.  The
> >> > > supporting community has responded with answers to the questions asked.
> >> > > We have also had new individuals come in to help with cleaning up the
> >> > > project from an outside perspective.  It's been very helpful to the
> >> > > committers.
> >> > >
> >> > > How has the project developed since the last report?
> >> > >
> >> > > There have been mainly bug fixes and improvements to existing
> >> features.
> >> > > Some to note are
> >> > > * Fixing issue with downloader for Nomad
> >> > > * Updating to the latest Dhalion version
> >> > > * Updating of Dockerfiles and docker build scripts
> >> > > * Updates to the documentation
> >> > > * Updates to Helm charts
> >> > > * Added a S3 uploader
> >> > >
> >> > >
> >> > > How would you assess the podling's maturity?
> >> > > Please feel free to add your own commentary.
> >> > >
> >> > >  [ ] Initial setup
> >> > >  [ X] Working towards first release
> >> > >  [ ] Community building
> >> > >  [ ] Nearing graduation
> >> > >  [ ] Other:
> >> > >
> >> > > Date of last release:
> >> > >
> >> > > No Apache releases as of yet.
> >> > >
> >> > > When were the last committers or PPMC members elected?
> >> > >
> >> > > N/A.  Still working towards bootstrapping the project.
> >> > >
> >> > > Signed-off-by:
> >> > >
> >> > >  [ ](heron) Jake Farrell
> >> > > Comments:
> >> > >  [ ](heron) 

Re: [VOTE] Heron Release 0.20.0-incubating Candidate 1

2018-08-17 Thread Ning Wang
Thanks a lot, Dave. That's super helpful to us!





On Wed, Aug 15, 2018 at 4:25 PM Dave Fisher  wrote:

> Hi -
>
> Thanks for providing this. Some key items are missing. I have to VOTE -1
> (While I am not on this PPMC I am giving you a head start with an IPMC
> binding vote.)
>
> (1) A detached signature is required in the directory that has the release
> package.
> (2) A KEYS file needs to be present which contains the public key of the
> release manager who signed this release.
> (3) A brand new policy is that SHA-1 is compromised and new releases need
> a SHA-256 or SHA-512 checksum.
>
> See https://www.apache.org/dev/release-distribution#sigs-and-sums and
> http://www.apache.org/legal/release-policy.html#what-must-every-release-contain
>
> I looked at the following:
> DISCLAIMER
> LICENSE
> NOTICE
> These look OK.
>
> Many files are missing license headers in the source.
> Please provide a way to run a release audit tool to check on licenses in
> the source files.
> See https://creadur.apache.org/rat/
>
> Binaries cannot be included in source packages:
> A discussion is needed about the third_party directory. Binary files
> cannot be included. E.G. cereal-1.2.1.tar.gz.
>
> In the website directory there are binaries for the logo source. I wonder
> why you need ai, eps, psd and pdf versions of the logo in your source
> release. This is branding information and the project should control when
> these are given out.
> For branding policy see http://www.apache.org/foundation/marks/#guidelines
>  , http://www.apache.org/foundation/marks/faq/#poweredby , and
> https://www.apache.org/foundation/press/kit/
>
> The icomoon font in the tools/ui has licenses that are incompatible with
> Apache releases. I would not block this release for this, but these will
> need to be replaced.
>
> I did not do any builds ….
>
> Regards,
> Dave
>
>
> On Aug 15, 2018, at 3:14 PM, Neng Lu  wrote:
>
> Hi All,
>
> This is the first release candidate for Apache Heron, version
> 0.20.0-incubating.
>
> It is the starting point of Heron and contains Heron's main features, such
> as streaming processing, stateful processing, streamlet API, API server,
> ECO support, etc.
>
> Full list of changes and fixes are available:
>
> https://github.com/apache/incubator-heron/compare/0.17.5.1-rc...release/v-0.20.0-incubating
>
> *** Please download, test and vote on this release. This vote will stay
> open
> for at least 72 hours ***
>
> Source files:
>
> https://dist.apache.org/repos/dist/dev/incubator/heron/heron-0.20.0-incubating-candidate-1/
>
> SHA-1 checksums:
> 9a42c828f2264eb6c0e49ae52c8ba525f0e1c4ee
> ./incubator-heron-v-0.20.0-incubating-candidate-1.tar.gz
>
> The tag to be voted upon:
> v0.20.0-incubating-candidate-1 (d2946ce0cfb3a6fe230a93d9f16550d7f46d2cf3)
>
> https://github.com/apache/incubator-heron/releases/tag/v-0.20.0-incubating-candidate-1
>
> Please download the source package, and follow the compiling guide (
> https://apache.github.io/incubator-heron/docs/developers/compiling/compiling/
> ) to build and run Heron locally.
>
> Best Regards,
> Neng Lu
>
>
>
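
Dave's point (3) above asks for a SHA-256 or SHA-512 checksum in place of SHA-1. As a minimal sketch of what that involves, the snippet below computes a SHA-512 digest with Python's standard `hashlib` and formats it as `digest  filename`, the layout that `sha512sum -c` reads. The function names are illustrative only, and this is not the project's actual release tooling.

```python
import hashlib


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_line(path):
    """Format a checksum as 'digest  filename' (two spaces in between)."""
    return f"{sha512_of_file(path)}  {path}"
```

A release manager would write that line to a `.sha512` file next to the tarball; the detached signature and KEYS file from points (1) and (2) are separate steps not covered here.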


Re: [VOTE] Heron Release 0.20.0-incubating Candidate 2

2018-08-22 Thread Ning Wang
-1

We found a compilation issue caused by a script (status.sh) that requires a
git environment.

On Wed, Aug 22, 2018 at 11:00 AM Neng Lu  wrote:

> Hi All,
>
> This is the 2nd release candidate for Apache Heron, version
> 0.20.0-incubating. Thanks to Dave Fisher for providing various feedback on
> the first release candidate. We've resolved all the feedback from Dave and
> thus call for a vote on the 2nd release candidate.
>
> It is the starting point of Heron and contains heron's main features, such
> as streaming
> processing, stateful processing, streamlet API, API server, eco support,
> etc.
>
> The full list of changes and fixes are available:
>
> https://github.com/apache/incubator-heron/compare/0.17.8...release/v-0.20.0-incubating
>
> *** Please download, test and vote on this release. This vote will stay
> open
> for at least 72 hours ***
>
> Source files:
>
> https://dist.apache.org/repos/dist/dev/incubator/heron/heron-0.20.0-incubating-candidate-2/
>
> SHA-1 checksums:
> a80c6bae4938c4c8a322552c256cb3fb4bd0c809
> ./incubator-heron-v-0.20.0-incubating-candidate-2.tar.gz
>
> The tag to be voted upon:
> v0.20.0-incubating-candidate-2 (d01b9bbda0d92fff4f688d3463658eccd358cc42)
>
> https://github.com/apache/incubator-heron/releases/tag/v-0.20.0-incubating-candidate-2
>
> Please download the source package, and follow the compiling guide(
>
> https://apache.github.io/incubator-heron/docs/developers/compiling/compiling/
> )
> to build and run Heron locally.
>
> Best Regards,
> Neng Lu
>


Re: Heron OSS Sync Meeting

2018-09-10 Thread Ning Wang
One thing I think we may need to discuss tomorrow is finding a replacement
for Google Hangouts. We have seen some issues (can't accept requests) over
the past few weeks.



On Mon, Sep 10, 2018 at 10:02 AM Ning Wang  wrote:

> Hi,
>
> The heron OSS sync meeting will be happening tomorrow at 2.00 pm PDT.
> Please use the following hangout link:
> https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0
>
>
> See you all then.
>


Heron OSS Sync Meeting

2018-09-10 Thread Ning Wang
Hi,

The heron OSS sync meeting will be happening tomorrow at 2.00 pm PDT.
Please use the following hangout link:
https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0


See you all then.


Re: [VOTE] Heron Release 0.20.0-incubating Candidate 3

2018-08-29 Thread Ning Wang
se/README.md
> == File: ./scripts/resources/idea/.name
> == File: ./storm-compatibility/src/java/shade.conf
> == File: ./third_party/cereal/cereal.BUILD
> == File: ./third_party/glog/glog.BUILD
> == File: ./third_party/gperftools/gperftools.BUILD
> == File: ./third_party/gtest/gtest.BUILD
> == File: ./third_party/helm/helm.BUILD
> == File: ./third_party/java/Empty.java
> == File:
> ./third_party/java/jarjar/src/main/resources/com/tonicsystems/jarjar/help.txt
> == File: ./third_party/kashmir/abstractrandomstream.h
> == File: ./third_party/kashmir/devrandom.h
> == File: ./third_party/kashmir/empty.cc
> == File: ./third_party/kashmir/iofwd.h
> == File: ./third_party/kashmir/iostate.h
> == File: ./third_party/kashmir/polydevrandom.h
> == File: ./third_party/kashmir/randomstream.h
> == File: ./third_party/kashmir/uuid.h
> == File: ./third_party/kashmir/tests/cli.cpp
> == File: ./third_party/kashmir/tests/command.cpp
> == File: ./third_party/libevent/libevent.BUILD
> == File: ./third_party/libunwind/libunwind-1.1-cache.patch
> == File: ./third_party/libunwind/libunwind-1.1-config.patch
> == File: ./third_party/libunwind/libunwind-1.1-lzma-link.patch
> == File: ./third_party/libunwind/libunwind.BUILD
> == File: ./third_party/nomad/nomad.BUILD
> == File: ./third_party/python/cpplint/cpplint.py
> == File: ./third_party/python/pylint/main.py
> == File: ./third_party/python/semver/PKG-INFO
> == File: ./third_party/python/semver/README.md
> == File: ./third_party/python/semver/semver.py
> == File: ./third_party/python/semver/setup.py
> == File: ./third_party/yaml-cpp/yaml.BUILD
> == File: ./third_party/zookeeper/zookeeper.BUILD
> == File: ./tools/bazel.rc
> == File: ./tools/build_rules/prelude_bazel
> == File: ./tools/docker/bazel.rc
> == File: ./tools/java/src/org/apache/bazel/checkstyle/heron_header.txt
> == File: ./tools/python/checkstyle.ini
> == File: ./tools/rules/genproto.bzl
> == File: ./tools/rules/heron_deps.bzl
> == File: ./tools/rules/java_tests.bzl
> == File: ./tools/rules/newgenproto.bzl
> == File: ./tools/rules/proto.bzl
> == File: ./tools/rules/pex/testlauncher.sh.template
> == File: ./tools/travis/bazel.rc
> == File: ./tools/travis/toolchain/CROSSTOOL
> == File: ./vagrant/.gitignore
> == File: ./vagrant/README.md
>
> Regards,
> Dave
>
> On Aug 23, 2018, at 1:42 PM, Neng Lu  wrote:
>
> Hi All,
>
> This is the 3rd release candidate for Apache Heron, version
> 0.20.0-incubating. Thanks to Dave Fisher for providing feedback on the
> first release candidate, and to Ning Wang for finding the compilation
> issue in the second release candidate. We have addressed all of the feedback
> and are now calling a vote on the 3rd release candidate.
>
> This release is the starting point of Apache Heron and contains Heron's main
> features, such as stream processing, stateful processing, the Streamlet API,
> the API server, ECO support, etc.
>
> The full list of changes and fixes is available at:
>
> https://github.com/apache/incubator-heron/compare/0.17.8...release/v-0.20.0-incubating
>
> *** Please download, test and vote on this release. This vote will stay
> open for at least 72 hours ***
>
> Source files:
>
> https://dist.apache.org/repos/dist/dev/incubator/heron/heron-0.20.0-incubating-candidate-3/
>
> SHA-1 checksums:
> 18181be53b697f68e6a4fdf6622dd42aba9fd095
> ./incubator-heron-v-0.20.0-incubating-candidate-3.tar.gz
>
> The tag to be voted upon:
> v0.20.0-incubating-candidate-3 (7fb0df3b6ec29d8c51f9d43ad7e8ecb3d45d643a)
>
> https://github.com/apache/incubator-heron/releases/tag/v-0.20.0-incubating-candidate-3
>
> Please download the source package and follow the compiling guide
> (https://apache.github.io/incubator-heron/docs/developers/compiling/compiling/)
> to build and run Heron locally.
>
> Best Regards,
> Neng Lu
>
>
>
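For anyone testing the candidate, the SHA-1 checksum listed in the vote email can be checked with a few lines of Java. This is only a minimal sketch: the class name Sha1Check and its command-line arguments are hypothetical, and it is not part of any Heron or Apache release tooling.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for checking a downloaded tarball against a published SHA-1.
final class Sha1Check {
  // Returns the SHA-1 digest of the given bytes as lowercase hex.
  static String sha1Hex(byte[] data) {
    MessageDigest digest;
    try {
      digest = MessageDigest.getInstance("SHA-1");
    } catch (NoSuchAlgorithmException e) {
      // SHA-1 is a standard algorithm name, so this is not expected to happen.
      throw new IllegalStateException(e);
    }
    StringBuilder hex = new StringBuilder();
    for (byte b : digest.digest(data)) {
      hex.append(String.format("%02x", b));
    }
    return hex.toString();
  }

  public static void main(String[] args) throws Exception {
    // args[0]: path to the downloaded tarball
    // args[1]: expected checksum copied from the vote email
    String actual = sha1Hex(Files.readAllBytes(Paths.get(args[0])));
    if (actual.equalsIgnoreCase(args[1])) {
      System.out.println("checksum OK");
    } else {
      System.out.println("checksum MISMATCH, got " + actual);
    }
  }
}
```

Reading the whole file into memory keeps the sketch short; for very large artifacts a streaming read would be the better choice.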


Re: ML in Heron weekly meeting

2018-07-05 Thread Ning Wang
Hmm. Good question. I don't think anyone has reached out yet.



On Thu, Jul 5, 2018 at 11:49 AM, Dave Fisher  wrote:

> Hi -
>
> Has anyone reached out to the SAMOA podling? Or is their architecture
> inverted from that being proposed? I’m not sure how well the SAMOA community
> is doing as they have had low activity since early this year.
>
> Regards,
> Dave
>
> > On Jun 29, 2018, at 11:01 PM, Ning Wang  wrote:
> >
> > Brief notes for the meeting on June 29:
> >
> > - We need to hook up heron with Apache samoa. Saikat to create new issues
> > in github.
> > - Create a slack channel: #machine-learning
> > - Let's add potential use cases in the design doc:
> > https://docs.google.com/document/d/1LrO7XRcMxJoMM83wjRd-Ov74VAaomA_mXOAhCStgGng/edit
> >
> >
> > On Sat, Jun 23, 2018 at 3:44 PM, Ning Wang  wrote:
> >
> >> Brief notes for the meeting on June 22nd:
> >>
> >> - still studying the documents.
> >>   - https://mapr.com/blog/monitoring-real-time-uber-data-using-spark-machine-learning-streaming-and-kafka-api-part-2/
> >>   - https://databricks.com/blog/2018/06/05/introducing-mlflow-an-open-source-machine-learning-platform.html
> >>   - https://eng.uber.com/michelangelo/
> >> - stateful storage might need to be improved (data size) to support big
> >> state objects, which could be required by ML jobs.
> >>
>
>


Heron OSS Sync

2018-07-03 Thread Ning Wang
Hi,

Sorry for the late notice. The Heron OSS sync meeting will be happening
today at 2.00 pm PDT. Please use the following hangout link:
https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0


See you all then.


Re: Proposing Changes To ECO

2018-01-22 Thread Ning Wang
Got it. Thanks! It makes more sense now. :)

On Mon, Jan 22, 2018 at 1:51 PM, Josh Fischer <j...@joshfischer.io> wrote:

> Ning,
>
> In my email I was thinking specifically of setting the componentRam.  In
> this case the value is a comma-delimited string, so it would be easy to
> incorrectly format the list of values to be appended.  The code for
> reference is below.  By passing in a list of values, I could then
> format the value String correctly, as we would expect.
>
> public static void setComponentRam(Map<String, Object> conf,
>                                    String component, ByteAmount ramInBytes) {
>   // Appends "component:bytes" to the comma-delimited RAM map entry.
>   if (conf.containsKey(Config.TOPOLOGY_COMPONENT_RAMMAP)) {
>     String oldEntry = (String) conf.get(Config.TOPOLOGY_COMPONENT_RAMMAP);
>     String newEntry = String.format("%s,%s:%d", oldEntry, component,
>         ramInBytes.asBytes());
>     conf.put(Config.TOPOLOGY_COMPONENT_RAMMAP, newEntry);
>   } else {
>     String newEntry = String.format("%s:%d", component, ramInBytes.asBytes());
>     conf.put(Config.TOPOLOGY_COMPONENT_RAMMAP, newEntry);
>   }
> }
>
> I'm glad you sent this email as it got me thinking about the above spec
> that Karthik mentioned.  I've copied his spec below
>
>
> config:
>   topology.workers: 2
>   topology.component.resourcemap:
>     - id: "component-1"
>       ram: 1234MB
>       cpu: 0.5
>       disk: 123MB
>
>     - id: "component-2"
>       ram: 2345MB
>       cpu: 0.75
>       disk: 4GB
>
> I think disk and cpu resources are allocated at a topology level and would
> not be applicable here, unless there is a way to specify them through the
> Heron Config class.  After looking at the docs here
> https://twitter.github.io/heron/docs/developers/tuning/ and looking at the
> Heron Config class, I don't see a way to specify these at a component level.
> I do see there is a way to pass any configuration up to Heron. Can I set
> these values via a `prepare()` or `open()` call?
>
> One last note while thinking about this:  `setComponentJvmOptions()` has
> similar behavior.  I believe I would handle this field the same way.
>
> public static void setComponentJvmOptions(
>     Map<String, Object> conf,
>     String component,
>     String jvmOptions) {
>   String optsBase64;
>   String componentBase64;
>
>   optsBase64 = DatatypeConverter.printBase64Binary(
>       jvmOptions.getBytes(StandardCharsets.UTF_8));
>   componentBase64 = DatatypeConverter.printBase64Binary(
>       component.getBytes(StandardCharsets.UTF_8));
>
>   String oldEntry = (String) conf.get(Config.TOPOLOGY_COMPONENT_JVMOPTS);
>   String newEntry;
>   if (oldEntry == null) {
>     newEntry = String.format("{\"%s\":\"%s\"}", componentBase64, optsBase64);
>   } else {
>     // To remove the '{' at the start and '}' at the end
>     oldEntry = oldEntry.substring(1, oldEntry.length() - 1);
>     newEntry = String.format("{%s,\"%s\":\"%s\"}", oldEntry,
>         componentBase64, optsBase64);
>   }
>   // Format for TOPOLOGY_COMPONENT_JVMOPTS would be a json map like this:
>   //  {
>   //    "componentNameAInBase64": "jvmOptionsInBase64",
>   //    "componentNameBInBase64": "jvmOptionsInBase64"
>   //  }
>   conf.put(Config.TOPOLOGY_COMPONENT_JVMOPTS, newEntry);
> }
>
>
>
> If I've missed something please let me know.
>
> -Josh
>
>
> On Mon, Jan 22, 2018 at 12:02 PM, Ning Wang <wangnin...@gmail.com> wrote:
>
> > LGTM. And I like the 123MB  more than separating value and unit into two
> > settings.
> >
> > Quick questions:
> > Will this new config replace the existing topology.component.rammap?
> > "the way ECO handles topology configuration will not work for all
> > configuration types". Can you give a more specific example?
> >
> > Thanks.
> >
> >
> >
> >
> >
> > On Mon, Jan 22, 2018 at 9:33 AM, Karthik Ramasamy <kart...@streaml.io>
> > wrote:
> >
> > > Josh -
> > >
> > > One more piece of feedback: since the resources assigned can be CPU, RAM,
> > > or DISK, instead of calling it
> > >
> > > topology.component.rammap
> > >
> > > can we call it
> > >
> > > topology.component.resourcemap
> > >
> > > and allow for CPU and DISK? Furthermore, we append the size unit to the
> > > metric as follows
> > >
> > > config:
> > >   topology.workers: 2
> > >   topology.component.resou
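The idea discussed in this thread (accept per-component RAM values with a unit suffix such as 1234MB, then generate the comma-delimited component:bytes string that setComponentRam builds up) could be sketched roughly as follows. The names RamMapSketch, parseBytes and formatRamMap are hypothetical and not part of Heron or ECO, and treating KB/MB/GB as binary multiples (1 MB = 1024 * 1024 bytes) is an assumption; the thread does not say which convention applies.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Heron or ECO.
final class RamMapSketch {
  // Suffixes are checked longest-first so that "MB" is not mistaken for "B".
  private static final String[] UNITS = {"GB", "MB", "KB", "B"};
  // Assumption: binary multiples.
  private static final long[] FACTORS = {1L << 30, 1L << 20, 1L << 10, 1L};

  // Converts strings such as "1234MB", "4GB" or "512" (plain bytes) to a byte count.
  static long parseBytes(String size) {
    String s = size.trim().toUpperCase(Locale.ROOT);
    for (int i = 0; i < UNITS.length; i++) {
      if (s.endsWith(UNITS[i])) {
        String digits = s.substring(0, s.length() - UNITS[i].length()).trim();
        return Long.parseLong(digits) * FACTORS[i];
      }
    }
    return Long.parseLong(s); // no suffix: already a byte count
  }

  // Builds the whole "componentA:bytes,componentB:bytes" value in one pass,
  // so callers never have to append to an existing string by hand.
  static String formatRamMap(Map<String, String> ramByComponent) {
    StringJoiner joiner = new StringJoiner(",");
    for (Map.Entry<String, String> entry : ramByComponent.entrySet()) {
      joiner.add(entry.getKey() + ":" + parseBytes(entry.getValue()));
    }
    return joiner.toString();
  }

  public static void main(String[] args) {
    Map<String, String> ram = new LinkedHashMap<>();
    ram.put("component-1", "1234MB");
    ram.put("component-2", "2345MB");
    System.out.println(formatRamMap(ram));
  }
}
```

Formatting the full list at once avoids the fragile "old entry plus comma plus new entry" concatenation, which is the formatting risk raised earlier in the thread.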

Re: Release Today/Tomorrow

2018-02-27 Thread Ning Wang
sgtm

On Tue, Feb 27, 2018 at 2:27 AM, Karthik Ramasamy 
wrote:

> Ok, let us do a release. Any objections to doing a release?
>
> Sent from my iPhone
>
> > On Feb 27, 2018, at 1:57 AM, Chris Kellogg  wrote:
> >
> > It's a little more involved than I thought. We could do a release and
> > then follow up with another one soon.
> >
> > On Mon, Feb 26, 2018 at 8:50 PM, Karthik Ramasamy 
> > wrote:
> >
> >> We can do a release. Chris wanted to get a PR in. Chris - any update?
> >>
> >> On Mon, Feb 26, 2018 at 11:38 PM Jerry Peng <jerry.boyang.p...@gmail.com>
> >> wrote:
> >>
> >>> We haven't done a release in a while.  Should we do a release today or
> >>> tomorrow?
> >>>
> >>> Best,
> >>>
> >>> Jerry
> >>>
> >>
>


Re: Moving to Apache

2018-02-28 Thread Ning Wang
+1

On Wed, Feb 28, 2018 at 7:48 AM, Chris Kellogg <cckell...@gmail.com> wrote:

> +1 for Karthik's suggestion too.
>
>
> On Wed, Feb 28, 2018 at 6:04 AM, Josh Fischer <j...@joshfischer.io> wrote:
>
> > +1 to Karthik's suggestion
> >
> > On Wed, Feb 28, 2018 at 2:15 AM Karthik Ramasamy <kart...@streaml.io>
> > wrote:
> >
> > > Sree -
> > >
> > > Since Apache allows using github.com (Apache Pulsar uses it) and the
> > > team is used to github.com and its workflow of issues and PRs, I would
> > > vote to move
> > >
> > > github.com/twitter/heron -> github.com/apache/incubator-heron
> > >
> > > This is not hard, since the organization owner (in this case Twitter)
> > > can transfer the project to github.com/apache/incubator-heron with a
> > > single UI click, and the transfer preserves all the GitHub forks, stars
> > > etc.
> > >
> > > If we can use the issue tracking in github.com associated with the
> > > project incubator-heron, that will also be easier than moving to JIRA
> > > (again, Apache Pulsar uses this approach).
> > >
> > > cheers
> > > /karthik
> > >
> > > On Tue, Feb 27, 2018 at 8:35 PM, Sree V <sree_at_ch...@yahoo.com.invalid>
> > > wrote:
> > >
> > > > Thank you for giving me access to create INFRA tickets.
> > > > > (1) Transfer GitHub. Create an Infrastructure JIRA issue to convert
> > > > > the twitter repository to apache/incubator-heron/. You will need to
> > > > > have an admin for twitter give an ASF Infra admin rights to move.
> > > >
> > > > This is the most common path that I have worked with:
> > > > external code (e.g. Google Code) -> apache git -> (periodic sync) ->
> > > > github.com/apache
> > > >
> > > > Now we are asking for github.com/twitter/heron -> (move) ->
> > > > github.com/apache/incubator-heron -> (one time) -> apache git ->
> > > > (periodic sync) -> github.com/apache/incubator-heron.
> > > >
> > > > This is probably very hard, as there are more projects than Heron under
> > > > github.com/twitter/. We will not know until we attempt it. Twitter
> > > > employees to the rescue.
> > > >
> > > > In detail:
> > > > We need INFRA to create
> > > > https://git-wip-us.apache.org/repos/asf/incubator-heron.git
> > > > https://jira.apache.org/jira/browse/INFRA-16116
> > > >
> > > > We need INFRA to create https://github.com/apache/incubator-heron
> > > > https://jira.apache.org/jira/browse/INFRA-16117
> > > > Here in the comments, I mentioned moving from github.com/twitter/heron
> > > > to github.com/apache/incubator-heron, retaining everything without any
> > > > exceptions.
> > > >
> > > > In addition, I requested the creation of a JIRA project for HERON:
> > > > https://jira.apache.org/jira/browse/INFRA-16115
> > > > Karthik/PMC: you may add more admins for the Heron JIRA and add more
> > > > developers as well.
> > > >
> > > >
> > > >
> > > >
> > > > Thanking you.
> > > > With Regards
> > > > Sree
> > > >
> > > > On Tuesday, February 27, 2018, 4:44:38 PM PST, Ning Wang <wangnin...@gmail.com> wrote:
> > > >
> > > >   I created a google doc for us to track questions and plans:
> > > >
> > > > https://docs.google.com/document/d/1-G5qbFN1ftDRf42Dee_BjlEZ75C-gVdGJ_uaGIL8qx8/edit#
> > > >
> > > >
> > > > On Tue, Feb 27, 2018 at 4:23 PM, Josh Fischer <j...@joshfischer.io>
> > > wrote:
> > > >
> > > > > Dave,
> > > > >
> > > > > Thank you for the suggestions.  This is fantastic advice.
> > > > >
> > > > >
> > > > > On Tue, Feb 27, 2018 at 6:19 PM Dave Fisher <w...@apache.org>
> wrote:
> > > > >
> > > > > > Hi -
> > > > > >
> > > > > > I will answer with two hats - (1) IPMC member and mentor to other
> > > > > > projects, and (2) Brand committee member.
> > > > > >
> > > > > > On Feb 27, 2018, at 3:20 PM, Jerry Peng <
> > jerry.boyang.p...@gmai

Re: Heron OSS Sync

2018-03-13 Thread Ning Wang
And here are the brief notes for today's meeting:


- Maosong needs to contact the Apache admins about the github to
  apache/github move.
- After code migration is done, we are going to follow the apache release
  process.
- Sree is following up on the site and documentation.
- A few bug fixes on k8s support were merged and will be included in the new
  release.
- Dhalion testing is ongoing in Microsoft.
- SQL API is ongoing:
  https://docs.google.com/document/d/1PxLCyR_H-mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit
- More Dhalion metrics are being added (Twitter internal feedback/requests).
- Data rate limiting is ongoing. Will be hooked up with runtime config. In
  future some automation might be possible. It could also be related to auto
  scaling. To be documented after it is stabilized.
- First meetup date/time: late April. Sree is talking to some meeting groups
  such as http://www.sfbayacm.org/
- Slack/email integration is ongoing (almost ready).

Meeting notes are tracked here:
https://docs.google.com/document/d/1cTIBq3jOVRTSR0Zd5OKK20OqwT2l90xXiY_HssVo8mE/edit?ts=5aa84932#


On Tue, Mar 13, 2018 at 11:00 AM, Karthik Ramasamy 
wrote:

> Will be happening today at 2.00 pm PST. Please use the following hangout
> link:
>
> https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0
>
> See you all then.
>
> cheers
> /karthik
>


Heron OSS Sync

2018-04-10 Thread Ning Wang
The bi-weekly Heron OSS sync meeting will be happening today at 2.00 pm PST.
Please use the following hangout link to join.

https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0

See you all then.


Re: Regarding package renaming PR#2840

2018-04-05 Thread Ning Wang
Makes sense to me.



On Thu, Apr 5, 2018 at 9:19 AM, Ashvin A  wrote:

> Hi Devs,
>
> PR 2840 renames com.twitter package to org.apache. This change touches more
> than *2,127* files. Is there a test strategy for this change which updates
> everything? I believe just depending on unit and integration tests may be
> insufficient.
>
> Also I am hoping git history will be preserved.
>
> Should we create a coarse checklist and take ownership of manual
> verification of individual components?
>
>1. Examples
>2. Heron UI
>   1. Metrics
>   2. Logs
>3. API server
>4. Heron client
>5. Docker
>6. Schedulers
>       1. Aurora
>   2. Kubernetes
>   3. Yarn
>   4. ..
>7. Python
>8. Heron Tracker
>9. Heron metrics cache
>10. Heron Health manager
>11. ...
>
>
> Thanks,
> Ashvin
>
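One possible supplementary check for a rename of this size, in addition to the manual checklist above, is a scan for references to the old package that survived the change. The sketch below is only an illustration: the class StalePackageScan, its arguments and the "com.twitter." search string are assumptions, and it is not part of PR 2840 or the Heron build.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for spotting leftover references after a package rename.
final class StalePackageScan {
  // Returns "lineNumber: text" for every line that still mentions oldPackage.
  static List<String> staleLines(List<String> lines, String oldPackage) {
    List<String> hits = new ArrayList<>();
    for (int i = 0; i < lines.size(); i++) {
      if (lines.get(i).contains(oldPackage)) {
        hits.add((i + 1) + ": " + lines.get(i).trim());
      }
    }
    return hits;
  }

  public static void main(String[] args) throws IOException {
    // args[0]: root of the source tree (defaults to the current directory)
    Path root = Paths.get(args.length > 0 ? args[0] : ".");
    List<Path> sources;
    try (Stream<Path> paths = Files.walk(root)) {
      sources = paths.filter(Files::isRegularFile)
          .filter(p -> p.toString().endsWith(".java"))
          .collect(Collectors.toList());
    }
    for (Path source : sources) {
      for (String hit : staleLines(Files.readAllLines(source), "com.twitter.")) {
        System.out.println(source + ":" + hit);
      }
    }
  }
}
```

A scan like this only catches textual leftovers in Java sources; config files, scripts and runtime behaviour would still need the per-component verification listed above.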


Re: Build machine provisioning and release access

2018-04-05 Thread Ning Wang
Cool. Thanks!

On Thu, Apr 5, 2018 at 11:39 AM, Jake Farrell  wrote:

> Hey Ali
> The ASF Jenkins build system is at https://builds.apache.org/ and setup
> details are available at
> https://cwiki.apache.org/confluence/display/INFRA/Jenkins . We cannot
> create release artifacts on those servers, as they must go through a voting
> process. As part of a release candidate, we can stage artifacts like jars
> using the Apache Nexus repository and, after a successful vote, publish those
> jars, which automatically get picked up and mirrored to Maven Central from
> https://repository.apache.org/.
>
> If you have any other questions please let us know
> -Jake
>
> On 2018/04/02 21:36:19, Ali Ahmed  wrote:
> > I have been managing the Heron releases for the last few months via a
> > local Jenkins instance. As part of Apache incubation, I need guidance on
> > transitioning to an official build and release pipeline, possibly on
> > infra managed by the Apache org.
> >
> > Can someone point me to a contact or documentation for this process?
> >
> > Thanks
> > -Ali
>


A new resource config proposal

2018-04-18 Thread Ning Wang
In discussions with a few developers, we came up with some new thoughts about
the resource config and packing algorithms in Heron. Basically, we want to
step back from the current configs and try to come up with a new solution
from the user's point of view.

Here is a brief proposal document for reference and discussion.

https://docs.google.com/document/d/1huySPugRcR5LXmlBxCCtie9grkc-X67t57_R_tZM-4w/edit#

Please feel free to comment. Thanks in advance.

Regards,
--ning


Re: Heron OSS Sync

2018-04-24 Thread Ning Wang
And here are some brief notes:

- The first meetup went very well last evening! 60+ attendees. Next meetup
  will be in May and could be in hackathon format. 200+ followers in the
  meetup channel now.
- Making an apache release (v0.2.0) is our priority. Package name changed.
  Copyright to update. Need to remove binaries. Thrift 0.5.0 is used by
  Twitter. To be removed by the Twitter team. Need to sync with apache infra
  for the infra part.
- Stateful processing in progress. Found missing functions (stream
  repartitioning, sink data deduplication).
- Stateful storage layer revisiting.
- Modeling the traffic evaluation system.
- Integration tests for the Scala streamlet API, as well as the
  documentation. Pretty much ready. Scala bazel rule is special and needs to
  be double-checked.
- New resource config proposal. Could be helpful for some users.
- Container pipeline is in progress.
- SQL API in progress too. Looking into an issue with Intellij.

On Tue, Apr 24, 2018 at 1:55 PM, Ning Wang <wangnin...@gmail.com> wrote:

> Will be happening today at 2.00 pm PST. Please use the following hangout
> link:
> https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0
>
>
> See you all then.
>


Heron OSS Sync

2018-04-24 Thread Ning Wang
Will be happening today at 2.00 pm PST. Please use the following hangout
link:
https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0


See you all then.


Re: Heron OSS Sync Meeting notes

2018-03-27 Thread Ning Wang
Got it. Thanks.

On Tue, Mar 27, 2018 at 6:35 PM, Dave Fisher <dave2w...@comcast.net> wrote:

> Hi Ning,
>
> If matters including that the hangout is happening were discussed on this
> mailing list then the following benefits occur.
>
> (1) Every person interested and subscribed can know the agenda.
> (2) Anyone can share any updates to the meeting url in real time.
> (3) Post Meeting Notes are kept in the open.
> (4) Apache projects are Open Source for the public good.
>
> I’m here to help Heron become an Apache project.
>
> Regards,
> Dave
>
> Sent from my iPhone
>
> > On Mar 27, 2018, at 4:51 PM, Ning Wang <wangnin...@gmail.com> wrote:
> >
> > A few more details about the hangout issue:
> >
> > The owner of a hangout meeting is responsible for accepting incoming
> > requests. Since Karthik is sick today, we had to create a new temporary
> > meeting, which is not ideal.
> >
> >
> >> On Tue, Mar 27, 2018 at 4:09 PM, Ning Wang <wangnin...@gmail.com>
> wrote:
> >>
> >> Thanks for the suggestions.
> >>
> >> On Tue, Mar 27, 2018 at 4:02 PM, Dave Fisher <dave2w...@comcast.net>
> >> wrote:
> >>
> >>>  - Looking for another host for our Mailing list
> >>>
> >>>
> >>> The mailing list must be dev@heron.incubator.apache.org; using any other
> >>> mailing list is not the Apache Way.
> >>>
> >>>
> >>>  - Hangout is killing us today! We need a better solution.
> >>>
> >>>
> >>> Discuss items in plain sight of the whole community on this list. This
> >>> allows other developers to participate asynchronously.
> >>>
> >>> Regards,
> >>> Dave
> >>>
> >>>
> >>> On Mar 27, 2018, at 3:47 PM, Ning Wang <wangnin...@gmail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> Here are some super brief notes for today's sync meeting:
> >>>
> >>>
> >>>  - Hangout is killing us today! We need a better solution.
> >>>  - Per component cpu/disk config is added but not used yet.
> >>>  - Looking into failing integration tests. This has been affecting us
> >>>  quite a bit
> >>>  - New work on the nomad scheduler (docker support)
> >>>  - New padding configs (to improve padding calculation and avoid hard
> >>>  coded configs) are documented and shared:
> >>>  https://docs.google.com/document/d/10JNZNYtcUIlsEyXcWWQUyhcSoQoJZMvYrTYT8UFo4dQ/edit
> >>>  - Clean up and k8s scheduler related work is ongoing
> >>>  - New scala examples have been added. Next step is to document them and
> >>>  add integration tests
> >>>  - Looking for another host for our Mailing list
> >>>  - Code change ongoing to make the website apache ready. ETA next
> >>>  weekend. Apache Infra setup is needed for starting the website.
> >>>  - Apache transfer is ongoing by Twitter’s OSS team.
> >>>  - We need to finalize the meetup date @maosong.
> >>>
> >>>
> >>>
> >>>
> >>> Meeting notes are tracked here:
> >>> https://docs.google.com/document/d/1cTIBq3jOVRTSR0Zd5OKK20OqwT2l90xXiY_HssVo8mE/edit?ts=5aa84932#
> >>>
> >>>
> >>>
> >>
>
>


Re: Heron OSS Sync Meeting notes

2018-03-27 Thread Ning Wang
A few more details about the hangout issue:

The owner of a hangout meeting is responsible for accepting incoming
requests. Since Karthik is sick today, we had to create a new temporary
meeting, which is not ideal.



On Tue, Mar 27, 2018 at 4:09 PM, Ning Wang <wangnin...@gmail.com> wrote:

> Thanks for the suggestions.
>
> On Tue, Mar 27, 2018 at 4:02 PM, Dave Fisher <dave2w...@comcast.net>
> wrote:
>
>>   - Looking for another host for our Mailing list
>>
>>
>> The mailing list must be dev@heron.incubator.apache.org; using any other
>> mailing list is not the Apache Way.
>>
>>
>>   - Hangout is killing us today! We need a better solution.
>>
>>
>> Discuss items in plain sight of the whole community on this list. This
>> allows other developers to participate asynchronously.
>>
>> Regards,
>> Dave
>>
>>
>> On Mar 27, 2018, at 3:47 PM, Ning Wang <wangnin...@gmail.com> wrote:
>>
>> Hi,
>>
>> Here are some super brief notes for today's sync meeting:
>>
>>
>>   - Hangout is killing us today! We need a better solution.
>>   - Per component cpu/disk config is added but not used yet.
>>   - Looking into failing integration tests. This has been affecting us
>>   quite a bit
>>   - New work on the nomad scheduler (docker support)
>>   - New padding configs (to improve padding calculation and avoid hard
>>   coded configs) are documented and shared:
>>   https://docs.google.com/document/d/10JNZNYtcUIlsEyXcWWQUyhcSoQoJZMvYrTYT8UFo4dQ/edit
>>   - Clean up and k8s scheduler related work is ongoing
>>   - New scala examples have been added. Next step is to document them and
>>   add integration tests
>>   - Looking for another host for our Mailing list
>>   - Code change ongoing to make the website apache ready. ETA next
>>   weekend. Apache Infra setup is needed for starting the website.
>>   - Apache transfer is ongoing by Twitter’s OSS team.
>>   - We need to finalize the meetup date @maosong.
>>
>>
>>
>>
>> Meeting notes are tracked here:
>> https://docs.google.com/document/d/1cTIBq3jOVRTSR0Zd5OKK20OqwT2l90xXiY_HssVo8mE/edit?ts=5aa84932#
>>
>>
>>
>


Re: Heron OSS Sync Meeting notes

2018-03-27 Thread Ning Wang
Thanks for the suggestions.

On Tue, Mar 27, 2018 at 4:02 PM, Dave Fisher <dave2w...@comcast.net> wrote:

>   - Looking for another host for our Mailing list
>
>
> The mailing list must be dev@heron.incubator.apache.org; using any other
> mailing list is not the Apache Way.
>
>
>   - Hangout is killing us today! We need a better solution.
>
>
> Discuss items in plain sight of the whole community on this list. This
> allows other developers to participate asynchronously.
>
> Regards,
> Dave
>
>
> On Mar 27, 2018, at 3:47 PM, Ning Wang <wangnin...@gmail.com> wrote:
>
> Hi,
>
> Here are some super brief notes for today's sync meeting:
>
>
>   - Hangout is killing us today! We need a better solution.
>   - Per component cpu/disk config is added but not used yet.
>   - Looking into failing integration tests. This has been affecting us
>   quite a bit
>   - New work on the nomad scheduler (docker support)
>   - New padding configs (to improve padding calculation and avoid hard
>   coded configs) are documented and shared:
>   https://docs.google.com/document/d/10JNZNYtcUIlsEyXcWWQUyhcSoQoJZMvYrTYT8UFo4dQ/edit
>   - Clean up and k8s scheduler related work is ongoing
>   - New scala examples have been added. Next step is to document them and
>   add integration tests
>   - Looking for another host for our Mailing list
>   - Code change ongoing to make the website apache ready. ETA next
>   weekend. Apache Infra setup is needed for starting the website.
>   - Apache transfer is ongoing by Twitter’s OSS team.
>   - We need to finalize the meetup date @maosong.
>
>
>
>
> Meeting notes are tracked here:
> https://docs.google.com/document/d/1cTIBq3jOVRTSR0Zd5OKK20OqwT2l90xXiY_HssVo8mE/edit?ts=5aa84932#
>
>
>


CI job for heron

2018-03-30 Thread Ning Wang
Hi, Dave,

It seems that we need to update heron CI job for unit tests and integration
tests. We used Travis previously. I am wondering if there is a CI engine in
Apache infra? Or is there any suggestion?

Thanks in advance.


Re: [MENTORS] Re: Heron Non Apache Release 0.17.8

2018-03-30 Thread Ning Wang
For 2,

0.17.7 was released in late Feb I believe and there have been new
features/fixes since then.

On Fri, Mar 30, 2018 at 10:31 AM, Dave Fisher  wrote:

> Hi -
>
> Adding the tag that Taylor mentioned to help signal when the project has
> questions.
>
> While I am on the IPMC, I am not one of your mentors.
>
> A few questions to answer about this non-Apache release.
>
> (1) where will the resulting artifacts be published?
>
> (2) How is it different from the prior non-Apache release 0.17.7?
>
> (3) The main step for an Apache release is reviewing the dependencies and
> those licenses in order to build correct LICENSE and NOTICE files. Does
> someone have a list handy - link in the wiki?
>
> Regards,
> Dave
>
> Sent from my iPhone
>
> > On Mar 30, 2018, at 10:05 AM, Karthik Ramasamy 
> wrote:
> >
> > All -
> >
> > Since we have been planning to release 0.17.8 till last week, I would
> > suggest that we can go ahead with this release. However this release will
> > be a non Apache release since there are several task items that need to
> > be done before making an Apache release.
> >
> > I would suggest the following -
> >
> > * Move forward with 0.17.8 as a non Apache Release
> >
> > * Target 0.18.0 as a full Apache Release
> >
> > Since the task items could be long - there might be a need for some
> > interim releases between 0.17.8 and 0.18.0 and these releases might be
> > potentially non Apache releases as well.
> >
> > Let me know if this plan sounds good.
> >
> > In the Slack most of committers have said yes to go ahead with 0.17.8
> > release.
> >
> > cheers
> > /karthik
> >
> > ps: Mentors please let us know if this looks ok
>
>


Re: Heron Non Apache Release 0.17.8

2018-03-30 Thread Ning Wang
SGTM

On Fri, Mar 30, 2018 at 10:05 AM, Karthik Ramasamy 
wrote:

> All -
>
> Since we have been planning to release 0.17.8 till last week, I would
> suggest that we can go ahead with this release. However this release will
> be a non Apache release since there are several task items that need to be
> done before making an Apache release.
>
> I would suggest the following -
>
> * Move forward with 0.17.8 as a non Apache Release
>
> * Target 0.18.0 as a full Apache Release
>
> Since the task items could be long - there might be a need for some interim
> releases between 0.17.8 and 0.18.0 and these releases might be potentially
> non Apache releases as well.
>
> Let me know if this plan sounds good.
>
> In the Slack most of committers have said yes to go ahead with 0.17.8
> release.
>
> cheers
> /karthik
>
> ps: Mentors please let us know if this looks ok
>


Re: CI job for heron

2018-03-30 Thread Ning Wang
Got it. Thanks. I will take a look.

The previous Travis service is a Twitter paid service I think, so it is
time to consider our options.

On Fri, Mar 30, 2018 at 10:36 AM, Dave Fisher <dave2w...@comcast.net> wrote:

> Hi -
>
> > On Mar 29, 2018, at 11:37 PM, Ning Wang <wangnin...@gmail.com> wrote:
> >
> > Hi, Dave,
> >
> > It seems that we need to update heron CI job for unit tests and
> integration tests. We used Travis previously. I am wondering if there is a
> CI engine in Apache infra? Or is there any suggestion?
>
> There is a Jenkins and buildbots based system. Subscribe to
> bui...@apache.org and ask. (Or ask Infra on Hipchat. I think I sent you
> that link before.)
>
> I think that with GitHub you can use Travis as before.
>
> Regards,
> Dave
>
> >
> > Thanks in advance.
> >
>
>


Heron is in Apache infra home now

2018-03-29 Thread Ning Wang
Hi,

Heron git repo has moved to the new home!

https://github.com/apache/incubator-heron


Apache committers should link their ASF IDs and Github IDs as well as
enable 2FA on Github: https://gitbox.apache.org/setup/ so they can commit
to Github. Otherwise Apache credentials to
https://gitbox.apache.org/repos/asf/incubator-heron.git will work.


Re: Getting binaries out of the code base

2018-04-02 Thread Ning Wang
nice!

On Sun, Apr 1, 2018 at 7:10 PM, Karthik Ramasamy 
wrote:

> This PR successfully passed Travis and merged!
>
> Cheers
> /karthik
>
> Sent from my iPhone
>
> > On Apr 1, 2018, at 10:44 AM, Karthik Ramasamy 
> wrote:
> >
> > All -
> >
> > As a first step towards making an Apache Release, I worked out getting
> > the binary tarballs out of the code base. Here is the PR
> > https://github.com/apache/incubator-heron/pull/2833
> >
> > All the tarballs are downloaded and compiled instead of being embedded in
> > the code.
> >
> > cheers
> > /karthik
>


Re: Fault tolerance

2018-03-20 Thread Ning Wang
Two features I can think of:

Dhalion
https://blog.acolyer.org/2017/06/30/dhalion-self-regulating-stream-processing-in-heron/

Backpressure
The Stream Manager is responsible for all data transactions and it monitors
data consumption. If a process runs slowly and its incoming data buffer fills
up, the Stream Manager triggers backpressure to stop new data until the data
in the pipeline is consumed.
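
The backpressure behaviour described above can be illustrated with a toy bounded buffer. This is only a sketch of the general technique, not Heron's Stream Manager code: the class BackpressureBuffer is hypothetical, and the resume threshold (how far the buffer must drain before new data is accepted again) is an assumption added for the example.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy model of buffer-driven backpressure; not Heron's implementation.
final class BackpressureBuffer<T> {
  private final Queue<T> buffer = new ArrayDeque<>();
  private final int capacity;
  private final int resumeThreshold; // assumption: resume once drained to this size
  private boolean backpressure = false;

  BackpressureBuffer(int capacity, int resumeThreshold) {
    this.capacity = capacity;
    this.resumeThreshold = resumeThreshold;
  }

  // Producer side: returns false while backpressure is active, telling the
  // upstream to stop sending. Filling the buffer switches backpressure on.
  boolean offer(T tuple) {
    if (backpressure) {
      return false;
    }
    buffer.add(tuple);
    if (buffer.size() >= capacity) {
      backpressure = true;
    }
    return true;
  }

  // Consumer side: returns the next tuple or null if the buffer is empty.
  // Draining down to the resume threshold switches backpressure off again.
  T poll() {
    T tuple = buffer.poll();
    if (backpressure && buffer.size() <= resumeThreshold) {
      backpressure = false;
    }
    return tuple;
  }

  boolean isBackpressureActive() {
    return backpressure;
  }
}
```

With a capacity of 2 and a resume threshold of 0, the third offer is rejected until the consumer has drained both buffered tuples, which mirrors "stop new data until the data in the pipeline is consumed".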



On Tue, Mar 20, 2018 at 10:26 AM, Thuvarakan Tharmarajasingam <
thuva4@gmail.com> wrote:

> Hello,
> I am currently working on a distributed fault tolerance system for big data
> toolkit. Could you please give me an overview of how Heron achieved the
> fault tolerance?
>


Re: Heron Sync Up 02/27/2017

2018-02-27 Thread Ning Wang
Got it. Thanks!

On Tue, Feb 27, 2018 at 6:23 PM, Dave Fisher <dave2w...@comcast.net> wrote:

> Please post the notes into an email so that we can see what was discussed
> through Apache hosted resources. This is an important part of the Apache
> Way.
>
> Thanks!
>
> Regards,
> Dave
>
> Sent from my iPhone
>
> > On Feb 27, 2018, at 4:33 PM, Ning Wang <wangnin...@gmail.com> wrote:
> >
> > All -
> >
> > We had the sync up meeting today and here is the brief note.
> >
> > https://docs.google.com/document/d/1cTIBq3jOVRTSR0Zd5OKK20OqwT2l90xXiY_HssVo8mE/edit?ts=5a15c620#
> >
> > Please feel free to comment or reply if you have any question.
>
>


Re: Heron Sync Up 02/27/2017

2018-02-27 Thread Ning Wang
Brief meeting notes for today's sync up meeting:

- Updates
  - Huijun has been working on a failure handling (mostly for aurora) issue
    in the heron update command (topology state not recovered to the
    previous state). Also internal webhook integration.
  - Runtime config design doc is stable enough in major components (there
    might be notes in details) and coding has started; one PR in TMaster is
    ready for review.
  - A doc about per-component CPU config as well as other resource related
    config is available. Jerry will have a look.
    https://docs.google.com/document/d/1mDDW6Lc0PXT8JD29n1zc3hv3G5LPXAoX4KkXu0Ycnds/edit#
  - Jerry got an issue that the stateful example is not working (throwing
    exception). Maosong/Jerry will take a look.
    - Note that currently stateful support is low level API and single
      thread only.
- Discussion
  - Apache migration. We need a migration plan to discuss and review.
    https://docs.google.com/document/d/1-G5qbFN1ftDRf42Dee_BjlEZ75C-gVdGJ_uaGIL8qx8/edit?usp=sharing
    Here are a few questions we need to have answers for.
    - pointing the heron.apache.org - github pages
    - how to automatically do a daily digest for slack and push it into the
      dev mailing lists
    - can we use the apache infrastructure for it?
    - When can we start using apache infrastructure?
    - How to get access to the Apache Maven repo
    - Whom do we need to contact?
  - Other Apache migration thoughts have been talked about in the meeting
    - We need to replace all package names, but it is likely to require
      some config changes. Maybe hold on until we have a plan and things
      are ready.
    - We can move website/document first, but need to set up SVN. There
      are other tickets and setup work. @karthik
    - Git repo. How to keep stars/forks in github?
    - Github issues or Apache jira?
  - First meetup.
    - Will be hosted in Twitter
    - User oriented.
      - Estimate 2 hours.
      - We need to have step by step instructions to start a topology
        locally/aws.


On Tue, Feb 27, 2018 at 6:56 PM, Ning Wang <wangnin...@gmail.com> wrote:

> Got it. Thanks!
>
> On Tue, Feb 27, 2018 at 6:23 PM, Dave Fisher <dave2w...@comcast.net>
> wrote:
>
>> Please post the notes into an email so that we can see what was discussed
>> through Apache hosted resources. This is an important part of the Apache
>> Way.
>>
>> Thanks!
>>
>> Regards,
>> Dave
>>
>> Sent from my iPhone
>>
>> > On Feb 27, 2018, at 4:33 PM, Ning Wang <wangnin...@gmail.com> wrote:
>> >
>> > All -
>> >
>> > We had the sync up meeting today and here is the brief note.
>> >
>> > https://docs.google.com/document/d/1cTIBq3jOVRTSR0Zd5OKK20OqwT2l90xXiY_HssVo8mE/edit?ts=5a15c620#
>> >
>> > Please feel free to comment or reply if you have any question.
>>
>>
>


Re: Moving to Apache

2018-02-27 Thread Ning Wang
Sounds good to me. maosong@, do you have any concerns?

On Tue, Feb 27, 2018 at 3:23 PM, Sanjeev Kulkarni 
wrote:

> Thanks Jerry for taking the lead on this!
> I second the proposal to just transfer the github account to Apache. Could
> someone from Twitter follow up internally?
>
>
> On Tue, Feb 27, 2018 at 3:20 PM, Jerry Peng 
> wrote:
>
> > Hello all,
> >
> > I just want to start an email thread discussing moving Heron to
> > Apache.  There are some items we need to figure out for this:
> >
> > 1. Moving the code to Apache github
> >
> > I was told that a repo can be transferred to another account and
> > people have done this in the past to move to the Apache github
> > account.  This is the best way to move the code to be under apache
> > since with this method heron will keep all its stars and forks.
> >
> > We need to start converting heron packages from com.twitter -> org.apache.
> >
> > Ideally this whole process of migrating to Apache will not be a
> > blocker for development and releases.
> >
> > Thus, if mentors or people with experience in this area want to chime
> > in on the exact details (step by step) of what needs to be done for
> > heron to be completely migrated to Apache that would be great!
> >
> > 2. Moving website to heron.apache.org
> >
> > What do we want to do here?  Migrate the whole website to
> > heron.apache.org? And Have heron.io forward to heron.apache.org?
> >
> > 3.  How can we use apache infra
> >
> > I think committers/mentors need to file some tickets to apache infra for
> > this.
> >
> > How can we use the apache infra to do apache release for heron?
> >
> > Let's get the discussion going!
> >
> > Thanks!
> >
> > Jerry
> >
>
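
The package conversion mentioned in this thread (com.twitter -> org.apache) is, at its simplest, a textual rewrite of the namespace prefix. Below is a minimal sketch, assuming the prefix change is com.twitter.heron -> org.apache.heron and that only source text is being rewritten. The function and constant names are made up for illustration, and build files, configs, and resources would need their own handling.

```python
import re

# Assumed old and new namespace prefixes for the migration.
OLD_PREFIX = "com.twitter.heron"
NEW_PREFIX = "org.apache.heron"

# Word boundaries keep us from touching longer identifiers such as
# "com.twitter.heronfoo" that merely start with the old prefix.
_PATTERN = re.compile(r"\b" + re.escape(OLD_PREFIX) + r"\b")


def rewrite_namespace(source_text):
    """Return source_text with the old package prefix replaced (hypothetical helper)."""
    return _PATTERN.sub(NEW_PREFIX, source_text)
```

A real migration would also have to move directories to match the new package names, which this sketch does not attempt.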


Heron Sync Up 02/27/2017

2018-02-27 Thread Ning Wang
All -

We had the sync up meeting today and here is the brief note.

https://docs.google.com/document/d/1cTIBq3jOVRTSR0Zd5OKK20OqwT2l90xXiY_HssVo8mE/edit?ts=5a15c620#

Please feel free to comment or reply if you have any question.


Re: Moving to Apache

2018-02-27 Thread Ning Wang
 I created a google doc for us to track questions and plans:

https://docs.google.com/document/d/1-G5qbFN1ftDRf42Dee_BjlEZ75C-gVdGJ_uaGIL8qx8/edit#


On Tue, Feb 27, 2018 at 4:23 PM, Josh Fischer  wrote:

> Dave,
>
> Thank you for the suggestions.  This is fantastic advice.
>
>
> On Tue, Feb 27, 2018 at 6:19 PM Dave Fisher  wrote:
>
> > Hi -
> >
> > I will answer with two hats - (1) IPMC member and mentor to other
> > projects, and (2) Brand committee member.
> >
> > On Feb 27, 2018, at 3:20 PM, Jerry Peng 
> > wrote:
> >
> > Hello all,
> >
> > I just want to start an email thread discussing moving Heron to
> > Apache.  There are some items we need to figure out for this:
> >
> > 1. Moving the code to Apache github
> >
> > I was told that a repo can be transferred to another account and
> > people have done this in the past to move to the Apache github
> > account.  This is the best way to move the code to be under apache
> > since with this method heron will keep all its stars and forks.
> >
> > We need to start converting heron packages from com.twitter -> org.apache.
> >
> > Ideally this whole process of migrating to Apache will not be a
> > blocker for development and releases.
> >
> > Thus, if mentors or people with experience in this area want to chime
> > in on the exact details (step by step) of what needs to be done for
> > heron to be completely migrated to Apache that would be great!
> >
> >
> > (1) Transfer GitHub. Create an Infrastructure JIRA issue to convert the
> > twitter repository to apache/incubator-heron/. You will need to have an
> > admin for twitter give an ASF Infra admin rights to move.
> > (2) Once moved then your apache-id and your GitHub id are associated
> > through id.apache.org. You setup 2FA.
> > (3) Once IDs and the repository are moved then you can begin.
> > (4) Someone from Twitter in the project should make the changes.
> > (5) As an Incubator podling you then begin making releases. You aren’t
> > expected to get it correct the first time, but the closer you are the
> > sooner you can graduate. See the release policy [1] and the additional
> > constraints for podlings [2].
> >
> >
> > [1] http://www.apache.org/legal/release-policy.html
> > [2] https://incubator.apache.org/guides/releasemanagement.html
> >
> >
> > 2. Moving website to heron.apache.org
> >
> > What do we want to do here?  Migrate the whole website to
> > heron.apache.org? And Have heron.io forward to heron.apache.org?
> >
> >
> > Yes. And Yes. There are other branding and incubator policies for the
> > website. [3]
> >
> > [3] https://www.apache.org/foundation/marks/pmcs
> >
> > For those of you using the Heron brand in your site please see [4] for the
> > policies.
> >
> > [4] https://www.apache.org/foundation/marks/
> >
> > For special branding rules during incubation. [5]
> >
> > [5] https://incubator.apache.org/guides/branding.html
> >
> >
> > 3.  How can we use apache infra
> >
> > I think committers/mentors need to file some tickets to apache infra for
> > this.
> >
> >
> > A ticket for Github/GitBox is required.
> >
> >
> > How can we use the apache infra to do apache release for heron?
> >
> >
> > Follow the rules above and ask questions as you go.
> >
> > Best Regards,
> > Dave
> >
> >
> > Let's get the discussion going!
> >
> > Thanks!
> >
> > Jerry
> >
> > --
> Sent from A Mobile Device
>


Re: Using Bazel 0.14.1+

2018-06-28 Thread Ning Wang
Thanks~

On Thu, Jun 28, 2018 at 8:29 AM, Oliver Bristow 
wrote:

> Hey folks, #2932 updated the repo to use 0.14.1 - it would be good to
> update your local version to be that or greater.
>
> Without a newer version you may experience an issue relating to scala if
> you are using the current master, but the change that introduced that will
> be reverted in this PR for now.
>


Proposal: Streamlet Custom Operator

2018-09-27 Thread Ning Wang
Hi,

I was trying to add support for reusing existing Bolts in the Streamlet
API last week and got some feedback about the feature and possible
improvements. After reconsidering what I want, I think it can be
generalized a bit to a "Custom Operator".

Here is a design doc summarizing what I have in mind and what I am planning to
do. Please feel free to comment:
https://docs.google.com/document/d/1XzF0IlfuaaW8Gx3cPx1xLtP-kgCFK0TRNS5aAzuMuMg/edit#.
All ideas are welcome.

Thanks in advance!


Heron OSS Sync Meeting

2018-10-08 Thread Ning Wang
Hi, heron devs,

The heron OSS sync meeting will be happening tomorrow at 2.00 pm PDT.
Please use the following hangout link:
https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0

Like we discussed earlier, updates will be communicated via email. Here are
mine:

- Shared design doc for Custom Operator for review. Here is the doc fyi:
https://docs.google.com/document/d/1XzF0IlfuaaW8Gx3cPx1xLtP-kgCFK0TRNS5aAzuMuMg/edit#
- Refactor grouping code. This will be useful in Custom Operator.
- Update heron license information.

Please reply with your updates. Thanks!


--ning


Re: Hermetic Build Proposal

2018-10-12 Thread Ning Wang
Sounds reasonable to me.

On Fri, Oct 12, 2018 at 11:26 AM Josh Fischer  wrote:

> Hey All,
>
> I've done some research and I do not think Heron's Bazel configuration is
> hermetic.  What I mean by this is that to build Heron we are relying on the
> JDK on the host platform.  If JDK versions change across host platforms it
> will create build issues for the developers.  I think we could reduce the
> amount of issues that occur by relying on a tool chain to provide the
> needed versions of our build tools through Bazel.  I imagine this could be
> a bit of a task, so I think tackling Java first would be a good start to
> resolving this issue and allowing for cleaner builds.
>
> Any thoughts? Comments? Concerns?
>
> See below for a toolchains explanation:
>
> https://docs.bazel.build/versions/master/remote-execution-rules.html#invoking-build-tools-through-toolchain-rules
>
> -Josh
>


Re: [VOTE] Heron Release 0.20.0-incubating Candidate 5

2018-10-18 Thread Ning Wang
+1

Tests I have done:
- compiled ok, all unit tests passed
- installed CLI successfully
- example topology runs fine locally
- ui/tracker work fine and show topology info
- license scan with apache-rat (rat couldn't find license in cloudpickle.py
but it is there)
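
For anyone repeating these checks, the SHA-512 checksum published in the vote email can be compared against the downloaded tarball with a few lines of Python. This is a minimal sketch using only the standard library. The function names are my own, and the file path and expected digest are whatever you downloaded and copied from the vote email.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at path, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large tarballs do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare the file's digest with a published checksum, ignoring case and whitespace."""
    return sha512_of(path) == expected_hex.strip().lower()
```

Usage would be something like `matches("incubator-heron-v-0.20.0-incubating-candidate-5.tar.gz", published_checksum)`, where `published_checksum` is the string pasted from the vote email.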



On Tue, Oct 16, 2018 at 11:17 PM Ning Wang  wrote:

> Thanks!
>
> I am going to try it when I get time.
>
> On Tue, Oct 16, 2018 at 10:45 AM Neng Lu  wrote:
>
>> Hi All,
>>
>> This is the 5th release candidate for Apache Heron, version
>> 0.20.0-incubating. Thanks everyone for providing various feedback for the
>> previous release candidates.
>>
>> It is the starting point of Heron and contains heron's main features, such
>> as streaming
>> processing, stateful processing, streamlet API, API server, eco support,
>> etc.
>>
>> The full list of changes and fixes are available:
>>
>> https://github.com/apache/incubator-heron/compare/0.17.8...release/v-0.20.0-incubating
>>
>> *** Please download, test and vote on this release. This vote will stay
>> open
>> for at least 72 hours ***
>>
>> Source files:
>>
>> https://dist.apache.org/repos/dist/dev/incubator/heron/heron-0.20.0-incubating-candidate-5/
>>
>> SHA-512 checksums:
>>
>> 27890ab30fc3e69b627f47d58d178d1a7dffa9dbe4ebbb5a5aa77caaac882fdc2b6f98b3b76210020db0fa3fd86e294cba214f86072e449837e1b7615cd6124a
>> incubator-heron-v-0.20.0-incubating-candidate-5.tar.gz
>>
>> The tag to be voted upon:
>> v0.20.0-incubating-candidate-5 (45043bb6dcef1e8089c0834f17f8be0cc3f451d3)
>>
>> https://github.com/apache/incubator-heron/releases/tag/v-0.20.0-incubating-candidate-5
>>
>> Please download the source package, and follow the compiling guide
>> (https://apache.github.io/incubator-heron/docs/developers/compiling/compiling/)
>> to build and run Heron locally.
>>
>> --
>> Best Regards,
>> Neng
>>
>


Heron OSS Sync meeting

2018-10-22 Thread Ning Wang
Hi, heron devs,

The heron OSS sync meeting will be happening tomorrow at 2.00 pm PDT.
Please use the following hangout link:
https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0

Please reply to this email with your updates. Thanks!

Here are my updates:

- Added IStreamletOperator interfaces for existing bolts.
- First pass of CustomOperator support.


Re: Podling Report Reminder - November 2018

2018-10-24 Thread Ning Wang
I created a google doc based on the previous report. We can fill in
necessary information on it.

https://docs.google.com/document/d/1K0tHdolQ-Hfl5an1PXtrdSeL8posGfOPETk56pIk-KA/edit

On Mon, Oct 22, 2018 at 9:59 AM Neng Lu  wrote:

> Any update on this Podling Report task?
>
> On Sat, Oct 20, 2018 at 10:46 AM Ning  wrote:
>
> > Cool. Thanks!
> >
> > Sent from my iPhone
> >
> > > On Oct 20, 2018, at 10:13 AM, Josh Fischer 
> wrote:
> > >
> > > I’ve pasted in the report from August.  You can access where the
> template
> > > is to be input here —->
> > > https://wiki.apache.org/incubator/November2018
> > >
> > > Just know that you have to register a username with the site to be able
> > to
> > > edit the document.
> > >
> > > I’m sure Karthik will have a few of these answers, I do not think it
> > should
> > > be solely dependent on him.  I believe the purpose of the reports is to
> > > bring in the community to get feedback and help Heron graduate to a top
> > > level project by following the Apache way.
> > >
> > >
> > > > Heron
> > > >
> > > > A real-time, distributed, fault-tolerant stream processing engine.
> > > >
> > > > Heron has been incubating since 2017-06-23.
> > > >
> > > > Three most important issues to address in the move towards graduation:
> > > >
> > > >   1. Removal of binaries from code base (this is already completed)
> > > >   2. Moving from com.twitter.heron to org.apache.heron namespace (this
> > > >      is already completed)
> > > >   3. How to publish the binary artifacts into the Apache Maven repo
> > > >
> > > > Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
> > > > aware of?
> > > >
> > > > We had an issue with scheduling to get the report done on time.  We
> > > > will work to correct this for the next period.
> > > >
> > > > How has the community developed since the last report?
> > > >
> > > > The community has been increasing steadily.  The community has been
> > > > asking questions through the mailing list and slack channels.  The
> > > > supporting community has responded with answers to questions asked.
> > > > We have also had new individuals come into help with cleaning the
> > > > project from an outside perspective.  It's been very helpful to the
> > > > committers.
> > > >
> > > > How has the project developed since the last report?
> > > >
> > > > There have been mainly bug fixes and improvements to existing features.
> > > > Some to note are
> > > >
> > > > * Fixing issue with downloader for Nomad
> > > > * Updating to the latest Dhalion version
> > > > * Updating of Dockerfiles and docker build scripts
> > > > * Updates to the documentation
> > > > * Updates to Helm charts
> > > > * Added a S3 uploader
> > > >
> > > > How would you assess the podling's maturity?
> > > > Please feel free to add your own commentary.
> > > >
> > > >   [ ] Initial setup
> > > >   [X] Working towards first release
> > > >   [ ] Community building
> > > >   [ ] Nearing graduation
> > > >   [ ] Other:
> > > >
> > > > Date of last release: No Apache releases as of yet.
> > > >
> > > > When were the last committers or PPMC members elected?
> > > >
> > > > N/A.  Still working towards bootstrapping the project.
> > > >
> > > > Signed-off-by:
> > > >
> > > >   [ ](heron) Jake Farrell
> > > >      Comments:
> > > >   [ ](heron) Jacques Nadeau
> > > >      Comments:
> > > >   [ ](heron) Julien Le Dem
> > > >      Comments:
> > > >   [x](heron) P. Taylor Goetz
> > > >      Comments: I'm concerned about lack of activity on the public
> > > >      mailing lists. Decisions are being made, I just don't know where
> > > >      (Slack?). I'll be leaning on the podling to be more publicly
> > > >      transparent.
> > > >
> > > > IPMC/Shepherd notes:
> > >
> > >
> > >
> > >
> > >> On Sat, Oct 20, 2018 at 11:26 AM Ning Wang 
> > wrote:
> > >>
> > >> Is there a template? We can start a google doc and collaborate on it.
> > >> Karthik has been super duper busy so I am not sure if he will have the
> > >> time, but some questions might for him?
> > >>
> > >>> On Sat, Oct 20, 2018 at 6:05 AM Josh Fischer 
> > wrote:
> > >>>
> > >>> Hey All,
> > >>>
> > >>> Does anyone want to fill out the Podling Report?  I can help along
> the
> > >> way
> > >>> if questions come up.
> > >>>
> > >>> -Josh
> > >>>
> > >>>> On Sat, Oct 20, 2018 at 6:03 AM  wrote:
> > >>>>
> > >>>> Dear podling,
> > >>>>
> > >>>> This email was sent by an automated system on behalf of the Apache
> > >>>> Incubator PMC. It is an initial reminder to give you plenty of time
> to
>

Re: Podling Report Reminder - November 2018

2018-11-03 Thread Ning Wang
Yeah. We need help filling in the report, please. We need to submit ASAP.

For "Three most important issues to address in the move towards
graduation:", I don't have much experience with graduation. Maybe @Karthik
Ramasamy or @sijie could help?

"Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?"
No idea. This might be related to the previous question. Maybe we don't have
issues for now?

I added some answers to the "How has the community developed since the last
report?" question. Please feel free to add/update.




On Fri, Nov 2, 2018 at 6:53 AM Josh Fischer  wrote:

> We still need to finish the podling report. I don't have an answer for many
> of these questions this time around.  I'll need someone else to fill them
> in.  Once they are complete I can submit it for us.
>
>
>
> On Thu, Oct 25, 2018 at 12:13 AM Ning Wang  wrote:
>
> > I created a google doc based on the previous report. We can fill in
> > necessary information on it.
> >
> >
> >
> https://docs.google.com/document/d/1K0tHdolQ-Hfl5an1PXtrdSeL8posGfOPETk56pIk-KA/edit
> >
> > On Mon, Oct 22, 2018 at 9:59 AM Neng Lu  wrote:
> >
> > > Any update on this Podling Report task?
> > >
> > > On Sat, Oct 20, 2018 at 10:46 AM Ning  wrote:
> > >
> > > > Cool. Thanks!
> > > >
> > > > Sent from my iPhone
> > > >
> > > > > On Oct 20, 2018, at 10:13 AM, Josh Fischer 
> > > wrote:
> > > > >
> > > > > I’ve pasted in the report from August.  You can access where the
> > > template
> > > > > is to be input here —->
> > > > > https://wiki.apache.org/incubator/November2018
> > > > >
> > > > > Just know that you have to register a username with the site to be
> > able
> > > > to
> > > > > edit the document.
> > > > >
> > > > > I’m sure Karthik will have a few of these answers, I do not think
> it
> > > > should
> > > > > be solely dependent on him.  I believe the purpose of the reports
> is
> > to
> > > > > bring in the community to get feedback and help Heron graduate to a
> > top
> > > > > level project by following the Apache way.
> > > > >
> > > > >
> > > > > > Heron
> > > > > >
> > > > > > A real-time, distributed, fault-tolerant stream processing engine.
> > > > > >
> > > > > > Heron has been incubating since 2017-06-23.
> > > > > >
> > > > > > Three most important issues to address in the move towards
> > > > > > graduation:
> > > > > >
> > > > > >   1. Removal of binaries from code base (this is already completed)
> > > > > >   2. Moving from com.twitter.heron to org.apache.heron namespace
> > > > > >      (this is already completed)
> > > > > >   3. How to publish the binary artifacts into the Apache Maven repo
> > > > > >
> > > > > > Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to
> > > > > > be aware of?
> > > > > >
> > > > > > We had an issue with scheduling to get the report done on time.  We
> > > > > > will work to correct this for the next period.
> > > > > >
> > > > > > How has the community developed since the last report?
> > > > > >
> > > > > > The community has been increasing steadily.  The community has been
> > > > > > asking questions through the mailing list and slack channels.  The
> > > > > > supporting community has responded with answers to questions asked.
> > > > > > We have also had new individuals come into help with cleaning the
> > > > > > project from an outside perspective.  It's been very helpful to the
> > > > > > committers.
> > > > > >
> > > > > > How has the project developed since the last report?
> > > > > >
> > > > > > There have been mainly bug fixes and improvements to existing
> > > > > > features. Some to note are
> > > > > >
> > > > > > * Fixing issue with downloader for Nomad
> > > > > > * Updating to the latest Dhalion version
> > > > > > * Updating of Dockerfiles and docker build scripts
> > > > > > * Updates to the documentation
> > > > > > * Updates to Helm charts
> > > > > > * Added a S3 uploader
> > > > > >
> > > > > > How would you assess the podling's maturity?
> > > > > > Please feel free to add your own commentary.
> > > > > >
> > > > > >   [ ] Initial setup
> > > > > >   [X] Working towards first release
> > > > > >   [ ] Community building
> > > > > >   [ ] Nearing graduation
> > > > > >   [ ] Other:
> > > > > >
> > > > > > Date of last release: No Apache releases as of yet.
> > > > > >
> > > > > > When were the last committers or PPMC members elected?
> > > > > >
> > > > > > N/A.  Still working towards bootstrapping the project.
> > > > > >
> > > > > > Signed-off-by:
> > > > > >
> > > > > >   [ ](her

Re: [Mentors] Podling Report Reminder - November 2018

2018-11-04 Thread Ning Wang
For "Have your mentors been helpful and responsive or are things falling
through the cracks? " I think the answer is:

Definitely helpful and responsive.


On Sun, Nov 4, 2018 at 12:24 AM Ning Wang  wrote:

> Thanks!
>
> On Sat, Nov 3, 2018 at 3:42 PM Justin Mclean  wrote:
>
>> HI,
>>
>> JFYI the template for what goes into a report sometimes changes, so looking
>> at previous reports, modifying them, and copying and pasting into the report
>> document [1] may miss some things. For instance this question has been
>> added:
>> "Have your mentors been helpful and responsive or are things falling
>> through the cracks? In the latter case, please list any open issues
>> that need to be addressed."
>>
>> Thanks,
>> Justin
>>
>> 1. https://wiki.apache.org/incubator/November2018
>>
>


Re: [Mentors] Podling Report Reminder - November 2018

2018-11-04 Thread Ning Wang
Thanks~ :D

On Sun, Nov 4, 2018 at 5:11 AM Josh Fischer  wrote:

> Thanks Justin.  I didn’t realize that the forms change.  I’ll get a new
> draft sent out here in a bit.
>
> Ning,
>
> Good answer I’ll add it in!
>
>
> On Sun, Nov 4, 2018 at 1:25 AM Ning Wang  wrote:
>
> > For "Have your mentors been helpful and responsive or are things falling
> > through the cracks? " I think the answer is:
> >
> > Definitely helpful and responsive.
> >
> >
> > On Sun, Nov 4, 2018 at 12:24 AM Ning Wang  wrote:
> >
> > > Thanks!
> > >
> > > On Sat, Nov 3, 2018 at 3:42 PM Justin Mclean 
> wrote:
> > >
> > >> HI,
> > >>
> > >> JFYI the template for what goes into a report sometimes changes, so looking
> > >> at previous reports, modifying them, and copying and pasting into the report
> > >> document [1] may miss some things. For instance this question has been
> > >> added:
> > >> "Have your mentors been helpful and responsive or are things falling
> > >> through the cracks? In the latter case, please list any open issues
> > >> that need to be addressed."
> > >>
> > >> Thanks,
> > >> Justin
> > >>
> > >> 1. https://wiki.apache.org/incubator/November2018
> > >>
> > >
> >
> --
> Sent from A Mobile Device
>


Re: Podling Report Reminder - November 2018

2018-10-20 Thread Ning Wang
Is there a template? We can start a google doc and collaborate on it.
Karthik has been super duper busy so I am not sure if he will have the
time, but some questions might be for him?

On Sat, Oct 20, 2018 at 6:05 AM Josh Fischer  wrote:

> Hey All,
>
> Does anyone want to fill out the Podling Report?  I can help along the way
> if questions come up.
>
> -Josh
>
> On Sat, Oct 20, 2018 at 6:03 AM  wrote:
>
> > Dear podling,
> >
> > This email was sent by an automated system on behalf of the Apache
> > Incubator PMC. It is an initial reminder to give you plenty of time to
> > prepare your quarterly board report.
> >
> > The board meeting is scheduled for Wed, 21 November 2018, 10:30 am PDT.
> > The report for your podling will form a part of the Incubator PMC
> > report. The Incubator PMC requires your report to be submitted 2 weeks
> > before the board meeting, to allow sufficient time for review and
> > submission (Wed, November 07).
> >
> > Please submit your report with sufficient time to allow the Incubator
> > PMC, and subsequently board members to review and digest. Again, the
> > very latest you should submit your report is 2 weeks prior to the board
> > meeting.
> >
> > Candidate names should not be made public before people are actually
> > elected, so please do not include the names of potential committers or
> > PPMC members in your report.
> >
> > Thanks,
> >
> > The Apache Incubator PMC
> >
> > Submitting your Report
> >
> > --
> >
> > Your report should contain the following:
> >
> > *   Your project name
> > *   A brief description of your project, which assumes no knowledge of
> > the project or necessarily of its field
> > *   A list of the three most important issues to address in the move
> > towards graduation.
> > *   Any issues that the Incubator PMC or ASF Board might wish/need to be
> > aware of
> > *   How has the community developed since the last report
> > *   How has the project developed since the last report.
> > *   How does the podling rate their own maturity.
> >
> > This should be appended to the Incubator Wiki page at:
> >
> > https://wiki.apache.org/incubator/November2018
> >
> > Note: This is manually populated. You may need to wait a little before
> > this page is created from a template.
> >
> > Mentors
> > ---
> >
> > Mentors should review reports for their project(s) and sign them off on
> > the Incubator wiki page. Signing off reports shows that you are
> > following the project - projects that are not signed may raise alarms
> > for the Incubator PMC.
> >
> > Incubator PMC
> >
>


11/06/2018 Bi-Weekly OSS Heron Sync-up

2018-11-05 Thread Ning Wang
Hi All,

It has been two weeks since our last sync and it is time to share our
progress again. Let's share the work we have done over the last two weeks in
this thread, and see whether we need a hangout discussion.

From my side:
- Helped to prepare the Heron release and podling report
- Continued working on custom operator support. Cleaned up the first version
  and added grouping support.


Regards,
--ning


Re: BazelCon

2018-11-05 Thread Ning Wang
Nice intro! Thanks!

On Mon, Nov 5, 2018 at 12:49 PM Josh Fischer  wrote:

> Hey All,
>
> I did a talk at Google's Headquarters in NYC last October on how Apache
> Heron is built with Bazel. Here is the link if you are interested ~>
> https://www.youtube.com/watch?v=yBTSfA4YDtY=
>
> - Josh
>


Re: Building on MacOS 10.14

2018-11-15 Thread Ning Wang
Great!

On Thu, Nov 15, 2018 at 10:50 PM Dave Fisher  wrote:

> That worked!
>
> INFO: Elapsed time: 758.221s, Critical Path: 168.57s
> INFO: 2565 processes: 2235 local, 330 worker.
> INFO: Build completed successfully, 4380 total actions
>
>
> > On Nov 15, 2018, at 8:11 PM, Ning Wang  wrote:
> >
> > Agreed.
> >
> > For bazel, I think I manually download the specific version from
> > https://github.com/bazelbuild/bazel/releases and then install it. I
> > remember I upgraded and downgraded a few times but not very often.
> >
> >
> >
> > On Thu, Nov 15, 2018 at 2:26 PM Dave Fisher 
> wrote:
> >
> >> I am not sure how to get brew to give me the old version. I looked and
> do
> >> not see a way.
> >>
> >> So - more careful instructions next time.
> >>
> >> Regards,
> >> Dave
> >>
> >>> On Nov 15, 2018, at 2:13 PM, Ning Wang  wrote:
> >>>
> >>> bazel is not backward compatible. :(
> >>>
> >>> Please use 0.14.1 instead.
> >>>
> >>> This information is updated in the website source files a while ago
> but I
> >>> think it hasn't been deployed yet. It could be helpful to include this
> >>> information in the voting message next time.
> >>>
> >>> On Thu, Nov 15, 2018 at 11:26 AM Dave Fisher 
> >> wrote:
> >>>
> >>>> $ bazel version
> >>>> WARNING: Processed legacy workspace file
> >>>>
> >>
> /Users/davewave/Development/heron/incubator-heron-v-0.20.0-incubating-candidate-5/tools/bazel.rc.
> >>>> This file will not be processed in the next release of Bazel. Please
> >> read
> >>>> https://github.com/bazelbuild/bazel/issues/6319 for further
> >> information,
> >>>> including how to upgrade.
> >>>> Build label: 0.18.1-homebrew
> >>>> Build target:
> >>>>
> >>
> bazel-out/darwin-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
> >>>> Build time: Fri Nov 2 11:16:42 2018 (1541157402)
> >>>> Build timestamp: 1541157402
> >>>> Build timestamp as int: 1541157402
> >>>>
> >>>>
> >>>>> On Nov 15, 2018, at 11:23 AM, Ning Wang 
> wrote:
> >>>>>
> >>>>> Interesting.
> >>>>>
> >>>>> I am using 10.13 for my corp laptop. Will try at home.
> >>>>>
> >>>>> Which bazel version are you using? We are using 0.14.1.
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Thu, Nov 15, 2018 at 11:07 AM Dave Fisher 
> >>>> wrote:
> >>>>>
> >>>>>> Hi -
> >>>>>>
> >>>>>> In testing and building the RC5 on macOS 10.14 I encountered some
> >>>> issues:
> >>>>>>
> >>>>>> $ bazel build --config=darwin heron/...
> >>>>>> WARNING: Processed legacy workspace file
> >>>>>>
> >>>>
> >>
> /Users/davewave/Development/heron/incubator-heron-v-0.20.0-incubating-candidate-5/tools/bazel.rc.
> >>>>>> This file will not be processed in the next release of Bazel. Please
> >>>> read
> >>>>>> https://github.com/bazelbuild/bazel/issues/6319 for further
> >>>> information,
> >>>>>> including how to upgrade.
> >>>>>> Starting local Bazel server and connecting to it...
> >>>>>> DEBUG:
> >>>>>>
> >>>>
> >>
> /private/var/tmp/_bazel_davewave/f2432422fcc701440a82a59536536f46/external/bazel_tools/tools/osx/xcode_configure.bzl:87:9:
> >>>>>> Invoking xcodebuild failed, developer dir:
> >>>>>> /Applications/Xcode8.app/Contents/Developer ,return code 256,
> stderr:
> >>>>>> Process terminated by signal 6, stdout:
> >>>>>> ERROR:
> >>>>>>
> >>>>
> >>
> /Users/davewave/Development/heron/incubator-heron-v-0.20.0-incubating-candidate-5/heron/api/src/java/BUILD:8:1:
> >>>>>> every rule of type java_doc implicitly depends upon the target
> >>>>>> '@local_jdk//:jdk-default', but this target could not be found
> because
> >>>> of:
> >>>>>> no such target '@local_jdk//:jdk-default': target 'jdk-default' not
> >>>>>> declared in package '' (did you mean 'jre-default'?) defined by
> >>>>>>
> >>>>
> >>
> /private/var/tmp/_bazel_davewave/f2432422fcc701440a82a59536536f46/external/local_jdk/BUILD.bazel
> >>>>>> ERROR: Analysis of target '//heron/api/src/java:heron-api-javadoc'
> >>>> failed;
> >>>>>> build aborted: Analysis failed
> >>>>>> INFO: Elapsed time: 9.642s
> >>>>>> INFO: 0 processes.
> >>>>>> FAILED: Build did NOT complete successfully (287 packages loaded)
> >>>>>>
> >>>>>> I have several versions of Xcode on my system and I am not sure why
> >>>> Bazel
> >>>>>> is choosing the older Xcode 8?
> >>>>>>
> >>>>>> I did
> >>>>>> $ bazel clean --expunge
> >>>>>>
> >>>>>> And now I’m getting different results. Any suggestions?
> >>>>>>
> >>>>>> Regards,
> >>>>>> Dave:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >>
>
>


Re: [Mentors] Podling Report Reminder - November 2018

2018-11-05 Thread Ning Wang
Thanks!

On Mon, Nov 5, 2018 at 5:33 PM Dave Fisher  wrote:

> I signed off on the report.
>
> I did add that Ning Wang was elected as a committer and accepted on
> 11/1/2018.
>
> Regards,
> Dave
>
> > On Nov 5, 2018, at 5:00 PM, Josh Fischer  wrote:
> >
> > The podling report has been submitted.
> >
> > - Josh
> >
> > On Mon, Nov 5, 2018 at 6:30 PM Ning Wang  wrote:
> >
> >> SGTM. Thanks!
> >>
> >> On Mon, Nov 5, 2018 at 4:12 PM Dave Fisher 
> wrote:
> >>
> >>> No. Please go ahead and file.
> >>>
> >>> Sent from my iPhone
> >>>
> >>>> On Nov 5, 2018, at 4:01 PM, Ning Wang  wrote:
> >>>>
> >>>> LGTM~
> >>>>
> >>>> We need an approval from mentors before submission?
> >>>>
> >>>>> On Mon, Nov 5, 2018 at 6:58 AM Josh Fischer 
> >>> wrote:
> >>>>>
> >>>>> Hi All,
> >>>>>
> >>>>> Please review the updated podling report. Answers are in *bold*
> >>>>>
> >>>>> 
> >>>>> Heron
> >>>>>
> >>>>> A real-time, distributed, fault-tolerant stream processing engine.
> >>>>>
> >>>>> Heron has been incubating since 2017-06-23.
> >>>>>
> >>>>> Three most important issues to address in the move towards graduation:
> >>>>>
> >>>>> *1. Making the first Apache Release.  Vote for RC5 is currently on
> >>>>> general@incubator*
> >>>>> *  2. Making several Releases*
> >>>>> *  3. Continuing to grow the community*
> >>>>>
> >>>>> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
> >>>>> aware of?
> >>>>> *Not at this time*
> >>>>> How has the community developed since the last report?
> >>>>> *The community has been growing.  Two more monthly meetups have been
> >>>>> successfully organized so far after the initial one in April 2018.  We
> >>>>> have also seen more interest in Heron from different channels and are
> >>>>> getting more ideas from the community.  In October 2018 there was a
> >>>>> presentation given at BazelCon in New York City on building Apache Heron.*
> >>>>> How has the project developed since the last report?
> >>>>> * There have been mainly bug fixes and improvements to existing features.*
> >>>>> *Some to note are*
> >>>>> ** Work towards the Apache release; five release candidates have been
> >>>>> created.*
> >>>>> ** Updates for the licenses*
> >>>>> ** Updates to the documentation*
> >>>>>
> >>>>> ** New integration tests*
> >>>>>
> >>>>> ** New designs/work for the Streamlet API (Heron’s high-level DSL)*
> >>>>>
> >>>>>
> >>>>> How would you assess the podling's maturity?
> >>>>> Please feel free to add your own commentary.
> >>>>>
> >>>>> [ ] Initial setup
> >>>>> [*x*] Working towards first release
> >>>>> [*x*] Community building
> >>>>> [ ] Nearing graduation
> >>>>> [ ] Other:
> >>>>>
> >>>>> Date of last release:
> >>>>>
> >>>>> * No Apache releases as of yet. Latest RC was done on Oct 15th. [this
> >>>>> might change if our Apache release succeeds]*
> >>>>>
> >>>>> When were the last committers or PPMC members elected
> >>>>> *  None as of yet.*
> >>>>> Have your mentors been helpful and responsive or are things falling
> >>>>> through the cracks? In the latter case, please list any open issues
> >>>>> that need to be addressed.
> >>>>> *Mentors have been very responsive and helpful.*
> >>>>>
> >>>>>
> >>>>>> On Mon, Nov 5, 2018 at 1:11 AM Ning Wang 
> >> wrote:
> >>>>>>
> >>>>>> Thanks~ :D
> >>>>>>
> >>>>>>> On Sun, Nov 4, 2018 at 5:11 AM Josh Fischer 
> >>> wrote:
> >>>>>>>
> >>>>>>> Thanks Justin.  I didn’t realize that the forms change.  I’ll get a
> >>> new
> >>>>>>> draft sent out here in a bit.
> >>>>>>>
> >>>>>>> Ning,
> >>>>>>>
> >>>>>>> Good answer I’ll add it in!
> >>>>>>>
> >>>>>>>
> >>>>>>>> On Sun, Nov 4, 2018 at 1:25 AM Ning Wang 
> >>> wrote:
> >>>>>>>>
> >>>>>>>> For "Have your mentors been helpful and responsive or are things
> >>>>>> falling
> >>>>>>>> through the cracks? " I think the answer is:
> >>>>>>>>
> >>>>>>>> Definitely helpful and responsive.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Sun, Nov 4, 2018 at 12:24 AM Ning Wang 
> >>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Thanks!
> >>>>>>>>>
> >>>>>>>>> On Sat, Nov 3, 2018 at 3:42 PM Justin Mclean wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> JFYI the template for what goes into a report sometimes changes, so
> >>>>>>>>>> looking at previous reports, modifying them, and copying and pasting
> >>>>>>>>>> into the report document [1] may miss some things. For instance, this
> >>>>>>>>>> question has been added:
> >>>>>>>>>>
> >>>>>>>>>> "Have your mentors been helpful and responsive or are things falling
> >>>>>>>>>> through the cracks? In the latter case, please list any open issues
> >>>>>>>>>> that need to be addressed."
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Justin
> >>>>>>>>>>
> >>>>>>>>>> 1. https://wiki.apache.org/incubator/November2018
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>> --
> >>>>>>> Sent from A Mobile Device
> >>>>>>>
> >>>>>>
> >>>>>
> >>>
> >>>
> >>
>
>


Re: [Mentors] Podling Report Reminder - November 2018

2018-11-05 Thread Ning Wang
LGTM~

Do we need approval from the mentors before submission?

On Mon, Nov 5, 2018 at 6:58 AM Josh Fischer  wrote:

> Hi All,
>
> Please review the updated podling report. Answers are in *bold*
>
> 
> Heron
>
> A real-time, distributed, fault-tolerant stream processing engine.
>
> Heron has been incubating since 2017-06-23.
>
> Three most important issues to address in the move towards graduation:
>
>   *1. Making the first Apache Release.  Vote for RC5 is currently on
> general@incubator*
> *  2. Making several Releases*
> *  3. Continuing to grow the community*
>
> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
> aware of?
>   *Not at this time*
> How has the community developed since the last report?
>   *The community has been growing.  Two more monthly meetups have been
> successfully organized so far after the initial one in April 2018.  We have
> also seen more interest in Heron from different channels and are getting
> more ideas from the community.  In October 2018 there was a presentation
> given at BazelCon in New York City on building Apache Heron.*
> How has the project developed since the last report?
>  * There have been mainly bug fixes and improvements to existing features.*
> *Some to note are*
> ** Work towards the Apache release; five release candidates have been
> created.*
> ** Updates for the licenses*
> ** Updates to the documentation*
>
> ** New integration tests*
>
> ** New designs/work for the Streamlet API (Heron’s high-level DSL)*
>
>
> How would you assess the podling's maturity?
> Please feel free to add your own commentary.
>
>   [ ] Initial setup
>   [*x*] Working towards first release
>   [*x*] Community building
>   [ ] Nearing graduation
>   [ ] Other:
>
> Date of last release:
>
> * No Apache releases as of yet. Latest RC was done on Oct 15th. [this might
> change if our Apache release succeeds]*
>
> When were the last committers or PPMC members elected
> *  None as of yet.*
> Have your mentors been helpful and responsive or are things falling
> through the cracks? In the latter case, please list any open issues
> that need to be addressed.
>   *Mentors have been very responsive and helpful.*
>
>
> On Mon, Nov 5, 2018 at 1:11 AM Ning Wang  wrote:
>
> > Thanks~ :D
> >
> > On Sun, Nov 4, 2018 at 5:11 AM Josh Fischer  wrote:
> >
> > > Thanks Justin.  I didn’t realize that the forms change.  I’ll get a new
> > > draft sent out here in a bit.
> > >
> > > Ning,
> > >
> > > Good answer I’ll add it in!
> > >
> > >
> > > On Sun, Nov 4, 2018 at 1:25 AM Ning Wang  wrote:
> > >
> > > > For "Have your mentors been helpful and responsive or are things
> > falling
> > > > through the cracks? " I think the answer is:
> > > >
> > > > Definitely helpful and responsive.
> > > >
> > > >
> > > > On Sun, Nov 4, 2018 at 12:24 AM Ning Wang 
> > wrote:
> > > >
> > > > > Thanks!
> > > > >
> > > > > On Sat, Nov 3, 2018 at 3:42 PM Justin Mclean 
> > > wrote:
> > > > >
> > > > > >> Hi,
> > > > > >>
> > > > > >> JFYI the template for what goes into a report sometimes changes, so
> > > > > >> looking at previous reports, modifying them, and copying and pasting
> > > > > >> into the report document [1] may miss some things. For instance, this
> > > > > >> question has been added:
> > > > > >>
> > > > > >> "Have your mentors been helpful and responsive or are things falling
> > > > > >> through the cracks? In the latter case, please list any open issues
> > > > > >> that need to be addressed."
> > > > >>
> > > > >> Thanks,
> > > > >> Justin
> > > > >>
> > > > >> 1. https://wiki.apache.org/incubator/November2018
> > > > >>
> > > > >
> > > >
> > > --
> > > Sent from A Mobile Device
> > >
> >
>


Re: [Mentors] Podling Report Reminder - November 2018

2018-11-05 Thread Ning Wang
SGTM. Thanks!

On Mon, Nov 5, 2018 at 4:12 PM Dave Fisher  wrote:

> No. Please go ahead and file.
>
> Sent from my iPhone
>
> > On Nov 5, 2018, at 4:01 PM, Ning Wang  wrote:
> >
> > LGTM~
> >
> > Do we need approval from the mentors before submission?
> >
> >> On Mon, Nov 5, 2018 at 6:58 AM Josh Fischer 
> wrote:
> >>
> >> Hi All,
> >>
> >> Please review the updated podling report. Answers are in *bold*
> >>
> >> 
> >> Heron
> >>
> >> A real-time, distributed, fault-tolerant stream processing engine.
> >>
> >> Heron has been incubating since 2017-06-23.
> >>
> >> Three most important issues to address in the move towards graduation:
> >>
> >>  *1. Making the first Apache Release.  Vote for RC5 is currently on
> >> general@incubator*
> >> *  2. Making several Releases*
> >> *  3. Continuing to grow the community*
> >>
> >> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
> >> aware of?
> >>  *Not at this time*
> >> How has the community developed since the last report?
> >>  *The community has been growing.  Two more monthly meetups have been
> >> successfully organized so far after the initial one in April 2018.  We have
> >> also seen more interest in Heron from different channels and are getting
> >> more ideas from the community.  In October 2018 there was a presentation
> >> given at BazelCon in New York City on building Apache Heron.*
> >> How has the project developed since the last report?
> >> * There have been mainly bug fixes and improvements to existing features.*
> >> *Some to note are*
> >> ** Work towards the Apache release; five release candidates have been
> >> created.*
> >> ** Updates for the licenses*
> >> ** Updates to the documentation*
> >>
> >> ** New integration tests*
> >>
> >> ** New designs/work for the Streamlet API (Heron’s high-level DSL)*
> >>
> >>
> >> How would you assess the podling's maturity?
> >> Please feel free to add your own commentary.
> >>
> >>  [ ] Initial setup
> >>  [*x*] Working towards first release
> >>  [*x*] Community building
> >>  [ ] Nearing graduation
> >>  [ ] Other:
> >>
> >> Date of last release:
> >>
> >> * No Apache releases as of yet. Latest RC was done on Oct 15th. [this might
> >> change if our Apache release succeeds]*
> >>
> >> When were the last committers or PPMC members elected
> >> *  None as of yet.*
> >> Have your mentors been helpful and responsive or are things falling
> >> through the cracks? In the latter case, please list any open issues
> >> that need to be addressed.
> >>  *Mentors have been very responsive and helpful.*
> >>
> >>
> >>> On Mon, Nov 5, 2018 at 1:11 AM Ning Wang  wrote:
> >>>
> >>> Thanks~ :D
> >>>
> >>>> On Sun, Nov 4, 2018 at 5:11 AM Josh Fischer 
> wrote:
> >>>>
> >>>> Thanks Justin.  I didn’t realize that the forms change.  I’ll get a
> new
> >>>> draft sent out here in a bit.
> >>>>
> >>>> Ning,
> >>>>
> >>>> Good answer I’ll add it in!
> >>>>
> >>>>
> >>>>> On Sun, Nov 4, 2018 at 1:25 AM Ning Wang 
> wrote:
> >>>>>
> >>>>> For "Have your mentors been helpful and responsive or are things
> >>> falling
> >>>>> through the cracks? " I think the answer is:
> >>>>>
> >>>>> Definitely helpful and responsive.
> >>>>>
> >>>>>
> >>>>> On Sun, Nov 4, 2018 at 12:24 AM Ning Wang 
> >>> wrote:
> >>>>>
> >>>>>> Thanks!
> >>>>>>
> >>>>>> On Sat, Nov 3, 2018 at 3:42 PM Justin Mclean 
> >>>> wrote:
> >>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> JFYI the template for what goes into a report sometimes changes, so
> >>>>>>> looking at previous reports, modifying them, and copying and pasting
> >>>>>>> into the report document [1] may miss some things. For instance, this
> >>>>>>> question has been added:
> >>>>>>>
> >>>>>>> "Have your mentors been helpful and responsive or are things falling
> >>>>>>> through the cracks? In the latter case, please list any open issues
> >>>>>>> that need to be addressed."
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Justin
> >>>>>>>
> >>>>>>> 1. https://wiki.apache.org/incubator/November2018
> >>>>>>>
> >>>>>>
> >>>>>
> >>>> --
> >>>> Sent from A Mobile Device
> >>>>
> >>>
> >>
>
>


Re: Proposal: Streamlet Custom Operator

2018-09-28 Thread Ning Wang
Thanks~

On Thu, Sep 27, 2018 at 9:26 PM Karthik Ramasamy  wrote:

> Thanks Ning - Nice proposal, and it looks good to me.
>
> On Thu, Sep 27, 2018 at 11:25 AM Ning Wang  wrote:
>
> > Hi,
> >
> > I was trying to add support for reusing existing Bolts in the Streamlet
> > API last week and got some feedback about the feature and the
> > improvements. After reconsidering what I want to have, I think it can be
> > generalized a bit to a "Custom Operator".
> >
> > Here is a design doc to summarize what I have in mind and what I am
> > planning to do. Please feel free to comment:
> >
> >
> https://docs.google.com/document/d/1XzF0IlfuaaW8Gx3cPx1xLtP-kgCFK0TRNS5aAzuMuMg/edit#
> > .
> > All ideas are welcome.
> >
> > Thanks in advance!
> >
>


Re: [VOTE] Heron Release 0.20.0-incubating Candidate 4

2018-09-25 Thread Ning Wang
+1

My tests:

License check ok:
java -jar ~/Downloads/apache-rat-0.12.jar . -E .rat-excludes

Builds ok:
bazel build --compilation_mode=dbg --config=darwin heron/...
bazel build --compilation_mode=dbg --config=darwin scripts/packages:binpkgs
--verbose_failures

Unit tests and integration tests passed

CLI installed ok.

Started tracker/ui and deployed ExclamationTopology locally and it worked
and showed up in UI:
~/.heron/bin/heron submit local ~/.heron/examples/heron-api-examples.jar
org.apache.heron.examples.api.ExclamationTopology ExclamationTopology
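
The published SHA-512 checksum of the source tarball could also be checked before building. A minimal sketch in Python, assuming the tarball has already been downloaded; `sha512_hex` and `checksum_matches` are illustrative names and not part of any Heron tooling:

```python
import hashlib


def sha512_hex(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_hex):
    """Compare the file's digest with a published one, ignoring case and whitespace."""
    return sha512_hex(path) == expected_hex.strip().lower()
```

Usage would be along the lines of `checksum_matches("incubator-heron-v-0.20.0-incubating-candidate-4.tar.gz", published_checksum)`, with `published_checksum` copied from the vote email.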





On Mon, Sep 24, 2018 at 4:41 PM P. Taylor Goetz  wrote:

> Please keep release votes public. No need to cc private@. In fact,
> depending on order, including private@ can set the reply-to, as it did in
> this case.
>
> -Taylor
>
> > On Sep 24, 2018, at 5:18 PM, Neng Lu  wrote:
> >
> > Hi All,
> >
> > This is the 4th release candidate for Apache Heron, version
> > 0.20.0-incubating. Thanks to Dave Fisher and Ning Wang for providing
> > feedback on the previous release candidates.
> >
> > It is the starting point of Heron and contains Heron's main features,
> > such as stream processing, stateful processing, the Streamlet API, the
> > API server, eco support, etc.
> >
> > The full list of changes and fixes is available at:
> https://github.com/apache/incubator-heron/compare/0.17.8...release/v-0.20.0-incubating
> >
> > *** Please download, test and vote on this release. This vote will stay
> > open for at least 72 hours ***
> >
> > Source files:
> >
> https://dist.apache.org/repos/dist/dev/incubator/heron/heron-0.20.0-incubating-candidate-4/
> >
> > SHA-512 checksums:
> >
> b207bd181dc1960abc7393bb2af4f8c5428676174693cecd3b95216af1d3c6dcec76a07d5e91d0a1959f5523ca3fe79f310b8f9ad797252441d8e1be23aadca2
>
> > incubator-heron-v-0.20.0-incubating-candidate-4.tar.gz
> >
> > The tag to be voted upon:
> > v0.20.0-incubating-candidate-4 (a468699b180a44b411705172ae2ee4d981b3c162)
> >
> https://github.com/apache/incubator-heron/releases/tag/v-0.20.0-incubating-candidate-4
> >
> > Please download the source package and follow the compiling guide (
> https://apache.github.io/incubator-heron/docs/developers/compiling/compiling/)
> > to build and run Heron locally.
> >
> > --
> > Best Regards,
> > Neng
>


Re: Heron OSS Sync Meeting

2018-09-25 Thread Ning Wang
No objection so far.

Here are my updates:

- Discussed Machine Learning with Heron with Saikat. We are planning
to create a new target in the Apache Samoa project, similar to the existing
Storm support.
- Working on user Bolt/Spout support in the Streamlet API. I am stepping back a
bit and reconsidering the user Bolt as a Custom Operator to see if this can
make the support more straightforward. I would like to keep the type safety
feature, but it seems to be tricky.





On Mon, Sep 24, 2018 at 2:37 PM Neng Lu  wrote:

> +1 for this idea. We should make more use of the mailing list and use it as
> the main discussion place.
>
> The sync meeting should only play a supplementary role.
>
> On Mon, Sep 24, 2018 at 2:12 PM Ning Wang  wrote:
>
> > Hi,
> >
> >
> > The Heron OSS sync meeting will be happening tomorrow at 2:00 pm PDT.
> > Please use the following hangout link:
> >
> https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0
> >
> > There have been suggestions about communicating the regular biweekly
> > updates via this mailing list. It could be easier for some of us to
> > arrange time, and it might be easier for discussions.
> >
> > How about we send updates in this email thread by the end of tomorrow
> > and leave the sync meeting for discussions, if there are any? What do
> > you think?
> >
>
>
> --
> Best Regards,
> Neng
>


Discussion of the support of Bolt/Spout in Streamlet API

2018-09-19 Thread Ning Wang
Hi, all,

We had a discussion in this PR, but I feel it would be good to
gather more thoughts from other devs/users as well.

https://github.com/apache/incubator-heron/pull/3029#pullrequestreview-156614156


During Twitter's internal onboarding of the Streamlet API, I started to consider
supporting the low-level Bolt and Spout in the Streamlet API. I totally understand
the concern that Neng and Jerry raised in the PR: the Streamlet API is
no longer pure with Bolt/Spout support because it would expose low-level details.
However, I still feel that the advantages far outweigh the
disadvantages. The following are my comments in the PR:



Here are my thoughts:

Streamlet is not really the abstraction. My feeling is that Streamlet is
good at the DAG layer but not flexible enough at the low level (operators).
I would compare it to Scala vs. Java (not the same, just a rough analogy): Scala
has a nice functional API, but it would be pretty useless in real life if
procedural code were not allowed/supported.

Two reasons:

   1. Migration is one major reason. There are quite a few existing
   topologies written in the low-level API (for Heron and Storm). Streamlet is
   only friendly to new users; existing code such as KafkaSpout (it is a spout,
   but the same issue applies) in Storm and some ML bolts has to be rewritten
   to gain the readability/maintainability advantages.
   2. Bolts/Spouts are more flexible. They can do a lot more than a function
   provided to the Streamlet API (initialization, config, checkpoint, etc.). For
   example, stateful processing and component configs are not
   currently supported by Streamlet, and if we add these features, it is likely
   that users will have to provide extra functions as parameters and the
   Streamlet API would become more and more complicated. The Streamlet API will
   evolve, but supporting Bolt/Spout could give us a lot of room to design a
   clean API.
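
To make the idea above concrete, here is a minimal sketch in Python of how a functional pipeline could accept a bolt-like operator alongside plain functions. It is not Heron's Streamlet API: `MiniStreamlet`, `apply_operator`, `CountingOperator`, and the `prepare`/`execute` methods are hypothetical names chosen only for illustration.

```python
class MiniStreamlet:
    """Toy stand-in for a streamlet: a source plus an ordered list of stages."""

    def __init__(self, source, stages=()):
        self._source = source
        self._stages = list(stages)

    def map(self, fn):
        # Functional stage: one input tuple -> exactly one output tuple.
        return MiniStreamlet(self._source, self._stages + [lambda t: [fn(t)]])

    def filter(self, pred):
        # Functional stage: keep the tuple only if pred(t) is true.
        return MiniStreamlet(
            self._source, self._stages + [lambda t: [t] if pred(t) else []]
        )

    def apply_operator(self, op, config=None):
        # Procedural stage: a bolt-like object with its own setup and state.
        op.prepare(config or {})
        return MiniStreamlet(self._source, self._stages + [op.execute])

    def run(self):
        # Push all tuples through each stage in order; a stage may emit 0..n tuples.
        out = list(self._source)
        for stage in self._stages:
            out = [result for t in out for result in stage(t)]
        return out


class CountingOperator:
    """Hypothetical bolt-like operator: configured once, keeps state across tuples."""

    def prepare(self, config):
        self.every = config.get("every", 2)
        self.seen = 0

    def execute(self, tup):
        # Emit every N-th tuple together with its running index.
        self.seen += 1
        return [(self.seen, tup)] if self.seen % self.every == 0 else []


result = (
    MiniStreamlet(["a", "b", "c", "d"])
    .map(str.upper)
    .apply_operator(CountingOperator(), {"every": 2})
    .run()
)
```

In this sketch `map` and `filter` stay purely functional, while initialization, config, and state live inside the operator instead of being passed to the API as extra function parameters.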




Re: Discussion of the support of Bolt/Spout in Streamlet API

2018-09-19 Thread Ning Wang
Thanks for your input Josh!

Sanjeev has a comment in the PR to improve it. I am going to try it out. At
the same time, please feel free to reply with your concerns or suggestions.
Thanks in advance.

On Wed, Sep 19, 2018 at 2:00 PM Josh Fischer  wrote:

> I can understand why some would not want to mix the two APIs as they each
> stand for a different concept.  I also have found in my own experience the
> streamlet API to be limiting in some cases.  For example I couldn't find a
> way to implement a specific grouping between Streamlets in a case where I
> wanted fine grained control on what data was sent over different instances
> of a Streamlet (of course this is probably part of the abstraction).  I
> like the low level control you have with the spout and bolt implementations
> and think it would be nice to have the flexibility to choose when you want
> to take fine grained control if using the Streamlet API.
>
>
>
> On Wed, Sep 19, 2018 at 12:22 PM Ning Wang  wrote:
>
> > Hi, all,
> >
> > We had a discussion in this PR, but I feel it would be good to
> > gather more thoughts from other devs/users as well.
> >
> >
> >
> https://github.com/apache/incubator-heron/pull/3029#pullrequestreview-156614156
> >
> >
> > During Twitter's internal onboarding of the Streamlet API, I started to
> > consider supporting the low-level Bolt and Spout in the Streamlet API. I
> > totally understand the concern that Neng and Jerry raised in the PR: the
> > Streamlet API is no longer pure with Bolt/Spout support because it would
> > expose low-level details. However, I still feel that the advantages far
> > outweigh the disadvantages. The following are my comments in the PR:
> >
> > 
> >
> > Here are my thoughts:
> >
> > Streamlet is not really the abstraction. My feeling is that Streamlet is
> > good at the DAG layer but not flexible enough at the low level
> > (operators). I would compare it to Scala vs. Java (not the same, just a
> > rough analogy): Scala has a nice functional API, but it would be pretty
> > useless in real life if procedural code were not allowed/supported.
> >
> > Two reasons:
> >
> >    1. Migration is one major reason. There are quite a few existing
> >    topologies written in the low-level API (for Heron and Storm).
> >    Streamlet is only friendly to new users; existing code such as
> >    KafkaSpout (it is a spout, but the same issue applies) in Storm and
> >    some ML bolts has to be rewritten to gain the
> >    readability/maintainability advantages.
> >    2. Bolts/Spouts are more flexible. They can do a lot more than a
> >    function provided to the Streamlet API (initialization, config,
> >    checkpoint, etc.). For example, stateful processing and component
> >    configs are not currently supported by Streamlet, and if we add these
> >    features, it is likely that users will have to provide extra functions
> >    as parameters and the Streamlet API would become more and more
> >    complicated. The Streamlet API will evolve, but supporting Bolt/Spout
> >    could give us a lot of room to design a clean API.
> >
> > 
> >
>


Heron OSS Sync Meeting

2018-09-24 Thread Ning Wang
Hi,


The Heron OSS sync meeting will be happening tomorrow at 2:00 pm PDT.
Please use the following hangout link:
https://hangouts.google.com/hangouts/_/streaml.io/oss-heron-sync?authuser=0

There have been suggestions about communicating the regular biweekly updates via
this mailing list. It could be easier for some of us to arrange time, and
it might be easier for discussions.

How about we send updates in this email thread by the end of
tomorrow and leave the sync meeting for discussions, if there are any? What do
you think?


Re: Heron Spouts Code

2019-01-16 Thread Ning Wang
+Siming

On Tue, Jan 15, 2019 at 11:35 PM Ning Wang  wrote:

> Hi, all,
>
> A few of us (Spencer, Saikat, Siming, Karthik, Josh, Sree) discussed today
> in our general Slack channel that we should keep spout code somewhere so
> that people can reuse it (spouts are highly reusable in general) and
> contribute improvements. This is just a recap of the idea and some updates.
>
> We have two options:
> 1. Add a spouts/ dir in the Heron project.
> 2. Create a new project on GitHub.
>
> Option 1 is easy to start with, but iteration and releases will be
> coupled with the Heron project itself. There will likely be quite a lot of
> activity around spouts from time to time as new spouts are added. Also,
> Heron itself is basically the engine plus APIs and tooling, while
> there could be quite a few spouts in the future with many new dependencies
> such as Kafka, pubsub, neo4j, neptune, etc. It is debatable whether spout
> implementations belong in the Heron project, and these extra dependencies
> could add some unnecessary complexity.
>
> Option 2 requires some work up front, but it will be much easier
> to manage and evolve. There will also be fewer concerns about new spouts (in
> different languages) and dependencies, because spouts are relatively
> independent of each other and we may generate artifacts per spout.
>
> Overall, most people prefer option 2 for its cleanness.
>
> I talked with the Twitter OSS team. They are happy to support the initiative
> and suggested that we check with the Apache team to see what the best process
> is. The first question is whether this new side project should be under
> Apache or not. This might be a question for the mentors. What do you
> think/suggest?
>
> Another topic being discussed is the build tool, in case we decide to
> create a new side project. Maven is more mature for sure, but we will
> likely need multi-language support, so currently Bazel seems to be the
> winner (I personally vote for Bazel 1.0 because backward compatibility
> has been bad so far).
>
> Regards,
> --ning
>


Re: Heron Spouts Code

2019-01-16 Thread Ning Wang
This is an option. I have a few concerns about it:
- There will be a lot of repos, which will be messy to manage and might
make it harder for users to find a given spout. I expect more than ten
(different services times different languages).
- There will be some duplicated code such as build/release configs,
scripts, etc.

I think we should be able to address the first reason with a single repo.
Different spouts should likely be in different folders, and they can evolve
separately.
The second reason is valid, but duplicated code is a side effect.
The third reason depends on the build tool, I feel. Bazel is powerful, but it
keeps changing from time to time. :(

Just my two cents.





On Wed, Jan 16, 2019 at 8:09 PM Simon Weng  wrote:

> Hi, all:
>
> Could having a separate repo for each type of spout also be one of the
> options? The reasons it is worth considering are:
>
> 1. Allow each spout to evolve and release at its own pace, because each
> is technically driven by external source software. For example, the
> community may need different versions of the Kafka Spout to be compatible
> with the Kafka clusters they have deployed in production
> 2. Allow each spout project to use the de facto build tool that suits the
> external SDK best. This will help to minimize the learning curve for
> contributors who specialize in different source software stacks
> 3. Simplify the maintenance of the build and CI
>
> I’m not familiar with the capabilities of Bazel, so I’m certainly not
> against it. If it can help to achieve some of the above, I guess one single
> repo will also work then.
>
> SiMing
>
> On Wed, Jan 16, 2019 at 5:34 PM Ning Wang  wrote:
>
>> +Siming
>>
>> On Tue, Jan 15, 2019 at 11:35 PM Ning Wang  wrote:
>>
>>> Hi, all,
>>>
>>> A few of us (Spencer, Saikat, Siming, Karthik, Josh, Sree) discussed
>>> today in our general Slack channel that we should keep spout code
>>> somewhere so that people can reuse it (spouts are highly reusable in
>>> general) and contribute improvements. This is just a recap of the idea and
>>> some updates.
>>>
>>> We have two options:
>>> 1. Add a spouts/ dir in the Heron project.
>>> 2. Create a new project on GitHub.
>>>
>>> Option 1 is easy to start with, but iteration and releases will be
>>> coupled with the Heron project itself. There will likely be quite a lot of
>>> activity around spouts from time to time as new spouts are added. Also,
>>> Heron itself is basically the engine plus APIs and tooling, while
>>> there could be quite a few spouts in the future with many new dependencies
>>> such as Kafka, pubsub, neo4j, neptune, etc. It is debatable whether spout
>>> implementations belong in the Heron project, and these extra dependencies
>>> could add some unnecessary complexity.
>>>
>>> Option 2 requires some work up front, but it will be much
>>> easier to manage and evolve. There will also be fewer concerns about new
>>> spouts (in different languages) and dependencies, because spouts are
>>> relatively independent of each other and we may generate artifacts per
>>> spout.
>>>
>>> Overall, most people prefer option 2 for its cleanness.
>>>
>>> I talked with the Twitter OSS team. They are happy to support the initiative
>>> and suggested that we check with the Apache team to see what the best
>>> process is. The first question is whether this new side project should be
>>> under Apache or not. This might be a question for the mentors. What do you
>>> think/suggest?
>>>
>>> Another topic being discussed is the build tool, in case we decide to
>>> create a new side project. Maven is more mature for sure, but we will
>>> likely need multi-language support, so currently Bazel seems to be the
>>> winner (I personally vote for Bazel 1.0 because backward compatibility
>>> has been bad so far).
>>>
>>> Any ideas or suggestions, please feel free to reply.
>>>
>>> Regards,
>>> --ning
>>>
>> --
> Sent from Gmail Mobile
>

