Re: 3/26/2019 Bi-Weekly OSS Heron Sync-up

2019-03-25 Thread Saikat Kanjilal
@Ning Wang <wangnin...@gmail.com> Done


From: Ning Wang 
Sent: Monday, March 25, 2019 2:52 PM
To: dev@heron.incubator.apache.org; Saikat Kanjilal
Subject: Re: 3/26/2019 Bi-Weekly OSS Heron Sync-up

@Saikat Kanjilal <sxk1...@hotmail.com> could you enable the comment 
permission? Thanks.

On Mon, Mar 25, 2019 at 1:50 PM Saikat Kanjilal 
<sxk1...@hotmail.com> wrote:
@Dave Fisher <dave2w...@comcast.net> & dev 
community Bump on this: 
https://docs.google.com/document/d/11rwES5gfqyogw0BE8o9cfg4G6ndRxYxlpp4uUptwaEw/edit



From: Dave Fisher <dave2w...@comcast.net>
Sent: Monday, March 25, 2019 12:54 PM
To: dev@heron.incubator.apache.org
Cc: Ning Wang
Subject: Re: 3/26/2019 Bi-Weekly OSS Heron Sync-up



Sent from my iPhone

> On Mar 25, 2019, at 10:51 AM, Saikat Kanjilal 
> <sxk1...@hotmail.com> wrote:
>
> My updates
> --Bazel computational upgrade complete
> --Design doc for neo4j spout for heron out for review awaiting community 
> feedback

Where is this document? A link would help!

Thanks,
Dave

> --Design doc for spark spout for heron in the works
>
> Regards
>
> 
> From: Ning Wang <wangnin...@gmail.com>
> Sent: Monday, March 25, 2019 10:38 AM
> To: dev@heron.incubator.apache.org
> Subject: 3/26/2019 Bi-Weekly OSS Heron Sync-up
>
> Hi, all~,
>
> It has been two weeks since our last sync and quite a lot of work has been
> done! Let's share our work from the last two weeks in this thread.
>
> My updates (mostly internal work, not much on Apache side):
> - Release 0.20.1-incubating-rc1 was sent out for vote and a license issue
> was found. Preparing 0.20.1-incubating-rc2 currently.
> - Small update in Heron UI.
> - Make cppcheck independent of pcre.
>
> Regards,
> --ning



Re: 3/26/2019 Bi-Weekly OSS Heron Sync-up

2019-03-25 Thread Saikat Kanjilal
@Dave Fisher <dave2w...@comcast.net> & dev community Bump on this: 
https://docs.google.com/document/d/11rwES5gfqyogw0BE8o9cfg4G6ndRxYxlpp4uUptwaEw/edit



From: Dave Fisher 
Sent: Monday, March 25, 2019 12:54 PM
To: dev@heron.incubator.apache.org
Cc: Ning Wang
Subject: Re: 3/26/2019 Bi-Weekly OSS Heron Sync-up



Sent from my iPhone

> On Mar 25, 2019, at 10:51 AM, Saikat Kanjilal  wrote:
>
> My updates
> --Bazel computational upgrade complete
> --Design doc for neo4j spout for heron out for review awaiting community 
> feedback

Where is this document? A link would help!

Thanks,
Dave

> --Design doc for spark spout for heron in the works
>
> Regards
>
> 
> From: Ning Wang 
> Sent: Monday, March 25, 2019 10:38 AM
> To: dev@heron.incubator.apache.org
> Subject: 3/26/2019 Bi-Weekly OSS Heron Sync-up
>
> Hi, all~,
>
> It has been two weeks since our last sync and quite a lot of work has been
> done! Let's share our work from the last two weeks in this thread.
>
> My updates (mostly internal work, not much on Apache side):
> - Release 0.20.1-incubating-rc1 was sent out for vote and a license issue
> was found. Preparing 0.20.1-incubating-rc2 currently.
> - Small update in Heron UI.
> - Make cppcheck independent of pcre.
>
> Regards,
> --ning



Re: 3/26/2019 Bi-Weekly OSS Heron Sync-up

2019-03-25 Thread Saikat Kanjilal
My updates
--Bazel computational upgrade complete
--Design doc for neo4j spout for heron out for review awaiting community 
feedback
--Design doc for spark spout for heron in the works

Regards


From: Ning Wang 
Sent: Monday, March 25, 2019 10:38 AM
To: dev@heron.incubator.apache.org
Subject: 3/26/2019 Bi-Weekly OSS Heron Sync-up

Hi, all~,

It has been two weeks since our last sync and quite a lot of work has been
done! Let's share our work from the last two weeks in this thread.

My updates (mostly internal work, not much on Apache side):
- Release 0.20.1-incubating-rc1 was sent out for vote and a license issue
was found. Preparing 0.20.1-incubating-rc2 currently.
- Small update in Heron UI.
- Make cppcheck independent of pcre.

Regards,
--ning


Fwd: heron doc

2019-03-22 Thread Saikat Kanjilal
Hi Folks,
I’ve started the design of a heron spout for connected data use cases with 
neo4j (see below) and would love to get iterative feedback from the 
community.

https://docs.google.com/document/d/11rwES5gfqyogw0BE8o9cfg4G6ndRxYxlpp4uUptwaEw/edit

Thanks in advance, and ping me on either the dev list or slack with any 
questions.

Best Regards


Re: 3/12/2019 Bi-Weekly OSS Heron Sync-up

2019-03-12 Thread Saikat Kanjilal
Welcoming Rohan to the bazel upgrade efforts and keeping discussions humming on 
machine learning initiatives around heron; still defining use cases for this 

Sent from my iPhone

> On Mar 12, 2019, at 2:48 PM, Ning Wang  wrote:
> 
> Friendly ping~
> 
>> On Mon, Mar 11, 2019 at 11:42 AM Neng Lu  wrote:
>> 
>> Hi All,
>> 
>> My updates:
>> - I'm finishing the process of a security vulnerability issue.
>> 
>> On Mon, Mar 11, 2019 at 11:19 AM FatJ Love 
>> wrote:
>> 
>>> My update
>>> - Worked with OSS team about the March meetup
>>> 
>>>> On Mon, Mar 11, 2019 at 10:39 AM Ning Wang  wrote:
>>>> 
>>>> Hi, all~,
>>>> 
>>>> It has been two weeks since our last sync and it is time to share our
>>>> progress again. Let's share our work from the last two weeks in this
>>>> thread.
>>>> 
>>>> My updates (mostly internal work, not much on Apache side):
>>>> - Worked with OSS team about the April meetup
>>>> - Maven artifacts for Apache release 0.20.1 rc1 are uploaded to the Apache
>>>> maven repo:
>>>> https://repository.apache.org/content/repositories/staging/org/apache/heron/
>>>> Working on the binary packages now. The current discussion about file
>>>> hosting can be found here:
>>>> http://mail-archives.apache.org/mod_mbox/heron-dev/201903.mbox/%3CCAHZ_pm6Dh0mtRdRZL07jMPKZitr%3DNmQLFq7ZuDzL_jGRR%2Btxxw%40mail.gmail.com%3E
>>>> 
>>>> Regards,
>>>> --ning
>>>> 
>>> 
>> 
>> 
>> --
>> Best Regards,
>> Neng
>> 


Re: Incubator Podling Report (Due 6th February)

2019-02-04 Thread Saikat Kanjilal
Should we mention something around the progress over the bazel upgrade 
initiative or is that too specific?
Otherwise looks fine.

Sent from my iPhone

> On Feb 4, 2019, at 12:12 PM, Ning Wang  wrote:
> 
> Thanks!
> 
>> On Mon, Feb 4, 2019 at 12:02 PM Karthik Ramasamy  wrote:
>> 
>> Looks good to me the podling report
>> 
>>> On Mon, Feb 4, 2019 at 11:06 AM Ning Wang  wrote:
>>> 
>>> Hi,
>>> 
>>> Ok. Sounds good. Thanks!
>>> 
>>> @Karthik Ramasamy  Added Ali, Jerry and me to the
>>> report. Is there any others?
>>> 
>>> @Sree V  We should send the notification to dev@
>>> as well next time.
>>> 
>>> 
>>> On Mon, Feb 4, 2019 at 10:14 AM Dave Fisher 
>>> wrote:
>>> 
>>>> Hi -
>>>> 
>>>> Please post this into the wiki and we can finish discussion.
>>>> 
>>>> One point is that the monthly meetups are seldom discussed on the dev
>>>> list. There should be discussion there reminding interested parties that
>>>> these meetups are happening and what the discussion topics and talks are
>>>> going to be.
>>>> 
>>>> Also the names of the committers who accepted should be provided.
>>>> 
>>>> Regards,
>>>> Dave
>>>> 
>>>>> On Feb 4, 2019, at 10:06 AM, Ning Wang  wrote:
>>>>> 
>>>>> Ping for review~
>>>>> 
>>>>> Deadline is approaching.
>>>>> 
>>>>> On Thu, Jan 31, 2019 at 10:38 PM Ning Wang  wrote:
>>>>> 
>>>>>> Put the content of community development. There is no TOADD any more.
>>>>>> 
>>>>>> Everyone, please feel free to update if anything is missing.
>>>>>> 
>>>>>> On Thu, Jan 31, 2019 at 4:56 PM Josh Fischer  wrote:
>>>>>> 
>>>>>>> Thanks for the reminder..  I have attached the link to the google doc that
>>>>>>> Ning put together..  Does it need anything additional?  Anything taken
>>>>>>> away?  Once we are ready I can submit it for us (unless someone else would
>>>>>>> like to do it).
>>>>>>> 
>>>>>>> https://docs.google.com/document/d/18-fn-m87lIafnKjueKQ89l_E6GXHMSb47P_hhCtmvw0/edit
>>>>>>> 
>>>>>>> On Thu, Jan 31, 2019 at 2:43 PM Justin Mclean  wrote:
>>>>>>> 
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> The incubator PMC would appreciate it if you could complete the podling
>>>>>>>> report on time. It's due on 6th February, in a few days.
>>>>>>>> 
>>>>>>>> It's best if you discuss the contents of the report on the list a week
>>>>>>>> before it is due and work collaboratively on it before submitting it.
>>>>>>>> 
>>>>>>>> It takes time to prepare the incubator report, have your mentors sign off
>>>>>>>> the report and for the board to review it, so it's best if you can get it
>>>>>>>> in early.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Justin
>>>>>>>> 


Re: Heron Spouts Code

2019-01-17 Thread Saikat Kanjilal
As an analogy, I looked at Hadoop, YARN and Spark to get some ideas; these 
components work together pretty seamlessly and have independent versioning.  
I really feel it’s up to the main engineers of each spout project to decide 
how to version things. As for telling users which version of heron to use, 
that’s typically specified in the readme or on the main site page for the 
spout.

My 2 cents.

Sent from my iPhone

> On Jan 17, 2019, at 5:29 AM, Simon Weng  wrote:
> 
> This is a good question. Each spout version must maintain a compatibility
> matrix {Spout Version, external SDK version, Heron API version}. It’s more
> of a documentation effort so that users have enough information to
> determine which one to pick, isn’t it?
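
The compatibility matrix described in the quoted message above could also be kept as data next to each spout, not only in the readme. A minimal sketch in Python, assuming made-up version strings and a hypothetical `spouts_for` helper (none of these come from a real spout release):

```python
from typing import List, NamedTuple


class Compat(NamedTuple):
    """One row of the {Spout Version, external SDK version, Heron API version} matrix."""
    spout_version: str
    sdk_version: str        # version of the external SDK, e.g. a Kafka client
    heron_api_version: str


# Hypothetical rows, for illustration only.
MATRIX: List[Compat] = [
    Compat("1.0.0", "1.1", "0.20.0"),
    Compat("1.1.0", "2.0", "0.20.0"),
    Compat("2.0.0", "2.0", "0.20.1"),
]


def spouts_for(sdk_version: str, heron_api_version: str) -> List[str]:
    """Return the spout versions documented as compatible with the given pair."""
    return [
        row.spout_version
        for row in MATRIX
        if row.sdk_version == sdk_version
        and row.heron_api_version == heron_api_version
    ]
```

A user running SDK 2.0 against Heron API 0.20.0 would call `spouts_for("2.0", "0.20.0")` and pick from the returned list; an empty list means no documented combination exists.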
> 
>> On Thu, Jan 17, 2019 at 7:48 AM Josh Fischer  wrote:
>> 
>> If we were to go with a separate repo for the spouts how would we version
>> it?  Would it be consistent with the Heron repo?  How would people know
>> what version spout to use with the Heron version they are running?
>> 
>> 
>>> On Thu, Jan 17, 2019 at 1:26 AM Ning Wang  wrote:
>>> 
>>> This is an option. I have a few concerns about it:
>>> - There will be a lot of repos and it will be messy to manage and it might
>>> be harder for users to find it. I am expecting at least more than ten
>>> (different services times different languages).
>>> - There will be some duplicated code such as build/release configs,
>>> scripts. etc.
>>> 
>>> I think we should be able to achieve the first reason with a single repo.
>>> Different spouts should likely be in different folders and they can evolve
>>> separately.
>>> The second reason is valid, but duplicated code is a side effect.
>>> The third reason depends on the build tool, I feel. Bazel is powerful, but
>>> it keeps changing over time. :(
>>> 
>>> Just my two cents.
>>> 
>>> 
>>> 
>>> 
>>> 
>>>> On Wed, Jan 16, 2019 at 8:09 PM Simon Weng  wrote:
>>>> 
>>>> Hi, all:
>>>> 
>>>> Can it also be one of the options to even have separate repo for each type
>>>> of spouts? The reasons it is worth considering are:
>>>> 
>>>> 1. Allow each spout to evolve and release in different pace because each
>>>> is technically driven by external source software. For example, the
>>>> community may need different versions of the Kafka Spout to be compatible
>>>> with their deployed Kafka cluster in production
>>>> 2. Allow each spout project to use the de facto build tool that suits the
>>>> external SDK best. This will help to minimize the learning curve for
>>>> contributors who specialize in different source software stack
>>>> 3. Simplify the maintenance of the build and CI
>>>> 
>>>> I’m not familiar with the capability of Bazel, so certainly I’m not
>>>> against it. If it can help to achieve some of the above, I guess one single
>>>> repo will also work then.
>>>> 
>>>> SiMing
>>>> 
>>>>> On Wed, Jan 16, 2019 at 5:34 PM Ning Wang  wrote:
>>>>> 
>>>>> +Siming
>>>>> 
>>>>> On Tue, Jan 15, 2019 at 11:35 PM Ning Wang  wrote:
>>>>> 
>>>>>> Hi, all,
>>>>>> 
>>>>>> A few of us (Spencer, Saikat, Siming, Karthik, Josh, Sree) discussed
>>>>>> today in our general slack channel that we should have spouts code
>>>>>> somewhere so that people can reuse them (spouts are highly reusable in
>>>>>> general) and contribute improvements. This is just a recap of the idea and
>>>>>> some updates.
>>>>>> 
>>>>>> We have two options:
>>>>>> 1. add a spouts/ dir in heron project.
>>>>>> 2. create a new project in github.
>>>>>> 
>>>>>> For option 1, it is easy to start. But the iteration and release will be
>>>>>> coupled with Heron project itself. It is likely there will be quite some
>>>>>> activity around spouts from time to time when new spouts are added. Also,
>>>>>> Heron itself is basically the engine itself plus APIs and tooling, while
>>>>>> there could be quite some spouts in future with many new dependencies like
>>>>>> Kafka, pubsub, neo4j and neptune, etc. It is debatable to have spout
>>>>>> implementations in Heron project, and these extra dependencies could add
>>>>>> some unnecessary complexity.
>>>>>> 
>>>>>> For option 2, there will be some work up front, but it will be much
>>>>>> easier to manage and evolve. And there will be fewer concerns about new
>>>>>> spouts (in different languages) and dependencies because spouts are
>>>>>> relatively independent of each other and we may generate artifacts per
>>>>>> spout.
>>>>>> 
>>>>>> Overall most people prefer option 2 for its cleanness.
>>>>>> 
>>>>>> I talked with Twitter OSS team. They are happy to support the initiative
>>>>>> and suggest us to check with Apache team and see what is the best process.
>>>>>> First question is that should this new side project be under Apache or not?
>>>>>> This might be a question to mentors. What do you think/suggest?
>>>>>> 
>>>>>> Another topic being discussed is the build tool in case we decide to

Re: ML in Heron weekly meeting

2018-07-05 Thread Saikat Kanjilal
@Dave Apache Samoa seemed like a good starting point as they’ve already 
implemented a set of algorithms as storm topologies. Additionally, regardless of 
whether that community is active, we could still take that code and iterate on 
it within heron.  To that end I feel the biggest 
challenge with the machine learning initiative is figuring out the exact use 
cases and operationalizing ml topologies within heron.

Sent from my iPhone

> On Jul 5, 2018, at 3:11 PM, Ning Wang  wrote:
> 
> Hmm. Good question. Maybe not yet reaching out.
> 
> 
> 
>> On Thu, Jul 5, 2018 at 11:49 AM, Dave Fisher  wrote:
>> 
>> Hi -
>> 
>> Has anyone reached out to the SAMOA podling? Or is their architecture
>> inverted from that being proposed? I’m not sure how well the SAMOA community
>> is doing as they have had low activity since early this year.
>> 
>> Regards,
>> Dave
>> 
>>> On Jun 29, 2018, at 11:01 PM, Ning Wang  wrote:
>>> 
>>> Brief notes for the meeting on June 29:
>>> 
>>> - We need to hook up heron with Apache samoa. Saikat to create new issues
>>> in github.
>>> - Create a slack channel: #machine-learning
>>> - Let's add potential use cases in the design doc:
>>> https://docs.google.com/document/d/1LrO7XRcMxJoMM83wjRd-Ov74VAaomA_mXOAhCStgGng/edit
>>> 
>>> 
>>>> On Sat, Jun 23, 2018 at 3:44 PM, Ning Wang  wrote:
>>>> 
>>>> Brief notes for the meeting on June 22nd:
>>>> 
>>>> - still studying the documents.
>>>>   --- https://mapr.com/blog/monitoring-real-time-uber-data-using-spark-machine-learning-streaming-and-kafka-api-part-2/
>>>>   --- https://databricks.com/blog/2018/06/05/introducing-mlflow-an-open-source-machine-learning-platform.html
>>>>   --- https://eng.uber.com/michelangelo/
>>>> - stateful storage might need to be improved (data size) to support big
>>>> state objects which could be required by ML jobs.
>>>> 
>> 
>> 


Re: ML in Heron weekly meeting

2018-06-08 Thread Saikat Kanjilal
Hi Dave,
The Samoa piece is a bit tricky, the goal essentially is to take their storm 
components and enhance them to work within the heron storm subcomponent and 
eventually with heron streamlet architecture.  We chose Samoa because they have 
already built several machine learning topologies within storm.

Cheers

Sent from my iPhone

> On Jun 8, 2018, at 5:27 PM, Dave Fisher  wrote:
> 
> 
> 
> Sent from my iPhone
> 
>> On Jun 8, 2018, at 5:08 PM, Ning Wang  wrote:
>> 
>> Brief notes for today's meeting:
>> 
>> - Review DD:
>> https://docs.google.com/document/d/1LrO7XRcMxJoMM83wjRd-Ov74VAaomA_mXOAhCStgGng/edit
> 
> The document says copying Samoa. Heron should be working with the Samoa team 
> and being careful not to fork.
> 
> 
>> - We want to understand better about the bigger picture of ML in stream
>> processing systems.
>> -- talk to ML users
>> -- doc of related systems to read:
>>   ---
>> https://mapr.com/blog/monitoring-real-time-uber-data-using-spark-machine-learning-streaming-and-kafka-api-part-2/
>>   ---
>> https://databricks.com/blog/2018/06/05/introducing-mlflow-an-open-source-machine-learning-platform.html
>>   --- https://eng.uber.com/michelangelo/
> 
> In addition to talking to the vendors who are powered by Apache Spark, 
> directly talk to Apache Spark and Apache Kafka.
> 
> My 2cents.
> 
> Regards,
> Dave


Re: [DISCUSS] A design proposal for incorporating machine learning algorithms into heron

2018-05-09 Thread Saikat Kanjilal
Hi Folks,

I was thinking about how to drive this initiative and had some ideas around 
execution, would love some feedback:

1) While the discussion is happening around the design, I was thinking of 
building a small prototype with one of the algorithms. The prototype will be 
a first-cut representation of the design, in which we represent one algorithm as 
a storm topology. Looking at the list of algorithms that we're thinking 
about bringing over from samoa 
(https://samoa.incubator.apache.org/documentation/SAMOA-and-Machine-Learning.html),
distributed stream clustering looks the most valuable for a prototype. 
Thoughts?


2) I would like to leverage some of the ideas in Michelangelo as well as my 
previous experience in building a tool that versions, deploys and associates ML 
models with newly arriving windows of data. I feel like this is actually a 
completely orthogonal initiative that we also need to design out. Should this 
be part of the design doc at this point? Thoughts?

3) Should we address security in streaming machine learning models for the 
first release?

4) The design doc mentions a GenericMLOutputModelSink. I was thinking of this as 
a factory method that wraps the various sinks that already exist, which I'm 
hoping to leverage; see here: 
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_storm-component-guide/content/ch_storm-connectors.html
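
To make the factory-method idea in point 4 concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the class names, the `make_sink` function and the sink kinds are hypothetical and are not taken from the design doc or from any existing Heron or Storm connector.

```python
from typing import Callable, Dict, List


class ModelSink:
    """Hypothetical base type: receives model output records."""

    def write(self, record: dict) -> None:
        raise NotImplementedError


class InMemorySink(ModelSink):
    """Keeps records in a list; handy for tests."""

    def __init__(self) -> None:
        self.records: List[dict] = []

    def write(self, record: dict) -> None:
        self.records.append(record)


class StdoutSink(ModelSink):
    """Prints each record; stands in for a real external connector."""

    def write(self, record: dict) -> None:
        print(record)


# Registry mapping a sink kind to the constructor of an existing sink.
_SINKS: Dict[str, Callable[[], ModelSink]] = {
    "memory": InMemorySink,
    "stdout": StdoutSink,
}


def make_sink(kind: str) -> ModelSink:
    """Factory method: build the sink registered under `kind`."""
    try:
        return _SINKS[kind]()
    except KeyError:
        raise ValueError(f"unknown sink kind: {kind!r}") from None
```

The point of the registry is that adding a new sink only means registering one more constructor; callers keep using `make_sink` and never name a concrete class.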



@Karthik Ramasamy <kart...@streaml.io> et al., I would love to get thoughts 
on how we proceed with this initiative at this point. In the meantime I will 
get started with 1 to test out the feasibility of this design.

Regards

____
From: Saikat Kanjilal <sxk1...@hotmail.com>
Sent: Monday, May 7, 2018 2:31 PM
To: dev@heron.incubator.apache.org
Subject: [DISCUSS] A design proposal for incorporating machine learning 
algorithms into heron


Hello Dev community,

I have created the initial API design documentation around building storm 
topologies around a set of machine learning streaming algorithms here: 
https://docs.google.com/document/d/1LrO7XRcMxJoMM83wjRd-Ov74VAaomA_mXOAhCStgGng/edit?usp=sharing
This is very much a work in progress, but I wanted to start getting early 
feedback from the community, as it's a lot of complex operations representing a 
streaming ml pipeline using heron.   This design leverages apache samoa to 
figure out which algorithms to focus on bringing into heron.

Thank you Karthik Ramasamy for your mentoring on this. The goal will be to 
represent all the algorithms in phase 1 as storm topologies and then to evolve 
this into a streamlet based architecture. I would really appreciate some 
feedback from the community.

While you guys are commenting on the initial approach I will : 1) finish the 
design for the rest of the algorithms for phase 1 2) start the design for 
building out a heron streamlet based architecture to run on top of the storm 
based topologies.

Look forward to a productive discussion around the design



[DISCUSS] A design proposal for incorporating machine learning algorithms into heron

2018-05-07 Thread Saikat Kanjilal
Hello Dev community,

I have created the initial API design documentation around building storm 
topologies around a set of machine learning streaming algorithms here: 
https://docs.google.com/document/d/1LrO7XRcMxJoMM83wjRd-Ov74VAaomA_mXOAhCStgGng/edit?usp=sharing
This is very much a work in progress, but I wanted to start getting early 
feedback from the community, as it's a lot of complex operations representing a 
streaming ml pipeline using heron.   This design leverages apache samoa to 
figure out which algorithms to focus on bringing into heron.

Thank you Karthik Ramasamy for your mentoring on this. The goal will be to 
represent all the algorithms in phase 1 as storm topologies and then to evolve 
this into a streamlet based architecture. I would really appreciate some 
feedback from the community.

While you guys are commenting on the initial approach I will : 1) finish the 
design for the rest of the algorithms for phase 1 2) start the design for 
building out a heron streamlet based architecture to run on top of the storm 
based topologies.

Look forward to a productive discussion around the design



Proposal for design of the traits for the heron scala API--Looking for feedback from dev community

2018-01-30 Thread Saikat Kanjilal
Folks,
@karthikz asked me to publish this on the dev list, so here goes:

I've taken an initial stab below at getting the design doc for the scala API in 
shape:  
https://docs.google.com/document/d/1vIL4hVC4SwYU5YkP9cFSn6poJ0M1_XROrB-Kl50xDs4/edit
In doing so I've added the designs around the traits builder, 
source, sink, streamlet.

For now I've tackled only the source, sink, streamlet traits.  Once we can 
agree on the design for these we can tackle the higher level traits that depend 
on these.

The biggest open question in my mind is how we represent Java serializable 
functions that need to be passed into some of the traits; I will be researching 
methodologies to do this as folks add comments to the doc.  Given that the 
functions are passed in, the options we have are: 1) make the 
functions extend Serializable, or 2) keep them pure scala functions and use case 
classes to implement serializability transparently.


Looking forward to hearing back.


Re: PR out for scala port, please review

2018-01-20 Thread Saikat Kanjilal
All,

For number 2 please use this PR:  https://github.com/twitter/heron/pull/2673

I had to fix some issues that I ran into with git and fixed it by doing a brand 
new fork and rebase and then commit.



Thanks


____
From: Saikat Kanjilal <sxk1...@hotmail.com>
Sent: Saturday, January 20, 2018 1:56 PM
To: dev@heron.incubator.apache.org
Subject: PR out for scala port, please review

Hello Heron Experts et al,

To get the scala port off to a start I have done the following:

1) Reran the javatoscala converter (http://javatoscala.com/) on the latest java 
code since my upstream repo is the master branch for heron

2) Created a PR which is here:  https://github.com/skanjila/heron/pull/2  so 
please review this when you have a moment

3) Please note that number 2 is a very much a WIP and we will be changing the 
converter code per PR comments but in order to speedtrack this I felt like the 
converter will help get this rolling

4) I created the first implementation of the bazel build file that will link in 
the java implementations for streamlets , once this works I will then add the 
unit tests

5) To get this off to a reasonable start we picked the streamlets API, for 
future reference the design doc to track this is here: 
https://docs.google.com/document/d/1vIL4hVC4SwYU5YkP9cFSn6poJ0M1_XROrB-Kl50xDs4/edit

Look forward to getting this rolling and hearing back from the community.
Best Regards


Scala porting efforts resuming

2018-01-06 Thread Saikat Kanjilal
Hi Heron Folks,

I wanted to run something by the dev list. For the initial scala port of the 
streamlets API, I am targeting building scala interfaces using the java to scala 
tool and having the interfaces invoke the java implementations. The way the 
packaging is set up, all the java implementations currently live inside 
/Users/saikat.kanjilal/code/heron/heron/api/src/java/com/twitter/heron/streamlet/impl.
In Intellij I have made the scala directory an additional source directory and 
am planning on adding the bazel rules file to create a scala library.


Now my question is this: should we move the java implementation package to a 
central location so that the implementations can easily be accessed by both the 
java and scala interfaces? I really don't want to have 2 copies of the 
implementation files, and it seems awkward to leave the java implementations 
living where they currently do.


Here's a possible idea I was thinking about:


top level directory:  /api/src/java/com/twitter/heron/streamlet

java interface directory: /api/src/java/com/twitter/heron/streamlet/java

scala interface directory: /api/src/java/com/twitter/heron/streamlet/scala

implementation directory: /api/src/java/com/twitter/heron/streamlet/impl


With the above strategy I can tell Intellij to make the streamlet subdirectory 
be the src, not sure this actually matters when building a bazel related 
project.


Would love to have some input on this.

Thanks in advance.


Re: Notes on bp(back-pressure) discussion on Nov.17

2017-11-17 Thread Saikat Kanjilal
Hi Fu,
Thanks for the proposals, is it possible to outline the pros and cons that you 
guys discussed?  That would help the community in weighing in on this.
Thanks in advance

Sent from my iPhone

On Nov 17, 2017, at 4:15 PM, Fu Maosong wrote:

Ning, Huijun, Sanjeev and Maosong discussed
bp (back-pressure) today, and here are the notes:

*Two major issues of current bp algorithm*:
1. one single point of bp will stop the whole topology
2. when a topology is in bp state, the overall throughput of the whole
topology can drop

*Some proposals*:
For 1.
- load shedding -- for example, drop tuples in stmgr.
- better bp algo. -- no need to stop the whole topology; can just handle
the slowness to particular instances, for example, spill the buffer to the
disk
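
As a rough illustration of the load shedding proposal above, here is a minimal Python sketch of a bounded buffer that drops incoming tuples once it is full instead of signalling back-pressure upstream. The `SheddingBuffer` class and its tail-drop policy are assumptions made for this example; this is not how stmgr is implemented.

```python
from collections import deque


class SheddingBuffer:
    """Bounded FIFO that sheds (drops) new tuples when it is full."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._items = deque()
        self._capacity = capacity
        self.dropped = 0  # number of tuples shed so far

    def offer(self, tup) -> bool:
        """Enqueue `tup`; return False and count a drop if the buffer is full."""
        if len(self._items) >= self._capacity:
            self.dropped += 1
            return False
        self._items.append(tup)
        return True

    def poll(self):
        """Dequeue the oldest tuple, or return None if the buffer is empty."""
        return self._items.popleft() if self._items else None
```

Dropping the newest tuple is only one possible policy; dropping the oldest, or sampling, would be equally valid choices, and the `dropped` counter is there so the loss can be reported as a metric.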

For 2.
- Rate control on the source(spout) side. So even in bp state, the overall
throughput can be higher than normal
- Run-time scale-up
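
The rate control idea for the source (spout) side can be sketched with a token bucket. This is a minimal Python illustration only; the `TokenBucket` class, its parameters and the caller-supplied clock are hypothetical and are not part of Heron's spout API.

```python
class TokenBucket:
    """Token bucket: allows `rate_per_sec` emits per second, with bursts up to `burst`."""

    def __init__(self, rate_per_sec: float, burst: float) -> None:
        self.rate = rate_per_sec
        self.burst = burst
        self.tokens = burst   # start with a full bucket
        self.last = 0.0       # time of the previous call, in seconds

    def try_emit(self, now: float) -> bool:
        """Return True if the spout may emit one tuple at time `now`."""
        elapsed = max(0.0, now - self.last)
        # Refill in proportion to elapsed time, never above the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A spout loop would call `try_emit(time.monotonic())` before each emit and skip the emit when it returns False, so the source slows down on its own before downstream buffers fill up.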

Let me know if I am missing anything.

--
With my best Regards
--
Fu Maosong
Twitter Inc.
Mobile: +001-415-244-7520


Re: [Discuss] First cut of design doc ready for scala port of heron

2017-11-11 Thread Saikat Kanjilal
Yes, that’s the next large effort after this. The decision to port the streamlet 
API came from an earlier discussion thread on this topic.  Let’s see how the 
experience goes with this porting and as the next adventure we can then tackle 
the low level API, one step at a time :).

Sent from my iPhone

On Nov 11, 2017, at 12:50 PM, FatJ Love 
<huijun.wu.2...@gmail.com> wrote:

It seems the proposed Scala API is from Java streamlet API. Will we port
Java low level API to Scala as well ?

On Sat, Nov 11, 2017 at 10:48 AM, Karthik Ramasamy 
<kart...@streaml.io>
wrote:

Thanks Saikat. Let us give a few days for comments to come in.

On Sat, Nov 11, 2017 at 5:17 AM, Saikat Kanjilal 
<sxk1...@hotmail.com>
wrote:

Hello Heron community,

I have created the first cut of an API design doc for the Scala port for
heron here:


https://docs.google.com/document/d/1vIL4hVC4SwYU5YkP9cFSn6poJ0M1_XROrB-Kl50xDs4/edit


The doc has some high level objectives and a simple list of the Java
interfaces and code corresponding to the Scala port.  Please review in
detail and add feedback. For each addition of a Scala trait, I went
ahead and just did a basic compile of the code and from there added the
trait to the design doc. I want to nail down this design doc first
before making any deeper incisions into the code.  As a bare minimum I have
cleaned up each of the traits to have the correct package and removed
unneeded classes like JavaToScalaConversions where appropriate and added
the appropriate license on the top of each trait.  Once the feedback is
collected we can go about this two ways: have a discussion thread on the
apache list to vet the APIs, and in parallel I will tie the underlying
Java implementations to the upper level scala traits.   Do let me know your
thoughts either on this thread, in the doc or on the slack channel.


Thanks in advance for your help.




[Discuss] First cut of design doc ready for scala port of heron

2017-11-11 Thread Saikat Kanjilal
Hello Heron community,

I have created the first cut of an API design doc for the Scala port for heron 
here:


https://docs.google.com/document/d/1vIL4hVC4SwYU5YkP9cFSn6poJ0M1_XROrB-Kl50xDs4/edit



The doc has some high level objectives and a simple list of the Java interfaces 
and code corresponding to the Scala port.  Please review in detail and add 
feedback. For each Scala trait I added, I went ahead and did a basic compile 
of the code and from there added the trait to the design doc. I want to nail 
down this design doc first before making any deeper incisions into the code.  
As a bare minimum I have cleaned up each of the traits to have the correct 
package, removed unneeded classes like JavaToScalaConversions where 
appropriate, and added the appropriate license at the top of each trait.  Once 
the feedback is collected we can go about this two ways: have a discussion 
thread on the apache list to vet the APIs, and in parallel I will tie the 
underlying Java implementations to the upper level scala traits.   Do let me 
know your thoughts either on this thread, on the doc, or on the slack channel.


Thanks in advance for your help.


Re: Hello heron community

2017-10-31 Thread Saikat Kanjilal
Here's an initial PR showing the scala traits that I got as a result of running 
the javatoscala converter on the streamlets API. Please take a look and let me 
know if you have any feedback. Next steps include putting these in the design 
doc and then starting work on the implementations.  To reiterate, the work was 
done against a child branch in my fork of heron.


https://github.com/skanjila/heron/pull/1


Thanks



From: Saikat Kanjilal <sxk1...@hotmail.com>
Sent: Monday, October 30, 2017 9:01 PM
To: dev@heron.incubator.apache.org
Subject: Re: Hello heron community


Thanks Ashvin, and yes, I am aware of this from reading the documentation.  I 
have actually written a little component in reef that runs a simple reef job 
using spark running on yarn; I will be curious to look at this code when I have 
some more time.



From: aas@gmail.com <aas@gmail.com> on behalf of Ashvin A 
<aas...@gmail.com>
Sent: Monday, October 30, 2017 11:36 AM
To: dev@heron.incubator.apache.org
Subject: Re: Hello heron community

Hi Saikat,

Welcome !

Since you are familiar with REEF, you might want to know that the YARN port
of Heron is based on REEF. You can find some details here:
https://twitter.github.io/heron/docs/operators/deployment/schedulers/yarn/.




-ashvin

On Sun, Oct 29, 2017 at 10:19 PM, Karthik Ramasamy <kramas...@gmail.com>
wrote:

> Saikat -
>
> Welcome to the Heron Community. It will be great to see support for a Scala
> API. The current APIs come in two forms -
>
> - procedural/object oriented api (this is very low level where you have to
> write the DAG operators and assemble them) - somewhat similar to Storm
> - functional (this is higher level and uses functions like map,
> flatMap, etc)
>
> Here are a few suggestions and recommendations.
>
> - Join the slack for any quick questions and answers (which you have
> already done)
>
> - We use Github to track issues. In your case, it might be easier to
> track using projects - for your convenience
> I have created a project called Scala API for Heron -
> https://github.com/twitter/heron/projects/3
>
> - It will be great to have a design doc; I am adding this as a first task in
> the backlog. Once the design doc is complete, you can post it for review and
> comments from the community. Google doc is an easy way to share and get
> feedback unless there is another preference.
>
> - Once review is complete, you can create additional tasks and issues and
> link them to the project.
>
> Hope this helps to jump start.
>
> cheers
> /karthik
>
> > On Oct 29, 2017, at 6:17 PM, Saikat Kanjilal <sxk1...@hotmail.com>
> wrote:
> >
> > Hello Folks,
> >
> > I'm interested in taking and driving the following github component
> (this means design/development/architecture and more).
> >
> >
> > https://github.com/twitter/heron/issues/668
> >
> >
> >
> >
> >
> >
> > I've joined the heron slack channel as recommended and am ready to start
> > by creating a master JIRA issue to track this, if not already done. I would
> > like some guidance from a committer or two to help me take care of
> > the administrative details and get this off the ground.  My
> > background includes working on the apache reef codebase as well as dabbling
> > in apache mahout and apache flume as well as storm and various nosql/graph
> > databases.  Looking forward to being part of the community.
> >
> >
> >
> > The rough plan of action in my mind so you know:
> >
> > 1) Review heron architecture and codebase in detail
> >
> > 2) Create an umbrella JIRA with the first series of tasks
> >
> > 3) Come up with a design doc
> >
> > 4) Build the interfaces
> >
> > 5) Add more dev tasks to JIRA
> >
> > 6) Drive the implementation
> >
> >
> > Thanks in advance for your help.
> >
>
>


Re: Hello heron community

2017-10-30 Thread Saikat Kanjilal
Thanks Ashvin, and yes, I am aware of this from reading the documentation.  I 
have actually written a little component in reef that runs a simple reef job 
using spark running on yarn; I will be curious to look at this code when I have 
some more time.



From: aas@gmail.com <aas@gmail.com> on behalf of Ashvin A 
<aas...@gmail.com>
Sent: Monday, October 30, 2017 11:36 AM
To: dev@heron.incubator.apache.org
Subject: Re: Hello heron community

Hi Saikat,

Welcome !

Since you are familiar with REEF, you might want to know that the YARN port
of Heron is based on REEF. You can find some details here:
https://twitter.github.io/heron/docs/operators/deployment/schedulers/yarn/.




-ashvin

On Sun, Oct 29, 2017 at 10:19 PM, Karthik Ramasamy <kramas...@gmail.com>
wrote:

> Saikat -
>
> Welcome to the Heron Community. It will be great to see support for a Scala
> API. The current APIs come in two forms -
>
> - procedural/object oriented api (this is very low level where you have to
> write the DAG operators and assemble them) - somewhat similar to Storm
> - functional (this is higher level and uses functions like map,
> flatMap, etc)
>
> Here are a few suggestions and recommendations.
>
> - Join the slack for any quick questions and answers (which you have
> already done)
>
> - We use Github to track issues. In your case, it might be easier to
> track using projects - for your convenience
> I have created a project called Scala API for Heron -
> https://github.com/twitter/heron/projects/3
>
> - It will be great to have a design doc; I am adding this as a first task in
> the backlog. Once the design doc is complete, you can post it for review and
> comments from the community. Google doc is an easy way to share and get
> feedback unless there is another preference.
>
> - Once review is complete, you can create additional tasks and issues and
> link them to the project.
>
> Hope this helps to jump start.
>
> cheers
> /karthik
>
> > On Oct 29, 2017, at 6:17 PM, Saikat Kanjilal <sxk1...@hotmail.com>
> wrote:
> >
> > Hello Folks,
> >
> > I'm interested in taking and driving the following github component
> (this means design/development/architecture and more).
> >
> >
> > https://github.com/twitter/heron/issues/668
> >
> >
> >
> >
> >
> >
> > I've joined the heron slack channel as recommended and am ready to start
> > by creating a master JIRA issue to track this, if not already done. I would
> > like some guidance from a committer or two to help me take care of
> > the administrative details and get this off the ground.  My
> > background includes working on the apache reef codebase as well as dabbling
> > in apache mahout and apache flume as well as storm and various nosql/graph
> > databases.  Looking forward to being part of the community.
> >
> >
> >
> > The rough plan of action in my mind so you know:
> >
> > 1) Review heron architecture and codebase in detail
> >
> > 2) Create an umbrella JIRA with the first series of tasks
> >
> > 3) Come up with a design doc
> >
> > 4) Build the interfaces
> >
> > 5) Add more dev tasks to JIRA
> >
> > 6) Drive the implementation
> >
> >
> > Thanks in advance for your help.
> >
>
>


Re: Hello heron community

2017-10-30 Thread Saikat Kanjilal
The goal is to tie together spark streaming and heron. I could very easily see 
an architecture where we build topologies in heron to apply business logic and 
pump the results into time windows with spark streaming, where we can apply 
machine learning algorithms inside these windows.


Detailed Example:


Stock trading or gaming use case:


Heron is used to extract real time data from a gaming or trading client.

Heron will apply a set of business rules to clean up this data to fit into a set 
of machine learning models.

We then need to figure out how to connect spark streaming to heron. Spark 
streaming listens to some tcp or http endpoint, so the objective 
would be to figure out the bridge between the two worlds.
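
As a rough sketch of such a bridge, and assuming the Spark side uses a socket text stream (Spark Streaming's socketTextStream(host, port) connects to a TCP server and reads newline-delimited text), the Heron side could expose a small TCP endpoint that writes one cleaned-up record per line. LineEndpoint, serveOnce, and the record format below are hypothetical names for illustration; they are not part of Heron or Spark.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.List;

/**
 * Hypothetical TCP endpoint that a sink at the end of a topology could use
 * to hand records to a line-oriented stream reader. Illustrative only.
 */
final class LineEndpoint implements AutoCloseable {
    private final ServerSocket server;

    /** Binds immediately; passing 0 lets the OS pick a free port. */
    LineEndpoint(int port) throws IOException {
        this.server = new ServerSocket(port);
    }

    /** The port actually bound, useful when 0 was requested. */
    int port() {
        return server.getLocalPort();
    }

    /** Blocks until one client connects, then writes each record on its own line. */
    void serveOnce(List<String> records) throws IOException {
        try (Socket client = server.accept();
             PrintWriter out = new PrintWriter(
                 new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8))) {
            for (String record : records) {
                out.print(record);
                out.print('\n');  // explicit '\n' so framing does not depend on the platform
            }
            out.flush();
        }
    }

    @Override
    public void close() throws IOException {
        server.close();
    }
}
```

A real connector would stream tuples continuously and handle reconnects and delivery guarantees; this only shows the framing that a line-based reader expects.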


My point here is that if I am going through and building out a scala API it 
might make sense to also build a connector to a tech stack entirely written 
using scala as a first class citizen.



Let me know your thoughts; we can also push this to phase 2 once the heron 
scala api is readily available.



From: Sanjeev Kulkarni <sanjee...@gmail.com>
Sent: Monday, October 30, 2017 2:54 PM
To: dev@heron.incubator.apache.org
Subject: Re: Hello heron community

I didn't really get what you meant by 'stream endpoint would be coming from
heron'. Could you please elaborate? Preferably with an example?

On Mon, Oct 30, 2017 at 2:26 PM, Saikat Kanjilal <sxk1...@hotmail.com>
wrote:

> I'm a bit conflicted on 1. Here's why: I feel like most of our users who are
> coming over from Storm or using heron for the first time would want to use
> the spouts/bolts API; however, the Streamlets interface seems like it
> has the right level of abstraction. I think I will start with the Streamlet
> interface and add in the spout/bolt API as needed.   One other thought I
> had was to figure out an integration plan with spark. To this end, the fit
> between heron and spark would be in using spark streaming where the stream
> endpoint would be coming from heron. I'm not sure if that should be part of
> this effort or not; what do you guys think?
>
>
> 
> From: Sanjeev Kulkarni <sanjee...@gmail.com>
> Sent: Monday, October 30, 2017 2:17 PM
> To: dev@heron.incubator.apache.org
> Subject: Re: Hello heron community
>
> I have a couple of questions
> 1. Do you plan on exposing both the low level(spouts/bolts) and the
> streamlet api? Or are you preferring one over another?
> 2. My suggestion would be to start with the Streamlet interface. Primarily
> because a) Most of scala ml libraries that I've seen tend to operate on the
> streamlet kind of interface and b) It has a far smaller surface area(i.e.
> number of interfaces) so might be easy to get something up quickly for
> testing.
> Thanks!
>
> On Mon, Oct 30, 2017 at 2:13 PM, Saikat Kanjilal <sxk1...@hotmail.com>
> wrote:
>
> > Thanks Sanjeev,  I had an initial idea that I wanted to float on the
> list,
> > I was thinking that as part of the initial scala port of the API I'd like
> > to propose that we use this tool: http://javatoscala.com/

Re: Hello heron community

2017-10-30 Thread Saikat Kanjilal
I'm a bit conflicted on 1. Here's why: I feel like most of our users who are 
coming over from Storm or using heron for the first time would want to use the 
spouts/bolts API; however, the Streamlets interface seems like it has the right 
level of abstraction. I think I will start with the Streamlet interface and add 
in the spout/bolt API as needed.   One other thought I had was to figure out an 
integration plan with spark. To this end, the fit between heron and spark would 
be in using spark streaming where the stream endpoint would be coming from 
heron. I'm not sure if that should be part of this effort or not; what do you 
guys think?



From: Sanjeev Kulkarni <sanjee...@gmail.com>
Sent: Monday, October 30, 2017 2:17 PM
To: dev@heron.incubator.apache.org
Subject: Re: Hello heron community

I have a couple of questions
1. Do you plan on exposing both the low level(spouts/bolts) and the
streamlet api? Or are you preferring one over another?
2. My suggestion would be to start with the Streamlet interface. Primarily
because a) Most of scala ml libraries that I've seen tend to operate on the
streamlet kind of interface and b) It has a far smaller surface area(i.e.
number of interfaces) so might be easy to get something up quickly for
testing.
Thanks!

On Mon, Oct 30, 2017 at 2:13 PM, Saikat Kanjilal <sxk1...@hotmail.com>
wrote:

> Thanks Sanjeev,  I had an initial idea that I wanted to float on the list,
> I was thinking that as part of the initial scala port of the API I'd like
> to propose that we use this tool: http://javatoscala.com/
>
>
> I propose that we use the above tool to do an initial conversion of all
> the Java functional and oo interfaces and then plug that into the initial
> design doc as well as the codebase to get things off the ground.   I have
> already forked the heron repo and will be doing this off of a child branch
> based on my fork.
>
>
> How does that sound to folks? Does anyone have strong objections? Of course I
> fully realize that I will still need to do a lot of work around refactoring
> the code even when using the converter, so I'm prepared for that. The goal
> in using the converter is to save a bit of time so that manual
> intervention isn't necessarily needed in designing every interface.
>
> Thoughts?
>
> 
> From: Sanjeev Kulkarni <sanjee...@gmail.com>
> Sent: Sunday, October 29, 2017 10:18 PM
> To: dev@heron.incubator.apache.org
> Subject: Re: Hello heron community
>
> Hi Saikat,
> Welcome to the Heron users group. Your plan of action sounds good to me. A
> native scala interface would be great for all those scala fans. Eagerly
> awaiting design doc.
> Thanks!
>
> On Sun, Oct 29, 2017 at 6:17 PM, Saikat Kanjilal <sxk1...@hotmail.com>
> wrote:
>
> > Hello Folks,
> >
> > I'm interested in taking and driving the following github component (this
> > means design/development/architecture and more).
> >
> >
> > https://github.com/twitter/heron/issues/668

Re: Hello heron community

2017-10-30 Thread Saikat Kanjilal
Thanks Sanjeev. I had an initial idea that I wanted to float on the list: as 
part of the initial scala port of the API I'd like to propose that we use 
this tool: http://javatoscala.com/



I propose that we use the above tool to do an initial conversion of all the 
Java functional and oo interfaces and then plug that into the initial design 
doc as well as the codebase to get things off the ground.   I have already 
forked the heron repo and will be doing this off of a child branch based on my 
fork.


How does that sound to folks? Does anyone have strong objections? Of course I 
fully realize that I will still need to do a lot of work around refactoring the 
code even when using the converter, so I'm prepared for that. The goal in using 
the converter is to save a bit of time so that manual intervention isn't 
necessarily needed in designing every interface.

Thoughts?


From: Sanjeev Kulkarni <sanjee...@gmail.com>
Sent: Sunday, October 29, 2017 10:18 PM
To: dev@heron.incubator.apache.org
Subject: Re: Hello heron community

Hi Saikat,
Welcome to the Heron users group. Your plan of action sounds good to me. A
native scala interface would be great for all those scala fans. Eagerly
awaiting design doc.
Thanks!

On Sun, Oct 29, 2017 at 6:17 PM, Saikat Kanjilal <sxk1...@hotmail.com>
wrote:

> Hello Folks,
>
> I'm interested in taking and driving the following github component (this
> means design/development/architecture and more).
>
>
> https://github.com/twitter/heron/issues/668
>
>
>
>
> I've joined the heron slack channel as recommended and am ready to start
> by creating a master JIRA issue to track this, if not already done. I would
> like some guidance from a committer or two to help me take care of
> the administrative details and get this off the ground.  My
> background includes working on the apache reef codebase as well as dabbling in
> apache mahout and apache flume as well as storm and various nosql/graph
> databases.  Looking forward to being part of the community.
>
>
>
> The rough plan of action in my mind so you know:
>
> 1) Review heron architecture and codebase in detail
>
> 2) Create an umbrella JIRA with the first series of tasks
>
> 3) Come up with a design doc
>
> 4) Build the interfaces
>
> 5) Add more dev tasks to JIRA
>
> 6) Drive the implementation
>
>
> Thanks in advance for your help.
>
>


Hello heron community

2017-10-29 Thread Saikat Kanjilal
Hello Folks,

I'm interested in taking and driving the following github component (this means 
design/development/architecture and more).


https://github.com/twitter/heron/issues/668






I've joined the heron slack channel as recommended and am ready to start by 
creating a master JIRA issue to track this, if not already done. I would like 
some guidance from a committer or two to help me take care of the 
administrative details and get this off the ground.  My 
background includes working on the apache reef codebase as well as dabbling in 
apache mahout and apache flume as well as storm and various nosql/graph 
databases.  Looking forward to being part of the community.



The rough plan of action in my mind so you know:

1) Review heron architecture and codebase in detail

2) Create an umbrella JIRA with the first series of tasks

3) Come up with a design doc

4) Build the interfaces

5) Add more dev tasks to JIRA

6) Drive the implementation


Thanks in advance for your help.