Re: Apache Beam and Flink

2016-05-26 Thread Slim Baltagi
Hi Ashutosh

There is a related open JIRA: Enable DataSet and DataStream Joins 
https://issues.apache.org/jira/browse/FLINK-2320 

Slim 


> On May 26, 2016, at 3:05 AM, Fabian Hueske <fhue...@gmail.com> wrote:
> 
> No, that is not supported yet.
> Beam provides a common API, but the Flink runner translates Beam programs 
> against batch sources into DataSet API programs and Beam programs against 
> streaming sources into DataStream API programs.
> It is not possible to mix both.
> 
> 2016-05-26 10:00 GMT+02:00 Ashutosh Kumar <kmr.ashutos...@gmail.com>:
> Thanks. So if we use the Beam API with the Flink engine, can we get 
> interaction between batch and stream? As far as I know, in Flink the DataSet 
> and DataStream APIs currently cannot interact. Is this correct? 
>  Thanks
> Ashutosh
>  
> 
> On Thu, May 26, 2016 at 1:09 PM, Slim Baltagi <sbalt...@gmail.com> wrote:
> Hi Ashutosh
> 
> Apache Beam provides a unified API for batch and streaming.
> It also supports multiple ‘runners’: local, Apache Spark, Apache Flink and 
> Google Cloud Dataflow (commercial service). 
> It is not an alternative to Flink because it is an API and you still need an 
> execution engine.
> It can be used as an alternative to the two Flink APIs: the DataSet API and 
> the DataStream API. 
> It can be complementary to Flink in that you use Beam as the API and Flink 
> as the execution engine.  
> Many Flink committers are also Apache Beam committers!
> The following blog posts explain the case for Apache Beam:
>  from the Flink perspective: http://data-artisans.com/why-apache-beam/ 
>  from the Google perspective: 
> https://cloud.google.com/blog/big-data/2016/05/why-apache-beam-a-google-perspective
> 
> A few recent resources about Apache Beam published this month (May 2016): 
> Running Apache Beam (screencast): https://www.youtube.com/watch?v=dwxUbzbwtyI 
> Introduction to Apache Beam (presentation): 
> https://skillsmatter.com/skillscasts/8036-apache-flink-may-meetup 
> Introduction to Apache Beam (blog): 
> http://www.talend.com/blog/2016/05/02/introduction-to-apache-beam 
> 
> I hope this helps.
> 
> Thanks
> 
> Slim Baltagi
> 
>> On May 26, 2016, at 2:20 AM, Ashutosh Kumar <kmr.ashutos...@gmail.com> wrote:
>> 
>> How does Apache Beam fit with Flink? Is it an alternative to Flink or 
>> complementary to it? 
>> 
>> Thanks
>> Ashutosh 
> 
> 
> 



Re: Apache Beam and Flink

2016-05-26 Thread Slim Baltagi
Hi Ashutosh

Apache Beam provides a unified API for batch and streaming.
It also supports multiple ‘runners’: local, Apache Spark, Apache Flink and 
Google Cloud Dataflow (commercial service). 
It is not an alternative to Flink because it is an API and you still need an 
execution engine.
It can be used as an alternative to the two Flink APIs: the DataSet API and 
the DataStream API. 
It can be complementary to Flink in that you use Beam as the API and Flink 
as the execution engine.  
Many Flink committers are also Apache Beam committers!
The following blog posts explain the case for Apache Beam:
 from the Flink perspective: http://data-artisans.com/why-apache-beam/ 
 from the Google perspective: 
https://cloud.google.com/blog/big-data/2016/05/why-apache-beam-a-google-perspective
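
To picture the split between an API and an execution engine, here is a toy
sketch in plain Python. It is not the Beam API: Pipeline, EagerRunner and
LazyRunner are hypothetical names made up for illustration. It only shows the
idea that a job is described once and that interchangeable runners decide how
it is executed.

```python
from typing import Callable, Iterable, Iterator, List

# A transform turns one sequence of elements into another.
Transform = Callable[[Iterable], Iterator]


class Pipeline:
    """Engine-independent description of a job (the 'API' side). Hypothetical."""

    def __init__(self) -> None:
        self.transforms: List[Transform] = []

    def apply(self, transform: Transform) -> "Pipeline":
        self.transforms.append(transform)
        return self  # return self so calls can be chained


class EagerRunner:
    """Hypothetical engine: materializes the full result of every step."""

    def run(self, pipeline: Pipeline, source: Iterable) -> list:
        data = list(source)
        for transform in pipeline.transforms:
            data = list(transform(data))
        return data


class LazyRunner:
    """Hypothetical engine: chains the steps and pulls one element at a time."""

    def run(self, pipeline: Pipeline, source: Iterable) -> list:
        data = iter(source)
        for transform in pipeline.transforms:
            data = transform(data)
        return list(data)


def lowercase(words: Iterable) -> Iterator:
    return (w.lower() for w in words)


def drop_short(words: Iterable) -> Iterator:
    return (w for w in words if len(w) > 3)


# The job is written once, against the API only.
job = Pipeline().apply(lowercase).apply(drop_short)
words = ["Apache", "Beam", "and", "Flink"]

# Either engine can execute it; swapping the runner does not change the job.
eager_result = EagerRunner().run(job, words)
lazy_result = LazyRunner().run(job, words)
print(eager_result == lazy_result)
```

In this picture Beam plays the role of Pipeline and Flink the role of one of
the runners; a real job would select the runner through Beam's own options
rather than through classes like these.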

A few recent resources about Apache Beam published this month (May 2016): 
Running Apache Beam (screencast): https://www.youtube.com/watch?v=dwxUbzbwtyI
Introduction to Apache Beam (presentation): 
https://skillsmatter.com/skillscasts/8036-apache-flink-may-meetup
Introduction to Apache Beam (blog): 
http://www.talend.com/blog/2016/05/02/introduction-to-apache-beam

I hope this helps.

Thanks

Slim Baltagi

> On May 26, 2016, at 2:20 AM, Ashutosh Kumar <kmr.ashutos...@gmail.com> wrote:
> 
> How does Apache Beam fit with Flink? Is it an alternative to Flink or 
> complementary to it? 
> 
> Thanks
> Ashutosh 



Re: Powered by Flink

2016-04-05 Thread Slim Baltagi
Hi 

The following are missing from the ‘Powered by Flink’ list: 
king.com: 
https://blogs.apache.org/foundation/entry/the_apache_software_foundation_announces88
Otto Group: 
http://data-artisans.com/how-we-selected-apache-flink-at-otto-group/ 
Eura Nova: https://research.euranova.eu/flink-forward-2015-talk/ 
Big Data Europe: http://www.big-data-europe.eu

Thanks 

Slim Baltagi


> On Apr 5, 2016, at 10:08 AM, Robert Metzger <rmetz...@apache.org> wrote:
> 
> Hi everyone,
> 
> I would like to bring the "Powered by Flink" wiki page [1] to the attention 
> of Flink users who recently joined the Flink community. The list tracks 
> which organizations are using Flink.
> If your company / university / research institute / ... is using Flink but 
> the name is not yet listed there, let me know and I'll add the name.
> 
> Regards,
> Robert
> 
> [1] https://cwiki.apache.org/confluence/display/FLINK/Powered+by+Flink 
> 
> 
> On Mon, Oct 19, 2015 at 4:10 PM, Matthias J. Sax <mj...@apache.org> wrote:
> +1
> 
> On 10/19/2015 04:05 PM, Maximilian Michels wrote:
> > +1 Let's collect in the Wiki for now. At some point in time, we might
> > want to have a dedicated page on the Flink homepage.
> >
> > On Mon, Oct 19, 2015 at 3:31 PM, Timo Walther <twal...@apache.org> wrote:
> >> Ah ok, sorry. I think linking to the wiki is also ok.
> >>
> >>
> >> On 19.10.2015 15:18, Fabian Hueske wrote:
> >>>
> >>> @Timo: The proposal was to keep the list in the wiki (can be easily
> >>> extended) but link from the main website to the wiki page.
> >>>
> >>> 2015-10-19 15:16 GMT+02:00 Timo Walther <twal...@apache.org>:
> >>>
> >>>> +1 for adding it to the website instead of the wiki.
> >>>> "Who is using Flink?" is always a difficult question to answer for
> >>>> interested users.
> >>>>
> >>>>
> >>>> On 19.10.2015 15:08, Suneel Marthi wrote:
> >>>>
> >>>> +1 to this.
> >>>>
> >>>> On Mon, Oct 19, 2015 at 3:00 PM, Fabian Hueske <fhue...@gmail.com> wrote:
> >>>>
> >>>>> Sounds good +1
> >>>>>
> >>>>> 2015-10-19 14:57 GMT+02:00 Márton Balassi <balassi.mar...@gmail.com>:
> >>>>>
> >>>>>> Thanks for starting and big +1 for making it more prominent.
> >>>>>>
> >>>>>> On Mon, Oct 19, 2015 at 2:53 PM, Fabian Hueske <fhue...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Thanks for starting this Kostas.
> >>>>>>>
> >>>>>>> I think the list is quite hidden in the wiki. Should we link from
> >>>>>>> flink.apache.org to that page?
> >>>>>>>
> >>>>>>> Cheers, Fabian
> >>>>>>>
> >>>>>>> 2015-10-19 14:50 GMT+02:00 Kostas Tzoumas <ktzou...@apache.org>:
> >>>>>>>>
> >>>>>>>> Hi everyone,
> >>>>>>>>
> >>>>>>>> I started a "Powered by Flink" wiki page, listing some of the
> >>>>>>>> organizations that are using Flink:
> >>>>>>>>
> >>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/Powered+by+Flink 
> >>>>>>>>
> >>>>>>>> If you would like to be added to the list, just send me a short email
> >>>>>>>> with your organization's name and a description and I will add you to
> >>>>>>>> the wiki page.
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Kostas
> >>>>>>>>
> >>>>>>>
> >>>>
> >>>>
> >>
> 
> 



Re: [VOTE] Release Apache Flink 1.0.0 (RC1)

2016-02-25 Thread Slim Baltagi
Dear Flink community

It is great news that the vote for the first release candidate (RC1) of Apache 
Flink 1.0.0 starts today, February 25th, 2016!
As a community, we need to redouble our efforts and make sure that Flink 1.0.0 
is GA before these two upcoming major events: 
 Strata + Hadoop World in San Jose on March 28-31, 2016
 Hadoop Summit Europe in Dublin on April 13-14, 2016
This is one aspect of the ‘market dynamics’ that we need to take into account 
as a community. 

Good luck!

Slim Baltagi

On Feb 25, 2016, at 4:34 AM, Robert Metzger <rmetz...@apache.org> wrote:

> Dear Flink community,
> 
> Please vote on releasing the following candidate as Apache Flink version 
> 1.0.0.
> 
> I've set user@flink.apache.org on CC because users are encouraged to help 
> test Flink 1.0.0 for their specific use cases. Please report issues (and 
> successful tests!) on d...@flink.apache.org.
> 
> 
> The commit to be voted on 
> (http://git-wip-us.apache.org/repos/asf/flink/commit/e4d308d6)
> e4d308d64057e5f94bec8bbca8f67aab0ea78faa
> 
> Branch:
> release-1.0.0-rc1 (see 
> https://git1-us-west.apache.org/repos/asf/flink/repo?p=flink.git;a=shortlog;h=refs/heads/release-1.0.0-rc1)
> 
> The release artifacts to be voted on can be found at:
> http://people.apache.org/~rmetzger/flink-1.0.0-rc1/
> 
> The release artifacts are signed with the key with fingerprint D9839159:
> http://www.apache.org/dist/flink/KEYS
> 
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapacheflink-1063
> 
> -
> 
> The vote is open until Tuesday and passes if a majority of at least three +1 
> PMC votes are cast.
> 
> The vote ends on Tuesday, March 1, 12:00 CET.
> 
> [ ] +1 Release this package as Apache Flink 1.0.0
> [ ] -1 Do not release this package because ...



Re: Comparison of storm and flink

2016-01-23 Thread Slim Baltagi
Hi Vinaya

1. Comparing streaming tools (in this case Storm and Flink) should not be
based on performance benchmarks only! For example, slides 16-36 list over 96
criteria that we identified at Capital One to compare two streaming tools:
http://www.slideshare.net/sbaltagi/flink-vs-spark/17

2. Now, if you are focusing on performance only, I suggest a few related
resources: 

- Benchmarking Streaming Computation Engines at Yahoo! (December 16, 2015):
http://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
Code on GitHub:
https://github.com/yahoo/streaming-benchmarks

- Some Flink contributors have started work on performance scripts for Flink,
Spark, and MapReduce: Apache Flink: Performance and Testing
https://github.com/project-flink/flink-perf

- Some first numbers on the performance of streaming jobs with Apache Flink are
here, under the section 'Show me the numbers':
http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/
The code used is at:
https://github.com/dataArtisans/performance

- Yangjun Wang is currently working on his Master's thesis at Aalto University
in Helsinki, Finland. His thesis is about building a standard benchmark
system for stream processing systems like Apache Storm, Spark and Flink.
Code on GitHub:
https://github.com/wangyangjun/StreamBench/tree/master/StreamBench

3. I am giving a talk in NYC on Tuesday February 2nd, 2016 on Apache Flink
and I will be touching a bit on benchmarks
http://www.meetup.com/New-York-City-NYC-Apache-Flink-Meetup/events/228113118/
You are welcome to attend. 

Thanks

Slim Baltagi 



--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Comparison-of-storm-and-flink-tp4468p4469.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.


Re: 2015: A Year in Review for Apache Flink

2015-12-31 Thread Slim Baltagi
Happy New Year to you and your families!
Let’s make 2016 the year of Flink: General Availability, faster growth, wider 
industry adoption, …
Slim Baltagi 
Chicago, US

On Dec 31, 2015, at 5:05 AM, Vasiliki Kalavri <vasilikikala...@gmail.com> wrote:

> Happy new year everyone!
> Looking forward to all the great things the Apache Flink community will 
> accomplish in 2016 :))
> 
> Greetings from snowy Greece!
> -Vasia.
> 
> On 31 December 2015 at 04:22, Henry Saputra <henry.sapu...@gmail.com> wrote:
> Dear All,
> 
> It is almost the end of 2015 and it has been a busy and great year for 
> Apache Flink =)
> 
> Robert Metzger has posted a great blog post summarizing Apache Flink's growth
> this year:
> 
>   https://flink.apache.org/news/2015/12/18/a-year-in-review.html
> 
> Happy New Year everyone and thanks for being part of this great community!
> 
> 
> Thanks,
> 
> - Henry
> 



Re: Apache Flink 0.10.0 released

2015-11-16 Thread Slim Baltagi
Hi

I’m very pleased to be the first to tweet about the release of Apache Flink 
0.10.0 just after receiving Fabian’s email :)
Flink 1.0 is around the corner now!

Slim Baltagi

On Nov 16, 2015, at 7:53 AM, Fabian Hueske <fhue...@gmail.com> wrote:

> Hi everybody, 
> 
> The Flink community is excited to announce that Apache Flink 0.10.0 has been 
> released.
> Please find the release announcement here:
> 
> -->  http://flink.apache.org/news/2015/11/16/release-0.10.0.html
> 
> Best, 
> Fabian



Building Big Data Benchmarking suite for Apache Flink

2015-07-13 Thread Slim Baltagi
Hi

BigDataBench is an open source Big Data benchmarking suite from both
industry and academia. As a subset of BigDataBench, BigDataBench-DCA is
China’s first industry-standard big data benchmark suite:
http://prof.ict.ac.cn/BigDataBench/industry-standard-benchmarks/
It comes with real-world data sets and many workloads: TeraSort, WordCount,
PageRank, K-means, NaiveBayes, Aggregation and Read/Write/Scan and also a
tool that uses Hadoop, HBase and Mahout. 
This might inspire building a Big Data benchmarking suite for Flink! 

I would like to share with you the news that professor Jianfeng Zhan from
the Institute of Computing Technology, Chinese Academy of Sciences is
planning to support Flink in the BigDataBench project! Reference:
https://www.linkedin.com/grp/home?gid=6777483

Thanks

Slim Baltagi




--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Building-Big-Data-Benchmarking-suite-for-Apache-Flink-tp2035.html


Re: SLIDES: Overview of Apache Flink: Next-Gen Big Data Analytics Framework

2015-07-07 Thread Slim Baltagi
Hi

Well, thanks to the Apache Flink community for continuously improving the
project docs and to Data Artisans for sharing the slides and materials of
the Apache Flink training!

Both helped me with putting together the slide deck of my talk in our
Chicago Apache Flink meetup. 

Slim Baltagi



--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/SLIDES-Overview-of-Apache-Flink-Next-Gen-Big-Data-Analytics-Framework-tp1966p1972.html


SLIDES: Overview of Apache Flink: Next-Gen Big Data Analytics Framework

2015-07-06 Thread Slim Baltagi
Hi

The slides of my talk on June 30, 2015 at the Chicago Apache Flink meetup are
available at http://goo.gl/gVOSp8 

Although most of the current buzz is about Apache Spark, the talk shows how
Apache Flink offers the only hybrid open source (Real-Time Streaming +
Batch) distributed data processing engine supporting many use cases:
Real-Time stream processing, machine learning at scale, graph analytics and
batch processing.

Many slides are also dedicated to showing why Apache Flink is an alternative
to Apache Hadoop MapReduce, Apache Storm and Apache Spark! 

Thanks

Slim Baltagi




--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/SLIDES-Overview-of-Apache-Flink-Next-Gen-Big-Data-Analytics-Framework-tp1966.html