Re: MapState in flink

2020-06-13 Thread Oytun Tez
Correct me @everyone if I'm wrong, but you cannot keep State inside State
in that way. So change your signature to something like MapState<K, List<V>>
(a MapState whose values are ordinary Lists rather than nested ListState).

The underlying mechanism wouldn't make sense in this state-inside-state
shape.
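
A minimal sketch of what that descriptor could look like (the key/value
types and the state name here are illustrative, not from the thread):

```java
// Packages: org.apache.flink.api.common.state.MapStateDescriptor,
// org.apache.flink.api.common.typeinfo.{TypeHint, TypeInformation}
MapStateDescriptor<String, List<Long>> descriptor =
    new MapStateDescriptor<>(
        "eventsPerKey",                                     // state name (illustrative)
        TypeInformation.of(String.class),                   // map key type
        TypeInformation.of(new TypeHint<List<Long>>() {})); // values are plain Lists

// Inside a keyed function, e.g. in open():
// MapState<String, List<Long>> state = getRuntimeContext().getMapState(descriptor);
```

The TypeHint is needed so Flink can capture the generic List element type.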



 --

[image: MotaWord]
Oytun Tez
M O T A W O R D | CTO & Co-Founder
oy...@motaword.com

  <https://www.motaword.com/blog>


On Sat, Jun 13, 2020 at 3:32 PM Jaswin Shah  wrote:

> I need some representation like this:
>
>
> --
> *From:* Jaswin Shah 
> *Sent:* 14 June 2020 01:01
> *To:* user@flink.apache.org 
> *Subject:* MapState in flink
>
> Hi,
> Can anyone please help me on how can I create a MapState of ListState in
> flink, does flink support the same and if supports, how to declare the
> descriptor for same state data structure?
> If it is not supported, how may I create similar datastructure for state
> in flink?
>
> Thanks,
> Jaswin
>


Re: [ANNOUNCE] Apache Flink Stateful Functions 2.1.0 released

2020-06-09 Thread Oytun Tez
Thank you, Gordon and everyone.

On Tue, Jun 9, 2020 at 5:56 AM Tzu-Li (Gordon) Tai 
wrote:

> The Apache Flink community is very happy to announce the release of Apache
> Flink Stateful Functions 2.1.0.
>
> Stateful Functions is an API that simplifies building distributed stateful
> applications.
> It's based on functions with persistent state that can interact
> dynamically with strong consistency guarantees.
>
> Please check out the release blog post for an overview of the release:
> https://flink.apache.org/news/2020/06/09/release-statefun-2.1.0.html
>
> The release is available for download at:
> https://flink.apache.org/downloads.html
>
> Maven artifacts for Stateful Functions can be found at:
> https://search.maven.org/search?q=g:org.apache.flink%20statefun
>
> Python SDK for Stateful Functions published to the PyPI index can be found
> at:
> https://pypi.org/project/apache-flink-statefun/
>
> Official Docker image for building Stateful Functions applications is
> currently being published to Docker Hub. Progress for creating the Docker
> Hub repository can be tracked at:
> https://github.com/docker-library/official-images/pull/7749
>
> In the meantime, before the official Docker images are available,
> Ververica has volunteered to make Stateful Function's images available for
> the community via their public registry:
> https://hub.docker.com/r/ververica/flink-statefun
>
> The full release notes are available in Jira:
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12347861
>
> We would like to thank all contributors of the Apache Flink community who
> made this release possible!
>
> Cheers,
> Gordon
>


Re: Window processing in Stateful Functions

2020-05-06 Thread Oytun Tez
Oops – will follow the thread 




On Wed, May 6, 2020 at 5:37 PM m@xi  wrote:

> Hello Tez,
>
> With all due respect, I doubt your answer is related to the question.
>
> *Just to re-phrase a bit*: Assuming we use SF for our application, how can
> we apply window logic when a function does some processing? *Is there a
> proper way?*
>
> @*Igal*: we would very much appreciate your answer. :)
>
> Best,
> Max
>
>
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>


Re: Window processing in Stateful Functions

2020-05-06 Thread Oytun Tez
I think this is also related to my question about CEP in Statefun.

@Annemarie Burger  , I was looking
into using Siddhi library within the Function's context.





On Wed, May 6, 2020 at 4:13 PM Annemarie Burger <
annemarie.bur...@campus.tu-berlin.de> wrote:

> Hi,
>
> I want to do windowed processing in each function when using Stateful
> Functions. Is this possible? Some pseudo code would be very helpful!
>
> Some more context: I'm having a stream of edges as input. I want to window
> these and save the graph representation (either as edge list, adjacency
> list, or CSR) in a distributed way using state. Since doing this for the
> entire edge stream would cost far too much memory, I want to only save the
> state of the graph within the window. How could I achieve this?
>
> Thanks!
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>


Re: Using Stateful Functions within a Flink pipeline

2020-04-22 Thread Oytun Tez
@Igal, this sounds more comprehensive (better) than just opening
DataStreams: "basically exposing the core Flink job that is the heart of
stateful functions. "

Great!





On Wed, Apr 22, 2020 at 10:56 AM Igal Shilman  wrote:

> Hi Annemarie,
> There are plans to make stateful functions more easily embeddable within a
> Flink Job,
> perhaps skipping ingress/egress routing abstracting all together and
> basically exposing the
> core Flink job that is the heart of stateful functions.
> Although these plans are not concrete yet I believe this would be brought
> to discussion with the community in the upcoming weeks.
>
> Currently, you can split your pipeline to a preprocessing Flink job, and a
> stateful functions job.
>
> Good luck,
> Igal.
>
> On Wed, Apr 22, 2020 at 4:26 PM Annemarie Burger <
> annemarie.bur...@campus.tu-berlin.de> wrote:
>
>> I was wondering if it is possible to use a Stateful Function within a
>> Flink
>> pipeline. I know they work with different API's, so I was wondering if it
>> is
>> possible to have a DataStream as ingress for a Stateful Function.
>>
>> Some context: I'm working on a streaming graph analytics system, and want
>> to
>> save the state of the graph within a window. Stateful functions could then
>> allow me to process these distributed graph states by making use of the SF
>> messaging.
>>
>> Thanks!
>>
>>
>>
>> --
>> Sent from:
>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>>
>


Re: Using Stateful Functions within a Flink pipeline

2020-04-22 Thread Oytun Tez
Hi Annemarie,

Unfortunately this is not possible at the moment, but DataStream as
in/egress is in the plans, as far as I know.





On Wed, Apr 22, 2020 at 10:26 AM Annemarie Burger <
annemarie.bur...@campus.tu-berlin.de> wrote:

> I was wondering if it is possible to use a Stateful Function within a Flink
> pipeline. I know they work with different API's, so I was wondering if it
> is
> possible to have a DataStream as ingress for a Stateful Function.
>
> Some context: I'm working on a streaming graph analytics system, and want
> to
> save the state of the graph within a window. Stateful functions could then
> allow me to process these distributed graph states by making use of the SF
> messaging.
>
> Thanks!
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>


Re: Enable custom REST APIs in Flink

2020-04-21 Thread Oytun Tez
I had some back and forth about this last year; I'll forward the discussion
email chain to you privately (it was on this mailing list).

Basically, the idea was to make *DispatcherRestEndpoint* and/or
*WebMonitorExtension* more accessible so we can extend them. It didn't look
like too much work on Flink's side, but of course I can't tell confidently.
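
Independent of where it would hook in, the proposal boils down to a registry
from path prefixes to user classes implementing a common interface. A toy,
purely illustrative sketch (all names are hypothetical, not a real Flink API):

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical common interface that user-provided REST services would implement. */
interface UserRestService {
    /** Handles a request under this service's prefix and returns a response body. */
    String handle(String path, Map<String, String> queryParams);
}

/** Illustrative user service; the name only mirrors the example mapping. */
class MyRestService1 implements UserRestService {
    @Override
    public String handle(String path, Map<String, String> queryParams) {
        return "{\"service\":\"myservice1\",\"path\":\"" + path + "\"}";
    }
}

/** Toy registry: what a "/userapi/myservice1/* -> com.example.MyRestService1" config would build. */
class UserApiRegistry {
    private final Map<String, UserRestService> routes = new HashMap<>();

    void register(String prefix, UserRestService service) {
        routes.put(prefix, service);
    }

    /** Dispatches to the first service whose prefix matches, or returns null. */
    String dispatch(String fullPath) {
        for (Map.Entry<String, UserRestService> entry : routes.entrySet()) {
            if (fullPath.startsWith(entry.getKey())) {
                String rest = fullPath.substring(entry.getKey().length());
                return entry.getValue().handle(rest, new HashMap<>());
            }
        }
        return null;
    }
}
```

The `/userapi/` prefix idea from the thread maps directly onto the registered
prefix, so user routes can never shadow built-in Flink REST endpoints.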




On Tue, Apr 21, 2020 at 11:12 AM Jeff Zhang  wrote:

> I know some users do the same thing in Spark. Usually the service runs on
> the Spark driver side. But Flink is different from Spark: the Spark driver
> is equal to the Flink client + Flink job manager. I don't think we
> currently allow running any user code in the job manager, so allowing a
> user-defined service to run in the job manager might be a big change for
> Flink.
>
>
>
> Flavio Pompermaier  于2020年4月21日周二 下午11:06写道:
>
>> In my mind the user API could run everywhere but the simplest thing is to
>> make them available in the Job Manager (where the other REST API lives).
>> They could become a very simple but powerful way to add valuable services
>> to Flink without adding useless complexity to the overall architecture for
>> just a few methods.
>> I don't know whether Spark or Beam allow you to do something like that
>> but IMHO it's super useful (especially from a maintenance point of view wrt
>> the overall architecture complexity).
>>
>> @Oytun indeed we'd like to avoid recompiling everything when a single
>> user class (i.e. not related to Flink classes) is modified or added. Glad
>> to see that there are other people having the same problem here
>>
>> On Tue, Apr 21, 2020 at 4:39 PM Jeff Zhang  wrote:
>>
>>> Hi Flavio,
>>>
>>> I am curious to know where the service runs. Do you create this service
>>> in a UDF and run it in the TM?
>>>
>>> Flavio Pompermaier  于2020年4月21日周二 下午8:30写道:
>>>
>>>> Hi to all,
>>>> many times it happens that we use Flink as a broker towards the data
>>>> layer but we need to be able to get some specific info from the
>>>> data sources we use (i.e. get triggers and relationships from jdbc).
>>>> The quick and dirty way of achieving this is to run a Flink job that
>>>> calls another service to store the required info. Another solution could be
>>>> to add a custom REST service that contains a lot of dependencies already
>>>> provided by Flink, with the risk of having misaligned versions between the
>>>> 2..
>>>> It would be much simpler to enable users to add custom REST services to
>>>> Flink in a configurable file. something like:
>>>> /myservice1/* -> com.example.MyRestService1
>>>> /myservice2/* -> com.example.MyRestService2
>>>>
>>>> The listed classes should be contained in a jar within the Flink lib
>>>> dir and should implement a common interface.
>>>> In order to avoid path collisions with already existing FLINK services,
>>>> the configured path can be further prefixed with some other token (e.g.
>>>> /userapi/*).
>>>>
>>>> What do you think about this? Does it sound reasonable to you?
>>>> Am I the only one that thinks this could be useful for many use cases?
>>>>
>>>> Best,
>>>> Flavio
>>>>
>>>
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang
>>>
>>
>
> --
> Best Regards
>
> Jeff Zhang
>


Re: Enable custom REST APIs in Flink

2020-04-21 Thread Oytun Tez
definitely, this is for me about making Flink an "application framework"
rather than solely a "dataflow framework".






On Tue, Apr 21, 2020 at 11:07 AM Flavio Pompermaier 
wrote:

> In my mind the user API could run everywhere but the simplest thing is to
> make them available in the Job Manager (where the other REST API lives).
> They could become a very simple but powerful way to add valuable services
> to Flink without adding useless complexity to the overall architecture for
> just a few methods.
> I don't know whether Spark or Beam allow you to do something like that but
> IMHO it's super useful (especially from a maintenance point of view wrt the
> overall architecture complexity).
>
> @Oytun indeed we'd like to avoid recompiling everything when a single user
> class (i.e. not related to Flink classes) is modified or added. Glad to see
> that there are other people having the same problem here
>
> On Tue, Apr 21, 2020 at 4:39 PM Jeff Zhang  wrote:
>
>> Hi Flavio,
>>
>> I am curious to know where the service runs. Do you create this service
>> in a UDF and run it in the TM?
>>
>> Flavio Pompermaier  于2020年4月21日周二 下午8:30写道:
>>
>>> Hi to all,
>>> many times it happens that we use Flink as a broker towards the data
>>> layer but we need to be able to get some specific info from the
>>> data sources we use (i.e. get triggers and relationships from jdbc).
>>> The quick and dirty way of achieving this is to run a Flink job that
>>> calls another service to store the required info. Another solution could be
>>> to add a custom REST service that contains a lot of dependencies already
>>> provided by Flink, with the risk of having misaligned versions between the
>>> two.
>>> It would be much simpler to enable users to add custom REST services to
>>> Flink in a configurable file. something like:
>>> /myservice1/* -> com.example.MyRestService1
>>> /myservice2/* -> com.example.MyRestService2
>>>
>>> The listed classes should be contained in a jar within the Flink lib dir
>>> and should implement a common interface.
>>> In order to avoid path collisions with already existing FLINK services,
>>> the configured path can be further prefixed with some other token (e.g.
>>> /userapi/*).
>>>
>>> What do you think about this? Does it sound reasonable to you?
>>> Am I the only one that thinks this could be useful for many use cases?
>>>
>>> Best,
>>> Flavio
>>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>


Re: Enable custom REST APIs in Flink

2020-04-21 Thread Oytun Tez
I would LOVE this. We had to hack our way a lot to achieve something
similar.

@Flavio, we basically added a new entrypoint to the same codebase and ran
that separately in its own container.




On Tue, Apr 21, 2020 at 8:30 AM Flavio Pompermaier 
wrote:

> Hi to all,
> many times it happens that we use Flink as a broker towards the data layer
> but we need to be able to get some specific info from the data sources we
> use (i.e. get triggers and relationships from jdbc).
> The quick and dirty way of achieving this is to run a Flink job that calls
> another service to store the required info. Another solution could be to
> add a custom REST service that contains a lot of dependencies already
> provided by Flink, with the risk of having misaligned versions between the
> two.
> It would be much simpler to enable users to add custom REST services to
> Flink in a configurable file. something like:
> /myservice1/* -> com.example.MyRestService1
> /myservice2/* -> com.example.MyRestService2
>
> The listed classes should be contained in a jar within the Flink lib dir
> and should implement a common interface.
> In order to avoid path collisions with already existing FLINK services,
> the configured path can be further prefixed with some other token (e.g.
> /userapi/*).
>
> What do you think about this? Does it sound reasonable to you?
> Am I the only one that thinks this could be useful for many use cases?
>
> Best,
> Flavio
>


Re: [Stateful Functions] Using Flink CEP

2020-04-14 Thread Oytun Tez
Hi Igal,

I have use cases such as "when a translator translates 10 words within 30
seconds". Typically, it is beautiful to express these with CEP.
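
For reference, a rule in that spirit is compact in the regular Flink CEP
Pattern API. This is only a sketch: `TranslationEvent` and its field are
made up here, and it counts 10 translation events rather than summing word
counts (the latter would need an iterative condition):

```java
// Packages: org.apache.flink.cep.pattern.Pattern,
// org.apache.flink.cep.pattern.conditions.SimpleCondition,
// org.apache.flink.streaming.api.windowing.time.Time
Pattern<TranslationEvent, ?> tenInThirtySeconds =
    Pattern.<TranslationEvent>begin("translations")
        .where(new SimpleCondition<TranslationEvent>() {
            @Override
            public boolean filter(TranslationEvent event) {
                return event.getWordCount() > 0; // hypothetical field
            }
        })
        .timesOrMore(10)           // at least 10 matching events...
        .within(Time.seconds(30)); // ...inside a 30-second window
```

This is exactly the kind of expression that has no direct equivalent inside
a StateFun function today.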

Yet, these are exploration questions where I try to replicate our Flink
application in Statefun. Rephrasing problems better might be what's needed
to fit them within Statefun (e.g., moving from a CEP-like expression to
something like the ridesharing example you shared).

Oytun




On Tue, Apr 14, 2020 at 5:41 AM Igal Shilman  wrote:

> Hi,
>
> I'm not familiar with the other library that you have mentioned, and
> indeed using Flink CEP from within a stateful function is not possible
> within a single Flink job,  as Gordon mentioned.
>
> I'm wondering what aspects of CEP are you interested in?
> Because essentially a stateful function can be considered as a state
> machine with auxiliary state.
> You can take a look at the ride sharing example [1] for instance, where
> the drivers, and the rides are cooperative state machines.
>
> [1] -
> https://github.com/apache/flink-statefun/tree/master/statefun-examples/statefun-ridesharing-example/statefun-ridesharing-example-functions/src/main/java/org/apache/flink/statefun/examples/ridesharing
>
> Good luck!
> Igal.
>
>
> On Tue, Apr 14, 2020 at 5:07 AM Tzu-Li (Gordon) Tai 
> wrote:
>
>> Hi!
>>
>> It isn't possible to use Flink CEP within Stateful Functions.
>>
>> That could be an interesting primitive, to add CEP-based function
>> constructs.
>> Could you briefly describe what you are trying to achieve?
>>
>> On the other hand, there are plans to integrate Stateful Functions more
>> closely with the Flink APIs.
>> One direction we've been thinking about is to, for example, support Flink
>> DataStreams as StateFun ingress / egresses. In this case, you'll be able
>> to
>> use Flink CEP to detect patterns, and use the results as an ingress which
>> invokes functions within a StateFun app.
>>
>> Cheers,
>> Gordon
>>
>>
>>
>> --
>> Sent from:
>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>>
>


Re: [Stateful Functions] Using Flink CEP

2020-04-14 Thread Oytun Tez
Hi Gordon,

Getting a little closer to the Flink API could help with integration here.
DataStreams as ingress/egress would be AMAZING. Deploying regular Flink API
code and statefun together as a single job is also something I will explore
soon.

With CEP, I simply want to keep a Function-specific pattern engine. I could
use a library like Siddhi and try to persist its state via the Function's
PersistedValue or something.
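
A rough sketch of the state-keeping side in the StateFun 2.x Java SDK (the
function logic is invented for illustration; the actual pattern matching
would come from whatever engine is embedded):

```java
// Packages: org.apache.flink.statefun.sdk.{Context, StatefulFunction},
// org.apache.flink.statefun.sdk.annotations.Persisted,
// org.apache.flink.statefun.sdk.state.PersistedValue
public final class PatternFn implements StatefulFunction {

    @Persisted
    private final PersistedValue<Integer> matchedSoFar =
        PersistedValue.of("matched-so-far", Integer.class);

    @Override
    public void invoke(Context context, Object input) {
        // Feed `input` into the embedded pattern engine here, then persist
        // whatever the engine needs to survive restarts:
        int count = matchedSoFar.getOrDefault(0) + 1;
        matchedSoFar.set(count);
    }
}
```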

Also, one thing to consider in the Flink API integration is the protobuf
enforcement in statefun. We had to use protobuf within the statefun code,
but nowhere else are we using protobuf. Even if we talk via DataStreams,
this would still bring us some level of separation between our Flink API
code and statefun.




On Mon, Apr 13, 2020 at 11:07 PM Tzu-Li (Gordon) Tai 
wrote:

> Hi!
>
> It isn't possible to use Flink CEP within Stateful Functions.
>
> That could be an interesting primitive, to add CEP-based function
> constructs.
> Could you briefly describe what you are trying to achieve?
>
> On the other hand, there are plans to integrate Stateful Functions more
> closely with the Flink APIs.
> One direction we've been thinking about is to, for example, support Flink
> DataStreams as StateFun ingress / egresses. In this case, you'll be able to
> use Flink CEP to detect patterns, and use the results as an ingress which
> invokes functions within a StateFun app.
>
> Cheers,
> Gordon
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>


Re: [Stateful Functions] Using Flink CEP

2020-04-13 Thread Oytun Tez
I am leaning towards using Siddhi as a library, but I would really love to
stick with Flink CEP, or at least the specific CEP mechanism that Flink CEP
uses. Exploring the Flink CEP codebase wasn't very promising on this front.




On Mon, Apr 13, 2020 at 11:22 AM Oytun Tez  wrote:

> Hi there,
>
> I was wondering if I could somehow use CEP within a Function. Have you
> experimented with this before?
>
> Or, do you have any suggestions to do CEP within a Function? I am looking
> for a standalone library now.
>
> Oytun
>
>
>
>


[Stateful Functions] Using Flink CEP

2020-04-13 Thread Oytun Tez
Hi there,

I was wondering if I could somehow use CEP within a Function. Have you
experimented with this before?

Or, do you have any suggestions to do CEP within a Function? I am looking
for a standalone library now.

Oytun





[Stateful Functions] Using statefun for E2E testing

2020-04-09 Thread Oytun Tez
Hi there,

Today we were designing a test for a workflow that involved 3 different
systems talking to each other asynchronously. My colleague came up with the
idea that we could use Flink for E2E testing, which we got excited about.

We came up with a quick implementation within our existing Flink
application; after some hours of debugging this and that, everything
actually worked very nicely. We triggered the initial actions within
Functions, other Functions kept state for CEP-like logic (can we use Flink
CEP directly?), and some events triggered validation assortments via async
API calls and such...

Has anyone used a similar approach? This is just a general question to see
resources about integration testing via Flink.





StateFun - Multiple modules example

2020-04-08 Thread Oytun Tez
Hi there,

Does anyone have any statefun 2.0 examples with multiple modules?
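
In case it helps later readers: in StateFun 2.0, embedded (Java) modules are
discovered via `META-INF/services/org.apache.flink.statefun.sdk.spi.StatefulFunctionModule`
entries, so "multiple modules" means multiple jars each carrying that service
file; remote modules are each declared in a `module.yaml`. A sketch of one
such remote-module file (the endpoint and names are placeholders):

```yaml
version: "1.0"
module:
  meta:
    type: remote
  spec:
    functions:
      - function:
          meta:
            kind: http
            type: example/greeter              # placeholder function type
          spec:
            endpoint: http://worker:8000/statefun  # placeholder endpoint
            states:
              - seen_count
```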





Re: [ANNOUNCE] Apache Flink Stateful Functions 2.0.0 released

2020-04-07 Thread Oytun Tez
I should also add, I couldn't agree more with this sentence in the release
article: "state access/updates and messaging need to be integrated."

This is something we strictly enforce in our Flink use case, where we do
not refer to anything external for storage; we use Flink as our DB.




On Tue, Apr 7, 2020 at 12:26 PM Oytun Tez  wrote:

> Great news! Thank you all.
>
> On Tue, Apr 7, 2020 at 12:23 PM Marta Paes Moreira 
> wrote:
>
>> Thank you for managing the release, Gordon — you did a tremendous job!
>> And to everyone else who worked on pushing it through.
>>
>> Really excited about the new use cases that StateFun 2.0 unlocks for
>> Flink users and beyond!
>>
>>
>> Marta
>>
>> On Tue, Apr 7, 2020 at 4:47 PM Hequn Cheng  wrote:
>>
>>> Thanks a lot for the release and your great job, Gordon!
>>> Also thanks to everyone who made this release possible!
>>>
>>> Best,
>>> Hequn
>>>
>>> On Tue, Apr 7, 2020 at 8:58 PM Tzu-Li (Gordon) Tai 
>>> wrote:
>>>
>>>> The Apache Flink community is very happy to announce the release of
>>>> Apache Flink Stateful Functions 2.0.0.
>>>>
>>>> Stateful Functions is an API that simplifies building distributed
>>>> stateful applications.
>>>> It's based on functions with persistent state that can interact
>>>> dynamically with strong consistency guarantees.
>>>>
>>>> Please check out the release blog post for an overview of the release:
>>>> https://flink.apache.org/news/2020/04/07/release-statefun-2.0.0.html
>>>>
>>>> The release is available for download at:
>>>> https://flink.apache.org/downloads.html
>>>>
>>>> Maven artifacts for Stateful Functions can be found at:
>>>> https://search.maven.org/search?q=g:org.apache.flink%20statefun
>>>>
>>>> Python SDK for Stateful Functions published to the PyPI index can be
>>>> found at:
>>>> https://pypi.org/project/apache-flink-statefun/
>>>>
>>>> Official Docker image for building Stateful Functions applications is
>>>> currently being published to Docker Hub.
>>>> Dockerfiles for this release can be found at:
>>>> https://github.com/apache/flink-statefun-docker/tree/master/2.0.0
>>>> Progress for creating the Docker Hub repository can be tracked at:
>>>> https://github.com/docker-library/official-images/pull/7749
>>>>
>>>> The full release notes are available in Jira:
>>>>
>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346878
>>>>
>>>> We would like to thank all contributors of the Apache Flink community
>>>> who made this release possible!
>>>>
>>>> Cheers,
>>>> Gordon
>>>>
>>> --
>


Re: [ANNOUNCE] Apache Flink Stateful Functions 2.0.0 released

2020-04-07 Thread Oytun Tez
Great news! Thank you all.

On Tue, Apr 7, 2020 at 12:23 PM Marta Paes Moreira 
wrote:

> Thank you for managing the release, Gordon — you did a tremendous job! And
> to everyone else who worked on pushing it through.
>
> Really excited about the new use cases that StateFun 2.0 unlocks for Flink
> users and beyond!
>
>
> Marta
>
> On Tue, Apr 7, 2020 at 4:47 PM Hequn Cheng  wrote:
>
>> Thanks a lot for the release and your great job, Gordon!
>> Also thanks to everyone who made this release possible!
>>
>> Best,
>> Hequn
>>
>> On Tue, Apr 7, 2020 at 8:58 PM Tzu-Li (Gordon) Tai 
>> wrote:
>>
>>> The Apache Flink community is very happy to announce the release of
>>> Apache Flink Stateful Functions 2.0.0.
>>>
>>> Stateful Functions is an API that simplifies building distributed
>>> stateful applications.
>>> It's based on functions with persistent state that can interact
>>> dynamically with strong consistency guarantees.
>>>
>>> Please check out the release blog post for an overview of the release:
>>> https://flink.apache.org/news/2020/04/07/release-statefun-2.0.0.html
>>>
>>> The release is available for download at:
>>> https://flink.apache.org/downloads.html
>>>
>>> Maven artifacts for Stateful Functions can be found at:
>>> https://search.maven.org/search?q=g:org.apache.flink%20statefun
>>>
>>> Python SDK for Stateful Functions published to the PyPI index can be
>>> found at:
>>> https://pypi.org/project/apache-flink-statefun/
>>>
>>> Official Docker image for building Stateful Functions applications is
>>> currently being published to Docker Hub.
>>> Dockerfiles for this release can be found at:
>>> https://github.com/apache/flink-statefun-docker/tree/master/2.0.0
>>> Progress for creating the Docker Hub repository can be tracked at:
>>> https://github.com/docker-library/official-images/pull/7749
>>>
>>> The full release notes are available in Jira:
>>>
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346878
>>>
>>> We would like to thank all contributors of the Apache Flink community
>>> who made this release possible!
>>>
>>> Cheers,
>>> Gordon
>>>
>> --


Re: FlinkCEP questions - architecture

2020-02-21 Thread Oytun Tez
link terminology. Is that
>> assumption correct?
>> >
>> > 2) The records within the DB export files are NOT in chronological order,
>> and we can not change the export. Our use case is a "complex event
>> processing" case (FlinkCEP) with rules like "KeyBy(someKey) If first A,
>> then B, then C within 30 days, then do something". Does that work with
>> FlinkCEP despite the events/records are not in chrono order within the
>> file? The files are 100MB to 20GB in size. Do I need to sort the files
>> first before CEP processing?
>> >
>> > 3) Occassionally some crazy people manually "correct" DB records within
>> the database and manually trigger a re-export of ALL of the changes for
>> that respective day (e.g. last weeks Tuesday). Consequently we receive a
>> correction file. Same filename but "_1" appended. All filenames include the
>> date (of the original export). What are the options to handle that case
>> (besides telling the DB admins not to, which we did already). Regular
>> checkpoints and re-process all files since then?  What happens to the CEP
>> state? Will it be checkpointed as well?
>> >
>> > 4) Our CEP rules span up to 180 days (resp. 6 months). Is that a problem?
>> >
>> > 5) We also have CEP rules that must fire if after a start sequence
>> matched, the remaining sequence did NOT within a configured window. E.g. If
>> A, then B, but C did not occur within 30 days since A. Is that supported by
>> FlinkCEP? I couldn't find a working example.
>> >
>> > 6) We expect 30-40 CEP rules. How can we estimate the required storage
>> size for the temporary CEP state? Is there some sort of formula
>> considering number of rules, number of records per file or day, record
>> size, window, number of records matched per sequence, number of keyBy
>> grouping keys, ...
>> >
>> > 7) I can imagine that for debugging reasons it'd be good if we were
>> able to query the temporary CEP state. What is the (CEP) schema used to
>> persist the CEP state and how can we query it? And does such query work on
>> the whole cluster or only per node (e.g. because of shuffle and nodes
>> responsible only for a portion of the events).
>> >
>> > 8) I understand state is stored per node. What happens if I want to add
>> or remove a nodes. Will the state still be found, despite it being stored
>> in another node? I read that I need to be equally careful when changing
>> rules? Or is that a different issue?
>> >
>> > 9) How does garbage collection of temp CEP state work, or will it stay
>> forever?  For tracing/investigation reasons I can imagine that purging it
>> at the earliest possible time is not always the best option. May be after
>> 30 days later or so.
>> >
>> > 10) Are there strategies to minimize temp CEP state? In SQL queries
>> you  filter first on the "smallest" attributes. CEP rules form a sequence.
>> Hence that approach will not work. Is that an issue at all? What are
>> practical limits on the CEP temp state storage engine?
>> >
>> > 11) Occassionally we need to process about 200 files at once. Can I
>> speed things up by processing all files in parallel on multiple nodes,
>> despite their sequence (CEP use case)? This would only work if FlinkCEP in
>> step 1 simply filters on all relevant events of a sequence, updates state,
>> and in a step 2 - after the files are processed - evaluates the updated
>> state if that meets the sequences.
>> >
>> > 12) Schema changes in the input files: Occassionly the DB source system
>> schema is changed, and not always in a backwards compatible way (insert new
>> fields in the middle), and also the export will have the field in the
>> middle. This means that starting from a specific (file) date, I need to
>> consider a different schema. This must also be handled when re-running
>> files for the last month, because of corrections provided. And if the file
>> format has changed somewhere in the middle ...
>> >
>> > thanks a lot for your time and your help
>> > Juergen
>>
> --


Re: [ANNOUNCE] Apache Flink 1.10.0 released

2020-02-13 Thread Oytun Tez


On Thu, Feb 13, 2020 at 2:26 AM godfrey he  wrote:

> Congrats to everyone involved! Thanks, Yu & Gary.
>
> Best,
> godfrey
>
> Yu Li  于2020年2月13日周四 下午12:57写道:
>
>> Hi Kristoff,
>>
>> Thanks for the question.
>>
>> About Java 11 support, please allow me to quote from our release note [1]:
>>
>> Lastly, note that the connectors for Cassandra, Hive, HBase, and Kafka
>> 0.8–0.11
>> have not been tested with Java 11 because the respective projects did not
>> provide
>> Java 11 support at the time of the Flink 1.10.0 release
>>
>> Which is the main reason for us to still make our docker image based on
>> JDK 8.
>>
>> Hope this answers your question.
>>
>> Best Regards,
>> Yu
>>
>> [1]
>> https://ci.apache.org/projects/flink/flink-docs-release-1.10/release-notes/flink-1.10.html
>>
>>
>> On Wed, 12 Feb 2020 at 23:43, KristoffSC 
>> wrote:
>>
>>> Hi all,
>>> I have a small question regarding 1.10
>>>
>>> Correct me if I'm wrong, but 1.10 should support Java 11 right?
>>>
>>> If so, then I noticed that docker images [1] referenced in [2] are still
>>> based on openjdk8 not Java 11.
>>>
>>> What's up with that?
>>>
>>> P.S.
>>> Congrats on releasing 1.10 ;)
>>>
>>> [1]
>>>
>>> https://github.com/apache/flink/blob/release-1.10/flink-container/docker/Dockerfile
>>> [2]
>>>
>>> https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/docker.html
>>>
>>>
>>>
>>> --
>>> Sent from:
>>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>>>
>> --
 --

[image: MotaWord]
Oytun Tez
M O T A W O R D | CTO & Co-Founder
oy...@motaword.com

  <https://www.motaword.com/blog>


Re: [ANNOUNCE] Apache Flink 1.9.2 released

2020-01-31 Thread Oytun Tez
Thank you!


 --

[image: MotaWord]
Oytun Tez
M O T A W O R D | CTO & Co-Founder
oy...@motaword.com

  <https://www.motaword.com/blog>


On Fri, Jan 31, 2020 at 8:47 AM Till Rohrmann  wrote:

> Thanks for being our release manager Hequn and thanks for everyone who
> made the 1.9.2 release possible.
>
> Cheers,
> Till
>
> On Fri, Jan 31, 2020 at 1:46 PM Hequn Cheng  wrote:
>
>> @jincheng sun  That's great. Thank you!
>>
>> On Fri, Jan 31, 2020 at 7:57 PM jincheng sun 
>> wrote:
>>
>>> Thanks for being the release manager and the great work Hequn :)
>>>
>>> Also thanks to the community making this release possible!
>>>
>>> BTW: I have add the 1.9.2 release to report.
>>>
>>> Best,
>>> Jincheng
>>>
>>> On Fri, Jan 31, 2020 at 6:55 PM Hequn Cheng  wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> The Apache Flink community is very happy to announce the release of
>>>> Apache Flink 1.9.2, which is the second bugfix release for the Apache Flink
>>>> 1.9 series.
>>>>
>>>> Apache Flink® is an open-source stream processing framework for
>>>> distributed, high-performing, always-available, and accurate data streaming
>>>> applications.
>>>>
>>>> The release is available for download at:
>>>> https://flink.apache.org/downloads.html
>>>>
>>>> Please check out the release blog post for an overview of the
>>>> improvements for this bugfix release:
>>>> https://flink.apache.org/news/2020/01/30/release-1.9.2.html
>>>>
>>>> The full release notes are available in Jira:
>>>> https://issues.apache.org/jira/projects/FLINK/versions/12346278
>>>>
>>>> We would like to thank all contributors of the Apache Flink community
>>>> who helped to verify this release and made this release possible!
>>>> Great thanks to @Jincheng for helping finalize this release.
>>>>
>>>> Regards,
>>>> Hequn
>>>>
>>>>


Re: SANSA 0.7.1 (Scalable Semantic Analytics Stack) Released

2020-01-16 Thread Oytun Tez
Looks interesting, Hajira. Thank you for sharing.

We should add this to flink-packages.org.





 --

[image: MotaWord]
Oytun Tez
M O T A W O R D | CTO & Co-Founder
oy...@motaword.com

  <https://www.motaword.com/blog>


On Thu, Jan 16, 2020 at 5:01 PM Hajira Jabeen 
wrote:

>
> Dear all,
>
> The Smart Data Analytics group (http://sda.tech) is happy to announce
> SANSA 0.7.1 - the seventh release of the Scalable Semantic Analytics Stack.
> SANSA employs distributed computing via Apache Spark and Apache Flink in
> order to allow scalable machine learning, inference and querying
> capabilities for large knowledge graphs.
>
> Website: http://sansa-stack.net
>
> GitHub: https://github.com/SANSA-Stack
>
> Download: https://github.com/SANSA-Stack/SANSA-Stack/releases
>
> In the latest release, we have included support for Query-engine over
> compressed RDF data, added more data quality assessment methods as well as
> new clustering approaches.
>
> You can find the FAQ and usage examples at http://sansa-stack.net/faq/.
>
> View this announcement on Twitter and the SANSA blog:
>
>   http://sansa-stack.net/sansa-0-7/
>
>   https://twitter.com/SANSA_Stack/status/1217772649293795328
>
>
> Kind regards,
>
> The SANSA Development Team
>
> (http://sansa-stack.net/community/#Contributors)
>
> Dr Hajira Jabeen
> Senior researcher,
> SDA, Universität Bonn.
>
> http://sda.cs.uni-bonn.de/people/dr-hajira-jabeen/
>


Re: [ANNOUNCE] Launch of flink-packages.org: A website to foster the Flink Ecosystem

2019-11-18 Thread Oytun Tez
Congratulations! This is exciting.


 --

[image: MotaWord]
Oytun Tez
M O T A W O R D | CTO & Co-Founder
oy...@motaword.com

  <https://www.motaword.com/blog>


On Mon, Nov 18, 2019 at 11:07 AM Robert Metzger  wrote:

> Hi all,
>
> I would like to announce that Ververica, with the permission of the Flink
> PMC, is launching a website called flink-packages.org. This goes back to
> an effort proposed earlier in 2019 [1]
> The idea of the site is to help developers building extensions /
> connectors / API etc. for Flink to get attention for their project.
> At the same time, we want to help Flink users to find those ecosystem
> projects, so that they can benefit from the work. A voting and commenting
> functionality allows users to rate and discuss individual
> packages.
>
> You can find the website here: https://flink-packages.org/
>
> The full announcement is available here:
> https://www.ververica.com/blog/announcing-flink-community-packages
>
> I'm happy to hear any feedback about the site.
>
> Best,
> Robert
>
>
> [1]
> https://lists.apache.org/thread.html/c306b8b6d5d2ca071071b634d647f47769760e1e91cd758f52a62c93@%3Cdev.flink.apache.org%3E
>


Re: Broadcast state

2019-09-30 Thread Oytun Tez
This is how we currently use broadcast state. Our states are re-usable
(code-wise), every operator that wants to consume basically keeps the same
descriptor state locally by processBroadcastElement'ing into a local state.

I am open to suggestions. I see this as a hard drawback of dataflow
programming or Flink framework?



---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Mon, Sep 30, 2019 at 8:40 PM Oytun Tez  wrote:

> You can re-use the broadcasted state (along with its descriptor) that
> comes into your KeyedBroadcastProcessFunction, in another operator
> downstream. that's basically duplicating the broadcasted state whichever
> operator you want to use, every time.
>
>
>
> ---
> Oytun Tez
>
> *M O T A W O R D*
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>
>
> On Mon, Sep 30, 2019 at 8:29 PM Navneeth Krishnan <
> reachnavnee...@gmail.com> wrote:
>
>> Hi All,
>>
>> Is it possible to access a broadcast state across the pipeline? For
>> example, say I have a KeyedBroadcastProcessFunction which adds the incoming
>> data to state and I have downstream operator where I need the same state as
>> well, would I be able to just read the broadcast state with a readonly
>> view. I know this is possible in kafka streams.
>>
>> Thanks
>>
>
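[Editor's note] The re-use-by-duplication approach described in this thread can be sketched as follows: each operator that wants the broadcasted data connects to the same broadcast stream and re-populates its own copy of the state under the same `MapStateDescriptor`. This is a minimal sketch against the Flink 1.x DataStream API; the `Rule` type, the `"rules"` state name, and the wiring are illustrative placeholders, not the poster's actual code.

```java
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.KeyedBroadcastProcessFunction;
import org.apache.flink.util.Collector;

public class BroadcastReuse {

    // One shared descriptor (or equal copies of it) used by every consumer.
    static final MapStateDescriptor<String, Rule> RULES =
        new MapStateDescriptor<>("rules", BasicTypeInfo.STRING_TYPE_INFO,
                                 TypeInformation.of(Rule.class));

    public static class Rule {
        public String pattern;
    }

    // Any operator wanting the rules connects to the broadcast stream and
    // keeps its own local copy of the state under the same descriptor.
    public static class RuleApplier
            extends KeyedBroadcastProcessFunction<String, String, Rule, String> {

        @Override
        public void processBroadcastElement(Rule rule, Context ctx, Collector<String> out)
                throws Exception {
            // Every parallel instance stores the rule in its broadcast state.
            ctx.getBroadcastState(RULES).put(rule.pattern, rule);
        }

        @Override
        public void processElement(String value, ReadOnlyContext ctx, Collector<String> out)
                throws Exception {
            // Read-only view of the same state on the keyed side.
            if (ctx.getBroadcastState(RULES).contains(value)) {
                out.collect(value);
            }
        }
    }

    // Wiring: broadcast once, connect in as many operators as needed.
    static DataStream<String> wire(DataStream<String> keyedInput, DataStream<Rule> rules) {
        BroadcastStream<Rule> broadcast = rules.broadcast(RULES);
        return keyedInput.keyBy(v -> v).connect(broadcast).process(new RuleApplier());
    }
}
```

A second downstream operator would repeat the `connect(broadcast).process(...)` step with its own function, which is exactly the duplication the thread describes.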


Re: Broadcast state

2019-09-30 Thread Oytun Tez
You can re-use the broadcasted state (along with its descriptor) that comes
into your KeyedBroadcastProcessFunction, in another operator downstream.
that's basically duplicating the broadcasted state whichever operator you
want to use, every time.



---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Mon, Sep 30, 2019 at 8:29 PM Navneeth Krishnan 
wrote:

> Hi All,
>
> Is it possible to access a broadcast state across the pipeline? For
> example, say I have a KeyedBroadcastProcessFunction which adds the incoming
> data to state and I have downstream operator where I need the same state as
> well, would I be able to just read the broadcast state with a readonly
> view. I know this is possible in kafka streams.
>
> Thanks
>


Re: Best way to link static data to event data?

2019-09-27 Thread Oytun Tez
Hi,

You should look broadcast state pattern in Flink docs.

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Fri, Sep 27, 2019 at 2:42 PM John Smith  wrote:

> Using 1.8
>
> I have a list of phone area codes, cities and their geo location in CSV
> file. And my events from Kafka contain phone numbers.
>
> I want to parse the phone number get it's area code and then associate the
> phone number to a city, geo location and as well count how many numbers are
> in that city/geo location.
>
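[Editor's note] In the broadcast state pattern suggested above, the CSV rows would be broadcast into a `MapState` and the phone-number stream enriched against it; the enrichment step itself is plain Java. Below is a minimal, self-contained sketch of just that lookup logic, assuming North American numbers where the area code is the first three digits after an optional leading `1` country code; the `AreaCodeInfo` type and sample data are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class AreaCodeLookup {

    // Hypothetical value type for one CSV row: area code -> city + geo location.
    static final class AreaCodeInfo {
        final String city;
        final double lat;
        final double lon;

        AreaCodeInfo(String city, double lat, double lon) {
            this.city = city;
            this.lat = lat;
            this.lon = lon;
        }
    }

    private final Map<String, AreaCodeInfo> table = new HashMap<>();

    // Each CSV line is expected as "areaCode,city,lat,lon".
    void loadCsvLine(String line) {
        String[] f = line.split(",");
        table.put(f[0], new AreaCodeInfo(f[1],
                Double.parseDouble(f[2]), Double.parseDouble(f[3])));
    }

    // Strip non-digits, drop a leading "1" country code, take the first 3 digits.
    static String areaCodeOf(String phoneNumber) {
        String digits = phoneNumber.replaceAll("\\D", "");
        if (digits.length() == 11 && digits.startsWith("1")) {
            digits = digits.substring(1);
        }
        return digits.substring(0, 3);
    }

    Optional<AreaCodeInfo> lookup(String phoneNumber) {
        return Optional.ofNullable(table.get(areaCodeOf(phoneNumber)));
    }

    public static void main(String[] args) {
        AreaCodeLookup lookup = new AreaCodeLookup();
        lookup.loadCsvLine("212,New York,40.7128,-74.0060");
        System.out.println(areaCodeOf("+1 (212) 555-0100")); // 212
        System.out.println(lookup.lookup("12125550100").get().city); // New York
    }
}
```

In a Flink job, `loadCsvLine` would run in `processBroadcastElement` (writing into the broadcast `MapState`) and `lookup` in `processElement` of a `BroadcastProcessFunction`.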


Re: [SURVEY] How many people are using customized RestartStrategy(s)

2019-09-12 Thread Oytun Tez
Hi Zhu,

We are using custom restart strategy like this:

environment.setRestartStrategy(failureRateRestart(2, Time.minutes(1),
Time.minutes(10)));


---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Thu, Sep 12, 2019 at 7:11 AM Zhu Zhu  wrote:

> Hi everyone,
>
> I wanted to reach out to you and ask how many of you are using a
> customized RestartStrategy[1] in production jobs.
>
> We are currently developing the new Flink scheduler[2] which interacts
> with restart strategies in a different way. We have to re-design the
> interfaces for the new restart strategies (so called
> RestartBackoffTimeStrategy). Existing customized RestartStrategy will not
> work any more with the new scheduler.
>
> We want to know whether we should keep the way
> to customized RestartBackoffTimeStrategy so that existing customized
> RestartStrategy can be migrated.
>
> I'd appreciate if you can share the status if you are using customized
> RestartStrategy. That will be valuable for use to make decisions.
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/dev/task_failure_recovery.html#restart-strategies
> [2] https://issues.apache.org/jira/browse/FLINK-10429
>
> Thanks,
> Zhu Zhu
>
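[Editor's note] The one-liner quoted above relies on a static import of `RestartStrategies.failureRateRestart`. Spelled out with its imports and the same numbers as in the original message, the configuration looks roughly like this; per the `RestartStrategies` javadoc the parameters are (max failures per interval, measuring interval, delay between restart attempts).

```java
import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RestartStrategyExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();

        // Fail the job permanently once more than 2 restarts happen within the
        // 1-minute measuring interval; wait 10 minutes between restart attempts.
        env.setRestartStrategy(RestartStrategies.failureRateRestart(
            2, Time.minutes(1), Time.minutes(10)));
    }
}
```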


Re: [ANNOUNCE] Zili Chen becomes a Flink committer

2019-09-11 Thread Oytun Tez
Congratulations!

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Wed, Sep 11, 2019 at 6:36 AM bupt_ljy  wrote:

> Congratulations!
>
>
> Best,
>
> Jiayi Liao
>
>  Original Message
> *Sender:* Till Rohrmann
> *Recipient:* dev; user
> *Date:* Wednesday, Sep 11, 2019 17:22
> *Subject:* [ANNOUNCE] Zili Chen becomes a Flink committer
>
> Hi everyone,
>
> I'm very happy to announce that Zili Chen (some of you might also know
> him as Tison Kun) accepted the offer of the Flink PMC to become a committer
> of the Flink project.
>
> Zili Chen has been an active community member for almost 16 months now.
> He helped pushing the Flip-6 effort over the finish line, ported a lot of
> legacy code tests, removed a good part of the legacy code, contributed
> numerous fixes, is involved in the Flink's client API refactoring, drives
> the refactoring of Flink's HighAvailabilityServices and much more. Zili
> Chen also helped the community by PR reviews, reporting Flink issues,
> answering user mails and being very active on the dev mailing list.
>
> Congratulations Zili Chen!
>
> Best, Till
> (on behalf of the Flink PMC)
>


Re: Making broadcast state queryable?

2019-09-06 Thread Oytun Tez
Hi Yu,

Excuse my late reply... We simply want Flink to be our centralized stream
analysis platform, where we 1) consume data, 2) generate analysis, 3)
present analysis. I honestly don't want "stream analysis" to spill out to
other components in our ecosystem (e.g., sinking insights into a DB-like
place).

So the case for QS for us is centralization of input, output, presentation.
State Processor API for instance also counts as a presentation tool for us
(on top of migration tool).

This kind of all-in-one (in, out, ui) packaging helps with small teams.

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Wed, Aug 14, 2019 at 3:13 AM Yu Li  wrote:

> Good to know your thoughts and the coming talk in Flink Forward Berlin
> Oytun, interesting topic and great job! And it's great to hear the voice
> from application perspective.
>
> Could you share, if possible, the reason why you need to open the state to
> outside instead of writing the result to sink and directly query there? In
> another thread there's a case that sink writes to different kafka topics so
> state is the only place to get a global view, is this the same case you're
> facing? Or some different requirements? I believe more attention will be
> drawn to QS if more and more user requirements emerge (smile).
>
> Thanks.
>
> Best Regards,
> Yu
>
>
> On Wed, 14 Aug 2019 at 00:50, Oytun Tez  wrote:
>
>> Thank you for the honest response, Yu!
>>
>> There is so much that comes to mind when we look at Flink as a
>> "application framework" (my talk
>> <https://europe-2019.flink-forward.org/conference-program#not-so-big-%E2%80%93-flink-as-a-true-application-framework>
>> in Flink Forward in Berlin will be about this). QS is one of them
>> (need-wise, not QS itself necessarily). I opened the design doc Vino Yang
>> created, I'll try to contribute to that.
>>
>> Meanwhile, for the sake of opening our state to outside, we will put a
>> stupid simple operator in between to keep a *duplicate* of the state...
>>
>> Thanks again!
>>
>>
>>
>>
>>
>> ---
>> Oytun Tez
>>
>> *M O T A W O R D*
>> The World's Fastest Human Translation Platform.
>> oy...@motaword.com — www.motaword.com
>>
>>
>> On Tue, Aug 13, 2019 at 6:29 PM Yu Li  wrote:
>>
>>> Hi Oytun,
>>>
>>> Sorry but TBH such support will probably not be added in the foreseeable
>>> future due to lack of committer bandwidth (not only support queryable
>>> broadcast state but all about QueryableState module) as pointed out in
>>> other threads [1] [2].
>>>
>>> However, I think you could open a JIRA for this so when things changed
>>> it could be taken care of. Thanks.
>>>
>>> [1] https://s.apache.org/MaOl
>>> [2] https://s.apache.org/r8k8a
>>>
>>> Best Regards,
>>> Yu
>>>
>>>
>>> On Tue, 13 Aug 2019 at 20:34, Oytun Tez  wrote:
>>>
>>>> Hi there,
>>>>
>>>> Can we set a broadcast state as queryable? I've looked around, not much
>>>> to find about it. I am receiving UnknownKvStateLocation when I try to query
>>>> with the descriptor/state name I give to the broadcast state.
>>>>
>>>> If it doesn't work, what could be the alternative? My mind goes around
>>>> ctx.getBroadcastState and making it queryable in the operator level (I'd
>>>> rather not).
>>>>
>>>> ---
>>>> Oytun Tez
>>>>
>>>> *M O T A W O R D*
>>>> The World's Fastest Human Translation Platform.
>>>> oy...@motaword.com — www.motaword.com
>>>>
>>>
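[Editor's note] For context, this is roughly how a client queries *keyed* state that has been made queryable — the part that works today; broadcast state, as discussed in this thread, is not covered. It uses `QueryableStateClient` from `flink-queryable-state-client-java`; the proxy host/port, job id, state name ("query-name"), and key below are placeholders.

```java
import java.util.concurrent.CompletableFuture;

import org.apache.flink.api.common.JobID;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.queryablestate.client.QueryableStateClient;

public class QueryStateExample {

    // Descriptor must match the one registered (and made queryable) inside the job.
    static final ValueStateDescriptor<Long> COUNT =
        new ValueStateDescriptor<>("word-count", Types.LONG);

    static Long queryCount(String proxyHost, int proxyPort, JobID jobId, String key)
            throws Exception {
        // The queryable state proxy runs on each TaskManager (default port 9069).
        QueryableStateClient client = new QueryableStateClient(proxyHost, proxyPort);
        try {
            CompletableFuture<ValueState<Long>> future = client.getKvState(
                jobId, "query-name", key, BasicTypeInfo.STRING_TYPE_INFO, COUNT);
            return future.get().value();
        } finally {
            client.shutdownAndWait();
        }
    }
}
```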


Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-06 Thread Oytun Tez
It was only natural! I already thought Kostas was a PMC member!

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Fri, Sep 6, 2019 at 9:51 AM Zili Chen  wrote:

> Congrats Klou!
>
> Best,
> tison.
>
>
> On Fri, Sep 6, 2019 at 9:23 PM Till Rohrmann  wrote:
>
>> Congrats Klou!
>>
>> Cheers,
>> Till
>>
>> On Fri, Sep 6, 2019 at 3:00 PM Dian Fu  wrote:
>>
>>> Congratulations Kostas!
>>>
>>> Regards,
>>> Dian
>>>
>>> > On Sep 6, 2019, at 8:58 PM, Wesley Peng  wrote:
>>> >
>>> > On 2019/9/6 8:55 下午, Fabian Hueske wrote:
>>> >> I'm very happy to announce that Kostas Kloudas is joining the Flink
>>> PMC.
>>> >> Kostas is contributing to Flink for many years and puts lots of
>>> effort in helping our users and growing the Flink community.
>>> >> Please join me in congratulating Kostas!
>>> >
>>> > congratulation Kostas!
>>> >
>>> > regards.
>>>
>>>


Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-27 Thread Oytun Tez
Thank you, Fabian! We are migrating soon once 2.12 is available.

Cheers,

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Tue, Aug 27, 2019 at 8:11 AM Fabian Hueske  wrote:

> Hi all,
>
> Flink 1.9 Docker images are available at Docker Hub [1] now.
> Due to some configuration issue, there are only Scala 2.11 images at the
> moment but this was fixed [2].
> Flink 1.9 Scala 2.12 images should be available soon.
>
> Cheers,
> Fabian
>
> [1] https://hub.docker.com/_/flink
> [2]
> https://github.com/docker-flink/docker-flink/commit/01e41867c9270cd4dd44970cdbe53ff665e0c9e3
>
> Am Mo., 26. Aug. 2019 um 20:03 Uhr schrieb Oytun Tez :
>
>> Thanks Till and Zili!
>>
>> I see that docker-flink repo now has 1.9 set up, we are only waiting for
>> it to be pushed to Docker Hub. We should be fine once that is done.
>>
>> Thanks again!
>>
>>
>>
>>
>> ---
>> Oytun Tez
>>
>> *M O T A W O R D*
>> The World's Fastest Human Translation Platform.
>> oy...@motaword.com — www.motaword.com
>>
>>
>> On Mon, Aug 26, 2019 at 4:04 AM Zili Chen  wrote:
>>
>>> Hi Oytun,
>>>
>>> I think it intends to publish flink-queryable-state-client-java
>>> without scala suffix since it is scala-free. An artifact without
>>> scala suffix has been published [2].
>>>
>>> See also [1].
>>>
>>> Best,
>>> tison.
>>>
>>> [1] https://issues.apache.org/jira/browse/FLINK-12602
>>> [2]
>>> https://mvnrepository.com/artifact/org.apache.flink/flink-queryable-state-client-java/1.9.0
>>>
>>>
>>>
>>> On Mon, Aug 26, 2019 at 3:50 PM Till Rohrmann  wrote:
>>>
>>>> The missing support for the Scala shell with Scala 2.12 was documented
>>>> in the 1.7 release notes [1].
>>>>
>>>> @Oytun, the docker image should be updated in a bit. Sorry for the
>>>> inconveniences. Thanks for the pointer that
>>>> flink-queryable-state-client-java_2.11 hasn't been published. We'll upload
>>>> this in a bit.
>>>>
>>>> [1]
>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.7/release-notes/flink-1.7.html#scala-shell-does-not-work-with-scala-212
>>>>
>>>> Cheers,
>>>> Till
>>>>
>>>> On Sat, Aug 24, 2019 at 12:14 PM chaojianok  wrote:
>>>>
>>>>> Congratulations and thanks!
>>>>> At 2019-08-22 20:03:26, "Tzu-Li (Gordon) Tai" 
>>>>> wrote:
>>>>> >The Apache Flink community is very happy to announce the release of
>>>>> Apache
>>>>> >Flink 1.9.0, which is the latest major release.
>>>>> >
>>>>> >Apache Flink® is an open-source stream processing framework for
>>>>> >distributed, high-performing, always-available, and accurate data
>>>>> streaming
>>>>> >applications.
>>>>> >
>>>>> >The release is available for download at:
>>>>> >https://flink.apache.org/downloads.html
>>>>> >
>>>>> >Please check out the release blog post for an overview of the
>>>>> improvements
>>>>> >for this new major release:
>>>>> >https://flink.apache.org/news/2019/08/22/release-1.9.0.html
>>>>> >
>>>>> >The full release notes are available in Jira:
>>>>> >
>>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344601
>>>>> >
>>>>> >We would like to thank all contributors of the Apache Flink community
>>>>> who
>>>>> >made this release possible!
>>>>> >
>>>>> >Cheers,
>>>>> >Gordon
>>>>>
>>>>


Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-26 Thread Oytun Tez
Thanks Till and Zili!

I see that docker-flink repo now has 1.9 set up, we are only waiting for it
to be pushed to Docker Hub. We should be fine once that is done.

Thanks again!




---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Mon, Aug 26, 2019 at 4:04 AM Zili Chen  wrote:

> Hi Oytun,
>
> I think it intends to publish flink-queryable-state-client-java
> without scala suffix since it is scala-free. An artifact without
> scala suffix has been published [2].
>
> See also [1].
>
> Best,
> tison.
>
> [1] https://issues.apache.org/jira/browse/FLINK-12602
> [2]
> https://mvnrepository.com/artifact/org.apache.flink/flink-queryable-state-client-java/1.9.0
>
>
>
> On Mon, Aug 26, 2019 at 3:50 PM Till Rohrmann  wrote:
>
>> The missing support for the Scala shell with Scala 2.12 was documented in
>> the 1.7 release notes [1].
>>
>> @Oytun, the docker image should be updated in a bit. Sorry for the
>> inconveniences. Thanks for the pointer that
>> flink-queryable-state-client-java_2.11 hasn't been published. We'll upload
>> this in a bit.
>>
>> [1]
>> https://ci.apache.org/projects/flink/flink-docs-release-1.7/release-notes/flink-1.7.html#scala-shell-does-not-work-with-scala-212
>>
>> Cheers,
>> Till
>>
>> On Sat, Aug 24, 2019 at 12:14 PM chaojianok  wrote:
>>
>>> Congratulations and thanks!
>>> At 2019-08-22 20:03:26, "Tzu-Li (Gordon) Tai" 
>>> wrote:
>>> >The Apache Flink community is very happy to announce the release of
>>> Apache
>>> >Flink 1.9.0, which is the latest major release.
>>> >
>>> >Apache Flink® is an open-source stream processing framework for
>>> >distributed, high-performing, always-available, and accurate data
>>> streaming
>>> >applications.
>>> >
>>> >The release is available for download at:
>>> >https://flink.apache.org/downloads.html
>>> >
>>> >Please check out the release blog post for an overview of the
>>> improvements
>>> >for this new major release:
>>> >https://flink.apache.org/news/2019/08/22/release-1.9.0.html
>>> >
>>> >The full release notes are available in Jira:
>>> >
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344601
>>> >
>>> >We would like to thank all contributors of the Apache Flink community
>>> who
>>> >made this release possible!
>>> >
>>> >Cheers,
>>> >Gordon
>>>
>>


Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Oytun Tez
Hi all,

We also had to rollback our upgrade effort for 2 reasons:

- Official Docker container is not ready yet
- This artefact is not published with
scala: org.apache.flink:flink-queryable-state-client-java_2.11:jar:1.9.0






---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Fri, Aug 23, 2019 at 10:48 AM Zili Chen  wrote:

> Hi Till,
>
> Did we mention this in release note(or maybe previous release note where
> we did the exclusion)?
>
> Best,
> tison.
>
>
> On Fri, Aug 23, 2019 at 10:28 PM Till Rohrmann  wrote:
>
>> Hi Gavin,
>>
>> if I'm not mistaken, then the community excluded the Scala FlinkShell
>> since a couple of versions for Scala 2.12. The problem seems to be that
>> some of the tests failed. See here [1] for more information.
>>
>> [1] https://issues.apache.org/jira/browse/FLINK-10911
>>
>> Cheers,
>> Till
>>
>> On Fri, Aug 23, 2019 at 11:44 AM Gavin Lee  wrote:
>>
>>> I used package on apache official site, with mirror [1], the difference
>>> is
>>> I used scala 2.12 version.
>>> I also tried to build from source for both scala 2.11 and 2.12; when
>>> building
>>> 2.12 the FlinkShell.class is in flink-dist jar file but after running mvn
>>> clean package -Dscala-2.12, this class was removed in
>>> flink-dist_2.12-1.9 jar
>>> file.
>>> Seems broken here for scala 2.12, right?
>>>
>>> [1]
>>>
>>> http://mirror.bit.edu.cn/apache/flink/flink-1.9.0/flink-1.9.0-bin-scala_2.12.tgz
>>>
>>> On Fri, Aug 23, 2019 at 4:37 PM Zili Chen  wrote:
>>>
>>> > I downloaded 1.9.0 dist from here[1] and didn't see the issue. Where
>>> do you
>>> > download it? Could you try to download the dist from [1] and see
>>> whether
>>> > the problem last?
>>> >
>>> > Best,
>>> > tison.
>>> >
>>> > [1]
>>> >
>>> http://mirrors.tuna.tsinghua.edu.cn/apache/flink/flink-1.9.0/flink-1.9.0-bin-scala_2.11.tgz
>>> >
>>> >
> On Fri, Aug 23, 2019 at 4:34 PM Gavin Lee  wrote:
>>> >
>>> >> Thanks for your reply @Zili.
>>> >> I'm afraid it's not the same issue.
>>> >> I found that the FlinkShell.class was not included in flink dist jar
>>> file
>>> >> in 1.9.0 version.
>>> >> Nowhere can find this class file inside jar, either in opt or lib
>>> >> directory under root folder of flink distribution.
>>> >>
>>> >>
>>> >> On Fri, Aug 23, 2019 at 4:10 PM Zili Chen 
>>> wrote:
>>> >>
>>> >>> Hi Gavin,
>>> >>>
>>> >>> I also find a problem in shell if the directory contain whitespace
>>> >>> then the final command to run is incorrect. Could you check the
>>> >>> final command to be executed?
>>> >>>
>>> >>> FYI, here is the ticket[1].
>>> >>>
>>> >>> Best,
>>> >>> tison.
>>> >>>
>>> >>> [1] https://issues.apache.org/jira/browse/FLINK-13827
>>> >>>
>>> >>>
>>> On Fri, Aug 23, 2019 at 3:36 PM Gavin Lee  wrote:
>>> >>>
>>> >>>> Why bin/start-scala-shell.sh local return following error?
>>> >>>>
>>> >>>> bin/start-scala-shell.sh local
>>> >>>>
>>> >>>> Error: Could not find or load main class
>>> >>>> org.apache.flink.api.scala.FlinkShell
>>> >>>> For flink 1.8.1 and previous ones, no such issues.
>>> >>>>
>>> >>>> On Fri, Aug 23, 2019 at 2:05 PM qi luo  wrote:
>>> >>>>
>>> >>>>> Congratulations and thanks for the hard work!
>>> >>>>>
>>> >>>>> Qi
>>> >>>>>
>>> >>>>> On Aug 22, 2019, at 8:03 PM, Tzu-Li (Gordon) Tai <
>>> tzuli...@apache.org>
>>> >>>>> wrote:
>>> >>>>>
>>> >>>>> The Apache Flink community is very happy to announce the release of
>>> >>>>> Apache Flink 1.9.0, which is the latest major release.
>>> >>>>>
>>> >>>>> Apache Flink® is an open-source stream processing framework for
>>> >>>>> distributed, high-performing, always-available, and accurate data
>>> streaming
>>> >>>>> applications.
>>> >>>>>
>>> >>>>> The release is available for download at:
>>> >>>>> https://flink.apache.org/downloads.html
>>> >>>>>
>>> >>>>> Please check out the release blog post for an overview of the
>>> >>>>> improvements for this new major release:
>>> >>>>> https://flink.apache.org/news/2019/08/22/release-1.9.0.html
>>> >>>>>
>>> >>>>> The full release notes are available in Jira:
>>> >>>>>
>>> >>>>>
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344601
>>> >>>>>
>>> >>>>> We would like to thank all contributors of the Apache Flink
>>> community
>>> >>>>> who made this release possible!
>>> >>>>>
>>> >>>>> Cheers,
>>> >>>>> Gordon
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>
>>> >>>> --
>>> >>>> Gavin
>>> >>>>
>>> >>>
>>> >>
>>> >> --
>>> >> Gavin
>>> >>
>>> >
>>>
>>> --
>>> Gavin
>>>
>>


Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-22 Thread Oytun Tez
Ah, I was worried State Processor API would work only on savepoints, but
this confirmed the opposite:

The new State Processor API covers all variations of snapshots: savepoints,
full checkpoints and incremental checkpoints.

Now this is a good day for me. I can imagine creating some abstractions
around it on the application level to open it up for every operator by
default, bind it to our custom rest implementation (we did this initially
to serve QueryableState with abstractions to automate).

Now the question is, whether to rid of QueryableState flows in our
application... Let's see. :)

This is a great update, thank you again!

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Thu, Aug 22, 2019 at 8:11 AM Jark Wu  wrote:

> Congratulations!
>
> Thanks Gordon and Kurt for being the release manager and thanks a lot to
> all the contributors.
>
>
> Cheers,
> Jark
>
> On Thu, 22 Aug 2019 at 20:06, Oytun Tez  wrote:
>
>> Congratulations team; thanks for the update, Gordon.
>>
>> ---
>> Oytun Tez
>>
>> *M O T A W O R D*
>> The World's Fastest Human Translation Platform.
>> oy...@motaword.com — www.motaword.com
>>
>>
>> On Thu, Aug 22, 2019 at 8:03 AM Tzu-Li (Gordon) Tai 
>> wrote:
>>
>>> The Apache Flink community is very happy to announce the release of
>>> Apache Flink 1.9.0, which is the latest major release.
>>>
>>> Apache Flink® is an open-source stream processing framework for
>>> distributed, high-performing, always-available, and accurate data streaming
>>> applications.
>>>
>>> The release is available for download at:
>>> https://flink.apache.org/downloads.html
>>>
>>> Please check out the release blog post for an overview of the
>>> improvements for this new major release:
>>> https://flink.apache.org/news/2019/08/22/release-1.9.0.html
>>>
>>> The full release notes are available in Jira:
>>>
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344601
>>>
>>> We would like to thank all contributors of the Apache Flink community
>>> who made this release possible!
>>>
>>> Cheers,
>>> Gordon
>>>
>>
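[Editor's note] The State Processor API mentioned at the top of this thread ships in the `flink-state-processor-api` module (DataSet-based in 1.9). A hedged sketch of reading keyed state out of a savepoint or checkpoint path; the operator uid `"counter"`, the state name `"count"`, the types, and the path are placeholders.

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.memory.MemoryStateBackend;
import org.apache.flink.state.api.ExistingSavepoint;
import org.apache.flink.state.api.Savepoint;
import org.apache.flink.state.api.functions.KeyedStateReaderFunction;
import org.apache.flink.util.Collector;

public class ReadSavepointExample {

    // Reads the "count" ValueState registered under the operator with uid "counter".
    static class CountReader extends KeyedStateReaderFunction<String, Long> {
        ValueState<Long> count;

        @Override
        public void open(Configuration parameters) {
            count = getRuntimeContext().getState(
                new ValueStateDescriptor<>("count", Types.LONG));
        }

        @Override
        public void readKey(String key, Context ctx, Collector<Long> out) throws Exception {
            out.collect(count.value());
        }
    }

    static DataSet<Long> readCounts(ExecutionEnvironment env, String snapshotPath)
            throws Exception {
        // Works for savepoints as well as full/incremental checkpoint paths.
        ExistingSavepoint savepoint =
            Savepoint.load(env, snapshotPath, new MemoryStateBackend());
        return savepoint.readKeyedState("counter", new CountReader());
    }
}
```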


Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-22 Thread Oytun Tez
Congratulations team; thanks for the update, Gordon.

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Thu, Aug 22, 2019 at 8:03 AM Tzu-Li (Gordon) Tai 
wrote:

> The Apache Flink community is very happy to announce the release of Apache
> Flink 1.9.0, which is the latest major release.
>
> Apache Flink® is an open-source stream processing framework for
> distributed, high-performing, always-available, and accurate data streaming
> applications.
>
> The release is available for download at:
> https://flink.apache.org/downloads.html
>
> Please check out the release blog post for an overview of the improvements
> for this new major release:
> https://flink.apache.org/news/2019/08/22/release-1.9.0.html
>
> The full release notes are available in Jira:
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344601
>
> We would like to thank all contributors of the Apache Flink community who
> made this release possible!
>
> Cheers,
> Gordon
>


Re: Questions for platform to choose

2019-08-21 Thread Oytun Tez
Flink 

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Wed, Aug 21, 2019 at 2:42 AM Eliza  wrote:

> Hello,
>
> We have all of spark, flink, storm, kafka installed.
> For realtime streaming calculation, which one is the best above?
> Like other big players, the logs in our stack are huge.
>
> Thanks.
>


Re: [VOTE] Apache Flink 1.9.0, release candidate #3

2019-08-19 Thread Oytun Tez
Thanks, Gordon, will do tomorrow.

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Mon, Aug 19, 2019 at 7:21 PM Tzu-Li (Gordon) Tai 
wrote:

> Hi!
>
> Voting on RC3 for Apache Flink 1.9.0 has started:
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Apache-Flink-1-9-0-release-candidate-3-td31988.html
>
> Please check this out if you want to verify your applications against this
> new Flink release.
>
> Cheers,
> Gordon
>
> -- Forwarded message -
> From: Tzu-Li (Gordon) Tai 
> Date: Tue, Aug 20, 2019 at 1:16 AM
> Subject: [VOTE] Apache Flink 1.9.0, release candidate #3
> To: dev 
>
>
> Hi all,
>
> Release candidate #3 for Apache Flink 1.9.0 is now ready for your review.
>
> Please review and vote on release candidate #3 for version 1.9.0, as
> follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org [2], which are signed with the key with
> fingerprint 1C1E2394D3194E1944613488F320986D35C33D6A [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag “release-1.9.0-rc3” [5].
> * pull requests for the release note documentation [6] and announcement
> blog post [7].
>
> As proposed in the RC2 vote thread [8], for RC3 we are only cherry-picking
> minimal specific changes on top of RC2 to be able to reasonably carry over
> previous testing efforts and effectively require a shorter voting time.
>
> The only extra commits in this RC, compared to RC2, are the following:
> - c2d9aeac [FLINK-13231] [pubsub] Replace Max outstanding acknowledgement
> ids limit with a FlinkConnectorRateLimiter
> - d8941711 [FLINK-13699][table-api] Fix TableFactory doesn’t work with DDL
> when containing TIMESTAMP/DATE/TIME types
> - 04e95278 [FLINK-13752] Only references necessary variables when
> bookkeeping result partitions on TM
>
> Due to the minimal set of changes, the vote for RC3 will be *open for
> only 48 hours*.
> Please cast your votes before *Aug. 21st (Wed.) 2019, 17:00 PM CET*.
>
> It is adopted by majority approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Gordon
>
> [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344601
> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.9.0-rc3/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4] https://repository.apache.org/content/repositories/orgapacheflink-1236
> [5]
> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=refs/tags/release-1.9.0-rc3
> [6] https://github.com/apache/flink/pull/9438
> [7] https://github.com/apache/flink-web/pull/244
> [8]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Apache-Flink-Release-1-9-0-release-candidate-2-tp31542p31933.html
>


RabbitMQ - ConsumerCancelledException

2019-08-19 Thread Oytun Tez
Hi there,

We've started to witness ConsumerCancelledException errors from our
RabbitMQ source. We've dug in everywhere, yet couldn't come up with a
good explanation.

This is the exception:

com.rabbitmq.client.ConsumerCancelledException
at 
com.rabbitmq.client.QueueingConsumer.handle(QueueingConsumer.java:208)
at 
com.rabbitmq.client.QueueingConsumer.nextDelivery(QueueingConsumer.java:223)
at 
org.apache.flink.streaming.connectors.rabbitmq.RMQSource.run(RMQSource.java:193)
at 
org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:93)
at 
org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:57)
at 
org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:97)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.rabbitmq.client.ConsumerCancelledException
at 
com.rabbitmq.client.QueueingConsumer.handleCancel(QueueingConsumer.java:122)
at 
com.rabbitmq.client.impl.ConsumerDispatcher$3.run(ConsumerDispatcher.java:115)
at 
com.rabbitmq.client.impl.ConsumerWorkService$WorkPoolRunnable.run(ConsumerWorkService.java:100)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
... 1 more


We've tried limiting the prefetch count to 100 and 500; nothing changed. We
could try 1-by-1, but that doesn't really sound efficient.


Is anyone familiar with possible causes?





---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


Re: Using S3 as a sink (StreamingFileSink)

2019-08-16 Thread Oytun Tez
Hi Swapnil,

I am not familiar with the StreamingFileSink; however, this sounds like a
checkpointing issue to me. The FileSink should keep its sink state, and
remove from the state the files that it *really successfully* sinks
(perhaps you may want to add a validation here with S3 to check file
integrity). This leaves us in the state with the failed files, partial
files, etc.
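To make that bookkeeping concrete, here is a hedged, plain-Java sketch (no
Flink dependencies; the class and method names are made up for illustration,
this is NOT the actual StreamingFileSink internals): pending part files stay
in checkpointed state until verified, so a restored job knows exactly which
files still need to be re-finalized or re-written.

```java
import java.util.HashSet;
import java.util.Set;

// Illustration only: part files are tracked as "pending" in state and
// dropped only after a successful integrity check against S3. On restore,
// whatever is still pending is what must be re-finalized or re-written.
public class PendingFileTracker {
    private final Set<String> pendingFiles = new HashSet<>();

    // Record a newly started part file in (checkpointed) state.
    public void markPending(String partFile) {
        pendingFiles.add(partFile);
    }

    // Remove a file from state only after the sink verified it on S3.
    public void confirm(String partFile, boolean verifiedOnS3) {
        if (verifiedOnS3) {
            pendingFiles.remove(partFile);
        }
    }

    // What a restored job would still need to deal with.
    public Set<String> stillPending() {
        return new HashSet<>(pendingFiles);
    }
}
```

The point is the ordering: a file leaves the state only after the external
system confirms it, never before.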



---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Fri, Aug 16, 2019 at 6:02 PM Swapnil Kumar  wrote:

> Hello, We are using Flink to process input events and aggregate and write
> o/p of our streaming job to S3 using StreamingFileSink but whenever we try
> to restore the job from a savepoint, the restoration fails with missing
> part files error. As per my understanding, s3 deletes those
> part(intermittent) files and can no longer be found on s3. Is there a
> workaround for this, so that we can use s3 as a sink?
>
> --
> Thanks,
> Swapnil Kumar
>


Re: Product Evaluation

2019-08-16 Thread Oytun Tez
Hi Ajit,

I believe many of us use CEP – we use it extensively. If you can be more
specific, we'll try to respond more directly.

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Fri, Aug 16, 2019 at 5:06 PM Ajit Saluja  wrote:

> Hi,
> We have a Client requirement to implement a Complex Event Processing tool.
> We would like to evaluate if Apache Flink meets our requirements. If there
> is any Client/Group who has implemented it, we would like to have
> discussion to understand the capabilities of this tool better.
>
> If you may provide any contacts, that would be great.
>
> Thanks,
> Ajit Saluja
>
>


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Oytun Tez
Congratulations Andrey!

I am glad the Flink committer team is growing at such a pace!

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Wed, Aug 14, 2019 at 9:29 AM Zili Chen  wrote:

> Congratulations Andrey!
>
> Best,
> tison.
>
>
> Till Rohrmann  于2019年8月14日周三 下午9:26写道:
>
>> Hi everyone,
>>
>> I'm very happy to announce that Andrey Zagrebin accepted the offer of
>> the Flink PMC to become a committer of the Flink project.
>>
>> Andrey has been an active community member for more than 15 months. He
>> has helped shaping numerous features such as State TTL, FRocksDB release,
>> Shuffle service abstraction, FLIP-1, result partition management and
>> various fixes/improvements. He's also frequently helping out on the
>> user@f.a.o mailing lists.
>>
>> Congratulations Andrey!
>>
>> Best, Till
>> (on behalf of the Flink PMC)
>>
>


Re: Making broadcast state queryable?

2019-08-13 Thread Oytun Tez
Thank you for the honest response, Yu!

There is so much that comes to mind when we look at Flink as an "application
framework" (my talk
<https://europe-2019.flink-forward.org/conference-program#not-so-big-%E2%80%93-flink-as-a-true-application-framework>
in Flink Forward in Berlin will be about this). QS is one of them
(need-wise, not QS itself necessarily). I opened the design doc Vino Yang
created, I'll try to contribute to that.

Meanwhile, for the sake of opening our state to the outside, we will put a
stupidly simple operator in between to keep a *duplicate* of the state...

Thanks again!





---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Tue, Aug 13, 2019 at 6:29 PM Yu Li  wrote:

> Hi Oytun,
>
> Sorry but TBH such support will probably not be added in the foreseeable
> future due to lack of committer bandwidth (not only support queryable
> broadcast state but all about QueryableState module) as pointed out in
> other threads [1] [2].
>
> However, I think you could open a JIRA for this so when things changed it
> could be taken care of. Thanks.
>
> [1] https://s.apache.org/MaOl
> [2] https://s.apache.org/r8k8a
>
> Best Regards,
> Yu
>
>
> On Tue, 13 Aug 2019 at 20:34, Oytun Tez  wrote:
>
>> Hi there,
>>
>> Can we set a broadcast state as queryable? I've looked around, not much
>> to find about it. I am receiving UnknownKvStateLocation when I try to query
>> with the descriptor/state name I give to the broadcast state.
>>
>> If it doesn't work, what could be the alternative? My mind goes around
>> ctx.getBroadcastState and making it queryable in the operator level (I'd
>> rather not).
>>
>> ---
>> Oytun Tez
>>
>> *M O T A W O R D*
>> The World's Fastest Human Translation Platform.
>> oy...@motaword.com — www.motaword.com
>>
>


Making broadcast state queryable?

2019-08-13 Thread Oytun Tez
Hi there,

Can we set a broadcast state as queryable? I've looked around, not much to
find about it. I am receiving UnknownKvStateLocation when I try to query
with the descriptor/state name I give to the broadcast state.

If it doesn't work, what could be the alternative? My mind goes around
ctx.getBroadcastState and making it queryable in the operator level (I'd
rather not).

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


Re: [ANNOUNCE] Hequn becomes a Flink committer

2019-08-07 Thread Oytun Tez
Congratulations Hequn!

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Wed, Aug 7, 2019 at 9:03 AM Jark Wu  wrote:

> Congratulations Hequn! It's great to have you in the community!
>
>
>
> On Wed, 7 Aug 2019 at 21:00, Fabian Hueske  wrote:
>
>> Congratulations Hequn!
>>
>> Am Mi., 7. Aug. 2019 um 14:50 Uhr schrieb Robert Metzger <
>> rmetz...@apache.org>:
>>
>>> Congratulations!
>>>
>>> On Wed, Aug 7, 2019 at 1:09 PM highfei2...@126.com 
>>> wrote:
>>>
>>> > Congrats Hequn!
>>> >
>>> > Best,
>>> > Jeff Yang
>>> >
>>> >
>>> >  Original Message 
>>> > Subject: Re: [ANNOUNCE] Hequn becomes a Flink committer
>>> > From: Piotr Nowojski
>>> > To: JingsongLee
>>> > CC: Biao Liu ,Zhu Zhu ,Zili Chen ,Jeff Zhang ,Paul Lam ,jincheng sun
>>> ,dev ,user
>>> >
>>> >
>>> > Congratulations :)
>>> >
>>> > On 7 Aug 2019, at 12:09, JingsongLee  wrote:
>>> >
>>> > Congrats Hequn!
>>> >
>>> > Best,
>>> > Jingsong Lee
>>> >
>>> > --
>>> > From:Biao Liu 
>>> > Send Time:2019年8月7日(星期三) 12:05
>>> > To:Zhu Zhu 
>>> > Cc:Zili Chen ; Jeff Zhang ;
>>> Paul
>>> > Lam ; jincheng sun ;
>>> dev
>>> > ; user 
>>> > Subject:Re: [ANNOUNCE] Hequn becomes a Flink committer
>>> >
>>> > Congrats Hequn!
>>> >
>>> > Thanks,
>>> > Biao /'bɪ.aʊ/
>>> >
>>> >
>>> >
>>> > On Wed, Aug 7, 2019 at 6:00 PM Zhu Zhu  wrote:
>>> > Congratulations to Hequn!
>>> >
>>> > Thanks,
>>> > Zhu Zhu
>>> >
>>> > Zili Chen  于2019年8月7日周三 下午5:16写道:
>>> > Congrats Hequn!
>>> >
>>> > Best,
>>> > tison.
>>> >
>>> >
>>> > Jeff Zhang  于2019年8月7日周三 下午5:14写道:
>>> > Congrats Hequn!
>>> >
>>> > Paul Lam  于2019年8月7日周三 下午5:08写道:
>>> > Congrats Hequn! Well deserved!
>>> >
>>> > Best,
>>> > Paul Lam
>>> >
>>> > 在 2019年8月7日,16:28,jincheng sun  写道:
>>> >
>>> > Hi everyone,
>>> >
>>> > I'm very happy to announce that Hequn accepted the offer of the Flink
>>> PMC
>>> > to become a committer of the Flink project.
>>> >
>>> > Hequn has been contributing to Flink for many years, mainly working on
>>> > SQL/Table API features. He's also frequently helping out on the user
>>> > mailing lists and helping check/vote the release.
>>> >
>>> > Congratulations Hequn!
>>> >
>>> > Best, Jincheng
>>> > (on behalf of the Flink PMC)
>>> >
>>> >
>>> >
>>> > --
>>> > Best Regards
>>> >
>>> > Jeff Zhang
>>> >
>>> >
>>> >
>>>
>>


Re: Best way to access a Flink state entry from another Flink application

2019-08-06 Thread Oytun Tez
Hi Mohammad,

Queryable State works in some cases:
https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/state/queryable_state.html

As far as I know, this is the only way to access Flink's state from the
outside, until the Savepoint API arrives in 1.9.

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Tue, Aug 6, 2019 at 9:52 AM Mohammad Hosseinian <
mohammad.hossein...@id1.de> wrote:

> Hi Alex,
>
> Thanks for your reply. The application is streaming. The issue with using
> messaging channels for such kind of communication is the 'race condition'.
> I mean, when you have parallel channels of communication (one for the main
> flow of your streaming application and one for bringing 'stated/current'
> objects to desired processing nodes), then the order of messages are not
> preserved and it might lead to incorrect result of your application. That
> was the reason why I was wondering if there is any 'synchronous' way of
> accessing the Flink state.
>
> BR, Moe
>
>
> On 06/08/2019 13:25, Протченко Алексей wrote:
>
> Hi Mohammad,
>
> which types of applications do you mean? Streaming or batch ones? In terms
> of streaming ones queues like Kafka or RabbitMq between applications should
> be the best way I think.
>
> Best regards,
> Alex
>
>
> Вторник, 6 августа 2019, 12:21 +02:00 от Mohammad Hosseinian
>  :
>
> Hi all,
>
>
> We have a network of Flink applications. The whole cluster receives
> 'state-update' messages from the outside, and there is one Flink
> application in our cluster that 'merges' these updates and creates the
> actual, most up-to-date, state of the 'data-objects' and passes it to the
> next process. It does this, using a stateful stream processing with a
> `KeyedProcessFunction` object. In our processing logic, there are nodes
> that require to access the actual state of the objects when they receive
> one or more 'object-id's from the previous Flink application. We do not
> propagate the actual-state of the objects since, not all types of the
> objects are relevant to all processes in the cluster, so we saved some
> network/storage overhead there.
>
> The question is: for such scenario, what is the best way to expose the
> Flink state to another Flink application? I am aware of 'Queryable states',
> but I am not sure if this feature has been designed and is suitable for our
> use-case or not?
>
>
> Thank you very much in advance.
>
>
> BR, Moe
> --
>
> *Mohammad Hosseinian*
> Software Developer
> Information Design One AG
>
> Phone +49-69-244502-0
> Fax +49-69-244502-10
> Web *www.id1.de <http://www.id1.de>*
>
>
> Information Design One AG, Baseler Strasse 10, 60329 Frankfurt am Main,
> Germany
> Registration: Amtsgericht Frankfurt am Main, HRB 52596
> Executive Board: Robert Peters, Benjamin Walther, Supervisory Board:
> Christian Hecht
>
>
>
> --
> Протченко Алексей
>
> --
>
> *Mohammad Hosseinian*
> Software Developer
> Information Design One AG
>
> Phone +49-69-244502-0
> Fax +49-69-244502-10
> Web *www.id1.de <http://www.id1.de>*
>
>
> Information Design One AG, Baseler Strasse 10, 60329 Frankfurt am Main,
> Germany
> Registration: Amtsgericht Frankfurt am Main, HRB 52596
> Executive Board: Robert Peters, Benjamin Walther, Supervisory Board:
> Christian Hecht
>


Re: Best pattern to signal a watermark > t across all tasks?

2019-08-02 Thread Oytun Tez
This bit of info is very useful, Fabian, thank you:

You can get the parallel task id from the
> RuntimeContext.getIndexOfThisSubtask().
> RuntimeContext.getNumberOfParallelSubtasks() gives the total number of
> tasks.











---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Fri, Aug 2, 2019 at 9:20 AM Eduardo Winpenny Tejedor <
eduardo.winpe...@gmail.com> wrote:

> awesome, thanks
>
> On Fri, 2 Aug 2019, 10:56 Fabian Hueske,  wrote:
>
>> Hi,
>>
>> Regarding step 3, it is sufficient to check that you got on message from
>> each parallel task of the previous operator. That's because a task
>> processes the timers of all keys before moving forward.
>> Timers are always processed per key, but you could deduplicate on the
>> parallel task id and check that you got a message from each task.
>>
>> You can get the parallel task id from the
>> RuntimeContext.getIndexOfThisSubtask().
>> RuntimeContext.getNumberOfParallelSubtasks() gives the total number of
>> tasks.
>>
>> Fabian
>>
>> Am Fr., 2. Aug. 2019 um 10:55 Uhr schrieb Eduardo Winpenny Tejedor <
>> eduardo.winpe...@gmail.com>:
>>
>>> Hi Oytun, that sounds like a great idea thanks!! Just wanted to confirm
>>> a couple of things:
>>>
>>> -In step 2 by merging do you mean anything else apart from setting the
>>> operator parallelism to 1? Forcing a parallelism of 1 should ensure all
>>> items go to the same task.
>>>
>>> -In step 3 I don't think I could check an item for each key has been
>>> received, I would need to know how many keys I have on my stream (or could
>>> I!? that's exactly what I'm trying to solve) but I could definitely rely on
>>> Flink's watermarking mechanism. If the watermark > t (t being the time for
>>> the trigger of the first operator) it must mean all streams have finished.
>>>
>>> Thanks again
>>>
>>> On Thu, 1 Aug 2019, 18:34 Oytun Tez,  wrote:
>>>
>>>> Perhaps:
>>>>
>>>>1. collect() an item inside onTimer() inside operator#1
>>>>2. merge the resulting stream from all keys
>>>>3. process the combined stream in operator#2 to see if all keys
>>>>were processed. you will probably want to keep state in the operator#2 
>>>> to
>>>>see if you received items from all keys.
>>>>
>>>>
>>>> ---
>>>> Oytun Tez
>>>>
>>>> *M O T A W O R D*
>>>> The World's Fastest Human Translation Platform.
>>>> oy...@motaword.com — www.motaword.com
>>>>
>>>>
>>>> On Thu, Aug 1, 2019 at 1:06 PM Eduardo Winpenny Tejedor <
>>>> eduardo.winpe...@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I have a keyed operator with an hourly event time trigger. On a timer
>>>>> trigger, the operator simply persists some state to a table.
>>>>>
>>>>> I'd like to know when the triggers for all keys have finished so I can
>>>>> send a further signal to the data warehouse, to indicate it has all the
>>>>> necessary data to start producing a report.
>>>>>
>>>>> How can I achieve this? If my operator is distributed across different
>>>>> machine tasks I need to make sure I don't send the signal to the data
>>>>> warehouse before the timers for every key have fired.
>>>>>
>>>>> Thanks,
>>>>> Eduardo
>>>>>
>>>>


Re: Best pattern to signal a watermark > t across all tasks?

2019-08-01 Thread Oytun Tez
Perhaps:

   1. collect() an item inside onTimer() inside operator#1
   2. merge the resulting stream from all keys
   3. process the combined stream in operator#2 to see if all keys were
   processed. you will probably want to keep state in the operator#2 to see if
   you received items from all keys.
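A self-contained, plain-Java sketch of step 3 (the class name is made up for
illustration; in a real Flink job the index would come from
RuntimeContext.getIndexOfThisSubtask() and the total from
RuntimeContext.getNumberOfParallelSubtasks()):

```java
import java.util.HashSet;
import java.util.Set;

// Operator#2 keeps the set of parallel-subtask indices it has seen in
// state, and "fires" once every subtask of operator#1 has reported in.
// Duplicate reports from the same subtask are deduplicated by the Set.
public class CompletionTracker {
    private final int numberOfParallelSubtasks;
    private final Set<Integer> seen = new HashSet<>();

    public CompletionTracker(int numberOfParallelSubtasks) {
        this.numberOfParallelSubtasks = numberOfParallelSubtasks;
    }

    // Returns true once all subtasks have reported.
    public boolean onSubtaskReported(int subtaskIndex) {
        seen.add(subtaskIndex);
        return seen.size() == numberOfParallelSubtasks;
    }

    public static void main(String[] args) {
        CompletionTracker tracker = new CompletionTracker(3);
        System.out.println(tracker.onSubtaskReported(0)); // false
        System.out.println(tracker.onSubtaskReported(1)); // false
        System.out.println(tracker.onSubtaskReported(1)); // false (duplicate)
        System.out.println(tracker.onSubtaskReported(2)); // true -> signal warehouse
    }
}
```

In Flink this state would live in operator#2 (parallelism 1, or keyed by a
constant), fed by the collect() calls from every onTimer() in operator#1.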


---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Thu, Aug 1, 2019 at 1:06 PM Eduardo Winpenny Tejedor <
eduardo.winpe...@gmail.com> wrote:

> Hi all,
>
> I have a keyed operator with an hourly event time trigger. On a timer
> trigger, the operator simply persists some state to a table.
>
> I'd like to know when the triggers for all keys have finished so I can
> send a further signal to the data warehouse, to indicate it has all the
> necessary data to start producing a report.
>
> How can I achieve this? If my operator is distributed across different
> machine tasks I need to make sure I don't send the signal to the data
> warehouse before the timers for every key have fired.
>
> Thanks,
> Eduardo
>


Re: Is Queryable State not working in standalone cluster with 1 task manager?

2019-07-30 Thread Oytun Tez
[image: image.png]

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Tue, Jul 30, 2019 at 9:04 AM Oytun Tez  wrote:

> We extend the official container, and the jar is now available under ./lib;
> it wasn't enabled by default, so we moved the jar under ./lib and then built
> the container + application.
>
>
>
>
> ---
> Oytun Tez
>
> *M O T A W O R D*
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>
>
> On Tue, Jul 30, 2019 at 3:42 AM Fabian Hueske  wrote:
>
>> Hi Oytun,
>>
>> Is QS enabled in your Docker image or did you enable QS by copying/moving
>> flink-queryable-state-runtime_2.11-1.8.0.jar from ./opt to ./lib [1]?
>>
>> Best, Fabian
>>
>> [1]
>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/queryable_state.html#activating-queryable-state
>>
>> Am Mo., 29. Juli 2019 um 22:46 Uhr schrieb Oytun Tez > >:
>>
>>> And, Flink version: 1.8, incl. Docker container (official flink:1.8 tag)
>>>
>>> ---
>>> Oytun Tez
>>>
>>> *M O T A W O R D*
>>> The World's Fastest Human Translation Platform.
>>> oy...@motaword.com — www.motaword.com
>>>
>>>
>>> On Mon, Jul 29, 2019 at 4:32 PM Oytun Tez  wrote:
>>>
>>>> Hi there,
>>>>
>>>> We have a job that is run inside Docker via `standalone-job.sh
>>>> start-foreground --job-classname`, with 1 task manager. I've been
>>>> trying to make QueryableState available in this setup for 2 days now and I
>>>> can't seem to enable it.
>>>>
>>>> If I create a LocalEnvironment within the code itself and provide 
>>>> queryable-state.enable:
>>>> true config directly via Configuration, then I can at least see that
>>>> the queryable state server is spinning up.
>>>>
>>>> Any pointers? With standalone-job.sh, it seems that it doesn't care
>>>> about queryable-state.enable config inside flink-conf.yaml.
>>>>
>>>> ---
>>>> Oytun Tez
>>>>
>>>> *M O T A W O R D*
>>>> The World's Fastest Human Translation Platform.
>>>> oy...@motaword.com — www.motaword.com
>>>>
>>>


Re: Is Queryable State not working in standalone cluster with 1 task manager?

2019-07-30 Thread Oytun Tez
We extend the official container, and the jar is now available under ./lib;
it wasn't enabled by default, so we moved the jar under ./lib and then built
the container + application.




---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Tue, Jul 30, 2019 at 3:42 AM Fabian Hueske  wrote:

> Hi Oytun,
>
> Is QS enabled in your Docker image or did you enable QS by copying/moving
> flink-queryable-state-runtime_2.11-1.8.0.jar from ./opt to ./lib [1]?
>
> Best, Fabian
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/queryable_state.html#activating-queryable-state
>
> Am Mo., 29. Juli 2019 um 22:46 Uhr schrieb Oytun Tez :
>
>> And, Flink version: 1.8, incl. Docker container (official flink:1.8 tag)
>>
>> ---
>> Oytun Tez
>>
>> *M O T A W O R D*
>> The World's Fastest Human Translation Platform.
>> oy...@motaword.com — www.motaword.com
>>
>>
>> On Mon, Jul 29, 2019 at 4:32 PM Oytun Tez  wrote:
>>
>>> Hi there,
>>>
>>> We have a job that is run inside Docker via `standalone-job.sh
>>> start-foreground --job-classname`, with 1 task manager. I've been
>>> trying to make QueryableState available in this setup for 2 days now and I
>>> can't seem to enable it.
>>>
>>> If I create a LocalEnvironment within the code itself and provide 
>>> queryable-state.enable:
>>> true config directly via Configuration, then I can at least see that
>>> the queryable state server is spinning up.
>>>
>>> Any pointers? With standalone-job.sh, it seems that it doesn't care
>>> about queryable-state.enable config inside flink-conf.yaml.
>>>
>>> ---
>>> Oytun Tez
>>>
>>> *M O T A W O R D*
>>> The World's Fastest Human Translation Platform.
>>> oy...@motaword.com — www.motaword.com
>>>
>>


Re: Is Queryable State not working in standalone cluster with 1 task manager?

2019-07-29 Thread Oytun Tez
And, Flink version: 1.8, incl. Docker container (official flink:1.8 tag)

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Mon, Jul 29, 2019 at 4:32 PM Oytun Tez  wrote:

> Hi there,
>
> We have a job that is run inside Docker via `standalone-job.sh
> start-foreground --job-classname`, with 1 task manager. I've been trying
> to make QueryableState available in this setup for 2 days now and I can't
> seem to enable it.
>
> If I create a LocalEnvironment within the code itself and provide 
> queryable-state.enable:
> true config directly via Configuration, then I can at least see that the
> queryable state server is spinning up.
>
> Any pointers? With standalone-job.sh, it seems that it doesn't care about
> queryable-state.enable config inside flink-conf.yaml.
>
> ---
> Oytun Tez
>
> *M O T A W O R D*
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>


Is Queryable State not working in standalone cluster with 1 task manager?

2019-07-29 Thread Oytun Tez
Hi there,

We have a job that is run inside Docker via `standalone-job.sh
start-foreground --job-classname`, with 1 task manager. I've been trying to
make QueryableState available in this setup for 2 days now and I can't seem
to enable it.

If I create a LocalEnvironment within the code itself and provide the
queryable-state.enable: true config directly via Configuration, then I can
at least see that the queryable state server is spinning up.

Any pointers? With standalone-job.sh, it seems that it doesn't care about
queryable-state.enable config inside flink-conf.yaml.

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


Re: question for handling db data

2019-07-26 Thread Oytun Tez
Imagine an operator (a ProcessFunction) with 2 incoming inputs:
geofences via broadcast,
user locations via the normal data stream.

Geofence updates and user location updates will come separately into this
single operator.

1)
when a geofence update comes via broadcast, the operator will update its
state with the new geofence rules (this.geofenceListState =
myNewGeofenceListState). This happens in the processBroadcastElement()
method.

whenever geofence data is updated, it will come to this operator, into
processBroadcastElement, and you will put the new geofence list into the
operator's state.

2)
when a user location update comes to the operator via the regular stream,
you will access this.geofenceListState, do your calculations, and collect()
whatever you need to collect at the end of the computation.

The regular stream arrives, this time, at the processElement() method.

-

A geofence update will not affect the previously collected elements from
processElement, but Flink will make sure all instances of this operator
across task managers get the same geofence update via
processBroadcastElement(), no matter whether the operator is keyed.

Your processElement (meaning your user location updates) will do its
calculation with the latest geofence data from processBroadcastElement.
When geofence data is updated, user location updates from that point on
will use the new geofence data.

I hope this is clearer...
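A self-contained sketch of the two-callback flow described above (plain
Java, no Flink dependencies; the method names mirror Flink's
KeyedBroadcastProcessFunction callbacks, but the class, the bounding-box
fence representation, and the matching logic are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;

// Illustration of the broadcast-state flow: one callback replaces the
// geofence "state", the other evaluates each location against whatever
// the state holds at that moment.
public class GeofenceMatcher {
    // Stands in for the operator's broadcast state (the geofence list).
    private List<double[]> geofences = new ArrayList<>();

    // Broadcast side: a geofence update replaces the stored rules.
    public void processBroadcastElement(List<double[]> newGeofences) {
        this.geofences = new ArrayList<>(newGeofences);
    }

    // Regular stream side: check a user location against the *current*
    // fences and "collect" the indices of the fences that contain it.
    public List<Integer> processElement(double lon, double lat) {
        List<Integer> matches = new ArrayList<>();
        for (int i = 0; i < geofences.size(); i++) {
            double[] f = geofences.get(i); // {minLon, minLat, maxLon, maxLat}
            if (lon >= f[0] && lat >= f[1] && lon <= f[2] && lat <= f[3]) {
                matches.add(i);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        GeofenceMatcher matcher = new GeofenceMatcher();
        matcher.processBroadcastElement(List.of(new double[]{0, 0, 10, 10}));
        System.out.println(matcher.processElement(5, 5));   // [0]
        // A broadcast update replaces the fence list; later locations
        // are evaluated against the new rules only.
        matcher.processBroadcastElement(List.of(new double[]{20, 20, 30, 30}));
        System.out.println(matcher.processElement(5, 5));   // []
    }
}
```

In the real operator the fence list would live in the broadcast state
handle (ctx.getBroadcastState(descriptor)) instead of a plain field, which
is what guarantees every parallel instance sees the same update.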






---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Thu, Jul 25, 2019 at 7:08 PM jaya sai  wrote:

> Hi Oytun,
>
> Thanks for the quick reply, will study more
>
> so when we have a stream let say in kakfa for edit, delete and insert of
> geofences and add it to the flink broadcast downstream, what happens if the
> processing is taking place and we update the bounds of geofence ?
>
> when will the new data or how the updates take place and any impact on the
> geofence events out come based on the location data ?
>
> Thank you,
>
>
>
> On Thu, Jul 25, 2019 at 5:53 PM Oytun Tez  wrote:
>
>> Hi Jaya,
>>
>> Broadcast pattern may help here. Take a look at this:
>> https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/state/broadcast_state.html
>>
>> You'll still keep your geofence data as a stream (depending on the data
>> and use case, maybe the whole list of geofence as a single stream item),
>> broadcast the stream to downstream operators, which will now have geofence
>> data in their state as their slow changing data (processBroadcastElement),
>> and the user location regularly coming to the operator (processElement).
>>
>>
>>
>>
>>
>> ---
>> Oytun Tez
>>
>> *M O T A W O R D*
>> The World's Fastest Human Translation Platform.
>> oy...@motaword.com — www.motaword.com
>>
>>
>> On Thu, Jul 25, 2019 at 6:48 PM jaya sai  wrote:
>>
>>> Hello,
>>>
>>> I have a question on using flink, we have a small data set which does
>>> not change often but have another data set which we need to compare with it
>>> and it has lots of data
>>>
>>> let say I have two collections geofence and locations in mongodb.
>>> Geofence collection does not change often and relatively small, but we have
>>> location data coming in at high amounts from clients and we need to
>>> calculate the goefence entry exits based on geofence and location data
>>> point.
>>> For calculating the entry and exit we were thinking of using flink CEP.
>>> But our problem is sometimes geofence data changes and we need to update
>>> the in memory store of the flink somehow
>>>
>>> we were thinking of bootstrapping the memory of flink processor by
>>> loading data on initial start and subscribe to kafaka topic to listen for
>>> geofence changes and re-pull the data
>>> Is this a valid approach ?
>>>
>>> Thank you,
>>>
>>


Re: Extending REST API with new endpoints

2019-07-26 Thread Oytun Tez
Scary! :) I would heartily hate to maintain our own fork.

Should I make a feature request to discuss further and then send a PR for
this? Is this the normal way to push for a feature?






---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Fri, Jul 26, 2019 at 8:01 AM Chesnay Schepler  wrote:

> There's no built-in way to extend the REST API. You will have to create a
> fork and either extend the DIspatcherRestEndpoint (or parent classes), or
> implement another WebMonitorExtension and modify the DispatcherRestEndpoint
> to load that one as well.
>
> On 23/07/2019 15:51, Oytun Tez wrote:
>
> Ping, any ideas?
>
> ---
> Oytun Tez
>
> *M O T A W O R D*
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>
>
> On Mon, Jul 22, 2019 at 9:39 AM Oytun Tez  wrote:
>
>> I did take a look at it, but things got out of hand very quickly from
>> there on :D
>>
>> I see that WebSubmissionExtension implements WebMonitorExtension, but
>> then WebSubmissionExtension was used in DispatcherRestEndpoint, which I
>> couldn't know how to manipulate/extend...
>>
>> How can I plug my Extension into the dispatcher?
>>
>> ---
>> Oytun Tez
>>
>> *M O T A W O R D*
>> The World's Fastest Human Translation Platform.
>> oy...@motaword.com — www.motaword.com
>>
>>
>> On Mon, Jul 22, 2019 at 9:37 AM Seth Wiesman  wrote:
>>
>>> Would the `WebMonitorExtension` work?
>>>
>>> [1]
>>> https://github.com/apache/flink/blob/8674b69964eae50cad024f2c5caf92a71bf21a09/flink-runtime/src/main/java/org/apache/flink/runtime/webmonitor/WebMonitorExtension.java
>>>
>>> On Mon, Jul 22, 2019 at 8:35 AM Oytun Tez  wrote:
>>>
>>>> I simply want to open up endpoints to query QueryableStates. What I had
>>>> in mind was to give operators an interface to implement their own
>>>> QueryableState controllers, e.g. serializers etc.
>>>>
>>>> We are trying to use Flink in more of an "application framework"
>>>> fashion, so extensibility helps a lot. As there already is a http server in
>>>> this codebase, we'd like to attach to that instead. Especially queryable
>>>> state is tightly coupled with Flink code, so it doesn't make much sense to
>>>> host another http application to bridge into Flink.
>>>>
>>>> ---
>>>> Oytun Tez
>>>>
>>>> *M O T A W O R D*
>>>> The World's Fastest Human Translation Platform.
>>>> oy...@motaword.com — www.motaword.com
>>>>
>>>>
>>>> On Mon, Jul 22, 2019 at 4:38 AM Biao Liu  wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> As far as I know, the RESTful handler is not pluggable. And I don't
>>>>> see a strong reason from your description to do so.
>>>>> Could you explain more about your requirement?
>>>>>
>>>>>
>>>>> Oytun Tez  于2019年7月20日周六 上午4:36写道:
>>>>>
>>>>>> Yep, I scanned all of the issues in Jira and the codebase, I couldn't
>>>>>> find a way to plug my new endpoint in.
>>>>>>
>>>>>> I am basically trying to open up an endpoint for queryable state
>>>>>> client. I also read somewhere that this may cause some issues due to SSL
>>>>>> communication within the cluster.
>>>>>>
>>>>>> Any pointers?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> ---
>>>>>> Oytun Tez
>>>>>>
>>>>>> *M O T A W O R D*
>>>>>> The World's Fastest Human Translation Platform.
>>>>>> oy...@motaword.com — www.motaword.com
>>>>>>
>>>>>>
>>>>>> On Fri, Jul 19, 2019 at 3:53 PM Oytun Tez  wrote:
>>>>>>
>>>>>>> Hi there,
>>>>>>>
>>>>>>> I am trying to add a new endpoint to the REST API, by
>>>>>>> extending AbstractRestHandler. But this new handler needs to be added
>>>>>>> in WebMonitorEndpoint, which has no interface for outside.
>>>>>>>
>>>>>>> Can I do this with 1.8? Any other way or plans to make this possible?
>>>>>>>
>>>>>>> ---
>>>>>>> Oytun Tez
>>>>>>>
>>>>>>> *M O T A W O R D*
>>>>>>> The World's Fastest Human Translation Platform.
>>>>>>> oy...@motaword.com — www.motaword.com
>>>>>>>
>>>>>>
>>>
>>> --
>>>
>>> Seth Wiesman | Solutions Architect
>>>
>>> +1 314 387 1463
>>>
>>> <https://www.ververica.com/>
>>>
>>> Follow us @VervericaData
>>>
>>> --
>>>
>>> Join Flink Forward <https://flink-forward.org/> - The Apache Flink
>>> Conference
>>>
>>> Stream Processing | Event Driven | Real Time
>>>
>>> --
>>>
>>> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>>>
>>> --
>>> Ververica GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B 
>>> Managing
>>> Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
>>>
>>
>


Re: question for handling db data

2019-07-25 Thread Oytun Tez
Hi Jaya,

Broadcast pattern may help here. Take a look at this:
https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/state/broadcast_state.html

You'll still keep your geofence data as a stream (depending on the data and
use case, maybe the whole list of geofences as a single stream element),
broadcast that stream to downstream operators, which will then hold the
geofence data in their broadcast state as slow-changing data
(processBroadcastElement), while user locations regularly arrive at the
operator (processElement).





---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Thu, Jul 25, 2019 at 6:48 PM jaya sai  wrote:

> Hello,
>
> I have a question on using Flink: we have a small data set which does not
> change often, and another, much larger data set which we need to compare
> against it.
>
> Let's say I have two collections in MongoDB, geofences and locations. The
> geofence collection does not change often and is relatively small, but we
> have location data coming in at high volume from clients, and we need to
> calculate geofence entries/exits based on the geofence and location data
> points.
> For calculating the entries and exits we were thinking of using Flink CEP.
> But our problem is that the geofence data sometimes changes, and we need to
> update Flink's in-memory store somehow.
>
> We were thinking of bootstrapping the Flink processor's memory by loading
> the data on initial start and subscribing to a Kafka topic to listen for
> geofence changes and re-pull the data.
> Is this a valid approach?
>
> Thank you,
>


Re: 1.9 Release Timeline

2019-07-23 Thread Oytun Tez
Thank you for responding! I'll subscribe to dev@

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Tue, Jul 23, 2019 at 10:25 AM Timo Walther  wrote:

> Hi Oytun,
>
> the community is working hard to release 1.9. You can see the progress
> here [1] and on the dev@ mailing list.
>
> [1]
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=328=detail
>
> Regards,
> Timo
>
> Am 23.07.19 um 15:52 schrieb Oytun Tez:
>
> Ping, any estimates?
>
> ---
> Oytun Tez
>
> *M O T A W O R D*
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>
>
> On Thu, Jul 18, 2019 at 11:07 AM Oytun Tez  wrote:
>
>> Hi team,
>>
>> 1.9 is bringing very exciting updates, State Processor API and MapState
>> migrations being two of them. Thank you for all the hard work!
>>
>> I checked the burndown board [1], do you have an estimated timeline for
>> the GA release of 1.9?
>>
>>
>>
>> [1]
>> https://issues.apache.org/jira/secure/RapidBoard.jspa?projectKey=FLINK=328
>>
>> ---
>> Oytun Tez
>>
>> *M O T A W O R D*
>> The World's Fastest Human Translation Platform.
>> oy...@motaword.com — www.motaword.com
>>
>
>


Re: 1.9 Release Timeline

2019-07-23 Thread Oytun Tez
Ping, any estimates?

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Thu, Jul 18, 2019 at 11:07 AM Oytun Tez  wrote:

> Hi team,
>
> 1.9 is bringing very exciting updates, State Processor API and MapState
> migrations being two of them. Thank you for all the hard work!
>
> I checked the burndown board [1], do you have an estimated timeline for
> the GA release of 1.9?
>
>
>
> [1]
> https://issues.apache.org/jira/secure/RapidBoard.jspa?projectKey=FLINK=328
>
> ---
> Oytun Tez
>
> *M O T A W O R D*
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>


Re: Extending REST API with new endpoints

2019-07-23 Thread Oytun Tez
Ping, any ideas?

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Mon, Jul 22, 2019 at 9:39 AM Oytun Tez  wrote:

> I did take a look at it, but things got out of hand very quickly from
> there on :D
>
> I see that WebSubmissionExtension implements WebMonitorExtension, but
> then WebSubmissionExtension was used in DispatcherRestEndpoint, which I
> couldn't know how to manipulate/extend...
>
> How can I plug my Extension into the dispatcher?
>
> ---
> Oytun Tez
>
> *M O T A W O R D*
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>
>
> On Mon, Jul 22, 2019 at 9:37 AM Seth Wiesman  wrote:
>
>> Would the `WebMonitorExtension` work?
>>
>> [1]
>> https://github.com/apache/flink/blob/8674b69964eae50cad024f2c5caf92a71bf21a09/flink-runtime/src/main/java/org/apache/flink/runtime/webmonitor/WebMonitorExtension.java
>>
>> On Mon, Jul 22, 2019 at 8:35 AM Oytun Tez  wrote:
>>
>>> I simply want to open up endpoints to query QueryableStates. What I had
>>> in mind was to give operators an interface to implement their own
>>> QueryableState controllers, e.g. serializers etc.
>>>
>>> We are trying to use Flink in more of an "application framework"
>>> fashion, so extensibility helps a lot. As there already is a http server in
>>> this codebase, we'd like to attach to that instead. Especially queryable
>>> state is tightly coupled with Flink code, so it doesn't make much sense to
>>> host another http application to bridge into Flink.
>>>
>>> ---
>>> Oytun Tez
>>>
>>> *M O T A W O R D*
>>> The World's Fastest Human Translation Platform.
>>> oy...@motaword.com — www.motaword.com
>>>
>>>
>>> On Mon, Jul 22, 2019 at 4:38 AM Biao Liu  wrote:
>>>
>>>> Hi,
>>>>
>>>> As far as I know, the RESTful handler is not pluggable. And I don't see
>>>> a strong reason from your description to do so.
>>>> Could you explain more about your requirement?
>>>>
>>>>
>>>> Oytun Tez  wrote on Sat, Jul 20, 2019 at 4:36 AM:
>>>>
>>>>> Yep, I scanned all of the issues in Jira and the codebase, I couldn't
>>>>> find a way to plug my new endpoint in.
>>>>>
>>>>> I am basically trying to open up an endpoint for queryable state
>>>>> client. I also read somewhere that this may cause some issues due to SSL
>>>>> communication within the cluster.
>>>>>
>>>>> Any pointers?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ---
>>>>> Oytun Tez
>>>>>
>>>>> *M O T A W O R D*
>>>>> The World's Fastest Human Translation Platform.
>>>>> oy...@motaword.com — www.motaword.com
>>>>>
>>>>>
>>>>> On Fri, Jul 19, 2019 at 3:53 PM Oytun Tez  wrote:
>>>>>
>>>>>> Hi there,
>>>>>>
>>>>>> I am trying to add a new endpoint to the REST API, by
>>>>>> extending AbstractRestHandler. But this new handler needs to be added
>>>>>> in WebMonitorEndpoint, which has no interface for outside.
>>>>>>
>>>>>> Can I do this with 1.8? Any other way or plans to make this possible?
>>>>>>
>>>>>> ---
>>>>>> Oytun Tez
>>>>>>
>>>>>> *M O T A W O R D*
>>>>>> The World's Fastest Human Translation Platform.
>>>>>> oy...@motaword.com — www.motaword.com
>>>>>>
>>>>>
>>
>> --
>>
>> Seth Wiesman | Solutions Architect
>>
>> +1 314 387 1463
>>
>> <https://www.ververica.com/>
>>
>> Follow us @VervericaData
>>
>> --
>>
>> Join Flink Forward <https://flink-forward.org/> - The Apache Flink
>> Conference
>>
>> Stream Processing | Event Driven | Real Time
>>
>> --
>>
>> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>>
>> --
>> Ververica GmbH
>> Registered at Amtsgericht Charlottenburg: HRB 158244 B
>> Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
>>
>


Re: [DISCUSS] Create a Flink ecosystem website

2019-07-23 Thread Oytun Tez
I agree with Robert – localization (this is what we do at MotaWord) is
maintenance work. If not maintained as well as the mainstream site, it will
only damage the experience and distance devs who use those local websites.

Re: comments, I don't think people will really have heated discussions. But
we at least need a system where we can gauge the popularity of a package; it
helps to pick among similar packages if any. Something like the popularity
figure here, which we could fetch from somewhere:
https://mvnrepository.com/popular

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Tue, Jul 23, 2019 at 5:07 AM Robert Metzger  wrote:

> Thanks a lot Marta for offering to write a blog post about the community
> site!
>
> I'm not sure if multi-language support for the site is a good idea. I see
> the packages site as something similar to GitHub or Jira. The page itself
> contains very view things we could actually translate. The package owners
> usually are only able to provide one or two languages for their package
> description.
> For the comments, we don't want disjoint discussions to happen.
>
> I've also kicked off a discussion with Apache legal on the initiative [1].
> We might not be able to host this at Apache, but let's see where the
> discussion goes.
>
>
> [1]
> https://lists.apache.org/thread.html/ee76a02257b51292ab61f6ac8d3d69307e83cc569cdeebde80596207@%3Clegal-discuss.apache.org%3E
>
>
> On Sat, Jul 20, 2019 at 5:25 AM Becket Qin  wrote:
>
>> [Sorry for the incomplete message. Clicked send by mistake...]
>>
>> I agree with Marta that it might be good to have multi-language support
>> as a mid-term goal.
>>
>> Jiangjie (Becket) Qin
>>
>> On Sat, Jul 20, 2019 at 11:22 AM Becket Qin  wrote:
>>
>>> The website is awesome! I really like its conciseness and yet fairly
>>> useful information and functionalities. I cannot think of much to improve
>>> at the moment. Just one thought, do we need an "others" category, just in
>>> case a package does not fit into any of the current given categories?
>>>
>>> Thanks Robert and Daryl for the great effort. Looking forward to seeing
>>> this get published soon!!
>>>
>>> I agree with Marta that
>>>
>>> Jiangjie (Becket) Qin
>>>
>>> On Sat, Jul 20, 2019 at 1:34 AM Marta Paes Moreira 
>>> wrote:
>>>
>>>> Hey, Robert.
>>>>
>>>> I will keep an eye on the overall progress and get started on the blog
>>>> post
>>>> to make the community announcement. Are there (mid-term) plans to
>>>> translate/localize this website as well? It might be a point worth
>>>> mentioning in the blogpost.
>>>>
>>>> Hats off to you and Daryl — this turned out amazing!
>>>>
>>>> Marta
>>>>
>>>> On Thu, Jul 18, 2019 at 10:57 AM Congxian Qiu 
>>>> wrote:
>>>>
>>>> > Robert and Daryl, thanks for the great work, I tried the website and
>>>> filed
>>>> > some issues on Github.
>>>> > Best,
>>>> > Congxian
>>>> >
>>>> >
>>>> > Robert Metzger  于2019年7月17日周三 下午11:28写道:
>>>> >
>>>> >> Hey all,
>>>> >>
>>>> >> Daryl and I have great news to share. We are about to finish adding
>>>> the
>>>> >> basic features to the ecosystem page.
>>>> >> We are at a stage where it is ready to be reviewed and made public.
>>>> >>
>>>> >> You can either check out a development instance of the ecosystem page
>>>> >> here: https://flink-ecosystem-demo.flink-resources.org/
>>>> >> Or you run it locally, with the instructions from the README.md:
>>>> >> https://github.com/sorahn/flink-ecosystem
>>>> >>
>>>> >> Please report all issues you find here:
>>>> >> https://github.com/sorahn/flink-ecosystem/issues or in this thread.
>>>> >>
>>>> >> The next steps in this project are the following:
>>>> >> - We fix all issues reported through this testing
>>>> >> - We set up the site on the INFRA resources Becket has secured [1],
>>>> do
>>>> >> some further testing (including email notifications) and pre-fill
>>>> the page
>>>> >> with some packages.
>>>> >> - We set up a packages.flink.apache.org or flink.apache.org/pac

Re: Flink SinkFunction for WebSockets

2019-07-23 Thread Oytun Tez
Hi Tim,

I think this might be a useful sink for small interactions with the outside
world. Are you planning to open-source this? If so, can you try to make it
protocol-agnostic so that people can plug in their own WebSocket protocol –
STOMP etc.? :) We can publish this on the upcoming community website as an
extension/plugin/library.




---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Tue, Jul 23, 2019 at 7:07 AM Fabian Hueske  wrote:

> Hi Tim,
>
> One thing that might be interesting is that Flink might emit results more
> than once when a job recovers from a failure.
> It is up to the receiver to deal with that.
> Depending on the type of results this might be easy (idempotent updates)
> or impossible.
>
> Best, Fabian
>
>
>
> On Fri, Jul 19, 2019 at 00:23, Timothy Victor <
> vict...@gmail.com> wrote:
>
>> Hi
>>
>> I'm looking to write a sink function for writing to websockets, in
>> particular ones that speak the WAMP protocol (
>> https://wamp-proto.org/index.html).
>>
>> Before going down that path, I wanted to ask if
>>
>> a) anyone has done something like that already so I dont reinvent stuff
>>
>> b) any caveats or warnings before I try this.
>>
>> Any advise would be appreciated.
>>
>> Thanks
>>
>> Tim
>>
>


Re: Extending REST API with new endpoints

2019-07-22 Thread Oytun Tez
I did take a look at it, but things got out of hand very quickly from there
on :D

I see that WebSubmissionExtension implements WebMonitorExtension, but
then WebSubmissionExtension was used in DispatcherRestEndpoint, which I
couldn't know how to manipulate/extend...

How can I plug my Extension into the dispatcher?

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Mon, Jul 22, 2019 at 9:37 AM Seth Wiesman  wrote:

> Would the `WebMonitorExtension` work?
>
> [1]
> https://github.com/apache/flink/blob/8674b69964eae50cad024f2c5caf92a71bf21a09/flink-runtime/src/main/java/org/apache/flink/runtime/webmonitor/WebMonitorExtension.java
>
> On Mon, Jul 22, 2019 at 8:35 AM Oytun Tez  wrote:
>
>> I simply want to open up endpoints to query QueryableStates. What I had
>> in mind was to give operators an interface to implement their own
>> QueryableState controllers, e.g. serializers etc.
>>
>> We are trying to use Flink in more of an "application framework" fashion,
>> so extensibility helps a lot. As there already is a http server in this
>> codebase, we'd like to attach to that instead. Especially queryable state
>> is tightly coupled with Flink code, so it doesn't make much sense to host
>> another http application to bridge into Flink.
>>
>> ---
>> Oytun Tez
>>
>> *M O T A W O R D*
>> The World's Fastest Human Translation Platform.
>> oy...@motaword.com — www.motaword.com
>>
>>
>> On Mon, Jul 22, 2019 at 4:38 AM Biao Liu  wrote:
>>
>>> Hi,
>>>
>>> As far as I know, the RESTful handler is not pluggable. And I don't see
>>> a strong reason from your description to do so.
>>> Could you explain more about your requirement?
>>>
>>>
>>> Oytun Tez  wrote on Sat, Jul 20, 2019 at 4:36 AM:
>>>
>>>> Yep, I scanned all of the issues in Jira and the codebase, I couldn't
>>>> find a way to plug my new endpoint in.
>>>>
>>>> I am basically trying to open up an endpoint for queryable state
>>>> client. I also read somewhere that this may cause some issues due to SSL
>>>> communication within the cluster.
>>>>
>>>> Any pointers?
>>>>
>>>>
>>>>
>>>>
>>>> ---
>>>> Oytun Tez
>>>>
>>>> *M O T A W O R D*
>>>> The World's Fastest Human Translation Platform.
>>>> oy...@motaword.com — www.motaword.com
>>>>
>>>>
>>>> On Fri, Jul 19, 2019 at 3:53 PM Oytun Tez  wrote:
>>>>
>>>>> Hi there,
>>>>>
>>>>> I am trying to add a new endpoint to the REST API, by
>>>>> extending AbstractRestHandler. But this new handler needs to be added
>>>>> in WebMonitorEndpoint, which has no interface for outside.
>>>>>
>>>>> Can I do this with 1.8? Any other way or plans to make this possible?
>>>>>
>>>>> ---
>>>>> Oytun Tez
>>>>>
>>>>> *M O T A W O R D*
>>>>> The World's Fastest Human Translation Platform.
>>>>> oy...@motaword.com — www.motaword.com
>>>>>
>>>>
>
> --
>
> Seth Wiesman | Solutions Architect
>
> +1 314 387 1463
>
> <https://www.ververica.com/>
>
> Follow us @VervericaData
>
> --
>
> Join Flink Forward <https://flink-forward.org/> - The Apache Flink
> Conference
>
> Stream Processing | Event Driven | Real Time
>
> --
>
> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>
> --
> Ververica GmbH
> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
>


Re: Extending REST API with new endpoints

2019-07-22 Thread Oytun Tez
I simply want to open up endpoints to query QueryableStates. What I had in
mind was to give operators an interface to implement their own
QueryableState controllers, e.g. serializers etc.

We are trying to use Flink in more of an "application framework" fashion,
so extensibility helps a lot. As there already is an HTTP server in this
codebase, we'd like to attach to that instead. Queryable state in particular
is tightly coupled with Flink code, so it doesn't make much sense to host
another HTTP application to bridge into Flink.
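As a reference point, a minimal sketch of what such a handler would do internally with the queryable state client — the host, port (9069 is the default proxy port), job id, key, and state name here are all assumptions:

```java
// Sketch only: querying a ValueState<Long> registered as "queryable-state-name".
QueryableStateClient client = new QueryableStateClient("taskmanager-host", 9069);

ValueStateDescriptor<Long> stateDescriptor =
        new ValueStateDescriptor<>("my-state", Types.LONG);

CompletableFuture<ValueState<Long>> resultFuture = client.getKvState(
        jobId,                   // JobID of the running job
        "queryable-state-name",  // name the state was registered under
        "some-key",              // key to look up
        Types.STRING,            // key type information
        stateDescriptor);

resultFuture.thenAccept(state -> {
    try {
        // A REST handler would serialize state.value() into its HTTP response here.
        System.out.println("value = " + state.value());
    } catch (Exception e) {
        e.printStackTrace();
    }
});
```

The pluggable "QueryableState controller" idea above would essentially wrap this client and let operators supply their own key/value serializers.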

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Mon, Jul 22, 2019 at 4:38 AM Biao Liu  wrote:

> Hi,
>
> As far as I know, the RESTful handler is not pluggable. And I don't see a
> strong reason from your description to do so.
> Could you explain more about your requirement?
>
>
> Oytun Tez  wrote on Sat, Jul 20, 2019 at 4:36 AM:
>
>> Yep, I scanned all of the issues in Jira and the codebase, I couldn't
>> find a way to plug my new endpoint in.
>>
>> I am basically trying to open up an endpoint for queryable state client.
>> I also read somewhere that this may cause some issues due to SSL
>> communication within the cluster.
>>
>> Any pointers?
>>
>>
>>
>>
>> ---
>> Oytun Tez
>>
>> *M O T A W O R D*
>> The World's Fastest Human Translation Platform.
>> oy...@motaword.com — www.motaword.com
>>
>>
>> On Fri, Jul 19, 2019 at 3:53 PM Oytun Tez  wrote:
>>
>>> Hi there,
>>>
>>> I am trying to add a new endpoint to the REST API, by
>>> extending AbstractRestHandler. But this new handler needs to be added
>>> in WebMonitorEndpoint, which has no interface for outside.
>>>
>>> Can I do this with 1.8? Any other way or plans to make this possible?
>>>
>>> ---
>>> Oytun Tez
>>>
>>> *M O T A W O R D*
>>> The World's Fastest Human Translation Platform.
>>> oy...@motaword.com — www.motaword.com
>>>
>>


Re: Extending REST API with new endpoints

2019-07-19 Thread Oytun Tez
Yep, I scanned all of the issues in Jira and the codebase, I couldn't find
a way to plug my new endpoint in.

I am basically trying to open up an endpoint for queryable state client. I
also read somewhere that this may cause some issues due to SSL
communication within the cluster.

Any pointers?




---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Fri, Jul 19, 2019 at 3:53 PM Oytun Tez  wrote:

> Hi there,
>
> I am trying to add a new endpoint to the REST API, by
> extending AbstractRestHandler. But this new handler needs to be added
> in WebMonitorEndpoint, which has no interface for outside.
>
> Can I do this with 1.8? Any other way or plans to make this possible?
>
> ---
> Oytun Tez
>
> *M O T A W O R D*
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>


Extending REST API with new endpoints

2019-07-19 Thread Oytun Tez
Hi there,

I am trying to add a new endpoint to the REST API, by
extending AbstractRestHandler. But this new handler needs to be added
in WebMonitorEndpoint, which has no interface for outside.

Can I do this with 1.8? Any other way or plans to make this possible?

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


1.9 Release Timeline

2019-07-18 Thread Oytun Tez
Hi team,

1.9 is bringing very exciting updates, State Processor API and MapState
migrations being two of them. Thank you for all the hard work!

I checked the burndown board [1], do you have an estimated timeline for the
GA release of 1.9?



[1]
https://issues.apache.org/jira/secure/RapidBoard.jspa?projectKey=FLINK=328

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


Re: Migrating existing application to Flink

2019-07-12 Thread Oytun Tez
I am so excited about 1.9; the State Processor API is single-handedly the
most important update for me.

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Sun, Jul 7, 2019 at 4:34 AM Konstantin Knauf 
wrote:

> Hi Eduardo,
>
> Flink 1.9 will add a new State Processor API [1], which you can use to
> create Savepoints from scratch with a batch job.
>
> Cheers,
>
> Konstantin
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/state_processor_api.html#writing-new-savepoints
>
> On Thu, Jul 4, 2019 at 12:38 AM Eduardo Winpenny Tejedor <
> eduardo.winpe...@gmail.com> wrote:
>
>> Hi all,
>>
>> How would one go about migrating a stateful streaming application that
>> doesn't use Flink to one that does?
>>
>> My question is specifically how to load state for the first time? I can
>> set the source operator (Kafka in my case) to start from a desired point in
>> time but I've no idea how I'd go about initialising the state of the Flink
>> equivalent version of the app from a "pseudo-snapshot". I call
>> "pseudo-snapshot" to the non-Flink representation of state at a given point
>> in time.
>>
>> Thanks,
>> Eduardo
>>
>
>
> --
>
> Konstantin Knauf | Solutions Architect
>
> +49 160 91394525
>
>
> Planned Absences: 10.08.2019 - 31.08.2019, 05.09. - 06.09.2010
>
>
> --
>
> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>
> --
>
> Ververica GmbH
> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen
>


Re: [ANNOUNCE] Rong Rong becomes a Flink committer

2019-07-11 Thread Oytun Tez
Congratulations Rong!

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Thu, Jul 11, 2019 at 1:44 PM Peter Huang 
wrote:

> Congrats Rong!
>
> On Thu, Jul 11, 2019 at 10:40 AM Becket Qin  wrote:
>
>> Congrats, Rong!
>>
>> On Fri, Jul 12, 2019 at 1:13 AM Xingcan Cui  wrote:
>>
>>> Congrats Rong!
>>>
>>> Best,
>>> Xingcan
>>>
>>> On Jul 11, 2019, at 1:08 PM, Shuyi Chen  wrote:
>>>
>>> Congratulations, Rong!
>>>
>>> On Thu, Jul 11, 2019 at 8:26 AM Yu Li  wrote:
>>>
>>>> Congratulations Rong!
>>>>
>>>> Best Regards,
>>>> Yu
>>>>
>>>>
>>>> On Thu, 11 Jul 2019 at 22:54, zhijiang 
>>>> wrote:
>>>>
>>>>> Congratulations Rong!
>>>>>
>>>>> Best,
>>>>> Zhijiang
>>>>>
>>>>> --
>>>>> From:Kurt Young 
>>>>> Send Time: Thu, Jul 11, 2019, 22:54
>>>>> To:Kostas Kloudas 
>>>>> Cc:Jark Wu ; Fabian Hueske ; dev
>>>>> ; user 
>>>>> Subject:Re: [ANNOUNCE] Rong Rong becomes a Flink committer
>>>>>
>>>>> Congratulations Rong!
>>>>>
>>>>> Best,
>>>>> Kurt
>>>>>
>>>>>
>>>>> On Thu, Jul 11, 2019 at 10:53 PM Kostas Kloudas 
>>>>> wrote:
>>>>> Congratulations Rong!
>>>>>
>>>>> On Thu, Jul 11, 2019 at 4:40 PM Jark Wu  wrote:
>>>>> Congratulations Rong Rong!
>>>>> Welcome on board!
>>>>>
>>>>> On Thu, 11 Jul 2019 at 22:25, Fabian Hueske  wrote:
>>>>> Hi everyone,
>>>>>
>>>>> I'm very happy to announce that Rong Rong accepted the offer of the
>>>>> Flink PMC to become a committer of the Flink project.
>>>>>
>>>>> Rong has been contributing to Flink for many years, mainly working on
>>>>> SQL and Yarn security features. He's also frequently helping out on the
>>>>> user@f.a.o mailing lists.
>>>>>
>>>>> Congratulations Rong!
>>>>>
>>>>> Best, Fabian
>>>>> (on behalf of the Flink PMC)
>>>>>
>>>>>
>>>>>
>>>


Re: Flink Forward Europe 2019 - Call for Presentations open until 17th May

2019-05-17 Thread Oytun Tez
Thanks for the update, Robert!

I am planning to prepare a use-case presentation on how we use Flink at
MotaWord, focusing more on Flink as an "application framework", rather than
confining our mindset to Flink as a "stream processor", at non-Uber,
non-Alibaba scales. Hopefully over the weekend, I should be ready to submit
to the CFP.




---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Fri, May 17, 2019 at 11:02 AM Robert Metzger  wrote:

> Hey all,
>
> Short update on the Flink Forward Call For Presentations: We've extended
> the submission deadline till May 31 ... so there's more time to finish the
> talk abstracts.
>
> Also, the organizers are now able to cover travel costs for speakers in
> cases where an employer can not cover them.
>
>
>
> On Fri, May 10, 2019 at 9:38 AM Fabian Hueske  wrote:
>
>> Hi Tim,
>>
>> Thanks for submitting a talk!
>> This sounds like a good and interesting use case to me.
>> Machine learning on streaming data is definitely a relevant and
>> interesting
>> topic for Flink Forward!
>>
>> Best,
>> Fabian
>>
>>
>> > On Mon, May 6, 2019 at 19:52, Tim Frey  wrote:
>>
>> > Hi All,
>> >
>> > Sounds interesting. I submitted a talk about using Flink for machine
>> > learning.
>> > However, I would also be happy to gain some community feedback if the
>> > topic is to the right interest of the community.
>> >
>> > In short, we use flink to train machine learning models and to then use
>> > the same models for predict then. Our goal was to determine if it is
>> > possible to predict crypto currency exchange rates by utilizing social
>> data
>> > from Twitter.
>> > I would talk about our experiences and describe how we leveraged online
>> > learning in conjunction with social data to determine if we are able to
>> > predict future currency exchange rates. I’ll point out the general
>> > architecture and describe the most interesting findings.
>> >
>> > Best
>> > Tim
>> >
>> > -Original Message-
>> > From: Robert Metzger 
>> > Sent: Monday, May 6, 2019 09:44
>> > To: Fabian Hueske 
>> > Cc: user ; dev ;
>> > commun...@flink.apache.org
>> > Subject: Re: Flink Forward Europe 2019 - Call for Presentations open
>> > until 17th May
>> >
>> > Thanks for announcing the Call for Presentations here!
>> >
>> > Since the deadline is approaching, I wanted to bump up this thread to
>> > remind everybody to submit talks!
>> > Please reach out to me or Fabian directly if you have any questions or
>> if
>> > you need any support!
>> >
>> >
>> >
>> > On Thu, Apr 11, 2019 at 3:47 PM Fabian Hueske 
>> wrote:
>> >
>> > > Hi all,
>> > >
>> > > Flink Forward Europe returns to Berlin on October 7-9th, 2019.
>> > > We are happy to announce that the Call for Presentations is open!
>> > >
>> > > Please submit a proposal if you'd like to present your Apache Flink
>> > > experience, best practices, new features, or use cases in front of an
>> > > international audience of highly skilled and enthusiastic Flink users
>> > > and committers.
>> > >
>> > > Flink Forward will run tracks for the following topics:
>> > > * Use Case
>> > > * Operations
>> > > * Technology Deep Dive
>> > > * Ecosystem
>> > > * Research
>> > >
>> > > For the first time, we'll also have a Community track.
>> > >
>> > > Please find the submission form at
>> > > https://berlin-2019.flink-forward.org/call-for-presentations
>> > >
>> > > The deadline for submissions is May 17th, 11:59pm (CEST).
>> > >
>> > > Best regards,
>> > > Fabian
>> > > (PC Chair for Flink Forward Berlin 2019)
>> > >
>> >
>> >
>>
>


Re: PatternFlatSelectAdapter - Serialization issue after 1.8 upgrade

2019-04-30 Thread Oytun Tez
Hi all,

Making the tag a static element worked out, thank you!
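For anyone hitting the same error, a minimal sketch of the fix — the tag name and the `Project` element type are illustrative:

```java
// A static field is not captured in the selector's closure, so the
// PatternFlatSelectFunction stays serializable. The anonymous subclass
// ("{}") preserves the tag's generic type information.
private static final OutputTag<Project> pendingProjectsTag =
        new OutputTag<Project>("pending-projects") {};
```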

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Tue, Apr 23, 2019 at 10:37 AM Oytun Tez  wrote:

> Thank you Guowei and Dawid! I am trying your suggestions today and will
> report back.
>
> - I assume the cleaning operation should be done only once because of the
> upgrade, or should it run every time the application starts?
> - `static` sounds like a very simple fix to get rid of this. Any drawbacks here?
>
>
>
>
> ---
> Oytun Tez
>
> *M O T A W O R D*
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>
>
> On Tue, Apr 23, 2019 at 2:56 AM Dawid Wysakowicz 
> wrote:
>
>> Hi Oytun,
>>
>> I think there is a regression introduced in 1.8 how we handle output
>> tags. The problem is we do not call ClosureCleaner on OutputTag.
>>
>> There are two options how you can workaround this issue:
>>
>> 1. Declare the OutputTag static
>>
>> 2. Clean the closure explicitly as Guowei suggested:
>> StreamExecutionEnvironment.clean(pendingProjectsTag)
>>
>> I also opened a jira issue to fix this (FLINK-12297[1])
>>
>> Best,
>>
>> Dawid
>>
>> [1] https://issues.apache.org/jira/browse/FLINK-12297
>> On 22/04/2019 03:06, Guowei Ma wrote:
>>
>> I think you could try
>> StreamExecutionEnvironment.clean(pendingProjectsTag).
>>
>>
>> Oytun Tez wrote on Fri, Apr 19, 2019 at 9:58 PM:
>>
>>> Forgot to answer one of your points: the parent class compiles well
>>> without this CEP selector (with timeout signature)...
>>>
>>>
>>> ---
>>> Oytun Tez
>>>
>>> *M O T A W O R D*
>>> The World's Fastest Human Translation Platform.
>>> oy...@motaword.com — www.motaword.com
>>>
>>>
>>> On Fri, Apr 19, 2019 at 9:40 AM Oytun Tez  wrote:
>>>
>>>> Hey JingsongLee!
>>>>
>>>> Here are some findings...
>>>>
>>>>- flatSelect *without timeout* works normally:
>>>>patternStream.flatSelect(PatternFlatSelectFunction), this compiles
>>>>well.
>>>>- Converted the both timeout and select selectors to an *inner
>>>>class* (not static), yielded the same results, doesn't compile.
>>>>- flatSelect *without* timeout, but with an inner class for
>>>>PatternFlatSelectFunction, it compiles (same as first bullet).
>>>>- Tried both of these selectors with *empty* body. Just a skeleton
>>>>class. Doesn't compile either. Empty body example is in my first email.
>>>>- Tried making both selectors *static public inner* classes,
>>>>doesn't compile either.
>>>>- Extracted both timeout and flat selectors to their own *independent
>>>>classes* in separate files. Doesn't compile.
>>>>- I am putting the *error stack* below.
>>>>- Without the timeout selector in any class or lambda shape, with
>>>>empty or full body, flatSelect compiles well.
>>>>
>>>> Would these findings help? Any ideas?
>>>>
>>>> Here is an error stack:
>>>>
>>>> 09:36:51,925 ERROR
>>>> com.motaword.ipm.kernel.error.controller.ExceptionHandler -
>>>> org.apache.flink.api.common.InvalidProgramException: The implementation
>>>> of the PatternFlatSelectAdapter is not serializable. The object probably
>>>> contains or references non serializable fields.
>>>> at
>>>> org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:99)
>>>> at
>>>> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.clean(StreamExecutionEnvironment.java:1558)
>>>> at
>>>> org.apache.flink.cep.PatternStreamBuilder.clean(PatternStreamBuilder.java:86)
>>>> at org.apache.flink.cep.PatternStream.process(PatternStream.java:114)
>>>> at org.apache.flink.cep.PatternStream.flatSelect(PatternStream.java:451)
>>>> at org.apache.flink.cep.PatternStream.flatSelect(PatternStream.java:408)
>>>> at
>>>> com.motaword.ipm.business.invitation.controller.PendingProjects.getPending(PendingProjects.java:89)
>>>> at
>>>> com.motaword.ipm.business.invitation.controller.PendingProjects.run(PendingProjects.java:45)
>>>> at
>>>> com.motaword.ipm.business.invitation.boundary.InvitationJob.run(Invitation

Re: Flink CLI

2019-04-25 Thread Oytun Tez
I had come across flink-deployer actually, but somehow didn't want to
"learn" it... (versus just a bunch of lines in a script)

At some point, with more bandwidth, we should migrate to this one and
standardize on flink-deployer (and later take this to mainstream Flink :P).

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Thu, Apr 25, 2019 at 3:14 AM Marc Rooding  wrote:

> Hi Steven, Oytun
>
> You may find the tool we open-sourced last year useful. It offers
> deploying and updating jobs with savepointing.
>
> You can find it on Github: https://github.com/ing-bank/flink-deployer
>
> There’s also a docker image available in Docker Hub.
>
> Marc
> On 24 Apr 2019, 17:29 +0200, Oytun Tez , wrote:
>
> Hi Steven,
>
> As much as I am aware,
> 1) no update call. our build flow feels a little weird to us as well.
> definitely requires scripting.
> 2) we are using Flink management API remotely in our build flow to 1) get
> jobs, 2) savepoint them, 3) cancel them etc. I am going to release a Python
> script for this soon.
>
> ---
> Oytun Tez
>
> *M O T A W O R D*
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>
>
> On Wed, Apr 24, 2019 at 11:06 AM Steven Nelson 
> wrote:
>
>> Hello!
>>
>> I am working on automating our deployments to our Flink cluster. I had a
>> couple questions about the flink cli.
>>
>> 1) I thought there was an "update" command that would internally manage
>> the cancel with savepoint, upload new jar, restart from savepoint process.
>>
>> 2) Is there a way to get the Flink cli to output it's result in a json
>> format? Right now I would need to parse the results of the "flink list"
>> command to get the job id, cancel the job with savepoint, parse the results
>> of that to get the savepoint filename, then restore using that. Parsing the
>> output seems brittle to me.
>>
>> Thought?
>> -Steve
>>
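
[Editorial note] The savepoint-then-cancel flow discussed above can be scripted against Flink's REST API instead of parsing `flink list` output. A minimal sketch (endpoints `GET /jobs` and `POST /jobs/:jobid/savepoints` are from the Flink 1.x REST API; the cluster address and savepoint directory are placeholders):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch: drive Flink's REST API instead of parsing CLI output. The cluster
// address and savepoint directory are placeholders; JSON handling is left as
// plain strings to keep the sketch dependency-free.
public class SavepointAndCancel {

    static final String CLUSTER = "http://localhost:8081"; // placeholder

    // Request body for POST /jobs/:jobid/savepoints -- with "cancel-job": true
    // this performs cancel-with-savepoint in a single call.
    static String savepointBody(String targetDirectory, boolean cancelJob) {
        return "{\"target-directory\":\"" + targetDirectory
                + "\",\"cancel-job\":" + cancelJob + "}";
    }

    static String get(HttpClient http, String path) throws Exception {
        HttpRequest req = HttpRequest.newBuilder(URI.create(CLUSTER + path)).GET().build();
        return http.send(req, HttpResponse.BodyHandlers.ofString()).body();
    }

    static String post(HttpClient http, String path, String json) throws Exception {
        HttpRequest req = HttpRequest.newBuilder(URI.create(CLUSTER + path))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
        return http.send(req, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) { // no job id given: just show the request body
            System.out.println(savepointBody("s3://savepoints/", true));
            return;
        }
        HttpClient http = HttpClient.newHttpClient();
        // 1) GET /jobs returns {"jobs":[{"id":"...","status":"RUNNING"}, ...]}
        System.out.println(get(http, "/jobs"));
        // 2) Trigger savepoint + cancel; the response carries a "request-id"
        //    to poll under /jobs/<jobId>/savepoints/<request-id> for the
        //    final savepoint path.
        System.out.println(post(http, "/jobs/" + args[0] + "/savepoints",
                savepointBody("s3://savepoints/", true)));
    }
}
```

This avoids the brittle text parsing Steven mentions: each step returns JSON with a stable schema.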
>>


Re: Flink CLI

2019-04-24 Thread Oytun Tez
Hi Steven,

As much as I am aware,
1) no update call. our build flow feels a little weird to us as well.
definitely requires scripting.
2) we are using Flink management API remotely in our build flow to 1) get
jobs, 2) savepoint them, 3) cancel them etc. I am going to release a Python
script for this soon.

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Wed, Apr 24, 2019 at 11:06 AM Steven Nelson 
wrote:

> Hello!
>
> I am working on automating our deployments to our Flink cluster. I had a
> couple questions about the flink cli.
>
> 1) I thought there was an "update" command that would internally manage
> the cancel with savepoint, upload new jar, restart from savepoint process.
>
> 2) Is there a way to get the Flink cli to output it's result in a json
> format? Right now I would need to parse the results of the "flink list"
> command to get the job id, cancel the job with savepoint, parse the results
> of that to get the savepoint filename, then restore using that. Parsing the
> output seems brittle to me.
>
> Thought?
> -Steve
>
>


Re: May be useful: our reference document for "Understanding State in Flink"

2019-04-24 Thread Oytun Tez
Thank you all!

@David and @Fabian can guide me (or Deepak as well) to maintain this
document if they'd like. I can export HTML from this that we can easily
play with and put in docs.

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Wed, Apr 24, 2019 at 7:33 AM Deepak Sharma  wrote:

> I want to volunteer for maintaining or adding to this kind of document.
> Please do let me know if i can.
>
> Thanks
> Deepak
>
> On Wed, Apr 24, 2019 at 6:33 AM Deepak Sharma 
> wrote:
>
>>
>>
>> On Wed, Apr 24, 2019 at 5:14 AM Till Rohrmann 
>> wrote:
>>
>>> Thanks for sharing this resource with the community Oytun. It looks
>>> really helpful.
>>>
>>> I'm pulling in David and Fabian who work a lot on documentation. Maybe
>>> it's interesting for them to take a look at. The community had once the
>>> idea to set up a cook book with common Flink recipes but we never managed
>>> to get it properly started.
>>>
>>> Cheers,
>>> Till
>>>
>>> On Tue, Apr 23, 2019 at 5:54 PM Oytun Tez  wrote:
>>>
>>>> We keep a document with state-related use cases in our application,
>>>> useful for onboarding new engineers in the application. See attached PDF.
>>>>
>>>> May be useful for others. And of course, corrections are welcome.
>>>> (Couldn't share our Wiki page)
>>>>
>>>>
>>>> ---
>>>> Oytun Tez
>>>>
>>>> *M O T A W O R D*
>>>> The World's Fastest Human Translation Platform.
>>>> oy...@motaword.com — www.motaword.com
>>>>
>>>
>>
>> --
>> Thanks
>> Deepak
>> www.bigdatabig.com
>> www.keosha.net
>>
>
>
> --
> Thanks
> Deepak
> www.bigdatabig.com
> www.keosha.net
>


Re: Sinking messages in RabbitMQ

2019-04-23 Thread Oytun Tez
I think you should exchangeDeclare when you open the sink. When invoked,
you can channel.basicPublish(exchangeName).

Would this work? We have a single exchange, so didn't explore this method.

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Tue, Apr 23, 2019 at 12:37 PM Soheil Pourbafrani 
wrote:

> I'm using Flink RabbitMQ Connector for Sinking Data but using the
> RMQConnectionConfig object I couldn't find any method to set the type of
> the exchange (Fanout, Topic, Direct). And also the RMQSink get just name of
> the queue as the parameter. Is there any way to specify the exchange type?
>
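
[Editorial note] To make the suggestion above concrete: the connector's RMQSink declares a queue in a protected setupQueue() hook, so one option is to subclass it, declare an exchange there instead, and publish to that exchange in invoke(). A rough sketch, assuming the 1.x flink-connector-rabbitmq internals (protected `channel` and `schema` fields, a `setupQueue()` hook) -- verify against your connector version:

```java
import java.io.IOException;

import org.apache.flink.api.common.serialization.SerializationSchema;
import org.apache.flink.streaming.connectors.rabbitmq.RMQSink;
import org.apache.flink.streaming.connectors.rabbitmq.common.RMQConnectionConfig;

// Sketch: publish to a named exchange (fanout here) instead of a plain queue.
// Assumes RMQSink exposes `channel`, `schema` and a protected setupQueue()
// hook, as in the flink-connector-rabbitmq 1.x sources -- check your version.
public class RMQExchangeSink<IN> extends RMQSink<IN> {

    private final String exchangeName;

    public RMQExchangeSink(RMQConnectionConfig config, String exchangeName,
                           SerializationSchema<IN> schema) {
        super(config, exchangeName, schema); // queue name is unused here
        this.exchangeName = exchangeName;
    }

    @Override
    protected void setupQueue() throws IOException {
        // Declare the exchange ("fanout", "topic" or "direct") instead of a queue.
        channel.exchangeDeclare(exchangeName, "fanout");
    }

    @Override
    public void invoke(IN value) {
        try {
            // Empty routing key: a fanout exchange ignores it anyway.
            channel.basicPublish(exchangeName, "", null, schema.serialize(value));
        } catch (IOException e) {
            throw new RuntimeException("Failed to publish to " + exchangeName, e);
        }
    }
}
```

For a topic or direct exchange you would derive the routing key from the record inside invoke().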


May be useful: our reference document for "Understanding State in Flink"

2019-04-23 Thread Oytun Tez
We keep a document with state-related use cases in our application, useful
for onboarding new engineers in the application. See attached PDF.

May be useful for others. And of course, corrections are welcome. (Couldn't
share our Wiki page)


---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


MW-IPM-UnderstandingState-230419-1550.pdf
Description: Adobe PDF document


Re: PatternFlatSelectAdapter - Serialization issue after 1.8 upgrade

2019-04-23 Thread Oytun Tez
Thank you Guowei and Dawid! I am trying your suggestions today and will
report back.

- I assume the cleaning operation should be done only once because of the
upgrade, or should I run it every time the application is up?
- `static` sounds a very simple fix to get rid of this. Any drawbacks here?




---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Tue, Apr 23, 2019 at 2:56 AM Dawid Wysakowicz 
wrote:

> Hi Oytun,
>
> I think there is a regression introduced in 1.8 how we handle output tags.
> The problem is we do not call ClosureCleaner on OutputTag.
>
> There are two options how you can workaround this issue:
>
> 1. Declare the OutputTag static
>
> 2. Clean the closure explicitly as Guowei suggested:
> StreamExecutionEnvironment.clean(pendingProjectsTag)
>
> I also opened a jira issue to fix this (FLINK-12297[1])
>
> Best,
>
> Dawid
>
> [1] https://issues.apache.org/jira/browse/FLINK-12297
> On 22/04/2019 03:06, Guowei Ma wrote:
>
> I think you could try
> StreamExecutionEnvironment.clean(pendingProjectsTag).
>
>
> On Fri, Apr 19, 2019 at 9:58 PM Oytun Tez  wrote:
>
>> Forgot to answer one of your points: the parent class compiles well
>> without this CEP selector (with timeout signature)...
>>
>>
>> ---
>> Oytun Tez
>>
>> *M O T A W O R D*
>> The World's Fastest Human Translation Platform.
>> oy...@motaword.com — www.motaword.com
>>
>>
>> On Fri, Apr 19, 2019 at 9:40 AM Oytun Tez  wrote:
>>
>>> Hey JingsongLee!
>>>
>>> Here are some findings...
>>>
>>>- flatSelect *without timeout* works normally:
>>>patternStream.flatSelect(PatternFlatSelectFunction), this compiles
>>>well.
>>>- Converted the both timeout and select selectors to an *inner class*
>>>(not static), yielded the same results, doesn't compile.
>>>- flatSelect *without* timeout, but with an inner class for
>>>PatternFlatSelectFunction, it compiles (same as first bullet).
>>>- Tried both of these selectors with *empty* body. Just a skeleton
>>>class. Doesn't compile either. Empty body example is in my first email.
>>>- Tried making both selectors *static public inner* classes, doesn't
>>>compile either.
>>>- Extracted both timeout and flat selectors to their own *independent
>>>classes* in separate files. Doesn't compile.
>>>- I am putting the *error stack* below.
>>>- Without the timeout selector in any class or lambda shape, with
>>>empty or full body, flatSelect compiles well.
>>>
>>> Would these findings help? Any ideas?
>>>
>>> Here is an error stack:
>>>
>>> 09:36:51,925 ERROR
>>> com.motaword.ipm.kernel.error.controller.ExceptionHandler -
>>> org.apache.flink.api.common.InvalidProgramException: The implementation
>>> of the PatternFlatSelectAdapter is not serializable. The object probably
>>> contains or references non serializable fields.
>>> at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:99)
>>> at
>>> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.clean(StreamExecutionEnvironment.java:1558)
>>> at
>>> org.apache.flink.cep.PatternStreamBuilder.clean(PatternStreamBuilder.java:86)
>>> at org.apache.flink.cep.PatternStream.process(PatternStream.java:114)
>>> at org.apache.flink.cep.PatternStream.flatSelect(PatternStream.java:451)
>>> at org.apache.flink.cep.PatternStream.flatSelect(PatternStream.java:408)
>>> at
>>> com.motaword.ipm.business.invitation.controller.PendingProjects.getPending(PendingProjects.java:89)
>>> at
>>> com.motaword.ipm.business.invitation.controller.PendingProjects.run(PendingProjects.java:45)
>>> at
>>> com.motaword.ipm.business.invitation.boundary.InvitationJob.run(InvitationJob.java:31)
>>> at com.motaword.ipm.kernel.Application.main(Application.java:63)
>>> Caused by: java.io.NotSerializableException:
>>> org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator
>>> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
>>> at
>>> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>>> at
>>> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
>>> at
>>> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>>> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>>
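
[Editorial note] The mechanism behind both workarounds above can be demonstrated without Flink at all: an anonymous inner class silently captures a reference to its enclosing instance, so Java serialization tries to drag the (non-serializable) outer object along -- exactly the `NotSerializableException: SingleOutputStreamOperator` in the stack trace. A self-contained sketch with illustrative names:

```java
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ClosureDemo {

    // Stands in for a non-serializable job class (e.g. one holding a
    // SingleOutputStreamOperator field).
    static class Job {
        Object nonSerializableField = new Object();

        // Anonymous inner class: javac adds a hidden this$0 reference to the
        // enclosing Job, so serializing the selector pulls in Job too -- and
        // Job is not Serializable.
        Serializable anonymousSelector() {
            return new Serializable() {
                @Override
                public String toString() {
                    return "captures " + nonSerializableField; // forces capture
                }
            };
        }
    }

    // Static nested class: no hidden outer reference, serializes cleanly.
    // This is why "declare it static" fixes the Flink closure-cleaner error.
    static class StaticSelector implements Serializable {}

    static boolean serializes(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (java.io.IOException e) { // NotSerializableException is an IOException
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("anonymous: " + serializes(new Job().anonymousSelector())); // false
        System.out.println("static:    " + serializes(new StaticSelector()));          // true
    }
}
```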

Re: PatternFlatSelectAdapter - Serialization issue after 1.8 upgrade

2019-04-19 Thread Oytun Tez
Forgot to answer one of your points: the parent class compiles well without
this CEP selector (with timeout signature)...

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Fri, Apr 19, 2019 at 9:40 AM Oytun Tez  wrote:

> Hey JingsongLee!
>
> Here are some findings...
>
>- flatSelect *without timeout* works normally:
>patternStream.flatSelect(PatternFlatSelectFunction), this compiles
>well.
>- Converted the both timeout and select selectors to an *inner class*
>(not static), yielded the same results, doesn't compile.
>- flatSelect *without* timeout, but with an inner class for
>PatternFlatSelectFunction, it compiles (same as first bullet).
>- Tried both of these selectors with *empty* body. Just a skeleton
>class. Doesn't compile either. Empty body example is in my first email.
>- Tried making both selectors *static public inner* classes, doesn't
>compile either.
>- Extracted both timeout and flat selectors to their own *independent
>classes* in separate files. Doesn't compile.
>- I am putting the *error stack* below.
>- Without the timeout selector in any class or lambda shape, with
>empty or full body, flatSelect compiles well.
>
> Would these findings help? Any ideas?
>
> Here is an error stack:
>
> 09:36:51,925 ERROR
> com.motaword.ipm.kernel.error.controller.ExceptionHandler -
> org.apache.flink.api.common.InvalidProgramException: The implementation of
> the PatternFlatSelectAdapter is not serializable. The object probably
> contains or references non serializable fields.
> at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:99)
> at
> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.clean(StreamExecutionEnvironment.java:1558)
> at
> org.apache.flink.cep.PatternStreamBuilder.clean(PatternStreamBuilder.java:86)
> at org.apache.flink.cep.PatternStream.process(PatternStream.java:114)
> at org.apache.flink.cep.PatternStream.flatSelect(PatternStream.java:451)
> at org.apache.flink.cep.PatternStream.flatSelect(PatternStream.java:408)
> at
> com.motaword.ipm.business.invitation.controller.PendingProjects.getPending(PendingProjects.java:89)
> at
> com.motaword.ipm.business.invitation.controller.PendingProjects.run(PendingProjects.java:45)
> at
> com.motaword.ipm.business.invitation.boundary.InvitationJob.run(InvitationJob.java:31)
> at com.motaword.ipm.kernel.Application.main(Application.java:63)
> Caused by: java.io.NotSerializableException:
> org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
> at
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
> at
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
> at
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
> at
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
> at
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
> at
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
> at
> org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:576)
> at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:81)
> ... 9 more
>
>
>
>
>
>
> ---
> Oytun Tez
>
> *M O T A W O R D*
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>
>
> On Fri, Apr 19, 2019 at 3:14 AM JingsongLee 
> wrote:
>
>> Hi @Oytun Tez
>> It Looks like your *PatternFlatSelectFunction* is not serializable.
>> Because you use anonymous inner class,
>> Check the class to which getPending belongs, maybe that class is not
>> serializable?
>>
>> Or you may be advised not to use internal classes, but to use a static 
>> internal class.
>>
>> Best, JingsongLee
>>
>> --
>> From:Oytun Tez 
>> Send Time: Friday, April 19, 2019 03:38
>> To:user 
>> Subject:PatternFlatSelectAd

Re: PatternFlatSelectAdapter - Serialization issue after 1.8 upgrade

2019-04-19 Thread Oytun Tez
Hey JingsongLee!

Here are some findings...

   - flatSelect *without timeout* works normally:
   patternStream.flatSelect(PatternFlatSelectFunction), this compiles well.
   - Converted both the timeout and select selectors to an *inner class*
   (not static), yielded the same results, doesn't compile.
   - flatSelect *without* timeout, but with an inner class for
   PatternFlatSelectFunction, it compiles (same as first bullet).
   - Tried both of these selectors with *empty* body. Just a skeleton
   class. Doesn't compile either. Empty body example is in my first email.
   - Tried making both selectors *static public inner* classes, doesn't
   compile either.
   - Extracted both timeout and flat selectors to their own *independent
   classes* in separate files. Doesn't compile.
   - I am putting the *error stack* below.
   - Without the timeout selector in any class or lambda shape, with empty
   or full body, flatSelect compiles well.

Would these findings help? Any ideas?

Here is an error stack:

09:36:51,925 ERROR
com.motaword.ipm.kernel.error.controller.ExceptionHandler -
org.apache.flink.api.common.InvalidProgramException: The implementation of
the PatternFlatSelectAdapter is not serializable. The object probably
contains or references non serializable fields.
at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:99)
at
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.clean(StreamExecutionEnvironment.java:1558)
at
org.apache.flink.cep.PatternStreamBuilder.clean(PatternStreamBuilder.java:86)
at org.apache.flink.cep.PatternStream.process(PatternStream.java:114)
at org.apache.flink.cep.PatternStream.flatSelect(PatternStream.java:451)
at org.apache.flink.cep.PatternStream.flatSelect(PatternStream.java:408)
at
com.motaword.ipm.business.invitation.controller.PendingProjects.getPending(PendingProjects.java:89)
at
com.motaword.ipm.business.invitation.controller.PendingProjects.run(PendingProjects.java:45)
at
com.motaword.ipm.business.invitation.boundary.InvitationJob.run(InvitationJob.java:31)
at com.motaword.ipm.kernel.Application.main(Application.java:63)
Caused by: java.io.NotSerializableException:
org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
at
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
at
org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:576)
at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:81)
... 9 more






---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Fri, Apr 19, 2019 at 3:14 AM JingsongLee  wrote:

> Hi @Oytun Tez
> It Looks like your *PatternFlatSelectFunction* is not serializable.
> Because you use anonymous inner class,
> Check the class to which getPending belongs, maybe that class is not
> serializable?
>
> Or you may be advised not to use internal classes, but to use a static 
> internal class.
>
> Best, JingsongLee
>
> --
> From:Oytun Tez 
> Send Time: Friday, April 19, 2019 03:38
> To:user 
> Subject:PatternFlatSelectAdapter - Serialization issue after 1.8 upgrade
>
> Hi all,
>
> We are just migration from 1.6 to 1.8. I encountered a serialization error
> which we didn't have before if memory serves: The implementation of the
> *PatternFlatSelectAdapter* is not serializable. The object probably
> contains or references non serializable fields.
>
> The method below simply intakes a PatternStream from CEP.pattern() and
> makes use of the sideoutput for timed-out events. Can you see anything
> weird here (WorkerEvent is the input, but collectors collect Project
> object)?
>
> protected DataStream getPending(PatternStream
> patternStream) {
> OutputTag pendingProjectsTag = new *OutputTag*
> ("invitation-pending-p

PatternFlatSelectAdapter - Serialization issue after 1.8 upgrade

2019-04-18 Thread Oytun Tez
Hi all,

We are just migrating from 1.6 to 1.8. I encountered a serialization error
which we didn't have before, if memory serves: The implementation of the
*PatternFlatSelectAdapter* is not serializable. The object probably
contains or references non serializable fields.

The method below simply intakes a PatternStream from CEP.pattern() and
makes use of the sideoutput for timed-out events. Can you see anything
weird here (WorkerEvent is the input, but collectors collect Project
object)?

protected DataStream<Project> getPending(PatternStream<WorkerEvent> patternStream) {
    OutputTag<Project> pendingProjectsTag = new OutputTag<Project>("invitation-pending-projects"){};

    return patternStream.flatSelect(
        pendingProjectsTag,
        new PatternFlatTimeoutFunction<WorkerEvent, Project>() {
            @Override
            public void timeout(Map<String, List<WorkerEvent>> map, long l, Collector<Project> collector) {
            }
        },
        new PatternFlatSelectFunction<WorkerEvent, Project>() {
            @Override
            public void flatSelect(Map<String, List<WorkerEvent>> pattern, Collector<Project> collector) {
            }
        }
    ).name("Select pending projects for invitation").getSideOutput(pendingProjectsTag);
}

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


Re: Sink data into java stream variable

2019-04-15 Thread Oytun Tez
Hi Soheil,

This is a tricky question that requires much more context due to
distributed nature of Flink. Take a look at [1], especially BufferingSink
example which I will *assume* behaves similar to your need.

Oytun
[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html#using-managed-operator-state



---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Mon, Apr 15, 2019 at 3:01 PM Soheil Pourbafrani 
wrote:

> Hi,
>
> In Flink Stream processing can we sink data into java stream array?
>


Re: [DISCUSS] Create a Flink ecosystem website

2019-03-21 Thread Oytun Tez
Thank you, all! If there are operational tasks about the ecosystem page(s),
let me know (organizing the content etc, whatever).

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Thu, Mar 21, 2019 at 2:14 PM Becket Qin  wrote:

> Thanks for the update Robert! Looking forward to the prototype!
>
> On Thu, Mar 21, 2019 at 10:07 PM Robert Metzger 
> wrote:
>
>> Quick summary of our call:
>> Daryl will soon start with a front end, build against a very simple
>> mock-backend.
>> Congxian will start implementing the Spring-based backend early April.
>>
>> As soon as the first prototype of the UI is ready, we'll share it here for
>> feedback.
>>
>> On Thu, Mar 21, 2019 at 10:08 AM Robert Metzger 
>> wrote:
>>
>> > Okay, great.
>> >
>> > Congxian Qiu, Daryl and I have a kick-off call later today at 2pm CET,
>> 9pm
>> > China time about the design of the ecosystem page (see:
>> > https://github.com/rmetzger/flink-community-tools/issues/4)
>> > Please let me know if others want to join as well, I can add them to the
>> > invite.
>> >
>> > On Wed, Mar 20, 2019 at 4:10 AM Becket Qin 
>> wrote:
>> >
>> >> I agree. We can start with english-only and see how it goes. The
>> comments
>> >> and descriptions can always be multi-lingual but that is up to the
>> package
>> >> owners.
>> >>
>> >> On Tue, Mar 19, 2019 at 6:07 PM Robert Metzger 
>> >> wrote:
>> >>
>> >>> Thanks.
>> >>>
>> >>> Do we actually want this page to be multi-language?
>> >>>
>> >>> I propose to make the website english-only, but maybe consider
>> allowing
>> >>> comments in different languages.
>> >>> If we would make it multi-language, then we might have problems with
>> >>> people submitting packages in non-english languages.
>> >>>
>> >>>
>> >>>
>> >>> On Tue, Mar 19, 2019 at 2:42 AM Becket Qin 
>> wrote:
>> >>>
>> >>>> Done. The writeup looks great!
>> >>>>
>> >>>> On Mon, Mar 18, 2019 at 9:09 PM Robert Metzger 
>> >>>> wrote:
>> >>>>
>> >>>>> Nice, really good news on the INFRA front!
>> >>>>> I think the hardware specs sound reasonable. And a periodic backup
>> of
>> >>>>> the website's database to Infra's backup solution sounds reasonable
>> too.
>> >>>>>
>> >>>>> Can you accept and review my proposal for the website?
>> >>>>>
>> >>>>>
>> >>>>> On Sat, Mar 16, 2019 at 3:47 PM Becket Qin 
>> >>>>> wrote:
>> >>>>>
>> >>>>>> >
>> >>>>>> > I have a very capable and motivated frontend developer who would
>> be
>> >>>>>> > willing to implement what I've mocked in my proposal.
>> >>>>>>
>> >>>>>>
>> >>>>>> That is awesome!
>> >>>>>>
>> >>>>>> I created a Jira ticket[1] to Apache Infra and got the reply. It
>> >>>>>> looks that
>> >>>>>> Apache infra team could provide a decent VM. The last piece is how
>> to
>> >>>>>> ensure the data is persisted so we won't lose the project info /
>> user
>> >>>>>> feedbacks when the VM is down. If Apache infra does not provide a
>> >>>>>> persistent storage for DB backup, we can always ask for multiple
>> VMs
>> >>>>>> and do
>> >>>>>> the fault tolerance by ourselves. It seems we can almost say the
>> >>>>>> hardware
>> >>>>>> side is also ready.
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>>
>> >>>>>> Jiangjie (Becket) Qin
>> >>>>>>
>> >>>>>> [1] https://issues.apache.org/jira/browse/INFRA-18010
>> >>>>>>
>> >>>>>> On Fri, Mar 15, 2019 at 5:39 PM Robert Metzger <
>> rmetz...@apache.org>
>> >>>>>> wrote:
>> >>>>>>
>> >>>>>> > Thank you for reaching out to I

Re: Technical consulting resources

2019-03-14 Thread Oytun Tez
Here is a better link from a service buyer perspective: https://touk.pl/

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Thu, Mar 14, 2019 at 1:02 PM Oytun Tez  wrote:

> Hi Ron,
>
> I've been experimenting with Nussknacker, a Flink application, which is
> built by TouK: https://touk.pl/esp/
>
> It looks like they have experience with large deployments (telecom), so
> they may be helpful. I am CC'ing 2 engineers from their team.
>
>
>
> ---
> Oytun Tez
>
> *M O T A W O R D*
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>
>
> On Thu, Mar 14, 2019 at 12:48 PM Ron Crocker 
> wrote:
>
>> I have a team that I’m working with that’s looking for some Flink
>> consulting. They’re running on an older version (1.4) and are looking for
>> help with upgrading to a later version and even a path for staying
>> current). There are some complications of our deployment model that make
>> this harder than it needs to be, so being comfortable with the deployment
>> details of a flink cluster is helpful.
>>
>> This team is in Portland Oregon but you can likely work with us remotely
>> as well.
>>
>> You can contact me directly or share with this list with your
>> recommendations.
>>
>> Thanks in advance
>>
>> Ron
>> —
>> Ron Crocker
>> Distinguished Engineer & Architect
>> ( ( •)) New Relic
>> rcroc...@newrelic.com
>>
>>


Re: Technical consulting resources

2019-03-14 Thread Oytun Tez
Hi Ron,

I've been experimenting with Nussknacker, a Flink application, which is
built by TouK: https://touk.pl/esp/

It looks like they have experience with large deployments (telecom), so
they may be helpful. I am CC'ing 2 engineers from their team.



---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Thu, Mar 14, 2019 at 12:48 PM Ron Crocker  wrote:

> I have a team that I’m working with that’s looking for some Flink
> consulting. They’re running on an older version (1.4) and are looking for
> help with upgrading to a later version and even a path for staying
> current). There are some complications of our deployment model that make
> this harder than it needs to be, so being comfortable with the deployment
> details of a flink cluster is helpful.
>
> This team is in Portland Oregon but you can likely work with us remotely
> as well.
>
> You can contact me directly or share with this list with your
> recommendations.
>
> Thanks in advance
>
> Ron
> —
> Ron Crocker
> Distinguished Engineer & Architect
> ( ( •)) New Relic
> rcroc...@newrelic.com
>
>


Re: Calculating over multiple streams...

2019-02-22 Thread Oytun Tez
Restructuring with your tip now, Michael, thank you!

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Fri, Feb 22, 2019 at 11:23 AM Michael Latta 
wrote:

> You may want to union the 3 streams prior to the process function if they
> are independently processed.
>
>
> Michael
>
> On Feb 22, 2019, at 9:15 AM, Oytun Tez  wrote:
>
> Hi everyone!
>
> I've been struggling with an implementation problem in the last days,
> which I am almost sure is caused by my misunderstanding of Flink.
>
> The purpose: consume multiple streams, update a score list (with meta data
> e.g. user_id) for each update coming from any of the streams. The new
> output list will also need to be used by another pattern.
>
>1. We created 3 SourceFunctions, that periodically go to our MySQL
>database and stream new results back. This one returns POJOs.
>2. Then we flatMap these streams to unify their Type. They are now all
>Tuple3s with matching types.
>3. And we process each stream with the same ProcessFunction.
>4. I am stuck with the output list.
>
> Business case (human translation workflow):
>
>1. Input: Stream "translation quality" score updates of each
>translator [translator_id, score]
>2. Input: Stream "responsivity score" updates of each translator
>(email open rates/speeds etc) [translator_id, score]
>3. Input: Stream "number of projects" updates each translator worked
>on [translator_id, score]
>4. Calculation: for each translator, use 3 scores to come up with a
>unified score and its percentile over all translators. This step definitely
>feels like a Batch job, but I am pushing to go with a streaming mindset.
>5. So now supposedly, in this way or another, I have a list of
>translators with their unified score and percentile over this list.
>6. Another independent stream should send me updates on "need for
>proofreaders" – I couldn't even come to this point yet. Once a need info is
>streamed, application would fetch the previously calculated list and let's
>say picks the top X determined by the message from need algorithm.
>
>
> 
>
> Overall, my desire is to make everything a stream and let the data and
> decisions constantly react to stream updates. I am very confused at this
> point. Tried using keyed and operator states, but they seem to be keeping
> their state only for their own items. Considering to do Batch instead after
> all the struggle.
>
> Any ideas? I can even get on a call.
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> ---
> Oytun Tez
>
> *M O T A W O R D*
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>
>
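
[Editorial note] A hedged sketch of the union-then-key approach suggested above: unify the three score streams into one Tuple3 stream of (translator_id, score_type, score), key by translator, and keep the latest per-type scores in keyed state so every update re-emits a combined score. All names are illustrative; the plain average is a stand-in for the real combination formula, and the percentile step is omitted:

```java
import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Sketch: three Tuple3<translatorId, scoreType, score> streams unioned and
// keyed by translator id; keyed MapState holds the latest score per type.
public class UnifiedScores {

    public static DataStream<Tuple2<Long, Double>> unify(
            DataStream<Tuple3<Long, String, Double>> quality,
            DataStream<Tuple3<Long, String, Double>> responsivity,
            DataStream<Tuple3<Long, String, Double>> projects) {

        return quality.union(responsivity, projects)
            .keyBy(t -> t.f0) // key by translator id -> state is per translator
            .process(new KeyedProcessFunction<Long, Tuple3<Long, String, Double>, Tuple2<Long, Double>>() {

                private transient MapState<String, Double> scores;

                @Override
                public void open(Configuration parameters) {
                    scores = getRuntimeContext().getMapState(
                            new MapStateDescriptor<>("scores", String.class, Double.class));
                }

                @Override
                public void processElement(Tuple3<Long, String, Double> in, Context ctx,
                                           Collector<Tuple2<Long, Double>> out) throws Exception {
                    scores.put(in.f1, in.f2); // keep the latest score per type

                    // Stand-in combination: average over whichever score
                    // types have arrived so far for this translator.
                    double sum = 0;
                    int n = 0;
                    for (Double s : scores.values()) { sum += s; n++; }
                    out.collect(Tuple2.of(in.f0, sum / n));
                }
            });
    }
}
```

The resulting stream of unified scores could then feed the "need for proofreaders" pattern in step 6, e.g. via a connected stream or broadcast state.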