[jira] [Created] (FLINK-9175) Flink CEP with Checkpointing always failed

2018-04-14 Thread godfrey johnson (JIRA)
godfrey johnson created FLINK-9175:
--

 Summary: Flink CEP with Checkpointing always failed
 Key: FLINK-9175
 URL: https://issues.apache.org/jira/browse/FLINK-9175
 Project: Flink
  Issue Type: Improvement
  Components: State Backends, Checkpointing
Affects Versions: 1.4.1
 Environment: Checkpoint Interval: 1min

Checkpoint Timeout: 2min

Checkpoint Pause: 5s

Checkpoint Concurrent: 1

Checkpoint Mode: EXACTLY_ONCE

AllowedLateness: 100s

CEP within time: 30s

Kafka QPS:10,000

Source Parallelism: 16

 
Reporter: godfrey johnson
 Attachments: dataStream.png

I used RocksDBStateBackend to checkpoint my job, and checkpoints always failed with a 
timeout. But when I removed the CEP operator and kept only the source operator, 
checkpointing worked fine. FsStateBackend also finished checkpoints quickly, without 
timing out.
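
For reference, this is roughly how the checkpoint settings listed above map onto the 
StreamExecutionEnvironment API (just a sketch; {{checkpointPath}} is a placeholder for 
the actual checkpoint URI):

{code}
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

env.enableCheckpointing(60_000L, CheckpointingMode.EXACTLY_ONCE); // 1 min interval
env.getCheckpointConfig().setCheckpointTimeout(120_000L);         // 2 min timeout
env.getCheckpointConfig().setMinPauseBetweenCheckpoints(5_000L);  // 5 s pause
env.getCheckpointConfig().setMaxConcurrentCheckpoints(1);

// constructor may throw IOException; swap in FsStateBackend for comparison
env.setStateBackend(new RocksDBStateBackend(checkpointPath));
{code}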

!dataStream.png!

 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Flink 1.3.2 with CEP Pattern, memory usage increase results in OOM

2018-04-14 Thread Abiramalakshmi Natarajan
I am using a Flink 1.3.2 CEP pattern to detect a frequently occurring
condition. When scale testing this pattern with 10k events per minute, a memory
leak occurs and eventually results in an OOM.

I found a related JIRA, FLINK-7606, which mentions specifying EventTime as the
streamTimeCharacteristic.

I have configured the same, and I am also using RMQSource. I am still facing
the memory leak; can you please let me know whether I am missing anything?

env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

DataStream<ApCheckInEvent> apCheckInReqStreams =
        env.addSource(RMQSourceHelper.createAPCheckInSource(parameters))
           .assignTimestampsAndWatermarks(new IngestionTimeExtractor<ApCheckInEvent>());

DataStream<ApConnectEvent> apconnectReqStream =
        env.addSource(RMQSourceHelper.createAPConnectSource(parameters))
           .assignTimestampsAndWatermarks(new IngestionTimeExtractor<ApConnectEvent>());
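
For context, this is how I understood the FLINK-7606 suggestion: assign timestamps from
the event itself (rather than ingestion time) so that watermarks let CEP prune its state.
A rough sketch only, assuming ApConnectEvent exposes its own timestamp via getTimeStamp()
(the same accessor used further below), i.e. replacing the assignment above with something
like this; is this the right direction?

DataStream<ApConnectEvent> apconnectReqStream =
        env.addSource(RMQSourceHelper.createAPConnectSource(parameters))
           .assignTimestampsAndWatermarks(
               // instead of IngestionTimeExtractor: allow up to 10 s of out-of-orderness
               // (value picked arbitrarily for the sketch)
               new BoundedOutOfOrdernessTimestampExtractor<ApConnectEvent>(Time.seconds(10)) {
                   @Override
                   public long extractTimestamp(ApConnectEvent element) {
                       return element.getTimeStamp(); // the event's own timestamp
                   }
               });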


DataStream<Event> apcheckInEventStream =
        apCheckInReqStreams.flatMap(new APCheckInEventMapper());

DataStream<Event> apconnectEventStream =
        apconnectReqStream.flatMap(new APConnectCheckInEventMapper());

DataStream<Event> unifiedDevicecheckInStream =
        apcheckInEventStream.union(apconnectEventStream);

DataStream<DeviceCheckInEvent> deviceCheckInStream =
        unifiedDevicecheckInStream.flatMap(new FlatMapFunction<Event, DeviceCheckInEvent>() {

            @Override
            public void flatMap(Event value, Collector<DeviceCheckInEvent> out)
                    throws Exception {
                DeviceCheckInEvent deviceCheckInEvent = new DeviceCheckInEvent();
                deviceCheckInEvent.setEntityType(value.getDeviceType());
                deviceCheckInEvent.setDeviceMacAddress(value.getMac());
                deviceCheckInEvent.setEventType(value.getType());
                deviceCheckInEvent.setSerialNum(value.getSource());
                deviceCheckInEvent.setDeviceModel(value.getModel());
                deviceCheckInEvent.setStreamType(StreamType.getStreamType(value.getType()));
                deviceCheckInEvent.setEventTime(value.getTimeStamp());
                deviceCheckInEvent.setTenantId(value.getTenantId());
                deviceCheckInEvent.setSiteid(value.getSiteId());
                out.collect(deviceCheckInEvent);
            }
        });


Below is my pattern detection code:

Pattern<DeviceCheckInEvent, ?> connectEventPattern1 =
        Pattern.<DeviceCheckInEvent>begin("first")
            .where(new IterativeCondition<DeviceCheckInEvent>() {

                @Override
                public boolean filter(DeviceCheckInEvent value, Context<DeviceCheckInEvent> ctx)
                        throws Exception {
                    return value.getEventType() == EventType.ApConnect;
                }
            })
            .times(4)
            .within(Time.minutes(6));

DataStream<Event> frequentEventStream =
        CEP.pattern(
                inputStream.keyBy(new KeySelector<DeviceCheckInEvent, String>() {

                    @Override
                    public String getKey(DeviceCheckInEvent arg0) throws Exception {
                        return arg0.getSerialNum();
                    }
                }),
                connectEventPattern1)
           .flatSelect(new PatternFlatSelectFunction<DeviceCheckInEvent, Event>() {

                @Override
                public void flatSelect(Map<String, List<DeviceCheckInEvent>> pattern,
                                       Collector<Event> out) throws Exception {
                    List<DeviceCheckInEvent> connectOccurrences = pattern.get("first");

                    DeviceCheckInEvent freqConnEvent =
                            connectOccurrences.get(connectOccurrences.size() - 1);

                    Event event = new Event(EventType.FrequentConnects,
                            freqConnEvent.getEntityType(), freqConnEvent.getTenantId(),
                            freqConnEvent.getSerialNum(), freqConnEvent.getSiteid(),
                            freqConnEvent.getMac(), freqConnEvent.getEventTime());

                    event.setEventDescr(EMEventMessage.frequentconnects.getMsgFormat(
                            freqConnEvent.getEntityType(), freqConnEvent.getSerialNum(),
                            count, timeElapsed));

                    logger.warn("Detected Frequent Connects:" + builder.toString());
   

Re: Documentation glitch w/AsyncFunction?

2018-04-14 Thread Ted Yu
Sounds good to me.

On Sat, Apr 14, 2018 at 9:55 AM, Ken Krugler 
wrote:

> Hi Ted,
>
> Thanks - yes regarding renaming the variable, and changing the type.
>
> The other issue is that most clients return a Future, not a
> CompletableFuture.
>
> So should I do the bit of extra code to show using a CompletableFuture
> with a Future?
>
> — Ken
>
> > On Apr 14, 2018, at 9:13 AM, Ted Yu  wrote:
> >
> > bq.  resultFuture.thenAccept( (String result) -> {
> >
> > I think the type of variable for the above call should be
> CompletableFuture.
> > Meaning, the variable currently named resultFuture should be renamed so
> > that the intention is clearer.
> >
> > bq.  resultFuture.complete(Collections.singleton(new
> > Tuple2<>(str, result)));
> >
> > Looking at existing code in unit tests, the complete() call is on the
> > parameter.
> >
> > Cheers
> >
> > On Sat, Apr 14, 2018 at 8:34 AM, Ken Krugler <
> kkrugler_li...@transpac.com>
> > wrote:
> >
> >> Hi devs,
> >>
> >> https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/
> >> operators/asyncio.html  projects/flink/flink-docs-
> >> release-1.4/dev/stream/operators/asyncio.html>
> >>
> >> Has this example of asyncInvoke:
> >>
> >>> @Override
> >>>public void asyncInvoke(final String str, final
> >> ResultFuture> resultFuture) throws Exception {
> >>>
> >>>// issue the asynchronous request, receive a future for result
> >>>Future resultFuture = client.query(str);
> >>>
> >>>// set the callback to be executed once the request by the
> >> client is complete
> >>>// the callback simply forwards the result to the result future
> >>>resultFuture.thenAccept( (String result) -> {
> >>>
> >>>resultFuture.complete(Collections.singleton(new
> >> Tuple2<>(str, result)));
> >>>
> >>>});
> >>>}
> >>
> >> 1. there’s a resultFuture parameter, and a resultFuture variable.
> >>
> >> 2. resultFuture.thenAccept() is a method available for
> CompletableFuture <
> >> https://docs.oracle.com/javase/8/docs/api/java/util/
> >> concurrent/CompletableFuture.html>, not Future.
> >>
> >> I can fix this up, but I’m wondering what you think the code should do,
> >> assuming there’s a typical client that returns a Future vs. a
> >> CompletableFuture.
> >>
> >> e.g. I could use CompletableFuture.supplyAsync(new Supplier()
> { }
> >> with a get that calls the Future’s get().
> >>
> >> Thanks,
> >>
> >> — Ken
> >>
> >> 
> >> http://about.me/kkrugler
> >> +1 530-210-6378
> >>
> >>
>
> 
> http://about.me/kkrugler
> +1 530-210-6378
>
>


Re: Documentation glitch w/AsyncFunction?

2018-04-14 Thread Ken Krugler
Hi Ted,

Thanks - yes regarding renaming the variable, and changing the type.

The other issue is that most clients return a Future, not a CompletableFuture.

So should I do the bit of extra code to show using a CompletableFuture with a 
Future?

— Ken

> On Apr 14, 2018, at 9:13 AM, Ted Yu  wrote:
> 
> bq.  resultFuture.thenAccept( (String result) -> {
> 
> I think the type of variable for the above call should be CompletableFuture.
> Meaning, the variable currently named resultFuture should be renamed so
> that the intention is clearer.
> 
> bq.  resultFuture.complete(Collections.singleton(new
> Tuple2<>(str, result)));
> 
> Looking at existing code in unit tests, the complete() call is on the
> parameter.
> 
> Cheers
> 
> On Sat, Apr 14, 2018 at 8:34 AM, Ken Krugler 
> wrote:
> 
>> Hi devs,
>> 
>> https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/
>> operators/asyncio.html > release-1.4/dev/stream/operators/asyncio.html>
>> 
>> Has this example of asyncInvoke:
>> 
>>> @Override
>>>public void asyncInvoke(final String str, final
>> ResultFuture> resultFuture) throws Exception {
>>> 
>>>// issue the asynchronous request, receive a future for result
>>>Future resultFuture = client.query(str);
>>> 
>>>// set the callback to be executed once the request by the
>> client is complete
>>>// the callback simply forwards the result to the result future
>>>resultFuture.thenAccept( (String result) -> {
>>> 
>>>resultFuture.complete(Collections.singleton(new
>> Tuple2<>(str, result)));
>>> 
>>>});
>>>}
>> 
>> 1. there’s a resultFuture parameter, and a resultFuture variable.
>> 
>> 2. resultFuture.thenAccept() is a method available for CompletableFuture <
>> https://docs.oracle.com/javase/8/docs/api/java/util/
>> concurrent/CompletableFuture.html>, not Future.
>> 
>> I can fix this up, but I’m wondering what you think the code should do,
>> assuming there’s a typical client that returns a Future vs. a
>> CompletableFuture.
>> 
>> e.g. I could use CompletableFuture.supplyAsync(new Supplier() { }
>> with a get that calls the Future’s get().
>> 
>> Thanks,
>> 
>> — Ken
>> 
>> 
>> http://about.me/kkrugler
>> +1 530-210-6378
>> 
>> 


http://about.me/kkrugler
+1 530-210-6378



Re: Documentation glitch w/AsyncFunction?

2018-04-14 Thread Ted Yu
bq.  resultFuture.thenAccept( (String result) -> {

I think the type of variable for the above call should be CompletableFuture.
Meaning, the variable currently named resultFuture should be renamed so
that the intention is clearer.

bq.  resultFuture.complete(Collections.singleton(new
Tuple2<>(str, result)));

Looking at existing code in unit tests, the complete() call is on the
parameter.

Cheers

On Sat, Apr 14, 2018 at 8:34 AM, Ken Krugler 
wrote:

> Hi devs,
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/
> operators/asyncio.html  release-1.4/dev/stream/operators/asyncio.html>
>
> Has this example of asyncInvoke:
>
> > @Override
> > public void asyncInvoke(final String str, final
> ResultFuture> resultFuture) throws Exception {
> >
> > // issue the asynchronous request, receive a future for result
> > Future resultFuture = client.query(str);
> >
> > // set the callback to be executed once the request by the
> client is complete
> > // the callback simply forwards the result to the result future
> > resultFuture.thenAccept( (String result) -> {
> >
> > resultFuture.complete(Collections.singleton(new
> Tuple2<>(str, result)));
> >
> > });
> > }
>
> 1. there’s a resultFuture parameter, and a resultFuture variable.
>
> 2. resultFuture.thenAccept() is a method available for CompletableFuture <
> https://docs.oracle.com/javase/8/docs/api/java/util/
> concurrent/CompletableFuture.html>, not Future.
>
> I can fix this up, but I’m wondering what you think the code should do,
> assuming there’s a typical client that returns a Future vs. a
> CompletableFuture.
>
> e.g. I could use CompletableFuture.supplyAsync(new Supplier() { }
> with a get that calls the Future’s get().
>
> Thanks,
>
> — Ken
>
> 
> http://about.me/kkrugler
> +1 530-210-6378
>
>


Documentation glitch w/AsyncFunction?

2018-04-14 Thread Ken Krugler
Hi devs,

https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/asyncio.html
 


Has this example of asyncInvoke:

> @Override
> public void asyncInvoke(final String str, final 
> ResultFuture<Tuple2<String, String>> resultFuture) throws Exception {
> 
>     // issue the asynchronous request, receive a future for result
>     Future<String> resultFuture = client.query(str);
> 
>     // set the callback to be executed once the request by the client is complete
>     // the callback simply forwards the result to the result future
>     resultFuture.thenAccept( (String result) -> {
> 
>         resultFuture.complete(Collections.singleton(new Tuple2<>(str, result)));
> 
>     });
> }

1. there’s a resultFuture parameter, and a resultFuture variable.

2. resultFuture.thenAccept() is a method available for CompletableFuture 
(https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CompletableFuture.html),
not Future.

I can fix this up, but I’m wondering what you think the code should do, 
assuming there’s a typical client that returns a Future vs. a CompletableFuture.

e.g. I could use CompletableFuture.supplyAsync(new Supplier() { } with 
a get that calls the Future’s get().
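
Something along these lines, perhaps. This is only a rough sketch, assuming client.query(str)
returns a plain Future<String> and the function emits Tuple2<String, String> (both taken from
the existing example):

    @Override
    public void asyncInvoke(final String str,
                            final ResultFuture<Tuple2<String, String>> resultFuture) throws Exception {

        // issue the asynchronous request, receive a plain Future for the result
        final Future<String> queryResult = client.query(str);

        // bridge the plain Future into a CompletableFuture; the Supplier's get()
        // blocks on the Future, so supplyAsync runs it on the common fork-join pool
        CompletableFuture.supplyAsync(new Supplier<String>() {
            @Override
            public String get() {
                try {
                    return queryResult.get();
                } catch (InterruptedException | ExecutionException e) {
                    return null; // error handling left out of the sketch
                }
            }
        }).thenAccept((String result) -> {
            // forward the result to Flink's ResultFuture
            resultFuture.complete(Collections.singleton(new Tuple2<>(str, result)));
        });
    }

That would also take care of point 1, since the client's future gets its own name and the
resultFuture parameter keeps its Flink meaning.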

Thanks,

— Ken


http://about.me/kkrugler
+1 530-210-6378



[jira] [Created] (FLINK-9174) The type of state created in ProcessWindowFunction.process() is inconsistent

2018-04-14 Thread Sihua Zhou (JIRA)
Sihua Zhou created FLINK-9174:
-

 Summary: The type of state created in 
ProcessWindowFunction.process() is inconsistent
 Key: FLINK-9174
 URL: https://issues.apache.org/jira/browse/FLINK-9174
 Project: Flink
  Issue Type: Bug
  Components: State Backends, Checkpointing
Affects Versions: 1.5.0
Reporter: Sihua Zhou
Assignee: Sihua Zhou
 Fix For: 1.5.0


The type of state created from windowState and globalState in 
{{ProcessWindowFunction.process()}} is inconsistent. In detail:
{code}
context.windowState().getListState(); // return type is HeapListState or 
RocksDBListState
context.globalState().getListState(); // return type is UserFacingListState
{code}

This causes a problem in the following code:
{code}
Iterable<T> iterableState = listState.get();
if (iterableState.iterator().hasNext()) {
    for (T value : iterableState) {
        value.setRetracting(true);
        collector.collect(value);
    }
    state.clear();
}
{code}
If the {{listState}} is created from {{context.globalState()}} everything is fine, but when 
it is created from {{context.windowState()}} this will cause an NPE, because the 
non-user-facing list state returns {{null}} instead of an empty iterable when it is empty. 
I met this in 1.3.2, but I found it also affects 1.5.0.
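
Until the two behave the same way, a defensive workaround (just a sketch, based on the same 
snippet as above) is to null-check the iterable before using it, which works for both the 
windowState and globalState variants:

{code}
Iterable<T> iterableState = listState.get();
// windowState's list state may return null when empty; globalState's returns an empty iterable
if (iterableState != null && iterableState.iterator().hasNext()) {
    for (T value : iterableState) {
        value.setRetracting(true);
        collector.collect(value);
    }
    listState.clear();
}
{code}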



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Support for out-of-the-box external catalog for SQL Client

2018-04-14 Thread Fabian Hueske
Great, Thank you Shuyi and Rong!

2018-04-14 3:03 GMT+02:00 Rong Rong :

> Thanks Peter, Fabian & Shuyi for the input.
>
> I have also created a task https://issues.apache.org/
> jira/browse/FLINK-9172.
> Then we can have external catalog factory support on SQL-Client that works
> with the abstractions in Table/SQL API.
>
> --
> Rong
>
>
> On Fri, Apr 13, 2018 at 5:00 PM, Shuyi Chen  wrote:
>
> > I've created master JIRA (https://issues.apache.org/
> jira/browse/FLINK-9171
> > ),
> > and included all HCatalog related JIRAs as subtasks. This make it easier
> to
> > track all HCatalog related effort in Flink. Thanks.
> >
> > Shuyi
> >
> > On Fri, Apr 13, 2018 at 12:36 PM, Fabian Hueske 
> wrote:
> >
> > > Hi everybody,
> > >
> > > An HCatalog integration with the Table API/SQL would be great and be
> > > helpful for many users!
> > >
> > > A big +1 to that.
> > >
> > > Thank you,
> > > Fabian
> > >
> > > Shuyi Chen  wrote on Wed., Apr. 11, 2018, 14:36:
> > >
> > > > Thanks a lot, Rong and Peter.
> > > >
> > > > AFAIK, there is a flink hcatalog connector introduced in FLINK-1466
> > > >  that is added by
> > > > Fabian.
> > > > And there is another JIRA in FLINK-1913
> > > >  to document the
> use
> > > of
> > > > connector.
> > > >
> > > > I think we can start with looking at the existing hcatalog
> connector,
> > > > adding missing documentation, and come up with a proposal to evolve
> the
> > > > Flink HCatalog integration with ExternalCatalog, and the SQL client
> to
> > > make
> > > > it both useful both SQL and non-SQL scenarios.
> > > >
> > > > Given we already have the integration implemented in AthenaX
> > > >  internally, we can help drive and
> > > > contribute back to the community.
> > > >
> > > > Shuyi
> > > >
> > > > On Wed, Apr 11, 2018 at 12:01 PM, Peter Huang <
> > > huangzhenqiu0...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Rong,
> > > > >
> > > > > It is a good point out. I aligned with Fabian yesterday. It is a
> good
> > > > work
> > > > > that I can involve
> > > > > to contribute back to Apache Flink after having AthenaX backfill
> > > support
> > > > > internally.
> > > > >
> > > > >
> > > > > Best Regards
> > > > > Peter Huang
> > > > >
> > > > > On Wed, Apr 11, 2018 at 10:52 AM, Rong Rong 
> > > wrote:
> > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > I was wondering if it is a good idea to support some external
> > catalog
> > > > > > software, such as Apache HCatalog[2], out-of-the-box for the
> > FLIP-24
> > > > > > SQL-Client[1]. There are many widely used catalogs that we can
> > > > > incorporate.
> > > > > > This way users won't have to always extend and create their own
> > > > > > ExternalCatalog.class separately and this could potentially make
> > the
> > > > > > configuration part easier for SQL users.
> > > > > >
> > > > > > Thanks,
> > > > > > Rong
> > > > > >
> > > > > >
> > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-
> > > > > 24+-+SQL+Client
> > > > > > [2] https://cwiki.apache.org/confluence/display/Hive/HCatalog
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > "So you have to trust that the dots will somehow connect in your
> > future."
> > > >
> > >
> >
> >
> >
> > --
> > "So you have to trust that the dots will somehow connect in your future."
> >
>