Flink KafkaProducer Failed Transaction Stalling the whole flow

2023-12-18 Thread Dominik Wosiński
Hey,
I've got a question regarding the transaction failures in EXACTLY_ONCE flow
with Flink 1.15.3 with Confluent Cloud Kafka.

The case is that there is a FlinkKafkaProducer in an EXACTLY_ONCE setup with
the default *transaction.timeout.ms* of 15 min.
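For context, the sink setup is roughly the following (a simplified sketch rather than the exact job code; broker address, topic and transactional-id prefix are placeholders, and 900000 ms just spells out the 15-minute timeout mentioned above):

import java.util.Properties
import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.connector.base.DeliveryGuarantee
import org.apache.flink.connector.kafka.sink.{KafkaRecordSerializationSchema, KafkaSink}

val producerProps = new Properties()
// should not exceed the broker's transaction.max.timeout.ms (15 min by default)
producerProps.setProperty("transaction.timeout.ms", "900000")

val sink = KafkaSink.builder[String]()
  .setBootstrapServers("broker:9092")
  .setKafkaProducerConfig(producerProps)
  .setRecordSerializer(
    KafkaRecordSerializationSchema.builder[String]()
      .setTopic("output-topic")
      .setValueSerializationSchema(new SimpleStringSchema())
      .build())
  .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
  .setTransactionalIdPrefix("my-job-prefix")
  .build()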

During processing, the job had some issues that caused a checkpoint to time
out, which in turn caused transaction issues and made a transaction fail with
the following logs:
Unable to commit transaction
(org.apache.flink.streaming.runtime.operators.sink.committables.CommitRequestImpl@5d0d5082)
because its producer is already fenced. This means that you either have a
different producer with the same 'transactional.id' (this is unlikely with
the 'KafkaSink' as all generated ids are unique and shouldn't be reused) or
recovery took longer than 'transaction.timeout.ms' (900000ms). In both
cases this most likely signals data loss, please consult the Flink
documentation for more details.
Up to this point everything is pretty clear. After that, however, the job
continued to work normally, but every single transaction was failing with:
Unable to commit transaction
(org.apache.flink.streaming.runtime.operators.sink.committables.CommitRequestImpl@5a924600)
because it's in an invalid state. Most likely the transaction has been
aborted for some reason. Please check the Kafka logs for more details.
This effectively stalls all downstream processing, because no transaction is
ever committed.

I've read through the docs and understand that this is kind of a known issue,
due to the fact that Kafka doesn't really support 2PC, but why doesn't that
cause a failure and restart of the whole job? Currently, the job processes
everything normally and hides the issue until it has grown catastrophically.

Thanks in advance,
Cheers.


StreamQueryConfig vs TemporalTableFunction

2020-04-20 Thread Dominik Wosiński
Hey,
I wanted to ask whether TemporalTableFunctions are subject to the
StreamQueryConfig retention. I was pretty sure that they are not, but I
have recently noticed weird behavior in one of my jobs that suggests that
they indeed are.


Thanks for answers,
Best Regards,
Dom.


Objects with fields that are not serializable

2020-04-14 Thread Dominik Wosiński
Hey,
I have a question about using classes with fields that are not serializable
in a DataStream. Basically, I would like to use Java's Optional in a
DataStream. So, say I have a class *Data* that has several Optional fields
and I would like to have a *DataStream* of such *Data* objects. I don't think
this should cause any issues, but I thought it would be good to ask whether
it can cause any problems with Flink jobs.
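Just to illustrate what I mean, a minimal sketch (class and field names are made up; as far as I understand, Optional is not a recognized POJO/case-class field type, so Flink would fall back to generic Kryo serialization for that field):

import java.util.Optional
import org.apache.flink.streaming.api.scala._

// Made-up class with an Optional field.
case class Data(id: String, maybeValue: Optional[Integer])

object OptionalFieldSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val stream: DataStream[Data] = env.fromElements(
      Data("a", Optional.of(Integer.valueOf(1))),
      Data("b", Optional.empty[Integer]()))
    stream.print()
    env.execute("optional-field-sketch")
  }
}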

Thanks,
Best,
Dom.


Re: How to consume kafka from the last offset?

2020-03-26 Thread Dominik Wosiński
Hey,
Are You completely sure you mean *auto.offset.reset*? False is not a valid
setting for that, AFAIK.

Best,
Dom.

Thu, 26 Mar 2020 at 08:38, Jim Chen wrote:

> Thanks!
>
> I made a mistake. I forget to set the auto.offset.reset=false. It's my
> fault.
>
> On Wed, 25 Mar 2020 at 6:49 PM, Dominik Wosiński wrote:
>
>> Hi Jim,
>> Well, *auto.offset.reset *is only used when there is no offset saved for
>> this *group.id* in Kafka. So, if You want to read the
>> data from the latest record (and by latest I mean the newest here) You
>> should assign the *group.id* that was not previously
>> used and then FlinkKafkaConsumer should automatically fetch the last offset
>> and start reading from that place.
>>
>>
>> Best Regards,
>> Dom.
>>
>> Wed, 25 Mar 2020 at 11:19, Jim Chen wrote:
>>
>>> Hi, All
>>>   I use flink-connector-kafka-0.11 consume the Kafka0.11. In
>>> KafkaConsumer params, i set the group.id and auto.offset.reset. In the
>>> Flink1.10, set the kafkaConsumer.setStartFromGroupOffsets();
>>>   Then, i restart the application, found the offset is not from the last
>>> position. Any one know where is wrong? HELP!
>>>
>>


Re: How to consume kafka from the last offset?

2020-03-25 Thread Dominik Wosiński
Hi Jim,
Well, *auto.offset.reset* is only used when there is no offset saved for
this *group.id* in Kafka. So, if You want to read the data from the latest
record (and by latest I mean the newest here), You should assign a *group.id*
that was not previously used, and then FlinkKafkaConsumer should automatically
fetch the last offset and start reading from that place.
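A minimal sketch of what I mean, assuming the universal FlinkKafkaConsumer and a plain string topic (topic, group and broker names are placeholders):

import java.util.Properties
import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer

val props = new Properties()
props.setProperty("bootstrap.servers", "localhost:9092")
props.setProperty("group.id", "my-fresh-group")
// only consulted when the group above has no committed offsets for a partition
props.setProperty("auto.offset.reset", "latest")

val consumer = new FlinkKafkaConsumer[String]("my-topic", new SimpleStringSchema(), props)
// resume from the offsets committed for group.id, falling back to auto.offset.reset
consumer.setStartFromGroupOffsets()

val env = StreamExecutionEnvironment.getExecutionEnvironment
env.addSource(consumer).print()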


Best Regards,
Dom.

Wed, 25 Mar 2020 at 11:19, Jim Chen wrote:

> Hi, All
>   I use flink-connector-kafka-0.11 consume the Kafka0.11. In KafkaConsumer
> params, i set the group.id and auto.offset.reset. In the Flink1.10, set
> the kafkaConsumer.setStartFromGroupOffsets();
>   Then, i restart the application, found the offset is not from the last
> position. Any one know where is wrong? HELP!
>


Re: Issues with Watermark generation after join

2020-03-24 Thread Dominik Wosiński
Hey Timo,
Thanks a lot for this answer! I was mostly using the DataStream API, so
it's good to know the difference.
I have follow-up questions then and would be glad for clarification:

1) So, for the SQL Join operator, is the *partition* the parallel instance
of the operator, or is it the table partitioning as defined by *partitionBy*?
2) Assuming it is the parallel operator instance, does this mean that we
need output from ALL instances so that the watermark progresses and the
output is generated?

Best Regards,
Dom.

Tue, 24 Mar 2020 at 10:01, Timo Walther wrote:

> Hi Dominik,
>
> the big conceptual difference between DataStream and Table API is that
> record timestamps are part of the schema in Table API whereas they are
> attached internally to each record in DataStream API. When you call
> `y.rowtime` during a stream to table conversion, the runtime will
> extract the internal timestamp and will copy it into the field `y`.
>
> Even if the timestamp is not stored internally anymore, Flink makes sure that
> the watermarking (which still happens internally) remains valid.
> However, this means that timestamps and watermarks must already be
> correct when entering the Table API. If they were not correct before,
> they will also not trigger time-based operations correctly.
>
> If there is no output for a parallelism > 1, usually this means that one
> source partition has not emitted a watermark to have progress globally
> for the job:
>
> watermark of operator = min(previous operator partition 1, previous
> operator partition 2, ...)
>
> I hope this helps.
>
> Regards,
> Timo
>
>
> On 19.03.20 16:38, Dominik Wosiński wrote:
> > I have created a simple minimal reproducible example that shows what I
> > am talking about:
> > https://github.com/DomWos/FlinkTTF/tree/sql-ttf
> >
> > It contains a test that shows that even if the output is in order which
> > is enforced by multiple sleeps, then for parallelism > 1 there is no
> > output and for parallelism == 1, the output is produced normally.
> >
> > Best Regards,
> > Dom.
>
>


Re: Issues with Watermark generation after join

2020-03-19 Thread Dominik Wosiński
I have created a simple minimal reproducible example that shows what I am
talking about:
https://github.com/DomWos/FlinkTTF/tree/sql-ttf

It contains a test that shows that even if the output is in order (which is
enforced by multiple sleeps), then for parallelism > 1 there is no output,
while for parallelism == 1 the output is produced normally.

Best Regards,
Dom.


Re: Timestamp Erasure

2020-03-19 Thread Dominik Wosiński
Yes, I understand this completely, but my question is a little bit
different.

The issue is that if I have something like:

val firstStream = dataStreamFromKafka
  .assignTimestampsAndWatermarks(...)
val secondStream = otherStreamFromKafka
  .assignTimestampsAndWatermarks(...)
  .broadcast(...)

So, now if I do something like:

firstStream.keyBy(...)
  .connect(secondStream)
  .process(someBroadcastProcessFunction)

Now, I only select one field from the second stream, and this is *not* the
timestamp field, while from the first stream I select all fields, *including*
the timestamp (in the process function, when creating a new record).

Then everything works like a charm and there are no issues. But if I register
a processing-time timer in this *someBroadcastProcessFunction* and any element
is produced from the *onTimer* function, then I get the issue described above.

Best Regards,
Dom.

Thu, 19 Mar 2020 at 02:41, Jark Wu wrote:

> Hi  Dom,
>
> If you are converting a DataStream to a Table with a rowtime attribute,
> then the  DataStream should hold event-time timestamp.
> For example, call `assignTimestampsAndWatermarks` before converting to
> table. You can find more details in the doc [1].
>
> Best,
> Jark
>
> [1]:
> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/time_attributes.html#during-datastream-to-table-conversion-1
>
> On Thu, 19 Mar 2020 at 02:38, Dominik Wosiński  wrote:
>
>> Hey,
>> I just wanted to ask about one thing about timestamps. So, currently If I
>> have a KeyedBroadcastProcess function followed by Temporal Table Join, it
>> works like a charm. But, say I want to delay emitting some of the results
>> due to any reason. So If I *registerProcessingTimeTimer*  and any
>> elements are emitted in *onTimer* call then the timestamps are erased,
>> meaning that I will simply get :
>> *Caused by: java.lang.RuntimeException: Rowtime timestamp is null. Please
>> make sure that a proper TimestampAssigner is defined and the stream
>> environment uses the EventTime time characteristic.*
>> * at DataStreamSourceConversion$10.processElement(Unknown Source)*
>> * at
>> org.apache.flink.table.runtime.CRowOutputProcessRunner.processElement(CRowOutputProcessRunner.scala:70)*
>> * at
>> org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66)*
>> * at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:579)*
>> * ... 23 more*
>>
>> Is that the expected behavior? I haven't seen it described anywhere
>> before and I wasn't able to find any docs specifying this.
>>
>> Thanks in advance,
>> Best Regards,
>> Dom.
>>
>


Timestamp Erasure

2020-03-18 Thread Dominik Wosiński
Hey,
I just wanted to ask one thing about timestamps. Currently, if I have a
KeyedBroadcastProcessFunction followed by a Temporal Table Join, it works
like a charm. But say I want to delay emitting some of the results for
whatever reason. If I call *registerProcessingTimeTimer* and any elements
are emitted in the *onTimer* call, then the timestamps are erased, meaning
that I will simply get:
Caused by: java.lang.RuntimeException: Rowtime timestamp is null. Please
make sure that a proper TimestampAssigner is defined and the stream
environment uses the EventTime time characteristic.
  at DataStreamSourceConversion$10.processElement(Unknown Source)
  at org.apache.flink.table.runtime.CRowOutputProcessRunner.processElement(CRowOutputProcessRunner.scala:70)
  at org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66)
  at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:579)
  ... 23 more

Is that the expected behavior? I haven't seen it described anywhere before
and I wasn't able to find any docs specifying this.

Thanks in advance,
Best Regards,
Dom.


Re: Issues with Watermark generation after join

2020-03-17 Thread Dominik Wosiński
Hey sure,
the original Temporal Table SQL is:

SELECT e.*, f.level AS level
FROM enablers AS e,
  LATERAL TABLE (Detectors(e.timestamp)) AS f
WHERE e.id = f.id

And the previous SQL query to join A is something like:

SELECT *
FROM A te,
  B s
WHERE s.id = te.id AND s.level = te.level AND s.timestamp = te.timestamp


Also, if I replace the SQL join of A with a BroadcastProcessFunction, this
works like a charm and everything is calculated correctly, even if I don't
change the parallelism.

I have noticed one more weird behavior: after the temporal table join I
have a windowing function to process the data. Now I have two options. In
the TTF I can select the rowtime field with type Timestamp and assign it to
a field in the output class; this automatically passes the timestamp along,
so I don't need to assign it again. But I could also select just a Long field
that is not marked as rowtime (even if it actually holds the same value, the
field was not marked with *.rowtime* on declaration), and then I need to
assign the timestamps and watermarks again, since Flink doesn't know what
the timestamp is. Now, the former solution works like a charm, but for the
latter there is actually no output visible from the windowing function.
My expectation is that both solutions should work exactly the same and pass
the timestamps in the same manner, but apparently they don't.

Best Regards,
Dom.



Re: Issues with Watermark generation after join

2020-03-16 Thread Dominik Wosiński
Actually, I just put this process function there for debugging purposes. My
main goal is to join E & C using the Temporal Table function, but I have
observed exactly the same behavior, i.e. when the parallelism was > 1 there
was no output, and when I set it to 1 the output was generated. So, I
switched to a process function to see whether the watermarks are reaching
this stage.

Best Regards,
Dom.

Mon, 16 Mar 2020 at 19:46, Theo Diefenthal wrote:

> Hi Dominik,
>
> I had the same once with a custom process function. My process function
> buffered the data for a while and then output it again. As the process
> function can do anything with the data (transforming, buffering,
> aggregating...), I think it's just not safe for Flink to reason about the
> watermark of the output.
>
> I solved all my issues by calling `assignTimestampsAndWatermarks` directly
> post to the (co-)process function.
>
> Best regards
> Theo
>
> --
> *From:* "Dominik Wosiński"
> *To:* "user"
> *Sent:* Monday, 16 March 2020 16:55:18
> *Subject:* Issues with Watermark generation after join
>
> Hey,
> I have noticed a weird behavior with a job that I am currently working on.
> I have 4 different streams from Kafka, lets call them A, B, C and D. Now
> the idea is that first I do SQL Join of A & B based on some field, then I
> create append stream from Joined A, let's call it E. Then I need to
> assign timestamps to E since it is a result of joining and Flink can't
> figure out the timestamps.
>
> Next, I union E & C, to create some F stream. Then finally I connect E & C
> using `keyBy` and CoProcessFunction. Now the issue I am facing is that if I
> try to, it works fine if I enforce the parallelism of E to be 1 by invoking
> *setParallelism*. But if parallelism is higher than 1, for the same data
> - the watermark is not progressing correctly. I can see that 
> *CoProcessFunction
> *methods are invoked and that data is produced, but the Watermark is
> never progressing for this function. What I can see is that watermark is
> always equal to (0 - allowedOutOfOrderness). I can see that timestamps are
> correctly extracted and when I add debug prints I can actually see that
> Watermarks are generated for all streams, but for some reason, if the
> parallelism is > 1 they will never progress up to connect function. Is
> there anything that needs to be done after SQL joins that I don't know of
> ??
>
> Best Regards,
> Dom.
>


Fwd: AfterMatchSkipStrategy for timed out patterns

2020-03-16 Thread Dominik Wosiński
Hey all,

I was wondering whether for CEP the *AfterMatchSkipStrategy* is applied
during matching, or whether the results are simply removed after the match.
The question comes from experiments I was doing with CEP. Say I have
readings from some sensor and I want to detect events over some
threshold. So I have something like below:

Pattern.begin[AccelVector]("beginning",
AfterMatchSkipStrategy.skipPastLastEvent())
  .where(_.data() < Threshold)
  .optional
  .followedBy(EventPatternName)
  .where(event => event.data() >= Threshold)
  .oneOrMore
  .greedy
  .consecutive()
  .followedBy("end")
  .where(_.data() < Threshold)
  .oneOrMore
  .within(Time.minutes(1))


The thing is that sometimes sensors may stop sending data, or the data is
lost, so I would like to emit events that have technically timed out. I have
created a PatternProcessFunction that simply receives the events that have
timed out and checks for the *EventPatternName* part.
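Roughly, the timeout handling looks something like the sketch below (simplified and made generic here; the real event type and state name come from the pattern above):

import java.util
import scala.collection.JavaConverters._
import org.apache.flink.cep.functions.{PatternProcessFunction, TimedOutPartialMatchHandler}
import org.apache.flink.streaming.api.scala.OutputTag
import org.apache.flink.util.Collector

// Emits full matches downstream and timed-out partial matches that already
// contain the given pattern state to a side output.
class TimeoutAwareFunction[T](eventStateName: String, timedOutTag: OutputTag[List[T]])
    extends PatternProcessFunction[T, List[T]]
    with TimedOutPartialMatchHandler[T] {

  override def processMatch(
      matched: util.Map[String, util.List[T]],
      ctx: PatternProcessFunction.Context,
      out: Collector[List[T]]): Unit =
    out.collect(matched.values().asScala.flatMap(_.asScala).toList)

  override def processTimedOutMatch(
      matched: util.Map[String, util.List[T]],
      ctx: PatternProcessFunction.Context): Unit =
    if (matched.containsKey(eventStateName)) {
      ctx.output(timedOutTag, matched.get(eventStateName).asScala.toList)
    }
}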

It works fine, but I have noticed the weird behavior that the events passed
to *processTimedOutMatch* are repeated as if there were no
*AfterMatchSkipStrategy*.

So, for example say the Threshold=200, and I have the following events for
one of the sensors:
Event1 (timestamp= 1, data = 10)
Event2 (timestamp= 2, data = 250)
Event3 (timestamp= 3, data = 300)
Event4 (timestamp= 4, data = 350)
Event5 (timestamp= 5, data = 400)
Event6 (timestamp= 6, data = 450)

After that, this sensor stops sending data, but others keep sending data, so
the watermark progresses - this obviously causes a timeout of the pattern.
The issue I have is that *processTimedOutMatch* gets called multiple times:
first for the whole sequence Event1 to Event6, and each subsequent call just
skips one event, so next I have Event2 to Event6, Event3 to Event6, up to
just Event6.

My understanding is that *AfterMatchSkipStrategy* should wipe out those
partial matches - or does it work differently for timed-out matches?

Thanks in advance,
Best Regards,
Dom.


Issues with Watermark generation after join

2020-03-16 Thread Dominik Wosiński
Hey,
I have noticed a weird behavior in a job that I am currently working on.
I have 4 different streams from Kafka, let's call them A, B, C and D. The
idea is that first I do an SQL join of A & B based on some field, then I
create an append stream from the joined result, let's call it E. I then need
to assign timestamps to E, since it is the result of a join and Flink can't
figure out the timestamps.

Next, I union E & C to create some stream F. Then finally I connect E & C
using `keyBy` and a CoProcessFunction. Now, the issue I am facing: it works
fine if I enforce the parallelism of E to be 1 by invoking *setParallelism*.
But if the parallelism is higher than 1, for the same data the watermark does
not progress correctly. I can see that the *CoProcessFunction* methods are
invoked and that data is produced, but the watermark never progresses for
this function. What I can see is that the watermark is always equal to
(0 - allowedOutOfOrderness). I can see that timestamps are correctly
extracted, and when I add debug prints I can actually see that watermarks are
generated for all streams, but for some reason, if the parallelism is > 1,
they never progress up to the connect function. Is there anything that needs
to be done after SQL joins that I don't know of?

Best Regards,
Dom.


Re: Implementing a tick service

2020-01-21 Thread Dominik Wosiński
Hey,
you have access to the context in `onTimer`, so You can easily reschedule the
timer when it fires.
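A rough sketch of what I mean (interval, types and the emitted value are just for illustration):

import org.apache.flink.streaming.api.functions.KeyedProcessFunction
import org.apache.flink.util.Collector

// Emits a "tick" per key every intervalMs and re-registers itself from onTimer.
class TickFunction(intervalMs: Long) extends KeyedProcessFunction[String, String, String] {

  override def processElement(
      value: String,
      ctx: KeyedProcessFunction[String, String, String]#Context,
      out: Collector[String]): Unit = {
    // In a real job you would guard this with state so only one timer is active per key.
    ctx.timerService().registerProcessingTimeTimer(
      ctx.timerService().currentProcessingTime() + intervalMs)
  }

  override def onTimer(
      timestamp: Long,
      ctx: KeyedProcessFunction[String, String, String]#OnTimerContext,
      out: Collector[String]): Unit = {
    out.collect(s"tick at $timestamp")
    // Reschedule the next tick right from the timer callback.
    ctx.timerService().registerProcessingTimeTimer(timestamp + intervalMs)
  }
}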

Best,
Dom.


HadoopInputFormat

2019-11-06 Thread Dominik Wosiński
Hey,
I wanted to ask if *HadoopInputFormat* currently supports some custom
partitioning scheme. Say I have 200 files in HDFS, each having the
partitioning key in its name; can we at the moment use HadoopInputFormat to
distribute the reading across multiple TaskManagers using that key?


Best Regards,
Dom.


WebUI show custom config

2019-06-21 Thread Dominik Wosiński
Hey,
I am building jobs that use Typesafe Config under the hood. The configs
tend to grow big. I was wondering whether there is a possibility to use the
WebUI to show the config that the job was run with. Currently the only idea
is to log the config and check it inside the logs, but with dozens of jobs
this is getting troublesome. Is there a better way to access a job's custom
configuration?

Thanks in advance,
Best,
Dom.


Re: kafka corrupt record exception

2019-04-25 Thread Dominik Wosiński
Hey,
Sorry for such a delay, but I missed this message. Basically, you could
technically have a Kafka broker installed in, say, version 1.0.0 while using
FlinkKafkaConsumer08, and this could create issues.
I'm not sure if You can automate the process of skipping corrupted messages,
as You would have to write a consumer that allows skipping corrupted
messages. This may be a good idea to think about for Flink, though.

On the other hand, if You have many corrupted messages, this may mean that
the problem lies elsewhere within Your pipeline (Kafka producers before
Flink).


Re: Flink Control Stream

2019-04-25 Thread Dominik Wosiński
Thanks for help Till,

I thought so, but I wanted to be sure.
Best Regards,
Dom.


Flink Control Stream

2019-04-24 Thread Dominik Wosiński
Hey,
I wanted to use a control stream to dynamically adjust parameters of the
tasks. I know that it is possible to use *connect()* and *BroadcastState* to
obtain such a thing, but I would like to have the possibility to control
the parameters inside an *AsyncFunction*, like a specific timeout for the
HTTP client or the request address if it changes. I know that I could
technically join the control stream and the event stream into one stream and
process it, but I was wondering if it would be possible to do this with any
other mechanism?

Thanks in advance,
Best Regards.
Dom.


Re: kafka corrupt record exception

2019-04-02 Thread Dominik Wosiński
Hey,
As far as I understand, the error is not caused by the deserialization but
really by the polling of the message, so a custom deserialization schema
won't really help in this case. There seems to be an error in the messages
in Your topic.

You can see here what data should be associated with the message. One thing
you could possibly do is simply find the offset of the corrupted message and
start reading after that record. However, You should probably verify why the
message size is smaller than it should be. One thing that can cause this
exact behavior is a mismatch between the Kafka versions on the broker and the
consumer.
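If You decide to seek past the record, something along these lines should work (a sketch assuming the 0.11 consumer; the partition and offset values are placeholders You would have to determine yourself):

import java.util.{HashMap => JHashMap, Properties}
import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011
import org.apache.flink.streaming.connectors.kafka.internals.KafkaTopicPartition

val props = new Properties()
props.setProperty("bootstrap.servers", "localhost:9092")
props.setProperty("group.id", "my-group")

// Start right after the corrupted record (offset of the bad record + 1).
val specificOffsets = new JHashMap[KafkaTopicPartition, java.lang.Long]()
specificOffsets.put(new KafkaTopicPartition("topic_log", 3), java.lang.Long.valueOf(12346L))

val consumer = new FlinkKafkaConsumer011[String]("topic_log", new SimpleStringSchema(), props)
consumer.setStartFromSpecificOffsets(specificOffsets)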

Best Regards,
Dom.

Tue, 2 Apr 2019 at 09:36, Ilya Karpov wrote:

> According to docs (here:
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html#the-deserializationschema
>  ,
> last paragraph) that’s an expected behaviour. May be you should think about
> writing your own deserialisation schema to skip corrupted messages.
>
> On 1 Apr 2019, at 18:19, Sushant Sawant wrote:
>
> Hi,
> Thanks for reply.
> But is there a way one could skip this corrupt record from Flink consumer?
> Flink job goes in a loop, it restarts and then fails again at same record.
>
>
> On Mon, 1 Apr 2019, 07:34 Congxian Qiu,  wrote:
>
>> Hi
>> As you said, consume from ubuntu terminal has the same error, maybe you
>> could send a email to kafka user maillist.
>>
>> Best, Congxian
>> On Apr 1, 2019, 05:26 +0800, Sushant Sawant ,
>> wrote:
>>
>> Hi team,
>> I am facing this exception,
>>
>> org.apache.kafka.common.KafkaException: Received exception when fetching
>> the next record from topic_log-3. If needed, please seek past the record to
>> continue consumption.
>>
>> at
>> org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.fetchRecords(Fetcher.java:1076)
>>
>> at
>> org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.access$1200(Fetcher.java:944)
>>
>> at
>> org.apache.kafka.clients.consumer.internals.Fetcher.fetchRecords(Fetcher.java:567)
>>
>> at
>> org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:528)
>>
>> at
>> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1086)
>>
>> at
>> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1043)
>>
>> at
>> org.apache.flink.streaming.connectors.kafka.internal.KafkaConsumerThread.run(KafkaConsumerThread.java:257)
>>
>> Caused by: org.apache.kafka.common.errors.CorruptRecordException: Record
>> size is less than the minimum record overhead (14)
>>
>>
>> Also, when I consume message from ubuntu terminal consumer, I get same
>> error.
>>
>> How can skip this corrupt record?
>>
>>
>>
>>
>>
>


Re: JSON to CEP conversion

2019-01-22 Thread Dominik Wosiński
Hey Anish,

I have done some abstraction over the logic of CEP, but with the use of
Apache Bahir [1], which provides the Siddhi CEP [2] engine that allows
SQL-like definitions of the logic.

Best, Dom.

[1] https://github.com/apache/bahir
[2] https://github.com/wso2/siddhi

Tue, 22 Jan 2019 at 20:20, ashish pok wrote:

> All,
>
> Wondering if anyone in community has started something along the line -
> idea being CEP logic is abstracted out to metadata instead. That can then
> further be exposed out to users from a REST API/UI etc. Obviously, it would
> really need some additional information like data catalog etc for it to be
> really helpful. But to get started we were thinking of allowing power users
> make some modifications to existing rule outside of CI/CD process.
>
> Thanks,
>
> - Ashish
>


Re: Getting RemoteTransportException

2019-01-17 Thread Dominik Wosiński
Hey,
As for the question about taskmanager.network.netty.server.numThreads: it is
the size of the thread pool that will be used by the Netty server. The
default value is -1, which results in a thread pool with size equal to the
number of task slots of your TaskManager.
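If you want to pin it explicitly instead of relying on that default, it is a regular flink-conf.yaml entry (the value 8 below is just an example):

# flink-conf.yaml
# -1 (the default) means: use as many threads as the TaskManager has task slots
taskmanager.network.netty.server.numThreads: 8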

Best Regards,
Dom.

Thu, 17 Jan 2019 at 00:52, Avi Levi wrote:

> Hi Guys,
>
> We done some load tests and we got the exception below, I saw that the
> JobManager was restarted, If I understood correctly, it will get new job id
> and the state will lost - is that correct? how the state is handled setting
> HA as described here
> ,
>  what
> actually happens to the state if one of the job manager crashes (keyed
> state using rocks db) ?
>
>
> One of the property that might be relevant to this exception is
> taskmanager.network.netty.server.numThreads
> 
>  with
> a default value of -1 - what is this default value actually means?  and
> should it be set to different value according to #cores?
>
>
> Thanks for your advice .
>
> Avi
>
>
>
> org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException:
> Lost connection to task manager ':1234'. This indicates that the remote
> task manager was lost.
>
> at
> org.apache.flink.runtime.io.network.netty.CreditBasedPartitionRequestClientHandler.exceptionCaught(CreditBasedPartitionRequestClientHandler.java:160)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:285)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:264)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:256)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:131)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:285)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:264)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:256)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.ChannelHandlerAdapter.exceptionCaught(ChannelHandlerAdapter.java:87)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:285)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:264)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:256)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline$HeadContext.exceptionCaught(DefaultChannelPipeline.java:1401)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:285)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:264)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireExceptionCaught(DefaultChannelPipeline.java:953)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:125)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:174)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
>
> at
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
>
> at
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
>
> at java.lang.Thread.run(Thread.java:748)
>
> Caused by: java.io.IOException: Connection reset by peer
>
> at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>
> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>
> at 

Re: Passing vm options

2019-01-07 Thread Dominik Wosiński
Hey,
AFAIK, Flink currently supports dynamic properties only on YARN and not
really in standalone mode.
If You are using YARN, it should indeed be possible to set such a
configuration. If not, then I am afraid it is not possible.
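On YARN it would look roughly like this (a sketch; -yD passes a dynamic property into the Flink configuration, and the env.java.opts value is just one example of forwarding a JVM system property such as a Typesafe Config file - the paths are placeholders):

bin/flink run -m yarn-cluster \
  -yD env.java.opts="-Dconfig.file=/path/to/application.conf" \
  foo-assembly-0.0.1-SNAPSHOT.jar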

Best Regards,
Dom.


Mon, 7 Jan 2019 at 09:01, Avi Levi wrote:

> Hi ,
> I am trying to pass some vm options e.g
> bin/flink run foo-assembly-0.0.1-SNAPSHOT.jar
> -Dflink.stateDir=file:///tmp/ -Dkafka.bootstrap.servers="localhost:9092"
> -Dkafka.security.ssl.enabled=false
> but it doesn't seem to override the values in application.conf . Am I
> missing something?
> BTW is it possible to pass config file using -Dcofig.file ?
>
> BR
> Avi
>


Re: Kafka consumer, is there a way to filter out messages using key only?

2018-12-27 Thread Dominik Wosiński
Hey,
AFAIK, returning null from the deserialize function in FlinkKafkaConsumer
will indeed filter the message out, and it won't be processed further.
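A rough sketch of what that could look like, assuming string keys and values and an arbitrary prefix-based filter (the prefix and class name are made up):

import org.apache.flink.api.common.typeinfo.{TypeInformation, Types}
import org.apache.flink.streaming.util.serialization.KeyedDeserializationSchema

class KeyFilteringSchema extends KeyedDeserializationSchema[String] {

  override def deserialize(messageKey: Array[Byte],
                           message: Array[Byte],
                           topic: String,
                           partition: Int,
                           offset: Long): String = {
    val key = if (messageKey == null) "" else new String(messageKey, "UTF-8")
    // returning null drops the record, so the (expensive) value is never parsed
    if (key.startsWith("keep-")) new String(message, "UTF-8") else null
  }

  override def isEndOfStream(nextElement: String): Boolean = false

  override def getProducedType: TypeInformation[String] = Types.STRING
}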

Best Regards,
Dom.

Wed, 19 Dec 2018 at 11:06, Dawid Wysakowicz wrote:

> Hi,
>
> I'm afraid that there is no out-of-the box solution for this, but what you
> could do is to generate from KeyedDeserializationSchema some marker
> (Optional, null value...) based on the message key, that would allow you
> later to filter it out. So assuming the Optional solution the result of
> KeyedDeserializationSchema#deserialize could be Optional.empty() for
> invalid keys and Optional.of(deserializedValue) for valid keys.
>
> Best,
>
> Dawid
> On 18/12/2018 20:22, Hao Sun wrote:
>
> Hi, I am using 1.7 on K8S.
>
> I have a huge amount of data in kafka, but I only need a tiny portion of
> it.
> It is a keyed stream, the value in JSON encoded. I want to avoid
> deserialization of the value, since it is very expensive. Can I only filter
> based on the key?
> I know there is a KeyedDeserializationSchema, but can I use it to filter
> data?
>
> Hao Sun
> Team Lead
> 1019 Market St. 7F
> San Francisco, CA 94103
>
>


Re: Changes in Flink 1.6.2

2018-11-30 Thread Dominik Wosiński
Hey,

@Dawid is right. It is a known issue in Scala. This is due to the
functional nature of Scala and is explained on StackOverflow[1].

Best Regards,
Dom.
[1]
https://stackoverflow.com/questions/7498677/why-is-this-reference-ambiguous

Fri, 30 Nov 2018 at 15:56, Dawid Wysakowicz wrote:

> Hi Boris,
>
> I am not a scala expert, so I won't be able explain the root cause
> completely, but it is because you access empty-parameter java method as
> scala parameterless one (I don't know why it doesn't work).
>
> If you change your code to: env.getStreamGraph.getJobGraph().getJobID it
> will work.
>
> Best,
>
> Dawid
> On 30/11/2018 15:19, Boris Lublinsky wrote:
>
> Dominik,
> Any feedback on this?
>
> Boris Lublinsky
> FDP Architect
> boris.lublin...@lightbend.com
> https://www.lightbend.com/
>
> On Nov 28, 2018, at 2:56 PM, Boris Lublinsky <
> boris.lublin...@lightbend.com> wrote:
>
> Here is the code
>
> def executeLocal(): Unit = {
>   val env = StreamExecutionEnvironment.getExecutionEnvironment
>   buildGraph(env)
>   System.out.println("[info] Job ID: " + env.getStreamGraph.getJobGraph.getJobID)
>   env.execute()
> }
>
> And an error
>
> Error:(68, 63) ambiguous reference to overloaded definition,
> both method getJobGraph in class StreamGraph of type (x$1:
> org.apache.flink.api.common.JobID)org.apache.flink.runtime.jobgraph.JobGraph
> and  method getJobGraph in class StreamingPlan of type
> ()org.apache.flink.runtime.jobgraph.JobGraph
> match expected type ?
> System.out.println("[info] Job ID: " +
> env.getStreamGraph.getJobGraph.getJobID)
>
> Boris Lublinsky
> FDP Architect
> boris.lublin...@lightbend.com
> https://www.lightbend.com/
>
> On Nov 28, 2018, at 2:47 PM, Dominik Wosiński  wrote:
>
> Hey,
> Could you show the message that You are getting?
> Best Regards,
> Dom.
>
> Wed, 28 Nov 2018 at 19:08, Boris Lublinsky wrote:
>
>>
>>
>> Prior to Flink version 1.6.2 including 1.6.1
>> env.getStreamGraph.getJobGraph was happily returning currently defined
>> Graph, but in 1.6.2 this fails to compile with a pretty cryptic message
>> AM I missing something?
>>
>>
>> Boris Lublinsky
>> FDP Architect
>> boris.lublin...@lightbend.com
>> https://www.lightbend.com/
>>
>>
>
>


Re: Flink SQL

2018-11-30 Thread Dominik Wosiński
Hey,

Not exactly sure what you mean by "nothing", but generally the concept is
this: the data is fed into a dynamic table, and the result of the query
creates another dynamic table. So, if the query returns an empty table, no
data will indeed be written to S3. Not sure if this was what You were asking
about.

Best Regards,
Dom.



Fri, 30 Nov 2018 at 08:24, Steve Bistline wrote:

> Hi,
>
> I have a silly question about Flink SQL that I cannot seem to find a clear
> answer to. If I have the following code. Will the "result" from the sql
> SELECT statement only return and the data then be written to S3 if and only
> if the statement returns data that matches the criteria?
>
> Does "nothing" happen otherwise ( ie no match to the sql statement.)?
>
> tableEnv.registerDataStream("SENSORS",dataset,"t_deviceID, t_timeStamp,
> t_sKey, t_sValue");
>
>
> // TEMEPERATURE
> Table result = tableEnv.sql("SELECT 'AlertTEMEPERATURE ',t_sKey,
> t_deviceID, t_sValue FROM SENSORS WHERE t_sKey='TEMPERATURE' AND t_sValue >
> " + TEMPERATURE_THRESHOLD);
> tableEnv.toAppendStream(result, Row.class).print();
> // Write to S3 bucket
> DataStream dsRow = tableEnv.toAppendStream(result, Row.class);
> String fileNameTemp = sdf.format(new Date());
> dsRow.writeAsText("s3://csv-ai/flink-alerts/"+fileNameTemp+
> "TEMPERATURE.txt");
>


Re: Changes in Flink 1.6.2

2018-11-28 Thread Dominik Wosiński
Hey,
Could you show the message that You are getting?
Best Regards,
Dom.

Wed, 28 Nov 2018 at 19:08, Boris Lublinsky wrote:

>
>
>
> Prior to Flink version 1.6.2 including 1.6.1
> env.getStreamGraph.getJobGraph was happily returning currently defined
> Graph, but in 1.6.2 this fails to compile with a pretty cryptic message
> AM I missing something?
>
>
> Boris Lublinsky
> FDP Architect
> boris.lublin...@lightbend.com
> https://www.lightbend.com/
>
>


Re: Can JDBCSinkFunction support exactly once?

2018-11-21 Thread Dominik Wosiński
Hey,

As far as I know, the function needs to implement
*TwoPhaseCommitSinkFunction*, not just *CheckpointListener*.
*JDBCSinkFunction* does not implement the two-phase commit, so currently it
does not support exactly-once.

Best Regards,
Dom.

Wed, 21 Nov 2018 at 11:07, Jocean shi wrote:

> Hi,
> Can JDBCSinkFunction support exectly once? Is it that The JDBCSinkFunction
> dont't implement CheckpointListener meaning JDBCSinkFunction dont't support
> exectly once?
>
> cheers
>
> Jocean
>


Re: Store Predicate or any lambda in MapState

2018-11-21 Thread Dominik Wosiński
Hey Jayant,

I don't really think that the sole fact of using Predicate should cause the
*ClassNotFoundException* that You are talking about. The exception may come
from the fact that some libraries are missing from Your cluster
environment. Have You tried running the job locally to verify that the
exception occurs? Also, could You please paste some logs here, they may
help in determining the exact reason for the problem.

Best Regards,
Dom.



Wed, 21 Nov 2018 at 04:41, Jayant Ameta wrote:

> Hi,
> I want to store a custom POJO in the MapState. One of the fields in the
> object is a java.util.function.Predicate type.
> Flink gives ClassNotFoundException exception on the lambda. How do I
> store this object in the mapState?
>
> Marking the predicate field as transient is an option. But in my use-case,
> the predicate field is set using another library, and I don't want to call
> it every time I want.
>
>
> Jayant Ameta
>


Re: Kinesis Connector - NoClassDefFoundError

2018-11-20 Thread Dominik Wosiński
Hey,

Have you updated the versions both in the environment and in the job's
dependencies?
From my personal experience, 95% of such issues are due to a mismatch
between the Flink version on the cluster you are using and the one in Your job.

Best Regards,
Dom.

Tue, 20 Nov 2018 at 07:41, Steve Bistline wrote:

> Hey all... upgrade from Flink 1.5.0 to 1.6.2 and for some reason cannot
> figure out what I missed in setting up the new environment. I am getting
> this error:
>
>
> java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.flink.kinesis.shaded.com.amazonaws.partitions.PartitionsLoader
>   at 
> org.apache.flink.kinesis.shaded.com.amazonaws.regions.RegionMetadataFactory.create(RegionMetadataFactory.java:30)
>   at 
> org.apache.flink.kinesis.shaded.com.amazonaws.regions.RegionUtils.initialize(RegionUtils.java:64)
>   at 
> org.apache.flink.kinesis.shaded.com.amazonaws.regions.RegionUtils.getRegionMetadata(RegionUtils.java:52)
>   at 
> org.apache.flink.kinesis.shaded.com.amazonaws.regions.RegionUtils.getRegion(RegionUtils.java:105)
>   at 
> org.apache.flink.kinesis.shaded.com.amazonaws.client.builder.AwsClientBuilder.withRegion(AwsClientBuilder.java:239)
>   at 
> org.apache.flink.kinesis.shaded.com.amazonaws.client.builder.AwsClientBuilder.withRegion(AwsClientBuilder.java:226)
>   at 
> org.apache.flink.streaming.connectors.kinesis.util.AWSUtil.createKinesisClient(AWSUtil.java:93)
>   at 
> org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.createKinesisClient(KinesisProxy.java:203)
>   at 
> org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.(KinesisProxy.java:138)
>   at 
> org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.create(KinesisProxy.java:213)
>   at 
> org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher.(KinesisDataFetcher.java:242)
>   at 
> org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher.(KinesisDataFetcher.java:207)
>   at 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.createFetcher(FlinkKinesisConsumer.java:417)
>   at 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.run(FlinkKinesisConsumer.java:233)
>   at 
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:94)
>   at 
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:58)
>   at 
> org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:99)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>   at java.lang.Thread.run(Thread.java:748)
>
>


Re: Field could not be resolved by the field mapping when using kafka connector

2018-11-15 Thread Dominik Wosiński
Hey,
Thanks for the info, I hadn't noticed that.
I was just going through older messages with no responses.

Best Regards,
Dom.


Re: Flink with parallelism 3 is running locally but not on cluster

2018-11-15 Thread Dominik Wosiński
PS.
Could You also post the whole log of the application run?

Best Regards,
Dom.

Thu, 15 Nov 2018 at 11:04, Dominik Wosiński wrote:

> Hey,
>
> Did You try to run any other job on your setup? Also, could You please
> tell what are the sources you are trying to use, do all messages come from
> Kafka??
> From the first look, it seems that the JobManager can't connect to one of
> the TaskManagers.
>
>
> Best Regards,
> Dom.
>
> Mon, 12 Nov 2018 at 17:12, zavalit wrote:
>
>> Hi,
>> may be i just missing smth, but i just have no more ideas where to look.
>>
>> here is an screen of the failed state
>> <
>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t1383/Bildschirmfoto_2018-11-12_um_16.png>
>>
>>
>> i read messages from 2 sources, make a join based on a common key and sink
>> it all in a kafka.
>>
>>   val env = StreamExecutionEnvironment.getExecutionEnvironment
>>   env.setParallelism(3)
>>   ...
>>   source1
>>  .keyBy(_.searchId)
>>  .connect(source2.keyBy(_.searchId))
>>  .process(new SearchResultsJoinFunction)
>>  .addSink(KafkaSink.sink)
>>
>> so it perfectly works when launch it locally. when i deploy it to 1 job
>> manager and 3 taskmanagers and get every Task in "RUNNING" state, after 2
>> minutes (when nothing is comming to sink) one of the taskmanagers gets
>> following in log:
>>
>>  Flat Map (1/3) (9598c11996f4b52a2e2f9f532f91ff66) switched from RUNNING
>> to
>> FAILED.
>> java.io.IOException: Connecting the channel failed: Connecting to remote
>> task manager + 'flink-taskmanager-11-dn9cj/10.81.27.84:37708' has failed.
>> This might indicate that the remote task manager has been lost.
>> at
>> org.apache.flink.runtime.io
>> .network.netty.PartitionRequestClientFactory$ConnectingChannel.waitForChannel(PartitionRequestClientFactory.java:196)
>> at
>> org.apache.flink.runtime.io
>> .network.netty.PartitionRequestClientFactory$ConnectingChannel.access$000(PartitionRequestClientFactory.java:133)
>> at
>> org.apache.flink.runtime.io
>> .network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:85)
>> at
>> org.apache.flink.runtime.io
>> .network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:60)
>> at
>> org.apache.flink.runtime.io
>> .network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:166)
>> at
>> org.apache.flink.runtime.io
>> .network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:494)
>> at
>> org.apache.flink.runtime.io
>> .network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:525)
>> at
>> org.apache.flink.runtime.io
>> .network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:508)
>> at
>>
>> org.apache.flink.streaming.runtime.io.BarrierTracker.getNextNonBlocked(BarrierTracker.java:94)
>> at
>> org.apache.flink.streaming.runtime.io
>> .StreamInputProcessor.processInput(StreamInputProcessor.java:209)
>> at
>>
>> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105)
>> at
>>
>> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>> at java.lang.Thread.run(Thread.java:748)
>> Caused by:
>> org.apache.flink.runtime.io
>> .network.netty.exception.RemoteTransportException:
>> Connecting to remote task manager +
>> 'flink-taskmanager-11-dn9cj/10.81.27.84:37708' has failed. This might
>> indicate that the remote task manager has been lost.
>> at
>> org.apache.flink.runtime.io
>> .network.netty.PartitionRequestClientFactory$ConnectingChannel.operationComplete(PartitionRequestClientFactory.java:219)
>> at
>> org.apache.flink.runtime.io
>> .network.netty.PartitionRequestClientFactory$ConnectingChannel.operationComplete(PartitionRequestClientFactory.java:133)
>> at
>>
>> org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:511)
>> at
>>
>> org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:504)
>> at
>>
>> org.apache.flink.shaded.netty4.io.netty.util.con

Re: Flink with parallelism 3 is running locally but not on cluster

2018-11-15 Thread Dominik Wosiński
Hey,

Did You try to run any other job on your setup? Also, could You please tell
what sources you are trying to use - do all messages come from Kafka?
From the first look, it seems that the JobManager can't connect to one of
the TaskManagers.


Best Regards,
Dom.

Mon, 12 Nov 2018 at 17:12, zavalit wrote:

> Hi,
> may be i just missing smth, but i just have no more ideas where to look.
>
> here is an screen of the failed state
> <
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t1383/Bildschirmfoto_2018-11-12_um_16.png>
>
>
> i read messages from 2 sources, make a join based on a common key and sink
> it all in a kafka.
>
>   val env = StreamExecutionEnvironment.getExecutionEnvironment
>   env.setParallelism(3)
>   ...
>   source1
>  .keyBy(_.searchId)
>  .connect(source2.keyBy(_.searchId))
>  .process(new SearchResultsJoinFunction)
>  .addSink(KafkaSink.sink)
>
> so it perfectly works when launch it locally. when i deploy it to 1 job
> manager and 3 taskmanagers and get every Task in "RUNNING" state, after 2
> minutes (when nothing is comming to sink) one of the taskmanagers gets
> following in log:
>
>  Flat Map (1/3) (9598c11996f4b52a2e2f9f532f91ff66) switched from RUNNING to
> FAILED.
> java.io.IOException: Connecting the channel failed: Connecting to remote
> task manager + 'flink-taskmanager-11-dn9cj/10.81.27.84:37708' has failed.
> This might indicate that the remote task manager has been lost.
> at
> org.apache.flink.runtime.io
> .network.netty.PartitionRequestClientFactory$ConnectingChannel.waitForChannel(PartitionRequestClientFactory.java:196)
> at
> org.apache.flink.runtime.io
> .network.netty.PartitionRequestClientFactory$ConnectingChannel.access$000(PartitionRequestClientFactory.java:133)
> at
> org.apache.flink.runtime.io
> .network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:85)
> at
> org.apache.flink.runtime.io
> .network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:60)
> at
> org.apache.flink.runtime.io
> .network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:166)
> at
> org.apache.flink.runtime.io
> .network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:494)
> at
> org.apache.flink.runtime.io
> .network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:525)
> at
> org.apache.flink.runtime.io
> .network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:508)
> at
>
> org.apache.flink.streaming.runtime.io.BarrierTracker.getNextNonBlocked(BarrierTracker.java:94)
> at
> org.apache.flink.streaming.runtime.io
> .StreamInputProcessor.processInput(StreamInputProcessor.java:209)
> at
>
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105)
> at
>
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
> at java.lang.Thread.run(Thread.java:748)
> Caused by:
> org.apache.flink.runtime.io
> .network.netty.exception.RemoteTransportException:
> Connecting to remote task manager +
> 'flink-taskmanager-11-dn9cj/10.81.27.84:37708' has failed. This might
> indicate that the remote task manager has been lost.
> at
> org.apache.flink.runtime.io
> .network.netty.PartitionRequestClientFactory$ConnectingChannel.operationComplete(PartitionRequestClientFactory.java:219)
> at
> org.apache.flink.runtime.io
> .network.netty.PartitionRequestClientFactory$ConnectingChannel.operationComplete(PartitionRequestClientFactory.java:133)
> at
>
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:511)
> at
>
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:504)
> at
>
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:483)
> at
>
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:424)
> at
>
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:121)
> at
>
> org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:269)
> at
>
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)
> at
>
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:125)
> at
>
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
> 

Re: Field could not be resolved by the field mapping when using kafka connector

2018-11-15 Thread Dominik Wosiński
Hey,

Could You please show some sample data that You want to process? This would
help in verifying the issue.

Best Regards,
Dom.

Tue, 13 Nov 2018 at 13:58, Jeff Zhang wrote:

> Hi,
>
> I hit the following error when I try to use kafka connector in flink table
> api. There's very little document about how to use kafka connector in flink
> table api, could anyone help me on that ? Thanks
>
> Exception in thread "main" org.apache.flink.table.api.ValidationException:
> Field 'event_ts' could not be resolved by the field mapping.
> at
> org.apache.flink.table.sources.TableSourceUtil$.org$apache$flink$table$sources$TableSourceUtil$$resolveInputField(TableSourceUtil.scala:491)
> at
> org.apache.flink.table.sources.TableSourceUtil$$anonfun$org$apache$flink$table$sources$TableSourceUtil$$resolveInputFields$1.apply(TableSourceUtil.scala:521)
> at
> org.apache.flink.table.sources.TableSourceUtil$$anonfun$org$apache$flink$table$sources$TableSourceUtil$$resolveInputFields$1.apply(TableSourceUtil.scala:521)
> at
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
> at
> org.apache.flink.table.sources.TableSourceUtil$.org$apache$flink$table$sources$TableSourceUtil$$resolveInputFields(TableSourceUtil.scala:521)
> at
> org.apache.flink.table.sources.TableSourceUtil$.validateTableSource(TableSourceUtil.scala:127)
> at
> org.apache.flink.table.plan.schema.StreamTableSourceTable.(StreamTableSourceTable.scala:33)
> at
> org.apache.flink.table.api.StreamTableEnvironment.registerTableSourceInternal(StreamTableEnvironment.scala:150)
> at
> org.apache.flink.table.api.TableEnvironment.registerTableSource(TableEnvironment.scala:541)
> at
> org.apache.flink.table.descriptors.ConnectTableDescriptor.registerTableSource(ConnectTableDescriptor.scala:47)
> at
> org.apache.flink.table.descriptors.ConnectTableDescriptor.registerTableSourceAndSink(ConnectTableDescriptor.scala:68)
>
> And here's the source code:
>
>
>
>  case class Record(status: String, direction: String, var event_ts: Timestamp)
>
>
>   def main(args: Array[String]): Unit = {
> val senv = StreamExecutionEnvironment.getExecutionEnvironment
> senv.setParallelism(1)
> senv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
>
> val data: DataStream[Record] = ...
> val tEnv = TableEnvironment.getTableEnvironment(senv)
> tEnv
>   // declare the external system to connect to
>   .connect(
>   new Kafka()
> .version("0.11")
> .topic("processed5.events")
> .startFromEarliest()
> .property("zookeeper.connect", "localhost:2181")
> .property("bootstrap.servers", "localhost:9092"))
>   .withFormat(new Json()
> .failOnMissingField(false)
> .deriveSchema()
>   )
>   .withSchema(
> new Schema()
>   .field("status", Types.STRING)
>   .field("direction", Types.STRING)
>   .field("event_ts", Types.SQL_TIMESTAMP).rowtime(
>   new 
> Rowtime().timestampsFromField("event_ts").watermarksPeriodicAscending())
>   )
>
>   // specify the update-mode for streaming tables
>   .inAppendMode()
>
>   // register as source, sink, or both and under a name
>   .registerTableSourceAndSink("MyUserTable");
>
> tEnv.fromDataStream(data).insertInto("MyUserTable")
>
>


Re: Manual trigger the window in fold operator or incremental aggregation

2018-10-24 Thread Dominik Wosiński
Hey Zhen Li,

What exactly are You trying to do? Maybe there is a more suitable method
available in Flink than manually triggering windows.

Best Regards,
Dom.

Wed, 24 Oct 2018 at 09:25, Dawid Wysakowicz wrote:

> Hi Zhen Li,
>
> As far as I know that is not possible. For such custom handling I would
> recommend having a look at ProcessFunction[1], where you have access to
> timers and state.
>
> Best,
>
> Dawid
> [1]
> https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/operators/process_function.html#process-function-low-level-operations
>
> On 17/10/2018 14:18, Ahmad Hassan wrote:
>
> Hi Niels,
>
> Can we distinguish within apply function of 'RichWindowFunction' whether
> it was called due to onElement trigger call or onProcessingtime trigger
> call of a custom Trigger ?
>
> Thanks!
>
> On Wed, 17 Oct 2018 at 12:51, Niels van Kaam  wrote:
>
>> Hi Zhen Li,
>>
>> You can control when a windowed stream emits data with "Triggers". See:
>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/windows.html#triggers
>>
>> Flink comes with a couple of default triggers, but you can also create
>> your own by implementing
>> https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/streaming/api/windowing/triggers/Trigger.html
>> .
>>
>> Note that this does not change the window, but just causes the
>> windowedstream to emit intermediate results to the next operator.
>>
>> Does this answer your question?
>>
>> Cheers,
>> Niels
>>
>> On Wed, Oct 17, 2018 at 12:34 PM zhen li  wrote:
>>
>>> Hi all:
>>> How can I trigger the window manually in  fold operator or
>>> incremental aggregation? For example, when some conditions is meet,althouth
>>> the count window or time window is not reach
>>
>>


Re: Kafka connector error: This server does not host this topic-partition

2018-10-23 Thread Dominik Wosiński
Hey Alexander,
It seems that this issue occurs when a broker is down and the partition is
electing a new leader, AFAIK. There is one JIRA issue I have found; not sure
if that's what You are looking for:
https://issues.apache.org/jira/browse/KAFKA-6221

This issue is connected with Kafka itself rather than Flink.

Best Regards,
Dom.

Tue, 23 Oct 2018 at 15:04, Alexander Smirnov wrote:

> Hi,
>
> I stumbled upon an exception in the "Exceptions" tab which I could not
> explain. Do you know what could cause it? Unfortunately I don't know how to
> reproduce it. Do you know if there is a respective JIRA issue for it?
>
> Here's the exception's stack trace:
>
> org.apache.flink.streaming.connectors.kafka.FlinkKafka011Exception: Failed
> to send data to Kafka: This server does not host this topic-partition.
> at
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011.checkErroneous(FlinkKafkaProducer011.java:999)
> at
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011.invoke(FlinkKafkaProducer011.java:614)
> at
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011.invoke(FlinkKafkaProducer011.java:93)
> at
> org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction.invoke(TwoPhaseCommitSinkFunction.java:219)
> at
> org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:56)
> at
> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:549)
> at
> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:524)
> at
> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:504)
> at
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:830)
> at
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:808)
> at
> org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
> at
> com.five9.distribution.FilterBusinessMetricsFunction.flatMap(FilterBusinessMetricsFunction.java:162)
> at
> com.five9.distribution.FilterBusinessMetricsFunction.flatMap(FilterBusinessMetricsFunction.java:31)
> at
> org.apache.flink.streaming.api.operators.StreamFlatMap.processElement(StreamFlatMap.java:50)
> at
> org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:207)
> at
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:69)
> at
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:264)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
> at java.lang.Thread.run(Thread.java:748)
> Caused by:
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This
> server does not host this topic-partition.
>
> Thank you,
> Alex
>


Re: KafkaException or ExecutionStateChange failure on job startup

2018-10-23 Thread Dominik Wosiński
Hey Mark,

Do You use more than one Kafka consumer for Your jobs? I think this relates
to a known issue in Kafka:
https://issues.apache.org/jira/browse/KAFKA-3992.
The problem is that if You don't provide a client ID for your *KafkaConsumer*,
Kafka assigns one, but this is done in an unsynchronized way, so it can end up
assigning the same ID to multiple different consumer instances. This is probably
what happens when multiple jobs are resumed at the same time.

What You could try to do is to assign the consumer's *client.id* (the older
consumer API called it *consumer.id*) explicitly, using the properties passed to
each consumer. This should help in solving this issue.
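
A minimal sketch of what that could look like (the broker address, group, topic
and ID values below are made up, and the sketch assumes FlinkKafkaConsumer011
with the modern consumer's client.id property):

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;

public class UniqueClientIdExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker:9092"); // made-up address
        props.setProperty("group.id", "my-job");               // made-up group id
        // Give every consumer its own, unique ID so the auto-generated
        // "consumer-1", "consumer-2", ... names cannot collide across jobs.
        props.setProperty("client.id", "my-job-consumer");

        env.addSource(new FlinkKafkaConsumer011<>("my-topic", new SimpleStringSchema(), props))
           .print();

        env.execute("unique-client-id-example");
    }
}

If several jobs read from the same cluster, derive the ID from something
job-specific (for example the job name) so that no two jobs share it.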

Best Regards,
Dom.




On Tue, 23 Oct 2018 at 13:21, Mark Harris wrote:

> Hi,
> We regularly see the following two exceptions in a number of jobs shortly
> after they have been resumed during our flink cluster startup:
>
> org.apache.kafka.common.KafkaException: Error registering mbean
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
> at
> org.apache.kafka.common.metrics.JmxReporter.reregister(JmxReporter.java:159)
> at
> org.apache.kafka.common.metrics.JmxReporter.metricChange(JmxReporter.java:77)
> at
> org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:436)
> at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:249)
> at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:234)
> at
> org.apache.kafka.common.network.Selector$SelectorMetrics.maybeRegisterConnectionMetrics(Selector.java:749)
> at
> org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:327)
> at org.apache.kafka.common.network.Selector.poll(Selector.java:303)
> at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:349)
> at
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:226)
> at
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:188)
> at
> org.apache.kafka.clients.consumer.internals.Fetcher.getTopicMetadata(Fetcher.java:283)
> at
> org.apache.kafka.clients.consumer.KafkaConsumer.partitionsFor(KafkaConsumer.java:1344)
> at
> org.apache.flink.streaming.connectors.kafka.internal.Kafka09PartitionDiscoverer.getAllPartitionsForTopics(Kafka09PartitionDiscoverer.java:77)
> at
> org.apache.flink.streaming.connectors.kafka.internals.AbstractPartitionDiscoverer.discoverPartitions(AbstractPartitionDiscoverer.java:131)
> at
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.open(FlinkKafkaConsumerBase.java:473)
> at
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
> at
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:102)
> at
> org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:424)
> at
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:290)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: javax.management.InstanceAlreadyExistsException:
> kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1
> at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437)
> at
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898)
> at
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966)
> at
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
> at
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
> at
> com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
> at
> org.apache.kafka.common.metrics.JmxReporter.reregister(JmxReporter.java:157)
> ... 21 more
> java.lang.Exception: Failed to send ExecutionStateChange notification to
> JobManager
> at
> org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$org$apache$flink$runtime$taskmanager$TaskManager$$handleTaskMessage$3$$anonfun$apply$2.apply(TaskManager.scala:439)
> at
> org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$org$apache$flink$runtime$taskmanager$TaskManager$$handleTaskMessage$3$$anonfun$apply$2.apply(TaskManager.scala:423)
> at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
> at
> akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
> at
> akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
> at
> akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
> at
> akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
> at
> 

Re: Mapstatedescriptor

2018-10-13 Thread Dominik Wosiński
Hey,
It's the name of the whole descriptor, not of the keys; it means that no
other descriptor should be registered under the same name.
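
To illustrate (a sketch with made-up types, assuming the function runs on a
stream keyed by the user id in field f0): the descriptor and its name exist once
per operator, while the state it describes is still scoped per key. In a
ProcessWindowFunction you would obtain the same kind of state through the
Context's globalState() or windowState() instead of getRuntimeContext().

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

// Counts page visits per user; input is (userId, page).
public class PageVisitCounter extends RichFlatMapFunction<Tuple2<String, String>, String> {

    // One descriptor per operator: "pageVisits" must not clash with any other
    // descriptor registered by this operator, but it is not a per-key name.
    private transient MapState<String, Integer> visits;

    @Override
    public void open(Configuration parameters) {
        visits = getRuntimeContext().getMapState(
                new MapStateDescriptor<>("pageVisits", String.class, Integer.class));
    }

    @Override
    public void flatMap(Tuple2<String, String> event, Collector<String> out) throws Exception {
        // The same MapState handle is transparently scoped to the current key,
        // so each user gets an independent page -> count map.
        String page = event.f1;
        Integer current = visits.get(page);
        visits.put(page, current == null ? 1 : current + 1);
        out.collect(event.f0 + " visited " + page + " " + visits.get(page) + " time(s)");
    }
}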

Best Regards,
Dom.

On Sat, 13 Oct 2018 at 09:50, Szymon wrote:

>
>
> Hi, I have a question about the MapStateDescriptor used to create MapState.
> I have a keyed stream and a ProcessWindowFunction where I want to use
> MapState. The question is: in the MapStateDescriptor constructor
>
> public MapStateDescriptor(String name, Class<UK> keyClass, Class<UV> valueClass)
>
> must the "name" be unique for each key, or is this only the
> name/description of the descriptor?
>
> Best Regards
> Szymon
>
>
>
>
>


Re: Making calls to external API wit Data Streams

2018-10-12 Thread Dominik Wosiński
Hey,

It seems that You have written an async function that takes a *String* and
returns a *String*. But at the call site you declare the result of the function to
be the tuple (*String, String)*. That's where the mismatch occurs; the
function itself is ok :)
If you change *DataStream[(String, String)]* to *DataStream[String]*, it
should work smoothly.

Best Regards,
Dom.

On Fri, 12 Oct 2018 at 16:26, Krishna Kalyan wrote:

> Thanks for the quick reply Dom,
>
> I am using flink 1.6.1.
>
> [image: image.png]
>
> Error: Type Mismatch expected AsyncFunction actual AsyncWeatherAPIRequest
>
>
>
> On Fri, 12 Oct 2018 at 16:21, Dominik Wosiński  wrote:
>
>> Hey,
>> What is the exact issue that you are facing and the Flink version that
>> you are using ??
>>
>>
>> Best Regards,
>> Dom.
>>
>> pt., 12 paź 2018 o 16:11 Krishna Kalyan 
>> napisał(a):
>>
>>> Hello All,
>>>
>>> I need some help making async API calls. I have tried the following code
>>> below.
>>>
>>> class AsyncWeatherAPIRequest extends AsyncFunction[String, String] {
>>>   override def asyncInvoke(input: String, resultFuture:
>>> ResultFuture[String]): Unit = {
>>> val query = url("")
>>> val response = Http.default(query OK as.String)
>>> resultFuture.complete(Collections.singleton(response()))
>>>   }
>>> }
>>>
>>> The code below leads to a compilation issue while calling the
>>> AsyncDataStream api.
>>>
>>> val resultStream: DataStream[(String, String)] =
>>>   AsyncDataStream.unorderedWait(userData, new
>>> AsyncWeatherAPIRequest(), 1000, TimeUnit.MILLISECONDS, 1)
>>>
>>> I would really appreciate some examples in scala to make an external API
>>> call with datastreams.
>>>
>>> Regards,
>>> Krishna
>>>
>>>
>>>
>>>
>>>
>
> --
>
> Krishna Kalyan
>
> M +49 151 44159906 <+49%20151%2044159906>
>
>
>


Re: Making calls to external API wit Data Streams

2018-10-12 Thread Dominik Wosiński
Hey,
What is the exact issue that you are facing and the Flink version that you
are using ??


Best Regards,
Dom.

On Fri, 12 Oct 2018 at 16:11, Krishna Kalyan wrote:

> Hello All,
>
> I need some help making async API calls. I have tried the following code
> below.
>
> class AsyncWeatherAPIRequest extends AsyncFunction[String, String] {
>   override def asyncInvoke(input: String, resultFuture:
> ResultFuture[String]): Unit = {
> val query = url("")
> val response = Http.default(query OK as.String)
> resultFuture.complete(Collections.singleton(response()))
>   }
> }
>
> The code below leads to a compilation issue while calling the
> AsyncDataStream api.
>
> val resultStream: DataStream[(String, String)] =
>   AsyncDataStream.unorderedWait(userData, new
> AsyncWeatherAPIRequest(), 1000, TimeUnit.MILLISECONDS, 1)
>
> I would really appreciate some examples in scala to make an external API
> call with datastreams.
>
> Regards,
> Krishna
>
>
>
>
>


Re: Duplicates in self join

2018-10-08 Thread Dominik Wosiński
Hey,
IMHO, the simplest way in your case would be to use an Evictor to evict
duplicate values before the window result is emitted. Have a look at it here:
https://ci.apache.org/projects/flink/flink-docs-release-1.6/api/java/org/apache/flink/streaming/api/windowing/evictors/Evictor.html
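
A rough sketch of such an evictor is below; it assumes the joined elements have
meaningful equals()/hashCode() and would be attached with .evictor(...) right
after the .window(...) call. Note that it only removes repeated elements inside a
single window: if the duplicates come from overlapping sliding windows, they have
to be deduplicated downstream instead.

import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

import org.apache.flink.streaming.api.windowing.evictors.Evictor;
import org.apache.flink.streaming.api.windowing.windows.Window;
import org.apache.flink.streaming.runtime.operators.windowing.TimestampedValue;

// Drops exact duplicates from a window's contents before the window function runs.
public class DeduplicatingEvictor<T, W extends Window> implements Evictor<T, W> {

    @Override
    public void evictBefore(Iterable<TimestampedValue<T>> elements, int size,
                            W window, EvictorContext ctx) {
        Set<T> seen = new HashSet<>();
        for (Iterator<TimestampedValue<T>> it = elements.iterator(); it.hasNext(); ) {
            if (!seen.add(it.next().getValue())) {
                it.remove(); // an equal element is already in this window
            }
        }
    }

    @Override
    public void evictAfter(Iterable<TimestampedValue<T>> elements, int size,
                           W window, EvictorContext ctx) {
        // nothing to do after the window function has fired
    }
}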

Best Regards,
Dominik.

On Mon, 8 Oct 2018 at 08:00, Eric L Goodman wrote:

> What is the best way to avoid or remove duplicates when joining a stream
> with itself?  I'm performing a streaming temporal triangle computation and
> the first part is to find triads of two edges of the form vertexA->vertexB
> and vertexB->vertexC (and there are temporal constraints where the first
> edge occurs before the second edge).  To do that, I have the following code:
>
> DataStream triads = edges.join(edges)
> .where(new DestKeySelector())
> .equalTo(new SourceKeySelector())
> .window(SlidingEventTimeWindows.of(Time.milliseconds(windowSizeMs),
> Time.milliseconds(slideSizeMs)))
> .apply(new EdgeJoiner(queryWindow));
>
> However, when I look at the triads being built, there are two copies of each 
> triad.
>
> For example, if I create ten edges (time, source, target):
>
> 0.0, 4, 0
>
> 0.01, 1, 5
>
> 0.02, 3, 7
>
> 0.03, 0, 8
>
> 0.04, 0, 9
>
> 0.05, 4, 8
>
> 0.06, 4, 3
>
> 0.07, 5, 9
>
> 0.08, 7, 1
>
> 0.09, 9, 6
>
>
> It creates the following triads (time1, source1, target1, time2, source2,
> target2). Note there are two copies of each.
>
> 0.0, 4, 0 0.03, 0, 8
>
> 0.0, 4, 0 0.03, 0, 8
>
> 0.0, 4, 0 0.04, 0, 9
>
> 0.0, 4, 0 0.04, 0, 9
>
> 0.01, 1, 5 0.07, 5, 9
>
> 0.01, 1, 5 0.07, 5, 9
>
> 0.02, 3, 7 0.08, 7, 1
>
> 0.02, 3, 7 0.08, 7, 1
>
> 0.04, 0, 9 0.09, 9, 6
>
> 0.04, 0, 9 0.09, 9, 6
>
> 0.07, 5, 9 0.09, 9, 6
>
> 0.07, 5, 9 0.09, 9, 6
>
> I'm assuming this behavior has something to do with the joining of "edges" 
> with itself.
>
> I can provide more code if that would be helpful, but I believe I've captured 
> the most salient portion.
>
>
>
>
>
>


Re: How to add jvm Options when using Flink dashboard?

2018-09-05 Thread Dominik Wosiński
Hey,
You can’t, as Chesnay has already said, but for your use case you could pass
program arguments instead of JVM options, and they will work equally well.
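
A minimal sketch (the --config-resource argument name and the defaults are made
up; it assumes the job already uses Typesafe Config and that Flink's
ParameterTool is available):

import com.typesafe.config.Config;
import com.typesafe.config.ConfigFactory;

import org.apache.flink.api.java.utils.ParameterTool;

public class JobEntryPoint {
    public static void main(String[] args) {
        // Pass "--config-resource dev.conf" as a program argument in the Dashboard
        // instead of the -Dconfig.resource=dev.conf JVM option.
        ParameterTool params = ParameterTool.fromArgs(args);
        String resource = params.get("config-resource", "application.conf");

        // Load the named resource from the job jar's classpath.
        Config config = ConfigFactory.parseResources(resource)
                .withFallback(ConfigFactory.load())
                .resolve();

        // ... build and execute the Flink job using `config` ...
    }
}
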
Best Regards,
Dom.


Sent from the Mail app for Windows 10

From: Chesnay Schepler
Sent: Wednesday, 5 September 2018 11:43
To: zpp; user@flink.apache.org
Subject: Re: How to add jvm Options when using Flink dashboard?

You can't set JVM options when submitting through the Dashboard. This cannot be 
implemented since no separate JVM is spun up when you submit a job that way.

On 05.09.2018 11:41, zpp wrote:
I wrote a task using Typesafe config. It must be pointed at the config file
using JVM options like "-Dconfig.resource=dev.conf".
How can I do that with Flink dashboard?
Thanks for the help!



 




Re: API for delayed/scheduled interval input source for integrationtests

2018-09-01 Thread Dominik Wosiński
Hey, 
Maybe it would be a good idea to create some kind of test source for DataStream
that allows writing elements to the stream directly, similar to what is done for
sources in reactive libraries. This would make writing tests for Flink jobs a lot
easier.
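
In the meantime, a minimal sketch of a hand-rolled helper in that spirit (a
made-up class, not an existing Flink API) that pushes a fixed list of elements
into a stream with a delay between them, used e.g. as
env.addSource(new DelayedElementsSource<>(Arrays.asList("a", "b", "c"), 100L)):

import java.io.Serializable;
import java.util.List;

import org.apache.flink.streaming.api.functions.source.SourceFunction;

// Emits the given elements one by one, sleeping in between, so an integration
// test can observe time-based behaviour without wiring up Kafka or files.
public class DelayedElementsSource<T extends Serializable> implements SourceFunction<T> {

    private final List<T> elements;   // use a serializable List implementation
    private final long delayMillis;
    private volatile boolean running = true;

    public DelayedElementsSource(List<T> elements, long delayMillis) {
        this.elements = elements;
        this.delayMillis = delayMillis;
    }

    @Override
    public void run(SourceContext<T> ctx) throws Exception {
        for (T element : elements) {
            if (!running) {
                break;
            }
            synchronized (ctx.getCheckpointLock()) {
                ctx.collect(element);
            }
            Thread.sleep(delayMillis);
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}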

Best Regards,
Dom.

Sent from the Mail app for Windows 10

From: Hequn Cheng
Sent: Saturday, 1 September 2018 17:18
To: yee.ni...@gmail.com
Cc: user
Subject: Re: API for delayed/scheduled interval input source for integrationtests

Hi Yee,

Yes, AbstractStreamOperatorTestHarness is a good way to test an operator. As
for the iterator, do you use an IT or a UT? I think Thread.sleep may work
for an IT. If you use a UT, you probably need to set the time yourself,
similar to setProcessingTime in a harness test.

Best, Hequn

On Sat, Sep 1, 2018 at 12:20 PM Yee-Ning Cheng  wrote:
I was able to use the AbstractStreamOperatorTestHarness to write more of a
unit test for windowing operators.  However, I'm still trying to figure out
a way to have a "delayed iterator".  I tried implementing an iterator that
Thread.sleeps for the interval and passed it to the stream as an input, but
that didn't seem to work, plus I was having issues with serialization if I
enabled checkpointing which seemed like a hassle.



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/



Re: Dealing with Not Serializable classes in Java

2018-08-27 Thread Dominik Wosiński
Hey ;)
I have received one response that was sent directly to my email and not to
the user group:

> Hi Dominik,
>
> I think you can put the unserializable fields into RichFunctions and
> initialize them in the `open` method, so the fields won’t need to be
> serialized with the tasks.
>
> Best,
> Paul Lam
>

And my response about RichFunctions was meant for Paul :)

On Mon, 27 Aug 2018 at 10:38, Chesnay Schepler wrote:

> You don't need RichFunctions for that, you should be able to just do:
>
> private static final ObjectMapper objectMapper =  new 
> ObjectMapper().registerModule(new JavaTimeModule());
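
For reference, a slightly fuller sketch of that static-field approach inside a
serialization schema (an illustration only: it assumes Flink 1.4+, where
SerializationSchema lives in org.apache.flink.api.common.serialization, and that
the mapper is configured once here and then used read-only, which is what makes
sharing it across threads safe):

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;

import org.apache.flink.api.common.serialization.SerializationSchema;

public class JsonSerializationSchema<T> implements SerializationSchema<T> {

    // Created once per JVM when the class is loaded on the task manager, so it
    // never has to travel through Java serialization with the job graph.
    private static final ObjectMapper MAPPER =
            new ObjectMapper().registerModule(new JavaTimeModule());

    @Override
    public byte[] serialize(T element) {
        try {
            return MAPPER.writeValueAsBytes(element);
        } catch (Exception e) {
            throw new RuntimeException("Could not serialize " + element, e);
        }
    }
}
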
>
> On 27.08.2018 10:28, Dominik Wosiński wrote:
>
> Hey Paul,
> Yeah that is possible, but I was asking in terms of serialization schema.
> So I would really want to avoid RichFunction :)
>
> Best Regards,
> Dominik.
>
On Mon, 27 Aug 2018 at 10:23, Chesnay Schepler wrote:
>
>> The null check in the method is the general-purpose way of solving it.
>> If the ObjectMapper is thread-safe you could also initialize it as a
>> static field.
>>
>> On 26.08.2018 17:58, Dominik Wosiński wrote:
>>
>> Hey,
>>
>> I was wondering how do You normally deal with fields that contain
>> references that are not serializable. Say, we have a custom serialization
>> schema in Java that needs to serialize *LocalDateTime* field with
>> *ObjectMapper.*  This requires registering specific module for
>> *ObjectMapper* and this makes it not serializable (module contains some
>> references to classes that are not serializable).
>> Now, if You would initialize *ObjectMapper *directly in the field this
>> will cause an exception when deploying the job.
>>
>> Normally I would do :
>>
>> @Override
>> public byte[] serialize(Backup backupMessage) {
>>     if (objectMapper == null) {
>>         objectMapper = new ObjectMapper().registerModule(new JavaTimeModule());
>>     }
>> ...
>> }
>>
>> But I was wondering whether do You have any prettier option of doing
>> this?
>>
>> Thanks,
>> Dominik.
>>
>>
>>
>


Re: Dealing with Not Serializable classes in Java

2018-08-27 Thread Dominik Wosiński
Hey Paul,
Yeah that is possible, but I was asking in terms of serialization schema.
So I would really want to avoid RichFunction :)

Best Regards,
Dominik.

On Mon, 27 Aug 2018 at 10:23, Chesnay Schepler wrote:

> The null check in the method is the general-purpose way of solving it.
> If the ObjectMapper is thread-safe you could also initialize it as a
> static field.
>
> On 26.08.2018 17:58, Dominik Wosiński wrote:
>
> Hey,
>
> I was wondering how do You normally deal with fields that contain
> references that are not serializable. Say, we have a custom serialization
> schema in Java that needs to serialize *LocalDateTime* field with
> *ObjectMapper.*  This requires registering specific module for
> *ObjectMapper* and this makes it not serializable (module contains some
> references to classes that are not serializable).
> Now, if You would initialize *ObjectMapper *directly in the field this
> will cause an exception when deploying the job.
>
> Normally I would do :
>
> @Override
> public byte[] serialize(Backup backupMessage) {
>     if (objectMapper == null) {
>         objectMapper = new ObjectMapper().registerModule(new JavaTimeModule());
>     }
> ...
> }
>
> But I was wondering whether do You have any prettier option of doing this?
>
> Thanks,
> Dominik.
>
>
>


Dealing with Not Serializable classes in Java

2018-08-26 Thread Dominik Wosiński
Hey,

I was wondering how You normally deal with fields that contain
references that are not serializable. Say, we have a custom serialization
schema in Java that needs to serialize a *LocalDateTime* field with
*ObjectMapper*. This requires registering a specific module for
*ObjectMapper*, which makes the schema not serializable (the module contains
some references to classes that are not serializable).
Now, if You initialize the *ObjectMapper* directly in the field, this will
cause an exception when deploying the job.

Normally I would do :

@Override
public byte[] serialize(Backup backupMessage) {
    if (objectMapper == null) {
        objectMapper = new ObjectMapper().registerModule(new JavaTimeModule());
    }
...
}

But I was wondering whether You have any prettier option of doing this?

Thanks,
Dominik.


Fwd: would you join a Slack workspace for Flink?

2018-08-26 Thread Dominik Wosiński
-- Forwarded message -
From: Dominik Wosiński 
Date: Sun, 26 Aug 2018 at 15:12
Subject: Re: would you join a Slack workspace for Flink?
To: Hequn Cheng 


Hey,
I have been facing this issue in multiple open source projects and
discussions. Slack, in my opinion, has two main issues:

 - the already mentioned difficulty of searching it through a search engine

 - Slack is still a commercial application.

The second issue is quite important, because the free version of Slack keeps
only 10k messages of history. I personally think that for Flink this would
mean losing all messages older than, possibly, a week. This is a big issue,
as it would most certainly lead to asking the same questions over and over
again. I’ve seen really big Slack groups for some big projects where the
history would last only 3-4 days, and that is a pure nightmare.

The better solution would be to use Gitter rather than Slack, IMHO, if there
is a need for such a way of communication.

Best Regards,
Dominik.



Sent from the Mail app <https://go.microsoft.com/fwlink/?LinkId=550986> for Windows 10



*From:* Hequn Cheng
*Sent:* Sunday, 26 August 2018 14:37
*To:* Nicos Maris
*Cc:* ches...@apache.org; user
*Subject:* Re: would you join a Slack workspace for Flink?



Hi Nicos,



Thanks for bringing up this discussion. :-)

Slack is a good way to communicate, but it does not seem to be a great fit for
the open source world. The messages on Slack are mixed up and cannot be searched
through a search engine.



Best, Hequn



On Sun, Aug 26, 2018 at 7:22 PM Nicos Maris  wrote:

Chesnay can you take a look at the following PR?



https://github.com/apache/flink-web/pull/120



On Sun, Aug 26, 2018 at 1:09 PM Chesnay Schepler  wrote:

There have been previous discussions around using slack and they were
rejected.

Personally I would just remove the IRC channel; I'm not aware of any
committer actually spending time there.

On 25.08.2018 17:07, Nicos Maris wrote:



Hi all,





This mailing list is for user support and questions. If you would also use
slack for user support and questions, then please vote at the following
ticket. If you don't have an account at that jira, you can reply to this
email with a "+1".





[FLINK-10217 <https://issues.apache.org/jira/browse/FLINK-3862>] use Slack
for user support and questions
 Current status

For user support and questions, users are instructed to subscribe to
user@flink.apache.org but there are users like me who enjoy using also a
chat channel. However, the instructions to do so are not clear and the IRC
activity is low and it is definitely not indicative of the project's
activity
<https://issues.apache.org/jira/browse/FLINK-3862?focusedCommentId=16152376=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16152376>
.

The website <https://flink.apache.org/community.html> mentions that "If you
want to talk with the Flink committers and users in a chat, there is an IRC
channel <https://flink.apache.org/community.html#irc>."
Option 1: Use Slack

An example of an Apache project that is using Slack
<https://tedium.co/2017/10/17/irc-vs-slack-chat-history> is:
http://mesos.apache.org/community

I can assist on setting it up if at least one expert joins from the very
beginning.
Option 2: Keep using IRC and document it

Add the missing section
<https://github.com/apache/flink-web/blob/master/community.md#irc> at the
website along with instructions for people who have never used IRC.
Option 3: Use only the mailing list

Use only user@flink.apache.org for user support and questions and do not
mention IRC at the website.


Re: Flink checkpointing to Google Cloud Storage

2018-08-21 Thread Dominik Wosiński
Hey,
From my perspective, such issues have always meant clashing dependencies in the
case of Flink. Have you checked the full dependency tree to see whether there are
any conflicts there?
Best Regards,
Dominik.


Re: Cluster die when one of the TM killed

2018-08-20 Thread Dominik Wosiński
Hey,
Can You please provide a little more information about your setup and maybe
logs showing when the crash occurs?
Best Regards,
Dominik

2018-08-20 16:23 GMT+02:00 Siew Wai Yow :

> Hi,
>
>
> When one of the task managers is killed, the whole cluster dies. Is this
> something expected? We are using Flink 1.4. Thank you.
>
>
> Regards,
>
> Yow
>


Re: Flink not rolling log files

2018-08-17 Thread Dominik Wosiński
I am using this *log4j.properties *file config for rolling files once per
day and it is working perfectly. Maybe this will give You some hint:

log4j.appender.file=org.apache.log4j.DailyRollingFileAppender
log4j.appender.file.file=${log.file}
log4j.appender.file.append=false
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
log4j.appender.file.DatePattern='.'yyyy-MM-dd

Best Regards,
Dominik.


Re: Flink Jobmanager Failover in HA mode

2018-08-17 Thread Dominik Wosiński
I have faced this issue, but in 1.4.0 IIRC. This seems to be related to
https://issues.apache.org/jira/browse/FLINK-10011. What was the status of
the jobs when the main Job Manager was stopped?

2018-08-17 17:08 GMT+02:00 Helmut Zechmann :

> Hi all,
>
> we have a problem with flink 1.5.2 high availability in standalone mode.
>
> We have two jobmanagers running. When I shut down the main job manager,
> the failover job manager encounters an error during failover.
>
> Logs:
>
>
> 2018-08-17 14:38:16,478 WARN  akka.remote.ReliableDeliverySupervisor
>   - Association with remote system [akka.tcp://
> fl...@seg-1.adjust.com:29095] has failed, address is now gated for [50]
> ms. Reason: [Disassociated]
> 2018-08-17 14:38:31,449 WARN  akka.remote.transport.netty.NettyTransport
>   - Remote connection to [null] failed with
> java.net.ConnectException: Connection refused:
> seg-1.adjust.com/178.162.219.66:29095
> 2018-08-17 14:38:31,451 WARN  akka.remote.ReliableDeliverySupervisor
>   - Association with remote system [akka.tcp://
> fl...@seg-1.adjust.com:29095] has failed, address is now gated for [50]
> ms. Reason: [Association failed with [akka.tcp://flink@seg-1.
> adjust.com:29095]] Caused by: [Connection refused:
> seg-1.adjust.com/178.162.219.66:29095]
> 2018-08-17 14:38:41,379 ERROR org.apache.flink.runtime.rest.
> handler.legacy.files.StaticFileServerHandler  - Could not retrieve the
> redirect address.
> java.util.concurrent.CompletionException: akka.pattern.AskTimeoutException:
> Ask timed out on [Actor[akka.tcp://fl...@seg-1.adjust.com:29095/user/
> dispatcher#-1599908403]] after [1 ms]. Sender[null] sent message of
> type "org.apache.flink.runtime.rpc.messages.RemoteFencedMessage".
> at java.util.concurrent.CompletableFuture.encodeThrowable(
> CompletableFuture.java:292)
> [... shortened ...]
> Caused by: akka.pattern.AskTimeoutException: Ask timed out on
> [Actor[akka.tcp://fl...@seg-1.adjust.com:29095/user/dispatcher#-1599908403]]
> after [1 ms]. Sender[null] sent message of type
> "org.apache.flink.runtime.rpc.messages.RemoteFencedMessage".
> at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(
> AskSupport.scala:604)
> ... 9 more
> 2018-08-17 14:38:48,005 INFO  
> org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint
>   - http://seg-2.adjust.com:8083 was granted leadership with
> leaderSessionID=708d1a64-c353-448b-9101-7eb3f910970e
> 2018-08-17 14:38:48,005 INFO  
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager
> - ResourceManager akka.tcp://flink@seg-2.adjust.
> com:30169/user/resourcemanager was granted leadership with fencing token
> 8de829de14876a367a80d37194b944ee
> 2018-08-17 14:38:48,006 INFO  org.apache.flink.runtime.
> resourcemanager.slotmanager.SlotManager  - Starting the SlotManager.
> 2018-08-17 14:38:48,007 INFO  
> org.apache.flink.runtime.dispatcher.StandaloneDispatcher
> - Dispatcher akka.tcp://fl...@seg-2.adjust.com:30169/user/dispatcher
> was granted leadership with fencing token 684f50f8-327c-47e1-a53c-
> 931c4f4ea3e5
> 2018-08-17 14:38:48,007 INFO  
> org.apache.flink.runtime.dispatcher.StandaloneDispatcher
> - Recovering all persisted jobs.
> 2018-08-17 14:38:48,021 INFO  org.apache.flink.runtime.jobmanager.
> ZooKeeperSubmittedJobGraphStore  - Recovered SubmittedJobGraph(
> b951bbf518bcf6cc031be6d2ccc441bb, null).
> 2018-08-17 14:38:48,028 INFO  org.apache.flink.runtime.jobmanager.
> ZooKeeperSubmittedJobGraphStore  - Recovered SubmittedJobGraph(
> 06ed64f48fa0a7cffde53b99cbaa073f, null).
> 2018-08-17 14:38:48,035 ERROR 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint
>- Fatal error occurred in the cluster entrypoint.
> java.lang.RuntimeException: 
> org.apache.flink.runtime.client.JobExecutionException:
> Could not set up JobManager
> at org.apache.flink.util.ExceptionUtils.rethrow(
> ExceptionUtils.java:199)
> [... shortened ...]
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Could
> not set up JobManager
> at org.apache.flink.runtime.jobmaster.JobManagerRunner.<
> init>(JobManagerRunner.java:176)
> at org.apache.flink.runtime.dispatcher.Dispatcher$
> DefaultJobManagerRunnerFactory.createJobManagerRunner(Dispatcher.java:936)
> at org.apache.flink.runtime.dispatcher.Dispatcher.
> createJobManagerRunner(Dispatcher.java:291)
> at org.apache.flink.runtime.dispatcher.Dispatcher.runJob(
> Dispatcher.java:281)
> at org.apache.flink.util.function.ConsumerWithException.accept(
> ConsumerWithException.java:38)
> ... 21 more
> Caused by: java.lang.Exception: Cannot set up the user code libraries:
> /var/lib/flink/ceph/prod/1.5-batch/ha_state/1.5-batch/blob/job_
> b951bbf518bcf6cc031be6d2ccc441bb/blob_p-a26f62e3bbdcd8884dd18c42a3f6f2
> 02b9d2c6e7-0dc87a56862a1f799d515306ffeddfcb (No such file or directory)
> at 

Re: Initial value of ValueStates of primitive types

2018-08-17 Thread Dominik Wosiński
Hey,

By "default values" I assume you mean what the state holds right after you call:

getRuntimeContext.getState()

If so, the default is a state whose *value()* is null, as described in:

/**
 * Returns the current value for the state. When the state is not
 * partitioned the returned value is the same for all inputs in a given
 * operator instance. If state partitioning is applied, the value returned
 * depends on the current operator input, as the operator maintains an
 * independent state for each partition.
 *
 * If you didn't specify a default value when creating the {@link
ValueStateDescriptor}
 * this will return {@code null} when no value was previously set
using {@link #update(Object)}.
 *
 * @return The state value corresponding to the current input.
 *
 * @throws IOException Thrown if the system cannot access the state.
 */
T value() throws IOException;

For the *MapState* it should be an empty map with no keys present.

The funny thing is that there is an implicit conversion applied to null values
returned by the state. So assume you have defined:

private lazy val *test*: ValueState[Boolean] =
getRuntimeContext.getState(new ValueStateDescriptor[Boolean]("test",
classOf[Boolean]))

If you now do:

print(test.value())

it will indeed print *null*.
But if You do:

val myTest = test.value()
print(myTest)

it will now print *false* instead, because assigning the value to a val of type
Boolean forces the null to be unboxed into the primitive default.
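
To make the null handling explicit, here is a small sketch in Java for
illustration (it assumes the function is applied directly on a keyed stream, so
that keyed state is available):

import org.apache.flink.api.common.functions.RichFilterFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;

// Keeps only the first element per key; shows that value() starts out as null.
public class FirstSeenFilter extends RichFilterFunction<String> {

    private transient ValueState<Boolean> seen;

    @Override
    public void open(Configuration parameters) {
        seen = getRuntimeContext().getState(
                new ValueStateDescriptor<>("seen", Boolean.class));
    }

    @Override
    public boolean filter(String value) throws Exception {
        // value() is null until update() has been called for the current key,
        // so check for null explicitly instead of relying on auto-unboxing.
        boolean seenBefore = Boolean.TRUE.equals(seen.value());
        if (!seenBefore) {
            seen.update(true);
        }
        return !seenBefore;
    }
}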

Best Regards,
Dominik.

2018-08-17 11:13 GMT+02:00 Averell :

> Hi,
>
> In Flink's documents, I couldn't find any example that uses primitive type
> when working with States. What would be the initial value of a ValueState
> of
> type Int/Boolean/...? The same question applies for MapValueState like
> [String, Int]
>
> Thanks and regards,
> Averell
>
>
>
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.
> n4.nabble.com/
>


Re: docker, error NoResourceAvailableException..

2018-08-15 Thread Dominik Wosiński
Hey, 
The problem is that your command does start a Job Manager container, but it does
not start a Task Manager. That is why you have 0 task slots. Currently, the
default numberOfTaskSlots is set to the number of CPUs available on the machine.

So, You generally can do 2 things:

1) Start a Job Manager and 2 Task Managers. If you have Docker Compose available,
you can paste this into your docker-compose.yml:
services:
  jobmanager:
image: ${FLINK_DOCKER_IMAGE_NAME:-flink}
expose:
  - "6123"
ports:
  - "8081:8081"
command: jobmanager
environment:
  - JOB_MANAGER_RPC_ADDRESS=jobmanager

  taskmanager:
image: ${FLINK_DOCKER_IMAGE_NAME:-flink}
expose:
  - "6121"
  - "6122"
depends_on:
  - jobmanager
command: taskmanager
links:
  - "jobmanager:jobmanager"
environment:
  - JOB_MANAGER_RPC_ADDRESS=jobmanager

  taskmanager1:
image: ${FLINK_DOCKER_IMAGE_NAME:-flink}
expose:
  - "6190"
  - "6120"
depends_on:
  - jobmanager
command: taskmanager
links:
  - "jobmanager:jobmanager"
environment:
  - JOB_MANAGER_RPC_ADDRESS=jobmanager

This will give you 1 Job Manager and 2 Task Managers with one task slot each,
so 2 task slots in total.

2) You can deploy 1 Job Manager and 1 Task Manager. Then you need to modify
flink-conf.yaml and set the following setting:

taskmanager.numberOfTaskSlots: 2

This will give you 2 task slots with only 1 Task Manager. But you will need to
somehow override the config in the container, possibly using:
https://docs.docker.com/storage/volumes/

Regards,
Dominik.
From: shyla deshpande
Sent: Wednesday, 15 August 2018 07:23
To: user
Subject: docker, error NoResourceAvailableException..

Hello all,

Trying to use docker as a single node flink cluster.

docker run --name flink_local -p 8081:8081 -t flink local

I submitted a job to the cluster using the Web UI. The job failed. I see this
error message in the docker logs.

org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
Could not allocate all requires slots within timeout of 30 ms. Slots 
required: 2, slots allocated: 0

The Web UI shows 0 task managers and 0 task slots on the Flink dashboard.
How do I start the Docker container with 2 task slots?

Appreciate any help.

Thanks



Re: How to set log level using Flink Docker Image

2018-06-21 Thread Dominik Wosiński
You can, for example, mount the *conf* directory using Docker volumes.

You would need to have a *logback.xml* and then mount it as *conf/logback.xml*
inside the image. Locally You could do this using *docker-compose.yml*; for
mounting volumes in Kubernetes, refer to this page:
https://kubernetes.io/docs/concepts/storage/volumes/

Regards.

On Thu, 21 Jun 2018 at 18:42, Guilherme Nobre <guilhermenob...@gmail.com> wrote:

> Hi everyone,
>
> I have a Flink Cluster created from Flink's Docker Image. I'm trying to
> set log level to DEBUG but couldn't find how. I know where logback
> configuration files are as per documentation:
>
> "The conf directory contains a logback.xml file which can be modified and
> is used if Flink is started outside of an IDE."
>
> However I'm not sure how to do that for a running Flink Cluster on
> Kubernetes.
>
> Thanks,
> G
>
>


Blob Server Removes Failed Jobs Immediately

2018-06-20 Thread Dominik Wosiński
Hello,

I'm not sure whether the problem is connected with bad configuration or with
some inconsistency in the documentation, but according to this document:

https://cwiki.apache.org/confluence/display/FLINK/FLIP-19%3A+Improved+BLOB+storage+architecture
*if a job fails, all non-HA files' refCounts are reset to 0; all HA files'
refCounts remain and will not be increased again on recovery.* But in the
JobManager's code, if the job status changes to failed and the JobManager
receives the message with that fact, it will send a *RemoveJob* message to
itself, which invokes the *removeJob()* function, which always invokes the
following calls:

libraryCacheManager.unregisterJob(jobID)
blobServer.cleanupJob(jobID, removeJobFromStateBackend)

jobManagerMetricGroup.removeJob(jobID)

As far as I understand, this removes the blob entries immediately. According to
the doc, it should only freeze refCounts for HA files and reset refCounts for
non-HA files to allow their later removal.
Is the doc right and have I missed something here?
Thanks in advance.