Re: Setting app Flink logger

2020-03-10 Thread miki haiat
Which image are you using?

On Tue, Mar 10, 2020, 16:27 Eyal Pe'er  wrote:

> Hi Rafi,
>
> The file exists (and it is the file from the official image, please see
> below).
>
> The user is root and it has permissions. I am running in HA mode using
> Docker.
>
>
>
> cat /opt/flink/conf/log4j-console.properties
>
>
>
>
> 
>
> #  Licensed to the Apache Software Foundation (ASF) under one
>
> #  or more contributor license agreements.  See the NOTICE file
>
> #  distributed with this work for additional information
>
> #  regarding copyright ownership.  The ASF licenses this file
>
> #  to you under the Apache License, Version 2.0 (the
>
> #  "License"); you may not use this file except in compliance
>
> #  with the License.  You may obtain a copy of the License at
>
> #
>
> #  http://www.apache.org/licenses/LICENSE-2.0
>
> #
>
> #  Unless required by applicable law or agreed to in writing, software
>
> #  distributed under the License is distributed on an "AS IS" BASIS,
>
> #  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>
> #  See the License for the specific language governing permissions and
>
> # limitations under the License.
>
>
> 
>
>
>
> # This affects logging for both user code and Flink
>
> rootLogger.level = INFO
>
> rootLogger.appenderRef.console.ref = ConsoleAppender
>
>
>
> # Uncomment this if you want to _only_ change Flink's logging
>
> #log4j.logger.org.apache.flink=INFO
>
>
>
> # The following lines keep the log level of common libraries/connectors on
>
> # log level INFO. The root logger does not override this. You have to manually
>
> # change the log levels here.
>
> logger.akka.name = akka
>
> logger.akka.level = INFO
>
> logger.kafka.name = org.apache.kafka
>
> logger.kafka.level = INFO
>
> logger.hadoop.name = org.apache.hadoop
>
> logger.hadoop.level = INFO
>
> logger.zookeeper.name = org.apache.zookeeper
>
> logger.zookeeper.level = INFO
>
>
>
> # Log all infos to the console
>
> appender.console.name = ConsoleAppender
>
> appender.console.type = CONSOLE
>
> appender.console.layout.type = PatternLayout
>
> appender.console.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
>
>
>
> # Suppress the irrelevant (wrong) warnings from the Netty channel handler
>
> logger.netty.name =
> org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline
>
> logger.netty.level = OFF
>
>
>
> Best regards
>
> Eyal Peer / Data Platform Developer
>
>
>
>
> *From:* Rafi Aroch 
> *Sent:* Tuesday, March 10, 2020 3:55 PM
> *To:* Eyal Pe'er 
> *Cc:* user ; StartApp R Data Platform <
> startapprnd...@startapp.com>
> *Subject:* Re: Setting app Flink logger
>
>
>
> Hi Eyal,
>
>
>
> Sounds trivial, but can you verify that the file actually exists in
> /opt/flink/conf/log4j-console.properties? Also, verify that the user
> running the process has read permissions to that file.
>
> You said you use Flink in YARN mode, but in the example above you run
> inside a Docker image, so this is a bit confusing. Note that the official
> Docker images run under the "flink" user and group.
>
>
>
> If you wish to try Logback instead, you can place your logback.xml file in
> your project's resources folder so that it is included in the classpath.
> It should be detected automatically on startup.
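For illustration only, a minimal logback.xml sketch that could live under src/main/resources might look like this (the appender name and pattern are placeholder choices, not the exact file referred to above):

<configuration>
  <appender name="console" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} %-5level %logger{60} - %msg%n</pattern>
    </encoder>
  </appender>
  <root level="INFO">
    <appender-ref ref="console"/>
  </root>
</configuration>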
>
>
>
> Hope this helps,
>
> Rafi
>
>
>
>
>
> On Tue, Mar 10, 2020 at 1:42 PM Eyal Pe'er  wrote:
>
> Hi,
>
> I am running Flink in YARN mode using the official image with a few
> additional files.
>
> I’ve noticed that my logger failed to initialize:
>
>
>
> root:~# docker logs flink-task-manager
>
> Starting taskexecutor as a console application on host ***.
>
> log4j:WARN No appenders could be found for logger
> (org.apache.flink.runtime.taskexecutor.TaskManagerRunner).
>
> log4j:WARN Please initialize the log4j system properly.
>
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
> more info.
>
>
>
> I followed the documentation
> 
> and seems like all related configuration files exist.
>
> Currently, I am using the default files from the official image
> https://github.com/apache/flink/tree/master/flink-dist/src/main/flink-bin/conf
>
>
>
> In addition, seems like the process got the right parameters:
>
> root 21892 21866  1 08:29 ?00:02:06
> /usr/local/openjdk-8/bin/java -XX:+UseG1GC
> -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties
> -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml
> -classpath
> 

Re: Single stream, two sinks

2020-03-01 Thread miki haiat
So you have a RabbitMQ source and an HTTP sink?
If so, you can use a side output in order to also write your data to the DB.

https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/side_output.html
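For illustration only (the source and sink variables below are assumptions, not something defined in this thread), a rough Java sketch of the side-output pattern could look like this:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class AuditSplit {

    // Side-output tag carrying the audit copy of every record that is sent out.
    static final OutputTag<String> AUDIT = new OutputTag<String>("audit") {};

    static void wire(DataStream<String> fromRabbit,
                     SinkFunction<String> httpSink,      // assumed: a sink built from the HTTP PR
                     SinkFunction<String> auditDbSink) { // assumed: e.g. a JDBC sink for the audit DB
        SingleOutputStreamOperator<String> main = fromRabbit
            .process(new ProcessFunction<String, String>() {
                @Override
                public void processElement(String value, Context ctx, Collector<String> out) {
                    out.collect(value);        // normal path -> HTTP
                    ctx.output(AUDIT, value);  // duplicate -> audit DB
                }
            });

        main.addSink(httpSink);
        main.getSideOutput(AUDIT).addSink(auditDbSink);
    }
}

An alternative is to do the DB insert inside the HTTP sink after a successful request, but the side output keeps the two concerns in separate operators.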

On Sat, Feb 29, 2020, 23:01 Gadi Katsovich  wrote:

> Hi,
> I'm new to Flink and am evaluating it to replace our existing streaming
> application.
> The use case I'm working on is reading messages from a RabbitMQ queue,
> applying some transformation and filtering logic, and sending the result via
> HTTP to a third party.
> A must-have requirement of this flow is to write the data that was sent
> to an SQL DB, for audit and troubleshooting purposes.
> I'm currently basing my HTTP solution on a PR with needed adjustments:
> https://github.com/apache/flink/pull/5866/files
> How can I add an insertion to a DB after a successful HTTP request?
> Thank you.
>


Re: Flink State Migration Version 1.8.2

2019-10-16 Thread miki haiat
Can you try to add the new variables as Option fields?


On Wed, Oct 16, 2019, 17:17 ApoorvK  wrote:

> I have been trying to alter the current state case class (Scala), which has
> 250 fields. When I add 10 more fields to the class and run my Flink
> application from a savepoint taken before the change (some of the fields are
> objects which are also maintained as state), it fails to migrate the state
> with the error: "The new state typeSerializer for operator state must not be
> incompatible."
>
> Please suggest what I can do to avoid this error.
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>


Re: Reading Key-Value from Kafka | Eviction policy.

2019-09-26 Thread miki haiat
I'm sure there are several ways to implement it. Can you elaborate more on
your use case?

On Fri, Sep 27, 2019, 08:37 srikanth flink  wrote:

> Hi,
>
> My data source is Kafka; so far I have been reading the values from the
> Kafka stream into a table. The table just grows and runs into a heap issue.
>
> I came across the eviction policy, which works only on keys, right?
>
> I have researched how to configure the environment file (Flink SQL) to read
> both key and value, so that eviction works on the keys and older data is
> cleared. I found nothing in the docs so far.
>
> Could someone help with that?
> If there's no support for reading key and value, can someone help me
> assign a key to the table I'm building from the stream?
>
> Thanks
> Srikanth
>


Re: Flink- Heap Space running out

2019-09-26 Thread miki haiat
You can configure the task manager memory in the flink-conf.yaml file.
What is the current configuration?
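For reference, a sketch of the relevant keys (names per Flink 1.5+; older releases used taskmanager.heap.mb, and the values here are only examples):

taskmanager.heap.size: 4096m
taskmanager.numberOfTaskSlots: 4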

On Thu, Sep 26, 2019, 17:14 Nishant Gupta 
wrote:

> I am running a query to join a stream and a table, as below. It is running
> out of heap space, even though the Flink cluster has plenty of heap
> (60 GB * 3).
>
> Is an eviction strategy needed for this query?
>
> SELECT sourceKafka.* FROM sourceKafka INNER JOIN DefaulterTable ON
> sourceKafka.CC=DefaulterTable.CC;
>
> Thanks
>
> Nishant
>


Re: No field appear in Time Field name in Kibana

2019-09-05 Thread miki haiat
You need to define a date or time type in your Elasticsearch index mapping.
It's not a Flink issue.

On Wed, Sep 4, 2019 at 3:02 PM alaa  wrote:

> I tried to run this application, but there was a problem when configuring an
> index pattern.
> No field appears under "Time field name" in Kibana when I set the index
> name.
> How can I solve this problem?
>
> <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t1965/Screenshot_from_2019-09-04_13-49-17.png>
>
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>


Re: timeout error while connecting to Kafka

2019-08-25 Thread miki haiat
I'm trying to understand: did you submit your jar through the Flink web UI,
and then you got the timeout error?

On Sun, Aug 25, 2019, 16:10 Eyal Pe'er  wrote:

> What do you mean by “remote cluster”?
>
> I tried to run the dockerized Flink version (
> https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/docker.html)
> on a remote machine and to submit a job that is supposed to communicate with
> Kafka, but I still cannot access the topic.
>
>
>
>
>
> Best regards
>
> Eyal Peer / Data Platform Developer
>
>
>
> *From:* miki haiat 
> *Sent:* Sunday, August 25, 2019 3:50 PM
> *To:* Eyal Pe'er 
> *Cc:* user@flink.apache.org
> *Subject:* Re: timeout error while connecting to Kafka
>
>
>
> Did you try to submit it to a remote cluster?
>
>
>
>
>
> On Sun, Aug 25, 2019 at 2:55 PM Eyal Pe'er  wrote:
>
> BTW, the exception that I see in the log is: ERROR
> org.apache.flink.runtime.rest.handler.job.JobDetailsHandler   - Exception
> occurred in REST handler…
>
> Best regards
>
> Eyal Peer / Data Platform Developer
>
>
>
> *From:* Eyal Pe'er 
> *Sent:* Sunday, August 25, 2019 2:20 PM
> *To:* miki haiat 
> *Cc:* user@flink.apache.org
> *Subject:* RE: timeout error while connecting to Kafka
>
>
>
> Hi,
>
> I removed that dependency, but it still fails.
>
> The reason why I used Flink 1.5.0 is that I followed a tutorial which
> used it (https://www.baeldung.com/kafka-flink-data-pipeline).
>
> If needed, I can change it.
>
>
>
> I'm not sure, but maybe in order to consume events from Kafka 0.9 I need
> to connect to ZooKeeper instead of the bootstrap servers?
>
> I know that in Spark streaming we consume via zookeeper
> ("zookeeper.connect").
>
> I saw that in the Apache Flink Kafka connector, zookeeper.connect is only
> required for Kafka 0.8, but maybe I still need to use it?
>
> Best regards
>
> Eyal Peer / Data Platform Developer
>
>
>
> *From:* miki haiat 
> *Sent:* Thursday, August 22, 2019 2:29 PM
> *To:* Eyal Pe'er 
> *Cc:* user@flink.apache.org
> *Subject:* Re: timeout error while connecting to Kafka
>
>
>
> Can you try to remove this from your pom file?
>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-connector-kafka_2.11</artifactId>
>     <version>1.7.0</version>
> </dependency>
>
>
>
>
>
> Is there any reason why you are using Flink 1.5 and not the latest release?
>
>
>
>
>
> Best,
>
>
> Miki
>
>
>
> On Thu, Aug 22, 2019 at 2:19 PM Eyal Pe'er  wrote:
>
> Hi Miki,
>
> First, I would like to thank you for the fast response.
>
> I rechecked Kafka and it is up and running fine.
>
> I’m still getting the same error (Timeout expired while fetching topic
> metadata).
>
> Maybe my Flink version is wrong (Kafka version is 0.9)?
>
>
>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-core</artifactId>
>     <version>1.5.0</version>
> </dependency>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-connector-kafka-0.11_2.11</artifactId>
>     <version>1.5.0</version>
> </dependency>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-streaming-java_2.11</artifactId>
>     <version>1.5.0</version>
> </dependency>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-java</artifactId>
>     <version>1.5.0</version>
> </dependency>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-clients_2.10</artifactId>
>     <version>1.1.4</version>
> </dependency>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-connector-kafka_2.11</artifactId>
>     <version>1.7.0</version>
> </dependency>
>
>
>
>
>
> Best regards
>
> Eyal Peer / Data Platform Developer
>
>
>
> *From:* miki haiat 
> *Sent:* Thursday, August 22, 2019 11:03 AM
> *To:* Eyal Pe'er 
> *Cc:* user@flink.apache.org
> *Subject:* Re: timeout error while connecting to Kafka
>
>
>
> Can you double-check that the Kafka instance is up?
> The code looks fine.
>
>
>
>
>
> Best,
>
>
>
> Miki
>
>
>
> On Thu, Aug 22, 2019 at 10:45 AM Eyal Pe'er 
> wrote:
>
> Hi,
>
> I'm trying to consume events using Apache Flink.
>
> The code is very basic: connect to the topic, split words by space,
> and print them to the console. The Kafka version is 0.9.
>
> import org.apache.flink.api.common.functions.FlatMapFunction;
>
> import org.apache.flink.api.common.serialization.SimpleStringSchema;
>
>
>
> import org.apache.flink.streaming.api.datastream.DataStream;
>
> import org.apache.flink.streaming.api.environment.
>

Re: timeout error while connecting to Kafka

2019-08-25 Thread miki haiat
Did you try to submit it to a remote cluster?


On Sun, Aug 25, 2019 at 2:55 PM Eyal Pe'er  wrote:

> BTW, the exception that I see in the log is: ERROR
> org.apache.flink.runtime.rest.handler.job.JobDetailsHandler   - Exception
> occurred in REST handler…
>
> Best regards
>
> Eyal Peer / Data Platform Developer
>
>
>
> *From:* Eyal Pe'er 
> *Sent:* Sunday, August 25, 2019 2:20 PM
> *To:* miki haiat 
> *Cc:* user@flink.apache.org
> *Subject:* RE: timeout error while connecting to Kafka
>
>
>
> Hi,
>
> I removed that dependency, but it still fails.
>
> The reason why I used Flink 1.5.0 is that I followed a tutorial which
> used it (https://www.baeldung.com/kafka-flink-data-pipeline).
>
> If needed, I can change it.
>
>
>
> I'm not sure, but maybe in order to consume events from Kafka 0.9 I need
> to connect to ZooKeeper instead of the bootstrap servers?
>
> I know that in Spark streaming we consume via zookeeper
> ("zookeeper.connect").
>
> I saw that in the Apache Flink Kafka connector, zookeeper.connect is only
> required for Kafka 0.8, but maybe I still need to use it?
>
> Best regards
>
> Eyal Peer / Data Platform Developer
>
>
>
> *From:* miki haiat 
> *Sent:* Thursday, August 22, 2019 2:29 PM
> *To:* Eyal Pe'er 
> *Cc:* user@flink.apache.org
> *Subject:* Re: timeout error while connecting to Kafka
>
>
>
> Can you try to remove this from your pom file?
>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-connector-kafka_2.11</artifactId>
>     <version>1.7.0</version>
> </dependency>
>
>
>
>
>
> Is there any reason why you are using Flink 1.5 and not the latest release?
>
>
>
>
>
> Best,
>
>
> Miki
>
>
>
> On Thu, Aug 22, 2019 at 2:19 PM Eyal Pe'er  wrote:
>
> Hi Miki,
>
> First, I would like to thank you for the fast response.
>
> I rechecked Kafka and it is up and running fine.
>
> I’m still getting the same error (Timeout expired while fetching topic
> metadata).
>
> Maybe my Flink version is wrong (Kafka version is 0.9)?
>
>
>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-core</artifactId>
>     <version>1.5.0</version>
> </dependency>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-connector-kafka-0.11_2.11</artifactId>
>     <version>1.5.0</version>
> </dependency>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-streaming-java_2.11</artifactId>
>     <version>1.5.0</version>
> </dependency>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-java</artifactId>
>     <version>1.5.0</version>
> </dependency>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-clients_2.10</artifactId>
>     <version>1.1.4</version>
> </dependency>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-connector-kafka_2.11</artifactId>
>     <version>1.7.0</version>
> </dependency>
>
>
>
>
>
> Best regards
>
> Eyal Peer / Data Platform Developer
>
>
>
> *From:* miki haiat 
> *Sent:* Thursday, August 22, 2019 11:03 AM
> *To:* Eyal Pe'er 
> *Cc:* user@flink.apache.org
> *Subject:* Re: timeout error while connecting to Kafka
>
>
>
> Can you double-check that the Kafka instance is up?
> The code looks fine.
>
>
>
>
>
> Best,
>
>
>
> Miki
>
>
>
> On Thu, Aug 22, 2019 at 10:45 AM Eyal Pe'er 
> wrote:
>
> Hi,
>
> I'm trying to consume events using Apache Flink.
>
> The code is very basic: connect to the topic, split words by space,
> and print them to the console. The Kafka version is 0.9.
>
> import org.apache.flink.api.common.functions.FlatMapFunction;
>
> import org.apache.flink.api.common.serialization.SimpleStringSchema;
>
>
>
> import org.apache.flink.streaming.api.datastream.DataStream;
>
> import org.apache.flink.streaming.api.environment.
> StreamExecutionEnvironment;
>
> import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
>
> import org.apache.flink.util.Collector;
>
> import java.util.Properties;
>
>
>
> public class KafkaStreaming {
>
>
>
> public static void main(String[] args) throws Exception {
>
> final StreamExecutionEnvironment env = StreamExecutionEnvironment
> .getExecutionEnvironment();
>
>
>
> Properties props = new Properties();
>
> props.setProperty("bootstrap.servers", "kafka servers:9092...");
>
> props.setProperty("group.id", "flinkPOC");
>
> FlinkKafkaConsumer09<String> consumer = new FlinkKafkaConsumer09<>(
> "topic", new SimpleStringSchema(), props);
>
>
>
> 

Re: timeout error while connecting to Kafka

2019-08-22 Thread miki haiat
Can you try to remove this from your pom file?

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka_2.11</artifactId>
    <version>1.7.0</version>
</dependency>



Is there any reason why you are using Flink 1.5 and not the latest release?



Best,


Miki

On Thu, Aug 22, 2019 at 2:19 PM Eyal Pe'er  wrote:

> Hi Miki,
>
> First, I would like to thank you for the fast response.
>
> I rechecked Kafka and it is up and running fine.
>
> I’m still getting the same error (Timeout expired while fetching topic
> metadata).
>
> Maybe my Flink version is wrong (Kafka version is 0.9)?
>
>
>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-core</artifactId>
>     <version>1.5.0</version>
> </dependency>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-connector-kafka-0.11_2.11</artifactId>
>     <version>1.5.0</version>
> </dependency>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-streaming-java_2.11</artifactId>
>     <version>1.5.0</version>
> </dependency>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-java</artifactId>
>     <version>1.5.0</version>
> </dependency>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-clients_2.10</artifactId>
>     <version>1.1.4</version>
> </dependency>
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-connector-kafka_2.11</artifactId>
>     <version>1.7.0</version>
> </dependency>
>
>
>
>
>
> Best regards
>
> Eyal Peer / Data Platform Developer
>
>
>
> *From:* miki haiat 
> *Sent:* Thursday, August 22, 2019 11:03 AM
> *To:* Eyal Pe'er 
> *Cc:* user@flink.apache.org
> *Subject:* Re: timeout error while connecting to Kafka
>
>
>
> Can you double-check that the Kafka instance is up?
> The code looks fine.
>
>
>
>
>
> Best,
>
>
>
> Miki
>
>
>
> On Thu, Aug 22, 2019 at 10:45 AM Eyal Pe'er 
> wrote:
>
> Hi,
>
> I'm trying to consume events using Apache Flink.
>
> The code is very basic: connect to the topic, split words by space,
> and print them to the console. The Kafka version is 0.9.
>
> import org.apache.flink.api.common.functions.FlatMapFunction;
>
> import org.apache.flink.api.common.serialization.SimpleStringSchema;
>
>
>
> import org.apache.flink.streaming.api.datastream.DataStream;
>
> import org.apache.flink.streaming.api.environment.
> StreamExecutionEnvironment;
>
> import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
>
> import org.apache.flink.util.Collector;
>
> import java.util.Properties;
>
>
>
> public class KafkaStreaming {
>
>
>
> public static void main(String[] args) throws Exception {
>
> final StreamExecutionEnvironment env = StreamExecutionEnvironment
> .getExecutionEnvironment();
>
>
>
> Properties props = new Properties();
>
> props.setProperty("bootstrap.servers", "kafka servers:9092...");
>
> props.setProperty("group.id", "flinkPOC");
>
> FlinkKafkaConsumer09<String> consumer = new FlinkKafkaConsumer09<>(
> "topic", new SimpleStringSchema(), props);
>
>
>
> DataStream<String> dataStream = env.addSource(consumer);
>
>
>
> DataStream<String> wordDataStream = dataStream.flatMap(new Splitter());
>
> wordDataStream.print();
>
> env.execute("Word Split");
>
>
>
> }
>
>
>
> public static class Splitter implements FlatMapFunction<String, String> {
>
>
>
> public void flatMap(String sentence, Collector<String> out) throws
> Exception {
>
>
>
> for (String word : sentence.split(" ")) {
>
> out.collect(word);
>
> }
>
> }
>
>
>
> }
>
> }
>
>
>
> The app does not print anything to the screen (although I produced events
> to Kafka).
>
> I tried to skip the Splitter FlatMap function, but still nothing happens.
> SSL or any kind of authentication is not required from Kafka.
>
> This is the error that I found in the logs:
>
> 2019-08-20 14:36:17,654 INFO  org.apache.flink.runtime.executiongraph.
> ExecutionGraph- Source: Custom Source -> Flat Map -> Sink: Print
> to Std. Out (1/1) (02258a2cafab83afbc0f5650c088da2b) switched from
> RUNNING to FAILED.
>
> org.apache.kafka.common.errors.TimeoutException: Timeout expired while
> fetching topic metadata
>
>
>
> The Kafka topic has only one partition, so the topic metadata is supposed
> to be very basic.
>
> I ran Kafka and Flink locally in order to eliminate network-related
> issues, but the issue persists. So my assumption is that I'm doing
> something wrong…
>
> Did you encounter such an issue? Does someone have different code for
> consuming Kafka events?
>
>
>
> Best regards
>
> Eyal Peer / Data Platform Developer
>
>
>
>


Re: timeout error while connecting to Kafka

2019-08-22 Thread miki haiat
Can you double-check that the Kafka instance is up?
The code looks fine.


Best,

Miki

On Thu, Aug 22, 2019 at 10:45 AM Eyal Pe'er  wrote:

> Hi,
>
> I'm trying to consume events using Apache Flink.
>
> The code is very basic: connect to the topic, split words by space,
> and print them to the console. The Kafka version is 0.9.
>
> import org.apache.flink.api.common.functions.FlatMapFunction;
>
> import org.apache.flink.api.common.serialization.SimpleStringSchema;
>
>
>
> import org.apache.flink.streaming.api.datastream.DataStream;
>
> import org.apache.flink.streaming.api.environment.
> StreamExecutionEnvironment;
>
> import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
>
> import org.apache.flink.util.Collector;
>
> import java.util.Properties;
>
>
>
> public class KafkaStreaming {
>
>
>
> public static void main(String[] args) throws Exception {
>
> final StreamExecutionEnvironment env = StreamExecutionEnvironment
> .getExecutionEnvironment();
>
>
>
> Properties props = new Properties();
>
> props.setProperty("bootstrap.servers", "kafka servers:9092...");
>
> props.setProperty("group.id", "flinkPOC");
>
> FlinkKafkaConsumer09<String> consumer = new FlinkKafkaConsumer09<>(
> "topic", new SimpleStringSchema(), props);
>
>
>
> DataStream<String> dataStream = env.addSource(consumer);
>
>
>
> DataStream<String> wordDataStream = dataStream.flatMap(new Splitter());
>
> wordDataStream.print();
>
> env.execute("Word Split");
>
>
>
> }
>
>
>
> public static class Splitter implements FlatMapFunction<String, String> {
>
>
>
> public void flatMap(String sentence, Collector<String> out) throws
> Exception {
>
>
>
> for (String word : sentence.split(" ")) {
>
> out.collect(word);
>
> }
>
> }
>
>
>
> }
>
> }
>
>
>
> The app does not print anything to the screen (although I produced events
> to Kafka).
>
> I tried to skip the Splitter FlatMap function, but still nothing happens.
> SSL or any kind of authentication is not required from Kafka.
>
> This is the error that I found in the logs:
>
> 2019-08-20 14:36:17,654 INFO  org.apache.flink.runtime.executiongraph.
> ExecutionGraph- Source: Custom Source -> Flat Map -> Sink: Print
> to Std. Out (1/1) (02258a2cafab83afbc0f5650c088da2b) switched from
> RUNNING to FAILED.
>
> org.apache.kafka.common.errors.TimeoutException: Timeout expired while
> fetching topic metadata
>
>
>
> The Kafka topic has only one partition, so the topic metadata is supposed
> to be very basic.
>
> I ran Kafka and Flink locally in order to eliminate network-related
> issues, but the issue persists. So my assumption is that I'm doing
> something wrong…
>
> Did you encounter such an issue? Does someone have different code for
> consuming Kafka events?
>
>
>
> Best regards
>
> Eyal Peer / Data Platform Developer
>
>
>


Re: Recovery from job manager crash using check points

2019-08-19 Thread miki haiat
Which kind of deployment system are you using:
standalone, YARN, or something else?

On Mon, Aug 19, 2019, 18:28  wrote:

> Hi,
>
>
>
> I can use checkpoints to recover Flink state when a task manager crashes.
>
>
>
> I cannot use checkpoints to recover Flink state when a job manager
> crashes.
>
>
>
> Do I need to set up ZooKeeper to keep the state when a job manager
> crashes?
>
>
>
> Regards
>
>
>
> Min
>
>
>


Re: Capping RocksDb memory usage

2019-08-07 Thread miki haiat
I think using the metrics exporter is the easiest way.

[1]
https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#rocksdb


On Wed, Aug 7, 2019, 20:28 Cam Mach  wrote:

> Hello everyone,
>
> What is the easiest and most efficient way to cap RocksDB's memory usage?
>
> Thanks,
> Cam
>
>


Re: Does RocksDBStateBackend need a separate RocksDB service?

2019-08-07 Thread miki haiat
There is no need to add an external RocksDB instance.
The RocksDBStateBackend holds in-flight data in a RocksDB database that is
(per default) stored in the TaskManager data directories.

[1]
https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/state_backends.html#the-rocksdbstatebackend



On Wed, Aug 7, 2019 at 1:25 PM wangl...@geekplus.com.cn <
wangl...@geekplus.com.cn> wrote:

>
> In my code, I just set the state backend with an HDFS directory:
> env.setStateBackend(new
> RocksDBStateBackend("hdfs://user/test/job"));
>
> Is there an embedded RocksDB service in the Flink task?
>
> --
> wangl...@geekplus.com.cn
>


[bug ?] PrometheusPushGatewayReporter register more then one JM

2019-08-06 Thread miki haiat
We have a standalone cluster with a PrometheusPushGatewayReporter
configuration.
It seems we can't register more than one JM with Prometheus because of
naming uniqueness.

 WARN  org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter  -
There was a problem registering metric numRunningJobs.
java.lang.IllegalArgumentException: Collector already registered that
provides name: flink_jobmanager_numRunningJobs


Re: From Kafka Stream to Flink

2019-07-21 Thread miki haiat
Can you elaborate more about your use case?


On Sat, Jul 20, 2019 at 1:04 AM Maatary Okouya 
wrote:

> Hi,
>
> I have been a user of Kafka Streams so far. However, I have been faced
> with several limitations, in particular in performing joins on KTables.
>
> I was wondering what the approach in Flink is to achieve (1) the concept
> of a KTable, i.e. a table that represents a changelog, i.e. only the latest
> version of all keyed records, and (2) joining those.
>
> There are currently a lot of limitations around that in Kafka Streams, and I
> need that for performing an ETL process, where I need to mirror entire
> databases in Kafka and then do some joins on the tables to emit the logical
> entities to Kafka topics. I was hoping that somehow I could achieve that by
> using Flink as an intermediary.
>
> I can see that you support many kinds of joins, but I just don't see the
> notion of a KTable.
>
>
>


Re: failed checkpoint with metadata timeout exception

2019-07-18 Thread miki haiat
Can you check in the Kafka logs what happens when you add the new brokers?




On Thu, Jul 18, 2019 at 4:36 PM Yitzchak Lieberman <
yitzch...@sentinelone.com> wrote:

> org.apache.kafka.common.errors.TimeoutException: Failed to update metadata 
> after 6 ms.
>
>
>
>
> On Thu, Jul 18, 2019 at 3:49 PM miki haiat  wrote:
>
>> Can you share your logs
>>
>>
>> On Thu, Jul 18, 2019 at 3:22 PM Yitzchak Lieberman <
>> yitzch...@sentinelone.com> wrote:
>>
>>> Hi.
>>>
>>> I have a Flink application that produces to Kafka with 3 brokers.
>>> When I add 2 brokers that are not up yet, the checkpoint (a key
>>> in S3) fails due to a timeout error.
>>>
>>> Do you know what can cause that?
>>>
>>> Thanks,
>>> Yitzchak.
>>>
>>


Re: failed checkpoint with metadata timeout exception

2019-07-18 Thread miki haiat
Can you share your logs


On Thu, Jul 18, 2019 at 3:22 PM Yitzchak Lieberman <
yitzch...@sentinelone.com> wrote:

> Hi.
>
> I have a Flink application that produces to Kafka with 3 brokers.
> When I add 2 brokers that are not up yet, the checkpoint (a key in
> S3) fails due to a timeout error.
>
> Do you know what can cause that?
>
> Thanks,
> Yitzchak.
>


Re: Flink and CDC

2019-07-18 Thread miki haiat
I am actually thinking about this option as well.
I'm assuming that the correct way to implement it is to integrate Debezium
embedded into a source function?



[1] https://github.com/debezium/debezium/tree/master/debezium-embedded
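A very loose sketch of that shape in Java follows; the Debezium engine wiring itself is only indicated in comments, because the exact embedded-engine builder API depends on the Debezium version:

import org.apache.flink.streaming.api.functions.source.RichSourceFunction;

// Skeleton of a source that would host Debezium's embedded engine and emit
// change events (e.g. as JSON strings) into the stream.
public class DebeziumCdcSource extends RichSourceFunction<String> {

    private volatile boolean running = true;

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        // 1. Build the Debezium embedded engine with the MySQL connector config.
        // 2. Register a consumer callback that, for every change record,
        //    synchronizes on ctx.getCheckpointLock() and calls ctx.collect(json).
        // 3. Run the engine (it blocks) until cancel() flips 'running'.
        while (running) {
            Thread.sleep(100); // placeholder; the engine would block here instead
        }
    }

    @Override
    public void cancel() {
        running = false;
        // Also shut the embedded engine down here.
    }
}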


On Wed, Jul 17, 2019 at 7:08 PM Flavio Pompermaier 
wrote:

> Hi to all,
> I'd like to know whether there is an example of how to leverage Debezium as
> a CDC source to feed a Flink Table (from MySQL, for example).
>
> Best,
> Flavio
>


Re: Running Flink cluster via Marathon

2019-07-14 Thread miki haiat
How many job managers did you configure?




On Mon, Jul 15, 2019, 07:11 Marzieh Ghasemy  wrote:

> Hello, I have a Mesos cluster of two masters and three slaves; I configured
> Marathon and ZooKeeper. My ZooKeeper cluster has five nodes. When I run the
> Flink JSON file via Marathon, it runs, but I can see the Flink UI on just one
> slave. The other slaves show me this error: Service temporarily unavailable
> due to an ongoing leader election. Please refresh.
>
> I have searched a lot but cannot find a solution. I know it is a ZooKeeper
> problem, so I changed the default ZooKeeper port, but even that did not help.
>
>
> Would you please tell me how to solve the issue?
>
>
> Thank you in advance.
>
>


Re: Read file from S3 and write to kafka

2019-06-04 Thread miki haiat
You can use the DataSet API to parse files from S3.

https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/batch/#data-sources
https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/aws.html#s3-simple-storage-service


And then parse it and send it to Kafka.
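As a rough sketch (the bucket path, topic and broker names are placeholders; this uses the streaming API plus the Kafka 0.10 connector rather than the DataSet API, and an S3 filesystem such as flink-s3-fs-hadoop has to be configured as described in the second link):

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer010;

public class S3ToKafka {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Reads the file line by line; the job finishes when the file is consumed.
        DataStream<String> lines = env.readTextFile("s3://my-bucket/path/to/file.txt");

        // One Kafka record per input line.
        lines.addSink(new FlinkKafkaProducer010<>(
                "broker1:9092", "my-topic", new SimpleStringSchema()));

        env.execute("S3 file to Kafka");
    }
}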



On Tue, Jun 4, 2019 at 5:57 AM anurag  wrote:

> Hi All,
> I have searched a lot on Google but could not find how to write a
> Flink function which reads a file in S3 and, for each line in the file,
> writes a message to Kafka.
> Thanks a lot, much appreciated. I am sorry if I did not search properly.
> Thanks,
> Anurag
>


Re: Propagating delta from window upon trigger

2019-05-18 Thread miki haiat
Can you elaborate more on what your use case is?


On Sat, May 18, 2019 at 12:47 AM Nikhil Goyal  wrote:

> Hi guys,
>
> Is there a way in Flink to only propagate the changes which happened in
> the window's state, rather than dumping the contents of the window again and
> again upon trigger?
>
> Thanks
> Nikhil
>


Re: flink 1.7 HA production setup going down completely

2019-05-07 Thread miki haiat
Which Flink version are you using?
I had similar issues with 1.5.x.

On Tue, May 7, 2019 at 2:49 PM Manjusha Vuyyuru 
wrote:

> Hello,
>
> I have a flink setup with two job managers coordinated by zookeeper.
>
> I see the below exception and both jobmanagers are going down:
>
> 2019-05-07 08:29:13,346 INFO
> org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore  -
> Released locks of job graph f8eb1b482d8ec8c1d3e94c4d0f79df77 from ZooKeeper.
> 2019-05-07 08:29:13,346 ERROR
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Fatal
> error occurred in the cluster entrypoint.
> java.lang.RuntimeException: org.apache.flink.util.FlinkException: Could
> not retrieve submitted JobGraph from state handle under
> /147dd022ec91f7381ad4ca3d290387e9. This indicates that the retrieved state
> handle is broken. Try cleaning the state handle store.
> at
> org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:199)
> at
> org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:74)
> at
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
> at
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
> at
> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
> at
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
> at
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: org.apache.flink.util.FlinkException: Could not retrieve
> submitted JobGraph from state handle under
> /147dd022ec91f7381ad4ca3d290387e9. This indicates that the retrieved state
> handle is broken. Try cleaning the state handle store.
> at
> org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore.recoverJobGraph(ZooKeeperSubmittedJobGraphStore.java:208)
> at
> org.apache.flink.runtime.dispatcher.Dispatcher.recoverJob(Dispatcher.java:696)
> at
> org.apache.flink.runtime.dispatcher.Dispatcher.recoverJobGraphs(Dispatcher.java:681)
> at
> org.apache.flink.runtime.dispatcher.Dispatcher.recoverJobs(Dispatcher.java:662)
> at
> org.apache.flink.runtime.dispatcher.Dispatcher.lambda$null$26(Dispatcher.java:821)
> at
> org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:72)
> ... 9 more
>
>
> Can someone please help me understand in detail what is causing this
> exception? I can see ZooKeeper is not able to retrieve the job graph. What
> could be the reason for this?
>
> This is the second time that my setup has gone down with this exception; the
> first time I cleared the jobgraph folder in ZooKeeper and restarted, and now
> I am faced with the same issue again.
>
> Since this is a production setup, this kind of outage is not at all expected
> :(. Can someone help me find a permanent fix for this issue?
>
>
> Thanks,
> Manju
>
>


Re: connecting two streams flink

2019-01-29 Thread miki haiat
If C1 and C2 are listening to the same topic, they will consume the same
data,
so I can't understand this:

>  these two streams one(c2) is fast and other(c1)





On Tue, Jan 29, 2019 at 2:44 PM Selvaraj chennappan <
selvarajchennap...@gmail.com> wrote:

> Team,
>
> I have two Kafka consumers for the same topic and want to join the second
> stream to the first after a couple of subtasks of computation in the first
> stream, then validate the record. KT - C1, C2
>
> KT - C1 - Transformation(FlatMap) - Dedup - Validate -- if valid, save it to DB
> -C2 - Process --
>
> If the record is invalid, then save it to the error topic.
>
> How do I merge these two streams when one (C2) is fast and the other (C1) is a
> little slow (two levels of computation)?
>
> The same record flows from C1-FlatMap-FlatMap and the other consumer C2. I
> have to validate that record based on the rules.
> [image: two-stream.png]
>
> --
>
>
>
>
>
> Regards,
> Selvaraj C
>


Re: Forking a stream with Flink

2019-01-29 Thread miki haiat
I'm not sure I got your question correctly; can you elaborate more on
your use case?


Re: Kafka stream fed in batches throughout the day

2019-01-21 Thread miki haiat
In Flink you can't read data from Kafka with the DataSet (batch) API,
and you don't want to mess with starting and stopping your job every few hours.
Can you elaborate more on your use case?
Are you going to use keyBy? Is there any way to use a trigger?



On Mon, Jan 21, 2019 at 4:43 PM Jonny Graham 
wrote:

> We have a Kafka stream of events that we want to process with a Flink
> datastream process. However, the stream is populated by an upstream batch
> process that only executes every few hours. So the stream has very 'bursty'
> behaviour. We need a window based on event time to await the next events
> for the same key. Due to this batch population of the stream, these windows
> can remain open (with no event activity on the stream) for many hours. From
> what I understand we could indeed leave the Flink datastream process up and
> running all this time and the window would remain open. We could even use a
> savepoint and then stop the process and restart it (with the window state
> being restored) when we get the next batch and the events start appearing
> in the stream again.
>
>
>
> One rationale for this mode of operation is that we have a future usecase
> where this stream will be populated in real-time and would behave like a
> normal stream.
>
>
>
> Is that a best-practice approach for this scenario? Or should we be
> treating these batches as individual batches (Flink job that ends with the
> end of the batch) and manually handle the windowing that needs to cross
> multiple batches.
>
>
>
> Thanks,
>
> Jonny
>
>
> Confidentiality: This communication and any attachments are intended for
> the above-named persons only and may be confidential and/or legally
> privileged. Any opinions expressed in this communication are not
> necessarily those of NICE Actimize. If this communication has come to you
> in error you must take no action based on it, nor must you copy or show it
> to anyone; please delete/destroy and inform the sender by e-mail
> immediately.
> Monitoring: NICE Actimize may monitor incoming and outgoing e-mails.
> Viruses: Although we have taken steps toward ensuring that this e-mail and
> attachments are free from any virus, we advise that in keeping with good
> computing practice the recipient should ensure they are actually virus free.
>
>
>


Re: How can I make HTTP requests from an Apache Flink program?

2019-01-15 Thread miki haiat
Can you share more about which use case you are trying to implement?



On Tue, Jan 15, 2019 at 2:02 PM  wrote:

> Hi all,
>
>
>
> I was wondering if anybody has any recommendations on making HTTP
> requests from Flink to another service.
>
> In the long term we are looking for a solution that is both performant and
> integrates well with our Flink program.
>
> Does the library we use matter? Do we need a special connector to make
> HTTP calls?
>
> One library we thought could fit our needs is the Akka HTTP
> client API, due to the possibility of making async HTTP calls.
>
>
>
> We are using Scala 2.12 and Flink 1.7.
>
>
>
> Kind regards,
>
>
>
> Jacopo Gobbi
>
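One common shape for this is Flink's Async I/O operator; a rough Java sketch follows (the actual HTTP call is only a placeholder, since the thread does not settle on a client library, and the Scala 2.12 form differs slightly):

import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

// Calls an external HTTP service for each element without blocking the task thread.
public class HttpEnricher extends RichAsyncFunction<String, String> {

    @Override
    public void asyncInvoke(String input, ResultFuture<String> resultFuture) {
        CompletableFuture
            .supplyAsync(() -> callHttpService(input))     // issue the request off-thread
            .thenAccept(resp -> resultFuture.complete(Collections.singleton(resp)));
    }

    private String callHttpService(String input) {
        // Placeholder: plug in the HTTP client of your choice (Akka HTTP, OkHttp, ...).
        return "response-for-" + input;
    }

    public static DataStream<String> enrich(DataStream<String> in) {
        // At most 100 requests in flight, 5 s timeout per request.
        return AsyncDataStream.unorderedWait(in, new HttpEnricher(), 5, TimeUnit.SECONDS, 100);
    }
}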


Flink error reading file over network (Windows)

2019-01-02 Thread miki haiat
Hi,

I'm trying to read a CSV file from a Windows shared drive.
I tried a number of options but failed.

I can't find an option to use the SMB format,
so I'm assuming that creating my own input format is the way to achieve that?

What is the correct way to read a file from a Windows network share?

Thanks,

Miki


Re: using updating shared data

2019-01-01 Thread miki haiat
I'm trying to understand your use case.
What is the source of the data? FS, Kafka, or something else?


On Tue, Jan 1, 2019 at 6:29 PM Avi Levi  wrote:

> Hi,
> I have a list (a couple of thousand text lines) that I need to use in my
> map function. I read this article about broadcasting variables
> 
>  or
> using the distributed cache
> 
> however I need to update this list from time to time, and if I understood
> correctly that is not possible with broadcast or cache without restarting the
> job. Is there an idiomatic way to achieve this? A DB seems to be overkill
> for that, and I want to be cheap on I/O/network calls as much as possible.
>
> Cheers
> Avi
>
>
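Since the list has to change without restarting the job, Flink's broadcast state (available since 1.5) is the usual fit: stream the list updates from whatever source they come from and connect them to the main stream. A rough Java sketch, with both input streams left abstract:

import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.util.Collector;

public class BroadcastListJob {

    // Broadcast state holding the current list entries (key = entry, value unused).
    static final MapStateDescriptor<String, String> LIST_STATE =
        new MapStateDescriptor<>("list", BasicTypeInfo.STRING_TYPE_INFO,
                                 BasicTypeInfo.STRING_TYPE_INFO);

    static DataStream<String> wire(DataStream<String> events, DataStream<String> listUpdates) {
        BroadcastStream<String> broadcast = listUpdates.broadcast(LIST_STATE);

        return events.connect(broadcast)
            .process(new BroadcastProcessFunction<String, String, String>() {
                @Override
                public void processElement(String event, ReadOnlyContext ctx,
                                           Collector<String> out) throws Exception {
                    // Read-only view of the broadcast list while processing normal events.
                    if (ctx.getBroadcastState(LIST_STATE).contains(event)) {
                        out.collect(event);
                    }
                }

                @Override
                public void processBroadcastElement(String entry, Context ctx,
                                                    Collector<String> out) throws Exception {
                    // Every update from the control stream is folded into the broadcast state.
                    ctx.getBroadcastState(LIST_STATE).put(entry, "");
                }
            });
    }
}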


Re: getting Timeout expired while fetching topic metadata

2018-12-24 Thread miki haiat
Hi Avi,
Can you try to add this property:

 props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SSL");

Thanks,
Miki

On Mon, Dec 24, 2018 at 8:19 PM Avi Levi  wrote:

> Hi all,
> I'm very new to Flink, so my apologies if this seems trivial.
> We deployed Flink on gcloud.
> I am trying to connect to Kafka but keep getting this error:
> org.apache.kafka.common.errors.TimeoutException: Timeout expired while
> fetching topic metadata
> This is how my properties look:
> val consumerProperties: Properties = {
> val p = new Properties()
> p.setProperty("bootstrap.servers", kafkaBootStrapServers)
> p.setProperty("group.id", groupId)
> p.setProperty("client.id", s"queue-consumer-${randomUUID().toString}")
>
> p.setProperty("ssl.keystore.location","/usr/path_to/kafka_ssl_client.keystore.jks"))
> p.setProperty("ssl.keystore.password",  "some password")
> p.setProperty("ssl.truststore.location",
> "/usr/path_to/kafka_ssl_client.keystore.jks")
> p.setProperty("ssl.truststore.password", "some password")
> p
>   }
>
> please advise
>
> Thanks
> Avi
>


Re: Query big mssql Data Source [Batch]

2018-12-06 Thread miki haiat
Hi Flavio ,
That is working fine for me, and I'm able to pull ~17M rows in 20 seconds.

I'm a bit confused regarding the state backend;
I couldn't find a way to configure it, so I'm guessing the data is in memory
...

thanks,
Miki



On Thu, Dec 6, 2018 at 12:06 PM Flavio Pompermaier 
wrote:

> the constructor of NumericBetweenParametersProvider takes 3 params: long
> fetchSize, long minVal, long maxVal.
> If you want parallelism you should use a  1 < fetchSize  < maxVal.
> In your case, if you do new NumericBetweenParametersProvider(50, 3, 300)
> you will produce 6 parallel tasks:
>
>1. SELECT  BETWEEN 3 and 50
>2. SELECT  BETWEEN 51 and 100
>3. SELECT  BETWEEN 101 and 150
>4. SELECT  BETWEEN 151 and 200
>5. SELECT  BETWEEN 201 and 250
>6. SELECT  BETWEEN 251 and 300
>
>
> On Thu, Dec 6, 2018 at 10:32 AM miki haiat  wrote:
>
>> Hi Flavio,
>>
>> This is the query that I'm trying to coordinate:
>>
>>> .setQuery("SELECT a, b, c, \n" +
>>> "FROM dbx.dbo.x as tls\n"+
>>> "WHERE tls.a BETWEEN ? and ?"
>>>
>>> And this is the way I'm trying to parameterize it:
>>
>> ParameterValuesProvider pramProvider = new
>> NumericBetweenParametersProvider(1, 3,300);
>>
>> I also tried this way
>>
>>  Serializable[][] queryParameters = new String[1][2];
>> queryParameters[0] = new String[]{"3","300"};
>>
>>
>> On Wed, Dec 5, 2018 at 6:44 PM Flavio Pompermaier 
>> wrote:
>>
>>> whats your query? Have you used '?' where query should be parameterized?
>>>
>>> Give a look at
>>> https://github.com/apache/flink/blob/master/flink-connectors/flink-jdbc/src/test/java/org/apache/flink/api/java/io/jdbc/JDBCFullTest.java
>>>
>>
>
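Putting Flavio's suggestion together, a sketch of a parallel JDBC read could look like the following (the driver class, URL, credentials and the column types in the RowTypeInfo are placeholders/assumptions; jTDS is assumed because it was mentioned in this thread):

import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.io.jdbc.JDBCInputFormat;
import org.apache.flink.api.java.io.jdbc.split.NumericBetweenParametersProvider;
import org.apache.flink.api.java.typeutils.RowTypeInfo;
import org.apache.flink.types.Row;

public class ParallelJdbcRead {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        RowTypeInfo rowType = new RowTypeInfo(
            BasicTypeInfo.LONG_TYPE_INFO,    // a
            BasicTypeInfo.STRING_TYPE_INFO,  // b
            BasicTypeInfo.STRING_TYPE_INFO); // c

        JDBCInputFormat input = JDBCInputFormat.buildJDBCInputFormat()
            .setDrivername("net.sourceforge.jtds.jdbc.Driver")
            .setDBUrl("jdbc:jtds:sqlserver://host:1433/dbx")
            .setUsername("user")
            .setPassword("secret")
            .setQuery("SELECT a, b, c FROM dbx.dbo.x WHERE a BETWEEN ? AND ?")
            .setRowTypeInfo(rowType)
            // fetchSize=50 over the range 3..300 -> six independent splits
            .setParametersProvider(new NumericBetweenParametersProvider(50, 3, 300))
            .finish();

        DataSet<Row> rows = env.createInput(input);
        rows.print();
    }
}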


Re: Query big mssql Data Source [Batch]

2018-12-06 Thread miki haiat
Hi Flavio,

This is the query that I'm trying to coordinate:

> .setQuery("SELECT a, b, c, \n" +
> "FROM dbx.dbo.x as tls\n"+
> "WHERE tls.a BETWEEN ? and ?"
>
> And this is the way I'm trying to parameterize it:

ParameterValuesProvider pramProvider = new NumericBetweenParametersProvider(
1, 3,300);

I also tried this way

 Serializable[][] queryParameters = new String[1][2];
queryParameters[0] = new String[]{"3","300"};


On Wed, Dec 5, 2018 at 6:44 PM Flavio Pompermaier 
wrote:

> whats your query? Have you used '?' where query should be parameterized?
>
> Give a look at
> https://github.com/apache/flink/blob/master/flink-connectors/flink-jdbc/src/test/java/org/apache/flink/api/java/io/jdbc/JDBCFullTest.java
>


Re: Query big mssql Data Source [Batch]

2018-12-05 Thread miki haiat
I'm using the jTDS driver to query MSSQL.
I used the ParametersProvider as you suggested, but for some reason the
job won't run in parallel.

[image: flink_in.JPG]

Also the sink, a simple print out, won't run in parallel.

[image: flink_out.JPG]




On Tue, Dec 4, 2018 at 10:05 PM Flavio Pompermaier 
wrote:

> You can pass a ParametersProvider to the jdbc input format in order to
> parallelize the fetch.
> Of course you don't want to kill the MySQL server with too many requests
> in parallel, so you'll probably put a limit on the parallelism of the input
> format.
>
>
> On Tue, 4 Dec 2018, 17:31 miki haiat 
>> Hi,
>> I want to query an SQL table that contains ~80M rows.
>>
>> There are a few ways to do that, and I wonder what the best way is.
>>
>>
>>1. Using JDBCInputFormat -> convert to a DataSet and output it without
>>doing any logic in the DataSet, passing the full query and its parameters
>>to the JDBCInputFormat.
>>2. Using JDBCInputFormat, select all the data from the table, then
>>deserialize it -> convert to a DataSet and perform the logic.
>>
>>
>> Or something else that is more efficient?
>>
>> Thanks,
>>
>> Miki
>>
>>


Query big mssql Data Source [Batch]

2018-12-04 Thread miki haiat
Hi,
I want to query an SQL table that contains ~80M rows.

There are a few ways to do that, and I wonder what the best way is.


   1. Using JDBCInputFormat -> convert to a DataSet and output it without
   doing any logic in the DataSet, passing the full query and its parameters
   to the JDBCInputFormat.
   2. Using JDBCInputFormat, select all the data from the table, then
   deserialize it -> convert to a DataSet and perform the logic.


Or something else that is more efficient?

Thanks,

Miki


Re: Looking for example for bucketingSink / StreamingFileSink

2018-12-03 Thread miki haiat
Hi Avi,
I'm assuming that the cause of the "pending" files is that the checkpoint
isn't finishing successfully [1].
Can you try to change the checkpoint time to 1 minute as well?


Thanks,
Miki



[1]
 
https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-filesystem/src/main/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.java#L131
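A minimal Java sketch of that setup, under the assumption that checkpointing is what finalizes the pending files (source, path and intervals are placeholders; setBatchRolloverInterval needs Flink 1.6+):

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;
import org.apache.flink.streaming.connectors.fs.bucketing.DateTimeBucketer;

public class BucketingSinkExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Pending part files are only finalized on successful checkpoints,
        // so checkpointing must be enabled and reasonably frequent.
        env.enableCheckpointing(60_000);

        DataStream<String> input = env.socketTextStream("localhost", 9999); // placeholder source

        BucketingSink<String> sink = new BucketingSink<>("file:///tmp/flink-out");
        sink.setBucketer(new DateTimeBucketer<>("yyyy-MM-dd--HH"));
        sink.setBatchSize(128L * 1024L * 1024L); // roll at 128 MB ...
        sink.setBatchRolloverInterval(60_000);   // ... or after one minute

        input.addSink(sink);
        env.execute("BucketingSink example");
    }
}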


On Mon, Dec 3, 2018 at 12:45 PM Avi Levi  wrote:

> Hi Guys,
> I'm very new to Flink, so my apologies for the newbie questions :)
> but I am desperately looking for a good example of streaming to files
> using BucketingSink / StreamingFileSink. Unfortunately the examples in the
> documentation are not even compiling (at least not the ones in Scala,
> https://issues.apache.org/jira/browse/FLINK-11053 )
>
> I tried using BucketingSink with StreamingFileSink (or just
> StreamingFileSink) and finally tried to implement a writer, but with no
> luck.
> BucketingSink seems to be a perfect fit because I can set the batch
> interval by time interval or size which is exactly what I need.
>
> This is my last attempt (sample project)
>  which
> results in a lot of "pending" files.
> *Any help would be appreciated*
>
> *Thanks*
> *Avi*
>


Re: where can I see logs from code

2018-11-25 Thread miki haiat
You can see the logs in the web UI.
If you click on the Task Managers tab you can find the logs:

http://SERVERADD/#/taskmanager/TM_ID/log





On Sun, Nov 25, 2018 at 12:11 PM Avi Levi  wrote:

> Hi,
> Where can I see the logs written by the app code (i.e. by the app
> developer)?
>
> BR
> Avi
>


Re: Logging Kafka during exceptions

2018-11-21 Thread miki haiat
If so, then you can implement your own deserializer [1] with custom logic
and error handling.



1.
https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/streaming/util/serialization/KeyedDeserializationSchemaWrapper.html
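For example, a rough sketch of a schema that validates the JSON and logs the offending payload instead of failing the job (Jackson is assumed to be on the classpath; returning null makes the Kafka consumer skip the record):

import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.flink.api.common.serialization.DeserializationSchema;
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.common.typeinfo.TypeInformation;

import com.fasterxml.jackson.databind.ObjectMapper;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Logs and skips records whose payload is not well-formed JSON.
public class LoggingJsonSchema implements DeserializationSchema<String> {

    private static final Logger LOG = LoggerFactory.getLogger(LoggingJsonSchema.class);
    private transient ObjectMapper mapper;

    @Override
    public String deserialize(byte[] message) throws IOException {
        if (mapper == null) {
            mapper = new ObjectMapper();
        }
        String raw = new String(message, StandardCharsets.UTF_8);
        try {
            mapper.readTree(raw); // validate that the payload parses as JSON
            return raw;
        } catch (Exception e) {
            LOG.error("Dropping malformed record: {}", raw, e);
            return null; // skipped by the Kafka consumer
        }
    }

    @Override
    public boolean isEndOfStream(String nextElement) {
        return false;
    }

    @Override
    public TypeInformation<String> getProducedType() {
        return BasicTypeInfo.STRING_TYPE_INFO;
    }
}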


On Thu, Nov 22, 2018 at 8:57 AM Scott Sue  wrote:

> Json is sent into Kafka
>
>
> Regards,
> Scott
>
> SCOTT SUE
> CHIEF TECHNOLOGY OFFICER
>
> Support Line : +44(0) 2031 371 603
> Mobile : +852 9611 3969
>
> 9/F, 33 Lockhart Road, Wan Chai, Hong Kong
> www.celer-tech.com
>
>
>
>
>
>
>
> On 22 Nov 2018, at 14:55, miki haiat  wrote:
>
> Which data format is sent to Kafka?
> JSON, Avro, or other?
>
>
>
> On Thu, Nov 22, 2018 at 7:36 AM Scott Sue 
> wrote:
>
>> Unexpected data meaning business level data that I didn’t expect to
>> receive. So business level data that doesn’t quite conform
>>
>> On Thu, 22 Nov 2018 at 13:30, miki haiat  wrote:
>>
>>> By unexpected data, do you mean a parsing error?
>>> Which format is sent to Kafka?
>>>
>>>
>>>
>>> On Thu, 22 Nov 2018, 6:59 Scott Sue >>
>>>> Hi all,
>>>>
>>>> When I'm running my jobs I am consuming data from Kafka to process in my
>>>> job.  Unfortunately my job receives unexpected data from time to time
>>>> which
>>>> I'm trying to find the root cause of the issue.
>>>>
>>>> Ideally, I want to be able to have a way to know when the job has
>>>> failed due
>>>> to an exception, to then log to file the last message that it was
>>>> consuming
>>>> at the time to help track down the offending message consumed.  How is
>>>> this
>>>> possible within Flink?
>>>>
>>>> Thinking about this more, it may not be a consumed message that killed
>>>> the
>>>> job, but maybe a transformation within the job itself and it died in a
>>>> downstream Operator.  In this case, is there a way to log to file the
>>>> message that an Operator was processing at the time that caused the
>>>> exception?
>>>>
>>>>
>>>> Thanks in advance!
>>>>
>>>>
>>>>
>>>> --
>>>> Sent from:
>>>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>>>>
>>> --
>>
>>
>> Regards,
>> Scott
>>
>> SCOTT SUE
>> CHIEF TECHNOLOGY OFFICER
>>
>> Support Line : +44(0) 2031 371 603
>> Mobile : +852 9611 3969
>>
>> 9/F, 33 Lockhart Road, Wanchai, Hong Kong
>> www.celer-tech.com
>>
>> This message, including any attachments, may include private, privileged
>> and confidential information and is intended only for the personal and
>> confidential use of the intended recipient(s). If the reader of this
>> message is not an intended recipient, you are hereby notified that any
>> review, use, dissemination, distribution, printing or copying of this
>> message or its contents is strictly prohibited and may be unlawful. If you
>> are not an intended recipient or have received this communication in error,
>> please immediately notify the sender by telephone and/or a reply email and
>> permanently delete the original message, including any attachments, without
>> making a copy.
>>
>
>
>


Re: Could not extract key Exception only on runtime not in dev environment

2018-11-20 Thread miki haiat
What is the value of r.id?
Are you sure it is not null?

Miki.


On Tue, 20 Nov 2018, 17:26 Avi Levi  wrote:

> I am running Flink locally on my machine, and I am getting the exception
> below when reading from a Kafka topic. When running from the IDE (IntelliJ)
> it runs perfectly; however, when I deploy my jar to the Flink runtime
> (locally) using
>
> /bin/flink run ~MyApp-1.0-SNAPSHOT.jar
> my class looks like this
> case class Foo(id: String, value: String, timestamp: Long, counter: Int)
> I am getting this exception
>
> java.lang.RuntimeException: Could not extract key from 
> Foo("some-uuid","text",1540348398,1)
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:110)
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:89)
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:45)
>   at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:689)
>   at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:667)
>   at 
> org.apache.flink.streaming.api.operators.StreamFilter.processElement(StreamFilter.java:40)
>   at 
> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:579)
>   at 
> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:554)
>   at 
> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:534)
>   at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:689)
>   at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:667)
>   at 
> org.apache.flink.streaming.api.operators.StreamSourceContexts$NonTimestampContext.collect(StreamSourceContexts.java:104)
>   at 
> org.apache.flink.streaming.api.operators.StreamSourceContexts$NonTimestampContext.collectWithTimestamp(StreamSourceContexts.java:111)
>   at 
> org.apache.flink.streaming.connectors.kafka.internals.AbstractFetcher.emitRecordWithTimestamp(AbstractFetcher.java:398)
>   at 
> org.apache.flink.streaming.connectors.kafka.internal.Kafka010Fetcher.emitRecord(Kafka010Fetcher.java:89)
>   at 
> org.apache.flink.streaming.connectors.kafka.internal.Kafka09Fetcher.runFetchLoop(Kafka09Fetcher.java:154)
>   at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.run(FlinkKafkaConsumerBase.java:738)
>   at 
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
>   at 
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:56)
>   at 
> org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:99)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.RuntimeException: Could not extract key from 
> Foo("some-uuid","text",1540348398,1)
>   at 
> org.apache.flink.streaming.runtime.partitioner.KeyGroupStreamPartitioner.selectChannels(KeyGroupStreamPartitioner.java:61)
>   at 
> org.apache.flink.streaming.runtime.partitioner.KeyGroupStreamPartitioner.selectChannels(KeyGroupStreamPartitioner.java:32)
>   at 
> org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:106)
>   at 
> org.apache.flink.streaming.runtime.io.StreamRecordWriter.emit(StreamRecordWriter.java:81)
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:107)
>   ... 22 more
> Caused by: java.lang.NullPointerException
>   at com.bluevoyant.StreamingJob$$anonfun$3.apply(StreamingJob.scala:41)
>   at com.bluevoyant.StreamingJob$$anonfun$3.apply(StreamingJob.scala:40)
>   at 
> org.apache.flink.streaming.api.scala.DataStream$$anon$2.getKey(DataStream.scala:411)
>   at 
> org.apache.flink.streaming.runtime.partitioner.KeyGroupStreamPartitioner.selectChannels(KeyGroupStreamPartitioner.java:59)
>   ... 26 more
>
>
> my key partition is simple (partitionFactor = some number)
>
> .keyBy{ r =>
> val h = fastHash(r.id) % partitionFactor
> math.abs(h)
> }
>
> Again, this happens only at runtime, not when I run it from IntelliJ.
>
> This is so frustrating. Any advice?
>
>
>


Re: Deadlock happens when sink to mysql

2018-11-19 Thread miki haiat
Can you share your entire code, please?

On Mon, Nov 19, 2018 at 4:03 PM 徐涛  wrote:

> Hi Experts,
> I use the following SQL, and sink to MySQL:
> select
>
> album_id, date,
> count(1)
> from
> coupon_5_discount_date_conv
> group by
> album_id, date;
>
>
> When sinking to MySQL, the following SQL is executed: insert into xxx (c1,
> c2, c3) values (?,?,?) on duplicate key update c1=VALUES(c1), c2=VALUES(c2),
> c3=VALUES(c3)
> The engine is InnoDB, columns (c1, c2) form a unique key, and the isolation
> level is READ COMMITTED. But a deadlock exception appears in the log.
> As far as I know, because the unique key exists, only a row lock will be
> applied, no gap lock. And because of the GROUP BY clause, the
> same unique key should always be written by the same thread. So in this case,
> why does the deadlock happen? Could anyone help me? Thanks a lot.
>
> Best
> Henry
>


Re: Standalone HA cluster: Fatal error occurred in the cluster entrypoint.

2018-11-16 Thread miki haiat
I "solved" this issue by cleaning the zookeeper information and start the
cluster again all the the checkpoint and job graph data will be erased and
basacly you will start a new cluster...

It's happened to me allot on a 1.5.x
On a 1.6 things are running perfect .
I'm not sure way this error is back again on 1.6.1 ?


On Fri, 16 Nov 2018, 0:42 Olga Luganska  wrote:

> Hello,
>
> I am running a Flink 1.6.1 standalone HA cluster. Today I am unable to start
> the cluster because of a "Fatal error in cluster entrypoint".
> (I used to see this error when running Flink 1.5; after upgrading to
> 1.6.1, which had a fix for this bug, everything worked well for a while.)
>
> Question: what exactly needs to be done to clean "state handle store"?
>
> 2018-11-15 15:09:53,181 DEBUG
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor  - Fencing
> token not set: Ignoring message LocalFencedMessage(null,
> org.apache.flink.runtime.rpc.messages.RunAsync@21fd224c) because the
> fencing token is null.
>
> 2018-11-15 15:09:53,182 ERROR
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Fatal error
> occurred in the cluster entrypoint.
>
> java.lang.RuntimeException: org.apache.flink.util.FlinkException: Could
> not retrieve submitted JobGraph from state handle under
> /e13034f83a80072204facb2cec9ea6a3. This indicates that the retrieved state
> handle is broken. Try cleaning the state handle store.
>
> at
> org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:199)
>
> at
> org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$1(FunctionUtils.java:61)
>
> at
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
>
> at
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
>
> at
> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
>
> at
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
>
> at
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>
> at
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>
> at
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>
> at
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>
> Caused by: org.apache.flink.util.FlinkException: Could not retrieve
> submitted JobGraph from state handle under
> /e13034f83a80072204facb2cec9ea6a3. This indicates that the retrieved state
> handle is broken. Try cleaning the state handle store.
>
> at
> org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore.recoverJobGraph(ZooKeeperSubmittedJobGraphStore.java:208)
>
> at
> org.apache.flink.runtime.dispatcher.Dispatcher.recoverJob(Dispatcher.java:692)
>
> at
> org.apache.flink.runtime.dispatcher.Dispatcher.recoverJobGraphs(Dispatcher.java:677)
>
> at
> org.apache.flink.runtime.dispatcher.Dispatcher.recoverJobs(Dispatcher.java:658)
>
> at
> org.apache.flink.runtime.dispatcher.Dispatcher.lambda$null$26(Dispatcher.java:817)
>
> at
> org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$1(FunctionUtils.java:59)
>
> ... 9 more
>
> Caused by: java.io.FileNotFoundException:
> /checkpoint_repo/ha/submittedJobGraphdd865937d674 (No such file or
> directory)
>
> at java.io.FileInputStream.open0(Native Method)
>
> at java.io.FileInputStream.open(FileInputStream.java:195)
>
> at java.io.FileInputStream.<init>(FileInputStream.java:138)
>
> at
> org.apache.flink.core.fs.local.LocalDataInputStream.<init>(LocalDataInputStream.java:50)
>
> at
> org.apache.flink.core.fs.local.LocalFileSystem.open(LocalFileSystem.java:142)
>
> at
> org.apache.flink.runtime.state.filesystem.FileStateHandle.openInputStream(FileStateHandle.java:68)
>
> at
> org.apache.flink.runtime.state.RetrievableStreamStateHandle.openInputStream(RetrievableStreamStateHandle.java:64)
>
> at
> org.apache.flink.runtime.state.RetrievableStreamStateHandle.retrieveState(RetrievableStreamStateHandle.java:57)
>
> at
> org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore.recoverJobGraph(ZooKeeperSubmittedJobGraphStore.java:202)
>
> ... 14 more
>
> 2018-11-15 15:09:53,185 INFO
> org.apache.flink.runtime.blob.TransientBlobCache  - Shutting
> down BLOB cache
>
>
> thank you,
>
> Olga
>
>


Re: Get savepoint status fails - Flink 1.6.2

2018-11-15 Thread miki haiat
Can you share some logs

On Thu, Nov 15, 2018 at 10:46 AM PedroMrChaves 
wrote:

> Hello,
>
> I've tried with different (jobId, triggerId) pairs but it doesn't work.
>
>
> Regards,
> Pedro Chaves.
>
>
>
> -
> Best Regards,
> Pedro Chaves
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>


Re: running flink job cluster on kubernetes with HA

2018-11-13 Thread miki haiat
It looks like in the next version (1.7) you can achieve HA on Kubernetes
without ZooKeeper.
Anyway, for now you can configure one ZooKeeper cluster to save the data; the
storage path should be on some distributed FS like HDFS or S3.

Thanks ,

Miki
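To make the shared-ZooKeeper setup concrete, a sketch of how two job clusters could use the same quorum while staying isolated, each with its own cluster-id (hostnames and paths are placeholders):

# flink-conf.yaml for job cluster A
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181,zk-3:2181
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: /job-a
high-availability.storageDir: hdfs:///flink/ha/job-a

# job cluster B: same quorum and root, different cluster-id and storageDir
high-availability.cluster-id: /job-b
high-availability.storageDir: hdfs:///flink/ha/job-b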


On Tue, Nov 13, 2018 at 10:24 AM aviad  wrote:

> Hi,
>
> I want to run several jobs under kubernetes using "flink job cluster" (see
> -
>
> https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/kubernetes.html#flink-job-cluster-on-kubernetes
> ),
> meaning each job is running on a different flink cluster.
>
> I want to configure the cluster with HA, meaning working with ZooKeeper.
>
> do I need a different ZooKeeper cluster for each job cluster, or can I use
> the
> same cluster for all jobs?
>
> I saw that there is a parameter called
> "high-availability.zookeeper.path.root".
> Can I configure each flink cluster with a different path, and is that enough?
>
> thanks, Aviad
>
>
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>


Re: Failed to fetch BLOB - IO Exception

2018-10-23 Thread miki haiat
How is your cluster configured?
What is the checkpoint/savepoint directory configuration?
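One thing worth checking alongside that: the stack trace below points at blob files under /tmp, which many distributions clean up periodically, and that would match a failure roughly every 14 days. A sketch of moving the blob store off /tmp, assuming that is indeed what is expiring:

# flink-conf.yaml
blob.storage.directory: /data/flink/blobstore   # any persistent, non-/tmp local path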


On Tue, Oct 23, 2018 at 8:00 AM Manjusha Vuyyuru 
wrote:

> Hello All,
>
> I have a job which fails, let's say, every 14 days with an IO Exception,
> failed to fetch blob.
> I submitted the job from the command line as a java jar. Below is the
> exception I'm getting:
>
> java.io.IOException: Failed to fetch BLOB 
> d23d168655dd51efe4764f9b22b85a18/p-446f7e0137fd66af062de7a56c55528171d380db-baf0b6bce698d586a3b0d30c6e487d16
>  from flink-job-mamager/10.20.1.85:38147 and store it under 
> /tmp/blobStore-e3e34fec-22d9-4b3c-b542-0c1e5cdcf896/incoming/temp-0022
>   at 
> org.apache.flink.runtime.blob.BlobClient.downloadFromBlobServer(BlobClient.java:191)
>   at 
> org.apache.flink.runtime.blob.AbstractBlobCache.getFileInternal(AbstractBlobCache.java:177)
>   at 
> org.apache.flink.runtime.blob.PermanentBlobCache.getFile(PermanentBlobCache.java:205)
>   at 
> org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.registerTask(BlobLibraryCacheManager.java:119)
>   at 
> org.apache.flink.runtime.taskmanager.Task.createUserCodeClassloader(Task.java:878)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:589)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: GET operation failed: Server side error: 
> /tmp/blobStore-5535a94c-5bdd-41f3-878d-8320e53ba7c5/incoming/temp-00182356
>   at 
> org.apache.flink.runtime.blob.BlobClient.getInternal(BlobClient.java:253)
>   at 
> org.apache.flink.runtime.blob.BlobClient.downloadFromBlobServer(BlobClient.java:166)
>   ... 6 more
> Caused by: java.io.IOException: Server side error: 
> /tmp/blobStore-5535a94c-5bdd-41f3-878d-8320e53ba7c5/incoming/temp-00182356
>   at 
> org.apache.flink.runtime.blob.BlobClient.receiveAndCheckGetResponse(BlobClient.java:306)
>   at 
> org.apache.flink.runtime.blob.BlobClient.getInternal(BlobClient.java:247)
>   ... 7 more
> Caused by: java.nio.file.NoSuchFileException: 
> /tmp/blobStore-5535a94c-5bdd-41f3-878d-8320e53ba7c5/incoming/temp-00182356
>   at 
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>   at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
>   at 
> sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
>   at java.nio.file.Files.move(Files.java:1395)
>   at 
> org.apache.flink.runtime.blob.BlobUtils.moveTempFileToStore(BlobUtils.java:452)
>   at 
> org.apache.flink.runtime.blob.BlobServer.getFileInternal(BlobServer.java:521)
>   at 
> org.apache.flink.runtime.blob.BlobServerConnection.get(BlobServerConnection.java:231)
>   at 
> org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:117)
>
> All the configurations of blob are default, i didn't change anything.
>
> Can someone help me to fix this issue.
>
> Thanks,
>
> Manjusha
>
>
>
>


Re: Flink JobManager is not starting when running on a standalone cluster

2018-10-22 Thread miki haiat
I think it's related to this issue
https://issues.apache.org/jira/browse/FLINK-10011




On Mon, Oct 22, 2018 at 1:52 PM Kumar Bolar, Harshith 
wrote:

> Hi all,
>
>
>
> We run Flink on a five node cluster – three task managers, two job
> managers. One of the job manager running on flink2-0 node is down and
> refuses to come back up, so the cluster is currently running with a single
> job manager. When I restart the service, I see this in the logs. Any idea
> what this issue might be?
>
>
>
> 2018-10-22 06:43:50,458 INFO
> org.apache.flink.runtime.jobmanager.JobManager- Starting
> JobManager actor
>
> 2018-10-22 06:43:50,462 INFO  org.apache.flink.runtime.blob.BlobServer
>   - Created BLOB server storage directory
> /tmp/blobStore-73e8dbe2-8fdb-4310-84d4-c9f3445723f3
>
> 2018-10-22 06:43:50,466 INFO  org.apache.flink.runtime.blob.BlobServer
>   - Enabling ssl for the blob server
>
> 2018-10-22 06:43:50,482 INFO  org.apache.flink.runtime.blob.BlobServer
>   - Started BLOB server at 0.0.0.0:36880 - max concurrent
> requests: 50 - max backlog: 1000
>
> 2018-10-22 06:43:50,501 INFO  
> org.apache.flink.runtime.jobmanager.MemoryArchivist
>   - Started memory archivist akka://flink/user/archive
>
> 2018-10-22 06:43:50,525 INFO
> org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  -
> Starting ZooKeeperLeaderRetrievalService.
>
> 2018-10-22 06:43:50,525 INFO
> org.apache.flink.runtime.jobmanager.JobManager- Starting
> JobManager at akka.ssl.tcp://
> fl...@flink2-0.flink2.us-east-1.prod.xxx.io:22902/user/jobmanager.
>
> 2018-10-22 06:43:50,526 INFO
> org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService  -
> Starting ZooKeeperLeaderElectionService
> org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService@2805f48f.
>
> 2018-10-22 06:43:50,532 INFO
> org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  -
> Starting ZooKeeperLeaderRetrievalService.
>
> 2018-10-22 06:43:50,557 INFO
> org.apache.flink.runtime.clusterframework.standalone.StandaloneResourceManager
> - Received leader address but not running in leader ActorSystem.
> Cancelling registration.
>
>
>
> Thanks,
>
> Harshith
>


Re: Flink Table API and table name

2018-10-16 Thread miki haiat
I'm not sure if it will solve this issue, but can you try to register
your catalog? [1]

1.
https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/table/common.html#register-an-external-catalog


On Tue, Oct 16, 2018 at 11:40 AM Flavio Pompermaier 
wrote:

> Hi to all,
> in my job I'm trying to read a dataset whose name/id starts with a number.
> It seems that when using the Table API to read that dataset, if the name
> starts with a number it is a problem... am I wrong? I can't find anything
> about table id constraints in the documentation and it seems that it's not
> possible to escape the name... for the moment I've added a 'T' in front of
> the name in order to have something like T1 or T2, but it looks like a
> workaround to me...
>
> Best,
> Flavio
>


Re: Can't start taskmanager in Minikube

2018-10-16 Thread miki haiat
Did you execute this command ?

Note: If using MiniKube please make sure to execute minikube ssh 'sudo ip
> link set docker0 promisc on' before deploying a Flink cluster. Otherwise
> Flink components are not able to self reference themselves through a
> Kubernetes service.


On Tue, Oct 16, 2018 at 10:01 AM zpp  wrote:

> I followed the Doc(
> https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/deployment/kubernetes.html#session-cluster-resource-definitions)
>  to
> run flink on kubernetes,
> but there is an exception(java.net.UnknownHostException: flink-jobmanager:
> Temporary failure in name resolution).
> I use a newly installed Minikube on Windows. How can I solve this problem?
> Thanks a lot.
>
>
>
>
>


Re: I want run flink program in ubuntu x64 Mult Node Cluster what is configuration?

2018-09-30 Thread miki haiat
The easiest way to run it without adding user and root permissions is to
run it with the sudo command:
sudo ./start-cluster.sh

If you want to run a high availability cluster you need to follow these
instructions [1]

1.
https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/jobmanager_high_availability.html#standalone-cluster-high-availability



On Sun, Sep 30, 2018 at 12:31 PM Mar_zieh 
wrote:

> Hello
>
> I run "start-cluster.bat" on windows very easy and it works fine. Even
> though, when I run "start-cluster.sh" on terminal of ubuntu, I get these
> errors:
>
> Starting cluster.
> ./start-cluster.sh: line 48:
> /home/pooya/IdeaProjects/flink-1.6.0/bin/jobmanager.sh: Permission denied
> /home/pooya/IdeaProjects/flink-1.6.0/bin/config.sh: line 656:
> /home/pooya/IdeaProjects/flink-1.6.0/bin/taskmanager.sh: Permission denied
>
> Could you please help me? What should I do and how to config it?
>
> Thanks in advance.
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>


Re: error closing kafka

2018-09-24 Thread miki haiat
What are you trying to do, can you share some code?
This is the reason for the exception:
Proceeding to force close the producer since pending requests could not be
completed within timeout 9223372036854775807 ms.



On Mon, 24 Sep 2018, 9:23 yuvraj singh, <19yuvrajsing...@gmail.com> wrote:

> Hi all ,
>
>
> I am getting this error with flink 1.6.0 , please help me .
>
>
>
>
>
>
> 2018-09-23 07:15:08,846 ERROR
> org.apache.kafka.clients.producer.KafkaProducer   -
> Interrupted while joining ioThread
>
> java.lang.InterruptedException
>
> at java.lang.Object.wait(Native Method)
>
> at java.lang.Thread.join(Thread.java:1257)
>
> at
> org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:703)
>
> at
> org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:682)
>
> at
> org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:661)
>
> at
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerBase.close(FlinkKafkaProducerBase.java:319)
>
> at
> org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:43)
>
> at
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:117)
>
> at
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:477)
>
> at
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:378)
>
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>
> at java.lang.Thread.run(Thread.java:745)
>
> 2018-09-23 07:15:08,847 INFO  org.apache.kafka.clients.producer.KafkaProducer
>   - Proceeding to force close the producer since pending
> requests could not be completed within timeout 9223372036854775807 ms.
>
> 2018-09-23 07:15:08,860 ERROR
> org.apache.flink.streaming.runtime.tasks.StreamTask   - Error
> during disposal of stream operator.
>
> org.apache.kafka.common.KafkaException: Failed to close kafka producer
>
> at
> org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:734)
>
> at
> org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:682)
>
> at
> org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:661)
>
> at
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerBase.close(FlinkKafkaProducerBase.java:319)
>
> at
> org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:43)
>
> at
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:117)
>
> at
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:477)
>
> at
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:378)
>
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>
> at java.lang.Thread.run(Thread.java:745)
>
> Caused by: java.lang.InterruptedException
>
> at java.lang.Object.wait(Native Method)
>
> at java.lang.Thread.join(Thread.java:1257)
>
> at
> org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:703)
>
> ... 9 more
>
>
> Thanks
>
> Yubraj Singh
>


Re: Server side error: Cannot find required BLOB at /tmp/blobStore

2018-09-11 Thread miki haiat
/tmp/blobStore
Is it the path for checkpoint/savepoint storage?

On Tue, 11 Sep 2018, 10:01 Raja.Aravapalli, 
wrote:

> Hi,
>
>
>
> My Flink application which reads from Kafka and writes to HDFS is failing
> repeatedly with below exception:
>
>
>
> Caused by: java.io.IOException: Server side error: Cannot find required
> BLOB at /tmp/blobStore-**
>
>
>
> Can someone please help me on, what could be the root cause of this
> issue?  I am not able to trace logs also.
>
>
>
> Also, FYI, our FLINK cluster(4nodes) is setup on our Hadoop YARN cluster.
>
>
>
> Thanks a lot.
>
>
>
>
>
> Regards,
>
> Raja.
>


Re: Unable to start Flink HA cluster with Zookeeper

2018-08-21 Thread miki haiat
First of all, try with the FQDN or the full IP.
Also, in order to run an HA cluster you need to make sure that you have
passwordless SSH access between your master and slave machines.
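A sketch of the first suggestion in flink-conf.yaml terms, assuming Machine1's FQDN (or IP) resolves from both machines:

# flink-conf.yaml on both machines
jobmanager.rpc.address: machine1.example.com   # FQDN or IP of the JobManager host
jobmanager.rpc.port: 6123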

On Tue, Aug 21, 2018 at 4:15 PM mozer 
wrote:

> I am trying to install a Flink HA cluster (Zookeeper mode) but the task
> manager cannot find the job manager.
>
> Here I give you the architecture;
>
> - Machine 1 : Job Manager + Zookeeper
> - Machine 2 : Task Manager
>
> masters:
>
> Machine1
>
> slaves :
>
> Machine2
>
> flink-conf.yaml:
>
> #jobmanager.rpc.address: localhost
> jobmanager.rpc.port: 6123
> blob.server.port: 50100-50200
> taskmanager.data.port: 6121
> high-availability: zookeeper
> high-availability.zookeeper.quorum: Machine1:2181
> high-availability.zookeeper.path.root: /flink-1.5.1
> high-availability.cluster-id: /default_b
> high-availability.storageDir: file:///shareflink/recovery
>
> Here this is the log of Task Manager, it tries to connect to localhost
> instead of Machine1:
>
> 2018-08-17 10:46:44,875 INFO
> org.apache.flink.runtime.util.LeaderRetrievalUtils- Trying to
> select the network interface and address to use by connecting to the
> leading
> JobManager.
> 2018-08-17 10:46:44,876 INFO
> org.apache.flink.runtime.util.LeaderRetrievalUtils- TaskManager
> will try to connect for 1 milliseconds before falling back to
> heuristics
> 2018-08-17 10:46:44,966 INFO
> org.apache.flink.runtime.net.ConnectionUtils  - Retrieved
> new target address /127.0.0.1:37133.
> 2018-08-17 10:46:45,324 INFO
> org.apache.flink.runtime.net.ConnectionUtils  - Trying to
> connect to address /127.0.0.1:37133
> 2018-08-17 10:46:45,325 INFO
> org.apache.flink.runtime.net.ConnectionUtils  - Failed to
> connect from address 'Machine2/IP-Machine2': Connection refused
> 2018-08-17 10:46:45,325 INFO
> org.apache.flink.runtime.net.ConnectionUtils  - Failed to
> connect from address '/127.0.0.1': Connection refused
> 2018-08-17 10:46:45,325 INFO
> org.apache.flink.runtime.net.ConnectionUtils  - Failed to
> connect from address '/IP_Machine2': Connection refused
> 2018-08-17 10:46:45,325 INFO
> org.apache.flink.runtime.net.ConnectionUtils  - Failed to
> connect from address '/127.0.0.1': Connection refused
> 2018-08-17 10:46:45,326 INFO
> org.apache.flink.runtime.net.ConnectionUtils  - Failed to
> connect from address '/IP_Machine2': Connection refused
> 2018-08-17 10:46:45,326 INFO
> org.apache.flink.runtime.net.ConnectionUtils  - Failed to
> connect from address '/127.0.0.1': Connection refused
> 2018-08-17 10:46:45,726 INFO
> org.apache.flink.runtime.net.ConnectionUtils  - Trying to
> connect to address /127.0.0.1:37133
> 2018-08-17 10:46:45,727 INFO
> org.apache.flink.runtime.net.ConnectionUtils  - Failed to
> connect from address 'Machine2/IP-Machine2
>
> 2018-08-17 10:47:22,022 WARN  akka.remote.ReliableDeliverySupervisor
>
> - Association with remote system [akka.tcp://flink@127.0.0.1:36515] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink@127.0.0.1:36515]] Caused by: [Connection refused:
> /127.0.0.1:36515]
>
> 2018-08-17 10:47:22,022 INFO
> org.apache.flink.runtime.taskexecutor.TaskExecutor- Could not
> resolve ResourceManager address
> akka.tcp://flink@127.0.0.1:36515/user/resourcemanager, retrying in 1
> ms:
> Could not connect to rpc endpoint under address
> akka.tcp://flink@127.0.0.1:36515/user/resourcemanager..
> 2018-08-17 10:47:32,037 WARN
> akka.remote.transport.netty.NettyTransport
> - Remote connection to [null] failed with java.net.ConnectException:
> Connection refused: /127.0.0.1:36515
>
>
>
> PS. : **/etc/hosts** contains the **localhost, Machine1 and Machine2**
>
>
> Can you please tell me how the Task Manager can connect to Job Manager ?
>
> Regards
>
>
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>


Re: connection failed when running flink in a cluster

2018-08-06 Thread miki haiat
Did you start the job manager and the task manager on the same Raspberry Pi?

On Mon, 6 Aug 2018, 12:01 Felipe Gutierrez, 
wrote:

> Hello everyone,
>
> I am trying to run Flink on Raspberry Pis. My first test for word count in
> a single node worked. I just had to decrease the heap memory settings
> jobmanager.heap.mb and taskmanager.heap.mb to 512.
> In my second test I added 2 slave nodes and got the error: "Java HotSpot(TM)
> Client VM warning: G1 GC is disabled in this release." at the file
> log/flink-root-taskexecutor-0-*.out.
>
> This link (
> https://blog.sflow.com/2016/06/raspberry-pi-real-time-network-analytics.html)
> says that in order to Raspberry Pi ARM architecture works with JVM it is
> necessary to configure the JVM as:
> -Xms600M
> -Xmx600M
> -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC
> -XX:+CMSIncrementalMode
>
> then I set this variables on the path inside the file flink-conf.yaml
> env.java.opts: "-XX:+UseParNewGC -XX:+UseConcMarkSweepGC
> -XX:+CMSIncrementalMode"
> env.java.opts.jobmanager: "-XX:+UseParNewGC -XX:+UseConcMarkSweepGC
> -XX:+CMSIncrementalMode"
> env.java.opts.taskmanager: "-XX:+UseParNewGC -XX:+UseConcMarkSweepGC
> -XX:+CMSIncrementalMode"
>
> and the error "Java HotSpot(TM) Client VM warning: G1 GC is disabled in
> this release." is not showing anymore. However, the connection from the
> master node to the slave node is still not possible. Does anybody know how
> I must configure flink to deal with that?
>
> This is the error stack trace:
>
> 2017-05-25 12:40:26,421 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph- Source:
> Socket Stream -> Flat Map (1/1) (b81b6492fc0860367be422d0b0bf4358) switched
> from DEPLOYING to RUNNING.
> 2017-05-25 12:40:26,891 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph- Source:
> Socket Stream -> Flat Map (1/1) (b81b6492fc0860367be422d0b0bf4358) switched
> from RUNNING to FAILED.
> java.net.ConnectException: Connection refused
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
> at
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
> at
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:589)
> at
> org.apache.flink.streaming.api.functions.source.SocketTextStreamFunction.run(SocketTextStreamFunction.java:96)
> at
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
> at
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:56)
> at
> org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:99)
> at
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:306)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703)
> at java.lang.Thread.run(Thread.java:745)
> 2017-05-25 12:40:26,898 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph- Job Socket
> Window WordCount (71c6d7796eccf6587d9d1deda0490e09) switched from state
> RUNNING to FAILING.
> java.net.ConnectException: Connection refused
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
> at
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
> at
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:589)
> at
> org.apache.flink.streaming.api.functions.source.SocketTextStreamFunction.run(SocketTextStreamFunction.java:96)
> at
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
> at
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:56)
> at
> org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:99)
> at
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:306)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703)
> at java.lang.Thread.run(Thread.java:745)
> 2017-05-25 12:40:26,921 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph-
> Window(TumblingProcessingTimeWindows(5000), ProcessingTimeTrigger,
> ReduceFunction$1, PassThroughWindowFunction) -> Sink: Print to Std. Out
> (1/1) (aa1a0e7ee3a1d3ad8f99b2608bd64c5b) switched from RUNNING to CANCELING.
> 2017-05-25 12:40:26,975 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph-
> Window(TumblingProcessingTimeWindows(5000), ProcessingTimeTrigger,
> ReduceFunction$1, PassThroughWindowFunction) -> Sink: Print to Std. Out
> (1/1) (aa1a0e7ee3a1d3ad8f99b2608bd64c5b) switched from CANCELING to
> CANCELED.
>
>
>
> Thanks, Felipe
> *--*
> *-- Felipe Gutierrez*
>
> *-- skype: felipe.o.gutierrez*
> *--* 

Elasticsearch 6.3.x connector

2018-07-16 Thread miki haiat
Hi,

I just wondered what the status of the 6.3.x Elasticsearch connector is.
flink-connector-elasticsearch-base_2.11

has Elasticsearch 6.3.1 dependencies.

The documentation mentions 5.3 as the stable version (elasticsearch.html).



What is the latest stable version of the Elasticsearch connector?


Miki


Re: flink JPS result changes

2018-07-11 Thread miki haiat
FLIP-6 changed the execution model completely:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077
https://docs.google.com/document/d/1zwBP3LKnJ0LI1R4-9hLXBVjOXAkQS5jKrGYYoPz-xRk/edit#heading=h.giuwq6q8d23j



On Wed, Jul 11, 2018 at 5:09 PM Will Du  wrote:

> Hi folks
> Do we have any information about the process changes after v1.5.0? I used
> to see a JobManager and a TaskManager process once start-cluster.sh was
> called. But it shows the below in v1.5.0 once started. Everything works,
> but I have no idea where the JobManager is.
>
> $jps
> 2523 TaskManagerRunner
> 2190 StandaloneSessionClusterEntrypoint
>
> thanks,
> Will


Re: Passing type information to JDBCAppendTableSink

2018-07-01 Thread miki haiat
Can you share the full code?
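For what it's worth, a minimal sketch of the sink definition using the Types helper class consistently for all three columns (same schema as in the question; the Derby URL and credentials are placeholders):

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.io.jdbc.JDBCAppendTableSink;

// SQL_TIMESTAMP for the timestamp column, STRING/LONG for the varchar and bigint columns.
JDBCAppendTableSink pageViewSink = JDBCAppendTableSink.builder()
    .setDrivername("org.apache.derby.jdbc.ClientDriver")
    .setDBUrl("jdbc:derby://captain:1527/mydb")
    .setUsername("foo")
    .setPassword("bar")
    .setBatchSize(1)
    .setQuery("INSERT INTO mydb.pageview_counts (window_end, username, viewcount) VALUES (?, ?, ?)")
    .setParameterTypes(Types.SQL_TIMESTAMP, Types.STRING, Types.LONG)
    .build();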

On Sun, Jul 1, 2018 at 12:49 PM chrisr123  wrote:

>
> I'm trying to determine if I'm specifying type information properly when
> doing an INSERT using
> the JDBCAppendTableSink API.   Specifically, how do I specify timestamp and
> date types? It looks like
> I need to use Types.SQL_TIMESTAMP for a timestamp but BasicTypeInfo for
> types
> like varchar, etc?
>
> I am having trouble finding complete examples. I got this to work below but
> I wanted to confirm I'm
> doing things the correct way?
>
> This is for an append-only into a Derby Database table.
>
> My DDL
> # simple table with a timestamp, varchar, bigint
> create table mydb.pageview_counts
> (window_end timestamp not null,
>  username   varchar(40) not null,
>  viewcount  bigint not null);
>
> My Insert Statement
> // Write Result Table to Sink
> // Configure Sink
> JDBCAppendTableSink pageViewSink = JDBCAppendTableSink.builder()
> .setDrivername("org.apache.derby.jdbc.ClientDriver")
> .setDBUrl("jdbc:myhost://captain:1527/mydb")
> .setUsername("foo")
> .setPassword("bar")
> .setBatchSize(1)
> .setQuery("INSERT INTO mydb.pageview_counts
> (window_end,username,viewcount)
> VALUES (?,?,?)")
>
>
> .setParameterTypes(Types.SQL_TIMESTAMP,BasicTypeInfo.STRING_TYPE_INFO,BasicTypeInfo.LONG_TYPE_INFO)
> .build();
>
>
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>


Writing csv to Hadoop Data stream

2018-06-11 Thread miki haiat
Hi,
I'm trying to stream data to Hadoop as a CSV.

In batch processing I can use HadoopOutputFormat like that (
example/WordCount.java

).

I can't find any way to integrate BucketingSink and HadoopOutputFormat, and I'm
not sure if that is the correct way?

Any suggestion how I can stream CSV to Hadoop?


thanks,

Miki
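One way that seems to work without HadoopOutputFormat is to format each record as a CSV line and let BucketingSink's default StringWriter write the lines to HDFS; a rough sketch (the Event type, its getters and the HDFS path are made up):

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;
import org.apache.flink.streaming.connectors.fs.bucketing.DateTimeBucketer;

// Turn each event into one CSV line, then bucket the lines by time on HDFS.
DataStream<String> csvLines = events.map(new MapFunction<Event, String>() {
    @Override
    public String map(Event e) {
        return e.getId() + "," + e.getCity() + "," + e.getTimestamp();
    }
});

BucketingSink<String> sink = new BucketingSink<>("hdfs://namenode:9000/mycity/csv");
sink.setBucketer(new DateTimeBucketer<String>("yyyy-MM-dd--HH"));  // one directory per hour
// BucketingSink's default writer is a StringWriter, i.e. one line per record.
csvLines.addSink(sink);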


Re: File does not exist prevent from Job manager to start .

2018-06-06 Thread miki haiat
I had some ZooKeeper errors that crashed the cluster:

 ERROR org.apache.flink.shaded.org.apache.curator.ConnectionState
  - Authentication failed

What happens to the Flink checkpoint and state if the ZooKeeper cluster crashes?
Is it possible that the checkpoint/state is written to ZooKeeper but not
to Hadoop, and then when I try to restart the Flink cluster I get the
"file not found" error?


On Mon, Jun 4, 2018 at 4:27 PM Till Rohrmann  wrote:

> Hi Miki,
>
> it looks as if you did not submit a job to the cluster of which you shared
> the logs. At least I could not see a submit job call.
>
> Cheers,
> Till
>
> On Mon, Jun 4, 2018 at 12:31 PM miki haiat  wrote:
>
>> HI Till,
>> Iv`e managed to do  reproduce it.
>> Full log faild_jm.log
>> <https://gist.githubusercontent.com/miko-code/e634164404354c4c590be84292fd8cb2/raw/baeee310cd50cfa79303b328e3334d960c8e98e6/faild_jm.log>
>>
>>
>>
>>
>> On Mon, Jun 4, 2018 at 10:33 AM Till Rohrmann 
>> wrote:
>>
>>> Hmmm, Flink should not delete the stored blobs on the HA storage. Could
>>> you try to reproduce the problem and then send us the logs on DEBUG level?
>>> Please also check before shutting the cluster down, that the files were
>>> there.
>>>
>>> Cheers,
>>> Till
>>>
>>> On Sun, Jun 3, 2018 at 1:10 PM miki haiat  wrote:
>>>
>>>> Hi  Till ,
>>>>
>>>>1. the files are not longer exist in HDFS.
>>>>2. yes , stop and start the cluster from the bin commands.
>>>>3.  unfortunately i deleted the log.. :(
>>>>
>>>>
>>>> I wondered if this code could cause this issue , the way in using
>>>> checkpoint
>>>>
>>>> StateBackend sb = new 
>>>> FsStateBackend("hdfs://***/flink/my_city/checkpoints");
>>>> env.setStateBackend(sb);
>>>> env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.AT_LEAST_ONCE);
>>>> env.getCheckpointConfig().setCheckpointInterval(6);
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Jun 1, 2018 at 6:19 PM Till Rohrmann 
>>>> wrote:
>>>>
>>>>> Hi Miki,
>>>>>
>>>>> could you check whether the files are really no longer stored on HDFS?
>>>>> How did you terminate the cluster? Simply calling `bin/stop-cluster.sh`? I
>>>>> just tried it locally and it could recover the job after calling
>>>>> `bin/start-cluster.sh` again.
>>>>>
>>>>> What would be helpful are the logs from the initial run of the job. So
>>>>> if you can reproduce the problem, then this log would be very helpful.
>>>>>
>>>>> Cheers,
>>>>> Till
>>>>>
>>>>> On Thu, May 31, 2018 at 6:14 PM, miki haiat 
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Im having some wierd issue with the JM recovery ,
>>>>>> I using HDFS and ZOOKEEPER for HA stand alone cluster .
>>>>>>
>>>>>> Iv  stop the cluster change some parameters in the flink conf
>>>>>> (Memory).
>>>>>> But now when i start the cluster again im having an error that
>>>>>> preventing from JM to start.
>>>>>> somehow the checkpoint file doesn't exists in HDOOP  and JM wont
>>>>>> start .
>>>>>>
>>>>>> full log JM log file
>>>>>> <https://gist.github.com/miko-code/28d57b32cb9c4f1aa96fa9873e10e53c>
>>>>>>
>>>>>>
>>>>>>> 2018-05-31 11:57:05,568 ERROR
>>>>>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Fatal error
>>>>>>> occurred in the cluster entrypoint.
>>>>>>
>>>>>> Caused by: java.lang.Exception: Cannot set up the user code
>>>>>> libraries: File does not exist:
>>>>>> /flink1.5/ha/default/blob/job_5c545fc3f43d69325fb9966b8dd4c8f3/blob_p-5d9f3be555d3b05f90b5e148235d25730eb65b3d-ae486e221962f7b96e36da18fe1c57ca
>>>>>> at
>>>>>> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:72)
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>


Re: Writing stream to Hadoop

2018-06-05 Thread miki haiat
OMG, I missed it...

Thanks,

Miki

On Tue, Jun 5, 2018 at 1:30 PM Chesnay Schepler  wrote:

> This particular version of the method is deprecated, use 
> enableCheckpointing(long
> checkpointingInterval) instead.
>
> On 05.06.2018 12:19, miki haiat wrote:
>
> I saw the option of enabling checkpoint
> enabling-and-configuring-checkpointing
> <https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/stream/checkpointing.html#enabling-and-configuring-checkpointing>
>
> But on 1.5 it said that the method is deprecated so im a bit confused .
>
> /** @deprecated */
> @Deprecated
> @PublicEvolving
> public StreamExecutionEnvironment enableCheckpointing() {
> this.checkpointCfg.setCheckpointInterval(500L);
> return this;
> }
>
>
>
>
> On Tue, Jun 5, 2018 at 1:11 PM Kostas Kloudas 
> wrote:
>
>> Hi Miki,
>>
>> Have you enabled checkpointing?
>>
>> Kostas
>>
>> On Jun 5, 2018, at 11:14 AM, miki haiat  wrote:
>>
>> Im trying to write some data to Hadoop by using this code
>>
>> The state backend is set without time
>>
>> StateBackend sb = new
>> FsStateBackend("hdfs://***:9000/flink/my_city/checkpoints");
>> env.setStateBackend(sb);
>>
>> BucketingSink> sink =
>> new BucketingSink<>("hdfs://:9000/mycity/raw");
>> sink.setBucketer(new DateTimeBucketer("-MM-dd--HHmm"));
>> sink.setInactiveBucketCheckInterval(12);
>> sink.setInactiveBucketThreshold(12);
>>
>> the result is that all the files are stuck in "in-progress" status and
>> not closed.
>> Is it related to the state backend configuration?
>>
>> thanks,
>>
>> Miki
>>
>>
>>
>


Re: Writing stream to Hadoop

2018-06-05 Thread miki haiat
I saw the option of enabling checkpoint
enabling-and-configuring-checkpointing
<https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/stream/checkpointing.html#enabling-and-configuring-checkpointing>

But on 1.5 it says that the method is deprecated, so I'm a bit confused.

/** @deprecated */
@Deprecated
@PublicEvolving
public StreamExecutionEnvironment enableCheckpointing() {
this.checkpointCfg.setCheckpointInterval(500L);
return this;
}




On Tue, Jun 5, 2018 at 1:11 PM Kostas Kloudas 
wrote:

> Hi Miki,
>
> Have you enabled checkpointing?
>
> Kostas
>
> On Jun 5, 2018, at 11:14 AM, miki haiat  wrote:
>
> Im trying to write some data to Hadoop by using this code
>
> The state backend is set without time
>
> StateBackend sb = new 
> FsStateBackend("hdfs://***:9000/flink/my_city/checkpoints");
> env.setStateBackend(sb);
>
> BucketingSink> sink =
> new BucketingSink<>("hdfs://:9000/mycity/raw");
> sink.setBucketer(new DateTimeBucketer("-MM-dd--HHmm"));
> sink.setInactiveBucketCheckInterval(12);
> sink.setInactiveBucketThreshold(12);
>
> the result is that all the files are stuck in "in-progress" status and
> not closed.
> Is it related to the state backend configuration?
>
> thanks,
>
> Miki
>
>
>


Writing stream to Hadoop

2018-06-05 Thread miki haiat
I'm trying to write some data to Hadoop by using this code.

The state backend is set without time

StateBackend sb = new
FsStateBackend("hdfs://***:9000/flink/my_city/checkpoints");
env.setStateBackend(sb);

BucketingSink> sink =
new BucketingSink<>("hdfs://:9000/mycity/raw");
sink.setBucketer(new DateTimeBucketer("-MM-dd--HHmm"));
sink.setInactiveBucketCheckInterval(12);
sink.setInactiveBucketThreshold(12);

the result is that all the files are stuck in "in-progress" status and not
closed.
Is it related to the state backend configuration?

thanks,

Miki
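As the replies to this thread point out, BucketingSink only finalizes its part files when a checkpoint completes, so with checkpointing disabled they tend to stay in-progress/pending forever. A one-line sketch using the non-deprecated variant (the 60-second interval is arbitrary):

// Enable checkpointing so the BucketingSink can roll its part files to a final state.
env.enableCheckpointing(60_000);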


Re: TaskManager use more memory than Xmx

2018-06-04 Thread miki haiat
Which Flink version?
I had the same issue on 1.4.2.



On Mon, Jun 4, 2018 at 2:14 PM Szczypiński, S. (Szymon) <
szymon.szczypin...@ingbank.pl> wrote:

> Hi,
>
> I have a problem with TaskManagers in my standalone cluster.
>
>
>
> My problem is that my host has 32GB of RAM. The TaskManager has Xms and
> Xmx set to 20GB, but when the TaskManager is working it uses more memory than
> the host has and starts to consume swap. In the end the system kills the java process.
>
>
>
> Maybe someone knows why the Flink TaskManager uses so much more memory than is
> set by the Xmx property.
>
>
>
> Best regards
>
> Simon
>
>
>
>
>
>
>
> *Szymon Szczypiński*
> Projektant IT
>
> Departament Aplikacji Centralnych
>
>
>
> *E *szymon.szczypin...@ingbank.pl
>
>
>
>
>
> ING Bank Śląski S.A.
>
> ul. Sokolska 34, 40-086 Katowice
>
>
>
> się czy musisz drukować tego maila. / Do you really need to print this mail?
>
>
> --
>
>
> Niniejsza wiadomość zawiera informacje poufne, przeznaczone do wyłącznego
> użytku adresata. Jeśli nie są Państwo adresatem przesyłki lub jeśli
> otrzymaliście Państwo ten dokument omyłkowo, prosimy o bezzwłoczne
> skontaktowanie się z nadawcą. Wszelkie rozpowszechnianie, dystrybucja,
> reprodukcja, kopiowanie, publikacja lub wykorzystanie tej wiadomości czy
> też zawartych w niej informacji przez osobę inną niż adresat jest
> niedozwolone i może spowodować odpowiedzialność prawną.
>
>
>
>
> Niniejsza wiadomość zawiera informacje poufne, prawnie chronione, 
> przeznaczone do wyłącznego użytku adresata. Jeśli nie są Państwo adresatem 
> przesyłki lub jeśli otrzymaliście Państwo ten dokument omyłkowo, prosimy o 
> bezzwłoczne skontaktowanie się z nadawcą. Wszelkie rozpowszechnianie, 
> dystrybucja, reprodukcja, kopiowanie, publikacja lub wykorzystanie tej 
> wiadomości czy też zawartych w niej informacji przez osobę inną niż adresat 
> jest niedozwolone i może skutkować odpowiedzialnością prawną.
>
>
> ---
> The information in this electronic mail message is confidential, legally 
> privileged, and only intended for the addressee. Should you receive this 
> message by mistake, please contact the sender immediately. Any disclosure, 
> reproduction, distribution or use of this message is strictly prohibited and 
> illegal.
>
>
>
>


Re: File does not exist prevent from Job manager to start .

2018-06-04 Thread miki haiat
Hi Till,
I've managed to reproduce it.
Full log faild_jm.log
<https://gist.githubusercontent.com/miko-code/e634164404354c4c590be84292fd8cb2/raw/baeee310cd50cfa79303b328e3334d960c8e98e6/faild_jm.log>




On Mon, Jun 4, 2018 at 10:33 AM Till Rohrmann  wrote:

> Hmmm, Flink should not delete the stored blobs on the HA storage. Could
> you try to reproduce the problem and then send us the logs on DEBUG level?
> Please also check before shutting the cluster down, that the files were
> there.
>
> Cheers,
> Till
>
> On Sun, Jun 3, 2018 at 1:10 PM miki haiat  wrote:
>
>> Hi  Till ,
>>
>>1. the files are not longer exist in HDFS.
>>2. yes , stop and start the cluster from the bin commands.
>>3.  unfortunately i deleted the log.. :(
>>
>>
>> I wondered if this code could cause this issue , the way in using
>> checkpoint
>>
>> StateBackend sb = new FsStateBackend("hdfs://***/flink/my_city/checkpoints");
>> env.setStateBackend(sb);
>> env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.AT_LEAST_ONCE);
>> env.getCheckpointConfig().setCheckpointInterval(6);
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On Fri, Jun 1, 2018 at 6:19 PM Till Rohrmann 
>> wrote:
>>
>>> Hi Miki,
>>>
>>> could you check whether the files are really no longer stored on HDFS?
>>> How did you terminate the cluster? Simply calling `bin/stop-cluster.sh`? I
>>> just tried it locally and it could recover the job after calling
>>> `bin/start-cluster.sh` again.
>>>
>>> What would be helpful are the logs from the initial run of the job. So
>>> if you can reproduce the problem, then this log would be very helpful.
>>>
>>> Cheers,
>>> Till
>>>
>>> On Thu, May 31, 2018 at 6:14 PM, miki haiat  wrote:
>>>
>>>> Hi,
>>>>
>>>> Im having some wierd issue with the JM recovery ,
>>>> I using HDFS and ZOOKEEPER for HA stand alone cluster .
>>>>
>>>> Iv  stop the cluster change some parameters in the flink conf (Memory).
>>>> But now when i start the cluster again im having an error that
>>>> preventing from JM to start.
>>>> somehow the checkpoint file doesn't exists in HDOOP  and JM wont start .
>>>>
>>>> full log JM log file
>>>> <https://gist.github.com/miko-code/28d57b32cb9c4f1aa96fa9873e10e53c>
>>>>
>>>>
>>>>> 2018-05-31 11:57:05,568 ERROR
>>>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Fatal error
>>>>> occurred in the cluster entrypoint.
>>>>
>>>> Caused by: java.lang.Exception: Cannot set up the user code libraries:
>>>> File does not exist:
>>>> /flink1.5/ha/default/blob/job_5c545fc3f43d69325fb9966b8dd4c8f3/blob_p-5d9f3be555d3b05f90b5e148235d25730eb65b3d-ae486e221962f7b96e36da18fe1c57ca
>>>> at
>>>> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:72)
>>>>
>>>>
>>>>
>>>>
>>>


Re: File does not exist prevent from Job manager to start .

2018-06-03 Thread miki haiat
Hi Till,

   1. The files no longer exist in HDFS.
   2. Yes, stop and start the cluster from the bin commands.
   3. Unfortunately I deleted the log... :(


I wondered if this code could cause the issue, the way I'm using
checkpoints:

StateBackend sb = new FsStateBackend("hdfs://***/flink/my_city/checkpoints");
env.setStateBackend(sb);
env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.AT_LEAST_ONCE);
env.getCheckpointConfig().setCheckpointInterval(6);










On Fri, Jun 1, 2018 at 6:19 PM Till Rohrmann  wrote:

> Hi Miki,
>
> could you check whether the files are really no longer stored on HDFS? How
> did you terminate the cluster? Simply calling `bin/stop-cluster.sh`? I just
> tried it locally and it could recover the job after calling
> `bin/start-cluster.sh` again.
>
> What would be helpful are the logs from the initial run of the job. So if
> you can reproduce the problem, then this log would be very helpful.
>
> Cheers,
> Till
>
> On Thu, May 31, 2018 at 6:14 PM, miki haiat  wrote:
>
>> Hi,
>>
>> Im having some wierd issue with the JM recovery ,
>> I using HDFS and ZOOKEEPER for HA stand alone cluster .
>>
>> Iv  stop the cluster change some parameters in the flink conf (Memory).
>> But now when i start the cluster again im having an error that preventing
>> from JM to start.
>> somehow the checkpoint file doesn't exists in HDOOP  and JM wont start .
>>
>> full log JM log file
>> <https://gist.github.com/miko-code/28d57b32cb9c4f1aa96fa9873e10e53c>
>>
>>
>>> 2018-05-31 11:57:05,568 ERROR
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Fatal error
>>> occurred in the cluster entrypoint.
>>
>> Caused by: java.lang.Exception: Cannot set up the user code libraries:
>> File does not exist:
>> /flink1.5/ha/default/blob/job_5c545fc3f43d69325fb9966b8dd4c8f3/blob_p-5d9f3be555d3b05f90b5e148235d25730eb65b3d-ae486e221962f7b96e36da18fe1c57ca
>> at
>> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:72)
>>
>>
>>
>>
>


File does not exist prevent from Job manager to start .

2018-05-31 Thread miki haiat
Hi,

I'm having some weird issue with the JM recovery.
I'm using HDFS and ZooKeeper for an HA standalone cluster.

I stopped the cluster and changed some parameters in the Flink conf (memory).
But now when I start the cluster again I'm having an error that prevents
the JM from starting.
Somehow the checkpoint file doesn't exist in Hadoop and the JM won't start.

full log JM log file



> 2018-05-31 11:57:05,568 ERROR
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Fatal error
> occurred in the cluster entrypoint.

Caused by: java.lang.Exception: Cannot set up the user code libraries: File
does not exist:
/flink1.5/ha/default/blob/job_5c545fc3f43d69325fb9966b8dd4c8f3/blob_p-5d9f3be555d3b05f90b5e148235d25730eb65b3d-ae486e221962f7b96e36da18fe1c57ca
at
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:72)


HA stand alone cluster error

2018-05-29 Thread miki haiat
I had some catastrophic error:

>
>  ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
> Fatal error occurred in the cluster entrypoint.
> org.apache.flink.util.FlinkException: Failed to recover job
> a048ad572c9837a400eca20cd55241b6.
> File does not exist:
> /flink_1.5/ha/beam1/blob/job_a048ad572c9837a400eca20cd55241b6/blob_p-45d544ca331844235e4f09e2a738b4de38a3bb0a-5dc3a8cbc69f56d9c824a7a4fddc131d



I was unable to start the cluster again,
so I removed all the data from Hadoop and cleaned ZooKeeper in order to be able
to start the cluster again.

But now I have this error:

2018-05-29 03:51:54,082 ERROR
> org.apache.flink.runtime.dispatcher.StandaloneDispatcher  - Could not
> recover job graph for job e3369e6dce5305b9411b4695975eea26.
> org.apache.flink.util.FlinkException: Could not retrieve submitted
> JobGraph from state handle under /e3369e6dce5305b9411b4695975eea26. This
> indicates that the retrieved state handle is broken. Try cleaning the state
> handle store.


How can I clean the state and bring back the cluster?

Thanks,

Miki


Re: data enrichment with SQL use case

2018-05-15 Thread miki haiat
Replicate the meta data into Flink state and join the streamed
>>>> records with the state. This solution is more complex because you need
>>>> propagate updates of the meta data (if there are any) into the Flink state.
>>>> At the moment, Flink lacks a few features to have a good implementation of
>>>> this approach, but there a some workarounds that help in certain cases.
>>>>
>>>> Note that Flink's SQL support does not add advantages for the either of
>>>> both approaches. You should use the DataStream API (and possible
>>>> ProcessFunctions).
>>>>
>>>> I'd go for the first approach if one query per record is feasible.
>>>> Let me know if you need to tackle the second approach and I can give
>>>> some details on the workarounds I mentioned.
>>>>
>>>> Best, Fabian
>>>>
>>>> 2018-04-16 20:38 GMT+02:00 Ken Krugler <kkrugler_li...@transpac.com>:
>>>>
>>>>> Hi Miki,
>>>>>
>>>>> I haven’t tried mixing AsyncFunctions with SQL queries.
>>>>>
>>>>> Normally I’d create a regular DataStream workflow that first reads
>>>>> from Kafka, then has an AsyncFunction to read from the SQL database.
>>>>>
>>>>> If there are often duplicate keys in the Kafka-based stream, you could
>>>>> keyBy(key) before the AsyncFunction, and then cache the result of the SQL
>>>>> query.
>>>>>
>>>>> — Ken
>>>>>
>>>>> On Apr 16, 2018, at 11:19 AM, miki haiat <miko5...@gmail.com> wrote:
>>>>>
>>>>> Hi, thanks for the reply. I will try to break your reply into the flow
>>>>> execution order.
>>>>>
>>>>> The first data stream will use AsyncIO and select the table,
>>>>> the second stream will be Kafka, and then I can join the streams and map them?
>>>>>
>>>>> If that is the case, then I will select the table only once on load?
>>>>> How can I make sure that my stream table is "fresh"?
>>>>>
>>>>> I'm thinking to myself, is there a way to use the Flink backend (RocksDB)
>>>>> and create a read/write-through
>>>>> mechanism?
>>>>>
>>>>> Thanks
>>>>>
>>>>> miki
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Apr 16, 2018 at 2:45 AM, Ken Krugler <
>>>>> kkrugler_li...@transpac.com> wrote:
>>>>>
>>>>>> If the SQL data is all (or mostly all) needed to join against the
>>>>>> data from Kafka, then I might try a regular join.
>>>>>>
>>>>>> Otherwise it sounds like you want to use an AsyncFunction to do ad
>>>>>> hoc queries (in parallel) against your SQL DB.
>>>>>>
>>>>>>
>>>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/asyncio.html
>>>>>>
>>>>>> — Ken
>>>>>>
>>>>>>
>>>>>> On Apr 15, 2018, at 12:15 PM, miki haiat <miko5...@gmail.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have a case of meta data enrichment and im wondering if my approach
>>>>>> is the correct way .
>>>>>>
>>>>>>1. input stream from kafka.
>>>>>>2. MD in msSQL .
>>>>>>3. map to new pojo
>>>>>>
>>>>>> I need to extract  a key from the kafka stream   and use it to select
>>>>>> some values from the sql table  .
>>>>>>
>>>>>> SO i thought  to use  the table SQL api in order to select the table
>>>>>> MD
>>>>>> then convert the kafka stream to table and join the data by  the
>>>>>> stream key .
>>>>>>
>>>>>> At the end i need to map the joined data to a new POJO and send it to
>>>>>> elesticserch .
>>>>>>
>>>>>> Any suggestions or different ways to solve this use case ?
>>>>>>
>>>>>> thanks,
>>>>>> Miki
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Ken Krugler
>>>>>> http://www.scaleunlimited.com
>>>>>> custom big data solutions & training
>>>>>> Hadoop, Cascading, Cassandra & Solr
>>>>>>
>>>>>>
>>>>>
>>>>> 
>>>>> http://about.me/kkrugler
>>>>> +1 530-210-6378 <(530)%20210-6378>
>>>>>
>>>>>
>>>>
>>
>> 
>> http://about.me/kkrugler
>> +1 530-210-6378
>>
>>
>>
>> 
>> http://about.me/kkrugler
>> +1 530-210-6378
>>
>>
>>
>> 
>> http://about.me/kkrugler
>> +1 530-210-6378
>>
>>
>
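A compact sketch of the AsyncFunction approach discussed in this thread; KafkaEvent, EnrichedEvent and MetaDataClient are hypothetical placeholders, and the real work is wiring an actual SQL/metadata lookup into asyncInvoke:

import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

public class MetaDataEnricher extends RichAsyncFunction<KafkaEvent, EnrichedEvent> {

    private transient MetaDataClient client;              // hypothetical lookup client

    @Override
    public void open(Configuration parameters) {
        client = MetaDataClient.connect("jdbc:...");      // placeholder connection
    }

    @Override
    public void asyncInvoke(KafkaEvent event, ResultFuture<EnrichedEvent> resultFuture) {
        CompletableFuture
            .supplyAsync(() -> client.lookup(event.getKey()))           // query the meta data
            .thenAccept(meta -> resultFuture.complete(
                Collections.singleton(new EnrichedEvent(event, meta))));
    }
}

// usage: at most 100 in-flight lookups, 1 second timeout per record
DataStream<EnrichedEvent> enriched = AsyncDataStream.unorderedWait(
        kafkaStream, new MetaDataEnricher(), 1000, TimeUnit.MILLISECONDS, 100);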


Default zookeeper

2018-05-13 Thread miki haiat
When downloading the Flink source in order to run it locally, there is a
zookeeper script and a start-zookeeper-quorum script.

Is there any difference between a default ZooKeeper installation, let's say
on Ubuntu, and the ZooKeeper that comes with Flink?

thanks,

MIki


JM/TM Disassociated

2018-04-29 Thread miki haiat
Hi,

I have a simple Elasticsearch sink on Mesos with 6 TMs,
but for some reason after a few hours the JM/TM is disassociated and kills
all the other TMs as well.

This is where everything collapses:

2018-04-29 08:21:00,457 WARN  akka.remote.ReliableDeliverySupervisor
 - Association with remote system
[akka.tcp://flink@10.1.70.116:6123] has failed, address is now gated
for [5000] ms. Reason: [Disassociated]
2018-04-29 08:21:06,501 WARN
akka.remote.transport.netty.NettyTransport- Remote
connection to [null] failed with java.net.ConnectException: Connection
refused: /10.1.70.116:6123






2018-04-29 08:21:00,457 WARN  akka.remote.ReliableDeliverySupervisor
 - Association with remote system
[akka.tcp://flink@10.1.70.116:6123] has failed, address is now gated
for [5000] ms. Reason: [Disassociated]
> 2018-04-29 08:21:06,501 WARN  akka.remote.transport.netty.NettyTransport  
>   - Remote connection to [null] failed with 
> java.net.ConnectException: Connection refused: /10.1.70.116:6123
> 2018-04-29 08:21:06,503 WARN  akka.remote.ReliableDeliverySupervisor  
>   - Association with remote system 
> [akka.tcp://flink@10.1.70.116:6123] has failed, address is now gated for 
> [5000] ms. Reason: [Association failed with 
> [akka.tcp://flink@10.1.70.116:6123]] Caused by: [Connection refused: 
> /10.1.70.116:6123]
> 2018-04-29 08:21:16,501 WARN  akka.remote.transport.netty.NettyTransport  
>   - Remote connection to [null] failed with 
> java.net.ConnectException: Connection refused: /10.1.70.116:6123
> 2018-04-29 08:21:16,504 WARN  akka.remote.ReliableDeliverySupervisor  
>   - Association with remote system 
> [akka.tcp://flink@10.1.70.116:6123] has failed, address is now gated for 
> [5000] ms. Reason: [Association failed with 
> [akka.tcp://flink@10.1.70.116:6123]] Caused by: [Connection refused: 
> /10.1.70.116:6123]
> 2018-04-29 08:21:26,500 WARN  akka.remote.transport.netty.NettyTransport  
>   - Remote connection to [null] failed with 
> java.net.ConnectException: Connection refused: /10.1.70.116:6123
> 2018-04-29 08:21:26,501 WARN  akka.remote.ReliableDeliverySupervisor  
>   - Association with remote system 
> [akka.tcp://flink@10.1.70.116:6123] has failed, address is now gated for 
> [5000] ms. Reason: [Association failed with 
> [akka.tcp://flink@10.1.70.116:6123]] Caused by: [Connection refused: 
> /10.1.70.116:6123]
> 2018-04-29 08:21:31,124 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - 
> TaskManager akka://flink/user/taskmanager disconnects from JobManager 
> akka.tcp://flink@10.1.70.116:6123/user/jobmanager: Old JobManager lost its 
> leadership.
> 2018-04-29 08:21:31,124 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - 
> Cancelling all computations and discarding all cached data.
> 2018-04-29 08:21:31,125 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Attempting to fail task externally Source: Custom Source -> 
> Sink: Unnamed (1/1) (8468d2974c12c77d4ddbbb4e48ed2eae).
> 2018-04-29 08:21:31,127 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Source: Custom Source -> Sink: Unnamed (1/1) 
> (8468d2974c12c77d4ddbbb4e48ed2eae) switched from RUNNING to FAILED.
> java.lang.Exception: TaskManager akka://flink/user/taskmanager disconnects 
> from JobManager akka.tcp://flink@10.1.70.116:6123/user/jobmanager: Old 
> JobManager lost its leadership.
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.handleJobManagerDisconnect(TaskManager.scala:1073)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$handleJobManagerLeaderAddress(TaskManager.scala:1467)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:277)
>   at 
> scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
>   at 
> org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:49)
>   at 
> scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
>   at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
>   at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:121)
>   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
>   at akka.actor.ActorCell.invoke(ActorCell.scala:495)
>   at 

elasticsearch5 , java.lang.NoClassDefFoundError on mesos

2018-04-24 Thread miki haiat
Hi ,

I'm having some weird issue when running a streaming job to ELK.

The job starts fine, but after a few hours I'm getting this exception and
the TM/JM crashes.



This is the config for the Elasticsearch sink; maybe the 1 sec flush can
cause the deadlock??

config.put("bulk.flush.max.actions", "20");
config.put("bulk.flush.interval.ms", "1000");






>
> Exception in thread
> "elasticsearch[_client_][transport_client_worker][T#3]"
> java.lang.NoClassDefFoundError:
> org/apache/flink/streaming/connectors/elasticsearch5/shaded/org/jboss/netty/util/internal/ByteBufferUtil
> at
> org.apache.flink.streaming.connectors.elasticsearch5.shaded.org.jboss.netty.channel.socket.nio.SocketSendBufferPool.releaseExternalResources(SocketSendBufferPool.java:380)
> at
> org.apache.flink.streaming.connectors.elasticsearch5.shaded.org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
> at
> org.apache.flink.streaming.connectors.elasticsearch5.shaded.org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
> at
> org.apache.flink.streaming.connectors.elasticsearch5.shaded.org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
> at
> org.apache.flink.streaming.connectors.elasticsearch5.shaded.org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)


Trigger state clear

2018-04-24 Thread miki haiat
Hi
I have some issue, possibly a memory issue, that is causing the task manager to
crash.

full code :
https://gist.github.com/miko-code/6d7010505c3cb95be122364b29057237

I defined FIRE_AND_PURGE on each element and also an evictor, so the state should be
very small...

Any suggestion on how to figure out this issue?

Thanks,


Miki


Re: How to run flip-6 on mesos

2018-04-24 Thread miki haiat
It's 1.4.2...
Any approximate date for the 1.5 release?

Thanks a lot for your help.



On Tue, Apr 24, 2018 at 10:39 AM, Gary Yao <g...@data-artisans.com> wrote:

> Hi Miki,
>
> The stacktrace you posted looks familiar [1]. We have fixed the issue in
> Flink
> 1.5. What is the Flink version you are using? FLIP-6 before Flink 1.5 is
> very
> experimental, and I doubt that it is in a usable state. Since 1.5 is not
> out
> yet, you can either compile the release branch yourself, or use the RC1
> binaries [2]. If you are already on 1.5, please open a ticket.
>
> Best,
> Gary
>
> [1] https://issues.apache.org/jira/browse/FLINK-8176
> [2] http://people.apache.org/~trohrmann/flink-1.5.0-rc1/
>
>
> On Tue, Apr 24, 2018 at 9:27 AM, miki haiat <miko5...@gmail.com> wrote:
>
>> The problem is that the Web UI hasn't started at all.
>> I'm using the same config file that I used for non-FLIP-6; is that OK?
>> Also, I got this error in the logs.
>>
>>
>> 2018-04-24 10:16:05,466 ERROR 
>> org.apache.flink.runtime.dispatcher.StandaloneDispatcher
>>> - Could not recover the job graph for 4ac6ed0270bf6836941285ffcb9eb9
>>> c6.
>>> java.lang.IllegalStateException: Not running. Forgot to call start()?
>>> at org.apache.flink.util.Preconditions.checkState(Preconditions
>>> .java:195)
>>> at org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGra
>>> phStore.verifyIsRunning(ZooKeeperSubmittedJobGraphStore.java:411)
>>> at org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGra
>>> phStore.recoverJobGraph(ZooKeeperSubmittedJobGraphStore.java:167)
>>> at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$recove
>>> rJobs$3(Dispatcher.java:445)
>>> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
>>> at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.
>>> exec(AbstractDispatcher.scala:415)
>>> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.j
>>> ava:260)
>>> at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(For
>>> kJoinPool.java:1339)
>>> at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPoo
>>> l.java:1979)
>>> at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinW
>>> orkerThread.java:107)
>>> 2018-04-24 10:16:05,469 ERROR 
>>> org.apache.flink.runtime.dispatcher.StandaloneDispatcher
>>> - Could not recover the job graph for 700f37540fe95787510dfa2bc0cc5a
>>> c3.
>>> java.lang.IllegalStateException: Not running. Forgot to call start()?
>>> at org.apache.flink.util.Preconditions.checkState(Preconditions
>>> .java:195)
>>> at org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGra
>>> phStore.verifyIsRunning(ZooKeeperSubmittedJobGraphStore.java:411)
>>> at org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGra
>>> phStore.recoverJobGraph(ZooKeeperSubmittedJobGraphStore.java:167)
>>> at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$recove
>>> rJobs$3(Dispatcher.java:445)
>>> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
>>> at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.
>>> exec(AbstractDispatcher.scala:415)
>>> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.j
>>> ava:260)
>>> at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(For
>>> kJoinPool.java:1339)
>>> at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPoo
>>> l.java:1979)
>>> at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinW
>>> orkerThread.java:107)
>>
>>
>>
>> On Tue, Apr 24, 2018 at 10:06 AM, Gary Yao <g...@data-artisans.com>
>> wrote:
>>
>>> Hi Miki,
>>>
>>> IIRC the port on which the Web UI is listening is not allocated
>>> dynamically when
>>> deploying on Mesos, and should be 8081 by default (you can override the
>>> default
>>> by setting rest.port in flink-conf.yaml). If you can find out the
>>> hostname/IP of
>>> the JobManager, you can submit as usual via the Web UI. Alternatively
>>> you can
>>> use the CLI, e.g.,
>>>
>>>   bin/flink run -m hostname:6123 examples/streaming/WordCount.jar
>>>
>>> where 6123 is the jobmanager.rpc.port.
>>>
>>> Let me know if any of these work for you
>>>

Re: How to run flip-6 on mesos

2018-04-24 Thread miki haiat
No :) ...
I usually use the web UI.
Can you refer me to an example of how to submit a job?
Using REST? To which port?

thanks,

miki

On Tue, Apr 24, 2018 at 9:42 AM, Gary Yao <g...@data-artisans.com> wrote:

> Hi Miki,
>
> Did you try to submit a job? With the introduction of FLIP-6, resources are
> allocated dynamically.
>
> Best,
> Gary
>
>
> On Tue, Apr 24, 2018 at 8:31 AM, miki haiat <miko5...@gmail.com> wrote:
>
>>
>> Hi,
>> I'm trying to run FLIP-6 on Mesos, but it's not clear to me what the
>> correct way to do it is.
>>
>> I ran the session script and I can see that a new framework has been
>> created in Mesos, but the task manager hasn't been created;
>> running taskmanager-flip6.sh throws a null pointer...
>>
>> What is the correct way to run FLIP-6?
>>
>>
>> thanks,
>>
>> miki
>>
>
>


How to run flip-6 on mesos

2018-04-24 Thread miki haiat
Hi,
I'm trying to run FLIP-6 on Mesos, but it's not clear to me what the
correct way to do it is.

I ran the session script and I can see that a new framework has been created
in Mesos, but the task manager hasn't been created;
running taskmanager-flip6.sh throws a null pointer...

What is the correct way to run FLIP-6?


thanks,

miki


Re: data enrichment with SQL use case

2018-04-16 Thread miki haiat
Hi, thanks for the reply. I will try to break your reply down into the flow
execution order.

The first data stream will use AsyncIO and select from the table,
the second stream will be Kafka, and then I can join the streams and map the result?

If that is the case, will I select the table only once on load?
How can I make sure that my stream table is "fresh"?

I'm also wondering: is there a way to use the Flink state backend (RocksDB) and
create a read/write-through
mechanism?

Thanks

miki



On Mon, Apr 16, 2018 at 2:45 AM, Ken Krugler <kkrugler_li...@transpac.com>
wrote:

> If the SQL data is all (or mostly all) needed to join against the data
> from Kafka, then I might try a regular join.
>
> Otherwise it sounds like you want to use an AsyncFunction to do ad hoc
> queries (in parallel) against your SQL DB.
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/
> operators/asyncio.html
>
> — Ken
>
>
> On Apr 15, 2018, at 12:15 PM, miki haiat <miko5...@gmail.com> wrote:
>
> Hi,
>
> I have a case of metadata enrichment and I'm wondering if my approach is
> the correct way.
>
>1. Input stream from Kafka.
>2. Metadata (MD) in MS SQL.
>3. Map to a new POJO.
>
> I need to extract a key from the Kafka stream and use it to select some
> values from the SQL table.
>
> So I thought to use the Table/SQL API to select the MD table,
> then convert the Kafka stream to a table and join the data by the stream
> key.
>
> At the end I need to map the joined data to a new POJO and send it to
> Elasticsearch.
>
> Any suggestions or different ways to solve this use case?
>
> thanks,
> Miki
>
>
>
>
> --
> Ken Krugler
> http://www.scaleunlimited.com
> custom big data solutions & training
> Hadoop, Cascading, Cassandra & Solr
>
>
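
For reference, a minimal sketch of the AsyncFunction lookup approach discussed in this
thread, assuming a hypothetical JDBC query against the MS SQL metadata table. The
Event/EnrichedEvent types, the connection URL, the query and the timeout values are
illustrative assumptions, not taken from the thread, and in some older Flink releases the
callback type is named AsyncCollector rather than ResultFuture:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

// Looks up metadata for each incoming event against the SQL DB, asynchronously.
public class SqlMetadataLookup extends RichAsyncFunction<Event, EnrichedEvent> {

    private transient Connection connection;

    @Override
    public void open(Configuration parameters) throws Exception {
        // assumption: a plain JDBC connection; a pooled or truly async client is preferable
        connection = DriverManager.getConnection("jdbc:sqlserver://md-host;databaseName=md");
    }

    @Override
    public void asyncInvoke(Event input, ResultFuture<EnrichedEvent> resultFuture) {
        CompletableFuture
                .supplyAsync(() -> lookup(input.getKey()))
                .thenAccept(md -> resultFuture.complete(
                        Collections.singleton(new EnrichedEvent(input, md))));
    }

    private String lookup(String key) {
        try (PreparedStatement ps =
                     connection.prepareStatement("SELECT meta FROM md_table WHERE id = ?")) {
            ps.setString(1, key);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}

// Wiring into the job: ordered results, 1 second timeout, at most 100 lookups in flight.
DataStream<EnrichedEvent> enriched = AsyncDataStream.orderedWait(
        kafkaStream, new SqlMetadataLookup(), 1, TimeUnit.SECONDS, 100);

With this pattern the lookup is done per element, so the metadata is always read fresh
from the database instead of being loaded once at job start.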


data enrichment with SQL use case

2018-04-15 Thread miki haiat
Hi,

I have a case of metadata enrichment and I'm wondering if my approach is
the correct way.

   1. Input stream from Kafka.
   2. Metadata (MD) in MS SQL.
   3. Map to a new POJO.

I need to extract a key from the Kafka stream and use it to select some
values from the SQL table.

So I thought to use the Table/SQL API to select the MD table,
then convert the Kafka stream to a table and join the data by the stream key.

At the end I need to map the joined data to a new POJO and send it to
Elasticsearch.

Any suggestions or different ways to solve this use case?

thanks,
Miki


Re: Re: java.lang.Exception: TaskManager was lost/killed

2018-04-09 Thread miki haiat
Javier
"adding the jar file to the /lib path of every task manager"
are you moving the job jar to the ~/flink-1.4.2/lib path?

On Mon, Apr 9, 2018 at 12:23 PM, Javier Lopez 
wrote:

> Hi,
>
> We had the same metaspace problem, it was solved by adding the jar file to
> the /lib path of every task manager, as explained here
> https://ci.apache.org/projects/flink/flink-docs-release-1.4/monitoring/
> debugging_classloading.html#avoiding-dynamic-classloading. As well we
> added these java options: "-XX:CompressedClassSpaceSize=100M
> -XX:MaxMetaspaceSize=300M -XX:MetaspaceSize=200M "
>
> From time to time we have the same problem with TaskManagers
> disconnecting, but the logs are not useful. We are using 1.3.2.
>
> On 9 April 2018 at 10:41, Alexander Smirnov 
> wrote:
>
>> I've seen similar problem, but it was not a heap size, but Metaspace.
>> It was caused by a job restarting in a loop. Looks like for each restart,
>> Flink loads new instance of classes and very soon in runs out of metaspace.
>>
>> I've created a JIRA issue for this problem, but got no response from the
>> development team on it: https://issues.apache.org/jira/browse/FLINK-9132
>>
>>
>> On Mon, Apr 9, 2018 at 11:36 AM 王凯  wrote:
>>
>>> thanks a lot,i will try it
>>>
>>> 在 2018-04-09 00:06:02,"TechnoMage"  写道:
>>>
>>> I have seen this when my task manager ran out of RAM.  Increase the heap
>>> size.
>>>
>>> flink-conf.yaml:
>>> taskmanager.heap.mb
>>> jobmanager.heap.mb
>>>
>>> Michael
>>>
>>> On Apr 8, 2018, at 2:36 AM, 王凯  wrote:
>>>
>>> 
>>> hi all, recently, i found a problem,it runs well when start. But after
>>> long run,the exception display as above,how can resolve it?
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>
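
For reference, both suggestions from this thread end up in flink-conf.yaml; a minimal
sketch (the heap sizes are illustrative, the JVM options are the ones quoted above):

# flink-conf.yaml (sketch; adjust sizes to your setup)
jobmanager.heap.mb: 1024
taskmanager.heap.mb: 2048
env.java.opts: "-XX:CompressedClassSpaceSize=100M -XX:MaxMetaspaceSize=300M -XX:MetaspaceSize=200M"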


Re: Flink override config params (Docker)

2018-04-09 Thread miki haiat
You can mount the conf folder and override the conf file.
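For example, a minimal sketch assuming the official Flink image (the tag and the local
./conf path are illustrative), where ./conf holds your edited flink-conf.yaml with
state.backend, state.checkpoints.dir, etc.:

docker run -v $(pwd)/conf:/opt/flink/conf flink:1.4.2 jobmanager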

On Mon, 9 Apr 2018, 14:04 Pavel Ciorba,  wrote:

> Hi everyone
>
> Is there a way to override the *conf/flink-conf.yaml* of the Flink Docker
> container?
>
> I need to specify some params such as:
> state.backend
> state.backend.fs.checkpointdir
> state.checkpoints.dir
> etc.
>
> Thanks
>
>


Re: Temporary failure in name resolution

2018-04-04 Thread miki haiat
HI ,

I checked the code again to figure out where the problem can be.

I just wondered if I'm implementing the Evictor correctly?

full code
https://gist.github.com/miko-code/6d7010505c3cb95be122364b29057237




public static class EsbTraceEvictor implements Evictor<EsbTrace, GlobalWindow> {

    private static final org.slf4j.Logger LOG = LoggerFactory.getLogger(EsbTraceEvictor.class);

    @Override
    public void evictBefore(Iterable<TimestampedValue<EsbTrace>> elements, int size,
                            GlobalWindow window, EvictorContext evictorContext) {
        // nothing to evict before the window function runs
    }

    @Override
    public void evictAfter(Iterable<TimestampedValue<EsbTrace>> elements, int size,
                           GlobalWindow window, EvictorContext evictorContext) {
        // TODO: base this on the current processing time instead of wall-clock time
        LocalDateTime cutoff = LocalDateTime.now().minusMinutes(5);
        LOG.info("evicting elements older than {}", cutoff);
        DateTimeFormatter format = DateTimeFormatter.ISO_DATE_TIME;
        for (Iterator<TimestampedValue<EsbTrace>> iterator = elements.iterator(); iterator.hasNext(); ) {
            TimestampedValue<EsbTrace> element = iterator.next();
            LocalDateTime endDate = LocalDateTime.parse(element.getValue().getEndDate(), format);
            LOG.info("element end date {}", element.getValue().getEndDate());
            // drop elements whose end date is more than five minutes old
            if (endDate.isBefore(cutoff)) {
                iterator.remove();
            }
        }
    }
}






On Tue, Apr 3, 2018 at 4:28 PM, Hao Sun <ha...@zendesk.com> wrote:

> Hi Timo, we do have similar issue, TM got killed by a job. Is there a way
> to monitor JVM status? If through the monitor metrics, what metric I should
> look after?
> We are running Flink on K8S. Is there a possibility that a job consumes
> too much network bandwidth, so JM and TM can not connect?
>
> On Tue, Apr 3, 2018 at 3:11 AM Timo Walther <twal...@apache.org> wrote:
>
>> Hi Miki,
>>
>> for me this sounds like your job has a resource leak such that your
>> memory fills up and the JVM of the TaskManager is killed at some point. How
>> does your job look like? I see a WindowedStream.apply which might not be
>> appropriate if you have big/frequent windows where the evaluation happens
>> too late such that the state becomes too big.
>>
>> Regards,
>> Timo
>>
>>
>> Am 03.04.18 um 08:26 schrieb miki haiat:
>>
>> I tried to run Flink on Kubernetes and as a standalone HA cluster, and in
>> both cases the task manager got lost/killed after a few hours/days.
>> I'm using Ubuntu and Flink 1.4.2.
>>
>>
>> This is part of the log; I also attached the full log.
>>
>>>
>>> org.tlv.esb.StreamingJob$EsbTraceEvictor@20ffca60, 
>>> WindowedStream.apply(WindowedStream.java:1061))
>>> -> Sink: Unnamed (1/1) (91b27853aa30be93322d9c516ec266bf) switched from
>>> RUNNING to FAILED.
>>> java.lang.Exception: TaskManager was lost/killed:
>>> 6dc6cd5c15588b49da39a31b6480b2e3 @ beam2 (dataPort=42587)
>>> at org.apache.flink.runtime.instance.SimpleSlot.
>>> releaseSlot(SimpleSlot.java:217)
>>> at org.apache.flink.runtime.instance.SlotSharingGroupAssignment.
>>> releaseSharedSlot(SlotSharingGroupAssignment.java:523)
>>> at org.apache.flink.runtime.instance.SharedSlot.
>>> releaseSlot(SharedSlot.java:192)
>>> at org.apache.flink.runtime.instance.Instance.markDead(
>>> Instance.java:167)
>>> at org.apache.flink.runtime.instance.InstanceManager.
>>> unregisterTaskManager(InstanceManager.java:212)
>>> at org.apache.flink.runtime.jobmanager.JobManager.org$
>>> apache$flink$runtime$jobmanager$JobManager$$handleTaskManagerTerminated(
>>> JobManager.scala:1198)
>>> at org.apache.flink.runtime.jobmanager.JobManager$$
>>> anonfun$handleMessage$1.applyOrElse(JobManager.scala:1096)
>>> at scala.runtime.AbstractPartialFunction.apply(
>>> AbstractPartialFunction.scala:36)
>>> at org.apache.flink.runtime.LeaderSessionMessageFilter$$
>>> anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:49)
>>> at scala.runtime.AbstractPartialFunction.apply(
>>> AbstractPartialFunction.scala:36)
>>> at org.apache.flink.runtime.LogMessages$$anon$1.apply(
>>> LogMessages.scala:33)
>>> at org.apache.flink.runtime.LogMessages$$anon$1.apply(
>>> LogMessages.scala:28)
>>> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>>> at org.apache.flink.runtime.LogMessages$$anon$1.
>>> applyOrElse(LogMessages.scala:28)
>>> at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
>>> at org.apache.flink.runtime.jobmanager.JobManager.
>>> aroundReceive(JobManager.scala:122)
>>> at akka.actor.Actor

Re: Connect more than stream!!

2018-03-22 Thread miki haiat
You can join streams

http://training.data-artisans.com/exercises/eventTimeJoin.html
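
A minimal sketch, not from the thread: union() merges any number of streams of the same
type, while connect() pairs two differently typed streams and can be chained for more
(the Event/Meta/Enriched types are assumptions):

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

// Same element type: union accepts any number of streams.
DataStream<Event> all = streamA.union(streamB, streamC);

// Two different types: connect them and handle each side separately.
DataStream<Enriched> joined = eventStream
        .connect(metaStream)
        .flatMap(new CoFlatMapFunction<Event, Meta, Enriched>() {
            @Override
            public void flatMap1(Event value, Collector<Enriched> out) {
                // elements from the first (Event) stream
            }

            @Override
            public void flatMap2(Meta value, Collector<Enriched> out) {
                // elements from the second (Meta) stream
            }
        });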


On Thu, 22 Mar 2018, 11:36 Puneet Kinra, 
wrote:

> Hi
>
> Is there any way of connecting multiple streams(more than one) in flink
>
> --
> *Cheers *
>
> *Puneet Kinra*
>
> *Mobile:+918800167808 | Skype : puneet.ki...@customercentria.com
> *
>
> *e-mail :puneet.ki...@customercentria.com
> *
>
>
>
On 22 Mar 2018 11:36, "Puneet Kinra" 
wrote:

Hi

Is there any way of connecting multiple streams(more than one) in flink


-- 
*Cheers *

*Puneet Kinra*

*Mobile:+918800167808 | Skype : puneet.ki...@customercentria.com
*

*e-mail :puneet.ki...@customercentria.com
*


Re: flink on mesos

2018-03-18 Thread miki haiat
I think you can use the Catalog option only if you install DC/OS?


> I've installed Mesos and Marathon




On Sun, Mar 18, 2018 at 5:59 PM, Lasse Nedergaard <lassenederga...@gmail.com
> wrote:

> Hi.
> Go to Catalog, Search for Flink and click deploy
>
> Med venlig hilsen / Best regards
> Lasse Nedergaard
>
>
> Den 18. mar. 2018 kl. 16.18 skrev miki haiat <miko5...@gmail.com>:
>
>
> Hi ,
>
> I'm trying to run Flink on Mesos. I've installed Mesos and Marathon
> successfully, but I'm unable to create a Flink job/task manager.
>
> I'm running this command, but Mesos won't start any task:
>
> ./mesos-appmaster-flip6-session.sh  -n 1
>
>
>
> I can't figure out the proper way to run Flink on Mesos.
>
>
>


flink on mesos

2018-03-18 Thread miki haiat
Hi ,

I'm trying to run Flink on Mesos. I've installed Mesos and Marathon
successfully, but I'm unable to create a Flink job/task manager.

I'm running this command, but Mesos won't start any task:

./mesos-appmaster-flip6-session.sh  -n 1



I can't figure out the proper way to run Flink on Mesos.


akka.remote.ReliableDeliverySupervisor Temporary failure in name resolution

2018-03-06 Thread miki haiat
Hi ,

I'm running Flink jobs on Kubernetes. After a day or so,
the task manager and job manager lose their connection and I have to
restart everything.
I'm assuming that one of the pods crashed, and when a new pod starts it can't
find the job manager?
Also, I saw that this is an Akka issue... and it will be fixed in version 1.5.

How can I safely deploy jobs on Kubernetes?


task manager logs

> 2018-03-06 07:23:18,186 INFO
> org.apache.flink.runtime.taskmanager.TaskManager  - Trying to
> register at JobManager akka.tcp://flink@flink-jobmanager:6123/user/jobmanager
> (attempt 1594, timeout: 3 milliseconds)
> 2018-03-06 07:23:48,196 INFO
> org.apache.flink.runtime.taskmanager.TaskManager  - Trying to
> register at JobManager akka.tcp://flink@flink-jobmanager:6123/user/jobmanager
> (attempt 1595, timeout: 3 milliseconds)
> 2018-03-06 07:24:18,216 INFO
> org.apache.flink.runtime.taskmanager.TaskManager  - Trying to
> register at JobManager akka.tcp://flink@flink-jobmanager:6123/user/jobmanager
> (attempt 1596, timeout: 3 milliseconds)
> 2018-03-06 07:24:48,237 INFO
> org.apache.flink.runtime.taskmanager.TaskManager  - Trying to
> register at JobManager akka.tcp://flink@flink-jobmanager:6123/user/jobmanager
> (attempt 1597, timeout: 3 milliseconds)
> 2018-03-06 07:24:53,042 WARN  akka.remote.ReliableDeliverySupervisor
>   - Association with remote system
> [akka.tcp://flink@flink-jobmanager:6123] has failed, address is now gated
> for [5000] ms. Reason: [Disassociated]


Job manager logs

>
> 2018-03-06 07:25:18,262 INFO
> org.apache.flink.runtime.instance.InstanceManager - Registered
> TaskManager at flink-taskmanager-3509325052-bqtkd
> (akka.tcp://flink@flink-taskmanager-3509325052-bqtkd:35073/user/taskmanager)
> as c37614c28df29d34b80676488e386da3. Current number of registered hosts is
> 2. Current number of alive task slots is 16.
> 2018-03-06 07:25:18,263 WARN  akka.remote.ReliableDeliverySupervisor
>   - Association with remote system
> [akka.tcp://flink@flink-taskmanager-3509325052-bqtkd:35073] has failed,
> address is now gated for [5000] ms. Reason: [Association failed with
> [akka.tcp://flink@flink-taskmanager-3509325052-bqtkd:35073]] Caused by:
> [flink-taskmanager-3509325052-bqtkd: Temporary failure in name resolution]
> 2018-03-06 07:25:23,282 WARN  akka.remote.ReliableDeliverySupervisor
>   - Association with remote system
> [akka.tcp://flink@flink-taskmanager-3509325052-bqtkd:35073] has failed,
> address is now gated for [5000] ms. Reason: [Association failed with
> [akka.tcp://flink@flink-taskmanager-3509325052-bqtkd:35073]] Caused by:
> [flink-taskmanager-3509325052-bqtkd]
> 2018-03-06 07:25:28,303 WARN  akka.remote.ReliableDeliverySupervisor
>   - Association with remote system
> [akka.tcp://flink@flink-taskmanager-3509325052-bqtkd:35073] has failed,
> address is now gated for [5000] ms. Reason: [Association failed with
> [akka.tcp://flink@flink-taskmanager-3509325052-bqtkd:35073]] Caused by:
> [flink-taskmanager-3509325052-bqtkd: Temporary failure in name resolution]
> 2018-03-06 07:25:33,322 WARN  akka.remote.ReliableDeliverySupervisor
>   - Association with remote system
> [akka.tcp://flink@flink-taskmanager-3509325052-bqtkd:35073] has failed,
> address is now gated for [5000] ms. Reason: [Association failed with
> [akka.tcp://flink@flink-taskmanager-3509325052-bqtkd:35073]] Caused by:
> [flink-taskmanager-3509325052-bqtkd]
> 2018-03-06 07:25:38,343 WARN  akka.remote.ReliableDeliverySupervisor
>   - Association with remote system
> [akka.tcp://flink@flink-taskmanager-3509325052-bqtkd:35073] has failed,
> address is now gated for [5000] ms. Reason: [Association failed with
> [akka.tcp://flink@flink-taskmanager-3509325052-bqtkd:35073]] Caused by:
> [flink-taskmanager-3509325052-bqtkd: Temporary failure in name resolution]
> 2018-03-06 07:25:43,362 WARN  akka.remote.ReliableDeliverySupervisor
>   - Association with remote system
> [akka.tcp://flink@flink-taskmanager-3509325052-bqtkd:35073] has failed,
> address is now gated for [5000] ms. Reason: [Association failed with
> [akka.tcp://flink@flink-taskmanager-3509325052-bqtkd:35073]] Caused by:
> [flink-taskmanager-3509325052-bqtkd]
> 2018-03-06 07:25:48,383 WARN  akka.remote.ReliableDeliverySupervisor
>   - Association with remote system
> [akka.tcp://flink@flink-taskmanager-3509325052-bqtkd:35073] has failed,
> address is now gated for [5000] ms. Reason: [Association failed with
> [akka.tcp://flink@flink-taskmanager-3509325052-bqtkd:35073]] Caused by:
> [flink-taskmanager-3509325052-bqtkd: Temporary failure in name resolution]
> 2018-03-06 07:25:53,402 WARN  akka.remote.ReliableDeliverySupervisor
>   - Association with remote system
> [akka.tcp://flink@flink-taskmanager-3509325052-bqtkd:35073] has failed,
> address is now gated for 

Fwd: Global window keyBy

2018-02-05 Thread miki haiat
Yes.
Another question: how can I clear non-triggered events after a period of
time?
Is there a way to configure some "timeout"?

Thanks a lot.
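
One common way to get such a timeout (a sketch, not from the thread; the five-minute
value and the global window are illustrative assumptions) is a custom Trigger that
registers a per-key processing-time timer and purges the window when it fires:

import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.GlobalWindow;

// Purges a keyed global window whose contents have not fired within TIMEOUT_MS.
public class TimeoutPurgeTrigger extends Trigger<Object, GlobalWindow> {

    private static final long TIMEOUT_MS = 5 * 60 * 1000; // assumed 5-minute timeout

    @Override
    public TriggerResult onElement(Object element, long timestamp,
                                   GlobalWindow window, TriggerContext ctx) {
        // (Re)arm the timeout; a fuller version would also delete the previously registered timer.
        ctx.registerProcessingTimeTimer(ctx.getCurrentProcessingTime() + TIMEOUT_MS);
        // Real firing logic (e.g. FIRE_AND_PURGE once the condition is met) would go here.
        return TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onProcessingTime(long time, GlobalWindow window, TriggerContext ctx) {
        // The timeout fired: drop the stale, never-fired state for this key.
        return TriggerResult.PURGE;
    }

    @Override
    public TriggerResult onEventTime(long time, GlobalWindow window, TriggerContext ctx) {
        return TriggerResult.CONTINUE;
    }

    @Override
    public void clear(GlobalWindow window, TriggerContext ctx) {
        // nothing else to clean up in this sketch
    }
}

It would then be attached to the keyed global window with .trigger(new TimeoutPurgeTrigger()).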



On Mon, Feb 5, 2018 at 10:40 AM, Piotr Nowojski <pi...@data-artisans.com>
wrote:

> Hi,
>
> FIRE_AND_PURGE triggers `org.apache.flink.api.common.state.State#clear()`
> call and it "Removes the value mapped under the current key.”. So other
> keys should remain unmodified.
>
> I hope this solves your problem/question?
>
> Piotrek
>
> On 4 Feb 2018, at 15:39, miki haiat <miko5...@gmail.com> wrote:
>
> I'm using a trigger and a GUID in order to key the stream.
>
> I have some trouble understanding how to clear the window.
>
>
>- Will FIRE_AND_PURGE in the trigger remove the keyed data only?
>
> If fire-and-purge removes all the data, then I need to implement it
> more like this example:
>
> https://github.com/dataArtisans/flink-training-exercises/blob/master/src/main/java/com/dataartisans/flinktraining/exercises/datastream_java/windows/DrivingSegments.java
>
> The Evictor is used to clear the data by timestamp, but how can I
> clear the data by key as well?
>
>
> thanks ,
>
> Miki
>
>
>


Global window keyBy

2018-02-04 Thread miki haiat
I'm using a trigger and a GUID in order to key the stream.

I have some trouble understanding how to clear the window.


   - Will FIRE_AND_PURGE in the trigger remove the keyed data only?

If fire-and-purge removes all the data, then I need to implement it more
like this example:

https://github.com/dataArtisans/flink-training-exercises/blob/master/src/main/java/com/dataartisans/flinktraining/exercises/datastream_java/windows/DrivingSegments.java

The Evictor is used to clear the data by timestamp, but how can I
clear the data by key as well?


thanks ,

Miki