Re: Monitoring Flink on Yarn

2016-12-19 Thread Shannon Carey
Check out Logstash or Splunk. Those can pipe your logs into an external database which can be used by a purpose-built UI for examining logs, and as a result it doesn't matter if the original files or machines are still around or not. From: Lydia Ickler

Serializing NULLs

2016-12-19 Thread Matt
Hello list, I'm getting this error: java.lang.RuntimeException: Could not forward element to next operator ... Caused by: java.lang.NullPointerException: in com.entities.Sector in map in double null of double of map in field properties of com.entities.Sector ... Caused by:
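The trace suggests a serializer choking on a null Double inside a map-typed field of the Sector POJO. One common workaround, sketched here in plain Java (class, field, and fallback value are illustrative, not from the original code), is to replace nulls before the record is forwarded to the next operator:

```java
import java.util.HashMap;
import java.util.Map;

public class NullSafeProperties {

    // Replace null values in a map-typed field so a serializer that
    // expects non-nullable doubles never sees a null entry.
    public static Map<String, Double> sanitize(Map<String, Double> props, double fallback) {
        Map<String, Double> clean = new HashMap<>();
        for (Map.Entry<String, Double> e : props.entrySet()) {
            clean.put(e.getKey(), e.getValue() == null ? fallback : e.getValue());
        }
        return clean;
    }
}
```

Alternatively, the schema itself can be made nullable (e.g. a union with null in Avro terms), but that depends on how the type is declared.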

Re: Updating a Tumbling Window every second?

2016-12-19 Thread Matt
Fabian, Thanks for your answer. Since elements in B are expensive to create, I wanted to reuse them. I understand I can plug two consumers into stream A, but in that case, if I'm not wrong, I would have to create repeated elements of B: one to save them into stream B and one to create C objects

Re: Reg Checkpoint size using RocksDb

2016-12-19 Thread Anirudh Mallem
Hi Stephan, Thanks for your response. I shall try switching to the fully async mode and see. On another note, is there any option available to configure compaction capabilities using the default checkpointing mode? Thanks. From: Stephan Ewen Reply-To:

Re: Reg Checkpoint size using RocksDb

2016-12-19 Thread Stephan Ewen
Hi! If you use the default checkpoint mode, Flink will checkpoint the current RocksDB instance. It may be that there simply has not been a compaction in RocksDB when checkpointing, so the checkpoint contains some "old data" as well. If you switch to the "fully async" mode, it should always only

Re: Alert on state change.

2016-12-19 Thread Scott Kidder
Hi Rudra, You could accomplish this with a rolling-fold on the stream of stock prices. The accumulator argument to the fold can track the last price that triggered an alert and the timestamp of the alert. When evaluating a new stock price it can compare the price against the last one that
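The fold logic Scott describes can be sketched in plain Java, independent of the Flink API (class names and the 20% threshold are taken from the scenario in the thread; the structure is illustrative):

```java
public class PriceAlertFold {

    // Accumulator carried through the fold: the last price that
    // triggered an alert, plus whether the latest element alerted.
    public static class Acc {
        public double lastAlertPrice;
        public boolean alerted;
        public Acc(double lastAlertPrice) { this.lastAlertPrice = lastAlertPrice; }
    }

    // One step of the fold: alert when the new price has risen 20% or
    // more since the last alerting price, then use it as the new baseline.
    public static Acc step(Acc acc, double price) {
        Acc next;
        if (price >= acc.lastAlertPrice * 1.2) {
            next = new Acc(price);
            next.alerted = true;
        } else {
            next = new Acc(acc.lastAlertPrice);
            next.alerted = false;
        }
        return next;
    }
}
```

With a baseline of $100, a price of $200 alerts and becomes the new baseline; a following $230 stays below $240 and does not alert, matching the example in the original question.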

Calculating stateful counts per key

2016-12-19 Thread Mäki Hanna
Hi, I'm trying to calculate stateful counts per key with checkpoints following the example in https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/streaming/state.html#checkpointing-instance-fields. I would expect my test program to calculate the counts per key, but it seems to
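The intended semantics of a checkpointed per-key count can be sketched in plain Java, outside the Flink API (one map entry standing in for one `ValueState<Integer>` per key). Note that in Flink the function must run on a `keyBy`-partitioned stream for state to be scoped per key; without the `keyBy`, each parallel instance keeps its own independent counts:

```java
import java.util.HashMap;
import java.util.Map;

public class KeyedCounter {
    // Mimics one ValueState<Integer> per key, as keyed state would hold it.
    private final Map<String, Integer> state = new HashMap<>();

    // Equivalent of one map/flatMap invocation: read the state for the
    // element's key, increment it, write it back, and emit the new count.
    public int count(String key) {
        int updated = state.getOrDefault(key, 0) + 1;
        state.put(key, updated);
        return updated;
    }
}
```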

Alert on state change.

2016-12-19 Thread Rudra Tripathy
Hi All, I have a use case where I am monitoring price changes. Let's say the original price is $100; in case of a 20% rise, send an alert. In the stream I am getting updated prices. If $200 comes in the next data, send an alert. If I next get $230, I would keep it but send no alert. When I

Monitoring Flink on Yarn

2016-12-19 Thread Lydia Ickler
Hi all, I am using Flink 1.1.3 on Yarn and I wanted to ask how I can save the monitoring logs, e.g. for I/O or network, to HDFS or local FS? Since Yarn closes the Flink session after finishing the job I can't access the log via REST API. I am looking forward to your answer! Best regards,
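One option that does not depend on Flink at all: enable YARN log aggregation, which copies container logs (including the JobManager and TaskManager logs) to HDFS when the application finishes. This is a sketch using the standard Hadoop property; the aggregation directory is whatever the cluster configures:

```xml
<!-- yarn-site.xml: aggregate container logs to HDFS after the application ends -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
```

Once aggregated, the logs survive the Flink session and can be fetched with `yarn logs -applicationId <appId>`.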

Re: High virtual memory usage

2016-12-19 Thread Stephan Ewen
Hi Paulo! Hmm, interesting. The high discrepancy between virtual and physical memory usually means that the process either maps large files into memory, or that it pre-allocates a lot of memory without immediately using it. Neither of these things are done by Flink. Could this be an effect of

Re: High virtual memory usage

2016-12-19 Thread Paulo Cezar
- Are you using RocksDB? No. - What is your flink configuration, especially around memory settings? I'm using default config with 2GB for jobmanager and 5GB for taskmanagers. I'm starting flink via "./bin/yarn-session.sh -d -n 5 -jm 2048 -tm 5120 -s 4 -nm 'Flink'" - What do you use for

Reg Checkpoint size using RocksDb

2016-12-19 Thread Anirudh Mallem
Hi, I was experimenting with using RocksDb as the state backend for my job and to test its behavior I modified the socket word count program to store states. I also wrote a RichMapFunction which stores the states as a ValueState with default value as null. What the job does basically is, for

Re: Blocking RichFunction.open() and backpressure

2016-12-19 Thread Yury Ruchin
Thanks Fabian, that quite explains what's going on. 2016-12-19 12:19 GMT+03:00 Fabian Hueske :

Re: How to analyze space usage of Flink algorithms

2016-12-19 Thread Fabian Hueske
Your functions do not need to implement RichFunction (although each function can be a RichFunction, and it should not be a problem to adapt the job). The system metrics are automatically collected. Metrics are exposed via a Reporter [1]. So you do not need to take care of the collection but rather
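As a hedged illustration of the reporter setup Fabian points to: metrics can be exposed via JMX by configuring a reporter in flink-conf.yaml. The exact key names changed across Flink versions, so treat this as the rough single-reporter style of the 1.1 era rather than a definitive config:

```yaml
# flink-conf.yaml -- expose collected metrics via a JMX reporter
# (key names vary by Flink version; check the docs for your release)
metrics.reporter.class: org.apache.flink.metrics.jmx.JMXReporter
```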

Re: Blocking RichFunction.open() and backpressure

2016-12-19 Thread Fabian Hueske
Hi Yury, Flink's operators start processing as soon as they receive data. If an operator produces more data than its successor task can process, the data is buffered in Flink's network stack, i.e., its network buffers. The backpressure mechanism kicks in when all network buffers are in use and no

Re: How to analyze space usage of Flink algorithms

2016-12-19 Thread otherwise777
Thank you for your reply. I'm afraid I still don't understand it; the part I don't understand is how to actually analyze it. It's OK if I can just analyze the system instead of the actual job, but how would I actually do that? I don't have any function in my program that extends RichFunction

Re: Updating a Tumbling Window every second?

2016-12-19 Thread Fabian Hueske
Hi Matt, the combination of a tumbling time window and a count window is one way to define a sliding window. Your example of a 30 secs tumbling window and a (3,1) count window results in a time sliding window of 90 secs width and 30 secs slide. You could define a time sliding window of 90
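Fabian's equivalence (tumbling 30s plus a (3,1) count window behaving like a 90s window sliding by 30s) can be illustrated by computing which sliding windows a given element falls into. This plain-Java sketch mirrors the assignment arithmetic of time sliding windows without depending on the Flink API:

```java
import java.util.ArrayList;
import java.util.List;

public class SlidingWindowStarts {

    // Start timestamps of all sliding windows (of the given size and
    // slide, both in ms) that contain the element timestamp `ts`.
    public static List<Long> windowStarts(long ts, long size, long slide) {
        List<Long> starts = new ArrayList<>();
        long lastStart = ts - (ts % slide); // latest window start <= ts
        for (long start = lastStart; start > ts - size; start -= slide) {
            starts.add(start);
        }
        return starts;
    }
}
```

With size 90s and slide 30s, every element belongs to exactly three overlapping windows, which is why the (3,1) count window over 30s tumbling results reproduces the same grouping.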