Re: env.setStateBackend deprecated in 1.5 for FsStateBackend

2018-06-26 Thread Rong Rong
Hmm. If you have a wrapper function like this, it will not report a deprecation warning: *def getFsStateBackend(path: String): StateBackend = new FsStateBackend(path)* Since AbstractStateBackend implements StateBackend and *def setStateBackend(backend: StateBackend):
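To see why the wrapper silences the warning, here is a self-contained Java sketch; the classes below are simplified stand-ins for the Flink 1.5 types, not the real ones. Because the factory's declared return type is the StateBackend interface, overload resolution binds the call to the non-deprecated setStateBackend(StateBackend) overload instead of the deprecated setStateBackend(AbstractStateBackend) one.

```java
// Stand-in types mimicking the Flink 1.5 hierarchy (NOT the real classes).
interface StateBackend {}

abstract class AbstractStateBackend implements StateBackend {}

class FsStateBackend extends AbstractStateBackend {
    final String path;
    FsStateBackend(String path) { this.path = path; }
}

class Env {
    StateBackend backend;

    @Deprecated
    void setStateBackend(AbstractStateBackend b) { this.backend = b; } // old, deprecated overload

    void setStateBackend(StateBackend b) { this.backend = b; }         // new overload
}

public class DeprecationWrapperDemo {
    // Factory whose declared return type is the interface, so the compiler
    // picks setStateBackend(StateBackend): no deprecation warning.
    static StateBackend getFsStateBackend(String path) {
        return new FsStateBackend(path);
    }

    public static void main(String[] args) {
        Env env = new Env();
        env.setStateBackend(getFsStateBackend("hdfs://host:9000/flink/checkpoints"));
        System.out.println(env.backend.getClass().getSimpleName());
    }
}
```

Passing `new FsStateBackend(path)` directly would instead select the more specific deprecated overload, which is exactly why the direct call warns while the wrapper does not.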

Re: Over Window Not Processing Messages

2018-06-26 Thread Rong Rong
Hi Greg. Based on a quick test I cannot reproduce the issue; it is emitting messages correctly in the ITCase environment. Can you share more information? Does the same problem happen if you use proctime? I am guessing this could be highly correlated with how you set your watermark strategy of
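The watermark hypothesis is worth spelling out: an event-time OVER window only emits a row once the watermark passes that row's timestamp. A self-contained sketch (plain Java, no Flink; the numbers are made up) of how a lagging bounded-out-of-orderness watermark can keep every row from emitting:

```java
public class WatermarkGatingSketch {
    public static void main(String[] args) {
        long[] rowTimes = {100, 200, 300};     // event timestamps seen so far
        long boundedOutOfOrderness = 250;      // hypothetical strategy setting

        // Watermark trails the max seen timestamp by the out-of-orderness bound.
        long watermark = Long.MIN_VALUE;
        for (long t : rowTimes) {
            watermark = Math.max(watermark, t - boundedOutOfOrderness);
        }

        // Only rows at or below the watermark (300 - 250 = 50) may emit,
        // so nothing comes out until later events push the watermark forward.
        int emitted = 0;
        for (long t : rowTimes) {
            if (t <= watermark) emitted++;
        }
        System.out.println("watermark=" + watermark + ", emitted=" + emitted);
    }
}
```

If the source goes idle or the bound is large relative to the data's time span, the window can hold results indefinitely, which matches the "not processing messages" symptom.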

Re: env.setStateBackend deprecated in 1.5 for FsStateBackend

2018-06-26 Thread zhangminglei
At the moment, it seems you can not, because FsStateBackend extends AbstractFileStateBackend, which in turn extends AbstractStateBackend, the deprecated parameter type of setStateBackend. I think you can do what you want like below for now, but it is very bad. env.setStateBackend(new StateBackend() {

env.setStateBackend deprecated in 1.5 for FsStateBackend

2018-06-26 Thread chrisr123
I upgraded from Flink 1.4 to 1.5, and now this call is flagged as deprecated. What should I change this code to in 1.5 to get rid of the deprecation warning? Thanks. // deprecated env.setStateBackend(new FsStateBackend("hdfs://myhdfsmachine:9000/flink/checkpoints")); -- Sent from:

Over Window Not Processing Messages

2018-06-26 Thread Gregory Fee
Hello User Community! I am running some streaming SQL that involves a union all into an over window similar to the below: SELECT user_id, count(action) OVER (PARTITION BY user_id ORDER BY rowtime RANGE INTERVAL '14' DAY PRECEDING) action_count, rowtime FROM (SELECT rowtime, user_id, thing as
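For reference, a complete query of the same shape as the truncated one above (table and column names are hypothetical, not taken from Greg's job): an event-time RANGE OVER window counting a user's actions over the preceding 14 days.

```java
public class OverWindowSqlExample {
    // Hypothetical table/columns; the shape matches the snippet above.
    static final String SQL =
        "SELECT user_id, " +
        "  COUNT(action) OVER (" +
        "    PARTITION BY user_id " +
        "    ORDER BY rowtime " +
        "    RANGE INTERVAL '14' DAY PRECEDING) AS action_count, " +
        "  rowtime " +
        "FROM actions";

    public static void main(String[] args) {
        System.out.println(SQL);
    }
}
```

In Flink SQL this string would be handed to the table environment's SQL query method; the ORDER BY rowtime clause is what makes the window event-time driven and hence watermark dependent.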

Re: Measure Latency from source to sink

2018-06-26 Thread Hequn Cheng
Hi Antonio, latency is exposed via a metric. You can find each operator's latency through the Flink UI (Overview -> Task Metrics -> select the task, for example the sink -> Add metric -> find the latency metric). On Tue, Jun 26, 2018 at 11:18 PM, antonio saldivar wrote: > Hello, thank you > > I also

[ANNOUNCE] Weekly community update #26

2018-06-26 Thread Till Rohrmann
Dear community, this is the weekly community update thread #26. Please post any news and updates you want to share with the community to this thread. # New Flink community website The new community website [1] has been launched. Big kudos to Fabian for driving this effort. The new structure

Re: [Flink-9407] Question about proposed ORC Sink !

2018-06-26 Thread sagar loke
@zhangminglei, a question about the schema for the ORC format: 1. Does it always need to be of complex type "" ? 2. Or can it be created with individual data types directly, e.g. "name:string, age:int"? Thanks, Sagar On Fri, Jun 22, 2018 at 11:56 PM, zhangminglei <18717838...@163.com> wrote:

How to deploy Flink in a geo-distributed environment

2018-06-26 Thread Stephen
Hi, can Flink be deployed in a geo-distributed environment instead of in local clusters? As far as I know, raw data has to be moved to a local cloud environment or local clusters before Flink handles it. Consider a situation where data sources are in different areas, which might be cross

Re: How to partition within same physical node in Flink

2018-06-26 Thread Vijay Balakrishnan
Hi Fabian, thanks once again for your reply. I need to get the data from each cam/camera into one partition/slot and not move the gigantic video data around while I perform other operations on it. For example, I can get seq#1 and seq#2 for cam1 in the cam1 partition/slot and then combine, split, parse,

Re: Few question about upgrade from 1.4 to 1.5 flink ( some very basic )

2018-06-26 Thread Vishal Santoshi
Ok, I will check. On Tue, Jun 26, 2018, 12:39 PM Gary Yao wrote: > Hi Vishal, > > You should check the contents of znode /flink_test/[...]/rest_server_lock > to see > if the URL is correct. > > The host and port should be logged by the RestClient [1]. If you do not > see the > message "Sending

Re: TaskIOMetricGroup metrics not unregistered in prometheus on job failure ?

2018-06-26 Thread Chesnay Schepler
Great work on debugging this, you're exactly right. The children we add to the collector have to be removed individually when a metric is unregistered. If the collector is a io.prometheus.client.Gauge we can use the #remove() method. For histograms we will have to modify our

Re: Few question about upgrade from 1.4 to 1.5 flink ( some very basic )

2018-06-26 Thread Gary Yao
Hi Vishal, you should check the contents of znode /flink_test/[...]/rest_server_lock to see if the URL is correct. The host and port should be logged by the RestClient [1]. If you do not see the message "Sending request of class [...]" on DEBUG level, the client is probably not able to get the

SANSA 0.4 (Scalable Semantic Analytics Stack using Spark/Flink) Released

2018-06-26 Thread Jens Lehmann
Dear all, The Smart Data Analytics group [1] is happy to announce SANSA 0.4 - the fourth release of the Scalable Semantic Analytics Stack. SANSA employs distributed computing via Apache Spark and Flink in order to allow scalable machine learning, inference and querying capabilities for large

Re: TaskIOMetricGroup metrics not unregistered in prometheus on job failure ?

2018-06-26 Thread jelmer
Hi Chesnay, sorry for the late reply; I did not have time to look into this sooner. I did what you suggested and added some logging to the PrometheusReporter like this: https://github.com/jelmerk/flink/commit/58779ee60a8c3961f3eb2c487c603c33822bba8a and deployed a custom build of the reporter to

Re: Measure Latency from source to sink

2018-06-26 Thread antonio saldivar
Hello, thank you. I also was trying, using Flink UI metrics on version 1.4.2, *env.getConfig().setLatencyTrackingInterval(1000L)*, but it looks like it is not displaying anything. On Tue, Jun 26, 2018 at 10:45, zhangminglei (<18717838...@163.com>) wrote: > Hi, you can do that but it does not make

Re: Measure Latency from source to sink

2018-06-26 Thread zhangminglei
Hi, you can do that, but it does not make sense in general. You can do it with Flink, Storm, Spark Streaming, or Structured Streaming, and compare the latency under the different frameworks. Cheers, Minglei > On Jun 26, 2018, at 9:36 PM, antonio saldivar wrote: > > Hello, thank you for the feedback,

Re: Few question about upgrade from 1.4 to 1.5 flink ( some very basic )

2018-06-26 Thread Vishal Santoshi
The leader znode is the right one ( it is a binary ) get /flink_test/da_15/leader//job_manager_lock wFDakka.tcp:// fl...@flink-9edd15d7.bf2.tumblr.net:22161/user/jobmanagersrjava.util.UUIDm/J leastSigBitsJ

Re: Few question about upgrade from 1.4 to 1.5 flink ( some very basic )

2018-06-26 Thread Vishal Santoshi
OK few things 2018-06-26 13:31:29 INFO CliFrontend:282 - Starting Command Line Client (Version: 1.5.0, Rev:c61b108, Date:24.05.2018 @ 14:54:44 UTC) ... 2018-06-26 13:31:31 INFO ClientCnxn:876 - Socket connection established to zk-f1fb95b9.bf2.tumblr.net/10.246.218.17:2181, initiating session

Re: Measure Latency from source to sink

2018-06-26 Thread antonio saldivar
Hello, thank you for the feedback. Well, for now I just want to measure the time each transaction takes from source to sink, adding the start and end times in millis. On Tue, Jun 26, 2018 at 5:19, zhangminglei (<18717838...@163.com>) wrote: > Hi, Antonio > > Usually, the measurement of

Re: Restore state from save point with add new flink sql

2018-06-26 Thread Hequn Cheng
Hi, I'm not sure about the answer. I have a feeling that if we only add new code below the old code (i.e., append new code after old code), the uid will not be changed. On Tue, Jun 26, 2018 at 3:06 PM, Till Rohrmann wrote: > I think so. Maybe Fabian or Timo can correct me if I'm wrong here. > >

Re: Storing Streaming Data into Static source

2018-06-26 Thread Rad Rad
Thanks, Stefan. I subscribe to the streaming data from Kafka and I did some queries using Flink. I need to store some of the results into a static source. So, which is the better data store to define alongside the Kafka source within the Flink API: MongoDB or PostgreSQL? Thanks again. -- Sent from:

Re: Storing Streaming Data into Static source

2018-06-26 Thread Stefan Richter
Hi, I think this is not really a Flink-related question. In any case, you might want to specify a bit more what you mean by "better", because usually there is no strict "better", only trade-offs, and what is "better" for somebody might not be "better" for you. Best, Stefan > On Jun 26, 2018, at 12:54

Logback doesn't receive logs from job

2018-06-26 Thread Guilherme Nobre
Hi all, I have a Flink cluster (1.4.0) built from Flink's Docker image, with 1 job manager and 2 task managers. I'm trying to use logback instead of log4j, and as far as the cluster configuration goes, everything seems alright. Following
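For anyone setting this up, a minimal logback.xml sketch of the kind Flink's docs describe for replacing log4j. Treat the details as assumptions to verify against your version's documentation; in particular, ${log.file} relies on Flink passing -Dlog.file to its processes.

```xml
<configuration>
  <!-- Flink sets -Dlog.file for each JobManager/TaskManager process -->
  <appender name="file" class="ch.qos.logback.core.FileAppender">
    <file>${log.file}</file>
    <append>false</append>
    <encoder>
      <pattern>%d{yyyy-MM-dd HH:mm:ss,SSS} %-5level %logger{60} - %msg%n</pattern>
    </encoder>
  </appender>
  <root level="INFO">
    <appender-ref ref="file"/>
  </root>
</configuration>
```

Note that job (user-code) logging also requires the logback jars on the cluster's lib path in place of the log4j ones; a logback.xml alone is not enough.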

Storing Streaming Data into Static source

2018-06-26 Thread Rad Rad
Hi all, I would kindly like to save streaming data consumed from Kafka into a static data source. Which is better, MongoDB or PostgreSQL? Radhya. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Few question about upgrade from 1.4 to 1.5 flink ( some very basic )

2018-06-26 Thread zhangminglei
By the way, this is in an HA setup. > On Jun 26, 2018, at 5:39 PM, zhangminglei <18717838...@163.com> wrote: > > Hi, Gary Yao > > Once I discovered that the IP address [jobmanager.rpc.address] changed from 10.208.73.129 to localhost. I think that will cause the issue. What do you think? > >

Re: Few question about upgrade from 1.4 to 1.5 flink ( some very basic )

2018-06-26 Thread zhangminglei
Hi, Gary Yao. Once I discovered that the IP address [jobmanager.rpc.address] changed from 10.208.73.129 to localhost. I think that will cause the issue. What do you think? Cheers, Minglei > On Jun 26, 2018, at 4:53 PM, Gary Yao wrote: > > Hi Vishal, > > Could it be that you are not

Re: Measure Latency from source to sink

2018-06-26 Thread zhangminglei
Hi, Antonio. Usually the measurement of latency is done for a specific business case; I think that is more reasonable. What I understand latency to be, from my experience, is data preparation time plus query calculation time, like an end-to-end latency test. Hope this can help you. Not point to the latency of

Re: Few question about upgrade from 1.4 to 1.5 flink ( some very basic )

2018-06-26 Thread Gary Yao
Hi Vishal, Could it be that you are not using the 1.5.0 client? The stacktrace you posted does not reference valid lines of code in the release-1.5.0-rc6 tag. If you have a HA setup, the host and port of the leading JM will be looked up from ZooKeeper before job submission. Therefore, the

Re: String Interning

2018-06-26 Thread Stefan Richter
Hi, you can enable object reuse via the execution config [1]: "By default, objects are not reused in Flink. Enabling the object reuse mode will instruct the runtime to reuse user objects for better performance. Keep in mind that this can lead to bugs when the user-code function of an operation
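The quoted warning about bugs is easy to demonstrate without Flink at all. A self-contained Java sketch of the aliasing hazard: if user code keeps references to input objects while the runtime reuses one mutable instance, every stored "element" ends up being the last one.

```java
import java.util.ArrayList;
import java.util.List;

public class ObjectReuseHazard {
    // A mutable record, as a runtime might reuse between invocations.
    static class Event { long id; }

    public static void main(String[] args) {
        Event reused = new Event();            // one instance, mutated in place
        List<Event> seen = new ArrayList<>();  // user code holds on to inputs

        for (long i = 1; i <= 3; i++) {
            reused.id = i;     // "deserialize" the next element into the same object
            seen.add(reused);  // BUG under reuse: stores the same reference thrice
        }

        // All three entries alias one object, so all print the last id.
        for (Event e : seen) System.out.println(e.id);
    }
}
```

With object reuse enabled, the safe pattern is to copy any input you retain past the current function call; without reuse, Flink hands each function a fresh object, which is why the mode is off by default.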

Re: Measure Latency from source to sink

2018-06-26 Thread Fabian Hueske
Hi, Measuring latency is tricky and you have to be careful about what you measure. Aggregations like window operators make things even more difficult because you need to decide which timestamp(s) to forward (smallest?, largest?, all?) Depending on the operation, the measurement code might even
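The simplest thing Antonio asks for earlier in the thread, source-to-sink wall-clock time per record, can be sketched without Flink: stamp the record at the source and subtract at the sink. This is a hedged illustration (the Txn type and the 60 ms of simulated work are made up), and as noted above it says nothing about which timestamp a window operator should forward.

```java
public class EndToEndLatencySketch {
    // Record carrying its source-ingestion timestamp through the pipeline.
    static class Txn {
        final String payload;
        final long sourceTimeMs;
        Txn(String payload, long sourceTimeMs) {
            this.payload = payload;
            this.sourceTimeMs = sourceTimeMs;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Txn t = new Txn("tx-1", System.currentTimeMillis()); // source stamps
        Thread.sleep(60);                                    // simulated pipeline work
        long latencyMs = System.currentTimeMillis() - t.sourceTimeMs; // sink measures
        System.out.println(latencyMs >= 50);                 // at least the work time
    }
}
```

This measures wall-clock latency on a single machine; across machines the source and sink clocks must be synchronized for the subtraction to mean anything, which is one of the pitfalls Fabian alludes to.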

Re: How to partition within same physical node in Flink

2018-06-26 Thread Fabian Hueske
Hi, keyBy() does not work hierarchically. Each keyBy() overrides the previous partitioning. You can keyBy(cam, seq#) which guarantees that all records with the same (cam, seq#) are processed by the same parallel instance. However, Flink does not give any guarantees about how the (cam, seq#)
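A plain-Java illustration of the guarantee described above. This is a deliberate simplification (Flink actually routes through murmur-hashed key groups, not a bare hashCode modulo), and instanceFor is a hypothetical helper: equal (cam, seq#) keys always map to the same parallel instance, but two different seq# keys for the same cam may land anywhere.

```java
import java.util.Objects;

public class KeyRoutingSketch {
    // Simplified stand-in for how a composite key picks a parallel instance.
    static int instanceFor(String cam, int seq, int parallelism) {
        return Math.floorMod(Objects.hash(cam, seq), parallelism);
    }

    public static void main(String[] args) {
        int p = 4;
        // The same (cam, seq#) key routes deterministically:
        System.out.println(instanceFor("cam1", 1, p) == instanceFor("cam1", 1, p));
        // But (cam1, 1) and (cam1, 2) are independent keys with no
        // co-location guarantee; they may or may not share an instance.
        System.out.println(instanceFor("cam1", 1, p) + " vs " + instanceFor("cam1", 2, p));
    }
}
```

This is why keyBy(cam, seq#) alone cannot pin all of cam1's sequences to one slot: the framework, not the user, owns the key-to-instance mapping.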

Re: Restore state from save point with add new flink sql

2018-06-26 Thread Till Rohrmann
I think so. Maybe Fabian or Timo can correct me if I'm wrong here. On Mon, Jun 25, 2018 at 9:17 AM, James (Jian Wu) [FDS Data Platform] < james...@coupang.com> wrote: > Hi Till: > > > > Thanks for your answer, so if I just add new SQL and do not modify old SQL > then use `--allowNonRestoredState

Re: CEP: Different consuming strategies within a pattern

2018-06-26 Thread Shailesh Jain
Thanks, Dawid. On Mon, Jun 25, 2018 at 12:48 PM, Dawid Wysakowicz wrote: > Hi Shailesh, > > It does not emit results because "followedBy" accepts only the first > occurrence of matching event. Therefore in your case it only tries to > construct pattern with start(id=2). Try removing this event