Re: How to write value only using flink's SequenceFileWriter?

2019-07-26 Thread Liu Bo
The file header says key is NullWritable: SEQ^F!org.apache.hadoop.io.NullWritable^Yorg.apache.hadoop.io.Text^A^A)org.apache.hadoop.io.compress.SnappyCodec Might be a hadoop -text problem? On Sat, 27 Jul 2019 at 11:07, Liu Bo wrote: > Dear flink users, > > We're trying to switch from StringWrit

How to write value only using flink's SequenceFileWriter?

2019-07-26 Thread Liu Bo
Dear flink users, We're trying to switch from StringWriter to SequenceFileWriter to turn on compression. StringWriter writes value only and we want to keep that way. AFAIK, you can use NullWritable in Hadoop writers to escape key so you only write the values. So I tried with NullWritable as foll

Savepoint process recovery in Jobmanager HA setup

2019-07-26 Thread Bajaj, Abhinav
Hi, I am trying to test a scenario that triggers a savepoint on a Flink 1.7.1 Job deployed with jobmanager HA mode. The purpose is to check if savepoint process recovers if the leader jobmanager fails during the savepoint. During my testing, I found that the new leader jobmanager returns the be

Re: Is it possible to decide the order of where conditions in Flink SQL

2019-07-26 Thread sri hari kali charan Tummala
try cte common table expressions if it supports or sql subquery. On Fri, Jul 26, 2019 at 1:00 PM Fanbin Bu wrote: > how about move query db filter to the outer select. > > On Fri, Jul 26, 2019 at 9:31 AM Tony Wei wrote: > >> Hi, >> >> If I have multiple where conditions in my SQL, is it possibl

async and checkpointing

2019-07-26 Thread anurag
Hi , Thanks in advance for your help. I am trying to write a flink function which reads from kafka using kafka-flinkconsumer and sends messages to an indexer. I am not clear on how async and checkpointing will work in this case. My flow is like this: a) Messages are ingested into kafka. b)The messa

Re: Is it possible to decide the order of where conditions in Flink SQL

2019-07-26 Thread Fanbin Bu
how about move query db filter to the outer select. On Fri, Jul 26, 2019 at 9:31 AM Tony Wei wrote: > Hi, > > If I have multiple where conditions in my SQL, is it possible to specify > its order, so that the query > can be executed more efficiently? > > For example, if I have the following SQL,

Is it possible to decide the order of where conditions in Flink SQL

2019-07-26 Thread Tony Wei
Hi, If I have multiple where conditions in my SQL, is it possible to specify its order, so that the query can be executed more efficiently? For example, if I have the following SQL, it used a heavy UDF that needs to access database. However, if I can specify the order of conditions is executing `

Re: question for handling db data

2019-07-26 Thread Oytun Tez
imagine an operator, ProcessFunction, it has 2 incoming data: geofences via broadcast, user location via normal data stream geofence updates and user location updates will come separately into this single operator. 1) when geofence update comes via broadcast, the operator will update its state wi

Re: Extending REST API with new endpoints

2019-07-26 Thread Oytun Tez
Scary! :) I would heartily hate to maintain our own fork. Should I make a feature request to discuss further and then send a PR for this? Is this the normal way to push for a feature? --- Oytun Tez *M O T A W O R D* The World's Fastest Human Translation Platform. oy...@motaword.com — www.mo

Pramaters in eclipse with Flink

2019-07-26 Thread alaa
Hallo I run this example form GitHub https://github.com/ScaleUnlimited/flink-streaming-kmeans but I am not familiar with eclipse and i got this error I dont know how and

Re: Extending REST API with new endpoints

2019-07-26 Thread Chesnay Schepler
There's no built-in way to extend the REST API. You will have to create a fork and either extend the DIspatcherRestEndpoint (or parent classes), or implement another WebMonitorExtension and modify the DispatcherRestEndpoint to load that one as well. On 23/07/2019 15:51, Oytun Tez wrote: Ping,

Re: Job Manager becomes irresponsive if the size of the session cluster grows

2019-07-26 Thread Richard Deurwaarder
Hello, We run into the same problem. We've done most of the same steps/observations: - increase memory - increase cpu - No noticable increase in GC activity - Little network io Our current setup has the liveliness probe disabled and we've increased (akka)timeouts, this seems to help

Re: Job Manager becomes irresponsive if the size of the session cluster grows

2019-07-26 Thread Biao Liu
Hi Prakhar, Sorry I don't have much experience on k8s. Maybe some other guys could help. On Fri, Jul 26, 2019 at 6:20 PM Prakhar Mathur wrote: > Hi, > > So we were deploying our flink clusters on YARN earlier but then we moved > to kubernetes, but then our clusters were not this big. Have you g

Re: Job Manager becomes irresponsive if the size of the session cluster grows

2019-07-26 Thread Prakhar Mathur
Hi, So we were deploying our flink clusters on YARN earlier but then we moved to kubernetes, but then our clusters were not this big. Have you guys seen issues with job manager rest server becoming irresponsive on kubernetes before? On Fri, Jul 26, 2019, 14:28 Biao Liu wrote: > Hi Prakhar, > >

Re: Help with the correct Event Pattern

2019-07-26 Thread Dawid Wysakowicz
Have you tried pattern like: /Pattern.begin[Event]("b", //AfterMatchSkipStrategy.skipPastLast//).where(...).followedBy("c").where(...).followedBy("e").where(...)/ The method followedBy(Pattern) constructs a Pattern with a subGroup pattern. The skip strategy there does not have any effect. Best,

Re: Job Manager becomes irresponsive if the size of the session cluster grows

2019-07-26 Thread Biao Liu
Hi Prakhar, Sorry I could not find any abnormal message from your GC log and stack trace. Have you ever tried deploying the cluster in other ways? Not on Kubernetes. Like on YARN or standalone. Just for narrowing down the scope. On Tue, Jul 23, 2019 at 12:34 PM Prakhar Mathur wrote: > > On Mon

Re: Execution environments for testing: local vs collection vs mini cluster

2019-07-26 Thread Biao Liu
Hi Juan, Sorry for the late reply. 1. the environments of data stream and data set are not same. An obvious difference is there always be a "stream" prefix of environment for data stream. For example, StreamExecutionEnvironment is for data stream, ExecutionEnvironment and CollectionEnvironment ar

Re: MapSate within Aggregate function

2019-07-26 Thread Ahmad Hassan
Hi Congzian, My understanding is that if I use AggregateFunction and have Million of unique elements coming in for the duration of 24hour, then the state of AggregateFunction will grow huge with those million entries and the checkpointing would take longer and longer. I thought if i could use MapS

Re: Assign a Row.of(ListsElements) exception

2019-07-26 Thread Caizhi Weng
Hi Andres, This exception is often caused by other exceptions. Please post your full stack trace here so we can diagnose the problem. Thanks. Andres Angel 于2019年7月26日周五 上午11:14写道: > Hello everyone, > > I have a list with bunch of elements and I need create a Row.of() based on > the whole elemen