Re: Memory ran out. Compaction failed. - Exception

2017-06-01 Thread Robert Metzger
Hi Marc, The CompactingHashtable is not spillable (the only operator in batch actually that isn't), so you can only either reduce your data size or increase your memory. When you are using unmanaged memory, make sure the allocation of managed memory is reduced to a minimum. On Mon, May 29, 2017 a

Re: Flink - Iteration and Backpressure

2017-06-01 Thread Robert Metzger
Hi Mahesh, why do you need to iterate over the elements? I wonder if you can't just stream the data from kafka1-kafka4 through your topology? On Fri, May 26, 2017 at 7:21 PM, MAHESH KUMAR wrote: > Hi Team, > > I am trying to build an audit like system where I read messages from "n" > Kafka q

Re: Flink parallel tasks, slots and vcores

2017-06-01 Thread Robert Metzger
Hi Sathi, Are you seeing 8 slots in the JobManager UI? How many shards do you have in your kinesis stream? On Fri, May 26, 2017 at 3:14 PM, Jason Brelloch wrote: > Can you give us more information about what your Flink job is doing and > the distribution of the Kinesis data/keys? The distrib

Re: Excessive stdout is causing java heap out of mem

2017-06-01 Thread Robert Metzger
le the > ProcessFunction timer to the latest timer due to a bug from the elements > burst and its causing pipeline to back pressure, as a result the watermark > is also get stuck which is causing lots of buffering. > > Thanks for your tips, it was helpful. > > — > Fritz >

[ANNOUNCE] Apache Flink 1.3.0 released

2017-06-01 Thread Robert Metzger
The Apache Flink community is pleased to announce the release of Apache Flink 1.3.0. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data streaming applications. The release is available for download at: https://fl

[ANNOUNCE] Flink Forward Berlin (11-13 Sep 2017) Call for Submissions is open now

2017-05-30 Thread Robert Metzger
Dear Flink Community, The Call for Submissions for Flink Forward Berlin 2017 is open now! Since we believe in collaboration, participation and exchange of ideas, we are inviting the Flink community to submit a session. Share your knowledge, applications, use cases and best practices and shape the

Re: Checkpointing SIGSEGV

2017-05-26 Thread Robert Metzger
Hi Jason, This error is unexpected. I don't think it's caused by insufficient memory. I'm adding Stefan to the conversation, he's the RocksDB expert :) On Thu, May 25, 2017 at 4:15 PM, Jason Brelloch wrote: > Hey guys, > > We are running into a JVM crash on checkpointing when our rocksDB st

Re: High Availability on Yarn

2017-05-24 Thread Robert Metzger
at 3:36 PM, Robert Metzger wrote: > Hi Ankit, > > I'm sorry that nobody is responding to the message. I'll try to find > somebody. > > On Tue, May 23, 2017 at 10:27 PM, Jain, Ankit wrote: > >> Following up on this. >> >> >> >> *

Re: High Availability on Yarn

2017-05-24 Thread Robert Metzger
Hi Ankit, I'm sorry that nobody is responding to the message. I'll try to find somebody. On Tue, May 23, 2017 at 10:27 PM, Jain, Ankit wrote: > Following up on this. > > > > *From: *"Jain, Ankit" > *Date: *Tuesday, May 16, 2017 at 12:14 AM > > *To: *Stephan Ewen , "user@flink.apache.org" < > u

Re: yarnship option

2017-05-23 Thread Robert Metzger
or-kafka-0.9_2.11/pom.properties > 0 2017-04-11 01:59 META-INF/maven/org.apache. > flink/force-shading/ > 3285 2017-04-11 01:59 META-INF/maven/org.apache. > flink/force-shading/pom.xml > 114 2017-04-11 01:59 META-INF/maven/org.apache. > flink/force-sha

Re: Excessive stdout is causing java heap out of mem

2017-05-23 Thread Robert Metzger
ert, I’ll start using the logger. > > I didn’t pay attention whether the error occur when I accessed the log > from job manager. > I will do that in my next test. > > Anyone has any suggestion on how to debug out of memory exception on flink > jm/tm ? > > — > Fritz >

Re: yarnship option

2017-05-22 Thread Robert Metzger
Hi, this issue is unexpected :) Can you double check if the job-libs/flink-connector-kafka-0.9_2.11-1.2.1.jar contains org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumer09 ? Also, were there any previous Kafka09 related exceptions in the log? From this SO answer, it seems that this

Re: Job submission: Fail using command line. Success using web (flink1.2.0)

2017-05-22 Thread Robert Metzger
> org.apache.flink.runtime.client.JobClientActorSubmissionTimeoutException: > Job submission to the JobManager timed out. You may increase > 'akka.client.timeout' in case the JobManager needs more time to configure > and confirm the job submission. > - > Does this

Re: Excessive stdout is causing java heap out of mem

2017-05-22 Thread Robert Metzger
Hi Fritz, The TaskManagers are not buffering all stdout for the web interface (at least I'm not aware of that). Did the error occur when accessing the log from the JobManager? Flink's web front end lazily loads the logs from the TaskManagers. The suggested method for logging is to use slf4j for log

Re: Using Kafka 0.10.x timestamps as a record value in Flink Streaming

2017-05-12 Thread Robert Metzger
Hi Jia, The Kafka 0.10 connector internally relies on the Kafka09 fetcher, but it is extensible / pluggable so that the Kafka 0.9 fetcher can also read the event timestamps from Kafka 0.10. We don't expose the timestamp through the deserialization API, because we set it internally in Flink. (there i

Re: [DISCUSS] Release 1.3.0 RC0 (Non voting, testing release candidate)

2017-05-09 Thread Robert Metzger
E are okay > >>>>> > >>>>> - In the shaded JAR files, we are not bundling the license and > >>>>> notice files of the dependencies we include in the shaded jars. > >>>>> => Not a problem for Guava (Apache Licensed

[DISCUSS] Release 1.3.0 RC0 (Non voting, testing release candidate)

2017-05-08 Thread Robert Metzger
Hi Devs, I've created a first non-voting release candidate for Flink 1.3.0. Please use this RC to test as much as you can and provide feedback to the Flink community. The more we find and fix now, the better Flink 1.3.0 will be :) I've CC'ed the user@ mailing list to get more people to test it. DO

Re: Flink job on secure Yarn fails after many hours

2017-04-12 Thread Robert Metzger
Niels, are you still facing this issue? As far as I understood it, the security changes in Flink 1.2.0 use a new Kerberos mechanism that allows infinite token renewal. On Thu, Mar 17, 2016 at 7:30 AM, Maximilian Michels wrote: > Hi Niels, > > Thanks for the feedback. As far as I know, Hadoop de

[ANNOUNCE] Flink Forward San Francisco 10-11 Apr 2017 community discount codes

2017-03-24 Thread Robert Metzger
Dear Flink community, I would like to bring Flink Forward San Francisco to your attention. After hosting Flink Forward for two years in Berlin, Germany, we decided to bring it to the US west coast as well. Check out this very nice video summary from last year's conference in Berlin: https://www.y

Re: [E] Re: unable to add more servers in zookeeper quorum peers in flink 1.2

2017-03-24 Thread Robert Metzger
> and after the equals sign. > > > On Mar 23, 2017, at 10:12 AM, Robert Metzger wrote: > > Thank you for verifying. Fixed in master: http://git-wip-us. > apache.org/repos/asf/flink/commit/3e860b40 > > On Wed, Mar 22, 2017 at 9:25 PM, > wrote: > >> That worked..

Re: org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl

2017-03-24 Thread Robert Metzger
Hi, I don't think that this error is actually an issue: https://issues.apache.org/jira/browse/YARN-1022 Is something not working as expected in your application? On Fri, Mar 24, 2017 at 5:53 AM, wrote: > hi, I read a file from hdfs, but there is an error when running the job on the yarn cluster, > ---

Re: flink Broadcast

2017-03-24 Thread Robert Metzger
Hi, Can you provide more logs to help us understand what's going on? One note regarding your application: You are calling .collect() and sending the collection with the map() call to the cluster again. This is pretty inefficient and can potentially break your application (in particular the RPC system

Re: Task manager number mismatch container number on mesos

2017-03-23 Thread Robert Metzger
Could you provide the logs of the task manager that still runs as a container but doesn't show up as a Taskmanager? On Thu, Mar 23, 2017 at 11:38 AM, Renjie Liu wrote: > Permanent. I've waited for several minutes and the task manager is still > lost. > > On Thu, Mar 23, 2017 at 6:34 PM Ufuk Cele

Re: Odd error

2017-03-23 Thread Robert Metzger
Hi, I assume the flatMap(new RecordSplit()) is emitting a RawRecord. Is it possible that you've also added an empty constructor to it while adding the compareTo() method? I think the problem is that one of your types (probably the schema) is recognized as a nested POJO. Check out this documentati

Re: [E] Re: unable to add more servers in zookeeper quorum peers in flink 1.2

2017-03-23 Thread Robert Metzger
Thank you for verifying. Fixed in master: http://git-wip-us.apache.org/repos/asf/flink/commit/3e860b40 On Wed, Mar 22, 2017 at 9:25 PM, wrote: > That worked.. Thanks Chesnay, > > > > > [image: Verizon] > > Kanagaraj Vengidasamy > RTCI > > 7701 E Telecom PKWY > Temple Te

Re: Flink AUR package is available

2017-03-23 Thread Robert Metzger
Amazing, thanks a lot! On Thu, Mar 23, 2017 at 10:36 AM, Tao Meng wrote: > Hi all, > > For arch linux users, I have created flink AUR package. > We can use the package manager to install flink and use the systemd > manager flink as service. > If you have any questions or suggestions please fee

Re: RocksDB segfaults

2017-03-23 Thread Robert Metzger
Florian, can you post the log of the TaskManager where the segfault happened? On Wed, Mar 22, 2017 at 6:19 PM, Stefan Richter wrote: > Hi, > > for the first checkpoint, from the stacktrace I assume that the backend is > not accessed as part of processing an element, but by another thread. Is >

Re: Threading issue

2017-03-23 Thread Robert Metzger
Hi, how many unique combinations of your key "partition","threadNumber","schemaId" exist? In my opinion, all sinks should receive data if there are enough different keys. On Wed, Mar 22, 2017 at 3:41 AM, Telco Phone wrote: > I am looking to get readers from kafka / keyBy and Sink working with
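
The point about needing "enough different keys" can be illustrated with a small sketch. This is only a model, written in Python for brevity: it assumes a plain CRC32 hash modulo the parallelism, which is not Flink's actual key-group assignment, and the key values are made up. It shows that the number of sinks receiving data can never exceed the number of distinct keys.

```python
import zlib

def sink_for_key(key: tuple, parallelism: int) -> int:
    """Map a composite key to a sink index (simplified stand-in for keyBy)."""
    encoded = "|".join(map(str, key)).encode("utf-8")
    return zlib.crc32(encoded) % parallelism

def busy_sinks(keys, parallelism: int) -> set:
    """Return the set of sink indices that receive at least one record."""
    return {sink_for_key(key, parallelism) for key in keys}

# Hypothetical keys of the form (partition, threadNumber, schemaId).
single_key = [("p0", 0, "schemaA")] * 1000
many_keys = [(f"p{p}", t, "schemaA") for p in range(8) for t in range(4)]

print("one distinct key  ->", len(busy_sinks(single_key, 4)), "busy sink(s)")
print("32 distinct keys  ->", len(busy_sinks(many_keys, 4)), "busy sink(s)")
```

With a single distinct key, every record lands on the same sink regardless of the configured parallelism, which is one possible explanation for idle sinks.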

Re: [POLL] Who still uses Java 7 with Flink ?

2017-03-23 Thread Robert Metzger
re is > currently little cost to staying with Java 7 since no Flink code or pull > requests have been written for Java 8. > > Greg > > > > On Mar 23, 2017, at 6:37 AM, Robert Metzger wrote: > > Looks like 9% on twitter and 24% on the mailing list are still using Java

Re: A question about iterations and prioritizing "control" over "data" inputs

2017-03-23 Thread Robert Metzger
To very quickly respond to Theo's question: No, it is not possible to have records overtake each other in the buffer. This could potentially void the exactly once processing guarantees, in the case when records overtake checkpoint barriers. On Wed, Mar 15, 2017 at 5:58 PM, Kathleen Sharp wrote:

Re: Telling if a job has caught up with Kafka

2017-03-23 Thread Robert Metzger
Sorry for joining this discussion late, but there is already a metric for the offset lag in our 0.9+ consumers. It's called the "records-lag-max": https://kafka.apache.org/documentation/#new_consumer_fetch_monitoring and it's exposed via Flink's metrics system. The only issue is that it only shows th

Re: accessing flink HA cluster with scala shell/zeppelin notebook

2017-03-23 Thread Robert Metzger
Hi Alexis, did you set the Zookeeper configuration for Flink in Zeppelin? On Mon, Mar 20, 2017 at 11:37 AM, Alexis Gendronneau < a.gendronn...@gmail.com> wrote: > Hello users, > > As Maciek, I'm currently trying to make apache Zeppelin 0.7 working with > Flink. I have two versions of flink avail

Re: [POLL] Who still uses Java 7 with Flink ?

2017-03-23 Thread Robert Metzger
ch current, old-tech > compatible version at a reasonably limited scope. Building backward > compatibility is too much for an open sourced project > > > > On Wed, Mar 15, 2017 at 7:10 AM, Robert Metzger > wrote: > >> I've put it also on our Twitter account: >> h

Re: Netty issues while deploying Flink with Yarn on MapR

2017-03-23 Thread Robert Metzger
Just FYI: There is now a documentation page on how to use Flink on MapR (Thanks to Gordon): https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/mapr_setup.html On Tue, Feb 7, 2017 at 6:34 PM, Robert Metzger wrote: > Hi, > cool! > > Yes, creating a JIRA for the probl

Re: Return of Flink shading problems in 1.2.0

2017-03-20 Thread Robert Metzger
Here is the JIRA: https://issues.apache.org/jira/browse/FLINK-6125 On Mon, Mar 20, 2017 at 10:27 AM, Robert Metzger wrote: > Hi Craig, > > I was able to reproduce the issue with maven 3.3 in Flink 1.2. I'll look > into it. > > On Fri, Mar 17, 2017 at 11:56 PM, Foster, Cr

Re: Return of Flink shading problems in 1.2.0

2017-03-20 Thread Robert Metzger
know of a new workaround for 3.3.x. > > > > Thanks! > > Craig > > > > *From: *"Foster, Craig" > *Reply-To: *"user@flink.apache.org" > *Date: *Friday, March 17, 2017 at 7:23 AM > *To: *"user@flink.apache.org" > *Cc: *Ufuk Celebi , Robe

Re: Flink 1.2 and Cassandra Connector

2017-03-16 Thread Robert Metzger
ar 16, 2017 at 6:08 PM, Robert Metzger wrote: > Yep, this is definitively a bug / misconfiguration in the build system. > > The cassandra client defines metrics-core as a dependency, but the shading > is dropping the dependency when building the dependency reduced pom. > To resolve the i

Re: Flink 1.2 and Cassandra Connector

2017-03-16 Thread Robert Metzger
s-io:commons-io >>> commons-cli:commons-cli >>> >>> >>> org.apache.flink:* >>>

Re: Checkpointing with RocksDB as statebackend

2017-03-16 Thread Robert Metzger
Yes, you can change the GC using the env.java.opts parameter. We are not setting any GC on YARN. On Thu, Mar 16, 2017 at 1:50 PM, Stephan Ewen wrote: > The only immediate workaround is to use windows with "reduce" or "fold" or > "aggregate" and not "apply". And to not use an evictor. > > The goo

Re: [POLL] Who still uses Java 7 with Flink ?

2017-03-15 Thread Robert Metzger
I've put it also on our Twitter account: https://twitter.com/ApacheFlink/status/842015062667755521 On Wed, Mar 15, 2017 at 2:19 PM, Martin Neumann wrote: > I think this easier done in a straw poll than in an email conversation. > I created one at: http://www.strawpoll.me/12535073 > (Note that yo

Re: Source and Sink Flink

2017-03-14 Thread Robert Metzger
Hi Alberto, It should be possible. IBM MQ supports the JMS standard, and we have a JMS compatible connector for Flink in Apache Bahir: http://bahir.apache.org/docs/flink/current/flink-streaming-activemq/ For writing files to HDFS, we have the bucketing sink in Flink https://ci.apache.org/pro

Re: Frontend classpath issue

2017-03-13 Thread Robert Metzger
wise I am in favor of reverting to the 1.1 > default. (My logic is that the user will only observe a difference in > behavior when the new setup actually causes problems) > > Gyula > > On Fri, Feb 24, 2017, 17:53 Robert Metzger wrote: > >> The JIRA (https://issues.apache.

Re: AWS exception serialization problem

2017-03-11 Thread Robert Metzger
erializer for Throwables in Kryo, but I’m not sure if we should actually > do this, or just wait for the Kryo fix to be released. > > - Gordon > > > On March 11, 2017 at 9:54:03 AM, Shannon Carey (sca...@expedia.com) wrote: > > Here ya go (see attached). > > > From

Re: Streaming Exception

2017-03-11 Thread Robert Metzger
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue. > runTask(ForkJoinPool.java:1339) > at scala.concurrent.forkjoin.ForkJoinPool.runWorker( > ForkJoinPool.java:1979) > at scala.concurrent.forkjoin.ForkJoinWorkerThread.run( > ForkJoinWorkerThread.java:107) > >

Re: Any good ideas for online/offline detection of devices that send events?

2017-03-10 Thread Robert Metzger
Pro tip for debugging watermarks: They are exposed via a metric in Flink 1.2. On Tue, Mar 7, 2017 at 1:37 PM, Bruno Aranda wrote: > Hi Gordon, > > Many thanks for your helpful ideas. We tried yesterday the CEP approach, > but could not figure it out. The ProcessFunction one looks more promising,

Re: Isolate Tasks - Run Distinct Tasks in Different Task Managers

2017-03-10 Thread Robert Metzger
There is currently no way in Flink to define such scheduling constraints. On Wed, Mar 8, 2017 at 5:00 PM, PedroMrChaves wrote: > Thanks for the response. > > I would like to assure that the map operator is not in the same task > manager > as the window/apply operator, regardless of the number of

Re: Job completion or failure callback?

2017-03-10 Thread Robert Metzger
Hi Shannon, the web UI runs on the same JVM as the JobManager, so log outputs should go there. There is no way of running user code on the JobManager on job completion. We try to not allow users to execute code on the JobManager: bringing the JM down will kill the entire cluster :) What you ar

Re: AWS exception serialization problem

2017-03-10 Thread Robert Metzger
Can one of you guys provide us with a minimal example to reproduce the issue? (Ideally locally, not using EMR?) I think once we can reproduce the issue it's easy to fix. On Thu, Mar 9, 2017 at 1:24 AM, Bruno Aranda wrote: > Hi Stephan, we are running Flink 1.2.0 on Yarn (AWS EMR cluster) > > On W

Re: Issues with Event Time and Kafka

2017-03-10 Thread Robert Metzger
Hi Ethan, how late elements (elements whose event time is behind the current watermark) are handled depends on the operator. Flink's window operators will trigger a single event window when the elements fall into the "allowed lateness" timeframe. Otherwise, they are dropped. On Thu, Mar 9, 2017 at 5:30 PM, ext.eformi
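
The handling of late elements described above can be modelled roughly as follows. This is a simplified sketch in Python, not Flink code: the function names, the tumbling-window assignment, and the three labels are assumptions made for illustration, and the comparison against the watermark follows my understanding of Flink's event-time windows (a window counts as expired once its last timestamp plus the allowed lateness is at or behind the watermark).

```python
def window_max_timestamp(timestamp: int, size: int) -> int:
    """Last timestamp that belongs to the tumbling window containing `timestamp`."""
    start = timestamp - (timestamp % size)
    return start + size - 1

def classify(timestamp: int, watermark: int, size: int, allowed_lateness: int) -> str:
    """Classify an element as 'on_time', 'late_fire', or 'dropped' (simplified)."""
    max_ts = window_max_timestamp(timestamp, size)
    if max_ts > watermark:
        return "on_time"      # the window has not fired yet
    if max_ts + allowed_lateness > watermark:
        return "late_fire"    # window already fired, element triggers it again
    return "dropped"          # window state is gone, element is discarded

# Element with event time 12 falls into the window [10, 20) when size is 10.
for wm in (15, 19, 24):
    print(f"watermark={wm}: {classify(12, wm, size=10, allowed_lateness=5)}")
```

The sketch ignores custom triggers, evictors, and side outputs, all of which can change what actually happens to a late element.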

Re: Flink support for DeepLearning4j or other deep learning library

2017-03-10 Thread Robert Metzger
Hi, I'm not aware of any community efforts in that direction. This is the only thing Google brought up: https://www.slideshare.net/FlinkForward/suneel-marthi-deep-learning-with-apache-flink-and-dl4j But in general, I don't think that it should be terribly hard to get started with something, sinc

Re: Reference configs for HA / RocksDB / YARN / Zookeeper / HDFS

2017-03-10 Thread Robert Metzger
Hi Dave, Let me answer your questions: 1. The RocksDB state backend always stores the data on local disks for speed. The back up is done to HDFS or any other distributed file system. The local data directory is configured automatically by YARN. 2. You need to manually configure zookeeper in the f

Re: Flink Standalone Service

2017-03-10 Thread Robert Metzger
Hi Daniel, if you install the flink RPMs (or DEBs) from Apache Bigtop, they should come with init.d or service files. On Thu, Mar 9, 2017 at 6:10 PM, Daniel Skates wrote: > Hi all, > > Is there a init.d or similar service script for Flink on Redhat (or > Centos) 7? Mostly I'm just wanting to m

Re: Flink - Writing Test Case for the Datastream

2017-03-10 Thread Robert Metzger
Hi Mahesh, In the Kafka tests, we're using a pattern of killing a job by throwing a "SuccessException" after a certain number of messages have passed. Just check the Kafka tests to see how it's done :) On Thu, Mar 9, 2017 at 10:09 PM, MAHESH KUMAR wrote: > Hi Team, > > I am trying to write test c
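
The "SuccessException" pattern mentioned above can be sketched in a few lines. This is only an illustration of the idea in Python, with no Flink or Kafka involved: `endless_source`, `counting_sink`, `run_job`, and `run_test` are hypothetical names, and the real tests in the Flink code base are written in Java and look different.

```python
class SuccessException(Exception):
    """Raised on purpose to stop an otherwise endless job once the test passed."""

def endless_source():
    """Stand-in for an unbounded stream such as a Kafka topic."""
    i = 0
    while True:
        yield i
        i += 1

def counting_sink(expected: int):
    """Return a sink that aborts the job after `expected` messages were seen."""
    seen = 0
    def consume(message) -> None:
        nonlocal seen
        seen += 1
        if seen >= expected:
            raise SuccessException(f"saw {seen} messages")
    return consume

def run_job(source, sink) -> None:
    """Push every message from the source into the sink; never returns normally."""
    for message in source:
        sink(message)

def run_test(expected: int) -> bool:
    """The test passes only if the job was stopped by a SuccessException."""
    try:
        run_job(endless_source(), counting_sink(expected))
    except SuccessException:
        return True
    return False

print("test passed:", run_test(100))
```

In a real distributed job the exception usually arrives wrapped in a job-failure exception, so a test would have to inspect the cause chain rather than catch it directly.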

Re: questions on custom state with flink window

2017-03-10 Thread Robert Metzger
Hi Sai, 1) I think it's okay to keep state in a RichWindowFunction. 2) I think it stays forever, yes. 3) I'm including Nico, he can probably help you with the queryable state question. 4) I guess that's a queryable state question too. On Fri, Mar 10, 2017 at 1:04 AM, saiprasad mishra wrote: > Hi

Re: Streaming Exception

2017-03-10 Thread Robert Metzger
Hi, this error is only logged at WARN level. As Kaibo already said, it's not a critical issue. Can you send some more messages from your log? Usually the JobManager logs why a TaskManager has failed. And the last few log messages of the failed TM itself are also often helpful. On Fri, Mar 10, 2

Re: Request jira permission

2017-03-10 Thread Robert Metzger
You'll have permissions in the next two minutes :) On Fri, Mar 10, 2017 at 2:19 PM, Mauro Cortellazzi < mauro.cortella...@radicalbit.io> wrote: > Hello Comunity, > > i would help and contribute into Flink and i'm already registered into > jira. > > I've created an issue [1] about documentation an

Re: Integrate Flink with S3 on EMR cluster

2017-03-10 Thread Robert Metzger
Hi Vinay, using the HADOOP_CLASSPATH variable on the client machine is the recommended way to solve this problem. I'll update the documentation accordingly. On Wed, Mar 8, 2017 at 10:26 AM, vinay patil wrote: > Hi , > > @Shannon - I am not facing any issue while writing to S3, was getting > N

Re: Performance tuning

2017-03-10 Thread Robert Metzger
ryo > serializer? > > > Best regards, > Dmitry > > On Thu, Feb 23, 2017 at 8:59 PM, Robert Metzger > wrote: > >> Hi Dmitry, >> >> Cool! Looks like you've taken the right approach to analyze the >> performance issues! >> Often the deseri

Re: [test][ignore] Sending an email to user@flink without being subscribed ...

2017-03-04 Thread Robert Metzger
Please keep ignoring my messages here. I'll talk to Infra again if the issue persists. On Wed, Mar 1, 2017 at 11:35 AM, Robert Metzger wrote: > As part of https://issues.apache.org/jira/browse/INFRA-13594, I'm sending > another message from my other email address, to check o

Re: [test][ignore] Sending an email to user@flink without being subscribed ...

2017-03-01 Thread Robert Metzger
As part of https://issues.apache.org/jira/browse/INFRA-13594, I'm sending another message from my other email address, to check out the rejection message :) On Thu, Feb 23, 2017 at 4:58 PM, Robert Metzger wrote: > Please ignore these messages. > > I'll talk to the ASF infra

Re: Flink the right tool for the job ? Huge Data window lateness

2017-02-24 Thread Robert Metzger
Hi, sounds like a cool project. What's the size of one data point? If one datapoint is 2 kb, you'll have 100 800 000 * 2048 bytes = 206 gigabytes of state. That's something one or two machines (depending on the disk throughput) should be able to handle. If possible, I would recommend you to do an
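
The state-size estimate above is plain multiplication; a short Python check of the arithmetic, using the numbers from the message (the 2 kb per data point is the message's own hypothetical case):

```python
# Numbers taken from the message above; the point size is a hypothetical case.
NUM_POINTS = 100_800_000
POINT_SIZE_BYTES = 2048

def state_size_bytes(num_points: int, point_size_bytes: int) -> int:
    """Raw state size, ignoring serialization and state backend overhead."""
    return num_points * point_size_bytes

total = state_size_bytes(NUM_POINTS, POINT_SIZE_BYTES)
print(f"{total} bytes = {total / 1e9:.1f} GB (decimal) = {total / 2**30:.1f} GiB")
```

The "206 gigabytes" figure uses decimal gigabytes; in binary units the same state is somewhat smaller, and real disk usage also depends on the state backend.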

Re: Frontend classpath issue

2017-02-24 Thread Robert Metzger
for adding the user jar to all > the classpaths. > > On Fri, 24 Feb 2017 at 14:50 Robert Metzger wrote: > >> I agree with you Gyula, this change is dangerous. I have seen another case >> from a user with Hadoop dependencies that crashed in Flink 1.2.0 that >> didn't

Re: Frontend classpath issue

2017-02-24 Thread Robert Metzger
I agree with you Gyula, this change is dangerous. I have seen another case from a user with Hadoop dependencies that crashed in Flink 1.2.0 that didn't in 1.1.x. I wonder if we should introduce a config flag for Flink 1.2.1 to disable the behavior if needed. On Fri, Feb 24, 2017 at 2:27 PM, Ufuk C

Re: Frontend classpath issue

2017-02-23 Thread Robert Metzger
nto a different location. On Thu, Feb 23, 2017 at 10:01 PM, Gyula Fóra wrote: > Hi Robert, > It definitely explains the behaviour. > > This only applies to the frontend right? > If so what is the rationale behind it, and how should I handle the > dependency conflict? > > Thank

Re: Performance tuning

2017-02-23 Thread Robert Metzger
> > I understand that it's not possible to keep the rate the same when adding > more components due to communication overhead. > I'm just trying to reduce it. > > > Best regards, > Dmitry > > On Thu, Feb 23, 2017 at 4:17 PM, Robert Metzger > wrote: > >

Re: Difference between partition and groupBy

2017-02-23 Thread Robert Metzger
Hi Patrick, I think (but I'm not 100% sure) it's not a difference in what the engine does in the end, it's more of an API thing. When you are grouping, you can perform operations such as reducing afterwards. On a partitioned dataset, you can do stuff like processing each partition in parallel, or so

Re: Frontend classpath issue

2017-02-23 Thread Robert Metzger
Hi, Since Flink 1.2 "per job yarn applications" (when you do "-m yarn-cluster") include the job jar into the classpath as well. Does this change explain the behavior? On Thu, Feb 23, 2017 at 4:59 PM, Gyula Fóra wrote: > Hi, > > I have a problem that the frontend somehow seems to have the user ja

Re: Flink not reading from Kafka

2017-02-23 Thread Robert Metzger
> AFAIR I checked the taskmanager log from the Flink UI in Mesos. It said > stdout was not available. But this may be due to the fact that Flink on > DC/OS is not yet very stable .. > > regards. > > On Fri, Feb 24, 2017 at 1:41 AM, Robert Metzger > wrote: > >>

Re: Flink checkpointing gets stuck

2017-02-23 Thread Robert Metzger
Hi Shai, I think we don't have so many users running Flink on Azure. Maybe you are the first to put some heavy load onto that infrastructure using Flink. I would guess that your problems are caused by the same root cause, just the way the job is being cancelled is a bit different based on what is

Re: Checkpointing with RocksDB as statebackend

2017-02-23 Thread Robert Metzger
The FsStateBackend uses the heap to keep the data, only the state snapshots are stored in the file system. On Thu, Feb 23, 2017 at 6:13 PM, vinay patil wrote: > Hi, > > When I disabled checkpointing the memory usage is similar for all nodes, > this means that for checkpointing enabled case the

Re: Flink not reading from Kafka

2017-02-23 Thread Robert Metzger
oKafka.java#L71 .. > > It worked for me. Is the stdout disabled somehow by default ? > > regards. > > On Thu, Feb 23, 2017 at 9:42 PM, Robert Metzger > wrote: > >> Hi Mohit, >> >> is there new data being produced into the topic? >> The properties.setProper

Re: Reliable Distributed FS support (HCFS)

2017-02-23 Thread Robert Metzger
Hi Vijay, Regarding your second question: First of all, the example jobs of Flink need to pass. Secondly, I would recommend implementing a test job that uses a lot of state, different state backends (file system and rocks) and some artificial failures. We at data Artisans have some testing jobs in

Re: Cartesian product over windows

2017-02-23 Thread Robert Metzger
I think Till is referring to regular windows. The *All variants combine the data into one task. On Fri, Feb 17, 2017 at 4:14 PM, Sonex wrote: > Hi Till, > > when you say parallel windows, what do you mean? Do you mean the use of > timeWindowAll which has all the elements of a window in a single

Re: Log4J

2017-02-23 Thread Robert Metzger
Absolutely agreed. I have such a task on my todo list and I hope to find time to address this soon. On Mon, Feb 20, 2017 at 8:08 PM, Stephan Ewen wrote: > How about adding this to the "logging" docs - a section on how to run > log4j2 > > On Mon, Feb 20, 2017 at 8:50 AM, R

Re: Performance tuning

2017-02-23 Thread Robert Metzger
Hi Dmitry, sorry for the late response. Where are you reading the data from? Did you check if one operator is causing backpressure? Are you using checkpointing? Serialization is often the cause for slow processing. However, it's very hard to diagnose potential other causes without any details on

Re: Flink not reading from Kafka

2017-02-23 Thread Robert Metzger
Hi Mohit, is there new data being produced into the topic? The properties.setProperty("auto.offset.reset", "earliest"); setting only applies if you haven't consumed anything in this consumer group. So if you have read all the data in the topic before, you won't see anything new showing up. On Sat
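
The rule described above (the reset policy only matters when the consumer group has no committed position) can be written down as a tiny decision function. This is a model in Python of the consumer semantics, not Kafka or Flink code: `starting_offset` and its parameters are hypothetical names, and the case of a committed offset that has fallen out of range is left out.

```python
from typing import Optional

def starting_offset(committed: Optional[int], reset_policy: str,
                    earliest: int, latest: int) -> int:
    """Pick the offset a consumer starts from for one partition (simplified)."""
    if committed is not None:
        # The group already has a position: auto.offset.reset is ignored.
        return committed
    if reset_policy == "earliest":
        return earliest
    if reset_policy == "latest":
        return latest
    raise ValueError(f"unsupported reset policy: {reset_policy}")

# Fresh consumer group: "earliest" re-reads the whole partition.
print(starting_offset(None, "earliest", earliest=0, latest=500))
# Group that already consumed everything: nothing shows up until new data arrives.
print(starting_offset(500, "earliest", earliest=0, latest=500))
```

Under this model, seeing no output after a restart is expected if the group has already read the topic to the end; using a new group id is one way to read from the beginning again.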

Re: How to achieve exactly once on node failure using Kafka

2017-02-23 Thread Robert Metzger
Hi, exactly. You have to make sure that you can write data for the same ID multiple times. Exactly once in Flink is only guaranteed for registered state. So if you have a flatMap() with a "counter" variable, that is held in a "ValueState", this counter will always be in sync with the number of ele
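
Why the sink has to tolerate writing the same ID several times can be seen in a small simulation. This is a sketch in Python with made-up names (`IdempotentSink`, `AppendSink`, `replay_with_failure`); it assumes that after a failure the job restarts from a checkpoint taken before any of these records were processed, so all of them are replayed.

```python
class IdempotentSink:
    """Keyed upsert: writing the same ID twice leaves a single row."""
    def __init__(self) -> None:
        self.rows = {}
    def write(self, record_id: str, value: int) -> None:
        self.rows[record_id] = value

class AppendSink:
    """Plain append: every replayed record produces a duplicate row."""
    def __init__(self) -> None:
        self.rows = []
    def write(self, record_id: str, value: int) -> None:
        self.rows.append((record_id, value))

def replay_with_failure(sink, records, fail_after: int) -> None:
    """First attempt dies after `fail_after` records, then everything is replayed."""
    for record_id, value in records[:fail_after]:
        sink.write(record_id, value)
    for record_id, value in records:  # restart from the last checkpoint
        sink.write(record_id, value)

records = [("a", 1), ("b", 2), ("c", 3), ("d", 4)]
upsert, append = IdempotentSink(), AppendSink()
replay_with_failure(upsert, records, fail_after=2)
replay_with_failure(append, records, fail_after=2)
print("idempotent rows:", len(upsert.rows), "append rows:", len(append.rows))
```

The idempotent sink ends up with one row per ID even though two records were written twice, while the append-only sink keeps the duplicates.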

Re: flink on yarn ha

2017-02-23 Thread Robert Metzger
Hi, This looks like a shading issue. Can you post the classpath the JobManager / AppMaster is logging on startup on the mailing list? It seems that Hadoop loads an unshaded version of the SecurityProtos. Maybe there is some Hadoop version mixup. Are you using a Hadoop distribution (like CDH or HD

Re: [test][ignore] Sending an email to user@flink without being subscribed ...

2017-02-23 Thread Robert Metzger
Please ignore these messages. I'll talk to ASF infra about how we can resolve the issue. On Thu, Feb 23, 2017 at 4:54 PM, Robert Metzger wrote: > I'm testing what happens if I'm sending an email to the user@flink list > without being subscribed. > > On the dev@ list,

[test][ignore] Sending an email to user@flink without being subscribed ...

2017-02-23 Thread Robert Metzger
I'm testing what happens if I'm sending an email to the user@flink list without being subscribed. On the dev@ list, moderators get an email in that case. I have the suspicion that you can post on the user@ list without subscribing first. We often have users that ask a question, we give an initial

Re: Unknown I/O error while extracting contained jar files

2017-02-22 Thread Robert Metzger
Hi, which JVM variant and version are you using? What's your operating system? This is a pretty generic issue. If you search for "ZipException: error in opening zip file" on Google, you'll find plenty of people who have this issue as well. I think it's unrelated to Flink and more of a general JVM

Re: Arrays values in keyBy

2017-02-21 Thread Robert Metzger
I've filed a JIRA for this issue: https://issues.apache.org/jira/browse/FLINK-5874 On Wed, Jul 20, 2016 at 4:32 PM, Stephan Ewen wrote: > I think we can simply add this behavior when we use the TypeComparator in > the keyBy() function. It can implement the hashCode() as a deepHashCode on > array

Re: Previously working job fails on Flink 1.2.0

2017-02-21 Thread Robert Metzger
I've filed a JIRA for the problem: https://issues.apache.org/jira/browse/FLINK-5874 On Tue, Feb 21, 2017 at 4:09 PM, Stephan Ewen wrote: > @Steffen > > Yes, you can currently not use arrays as keys. There is a check missing > that gives you a proper error message for that. > > The double[] is ha

Re: Log4J

2017-02-19 Thread Robert Metzger
Hi Chet, These are the files I have in my lib/ folder with the working log4j2 integration: -rw-r--r-- 1 robert robert 79966937 Oct 10 13:49 flink-dist_2.10-1.1.3.jar -rw-r--r-- 1 robert robert 90883 Dec 9 20:13 flink-python_2.10-1.1.3.jar -rw-r--r-- 1 robert robert 60547 Dec 9 18:45 lo

Re: Log4J

2017-02-16 Thread Robert Metzger
I've also (successfully) tried running Flink with log4j2 to connect it to greylog2. If I remember correctly, the biggest problem was "injecting" the log4j2 properties file into the classpath (when running Flink on YARN). Maybe you need to put the file into the lib/ folder, so that it is shipped to

Re: "Job submission to the JobManager timed out" on EMR YARN cluster with multiple jobs

2017-02-15 Thread Robert Metzger
Hi Geoffrey, I think the "per job yarn cluster" feature probably does not work for one main() function submitting multiple jobs. If you have a yarn session + regular "flink run" it should work. On Mon, Feb 13, 2017 at 5:37 PM, Geoffrey Mon wrote: > Just to clarify, is Flink designed to allow su

Re: JavaDoc 404

2017-02-14 Thread Robert Metzger
Feb 8, 2017 at 10:49 AM, Yassine MARZOUGUI < y.marzou...@mindlytix.com> wrote: > Thanks Robert and Ufuk for the update. > > 2017-02-07 18:43 GMT+01:00 Robert Metzger : > >> I've filed a JIRA for the issue: https://issues.apache.o >> rg/jira/browse/FLINK-5736 >

Re: Flink 1.2 Maven dependency

2017-02-12 Thread Robert Metzger
Hi Dominik, I hope the artifacts were distributed properly. Did you get download errors for the 1.2.0 version from any official Maven servers? Maybe mvnrepository.com is slow indexing new artifacts? Best, Robert On Fri, Feb 10, 2017 at 12:02 AM, Yassine MARZOUGUI < y.marzou...@mindlytix.com> wro

Re: Flink 1.2 and Cassandra Connector

2017-02-12 Thread Robert Metzger
Hi Nico, The cassandra connector should be available on Maven central: http://search.maven.org/#artifactdetails%7Corg.apache.flink%7Cflink-connector-cassandra_2.10%7C1.2.0%7Cjar Potentially, the issue you've mentioned is due to some shading issue. Is the "com/codahale/metrics/Metric" class in your

Re: #kafka/#ssl: how to enable ssl authentication for a new kafka consumer?

2017-02-09 Thread Robert Metzger
I've added another answer on SO that explains how you can pass a custom configuration object to the execution environment. On Thu, Feb 9, 2017 at 11:09 AM, alex.decastro wrote: > I found a similar question and answer at #stackoverflow > http://stackoverflow.com/questions/37743194/local-flink-con

Re: #kafka/#ssl: how to enable ssl authentication for a new kafka consumer?

2017-02-09 Thread Robert Metzger
Check out the documentation: https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/connectors/kafka.html#enabling-kerberos-authentication-for-versions-09-and-above-only On Wed, Feb 8, 2017 at 4:40 PM, alex.decastro wrote: > Dear flinkers, > I'm consuming from a kafka broker in a server
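As a rough illustration of the SSL question in this thread: the Kafka 0.9+ consumer reads its security settings from ordinary client properties, and Flink's Kafka consumer takes a `java.util.Properties` object in its constructor. The sketch below only builds that object. The class name, method name, broker address, group id, and truststore path are hypothetical placeholders; the property keys are standard Kafka client settings. Client-certificate authentication would additionally need the `ssl.keystore.*` settings, which are not shown.

```java
import java.util.Properties;

public class KafkaSslProps {

    // Builds consumer properties for an SSL-encrypted connection.
    // All argument values are supplied by the caller; nothing here is
    // taken from the original thread.
    static Properties sslConsumerProperties(String bootstrapServers,
                                            String groupId,
                                            String truststorePath,
                                            String truststorePassword) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Switch the client from the default PLAINTEXT protocol to SSL.
        props.setProperty("security.protocol", "SSL");
        // Truststore holding the certificate(s) used to verify the broker.
        props.setProperty("ssl.truststore.location", truststorePath);
        props.setProperty("ssl.truststore.password", truststorePassword);
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical values for illustration only.
        Properties props = sslConsumerProperties(
                "broker.example.com:9093", "my-group",
                "/path/to/client.truststore.jks", "changeit");
        props.stringPropertyNames().stream().sorted()
                .forEach(k -> System.out.println(k + "=" + props.getProperty(k)));
    }
}
```

The resulting `Properties` object would then be handed to the Kafka consumer in place of the plain connection properties.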

Re: 1.2 release date

2017-02-08 Thread Robert Metzger
ist+of+contributors would be updated > > But after announcement it’s not necessary I think > > > > *From:* Robert Metzger [mailto:rmetz...@apache.org] > *Sent:* Tuesday, February 7, 2017 7:58 PM > > *To:* user@flink.apache.org > *Subject:* Re: 1.2 release date > > &g

Re: JavaDoc 404

2017-02-07 Thread Robert Metzger
I've filed a JIRA for the issue: https://issues.apache.org/jira/browse/FLINK-5736 On Tue, Feb 7, 2017 at 5:00 PM, Robert Metzger wrote: > Yes, I'll try to fix it asap. Sorry for the inconvenience. > > On Mon, Feb 6, 2017 at 4:43 PM, Ufuk Celebi wrote: > >> Thanks

Re: Issue Faced: Not able to get the consumer offsets from Kafka when using Flink with Flink-Kafka Connector

2017-02-07 Thread Robert Metzger
Hi Mahesh, this is a known limitation of Apache Kafka: https://www.mail-archive.com/users@kafka.apache.org/msg22595.html You could implement a tool that manually retrieves the latest offset for the group from the __consumer_offsets topic. On Tue, Feb 7, 2017 at 6:24 PM, MAHESH KUMAR wrote: > Hi Team

Re: Netty issues while deploying Flink with Yarn on MapR

2017-02-07 Thread Robert Metzger
Hi, cool! Yes, creating a JIRA for the problem is a good idea. Once you've found a way to fix the issue, you can open a pull request referencing the issue. Regards, Robert On Tue, Feb 7, 2017 at 6:20 PM, ani.desh1512 wrote: > Thanks Robert. > I would love to try to solve this problem so that

Re: Netty issues while deploying Flink with Yarn on MapR

2017-02-07 Thread Robert Metzger
Hi Aniket, great analysis of the problem! Thank you for looking into it so thoroughly! Would you be interested in trying to solve the problem for Flink? We could try to provide a maven build profile that sets the correct versions and excludes. We could maybe also provide a MapR specific release of Flink

Re: logback

2017-02-07 Thread Robert Metzger
Hi Dmitry, Did you also check out this documentation page? https://ci.apache.org/projects/flink/flink-docs-release-1.2/monitoring/best_practices.html#use-logback-when-running-flink-on-a-cluster On Tue, Feb 7, 2017 at 1:07 PM, Dmitry Golubets wrote: > Hi, > > documentation says: "Users willing to

Re: Dealing with latency in Sink

2017-02-07 Thread Robert Metzger
Hi Mohit, Flink doesn't allow dynamic up- or downscaling of parallel operator instances at runtime. However, you can stop and restore from a savepoint with a different parallelism. This way, you can adapt to workload changes. Flink's handling of backpressure is very implicit. If you want to thrott

Re: To get Schema for jdbc database in Flink

2017-02-07 Thread Robert Metzger
Currently, there is no streaming JDBC connector. Check out this thread from last year: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/JDBC-Streaming-Connector-td10508.html On Mon, Feb 6, 2017 at 5:00 PM, Ufuk Celebi wrote: > I'm not sure how well this works for the streaming AP

Re: JavaDoc 404

2017-02-07 Thread Robert Metzger
Yes, I'll try to fix it asap. Sorry for the inconvenience. On Mon, Feb 6, 2017 at 4:43 PM, Ufuk Celebi wrote: > Thanks for reporting this. I think Robert (cc'd) is working on fixing > this, correct? > > On Sat, Feb 4, 2017 at 12:12 PM, Yassine MARZOUGUI > wrote: > > Hi, > > > > The JavaDoc link
