Re: convert Json to Tuple

2016-04-25 Thread Alexander Smirnov
Thanks Timur! I should have mentioned, I need it for Java On Mon, Apr 25, 2016 at 10:13 PM Timur Fayruzov wrote: > Normally, Json4s or Jackson+scala plugin work well for json to scala data > structure conversions. However, I would not expect they support a special >

Re: convert Json to Tuple

2016-04-25 Thread Timur Fayruzov
Normally, Json4s or Jackson+scala plugin work well for json to scala data structure conversions. However, I would not expect they support a special case for tuples, since JSON key-value fields would normally convert to case classes and JSON arrays are converted to, well, arrays. That's being said,

convert Json to Tuple

2016-04-25 Thread Alexander Smirnov
Hello everybody! my RMQSource function receives string with JSONs in it. Because many operations in Flink rely on Tuple operations, I think it is a good idea to convert JSON to Tuple. I believe this task has been solved already :) what's the common approach for this conversion? Thank you, Alex

Re: Flink program without a line of code

2016-04-25 Thread Alexander Smirnov
thank you so much for the responses, guys! On Sat, Apr 23, 2016 at 12:09 AM Flavio Pompermaier wrote: > Hi Alexander, > since I was looking for something similar some days ago here is what I > know about this argument: > during the Stratosphere project there was Meteor and

Job hangs

2016-04-25 Thread Timur Fayruzov
Hello, Now I'm at the stage where my job seem to completely hang. Source code is attached (it won't compile but I think gives a very good idea of what happens). Unfortunately I can't provide the datasets. Most of them are about 100-500MM records, I try to process on EMR cluster with 40 tasks 6GB

Re: Join DataStream with dimension tables?

2016-04-25 Thread Srikanth
Aljoscha, Looks like a potential solution. Feels a bit hacky though. Didn't quite understand why a list backed store is used to for static input buffer? Join(inner) should emit only one record if there is a key match. Is it a property of the system to emit Long.MAX_VALUE watermark when a finite

Re: Submit Flink Jobs to YARN running on AWS

2016-04-25 Thread Bajaj, Abhinav
Hi Fabian, Thanks for your reply and the pointers to documentation. In these steps, I think the Flink client is installed on the master node, referring to steps mentioned in Flink docs here. However, the scenario I

Re: Getting java.lang.Exception when try to fetch data from Kafka

2016-04-25 Thread prateekarora
Hi I have java program to send data into kafka topic. below is code for this : private Producer producer = null Serializer keySerializer = new StringSerializer(); Serializer valueSerializer = new ByteArraySerializer(); producer = new KafkaProducer(props,

Re: AvroWriter for Rolling sink

2016-04-25 Thread Igor Berman
Hi, it's not a problem, I'll find time to change it(I understand the refactoring is in master and not released yet). Wanted to ask if it's acceptable to add following dependency to flink? I mean my code reused code in this jar(pay attention it's not present currently in flink classpath)

Re: YARN terminating TaskNode

2016-04-25 Thread Timur Fayruzov
Great answer, thanks you Max for a very detailed explanation! Illuminating how off-heap parameter affects the memory allocation. I read this post: https://blogs.oracle.com/jrockit/entry/why_is_my_jvm_process_larger_t and the thing that jumped on me is the allocation of memory for jni libs. I do

Re: Getting java.lang.Exception when try to fetch data from Kafka

2016-04-25 Thread prateekarora
Hi I have java program that sending data into kafka topic using kafa client API (0.8.2) here is sample to code using to send data in kafka topic : import org.apache.kafka.clients.producer.Producer; import org.apache.kafka.clients.producer.ProducerConfig; import

Re: Access to a shared resource within a mapper

2016-04-25 Thread Timur Fayruzov
Hi Fabian, I didn't realize you meant that lazy val should be inside RichMapFunction implementation, it makes sense. That's what I ended up doing already. Thanks! Timur On Mon, Apr 25, 2016 at 3:34 AM, Fabian Hueske wrote: > Hi Timur, > > a TaskManager may run as many

Re: YARN terminating TaskNode

2016-04-25 Thread Timur Fayruzov
Hello Maximilian, I'm using 1.0.0 compiled with Scala 2.11 and Hadoop 2.7. I'm running this on EMR. I didn't see any exceptions in other logs. What are the logs you are interested in? Thanks, Timur On Mon, Apr 25, 2016 at 3:44 AM, Maximilian Michels wrote: > Hi Timur, > >

Re: General Data questions - streams vs batch

2016-04-25 Thread Aljoscha Krettek
Hi, I'll try and answer your questions separately. First, a general remark, although Flink has the DataSet API for batch processing and the DataStream API for stream processing we only have one underlying streaming execution engine that is used for both. Now, regarding the questions: 1) What do

Re: AvroWriter for Rolling sink

2016-04-25 Thread Aljoscha Krettek
Hi, the code looks very good! Do you think it can be adapted to the slightly modified interface introduced here: https://issues.apache.org/jira/browse/FLINK-3637 It basically requires the writer to know the write position, so that we can truncate to a valid position in case of failure. Cheers,

Re: Clear irrelevant state values

2016-04-25 Thread Sowmya Vallabhajosyula
Thanks Gyula. Yes, I am using state only in RichFlatMapFunction. Will try to evaluate generating events for removal of state. Regards, Sowmya On Mon, Apr 25, 2016 at 5:44 PM, Gyula Fóra wrote: > Hi, > > The removal markers are just something I made up :) What I meant is

Re: Flink on Yarn - ApplicationMaster command

2016-04-25 Thread Maximilian Michels
Great to hear! :) On Sun, Apr 24, 2016 at 3:51 PM, Theofilos Kakantousis wrote: > Hi, > > The issue was a mismatch of jar versions on my client. Seems to be working > fine now. > Thanks again for your help! > > Cheers, > Theofilos > > > On 2016-04-22 18:22, Theofilos Kakantousis

Re: Clear irrelevant state values

2016-04-25 Thread Sowmya Vallabhajosyula
Hi Gyula, Thank you so much. 1. Can you point me to any documentation on removal markers? 2. My understanding is this implementation of custom state maintenance does not impact scalabiity. Is that right? Thanks, Sowmya On Mon, Apr 25, 2016 at 3:06 PM, Gyula Fóra wrote: >

Re: YARN terminating TaskNode

2016-04-25 Thread Maximilian Michels
Hi Timur, Which version of Flink are you using? Could you share the entire logs? Thanks, Max On Mon, Apr 25, 2016 at 12:05 PM, Robert Metzger wrote: > Hi Timur, > > The reason why we only allocate 570mb for the heap is because you are > allocating most of the memory as off

Re: Access to a shared resource within a mapper

2016-04-25 Thread Fabian Hueske
Hi Timur, a TaskManager may run as many subtasks of a Map operator as it has slots. Each subtask of an operator runs in a different thread. Each parallel subtask of a Map operator has its own MapFunction object, so it should be possible to use a lazy val. However, you should not use static

Re: Custom Trigger Implementation

2016-04-25 Thread Piyush Shrivastava
Thanks a lot Kostas. This solved my problem. Thanks and Regards,Piyush Shrivastava http://webograffiti.com On Monday, 25 April 2016 3:27 PM, Kostas Kloudas wrote: Hi, Let me also add that you should also override the clear() method in order to clear you

Re: YARN terminating TaskNode

2016-04-25 Thread Robert Metzger
Hi Timur, The reason why we only allocate 570mb for the heap is because you are allocating most of the memory as off heap (direct byte buffers). In theory, the memory footprint of the JVM is limited to 570 (heap) + 1900 (direct mem) = 2470 MB (which is below 2500). But in practice thje JVM is

Re: Custom Trigger Implementation

2016-04-25 Thread Kostas Kloudas
Hi, Let me also add that you should also override the clear() method in order to clear you state. and delete the pending timers. Kostas > On Apr 25, 2016, at 11:52 AM, Kostas Kloudas > wrote: > > Hi Piyush, > > In the onElement function, you register a timer

Re: Flink first() operator

2016-04-25 Thread Fabian Hueske
Hi Biplop, you can also implement a generic IF that wraps another IF (such as a CsvInputFormat). The wrapping IF forwards all calls to the wrapped IF and in addition counts how many records were emitted (how often InputFormat.nextRecord() was called). Once the count arrives at the threshold, it

Re: Custom Trigger Implementation

2016-04-25 Thread Kostas Kloudas
Hi Piyush, In the onElement function, you register a timer every time you receive an element. When the next watermark arrives, in the flag==false case, this will lead to every element adding a timer for its timestamp+6ms. The same for flag==true case, with 2ms interval. What you

Re: Getting java.lang.Exception when try to fetch data from Kafka

2016-04-25 Thread Robert Metzger
Hi Prateek, were the messages written to the Kafka topic by Flink, using the TypeInformationKeyValueSerializationSchema ? If not, maybe the Flink deserializers expect a different data format of the messages in the topic. How are the messages written into the topic? On Fri, Apr 22, 2016 at

Re: Clear irrelevant state values

2016-04-25 Thread Gyula Fóra
Hi, (a) I think your understanding is correct, one consideration might be that if you are always sending the state to the sink, it might make sense to build it there directly using a RichSinkFunction. (b) There is no built-in support for this at the moment. What you can do yourself is to

Re: Thanks everyone

2016-04-25 Thread Ufuk Celebi
Thanks for sharing Prez! :-) On Sat, Apr 23, 2016 at 7:08 AM, Márton Balassi wrote: > Hi Prez, > > Thanks for sharing, the community is always glad to welcome new Flink users. > > Best, > > Marton > > On Sat, Apr 23, 2016 at 6:01 AM, Prez Cannady