Re: JDBC table source

2017-09-26 Thread Mohit Anchlia
To make a long story short: implementing a JDBC TableSource for a batch query should be fairly easy. A true streaming solution that hooks into the changelog stream of a table is not possible at the moment. Cheers, Fabian 2017-09-26 15:04 GMT-04:00 Mohit Anchlia <mohitanch..

JDBC table source

2017-09-26 Thread Mohit Anchlia
We are looking to stream data from the database. Is there already a JDBC table source available for streaming?
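
Fabian's reply above points at the batch route instead. A minimal sketch of that, assuming the flink-jdbc JDBCInputFormat of the 1.3 era; the driver, URL, and query are placeholders:

    import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.io.jdbc.JDBCInputFormat;
    import org.apache.flink.api.java.typeutils.RowTypeInfo;

    public class JdbcBatchRead {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // column types of the query result: (id BIGINT, name VARCHAR)
            RowTypeInfo rowTypeInfo = new RowTypeInfo(
                BasicTypeInfo.LONG_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO);

            JDBCInputFormat jdbcInput = JDBCInputFormat.buildJDBCInputFormat()
                .setDrivername("com.mysql.jdbc.Driver")      // placeholder driver
                .setDBUrl("jdbc:mysql://localhost/test")     // placeholder URL
                .setQuery("SELECT id, name FROM my_table")   // placeholder query
                .setRowTypeInfo(rowTypeInfo)
                .finish();

            // print() on a DataSet triggers execution of the batch job
            env.createInput(jdbcInput).print();
        }
    }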

Re: Deleting files in continuous processing

2017-08-21 Thread Mohit Anchlia
Just checking to see if there is a way to purge files after they have been processed. On Tue, Aug 15, 2017 at 5:11 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote: Is there a way to delete a file once it has been processed? streamEnv.readFile(format, args[0

Deleting files in continuous processing

2017-08-15 Thread Mohit Anchlia
Is there a way to delete a file once it has been processed? streamEnv.readFile(format, args[0], FileProcessingMode.PROCESS_CONTINUOUSLY, 2000)
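
The thread never surfaces a built-in purge option. One hedged workaround is to delete a file when the input format closes its split, assuming one split per file (see the WholeFileInputFormat sketched under the "Custom inputformat" thread further down) and that the continuous monitor will not re-enumerate the deleted file:

    import java.io.IOException;
    import org.apache.flink.core.fs.FileSystem;
    import org.apache.flink.core.fs.Path;

    // hypothetical subclass of the whole-file format discussed in this digest
    public class DeletingInputFormat extends WholeFileInputFormat {

        public DeletingInputFormat(Path path) {
            super(path);
        }

        @Override
        public void close() throws IOException {
            Path processed = (currentSplit != null) ? currentSplit.getPath() : null;
            super.close(); // release the underlying stream before deleting
            if (processed != null) {
                FileSystem fs = processed.getFileSystem();
                fs.delete(processed, false); // non-recursive: it is a single file
            }
        }
    }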

Avoiding duplicates in joined stream

2017-08-15 Thread Mohit Anchlia
What's the best way to avoid duplicates in a joined stream? In the code below I get duplicates of "A" because I have multiple "A"s in fileInput3. SingleOutputStreamOperator fileInput3 = streamEnv.fromElements("A", "A").assignTimestampsAndWatermarks(timestampAndWatermarkAssigner1);
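
One common fix, not from the thread itself, is to de-duplicate with keyed state before the join; a minimal sketch (names hypothetical):

    import org.apache.flink.api.common.functions.RichFlatMapFunction;
    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.util.Collector;

    // emits each key only once; drops the duplicate "A"s before the join
    public class DedupFunction extends RichFlatMapFunction<String, String> {
        private transient ValueState<Boolean> seen;

        @Override
        public void open(Configuration parameters) {
            seen = getRuntimeContext().getState(
                new ValueStateDescriptor<>("seen", Boolean.class));
        }

        @Override
        public void flatMap(String value, Collector<String> out) throws Exception {
            if (seen.value() == null) { // first time this key is observed
                seen.update(true);
                out.collect(value);
            }
        }
    }

Applied as fileInput3.keyBy(v -> v).flatMap(new DedupFunction()) ahead of the join. Note the state grows with the key space unless cleared.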

Re: Odd flink behaviour

2017-08-02 Thread Mohit Anchlia
the first) InputSplit and skip all others. I'd override open as follows: public void open(FileInputSplit fileSplit) throws IOException { super.open(fileSplit); reached = false; }

Re: Using FileInputFormat - org.apache.flink.streaming.api.functions.source.TimestampedFileInputSplit

2017-08-01 Thread Mohit Anchlia
This was a user-induced problem (me). I wasn't calling streamEnv.execute() :( On Tue, Aug 1, 2017 at 1:29 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote: This doesn't work even with TextInputFormat. Not sure what's wrong. On Tue, Aug 1, 2017 at 9:53 AM, Mohit Anc
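
For reference, a minimal sketch of the fixed program; nothing is submitted until execute() is called:

    import org.apache.flink.api.java.io.TextInputFormat;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.source.FileProcessingMode;

    public class ContinuousRead {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment streamEnv =
                StreamExecutionEnvironment.getExecutionEnvironment();
            TextInputFormat format = new TextInputFormat(new Path(args[0]));
            streamEnv.readFile(format, args[0],
                    FileProcessingMode.PROCESS_CONTINUOUSLY, 2000)
                .print();
            // the missing piece: the job graph is only built above, this runs it
            streamEnv.execute("continuous-file-read");
        }
    }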

Re: Using FileInputFormat - org.apache.flink.streaming.api.functions.source.TimestampedFileInputSplit

2017-08-01 Thread Mohit Anchlia
This doesn't work even with TextInputFormat. Not sure what's wrong. On Tue, Aug 1, 2017 at 9:53 AM, Mohit Anchlia <mohitanch...@gmail.com> wrote: I don't see the print output. On Tue, Aug 1, 2017 at 2:08 AM, Fabian Hueske <fhue...@gmail.com> wrote: Hi Mo

Re: Odd flink behaviour

2017-08-01 Thread Mohit Anchlia
super.open(fileSplit); reached = false; } Cheers, Fabian 2017-08-01 8:08 GMT+02:00 Mohit Anchlia <mohitanch...@gmail.com>: I didn't override open. I am using the open that got inherited from FileInputFormat. Am I supposed to specificall

Re: Using FileInputFormat - org.apache.flink.streaming.api.functions.source.TimestampedFileInputSplit

2017-08-01 Thread Mohit Anchlia
Best, Fabian 2017-08-01 0:32 GMT+02:00 Mohit Anchlia <mohitanch...@gmail.com>: I even tried an existing format but still the same error: FileInputFormat fileInputFormat = new TextInputFormat(new Path(args[0])); fileInputFor

Re: Odd flink behaviour

2017-08-01 Thread Mohit Anchlia
I didn't override open. I am using the open that got inherited from FileInputFormat. Am I supposed to specifically override open? On Mon, Jul 31, 2017 at 9:49 PM, Fabian Hueske <fhue...@gmail.com> wrote: Do you set reached to false in open()? On 01.08.2017 at 2:44 AM, wr

Re: Odd flink behaviour

2017-07-31 Thread Mohit Anchlia
PDF"); String content = new String( Files.readAllBytes(Paths.get(this.currentSplit.getPath().getPath(; logger.info("Content " + content); reached = true; return content; } } On Mon, Jul 31, 2017 at 5:09 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote: >

Re: Using FileInputFormat - org.apache.flink.streaming.api.functions.source.TimestampedFileInputSplit

2017-07-31 Thread Mohit Anchlia
] INFO org.apache.flink.api.java.typeutils.TypeExtractor - class org.apache.flink.streaming.api.functions.source.TimestampedFileInputSplit does not contain a setter for field modificationTime [main] INFO org.apache.flink.api.java.typeutils.TypeExtractor - c On Mon, Jul 31, 2017 at 1:07 PM, Mohit

Using FileInputFormat - org.apache.flink.streaming.api.functions.source.TimestampedFileInputSplit

2017-07-31 Thread Mohit Anchlia
In trying to use this code I get the following error. Is it asking me to implement an additional interface? streamEnv.readFile(format, args[0], FileProcessingMode.PROCESS_CONTINUOUSLY, 2000).print(); [main] INFO com.s.flink.example.PDFInputFormat - Start streaming [main] INFO

Re: Custom inputformat

2017-07-31 Thread Mohit Anchlia
flink-core/src/main/java/org/apache/flink/api/common/io/DelimitedInputFormat.java: public abstract class DelimitedInputFormat extends FileInputFormat implements Checkpoi flink-streaming-java/src/test/java/org/apache/flink/streaming/runtime/operato

Re: Invalid path exception

2017-07-31 Thread Mohit Anchlia
I think that on Windows, you need to use "file:/c:/proj/..." with just one slash after the scheme. On Mon, Jul 31, 2017 at 1:24 AM, Mohit Anchlia <mohitanch...@gmail.com> wrote: This is what I tried and it doesn't work. Is this a bug?

Re: Invalid path exception

2017-07-30 Thread Mohit Anchlia
so, please try file:///C: ... On 30.07.2017 22:28, Mohit Anchlia wrote: I am using Flink 1.3.1 and getting this exception. Is there a workaround? Caused by: java.nio.file.InvalidPathException: Illegal char <:> at index 2: /C:/Users/m/default/flink-example/pom.xml

Invalid path exception

2017-07-30 Thread Mohit Anchlia
I am using Flink 1.3.1 and getting this exception. Is there a workaround? Caused by: java.nio.file.InvalidPathException: Illegal char <:> at index 2: /C:/Users/m/default/flink-example/pom.xml at sun.nio.fs.WindowsPathParser.normalize(Unknown Source) at

Re: Custom inputformat

2017-07-30 Thread Mohit Anchlia
might also want to build your PDFFileInputFormat on FileInputFormat and set unsplittable to true. FileInputFormat comes with lots of built-in functionality such as InputSplit generation. Cheers, Fabian 2017-07-30 3:41 GMT+02:00 Mohit Anchlia <mohitanch...@gmail.com

Custom inputformat

2017-07-29 Thread Mohit Anchlia
Hi, I created a custom input format. The idea behind this is to read all binary files from a directory and use each file as its own split. Each split is read as one whole record. When I run it in Flink I don't get any error, but I am not seeing any output from .print. Am I missing something?
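
Pulling the thread's advice together, a minimal sketch of such a format. It reads local paths only (it goes through java.nio, mirroring the snippet quoted above), and the reset of reached in open() is the fix discussed in the "Odd flink behaviour" thread:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import org.apache.flink.api.common.io.FileInputFormat;
    import org.apache.flink.core.fs.FileInputSplit;
    import org.apache.flink.core.fs.Path;

    // reads each file as a single record
    public class WholeFileInputFormat extends FileInputFormat<String> {
        private boolean reached;

        public WholeFileInputFormat(Path path) {
            super(path);
            this.unsplittable = true; // one split per file, per Fabian's suggestion
        }

        @Override
        public void open(FileInputSplit fileSplit) throws IOException {
            super.open(fileSplit);
            reached = false; // reset per split, or only the first split is ever emitted
        }

        @Override
        public boolean reachedEnd() {
            return reached;
        }

        @Override
        public String nextRecord(String reuse) throws IOException {
            reached = true; // the whole file is one record
            return new String(Files.readAllBytes(
                Paths.get(currentSplit.getPath().getPath())));
        }
    }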

Re: Reading static data

2017-07-14 Thread Mohit Anchlia
ps. Timo On 13.07.17 at 02:16, Mohit Anchlia wrote: What is the best way to read a map of lookup data? This lookup data is a small, short-lived data set that is available in a transformation to do things like filtering, additional augmentation of data, etc.

Reading static data

2017-07-12 Thread Mohit Anchlia
What is the best way to read a map of lookup data? This lookup data is a small, short-lived data set that is available in a transformation to do things like filtering, additional augmentation of data, etc.
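
One common approach (Timo's full reply is truncated above) is to load the map once per parallel instance in a rich function's open(); a sketch, with the population step left hypothetical:

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;

    // loads the lookup map once, when the parallel task instance starts
    public class EnrichFunction extends RichMapFunction<String, String> {
        private transient Map<String, String> lookup;

        @Override
        public void open(Configuration parameters) {
            lookup = new HashMap<>();
            // hypothetical: populate from a file, a DB, or job parameters instead
            lookup.put("id-1", "augmented-value-1");
        }

        @Override
        public String map(String value) {
            return lookup.getOrDefault(value, value); // augment or pass through
        }
    }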

Re: Connecting workflows in batch

2017-03-01 Thread Mohit Anchlia
release-1.3/monitoring/rest_api.html You would have to know the ID of your job and then you can poll the status of your running jobs. On Mon, 27 Feb 2017 at 18:15 Mohit Anchlia <mohitanch...@gmail.com> wrote: What's the best way to track the progress of th

Thread safety

2017-02-27 Thread Mohit Anchlia
Trying to understand which parts of Flink have thread safety built in. The key question: are the objects created in Flink shared between threads (slots)? For example, if I create a sink function and open a file, is that file shared between threads?
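
A sketch of the per-instance behavior: each parallel subtask deserializes its own copy of the sink, so state created in open() is not shared between slots. The file path and sink type here are illustrative:

    import java.io.FileWriter;
    import java.io.IOException;
    import java.io.PrintWriter;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

    public class PerInstanceFileSink extends RichSinkFunction<String> {
        private transient PrintWriter writer;

        @Override
        public void open(Configuration parameters) throws IOException {
            // every parallel subtask gets its own copy of this sink, so the
            // writer below is private to one subtask, not shared across threads
            int subtask = getRuntimeContext().getIndexOfThisSubtask();
            writer = new PrintWriter(new FileWriter("/tmp/out-" + subtask));
        }

        @Override
        public void invoke(String value) {
            writer.println(value);
        }

        @Override
        public void close() {
            if (writer != null) {
                writer.close();
            }
        }
    }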

Re: Connecting workflows in batch

2017-02-27 Thread Mohit Anchlia
and then trigger execution of the next one. Best, Aljoscha On Fri, 24 Feb 2017 at 19:16 Mohit Anchlia <mohitanch...@gmail.com> wrote: Is there a way to connect 2 workflows such that one triggers the other if a certain condition is met? Howe

Re: Serialization schema

2017-02-26 Thread Mohit Anchlia
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548) com.sy.flink.test.Tuple2Serializerr$1: this states that an anonymous inner class in `Tuple2Serializerr` is not serializable. Could you check if that's the case? On February

Re: Java 7 -> 8 - Association failed error

2017-02-24 Thread Mohit Anchlia
Figured it out. It looks like there is a virtual memory limit check enforced in YARN which just surfaced with Java 8. On Fri, Feb 24, 2017 at 2:09 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote: I recently upgraded the cluster from Java 7 to Java 8. Now when I run Flink on a YARN c
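
The usual remedy, assuming the standard YARN virtual-memory check is what fired here, is a yarn-site.xml change. These are stock YARN properties, not Flink ones, and the values are illustrative:

    <!-- yarn-site.xml: either relax the ratio or disable the vmem check -->
    <property>
      <name>yarn.nodemanager.vmem-pmem-ratio</name>
      <value>4</value>
    </property>
    <property>
      <name>yarn.nodemanager.vmem-check-enabled</name>
      <value>false</value>
    </property>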

Java 7 -> 8 - Association failed error

2017-02-24 Thread Mohit Anchlia
I recently upgraded the cluster from Java 7 to Java 8. Now when I run Flink on a YARN cluster I see errors, and eventually the application gives up and terminates. Any suggestions? Association with remote system [akka.tcp://flink@slave:35543] has failed, address is now gated for [5000] ms. Reason:

Connecting workflows in batch

2017-02-24 Thread Mohit Anchlia
Is there a way to connect 2 workflows such that one triggers the other if a certain condition is met? A workaround may be to insert a notification into a topic to trigger another workflow. The problem is that addSink ends the flow, so if we need to add a trigger after addSink there
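
Aljoscha's reply (further up in this digest) points at driving this from the client: execute() blocks until the job finishes, so a second workflow can be submitted conditionally. A sketch, with the accumulator name purely hypothetical:

    import org.apache.flink.api.common.JobExecutionResult;
    import org.apache.flink.api.java.ExecutionEnvironment;

    public class ChainedWorkflows {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // ... build the first workflow here ...
            JobExecutionResult first = env.execute("first-workflow"); // blocks until done

            // hypothetical accumulator the first job registered, used as the condition
            Long errors = first.getAccumulatorResult("error-count");
            if (errors == null || errors == 0L) {
                // ... build the second workflow here ...
                env.execute("second-workflow");
            }
        }
    }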

Re: Serialization schema

2017-02-23 Thread Mohit Anchlia
the outer class instance as well. On February 24, 2017 at 2:43:38 PM, Mohit Anchlia (mohitanch...@gmail.com) wrote: This is at a high level what I am doing: Serialize: String s = tuple.getPos(0) + "," + tuple.getPos(1); return s.getBytes()

Re: Serialization schema

2017-02-23 Thread Mohit Anchlia
`serialize` implementation if you don't want to. Cheers, Gordon On February 24, 2017 at 11:04:34 AM, Mohit Anchlia (mohitanch...@gmail.com) wrote: I am using String inside to convert into bytes. On Thu, Feb 23, 2017 at 6:50 PM, 刘彪 <mmyy1...@gmail.c

Re: Serialization schema

2017-02-23 Thread Mohit Anchlia
2017-02-24 9:07 GMT+08:00 Mohit Anchlia <mohitanch...@gmail.com>: I wrote a key serialization class to write to Kafka, however I am getting this error. Not sure why, as I've already implemented the interfaces. Caused by: java.io.NotSerializa

Serialization schema

2017-02-23 Thread Mohit Anchlia
I wrote a key serialization class to write to Kafka, however I am getting this error. Not sure why, as I've already implemented the interfaces. Caused by: java.io.NotSerializableException: com.sy.flink.test.Tuple2Serializerr$1 at
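
The replies trace this to an anonymous inner class. A sketch of the fix: make the schema a standalone top-level class so Java serialization does not pull in a non-serializable outer instance (the import path is the 1.2-era one):

    import java.nio.charset.StandardCharsets;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.util.serialization.SerializationSchema;

    // top-level class, not an anonymous inner class, so only this object
    // (and nothing from an enclosing class) has to be serializable
    public class Tuple2Serializer implements SerializationSchema<Tuple2<String, String>> {
        @Override
        public byte[] serialize(Tuple2<String, String> tuple) {
            return (tuple.f0 + "," + tuple.f1).getBytes(StandardCharsets.UTF_8);
        }
    }

The Kafka producer of that era also accepted a KeyedSerializationSchema for separate key/value control; the same top-level-class advice applies there.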

Re: Writing Tuple2 to a sink

2017-02-23 Thread Mohit Anchlia
And the user has to implement the SerializationSchema, maybe named Tuple2SerializationSchema. 2017-02-22 7:17 GMT+08:00 Mohit Anchlia <mohitanch...@gmail.com>: What's the best way to retrieve both values in a Tuple2 inside a custom sink given that the type is not known inside the sink function?

Writing Tuple2 to a sink

2017-02-21 Thread Mohit Anchlia
What's the best way to retrieve both values in a Tuple2 inside a custom sink, given that the type is not known inside the sink function?
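
If the sink is declared with the concrete tuple type, both fields are available as f0/f1; a minimal sketch:

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.functions.sink.SinkFunction;

    // typing the sink on the element makes both fields directly accessible
    public class Tuple2Sink implements SinkFunction<Tuple2<String, Integer>> {
        @Override
        public void invoke(Tuple2<String, Integer> value) throws Exception {
            String key = value.f0;    // first field of the tuple
            Integer count = value.f1; // second field of the tuple
            // ... write key/count to the external system here ...
        }
    }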

Re: Flink not reading from Kafka

2017-02-17 Thread Mohit Anchlia
Interestingly enough, the same job runs OK on Linux but not on Windows. On Fri, Feb 17, 2017 at 4:54 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote: I have this code trying to read from a topic however the Flink process comes up and waits forever even though there is data in the top

Flink not reading from Kafka

2017-02-17 Thread Mohit Anchlia
I have this code trying to read from a topic, however the Flink process comes up and waits forever even though there is data in the topic. Not sure why. Has anyone else seen this problem? StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment(); Properties
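
For comparison, a minimal consumer program of that era (Flink 1.2-style connector; broker, group, and topic are placeholders):

    import java.util.Properties;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
    import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

    public class KafkaRead {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                StreamExecutionEnvironment.createLocalEnvironment();

            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker
            props.setProperty("group.id", "flink-test");              // assumed group

            env.addSource(new FlinkKafkaConsumer09<>(
                    "my-topic", new SimpleStringSchema(), props))     // assumed topic
               .print();

            env.execute("kafka-read"); // easy to forget, as in the readFile thread
        }
    }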

Re: Further aggregation possible after sink?

2017-02-13 Thread Mohit Anchlia
adding stages, but then your sink is no longer a sink; it would have transformed into a map or a flatMap! On Mon, Feb 13, 2017 at 12:34 PM Mohit Anchlia <mohitanch...@gmail.com> wrote: Is it possible to further add aggregation after the sink task executes?

Further aggregation possible after sink?

2017-02-13 Thread Mohit Anchlia
Is it possible to further add aggregation after the sink task executes? Or is the sink the last stage of the workflow? Is this flow possible: start stream -> transform -> load (sink) -> mark the final state as loaded in a table after the load succeeded in the previous stage (sink)?
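
The reply above sketches the answer: express the "load" as a flatMap so records keep flowing, and let a real, terminal sink mark the final state afterwards. A hedged sketch:

    import org.apache.flink.api.common.functions.FlatMapFunction;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.util.Collector;

    public class LoadThenMark {
        // the "load" step written as a flatMap instead of a sink, so records
        // keep flowing; a terminal sink downstream can then mark them as loaded
        public static DataStream<String> loadAndForward(DataStream<String> stream) {
            return stream.flatMap(new FlatMapFunction<String, String>() {
                @Override
                public void flatMap(String value, Collector<String> out) throws Exception {
                    // ... write `value` to the external store here (hypothetical) ...
                    out.collect(value); // forward the record for the follow-up stage
                }
            });
        }
    }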

Hadoop 2.7.3

2017-02-10 Thread Mohit Anchlia
Does Flink support Hadoop 2.7.3? I installed Flink for Hadoop 2.7.0 but am seeing this error: 2017-02-10 18:59:52,661 INFO org.apache.flink.yarn.YarnClusterDescriptor - Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster

Dealing with latency in Sink

2017-02-06 Thread Mohit Anchlia
What is the best way to dynamically adapt and tune down the number of tasks created to write/read to a sink when the sink slows down or the latency to the sink increases? I am looking at the sink interface but don't see a way to influence Flink to reduce the number of tasks or throttle the volume down to the

Re: Clarification on state backend parameters

2017-02-04 Thread Mohit Anchlia
e. This would typically be located on a distributed file system like HDFS that is also accessible from each node, so that operators can be recovered on different machines in case of machine failures. On 03.02.2017 at 20:55, Mohit Anchlia <mohitanch...@gmail.com

Re: Parallelism and Partitioning

2017-02-03 Thread Mohit Anchlia
Any information on this would be helpful. On Thu, Feb 2, 2017 at 5:09 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote: What is the granularity of parallelism in Flink? For example, if I am reading from Kafka -> map -> sink and I say parallelism 2, does it mean it creates 2

Re: Clarification on state backend parameters

2017-02-03 Thread Mohit Anchlia
is a poorly named key for the directory of the RocksDB instance data and has in fact nothing to do with checkpoints. Best, Stefan On 03.02.2017 at 01:45, Mohit Anchlia <mohitanch...@gmail.com> wrote: Trying to understand these 3 parameters: state.bac

Parallelism and Partitioning

2017-02-02 Thread Mohit Anchlia
What is the granularity of parallelism in Flink? For example, if I am reading from Kafka -> map -> sink and I set parallelism to 2, does that mean it creates 2 consumer threads and allocates them on 2 separate task managers? Also, it would be good to understand the difference between parallelism and partitioning
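
For the first part of the question, a sketch of job-wide versus per-operator parallelism; whether two subtasks land on the same TaskManager depends on available slots, not on the parallelism setting itself:

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ParallelismExample {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
            env.setParallelism(2); // default for every operator in this job

            env.socketTextStream("localhost", 9999) // stand-in source
               .map(new MapFunction<String, String>() {
                   @Override
                   public String map(String s) {
                       return s.toUpperCase();
                   }
               }).setParallelism(4)                 // per-operator override
               .print().setParallelism(1);          // single printing subtask

            env.execute("parallelism-example");
        }
    }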

Clarification on state backend parameters

2017-02-02 Thread Mohit Anchlia
Trying to understand these 3 parameters: state.backend, state.backend.fs.checkpointdir, state.backend.rocksdb.checkpointdir, state.checkpoints.dir. As I understand it, the stream of data and the state of operators are 2 different concepts, and both need to be checkpointed. I am a bit confused about the
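
Putting the replies together, a hedged flink-conf.yaml sketch with 1.2/1.3-era keys; values are illustrative:

    # flink-conf.yaml -- illustrative values
    # which backend holds operator state
    state.backend: rocksdb
    # where checkpoint data is written: shared storage such as HDFS, reachable
    # from every node so state can be restored elsewhere after a machine failure
    state.backend.fs.checkpointdir: hdfs:///flink/checkpoints
    # local working directory for RocksDB instance files; despite the name it
    # has nothing to do with checkpoints (see Stefan's reply above)
    state.backend.rocksdb.checkpointdir: /tmp/flink/rocksdb
    # where metadata of externalized checkpoints is stored
    state.checkpoints.dir: hdfs:///flink/ext-checkpoints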