To make a long story short: implementing a JDBC TableSource for batch
> query should be fairly easy. A true streaming solution that hooks into the
> changelog stream of a table is not possible at the moment.
>
> Cheers, Fabian
>
> 2017-09-26 15:04 GMT-04:00 Mohit Anchlia <mohitanch..
We are looking to stream data from the database. Is there already a JDBC
table source available for streaming?
Just checking to see if there is a way to purge files after they have been processed.
On Tue, Aug 15, 2017 at 5:11 PM, Mohit Anchlia <mohitanch...@gmail.com>
wrote:
> Is there a way to delete a file once it has been processed?
>
> streamEnv
>
> .readFile(format, args[0
Is there a way to delete a file once it has been processed?
streamEnv
.readFile(format, args[0], FileProcessingMode.PROCESS_CONTINUOUSLY, 2000)
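For reference, a minimal runnable sketch of the snippet above (the directory path and job name are assumptions; note that PROCESS_CONTINUOUSLY only re-scans the directory on an interval — Flink does not delete or purge processed files itself):

```java
import org.apache.flink.api.java.io.TextInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.FileProcessingMode;

public class ContinuousReadJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        String dir = args[0]; // directory to monitor (assumed argument)
        TextInputFormat format = new TextInputFormat(new Path(dir));
        // Re-scan the directory every 2000 ms; already-seen files are tracked
        // by modification timestamp, they are NOT deleted by Flink.
        env.readFile(format, dir, FileProcessingMode.PROCESS_CONTINUOUSLY, 2000)
           .print();
        env.execute("continuous-read"); // without execute() nothing runs
    }
}
```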
What's the best way to avoid duplicates in a joined stream? In the code
below I get duplicates of "A" because I have multiple "A"s in fileInput3.
SingleOutputStreamOperator<String> fileInput3 = streamEnv.fromElements("A",
"A")
.assignTimestampsAndWatermarks(timestampAndWatermarkAssigner1);
the first) InputSplit and skip all others.
>>>
>>> I'd override open as follows:
>>>
>>> public void open(FileInputSplit fileSplit) throws IOException {
>>> super.open(fileSplit);
>>> reached = false;
>>> }
>>>
>>>
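Putting the pieces of this thread together, a hedged sketch of such a whole-file format (the class name and String record type are assumptions; the thread's actual format reads PDFs): one record per split, `unsplittable` set so each file becomes a single split, and `reached` reset in open().

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.flink.api.common.io.FileInputFormat;
import org.apache.flink.core.fs.FileInputSplit;

public class WholeFileInputFormat extends FileInputFormat<String> {

    private boolean reached;

    public WholeFileInputFormat() {
        this.unsplittable = true; // one split per file, no byte-range splitting
    }

    @Override
    public void open(FileInputSplit fileSplit) throws IOException {
        super.open(fileSplit); // pass the split through to the parent
        reached = false;       // reset per split; open() is called for each one
    }

    @Override
    public boolean reachedEnd() {
        return reached;
    }

    @Override
    public String nextRecord(String reuse) throws IOException {
        // read the whole file backing this split as a single record
        String content = new String(
                Files.readAllBytes(Paths.get(currentSplit.getPath().getPath())));
        reached = true; // emit exactly one record per split
        return content;
    }
}
```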
This was a user-induced problem - mine. I wasn't calling streamEnv.execute() :(
On Tue, Aug 1, 2017 at 1:29 PM, Mohit Anchlia <mohitanch...@gmail.com>
wrote:
> This doesn't work even with TextInputFormat. Not sure what's wrong.
>
> On Tue, Aug 1, 2017 at 9:53 AM, Mohit Anc
This doesn't work even with TextInputFormat. Not sure what's wrong.
On Tue, Aug 1, 2017 at 9:53 AM, Mohit Anchlia <mohitanch...@gmail.com>
wrote:
> I don't see the print output.
>
> On Tue, Aug 1, 2017 at 2:08 AM, Fabian Hueske <fhue...@gmail.com> wrote:
>
>> Hi Mo
super.open(fileSplit);
> reached = false;
> }
>
> Cheers, Fabian
>
>
> 2017-08-01 8:08 GMT+02:00 Mohit Anchlia <mohitanch...@gmail.com>:
>
>> I didn't override open. I am using open that got inherited from
>> FileInputFormat . Am I supposed to specificall
> Best, Fabian
>
> 2017-08-01 0:32 GMT+02:00 Mohit Anchlia <mohitanch...@gmail.com>:
>
>> I even tried an existing format but still get the same error:
>>
>> FileInputFormat fileInputFormat = new TextInputFormat(new
>> Path(args[0]));
>>
>> fileInputFor
I didn't override open. I am using the open that got inherited from
FileInputFormat. Am I supposed to specifically override open?
On Mon, Jul 31, 2017 at 9:49 PM, Fabian Hueske <fhue...@gmail.com> wrote:
> Do you set reached to false in open()?
>
>
> Am 01.08.2017 2:44 vorm. schr
PDF");
String content = new String(
Files.readAllBytes(Paths.get(this.currentSplit.getPath().getPath())));
logger.info("Content " + content);
reached = true;
return content;
}
}
On Mon, Jul 31, 2017 at 5:09 PM, Mohit Anchlia <mohitanch...@gmail.com>
wrote:
>
] INFO org.apache.flink.api.java.typeutils.TypeExtractor - class
org.apache.flink.streaming.api.functions.source.TimestampedFileInputSplit
does not contain a setter for field modificationTime
[main] INFO org.apache.flink.api.java.typeutils.TypeExtractor - c
On Mon, Jul 31, 2017 at 1:07 PM, Mohit
In trying to use this code I get the following error. Is it asking me to
implement an additional interface?
streamEnv.readFile(format, args[0], FileProcessingMode.PROCESS_CONTINUOUSLY, 2000).print();
[main] INFO com.s.flink.example.PDFInputFormat - Start streaming
[main] INFO
> flink-core/src/main/java/org/apache/flink/api/common/io/DelimitedInputFormat.java:public
>> abstract class DelimitedInputFormat extends FileInputFormat
>> implements Checkpoi
>>
>> flink-streaming-java/src/test/java/org/apache/flink/streaming/runtime/operato
I think that on Windows, you need to use "file:/c:/proj/..." with just one
> slash after the scheme.
>
>
>
> On Mon, Jul 31, 2017 at 1:24 AM, Mohit Anchlia <mohitanch...@gmail.com>
> wrote:
>
>> This is what I tried, and it doesn't work. Is this a bug?
>
so, please try file:///C: ...
>
>
> On 30.07.2017 22:28, Mohit Anchlia wrote:
>
> I am using flink 1.3.1 and getting this exception. Is there a workaround?
>
> Caused by: java.nio.file.InvalidPathException: Illegal char <:> at
> index 2: /C:/Users/m/default/flink-example/pom.xml
>
I am using flink 1.3.1 and getting this exception. Is there a workaround?
Caused by: java.nio.file.InvalidPathException: Illegal char <:> at index
2: /C:/Users/m/default/flink-example/pom.xml
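To illustrate the fix suggested in the replies: the bare `/C:/...` form is what java.nio rejects, while a `file:` URI keeps the drive letter parseable. A small JDK-only sketch (the path is the one from the stack trace):

```java
import java.net.URI;

public class WindowsPathCheck {
    public static void main(String[] args) {
        // Paths.get("/C:/...") throws InvalidPathException on Windows;
        // a proper file URI parses cleanly everywhere.
        URI uri = URI.create("file:///C:/Users/m/default/flink-example/pom.xml");
        System.out.println(uri.getScheme() + " -> " + uri.getPath());
    }
}
```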
at sun.nio.fs.WindowsPathParser.normalize(Unknown Source)
at
might also want to build your PDFFileInputFormat on FileInputFormat
> and set unsplittable to true.
> FileInputFormat comes with lots of built-in functionality such as
> InputSplit generation.
>
> Cheers, Fabian
>
> 2017-07-30 3:41 GMT+02:00 Mohit Anchlia <mohitanch...@gmail.com&
Hi,
I created a custom input format. The idea behind it is to read all binary
files from a directory and use each file as its own split. Each split is
read as one whole record. When I run it in Flink I don't get any error, but
I am not seeing any output from .print. Am I missing something?
ps.
>
> Timo
>
>
> Am 13.07.17 um 02:16 schrieb Mohit Anchlia:
>
> What is the best way to read a map of lookup data? This lookup data is
>> like a small short lived data that is available in transformation to do
>> things like filtering, additional augmentation of data etc.
>>
>
>
>
What is the best way to read a map of lookup data? This lookup data is like
small, short-lived data that is available in a transformation to do things
like filtering, additional augmentation of data, etc.
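Since Timo's reply above is truncated, here is one common pattern as a hedged sketch (not necessarily his suggestion): load the small lookup map once per parallel instance in a RichMapFunction's open(). The class name and hard-coded entries are illustrative; in practice the map might come from a file or a service.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class EnrichWithLookup extends RichMapFunction<String, String> {

    private transient Map<String, String> lookup;

    @Override
    public void open(Configuration parameters) {
        // built once per parallel instance, before any map() call
        lookup = new HashMap<>();
        lookup.put("A", "augmented-A"); // illustrative entry
    }

    @Override
    public String map(String value) {
        // filter/augment against the lookup map
        return lookup.getOrDefault(value, value);
    }
}
```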
> release-1.3/monitoring/rest_api.html
>
> You would have to know the ID of your job and then you can poll the status
> of your running jobs.
>
> On Mon, 27 Feb 2017 at 18:15 Mohit Anchlia <mohitanch...@gmail.com> wrote:
>
> What's the best way to track the progress of th
Trying to understand which parts of Flink have thread safety built into
them. The key question is: are objects created in Flink shared between
threads (slots)? For example, if I create a sink function and open a file,
is that file shared between threads?
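On that key question: Flink creates a separate (deserialized) copy of a function for each parallel subtask, so instance fields such as an open file are not shared across slots; only static or explicitly shared state crosses threads. A hedged sketch (file location and naming scheme are assumptions):

```java
import java.io.FileWriter;
import java.io.IOException;

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

public class PerSubtaskFileSink extends RichSinkFunction<String> {

    private transient FileWriter writer;

    // visible for testing: one file per parallel instance
    static String fileNameFor(int subtaskIndex) {
        return "/tmp/out-" + subtaskIndex + ".txt";
    }

    @Override
    public void open(Configuration parameters) throws IOException {
        // this field belongs to THIS subtask's copy of the function only
        writer = new FileWriter(fileNameFor(getRuntimeContext().getIndexOfThisSubtask()));
    }

    @Override
    public void invoke(String value) throws IOException {
        writer.write(value + "\n");
    }

    @Override
    public void close() throws IOException {
        if (writer != null) {
            writer.close();
        }
    }
}
```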
nd then trigger
> execution of the next one.
>
> Best,
> Aljoscha
>
> On Fri, 24 Feb 2017 at 19:16 Mohit Anchlia <mohitanch...@gmail.com> wrote:
>
>> Is there a way to connect 2 workflows such that one triggers the other if
>> certain condition is met? Howe
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>>>>>
>>>>
> com.sy.flink.test.Tuple2Serializerr$1: this states that an anonymous
> inner class in `Tuple2Serializerr` is not serializable.
>
> Could you check if that’s the case?
>
>
>
> On February
Figured it out. It looks like there is a virtual memory limit check enforced
in YARN which just surfaced with Java 8.
On Fri, Feb 24, 2017 at 2:09 PM, Mohit Anchlia <mohitanch...@gmail.com>
wrote:
> I recently upgraded the cluster from java 7 to java 8. Now when I run
> flink on a yarn c
I recently upgraded the cluster from Java 7 to Java 8. Now when I run Flink
on a YARN cluster I see errors; eventually the application gives up and
terminates. Any suggestions?
Association with remote system [akka.tcp://flink@slave:35543] has failed,
address is now gated for [5000] ms. Reason:
Is there a way to connect 2 workflows such that one triggers the other if a
certain condition is met? A workaround may be to insert a
notification in a topic to trigger another workflow. The problem is that
addSink ends the flow, so if we need to add a trigger after addSink
there
e outer class instance as well.
>
>
> On February 24, 2017 at 2:43:38 PM, Mohit Anchlia (mohitanch...@gmail.com)
> wrote:
>
> This is at high level what I am doing:
>
> Serialize:
>
> String s = tuple.getPos(0) + "," + tuple.getPos(1);
> return s.getBytes()
>
rialize` implementation if you
> don’t want to.
>
> Cheers,
> Gordon
>
>
> On February 24, 2017 at 11:04:34 AM, Mohit Anchlia (mohitanch...@gmail.com)
> wrote:
>
> I am using String inside to convert into bytes.
>
> On Thu, Feb 23, 2017 at 6:50 PM, 刘彪 <mmyy1...@gmail.c
2017-02-24 9:07 GMT+08:00 Mohit Anchlia <mohitanch...@gmail.com>:
>
>> I wrote a key serialization class to write to kafka however I am getting
>> this error. Not sure why as I've already implemented the interfaces.
>>
>> Caused by: java.io.NotSerializa
I wrote a key serialization class to write to Kafka; however, I am getting
this error. Not sure why, as I've already implemented the interfaces.
Caused by: java.io.NotSerializableException:
com.sy.flink.test.Tuple2Serializerr$1
at
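Gordon's diagnosis above (`Tuple2Serializerr$1`) points at an anonymous inner class, which drags its non-serializable outer instance into the closure. A top-level (or static nested) schema avoids that. A hedged sketch with an assumed class name; note the package of SerializationSchema moved across Flink versions (org.apache.flink.streaming.util.serialization in 1.3, org.apache.flink.api.common.serialization later):

```java
import java.nio.charset.StandardCharsets;

import org.apache.flink.api.common.serialization.SerializationSchema;
import org.apache.flink.api.java.tuple.Tuple2;

// A standalone class has no hidden reference to an enclosing instance,
// so it serializes cleanly when shipped to the Kafka producer.
public class Tuple2Schema implements SerializationSchema<Tuple2<String, Integer>> {

    @Override
    public byte[] serialize(Tuple2<String, Integer> element) {
        String s = element.f0 + "," + element.f1; // same CSV shape as the thread's code
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```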
And the user has to implement the SerializationSchema, maybe
> named Tuple2SerializationSchema.
>
> 2017-02-22 7:17 GMT+08:00 Mohit Anchlia <mohitanch...@gmail.com>:
>
>> What's the best way to retrieve both the values in Tuple2 inside a custom
>> sink given that the type is not known inside the sink function?
>>
>
>
What's the best way to retrieve both the values in Tuple2 inside a custom
sink given that the type is not known inside the sink function?
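One answer, sketched: declare the sink generically over Tuple2 and use the typed fields f0/f1 (or positional getField(i)) without committing to concrete value types. Class and method names here are illustrative, not from the thread:

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

public class PrintingTupleSink<K, V> implements SinkFunction<Tuple2<K, V>> {

    // visible for testing: both fields are reachable without concrete types
    static String format(Tuple2<?, ?> t) {
        Object key = t.getField(0); // positional, untyped access
        Object val = t.f1;          // direct field access
        return key + " -> " + val;
    }

    @Override
    public void invoke(Tuple2<K, V> value) {
        System.out.println(format(value));
    }
}
```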
Interestingly enough, the same job runs OK on Linux but not on Windows.
On Fri, Feb 17, 2017 at 4:54 PM, Mohit Anchlia <mohitanch...@gmail.com>
wrote:
> I have this code trying to read from a topic however the flink process
> comes up and waits forever even though there is data in the top
I have this code trying to read from a topic; however, the Flink process
comes up and waits forever even though there is data in the topic. Not sure
why. Has anyone else seen this problem?
StreamExecutionEnvironment env = StreamExecutionEnvironment
.createLocalEnvironment();
Properties
adding stages, but then your sink is no longer a sink - it
> would have transformed into a map or a flatMap!
>
> On Mon, Feb 13, 2017 at 12:34 PM Mohit Anchlia <mohitanch...@gmail.com>
> wrote:
>
>> Is it possible to further add aggregation after the sink task executes?
>>
Is it possible to further add aggregation after the sink task executes? Or
is the sink the last stage of the workflow? Is this flow possible?
start stream -> transform -> load (sink) -> mark final state as loaded in a
table after all the load was successful in previous state (sink)
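A hedged sketch of that flow with the load step rewritten as a map, as the reply above suggests. loadToTarget() and the mark-as-loaded sink are hypothetical stand-ins for the real load and status-table update:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

public class LoadThenMarkJob {

    static final List<String> marked = Collections.synchronizedList(new ArrayList<>());

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("r1", "r2")
           // the "sink" work becomes a map stage so the flow can continue ...
           .map(new MapFunction<String, String>() {
               @Override
               public String map(String record) {
                   loadToTarget(record); // hypothetical load into the target system
                   return record;        // pass the record on as a load receipt
               }
           })
           // ... and the actual sink only marks the load as done
           .addSink(new MarkSink());

        env.execute("load-then-mark");
    }

    static void loadToTarget(String record) {
        // stand-in for the real write to the external system
    }

    static class MarkSink implements SinkFunction<String> {
        @Override
        public void invoke(String record) {
            marked.add(record); // stand-in for "mark loaded in a table"
        }
    }
}
```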
Does Flink support Hadoop 2.7.3? I installed Flink for Hadoop 2.7.0 but am
seeing this error:
2017-02-10 18:59:52,661 INFO
org.apache.flink.yarn.YarnClusterDescriptor - Deployment
took more than 60 seconds. Please check if the requested resources are
available in the YARN cluster
What is the best way to dynamically adapt and tune down the number of tasks
created to write/read to a sink when the sink slows down or the latency to
the sink increases? I am looking at the sink interface but don't see a way
to influence Flink to reduce the number of tasks or throttle the volume
down to the
e. This
> would typically be located on a distributed file system like HDFS that is
> also accessible from each node, so that operators can be recovered on
> different machines in case of machine failures.
>
> Am 03.02.2017 um 20:55 schrieb Mohit Anchlia <mohitanch...@gmail.com
Any information on this would be helpful.
On Thu, Feb 2, 2017 at 5:09 PM, Mohit Anchlia <mohitanch...@gmail.com>
wrote:
> What is the granularity of parallelism in flink? For eg: if I am reading
> from Kafka -> map -> Sink and I say parallel 2 does it mean it creates 2
is a poorly named key for the directory of the RocksDB instance data and
> has in fact nothing to do with checkpoints.
>
> Best,
> Stefan
>
> Am 03.02.2017 um 01:45 schrieb Mohit Anchlia <mohitanch...@gmail.com>:
>
> Trying to understand these 3 parameters:
>
> state.bac
What is the granularity of parallelism in Flink? For example, if I am
reading from Kafka -> map -> Sink and I set parallelism to 2, does that
mean it creates 2 consumer threads and allocates them on 2 separate task
managers?
Also, it would be good to understand the difference between parallelism and
partitioning
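To the two questions above, as a hedged sketch: parallelism is the number of parallel instances of an operator (the scheduler may place them in 2 slots on the same TaskManager or on separate ones - 2 machines is not guaranteed), while partitioning (keyBy, rebalance, ...) decides how records are routed between those instances. CollectSink below is an illustrative helper, not part of any real pipeline:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

public class ParallelismSketch {

    static final List<String> results = Collections.synchronizedList(new ArrayList<>());

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("a", "b", "c", "a")  // non-parallel source (1 instance)
           .rebalance()                       // partitioning: round-robin routing
           .map(new MapFunction<String, String>() {
               @Override
               public String map(String v) {
                   return v.toUpperCase();
               }
           })
           .setParallelism(2)                 // parallelism: 2 map instances (2 slots)
           .addSink(new CollectSink());

        env.execute("parallelism-sketch");
    }

    static class CollectSink implements SinkFunction<String> {
        @Override
        public void invoke(String value) {
            results.add(value);
        }
    }
}
```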
Trying to understand these parameters:
state.backend
state.backend.fs.checkpointdir
state.backend.rocksdb.checkpointdir
state.checkpoints.dir
As I understand it, the stream of data and the state of operators are 2
different concepts, and both need to be checkpointed. I am a bit confused
about the