Hi Giancarlo,
Since it has been a while since the last post and there hasn't been a JIRA
ticket opened for the Kinesis connector yet, I'm wondering how you are doing on
it and hope to help out with this feature :)
I've opened a JIRA
Hi Francis,
Every complete snapshot includes the record positions associated with
the barrier that triggered the checkpointing of this snapshot. The snapshot
is completed only when all the records within the checkpoint reach the
sink. When a topology fails, all the operators' state will
Hi Soumya,
No, currently there is no Flink standard supported S3 streaming source. As
far as I know, there isn't one out in the public yet either. The community
is open to submissions for new connectors, so if you happen to be working on
one for S3, you can file a JIRA to let us know.
Also,
Hi Shannon,
Thanks for your investigation on the issue and the JIRA. There's actually a
previous JIRA on this problem already:
https://issues.apache.org/jira/browse/FLINK-4023. Would you be ok with
tracking this issue on FLINK-4023, and closing FLINK-4069 as a duplicate
issue? As you can see, I've
Hi Sourav,
A little more clarification on your last comment:
In the sense of "where" the driver program is executed, then yes, the Flink
driver program runs in a mode similar to Spark's YARN-client.
However, the "role" of the driver program and the work that it is
responsible for is quite
Hi Saiph,
In Flink, the key for keyBy() can be provided in different ways:
https://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#specifying-keys
(the doc is for DataSet API, but specifying keys is basically the same for
DataStream and DataSet).
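To make this concrete, here's a rough sketch (using a simple Tuple2 stream; the
field access is only for illustration) of the common ways to specify a key:

import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream<Tuple2<String, Integer>> counts =
    env.fromElements(Tuple2.of("hello", 1), Tuple2.of("world", 2));

counts.keyBy(0);       // key by tuple field position
counts.keyBy("f0");    // key by field expression
counts.keyBy(new KeySelector<Tuple2<String, Integer>, String>() {  // key by KeySelector
    @Override
    public String getKey(Tuple2<String, Integer> value) {
        return value.f0;
    }
});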
As described in the
Hi Rafal,
From your description, it seems like Flink is complaining because it cannot
access the Elasticsearch API related dependencies as well. You'd also have
to include the following in your Maven build, under <dependencies>:
<dependency>
  <groupId>org.elasticsearch</groupId>
  <artifactId>elasticsearch</artifactId>
  <version>2.3.2</version>
  <type>jar</type>
  <optional>false</optional>
</dependency>
Hi,
Please also note that the “auto.offset.reset” property is only respected
when there are no offsets under the same consumer group in ZK. So,
currently, in order to make sure you read from the latest / earliest
offsets every time you restart your Flink application, you’d have to use a
unique
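For example, a quick sketch (with the 0.9 consumer; for the 0.8 consumer the
values would be "smallest" / "largest") that forces a fresh read by generating a
new group id on every run:

Properties props = new Properties();
props.setProperty("bootstrap.servers", "localhost:9092");
// a new group id per run, so no previously committed offsets exist for it
props.setProperty("group.id", "my-flink-job-" + UUID.randomUUID());
// only respected because the group has no committed offsets yet
props.setProperty("auto.offset.reset", "earliest");

DataStream<String> stream = env.addSource(
    new FlinkKafkaConsumer09<>("my-topic", new SimpleStringSchema(), props));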
Hi Sandeep!
While auto scaling jobs in Flink still isn’t possible, in Flink 1.2 you will be
able to rescale jobs by stopping and restarting.
This works by taking a savepoint of the job before stopping it, and then
redeploying the job with a higher / lower parallelism using the savepoint.
Upon
Hi Ninad and Till,
Thank you for looking into the issue! This is actually a bug.
Till’s suggestion is correct:
The producer holds a `pendingRecords` value that is incremented on each
invoke() and decremented on each callback, used to check if the producer needs
to sync on pending callbacks on
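As a rough illustration of the pattern (simplified; not the actual connector
code, and `serialize()` here is just a placeholder):

// count of records sent but not yet acknowledged by Kafka
private final AtomicLong pendingRecords = new AtomicLong(0);

@Override
public void invoke(IN value) {
    pendingRecords.incrementAndGet();
    producer.send(new ProducerRecord<>(topic, serialize(value)), new Callback() {
        @Override
        public void onCompletion(RecordMetadata metadata, Exception exception) {
            pendingRecords.decrementAndGet();   // one callback per acknowledged record
        }
    });
}

// called while snapshotting: wait until every pending callback has fired
private void flushPending() throws InterruptedException {
    while (pendingRecords.get() > 0) {
        Thread.sleep(10);
    }
}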
t;: "abc"
}
}
how can I treat these cases? Isn't there a way to send the whole JSON element and
index it like in the REST request?
Thanks.
Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote on Tuesday, 21/02/2017 at
07:54:
Hi,
I’ll use your code to e
Hi Mohit,
As 刘彪 pointed out in his reply, the problem is that your `Tuple2Serializer`
contains fields that are not serializable, so `Tuple2Serializer` itself is not
serializable.
Could you perhaps share your `Tuple2Serializer` implementation with us so we
can pinpoint the problem?
A snippet
Hi Patrick,
Thanks a lot for feedback on your use case! At a first glance, I would say that
Flink can definitely solve the issues you are evaluating.
I’ll try to explain them, and point you to some docs / articles that can
further explain in detail:
- Lateness
The 7-day lateness shouldn’t be
Hi Sathi,
The `getPartitionId` method is invoked with each record from the stream. In
there, you can extract values / fields from the record, and use that to
determine the target partition id.
Is this what you had in mind?
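For example, a rough sketch (with a hypothetical `Event` type that carries a
user id); if I recall correctly, you can then set it on the producer via
setCustomPartitioner(new UserIdPartitioner()):

public class UserIdPartitioner extends KinesisPartitioner<Event> {
    @Override
    public String getPartitionId(Event element) {
        // extract a field from the record and use it to determine the partition
        return element.getUserId();
    }
}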
Cheers,
Gordon
On February 21, 2017 at 11:54:21 AM, Sathi Chowdhury
Hi Dmitry,
Technically, from the looks of the internal code around
`OperatorStateRepartitioner`, I think it could certainly be made pluggable.
Right now it is just hard coded to use a round-robin repartitioner
implementation as default.
However, I’m not sure of the plans in exposing this
Hi Alex,
Kafka authentication and data transfer encryption using SSL can simply be
done by configuring the brokers and the connecting client.
You can take a look at this:
https://kafka.apache.org/documentation/#security_ssl.
The Kafka client that the Flink connector uses can be configured through
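For example, a sketch of passing the standard Kafka SSL client settings through
the properties given to the consumer (paths and passwords are placeholders):

Properties props = new Properties();
props.setProperty("bootstrap.servers", "broker1:9093");
props.setProperty("security.protocol", "SSL");
props.setProperty("ssl.truststore.location", "/path/to/client.truststore.jks");
props.setProperty("ssl.truststore.password", "changeit");
// only needed if the brokers require client authentication
props.setProperty("ssl.keystore.location", "/path/to/client.keystore.jks");
props.setProperty("ssl.keystore.password", "changeit");

DataStream<String> stream = env.addSource(
    new FlinkKafkaConsumer09<>("secure-topic", new SimpleStringSchema(), props));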
Hi Jonas,
A few things to clarify first:
Stream A has a rate of 100k tuples/s. After processing the whole Kafka queue,
the rate drops to 10 tuples/s.
From this description it seems like the job is re-reading from the beginning
of the topic, and once you reach the latest record at the head
Hi Dmitry,
I think currently the simplest way to do this is to add a
program argument as a flag indicating whether or not the current run
is from a savepoint (so you manually supply the flag whenever you’re starting
the
job from a savepoint), and check that flag in the main method.
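A sketch of what that could look like (the flag name "--fromSavepoint" here is
just an example):

public static void main(String[] args) throws Exception {
    ParameterTool params = ParameterTool.fromArgs(args);
    // supplied manually, e.g. "--fromSavepoint true", whenever restoring from a savepoint
    boolean fromSavepoint = params.getBoolean("fromSavepoint", false);

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    if (fromSavepoint) {
        // build / configure the topology for the restored run
    } else {
        // build / configure the topology for a fresh run
    }
    env.execute("my-job");
}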
The main
Hi Luqman,
From your description, it seems like you want to infer the type (case
class, tuple, etc.) of a stream dynamically at runtime.
AFAIK, I don’t think this is supported in Flink. You’re required to have
defined types for your DataStreams.
Could you also provide some example code of
Hi Howard,
I don’t think there is a rich variant for Async IO in Scala yet. We should
perhaps add support for it.
I've looped in Till, who worked on the Async IO and its Scala support, to clarify
whether there were any concerns behind not supporting it initially.
Cheers,
Gordon
On February 13, 2017 at
abase from another source.
Once again, thank you so much for your help.
I will wait to hear from you!
Regards,
Pedro Lima Monteiro
On 16 February 2017 at 09:43, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
Hi Pedro!
This is definitely possible, by simply writing a Flink `SourceFunc
se.
I will give it a try and will come back to you.
Pedro Lima Monteiro
On 16 February 2017 at 10:20, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
I would recommend checking out the Flink RabbitMQ Source for examples:
https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-
Hi Geoffrey,
Thanks for investigating and updating on this. Good to know that it is working!
Just to clarify, was your series of jobs submitted to a “yarn session + regular
bin/flink run”, or “per job yarn cluster”?
I’m asking just to make sure of the limitations Robert mentioned.
Cheers,
the
affected sinks participate in checkpointing is enough of a solution - is there
anything special about `SinkFunction[T]` extending `Checkpointed[S]`, or can I
just implement it as I would for e.g. a mapping function?
Thanks,
Andrew
On Jan 13, 2017, at 4:34 PM, Tzu-Li (Gordon) Tai <tz
Hi,
You’re correct that the FlinkKafkaProducer may emit duplicates to Kafka topics,
as it currently only provides at-least-once guarantees.
Note that this isn’t a restriction only in the FlinkKafkaProducer, but a
general restriction for Kafka's message delivery.
This can definitely be improved
and I could
filter the data more efficiently because the data would not need to go over the
network before this filter.
Afterwards I can scale it up to 'many' tasks for the heavier processing that
follows.
As a concept: Could that be made to work?
Niels
On Mon, Jan 9, 2017 at 9:14 AM, Tzu-Li (G
Hi Matt!
As mentioned in the docs, due to the ASL license, we do not deploy the artifact
to the Maven central repository on Flink releases.
You will need to build the Kinesis connector by yourself (the instructions to
do so are also in the Flink Kinesis connector docs :)), and install it to
Hi!
This could be an Elasticsearch server / client version conflict, or that the
uber jar of your code wasn’t built properly.
For the first possible issue, we’re currently using Elasticsearch 2.3.5 to
build the Flink Elasticsearch Connector. Could you try overriding this version
to 2.4.1 when
Hi Aniket,
Thanks a lot for reporting this.
I’m afraid this seems to be a bug with Flink on YARN’s Kerberos authentication.
It is incorrectly checking for Kerberos credentials even for non-Kerberos
authentication methods.
I’ve filed a JIRA for this:
2.x. After changing the
version it worked. Thanks for the help.
On Mon, Feb 27, 2017 at 6:12 AM, Tzu-Li (Gordon) Tai <tzuli...@apache.org>
wrote:
Hi!
Like what Flavio suggested, at a first glance this looks like a problem with
building the uber jar.
I haven’t bumped into the problem
with stock Flink 1.1.3 for the execution since I'm running
things on top of Hopsworks.
cheers Martin
On Tue, Feb 28, 2017 at 5:42 PM, Tzu-Li (Gordon) Tai <tzuli...@apache.org>
wrote:
Hi!
This could be an Elasticsearch server / client version conflict, or that the
uber jar of your code wasn’t
lang.Thread.run(Thread.java:745)
On Wed, Mar 1, 2017 at 7:58 AM, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
Hi Martin,
You can do that by adding a dependency to the Elasticsearch client of your
desired version in your project.
You can also check what Elasticsearch client version
, 2017 at 9:23:35 PM, Tzu-Li (Gordon) Tai (tzuli...@apache.org) wrote:
Hi Martin,
Just letting you know I’m trying your setup right now, and will get back to you
once I confirm the results.
- Gordon
On March 1, 2017 at 9:15:16 PM, Martin Neumann (mneum...@sics.se) wrote:
I created the project
Hi David,
Is it possible that your Kafka installation is an older version than 0.9? Or
you may have used a different Kafka client major version in your job jar's
dependency?
This seems like an odd protocol incompatibility with the Kafka broker to me, as
the client in the Kafka consumer is reading
Hi!
Yes, you can set custom operator names by calling `.name(…)` on DataStreams
after a transformation.
For example, `.addSource(…).map(...).name(…)`. This name will be used for
visualization on the dashboard, and also for logging.
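For example, a minimal sketch (the Kafka source here is just an example):

DataStream<String> upper = env
    .addSource(new FlinkKafkaConsumer09<>("words", new SimpleStringSchema(), props))
    .name("kafka-source")          // shown on the dashboard and in logs
    .map(new MapFunction<String, String>() {
        @Override
        public String map(String value) {
            return value.toUpperCase();
        }
    })
    .name("to-uppercase");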
Regards,
Gordon
On September 12, 2016 at 3:44:58 PM, Bart
Hi,
Helping out here: this is the PR for async Kafka offset committing -
https://github.com/apache/flink/pull/2574.
It has already been merged into the master and release-1.1 branches, so you can
try out the changes now if you’d like.
The change should also be included in the 1.1.3 release,
rators? If I need the state stored on previous
operators, how can I fetch it?
I don’t think this is possible.
Fine, thanks.
Thanks.
On Mon, Oct 3, 2016 at 12:23 AM, Tzu-Li (Gordon) Tai <tzuli...@apache.org>
wrote:
Hi!
- I'm currently running my flink program on 1.2 SNAPSHOT with kafka s
Really great to hear this!
Cheers,
Gordon
On October 4, 2016 at 8:13:27 PM, Till Rohrmann (trohrm...@apache.org) wrote:
It's always great to hear Flink success stories :-) Thanks for sharing it with
the community.
I hope Flink helps you to solve even more problems. And don't hesitate to
Hi Josh,
Thank you for reporting this, I’m looking into it. There were some major changes
to the Kinesis connector after mid June, but the changes don’t seem to be
related to the iterator timeout, so it may be a bug that had always been there.
I’m not sure yet if it may be related, but may I
57 PM, Tzu-Li (Gordon) Tai <tzuli...@apache.org>
wrote:
Hi Josh,
Thank you for reporting this, I’m looking into it. There were some major changes
to the Kinesis connector after mid June, but the changes don’t seem to be
related to the iterator timeout, so it may be a bug that had always
stamps receiving and sending data. If there is
only one source instance per broker how does that happen?
Thanks,
Sameer
On Tue, Aug 23, 2016 at 7:17 AM, Tzu-Li (Gordon) Tai <tzuli...@gmail.com>
wrote:
> Hi!
>
> Kinesis shards should be ideally evenly assigned to the source insta
Hi!
Kinesis shards should be ideally evenly assigned to the source instances.
So, with your example of source parallelism of 10 and 20 shards, each
source instance will have 2 shards and will have 2 threads consuming them
(therefore, not in round robin).
For the Kafka consumer, in the source
on the source stream-
assignTimestampsAndWatermarks(new IngestionTimeExtractor<>())
Sameer
On Tue, Aug 23, 2016 at 7:29 AM, Tzu-Li (Gordon) Tai <tzuli...@gmail.com>
wrote:
> Hi,
>
> For the Kinesis consumer, when you use Event Time but do not explicitly
> assign timestamps,
:)
Regards,
Gordon
On August 23, 2016 at 9:31:58 PM, Tzu-Li (Gordon) Tai (tzuli...@gmail.com)
wrote:
Slight misunderstanding here. The one thread per Kafka broker happens
*after* the assignment of Kafka partitions to the source instances. So,
with a total of 10 partitions and 10 source instances
Hi,
For the Kinesis consumer, when you use Event Time but do not explicitly
assign timestamps, the Kinesis server-side timestamp (the time which
Kinesis received the record) is attached to the record as default, not
Flink’s ingestion time.
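If you need timestamps from a field in the records instead, here's a rough
sketch of assigning them explicitly (assuming a hypothetical `MyEvent` type and
`MyEventSchema` deserialization schema):

DataStream<MyEvent> events = env.addSource(
    new FlinkKinesisConsumer<>("my-stream", new MyEventSchema(), consumerConfig));

events.assignTimestampsAndWatermarks(
    new BoundedOutOfOrdernessTimestampExtractor<MyEvent>(Time.seconds(10)) {
        @Override
        public long extractTimestamp(MyEvent element) {
            // epoch millis taken from the record itself
            return element.getEventTime();
        }
    });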
Does this answer your question?
Regards,
Gordon
On
Hi!
- I'm currently running my flink program on 1.2 SNAPSHOT with a Kafka source and
I have checkpointing enabled. When I look at the consumer offsets in Kafka they
appear to be stagnant and there is a huge lag. But I can see my flink program
is keeping pace with the Kafka source in JMX metrics and outputs.
Hi!
This is definitely a planned feature for the Kafka connectors, there’s a JIRA
exactly for this [1].
We’re currently going through some blocking tasks to make this happen, I also
hope to speed up things over there :)
Your observation is correct that the Kafka consumer uses “assign()” instead
Hi Matt,
Just to be clear, what I'm looking for is a way to serialize a POJO class for
Kafka but also for Flink. I'm not sure the interfaces of both frameworks are
compatible, but it seems they aren't.
For Kafka (producer) I need a Serializer and a Deserializer class, and for
Flink (consumer) a
Hi Phillip,
Thanks for testing it. From your log and my own tests, I can confirm the
problem is with Kinesalite not correctly
mocking the official Kinesis behaviour for the `describeStream` API.
There’s a PR for the fix here: https://github.com/apache/flink/pull/2822. With
this change, shard
Hi Matt,
Here’s an example of writing a DeserializationSchema for your POJOs: [1].
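For reference, a minimal sketch of such a schema (using Jackson purely as an
example JSON library, and a hypothetical `MyPojo` type):

public class MyPojoSchema extends AbstractDeserializationSchema<MyPojo> {

    private transient ObjectMapper mapper;

    @Override
    public MyPojo deserialize(byte[] message) throws IOException {
        if (mapper == null) {
            // created lazily so the schema itself stays trivially serializable
            mapper = new ObjectMapper();
        }
        return mapper.readValue(message, MyPojo.class);
    }
}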
As for simply writing messages from WebSocket to Kafka using a Flink job, while
it is absolutely viable, I would not recommend it,
mainly because you’d never know if you might need to temporarily shut down
Flink
Hi Craig,
I think the email wasn't sent to the ‘dev’ list, somehow.
Have you tried this:
mvn clean install -DskipTests
# In Maven 3.3 the shading of flink-dist doesn't work properly in one run, so
we need to run mvn for flink-dist again.
cd flink-dist
mvn clean install -DskipTests
I agree that
Hi Philipp,
When used against Kinesalite, can you tell if the connector is already reading
data from the test shard before any
of the shard discovery messages? If you have any spare time to test this, you
can set a larger value for the
`ConsumerConfigConstants.SHARD_DISCOVERY_INTERVAL_MILLIS`
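For example (the interval value here is arbitrary):

Properties consumerConfig = new Properties();
consumerConfig.setProperty(ConsumerConfigConstants.AWS_REGION, "us-east-1");
// check for new shards only once every 60 seconds
consumerConfig.setProperty(
    ConsumerConfigConstants.SHARD_DISCOVERY_INTERVAL_MILLIS, "60000");

DataStream<String> kinesis = env.addSource(
    new FlinkKinesisConsumer<>("test-stream", new SimpleStringSchema(), consumerConfig));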
the job from my last savepoint, it's consuming the stream fine
again with no problems.
Do you have any idea what might be causing this, or anything I should do to
investigate further?
Cheers,
Josh
On Wed, Oct 5, 2016 at 4:55 AM, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
Hi S
processing?
Thanks,
Li
On Oct 17, 2016, at 11:10 AM, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
Hi!
No, the operator does not need to pause processing input records while the
checkpointing of its state is in progress.
The checkpointing of operator state is asynchronous. The op
Hi Matt,
1. There’s some in-progress work on wrapper util classes for Kafka
de/serializers here [1] that allows
Kafka de/serializers to be used with the Flink Kafka Consumers/Producers with
minimal user overhead.
The PR also has some proposed adds to the documentations for the wrappers.
2. I
Hi Dominik,
Do you mean how Flink redistributes an operator’s state when the parallelism of
the operator is changed?
If so, you can take a look at [1] and [2].
Cheers,
Gordon
[1] https://issues.apache.org/jira/browse/FLINK-3755
[2]
Hi,
The FlinkKafkaProducers currently support a per-record topic through the
user-provided serialization schema, which has a "getTargetTopic(T element)"
method that is called for each record,
and also decides the partition the record will be sent to through a custom
KafkaPartitioner, which is also provided
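A rough sketch of the serialization schema side (assuming a hypothetical event
type that knows its own topic and id):

public class PerRecordTopicSchema implements KeyedSerializationSchema<MyEvent> {

    @Override
    public byte[] serializeKey(MyEvent element) {
        return element.getId().getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public byte[] serializeValue(MyEvent element) {
        return element.toJson().getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public String getTargetTopic(MyEvent element) {
        // decide the target Kafka topic per record
        return element.getTopic();
    }
}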
Hi Andrew,
Your observations are correct. Like you mentioned, the current problem circles
around how we deal with the pending buffered requests in accordance with
Flink’s checkpointing.
I’ve filed a JIRA for this, as well as some thoughts for the solution in the
description:
Hi,
This is expected behaviour due to how the per-partition watermarks are designed
in the Kafka consumer, but I think it’s probably a good idea to handle idle
partitions also when the Kafka consumer itself emits watermarks. I’ve filed a
JIRA issue for this:
Henri Heiskanen <henri.heiska...@gmail.com>
wrote:
Hi,
We had the same problem when running 0.9 consumer against 0.10 Kafka. Upgrading
Flink Kafka connector to 0.10 fixed our issue.
Br,
Henkka
On Mon, Jan 9, 2017 at 5:39 PM, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
Hi,
N
Hi,
Not sure what might be going on here. I’m pretty certain that for
FlinkKafkaConsumer09 when checkpointing is turned off, the internally used
KafkaConsumer client will auto commit offsets back to Kafka at a default
interval of 5000ms (the default value for “auto.commit.interval.ms”).
Could
Hi Igor!
What you can actually do is let a single FlinkKafkaConsumer consume from both
topics, producing a single DataStream which you can keyBy afterwards.
All versions of the FlinkKafkaConsumer support consuming multiple Kafka topics
simultaneously. This is logically the same as union and
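For example, a quick sketch (the key extraction is just an example):

List<String> topics = Arrays.asList("topic-a", "topic-b");
DataStream<String> stream = env.addSource(
    new FlinkKafkaConsumer09<>(topics, new SimpleStringSchema(), props));

// one stream containing records from both topics; key it afterwards
KeyedStream<String, String> keyed = stream.keyBy(new KeySelector<String, String>() {
    @Override
    public String getKey(String value) {
        return value.split(",")[0];   // e.g. key on the first CSV field
    }
});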
Hi Niels,
Thank you for bringing this up. I recall there was some previous discussion
related to this before: [1].
I don’t think this is possible at the moment, mainly because of how the API is
designed.
On the other hand, a KeyedStream in Flink is basically just a DataStream with a
hash
The Apache Flink community is pleased to announce the availability of Flink
1.1.5, which is the next bugfix release for the 1.1 series.
The official release announcement:
https://flink.apache.org/news/2017/03/23/release-1.1.5.html
Release binaries: http://apache.lauf-forum.at/flink/flink-1.1.5
Hi Steve,
This normally shouldn’t happen, unless there simply are two copies of the data.
What is the source of the topology? Also, this might be obvious, but if you
have broadcasted your input stream to the sink, then each sink instance would
then get all records in the input stream.
Cheers,
Hi,
Thanks for the clarification.
What are the reasons behind consuming/producing messages from/to Kafka while
the window has not expired yet?
First, some remarks here - sources (in your case the Kafka consumer) will not
stop fetching / producing data when the windows haven’t fired yet. Does
I'm wondering what I can tweak further to increase this. I was reading in this
blog: https://data-artisans.com/extending-the-yahoo-streaming-benchmark/
about 3 million per sec with only 20 partitions. So I'm sure I should be able
to squeeze more out of it.
Not really sure if it is relevant
Hi Steffan,
I have to admit that I didn’t put too much thought in the default values for
the Kinesis consumer.
I’d say it would be reasonable to change the default values to follow KCL’s
settings. Could you file a JIRA for this?
In general, we might want to reconsider all the default values
Hi Dominique,
What your plan A is suggesting is that a downstream operator can provide
signals to upstream operators and alter their behaviour.
In general, this isn’t possible, as in a distributed streaming environment it’s
hard to guarantee what records exactly will be altered by the
Sounds like a cool event! Thanks for sharing this!
On March 27, 2017 at 11:40:24 PM, Lior Amar (lior.a...@parallelmachines.com)
wrote:
Hi all,
My name is Lior and I am working at Parallel Machines (a startup company
located in the Silicon Valley).
We are hosting a Flink Hackathon on April
Hi Dominik,
Was the job running with processing time or event time? If event time, how are
you producing the watermarks?
Normally to understand how windows are firing in Flink, these two factors would
be the place to look at.
I can try to further explain this once you provide this info.
code and a detailed Flink configuration.
Cheers,
Dominik
On 30 Mar 2017, at 13:09, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
Hi,
Thanks for the clarification.
What are the reasons behind consuming/producing messages from/to Kafka while
the window has not expired yet?
First, some r
value must use all the data, so is there some
operation that can make step 3 start only after step 2 has ended?
----- Original Message -----
From: "Tzu-Li (Gordon) Tai" <tzuli...@apache.org>
To: rimin...@sina.cn
Subject: Re: flink one transformation end,the next transformation start
Date: 2017-03-30 15:54
Hi,
What e
Hi,
There currently isn’t any shaded Kafka 0.8 connector version available,
so yes, you would need to build that yourself.
I’m not completely sure if there will be any class name clashes, because the
Kafka 0.8 API is typically packaged under `kafka.javaapi.*`, while in 0.9 /
0.10
Hi Tarandeep,
I haven’t looked at the rest of the code yet, but my first guess is that you
might not be reading any data from Kafka at all:
private static DataStream readKafkaStream(String topic,
StreamExecutionEnvironment env) throws IOException {
Properties properties = new
way.
Gyula
Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote (on 17 Mar 2017, Fri, 14:24):
One other possibility for reporting “consumer lag” is to update the metric only
at a
configurable interval, if use cases can tolerate a certain delay in realizing
the consumer
has cau
@Florian
the 0.9 / 0.10 version and 0.8 version behave a bit differently right now for
the offset committing.
In 0.9 / 0.10, if checkpointing is enabled, the “auto.commit.enable” etc.
settings will be completely ignored and overwritten before being used to
instantiate the internal Kafka clients,
Hi,
I was thinking of something similar to what Ufuk suggested,
but if we want to report a “consumer lag” metric, we would
essentially need to request the latest offset on every record fetch
(because the latest offset advances as well), so I wasn’t so sure
of the performance tradeoffs there (the
sed here.
What do you think?
On March 17, 2017 at 9:05:04 PM, Tzu-Li (Gordon) Tai (tzuli...@apache.org)
wrote:
Hi,
I was thinking of something similar to what Ufuk suggested,
but if we want to report a “consumer lag” metric, we would
essentially need to request the latest offset on every reco
time I run the
code. I have put break points and print statements to verify that.
Also, if I don't connect with the control stream, the window function works.
- Tarandeep
On Mar 16, 2017, at 1:12 AM, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
Hi Tarandeep,
I haven’t looked at the rest
illis();
}
});
FlinkKafkaProducer010.FlinkKafkaProducer010Configuration config =
FlinkKafkaProducer010.writeToKafkaWithTimestamps(stream, KAFKA_TOPIC, new
WordCountSchema(), properties);
config.setWriteTimestampToKafka(true);
env.execute("job");
On Mon, Apr 3,
Hi Diego,
I think the problem is here:
security.kerberos.login.contexts: Client, KafkaClient
The space between “Client,” and “KafkaClient” is causing the problem.
Removing it should fix your issue.
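i.e., the entry should be:

security.kerberos.login.contexts: Client,KafkaClient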
Cheers,
Gordon
On April 11, 2017 at 3:24:20 AM, Diego Fustes Villadóniga (dfus...@oesia.com)
Hi,
I just need to
start a timer of x days/hours (let's say) and when it is fired just trigger
something.
Flink’s lower-level ProcessFunction [1] should be very suitable to implement
this. Have you taken a look to see if it suits your case?
[1]
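To sketch the idea (on a keyed stream; `Event` / `Alert` and the 2-day delay are
just placeholders):

public class TimeoutTrigger extends ProcessFunction<Event, Alert> {

    private static final long DELAY_MS = Time.days(2).toMilliseconds();

    @Override
    public void processElement(Event value, Context ctx, Collector<Alert> out) throws Exception {
        // start the timer when the element arrives
        ctx.timerService().registerProcessingTimeTimer(
            ctx.timerService().currentProcessingTime() + DELAY_MS);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Alert> out) throws Exception {
        // fires once the delay has passed -- trigger whatever you need here
        out.collect(new Alert(timestamp));
    }
}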
Hi Sandeep,
It isn’t fixed yet, so I think external tools like the Kafka offset checker
still won’t work.
If you’re using 0.8 and are currently stuck with this issue, you can still
directly query ZK to get the offsets.
I think for FlinkKafkaConsumer09 the offset is exposed to Flink's metric
Hi Pradeep,
There is no single API or connector that takes a file as input and writes it to
Kafka.
In Flink, this operation consists of 2 parts: 1) a source reading from the input,
and 2) a sink producing to Kafka.
So, all you need is a job that consists of that source and sink.
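A minimal sketch of such a job (file path, topic, and broker address are
placeholders):

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

Properties props = new Properties();
props.setProperty("bootstrap.servers", "localhost:9092");

env.readTextFile("file:///path/to/input.txt")              // 1) source: read the file
   .addSink(new FlinkKafkaProducer09<>(                    // 2) sink: produce to Kafka
       "my-topic", new SimpleStringSchema(), props));

env.execute("file-to-kafka");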
You’ve already
Hi Sathi,
Here, in the producer-side log, it says:
2017-04-22 19:46:53,608 INFO [main] DLKnesisPublisher: Successfully published
record, of bytes :162810 partition key
:fb043b73-1d9d-447a-a593-b9c1bf3d4974-1492915604228 ShardId:
shardId-Sequence
to file the
issue: https://issues.apache.org/jira/browse/FLINK-6365.
Really appreciate your contributions for the Kinesis connector!
Cheers,
Steffen
On 22/03/2017 20:21, Tzu-Li (Gordon) Tai wrote:
> Hi Steffan,
>
> I have to admit that I didn’t put too much thought in th
nd when it fails to parse it, we can see the ClassNotFoundException
for the relevant exception (in our case JsResultException from the play-json
library). The library is indeed in the shaded JAR, otherwise we would not be
able to parse the JSON.
Cheers,
Bruno
On Wed, 8 Mar 2017 at 12:57 Tzu-Li (Gordon) Tai <tz
FYI: Here’s the JIRA ticket to track this issue -
https://issues.apache.org/jira/browse/FLINK-6025.
On March 11, 2017 at 6:27:36 PM, Tzu-Li (Gordon) Tai (tzuli...@apache.org)
wrote:
Hi Shannon,
Thanks a lot for providing the example, it was very helpful in reproducing the
problem.
I think
Hi Aniket!
Thanks for also looking into the problem!
I think checking `getAuthenticationMethod` on the UGI subject is the way to go.
At the moment I don’t think there’s a better “proper” solution for this.
As explained in the JIRA, we simply should not be checking for Kerberos
credentials for
Hi Shannon,
Just to clarify:
From the error trace, it seems like the messages fetched from Kafka are
serialized `AmazonS3Exception`s, and you’re emitting a stream of
`AmazonS3Exception` as records from FlinkKafkaConsumer?
Is this correct? If so, I think we should just make sure that the
Hi Bruno!
The Flink CEP library also seems like an option you can look into to see if it
can easily realize what you have in mind.
Basically, the pattern you are detecting is a timeout of 5 minutes after the
last event. Once that pattern is detected, you emit a “device offline” event
Hi,
Are you running Zeppelin on a local machine?
I haven’t tried this before, but you could first try and check if port ‘6123’
is publicly accessible in the security group settings of the AWS EMR instances.
- Gordon
On March 3, 2017 at 10:21:41 AM, Meghashyam Sandeep V
Hi Dominik,
AFAIK, the local mode executions create a mini cluster within the JVM to run
the job.
Also, `MiniCluster` seems to be something FLIP-6 related, and since FLIP-6 is
still work
in progress, I’m not entirely sure if it is viable at the moment. Right now,
you should look
into using
On March 6, 2017 at 9:40:28 PM, Tzu-Li (Gordon) Tai (tzuli...@apache.org) wrote:
Hi Bruno!
The Flink CEP library also seems like an option you can look into to see if it
can easily realize what you have in mind.
Basically, the pattern you are detecting is a timeout of 5 minutes after the
last
Hi Steve,
I’ll try to provide some input for the approaches you’re currently looking into
(I’ll comment on your email below):
* API based stop and restart of job … ugly.
Yes, indeed ;) I think this is absolutely not necessary.
* Use a co-map function with the rules alone stream and the events
Hi,
I just had a quick look at this, but the Kafka fetcher thread’s context
classloader doesn’t seem to be the issue (at least for 1.1.4).
In Flink 1.1.4, a separate thread from the task thread is created to run the
fetcher, but since the task thread sets the user code classloader as its
Hi Sunil,
There’s recently been some effort to allow `DeserializationSchema#deserialize()`
to return `null` in cases like yours, so that the invalid record can be simply
skipped instead of throwing an exception from the deserialization schema.
Here are the related links that you may be interested
Hi Hussein!
Your approach seems reasonable to me. The open() method will be called only
once for the UDF each time the job is started (and also when the job is
restored from failures).
Cheers,
Gordon
On March 3, 2017 at 7:03:22 PM, Hussein Baghdadi (hussein.baghd...@zalando.de)
wrote:
Hi,
java.lang.NoSuchMethodError:
org.apache.flink.util.InstantiationUtil.isSerializable(Ljava/lang/Object;)Z
at
org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSinkBase.<init>(ElasticsearchSinkBase.java:195)
at