Hi Ken,
I am doing a global distinct. What I want to achieve is something like the below.
With windowAll, Flink sends all data to a single operator, which means shuffling
all the data and calculating with parallelism 1. I don't want to shuffle the raw
data, since I just want to feed it to an HLL instance and shuffle just the HLL instances at
Hi Zhijiang,
Thanks for your reply. At first I also suspected the same reason. But once I
read the connected host's log, the netty server starts listening on the
correct port two seconds after the task manager starts. I compared the logs of
the connected host and the connecting host; it seems requesting
Hi Till,
I will try the local test according to your suggestion. The Uber Flink version
mainly adds integration with Uber's deployment and other
infra components. There is no change to the original Flink code flow.
I also found that the issue can be avoided with the same setting in other
Hi,
I am using Avro GenericRecord for most of the IN/OUT types of my custom
functions. What I have noticed is that the default Avro GenericRecord
serializer also serializes the Schema, which makes messages very heavy and
hence impacts overall performance. In my case I already know the schema
beforehand
Hi Wenrui,
I suspect another issue which might cause the connection failure. You can check
whether the netty server has already bound and is listening on the port
successfully in time, before the client requests the connection. If there is
some time-consuming process during TM startup which might delay netty
Hi,
I have a use case where 4 streams get merged (union) and grouped on a common
key (keyBy), and a custom KeyedProcessFunction is called. Now I need to keep
state (RocksDB backend) for all 4 streams in my custom KeyedProcessFunction,
where each of these 4 streams would be stored as a map. So I have 2
> On Jan 9, 2019, at 3:10 PM, CPC wrote:
>
> Hi Ken,
>
> From regular time-based windows do you mean keyed windows?
Correct. Without doing a keyBy() you would have a parallelism of 1.
I think you want to key on whatever you’re counting for unique values, so that
each window operator gets a
Hello,
What is the recommended flink streaming approach for serialising a POJO to
Avro according to a schema, and pushing the subsequent byte array into a
Kafka sink? Also, is there any existing approach for prepending the schema
id to the payload (following the Confluent pattern)?
Thanks,
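On the schema-id question: the Confluent wire format is a single magic byte 0x0, then the schema id as a 4-byte big-endian int, then the Avro bytes. A minimal plain-Java sketch of the framing (class name, id value, and payload are placeholders; Confluent's own serializers do this for you when you use the Schema Registry):

```java
import java.nio.ByteBuffer;

class ConfluentFramer {
    // Confluent wire format: magic byte 0x0, 4-byte big-endian schema id, payload.
    static byte[] prependSchemaId(int schemaId, byte[] avroPayload) {
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + avroPayload.length);
        buf.put((byte) 0x0);   // magic byte
        buf.putInt(schemaId);  // ByteBuffer writes big-endian by default
        buf.put(avroPayload);
        return buf.array();
    }
}
```

A Kafka sink's serialization schema could call this on the Avro-encoded bytes before handing them to the producer.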
Hi there,
You should be able to use a regular time-based window(), and emit the
HyperLogLog binary data as your result, which then would get merged in your
custom function (which you set a parallelism of 1 on).
Note that if you are generating unique counts per non-overlapping time window,
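A hedged, plain-Java sketch of why this two-stage pattern works (names and the precision value here are illustrative, not Flink API; a real job would use a library such as DataSketches): merging HyperLogLog sketches is an element-wise max over their registers, so keyed, windowed pre-aggregates can be combined by a single downstream operator with no loss of accuracy.

```java
// Toy HyperLogLog-style sketch, for illustration only.
class HllSketch {
    final byte[] registers = new byte[1 << 14]; // 2^14 registers (precision 14)

    void add(long hash) {
        int idx = (int) (hash >>> 50);                    // top 14 bits pick a register
        byte rank = (byte) (Long.numberOfLeadingZeros(hash << 14) + 1);
        if (rank > registers[idx]) registers[idx] = rank; // keep the max rank seen
    }

    // Merging is an element-wise max -- this is what lets per-key window
    // results be combined downstream at parallelism 1 without losing accuracy.
    HllSketch merge(HllSketch other) {
        HllSketch out = new HllSketch();
        for (int i = 0; i < registers.length; i++) {
            out.registers[i] = (byte) Math.max(registers[i], other.registers[i]);
        }
        return out;
    }
}
```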
For what it's worth, we believe we are able to work around this issue by
adding the following line to our flink-conf.yaml:
classloader.parent-first-patterns.additional: javax.xml.;org.apache.xerces.
On Thu, Dec 6, 2018 at 2:28 AM Chesnay Schepler wrote:
> Small correction: Flink 1.7 does not
Hi Stefan,
Could i use "Reinterpreting a pre-partitioned data stream as keyed stream"
feature for this?
On Wed, 9 Jan 2019 at 17:50, Stefan Richter
wrote:
> Hi,
>
> I think your expectation about windowAll is wrong, from the method
> documentation: “Note: This operation is inherently
Hi,
I've managed to correct this by implementing my own FsStateBackend based on the
original one with proper Kerberos relogin in createCheckpointStorage().
Regards,
Arnaud
-Original Message-
From: LINZ, Arnaud
Sent: Friday, January 4, 2019 11:32
To: user
Subject: Kerberos error when
Thanks very much for your rapid answer, Stefan.
Regards,
Edward
On Wed, Jan 9, 2019 at 15:26, Stefan Richter ()
wrote:
> Hi,
>
> I would assume that this should currently work because the format of basic
> savepoints and checkpoints is the same right now. The restriction in the
> doc is
Hi Stefan,
Attaching logs:
You can search for : "2019-01-09 19:34:44,170 INFO
org.apache.flink.runtime.taskmanager.Task - Attempting
to cancel task Source:
" in first 2 log files.
f3-part-aa.gz
With https://issues.apache.org/jira/browse/FLINK-10571, we will remove the
Storm topologies from Flink and keep the wrappers for the moment.
However, looking at the FlinkTopologyContext [1], it becomes quite obvious
that Flink's compatibility with Storm is really limited. Almost all of the
Hi,
Could you also provide the job master log?
Best,
Stefan
> On 9. Jan 2019, at 12:02, sohimankotia wrote:
>
> Hi,
>
> I am running Flink Streaming Job with 1.5.5 version.
>
> - Job is basically reading from Kafka , windowing on 2 minutes , and writing
> to hdfs using AvroBucketing Sink .
Hi,
I think your expectation about windowAll is wrong, from the method
documentation: “Note: This operation is inherently non-parallel since all
elements have to pass through the same operator instance” and I also cannot
think of a way in which the windowing API would support your use case
Hi,
I would assume that this should currently work because the format of basic
savepoints and checkpoints is the same right now. The restriction in the doc is
probably there in case the checkpoint format diverges more in the future.
Best,
Stefan
> On 9. Jan 2019, at 13:12, Edward
Hi,
That is more a ZK question than a Flink question, but I don’t think there is a
problem.
Best,
Stefan
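To make the sharing concrete: the usual approach is to give each system its own chroot/namespace on the shared ensemble. A hedged configuration sketch (host names and paths are placeholders; verify the keys against your Flink and Kafka versions):

```yaml
# flink-conf.yaml -- keep Flink's znodes under /flink
high-availability: zookeeper
high-availability.zookeeper.quorum: zk1:2181,zk2:2181,zk3:2181
high-availability.zookeeper.path.root: /flink

# Kafka server.properties -- chroot Kafka under /kafka
# zookeeper.connect=zk1:2181,zk2:2181,zk3:2181/kafka
```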
> On 9. Jan 2019, at 13:31, min@ubs.com wrote:
>
> Hi,
>
> I am new to Flink.
>
> I have a question:
> Can a zookeeper cluster be shared by a flink cluster and a kafka cluster?
>
Hi all,
In our implementation, we are consuming from Kafka and calculating distinct
counts with HyperLogLog. We are using the windowAll function with a custom
AggregateFunction, but the Flink runtime shows somewhat unexpected behavior
at runtime. Our sources run with parallelism 4, and I expect add
Hi,
I am new to Flink.
I have a question:
Can a zookeeper cluster be shared by a flink cluster and a kafka cluster?
Regards,
Min
Hello,
For upgrading jobs between Flink versions I follow the guide in the doc
here:
https://ci.apache.org/projects/flink/flink-docs-release-1.7/ops/upgrading.html#upgrading-the-flink-framework-version
It states that we should always use savepoints for this procedure. I
followed it, and it works
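For reference, the procedure in that guide boils down to two CLI calls (the job id, paths, and jar name below are placeholders):

```shell
# 1. Take a savepoint of the running job on the old cluster
flink savepoint <jobId> hdfs:///flink/savepoints

# 2. Submit the job to the upgraded cluster, restoring from that savepoint
flink run -s hdfs:///flink/savepoints/savepoint-<jobId>-xxxx my-job.jar
```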
Hi,
I am running a Flink Streaming Job with version 1.5.5.
- The job is basically reading from Kafka, windowing on 2 minutes, and writing
to HDFS using the Avro bucketing sink.
- The job is running with parallelism 132.
- Checkpointing is enabled with an interval of 1 minute.
- Savepoint is enabled and getting
Hi Till,
This job is not on AthenaX but on a special Uber version of Flink. I tried to
ping the connected host from the connecting host. It seems very stable. For the
connection timeout, I do set it to 20 min, but it still reports the timeout
after 2 minutes. Could you let me know how you test locally
Hi Wenrui,
I executed AutoParallelismITCase#testProgramWithAutoParallelism and set a
breakpoint at NettyClient.java:102 to see whether the configured timeout
value is correctly set. Moreover, I did the same for
AbstractNioChannel.java:207, and it looked as if the correct timeout value
was set.
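If the 20-minute value is meant to be the netty client connect timeout, a hedged guess at the relevant flink-conf.yaml key (the one Flink's NettyConfig reads; please verify against your Flink version) is:

```yaml
# flink-conf.yaml -- netty client connect timeout, in seconds (1200 s = 20 min)
taskmanager.network.netty.client.connectTimeoutSec: 1200
```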