The dimension table (MySQL) changes dynamically, but when it is joined with the stream table, the join always sees the data from the first query against the dimension table; rows that change in the dimension table afterwards cannot be found by the join.
hi
First, even if you choose the RocksDB state backend, you still need to write to HDFS; enabling incremental
checkpoints only reduces the amount of data written to HDFS on each checkpoint.
I think the core of this question is a trade-off. Without checkpoints, the
read/write performance of RocksDBStateBackend is worse than that of the purely
in-memory FsStateBackend. During the synchronous phase of a checkpoint, RocksDB
StateBackend has to write the full state to local disk, which may be slower
than FsStateBackend's in-memory operation and can also hurt throughput. During
the asynchronous phase of a checkpoint, since RocksDB
For Flink jobs whose state is not very large but that need frequent checkpoints, should the state backend be filesystem or RocksDB? The official docs say RocksDB suits jobs with very large state, though throughput may drop. But choosing the filesystem backend puts heavy pressure on HDFS.
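For reference, the incremental-checkpoint option mentioned above is set in flink-conf.yaml. This is an illustrative fragment, not from the thread; the checkpoint path is a placeholder:

```yaml
# flink-conf.yaml (illustrative values; the HDFS path is a placeholder)
state.backend: rocksdb
# Incremental checkpoints are only supported by the RocksDB backend;
# they upload only the changed SST files, reducing per-checkpoint HDFS writes.
state.backend.incremental: true
state.checkpoints.dir: hdfs:///flink/checkpoints
```

With the filesystem backend the same keys apply (minus `state.backend.incremental`), but every checkpoint is a full snapshot.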
Hi, Samya.Patro
I guess this may be a setup problem. What OS and what version of JDK do you
use? You can try upgrading JDK to see if the issue can be solved.
Best,
Haibo
At 2019-07-02 17:16:59, "Patro, Samya" wrote:
Hello,
I am using RocksDB for storing state. But when I run the
Thanks for being the release manager and the great job!
Cheers,
Jark
On Wed, 3 Jul 2019 at 10:16, Dian Fu wrote:
> Awesome! Thanks a lot for being the release manager. Great job! @Jincheng
>
> Regards,
> Dian
>
> On Jul 3, 2019, at 10:08 AM, jincheng sun wrote:
>
> I've also tweeted about it from my
Hi,Till:
Thank you very much for answering my question!
In the attachment I've included the logs from my last run attempt (Flink
started with debug-level logging). My Flink cluster runs in a Kubernetes-based
Hadoop cluster with 3 nodes and Kerberos security authentication enabled. I
made a
Thanks for being the release manager and great job! @Jincheng
Best,
Kurt
On Wed, Jul 3, 2019 at 10:19 AM Tzu-Li (Gordon) Tai
wrote:
> Thanks for being the release manager @jincheng sun
> :)
>
> On Wed, Jul 3, 2019 at 10:16 AM Dian Fu wrote:
>
>> Awesome! Thanks a lot for being the release
Thanks for being the release manager @jincheng sun
:)
On Wed, Jul 3, 2019 at 10:16 AM Dian Fu wrote:
> Awesome! Thanks a lot for being the release manager. Great job! @Jincheng
>
> Regards,
> Dian
>
> On Jul 3, 2019, at 10:08 AM, jincheng sun wrote:
>
> I've also tweeted about it from my twitter:
>
Awesome! Thanks a lot for being the release manager. Great job! @Jincheng
Regards,
Dian
> On Jul 3, 2019, at 10:08 AM, jincheng sun wrote:
>
> I've also tweeted about it from my twitter:
> https://twitter.com/sunjincheng121/status/1146236834344648704
>
I've also tweeted about it from my twitter:
https://twitter.com/sunjincheng121/status/1146236834344648704
It will also be tweeted later from @ApacheFlink!
Best, Jincheng
On Wed, Jul 3, 2019 at 9:48 AM, Hequn Cheng wrote:
> Thanks for being the release manager and the great work Jincheng!
> Also thanks to Gordon
Thanks for being the release manager and the great work Jincheng!
Also thanks to Gordon and the community for making this release possible!
Best, Hequn
On Wed, Jul 3, 2019 at 9:40 AM jincheng sun
wrote:
> Hi,
>
> The Apache Flink community is very happy to announce the release of Apache
> Flink
Hi,
The Apache Flink community is very happy to announce the release of Apache
Flink 1.8.1, which is the first bugfix release for the Apache Flink 1.8
series.
Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data
Hi
I want to load MySQL tables in Flink without needing to specify the column
names and types (like what we can do with Apache Spark DataFrames).
Using the JDBCInputFormat, we have to pass the table's field types via the
setRowTypeInfo method. I couldn't find any way to make Flink infer the
column types.
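One workaround for the question above is to query the database's own metadata for the column types and build the type information programmatically before calling setRowTypeInfo (with JDBC you would use DatabaseMetaData.getColumns or ResultSetMetaData for this). The sketch below only illustrates that metadata-inference idea; it uses Python's built-in sqlite3 as a stand-in database, not Flink or MySQL:

```python
import sqlite3

# Stand-in for a real database; any JDBC-accessible table exposes
# similar metadata through DatabaseMetaData/ResultSetMetaData.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, score REAL)")

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = conn.execute("PRAGMA table_info(users)").fetchall()

# Extract (name, declared type) pairs -- the information one would map
# to Flink TypeInformation entries when building a RowTypeInfo.
schema = [(col[1], col[2]) for col in columns]
print(schema)  # [('id', 'INTEGER'), ('name', 'TEXT'), ('score', 'REAL')]
```

The mapping from SQL type names to Flink's TypeInformation still has to be written by hand, but at least the column list no longer needs to be hard-coded per table.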
Hi Ever,
As Haibo noted, that’s a known regression.
If you fall back to the older approach of having multiple TMs per slave, each
with one slot, then Flink (as of 1.7/1.8) does a better job of distributing
work.
— Ken
> On Jul 1, 2019, at 9:23 PM, Haibo Sun wrote:
>
> Hi, Ever
>
> This is
Hi,
how did you start the job masters? Could you maybe share the logs of all
components? It looks as if the leader election is not working properly. One
thing to make sure is that you specify for every new HA cluster a different
cluster ID via `high-availability.cluster-id: cluster_xy`. That way
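The HA settings Till refers to live in flink-conf.yaml. A minimal sketch, assuming a ZooKeeper-based setup; the quorum addresses and paths are placeholders:

```yaml
# flink-conf.yaml (illustrative; quorum and paths are placeholders)
high-availability: zookeeper
high-availability.zookeeper.quorum: zk1:2181,zk2:2181,zk3:2181
high-availability.storageDir: hdfs:///flink/ha
# Each HA cluster must get its own distinct ID, otherwise clusters
# sharing the same ZooKeeper ensemble interfere with leader election.
high-availability.cluster-id: cluster_xy
```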
No problem! Glad I could help.
Kostas
On Tue, Jul 2, 2019 at 12:11 PM Avi Levi wrote:
> No, it doesn't. Thanks for pointing it. I just noticed that I wasn't using
> the proxy server address.
> Thanks !!!
>
>
> On Tue, Jul 2, 2019 at 12:16 PM Kostas Kloudas wrote:
>
>> *This Message originated
No, it doesn't. Thanks for pointing it. I just noticed that I wasn't using
the proxy server address.
Thanks !!!
On Tue, Jul 2, 2019 at 12:16 PM Kostas Kloudas wrote:
> *This Message originated outside your organization.*
> --
> Hi Avi,
>
> Do you point the client to
> how do I enable Blink planner support?
After the Flink 1.9 release, you can try the Blink planner.
>Since when is LATERAL TABLE available in Flink? Is it equivalent to using
>temporal tables?
LATERAL TABLE is the table function syntax in Table API/SQL; it has been
available in Flink for a long time. [1]
It is different from
Hi, Andreas
You are right. To meet this requirement, Flink would need to expose an
interface that allows customizing the filename.
Best,
Haibo
At 2019-07-02 16:33:44, "Yitzchak Lieberman" wrote:
regarding option 2 for parquet:
implementing bucket assigner won't set the file name as
Thanks for the update Ken. The input splits seem to
be org.apache.hadoop.mapred.FileSplit. Nothing too fancy jumps out at me.
Internally they use org.apache.hadoop.mapreduce.lib.input.FileSplit which
stores a Path, two long pointers and two string arrays with hosts and host
infos. I would assume
Hello,
I am using RocksDB for storing state, but when I run the pipeline I get the
error "Could not load the native RocksDB library". Could you kindly check the
configs and the error stack trace and suggest what I am doing wrong?
Flink version - 1.8.0
org.apache.flink
Hi Avi,
Do you point the client to the correct address? This means where the
"Queryable State Proxy Server @ ..." says?
Cheers,
Kostas
On Sun, Jun 30, 2019 at 4:37 PM Avi Levi wrote:
> Hi,
> I am trying to query state (cluster 1.8.0 is running on my local machine) .
> I do see in the logs
Hi,
If your async operations are stalled, this will eventually cause problems.
Either this will back pressure sources (the async’s operator queue will become
full) or you will run out of memory (if you configured the queue’s capacity too
high). I think the only possible solution is to either
Thanks for reporting this problem Joshua. I think this is actually a
problem we should fix. The cause seems to be that we swallow the OOM
exception when calling `Task#failExternally`. Probably we don't set the
right uncaught exception handler in the thread which executes the
checkpoint. Let's
Hi all,
In the past, I have tried to further refine the design discussed in this
thread, and I wrote a design document with more detailed design images and
text descriptions, so that it is more conducive to discussion. [1]
Note: The document is not yet complete; for example, the "Implementation"
regarding option 2 for parquet:
implementing a BucketAssigner won't set the file name, as getBucketId()
defines the directory for the files when partitioning the data,
for example:
/day=20190101/part-1-1
There is an open issue for that:
https://issues.apache.org/jira/browse/FLINK-12573
On
Hi Avi,
Yes, it will. The restored state takes priority over the start position.
Best,
Paul Lam
> On Jul 2, 2019, at 15:11, Avi Levi wrote:
>
> Hi,
> If I set the consumer offset in code, e.g. consumer.setStartFromTimestamp, and I
> start the job from a certain savepoint/checkpoint, will the offset in the
Hi,
If I set the consumer offset in code, e.g. *consumer.setStartFromTimestamp*,
and I start the job from a certain savepoint/checkpoint, will the offset in
the checkpoint override the offset that is defined in the code?
Best Regards
Avi