Flink 1.8 on YARN: with YARN ResourceManager HA (two RMs, one Active, one Standby), the standby RM at port 8032 is being contacted.
PS: Flink 1.7 on YARN against the same Hadoop.
2019-07-23
How did you solve the first problem?
On 2019-07-18 11:15:49, "九思" <1048095...@qq.com> wrote:
Hi, I'm using StreamingFileSink (or BucketingSink) to write to HDFS, and have two questions.
1. Running locally, when I kill the program the files remain in the in-progress state and are never promoted to finished files. How should I handle this?
2. After restarting the program, the part-file counter starts from 0 again instead of continuing from the previous value. Looking at the source, it does try to restore the previous counter, but when I checked with a breakpoint it was never restored. What could be the reason?
Hi,
In
https://ci.apache.org/projects/flink/flink-docs-stable/dev/local_execution.html
and
https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/runtime/minicluster/MiniCluster.html
I see there are 3 ways to create an execution environment for testing:
-
Hi,
I'm trying to use AbstractTestBase in a test in order to use the mini
cluster. I'm using specs2 with Scala, so I cannot extend AbstractTestBase
(I also have to extend org.specs2.Specification); instead I'm trying to
access the mini cluster directly, using Specs2 BeforeAll to initialize it as
On Mon, Jul 22, 2019, 16:08 Prakhar Mathur wrote:
> Hi,
>
> We enabled GC logging, here are the logs
>
> [GC (Allocation Failure) [PSYoungGen: 6482015K->70303K(6776832K)]
> 6955827K->544194K(20823552K), 0.0591479 secs] [Times: user=0.09 sys=0.00,
> real=0.06 secs]
> [GC (Allocation Failure)
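For reference on reading these lines: each pair is occupancy-before->occupancy-after(capacity). A throwaway parser (hypothetical, just to illustrate the arithmetic) over the first line above:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcLineParse {
    public static void main(String[] args) {
        // First GC line from the log above: before->after(capacity), in KB.
        String line = "[GC (Allocation Failure) [PSYoungGen: 6482015K->70303K(6776832K)] "
                + "6955827K->544194K(20823552K), 0.0591479 secs]";
        Matcher m = Pattern.compile("PSYoungGen: (\\d+)K->(\\d+)K\\((\\d+)K\\)").matcher(line);
        if (m.find()) {
            long beforeKb = Long.parseLong(m.group(1));
            long afterKb = Long.parseLong(m.group(2));
            // ~6.1 GB of short-lived objects reclaimed from the young generation.
            System.out.println("young gen reclaimed MB: " + (beforeKb - afterKb) / 1024);
        }
    }
}
```

The young generation drops from ~6.2 GB to ~69 MB in 60 ms while the total heap shrinks to well under its capacity, which suggests ordinary short-lived allocation churn rather than a leak.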
Hi Yun Tang, your suggestion is very important to us. Following it, we have advised users to increase the checkpoint interval (to 1-5 minutes) and to set state.backend.fs.memory-threshold=10k. But we only have one HDFS cluster and are trying to reduce HDFS API calls; I don't know if there
Hi, in my understanding CreateFile and FileCreated are different APIs; FileCreated looks more like a check API, but I can't find where it is called in the source. I don't understand when the FileCreated API is called and for what. Is FileCreated an HDFS-internal confirmation API? FLINK-11696 is to
Please check whether the following profile section exists in
"flink-filesystems/flink-mapr-fs/pom.xml". If not, you should pull the latest
code and try to compile it again. If yes, please share the latest error
message, it may be different from before.
Hi Flavio,
Based on the discussion in the tickets you mentioned above, the
program-class attribute was a mistake, and the community intends to
replace it with main-class.
Deprecating the Program interface is part of the work on Flink's new client API.
IIUC, your requirements are not so complicated. We
Hi,
I have a Flink sql streaming job defined by:
SELECT
user_id
, hop_end(created_at, interval '30' second, interval '1' minute) as bucket_ts
, count(name) as count
FROM event
WHERE name = 'signin'
GROUP BY
user_id
, hop(created_at, interval '30' second, interval '1' minute)
there is
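The HOP(..., slide 30 s, size 1 min) grouping above assigns each event to size/slide = 2 overlapping windows. A plain-Java sketch of that assignment (hypothetical helper, not Flink's actual window assigner):

```java
import java.util.ArrayList;
import java.util.List;

public class HopWindows {
    // Returns the [start, end) windows a timestamp belongs to for a hopping
    // window with the given slide and size (both in milliseconds).
    static List<long[]> assign(long ts, long slideMs, long sizeMs) {
        List<long[]> windows = new ArrayList<>();
        long lastStart = ts - (ts % slideMs); // latest window start <= ts
        for (long start = lastStart; start > ts - sizeMs; start -= slideMs) {
            windows.add(new long[] {start, start + sizeMs});
        }
        return windows;
    }

    public static void main(String[] args) {
        // An event at t = 65 s with slide 30 s, size 60 s lands in two windows:
        // [60 s, 120 s) and [30 s, 90 s).
        for (long[] w : assign(65_000L, 30_000L, 60_000L)) {
            System.out.println(w[0] + " - " + w[1]);
        }
    }
}
```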
Hi,
I used the command mvn clean package -DskipTests -Punsafe-mapr-repo, but it
didn't work; I get the same error.
Regards
Yebgenya Lazar
From: Haibo Sun
Sent: Monday, 22 July 2019 04:40
To: Yebgenya Lazarkhosrouabadi
Cc: user@flink.apache.org
Subject: Re: Unable to build Flink1.10 from
Hi Lasse,
Thanks for the reply. If your input is in epoch time, you are not getting
local time; instead, you are getting a wrong time that does not make sense.
For example, if the user input value is 0 (which means 00:00:00 UTC on 1
January 1970), and your local timezone is UTC-8, converting
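The pitfall can be shown with the plain JDK (not Calcite's actual code): java.sql.Timestamp.toString() renders in the JVM's default time zone, so the same epoch value 0 prints differently depending on the zone.

```java
import java.sql.Timestamp;
import java.util.TimeZone;

public class TimestampZoneDemo {
    public static void main(String[] args) {
        // Epoch millis 0 means 1970-01-01T00:00:00 UTC by definition.
        TimeZone.setDefault(TimeZone.getTimeZone("UTC"));
        System.out.println(new Timestamp(0L)); // 1970-01-01 00:00:00.0
        // In a UTC-8 zone the same instant reads 8 hours earlier on the wall clock.
        TimeZone.setDefault(TimeZone.getTimeZone("GMT-8"));
        System.out.println(new Timestamp(0L)); // 1969-12-31 16:00:00.0
    }
}
```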
Hi folks,
Will StreamingFileSink.forBulkFormat(...) support overriding
OnCheckpointRollingPolicy?
Does anyone use StreamingFileSink *with checkpointing disabled* for writing
Parquet output files?
The output Parquet files are generated, but they are empty and stay in the
*in-progress* state, even when
Hi.
I have encountered the same problem: when you feed epoch time into the window
table function and then use window.start and window.end, the output is not in
epoch time but in local time. I traced the problem to the same internal
function as you.
Med venlig hilsen / Best regards
Lasse
It turns out the actual issue was a configuration issue, and we just had to pore
over the job manager log carefully. We were using HDFS [really an HDFS API on
top of Windows blob storage] as the source, and we didn't provide the server
location, so it took the path prefix as the server.
The only thing here would have been
Hi all,
Currently, in the non-Blink table/SQL runtime, Flink uses
SqlFunctions.internalToTimestamp(long v) from Calcite to convert event time
(a long) to java.sql.Timestamp. However, as discussed on the recent
Calcite mailing list (Jul. 19, 2019), SqlFunctions.internalToTimestamp()
assumes the
Thanks Andrey.
The environment we run [an Azure HDInsight cluster] only supports Flink 1.4.2
now, so I can't run 1.8 in this environment.
I can run in a different environment with 1.8 [on Kubernetes not YARN though]
and report the results.
Thanks,
-Fakrudeen
(define (sqrte n xn eph) (if (>
Hi Fakrudeen,
Thanks for sharing the logs. Could you also try it with Flink 1.8?
Best,
Andrey
On Sat, Jul 20, 2019 at 12:44 AM Fakrudeen Ali Ahmed
wrote:
> Hi Andrey,
>
>
>
>
>
> Flink version: 1.4.2
>
> Please find the client log attached and job manager log is at: job
> manager log
>
Hello everyone,
I need to create a table from a stream environment and, thinking in a pure
SQL approach, I was wondering if I can create a few of the enrichment tables
in a batch environment and register only the streaming payload in the
streaming table environment.
I tried to create a batch table environment
I did take a look at it, but things got out of hand very quickly from there
on :D
I see that WebSubmissionExtension implements WebMonitorExtension, but
WebSubmissionExtension is then used in DispatcherRestEndpoint, which I
couldn't figure out how to manipulate/extend...
How can I plug my Extension
I simply want to open up endpoints to query QueryableStates. What I had in
mind was to give operators an interface to implement their own
QueryableState controllers, e.g. serializers etc.
We are trying to use Flink in more of an "application framework" fashion,
so extensibility helps a lot. As
Hi Tison,
we use a modified version of the Program interface to let a web UI
properly detect and run Flink jobs contained in a jar, plus their parameters.
As stated in [1], we detect multiple main classes per jar by handling an
extra comma-separated Manifest entry (i.e. 'Main-classes').
As
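Reading such a custom comma-separated manifest entry takes only the JDK. A sketch under stated assumptions (the 'Main-classes' attribute name comes from the message above; the com.acme class names are made up):

```java
import java.io.ByteArrayInputStream;
import java.util.Arrays;
import java.util.List;
import java.util.jar.Manifest;

public class MainClassesDemo {
    public static void main(String[] args) throws Exception {
        // 'Main-classes' is a custom attribute, not part of the standard
        // manifest spec; a real jar would be read via new JarFile(f).getManifest().
        String mf = "Manifest-Version: 1.0\r\n"
                + "Main-classes: com.acme.JobA,com.acme.JobB\r\n\r\n";
        Manifest manifest = new Manifest(new ByteArrayInputStream(mf.getBytes("UTF-8")));
        String value = manifest.getMainAttributes().getValue("Main-classes");
        List<String> mains = Arrays.asList(value.split(","));
        System.out.println(mains);
    }
}
```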
Hi guys,
We want to have an accurate idea of how many people implement
Flink jobs based on the interface Program, and how they actually
implement them.
The reason I ask for the survey is from this thread[1] where we notice
this codepath is stale and less useful than it should be. As it is an
Thanks Biao, just want to not reinvent the wheel :)
> On Jul 22, 2019, at 4:29 PM, Biao Liu wrote:
>
> Hi Andy,
>
> As far as I know, Flink does not support a feature like that.
>
> I would suggest recording and calculating the time in user code.
> For example, add a timestamp field (maybe an
Hi.
I'm running a Flink application (version 1.8.0) that
uses FlinkKafkaConsumer to fetch topic data and perform transformation on
the data, with state backend as below:
StreamExecutionEnvironment env =
StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(5_000,
Hi Andy,
As far as I know, Flink does not support a feature like that.
I would suggest recording and calculating the time in user code.
For example, add a timestamp field (maybe an array) to your record and
stamp it with the current time at each processing step.
Andy Hoang wrote on Mon, Jul 22, 2019 at 4:49 PM:
> Hi
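The suggestion above (a timestamp trail carried inside the record) can be sketched in plain Java; the type and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

public class TimedEvent {
    final String payload;
    final List<Long> stamps = new ArrayList<>(); // one entry per operator visit

    TimedEvent(String payload) { this.payload = payload; }

    // Call at the start (or end) of each operator's processing.
    TimedEvent stamp() {
        stamps.add(System.currentTimeMillis());
        return this;
    }

    // Wall-clock time between the first and last operator that stamped the record.
    long elapsedMillis() {
        return stamps.get(stamps.size() - 1) - stamps.get(0);
    }

    public static void main(String[] args) {
        TimedEvent e = new TimedEvent("order-1").stamp(); // entering operator A
        e.stamp();                                        // entering operator B
        System.out.println("hops=" + e.stamps.size() + " elapsed>=0: " + (e.elapsedMillis() >= 0));
    }
}
```

At the end of the pipeline a sink can emit the trail to ELK and compute per-operator durations from consecutive stamps.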
Hi guys,
I’m trying to write elk log for flink, this help us to store/calculate
processing time of a group of operators for business auditing.
I read about process_function and Debugging Windows & Event Time in docs.
They’re focus on “keyed” events and monitoring using web/metric, where I want
Hi,
As far as I know, the RESTful handlers are not pluggable, and I don't see a
strong reason in your description to make them so.
Could you explain more about your requirement?
Oytun Tez wrote on Sat, Jul 20, 2019 at 4:36 AM:
> Yep, I scanned all of the issues in Jira and the codebase, I couldn't find
> a way to
Hi Juan,
If you want to deduplicate, then you could group by the record, and use a (very
simple) reduce function to only emit a record if the group contains one element.
There will be performance issues, though - Flink will have to generate all
groups first, which typically means spilling to
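The grouping idea can be sketched with plain Java collections (an in-memory stand-in; on Flink one would coGroup the two DataSets): tag each record with its origin, group by value, and keep the values seen only in the first dataset.

```java
import java.util.*;

public class DatasetDifference {
    // Elements of a that are not present in b (also deduplicated).
    static List<String> difference(List<String> a, List<String> b) {
        // Group by value, remembering which dataset(s) each value came from.
        Map<String, Set<String>> origins = new LinkedHashMap<>();
        for (String x : a) origins.computeIfAbsent(x, k -> new HashSet<>()).add("A");
        for (String x : b) origins.computeIfAbsent(x, k -> new HashSet<>()).add("B");
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : origins.entrySet()) {
            if (e.getValue().equals(Collections.singleton("A"))) out.add(e.getKey());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(difference(
                Arrays.asList("x", "y", "z", "x"),
                Arrays.asList("y")));
    }
}
```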
Hi,
I've been trying to write a function to compute the difference between two
datasets. By that I mean computing a dataset that has all the elements of
one dataset that are not present in another dataset. I first tried using
coGroup, but it was very slow in a local execution environment, and
TaskManager:
2019-07-22 05:39:03,987 WARN org.apache.hadoop.ipc.Client
- Failed to connect to server: master/10.0.2.11:9000: try once and
fail. java.nio.channels.ClosedByInterruptException at