In Flink Kubernetes application mode with high availability, the job ID is
always 00, but the History Server uses the job ID as its key. How can I
use application mode with HA and still store the historical job status in
the History Server?
Best,
tanjialiang.
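To illustrate the concern in the question above, here is a minimal Python sketch of a store keyed by job ID alone. It is a toy model, not the History Server's code: the function name `find_id_collisions`, the cluster names, and the assumption that the fixed job ID is 32 zero hex digits are all illustrative.

```python
from collections import defaultdict

# Assumption: the fixed job ID in application mode with HA is 32 zero hex digits.
ZERO_JOB_ID = "0" * 32


def find_id_collisions(archives):
    """Group (cluster_id, job_id) pairs by job ID and return IDs used by more than one cluster."""
    clusters_by_job_id = defaultdict(list)
    for cluster_id, job_id in archives:
        clusters_by_job_id[job_id].append(cluster_id)
    # Only job IDs claimed by several clusters are ambiguous for a store keyed by job ID.
    return {job_id: ids for job_id, ids in clusters_by_job_id.items() if len(ids) > 1}


if __name__ == "__main__":
    # Hypothetical cluster names, for illustration only.
    runs = [("cluster-a", ZERO_JOB_ID), ("cluster-b", ZERO_JOB_ID)]
    print(find_id_collisions(runs))
```

Any store keyed only by job ID cannot tell the two runs apart; how the real History Server behaves in that case is not stated in this thread.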
owse/FLINK-19358
Chenyu Zheng wrote on Fri, Aug 20, 2021 at 12:16 PM:
> The History Server API also uses the job ID to tell jobs apart
>
> * /config
> * /jobs/overview
> * /jobs/<jobid>
> * /jobs/<jobid>/vertices
> * /jobs/<jobid>/config
> * /jobs/<jobid>/exceptions
> * /jobs/<jobid>/accumulators
> * /jobs/<jobid>/vertice
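The History Server API also uses the job ID to tell jobs apart
* /config
* /jobs/overview
* /jobs/<jobid>
* /jobs/<jobid>/vertices
* /jobs/<jobid>/config
* /jobs/<jobid>/exceptions
* /jobs/<jobid>/accumulators
* /jobs/<jobid>/vertices/<vertexid>
* /jobs/<jobid>/vertices/<vertexid>/subtasktimes
* /jobs/<jobid>/vertices/<vertexid>/taskmanagers
* /jobs/<jobid>/vertices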
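The endpoint list above takes the job ID (and, for some paths, a vertex ID) as a path segment. A minimal Python sketch of building those URLs follows; the helper name `history_server_urls` and the base URL `http://localhost:8082` are assumptions for illustration, and only paths that appear in the list are generated.

```python
def history_server_urls(base_url, job_id, vertex_id=None):
    """Return the History Server URLs from the list above for one job (and optionally one vertex)."""
    base = base_url.rstrip("/")
    job = f"{base}/jobs/{job_id}"
    urls = [
        f"{base}/config",
        f"{base}/jobs/overview",
        job,
        f"{job}/vertices",
        f"{job}/config",
        f"{job}/exceptions",
        f"{job}/accumulators",
    ]
    if vertex_id is not None:
        # Vertex-level endpoints need both the job ID and the vertex ID.
        vertex = f"{job}/vertices/{vertex_id}"
        urls += [vertex, f"{vertex}/subtasktimes", f"{vertex}/taskmanagers"]
    return urls


if __name__ == "__main__":
    # Hypothetical base URL and job ID, for illustration only.
    for url in history_server_urls("http://localhost:8082", "0" * 32):
        print(url)
```

Because every job-level path contains only the job ID, two clusters that report the same job ID would map to the same URLs.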
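Hello,
We currently run jobs on k8s in Flink application mode and would now like to
deploy a History Server to make debugging easier. According to the
documentation, however, the Flink History Server appears to support only
different jobs within a single cluster; with multiple clusters, identical job
IDs will cause errors.
What is the best way to use the History Server with multiple application
clusters?
Thanks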
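Flink's History Server is currently not integrated with YARN NodeManager log
aggregation, so after a job finishes you can only see the web UI and the
exceptions; there is no way to view the logs.
Best,
Yang
lhuiseu wrote on Fri, May 7, 2021 at 1:57 PM:
> Hi:
> flink 1.12.0
> on YARN mode
> Finished jobs can be found in the History Server, but viewing the
> TaskManager log through the web UI returns a 404. Does the Flink
> History Server currently not support viewing the aggregated TaskManager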
How do I run the Flink History Server on native Kubernetes? Is there any documentation for this?
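Hi:
flink 1.12.0
on YARN mode
Finished jobs can be found in the History Server, but viewing the TaskManager
log through the web UI returns a 404. Does the Flink History Server currently
not support viewing the aggregated TaskManager logs? I would appreciate help
from anyone who understands how the History Server works.
Many thanks.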
<http://apache-flink.147419.n8.nabble.com/file/t1254/file.png>
--
Sent from: http://apache-flink.147419.n8.nabble.com/
- -c
>>>>> - /opt/flink/bin/flink run-application --target
>>>>> kubernetes-application -Dkubernetes.service-account=flink-service-account
>>>>> -Dkubernetes.rest-service.exposed.type=NodePort
>>>
Dkubernetes.service-account=flink-service-account
>>>> -Dkubernetes.rest-service.exposed.type=NodePort
>>>> -Dkubernetes.cluster-id=batch-job-cluster
>>>> -Dkubernetes.container.image=localhost:5000/batch-flink-app-v3:latest
>>>> -Ds3.endpoint=http://mi
3:latest
>>> -Ds3.endpoint=http://minio-1616518256:9000 -Ds3.access-key=ACCESSKEY
>>> -Ds3.secret-key=SECRETKEY
>>> -Djobmanager.archive.fs.dir=s3://flink/completed-jobs/
>>> -Ds3.path-style-access=true -Ds3.ssl.enabled=false
>>> -Dhigh-availability=org.apache.flink.kubernetes
secret-key=SECRETKEY
>> -Djobmanager.archive.fs.dir=s3://flink/completed-jobs/
>> -Ds3.path-style-access=true -Ds3.ssl.enabled=false
>> -Dhigh-availability=org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
>> -Dhigh-availability.storageDir=s3://flink/flink-ha
>> lo
abled=false
> -Dhigh-availability=org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
> -Dhigh-availability.storageDir=s3://flink/flink-ha
> local:///opt/flink/usrlib/job.jar
> restartPolicy: OnFailure
>
>
> This works well for me but I would like to write
ory
-Dhigh-availability.storageDir=s3://flink/flink-ha
local:///opt/flink/usrlib/job.jar
restartPolicy: OnFailure
This works well for me but I would like to write the result to the archive
path and show it in the History server (running as separate deployment in
k8)
Anytime it cre
Thank you for the confirmation.
On Fri, Mar 19, 2021 at 5:37 AM Matthias Pohl
wrote:
> Hi Vishal,
> yes, as the documentation explains [1]: Only jobs that reached a globally
> terminal state are archived into Flink's history server. State information
> about running jobs can
Hi Vishal,
yes, as the documentation explains [1]: Only jobs that reached a globally
terminal state are archived into Flink's history server. State information
about running jobs can be retrieved through Flink's REST API.
Best,
Matthias
[1]
https://ci.apache.org/projects/flink/flink-docs-release
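The rule Matthias describes can be sketched in a few lines of Python. This is a minimal sketch, not Flink code: the helper name `is_archived` is hypothetical, and the status names are assumed to follow Flink's `JobStatus` values, where FINISHED, CANCELED and FAILED are the globally terminal states.

```python
# Assumption: status strings follow Flink's JobStatus names.
GLOBALLY_TERMINAL_STATES = frozenset({"FINISHED", "CANCELED", "FAILED"})


def is_archived(job_status):
    """Return True if a job in this state would be archived for the History Server."""
    return job_status.upper() in GLOBALLY_TERMINAL_STATES


if __name__ == "__main__":
    for status in ("RUNNING", "FINISHED", "FAILED", "SUSPENDED"):
        source = "History Server" if is_archived(status) else "REST API of the running cluster"
        print(f"{status}: look in the {source}")
```

A RUNNING job therefore never shows up in the History Server; its state has to be read from the cluster's own REST API.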
Hello folks,
Does the Flink History Server not provide information for running jobs (like
the Spark history server does)?
Regards.
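Hi, yujianbo
As long as the job has ended, whether it was canceled, failed, or killed, it
will be shown in the History Server.
You can first check on HDFS whether the configured directory contains folders
for the job; you can also try restarting the History Server. May I ask which
API your job was written with, and which version and submission mode you use?
yujianbo wrote
> I found that after configuring it I can only see completed jobs on the History Server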
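-session running job
list display mode: the official build does not paginate the page, so you have
to modify the source code yourself.
Question 2: version 1.10 is not very friendly for displaying logs; 1.11 can
display rolling log files. As for how to obtain the JM and TM logs, given the
limited official documentation this is still unsolved. For now I still rely on
YARN's job history server and log aggregation for bug analysis. I will
continue the discussion here if there is progress, and new findings are welcome.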
Best,
Robin
zhisheng wrote
> Hi Robin:
>
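> 1. Did you change the refresh interval? Does it never show up?
>
> 2. running
I found that after configuring it I can only see completed jobs on the History Server; failed ones do not appear. What I am unsure about is whether failed jobs can appear in the History Server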
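Hi Robin:
1. Did you change the refresh interval? Does it never show up?
2. Running jobs are not shown; you can view them directly in YARN. The History
Server should only display jobs that have terminated.
PS: a few more questions about the History Server
1. Can the display of terminated jobs support pagination? Currently all
historical jobs are shown on a single page, which makes it very slow to open.
2. Is there a way to view the JM and TM logs of terminated jobs? The logs do
exist on HDFS, so in principle they could be fetched, parsed, and displayed;
the Spark history server can also show the logs of terminated jobs.
Best!
zhisheng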
Robin
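As shown in the figure below, with Flink 1.10 submitted on YARN in per-job
mode, applications developed with the Java DataStream and Table APIs can have
their statistics pulled by the JM as expected, but SQL-based jobs cannot be
monitored by the History Server.
The SQL used is not entirely the official one, but it is converted to
DataStream and submitted to YARN in per-job mode, with only an extra SQL
parsing step. I cannot understand why the History Server did not load the job
information into the target directory on HDFS. The JobManager log and the
configuration both confirm that the JM loaded the History Server settings.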
My current solution is similar to what was done in the earlier 1.10 version.
"containers": [
{
"name": "jobmanager",
"image": "flink:1.11.2-scala_2.11",
"command": [
"/bin/bash",
"-c",
"/opt/flink/bin/historyserver.sh start;/docker-entrypoint.sh jobmanager;"
From: chenxuying
Sent: Saturday, October 10, 2020 15:56
To: user-zh@flink.apache.org
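Subject: Deploying Flink 1.11.2 on k8s: how to start the History Server
Deploying Flink 1.11.2 on k8s: how to start the History Server
Previously, the 1.10 YAML allowed adding a command, but the 1.11 YAML goes
through docker-entrypoint.sh, and this entrypoint script does not seem to have
a corresponding History Server argument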
doesn’t do this?
2020-07-11 11:43:29,527 [HistoryServer shutdown hook] INFO
HistoryServer - *Removing web dashboard root cache directory
/local/scratch/flink_historyserver_tmpdir*
2020-07-11 11:43:29,536 [HistoryServer shutdown hook] INFO
HistoryServer - Stopped history server.
We’re attempting to w
toryServer shutdown hook] INFO HistoryServer -
Stopped history server.
We're attempting to work around the UI becoming unresponsive or crashing the
browser with a large number of archives (in my testing, that's around 20,000
archives with Chrome) by persisting the job IDs of our submitted apps and then
is the upper limit of the number of archives the history server
can support? Does it attempt to download every archive and load them
all into memory?
2. Retention: we have on the order of 100K applications per day in our
production environment. Is there any native retention policy? E.g.
only
please:
1. What is the upper limit of the number of archives the history server
can support? Does it attempt to download every archive and load them all into
memory?
2. Retention: we have on the order of 100K applications per day in our
production environment. Is there any native
me kind
of resource problem.
// ah
From: Hailu, Andreas [Engineering]
Sent: Thursday, May 28, 2020 12:18 PM
To: 'Chesnay Schepler' <ches...@apache.org>; user@flink.apache.org
Subject: RE: History Server Not Showing Any Jobs - File Not Found?
*ah**
*From:*Hailu, Andreas [Engineering]
*Sent:* Thursday, May 28, 2020 12:18 PM
*To:* 'Chesnay Schepler' ; user@flink.apache.org
*Subject:* RE: History Server Not Showing Any Jobs - File Not Found?
Okay, I will look further to see if we’re mistakenly using a version
that’s pre-2.6.0. Howeve
]
Sent: Thursday, May 28, 2020 12:18 PM
To: 'Chesnay Schepler' ; user@flink.apache.org
Subject: RE: History Server Not Showing Any Jobs - File Not Found?
Okay, I will look further to see if we're mistakenly using a version that's
pre-2.6.0. However, I don't see flink-shaded-hadoop in my /lib directory
Are the files within /lib.
// ah
From: Chesnay Schepler
Sent: Thursday, May 28, 2020 11:00 AM
To: Hailu, Andreas [Engineering] ;
user@flink.apache.org
Subject: Re: History Server Not Showing Any Jobs - File Not Found?
Looks like it is indeed stuck on downloading the archive.
I searched a bit
being included introduce?
*// *ah**
*From:*Chesnay Schepler
*Sent:* Thursday, May 28, 2020 9:26 AM
*To:* Hailu, Andreas [Engineering] ;
user@flink.apache.org
*Subject:* Re: History Server Not Showing Any Jobs - File Not Found?
If it were a class-loading issue I would think that we'd see an
e
dPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
What problems could the flink-shaded-hadoop jar being included introduce?
// ah
From: Chesnay Schepler
Sent: Thursday, May 28, 2020 9:26 AM
To: Hailu, Andreas [Engineering] ;
user@flink.apache.org
Subject: Re:
] ;
user@flink.apache.org
*Subject:* Re: History Server Not Showing Any Jobs - File Not Found?
yes, exactly; I want to rule out that (somehow) HDFS is the problem.
I couldn't reproduce the issue locally myself so far.
On 01/05/2020 22:31, Hailu, Andreas wrote:
Hi Chesnay, yes – they were
, Andreas [Engineering] ;
user@flink.apache.org
Subject: Re: History Server Not Showing Any Jobs - File Not Found?
yes, exactly; I want to rule out that (somehow) HDFS is the problem.
I couldn't reproduce the issue locally myself so far.
On 01/05/2020 22:31, Hailu, Andreas wrote:
Hi Chesnay, yes
@flink.apache.org
*Subject:* Re: History Server Not Showing Any Jobs - File Not Found?
hmm...let's see if I can reproduce the issue locally.
Are the archives from the same version the history server runs on?
(Which I suppose would be 1.9.1?)
Just for the sake of narrowing things down, it would also
?
// ah
From: Chesnay Schepler
Sent: Wednesday, April 29, 2020 8:26 AM
To: Hailu, Andreas [Engineering] ;
user@flink.apache.org
Subject: Re: History Server Not Showing Any Jobs - File Not Found?
hmm...let's see if I can reproduce the issue locally.
Are the archives from the same version the history
hmm...let's see if I can reproduce the issue locally.
Are the archives from the same version the history server runs on?
(Which I suppose would be 1.9.1?)
Just for the sake of narrowing things down, it would also be interesting
to check if it works with the archives residing in the local
10:28 AM
To: Hailu, Andreas [Engineering] ;
user@flink.apache.org
Subject: Re: History Server Not Showing Any Jobs - File Not Found?
If historyserver.web.tmpdir is not set then java.io.tmpdir is used, so that
should be fine.
What are the contents of /local/scratch/flink_historyserver_tmpdir?
I
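To answer the question about the directory contents, one could list every generated `overview.json` under the configured directory. The following is a minimal Python sketch; the helper name `find_overview_files` is hypothetical, and it searches the whole tree rather than assuming where inside `historyserver.web.tmpdir` the file is written.

```python
import os


def find_overview_files(web_tmpdir):
    """Return sorted paths of all overview.json files found under web_tmpdir."""
    matches = []
    for root, _dirs, files in os.walk(web_tmpdir):
        if "overview.json" in files:
            matches.append(os.path.join(root, "overview.json"))
    return sorted(matches)


if __name__ == "__main__":
    # Path taken from the thread; adjust to the local historyserver.web.tmpdir setting.
    found = find_overview_files("/local/scratch/flink_historyserver_tmpdir")
    print(found if found else "no overview.json found")
```

An empty result would be consistent with the 404 "File not found." responses reported for the overview calls.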
/
historyserver.web.tmpdir: /local/scratch/flink_historyserver_tmpdir/
Did you have anything else in mind when you said pointing somewhere funny?
*// *ah**
*From:*Chesnay Schepler
*Sent:* Monday, April 27, 2020 5:56 AM
*To:* Hailu, Andreas [Engineering] ;
user@flink.apache.org
*Subject:* Re: History Server
: Re: History Server Not Showing Any Jobs - File Not Found?
overview.json is a generated file that is placed in the local directory
controlled by historyserver.web.tmpdir.
Have you configured this option to point to some non-local filesystem? (Or if
not, is the java.io.tmpdir property pointing
[Engineering] ; user@flink.apache.org
*Subject:* RE: History Server Not Showing Any Jobs - File Not Found?
Hi Chesnay, thanks for responding. We’re using Flink 1.9.1. I enabled
DEBUG level logging and this is something relevant I see:
2020-04-22 13:25:52,566 [Flink-HistoryServer-ArchiveFetcher
]
; user@flink.apache.org
Subject: RE: History Server Not Showing Any Jobs - File Not Found?
Hi Chesnay, thanks for responding. We're using Flink 1.9.1. I enabled DEBUG
level logging and this is something relevant I see:
2020-04-22 13:25:52,566 [Flink-HistoryServer-ArchiveFetcher-thread-1] DEBUG
, April 22, 2020 2:16 AM
To: Hailu, Andreas [Engineering] ;
user@flink.apache.org
Subject: Re: History Server Not Showing Any Jobs - File Not Found?
Which Flink version are you using?
Have you checked the history server logs after enabling debug logging?
On 21/04/2020 17:16, Hailu, Andreas
Which Flink version are you using?
Have you checked the history server logs after enabling debug logging?
On 21/04/2020 17:16, Hailu, Andreas [Engineering] wrote:
Hi,
I’m trying to set up the History Server, but none of my applications
are showing up in the Web UI. Looking at the console, I
Hi,
I'm trying to set up the History Server, but none of my applications are
showing up in the Web UI. Looking at the console, I see that all of the calls
to /overview return the following 404 response: {"errors":["File not found."]}.
I've set up my configuration as
Hi pwestermann
I believe this is related to
https://issues.apache.org/jira/browse/FLINK-13799
It seems that configuration.features['web-submit'] is missing from the
API when upgrading from 1.7 to 1.9.2.
Do you have the same problem when upgrading to 1.10? Feel free to ping me if
you still
Hey Robert,
I just tried Flink 1.10 and the history server UI works for me too. Only
Flink 1.9.2 is not loading.
Since we were already looking into upgrading to 1.10, I might just do that
now.
Thanks,
Peter
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Hey Peter,
I tried reproducing the error, and for a second, I thought the 1.10 release
really broke the web UI, because I saw a pretty similar error.
However after clearing the cache, the error was gone.
Are you sure that you cleared the cache of your browser?
I have also asked the main
I am seeing this error in firefox:
ERROR TypeError: "this.statusService.configuration.features is undefined"
t http://10.25.197.60:8082/main.177039bdbab11da4f8ac.js:1
qr http://10.25.197.60:8082/main.177039bdbab11da4f8ac.js:1
Gr http://10.25.197.60:8082/main.177039bdbab11da4f8ac.js:1
(e.g. Cmd+Shift+R on Mac). It solved my
> problem before.
>
>
> Best,
> Yang
>
> pwestermann wrote on Wed, Mar 4, 2020 at 8:40 PM:
>
>> We recently upgraded from Flink 1.7 to Flink 1.9.2 and the history server
>> UI
>> now seems to be broken. It doesn't load and always
If all the REST API endpoints can be viewed successfully, then the cause may be
the JS cache.
You could try to force a refresh (e.g. Cmd+Shift+R on Mac). It solved my
problem before.
Best,
Yang
pwestermann wrote on Wed, Mar 4, 2020 at 8:40 PM:
> We recently upgraded from Flink 1.7 to Flink 1.9.2 and the history ser
We recently upgraded from Flink 1.7 to Flink 1.9.2 and the history server UI
now seems to be broken. It doesn't load and always just displays a blank
screen.
The individual endpoints (e.g. /jobs/overview) still work.
Could this be an issue caused by the Angular update for the regular UI
2020-02-18 09:44:45,227 ERROR
org.apache.flink.runtime.webmonitor.history.HistoryServerArchiveFetcher -
Failure while fetching/processing job archive for job
eaf0639027aca1624adaa100bdf1332e.
java.nio.file.FileSystemException:
/dev/shm/flink-history-server/jobs/eaf0639027aca1624adaa100bdf1332e
rtition were full or not?
>>>
>>> Richard Moorhead wrote on Tue, Feb 18, 2020 at 8:16 AM:
>>>
>>>> I see the following exception often:
>>>>
>>>> 2020-02-17 18:13:26,796 ERROR
>>>> org.apache.flink.runtime.webmonitor.history.HistoryServer
ime.webmonitor.history.HistoryServerArchiveFetcher -
>>> Failure while fetching/processing job archive for job
>>> eaf0639027aca1624adaa100bdf1332e.
>>> java.nio.file.FileSystemException:
>>> /dev/shm/flink-history-server/jobs/eaf0639027aca1624adaa100bdf1332e/vertices/6a
>> 2020-02-17 18:13:26,796 ERROR
>> org.apache.flink.runtime.webmonitor.history.HistoryServerArchiveFetcher -
>> Failure while fetching/processing job archive for job
>> eaf0639027aca1624adaa100bdf1332e.
>> java.nio.file.FileSystemException:
>> /dev/shm/flink-history-server/jobs/eaf06390
> Failure while fetching/processing job archive for job
> eaf0639027aca1624adaa100bdf1332e.
> java.nio.file.FileSystemException:
> /dev/shm/flink-history-server/jobs/eaf0639027aca1624adaa100bdf1332e/vertices/6abf3ed37d1a5e48f2786b832033f074/subtasks/86/attempts:
I see the following exception often:
2020-02-17 18:13:26,796 ERROR
org.apache.flink.runtime.webmonitor.history.HistoryServerArchiveFetcher -
Failure while fetching/processing job archive for job
eaf0639027aca1624adaa100bdf1332e.
java.nio.file.FileSystemException:
/dev/shm/flink-history-server
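A reply in this thread asks whether the partition was full. A minimal Python sketch for checking that uses the standard library's `shutil.disk_usage`; the helper name `free_fraction` is hypothetical, and the path to check would be wherever the History Server writes its web directory (`/dev/shm/flink-history-server` in the log above).

```python
import shutil


def free_fraction(path):
    """Return the fraction (0.0 to 1.0) of free space on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report a zero size; treat that as no free space.
        return 0.0
    return usage.free / usage.total


if __name__ == "__main__":
    # Checks the current directory; pass the History Server directory instead when diagnosing.
    print(f"free: {free_fraction('.'):.1%}")
```

A value close to zero would point at a full partition; this sketch does not diagnose other possible causes of the FileSystemException.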
I think the best way to view the logs is the Flink History Server.
However, it currently supports only the job graph and exceptions. Maybe
the Flink History Server needs to be enhanced so that we can view
logs just as we can while the cluster is running.
Best,
Yang
Yu Yang wrote on Fri, Sep 6, 2019 at 3:06 AM:
> Hi Yun Tang &
l not be removed.
>
> Best
> Yun Tang
>
> --
> *From:* Zhu Zhu
> *Sent:* Friday, August 30, 2019 16:24
> *To:* Yu Yang
> *Cc:* user
> *Subject:* Re: best practices on getting flink job logs from Hadoop
> history server?
>
> Hi Yu,
>
> Rega
note that the temporary files of the YARN session in the home
directory will not be removed.
Best
Yun Tang
From: Zhu Zhu
Sent: Friday, August 30, 2019 16:24
To: Yu Yang
Cc: user
Subject: Re: best practices on getting flink job logs from Hadoop history
server?
Hi
Hi Yu,
Regarding #2,
Currently we search the JM log for the task deployment log, which contains
information about the container and machine the task is deployed to.
Regarding #3,
You can find the application logs aggregated by machine on DFS; the path
depends on your YARN config.
Each log may still include
Hi,
We run Flink jobs through YARN on Hadoop clusters. One challenge that we
are facing is simplifying access to Flink job logs.
The Flink job logs can be accessed using "yarn logs $application_id".
That approach has a few limitations:
1. It is not straightforward to find yarn application id
Hi Encho,
currently, the existing image does not support starting a HistoryServer.
The only reason is that this has not been exposed; the image itself
contains everything needed. In order to do this, you would need to extend
the docker-entrypoint.sh script with an additional history-server option
Hello,
I am struggling to find how to run a history server in Kubernetes. The
docker image takes an argument that starts a jobmanager or a taskmanager,
but no history server. What's the best way to set up one in K8S?
Thanks,
Encho
As a follow-up question, how well does the history server work for
observing a running job? I'm trying to understand whether, in the
cluster-per-job model, a user would be expected to hop from the Web UI to
the History Server once the job completed.
Thanks
On Wed, Oct 4, 2017 at 3:49 AM
To add to this:
The History Server is mainly useful in cases where one runs a
Flink-cluster-per-job. Once the job finishes, the processes disappear. The
History Server should be longer lived to make past executions' stats
available.
On Mon, Sep 25, 2017 at 3:44 PM, Nico Kruber <n...@d
Hi Elias,
in theory, it could be integrated into a single web interface, but this was
not done so far.
I guess the main reason for keeping it separate was probably to have a better
separation of concerns as the history server is actually independent of the
current JobManager execution
I am curious, why is the History Server a separate process and Web UI
instead of being part of the Web Dashboard within the Job Manager?