*When connecting to Hive with the Flink 1.12.2 CLI, I can see the catalog and the tables under the specified catalog, but executing a SELECT statement raises an exception.*
[ERROR] Could not execute SQL statement. Reason:
org.apache.hadoop.ipc.RemoteException: Application with id
'application_1605840182730_29292' doesn't exist in RM. Please check that
the job submission was suc
at
Thanks for your reply.
I checked the jar I built, and it does contain org.apache.kafka.common.serialization.Serializer and ByteArraySerializer.
During the Flink run-application submission, the cluster's Kafka 1.1.1
dependency was not loaded onto the classpath either, and there are no Kafka-related jars under application_1619417125027_0011/lib on HDFS.
I don't understand why the problem occurs.
Here is my packaging configuration:
org.apache.maven.plugins
Hi, Radoslav,
> 1. Is it a good idea to have regular savepoints (say on a daily basis)?
> 2. Is it possible to have high availability with Per-Job mode? Or maybe I
> should go with session mode and make sure that my flink cluster is running a
> single job?
Yes, we can achieve HA with per-job
Hi gen,
I verified your case on the 1.13 branch and it runs fine. I suggest cherry-picking that patch to your own branch and verifying again.
Best,
Shengkai
Shengkai Fang wrote on Tue, Apr 27, 2021 at 11:46 AM:
> Which version are you using? This looks like a known bug that has already been fixed [1].
>
> [1] https://github.com/apache/flink/pull/15548
>
> gen wrote on Tue, Apr 27, 2021 at 9:40 AM:
>
>> Hi, all
>>
>> Could someone explain why t.* cannot select the nested fields returned by a user-defined function?
>>
>>
Perhaps the previously generated classes were not deleted; try manually deleting the conflicting classes.
Best,
Shengkai
HunterXHunter <1356469...@qq.com> wrote on Tue, Apr 27, 2021 at 10:58 AM:
> On inspection I found:
>
> org.apache.avro
> avro-maven-plugin
> ${avro.version}
>
hi, Colar.
The Kafka version Flink uses is 2.4.1, but your cluster runs Kafka 1.1.1. It looks
like the ByteArraySerializer loaded at job runtime is the one from the cluster, not
the one from Flink's `flink-connector-kafka`. I'm not sure whether packaging everything into a single shaded jar will work.
Best,
Shengkai
Colar <523774...@qq.com> wrote on Mon, Apr 26, 2021 at 6:05 PM:
> Consuming Kafka with Flink 1.12.2 fails with:
>
> 2021-04-26 17:39:39,802 WARN
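The shading approach mentioned above could be sketched with `maven-shade-plugin` roughly as follows; the plugin version and the relocation prefix here are illustrative assumptions, not taken from this thread:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.2.4</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <!-- Relocate the Kafka client classes bundled in the job jar so they
               cannot collide with the Kafka 1.1.1 jars on the cluster classpath -->
          <relocation>
            <pattern>org.apache.kafka</pattern>
            <shadedPattern>shaded.org.apache.kafka</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Whether relocation actually resolves this depends on where the conflicting classes are loaded from at runtime, so it is only a direction to try.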
Which version are you using? This looks like a known bug that has already been fixed [1].
[1] https://github.com/apache/flink/pull/15548
gen wrote on Tue, Apr 27, 2021 at 9:40 AM:
> Hi, all
>
> Could someone explain why t.* cannot select the nested fields returned by a user-defined function?
>
> tEnv.executeSql(
> """
> | SELECT t.* FROM (
> | SELECT EvtParser(request) as t FROM parsed_nginx_log
>
Flink supports converting a DataStream into a Table, which can then be manipulated through the Table API. To combine it with SQL, register the Table as a
temporary view.
Best,
Shengkai
HunterXHunter <1356469...@qq.com> wrote on Tue, Apr 27, 2021 at 9:46 AM:
> Have you tried it?
>
>
>
> --
> Sent from: http://apache-flink.147419.n8.nabble.com/
>
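The DataStream-to-Table conversion described above can be sketched in Scala; the stream schema and names here are made-up illustrations:

```scala
import org.apache.flink.streaming.api.scala._
import org.apache.flink.table.api._
import org.apache.flink.table.api.bridge.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment
val tEnv = StreamTableEnvironment.create(env)

// A hypothetical stream of (user, amount) records
val stream: DataStream[(String, Long)] = env.fromElements(("a", 1L), ("b", 2L))

// Convert the DataStream into a Table, naming its fields
val table = tEnv.fromDataStream(stream, $"user", $"amount")

// Register it as a temporary view so plain SQL can query it
tEnv.createTemporaryView("events", table)
val result = tEnv.sqlQuery(
  "SELECT `user`, SUM(amount) AS total FROM events GROUP BY `user`")
```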
On inspection I found:
org.apache.avro
avro-maven-plugin
${avro.version}
generate-sources
Hi Dian,
If that's the case, shall we reword "Attach custom python files for job."
into "attach custom files that could be put in PYTHONPATH, e.g., .zip,
.whl, etc."
Best,
Yik San
On Tue, Apr 27, 2021 at 10:08 AM Dian Fu wrote:
> Hi Yik San,
>
> All the files which could be put in the
For the command line arguments, it’s documented in
https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/cli.html
> On Apr 27, 2021 at 10:19 AM, Dian Fu wrote:
>
> There are multiple ways to specify the target directory depending on how to
> specify the python archives.
> 1) API:
There are multiple ways to specify the target directory depending on how to
specify the python archives.
1) API: add_python_archive("file:///path/to/py_env.zip", "myenv"), see [1] for more details,
2) configuration: python.archives, e.g. file:///path/to/py_env.zip#myenv
3) command line
Hi Yik San,
All the files which could be put in the PYTHONPATH are allowed here, e.g. .zip,
.whl, etc.
Regards,
Dian
> On Apr 27, 2021 at 8:16 AM, Yik San Chan wrote:
>
> Hi Dian,
>
> It is still not clear to me - does it only allow Python files (.py), or not?
>
> Best,
> Yik San
>
> On Mon, Apr 26,
Hi, all
Could someone explain why t.* cannot select the nested fields returned by a user-defined function?
tEnv.executeSql(
"""
| SELECT t.* FROM (
| SELECT EvtParser(request) as t FROM parsed_nginx_log
| )
|""".stripMargin)
The user-defined function EvtParser:
@DataTypeHint("ROW")
def eval(line: String) = {...}
Full code:
class
Currently the nested fields cannot be queried via t.*.
val schema = tEnv.executeSql(
"""
| SELECT t.* FROM (
| SELECT EvtParser(request) as t FROM parsed_nginx_log
| )
|""".stripMargin).getTableSchema
The user-defined function EvtParser is defined as follows.
@DataTypeHint("ROW")
def eval(line: String) = {
Have you tried it?
org.apache.kafka.connect.errors.ConnectException: An exception occurred in
the change event producer. This connector will be stopped.
at
io.debezium.pipeline.ErrorHandler.setProducerThrowable(ErrorHandler.java:42)
~[flink-sql-connector-postgres-cdc-1.2.0.jar:1.2.0]
at
Hey Robert.
Nothing weird. I was trying to find recent records (not the latest). No
savepoints (just was running about ~1 day). No checkpoint issues (all
successes). I don't know how many are missing.
I removed the withIdleness. The other parts are very basic. The text logs
look pretty
Hi Dian,
I wonder where we can specify the target directory?
Best,
Yik San
On Mon, Apr 26, 2021 at 9:19 PM Dian Fu wrote:
> Hi Yik San,
>
> It should be a typo issue. I guess it should be `If the target directory
> name is specified, the archive file will be extracted to a directory with
>
Hi Dian,
It is still not clear to me - does it only allow Python files (.py), or not?
Best,
Yik San
On Mon, Apr 26, 2021 at 9:15 PM Dian Fu wrote:
> Hi Yik San,
>
> 1) what `--pyFiles` is used for:
> All the files specified via `--pyFiles` will be put in the PYTHONPATH of
> the Python worker
Hey Yun and Robert,
I'm using Flink v1.11.1.
Robert, I'll send you a separate email with the logs.
On Mon, Apr 26, 2021 at 12:46 AM Yun Tang wrote:
> Hi Dan,
>
> I think you might use older version of Flink and this problem has been
> resolved by FLINK-16753 [1] after Flink-1.10.3.
>
>
> [1]
Hey Nico, thanks for your reply. I gave this a try and unfortunately had no
luck.
// ah
-----Original Message-----
From: Nico Kruber
Sent: Wednesday, April 21, 2021 1:01 PM
To: user@flink.apache.org
Subject: Re: [1.9.2] Flink SSL on YARN - NoSuchFileException
Hi Andreas,
judging from [1], it
Hi Dian,
Thanks for trying it out, it ruled out a problem with the python code. I
double checked the jar path and only included the jar you referenced
without any luck. However, I tried creating a python 3.7 (had 3.8)
environment for pyflink and the code worked without any errors!
On Sun, Apr
The isBackPressured metric is a Boolean -- it reports true or false, rather
than 1 or 0. The Flink web UI can not display it (it shows NaN); perhaps
the same is true for Datadog.
https://issues.apache.org/jira/browse/FLINK-15753 relates to this.
Regards,
David
On Tue, Apr 13, 2021 at 12:13 PM
Hi Arvid, thanks for the reply.
Our stores are world-readable, so I don’t think that it’s an access issue. All
of our clients have the stores present through a shared mount as well. I’m able
to see the shipped stores in the directory.info output when pulling the YARN
logs, and can confirm the
Quick comment on the Kryo type registration and the messages you are
seeing: the messages are expected. What they say is that we are not
serializing the type using Flink's POJO serializer, but are falling
back to Kryo.
Since you are registering all the instances of Number that you
Hi Dan,
Can you describe under which conditions you are missing records (after a
machine failure, after a Kafka failure, after taking and restoring from a
savepoint, ...).
Are many records missing? Are "the first records" or the "latest records"
missing? Any individual records missing, or larger
Thanks a lot for your message. This could be a bug in Flink. It seems that
the archival of the execution graph is failing because some classes are
unloaded.
What I observe from your stack traces is that some classes are loaded from
flink-dist_2.11-1.11.2.jar, while other classes are loaded from
Hi Sonam,
sorry for the late reply. We were a bit caught in the midst of the feature
freeze for the next major Flink release.
In general, I think it is a very good idea to disaggregate the local state
storage to make it reusable across TaskManager failures. However, it is
also not trivial to do.
Hi Yik San,
It should be a typo issue. I guess it should be `If the target directory name
is specified, the archive file will be extracted to a directory with the
specified name.`
Regards,
Dian
> On Apr 26, 2021 at 8:57 PM, Yik San Chan wrote:
>
> Hi community,
>
> In
>
Hi Yik San,
1) what `--pyFiles` is used for:
All the files specified via `--pyFiles` will be put in the PYTHONPATH of the
Python worker during execution and then they will be available for the Python
user-defined functions during execution.
2) validate for the files passed to `--pyFiles`
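The reason archives such as .zip work with `--pyFiles` is that Python itself can import modules from zip files placed on `PYTHONPATH`/`sys.path`. A stdlib-only sketch of that mechanism (the module name and contents are made up for illustration):

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a tiny zip containing one module, mimicking what --pyFiles ships
tmp = tempfile.mkdtemp()
zip_path = os.path.join(tmp, "deps.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("mymod.py", "def greet():\n    return 'hello'\n")

# Putting the archive on sys.path is equivalent to adding it to PYTHONPATH
sys.path.insert(0, zip_path)
mymod = importlib.import_module("mymod")
print(mymod.greet())  # -> hello
```

This is why both plain .py files and importable archives can be accepted there.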
Hi community,
In
https://ci.apache.org/projects/flink/flink-docs-stable/dev/python/python_config.html#python-options,
> For each archive file, a target directory is specified. If the target
directory name is specified, the archive file will be extracted to a name
can directory with the
Hi community,
In
https://ci.apache.org/projects/flink/flink-docs-stable/dev/python/python_config.html,
regarding python.files:
> Attach custom python files for job.
This makes readers think only Python files are allowed here. However, in
Hi community,
When using Filesystem SQL Connector, users need to provide a path. When
running a PyFlink job using the mini-cluster mode by simply `python
WordCount.py`, the path can be a relative path, such as, `words.txt`.
However, trying to submit the job to `flink run` will fail without
We are on Flink 1.12.2 and the job needs a side output. I first created a side-output tag:
val outputTagDate = OutputTag[String]("Date-side-output"). Compiling the program reports a
type mismatch:
found : org.apache.flink.streaming.api.scala.OutputTag[String]
required: org.apache.flink.util.OutputTag[java.io.Serializable]
Note: String <: java.io.Serializable,
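One common cause of this kind of mismatch is mixing the Java and Scala APIs: with the Scala DataStream API imported consistently, the Scala `OutputTag` (which extends `org.apache.flink.util.OutputTag`) is accepted directly. A sketch of a consistent setup, where the processing logic is a made-up illustration:

```scala
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.functions.ProcessFunction
import org.apache.flink.util.Collector

val env = StreamExecutionEnvironment.getExecutionEnvironment

// Scala-API OutputTag; keep org.apache.flink.streaming.api.scala._ in scope
val outputTagDate = OutputTag[String]("Date-side-output")

val main: DataStream[String] = env
  .fromElements("2021-04-26", "hello")
  .process(new ProcessFunction[String, String] {
    override def processElement(value: String,
                                ctx: ProcessFunction[String, String]#Context,
                                out: Collector[String]): Unit = {
      // route date-like records to the side output, everything else downstream
      if (value.contains("-")) ctx.output(outputTagDate, value)
      else out.collect(value)
    }
  })

val dates: DataStream[String] = main.getSideOutput(outputTagDate)
```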
Consuming Kafka with Flink 1.12.2 fails with:
2021-04-26 17:39:39,802 WARN org.apache.flink.runtime.taskmanager.Task
[] - TriggerWindow(SlidingEventTimeWindows(3, 5000),
ReducingStateDescriptor{name=window-contents, defaultValue=null,
The intermediate state cannot be lost either; both are needed.
hk__lrzy wrote on Sun, Apr 25, 2021 at 8:25 PM:
> All operators need to be maintained.
>
>
>
>
Hi Till,
I have created the ticket to extend the description of `execution.target`.
https://issues.apache.org/jira/browse/FLINK-22476
best regards,
Tony Wei wrote on Mon, Apr 26, 2021 at 5:24 PM:
> Hi Till, Yangze,
>
> I think FLINK-15852 should solve my problem.
> It is my fault that my flink version is
Hi community,
This question is cross-posted on Stack Overflow
https://stackoverflow.com/questions/67264156/flink-hive-connector-hive-conf-dir-supports-hdfs-uri-while-hadoop-conf-dir-sup
In my current setup, local dev env can access testing env. I would like to
run Flink job on local dev env,
Hi Till, Yangze,
I think FLINK-15852 should solve my problem.
It is my fault that my flink version is not 100% consistent with the
community version, and FLINK-15852 is the one I missed.
Thanks for your information.
best regards,
Till Rohrmann wrote on Mon, Apr 26, 2021 at 5:14 PM:
> I think you are right
If the GenericCLI is selected, then the execution.target should have
been overwritten to "yarn-application" in GenericCLI#toConfiguration.
It is odd why GenericCLI#isActive returns false, as the
execution.target is defined in both flink-conf and the command line.
Best,
Yangze Guo
On Mon, Apr
I think you are right that the `GenericCLI` should be the first choice.
From the top of my head I do not remember why FlinkYarnSessionCli is still
used. Maybe it is in order to support some Yarn specific cli option
parsing. I assume it is either an oversight or some parsing has not been
Hi, Till,
I agree that we need to resolve the issue by overriding the
configuration before selecting the CustomCommandLines. However, IIUC,
after FLINK-15852 the GenericCLI should always be the first choice.
Could you help me to understand why the FlinkYarnSessionCli can be
activated?
Best,
Hi Tony,
I think you are right that Flink's cli does not behave super consistent at
the moment. Case 2. should definitely work because `-t yarn-application`
should overwrite what is defined in the Flink configuration. The problem
seems to be that we don't resolve the configuration wrt the
Hi Arvid,
On 2 - I was referring to stateful functions as an alternative to windows,
but in this particular use case, its not fitting in exactly I think, though
a solution can be built around it.
On the overall approach here what's the right way to use Flink SQL:
Every event has the transaction
Hi all,
I am having multiple questions regarding Flink :) Let me give you some
background of what I have done so far.
*Description*
I am using Flink 1.11.2. My job is doing data enrichment. Data is consumed
from 6 different kafka topics and it is joined via multiple
CoProcessFunctions. On a
I'm not sure what you're trying to achieve. Are you trying to simulate a
task failure? Or are you trying to pick up the state from a stopped job?
You could achieve the former one by killing the TaskManager instance or by
throwing a custom failure as part of your job pipeline. The latter one can
be
Hi Dan,
I think you might use older version of Flink and this problem has been resolved
by FLINK-16753 [1] after Flink-1.10.3.
[1] https://issues.apache.org/jira/browse/FLINK-16753
Best
Yun Tang
From: Robert Metzger
Sent: Monday, April 26, 2021 14:46
To: Dan
We are using Flink 1.12.2. If I apply operators such as map, flatMap, or
process on a DataStream, can I call tableEnv.sqlQuery("xxx") or
tableEnv.createTemporaryView() inside their overridden methods? Can SQL be combined with the DataStream API this way?
Hi Prashant,
Flink always gives these warnings as long as the deduced type is
a GenericType, no matter whether it uses the default Kryo serializer or
a registered one, so if you have registered the type, you may
simply ignore the warnings. To make sure it works, you may
find the TM that the source
Hey Prashant,
the Kafka Consumer parallelism is constrained by the number of partitions
the topic(s) have. If you have configured the Kafka Consumer in Flink with
a parallelism of 100, but your topic has only 20 partitions, 80 consumer
instances in Flink will be idle.
On Mon, Apr 26, 2021 at 2:54
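The partition-vs-parallelism arithmetic above can be written down as a toy helper (just an illustration, not a Flink API):

```python
def idle_consumers(parallelism: int, partitions: int) -> int:
    """Kafka source subtasks beyond the partition count get no partition
    assigned and therefore stay idle."""
    return max(0, parallelism - partitions)

# With a parallelism of 100 and a 20-partition topic, 80 subtasks are idle.
print(idle_consumers(100, 20))  # -> 80
```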
Hi Dan,
can you provide me with the JobManager logs to take a look as well? (This
will also tell me which Flink version you are using)
On Mon, Apr 26, 2021 at 7:20 AM Dan Hill wrote:
> My Flink job failed to checkpoint with a "The job has failed" error. The
> logs contained no other recent
Hi, Gil
IIUC, you want to deploy Flink cluster using YAML files yourselves and
want to know whether the JM should be deployed as Job[1] or
Deployment. If that is the case, as Matthias mentioned, Flink provides
two ways to integrate with K8S [2][3], in [3] the JM will be deployed
as a Deployment.
Hi, Tony.
What is the version of your flink-dist? AFAIK, this issue should be
addressed in FLINK-15852 [1]. Could you provide the client log of case
2 (setting the log level to DEBUG would be better)?
[1] https://issues.apache.org/jira/browse/FLINK-15852
Best,
Yangze Guo
On Sun, Apr 25, 2021 at 11:33
Hi, could you tell me which version you are using? I just want to check
whether there are any problems.
Best,
Shengkai
张颖 wrote on Sun, Apr 25, 2021 at 5:23 PM:
> Hi, I ran into something like this:
>
> this is my sql:
> SELECT distinct header,label,reqsig,dense_feat,item_id_feat,user_id_feat
> FROM
Hi,
The MATCH_RECOGNIZE clause in the SQL standard does not support different
contiguities; MATCH_RECOGNIZE always uses strict contiguity.
Best,
Dawid
On 21/04/2021 00:02, tbud wrote:
> There's 3 different types of Contiguity defined in the CEP documentation [1]
> looping + non-looping --
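Strict contiguity means each pattern variable must match the row immediately following the previous one; there is no MATCH_RECOGNIZE analogue of CEP's relaxed `followedBy`. A hypothetical sketch (table and column names are made up):

```sql
SELECT *
FROM events
  MATCH_RECOGNIZE (
    PARTITION BY user_id
    ORDER BY event_time
    MEASURES
      A.event_time AS login_time,
      B.event_time AS fail_time
    -- B must match the row immediately after A: strict contiguity only
    PATTERN (A B)
    DEFINE
      A AS A.action = 'login',
      B AS B.action = 'fail'
  )
```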
Hi,
Yes you are correct that if an event can not match any pattern it won't
be stored in state. If you process your records in event time it might
be stored for a little while before processing in order to sort the
incoming records based on time. Once a Watermark with a higher timestamp
comes it
Hi Shengkai,
Thanks for the answer. The question is do we need to determine if an
event in the main stream is late.
Let's look at interval join - event is emitted as soon as there is a
match between left and right stream.
I agree the watermark should pass on versioned table side, because
this is
hi, macia kk.
This looks like a bug. Could you provide your DDL and the relevant environment so we can reproduce the problem?
Best,
Shengkai
The plan digest does not print the values of connector options, so you cannot tell from the plan whether they took effect.
macia kk wrote on Mon, Apr 26, 2021 at 12:31 AM:
> Hi
>
> When using a temporal join I set dynamic options for reading partitions, but
> they did not take effect; everything used the default parameters. I printed the execution plans: the options appear in the logical plan but are gone after optimization.
>