Re: flink 1.11: writing to HDFS with SQL does not auto-commit partitions

[email protected] Wed, 12 Aug 2020 01:21:46 -0700

https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/connectors/streamfile_sink.html


This is probably the cause:
General

Important Note 1: When using Hadoop < 2.7, please use the 
OnCheckpointRollingPolicy which rolls part files on every checkpoint. The 
reason is that if part files “traverse” the checkpoint interval, then, upon 
recovery from a failure the StreamingFileSink may use the truncate() method of 
the filesystem to discard uncommitted data from the in-progress file. This 
method is not supported by pre-2.7 Hadoop versions and Flink will throw an 
exception.
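
To make the version rule in the note above concrete, here is a minimal sketch in plain Java. It is not Flink's own code: the class HadoopTruncateCheck and the method supportsTruncate are hypothetical names made up for illustration, and the version strings are example values only. It just encodes "truncate() needs Hadoop 2.7 or newer" so you can see which side of the line a given cluster falls on.

```java
// Hypothetical helper, not part of Flink or Hadoop: encodes the
// "Hadoop < 2.7 has no truncate()" rule from the documentation note.
final class HadoopTruncateCheck {
    private HadoopTruncateCheck() {}

    /** Returns true if the version string denotes Hadoop 2.7 or newer. */
    public static boolean supportsTruncate(String hadoopVersion) {
        // Vendor builds append a suffix (for example "-cdh5.16.2"),
        // so split on both '.' and '-' and read only major and minor.
        String[] parts = hadoopVersion.trim().split("[.-]");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Unexpected version: " + hadoopVersion);
        }
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        return major > 2 || (major == 2 && minor >= 7);
    }

    public static void main(String[] args) {
        // Example version strings, assumed for illustration only.
        String[] versions = {"2.6.0-cdh5.16.2", "2.7.0", "3.1.1"};
        for (String v : versions) {
            String advice = supportsTruncate(v)
                    ? "truncate() should be available"
                    : "use OnCheckpointRollingPolicy";
            System.out.println(v + " -> " + advice);
        }
    }
}
```

If your cluster lands on the "< 2.7" side, that would match what is described further down in this thread: it works on Hadoop 3 but not on the older production HDFS.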



From: Jun Zhang<mailto:[email protected]>
Sent: July 23, 2020 12:55
To: Jingsong Li<mailto:[email protected]>
Cc: user-zh<mailto:[email protected]>
Subject: Re: flink 1.11: writing to HDFS with SQL does not auto-commit partitions

Hi Jingsong,
Our production HDFS is CDH 2.6. I switched to an HDFS on Hadoop 3 and the problem really did go away. I don't know what was going wrong.

On Thu, Jul 23, 2020 at 11:45 AM Jingsong Li <[email protected]> wrote:

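> I could not reproduce it with the same steps either; it runs successfully for me.
>
> Which HDFS version are you on? Could you try testing against a different one?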
>
> On Thu, Jul 23, 2020 at 11:34 AM Jun Zhang <[email protected]>
> wrote:
>
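>> Hi Jingsong,
>>
>> Have you had a chance to test this since? It still does not work on my side. With parallelism 1 the files are written correctly, but no success file is generated; when writing to Hive, the partition is not created automatically and the partition data is not updated.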
>>
>> On Thu, Jul 23, 2020 at 11:15 AM Jun Zhang <[email protected]> wrote:
>>
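>>> Hi 夏帅,
>>>
>>> Sorry, I have not worked on this for the past few days. My problem is that the files are written to HDFS correctly, but nothing is committed automatically and there is no error log. That is, when writing to a filesystem there is no SUCCESS file, and when writing to Hive the partition is not updated automatically.
>>>
>>> In your test where it worked, was the parallelism 1? And were you writing to HDFS?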
>>>
>>> On Fri, Jul 10, 2020 at 5:39 PM 夏帅 <[email protected]> wrote:
>>>
>>>> Hello,
>>>> The same code does not show this problem on my side.
>>>> Are you running it locally? Could you share the logs?
>>>>
>>>>
>
> --
> Best, Jingsong Lee
>