Yes, it is worth measuring. In theory the MR writer should not show a large performance gap.

> Why must the file be forcibly rolled

Because we need to guarantee exactly-once. Formats like ORC and Parquet cannot split a single file across multiple writes, so the in-progress file has to be closed (rolled) at each checkpoint.
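
For illustration only, a minimal sketch (not the actual HiveRollingPolicy source): rolling policies for bulk formats in the StreamingFileSink extend CheckpointRollingPolicy, whose shouldRollOnCheckpoint already returns true, so a subclass only decides when to roll between checkpoints:

import java.io.IOException;

import org.apache.flink.streaming.api.functions.sink.filesystem.PartFileInfo;
import org.apache.flink.streaming.api.functions.sink.filesystem.rollingpolicies.CheckpointRollingPolicy;

/**
 * Sketch of a rolling policy for bulk formats (ORC/Parquet). These formats write a footer on
 * close and a finished file cannot be reopened and appended to, so the base class forces a
 * roll on every checkpoint to make the part file final and keep exactly-once; the size and
 * interval checks below only apply between checkpoints.
 */
public class BulkFormatRollingPolicySketch<IN, BucketID> extends CheckpointRollingPolicy<IN, BucketID> {

    private final long rolloverSize;       // mirrors sink.rolling-policy.file-size (bytes)
    private final long rolloverInterval;   // mirrors sink.rolling-policy.rollover-interval (ms)

    public BulkFormatRollingPolicySketch(long rolloverSize, long rolloverInterval) {
        this.rolloverSize = rolloverSize;
        this.rolloverInterval = rolloverInterval;
    }

    @Override
    public boolean shouldRollOnEvent(PartFileInfo<BucketID> partFileState, IN element) throws IOException {
        return partFileState.getSize() >= rolloverSize;
    }

    @Override
    public boolean shouldRollOnProcessingTime(PartFileInfo<BucketID> partFileState, long currentTime) throws IOException {
        return currentTime - partFileState.getCreationTime() >= rolloverInterval;
    }

    // shouldRollOnCheckpoint is inherited from CheckpointRollingPolicy and returns true unconditionally.
}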

On Thu, Sep 17, 2020 at 2:05 PM kandy.wang <kandy1...@163.com> wrote:

>
>
>
> OK, so it comes down to a performance comparison between the Hadoop MR writer and Flink's own native
> writer. At least for now, setting table.exec.hive.fallback-mapred-writer
> to false meets our requirements for writing to Hive.
> There is also a question I asked you before that you have not answered yet:
> Why does HiveRollingPolicy return true for shouldRollOnCheckpoint, i.e. why force the file to roll? Could this be extracted into a configuration parameter?
> With forced rolling, the sink.rolling-policy.rollover-interval and
> sink.rolling-policy.file-size parameters effectively stop working: with a partition every 5 minutes and a checkpoint every 2 minutes, files are rolled before they even reach a few tens of MB, so the configured values become meaningless.
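
To make the concern concrete, a rough back-of-the-envelope with hypothetical numbers (the real cap depends on the actual per-subtask write rate):

// Hypothetical illustration: with a forced roll on every checkpoint, a part file can hold at
// most (checkpoint interval) x (per-subtask write rate) of data, no matter what
// sink.rolling-policy.file-size is configured to.
long checkpointIntervalSec = 120;                 // 2-minute checkpoints
long bytesPerSecondPerSubtask = 400_000L;         // assumed ~0.4 MB/s per subtask (hypothetical)
long maxPartFileBytes = checkpointIntervalSec * bytesPerSecondPerSubtask;   // ~48 MB
// A 128MB rollover size is therefore never reached; every part file is closed well below it.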
> At 2020-09-17 13:43:04, "Jingsong Li" <jingsongl...@gmail.com> wrote:
> >Could you try the latest 1.11.2?
> >
> >https://flink.apache.org/downloads.html
> >
> >On Thu, Sep 17, 2020 at 1:33 PM kandy.wang <kandy1...@163.com> wrote:
> >
> >> It is the master branch code.
> >> The situation you describe is exactly what happens when table.exec.hive.fallback-mapred-writer is true (the default).
> >> After changing it to false the code takes the else branch, and for now the problem is gone.
> >> if (userMrWriter) {
> >>     builder = bucketsBuilderForMRWriter(recordWriterFactory, sd, assigner, rollingPolicy, outputFileConfig);
> >>     LOG.info("Hive streaming sink: Use MapReduce RecordWriter writer.");
> >> } else {
> >>     Optional<BulkWriter.Factory<RowData>> bulkFactory = createBulkWriterFactory(partitionColumns, sd);
> >>     if (bulkFactory.isPresent()) {
> >>         builder = StreamingFileSink.forBulkFormat(
> >>                 new org.apache.flink.core.fs.Path(sd.getLocation()),
> >>                 new FileSystemTableSink.ProjectionBulkFactory(bulkFactory.get(), partComputer))
> >>             .withBucketAssigner(assigner)
> >>             .withRollingPolicy(rollingPolicy)
> >>             .withOutputFileConfig(outputFileConfig);
> >>         LOG.info("Hive streaming sink: Use native parquet&orc writer.");
> >>     } else {
> >>         builder = bucketsBuilderForMRWriter(recordWriterFactory, sd, assigner, rollingPolicy, outputFileConfig);
> >>         LOG.info("Hive streaming sink: Use MapReduce RecordWriter writer because BulkWriter Factory not available.");
> >>     }
> >> }
> >> At 2020-09-17 13:21:40, "Jingsong Li" <jingsongl...@gmail.com> wrote:
> >> >Is it the latest code?
> >> >1.11.2 fixed a bug: https://issues.apache.org/jira/browse/FLINK-19121
> >> >It affects performance. 1.11.2 has passed the vote and will be released soon.
> >> >
> >> >On Thu, Sep 17, 2020 at 12:46 PM kandy.wang <kandy1...@163.com> wrote:
> >> >
> >> >> @Jingsong Li
> >> >>
> >> >> public TableSink createTableSink(TableSinkFactory.Context context) {
> >> >>     CatalogTable table = checkNotNull(context.getTable());
> >> >>     Preconditions.checkArgument(table instanceof CatalogTableImpl);
> >> >>
> >> >>     boolean isGeneric = Boolean.parseBoolean(table.getProperties().get(CatalogConfig.IS_GENERIC));
> >> >>
> >> >>     if (!isGeneric) {
> >> >>         return new HiveTableSink(
> >> >>                 context.getConfiguration().get(HiveOptions.TABLE_EXEC_HIVE_FALLBACK_MAPRED_WRITER),
> >> >>                 context.isBounded(),
> >> >>                 new JobConf(hiveConf),
> >> >>                 context.getObjectIdentifier(),
> >> >>                 table);
> >> >>     } else {
> >> >>         return TableFactoryUtil.findAndCreateTableSink(context);
> >> >>     }
> >> >> }
> >> >>
> >> >>
> >> >> In HiveTableFactory there is a config option, table.exec.hive.fallback-mapred-writer (default true), that controls whether Hadoop's
> >> >> built-in MR writer or Flink's native writer is used to write the ORC/Parquet formats.
> >> >>
> >> >> If it is false, the Flink native writer is used to write Parquet and ORC files;
> >> >>
> >> >> if it is true, the Hadoop mapred record writer is used to write Parquet and ORC files.
> >> >>
> >> >> After changing this parameter to false, with the same resource allocation the TPS reaches 300K.
> >> >>
> >> >> Could it be that the two ORC implementations simply differ in performance? Also, our storage format is ORC; are there any ORC parameters worth tuning, e.g. async flush
> >> >> or other related parameters?
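
A minimal sketch (assuming a Flink 1.11 streaming Table API job with the blink planner) of switching this option off programmatically; registering the Hive catalog and submitting the INSERT statement are omitted:

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class HiveWriterConfigSketch {
    public static void main(String[] args) {
        // Sketch only: make the Hive streaming sink use Flink's native ORC/Parquet writer
        // instead of falling back to the Hadoop mapred record writer.
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());
        tEnv.getConfig().getConfiguration().setBoolean("table.exec.hive.fallback-mapred-writer", false);

        // ... register the Hive catalog here and run the INSERT INTO statement as in the job below ...
    }
}

This is the value that the createTableSink snippet above reads (HiveOptions.TABLE_EXEC_HIVE_FALLBACK_MAPRED_WRITER) when it constructs the HiveTableSink.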
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> At 2020-09-17 11:21:43, "Jingsong Li" <jingsongl...@gmail.com> wrote:
> >> >> >Sink parallelism
> >> >> >As I understand it, this means making the sink parallelism configurable; it has been under discussion for a while with no conclusion yet.
> >> >> >
> >> >> >HDFS performance
> >> >> >Look at what exactly the HDFS bottleneck is: network, request rate, connection count, or disk IO.
> >> >> >
> >> >> >On Wed, Sep 16, 2020 at 8:16 PM kandy.wang <kandy1...@163.com> wrote:
> >> >> >
> >> >> >> The scenario is very simple: kafka2hive.
> >> >> >> -- load into Hive in 5-minute partitions
> >> >> >>
> >> >> >> INSERT INTO hive.temp_.hive_5min
> >> >> >> SELECT
> >> >> >>   arg_service,
> >> >> >>   time_local
> >> >> >>   .....
> >> >> >>   FROM_UNIXTIME((UNIX_TIMESTAMP()/300 * 300), 'yyyyMMdd'),
> >> >> >>   FROM_UNIXTIME((UNIX_TIMESTAMP()/300 * 300), 'HHmm')  -- one partition every 5 minutes
> >> >> >> FROM hive.temp_.kafka_source_pageview /*+ OPTIONS('properties.group.id'='kafka_hive_test', 'scan.startup.mode'='earliest-offset') */;
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> -- kafka source table definition
> >> >> >> CREATE TABLE hive.temp_vipflink.kafka_source_pageview (
> >> >> >>   arg_service string COMMENT 'arg_service',
> >> >> >>   ....
> >> >> >> ) WITH (
> >> >> >>   'connector' = 'kafka',
> >> >> >>   'topic' = '...',
> >> >> >>   'properties.bootstrap.servers' = '...',
> >> >> >>   'properties.group.id' = 'flink_etl_kafka_hive',
> >> >> >>   'scan.startup.mode' = 'group-offsets',
> >> >> >>   'format' = 'json',
> >> >> >>   'json.fail-on-missing-field' = 'false',
> >> >> >>   'json.ignore-parse-errors' = 'true'
> >> >> >> );
> >> >> >> -- sink hive table definition
> >> >> >> CREATE TABLE temp_vipflink.vipflink_dm_log_app_pageview_5min (
> >> >> >>   ....
> >> >> >> )
> >> >> >> PARTITIONED BY (dt string, hm string) stored as orc location
> >> >> >> 'hdfs://ssdcluster/....._5min' TBLPROPERTIES (
> >> >> >>   'sink.partition-commit.trigger'='process-time',
> >> >> >>   'sink.partition-commit.delay'='0 min',
> >> >> >>   'sink.partition-commit.policy.class'='...CustomCommitPolicy',
> >> >> >>   'sink.partition-commit.policy.kind'='metastore,success-file,custom',
> >> >> >>   'sink.rolling-policy.check-interval'='30s',
> >> >> >>   'sink.rolling-policy.rollover-interval'='10min',
> >> >> >>   'sink.rolling-policy.file-size'='128MB'
> >> >> >> );
> >> >> >> From a first look, the bottleneck seems to be writing to HDFS. The HDFS cluster is already SSD-backed, the Kafka topic has 40 partitions,
> >> >> >> and the operator parallelism is 40, yet TPS only reaches around 60-70K; increasing the parallelism brings no improvement.
> >> >> >> Can Flink SQL change the parallelism of an individual operator? I'd like to change the parallelism of just the StreamingFileWriter operator; is there a good way to do that? And for StreamingFileWriter
> >> >> >> itself, are there any tuning parameters that could improve performance?
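
Since a configurable sink parallelism was still under discussion at the time (as noted elsewhere in this thread), one workaround is a hand-written DataStream job, where the file-writing sink can be given its own parallelism independent of the source. A hedged sketch; the output path, the parallelism value, and the provided BulkWriter.Factory are hypothetical:

import org.apache.flink.api.common.serialization.BulkWriter;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
import org.apache.flink.streaming.api.functions.sink.filesystem.rollingpolicies.OnCheckpointRollingPolicy;
import org.apache.flink.table.data.RowData;

public class WriterParallelismSketch {

    // Sketch only: attach a bulk-format file sink whose parallelism is set independently of
    // the upstream Kafka source.
    public static void attachSink(DataStream<RowData> stream, BulkWriter.Factory<RowData> bulkFactory) {
        StreamingFileSink<RowData> sink = StreamingFileSink
                .forBulkFormat(new Path("hdfs://ssdcluster/tmp/pageview_5min"), bulkFactory)
                .withRollingPolicy(OnCheckpointRollingPolicy.build())   // bulk formats roll on checkpoint
                .build();

        stream.addSink(sink)
              .name("streaming-file-writer")
              .setParallelism(80);   // hypothetical value, decoupled from the 40 source subtasks
    }
}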
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> At 2020-09-16 19:29:50, "Jingsong Li" <jingsongl...@gmail.com> wrote:
> >> >> >> >Hi,
> >> >> >> >
> >> >> >> >Could you share the specific test scenario? Do you have a comparison, e.g. against a hand-written DataStream job, to see the performance gap?
> >> >> >> >
> >> >> >> >Also, could you take a jstack dump during the stress test?
> >> >> >> >
> >> >> >> >Best,
> >> >> >> >Jingsong
> >> >> >> >
> >> >> >> >On Wed, Sep 16, 2020 at 2:03 PM kandy.wang <kandy1...@163.com> wrote:
> >> >> >> >
> >> >> >> >> In our stress test, streaming writes to Hive through StreamingFileWriter, with 40 Kafka partitions and source/writer operator parallelism of 40,
> >> >> >> >> consuming Kafka from the earliest offset, only reach about 70K TPS.
> >> >> >> >> I'd like to know: has this streaming-write-to-Hive path been stress-tested? What throughput can it reach?
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> >--
> >> >> >> >Best, Jingsong Lee
> >> >> >>
> >> >> >
> >> >> >
> >> >> >--
> >> >> >Best, Jingsong Lee
> >> >>
> >> >
> >> >
> >> >--
> >> >Best, Jingsong Lee
> >>
> >
> >
> >--
> >Best, Jingsong Lee
>


-- 
Best, Jingsong Lee
