Re: Dynamic Table Options 被优化器去掉了

2021-04-26 文章 Shengkai Fang
hi, macial kk.

看样子是个bug,能提供以下你的ddl以及相关的环境吗?方便我们复现下问题。

Best,
Shengkai

plan的digest是不会打印connector的option的值的,因此你是没有办法通过plan来判断是否生效了。
macia kk  于2021年4月26日周一 上午12:31写道:

> Hi
>
>   我有在使用 temporal Joini 的时候有设置 如果读取分区的相关的 dynamic
> option,但是最后是没有生效的,我看全部使用的默认参数,打印出来了执行计划,逻辑执行计划是有的,优化之后没有了
>   如下,我设置的是加载最新分区,24小时加载一次,我看最后运行的日志是加载的全部分区,1小时有一次加载,这都是默认的参数,所以怀疑是 dyanmic
> option 没有生效。
>
>
> == Abstract Syntax Tree ==
> +- LogicalSnapshot(period=[$cor0.proctime])
>+- LogicalTableScan(table=[[ds, my_db, store_da_table,
> source: [HiveTableSource(store_id, store_name, merchant_id, tag_id,
> brand_id, tob_user_id, is_use_wallet, is_use_merchant_app, longitude,
> latitude, state, city, district, address, postal_code, register_phone,
> email, email_source, register_time, logo, banner, partner_type,
> commission_rate, tax_rate, service_fee, min_spend, delivery_distance,
> preparation_time, contact_phone, store_status, closed_start_time,
> closed_end_time, effective_closed_end_time, auto_confirmed,
> auto_confirmed_enabled, create_time, update_time, rating_total,
> rating_score, opening_status, surcharge_intervals, service_charge_fee_rate,
> driver_modify_order_enabled, delivery_distance_mode, business_info_added,
> mtime, dt, grass_region) TablePath: my_db.store_da_table, PartitionPruned:
> false, PartitionNums: null], dynamic options:
> {streaming-source.enable=true, streaming-source.monitor-interval=24 h,
> streaming-source.partition.include=latest}]])
>
> == Optimized Logical Plan ==
> Calc(select=[_UTF-16LE'v4' AS version, _UTF-16LE'ID' AS country, city, id,
> event_time, operation, platform, payment_method, gmv, 0.0:DECIMAL(2, 1) AS
> gmv_usd], where=[NOT(LIKE(UPPER(store_name), _UTF-16LE'%[TEST]%'))])
> +- LookupJoin(table=[ds.my_db.store_da_table],
> joinType=[LeftOuterJoin], async=[false], lookup=[store_id=store_id],
> select=[city, id, event_time, operation, platform, payment_method, gmv,
> store_id, store_id, store_name])
>+- Union(all=[true], union=[city, id, event_time, operation, platform,
> payment_method, gmv, store_id])
>   :- Calc(select=[delivery_city AS city, id, /(CAST(create_time), 1000)
> AS event_time, CASE(OR(=(order_status, 440), =(order_status, 800)),
> _UTF-16LE'NET':VARCHAR(5) CHARACTER SET "UTF-16LE",
> _UTF-16LE'GROSS':VARCHAR(5) CHARACTER SET "UTF-16LE") AS operation,
> _UTF-16LE'' AS platform, payment_method, /(CAST(total_amount), 10)
> AS gmv, CAST(store_id) AS store_id])
>   :  +- DataStreamScan(table=[[ds, keystats,
> main_db__transaction_tab]], fields=[id, delivery_city, store_id,
> create_time, payment_time, order_status, payment_method, total_amount,
> proctime], reuse_id=[1])
>   +- Calc(select=[delivery_city AS city, id, /(CAST(payment_time),
> 1000) AS event_time, _UTF-16LE'NET':VARCHAR(5) CHARACTER SET "UTF-16LE" AS
> operation, _UTF-16LE'AIRPAY' AS platform, payment_method,
> /(CAST(total_amount), 10) AS gmv, CAST(store_id) AS store_id],
> where=[OR(=(order_status, 440), =(order_status, 800))])
>  +- Reused(reference_id=[1])
>


Dynamic Table Options 被优化器去掉了

2021-04-25 文章 macia kk
Hi

  我有在使用 temporal Joini 的时候有设置 如果读取分区的相关的 dynamic
option,但是最后是没有生效的,我看全部使用的默认参数,打印出来了执行计划,逻辑执行计划是有的,优化之后没有了
  如下,我设置的是加载最新分区,24小时加载一次,我看最后运行的日志是加载的全部分区,1小时有一次加载,这都是默认的参数,所以怀疑是 dyanmic
option 没有生效。


== Abstract Syntax Tree ==
+- LogicalSnapshot(period=[$cor0.proctime])
   +- LogicalTableScan(table=[[ds, my_db, store_da_table,
source: [HiveTableSource(store_id, store_name, merchant_id, tag_id,
brand_id, tob_user_id, is_use_wallet, is_use_merchant_app, longitude,
latitude, state, city, district, address, postal_code, register_phone,
email, email_source, register_time, logo, banner, partner_type,
commission_rate, tax_rate, service_fee, min_spend, delivery_distance,
preparation_time, contact_phone, store_status, closed_start_time,
closed_end_time, effective_closed_end_time, auto_confirmed,
auto_confirmed_enabled, create_time, update_time, rating_total,
rating_score, opening_status, surcharge_intervals, service_charge_fee_rate,
driver_modify_order_enabled, delivery_distance_mode, business_info_added,
mtime, dt, grass_region) TablePath: my_db.store_da_table, PartitionPruned:
false, PartitionNums: null], dynamic options:
{streaming-source.enable=true, streaming-source.monitor-interval=24 h,
streaming-source.partition.include=latest}]])

== Optimized Logical Plan ==
Calc(select=[_UTF-16LE'v4' AS version, _UTF-16LE'ID' AS country, city, id,
event_time, operation, platform, payment_method, gmv, 0.0:DECIMAL(2, 1) AS
gmv_usd], where=[NOT(LIKE(UPPER(store_name), _UTF-16LE'%[TEST]%'))])
+- LookupJoin(table=[ds.my_db.store_da_table],
joinType=[LeftOuterJoin], async=[false], lookup=[store_id=store_id],
select=[city, id, event_time, operation, platform, payment_method, gmv,
store_id, store_id, store_name])
   +- Union(all=[true], union=[city, id, event_time, operation, platform,
payment_method, gmv, store_id])
  :- Calc(select=[delivery_city AS city, id, /(CAST(create_time), 1000)
AS event_time, CASE(OR(=(order_status, 440), =(order_status, 800)),
_UTF-16LE'NET':VARCHAR(5) CHARACTER SET "UTF-16LE",
_UTF-16LE'GROSS':VARCHAR(5) CHARACTER SET "UTF-16LE") AS operation,
_UTF-16LE'' AS platform, payment_method, /(CAST(total_amount), 10)
AS gmv, CAST(store_id) AS store_id])
  :  +- DataStreamScan(table=[[ds, keystats,
main_db__transaction_tab]], fields=[id, delivery_city, store_id,
create_time, payment_time, order_status, payment_method, total_amount,
proctime], reuse_id=[1])
  +- Calc(select=[delivery_city AS city, id, /(CAST(payment_time),
1000) AS event_time, _UTF-16LE'NET':VARCHAR(5) CHARACTER SET "UTF-16LE" AS
operation, _UTF-16LE'AIRPAY' AS platform, payment_method,
/(CAST(total_amount), 10) AS gmv, CAST(store_id) AS store_id],
where=[OR(=(order_status, 440), =(order_status, 800))])
 +- Reused(reference_id=[1])