Hi Fabian,
Thanks for your response. This question has puzzled me for quite a long
time.
If praiseAggr has the following values:
window-1 100,101,102
window-2 100,101,103
window-3 100
then the last time the article table joins praiseAggr, which of the
following does the praiseAggr table contain?
1. 100,101,102,100,101,103,100 (all elements of all windows)
2. 100 (the elements of the latest window only)
3. 100,101,102,103 (the distinct values across all windows)
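To make the three candidates concrete, here is a minimal sketch in plain Scala. It does not use Flink at all; the object and value names are made up for illustration, and it only derives each candidate from the window contents listed above. It does not claim which one Flink actually produces.

```scala
// Plain Scala, no Flink dependency. All names here are hypothetical.
object PraiseAggrOptions {
  // Rows emitted per window, in emission order (values from the example above).
  val windows: Seq[Seq[Int]] = Seq(
    Seq(100, 101, 102), // window-1
    Seq(100, 101, 103), // window-2
    Seq(100)            // window-3
  )

  // Candidate 1: every row of every window is kept.
  val allRows: Seq[Int] = windows.flatten

  // Candidate 2: only the rows of the latest window are kept.
  val latestWindow: Seq[Int] = windows.last

  // Candidate 3: the distinct values across all windows (first occurrence order).
  val distinctRows: Seq[Int] = windows.flatten.distinct

  def main(args: Array[String]): Unit = {
    println(s"candidate 1: ${allRows.mkString(",")}")
    println(s"candidate 2: ${latestWindow.mkString(",")}")
    println(s"candidate 3: ${distinctRows.mkString(",")}")
  }
}
```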
Best,
Henry
> On Aug 21, 2018, at 8:02 PM, Fabian Hueske <[email protected]> wrote:
>
> Hi,
>
> The semantics of a query do not depend on the way that it is used.
> praiseAggr is a table that grows by one row per second and article_id. If you
> use that table in a join, the join will fully materialize the table.
> This is a special case because the same row is added multiple times, so the
> state won't grow that quickly, but the performance will decrease because
> each row from article will join with multiple (a growing number of) rows from
> praiseAggr.
>
> Best, Fabian
>
> 2018-08-21 12:19 GMT+02:00 徐涛 <[email protected]>:
> Hi All,
> // One result row per article_id and sliding (HOP) window.
> val praiseAggr = tableEnv.sqlQuery(
>   """SELECT article_id FROM praise
>     |GROUP BY HOP(updated_time, INTERVAL '1' SECOND, INTERVAL '3' MINUTE),
>     |article_id""".stripMargin)
> tableEnv.registerTable("praiseAggr", praiseAggr)
> // Join article against the windowed aggregate.
> val finalTable = tableEnv.sqlQuery(
>   """SELECT 1 FROM article a JOIN
>     |praiseAggr p ON a.article_id = p.article_id""".stripMargin)
> tableEnv.registerTable("finalTable", finalTable)
> I know that praiseAggr, if written to a sink, is in append mode. So if a
> table joins praiseAggr, what does that table "see": a table that contains
> only the latest values, or a table that grows larger and larger? If it is the
> latter, will it introduce performance problems?
> Thanks a lot.
>
>
> Best,
> Henry
>