Hi Fabian, Thanks for your response. This question puzzles me for quite a long time. If the praiseAggr has the following value: window-1 100,101,102 window-2 100,101,103 window-3 100
the last time the article table joins praiseAggr, which of the following value does praiseAggr table has? 1— 100,101,102,100,101,103,100 collect all the element of all the window 2— 100 the element of the latest window 3— 101,102,103 the distinct value of all the window Best, Henry > 在 2018年8月21日,下午8:02,Fabian Hueske <fhue...@gmail.com> 写道: > > Hi, > > The semantics of a query do not depend on the way that it is used. > praiseAggr is a table that grows by one row per second and article_id. If you > use that table in a join, the join will fully materialize the table. > This is a special case because the same row is added multiple times, so the > state won't grow that quickly, but the performance will decrease because for > each row from article will join with multiple (a growing number) of rows from > praiseAggr. > > Best, Fabian > > 2018-08-21 12:19 GMT+02:00 徐涛 <happydexu...@gmail.com > <mailto:happydexu...@gmail.com>>: > Hi All, > var praiseAggr = tableEnv.sqlQuery(s"SELECT article_id FROM praise > GROUP BY HOP(updated_time, INTERVAL '1' SECOND,INTERVAL '3' MINUTE) , > article_id" ) > tableEnv.registerTable("praiseAggr", praiseAggr) > var finalTable = tableEnv.sqlQuery(s”SELECT 1 FROM article a join > praiseAggr p on a.article_id=p.article_id" ) > tableEnv.registerTable("finalTable", finalTable) > I know that praiseAggr, if written to sink, is append mode , so if a > table joins praiseAggr, what the table “see”, is a table contains the latest > value, or a table that grows larger and larger? If it is the later, will it > introduce performance problem? > Thanks a lot. > > > Best, > Henry >