Hi Hequn,
Another question: in some cases, I think updating the timestamp of the
retract row is reasonable. For example, some users do not want a hard
delete but a soft delete, so I write code that only does a soft delete when
the retract row arrives. In that case I want the update_timestamp to be
different, so that the ETL program can tell that this row has changed.
For example, if the value is updated from 1 to 2,
previous row: add (a, 1, 2018-08-20 20:18:10.286)
retract row: delete (a, 1, 2018-08-20 20:18:10.386)
new row: add (a, 2, 2018-08-20 20:18:10.486)
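
The soft-delete handling described above could be sketched roughly as follows. This is plain Scala with no Flink dependency: SinkRow, Change, and SoftDeleteSink are hypothetical names used only for illustration (they are not Flink APIs), and the in-memory map is an assumed stand-in for the real sink table keyed by article_id.

```scala
import scala.collection.mutable

// Hypothetical shape of a row in the sink table (illustrative names only).
final case class SinkRow(articleId: String, pu: Long, updateTimestamp: String, deleted: Boolean)

// One retract-stream message: isAdd = true for "add", false for "delete" (retract).
final case class Change(isAdd: Boolean, articleId: String, pu: Long, timestamp: String)

// In-memory stand-in for the external table, keyed by article_id.
final class SoftDeleteSink {
  private val table = mutable.LinkedHashMap.empty[String, SinkRow]

  def apply(change: Change): Unit =
    if (change.isAdd) {
      // Add message: upsert the row and clear any soft-delete mark.
      table(change.articleId) =
        SinkRow(change.articleId, change.pu, change.timestamp, deleted = false)
    } else {
      // Retract message: match on the key only, because the retract row's
      // timestamp differs from the previously added row. Mark the row as
      // deleted and record the retract row's timestamp.
      table.get(change.articleId).foreach { row =>
        table(change.articleId) =
          row.copy(updateTimestamp = change.timestamp, deleted = true)
      }
    }

  def get(articleId: String): Option[SinkRow] = table.get(articleId)
}

object SoftDeleteExample {
  def main(args: Array[String]): Unit = {
    val sink = new SoftDeleteSink
    // The three messages from the example above, applied in order.
    sink(Change(true, "a", 1L, "2018-08-20 20:18:10.286"))
    sink(Change(false, "a", 1L, "2018-08-20 20:18:10.386"))
    println(sink.get("a")) // row is soft-deleted, timestamp taken from the retract row
    sink(Change(true, "a", 2L, "2018-08-20 20:18:10.486"))
    println(sink.get("a")) // row is live again with the new value
  }
}
```

Matching on the key rather than on the whole row is a deliberate choice in this sketch: with a changing timestamp column, the retract row never equals the previously added row.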
> On Aug 21, 2018, at 12:25 PM, Hequn Cheng <[email protected]> wrote:
>
> Hi Henry,
>
> You are right that, in MySQL, SYSDATE returns the time at which it executes
> while LOCALTIMESTAMP returns a constant time that indicates the time at which
> the statement began to execute.
> But other database systems don't seem to have this constraint (correct me if
> I'm wrong). Sometimes we don't have to follow MySQL.
>
> Best, Hequn
>
> On Tue, Aug 21, 2018 at 10:21 AM, 徐涛 <[email protected]
> <mailto:[email protected]>> wrote:
> Hi Hequn,
> Maybe I did not express myself clearly. I mean that updating only the
> update_timestamp of the incremental data is not enough. The SQL expresses
> the idea that “all the timestamps in the table are the same”, but actually
> each item in the table may have a different one. It is a bit weird.
>
> Best, Henry
>
>
>
>> On Aug 21, 2018, at 10:09 AM, Hequn Cheng <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>> Hi Henry,
>>
>> If you upsert by key 'article_id', the result is correct, i.e., the result is
>> (a, 2, 2018-08-20 20:18:10.486). What do you think?
>>
>> Best, Hequn
>>
>>
>>
>> On Tue, Aug 21, 2018 at 9:44 AM, 徐涛 <[email protected]
>> <mailto:[email protected]>> wrote:
>> Hi Hequn,
>> However, is it semantically correct? The SQL result is not equal to the
>> result of the same query on the bounded table.
>>
>>
>>> On Aug 20, 2018, at 8:34 PM, Hequn Cheng <[email protected]
>>> <mailto:[email protected]>> wrote:
>>>
>>> Hi Henry,
>>>
>>> Both SQL queries output incrementally.
>>>
>>> However, there are some problems if you use a retract sink. You have to pay
>>> attention to the timestamp field, since its value is different each time.
>>> For example, if the value is updated from 1 to 2,
>>> previous row: add (a, 1, 2018-08-20 20:18:10.286)
>>> retract row: delete (a, 1, 2018-08-20 20:18:10.386)
>>> new row: add (a, 2, 2018-08-20 20:18:10.486)
>>> The retract row is different from the previous row because of the time
>>> field.
>>>
>>> Of course, this problem should be fixed later.
>>>
>>> Best, Hequn
>>>
>>> On Mon, Aug 20, 2018 at 6:43 PM, 徐涛 <[email protected]
>>> <mailto:[email protected]>> wrote:
>>> Hi All,
>>> With the following code, if I use a retract stream, I think Flink is able
>>> to know which item is modified (if praise has 10000 items now, then when one
>>> item comes into the stream, only a very small amount of data is written to
>>> the sink):
>>> var praiseAggr = tableEnv.sqlQuery(
>>>   s"SELECT article_id, hll(uid) as PU FROM praise group by article_id")
>>> tableEnv.registerTable("finalTable", praiseAggr)
>>> tableEnv.sqlUpdate(s"insert into sinkTableName SELECT * from finalTable")
>>>
>>> But if I use the following SQL, which adds a dynamic timestamp field:
>>> var praiseAggr = tableEnv.sqlQuery(
>>>   s"SELECT article_id, hll(uid) as PU, LOCALTIMESTAMP as update_timestamp FROM praise group by article_id")
>>> is the whole table flushed to the sink, or is only the incremental value
>>> flushed? Why?
>>>
>>> Thanks,
>>> Henry
>>>
>>>
>>
>>
>
>