[jira] [Commented] (FLINK-10875) Add `toTableWithTimestamp` method in `DataStreamConversions`

2019-02-25 Thread sunjincheng (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777616#comment-16777616
 ] 

sunjincheng commented on FLINK-10875:
-------------------------------------

I did not find a better way, so I am closing this JIRA for now.

> Add `toTableWithTimestamp` method in `DataStreamConversions`
> 
>
> Key: FLINK-10875
> URL: https://issues.apache.org/jira/browse/FLINK-10875
> Project: Flink
> Issue Type: Improvement
> Components: Table API & SQL
> Reporter: sunjincheng
> Assignee: sunjincheng
> Priority: Minor
> Fix For: 1.7.3
>
>
> Currently we convert a `DataStream` to a `Table` by  
> `DataStreamConversions#toTable`, e.g.:
> {code:java}
> // Without TimeAttribute
> ...
> val stream = env.fromCollection(...)
> val tab = stream.toTable(tEnv, 'a, 'b, 'c)
> val result = tab.select('a, 'b)
> 
> // With TimeAttribute
> ...
> val stream = env.fromCollection(...).assignTimestampsAndWatermarks(...)
> val tab = stream.toTable(tEnv, 'a, 'b, 'c, 'ts.rowtime)
> val result = tab.window(Session withGap 5.milli on 'ts as 'w)
> ...{code}
> I think the fieldNames parameter of the `toTable` method is reasonable for a 
> conversion without a time attribute, because the fieldNames correspond to the 
> fields of the physical table. When it is used for a conversion with a time 
> attribute, however, the time attribute column is silently added to the table. 
> This feels like magic, so I recommend adding a method that lets the user 
> explicitly declare the time attribute added to the physical table: 
> `toTableWithTimestamp`. The time attribute column is named by the user, and 
> its kind is determined by the configured TimeCharacteristic, e.g.:
> {code:java}
> env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
> ...
> val table = stream.toTableWithTimestamp(tEnv, 'count, 'size, 'name, 'ts)
>   .window(Tumble over 2.rows on 'ts as 'w)
> ...
> {code}
> In the example above, Flink will mark `ts` as a `RowtimeAttribute`.
> What do you think?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10875) Add `toTableWithTimestamp` method in `DataStreamConversions`

2018-11-18 Thread sunjincheng (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16691161#comment-16691161
 ] 

sunjincheng commented on FLINK-10875:
-------------------------------------

Thanks for the feedback, [~fhueske]!

Using both `proctime` and `rowtime` on a table is a good point. One way to 
solve this would be to add more methods, but adding too many methods is not 
the best solution. IMO it is reasonable to keep the status quo until we come 
up with a better way. (I will think about this question again.)

> Add `toTableWithTimestamp` method in `DataStreamConversions`
> 
>
> Key: FLINK-10875
> URL: https://issues.apache.org/jira/browse/FLINK-10875
> Project: Flink
> Issue Type: Improvement
> Components: Table API & SQL
> Reporter: sunjincheng
> Assignee: sunjincheng
> Priority: Minor
> Fix For: 1.7.1


[jira] [Commented] (FLINK-10875) Add `toTableWithTimestamp` method in `DataStreamConversions`

2018-11-15 Thread Fabian Hueske (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687896#comment-16687896
 ] 

Fabian Hueske commented on FLINK-10875:
---------------------------------------

Thanks for the proposal [~sunjincheng121].

To be honest, I don't think it would improve the API, and it would be more 
limited in its use cases than the current API.

Currently, users can arrange the fields arbitrarily (order the fields as they 
like) and inject an event-time *and/or* processing-time attribute at any 
position. It is very obvious which attributes are time attributes, and the 
resulting schema is obvious as well. I don't quite follow this argument:

bq. but when applied to the conversion with the time attribute, the time 
attribute column is silently added to the table. This feeling is very Magical

In my opinion, it is very clear which field is a time attribute 
(indicated by {{.rowtime}} or {{.proctime}}), and it is not at all "silently 
added" but explicitly selected.

In my opinion, the new approach does not improve the situation because it is in 
fact not clear which attribute is a time attribute and where it would be added. 
The proposal is to add it at the end, but a user can only find that out by 
trying or reading the docs. Also, it is less expressive because the attribute 
is always added at the end (the current approach allows for arbitrary 
positions). Moreover, it is not clear to me how to handle situations where 
users want a processing-time attribute, or both an event-time and a 
processing-time attribute.
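
For illustration, here is a minimal sketch of the existing `toTable` call with 
an event-time and a processing-time attribute in the same schema. It assumes 
the Flink 1.7-era Scala Table API on the classpath. The object name, the 
sample records, and the field names `name`, `size` and `pt` are made up for 
this example and do not come from the issue.
{code:scala}
import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.scala._
import org.apache.flink.table.api.TableEnvironment
import org.apache.flink.table.api.scala._

// Hypothetical example object; not part of Flink or of the proposal.
object TimeAttributeSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
    val tEnv = TableEnvironment.getTableEnvironment(env)

    // Assumed sample records: (event time in millis, name, size).
    val stream = env
      .fromCollection(Seq((1L, "a", 10), (2L, "b", 20)))
      .assignAscendingTimestamps(_._1)

    // 'ts.rowtime takes the place of the first (Long) field,
    // 'pt.proctime is appended as an additional logical field.
    val tab = stream.toTable(tEnv, 'ts.rowtime, 'name, 'size, 'pt.proctime)

    // Print the resulting schema to see which fields are time attributes.
    tab.printSchema()
  }
}
{code}
Both time attributes are marked explicitly in the field list, which is the 
case the proposed `toTableWithTimestamp` does not obviously cover.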

And finally, I think APIs should aim to be concise and limit the number of 
methods to the set that is required. Since the proposed method only covers a 
subset of the (well-supported) use cases, I think we should not add it.

> Add `toTableWithTimestamp` method in `DataStreamConversions`
> 
>
> Key: FLINK-10875
> URL: https://issues.apache.org/jira/browse/FLINK-10875
> Project: Flink
> Issue Type: Improvement
> Components: Table API & SQL
> Reporter: sunjincheng
> Assignee: sunjincheng
> Priority: Minor
> Fix For: 1.7.1


[jira] [Commented] (FLINK-10875) Add `toTableWithTimestamp` method in `DataStreamConversions`

2018-11-14 Thread sunjincheng (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686220#comment-16686220
 ] 

sunjincheng commented on FLINK-10875:
-------------------------------------

Hi [~fhueske] and [~twalthr], does this change make sense to you? I would 
appreciate it if you could give me feedback!

> Add `toTableWithTimestamp` method in `DataStreamConversions`
> 
>
> Key: FLINK-10875
> URL: https://issues.apache.org/jira/browse/FLINK-10875
> Project: Flink
> Issue Type: Improvement
> Components: Table API & SQL
> Reporter: sunjincheng
> Assignee: sunjincheng
> Priority: Minor
> Fix For: 1.7.1