[GitHub] flink pull request #2961: [FLINK-5266] [table] eagerly project unused fields...

2016-12-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2961


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2961: [FLINK-5266] [table] eagerly project unused fields...

2016-12-12 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2961#discussion_r91906988
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/table.scala
 ---
@@ -881,24 +883,21 @@ class GroupWindowedTable(
 * }}}
 */
   def select(fields: Expression*): Table = {
--- End diff --

Not sure, I would like to avoid a solution where projections for aggregates 
are injected at different locations in the code, i.e., injecting a projection 
before validation for non-windowed aggregates and after validation (or during 
optimization) for windowed aggregates. I think it would be better to have this 
consistent.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2961: [FLINK-5266] [table] eagerly project unused fields...

2016-12-11 Thread KurtYoung
Github user KurtYoung commented on a diff in the pull request:

https://github.com/apache/flink/pull/2961#discussion_r91885381
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/table.scala
 ---
@@ -881,24 +883,21 @@ class GroupWindowedTable(
 * }}}
 */
   def select(fields: Expression*): Table = {
--- End diff --

Hi @fhueske , i think it will be a more complex situation when window 
involves. However, i want to keep this change simple and straightforward, and 
open another issue to track the scenario when involving windows. Does this make 
sense to you?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2961: [FLINK-5266] [table] eagerly project unused fields...

2016-12-09 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2961#discussion_r91738368
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/table.scala
 ---
@@ -881,24 +883,21 @@ class GroupWindowedTable(
 * }}}
 */
   def select(fields: Expression*): Table = {
--- End diff --

At the moment there is no way to specify watermarks inside of a Table API 
or SQL query. This can only be done on a DataStream before it is converted into 
a Table. Therefore, watermarks and timestamps are already assigned before the 
first Table or SQL operator can remove anything. In case of a TableSource which 
assigns timestamps, the TableSource needs to take care that the assignment 
happens before a pushed-down projection is applied.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2961: [FLINK-5266] [table] eagerly project unused fields...

2016-12-09 Thread KurtYoung
Github user KurtYoung commented on a diff in the pull request:

https://github.com/apache/flink/pull/2961#discussion_r91729382
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/table.scala
 ---
@@ -881,24 +883,21 @@ class GroupWindowedTable(
 * }}}
 */
   def select(fields: Expression*): Table = {
--- End diff --

What if user use a customized watermark extracter which used some fields 
from the element. For example, we have original table source containing 4 
fields: a, b, c, d. And user used "a" field to extract the timestamp and 
watermark. But in the later query on the table, only "b" and "c" are used. And 
if we do a projection on "b" and "c", and the projection is pushed into table 
source further, we will not get field "a" which used to produce timestamp 
anymore. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2961: [FLINK-5266] [table] eagerly project unused fields...

2016-12-09 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2961#discussion_r91691224
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/table.scala
 ---
@@ -881,24 +883,21 @@ class GroupWindowedTable(
 * }}}
 */
   def select(fields: Expression*): Table = {
--- End diff --

Watermarks and timestamps should not be affected by this change. They are 
treated as metadata by Flink and not part of the schema. Also watermarks and 
timestamps should be assigned before the query. We do not support assigning 
watermarks within a query.

I also had a quick look into it. One problem I found was that a window 
alias is handled as an `UnresolvedFieldReference` in `select` here and 
therefore added to the projection. However, the input does to have a field like 
that and validation fails.

During validation, the window alias is correctly recognized. Maybe it makes 
more sense to add the projection at this point by injection an additional 
`Project` with the `RelBuilder`. Another solution could be a `RelOptRule`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2961: [FLINK-5266] [table] eagerly project unused fields...

2016-12-08 Thread KurtYoung
Github user KurtYoung commented on a diff in the pull request:

https://github.com/apache/flink/pull/2961#discussion_r91641034
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/table.scala
 ---
@@ -881,24 +883,21 @@ class GroupWindowedTable(
 * }}}
 */
   def select(fields: Expression*): Table = {
--- End diff --

Oh. My bad, i will add it. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2961: [FLINK-5266] [table] eagerly project unused fields...

2016-12-08 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2961#discussion_r91561592
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/table.scala
 ---
@@ -881,24 +883,21 @@ class GroupWindowedTable(
 * }}}
 */
   def select(fields: Expression*): Table = {
--- End diff --

Can you add the pre-aggregation projection for window aggregates as well?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2961: [FLINK-5266] [table] eagerly project unused fields...

2016-12-07 Thread KurtYoung
GitHub user KurtYoung opened a pull request:

https://github.com/apache/flink/pull/2961

[FLINK-5266] [table] eagerly project unused fields when selecting 
aggregation fields

This PR is based on #2926 , only the second commit is related.

I add a "plan" test directory to hold all the plan level tests. And i also 
did a small refactory for ProjectionTranslator, thought it's better to keep 
each method only do one thing.

@fhueske As we discussed earlier in the jira: 
https://issues.apache.org/jira/browse/FLINK-5266 about where the logics should 
be added. I decided to add them when we selecting fields from a normal or 
grouped table. Since this kind of logics involves some fields references 
rewrite, if we choose to add the needed projection node when we convert the 
LogicalPlan to Calcite's RelNode, we should also take care the whole rewrite 
thing. 

However, if we add the project node in the first place, we only need to 
extract all the field references used in all selecting expressions, and treat 
them as UnresolvedFieldReferences. The validation part will take care of the 
rewrite thing. I think this will be easier and more consistent with other 
procedures. (Noticed all the "construct" logic are fairly simple)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/KurtYoung/flink flink-5266

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2961.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2961


commit 374d231d44f84ae385d8f8adb2353685e1214ff6
Author: Kurt Young 
Date:   2016-12-08T01:27:55Z

[FLINK-5226] [table] Use correct DataSetCostFactory and improve DataSetCalc 
costs.

commit 8a3ecf8e6362acd9370b11f08018a0143fc9be18
Author: Kurt Young 
Date:   2016-12-08T02:35:43Z

[FLINK-5266] [table] eagerly project unused fields when selecting 
aggregation fields

Add a "plan" test dir to hold all the plan level unit tests

Small refactory with ProjectionTranslator, keep each method handle one 
single thing




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---