[jira] [Commented] (FLINK-7060) Change annotation in TypeInformation subclasses

2017-06-30 Thread Greg Hogan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070908#comment-16070908
 ] 

Greg Hogan commented on FLINK-7060:
---

The {{Public}} annotation denotes stable inheritance, so this looks to be as 
intended and not something we can change without breaking the API.
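
For illustration, a minimal sketch of the pattern under discussion (the class name is hypothetical; the annotations are the ones from flink-annotations):

{code:java}
import org.apache.flink.annotation.Public;
import org.apache.flink.annotation.PublicEvolving;

// Current state: the class itself is @Public, i.e. stable for user subclassing,
// while the individual methods are only @PublicEvolving and may still change.
@Public
public class ExampleTypeInfo {

    @PublicEvolving
    public boolean isBasicType() {
        return false;
    }
}
{code}

Demoting the class annotation to {{PublicEvolving}} would weaken that inheritance guarantee, which is the API break referred to above.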

> Change annotation in TypeInformation subclasses
> ---
>
> Key: FLINK-7060
> URL: https://issues.apache.org/jira/browse/FLINK-7060
> Project: Flink
>  Issue Type: Improvement
>  Components: Type Serialization System
>Reporter: Kostas Kloudas
>Priority: Minor
>
> Currently, many subclasses of {{TypeInformation}} are annotated as {{Public}} 
> while all their methods are {{PublicEvolving}}. As an example, check out 
> {{GenericTypeInfo}}. 
> It would be clearer if the whole class were annotated with {{PublicEvolving}} 
> instead of {{Public}} and all the methods were left without annotations.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6877) Activate checkstyle for runtime/security

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070869#comment-16070869
 ] 

ASF GitHub Bot commented on FLINK-6877:
---

Github user hsaputra commented on the issue:

https://github.com/apache/flink/pull/4095
  
+1 for merging. The longer it waits, the more conflicts it will cause.


> Activate checkstyle for runtime/security
> 
>
> Key: FLINK-6877
> URL: https://issues.apache.org/jira/browse/FLINK-6877
> Project: Flink
>  Issue Type: Improvement
>  Components: Local Runtime
>Affects Versions: 1.4.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Trivial
> Fix For: 1.4.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4095: [FLINK-6877] [runtime] Activate checkstyle for runtime/se...

2017-06-30 Thread hsaputra
Github user hsaputra commented on the issue:

https://github.com/apache/flink/pull/4095
  
+1 for merging. The longer it waits, the more conflicts it will cause.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-6075) Support Limit/Top(Sort) for Stream SQL

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070721#comment-16070721
 ] 

ASF GitHub Bot commented on FLINK-6075:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3889


> Support Limit/Top(Sort) for Stream SQL
> --
>
> Key: FLINK-6075
> URL: https://issues.apache.org/jira/browse/FLINK-6075
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: radu
>  Labels: features
> Attachments: sort.png
>
>
> These will be split into 3 separate JIRA issues. The design is the same for 
> all of them; only the processing function differs in terms of the output.
> Time target: Proc Time
> **SQL targeted query examples:**
> *Sort example*
> Q1) `SELECT a FROM stream1 GROUP BY HOP(proctime, INTERVAL '1' HOUR, INTERVAL '3' HOUR) ORDER BY b`
> Comment: window is defined using GROUP BY
> Comment: ASC or DESC keywords can be placed to mark the ordering type
> *Limit example*
> Q2) `SELECT a FROM stream1 WHERE rowtime BETWEEN current_timestamp - INTERVAL 
> '1' HOUR AND current_timestamp ORDER BY b LIMIT 10`
> Comment: window is defined using time ranges in the WHERE clause
> Comment: window is row triggered
> *Top example*
> Q3) `SELECT sum(a) OVER (ORDER BY proctime RANGE INTERVAL '1' HOUR PRECEDING 
> LIMIT 10) FROM stream1`  
> Comment: limit over the contents of the sliding window
> General Comments:
> -All these SQL clauses are supported only over windows (bounded collections 
> of data). 
> -Each of the 3 operators will be supported with each of the types of 
> expressing the windows. 
> **Description**
> The 3 operations (limit, top, and sort) are similar in behavior, as they all 
> require a sorted collection of the data on which the logic will be applied 
> (i.e., select a subset of the items or the entire sorted set). In a streaming 
> context these functions only make sense over a window: without a window they 
> could never emit, as the sort operation would never trigger. If an SQL query 
> is provided without such a bound, an error will be thrown 
> (`SELECT a FROM stream1 TOP 10` -> ERROR). 
> Although not targeted by this JIRA, when working on event-time order, the 
> retraction and lateness mechanisms of windows can be used to deal with 
> out-of-order events and retractions/updates of results.
> **Functionality example**
> We exemplify with the query below for all 3 types of operators (sort, limit, 
> and top). Rowtime indicates when the HOP window will trigger, which can be 
> observed in the fact that outputs are generated only at those moments. The 
> HOP windows trigger every hour (on the hour) and each event contributes to / 
> is duplicated across 2 consecutive hour intervals. Proctime indicates the 
> processing time when a new event arrives in the system. Events are of the 
> type (a,b), with the ordering applied on the b field.
> `SELECT a FROM stream1 HOP(proctime, INTERVAL '1' HOUR, INTERVAL '2' HOUR) ORDER BY b (LIMIT 2 / TOP 2 / [ASC/DESC])`
> ||Rowtime||Proctime||Stream1||Limit 2||Top 2||Sort [ASC]||
> | |10:00:00|(aaa, 11)| | | |
> | |10:05:00|(aab, 7)| | | |
> |10-11|11:00:00| |aab,aaa|aab,aaa|aab,aaa|
> | |11:03:00|(aac,21)| | | |
> |11-12|12:00:00| |aab,aaa|aab,aaa|aab,aaa,aac|
> | |12:10:00|(abb,12)| | | |
> | |12:15:00|(abb,12)| | | |
> |12-13|13:00:00| |abb,abb|abb,abb|abb,abb,aac|
> |...| | | | | |
> **Implementation option**
> Considering that the SQL operators will be associated with window boundaries, 
> the functionality will be implemented within the logic of the window as 
> follows.
> * Window assigner – selected based on the type of window used in SQL 
> (TUMBLING, SLIDING…)
> * Evictor/Trigger – time or count evictor based on the definition of the 
> window boundaries
> * Apply – window function that sorts the data and selects the output to emit 
> (based on the LIMIT/TOP parameters). All data is sorted at once and the 
> result output when the window is triggered.
> An alternative implementation is to use a fold window function to sort the 
> elements as they arrive, one at a time, followed by a flatMap to limit the 
> number of outputs. A sketch of the window-apply option follows the figure below.
> !sort.png!
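> For illustration, a minimal sketch of the window-apply option (field names, the LIMIT value, and the 1.3-era DataStream API usage are assumptions):
> {code:java}
> import java.util.ArrayList;
> import java.util.Comparator;
> import java.util.List;
>
> import org.apache.flink.api.java.tuple.Tuple;
> import org.apache.flink.api.java.tuple.Tuple2;
> import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
> import org.apache.flink.streaming.api.windowing.assigners.SlidingProcessingTimeWindows;
> import org.apache.flink.streaming.api.windowing.time.Time;
> import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
> import org.apache.flink.util.Collector;
>
> // Sort the window contents on field b and emit the first 2 elements (LIMIT 2).
> input
>     .keyBy(0)
>     .window(SlidingProcessingTimeWindows.of(Time.hours(2), Time.hours(1)))
>     .apply(new WindowFunction<Tuple2<String, Integer>, String, Tuple, TimeWindow>() {
>         @Override
>         public void apply(Tuple key, TimeWindow window,
>                 Iterable<Tuple2<String, Integer>> elements, Collector<String> out) {
>             List<Tuple2<String, Integer>> sorted = new ArrayList<>();
>             for (Tuple2<String, Integer> e : elements) {
>                 sorted.add(e);
>             }
>             sorted.sort(Comparator.comparingInt(e -> e.f1)); // ORDER BY b ASC
>             for (int i = 0; i < Math.min(2, sorted.size()); i++) { // LIMIT 2
>                 out.collect(sorted.get(i).f0);
>             }
>         }
>     });
> {code}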
> **General logic of Join**
> ```
> inputDataStream.window(new 

[GitHub] flink pull request #3889: [FLINK-6075] - Support Limit/Top(Sort) for Stream ...

2017-06-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3889


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-7044) Add methods to the client API that take the stateDescriptor.

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070478#comment-16070478
 ] 

ASF GitHub Bot commented on FLINK-7044:
---

Github user kl0u commented on the issue:

https://github.com/apache/flink/pull/4225
  
@aljoscha thanks for the review, I updated the PR.


> Add methods to the client API that take the stateDescriptor.
> 
>
> Key: FLINK-7044
> URL: https://issues.apache.org/jira/browse/FLINK-7044
> Project: Flink
>  Issue Type: Improvement
>  Components: Queryable State
>Affects Versions: 1.3.0, 1.3.1
>Reporter: Kostas Kloudas
>Assignee: Kostas Kloudas
> Fix For: 1.4.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4225: [FLINK-7044] [queryable-st] Allow to specify namespace an...

2017-06-30 Thread kl0u
Github user kl0u commented on the issue:

https://github.com/apache/flink/pull/4225
  
@aljoscha thanks for the review, I updated the PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-7060) Change annotation in TypeInformation subclasses

2017-06-30 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-7060:
-

 Summary: Change annotation in TypeInformation subclasses
 Key: FLINK-7060
 URL: https://issues.apache.org/jira/browse/FLINK-7060
 Project: Flink
  Issue Type: Improvement
  Components: Type Serialization System
Reporter: Kostas Kloudas
Priority: Minor


Currently, many subclasses of {{TypeInformation}} are annotated as {{Public}} 
while all their methods are {{PublicEvolving}}. As an example, check out 
{{GenericTypeInfo}}. 

It would be clearer if the whole class were annotated with {{PublicEvolving}} 
instead of {{Public}} and all the methods were left without annotations.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7030) Build with scala-2.11 by default

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070451#comment-16070451
 ] 

ASF GitHub Bot commented on FLINK-7030:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4209
  
I'm aware of that. I will merge it over the weekend with a bunch of other 
PRs when the Flink Travis isn't so busy.


> Build with scala-2.11 by default
> 
>
> Key: FLINK-7030
> URL: https://issues.apache.org/jira/browse/FLINK-7030
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>
> As proposed recently on the dev mailing list.
> I propose to switch to Scala 2.11 as the default and to have a Scala 2.10 
> build profile; currently it is the other way around. The reason is poor 
> support for build profiles in IntelliJ: I was unable to make it work after I 
> added the Kafka 0.11 dependency (Kafka 0.11 dropped support for Scala 2.10).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4209: [FLINK-7030] Build with scala-2.11 by default

2017-06-30 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4209
  
I'm aware of that. I will merge it over the weekend with a bunch of other 
PRs when the Flink Travis isn't so busy.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-6043) Display time when exceptions/root cause of failure happened

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070449#comment-16070449
 ] 

ASF GitHub Bot commented on FLINK-6043:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/3583#discussion_r125088693
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java
 ---
@@ -1070,9 +1088,9 @@ public void suspend(Throwable suspensionCause) {
 * exceptions that indicate a bug or an unexpected call race), and 
where a full restart is the
 * safe way to get consistency back.
 * 
-* @param t The exception that caused the failure.
+* @param errorInfo ErrorInfo containing the exception that caused the 
failure.
 */
-   public void failGlobal(Throwable t) {
+   public void failGlobal(ErrorInfo errorInfo) {
--- End diff --

Why don't we do a null check for `errorInfo` similar to the one in 
`suspend`? I think it should be fine to assume that this parameter is not null.
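
For reference, a minimal sketch of the suggested precondition (mirroring the one in `suspend`; assumes Flink's `Preconditions` utility):

```
import static org.apache.flink.util.Preconditions.checkNotNull;

public void failGlobal(ErrorInfo errorInfo) {
    checkNotNull(errorInfo, "errorInfo must not be null");
    // ... existing failure handling ...
}
```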


> Display time when exceptions/root cause of failure happened
> ---
>
> Key: FLINK-6043
> URL: https://issues.apache.org/jira/browse/FLINK-6043
> Project: Flink
>  Issue Type: Improvement
>  Components: Webfrontend
>Affects Versions: 1.3.0
>Reporter: Till Rohrmann
>Assignee: Chesnay Schepler
>Priority: Minor
>
> In order to better understand the behaviour of Flink jobs, it would be nice 
> to add timestamp information to the exceptions causing a job to restart or to 
> fail. This information could then be displayed in the web UI, making it 
> easier for the user to understand what happened when.
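> For illustration, a minimal sketch (field names assumed) of pairing the exception with a timestamp:
> {code:java}
> public class ErrorInfo {
>     private final Throwable exception;
>     private final long timestamp; // milliseconds since epoch when the failure occurred
>
>     public ErrorInfo(Throwable exception, long timestamp) {
>         this.exception = exception;
>         this.timestamp = timestamp;
>     }
> }
> {code}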



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #3583: [FLINK-6043] [web] Display exception timestamp

2017-06-30 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/3583#discussion_r125088693
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java
 ---
@@ -1070,9 +1088,9 @@ public void suspend(Throwable suspensionCause) {
 * exceptions that indicate a bug or an unexpected call race), and 
where a full restart is the
 * safe way to get consistency back.
 * 
-* @param t The exception that caused the failure.
+* @param errorInfo ErrorInfo containing the exception that caused the 
failure.
 */
-   public void failGlobal(Throwable t) {
+   public void failGlobal(ErrorInfo errorInfo) {
--- End diff --

Why don't we do a null check for `errorInfo` similar to the one in 
`suspend`? I think it should be fine to assume that this parameter is not null.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-7003) "select * from" in Flink SQL should not flatten all fields in the table by default

2017-06-30 Thread Shuyi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuyi Chen updated FLINK-7003:
--
Description: 
Currently, CompositeRelDataType extends 
RelRecordType(StructKind.PEEK_FIELDS, ...). In Calcite, StructKind.PEEK_FIELDS 
allows us to peek at the fields of nested types. However, when we use "select * 
from", Calcite flattens all nested fields that are marked as 
StructKind.PEEK_FIELDS in the table. 

For example, if the table structure *T* is as follows:

{code:java}
VARCHAR K0,
VARCHAR C1,
RecordType:peek_no_flattening(INTEGER C0, INTEGER C1) F0,
RecordType:peek_no_flattening(INTEGER C0, INTEGER C2) F1

{code}
The following query
{code:java}
Select * from T
{code} should return (K0, C1, F0, F1) instead of (K0, C1, F0.C0, F0.C1, F1.C0, 
F1.C1), which is the current behavior.

After upgrading to Calcite 1.14, this issue should change the StructKind used 
by {{CompositeRelDataType}} to {{StructKind.PEEK_FIELDS_NO_EXPAND}}.


  was:
Currently, CompositeRelDataType is extended from 
RelRecordType(StructKind.PEEK_FIELDS, ...).  In Calcite, StructKind.PEEK_FIELDS 
would allow us to peek fields for nested types. However, when we use "select * 
from", calcite will flatten all nested fields that is marked as 
StructKind.PEEK_FIELDS in the table. 

For example, if the table structure *T* is as follows:

{code:java}
VARCHAR K0,
VARCHAR C1,
RecordType:peek_no_flattening(INTEGER C0, INTEGER C1) F0,
RecordType:peek_no_flattening(INTEGER C0, INTEGER C2) F1

{code}
The following query
{code:java}
Select * from T
{code} should return (K0, C1, F0, F1) instead of (K0, C1, F0.C0, F0.C1, F1.C0, 
F1.C1), which is the current behavior.

After upgrading to Calcite 1.14, this issue should change the type of 
{{CompositeRelDataType}} to {{StructKind.PEEK_FIELDS_NO_FLATTENING}}.



> "select * from" in Flink SQL should not flatten all fields in the table by 
> default
> --
>
> Key: FLINK-7003
> URL: https://issues.apache.org/jira/browse/FLINK-7003
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Shuyi Chen
>
> Currently, CompositeRelDataType extends 
> RelRecordType(StructKind.PEEK_FIELDS, ...). In Calcite, 
> StructKind.PEEK_FIELDS allows us to peek at the fields of nested types. 
> However, when we use "select * from", Calcite flattens all nested fields 
> that are marked as StructKind.PEEK_FIELDS in the table. 
> For example, if the table structure *T* is as follows:
> {code:java}
> VARCHAR K0,
> VARCHAR C1,
> RecordType:peek_no_flattening(INTEGER C0, INTEGER C1) F0,
> RecordType:peek_no_flattening(INTEGER C0, INTEGER C2) F1
> {code}
> The following query
> {code:java}
> Select * from T
> {code} should return (K0, C1, F0, F1) instead of (K0, C1, F0.C0, F0.C1, 
> F1.C0, F1.C1), which is the current behavior.
> After upgrading to Calcite 1.14, this issue should change the StructKind 
> used by {{CompositeRelDataType}} to {{StructKind.PEEK_FIELDS_NO_EXPAND}}.
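> For illustration, a minimal sketch of the proposed change (the real CompositeRelDataType carries additional Flink type information):
> {code:java}
> import java.util.List;
> import org.apache.calcite.rel.type.RelDataTypeField;
> import org.apache.calcite.rel.type.RelRecordType;
> import org.apache.calcite.rel.type.StructKind;
>
> class CompositeRelDataType extends RelRecordType {
>     CompositeRelDataType(List<RelDataTypeField> fields) {
>         // PEEK_FIELDS_NO_EXPAND keeps F0/F1 unflattened under "select *".
>         super(StructKind.PEEK_FIELDS_NO_EXPAND, fields); // was: StructKind.PEEK_FIELDS
>     }
> }
> {code}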



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7059) Queryable state does not work with ListState

2017-06-30 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-7059:
-

 Summary: Queryable state does not work with ListState
 Key: FLINK-7059
 URL: https://issues.apache.org/jira/browse/FLINK-7059
 Project: Flink
  Issue Type: Bug
  Components: Queryable State
Reporter: Kostas Kloudas
 Fix For: 1.4.0


The serialization format of the list state follows that of RocksDB 
(comma-separated binaries without the list size), which is incompatible with 
the format of our ListSerializer. 

For reference, see {{HeapListState.getSerializedValue()}}.
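
For illustration, a minimal sketch (simplified; names and the separator byte are assumptions) of the two formats that disagree:

{code:java}
// Queryable-state list format, in the style of HeapListState.getSerializedValue():
// elements are written back-to-back with a separator and no size prefix.
for (T element : list) {
    elementSerializer.serialize(element, out);
    out.write(',');
}

// ListSerializer format: a size prefix followed by the raw elements.
out.writeInt(list.size());
for (T element : list) {
    elementSerializer.serialize(element, out);
}
{code}

A client that deserializes with the {{ListSerializer}} expects the size prefix and therefore cannot read the separator-delimited stream.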



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7058) flink-scala-shell unintended dependencies for scala 2.11

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070419#comment-16070419
 ] 

ASF GitHub Bot commented on FLINK-7058:
---

GitHub user pnowojski opened a pull request:

https://github.com/apache/flink/pull/4240

[FLINK-7058] Fix scala-2.10 dependencies

The first commit is from #4209 and should be ignored in this PR.

Before fixup:

```
$ mvn dependency:tree -pl flink-scala | grep quasi
[INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
$ mvn dependency:tree -pl flink-scala -Pscala-2.11 | grep quasi
[INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
$ mvn dependency:tree -pl flink-scala -Pscala-2.10 | grep quasi
[INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
```

After fixup:
```
$ mvn dependency:tree -pl flink-scala | grep quasi
$ mvn dependency:tree -pl flink-scala -Pscala-2.11 | grep quasi
$ mvn dependency:tree -pl flink-scala -Pscala-2.10 | grep quasi
[INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
```


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pnowojski/flink scala210

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4240.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4240


commit a17b0d4aee3c116761871513b2ef073bf8a98750
Author: Piotr Nowojski 
Date:   2017-06-23T11:41:55Z

[FLINK-7030] Build with scala-2.11 by default

commit 58a3b7b0a936da0148de4ddb5b9a6b2c3bccc335
Author: Piotr Nowojski 
Date:   2017-06-30T16:19:59Z

[FLINK-7058] Fix scala-2.10 dependencies




> flink-scala-shell unintended dependencies for scala 2.11
> 
>
> Key: FLINK-7058
> URL: https://issues.apache.org/jira/browse/FLINK-7058
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.3.1
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>Priority: Minor
>
> Activation of the scala-2.10 profile in `flink-scala-shell` and `flink-scala` 
> does not work as intended. 
> {code:xml}
> <profiles>
>     <profile>
>         <id>scala-2.10</id>
>         <activation>
>             <property>
>                 <name>!scala-2.11</name>
>             </property>
>         </activation>
>         <dependencies>
>             <dependency>
>                 <groupId>org.scalamacros</groupId>
>                 <artifactId>quasiquotes_2.10</artifactId>
>                 <version>${scala.macros.version}</version>
>             </dependency>
>             <dependency>
>                 <groupId>org.scala-lang</groupId>
>                 <artifactId>jline</artifactId>
>                 <version>2.10.4</version>
>             </dependency>
>         </dependencies>
>     </profile>
> </profiles>
> {code}
> This activation IMO has nothing to do with the `-Pscala-2.11` profile switch 
> used in our build: "properties" are defined by `-Dproperty` switches. As far 
> as I understand it, those additional dependencies are added only if nobody 
> defined a property named `scala-2.11`, i.e., whenever the `-Dscala-2.11` 
> switch was not used, so those dependencies were basically always added. This 
> quick test proves that I'm correct:
> {code:bash}
> $ mvn dependency:tree -pl flink-scala | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> $ mvn dependency:tree -pl flink-scala -Pscala-2.11 | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> $ mvn dependency:tree -pl flink-scala -Pscala-2.10 | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> {code}
> Regardless of the selected profile, those dependencies are always there.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4240: [FLINK-7058] Fix scala-2.10 dependencies

2017-06-30 Thread pnowojski
GitHub user pnowojski opened a pull request:

https://github.com/apache/flink/pull/4240

[FLINK-7058] Fix scala-2.10 dependencies

The first commit is from #4209 and should be ignored in this PR.

Before fixup:

```
$ mvn dependency:tree -pl flink-scala | grep quasi
[INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
$ mvn dependency:tree -pl flink-scala -Pscala-2.11 | grep quasi
[INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
$ mvn dependency:tree -pl flink-scala -Pscala-2.10 | grep quasi
[INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
```

After fixup:
```
$ mvn dependency:tree -pl flink-scala | grep quasi
$ mvn dependency:tree -pl flink-scala -Pscala-2.11 | grep quasi
$ mvn dependency:tree -pl flink-scala -Pscala-2.10 | grep quasi
[INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
```


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pnowojski/flink scala210

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4240.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4240


commit a17b0d4aee3c116761871513b2ef073bf8a98750
Author: Piotr Nowojski 
Date:   2017-06-23T11:41:55Z

[FLINK-7030] Build with scala-2.11 by default

commit 58a3b7b0a936da0148de4ddb5b9a6b2c3bccc335
Author: Piotr Nowojski 
Date:   2017-06-30T16:19:59Z

[FLINK-7058] Fix scala-2.10 dependencies




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-7058) flink-scala-shell unintended dependencies for scala 2.11

2017-06-30 Thread Piotr Nowojski (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-7058:
--
Affects Version/s: 1.3.0
   1.3.1

> flink-scala-shell unintended dependencies for scala 2.11
> 
>
> Key: FLINK-7058
> URL: https://issues.apache.org/jira/browse/FLINK-7058
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.3.1
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>Priority: Minor
>
> Activation of the scala-2.10 profile in `flink-scala-shell` and `flink-scala` 
> does not work as intended. 
> {code:xml}
> <profiles>
>     <profile>
>         <id>scala-2.10</id>
>         <activation>
>             <property>
>                 <name>!scala-2.11</name>
>             </property>
>         </activation>
>         <dependencies>
>             <dependency>
>                 <groupId>org.scalamacros</groupId>
>                 <artifactId>quasiquotes_2.10</artifactId>
>                 <version>${scala.macros.version}</version>
>             </dependency>
>             <dependency>
>                 <groupId>org.scala-lang</groupId>
>                 <artifactId>jline</artifactId>
>                 <version>2.10.4</version>
>             </dependency>
>         </dependencies>
>     </profile>
> </profiles>
> {code}
> This activation IMO has nothing to do with the `-Pscala-2.11` profile switch 
> used in our build: "properties" are defined by `-Dproperty` switches. As far 
> as I understand it, those additional dependencies are added only if nobody 
> defined a property named `scala-2.11`, i.e., whenever the `-Dscala-2.11` 
> switch was not used, so those dependencies were basically always added. This 
> quick test proves that I'm correct:
> {code:bash}
> $ mvn dependency:tree -pl flink-scala | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> $ mvn dependency:tree -pl flink-scala -Pscala-2.11 | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> $ mvn dependency:tree -pl flink-scala -Pscala-2.10 | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> {code}
> Regardless of the selected profile, those dependencies are always there.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-7058) flink-scala-shell unintended dependencies for scala 2.11

2017-06-30 Thread Piotr Nowojski (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski reassigned FLINK-7058:
-

Assignee: Piotr Nowojski

> flink-scala-shell unintended dependencies for scala 2.11
> 
>
> Key: FLINK-7058
> URL: https://issues.apache.org/jira/browse/FLINK-7058
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.3.1
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>Priority: Minor
>
> Activation of the scala-2.10 profile in `flink-scala-shell` and `flink-scala` 
> does not work as intended. 
> {code:xml}
> <profiles>
>     <profile>
>         <id>scala-2.10</id>
>         <activation>
>             <property>
>                 <name>!scala-2.11</name>
>             </property>
>         </activation>
>         <dependencies>
>             <dependency>
>                 <groupId>org.scalamacros</groupId>
>                 <artifactId>quasiquotes_2.10</artifactId>
>                 <version>${scala.macros.version}</version>
>             </dependency>
>             <dependency>
>                 <groupId>org.scala-lang</groupId>
>                 <artifactId>jline</artifactId>
>                 <version>2.10.4</version>
>             </dependency>
>         </dependencies>
>     </profile>
> </profiles>
> {code}
> This activation IMO has nothing to do with the `-Pscala-2.11` profile switch 
> used in our build: "properties" are defined by `-Dproperty` switches. As far 
> as I understand it, those additional dependencies are added only if nobody 
> defined a property named `scala-2.11`, i.e., whenever the `-Dscala-2.11` 
> switch was not used, so those dependencies were basically always added. This 
> quick test proves that I'm correct:
> {code:bash}
> $ mvn dependency:tree -pl flink-scala | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> $ mvn dependency:tree -pl flink-scala -Pscala-2.11 | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> $ mvn dependency:tree -pl flink-scala -Pscala-2.10 | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> {code}
> Regardless of the selected profile, those dependencies are always there.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7058) flink-scala-shell unintended dependencies for scala 2.11

2017-06-30 Thread Piotr Nowojski (JIRA)
Piotr Nowojski created FLINK-7058:
-

 Summary: flink-scala-shell unintended dependencies for scala 2.11
 Key: FLINK-7058
 URL: https://issues.apache.org/jira/browse/FLINK-7058
 Project: Flink
  Issue Type: Bug
Reporter: Piotr Nowojski
Priority: Minor


Activation of the scala-2.10 profile in `flink-scala-shell` and `flink-scala` 
does not work as intended. 

{code:xml}
<profiles>
    <profile>
        <id>scala-2.10</id>
        <activation>
            <property>
                <name>!scala-2.11</name>
            </property>
        </activation>
        <dependencies>
            <dependency>
                <groupId>org.scalamacros</groupId>
                <artifactId>quasiquotes_2.10</artifactId>
                <version>${scala.macros.version}</version>
            </dependency>
            <dependency>
                <groupId>org.scala-lang</groupId>
                <artifactId>jline</artifactId>
                <version>2.10.4</version>
            </dependency>
        </dependencies>
    </profile>
</profiles>
{code}
This activation IMO has nothing to do with the `-Pscala-2.11` profile switch 
used in our build: "properties" are defined by `-Dproperty` switches. As far as 
I understand it, those additional dependencies are added only if nobody defined 
a property named `scala-2.11`, i.e., whenever the `-Dscala-2.11` switch was not 
used, so those dependencies were basically always added. This quick test proves 
that I'm correct:

{code:bash}
$ mvn dependency:tree -pl flink-scala | grep quasi
[INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
$ mvn dependency:tree -pl flink-scala -Pscala-2.11 | grep quasi
[INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
$ mvn dependency:tree -pl flink-scala -Pscala-2.10 | grep quasi
[INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
{code}

Regardless of the selected profile, those dependencies are always there.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7030) Build with scala-2.11 by default

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070388#comment-16070388
 ] 

ASF GitHub Bot commented on FLINK-7030:
---

Github user pnowojski commented on the issue:

https://github.com/apache/flink/pull/4209
  
@zentol build is passing :)


> Build with scala-2.11 by default
> 
>
> Key: FLINK-7030
> URL: https://issues.apache.org/jira/browse/FLINK-7030
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>
> As proposed recently on the dev mailing list.
> I propose to switch to Scala 2.11 as the default and to have a Scala 2.10 
> build profile; currently it is the other way around. The reason is poor 
> support for build profiles in IntelliJ: I was unable to make it work after I 
> added the Kafka 0.11 dependency (Kafka 0.11 dropped support for Scala 2.10).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4209: [FLINK-7030] Build with scala-2.11 by default

2017-06-30 Thread pnowojski
Github user pnowojski commented on the issue:

https://github.com/apache/flink/pull/4209
  
@zentol build is passing :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #4228: Flink-7035 Automatically identify AWS Region, simp...

2017-06-30 Thread mpouttuclarke
Github user mpouttuclarke commented on a diff in the pull request:

https://github.com/apache/flink/pull/4228#discussion_r125076287
  
--- Diff: docs/dev/connectors/kinesis.md ---
@@ -72,12 +72,80 @@ Before consuming data from Kinesis streams, make sure 
that all streams are creat
 
 
 {% highlight java %}
-Properties consumerConfig = new Properties();
-consumerConfig.put(ConsumerConfigConstants.AWS_REGION, "us-east-1");
+StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getEnvironment();
+
+DataStream<String> kinesis = env.addSource(new FlinkKinesisConsumer<>(
+"kinesis_stream_name", new SimpleStringSchema(), 
ConsumerConfigConstants.InitialPosition.LATEST));
+{% endhighlight %}
+
+
+{% highlight scala %}
+val env = StreamExecutionEnvironment.getEnvironment
+
+val kinesis = env.addSource(new FlinkKinesisConsumer[String](
+"kinesis_stream_name", new SimpleStringSchema, 
ConsumerConfigConstants.InitialPosition.LATEST))
+{% endhighlight %}
+
+
+
+The above is a simple example of using the Kinesis consumer when running 
on an Amazon Linux node (such as in EMR or AWS Lambda).
+The AWS APIs automatically provide the authentication credentials and 
region when available.  For unit testing, the ability to
+set test configuration is provided using KinesisConfigUtil.
+
+
+
+{% highlight java %}
+Properties testConfig = new Properties();
+testConfig.put(ConsumerConfigConstants.AWS_REGION, "us-east-1");
+testConfig.put(ConsumerConfigConstants.AWS_ACCESS_KEY_ID, 
"aws_access_key_id");
+testConfig.put(ConsumerConfigConstants.AWS_SECRET_ACCESS_KEY, 
"aws_secret_access_key");
+testConfig.put(ConsumerConfigConstants.STREAM_INITIAL_POSITION, "LATEST");
+
+KinesisConfigUtil.setDefaultTestProperties(testConfig);
+
+// Automatically uses testConfig without having to modify job flow
+StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getEnvironment();
+DataStream<String> kinesis = env.addSource(new FlinkKinesisConsumer<>(
+"kinesis_stream_name", new SimpleStringSchema()));
+{% endhighlight %}
+
+
+{% highlight scala %}
+val testConfig = new Properties();
+testConfig.put(ConsumerConfigConstants.AWS_REGION, "us-east-1");
 consumerConfig.put(ConsumerConfigConstants.AWS_ACCESS_KEY_ID, 
"aws_access_key_id");
 consumerConfig.put(ConsumerConfigConstants.AWS_SECRET_ACCESS_KEY, 
"aws_secret_access_key");
 consumerConfig.put(ConsumerConfigConstants.STREAM_INITIAL_POSITION, 
"LATEST");
 
+KinesisConfigUtil.setDefaultTestProperties(testConfig);
+
+// Automatically uses testConfig without having to modify job flow
+val env = StreamExecutionEnvironment.getEnvironment
+val kinesis = env.addSource(new FlinkKinesisConsumer[String](
+"kinesis_stream_name", new SimpleStringSchema))
+{% endhighlight %}
+
+
+
+Configuration for the consumer can also be supplied with 
`java.util.Properties` for use on non-Amazon Linux hardware,
+or in the case that other stream consumer properties need to be tuned.
+
+Please note it is strongly recommended to use Kinesis streams within the 
same availability zone they originate in.
--- End diff --

Within Amazon, we completely restrict cross-region network traffic except in 
special circumstances, for these reasons. These are simply lessons learned from 
scaling our systems globally, where situations arise not only from performance 
concerns but also from regulatory issues, such as EU data laws restricting data 
egress. Customers should be aware of these issues and discuss them with support 
before going down this path.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #4228: Flink-7035 Automatically identify AWS Region, simp...

2017-06-30 Thread mpouttuclarke
Github user mpouttuclarke commented on a diff in the pull request:

https://github.com/apache/flink/pull/4228#discussion_r125074194
  
--- Diff: 
flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/util/KinesisConfigUtil.java
 ---
@@ -171,7 +172,14 @@ public static void validateAwsConfiguration(Properties 
config) {
}
 
if (!config.containsKey(AWSConfigConstants.AWS_REGION)) {
-   throw new IllegalArgumentException("The AWS region ('" 
+ AWSConfigConstants.AWS_REGION + "') must be set in the config.");
+   final Region currentRegion = Regions.getCurrentRegion();
+   if (currentRegion != null) {
+   
config.setProperty(AWSConfigConstants.AWS_REGION, currentRegion.getName());
+   } else {
+   throw new IllegalArgumentException("The AWS 
region could not be identified automatically from the AWS API.  " +
--- End diff --

Yes I like that wording.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #4228: Flink-7035 Automatically identify AWS Region, simp...

2017-06-30 Thread mpouttuclarke
Github user mpouttuclarke commented on a diff in the pull request:

https://github.com/apache/flink/pull/4228#discussion_r125073534
  
--- Diff: 
flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/util/KinesisConfigUtil.java
 ---
@@ -171,7 +172,14 @@ public static void validateAwsConfiguration(Properties 
config) {
}
 
if (!config.containsKey(AWSConfigConstants.AWS_REGION)) {
--- End diff --

The new constructors make the easy path the right path. We go through a 
lot of trouble at Amazon to make sure that the default constructors do the 
right thing with a minimal amount of effort. Yet people still set things 
like region and auth manually when it is not only unnecessary but also a 
security, performance, and compliance risk. Wherever we can, we should follow 
the example of the AWS SDK and make correct usage the easy path. Overall, I 
would argue that using property files and statics isn't a best practice: there 
really should be type-safe POJOs and dependency injection in place for 
configuring the consumer, but that is a larger issue than I can take on right 
now. The new constructors attempt to add some type safety while improving ease 
of use when operating in an Amazon environment.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-6988) Add Apache Kafka 0.11 connector

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070319#comment-16070319
 ] 

ASF GitHub Bot commented on FLINK-6988:
---

GitHub user pnowojski opened a pull request:

https://github.com/apache/flink/pull/4239

[FLINK-6988] Initial flink-connector-kafka-0.11 with at-least-once semantic

The first couple of commits are from other PRs (#4206, #4209, #4213).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pnowojski/flink kafka011

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4239.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4239


commit 5191d5b4b78620cfc5ecfc9088afba0d611eaacb
Author: Piotr Nowojski 
Date:   2017-06-26T09:28:51Z

[FLINK-6996] Refactor and automatically inherit KafkaProducer integration 
tests

commit 1c7d349ce425ec0213059e062f10c90773cc780d
Author: Piotr Nowojski 
Date:   2017-06-26T10:20:36Z

[FLINK-6996] Fix formatting in KafkaConsumerTestBase and 
KafkaProducerTestBase

commit 5b849f98191439e69ca2357a4767f47957ee0250
Author: Piotr Nowojski 
Date:   2017-06-23T11:41:55Z

[FLINK-7030] Build with scala-2.11 by default

commit 3f62aecb57cea9d43611ecfa24e2233a63197341
Author: Piotr Nowojski 
Date:   2017-06-26T10:36:40Z

[FLINK-6996] Fix at-least-once semantic for FlinkKafkaProducer010

Add tests coverage for Kafka 0.10 and 0.9

commit 4b78626df474a8d49a406714a7142ad44d8a8faf
Author: Piotr Nowojski 
Date:   2017-06-28T18:30:08Z

[FLINK-7032] Overwrite inherited values of compiler version from parent pom

Default values were 1.6 and were causing IntelliJ to constantly switch the 
language level to 1.6, which in turn was causing compilation errors. It worked 
fine when compiling from the console using Maven, because those values are 
separately set in the maven-compiler-plugin configuration.

commit 2c2556e72dd73c5e470e5afd6dab4a11cb41772d
Author: Piotr Nowojski 
Date:   2017-06-23T07:14:28Z

[FLINK-6988] Initial flink-connector-kafka-0.11 with at-least-once semantic

Code of 0.11 connector is based on 0.10 version




> Add Apache Kafka 0.11 connector
> ---
>
> Key: FLINK-6988
> URL: https://issues.apache.org/jira/browse/FLINK-6988
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.3.1
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>
> Kafka 0.11 (to be released very soon) adds support for transactions. 
> Thanks to that, Flink might be able to implement a Kafka sink supporting 
> "exactly-once" semantics. The API changes and the transaction support as a 
> whole are described in 
> [KIP-98|https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging].
> The goal is to mimic the implementation of the existing BucketingSink. The 
> new FlinkKafkaProducer011 would (a sketch follows the list):
> * upon creation, begin a transaction, store the transaction identifier in 
> state, and write all incoming data to the output Kafka topic using that 
> transaction
> * on the `snapshotState` call, flush the data and record in state that the 
> current transaction is pending to be committed
> * on `notifyCheckpointComplete`, commit the pending transaction
> * in case of a crash between `snapshotState` and `notifyCheckpointComplete`, 
> either abort the pending transaction (if not every participant successfully 
> saved the snapshot) or restore and commit it. 
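> For illustration, a minimal skeleton (hypothetical; not the actual implementation) of that lifecycle:
> {code:java}
> public class FlinkKafkaProducer011<IN> /* a checkpointed sink function */ {
>
>     public void invoke(IN value) {
>         // write the record to the output topic inside the currently open transaction
>     }
>
>     public void snapshotState(/* checkpoint context */) {
>         // flush pending records, mark the open transaction as pending-commit in state,
>         // and begin a fresh transaction for records arriving after the checkpoint
>     }
>
>     public void notifyCheckpointComplete(long checkpointId) {
>         // commit the transaction(s) pending for this checkpoint
>     }
> }
> {code}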



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4239: [FLINK-6988] Initial flink-connector-kafka-0.11 wi...

2017-06-30 Thread pnowojski
GitHub user pnowojski opened a pull request:

https://github.com/apache/flink/pull/4239

[FLINK-6988] Initial flink-connector-kafka-0.11 with at-least-once semantic

The first couple of commits are from other PRs (#4206, #4209, #4213).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pnowojski/flink kafka011

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4239.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4239


commit 5191d5b4b78620cfc5ecfc9088afba0d611eaacb
Author: Piotr Nowojski 
Date:   2017-06-26T09:28:51Z

[FLINK-6996] Refactor and automatically inherit KafkaProducer integration 
tests

commit 1c7d349ce425ec0213059e062f10c90773cc780d
Author: Piotr Nowojski 
Date:   2017-06-26T10:20:36Z

[FLINK-6996] Fix formatting in KafkaConsumerTestBase and 
KafkaProducerTestBase

commit 5b849f98191439e69ca2357a4767f47957ee0250
Author: Piotr Nowojski 
Date:   2017-06-23T11:41:55Z

[FLINK-7030] Build with scala-2.11 by default

commit 3f62aecb57cea9d43611ecfa24e2233a63197341
Author: Piotr Nowojski 
Date:   2017-06-26T10:36:40Z

[FLINK-6996] Fix at-least-once semantic for FlinkKafkaProducer010

Add tests coverage for Kafka 0.10 and 0.9

commit 4b78626df474a8d49a406714a7142ad44d8a8faf
Author: Piotr Nowojski 
Date:   2017-06-28T18:30:08Z

[FLINK-7032] Overwrite inherited values of compiler version from parent pom

Default values were 1.6 and were causing IntelliJ to constantly switch the 
language level to 1.6, which in turn was causing compilation errors. It worked 
fine when compiling from the console using Maven, because those values are 
separately set in the maven-compiler-plugin configuration.

commit 2c2556e72dd73c5e470e5afd6dab4a11cb41772d
Author: Piotr Nowojski 
Date:   2017-06-23T07:14:28Z

[FLINK-6988] Initial flink-connector-kafka-0.11 with at-least-once semantic

Code of 0.11 connector is based on 0.10 version




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-6075) Support Limit/Top(Sort) for Stream SQL

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070310#comment-16070310
 ] 

ASF GitHub Bot commented on FLINK-6075:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3714
  
@rtudoran can you please close this PR? Thank you


> Support Limit/Top(Sort) for Stream SQL
> --
>
> Key: FLINK-6075
> URL: https://issues.apache.org/jira/browse/FLINK-6075
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: radu
>  Labels: features
> Attachments: sort.png
>
>
> These will be split into 3 separate JIRA issues. The design is the same for 
> all of them; only the processing function differs in terms of the output.
> Time target: Proc Time
> **SQL targeted query examples:**
> *Sort example*
> Q1) `SELECT a FROM stream1 GROUP BY HOP(proctime, INTERVAL '1' HOUR, INTERVAL '3' HOUR) ORDER BY b`
> Comment: window is defined using GROUP BY
> Comment: ASC or DESC keywords can be placed to mark the ordering type
> *Limit example*
> Q2) `SELECT a FROM stream1 WHERE rowtime BETWEEN current_timestamp - INTERVAL 
> '1' HOUR AND current_timestamp ORDER BY b LIMIT 10`
> Comment: window is defined using time ranges in the WHERE clause
> Comment: window is row triggered
> *Top example*
> Q3) `SELECT sum(a) OVER (ORDER BY proctime RANGE INTERVAL '1' HOUR PRECEDING 
> LIMIT 10) FROM stream1`  
> Comment: limit over the contents of the sliding window
> General Comments:
> -All these SQL clauses are supported only over windows (bounded collections 
> of data). 
> -Each of the 3 operators will be supported with each of the types of 
> expressing the windows. 
> **Description**
> The 3 operations (limit, top, and sort) are similar in behavior, as they all 
> require a sorted collection of the data on which the logic will be applied 
> (i.e., select a subset of the items or the entire sorted set). In a streaming 
> context these functions only make sense over a window: without a window they 
> could never emit, as the sort operation would never trigger. If an SQL query 
> is provided without such a bound, an error will be thrown 
> (`SELECT a FROM stream1 TOP 10` -> ERROR). 
> Although not targeted by this JIRA, when working on event-time order, the 
> retraction and lateness mechanisms of windows can be used to deal with 
> out-of-order events and retractions/updates of results.
> **Functionality example**
> We exemplify with the query below for all 3 types of operators (sort, limit, 
> and top). Rowtime indicates when the HOP window will trigger, which can be 
> observed in the fact that outputs are generated only at those moments. The 
> HOP windows trigger every hour (on the hour) and each event contributes to / 
> is duplicated across 2 consecutive hour intervals. Proctime indicates the 
> processing time when a new event arrives in the system. Events are of the 
> type (a,b), with the ordering applied on the b field.
> `SELECT a FROM stream1 HOP(proctime, INTERVAL '1' HOUR, INTERVAL '2' HOUR) ORDER BY b (LIMIT 2 / TOP 2 / [ASC/DESC])`
> ||Rowtime||Proctime||Stream1||Limit 2||Top 2||Sort [ASC]||
> | |10:00:00|(aaa, 11)| | | |
> | |10:05:00|(aab, 7)| | | |
> |10-11|11:00:00| |aab,aaa|aab,aaa|aab,aaa|
> | |11:03:00|(aac,21)| | | |
> |11-12|12:00:00| |aab,aaa|aab,aaa|aab,aaa,aac|
> | |12:10:00|(abb,12)| | | |
> | |12:15:00|(abb,12)| | | |
> |12-13|13:00:00| |abb,abb|abb,abb|abb,abb,aac|
> |...| | | | | |
> **Implementation option**
> Considering that the SQL operators will be associated with window boundaries, 
> the functionality will be implemented within the logic of the window as 
> follows.
> * Window assigner – selected based on the type of window used in SQL 
> (TUMBLING, SLIDING…)
> * Evictor/Trigger – time or count evictor based on the definition of the 
> window boundaries
> * Apply – window function that sorts the data and selects the output to emit 
> (based on the LIMIT/TOP parameters). All data is sorted at once and the 
> result output when the window is triggered.
> An alternative implementation is to use a fold window function to sort the 
> elements as they arrive, one at a time, followed by a flatMap to limit the 
> number of outputs. 
> !sort.png!
> **General 

[GitHub] flink issue #3714: [FLINK-6075] - Support Limit/Top(Sort) for Stream SQL

2017-06-30 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3714
  
@rtudoran can you please close this PR? Thank you


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-7050) RFC Compliant CSV Parser for Table Source

2017-06-30 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070301#comment-16070301
 ] 

Fabian Hueske commented on FLINK-7050:
--

Thanks for opening this JIRA [~uybhatti]. An RFC-compliant TableSource would be 
a great addition.

> RFC Compliant CSV Parser for Table Source
> -
>
> Key: FLINK-7050
> URL: https://issues.apache.org/jira/browse/FLINK-7050
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.3.1
>Reporter: Usman Younas
>  Labels: csv, parsing
> Fix For: 1.4.0
>
>
> Currently, the Flink CSV parser is not compliant with RFC 4180. Because of 
> this, it cannot parse standard CSV files that include double quotes, 
> delimiters within fields, etc. 
> To reproduce the bug, take a CSV file with double quotes inside a record 
> field and parse it with the Flink CSV parser. One such issue is described in 
> [FLINK-4785|https://issues.apache.org/jira/browse/FLINK-4785].
> These CSV issues will be solved by making the Table Source CSV parser 
> compliant with the RFC 4180 standard; an example record follows. 
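> For illustration, a sample RFC 4180 record of the kind the current parser mishandles:
> {code}
> id,name,comment
> 1,"Doe, John","He said ""hello"""
> {code}
> Per RFC 4180, the second line has three fields: {{1}}, {{Doe, John}}, and {{He said "hello"}}. The quoted comma must not split the field, and the doubled quotes collapse to a single quote.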



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7057) move BLOB ref-counting from LibraryCacheManager to BlobCache

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070255#comment-16070255
 ] 

ASF GitHub Bot commented on FLINK-7057:
---

GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/4238

[FLINK-7057][blob] move BLOB ref-counting from LibraryCacheManager to 
BlobCache

Currently, the `LibraryCacheManager` is doing some ref-counting for JAR 
files managed by it. Instead, we want the `BlobCache` to do that itself for 
**all** job-related BLOBs. Also, we do not want to operate on a per-BlobKey 
level but rather per job. Job-unrelated BLOBs should be cleaned manually as 
done for the Web-UI logs. A future API change will reflect the different use 
cases in a better way. For now, we need to also adapt the cleanup appropriately.

On the `BlobServer`, the JAR files should remain locally as well as in the 
HA store until the job enters a final state. Then they can be deleted.

With this intermediate state, job-unrelated BLOBs will remain in the file 
system until deleted manually. This is the same as the previous API use when 
working with a `BlobService` directly instead of going through the 
`LibraryCacheManager`. The aforementioned API extension will include TTL fields 
for those BLOBs in order to have a proper cleanup, too.
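
For illustration, a minimal sketch (assumed shape, not the actual `BlobCache` code) of per-job reference counting:

```
// One reference count per job covers all of that job's BLOBs.
private final Map<JobID, Integer> jobRefCounters = new HashMap<>();

void registerJob(JobID jobId) {
    jobRefCounters.merge(jobId, 1, Integer::sum);
}

void releaseJob(JobID jobId) {
    Integer remaining = jobRefCounters.computeIfPresent(jobId, (id, count) -> count - 1);
    if (remaining != null && remaining <= 0) {
        jobRefCounters.remove(jobId);
        // safe to delete the job's local BLOB storage directory here
    }
}
```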

This PR is based upon #4237  in a series to implement FLINK-6916.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink flink-6916-7057

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4238.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4238


commit d54a316cfffd8243980df561fd4fcbd99934a40b
Author: Nico Kruber 
Date:   2016-12-20T15:49:57Z

[FLINK-6008][docs] minor improvements in the BlobService docs

commit b215515fa14d3f6af218e86b67bc2c27ae9d4f4f
Author: Nico Kruber 
Date:   2016-12-20T17:27:13Z

[FLINK-6008] refactor BlobCache#getURL() for cleaner code

commit bbcde52b3105fcf379c852b568f3893cc6052ce6
Author: Nico Kruber 
Date:   2016-12-21T15:23:29Z

[FLINK-6008] do not fail the BlobServer if delete fails

also extend the delete tests and remove one code duplication

commit dda1a12e40027724efb0e50005e5b57058a220f0
Author: Nico Kruber 
Date:   2017-01-06T17:42:58Z

[FLINK-6008][docs] update some config options to the new, non-deprecated 
ones

commit e12c2348b237207a50649d515a0fbbd19f92e6a0
Author: Nico Kruber 
Date:   2017-03-09T17:14:02Z

[FLINK-6008] use Preconditions.checkArgument in BlobClient

commit 24060e01332c6df9fd01f1dc5f321c3fda9301c1
Author: Nico Kruber 
Date:   2017-03-17T15:21:40Z

[FLINK-6008] fix concurrent job directory creation

also add according unit tests

commit 2e0d16ab8bf8a48a2d028602a3a7693fc4b76039
Author: Nico Kruber 
Date:   2017-06-14T16:01:47Z

[FLINK-6008] do not guard a delete() call with a check for existence

commit 7ba911d7ecb4861261dff8509996be0bd64d6d27
Author: Nico Kruber 
Date:   2017-04-18T14:37:37Z

[FLINK-6008] some comments about BlobLibraryCacheManager cleanup

commit d3f50d595f85356ae6ed0a85e1f8b8e8ac630bde
Author: Nico Kruber 
Date:   2017-04-19T13:39:03Z

[hotfix] minor typos

commit 79b6ce35a9e246b35415a388295f9ee2fc19a82e
Author: Nico Kruber 
Date:   2017-04-19T14:10:16Z

[FLINK-6008] further cleanup tests for BlobLibraryCacheManager

commit 23fb6ecd6c43c86d762503339c67953290236dca
Author: Nico Kruber 
Date:   2017-06-30T14:03:16Z

[FLINK-6008] address PR comments

commit 794764ceeed6b9bbbac08662f5754b218ff86c9c
Author: Nico Kruber 
Date:   2017-06-16T08:51:04Z

[FLINK-7052][blob] remove (unused) NAME_ADDRESSABLE mode

commit 774bafa85f242110a2ce7907c1150f8c62d73b3f
Author: Nico Kruber 
Date:   2017-06-21T15:05:57Z

[FLINK-7052][blob] remove further unused code due to the NAME_ADDRESSABLE 
removal

commit 4da3b3f6269e43bf1c66621099528824cad9373f
Author: Nico Kruber 
Date:   2017-06-22T15:31:17Z

[FLINK-7053][blob] remove code duplication in BlobClientSslTest

This lets BlobClientSslTest extend BlobClientTest as most of its 
implementation
came from there and was simply copied.

commit aa9cdc820f9ca1a38a19708bf45a2099e42eaf48
Author: Nico Kruber 
Date:   2017-06-23T09:40:34Z

[FLINK-7053][blob] verify some of the buffers returned by GET

commit c9b693a46053b55b3939ff471184796f12d36a72
Author: Nico 

[GitHub] flink pull request #4238: [FLINK-7057][blob] move BLOB ref-counting from Lib...

2017-06-30 Thread NicoK
GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/4238

[FLINK-7057][blob] move BLOB ref-counting from LibraryCacheManager to 
BlobCache

Currently, the `LibraryCacheManager` is doing some ref-counting for JAR 
files managed by it. Instead, we want the `BlobCache` to do that itself for 
**all** job-related BLOBs. Also, we do not want to operate on a per-BlobKey 
level but rather per job. Job-unrelated BLOBs should be cleaned manually as 
done for the Web-UI logs. A future API change will reflect the different use 
cases in a better way. For now, we need to also adapt the cleanup appropriately.

On the `BlobServer`, the JAR files should remain locally as well as in the 
HA store until the job enters a final state. Then they can be deleted.

With this intermediate state, job-unrelated BLOBs will remain in the file 
system until deleted manually. This is the same as the previous API use when 
working with a `BlobService` directly instead of going through the 
`LibraryCacheManager`. The aforementioned API extension will include TTL fields 
for those BLOBs in order to have a proper cleanup, too.

This PR is based upon #4237  in a series to implement FLINK-6916.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink flink-6916-7057

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4238.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4238


commit d54a316cfffd8243980df561fd4fcbd99934a40b
Author: Nico Kruber 
Date:   2016-12-20T15:49:57Z

[FLINK-6008][docs] minor improvements in the BlobService docs

commit b215515fa14d3f6af218e86b67bc2c27ae9d4f4f
Author: Nico Kruber 
Date:   2016-12-20T17:27:13Z

[FLINK-6008] refactor BlobCache#getURL() for cleaner code

commit bbcde52b3105fcf379c852b568f3893cc6052ce6
Author: Nico Kruber 
Date:   2016-12-21T15:23:29Z

[FLINK-6008] do not fail the BlobServer if delete fails

also extend the delete tests and remove one code duplication

commit dda1a12e40027724efb0e50005e5b57058a220f0
Author: Nico Kruber 
Date:   2017-01-06T17:42:58Z

[FLINK-6008][docs] update some config options to the new, non-deprecated 
ones

commit e12c2348b237207a50649d515a0fbbd19f92e6a0
Author: Nico Kruber 
Date:   2017-03-09T17:14:02Z

[FLINK-6008] use Preconditions.checkArgument in BlobClient

commit 24060e01332c6df9fd01f1dc5f321c3fda9301c1
Author: Nico Kruber 
Date:   2017-03-17T15:21:40Z

[FLINK-6008] fix concurrent job directory creation

also add according unit tests

commit 2e0d16ab8bf8a48a2d028602a3a7693fc4b76039
Author: Nico Kruber 
Date:   2017-06-14T16:01:47Z

[FLINK-6008] do not guard a delete() call with a check for existence

commit 7ba911d7ecb4861261dff8509996be0bd64d6d27
Author: Nico Kruber 
Date:   2017-04-18T14:37:37Z

[FLINK-6008] some comments about BlobLibraryCacheManager cleanup

commit d3f50d595f85356ae6ed0a85e1f8b8e8ac630bde
Author: Nico Kruber 
Date:   2017-04-19T13:39:03Z

[hotfix] minor typos

commit 79b6ce35a9e246b35415a388295f9ee2fc19a82e
Author: Nico Kruber 
Date:   2017-04-19T14:10:16Z

[FLINK-6008] further cleanup tests for BlobLibraryCacheManager

commit 23fb6ecd6c43c86d762503339c67953290236dca
Author: Nico Kruber 
Date:   2017-06-30T14:03:16Z

[FLINK-6008] address PR comments

commit 794764ceeed6b9bbbac08662f5754b218ff86c9c
Author: Nico Kruber 
Date:   2017-06-16T08:51:04Z

[FLINK-7052][blob] remove (unused) NAME_ADDRESSABLE mode

commit 774bafa85f242110a2ce7907c1150f8c62d73b3f
Author: Nico Kruber 
Date:   2017-06-21T15:05:57Z

[FLINK-7052][blob] remove further unused code due to the NAME_ADDRESSABLE 
removal

commit 4da3b3f6269e43bf1c66621099528824cad9373f
Author: Nico Kruber 
Date:   2017-06-22T15:31:17Z

[FLINK-7053][blob] remove code duplication in BlobClientSslTest

This lets BlobClientSslTest extend BlobClientTest as most of its 
implementation came from there and was simply copied.

commit aa9cdc820f9ca1a38a19708bf45a2099e42eaf48
Author: Nico Kruber 
Date:   2017-06-23T09:40:34Z

[FLINK-7053][blob] verify some of the buffers returned by GET

commit c9b693a46053b55b3939ff471184796f12d36a72
Author: Nico Kruber 
Date:   2017-06-23T10:04:10Z

[FLINK-7053][blob] use TemporaryFolder for local BLOB dir in unit tests

This replaces the use of some temporary directory where it is not guaranteed
that it will be deleted after the test.

[GitHub] flink pull request #3889: [FLINK-6075] - Support Limit/Top(Sort) for Stream ...

2017-06-30 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3889#discussion_r125006340
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/calcite/RelTimeIndicatorConverter.scala
 ---
@@ -92,8 +92,19 @@ class RelTimeIndicatorConverter(rexBuilder: RexBuilder) 
extends RelShuttle {
   override def visit(minus: LogicalMinus): RelNode =
 throw new TableException("Logical minus in a stream environment is not 
supported yet.")
 
-  override def visit(sort: LogicalSort): RelNode =
-throw new TableException("Logical sort in a stream environment is not 
supported yet.")
+  override def visit(sort: LogicalSort): RelNode = {
+
+val input = sort.getInput.accept(this)
+
+val materializer = new RexTimeIndicatorMaterializer(
--- End diff --

unused. Should be removed


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3889: [FLINK-6075] - Support Limit/Top(Sort) for Stream ...

2017-06-30 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3889#discussion_r125058559
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/aggregate/TimeSortProcessFunctionTest.scala
 ---
@@ -122,6 +130,11 @@ class TimeSortProcessFunctionTest{
   Row.of(1: JInt, 12L: JLong, 1: JInt, "aaa", 11L: JLong),true), 4))
 expectedOutput.add(new StreamRecord(new CRow(
   Row.of(1: JInt, 12L: JLong, 0: JInt, "aaa", 11L: JLong),true), 4))
+
+expectedOutput.add(new StreamRecord(new CRow(
+  Row.of(1: JInt, 1L: JLong, 0: JInt, "aaa", 11L: JLong),true), 1006))
+expectedOutput.add(new StreamRecord(new CRow(
+  Row.of(1: JInt, 2L: JLong, 0: JInt, "aaa", 11L: JLong),true), 1006))
   
 TestHarnessUtil.assertOutputEqualsSorted("Output was not correct.",
--- End diff --

The test should check that the ProcessFunction emits the rows in the correct 
order. `assertOutputEqualsSorted` sorts the result and expected data before 
comparing them. We have to use `assertOutputEquals` instead.
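
For context, a small sketch of the difference between the two assertions 
(toy queue contents; the `TestHarnessUtil` signatures are assumed from their 
use in the flink-streaming test utilities, as in the test above):

    import java.util.Comparator
    import java.util.concurrent.ConcurrentLinkedQueue
    import org.apache.flink.streaming.util.TestHarnessUtil

    val expected = new ConcurrentLinkedQueue[AnyRef]()
    expected.add("a"); expected.add("b")

    val actual = new ConcurrentLinkedQueue[AnyRef]()
    actual.add("b"); actual.add("a")

    val cmp = new Comparator[AnyRef] {
      override def compare(x: AnyRef, y: AnyRef): Int =
        x.toString.compareTo(y.toString)
    }

    // order-insensitive: both queues sort to [a, b], so this passes
    TestHarnessUtil.assertOutputEqualsSorted("same elements", expected, actual, cmp)

    // order-sensitive: this fails because the emission order differs --
    // exactly the stricter guarantee a sort operator test needs
    TestHarnessUtil.assertOutputEquals("same order", expected, actual)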


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3889: [FLINK-6075] - Support Limit/Top(Sort) for Stream ...

2017-06-30 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3889#discussion_r125058944
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/aggregate/TimeSortProcessFunctionTest.scala
 ---
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.aggregate
+
+import java.util.Comparator
+import java.util.concurrent.ConcurrentLinkedQueue
+import java.lang.{Integer => JInt, Long => JLong}
+
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, 
TypeInformation}
+import org.apache.flink.api.java.functions.KeySelector
+import org.apache.flink.api.java.typeutils.RowTypeInfo
+import org.apache.flink.streaming.api.operators.KeyedProcessOperator
+import org.apache.flink.streaming.api.watermark.Watermark
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord
+import 
org.apache.flink.streaming.util.{KeyedOneInputStreamOperatorTestHarness, 
TestHarnessUtil}
+import org.apache.flink.types.Row
+import org.junit.Test
+import org.apache.flink.table.runtime.aggregate.ProcTimeSortProcessFunction
+import org.apache.flink.table.runtime.aggregate.RowTimeSortProcessFunction
+import 
org.apache.flink.table.runtime.aggregate.TimeSortProcessFunctionTest._
+import org.apache.flink.api.java.typeutils.runtime.RowComparator
+import org.apache.flink.api.common.typeutils.TypeSerializer
+import org.apache.flink.api.common.typeutils.TypeComparator
+import org.apache.flink.table.runtime.types.{CRow, CRowTypeInfo}
+import org.apache.flink.streaming.api.TimeCharacteristic
+
+class TimeSortProcessFunctionTest{
+
+  
+  @Test
+  def testSortProcTimeHarnessPartitioned(): Unit = {
+
+val rT =  new RowTypeInfo(Array[TypeInformation[_]](
+  INT_TYPE_INFO,
+  LONG_TYPE_INFO,
+  INT_TYPE_INFO,
+  STRING_TYPE_INFO,
+  LONG_TYPE_INFO),
+  Array("a","b","c","d","e"))
+
+val rTA =  new RowTypeInfo(Array[TypeInformation[_]](
+ LONG_TYPE_INFO), Array("count"))
+val indexes = Array(1,2)
+  
+val fieldComps = Array[TypeComparator[AnyRef]](
+  LONG_TYPE_INFO.createComparator(true, 
null).asInstanceOf[TypeComparator[AnyRef]],
+  INT_TYPE_INFO.createComparator(false, 
null).asInstanceOf[TypeComparator[AnyRef]] )
+val booleanOrders = Array(true, false)
+
+
+val rowComp = new RowComparator(
+  rT.getTotalFields,
+  indexes,
+  fieldComps,
+  new Array[TypeSerializer[AnyRef]](0), //used only for serialized 
comparisons
+  booleanOrders)
+
+val collectionRowComparator = new CollectionRowComparator(rowComp)
+
+val inputCRowType = CRowTypeInfo(rT)
+
+val processFunction = new KeyedProcessOperator[Integer,CRow,CRow](
+  new ProcTimeSortProcessFunction(
+inputCRowType,
+collectionRowComparator))
+  
+   val testHarness = new 
KeyedOneInputStreamOperatorTestHarness[Integer,CRow,CRow](
+  processFunction, 
+  new TupleRowSelector(0), 
+  BasicTypeInfo.INT_TYPE_INFO)
+
+   testHarness.open();
+
+   testHarness.setProcessingTime(3)
+
+  // timestamp is ignored in processing time
+testHarness.processElement(new StreamRecord(new CRow(
+  Row.of(1: JInt, 11L: JLong, 1: JInt, "aaa", 11L: JLong), true), 
1001))
+testHarness.processElement(new StreamRecord(new CRow(
+Row.of(1: JInt, 12L: JLong, 1: JInt, "aaa", 11L: JLong), true), 
2002))
+testHarness.processElement(new StreamRecord(new CRow(
+Row.of(1: JInt, 12L: JLong, 2: JInt, "aaa", 11L: JLong), true), 
2003))
+testHarness.processElement(new StreamRecord(new CRow(
+Row.of(1: JInt, 12L: JLong, 0: JInt, "aaa", 11L: JLong), true), 
2004))
+

[jira] [Commented] (FLINK-6075) Support Limit/Top(Sort) for Stream SQL

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070252#comment-16070252
 ] 

ASF GitHub Bot commented on FLINK-6075:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3889#discussion_r125006376
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/calcite/RelTimeIndicatorConverter.scala
 ---
@@ -92,8 +92,19 @@ class RelTimeIndicatorConverter(rexBuilder: RexBuilder) 
extends RelShuttle {
   override def visit(minus: LogicalMinus): RelNode =
 throw new TableException("Logical minus in a stream environment is not 
supported yet.")
 
-  override def visit(sort: LogicalSort): RelNode =
-throw new TableException("Logical sort in a stream environment is not 
supported yet.")
+  override def visit(sort: LogicalSort): RelNode = {
+
+val input = sort.getInput.accept(this)
+
+val materializer = new RexTimeIndicatorMaterializer(
+  rexBuilder,
+  input.getRowType.getFieldList.map(_.getType))
+   
+//val offset = if(sort.offset != null) 
sort.offset.accept(materializer) else null
--- End diff --

Should be removed


> Support Limit/Top(Sort) for Stream SQL
> --
>
> Key: FLINK-6075
> URL: https://issues.apache.org/jira/browse/FLINK-6075
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: radu
>  Labels: features
> Attachments: sort.png
>
>
> These will be split into 3 separate JIRA issues. However, the design is the 
> same; only the processing function differs in terms of the output. Hence, 
> the design is the same for all of them.
> Time target: Proc Time
> **SQL targeted query examples:**
> *Sort example*
> Q1) `SELECT a FROM stream1 GROUP BY HOP(proctime, INTERVAL '1' HOUR, 
> INTERVAL '3' HOUR) ORDER BY b`
> Comment: window is defined using GROUP BY
> Comment: ASC or DESC keywords can be placed to mark the ordering type
> *Limit example*
> Q2) `SELECT a FROM stream1 WHERE rowtime BETWEEN current_timestamp - 
> INTERVAL '1' HOUR AND current_timestamp ORDER BY b LIMIT 10`
> Comment: window is defined using time ranges in the WHERE clause
> Comment: window is row triggered
> *Top example*
> Q3) `SELECT sum(a) OVER (ORDER BY proctime RANGE INTERVAL '1' HOUR 
> PRECEDING LIMIT 10) FROM stream1`
> Comment: limit over the contents of the sliding window
> General Comments:
> - All these SQL clauses are supported only over windows (bounded 
> collections of data).
> - Each of the 3 operators will be supported with each of the ways of 
> expressing the windows.
> **Description**
> The 3 operations (limit, top and sort) are similar in behavior, as they all 
> require a sorted collection of the data on which the logic is applied 
> (i.e., select a subset of the items or the entire sorted set). These 
> functions make sense in the streaming context only within a window. Without 
> defining a window, the functions could never emit, as the sort operation 
> would never trigger. If an SQL query is provided without limits, an error 
> will be thrown (`SELECT a FROM stream1 TOP 10` -> ERROR).
> Although not targeted by this JIRA, when working with event-time order, the 
> retraction mechanisms of windows and the lateness mechanisms can be used to 
> deal with out-of-order events and retractions/updates of results.
> **Functionality example**
> We exemplify with the query below for all 3 types of operators (sorting, 
> limit and top). Rowtime indicates when the HOP window will trigger, which 
> can be observed in the fact that outputs are generated only at those 
> moments. The HOP windows will trigger at every hour (fixed hour) and each 
> event will contribute to / be duplicated across 2 consecutive hour 
> intervals. Proctime indicates the processing time when a new event arrives 
> in the system. Events are of the type (a,b), with the ordering applied on 
> the b field.
> `SELECT a FROM stream1 HOP(proctime, INTERVAL '1' HOUR, INTERVAL '2' HOUR) 
> ORDER BY b (LIMIT 2 / TOP 2 / [ASC/DESC])`
> ||Rowtime||Proctime||Stream1||Limit 2||Top 2||Sort [ASC]||
> | |10:00:00|(aaa, 11)| | | |
> | |10:05:00|(aab, 7)| | | |
> |10-11|11:00:00| |aab,aaa|aab,aaa|aab,aaa|
> | |11:03:00|(aac,21)| | | |
> |11-12|12:00:00| |aab,aaa|aab,aaa|aab,aaa,aac|
> | |12:10:00|(abb,12)| | | |
> | |12:15:00|(abb,12)| | | |

[jira] [Commented] (FLINK-6075) Support Limit/Top(Sort) for Stream SQL

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070250#comment-16070250
 ] 

ASF GitHub Bot commented on FLINK-6075:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3889#discussion_r125063072
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/stream/sql/SortITCase.scala
 ---
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.api.scala.stream.sql
+
+import org.apache.flink.api.scala._
+import 
org.apache.flink.table.api.scala.stream.sql.SortITCase.StringRowSelectorSink
+import 
org.apache.flink.table.api.scala.stream.sql.TimeTestUtil.EventTimeSourceFunction
+import org.apache.flink.streaming.api.functions.source.SourceFunction
+import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
+import org.apache.flink.table.api.TableEnvironment
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.api.scala.stream.utils.{StreamITCase, 
StreamTestData, StreamingWithStateTestBase}
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo
+import org.apache.flink.api.java.typeutils.RowTypeInfo
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.types.Row
+import org.junit.Assert._
+import org.junit._
+import org.apache.flink.streaming.api.TimeCharacteristic
+import 
org.apache.flink.streaming.api.functions.source.SourceFunction.SourceContext
+import org.apache.flink.streaming.api.watermark.Watermark
+import scala.collection.mutable
+import org.apache.flink.streaming.api.functions.sink.RichSinkFunction
+
+class SortITCase extends StreamingWithStateTestBase {
+
+  @Test
+  def testEventTimeOrderBy(): Unit = {
+val data = Seq(
+  Left((1500L, (1L, 15, "Hello"))),
+  Left((1600L, (1L, 16, "Hello"))),
+  Left((1000L, (1L, 1, "Hello"))),
+  Left((2000L, (2L, 2, "Hello"))),
+  Right(1000L),
+  Left((2000L, (2L, 2, "Hello"))),
+  Left((2000L, (2L, 3, "Hello"))),
+  Left((3000L, (3L, 3, "Hello"))),
+  Left((2000L, (3L, 1, "Hello"))),
+  Right(2000L),
+  Left((4000L, (4L, 4, "Hello"))),
+  Right(3000L),
+  Left((5000L, (5L, 5, "Hello"))),
+  Right(5000L),
+  Left((6000L, (6L, 65, "Hello"))),
+  Left((6000L, (6L, 6, "Hello"))),
+  Left((6000L, (6L, 67, "Hello"))),
+  Left((6000L, (6L, -1, "Hello"))),
+  Left((6000L, (6L, 6, "Hello"))),
+  Right(7000L),
+  Left((9000L, (6L, 9, "Hello"))),
+  Left((8500L, (6L, 18, "Hello"))),
+  Left((9000L, (6L, 7, "Hello"))),
+  Right(1L),
+  Left((1L, (7L, 7, "Hello World"))),
+  Left((11000L, (7L, 77, "Hello World"))),
+  Left((11000L, (7L, 17, "Hello World"))),
+  Right(12000L),
+  Left((14000L, (7L, 18, "Hello World"))),
+  Right(14000L),
+  Left((15000L, (8L, 8, "Hello World"))),
+  Right(17000L),
+  Left((2L, (20L, 20, "Hello World"))), 
+  Right(19000L))
+
+val env = StreamExecutionEnvironment.getExecutionEnvironment
+env.setStateBackend(getStateBackend)
+env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
+val tEnv = TableEnvironment.getTableEnvironment(env)
+StreamITCase.clear
+
+val t1 = env.addSource(new EventTimeSourceFunction[(Long, Int, 
String)](data))
+  .toTable(tEnv, 'a, 'b, 'c, 'rowtime.rowtime)
+  
+tEnv.registerTable("T1", t1)
+
+val  sqlQuery = "SELECT b FROM T1 " +
+  "ORDER BY rowtime, b ASC ";
+  
+  
+val result = tEnv.sql(sqlQuery).toDataStream[Row]
--- End diff --

OK, will do that before merging


> Support Limit/Top(Sort) for Stream SQL
> 

[jira] [Commented] (FLINK-6075) Support Limit/Top(Sort) for Stream SQL

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070248#comment-16070248
 ] 

ASF GitHub Bot commented on FLINK-6075:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3889#discussion_r125058559
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/aggregate/TimeSortProcessFunctionTest.scala
 ---
@@ -122,6 +130,11 @@ class TimeSortProcessFunctionTest{
   Row.of(1: JInt, 12L: JLong, 1: JInt, "aaa", 11L: JLong),true), 4))
 expectedOutput.add(new StreamRecord(new CRow(
   Row.of(1: JInt, 12L: JLong, 0: JInt, "aaa", 11L: JLong),true), 4))
+
+expectedOutput.add(new StreamRecord(new CRow(
+  Row.of(1: JInt, 1L: JLong, 0: JInt, "aaa", 11L: JLong),true), 1006))
+expectedOutput.add(new StreamRecord(new CRow(
+  Row.of(1: JInt, 2L: JLong, 0: JInt, "aaa", 11L: JLong),true), 1006))
   
 TestHarnessUtil.assertOutputEqualsSorted("Output was not correct.",
--- End diff --

The test should check that the ProcessFunction emits the rows in the correct 
order. `assertOutputEqualsSorted` sorts the result and expected data before 
comparing them. We have to use `assertOutputEquals` instead.


> Support Limit/Top(Sort) for Stream SQL
> --
>
> Key: FLINK-6075
> URL: https://issues.apache.org/jira/browse/FLINK-6075
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: radu
>  Labels: features
> Attachments: sort.png
>
>
> These will be split into 3 separate JIRA issues. However, the design is the 
> same; only the processing function differs in terms of the output. Hence, 
> the design is the same for all of them.
> Time target: Proc Time
> **SQL targeted query examples:**
> *Sort example*
> Q1) `SELECT a FROM stream1 GROUP BY HOP(proctime, INTERVAL '1' HOUR, 
> INTERVAL '3' HOUR) ORDER BY b`
> Comment: window is defined using GROUP BY
> Comment: ASC or DESC keywords can be placed to mark the ordering type
> *Limit example*
> Q2) `SELECT a FROM stream1 WHERE rowtime BETWEEN current_timestamp - 
> INTERVAL '1' HOUR AND current_timestamp ORDER BY b LIMIT 10`
> Comment: window is defined using time ranges in the WHERE clause
> Comment: window is row triggered
> *Top example*
> Q3) `SELECT sum(a) OVER (ORDER BY proctime RANGE INTERVAL '1' HOUR 
> PRECEDING LIMIT 10) FROM stream1`
> Comment: limit over the contents of the sliding window
> General Comments:
> - All these SQL clauses are supported only over windows (bounded 
> collections of data).
> - Each of the 3 operators will be supported with each of the ways of 
> expressing the windows.
> **Description**
> The 3 operations (limit, top and sort) are similar in behavior, as they all 
> require a sorted collection of the data on which the logic is applied 
> (i.e., select a subset of the items or the entire sorted set). These 
> functions make sense in the streaming context only within a window. Without 
> defining a window, the functions could never emit, as the sort operation 
> would never trigger. If an SQL query is provided without limits, an error 
> will be thrown (`SELECT a FROM stream1 TOP 10` -> ERROR).
> Although not targeted by this JIRA, when working with event-time order, the 
> retraction mechanisms of windows and the lateness mechanisms can be used to 
> deal with out-of-order events and retractions/updates of results.
> **Functionality example**
> We exemplify with the query below for all 3 types of operators (sorting, 
> limit and top). Rowtime indicates when the HOP window will trigger, which 
> can be observed in the fact that outputs are generated only at those 
> moments. The HOP windows will trigger at every hour (fixed hour) and each 
> event will contribute to / be duplicated across 2 consecutive hour 
> intervals. Proctime indicates the processing time when a new event arrives 
> in the system. Events are of the type (a,b), with the ordering applied on 
> the b field.
> `SELECT a FROM stream1 HOP(proctime, INTERVAL '1' HOUR, INTERVAL '2' HOUR) 
> ORDER BY b (LIMIT 2 / TOP 2 / [ASC/DESC])`
> ||Rowtime||Proctime||Stream1||Limit 2||Top 2||Sort [ASC]||
> | |10:00:00|(aaa, 11)| | | |
> | |10:05:00|(aab, 7)| | | |
> |10-11|11:00:00| |aab,aaa|aab,aaa|aab,aaa|
> | |11:03:00|(aac,21)| | | |
> |11-12|12:00:00| |aab,aaa|aab,aaa|aab,aaa,aac|
> | |12:10:00|(abb,12)| | | |

[jira] [Commented] (FLINK-6075) Support Limit/Top(Sort) for Stream SQL

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070253#comment-16070253
 ] 

ASF GitHub Bot commented on FLINK-6075:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3889#discussion_r125006340
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/calcite/RelTimeIndicatorConverter.scala
 ---
@@ -92,8 +92,19 @@ class RelTimeIndicatorConverter(rexBuilder: RexBuilder) 
extends RelShuttle {
   override def visit(minus: LogicalMinus): RelNode =
 throw new TableException("Logical minus in a stream environment is not 
supported yet.")
 
-  override def visit(sort: LogicalSort): RelNode =
-throw new TableException("Logical sort in a stream environment is not 
supported yet.")
+  override def visit(sort: LogicalSort): RelNode = {
+
+val input = sort.getInput.accept(this)
+
+val materializer = new RexTimeIndicatorMaterializer(
--- End diff --

unused. Should be removed


> Support Limit/Top(Sort) for Stream SQL
> --
>
> Key: FLINK-6075
> URL: https://issues.apache.org/jira/browse/FLINK-6075
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: radu
>  Labels: features
> Attachments: sort.png
>
>
> These will be split into 3 separate JIRA issues. However, the design is the 
> same; only the processing function differs in terms of the output. Hence, 
> the design is the same for all of them.
> Time target: Proc Time
> **SQL targeted query examples:**
> *Sort example*
> Q1) `SELECT a FROM stream1 GROUP BY HOP(proctime, INTERVAL '1' HOUR, 
> INTERVAL '3' HOUR) ORDER BY b`
> Comment: window is defined using GROUP BY
> Comment: ASC or DESC keywords can be placed to mark the ordering type
> *Limit example*
> Q2) `SELECT a FROM stream1 WHERE rowtime BETWEEN current_timestamp - 
> INTERVAL '1' HOUR AND current_timestamp ORDER BY b LIMIT 10`
> Comment: window is defined using time ranges in the WHERE clause
> Comment: window is row triggered
> *Top example*
> Q3) `SELECT sum(a) OVER (ORDER BY proctime RANGE INTERVAL '1' HOUR 
> PRECEDING LIMIT 10) FROM stream1`
> Comment: limit over the contents of the sliding window
> General Comments:
> - All these SQL clauses are supported only over windows (bounded 
> collections of data).
> - Each of the 3 operators will be supported with each of the ways of 
> expressing the windows.
> **Description**
> The 3 operations (limit, top and sort) are similar in behavior, as they all 
> require a sorted collection of the data on which the logic is applied 
> (i.e., select a subset of the items or the entire sorted set). These 
> functions make sense in the streaming context only within a window. Without 
> defining a window, the functions could never emit, as the sort operation 
> would never trigger. If an SQL query is provided without limits, an error 
> will be thrown (`SELECT a FROM stream1 TOP 10` -> ERROR).
> Although not targeted by this JIRA, when working with event-time order, the 
> retraction mechanisms of windows and the lateness mechanisms can be used to 
> deal with out-of-order events and retractions/updates of results.
> **Functionality example**
> We exemplify with the query below for all 3 types of operators (sorting, 
> limit and top). Rowtime indicates when the HOP window will trigger, which 
> can be observed in the fact that outputs are generated only at those 
> moments. The HOP windows will trigger at every hour (fixed hour) and each 
> event will contribute to / be duplicated across 2 consecutive hour 
> intervals. Proctime indicates the processing time when a new event arrives 
> in the system. Events are of the type (a,b), with the ordering applied on 
> the b field.
> `SELECT a FROM stream1 HOP(proctime, INTERVAL '1' HOUR, INTERVAL '2' HOUR) 
> ORDER BY b (LIMIT 2 / TOP 2 / [ASC/DESC])`
> ||Rowtime||Proctime||Stream1||Limit 2||Top 2||Sort [ASC]||
> | |10:00:00|(aaa, 11)| | | |
> | |10:05:00|(aab, 7)| | | |
> |10-11|11:00:00| |aab,aaa|aab,aaa|aab,aaa|
> | |11:03:00|(aac,21)| | | |
> |11-12|12:00:00| |aab,aaa|aab,aaa|aab,aaa,aac|
> | |12:10:00|(abb,12)| | | |
> | |12:15:00|(abb,12)| | | |
> |12-13|13:00:00| |abb,abb|abb,abb|abb,abb,aac|
> |...|
> **Implementation option**
> 

[jira] [Commented] (FLINK-6075) Support Limit/Top(Sort) for Stream SQL

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070249#comment-16070249
 ] 

ASF GitHub Bot commented on FLINK-6075:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3889#discussion_r125058944
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/aggregate/TimeSortProcessFunctionTest.scala
 ---
@@ -0,0 +1,256 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.aggregate
+
+import java.util.Comparator
+import java.util.concurrent.ConcurrentLinkedQueue
+import java.lang.{Integer => JInt, Long => JLong}
+
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, 
TypeInformation}
+import org.apache.flink.api.java.functions.KeySelector
+import org.apache.flink.api.java.typeutils.RowTypeInfo
+import org.apache.flink.streaming.api.operators.KeyedProcessOperator
+import org.apache.flink.streaming.api.watermark.Watermark
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord
+import 
org.apache.flink.streaming.util.{KeyedOneInputStreamOperatorTestHarness, 
TestHarnessUtil}
+import org.apache.flink.types.Row
+import org.junit.Test
+import org.apache.flink.table.runtime.aggregate.ProcTimeSortProcessFunction
+import org.apache.flink.table.runtime.aggregate.RowTimeSortProcessFunction
+import 
org.apache.flink.table.runtime.aggregate.TimeSortProcessFunctionTest._
+import org.apache.flink.api.java.typeutils.runtime.RowComparator
+import org.apache.flink.api.common.typeutils.TypeSerializer
+import org.apache.flink.api.common.typeutils.TypeComparator
+import org.apache.flink.table.runtime.types.{CRow, CRowTypeInfo}
+import org.apache.flink.streaming.api.TimeCharacteristic
+
+class TimeSortProcessFunctionTest{
+
+  
+  @Test
+  def testSortProcTimeHarnessPartitioned(): Unit = {
+
+val rT =  new RowTypeInfo(Array[TypeInformation[_]](
+  INT_TYPE_INFO,
+  LONG_TYPE_INFO,
+  INT_TYPE_INFO,
+  STRING_TYPE_INFO,
+  LONG_TYPE_INFO),
+  Array("a","b","c","d","e"))
+
+val rTA =  new RowTypeInfo(Array[TypeInformation[_]](
+ LONG_TYPE_INFO), Array("count"))
+val indexes = Array(1,2)
+  
+val fieldComps = Array[TypeComparator[AnyRef]](
+  LONG_TYPE_INFO.createComparator(true, 
null).asInstanceOf[TypeComparator[AnyRef]],
+  INT_TYPE_INFO.createComparator(false, 
null).asInstanceOf[TypeComparator[AnyRef]] )
+val booleanOrders = Array(true, false)
+
+
+val rowComp = new RowComparator(
+  rT.getTotalFields,
+  indexes,
+  fieldComps,
+  new Array[TypeSerializer[AnyRef]](0), //used only for serialized 
comparisons
+  booleanOrders)
+
+val collectionRowComparator = new CollectionRowComparator(rowComp)
+
+val inputCRowType = CRowTypeInfo(rT)
+
+val processFunction = new KeyedProcessOperator[Integer,CRow,CRow](
+  new ProcTimeSortProcessFunction(
+inputCRowType,
+collectionRowComparator))
+  
+   val testHarness = new 
KeyedOneInputStreamOperatorTestHarness[Integer,CRow,CRow](
+  processFunction, 
+  new TupleRowSelector(0), 
+  BasicTypeInfo.INT_TYPE_INFO)
+
+   testHarness.open();
+
+   testHarness.setProcessingTime(3)
+
+  // timestamp is ignored in processing time
+testHarness.processElement(new StreamRecord(new CRow(
+  Row.of(1: JInt, 11L: JLong, 1: JInt, "aaa", 11L: JLong), true), 
1001))
+testHarness.processElement(new StreamRecord(new CRow(
+Row.of(1: JInt, 12L: JLong, 1: JInt, "aaa", 11L: JLong), true), 
2002))
+testHarness.processElement(new StreamRecord(new CRow(
+

[jira] [Commented] (FLINK-6075) Support Limit/Top(Sort) for Stream SQL

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070251#comment-16070251
 ] 

ASF GitHub Bot commented on FLINK-6075:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3889#discussion_r125040180
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamSort.scala
 ---
@@ -0,0 +1,212 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.plan.nodes.datastream
+
+import org.apache.calcite.plan.{ RelOptCluster, RelTraitSet }
--- End diff --

About half of the imports are unused. Other classes have unused imports as 
well.


> Support Limit/Top(Sort) for Stream SQL
> --
>
> Key: FLINK-6075
> URL: https://issues.apache.org/jira/browse/FLINK-6075
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: radu
>  Labels: features
> Attachments: sort.png
>
>
> These will be split into 3 separate JIRA issues. However, the design is the 
> same; only the processing function differs in terms of the output. Hence, 
> the design is the same for all of them.
> Time target: Proc Time
> **SQL targeted query examples:**
> *Sort example*
> Q1) `SELECT a FROM stream1 GROUP BY HOP(proctime, INTERVAL '1' HOUR, 
> INTERVAL '3' HOUR) ORDER BY b`
> Comment: window is defined using GROUP BY
> Comment: ASC or DESC keywords can be placed to mark the ordering type
> *Limit example*
> Q2) `SELECT a FROM stream1 WHERE rowtime BETWEEN current_timestamp - 
> INTERVAL '1' HOUR AND current_timestamp ORDER BY b LIMIT 10`
> Comment: window is defined using time ranges in the WHERE clause
> Comment: window is row triggered
> *Top example*
> Q3) `SELECT sum(a) OVER (ORDER BY proctime RANGE INTERVAL '1' HOUR 
> PRECEDING LIMIT 10) FROM stream1`
> Comment: limit over the contents of the sliding window
> General Comments:
> - All these SQL clauses are supported only over windows (bounded 
> collections of data).
> - Each of the 3 operators will be supported with each of the ways of 
> expressing the windows.
> **Description**
> The 3 operations (limit, top and sort) are similar in behavior, as they all 
> require a sorted collection of the data on which the logic is applied 
> (i.e., select a subset of the items or the entire sorted set). These 
> functions make sense in the streaming context only within a window. Without 
> defining a window, the functions could never emit, as the sort operation 
> would never trigger. If an SQL query is provided without limits, an error 
> will be thrown (`SELECT a FROM stream1 TOP 10` -> ERROR).
> Although not targeted by this JIRA, when working with event-time order, the 
> retraction mechanisms of windows and the lateness mechanisms can be used to 
> deal with out-of-order events and retractions/updates of results.
> **Functionality example**
> We exemplify with the query below for all 3 types of operators (sorting, 
> limit and top). Rowtime indicates when the HOP window will trigger, which 
> can be observed in the fact that outputs are generated only at those 
> moments. The HOP windows will trigger at every hour (fixed hour) and each 
> event will contribute to / be duplicated across 2 consecutive hour 
> intervals. Proctime indicates the processing time when a new event arrives 
> in the system. Events are of the type (a,b), with the ordering applied on 
> the b field.
> `SELECT a FROM stream1 HOP(proctime, INTERVAL '1' HOUR, INTERVAL '2' HOUR) 
> ORDER BY b (LIMIT 2 / TOP 2 / [ASC/DESC])`
> ||Rowtime||Proctime||Stream1||Limit 2||Top 2||Sort [ASC]||
> | |10:00:00|(aaa, 11)| | | |
> | |10:05:00|(aab, 7)| | | |
> |10-11|11:00:00| |aab,aaa|aab,aaa|aab,aaa|

[GitHub] flink pull request #3889: [FLINK-6075] - Support Limit/Top(Sort) for Stream ...

2017-06-30 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3889#discussion_r125006376
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/calcite/RelTimeIndicatorConverter.scala
 ---
@@ -92,8 +92,19 @@ class RelTimeIndicatorConverter(rexBuilder: RexBuilder) 
extends RelShuttle {
   override def visit(minus: LogicalMinus): RelNode =
 throw new TableException("Logical minus in a stream environment is not 
supported yet.")
 
-  override def visit(sort: LogicalSort): RelNode =
-throw new TableException("Logical sort in a stream environment is not 
supported yet.")
+  override def visit(sort: LogicalSort): RelNode = {
+
+val input = sort.getInput.accept(this)
+
+val materializer = new RexTimeIndicatorMaterializer(
+  rexBuilder,
+  input.getRowType.getFieldList.map(_.getType))
+   
+//val offset = if(sort.offset != null) 
sort.offset.accept(materializer) else null
--- End diff --

Should be removed


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3889: [FLINK-6075] - Support Limit/Top(Sort) for Stream ...

2017-06-30 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3889#discussion_r125063072
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/stream/sql/SortITCase.scala
 ---
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.api.scala.stream.sql
+
+import org.apache.flink.api.scala._
+import 
org.apache.flink.table.api.scala.stream.sql.SortITCase.StringRowSelectorSink
+import 
org.apache.flink.table.api.scala.stream.sql.TimeTestUtil.EventTimeSourceFunction
+import org.apache.flink.streaming.api.functions.source.SourceFunction
+import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
+import org.apache.flink.table.api.TableEnvironment
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.api.scala.stream.utils.{StreamITCase, 
StreamTestData, StreamingWithStateTestBase}
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo
+import org.apache.flink.api.java.typeutils.RowTypeInfo
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.types.Row
+import org.junit.Assert._
+import org.junit._
+import org.apache.flink.streaming.api.TimeCharacteristic
+import 
org.apache.flink.streaming.api.functions.source.SourceFunction.SourceContext
+import org.apache.flink.streaming.api.watermark.Watermark
+import scala.collection.mutable
+import org.apache.flink.streaming.api.functions.sink.RichSinkFunction
+
+class SortITCase extends StreamingWithStateTestBase {
+
+  @Test
+  def testEventTimeOrderBy(): Unit = {
+val data = Seq(
+  Left((1500L, (1L, 15, "Hello"))),
+  Left((1600L, (1L, 16, "Hello"))),
+  Left((1000L, (1L, 1, "Hello"))),
+  Left((2000L, (2L, 2, "Hello"))),
+  Right(1000L),
+  Left((2000L, (2L, 2, "Hello"))),
+  Left((2000L, (2L, 3, "Hello"))),
+  Left((3000L, (3L, 3, "Hello"))),
+  Left((2000L, (3L, 1, "Hello"))),
+  Right(2000L),
+  Left((4000L, (4L, 4, "Hello"))),
+  Right(3000L),
+  Left((5000L, (5L, 5, "Hello"))),
+  Right(5000L),
+  Left((6000L, (6L, 65, "Hello"))),
+  Left((6000L, (6L, 6, "Hello"))),
+  Left((6000L, (6L, 67, "Hello"))),
+  Left((6000L, (6L, -1, "Hello"))),
+  Left((6000L, (6L, 6, "Hello"))),
+  Right(7000L),
+  Left((9000L, (6L, 9, "Hello"))),
+  Left((8500L, (6L, 18, "Hello"))),
+  Left((9000L, (6L, 7, "Hello"))),
+  Right(1L),
+  Left((1L, (7L, 7, "Hello World"))),
+  Left((11000L, (7L, 77, "Hello World"))),
+  Left((11000L, (7L, 17, "Hello World"))),
+  Right(12000L),
+  Left((14000L, (7L, 18, "Hello World"))),
+  Right(14000L),
+  Left((15000L, (8L, 8, "Hello World"))),
+  Right(17000L),
+  Left((2L, (20L, 20, "Hello World"))), 
+  Right(19000L))
+
+val env = StreamExecutionEnvironment.getExecutionEnvironment
+env.setStateBackend(getStateBackend)
+env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
+val tEnv = TableEnvironment.getTableEnvironment(env)
+StreamITCase.clear
+
+val t1 = env.addSource(new EventTimeSourceFunction[(Long, Int, 
String)](data))
+  .toTable(tEnv, 'a, 'b, 'c, 'rowtime.rowtime)
+  
+tEnv.registerTable("T1", t1)
+
+val  sqlQuery = "SELECT b FROM T1 " +
+  "ORDER BY rowtime, b ASC ";
+  
+  
+val result = tEnv.sql(sqlQuery).toDataStream[Row]
--- End diff --

OK, will do that before merging


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request #3889: [FLINK-6075] - Support Limit/Top(Sort) for Stream ...

2017-06-30 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3889#discussion_r125040180
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamSort.scala
 ---
@@ -0,0 +1,212 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.plan.nodes.datastream
+
+import org.apache.calcite.plan.{ RelOptCluster, RelTraitSet }
--- End diff --

About half of the imports are unused. Other classes have unused imports as 
well.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-6916) FLIP-19: Improved BLOB storage architecture

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070243#comment-16070243
 ] 

ASF GitHub Bot commented on FLINK-6916:
---

Github user NicoK closed the pull request at:

https://github.com/apache/flink/pull/4176


> FLIP-19: Improved BLOB storage architecture
> ---
>
> Key: FLINK-6916
> URL: https://issues.apache.org/jira/browse/FLINK-6916
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Coordination, Network
>Affects Versions: 1.4.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>
> The current architecture around the BLOB server and cache components seems 
> rather patched up and has some issues regarding concurrency ([FLINK-6380]), 
> cleanup, API inconsistencies / currently unused API ([FLINK-6329], 
> [FLINK-6008]). These make future integration with FLIP-6 or extensions like 
> offloading oversized RPC messages ([FLINK-6046]) difficult. We therefore 
> propose an improvement on the current architecture as described below which 
> tackles these issues, provides some cleanup, and enables further BLOB server 
> use cases.
> Please refer to 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-19%3A+Improved+BLOB+storage+architecture
>  for a full overview on the proposed changes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6916) FLIP-19: Improved BLOB storage architecture

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070242#comment-16070242
 ] 

ASF GitHub Bot commented on FLINK-6916:
---

Github user NicoK commented on the issue:

https://github.com/apache/flink/pull/4176
  
superseded by #4237 


> FLIP-19: Improved BLOB storage architecture
> ---
>
> Key: FLINK-6916
> URL: https://issues.apache.org/jira/browse/FLINK-6916
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Coordination, Network
>Affects Versions: 1.4.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>
> The current architecture around the BLOB server and cache components seems 
> rather patched up and has some issues regarding concurrency ([FLINK-6380]), 
> cleanup, API inconsistencies / currently unused API ([FLINK-6329], 
> [FLINK-6008]). These make future integration with FLIP-6 or extensions like 
> offloading oversized RPC messages ([FLINK-6046]) difficult. We therefore 
> propose an improvement on the current architecture as described below which 
> tackles these issues, provides some cleanup, and enables further BLOB server 
> use cases.
> Please refer to 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-19%3A+Improved+BLOB+storage+architecture
>  for a full overview on the proposed changes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4176: [FLINK-6916][blob] add API to allow job-related BLOBs to ...

2017-06-30 Thread NicoK
Github user NicoK commented on the issue:

https://github.com/apache/flink/pull/4176
  
superseded by #4237 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #4176: [FLINK-6916][blob] add API to allow job-related BL...

2017-06-30 Thread NicoK
Github user NicoK closed the pull request at:

https://github.com/apache/flink/pull/4176


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #4237: [FLINK-7056][blob] add API to allow job-related BL...

2017-06-30 Thread NicoK
GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/4237

[FLINK-7056][blob] add API to allow job-related BLOBs to be stored

To ease cleanup, we will have job-related BLOBs reflected in the blob 
storage so that they can be removed along with the job. This adds the `jobId` 
to many methods, similar to the previous code from the `NAME_ADDRESSABLE` mode.
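
As a rough illustration of the direction (a hypothetical trait with 
illustrative names, not the actual `BlobService` interface):

    import java.io.File
    import org.apache.flink.api.common.JobID

    // stand-in for org.apache.flink.runtime.blob.BlobKey
    final class BlobKeyLike

    trait JobScopedBlobStore {
      // job-related BLOBs: stored under the job's ID, cleaned up with the job
      def putBlob(jobId: JobID, data: Array[Byte]): BlobKeyLike
      def getBlob(jobId: JobID, key: BlobKeyLike): File
      def cleanupJob(jobId: JobID): Unit

      // job-unrelated BLOBs: the caller is responsible for manual cleanup
      def putBlob(data: Array[Byte]): BlobKeyLike
      def deleteBlob(key: BlobKeyLike): Unit
    }

Keying job-related BLOBs by `JobID` is what makes the per-job cleanup on the 
server and cache possible.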

This PR is based upon #4236  in a series to implement FLINK-6916.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink flink-6916-7056

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4237.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4237


commit d54a316cfffd8243980df561fd4fcbd99934a40b
Author: Nico Kruber 
Date:   2016-12-20T15:49:57Z

[FLINK-6008][docs] minor improvements in the BlobService docs

commit b215515fa14d3f6af218e86b67bc2c27ae9d4f4f
Author: Nico Kruber 
Date:   2016-12-20T17:27:13Z

[FLINK-6008] refactor BlobCache#getURL() for cleaner code

commit bbcde52b3105fcf379c852b568f3893cc6052ce6
Author: Nico Kruber 
Date:   2016-12-21T15:23:29Z

[FLINK-6008] do not fail the BlobServer if delete fails

also extend the delete tests and remove one code duplication

commit dda1a12e40027724efb0e50005e5b57058a220f0
Author: Nico Kruber 
Date:   2017-01-06T17:42:58Z

[FLINK-6008][docs] update some config options to the new, non-deprecated 
ones

commit e12c2348b237207a50649d515a0fbbd19f92e6a0
Author: Nico Kruber 
Date:   2017-03-09T17:14:02Z

[FLINK-6008] use Preconditions.checkArgument in BlobClient

commit 24060e01332c6df9fd01f1dc5f321c3fda9301c1
Author: Nico Kruber 
Date:   2017-03-17T15:21:40Z

[FLINK-6008] fix concurrent job directory creation

also add according unit tests

commit 2e0d16ab8bf8a48a2d028602a3a7693fc4b76039
Author: Nico Kruber 
Date:   2017-06-14T16:01:47Z

[FLINK-6008] do not guard a delete() call with a check for existence

commit 7ba911d7ecb4861261dff8509996be0bd64d6d27
Author: Nico Kruber 
Date:   2017-04-18T14:37:37Z

[FLINK-6008] some comments about BlobLibraryCacheManager cleanup

commit d3f50d595f85356ae6ed0a85e1f8b8e8ac630bde
Author: Nico Kruber 
Date:   2017-04-19T13:39:03Z

[hotfix] minor typos

commit 79b6ce35a9e246b35415a388295f9ee2fc19a82e
Author: Nico Kruber 
Date:   2017-04-19T14:10:16Z

[FLINK-6008] further cleanup tests for BlobLibraryCacheManager

commit 23fb6ecd6c43c86d762503339c67953290236dca
Author: Nico Kruber 
Date:   2017-06-30T14:03:16Z

[FLINK-6008] address PR comments

commit 794764ceeed6b9bbbac08662f5754b218ff86c9c
Author: Nico Kruber 
Date:   2017-06-16T08:51:04Z

[FLINK-7052][blob] remove (unused) NAME_ADDRESSABLE mode

commit 774bafa85f242110a2ce7907c1150f8c62d73b3f
Author: Nico Kruber 
Date:   2017-06-21T15:05:57Z

[FLINK-7052][blob] remove further unused code due to the NAME_ADDRESSABLE 
removal

commit 4da3b3f6269e43bf1c66621099528824cad9373f
Author: Nico Kruber 
Date:   2017-06-22T15:31:17Z

[FLINK-7053][blob] remove code duplication in BlobClientSslTest

This lets BlobClientSslTest extend BlobClientTest as most of its 
implementation came from there and was simply copied.

commit aa9cdc820f9ca1a38a19708bf45a2099e42eaf48
Author: Nico Kruber 
Date:   2017-06-23T09:40:34Z

[FLINK-7053][blob] verify some of the buffers returned by GET

commit c9b693a46053b55b3939ff471184796f12d36a72
Author: Nico Kruber 
Date:   2017-06-23T10:04:10Z

[FLINK-7053][blob] use TemporaryFolder for local BLOB dir in unit tests

This replaces the use of some temporary directory where it is not guaranteed
that it will be deleted after the test.

commit 11db399d5103d9ffe9083c9b6029a7e81afa9abe
Author: Nico Kruber 
Date:   2017-06-21T12:45:31Z

[FLINK-7054][blob] remove LibraryCacheManager#getFile()

This was only used in tests where it is avoidable but if used anywhere 
else, it
may have caused cleanup issues.

commit 4ae04b68453d4b099f752d6c6fd3c09335ede33a
Author: Nico Kruber 
Date:   2017-06-21T14:14:15Z

[FLINK-7055][blob] refactor getURL() to the more generic getFile()

The fact that we always returned URL objects is a relic of the BlobServer's 
only
use for URLClassLoader. Since we'd like to extend its use, returning File
objects instead is more generic.

commit 

[jira] [Commented] (FLINK-7056) add API to allow job-related BLOBs to be stored

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070239#comment-16070239
 ] 

ASF GitHub Bot commented on FLINK-7056:
---

GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/4237

[FLINK-7056][blob] add API to allow job-related BLOBs to be stored

To ease cleanup, we will have job-related BLOBs reflected in the blob 
storage so that they can be removed along with the job. This adds the `jobId` 
to many methods, similar to the previous code from the `NAME_ADDRESSABLE` mode.

This PR is based upon #4236  in a series to implement FLINK-6916.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink flink-6916-7056

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4237.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4237


commit d54a316cfffd8243980df561fd4fcbd99934a40b
Author: Nico Kruber 
Date:   2016-12-20T15:49:57Z

[FLINK-6008][docs] minor improvements in the BlobService docs

commit b215515fa14d3f6af218e86b67bc2c27ae9d4f4f
Author: Nico Kruber 
Date:   2016-12-20T17:27:13Z

[FLINK-6008] refactor BlobCache#getURL() for cleaner code

commit bbcde52b3105fcf379c852b568f3893cc6052ce6
Author: Nico Kruber 
Date:   2016-12-21T15:23:29Z

[FLINK-6008] do not fail the BlobServer if delete fails

also extend the delete tests and remove one code duplication

commit dda1a12e40027724efb0e50005e5b57058a220f0
Author: Nico Kruber 
Date:   2017-01-06T17:42:58Z

[FLINK-6008][docs] update some config options to the new, non-deprecated 
ones

commit e12c2348b237207a50649d515a0fbbd19f92e6a0
Author: Nico Kruber 
Date:   2017-03-09T17:14:02Z

[FLINK-6008] use Preconditions.checkArgument in BlobClient

commit 24060e01332c6df9fd01f1dc5f321c3fda9301c1
Author: Nico Kruber 
Date:   2017-03-17T15:21:40Z

[FLINK-6008] fix concurrent job directory creation

also add according unit tests

commit 2e0d16ab8bf8a48a2d028602a3a7693fc4b76039
Author: Nico Kruber 
Date:   2017-06-14T16:01:47Z

[FLINK-6008] do not guard a delete() call with a check for existence

commit 7ba911d7ecb4861261dff8509996be0bd64d6d27
Author: Nico Kruber 
Date:   2017-04-18T14:37:37Z

[FLINK-6008] some comments about BlobLibraryCacheManager cleanup

commit d3f50d595f85356ae6ed0a85e1f8b8e8ac630bde
Author: Nico Kruber 
Date:   2017-04-19T13:39:03Z

[hotfix] minor typos

commit 79b6ce35a9e246b35415a388295f9ee2fc19a82e
Author: Nico Kruber 
Date:   2017-04-19T14:10:16Z

[FLINK-6008] further cleanup tests for BlobLibraryCacheManager

commit 23fb6ecd6c43c86d762503339c67953290236dca
Author: Nico Kruber 
Date:   2017-06-30T14:03:16Z

[FLINK-6008] address PR comments

commit 794764ceeed6b9bbbac08662f5754b218ff86c9c
Author: Nico Kruber 
Date:   2017-06-16T08:51:04Z

[FLINK-7052][blob] remove (unused) NAME_ADDRESSABLE mode

commit 774bafa85f242110a2ce7907c1150f8c62d73b3f
Author: Nico Kruber 
Date:   2017-06-21T15:05:57Z

[FLINK-7052][blob] remove further unused code due to the NAME_ADDRESSABLE 
removal

commit 4da3b3f6269e43bf1c66621099528824cad9373f
Author: Nico Kruber 
Date:   2017-06-22T15:31:17Z

[FLINK-7053][blob] remove code duplication in BlobClientSslTest

This lets BlobClientSslTest extend BlobClientTest as most of its 
implementation
came from there and was simply copied.

commit aa9cdc820f9ca1a38a19708bf45a2099e42eaf48
Author: Nico Kruber 
Date:   2017-06-23T09:40:34Z

[FLINK-7053][blob] verify some of the buffers returned by GET

commit c9b693a46053b55b3939ff471184796f12d36a72
Author: Nico Kruber 
Date:   2017-06-23T10:04:10Z

[FLINK-7053][blob] use TemporaryFolder for local BLOB dir in unit tests

This replaces the use of some temporary directory where it is not guaranteed
that it will be deleted after the test.

commit 11db399d5103d9ffe9083c9b6029a7e81afa9abe
Author: Nico Kruber 
Date:   2017-06-21T12:45:31Z

[FLINK-7054][blob] remove LibraryCacheManager#getFile()

This was only used in tests where it is avoidable but if used anywhere 
else, it
may have caused cleanup issues.

commit 4ae04b68453d4b099f752d6c6fd3c09335ede33a
Author: Nico Kruber 
Date:   2017-06-21T14:14:15Z

[FLINK-7055][blob] refactor getURL() to the more generic 

[jira] [Updated] (FLINK-7055) refactor BlobService#getURL() methods to return a File object

2017-06-30 Thread Nico Kruber (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nico Kruber updated FLINK-7055:
---
Description: As a relic from its use by the {{URLClassLoader}}, 
{{BlobService#getURL()}} methods always returned {{URL}} objects although they 
were always pointing to locally cached files. As a step towards a better 
architecture and API, these should return a File object instead.  (was: As a 
relic from its use by the {{UrlClassLoader}}, {{BlobService#getURL()}} methods 
always returned {{URL}} objects although they were always pointing to locally 
cached files. As a step towards a better architecture and API, these should 
return a File object instead.)

> refactor BlobService#getURL() methods to return a File object
> -
>
> Key: FLINK-7055
> URL: https://issues.apache.org/jira/browse/FLINK-7055
> Project: Flink
>  Issue Type: Sub-task
>  Components: Distributed Coordination, Network
>Affects Versions: 1.4.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>
> As a relic from its use by the {{URLClassLoader}}, {{BlobService#getURL()}} 
> methods always returned {{URL}} objects although they were always pointing to 
> locally cached files. As a step towards a better architecture and API, these 
> should return a File object instead.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7055) refactor BlobService#getURL() methods to return a File object

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070237#comment-16070237
 ] 

ASF GitHub Bot commented on FLINK-7055:
---

GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/4236

[FLINK-7055][blob] refactor BlobService#getURL() methods to return a File 
object

As a relic from its use by the `URLClassLoader`, `BlobService#getURL()` 
methods always returned URL objects although they were always pointing to 
locally cached files. As a step towards a better architecture and API, these 
should be renamed and return a File object instead.
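
As a rough before/after sketch of this renaming (the interface shown is 
illustrative only, not the literal API from the PR):

{code:java}
import java.io.File;
import java.io.IOException;
import java.net.URL;

import org.apache.flink.runtime.blob.BlobKey;

// Illustrative sketch of the change described above.
public interface BlobLookupSketch {

    // before: a URL, although it always pointed to a locally cached file
    URL getURL(BlobKey key) throws IOException;

    // after: the locally cached file itself, which is more generic
    File getFile(BlobKey key) throws IOException;
}
{code}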

This PR is based upon #4235  in a series to implement FLINK-6916.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink flink-6916-7055

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4236.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4236


commit d54a316cfffd8243980df561fd4fcbd99934a40b
Author: Nico Kruber 
Date:   2016-12-20T15:49:57Z

[FLINK-6008][docs] minor improvements in the BlobService docs

commit b215515fa14d3f6af218e86b67bc2c27ae9d4f4f
Author: Nico Kruber 
Date:   2016-12-20T17:27:13Z

[FLINK-6008] refactor BlobCache#getURL() for cleaner code

commit bbcde52b3105fcf379c852b568f3893cc6052ce6
Author: Nico Kruber 
Date:   2016-12-21T15:23:29Z

[FLINK-6008] do not fail the BlobServer if delete fails

also extend the delete tests and remove one code duplication

commit dda1a12e40027724efb0e50005e5b57058a220f0
Author: Nico Kruber 
Date:   2017-01-06T17:42:58Z

[FLINK-6008][docs] update some config options to the new, non-deprecated 
ones

commit e12c2348b237207a50649d515a0fbbd19f92e6a0
Author: Nico Kruber 
Date:   2017-03-09T17:14:02Z

[FLINK-6008] use Preconditions.checkArgument in BlobClient

commit 24060e01332c6df9fd01f1dc5f321c3fda9301c1
Author: Nico Kruber 
Date:   2017-03-17T15:21:40Z

[FLINK-6008] fix concurrent job directory creation

also add according unit tests

commit 2e0d16ab8bf8a48a2d028602a3a7693fc4b76039
Author: Nico Kruber 
Date:   2017-06-14T16:01:47Z

[FLINK-6008] do not guard a delete() call with a check for existence

commit 7ba911d7ecb4861261dff8509996be0bd64d6d27
Author: Nico Kruber 
Date:   2017-04-18T14:37:37Z

[FLINK-6008] some comments about BlobLibraryCacheManager cleanup

commit d3f50d595f85356ae6ed0a85e1f8b8e8ac630bde
Author: Nico Kruber 
Date:   2017-04-19T13:39:03Z

[hotfix] minor typos

commit 79b6ce35a9e246b35415a388295f9ee2fc19a82e
Author: Nico Kruber 
Date:   2017-04-19T14:10:16Z

[FLINK-6008] further cleanup tests for BlobLibraryCacheManager

commit 23fb6ecd6c43c86d762503339c67953290236dca
Author: Nico Kruber 
Date:   2017-06-30T14:03:16Z

[FLINK-6008] address PR comments

commit 794764ceeed6b9bbbac08662f5754b218ff86c9c
Author: Nico Kruber 
Date:   2017-06-16T08:51:04Z

[FLINK-7052][blob] remove (unused) NAME_ADDRESSABLE mode

commit 774bafa85f242110a2ce7907c1150f8c62d73b3f
Author: Nico Kruber 
Date:   2017-06-21T15:05:57Z

[FLINK-7052][blob] remove further unused code due to the NAME_ADDRESSABLE 
removal

commit 4da3b3f6269e43bf1c66621099528824cad9373f
Author: Nico Kruber 
Date:   2017-06-22T15:31:17Z

[FLINK-7053][blob] remove code duplication in BlobClientSslTest

This lets BlobClientSslTest extend BlobClientTest as most of its 
implementation
came from there and was simply copied.

commit aa9cdc820f9ca1a38a19708bf45a2099e42eaf48
Author: Nico Kruber 
Date:   2017-06-23T09:40:34Z

[FLINK-7053][blob] verify some of the buffers returned by GET

commit c9b693a46053b55b3939ff471184796f12d36a72
Author: Nico Kruber 
Date:   2017-06-23T10:04:10Z

[FLINK-7053][blob] use TemporaryFolder for local BLOB dir in unit tests

This replaces the use of some temporary directory where it is not guaranteed
that it will be deleted after the test.

commit 11db399d5103d9ffe9083c9b6029a7e81afa9abe
Author: Nico Kruber 
Date:   2017-06-21T12:45:31Z

[FLINK-7054][blob] remove LibraryCacheManager#getFile()

This was only used in tests where it is avoidable but if used anywhere 
else, it
may have caused cleanup issues.

commit 4ae04b68453d4b099f752d6c6fd3c09335ede33a
Author: Nico Kruber 
Date:   2017-06-21T14:14:15Z


[GitHub] flink pull request #4236: [FLINK-7055][blob] refactor BlobService#getURL() m...

2017-06-30 Thread NicoK
GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/4236

[FLINK-7055][blob] refactor BlobService#getURL() methods to return a File 
object

As a relic from its use by the `URLClassLoader`, `BlobService#getURL()` 
methods always returned URL objects although they were always pointing to 
locally cached files. As a step towards a better architecture and API, these 
should be renamed and return a File object instead.

This PR is based upon #4235  in a series to implement FLINK-6916.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink flink-6916-7055

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4236.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4236


commit d54a316cfffd8243980df561fd4fcbd99934a40b
Author: Nico Kruber 
Date:   2016-12-20T15:49:57Z

[FLINK-6008][docs] minor improvements in the BlobService docs

commit b215515fa14d3f6af218e86b67bc2c27ae9d4f4f
Author: Nico Kruber 
Date:   2016-12-20T17:27:13Z

[FLINK-6008] refactor BlobCache#getURL() for cleaner code

commit bbcde52b3105fcf379c852b568f3893cc6052ce6
Author: Nico Kruber 
Date:   2016-12-21T15:23:29Z

[FLINK-6008] do not fail the BlobServer if delete fails

also extend the delete tests and remove one code duplication

commit dda1a12e40027724efb0e50005e5b57058a220f0
Author: Nico Kruber 
Date:   2017-01-06T17:42:58Z

[FLINK-6008][docs] update some config options to the new, non-deprecated 
ones

commit e12c2348b237207a50649d515a0fbbd19f92e6a0
Author: Nico Kruber 
Date:   2017-03-09T17:14:02Z

[FLINK-6008] use Preconditions.checkArgument in BlobClient

commit 24060e01332c6df9fd01f1dc5f321c3fda9301c1
Author: Nico Kruber 
Date:   2017-03-17T15:21:40Z

[FLINK-6008] fix concurrent job directory creation

also add according unit tests

commit 2e0d16ab8bf8a48a2d028602a3a7693fc4b76039
Author: Nico Kruber 
Date:   2017-06-14T16:01:47Z

[FLINK-6008] do not guard a delete() call with a check for existence

commit 7ba911d7ecb4861261dff8509996be0bd64d6d27
Author: Nico Kruber 
Date:   2017-04-18T14:37:37Z

[FLINK-6008] some comments about BlobLibraryCacheManager cleanup

commit d3f50d595f85356ae6ed0a85e1f8b8e8ac630bde
Author: Nico Kruber 
Date:   2017-04-19T13:39:03Z

[hotfix] minor typos

commit 79b6ce35a9e246b35415a388295f9ee2fc19a82e
Author: Nico Kruber 
Date:   2017-04-19T14:10:16Z

[FLINK-6008] further cleanup tests for BlobLibraryCacheManager

commit 23fb6ecd6c43c86d762503339c67953290236dca
Author: Nico Kruber 
Date:   2017-06-30T14:03:16Z

[FLINK-6008] address PR comments

commit 794764ceeed6b9bbbac08662f5754b218ff86c9c
Author: Nico Kruber 
Date:   2017-06-16T08:51:04Z

[FLINK-7052][blob] remove (unused) NAME_ADDRESSABLE mode

commit 774bafa85f242110a2ce7907c1150f8c62d73b3f
Author: Nico Kruber 
Date:   2017-06-21T15:05:57Z

[FLINK-7052][blob] remove further unused code due to the NAME_ADDRESSABLE 
removal

commit 4da3b3f6269e43bf1c66621099528824cad9373f
Author: Nico Kruber 
Date:   2017-06-22T15:31:17Z

[FLINK-7053][blob] remove code duplication in BlobClientSslTest

This lets BlobClientSslTest extend BlobClientTest as most of its 
implementation
came from there and was simply copied.

commit aa9cdc820f9ca1a38a19708bf45a2099e42eaf48
Author: Nico Kruber 
Date:   2017-06-23T09:40:34Z

[FLINK-7053][blob] verify some of the buffers returned by GET

commit c9b693a46053b55b3939ff471184796f12d36a72
Author: Nico Kruber 
Date:   2017-06-23T10:04:10Z

[FLINK-7053][blob] use TemporaryFolder for local BLOB dir in unit tests

This replaces the use of some temporary directory where it is not guaranteed
that it will be deleted after the test.

commit 11db399d5103d9ffe9083c9b6029a7e81afa9abe
Author: Nico Kruber 
Date:   2017-06-21T12:45:31Z

[FLINK-7054][blob] remove LibraryCacheManager#getFile()

This was only used in tests where it is avoidable but if used anywhere 
else, it
may have caused cleanup issues.

commit 4ae04b68453d4b099f752d6c6fd3c09335ede33a
Author: Nico Kruber 
Date:   2017-06-21T14:14:15Z

[FLINK-7055][blob] refactor getURL() to the more generic getFile()

The fact that we always returned URL objects is a relic of the BlobServer's 
only
use for URLClassLoader. Since we'd like to extend its use, returning File

[jira] [Commented] (FLINK-7054) remove LibraryCacheManager#getFile()

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070232#comment-16070232
 ] 

ASF GitHub Bot commented on FLINK-7054:
---

GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/4235

[FLINK-7054] [blob] remove LibraryCacheManager#getFile()

`LibraryCacheManager#getFile()` was only used in tests, where it is 
avoidable; if used anywhere else, it may have caused cleanup issues.

This PR is based upon #4234 in a series to implement FLINK-6916.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink flink-6916-7054

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4235.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4235


commit d54a316cfffd8243980df561fd4fcbd99934a40b
Author: Nico Kruber 
Date:   2016-12-20T15:49:57Z

[FLINK-6008][docs] minor improvements in the BlobService docs

commit b215515fa14d3f6af218e86b67bc2c27ae9d4f4f
Author: Nico Kruber 
Date:   2016-12-20T17:27:13Z

[FLINK-6008] refactor BlobCache#getURL() for cleaner code

commit bbcde52b3105fcf379c852b568f3893cc6052ce6
Author: Nico Kruber 
Date:   2016-12-21T15:23:29Z

[FLINK-6008] do not fail the BlobServer if delete fails

also extend the delete tests and remove one code duplication

commit dda1a12e40027724efb0e50005e5b57058a220f0
Author: Nico Kruber 
Date:   2017-01-06T17:42:58Z

[FLINK-6008][docs] update some config options to the new, non-deprecated 
ones

commit e12c2348b237207a50649d515a0fbbd19f92e6a0
Author: Nico Kruber 
Date:   2017-03-09T17:14:02Z

[FLINK-6008] use Preconditions.checkArgument in BlobClient

commit 24060e01332c6df9fd01f1dc5f321c3fda9301c1
Author: Nico Kruber 
Date:   2017-03-17T15:21:40Z

[FLINK-6008] fix concurrent job directory creation

also add according unit tests

commit 2e0d16ab8bf8a48a2d028602a3a7693fc4b76039
Author: Nico Kruber 
Date:   2017-06-14T16:01:47Z

[FLINK-6008] do not guard a delete() call with a check for existence

commit 7ba911d7ecb4861261dff8509996be0bd64d6d27
Author: Nico Kruber 
Date:   2017-04-18T14:37:37Z

[FLINK-6008] some comments about BlobLibraryCacheManager cleanup

commit d3f50d595f85356ae6ed0a85e1f8b8e8ac630bde
Author: Nico Kruber 
Date:   2017-04-19T13:39:03Z

[hotfix] minor typos

commit 79b6ce35a9e246b35415a388295f9ee2fc19a82e
Author: Nico Kruber 
Date:   2017-04-19T14:10:16Z

[FLINK-6008] further cleanup tests for BlobLibraryCacheManager

commit 23fb6ecd6c43c86d762503339c67953290236dca
Author: Nico Kruber 
Date:   2017-06-30T14:03:16Z

[FLINK-6008] address PR comments

commit 794764ceeed6b9bbbac08662f5754b218ff86c9c
Author: Nico Kruber 
Date:   2017-06-16T08:51:04Z

[FLINK-7052][blob] remove (unused) NAME_ADDRESSABLE mode

commit 774bafa85f242110a2ce7907c1150f8c62d73b3f
Author: Nico Kruber 
Date:   2017-06-21T15:05:57Z

[FLINK-7052][blob] remove further unused code due to the NAME_ADDRESSABLE 
removal

commit 4da3b3f6269e43bf1c66621099528824cad9373f
Author: Nico Kruber 
Date:   2017-06-22T15:31:17Z

[FLINK-7053][blob] remove code duplication in BlobClientSslTest

This lets BlobClientSslTest extend BlobClientTest as most of its 
implementation
came from there and was simply copied.

commit aa9cdc820f9ca1a38a19708bf45a2099e42eaf48
Author: Nico Kruber 
Date:   2017-06-23T09:40:34Z

[FLINK-7053][blob] verify some of the buffers returned by GET

commit c9b693a46053b55b3939ff471184796f12d36a72
Author: Nico Kruber 
Date:   2017-06-23T10:04:10Z

[FLINK-7053][blob] use TemporaryFolder for local BLOB dir in unit tests

This replaces the use of some temporary directory where it is not guaranteed
that it will be deleted after the test.

commit 11db399d5103d9ffe9083c9b6029a7e81afa9abe
Author: Nico Kruber 
Date:   2017-06-21T12:45:31Z

[FLINK-7054][blob] remove LibraryCacheManager#getFile()

This was only used in tests where it is avoidable but if used anywhere 
else, it
may have caused cleanup issues.




> remove LibraryCacheManager#getFile()
> 
>
> Key: FLINK-7054
> URL: https://issues.apache.org/jira/browse/FLINK-7054
> Project: Flink
>  Issue Type: Sub-task
>  Components: Distributed 

[GitHub] flink pull request #4235: [FLINK-7054] [blob] remove LibraryCacheManager#get...

2017-06-30 Thread NicoK
GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/4235

[FLINK-7054] [blob] remove LibraryCacheManager#getFile()

`LibraryCacheManager#getFile()` was only used in tests, where it is 
avoidable; if used anywhere else, it may have caused cleanup issues.

This PR is based upon #4234 in a series to implement FLINK-6916.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink flink-6916-7054

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4235.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4235


commit d54a316cfffd8243980df561fd4fcbd99934a40b
Author: Nico Kruber 
Date:   2016-12-20T15:49:57Z

[FLINK-6008][docs] minor improvements in the BlobService docs

commit b215515fa14d3f6af218e86b67bc2c27ae9d4f4f
Author: Nico Kruber 
Date:   2016-12-20T17:27:13Z

[FLINK-6008] refactor BlobCache#getURL() for cleaner code

commit bbcde52b3105fcf379c852b568f3893cc6052ce6
Author: Nico Kruber 
Date:   2016-12-21T15:23:29Z

[FLINK-6008] do not fail the BlobServer if delete fails

also extend the delete tests and remove one code duplication

commit dda1a12e40027724efb0e50005e5b57058a220f0
Author: Nico Kruber 
Date:   2017-01-06T17:42:58Z

[FLINK-6008][docs] update some config options to the new, non-deprecated 
ones

commit e12c2348b237207a50649d515a0fbbd19f92e6a0
Author: Nico Kruber 
Date:   2017-03-09T17:14:02Z

[FLINK-6008] use Preconditions.checkArgument in BlobClient

commit 24060e01332c6df9fd01f1dc5f321c3fda9301c1
Author: Nico Kruber 
Date:   2017-03-17T15:21:40Z

[FLINK-6008] fix concurrent job directory creation

also add according unit tests

commit 2e0d16ab8bf8a48a2d028602a3a7693fc4b76039
Author: Nico Kruber 
Date:   2017-06-14T16:01:47Z

[FLINK-6008] do not guard a delete() call with a check for existence

commit 7ba911d7ecb4861261dff8509996be0bd64d6d27
Author: Nico Kruber 
Date:   2017-04-18T14:37:37Z

[FLINK-6008] some comments about BlobLibraryCacheManager cleanup

commit d3f50d595f85356ae6ed0a85e1f8b8e8ac630bde
Author: Nico Kruber 
Date:   2017-04-19T13:39:03Z

[hotfix] minor typos

commit 79b6ce35a9e246b35415a388295f9ee2fc19a82e
Author: Nico Kruber 
Date:   2017-04-19T14:10:16Z

[FLINK-6008] further cleanup tests for BlobLibraryCacheManager

commit 23fb6ecd6c43c86d762503339c67953290236dca
Author: Nico Kruber 
Date:   2017-06-30T14:03:16Z

[FLINK-6008] address PR comments

commit 794764ceeed6b9bbbac08662f5754b218ff86c9c
Author: Nico Kruber 
Date:   2017-06-16T08:51:04Z

[FLINK-7052][blob] remove (unused) NAME_ADDRESSABLE mode

commit 774bafa85f242110a2ce7907c1150f8c62d73b3f
Author: Nico Kruber 
Date:   2017-06-21T15:05:57Z

[FLINK-7052][blob] remove further unused code due to the NAME_ADDRESSABLE 
removal

commit 4da3b3f6269e43bf1c66621099528824cad9373f
Author: Nico Kruber 
Date:   2017-06-22T15:31:17Z

[FLINK-7053][blob] remove code duplication in BlobClientSslTest

This lets BlobClientSslTest extend BlobClientTest as most of its 
implementation
came from there and was simply copied.

commit aa9cdc820f9ca1a38a19708bf45a2099e42eaf48
Author: Nico Kruber 
Date:   2017-06-23T09:40:34Z

[FLINK-7053][blob] verify some of the buffers returned by GET

commit c9b693a46053b55b3939ff471184796f12d36a72
Author: Nico Kruber 
Date:   2017-06-23T10:04:10Z

[FLINK-7053][blob] use TemporaryFolder for local BLOB dir in unit tests

This replaces the use of some temporary directory where it is not guaranteed
that it will be deleted after the test.

commit 11db399d5103d9ffe9083c9b6029a7e81afa9abe
Author: Nico Kruber 
Date:   2017-06-21T12:45:31Z

[FLINK-7054][blob] remove LibraryCacheManager#getFile()

This was only used in tests where it is avoidable but if used anywhere 
else, it
may have caused cleanup issues.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-7053) improve code quality in some tests

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070228#comment-16070228
 ] 

ASF GitHub Bot commented on FLINK-7053:
---

GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/4234

[FLINK-7053] improve code quality in some tests

* `BlobClientTest` and `BlobClientSslTest` share a lot of common code
* the received buffers there are currently not verified for being equal to 
the expected one
* `TemporaryFolder` should be used throughout blob store tests
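
As a minimal sketch of the `TemporaryFolder` pattern from the last point 
(the test class and folder name are made up for illustration):

{code:java}
import java.io.File;

import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TemporaryFolder;

import static org.junit.Assert.assertTrue;

// JUnit creates the folder before each test and reliably deletes it
// afterwards, unlike a hand-rolled directory under java.io.tmpdir.
public class BlobStorageDirectoryTest {

    @Rule
    public final TemporaryFolder temporaryFolder = new TemporaryFolder();

    @Test
    public void blobDirectoryIsManagedByJUnit() throws Exception {
        File blobDir = temporaryFolder.newFolder("blob-storage");
        assertTrue(blobDir.isDirectory());
        // point the BlobServer/BlobCache storage directory at blobDir here
    }
}
{code}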

This PR is based upon #4158 in a series of PRs to implement FLINK-6916.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink flink-6916-7053

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4234.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4234


commit d54a316cfffd8243980df561fd4fcbd99934a40b
Author: Nico Kruber 
Date:   2016-12-20T15:49:57Z

[FLINK-6008][docs] minor improvements in the BlobService docs

commit b215515fa14d3f6af218e86b67bc2c27ae9d4f4f
Author: Nico Kruber 
Date:   2016-12-20T17:27:13Z

[FLINK-6008] refactor BlobCache#getURL() for cleaner code

commit bbcde52b3105fcf379c852b568f3893cc6052ce6
Author: Nico Kruber 
Date:   2016-12-21T15:23:29Z

[FLINK-6008] do not fail the BlobServer if delete fails

also extend the delete tests and remove one code duplication

commit dda1a12e40027724efb0e50005e5b57058a220f0
Author: Nico Kruber 
Date:   2017-01-06T17:42:58Z

[FLINK-6008][docs] update some config options to the new, non-deprecated 
ones

commit e12c2348b237207a50649d515a0fbbd19f92e6a0
Author: Nico Kruber 
Date:   2017-03-09T17:14:02Z

[FLINK-6008] use Preconditions.checkArgument in BlobClient

commit 24060e01332c6df9fd01f1dc5f321c3fda9301c1
Author: Nico Kruber 
Date:   2017-03-17T15:21:40Z

[FLINK-6008] fix concurrent job directory creation

also add according unit tests

commit 2e0d16ab8bf8a48a2d028602a3a7693fc4b76039
Author: Nico Kruber 
Date:   2017-06-14T16:01:47Z

[FLINK-6008] do not guard a delete() call with a check for existence

commit 7ba911d7ecb4861261dff8509996be0bd64d6d27
Author: Nico Kruber 
Date:   2017-04-18T14:37:37Z

[FLINK-6008] some comments about BlobLibraryCacheManager cleanup

commit d3f50d595f85356ae6ed0a85e1f8b8e8ac630bde
Author: Nico Kruber 
Date:   2017-04-19T13:39:03Z

[hotfix] minor typos

commit 79b6ce35a9e246b35415a388295f9ee2fc19a82e
Author: Nico Kruber 
Date:   2017-04-19T14:10:16Z

[FLINK-6008] further cleanup tests for BlobLibraryCacheManager

commit 23fb6ecd6c43c86d762503339c67953290236dca
Author: Nico Kruber 
Date:   2017-06-30T14:03:16Z

[FLINK-6008] address PR comments

commit 794764ceeed6b9bbbac08662f5754b218ff86c9c
Author: Nico Kruber 
Date:   2017-06-16T08:51:04Z

[FLINK-7052][blob] remove (unused) NAME_ADDRESSABLE mode

commit 774bafa85f242110a2ce7907c1150f8c62d73b3f
Author: Nico Kruber 
Date:   2017-06-21T15:05:57Z

[FLINK-7052][blob] remove further unused code due to the NAME_ADDRESSABLE 
removal

commit 4da3b3f6269e43bf1c66621099528824cad9373f
Author: Nico Kruber 
Date:   2017-06-22T15:31:17Z

[FLINK-7053][blob] remove code duplication in BlobClientSslTest

This lets BlobClientSslTest extend BlobClientTest as most of its 
implementation
came from there and was simply copied.

commit aa9cdc820f9ca1a38a19708bf45a2099e42eaf48
Author: Nico Kruber 
Date:   2017-06-23T09:40:34Z

[FLINK-7053][blob] verify some of the buffers returned by GET

commit c9b693a46053b55b3939ff471184796f12d36a72
Author: Nico Kruber 
Date:   2017-06-23T10:04:10Z

[FLINK-7053][blob] use TemporaryFolder for local BLOB dir in unit tests

This replaces the use of some temporary directory where it is not guaranteed
that it will be deleted after the test.




> improve code quality in some tests
> --
>
> Key: FLINK-7053
> URL: https://issues.apache.org/jira/browse/FLINK-7053
> Project: Flink
>  Issue Type: Sub-task
>  Components: Distributed Coordination, Network
>Affects Versions: 1.4.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>
> * {{BlobClientTest}} and {{BlobClientSslTest}} share a lot of common code
> * the received buffers 

[jira] [Updated] (FLINK-7053) improve code quality in some tests

2017-06-30 Thread Nico Kruber (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nico Kruber updated FLINK-7053:
---
Description: 
* {{BlobClientTest}} and {{BlobClientSslTest}} share a lot of common code
* the received buffers there are currently not verified for being equal to the 
expected one
* {{TemporaryFolder}} should be used throughout blob store tests

  was:
* {{BlobClientTest}} and {{BlobClientSslTest}} share a lot of common code
* the received buffers there are currently not verified for being equal to the 
expected one
* {{TemporarFolder}} should be used throughout blob store tests


> improve code quality in some tests
> --
>
> Key: FLINK-7053
> URL: https://issues.apache.org/jira/browse/FLINK-7053
> Project: Flink
>  Issue Type: Sub-task
>  Components: Distributed Coordination, Network
>Affects Versions: 1.4.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>
> * {{BlobClientTest}} and {{BlobClientSslTest}} share a lot of common code
> * the received buffers there are currently not verified for being equal to 
> the expected one
> * {{TemporaryFolder}} should be used throughout blob store tests



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4234: [FLINK-7053] improve code quality in some tests

2017-06-30 Thread NicoK
GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/4234

[FLINK-7053] improve code quality in some tests

* `BlobClientTest` and `BlobClientSslTest` share a lot of common code
* the received buffers there are currently not verified for being equal to 
the expected one
* `TemporaryFolder` should be used throughout blob store tests

This PR is based upon #4158 in a series of PRs to implement FLINK-6916.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink flink-6916-7053

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4234.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4234


commit d54a316cfffd8243980df561fd4fcbd99934a40b
Author: Nico Kruber 
Date:   2016-12-20T15:49:57Z

[FLINK-6008][docs] minor improvements in the BlobService docs

commit b215515fa14d3f6af218e86b67bc2c27ae9d4f4f
Author: Nico Kruber 
Date:   2016-12-20T17:27:13Z

[FLINK-6008] refactor BlobCache#getURL() for cleaner code

commit bbcde52b3105fcf379c852b568f3893cc6052ce6
Author: Nico Kruber 
Date:   2016-12-21T15:23:29Z

[FLINK-6008] do not fail the BlobServer if delete fails

also extend the delete tests and remove one code duplication

commit dda1a12e40027724efb0e50005e5b57058a220f0
Author: Nico Kruber 
Date:   2017-01-06T17:42:58Z

[FLINK-6008][docs] update some config options to the new, non-deprecated 
ones

commit e12c2348b237207a50649d515a0fbbd19f92e6a0
Author: Nico Kruber 
Date:   2017-03-09T17:14:02Z

[FLINK-6008] use Preconditions.checkArgument in BlobClient

commit 24060e01332c6df9fd01f1dc5f321c3fda9301c1
Author: Nico Kruber 
Date:   2017-03-17T15:21:40Z

[FLINK-6008] fix concurrent job directory creation

also add according unit tests

commit 2e0d16ab8bf8a48a2d028602a3a7693fc4b76039
Author: Nico Kruber 
Date:   2017-06-14T16:01:47Z

[FLINK-6008] do not guard a delete() call with a check for existence

commit 7ba911d7ecb4861261dff8509996be0bd64d6d27
Author: Nico Kruber 
Date:   2017-04-18T14:37:37Z

[FLINK-6008] some comments about BlobLibraryCacheManager cleanup

commit d3f50d595f85356ae6ed0a85e1f8b8e8ac630bde
Author: Nico Kruber 
Date:   2017-04-19T13:39:03Z

[hotfix] minor typos

commit 79b6ce35a9e246b35415a388295f9ee2fc19a82e
Author: Nico Kruber 
Date:   2017-04-19T14:10:16Z

[FLINK-6008] further cleanup tests for BlobLibraryCacheManager

commit 23fb6ecd6c43c86d762503339c67953290236dca
Author: Nico Kruber 
Date:   2017-06-30T14:03:16Z

[FLINK-6008] address PR comments

commit 794764ceeed6b9bbbac08662f5754b218ff86c9c
Author: Nico Kruber 
Date:   2017-06-16T08:51:04Z

[FLINK-7052][blob] remove (unused) NAME_ADDRESSABLE mode

commit 774bafa85f242110a2ce7907c1150f8c62d73b3f
Author: Nico Kruber 
Date:   2017-06-21T15:05:57Z

[FLINK-7052][blob] remove further unused code due to the NAME_ADDRESSABLE 
removal

commit 4da3b3f6269e43bf1c66621099528824cad9373f
Author: Nico Kruber 
Date:   2017-06-22T15:31:17Z

[FLINK-7053][blob] remove code duplication in BlobClientSslTest

This lets BlobClientSslTest extend BlobClientTest as most of its 
implementation
came from there and was simply copied.

commit aa9cdc820f9ca1a38a19708bf45a2099e42eaf48
Author: Nico Kruber 
Date:   2017-06-23T09:40:34Z

[FLINK-7053][blob] verify some of the buffers returned by GET

commit c9b693a46053b55b3939ff471184796f12d36a72
Author: Nico Kruber 
Date:   2017-06-23T10:04:10Z

[FLINK-7053][blob] use TemporaryFolder for local BLOB dir in unit tests

This replaces the use of some temporary directory where it is not guaranteed
that it will be deleted after the test.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-7057) move BLOB ref-counting from LibraryCacheManager to BlobCache

2017-06-30 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-7057:
--

 Summary: move BLOB ref-counting from LibraryCacheManager to 
BlobCache
 Key: FLINK-7057
 URL: https://issues.apache.org/jira/browse/FLINK-7057
 Project: Flink
  Issue Type: Sub-task
  Components: Distributed Coordination, Network
Affects Versions: 1.4.0
Reporter: Nico Kruber
Assignee: Nico Kruber


Currently, the {{LibraryCacheManager}} is doing some ref-counting for JAR files 
managed by it. Instead, we want the {{BlobCache}} to do that itself for all 
job-related BLOBs. Also, we do not want to operate on a per-{{BlobKey}} level 
but rather per job. Therefore, the cleanup process should be adapted, too.
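
A minimal sketch of such per-job reference counting follows; the class and 
method names are illustrative only, not the actual implementation:

{code:java}
import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.JobID;

// Counts references per job instead of per BlobKey; once the last
// reference is released, all BLOBs of the job can be swept at once.
public class JobBlobRefCounter {

    private final Map<JobID, Integer> refCounts = new HashMap<>();

    // called when a task or client starts using the job's BLOBs
    public synchronized void register(JobID jobId) {
        refCounts.merge(jobId, 1, Integer::sum);
    }

    // called on release; at zero, the job's BLOBs become eligible for cleanup
    public synchronized void release(JobID jobId) {
        Integer remaining = refCounts.computeIfPresent(jobId, (id, c) -> c - 1);
        if (remaining != null && remaining <= 0) {
            refCounts.remove(jobId);
            // delete all BLOBs stored for jobId here
        }
    }
}
{code}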



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-6916) FLIP-19: Improved BLOB storage architecture

2017-06-30 Thread Nico Kruber (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nico Kruber updated FLINK-6916:
---
Component/s: Distributed Coordination

> FLIP-19: Improved BLOB storage architecture
> ---
>
> Key: FLINK-6916
> URL: https://issues.apache.org/jira/browse/FLINK-6916
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Coordination, Network
>Affects Versions: 1.4.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>
> The current architecture around the BLOB server and cache components seems 
> rather patched up and has some issues regarding concurrency ([FLINK-6380]), 
> cleanup, API inconsistencies / currently unused API ([FLINK-6329], 
> [FLINK-6008]). These make future integration with FLIP-6 or extensions like 
> offloading oversized RPC messages ([FLINK-6046]) difficult. We therefore 
> propose an improvement on the current architecture as described below which 
> tackles these issues, provides some cleanup, and enables further BLOB server 
> use cases.
> Please refer to 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-19%3A+Improved+BLOB+storage+architecture
>  for a full overview on the proposed changes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7052) remove NAME_ADDRESSABLE mode

2017-06-30 Thread Nico Kruber (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nico Kruber updated FLINK-7052:
---
Component/s: Distributed Coordination

> remove NAME_ADDRESSABLE mode
> 
>
> Key: FLINK-7052
> URL: https://issues.apache.org/jira/browse/FLINK-7052
> Project: Flink
>  Issue Type: Sub-task
>  Components: Distributed Coordination, Network
>Affects Versions: 1.4.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>
> Remove the BLOB store's {{NAME_ADDRESSABLE}} mode as it is currently not used 
> and partly broken.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6789) Remove duplicated test utility reducer in optimizer

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070210#comment-16070210
 ] 

ASF GitHub Bot commented on FLINK-6789:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/4216
  
Thanks for review.


> Remove duplicated test utility reducer in optimizer
> ---
>
> Key: FLINK-6789
> URL: https://issues.apache.org/jira/browse/FLINK-6789
> Project: Flink
>  Issue Type: Improvement
>  Components: Optimizer, Tests
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: mingleizhang
>
> The {{DummyReducer}} and {{SelectOneReducer}} in 
> {{org.apache.flink.optimizer.testfunctions}} are identical; we could remove 
> one of them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4216: [FLINK-6789] [optimizer] Remove duplicated test utility r...

2017-06-30 Thread zhangminglei
Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/4216
  
Thanks for review.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-7056) add API to allow job-related BLOBs to be stored

2017-06-30 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-7056:
--

 Summary: add API to allow job-related BLOBs to be stored
 Key: FLINK-7056
 URL: https://issues.apache.org/jira/browse/FLINK-7056
 Project: Flink
  Issue Type: Sub-task
  Components: Distributed Coordination, Network
Affects Versions: 1.4.0
Reporter: Nico Kruber
Assignee: Nico Kruber


To ease cleanup, job-related BLOBs will be reflected in the blob storage so 
that they can be removed along with the job. This adds a jobId parameter to 
many methods, similar to the previous code from the {{NAME_ADDRESSABLE}} mode.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7055) refactor BlobService#getURL() methods to return a File object

2017-06-30 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-7055:
--

 Summary: refactor BlobService#getURL() methods to return a File 
object
 Key: FLINK-7055
 URL: https://issues.apache.org/jira/browse/FLINK-7055
 Project: Flink
  Issue Type: Sub-task
  Components: Distributed Coordination, Network
Affects Versions: 1.4.0
Reporter: Nico Kruber
Assignee: Nico Kruber


As a relic from its use by the {{UrlClassLoader}}, {{BlobService#getURL()}} 
methods always returned {{URL}} objects although they were always pointing to 
locally cached files. As a step towards a better architecture and API, these 
should return a File object instead.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7054) remove LibraryCacheManager#getFile()

2017-06-30 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-7054:
--

 Summary: remove LibraryCacheManager#getFile()
 Key: FLINK-7054
 URL: https://issues.apache.org/jira/browse/FLINK-7054
 Project: Flink
  Issue Type: Sub-task
  Components: Distributed Coordination, Network
Affects Versions: 1.4.0
Reporter: Nico Kruber
Assignee: Nico Kruber


{{LibraryCacheManager#getFile()}} was only used in tests, where it is 
avoidable; if used anywhere else, it may have caused cleanup issues.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7053) improve code quality in some tests

2017-06-30 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-7053:
--

 Summary: improve code quality in some tests
 Key: FLINK-7053
 URL: https://issues.apache.org/jira/browse/FLINK-7053
 Project: Flink
  Issue Type: Sub-task
  Components: Distributed Coordination, Network
Affects Versions: 1.4.0
Reporter: Nico Kruber
Assignee: Nico Kruber


* {{BlobClientTest}} and {{BlobClientSslTest}} share a lot of common code
* the received buffers there are currently not verified for being equal to the 
expected one
* {{TemporarFolder}} should be used throughout blob store tests



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4174: [FLINK-6916][blob] more code style and test improvements

2017-06-30 Thread NicoK
Github user NicoK commented on the issue:

https://github.com/apache/flink/pull/4174
  
let's split this up into two parts...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-6916) FLIP-19: Improved BLOB storage architecture

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070173#comment-16070173
 ] 

ASF GitHub Bot commented on FLINK-6916:
---

Github user NicoK commented on the issue:

https://github.com/apache/flink/pull/4174
  
let's split this up into two parts...


> FLIP-19: Improved BLOB storage architecture
> ---
>
> Key: FLINK-6916
> URL: https://issues.apache.org/jira/browse/FLINK-6916
> Project: Flink
>  Issue Type: Improvement
>  Components: Network
>Affects Versions: 1.4.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>
> The current architecture around the BLOB server and cache components seems 
> rather patched up and has some issues regarding concurrency ([FLINK-6380]), 
> cleanup, API inconsistencies / currently unused API ([FLINK-6329], 
> [FLINK-6008]). These make future integration with FLIP-6 or extensions like 
> offloading oversized RPC messages ([FLINK-6046]) difficult. We therefore 
> propose an improvement on the current architecture as described below which 
> tackles these issues, provides some cleanup, and enables further BLOB server 
> use cases.
> Please refer to 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-19%3A+Improved+BLOB+storage+architecture
>  for a full overview on the proposed changes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6916) FLIP-19: Improved BLOB storage architecture

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070174#comment-16070174
 ] 

ASF GitHub Bot commented on FLINK-6916:
---

Github user NicoK closed the pull request at:

https://github.com/apache/flink/pull/4174


> FLIP-19: Improved BLOB storage architecture
> ---
>
> Key: FLINK-6916
> URL: https://issues.apache.org/jira/browse/FLINK-6916
> Project: Flink
>  Issue Type: Improvement
>  Components: Network
>Affects Versions: 1.4.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>
> The current architecture around the BLOB server and cache components seems 
> rather patched up and has some issues regarding concurrency ([FLINK-6380]), 
> cleanup, API inconsistencies / currently unused API ([FLINK-6329], 
> [FLINK-6008]). These make future integration with FLIP-6 or extensions like 
> offloading oversized RPC messages ([FLINK-6046]) difficult. We therefore 
> propose an improvement on the current architecture as described below which 
> tackles these issues, provides some cleanup, and enables further BLOB server 
> use cases.
> Please refer to 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-19%3A+Improved+BLOB+storage+architecture
>  for a full overview on the proposed changes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4174: [FLINK-6916][blob] more code style and test improv...

2017-06-30 Thread NicoK
Github user NicoK closed the pull request at:

https://github.com/apache/flink/pull/4174


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-7047) Reorganize build profiles

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070164#comment-16070164
 ] 

ASF GitHub Bot commented on FLINK-7047:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4233
  
I would merge this PR in any case. It's a nice short-term option that is 
more stable than the previous approach; regardless of how many tests we add in 
flink-runtime, we can still test connectors/libs and vice versa.


> Reorganize build profiles
> -
>
> Key: FLINK-7047
> URL: https://issues.apache.org/jira/browse/FLINK-7047
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests, Travis
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.4.0
>
>
> With the current build times once again hitting the timeout it is time to 
> revisit our approach.
> The current approach of splitting all tests by name, while easy to maintain 
> or extend, has the big disadvantage that it's fairly binary in regards to the 
> timeout: either we're below the timeout and all builds pass, or we're above 
> and the entire merging process stalls. Furthermore, it requires all modules 
> to be compiled.
> I propose a different approach by which we bundle several modules, only 
> execute the tests of these modules and skip the compilation of some modules 
> that are not required for these tests.
> 5 groups are my current suggestion, which will result in 10 build profiles 
> total.
> The groups are:
> # *core* - core flink modules like core,runtime,streaming-java,metrics,rocksdb
> # *libraries* - flink-libraries and flink-storm
> # *connectors* - flink-connectors, flink-connector-wikiedits, 
> flink-tweet-inputformat
> # *tests* - flink-tests
> # *misc* - flink-yarn, fink-yarn-tests, flink-mesos, flink-examples, 
> flink-dist
> To not increase the total number of profiles to ridiculous numbers, I also 
> propose to only test against 2 combinations of jdk+hadoop+scala:
> # oraclejdk8 + hadoop 2.8.0 + scala 2.11
> # openjdk7 + hadoop 2.4.1 + scala 2.10
> My current estimate is that this will cause profiles to take at most 40 
> minutes.
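
For illustration, running just one of these module bundles with stock Maven 
options could look like the lines below ({{-pl}} and {{-am}} are standard 
Maven flags; the exact invocation is an assumption, not part of this issue):

$ mvn install -DskipTests -pl flink-tests -am  # build flink-tests plus required modules, without running their tests
$ mvn verify -pl flink-tests                   # then execute only the tests of the flink-tests group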



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4233: [FLINK-7047] [travis] Reorganize build profiles

2017-06-30 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4233
  
I would merge this PR in any case. It's a nice short-term option that is 
more stable than the previous approach; regardless of how many tests we add in 
flink-runtime, we can still test connectors/libs and vice versa.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-7052) remove NAME_ADDRESSABLE mode

2017-06-30 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-7052:
--

 Summary: remove NAME_ADDRESSABLE mode
 Key: FLINK-7052
 URL: https://issues.apache.org/jira/browse/FLINK-7052
 Project: Flink
  Issue Type: Sub-task
  Components: Network
Affects Versions: 1.4.0
Reporter: Nico Kruber
Assignee: Nico Kruber


Remove the BLOB store's {{NAME_ADDRESSABLE}} mode as it is currently not used 
and partly broken.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-6008) collection of BlobServer improvements

2017-06-30 Thread Nico Kruber (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nico Kruber updated FLINK-6008:
---
Description: 
The following things should be improved around the BlobServer/BlobCache:
* update config options with non-deprecated ones, e.g. 
{{high-availability.cluster-id}} and {{high-availability.storageDir}}
* -promote {{BlobStore#deleteAll(JobID)}} to the {{BlobService}}-
* -extend the {{BlobService}} to work with {{NAME_ADDRESSABLE}} blobs (prepares 
FLINK-4399)-
* -remove {{NAME_ADDRESSABLE}} blobs after job/task termination-
* do not fail the {{BlobServer}} when a delete operation fails
* code style, like using {{Preconditions.checkArgument}}

  was:
The following things should be improved around the BlobServer/BlobCache:
* update config options with non-deprecated ones, e.g. 
{{high-availability.cluster-id}} and {{high-availability.storageDir}}
* promote {{BlobStore#deleteAll(JobID)}} to the {{BlobService}}
* extend the {{BlobService}} to work with {{NAME_ADDRESSABLE}} blobs (prepares 
FLINK-4399)
* remove {{NAME_ADDRESSABLE}} blobs after job/task termination
* do not fail the {{BlobServer}} when a delete operation fails
* code style, like using {{Preconditions.checkArgument}}


> collection of BlobServer improvements
> -
>
> Key: FLINK-6008
> URL: https://issues.apache.org/jira/browse/FLINK-6008
> Project: Flink
>  Issue Type: Improvement
>  Components: Network
>Affects Versions: 1.3.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>
> The following things should be improved around the BlobServer/BlobCache:
> * update config options with non-deprecated ones, e.g. 
> {{high-availability.cluster-id}} and {{high-availability.storageDir}}
> * -promote {{BlobStore#deleteAll(JobID)}} to the {{BlobService}}-
> * -extend the {{BlobService}} to work with {{NAME_ADDRESSABLE}} blobs 
> (prepares FLINK-4399)-
> * -remove {{NAME_ADDRESSABLE}} blobs after job/task termination-
> * do not fail the {{BlobServer}} when a delete operation fails
> * code style, like using {{Preconditions.checkArgument}}
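
As a minimal illustration of the {{Preconditions.checkArgument}} style from 
the last item (the surrounding class and method are hypothetical, not actual 
BlobClient code):

{code:java}
import org.apache.flink.util.Preconditions;

// Hypothetical helper showing the argument-checking style.
public class RangeChecks {

    public static void checkRange(byte[] buffer, int offset, int length) {
        Preconditions.checkArgument(offset >= 0, "offset must not be negative");
        Preconditions.checkArgument(length >= 0, "length must not be negative");
        Preconditions.checkArgument(offset + length <= buffer.length,
            "offset + length must not exceed the buffer size");
    }
}
{code}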



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4233: [FLINK-7047] [travis] Reorganize build profiles

2017-06-30 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/4233
  
@zentol could you start a mailing list discussion and describe your 
preference for splitting the tests in this manner or splitting the repo? 
Despite heroic efforts by you and Robert, keeping the test times under the 
timeout continues to be a Sisyphean task. The number of tests never truly drops 
so the test times will only decrease on the off chance that TravisCI bumps the 
number of cores (or Google increases CPU performance, which they have never 
done).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-7047) Reorganize build profiles

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070104#comment-16070104
 ] 

ASF GitHub Bot commented on FLINK-7047:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/4233
  
@zentol could you start a mailing list discussion and describe your 
preference for splitting the tests in this manner or splitting the repo? 
Despite heroic efforts by you and Robert, keeping the test times under the 
timeout continues to be a Sisyphean task. The number of tests never truly drops 
so the test times will only decrease on the off chance that TravisCI bumps the 
number of cores (or Google increases CPU performance, which they have never 
done).


> Reorganize build profiles
> -
>
> Key: FLINK-7047
> URL: https://issues.apache.org/jira/browse/FLINK-7047
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests, Travis
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.4.0
>
>
> With the current build times once again hitting the timeout it is time to 
> revisit our approach.
> The current approach of splitting all tests by name, while easy to maintain 
> or extend, has the big disadvantage that it's fairly binary in regards to the 
> timeout: either we're below the timeout and all builds pass, or we're above 
> and the entire merging process stalls. Furthermore, it requires all modules 
> to be compiled.
> I propose a different approach by which we bundle several modules, only 
> execute the tests of these modules and skip the compilation of some modules 
> that are not required for these tests.
> 5 groups are my current suggestion, which will result in 10 build profiles 
> total.
> The groups are:
> # *core* - core flink modules like core,runtime,streaming-java,metrics,rocksdb
> # *libraries* - flink-libraries and flink-storm
> # *connectors* - flink-connectors, flink-connector-wikiedits, 
> flink-tweet-inputformat
> # *tests* - flink-tests
> # *misc* - flink-yarn, fink-yarn-tests, flink-mesos, flink-examples, 
> flink-dist
> To not increase the total number of profiles to ridiculous numbers, I also 
> propose to only test against 2 combinations of jdk+hadoop+scala:
> # oraclejdk8 + hadoop 2.8.0 + scala 2.11
> # openjdk7 + hadoop 2.4.1 + scala 2.10
> My current estimate is that this will cause profiles to take at most 40 
> minutes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7044) Add methods to the client API that take the stateDescriptor.

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070096#comment-16070096
 ] 

ASF GitHub Bot commented on FLINK-7044:
---

Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/4225#discussion_r125039522
  
--- Diff: 
flink-tests/src/test/java/org/apache/flink/test/query/AbstractQueryableStateITCase.java
 ---
@@ -534,33 +537,66 @@ public Integer getKey(Tuple2<Integer, Long> value) throws Exception {
 * Retry a query for state for keys between 0 and {@link #NUM_SLOTS} until
 * expected equals the value of the result tuple's second field.
 */
-   private void executeValueQuery(final Deadline deadline,
-   final QueryableStateClient client, final JobID jobId,
-   final QueryableStateStream<Integer, Tuple2<Integer, Long>> queryableState,
-   final long expected) throws Exception {
+   private void executeValueQuery(
--- End diff --

 


> Add methods to the client API that take the stateDescriptor.
> 
>
> Key: FLINK-7044
> URL: https://issues.apache.org/jira/browse/FLINK-7044
> Project: Flink
>  Issue Type: Improvement
>  Components: Queryable State
>Affects Versions: 1.3.0, 1.3.1
>Reporter: Kostas Kloudas
>Assignee: Kostas Kloudas
> Fix For: 1.4.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4225: [FLINK-7044] [queryable-st] Allow to specify names...

2017-06-30 Thread aljoscha
Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/4225#discussion_r125039522
  
--- Diff: 
flink-tests/src/test/java/org/apache/flink/test/query/AbstractQueryableStateITCase.java
 ---
@@ -534,33 +537,66 @@ public Integer getKey(Tuple2<Integer, Long> value) throws Exception {
 * Retry a query for state for keys between 0 and {@link #NUM_SLOTS} until
 * expected equals the value of the result tuple's second field.
 */
-   private void executeValueQuery(final Deadline deadline,
-   final QueryableStateClient client, final JobID jobId,
-   final QueryableStateStream<Integer, Tuple2<Integer, Long>> queryableState,
-   final long expected) throws Exception {
+   private void executeValueQuery(
--- End diff --

👍 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-7044) Add methods to the client API that take the stateDescriptor.

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070094#comment-16070094
 ] 

ASF GitHub Bot commented on FLINK-7044:
---

Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/4225#discussion_r125039482
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/state/VoidNamespaceTypeInfo.java
 ---
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.state;
+
+import org.apache.flink.annotation.Public;
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.api.common.ExecutionConfig;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+
+/**
+ * {@link TypeInformation} for {@link VoidNamespace}.
+ */
+@Public
+public class VoidNamespaceTypeInfo extends TypeInformation<VoidNamespace> {
+
+   private static final long serialVersionUID = 5453679706408610586L;
+
+   public static final VoidNamespaceTypeInfo INSTANCE = new VoidNamespaceTypeInfo();
+
+   @Override
+   @PublicEvolving
--- End diff --

sounds good!


> Add methods to the client API that take the stateDescriptor.
> 
>
> Key: FLINK-7044
> URL: https://issues.apache.org/jira/browse/FLINK-7044
> Project: Flink
>  Issue Type: Improvement
>  Components: Queryable State
>Affects Versions: 1.3.0, 1.3.1
>Reporter: Kostas Kloudas
>Assignee: Kostas Kloudas
> Fix For: 1.4.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4225: [FLINK-7044] [queryable-st] Allow to specify names...

2017-06-30 Thread aljoscha
Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/4225#discussion_r125039482
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/state/VoidNamespaceTypeInfo.java
 ---
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.state;
+
+import org.apache.flink.annotation.Public;
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.api.common.ExecutionConfig;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+
+/**
+ * {@link TypeInformation} for {@link VoidNamespace}.
+ */
+@Public
+public class VoidNamespaceTypeInfo extends TypeInformation<VoidNamespace> {
+
+   private static final long serialVersionUID = 5453679706408610586L;
+
+   public static final VoidNamespaceTypeInfo INSTANCE = new 
VoidNamespaceTypeInfo();
+
+   @Override
+   @PublicEvolving
--- End diff --

sounds good!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-7003) "select * from" in Flink SQL should not flatten all fields in the table by default

2017-06-30 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-7003:

Issue Type: Sub-task  (was: Bug)
Parent: FLINK-7051

> "select * from" in Flink SQL should not flatten all fields in the table by 
> default
> --
>
> Key: FLINK-7003
> URL: https://issues.apache.org/jira/browse/FLINK-7003
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Shuyi Chen
>
> Currently, CompositeRelDataType is extended from 
> RelRecordType(StructKind.PEEK_FIELDS, ...).  In Calcite, 
> StructKind.PEEK_FIELDS would allow us to peek fields for nested types. 
> However, when we use "select * from", Calcite will flatten all nested fields 
> that are marked as StructKind.PEEK_FIELDS in the table. 
> For example, if the table structure *T* is as follows:
> {code:java}
> VARCHAR K0,
> VARCHAR C1,
> RecordType:peek_no_flattening(INTEGER C0, INTEGER C1) F0,
> RecordType:peek_no_flattening(INTEGER C0, INTEGER C2) F1
> {code}
> The following query
> {code:java}
> Select * from T
> {code} should return (K0, C1, F0, F1) instead of (K0, C1, F0.C0, F0.C1, 
> F1.C0, F1.C2), which is the current behavior.
> After upgrading to Calcite 1.14, this issue should change the type of 
> {{CompositeRelDataType}} to {{StructKind.PEEK_FIELDS_NO_FLATTENING}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-5031) Consecutive DataStream.split() ignored

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070089#comment-16070089
 ] 

ASF GitHub Bot commented on FLINK-5031:
---

Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2847
  
I wrote a somewhat longer comment on the issue explaining some stuff: 
https://issues.apache.org/jira/browse/FLINK-5031?focusedCommentId=16070088=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16070088


> Consecutive DataStream.split() ignored
> --
>
> Key: FLINK-5031
> URL: https://issues.apache.org/jira/browse/FLINK-5031
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Fabian Hueske
>Assignee: Renkai Ge
>
> The output of the following program 
> {code}
> static final class ThresholdSelector implements OutputSelector<Long> {
>   long threshold;
>   public ThresholdSelector(long threshold) {
>   this.threshold = threshold;
>   }
>   @Override
>   public Iterable<String> select(Long value) {
>   if (value < threshold) {
>   return Collections.singletonList("Less");
>   } else {
>   return Collections.singletonList("GreaterEqual");
>   }
>   }
> }
> public static void main(String[] args) throws Exception {
>   StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
>   env.setParallelism(1);
>   SplitStream<Long> split1 = env.generateSequence(1, 11)
>   .split(new ThresholdSelector(6));
>   // stream11 should be [1,2,3,4,5]
>   DataStream<Long> stream11 = split1.select("Less");
>   SplitStream<Long> split2 = stream11
> //.map(new MapFunction<Long, Long>() {
> //@Override
> //public Long map(Long value) throws Exception {
> //return value;
> //}
> //})
>   .split(new ThresholdSelector(3));
>   DataStream<Long> stream21 = split2.select("Less");
>   // stream21 should be [1,2]
>   stream21.print();
>   env.execute();
> }
> {code}
> should be {{1, 2}}, however it is {{1, 2, 3, 4, 5}}. It seems that the second 
> {{split}} operation is ignored.
> The program is correctly evaluated if the identity {{MapFunction}} is added to 
> the program.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-5031) Consecutive DataStream.split() ignored

2017-06-30 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070088#comment-16070088
 ] 

Aljoscha Krettek commented on FLINK-5031:
-

Actually, the current behaviour seems a bit more complicated than "union". In 
this example (from [~RenkaiGe]'s PR):
{code}
StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(1);
env.setBufferTimeout(1);

DataStream<Long> ds = env.generateSequence(0, 11);

SplitStream<Long> consecutiveSplit = ds.split(new OutputSelector<Long>() {
@Override
public Iterable<String> select(Long value) {
List<String> s = new ArrayList<>();
if (value <= 5) {
s.add("Less");
} else {
s.add("GreaterEqual");
}
return s;
}
}).select("Less")
.split(new OutputSelector<Long>() {

@Override
public Iterable<String> select(Long value) {
List<String> s = new ArrayList<>();
if (value % 2 == 0) {
s.add("Even");
} else {
s.add("Odd");
}
return s;
}
});

consecutiveSplit.select("Even").addSink(smallEvenSink);
consecutiveSplit.select("Odd").addSink(smallOddSink);
env.execute();
{code}
the output with the current master is {{0, 2, 4, 6, 8, 10}}. It works like 
this: {{split()}} operations "attach" tags to elements, then only the last 
{{select()}} is taken into account when deciding what to forward. The reason 
why the example with "Less" seems to output the union is that they both attach 
the same tag name. In the "Less"/"Even" example you can see that only the 
"Even" elements are selected and not the "Less" ones. 

It would be easy to change the behaviour to a true union, i.e. {{0, 1, 2, 3, 4, 
5, 6, 8, 10}}; in fact I have a branch that does that: 
https://github.com/aljoscha/flink/tree/finish-pr-2847-split-select-intersection 
(The names/titles are misleading because I tried doing "intersection" but it's 
not quite possible currently).

Providing "intersection" behaviour without introducing an identity operator (as 
in [~RenkaiGe]'s PR) is not possible because of how the output is sent along 
from one operator to the next in a chain and I would be very hesitant about 
adding a dummy operator for this case.

To conclude, I would actually favour not fixing this at all and instead 
deprecate split/select because it is superseded by the strictly more powerful 
side outputs. What do you think [~fhueske] as the original reporter of the 
issue?
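For illustration, a minimal sketch of the side-output equivalent of the "Less"/"Even" example above (requires Flink 1.3+; the class and tag names here are made up for the example):

{code:java}
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class SideOutputSplitSketch {

	// Anonymous subclasses so the tags retain their element type.
	private static final OutputTag<Long> LESS_EVEN = new OutputTag<Long>("less-even") {};
	private static final OutputTag<Long> LESS_ODD = new OutputTag<Long>("less-odd") {};

	public static void main(String[] args) throws Exception {
		StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
		env.setParallelism(1);

		SingleOutputStreamOperator<Long> tagged = env.generateSequence(0, 11)
			.process(new ProcessFunction<Long, Long>() {
				@Override
				public void processElement(Long value, Context ctx, Collector<Long> out) {
					// Both conditions are evaluated in one operator, so the
					// "consecutive split" (intersection) semantics are unambiguous.
					if (value <= 5) {
						ctx.output(value % 2 == 0 ? LESS_EVEN : LESS_ODD, value);
					}
				}
			});

		tagged.getSideOutput(LESS_EVEN).print(); // 0, 2, 4
		tagged.getSideOutput(LESS_ODD).print();  // 1, 3, 5
		env.execute();
	}
}
{code}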

> Consecutive DataStream.split() ignored
> --
>
> Key: FLINK-5031
> URL: https://issues.apache.org/jira/browse/FLINK-5031
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Fabian Hueske
>Assignee: Renkai Ge
>
> The output of the following program 
> {code}
> static final class ThresholdSelector implements OutputSelector<Long> {
>   long threshold;
>   public ThresholdSelector(long threshold) {
>   this.threshold = threshold;
>   }
>   @Override
>   public Iterable<String> select(Long value) {
>   if (value < threshold) {
>   return Collections.singletonList("Less");
>   } else {
>   return Collections.singletonList("GreaterEqual");
>   }
>   }
> }
> public static void main(String[] args) throws Exception {
>   StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
>   env.setParallelism(1);
>   SplitStream<Long> split1 = env.generateSequence(1, 11)
>   .split(new ThresholdSelector(6));
>   // stream11 should be [1,2,3,4,5]
>   DataStream<Long> stream11 = split1.select("Less");
>   SplitStream<Long> split2 = stream11
> //.map(new MapFunction<Long, Long>() {
> //@Override
> //public Long map(Long value) throws Exception {
> //return value;
> //}
> //})
>   .split(new ThresholdSelector(3));
>   DataStream<Long> stream21 = split2.select("Less");
>   // stream21 should be [1,2]
>   stream21.print();
>   env.execute();
> }
> {code}
> should be {{1, 2}}, however it is {{1, 2, 3, 4, 5}}. It seems that the second 
> {{split}} operation is ignored.
> The program is correctly evaluated if the identity {{MapFunction}} is added to 
> the program.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #2847: [FLINK-5031]Consecutive DataStream.split() ignored

2017-06-30 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2847
  
I wrote a somewhat longer comment on the issue explaining some stuff: 
https://issues.apache.org/jira/browse/FLINK-5031?focusedCommentId=16070088=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16070088


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-7051) Bump up Calcite version to 1.14

2017-06-30 Thread Timo Walther (JIRA)
Timo Walther created FLINK-7051:
---

 Summary: Bump up Calcite version to 1.14
 Key: FLINK-7051
 URL: https://issues.apache.org/jira/browse/FLINK-7051
 Project: Flink
  Issue Type: New Feature
  Components: Table API & SQL
Reporter: Timo Walther


This is an umbrella issue for all tasks that need to be done once Apache 
Calcite 1.14 is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7003) "select * from" in Flink SQL should not flatten all fields in the table by default

2017-06-30 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-7003:

Description: 
Currently, CompositeRelDataType is extended from 
RelRecordType(StructKind.PEEK_FIELDS, ...).  In Calcite, StructKind.PEEK_FIELDS 
would allow us to peek fields for nested types. However, when we use "select * 
from", calcite will flatten all nested fields that is marked as 
StructKind.PEEK_FIELDS in the table. 

For example, if the table structure *T* is as follows:

{code:java}
VARCHAR K0,
VARCHAR C1,
RecordType:peek_no_flattening(INTEGER C0, INTEGER C1) F0,
RecordType:peek_no_flattening(INTEGER C0, INTEGER C2) F1

{code}
The following query
{code:java}
Select * from T
{code} should return (K0, C1, F0, F1) instead of (K0, C1, F0.C0, F0.C1, F1.C0, 
F1.C2), which is the current behavior.

After upgrading to Calcite 1.14, this issue should change the type of 
{{CompositeRelDataType}} to {{StructKind.PEEK_FIELDS_NO_FLATTENING}}.


  was:
Currently, CompositeRelDataType is extended from 
RelRecordType(StructKind.PEEK_FIELDS, ...).  In Calcite, StructKind.PEEK_FIELDS 
would allow us to peek fields for nested types. However, when we use "select * 
from", calcite will flatten all nested fields that is marked as 
StructKind.PEEK_FIELDS in the table. 

For example, if the table structure *T* is as follows:

{code:java}
VARCHAR K0,
VARCHAR C1,
RecordType:peek_no_flattening(INTEGER C0, INTEGER C1) F0,
RecordType:peek_no_flattening(INTEGER C0, INTEGER C2) F1

{code}
The following query
{code:java}
Select * from T
{code} should return (K0, C1, F0, F1) instead of (K0, C1, F0.C0, F0.C1, F1.C0, 
F1.C2), which is the current behavior.



> "select * from" in Flink SQL should not flatten all fields in the table by 
> default
> --
>
> Key: FLINK-7003
> URL: https://issues.apache.org/jira/browse/FLINK-7003
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Shuyi Chen
>
> Currently, CompositeRelDataType is extended from 
> RelRecordType(StructKind.PEEK_FIELDS, ...).  In Calcite, 
> StructKind.PEEK_FIELDS would allow us to peek fields for nested types. 
> However, when we use "select * from", Calcite will flatten all nested fields 
> that are marked as StructKind.PEEK_FIELDS in the table. 
> For example, if the table structure *T* is as follows:
> {code:java}
> VARCHAR K0,
> VARCHAR C1,
> RecordType:peek_no_flattening(INTEGER C0, INTEGER C1) F0,
> RecordType:peek_no_flattening(INTEGER C0, INTEGER C2) F1
> {code}
> The following query
> {code:java}
> Select * from T
> {code} should return (K0, C1, F0, F1) instead of (K0, C1, F0.C0, F0.C1, 
> F1.C0, F1.C2), which is the current behavior.
> After upgrading to Calcite 1.14, this issue should change the type of 
> {{CompositeRelDataType}} to {{StructKind.PEEK_FIELDS_NO_FLATTENING}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7048) Define javadoc skipping in travis watchdog script

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070084#comment-16070084
 ] 

ASF GitHub Bot commented on FLINK-7048:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/4227
  
+1


> Define javadoc skipping in travis watchdog script
> -
>
> Key: FLINK-7048
> URL: https://issues.apache.org/jira/browse/FLINK-7048
> Project: Flink
>  Issue Type: Improvement
>  Components: Travis
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Trivial
> Fix For: 1.4.0
>
>
> For all builds on travis we currently skip building the javadocs to save some 
> time. For this purpose we added {{-Dmaven.skip.javadocs=true}} to each build 
> profile in {{.travis.yml}}
> I would like to move the declaration of this into the travis watchdog script, 
> as it is obfuscating the profile view for travis builds.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4227: [FLINK-7048] [travis] Define javadoc skipping in travis w...

2017-06-30 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/4227
  
+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-7046) Hide logging about downloaded artifacts on travis

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070082#comment-16070082
 ] 

ASF GitHub Bot commented on FLINK-7046:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/4226
  
+1!


> Hide logging about downloaded artifacts on travis
> -
>
> Key: FLINK-7046
> URL: https://issues.apache.org/jira/browse/FLINK-7046
> Project: Flink
>  Issue Type: Improvement
>  Components: Travis
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Trivial
> Fix For: 1.4.0
>
>
> We can reduce the verbosity of the travis logs by hiding messages about 
> downloaded artifacts, such as this:
> {code}
> [INFO] Downloading: 
> https://repo.maven.apache.org/maven2/org/eclipse/tycho/tycho-compiler-jdt/0.21.0/tycho-compiler-jdt-0.21.0.pom
> [INFO] Downloaded: 
> https://repo.maven.apache.org/maven2/org/eclipse/tycho/tycho-compiler-jdt/0.21.0/tycho-compiler-jdt-0.21.0.pom
>  (2 KB at 62.0 KB/sec)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4226: [FLINK-7046] [travis] Hide download logging messages

2017-06-30 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/4226
  
+1!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-6617) Improve JAVA and SCALA logical plans consistent test

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070073#comment-16070073
 ] 

ASF GitHub Bot commented on FLINK-6617:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/3943
  
Thanks for the update @sunjincheng121. I will go over all changes and try 
to merge this. 


> Improve JAVA and SCALA logical plans consistent test
> 
>
> Key: FLINK-6617
> URL: https://issues.apache.org/jira/browse/FLINK-6617
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.3.0
>Reporter: sunjincheng
>Assignee: sunjincheng
>
> Currently, we need some `StringExpression` tests for all Java and Scala APIs, 
> such as `GroupAggregations`, `GroupWindowAggregation` (Session, Tumble), 
> `Calc`, etc.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #3943: [FLINK-6617][table] Improve JAVA and SCALA logical plans ...

2017-06-30 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/3943
  
Thanks for the update @sunjincheng121. I will go over all changes and try 
to merge this. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-7047) Reorganize build profiles

2017-06-30 Thread Greg Hogan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070072#comment-16070072
 ] 

Greg Hogan commented on FLINK-7047:
---

We can run the extra profiles in a daily TravisCI cron job 
(https://docs.travis-ci.com/user/cron-jobs/). This requires an INFRA ticket or, 
I believe, can be done by ourselves if/when using GitBox (we can test this after 
setting up TravisCI for flink-shaded).

> Reorganize build profiles
> -
>
> Key: FLINK-7047
> URL: https://issues.apache.org/jira/browse/FLINK-7047
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests, Travis
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.4.0
>
>
> With the current build times once again hitting the timeout it is time to 
> revisit our approach.
> The current approach of splitting all tests by name, while easy to maintain 
> or extend, has the big disadvantage that it's fairly binary in regards to the 
> timeout: either we're below the timeout and all builds pass, or we're above 
> and the entire merging process stalls. Furthermore, it requires all modules 
> to be compiled.
> I propose a different approach by which we bundle several modules, only 
> execute the tests of these modules and skip the compilation of some modules 
> that are not required for these tests.
> 5 groups are my current suggestion, which will result in 10 build profiles 
> total.
> The groups are:
> # *core* - core flink modules like core,runtime,streaming-java,metrics,rocksdb
> # *libraries* - flink-libraries and flink-storm
> # *connectors* - flink-connectors, flink-connector-wikiedits, 
> flink-tweet-inputformat
> # *tests* - flink-tests
> # *misc* - flink-yarn, fink-yarn-tests, flink-mesos, flink-examples, 
> flink-dist
> To not increase the total number of profiles to ridiculous numbers I also 
> propose to only test against 2 combinations of jdk+hadoop+scala:
> # oraclejdk8 + hadoop 2.8.0 + scala 2.11
> # openjdk7 + hadoop 2.4.1 + scala 2.10
> My current estimate is that this will cause profiles to take at most 40 
> minutes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-6429) Bump up Calcite version to 1.13

2017-06-30 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther reassigned FLINK-6429:
---

Assignee: Timo Walther

> Bump up Calcite version to 1.13
> ---
>
> Key: FLINK-6429
> URL: https://issues.apache.org/jira/browse/FLINK-6429
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> This is an umbrella issue for all tasks that need to be done once Apache 
> Calcite 1.13 is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6429) Bump up Calcite version to 1.13

2017-06-30 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070071#comment-16070071
 ] 

Timo Walther commented on FLINK-6429:
-

I will assign this issue to me and coordinate it with the auxiliary group 
functions issues (see FLINK-6584).

> Bump up Calcite version to 1.13
> ---
>
> Key: FLINK-6429
> URL: https://issues.apache.org/jira/browse/FLINK-6429
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>
> This is an umbrella issue for all tasks that need to be done once Apache 
> Calcite 1.13 is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7041) Deserialize StateBackend from JobCheckpointingSettings with user classloader

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070063#comment-16070063
 ] 

ASF GitHub Bot commented on FLINK-7041:
---

Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/4232#discussion_r125034182
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraphBuilder.java
 ---
@@ -223,19 +223,21 @@ public static ExecutionGraph buildGraph(
// if specified in the application, use from there, 
otherwise load from configuration
final StateBackend metadataBackend;
 
-   final StateBackend applicationConfiguredBackend = 
snapshotSettings.getDefaultStateBackend();
+   final SerializedValue<StateBackend> applicationConfiguredBackend = snapshotSettings.getDefaultStateBackend();
if (applicationConfiguredBackend != null) {
-   metadataBackend = applicationConfiguredBackend;
+   try {
+   metadataBackend = 
applicationConfiguredBackend.deserializeValue(classLoader);
+   } catch (IOException | ClassNotFoundException 
e) {
+   throw new JobExecutionException(jobId, 
"Could not instantiate configured state backend.", e);
+   }
 
log.info("Using application-defined state 
backend for checkpoint/savepoint metadata: {}.",
applicationConfiguredBackend);
--- End diff --

Good catch!
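As a side note, the mechanism in a minimal, self-contained sketch ({{SerializedValue}} and {{StateBackend}} are the real Flink types; the wrapper class and method names below are made up for the example):

{code:java}
import java.io.IOException;

import org.apache.flink.runtime.state.StateBackend;
import org.apache.flink.util.SerializedValue;

final class StateBackendShippingSketch {

	/** Client side: eagerly turn the (possibly user-defined) backend into plain bytes. */
	static SerializedValue<StateBackend> wrap(StateBackend backend) throws IOException {
		return new SerializedValue<>(backend);
	}

	/**
	 * JobManager side: deserialize lazily, with the user-code classloader --
	 * the only classloader that knows a custom backend class.
	 */
	static StateBackend unwrap(SerializedValue<StateBackend> bytes, ClassLoader userClassLoader)
			throws IOException, ClassNotFoundException {
		return bytes.deserializeValue(userClassLoader);
	}
}
{code}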


> Deserialize StateBackend from JobCheckpointingSettings with user classloader
> 
>
> Key: FLINK-7041
> URL: https://issues.apache.org/jira/browse/FLINK-7041
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API, Distributed Coordination, State 
> Backends, Checkpointing
>Affects Versions: 1.3.0, 1.3.1
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>Priority: Blocker
> Fix For: 1.3.2
>
>
> A user ran into the problem that the {{SubmitJob}} message is not 
> (de)serialisable if it contains custom RocksDB options (or a custom state 
> backend): [1]
> The problem is that {{SubmitJob}} contains a {{JobGraph}} which contains 
> {{JobCheckpointingSettings}} which contains a {{StateBackend}}. This 
> {{StateBackend}} potentially has user code and therefore can only be 
> deserialised with the user classloader.
> This issue is mostly identical to FLINK-6531.
> [1] 
> https://lists.apache.org/thread.html/69bb573787258ab34c3dc56ac155052f099d75e62d805f463bde5621@%3Cuser.flink.apache.org%3E



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4232: [FLINK-7041] Deserialize StateBackend from JobChec...

2017-06-30 Thread aljoscha
Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/4232#discussion_r125034182
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraphBuilder.java
 ---
@@ -223,19 +223,21 @@ public static ExecutionGraph buildGraph(
// if specified in the application, use from there, 
otherwise load from configuration
final StateBackend metadataBackend;
 
-   final StateBackend applicationConfiguredBackend = 
snapshotSettings.getDefaultStateBackend();
+   final SerializedValue<StateBackend> applicationConfiguredBackend = snapshotSettings.getDefaultStateBackend();
if (applicationConfiguredBackend != null) {
-   metadataBackend = applicationConfiguredBackend;
+   try {
+   metadataBackend = 
applicationConfiguredBackend.deserializeValue(classLoader);
+   } catch (IOException | ClassNotFoundException 
e) {
+   throw new JobExecutionException(jobId, 
"Could not instantiate configured state backend.", e);
+   }
 
log.info("Using application-defined state 
backend for checkpoint/savepoint metadata: {}.",
applicationConfiguredBackend);
--- End diff --

Good catch!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-6584) Support multiple consecutive windows in SQL

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070056#comment-16070056
 ] 

ASF GitHub Bot commented on FLINK-6584:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/4199
  
See https://issues.apache.org/jira/browse/CALCITE-1867


> Support multiple consecutive windows in SQL
> ---
>
> Key: FLINK-6584
> URL: https://issues.apache.org/jira/browse/FLINK-6584
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> Right now, the Table API supports multiple consecutive windows as follows:
> {code}
> val table = stream.toTable(tEnv, 'rowtime.rowtime, 'int, 'double, 'float, 
> 'bigdec, 'string)
> val t = table
>   .window(Tumble over 2.millis on 'rowtime as 'w)
>   .groupBy('w)
>   .select('w.rowtime as 'rowtime, 'int.count as 'int)
>   .window(Tumble over 4.millis on 'rowtime as 'w2)
>   .groupBy('w2)
>   .select('w2.rowtime, 'w2.end, 'int.count)
> {code}
> Similar behavior should be supported by the SQL API as well. We need to 
> introduce a new auxiliary group function, but this should happen in sync with 
> Apache Calcite.
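For comparison, a rough sketch of what the SQL counterpart could look like once such an auxiliary group function exists. The function name {{TUMBLE_ROWTIME}}, the table names, and the use of {{sql()}} on a 1.3-style {{StreamTableEnvironment}} are assumptions here, pending the Calcite work referenced above:

{code:java}
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;

final class ConsecutiveWindowsSqlSketch {

	// Mirrors the Table API example above: a 2 ms tumble, then a 4 ms tumble on top.
	static Table consecutiveTumbles(StreamTableEnvironment tEnv) {
		// The hypothetical TUMBLE_ROWTIME(...) would re-expose the first
		// window's rowtime so a second group window can consume it.
		Table first = tEnv.sql(
			"SELECT TUMBLE_ROWTIME(rowtime, INTERVAL '0.002' SECOND) AS rowtime, "
				+ "COUNT(`int`) AS cnt "
				+ "FROM MyTable "
				+ "GROUP BY TUMBLE(rowtime, INTERVAL '0.002' SECOND)");
		tEnv.registerTable("FirstWindow", first);
		return tEnv.sql(
			"SELECT TUMBLE_END(rowtime, INTERVAL '0.004' SECOND), COUNT(cnt) "
				+ "FROM FirstWindow "
				+ "GROUP BY TUMBLE(rowtime, INTERVAL '0.004' SECOND)");
	}
}
{code}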



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4199: [FLINK-6584] [table] Support multiple consecutive windows...

2017-06-30 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/4199
  
See https://issues.apache.org/jira/browse/CALCITE-1867


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-7047) Reorganize build profiles

2017-06-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070049#comment-16070049
 ] 

ASF GitHub Bot commented on FLINK-7047:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/4233

[FLINK-7047] [travis] Reorganize build profiles

Builds on-top of #4226 and #4227.

This PR reorganizes the test groups.

1. core - executes tests for core, runtime, streaming-java, scala, etc
2. libraries - executes tests for modules libraries and storm
3. connectors - executes tests for connectors and wikiedits
4. tests - executes tests for flink-tests and examples
5. misc - executes tests for yarn, mesos, fs-tests, dist

The main change (and worst part) is that every module (or its respective 
parent) now has a `run-tests` profile that is activated when the specific 
property for that group is set.

For example, tests for `flink-libraries` are activated if the 
`flink.test.lib` property is set.

For this to work it was necessary to **disable the test execution by 
default**. Execution of tests is now **strictly opt-in** for each module (or 
their parent).

To keep the number of build profiles <= 10 I've also reduced the 
combinations of jdk+scala+hadoop that we're testing to 2:
* oraclejdk8 + scala11 + hadoop2.8.0
* openjdk7 + scala10 + hadoop2.4.1

For the jdk7 profile max build times hover around 45 minutes for the core 
tests; for jdk8 the same build takes 40 minutes.

I tried to reduce this further by cutting out parts of the compilation, 
specifically the Scala modules, which easily add 6-8 minutes, but I couldn't 
find a way that works properly and is maintainable enough. Instead of fiddling 
with Maven it's probably easier just to split the entire repo and call it a day.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 7047

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4233.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4233


commit 94e2b5bf02c2f5e04ecfd7af18f063a773e603c3
Author: zentol 
Date:   2017-06-29T21:17:01Z

[FLINK-7048] [travis] Define javadoc skipping in travis watchdog script

commit 1d1cc7549c262985f7d9b98f079297d40210c73d
Author: zentol 
Date:   2017-06-29T19:31:23Z

[FLINK-7046] [travis] Hide download logging messages

commit e8cb22ff9bac40fe4df69206b390f9d7040ea150
Author: zentol 
Date:   2017-06-29T15:36:21Z

[FLINK-7047] [travis] Reorganize build profiles




> Reorganize build profiles
> -
>
> Key: FLINK-7047
> URL: https://issues.apache.org/jira/browse/FLINK-7047
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests, Travis
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.4.0
>
>
> With the current build times once again hitting the timeout it is time to 
> revisit our approach.
> The current approach of splitting all tests by name, while easy to maintain 
> or extend, has the big disadvantage that it's fairly binary in regards to the 
> timeout: either we're below the timeout and all builds pass, or we're above 
> and the entire merging process stalls. Furthermore, it requires all modules 
> to be compiled.
> I propose a different approach by which we bundle several modules, only 
> execute the tests of these modules and skip the compilation of some modules 
> that are not required for these tests.
> 5 groups are my current suggestion, which will result in 10 build profiles 
> total.
> The groups are:
> # *core* - core flink modules like core,runtime,streaming-java,metrics,rocksdb
> # *libraries* - flink-libraries and flink-storm
> # *connectors* - flink-connectors, flink-connector-wikiedits, 
> flink-tweet-inputformat
> # *tests* - flink-tests
> # *misc* - flink-yarn, fink-yarn-tests, flink-mesos, flink-examples, 
> flink-dist
> To not increase the total number of profiles to ridiculous numbers I also 
> propose to only test against 2 combinations of jdk+hadoop+scala:
> # oraclejdk8 + hadoop 2.8.0 + scala 2.11
> # openjdk7 + hadoop 2.4.1 + scala 2.10
> My current estimate is that this will cause profiles to take at most 40 
> minutes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

