[jira] [Commented] (FLINK-5303) Add CUBE/ROLLUP/GROUPING SETS operator in SQL

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734601#comment-15734601
 ] 

ASF GitHub Bot commented on FLINK-5303:
---

Github user chermenin commented on the issue:

https://github.com/apache/flink/pull/2965
  
PR closed to rename the branch.


> Add CUBE/ROLLUP/GROUPING SETS operator in SQL
> -
>
> Key: FLINK-5303
> URL: https://issues.apache.org/jira/browse/FLINK-5303
> Project: Flink
>  Issue Type: New Feature
>  Components: Documentation, Table API & SQL
>Reporter: Alexander Chermenin
>Assignee: Alexander Chermenin
>
> Add support for such operators as CUBE, ROLLUP and GROUPING SETS in SQL.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2965: [FLINK-5303] [table] Support for SQL GROUPING SETS clause...

2016-12-08 Thread chermenin
Github user chermenin commented on the issue:

https://github.com/apache/flink/pull/2965
  
PR closed to rename the branch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2965: [FLINK-5303] [table] Support for SQL GROUPING SETS...

2016-12-08 Thread chermenin
Github user chermenin closed the pull request at:

https://github.com/apache/flink/pull/2965


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5303) Add CUBE/ROLLUP/GROUPING SETS operator in SQL

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734598#comment-15734598
 ] 

ASF GitHub Bot commented on FLINK-5303:
---

GitHub user chermenin opened a pull request:

https://github.com/apache/flink/pull/2976

[FLINK-5303] [table] Support for SQL GROUPING SETS clause.

Support for the GROUPING SETS, ROLLUP and CUBE operators was added in this PR.
Some tests were also added to check the execution of SQL queries that use them.
This PR will close the following issue: https://issues.apache.org/jira/browse/FLINK-5303.
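
For readers who want a feel for the new clauses, here is a minimal, purely 
illustrative sketch (not code from this PR): it assumes a Java batch 
TableEnvironment {{tEnv}} of that era with a registered table 
{{sales(region, product, amount)}}; the table and column names are made up.

{code}
// GROUPING SETS lists the groupings explicitly: per (region, product),
// per (region), and the grand total ().
Table groupingSets = tEnv.sql(
    "SELECT region, product, SUM(amount) AS total FROM sales " +
    "GROUP BY GROUPING SETS ((region, product), (region), ())");

// ROLLUP (region, product) is shorthand for exactly those three grouping sets;
// CUBE (region, product) additionally includes the (product) grouping.
Table rollup = tEnv.sql(
    "SELECT region, product, SUM(amount) AS total FROM sales " +
    "GROUP BY ROLLUP (region, product)");
Table cube = tEnv.sql(
    "SELECT region, product, SUM(amount) AS total FROM sales " +
    "GROUP BY CUBE (region, product)");
{code}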

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chermenin/flink flink-5303

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2976.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2976


commit 51832104b5bb9aac06b6b86c98944a2d512e358c
Author: Aleksandr Chermenin 
Date:   2016-12-07T07:57:04Z

[FLINK-5303] Added GROUPING SETS implementation.

commit 9594a197148b77ffd4873d6fb77efafe01915c6e
Author: Aleksandr Chermenin 
Date:   2016-12-07T14:23:35Z

[FLINK-5303] Fixed grouping sets implementation.

commit a1aa9b2315974e63fee4f948b0e99580c49413ab
Author: Aleksandr Chermenin 
Date:   2016-12-07T14:35:46Z

[FLINK-5303] Small fixes.

commit c1170e2ce6111a77d31d29fbbad6b2c660d9e980
Author: Aleksandr Chermenin 
Date:   2016-12-08T07:46:09Z

[FLINK-5303] Some improvements.

commit 400c78d4b78fd092da0756177fa7e6dcfe7544b8
Author: Aleksandr Chermenin 
Date:   2016-12-08T09:32:53Z

[FLINK-5303] Added tests.

commit 8f30cbadca6a610ae9b4894065c4af38ec7ab12d
Author: Aleksandr Chermenin 
Date:   2016-12-08T09:34:35Z

[FLINK-5303] Test small fix.

commit eaa745bb907695bc70a0b61bc4e322ad617cd1b1
Author: Aleksandr Chermenin 
Date:   2016-12-08T11:34:19Z

[FLINK-5303] Grouping sets tests and fixes.

commit 543b2be72ec30f6fce2a25371cc0b8b95a49f832
Author: Aleksandr Chermenin 
Date:   2016-12-08T11:44:41Z

[FLINK-5303] Some cleanup.

commit 3976cea7ce3ad98381e6467d6cf1e02f1d19b103
Author: Aleksandr Chermenin 
Date:   2016-12-08T13:14:14Z

[FLINK-5303] Have supplemented documentation.

commit 92955c58fc464be34f3e3af0a83d38a6261edca3
Author: Aleksandr Chermenin 
Date:   2016-12-08T14:56:00Z

[FLINK-5303] Improved documentation.




> Add CUBE/ROLLUP/GROUPING SETS operator in SQL
> -
>
> Key: FLINK-5303
> URL: https://issues.apache.org/jira/browse/FLINK-5303
> Project: Flink
>  Issue Type: New Feature
>  Components: Documentation, Table API & SQL
>Reporter: Alexander Chermenin
>Assignee: Alexander Chermenin
>
> Add support for such operators as CUBE, ROLLUP and GROUPING SETS in SQL.





[GitHub] flink pull request #2976: [FLINK-5303] [table] Support for SQL GROUPING SETS...

2016-12-08 Thread chermenin
GitHub user chermenin opened a pull request:

https://github.com/apache/flink/pull/2976

[FLINK-5303] [table] Support for SQL GROUPING SETS clause.

Support for the GROUPING SETS, ROLLUP and CUBE operators was added in this PR.
Some tests were also added to check the execution of SQL queries that use them.
This PR will close the following issue: https://issues.apache.org/jira/browse/FLINK-5303.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chermenin/flink flink-5303

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2976.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2976


commit 51832104b5bb9aac06b6b86c98944a2d512e358c
Author: Aleksandr Chermenin 
Date:   2016-12-07T07:57:04Z

[FLINK-5303] Added GROUPING SETS implementation.

commit 9594a197148b77ffd4873d6fb77efafe01915c6e
Author: Aleksandr Chermenin 
Date:   2016-12-07T14:23:35Z

[FLINK-5303] Fixed grouping sets implementation.

commit a1aa9b2315974e63fee4f948b0e99580c49413ab
Author: Aleksandr Chermenin 
Date:   2016-12-07T14:35:46Z

[FLINK-5303] Small fixes.

commit c1170e2ce6111a77d31d29fbbad6b2c660d9e980
Author: Aleksandr Chermenin 
Date:   2016-12-08T07:46:09Z

[FLINK-5303] Some improvements.

commit 400c78d4b78fd092da0756177fa7e6dcfe7544b8
Author: Aleksandr Chermenin 
Date:   2016-12-08T09:32:53Z

[FLINK-5303] Added tests.

commit 8f30cbadca6a610ae9b4894065c4af38ec7ab12d
Author: Aleksandr Chermenin 
Date:   2016-12-08T09:34:35Z

[FLINK-5303] Test small fix.

commit eaa745bb907695bc70a0b61bc4e322ad617cd1b1
Author: Aleksandr Chermenin 
Date:   2016-12-08T11:34:19Z

[FLINK-5303] Grouping sets tests and fixes.

commit 543b2be72ec30f6fce2a25371cc0b8b95a49f832
Author: Aleksandr Chermenin 
Date:   2016-12-08T11:44:41Z

[FLINK-5303] Some cleanup.

commit 3976cea7ce3ad98381e6467d6cf1e02f1d19b103
Author: Aleksandr Chermenin 
Date:   2016-12-08T13:14:14Z

[FLINK-5303] Have supplemented documentation.

commit 92955c58fc464be34f3e3af0a83d38a6261edca3
Author: Aleksandr Chermenin 
Date:   2016-12-08T14:56:00Z

[FLINK-5303] Improved documentation.






[jira] [Commented] (FLINK-5303) Add CUBE/ROLLUP/GROUPING SETS operator in SQL

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734596#comment-15734596
 ] 

ASF GitHub Bot commented on FLINK-5303:
---

Github user chermenin closed the pull request at:

https://github.com/apache/flink/pull/2965


> Add CUBE/ROLLUP/GROUPING SETS operator in SQL
> -
>
> Key: FLINK-5303
> URL: https://issues.apache.org/jira/browse/FLINK-5303
> Project: Flink
>  Issue Type: New Feature
>  Components: Documentation, Table API & SQL
>Reporter: Alexander Chermenin
>Assignee: Alexander Chermenin
>
> Add support for such operators as CUBE, ROLLUP and GROUPING SETS in SQL.





[jira] [Commented] (FLINK-5220) Flink SQL projection pushdown

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734591#comment-15734591
 ] 

ASF GitHub Bot commented on FLINK-5220:
---

Github user tonycox commented on the issue:

https://github.com/apache/flink/pull/2923
  
Hi @fhueske, this PR looks good to me


> Flink SQL projection pushdown
> -
>
> Key: FLINK-5220
> URL: https://issues.apache.org/jira/browse/FLINK-5220
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: zhangjing
>Assignee: zhangjing
>
> This JIRA covers the projection pushdown optimization. Please refer to the 
> design document for more details.





[GitHub] flink issue #2923: [FLINK-5220] [Table API & SQL] Flink SQL projection pushd...

2016-12-08 Thread tonycox
Github user tonycox commented on the issue:

https://github.com/apache/flink/pull/2923
  
Hi @fhueske, this PR looks good to me




[jira] [Created] (FLINK-5303) Add CUBE/ROLLUP/GROUPING SETS operator in SQL

2016-12-08 Thread Alexander Chermenin (JIRA)
Alexander Chermenin created FLINK-5303:
--

 Summary: Add CUBE/ROLLUP/GROUPING SETS operator in SQL
 Key: FLINK-5303
 URL: https://issues.apache.org/jira/browse/FLINK-5303
 Project: Flink
  Issue Type: New Feature
  Components: Documentation, Table API & SQL
Reporter: Alexander Chermenin
Assignee: Alexander Chermenin


Add support for such operators as CUBE, ROLLUP and GROUPING SETS in SQL.





[jira] [Assigned] (FLINK-5188) Adjust all the imports referencing Row

2016-12-08 Thread Anton Solovev (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Solovev reassigned FLINK-5188:


Assignee: Anton Solovev

> Adjust all the imports referencing Row
> --
>
> Key: FLINK-5188
> URL: https://issues.apache.org/jira/browse/FLINK-5188
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core, Table API & SQL
>Reporter: Anton Solovev
>Assignee: Anton Solovev
>






[jira] [Commented] (FLINK-2821) Change Akka configuration to allow accessing actors from different URLs

2016-12-08 Thread Philipp von dem Bussche (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734537#comment-15734537
 ] 

Philipp von dem Bussche commented on FLINK-2821:


Thanks [~mxm], this is working for me now. I tested again and am purely using 
Rancher DNS names now, and I can also deploy my jobs from my build service. 
There are two things I still had to do: a) also update the Flink version on my 
build slave, so it seems the client got some changes as well, and b) exactly 
match the hostname when connecting, e.g. from the client. In Rancher a host can 
be reached by hostname.stackname; in my case the stackname was in camel case, 
so I also had to use camel case to connect from the client to the jobmanager, 
because otherwise the jobmanager would refuse the connection again.
Anyway, this looks quite good now and I will test further with it. Thanks 
again for the help!

> Change Akka configuration to allow accessing actors from different URLs
> ---
>
> Key: FLINK-2821
> URL: https://issues.apache.org/jira/browse/FLINK-2821
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Reporter: Robert Metzger
>Assignee: Maximilian Michels
>
> Akka expects the actor's URL to be exactly matching.
> As pointed out here, cases where users were complaining about this: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Error-trying-to-access-JM-through-proxy-td3018.html
>   - Proxy routing (as described here, send to the proxy URL, receiver 
> recognizes only original URL)
>   - Using hostname / IP interchangeably does not work (we solved this by 
> always putting IP addresses into URLs, never hostnames)
>   - Binding to multiple interfaces (any local 0.0.0.0) does not work. Still 
> no solution to that (but seems not too much of a restriction)
> I am aware that this is not possible due to Akka, so it is actually not a 
> Flink bug. But I think we should track the resolution of the issue here 
> anyway because it affects our users' satisfaction.





[jira] [Commented] (FLINK-5299) DataStream support for arrays as keys

2016-12-08 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734216#comment-15734216
 ] 

Aljoscha Krettek commented on FLINK-5299:
-

A {{KeySelector}} that calls {{Arrays.hashCode(values)}} will not work because 
the {{KeySelector}} usually does not calculate the hash but just returns the 
key. I think we would need to wrap the array into a special type that has a 
{{hashCode()}} method that returns the right value, so that when Flink calls 
{{hashCode()}} we get the correct array hash.
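
A rough, purely illustrative sketch of that idea (not code from Flink; the 
event type and accessor in the usage comment are made up):

{code}
// Hypothetical wrapper that gives an int[] content-based hashCode()/equals(),
// so keying on the wrapper behaves like keying on the array's contents.
public final class IntArrayKey implements java.io.Serializable {

    private final int[] values;

    public IntArrayKey(int[] values) {
        this.values = values;
    }

    @Override
    public int hashCode() {
        // Content-based hash instead of the array object's identity hash.
        return java.util.Arrays.hashCode(values);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof IntArrayKey
            && java.util.Arrays.equals(values, ((IntArrayKey) o).values);
    }
}

// The KeySelector then only wraps and returns the key; Flink calls hashCode()
// on the returned wrapper, e.g.:
// stream.keyBy(event -> new IntArrayKey(event.getKeyArray()));
{code}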

> DataStream support for arrays as keys
> -
>
> Key: FLINK-5299
> URL: https://issues.apache.org/jira/browse/FLINK-5299
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Affects Versions: 1.2.0
>Reporter: Chesnay Schepler
>  Labels: star
>
> It is currently not possible to use an array as a key in the DataStream API, 
> as it relies on hash codes, which aren't stable for arrays.
> One way to implement this would be to check for the key type and inject a 
> KeySelector that calls "Arrays.hashCode(values)".





[jira] [Commented] (FLINK-5266) Eagerly project unused fields when selecting aggregation fields

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734203#comment-15734203
 ] 

ASF GitHub Bot commented on FLINK-5266:
---

Github user KurtYoung commented on the issue:

https://github.com/apache/flink/pull/2961
  
rebased to latest master


> Eagerly project unused fields when selecting aggregation fields
> ---
>
> Key: FLINK-5266
> URL: https://issues.apache.org/jira/browse/FLINK-5266
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Kurt Young
>Assignee: Kurt Young
>
> When we call a table's {{select}} method and it contains some aggregations, 
> we currently project fields after the aggregation. It would be better to project 
> unused fields before the aggregation, which furthermore leaves the 
> opportunity to push the projection into the scan.
> For example, the current logical plan of a simple query:
> {code}
> table.select('a.sum as 's, 'a.max)
> {code}
> is
> {code}
> LogicalProject(s=[$0], TMP_2=[$1])
>   LogicalAggregate(group=[{}], TMP_0=[SUM($5)], TMP_1=[MAX($5)])
> LogicalTableScan(table=[[supplier]])
> {code}
> It would be better if we could project unused fields right after the scan, so 
> that the plan looks like this:
> {code}
> LogicalProject(s=[$0], EXPR$1=[$0])
>   LogicalAggregate(group=[{}], EXPR$1=[SUM($0)])
> LogicalProject(a=[$5])
>   LogicalTableScan(table=[[supplier]])
> {code}





[GitHub] flink issue #2961: [FLINK-5266] [table] eagerly project unused fields when s...

2016-12-08 Thread KurtYoung
Github user KurtYoung commented on the issue:

https://github.com/apache/flink/pull/2961
  
rebased to latest master




[jira] [Commented] (FLINK-5187) Create analog of Row in core

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734180#comment-15734180
 ] 

ASF GitHub Bot commented on FLINK-5187:
---

Github user wuchong commented on the issue:

https://github.com/apache/flink/pull/2968
  
Hi @tonycox, thanks for your review. I addressed some of your comments.


> Create analog of Row in core
> 
>
> Key: FLINK-5187
> URL: https://issues.apache.org/jira/browse/FLINK-5187
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core, Table API & SQL
>Reporter: Anton Solovev
>Assignee: Jark Wu
>






[GitHub] flink issue #2968: [FLINK-5187] [core] Create analog of Row and RowTypeInfo ...

2016-12-08 Thread wuchong
Github user wuchong commented on the issue:

https://github.com/apache/flink/pull/2968
  
Hi @tonycox, thanks for your review. I addressed some of your comments.




[jira] [Commented] (FLINK-5187) Create analog of Row in core

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734112#comment-15734112
 ] 

ASF GitHub Bot commented on FLINK-5187:
---

Github user wuchong commented on a diff in the pull request:

https://github.com/apache/flink/pull/2968#discussion_r91652223
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/RowComparator.java
 ---
@@ -0,0 +1,698 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.java.typeutils.runtime;
+
+import org.apache.flink.api.common.typeutils.CompositeTypeComparator;
+import org.apache.flink.api.common.typeutils.TypeComparator;
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+import org.apache.flink.api.java.tuple.Tuple4;
+import org.apache.flink.core.memory.DataInputView;
+import org.apache.flink.core.memory.DataOutputView;
+import org.apache.flink.core.memory.MemorySegment;
+import org.apache.flink.types.KeyFieldOutOfBoundsException;
+import org.apache.flink.types.Row;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+
+import static 
org.apache.flink.api.java.typeutils.runtime.NullMaskUtils.readIntoNullMask;
+import static org.apache.flink.util.Preconditions.checkArgument;
+
+/**
+ * Comparator for {@link Row}
+ */
+public class RowComparator extends CompositeTypeComparator {
+
+   private static final long serialVersionUID = 1L;
+   /** The number of fields of the Row */
+   private final int arity;
+   /** key positions describe which fields are keys in what order */
+   private final int[] keyPositions;
+   /** null-aware comparators for the key fields, in the same order as the 
key fields */
+   private final NullAwareComparator[] comparators;
+   /** serializers to deserialize the first n fields for comparison */
+   private final TypeSerializer[] serializers;
+   /** auxiliary fields for normalized key support */
+   private final int[] normalizedKeyLengths;
+   private final int numLeadingNormalizableKeys;
+   private final int normalizableKeyPrefixLen;
+   private final boolean invertNormKey;
+
+   // null masks for serialized comparison
+   private final boolean[] nullMask1;
+   private final boolean[] nullMask2;
+
+   // cache for the deserialized key field objects
+   transient private final Object[] deserializedKeyFields1;
+   transient private final Object[] deserializedKeyFields2;
+
+   /**
+* General constructor for RowComparator.
+*
+* @param aritythe number of fields of the Row
+* @param keyPositions key positions describe which fields are keys in 
what order
+* @param comparators  non-null-aware comparators for the key fields, 
in the same order as
+* the key fields
+* @param serializers  serializers to deserialize the first n fields 
for comparison
+* @param orders   sorting orders for the fields
+*/
+   public RowComparator(
+   int arity,
+   int[] keyPositions,
+   TypeComparator[] comparators,
+   TypeSerializer[] serializers,
+   boolean[] orders) {
+   this(arity, keyPositions, makeNullAware(comparators, orders), 
serializers);
+   }
+
+
+   /**
+* Intermediate constructor for creating auxiliary fields.
+*/
+   private RowComparator(
+   int arity,
+   int[] keyPositions,
+   NullAwareComparator[] comparators,
+   TypeSerializer[] serializers) {
+   this(
+   arity,
+   keyPositions,
+   comparators,
+   serializers,
+   createAuxiliaryFields(keyPositions, comparators));
+   }
+
+   /**
+* 

[jira] [Commented] (FLINK-5187) Create analog of Row in core

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734110#comment-15734110
 ] 

ASF GitHub Bot commented on FLINK-5187:
---

Github user wuchong commented on a diff in the pull request:

https://github.com/apache/flink/pull/2968#discussion_r91652178
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/java/typeutils/RowTypeInfo.java 
---
@@ -0,0 +1,176 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.java.typeutils;
+
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.api.common.ExecutionConfig;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.common.typeutils.TypeComparator;
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+import org.apache.flink.api.java.typeutils.runtime.RowComparator;
+import org.apache.flink.api.java.typeutils.runtime.RowSerializer;
+import org.apache.flink.types.Row;
+
+import java.util.ArrayList;
+import java.util.Collections;
+
+import static org.apache.flink.util.Preconditions.checkState;
+
+/**
+ * TypeInformation for {@link Row}
+ */
+@PublicEvolving
+public class RowTypeInfo extends TupleTypeInfoBase {
+
+   private static final long serialVersionUID = 9158518989896601963L;
+
+   protected final String[] fieldNames;
+   /** Temporary variable for directly passing orders to comparators. */
+   private boolean[] comparatorOrders = null;
+
+   public RowTypeInfo(TypeInformation... types) {
+   super(Row.class, types);
+
+   this.fieldNames = new String[types.length];
+
+   for (int i = 0; i < types.length; i++) {
+   fieldNames[i] = "f" + i;
--- End diff --

I think `RowTypeInfo` is something like `TupleTypeInfo`, which can't have 
custom field names.
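
A small illustration of that behaviour, based on the constructor shown in the 
diff above (a sketch, not a claim about the final API):

{code}
// Field names default to "f0", "f1", ... just like TupleTypeInfo;
// the caller cannot pass custom names.
RowTypeInfo rowType = new RowTypeInfo(
    BasicTypeInfo.STRING_TYPE_INFO,   // becomes field "f0"
    BasicTypeInfo.INT_TYPE_INFO);     // becomes field "f1"
{code}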


> Create analog of Row in core
> 
>
> Key: FLINK-5187
> URL: https://issues.apache.org/jira/browse/FLINK-5187
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core, Table API & SQL
>Reporter: Anton Solovev
>Assignee: Jark Wu
>






[GitHub] flink pull request #2968: [FLINK-5187] [core] Create analog of Row and RowTy...

2016-12-08 Thread wuchong
Github user wuchong commented on a diff in the pull request:

https://github.com/apache/flink/pull/2968#discussion_r91652189
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/RowComparator.java
 ---
@@ -0,0 +1,698 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.java.typeutils.runtime;
+
+import org.apache.flink.api.common.typeutils.CompositeTypeComparator;
+import org.apache.flink.api.common.typeutils.TypeComparator;
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+import org.apache.flink.api.java.tuple.Tuple4;
+import org.apache.flink.core.memory.DataInputView;
+import org.apache.flink.core.memory.DataOutputView;
+import org.apache.flink.core.memory.MemorySegment;
+import org.apache.flink.types.KeyFieldOutOfBoundsException;
+import org.apache.flink.types.Row;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+
+import static 
org.apache.flink.api.java.typeutils.runtime.NullMaskUtils.readIntoNullMask;
+import static org.apache.flink.util.Preconditions.checkArgument;
+
+/**
+ * Comparator for {@link Row}
+ */
+public class RowComparator extends CompositeTypeComparator {
+
+   private static final long serialVersionUID = 1L;
+   /** The number of fields of the Row */
+   private final int arity;
+   /** key positions describe which fields are keys in what order */
+   private final int[] keyPositions;
+   /** null-aware comparators for the key fields, in the same order as the 
key fields */
+   private final NullAwareComparator[] comparators;
+   /** serializers to deserialize the first n fields for comparison */
+   private final TypeSerializer[] serializers;
+   /** auxiliary fields for normalized key support */
+   private final int[] normalizedKeyLengths;
+   private final int numLeadingNormalizableKeys;
+   private final int normalizableKeyPrefixLen;
+   private final boolean invertNormKey;
+
+   // null masks for serialized comparison
+   private final boolean[] nullMask1;
+   private final boolean[] nullMask2;
+
+   // cache for the deserialized key field objects
+   transient private final Object[] deserializedKeyFields1;
+   transient private final Object[] deserializedKeyFields2;
+
+   /**
+* General constructor for RowComparator.
+*
+* @param aritythe number of fields of the Row
+* @param keyPositions key positions describe which fields are keys in 
what order
+* @param comparators  non-null-aware comparators for the key fields, 
in the same order as
+* the key fields
+* @param serializers  serializers to deserialize the first n fields 
for comparison
+* @param orders   sorting orders for the fields
+*/
+   public RowComparator(
+   int arity,
+   int[] keyPositions,
+   TypeComparator[] comparators,
+   TypeSerializer[] serializers,
+   boolean[] orders) {
+   this(arity, keyPositions, makeNullAware(comparators, orders), 
serializers);
+   }
+
+
+   /**
+* Intermediate constructor for creating auxiliary fields.
+*/
+   private RowComparator(
+   int arity,
+   int[] keyPositions,
+   NullAwareComparator[] comparators,
+   TypeSerializer[] serializers) {
+   this(
+   arity,
+   keyPositions,
+   comparators,
+   serializers,
+   createAuxiliaryFields(keyPositions, comparators));
+   }
+
+   /**
+* Intermediate constructor for creating auxiliary fields.
+*/
+   private RowComparator(
+   int arity,
+   int[] keyPositions,
+   NullAwareComparator[] comparators,
+   

[jira] [Commented] (FLINK-5187) Create analog of Row in core

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734111#comment-15734111
 ] 

ASF GitHub Bot commented on FLINK-5187:
---

Github user wuchong commented on a diff in the pull request:

https://github.com/apache/flink/pull/2968#discussion_r91652189
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/RowComparator.java
 ---
@@ -0,0 +1,698 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.java.typeutils.runtime;
+
+import org.apache.flink.api.common.typeutils.CompositeTypeComparator;
+import org.apache.flink.api.common.typeutils.TypeComparator;
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+import org.apache.flink.api.java.tuple.Tuple4;
+import org.apache.flink.core.memory.DataInputView;
+import org.apache.flink.core.memory.DataOutputView;
+import org.apache.flink.core.memory.MemorySegment;
+import org.apache.flink.types.KeyFieldOutOfBoundsException;
+import org.apache.flink.types.Row;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+
+import static 
org.apache.flink.api.java.typeutils.runtime.NullMaskUtils.readIntoNullMask;
+import static org.apache.flink.util.Preconditions.checkArgument;
+
+/**
+ * Comparator for {@link Row}
+ */
+public class RowComparator extends CompositeTypeComparator {
+
+   private static final long serialVersionUID = 1L;
+   /** The number of fields of the Row */
+   private final int arity;
+   /** key positions describe which fields are keys in what order */
+   private final int[] keyPositions;
+   /** null-aware comparators for the key fields, in the same order as the 
key fields */
+   private final NullAwareComparator[] comparators;
+   /** serializers to deserialize the first n fields for comparison */
+   private final TypeSerializer[] serializers;
+   /** auxiliary fields for normalized key support */
+   private final int[] normalizedKeyLengths;
+   private final int numLeadingNormalizableKeys;
+   private final int normalizableKeyPrefixLen;
+   private final boolean invertNormKey;
+
+   // null masks for serialized comparison
+   private final boolean[] nullMask1;
+   private final boolean[] nullMask2;
+
+   // cache for the deserialized key field objects
+   transient private final Object[] deserializedKeyFields1;
+   transient private final Object[] deserializedKeyFields2;
+
+   /**
+* General constructor for RowComparator.
+*
+* @param aritythe number of fields of the Row
+* @param keyPositions key positions describe which fields are keys in 
what order
+* @param comparators  non-null-aware comparators for the key fields, 
in the same order as
+* the key fields
+* @param serializers  serializers to deserialize the first n fields 
for comparison
+* @param orders   sorting orders for the fields
+*/
+   public RowComparator(
+   int arity,
+   int[] keyPositions,
+   TypeComparator[] comparators,
+   TypeSerializer[] serializers,
+   boolean[] orders) {
+   this(arity, keyPositions, makeNullAware(comparators, orders), 
serializers);
+   }
+
+
+   /**
+* Intermediate constructor for creating auxiliary fields.
+*/
+   private RowComparator(
+   int arity,
+   int[] keyPositions,
+   NullAwareComparator[] comparators,
+   TypeSerializer[] serializers) {
+   this(
+   arity,
+   keyPositions,
+   comparators,
+   serializers,
+   createAuxiliaryFields(keyPositions, comparators));
+   }
+
+   /**
+* 

[GitHub] flink pull request #2968: [FLINK-5187] [core] Create analog of Row and RowTy...

2016-12-08 Thread wuchong
Github user wuchong commented on a diff in the pull request:

https://github.com/apache/flink/pull/2968#discussion_r91652223
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/RowComparator.java
 ---
@@ -0,0 +1,698 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.java.typeutils.runtime;
+
+import org.apache.flink.api.common.typeutils.CompositeTypeComparator;
+import org.apache.flink.api.common.typeutils.TypeComparator;
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+import org.apache.flink.api.java.tuple.Tuple4;
+import org.apache.flink.core.memory.DataInputView;
+import org.apache.flink.core.memory.DataOutputView;
+import org.apache.flink.core.memory.MemorySegment;
+import org.apache.flink.types.KeyFieldOutOfBoundsException;
+import org.apache.flink.types.Row;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+
+import static 
org.apache.flink.api.java.typeutils.runtime.NullMaskUtils.readIntoNullMask;
+import static org.apache.flink.util.Preconditions.checkArgument;
+
+/**
+ * Comparator for {@link Row}
+ */
+public class RowComparator extends CompositeTypeComparator {
+
+   private static final long serialVersionUID = 1L;
+   /** The number of fields of the Row */
+   private final int arity;
+   /** key positions describe which fields are keys in what order */
+   private final int[] keyPositions;
+   /** null-aware comparators for the key fields, in the same order as the 
key fields */
+   private final NullAwareComparator[] comparators;
+   /** serializers to deserialize the first n fields for comparison */
+   private final TypeSerializer[] serializers;
+   /** auxiliary fields for normalized key support */
+   private final int[] normalizedKeyLengths;
+   private final int numLeadingNormalizableKeys;
+   private final int normalizableKeyPrefixLen;
+   private final boolean invertNormKey;
+
+   // null masks for serialized comparison
+   private final boolean[] nullMask1;
+   private final boolean[] nullMask2;
+
+   // cache for the deserialized key field objects
+   transient private final Object[] deserializedKeyFields1;
+   transient private final Object[] deserializedKeyFields2;
+
+   /**
+* General constructor for RowComparator.
+*
+* @param aritythe number of fields of the Row
+* @param keyPositions key positions describe which fields are keys in 
what order
+* @param comparators  non-null-aware comparators for the key fields, 
in the same order as
+* the key fields
+* @param serializers  serializers to deserialize the first n fields 
for comparison
+* @param orders   sorting orders for the fields
+*/
+   public RowComparator(
+   int arity,
+   int[] keyPositions,
+   TypeComparator[] comparators,
+   TypeSerializer[] serializers,
+   boolean[] orders) {
+   this(arity, keyPositions, makeNullAware(comparators, orders), 
serializers);
+   }
+
+
+   /**
+* Intermediate constructor for creating auxiliary fields.
+*/
+   private RowComparator(
+   int arity,
+   int[] keyPositions,
+   NullAwareComparator[] comparators,
+   TypeSerializer[] serializers) {
+   this(
+   arity,
+   keyPositions,
+   comparators,
+   serializers,
+   createAuxiliaryFields(keyPositions, comparators));
+   }
+
+   /**
+* Intermediate constructor for creating auxiliary fields.
+*/
+   private RowComparator(
+   int arity,
+   int[] keyPositions,
+   NullAwareComparator[] comparators,
+   

[GitHub] flink pull request #2968: [FLINK-5187] [core] Create analog of Row and RowTy...

2016-12-08 Thread wuchong
Github user wuchong commented on a diff in the pull request:

https://github.com/apache/flink/pull/2968#discussion_r91652178
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/java/typeutils/RowTypeInfo.java 
---
@@ -0,0 +1,176 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.java.typeutils;
+
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.api.common.ExecutionConfig;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.common.typeutils.TypeComparator;
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+import org.apache.flink.api.java.typeutils.runtime.RowComparator;
+import org.apache.flink.api.java.typeutils.runtime.RowSerializer;
+import org.apache.flink.types.Row;
+
+import java.util.ArrayList;
+import java.util.Collections;
+
+import static org.apache.flink.util.Preconditions.checkState;
+
+/**
+ * TypeInformation for {@link Row}
+ */
+@PublicEvolving
+public class RowTypeInfo extends TupleTypeInfoBase {
+
+   private static final long serialVersionUID = 9158518989896601963L;
+
+   protected final String[] fieldNames;
+   /** Temporary variable for directly passing orders to comparators. */
+   private boolean[] comparatorOrders = null;
+
+   public RowTypeInfo(TypeInformation... types) {
+   super(Row.class, types);
+
+   this.fieldNames = new String[types.length];
+
+   for (int i = 0; i < types.length; i++) {
+   fieldNames[i] = "f" + i;
--- End diff --

I think `RowTypeInfo` is something like `TupleTypeInfo`, which can't have 
custom field names.




[GitHub] flink pull request #2961: [FLINK-5266] [table] eagerly project unused fields...

2016-12-08 Thread KurtYoung
Github user KurtYoung commented on a diff in the pull request:

https://github.com/apache/flink/pull/2961#discussion_r91641034
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/table.scala
 ---
@@ -881,24 +883,21 @@ class GroupWindowedTable(
 * }}}
 */
   def select(fields: Expression*): Table = {
--- End diff --

Oh, my bad, I will add it.




[jira] [Commented] (FLINK-5266) Eagerly project unused fields when selecting aggregation fields

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733890#comment-15733890
 ] 

ASF GitHub Bot commented on FLINK-5266:
---

Github user KurtYoung commented on a diff in the pull request:

https://github.com/apache/flink/pull/2961#discussion_r91641034
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/table.scala
 ---
@@ -881,24 +883,21 @@ class GroupWindowedTable(
 * }}}
 */
   def select(fields: Expression*): Table = {
--- End diff --

Oh, my bad, I will add it.


> Eagerly project unused fields when selecting aggregation fields
> ---
>
> Key: FLINK-5266
> URL: https://issues.apache.org/jira/browse/FLINK-5266
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Kurt Young
>Assignee: Kurt Young
>
> When we call a table's {{select}} method and it contains some aggregations, 
> we currently project fields after the aggregation. It would be better to project 
> unused fields before the aggregation, which furthermore leaves the 
> opportunity to push the projection into the scan.
> For example, the current logical plan of a simple query:
> {code}
> table.select('a.sum as 's, 'a.max)
> {code}
> is
> {code}
> LogicalProject(s=[$0], TMP_2=[$1])
>   LogicalAggregate(group=[{}], TMP_0=[SUM($5)], TMP_1=[MAX($5)])
> LogicalTableScan(table=[[supplier]])
> {code}
> It would be better if we could project unused fields right after the scan, so 
> that the plan looks like this:
> {code}
> LogicalProject(s=[$0], EXPR$1=[$0])
>   LogicalAggregate(group=[{}], EXPR$1=[SUM($0)])
> LogicalProject(a=[$5])
>   LogicalTableScan(table=[[supplier]])
> {code}





[jira] [Created] (FLINK-5302) Log failure cause at Execution

2016-12-08 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-5302:
--

 Summary: Log failure cause at Execution 
 Key: FLINK-5302
 URL: https://issues.apache.org/jira/browse/FLINK-5302
 Project: Flink
  Issue Type: Improvement
Reporter: Ufuk Celebi
 Fix For: 1.2.0, 1.1.4


It can be helpful to log the failure cause that made an {{Execution}} switch to 
state {{FAILED}}. We currently only see a "root cause" logged on the 
JobManager, which happens to be the first failure cause that makes it to 
{{ExecutionGraph#fail()}}. This depends on relative timings of messages. For 
debugging it can be helpful to have all causes available.





[GitHub] flink pull request #2975: [backport] [FLINK-5114] [network] Handle partition...

2016-12-08 Thread uce
GitHub user uce opened a pull request:

https://github.com/apache/flink/pull/2975

[backport] [FLINK-5114] [network] Handle partition producer state check for 
unregistered executions

Reverted some changes made in #2913 after a discussion with @StephanEwen 
and decided to close the other one in favour of this PR for cleaner diffs.

The main difference to the previous variants in #2913 and #2912 (for 
`master`) is that here I stick to the JobManager-side changes only. The clumsy 
way in which the TaskManagers ask the JobManager for the producer state via a 
`tell` that is manually routed back to the `Task` is kept in order to keep the 
changes minimally invasive, which is easier to oversee given that this goes 
into a bugfix release.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/uce/flink 5114-partition_state-1.1-reworked

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2975.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2975








[jira] [Commented] (FLINK-5114) PartitionState update with finished execution fails

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733668#comment-15733668
 ] 

ASF GitHub Bot commented on FLINK-5114:
---

GitHub user uce opened a pull request:

https://github.com/apache/flink/pull/2975

[backport] [FLINK-5114] [network] Handle partition producer state check for 
unregistered executions

Reverted some changes made in #2913 after a discussion with @StephanEwen 
and decided to close the other one in favour of this PR for cleaner diffs.

The main difference to the previous variants in #2913 and #2912 (for 
`master`) is that here I stick to the JobManager-side changes only. The clumsy 
way in which the TaskManagers ask the JobManager for the producer state via a 
`tell` that is manually routed back to the `Task` is kept in order to keep the 
changes minimally invasive, which is easier to oversee given that this goes 
into a bugfix release.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/uce/flink 5114-partition_state-1.1-reworked

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2975.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2975






> PartitionState update with finished execution fails
> ---
>
> Key: FLINK-5114
> URL: https://issues.apache.org/jira/browse/FLINK-5114
> Project: Flink
>  Issue Type: Bug
>  Components: Network
>Reporter: Ufuk Celebi
>Assignee: Ufuk Celebi
>
> If a partition state request is triggered for a producer that finishes before 
> the request arrives, the execution is unregistered and the producer cannot be 
> found. In this case the PartitionState returns null and the job fails.
> We need to check the producer location via the intermediate result partition 
> in this case.
> See here: https://api.travis-ci.org/jobs/177668505/log.txt?deansi=true





[jira] [Commented] (FLINK-5114) PartitionState update with finished execution fails

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733664#comment-15733664
 ] 

ASF GitHub Bot commented on FLINK-5114:
---

Github user uce closed the pull request at:

https://github.com/apache/flink/pull/2913


> PartitionState update with finished execution fails
> ---
>
> Key: FLINK-5114
> URL: https://issues.apache.org/jira/browse/FLINK-5114
> Project: Flink
>  Issue Type: Bug
>  Components: Network
>Reporter: Ufuk Celebi
>Assignee: Ufuk Celebi
>
> If a partition state request is triggered for a producer that finishes before 
> the request arrives, the execution is unregistered and the producer cannot be 
> found. In this case the PartitionState returns null and the job fails.
> We need to check the producer location via the intermediate result partition 
> in this case.
> See here: https://api.travis-ci.org/jobs/177668505/log.txt?deansi=true





[jira] [Commented] (FLINK-5114) PartitionState update with finished execution fails

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733663#comment-15733663
 ] 

ASF GitHub Bot commented on FLINK-5114:
---

Github user uce commented on the issue:

https://github.com/apache/flink/pull/2913
  
Closing in favour of #2975.


> PartitionState update with finished execution fails
> ---
>
> Key: FLINK-5114
> URL: https://issues.apache.org/jira/browse/FLINK-5114
> Project: Flink
>  Issue Type: Bug
>  Components: Network
>Reporter: Ufuk Celebi
>Assignee: Ufuk Celebi
>
> If a partition state request is triggered for a producer that finishes before 
> the request arrives, the execution is unregistered and the producer cannot be 
> found. In this case the PartitionState returns null and the job fails.
> We need to check the producer location via the intermediate result partition 
> in this case.
> See here: https://api.travis-ci.org/jobs/177668505/log.txt?deansi=true





[GitHub] flink pull request #2913: [backport] [FLINK-5114] [network] Handle partition...

2016-12-08 Thread uce
Github user uce closed the pull request at:

https://github.com/apache/flink/pull/2913




[GitHub] flink issue #2913: [backport] [FLINK-5114] [network] Handle partition produc...

2016-12-08 Thread uce
Github user uce commented on the issue:

https://github.com/apache/flink/pull/2913
  
Closing in favour of #2975.




[GitHub] flink pull request #2972: [FLINK-5211] [metrics] [docs] Include example repo...

2016-12-08 Thread uce
Github user uce commented on a diff in the pull request:

https://github.com/apache/flink/pull/2972#discussion_r91626592
  
--- Diff: docs/monitoring/metrics.md ---
@@ -335,6 +345,21 @@ Parameters:
 - `ttl` - time-to-live for transmitted UDP packets
 - `addressingMode` - UDP addressing mode to use (UNICAST/MULTICAST)
 
+Example configuration:
+
+{% highlight yaml %}
+
+metrics.reporters: gang
--- End diff --

I think that's very easy to understand when reading it the first time, but this 
is definitely good enough! Feel free to merge as is, but can you double-check 
that the examples are actually correct (no typos, etc.) and are consistent with 
the other parts of the documentation.




[jira] [Commented] (FLINK-5211) Include an example configuration for all reporters

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733650#comment-15733650
 ] 

ASF GitHub Bot commented on FLINK-5211:
---

Github user uce commented on a diff in the pull request:

https://github.com/apache/flink/pull/2972#discussion_r91626592
  
--- Diff: docs/monitoring/metrics.md ---
@@ -335,6 +345,21 @@ Parameters:
 - `ttl` - time-to-live for transmitted UDP packets
 - `addressingMode` - UDP addressing mode to use (UNICAST/MULTICAST)
 
+Example configuration:
+
+{% highlight yaml %}
+
+metrics.reporters: gang
--- End diff --

I think that's very easy to understand when reading it the first time, but this 
is definitely good enough! Feel free to merge as is, but can you double-check 
that the examples are actually correct (no typos, etc.) and are consistent with 
the other parts of the documentation.


> Include an example configuration for all reporters
> --
>
> Key: FLINK-5211
> URL: https://issues.apache.org/jira/browse/FLINK-5211
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Metrics
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.2.0
>
>
> We should extend the reporter documentation to include an example 
> configuration for every reporter.





[jira] [Commented] (FLINK-5211) Include an example configuration for all reporters

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733563#comment-15733563
 ] 

ASF GitHub Bot commented on FLINK-5211:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2972#discussion_r91620174
  
--- Diff: docs/monitoring/metrics.md ---
@@ -335,6 +345,21 @@ Parameters:
 - `ttl` - time-to-live for transmitted UDP packets
 - `addressingMode` - UDP addressing mode to use (UNICAST/MULTICAST)
 
+Example configuration:
+
+{% highlight yaml %}
+
+metrics.reporters: gang
--- End diff --

There is no limit, but I wanted to make sure that users understand that the 
name is arbitrary; that is, to configure Graphite the reporter does not have to 
be called "graphite" as well.


> Include an example configuration for all reporters
> --
>
> Key: FLINK-5211
> URL: https://issues.apache.org/jira/browse/FLINK-5211
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Metrics
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.2.0
>
>
> We should extend the reporter documentation to include an example 
> configuration for every reporter.





[GitHub] flink pull request #2972: [FLINK-5211] [metrics] [docs] Include example repo...

2016-12-08 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2972#discussion_r91620174
  
--- Diff: docs/monitoring/metrics.md ---
@@ -335,6 +345,21 @@ Parameters:
 - `ttl` - time-to-live for transmitted UDP packets
 - `addressingMode` - UDP addressing mode to use (UNICAST/MULTICAST)
 
+Example configuration:
+
+{% highlight yaml %}
+
+metrics.reporters: gang
--- End diff --

There is no limit, but I wanted to make sure that users understand that the 
name is arbitrary; that is, to configure Graphite the reporter does not have to 
be called "graphite" as well.




[jira] [Commented] (FLINK-2254) Add Bipartite Graph Support for Gelly

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733544#comment-15733544
 ] 

ASF GitHub Bot commented on FLINK-2254:
---

Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2564
  
Hi @greghogan, I've fixed the PR according to your review.


> Add Bipartite Graph Support for Gelly
> -
>
> Key: FLINK-2254
> URL: https://issues.apache.org/jira/browse/FLINK-2254
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 0.10.0
>Reporter: Andra Lungu
>Assignee: Ivan Mushketyk
>  Labels: requires-design-doc
>
> A bipartite graph is a graph for which the set of vertices can be divided 
> into two disjoint sets such that each edge having a source vertex in the 
> first set, will have a target vertex in the second set. We would like to 
> support efficient operations for this type of graphs along with a set of 
> metrics(http://jponnela.com/web_documents/twomode.pdf). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2564: [FLINK-2254] Add BipartiateGraph class

2016-12-08 Thread mushketyk
Github user mushketyk commented on a diff in the pull request:

https://github.com/apache/flink/pull/2564#discussion_r91618485
  
--- Diff: 
flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/test/util/TestBaseUtils.java
 ---
@@ -480,16 +480,22 @@ protected static File asFile(String path) {
resultStrings[i] = (val == null) ? "null" : val.toString();
}
}
-
-   assertEquals("Wrong number of elements result", expectedStrings.length, resultStrings.length);
+
+   //
+   String msg = String.format(
+       "Different elements in arrays. Expected %d elements: %s. Actual %d elements: %s",
+       expectedStrings.length, Arrays.toString(expectedStrings),
+       resultStrings.length, Arrays.toString(resultStrings));
+
+   assertEquals(msg, expectedStrings.length, resultStrings.length);
 
if (sort) {
    Arrays.sort(expectedStrings);
    Arrays.sort(resultStrings);
}

for (int i = 0; i < expectedStrings.length; i++) {
-   assertEquals(expectedStrings[i], resultStrings[i]);
+   assertEquals(msg, expectedStrings[i], resultStrings[i]);
--- End diff --

I think it will give more context just as in the comparing lengths case.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class

2016-12-08 Thread mushketyk
Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2564
  
Hi @greghogan, I've fixed the PR according to your review.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2254) Add Bipartite Graph Support for Gelly

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733541#comment-15733541
 ] 

ASF GitHub Bot commented on FLINK-2254:
---

Github user mushketyk commented on a diff in the pull request:

https://github.com/apache/flink/pull/2564#discussion_r91618485
  
--- Diff: 
flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/test/util/TestBaseUtils.java
 ---
@@ -480,16 +480,22 @@ protected static File asFile(String path) {
resultStrings[i] = (val == null) ? "null" : val.toString();
}
}
-
-   assertEquals("Wrong number of elements result", expectedStrings.length, resultStrings.length);
+
+   //
+   String msg = String.format(
+       "Different elements in arrays. Expected %d elements: %s. Actual %d elements: %s",
+       expectedStrings.length, Arrays.toString(expectedStrings),
+       resultStrings.length, Arrays.toString(resultStrings));
+
+   assertEquals(msg, expectedStrings.length, resultStrings.length);
 
if (sort) {
    Arrays.sort(expectedStrings);
    Arrays.sort(resultStrings);
}

for (int i = 0; i < expectedStrings.length; i++) {
-   assertEquals(expectedStrings[i], resultStrings[i]);
+   assertEquals(msg, expectedStrings[i], resultStrings[i]);
--- End diff --

I think it will give more context just as in the comparing lengths case.


> Add Bipartite Graph Support for Gelly
> -
>
> Key: FLINK-2254
> URL: https://issues.apache.org/jira/browse/FLINK-2254
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 0.10.0
>Reporter: Andra Lungu
>Assignee: Ivan Mushketyk
>  Labels: requires-design-doc
>
> A bipartite graph is a graph for which the set of vertices can be divided 
> into two disjoint sets such that each edge having a source vertex in the 
> first set, will have a target vertex in the second set. We would like to 
> support efficient operations for this type of graphs along with a set of 
> metrics(http://jponnela.com/web_documents/twomode.pdf). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2254) Add Bipartite Graph Support for Gelly

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733483#comment-15733483
 ] 

ASF GitHub Bot commented on FLINK-2254:
---

Github user greghogan commented on a diff in the pull request:

https://github.com/apache/flink/pull/2564#discussion_r91613888
  
--- Diff: 
flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/test/util/TestBaseUtils.java
 ---
@@ -480,16 +480,22 @@ protected static File asFile(String path) {
resultStrings[i] = (val == null) ? "null" : val.toString();
}
}
-
-   assertEquals("Wrong number of elements result", expectedStrings.length, resultStrings.length);
+
+   //
+   String msg = String.format(
+       "Different elements in arrays. Expected %d elements: %s. Actual %d elements: %s",
+       expectedStrings.length, Arrays.toString(expectedStrings),
+       resultStrings.length, Arrays.toString(resultStrings));
+
+   assertEquals(msg, expectedStrings.length, resultStrings.length);
 
if (sort) {
    Arrays.sort(expectedStrings);
    Arrays.sort(resultStrings);
}

for (int i = 0; i < expectedStrings.length; i++) {
-   assertEquals(expectedStrings[i], resultStrings[i]);
+   assertEquals(msg, expectedStrings[i], resultStrings[i]);
--- End diff --

Do we need to include `msg` here?


> Add Bipartite Graph Support for Gelly
> -
>
> Key: FLINK-2254
> URL: https://issues.apache.org/jira/browse/FLINK-2254
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 0.10.0
>Reporter: Andra Lungu
>Assignee: Ivan Mushketyk
>  Labels: requires-design-doc
>
> A bipartite graph is a graph for which the set of vertices can be divided 
> into two disjoint sets such that each edge having a source vertex in the 
> first set, will have a target vertex in the second set. We would like to 
> support efficient operations for this type of graphs along with a set of 
> metrics(http://jponnela.com/web_documents/twomode.pdf). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2564: [FLINK-2254] Add BipartiateGraph class

2016-12-08 Thread greghogan
Github user greghogan commented on a diff in the pull request:

https://github.com/apache/flink/pull/2564#discussion_r91613888
  
--- Diff: 
flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/test/util/TestBaseUtils.java
 ---
@@ -480,16 +480,22 @@ protected static File asFile(String path) {
resultStrings[i] = (val == null) ? "null" : val.toString();
}
}
-
-   assertEquals("Wrong number of elements result", expectedStrings.length, resultStrings.length);
+
+   //
+   String msg = String.format(
+       "Different elements in arrays. Expected %d elements: %s. Actual %d elements: %s",
+       expectedStrings.length, Arrays.toString(expectedStrings),
+       resultStrings.length, Arrays.toString(resultStrings));
+
+   assertEquals(msg, expectedStrings.length, resultStrings.length);
 
if (sort) {
    Arrays.sort(expectedStrings);
    Arrays.sort(resultStrings);
}

for (int i = 0; i < expectedStrings.length; i++) {
-   assertEquals(expectedStrings[i], resultStrings[i]);
+   assertEquals(msg, expectedStrings[i], resultStrings[i]);
--- End diff --

Do we need to include `msg` here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2254) Add Bipartite Graph Support for Gelly

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733425#comment-15733425
 ] 

ASF GitHub Bot commented on FLINK-2254:
---

Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2564
  
@vasia, @greghogan I've created a new package, moved the new classes there, and 
updated the PR according to your latest comments.

Best regards,
Ivan.


> Add Bipartite Graph Support for Gelly
> -
>
> Key: FLINK-2254
> URL: https://issues.apache.org/jira/browse/FLINK-2254
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 0.10.0
>Reporter: Andra Lungu
>Assignee: Ivan Mushketyk
>  Labels: requires-design-doc
>
> A bipartite graph is a graph for which the set of vertices can be divided 
> into two disjoint sets such that each edge having a source vertex in the 
> first set, will have a target vertex in the second set. We would like to 
> support efficient operations for this type of graphs along with a set of 
> metrics(http://jponnela.com/web_documents/twomode.pdf). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class

2016-12-08 Thread mushketyk
Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2564
  
@vasia, @greghogan I've created a new package, moved the new classes there, and 
updated the PR according to your latest comments.

Best regards,
Ivan.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Closed] (FLINK-5039) Avro GenericRecord support is broken

2016-12-08 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-5039.

Resolution: Fixed
  Assignee: Robert Metzger

Fixed for 1.1.4 with 3ae6e9e09ba74f88fe87d1ac130b3cc232a5e88c
Fixed for 1.2.0 with 2d8f03e7ad12af3a0dcb7bec087c25f19a4fd03e

> Avro GenericRecord support is broken
> 
>
> Key: FLINK-5039
> URL: https://issues.apache.org/jira/browse/FLINK-5039
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats
>Affects Versions: 1.1.3
>Reporter: Bruno Dumon
>Assignee: Robert Metzger
>Priority: Blocker
> Fix For: 1.2.0, 1.1.4
>
>
> Avro GenericRecord support was introduced in FLINK-3691, but it seems like 
> the GenericRecords are not properly (de)serialized.
> This can be easily seen with a program like this:
> {noformat}
>   env.createInput(new AvroInputFormat<>(new Path("somefile.avro"), 
> GenericRecord.class))
> .first(10)
> .print();
> {noformat}
> which will print records in which all fields have the same value:
> {noformat}
> {"foo": 1478628723066, "bar": 1478628723066, "baz": 1478628723066, ...}
> {"foo": 1478628723179, "bar": 1478628723179, "baz": 1478628723179, ...}
> {noformat}
> If I'm not mistaken, the AvroInputFormat does essentially 
> TypeExtractor.getForClass(GenericRecord.class), but GenericRecords are not 
> POJOs.
> Furthermore, each GenericRecord contains a pointer to the record schema. I 
> guess the current naive approach will serialize this schema with each record, 
> which is quite inefficient (the schema is typically more complex and much 
> larger than the data). We probably need a TypeInformation and TypeSerializer 
> specific to Avro GenericRecords, which could just use avro serialization.
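
As a rough illustration of the last point only (the pull request linked elsewhere in 
this thread bumps the Avro version, so this is not the merged change): a 
GenericRecord-specific serializer could delegate to Avro's own generic 
(de)serialization against a schema that is known up front, instead of shipping the 
schema with every record. A minimal, standalone sketch:

{code}
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

/** Sketch only: (de)serialize GenericRecords with a fixed schema, as a TypeSerializer could. */
public class GenericRecordAvroCodec {

    private final Schema schema;

    public GenericRecordAvroCodec(Schema schema) {
        this.schema = schema;
    }

    public byte[] serialize(GenericRecord record) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        GenericDatumWriter<GenericRecord> writer = new GenericDatumWriter<>(schema);
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        writer.write(record, encoder);
        encoder.flush();
        // only the record data is written; the schema is kept once, in the codec
        return out.toByteArray();
    }

    public GenericRecord deserialize(byte[] bytes) throws IOException {
        GenericDatumReader<GenericRecord> reader = new GenericDatumReader<>(schema);
        BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
        return reader.read(null, decoder);
    }
}
{code}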



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-5288) Flakey ConnectedComponentsITCase#testConnectedComponentsExample unit test

2016-12-08 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-5288:
-
Labels: test-stability  (was: )

> Flakey ConnectedComponentsITCase#testConnectedComponentsExample unit test
> -
>
> Key: FLINK-5288
> URL: https://issues.apache.org/jira/browse/FLINK-5288
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 2.0.0
> Environment: TravisCI
>Reporter: Nico Kruber
>  Labels: test-stability
>
> https://api.travis-ci.org/jobs/182243067/log.txt?deansi=true
> {code:none}
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.272 sec <<< 
> FAILURE! - in org.apache.flink.graph.test.examples.ConnectedComponentsITCase
> testConnectedComponentsExample[Execution mode = 
> CLUSTER](org.apache.flink.graph.test.examples.ConnectedComponentsITCase)  
> Time elapsed: 1.195 sec  <<< FAILURE!
> java.lang.AssertionError: Different number of lines in expected and obtained 
> result. expected:<4> but was:<3>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at 
> org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:316)
>   at 
> org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:302)
>   at 
> org.apache.flink.graph.test.examples.ConnectedComponentsITCase.after(ConnectedComponentsITCase.java:70)
> Failed tests: 
>   
> ConnectedComponentsITCase.after:70->TestBaseUtils.compareResultsByLinesInMemory:302->TestBaseUtils.compareResultsByLinesInMemory:316
>  Different number of lines in expected and obtained result. expected:<4> but 
> was:<3>{code}
> full log:
> https://transfer.sh/RjFRD/38.4.tar.gz



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-5280) Extend TableSource to support nested data

2016-12-08 Thread Ivan Mushketyk (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mushketyk reassigned FLINK-5280:
-

Assignee: Ivan Mushketyk

> Extend TableSource to support nested data
> -
>
> Key: FLINK-5280
> URL: https://issues.apache.org/jira/browse/FLINK-5280
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: Ivan Mushketyk
>
> The {{TableSource}} interface does currently only support the definition of 
> flat rows. 
> However, there are several storage formats for nested data that should be 
> supported such as Avro, Json, Parquet, and Orc. The Table API and SQL can 
> also natively handle nested rows.
> The {{TableSource}} interface and the code to register table sources in 
> Calcite's schema need to be extended to support nested data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5211) Include an example configuration for all reporters

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733382#comment-15733382
 ] 

ASF GitHub Bot commented on FLINK-5211:
---

Github user uce commented on the issue:

https://github.com/apache/flink/pull/2972
  
Very good idea! This will definitely be helpful for users.


> Include an example configuration for all reporters
> --
>
> Key: FLINK-5211
> URL: https://issues.apache.org/jira/browse/FLINK-5211
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Metrics
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.2.0
>
>
> We should extend the reporter documentation to include an example 
> configuration for every reporter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2972: [FLINK-5211] [metrics] [docs] Include example reporter co...

2016-12-08 Thread uce
Github user uce commented on the issue:

https://github.com/apache/flink/pull/2972
  
Very good idea! This will definitely be helpful for users.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (FLINK-3921) StringParser not specifying encoding to use

2016-12-08 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske resolved FLINK-3921.
--
   Resolution: Implemented
Fix Version/s: 1.2.0

Implemented for 1.2.0 with f2186af6702c9fe48c91d5c2d7748378984cd29b and 
41d5875bfc272f2cd5c7e8c8523036684865c1ce

> StringParser not specifying encoding to use
> ---
>
> Key: FLINK-3921
> URL: https://issues.apache.org/jira/browse/FLINK-3921
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.3
>Reporter: Tatu Saloranta
>Assignee: Rekha Joshi
>Priority: Trivial
> Fix For: 1.2.0
>
>
> Class `flink.types.parser.StringParser` has javadocs indicating that contents 
> are expected to be ASCII, similar to `StringValueParser`. That makes sense, 
> but when constructing the actual instance, no encoding is specified; e.g. on 
> line 66:
>    this.result = new String(bytes, startPos+1, i - startPos - 2);
> which leads to using whatever the default platform encoding is. If the contents 
> really are always ASCII (I would not count on that, as the parser is used from 
> the CSV reader), it is not a big deal, but it can lead to the usual 
> Latin-1-vs-UTF-8 issues.
> So I think that the encoding should be explicitly specified, whatever is to be 
> used: the javadocs claim ASCII, so it could be "us-ascii", but it could well be 
> UTF-8 or even ISO-8859-1.
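
For illustration, the kind of change being asked for is simply passing an explicit 
charset to the String constructor (US_ASCII is used here purely as an example of the 
pattern; the charset actually chosen for the fix may differ):

{code}
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetExample {
    public static void main(String[] args) {
        byte[] bytes = "\"hello\"".getBytes(StandardCharsets.US_ASCII);
        int startPos = 0;
        int i = bytes.length;

        // platform-dependent: uses whatever the default platform encoding is
        String implicit = new String(bytes, startPos + 1, i - startPos - 2);

        // explicit: behaves the same on every platform
        String explicit = new String(bytes, startPos + 1, i - startPos - 2, StandardCharsets.US_ASCII);

        System.out.println(implicit + " / " + explicit); // hello / hello
    }
}
{code}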



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-5226) Eagerly project unused attributes

2016-12-08 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-5226.

   Resolution: Fixed
Fix Version/s: 1.2.0

Fixed for 1.2.0 with 677d0d9073952b6f4c745ac242ba4108364f2189

> Eagerly project unused attributes
> -
>
> Key: FLINK-5226
> URL: https://issues.apache.org/jira/browse/FLINK-5226
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
> Fix For: 1.2.0
>
>
> The optimizer currently does not eagerly remove unused attributes. 
> For example, given a table {{tab5}} with five attributes {{a, b, c, d, e}}, 
> the following query
> {code}
> SELECT x.a, y.b FROM tab5 AS x, tab5 AS y WHERE x.a = y.a
> {code}
> would result in the non-optimized plan
> {code}
> LogicalProject(a=[$0], b=[$6])
>   LogicalFilter(condition=[=($0, $5)])
> LogicalJoin(condition=[true], joinType=[inner])
>   LogicalTableScan(table=[[tab5]])
>   LogicalTableScan(table=[[tab5]])
> {code}
> and the optimized plan:
> {code}
> DataSetCalc(select=[a, b0 AS b])
>   DataSetJoin(where=[=(a, a0)], join=[a, b, c, d, e, a0, b0, c0, d0, e0], 
> joinType=[InnerJoin])
> DataSetScan(table=[[_DataSetTable_0]])
> DataSetScan(table=[[_DataSetTable_0]])
> {code}
> This plan is inefficient because it joins all ten attributes of both tables 
> instead of eagerly projecting out all unused fields ({{x.b, x.c, x.d, x.e, 
> y.c, y.d, y.e}}).
> Since this is one of the most common optimizations, I would assume that 
> Calcite provides some rules to extract eager projections. If this is the 
> case, the issue can be solved by adding such rules to {{FlinkRuleSets}}.
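
For context, registering additional Calcite rules is a matter of listing them in a 
rule set. A hedged sketch of what adding projection push-down rules could look like 
(the rules shown are illustrative picks from Calcite's rule catalogue; the commit 
referenced above may have taken a different route, and the rules actually used in 
{{FlinkRuleSets}} may differ):

{code}
import org.apache.calcite.rel.rules.ProjectJoinTransposeRule;
import org.apache.calcite.rel.rules.ProjectRemoveRule;
import org.apache.calcite.tools.RuleSet;
import org.apache.calcite.tools.RuleSets;

public class EagerProjectionRules {
    // Sketch only: push projections below joins and drop trivial projections,
    // so unused attributes are removed before the join is executed.
    public static final RuleSet PROJECT_RULES = RuleSets.ofList(
        ProjectJoinTransposeRule.INSTANCE,
        ProjectRemoveRule.INSTANCE
    );
}
{code}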



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2060: [FLINK-3921] StringParser encoding

2016-12-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2060


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2926: [FLINK-5226] [table] Use correct DataSetCostFactor...

2016-12-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2926


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5039) Avro GenericRecord support is broken

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733311#comment-15733311
 ] 

ASF GitHub Bot commented on FLINK-5039:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2953


> Avro GenericRecord support is broken
> 
>
> Key: FLINK-5039
> URL: https://issues.apache.org/jira/browse/FLINK-5039
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats
>Affects Versions: 1.1.3
>Reporter: Bruno Dumon
>Priority: Blocker
> Fix For: 1.2.0, 1.1.4
>
>
> Avro GenericRecord support was introduced in FLINK-3691, but it seems like 
> the GenericRecords are not properly (de)serialized.
> This can be easily seen with a program like this:
> {noformat}
>   env.createInput(new AvroInputFormat<>(new Path("somefile.avro"), 
> GenericRecord.class))
> .first(10)
> .print();
> {noformat}
> which will print records in which all fields have the same value:
> {noformat}
> {"foo": 1478628723066, "bar": 1478628723066, "baz": 1478628723066, ...}
> {"foo": 1478628723179, "bar": 1478628723179, "baz": 1478628723179, ...}
> {noformat}
> If I'm not mistaken, the AvroInputFormat does essentially 
> TypeExtractor.getForClass(GenericRecord.class), but GenericRecords are not 
> POJOs.
> Furthermore, each GenericRecord contains a pointer to the record schema. I 
> guess the current naive approach will serialize this schema with each record, 
> which is quite inefficient (the schema is typically more complex and much 
> larger than the data). We probably need a TypeInformation and TypeSerializer 
> specific to Avro GenericRecords, which could just use avro serialization.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3921) StringParser not specifying encoding to use

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733312#comment-15733312
 ] 

ASF GitHub Bot commented on FLINK-3921:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2060


> StringParser not specifying encoding to use
> ---
>
> Key: FLINK-3921
> URL: https://issues.apache.org/jira/browse/FLINK-3921
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.3
>Reporter: Tatu Saloranta
>Assignee: Rekha Joshi
>Priority: Trivial
>
> Class `flink.types.parser.StringParser` has javadocs indicating that contents 
> are expected to be ASCII, similar to `StringValueParser`. That makes sense, 
> but when constructing the actual instance, no encoding is specified; e.g. on 
> line 66:
>    this.result = new String(bytes, startPos+1, i - startPos - 2);
> which leads to using whatever the default platform encoding is. If the contents 
> really are always ASCII (I would not count on that, as the parser is used from 
> the CSV reader), it is not a big deal, but it can lead to the usual 
> Latin-1-vs-UTF-8 issues.
> So I think that the encoding should be explicitly specified, whatever is to be 
> used: the javadocs claim ASCII, so it could be "us-ascii", but it could well be 
> UTF-8 or even ISO-8859-1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5226) Eagerly project unused attributes

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733310#comment-15733310
 ] 

ASF GitHub Bot commented on FLINK-5226:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2926


> Eagerly project unused attributes
> -
>
> Key: FLINK-5226
> URL: https://issues.apache.org/jira/browse/FLINK-5226
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>
> The optimizer currently does not eagerly remove unused attributes. 
> For example, given a table {{tab5}} with five attributes {{a, b, c, d, e}}, 
> the following query
> {code}
> SELECT x.a, y.b FROM tab5 AS x, tab5 AS y WHERE x.a = y.a
> {code}
> would result in the non-optimized plan
> {code}
> LogicalProject(a=[$0], b=[$6])
>   LogicalFilter(condition=[=($0, $5)])
> LogicalJoin(condition=[true], joinType=[inner])
>   LogicalTableScan(table=[[tab5]])
>   LogicalTableScan(table=[[tab5]])
> {code}
> and the optimized plan:
> {code}
> DataSetCalc(select=[a, b0 AS b])
>   DataSetJoin(where=[=(a, a0)], join=[a, b, c, d, e, a0, b0, c0, d0, e0], 
> joinType=[InnerJoin])
> DataSetScan(table=[[_DataSetTable_0]])
> DataSetScan(table=[[_DataSetTable_0]])
> {code}
> This plan is inefficient because it joins all ten attributes of both tables 
> instead of eagerly projecting out all unused fields ({{x.b, x.c, x.d, x.e, 
> y.c, y.d, y.e}}).
> Since this is one of the most common optimizations, I would assume that 
> Calcite provides some rules to extract eager projections. If this is the 
> case, the issue can be solved by adding such rules to {{FlinkRuleSets}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3921) StringParser not specifying encoding to use

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733313#comment-15733313
 ] 

ASF GitHub Bot commented on FLINK-3921:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2901


> StringParser not specifying encoding to use
> ---
>
> Key: FLINK-3921
> URL: https://issues.apache.org/jira/browse/FLINK-3921
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.3
>Reporter: Tatu Saloranta
>Assignee: Rekha Joshi
>Priority: Trivial
>
> Class `flink.types.parser.StringParser` has javadocs indicating that contents 
> are expected to be ASCII, similar to `StringValueParser`. That makes sense, 
> but when constructing the actual instance, no encoding is specified; e.g. on 
> line 66:
>    this.result = new String(bytes, startPos+1, i - startPos - 2);
> which leads to using whatever the default platform encoding is. If the contents 
> really are always ASCII (I would not count on that, as the parser is used from 
> the CSV reader), it is not a big deal, but it can lead to the usual 
> Latin-1-vs-UTF-8 issues.
> So I think that the encoding should be explicitly specified, whatever is to be 
> used: the javadocs claim ASCII, so it could be "us-ascii", but it could well be 
> UTF-8 or even ISO-8859-1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2953: [FLINK-5039] Bump Avro version

2016-12-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2953


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2901: [FLINK-3921] StringParser encoding

2016-12-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2901


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5280) Extend TableSource to support nested data

2016-12-08 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733296#comment-15733296
 ] 

Fabian Hueske commented on FLINK-5280:
--

Sure, you can work on this [~ivan.mushketyk].
I think it makes sense to first design the interfaces before we start with the 
actual implementation.

Our goal should be to change as little as possible on the current interfaces. 
It should still be possible to define TableSources for tables with flat schema 
in an easy way.

I would propose the following:
- create a {{FlatTableSource}} interface and move the 
{{TableSource.getFieldNames()}} and {{TableSource.getFieldTypes()}} methods 
there. The {{TableSource.getNumberOfFields()}} method can be dropped.
- create a {{NestedTableSource}} interface that provides methods to derive a 
nested schema (field names and types). We need to decide how this is supposed 
to look like.
- Change all classes that currently implement {{TableSource}} to also implement 
either {{FlatTableSource}} or {{NestedTableSource}}. {{NestedTableSource}} will 
be used, for instance, for the Avro table source.
- We need to modify / extend the way that table sources are currently 
registered. First, we need to distinguish flat and nested sources. For the 
nested sources we need an implementation that converts the information of the 
{{NestedTableSource}} interface into the {{RelDataType}} required by Calcite's 
{{Table}} interface (see {{FlinkTable}}).

What do you think?
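
To make the proposal a bit more concrete, a rough, standalone sketch of the split 
could look as follows (all names and signatures are taken from the proposal above or 
made up for illustration; this is not the existing flink-table API, and the 
nested-schema method is only a placeholder):

{code}
import org.apache.flink.api.common.typeinfo.TypeInformation;

// Standalone sketch of the proposed interfaces, not the actual flink-table code.
interface TableSource<T> {
    // whatever remains common to all table sources
}

// Flat sources keep the current field name/type methods.
interface FlatTableSource<T> extends TableSource<T> {
    String[] getFieldNames();
    TypeInformation<?>[] getFieldTypes();
}

// Nested sources expose a (possibly nested) row schema; its exact shape is the open question.
interface NestedTableSource<T> extends TableSource<T> {
    TypeInformation<T> getNestedSchema(); // placeholder name for the to-be-designed method
}
{code}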

> Extend TableSource to support nested data
> -
>
> Key: FLINK-5280
> URL: https://issues.apache.org/jira/browse/FLINK-5280
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>
> The {{TableSource}} interface does currently only support the definition of 
> flat rows. 
> However, there are several storage formats for nested data that should be 
> supported such as Avro, Json, Parquet, and Orc. The Table API and SQL can 
> also natively handle nested rows.
> The {{TableSource}} interface and the code to register table sources in 
> Calcite's schema need to be extended to support nested data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1536) Graph partitioning operators for Gelly

2016-12-08 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733284#comment-15733284
 ] 

Vasia Kalavri commented on FLINK-1536:
--

The idea is what [~greghogan] describes. In a distributed graph processing 
system, you first have to partition the graph before you perform any 
computation. The performance of graph algorithms greatly depends on the 
resulting partitioning. A bad partitioning might assign disproportionately more 
vertices to one partition, thus hurting load balancing, or it might partition the 
graph so that the communication required is too high (or both). Currently, we 
only support hash partitioning; that is, vertices are randomly assigned to 
workers using the hash of their id. This strategy has very low overhead and 
results in good load balancing unless the graphs are skewed. For more details 
on this problem, I suggest you read some of the papers in the literature linked 
in the description of the issue [~ivan.mushketyk].
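
As a toy illustration of the hash partitioning mentioned above (purely schematic, 
not Gelly code): each vertex id is mapped to a worker by hashing it modulo the 
number of workers, which balances well unless a few ids carry most of the edges.

{code}
public class HashPartitionExample {

    // Schematic only: assign a vertex to one of `numWorkers` partitions by the hash of its id.
    static int partitionFor(long vertexId, int numWorkers) {
        return Math.abs(Long.hashCode(vertexId) % numWorkers);
    }

    public static void main(String[] args) {
        int numWorkers = 4;
        for (long id = 0; id < 8; id++) {
            System.out.println("vertex " + id + " -> partition " + partitionFor(id, numWorkers));
        }
    }
}
{code}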

> Graph partitioning operators for Gelly
> --
>
> Key: FLINK-1536
> URL: https://issues.apache.org/jira/browse/FLINK-1536
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Ivan Mushketyk
>Priority: Minor
>
> Smart graph partitioning can significantly improve the performance and 
> scalability of graph analysis applications. Depending on the computation 
> pattern, a graph partitioning algorithm divides the graph into (maybe 
> overlapping) subgraphs, optimizing some objective. For example, if 
> communication is performed across graph edges, one might want to minimize the 
> edges that cross from one partition to another.
> The problem of graph partitioning is a well studied problem and several 
> algorithms have been proposed in the literature. The goal of this project 
> would be to choose a few existing partitioning techniques and implement the 
> corresponding graph partitioning operators for Gelly.
> Some related literature can be found [here| 
> http://www.citeulike.org/user/vasiakalavri/tag/graph-partitioning].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2254) Add Bipartite Graph Support for Gelly

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733244#comment-15733244
 ] 

ASF GitHub Bot commented on FLINK-2254:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/2564
  
Yes, you are right, `bipartite`.


> Add Bipartite Graph Support for Gelly
> -
>
> Key: FLINK-2254
> URL: https://issues.apache.org/jira/browse/FLINK-2254
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 0.10.0
>Reporter: Andra Lungu
>Assignee: Ivan Mushketyk
>  Labels: requires-design-doc
>
> A bipartite graph is a graph for which the set of vertices can be divided 
> into two disjoint sets such that each edge having a source vertex in the 
> first set, will have a target vertex in the second set. We would like to 
> support efficient operations for this type of graphs along with a set of 
> metrics(http://jponnela.com/web_documents/twomode.pdf). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class

2016-12-08 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/2564
  
Yes, you are right, `bipartite`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2254) Add Bipartite Graph Support for Gelly

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733224#comment-15733224
 ] 

ASF GitHub Bot commented on FLINK-2254:
---

Github user vasia commented on the issue:

https://github.com/apache/flink/pull/2564
  
I would go for `org.apache.flink.graph.bipartite`. I think that 
`bidirectional` simply suggests that each edge exists in both directions.


> Add Bipartite Graph Support for Gelly
> -
>
> Key: FLINK-2254
> URL: https://issues.apache.org/jira/browse/FLINK-2254
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 0.10.0
>Reporter: Andra Lungu
>Assignee: Ivan Mushketyk
>  Labels: requires-design-doc
>
> A bipartite graph is a graph for which the set of vertices can be divided 
> into two disjoint sets such that each edge having a source vertex in the 
> first set, will have a target vertex in the second set. We would like to 
> support efficient operations for this type of graphs along with a set of 
> metrics(http://jponnela.com/web_documents/twomode.pdf). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class

2016-12-08 Thread vasia
Github user vasia commented on the issue:

https://github.com/apache/flink/pull/2564
  
I would go for `org.apache.flink.graph.bipartite`. I think that 
`bidirectional` simply suggests that each edge exists in both directions.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2254) Add Bipartite Graph Support for Gelly

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733210#comment-15733210
 ] 

ASF GitHub Bot commented on FLINK-2254:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/2564
  
@mushketyk @vasia, thoughts on package naming? Should we create a new 
`org.apache.flink.bigraph` package? Another option would be 
`org.apache.flink.graph.bidirectional` which would suggest future package names 
like `org.apache.flink.graph.multi` and `org.apache.flink.graph.temporal`.


> Add Bipartite Graph Support for Gelly
> -
>
> Key: FLINK-2254
> URL: https://issues.apache.org/jira/browse/FLINK-2254
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 0.10.0
>Reporter: Andra Lungu
>Assignee: Ivan Mushketyk
>  Labels: requires-design-doc
>
> A bipartite graph is a graph for which the set of vertices can be divided 
> into two disjoint sets such that each edge having a source vertex in the 
> first set, will have a target vertex in the second set. We would like to 
> support efficient operations for this type of graphs along with a set of 
> metrics(http://jponnela.com/web_documents/twomode.pdf). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2564: [FLINK-2254] Add BipartiateGraph class

2016-12-08 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/2564
  
@mushketyk @vasia, thoughts on package naming? Should we create a new 
`org.apache.flink.bigraph` package? Another option would be 
`org.apache.flink.graph.bidirectional` which would suggest future package names 
like `org.apache.flink.graph.multi` and `org.apache.flink.graph.temporal`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1536) Graph partitioning operators for Gelly

2016-12-08 Thread Greg Hogan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733195#comment-15733195
 ] 

Greg Hogan commented on FLINK-1536:
---

As I understand it, these can be implemented as a {{GraphAlgorithm}} returning a 
{{Graph}} whose edges and/or vertices have been shuffled to partitions that 
improve the performance of follow-on algorithms or of writes to a database, 
storage, etc.
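
A minimal sketch of that shape, assuming a simple hash-based redistribution (the 
class name and the partitioning choice are made up for illustration; a real 
partitioner would implement a smarter assignment than {{partitionByHash}}):

{code}
import org.apache.flink.api.java.DataSet;
import org.apache.flink.graph.Edge;
import org.apache.flink.graph.Graph;
import org.apache.flink.graph.GraphAlgorithm;
import org.apache.flink.graph.Vertex;

// Illustration only: a "partitioning" algorithm that just re-distributes the graph's
// vertices and edges by the hash of their id/source field and returns the graph otherwise unchanged.
public class HashPartitionGraph<K, VV, EV>
        implements GraphAlgorithm<K, VV, EV, Graph<K, VV, EV>> {

    @Override
    public Graph<K, VV, EV> run(Graph<K, VV, EV> input) throws Exception {
        DataSet<Vertex<K, VV>> vertices = input.getVertices().partitionByHash(0);
        DataSet<Edge<K, EV>> edges = input.getEdges().partitionByHash(0);
        return Graph.fromDataSet(vertices, edges, input.getContext());
    }
}
{code}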

> Graph partitioning operators for Gelly
> --
>
> Key: FLINK-1536
> URL: https://issues.apache.org/jira/browse/FLINK-1536
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Ivan Mushketyk
>Priority: Minor
>
> Smart graph partitioning can significantly improve the performance and 
> scalability of graph analysis applications. Depending on the computation 
> pattern, a graph partitioning algorithm divides the graph into (maybe 
> overlapping) subgraphs, optimizing some objective. For example, if 
> communication is performed across graph edges, one might want to minimize the 
> edges that cross from one partition to another.
> The problem of graph partitioning is a well studied problem and several 
> algorithms have been proposed in the literature. The goal of this project 
> would be to choose a few existing partitioning techniques and implement the 
> corresponding graph partitioning operators for Gelly.
> Some related literature can be found [here| 
> http://www.citeulike.org/user/vasiakalavri/tag/graph-partitioning].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2254) Add Bipartite Graph Support for Gelly

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733156#comment-15733156
 ] 

ASF GitHub Bot commented on FLINK-2254:
---

Github user greghogan commented on a diff in the pull request:

https://github.com/apache/flink/pull/2564#discussion_r91585475
  
--- Diff: 
flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/test/util/TestBaseUtils.java
 ---
@@ -480,7 +480,8 @@ protected static File asFile(String path) {
}
}

-   assertEquals("Wrong number of elements result", 
expectedStrings.length, resultStrings.length);
+   assertEquals(String.format("Wrong number of elements result. 
Expected: %s. Result: %s.", Arrays.toString(expectedStrings), 
Arrays.toString(resultStrings)),
--- End diff --

What if we moved `String.format` to its own line, included in the string 
both the array lengths and contents, and added a comment to describe why we are 
also printing the full arrays?

Also, should the arrays be printed on new lines such that they would line 
up until the diverging element? We'll need to move the sorting of the arrays 
before the length check.


> Add Bipartite Graph Support for Gelly
> -
>
> Key: FLINK-2254
> URL: https://issues.apache.org/jira/browse/FLINK-2254
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 0.10.0
>Reporter: Andra Lungu
>Assignee: Ivan Mushketyk
>  Labels: requires-design-doc
>
> A bipartite graph is a graph for which the set of vertices can be divided 
> into two disjoint sets such that each edge having a source vertex in the 
> first set, will have a target vertex in the second set. We would like to 
> support efficient operations for this type of graphs along with a set of 
> metrics(http://jponnela.com/web_documents/twomode.pdf). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2564: [FLINK-2254] Add BipartiateGraph class

2016-12-08 Thread greghogan
Github user greghogan commented on a diff in the pull request:

https://github.com/apache/flink/pull/2564#discussion_r91585475
  
--- Diff: 
flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/test/util/TestBaseUtils.java
 ---
@@ -480,7 +480,8 @@ protected static File asFile(String path) {
}
}

-   assertEquals("Wrong number of elements result", 
expectedStrings.length, resultStrings.length);
+   assertEquals(String.format("Wrong number of elements result. 
Expected: %s. Result: %s.", Arrays.toString(expectedStrings), 
Arrays.toString(resultStrings)),
--- End diff --

What if we moved `String.format` to its own line, included in the string 
both the array lengths and contents, and added a comment to describe why we are 
also printing the full arrays?

Also, should the arrays be printed on new lines such that they would line 
up until the diverging element? We'll need to move the sorting of the arrays 
before the length check.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5298) TaskManager crashes when TM log not existant

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733003#comment-15733003
 ] 

ASF GitHub Bot commented on FLINK-5298:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/2974

[FLINK-5298] TM checks that log file exists

This PR slightly modifies `TaskManager#handleRequestTaskManagerLog`. 
First, it verifies that the log file actually exists before opening it. 
Second, if the logFilePathOption is empty it no longer throws an IOException 
(which _could_ crash the TM) but instead forwards the error to the sender.
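
In Java terms the described pattern is roughly the following (the actual change is 
in the Scala TaskManager; `errorResponse` and `readAndUploadLog` are made-up 
placeholders for however the response is sent back to the requester):

{code}
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

// Illustration of the described behaviour, not the merged Scala code.
class LogFileHandlerSketch {

    Object handleRequestTaskManagerLog(String logFilePath) throws IOException {
        if (logFilePath == null) {
            // previously an IOException was thrown here, which could bring down the actor;
            // now the failure is reported back to the sender instead
            return errorResponse("TaskManager log file path is not configured.");
        }

        File logFile = new File(logFilePath);
        if (!logFile.exists()) {
            return errorResponse("TaskManager log file " + logFile.getAbsolutePath() + " does not exist.");
        }

        try (FileInputStream in = new FileInputStream(logFile)) {
            return readAndUploadLog(in);
        }
    }

    // placeholders, not real Flink methods
    private Object errorResponse(String message) { return message; }
    private Object readAndUploadLog(FileInputStream in) { return in; }
}
{code}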

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 5298_tm_log

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2974.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2974


commit c135d1940e61a4a80b274042e6e095f3369ec911
Author: zentol 
Date:   2016-12-08T18:28:12Z

[FLINK-5298] TM checks that log file exists




> TaskManager crashes when TM log not existant
> 
>
> Key: FLINK-5298
> URL: https://issues.apache.org/jira/browse/FLINK-5298
> Project: Flink
>  Issue Type: Bug
>  Components: TaskManager, Webfrontend
>Affects Versions: 1.1.0, 1.2.0
>Reporter: Mischa Krüger
>Assignee: Chesnay Schepler
>Priority: Trivial
> Fix For: 1.2.0
>
>
> {code}
> java.io.FileNotFoundException: flink-taskmanager.out (No such file or 
> directory)
> at java.io.FileInputStream.open0(Native Method)
> at java.io.FileInputStream.open(FileInputStream.java:195)
> at java.io.FileInputStream.(FileInputStream.java:138)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$handleRequestTaskManagerLog(TaskManager.scala:833)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:340)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> at 
> org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:44)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
> at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
> at 
> org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
> at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:122)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> at akka.actor.ActorCell.invoke(ActorCell.scala:487)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
> at akka.dispatch.Mailbox.run(Mailbox.scala:221)
> at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 2016-12-08 16:45:14,995 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - Stopping 
> TaskManager akka://flink/user/taskmanager#1361882659.
> 2016-12-08 16:45:14,995 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - 
> Disassociating from JobManager
> 2016-12-08 16:45:14,997 INFO  org.apache.flink.runtime.blob.BlobCache 
>   - Shutting down BlobCache
> 2016-12-08 16:45:15,006 INFO  
> org.apache.flink.runtime.io.disk.iomanager.IOManager  - I/O manager 
> removed spill file directory 
> /tmp/flink-io-e61f717b-630c-4a2a-b3e3-62ccb40aa2f9
> 2016-12-08 16:45:15,006 INFO  
> org.apache.flink.runtime.io.network.NetworkEnvironment- Shutting down 
> the network environment and its components.
> 2016-12-08 16:45:15,008 INFO  
> 

[GitHub] flink pull request #2974: [FLINK-5298] TM checks that log file exists

2016-12-08 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/2974

[FLINK-5298] TM checks that log file exists

This PR slightly modifies `TaskManager#handleRequestTaskManagerLog`. 
First, it verifies that the log file actually exists before opening it. 
Second, if the logFilePathOption is empty it no longer throws an IOException 
(which _could_ crash the TM) but instead forwards the error to the sender.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 5298_tm_log

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2974.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2974


commit c135d1940e61a4a80b274042e6e095f3369ec911
Author: zentol 
Date:   2016-12-08T18:28:12Z

[FLINK-5298] TM checks that log file exists




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5206) Flakey PythonPlanBinderTest

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732997#comment-15732997
 ] 

ASF GitHub Bot commented on FLINK-5206:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/2973

[FLINK-5206] [py] Use random file names in tests

This PR modifies the Python tests to use random file names.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 5206_py_tests

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2973.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2973


commit 30abf5628766886775dfbd9c49518a49da222afc
Author: zentol 
Date:   2016-12-08T18:29:03Z

[FLINK-5206] [py] Use random file names in tests




> Flakey PythonPlanBinderTest
> ---
>
> Key: FLINK-5206
> URL: https://issues.apache.org/jira/browse/FLINK-5206
> Project: Flink
>  Issue Type: Bug
>  Components: Python API
>Affects Versions: 1.2.0
> Environment: in TravisCI
>Reporter: Nico Kruber
>Assignee: Chesnay Schepler
>  Labels: test-stability
>
> {code:none}
> ---
>  T E S T S
> ---
> Running org.apache.flink.python.api.PythonPlanBinderTest
> Job execution failed.
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:903)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:846)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:846)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
>   at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: java.io.IOException: Output path '/tmp/flink/result2' could not be 
> initialized. Canceling task...
>   at 
> org.apache.flink.api.common.io.FileOutputFormat.open(FileOutputFormat.java:233)
>   at 
> org.apache.flink.api.java.io.TextOutputFormat.open(TextOutputFormat.java:78)
>   at 
> org.apache.flink.runtime.operators.DataSinkTask.invoke(DataSinkTask.java:178)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:654)
>   at java.lang.Thread.run(Thread.java:745)
> Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 36.891 sec 
> <<< FAILURE! - in org.apache.flink.python.api.PythonPlanBinderTest
> testJobWithoutObjectReuse(org.apache.flink.python.api.PythonPlanBinderTest)  
> Time elapsed: 11.53 sec  <<< FAILURE!
> java.lang.AssertionError: Error while calling the test program: Job execution 
> failed.
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.flink.test.util.JavaProgramTestBase.testJobWithoutObjectReuse(JavaProgramTestBase.java:180)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> 

[GitHub] flink pull request #2973: [FLINK-5206] [py] Use random file names in tests

2016-12-08 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/2973

[FLINK-5206] [py] Use random file names in tests

This PR modifies the Python tests to use random file names.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 5206_py_tests

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2973.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2973


commit 30abf5628766886775dfbd9c49518a49da222afc
Author: zentol 
Date:   2016-12-08T18:29:03Z

[FLINK-5206] [py] Use random file names in tests




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5211) Include an example configuration for all reporters

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732980#comment-15732980
 ] 

ASF GitHub Bot commented on FLINK-5211:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/2972

[FLINK-5211] [metrics] [docs] Include example reporter configuration

CC @tillrohrmann 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 5211_docs_reporter

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2972.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2972


commit 7839152db689241d851f218907103ad3a191f974
Author: zentol 
Date:   2016-11-30T15:20:45Z

[FLINK-5211] [metrics] [docs] Include example reporter configuration




> Include an example configuration for all reporters
> --
>
> Key: FLINK-5211
> URL: https://issues.apache.org/jira/browse/FLINK-5211
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Metrics
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.2.0
>
>
> We should extend the reporter documentation to include an example 
> configuration for every reporter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2972: [FLINK-5211] [metrics] [docs] Include example repo...

2016-12-08 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/2972

[FLINK-5211] [metrics] [docs] Include example reporter configuration

CC @tillrohrmann 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 5211_docs_reporter

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2972.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2972


commit 7839152db689241d851f218907103ad3a191f974
Author: zentol 
Date:   2016-11-30T15:20:45Z

[FLINK-5211] [metrics] [docs] Include example reporter configuration




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-5298) TaskManager crashes when TM log not existant

2016-12-08 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-5298:

Component/s: Webfrontend
 TaskManager

> TaskManager crashes when TM log not existant
> 
>
> Key: FLINK-5298
> URL: https://issues.apache.org/jira/browse/FLINK-5298
> Project: Flink
>  Issue Type: Bug
>  Components: TaskManager, Webfrontend
>Affects Versions: 1.1.0, 1.2.0
>Reporter: Mischa Krüger
>Assignee: Chesnay Schepler
>Priority: Trivial
> Fix For: 1.2.0
>
>
> {code}
> java.io.FileNotFoundException: flink-taskmanager.out (No such file or 
> directory)
> at java.io.FileInputStream.open0(Native Method)
> at java.io.FileInputStream.open(FileInputStream.java:195)
> at java.io.FileInputStream.<init>(FileInputStream.java:138)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$handleRequestTaskManagerLog(TaskManager.scala:833)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:340)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> at 
> org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:44)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
> at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
> at 
> org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
> at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:122)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> at akka.actor.ActorCell.invoke(ActorCell.scala:487)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
> at akka.dispatch.Mailbox.run(Mailbox.scala:221)
> at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 2016-12-08 16:45:14,995 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - Stopping 
> TaskManager akka://flink/user/taskmanager#1361882659.
> 2016-12-08 16:45:14,995 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - 
> Disassociating from JobManager
> 2016-12-08 16:45:14,997 INFO  org.apache.flink.runtime.blob.BlobCache 
>   - Shutting down BlobCache
> 2016-12-08 16:45:15,006 INFO  
> org.apache.flink.runtime.io.disk.iomanager.IOManager  - I/O manager 
> removed spill file directory 
> /tmp/flink-io-e61f717b-630c-4a2a-b3e3-62ccb40aa2f9
> 2016-12-08 16:45:15,006 INFO  
> org.apache.flink.runtime.io.network.NetworkEnvironment- Shutting down 
> the network environment and its components.
> 2016-12-08 16:45:15,008 INFO  
> org.apache.flink.runtime.io.network.netty.NettyClient - Successful 
> shutdown (took 1 ms).
> 2016-12-08 16:45:15,009 INFO  
> org.apache.flink.runtime.io.network.netty.NettyServer - Successful 
> shutdown (took 0 ms).
> 2016-12-08 16:45:15,020 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - Task 
> manager akka://flink/user/taskmanager is completely shut down.
> 2016-12-08 16:45:15,023 ERROR 
> org.apache.flink.runtime.taskmanager.TaskManager  - Actor 
> akka://flink/user/taskmanager#1361882659 terminated, stopping process...
> {code}
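
As an illustration of the kind of guard that avoids this crash (a hypothetical sketch with made-up names, not the actual fix), the log request handler could check for the file before opening it:

{code}
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.util.Optional;

// Hypothetical helper (illustrative names): serve the TaskManager log only if the
// file is actually there, so a missing flink-taskmanager.out leads to an "empty"
// answer for the web frontend instead of an exception that stops the actor.
public final class LogFileAccess {

    public static Optional<InputStream> openLogFileIfPresent(File logFile)
            throws FileNotFoundException {
        if (!logFile.isFile()) {
            // No log file present (e.g. stdout was redirected elsewhere).
            return Optional.empty();
        }
        return Optional.of(new FileInputStream(logFile));
    }
}
{code}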



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-3902) Discarded FileSystem checkpoints are lingering around

2016-12-08 Thread Ufuk Celebi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi closed FLINK-3902.
--
Resolution: Not A Bug

This is actually an artifact of the way we try to delete files and how this 
interacts with HDFS (which logs this as an Exception).

> Discarded FileSystem checkpoints are lingering around
> -
>
> Key: FLINK-3902
> URL: https://issues.apache.org/jira/browse/FLINK-3902
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.0.2
>Reporter: Ufuk Celebi
>
> A user reported that checkpoints with {{FSStateBackend}} are not properly 
> cleaned up.
> {code}
> 2016-05-10 12:21:06,559 INFO BlockStateChange: BLOCK* addToInvalidates: 
> blk_1084791727_11053122 10.10.113.10:50010
> 2016-05-10 12:21:06,559 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 9 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.delete from 
> 10.10.113.9:49233 Call#12337 Retry#0
> org.apache.hadoop.fs.PathIsNotEmptyDirectoryException: 
> `/flink/checkpoints_test/570d6e67d571c109daab468e5678402b/chk-62 is non 
> empty': Directory is not empty
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirDeleteOp.delete(FSDirDeleteOp.java:85)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3712)
> {code}
> {code}
> 2016-05-10 12:20:22,636 [Checkpoint Timer] INFO 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering 
> checkpoint 62 @ 1462875622636
> 2016-05-10 12:20:32,507 [flink-akka.actor.default-dispatcher-240088] INFO  
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed 
> checkpoint 62 (in 9843 ms)
> 2016-05-10 12:20:52,637 [Checkpoint Timer] INFO 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering 
> checkpoint 63 @ 1462875652637
> 2016-05-10 12:21:06,563 [flink-akka.actor.default-dispatcher-240028] INFO  
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed 
> checkpoint 63 (in 13909 ms)
> 2016-05-10 12:21:22,636 [Checkpoint Timer] INFO 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering 
> checkpoint 64 @ 1462875682636
> {code}
> Running the same program with the {{RocksDBBackend}} works as expected and 
> clears the old checkpoints properly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5300) FileStateHandle#discard & FsCheckpointStateOutputStream#close tries to delete non-empty directory

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732934#comment-15732934
 ] 

ASF GitHub Bot commented on FLINK-5300:
---

GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/2971

[backport] [FLINK-5300] Add more gentle file deletion procedure

Backport of #2970 to the release-1.1 branch.

Before deleting a parent directory, always check whether the directory still 
contains any files. If it does not, then try to delete the parent directory.

This gives a gentler behaviour with respect to storage systems, which are then 
never instructed to delete a non-empty directory.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink 
backportMoreGentleFileDeletion

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2971.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2971


commit fbce253eddd2ea6ce681b8881a5b6b8d470d861b
Author: Till Rohrmann 
Date:   2016-12-08T17:53:40Z

[FLINK-5300] Add more gentle file deletion procedure

Before deleting a parent directory, always check whether the directory still 
contains any files. If it does not, then try to delete the parent directory.

This gives a gentler behaviour with respect to storage systems, which are then 
never instructed to delete a non-empty directory.
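
A minimal sketch of this check, assuming plain java.nio.file APIs (the actual change works against Flink's own file system abstraction, so details will differ):

{code}
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public final class GentleDeletion {

    /** Deletes the given file and removes its parent directory only if the parent is empty. */
    public static void deleteAndMaybeRemoveParent(Path file) throws IOException {
        Files.deleteIfExists(file);

        Path parent = file.getParent();
        if (parent != null && isEmpty(parent)) {
            // Only issue the delete once we know the directory contains no entries,
            // so the storage system is never asked to remove a non-empty directory.
            Files.deleteIfExists(parent);
        }
    }

    private static boolean isEmpty(Path dir) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return !entries.iterator().hasNext();
        }
    }
}
{code}

Note that check-then-delete is inherently racy: another writer may add a file in between, so a failed directory delete still has to be treated as best effort.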




> FileStateHandle#discard & FsCheckpointStateOutputStream#close tries to delete 
> non-empty directory
> -
>
> Key: FLINK-5300
> URL: https://issues.apache.org/jira/browse/FLINK-5300
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Critical
>
> Flink's behaviour of deleting {{FileStateHandles}} and closing 
> {{FsCheckpointStateOutputStream}} always triggers a delete operation on the 
> parent directory. Often this call will fail because the directory still 
> contains some other files.
> A user reported that the SRE of their Hadoop cluster noticed this behaviour 
> in the logs. It might be more system friendly if we first checked whether the 
> directory is empty or not. This would prevent many error messages from 
> appearing in the Hadoop logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2971: [backport] [FLINK-5300] Add more gentle file delet...

2016-12-08 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/2971

[backport] [FLINK-5300] Add more gentle file deletion procedure

Backport of #2970 to the release-1.1 branch.

Before deleting a parent directory, always check whether the directory still 
contains any files. If it does not, then try to delete the parent directory.

This gives a gentler behaviour with respect to storage systems, which are then 
never instructed to delete a non-empty directory.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink 
backportMoreGentleFileDeletion

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2971.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2971


commit fbce253eddd2ea6ce681b8881a5b6b8d470d861b
Author: Till Rohrmann 
Date:   2016-12-08T17:53:40Z

[FLINK-5300] Add more gentle file deletion procedure

Before deleting a parent directory, always check whether the directory still 
contains any files. If it does not, then try to delete the parent directory.

This gives a gentler behaviour with respect to storage systems, which are then 
never instructed to delete a non-empty directory.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-5298) TaskManager crashes when TM log not existant

2016-12-08 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-5298:

Affects Version/s: 1.2.0
   1.1.0

> TaskManager crashes when TM log not existant
> 
>
> Key: FLINK-5298
> URL: https://issues.apache.org/jira/browse/FLINK-5298
> Project: Flink
>  Issue Type: Bug
>  Components: TaskManager, Webfrontend
>Affects Versions: 1.1.0, 1.2.0
>Reporter: Mischa Krüger
>Assignee: Chesnay Schepler
>Priority: Trivial
> Fix For: 1.2.0
>
>
> {code}
> java.io.FileNotFoundException: flink-taskmanager.out (No such file or 
> directory)
> at java.io.FileInputStream.open0(Native Method)
> at java.io.FileInputStream.open(FileInputStream.java:195)
> at java.io.FileInputStream.<init>(FileInputStream.java:138)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$handleRequestTaskManagerLog(TaskManager.scala:833)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:340)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> at 
> org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:44)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
> at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
> at 
> org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
> at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:122)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> at akka.actor.ActorCell.invoke(ActorCell.scala:487)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
> at akka.dispatch.Mailbox.run(Mailbox.scala:221)
> at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 2016-12-08 16:45:14,995 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - Stopping 
> TaskManager akka://flink/user/taskmanager#1361882659.
> 2016-12-08 16:45:14,995 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - 
> Disassociating from JobManager
> 2016-12-08 16:45:14,997 INFO  org.apache.flink.runtime.blob.BlobCache 
>   - Shutting down BlobCache
> 2016-12-08 16:45:15,006 INFO  
> org.apache.flink.runtime.io.disk.iomanager.IOManager  - I/O manager 
> removed spill file directory 
> /tmp/flink-io-e61f717b-630c-4a2a-b3e3-62ccb40aa2f9
> 2016-12-08 16:45:15,006 INFO  
> org.apache.flink.runtime.io.network.NetworkEnvironment- Shutting down 
> the network environment and its components.
> 2016-12-08 16:45:15,008 INFO  
> org.apache.flink.runtime.io.network.netty.NettyClient - Successful 
> shutdown (took 1 ms).
> 2016-12-08 16:45:15,009 INFO  
> org.apache.flink.runtime.io.network.netty.NettyServer - Successful 
> shutdown (took 0 ms).
> 2016-12-08 16:45:15,020 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - Task 
> manager akka://flink/user/taskmanager is completely shut down.
> 2016-12-08 16:45:15,023 ERROR 
> org.apache.flink.runtime.taskmanager.TaskManager  - Actor 
> akka://flink/user/taskmanager#1361882659 terminated, stopping process...
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-5298) TaskManager crashes when TM log not existant

2016-12-08 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-5298:

Fix Version/s: 1.2.0

> TaskManager crashes when TM log not existant
> 
>
> Key: FLINK-5298
> URL: https://issues.apache.org/jira/browse/FLINK-5298
> Project: Flink
>  Issue Type: Bug
>  Components: TaskManager, Webfrontend
>Affects Versions: 1.1.0, 1.2.0
>Reporter: Mischa Krüger
>Assignee: Chesnay Schepler
>Priority: Trivial
> Fix For: 1.2.0
>
>
> {code}
> java.io.FileNotFoundException: flink-taskmanager.out (No such file or 
> directory)
> at java.io.FileInputStream.open0(Native Method)
> at java.io.FileInputStream.open(FileInputStream.java:195)
> at java.io.FileInputStream.<init>(FileInputStream.java:138)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$handleRequestTaskManagerLog(TaskManager.scala:833)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:340)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> at 
> org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:44)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
> at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
> at 
> org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
> at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:122)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> at akka.actor.ActorCell.invoke(ActorCell.scala:487)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
> at akka.dispatch.Mailbox.run(Mailbox.scala:221)
> at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 2016-12-08 16:45:14,995 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - Stopping 
> TaskManager akka://flink/user/taskmanager#1361882659.
> 2016-12-08 16:45:14,995 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - 
> Disassociating from JobManager
> 2016-12-08 16:45:14,997 INFO  org.apache.flink.runtime.blob.BlobCache 
>   - Shutting down BlobCache
> 2016-12-08 16:45:15,006 INFO  
> org.apache.flink.runtime.io.disk.iomanager.IOManager  - I/O manager 
> removed spill file directory 
> /tmp/flink-io-e61f717b-630c-4a2a-b3e3-62ccb40aa2f9
> 2016-12-08 16:45:15,006 INFO  
> org.apache.flink.runtime.io.network.NetworkEnvironment- Shutting down 
> the network environment and its components.
> 2016-12-08 16:45:15,008 INFO  
> org.apache.flink.runtime.io.network.netty.NettyClient - Successful 
> shutdown (took 1 ms).
> 2016-12-08 16:45:15,009 INFO  
> org.apache.flink.runtime.io.network.netty.NettyServer - Successful 
> shutdown (took 0 ms).
> 2016-12-08 16:45:15,020 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - Task 
> manager akka://flink/user/taskmanager is completely shut down.
> 2016-12-08 16:45:15,023 ERROR 
> org.apache.flink.runtime.taskmanager.TaskManager  - Actor 
> akka://flink/user/taskmanager#1361882659 terminated, stopping process...
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-5298) TaskManager crashes when TM log not existant

2016-12-08 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-5298:

Priority: Trivial  (was: Major)

> TaskManager crashes when TM log not existant
> 
>
> Key: FLINK-5298
> URL: https://issues.apache.org/jira/browse/FLINK-5298
> Project: Flink
>  Issue Type: Bug
>  Components: TaskManager, Webfrontend
>Affects Versions: 1.1.0, 1.2.0
>Reporter: Mischa Krüger
>Assignee: Chesnay Schepler
>Priority: Trivial
> Fix For: 1.2.0
>
>
> {code}
> java.io.FileNotFoundException: flink-taskmanager.out (No such file or 
> directory)
> at java.io.FileInputStream.open0(Native Method)
> at java.io.FileInputStream.open(FileInputStream.java:195)
> at java.io.FileInputStream.<init>(FileInputStream.java:138)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$handleRequestTaskManagerLog(TaskManager.scala:833)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:340)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> at 
> org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:44)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
> at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
> at 
> org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
> at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:122)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> at akka.actor.ActorCell.invoke(ActorCell.scala:487)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
> at akka.dispatch.Mailbox.run(Mailbox.scala:221)
> at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 2016-12-08 16:45:14,995 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - Stopping 
> TaskManager akka://flink/user/taskmanager#1361882659.
> 2016-12-08 16:45:14,995 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - 
> Disassociating from JobManager
> 2016-12-08 16:45:14,997 INFO  org.apache.flink.runtime.blob.BlobCache 
>   - Shutting down BlobCache
> 2016-12-08 16:45:15,006 INFO  
> org.apache.flink.runtime.io.disk.iomanager.IOManager  - I/O manager 
> removed spill file directory 
> /tmp/flink-io-e61f717b-630c-4a2a-b3e3-62ccb40aa2f9
> 2016-12-08 16:45:15,006 INFO  
> org.apache.flink.runtime.io.network.NetworkEnvironment- Shutting down 
> the network environment and its components.
> 2016-12-08 16:45:15,008 INFO  
> org.apache.flink.runtime.io.network.netty.NettyClient - Successful 
> shutdown (took 1 ms).
> 2016-12-08 16:45:15,009 INFO  
> org.apache.flink.runtime.io.network.netty.NettyServer - Successful 
> shutdown (took 0 ms).
> 2016-12-08 16:45:15,020 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - Task 
> manager akka://flink/user/taskmanager is completely shut down.
> 2016-12-08 16:45:15,023 ERROR 
> org.apache.flink.runtime.taskmanager.TaskManager  - Actor 
> akka://flink/user/taskmanager#1361882659 terminated, stopping process...
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2970: [FLINK-5300] Add more gentle file deletion procedu...

2016-12-08 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/2970

[FLINK-5300] Add more gentle file deletion procedure

Before deleting a parent directory, always check whether the directory still 
contains any files. If it does not, then try to delete the parent directory.

This gives a gentler behaviour with respect to storage systems, which are then 
never instructed to delete a non-empty directory.

cc: @StefanRRichter 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink moreGentleFileDeletion

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2970.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2970


commit 62d3d28f675aaad203d655848d30e9fb916af43b
Author: Till Rohrmann 
Date:   2016-12-08T17:53:40Z

[FLINK-5300] Add more gentle file deletion procedure

Before deleting a parent directory, always check whether the directory still 
contains any files. If it does not, then try to delete the parent directory.

This gives a gentler behaviour with respect to storage systems, which are then 
never instructed to delete a non-empty directory.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5300) FileStateHandle#discard & FsCheckpointStateOutputStream#close tries to delete non-empty directory

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732927#comment-15732927
 ] 

ASF GitHub Bot commented on FLINK-5300:
---

GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/2970

[FLINK-5300] Add more gentle file deletion procedure

Before deleting a parent directory, always check whether the directory still 
contains any files. If it does not, then try to delete the parent directory.

This gives a gentler behaviour with respect to storage systems, which are then 
never instructed to delete a non-empty directory.

cc: @StefanRRichter 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink moreGentleFileDeletion

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2970.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2970


commit 62d3d28f675aaad203d655848d30e9fb916af43b
Author: Till Rohrmann 
Date:   2016-12-08T17:53:40Z

[FLINK-5300] Add more gentle file deletion procedure

Before deleting a parent directory, always check whether the directory still 
contains any files. If it does not, then try to delete the parent directory.

This gives a gentler behaviour with respect to storage systems, which are then 
never instructed to delete a non-empty directory.




> FileStateHandle#discard & FsCheckpointStateOutputStream#close tries to delete 
> non-empty directory
> -
>
> Key: FLINK-5300
> URL: https://issues.apache.org/jira/browse/FLINK-5300
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Critical
>
> Flink's behaviour of deleting {{FileStateHandles}} and closing 
> {{FsCheckpointStateOutputStream}} always triggers a delete operation on the 
> parent directory. Often this call will fail because the directory still 
> contains some other files.
> A user reported that the SRE of their Hadoop cluster noticed this behaviour 
> in the logs. It might be more system friendly if we first checked whether the 
> directory is empty or not. This would prevent many error messages from 
> appearing in the Hadoop logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-5298) TaskManager crashes when TM log not existant

2016-12-08 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-5298:
---

Assignee: Chesnay Schepler

> TaskManager crashes when TM log not existant
> 
>
> Key: FLINK-5298
> URL: https://issues.apache.org/jira/browse/FLINK-5298
> Project: Flink
>  Issue Type: Bug
>Reporter: Mischa Krüger
>Assignee: Chesnay Schepler
>
> {code}
> java.io.FileNotFoundException: flink-taskmanager.out (No such file or 
> directory)
> at java.io.FileInputStream.open0(Native Method)
> at java.io.FileInputStream.open(FileInputStream.java:195)
> at java.io.FileInputStream.<init>(FileInputStream.java:138)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$handleRequestTaskManagerLog(TaskManager.scala:833)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:340)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> at 
> org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:44)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
> at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
> at 
> org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
> at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:122)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> at akka.actor.ActorCell.invoke(ActorCell.scala:487)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
> at akka.dispatch.Mailbox.run(Mailbox.scala:221)
> at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 2016-12-08 16:45:14,995 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - Stopping 
> TaskManager akka://flink/user/taskmanager#1361882659.
> 2016-12-08 16:45:14,995 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - 
> Disassociating from JobManager
> 2016-12-08 16:45:14,997 INFO  org.apache.flink.runtime.blob.BlobCache 
>   - Shutting down BlobCache
> 2016-12-08 16:45:15,006 INFO  
> org.apache.flink.runtime.io.disk.iomanager.IOManager  - I/O manager 
> removed spill file directory 
> /tmp/flink-io-e61f717b-630c-4a2a-b3e3-62ccb40aa2f9
> 2016-12-08 16:45:15,006 INFO  
> org.apache.flink.runtime.io.network.NetworkEnvironment- Shutting down 
> the network environment and its components.
> 2016-12-08 16:45:15,008 INFO  
> org.apache.flink.runtime.io.network.netty.NettyClient - Successful 
> shutdown (took 1 ms).
> 2016-12-08 16:45:15,009 INFO  
> org.apache.flink.runtime.io.network.netty.NettyServer - Successful 
> shutdown (took 0 ms).
> 2016-12-08 16:45:15,020 INFO  
> org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager  - Task 
> manager akka://flink/user/taskmanager is completely shut down.
> 2016-12-08 16:45:15,023 ERROR 
> org.apache.flink.runtime.taskmanager.TaskManager  - Actor 
> akka://flink/user/taskmanager#1361882659 terminated, stopping process...
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2821) Change Akka configuration to allow accessing actors from different URLs

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732925#comment-15732925
 ] 

ASF GitHub Bot commented on FLINK-2821:
---

Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2917
  
@StephanEwen I've updated the pull request to incorporate your suggestions. 
When an IPv6 address is specified, we format it as in the current code base. 
When a hostname is specified, we do some simple validation but do not resolve 
it. IPv4 addresses are simply used as-is. 

I hope that this PR becomes more mergeable in this state. With the help of 
some eager users we can further test this in a cluster environment.
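
As a rough, hypothetical sketch of that behaviour (assumed for illustration; the PR's actual code will differ):

{code}
// Hypothetical sketch: normalize the host part of an actor address without any DNS lookup.
public final class HostNormalizer {

    public static String normalizeHost(String host) {
        if (host.contains(":")) {
            // Treat anything containing ':' as an IPv6 literal; actor URLs expect it bracketed.
            String address = (host.startsWith("[") && host.endsWith("]"))
                    ? host.substring(1, host.length() - 1)
                    : host;
            return "[" + address.toLowerCase() + "]";
        }
        // Hostnames and IPv4 literals: only a simple syntactic check, no resolution.
        if (!host.matches("[a-zA-Z0-9.-]+")) {
            throw new IllegalArgumentException("Invalid host: " + host);
        }
        return host.toLowerCase();
    }
}
{code}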


> Change Akka configuration to allow accessing actors from different URLs
> ---
>
> Key: FLINK-2821
> URL: https://issues.apache.org/jira/browse/FLINK-2821
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Reporter: Robert Metzger
>Assignee: Maximilian Michels
>
> Akka expects the actor's URL to match exactly.
> As pointed out here, cases where users were complaining about this: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Error-trying-to-access-JM-through-proxy-td3018.html
>   - Proxy routing (as described here, send to the proxy URL, receiver 
> recognizes only original URL)
>   - Using hostname / IP interchangeably does not work (we solved this by 
> always putting IP addresses into URLs, never hostnames)
>   - Binding to multiple interfaces (any local 0.0.0.0) does not work. Still 
> no solution to that (but seems not too much of a restriction)
> I am aware that this is not possible due to Akka, so it is actually not a 
> Flink bug. But I think we should track the resolution of the issue here 
> anyway because it's affecting our users' satisfaction.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2917: [FLINK-2821] use custom Akka build to listen on all inter...

2016-12-08 Thread mxm
Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2917
  
@StephanEwen I've updated the pull request to incorporate your suggestions. 
When an IPv6 address is specified, we format it as in the current code base. 
When a hostname is specified, we do some simple validation but do not resolve 
it. IPv4 addresses are simply used as-is. 

I hope that this PR becomes more mergeable in this state. With the help of 
some eager users we can further test this in a cluster environment.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Comment Edited] (FLINK-1536) Graph partitioning operators for Gelly

2016-12-08 Thread Ivan Mushketyk (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732898#comment-15732898
 ] 

Ivan Mushketyk edited comment on FLINK-1536 at 12/8/16 6:02 PM:


Do I understand correctly that as the result of this issue we should have an 
interface like this:

List partition(Graph graph);

namely, we will have something that takes a graph as an input and produces a 
list of partitions that we are interested in?

What confuses me is this part of the Gelly roadmap 
(https://cwiki.apache.org/confluence/display/FLINK/Flink+Gelly): 

"Graph Partitioning plays a key role in application parallelization and in 
scaling data analysis up. Processes need to evenly be assigned to machines 
while maintaining communication costs to a minimum."

Does it mean that partitioning should be a step in other graph processing 
algorithms? If so how is it supposed to be used?

From the same document. What is "hash/random partitioning"?


was (Author: ivan.mushketyk):
Do I understand correctly that as the result of this issue we should have an 
interface like this:

List partition(Graph graph);

namely, we will have something that takes a graph as an input and produces a 
list of partitions that we are interested in?

What confuses me is this part of the Gelly roadmap 
(https://cwiki.apache.org/confluence/display/FLINK/Flink+Gelly): 

"Graph Partitioning plays a key role in application parallelization and in 
scaling data analysis up. Processes need to evenly be assigned to machines 
while maintaining communication costs to a minimum."

Does it mean that partitioning should be a step in other graph processing 
algorithms?

From the same document. What is "hash/random partitioning"?

> Graph partitioning operators for Gelly
> --
>
> Key: FLINK-1536
> URL: https://issues.apache.org/jira/browse/FLINK-1536
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Ivan Mushketyk
>Priority: Minor
>
> Smart graph partitioning can significantly improve the performance and 
> scalability of graph analysis applications. Depending on the computation 
> pattern, a graph partitioning algorithm divides the graph into (maybe 
> overlapping) subgraphs, optimizing some objective. For example, if 
> communication is performed across graph edges, one might want to minimize the 
> edges that cross from one partition to another.
> The problem of graph partitioning is a well studied problem and several 
> algorithms have been proposed in the literature. The goal of this project 
> would be to choose a few existing partitioning techniques and implement the 
> corresponding graph partitioning operators for Gelly.
> Some related literature can be found [here| 
> http://www.citeulike.org/user/vasiakalavri/tag/graph-partitioning].
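
Purely for illustration, the operator shape being asked about might look like the following hypothetical interface (not an existing Gelly API; Graph is Gelly's org.apache.flink.graph.Graph):

{code}
import java.io.Serializable;
import java.util.List;

import org.apache.flink.graph.Graph;

/**
 * Hypothetical partitioning operator: splits a graph into (possibly overlapping)
 * subgraphs while optimizing some objective, e.g. minimizing cut edges.
 */
public interface GraphPartitioner<K, VV, EV> extends Serializable {

    /** Returns the list of partitions produced for the input graph. */
    List<Graph<K, VV, EV>> partition(Graph<K, VV, EV> graph);
}
{code}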



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-1536) Graph partitioning operators for Gelly

2016-12-08 Thread Ivan Mushketyk (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732898#comment-15732898
 ] 

Ivan Mushketyk edited comment on FLINK-1536 at 12/8/16 6:01 PM:


Do I understand correctly that as the result of this issue we should have an 
interface like this:

List partition(Graph graph);

namely, we will have something that takes a graph as an input and produces a 
list of partitions that we are interested in?

What confuses me is this part of the Gelly roadmap 
(https://cwiki.apache.org/confluence/display/FLINK/Flink+Gelly): 

"Graph Partitioning plays a key role in application parallelization and in 
scaling data analysis up. Processes need to evenly be assigned to machines 
while maintaining communication costs to a minimum."

Does it mean that partitioning should be a step in other graph processing 
algorithms?

From the same document. What is "hash/random partitioning"?


was (Author: ivan.mushketyk):
Do I understand correctly that as the result of this issue we should have an 
interface like this:

List partition(Graph graph);

namely, we will have something that takes a graph as an input and produces a 
list of partitions that we are interested in?

> Graph partitioning operators for Gelly
> --
>
> Key: FLINK-1536
> URL: https://issues.apache.org/jira/browse/FLINK-1536
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Ivan Mushketyk
>Priority: Minor
>
> Smart graph partitioning can significantly improve the performance and 
> scalability of graph analysis applications. Depending on the computation 
> pattern, a graph partitioning algorithm divides the graph into (maybe 
> overlapping) subgraphs, optimizing some objective. For example, if 
> communication is performed across graph edges, one might want to minimize the 
> edges that cross from one partition to another.
> The problem of graph partitioning is a well studied problem and several 
> algorithms have been proposed in the literature. The goal of this project 
> would be to choose a few existing partitioning techniques and implement the 
> corresponding graph partitioning operators for Gelly.
> Some related literature can be found [here| 
> http://www.citeulike.org/user/vasiakalavri/tag/graph-partitioning].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1536) Graph partitioning operators for Gelly

2016-12-08 Thread Ivan Mushketyk (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732898#comment-15732898
 ] 

Ivan Mushketyk commented on FLINK-1536:
---

Do I understand correctly that as the result of this issue we should have an 
interface like this:

List partition(Graph graph);

namely, we will have something that takes a graph as an input and produces a 
list of partitions that we are interested in?

> Graph partitioning operators for Gelly
> --
>
> Key: FLINK-1536
> URL: https://issues.apache.org/jira/browse/FLINK-1536
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Ivan Mushketyk
>Priority: Minor
>
> Smart graph partitioning can significantly improve the performance and 
> scalability of graph analysis applications. Depending on the computation 
> pattern, a graph partitioning algorithm divides the graph into (maybe 
> overlapping) subgraphs, optimizing some objective. For example, if 
> communication is performed across graph edges, one might want to minimize the 
> edges that cross from one partition to another.
> The problem of graph partitioning is a well studied problem and several 
> algorithms have been proposed in the literature. The goal of this project 
> would be to choose a few existing partitioning techniques and implement the 
> corresponding graph partitioning operators for Gelly.
> Some related literature can be found [here| 
> http://www.citeulike.org/user/vasiakalavri/tag/graph-partitioning].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3921) StringParser not specifying encoding to use

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732896#comment-15732896
 ] 

ASF GitHub Bot commented on FLINK-3921:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/2901
  
I'm not quite following but I think we have the same idea.


> StringParser not specifying encoding to use
> ---
>
> Key: FLINK-3921
> URL: https://issues.apache.org/jira/browse/FLINK-3921
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.3
>Reporter: Tatu Saloranta
>Assignee: Rekha Joshi
>Priority: Trivial
>
> Class `flink.types.parser.StringParser` has javadocs indicating that contents 
> are expected to be Ascii, similar to `StringValueParser`. That makes sense, 
> but when constructing the actual instance, no encoding is specified; on line 66 
> f.ex:
>this.result = new String(bytes, startPos+1, i - startPos - 2);
> which leads to using whatever the default platform encoding is. If contents 
> really are always Ascii (would not count on that, as the parser is used from the 
> CSV reader), not a big deal, but it can lead to the usual Latin-1-VS-UTF-8 issues.
> So I think that encoding should be explicitly specified, whatever is to be 
> used: javadocs claim ascii, so could be "us-ascii", but could well be UTF-8 
> or even ISO-8859-1.
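
A minimal sketch of the suggested fix, assuming a standard-library charset constant (illustrative only, not the actual patch):

{code}
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StringParserEncodingExample {

    public static void main(String[] args) {
        byte[] bytes = "\"hello\"".getBytes(StandardCharsets.US_ASCII);
        int startPos = 0;
        int i = bytes.length;

        // Current behaviour: the platform default charset is used implicitly.
        String implicit = new String(bytes, startPos + 1, i - startPos - 2);

        // Suggested behaviour: state the charset explicitly (US-ASCII, UTF-8, ...).
        Charset charset = StandardCharsets.US_ASCII;
        String explicit = new String(bytes, startPos + 1, i - startPos - 2, charset);

        System.out.println(implicit + " / " + explicit);
    }
}
{code}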



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2901: [FLINK-3921] StringParser encoding

2016-12-08 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/2901
  
I'm not quite following but I think we have the same idea.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (FLINK-5300) FileStateHandle#discard & FsCheckpointStateOutputStream#close tries to delete non-empty directory

2016-12-08 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann reassigned FLINK-5300:


Assignee: Till Rohrmann

> FileStateHandle#discard & FsCheckpointStateOutputStream#close tries to delete 
> non-empty directory
> -
>
> Key: FLINK-5300
> URL: https://issues.apache.org/jira/browse/FLINK-5300
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Critical
>
> Flink's behaviour of deleting {{FileStateHandles}} and closing 
> {{FsCheckpointStateOutputStream}} always triggers a delete operation on the 
> parent directory. Often this call will fail because the directory still 
> contains some other files.
> A user reported that the SRE of their Hadoop cluster noticed this behaviour 
> in the logs. It might be more system friendly if we first checked whether the 
> directory is empty or not. This would prevent many error messages from 
> appearing in the Hadoop logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5301) Can't upload job via Web UI when using a proxy

2016-12-08 Thread JIRA
Mischa Krüger created FLINK-5301:


 Summary: Can't upload job via Web UI when using a proxy
 Key: FLINK-5301
 URL: https://issues.apache.org/jira/browse/FLINK-5301
 Project: Flink
  Issue Type: Bug
  Components: Webfrontend
Reporter: Mischa Krüger


Using DC/OS with Flink service in current development 
(https://github.com/mesosphere/dcos-flink-service). For reproduction:
1. Install a DC/OS cluster
2. Follow the instructions in the mentioned repo for setting up a universe server 
with the flink app.
3. Install the flink app via the universe
4. Access the Web UI
5. Upload a job

Experience:
The upload reaches 100%, and then says "Saving..." forever.

Upload works when using ssh forwarding to access the node directly serving the 
Flink Web UI.

DC/OS uses a proxy to access the Web UI. The webpage is delivered by a 
component called the "Admin Router".

Side note:
Interestingly, the new favicon also does not appear.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5280) Extend TableSource to support nested data

2016-12-08 Thread Ivan Mushketyk (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732870#comment-15732870
 ] 

Ivan Mushketyk commented on FLINK-5280:
---

If this is not urgent and nobody has started working on this, I can give it a 
try.

> Extend TableSource to support nested data
> -
>
> Key: FLINK-5280
> URL: https://issues.apache.org/jira/browse/FLINK-5280
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>
> The {{TableSource}} interface currently only supports the definition of 
> flat rows. 
> However, there are several storage formats for nested data that should be 
> supported such as Avro, Json, Parquet, and Orc. The Table API and SQL can 
> also natively handle nested rows.
> The {{TableSource}} interface and the code to register table sources in 
> Calcite's schema need to be extended to support nested data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3921) StringParser not specifying encoding to use

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732859#comment-15732859
 ] 

ASF GitHub Bot commented on FLINK-3921:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/2901
  
I thought, I'll merge the original commit and squash yours on top. OK?


> StringParser not specifying encoding to use
> ---
>
> Key: FLINK-3921
> URL: https://issues.apache.org/jira/browse/FLINK-3921
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.3
>Reporter: Tatu Saloranta
>Assignee: Rekha Joshi
>Priority: Trivial
>
> Class `flink.types.parser.StringParser` has javadocs indicating that contents 
> are expected to be Ascii, similar to `StringValueParser`. That makes sense, 
> but when constructing the actual instance, no encoding is specified; on line 66 
> f.ex:
>this.result = new String(bytes, startPos+1, i - startPos - 2);
> which leads to using whatever the default platform encoding is. If contents 
> really are always Ascii (would not count on that, as the parser is used from the 
> CSV reader), not a big deal, but it can lead to the usual Latin-1-VS-UTF-8 issues.
> So I think that encoding should be explicitly specified, whatever is to be 
> used: javadocs claim ascii, so could be "us-ascii", but could well be UTF-8 
> or even ISO-8859-1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2901: [FLINK-3921] StringParser encoding

2016-12-08 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/2901
  
I thought, I'll merge the original commit and squash yours on top. OK?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3921) StringParser not specifying encoding to use

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732852#comment-15732852
 ] 

ASF GitHub Bot commented on FLINK-3921:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/2901
  
I had planned to squash these into the original commit (#2060), but 
whatever you think is best.


> StringParser not specifying encoding to use
> ---
>
> Key: FLINK-3921
> URL: https://issues.apache.org/jira/browse/FLINK-3921
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.3
>Reporter: Tatu Saloranta
>Assignee: Rekha Joshi
>Priority: Trivial
>
> Class `flink.types.parser.StringParser` has javadocs indicating that contents 
> are expected to be Ascii, similar to `StringValueParser`. That makes sense, 
> but when constructing the actual instance, no encoding is specified; on line 66 
> f.ex:
>this.result = new String(bytes, startPos+1, i - startPos - 2);
> which leads to using whatever the default platform encoding is. If contents 
> really are always Ascii (would not count on that, as the parser is used from the 
> CSV reader), not a big deal, but it can lead to the usual Latin-1-VS-UTF-8 issues.
> So I think that encoding should be explicitly specified, whatever is to be 
> used: javadocs claim ascii, so could be "us-ascii", but could well be UTF-8 
> or even ISO-8859-1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2901: [FLINK-3921] StringParser encoding

2016-12-08 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/2901
  
I had planned to squash these into the original commit (#2060), but 
whatever you think is best.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5226) Eagerly project unused attributes

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732843#comment-15732843
 ] 

ASF GitHub Bot commented on FLINK-5226:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/2926
  
Merging


> Eagerly project unused attributes
> -
>
> Key: FLINK-5226
> URL: https://issues.apache.org/jira/browse/FLINK-5226
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>
> The optimizer currently does not eagerly remove unused attributes. 
> For example given a table {{tab5}} with five attributes {{a, b, c, d, e}}, 
> the following query
> {code}
> SELECT x.a, y.b FROM tab5 AS x, tab5 AS y WHERE x.a = y.a
> {code}
> would result in the non-optimized plan
> {code}
> LogicalProject(a=[$0], b=[$6])
>   LogicalFilter(condition=[=($0, $5)])
> LogicalJoin(condition=[true], joinType=[inner])
>   LogicalTableScan(table=[[tab5]])
>   LogicalTableScan(table=[[tab5]])
> {code}
> and the optimized plan:
> {code}
> DataSetCalc(select=[a, b0 AS b])
>   DataSetJoin(where=[=(a, a0)], join=[a, b, c, d, e, a0, b0, c0, d0, e0], 
> joinType=[InnerJoin])
> DataSetScan(table=[[_DataSetTable_0]])
> DataSetScan(table=[[_DataSetTable_0]])
> {code}
> This plan is inefficient because it joins all ten attributes of both tables 
> instead of eagerly projecting out all unused fields ({{x.b, x.c, x.d, x.e, 
> y.c, y.d, y.e}}).
> Since this is one of the most common optimizations, I would assume that 
> Calcite provides some rules to extract eager projections. If this is the 
> case, the issue can be solved by adding such rules to {{FlinkRuleSets}}.
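
For illustration, a hedged sketch of how such Calcite rules could be collected into a rule set (assuming ProjectJoinTransposeRule and ProjectFilterTransposeRule with their usual INSTANCE fields; the rules eventually added to FlinkRuleSets may differ):

{code}
import org.apache.calcite.rel.rules.ProjectFilterTransposeRule;
import org.apache.calcite.rel.rules.ProjectJoinTransposeRule;
import org.apache.calcite.tools.RuleSet;
import org.apache.calcite.tools.RuleSets;

public class EagerProjectionRules {

    // Rules that push projections below joins and filters, so unused attributes
    // are dropped before the join instead of being carried through it.
    public static RuleSet projectionPushDownRules() {
        return RuleSets.ofList(
                ProjectJoinTransposeRule.INSTANCE,
                ProjectFilterTransposeRule.INSTANCE);
    }
}
{code}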



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

