[jira] [Commented] (FLINK-6588) Rename NumberOfFullRestarts metric
[ https://issues.apache.org/jira/browse/FLINK-6588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091114#comment-16091114 ]

ASF GitHub Bot commented on FLINK-6588:
---------------------------------------

Github user zjureel commented on the issue: https://github.com/apache/flink/pull/4292

    @zentol What do you think of @StephanEwen's suggestion? I think this change does cause some incompatibilities for users, thanks

> Rename NumberOfFullRestarts metric
> ----------------------------------
>
>                 Key: FLINK-6588
>                 URL: https://issues.apache.org/jira/browse/FLINK-6588
>             Project: Flink
>          Issue Type: Improvement
>          Components: Metrics
>    Affects Versions: 1.3.0, 1.4.0
>            Reporter: Chesnay Schepler
>            Assignee: Fang Yong
>
> The metric for the number of full restarts is currently called {{fullRestarts}}. For clarity and consistency purposes I propose to rename it to {{numFullRestarts}}.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[GitHub] flink issue #4292: [FLINK-6588] Rename NumberOfFullRestarts metric
Github user zjureel commented on the issue: https://github.com/apache/flink/pull/4292

    @zentol What do you think of @StephanEwen's suggestion? I think this change does cause some incompatibilities for users, thanks

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
[jira] [Commented] (FLINK-6667) Pass a callback type to the RestartStrategy, rather than the full ExecutionGraph
[ https://issues.apache.org/jira/browse/FLINK-6667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091112#comment-16091112 ]

ASF GitHub Bot commented on FLINK-6667:
---------------------------------------

Github user zjureel commented on the issue: https://github.com/apache/flink/pull/4277

    Thank you for your reply, I will fix [FLINK-6665](https://issues.apache.org/jira/browse/FLINK-6665) after this PR is merged, thanks :)

> Pass a callback type to the RestartStrategy, rather than the full ExecutionGraph
> --------------------------------------------------------------------------------
>
>                 Key: FLINK-6667
>                 URL: https://issues.apache.org/jira/browse/FLINK-6667
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Distributed Coordination
>    Affects Versions: 1.3.0
>            Reporter: Stephan Ewen
>            Assignee: Fang Yong
>
> To let the {{RestartStrategy}} work across multiple {{FailoverStrategy}} implementations, it needs to be passed a "callback" to call to trigger the restart of tasks/regions/etc.
> Such a "callback" would be a nice abstraction to use for global restarts as well, to not expose the full execution graph.
> Ideally, the callback is one-shot, so it cannot accidentally be used to call restart() multiple times.
> This would also make the testing of RestartStrategies much easier.
[GitHub] flink issue #4277: [FLINK-6667] Pass a callback type to the RestartStrategy,...
Github user zjureel commented on the issue: https://github.com/apache/flink/pull/4277

    Thank you for your reply, I will fix [FLINK-6665](https://issues.apache.org/jira/browse/FLINK-6665) after this PR is merged, thanks :)
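The ticket's "one-shot" requirement can be sketched as follows. This is a minimal illustration under assumed names (`OneShotRestartCallback` and `triggerFullRecovery` are hypothetical, not Flink's actual API): an atomic flag guarantees the restart action cannot fire twice.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class OneShotRestartCallback {
    // The narrow handle a RestartStrategy would receive instead of the
    // full ExecutionGraph: it can trigger exactly one restart.
    private final Runnable restartAction;
    private final AtomicBoolean used = new AtomicBoolean(false);

    public OneShotRestartCallback(Runnable restartAction) {
        this.restartAction = restartAction;
    }

    /** Triggers the restart once; later calls are ignored and return false. */
    public boolean triggerFullRecovery() {
        if (used.compareAndSet(false, true)) {
            restartAction.run();
            return true;
        }
        return false;
    }
}
```

Because the flag is flipped with `compareAndSet`, the guarantee also holds when several threads race to trigger the restart, which is what makes testing RestartStrategies against this handle straightforward.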
[jira] [Commented] (FLINK-7118) Remove hadoop1.x code in HadoopUtils
[ https://issues.apache.org/jira/browse/FLINK-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091067#comment-16091067 ]

ASF GitHub Bot commented on FLINK-7118:
---------------------------------------

Github user zhangminglei commented on the issue: https://github.com/apache/flink/pull/4285

    @StephanEwen The PR has been updated. Please check it out again ~

> Remove hadoop1.x code in HadoopUtils
> ------------------------------------
>
>                 Key: FLINK-7118
>                 URL: https://issues.apache.org/jira/browse/FLINK-7118
>             Project: Flink
>          Issue Type: Improvement
>          Components: Java API
>            Reporter: mingleizhang
>            Assignee: mingleizhang
>
> Since Flink no longer supports Hadoop 1.x versions, we should remove the related code. The code below resides in {{org.apache.flink.api.java.hadoop.mapred.utils.HadoopUtils}}:
>
> {code:java}
> public static JobContext instantiateJobContext(Configuration configuration, JobID jobId) throws Exception {
> 	try {
> 		Class clazz;
> 		// for Hadoop 1.xx
> 		if (JobContext.class.isInterface()) {
> 			clazz = Class.forName("org.apache.hadoop.mapreduce.task.JobContextImpl", true, Thread.currentThread().getContextClassLoader());
> 		}
> 		// for Hadoop 2.xx
> 		else {
> 			clazz = Class.forName("org.apache.hadoop.mapreduce.JobContext", true, Thread.currentThread().getContextClassLoader());
> 		}
> 		Constructor constructor = clazz.getConstructor(Configuration.class, JobID.class);
> 		JobContext context = (JobContext) constructor.newInstance(configuration, jobId);
> 		return context;
> 	} catch (Exception e) {
> 		throw new Exception("Could not create instance of JobContext.");
> 	}
> }
> {code}
> And:
> {code:java}
> public static TaskAttemptContext instantiateTaskAttemptContext(Configuration configuration, TaskAttemptID taskAttemptID) throws Exception {
> 	try {
> 		Class clazz;
> 		// for Hadoop 1.xx
> 		if (JobContext.class.isInterface()) {
> 			clazz = Class.forName("org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl");
> 		}
> 		// for Hadoop 2.xx
> 		else {
> 			clazz = Class.forName("org.apache.hadoop.mapreduce.TaskAttemptContext");
> 		}
> 		Constructor constructor = clazz.getConstructor(Configuration.class, TaskAttemptID.class);
> 		TaskAttemptContext context = (TaskAttemptContext) constructor.newInstance(configuration, taskAttemptID);
> 		return context;
> 	} catch (Exception e) {
> 		throw new Exception("Could not create instance of TaskAttemptContext.");
> 	}
> }
> {code}
[GitHub] flink issue #4285: [FLINK-7118] [hadoop] Remove hadoop1.x code in HadoopUtil...
Github user zhangminglei commented on the issue: https://github.com/apache/flink/pull/4285

    @StephanEwen The PR has been updated. Please check it out again ~
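For context, the reflective instantiation pattern being removed can be illustrated stand-alone (this sketch uses a JDK class instead of Hadoop types so it is self-contained; once Hadoop 1.x support is gone, `JobContext` is always a class, so Flink can construct the Hadoop 2.x implementation directly and drop this indirection entirely):

```java
import java.lang.reflect.Constructor;

public class ReflectiveInstantiation {
    // Looks up a class by name on the context class loader and invokes its
    // single-argument constructor -- the same pattern HadoopUtils used to
    // pick between Hadoop 1.x and 2.x context implementations at runtime.
    static Object instantiate(String className, Object arg) throws Exception {
        Class<?> clazz = Class.forName(className, true,
                Thread.currentThread().getContextClassLoader());
        Constructor<?> ctor = clazz.getConstructor(arg.getClass());
        return ctor.newInstance(arg);
    }
}
```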
[GitHub] flink issue #4347: [FLINK-7201] fix concurrency in JobLeaderIdService when s...
Github user XuPingyong commented on the issue: https://github.com/apache/flink/pull/4347

    @StephanEwen, the rpcService of the ResourceManager executes with only a single thread, so there are no conflicts while the ResourceManager is in service. When the ResourceManager is shut down by another thread, the rpcService had better stop first.
[jira] [Commented] (FLINK-7201) ConcurrentModificationException in JobLeaderIdService
[ https://issues.apache.org/jira/browse/FLINK-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091053#comment-16091053 ]

ASF GitHub Bot commented on FLINK-7201:
---------------------------------------

Github user XuPingyong commented on the issue: https://github.com/apache/flink/pull/4347

    @StephanEwen, the rpcService of the ResourceManager executes with only a single thread, so there are no conflicts while the ResourceManager is in service. When the ResourceManager is shut down by another thread, the rpcService had better stop first.

> ConcurrentModificationException in JobLeaderIdService
> -----------------------------------------------------
>
>                 Key: FLINK-7201
>                 URL: https://issues.apache.org/jira/browse/FLINK-7201
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>            Reporter: Xu Pingyong
>            Assignee: Xu Pingyong
>              Labels: flip-6
>
> {code:java}
> java.util.ConcurrentModificationException: null
> 	at java.util.HashMap$HashIterator.nextEntry(HashMap.java:922)
> 	at java.util.HashMap$ValueIterator.next(HashMap.java:950)
> 	at org.apache.flink.runtime.resourcemanager.JobLeaderIdService.clear(JobLeaderIdService.java:114)
> 	at org.apache.flink.runtime.resourcemanager.JobLeaderIdService.stop(JobLeaderIdService.java:92)
> 	at org.apache.flink.runtime.resourcemanager.ResourceManager.shutDown(ResourceManager.java:200)
> 	at org.apache.flink.runtime.resourcemanager.ResourceManagerRunner.shutDownInternally(ResourceManagerRunner.java:102)
> 	at org.apache.flink.runtime.resourcemanager.ResourceManagerRunner.shutDown(ResourceManagerRunner.java:97)
> 	at org.apache.flink.runtime.minicluster.MiniCluster.shutdownInternally(MiniCluster.java:329)
> 	at org.apache.flink.runtime.minicluster.MiniCluster.shutdown(MiniCluster.java:297)
> 	at org.apache.flink.runtime.minicluster.MiniClusterITCase.runJobWithMultipleJobManagers(MiniClusterITCase.java:85)
> {code}
> Because the jobLeaderIdService stops before the rpcService when the resourceManager shuts down, the jobLeaderIdService is exposed to thread-unsafe access.
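The failure mode in the stack trace is the standard fail-fast behavior of `HashMap` iterators: structurally modifying the map while iterating over its values throws `ConcurrentModificationException`. A minimal single-threaded illustration (not Flink code):

```java
import java.util.HashMap;
import java.util.Map;

public class CmeDemo {
    // Demonstrates the failure mode seen in JobLeaderIdService.clear():
    // removing an entry while iterating over map.values() makes the
    // iterator's next() call throw ConcurrentModificationException.
    public static boolean triggersCme() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        try {
            for (Integer v : map.values()) {
                map.remove("a"); // structural modification during iteration
            }
            return false;
        } catch (java.util.ConcurrentModificationException e) {
            return true;
        }
    }
}
```

In the Flink case the modification comes from a second thread (the rpcService still running while shutdown clears the map), but the detection mechanism is the same modification-count check.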
[jira] [Commented] (FLINK-7194) Add getResultType and getAccumulatorType to AggregateFunction
[ https://issues.apache.org/jira/browse/FLINK-7194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091036#comment-16091036 ]

Ruidong Li commented on FLINK-7194:
-----------------------------------

{{ScalarFunction.getResultType()}} has parameters while the {{TableFunction}} and {{AggregateFunction}} variants do not. Users can implement different {{ScalarFunction.eval()}} methods with different signatures, such as {{def eval(x: Int): Boolean}} or {{def eval(x: String): String}}, so the return value of {{ScalarFunction.getResultType()}} is determined by its parameters.

> Add getResultType and getAccumulatorType to AggregateFunction
> --------------------------------------------------------------
>
>                 Key: FLINK-7194
>                 URL: https://issues.apache.org/jira/browse/FLINK-7194
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API & SQL
>    Affects Versions: 1.4.0
>            Reporter: Fabian Hueske
>
> FLINK-6725 and FLINK-6457 proposed to remove methods with default implementations such as {{getResultType()}}, {{toString()}}, or {{requiresOver()}} from the base classes of user-defined methods (UDF, UDTF, UDAGG) and instead offer them as contract methods which are dynamically
> In PR [#3993|https://github.com/apache/flink/pull/3993] I argued that these methods have a fixed signature (in contrast to the {{eval()}}, {{accumulate()}} and {{retract()}} methods) and should be kept in the classes. For users that don't need these methods, this doesn't make a difference because the methods are not abstract and have a default implementation. For users that need to override the methods it makes a difference, because they get IDE and compiler support when overriding them and they cannot get the signature wrong.
> Consequently, I propose to add {{getResultType()}} and {{getAccumulatorType()}} as methods with default implementation to {{AggregateFunction}}. This will make the interface of {{AggregateFunction}} more consistent with {{ScalarFunction}} and {{TableFunction}}.
> What do you think [~shaoxuan], [~RuidongLi] and [~jark]?
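The argument for keeping fixed-signature methods in the base class can be sketched with a stripped-down model (the class names here are hypothetical, not Flink's actual Table API): overriding a concrete default method gets `@Override` checking from the compiler, which a free-form contract method discovered dynamically would not.

```java
public class ResultTypeExample {

    // Mini-model of a UDAGG base class: getResultType() has a default
    // implementation meaning "let the framework infer the type".
    abstract static class AggregateFunctionModel<T> {
        public Class<?> getResultType() {
            return null; // default: type is inferred
        }
        public abstract T getValue();
    }

    // A user function overriding the default; @Override means the compiler
    // rejects any mistake in the method signature.
    static class CountAgg extends AggregateFunctionModel<Long> {
        @Override
        public Class<?> getResultType() {
            return Long.class;
        }
        @Override
        public Long getValue() { return 0L; }
    }
}
```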
[jira] [Commented] (FLINK-6493) Ineffective null check in RegisteredOperatorBackendStateMetaInfo#equals()
[ https://issues.apache.org/jira/browse/FLINK-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091034#comment-16091034 ]

ASF GitHub Bot commented on FLINK-6493:
---------------------------------------

Github user zhangminglei commented on the issue: https://github.com/apache/flink/pull/4328

    The PR has been updated. Please help to check again. Thanks ~

> Ineffective null check in RegisteredOperatorBackendStateMetaInfo#equals()
> --------------------------------------------------------------------------
>
>                 Key: FLINK-6493
>                 URL: https://issues.apache.org/jira/browse/FLINK-6493
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>            Reporter: Ted Yu
>            Assignee: mingleizhang
>            Priority: Minor
>
> {code}
> && ((partitionStateSerializer == null && ((Snapshot) obj).getPartitionStateSerializer() == null)
> 	|| partitionStateSerializer.equals(((Snapshot) obj).getPartitionStateSerializer()))
> && ((partitionStateSerializerConfigSnapshot == null && ((Snapshot) obj).getPartitionStateSerializerConfigSnapshot() == null)
> 	|| partitionStateSerializerConfigSnapshot.equals(((Snapshot) obj).getPartitionStateSerializerConfigSnapshot()));
> {code}
> The null check for partitionStateSerializer / partitionStateSerializerConfigSnapshot is combined with another clause in a way that can lead to an NPE in the partitionStateSerializer.equals() call.
[GitHub] flink issue #4328: [FLINK-6493] Fix ineffective null check in RegisteredOper...
Github user zhangminglei commented on the issue: https://github.com/apache/flink/pull/4328

    The PR has been updated. Please help to check again. Thanks ~
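The bug pattern and its fix can be shown in isolation (illustrative only, using `String` fields instead of the serializer types; `Objects.equals` is the standard null-safe replacement):

```java
import java.util.Objects;

public class NullSafeEquals {
    // The broken pattern from the ticket: when a == null and b != null,
    // the first clause is false, so a.equals(b) is evaluated on a null
    // reference and throws NullPointerException.
    static boolean brokenEquals(String a, String b) {
        return (a == null && b == null) || a.equals(b);
    }

    // Null-safe replacement: handles all four null/non-null combinations.
    static boolean safeEquals(String a, String b) {
        return Objects.equals(a, b);
    }
}
```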
[jira] [Commented] (FLINK-6105) Properly handle InterruptedException in HadoopInputFormatBase
[ https://issues.apache.org/jira/browse/FLINK-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091003#comment-16091003 ]

Ted Yu commented on FLINK-6105:
-------------------------------

Using InterruptedIOException is common practice in handling InterruptedException. You can see a lot of such usage in, e.g., hadoop and hbase.

> Properly handle InterruptedException in HadoopInputFormatBase
> --------------------------------------------------------------
>
>                 Key: FLINK-6105
>                 URL: https://issues.apache.org/jira/browse/FLINK-6105
>             Project: Flink
>          Issue Type: Bug
>          Components: DataStream API
>            Reporter: Ted Yu
>            Assignee: mingleizhang
>
> When catching InterruptedException, we should throw InterruptedIOException instead of IOException.
> The following example is from HadoopInputFormatBase:
> {code}
> try {
> 	splits = this.mapreduceInputFormat.getSplits(jobContext);
> } catch (InterruptedException e) {
> 	throw new IOException("Could not get Splits.", e);
> }
> {code}
> There may be other places where IOE is thrown.
[jira] [Commented] (FLINK-6105) Properly handle InterruptedException in HadoopInputFormatBase
[ https://issues.apache.org/jira/browse/FLINK-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090997#comment-16090997 ]

ASF GitHub Bot commented on FLINK-6105:
---------------------------------------

Github user zhangminglei commented on the issue: https://github.com/apache/flink/pull/4316

    Maybe @tedyu could shed some light on it.

> Properly handle InterruptedException in HadoopInputFormatBase
> --------------------------------------------------------------
>
>                 Key: FLINK-6105
>                 URL: https://issues.apache.org/jira/browse/FLINK-6105
>             Project: Flink
>          Issue Type: Bug
>          Components: DataStream API
>            Reporter: Ted Yu
>            Assignee: mingleizhang
>
> When catching InterruptedException, we should throw InterruptedIOException instead of IOException.
> The following example is from HadoopInputFormatBase:
> {code}
> try {
> 	splits = this.mapreduceInputFormat.getSplits(jobContext);
> } catch (InterruptedException e) {
> 	throw new IOException("Could not get Splits.", e);
> }
> {code}
> There may be other places where IOE is thrown.
[GitHub] flink issue #4316: [FLINK-6105] Use InterruptedIOException instead of IOExce...
Github user zhangminglei commented on the issue: https://github.com/apache/flink/pull/4316

    Maybe @tedyu could shed some light on it.
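The proposed fix can be sketched as follows (a self-contained illustration, not the actual `HadoopInputFormatBase` code; `SplitSource` is a hypothetical stand-in for `mapreduceInputFormat`). Note that `InterruptedIOException` has no `(String, Throwable)` constructor, so the cause is attached via `initCause`, and the thread's interrupt flag is restored per the usual convention:

```java
import java.io.IOException;
import java.io.InterruptedIOException;

public class InterruptHandling {
    interface SplitSource {
        Object getSplits() throws InterruptedException;
    }

    // Translates InterruptedException into InterruptedIOException instead
    // of wrapping it in a generic IOException, and restores the interrupt
    // status so callers further up can still observe the interruption.
    static Object getSplitsSafely(SplitSource source) throws IOException {
        try {
            return source.getSplits();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            InterruptedIOException ioe =
                    new InterruptedIOException("Could not get Splits.");
            ioe.initCause(e);
            throw ioe;
        }
    }
}
```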
[jira] [Commented] (FLINK-6893) Add BIN supported in SQL
[ https://issues.apache.org/jira/browse/FLINK-6893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090992#comment-16090992 ]

ASF GitHub Bot commented on FLINK-6893:
---------------------------------------

Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/4128#discussion_r127868565

--- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/expressions/ScalarFunctionsTest.scala ---
@@ -352,6 +352,14 @@ class ScalarFunctionsTest extends ScalarTypesTestBase {
       "Flinkxx")
   }

+  @Test
+  def testBin(): Unit = {
+    testSqlApi("BIN(12)", "1100")
+    testSqlApi("BIN(10)", "1010")
+    testSqlApi("BIN(0)", "0")
+    testSqlApi("BIN(f32)", "")
+  }
+
--- End diff --

    Would be better to add validation tests to `ScalarFunctionsValidationTest` to check the expected exception for unsupported operand types.

> Add BIN supported in SQL
> -------------------------
>
>                 Key: FLINK-6893
>                 URL: https://issues.apache.org/jira/browse/FLINK-6893
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>    Affects Versions: 1.4.0
>            Reporter: sunjincheng
>            Assignee: sunjincheng
>
> BIN(N) returns a string representation of the binary value of N, where N is a longlong (BIGINT) number. This is equivalent to CONV(N,10,2). Returns NULL if N is NULL.
> * Syntax: BIN(num)
> * Arguments: num: a long/bigint value
> * Return Types: String
> * Example: BIN(12) -> '1100'
> * See more: [MySQL|https://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_bin]
[jira] [Commented] (FLINK-6893) Add BIN supported in SQL
[ https://issues.apache.org/jira/browse/FLINK-6893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090993#comment-16090993 ]

ASF GitHub Bot commented on FLINK-6893:
---------------------------------------

Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/4128#discussion_r127868400

--- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/functions/sql/ScalarSqlFunctions.scala ---
@@ -33,6 +33,14 @@ object ScalarSqlFunctions {
     OperandTypes.NILADIC,
     SqlFunctionCategory.NUMERIC)

+  val BIN = new SqlFunction(
+    "BIN",
+    SqlKind.OTHER_FUNCTION,
+    ReturnTypes.explicit(SqlTypeName.VARCHAR),
+    null,
+    OperandTypes.NUMERIC,
--- End diff --

    I don't think BIN accepts all NUMERIC operands. I think it only accepts BIGINT, TINYINT, SMALLINT, and INTEGER, and does not support other decimal numerics.

> Add BIN supported in SQL
> -------------------------
>
>                 Key: FLINK-6893
>                 URL: https://issues.apache.org/jira/browse/FLINK-6893
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>    Affects Versions: 1.4.0
>            Reporter: sunjincheng
>            Assignee: sunjincheng
>
> BIN(N) returns a string representation of the binary value of N, where N is a longlong (BIGINT) number. This is equivalent to CONV(N,10,2). Returns NULL if N is NULL.
> * Syntax: BIN(num)
> * Arguments: num: a long/bigint value
> * Return Types: String
> * Example: BIN(12) -> '1100'
> * See more: [MySQL|https://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_bin]
[jira] [Commented] (FLINK-6893) Add BIN supported in SQL
[ https://issues.apache.org/jira/browse/FLINK-6893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090994#comment-16090994 ]

ASF GitHub Bot commented on FLINK-6893:
---------------------------------------

Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/4128#discussion_r127866461

--- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/functions/ScalarFunctions.scala ---
@@ -82,4 +82,24 @@ object ScalarFunctions {
     }
     sb.toString
   }

+  /**
+    * Returns a string representation of the binary value of N. Returns NULL if N is NULL.
+    */
+  def bin(n: Long): String = {
+    if (null == n) {
+      return null
+    }
+    val value = new Array[Byte](64)
+    var num = n
+    // Extract the bits of num into value[] from right to left
+    var len: Int = 0
+    do {
+      len += 1
+      value(value.length - len) = ('0' + (num & 1)).toByte
+      num >>>= 1
+    } while (num != 0)
--- End diff --

    Use `Long.toBinaryString(long)` to parse it.

> Add BIN supported in SQL
> -------------------------
>
>                 Key: FLINK-6893
>                 URL: https://issues.apache.org/jira/browse/FLINK-6893
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>    Affects Versions: 1.4.0
>            Reporter: sunjincheng
>            Assignee: sunjincheng
>
> BIN(N) returns a string representation of the binary value of N, where N is a longlong (BIGINT) number. This is equivalent to CONV(N,10,2). Returns NULL if N is NULL.
> * Syntax: BIN(num)
> * Arguments: num: a long/bigint value
> * Return Types: String
> * Example: BIN(12) -> '1100'
> * See more: [MySQL|https://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_bin]
[GitHub] flink pull request #4128: [FLINK-6893][table]Add BIN supported in SQL
Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/4128#discussion_r127868400

--- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/functions/sql/ScalarSqlFunctions.scala ---
@@ -33,6 +33,14 @@ object ScalarSqlFunctions {
     OperandTypes.NILADIC,
     SqlFunctionCategory.NUMERIC)

+  val BIN = new SqlFunction(
+    "BIN",
+    SqlKind.OTHER_FUNCTION,
+    ReturnTypes.explicit(SqlTypeName.VARCHAR),
+    null,
+    OperandTypes.NUMERIC,
--- End diff --

    I don't think BIN accepts all NUMERIC operands. I think it only accepts BIGINT, TINYINT, SMALLINT, and INTEGER, and does not support other decimal numerics.
[GitHub] flink pull request #4128: [FLINK-6893][table]Add BIN supported in SQL
Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/4128#discussion_r127866461

--- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/functions/ScalarFunctions.scala ---
@@ -82,4 +82,24 @@ object ScalarFunctions {
     }
     sb.toString
   }

+  /**
+    * Returns a string representation of the binary value of N. Returns NULL if N is NULL.
+    */
+  def bin(n: Long): String = {
+    if (null == n) {
+      return null
+    }
+    val value = new Array[Byte](64)
+    var num = n
+    // Extract the bits of num into value[] from right to left
+    var len: Int = 0
+    do {
+      len += 1
+      value(value.length - len) = ('0' + (num & 1)).toByte
+      num >>>= 1
+    } while (num != 0)
--- End diff --

    Use `Long.toBinaryString(long)` to parse it.
[GitHub] flink pull request #4128: [FLINK-6893][table]Add BIN supported in SQL
Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/4128#discussion_r127868565

--- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/expressions/ScalarFunctionsTest.scala ---
@@ -352,6 +352,14 @@ class ScalarFunctionsTest extends ScalarTypesTestBase {
       "Flinkxx")
   }

+  @Test
+  def testBin(): Unit = {
+    testSqlApi("BIN(12)", "1100")
+    testSqlApi("BIN(10)", "1010")
+    testSqlApi("BIN(0)", "0")
+    testSqlApi("BIN(f32)", "")
+  }
+
--- End diff --

    Would be better to add validation tests to `ScalarFunctionsValidationTest` to check the expected exception for unsupported operand types.
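Following the reviewer's suggestion, the hand-rolled bit loop can be replaced by the JDK's `Long.toBinaryString`. A minimal sketch (plain Java rather than the PR's Scala, with a boxed `Long` argument standing in for SQL NULL):

```java
public class BinFunction {
    // BIN(N): binary-string representation of N; NULL in, NULL out.
    // Long.toBinaryString interprets the argument as an unsigned 64-bit
    // value, which matches the two's-complement output the PR's loop
    // produces for negative inputs.
    static String bin(Long n) {
        if (n == null) {
            return null;
        }
        return Long.toBinaryString(n);
    }
}
```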
[jira] [Commented] (FLINK-7162) Tests should not write outside 'target' directory.
[ https://issues.apache.org/jira/browse/FLINK-7162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090948#comment-16090948 ]

ASF GitHub Bot commented on FLINK-7162:
---------------------------------------

Github user zhangminglei commented on the issue: https://github.com/apache/flink/pull/4311

    Thanks @StephanEwen for the generous review. All proposals have been addressed and the PR updated again. Please help to check :)

> Tests should not write outside 'target' directory.
> ---------------------------------------------------
>
>                 Key: FLINK-7162
>                 URL: https://issues.apache.org/jira/browse/FLINK-7162
>             Project: Flink
>          Issue Type: Improvement
>          Components: Local Runtime
>            Reporter: mingleizhang
>            Assignee: mingleizhang
>
> A few tests use Files.createTempDir() from the Guava package, but do not set the java.io.tmpdir system property. Thus the temp directory is created in unpredictable places and is not cleaned up by {{mvn clean}}.
> This was probably introduced in {{JobManagerStartupTest}} and then replicated in {{BlobUtilsTest}}.
[GitHub] flink issue #4311: [FLINK-7162] [test] Tests should not write outside 'targe...
Github user zhangminglei commented on the issue: https://github.com/apache/flink/pull/4311

    Thanks @StephanEwen for the generous review. All proposals have been addressed and the PR updated again. Please help to check :)
[jira] [Commented] (FLINK-6923) Kafka connector needs to expose information about in-flight record in AbstractFetcher base class
[ https://issues.apache.org/jira/browse/FLINK-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090762#comment-16090762 ]

ASF GitHub Bot commented on FLINK-6923:
---------------------------------------

Github user zhenzhongxu commented on the issue: https://github.com/apache/flink/pull/4149

    Sounds fair. @aljoscha @tzulitai Any recommendations on which particular tests to add and where to put them? I'll also improve the documentation.

> Kafka connector needs to expose information about in-flight record in AbstractFetcher base class
> --------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-6923
>                 URL: https://issues.apache.org/jira/browse/FLINK-6923
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kafka Connector
>            Reporter: Zhenzhong Xu
>            Assignee: Zhenzhong Xu
>            Priority: Minor
>
> We have a use case where our custom Fetcher implementation extends the AbstractFetcher base class. We need to periodically get the partition and offset information of the records currently in flight (in processing).
> This can easily be exposed in the AbstractFetcher class.
[GitHub] flink issue #4149: [FLINK-6923] [Kafka Connector] Expose in-processing/in-fl...
Github user zhenzhongxu commented on the issue: https://github.com/apache/flink/pull/4149

    Sounds fair. @aljoscha @tzulitai Any recommendations on which particular tests to add and where to put them? I'll also improve the documentation.
[GitHub] flink pull request #4187: [FLINK-6998][Kafka Connector] Add kafka offset com...
Github user zhenzhongxu commented on a diff in the pull request: https://github.com/apache/flink/pull/4187#discussion_r127842765

--- Diff: flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java ---
@@ -505,6 +519,21 @@ public void run(SourceContext sourceContext) throws Exception {
 			throw new Exception("The partitions were not set for the consumer");
 		}

+		// initialize commit metrics and default offset callback method
+		this.successfulCommits = this.getRuntimeContext().getMetricGroup().counter("commitsSucceeded");
+		this.failedCommits = this.getRuntimeContext().getMetricGroup().counter("commitsFailed");
+
+		this.offsetCommitCallback = new KafkaCommitCallback() {
+			@Override
+			public void onComplete(Exception exception) {
+				if (exception == null) {
+					successfulCommits.inc();
--- End diff --

    I think I'll add a detailed javadoc to describe the current unprotected implementation. I am hesitant to add lock protection because the higher-level abstraction currently guarantees no concurrent commits.
[jira] [Commented] (FLINK-6998) Kafka connector needs to expose metrics for failed/successful offset commits in the Kafka Consumer callback
[ https://issues.apache.org/jira/browse/FLINK-6998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090699#comment-16090699 ]

ASF GitHub Bot commented on FLINK-6998:
---------------------------------------

Github user zhenzhongxu commented on a diff in the pull request: https://github.com/apache/flink/pull/4187#discussion_r127842765

--- Diff: flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java ---
@@ -505,6 +519,21 @@ public void run(SourceContext sourceContext) throws Exception {
 			throw new Exception("The partitions were not set for the consumer");
 		}

+		// initialize commit metrics and default offset callback method
+		this.successfulCommits = this.getRuntimeContext().getMetricGroup().counter("commitsSucceeded");
+		this.failedCommits = this.getRuntimeContext().getMetricGroup().counter("commitsFailed");
+
+		this.offsetCommitCallback = new KafkaCommitCallback() {
+			@Override
+			public void onComplete(Exception exception) {
+				if (exception == null) {
+					successfulCommits.inc();
--- End diff --

    I think I'll add a detailed javadoc to describe the current unprotected implementation. I am hesitant to add lock protection because the higher-level abstraction currently guarantees no concurrent commits.

> Kafka connector needs to expose metrics for failed/successful offset commits in the Kafka Consumer callback
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-6998
>                 URL: https://issues.apache.org/jira/browse/FLINK-6998
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kafka Connector
>            Reporter: Zhenzhong Xu
>            Assignee: Zhenzhong Xu
>
> Propose to add "kafkaCommitsSucceeded" and "kafkaCommitsFailed" metrics in the KafkaConsumerThread class.
[jira] [Commented] (FLINK-6998) Kafka connector needs to expose metrics for failed/successful offset commits in the Kafka Consumer callback
[ https://issues.apache.org/jira/browse/FLINK-6998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090659#comment-16090659 ]

ASF GitHub Bot commented on FLINK-6998:
---------------------------------------

Github user zhenzhongxu commented on a diff in the pull request: https://github.com/apache/flink/pull/4187#discussion_r127838984

--- Diff: flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java ---
@@ -185,6 +187,18 @@ private volatile boolean running = true;

+	// internal metrics
+
+	/** Counter for successful Kafka offset commits. */
+	private transient Counter successfulCommits;
+
+	/** Counter for failed Kafka offset commits. */
+	private transient Counter failedCommits;
+
+	private transient KafkaCommitCallback offsetCommitCallback;
--- End diff --

    Sounds fair, I'll include a javadoc as well as a notice about the thread-safety contract, as you suggested.

> Kafka connector needs to expose metrics for failed/successful offset commits in the Kafka Consumer callback
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-6998
>                 URL: https://issues.apache.org/jira/browse/FLINK-6998
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kafka Connector
>            Reporter: Zhenzhong Xu
>            Assignee: Zhenzhong Xu
>
> Propose to add "kafkaCommitsSucceeded" and "kafkaCommitsFailed" metrics in the KafkaConsumerThread class.
[GitHub] flink pull request #4187: [FLINK-6998][Kafka Connector] Add kafka offset com...
Github user zhenzhongxu commented on a diff in the pull request: https://github.com/apache/flink/pull/4187#discussion_r127838984

--- Diff: flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java ---
@@ -185,6 +187,18 @@
 	private volatile boolean running = true;
 
 	// ------------------------------------------------------------------------
+	//  internal metrics
+	// ------------------------------------------------------------------------
+
+	/** Counter for successful Kafka offset commits. */
+	private transient Counter successfulCommits;
+
+	/** Counter for failed Kafka offset commits. */
+	private transient Counter failedCommits;
+
+	private transient KafkaCommitCallback offsetCommitCallback;
--- End diff --

Sounds fair, I'll include a javadoc as well as a notice about the thread-safety contract, as you suggested.

--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
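The counter-and-callback pattern discussed in this thread can be shown self-contained. The `Counter` and `KafkaCommitCallback` types below are minimal stand-ins that mirror the names in the diff; they are not Flink's real interfaces, and the real implementation registers the counters on the task's metric group instead:

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch of routing each async offset-commit result to exactly
// one of two counters, as in the PR's default KafkaCommitCallback.
class CommitMetricsSketch {

    interface Counter { void inc(); long getCount(); }

    /** Invoked by the consumer thread when an async offset commit finishes. */
    interface KafkaCommitCallback { void onComplete(Exception exception); }

    static class SimpleCounter implements Counter {
        private final AtomicLong count = new AtomicLong();
        public void inc() { count.incrementAndGet(); }
        public long getCount() { return count.get(); }
    }

    final Counter successfulCommits = new SimpleCounter();
    final Counter failedCommits = new SimpleCounter();

    // A null exception signals success; anything else counts as a failed commit.
    final KafkaCommitCallback offsetCommitCallback = exception -> {
        if (exception == null) {
            successfulCommits.inc();
        } else {
            failedCommits.inc();
        }
    };

    public static void main(String[] args) {
        CommitMetricsSketch s = new CommitMetricsSketch();
        s.offsetCommitCallback.onComplete(null);                  // successful commit
        s.offsetCommitCallback.onComplete(new Exception("boom")); // failed commit
        System.out.println(s.successfulCommits.getCount() + "/" + s.failedCommits.getCount());
    }
}
```

The atomic counters make the sketch safe even without the external lock the reviewers discuss; the thread-safety concern in the thread is about the surrounding consumer state, not the counters themselves.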
[jira] [Created] (FLINK-7216) ExecutionGraph can perform concurrent global restarts to scheduling
Stephan Ewen created FLINK-7216: --- Summary: ExecutionGraph can perform concurrent global restarts to scheduling
Key: FLINK-7216
URL: https://issues.apache.org/jira/browse/FLINK-7216
Project: Flink
Issue Type: Bug
Components: Distributed Coordination
Affects Versions: 1.3.1, 1.2.1
Reporter: Stephan Ewen
Assignee: Stephan Ewen
Priority: Blocker
Fix For: 1.4.0, 1.3.2

Because ExecutionGraph restarts happen asynchronously and possibly delayed, it can happen in rare corner cases that two restarts are attempted concurrently, in which case some structures on the ExecutionGraph undergo concurrent access.

Sample stack trace:
{code}
WARN org.apache.flink.runtime.executiongraph.ExecutionGraph - Failed to restart the job.
java.lang.IllegalStateException: SlotSharingGroup cannot clear task assignment, group still has allocated resources.
	at org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup.clearTaskAssignment(SlotSharingGroup.java:78)
	at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.resetForNewExecution(ExecutionJobVertex.java:535)
	at org.apache.flink.runtime.executiongraph.ExecutionGraph.restart(ExecutionGraph.java:1151)
	at org.apache.flink.runtime.executiongraph.restart.ExecutionGraphRestarter$1.call(ExecutionGraphRestarter.java:40)
	at akka.dispatch.Futures$$anonfun$future$1.apply(Future.scala:95)
	at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
	at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
{code}

The solution is to strictly guard against "subsumed" restarts via the {{globalModVersion}}, in a similar way as we fence local restarts against global restarts.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
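The version-based fencing described in the last paragraph can be illustrated with a small sketch. The names (`globalModVersion`, `tryRestart`) follow the issue text, but the code below is only an illustration of the fencing idea, not the actual ExecutionGraph implementation:

```java
// A restart attempt captures the global modification version it was
// scheduled for; if another restart bumped the version in the meantime,
// the attempt is dropped as "subsumed" instead of touching shared state.
class RestartFencingSketch {
    private long globalModVersion = 1L;
    private int restartsExecuted = 0;

    synchronized long currentGlobalModVersion() {
        return globalModVersion;
    }

    /** Returns true if the restart ran, false if it was subsumed. */
    synchronized boolean tryRestart(long expectedGlobalModVersion) {
        if (expectedGlobalModVersion != globalModVersion) {
            return false; // a concurrent restart already superseded this one
        }
        globalModVersion++;   // fences out all other pending restart attempts
        restartsExecuted++;   // stand-in for resetForNewExecution() etc.
        return true;
    }

    int restartsExecuted() {
        return restartsExecuted;
    }

    public static void main(String[] args) {
        RestartFencingSketch graph = new RestartFencingSketch();
        long v = graph.currentGlobalModVersion();
        boolean first = graph.tryRestart(v);   // wins, bumps the version
        boolean second = graph.tryRestart(v);  // subsumed: stale version
        System.out.println(first + " " + second + " " + graph.restartsExecuted());
    }
}
```

Because both attempts validate against the version inside the same lock, at most one of any set of concurrently scheduled restarts can proceed, which is exactly what the `SlotSharingGroup` invariant in the stack trace requires.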
[jira] [Commented] (FLINK-6667) Pass a callback type to the RestartStrategy, rather than the full ExecutionGraph
[ https://issues.apache.org/jira/browse/FLINK-6667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090420#comment-16090420 ] ASF GitHub Bot commented on FLINK-6667: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/4277 @zjureel Thanks for this patch. I will pick it up from here. I think there is a small additional change needed, and a test, but I can do that... > Pass a callback type to the RestartStrategy, rather than the full > ExecutionGraph > > > Key: FLINK-6667 > URL: https://issues.apache.org/jira/browse/FLINK-6667 > Project: Flink > Issue Type: Sub-task > Components: Distributed Coordination >Affects Versions: 1.3.0 >Reporter: Stephan Ewen >Assignee: Fang Yong > > To let the {{ResourceStrategy}} work across multiple {{FailoverStrategy}} > implementations, it needs to be passed a "callback" to call to trigger the > restart of tasks/regions/etc. > Such a "callback" would be a nice abstraction to use for global restarts as > well, to not expose the full execution graph. > Ideally, the callback is one-shot, so it cannot accidentally be used to call > restart() multiple times. > This would also make the testing of RestartStrategies much easier. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink issue #4277: [FLINK-6667] Pass a callback type to the RestartStrategy,...
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/4277 @zjureel Thanks for this patch. I will pick it up from here. I think there is a small additional change needed, and a test, but I can do that... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-7067) Cancel with savepoint does not restart checkpoint scheduler on failure
[ https://issues.apache.org/jira/browse/FLINK-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090415#comment-16090415 ] ASF GitHub Bot commented on FLINK-7067: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/4254

I think this is a meaningful fix. I would suggest to do the tests differently, though. The tests of the CheckpointCoordinator overdo the mockito stuff so heavily that it becomes extremely hard to change anything in the CheckpointCoordinator. Mocks are super maintenance-heavy compared to actual test implementations of interfaces or classes.

> Cancel with savepoint does not restart checkpoint scheduler on failure
> ----------------------------------------------------------------------
>
> Key: FLINK-7067
> URL: https://issues.apache.org/jira/browse/FLINK-7067
> Project: Flink
> Issue Type: Bug
> Components: State Backends, Checkpointing
> Affects Versions: 1.3.1
> Reporter: Ufuk Celebi
> Assignee: Ufuk Celebi
> Priority: Blocker
> Fix For: 1.3.2
>
> The `CancelWithSavepoint` action of the JobManager first stops the checkpoint
> scheduler, then triggers a savepoint, and cancels the job after the savepoint
> completes.
> If the savepoint fails, the command should not have any side effects and we
> don't cancel the job. The issue is that the checkpoint scheduler is not
> restarted, though.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink issue #4254: [FLINK-7067] [jobmanager] Fix side effects after failed c...
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/4254 I think this is a meaningful fix. I would suggest to do the tests differently, though. The tests of the CheckpointCoordinator overdo the mockito stuff so heavily that it becomes extremely hard to change anything in the CheckpointCoordinator. Mocks are super maintenance-heavy compared to actual test implementations of interfaces or classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
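The stop/trigger/restore flow described in FLINK-7067 can be sketched as follows. The scheduler flag and the savepoint future are simplified stand-ins, not the JobManager's actual API; the point is only that the failure branch must undo the scheduler stop:

```java
import java.util.concurrent.CompletableFuture;

// Cancel-with-savepoint: stop the periodic checkpoint scheduler, trigger
// the savepoint, and either cancel the job (success) or restart the
// scheduler (failure) so a failed savepoint leaves no side effects.
class CancelWithSavepointSketch {
    boolean schedulerRunning = true;
    boolean jobCancelled = false;

    void stopScheduler()  { schedulerRunning = false; }
    void startScheduler() { schedulerRunning = true; }

    void cancelWithSavepoint(CompletableFuture<String> savepoint) {
        stopScheduler(); // no periodic checkpoints while the savepoint runs
        savepoint.whenComplete((path, failure) -> {
            if (failure == null) {
                jobCancelled = true;  // savepoint done, now cancel the job
            } else {
                startScheduler();     // the missing step this PR adds
            }
        });
    }

    public static void main(String[] args) {
        CancelWithSavepointSketch jm = new CancelWithSavepointSketch();
        CompletableFuture<String> failing = new CompletableFuture<>();
        jm.cancelWithSavepoint(failing);
        failing.completeExceptionally(new RuntimeException("savepoint failed"));
        // scheduler is running again, job was not cancelled
        System.out.println(jm.schedulerRunning + " " + jm.jobCancelled);
    }
}
```

Without the `startScheduler()` call in the failure branch, a failed savepoint would silently leave the job running with checkpointing disabled, which is the bug the issue describes.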
[jira] [Assigned] (FLINK-7067) Cancel with savepoint does not restart checkpoint scheduler on failure
[ https://issues.apache.org/jira/browse/FLINK-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen reassigned FLINK-7067: --- Assignee: Ufuk Celebi > Cancel with savepoint does not restart checkpoint scheduler on failure > -- > > Key: FLINK-7067 > URL: https://issues.apache.org/jira/browse/FLINK-7067 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing >Affects Versions: 1.3.1 >Reporter: Ufuk Celebi >Assignee: Ufuk Celebi >Priority: Blocker > Fix For: 1.3.2 > > > The `CancelWithSavepoint` action of the JobManager first stops the checkpoint > scheduler, then triggers a savepoint, and cancels the job after the savepoint > completes. > If the savepoint fails, the command should not have any side effects and we > don't cancel the job. The issue is that the checkpoint scheduler is not > restarted though. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (FLINK-7067) Cancel with savepoint does not restart checkpoint scheduler on failure
[ https://issues.apache.org/jira/browse/FLINK-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen updated FLINK-7067: Fix Version/s: 1.3.2 > Cancel with savepoint does not restart checkpoint scheduler on failure > -- > > Key: FLINK-7067 > URL: https://issues.apache.org/jira/browse/FLINK-7067 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing >Affects Versions: 1.3.1 >Reporter: Ufuk Celebi > Fix For: 1.3.2 > > > The `CancelWithSavepoint` action of the JobManager first stops the checkpoint > scheduler, then triggers a savepoint, and cancels the job after the savepoint > completes. > If the savepoint fails, the command should not have any side effects and we > don't cancel the job. The issue is that the checkpoint scheduler is not > restarted though. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (FLINK-7067) Cancel with savepoint does not restart checkpoint scheduler on failure
[ https://issues.apache.org/jira/browse/FLINK-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen updated FLINK-7067: Priority: Blocker (was: Major) > Cancel with savepoint does not restart checkpoint scheduler on failure > -- > > Key: FLINK-7067 > URL: https://issues.apache.org/jira/browse/FLINK-7067 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing >Affects Versions: 1.3.1 >Reporter: Ufuk Celebi >Priority: Blocker > Fix For: 1.3.2 > > > The `CancelWithSavepoint` action of the JobManager first stops the checkpoint > scheduler, then triggers a savepoint, and cancels the job after the savepoint > completes. > If the savepoint fails, the command should not have any side effects and we > don't cancel the job. The issue is that the checkpoint scheduler is not > restarted though. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6588) Rename NumberOfFullRestarts metric
[ https://issues.apache.org/jira/browse/FLINK-6588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090374#comment-16090374 ] ASF GitHub Bot commented on FLINK-6588: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/4292

I am very hesitant to rename such metrics, because there may be production users that defined monitoring or alerting based on that metric. Those installations would be broken by this change. For that reason, I would suggest not to make this change...

> Rename NumberOfFullRestarts metric
> ----------------------------------
>
> Key: FLINK-6588
> URL: https://issues.apache.org/jira/browse/FLINK-6588
> Project: Flink
> Issue Type: Improvement
> Components: Metrics
> Affects Versions: 1.3.0, 1.4.0
> Reporter: Chesnay Schepler
> Assignee: Fang Yong
>
> The metric for the number of full restarts is currently called
> {{fullRestarts}}. For clarity and consistency purposes I propose to rename it
> to {{numFullRestarts}}.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink issue #4292: [FLINK-6588] Rename NumberOfFullRestarts metric
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/4292 I am very hesitant to rename such metrics, because there may be production users that defined monitoring or alerting based on that metric. Those installations would be broken by this change. For that reason, I would suggest not to make this change... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
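One common way to handle the compatibility concern raised here is to register the same value under both the old and the new name for a deprecation period, so dashboards and alerts keyed on the legacy name keep working. This is a hedged illustration of that general technique with a toy registry, not what this Flink PR ultimately did:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Toy metric registry: gauges are name -> supplier. Registering one
// supplier under two names costs nothing and buys a migration window.
class MetricRenameSketch {
    private final Map<String, LongSupplier> metrics = new HashMap<>();

    void gauge(String name, LongSupplier supplier) {
        metrics.put(name, supplier);
    }

    long read(String name) {
        return metrics.get(name).getAsLong();
    }

    public static void main(String[] args) {
        MetricRenameSketch registry = new MetricRenameSketch();
        final long[] fullRestarts = {3};
        LongSupplier restarts = () -> fullRestarts[0];
        registry.gauge("fullRestarts", restarts);     // legacy name, kept for compat
        registry.gauge("numFullRestarts", restarts);  // new, consistently prefixed name
        System.out.println(registry.read("fullRestarts") == registry.read("numFullRestarts"));
    }
}
```

Both names always report the same value, so existing alerting keeps working while users migrate to the new name.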
[jira] [Updated] (FLINK-7058) flink-scala-shell unintended dependencies for scala 2.11
[ https://issues.apache.org/jira/browse/FLINK-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-7058: Priority: Critical (was: Blocker)

> flink-scala-shell unintended dependencies for scala 2.11
> --------------------------------------------------------
>
> Key: FLINK-7058
> URL: https://issues.apache.org/jira/browse/FLINK-7058
> Project: Flink
> Issue Type: Bug
> Components: Build System
> Affects Versions: 1.3.0, 1.3.1
> Reporter: Piotr Nowojski
> Assignee: Piotr Nowojski
> Priority: Critical
> Fix For: 1.4.0, 1.3.2
>
> Activation of the profile scala-2.10 in `flink-scala-shell` and `flink-scala` does
> not work as intended.
> {code:xml}
> <profile>
>   <id>scala-2.10</id>
>   <activation>
>     <property>
>       <name>!scala-2.11</name>
>     </property>
>   </activation>
>   <dependencies>
>     <dependency>
>       <groupId>org.scalamacros</groupId>
>       <artifactId>quasiquotes_2.10</artifactId>
>       <version>${scala.macros.version}</version>
>     </dependency>
>     <dependency>
>       <groupId>org.scala-lang</groupId>
>       <artifactId>jline</artifactId>
>       <version>2.10.4</version>
>     </dependency>
>   </dependencies>
> </profile>
> {code}
> This activation IMO has nothing to do with the `-Pscala-2.11` profile switch
> used in our build. "Properties" are defined by `-Dproperty` switches. As far
> as I understand, those additional dependencies would be added only if
> nobody defined a property named `scala-2.11`, which means they would be added
> only if the switch `-Dscala-2.11` was not used — so those
> dependencies were basically always added. This quick test proves that I'm
> correct:
> {code:bash}
> $ mvn dependency:tree -pl flink-scala | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> $ mvn dependency:tree -pl flink-scala -Pscala-2.11 | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> $ mvn dependency:tree -pl flink-scala -Pscala-2.10 | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> {code}
> Regardless of the selected profile, those dependencies are always there.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7178) Datadog Metric Reporter Jar is Lacking Dependencies
[ https://issues.apache.org/jira/browse/FLINK-7178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090369#comment-16090369 ] Chesnay Schepler commented on FLINK-7178: - 1.3: 3c0f38369000ed4a1a5f16140e7f88770a10057d

> Datadog Metric Reporter Jar is Lacking Dependencies
> ---------------------------------------------------
>
> Key: FLINK-7178
> URL: https://issues.apache.org/jira/browse/FLINK-7178
> Project: Flink
> Issue Type: Bug
> Components: Metrics
> Affects Versions: 1.3.1, 1.4.0
> Reporter: Elias Levy
> Assignee: Chesnay Schepler
> Priority: Blocker
> Fix For: 1.4.0, 1.3.2
>
> The Datadog metric reporter has dependencies on {{com.squareup.okhttp3}} and
> {{com.squareup.okio}}. It appears there was an attempt to use the Maven Shade
> plug-in to relocate these classes to {{org.apache.flink.shaded.okhttp3}} and
> {{org.apache.flink.shaded.okio}} during packaging. Alas, the shaded classes
> are not included in the {{flink-metrics-datadog-1.3.1.jar}} released to Maven
> Central. Using the jar results in an error when the JobManager or
> TaskManager starts up because of the missing dependencies.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (FLINK-7178) Datadog Metric Reporter Jar is Lacking Dependencies
[ https://issues.apache.org/jira/browse/FLINK-7178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-7178: Priority: Critical (was: Blocker)

> Datadog Metric Reporter Jar is Lacking Dependencies
> ---------------------------------------------------
>
> Key: FLINK-7178
> URL: https://issues.apache.org/jira/browse/FLINK-7178
> Project: Flink
> Issue Type: Bug
> Components: Metrics
> Affects Versions: 1.3.1, 1.4.0
> Reporter: Elias Levy
> Assignee: Chesnay Schepler
> Priority: Critical
> Fix For: 1.4.0, 1.3.2
>
> The Datadog metric reporter has dependencies on {{com.squareup.okhttp3}} and
> {{com.squareup.okio}}. It appears there was an attempt to use the Maven Shade
> plug-in to relocate these classes to {{org.apache.flink.shaded.okhttp3}} and
> {{org.apache.flink.shaded.okio}} during packaging. Alas, the shaded classes
> are not included in the {{flink-metrics-datadog-1.3.1.jar}} released to Maven
> Central. Using the jar results in an error when the JobManager or
> TaskManager starts up because of the missing dependencies.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7058) flink-scala-shell unintended dependencies for scala 2.11
[ https://issues.apache.org/jira/browse/FLINK-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090368#comment-16090368 ] Chesnay Schepler commented on FLINK-7058: - 1.3: 09a4a4bdfb03887387d47f366193d1216a66257c

> flink-scala-shell unintended dependencies for scala 2.11
> --------------------------------------------------------
>
> Key: FLINK-7058
> URL: https://issues.apache.org/jira/browse/FLINK-7058
> Project: Flink
> Issue Type: Bug
> Components: Build System
> Affects Versions: 1.3.0, 1.3.1
> Reporter: Piotr Nowojski
> Assignee: Piotr Nowojski
> Priority: Blocker
> Fix For: 1.4.0, 1.3.2
>
> Activation of the profile scala-2.10 in `flink-scala-shell` and `flink-scala` does
> not work as intended.
> {code:xml}
> <profile>
>   <id>scala-2.10</id>
>   <activation>
>     <property>
>       <name>!scala-2.11</name>
>     </property>
>   </activation>
>   <dependencies>
>     <dependency>
>       <groupId>org.scalamacros</groupId>
>       <artifactId>quasiquotes_2.10</artifactId>
>       <version>${scala.macros.version}</version>
>     </dependency>
>     <dependency>
>       <groupId>org.scala-lang</groupId>
>       <artifactId>jline</artifactId>
>       <version>2.10.4</version>
>     </dependency>
>   </dependencies>
> </profile>
> {code}
> This activation IMO has nothing to do with the `-Pscala-2.11` profile switch
> used in our build. "Properties" are defined by `-Dproperty` switches. As far
> as I understand, those additional dependencies would be added only if
> nobody defined a property named `scala-2.11`, which means they would be added
> only if the switch `-Dscala-2.11` was not used — so those
> dependencies were basically always added. This quick test proves that I'm
> correct:
> {code:bash}
> $ mvn dependency:tree -pl flink-scala | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> $ mvn dependency:tree -pl flink-scala -Pscala-2.11 | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> $ mvn dependency:tree -pl flink-scala -Pscala-2.10 | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> {code}
> Regardless of the selected profile, those dependencies are always there.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7118) Remove hadoop1.x code in HadoopUtils
[ https://issues.apache.org/jira/browse/FLINK-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090362#comment-16090362 ] ASF GitHub Bot commented on FLINK-7118: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/4285

Since these utility methods are now so simple, I think it makes sense to inline them in the two places where they are called. Then we could also get rid of the extra exception catch blocks and avoid the extra wrapping into RuntimeExceptions.

> Remove hadoop1.x code in HadoopUtils
> ------------------------------------
>
> Key: FLINK-7118
> URL: https://issues.apache.org/jira/browse/FLINK-7118
> Project: Flink
> Issue Type: Improvement
> Components: Java API
> Reporter: mingleizhang
> Assignee: mingleizhang
>
> Since flink no longer supports hadoop 1.x versions, we should remove this code. The code
> below resides in {{org.apache.flink.api.java.hadoop.mapred.utils.HadoopUtils}}:
> {code:java}
> public static JobContext instantiateJobContext(Configuration configuration, JobID jobId) throws Exception {
> 	try {
> 		Class clazz;
> 		// for Hadoop 1.xx
> 		if (JobContext.class.isInterface()) {
> 			clazz = Class.forName("org.apache.hadoop.mapreduce.task.JobContextImpl", true, Thread.currentThread().getContextClassLoader());
> 		}
> 		// for Hadoop 2.xx
> 		else {
> 			clazz = Class.forName("org.apache.hadoop.mapreduce.JobContext", true, Thread.currentThread().getContextClassLoader());
> 		}
> 		Constructor constructor = clazz.getConstructor(Configuration.class, JobID.class);
> 		JobContext context = (JobContext) constructor.newInstance(configuration, jobId);
>
> 		return context;
> 	} catch (Exception e) {
> 		throw new Exception("Could not create instance of JobContext.");
> 	}
> }
> {code}
> And:
> {code:java}
> public static TaskAttemptContext instantiateTaskAttemptContext(Configuration configuration, TaskAttemptID taskAttemptID) throws Exception {
> 	try {
> 		Class clazz;
> 		// for Hadoop 1.xx
> 		if (JobContext.class.isInterface()) {
> 			clazz = Class.forName("org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl");
> 		}
> 		// for Hadoop 2.xx
> 		else {
> 			clazz = Class.forName("org.apache.hadoop.mapreduce.TaskAttemptContext");
> 		}
> 		Constructor constructor = clazz.getConstructor(Configuration.class, TaskAttemptID.class);
> 		TaskAttemptContext context = (TaskAttemptContext) constructor.newInstance(configuration, taskAttemptID);
>
> 		return context;
> 	} catch (Exception e) {
> 		throw new Exception("Could not create instance of TaskAttemptContext.");
> 	}
> }
> {code}

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
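With the Hadoop 1.x branch gone, the suggested inlining collapses each reflective two-way factory into a direct constructor call at the call site. The sketch below uses stand-in classes (not the real Hadoop types) to show the resulting shape self-contained:

```java
// Self-contained illustration of the inlining Stephan suggests: no
// Class.forName, no Constructor lookup, no catch-and-rewrap — just a
// direct call to the Hadoop 2.x constructor. JobContextImpl here is a
// stand-in, not org.apache.hadoop.mapreduce.task.JobContextImpl.
class InlineFactorySketch {

    static class Configuration { }

    static class JobID {
        final String id;
        JobID(String id) { this.id = id; }
    }

    static class JobContextImpl {
        final Configuration conf;
        final JobID jobId;
        JobContextImpl(Configuration conf, JobID jobId) {
            this.conf = conf;
            this.jobId = jobId;
        }
    }

    // Before: reflection over two possible class names plus a generic
    // "Could not create instance" wrapper. After: one line, statically typed.
    static JobContextImpl createJobContext(Configuration conf, JobID jobId) {
        return new JobContextImpl(conf, jobId);
    }

    public static void main(String[] args) {
        JobContextImpl ctx = createJobContext(new Configuration(), new JobID("job_1"));
        System.out.println(ctx.jobId.id);
    }
}
```

Direct construction also restores compile-time checking that the reflective version gave up, which is why the extra exception-wrapping becomes unnecessary.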
[GitHub] flink issue #4285: [FLINK-7118] [hadoop] Remove hadoop1.x code in HadoopUtil...
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/4285 Since these utility methods are now so simple, I think it makes sense to inline them in the two places where they are called. Then we could also get rid of the extra exception catch blocks and avoid the extra wrapping into RuntimeExceptions. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-6493) Ineffective null check in RegisteredOperatorBackendStateMetaInfo#equals()
[ https://issues.apache.org/jira/browse/FLINK-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090349#comment-16090349 ] ASF GitHub Bot commented on FLINK-6493: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/4328

Good fix, thanks! I would suggest to improve the `equals(...)` method altogether by pulling out the repeated casts to `Snapshot`.

> Ineffective null check in RegisteredOperatorBackendStateMetaInfo#equals()
> -------------------------------------------------------------------------
>
> Key: FLINK-6493
> URL: https://issues.apache.org/jira/browse/FLINK-6493
> Project: Flink
> Issue Type: Bug
> Components: State Backends, Checkpointing
> Reporter: Ted Yu
> Assignee: mingleizhang
> Priority: Minor
>
> {code}
> && ((partitionStateSerializer == null && ((Snapshot) obj).getPartitionStateSerializer() == null)
> 	|| partitionStateSerializer.equals(((Snapshot) obj).getPartitionStateSerializer()))
> && ((partitionStateSerializerConfigSnapshot == null && ((Snapshot) obj).getPartitionStateSerializerConfigSnapshot() == null)
> 	|| partitionStateSerializerConfigSnapshot.equals(((Snapshot) obj).getPartitionStateSerializerConfigSnapshot()));
> {code}
> The null check for partitionStateSerializer / partitionStateSerializerConfigSnapshot
> is in combination with another clause. This may lead to an NPE in the
> partitionStateSerializer.equals() call.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink issue #4328: [FLINK-6493] Fix ineffective null check in RegisteredOper...
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/4328 Good fix, thanks! I would suggest to improve the `equals(...)` method altogether by pulling out the repeated casts to `Snapshot`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
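The suggested cleanup — cast once, then compare null-safely — can be sketched like this. `Objects.equals` covers both the "both null" and the "left side null" cases that the hand-rolled check in the issue missed (the field names mirror the issue; the class itself is a simplified stand-in):

```java
import java.util.Objects;

// Null-safe equals: one cast pulled into a local, Objects.equals per field.
// The original code NPE'd when the left-hand serializer was null but the
// right-hand one was not, because the null check only guarded "both null".
class NullSafeEqualsSketch {
    final String partitionStateSerializer;
    final String serializerConfigSnapshot;

    NullSafeEqualsSketch(String serializer, String snapshot) {
        this.partitionStateSerializer = serializer;
        this.serializerConfigSnapshot = snapshot;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof NullSafeEqualsSketch)) {
            return false;
        }
        NullSafeEqualsSketch other = (NullSafeEqualsSketch) obj; // cast once
        return Objects.equals(partitionStateSerializer, other.partitionStateSerializer)
            && Objects.equals(serializerConfigSnapshot, other.serializerConfigSnapshot);
    }

    @Override
    public int hashCode() {
        return Objects.hash(partitionStateSerializer, serializerConfigSnapshot);
    }

    public static void main(String[] args) {
        // The original hand-rolled check would throw NPE on this comparison.
        System.out.println(new NullSafeEqualsSketch(null, "s")
            .equals(new NullSafeEqualsSketch("ser", "s")));
    }
}
```

`Objects.equals(a, b)` returns `a == b` when either side is null, so no explicit null guard is needed per field.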
[jira] [Commented] (FLINK-7176) Failed builds (due to compilation) don't upload logs
[ https://issues.apache.org/jira/browse/FLINK-7176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090344#comment-16090344 ] ASF GitHub Bot commented on FLINK-7176: --- Github user zentol commented on a diff in the pull request: https://github.com/apache/flink/pull/4329#discussion_r127794837 --- Diff: tools/travis_mvn_watchdog.sh --- @@ -225,9 +226,14 @@ echo "MVN exited with EXIT CODE: ${EXIT_CODE}." rm $MVN_PID rm $MVN_EXIT -check_shaded_artifacts - -put_yarn_logs_to_artifacts --- End diff -- ohh...no that wasn't on purpose, must've happened during rebase. > Failed builds (due to compilation) don't upload logs > > > Key: FLINK-7176 > URL: https://issues.apache.org/jira/browse/FLINK-7176 > Project: Flink > Issue Type: Bug > Components: Travis >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler > Fix For: 1.3.0, 1.4.0 > > > If the compile phase fails on travis {{flink-dist}} may not be created. This > causes the check for the inclusion of snappy in {{flink-dist}} to fail. > The function doing this check calls {{exit 1}} on error, which exits the > entire shell, thus skipping subsequent actions like the upload of logs. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink issue #4325: [hotfix] [hadoopCompat] Fix tests to verify results new H...
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/4325 +1, please merge! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #4329: [FLINK-7176] [travis] Improve error handling
Github user zentol commented on a diff in the pull request: https://github.com/apache/flink/pull/4329#discussion_r127794837 --- Diff: tools/travis_mvn_watchdog.sh --- @@ -225,9 +226,14 @@ echo "MVN exited with EXIT CODE: ${EXIT_CODE}." rm $MVN_PID rm $MVN_EXIT -check_shaded_artifacts - -put_yarn_logs_to_artifacts --- End diff -- ohh...no that wasn't on purpose, must've happened during rebase. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-6105) Properly handle InterruptedException in HadoopInputFormatBase
[ https://issues.apache.org/jira/browse/FLINK-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090343#comment-16090343 ] ASF GitHub Bot commented on FLINK-6105: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/4316 We could make this change. I have not seen a lot of use of `InterruptedIOException`, probably because it is a bit of a strange class, with its public mutable int field. I am +/- 0 on this. Do you have a concrete case where this change would lead to a benefit? > Properly handle InterruptedException in HadoopInputFormatBase > - > > Key: FLINK-6105 > URL: https://issues.apache.org/jira/browse/FLINK-6105 > Project: Flink > Issue Type: Bug > Components: DataStream API >Reporter: Ted Yu >Assignee: mingleizhang > > When catching InterruptedException, we should throw InterruptedIOException > instead of IOException. > The following example is from HadoopInputFormatBase : > {code} > try { > splits = this.mapreduceInputFormat.getSplits(jobContext); > } catch (InterruptedException e) { > throw new IOException("Could not get Splits.", e); > } > {code} > There may be other places where IOE is thrown. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink issue #4316: [FLINK-6105] Use InterruptedIOException instead of IOExce...
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/4316 We could make this change. I have not seen a lot of use of `InterruptedIOException`, probably because it is a bit of a strange class, with its public mutable int field. I am +/- 0 on this. Do you have a concrete case where this change would lead to a benefit? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
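The proposed change can be sketched as follows. `blockingCall` is a hypothetical stand-in for `mapreduceInputFormat.getSplits(...)`; restoring the thread's interrupt flag is a common companion to this translation, though the PR itself only discusses the exception type:

```java
import java.io.IOException;
import java.io.InterruptedIOException;

// Translate InterruptedException into InterruptedIOException (a subclass
// of IOException) instead of a plain IOException, so callers can still
// distinguish interruption, and re-set the interrupt flag on the way out.
class InterruptTranslationSketch {

    static void getSplits() throws IOException {
        try {
            blockingCall();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt observable
            InterruptedIOException ioe = new InterruptedIOException("Could not get splits.");
            ioe.initCause(e);
            throw ioe;
        }
    }

    // Hypothetical blocking call that gets interrupted.
    static void blockingCall() throws InterruptedException {
        throw new InterruptedException("interrupted while fetching splits");
    }

    public static void main(String[] args) {
        try {
            getSplits();
        } catch (IOException e) {
            // Existing callers that only catch IOException keep working;
            // interested callers can now check for InterruptedIOException.
            System.out.println(e instanceof InterruptedIOException);
        }
        System.out.println(Thread.interrupted()); // interrupt flag was restored
    }
}
```

Because `InterruptedIOException extends IOException`, the method signature and all existing catch sites are unchanged, which is the compatibility argument for this kind of refinement.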
[GitHub] flink pull request #4311: [FLINK-7162] [test] Tests should not write outside...
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/4311#discussion_r127793662 --- Diff: flink-runtime/src/test/java/org/apache/flink/runtime/jobmanager/JobManagerStartupTest.java --- @@ -51,11 +54,14 @@ private File blobStorageDirectory; + @Rule + public TemporaryFolder temporaryFolder = new TemporaryFolder(); --- End diff -- Same here... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-7162) Tests should not write outside 'target' directory.
[ https://issues.apache.org/jira/browse/FLINK-7162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090335#comment-16090335 ] ASF GitHub Bot commented on FLINK-7162: --- Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/4311#discussion_r127793662 --- Diff: flink-runtime/src/test/java/org/apache/flink/runtime/jobmanager/JobManagerStartupTest.java --- @@ -51,11 +54,14 @@ private File blobStorageDirectory; + @Rule + public TemporaryFolder temporaryFolder = new TemporaryFolder(); --- End diff -- Same here... > Tests should not write outside 'target' directory. > -- > > Key: FLINK-7162 > URL: https://issues.apache.org/jira/browse/FLINK-7162 > Project: Flink > Issue Type: Improvement > Components: Local Runtime >Reporter: mingleizhang >Assignee: mingleizhang > > A few tests use Files.createTempDir() from Guava package, but do not set > java.io.tmpdir system property. Thus the temp directory is created in > unpredictable places and is not being cleaned up by {{mvn clean}}. > This was probably introduced in {{JobManagerStartupTest}} and then replicated > in {{BlobUtilsTest}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7162) Tests should not write outside 'target' directory.
[ https://issues.apache.org/jira/browse/FLINK-7162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090333#comment-16090333 ] ASF GitHub Bot commented on FLINK-7162: --- Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/4311#discussion_r127793846 --- Diff: flink-runtime/src/test/java/org/apache/flink/runtime/blob/BlobUtilsTest.java --- @@ -22,26 +22,29 @@ import static org.junit.Assume.assumeTrue; import static org.mockito.Mockito.mock; -import com.google.common.io.Files; - import org.apache.flink.util.OperatingSystem; import org.junit.After; import org.junit.Before; import org.junit.Test; +import org.junit.Rule; import java.io.File; import java.io.IOException; +import org.junit.rules.TemporaryFolder; public class BlobUtilsTest { private final static String CANNOT_CREATE_THIS = "cannot-create-this"; private File blobUtilsTestDirectory; + @Rule + public TemporaryFolder temporaryFolder = new TemporaryFolder(); + @Before - public void before() { + public void before() throws IOException { // Prepare test directory - blobUtilsTestDirectory = Files.createTempDir(); + blobUtilsTestDirectory = temporaryFolder.newFolder(); --- End diff -- Minor issue: Would be nice to move this after the OS check, keep the directory operations logically together. > Tests should not write outside 'target' directory. > -- > > Key: FLINK-7162 > URL: https://issues.apache.org/jira/browse/FLINK-7162 > Project: Flink > Issue Type: Improvement > Components: Local Runtime >Reporter: mingleizhang >Assignee: mingleizhang > > A few tests use Files.createTempDir() from Guava package, but do not set > java.io.tmpdir system property. Thus the temp directory is created in > unpredictable places and is not being cleaned up by {{mvn clean}}. > This was probably introduced in {{JobManagerStartupTest}} and then replicated > in {{BlobUtilsTest}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7162) Tests should not write outside 'target' directory.
[ https://issues.apache.org/jira/browse/FLINK-7162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090334#comment-16090334 ] ASF GitHub Bot commented on FLINK-7162: --- Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/4311#discussion_r127793602 --- Diff: flink-runtime/src/test/java/org/apache/flink/runtime/blob/BlobUtilsTest.java --- @@ -22,26 +22,29 @@ import static org.junit.Assume.assumeTrue; import static org.mockito.Mockito.mock; -import com.google.common.io.Files; - import org.apache.flink.util.OperatingSystem; import org.junit.After; import org.junit.Before; import org.junit.Test; +import org.junit.Rule; import java.io.File; import java.io.IOException; +import org.junit.rules.TemporaryFolder; public class BlobUtilsTest { private final static String CANNOT_CREATE_THIS = "cannot-create-this"; private File blobUtilsTestDirectory; + @Rule + public TemporaryFolder temporaryFolder = new TemporaryFolder(); --- End diff -- I think it is good practice to make such variables `final`. > Tests should not write outside 'target' directory. > -- > > Key: FLINK-7162 > URL: https://issues.apache.org/jira/browse/FLINK-7162 > Project: Flink > Issue Type: Improvement > Components: Local Runtime >Reporter: mingleizhang >Assignee: mingleizhang > > A few tests use Files.createTempDir() from Guava package, but do not set > java.io.tmpdir system property. Thus the temp directory is created in > unpredictable places and is not being cleaned up by {{mvn clean}}. > This was probably introduced in {{JobManagerStartupTest}} and then replicated > in {{BlobUtilsTest}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
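The review comments above concern replacing Guava's `Files.createTempDir()` with JUnit's `TemporaryFolder` rule so that test directories land in a predictable place and get cleaned up. As a plain-Java sketch of what the rule provides (a hypothetical illustration using only `java.nio.file`, not the actual Flink test or the JUnit rule itself):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TempDirDemo {
    // Hypothetical sketch: create a temp directory under a known parent and
    // guarantee cleanup afterwards, which is essentially what JUnit's
    // TemporaryFolder @Rule automates around each test method.
    static boolean runInTempDir(Path parent) throws IOException {
        Path dir = Files.createTempDirectory(parent, "blob-utils-test");
        try {
            // ... the test body would use `dir` here ...
            return Files.isDirectory(dir);
        } finally {
            Files.deleteIfExists(dir); // TemporaryFolder also deletes recursively
        }
    }

    public static void main(String[] args) throws IOException {
        Path parent = Files.createTempDirectory("demo-parent");
        System.out.println(runInTempDir(parent)); // prints "true"
        Files.deleteIfExists(parent);
    }
}
```

With the `@Rule` version in the diff, `temporaryFolder.newFolder()` plays the role of `createTempDirectory` and the rule's teardown plays the role of the `finally` block, so the test itself needs no cleanup code.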
[jira] [Commented] (FLINK-7141) enable travis cache again
[ https://issues.apache.org/jira/browse/FLINK-7141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090329#comment-16090329 ] ASF GitHub Bot commented on FLINK-7141: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/4293 @zentol Do you want to merge and validate this as part of your ongoing build optimization project? > enable travis cache again > - > > Key: FLINK-7141 > URL: https://issues.apache.org/jira/browse/FLINK-7141 > Project: Flink > Issue Type: Improvement > Components: Build System, Travis >Affects Versions: 1.4.0 >Reporter: Nico Kruber >Assignee: Nico Kruber > > In the past, we had some troubles with the travis cache but in general it may > be a good idea to include it again to speed up build times by reducing the > time the maven downloads take. > This time, we should also deal with corrupt files in the maven repository and > [tune > travis|https://docs.travis-ci.com/user/caching/#Caches-and-build-matrices] so > that it does not create corrupt caches in the first place. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6980) TypeExtractor.getForObject can't get typeinfo correctly.
[ https://issues.apache.org/jira/browse/FLINK-6980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090318#comment-16090318 ] Stephan Ewen commented on FLINK-6980: - Thanks for reporting this. [~twalthr] This would be your turf - do you have a chance to look at this? Or give [~sihuazhou] a pointer about how to create a patch for this? > TypeExtractor.getForObject can't get typeinfo correctly. > > > Key: FLINK-6980 > URL: https://issues.apache.org/jira/browse/FLINK-6980 > Project: Flink > Issue Type: Bug > Components: DataStream API >Affects Versions: 1.3.0 >Reporter: Sihua Zhou >Priority: Minor > > Here is my class definition: > _class MyRecord extends Row implements Retracting, Value {}_ > When I use it like below, it just throws a type cast error: > java.lang.ClassCastException: org.apache.flink.types.Row cannot be cast to > org.apache.flink.types.Value > MyRecord[] recordList = new MyRecord[6]; > DataStream dataStream = env.fromElements(recordList); > //MyFilter 's input arg type is MyRecord. > dataStream.flatMap(new MyFilter()).returns(MyRecord.class).print(); > I found this is because the TypeExtractor.getForObject called in > env.fromElements() can't get the > element's type correctly, while TypeExtractor.getForObject works correctly in > flink 1.2.0. > I know this problem can be solved by using env.fromElement(MyRecord.class, > recordList) instead, I just want to know whether this is a bug or not? Why does it > work correctly in 1.2.0 but not in 1.3.0? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7176) Failed builds (due to compilation) don't upload logs
[ https://issues.apache.org/jira/browse/FLINK-7176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090302#comment-16090302 ] ASF GitHub Bot commented on FLINK-7176: --- Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/4329#discussion_r127788859 --- Diff: tools/travis_mvn_watchdog.sh --- @@ -225,9 +226,14 @@ echo "MVN exited with EXIT CODE: ${EXIT_CODE}." rm $MVN_PID rm $MVN_EXIT -check_shaded_artifacts - -put_yarn_logs_to_artifacts --- End diff -- This is removed here, is that on purpose? Are the yarn logs now handled differently? > Failed builds (due to compilation) don't upload logs > > > Key: FLINK-7176 > URL: https://issues.apache.org/jira/browse/FLINK-7176 > Project: Flink > Issue Type: Bug > Components: Travis >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler > Fix For: 1.3.0, 1.4.0 > > > If the compile phase fails on travis {{flink-dist}} may not be created. This > causes the check for the inclusion of snappy in {{flink-dist}} to fail. > The function doing this check calls {{exit 1}} on error, which exits the > entire shell, thus skipping subsequent actions like the upload of logs. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7201) ConcurrentModificationException in JobLeaderIdService
[ https://issues.apache.org/jira/browse/FLINK-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090301#comment-16090301 ] ASF GitHub Bot commented on FLINK-7201: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/4347 @XuPingyong Can you give us a bit of context for the review? From the initial exception I would expect that there is something that also needs to be addressed in the `JobLeaderIdService` class... > ConcurrentModificationException in JobLeaderIdService > - > > Key: FLINK-7201 > URL: https://issues.apache.org/jira/browse/FLINK-7201 > Project: Flink > Issue Type: Bug > Components: JobManager >Reporter: Xu Pingyong >Assignee: Xu Pingyong > Labels: flip-6 > > {code:java} > java.util.ConcurrentModificationException: null > at java.util.HashMap$HashIterator.nextEntry(HashMap.java:922) > at java.util.HashMap$ValueIterator.next(HashMap.java:950) > at > org.apache.flink.runtime.resourcemanager.JobLeaderIdService.clear(JobLeaderIdService.java:114) > at > org.apache.flink.runtime.resourcemanager.JobLeaderIdService.stop(JobLeaderIdService.java:92) > at > org.apache.flink.runtime.resourcemanager.ResourceManager.shutDown(ResourceManager.java:200) > at > org.apache.flink.runtime.resourcemanager.ResourceManagerRunner.shutDownInternally(ResourceManagerRunner.java:102) > at > org.apache.flink.runtime.resourcemanager.ResourceManagerRunner.shutDown(ResourceManagerRunner.java:97) > at > org.apache.flink.runtime.minicluster.MiniCluster.shutdownInternally(MiniCluster.java:329) > at > org.apache.flink.runtime.minicluster.MiniCluster.shutdown(MiniCluster.java:297) > at > org.apache.flink.runtime.minicluster.MiniClusterITCase.runJobWithMultipleJobManagers(MiniClusterITCase.java:85) > {code} > Because the jobLeaderIdService is stopped before the rpcService when the > resourceManager shuts down, the jobLeaderIdService is at risk of unsynchronized concurrent access. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
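The stack trace above shows `HashMap$HashIterator` failing inside `JobLeaderIdService.clear()`. As a hypothetical, single-threaded sketch (the actual FLINK-7201 failure is a race between two threads, which a plain `HashMap` merely detects and fails fast on), the same exception class can be reproduced and avoided like this; `ClearWhileIteratingDemo` and its map contents are made up for illustration:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class ClearWhileIteratingDemo {
    // Returns true iff the unsafe pattern threw a ConcurrentModificationException
    // and the safe pattern emptied the map.
    static boolean demo() {
        Map<String, String> leaders = new HashMap<>();
        leaders.put("job-1", "leader-a");
        leaders.put("job-2", "leader-b");

        boolean threw = false;
        try {
            for (String ignored : leaders.values()) {
                leaders.remove("job-1"); // structural change outside the iterator
            }
        } catch (ConcurrentModificationException e) {
            threw = true; // HashMap fails fast on the next iterator step
        }

        // Safe variant: remove entries through the iterator itself.
        Iterator<String> it = leaders.values().iterator();
        while (it.hasNext()) {
            it.next();
            it.remove();
        }
        return threw && leaders.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "true"
    }
}
```

Fail-fast detection only guards single-threaded misuse; the cross-thread race in the report needs actual synchronization (or confining `clear()` to the service's own thread), which is why the PR discussion above asks for more context.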
[jira] [Commented] (FLINK-7212) JobManagerLeaderSessionIDITSuite not executed
[ https://issues.apache.org/jira/browse/FLINK-7212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090255#comment-16090255 ] ASF GitHub Bot commented on FLINK-7212: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/4354 +1, please merge to `master`... > JobManagerLeaderSessionIDITSuite not executed > - > > Key: FLINK-7212 > URL: https://issues.apache.org/jira/browse/FLINK-7212 > Project: Flink > Issue Type: Bug > Components: JobManager, Tests >Affects Versions: 1.4.0 >Reporter: Nico Kruber >Assignee: Nico Kruber > > {{JobManagerLeaderSessionIDITSuite}} is currently not executed due to its > naming scheme. Only {{*ITCase}} and {{*Test}} classes are run, except for > inside {{flink-ml}} which adds more patters to the {{scalatest}} plugin. > Also, {{JobManagerLeaderSessionIDITSuite}} needs to be adapted slightly so > that it runs successfully. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7215) Typo in FAQ page
Bowen Li created FLINK-7215: --- Summary: Typo in FAQ page Key: FLINK-7215 URL: https://issues.apache.org/jira/browse/FLINK-7215 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 1.3.0 Reporter: Bowen Li Assignee: Bowen Li Priority: Trivial Fix For: 1.4.0 In section 'How do I assess the progress of a Flink program' at https://flink.apache.org/faq.html#usage, the sentence should be "*It* runs on port 8081 by default (configured in conf/flink-config.yml)." rather than "*In* runs on port 8081 by default (configured in conf/flink-config.yml)." -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6940) Clarify the effect of configuring per-job state backend
[ https://issues.apache.org/jira/browse/FLINK-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090157#comment-16090157 ] ASF GitHub Bot commented on FLINK-6940: --- Github user bowenli86 commented on the issue: https://github.com/apache/flink/pull/4136 @zentol @alpinegizmo Let me know your thoughts on it > Clarify the effect of configuring per-job state backend > > > Key: FLINK-6940 > URL: https://issues.apache.org/jira/browse/FLINK-6940 > Project: Flink > Issue Type: Improvement > Components: Documentation >Affects Versions: 1.3.0 >Reporter: Bowen Li >Assignee: Bowen Li >Priority: Minor > Fix For: 1.4.0 > > > The documentation of having different options configuring flink state backend > is confusing. We should add explicit doc explaining configuring a per-job > flink state backend in code will overwrite any default state backend > configured in flink-conf.yaml -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7143) Partition assignment for Kafka consumer is not stable
[ https://issues.apache.org/jira/browse/FLINK-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090113#comment-16090113 ] ASF GitHub Bot commented on FLINK-7143: --- Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/4301 Oye, this is more complicated than I thought. On `release-1.3` the assignment actually works if the Kafka brokers always return the partitions in the same order. The reason is that the assignment of partitions and the assignment of operator state (in `RoundRobinOperatorStateRepartitioner`) is aligned. This meant that it's not a problem when sources think that they are "fresh" (not restored from state) because they didn't get any state. If they tried to assign a partition to themselves this would also mean that they have the state for that (again, because partition assignment and operator state assignment are aligned). This PR breaks the alignment because the `startIndex` is not necessarily `0`. However, this is not caught by any tests because the `StateAssignmentOperation` has an optimisation where it doesn't repartition operator state if the parallelism doesn't change. If we deactivate that optimisation by turning this line into `if (true)`: https://github.com/apache/flink/blob/b1f762127234e323b947aa4a363935f87be1994f/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperation.java#L561-L561 the test in Kafka09ITCase will fail. The fix is to properly forward the information of whether we're restored in `initializeState()`, I did a commit for that: https://github.com/aljoscha/flink/tree/finish-pr-4301-kafka-13-fixings. The problem is that it is not easy to change the tests to catch this bug. I think an ITCase that uses Kafka and does a savepoint and rescaling would do the trick. 
> Partition assignment for Kafka consumer is not stable > - > > Key: FLINK-7143 > URL: https://issues.apache.org/jira/browse/FLINK-7143 > Project: Flink > Issue Type: Bug > Components: Kafka Connector >Affects Versions: 1.3.1 >Reporter: Steven Zhen Wu >Assignee: Tzu-Li (Gordon) Tai >Priority: Blocker > Fix For: 1.3.2 > > > while deploying Flink 1.3 release to hundreds of routing jobs, we found some > issues with partition assignment for Kafka consumer. some partitions weren't > assigned and some partitions got assigned more than once. > Here is the bug introduced in Flink 1.3. > {code} > protected static void initializeSubscribedPartitionsToStartOffsets(...) > { > ... > for (int i = 0; i < kafkaTopicPartitions.size(); i++) { > if (i % numParallelSubtasks == indexOfThisSubtask) { > if (startupMode != > StartupMode.SPECIFIC_OFFSETS) { > > subscribedPartitionsToStartOffsets.put(kafkaTopicPartitions.get(i), > startupMode.getStateSentinel()); > } > ... > } > {code} > The bug is using array index {{i}} to mod against {{numParallelSubtasks}}. if > the {{kafkaTopicPartitions}} has different order among different subtasks, > assignment is not stable cross subtasks and creates the assignment issue > mentioned earlier. > fix is also very simple, we should use partitionId to do the mod {{if > (kafkaTopicPartitions.get\(i\).getPartition() % numParallelSubtasks == > indexOfThisSubtask)}}. That would result in stable assignment cross subtasks > that is independent of ordering in the array. > marking it as blocker because of its impact. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
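The fix the reporter describes (mod by partition id rather than by array index) can be sketched in isolation. This is a simplified illustration, not Flink's actual code: `assign` and the use of bare integer ids in place of `KafkaTopicPartition` objects are assumptions made for brevity.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionAssignmentDemo {
    // Stable assignment: each subtask picks partitions by partition id mod
    // numSubtasks, so the result is independent of the order in which the
    // brokers happened to return the partition list.
    static List<Integer> assign(List<Integer> partitionIds, int numSubtasks, int subtaskIndex) {
        List<Integer> mine = new ArrayList<>();
        for (int id : partitionIds) {
            if (id % numSubtasks == subtaskIndex) { // depends only on the id
                mine.add(id);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        // Two subtasks see the same partitions in different orders; the
        // resulting assignment is nevertheless identical and non-overlapping.
        List<Integer> orderA = List.of(0, 1, 2, 3);
        List<Integer> orderB = List.of(3, 2, 1, 0);
        System.out.println(assign(orderA, 2, 0)); // [0, 2]
        System.out.println(assign(orderB, 2, 1)); // [3, 1]
    }
}
```

The buggy 1.3 code instead tested `i % numParallelSubtasks`, where `i` is the list index: with `orderA` and `orderB` above, subtask 0 would claim {0, 2} on one TaskManager and {3, 1} on another, producing exactly the duplicated and missing partitions the reporter observed.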
[jira] [Updated] (FLINK-7214) Add a sink that writes to ORCFile on HDFS
[ https://issues.apache.org/jira/browse/FLINK-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske updated FLINK-7214: - Component/s: (was: Batch Connectors and Input/Output Formats) Streaming Connectors > Add a sink that writes to ORCFile on HDFS > - > > Key: FLINK-7214 > URL: https://issues.apache.org/jira/browse/FLINK-7214 > Project: Flink > Issue Type: New Feature > Components: Streaming Connectors >Reporter: Robert Rapplean >Priority: Minor > Labels: features, hdfssink, orcfile > > ORCFile format is currently one of the most efficient storage formats on HDFS > from both the storage and search speed perspective, and it's a well supported > standard. > This feature would receive an input stream, map its columns to the columns in > a Hive table, and write it to HDFS in ORC format. It would need to support > hive bucketing and dynamic hive partitioning, and generate the appropriate > metadata in the Hive database. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (FLINK-7214) Add a sink that writes to ORCFile on HDFS
Robert Rapplean created FLINK-7214: -- Summary: Add a sink that writes to ORCFile on HDFS Key: FLINK-7214 URL: https://issues.apache.org/jira/browse/FLINK-7214 Project: Flink Issue Type: New Feature Components: Batch Connectors and Input/Output Formats Reporter: Robert Rapplean Priority: Minor ORCFile format is currently one of the most efficient storage formats on HDFS from both the storage and search speed perspective, and it's a well supported standard. This feature would receive an input stream, map its columns to the columns in a Hive table, and write it to HDFS in ORC format. It would need to support hive bucketing and dynamic hive partitioning, and generate the appropriate metadata in the Hive database. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7143) Partition assignment for Kafka consumer is not stable
[ https://issues.apache.org/jira/browse/FLINK-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090028#comment-16090028 ] ASF GitHub Bot commented on FLINK-7143: --- Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/4301 Note, that this doesn't normally occur because the strategy for assigning Kafka partitions and for assigning operator state is the same (right now). However, this means that you will have active partition discovery for parallel instances that didn't previously have state: assume we have 1 partition and 1 parallel source. Now we add a new partition and restart the Flink job. Now, parallel instance 1 will still read from partition 0, parallel instance 2 will think that it didn't restart (because it didn't get state) and will discover partitions and take ownership of partition 1. (This is with current `release-1.3` branch code.) > Partition assignment for Kafka consumer is not stable > - > > Key: FLINK-7143 > URL: https://issues.apache.org/jira/browse/FLINK-7143 > Project: Flink > Issue Type: Bug > Components: Kafka Connector >Affects Versions: 1.3.1 >Reporter: Steven Zhen Wu >Assignee: Tzu-Li (Gordon) Tai >Priority: Blocker > Fix For: 1.3.2 > > > while deploying Flink 1.3 release to hundreds of routing jobs, we found some > issues with partition assignment for Kafka consumer. some partitions weren't > assigned and some partitions got assigned more than once. > Here is the bug introduced in Flink 1.3. > {code} > protected static void initializeSubscribedPartitionsToStartOffsets(...) > { > ... > for (int i = 0; i < kafkaTopicPartitions.size(); i++) { > if (i % numParallelSubtasks == indexOfThisSubtask) { > if (startupMode != > StartupMode.SPECIFIC_OFFSETS) { > > subscribedPartitionsToStartOffsets.put(kafkaTopicPartitions.get(i), > startupMode.getStateSentinel()); > } > ... > } > {code} > The bug is using array index {{i}} to mod against {{numParallelSubtasks}}. 
if > the {{kafkaTopicPartitions}} has different order among different subtasks, > assignment is not stable cross subtasks and creates the assignment issue > mentioned earlier. > fix is also very simple, we should use partitionId to do the mod {{if > (kafkaTopicPartitions.get\(i\).getPartition() % numParallelSubtasks == > indexOfThisSubtask)}}. That would result in stable assignment cross subtasks > that is independent of ordering in the array. > marking it as blocker because of its impact. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7143) Partition assignment for Kafka consumer is not stable
[ https://issues.apache.org/jira/browse/FLINK-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089969#comment-16089969 ] ASF GitHub Bot commented on FLINK-7143: --- Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/4301 Yes, I don't think we can get around this when restoring from "old" state. I also have another suspicion: I don't think that `KafkaConsumerTestBase.runMultipleSourcesOnePartitionExactlyOnceTest()` accurately catches some cases and I think there is a problem that we cannot accurately detect whether we are restoring or whether we are opening from scratch. Consider this case: 5 partitions, 5 parallel source instances. Now we rescale to 10 parallel source instances. Some sources don't get state, so they think that we are starting from scratch and they will run partition discovery. Doesn't this mean that they could possibly read from a topic where already another source is reading from, because it got the state for that? (Note: this doesn't occur on master because all sources get all state.) > Partition assignment for Kafka consumer is not stable > - > > Key: FLINK-7143 > URL: https://issues.apache.org/jira/browse/FLINK-7143 > Project: Flink > Issue Type: Bug > Components: Kafka Connector >Affects Versions: 1.3.1 >Reporter: Steven Zhen Wu >Assignee: Tzu-Li (Gordon) Tai >Priority: Blocker > Fix For: 1.3.2 > > > while deploying Flink 1.3 release to hundreds of routing jobs, we found some > issues with partition assignment for Kafka consumer. some partitions weren't > assigned and some partitions got assigned more than once. > Here is the bug introduced in Flink 1.3. > {code} > protected static void initializeSubscribedPartitionsToStartOffsets(...) > { > ...
> for (int i = 0; i < kafkaTopicPartitions.size(); i++) { > if (i % numParallelSubtasks == indexOfThisSubtask) { > if (startupMode != > StartupMode.SPECIFIC_OFFSETS) { > > subscribedPartitionsToStartOffsets.put(kafkaTopicPartitions.get(i), > startupMode.getStateSentinel()); > } > ... > } > {code} > The bug is using array index {{i}} to mod against {{numParallelSubtasks}}. if > the {{kafkaTopicPartitions}} has different order among different subtasks, > assignment is not stable cross subtasks and creates the assignment issue > mentioned earlier. > fix is also very simple, we should use partitionId to do the mod {{if > (kafkaTopicPartitions.get\(i\).getPartition() % numParallelSubtasks == > indexOfThisSubtask)}}. That would result in stable assignment cross subtasks > that is independent of ordering in the array. > marking it as blocker because of its impact. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (FLINK-6997) SavepointITCase fails in master branch sometimes
[ https://issues.apache.org/jira/browse/FLINK-6997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated FLINK-6997: -- Description: I got the following test failure (with commit a0b781461bcf8c2f1d00b93464995f03eda589f1) {code} testSavepointForJobWithIteration(org.apache.flink.test.checkpointing.SavepointITCase) Time elapsed: 8.129 sec <<< ERROR! java.io.IOException: java.lang.Exception: Failed to complete savepoint at org.apache.flink.runtime.testingUtils.TestingCluster.triggerSavepoint(TestingCluster.scala:342) at org.apache.flink.runtime.testingUtils.TestingCluster.triggerSavepoint(TestingCluster.scala:316) at org.apache.flink.test.checkpointing.SavepointITCase.testSavepointForJobWithIteration(SavepointITCase.java:827) Caused by: java.lang.Exception: Failed to complete savepoint at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anon$7.apply(JobManager.scala:821) at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anon$7.apply(JobManager.scala:805) at org.apache.flink.runtime.concurrent.impl.FlinkFuture$5.onComplete(FlinkFuture.java:272) at akka.dispatch.OnComplete.internal(Future.scala:247) at akka.dispatch.OnComplete.internal(Future.scala:245) at akka.dispatch.japi$CallbackBridge.apply(Future.scala:175) at akka.dispatch.japi$CallbackBridge.apply(Future.scala:172) at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32) at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55) at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91) at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91) at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91) at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72) at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90) at 
akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: java.lang.Exception: Failed to trigger savepoint: Not all required tasks are currently running. at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.triggerSavepoint(CheckpointCoordinator.java:382) at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1.applyOrElse(JobManager.scala:800) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at org.apache.flink.runtime.testingUtils.TestingJobManagerLike$$anonfun$handleTestingMessage$1.applyOrElse(TestingJobManagerLike.scala:95) at scala.PartialFunction$OrElse.apply(PartialFunction.scala:162) at org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:38) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33) at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28) at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) {code}
[jira] [Comment Edited] (FLINK-7049) TestingApplicationMaster keeps running after integration tests finish
[ https://issues.apache.org/jira/browse/FLINK-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16069086#comment-16069086 ] Ted Yu edited comment on FLINK-7049 at 7/17/17 3:28 PM: Stack trace for TestingApplicationMaster. was (Author: yuzhih...@gmail.com): Stack trace for TestingApplicationMaster > TestingApplicationMaster keeps running after integration tests finish > - > > Key: FLINK-7049 > URL: https://issues.apache.org/jira/browse/FLINK-7049 > Project: Flink > Issue Type: Test > Components: Tests, YARN >Reporter: Ted Yu >Priority: Minor > Attachments: testingApplicationMaster.stack > > > After integration tests finish, TestingApplicationMaster is still running. > Toward the end of > flink-yarn-tests/target/flink-yarn-tests-ha/flink-yarn-tests-ha-logDir-nm-1_0/application_1498768839874_0001/container_1498768839874_0001_03_01/jobmanager.log > : > {code} > 2017-06-29 22:09:49,681 INFO org.apache.zookeeper.ClientCnxn > - Opening socket connection to server 127.0.0.1/127.0.0.1:46165 > 2017-06-29 22:09:49,681 ERROR > org.apache.flink.shaded.org.apache.curator.ConnectionState- > Authentication failed > 2017-06-29 22:09:49,682 WARN org.apache.zookeeper.ClientCnxn > - Session 0x0 for server null, unexpected error, closing socket > connection and attempting reconnect > java.net.ConnectException: Connection refused > at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) > at > org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) > at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) > 2017-06-29 22:09:50,782 WARN org.apache.zookeeper.ClientCnxn > - SASL configuration failed: > javax.security.auth.login.LoginException: No JAAS configuration section named > 'Client' was found in specified JAAS configuration file: > '/tmp/jaas-3597644653611245612.conf'. 
Will continue connection to Zookeeper > server without SASL authentication, if Zookeeper server allows it. > 2017-06-29 22:09:50,782 INFO org.apache.zookeeper.ClientCnxn > - Opening socket connection to server 127.0.0.1/127.0.0.1:46165 > 2017-06-29 22:09:50,782 ERROR > org.apache.flink.shaded.org.apache.curator.ConnectionState- > Authentication failed > 2017-06-29 22:09:50,783 WARN org.apache.zookeeper.ClientCnxn > - Session 0x0 for server null, unexpected error, closing socket > connection and attempting reconnect > java.net.ConnectException: Connection refused > at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) > at > org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) > at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6893) Add BIN supported in SQL
[ https://issues.apache.org/jira/browse/FLINK-6893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089938#comment-16089938 ] ASF GitHub Bot commented on FLINK-6893: --- Github user sunjincheng121 commented on the issue: https://github.com/apache/flink/pull/4128 I have rebased the code. I would appreciate it if you could review this PR. @wuchong @shaoxuan-wang Best, Jincheng > Add BIN supported in SQL > > > Key: FLINK-6893 > URL: https://issues.apache.org/jira/browse/FLINK-6893 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL >Affects Versions: 1.4.0 >Reporter: sunjincheng >Assignee: sunjincheng > > BIN(N) Returns a string representation of the binary value of N, where N is a > longlong (BIGINT) number. This is equivalent to CONV(N,10,2). Returns NULL if > N is NULL. > * Syntax: > BIN(num) > * Arguments > **num: a long/bigint value > * Return Types > String > * Example: > BIN(12) -> '1100' > * See more: > ** [MySQL| > https://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_bin] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (FLINK-6974) Add BIN supported in TableAPI
[ https://issues.apache.org/jira/browse/FLINK-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sunjincheng reassigned FLINK-6974: -- Assignee: sunjincheng > Add BIN supported in TableAPI > - > > Key: FLINK-6974 > URL: https://issues.apache.org/jira/browse/FLINK-6974 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL >Affects Versions: 1.4.0 >Reporter: sunjincheng >Assignee: sunjincheng > Labels: starter > > See FLINK-6893 for detail. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink issue #4128: [FLINK-6893][table]Add BIN supported in SQL
Github user sunjincheng121 commented on the issue: https://github.com/apache/flink/pull/4128 I have rebased the code. I would appreciate it if you could review this PR. @wuchong @shaoxuan-wang Best, Jincheng
[jira] [Closed] (FLINK-7101) Fix Non-windowed group-aggregate error when using `minIdleStateRetentionTime` config and retract agg
[ https://issues.apache.org/jira/browse/FLINK-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sunjincheng closed FLINK-7101. -- Resolution: Fixed Fixed in 1125122a75d25c3d3aa55d7f51d84ed25ee69c56 . > Fix Non-windowed group-aggregate error when using `minIdleStateRetentionTime` > config and retract agg > > > Key: FLINK-7101 > URL: https://issues.apache.org/jira/browse/FLINK-7101 > Project: Flink > Issue Type: Bug > Components: Table API & SQL >Affects Versions: 1.3.0, 1.3.1 >Reporter: sunjincheng >Assignee: sunjincheng > Fix For: 1.4.0 > > Attachments: screenshot-1.png > >
> When a non-windowed group-aggregate uses the {{minIdleStateRetentionTime}} config
> and a retract AGG, it will emit a "NULL" agg value, which we do not expect.
> For example ({{IntSumWithRetractAggFunction}}):
> 1. Receive: CRow(Row.of(6L: JLong, 5: JInt, "aaa"), true)
> 2. Cleanup state
> 3. Receive: CRow(Row.of(6L: JLong, 5: JInt, "aaa"), false) // acc.f1 = -1, getValue = null
> So, we must change the logic of {{GroupAggProcessFunction}} as follows:
> {code}
> if (inputCnt != 0) {
>   ...
> } else {
>   ...
> }
> {code}
> TO
> {code}
> if (inputCnt > 0) {
>   ...
> } else {
>   if (null != prevRow.row) {
>     ...
>   }
> }
> {code}
> In this case, the result will be bigger than expected, but I think it makes sense,
> because the user wants to clean up state (they should know the impact).
> What do you think? [~fhueske] [~hequn8128] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
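The change described in the issue can be sketched as follows. This is a hypothetical illustration, not Flink's actual `GroupAggProcessFunction`: after a state cleanup, a late retraction can drive the input count to zero or below while no previously emitted row exists, in which case nothing should be emitted instead of a null aggregate.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the corrected emission logic (names are illustrative).
class GroupAggEmission {
    static List<String> emit(long inputCnt, String prevRow, String newRow) {
        List<String> out = new ArrayList<>();
        if (inputCnt > 0) {
            out.add("UPDATE:" + newRow);      // normal path: emit the new aggregate value
        } else {
            if (prevRow != null) {
                out.add("DELETE:" + prevRow); // retract only what was actually emitted before
            }
            // inputCnt <= 0 with no previous row: state was cleaned up, emit nothing
        }
        return out;
    }
}
```

Guarding the retraction on a non-null previous row is what prevents the "NULL" aggregate value from reaching downstream operators after the state retention time has expired.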
[jira] [Commented] (FLINK-7101) Fix Non-windowed group-aggregate error when using `minIdleStateRetentionTime` config and retract agg
[ https://issues.apache.org/jira/browse/FLINK-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089913#comment-16089913 ] ASF GitHub Bot commented on FLINK-7101: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/4348 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink pull request #4348: [FLINK-7101][table] add condition of !stateCleanin...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/4348
[jira] [Commented] (FLINK-7178) Datadog Metric Reporter Jar is Lacking Dependencies
[ https://issues.apache.org/jira/browse/FLINK-7178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089883#comment-16089883 ] ASF GitHub Bot commented on FLINK-7178: --- Github user zentol commented on the issue: https://github.com/apache/flink/pull/4326 @aljoscha The fix is not quite correct since I didn't update the flink-dist assembly files. Just pushed a fix for that though. I can merge it later today. (once my local travis passed for it) > Datadog Metric Reporter Jar is Lacking Dependencies > --- > > Key: FLINK-7178 > URL: https://issues.apache.org/jira/browse/FLINK-7178 > Project: Flink > Issue Type: Bug > Components: Metrics >Affects Versions: 1.3.1, 1.4.0 >Reporter: Elias Levy >Assignee: Chesnay Schepler >Priority: Blocker > Fix For: 1.4.0, 1.3.2 > > > The Datadog metric reporter has dependencies on {{com.squareup.okhttp3}} and > {{com.squareup.okio}}. It appears there was an attempt to use the Maven Shade > plug-in to relocate these classes to {{org.apache.flink.shaded.okhttp3}} and > {{org.apache.flink.shaded.okio}} during packaging. Alas, the shaded classes > are not included in the {{flink-metrics-datadog-1.3.1.jar}} released to Maven > Central. Using the Jar results in an error when the Jobmanager or > Taskmanager starts up because of the missing dependencies. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink issue #4326: [FLINK-7178] [metrics] Do not create separate shaded jars
Github user zentol commented on the issue: https://github.com/apache/flink/pull/4326 @aljoscha The fix is not quite correct since I didn't update the flink-dist assembly files. Just pushed a fix for that though. I can merge it later today. (once my local travis passed for it)
[jira] [Commented] (FLINK-7058) flink-scala-shell unintended dependencies for scala 2.11
[ https://issues.apache.org/jira/browse/FLINK-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089878#comment-16089878 ] ASF GitHub Bot commented on FLINK-7058: --- Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/4240 Perfect! (I mean the "making it a blocker and fixing it" part, not the "it being broken" part). > flink-scala-shell unintended dependencies for scala 2.11 > > > Key: FLINK-7058 > URL: https://issues.apache.org/jira/browse/FLINK-7058 > Project: Flink > Issue Type: Bug > Components: Build System >Affects Versions: 1.3.0, 1.3.1 >Reporter: Piotr Nowojski >Assignee: Piotr Nowojski >Priority: Blocker > Fix For: 1.4.0, 1.3.2 > >
> Activation of the scala-2.10 profile in `flink-scala-shell` and `flink-scala` does
> not work as intended.
> {code:xml}
> <profile>
>   <id>scala-2.10</id>
>   <activation>
>     <property>
>       <name>!scala-2.11</name>
>     </property>
>   </activation>
>   <dependencies>
>     <dependency>
>       <groupId>org.scalamacros</groupId>
>       <artifactId>quasiquotes_2.10</artifactId>
>       <version>${scala.macros.version}</version>
>     </dependency>
>     <dependency>
>       <groupId>org.scala-lang</groupId>
>       <artifactId>jline</artifactId>
>       <version>2.10.4</version>
>     </dependency>
>   </dependencies>
> </profile>
> {code}
> This activation IMO has nothing to do with the `-Pscala-2.11` profile switch
> used in our build. "Properties" are defined by `-Dproperty` switches. As far
> as I understand it, those additional dependencies would be added only if
> nobody defined a property named `scala-2.11`, which means they would be added
> only if the switch `-Dscala-2.11` was not used, so those dependencies were
> basically always added. This quick test proves that I'm correct:
> {code:bash}
> $ mvn dependency:tree -pl flink-scala | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> $ mvn dependency:tree -pl flink-scala -Pscala-2.11 | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> $ mvn dependency:tree -pl flink-scala -Pscala-2.10 | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> {code}
> Regardless of the selected profile, those dependencies are always there. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink issue #4240: [FLINK-7058] Fix scala-2.10 dependencies
Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/4240 Perfect! (I mean the "making it a blocker and fixing it" part, not the "it being broken" part).
[jira] [Commented] (FLINK-7101) Fix Non-windowed group-aggregate error when using `minIdleStateRetentionTime` config and retract agg
[ https://issues.apache.org/jira/browse/FLINK-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089877#comment-16089877 ] ASF GitHub Bot commented on FLINK-7101: --- Github user sunjincheng121 commented on the issue: https://github.com/apache/flink/pull/4348 Hi @fhueske Thanks for the review. I'll address the description and merge this PR. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink issue #4348: [FLINK-7101][table] add condition of !stateCleaningEnable...
Github user sunjincheng121 commented on the issue: https://github.com/apache/flink/pull/4348 Hi @fhueske Thanks for the review. I'll address the description and merge this PR.
[jira] [Updated] (FLINK-7058) flink-scala-shell unintended dependencies for scala 2.11
[ https://issues.apache.org/jira/browse/FLINK-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek updated FLINK-7058: Fix Version/s: 1.3.2 1.4.0 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7101) Fix Non-windowed group-aggregate error when using `minIdleStateRetentionTime` config and retract agg
[ https://issues.apache.org/jira/browse/FLINK-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089875#comment-16089875 ] ASF GitHub Bot commented on FLINK-7101: --- Github user sunjincheng121 commented on a diff in the pull request: https://github.com/apache/flink/pull/4348#discussion_r127715580 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/aggregate/GroupAggProcessFunction.scala --- @@ -131,7 +131,8 @@ class GroupAggProcessFunction( // if this was not the first row and we have to emit retractions if (generateRetraction && !firstRow) { -if (prevRow.row.equals(newRow.row)) { +// the condition of !stateCleaningEnabled is avoided state to be cleaned up too early --- End diff -- Make sense to me. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink pull request #4348: [FLINK-7101][table] add condition of !stateCleanin...
Github user sunjincheng121 commented on a diff in the pull request: https://github.com/apache/flink/pull/4348#discussion_r127715580 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/aggregate/GroupAggProcessFunction.scala --- @@ -131,7 +131,8 @@ class GroupAggProcessFunction( // if this was not the first row and we have to emit retractions if (generateRetraction && !firstRow) { -if (prevRow.row.equals(newRow.row)) { +// the condition of !stateCleaningEnabled is avoided state to be cleaned up too early --- End diff -- Make sense to me.
[jira] [Commented] (FLINK-7058) flink-scala-shell unintended dependencies for scala 2.11
[ https://issues.apache.org/jira/browse/FLINK-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089876#comment-16089876 ] ASF GitHub Bot commented on FLINK-7058: --- Github user zentol commented on the issue: https://github.com/apache/flink/pull/4240 I made this a blocker since it means you cannot build 1.3 with the scala 2.11 profile. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (FLINK-7058) flink-scala-shell unintended dependencies for scala 2.11
[ https://issues.apache.org/jira/browse/FLINK-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-7058: Priority: Blocker (was: Minor) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] flink issue #4240: [FLINK-7058] Fix scala-2.10 dependencies
Github user zentol commented on the issue: https://github.com/apache/flink/pull/4240 I made this a blocker since it means you cannot build 1.3 with the scala 2.11 profile. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-7206) Implementation of DataView to support state access for UDAGG
[ https://issues.apache.org/jira/browse/FLINK-7206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089874#comment-16089874 ] ASF GitHub Bot commented on FLINK-7206: --- GitHub user kaibozhou opened a pull request: https://github.com/apache/flink/pull/4355 [FLINK-7206] [table] Implementation of DataView to support state access for UDAGG
Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration. If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the [How To Contribute guide](http://flink.apache.org/how-to-contribute.html). In addition to going through the list, please provide a meaningful description of your changes.
- [x] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the JIRA id)
- [ ] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added
- [ ] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis build has passed
===
1. Only POJO accumulator classes are supported to have MapView and ListView fields.
2. getAccumulatorType will be supported in another JIRA.
Thanks, Kaibo
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/kaibozhou/flink FLINK-7206
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/4355.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4355
commit e09944b1a0cbb15ec762924491bbae79d17c1d16
Author: 宝牛
Date: 2017-07-17T03:08:10Z
[FLINK-7206] [table] Implementation of DataView to support state access for UDAGG
> Implementation of DataView to support state access for UDAGG
> --
>
> Key: FLINK-7206
> URL: https://issues.apache.org/jira/browse/FLINK-7206
> Project: Flink
> Issue Type: Sub-task
> Components: Table API & SQL
> Reporter: Kaibo Zhou
> Assignee: Kaibo Zhou
>
> Implementation of MapView and ListView to support state access for UDAGG. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
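To illustrate what point 1 above means in practice, here is a minimal, self-contained sketch of a POJO accumulator that keeps part of its state in a map view. Note the assumptions: `MiniMapView` is a simplified stand-in for the `MapView` class this PR introduces (the real one delegates to Flink's state backend rather than a heap `HashMap`), and `CountDistinctAcc`/`accumulate` are hypothetical names chosen for the example, not code from the PR.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for Flink's MapView: same get/put/contains shape,
// but backed by a plain HashMap instead of a state backend.
class MiniMapView<K, V> {
    private final Map<K, V> backing = new HashMap<>();

    public V get(K key) { return backing.get(key); }
    public void put(K key, V value) { backing.put(key, value); }
    public boolean contains(K key) { return backing.containsKey(key); }
}

// A POJO accumulator class (per the PR, only POJO accumulators may
// carry view fields) for a hypothetical COUNT(DISTINCT ...) aggregate.
class CountDistinctAcc {
    public MiniMapView<String, Boolean> seen = new MiniMapView<>();
    public long count = 0L;
}

public class Main {
    // accumulate() as a UDAGG would define it: bump the counter
    // only for values not seen before.
    static void accumulate(CountDistinctAcc acc, String value) {
        if (!acc.seen.contains(value)) {
            acc.seen.put(value, true);
            acc.count++;
        }
    }

    public static void main(String[] args) {
        CountDistinctAcc acc = new CountDistinctAcc();
        accumulate(acc, "a");
        accumulate(acc, "b");
        accumulate(acc, "a");
        System.out.println(acc.count); // 2
    }
}
```

The design point the sketch tries to capture: because the accumulator is a POJO and the view is a named field, the runtime can discover the field by reflection and swap the heap-backed view for a state-backed one, which is what makes large UDAGG state feasible.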
[jira] [Commented] (FLINK-7058) flink-scala-shell unintended dependencies for scala 2.11
[ https://issues.apache.org/jira/browse/FLINK-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089868#comment-16089868 ] ASF GitHub Bot commented on FLINK-7058: --- Github user zentol commented on the issue: https://github.com/apache/flink/pull/4240 @aljoscha It is also a problem for 1.3. It's even worse there, since 1.3 still uses the scala.binary.version property and may therefore try to fetch a 2.11 version of quasiquotes, which simply doesn't exist.
> flink-scala-shell unintended dependencies for scala 2.11
> --
>
> Key: FLINK-7058
> URL: https://issues.apache.org/jira/browse/FLINK-7058
> Project: Flink
> Issue Type: Bug
> Components: Build System
> Affects Versions: 1.3.0, 1.3.1
> Reporter: Piotr Nowojski
> Assignee: Piotr Nowojski
> Priority: Minor
>
> Activation of the scala-2.10 profile in `flink-scala-shell` and `flink-scala` does not work as intended.
> {code:xml}
> <profile>
>   <id>scala-2.10</id>
>   <activation>
>     <property>
>       <name>!scala-2.11</name>
>     </property>
>   </activation>
>   <dependencies>
>     <dependency>
>       <groupId>org.scalamacros</groupId>
>       <artifactId>quasiquotes_2.10</artifactId>
>       <version>${scala.macros.version}</version>
>     </dependency>
>     <dependency>
>       <groupId>org.scala-lang</groupId>
>       <artifactId>jline</artifactId>
>       <version>2.10.4</version>
>     </dependency>
>   </dependencies>
> </profile>
> {code}
> This activation IMO has nothing to do with the `-Pscala-2.11` profile switch used in our build; "properties" are defined by `-Dproperty` switches. As far as I understand it, those additional dependencies are added whenever no property named `scala-2.11` is defined, i.e. whenever the `-Dscala-2.11` switch is not used, so it seems those dependencies were basically added always. This quick test proves that I'm correct:
> {code:bash}
> $ mvn dependency:tree -pl flink-scala | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> $ mvn dependency:tree -pl flink-scala -Pscala-2.11 | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> $ mvn dependency:tree -pl flink-scala -Pscala-2.10 | grep quasi
> [INFO] +- org.scalamacros:quasiquotes_2.10:jar:2.1.0:compile
> {code}
> Regardless of the selected profile, those dependencies are always there.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
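The profile quoted in the report pins the bug down: `<property><name>!scala-2.11</name></property>` matches on the *absence of a property* (`-Dscala-2.11`), which `-Pscala-2.11` never sets, so the 2.10-only dependencies are pulled in under every profile. A minimal sketch of one possible fix is below; this is an assumption about the intended behavior, not the patch actually merged for FLINK-7058. It drops the property-based activation so the dependencies only appear when the profile is selected explicitly with `-Pscala-2.10`:

{code:xml}
<!-- Sketch: no <activation> block, so these dependencies are added
     only when the profile is requested via -Pscala-2.10, and never
     by default or under -Pscala-2.11. -->
<profile>
  <id>scala-2.10</id>
  <dependencies>
    <dependency>
      <groupId>org.scalamacros</groupId>
      <artifactId>quasiquotes_2.10</artifactId>
      <version>${scala.macros.version}</version>
    </dependency>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>jline</artifactId>
      <version>2.10.4</version>
    </dependency>
  </dependencies>
</profile>
{code}

With this shape, the reporter's check (`mvn dependency:tree -pl flink-scala | grep quasi`) should come back empty unless `-Pscala-2.10` is passed. If activation-by-switch is still wanted, a positive match on `<name>scala-2.10</name>` would tie the dependencies to a `-Dscala-2.10` flag instead, avoiding the inverted `!scala-2.11` test entirely.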