[jira] [Commented] (FLINK-6376) when deploy flink cluster on the yarn, it is lack of hdfs delegation token.

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018301#comment-16018301
 ] 

ASF GitHub Bot commented on FLINK-6376:
---

Github user Rucongzhang commented on the issue:

https://github.com/apache/flink/pull/3776
  
@StephanEwen , we resolved this problem. We only add the HDFS delegation 
token to the JM/TM YARN container context. And when the keytab is configured, 
the JM and TM use the keytab to authenticate with HDFS.


> when deploy flink cluster on the yarn, it is lack of hdfs delegation token.
> ---
>
> Key: FLINK-6376
> URL: https://issues.apache.org/jira/browse/FLINK-6376
> Project: Flink
>  Issue Type: Bug
>  Components: Security, YARN
>Reporter: zhangrucong1982
>Assignee: zhangrucong1982
>
> 1. I use Flink version 1.2.0, and I deploy the Flink cluster on YARN. The 
> Hadoop version is 2.7.2.
> 2. I use Flink in security mode with a keytab and principal. The key 
> configuration is: security.kerberos.login.keytab: /home/ketab/test.keytab 
> and security.kerberos.login.principal: test.
> 3. The YARN configuration is the default, with log aggregation enabled 
> ("yarn.log-aggregation-enable : true");
> 4. When deploying the Flink cluster on YARN, the YARN NodeManager hits the 
> following failure when aggregating logs to HDFS. The root cause is a missing 
> HDFS delegation token.
>  java.io.IOException: Failed on local exception: java.io.IOException: 
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
> via:[TOKEN, KERBEROS]; Host Details : local host is: 
> "SZV1000258954/10.162.181.24"; destination host is: "SZV1000258954":25000;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:796)
> at org.apache.hadoop.ipc.Client.call(Client.java:1515)
> at org.apache.hadoop.ipc.Client.call(Client.java:1447)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> at com.sun.proxy.$Proxy26.getFileInfo(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:802)
> at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:201)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
> at com.sun.proxy.$Proxy27.getFileInfo(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1919)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1500)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1496)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1496)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.checkExists(LogAggregationService.java:271)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.access$100(LogAggregationService.java:68)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$1.run(LogAggregationService.java:299)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1769)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.createAppDir(LogAggregationService.java:284)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:390)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:342)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:470)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:68)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:194)

[GitHub] flink issue #3776: [FLINK-6376]when deploy flink cluster on the yarn, it is ...

2017-05-19 Thread Rucongzhang
Github user Rucongzhang commented on the issue:

https://github.com/apache/flink/pull/3776
  
@StephanEwen , we resolved this problem. We only add the HDFS delegation 
token to the JM/TM YARN container context. And when the keytab is configured, 
the JM and TM use the keytab to authenticate with HDFS.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Closed] (FLINK-6639) Java/Scala code tabs broken in CEP docs

2017-05-19 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-6639.
---
   Resolution: Fixed
Fix Version/s: 1.4.0
   1.3.0

1.3: 5d1cda52e58b53a0ae9b3e9a691102617a475aff
1.4: fadc026bf1e90cd001bd442e5bca595eb69907cf

> Java/Scala code tabs broken in CEP docs
> ---
>
> Key: FLINK-6639
> URL: https://issues.apache.org/jira/browse/FLINK-6639
> Project: Flink
>  Issue Type: Bug
>  Components: CEP, Documentation
>Reporter: David Anderson
>Assignee: David Anderson
> Fix For: 1.3.0, 1.4.0
>
>
> A missing  is breaking the JS that does the tab switching between the 
> Java and Scala tabs on the CEP page in the docs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (FLINK-5636) IO Metric for StreamTwoInputProcessor

2017-05-19 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-5636.
---
   Resolution: Fixed
Fix Version/s: 1.4.0
   1.3.0

1.3: 54b88d71c78a670762e152b337c04a2e68e8481d
1.4: 8ccaffe3d3f2472fc12fa138c45c0b67458ad2a2

> IO Metric for StreamTwoInputProcessor
> -
>
> Key: FLINK-5636
> URL: https://issues.apache.org/jira/browse/FLINK-5636
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API, Metrics
>Affects Versions: 1.2.0, 1.3.0, 1.4.0
>Reporter: david.wang
>Assignee: Chesnay Schepler
> Fix For: 1.3.0, 1.4.0
>
>
> Unlike StreamInputProcessor, there are no IO metrics for 
> StreamTwoInputProcessor. It would be useful to collect statistics for the 
> two-input stream processor, such as TPS and the number of input records.





[jira] [Closed] (FLINK-6586) InputGateMetrics#refreshAndGetMin returns Integer.MAX_VALUE for local channels

2017-05-19 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-6586.
---
Resolution: Fixed

1.3: c3ab5c8253b32bddc3fb9bf0c1085813e7f97e2f
1.4: 17ec6f020b779efe9152456f4ef33f6f802e4f67

> InputGateMetrics#refreshAndGetMin returns Integer.MAX_VALUE for local channels
> --
>
> Key: FLINK-6586
> URL: https://issues.apache.org/jira/browse/FLINK-6586
> Project: Flink
>  Issue Type: Bug
>  Components: Metrics, Network
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Trivial
> Fix For: 1.3.0, 1.4.0
>
>
> The {{InputGateMetrics#refreshAndGetMin}} returns {{Integer.MAX_VALUE}} when 
> working with {{LocalChannels}}. In this case it should return 0 instead.
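The fix described above can be sketched in plain Java (all names here are invented for illustration, not the actual Flink implementation): a minimum computed over an empty set of measurable channels must not fall through to its `Integer.MAX_VALUE` seed value.

```java
// Hypothetical sketch of the guard discussed above: when only local channels
// exist there is nothing to measure, so the untouched MAX_VALUE seed must be
// replaced by 0 before being reported as a metric.
import java.util.List;

class InputGateMinSketch {
    /** Returns the minimum queue length, or 0 when no channel reports one. */
    static int refreshAndGetMin(List<Integer> remoteQueueLengths) {
        int min = Integer.MAX_VALUE;
        for (int len : remoteQueueLengths) {
            min = Math.min(min, len);
        }
        // Guard: an empty input means the seed was never updated.
        return min == Integer.MAX_VALUE ? 0 : min;
    }
}
```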





[jira] [Closed] (FLINK-6439) Unclosed InputStream in OperatorSnapshotUtil#readStateHandle()

2017-05-19 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-6439.
---
   Resolution: Fixed
Fix Version/s: 1.4.0
   1.3.0

1.3: 5fde739fd2b040a90d42a6a73f1d119648e863a7
1.4: 65fdadac805cb1efe30ff9a57605676b1b8e45b9

> Unclosed InputStream in OperatorSnapshotUtil#readStateHandle()
> --
>
> Key: FLINK-6439
> URL: https://issues.apache.org/jira/browse/FLINK-6439
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Reporter: Ted Yu
>Assignee: Fang Yong
>Priority: Minor
> Fix For: 1.3.0, 1.4.0
>
>
> {code}
> FileInputStream in = new FileInputStream(path);
> DataInputStream dis = new DataInputStream(in);
> {code}
> Neither in nor dis is closed upon return from the method.
> In writeStateHandle(), the OutputStream should be closed in a finally block.
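A try-with-resources block is the idiomatic way to close both streams on every exit path. This is a minimal sketch of the shape of the fix; the method name and signature are illustrative, not the actual OperatorSnapshotUtil API.

```java
// Sketch of the leak fix: try-with-resources closes both streams on normal
// return and on exception, innermost (dis) first, without a manual finally.
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

class SnapshotIoSketch {
    static int readFirstInt(String path) throws IOException {
        try (FileInputStream in = new FileInputStream(path);
             DataInputStream dis = new DataInputStream(in)) {
            return dis.readInt();
        } // both dis and in are guaranteed closed here
    }
}
```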





[jira] [Commented] (FLINK-6332) Upgrade Scala version to 2.11.11

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018079#comment-16018079
 ] 

ASF GitHub Bot commented on FLINK-6332:
---

GitHub user greghogan opened a pull request:

https://github.com/apache/flink/pull/3957

[FLINK-6332] [build] Upgrade Scala versions

Upgrade to the last maintenance releases of Scala 2.10 and 2.11.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/greghogan/flink 6332_upgrade_scala_version

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3957.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3957


commit 2d30d4f288207a3a4d064735806359531604a759
Author: Greg Hogan 
Date:   2017-05-19T17:05:55Z

[FLINK-6332] [build] Upgrade Scala versions

Upgrade to the last maintenance releases of Scala 2.10 and 2.11.




> Upgrade Scala version to 2.11.11
> 
>
> Key: FLINK-6332
> URL: https://issues.apache.org/jira/browse/FLINK-6332
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System, Scala API
>Reporter: Ted Yu
>Assignee: Greg Hogan
>Priority: Minor
>
> Currently scala-2.11 profile uses Scala 2.11.7
> 2.11.11 is the most recent version.
> This issue is to upgrade to Scala 2.11.11





[GitHub] flink pull request #3957: [FLINK-6332] [build] Upgrade Scala versions

2017-05-19 Thread greghogan
GitHub user greghogan opened a pull request:

https://github.com/apache/flink/pull/3957

[FLINK-6332] [build] Upgrade Scala versions

Upgrade to the last maintenance releases of Scala 2.10 and 2.11.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/greghogan/flink 6332_upgrade_scala_version

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3957.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3957


commit 2d30d4f288207a3a4d064735806359531604a759
Author: Greg Hogan 
Date:   2017-05-19T17:05:55Z

[FLINK-6332] [build] Upgrade Scala versions

Upgrade to the last maintenance releases of Scala 2.10 and 2.11.






[GitHub] flink pull request #3950: [FLINK-5636][metrics] Measure numRecordsIn in Stre...

2017-05-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3950




[jira] [Commented] (FLINK-5636) IO Metric for StreamTwoInputProcessor

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018018#comment-16018018
 ] 

ASF GitHub Bot commented on FLINK-5636:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3950


> IO Metric for StreamTwoInputProcessor
> -
>
> Key: FLINK-5636
> URL: https://issues.apache.org/jira/browse/FLINK-5636
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API, Metrics
>Affects Versions: 1.2.0, 1.3.0, 1.4.0
>Reporter: david.wang
>Assignee: Chesnay Schepler
>
> Unlike StreamInputProcessor, there are no IO metrics for 
> StreamTwoInputProcessor. It would be useful to collect statistics for the 
> two-input stream processor, such as TPS and the number of input records.





[jira] [Commented] (FLINK-6639) Java/Scala code tabs broken in CEP docs

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018017#comment-16018017
 ] 

ASF GitHub Bot commented on FLINK-6639:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3952


> Java/Scala code tabs broken in CEP docs
> ---
>
> Key: FLINK-6639
> URL: https://issues.apache.org/jira/browse/FLINK-6639
> Project: Flink
>  Issue Type: Bug
>  Components: CEP, Documentation
>Reporter: David Anderson
>Assignee: David Anderson
>
> A missing  is breaking the JS that does the tab switching between the 
> Java and Scala tabs on the CEP page in the docs.





[GitHub] flink pull request #3952: [FLINK-6639] fix code tabs in CEP docs

2017-05-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3952




[jira] [Commented] (FLINK-6586) InputGateMetrics#refreshAndGetMin returns Integer.MAX_VALUE for local channels

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018020#comment-16018020
 ] 

ASF GitHub Bot commented on FLINK-6586:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3907


> InputGateMetrics#refreshAndGetMin returns Integer.MAX_VALUE for local channels
> --
>
> Key: FLINK-6586
> URL: https://issues.apache.org/jira/browse/FLINK-6586
> Project: Flink
>  Issue Type: Bug
>  Components: Metrics, Network
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Trivial
> Fix For: 1.3.0, 1.4.0
>
>
> The {{InputGateMetrics#refreshAndGetMin}} returns {{Integer.MAX_VALUE}} when 
> working with {{LocalChannels}}. In this case it should return 0 instead.





[jira] [Commented] (FLINK-6439) Unclosed InputStream in OperatorSnapshotUtil#readStateHandle()

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018019#comment-16018019
 ] 

ASF GitHub Bot commented on FLINK-6439:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3904


> Unclosed InputStream in OperatorSnapshotUtil#readStateHandle()
> --
>
> Key: FLINK-6439
> URL: https://issues.apache.org/jira/browse/FLINK-6439
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Reporter: Ted Yu
>Assignee: Fang Yong
>Priority: Minor
>
> {code}
> FileInputStream in = new FileInputStream(path);
> DataInputStream dis = new DataInputStream(in);
> {code}
> Neither in nor dis is closed upon return from the method.
> In writeStateHandle(), the OutputStream should be closed in a finally block.





[GitHub] flink pull request #3907: [FLINK-6586] InputGateMetrics return 0 as minimum ...

2017-05-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3907




[GitHub] flink pull request #3854: [hotfix] [rat] Add exclusion for test state snapsh...

2017-05-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3854




[GitHub] flink pull request #3904: [FLINK-6439] Fix close OutputStream && InputStream...

2017-05-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3904




[jira] [Commented] (FLINK-5005) Publish Scala 2.12 artifacts

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017956#comment-16017956
 ] 

ASF GitHub Bot commented on FLINK-5005:
---

Github user frankbohman commented on the issue:

https://github.com/apache/flink/pull/3703
  
Where can I watch the status page that tells us when we can get off 
of old 2.11?


> Publish Scala 2.12 artifacts
> 
>
> Key: FLINK-5005
> URL: https://issues.apache.org/jira/browse/FLINK-5005
> Project: Flink
>  Issue Type: Improvement
>  Components: Scala API
>Reporter: Andrew Roberts
>
> Scala 2.12 was [released|http://www.scala-lang.org/news/2.12.0] today, and 
> offers many compile-time and runtime speed improvements. It would be great to 
> get artifacts up on maven central to allow Flink users to migrate to Scala 
> 2.12.0.





[GitHub] flink issue #3703: [FLINK-5005] WIP: publish scala 2.12 artifacts

2017-05-19 Thread frankbohman
Github user frankbohman commented on the issue:

https://github.com/apache/flink/pull/3703
  
Where can I watch the status page that tells us when we can get off 
of old 2.11?




[jira] [Created] (FLINK-6648) Transforms for Gelly examples

2017-05-19 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6648:
-

 Summary: Transforms for Gelly examples
 Key: FLINK-6648
 URL: https://issues.apache.org/jira/browse/FLINK-6648
 Project: Flink
  Issue Type: Improvement
  Components: Gelly
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
 Fix For: 1.4.0


A primary objective of the Gelly examples {{Runner}} is to make adding new 
inputs and algorithms as simple and powerful as possible. A recent feature made 
it possible to translate the key ID of generated graphs to alternative numeric 
or string representations. For floating point and {{LongValue}} it is desirable 
to translate the key ID of the algorithm results.

Currently a {{Runner}} job consists of an input, an algorithm, and an output. A 
{{Transform}} will translate the input {{Graph}} and the algorithm output 
{{DataSet}}. The {{Input}} and algorithm {{Driver}} will return an ordered list 
of {{Transform}}s, which will be executed in that order (and processed in 
reverse order for algorithm output). A {{Transform}} can be configured in the 
same way as inputs and drivers.

Example transforms:
- the aforementioned translation of key ID types
- surrogate types (String -> Long or Int) for user data
- FLINK-4481 Maximum results for pairwise algorithms
- FLINK-3625 Graph algorithms to permute graph labels and edges
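The ordering rule above (transforms applied in order on the input side, in reverse order on the output side) can be sketched as follows. All names are hypothetical; this models only the proposed control flow, not the Gelly API.

```java
// Hypothetical model of the proposed design: a Runner applies the drivers'
// Transforms in declaration order to the input and in reverse order to the
// algorithm output, so each translation is undone/mapped back symmetrically.
import java.util.List;
import java.util.function.UnaryOperator;

class TransformSketch {
    interface Transform<T> extends UnaryOperator<T> {}

    static <T> T applyInput(T input, List<Transform<T>> transforms) {
        for (Transform<T> t : transforms) {
            input = t.apply(input); // forward order on the input side
        }
        return input;
    }

    static <T> T applyOutput(T output, List<Transform<T>> transforms) {
        for (int i = transforms.size() - 1; i >= 0; i--) {
            output = transforms.get(i).apply(output); // reverse order on output
        }
        return output;
    }
}
```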





[jira] [Commented] (FLINK-6647) Fail-fast on invalid RocksDBStateBackend configuration

2017-05-19 Thread Helder Pereira (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017687#comment-16017687
 ] 

Helder Pereira commented on FLINK-6647:
---

The {{File.pathSeparator}} seems a bit dangerous here, as it will strip out the 
scheme and treat it as an alternative path:
https://github.com/apache/flink/blob/master/flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBStateBackendFactory.java#L64

And there seems to be a mismatch with what is expected here, which is prepared 
to accept paths with a scheme:
https://github.com/apache/flink/blob/master/flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBStateBackend.java#L367
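The problem is easy to reproduce with the JDK alone. On a Unix platform, {{File.pathSeparator}} is {{":"}}, so splitting a configured value on it cuts an {{hdfs://}} URI at the scheme colon:

```java
// Demonstration (assumes a Unix platform where File.pathSeparator is ":"):
// splitting an hdfs:// URI on the path separator strips off the scheme and
// leaves "///some/base/path" to be treated as a second, local path.
import java.io.File;
import java.util.Arrays;

class PathSeparatorDemo {
    static String[] splitLikeFactory(String configured) {
        return configured.split(File.pathSeparator);
    }

    public static void main(String[] args) {
        String[] parts = splitLikeFactory("hdfs:///some/base/path");
        System.out.println(Arrays.toString(parts)); // on Unix: [hdfs, ///some/base/path]
    }
}
```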

> Fail-fast on invalid RocksDBStateBackend configuration
> --
>
> Key: FLINK-6647
> URL: https://issues.apache.org/jira/browse/FLINK-6647
> Project: Flink
>  Issue Type: Bug
>Reporter: Andrey
>
> Currently:
> * set "state.backend.rocksdb.checkpointdir=hdfs:///some/base/path/hdfs"
> * set the backend: state.backend: 
> "org.apache.flink.contrib.streaming.state.RocksDBStateBackendFactory"
> * RocksDB doesn't support an HDFS backend, so the logs show:
> {code}
> 2017-05-19 15:42:33,737 ERROR 
> org.apache.flink.contrib.streaming.state.RocksDBStateBackend - Local DB files 
> directory '/some/base/path/hdfs' does not exist and cannot be created.
> {code}
> * however, the job continues execution and the IOManager temp directory is 
> picked up for the RocksDB files.
> There are several issues with this approach:
> * after the "ERROR" message is printed and before the developer fixes the 
> configuration, the /tmp directory/partition might run out of disk space.
> * if the HDFS base path is the same as a local path, there are no errors in 
> the logs and the RocksDB files will be written to an incorrect location. For 
> example: "hdfs:///home/flink/data" will cause this issue.
> Expected:
> * validate the URI and throw IllegalArgumentException, as already implemented 
> in the "RocksDBStateBackend.setDbStoragePaths" method.





[jira] [Assigned] (FLINK-6332) Upgrade Scala version to 2.11.11

2017-05-19 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan reassigned FLINK-6332:
-

Assignee: Greg Hogan

> Upgrade Scala version to 2.11.11
> 
>
> Key: FLINK-6332
> URL: https://issues.apache.org/jira/browse/FLINK-6332
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System, Scala API
>Reporter: Ted Yu
>Assignee: Greg Hogan
>Priority: Minor
>
> Currently scala-2.11 profile uses Scala 2.11.7
> 2.11.11 is the most recent version.
> This issue is to upgrade to Scala 2.11.11





[GitHub] flink issue #2332: [FLINK-2055] Implement Streaming HBaseSink

2017-05-19 Thread nragon
Github user nragon commented on the issue:

https://github.com/apache/flink/pull/2332
  
I've made a custom solution which works for my use cases. Note that the 
attached code is not working as-is, because it's only a skeleton.
This prototype uses asynchbase and tries to manage the throttling issues 
mentioned above. It does this by limiting requests per client to 1000 
(also configurable, depending on HBase capacity and response time) and 
skipping records after that threshold is reached. Every skipped record is 
updated with the system timestamp, always keeping the most recent skipped 
record for a later update.
In my use case I always use a keyBy -> reduce before the sink, which keeps 
the aggregation state, meaning that every record the HBase sink receives 
carries the last aggregated value from the previous operators. When all 
requests are done (`pending == 0`) I compare the last skipped record with the 
last requested record; if the skipped timestamp is less than the requested 
timestamp, HBase already has the latest aggregation.
There is plenty of room for improvement, I just didn't have the time.

[HBaseSink.txt](https://github.com/apache/flink/files/1014991/HBaseSink.txt)
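The throttle-and-skip scheme described above can be modeled with a few lines of plain Java. All names here are invented; this is not the attached HBaseSink.txt, just the core bookkeeping: at most `limit` in-flight requests, newer records above the limit are skipped but remembered, and once idle a catch-up write is needed iff a skipped record is newer than the last one sent.

```java
// Minimal model of the described throttling: cap in-flight requests, skip
// overflow records while tracking the newest skipped timestamp, and detect
// when a catch-up write is required after the pending count drains to zero.
class ThrottleSketch {
    private final int limit;
    private int pending;
    private long lastSkippedTs = Long.MIN_VALUE;
    private long lastRequestedTs = Long.MIN_VALUE;

    ThrottleSketch(int limit) { this.limit = limit; }

    /** Returns true if the record was sent, false if it was skipped. */
    boolean offer(long ts) {
        if (pending >= limit) {
            lastSkippedTs = Math.max(lastSkippedTs, ts); // keep newest skip
            return false;
        }
        pending++;
        lastRequestedTs = Math.max(lastRequestedTs, ts);
        return true;
    }

    /** Called when an in-flight request completes. */
    void onComplete() { pending--; }

    /** Once idle, a catch-up write is needed iff a newer record was skipped. */
    boolean needsCatchUp() {
        return pending == 0 && lastSkippedTs > lastRequestedTs;
    }
}
```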






[jira] [Commented] (FLINK-2055) Implement Streaming HBaseSink

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017665#comment-16017665
 ] 

ASF GitHub Bot commented on FLINK-2055:
---

Github user nragon commented on the issue:

https://github.com/apache/flink/pull/2332
  
I've made a custom solution which works for my use cases. Note that the 
attached code is not working as-is, because it's only a skeleton.
This prototype uses asynchbase and tries to manage the throttling issues 
mentioned above. It does this by limiting requests per client to 1000 
(also configurable, depending on HBase capacity and response time) and 
skipping records after that threshold is reached. Every skipped record is 
updated with the system timestamp, always keeping the most recent skipped 
record for a later update.
In my use case I always use a keyBy -> reduce before the sink, which keeps 
the aggregation state, meaning that every record the HBase sink receives 
carries the last aggregated value from the previous operators. When all 
requests are done (`pending == 0`) I compare the last skipped record with the 
last requested record; if the skipped timestamp is less than the requested 
timestamp, HBase already has the latest aggregation.
There is plenty of room for improvement, I just didn't have the time.

[HBaseSink.txt](https://github.com/apache/flink/files/1014991/HBaseSink.txt)




> Implement Streaming HBaseSink
> -
>
> Key: FLINK-2055
> URL: https://issues.apache.org/jira/browse/FLINK-2055
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming Connectors
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Erli Ding
>
> As per : 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Write-Stream-to-HBase-td1300.html





[jira] [Commented] (FLINK-5898) Race-Condition with Amazon Kinesis KPL

2017-05-19 Thread Scott Kidder (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017653#comment-16017653
 ] 

Scott Kidder commented on FLINK-5898:
-

I created a new build of Flink that uses KPL {{0.12.4}} and AWS SDK 
{{1.11.128}}. I had a job that was unable to restore from an earlier checkpoint 
made with my patched KPL {{0.12.3}} and AWS SDK {{1.11.86}}:

{noformat}
java.io.InvalidClassException: com.amazonaws.services.kinesis.model.Shard; 
local class incompatible: stream classdesc serialVersionUID = 
206186249602915, local class serialVersionUID = 5010840014163691006
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:616)
at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1829)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1713)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1986)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
at java.util.HashMap.readObject(HashMap.java:1402)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2122)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
at 
org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:307)
at 
org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.restoreState(AbstractUdfStreamOperator.java:166)
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator.restoreStreamCheckpointed(AbstractStreamOperator.java:240)
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:203)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:654)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:641)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:247)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:655)
at java.lang.Thread.run(Thread.java:745)
{noformat}

So, there are incompatible changes in the Kinesis {{Shard}} class included in 
the AWS SDK release referenced directly by KPL {{0.12.4}}. Just something to be 
aware of when upgrading the KPL.
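The InvalidClassException above arises because Java serialization compares the writer's and reader's serialVersionUID. A class without an explicit one gets a UID computed from its structure, so any field change (as evidently happened to the SDK's Shard class between releases) breaks deserialization of old state. A short, self-contained illustration, unrelated to the AWS classes themselves:

```java
// A class that declares serialVersionUID explicitly keeps a stable stream
// UID even when its fields evolve; one that omits it gets a structure-derived
// UID that changes with the class, which is what broke the restore above.
import java.io.ObjectStreamClass;
import java.io.Serializable;

class UidDemo {
    static class WithExplicitUid implements Serializable {
        private static final long serialVersionUID = 1L;
        int field; // fields may change without changing the stream UID
    }

    static long uidOf(Class<?> c) {
        return ObjectStreamClass.lookup(c).getSerialVersionUID();
    }
}
```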

> Race-Condition with Amazon Kinesis KPL
> --
>
> Key: FLINK-5898
> URL: https://issues.apache.org/jira/browse/FLINK-5898
> Project: Flink
>  Issue Type: Bug
>  Components: Kinesis Connector
>Affects Versions: 1.2.0
>Reporter: Scott Kidder
>
> The Flink Kinesis streaming-connector uses the Amazon Kinesis Producer 
> Library (KPL) to send messages to Kinesis streams. The KPL relies on a native 
> binary client to send messages to achieve better performance.
> When a Kinesis Producer is instantiated, the KPL will extract the native 
> binary to a sub-directory of `/tmp` (or whatever the platform-specific 
> temporary directory happens to be).
> The KPL tries to prevent multiple processes from extracting the binary at the 
> same time by wrapping the operation in a mutex. Unfortunately, this does not 
> prevent multiple Flink cores from trying to perform this operation at the 
> same time. If two or more processes attempt to do this at the same time, then 
> the native binary in /tmp will be corrupted.
> The authors of the KPL are aware of this possibility and suggest that users of the KPL not do that ... (sigh):
> https://github.com/awslabs/amazon-kinesis-producer/issues/55#issuecomment-251408897
> I encountered this in my production environment when bringing up a new Flink 
> task-manager with multiple cores and restoring from an earlier savepoint, 
> resulting in the instantiation of a KPL client on each core at roughly the 
> same time.
> A stack-trace follows:
> {noformat}
> java.lang.RuntimeException: Could not copy native binaries to temp directory 
> 
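A hedged sketch of one mitigation direction: give each producer instance its own extraction directory, so concurrent extraction cannot corrupt a shared copy in /tmp. To the best of my knowledge the KPL exposes a temp-directory setting (`KinesisProducerConfiguration#setTempDirectory`); the helper below only illustrates generating per-instance directories and is not Flink or KPL code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: a unique directory per producer instance for
// native-binary extraction. In a real setup this path would be passed
// to the KPL's temp-directory configuration so that no two producers
// extract into the same location concurrently.
public class UniqueKplTempDir {

    // Create a fresh, uniquely named directory for one producer instance.
    public static Path uniqueTempDir(String taskName) throws IOException {
        return Files.createTempDirectory("kpl-" + taskName + "-");
    }
}
```

Two task slots starting at the same moment would then extract into distinct directories, which sidesteps the race entirely at the cost of extracting the binary once per slot.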

[jira] [Commented] (FLINK-6603) Enable checkstyle on test sources

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017625#comment-16017625
 ] 

ASF GitHub Bot commented on FLINK-6603:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3941
  
@StephanEwen please verify the first commit implements the 
desired/traditional import order.


> Enable checkstyle on test sources
> -
>
> Key: FLINK-6603
> URL: https://issues.apache.org/jira/browse/FLINK-6603
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Affects Versions: 1.4.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Trivial
> Fix For: 1.4.0
>
>
> With the addition of strict checkstyle to select modules (currently limited 
> to {{flink-streaming-java}}) we can enable the checkstyle flag 
> {{includeTestSourceDirectory}} to perform the same unused imports, 
> whitespace, and other checks on test sources.
> Should first resolve the import grouping as discussed in FLINK-6107. Also, 
> several tests exceed the 2500 line limit.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3941: [FLINK-6603] [streaming] Enable checkstyle on test source...

2017-05-19 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3941
  
@StephanEwen please verify the first commit implements the 
desired/traditional import order.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-6647) Fail-fast on invalid RocksDBStateBackend configuration

2017-05-19 Thread Andrey (JIRA)
Andrey created FLINK-6647:
-

 Summary: Fail-fast on invalid RocksDBStateBackend configuration
 Key: FLINK-6647
 URL: https://issues.apache.org/jira/browse/FLINK-6647
 Project: Flink
  Issue Type: Bug
Reporter: Andrey


Currently:
* set "state.backend.rocksdb.checkpointdir=hdfs:///some/base/path/hdfs"
* set the backend: state.backend: "org.apache.flink.contrib.streaming.state.RocksDBStateBackendFactory"
* RocksDB does not accept an HDFS path for its local DB files, so the logs show:
{code}
2017-05-19 15:42:33,737 ERROR org.apache.flink.contrib.streaming.state.RocksDBStateBackend - Local DB files directory '/some/base/path/hdfs' does not exist and cannot be created.
{code}
* however, the job continues execution and the IOManager temp directory is picked up for the RocksDB files instead.

There are several issues with this approach:
* between the ERROR message being printed and the developer fixing the configuration, the /tmp directory/partition might run out of disk space.
* if the HDFS base path coincides with a valid local path, no error is logged at all and the RocksDB files are written to an incorrect location. For example, "hdfs:///home/flink/data" causes this.

Expected:
* validate the URI and throw an IllegalArgumentException, as already implemented in the "RocksDBStateBackend.setDbStoragePaths" method.
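A rough illustration of the requested fail-fast behavior (only `setDbStoragePaths` is from the ticket; the class and method names below are illustrative, not the actual Flink code): reject a non-local URI scheme immediately instead of logging an ERROR and continuing.

```java
import java.net.URI;

// Sketch of a fail-fast check mirroring the scheme validation the ticket
// says already exists in RocksDBStateBackend.setDbStoragePaths.
public class LocalPathValidator {

    // Accept only schemeless or file:// paths for local DB storage;
    // anything else (hdfs://, s3://, ...) fails immediately.
    public static String validateLocalDbPath(String path) {
        URI uri = URI.create(path);
        String scheme = uri.getScheme();
        if (scheme != null && !"file".equals(scheme)) {
            throw new IllegalArgumentException(
                "RocksDB local DB files directory must be on a local file system, got: " + path);
        }
        return uri.getPath();
    }
}
```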





[jira] [Closed] (FLINK-6009) Deprecate DataSetUtils#checksumHashCode

2017-05-19 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan closed FLINK-6009.
-
Resolution: Implemented

> Deprecate DataSetUtils#checksumHashCode
> ---
>
> Key: FLINK-6009
> URL: https://issues.apache.org/jira/browse/FLINK-6009
> Project: Flink
>  Issue Type: Improvement
>  Components: Java API
>Affects Versions: 1.3.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Trivial
> Fix For: 1.3.0
>
>
> This is likely only used by Gelly and we have a more featureful 
> implementation allowing for multiple outputs and setting the job name. 
> Deprecation will allow this to be removed in Flink 2.0.





[jira] [Reopened] (FLINK-6009) Deprecate DataSetUtils#checksumHashCode

2017-05-19 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan reopened FLINK-6009:
---

> Deprecate DataSetUtils#checksumHashCode
> ---
>
> Key: FLINK-6009
> URL: https://issues.apache.org/jira/browse/FLINK-6009
> Project: Flink
>  Issue Type: Improvement
>  Components: Java API
>Affects Versions: 1.3.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Trivial
> Fix For: 1.3.0
>
>
> This is likely only used by Gelly and we have a more featureful 
> implementation allowing for multiple outputs and setting the job name. 
> Deprecation will allow this to be removed in Flink 2.0.





[jira] [Commented] (FLINK-6644) Don't register HUP unix signal handler on Windows

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017600#comment-16017600
 ] 

ASF GitHub Bot commented on FLINK-6644:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3955
  
+1


> Don't register HUP unix signal handler on Windows
> -
>
> Key: FLINK-6644
> URL: https://issues.apache.org/jira/browse/FLINK-6644
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> The JM/TM register UNIX signal handlers for INT, HUP and TERM on startup.
> On Windows, HUP is not available, causing this exception to be logged 
> everytime:
> {code}
> 2017-05-19 16:49:58,777 INFO  org.apache.flink.runtime.taskmanager.TaskManager  - Error while registering signal handler
> java.lang.IllegalArgumentException: Unknown signal: HUP
> at sun.misc.Signal.(Unknown Source)
> at org.apache.flink.runtime.util.SignalHandler$Handler.(SignalHandler.java:41)
> at org.apache.flink.runtime.util.SignalHandler.register(SignalHandler.java:78)
> at org.apache.flink.runtime.taskmanager.TaskManager$.main(TaskManager.scala:1525)
> at org.apache.flink.runtime.taskmanager.TaskManager.main(TaskManager.scala)
> 2017-05-19 16:49:58,778 INFO  org.apache.flink.runtime.taskmanager.TaskManager  - Registered UNIX signal handlers for [TERM, INT]
> {code}
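A minimal sketch of the fix direction: pick the signal set based on the OS before registering, so HUP is never attempted on Windows and no exception is logged. `signalsFor` is a hypothetical helper; the real registration lives in `org.apache.flink.runtime.util.SignalHandler`.

```java
// Illustrative only: choose which UNIX signals to register handlers for,
// skipping HUP on Windows where sun.misc.Signal does not know it.
public class SignalSelection {

    public static String[] signalsFor(String osName) {
        boolean isWindows = osName.toLowerCase().startsWith("windows");
        return isWindows
            ? new String[] {"TERM", "INT"}
            : new String[] {"TERM", "HUP", "INT"};
    }
}
```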





[GitHub] flink issue #3955: [FLINK-6644] Don't register HUP signal handler on Windows

2017-05-19 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3955
  
+1




[jira] [Created] (FLINK-6646) YARN session doesn't work with HA

2017-05-19 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-6646:
-

 Summary: YARN session doesn't work with HA
 Key: FLINK-6646
 URL: https://issues.apache.org/jira/browse/FLINK-6646
 Project: Flink
  Issue Type: Bug
  Components: YARN
Affects Versions: 1.2.0, 1.3.0
Reporter: Robert Metzger
Priority: Critical


While testing Flink 1.3.0 RC1, I ran into the following issue on the JobManager.

{code}
2017-05-19 14:41:38,030 INFO  org.apache.flink.runtime.webmonitor.JobManagerRetriever   - New leader reachable under akka.tcp://flink@permanent-qa-cluster-i7c9.c.astral-sorter-757.internal:36528/user/jobmanager:6539dc04-d7fe-4f85-a0b6-09bfb0de8a58.
2017-05-19 14:41:38,033 INFO  org.apache.flink.yarn.YarnFlinkResourceManager - Resource Manager associating with leading JobManager Actor[akka://flink/user/jobmanager#1602741108] - leader session 6539dc04-d7fe-4f85-a0b6-09bfb0de8a58
2017-05-19 14:41:38,033 INFO  org.apache.flink.yarn.YarnFlinkResourceManager - Requesting new TaskManager container with 1024 megabytes memory. Pending requests: 1
2017-05-19 14:41:38,781 INFO  org.apache.flink.yarn.YarnFlinkResourceManager - Received new container: container_149487096_0061_02_02 - Remaining pending container requests: 0
2017-05-19 14:41:38,782 INFO  org.apache.flink.yarn.YarnFlinkResourceManager - Launching TaskManager in container ContainerInLaunch @ 1495204898782: Container: [ContainerId: container_149487096_0061_02_02, NodeId: permanent-qa-cluster-d3iz.c.astral-sorter-757.internal:8041, NodeHttpAddress: permanent-qa-cluster-d3iz.c.astral-sorter-757.internal:8042, Resource: , Priority: 0, Token: Token { kind: ContainerToken, service: 10.240.0.32:8041 }, ] on host permanent-qa-cluster-d3iz.c.astral-sorter-757.internal
2017-05-19 14:41:38,788 INFO  org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - Opening proxy : permanent-qa-cluster-d3iz.c.astral-sorter-757.internal:8041
2017-05-19 14:41:44,284 INFO  org.apache.flink.yarn.YarnFlinkResourceManager - Container container_149487096_0061_02_02 failed, with a TaskManager in launch or registration. Exit status: -1000
2017-05-19 14:41:44,284 INFO  org.apache.flink.yarn.YarnFlinkResourceManager - Diagnostics for container container_149487096_0061_02_02 in state COMPLETE : exitStatus=-1000 diagnostics=File does not exist: hdfs://nameservice1/user/robert/.flink/application_149487096_0061/cf9287fe-ac75-4066-a648-91787d946890-taskmanager-conf.yaml
java.io.FileNotFoundException: File does not exist: hdfs://nameservice1/user/robert/.flink/application_149487096_0061/cf9287fe-ac75-4066-a648-91787d946890-taskmanager-conf.yaml
at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1219)
at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1211)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1211)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{code}

The problem is the following:
- JobManager1 starts from a yarn-session.sh
- Job1 gets submitted to JobManager1
- JobManager1 dies
- YARN starts a new JM: JobManager2
- in the meantime, errors appear on the yarn-session.sh client, shutting down the 
session. This includes deleting the YARN staging directory in HDFS.
- JobManager2 is unable to start a new TaskManager because the files in the 
staging directory were deleted by the client.





[jira] [Commented] (FLINK-2055) Implement Streaming HBaseSink

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017596#comment-16017596
 ] 

ASF GitHub Bot commented on FLINK-2055:
---

Github user fpompermaier commented on the issue:

https://github.com/apache/flink/pull/2332
  
Is anyone working on this? An HBase streaming sink would be a very nice 
addition...


> Implement Streaming HBaseSink
> -
>
> Key: FLINK-2055
> URL: https://issues.apache.org/jira/browse/FLINK-2055
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming Connectors
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Erli Ding
>
> As per : 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Write-Stream-to-HBase-td1300.html





[GitHub] flink issue #2332: [FLINK-2055] Implement Streaming HBaseSink

2017-05-19 Thread fpompermaier
Github user fpompermaier commented on the issue:

https://github.com/apache/flink/pull/2332
  
Is anyone working on this? An HBase streaming sink would be a very nice 
addition...




[jira] [Commented] (FLINK-6628) Cannot start taskmanager with cygwin in directory containing spaces

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017595#comment-16017595
 ] 

ASF GitHub Bot commented on FLINK-6628:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3954
  
+1 and to fix in 1.3.0


> Cannot start taskmanager with cygwin in directory containing spaces
> ---
>
> Key: FLINK-6628
> URL: https://issues.apache.org/jira/browse/FLINK-6628
> Project: Flink
>  Issue Type: Bug
>  Components: Startup Shell Scripts
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> {code}
> Zento@Lily 
> /cygdrive/c/Users/Zento/Documents/GitHub/flink/flink-dist/target/flink-1.3.0-bin/flink
>  1.3.0
> $ ./bin/taskmanager.sh start
> ./bin/taskmanager.sh: line 82: 
> C:\Users\Zento\Documents\GitHub\flink\flink-dist\target\flink-1.3.0-bin\flink:
>  command not found
> {code}
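The usual cause of this class of failure is an unquoted path expansion in the script: under bash, a path containing spaces is word-split (or, as here, treated as a command) unless every expansion is quoted. A minimal, self-contained bash illustration with a made-up path; this is not the actual taskmanager.sh code:

```shell
#!/usr/bin/env bash
# Reproduce the pattern: a launcher script living in a directory whose
# path contains a space, invoked through a variable expansion.
dir="/tmp/flink 1.3.0/bin"
mkdir -p "$dir"
printf '%s\n' '#!/usr/bin/env bash' 'echo started' > "$dir/taskmanager.sh"
chmod +x "$dir/taskmanager.sh"

# Quoted expansion: works regardless of spaces in the path.
# Unquoted ($dir/taskmanager.sh) would split on the space and fail with
# "command not found", which matches the error in the ticket.
"$dir/taskmanager.sh"
```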





[GitHub] flink issue #3954: [FLINK-6628] Fix start scripts on Windows

2017-05-19 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3954
  
+1 and to fix in 1.3.0




[jira] [Issue Comment Deleted] (FLINK-6576) Allow ConfigOptions to validate the configured value

2017-05-19 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan updated FLINK-6576:
--
Comment: was deleted

(was: I happened to also look at this and, in addition to {{taskmanager.sh}}, the 
{{flink-daemon.sh}} log rotation needs escaping: {{rotateLogFilesWithPrefix 
"$FLINK_LOG_DIR" "$FLINK_LOG_PREFIX"}}.)

> Allow ConfigOptions to validate the configured value
> 
>
> Key: FLINK-6576
> URL: https://issues.apache.org/jira/browse/FLINK-6576
> Project: Flink
>  Issue Type: Wish
>  Components: Configuration
>Reporter: Chesnay Schepler
>Priority: Trivial
>
> It is not unusual for a config parameter to only accept certain values. Ports 
> may not be negative, modes may essentially be enums, and file paths must not 
> be gibberish.
> Currently these validations happen manually after the value was extracted 
> from the {{Configuration}}.
> I want to start a discussion on whether we want to move this validation into 
> the ConfigOption itself.
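To make the discussion concrete, a toy sketch (not the Flink `ConfigOption` API; all names below are made up) of an option that carries its own validation predicate, so the check runs once at read time instead of at every call site:

```java
import java.util.function.Predicate;

// Toy ConfigOption-like holder carrying a validation predicate.
public class ValidatedOption<T> {

    private final String key;
    private final Predicate<T> validator;

    public ValidatedOption(String key, Predicate<T> validator) {
        this.key = key;
        this.validator = validator;
    }

    // Return the value if it passes validation, otherwise fail loudly.
    public T checked(T value) {
        if (!validator.test(value)) {
            throw new IllegalArgumentException(
                "Invalid value for '" + key + "': " + value);
        }
        return value;
    }
}
```

With this shape, a port option would be declared once with `p -> p >= 0 && p <= 65535` and every read of it would be validated automatically.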





[jira] [Commented] (FLINK-6551) OutputTag name should not be allowed to be empty

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017588#comment-16017588
 ] 

ASF GitHub Bot commented on FLINK-6551:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3953
  
+1


> OutputTag name should not be allowed to be empty
> 
>
> Key: FLINK-6551
> URL: https://issues.apache.org/jira/browse/FLINK-6551
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> When creating an OutputTag it is required to give it a name.
> While we do enforce that the name is not null we do not have a check in place 
> that prevents passing an empty string.
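A minimal sketch of the missing check; the real constructor lives in Flink's `OutputTag`, and this standalone helper only shows the null/empty validation the ticket asks for.

```java
// Illustrative validation for an OutputTag-style name parameter:
// reject null and also reject empty or whitespace-only strings.
public class TagNameCheck {

    public static String requireNonEmpty(String name) {
        if (name == null) {
            throw new NullPointerException("OutputTag name must not be null");
        }
        if (name.trim().isEmpty()) {
            throw new IllegalArgumentException("OutputTag name must not be empty");
        }
        return name;
    }
}
```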





[GitHub] flink issue #3953: [FLINK-6551] Reject empty OutputTag names

2017-05-19 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3953
  
+1




[jira] [Commented] (FLINK-6439) Unclosed InputStream in OperatorSnapshotUtil#readStateHandle()

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017585#comment-16017585
 ] 

ASF GitHub Bot commented on FLINK-6439:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/3904
  
merging.


> Unclosed InputStream in OperatorSnapshotUtil#readStateHandle()
> --
>
> Key: FLINK-6439
> URL: https://issues.apache.org/jira/browse/FLINK-6439
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Reporter: Ted Yu
>Assignee: Fang Yong
>Priority: Minor
>
> {code}
> FileInputStream in = new FileInputStream(path);
> DataInputStream dis = new DataInputStream(in);
> {code}
> None of the in / dis is closed upon return from the method.
> In writeStateHandle(), OutputStream should be closed in finally block.
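The fix amounts to try-with-resources (or closing in a finally block). A minimal sketch, with a single `readInt` standing in for the real state-handle parsing; this is not the actual `OperatorSnapshotUtil` code.

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

// Both streams are closed on every exit path, including exceptions,
// because try-with-resources closes resources in reverse order.
public class SafeRead {

    public static int readFirstInt(File path) throws IOException {
        try (FileInputStream in = new FileInputStream(path);
             DataInputStream dis = new DataInputStream(in)) {
            return dis.readInt();
        }
    }
}
```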





[GitHub] flink issue #3904: [FLINK-6439] Fix close OutputStream && InputStream in Ope...

2017-05-19 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/3904
  
merging.




[jira] [Updated] (FLINK-6297) CEP timeout does not trigger under certain conditions

2017-05-19 Thread Vijayakumar Palaniappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijayakumar Palaniappan updated FLINK-6297:
---

The issue is still reproducible.

The changes I made are:

   - iterator as the source, without closing the stream
   - env.getConfig().setAutoWatermarkInterval(5000);

Iterator source:

class Ite implements Iterator<MyEvent>, Serializable {
    int i = 0;

    public boolean hasNext() {
        if (i == inputElements.size()) {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
        return i < inputElements.size();
    }

    public MyEvent next() {
        return inputElements.get(i++);
    }
}

DataStream<MyEvent> input = env.fromCollection(new Ite(), MyEvent.class);

Event input:

final List<MyEvent> inputElements = new ArrayList<>();
inputElements.add(new MyEvent(1, 'a', 1, 1));
inputElements.add(new MyEvent(1, 'b', 1, 2));
inputElements.add(new MyEvent(1, 'a', 2, 3));
inputElements.add(new MyEvent(1, 'a', 3, 6));

Event constructor:

public MyEvent(int v, char payload, int key, int timestamp)

Event key 2 never times out.



> CEP timeout does not trigger under certain conditions
> -
>
> Key: FLINK-6297
> URL: https://issues.apache.org/jira/browse/FLINK-6297
> Project: Flink
>  Issue Type: Bug
>  Components: CEP
>Affects Versions: 1.2.0
>Reporter: Vijayakumar Palaniappan
>
> The timeout pattern does not trigger under certain conditions. The preconditions are:
> - a pattern of Event A followed by Event B within 2 seconds
> - periodic watermarks every 1 second
> - the following events have arrived:
> - Event A-1 [time: 1 sec]
> - Event B-1 [time: 2 sec]
> - Event A-2 [time: 2 sec]
> - Event A-3 [time: 5 sec]
> - WaterMark [time: 5 sec]
> I would expect that after the watermark arrives, the match Event A-1, B-1 is detected and A-2 times out. But the A-2 timeout does not happen.
> If I use a punctuated watermark and generate a watermark for every event, it seems to work as expected.





[jira] [Commented] (FLINK-6586) InputGateMetrics#refreshAndGetMin returns Integer.MAX_VALUE for local channels

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017580#comment-16017580
 ] 

ASF GitHub Bot commented on FLINK-6586:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/3907
  
merging.


> InputGateMetrics#refreshAndGetMin returns Integer.MAX_VALUE for local channels
> --
>
> Key: FLINK-6586
> URL: https://issues.apache.org/jira/browse/FLINK-6586
> Project: Flink
>  Issue Type: Bug
>  Components: Metrics, Network
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Trivial
> Fix For: 1.3.0, 1.4.0
>
>
> The {{InputGateMetrics#refreshAndGetMin}} returns {{Integer.MAX_VALUE}} when 
> working with {{LocalChannels}}. In this case it should return 0 instead.
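A sketch of the corrected computation (the method shape is illustrative, not the actual `InputGateMetrics` code): keep `Integer.MAX_VALUE` as the scan sentinel, but return 0 when there were no remote channels to sample, instead of leaking the sentinel to the metric.

```java
// Illustrative minimum over remote-channel queue lengths. An input gate
// with only local channels contributes no samples, so the guard returns 0
// rather than the Integer.MAX_VALUE sentinel.
public class MinBuffers {

    public static int refreshAndGetMin(int[] remoteQueueLengths) {
        int min = Integer.MAX_VALUE;
        for (int len : remoteQueueLengths) {
            min = Math.min(min, len);
        }
        return remoteQueueLengths.length == 0 ? 0 : min;
    }
}
```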





[GitHub] flink issue #3907: [FLINK-6586] InputGateMetrics return 0 as minimum for loc...

2017-05-19 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/3907
  
merging.




[jira] [Commented] (FLINK-6628) Cannot start taskmanager with cygwin in directory containing spaces

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017579#comment-16017579
 ] 

ASF GitHub Bot commented on FLINK-6628:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/3954
  
@greghogan done.


> Cannot start taskmanager with cygwin in directory containing spaces
> ---
>
> Key: FLINK-6628
> URL: https://issues.apache.org/jira/browse/FLINK-6628
> Project: Flink
>  Issue Type: Bug
>  Components: Startup Shell Scripts
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> {code}
> Zento@Lily 
> /cygdrive/c/Users/Zento/Documents/GitHub/flink/flink-dist/target/flink-1.3.0-bin/flink
>  1.3.0
> $ ./bin/taskmanager.sh start
> ./bin/taskmanager.sh: line 82: 
> C:\Users\Zento\Documents\GitHub\flink\flink-dist\target\flink-1.3.0-bin\flink:
>  command not found
> {code}





[GitHub] flink issue #3954: [FLINK-6628] Fix start scripts on Windows

2017-05-19 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/3954
  
@greghogan done.




[GitHub] flink pull request #3956: [FLINK-6642] Return -1 in EnvInfo#getOpenFileHandl...

2017-05-19 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/3956

[FLINK-6642] Return -1 in EnvInfo#getOpenFileHandlesLimit

Returns -1 for Windows, as the accessed UnixOSMXBean is obviously not 
available on Windows.

This reduces logging noise on Windows.
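A hedged sketch of the approach: probe for the Unix-only `getMaxFileDescriptorCount` reflectively and return -1 when it is unavailable (as on Windows, or when the JVM blocks reflective access). The real `EnvironmentInformation` code may differ in detail.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Illustrative fallback: -1 means "unknown / not a Unix JVM", so callers
// can skip logging instead of reporting a reflection failure.
public class FileHandleLimit {

    public static long getOpenFileHandlesLimit() {
        OperatingSystemMXBean bean = ManagementFactory.getOperatingSystemMXBean();
        try {
            java.lang.reflect.Method m =
                bean.getClass().getMethod("getMaxFileDescriptorCount");
            m.setAccessible(true);
            return (Long) m.invoke(bean);
        } catch (Throwable t) {
            return -1L; // method absent (e.g. Windows) or inaccessible
        }
    }
}
```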

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 6642_env_windows

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3956.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3956


commit 4060e8b1a1f4cde1674a5f41dd4a925679bd5d55
Author: zentol 
Date:   2017-05-19T15:22:26Z

[FLINK-6642] Return -1 in EnvInfo#getOpenFileHandlesLimit






[jira] [Created] (FLINK-6645) JobMasterTest failed on Travis

2017-05-19 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-6645:
---

 Summary: JobMasterTest failed on Travis
 Key: FLINK-6645
 URL: https://issues.apache.org/jira/browse/FLINK-6645
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.4.0
Reporter: Chesnay Schepler


{code}
Failed tests: 
  JobMasterTest.testHeartbeatTimeoutWithResourceManager:220 
Wanted but not invoked:
resourceManagerGateway.disconnectJobManager(
    be75c925204aede002136b15238f88b4,

);
-> at org.apache.flink.runtime.jobmaster.JobMasterTest.testHeartbeatTimeoutWithResourceManager(JobMasterTest.java:220)

However, there were other interactions with this mock:
resourceManagerGateway.registerJobManager(
    82d774de-ed78-4670-9623-2fc6638fbbf9,
    11b045ea-8b2b-4df0-aa02-ef4922dfc632,
    jm,
    "b442340a-7d3d-49d8-b440-1a93a5b43bb6",
    be75c925204aede002136b15238f88b4,
    100 ms
);
-> at org.apache.flink.runtime.jobmaster.JobMaster$ResourceManagerConnection$1.invokeRegistration(JobMaster.java:1051)
{code}





[jira] [Commented] (FLINK-6586) InputGateMetrics#refreshAndGetMin returns Integer.MAX_VALUE for local channels

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017572#comment-16017572
 ] 

ASF GitHub Bot commented on FLINK-6586:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3907
  
+1


> InputGateMetrics#refreshAndGetMin returns Integer.MAX_VALUE for local channels
> --
>
> Key: FLINK-6586
> URL: https://issues.apache.org/jira/browse/FLINK-6586
> Project: Flink
>  Issue Type: Bug
>  Components: Metrics, Network
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Trivial
> Fix For: 1.3.0, 1.4.0
>
>
> The {{InputGateMetrics#refreshAndGetMin}} returns {{Integer.MAX_VALUE}} when 
> working with {{LocalChannels}}. In this case it should return 0 instead.





[GitHub] flink issue #3907: [FLINK-6586] InputGateMetrics return 0 as minimum for loc...

2017-05-19 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3907
  
+1




[jira] [Updated] (FLINK-6641) HA recovery on YARN: ClusterClient calls HighAvailabilityServices#closeAndCleanupAll

2017-05-19 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-6641:
--
Description: 
While testing the 1.3 RC1, I was unable to recovery from a JobManager failure.

The following error message appears in the JobManager that is started by YARN:

{code}
2017-05-19 14:41:48,133 INFO  org.apache.flink.yarn.YarnJobManager  - Submitting recovered job 35df3b24209e9ebea12407e6747a746c.
2017-05-19 14:41:48,133 INFO  org.apache.flink.yarn.YarnJobManager  - Submitting job 35df3b24209e9ebea12407e6747a746c (CarTopSpeedWindowingExample) (Recovery).
2017-05-19 14:41:48,139 ERROR org.apache.flink.yarn.YarnJobManager  - Failed to submit job 35df3b24209e9ebea12407e6747a746c (CarTopSpeedWindowingExample)
org.apache.flink.runtime.client.JobSubmissionException: Cannot set up the user code libraries: Cannot get library with hash 935fb4229e27f166d5ba8912cd47974ad436af20
at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$submitJob(JobManager.scala:1267)
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1.applyOrElse(JobManager.scala:517)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at org.apache.flink.runtime.clusterframework.ContaineredJobManager$$anonfun$handleContainerMessage$1.applyOrElse(ContaineredJobManager.scala:100)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:162)
at org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:38)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
at org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:125)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.io.IOException: Cannot get library with hash 935fb4229e27f166d5ba8912cd47974ad436af20
at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.registerReferenceToBlobKeyAndGetURL(BlobLibraryCacheManager.java:264)
at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.registerTask(BlobLibraryCacheManager.java:117)
at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.registerJob(BlobLibraryCacheManager.java:89)
at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$submitJob(JobManager.scala:1262)
... 25 more
Caused by: java.io.IOException: Failed to copy from blob store.
at org.apache.flink.runtime.blob.BlobServer.getURL(BlobServer.java:387)
at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.registerReferenceToBlobKeyAndGetURL(BlobLibraryCacheManager.java:255)
... 28 more
Caused by: java.io.FileNotFoundException: File does not exist: 
/shared/recovery-dir/application_149487096_0061/blob/cache/blob_935fb4229e27f166d5ba8912cd47974ad436af20
at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1963)
at 

[GitHub] flink pull request #3955: [FLINK-6644] Don't register HUP signal handler on ...

2017-05-19 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/3955

[FLINK-6644] Don't register HUP signal handler on Windows

No longer registers the HUP signal handler on Windows, since the signal is not 
available there; this reduces noise in the logs.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 6644_hup_windows

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3955.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3955


commit 5e972db09853ee7d192ff21b0dc369595b0489ec
Author: zentol 
Date:   2017-05-19T15:34:47Z

[FLINK-6644] Don't register HUP signal handler on Windows




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-6645) JobMasterTest failed on Travis

2017-05-19 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-6645:

Description: 
{code}
Failed tests: 
  JobMasterTest.testHeartbeatTimeoutWithResourceManager:220 

Wanted but not invoked:
resourceManagerGateway.disconnectJobManager(
be75c925204aede002136b15238f88b4,

);
-> at 
org.apache.flink.runtime.jobmaster.JobMasterTest.testHeartbeatTimeoutWithResourceManager(JobMasterTest.java:220)

However, there were other interactions with this mock:
resourceManagerGateway.registerJobManager(
82d774de-ed78-4670-9623-2fc6638fbbf9,
11b045ea-8b2b-4df0-aa02-ef4922dfc632,
jm,
"b442340a-7d3d-49d8-b440-1a93a5b43bb6",
be75c925204aede002136b15238f88b4,
100 ms
);

-> at 
org.apache.flink.runtime.jobmaster.JobMaster$ResourceManagerConnection$1.invokeRegistration(JobMaster.java:1051)
{code}

  was:
{code}
Failed tests: 

  JobMasterTest.testHeartbeatTimeoutWithResourceManager:220 

Wanted but not invoked:

resourceManagerGateway.disconnectJobManager(

be75c925204aede002136b15238f88b4,



);

-> at 
org.apache.flink.runtime.jobmaster.JobMasterTest.testHeartbeatTimeoutWithResourceManager(JobMasterTest.java:220)



However, there were other interactions with this mock:

resourceManagerGateway.registerJobManager(

82d774de-ed78-4670-9623-2fc6638fbbf9,

11b045ea-8b2b-4df0-aa02-ef4922dfc632,

jm,

"b442340a-7d3d-49d8-b440-1a93a5b43bb6",

be75c925204aede002136b15238f88b4,

100 ms

);

-> at 
org.apache.flink.runtime.jobmaster.JobMaster$ResourceManagerConnection$1.invokeRegistration(JobMaster.java:1051)
{code}


> JobMasterTest failed on Travis
> --
>
> Key: FLINK-6645
> URL: https://issues.apache.org/jira/browse/FLINK-6645
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>
> {code}
> Failed tests: 
>   JobMasterTest.testHeartbeatTimeoutWithResourceManager:220 
> Wanted but not invoked:
> resourceManagerGateway.disconnectJobManager(
> be75c925204aede002136b15238f88b4,
> 
> );
> -> at 
> org.apache.flink.runtime.jobmaster.JobMasterTest.testHeartbeatTimeoutWithResourceManager(JobMasterTest.java:220)
> However, there were other interactions with this mock:
> resourceManagerGateway.registerJobManager(
> 82d774de-ed78-4670-9623-2fc6638fbbf9,
> 11b045ea-8b2b-4df0-aa02-ef4922dfc632,
> jm,
> "b442340a-7d3d-49d8-b440-1a93a5b43bb6",
> be75c925204aede002136b15238f88b4,
> 100 ms
> );
> -> at 
> org.apache.flink.runtime.jobmaster.JobMaster$ResourceManagerConnection$1.invokeRegistration(JobMaster.java:1051)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Reopened] (FLINK-6577) Expand supported types for ConfigOptions

2017-05-19 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan reopened FLINK-6577:
---

> Expand supported types for ConfigOptions
> 
>
> Key: FLINK-6577
> URL: https://issues.apache.org/jira/browse/FLINK-6577
> Project: Flink
>  Issue Type: Wish
>  Components: Configuration
>Reporter: Chesnay Schepler
>Priority: Trivial
>
> The type of a {{ConfigOption}} is currently limited to the types that 
> {{Configuration}} supports, which boils down to basic types and byte arrays.
> It would be useful if they could also return things like enums, or the 
> recently added {{MemorySize}}.
> I propose adding a {{fromConfiguration(Configuration)}} method to the 
> {{ConfigOption}} class.
> {code}
> // ConfigOption definition
> ConfigOption<MemorySize> MEMORY =
>     key("memory")
>         .defaultValue(new MemorySize(12345))
>         .from(new ExtractorFunction<MemorySize>() {
>             MemorySize extract(Configuration config) {
>                 // add check for unconfigured option
>                 return MemorySize.parse(config.getString("memory"));
>             }
>         });
>
> // usage
> MemorySize memory = MEMORY.fromConfiguration(config);
>
> // with default
> MemorySize memory = MEMORY.fromConfiguration(config, new MemorySize(12345));
>
> // internals of ConfigOption#fromConfiguration
> T fromConfiguration(Configuration config) {
>     if (this.extractor == null) { /* throw error or something */ }
>     T value = this.extractor.extract(config);
>     return value == null ? defaultValue : value;
> }
> {code}
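The extract-with-fallback pattern proposed above can be sketched as a self-contained toy. The `Config` and `Option` classes here are hypothetical stand-ins for Flink's `Configuration` and `ConfigOption`, and the `Long`-valued memory option stands in for `MemorySize`; this is a minimal sketch of the idea, not Flink's API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class OptionSketch {

    // Toy string-keyed configuration, standing in for Flink's Configuration.
    static class Config {
        private final Map<String, String> values = new HashMap<>();
        Config set(String key, String value) { values.put(key, value); return this; }
        String getString(String key) { return values.get(key); }
    }

    // An option owning an extractor function and a default value, mirroring
    // the proposal's fromConfiguration(Configuration) internals: extract,
    // then fall back to the default when the key is unconfigured.
    static class Option<T> {
        private final Function<Config, T> extractor;
        private final T defaultValue;
        Option(Function<Config, T> extractor, T defaultValue) {
            this.extractor = extractor;
            this.defaultValue = defaultValue;
        }
        T fromConfiguration(Config config) {
            T value = extractor.apply(config);
            return value == null ? defaultValue : value;
        }
    }

    // An option typed beyond what the basic Configuration getters support:
    // parse the "memory" string into a Long byte count, defaulting to 12345.
    static final Option<Long> MEMORY = new Option<>(
        config -> {
            String raw = config.getString("memory");
            return raw == null ? null : Long.parseLong(raw);
        },
        12345L);

    public static void main(String[] args) {
        System.out.println(MEMORY.fromConfiguration(new Config()));                       // default value
        System.out.println(MEMORY.fromConfiguration(new Config().set("memory", "2048"))); // parsed value
    }
}
```

The design keeps `Configuration` itself string-based while letting each option own the parsing logic for its richer type.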





[jira] [Closed] (FLINK-6577) Expand supported types for ConfigOptions

2017-05-19 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan closed FLINK-6577.
-
Resolution: Fixed

> Expand supported types for ConfigOptions
> 
>
> Key: FLINK-6577
> URL: https://issues.apache.org/jira/browse/FLINK-6577
> Project: Flink
>  Issue Type: Wish
>  Components: Configuration
>Reporter: Chesnay Schepler
>Priority: Trivial
>
> The type of a {{ConfigOption}} is currently limited to the types that 
> {{Configuration}} supports, which boils down to basic types and byte arrays.
> It would be useful if they could also return things like enums, or the 
> recently added {{MemorySize}}.
> I propose adding a {{fromConfiguration(Configuration)}} method to the 
> {{ConfigOption}} class.
> {code}
> // ConfigOption definition
> ConfigOption<MemorySize> MEMORY =
>     key("memory")
>         .defaultValue(new MemorySize(12345))
>         .from(new ExtractorFunction<MemorySize>() {
>             MemorySize extract(Configuration config) {
>                 // add check for unconfigured option
>                 return MemorySize.parse(config.getString("memory"));
>             }
>         });
>
> // usage
> MemorySize memory = MEMORY.fromConfiguration(config);
>
> // with default
> MemorySize memory = MEMORY.fromConfiguration(config, new MemorySize(12345));
>
> // internals of ConfigOption#fromConfiguration
> T fromConfiguration(Configuration config) {
>     if (this.extractor == null) { /* throw error or something */ }
>     T value = this.extractor.extract(config);
>     return value == null ? defaultValue : value;
> }
> {code}





[jira] [Commented] (FLINK-6642) EnvInformation#getOpenFileHandlesLimit should return -1 for Windows

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017553#comment-16017553
 ] 

ASF GitHub Bot commented on FLINK-6642:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/3956

[FLINK-6642] Return -1 in EnvInfo#getOpenFileHandlesLimit

Returns -1 on Windows, as the UnixOperatingSystemMXBean being accessed is 
not available there.

This reduces logging noise on Windows.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 6642_env_windows

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3956.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3956


commit 4060e8b1a1f4cde1674a5f41dd4a925679bd5d55
Author: zentol 
Date:   2017-05-19T15:22:26Z

[FLINK-6642] Return -1 in EnvInfo#getOpenFileHandlesLimit




> EnvInformation#getOpenFileHandlesLimit should return -1 for Windows
> ---
>
> Key: FLINK-6642
> URL: https://issues.apache.org/jira/browse/FLINK-6642
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> {{EnvironmentInformation#getOpenFileHandlesLimit}} accesses the Sun-internal 
> {{UnixOperatingSystemMXBean}} via reflection, but has no branch for Windows.
> This causes the following exception to be logged whenever a TM is started:
> {code}
> 2017-05-19 16:49:58,780 WARN  
> org.apache.flink.runtime.util.EnvironmentInformation  - Unexpected 
> error when accessing file handle limit
> java.lang.IllegalArgumentException: object is not an instance of declaring 
> class
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> at java.lang.reflect.Method.invoke(Unknown Source)
> at 
> org.apache.flink.runtime.util.EnvironmentInformation.getOpenFileHandlesLimit(EnvironmentInformation.java:245)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager$.main(TaskManager.scala:1528)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.main(TaskManager.scala)
> 2017-05-19 16:49:58,780 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Cannot 
> determine the maximum number of open file descriptors
> {code}
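The fix described in the ticket, returning -1 instead of logging an error when the Sun-internal bean is unavailable, can be sketched as follows. This is a minimal standalone sketch, not Flink's actual `EnvironmentInformation` code; the class name and the explicit `os.name` short-circuit are this example's own choices.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.lang.reflect.Method;

public class FileHandleLimit {

    // Returns the per-process max open file descriptor count, or -1 when it
    // cannot be determined (e.g. on Windows, where the Unix-only MX bean
    // does not exist).
    public static long getOpenFileHandlesLimit() {
        if (System.getProperty("os.name").toLowerCase().startsWith("windows")) {
            return -1L; // no Unix descriptor limit exposed on Windows
        }
        try {
            OperatingSystemMXBean bean = ManagementFactory.getOperatingSystemMXBean();
            Class<?> unixBean = Class.forName("com.sun.management.UnixOperatingSystemMXBean");
            if (!unixBean.isInstance(bean)) {
                return -1L; // JVM does not expose the Sun-internal Unix bean
            }
            Method method = unixBean.getMethod("getMaxFileDescriptorCount");
            return (Long) method.invoke(bean);
        } catch (Throwable t) {
            // Reflection failed: report "unknown" instead of logging an error.
            return -1L;
        }
    }

    public static void main(String[] args) {
        System.out.println(getOpenFileHandlesLimit());
    }
}
```

The `isInstance` check is what the reported exception ("object is not an instance of declaring class") shows was missing: the reflective call was made without verifying the bean's type first.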





[jira] [Commented] (FLINK-6644) Don't register HUP unix signal handler on Windows

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017552#comment-16017552
 ] 

ASF GitHub Bot commented on FLINK-6644:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/3955

[FLINK-6644] Don't register HUP signal handler on Windows

No longer registers the HUP signal handler on Windows, since the signal is not 
available there; this reduces noise in the logs.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 6644_hup_windows

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3955.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3955


commit 5e972db09853ee7d192ff21b0dc369595b0489ec
Author: zentol 
Date:   2017-05-19T15:34:47Z

[FLINK-6644] Don't register HUP signal handler on Windows




> Don't register HUP unix signal handler on Windows
> -
>
> Key: FLINK-6644
> URL: https://issues.apache.org/jira/browse/FLINK-6644
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> The JM/TM register UNIX signal handlers for INT, HUP and TERM on startup.
> On Windows, HUP is not available, causing this exception to be logged 
> every time:
> {code}
> 2017-05-19 16:49:58,777 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Error while 
> registering signal handler
> java.lang.IllegalArgumentException: Unknown signal: HUP
> at sun.misc.Signal.<init>(Unknown Source)
> at 
> org.apache.flink.runtime.util.SignalHandler$Handler.<init>(SignalHandler.java:41)
> at 
> org.apache.flink.runtime.util.SignalHandler.register(SignalHandler.java:78)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager$.main(TaskManager.scala:1525)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.main(TaskManager.scala)
> 2017-05-19 16:49:58,778 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Registered 
> UNIX signal handlers for [TERM, INT]
> {code}





[jira] [Issue Comment Deleted] (FLINK-6577) Expand supported types for ConfigOptions

2017-05-19 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan updated FLINK-6577:
--
Comment: was deleted

(was: Sorry about that. Too much tab hopping.)

> Expand supported types for ConfigOptions
> 
>
> Key: FLINK-6577
> URL: https://issues.apache.org/jira/browse/FLINK-6577
> Project: Flink
>  Issue Type: Wish
>  Components: Configuration
>Reporter: Chesnay Schepler
>Priority: Trivial
>
> The type of a {{ConfigOption}} is currently limited to the types that 
> {{Configuration}} supports, which boils down to basic types and byte arrays.
> It would be useful if they could also return things like enums, or the 
> recently added {{MemorySize}}.
> I propose adding a {{fromConfiguration(Configuration)}} method to the 
> {{ConfigOption}} class.
> {code}
> // ConfigOption definition
> ConfigOption<MemorySize> MEMORY =
>     key("memory")
>         .defaultValue(new MemorySize(12345))
>         .from(new ExtractorFunction<MemorySize>() {
>             MemorySize extract(Configuration config) {
>                 // add check for unconfigured option
>                 return MemorySize.parse(config.getString("memory"));
>             }
>         });
>
> // usage
> MemorySize memory = MEMORY.fromConfiguration(config);
>
> // with default
> MemorySize memory = MEMORY.fromConfiguration(config, new MemorySize(12345));
>
> // internals of ConfigOption#fromConfiguration
> T fromConfiguration(Configuration config) {
>     if (this.extractor == null) { /* throw error or something */ }
>     T value = this.extractor.extract(config);
>     return value == null ? defaultValue : value;
> }
> {code}





[jira] [Assigned] (FLINK-6644) Don't register HUP unix signal handler on Windows

2017-05-19 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-6644:
---

Assignee: Chesnay Schepler

> Don't register HUP unix signal handler on Windows
> -
>
> Key: FLINK-6644
> URL: https://issues.apache.org/jira/browse/FLINK-6644
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> The JM/TM register UNIX signal handlers for INT, HUP and TERM on startup.
> On Windows, HUP is not available, causing this exception to be logged 
> every time:
> {code}
> 2017-05-19 16:49:58,777 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Error while 
> registering signal handler
> java.lang.IllegalArgumentException: Unknown signal: HUP
> at sun.misc.Signal.<init>(Unknown Source)
> at 
> org.apache.flink.runtime.util.SignalHandler$Handler.<init>(SignalHandler.java:41)
> at 
> org.apache.flink.runtime.util.SignalHandler.register(SignalHandler.java:78)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager$.main(TaskManager.scala:1525)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.main(TaskManager.scala)
> 2017-05-19 16:49:58,778 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Registered 
> UNIX signal handlers for [TERM, INT]
> {code}





[jira] [Commented] (FLINK-6577) Expand supported types for ConfigOptions

2017-05-19 Thread Greg Hogan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017545#comment-16017545
 ] 

Greg Hogan commented on FLINK-6577:
---

Sorry about that. Too much tab hopping.

> Expand supported types for ConfigOptions
> 
>
> Key: FLINK-6577
> URL: https://issues.apache.org/jira/browse/FLINK-6577
> Project: Flink
>  Issue Type: Wish
>  Components: Configuration
>Reporter: Chesnay Schepler
>Priority: Trivial
>
> The type of a {{ConfigOption}} is currently limited to the types that 
> {{Configuration}} supports, which boils down to basic types and byte arrays.
> It would be useful if they could also return things like enums, or the 
> recently added {{MemorySize}}.
> I propose adding a {{fromConfiguration(Configuration)}} method to the 
> {{ConfigOption}} class.
> {code}
> // ConfigOption definition
> ConfigOption<MemorySize> MEMORY =
>     key("memory")
>         .defaultValue(new MemorySize(12345))
>         .from(new ExtractorFunction<MemorySize>() {
>             MemorySize extract(Configuration config) {
>                 // add check for unconfigured option
>                 return MemorySize.parse(config.getString("memory"));
>             }
>         });
>
> // usage
> MemorySize memory = MEMORY.fromConfiguration(config);
>
> // with default
> MemorySize memory = MEMORY.fromConfiguration(config, new MemorySize(12345));
>
> // internals of ConfigOption#fromConfiguration
> T fromConfiguration(Configuration config) {
>     if (this.extractor == null) { /* throw error or something */ }
>     T value = this.extractor.extract(config);
>     return value == null ? defaultValue : value;
> }
> {code}





[jira] [Closed] (FLINK-3155) Update Flink docker version to latest stable Flink version

2017-05-19 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan closed FLINK-3155.
-
Resolution: Fixed

> Update Flink docker version to latest stable Flink version
> --
>
> Key: FLINK-3155
> URL: https://issues.apache.org/jira/browse/FLINK-3155
> Project: Flink
>  Issue Type: Task
>  Components: Docker
>Affects Versions: 1.0.0, 1.1.0
>Reporter: Maximilian Michels
>Priority: Minor
>
> It would be nice to always set the Docker Flink binary URL to point to the 
> latest Flink version. Until then, this JIRA keeps track of the updates for 
> releases.





[jira] [Created] (FLINK-6644) Don't register HUP unix signal handler on Windows

2017-05-19 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-6644:
---

 Summary: Don't register HUP unix signal handler on Windows
 Key: FLINK-6644
 URL: https://issues.apache.org/jira/browse/FLINK-6644
 Project: Flink
  Issue Type: Bug
  Components: Local Runtime
Affects Versions: 1.4.0
Reporter: Chesnay Schepler


The JM/TM register UNIX signal handlers for INT, HUP and TERM on startup.

On Windows, HUP is not available, causing this exception to be logged every time:

{code}
2017-05-19 16:49:58,777 INFO  org.apache.flink.runtime.taskmanager.TaskManager  
- Error while registering signal handler
java.lang.IllegalArgumentException: Unknown signal: HUP
at sun.misc.Signal.<init>(Unknown Source)
at 
org.apache.flink.runtime.util.SignalHandler$Handler.<init>(SignalHandler.java:41)
at 
org.apache.flink.runtime.util.SignalHandler.register(SignalHandler.java:78)
at 
org.apache.flink.runtime.taskmanager.TaskManager$.main(TaskManager.scala:1525)
at 
org.apache.flink.runtime.taskmanager.TaskManager.main(TaskManager.scala)
2017-05-19 16:49:58,778 INFO  org.apache.flink.runtime.taskmanager.TaskManager  
- Registered UNIX signal handlers for [TERM, INT]
{code}
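The fix implied by the ticket, selecting the signal set by platform so HUP is only registered on Unix-like systems, can be sketched as below. This is a hypothetical helper, not Flink's `SignalHandler` code; actual registration would still go through `sun.misc.Signal` as in `SignalHandler#register`.

```java
import java.util.ArrayList;
import java.util.List;

public class SignalSelection {

    // Windows detection based on the os.name system property.
    static boolean isWindows(String osName) {
        return osName.toLowerCase().startsWith("windows");
    }

    // TERM and INT exist everywhere; HUP only on Unix-like platforms,
    // so it is skipped on Windows to avoid "Unknown signal: HUP".
    static List<String> signalsToRegister(String osName) {
        List<String> signals = new ArrayList<>();
        signals.add("TERM");
        signals.add("INT");
        if (!isWindows(osName)) {
            signals.add("HUP");
        }
        return signals;
    }

    public static void main(String[] args) {
        System.out.println(signalsToRegister(System.getProperty("os.name")));
    }
}
```

Branching before registration avoids relying on the `IllegalArgumentException` thrown by the `sun.misc.Signal` constructor as the control flow.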





[jira] [Commented] (FLINK-3155) Update Flink docker version to latest stable Flink version

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017538#comment-16017538
 ] 

ASF GitHub Bot commented on FLINK-3155:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3490
  
@kdombeck thank you for this PR! We now have [official Docker 
images](https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/docker.html)
 and I see that the contrib Dockerfile no longer hard codes a Flink version. If 
this satisfies your request please close this PR.


> Update Flink docker version to latest stable Flink version
> --
>
> Key: FLINK-3155
> URL: https://issues.apache.org/jira/browse/FLINK-3155
> Project: Flink
>  Issue Type: Task
>  Components: Docker
>Affects Versions: 1.0.0, 1.1.0
>Reporter: Maximilian Michels
>Priority: Minor
> Fix For: 1.0.0
>
>
> It would be nice to always set the Docker Flink binary URL to point to the 
> latest Flink version. Until then, this JIRA keeps track of the updates for 
> releases.





[GitHub] flink issue #3490: [FLINK-3155] Update Docker to use the latest 1.1.x versio...

2017-05-19 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3490
  
@kdombeck thank you for this PR! We now have [official Docker 
images](https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/docker.html)
 and I see that the contrib Dockerfile no longer hard codes a Flink version. If 
this satisfies your request please close this PR.




[jira] [Updated] (FLINK-6641) HA recovery on YARN: ClusterClient calls HighAvailabilityServices#closeAndCleanupAll

2017-05-19 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-6641:
--
Summary: HA recovery on YARN: ClusterClient calls 
HighAvailabilityServices#closeAndCleanupAll  (was: HA recovery on YARN fails 
due to missing files in HDFS)

> HA recovery on YARN: ClusterClient calls 
> HighAvailabilityServices#closeAndCleanupAll
> 
>
> Key: FLINK-6641
> URL: https://issues.apache.org/jira/browse/FLINK-6641
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, YARN
>Affects Versions: 1.3.0
>Reporter: Robert Metzger
>Assignee: Till Rohrmann
>Priority: Blocker
>
> While testing the 1.3 RC1, I was unable to recover from a JobManager failure.
> The following error messages appear in the JobManager that is started by YARN:
> {code}
> 2017-05-19 14:41:38,030 INFO  
> org.apache.flink.runtime.webmonitor.JobManagerRetriever   - New leader 
> reachable under 
> akka.tcp://flink@permanent-qa-cluster-i7c9.c.astral-sorter-757.internal:36528/user/jobmanager:6539dc04-d7fe-4f85-a0b6-09bfb0de8a58.
> 2017-05-19 14:41:38,033 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Resource Manager associating with leading JobManager 
> Actor[akka://flink/user/jobmanager#1602741108] - leader session 
> 6539dc04-d7fe-4f85-a0b6-09bfb0de8a58
> 2017-05-19 14:41:38,033 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Requesting new TaskManager container with 1024 megabytes 
> memory. Pending requests: 1
> 2017-05-19 14:41:38,781 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Received new container: 
> container_149487096_0061_02_02 - Remaining pending container 
> requests: 0
> 2017-05-19 14:41:38,782 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Launching TaskManager in container ContainerInLaunch @ 
> 1495204898782: Container: [ContainerId: 
> container_149487096_0061_02_02, NodeId: 
> permanent-qa-cluster-d3iz.c.astral-sorter-757.internal:8041, NodeHttpAddress: 
> permanent-qa-cluster-d3iz.c.astral-sorter-757.internal:8042, Resource: 
> , Priority: 0, Token: Token { kind: ContainerToken, 
> service: 10.240.0.32:8041 }, ] on host 
> permanent-qa-cluster-d3iz.c.astral-sorter-757.internal
> 2017-05-19 14:41:38,788 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - 
> Opening proxy : permanent-qa-cluster-d3iz.c.astral-sorter-757.internal:8041
> 2017-05-19 14:41:44,284 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Container container_149487096_0061_02_02 failed, with 
> a TaskManager in launch or registration. Exit status: -1000
> 2017-05-19 14:41:44,284 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Diagnostics for container 
> container_149487096_0061_02_02 in state COMPLETE : exitStatus=-1000 
> diagnostics=File does not exist: 
> hdfs://nameservice1/user/robert/.flink/application_149487096_0061/cf9287fe-ac75-4066-a648-91787d946890-taskmanager-conf.yaml
> java.io.FileNotFoundException: File does not exist: 
> hdfs://nameservice1/user/robert/.flink/application_149487096_0061/cf9287fe-ac75-4066-a648-91787d946890-taskmanager-conf.yaml
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1219)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1211)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1211)
>   at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
>   at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
>   at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
>   at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
>   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> 

[jira] [Commented] (FLINK-6641) HA recovery on YARN fails due to missing files in HDFS

2017-05-19 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017534#comment-16017534
 ] 

Robert Metzger commented on FLINK-6641:
---

Yes, I agree. This JIRA is actually two issues.

Let me create another one and rename this one (this one becomes the 
{{HighAvailabilityServices#closeAndCleanupAll}} one)

> HA recovery on YARN fails due to missing files in HDFS
> --
>
> Key: FLINK-6641
> URL: https://issues.apache.org/jira/browse/FLINK-6641
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, YARN
>Affects Versions: 1.3.0
>Reporter: Robert Metzger
>Assignee: Till Rohrmann
>Priority: Blocker
>
> While testing the 1.3 RC1, I was unable to recover from a JobManager failure.
> The following error messages appear in the JobManager that is started by YARN:
> {code}
> 2017-05-19 14:41:38,030 INFO  
> org.apache.flink.runtime.webmonitor.JobManagerRetriever   - New leader 
> reachable under 
> akka.tcp://flink@permanent-qa-cluster-i7c9.c.astral-sorter-757.internal:36528/user/jobmanager:6539dc04-d7fe-4f85-a0b6-09bfb0de8a58.
> 2017-05-19 14:41:38,033 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Resource Manager associating with leading JobManager 
> Actor[akka://flink/user/jobmanager#1602741108] - leader session 
> 6539dc04-d7fe-4f85-a0b6-09bfb0de8a58
> 2017-05-19 14:41:38,033 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Requesting new TaskManager container with 1024 megabytes 
> memory. Pending requests: 1
> 2017-05-19 14:41:38,781 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Received new container: 
> container_149487096_0061_02_02 - Remaining pending container 
> requests: 0
> 2017-05-19 14:41:38,782 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Launching TaskManager in container ContainerInLaunch @ 
> 1495204898782: Container: [ContainerId: 
> container_149487096_0061_02_02, NodeId: 
> permanent-qa-cluster-d3iz.c.astral-sorter-757.internal:8041, NodeHttpAddress: 
> permanent-qa-cluster-d3iz.c.astral-sorter-757.internal:8042, Resource: 
> , Priority: 0, Token: Token { kind: ContainerToken, 
> service: 10.240.0.32:8041 }, ] on host 
> permanent-qa-cluster-d3iz.c.astral-sorter-757.internal
> 2017-05-19 14:41:38,788 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - 
> Opening proxy : permanent-qa-cluster-d3iz.c.astral-sorter-757.internal:8041
> 2017-05-19 14:41:44,284 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Container container_149487096_0061_02_02 failed, with 
> a TaskManager in launch or registration. Exit status: -1000
> 2017-05-19 14:41:44,284 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Diagnostics for container 
> container_149487096_0061_02_02 in state COMPLETE : exitStatus=-1000 
> diagnostics=File does not exist: 
> hdfs://nameservice1/user/robert/.flink/application_149487096_0061/cf9287fe-ac75-4066-a648-91787d946890-taskmanager-conf.yaml
> java.io.FileNotFoundException: File does not exist: 
> hdfs://nameservice1/user/robert/.flink/application_149487096_0061/cf9287fe-ac75-4066-a648-91787d946890-taskmanager-conf.yaml
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1219)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1211)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1211)
>   at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
>   at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
>   at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
>   at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
>   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> 

[jira] [Created] (FLINK-6643) Flink restarts job in HA even if NoRestartStrategy is set

2017-05-19 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-6643:
-

 Summary: Flink restarts job in HA even if NoRestartStrategy is set
 Key: FLINK-6643
 URL: https://issues.apache.org/jira/browse/FLINK-6643
 Project: Flink
  Issue Type: Bug
  Components: JobManager
Affects Versions: 1.3.0
Reporter: Robert Metzger
Priority: Critical


While testing Flink 1.3 RC1, I found that the JobManager is trying to recover a 
job that had {{NoRestartStrategy}} set.

{code}
2017-05-19 15:09:04,038 INFO  org.apache.flink.yarn.YarnJobManager  
- Attempting to recover all jobs.
2017-05-19 15:09:04,039 DEBUG 
org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore  - 
Retrieving all stored job ids from ZooKeeper under 
flink/application_149487096_0064/jobgraphs.
2017-05-19 15:09:04,041 INFO  org.apache.flink.yarn.YarnJobManager  
- There are 1 jobs to recover. Starting the job recovery.
2017-05-19 15:09:04,043 INFO  org.apache.flink.yarn.YarnJobManager  
- Attempting to recover job f94b1f7a0e9e3dbcb160c687e476ca77.
2017-05-19 15:09:04,043 DEBUG 
org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore  - 
Recovering job graph f94b1f7a0e9e3dbcb160c687e476ca77 from 
flink/application_149487096_0064/jobgraphs/f94b1f7a0e9e3dbcb160c687e476ca77.
2017-05-19 15:09:04,078 WARN  org.apache.hadoop.util.NativeCodeLoader   
- Unable to load native-hadoop library for your platform... using 
builtin-java classes where applicable
2017-05-19 15:09:04,142 INFO  
org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore  - 
Recovered SubmittedJobGraph(f94b1f7a0e9e3dbcb160c687e476ca77, JobInfo(clients: 
Set((Actor[akka.tcp://flink@permanent-qa-cluster-master.c.astral-sorter-757.internal:40391/user/$a#-155566858],EXECUTION_RESULT_AND_STATE_CHANGES)),
 start: 1495206476885)).
2017-05-19 15:09:04,142 INFO  org.apache.flink.yarn.YarnJobManager  
- Submitting recovered job f94b1f7a0e9e3dbcb160c687e476ca77.
2017-05-19 15:09:04,143 INFO  org.apache.flink.yarn.YarnJobManager  
- Submitting job f94b1f7a0e9e3dbcb160c687e476ca77 
(CarTopSpeedWindowingExample) (Recovery).
2017-05-19 15:09:04,151 INFO  org.apache.flink.yarn.YarnJobManager  
- Using restart strategy NoRestartStrategy for 
f94b1f7a0e9e3dbcb160c687e476ca77.
2017-05-19 15:09:04,163 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph- Job recovers 
via failover strategy: full graph restart
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6641) HA recovery on YARN fails due to missing files in HDFS

2017-05-19 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017532#comment-16017532
 ] 

Till Rohrmann commented on FLINK-6641:
--

I think one problem is that the {{ClusterClient}} calls 
{{HighAvailabilityServices#closeAndCleanupAll}}. The other problem is probably 
that the {{YarnClusterClient}} deletes the session directory which also acts as 
the Yarn application working directory.

> HA recovery on YARN fails due to missing files in HDFS
> --
>
> Key: FLINK-6641
> URL: https://issues.apache.org/jira/browse/FLINK-6641
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, YARN
>Affects Versions: 1.3.0
>Reporter: Robert Metzger
>Assignee: Till Rohrmann
>Priority: Blocker
>
> While testing the 1.3 RC1, I was unable to recover from a JobManager failure.
> The following error messages appear in the JobManager that is started by YARN:
> {code}
> 2017-05-19 14:41:38,030 INFO  
> org.apache.flink.runtime.webmonitor.JobManagerRetriever   - New leader 
> reachable under 
> akka.tcp://flink@permanent-qa-cluster-i7c9.c.astral-sorter-757.internal:36528/user/jobmanager:6539dc04-d7fe-4f85-a0b6-09bfb0de8a58.
> 2017-05-19 14:41:38,033 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Resource Manager associating with leading JobManager 
> Actor[akka://flink/user/jobmanager#1602741108] - leader session 
> 6539dc04-d7fe-4f85-a0b6-09bfb0de8a58
> 2017-05-19 14:41:38,033 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Requesting new TaskManager container with 1024 megabytes 
> memory. Pending requests: 1
> 2017-05-19 14:41:38,781 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Received new container: 
> container_149487096_0061_02_02 - Remaining pending container 
> requests: 0
> 2017-05-19 14:41:38,782 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Launching TaskManager in container ContainerInLaunch @ 
> 1495204898782: Container: [ContainerId: 
> container_149487096_0061_02_02, NodeId: 
> permanent-qa-cluster-d3iz.c.astral-sorter-757.internal:8041, NodeHttpAddress: 
> permanent-qa-cluster-d3iz.c.astral-sorter-757.internal:8042, Resource: 
> , Priority: 0, Token: Token { kind: ContainerToken, 
> service: 10.240.0.32:8041 }, ] on host 
> permanent-qa-cluster-d3iz.c.astral-sorter-757.internal
> 2017-05-19 14:41:38,788 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - 
> Opening proxy : permanent-qa-cluster-d3iz.c.astral-sorter-757.internal:8041
> 2017-05-19 14:41:44,284 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Container container_149487096_0061_02_02 failed, with 
> a TaskManager in launch or registration. Exit status: -1000
> 2017-05-19 14:41:44,284 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Diagnostics for container 
> container_149487096_0061_02_02 in state COMPLETE : exitStatus=-1000 
> diagnostics=File does not exist: 
> hdfs://nameservice1/user/robert/.flink/application_149487096_0061/cf9287fe-ac75-4066-a648-91787d946890-taskmanager-conf.yaml
> java.io.FileNotFoundException: File does not exist: 
> hdfs://nameservice1/user/robert/.flink/application_149487096_0061/cf9287fe-ac75-4066-a648-91787d946890-taskmanager-conf.yaml
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1219)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1211)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1211)
>   at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
>   at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
>   at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
>   at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
>   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> 

[jira] [Updated] (FLINK-6642) EnvInformation#getOpenFileHandlesLimit should return -1 for Windows

2017-05-19 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-6642:

Description: 
The {{EnvironmentInformation#getOpenFileHandlesLimit}} is accessing the sun 
{{UnixOperatingSystemMXBean}} via reflection, but has no branch for Windows.

This causes the following exception to be logged whenever a TM is started:

{code}
2017-05-19 16:49:58,780 WARN  
org.apache.flink.runtime.util.EnvironmentInformation  - Unexpected 
error when accessing file handle limit
java.lang.IllegalArgumentException: object is not an instance of declaring class
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at 
org.apache.flink.runtime.util.EnvironmentInformation.getOpenFileHandlesLimit(EnvironmentInformation.java:245)
at 
org.apache.flink.runtime.taskmanager.TaskManager$.main(TaskManager.scala:1528)
at 
org.apache.flink.runtime.taskmanager.TaskManager.main(TaskManager.scala)
2017-05-19 16:49:58,780 INFO  org.apache.flink.runtime.taskmanager.TaskManager  
- Cannot determine the maximum number of open file descriptors

{code}

  was:
The {{EnvironmentInformation#getOpenFileHandlesLimit}} is accessing t he sun 
{{UnixOperatingSystemMXBean}} via reflection, but has no branch for Windows.

This causes the following exception to be logged whenever a TM is started:

{code}
2017-05-19 16:49:58,780 WARN  
org.apache.flink.runtime.util.EnvironmentInformation  - Unexpected 
error when accessing file handle limit
java.lang.IllegalArgumentException: object is not an instance of declaring class
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at 
org.apache.flink.runtime.util.EnvironmentInformation.getOpenFileHandlesLimit(EnvironmentInformation.java:245)
at 
org.apache.flink.runtime.taskmanager.TaskManager$.main(TaskManager.scala:1528)
at 
org.apache.flink.runtime.taskmanager.TaskManager.main(TaskManager.scala)
2017-05-19 16:49:58,780 INFO  org.apache.flink.runtime.taskmanager.TaskManager  
- Cannot determine the maximum number of open file descriptors

{code}


> EnvInformation#getOpenFileHandlesLimit should return -1 for Windows
> ---
>
> Key: FLINK-6642
> URL: https://issues.apache.org/jira/browse/FLINK-6642
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> The {{EnvironmentInformation#getOpenFileHandlesLimit}} is accessing the sun 
> {{UnixOperatingSystemMXBean}} via reflection, but has no branch for Windows.
> This causes the following exception to be logged whenever a TM is started:
> {code}
> 2017-05-19 16:49:58,780 WARN  
> org.apache.flink.runtime.util.EnvironmentInformation  - Unexpected 
> error when accessing file handle limit
> java.lang.IllegalArgumentException: object is not an instance of declaring 
> class
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> at java.lang.reflect.Method.invoke(Unknown Source)
> at 
> org.apache.flink.runtime.util.EnvironmentInformation.getOpenFileHandlesLimit(EnvironmentInformation.java:245)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager$.main(TaskManager.scala:1528)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.main(TaskManager.scala)
> 2017-05-19 16:49:58,780 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Cannot 
> determine the maximum number of open file descriptors
> {code}





[jira] [Created] (FLINK-6642) EnvInformation#getOpenFileHandlesLimit should return -1 for Windows

2017-05-19 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-6642:
---

 Summary: EnvInformation#getOpenFileHandlesLimit should return -1 
for Windows
 Key: FLINK-6642
 URL: https://issues.apache.org/jira/browse/FLINK-6642
 Project: Flink
  Issue Type: Bug
  Components: Local Runtime
Affects Versions: 1.4.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler


The {{EnvironmentInformation#getOpenFileHandlesLimit}} is accessing the sun 
{{UnixOperatingSystemMXBean}} via reflection, but has no branch for Windows.

This causes the following exception to be logged whenever a TM is started:

{code}
2017-05-19 16:49:58,780 WARN  
org.apache.flink.runtime.util.EnvironmentInformation  - Unexpected 
error when accessing file handle limit
java.lang.IllegalArgumentException: object is not an instance of declaring class
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at 
org.apache.flink.runtime.util.EnvironmentInformation.getOpenFileHandlesLimit(EnvironmentInformation.java:245)
at 
org.apache.flink.runtime.taskmanager.TaskManager$.main(TaskManager.scala:1528)
at 
org.apache.flink.runtime.taskmanager.TaskManager.main(TaskManager.scala)
2017-05-19 16:49:58,780 INFO  org.apache.flink.runtime.taskmanager.TaskManager  
- Cannot determine the maximum number of open file descriptors

{code}
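For illustration, the guard the issue title asks for does not even need reflection: the following is a hedged sketch (not the actual Flink patch, which keeps its reflective access for compile-time independence) that returns -1 whenever the platform bean is not the Unix variant, e.g. on Windows.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Sketch only: -1 signals "limit unknown/unsupported on this platform". */
public class OpenFileLimit {

    public static long getOpenFileHandlesLimit() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // On Windows (and any non-Unix JVM) the bean is not the Unix variant,
        // so bail out with -1 instead of invoking a method on the wrong type,
        // which is exactly what raises the IllegalArgumentException above.
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getMaxFileDescriptorCount();
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("open file handle limit: " + getOpenFileHandlesLimit());
    }
}
```

On a Unix JVM this prints the process file-descriptor limit; elsewhere it prints -1 without logging a warning.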





[jira] [Assigned] (FLINK-6641) HA recovery on YARN fails due to missing files in HDFS

2017-05-19 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann reassigned FLINK-6641:


Assignee: Till Rohrmann

> HA recovery on YARN fails due to missing files in HDFS
> --
>
> Key: FLINK-6641
> URL: https://issues.apache.org/jira/browse/FLINK-6641
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, YARN
>Affects Versions: 1.3.0
>Reporter: Robert Metzger
>Assignee: Till Rohrmann
>Priority: Blocker
>
> While testing the 1.3 RC1, I was unable to recover from a JobManager failure.
> The following error messages appear in the JobManager that is started by YARN:
> {code}
> 2017-05-19 14:41:38,030 INFO  
> org.apache.flink.runtime.webmonitor.JobManagerRetriever   - New leader 
> reachable under 
> akka.tcp://flink@permanent-qa-cluster-i7c9.c.astral-sorter-757.internal:36528/user/jobmanager:6539dc04-d7fe-4f85-a0b6-09bfb0de8a58.
> 2017-05-19 14:41:38,033 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Resource Manager associating with leading JobManager 
> Actor[akka://flink/user/jobmanager#1602741108] - leader session 
> 6539dc04-d7fe-4f85-a0b6-09bfb0de8a58
> 2017-05-19 14:41:38,033 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Requesting new TaskManager container with 1024 megabytes 
> memory. Pending requests: 1
> 2017-05-19 14:41:38,781 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Received new container: 
> container_149487096_0061_02_02 - Remaining pending container 
> requests: 0
> 2017-05-19 14:41:38,782 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Launching TaskManager in container ContainerInLaunch @ 
> 1495204898782: Container: [ContainerId: 
> container_149487096_0061_02_02, NodeId: 
> permanent-qa-cluster-d3iz.c.astral-sorter-757.internal:8041, NodeHttpAddress: 
> permanent-qa-cluster-d3iz.c.astral-sorter-757.internal:8042, Resource: 
> , Priority: 0, Token: Token { kind: ContainerToken, 
> service: 10.240.0.32:8041 }, ] on host 
> permanent-qa-cluster-d3iz.c.astral-sorter-757.internal
> 2017-05-19 14:41:38,788 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - 
> Opening proxy : permanent-qa-cluster-d3iz.c.astral-sorter-757.internal:8041
> 2017-05-19 14:41:44,284 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Container container_149487096_0061_02_02 failed, with 
> a TaskManager in launch or registration. Exit status: -1000
> 2017-05-19 14:41:44,284 INFO  org.apache.flink.yarn.YarnFlinkResourceManager  
>   - Diagnostics for container 
> container_149487096_0061_02_02 in state COMPLETE : exitStatus=-1000 
> diagnostics=File does not exist: 
> hdfs://nameservice1/user/robert/.flink/application_149487096_0061/cf9287fe-ac75-4066-a648-91787d946890-taskmanager-conf.yaml
> java.io.FileNotFoundException: File does not exist: 
> hdfs://nameservice1/user/robert/.flink/application_149487096_0061/cf9287fe-ac75-4066-a648-91787d946890-taskmanager-conf.yaml
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1219)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1211)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1211)
>   at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
>   at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
>   at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
>   at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
>   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> and
> {code}
> 2017-05-19 14:41:48,133 INFO  org.apache.flink.yarn.YarnJobManager   

[jira] [Commented] (FLINK-6628) Cannot start taskmanager with cygwin in directory containing spaces

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017521#comment-16017521
 ] 

ASF GitHub Bot commented on FLINK-6628:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3954
  
This affects more than Windows ... at least MacOS and I expect Linux as 
well.

Also need to update `rotateLogFilesWithPrefix` in `config.sh` to: 
`rotateLogFile "$log"`.


> Cannot start taskmanager with cygwin in directory containing spaces
> ---
>
> Key: FLINK-6628
> URL: https://issues.apache.org/jira/browse/FLINK-6628
> Project: Flink
>  Issue Type: Bug
>  Components: Startup Shell Scripts
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> {code}
> Zento@Lily 
> /cygdrive/c/Users/Zento/Documents/GitHub/flink/flink-dist/target/flink-1.3.0-bin/flink
>  1.3.0
> $ ./bin/taskmanager.sh start
> ./bin/taskmanager.sh: line 82: 
> C:\Users\Zento\Documents\GitHub\flink\flink-dist\target\flink-1.3.0-bin\flink:
>  command not found
> {code}





[GitHub] flink issue #3954: [FLINK-6628] Fix start scripts on Windows

2017-05-19 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3954
  
This affects more than Windows ... at least MacOS and I expect Linux as 
well.

Also need to update `rotateLogFilesWithPrefix` in `config.sh` to: 
`rotateLogFile "$log"`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-6446) Various improvements to the Web Frontend

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017514#comment-16017514
 ] 

ASF GitHub Bot commented on FLINK-6446:
---

Github user greghogan commented on a diff in the pull request:

https://github.com/apache/flink/pull/3946#discussion_r117504104
  
--- Diff: flink-runtime-web/web-dashboard/app/partials/jobs/job.plan.jade 
---
@@ -28,10 +28,10 @@ split
 a(ui-sref=".subtasks({nodeid: nodeid})") Subtasks
 
   li(ui-sref-active='active')
-a(ui-sref=".taskmanagers({nodeid: nodeid})") TaskManagers
+a(ui-sref=".taskmanagers({nodeid: nodeid})") Subtasks by 
TaskManager
--- End diff --

The subtask statistics are aggregated by TaskManager so I'm not sure this 
is an improvement. @StephanEwen?


> Various improvements to the Web Frontend
> 
>
> Key: FLINK-6446
> URL: https://issues.apache.org/jira/browse/FLINK-6446
> Project: Flink
>  Issue Type: Improvement
>  Components: Webfrontend
>Reporter: Stephan Ewen
>
> This is the umbrella issue for various improvements to the web frontend,





[GitHub] flink pull request #3946: [FLINK-6446] Fix some small issues in the web UI

2017-05-19 Thread greghogan
Github user greghogan commented on a diff in the pull request:

https://github.com/apache/flink/pull/3946#discussion_r117504104
  
--- Diff: flink-runtime-web/web-dashboard/app/partials/jobs/job.plan.jade 
---
@@ -28,10 +28,10 @@ split
 a(ui-sref=".subtasks({nodeid: nodeid})") Subtasks
 
   li(ui-sref-active='active')
-a(ui-sref=".taskmanagers({nodeid: nodeid})") TaskManagers
+a(ui-sref=".taskmanagers({nodeid: nodeid})") Subtasks by 
TaskManager
--- End diff --

The subtask statistics are aggregated by TaskManager so I'm not sure this 
is an improvement. @StephanEwen?




[jira] [Commented] (FLINK-6576) Allow ConfigOptions to validate the configured value

2017-05-19 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017507#comment-16017507
 ] 

Chesnay Schepler commented on FLINK-6576:
-

[~greghogan] wrong issue? :) Think you're looking for FLINK-6628.

> Allow ConfigOptions to validate the configured value
> 
>
> Key: FLINK-6576
> URL: https://issues.apache.org/jira/browse/FLINK-6576
> Project: Flink
>  Issue Type: Wish
>  Components: Configuration
>Reporter: Chesnay Schepler
>Priority: Trivial
>
> It is not unusual for a config parameter to only accept certain values. Ports 
> may not be negative, modes may essentially be enums, and file paths must not 
> be gibberish.
> Currently these validations happen manually after the value was extracted 
> from the {{Configuration}}.
> I want to start a discussion on whether we want to move this validation into 
> the ConfigOption itself.
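A minimal sketch of what option-level validation could look like. Note that {{ValidatedOption}}, {{checked}}, and the constructor shape are hypothetical names invented here for discussion; this is not existing Flink API.

```java
import java.util.function.Predicate;

/** Hypothetical sketch: a config option that validates values on access. */
public class ValidatedOption<T> {
    private final String key;
    private final T defaultValue;
    private final Predicate<T> validator;

    public ValidatedOption(String key, T defaultValue, Predicate<T> validator) {
        this.key = key;
        this.defaultValue = defaultValue;
        this.validator = validator;
    }

    /** Returns the configured value (or the default) if it passes validation,
     *  otherwise fails fast instead of leaving the check to every caller. */
    public T checked(T configuredValue) {
        T value = configuredValue != null ? configuredValue : defaultValue;
        if (!validator.test(value)) {
            throw new IllegalArgumentException(
                    "Invalid value for '" + key + "': " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        // Ports may not be negative (and must fit in 16 bits):
        ValidatedOption<Integer> port =
                new ValidatedOption<>("rest.port", 8081, p -> p >= 0 && p <= 65535);
        System.out.println(port.checked(6123));  // prints 6123
        try {
            port.checked(-1);                    // rejected by the validator
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The design question in the issue is essentially whether the {{Predicate}} lives on the option (as above) or stays scattered at the call sites that read the {{Configuration}}.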





[jira] [Created] (FLINK-6641) HA recovery on YARN fails due to missing files in HDFS

2017-05-19 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-6641:
-

 Summary: HA recovery on YARN fails due to missing files in HDFS
 Key: FLINK-6641
 URL: https://issues.apache.org/jira/browse/FLINK-6641
 Project: Flink
  Issue Type: Bug
  Components: Distributed Coordination, YARN
Affects Versions: 1.3.0
Reporter: Robert Metzger
Priority: Blocker


While testing the 1.3 RC1, I was unable to recover from a JobManager failure.

The following error messages appear in the JobManager that is started by YARN:

{code}
2017-05-19 14:41:38,030 INFO  
org.apache.flink.runtime.webmonitor.JobManagerRetriever   - New leader 
reachable under 
akka.tcp://flink@permanent-qa-cluster-i7c9.c.astral-sorter-757.internal:36528/user/jobmanager:6539dc04-d7fe-4f85-a0b6-09bfb0de8a58.
2017-05-19 14:41:38,033 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
- Resource Manager associating with leading JobManager 
Actor[akka://flink/user/jobmanager#1602741108] - leader session 
6539dc04-d7fe-4f85-a0b6-09bfb0de8a58
2017-05-19 14:41:38,033 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
- Requesting new TaskManager container with 1024 megabytes memory. 
Pending requests: 1
2017-05-19 14:41:38,781 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
- Received new container: container_149487096_0061_02_02 - 
Remaining pending container requests: 0
2017-05-19 14:41:38,782 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
- Launching TaskManager in container ContainerInLaunch @ 
1495204898782: Container: [ContainerId: container_149487096_0061_02_02, 
NodeId: permanent-qa-cluster-d3iz.c.astral-sorter-757.internal:8041, 
NodeHttpAddress: permanent-qa-cluster-d3iz.c.astral-sorter-757.internal:8042, 
Resource: , Priority: 0, Token: Token { kind: 
ContainerToken, service: 10.240.0.32:8041 }, ] on host 
permanent-qa-cluster-d3iz.c.astral-sorter-757.internal
2017-05-19 14:41:38,788 INFO  
org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy  - 
Opening proxy : permanent-qa-cluster-d3iz.c.astral-sorter-757.internal:8041
2017-05-19 14:41:44,284 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
- Container container_149487096_0061_02_02 failed, with a 
TaskManager in launch or registration. Exit status: -1000
2017-05-19 14:41:44,284 INFO  org.apache.flink.yarn.YarnFlinkResourceManager
- Diagnostics for container container_149487096_0061_02_02 
in state COMPLETE : exitStatus=-1000 diagnostics=File does not exist: 
hdfs://nameservice1/user/robert/.flink/application_149487096_0061/cf9287fe-ac75-4066-a648-91787d946890-taskmanager-conf.yaml
java.io.FileNotFoundException: File does not exist: 
hdfs://nameservice1/user/robert/.flink/application_149487096_0061/cf9287fe-ac75-4066-a648-91787d946890-taskmanager-conf.yaml
at 
org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1219)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1211)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1211)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{code}

and

{code}
2017-05-19 14:41:48,133 INFO  org.apache.flink.yarn.YarnJobManager  
- Submitting recovered job 35df3b24209e9ebea12407e6747a746c.
2017-05-19 14:41:48,133 INFO  org.apache.flink.yarn.YarnJobManager  
- Submitting job 35df3b24209e9ebea12407e6747a746c 
(CarTopSpeedWindowingExample) (Recovery).
2017-05-19 14:41:48,139 ERROR org.apache.flink.yarn.YarnJobManager

[jira] [Commented] (FLINK-6628) Cannot start taskmanager with cygwin in directory containing spaces

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017495#comment-16017495
 ] 

ASF GitHub Bot commented on FLINK-6628:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/3954

[FLINK-6628] Fix start scripts on Windows

This PR fixes the `flink-daemon.sh` and `taskmanager.sh` script for cygwin 
for directories containing spaces.
They were broken in 11c868f91db773af626ac6ac4dcba9820c13fa8a.

The change in the daemon script is straightforward: to prevent the path 
containing the space from being interpreted as two arguments, we wrap it in `"`, 
which causes the entire path, including the space, to be passed as a single argument.

The change in the taskmanager script is as subtle as it gets. When storing 
the call like this `TM_COMMAND="${FLINK_BIN_DIR}/flink-daemon.sh $STARTSTOP 
taskmanager ${args[@]}"` and executing it later, spaces are once again used 
to separate arguments, breaking the execution. We can't use the trick used 
above, since then the entire expression is used as a single argument (i.e. 
the path to an executable), which obviously fails.
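The word-splitting hazard described above is not shell-specific. As an illustration only (not part of this PR), the same rule of "one argument per array element" can be shown with Java's `ProcessBuilder`: passing the path as its own element preserves the space, while splitting a flat command string shatters it.

```java
import java.io.File;
import java.nio.file.Files;

public class ArgSplitting {

    // Launch a command where each array element is exactly one argument.
    static int run(String... argv) throws Exception {
        return new ProcessBuilder(argv).start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        // A path containing a space, as in the bug report.
        File dir = Files.createTempDirectory("dir with space").toFile();
        String path = dir.getAbsolutePath();

        // One argument per element: the space survives, ls succeeds.
        System.out.println(run("ls", path));                     // 0 on Unix

        // Flat string split on whitespace: the path shatters into
        // several bogus arguments, the shell-script failure mode.
        System.out.println(run(("ls " + path).split(" ")));      // non-zero
    }
}
```

This is the same reason `"${args[@]}"` (quoted array expansion) is the safe form in bash, while interpolating the arguments into a single string variable is not.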

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 6628_win_script

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3954.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3954


commit f4da72abc7d8d5c380f4552eead69ee477833c8a
Author: zentol 
Date:   2017-05-19T14:50:37Z

[FLINK-6628] Fix start scripts on Windows




> Cannot start taskmanager with cygwin in directory containing spaces
> ---
>
> Key: FLINK-6628
> URL: https://issues.apache.org/jira/browse/FLINK-6628
> Project: Flink
>  Issue Type: Bug
>  Components: Startup Shell Scripts
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> {code}
> Zento@Lily 
> /cygdrive/c/Users/Zento/Documents/GitHub/flink/flink-dist/target/flink-1.3.0-bin/flink
>  1.3.0
> $ ./bin/taskmanager.sh start
> ./bin/taskmanager.sh: line 82: 
> C:\Users\Zento\Documents\GitHub\flink\flink-dist\target\flink-1.3.0-bin\flink:
>  command not found
> {code}





[GitHub] flink pull request #3954: [FLINK-6628] Fix start scripts on Windows

2017-05-19 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/3954

[FLINK-6628] Fix start scripts on Windows

This PR fixes the `flink-daemon.sh` and `taskmanager.sh` script for cygwin 
for directories containing spaces.
They were broken in 11c868f91db773af626ac6ac4dcba9820c13fa8a.

The change in the daemon script is straightforward: to prevent the path 
containing the space from being interpreted as two arguments, we wrap it in `"`, 
which causes the entire path, including the space, to be passed as a single argument.

The change in the taskmanager script is as subtle as it gets. When storing 
the call like this `TM_COMMAND="${FLINK_BIN_DIR}/flink-daemon.sh $STARTSTOP 
taskmanager ${args[@]}"` and executing it later, spaces are once again used 
to separate arguments, breaking the execution. We can't use the trick used 
above, since then the entire expression is used as a single argument (i.e. the 
path to an executable), which obviously fails.
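The difference between storing the call in a flat string and in a Bash array can be sketched as follows. This is a minimal illustration with hypothetical paths and script names, not the actual Flink scripts:

```shell
#!/usr/bin/env bash
# Hypothetical setup: a helper script in a directory whose name contains a space.
dir="/tmp/flink demo.$$"
mkdir -p "$dir"
printf '#!/usr/bin/env bash\necho ok\n' > "$dir/daemon.sh"
chmod +x "$dir/daemon.sh"

# Broken: the whole call is stored in a flat string; when executed unquoted,
# word splitting turns the path into two arguments at the space.
TM_COMMAND="$dir/daemon.sh start"
# $TM_COMMAND   # would fail: /tmp/flink: No such file or directory

# Robust: keep the executable and its arguments as separate array elements;
# "${cmd[@]}" expands every element as exactly one word, spaces included.
cmd=("$dir/daemon.sh" start)
out=$("${cmd[@]}")
echo "$out"     # prints: ok
```

Quoting the whole string (`"$TM_COMMAND"`) would not help either: that makes the entire string, path plus arguments, a single word, which is exactly the "single argument" failure described above.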

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 6628_win_script

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3954.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3954


commit f4da72abc7d8d5c380f4552eead69ee477833c8a
Author: zentol 
Date:   2017-05-19T14:50:37Z

[FLINK-6628] Fix start scripts on Windows




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-6576) Allow ConfigOptions to validate the configured value

2017-05-19 Thread Greg Hogan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017493#comment-16017493
 ] 

Greg Hogan commented on FLINK-6576:
---

I happened to look at this as well; in addition to {{taskmanager.sh}}, the 
{{flink-daemon.sh}} log rotation needs escaping: {{rotateLogFilesWithPrefix 
"$FLINK_LOG_DIR" "$FLINK_LOG_PREFIX"}}.
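The quoting point can be illustrated with a minimal rotation helper. The function and variable names here are illustrative, not the real flink-daemon.sh implementation:

```shell
#!/usr/bin/env bash
# Minimal sketch of prefix-based log rotation (illustrative names only).
rotate_logs_with_prefix() {
    local dir="$1" prefix="$2"
    for f in "$dir/$prefix"*; do      # quote the variable part of the glob
        [ -e "$f" ] || continue
        mv -- "$f" "$f.old"           # quoted, so spaces in paths survive
    done
}

log_dir="/tmp/flink log demo.$$"
mkdir -p "$log_dir"
touch "$log_dir/taskmanager.log"

# Both arguments must be quoted: without the quotes, "$log_dir" would be
# split at the spaces and the function would look in "/tmp/flink" instead.
rotate_logs_with_prefix "$log_dir" "taskmanager"
```

After the call, `taskmanager.log` has been renamed to `taskmanager.log.old` even though the directory name contains spaces.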

> Allow ConfigOptions to validate the configured value
> 
>
> Key: FLINK-6576
> URL: https://issues.apache.org/jira/browse/FLINK-6576
> Project: Flink
>  Issue Type: Wish
>  Components: Configuration
>Reporter: Chesnay Schepler
>Priority: Trivial
>
> It is not unusual for a config parameter to only accept certain values. Ports 
> may not be negative, modes may essentially be enums, and file paths must not 
> be gibberish.
> Currently these validations happen manually after the value was extracted 
> from the {{Configuration}}.
> I want to start a discussion on whether we want to move this validation into 
> the ConfigOption itself.





[jira] [Assigned] (FLINK-6628) Cannot start taskmanager with cygwin in directory containing spaces

2017-05-19 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-6628:
---

Assignee: Chesnay Schepler

> Cannot start taskmanager with cygwin in directory containing spaces
> ---
>
> Key: FLINK-6628
> URL: https://issues.apache.org/jira/browse/FLINK-6628
> Project: Flink
>  Issue Type: Bug
>  Components: Startup Shell Scripts
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> {code}
> Zento@Lily 
> /cygdrive/c/Users/Zento/Documents/GitHub/flink/flink-dist/target/flink-1.3.0-bin/flink
>  1.3.0
> $ ./bin/taskmanager.sh start
> ./bin/taskmanager.sh: line 82: 
> C:\Users\Zento\Documents\GitHub\flink\flink-dist\target\flink-1.3.0-bin\flink:
>  command not found
> {code}





[jira] [Updated] (FLINK-6639) Java/Scala code tabs broken in CEP docs

2017-05-19 Thread Kostas Kloudas (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kostas Kloudas updated FLINK-6639:
--
Component/s: CEP

> Java/Scala code tabs broken in CEP docs
> ---
>
> Key: FLINK-6639
> URL: https://issues.apache.org/jira/browse/FLINK-6639
> Project: Flink
>  Issue Type: Bug
>  Components: CEP, Documentation
>Reporter: David Anderson
>Assignee: David Anderson
>
> A missing closing tag is breaking the JS that does the tab switching between 
> the Java and Scala tabs on the CEP page in the docs.





[jira] [Commented] (FLINK-6297) CEP timeout does not trigger under certain conditions

2017-05-19 Thread Kostas Kloudas (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017482#comment-16017482
 ] 

Kostas Kloudas commented on FLINK-6297:
---

Hi [~vijayakumarpl], is this problem still occurring? 

If not, I would like to close this issue so that JIRAs accurately reflect what 
is happening in FlinkCEP. 

If it is still occurring, could you please post a minimal example to reproduce 
the problem? As I said previously, I tried to reproduce it but could not.

> CEP timeout does not trigger under certain conditions
> -
>
> Key: FLINK-6297
> URL: https://issues.apache.org/jira/browse/FLINK-6297
> Project: Flink
>  Issue Type: Bug
>  Components: CEP
>Affects Versions: 1.2.0
>Reporter: Vijayakumar Palaniappan
>
> -TimeoutPattern does not trigger under certain conditions. Following are the 
> preconditions: 
> -Assume a pattern of Event A followed by Event B within 2 Seconds
> -PeriodicWaterMarks every 1 second
> -Assume following events have arrived. 
> -Event A-1[time: 1 sec]
> -Event B-1[time: 2 sec] 
> -Event A-2[time: 2 sec]
> -Event A-3[time: 5 sec] 
> -WaterMark[time: 5 sec]
> I would assume that after the watermark arrives, events A-1 and B-1 are 
> detected and A-2 times out. But the A-2 timeout does not happen.
> If I use a punctuated watermark and generate a watermark for every event, it 
> seems to work as expected.





[jira] [Commented] (FLINK-5753) CEP timeout handler.

2017-05-19 Thread Kostas Kloudas (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017476#comment-16017476
 ] 

Kostas Kloudas commented on FLINK-5753:
---

Hi [~jurijuri], did the above solve your problem?

If yes, I would like to close this issue, so that the open JIRAs accurately 
reflect what is happening in FlinkCEP.

Cheers!

> CEP timeout handler.
> 
>
> Key: FLINK-5753
> URL: https://issues.apache.org/jira/browse/FLINK-5753
> Project: Flink
>  Issue Type: Bug
>  Components: CEP
>Affects Versions: 1.1.2
>Reporter: Michał Jurkiewicz
>Assignee: Kostas Kloudas
>
> I configured the following flink job in my environment:
> {code}
> Pattern<Event, ?> patternCommandStarted = Pattern.<Event>
> begin("event-accepted").subtype(Event.class)
> .where(e -> {event accepted where 
> statement}).next("second-event-started").subtype(Event.class)
> .where(e -> {event started where statement})
> .within(Time.seconds(30));
> DataStream<Either<Event, Event>> events = CEP
>   .pattern(eventsStream.keyBy(e -> e.getEventProperties().get("deviceCode")), 
> patternCommandStarted)
>   .select(eventSelector, eventSelector);
> static class EventSelector implements PatternSelectFunction<Event, Event>, 
> PatternTimeoutFunction<Event, Event> {}
> {code}
> The problem that I have is related to timeout handling. I observed that if 
> the first event appears, the second event does not appear in the stream, 
> and *no new events appear in the stream*, the timeout handler is not executed.
> Expected result: the timeout handler should be executed even if there are no 
> new events in the stream.





[jira] [Commented] (FLINK-5898) Race-Condition with Amazon Kinesis KPL

2017-05-19 Thread Scott Kidder (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017473#comment-16017473
 ] 

Scott Kidder commented on FLINK-5898:
-

Hi Gordon! I think this issue and FLINK-5946 warrant upgrading the default KPL 
dependency to 0.12.4.

Also included in KPL 0.12.4 is a change to the AWS SDK core library dependency. 
In previous versions of the KPL, this dependency was expressed as a range, but 
now it's pinned to a specific version: `1.11.128`.

I've been using `1.11.86` with my patched KPL, but would like to test 
`1.11.128` before suggesting we upgrade the KPL dependency to 0.12.4.

What do you think?

> Race-Condition with Amazon Kinesis KPL
> --
>
> Key: FLINK-5898
> URL: https://issues.apache.org/jira/browse/FLINK-5898
> Project: Flink
>  Issue Type: Bug
>  Components: Kinesis Connector
>Affects Versions: 1.2.0
>Reporter: Scott Kidder
>
> The Flink Kinesis streaming-connector uses the Amazon Kinesis Producer 
> Library (KPL) to send messages to Kinesis streams. The KPL relies on a native 
> binary client to send messages to achieve better performance.
> When a Kinesis Producer is instantiated, the KPL will extract the native 
> binary to a sub-directory of `/tmp` (or whatever the platform-specific 
> temporary directory happens to be).
> The KPL tries to prevent multiple processes from extracting the binary at the 
> same time by wrapping the operation in a mutex. Unfortunately, this does not 
> prevent multiple Flink cores from trying to perform this operation at the 
> same time. If two or more processes attempt to do this at the same time, then 
> the native binary in /tmp will be corrupted.
> The authors of the KPL are aware of this possibility and suggest that users 
> of the KPL  not do that ... (sigh):
> https://github.com/awslabs/amazon-kinesis-producer/issues/55#issuecomment-251408897
> I encountered this in my production environment when bringing up a new Flink 
> task-manager with multiple cores and restoring from an earlier savepoint, 
> resulting in the instantiation of a KPL client on each core at roughly the 
> same time.
> A stack-trace follows:
> {noformat}
> java.lang.RuntimeException: Could not copy native binaries to temp directory 
> /tmp/amazon-kinesis-producer-native-binaries
>   at 
> com.amazonaws.services.kinesis.producer.KinesisProducer.extractBinaries(KinesisProducer.java:849)
>   at 
> com.amazonaws.services.kinesis.producer.KinesisProducer.<init>(KinesisProducer.java:243)
>   at 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer.open(FlinkKinesisProducer.java:198)
>   at 
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
>   at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:112)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:376)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:252)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:655)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.SecurityException: The contents of the binary 
> /tmp/amazon-kinesis-producer-native-binaries/kinesis_producer_e9a87c761db92a73eb74519a4468ee71def87eb2
>  is not what it's expected to be.
>   at 
> com.amazonaws.services.kinesis.producer.KinesisProducer.extractBinaries(KinesisProducer.java:822)
>   ... 8 more
> {noformat}





[jira] [Closed] (FLINK-6634) NFA serializer does not serialize the ComputationState counter.

2017-05-19 Thread Kostas Kloudas (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kostas Kloudas closed FLINK-6634.
-
Resolution: Fixed

> NFA serializer does not serialize the ComputationState counter.
> ---
>
> Key: FLINK-6634
> URL: https://issues.apache.org/jira/browse/FLINK-6634
> Project: Flink
>  Issue Type: Bug
>  Components: CEP
>Affects Versions: 1.3.0
>Reporter: Kostas Kloudas
>Assignee: Kostas Kloudas
> Fix For: 1.3.0
>
>






[jira] [Updated] (FLINK-6634) NFA serializer does not serialize the ComputationState counter.

2017-05-19 Thread Kostas Kloudas (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kostas Kloudas updated FLINK-6634:
--
Summary: NFA serializer does not serialize the ComputationState counter.  
(was: NFA serializer does not serialize the ValueTimeWrapper counter.)

> NFA serializer does not serialize the ComputationState counter.
> ---
>
> Key: FLINK-6634
> URL: https://issues.apache.org/jira/browse/FLINK-6634
> Project: Flink
>  Issue Type: Bug
>  Components: CEP
>Affects Versions: 1.3.0
>Reporter: Kostas Kloudas
>Assignee: Kostas Kloudas
> Fix For: 1.3.0
>
>






[jira] [Commented] (FLINK-6634) NFA serializer does not serialize the ValueTimeWrapper counter.

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017465#comment-16017465
 ] 

ASF GitHub Bot commented on FLINK-6634:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3945


> NFA serializer does not serialize the ValueTimeWrapper counter.
> ---
>
> Key: FLINK-6634
> URL: https://issues.apache.org/jira/browse/FLINK-6634
> Project: Flink
>  Issue Type: Bug
>  Components: CEP
>Affects Versions: 1.3.0
>Reporter: Kostas Kloudas
>Assignee: Kostas Kloudas
> Fix For: 1.3.0
>
>






[GitHub] flink pull request #3945: [FLINK-6634] [cep] NFASerializer serializes Comput...

2017-05-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3945




[jira] [Commented] (FLINK-6551) OutputTag name should not be allowed to be empty

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017462#comment-16017462
 ] 

ASF GitHub Bot commented on FLINK-6551:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/3953

[FLINK-6551] Reject empty OutputTag names

With this PR we reject empty `OutputTag` names. Having an empty name 
prevents useful logging messages, as they would effectively contain no 
information about the tag.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 6551_outputtag_empty

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3953.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3953


commit 7e8a2f0cfd30466de1c99195a70992e92364efbc
Author: zentol 
Date:   2017-05-19T14:20:05Z

[FLINK-6551] Reject empty OutputTag names




> OutputTag name should not be allowed to be empty
> 
>
> Key: FLINK-6551
> URL: https://issues.apache.org/jira/browse/FLINK-6551
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> When creating an OutputTag it is required to give it a name.
> While we do enforce that the name is not null we do not have a check in place 
> that prevents passing an empty string.





[GitHub] flink pull request #3953: [FLINK-6551] Reject empty OutputTag names

2017-05-19 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/3953

[FLINK-6551] Reject empty OutputTag names

With this PR we reject empty `OutputTag` names. Having an empty name 
prevents useful logging messages, as they would effectively contain no 
information about the tag.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 6551_outputtag_empty

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3953.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3953


commit 7e8a2f0cfd30466de1c99195a70992e92364efbc
Author: zentol 
Date:   2017-05-19T14:20:05Z

[FLINK-6551] Reject empty OutputTag names






[jira] [Assigned] (FLINK-6551) OutputTag name should not be allowed to be empty

2017-05-19 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-6551:
---

Assignee: Chesnay Schepler

> OutputTag name should not be allowed to be empty
> 
>
> Key: FLINK-6551
> URL: https://issues.apache.org/jira/browse/FLINK-6551
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> When creating an OutputTag it is required to give it a name.
> While we do enforce that the name is not null we do not have a check in place 
> that prevents passing an empty string.





[jira] [Commented] (FLINK-5898) Race-Condition with Amazon Kinesis KPL

2017-05-19 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017452#comment-16017452
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-5898:


Thanks for the update Scott! Would it make sense to bump the AWS KPL version 
we're using by default because of this?

> Race-Condition with Amazon Kinesis KPL
> --
>
> Key: FLINK-5898
> URL: https://issues.apache.org/jira/browse/FLINK-5898
> Project: Flink
>  Issue Type: Bug
>  Components: Kinesis Connector
>Affects Versions: 1.2.0
>Reporter: Scott Kidder
>
> The Flink Kinesis streaming-connector uses the Amazon Kinesis Producer 
> Library (KPL) to send messages to Kinesis streams. The KPL relies on a native 
> binary client to send messages to achieve better performance.
> When a Kinesis Producer is instantiated, the KPL will extract the native 
> binary to a sub-directory of `/tmp` (or whatever the platform-specific 
> temporary directory happens to be).
> The KPL tries to prevent multiple processes from extracting the binary at the 
> same time by wrapping the operation in a mutex. Unfortunately, this does not 
> prevent multiple Flink cores from trying to perform this operation at the 
> same time. If two or more processes attempt to do this at the same time, then 
> the native binary in /tmp will be corrupted.
> The authors of the KPL are aware of this possibility and suggest that users 
> of the KPL  not do that ... (sigh):
> https://github.com/awslabs/amazon-kinesis-producer/issues/55#issuecomment-251408897
> I encountered this in my production environment when bringing up a new Flink 
> task-manager with multiple cores and restoring from an earlier savepoint, 
> resulting in the instantiation of a KPL client on each core at roughly the 
> same time.
> A stack-trace follows:
> {noformat}
> java.lang.RuntimeException: Could not copy native binaries to temp directory 
> /tmp/amazon-kinesis-producer-native-binaries
>   at 
> com.amazonaws.services.kinesis.producer.KinesisProducer.extractBinaries(KinesisProducer.java:849)
>   at 
> com.amazonaws.services.kinesis.producer.KinesisProducer.<init>(KinesisProducer.java:243)
>   at 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer.open(FlinkKinesisProducer.java:198)
>   at 
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
>   at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:112)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:376)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:252)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:655)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.SecurityException: The contents of the binary 
> /tmp/amazon-kinesis-producer-native-binaries/kinesis_producer_e9a87c761db92a73eb74519a4468ee71def87eb2
>  is not what it's expected to be.
>   at 
> com.amazonaws.services.kinesis.producer.KinesisProducer.extractBinaries(KinesisProducer.java:822)
>   ... 8 more
> {noformat}





[jira] [Closed] (FLINK-5946) Kinesis Producer uses KPL that orphans threads that consume 100% CPU

2017-05-19 Thread Scott Kidder (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Kidder closed FLINK-5946.
---
Resolution: Workaround

The fix for this issue is included in release `0.12.4` of the AWS KPL, released 
2 days ago (May 17, 2017). Anyone affected by this issue can use version 
`0.12.4` of the KPL. Marking this issue as closed.

> Kinesis Producer uses KPL that orphans threads that consume 100% CPU
> 
>
> Key: FLINK-5946
> URL: https://issues.apache.org/jira/browse/FLINK-5946
> Project: Flink
>  Issue Type: Bug
>  Components: Kinesis Connector
>Affects Versions: 1.2.0
>Reporter: Scott Kidder
>
> It's possible for the Amazon Kinesis Producer Library (KPL) to leave orphaned 
> threads running after the producer has been instructed to shutdown via the 
> `destroy()` method. These threads run in a very tight infinite loop that can 
> push CPU usage to 100%. I've seen this happen on several occasions, though it 
> does not happen all of the time. Once these threads are orphaned, the only 
> solution to bring CPU utilization back down is to restart the Flink Task 
> Manager.
> When a KPL producer is instantiated, it creates several threads: one to 
> execute and monitor the native sender process, and two threads to monitor the 
> process' stdout and stderr output. It's possible for the process-monitor 
> thread to stop in such a way that leaves the output monitor threads orphaned.
> I've submitted a Github issue and pull-request against the KPL project:
> https://github.com/awslabs/amazon-kinesis-producer/issues/93
> https://github.com/awslabs/amazon-kinesis-producer/pull/94
> This issue is rooted in the Amazon Kinesis Producer Library (KPL) that the 
> Flink Kinesis streaming connector depends upon. It ought to be fixed in the 
> KPL, but I want to document it on the Flink project. The Flink KPL dependency 
> should be updated once the KPL has been fixed.





[jira] [Closed] (FLINK-5898) Race-Condition with Amazon Kinesis KPL

2017-05-19 Thread Scott Kidder (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Kidder closed FLINK-5898.
---
Resolution: Workaround

Issue fixed in KPL version 0.12.4.

> Race-Condition with Amazon Kinesis KPL
> --
>
> Key: FLINK-5898
> URL: https://issues.apache.org/jira/browse/FLINK-5898
> Project: Flink
>  Issue Type: Bug
>  Components: Kinesis Connector
>Affects Versions: 1.2.0
>Reporter: Scott Kidder
>
> The Flink Kinesis streaming-connector uses the Amazon Kinesis Producer 
> Library (KPL) to send messages to Kinesis streams. The KPL relies on a native 
> binary client to send messages to achieve better performance.
> When a Kinesis Producer is instantiated, the KPL will extract the native 
> binary to a sub-directory of `/tmp` (or whatever the platform-specific 
> temporary directory happens to be).
> The KPL tries to prevent multiple processes from extracting the binary at the 
> same time by wrapping the operation in a mutex. Unfortunately, this does not 
> prevent multiple Flink cores from trying to perform this operation at the 
> same time. If two or more processes attempt to do this at the same time, then 
> the native binary in /tmp will be corrupted.
> The authors of the KPL are aware of this possibility and suggest that users 
> of the KPL  not do that ... (sigh):
> https://github.com/awslabs/amazon-kinesis-producer/issues/55#issuecomment-251408897
> I encountered this in my production environment when bringing up a new Flink 
> task-manager with multiple cores and restoring from an earlier savepoint, 
> resulting in the instantiation of a KPL client on each core at roughly the 
> same time.
> A stack-trace follows:
> {noformat}
> java.lang.RuntimeException: Could not copy native binaries to temp directory 
> /tmp/amazon-kinesis-producer-native-binaries
>   at 
> com.amazonaws.services.kinesis.producer.KinesisProducer.extractBinaries(KinesisProducer.java:849)
>   at 
> com.amazonaws.services.kinesis.producer.KinesisProducer.<init>(KinesisProducer.java:243)
>   at 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer.open(FlinkKinesisProducer.java:198)
>   at 
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
>   at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:112)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:376)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:252)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:655)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.SecurityException: The contents of the binary 
> /tmp/amazon-kinesis-producer-native-binaries/kinesis_producer_e9a87c761db92a73eb74519a4468ee71def87eb2
>  is not what it's expected to be.
>   at 
> com.amazonaws.services.kinesis.producer.KinesisProducer.extractBinaries(KinesisProducer.java:822)
>   ... 8 more
> {noformat}





[jira] [Commented] (FLINK-5898) Race-Condition with Amazon Kinesis KPL

2017-05-19 Thread Scott Kidder (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017444#comment-16017444
 ] 

Scott Kidder commented on FLINK-5898:
-

The fix for this issue is included in release `0.12.4` of the AWS KPL, released 
2 days ago (May 17, 2017). Anyone affected by this issue can use version 
`0.12.4` of the KPL. Marking this issue as closed.

> Race-Condition with Amazon Kinesis KPL
> --
>
> Key: FLINK-5898
> URL: https://issues.apache.org/jira/browse/FLINK-5898
> Project: Flink
>  Issue Type: Bug
>  Components: Kinesis Connector
>Affects Versions: 1.2.0
>Reporter: Scott Kidder
>
> The Flink Kinesis streaming-connector uses the Amazon Kinesis Producer 
> Library (KPL) to send messages to Kinesis streams. The KPL relies on a native 
> binary client to send messages to achieve better performance.
> When a Kinesis Producer is instantiated, the KPL will extract the native 
> binary to a sub-directory of `/tmp` (or whatever the platform-specific 
> temporary directory happens to be).
> The KPL tries to prevent multiple processes from extracting the binary at the 
> same time by wrapping the operation in a mutex. Unfortunately, this does not 
> prevent multiple Flink cores from trying to perform this operation at the 
> same time. If two or more processes attempt to do this at the same time, then 
> the native binary in /tmp will be corrupted.
> The authors of the KPL are aware of this possibility and suggest that users 
> of the KPL  not do that ... (sigh):
> https://github.com/awslabs/amazon-kinesis-producer/issues/55#issuecomment-251408897
> I encountered this in my production environment when bringing up a new Flink 
> task-manager with multiple cores and restoring from an earlier savepoint, 
> resulting in the instantiation of a KPL client on each core at roughly the 
> same time.
> A stack-trace follows:
> {noformat}
> java.lang.RuntimeException: Could not copy native binaries to temp directory 
> /tmp/amazon-kinesis-producer-native-binaries
>   at 
> com.amazonaws.services.kinesis.producer.KinesisProducer.extractBinaries(KinesisProducer.java:849)
>   at 
> com.amazonaws.services.kinesis.producer.KinesisProducer.<init>(KinesisProducer.java:243)
>   at 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer.open(FlinkKinesisProducer.java:198)
>   at 
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
>   at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:112)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:376)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:252)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:655)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.SecurityException: The contents of the binary 
> /tmp/amazon-kinesis-producer-native-binaries/kinesis_producer_e9a87c761db92a73eb74519a4468ee71def87eb2
>  is not what it's expected to be.
>   at 
> com.amazonaws.services.kinesis.producer.KinesisProducer.extractBinaries(KinesisProducer.java:822)
>   ... 8 more
> {noformat}





[jira] [Created] (FLINK-6640) Ensure registration of shared state happens before externalizing a checkpoint

2017-05-19 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-6640:
-

 Summary: Ensure registration of shared state happens before 
externalizing a checkpoint
 Key: FLINK-6640
 URL: https://issues.apache.org/jira/browse/FLINK-6640
 Project: Flink
  Issue Type: Sub-task
  Components: State Backends, Checkpointing
Affects Versions: 1.3.0
Reporter: Stefan Richter
Assignee: Stefan Richter


Currently, a checkpoint is externalized before its shared state is registered. 
As a consequence, placeholder state handles become part of an externalized 
checkpoint, which they should not.





[jira] [Commented] (FLINK-6639) Java/Scala code tabs broken in CEP docs

2017-05-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017426#comment-16017426
 ] 

ASF GitHub Bot commented on FLINK-6639:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/3952
  
merging.


> Java/Scala code tabs broken in CEP docs
> ---
>
> Key: FLINK-6639
> URL: https://issues.apache.org/jira/browse/FLINK-6639
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: David Anderson
>Assignee: David Anderson
>
> A missing closing tag is breaking the JS that does the tab switching between 
> the Java and Scala tabs on the CEP page in the docs.





[GitHub] flink issue #3952: [FLINK-6639] fix code tabs in CEP docs

2017-05-19 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/3952
  
merging.



