[jira] [Created] (FLINK-8946) TaskManager stop sending metrics after JobManager failover

2018-03-14 Thread Truong Duc Kien (JIRA)
Truong Duc Kien created FLINK-8946:
--

 Summary: TaskManager stop sending metrics after JobManager failover
 Key: FLINK-8946
 URL: https://issues.apache.org/jira/browse/FLINK-8946
 Project: Flink
  Issue Type: Bug
  Components: Metrics, TaskManager
Affects Versions: 1.4.2
Reporter: Truong Duc Kien


Running in Yarn-standalone mode, when the JobManager performs a failover, all 
TaskManagers that are inherited from the previous JobManager will not send 
metrics to the new JobManager and other registered metric reporters.

 

A cursory glance reveals that these lines of code might be the cause:

[https://github.com/apache/flink/blob/a3478fdfa0f792104123fefbd9bdf01f5029de51/flink-runtime/src/main/scala/org/apache/flink/runtime/taskmanager/TaskManager.scala#L1082-L1086]

Perhaps the TaskManager closes its metrics group when disassociating from the 
JobManager, but does not create a new one on the fail-over association?
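The suspected lifecycle bug can be sketched in a few lines (the classes below are hypothetical stand-ins, not Flink's actual TaskManager code): the metric group is closed on disassociation, but no new group is created on the follow-up association, so later reports are silently dropped.

```java
public class MetricLifecycleSketch {
    static class MetricGroup {
        private boolean closed = false;
        void close() { closed = true; }
        String report() { return closed ? "dropped" : "reported"; }
    }

    private static MetricGroup taskManagerMetrics = new MetricGroup();

    static void disassociateFromJobManager() {
        taskManagerMetrics.close();          // what the linked code appears to do
    }

    static void associateWithJobManager() {
        // suspected bug: no `taskManagerMetrics = new MetricGroup();` here
    }

    public static void main(String[] args) {
        associateWithJobManager();           // initial registration works
        System.out.println(taskManagerMetrics.report());

        disassociateFromJobManager();        // JobManager failover
        associateWithJobManager();           // re-association with new JobManager
        System.out.println(taskManagerMetrics.report());
    }
}
```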

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Flip 6 mesos support

2018-03-14 Thread Renjie Liu
Hi Till,
In fact I'm asking how to deploy the other components, such as the dispatcher.

Till Rohrmann  于 2018年3月15日周四 上午12:17写道:

> Hi Renjie,
>
> in the current master and release-1.5 branch flip-6 is activated by
> default. If you want to turn it off you have to add `mode: old` to your
> flink-conf.yaml. I'm really happy that you want to test it out :-)
>
> Cheers,
> Till
>
> On Wed, Mar 14, 2018 at 3:03 PM, Renjie Liu 
> wrote:
>
> > Hi Till:
> > Is there any doc on deploying flink in flip6 mode? We want to help
> testing
> > it.
> >
> > Till Rohrmann  于 2018年3月14日周三 下午7:08写道:
> >
> > > Hi Renjie,
> > >
> > > in order to make Mesos work, we only needed to implement a Mesos
> specific
> > > ResourceManager. Look at MesosResourceManager for more details. As
> > > dispatcher, we use the StandaloneDispatcher which is spawned by
> > > the MesosSessionClusterEntrypoint.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Wed, Mar 14, 2018 at 9:32 AM, Renjie Liu 
> > > wrote:
> > >
> > > > Hi all:
> > > > I'm reading the source code and it seems that flip6 does not support
> > > mesos?
> > > > According to the design, client send job graph to dispatcher and
> > > dispatcher
> > > > spawn job manager and resource manager for job execution. But I
> can't
> > > find
> > > > dispatcher implementation for mesos.
> > > > --
> > > > Liu, Renjie
> > > > Software Engineer, MVAD
> > > >
> > >
> > --
> > Liu, Renjie
> > Software Engineer, MVAD
> >
>
-- 
Liu, Renjie
Software Engineer, MVAD


Re: Correct way to reference Hadoop dependencies in Flink 1.4.0

2018-03-14 Thread lrao
Hi,

I am running this on a Hadoop-free cluster (i.e. no YARN etc.). I have the 
following dependencies packaged in my user application JAR:

aws-java-sdk 1.7.4
flink-hadoop-fs 1.4.0
flink-shaded-hadoop2 1.4.0
flink-connector-filesystem_2.11 1.4.0
hadoop-common 2.7.4
hadoop-aws 2.7.4

I have also tried the following conf:
classloader.resolve-order: parent-first
fs.hdfs.hadoopconf: /srv/hadoop/hadoop-2.7.5/etc/hadoop

But no luck. Anything else I could be missing?
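For reference, on a Hadoop-free cluster the s3a scheme also needs a Hadoop configuration that maps it to the S3A implementation and supplies credentials. A minimal core-site.xml sketch for the directory pointed to by `fs.hdfs.hadoopconf` (property names from Hadoop 2.7; the values are placeholders, not from this thread):

```xml
<configuration>
  <property>
    <name>fs.s3a.impl</name>
    <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
  </property>
  <property>
    <name>fs.s3a.access.key</name>
    <value>PLACEHOLDER_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>PLACEHOLDER_SECRET_KEY</value>
  </property>
</configuration>
```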

On 2018/03/14 18:57:47, Francesco Ciuci  wrote: 
> Hi,
> 
> You do not just need the Hadoop dependencies in the jar; you also need to
> have the Hadoop file system running on your machine/cluster.
> 
> Regards
> 
> On 14 March 2018 at 18:38, l...@lyft.com  wrote:
> 
> > I'm trying to use a BucketingSink to write files to S3 in my Flink job.
> >
> > I have the Hadoop dependencies I need packaged in my user application jar.
> > However, on running the job I get the following error (from the
> > taskmanager):
> >
> > java.lang.RuntimeException: Error while creating FileSystem when
> > initializing the state of the BucketingSink.
> > at org.apache.flink.streaming.connectors.fs.bucketing.
> > BucketingSink.initializeState(BucketingSink.java:358)
> > at org.apache.flink.streaming.util.functions.
> > StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
> > at org.apache.flink.streaming.util.functions.
> > StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:
> > 160)
> > at org.apache.flink.streaming.api.operators.
> > AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.
> > java:96)
> > at org.apache.flink.streaming.api.operators.
> > AbstractStreamOperator.initializeState(AbstractStreamOperator.java:259)
> > at org.apache.flink.streaming.runtime.tasks.StreamTask.
> > initializeOperators(StreamTask.java:694)
> > at org.apache.flink.streaming.runtime.tasks.StreamTask.
> > initializeState(StreamTask.java:682)
> > at org.apache.flink.streaming.runtime.tasks.StreamTask.
> > invoke(StreamTask.java:253)
> > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
> > at java.lang.Thread.run(Thread.java:748)
> > Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
> > Could not find a file system implementation for scheme 's3a'. The scheme is
> > not directly supported by Flink and no Hadoop file system to support this
> > scheme could be loaded.
> > at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(
> > FileSystem.java:405)
> > at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:320)
> > at org.apache.flink.streaming.connectors.fs.bucketing.
> > BucketingSink.createHadoopFileSystem(BucketingSink.java:1125)
> > at org.apache.flink.streaming.connectors.fs.bucketing.
> > BucketingSink.initFileSystem(BucketingSink.java:411)
> > at org.apache.flink.streaming.connectors.fs.bucketing.
> > BucketingSink.initializeState(BucketingSink.java:355)
> > ... 9 common frames omitted
> > Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
> > Hadoop is not in the classpath/dependencies.
> > at org.apache.flink.core.fs.UnsupportedSchemeFactory.create(
> > UnsupportedSchemeFactory.java:64)
> > at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(
> > FileSystem.java:401)
> > ... 13 common frames omitted
> >
> > What's the right way to do this?
> >
> 


[DISCUSS] Not marking Jira issues as resolved in 1.5.0 as resolved in 1.6.0

2018-03-14 Thread Aljoscha Krettek
Hi,

We currently have some issues that are marked as resolved for both 1.5.0 and 
1.6.0 [1]. The reason is that we have the release-1.5 branch and the master 
branch, which will eventually become the branch for 1.6.0.

I think this can lead to confusion because the release notes are created based 
on that data. Say, we fix a bug "foo" after we created the release-1.5 branch. 
Now we will have "[FLINK-] Fixed foo" in the release notes for 1.5.0 and 
1.6.0. We basically start our Flink 1.6.0 release notes with around 50 issues 
that were never bugs in 1.6.0 because they were fixed in 1.5.0. Plus, having 
"[FLINK-] Fixed foo" in the 1.6.0 release notes indicates that "foo" was 
actually a bug in 1.5.0 (because we now had to fix it), but it wasn't.

I would propose to remove fixVersion 1.6.0 from all issues that have 1.5.0 as 
fixVersion. What do you think?
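The affected set can be listed with a JQL query along these lines (a sketch of the filter, equivalent in spirit to the query linked in [1]):

```
project = FLINK AND fixVersion = 1.5.0 AND fixVersion = 1.6.0 AND resolution != Unresolved
```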

On a side note: a bug that is fixed in 1.5.1 should be marked as fixed for 
1.6.0 separately, because 1.6.0 is not a direct successor to 1.5.1.

Best,
Aljoscha

[1] 
https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20and%20fixVersion%20%3D%201.6.0%20and%20resolution%20!%3D%20unresolved

Re: Flip 6 mesos support

2018-03-14 Thread Shuyi Chen
Hi Till, have we tested the YARN Kerberos integration in flip6? As far as I
remember, YARNSessionFIFOSecuredITCase is not functioning (FLINK-8562
), do we have a similar
integration test for flip6? Also, the Flink YARN Kerberos integration in the
old deployment mode was broken in 1.3 while flip6 was being developed (FLINK-8286
). Thanks a lot.

Shuyi

On Wed, Mar 14, 2018 at 9:16 AM, Till Rohrmann  wrote:

> Hi Renjie,
>
> in the current master and release-1.5 branch flip-6 is activated by
> default. If you want to turn it off you have to add `mode: old` to your
> flink-conf.yaml. I'm really happy that you want to test it out :-)
>
> Cheers,
> Till
>
> On Wed, Mar 14, 2018 at 3:03 PM, Renjie Liu 
> wrote:
>
> > Hi Till:
> > Is there any doc on deploying flink in flip6 mode? We want to help
> testing
> > it.
> >
> > Till Rohrmann  于 2018年3月14日周三 下午7:08写道:
> >
> > > Hi Renjie,
> > >
> > > in order to make Mesos work, we only needed to implement a Mesos
> specific
> > > ResourceManager. Look at MesosResourceManager for more details. As
> > > dispatcher, we use the StandaloneDispatcher which is spawned by
> > > the MesosSessionClusterEntrypoint.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Wed, Mar 14, 2018 at 9:32 AM, Renjie Liu 
> > > wrote:
> > >
> > > > Hi all:
> > > > I'm reading the source code and it seems that flip6 does not support
> > > mesos?
> > > > According to the design, client send job graph to dispatcher and
> > > dispatcher
> > > > spawn job manager and resource manager for job execution. But I
> can't
> > > find
> > > > dispatcher implementation for mesos.
> > > > --
> > > > Liu, Renjie
> > > > Software Engineer, MVAD
> > > >
> > >
> > --
> > Liu, Renjie
> > Software Engineer, MVAD
> >
>



-- 
"So you have to trust that the dots will somehow connect in your future."


Re: [VOTE] Release 1.3.3, release candidate #2

2018-03-14 Thread Ted Yu
+1

Ran the following command - passed:

mvn clean package -Pjdk8

On Wed, Mar 14, 2018 at 3:26 AM, Tzu-Li (Gordon) Tai 
wrote:

> Hi everyone,
>
> Please review and vote on release candidate #2 for Flink 1.3.3, as
> follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org [2], which are signed with the key with
> fingerprint 1C1E2394D3194E1944613488F320986D35C33D6A [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code branch “release-1.3.3-rc2” [5],
> * website pull request listing the new release [6].
> * A complete list of all new commits in release-1.3.3-rc2, since
> release-1.3.2 [7]
>
> This release candidate contains fixes for only the following issues:
> [FLINK-8487] Verify ZooKeeper checkpoint store behaviour with ITCase
> [FLINK-8890] Compare checkpoints with order in CompletedCheckpoint.
> checkpointsMatch()
> [FLINK-8807] Fix ZookeeperCompleted checkpoint store can get stuck in
> infinite loop
> [FLINK-7783] Don’t always remove checkpoints in
> ZooKeeperCompletedCheckpointStore#recover()
>
> Since the last candidate was cancelled only due to incorrect binaries in
> the source artifacts, I think we can also have a shorter voting period for
> RC2.
>
> Please test the release and vote for the release candidate before Thursday
> (March 15th), 7pm CET.
> It is adopted by majority approval, with at least 3 PMC affirmative votes.
>
> Please let me know if you disagree with the shortened voting time.
>
> Thanks,
> Gordon
>
> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> projectId=12315522&version=12341142
> [2] http://people.apache.org/~tzulitai/flink-1.3.3-rc2/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4] https://repository.apache.org/content/repositories/orgapacheflink-1151
> [5] https://git-wip-us.apache.org/repos/asf?p=flink.git;a=
> shortlog;h=refs/heads/release-1.3.3-rc2
> [6] https://github.com/apache/flink-web/pull/104
> [7]
> - 90559b5413455d9d0f2b61c389a60e26e5c87800 [hotfix] Properly delete temp
> flink dir in create_source_release.sh
> - 99c0353a34c09af5bedb73f525f691dd7e78fcdd [hotfix] Ensure pristine
> release in tools/releasing/create_source_release.sh
> - b2437f87e361a822adbad6f1c3e6eb14eeeb09fa [FLINK-8487] Verify ZooKeeper
> checkpoint store behaviour with ITCase
> - 1432092f29c548c55af562ff7b4a7973fedd2e22 [FLINK-8890] Compare
> checkpoints with order in CompletedCheckpoint.checkpointsMatch()
> - df37d7acfea10a5ca3186f3c53294f2050758627 [FLINK-8807] Fix
> ZookeeperCompleted checkpoint store can get stuck in infinite loop
> - f69bdf207b92ca47a5ce3e29f6ec7193ed17ec72 [FLINK-7783] Don’t always
> remove checkpoints in ZooKeeperCompletedCheckpointStore#recover()
>
>


Re: [Proposal] CEP library changes - review request

2018-03-14 Thread Aljoscha Krettek
Hi,

I think this should have been sent to the dev mailing list because in the user 
mailing list it might disappear among a lot of other mail.

Forwarding...

Best,
Aljoscha

> On 14. Mar 2018, at 06:20, Shailesh Jain  wrote:
> 
> Hi,
> 
> We've been facing issues* w.r.t. watermarks not being supported per key, which led 
> us to:
> 
> Either (a) run the job in Processing time for a KeyedStream -> compromising 
> on use cases which revolve around catching time-based patterns
> or (b) run the job in Event time for multiple data streams (one data stream 
> per key) -> this is not scalable as the number of operators grow linearly 
> with the number of keys
> 
> To address this, we've done a quick (poc) change in the 
> AbstractKeyedCEPPatternOperator to allow for the NFAs to progress based on 
> timestamps extracted from the events arriving into the operator (and not from 
> the watermarks). We've tested it against our usecase and are seeing a 
> significant improvement in memory usage without compromising on the watermark 
> functionality.
> 
> It'll be really helpful if someone from the cep dev group can take a look at 
> this branch - https://github.com/jainshailesh/flink/commits/cep_changes 
>  and provide 
> comments on the approach taken, and maybe guide us on the next steps for 
> taking it forward. 
> 
> Thanks,
> Shailesh
> 
> * Links to previous email threads related to the same issue:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Question-on-event-time-functionality-using-Flink-in-a-IoT-usecase-td18653.html
>  
> 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Generate-watermarks-per-key-in-a-KeyedStream-td16629.html
>  
> 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Correlation-between-number-of-operators-and-Job-manager-memory-requirements-td18384.html
>  
> 
> 



Re: Correct way to reference Hadoop dependencies in Flink 1.4.0

2018-03-14 Thread Francesco Ciuci
Hi,

You do not just need the Hadoop dependencies in the jar; you also need to
have the Hadoop file system running on your machine/cluster.

Regards

On 14 March 2018 at 18:38, l...@lyft.com  wrote:

> I'm trying to use a BucketingSink to write files to S3 in my Flink job.
>
> I have the Hadoop dependencies I need packaged in my user application jar.
> However, on running the job I get the following error (from the
> taskmanager):
>
> java.lang.RuntimeException: Error while creating FileSystem when
> initializing the state of the BucketingSink.
> at org.apache.flink.streaming.connectors.fs.bucketing.
> BucketingSink.initializeState(BucketingSink.java:358)
> at org.apache.flink.streaming.util.functions.
> StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
> at org.apache.flink.streaming.util.functions.
> StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:
> 160)
> at org.apache.flink.streaming.api.operators.
> AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.
> java:96)
> at org.apache.flink.streaming.api.operators.
> AbstractStreamOperator.initializeState(AbstractStreamOperator.java:259)
> at org.apache.flink.streaming.runtime.tasks.StreamTask.
> initializeOperators(StreamTask.java:694)
> at org.apache.flink.streaming.runtime.tasks.StreamTask.
> initializeState(StreamTask.java:682)
> at org.apache.flink.streaming.runtime.tasks.StreamTask.
> invoke(StreamTask.java:253)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
> Could not find a file system implementation for scheme 's3a'. The scheme is
> not directly supported by Flink and no Hadoop file system to support this
> scheme could be loaded.
> at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(
> FileSystem.java:405)
> at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:320)
> at org.apache.flink.streaming.connectors.fs.bucketing.
> BucketingSink.createHadoopFileSystem(BucketingSink.java:1125)
> at org.apache.flink.streaming.connectors.fs.bucketing.
> BucketingSink.initFileSystem(BucketingSink.java:411)
> at org.apache.flink.streaming.connectors.fs.bucketing.
> BucketingSink.initializeState(BucketingSink.java:355)
> ... 9 common frames omitted
> Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
> Hadoop is not in the classpath/dependencies.
> at org.apache.flink.core.fs.UnsupportedSchemeFactory.create(
> UnsupportedSchemeFactory.java:64)
> at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(
> FileSystem.java:401)
> ... 13 common frames omitted
>
> What's the right way to do this?
>


Correct way to reference Hadoop dependencies in Flink 1.4.0

2018-03-14 Thread lrao
I'm trying to use a BucketingSink to write files to S3 in my Flink job.

I have the Hadoop dependencies I need packaged in my user application jar. 
However, on running the job I get the following error (from the taskmanager):

java.lang.RuntimeException: Error while creating FileSystem when initializing 
the state of the BucketingSink.
at 
org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:358)
at 
org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
at 
org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
at 
org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:259)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:694)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:682)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could 
not find a file system implementation for scheme 's3a'. The scheme is not 
directly supported by Flink and no Hadoop file system to support this scheme 
could be loaded.
at 
org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:405)
at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:320)
at 
org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.createHadoopFileSystem(BucketingSink.java:1125)
at 
org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initFileSystem(BucketingSink.java:411)
at 
org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:355)
... 9 common frames omitted
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: 
Hadoop is not in the classpath/dependencies.
at 
org.apache.flink.core.fs.UnsupportedSchemeFactory.create(UnsupportedSchemeFactory.java:64)
at 
org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:401)
... 13 common frames omitted

What's the right way to do this?


Re: Flip 6 mesos support

2018-03-14 Thread Till Rohrmann
Hi Renjie,

in the current master and release-1.5 branch flip-6 is activated by
default. If you want to turn it off you have to add `mode: old` to your
flink-conf.yaml. I'm really happy that you want to test it out :-)

Cheers,
Till
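The switch Till describes is a one-line configuration entry; a sketch of the relevant flink-conf.yaml fragment (key and value as given in the thread):

```yaml
# flink-conf.yaml — opt out of flip-6 and use the old execution mode
mode: old
```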

On Wed, Mar 14, 2018 at 3:03 PM, Renjie Liu  wrote:

> Hi Till:
> Is there any doc on deploying flink in flip6 mode? We want to help testing
> it.
>
> Till Rohrmann  于 2018年3月14日周三 下午7:08写道:
>
> > Hi Renjie,
> >
> > in order to make Mesos work, we only needed to implement a Mesos specific
> > ResourceManager. Look at MesosResourceManager for more details. As
> > dispatcher, we use the StandaloneDispatcher which is spawned by
> > the MesosSessionClusterEntrypoint.
> >
> > Cheers,
> > Till
> >
> > On Wed, Mar 14, 2018 at 9:32 AM, Renjie Liu 
> > wrote:
> >
> > > Hi all:
> > > I'm reading the source code and it seems that flip6 does not support
> > mesos?
> > > According to the design, client send job graph to dispatcher and
> > dispatcher
> > > spawn job manager and resource manager for job execution. But I can't
> > find
> > > dispatcher implementation for mesos.
> > > --
> > > Liu, Renjie
> > > Software Engineer, MVAD
> > >
> >
> --
> Liu, Renjie
> Software Engineer, MVAD
>


[jira] [Created] (FLINK-8945) Allow customization of the KinesisProxy Interface

2018-03-14 Thread Kailash Hassan Dayanand (JIRA)
Kailash Hassan Dayanand created FLINK-8945:
--

 Summary: Allow customization of the KinesisProxy Interface
 Key: FLINK-8945
 URL: https://issues.apache.org/jira/browse/FLINK-8945
 Project: Flink
  Issue Type: Improvement
Reporter: Kailash Hassan Dayanand


Currently the KinesisProxy interface here:

[https://github.com/apache/flink/blob/310f3de62e52f1f977c217d918cc5aac79b87277/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/proxy/KinesisProxy.java#L125]

has a private constructor. This restricts extending the class and prevents 
customizations on shard discovery. I am proposing to change this to protected.

While creating a new implementation of KinesisProxyInterface is possible, I 
would like to continue to use the implementations of getRecords and getShardIterator.

This will be a temporary workaround until FLINK-8944 is submitted. 
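The effect of the proposed visibility change can be illustrated with stand-in classes (these are not Flink's actual KinesisProxy types): with a private constructor the subclass below would not compile, so shard discovery could not be overridden while reusing the other methods.

```java
public class ProxyConstructorSketch {
    static class KinesisProxyLike {
        protected KinesisProxyLike() { }               // proposed: protected, not private
        String getRecords() { return "records"; }
        String getShardList() { return "default-shard-discovery"; }
    }

    static class CustomizedProxy extends KinesisProxyLike {
        @Override
        String getShardList() { return "custom-shard-discovery"; }  // the customization
    }

    public static void main(String[] args) {
        CustomizedProxy proxy = new CustomizedProxy();
        System.out.println(proxy.getRecords());        // reused as-is
        System.out.println(proxy.getShardList());      // overridden
    }
}
```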



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8944) Use ListShards for shard discovery in the flink kinesis connector

2018-03-14 Thread Kailash Hassan Dayanand (JIRA)
Kailash Hassan Dayanand created FLINK-8944:
--

 Summary: Use ListShards for shard discovery in the flink kinesis 
connector
 Key: FLINK-8944
 URL: https://issues.apache.org/jira/browse/FLINK-8944
 Project: Flink
  Issue Type: Improvement
Reporter: Kailash Hassan Dayanand


Currently the DescribeStream AWS API used to get the list of shards has 
restricted rate limits on AWS (5 requests per second per account). This is 
problematic when running multiple Flink jobs on the same account, since each 
subtask calls DescribeStream. Changing this to ListShards will provide more 
flexibility on rate limits, as ListShards has a limit of 100 requests per 
second per data stream.

More details on the mailing list. https://goo.gl/mRXjKh



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8943) Jobs will not recover if DFS is temporarily unavailable

2018-03-14 Thread Gary Yao (JIRA)
Gary Yao created FLINK-8943:
---

 Summary: Jobs will not recover if DFS is temporarily unavailable
 Key: FLINK-8943
 URL: https://issues.apache.org/jira/browse/FLINK-8943
 Project: Flink
  Issue Type: Bug
  Components: Distributed Coordination
Affects Versions: 1.4.2, 1.5.0, 1.6.0
Reporter: Gary Yao
 Fix For: 1.5.0, 1.6.0


*Description*
Job graphs will be recovered only once from the DFS. If the DFS is unavailable 
at the recovery attempt, the jobs will simply not be running until the master 
is restarted again.

*Steps to reproduce*
# Submit a job on a Flink cluster with HDFS as the HA storage dir.
# Trigger job recovery by killing the master.
# Stop the HDFS NameNode.
# Start the HDFS NameNode again after job recovery is over.
# Verify that the job is not running.

*Stacktrace*
{noformat}
2018-03-14 14:01:37,704 ERROR 
org.apache.flink.runtime.dispatcher.StandaloneDispatcher  - Could not 
recover the job graph for a41d50b6f3ac16a730dd12792a847c97.
org.apache.flink.util.FlinkException: Could not retrieve submitted JobGraph 
from state handle under /a41d50b6f3ac16a730dd12792a847c97. This indicates that 
the retrieved state handle is broken. Try cleaning the state handle store.
at 
org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore.recoverJobGraph(ZooKeeperSubmittedJobGraphStore.java:192)
at 
org.apache.flink.runtime.dispatcher.Dispatcher.lambda$recoverJobs$5(Dispatcher.java:557)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.net.ConnectException: Call From ip-172-31-43-54/172.31.43.54 to 
ip-172-31-32-118.eu-central-1.compute.internal:9000 failed on connection 
exception: java.net.ConnectException: Connection refused; For more details see: 
 http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:801)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1493)
at org.apache.hadoop.ipc.Client.call(Client.java:1435)
at org.apache.hadoop.ipc.Client.call(Client.java:1345)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy10.getBlockLocations(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:259)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
at com.sun.proxy.$Proxy11.getBlockLocations(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:843)
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:832)
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:821)
at 
org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:325)
at 
org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:285)
at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:270)
at 

Re: Flip 6 mesos support

2018-03-14 Thread Renjie Liu
Hi Till:
Is there any doc on deploying flink in flip6 mode? We want to help testing
it.

Till Rohrmann  于 2018年3月14日周三 下午7:08写道:

> Hi Renjie,
>
> in order to make Mesos work, we only needed to implement a Mesos specific
> ResourceManager. Look at MesosResourceManager for more details. As
> dispatcher, we use the StandaloneDispatcher which is spawned by
> the MesosSessionClusterEntrypoint.
>
> Cheers,
> Till
>
> On Wed, Mar 14, 2018 at 9:32 AM, Renjie Liu 
> wrote:
>
> > Hi all:
> > I'm reading the source code and it seems that flip6 does not support
> mesos?
> > According to the design, client send job graph to dispatcher and
> dispatcher
> > > spawn job manager and resource manager for job execution. But I can't
> find
> > dispatcher implementation for mesos.
> > --
> > Liu, Renjie
> > Software Engineer, MVAD
> >
>
-- 
Liu, Renjie
Software Engineer, MVAD


[jira] [Created] (FLINK-8942) Pass target ResourceID to HeartbeatListener#retrievePayload

2018-03-14 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-8942:
---

 Summary: Pass target ResourceID to 
HeartbeatListener#retrievePayload
 Key: FLINK-8942
 URL: https://issues.apache.org/jira/browse/FLINK-8942
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Coordination
Affects Versions: 1.5.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.5.0


For FLINK-8881 we need a way to determine to which JobManager we are sending 
the heartbeats; otherwise we would have to send all accumulators, that is, for 
all jobs, to each connected JobManager.

I suggest passing the target {{ResourceID}} to 
{{HeartbeatListener#retrievePayload}}, which generates the payload.
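A sketch of the proposed shape (the signature and the stubbed ResourceID are assumptions based on the ticket text, not the committed API): keying payload retrieval by the heartbeat target lets the sender return only the accumulators relevant to the requesting JobManager.

```java
import java.util.HashMap;
import java.util.Map;

public class HeartbeatPayloadSketch {
    static final class ResourceID {                    // stub for the real type
        final String value;
        ResourceID(String value) { this.value = value; }
    }

    // Proposed listener shape: payload retrieval is keyed by the heartbeat target.
    interface HeartbeatListener<O> {
        O retrievePayload(ResourceID target);
    }

    public static void main(String[] args) {
        Map<String, String> accumulatorsPerJobManager = new HashMap<>();
        accumulatorsPerJobManager.put("jm-1", "accumulators-of-job-A");
        accumulatorsPerJobManager.put("jm-2", "accumulators-of-job-B");

        // Only the payload for the requesting JobManager is returned,
        // instead of all accumulators for all jobs.
        HeartbeatListener<String> listener =
                target -> accumulatorsPerJobManager.get(target.value);

        System.out.println(listener.retrievePayload(new ResourceID("jm-1")));
    }
}
```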



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8941) SpanningRecordSerializationTest fails on Travis

2018-03-14 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-8941:
---

 Summary: SpanningRecordSerializationTest fails on Travis
 Key: FLINK-8941
 URL: https://issues.apache.org/jira/browse/FLINK-8941
 Project: Flink
  Issue Type: Improvement
  Components: Tests
Reporter: Chesnay Schepler
 Fix For: 1.5.0


https://travis-ci.org/zentol/flink/jobs/353217791
{code:java}
testHandleMixedLargeRecords(org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializationTest)
  Time elapsed: 1.992 sec  <<< ERROR!
java.nio.channels.ClosedChannelException: null
at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110)
at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:199)
at 
org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer$SpanningWrapper.addNextChunkFromMemorySegment(SpillingAdaptiveSpanningRecordDeserializer.java:528)
at 
org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer$SpanningWrapper.access$200(SpillingAdaptiveSpanningRecordDeserializer.java:430)
at 
org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.setNextBuffer(SpillingAdaptiveSpanningRecordDeserializer.java:75)
at 
org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializationTest.testSerializationRoundTrip(SpanningRecordSerializationTest.java:143)
at 
org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializationTest.testSerializationRoundTrip(SpanningRecordSerializationTest.java:109)
at 
org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializationTest.testHandleMixedLargeRecords(SpanningRecordSerializationTest.java:98){code}
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8940) Implement JobMaster#disposeSavepoint

2018-03-14 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-8940:
---

 Summary: Implement JobMaster#disposeSavepoint
 Key: FLINK-8940
 URL: https://issues.apache.org/jira/browse/FLINK-8940
 Project: Flink
  Issue Type: Improvement
  Components: JobManager, State Backends, Checkpointing
Affects Versions: 1.5.0
Reporter: Chesnay Schepler
 Fix For: 1.5.0


To support {{ClusterClient#disposeSavepoint}} we have to implement a 
{{disposeSavepoint}} method in the flip6 JobMaster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: flink web ui authentication

2018-03-14 Thread Diego Reiriz Cores
Hi Sampath,

You can use nginx to protect the flink dashboard as David Anderson answered
you on stackoverflow:
https://stackoverflow.com/questions/49254307/flink-web-ui-authentication

Here you got a guide from the guys of DigitalOcean about protecting an url
with basic authentication:
https://www.digitalocean.com/community/tutorials/how-to-set-up-password-authentication-with-nginx-on-ubuntu-14-04
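For reference, a minimal server block along these lines is enough — hostname, paths, and the assumption that the Flink web UI listens on its default port 8081 are illustrative, not prescriptive:

```nginx
server {
    listen 80;
    server_name flink.example.com;

    location / {
        auth_basic           "Flink Dashboard";
        auth_basic_user_file /etc/nginx/.htpasswd;   # created with: htpasswd -c /etc/nginx/.htpasswd <user>
        proxy_pass           http://localhost:8081;  # the Flink web UI
    }
}
```

You would then firewall port 8081 so the dashboard is only reachable through nginx.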


Best Regards,
Diego


2018-03-14 12:09 GMT+01:00 Till Rohrmann :

> Hi Sampath,
>
> at the moment, Flink does not support such a feature.
>
> Cheers,
> Till
>
> On Tue, Mar 13, 2018 at 11:17 AM, Sampath Bhat 
> wrote:
>
> > Hello
> >
> > I would like to know if flink supports any user level authentication like
> > username/password for flink web ui.
> >
> > Regards
> > Sampath S
> >
>



-- 



Diego Reiriz Cores - Gradiant
Investigador - Desarrollador Senior | Área de Servicios y Aplicaciones
Junior Researcher - Developer | Services & Applications Department


Re: flink web ui authentication

2018-03-14 Thread Till Rohrmann
Hi Sampath,

at the moment, Flink does not support such a feature.

Cheers,
Till

On Tue, Mar 13, 2018 at 11:17 AM, Sampath Bhat 
wrote:

> Hello
>
> I would like to know if flink supports any user level authentication like
> username/password for flink web ui.
>
> Regards
> Sampath S
>


Re: Flip 6 mesos support

2018-03-14 Thread Till Rohrmann
Hi Renjie,

in order to make Mesos work, we only needed to implement a Mesos specific
ResourceManager. Look at MesosResourceManager for more details. As
dispatcher, we use the StandaloneDispatcher which is spawned by
the MesosSessionClusterEntrypoint.
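To make the division of responsibility concrete, here is a toy model (deliberately not Flink code — every class here is a stand-in defined in this file) of the wiring described above: the entrypoint connects a generic dispatcher to a cluster-manager-specific resource manager, which is why no Mesos-specific dispatcher is needed:

{code:java}
/** Toy model of the FLIP-6 wiring; not actual Flink classes. */
public class MesosSessionClusterEntrypointSketch {
    public static void main(String[] args) {
        // The entrypoint wires the generic dispatcher to the
        // Mesos-specific resource manager.
        StandaloneDispatcher dispatcher =
            new StandaloneDispatcher(new MesosResourceManager());
        System.out.println(dispatcher.submitJob("myJobGraph"));
    }
}

interface ResourceManager {
    String allocateSlot();
}

/** Mesos-specific part: only the resource manager talks to Mesos. */
class MesosResourceManager implements ResourceManager {
    public String allocateSlot() {
        return "slot-from-mesos-offer";
    }
}

/** Generic part: the dispatcher is cluster-manager agnostic. */
class StandaloneDispatcher {
    private final ResourceManager rm;

    StandaloneDispatcher(ResourceManager rm) {
        this.rm = rm;
    }

    String submitJob(String jobGraph) {
        // A real dispatcher would spawn a JobMaster here; the sketch
        // just asks the resource manager for a slot to run the job in.
        return jobGraph + " -> " + rm.allocateSlot();
    }
}
{code}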

Cheers,
Till

On Wed, Mar 14, 2018 at 9:32 AM, Renjie Liu  wrote:

> Hi all:
> I'm reading the source code and it seems that flip6 does not support Mesos.
> According to the design, the client sends the job graph to the dispatcher,
> and the dispatcher spawns a job manager and a resource manager for job
> execution. But I can't find a dispatcher implementation for Mesos.
> --
> Liu, Renjie
> Software Engineer, MVAD
>


[VOTE] Release 1.3.3, release candidate #2

2018-03-14 Thread Tzu-Li (Gordon) Tai
Hi everyone,

Please review and vote on release candidate #2 for Flink 1.3.3, as follows:  
[ ] +1, Approve the release  
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:  
* JIRA release notes [1],  
* the official Apache source release and binary convenience releases to be 
deployed to dist.apache.org [2], which are signed with the key with fingerprint 
1C1E2394D3194E1944613488F320986D35C33D6A [3],  
* all artifacts to be deployed to the Maven Central Repository [4],  
* source code branch “release-1.3.3-rc2” [5],  
* website pull request listing the new release [6].  
* A complete list of all new commits in release-1.3.3-rc2, since release-1.3.2 
[7]

This release candidate contains fixes for only the following issues:
[FLINK-8487] Verify ZooKeeper checkpoint store behaviour with ITCase
[FLINK-8890] Compare checkpoints with order in 
CompletedCheckpoint.checkpointsMatch()
[FLINK-8807] Fix ZookeeperCompleted checkpoint store can get stuck in infinite 
loop
[FLINK-7783] Don’t always remove checkpoints in 
ZooKeeperCompletedCheckpointStore#recover()

Since the last candidate was cancelled only due to incorrect binaries in the 
source artifacts, I think we can also have a shorter voting period for RC2.

Please test the release and vote for the release candidate before Thursday 
(March 15th), 7pm CET. 
It is adopted by majority approval, with at least 3 PMC affirmative votes.

Please let me know if you disagree with the shortened voting time.

Thanks,  
Gordon

[1] 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12341142
[2] http://people.apache.org/~tzulitai/flink-1.3.3-rc2/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1151
[5] 
https://git-wip-us.apache.org/repos/asf?p=flink.git;a=shortlog;h=refs/heads/release-1.3.3-rc2
[6] https://github.com/apache/flink-web/pull/104
[7] 
- 90559b5413455d9d0f2b61c389a60e26e5c87800 [hotfix] Properly delete temp flink 
dir in create_source_release.sh
- 99c0353a34c09af5bedb73f525f691dd7e78fcdd [hotfix] Ensure pristine release in 
tools/releasing/create_source_release.sh
- b2437f87e361a822adbad6f1c3e6eb14eeeb09fa [FLINK-8487] Verify ZooKeeper 
checkpoint store behaviour with ITCase
- 1432092f29c548c55af562ff7b4a7973fedd2e22 [FLINK-8890] Compare checkpoints 
with order in CompletedCheckpoint.checkpointsMatch()
- df37d7acfea10a5ca3186f3c53294f2050758627 [FLINK-8807] Fix ZookeeperCompleted 
checkpoint store can get stuck in infinite loop
- f69bdf207b92ca47a5ce3e29f6ec7193ed17ec72 [FLINK-7783] Don’t always remove 
checkpoints in ZooKeeperCompletedCheckpointStore#recover()



Re: Starting open source development with Flink

2018-03-14 Thread Fabian Hueske
Hi Deepak,

You can open JIRAs for bugs you discovered or minor improvements.
Larger features should be discussed on the dev mailing list first.
I'd suggest to start contributing by fixing a bug.

Best, Fabian

2018-03-13 3:39 GMT+01:00 Deepak Sharma :

> Hi Flink team!
>
> I've been using Flink at work for the past few months. I want to learn more
> about it and also contribute. Any suggestions for where I should start to
> learn more about its internals? Also, is it okay to open up a JIRA without
> bringing it up in here or should I do the latter first?
>
> Thank You,
> Deepak
>


Flip 6 mesos support

2018-03-14 Thread Renjie Liu
Hi all:
I'm reading the source code and it seems that flip6 does not support Mesos.
According to the design, the client sends the job graph to the dispatcher,
and the dispatcher spawns a job manager and a resource manager for job
execution. But I can't find a dispatcher implementation for Mesos.
-- 
Liu, Renjie
Software Engineer, MVAD


[jira] [Created] (FLINK-8939) Provide better support for saving streaming data to s3

2018-03-14 Thread chris snow (JIRA)
chris snow created FLINK-8939:
-

 Summary: Provide better support for saving streaming data to s3
 Key: FLINK-8939
 URL: https://issues.apache.org/jira/browse/FLINK-8939
 Project: Flink
  Issue Type: Improvement
  Components: DataStream API
Reporter: chris snow
 Attachments: 18652BC0-DD67-42C3-9A33-12F7BC10F9F3.png

Flink seems to struggle to save data to S3 due to the lack of a truncate method, 
and in my test this resulted in lots of files with a .valid-length suffix.

I’m using a bucketing sink:
{code:java}
return new BucketingSink<Tuple2<String, Object>>(path)
.setWriter(writer)
.setBucketer(new DateTimeBucketer<Tuple2<String, Object>>(formatString));{code}
 

!18652BC0-DD67-42C3-9A33-12F7BC10F9F3.png!

See also, the discussion in the comments here: 
https://issues.apache.org/jira/browse/FLINK-8543?filter=-2


