[jira] [Created] (FLINK-8465) Retrieve correct leader component address in ClusterClient

2018-01-19 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-8465:


 Summary: Retrieve correct leader component address in ClusterClient
 Key: FLINK-8465
 URL: https://issues.apache.org/jira/browse/FLINK-8465
 Project: Flink
  Issue Type: Bug
  Components: Distributed Coordination
Affects Versions: 1.5.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.5.0


The {{ClusterClient}} has a method {{#getJobManagerAddress}} which retrieves 
the address of the leading {{JobManager}}. In order to make this method work 
with Flip-6, I propose to rename it to {{getClusterConnectionInfo}}, which 
retrieves the {{LeaderConnectionInfo}} for the leader component of the cluster. 
In Flip-6 this will be the {{Dispatcher}}; in the old code, it is the 
{{JobManager}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8464) TestRaftReconfigurationWithSimulatedRpc fails with assertion error

2018-01-19 Thread Ted Yu (JIRA)
Ted Yu created FLINK-8464:
-

 Summary: TestRaftReconfigurationWithSimulatedRpc fails with 
assertion error
 Key: FLINK-8464
 URL: https://issues.apache.org/jira/browse/FLINK-8464
 Project: Flink
  Issue Type: Test
Reporter: Ted Yu
 Attachments: ratis-8464.tst

As of commit 7b3a9a6f5f8e8075727d84e3ddeae7b594eda89c, I observed the following:
{code}
testRevertConfigurationChange(org.apache.ratis.server.simulation.TestRaftReconfigurationWithSimulatedRpc)
  Time elapsed: 2.119 sec  <<< FAILURE!
java.lang.AssertionError: 1 0 expected: but was:
{code}
Here, 1 was {{confIndex}} and 0 was {{log.getLastCommittedIndex()}}.





[jira] [Created] (FLINK-8463) Remove unnecessary thread blocking in RestClient#submitRequest

2018-01-19 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-8463:


 Summary: Remove unnecessary thread blocking in 
RestClient#submitRequest
 Key: FLINK-8463
 URL: https://issues.apache.org/jira/browse/FLINK-8463
 Project: Flink
  Issue Type: Bug
  Components: REST
Affects Versions: 1.5.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.5.0


The {{RestClient}} unnecessarily blocks an IO executor thread when trying to 
open a connection to a remote destination. This can be improved by registering 
a {{ChannelFuture}} listener which continues the execution once the connection 
has been established.
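The listener-based continuation described above can be sketched with {{CompletableFuture}} as a stand-in for Netty's {{ChannelFuture}} (Netty's actual API would use {{ChannelFuture#addListener}}; the method names below are illustrative, not the real {{RestClient}} internals):

```java
import java.util.concurrent.CompletableFuture;

public class NonBlockingConnect {

    // Blocking style: the calling thread parks until the connect completes.
    static String sendBlocking(CompletableFuture<String> connectFuture) throws Exception {
        String channel = connectFuture.get(); // IO executor thread is blocked here
        return "sent via " + channel;
    }

    // Listener style: the continuation runs when the connect completes,
    // so no thread sits idle waiting for the connection.
    static CompletableFuture<String> sendNonBlocking(CompletableFuture<String> connectFuture) {
        return connectFuture.thenApply(channel -> "sent via " + channel);
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture<String> connect = CompletableFuture.completedFuture("channel-1");
        System.out.println(sendBlocking(connect));          // sent via channel-1
        System.out.println(sendNonBlocking(connect).get()); // sent via channel-1
    }
}
```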





[jira] [Created] (FLINK-8462) TaskExecutor does not verify RM heartbeat timeouts

2018-01-19 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-8462:


 Summary: TaskExecutor does not verify RM heartbeat timeouts
 Key: FLINK-8462
 URL: https://issues.apache.org/jira/browse/FLINK-8462
 Project: Flink
  Issue Type: Bug
  Components: Distributed Coordination
Affects Versions: 1.5.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.5.0


The {{TaskExecutor}} neither properly stops RM heartbeats nor checks whether 
an RM heartbeat timeout is still valid. As a consequence, it can happen that 
the {{TaskExecutor}} closes the connection to an active {{RM}} due to an 
outdated heartbeat timeout.
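A minimal sketch of the validity check (class and method names here are hypothetical, not the actual {{TaskExecutor}} API): a timeout notification carries the id of the RM it was armed for, and is ignored if the task executor has since connected to a different RM.

```java
import java.util.Objects;

public class HeartbeatTimeoutGuard {
    private String connectedResourceManagerId; // RM we are currently connected to
    private boolean connectionClosed;

    void connectTo(String resourceManagerId) {
        connectedResourceManagerId = resourceManagerId;
        connectionClosed = false;
    }

    // A heartbeat timeout carries the id of the RM it was scheduled for.
    // Only act on it if that RM is still the one we are connected to;
    // otherwise it is a stale timeout from an old connection.
    void onHeartbeatTimeout(String timedOutResourceManagerId) {
        if (Objects.equals(timedOutResourceManagerId, connectedResourceManagerId)) {
            connectionClosed = true;
        }
    }

    boolean isConnectionClosed() { return connectionClosed; }

    public static void main(String[] args) {
        HeartbeatTimeoutGuard guard = new HeartbeatTimeoutGuard();
        guard.connectTo("rm-1");
        guard.connectTo("rm-2");          // reconnected to a new leader
        guard.onHeartbeatTimeout("rm-1"); // stale timeout: ignored
        System.out.println(guard.isConnectionClosed()); // false
        guard.onHeartbeatTimeout("rm-2"); // valid timeout: close connection
        System.out.println(guard.isConnectionClosed()); // true
    }
}
```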





[jira] [Created] (FLINK-8461) Wrong logger configurations for shaded Netty

2018-01-19 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-8461:
---

 Summary: Wrong logger configurations for shaded Netty
 Key: FLINK-8461
 URL: https://issues.apache.org/jira/browse/FLINK-8461
 Project: Flink
  Issue Type: Bug
  Components: Logging
Affects Versions: 1.4.0
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.5.0, 1.4.1


We started shading Akka's Netty in Flink 1.4.

The logger configurations (log4j.properties, logback.xml) were not updated to 
the shaded class names.

One result of this is incorrect/misleading error logging of the Netty handlers 
during shutdown, which pollutes the logs and causes Yarn end-to-end tests to fail.
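A fix along these lines would add entries for the shaded class names to log4j.properties; note that the exact shaded package prefix below is an assumption and must be verified against Flink's actual shading configuration:

```properties
# Suppress misleading shutdown errors from the shaded Netty used by Akka.
# The shaded package prefix shown here is an assumption; check the real
# prefix produced by the shading setup before using this.
log4j.logger.org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline=ERROR
```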





[jira] [Created] (FLINK-8460) UtilsTest.testYarnFlinkResourceManagerJobManagerLostLeadership

2018-01-19 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-8460:


 Summary: 
UtilsTest.testYarnFlinkResourceManagerJobManagerLostLeadership
 Key: FLINK-8460
 URL: https://issues.apache.org/jira/browse/FLINK-8460
 Project: Flink
  Issue Type: Bug
  Components: Tests, YARN
Affects Versions: 1.5.0
Reporter: Till Rohrmann


There is a test instability in 
{{UtilsTest.testYarnFlinkResourceManagerJobManagerLostLeadership}} on Travis.

 

https://api.travis-ci.org/v3/job/330406729/log.txt





[jira] [Created] (FLINK-8459) Implement cancelWithSavepoint in RestClusterClient

2018-01-19 Thread Gary Yao (JIRA)
Gary Yao created FLINK-8459:
---

 Summary: Implement cancelWithSavepoint in RestClusterClient
 Key: FLINK-8459
 URL: https://issues.apache.org/jira/browse/FLINK-8459
 Project: Flink
  Issue Type: Sub-task
  Components: Client
Reporter: Gary Yao


Implement the method 
{{RestClusterClient#cancelWithSavepoint(JobID jobId, @Nullable String 
savepointDirectory)}} by either taking a savepoint and cancelling the job 
separately, or by migrating the logic in 
{{JobCancellationWithSavepointHandlers}}. The former will have different 
semantics because the checkpoint scheduler is not stopped; thus it is not 
guaranteed that there won't be additional checkpoints between the savepoint 
and the job cancellation.
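The savepoint-then-cancel composition could be sketched as below. The two REST calls are hypothetical stand-ins (the real {{RestClusterClient}} methods may differ); the point is the ordering and that nothing stops the checkpoint scheduler in between.

```java
import java.util.concurrent.CompletableFuture;

public class CancelWithSavepointSketch {

    // Hypothetical stand-ins for the two REST calls the client would issue.
    static CompletableFuture<String> triggerSavepoint(String jobId, String targetDirectory) {
        return CompletableFuture.completedFuture(targetDirectory + "/savepoint-" + jobId);
    }

    static CompletableFuture<Void> cancelJob(String jobId) {
        return CompletableFuture.completedFuture(null);
    }

    // First take the savepoint, then cancel the job; resolve to the savepoint
    // path. Checkpoints may still fire between the two calls, which is the
    // semantic difference compared to the old handler.
    static CompletableFuture<String> cancelWithSavepoint(String jobId, String dir) {
        return triggerSavepoint(jobId, dir)
                .thenCompose(path -> cancelJob(jobId).thenApply(ignored -> path));
    }

    public static void main(String[] args) {
        System.out.println(cancelWithSavepoint("job-1", "s3://savepoints").join());
        // s3://savepoints/savepoint-job-1
    }
}
```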








[jira] [Created] (FLINK-8458) Add the switch for keeping both the old mode and the new credit-based mode

2018-01-19 Thread zhijiang (JIRA)
zhijiang created FLINK-8458:
---

 Summary: Add the switch for keeping both the old mode and the new 
credit-based mode
 Key: FLINK-8458
 URL: https://issues.apache.org/jira/browse/FLINK-8458
 Project: Flink
  Issue Type: Task
  Components: Network
Reporter: zhijiang
Assignee: zhijiang


After the whole credit-based flow control feature is done, we should add a 
config parameter to switch the new credit-based mode on or off. That way we 
can roll back to the old network mode if any unexpected risks show up.

The parameter is defined as 
{{taskmanager.network.credit-based-flow-control.enabled}} and the default 
value is true. This switch may be removed after the next release.
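With the proposed key, rolling back to the old network mode would be a single configuration entry (sketch; the key name follows the proposal above and is not yet in any release):

```yaml
# flink-conf.yaml: disable the new credit-based mode and fall back to the
# old network stack. Default is true (credit-based mode enabled).
taskmanager.network.credit-based-flow-control.enabled: false
```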





[jira] [Created] (FLINK-8457) Documentation for Building from Source is broken

2018-01-19 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-8457:


 Summary: Documentation for Building from Source is broken
 Key: FLINK-8457
 URL: https://issues.apache.org/jira/browse/FLINK-8457
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.4.0, 1.3.0
 Environment: The documentation for how to build Flink from source is 
broken for all released versions.

It only explains how to build the latest master branch which is only correct 
for the docs of the latest master but not for the docs of a release version. 
For example the [build docs for Flink 
1.4|https://ci.apache.org/projects/flink/flink-docs-release-1.4/start/building.html]
 say

{quote}This page covers how to build Flink 1.4.0 from sources.
{quote}

but explain how to build the {{master}} branch. 

I think we should rewrite this page to explain how to build specific versions 
and also explain how to build the SNAPSHOT branches of released versions (for 
example {{release-1.4}}, the latest dev branch for Flink 1.4 with all merged 
bug fix).

I guess the same holds for Flink 1.3 as well.
Reporter: Fabian Hueske








Re: adding a new cloud filesystem

2018-01-19 Thread Fabian Hueske
Great! Thanks for reporting back.
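Earlier in this thread, cw7k asks how best to debug the ServiceLoader-based discovery of file system schemes. A JDK-only way to see the mechanism in action is to list the JDK's own file system providers, which are discovered through the same META-INF/services lookup Flink uses for its {{FileSystemFactory}} (iterating {{ServiceLoader.load(FileSystemFactory.class)}} on Flink's classpath would show the Flink-side result analogously):

```java
import java.nio.file.spi.FileSystemProvider;
import java.util.ArrayList;
import java.util.List;

public class ListFileSystemSchemes {

    // Collect the URI schemes of all file system providers the JVM discovered
    // via the META-INF/services mechanism (plus the default "file" provider).
    static List<String> schemes() {
        List<String> result = new ArrayList<>();
        for (FileSystemProvider p : FileSystemProvider.installedProviders()) {
            result.add(p.getScheme());
        }
        return result;
    }

    public static void main(String[] args) {
        for (FileSystemProvider p : FileSystemProvider.installedProviders()) {
            System.out.println(p.getScheme() + " -> " + p.getClass().getName());
        }
    }
}
```

If a scheme is missing from such a listing, the usual cause is that the provider's META-INF/services file is not on the classpath of the process doing the loading.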

2018-01-19 1:43 GMT+01:00 cw7k :

>  Ok, I have the factory working in the WordCount example.  I had to move
> the factory code and META-INF into the WordCount project.
> For general Flink jobs, I'm assuming that the goal would be to be able to
> import the factory from the job itself instead of needing to copy the
> factory .java file into each project? If so, any guidelines on how to do
> that?
>
> On Thursday, January 18, 2018, 10:53:32 AM PST, cw7k wrote:
>
>   Hi, just a bit more info, I have a test function working using oci://,
> based on the S3 test:
> https://github.com/apache/flink/blob/master/flink-filesystems/flink-s3-fs-hadoop/src/test/java/org/apache/flink/fs/s3hadoop/HadoopS3FileSystemITCase.java#L169
> However, when I try to get the WordCount example's WriteAsText to write to
> my new filesystem:
> https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/wordcount/WordCount.java#L82
>
> that's where I got the "Could not find a file system implementation" error
> mentioned earlier.
>
> On Thursday, January 18, 2018, 10:22:57 AM PST, cw7k
>  wrote:
>
>   Thanks.  I now have the 3 requirements fulfilled but the scheme isn't
> being loaded; I get this error:
> "Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
> Could not find a file system implementation for scheme 'oci'. The scheme is
> not directly supported by Flink and no Hadoop file system to support this
> scheme could be loaded."
> What's the best way to debug the loading of the schemes/filesystems by the
> ServiceLoader?
>
> On Thursday, January 18, 2018, 5:09:10 AM PST, Fabian Hueske wrote:
>
>  In fact, there are two S3FileSystemFactory classes, one for Hadoop and
> another one for Presto.
> In both cases an external file system class is wrapped in Flink's
> HadoopFileSystem class [1] [2].
>
> Best, Fabian
>
> [1]
> https://github.com/apache/flink/blob/master/flink-filesystems/flink-s3-fs-hadoop/src/main/java/org/apache/flink/fs/s3hadoop/S3FileSystemFactory.java#L132
> [2]
> https://github.com/apache/flink/blob/master/flink-filesystems/flink-s3-fs-presto/src/main/java/org/apache/flink/fs/s3presto/S3FileSystemFactory.java#L131
>
> 2018-01-18 1:24 GMT+01:00 cw7k :
>
> >  Thanks. I'm looking at the s3 example and I can only find the
> > S3FileSystemFactory but not the File System implementation (subclass
> > of org.apache.flink.core.fs.FileSystem).
> > Is that requirement still needed?
> >
> > On Wednesday, January 17, 2018, 3:59:47 PM PST, Fabian Hueske wrote:
> >
> >  Hi,
> >
> > please have a look at this doc page [1].
> > It describes how to add new file system implementations and also how to
> > configure them.
> >
> > Best, Fabian
> >
> > [1]
> > https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/filesystems.html#adding-new-file-system-implementations
> >
> > 2018-01-18 0:32 GMT+01:00 cw7k :
> >
> > >  Hi, I'm adding support for more cloud storage providers such as Google
> > > (gcs://) and Oracle (oci://).
> > > I have an oci:// test working based on the s3a:// test but when I try it
> > > on an actual Flink job like WordCount, I get this message:
> > > "org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not
> > > find a file system implementation for scheme 'oci'. The scheme is not
> > > directly supported by Flink and no Hadoop file system to support this
> > > scheme could be loaded."
> > > How do I register new schemes into the file system factory?  Thanks.
> > >
> > > On Tuesday, January 16, 2018, 5:27:31 PM PST, cw7k wrote:
> > >
> > >  Hi, question on this page:
> > > "You need to point Flink to a valid Hadoop configuration..."
> > > https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/deployment/aws.html#s3-simple-storage-service
> > > How do you point Flink to the Hadoop config?
> > >
> > > On Saturday, January 13, 2018, 4:56:15 AM PST, Till Rohrmann <
> > > trohrm...@apache.org> wrote:
> > >
> > >  Hi,
> > >
> > > the flink-connector-filesystem module contains the BucketingSink, which is
> > > a connector with which you can write your data to a file system. It
> > > provides exactly-once processing guarantees and allows writing data to
> > > different buckets [1].
> > >
> > > The flink-filesystem module contains different file system implementations
> > > (like MapR FS, HDFS, or S3). If you want to use, for example, the S3 file
> > > system, then there are the flink-s3-fs-hadoop and flink-s3-fs-presto
> > > modules.
> > >
> > > So if you want to write your data to S3 using the BucketingSink, then you
> > > have to add flink-connector-filesystem for the BucketingSink as well as an
> > > S3 file system implementation (e.g. flink-s3-fs-hadoop or
> > > flink-s3-fs-presto).
> > >
> > > Usually, there should be no need to change Flink's filesystem
> > > implementations. If you want to add a new c