[jira] [Created] (FLINK-15141) Using a decimal type in a sink table returns a schema-mismatch ValidationException

2019-12-08 Thread xiaojin.wy (Jira)
xiaojin.wy created FLINK-15141:
--

 Summary: Using a decimal type in a sink table returns a schema-mismatch 
ValidationException
 Key: FLINK-15141
 URL: https://issues.apache.org/jira/browse/FLINK-15141
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Affects Versions: 1.10.0
Reporter: xiaojin.wy


The planner I used is Blink.

*The source table is:*

CREATE TABLE `aggtest` (
  a SMALLINT,
  b FLOAT
) WITH (
  'format.field-delimiter'='|',
  'connector.type'='filesystem',
  'format.derive-schema'='true',
  'connector.path'='hdfs://zthdev/defender_test_data/daily/test_aggregates/sources/aggtest.csv',
  'format.type'='csv'
);

*The sink table is:*

CREATE TABLE `agg_decimal_res` (
  avg_107_943 DECIMAL(10, 3)
) WITH (
  'format.field-delimiter'='|',
  'connector.type'='filesystem',
  'format.derive-schema'='true',
  'connector.path'='hdfs://zthdev/defender_test_data/daily/test_aggregates/test_aggregates__test_avg_cast_batch/results/agg_decimal_res.csv',
  'format.type'='csv'
);

*The SQL is:*

INSERT INTO agg_decimal_res SELECT CAST(avg(b) AS numeric(10,3)) AS avg_107_943 
FROM aggtest;

 

After executing the SQL, an exception appears:

[INFO] Submitting SQL update statement to the cluster...
[ERROR] Could not execute SQL statement. Reason:
org.apache.flink.table.api.ValidationException: Field types of query result and 
registered TableSink `default_catalog`.`default_database`.`agg_decimal_res1` do 
not match.
Query result schema: [avg_107_943: DECIMAL(10, 3)]
TableSink schema: [avg_107_943: DECIMAL(38, 18)]

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Drop Heap Backend Synchronous snapshots

2019-12-08 Thread Congxian Qiu
+1 from my side.

Best,
Congxian


Yun Tang wrote on Fri, Dec 6, 2019 at 12:30 PM:

> +1 from my side, as I do not see any real benefit in using synchronous
> snapshots.
>
> Moreover, I think we should also remove the support of synchronous
> snapshots in DefaultOperatorStateBackend and deprecate the config
> state.backend.async
>
> Best
> Yun Tang
>
> On 12/5/19, 8:06 PM, "Stephan Ewen"  wrote:
>
> Hi all!
>
> I am wondering if there is any case for retaining the option to make
> synchronous snapshots on the heap statebackend. Is anyone using that?
> Or
> could we clean that code up and remove it?
>
> Best,
> Stephan
>
>
>


Re: [DISCUSS] Adding e2e tests for Flink's Mesos integration

2019-12-08 Thread Yang Wang
Thanks Yangze for starting this discussion.

Just to share my thoughts.

If the official Mesos Docker image cannot meet our requirements, I suggest
building the image locally. We did the same for the YARN e2e tests; this
approach is more flexible and easier to maintain. However, I have no idea
how long building the Mesos image locally will take. Based on our previous
experience with YARN, I think it should not take too much time.



Best,
Yang

Yangze Guo wrote on Sat, Dec 7, 2019 at 4:25 PM:

> Thanks for your feedback!
>
> @Till
> Regarding the time overhead, I think it mainly comes from network
> transmission. Building the image locally downloads about 260MB in total,
> including the base image and packages. Pulling from DockerHub, the
> compressed size of the image is 347MB. Thus, I agree
> that it is ok to build the image locally.
>
> @Piyush
> Thank you for offering the help and sharing your usage scenario. At the
> current stage, I think it will be really helpful if you can compress
> the custom image [1] or reduce the time overhead of building it locally.
> Any ideas for improving test coverage will also be appreciated.
>
> [1]
> https://hub.docker.com/layers/karmagyz/mesos-flink/latest/images/sha256-4e1caefea107818aa11374d6ac8a6e889922c81806f5cd791ead141f18ec7e64
>
> Best,
> Yangze Guo
>
> On Sat, Dec 7, 2019 at 3:17 AM Piyush Narang  wrote:
> >
> > +1 from our end as well. At Criteo, we are running some Flink jobs on
> Mesos in production to compute short-term features for machine learning.
> We’d love to help out and contribute to this initiative.
> >
> > Thanks,
> > -- Piyush
> >
> >
> > From: Till Rohrmann 
> > Date: Friday, December 6, 2019 at 8:10 AM
> > To: dev 
> > Cc: user 
> > Subject: Re: [DISCUSS] Adding e2e tests for Flink's Mesos integration
> >
> > Big +1 for adding a fully working e2e test for Flink's Mesos
> integration. Ideally we would have it ready for the 1.10 release. The lack
> of such a test has bitten us already multiple times.
> >
> > In general I would prefer to use the official image if possible since it
> frees us from maintaining our own custom image. Since Java 9 is no longer
> officially supported and we opted for supporting Java 11 (LTS), it might not
> be feasible, though. How much longer would building the custom image vs.
> downloading the custom image from DockerHub be? Maybe it is ok to build the
> image locally. Then we would not have to maintain the image.
> >
> > Cheers,
> > Till
> >
> > On Fri, Dec 6, 2019 at 11:05 AM Yangze Guo  karma...@gmail.com>> wrote:
> > Hi, all,
> >
> > Currently, there is no end-to-end test or IT case for Mesos deployment,
> > while common deployment-related development would inevitably touch
> > the logic of this component. Thus, some work needs to be done to
> > guarantee the experience for both Mesos users and contributors. After
> > offline discussion with Till and Xintong, we have some basic ideas and
> > would like to start a discussion thread on adding end to end tests for
> > Flink's Mesos integration.
> >
> > As a first step, we would like to keep the scope of this contribution
> > to be relatively small. This may also help us to quickly get some basic
> > test cases that might be helpful for the upcoming 1.10 release.
> >
> > As far as we can think of, what needs to be done is to setup a Mesos
> > framework during the testing and determine which tests need to be
> > included.
> >
> >
> > ** Regarding the Mesos framework, after trying out several approaches,
> > I find that setting up Mesos in Docker is probably what we want. The
> > resources needed for building and setting up Mesos from source are
> > probably not affordable in most scenarios. So, the one open
> > question worth discussing is the choice of Docker image. We have
> > come up with two options.
> >
> > - Using official Mesos image[1]
> > The official image was the first alternative that came to our mind,
> > but we ran into some sort of Java version compatibility problem that
> > led to failures when launching task executors. Flink has supported Java 9
> > since version 1.9.0 [2]. However, the official Docker image of Mesos
> > is built with a development version of JDK 9, which has probably
> > caused this problem. Unless we want to make Flink also
> > compatible with the JDK development version used by the official Mesos
> > image, this option does not work out. Besides, according to the
> > official roadmap [5], Java 9 is not a long-term support version, which
> > may bring stability risks in the future.
> >
> > - Build a custom image
> > I've already tried building a custom image [3] and successfully ran most
> > of the existing end-to-end test cases with it. The image is built
> > with Ubuntu 16.04, JDK 8 and Mesos 1.7.1. For the Mesos e2e test
> > framework, we could either build the image from a Dockerfile or pull
> > the pre-built image from DockerHub (or other hub services) during the
> > testing.
> > If we decide to publish an image on DockerHub, we 

Reminder: Clean up 1.10.0 open issues

2019-12-08 Thread Yu Li
Hi All,

Since we're approaching the feature freeze and will cut the branch for the
1.10 release soon, we still have 34 in-progress and 124 to-do issues [1].
Please check and clean up the open issues for 1.10.0. More specifically:

* For JIRAs still in "Open" status, please change the fix version to 1.11.
* For features not completed, please change the fix version to 1.11.
* For bug fixes/documentation issues, please continue working and try to
resolve them asap.

We will also start cleaning up the issues soon, especially those w/o any
assignee yet. Please let us know if you have any questions. Thanks!

[1] https://issues.apache.org/jira/projects/FLINK/versions/12345845

Best Regards,
Yu


[jira] [Created] (FLINK-15140) Shuffle data compression does not work with BroadcastRecordWriter.

2019-12-08 Thread Yingjie Cao (Jira)
Yingjie Cao created FLINK-15140:
---

 Summary: Shuffle data compression does not work with 
BroadcastRecordWriter.
 Key: FLINK-15140
 URL: https://issues.apache.org/jira/browse/FLINK-15140
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Network
Affects Versions: 1.10.0
Reporter: Yingjie Cao
 Fix For: 1.10.0


I tested the newest code of the master branch last weekend with more test 
cases. Unfortunately, several problems were encountered, including a 
compression bug.

When BroadcastRecordWriter is used in pipelined mode, the compressor copies 
the compressed data back into the input buffer; however, the underlying 
buffer is shared among all channels in that case, so we must not copy the 
compressed buffer back. In blocking mode, we wrongly recycle the buffer when 
it is not compressed, and this problem is also triggered when 
BroadcastRecordWriter is used.

To fix the problem: for blocking shuffle, the reference counter should be 
maintained correctly; for pipelined shuffle, the simplest way may be to 
disable compression if the underlying buffer is shared. I will open a PR to 
fix the problem.
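
To illustrate the pipelined-mode fix, here is a minimal, self-contained 
sketch; the {{Buffer}} and {{Compressor}} interfaces below are hypothetical 
stand-ins, not Flink's actual network-stack classes:

{code:java}
// Hypothetical stand-ins for illustration only (not Flink's real classes).
interface Buffer {
    int refCount();  // number of logical readers sharing the underlying memory
}

interface Compressor {
    Buffer compressInPlace(Buffer buffer);  // overwrites the input buffer
}

final class CompressionGuard {
    /** Compress in place only when the writer owns the buffer exclusively. */
    static Buffer prepareForShipping(Buffer buffer, Compressor compressor) {
        if (buffer.refCount() > 1) {
            // BroadcastRecordWriter case: all channels read the same memory,
            // so writing compressed bytes back would corrupt the shared view.
            return buffer;
        }
        return compressor.compressInPlace(buffer);
    }
}
{code}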



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release 1.8.3, release candidate #3

2019-12-08 Thread Jingsong Li
Hi Hequn,

+1 (non-binding) Thank you for driving this.

- Verified signatures and checksums
- Built from source with Maven, tests skipped (Scala 2.11 and Scala 2.12)
- Started a local cluster; the web UI is accessible (Scala 2.11 and Scala 2.12)
- Submitted the WordCount example for both batch and streaming, all good
(Scala 2.11 and Scala 2.12)
- Verified pom files point to the 1.8.3 version
- Took a quick look at the release note and announcement PR
- Ran the examples of flink-example-table in the IDE

Best,
Jingsong Lee

On Mon, Dec 9, 2019 at 11:52 AM jincheng sun 
wrote:

> +1 (binding)
>
> - checked signatures [SUCCESS]
> - built from source without tests [SUCCESS]
> - run some tests in IDE [SUCCESS]
> - start local cluster and submit word count example [SUCCESS]
> - announcement PR for website looks good! (I have left a few comments)
>
> Best,
> Jincheng
>
> Hequn Cheng wrote on Thu, Dec 5, 2019 at 9:39 PM:
>
> > Hi everyone,
> >
> > Please review and vote on the release candidate #3 for the version 1.8.3,
> > as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> >
> > The complete staging area is available for your review, which includes:
> > * JIRA release notes [1],
> > * the official Apache source release and binary convenience releases to
> be
> > deployed to dist.apache.org [2], which are signed with the key with
> > fingerprint EF88474C564C7A608A822EEC3FF96A2057B6476C [3],
> > * all artifacts to be deployed to the Maven Central Repository [4],
> > * source code tag "release-1.8.3-rc3" [5],
> > * website pull request listing the new release and adding announcement
> blog
> > post [6].
> >
> > The vote will be open for at least 72 hours.
> > Please cast your votes before *Dec. 10th 2019, 16:00 UTC*.
> >
> > It is adopted by majority approval, with at least 3 PMC affirmative
> votes.
> >
> > Thanks,
> > Hequn
> >
> > [1]
> >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346112
> > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.3-rc3/
> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [4]
> > https://repository.apache.org/content/repositories/orgapacheflink-1314/
> > [5]
> >
> >
> https://github.com/apache/flink/commit/d54807ba10d0392a60663f030f9fe0bfa1c66754
> > [6] https://github.com/apache/flink-web/pull/285
> >
>


-- 
Best, Jingsong Lee


Re: [DISCUSS] Migrate build infrastructure from Travis CI to Azure Pipelines

2019-12-08 Thread Congxian Qiu
+1 for migrating to Azure Pipelines, as this gives shorter build times
and faster feedback.

Best,
Congxian


Xiyuan Wang wrote on Mon, Dec 9, 2019 at 10:13 AM:

> Hi Robert,
>   Thanks for bringing up this topic. The 2 ARM machines (16 cores) which I
> donated are just for the PoC test. We (Huawei) can donate more once we move
> to the official Azure pipeline. :)
>
Robert Metzger wrote on Fri, Dec 6, 2019 at 3:25 AM:
>
> > Thanks for your comments Yun.
> > If there's strong support for idea 2, it would actually make my
> > life easier: the migration would be easier to do.
> >
> > I also noticed that the uploads to transfer.sh were broken, but this
> should
> > be fixed in the "rmetzger.flink" builds (coming from rmetzger/flink). The
> > builds in "flink-ci.flink" (coming from flink-ci/flink) might have
> troubles
> > with transfer.sh.
> >
> >
> > On Thu, Dec 5, 2019 at 5:50 PM Yun Tang  wrote:
> >
> > > Hi Robert
> > >
> > > Really exciting to see this new, more powerful CI tool to get rid of the
> > > 50-minute limit of the Travis CI free account.
> > >
> > > After reading the wiki, I support idea 2 of AZP-setup version-2.
> > >
> > > However, after digging into some failing builds at
> > > https://dev.azure.com/rmetzger/Flink/_build , I found we cannot view
> > > the logs of some IT cases which would previously be uploaded by
> > > travis_watchdog to transfer.sh.
> > > I think this feature is also easy to implement in AZP, right?
> > >
> > > Best
> > > Yun Tang
> > >
> > > On 12/6/19, 12:19 AM, "Robert Metzger"  wrote:
> > >
> > > I've created a first draft of my plans in the wiki:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/%5Bpreview%5D+Azure+Pipelines
> > > .
> > > I'm looking forward to your comments.
> > >
> > > On Thu, Dec 5, 2019 at 12:37 PM Robert Metzger <
> rmetz...@apache.org>
> > > wrote:
> > >
> > > > Thank you all for the positive feedback. I will start putting
> > > together a
> > > > page in the wiki.
> > > >
> > > > @Jark: Azure Pipelines provides a free service that is even better
> > > > than what Travis provides for free: 10 parallel builds with 6-hour
> > > > timeouts.
> > > >
> > > > @Chesnay: I will answer your questions in the yet-to-be-written
> > > > documentation in the wiki.
> > > >
> > > >
> > > > On Thu, Dec 5, 2019 at 11:58 AM Arvid Heise  >
> > > wrote:
> > > >
> > > >> +1 I had good experiences with Azure pipelines in the past.
> > > >>
> > > >> On Thu, Dec 5, 2019 at 11:35 AM Aljoscha Krettek <
> > > aljos...@apache.org>
> > > >> wrote:
> > > >>
> > > >> > +1
> > > >> >
> > > >> > Thanks for the effort! The tooling seems to be quite a bit
> nicer
> > > and I
> > > >> > like that we can grow by adding more machines.
> > > >> >
> > > >> > Best,
> > > >> > Aljoscha
> > > >> >
> > > >> > > On 5. Dec 2019, at 03:18, Jark Wu  wrote:
> > > >> > >
> > > >> > > +1 for Azure pipeline because it promises better
> performance.
> > > >> > >
> > > >> > > However, I have 2 concerns:
> > > >> > >
> > > >> > > 1) Travis provides personal free service for testing
> personal
> > > >> branches.
> > > >> > > Usually, contributors use this feature to test PoC or run
> CRON
> > > jobs
> > > >> for
> > > >> > > pull requests.
> > > >> > >Using local machine will cost a lot of time. Does AZP
> > > >> > provide the
> > > >> > same
> > > >> > > free service?
> > > >> > > 2) Currently, we deployed a webhook [1] to receive Travis CI
> > > build
> > > >> > > notifications [2] and send to bui...@flink.apache.org
> mailing
> > > list.
> > > >> > >We need to figure out a way how to send Azure build
> results
> > > to the
> > > >> > > mailing list. And this [3] might be the way to go.
> > > >> > >
> > > >> > > builds@f.a.o mailing list
> > > >> > >
> > > >> > > Best,
> > > >> > > Jark
> > > >> > >
> > > >> > > [1]: https://github.com/wuchong/flink-notification-bot
> > > >> > > [2]:
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> https://docs.travis-ci.com/user/notifications/#configuring-webhook-notifications
> > > >> > > [3]:
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> https://docs.microsoft.com/en-us/azure/devops/service-hooks/overview?view=azure-devops
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > On Wed, 4 Dec 2019 at 22:48, Jeff Zhang 
> > > wrote:
> > > >> > >
> > > >> > >> +1
> > > >> > >>
> > > >> > >> Till Rohrmann wrote on Wed, Dec 4, 2019 at 10:43 PM:
> > > >> > >>
> > > >> > >>> +1 for moving to Azure pipelines as it promises better
> > > scalability
> > > >> and
> > > >> > >>> tooling. Looking forward to having faster builds and hence
> > > shorter
> > > >> > >> feedback
> > > >> > >>> cycles :-)
> > > >> > >>>
> > > >> > >>> Cheers,
> > > >> > >>> Till
> > > >> > >>>
> > > >> > 

Re: [DISCUSS] Drop Kafka 0.8/0.9

2019-12-08 Thread Jingsong Li
Thanks Chesnay,

+1 to make it official that we no longer actively develop them but users can
still use them.

Best,
Jingsong Lee

On Mon, Dec 9, 2019 at 4:47 AM Chesnay Schepler  wrote:

> Users can continue to use the 1.9 versions of the 0.8/0.9 connectors
> with future versions of Flink.
>
> We just make it official that we no longer actively develop them.
>
> On 08/12/2019 12:22, Becket Qin wrote:
> > Hi all,
> >
> > Sorry for being late on this thread. My hunch is that 0.8 and 0.9 are old
> > enough to deprecate. However, after checking the download statistics from
> > Apache, I am not completely sure anymore.
> >
> > Here are some Kafka broker jar download stats from the past month
> according
> > to Apache Nexus[1]. Since Flink only supports Scala 2.11, I only listed
> > Scala 2.11 here.
> >
> > Scala 2.11:
> > version downloads percentage
> > 0.8.2.1 215478 27.01%
> > 2.0.1 155984 19.55%
> > 1.1.0 85794 10.75%
> > 1.0.1 53745 6.74%
> > 0.9.0.1 39825 4.99%
> >
> > And out of all Scala versions in the broker downloads, Scala 2.11
> > accounts for about 75% of the total downloads. From all the numbers I can
> > see, it looks like about 1/4 of the users are still using Kafka 0.8. I
> > have to admit it is also quite counterintuitive for me. I am pretty sure
> > that new users are not going to use Kafka 0.8 anymore, but the existing
> > users are moving slowly.
> >
> > However, these stats may not necessarily prevent us from dropping
> connector
> > support for Kafka 0.8 and 0.9, if we assume the users are not going to
> > upgrade their Flink version, which might be the case given they have not
> > upgraded their Kafka versions for a long time either.
> >
> > To avoid surprises, I'd still prefer following the proper deprecation
> > process, i.e. mark the connectors as deprecated, announce the deprecation
> > plan in the release notes, and remove them 1-2 releases later if we don't
> > hear complaints from the users.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > [1] https://repository.apache.org/#central-stat
> >
> >
> > On Sat, Dec 7, 2019 at 12:42 AM Thomas Weise  wrote:
> >
> >> +1
> >>
> >>
> >> On Fri, Dec 6, 2019 at 3:46 AM Benchao Li  wrote:
> >>
> >>> +1 for dropping.
> >>>
> >>> Zhenghua Gao wrote on Thu, Dec 5, 2019 at 4:05 PM:
> >>>
>  +1 for dropping.
> 
>  *Best Regards,*
>  *Zhenghua Gao*
> 
> 
>  On Thu, Dec 5, 2019 at 11:08 AM Dian Fu 
> wrote:
> 
> > +1 for dropping them.
> >
> > Just FYI: there was a similar discussion few months ago [1].
> >
> > [1]
> >
> >>
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/DISCUSS-Drop-older-versions-of-Kafka-Connectors-0-9-0-10-for-Flink-1-10-td29916.html#a29997
> > On Dec 5, 2019, at 10:29 AM, vino yang wrote:
> >
> > +1
> >
> > jincheng sun wrote on Thu, Dec 5, 2019 at 10:26 AM:
> >
> >> +1 for dropping it, and thanks for bringing up this discussion Chesnay!
> >>
> >> Best,
> >> Jincheng
> >>
> >> Jark Wu wrote on Thu, Dec 5, 2019 at 10:19 AM:
> >>
> >>> +1 for dropping, also cc'ed user mailing list.
> >>>
> >>>
> >>> Best,
> >>> Jark
> >>>
> >>> On Thu, 5 Dec 2019 at 03:39, Konstantin Knauf <
>  konstan...@ververica.com>
> >>> wrote:
> >>>
>  Hi Chesnay,
> 
>  +1 for dropping. I have not heard from any user using 0.8 or 0.9 for a
>  long while.
> 
>  Cheers,
> 
>  Konstantin
> 
>  On Wed, Dec 4, 2019 at 1:57 PM Chesnay Schepler <
> >>> ches...@apache.org>
>  wrote:
> 
> > Hello,
> >
> > What's everyone's take on dropping the Kafka 0.8/0.9 connectors
>  from
> >>> the
> > Flink codebase?
> >
> > We haven't touched either of them for the 1.10 release, and it
>  seems
> > quite unlikely that we will do so in the future.
> >
> > We could finally close a number of test stability tickets that
> >>> have
> >>> been
> > lingering for quite a while.
> >
> >
> > Regards,
> >
> > Chesnay
> >
> >
>  --
> 
>  Konstantin Knauf | Solutions Architect
> 
>  +49 160 91394525
> 
> 
>  Follow us @VervericaData Ververica 
> 
> 
>  --
> 
>  Join Flink Forward  - The Apache
> >> Flink
>  Conference
> 
>  Stream Processing | Event Driven | Real Time
> 
>  --
> 
>  Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
> 
>  --
>  Ververica GmbH
>  Registered at Amtsgericht Charlottenburg: HRB 158244 B
>  Managing Directors: Timothy Alexander Steinert, Yip Park Tung
> >>> Jason,
>  Ji
>  (Tony) Cheng
> 
> >>>
> >>> --

Re: [VOTE] Release 1.8.3, release candidate #3

2019-12-08 Thread jincheng sun
+1 (binding)

- checked signatures [SUCCESS]
- built from source without tests [SUCCESS]
- run some tests in IDE [SUCCESS]
- start local cluster and submit word count example [SUCCESS]
- announcement PR for website looks good! (I have left a few comments)

Best,
Jincheng

Hequn Cheng wrote on Thu, Dec 5, 2019 at 9:39 PM:

> Hi everyone,
>
> Please review and vote on the release candidate #3 for the version 1.8.3,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org [2], which are signed with the key with
> fingerprint EF88474C564C7A608A822EEC3FF96A2057B6476C [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "release-1.8.3-rc3" [5],
> * website pull request listing the new release and adding announcement blog
> post [6].
>
> The vote will be open for at least 72 hours.
> Please cast your votes before *Dec. 10th 2019, 16:00 UTC*.
>
> It is adopted by majority approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Hequn
>
> [1]
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346112
> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.3-rc3/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1314/
> [5]
>
> https://github.com/apache/flink/commit/d54807ba10d0392a60663f030f9fe0bfa1c66754
> [6] https://github.com/apache/flink-web/pull/285
>


[jira] [Created] (FLINK-15139) misc end to end test failed on 'SQL Client end-to-end test (Old planner)'

2019-12-08 Thread wangxiyuan (Jira)
wangxiyuan created FLINK-15139:
--

 Summary: misc end to end test failed on 'SQL Client end-to-end 
test (Old planner)'
 Key: FLINK-15139
 URL: https://issues.apache.org/jira/browse/FLINK-15139
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 2.0.0
Reporter: wangxiyuan
 Fix For: 2.0.0


The test "Running 'SQL Client end-to-end test (Old planner)'" in the misc e2e 
test suite failed.

log:
{code:java}
(a94d1da25baf2a5586a296d9e933743c) switched from RUNNING to FAILED.
org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot load user 
class: org.apache.flink.streaming.connectors.elasticsearch6.ElasticsearchSink
ClassLoader info: URL ClassLoader:
Class not resolvable through given classloader.
at 
org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperatorFactory(StreamConfig.java:266)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain.createChainedOperator(OperatorChain.java:430)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain.createOutputCollector(OperatorChain.java:353)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain.createChainedOperator(OperatorChain.java:419)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain.createOutputCollector(OperatorChain.java:353)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(OperatorChain.java:144)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:432)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:460)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:702)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:527)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: 
org.apache.flink.streaming.connectors.elasticsearch6.ElasticsearchSink
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at 
org.apache.flink.util.ChildFirstClassLoader.loadClass(ChildFirstClassLoader.java:60)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at 
org.apache.flink.util.InstantiationUtil$ClassLoaderObjectInputStream.resolveClass(InstantiationUtil.java:78)
at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1868)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2042)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431)
at 
org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:576)
at 
org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:562)
at 
org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:550)
at 
org.apache.flink.util.InstantiationUtil.readObjectFromConfig(InstantiationUtil.java:511)
at 
org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperatorFactory(StreamConfig.java:254)
... 10 more
{code}
link: [https://travis-ci.org/apache/flink/jobs/622261358]

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15138) Add e2e Test for PyFlink

2019-12-08 Thread sunjincheng (Jira)
sunjincheng created FLINK-15138:
---

 Summary: Add e2e Test for PyFlink
 Key: FLINK-15138
 URL: https://issues.apache.org/jira/browse/FLINK-15138
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Reporter: sunjincheng
 Fix For: 1.11.0


Currently, both the Table API and native Python UDFs are supported in 1.10. I 
would like to add e2e tests for PyFlink.
What do you think?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15137) Improve schema derivation for Avro format

2019-12-08 Thread Jark Wu (Jira)
Jark Wu created FLINK-15137:
---

 Summary: Improve schema derivation for Avro format
 Key: FLINK-15137
 URL: https://issues.apache.org/jira/browse/FLINK-15137
 Project: Flink
  Issue Type: Sub-task
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Reporter: Jark Wu


For JSON, CSV and OldCsv, we already support {{derive.schema=true}} to derive 
the format schema from the table schema. But for the Avro format, a user has 
to pass an Avro schema file or define the format schema explicitly via 
{{avro.schema}}.

We can consider dropping {{avro.schema}} and making {{derive.schema=true}} 
the default behavior.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Migrate build infrastructure from Travis CI to Azure Pipelines

2019-12-08 Thread Xiyuan Wang
Hi Robert,
  Thanks for bringing up this topic. The 2 ARM machines (16 cores) which I
donated are just for the PoC test. We (Huawei) can donate more once we move
to the official Azure pipeline. :)

Robert Metzger wrote on Fri, Dec 6, 2019 at 3:25 AM:

> Thanks for your comments Yun.
> If there's strong support for idea 2, it would actually make my
> life easier: the migration would be easier to do.
>
> I also noticed that the uploads to transfer.sh were broken, but this should
> be fixed in the "rmetzger.flink" builds (coming from rmetzger/flink). The
> builds in "flink-ci.flink" (coming from flink-ci/flink) might have troubles
> with transfer.sh.
>
>
> On Thu, Dec 5, 2019 at 5:50 PM Yun Tang  wrote:
>
> > Hi Robert
> >
> > Really exciting to see this new, more powerful CI tool to get rid of the
> > 50-minute limit of the Travis CI free account.
> >
> > After reading the wiki, I support idea 2 of AZP-setup version-2.
> >
> > However, after digging into some failing builds at
> > https://dev.azure.com/rmetzger/Flink/_build , I found we cannot view the
> > logs of some IT cases which would previously be uploaded by
> > travis_watchdog to transfer.sh.
> > I think this feature is also easy to implement in AZP, right?
> >
> > Best
> > Yun Tang
> >
> > On 12/6/19, 12:19 AM, "Robert Metzger"  wrote:
> >
> > I've created a first draft of my plans in the wiki:
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/%5Bpreview%5D+Azure+Pipelines
> > .
> > I'm looking forward to your comments.
> >
> > On Thu, Dec 5, 2019 at 12:37 PM Robert Metzger 
> > wrote:
> >
> > > Thank you all for the positive feedback. I will start putting
> > together a
> > > page in the wiki.
> > >
> > > @Jark: Azure Pipelines provides a free service that is even better
> > > than what Travis provides for free: 10 parallel builds with 6-hour
> > > timeouts.
> > >
> > > @Chesnay: I will answer your questions in the yet-to-be-written
> > > documentation in the wiki.
> > >
> > >
> > > On Thu, Dec 5, 2019 at 11:58 AM Arvid Heise 
> > wrote:
> > >
> > >> +1 I had good experiences with Azure pipelines in the past.
> > >>
> > >> On Thu, Dec 5, 2019 at 11:35 AM Aljoscha Krettek <
> > aljos...@apache.org>
> > >> wrote:
> > >>
> > >> > +1
> > >> >
> > >> > Thanks for the effort! The tooling seems to be quite a bit nicer
> > and I
> > >> > like that we can grow by adding more machines.
> > >> >
> > >> > Best,
> > >> > Aljoscha
> > >> >
> > >> > > On 5. Dec 2019, at 03:18, Jark Wu  wrote:
> > >> > >
> > >> > > +1 for Azure pipeline because it promises better performance.
> > >> > >
> > >> > > However, I have 2 concerns:
> > >> > >
> > >> > > 1) Travis provides personal free service for testing personal
> > >> branches.
> > >> > > Usually, contributors use this feature to test PoC or run CRON
> > jobs
> > >> for
> > >> > > pull requests.
> > >> > >Using local machine will cost a lot of time. Does AZP
> > provide the
> > >> > same
> > >> > > free service?
> > >> > > 2) Currently, we deployed a webhook [1] to receive Travis CI
> > build
> > >> > > notifications [2] and send to bui...@flink.apache.org mailing
> > list.
> > >> > >We need to figure out a way how to send Azure build results
> > to the
> > >> > > mailing list. And this [3] might be the way to go.
> > >> > >
> > >> > > builds@f.a.o mailing list
> > >> > >
> > >> > > Best,
> > >> > > Jark
> > >> > >
> > >> > > [1]: https://github.com/wuchong/flink-notification-bot
> > >> > > [2]:
> > >> > >
> > >> >
> > >>
> >
> https://docs.travis-ci.com/user/notifications/#configuring-webhook-notifications
> > >> > > [3]:
> > >> > >
> > >> >
> > >>
> >
> https://docs.microsoft.com/en-us/azure/devops/service-hooks/overview?view=azure-devops
> > >> > >
> > >> > >
> > >> > >
> > >> > > On Wed, 4 Dec 2019 at 22:48, Jeff Zhang 
> > wrote:
> > >> > >
> > >> > >> +1
> > >> > >>
> > >> > >> Till Rohrmann wrote on Wed, Dec 4, 2019 at 10:43 PM:
> > >> > >>
> > >> > >>> +1 for moving to Azure pipelines as it promises better
> > scalability
> > >> and
> > >> > >>> tooling. Looking forward to having faster builds and hence
> > shorter
> > >> > >> feedback
> > >> > >>> cycles :-)
> > >> > >>>
> > >> > >>> Cheers,
> > >> > >>> Till
> > >> > >>>
> > >> > >>> On Wed, Dec 4, 2019 at 1:24 PM Chesnay Schepler <
> > ches...@apache.org
> > >> >
> > >> > >>> wrote:
> > >> > >>>
> > >> >  @robert Can you expand how the azure setup interacts with
> > CiBot?
> > >> Do we
> > >> >  have to continue mirroring builds into flink-ci? How will
> the
> > >> cronjob
> > >> >  configuration work? We should have a general idea on how to
> > >> implement
> > >> >  this before proceeding.
> > >> > 

[jira] [Created] (FLINK-15136) Update the Chinese version of "Working with state"

2019-12-08 Thread Congxian Qiu(klion26) (Jira)
Congxian Qiu(klion26) created FLINK-15136:
-

 Summary: Update the Chinese version of "Working with state"
 Key: FLINK-15136
 URL: https://issues.apache.org/jira/browse/FLINK-15136
 Project: Flink
  Issue Type: Sub-task
  Components: chinese-translation, Documentation
Reporter: Congxian Qiu(klion26)


Background cleanup of state with TTL was enabled by default in FLINK-14898, 
and we should update the Chinese version of the documentation to reflect 
this.

Documentation location: docs/dev/stream/state/state.zh.md
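
As a reference point for the doc update, here is a minimal sketch of the 
documented {{StateTtlConfig}} API the section describes; since FLINK-14898, 
no explicit cleanup strategy needs to be configured anymore:

{code:java}
import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.time.Time;

final class TtlExample {
    static ValueStateDescriptor<String> descriptorWithTtl() {
        // Background cleanup is now on by default; no cleanup*() call needed.
        StateTtlConfig ttlConfig = StateTtlConfig
                .newBuilder(Time.days(1))
                .build();
        ValueStateDescriptor<String> descriptor =
                new ValueStateDescriptor<>("my-state", String.class);
        descriptor.enableTimeToLive(ttlConfig);
        return descriptor;
    }
}
{code}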



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15135) Adding e2e tests for Flink's Mesos integration

2019-12-08 Thread Yangze Guo (Jira)
Yangze Guo created FLINK-15135:
--

 Summary: Adding e2e tests for Flink's Mesos integration
 Key: FLINK-15135
 URL: https://issues.apache.org/jira/browse/FLINK-15135
 Project: Flink
  Issue Type: Test
  Components: Deployment / Mesos, Tests
Reporter: Yangze Guo


Currently, there is no end-to-end test or IT case for Mesos deployment. We want 
to add Mesos end-to-end tests, which will benefit both Mesos users and 
contributors.

More discussion could be found 
[here|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Adding-e2e-tests-for-Flink-s-Mesos-integration-td35660.html].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15134) Delete temporary files created in YarnClusterDescriptor

2019-12-08 Thread Zili Chen (Jira)
Zili Chen created FLINK-15134:
-

 Summary: Delete temporary files created in YarnClusterDescriptor
 Key: FLINK-15134
 URL: https://issues.apache.org/jira/browse/FLINK-15134
 Project: Flink
  Issue Type: Improvement
  Components: Client / Job Submission
Reporter: Zili Chen
Assignee: Zili Chen
 Fix For: 1.10.0


We create temporary files for storing {{flink-conf}} & the {{JobGraph}}. Although 
we ask for them to be deleted on exit, for a long-running 
{{YarnClusterDescriptor}} this is potentially a resource leak. We can delete the 
files when they are no longer used; the delete hook will then silently fail on 
deletion, but that is expected.
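
A minimal sketch of the intended lifecycle, using only standard 
{{java.nio.file}} APIs (the class, method, and file names are illustrative, 
not the actual {{YarnClusterDescriptor}} code):

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class TempFileLifecycle {
    static void shipJobGraph(byte[] serializedJobGraph) throws IOException {
        Path tmp = Files.createTempFile("flink-jobgraph", ".bin");
        tmp.toFile().deleteOnExit();  // safety net for abnormal termination
        try {
            Files.write(tmp, serializedJobGraph);
            // ... upload the file to the YARN staging directory here ...
        } finally {
            // Explicit cleanup: a long-running descriptor would otherwise
            // leak one file per submission. The exit hook later fails
            // silently on the already-deleted file, which is expected.
            Files.deleteIfExists(tmp);
        }
    }
}
{code}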



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-72: Introduce Pulsar Connector

2019-12-08 Thread Bowen Li
Hi Yijie,

I took a look at the design doc. LGTM overall, left a few questions.

On Tue, Dec 3, 2019 at 10:39 PM Becket Qin  wrote:

> Yes, you are absolutely right. Cannot believe I posted in the wrong
> thread...
>
> On Wed, Dec 4, 2019 at 1:46 PM Jark Wu  wrote:
>
>> Thanks Becket for the update,
>>
>> But shouldn't this message be posted in the FLIP-27 discussion thread [1]?
>>
>>
>> Best,
>> Jark
>>
>> [1]:
>>
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-27-Refactor-Source-Interface-td24952.html
>>
>> On Wed, 4 Dec 2019 at 12:12, Becket Qin  wrote:
>>
>> > Hi all,
>> >
>> > Sorry for the long belated update. I have updated FLIP-27 wiki page with
>> > the latest proposals. Some noticeable changes include:
>> > 1. A new generic communication mechanism between SplitEnumerator and
>> > SourceReader.
>> > 2. Some detail API method signature changes.
>> >
>> > We left a few things out of this FLIP and will address them in separate
>> > FLIPs. Including:
>> > 1. Per split event time.
>> > 2. Event time alignment.
>> > 3. Fine grained failover for SplitEnumerator failure.
>> >
>> > Please let us know if you have any question.
>> >
>> > Thanks,
>> >
>> > Jiangjie (Becket) Qin
>> >
>> > On Tue, Nov 19, 2019 at 10:28 AM Yijie Shen 
>> > wrote:
>> >
>> > > Hi everyone,
>> > >
>> > > I've put the catalog part design in a separate doc with more details for
>> > > easier communication.
>> > >
>> > >
>> > >
>> >
>> https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit?usp=sharing
>> > >
>> > > I would love to hear your thoughts on this.
>> > >
>> > > Best,
>> > > Yijie
>> > >
>> > > On Mon, Oct 21, 2019 at 11:15 AM Yijie Shen <
>> henry.yijies...@gmail.com>
>> > > wrote:
>> > >
>> > > > Hi everyone,
>> > > >
>> > > > Glad to receive your valuable feedbacks.
>> > > >
>> > > > I'd first separate the Pulsar catalog as another doc and show more
>> > design
>> > > > and implementation details there.
>> > > >
>> > > > For the current FLIP-72, I would separate it into the sink part for
>> > > > current work and keep the source part as future work until FLIP-27
>> > > > is finalized.
>> > > >
>> > > > I also replied to some of the comments in the design doc. I will
>> > > > rewrite the catalog part according to Bowen's advice in both the
>> > > > email and the comments.
>> > > >
>> > > > Thanks for the help again.
>> > > >
>> > > > Best,
>> > > > Yijie
>> > > >
>> > > > On Fri, Oct 18, 2019 at 12:40 AM Rong Rong 
>> > wrote:
>> > > >
>> > > >> Hi Yijie,
>> > > >>
>> > > >> I also agree with Jark on separating the Catalog part into another
>> > FLIP.
>> > > >>
>> > > >> With FLIP-27[1] also in the air, it is also probably great to split
>> > and
>> > > >> unblock the sink implementation contribution.
>> > > >> I would suggest either putting a detailed implementation plan
>> > > >> section in
>> > > >> the doc, or (maybe too much separation?) splitting them into
>> different
>> > > >> FLIPs. What do you guys think?
>> > > >>
>> > > >> --
>> > > >> Rong
>> > > >>
>> > > >> [1]
>> > > >>
>> > > >>
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
>> > > >>
>> > > >> On Wed, Oct 16, 2019 at 9:00 PM Jark Wu  wrote:
>> > > >>
>> > > >> > Hi Yijie,
>> > > >> >
>> > > >> > Thanks for the design document. I agree with Bowen that the
>> catalog
>> > > part
>> > > >> > needs more details.
>> > > >> > And I would suggest to separate Pulsar Catalog as another FLIP.
>> IMO,
>> > > it
>> > > >> has
>> > > >> > little to do with source/sink.
>> > > >> > Having a separate FLIP can unblock the contribution for sink (or
>> > > source)
>> > > >> > and keep the discussion more focus.
>> > > >> > I also left some comments in the documentation.
>> > > >> >
>> > > >> > Thanks,
>> > > >> > Jark
>> > > >> >
>> > > >> > On Thu, 17 Oct 2019 at 11:24, Yijie Shen <
>> henry.yijies...@gmail.com
>> > >
>> > > >> > wrote:
>> > > >> >
>> > > >> > > Hi Bowen,
>> > > >> > >
>> > > >> > > Thanks for your comments. I'll add catalog details as you
>> > suggested.
>> > > >> > >
>> > > >> > > One more question: since we decide to not implement source
>> part of
>> > > the
>> > > >> > > connector at the moment.
>> > > >> > > What can users do with a Pulsar catalog?
>> > > >> > > Create a table backed by Pulsar and check existing pulsar
>> tables
>> > to
>> > > >> see
>> > > >> > > their schemas? Drop tables maybe?
>> > > >> > >
>> > > >> > > Best,
>> > > >> > > Yijie
>> > > >> > >
>> > > >> > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li 
>> > > wrote:
>> > > >> > >
>> > > >> > > > Hi Yijie,
>> > > >> > > >
>> > > >> > > > Per the discussion, maybe you can move pulsar source to
>> 'future
>> > > >> work'
>> > > >> > > > section in the FLIP for now?
>> > > >> > > >
>> > > >> > > > Besides, the FLIP seems to be quite rough at the moment, and
>> I'd
>> > > >> > > recommend
>> > > >> > > > to add more details .
>> > > >> > > >
>> > > >> > > > A few questions 

[jira] [Created] (FLINK-15133) support SQLAlchemy in Flink for broader Python SQL integration

2019-12-08 Thread Bowen Li (Jira)
Bowen Li created FLINK-15133:


 Summary: support SQLAlchemy in Flink for broader Python SQL 
integration
 Key: FLINK-15133
 URL: https://issues.apache.org/jira/browse/FLINK-15133
 Project: Flink
  Issue Type: New Feature
  Components: API / Python, Table SQL / Ecosystem
Reporter: Bowen Li
 Fix For: 1.11.0


Examples of integrations requiring SQLAlchemy:

- https://github.com/cloudera/hue
- https://github.com/lyft/amundsen



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Drop Kafka 0.8/0.9

2019-12-08 Thread Chesnay Schepler
Users can continue to use the 1.9 versions of the 0.8/0.9 connectors 
with future versions of Flink.


We just make it official that we no longer actively develop them.

On 08/12/2019 12:22, Becket Qin wrote:

Hi all,

Sorry for being late on this thread. My hunch is that 0.8 and 0.9 are old
enough to deprecate. However, after checking the download statistics from
Apache, I am not completely sure anymore.

Here are some Kafka broker jar download stats from the past month according
to Apache Nexus[1]. Since Flink only supports Scala 2.11, I only listed
Scala 2.11 here.

Scala 2.11:
version downloads percentage
0.8.2.1 215478 27.01%
2.0.1 155984 19.55%
1.1.0 85794 10.75%
1.0.1 53745 6.74%
0.9.0.1 39825 4.99%

And out of all Scala versions in the broker downloads, Scala 2.11 accounts
for about 75% of the total downloads. From all the numbers I can see, it
looks like about 1/4 of the users are still using Kafka 0.8. I have to admit
it is also quite counterintuitive for me. I am pretty sure that new users are
not going to use Kafka 0.8 anymore, but the existing users are moving
slowly.

However, these stats may not necessarily prevent us from dropping connector
support for Kafka 0.8 and 0.9, if we assume the users are not going to
upgrade their Flink version, which might be the case given they have not
upgraded their Kafka versions for a long time either.

To avoid surprises, I'd still prefer following the proper deprecation
process, i.e. mark the connectors as deprecated, announce the deprecation
plan in the release notes, and remove them 1-2 releases later if we don't
hear complaints from the users.

Thanks,

Jiangjie (Becket) Qin

[1] https://repository.apache.org/#central-stat


On Sat, Dec 7, 2019 at 12:42 AM Thomas Weise  wrote:


+1


On Fri, Dec 6, 2019 at 3:46 AM Benchao Li  wrote:


+1 for dropping.

Zhenghua Gao wrote on Thu, Dec 5, 2019 at 4:05 PM:


+1 for dropping.

*Best Regards,*
*Zhenghua Gao*


On Thu, Dec 5, 2019 at 11:08 AM Dian Fu  wrote:


+1 for dropping them.

Just FYI: there was a similar discussion few months ago [1].

[1]


http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/DISCUSS-Drop-older-versions-of-Kafka-Connectors-0-9-0-10-for-Flink-1-10-td29916.html#a29997

On Dec 5, 2019, at 10:29 AM, vino yang wrote:

+1

jincheng sun wrote on Thu, Dec 5, 2019 at 10:26 AM:


+1 for dropping it, and thanks for bringing up this discussion Chesnay!

Best,
Jincheng

Jark Wu wrote on Thu, Dec 5, 2019 at 10:19 AM:


+1 for dropping, also cc'ed user mailing list.


Best,
Jark

On Thu, 5 Dec 2019 at 03:39, Konstantin Knauf <konstan...@ververica.com> wrote:


Hi Chesnay,

+1 for dropping. I have not heard from any user using 0.8 or 0.9 for a
long while.

Cheers,

Konstantin

On Wed, Dec 4, 2019 at 1:57 PM Chesnay Schepler <ches...@apache.org> wrote:


Hello,

What's everyone's take on dropping the Kafka 0.8/0.9 connectors from the
Flink codebase?

We haven't touched either of them for the 1.10 release, and it seems
quite unlikely that we will do so in the future.

We could finally close a number of test stability tickets that have been
lingering for quite a while.


Regards,

Chesnay



--

Konstantin Knauf | Solutions Architect

+49 160 91394525


Follow us @VervericaData Ververica 


--

Join Flink Forward  - The Apache Flink Conference

Stream Processing | Event Driven | Real Time

--

Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany

--
Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
(Tony) Cheng



--

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: libenc...@gmail.com; libenc...@pku.edu.cn





[ANNOUNCE] Weekly Community Update 2019/49

2019-12-08 Thread Konstantin Knauf
Dear community,

happy to share this week's community digest with an update on Flink 1.8.3,
a revival of the n-ary stream operator, a proposal to move our build
infrastructure to Azure pipelines, and quite a few other topics. Enjoy.

Flink Development
==

* [releases] The feature freeze for *Flink 1.10* is tonight.

* [releases] Hequn has published and started a vote on RC3 for *Flink 1.8.3*.
Voting is open until Dec. 10th 2019, 16:00 UTC. No votes so far, but I
assume this will change after the feature freeze. [1]

* [runtime] Piotr has restarted the discussion to add an *n-ary stream
operator* which would help to support multi-broadcast joins in Flink SQL
(think of a star schema). The topic has been discussed before (in 2016) in
the context of side-inputs and there is already an old design document
drafted by Aljoscha to build on top. [2]

* [connectors] Chesnay started a conversation to drop support for the *Kafka
0.8/0.9* connectors in the upcoming release. It seems that quite a few
people are in favor of dropping, but Becket also made a valid point to only
deprecate these connectors instead of removing them altogether. [3]

* [connectors] Becket has started the vote on *FLIP-27, the new source
interface.* This has been a long ongoing topic, but it has not officially
been voted on so far. So there we go. [4,5]

* [state backends] Stephan has proposed to drop support for the *synchronous
mode of the heap statebackend.* One supporting comment so far. [6]

* [hadoop] Craig Foster brought up the topic of *Hadoop 3* support in
Apache Flink. There is currently no one working on this topic, but Marton
Balassi of Cloudera reported that they have been working on this internally
and would be willing to contribute their work back next year. [7]

* [development process] There is currently no *end-to-end test for Flink
deployments on Mesos*. Yangze suggests adding such tests now and has started
a discussion thread on the topic. He is now looking for feedback from Mesos
production users in order to improve the planned test suite. It seems the
community will need to maintain its own Mesos Docker images for these
tests due to Java compatibility issues with the official Mesos Docker
images. [8]

* [development process] Dawid has asked committers to only vote with their
*apache.org email* addresses and to only count these votes as binding going
forward, since only for apache.org addresses is it possible to verify the
status of the voter. [9]

* [development process] Following the discussion on reducing the build time
of Apache Flink, Robert and others did some experiments to migrate our
build from *Travis CI to Azure Pipelines*. In this thread [10] and wiki
page [11] he presents his results and asks for opinions on how to move
forward. So far there has been a lot of positive feedback on migrating to
Azure Pipelines, mainly motivated by lower build times (due to additional
sponsored machines) and richer features.

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-8-3-release-candidate-3-tp35628.html
[2]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Add-N-Ary-Stream-Operator-tp11341p35554.html
[3]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Drop-Kafka-0-8-0-9-tp35553.html
[4]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
[5]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-27-Refactor-Source-Interface-tp35569.html
[6]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Drop-Heap-Backend-Synchronous-snapshots-tp35621.html
[7]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Re-Building-with-Hadoop-3-tp35522p35528.html
[8]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Adding-e2e-tests-for-Flink-s-Mesos-integration-tp35660p35687.html
[9]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Voting-from-apache-org-addresses-tp35499.html
[10]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Migrate-build-infrastructure-from-Travis-CI-to-Azure-Pipelines-tp35538.html
[11]
https://cwiki.apache.org/confluence/display/FLINK/%5Bpreview%5D+Azure+Pipelines

Notable Bugs
==

* [FLINK-15063] [1.9.1] The scopes of the input/output metrics of the
network stack are interchanged, e.g. the outPoolUsage metric can be found
under the task-level scope "shuffle.netty.input" instead of
"shuffle.netty.output". Fixed for 1.9.2. [12]

* [FLINK-14949] [1.9.1] [1.8.2] A job can get stuck during cancellation,
e.g. if Flink cannot spawn the threads that perform exactly this
cancellation. Fixed for 1.9.2. [13]

[12] https://issues.apache.org/jira/browse/FLINK-15063
[13] https://issues.apache.org/jira/browse/FLINK-14949

Events, Blog Posts, Misc
===

* *Markos* and *Yuan* have published a recap of Flink Forward Asia 2019 on
the Ververica blog including a short 

[jira] [Created] (FLINK-15132) Checkpoint Coordinator does Checkpoint I/O in JobMaster Main Thread

2019-12-08 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-15132:


 Summary: Checkpoint Coordinator does Checkpoint I/O in JobMaster 
Main Thread
 Key: FLINK-15132
 URL: https://issues.apache.org/jira/browse/FLINK-15132
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Reporter: Stephan Ewen
 Fix For: 1.10.0


The {{PendingCheckpoint.completePendingCheckpoint()}} method is called 
synchronously from within the Scheduler / JobMaster main thread.

The method writes out the checkpoint metadata, which is a potentially blocking 
I/O operation.
Because the target may block arbitrarily long (for example S3 when throttling 
load), this can bring down the entire cluster (blocked actor threads, 
heartbeat timeouts).
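
A minimal sketch of the kind of remedy this implies (not the actual 
coordinator code): run the blocking metadata write on an I/O executor and 
hop back to the main-thread executor only for the bookkeeping:

{code:java}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

final class CheckpointFinalizationSketch {
    static CompletableFuture<Void> completeAsync(
            Runnable writeMetadataBlocking,  // potentially blocking I/O
            Runnable finalizeBookkeeping,    // cheap, non-blocking
            Executor ioExecutor,
            Executor mainThreadExecutor) {
        return CompletableFuture
                .runAsync(writeMetadataBlocking, ioExecutor)
                .thenRunAsync(finalizeBookkeeping, mainThreadExecutor);
    }
}
{code}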




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15131) Add Source API classes

2019-12-08 Thread Jiangjie Qin (Jira)
Jiangjie Qin created FLINK-15131:


 Summary: Add Source API classes
 Key: FLINK-15131
 URL: https://issues.apache.org/jira/browse/FLINK-15131
 Project: Flink
  Issue Type: Sub-task
Reporter: Jiangjie Qin


Add all the top-level classes defined in FLIP-27.

[https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface#FLIP-27:RefactorSourceInterface-Toplevelpublicinterfaces]
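
For orientation, a heavily simplified sketch of the top-level shape described 
in the FLIP (names taken from the FLIP; signatures abbreviated here and not 
authoritative, see the linked page for the real definitions):

{code:java}
import java.util.List;

// Simplified shapes for orientation only; see the FLIP for real signatures.
enum Boundedness { BOUNDED, CONTINUOUS_UNBOUNDED }

interface SourceSplit {
    String splitId();
}

interface SourceReader<T, SplitT extends SourceSplit> {
    void addSplits(List<SplitT> splits);
}

interface SplitEnumerator<SplitT extends SourceSplit, CheckpointT> {
    CheckpointT snapshotState() throws Exception;
}

interface Source<T, SplitT extends SourceSplit, EnumChkT> {
    Boundedness getBoundedness();
}
{code}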



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15130) Drop "RequiredParameters" and "Options"

2019-12-08 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-15130:


 Summary: Drop "RequiredParameters" and "Options"
 Key: FLINK-15130
 URL: https://issues.apache.org/jira/browse/FLINK-15130
 Project: Flink
  Issue Type: Task
  Components: API / DataSet, API / DataStream
Affects Versions: 1.9.1
Reporter: Stephan Ewen
 Fix For: 1.11.0


As per the mailing list discussion, we want to drop these because they are 
unused, redundant code.
There are many options for command-line parsing, including one in Flink 
({{ParameterTool}}).
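
For reference, a minimal example of the utility that stays, 
{{ParameterTool}} (the argument names are illustrative):

{code:java}
import org.apache.flink.api.java.utils.ParameterTool;

final class ParameterToolExample {
    public static void main(String[] args) {
        ParameterTool params = ParameterTool.fromArgs(args);
        String input = params.getRequired("input");        // fails fast if missing
        int parallelism = params.getInt("parallelism", 1); // falls back to default
        System.out.println(input + " / " + parallelism);
    }
}
{code}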



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Drop Kafka 0.8/0.9

2019-12-08 Thread Becket Qin
Hi all,

Sorry for being late on this thread. My hunch is that 0.8 and 0.9 are old
enough to deprecate. However, after checking the download statistics from
Apache, I am not completely sure anymore.

Here are some Kafka broker jar download stats from the past month according
to Apache Nexus[1]. Since Flink only supports Scala 2.11, I only listed
Scala 2.11 here.

Scala 2.11:
version downloads percentage
0.8.2.1 215478 27.01%
2.0.1 155984 19.55%
1.1.0 85794 10.75%
1.0.1 53745 6.74%
0.9.0.1 39825 4.99%

And out of all Scala versions in the broker downloads, Scala 2.11 accounts
for about 75% of the total downloads. From all the numbers I can see, it
looks like about 1/4 of the users are still using Kafka 0.8. I have to admit
it is also quite counterintuitive for me. I am pretty sure that new users are
not going to use Kafka 0.8 anymore, but the existing users are moving
slowly.

However, these stats may not necessarily prevent us from dropping connector
support for Kafka 0.8 and 0.9, if we assume the users are not going to
upgrade their Flink version, which might be the case given they have not
upgraded their Kafka versions for a long time either.

To avoid surprises, I'd still prefer following the proper deprecation
process, i.e. mark the connectors as deprecated, announce the deprecation
plan in the release notes, and remove them 1-2 releases later if we don't
hear complaints from the users.

Thanks,

Jiangjie (Becket) Qin

[1] https://repository.apache.org/#central-stat


On Sat, Dec 7, 2019 at 12:42 AM Thomas Weise  wrote:

> +1
>
>
> On Fri, Dec 6, 2019 at 3:46 AM Benchao Li  wrote:
>
> > +1 for dropping.
> >
> > Zhenghua Gao wrote on Thu, Dec 5, 2019 at 4:05 PM:
> >
> > > +1 for dropping.
> > >
> > > *Best Regards,*
> > > *Zhenghua Gao*
> > >
> > >
> > > On Thu, Dec 5, 2019 at 11:08 AM Dian Fu  wrote:
> > >
> > > > +1 for dropping them.
> > > >
> > > > Just FYI: there was a similar discussion few months ago [1].
> > > >
> > > > [1]
> > > >
> > >
> >
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/DISCUSS-Drop-older-versions-of-Kafka-Connectors-0-9-0-10-for-Flink-1-10-td29916.html#a29997
> > > >
> > > > On Dec 5, 2019, at 10:29 AM, vino yang wrote:
> > > >
> > > > +1
> > > >
> > > > jincheng sun wrote on Thu, Dec 5, 2019 at 10:26 AM:
> > > >
> > > >> +1 for dropping it, and thanks for bringing up this discussion Chesnay!
> > > >>
> > > >> Best,
> > > >> Jincheng
> > > >>
> > > >>> Jark Wu wrote on Thu, Dec 5, 2019 at 10:19 AM:
> > > >>
> > > >>> +1 for dropping, also cc'ed user mailing list.
> > > >>>
> > > >>>
> > > >>> Best,
> > > >>> Jark
> > > >>>
> > > >>> On Thu, 5 Dec 2019 at 03:39, Konstantin Knauf <
> > > konstan...@ververica.com>
> > > >>> wrote:
> > > >>>
> > > >>> > Hi Chesnay,
> > > >>> >
> > > >>> > +1 for dropping. I have not heard from any user using 0.8 or 0.9
> > > >>> > for a long while.
> > > >>> >
> > > >>> > Cheers,
> > > >>> >
> > > >>> > Konstantin
> > > >>> >
> > > >>> > On Wed, Dec 4, 2019 at 1:57 PM Chesnay Schepler <
> > ches...@apache.org>
> > > >>> > wrote:
> > > >>> >
> > > >>> > > Hello,
> > > >>> > >
> > > >>> > > What's everyone's take on dropping the Kafka 0.8/0.9 connectors
> > > from
> > > >>> the
> > > >>> > > Flink codebase?
> > > >>> > >
> > > >>> > > We haven't touched either of them for the 1.10 release, and it
> > > seems
> > > >>> > > quite unlikely that we will do so in the future.
> > > >>> > >
> > > >>> > > We could finally close a number of test stability tickets that
> > have
> > > >>> been
> > > >>> > > lingering for quite a while.
> > > >>> > >
> > > >>> > >
> > > >>> > > Regards,
> > > >>> > >
> > > >>> > > Chesnay
> > > >>> > >
> > > >>> > >
> > > >>> >
> > > >>> > --
> > > >>> >
> > > >>> > Konstantin Knauf | Solutions Architect
> > > >>> >
> > > >>> > +49 160 91394525
> > > >>> >
> > > >>> >
> > > >>> > Follow us @VervericaData Ververica 
> > > >>> >
> > > >>> >
> > > >>> > --
> > > >>> >
> > > >>> > Join Flink Forward  - The Apache
> Flink
> > > >>> > Conference
> > > >>> >
> > > >>> > Stream Processing | Event Driven | Real Time
> > > >>> >
> > > >>> > --
> > > >>> >
> > > >>> > Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
> > > >>> >
> > > >>> > --
> > > >>> > Ververica GmbH
> > > >>> > Registered at Amtsgericht Charlottenburg: HRB 158244 B
> > > >>> > Managing Directors: Timothy Alexander Steinert, Yip Park Tung
> > Jason,
> > > Ji
> > > >>> > (Tony) Cheng
> > > >>> >
> > > >>>
> > > >>
> > > >
> > >
> >
> >
> > --
> >
> > Benchao Li
> > School of Electronics Engineering and Computer Science, Peking University
> > Tel:+86-15650713730
> > Email: libenc...@gmail.com; libenc...@pku.edu.cn
> >
>


[jira] [Created] (FLINK-15129) Return JobClient instead of JobClient Future from executeAsync()

2019-12-08 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-15129:


 Summary: Return JobClient instead of JobClient Future from 
executeAsync()
 Key: FLINK-15129
 URL: https://issues.apache.org/jira/browse/FLINK-15129
 Project: Flink
  Issue Type: Sub-task
  Components: API / DataSet, API / DataStream
Reporter: Aljoscha Krettek


Currently, users have to write this when they want to use the {{JobClient}}:
{code}
CompletableFuture<JobClient> jobClientFuture = env.executeAsync();
JobClient jobClient = jobClientFuture.get();
// or use thenApply/thenCompose etc.
{code}

Instead, we could always return a {{JobClient}} right away and therefore remove 
one step for the user.

I don't know if it's always the right choice, but currently we always return an 
already completed future that contains the {{JobClient}}. In the future we 
might want to return a future that actually completes at some later point; we 
would not be able to do this if we directly returned a {{JobClient}}, and would 
have to block in {{executeAsync()}}.
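
For comparison, a sketch of how the user code would look under the proposal 
(assuming {{executeAsync()}} were changed to return a {{JobClient}} directly):

{code}
JobClient jobClient = env.executeAsync();
// use jobClient directly, no future unwrapping needed
{code}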



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15128) Document support for Hive timestamp type

2019-12-08 Thread Rui Li (Jira)
Rui Li created FLINK-15128:
--

 Summary: Document support for Hive timestamp type
 Key: FLINK-15128
 URL: https://issues.apache.org/jira/browse/FLINK-15128
 Project: Flink
  Issue Type: Task
  Components: Connectors / Hive, Documentation
Reporter: Rui Li






--
This message was sent by Atlassian Jira
(v8.3.4#803005)