Re: [VOTE] Accept Stateful Functions into Apache Flink

2019-10-31 Thread Greg Hogan
+1 (binding)

Thank you to Stephan and all current and future contributors to this tool!

On Thu, Oct 31, 2019 at 4:24 AM Vijay Bhaskar wrote:

> +1 from me
>
> Regards
> Bhaskar
>
> On Thu, Oct 31, 2019 at 11:42 AM Gyula Fóra  wrote:
>
> > +1 from me, this is a great addition to Flink!
> >
> > Gyula
> >
> > On Thu, Oct 31, 2019, 03:52 Yun Gao wrote:
> >
> > > +1 (non-binding)
> > > Many thanks for bringing this to the community!
> > >
> > >
> > > --
> > > From:jincheng sun 
> > > Send Time:2019 Oct. 31 (Thu.) 10:22
> > > To:dev 
> > > Cc:Vasiliki Kalavri 
> > > Subject:Re: [VOTE] Accept Stateful Functions into Apache Flink
> > >
> > > big +1 (binding)
> > >
> > > Andrey Zagrebin wrote on Wed, Oct 30, 2019, 23:45:
> > >
> > > > sorry, my +1 was non-binding; I was confused, thinking this was a
> > > > committer vote rather than a PMC vote.
> > > >
> > > > On Wed, Oct 30, 2019 at 4:43 PM Chesnay Schepler wrote:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > On 30/10/2019 15:25, Vasiliki Kalavri wrote:
> > > > > > +1 (binding) from me. I hope this is not too late :)
> > > > > >
> > > > > > Thank you for this great contribution!
> > > > > >
> > > > > > On Wed, 30 Oct 2019 at 14:45, Stephan Ewen wrote:
> > > > > >
> > > > > >> Thank you all for voting.
> > > > > >>
> > > > > >> The voting period has passed, but only 13 PMC members have voted so
> > > > > >> far, which is less than the two-thirds threshold (17 members).
> > > > > >>
> > > > > >> I will take a few days to ping other members to vote, after that we
> > > > > >> will gradually lower the threshold as per the process to account for
> > > > > >> inactive members.
> > > > > >>
> > > > > >> Best,
> > > > > >> Stephan
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> On Tue, Oct 29, 2019 at 6:20 PM Seth Wiesman <sjwies...@gmail.com> wrote:
> > > > > >>
> > > > > >>> +1 (non-binding)
> > > > > >>>
> > > > > >>> Seth
> > > > > >>>
> > > > >  On Oct 23, 2019, at 9:31 PM, Jingsong Li <jingsongl...@gmail.com> wrote:
> > > > >  +1 (non-binding)
> > > > > 
> > > > >  Best,
> > > > >  Jingsong Lee
> > > > > 
> > > > > > On Wed, Oct 23, 2019 at 9:02 PM Yu Li wrote:
> > > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > Best Regards,
> > > > > > Yu
> > > > > >
> > > > > >
> > > > > >> On Wed, 23 Oct 2019 at 16:56, Haibo Sun wrote:
> > > > > >>
> > > > > >> +1 (non-binding)
> > > > > >> Best,
> > > > > >> Haibo
> > > > > >>
> > > > > >>
> > > > > >> At 2019-10-23 09:07:41, "Becket Qin" wrote:
> > > > > >>> +1 (binding)
> > > > > >>>
> > > > > >>> Thanks,
> > > > > >>>
> > > > > >>> Jiangjie (Becket) Qin
> > > > > >>>
> > > > > >>> On Tue, Oct 22, 2019 at 11:44 PM Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
> > > > > >>>
> > > > >  +1 (binding)
> > > > > 
> > > > >  Gordon
> > > > > 
> > > > >  On Tue, Oct 22, 2019, 10:58 PM Zhijiang <wangzhijiang...@aliyun.com.invalid> wrote:
> > > > > 
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > Best,
> > > > > > Zhijiang
> > > > > >
> > > > > >
> > > > > >
> > > > > --
> > > > > > From:Zhu Zhu 
> > > > > > Send Time:2019 Oct. 22 (Tue.) 16:33
> > > > > > To:dev 
> > > > > > Subject:Re: [VOTE] Accept Stateful Functions into Apache
> > > Flink
> > > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > Thanks,
> > > > > > Zhu Zhu
> > > > > >
> > > > > > Biao Liu wrote on Tue, Oct 22, 2019, 11:06 AM:
> > > > > >
> > > > > >> +1 (non-binding)
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Biao /'bɪ.aʊ/
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>> On Tue, 22 Oct 2019 at 10:26, Jark Wu <imj...@gmail.com> wrote:
> > > > > >>>
> > > > > >>> +1 (non-binding)
> > > > > >>>
> > > > > >>> Best,
> > > > > >>> Jark
> > > > > >>>
> > > > > >>> On Tue, 22 Oct 2019 at 09:38, Hequn Cheng <chenghe...@gmail.com> wrote:
> > > > >  +1 (non-binding)
> > > > > 
> > > > >  Best, Hequn
> > > > > 
> > > > >  On Tue, Oct 22, 2019 at 9:21 AM Dian Fu <dian0511...@gmail.com> wrote:
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > Regards,
> > > > > > Dian
> > > > > >

[jira] [Created] (FLINK-8427) Checkstyle for org.apache.flink.optimizer.costs

2018-01-12 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-8427:
-

 Summary: Checkstyle for org.apache.flink.optimizer.costs
 Key: FLINK-8427
 URL: https://issues.apache.org/jira/browse/FLINK-8427
 Project: Flink
  Issue Type: Improvement
  Components: Optimizer
Affects Versions: 1.5.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial








[jira] [Created] (FLINK-8422) Checkstyle for org.apache.flink.api.java.tuple

2018-01-12 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-8422:
-

 Summary: Checkstyle for org.apache.flink.api.java.tuple
 Key: FLINK-8422
 URL: https://issues.apache.org/jira/browse/FLINK-8422
 Project: Flink
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.5.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial


Update {{TupleGenerator}} for Flink's checkstyle and rebuild {{Tuple}} and 
{{TupleBuilder}} classes.





[jira] [Created] (FLINK-8363) Build Hadoop 2.9.0 convenience binaries

2018-01-04 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-8363:
-

 Summary: Build Hadoop 2.9.0 convenience binaries
 Key: FLINK-8363
 URL: https://issues.apache.org/jira/browse/FLINK-8363
 Project: Flink
  Issue Type: New Feature
  Components: Build System
Affects Versions: 1.5.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial


Hadoop 2.9.0 was released on 17 November, 2017. A local {{mvn clean verify 
-Dhadoop.version=2.9.0}} ran successfully.

With the new Hadoopless build we may be able to improve the build process by 
reusing the {{flink-dist}} jars (which differ only in build timestamps) and 
simply making each Hadoop-specific tarball by copying in the corresponding 
{{flink-shaded-hadoop2-uber}} jar.

What portion of the TravisCI jobs can run Hadoopless? We could build and verify 
these once and then run a Hadoop-versioned job for each Hadoop version.





[jira] [Created] (FLINK-8361) Remove create_release_files.sh

2018-01-04 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-8361:
-

 Summary: Remove create_release_files.sh
 Key: FLINK-8361
 URL: https://issues.apache.org/jira/browse/FLINK-8361
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Affects Versions: 1.5.0
Reporter: Greg Hogan
Priority: Trivial


The monolithic {{create_release_files.sh}} does not support building Flink 
without Hadoop and looks to have been superseded by the scripts in 
{{tools/releasing}}.





[jira] [Created] (FLINK-8222) Update Scala version

2017-12-07 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-8222:
-

 Summary: Update Scala version
 Key: FLINK-8222
 URL: https://issues.apache.org/jira/browse/FLINK-8222
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan


Update Scala to version {{2.11.12}}. I don't believe this affects the Flink 
distribution but rather anyone who is compiling Flink or a 
Flink-quickstart-derived program on a shared system.

"A privilege escalation vulnerability (CVE-2017-15288) has been identified in 
the Scala compilation daemon."

https://www.scala-lang.org/news/security-update-nov17.html





[jira] [Created] (FLINK-8223) Update Hadoop versions

2017-12-07 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-8223:
-

 Summary: Update Hadoop versions
 Key: FLINK-8223
 URL: https://issues.apache.org/jira/browse/FLINK-8223
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Affects Versions: 1.5.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial


Update 2.7.3 to 2.7.4 and 2.8.0 to 2.8.2. See 
http://hadoop.apache.org/releases.html





[jira] [Created] (FLINK-8180) Refactor driver outputs

2017-11-30 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-8180:
-

 Summary: Refactor driver outputs
 Key: FLINK-8180
 URL: https://issues.apache.org/jira/browse/FLINK-8180
 Project: Flink
  Issue Type: Improvement
  Components: Gelly
Affects Versions: 1.5.0
Reporter: Greg Hogan
Assignee: Greg Hogan
 Fix For: 1.5.0


The change in 1.4 from Tuples to POJOs for algorithm results broke the writing 
of results as CSV. Testing this was, and remains, a challenge, so it was not 
done. There are many additional improvements which can be made based on recent 
improvements to the Gelly framework.

Result hash and analytic results should always be printed to the screen. 
Results can optionally be written to stdout or to a file. In the latter case 
the result hash and analytic results (and schema) will also be written to a 
top-level file.

The "verbose" output strings can be replaced with json which is just as 
human-readable but also machine readable. In addition to csv and json it may be 
simple to support xml, etc. Computed fields will be optionally printed to 
screen or file (currently these are always printed to screen but never to file).

Testing will be simplified since formats are now a separate concern from the 
stream.

Jackson is available to Gelly as a dependency provided in the Flink 
distribution but we may want to build Gelly as a fat jar in order to include 
additional modules (which may require a direct dependency on Jackson, which 
would fail the checkstyle requirement to use the shaded package).
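
For illustration, a minimal sketch of emitting a result record as JSON with 
Jackson; the class and field names are made up, and the plain Jackson import is 
used for readability (per the note above, a direct dependency in Gelly would 
need the shaded package to satisfy checkstyle):

import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonResultExample {

    // illustrative result POJO; real drivers emit algorithm-specific types
    public static class Result {
        public long vertexId = 42L;
        public double score = 0.15;
    }

    public static void main(String[] args) throws Exception {
        // one JSON object per record: as human-readable as the "verbose"
        // strings, but also machine-readable
        System.out.println(new ObjectMapper().writeValueAsString(new Result()));
    }
}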





Updated checkstyle version

2017-11-27 Thread Greg Hogan
Hi devs,

Recent commits to the master and release-1.4 branches updated the checkstyle 
version from 6.19 to 8.4; if you use the checkstyle plugin for IntelliJ you 
will need to manually update this version in the preferences dialog. The old 
version was not fully enforcing the rule set.

Greg Hogan

[jira] [Created] (FLINK-8126) Update and fix checkstyle

2017-11-21 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-8126:
-

 Summary: Update and fix checkstyle
 Key: FLINK-8126
 URL: https://issues.apache.org/jira/browse/FLINK-8126
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.5.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial
 Fix For: 1.5.0


Our current checkstyle configuration (checkstyle version 6.19) is missing some 
ImportOrder and variable-naming errors which are detected 1) in IntelliJ using 
the same checkstyle version and 2) with the maven-checkstyle-plugin using an 
up-to-date checkstyle version (8.4).





Re: [ANNOUNCE] New committer: Haohui Mai

2017-11-02 Thread Greg Hogan
Welcome, Haohui!


> On Nov 1, 2017, at 4:14 AM, Fabian Hueske  wrote:
> 
> Hi everybody,
> 
> On behalf of the PMC I am delighted to announce Haohui Mai as a new Flink
> committer!
> 
> Haohui has been an active member of our community for several months.
> Among other things, he made major contributions in ideas and code to the
> SQL and Table APIs.
> 
> Please join me in congratulating Haohui for becoming a Flink committer!
> 
> Cheers,
> Fabian



Re: System resource logger

2017-10-04 Thread Greg Hogan
What if we added these as system metrics and added a way to write metrics to a 
(separate?) log file?


> On Oct 4, 2017, at 10:13 AM, Piotr Nowojski  wrote:
> 
> Hi,
> 
> Lately I was debugging some weird test failures on Travis and I needed to 
> look into metrics like:
> - User, System, IOWait, IRQ CPU usages (based on CPU ticks since previous 
> check)
> - System wide memory consumption (including making sure that swap was 
> disabled)
> - network usage 
> - etc…
> 
> Without access to the machines themselves. For this purpose I implemented a 
> periodic daemon-thread logger. Log output looked like this:
> 
> https://gist.github.com/pnowojski/8b863abb0fb08ac75b62627feadbd2f7 
> 
> 
> I think it would be nice to add this feature to Flink itself, by extending the 
> existing MemoryLogger. The same lack of information that I had with Travis could 
> easily happen in production environments. The problem is that there is no 
> easy way to obtain this kind of information without using some external 
> libraries (think about cross-platform support). I have used for that:
> 
> https://github.com/oshi/oshi 
> 
> It has some minimal additional dependencies; one thing worth noting is JNA 
> - its JAR weighs ~1 MB. We would have two options to add this feature:
> 
> 1. Include this oshi dependency in flink-runtime
> 2. Wrap oshi into a flink-contrib/flink-resource-logger module and make this 
> new module an optional, dynamically loaded dependency of flink-runtime (used 
> only if the user manually copies flink-resource-logger.jar to the classpath).
> 
> I would lean toward 1., since that's a powerful tool and its dependencies 
> are pretty minimal (except for JNA's jar size). What do you think?
> 
> Piotrek
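
For illustration, a minimal sketch of the kind of periodic daemon-thread logger 
described above, assuming oshi 3.x on the classpath (method names are from that 
library and may differ across versions); this is illustrative, not the proposed 
MemoryLogger extension:

import oshi.SystemInfo;
import oshi.hardware.GlobalMemory;

public final class SystemResourceLogger implements Runnable {

    private final long intervalMillis;

    private SystemResourceLogger(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    // start as a daemon thread so the logger never blocks JVM shutdown
    public static void start(long intervalMillis) {
        Thread t = new Thread(new SystemResourceLogger(intervalMillis), "resource-logger");
        t.setDaemon(true);
        t.start();
    }

    @Override
    public void run() {
        GlobalMemory memory = new SystemInfo().getHardware().getMemory();
        while (true) {
            System.out.printf("memory: %d / %d bytes available%n",
                    memory.getAvailable(), memory.getTotal());
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                return; // stop logging when interrupted
            }
        }
    }
}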


Re: [DISCUSS] Flink 1.4 and time based release

2017-08-30 Thread Greg Hogan
Haven’t seen much discussion here. I see the benefit of time-based deadlines 
but also of focussing on release functionality and stability.

I like the idea to keep the structure of time-based releases but soften the 
deadlines. The schedule would not be open-ended but we could wait on the 
completion and stability of major new features and also schedule around events 
like the upcoming Flink Forward. I would like to still fork the release branch 
as late as possible.

Greg

> On Aug 23, 2017, at 5:07 AM, Timo Walther  wrote:
> 
> I also think we shouldn't publish releases regularly just for the sake of 
> having regular releases.
> 
> Maybe we can make time-based releases more flexible: instead of a feature 
> freeze after 3 months followed by 1 month of testing, we could do a feature 
> freeze 3 months after the last release, with unlimited testing time. This 
> would keep us from adding too many features, but enable proper testing for 
> robust releases. What do you think?
> 
> Regards,
> Timo
> 
> On 23.08.17 at 10:26, Till Rohrmann wrote:
>> Thanks for starting the discussion Stephan. I agree with you that the last
>> release was probably a bit hasty due to the constraints we put on ourselves
>> with the strict time based release. Therefore and because of some of the
>> incomplete features, I would be in favour of loosening the strict deadline
>> such that we have more time finishing our work and properly testing the
>> release. Hard to tell, however, how much more time is needed.
>> 
>> Cheers,
>> Till
>> 
>> On Tue, Aug 22, 2017 at 6:56 PM, Chen Qin  wrote:
>> 
>>> It would be great to avoid an immediate 1.x.1 bug-fix release. It causes
>>> confusion and raises quality concerns.
>>> 
>>> Also, is there already a way to coordinate with Amazon EMR to make the latest
>>> release speedily available? I can try to find someone who works there if needed.
>>> 
>>> Thanks
>>> Chen
>>> 
 On Aug 22, 2017, at 9:32 AM, Stephan Ewen  wrote:
 
 Hi all!
 
 I want to bring up this discussion because we are approaching the date
>>> when
 there would be a feature freeze following the time based release
>>> schedule.
 To make it short, I would suggest to not follow the time-based schedule
>>> for
 that release. There are a bunch of reasons bringing me to that view:
 
 - 1.3.0, which was very much pushed by the time-based schedule, was not
 the best release we ever made. In fact, it had quite a few open issues that
 required an immediate 1.3.1 follow-up, and only 1.3.2 fixed some of them.
 
 - 1.3.2, which is in some sense what 1.3.0 should have been, is only 2
 weeks back.
 
 - The delta since the last release is still quite small. One could argue
 for making a quick release and then another soon after, but releases still
 tie up a good amount of resources, so that would introduce a delay for much
 of the ongoing work. I am doubtful that this is a good idea at this point.
 
 - The current master still has quite a bit of "ongoing work" that is not
 in perfect shape for a release, but could use some more weeks to provide
 real value to users. Examples are the dependency reworking, network stack
 enhancements, speedier state restore efforts, FLIP-6, exactly-once
 sinks/side-effects, and others.
 
 
 Alternatively, we could do what we did for 1.1 and 1.2, which is to now make
 a list of features we want in the release, and then project based on that
 when we fork off the 1.4 release branch.
 
 
 What do you think?
 
 
 Cheers,
 Stephan



Re: [POLL] Dropping savepoint format compatibility for 1.1.x in the Flink 1.4.0 release

2017-08-17 Thread Greg Hogan
There’s an argument for delaying this change to 1.5 since the feature freeze is 
two weeks away. There is little time to realize benefits from removing this 
code.

"The reason for that is that there is a lot of code mapping between the 
completely different legacy format (1.1.x, not re-scalable) and the 
key-group-oriented format (1.2.x onwards, re-scalable). It would greatly help 
the development of state and checkpointing features to drop that old code.”

Greg


> On Aug 17, 2017, at 5:36 AM, Stefan Richter wrote:
> 
> One more comment about the consequences of this PR, as pointed out in the 
> comments on Github: this will also break direct compatibility for the CEP 
> library between Flink 1.2 and 1.4. There is still a way to migrate via Flink 
> 1.3: Flink 1.1/2 -> savepoint -> Flink 1.3 -> savepoint -> Flink 1.4.
> 
>> On 16.08.2017 at 17:31, Stefan Richter wrote:
>> 
>> Hi,
>> 
>> after there have been no objections for a long time, I took the next step 
>> and created a PR that implements this change in commit 
>> 95e44099784c9deaf2ca422b8dfc11c3d67d7f82 of 
>> https://github.com/apache/flink/pull/4550 
>>  . Announcing this here as a last 
>> opportunity for further discussions. FYI, this will decrease the code base 
>> by almost 12K LOC. 
>> 
>> Best,
>> Stefan
>> 
>> 
>>> On 02.08.2017 at 15:26, Kostas Kloudas wrote:
>>> 
>>> +1
>>> 
 On Aug 2, 2017, at 3:16 PM, Till Rohrmann wrote:
 
 +1
 
 On Wed, Aug 2, 2017 at 9:12 AM, Stefan Richter wrote:
 
> +1
> 
> On 28.07.2017 at 16:03, Stephan Ewen wrote:
> 
> Seems like no one raised a concern so far about dropping the savepoint
> format compatibility for 1.1 in 1.4.
> 
> Leaving this thread open for some more days, but from the sentiment, it
> seems like we should go ahead?
> 
> On Wed, Jul 12, 2017 at 4:43 PM, Stephan Ewen wrote:
> 
>> Hi users!
>> 
>> Flink currently maintains backwards compatibility for savepoint formats,
>> which means that savepoints taken with Flink version 1.1.x and 1.2.x can be
>> resumed in Flink 1.3.x
>> 
>> We are discussing how many versions back to support. The proposition is
>> the following:
>> 
>> *   Suggestion: Flink 1.4.0 will be able to resume savepoints taken with
>> version 1.3.x and 1.2.x, but not savepoints from version 1.1.x and 1.0.x*
>> 
>> 
>> The reason for that is that there is a lot of code mapping between the
>> completely different legacy format (1.1.x, not re-scalable) and the
>> key-group-oriented format (1.2.x onwards, re-scalable). It would greatly
>> help the development of state and checkpointing features to drop that old
>> code.
>> 
>> Please let us know if you have concerns about that.
>> 
>> Best,
>> Stephan
>> 
>> 
> 
> 
>>> 
>> 
> 



Re: [VOTE] Release 1.3.2, release candidate #3

2017-08-04 Thread Greg Hogan
Thanks Aljoscha!

+1 (binding)

- verified source and binary signatures
- verified source and binary checksums
- verified LICENSEs
- verified NOTICEs
- built from source

Greg

> On Aug 4, 2017, at 2:00 AM, Aljoscha Krettek  wrote:
> 
> Hi everyone,
> 
> Please review and vote on the release candidate #3 for the version 1.3.2, as 
> follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
> 
> 
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be 
> deployed to dist.apache.org [2], which is signed with the key with 
> fingerprint 0xA8F4FD97121D7293 [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "release-1.3.2-rc2" [5],
> * website pull request listing the new release and adding announcement blog 
> post [6]. 
> 
> The only change in this RC compared to the last is this commit that fixes 
> copying of the Gelly examples jar: 
> https://github.com/apache/flink/commit/fda455e23a6192200b85b84d3aee312c9be40c99.
>  I would therefore like to propose a shorter voting period because the last 
> RC was seemingly good except for this bug. Please voice your concerns about 
> this if you have any. The vote will be open for at least 24 hours. It is 
> adopted by majority approval, with at least 3 PMC affirmative votes.
> 
> Please use the provided document, as discussed before, for coordinating the 
> testing efforts: [7]. I have copied over the cluster testing efforts and 
> functional testing efforts since the code of this RC is exactly the same as 
> RC2.
> 
> Thanks,
> Aljoscha
> 
> [1] 
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12340984
> [2] http://people.apache.org/~aljoscha/flink-1.3.2-rc3/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4] https://repository.apache.org/content/repositories/orgapacheflink-1135/
> [5] 
> https://git-wip-us.apache.org/repos/asf?p=flink.git;a=tag;h=d0f7528a69a93045f4af686347753f8737ca0b2b
> [6] https://github.com/apache/flink-web/pull/75
> [7] 
> https://docs.google.com/document/d/1OJAE6scAZXbSBaGNNEpU1qhr3Jb7bwbec7keEnT9hmA/edit?usp=sharing



Re: [VOTE] Release 1.3.2, release candidate #2

2017-08-02 Thread Greg Hogan
-1

The Gelly examples jar is not included in the Scala 2.11 convenience binaries 
since change-scala-version.sh is not switching the hard-coded Scala version 
from 2.10 to 2.11 in ./flink-dist/src/main/assemblies/bin.xml. The simplest fix 
may be to revert FLINK-7211 and simply exclude the corresponding javadoc jar 
(this is only an issue in the 1.3 branch; FLINK-7211 should be working on 
master). I don’t think I’ll have time to submit this tomorrow.

I would like to look into adding an end-to-end test for Gelly for the next 
release.

Greg

> On Jul 30, 2017, at 3:07 AM, Aljoscha Krettek  wrote:
> 
> Hi everyone,
> 
> Please review and vote on the release candidate #2 for the version 1.3.2, as 
> follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
> 
> 
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be 
> deployed to dist.apache.org [2], which is signed with the key with 
> fingerprint 0xA8F4FD97121D7293 [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "release-1.3.2-rc2" [5],
> * website pull request listing the new release and adding announcement blog 
> post [6]. 
> 
> The vote will be open for at least 72 hours (excluding this current weekend). 
> It is adopted by majority approval, with at least 3 PMC affirmative votes.
> 
> Please use the provided document, as discussed before, for coordinating the 
> testing efforts: [7]
> 
> Thanks,
> Aljoscha
> 
> [1] 
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12340984
> [2] http://people.apache.org/~aljoscha/flink-1.3.2-rc2/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4] https://repository.apache.org/content/repositories/orgapacheflink-1133/
> [5] 
> https://git-wip-us.apache.org/repos/asf?p=flink.git;a=tag;h=e38825d0c8e7fe2191a4c657984d9939ed8dd0ad
> [6] https://github.com/apache/flink-web/pull/75
> [7] 
> https://docs.google.com/document/d/1dN9AM9FUPizIu4hTKAXJSbbAORRdrce-BqQ8AUHlOqE/edit?usp=sharing



[jira] [Created] (FLINK-7296) Validate commit messages in git pre-receive hook

2017-07-28 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-7296:
-

 Summary: Validate commit messages in git pre-receive hook
 Key: FLINK-7296
 URL: https://issues.apache.org/jira/browse/FLINK-7296
 Project: Flink
  Issue Type: Improvement
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor


Would like to investigate a pre-receive (server-side) hook analyzing the commit 
messages of incoming revisions on the {{master}} branch for the standard JIRA 
format ({{\[FLINK-\] \[module\] ...}}).
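
As a sketch, the validation itself could be a simple pattern match; a real 
pre-receive hook is a server-side script reading "old-sha new-sha ref" lines on 
stdin and checking each new commit, and the exact pattern below (including the 
issue-number digits) is an assumption:

import java.util.regex.Pattern;

public class CommitMessageCheck {

    // assumed format: "[FLINK-NNNN] [module] description"; the precise
    // pattern would be settled when writing the actual hook
    private static final Pattern FORMAT =
            Pattern.compile("^\\[FLINK-\\d+\\] \\[[\\w-]+\\] .+");

    static boolean isValid(String firstLine) {
        return FORMAT.matcher(firstLine).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("[FLINK-7296] [build] Validate commit messages")); // true
        System.out.println(isValid("fix checkstyle"));                                // false
    }
}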





[ANNOUNCE] New Flink PMC member: Chesnay Schepler

2017-07-28 Thread Greg Hogan
Developers,

On behalf of the Flink PMC I am delighted to announce Chesnay Schepler as a 
member of the Flink PMC.

Chesnay is a longtime contributor, reviewer, and committer whose breadth of 
work and knowledge covers nearly the entire codebase. 

Please join me in congratulating Chesnay and welcoming him to bind his votes, 
validate licenses, and sign releases!

Regards,
Greg


[jira] [Created] (FLINK-7277) Weighted PageRank

2017-07-26 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-7277:
-

 Summary: Weighted PageRank
 Key: FLINK-7277
 URL: https://issues.apache.org/jira/browse/FLINK-7277
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan


Add a weighted PageRank algorithm to complement the existing unweighted 
implementation. Edge values store a `double` weight value which is summed per 
vertex in place of the vertex degree. The vertex score is joined as the 
fraction of vertex weight rather than dividing by the vertex degree.

The examples `Runner` must now read and generate weighted graphs.
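
For illustration, a minimal sketch of the weight-fraction update in plain Java 
(not the Gelly dataflow; names are illustrative):

public class WeightedShareExample {

    // each neighbor receives a share of the vertex score proportional to its
    // edge weight, with the summed edge weight standing in for the degree
    static double[] distribute(double score, double[] edgeWeights) {
        double totalWeight = 0.0;
        for (double w : edgeWeights) {
            totalWeight += w;
        }
        double[] shares = new double[edgeWeights.length];
        for (int i = 0; i < edgeWeights.length; i++) {
            shares[i] = score * edgeWeights[i] / totalWeight;
        }
        return shares;
    }

    public static void main(String[] args) {
        // with weights {1, 3} the shares are 0.25 and 0.75; unweighted
        // PageRank would send 0.5 each way for a degree-2 vertex
        for (double share : distribute(1.0, new double[] {1.0, 3.0})) {
            System.out.println(share);
        }
    }
}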





[jira] [Created] (FLINK-7276) Gelly algorithm parameters

2017-07-26 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-7276:
-

 Summary: Gelly algorithm parameters
 Key: FLINK-7276
 URL: https://issues.apache.org/jira/browse/FLINK-7276
 Project: Flink
  Issue Type: Improvement
  Components: Gelly
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor


Similar to the examples drivers, the algorithm configuration fields should be 
typed to handle `canMergeConfiguration` and `mergeConfiguration` in 
`GraphAlgorithmWrappingBase` rather than overriding these methods in each 
algorithm (which has proven brittle). The existing `OptionalBoolean` is one 
example.
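
As a hedged sketch of the idea, modeled on the existing `OptionalBoolean` (the 
class below is illustrative, not proposed API):

public final class OptionalInt {

    private boolean isSet = false;
    private int value;

    public void set(int value) {
        this.isSet = true;
        this.value = value;
    }

    // mergeable when either side is unset or both values agree, so wrapping
    // classes need not override canMergeConfiguration themselves
    public boolean canMergeWith(OptionalInt other) {
        return !this.isSet || !other.isSet || this.value == other.value;
    }

    public void mergeWith(OptionalInt other) {
        if (!this.isSet && other.isSet) {
            set(other.value);
        }
    }
}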





[jira] [Created] (FLINK-7275) Differentiate between normal and power-user cli options in Gelly examples

2017-07-26 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-7275:
-

 Summary: Differentiate between normal and power-user cli options 
in Gelly examples
 Key: FLINK-7275
 URL: https://issues.apache.org/jira/browse/FLINK-7275
 Project: Flink
  Issue Type: Improvement
  Components: Gelly
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan


The current "hack" is to preface "power-user" options with a double underscore 
(i.e. '__parallelism') which are then "hidden" by exclusion from the program 
usage documentation. Change this to instead be explicit in the {{Parameter}} 
API and provide a cli option to display "power-user" options.





[jira] [Created] (FLINK-7273) Gelly tests with empty graphs

2017-07-26 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-7273:
-

 Summary: Gelly tests with empty graphs
 Key: FLINK-7273
 URL: https://issues.apache.org/jira/browse/FLINK-7273
 Project: Flink
  Issue Type: Bug
  Components: Gelly
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor
 Fix For: 1.4.0


There exist some tests with empty graphs, but the `EmptyGraph` in `AsmTestBase` 
contains vertices and no edges. Add a new `EmptyGraph` without vertices and 
test both empty graphs for each algorithm.

`PageRank` should (optionally?) include zero-degree vertices in the results.





Re: [VOTE] Release Apache Flink-shaded 1.0 (RC1)

2017-07-23 Thread Greg Hogan
Is there a pressing need to get the release out quickly? This being the first 
release, would it be better to change the versioning now to prevent future 
confusion? Even if Flink is the only intended consumer we’ll still be 
publishing the jars.


> On Jul 23, 2017, at 9:41 AM, Stephan Ewen  wrote:
> 
> The release is technically correct, so
> +1 for the release
> 
>  - LICENSE and NOTICE are good
>  - Shaded artifacts add their licenses to the artifact where needed
>  - no binaries in the release
> 
> 
> I will send another mail with suggestions for improving things for future
> releases
> 
> 
> On Fri, Jul 21, 2017 at 11:39 AM, Robert Metzger wrote:
> 
>> Thanks a lot for preparing the release artifacts.
>> While checking the source repo / release commit, I realized that you are
>> not following the same versioning scheme as flink:
>> the current master has a "x.y-SNAPSHOT" version, and release candidates
>> (and releases) get a x.y.z version. I wonder if it makes sense to use the
>> same model in the flink-shaded.git repo. I think this is the default
>> assumption in maven, and some modules behave differently based on the
>> version: for example "mvn deploy" sends "-SNAPSHOT" artifacts to a snapshot
>> server, and release artifacts to a staging repository.
>> 
>> I don't think we need to cancel the release because of this, I just wanted
>> to raise this point to see what others are thinking.
>> 
>> 
>> I've checked the following
>> - The netty shaded jar contains the MIT license from netty router:
>> https://repository.apache.org/content/repositories/orgapacheflink-1130/org/apache/flink/flink-shaded-netty-4/1.0-4.0.27.Final/flink-shaded-netty-4-1.0-4.0.27.Final.jar
>> - In the staging repo, I didn't see any dependencies exposed.
>> - I checked some of the md5 sums in the staging and they were correct / I
>> used a mvn plugin to check the signatures in the staging repo and they were
>> okay
>> - clean install in the source repo worked (this includes a license header
>> check)
>> - LICENSE and NOTICE file are there
>> 
>> ==> +1 to release.
>> 
>> On Fri, Jul 21, 2017 at 9:45 AM, Chesnay Schepler 
>> wrote:
>> 
>>> Here's a list of things we need to check:
>>> 
>>> * correct License/Notice files
>>> * licenses of shaded dependencies are included in the jar
>>> * the versions of shaded dependencies match those used in Flink 1.4
>>> * compilation with maven works
>>> * the assembled jars only contain the shaded dependency and no
>>>   non-shaded classes
>>> * no transitive dependencies should be exposed
>>> 
>>> 
>>> On 19.07.2017 15:59, Chesnay Schepler wrote:
>>> 
 Dear Flink community,
 
 Please vote on releasing the following candidate as Apache Flink-shaded
 version 1.0.
 
 The commit to be voted in:
 https://gitbox.apache.org/repos/asf/flink-shaded/commit/fd3033ba9ead310478963bf43e09cd50d1e36d71
 
 Branch:
 release-1.0-rc1
 
 The release artifacts to be voted on can be found at:
 http://home.apache.org/~chesnay/flink-shaded-1.0-rc1/ <
 http://home.apache.org/%7Echesnay/flink-shaded-1.0-rc1/>
 
 The release artifacts are signed with the key with fingerprint
 19F2195E1B4816D765A2C324C2EED7B111D464BA:
 http://www.apache.org/dist/flink/KEYS
 
 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapacheflink-1130
 
 -
 
 
 The vote ends on Monday (5pm CEST), July 24th, 2017.
 
 [ ] +1 Release this package as Apache Flink-shaded 1.0
 [ ] -1 Do not release this package, because ...
 
 -
 
 
 The flink-shaded project contains a number of shaded dependencies for
 Apache Flink.
 
 This release includes asm-all:5.0.4, guava:18.0, netty-all:4.0.27-FINAL
 and netty-router:1.10. Note that netty-all and netty-router are bundled as
 a single dependency.
 
 The purpose of these dependencies is to provide a single instance of a
 shaded dependency in the Apache Flink distribution, instead of each
 individual module shading the dependency.
 
 For more information, see
 https://issues.apache.org/jira/browse/FLINK-6529.


Re: [DISCUSS] Release 1.3.2 planning

2017-07-21 Thread Greg Hogan
FLINK-7211 is a trivial change for excluding the Gelly examples javadoc from 
the release assembly and would be good to have fixed for 1.3.2.


> On Jul 13, 2017, at 3:34 AM, Tzu-Li (Gordon) Tai  wrote:
> 
> I agree that FLINK-6951 should also be a blocker for 1.3.2. I’ll update its 
> priority.
> 
> On 13 July 2017 at 4:06:06 PM, Bowen Li (bowen...@offerupnow.com) wrote:
> 
> Hi Aljoscha,  
> I'd like to see https://issues.apache.org/jira/browse/FLINK-6951 fixed  
> in 1.3.2, if it makes sense.  
> 
> Thanks,  
> Bowen  
> 
> On Wed, Jul 12, 2017 at 3:06 AM, Aljoscha Krettek   
> wrote:  
> 
>> Short update, we resolved some blockers and discovered some new ones.  
>> There’s this nifty Jira page if you want to keep track:  
>> https://issues.apache.org/jira/projects/FLINK/versions/12340984 <  
>> https://issues.apache.org/jira/projects/FLINK/versions/12340984>  
>> 
>> Once again, could everyone please update the Jira issues that they think  
>> should be release blocking. I would like to start building release  
>> candidates at the end of this week, if possible.  
>> 
>> And yes, I’m volunteering to be the release manager on this release. ;-)  
>> 
>> Best,  
>> Aljoscha  
>> 
>>> On 7. Jul 2017, at 16:03, Aljoscha Krettek  wrote:  
>>> 
>>> I think we might have another blocker: https://issues.apache.org/  
>> jira/browse/FLINK-7133   
>>> 
 On 7. Jul 2017, at 09:18, Haohui Mai  wrote:  
 
 I think we are pretty close now -- Jira shows that we're down to two  
 blockers: FLINK-7069 and FLINK-6965.  
 
 FLINK-7069 is being merged and we have a PR for FLINK-6965.  
 
 ~Haohui  
 
 On Thu, Jul 6, 2017 at 1:44 AM Aljoscha Krettek   
>> wrote:  
 
> I’m seeing these remaining blockers:  
> https://issues.apache.org/jira/browse/FLINK-7069?filter=12334772&jql=project%20%3D%20FLINK%20AND%20priority%20%3D%20Blocker%20AND%20resolution%20%3D%20Unresolved
> 
> Could everyone please correctly mark as “blocking” those issues that  
>> they  
> consider blocking for 1.3.2 so that we get an accurate overview of  
>> where we  
> are.  
> 
> @Chesnay, could you maybe check if this one should in fact be  
>> considered a  
> blocker: https://issues.apache.org/jira/browse/FLINK-7034? <  
> https://issues.apache.org/jira/browse/FLINK-7034?>  
> 
> Best,  
> Aljoscha  
>> On 6. Jul 2017, at 07:19, Tzu-Li (Gordon) Tai   
> wrote:  
>> 
>> FLINK-7041 has been merged.  
>> I’d also like to raise another blocker for 1.3.2:  
> https://issues.apache.org/jira/browse/FLINK-6996.  
>> 
>> Cheers,  
>> Gordon  
>> On 30 June 2017 at 12:46:07 AM, Aljoscha Krettek (aljos...@apache.org  
>> )  
> wrote:  
>> 
>> Gordon and I found this (in my opinion) blocking issue:  
> https://issues.apache.org/jira/browse/FLINK-7041 <  
> https://issues.apache.org/jira/browse/FLINK-7041>  
>> 
>> I’m trying to quickly provide a fix.  
>> 
>>> On 26. Jun 2017, at 15:30, Timo Walther  wrote:  
>>> 
>>> I just opened a PR which should be included in the next bug fix  
>> release  
> for the Table API:  
>>> https://issues.apache.org/jira/browse/FLINK-7005  
>>> 
>>> Timo  
>>> 
>>> Am 23.06.17 um 14:09 schrieb Robert Metzger:  
 Thanks Haohui.  
 
 The first main task for the release management is to come up with a  
 timeline :)  
 Lets just wait and see which issues get reported. There are  
>> currently  
> no  
 blockers set for 1.3.1 in JIRA.  
 
 On Thu, Jun 22, 2017 at 6:47 PM, Haohui Mai   
> wrote:  
 
> Hi,  
> 
> Release management is though, I'm happy to help. Are there any  
> timelines  
> you have in mind?  
> 
> Haohui  
> On Fri, Jun 23, 2017 at 12:01 AM Robert Metzger <  
>> rmetz...@apache.org>  
> wrote:  
> 
>> Hi all,  
>> 
>> with the 1.3.1 release on the way, we can start thinking about the  
> 1.3.2  
>> release.  
>> 
>> We have already one issue that should go in there:  
>> - https://issues.apache.org/jira/browse/FLINK-6964  
>> 
>> If there are any other blockers, let us know here :)  
>> 
>> I'm wondering if there's somebody from the community who's  
>> willing to  
> take  
>> care of the release management of 1.3.2 :)  
>> 
>>> 
>> 

[jira] [Created] (FLINK-7234) Fix CombineHint documentation

2017-07-19 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-7234:
-

 Summary: Fix CombineHint documentation
 Key: FLINK-7234
 URL: https://issues.apache.org/jira/browse/FLINK-7234
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.2.2, 1.4.0, 1.3.2
Reporter: Greg Hogan
Assignee: Greg Hogan


The {{CombineHint}} 
[documentation|https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/batch/index.html]
 applies to {{DataSet#reduce}}, not {{DataSet#reduceGroup}}, and should also be 
noted for {{DataSet#distinct}}. It is also set with 
{{.setCombineHint(CombineHint)}} rather than alongside the UDF parameter.
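
For illustration, a minimal runnable sketch of the corrected usage (the sample 
data is made up):

import org.apache.flink.api.common.operators.base.ReduceOperatorBase.CombineHint;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class CombineHintExample {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Tuple2<String, Integer>> input = env.fromElements(
                new Tuple2<>("a", 1), new Tuple2<>("a", 2), new Tuple2<>("b", 3));

        // the hint is set on the operator returned by reduce(), not passed
        // alongside the UDF
        input.groupBy(0)
             .reduce((a, b) -> new Tuple2<>(a.f0, a.f1 + b.f1))
             .setCombineHint(CombineHint.HASH)
             .print();
    }
}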





Re: [DISCUSS] A more thorough Pull Request check list and template

2017-07-18 Thread Greg Hogan
Thanks for leading this discussion, Stephan. I don’t disagree with anything 
that has been said but am slightly concerned that the improvements in 
documenting pull requests won’t translate into the git commit messages. 
Acceptance of a higher standard will be swift as long as reviewers set the 
example and expectation. Perhaps a new template could be trialled as opt-in for 
a few weeks by several or more committers.

Greg


> On Jul 18, 2017, at 10:58 AM, Stephan Ewen  wrote:
> 
> My thinking was exactly as echoed by Gordon and Ufuk:
> 
>  - The yes/no sections are also for reviewers a reminder of what to watch
> out for.
>Let's face it, probably half of the committers are not aware that these
> things need to be checked implicitly against every change.
>A good part of the recent issues came from exactly that. Changes get
> merged (because the pull request lingered or the number of open PRs is
> high) and these implications are not thought through.
> 
>  - This is to me a tradeoff between requiring explicit +1s from certain
> people (maintainers) for certain components, and getting an awareness into
> everybody's mind.
> 
>  - It also makes all users aware that these things are considered and
> implicitly manages expectations in how fast can things get merged.
> 
> 
> Concerning the long text: I think it is fine to play the ball a bit more to
> the contributors.
> Making it easy, yes. But also making it correct and well. We need to make
> contributors aware of what it means to contribute to a system to runs
> highly available critical infrastructure. There is quite often still the
> mindset of "hey, cool, open source, let me throw something out there".
> 
> My take is that anyone who is serious about contributing and serious about
> quality is not put off by this template.
> 
> Concerning the introductory text: I bet that rarely anyone reads the "how
> to contribute" guide. Before the template, virtually no new pull request
> had even the required naming.
> That text needs to be in the template, or we might as well not have it
> anywhere at all.
> 
> 
> 
> Just for reference: Below is the introductory text of the JDK ;-)
> 
> 5. Know what to expect
> 
> Only the best patches submitted will actually make it all the way into a
> JDK code base. The goal is not to take in the maximum number of
> contributions possible, but rather to accept only the highest-quality
> contributions. The JDK is used daily by millions of people and thousands of
> businesses, often in mission-critical applications, and so we can't afford
> to accept anything less than the very best.
> 
> If you're relatively new to the Java platform then we recommend that you
> gain more experience writing Java applications before you attempt to work
> on the JDK itself. The purpose of the sponsored-contribution process is to
> bring developers who already have the skills required to work on the JDK
> into the existing development community. The members of that community have
> neither the time nor the patience required to teach basic Java programming
> skills or platform implementation techniques.
> 
> 
> 
> 
> 
> On Tue, Jul 18, 2017 at 12:15 PM, Ufuk Celebi  wrote:
> 
>> On Tue, Jul 18, 2017 at 10:47 AM, Fabian Hueske  wrote:
>>> For example even if the question about changed dependencies is answered
>>> with "no", the reviewer still has to check that.
>> 
>> But having it as a required option/text in the PR descriptions helps
>> reviewers to actually remember to check that. I think we should be
>> more realistic here and assume that reviewers will also overlook
>> things etc.
>> 
>> To me, keeping the questions is more important than the intro text.
>> Therefore, I would be OK with moving the text to the contrib guide,
>> but I would definitely keep the detailed yes/nos and not go with high
>> level questions that everyone will answer differently.
>> 
>> – Ufuk
>> 



Re: [DISCUSS] GitBox

2017-07-18 Thread Greg Hogan
My understanding was that the synchronization was bidirectional but clearly 
we’re working without documentation.

http://karaf.922171.n3.nabble.com/PROPOSAL-Apache-Karaf-Slack-amp-discuss-about-GitBox-td4050669.html
 
<http://karaf.922171.n3.nabble.com/PROPOSAL-Apache-Karaf-Slack-amp-discuss-about-GitBox-td4050669.html>
http://apache-accumulo.1065345.n5.nabble.com/DISCUSS-GitBox-td21160.html 
<http://apache-accumulo.1065345.n5.nabble.com/DISCUSS-GitBox-td21160.html>


> On Jul 18, 2017, at 8:45 AM, Chesnay Schepler <ches...@apache.org> wrote:
> 
> According to the JIRA you linked, you can push to the apache repo, but it 
> will be overridden by GitHub.
> (as it should since the GitHub repo is the original)
> 
> The solution offered in the JIRA is to (force) push to the github repo 
> instead of the apache one.
> Unless I'm misunderstanding, this doesn't appear to change anything.
> 
> On 18.07.2017 14:37, Greg Hogan wrote:
>> You are not able to push to the ASF repo? This link implies that both work 
>> (and identifies an issue now addressed):
>>   https://issues.apache.org/jira/browse/INFRA-14039 
>> <https://issues.apache.org/jira/browse/INFRA-14039>
>> 
>> From my .git/config:
>> 
>> [remote "origin"]
>>  url = g...@github.com:apache/flink-shaded.git
>>  fetch = +refs/heads/*:refs/remotes/origin/*
>> [remote "apache"]
>>  url = https://gitbox.apache.org/repos/asf/flink-shaded.git
>>  fetch = +refs/heads/*:refs/remotes/apache/*
>> [branch "master"]
>>  remote = origin
>>  merge = refs/heads/master
>> 
>> 
>>> On Jul 18, 2017, at 7:52 AM, Chesnay Schepler <ches...@apache.org> wrote:
>>> 
>>> So committers would still need to link their accounts.
>>> 
>>> Source for the mirror info: 
>>> https://issues.apache.org/jira/browse/INFRA-13926
>>> 
>>> On 18.07.2017 13:50, Chesnay Schepler wrote:
>>>> Alright, so there is an apache repo that can be found at 
>>>> https://gitbox.apache.org/repos/asf?p=flink-shaded.git
>>>> but it is a mirror of the github repo.
>>>> 
>>>> For flink, we push to apache and it is mirrored to github.
>>>> For flink-shaded, we push to github and it is mirrored to apache.
>>>> 
>>>> On 18.07.2017 13:47, Chesnay Schepler wrote:
>>>>> I'm not aware of any asf hosted repository for gitbox projects; if you 
>>>>> look at the flink-shaded repository you will
>>>>> not see any mention of it being a mirror, compared to the flink repo.
>>>>> 
>>>>> The git-wip-us.apache.org repo for flink-shaded was removed when we 
>>>>> switched.
>>>>> 
>>>>> On 18.07.2017 13:27, Greg Hogan wrote:
>>>>>> Linking is required to commit to the ASF hosted repo as well as the 
>>>>>> GitHub repo? My understanding was that linking and 2FA was only required 
>>>>>> to commit through GitHub, so no one would have diminished capabilities. 
>>>>>> I’d generally recommend only ever writing to a single repo to prevent 
>>>>>> concurrent commits.
>>>>>> 
>>>>>> 
>>>>>>> On Jul 18, 2017, at 6:21 AM, Chesnay Schepler <ches...@apache.org> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>> We recently moved flink-shaded to GitBox; overall I'm quite happy with 
>>>>>>> how it works.
>>>>>>> 
>>>>>>> However, it is not possible for committers to push commits that haven't 
>>>>>>> gone through the github/asf
>>>>>>> account linking process (https://gitbox.apache.org/setup/).
>>>>>>> 
>>>>>>> I verified this today in an experiment with the help of Robert.
>>>>>>> 
>>>>>>> The linking process requires every committer to join the ASF github 
>>>>>>> organization, include their github username in the apache profile, and 
>>>>>>> set up 2-factor authentication for their github account.
>>>>>>> 
>>>>>>> While i would love to have the gitbox functionality for the Flink 
>>>>>>> repository I don't know whether we want to
>>>>>>> impose these requirements on all committers.
>>>>>>> 
>>>>>>> On 21.06.2017 19:49, Robert Metzger wrote:

Re: [DISCUSS] GitBox

2017-07-18 Thread Greg Hogan
You are not able to push to the ASF repo? This link implies that both work (and 
identifies an issue now addressed):
  https://issues.apache.org/jira/browse/INFRA-14039 
<https://issues.apache.org/jira/browse/INFRA-14039>

From my .git/config:

[remote "origin"]
url = g...@github.com:apache/flink-shaded.git
fetch = +refs/heads/*:refs/remotes/origin/*
[remote "apache"]
url = https://gitbox.apache.org/repos/asf/flink-shaded.git
fetch = +refs/heads/*:refs/remotes/apache/*
[branch "master"]
remote = origin
merge = refs/heads/master


> On Jul 18, 2017, at 7:52 AM, Chesnay Schepler <ches...@apache.org> wrote:
> 
> So committers would still need to link their accounts.
> 
> Source for the mirror info: https://issues.apache.org/jira/browse/INFRA-13926
> 
> On 18.07.2017 13:50, Chesnay Schepler wrote:
>> Alright, so there is an apache repo that can be found at 
>> https://gitbox.apache.org/repos/asf?p=flink-shaded.git
>> but it is a mirror of the github repo.
>> 
>> For flink, we push to apache and it is mirrored to github.
>> For flink-shaded, we push to github and it is mirrored to apache.
>> 
>> On 18.07.2017 13:47, Chesnay Schepler wrote:
>>> I'm not aware of any asf hosted repository for gitbox projects; if you look 
>>> at the flink-shaded repository you will
>>> not see any mention of it being a mirror, compared to the flink repo.
>>> 
>>> The git-wip-us.apache.org repo for flink-shaded was removed when we 
>>> switched.
>>> 
>>> On 18.07.2017 13:27, Greg Hogan wrote:
>>>> Linking is required to commit to the ASF hosted repo as well as the GitHub 
>>>> repo? My understanding was that linking and 2FA was only required to 
>>>> commit through GitHub, so no one would have diminished capabilities. I’d 
>>>> generally recommend only ever writing to a single repo to prevent 
>>>> concurrent commits.
>>>> 
>>>> 
>>>>> On Jul 18, 2017, at 6:21 AM, Chesnay Schepler <ches...@apache.org> wrote:
>>>>> 
>>>>> We recently moved flink-shaded to GitBox; overall I'm quite happy with 
>>>>> how it works.
>>>>> 
>>>>> However, it is not possible for committers to push commits that haven't 
>>>>> gone through the github/asf
>>>>> account linking process (https://gitbox.apache.org/setup/).
>>>>> 
>>>>> I verified this today in an experiment with the help of Robert.
>>>>> 
>>>>> The linking process requires every committer to join the ASF github 
>>>>> organization, include their github username in the apache profile, and 
>>>>> set up 2-factor authentication for their github account.
>>>>> 
>>>>> While i would love to have the gitbox functionality for the Flink 
>>>>> repository I don't know whether we want to
>>>>> impose these requirements on all committers.
>>>>> 
>>>>> On 21.06.2017 19:49, Robert Metzger wrote:
>>>>>> +1 for trying out Gitbox!
>>>>>> 
>>>>>> On Sun, Jun 18, 2017 at 6:50 PM, Greg Hogan <c...@greghogan.com> wrote:
>>>>>> 
>>>>>>> My understanding is that with GitBox, project committers who have linked
>>>>>>> Apache and GitHub accounts are given organization write permissions. 
>>>>>>> Other
>>>>>>> contributors will continue to have read permissions.
>>>>>>> https://help.github.com/articles/repository-permission-levels-for-an- 
>>>>>>> organization/
>>>>>>> 
>>>>>>> The last comment noting the “split-brain” shouldn’t preclude the use of
>>>>>>> GitBox but we should come to a general consensus before switching to 
>>>>>>> commit
>>>>>>> into the GitHub repo.
>>>>>>> 
>>>>>>> If we want to try GitHub for flink-web, a second step could be to switch 
>>>>>>> and use it with the nascent flink-libraries.
>>>>>>> 
>>>>>>> 
>>>>>>>> On Jun 18, 2017, at 6:50 AM, Chesnay Schepler <ches...@apache.org>
>>>>>>> wrote:
>>>>>>>> Found some info in this JIRA: https://issues.apache.org/
>>>>>>> jira/browse/INFRA-14191
>>>>>>>> Apparently, Gitbox is still in

Re: [DISCUSS] GitBox

2017-07-18 Thread Greg Hogan
Linking is required to commit to the ASF hosted repo as well as the GitHub 
repo? My understanding was that linking and 2FA was only required to commit 
through GitHub, so no one would have diminished capabilities. I’d generally 
recommend only ever writing to a single repo to prevent concurrent commits.


> On Jul 18, 2017, at 6:21 AM, Chesnay Schepler <ches...@apache.org> wrote:
> 
> We recently moved flink-shaded to GitBox; overall I'm quite happy with how it 
> works.
> 
> However, it is not possible for committers to push commits that haven't gone 
> through the github/asf
> account linking process (https://gitbox.apache.org/setup/).
> 
> I verified this today in an experiment with the help of Robert.
> 
> The linking process requires every committer to join the ASF github 
> organization, include their github username in the apache profile, and set 
> up 2-factor authentication for their github account.
> 
> While i would love to have the gitbox functionality for the Flink repository 
> I don't know whether we want to
> impose these requirements on all committers.
> 
> On 21.06.2017 19:49, Robert Metzger wrote:
>> +1 for trying out Gitbox!
>> 
>> On Sun, Jun 18, 2017 at 6:50 PM, Greg Hogan <c...@greghogan.com> wrote:
>> 
>>> My understanding is that with GitBox, project committers who have linked
>>> Apache and GitHub accounts are given organization write permissions. Other
>>> contributors will continue to have read permissions.
>>>   https://help.github.com/articles/repository-permission-levels-for-an-
>>> organization/
>>> 
>>> The last comment noting the “split-brain” shouldn’t preclude the use of
>>> GitBox but we should come to a general consensus before switching to commit
>>> into the GitHub repo.
>>> 
>>> If we want to try GitHub for flink-web, a second step could be to switch and
>>> use it with the nascent flink-libraries.
>>> 
>>> 
>>>> On Jun 18, 2017, at 6:50 AM, Chesnay Schepler <ches...@apache.org>
>>> wrote:
>>>> Found some info in this JIRA: https://issues.apache.org/
>>> jira/browse/INFRA-14191
>>>> Apparently, Gitbox is still in the beta phase. There are no public docs
>>> for it yet.
>>>> Committers are required to link their apache & GitHub accounts, which
>>> requires 2FA on GitHub.
>>>> As it stands I would be in favor of Gregs original suggestion of
>>> activating it for flink-web as a test bed.
>>>> I would wait with the main repo until we actually have more info and it
>>> is a bit more proven.
>>>> On 11.06.2017 19:37, Ufuk Celebi wrote:
>>>>> I would also like to see this happening for both flink-web and flink
>>>>> if it allows committers to have control over the respective repos.
>>>>> 
>>>>> On Sat, Jun 10, 2017 at 4:05 PM, Chesnay Schepler <ches...@apache.org>
>>> wrote:
>>>>>> What are the downsides of this? Actually, is there any ASF resource
>>> that
>>>>>> outlines what this would enable?
>>>>>> 
>>>>>> In one of the threads I saw, it said that this would also allow committers
>>>>>> to close PRs, assign labels, and such.
>>>>>> This sounds very interesting to me for the main repo actually.
>>>>>> 
>>>>>> 
>>>>>> On 09.06.2017 17:41, Greg Hogan wrote:
>>>>>>> Robert has an open PR from March. I’ve found, for example, PRs adding
>>>>>>> links to talks or slides left open for months.
>>>>>>> 
>>>>>>> I’d suggest Fluo is to Accumulo as flink-web is to the flink repo, and
>>>>>>> that migration looks to be satisfactory.
>>>>>>> 
>>>>>>> 
>>>>>>>> On Jun 9, 2017, at 11:15 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> bq. better track the oft-neglected contributions
>>>>>>>> 
>>>>>>>> Do you have an estimate of how many contributions were not paid
>>>>>>>> attention to in the current infrastructure?
>>>>>>>> 
>>>>>>>> Looking at #2, it seems Accumulo community hasn't reached consensus
>>> yet.
>>>>>>>> Cheers
>>>>>>>> 
>>>>>>>> On Fri, Jun 9, 2017 at 7:54 AM, Greg Hogan <c...@greghogan.com>

[jira] [Created] (FLINK-7211) Exclude Gelly javadoc jar from release

2017-07-17 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-7211:
-

 Summary: Exclude Gelly javadoc jar from release
 Key: FLINK-7211
 URL: https://issues.apache.org/jira/browse/FLINK-7211
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Affects Versions: 1.4.0, 1.3.2
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial








[jira] [Created] (FLINK-7204) CombineHint.NONE

2017-07-16 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-7204:
-

 Summary: CombineHint.NONE
 Key: FLINK-7204
 URL: https://issues.apache.org/jira/browse/FLINK-7204
 Project: Flink
  Issue Type: New Feature
  Components: Core
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor


FLINK-3477 added a hash-combine preceding the reducer configured with 
{{CombineHint.HASH}} or {{CombineHint.SORT}} (default). In some cases it may be 
useful to disable the combiner in {{ReduceNode}} by specifying a new 
{{CombineHint.NONE}} value.
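
For illustration, a sketch of the proposed enum (OPTIMIZER_CHOOSES, SORT, and 
HASH exist today in {{ReduceOperatorBase}}; NONE is the hypothetical addition):

public enum CombineHint {
    OPTIMIZER_CHOOSES, // let the optimizer decide
    SORT,              // sort-based combine (current default)
    HASH,              // hash-based combine, added in FLINK-3477
    NONE               // proposed: skip the combine phase entirely
}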





[jira] [Created] (FLINK-7199) Graph simplification does not set parallelism

2017-07-14 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-7199:
-

 Summary: Graph simplification does not set parallelism
 Key: FLINK-7199
 URL: https://issues.apache.org/jira/browse/FLINK-7199
 Project: Flink
  Issue Type: Bug
  Components: Gelly
Affects Versions: 1.3.1, 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor


The {{Simplify}} parameter should accept and set the parallelism when calling 
the {{Simplify}} algorithms.





Akka disassociated

2017-07-14 Thread Greg Hogan
Hi all,

I’m having some issues with Akka running on a modest cluster where increasing 
the parallelism results in disassociation messages.

I am running a batch job, Gelly’s TriangleListing (for simplicity) which is 
join-based. I have not seen this issue running AdamicAdar which is sort-based.

I have increased both of the following timeouts and the job takes less than 100 
seconds.
akka.ask.timeout: 1000 s
akka.lookup.timeout: 100 s

I have not changed taskmanager.exit-on-fatal-akka-error from the default value 
of false but the JobManager is dropping all TaskManager connections.

I can run the TriangleListing job with the same 127 TaskManagers with a smaller 
parallelism. Dropping from 2286 to around 1000 is often successful.

CPU and memory should not be a bottleneck for the JobManager (18 cores and 18 
GB).

I would be grateful for solutions, suggestions, or pointers to debugging this 
issue.

Thanks,
Greg


2017-07-14 16:50:08,119 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph- GroupReduce 
(Generate triplets) (30/2286) (5a2e8f0a00530bd2216d7d3ee10688f7) switched from 
RUNNING to FINISHED.
2017-07-14 16:50:08,312 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph- GroupReduce 
(Generate triplets) (26/2286) (c6a91db2d6b6797768596d9f746d316f) switched from 
RUNNING to FINISHED.
2017-07-14 16:50:09,831 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph- GroupReduce 
(Generate triplets) (131/2286) (2c77b1e4b90b951d3be1e09bf4cf41d2) switched from 
RUNNING to FINISHED.
2017-07-14 16:50:10,057 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph- GroupReduce 
(Generate triplets) (133/2286) (d0c4c4eda4f0c44fe594a1b94eb66c93) switched from 
RUNNING to FINISHED.
2017-07-14 16:50:11,861 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph- GroupReduce 
(Generate triplets) (70/2286) (69ce8d91fbbad943c277ee92d3c38aaa) switched from 
RUNNING to FINISHED.
2017-07-14 16:50:15,029 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph- GroupReduce 
(Generate triplets) (38/2286) (a72c2dee009342bc4d90ec98427fa717) switched from 
RUNNING to FINISHED.
2017-07-14 16:50:16,583 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph- GroupReduce 
(Generate triplets) (27/2286) (e79ec6229d4afdc6669c1c221a19ad8c) switched from 
RUNNING to FINISHED.
2017-07-14 16:50:19,498 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph- GroupReduce 
(Generate triplets) (44/2286) (53e35ddbd0e02d256620e5310276bea6) switched from 
RUNNING to FINISHED.
2017-07-14 16:50:21,021 WARN  akka.remote.ReliableDeliverySupervisor
- Association with remote system 
[akka.tcp://flink@ip-10-0-28-115:40713] has failed, address is now gated for 
[5000] ms. Reason: [Disassociated]
2017-07-14 16:50:21,097 WARN  akka.remote.ReliableDeliverySupervisor
- Association with remote system 
[akka.tcp://flink@ip-10-0-21-141:45899] has failed, address is now gated for 
[5000] ms. Reason: [Disassociated]
2017-07-14 16:50:21,129 WARN  akka.remote.ReliableDeliverySupervisor
- Association with remote system 
[akka.tcp://flink@ip-10-0-27-236:37471] has failed, address is now gated for 
[5000] ms. Reason: [Disassociated]
2017-07-14 16:50:21,132 WARN  akka.remote.ReliableDeliverySupervisor
- Association with remote system 
[akka.tcp://flink@ip-10-0-18-79:45765] has failed, address is now gated for 
[5000] ms. Reason: [Disassociated]
2017-07-14 16:50:21,140 WARN  akka.remote.ReliableDeliverySupervisor
- Association with remote system 
[akka.tcp://flink@ip-10-0-29-112:41017] has failed, address is now gated for 
[5000] ms. Reason: [Disassociated]
2017-07-14 16:50:21,142 WARN  akka.remote.ReliableDeliverySupervisor
- Association with remote system 
[akka.tcp://flink@ip-10-0-25-70:39625] has failed, address is now gated for 
[5000] ms. Reason: [Disassociated]
2017-07-14 16:50:21,159 WARN  akka.remote.ReliableDeliverySupervisor
- Association with remote system 
[akka.tcp://flink@ip-10-0-28-105:39127] has failed, address is now gated for 
[5000] ms. Reason: [Disassociated]
2017-07-14 16:50:21,170 WARN  akka.remote.ReliableDeliverySupervisor
- Association with remote system 
[akka.tcp://flink@ip-10-0-28-117:38923] has failed, address is now gated for 
[5000] ms. Reason: [Disassociated]
2017-07-14 16:50:21,181 WARN  akka.remote.ReliableDeliverySupervisor
- Association with remote system 
[akka.tcp://flink@ip-10-0-20-172:40007] has failed, address is now gated for 
[5000] ms. Reason: [Disassociated]
2017-07-14 16:50:21,190 WARN  akka.remote.ReliableDeliverySupervisor
- Association with remote system 
[akka.tcp://flink@ip-10-0-22-220:44391] has failed, address is now gated for 
[5000] ms. Reason: 

[jira] [Created] (FLINK-7154) Missing call to build CsvTableSource example

2017-07-11 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-7154:
-

 Summary: Missing call to build CsvTableSource example
 Key: FLINK-7154
 URL: https://issues.apache.org/jira/browse/FLINK-7154
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.4.0, 1.3.2
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial


The Java and Scala example code for CsvTableSource creates a builder but is 
missing the final call to {{build}}.

https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/table/sourceSinks.html#csvtablesource
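
For reference, the corrected snippet should end with {{build}}; roughly (the 
path and fields here are illustrative):

{code}
CsvTableSource csvTableSource = CsvTableSource
    .builder()
    .path("/path/to/your/file.csv")
    .field("name", Types.STRING())
    .field("id", Types.INT())
    .build(); // this call is missing from the documented example
{code}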



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [ANNOUNCE] New Flink committer Jincheng Sun

2017-07-10 Thread Greg Hogan
Congrats and welcome, Jincheng!


> On Jul 10, 2017, at 9:17 AM, Fabian Hueske  wrote:
> 
> Hi everybody,
> 
> On behalf of the PMC, I'm very happy to announce that Jincheng Sun has
> accepted the invitation of the PMC to become a Flink committer.
> 
> For more than nine months, Jincheng has been one of the most active contributors
> to the Table API / SQL module. He has contributed several major features,
> reported and fixed many bugs, and also spent a lot of time reviewing pull
> requests.
> 
> Please join me in congratulating Jincheng for becoming a Flink committer.
> 
> Thanks, Fabian



Re: Tips to fix IDEA strange problem after updating master code

2017-07-06 Thread Greg Hogan
I’m wondering if we can remove the ILoopCompat duplication by checking and 
reading the trait properties with reflection … but I have not discovered how to 
do this.


> On Jul 4, 2017, at 3:57 AM, Piotr Nowojski  wrote:
> 
> Besides deactivating “scala-2.10” profile in the Intellij it might be 
> necessary to:
> - reimport maven project:
>   1. Right click on root module: “flink-parent”
>   2. Maven
>   3. reimport
> - invalidate caches and restart: File -> Invalidate caches and restart -> 
> invalidate /restart
> - rebuild whole project
> 
> I suspect that the activation of scala-2.10 by default either comes from the 
> flink-scala and flink-scala-shell poms or is an artifact of having 
> created/imported the IntelliJ project when 2.10 was the default. If the first 
> option is true, this PR: https://github.com/apache/flink/pull/4240 
>  might fix this issue.
> 
> 
> Another quirk that I encountered is a compile error about the ILoopCompat 
> class being defined twice in IntelliJ (it works fine from the console). This comes 
> from flink-scala-shell/pom.xml, which defines two different source paths 
> depending on the Scala version:
> 
> src/main/scala-${scala.binary.version}
> 
> Such a thing is not supported by IntelliJ, and one has to manually remove one 
> of the source directories (either 2.11 or 2.10) from the project settings.
> 
> Piotrek
> 
>> On Jul 4, 2017, at 9:46 AM, Aljoscha Krettek  wrote:
>> 
>> Thanks for the hint!
>> 
>>> On 4. Jul 2017, at 06:03, Ted Yu  wrote:
>>> 
>>> Looks like the picture didn't go thru.
>>> 
>>> Mind using third party site ?
>>> 
>>> Thanks
>>> 
>>> On Mon, Jul 3, 2017 at 8:56 PM, Jark Wu  wrote:
>>> 
 Hi devs,
 
 Yesterday, I updated the master code, which includes [FLINK-7030]: Build
 with scala-2.11 by default. After that, I ran into a strange problem with
 IDEA where many classes couldn't be found and the project couldn't be
 built/compiled (in IDEA), but maven install worked fine.
 
 After a series of attempts, I found that IDEA activates the scala-2.10
 profile by default, which results in this problem. After deactivating the
 scala-2.10 profile via the sidebar (Maven Projects -> Profiles -> deactivate
 the "scala-2.10" profile), everything works fine again.
 
 [image: inline image 1]
 
 I'm sharing this tip on the dev list because a lot of my colleagues have had
 the same issue, and maybe many other Flink devs have the same problem too.
 
 BTW, I don't know why IDEA activates scala-2.10 by default; I'm not sure
 whether it's an IDEA bug or a wrong profile setting somewhere.
 
 
 Regards,
 Jark Wu
 
>> 
> 



[jira] [Created] (FLINK-7042) Fix jar file discovery in YARN tests

2017-06-29 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-7042:
-

 Summary: Fix jar file discovery in YARN tests
 Key: FLINK-7042
 URL: https://issues.apache.org/jira/browse/FLINK-7042
 Project: Flink
  Issue Type: Bug
  Components: YARN
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Critical


When running a local {{mvn clean verify}}, the following error in 
{{org.apache.flink.yarn.YARNSessionCapacitySchedulerITCase#perJobYarnClusterWithParallelism}}
 is caused by the discovery of a spurious file created by an earlier YARN test.
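
One possible fix (a sketch only, not the actual patch) is to exclude hidden
checksum files when the test utilities scan for the example jar:

{code}
// sketch: ignore Hadoop ".crc" / hidden files during jar discovery
FilenameFilter jarFilter = new FilenameFilter() {
    @Override
    public boolean accept(File dir, String name) {
        return name.endsWith(".jar") && !name.startsWith(".");
    }
};
{code}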

{code}
15:45:16,627 INFO  org.apache.flink.yarn.YarnTestBase   
 - Running with args [run, -p, 2, -m, yarn-cluster, -yj, 
/home/ec2-user/flink-upstream/flink-yarn-tests/../flink-dist/target/flink-1.4-SNAPSHOT-bin/flink-1.4-SNAPSHOT/lib/flink-dist_2.10-1.4-SNAPSHOT.jar,
 -yt, 
/home/ec2-user/flink-upstream/flink-yarn-tests/../flink-dist/target/flink-1.4-SNAPSHOT-bin/flink-1.4-SNAPSHOT/lib,
 -yn, 1, -yjm, 768, -ytm, 1024, 
/home/ec2-user/flink-upstream/flink-yarn-tests/../flink-yarn-tests/target/flink-yarn-tests-capacityscheduler/flink-yarn-tests-capacityscheduler-localDir-nm-1_0/usercache/ec2-user/appcache/application_1498751075681_0001/filecache/13/.tmp_flink-examples-batch_2.10-1.4-SNAPSHOT-WordCount.jar.crc]
15:45:16,628 INFO  org.apache.flink.client.CliFrontend  
 - Using configuration directory 
/home/ec2-user/flink-upstream/flink-yarn-tests/../flink-dist/target/flink-1.4-SNAPSHOT-bin/flink-1.4-SNAPSHOT/conf
15:45:16,628 INFO  org.apache.flink.client.CliFrontend  
 - Trying to load configuration file
15:45:16,628 INFO  org.apache.flink.configuration.GlobalConfiguration   
 - Loading configuration property: jobmanager.rpc.address, localhost
15:45:16,629 INFO  org.apache.flink.configuration.GlobalConfiguration   
 - Loading configuration property: jobmanager.rpc.port, 6123
15:45:16,629 INFO  org.apache.flink.configuration.GlobalConfiguration   
 - Loading configuration property: jobmanager.heap.mb, 1024
15:45:16,629 INFO  org.apache.flink.configuration.GlobalConfiguration   
 - Loading configuration property: taskmanager.heap.mb, 1024
15:45:16,629 INFO  org.apache.flink.configuration.GlobalConfiguration   
 - Loading configuration property: taskmanager.numberOfTaskSlots, 1
15:45:16,629 INFO  org.apache.flink.configuration.GlobalConfiguration   
 - Loading configuration property: taskmanager.memory.preallocate, false
15:45:16,629 INFO  org.apache.flink.configuration.GlobalConfiguration   
 - Loading configuration property: parallelism.default, 1
15:45:16,629 INFO  org.apache.flink.configuration.GlobalConfiguration   
 - Loading configuration property: jobmanager.web.port, 8081
15:45:16,629 INFO  org.apache.flink.client.CliFrontend  
 - Running 'run' command.
15:45:16,629 INFO  org.apache.flink.client.CliFrontend  
 - Building program from JAR file
15:45:16,630 ERROR org.apache.flink.client.CliFrontend  
 - Error while running the command.
org.apache.flink.client.program.ProgramInvocationException: Error while opening 
jar file 
'/home/ec2-user/flink-upstream/flink-yarn-tests/../flink-yarn-tests/target/flink-yarn-tests-capacityscheduler/flink-yarn-tests-capacityscheduler-localDir-nm-1_0/usercache/ec2-user/appcache/application_1498751075681_0001/filecache/13/.tmp_flink-examples-batch_2.10-1.4-SNAPSHOT-WordCount.jar.crc'.
 error in opening zip file
at 
org.apache.flink.client.program.PackagedProgram.getEntryPointClassNameFromJar(PackagedProgram.java:562)
at 
org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:188)
at 
org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:126)
at 
org.apache.flink.client.CliFrontend.buildProgram(CliFrontend.java:900)
at org.apache.flink.client.CliFrontend.run(CliFrontend.java:229)
at 
org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1083)
at org.apache.flink.yarn.YarnTestBase$Runner.run(YarnTestBase.java:657)
Caused by: java.util.zip.ZipException: error in opening zip file
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.(ZipFile.java:219)
at java.util.zip.ZipFile.(ZipFile.java:149)
at java.util.jar.JarFile.(JarFile.java:166)
at java.util.jar.JarFile.(JarFile.java:130)
at 
org.apache.flink.client.program.PackagedProgram.getEntryPointClassNameFromJar(PackagedProgram.java:557)
... 6 more
15:45:16,632 INFO  org.apache.flink.yarn.YarnTestBase   
 - Runner stopped with exception
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7039) Increase forkCountTestPackage for sudo-based Trav

2017-06-29 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-7039:
-

 Summary: Increase forkCountTestPackage for sudo-based Trav
 Key: FLINK-7039
 URL: https://issues.apache.org/jira/browse/FLINK-7039
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial
 Fix For: 1.4.0


https://docs.travis-ci.com/user/ci-environment/



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7031) Document Gelly examples

2017-06-28 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-7031:
-

 Summary: Document Gelly examples
 Key: FLINK-7031
 URL: https://issues.apache.org/jira/browse/FLINK-7031
 Project: Flink
  Issue Type: New Feature
  Components: Documentation
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor
 Fix For: 1.4.0


The components comprising the Gelly examples runner (inputs, outputs, drivers, 
and soon transforms) were initially developed for internal Gelly use. As such, 
the Gelly documentation covers execution of the drivers but does not document 
the design and structure. The runner has become sufficiently advanced and 
integral to the development of new Gelly algorithms to warrant a page of 
documentation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Switch to Scala 2.11 as a default build profile

2017-06-28 Thread Greg Hogan
You don't need to use the build profile in IntelliJ, just change
scala.version and scala.binary.version in the parent pom (recent
refactorings made this possible without changing every pom).
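
For example, roughly (the patch versions here are illustrative):

  <scala.version>2.11.11</scala.version>
  <scala.binary.version>2.11</scala.binary.version>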

What is the benefit of changing the default without dropping older
versions, when contributions are still limited to the functionality of the
old version?

On Wed, Jun 28, 2017 at 8:36 AM, Piotr Nowojski 
wrote:

> Hi,
>
> I propose to switch to Scala 2.11 as a default and to have a Scala 2.10
> build profile. Now it is other way around. The reason for that is poor
> support for build profiles in Intellij, I was unable to make it work after
> I added Kafka 0.11 dependency (Kafka 0.11 dropped support for Scala 2.10).
>
> As a side note, maybe we should also consider dropping Scala 2.10 support?
>
> Piotrek


[jira] [Created] (FLINK-7023) Remaining types for Gelly ValueArrays

2017-06-27 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-7023:
-

 Summary: Remaining types for Gelly ValueArrays
 Key: FLINK-7023
 URL: https://issues.apache.org/jira/browse/FLINK-7023
 Project: Flink
  Issue Type: Sub-task
  Components: Gelly
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial
 Fix For: 1.4.0


Add implementations of Byte/Char/Double/Float/ShortValueArray. Along with the 
existing implementations of Int/Long/Null/StringValueArray this covers all 10 
CopyableValue types.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [DISCUSS] FLIP-21 - Improve object Copying/Reuse Mode for Streaming Runtime

2017-06-27 Thread Greg Hogan
Hi Stephan,

Would this be an appropriate time to discuss allowing reuse to be a 
per-operator configuration? Object reuse for chained operators has led to 
considerable surprise for some users of the DataSet API. This came up during 
the rework of the object reuse documentation for the DataSet API. With 
annotations, a Function could mark whether input/iterator or output/collected 
objects should be copied or reused.

My distant observation is that it is safer to locally assert reuse at the 
operator level than to assume or guarantee the safety of object reuse across an 
entire program. It could also be handy to mix operators receiving copyable 
objects with operators not requiring copyable objects.
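
As a purely hypothetical sketch of the annotation idea (neither annotation
exists today; the names are made up for illustration):

  // hypothetical annotations marking per-operator reuse semantics
  @ReuseInputObjects   // input objects may be reused by the runtime
  @CopyOutputObjects   // emitted objects are defensively copied
  public static class Doubler implements MapFunction<Long, Long> {
      @Override
      public Long map(Long value) {
          return 2 * value;
      }
  }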

Greg


> On Jun 27, 2017, at 1:21 PM, Stephan Ewen  wrote:
> 
> Hi all!
> 
> I would like to propose the following FLIP:
> 
> FLIP-21 - Improve object Copying/Reuse Mode for Streaming Runtime:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=71012982
> 
> The FLIP is motivated by the fact that many users run into an unnecessary
> kind of performance problem caused by an old design artifact.
> 
> The required change should be reasonably small, and would help many users
> and Flink's general standing.
> 
> Happy to hear thoughts!
> 
> Stephan
> 
> ==
> 
> FLIP text is below. Pictures with illustrations are only in the Wiki, not
> supported on the mailing list.
> -
> 
> Motivation
> 
> The default behavior of the streaming runtime is to copy every element
> between chained operators.
> 
> That operation was introduced for “safety” reasons, to avoid the number of
> cases where users can create incorrect programs by reusing mutable objects
> (a discouraged pattern, but possible). For example when using state
> backends that keep the state as objects on heap, reusing mutable objects
> can theoretically create cases where the same object is used in multiple
> state mappings.
> 
> The effect is that many people who try Flink get much lower performance
> than they otherwise could. From empirical evidence, almost all users
> that I (Stephan) have been in touch with eventually run into this issue.
> 
> There are multiple observations about that design:
> 
>   - Object copies are extremely costly. While some simple types copy virtually
>     for free (types reliably detected as immutable are not copied at all), many
>     real pipelines use types like Avro, Thrift, JSON, etc., which are very
>     expensive to copy.
> 
>   - Keyed operations currently only occur after shuffles. The operations are
>     hence the first in a pipeline and will never have a reused object anyway.
>     That means for the most critical operation, this precaution is unnecessary.
> 
>   - The mode is inconsistent with the contract of the DataSet API, which
>     does not copy at each step.
> 
>   - To prevent these copies, users can select {{enableObjectReuse()}}, which
>     is misleading, since it does not really reuse mutable objects, but avoids
>     additional copies.
> 
> 
> Proposal
> 
> Summary
> 
> I propose to change the default behavior of the DataStream runtime to be
> the same as the DataSet runtime. That means that new objects are chosen on
> every deserialization, and no copies are made as the objects are passed on
> along the pipelines.
> 
> Details
> 
> I propose to drop the execution config flag {{objectReuse}} and instead
> introduce an {{ObjectReuseMode}} enumeration with better control of what
> should happen. There will be three different types:
> 
> 
>   - DEFAULT
>     - This is the default in the DataSet API
>     - This will become the default in the DataStream API
>     - This happens in the DataStream API when {{enableObjectReuse()}} is
>       activated.
> 
>   - COPY_PER_OPERATOR
>     - The current default in the DataStream API
> 
>   - FULL_REUSE
>     - This happens in the DataSet API when {{enableObjectReuse()}} is
>       chosen.
> 
> 
> An illustration of the modes is as follows:
> 
> DEFAULT
> 
> 
> See here:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=71012982=/https%3A%2F%2Flh5.googleusercontent.com%2F1UOpVB2wSMhx8067IE9t2_mJG549IoOkDiAfIN_uXQZVUvAXCp-hQLY-mgoSWunwF-xciZuJ4pZpj1FX0ZPQrd-Fm1jWzgX3Hv7-SELUdPUvEN6XUPbLrwfA9YRl605bFKMYlf1r
> 
> COPY_PER_OPERATOR
> 
> 
> See here:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=71012982=/https%3A%2F%2Flh3.googleusercontent.com%2Fs5sBOktzaKrRw3v1-IQMgImYZfchQMVz2HiG3i050xCWNTKuQV6mmlv3QtR0TZ0SGPRSCyjI-sUAqfbJw4fGOxKqBuRX2f-iZGh0e7hBke7DzuApUNy1vaF2SgtQVH3XEXkRx8Ks
> 
> 
> FULL_REUSE
> 
> 
> See here:
> 

[jira] [Created] (FLINK-7019) Rework parallelism in Gelly algorithms and examples

2017-06-27 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-7019:
-

 Summary: Rework parallelism in Gelly algorithms and examples
 Key: FLINK-7019
 URL: https://issues.apache.org/jira/browse/FLINK-7019
 Project: Flink
  Issue Type: Sub-task
  Components: Gelly
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor
 Fix For: 1.4.0


Flink job parallelism is set with {{ExecutionConfig#setParallelism}} or with 
{{-p}} on the command line. The Gelly algorithms {{JaccardIndex}}, 
{{AdamicAdar}}, {{TriangleListing}}, and {{ClusteringCoefficient}} have 
intermediate operators which generate output quadratic in the size of the input. 
These algorithms may need to be run with a high parallelism, but doing so for 
all operations is wasteful. Thus the "little parallelism" was introduced.

This can be simplified by moving the parallelism parameter to the new common 
base class, with the rule of thumb of using the algorithm parallelism for all 
normal (small-output) operators. The asymptotically large operators will 
default to the job parallelism, as will the default algorithm parallelism.
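
A before/after sketch of the intended configuration (the unified setter on the 
common base class is hypothetical):

{code}
// today: per-algorithm "little parallelism"
graph.run(new JaccardIndex<K, VV, EV>()
    .setLittleParallelism(16));

// proposed (sketch): one algorithm parallelism on the common base class;
// asymptotically large operators keep the default job parallelism
graph.run(new JaccardIndex<K, VV, EV>()
    .setParallelism(16));
{code}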



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7006) Base class using POJOs for Gelly algorithms

2017-06-26 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-7006:
-

 Summary: Base class using POJOs for Gelly algorithms
 Key: FLINK-7006
 URL: https://issues.apache.org/jira/browse/FLINK-7006
 Project: Flink
  Issue Type: Sub-task
  Components: Gelly
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor
 Fix For: 1.4.0


Gelly algorithms commonly have a {{Result}} class extending a {{Tuple}} type 
and implementing one of the {{Unary/Binary/TertiaryResult}} interfaces.

Add a {{Unary/Binary/TertiaryResultBase}} class implementing each interface and 
convert the {{Result}} classes to POJOs extending the base result classes.
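
A minimal sketch of one such base class (assuming the existing 
{{BinaryResult}} interface; field handling is illustrative):

{code}
public abstract class BinaryResultBase<K> implements BinaryResult<K> {

    private K vertexId0;

    private K vertexId1;

    @Override
    public K getVertexId0() {
        return vertexId0;
    }

    @Override
    public void setVertexId0(K vertexId0) {
        this.vertexId0 = vertexId0;
    }

    @Override
    public K getVertexId1() {
        return vertexId1;
    }

    @Override
    public void setVertexId1(K vertexId1) {
        this.vertexId1 = vertexId1;
    }
}
{code}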



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-6989) Refactor examples with Output interface

2017-06-22 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6989:
-

 Summary: Refactor examples with Output interface
 Key: FLINK-6989
 URL: https://issues.apache.org/jira/browse/FLINK-6989
 Project: Flink
  Issue Type: Sub-task
  Components: Gelly
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
 Fix For: 1.4.0


The current organization of the Gelly examples retains full flexibility by 
handing the Graph input to the algorithm Driver and having the Driver overload 
interfaces for the various output types. The outputs must be made independent 
in order to support Transforms, which are applied between the Driver and Output 
(and also between the Input and Driver).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [VOTE] Release Apache Flink 1.3.1 (RC2)

2017-06-22 Thread Greg Hogan
+1 (binding)

- verified source and binary signatures
- verified source and binary checksums
- verified LICENSEs
- verified NOTICEs
- built from source

Greg

> On Jun 21, 2017, at 3:46 AM, Robert Metzger  wrote:
> 
> Dear Flink community,
> 
> Please vote on releasing the following candidate as Apache Flink version
> 1.3.1.
> 
> The commit to be voted on:
> *http://git-wip-us.apache.org/repos/asf/flink/commit/1ca6e5b6
> *
> 
> Branch:
> release-1.3.1-rc2
> 
> The release artifacts to be voted on can be found at:
> *http://people.apache.org/~rmetzger/flink-1.3.1-rc2/
> *
> 
> The release artifacts are signed with the key with fingerprint D9839159:
> http://www.apache.org/dist/flink/KEYS
> 
> The staging repository for this release can be found at:
> *https://repository.apache.org/content/repositories/orgapacheflink-1125
> *
> 
> 
> -
> 
> 
> The vote ends on Thursday (5pm CEST), June 22, 2017.
> IMPORTANT: I've reduced the voting time to only one day because the
> changes between RC1 and RC2 are mostly in the Table API (mostly
> documentation) and the serializer changes Till and Gordon were working on.
> The list of changes is the following:
> - Reworked Table API documentation (this is a set of commits)
> - [FLINK-6817] [table] Add OverWindowWithPreceding class to guide users
> - [FLINK-6859] [table] Do not delete timers in StateCleaningCountTrigger
> - [FLINK-6930] [table] Forbid selecting window start/end on row-based T…
> - [FLINK-6886] [table] Fix conversion of Row Table to POJO
> - [FLINK-6602] [table] Prevent TableSources with empty time attribute n…
> - [FLINK-6941] [table] Validate that start and end window properties ar…
> - [FLINK-6881] [FLINK-6896] [table] Creating a table from a POJO and de…
> - [FLINK-6921] [serializer] Allow EnumValueSerializer to deal with appe…
> - [FLINK-6948] [serializer] Harden EnumValueSerializer to detect change…
> - [FLINK-6922] [serializer] Remove Java serialization from Enum(Value)S…
> - [FLINK-6652] [core] Fix handling of delimiters split by buffers in De…
> 
> 
> 
> [ ] +1 Release this package as Apache Flink 1.3.1
> [ ] -1 Do not release this package, because ...



[jira] [Created] (FLINK-6986) Broken links to Photoshop images

2017-06-22 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6986:
-

 Summary: Broken links to Photoshop images
 Key: FLINK-6986
 URL: https://issues.apache.org/jira/browse/FLINK-6986
 Project: Flink
  Issue Type: Bug
  Components: Project Website
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor


The "Black outline logo with text" links on the 
[community|https://flink.apache.org/community.html] page are broken.

I'd like to see if we can find a comprehensive solution for broken links. I 
only noticed this due to random clicking. I think Google can report broken 
links or we could run our own scan.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [ANNOUNCE] New Flink committer Shaoxuan Wang

2017-06-21 Thread Greg Hogan
Congrats and welcome, Shaoxuan!


> On Jun 21, 2017, at 4:19 PM, Fabian Hueske  wrote:
> 
> Hi everybody,
> 
> On behalf of the PMC, I'm very happy to announce that Shaoxuan Wang has
> accepted the invitation of the PMC to become a Flink committer.
> 
> Shaoxuan has contributed several major features to the Table API / SQL and
> is very engaged in discussions about the design of new features and the
> future direction of Flink's relational APIs.
> 
> Please join in me congratulating Shaoxuan for becoming a Flink committer.
> 
> Thanks, Fabian



Re: [ANNOUNCE] New committer: Dawid Wysakowicz

2017-06-19 Thread Greg Hogan
Welcome and congrats Dawid!


> On Jun 19, 2017, at 4:40 AM, Till Rohrmann  wrote:
> 
> Hi everybody,
> 
> On behalf of the PMC I am delighted to announce Dawid Wysakowicz as a new
> Flink committer!
> 
> Dawid has been a community member for a very long time and among other
> things he helped shaping Flink's CEP library into what it is today.
> 
> Welcome Dawid and congratulations again for becoming a Flink committer!
> 
> Cheers,
> Till


Re: [VOTE] Release Apache Flink 1.3.1

2017-06-18 Thread Greg Hogan
+1 (binding)

- verified source and binary signatures
- verified source and binary checksums
- verified LICENSEs
- verified NOTICEs
- built from source

Greg

> On Jun 14, 2017, at 10:14 AM, Robert Metzger  wrote:
> 
> Dear Flink community,
> 
> Please vote on releasing the following candidate as Apache Flink version
> 1.3.1.
> 
> The commit to be voted on:
> http://git-wip-us.apache.org/repos/asf/flink/commit/7cfe62b9
> 
> Branch:
> release-1.3.1-rc1
> 
> The release artifacts to be voted on can be found at:
> *http://people.apache.org/~rmetzger/flink-1.3.1-rc1/
> *
> 
> The release artifacts are signed with the key with fingerprint D9839159:
> http://www.apache.org/dist/flink/KEYS
> 
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapacheflink-1124
> 
> 
> -
> 
> 
> The vote ends on Monday (5pm CEST), June 19th, 2017.
> 
> [ ] +1 Release this package as Apache Flink 1.3.1
> [ ] -1 Do not release this package, because ...



Re: [DISCUSS] GitBox

2017-06-18 Thread Greg Hogan
My understanding is that with GitBox project committers who have linked Apache 
and GitHub accounts are given organization write permissions. Other 
contributors will continue to have read permissions.
  
https://help.github.com/articles/repository-permission-levels-for-an-organization/

The last comment noting the “split-brain” shouldn’t preclude the use of GitBox, 
but we should come to a general consensus before switching to committing into the 
GitHub repo.

If we want to try GitHub for flink-web, a second step could be to switch the 
nascent flink-libraries as well.


> On Jun 18, 2017, at 6:50 AM, Chesnay Schepler <ches...@apache.org> wrote:
> 
> Found some info in this JIRA: 
> https://issues.apache.org/jira/browse/INFRA-14191
> 
> Apparently, Gitbox is still in the beta phase. There are no public docs for 
> it yet.
> 
> Committers are required to link their apache & GitHub accounts, which 
> requires 2FA on GitHub.
> 
> As it stands I would be in favor of Gregs original suggestion of activating 
> it for flink-web as a test bed.
> I would wait with the main repo until we actually have more info and it is a 
> bit more proven.
> 
> On 11.06.2017 19:37, Ufuk Celebi wrote:
>> I would also like to see this happening for both flink-web and flink
>> if it allows committers to have control over the respective repos.
>> 
>> On Sat, Jun 10, 2017 at 4:05 PM, Chesnay Schepler <ches...@apache.org> wrote:
>>> What are the downsides of this? Actually, is there any ASF resource that
>>> outlines what this would enable?
>>> 
>>> One of the threads I saw said that this would also allow committers to
>>> close PRs, assign labels, and such.
>>> This actually sounds very interesting to me for the main repo.
>>> 
>>> 
>>> On 09.06.2017 17:41, Greg Hogan wrote:
>>>> Robert has an open PR from March. I’ve found, for example, PRs adding
>>>> links to talks or slides left open for months.
>>>> 
>>>> I’d suggest Fluo is to Accumulo as flink-web is to the flink repo, and
>>>> that migration looks to be satisfactory.
>>>> 
>>>> 
>>>>> On Jun 9, 2017, at 11:15 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>> 
>>>>> bq. better track the oft-neglected contributions
>>>>> 
>>>>> Do you have an estimate of how many contributions have gone unaddressed in
>>>>> the current infrastructure?
>>>>> 
>>>>> Looking at #2, it seems the Accumulo community hasn't reached consensus yet.
>>>>> 
>>>>> Cheers
>>>>> 
>>>>> On Fri, Jun 9, 2017 at 7:54 AM, Greg Hogan <c...@greghogan.com> wrote:
>>>>> 
>>>>>> All,
>>>>>> 
>>>>>> ASF now has available (and maybe mandatory for new projects or repos)
>>>>>> GitBox [0] which enables bi-directional sync to GitHub and links
>>>>>> committers' accounts, allowing for greater use of GitHub functionality
>>>>>> by
>>>>>> contributors and for committers to perform many tasks otherwise
>>>>>> requiring
>>>>>> INFRA tickets.
>>>>>> 
>>>>>> I'd like to propose moving flink-web [1] to GitBox, using GitHub issues,
>>>>>> and enabling notifications to the mailing lists. Apache Accumulo has
>>>>>> recently discussed [2] this topic with a list of benefits after
>>>>>> migrating
>>>>>> Fluo. By migrating flink-web we can better track the oft-neglected
>>>>>> contributions and also test the waters for future migrations (perhaps
>>>>>> for
>>>>>> the future sub-projects).
>>>>>> 
>>>>>> [0] https://gitbox.apache.org/
>>>>>> [1] https://github.com/apache/flink-web/pulls
>>>>>> [2]
>>>>>> http://apache-accumulo.1065345.n5.nabble.com/DISCUSS-
>>>>>> GitBox-tp21160p21497.html
>>>>>> 
>>>>>> Greg
>>> 
>>> 
> 



[jira] [Created] (FLINK-6903) Activate checkstyle for runtime/akka

2017-06-12 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6903:
-

 Summary: Activate checkstyle for runtime/akka
 Key: FLINK-6903
 URL: https://issues.apache.org/jira/browse/FLINK-6903
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial
 Fix For: 1.4.0






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-6882) Activate checkstyle for runtime/registration

2017-06-09 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6882:
-

 Summary: Activate checkstyle for runtime/registration
 Key: FLINK-6882
 URL: https://issues.apache.org/jira/browse/FLINK-6882
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial
 Fix For: 1.4.0






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6880) Activate checkstyle for runtime/iterative

2017-06-09 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6880:
-

 Summary: Activate checkstyle for runtime/iterative
 Key: FLINK-6880
 URL: https://issues.apache.org/jira/browse/FLINK-6880
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial
 Fix For: 1.4.0






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6879) Activate checkstyle for runtime/memory

2017-06-09 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6879:
-

 Summary: Activate checkstyle for runtime/memory
 Key: FLINK-6879
 URL: https://issues.apache.org/jira/browse/FLINK-6879
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial
 Fix For: 1.4.0






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] GitBox

2017-06-09 Thread Greg Hogan
Robert has an open PR from March. I’ve found, for example, PRs adding links to 
talks or slides left open for months.

I’d suggest Fluo is to Accumulo as flink-web is to the flink repo, and that 
migration looks to be satisfactory.


> On Jun 9, 2017, at 11:15 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> 
> bq. better track the oft-neglected contributions
> 
> Do you have an estimate of how many contributions have gone unaddressed in
> the current infrastructure?
> 
> Looking at #2, it seems the Accumulo community hasn't reached consensus yet.
> 
> Cheers
> 
> On Fri, Jun 9, 2017 at 7:54 AM, Greg Hogan <c...@greghogan.com> wrote:
> 
>> All,
>> 
>> ASF now has available (and maybe mandatory for new projects or repos)
>> GitBox [0] which enables bi-directional sync to GitHub and links
>> committers' accounts, allowing for greater use of GitHub functionality by
>> contributors and for committers to perform many tasks otherwise requiring
>> INFRA tickets.
>> 
>> I'd like to propose moving flink-web [1] to GitBox, using GitHub issues,
>> and enabling notifications to the mailing lists. Apache Accumulo has
>> recently discussed [2] this topic with a list of benefits after migrating
>> Fluo. By migrating flink-web we can better track the oft-neglected
>> contributions and also test the waters for future migrations (perhaps for
>> the future sub-projects).
>> 
>> [0] https://gitbox.apache.org/
>> [1] https://github.com/apache/flink-web/pulls
>> [2]
>> http://apache-accumulo.1065345.n5.nabble.com/DISCUSS-
>> GitBox-tp21160p21497.html
>> 
>> Greg


[jira] [Created] (FLINK-6878) Activate checkstyle for runtime/query

2017-06-09 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6878:
-

 Summary:  Activate checkstyle for runtime/query
 Key: FLINK-6878
 URL: https://issues.apache.org/jira/browse/FLINK-6878
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial
 Fix For: 1.4.0






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[DISCUSS] GitBox

2017-06-09 Thread Greg Hogan
All,

ASF now has available (and maybe mandatory for new projects or repos)
GitBox [0] which enables bi-directional sync to GitHub and links
committers' accounts, allowing for greater use of GitHub functionality by
contributors and for committers to perform many tasks otherwise requiring
INFRA tickets.

I'd like to propose moving flink-web [1] to GitBox, using GitHub issues,
and enabling notifications to the mailing lists. Apache Accumulo has
recently discussed [2] this topic with a list of benefits after migrating
Fluo. By migrating flink-web we can better track the oft-neglected
contributions and also test the waters for future migrations (perhaps for
the future sub-projects).

[0] https://gitbox.apache.org/
[1] https://github.com/apache/flink-web/pulls
[2]
http://apache-accumulo.1065345.n5.nabble.com/DISCUSS-GitBox-tp21160p21497.html

Greg


[jira] [Created] (FLINK-6877) Activate checkstyle for runtime/security

2017-06-09 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6877:
-

 Summary: Activate checkstyle for runtime/security
 Key: FLINK-6877
 URL: https://issues.apache.org/jira/browse/FLINK-6877
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial
 Fix For: 1.4.0






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [POLL] Who still uses Java 7 with Flink ?

2017-06-08 Thread Greg Hogan
Are these not two different issues?
- adding builds for Scala 2.12
- upgrading to Java version 1.8

It may be time to switch, but I haven’t seen anything in FLINK-5005 which 
prevents simply adding Scala 2.12 to our supported build matrix and continuing 
to build 2.10 / 2.11 against Java 1.7.

Greg


> On Jun 8, 2017, at 11:39 AM, Robert Metzger <rmetz...@apache.org> wrote:
> 
> Hi all,
> 
> as promised in March, I want to revive this discussion!
> 
> Our users are begging for Scala 2.12 support [1], migration to Akka 2.4 would 
> solve a bunch of shading / dependency issues (Akka 2.4 will remove Akka's 
> protobuf dependency [2][3]), and Java 8's new language features all 
> speak for dropping Java 7.
> 
> Java 8 was released in March 2014. Java 7 has been unsupported since June 
> 2016.
> 
> So what's the feeling in the community regarding the step?
> 
> 
> [1] https://issues.apache.org/jira/browse/FLINK-5005# 
> <https://issues.apache.org/jira/browse/FLINK-5005#>
> [2] https://issues.apache.org/jira/browse/FLINK-5989 
> <https://issues.apache.org/jira/browse/FLINK-5989>
> [3] 
> https://issues.apache.org/jira/browse/FLINK-3211?focusedCommentId=15274018=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15274018
>  
> <https://issues.apache.org/jira/browse/FLINK-3211?focusedCommentId=15274018=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15274018>
> 
> 
> On Thu, Mar 23, 2017 at 2:42 PM, Theodore Vasiloudis 
> <theodoros.vasilou...@gmail.com <mailto:theodoros.vasilou...@gmail.com>> 
> wrote:
> Hello all,
> 
> I'm sure you've considered this already, but what this data does not include 
> is all the potential future users,
> i.e. slower-moving organizations (banks, etc.) which could still be on Java 7.
> 
> Whether those are relevant is up for debate.
> 
> Cheers,
> Theo
> 
> On Thu, Mar 23, 2017 at 12:14 PM, Robert Metzger <rmetz...@apache.org 
> <mailto:rmetz...@apache.org>> wrote:
> Yeah, you are right :)
> I'll put something in my calendar for end of May.
> 
> On Thu, Mar 23, 2017 at 12:12 PM, Greg Hogan <c...@greghogan.com 
> <mailto:c...@greghogan.com>> wrote:
> Robert,
> 
> Thanks for the report. Shouldn’t we be revisiting this decision at the 
> beginning of the new release cycle rather than near the end? There is 
> currently little cost to staying with Java 7 since no Flink code or pull 
> requests have been written for Java 8.
> 
> Greg
> 
> 
> 
>> On Mar 23, 2017, at 6:37 AM, Robert Metzger <rmetz...@apache.org 
>> <mailto:rmetz...@apache.org>> wrote:
>> 
>> Looks like 9% on twitter and 24% on the mailing list are still using Java 7.
>> 
>> I would vote to keep supporting Java 7 for Flink 1.3 and then revisit once 
>> we are approaching 1.4 in September.
>> 
>> On Thu, Mar 16, 2017 at 8:00 AM, Bowen Li <bowen...@offerupnow.com 
>> <mailto:bowen...@offerupnow.com>> wrote:
>> There's always a tradeoff we need to make. I'm in favor of upgrading to Java 
>> 8 to bring in all new Java features.
>> 
>> The common way I've seen (and agree with) for other software to upgrade major things 
>> like this is to 1) upgrade in the next big release without backward compatibility 
>> and notify everyone, and 2) maintain and patch the current, old-tech-compatible 
>> version at a reasonably limited scope. Building backward compatibility is 
>> too much for an open source project
>> 
>> 
>> 
>> On Wed, Mar 15, 2017 at 7:10 AM, Robert Metzger <rmetz...@apache.org 
>> <mailto:rmetz...@apache.org>> wrote:
>> I've put it also on our Twitter account:
>> https://twitter.com/ApacheFlink/status/842015062667755521 
>> <https://twitter.com/ApacheFlink/status/842015062667755521>
>> 
>> On Wed, Mar 15, 2017 at 2:19 PM, Martin Neumann <martin.neum...@ri.se 
>> <mailto:martin.neum...@ri.se>>
>> wrote:
>> 
>> > I think this easier done in a straw poll than in an email conversation.
>> > I created one at: http://www.strawpoll.me/12535073 
>> > <http://www.strawpoll.me/12535073>
>> > (Note that you have multiple choices.)
>> >
>> >
>> > Though I prefer Java 8, most of the time I have to work on Java 7. A lot of
>> > the infrastructure I work on still runs Java 7; one of the companies I
>> > built a prototype for a while back only updated to Java 7 two years ago. I
>> > doubt we can ditch Java 7 support any time soon if we want to make it easy
>> > for companies to use Flink.
>> >
>> > cheers Martin
>>

[jira] [Created] (FLINK-6872) Add MissingOverride to checkstyle

2017-06-08 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6872:
-

 Summary: Add MissingOverride to checkstyle
 Key: FLINK-6872
 URL: https://issues.apache.org/jira/browse/FLINK-6872
 Project: Flink
  Issue Type: New Feature
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor


[Verifies|http://checkstyle.sourceforge.net/config_annotation.html#MissingOverride]
 that the java.lang.Override annotation is present when the @inheritDoc javadoc 
tag is present.
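
Enabling the check should only require (a sketch):

{code}
<module name="MissingOverride"/>
{code}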



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Update Netty version

2017-06-07 Thread Greg Hogan
Hi Alexey,

Are you looking to create pull requests for upgrading Netty 4.0 and/or 4.1?
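
For reference, a rough sketch of the copy-preserving decoder you describe
(untested; the override signature is from Netty 4.0.x):

  // imports: io.netty.buffer.ByteBuf, io.netty.channel.ChannelHandlerContext,
  //          io.netty.handler.codec.LengthFieldBasedFrameDecoder
  public class LengthFieldBasedCopyFrameDecoder extends LengthFieldBasedFrameDecoder {

      public LengthFieldBasedCopyFrameDecoder(
              int maxFrameLength, int lengthFieldOffset, int lengthFieldLength) {
          super(maxFrameLength, lengthFieldOffset, lengthFieldLength);
      }

      @Override
      protected ByteBuf extractFrame(ChannelHandlerContext ctx, ByteBuf buffer, int index, int length) {
          // copy instead of slicing, keeping the pre-4.0.27 behavior
          ByteBuf frame = ctx.alloc().buffer(length);
          frame.writeBytes(buffer, index, length);
          return frame;
      }
  }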

Greg

On Thu, May 18, 2017 at 4:41 AM, Alexey Demin  wrote:

> Hi
>
> The problem is not directly in Flink, but if you use Flink with Beam then the
> classpath has the original Netty 4.0.27 from Flink and Netty 4.1.x from
> Beam (gRPC uses Netty 4.1.x for communication).
>
> Another interest (specific to me right now): Netty has a custom wrapper for the
> OpenSSL library which performs better than the default JDK version; when SSL
> cluster communication is enabled, it would be very useful to let users
> select the SSL implementation (default, openssl, boringssl).
>
> But Netty has a lot of fixes for openssl/boringssl in the latest versions, so it
> is preferable to update Netty as a first step and enable selecting the SSL
> engine as a second step, not all in one step.
>
> > However, if we do the change at the beginning of the next release cycle,
> we
> > might have enough exposure time to verify whether things work or not.
>
> We just start 1.4 iteration and have time for testing.
>
> Thank,
> Alexey Demin
>
>
> 2017-05-18 11:48 GMT+04:00 Till Rohrmann :
>
> > Hi Alexey,
> >
> > thanks for looking into it. Are we currently facing any problems with
> Netty
> > 4.0.27 (bugs or performance)? I agree that in general we should try to
> use
> > the latest bug fix release. However, in the past we have seen that they
> > might entail some slight behaviour changes which breaks things on the
> Flink
> > side. Since Netty is quite crucial for Flink, I would be extra careful
> here
> > when bumping versions, especially if there is no strong need for it.
> >
> > However, if we do the change at the beginning of the next release cycle,
> we
> > might have enough exposure time to verify whether things work or not.
> >
> > Cheers,
> > Till
> >
> > On Thu, May 18, 2017 at 8:51 AM, Alexey Demin 
> wrote:
> >
> > > Hi
> > >
> > > We currently use a very old Netty version.
> > >
> > > Netty 4.0.27.Final was released on 02-Apr-15.
> > >
> > > If we are worried about the slice in LengthFieldBasedFrameDecoder, we can add
> > > a custom LengthFieldBasedCopyFrameDecoder which extends the original
> > > LengthFieldBasedFrameDecoder and overrides extractFrame to keep the current
> > > behavior.
> > >
> > > With these small changes we can update to the latest 4.0.x.
> > >
> > > LengthFieldBasedFrameDecoder is currently also used in KvStateClient and
> > > KvStateServer. Can we keep using the original LengthFieldBasedFrameDecoder,
> > > or must that also change to LengthFieldBasedCopyFrameDecoder?
> > >
> > > If we want, we can migrate to 4.1.
> > > I already ran tests and everything works correctly; small changes to
> > > NettyBufferPool.java and ChunkedByteBuf.java are required (implementing a new
> > > method added to the interface).
> > >
> > >
> > > Thanks
> > > Alexey Diomin
> > >
> >
>


Re: [DISCUSS] Planning Release 1.4

2017-06-01 Thread Greg Hogan
I’d like to propose keeping the same schedule but moving branch forking from the 
feature freeze to the code freeze. The early fork required duplicate 
verification and commits for numerous bug fixes and minor features which had 
been reviewed but were still queued. There did not appear to be much new 
development merged to master between the freezes.

Greg


> On Jun 1, 2017, at 11:26 AM, Robert Metzger  wrote:
> 
> Hi all,
> 
> Flink 1.2 was released on February 2, Flink 1.3 on June 1, which means
> we've managed to release Flink 1.3 in almost exactly 4 months!
> 
> For the 1.4 release, I've put the following deadlines into the wiki [1]:
> 
> *Next scheduled major release*: 1.4.0
> *Feature freeze (branch forking)*:  4. September 2017
> *Code freeze (first voting RC)*:  18 September 2017
> *Release date*: 29 September 2017
> 
> I'll try to send a message every month into this thread to have a countdown
> to the next feature freeze.
> 
> 
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Release+and+Feature+Plan



[DISCUSS] NOTICE

2017-05-31 Thread Greg Hogan
In several locations we have copied code from external projects, for
example
flink-scala-shell/src/main/java/org/apache/flink/api/java/JarHelper.java
links to the file from org.apache.xmlbeans/xmlbeans/2.4.0. We also have
copied from Apache's Calcite, Spark, and Hadoop.

None of these projects are referenced in Flink's NOTICE or LICENSE. Is this
unnecessary because all Apache project code is copyrighted by the ASF (or
in the public domain)?

Also, lodash is cited in Flink's NOTICE but there is no lodash NOTICE at
https://github.com/lodash/lodash. We do properly cite the project in the
MIT License section of Flink's LICENSE.

Greg


[jira] [Created] (FLINK-6779) Activate strict checkstyle in flink-scala

2017-05-30 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6779:
-

 Summary: Activate strict checkstyle in flink-scala
 Key: FLINK-6779
 URL: https://issues.apache.org/jira/browse/FLINK-6779
 Project: Flink
  Issue Type: Sub-task
  Components: Scala API
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6778) Activate strict checkstyle for flink-dist

2017-05-30 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6778:
-

 Summary: Activate strict checkstyle for flink-dist
 Key: FLINK-6778
 URL: https://issues.apache.org/jira/browse/FLINK-6778
 Project: Flink
  Issue Type: Sub-task
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6777) Activate strict checkstyle for flink-scala-shell

2017-05-30 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6777:
-

 Summary: Activate strict checkstyle for flink-scala-shell
 Key: FLINK-6777
 URL: https://issues.apache.org/jira/browse/FLINK-6777
 Project: Flink
  Issue Type: Sub-task
  Components: Scala Shell
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [VOTE] Release Apache Flink 1.3.0 (RC3)

2017-05-30 Thread Greg Hogan
+1 (binding)

- verified source and binary signatures
- verified source and binary checksums
- verified LICENSEs
- verified NOTICEs
- built from source

Greg


> On May 26, 2017, at 12:58 PM, Robert Metzger  wrote:
> 
> Hi all,
> 
> this is the second VOTEing release candidate for Flink 1.3.0
> 
> The commit to be voted on:
> 760eea8a 
> (*http://git-wip-us.apache.org/repos/asf/flink/commit/760eea8a
> *)
> 
> Branch:
> release-1.3.0-rc3
> 
> The release artifacts to be voted on can be found at:
> http://people.apache.org/~rmetzger/flink-1.3.0-rc3
> 
> 
> The release artifacts are signed with the key with fingerprint D9839159:
> http://www.apache.org/dist/flink/KEYS
> 
> The staging repository for this release can be found at:
> *https://repository.apache.org/content/repositories/orgapacheflink-1122
> *
> 
> -
> 
> 
> The vote ends on Tuesday (May 30th), 7pm CET.
> 
> [ ] +1 Release this package as Apache Flink 1.3.0
> [ ] -1 Do not release this package, because ...



[jira] [Created] (FLINK-6709) Activate strict checkstyle for flink-gellies

2017-05-24 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6709:
-

 Summary: Activate strict checkstyle for flink-gellies
 Key: FLINK-6709
 URL: https://issues.apache.org/jira/browse/FLINK-6709
 Project: Flink
  Issue Type: Sub-task
  Components: Gelly
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial
 Fix For: 1.4.0






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6707) Activate strict checkstyle for flink-examples

2017-05-24 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6707:
-

 Summary: Activate strict checkstyle for flink-examples
 Key: FLINK-6707
 URL: https://issues.apache.org/jira/browse/FLINK-6707
 Project: Flink
  Issue Type: Sub-task
  Components: Examples
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial
 Fix For: 1.4.0






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] Backwards compatibility policy.

2017-05-22 Thread Greg Hogan
I can’t find when the time-based maintenance schedule switched from “6 months” 
to “2 concurrent versions” (which has not yet made it into the website [0]). Is 
it correct to assume that most users are waiting until the first bug fix 
release or later to upgrade? That only leaves a narrow window of stability.

Greg

[0] https://github.com/apache/flink-web/pull/50



> On May 22, 2017, at 1:39 AM, Tzu-Li (Gordon) Tai  wrote:
> 
> Hi Kostas,
> 
> Thanks for bringing this up!
> I think it is reasonable to keep this coherent with our time-based release 
> model guarantees.
> 
> With the time-based release model, there is a guarantee that the current 
> latest major version and the previous one are supported.
> For example, upon releasing 1.3, only 1.3 and 1.2 will still be supported by 
> the community for any required bug fixes.
> I think this was initially decided not only to ease old-version maintenance 
> efforts for the community, but also as a means to let users upgrade their 
> Flink versions at a reasonable pace (at least every other major release).
> 
> Therefore, I think its also reasonable to also clearly state that savepoints 
> compatibility will only be guaranteed for the previous release.
> Although I think at the moment almost all, if not all, of the current code still 
> maintains compatibility with 1.1, in the long run this migration code would 
> definitely start to pile up and pollute the actual codebase if we try to 
> always be compatible with all previous versions.
> 
> Cheers,
> Gordon
> 
> 
> On 21 May 2017 at 2:24:53 AM, Kostas Kloudas (k.klou...@data-artisans.com) 
> wrote:
> 
> Hi Chesnay, 
> 
> I believe that for APIs we already have a pretty clear policy with the 
> annotations. 
> I was referring to savepoints and state related backwards compatibility. 
> 
> 
>> On May 20, 2017, at 7:20 PM, Chesnay Schepler  wrote: 
>> 
>> I think it would be a good to clarify what kind of backwards-compatibilitiy 
>> we're talking about here. As in are we talking about APIs or savepoints? 
>> 
>> On 20.05.2017 19:09, Kostas Kloudas wrote: 
>>> Hi all, 
>>> 
>>> As we are getting closer to releasing Flink-1.3, I would like to open a 
>>> discussion 
>>> on how far back we provide backwards compatibility for. 
>>> 
>>> The reason for opening the discussion is that i) for the users and for the 
>>> adoption of the project, it is good to have an explicitly stated policy 
>>> that implies 
>>> certain guarantees, and ii) keeping code and tests for backwards 
>>> compatibility with 
>>> Flink-1.1 does not offer much. On the contrary, I think that it leads to: 
>>> 
>>> 1) dead or ugly code in the codebase, e.g. deprecated class fields that 
>>> could go away and 
>>> ugly if() loops (see aligned window operators that were deprecated in 1.2 
>>> and are now 
>>> normal windows), etc 
>>> 2) expensive tests (as, normally, they read from a savepoint) 
>>> 3) binary files in the codebase for holding the aforementioned savepoints 
>>> 
>>> My proposal for such a policy would be to offer backwards compatibility for 
>>> one previous version. 
>>> 
>>> This means that 1.3 will be compatible with 1.2 (not 1.1). This still 
>>> allows a clear 
>>> "backwards compatibility" path when jumping versions (a user that goes 
>>> from 1.1 to 1.3 can go initially 1.1 -> 1.2, take a savepoint, and then 1.2 
>>> -> 1.3), 
>>> while also allowing us to clean up the codebase a bit. 
>>> 
>>> What do you think? 
>>> 
>>> Kostas



[jira] [Created] (FLINK-6648) Transforms for Gelly examples

2017-05-19 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6648:
-

 Summary: Transforms for Gelly examples
 Key: FLINK-6648
 URL: https://issues.apache.org/jira/browse/FLINK-6648
 Project: Flink
  Issue Type: Improvement
  Components: Gelly
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
 Fix For: 1.4.0


A primary objective of the Gelly examples {{Runner}} is to make adding new 
inputs and algorithms as simple and powerful as possible. A recent feature made 
it possible to translate the key ID of generated graphs to alternative numeric 
or string representations. For floating point and {{LongValue}} it is desirable 
to translate the key ID of the algorithm results.

Currently a {{Runner}} job consists of an input, an algorithm, and an output. A 
{{Transform}} will translate the input {{Graph}} and the algorithm output 
{{DataSet}}. The {{Input}} and algorithm {{Driver}} will return an ordered list 
of {{Transform}}s which will be executed in that order (processed in reverse 
order for the algorithm output). A {{Transform}} can be configured, as can inputs 
and drivers.

Example transforms:
- the aforementioned translation of key ID types
- surrogate types (String -> Long or Int) for user data
- FLINK-4481 Maximum results for pairwise algorithms
- FLINK-3625 Graph algorithms to permute graph labels and edges
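
A rough sketch of the shape such an interface might take (names and generics 
are purely illustrative):

{code}
// sketch only; not an existing interface
public interface Transform<II, IO, RI, RO> {

    // applied to the input Graph before the Driver runs
    IO transformInput(II input) throws Exception;

    // applied to the algorithm result, in reverse order of the
    // transform list, before the Output
    RO transformResult(RI result) throws Exception;
}
{code}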



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] Release 1.3.0 RC1 (Non voting, testing release candidate)

2017-05-18 Thread Greg Hogan
The following tickets for 1.3.0 have a PR in need of review:

[FLINK-6582] [docs] Project from maven archetype is not buildable by default
[FLINK-6616] [docs] Clarify provenance of official Docker images


> On May 18, 2017, at 5:40 AM, Fabian Hueske  wrote:
> 
> I have a couple of PRs ready with bugfixes that I'll try to get in as well.
> Should be done soon.
> 
> 2017-05-18 11:24 GMT+02:00 Till Rohrmann :
> 
>> I'd like to get a fix in for
>> https://issues.apache.org/jira/browse/FLINK-6612. This can basically
>> thwart
>> Flink's recovery capabilities.
>> 
>> On Thu, May 18, 2017 at 11:13 AM, Chesnay Schepler 
>> wrote:
>> 
>>> This PR reduces logging noise a bit: (got +1 to merge)
>>> https://github.com/apache/flink/pull/3917
>>> 
>>> This PR fixes the compilation on Windows:  (reviewed once, most recent
>>> changes not reviewed)
>>> https://github.com/apache/flink/pull/3854
>>> 
>>> This PR enables a test for savepoint compatibility: (nice to have, easy
>> to
>>> review)
>>> https://github.com/apache/flink/pull/3854
>>> 
>>> These 2 PRs fix minor issues with metrics: (trivial review, both
>>> one-liners)
>>> https://github.com/apache/flink/pull/3906
>>> https://github.com/apache/flink/pull/3907
>>> 
>>> 
>>> On 18.05.2017 10:52, Robert Metzger wrote:
>>> 
 I will.
 Actually I had it already on my radar because it's one of the three
 remaining blockers.
 
 Your JIRA already has a PR, so I guess it's on a good track; for the other
 blockers, I think it's fine to release without having them fixed.
 Is there anything else we need to get into the 1.3.0 release?
 Otherwise, I will soon create the first voting RC.
 
 
 
 On Wed, May 17, 2017 at 8:49 PM, Eron Wright 
 wrote:
 
 Robert, please add FLINK-6606 to the list of JIRAs that you're tracking,
> thanks.
> 
> On Tue, May 16, 2017 at 8:30 AM, Robert Metzger 
> wrote:
> 
> I totally forgot to post a document with testing tasks in the RC0
>> thread,
>> so I'll do it in the RC1 thread.
>> 
>> Please use this document:
>> https://docs.google.com/document/d/11WCfV15VwQNF-
>> Rar4E0RtWiZw1ddEbg5WWf4RFSQ_2Q/edit#
>> 
>> If I have the feeling that not enough people are seeing the document,
>> 
> I'll
> 
>> write a dedicated email to user@ and dev@ :)
>> 
>> 
>> On Tue, May 16, 2017 at 9:26 AM, Robert Metzger 
>> wrote:
>> 
>> Thanks for the pointer. I'll keep an eye on the JIRA.
>>> 
>>> I've gone through the JIRAs tagged with 1.3.0 yesterday to create a
>>> 
>> list
> 
>> of new features in 1.3. Feel free to add more / change it in the wiki:
>>> https://cwiki.apache.org/confluence/display/FLINK/
>>> Flink+Release+and+Feature+Plan#FlinkReleaseandFeaturePlan-Flink1.3
>>> 
>>> On Mon, May 15, 2017 at 10:29 PM, Gyula Fóra 
>>> 
>> wrote:
>> 
>>> Thanks Robert,
 
 Just for the record I think there are still some problems with
 
>>> incremental
>> 
>>> snapshots, I think Stefan is still working on it.
 
 I added some comments to https://issues.apache.org/
 
>>> jira/browse/FLINK-6537
>> 
>>> Gyula
 
 Robert Metzger  ezt írta (időpont: 2017. máj.
 
>>> 15.,
> 
>> H,
 19:41):
 
 Hi Devs,
> 
> This is the second non-voting RC. The last RC had some big issues,
> 
 making
 
> it hard to start Flink locally. I hope this RC proves to be more
> 
 stable.
>> 
>>> I hope to create the first voting RC by end of this week.
> 
> 
> 
 -
> 
>> The release commit is 3659a82f553fedf8afe8b5fae75922075fe17e85
> 
> The artifacts are located here:
> http://people.apache.org/~rmetzger/flink-1.3.0-rc1/
> 
> The maven staging repository is here:
> https://repository.apache.org/content/repositories/
> 
 orgapacheflink-1119
>> 
>>> 
> 
 -
> 
>> Happy testing!
> 
> Regards,
> Robert
> 
> 
>>> 
>>> 
>> 



[jira] [Created] (FLINK-6616) Clarify provenance of official Docker images

2017-05-17 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6616:
-

 Summary: Clarify provenance of official Docker images
 Key: FLINK-6616
 URL: https://issues.apache.org/jira/browse/FLINK-6616
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.3.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Critical
 Fix For: 1.3.0


Note that the official Docker images for Flink are community supported and not 
an official release of the Apache Flink PMC.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6603) Enable checkstyle on test sources

2017-05-16 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6603:
-

 Summary: Enable checkstyle on test sources
 Key: FLINK-6603
 URL: https://issues.apache.org/jira/browse/FLINK-6603
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial
 Fix For: 1.4.0


With the addition of strict checkstyle to select modules (currently limited to 
{{flink-streaming-java}}) we can enable the checkstyle flag 
{{includeTestSourceDirectory}} to perform the same unused imports, whitespace, 
and other checks on test sources.

Should first resolve the import grouping as discussed in FLINK-6107. Also, 
several tests exceed the 2500 line limit.
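
For reference, a sketch of how the flag can be set on the Maven Checkstyle 
plugin in a module's {{pom.xml}}. {{includeTestSourceDirectory}} is a real 
plugin option; the surrounding configuration is abbreviated and would follow 
whatever the strict checkstyle setup already defines:

{code:language=xml|title=Enabling checkstyle on test sources}
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-checkstyle-plugin</artifactId>
  <configuration>
    <!-- run the same checks over src/test/java as over src/main/java -->
    <includeTestSourceDirectory>true</includeTestSourceDirectory>
  </configuration>
</plugin>
{code}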



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] Release 1.3.0 RC0 (Non voting, testing release candidate)

2017-05-12 Thread Greg Hogan
+1 for sticking to the code freeze deadline and building a new release 
candidate, but since the release is still two weeks off (5/26) I think it is 
better to delay voting to give time for additional bug fixes.


> On May 11, 2017, at 10:19 AM, Robert Metzger  wrote:
> 
> It seems that we found quite a large number of critical issues in the first
> RC.
> 
> - FLINK-6537 Umbrella issue for fixes to incremental snapshots (Stefan has
> a PR open to fix the critical ones)
> - FLINK-6531 Deserialize checkpoint hooks with user classloader (has a
> pending PR)
> - FLINK-6515 KafkaConsumer checkpointing fails because of ClassLoader
> issues (status unknown)
> - FLINK-6514 Cannot start Flink Cluster in standalone mode (Stephan is on
> it)
> - FLINK-6508 Include license files of packaged dependencies (Stephan is on
> it) + FLINK-6501 Make sure NOTICE files are bundled into shaded JAR files
> - FLINK-6284 Incorrect sorting of completed checkpoints in
> ZooKeeperCompletedCheckpointStore (unknown)
> 
> I would like to get these issues fixed by end of this week (Sunday), so
> that I can create the first voting RC on Monday morning.
> Please reject if you think we will not manage to get the stuff fixed until
> then.
> 
> 
> 
> On Thu, May 11, 2017 at 10:54 AM, Till Rohrmann 
> wrote:
> 
>> Unfortunately, it won't be fully functional in 1.3.
>> 
>> Cheers,
>> Till
>> 
>> On Thu, May 11, 2017 at 10:45 AM, Renjie Liu 
>> wrote:
>> 
>>> @Rohrmann Will FLIP 6 be fully functional in 1.3 release?
>>> 
>>> On Thu, May 11, 2017 at 4:12 PM Gyula Fóra  wrote:
>>> 
 Thanks Stefan!
 Gyula
 
 Stefan Richter  ezt írta (időpont: 2017.
>>> máj.
 11., Cs, 10:04):
 
> 
> Hi,
> 
> Thanks for reporting this. I found a couple of issues yesterday and I
>>> am
> currently working on a bundle of fixes. I will take a look at this
 problem,
> and if it is already covered.
> 
> Best,
> Stefan
> 
>> Am 11.05.2017 um 09:47 schrieb Gyula Fóra :
>> 
>> Hi,
>> I am not sure if this belong to this thread, but while trying to
>> run
>>> a
> job
>> with rocks incremental backend I ran into 2 issues:
>> 
>> One with savepoints, I can't figure out because I can't make sense
>> of
 the
>> error or how it happenned:
>> The error stack trace is here:
>> https://gist.github.com/gyfora/2f7bb387bbd9f455f9702908cde0b239
>> This happens on every savepoint attempt and seems to be related to
>>> the
>> kafka source. Interestingly other tasks succeed in writing data to
 hdfs.
>> 
>> The other one is covered by
> https://issues.apache.org/jira/browse/FLINK-6531 I
>> guess. I am not sure if the first one is related though.
>> 
>> Thank you!
>> Gyula
>> 
>> Till Rohrmann  ezt írta (időpont: 2017. máj.
 11.,
> Cs,
>> 9:14):
>> 
>>> Hi Renjie,
>>> 
>>> 1.3 already contains some Flip-6 code. However, it is not yet
>> fully
>>> functional. What you already can do is to run local jobs on the
>>> Flip-6
> code
>>> base by instantiating a MiniCluster and then using the
>>> Flip6LocalStreamEnvironment.
>>> 
>>> Cheers,
>>> Till
>>> 
>>> On Thu, May 11, 2017 at 6:00 AM, Renjie Liu <
>>> liurenjie2...@gmail.com>
>>> wrote:
>>> 
 Hi, all:
 Will the FLIP 6 be included in release 1.3?
 
 On Wed, May 10, 2017 at 9:48 PM Gyula Fóra >> 
> wrote:
 
> Thanks you! :)
> 
> Chesnay Schepler  ezt írta (időpont: 2017.
>>> máj.
>>> 10.,
> Sze, 15:44):
> 
>> I guess it is related to this one
>> https://issues.apache.org/jira/browse/FLINK-6514 ?
>> 
>> On 10.05.2017 15:34, Gyula Fóra wrote:
>>> Hi,
>>> 
>>> I tried to run an application on 1.3 but I keep getting the
>>> following
>> error:
>>> java.lang.NoClassDefFoundError: Could not initialize class
>>> org.apache.hadoop.security.UserGroupInformation
>>> at
>>> 
>> 
> org.apache.flink.runtime.security.modules.HadoopModule.
 install(HadoopModule.java:45)
>>> at
>>> 
>> 
> org.apache.flink.runtime.security.SecurityUtils.
 install(SecurityUtils.java:78)
>>> at org.apache.flink.client.CliFrontend.main(CliFrontend.
>>> java:1128)
>>> 
>>> Even after adding hadoop-common to the lib manually (which I
>>> guess
> should
>>> not be necessary).
>>> 
>>> Any idea what might cause this?
>>> 
>>> Thanks,
>>> Gyula
>>> 
>>> Chesnay Schepler  ezt írta (időpont:
>> 

git overwritten by checkout

2017-05-12 Thread Greg Hogan
The following file in the following commit threw the following error when 
rebasing to master. I agree that case-insensitive filesystems are an 
abomination and hopefully never supported by APFS, but I just wanted to note 
the situation. With a clean working directory I solved this with `git fetch 
apache master` and `git reset --hard FETCH_HEAD`.

Greg

.../scala/org/apache/flink/table/sources/{DefinedTimeAttributes.scala → 
definedTimeAttributes.scala}

https://github.com/apache/flink/commit/b50ef4b8de73e0e19df154d87ea588236e3ccb43 


git rebase apache/master
First, rewinding head to replay your work on top of it...
error: The following untracked working tree files would be overwritten by 
checkout:

flink-libraries/flink-table/src/main/scala/org/apache/flink/table/sources/definedTimeAttributes.scala
Please move or remove them before you switch branches.
Aborting
could not detach HEAD


[jira] [Created] (FLINK-6560) Restore maven parallelism in flink-tests

2017-05-11 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6560:
-

 Summary: Restore maven parallelism in flink-tests
 Key: FLINK-6560
 URL: https://issues.apache.org/jira/browse/FLINK-6560
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 1.3.0, 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor
 Fix For: 1.3.0, 1.4.0


FLINK-6506 added the maven variable {{flink.forkCountTestPackage}} which is 
used by the TravisCI script but no default value is set.
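
A sketch of the likely shape of the fix: declare a default for the property in 
the parent {{pom.xml}} so local builds fork in parallel again, while the 
TravisCI script continues to override it with 
{{-Dflink.forkCountTestPackage=...}}. The default value shown is an 
assumption, not the agreed setting:

{code:language=xml|title=Sketch: default value for flink.forkCountTestPackage}
<properties>
  <!-- forks for the flink-tests package; "1C" means one fork per CPU core -->
  <flink.forkCountTestPackage>1C</flink.forkCountTestPackage>
</properties>
{code}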



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6466) Build Hadoop 2.8.0 convenience binaries

2017-05-05 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6466:
-

 Summary: Build Hadoop 2.8.0 convenience binaries
 Key: FLINK-6466
 URL: https://issues.apache.org/jira/browse/FLINK-6466
 Project: Flink
  Issue Type: New Feature
  Components: Build System
Affects Versions: 1.3.0
Reporter: Greg Hogan
Assignee: Greg Hogan
 Fix For: 1.3.0


As discussed on the dev mailing list, add Hadoop 2.8 to the 
{{create_release_files.sh}} script and TravisCI test matrix.

If there is consensus then references to binaries for old versions of Hadoop 
could be removed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] What is a "Blocker" in our JIRA?

2017-05-05 Thread Greg Hogan
My reference to post-hoc was in regard to the release. Concurrent dev and doc 
is of course ideal, but I think anyone running a snapshot is aware that docs 
may be somewhat out of sync.

A better link discussing unreleased publication: 
http://www.apache.org/legal/release-policy.html#publication

Our “How to Contribute” works well to filter “developers” but we do link to the 
GitHub repo from the left panel.


> On May 5, 2017, at 9:13 AM, Ufuk Celebi <u...@apache.org> wrote:
> 
> On Fri, May 5, 2017 at 2:24 PM, Greg Hogan <c...@greghogan.com> wrote:
>> -1 to post-hoc documentation and unreleased publication [0]
> 
> Could you quickly confirm that I understand this correctly:
> - Post-hoc documentation means not documenting features separately
> from their initial PRs?
> - Unreleased publication means *not* serving the snapshot docs?
> 
> – Ufuk



Re: [DISCUSS] What is a "Blocker" in our JIRA?

2017-05-05 Thread Greg Hogan
+1 to restricting use of “blocker"

-1 to post-hoc documentation and unreleased publication [0]

[0] http://www.apache.org/dev/release-publishing.html


> On May 5, 2017, at 5:06 AM, Ufuk Celebi  wrote:
> 
> Thanks for this discussion Robert. I agree with your points. +1
> 
> Regarding the documentation: I also agree with Kostas and others that
> it is very important to have good documentation in place. The good
> thing about our current doc deployment model is that we can update the
> docs for a released version even after it has been released. Yes,
> ideally, we would have the docs already in place when we release and
> only do small updates (doc "bug fixes"), but I don't see that
> happening in the near future. Therefore, I would unblock the
> documentation issues as well for now. Maybe this is a point that we
> can revisit in the near future. All in all, I think that the docs have
> made tremendous progress in the last 6 months. :-)
> 
> – Ufuk
> 
> On Fri, May 5, 2017 at 9:47 AM, Robert Metzger  wrote:
>> I understand the motivation to make documentation equally important as bugs
>> in software, but from my point of view as a release manager, it's not easy
>> to keep track of the release status when some blockers are not real
>> blockers (I can't just look at the number in JIRA, but I have to manually
>> go through the list and read every JIRA to get the real number of release
>> blockers)
>> 
>> As I've said in my initial email, a blocker is an issue in the code where
>> no workaround exist and the system is virtually unusable. Missing
>> documentation is always something users can work around.
>> 
>> I agree with Ted that we can still merge documentation changes while
>> testing RC0.
>> 
>> On Fri, May 5, 2017 at 7:48 AM, Aljoscha Krettek 
>> wrote:
>> 
>>> IMHO, the problem with the “add documentation” issues is that they should
>>> ideally have been documented as part of the feature development. (Full
>>> disclosure: I’m not innocent there and the Per-Key Window State Doc is
>>> somewhat my fault.) Sometimes, though, several features are developed over
>>> the course of multiple Jiras and it only makes sense to document the final
>>> new state.
>>> 
>>> Best,
>>> Aljoscha
 On 4. May 2017, at 20:07, Ted Yu  wrote:
 
 I agree with Kostas.
 
 Considering 1.3 has many new features which need non-trivial effort for
 testing, maybe the work on documentation can be done in parallel to
>>> testing
 RC0.
 
 Cheers
 
 On Thu, May 4, 2017 at 10:54 AM, Kostas Kloudas <
>>> k.klou...@data-artisans.com
> wrote:
 
> Hi Robert,
> 
> Thanks for clarifying this so that we all have a common definition of
> what is a blocker.
> 
> The only thing I would disagree, and only is some cases, is the
> documentation part.
> I think that in some cases, things that have to do with documentation
>>> can
> and
> should become blockers. Mainly to motivate devs to take care of them.
> 
> There are important parts of Flink which are not (sufficiently or at
>>> all)
> documented
> and this can result in confusion / load in the mailing list / wrong user
> assumptions.
> 
> Regards,
> Kostas
> 
> 
>> On May 4, 2017, at 7:38 PM, Robert Metzger 
>>> wrote:
>> 
>> Hi Flink Devs,
>> 
>> while checking our JIRA for the 1.3 release, I found that some issues
>>> of
>> type "New Feature" have the priority "Blocker".
>> With the time-based release model, I don't think it's possible to mark a
> new
>> feature as a blocker.
>> 
>> Only a bug that leads to wrong results, system failure or completely
>> undefined behavior (without any available workaround) is a blocker.
>> New features, missing documentation or inconveniences are never
>>> blockers.
>> 
>> I understand that people want to express that their feature is so
> important
>> that it is basically blocking us from creating a release, but with the
>> time-based release model, this doesn't really apply.
>> 
>> If nobody disagrees, I'll un"block" new features to "Major" new
>>> features.
>> 
>> 
>> Regards,
>> Robert
>> 
>> 
>> Examples:
>> https://issues.apache.org/jira/browse/FLINK-6198
>> https://issues.apache.org/jira/browse/FLINK-6178
>> https://issues.apache.org/jira/browse/FLINK-6163
>> https://issues.apache.org/jira/browse/FLINK-6047
>> https://issues.apache.org/jira/browse/FLINK-5978
>> https://issues.apache.org/jira/browse/FLINK-5968
> 
> 
>>> 
>>> 



Re: Supported Hadoop versions

2017-05-03 Thread Greg Hogan
1) Flink 1.2 dropped Hadoop 1 but we don’t have a replacement “default”.

2) I’ll create a ticket to bump to the latest patch versions.

3) Uncertain if we can drop Hadoop 2.3 and/or 2.4.


> On Apr 30, 2017, at 10:46 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> 
> For #1, +1 on dropping hadoop 1.
> 
> For #2, we can reference Hadoop 2.7
> BTW I think we can bump to 2.7.3 as dependency.
> 
> For #3, Hadoop 2.8.0 was marked not production ready. Junping is in the
> process of releasing 2.8.1
> It would be a good idea to start testing against 2.8.0 now
> 
> Cheers
> 
> On Sun, Apr 30, 2017 at 4:16 AM, Greg Hogan <c...@greghogan.com> wrote:
> 
>> Hi Flink,
>> 
>> I filed a ticket [0] that our download page [1] still states
>> 
>> “You don’t have to install Hadoop to use Flink, but if you plan to use
>> Flink with data stored in Hadoop, pick the version matching your installed
>> Hadoop version. If you don’t want to do this, pick the Hadoop 1 version.”
>> 
>> 1) We no longer offer the Hadoop 1 version, which I expect was chosen due
>> to its smaller size. What is the new recommendation?
>> 
>> 2) Should we reference, for example, “Hadoop 2.7” (as with the binary
>> filenames) rather than “Hadoop 2.7.0” since we are actually testing and
>> releasing against Hadoop 2.7.2?
>> 
>> 3) Should Flink 1.3.0 support the recently released Hadoop 2.8.0? Is this
>> the time to drop older versions (which users can easily build)? This would
>> also be the time to bump the patch versions in .travis.yml and
>> create-release-files.sh.
>> 
>> I ask because I think it is important to present this choice well since it
>> is likely to be a new users first decision point.
>> 
>> [0] https://issues.apache.org/jira/browse/FLINK-6399
>> [1] https://flink.apache.org/downloads.html
>> 
>> Greg



[jira] [Created] (FLINK-6414) Use scala.binary.version in place of change-scala-version.sh

2017-04-28 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6414:
-

 Summary: Use scala.binary.version in place of 
change-scala-version.sh
 Key: FLINK-6414
 URL: https://issues.apache.org/jira/browse/FLINK-6414
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Affects Versions: 1.3.0
Reporter: Greg Hogan
Assignee: Greg Hogan


Recent commits have failed to modify {{change-scala-version.sh}} resulting in 
broken builds for {{scala-2.11}}. It looks like we can remove the need for this 
script by replacing hard-coded references to the Scala version with Flink's 
maven variable {{scala.binary.version}}.

I had initially realized that the change script is [only used for 
building|https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/building.html#scala-versions]
 and not for switching the IDE environment.
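
For illustration, a dependency on a Scala-dependent module would then 
reference the Maven variable instead of a hard-coded suffix (the module and 
version below are only examples):

{code:language=xml|title=Using scala.binary.version in place of a hard-coded suffix}
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-scala_${scala.binary.version}</artifactId>
  <version>${project.version}</version>
</dependency>
{code}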




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6399) Update default Hadoop download version

2017-04-27 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6399:
-

 Summary: Update default Hadoop download version
 Key: FLINK-6399
 URL: https://issues.apache.org/jira/browse/FLINK-6399
 Project: Flink
  Issue Type: Bug
  Components: Project Website
Reporter: Greg Hogan


[Update|http://flink.apache.org/downloads.html] "If you don’t want to do this, 
pick the Hadoop 1 version." since Hadoop 1 versions are no longer provided.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6382) Support all numeric types for generated graphs in Gelly examples

2017-04-25 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6382:
-

 Summary: Support all numeric types for generated graphs in Gelly 
examples
 Key: FLINK-6382
 URL: https://issues.apache.org/jira/browse/FLINK-6382
 Project: Flink
  Issue Type: Improvement
  Components: Gelly
Affects Versions: 1.3.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor
 Fix For: 1.3.0


The Gelly examples currently support {{IntValue}}, {{LongValue}}, and 
{{StringValue}} for {{RMatGraph}}. Allow transformations and tests for all 
generated graphs for {{ByteValue}}, {{Byte}}, {{ShortValue}}, {{Short}}, 
{{CharValue}}, {{Character}}, {{Integer}}, {{Long}}, and {{String}}.

This is additionally of interest for benchmarking and testing modifications to 
Flink's internal sort.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6375) Fix LongValue hashCode

2017-04-24 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6375:
-

 Summary: Fix LongValue hashCode
 Key: FLINK-6375
 URL: https://issues.apache.org/jira/browse/FLINK-6375
 Project: Flink
  Issue Type: Improvement
  Components: Core
Affects Versions: 2.0.0
Reporter: Greg Hogan
Priority: Trivial


Match {{LongValue.hashCode}} to {{Long.hashCode}} (and the other numeric types) 
by simply combining the high and low words rather than offsetting the hash by 
adding 43.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6358) Write job details for Gelly examples

2017-04-22 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6358:
-

 Summary: Write job details for Gelly examples
 Key: FLINK-6358
 URL: https://issues.apache.org/jira/browse/FLINK-6358
 Project: Flink
  Issue Type: Improvement
  Components: Gelly
Affects Versions: 1.3.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor


Add an option to write job details to a file in JSON format. Job details 
include: job ID, runtime, parameters with values, and accumulators with values.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6357) Parametertool get unrequested parameters

2017-04-22 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6357:
-

 Summary: Parametertool get unrequested parameters
 Key: FLINK-6357
 URL: https://issues.apache.org/jira/browse/FLINK-6357
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Affects Versions: 1.3.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor


The Gelly examples use {{ParameterTool}} to parse required and optional 
parameters. In the latter case we should detect if a user mistypes a parameter 
name. I would like to add a {{Set<String> getUnrequestedParameters()}} method 
returning parameter names not requested by {{has}} or any of the {{get}} 
methods.
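
A hypothetical usage sketch of the proposed method; the job parameters and the 
fail-fast behavior are invented for illustration:

{code:language=java|title=Sketch: failing fast on mistyped parameters}
import java.util.Set;

import org.apache.flink.api.java.utils.ParameterTool;

public class Example {
    public static void main(String[] args) {
        ParameterTool parameters = ParameterTool.fromArgs(args);

        String input = parameters.get("input");      // requested parameter
        boolean verbose = parameters.has("verbose"); // requested parameter

        // proposed method: parameter names the program never asked about
        Set<String> unrequested = parameters.getUnrequestedParameters();
        if (!unrequested.isEmpty()) {
            throw new IllegalArgumentException("Unknown parameter(s): " + unrequested);
        }
    }
}
{code}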



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6280) Allow logging with Java flags

2017-04-07 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6280:
-

 Summary: Allow logging with Java flags
 Key: FLINK-6280
 URL: https://issues.apache.org/jira/browse/FLINK-6280
 Project: Flink
  Issue Type: Improvement
  Components: Startup Shell Scripts
Affects Versions: 1.3.0
Reporter: Greg Hogan
Assignee: Greg Hogan


Allow configuring Flink's Java options with the logging prefix and log 
rotation. For example, this allows the following configurations to write 
{{.jfr}} and {{.jit}} files alongside the existing {{.log}} and {{.out}} files.

{code:language=bash|title=Configuration for Java Flight Recorder}
env.java.opts: "-XX:+UnlockCommercialFeatures -XX:+UnlockDiagnosticVMOptions 
-XX:+FlightRecorder -XX:+DebugNonSafepoints 
-XX:FlightRecorderOptions=defaultrecording=true,dumponexit=true,dumponexitpath=${LOG_PREFIX}.jfr"
{code}

{code:language=bash|title=Configuration for JitWatch}
env.java.opts: "-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading 
-XX:+LogCompilation -XX:LogFile=${LOG_PREFIX}.jit -XX:+PrintAssembly"
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] FLIP-18: Code Generation for improving sorting performance

2017-04-05 Thread Greg Hogan
Pat,

Thanks for running additional tests and continuing to work on this contribution.

My testing is also showing that the performance gains remain even when multiple 
classes are used for sorting.

I think we should proceed in the order of FLINK-3722, FLINK-4705, and 
FLINK-5734. Gabor has reviewed FLINK-3722 and I’ve done so multiple times. I’m 
looking into test coverage for FLINK-4705. Once these are reviewed and 
FLINK-5734 rebased we can benchmark Flink’s performance to validate the 
improvements.

Greg


> On Apr 3, 2017, at 8:46 PM, Pattarawat Chormai  wrote:
> 
> Hi guys,
> 
> I have made an additional optimization[1] related to the megamorphic call
> issue that Greg mentioned earlier. The optimization[2] improves execution
> time by around 13%, while the original code from FLINK-5734 improves it by
> around 11%.
> 
> IMHO, the improvement from the megamorphic call optimization is very small
> compared to the code we have to introduce. So, I think we can just go with
> the PR that we currently have. What do you think?
> 
> [1]
> https://github.com/heytitle/flink/commit/8e38b4d738b9953337361c62a8d77e909327d28f
> [2]https://docs.google.com/spreadsheets/d/1PcdCdFX4bGecO6Lb5dLI2nww2NoeaA8c3MgbEdsVmk0/edit#gid=598217386
> 
> Best,
> Pat
> 
> 
> 
> --
> View this message in context: 
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-18-Code-Generation-for-improving-sorting-performance-tp16486p16923.html
> Sent from the Apache Flink Mailing List archive. mailing list archive at 
> Nabble.com.



Re: [DISCUSS] Code style / checkstyle

2017-04-05 Thread Greg Hogan
 general guidelines help keep code more readable and
>>> keep
>>>> things simple
>>>> with many contributors and different styles of code writing + language
>>>> features.
>>>> 
>>>> 
>>>> On Mon, Feb 27, 2017 at 8:01 PM, Stephan Ewen <se...@apache.org> wrote:
>>>> 
>>>>> I agree, reformatting 90% of the code base is tough.
>>>>> 
>>>>> There are two main issues:
>>>>> (1) Incompatible merges. This is hard, especially for the folks that
>>>> have
>>>>> to merge the pull requests ;-)
>>>>> 
>>>>> (2) Author history: This is less of an issue, I think. "git log
>>>>> " and "git show  -- " will still work and
>>>> one
>>>>> may have to go one commit back to find out why something was changed
>>>>> 
>>>>> 
>>>>> What I could image is to do this incrementally. Define the code style
>>> in
>>>>> "flink-parent" but do not activate it.
>>>>> Then start with some projects (new projects, plus some others):
>>>>> merge/reject PRs, reformat, activate code style.
>>>>> 
>>>>> Piece by piece. This is realistically going to take a long time until
>>> it
>>>> is
>>>>> pulled through all components, but that's okay, I guess.
>>>>> 
>>>>> Stephan
>>>>> 
>>>>> 
>>>>> On Mon, Feb 27, 2017 at 1:53 PM, Aljoscha Krettek <aljos...@apache.org
>>>> 
>>>>> wrote:
>>>>> 
>>>>>> Just for a bit of context, this is the output of running cloc on the
>>>>> Flink
>>>>>> codebase:
>>>>>> 
>>>>>> ---
>>>>>> Language     files     blank    comment       code
>>>>>> ---
>>>>>> Java          4609    126825     185428     519096
>>>>>> 
>>>>>> => 704,524 lines of code + comments/javadoc
>>>>>> 
>>>>>> When I apply the google style to the Flink code base using
>>>>>> https://github.com/google/google-java-format I get these commit
>>>>>> statistics:
>>>>>> 
>>>>>> 4577 files changed, 647645 insertions(+), 622663 deletions(-)
>>>>>> 
>>>>>> That is, a change to the Google Code Style would touch roughly over
>>> 90%
>>>>> of
>>>>>> all code/comment lines.
>>>>>> 
>>>>>> I would like to have a well defined code style, such as the Google
>>> Code
>>>>>> style, that has nice tooling and support but I don't think we will
>>> ever
>>>>>> convince enough people to do this kind of massive change. Even I
>>> think
>>>>> it's
>>>>>> a bit crazy to change 90% of the code base in one commit.
>>>>>> 
>>>>>> On Mon, 27 Feb 2017 at 11:10 Till Rohrmann <trohrm...@apache.org>
>>>> wrote:
>>>>>> 
>>>>>>> No, I think that's exactly what people mean when saying "losing the
>>>>>> commit
>>>>>>> history". With the reformatting you would have to go manually
>>> through
>>>>> all
>>>>>>> past commits until you find the commit which changed a given line
>>>>> before
>>>>>>> the reformatting.
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> Till
>>>>>>> 
>>>>>>> On Sun, Feb 26, 2017 at 6:32 PM, Alexander Alexandrov <
>>>>>>> alexander.s.alexand...@gmail.com> wrote:
>>>>>>> 
>>>>>>>> Just to clarify - by "losing the commit history" you actually
>>> mean
>>>>>>> "losing
>>>>>>>> the ability to annotate each line in a file with its last
>>> commit",
>>>>>> right?
>>>>>>>> 
>>>>>>>> Or is there some other sense in which something i

[jira] [Created] (FLINK-6268) Object reuse for Either type

2017-04-05 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6268:
-

 Summary: Object reuse for Either type
 Key: FLINK-6268
 URL: https://issues.apache.org/jira/browse/FLINK-6268
 Project: Flink
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.3.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor


While reviewing test coverage for FLINK-4705 I found that {{Either}} only 
implements partial object reuse (when from and to are both {{Right}}). We can 
implement full object reuse if {{Left}} stores a reference to a {{Right}} and 
{{Right}} to a {{Left}}. These references will be {{private}} and will remain 
{{null}} until set by {{EitherSerializer}} when copying or deserializing with 
object reuse.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] Project build time and possible restructuring

2017-03-31 Thread Greg Hogan
Thanks for pursuing this Robert.

I appreciate their receptiveness to increasing the time and memory limits but 
we’ll still be bound by the old limits for our personal repos. Does this change 
any of the proposed actions for splitting the repo?

Has anyone looked into why we see many jobs timeout right at 50 minutes? 
Passing jobs take well under 50 minutes and the 5-minute watchdog timeout is not 
being triggered. Just pulling up a recent build: 
https://travis-ci.org/apache/flink/builds/217034084

Greg


> On Mar 31, 2017, at 9:12 AM, Robert Metzger <rmetz...@apache.org> wrote:
> 
> Good news :)
> A few weeks ago, I got an email from travis asking for feedback. I filled
> out the form and said, that the 50 minutes build time limit is a
> showstopper for us.
> And now, a few weeks later they got back to me and told me that they have
> increased the build time for "apache/flink" to 120 minutes. Also, we can
> set the settings to use the "sudo enabled infrastructure", with 7.5 Gb of
> main memory guaranteed.
> 
> I'll do a push to a separate branch to see how well it works :)
> 
> On Tue, Mar 28, 2017 at 4:36 PM, Robert Metzger <rmetz...@apache.org> wrote:
> 
>> I think your selection of modules is okay.
>> Moving out storm and the scala shell would be nice as well. But storm is
>> not really maintained, so maybe we should consider moving it out of the
>> Flink repo entirely.
>> And the scala shell is not a library, but it also doesn't really  belong
>> into the main repo.
>> 
>> Regarding the feature freeze: We either do it with a lot of  time in
>> advance to avoid any delays for the release, OR we do it right after the
>> release branch has been forked off.
>> 
>> 
>> 
>> On Tue, Mar 21, 2017 at 1:09 PM, Timo Walther <twal...@apache.org> wrote:
>> 
>>> So what do we want to move to the libraries repository?
>>> 
>>> I would propose to move these modules first:
>>> 
>>> flink-cep-scala
>>> flink-cep
>>> flink-gelly-examples
>>> flink-gelly-scala
>>> flink-gelly
>>> flink-ml
>>> 
>>> All other modules (e.g. in flink-contrib) are rather connectors. I think
>>> it would be better to move those in a connectors repository later.
>>> 
>>> If we are not in a rush, we could do the moving after the feature-freeze.
>>> This is the time where most of the PR will have been merged.
>>> 
>>> Timo
>>> 
>>> 
>>> Am 20/03/17 um 15:00 schrieb Greg Hogan:
>>> 
>>>> We can add cluster tests using the distribution jar, and will need to do
>>>> so to remove Flink’s dependency on Hadoop. The YARN and Mesos tests would
>>>> still run nightly and running cluster tests should be much faster. As
>>>> troublesome as TravisCI has been, a major driver for this change has been
>>>> local build time.
>>>> 
>>>> I agree with splitting off one repo at a time, but we’ll first need to
>>>> reorganize the core repo if using git submodules as flink-python and
>>>> flink-table would need to first be moved. So I think planning this out
>>>> first is a healthy idea, with the understanding that the plan will be
>>>> reevaluated.
>>>> 
>>>> Any changes to the project structure need a scheduled period, perhaps a
>>>> week, for existing pull requests to be reviewed and accepted or closed and
>>>> later migrated.
>>>> 
>>>> 
>>>> On Mar 20, 2017, at 6:27 AM, Stephan Ewen <se...@apache.org> wrote:
>>>>> 
>>>>> @Greg
>>>>> 
>>>>> I am personally in favor of splitting "connectors" and "contrib" out as
>>>>> well. I know that @rmetzger has some reservations about the connectors,
>>>>> but
>>>>> we may be able to convince him.
>>>>> 
>>>>> For the cluster tests (yarn / mesos) - in the past there were many cases
>>>>> where these tests caught cases that other tests did not, because they
>>>>> are
>>>>> the only tests that actually use the "flink-dist.jar" and thus discover
>>>>> many dependency and configuration issues. For that reason, my feeling
>>>>> would
>>>>> be that they are valuable in the core repository.
>>>>> 
>>>>> I would actually suggest to do only the library split initially, to see
>>>>> what the challenges are in setting up the multi-repo build and r

Re: [DISCUSS] TravisCI auto cancellation

2017-03-29 Thread Greg Hogan
Wow, that was a quick response that this feature was already enabled.


> On Mar 29, 2017, at 9:31 AM, Greg Hogan <c...@greghogan.com> wrote:
> 
> Ticket: https://issues.apache.org/jira/browse/INFRA-13778
> 
> 
>> On Mar 29, 2017, at 4:07 AM, Till Rohrmann <trohrm...@apache.org> wrote:
>> 
>> Looking at Flink's Travis account, I've got the feeling that this feature
>> has already been activated. At least I see some builds (e.g. PR #3625)
>> where multiple commits where created in a short time and then only the
>> latest was actually executed. Apart from that I think it's a good idea
>> since it will help to decrease the waiting queue of Travis builds a bit.
>> 
>> Cheers,
>> Till
>> 
>> On Sun, Mar 26, 2017 at 11:57 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>> 
>>> +1 to Greg's suggestion.
>>> 
>>> On Sun, Mar 26, 2017 at 2:22 PM, Greg Hogan <c...@greghogan.com> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> Just saw this TravisCI beta feature. I think this would be worthwhile to
>>>> enable on pull request builds. We could leave branch builds unchanged
>>> since
>>>> there are fewer builds of this type and skipping builds would make it
>>>> harder to locate a broken build. It’s not uncommon to see three or more
>>>> builds queued for the same PR and developers cannot cancel builds on the
>>>> project account.
>>>>  https://blog.travis-ci.com/2017-03-22-introducing-auto-cancellation
>>>> 
>>>> I’ve enabled this against my personal repo but I believe Apache
>>>> Infrastructure would need to make the change for the project repo. Flink
>>>> has been the biggest user of Apache’s TravisCI build pool.
>>>> 
>>>> Greg



Re: [DISCUSS] TravisCI auto cancellation

2017-03-29 Thread Greg Hogan
Ticket: https://issues.apache.org/jira/browse/INFRA-13778


> On Mar 29, 2017, at 4:07 AM, Till Rohrmann <trohrm...@apache.org> wrote:
> 
> Looking at Flink's Travis account, I've got the feeling that this feature
> has already been activated. At least I see some builds (e.g. PR #3625)
> where multiple commits where created in a short time and then only the
> latest was actually executed. Apart from that I think it's a good idea
> since it will help to decrease the waiting queue of Travis builds a bit.
> 
> Cheers,
> Till
> 
> On Sun, Mar 26, 2017 at 11:57 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> 
>> +1 to Greg's suggestion.
>> 
>> On Sun, Mar 26, 2017 at 2:22 PM, Greg Hogan <c...@greghogan.com> wrote:
>> 
>>> Hi,
>>> 
>>> Just saw this TravisCI beta feature. I think this would be worthwhile to
>>> enable on pull request builds. We could leave branch builds unchanged
>> since
>>> there are fewer builds of this type and skipping builds would make it
>>> harder to locate a broken build. It’s not uncommon to see three or more
>>> builds queued for the same PR and developers cannot cancel builds on the
>>> project account.
>>>  https://blog.travis-ci.com/2017-03-22-introducing-auto-cancellation
>>> 
>>> I’ve enabled this against my personal repo but I believe Apache
>>> Infrastructure would need to make the change for the project repo. Flink
>>> has been the biggest user of Apache’s TravisCI build pool.
>>> 
>>> Greg


[jira] [Created] (FLINK-6195) Move gelly-examples jar from opt to examples

2017-03-27 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6195:
-

 Summary: Move gelly-examples jar from opt to examples
 Key: FLINK-6195
 URL: https://issues.apache.org/jira/browse/FLINK-6195
 Project: Flink
  Issue Type: Sub-task
  Components: Gelly
Affects Versions: 1.3.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial
 Fix For: 1.3.0


The {{opt}} directory should be reserved for Flink JARs which users may 
optionally move to {{lib}} to be loaded by the runtime. 
{{flink-gelly-examples}} is a user program so is being moved to the 
{{examples}} folder.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[DISCUSS] TravisCI auto cancellation

2017-03-26 Thread Greg Hogan
Hi,

Just saw this TravisCI beta feature. I think this would be worthwhile to enable 
on pull request builds. We could leave branch builds unchanged since there are 
fewer builds of this type and skipping builds would make it harder to locate a 
broken build. It’s not uncommon to see three or more builds queued for the same 
PR and developers cannot cancel builds on the project account.
  https://blog.travis-ci.com/2017-03-22-introducing-auto-cancellation

I’ve enabled this against my personal repo but I believe Apache Infrastructure 
would need to make the change for the project repo. Flink has been the biggest 
user of Apache’s TravisCI build pool.

Greg

Re: [DISCUSS] Flink dist directory management

2017-03-25 Thread Greg Hogan
Hi Jinkui,

+1 to moving gelly-examples into examples/.

Also sounds nice to similarly organize the Python examples.

Docs will also need to be updated (docs/dev/lib/gelly/index.md).

Greg


> On Mar 25, 2017, at 3:46 AM, shijinkui  wrote:
> 
> Hi, all
> 
> The Flink distribution's directories have no clearly defined responsibility 
> for which types of files belong in which directory.
> 
> The "bin","conf","lib" directories are clear for their responsibility.
> 
> But the "opt" directory mixes library jars with example jars.
> 
> I think we can discuss what is reasonable for each directory. Once we have 
> decided, we should follow it.
> 
> IMO, the directory style below is reasonable:
> 
> - "examples" directory only contain example jars
> - "opt" directory only contain optional library jars in runtime
> - "lib" directory only contain library jar that must be loaded at runtime
> - Ą°resourcesĄą directory only contain resource file used at runtime, such as 
> web file
> 
> Show your opinion please.
> 
> @wuchong, @fhueske @Fabian
> 
> Best regards,
> Jinkui Shi
> 
> 
> .
> ├── LICENSE
> ├── NOTICE
> ├── README.txt
> ├── bin
> │   ├── config.sh
> │   ├── flink
> │   ├── ...
> ├── conf
> │   ├── flink-conf.yaml
> │   ├── ...
> ├── examples
> │   ├── batch
> │   └── streaming
> ├── lib
> │   ├── flink-dist_2.11-1.3.0.jar
> │   ├── flink-python_2.11-1.3.0.jar
> │   ├── ...
> ├── log
> ├── opt
> │   ├── flink-cep-scala_2.11-1.3.0.jar
> │   ├── flink-cep_2.11-1.3.0.jar
> │   ├── flink-gelly-examples_2.11-1.3.0.jar
> │   ├── flink-gelly-scala_2.11-1.3.0.jar
> │   ├── flink-gelly_2.11-1.3.0.jar
> │   ├── flink-metrics-dropwizard-1.3.0.jar
> │   ├── flink-metrics-ganglia-1.3.0.jar
> │   ├── flink-metrics-graphite-1.3.0.jar
> │   ├── flink-metrics-statsd-1.3.0.jar
> │   └── flink-ml_2.11-1.3.0.jar
> ├── resources
> │   └── python
> └── tools
>     └── planVisualizer.html
> 
> 
> [1] https://github.com/apache/flink/pull/2460
> 



Re: [DISCUSS] FLIP-18: Code Generation for improving sorting performance

2017-03-23 Thread Greg Hogan
I would be more than happy to shepherd and review this PR.

I have two discussion points. First, a strategy for developing with templates. 
IntelliJ has a FreeMarker plugin but we lose formatting and code completion. To 
minimize this issue we can retain the untemplated code in an abstract class 
which is then concretely subclassed by the template.

Second, additional classes will turn performance-critical callsites 
megamorphic. Stephan noted this issue in his work on MemorySegment. 
  http://flink.apache.org/news/2015/09/16/off-heap-memory.html

For example, QuickSort calls IndexedSortable#compare and IndexedSortable#swap. 
With multiple compiled implementations of the sorter template these callsites 
can no longer be inlined (the same is true with NormalizedKeySorter and 
FixedLengthRecordSorter if the latter was instrumented).

I have not found a way to duplicate a Java class at runtime, but we may be able 
to use Janino to compile a class which is then uniquely renamed: each 
IndexedSortable type would map to a different QuickSort type (same bytecode, but 
uniquely optimized). This should also boost performance of runtime operators 
calling user-defined functions.
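
A rough sketch of that idea with Janino's SimpleCompiler, where the source
template, placeholder, and naming scheme are assumptions for illustration
rather than a proposed implementation:

import org.codehaus.janino.SimpleCompiler;

public class UniqueSorterCompiler {
    // Compile the same sorter source under a unique class name so each
    // IndexedSortable gets its own QuickSort type and the JIT can keep the
    // compare/swap callsites monomorphic.
    public static Class<?> compile(String sorterSourceTemplate, int id) throws Exception {
        String className = "QuickSort_" + id;
        SimpleCompiler compiler = new SimpleCompiler();
        compiler.cook(sorterSourceTemplate.replace("__CLASS_NAME__", className));
        return compiler.getClassLoader().loadClass(className);
    }
}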

Given the code already written, I expect we can refactor, review, and benchmark 
for the 1.3 release.

Greg


> On Mar 21, 2017, at 3:46 PM, Fabian Hueske  wrote:
> 
> Hi Pat,
> 
> thanks a lot for this great proposal! I think it is very well structured and 
> has the right level of detail.
> The improvements of your performance benchmarks look very promising and I 
> think code-gen'd sorters would be a very nice improvement.
> I like that you plan to add a switch to activate this feature.
> 
> In order move on, we will need a committer who "champions" your FLIP, reviews 
> the pull request, and eventually merges it.
> 
> @Greg and @Stephan, what do you think about this proposal?
> 
> Best, Fabian
> 
> 
> 2017-03-14 16:10 GMT+01:00 Pattarawat Chormai  >:
> Hi all,
> 
> I would like to initiate a discussion of applying code generation to 
> NormalizedKeySorter. The goal is to improve sorting performance by generating 
> a suitable NormalizedKeySorter for the underlying data. The generated sorter 
> will contain only the necessary code in important methods, such as swap and 
> compare, hence improving sorting performance. 
> 
> Details of the implementation are illustrated in FLIP-18: Code Generation for 
> improving sorting performance. 
> 
> 
> 
> Also, because we’re doing it as a course project at TUB, we have completed 
> the implementation and made a pull-request 
> to the Flink repo already.
> 
> From our evaluation, we have found that the pull-request reduces sorting time 
> by around 7-10%, and together with FLINK-3722 the sorting time is decreased 
> by 12-20%.
> 
> 
> 
> Please take a look at the document and the pull-request and let me know if 
> you have any suggestion.
> 
> Best,
> Pat
> 



Re: [ANNOUNCE] New committer: Theodore Vasiloudis

2017-03-21 Thread Greg Hogan
Welcome, Theo, and great to have you onboard with Flink and ML!


> On Mar 21, 2017, at 4:35 AM, Robert Metzger  wrote:
> 
> Hi everybody,
> 
> On behalf of the PMC I am delighted to announce Theodore Vasiloudis as a
> new Flink committer!
> 
> Theo has been a community member for a very long time and he is one of the
> main drivers of the currently ongoing ML discussions in Flink.
> 
> 
> Welcome Theo and congratulations again for becoming a Flink committer!
> 
> 
> Regards,
> Robert


