[jira] [Created] (FLINK-3716) Kafka08ITCase.testFailOnNoBroker() timing out before it has a chance to pass

2016-04-07 Thread Todd Lisonbee (JIRA)
Todd Lisonbee created FLINK-3716:


 Summary: Kafka08ITCase.testFailOnNoBroker() timing out before it 
has a chance to pass
 Key: FLINK-3716
 URL: https://issues.apache.org/jira/browse/FLINK-3716
 Project: Flink
  Issue Type: Bug
Reporter: Todd Lisonbee


This is on the latest master 4/7/2016 with `mvn clean verify`.  

Test also reliably fails running it directly from IntelliJ.

Test has a 60-second timeout, but it seems to need much more time to run (my 
workstation has a server-class Xeon).
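
For reference, a minimal sketch of raising the limit, assuming the test sets its 
timeout via JUnit 4's @Test annotation (this is an assumption; the actual test may 
configure its timeout differently):

{code}
// Hypothetical sketch only: bump the JUnit 4 timeout from 60s to 180s.
@Test(timeout = 180000) // milliseconds
public void testFailOnNoBroker() throws Exception {
    // ... existing test body unchanged ...
}
{code}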

---

/usr/lib/jvm/java-1.8.0-openjdk.x86_64/bin/java -ea -DforkNumber=01 -Xms256m 
-Xmx800m -Dlog4j.configuration=log4j-test.properties -Dmvn.forkNumber=1 
-XX:-UseGCOverheadLimit -Didea.launcher.port=7546 
-Didea.launcher.bin.path=/home/iauser/bin/idea-IU-141.1532.4/bin 
-Dfile.encoding=UTF-8 -classpath 

Re: Broken links after doc restructuring

2016-04-07 Thread Stephan Ewen
Hey Matthias!

The mailing list traffic and feature requests are piling up, so it's hard to keep
up and fix things within days...

Do you think you could fix those links? As a simple approach, I would
suggest the following:

  - Truncate the history and drop everything earlier than the "Flink" days (in
your list, before "Hadoop Compatibility in Flink").

  - Point links that reference a changing URL (docs-master or the GitHub master
branch) to the release from around that time.

  - Links from news posts that no longer have a matching target should probably
be dropped...

Stephan



On Thu, Apr 7, 2016 at 2:54 PM, Matthias J. Sax  wrote:

> Anyone?
>
> On 04/04/2016 05:06 PM, Matthias J. Sax wrote:
> > Hi,
> >
> > I just stepped through the whole blog. Some stuff can be fixed easily,
> > other links should just be removed, and for some I am not sure what to do
> > (quite old stuff).
> >
> > I put my thoughts about each broken link (or nothing if I have no idea how
> > to deal with it). Please give feedback.
> >
> >
> >
> > ** Version 0.2 Released
> > -> Link to the changelog:
> > https://stratosphere.eu/wiki/doku.php/wiki:changesrelease0.2
> >
> >
> >
> > ** Stratosphere Demo Accepted for ICDE 2013 -> links to the paper and poster
> > ->
> >
> https://flink.apache.org/assets/papers/optimizationOfDataFlowsWithUDFs_13.pdf
> > ->
> >
> https://flink.apache.org/assets/papers/optimizationOfDataFlowsWithUDFs_poster_13.pdf
> >
> > Paper:
> http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6544927
> > Poster not sure (maybe ask Fabian or remove link)
> >
> >
> >
> > ** Stratosphere Demo Paper Accepted for BTW 2013
> > -> https://flink.apache.org/assets/papers/Sopremo_Meteor%20BigData.pdf
> >
> >
> >
> > ** ICDE 2013 Demo Preview
> > -> https://flink.apache.org/publications
> >
> > This section does not exist any more. But the paper is linked in another
> > blog post anyway. Maybe we can remove the whole blog post.
> >
> >
> >
> > ** Paper "All Roads Lead to Rome: Optimistic Recovery for Distributed
> > Iterative Data Processing" accepted at CIKM 2013
> > -> https://flink.apache.org/assets/papers/optimistic.pdf
> >
> > Available at https://dl.acm.org/citation.cfm?doid=2505515.2505753
> >
> >
> >
> > ** Stratosphere got accepted to the Hadoop Summit Europe in Amsterdam
> > -> http://hadoopsummit.org/amsterdam/
> > ->
> >
> https://hadoopsummit.uservoice.com/forums/196822-future-of-apache-hadoop/filters/top
> >
> > I would just remove those links.
> >
> >
> >
> > ** Stratosphere 0.4 Released
> > ->
> >
> https://flink.apache.org/blog/tutorial/2014/01/12/0.4-migration-guide.html
> > ->
> >
> https://flink.apache.org/blog/tutorial/2014/01/12/0.4-migration-guide.html
> > ->
> >
> https://ci.apache.org/projects/flink/flink-docs-master/0.4/programming_guides/iterations.html
> > ->
> >
> https://ci.apache.org/projects/flink/flink-docs-master/0.4/programming_guides/iterations.html
> > ->
> >
> https://ci.apache.org/projects/flink/flink-docs-master/0.4/program_execution/local_executor.html
> > -> https://flink.apache.org/quickstart/
> >
> >
> >
> >
> > ** Optimizer Plan Visualization Tool
> > -> http://stratosphere.eu/docs/0.4/program_execution/web_interface.html
> > -> http://stratosphere.eu/docs/0.4/program_execution/local_executor.html
> >
> >
> >
> > ** Use Stratosphere with Amazon Elastic MapReduce
> > ->
> >
> https://ci.apache.org/projects/flink/flink-docs-master/0.4/setup/yarn.html
> >
> >
> >
> > ** Stratosphere version 0.5 available
> > -> http://stratosphere.eu/docs/0.5/
> > -> http://stratosphere.eu/docs/0.5/programming_guides/examples_java.html
> >
> >
> >
> > ** Hadoop Compatibility in Flink
> > ->
> >
> https://ci.apache.org/projects/flink/flink-docs-release-0.7/hadoop_compatibility.html
> >
> > Works; however, we might want to point to a newer version (maybe current
> > master?)
> >
> >
> >
> > ** January 2015 in the Flink community
> > -> http://data-artisans.com/computing-recommendations-with-flink.html
> > ->
> >
> http://2015.hadoopsummit.org/amsterdam-blog/announcing-the-community-vote-session-winners-for-the-2015-hadoop-summit-europe/
> >
> >
> >
> > ** Introducing Flink Streaming
> > ->
> >
> http://ci.apache.org/projects/flink/flink-docs-master/apis/streaming_guide.html#sources
> > ->
> >
> http://ci.apache.org/projects/flink/flink-docs-master/apis/streaming_guide.html#stream-connectors
> > ->
> >
> http://ci.apache.org/projects/flink/flink-docs-master/apis/streaming_guide.html#window-operators
> >
> > Those can be fixed easily (point to current master)
> >
> >
> >
> > ** February 2015 in the Flink community
> > ->
> >
> https://github.com/apache/flink/tree/master/flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/example
> > -> https://github.com/apache/flink/tree/master/flink-staging/flink-table
> > ->
> https://github.com/apache/flink/tree/master/flink-staging/flink-hcatalog
> >
> > I would just remove those.
> >
> >
> >
> > ** Peeking into Apache Flink's Engine Room
> > ->
> >
> 

Re: Expected duration for cascading-flink tests?

2016-04-07 Thread Fabian Hueske
Hi Ken,

I fixed the issues you reported and pushed a new version that depends on
Flink 1.0.1 and Cascading 3.1-wip-56 to the master branch [1].
We will publish this branch soon as cascading-flink 0.2.

Thanks for your help,
Fabian

[1] https://github.com/dataArtisans/cascading-flink/tree/master

2016-03-30 11:04 GMT+02:00 Fabian Hueske :

> Hi Ken,
>
> regarding the failed tests:
> - cascading.JoinFieldedPipesPlatformTest$testJoinMergeGroupBy is expected
> to fail due to restrictions in the MR/Tez engines. If I remember correctly,
> this is about deadlocks that need to be resolved by splitting a job.
> Flink's optimizer detects such situations and places a dam breaker to
> resolve them within a single job, and is hence able to execute the job
> correctly.
> - cascading.ComparePlatformsTest$CompareTestCase I think you are right on
> this one. When I implemented the runner, I did not find a way to make this
> test pass. It looked like an issue with the test itself, as you assumed as
> well.
>
> Btw. I ported the runner to Flink 1.0 and bumped the Cascading 3.1
> WIP version already, but haven't done an "official" release yet. You can find
> the code in the flink-1.0 branch [1]. With Flink 1.0, we also extended the
> support for outer joins. It might be possible to get rid of some of the
> HashJoin restrictions, but I have to take a closer look at how outer hash
> joins are done with Cascading MR/Tez.
> Anyway, I can do a Cascading-Flink release for Flink 1.0 soon and extend
> HashJoin support later.
>
> Best, Fabian
>
> [1] https://github.com/dataartisans/cascading-flink/tree/flink-1.0
>
> 2016-03-30 6:08 GMT+02:00 Ken Krugler :
>
>> Hi Fabian,
>>
>> > From: Fabian Hueske
>> > Sent: March 29, 2016 3:51:08pm PDT
>> > To: dev@flink.apache.org
>> > Subject: Re: Expected duration for cascading-flink tests?
>> >
>> > Hi Ken,
>> >
>> > no, this is definitely not expected. The tests complete in about 30
>> mins on
>> > my machine.
>> > Is it possible that you have another Flink process running on your
>> machine
>> > (maybe a debug thread in your IDE)? That could explain the "Address
>> already
>> > in use" exceptions.
>>
>> Good call - I'd run "bin/stop-local.sh" previously, but I see that
>> there's still the Flink process running.
>>
>> Re-running bin/stop-local.sh displays "No jobmanager daemon to stop on
>> host Kens-MacBook-Air.local.", but still doesn't kill off the Flink process.
>>
>> What might cause that situation?
>>
>> In any case, I manually killed the process and started the build again,
>> and it finished in about 20 minutes, which is great.
>>
>> I see the expected errors, e.g.
>>
>> HashJoin does only support InnerJoin and LeftJoin but is
>> cascading.pipe.joiner.OuterJoin
>>
>> though this one seems odd:
>>
>> > testJoinMergeGroupBy(cascading.JoinFieldedPipesPlatformTest)  Time
>> elapsed: 0.048 sec  <<< FAILURE!
>> > junit.framework.AssertionFailedError: planner should throw error on plan
>>
>> FlinkTestPlatform needs to return true from supportsGroupByAfterMerge() -
>> assuming that this is actually the case (seems reasonable for Flink)
>>
>> Though making that change requires cascading-wip-56 to avoid a
>> compilation error on the @Override.
>>
>> There's also this one:
>>
>> > Running cascading.ComparePlatformsTest$CompareTestCase
>> > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.053
>> sec <<< FAILURE! - in cascading.ComparePlatformsTest$CompareTestCase
>> > warning(junit.framework.TestSuite$1)  Time elapsed: 0.009 sec  <<<
>> FAILURE!
>> > junit.framework.AssertionFailedError: Class
>> cascading.ComparePlatformsTest$CompareTestCase has no public constructor
>> TestCase(String name) or TestCase()
>> >   at junit.framework.Assert.fail(Assert.java:57)
>> >   at junit.framework.TestCase.fail(TestCase.java:227)
>> >   at junit.framework.TestSuite$1.runTest(TestSuite.java:100)
>>
>>
>> But that seems like an issue with the Cascading test code. I'll check
>> w/Chris and see what he says.
>>
>> Anyway, the build worked with the update to cascading-wip-56.
>>
>> I also tried updating to Flink 1.0.0 (from 0.10.0), but so far I've run
>> into some compilation errors, e.g. in FlinkFlowStep.java it can't find the
>> JavaPlan class.
>>
>> Thanks again for the help,
>>
>> -- Ken
>>
>>
>>
>> > Best, Fabian
>> >
>> > 2016-03-29 20:36 GMT+02:00 Ken Krugler :
>> >
>> >> An update (and a nudge)…
>> >>
>> >> So far it's been more than 20 hours, and the tests are still running.
>> >>
>> >> Most tests seem to fail with one of two different errors…
>> >>
>> >> 1. Address already in use
>> >>
>> >> cascading.flow.FlowException: [test] unhandled exception
>> >>at cascading.flow.BaseFlow.complete(BaseFlow.java:977)
>> >>at
>> >>
>> cascading.flow.FlowStrategiesPlatformTest.testSkipStrategiesReplace(FlowStrategiesPlatformTest.java:67)
>> >> Caused by: 

Re: Broken links after doc restructuring

2016-04-07 Thread Matthias J. Sax
Anyone?

On 04/04/2016 05:06 PM, Matthias J. Sax wrote:
> Hi,
> 
> I just stepped through the whole blog. Some stuff can be fixed easily,
> other links should just be removed, and for some I am not sure what to do
> (quite old stuff).
> 
> I put my thoughts about each broken link (or nothing if I have no idea how
> to deal with it). Please give feedback.
> 
> 
> 
> ** Version 0.2 Released
> -> Link to the changelog:
> https://stratosphere.eu/wiki/doku.php/wiki:changesrelease0.2
> 
> 
> 
> ** Stratosphere Demo Accepted for ICDE 2013 -> links to the paper and poster
> ->
> https://flink.apache.org/assets/papers/optimizationOfDataFlowsWithUDFs_13.pdf
> ->
> https://flink.apache.org/assets/papers/optimizationOfDataFlowsWithUDFs_poster_13.pdf
> 
> Paper: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6544927
> Poster not sure (maybe ask Fabian or remove link)
> 
> 
> 
> ** Stratosphere Demo Paper Accepted for BTW 2013
> -> https://flink.apache.org/assets/papers/Sopremo_Meteor%20BigData.pdf
> 
> 
> 
> ** ICDE 2013 Demo Preview
> -> https://flink.apache.org/publications
> 
> This section does not exist any more. But the paper is linked in another
> blog post anyway. Maybe we can remove the whole blog post.
> 
> 
> 
> ** Paper "All Roads Lead to Rome: Optimistic Recovery for Distributed
> Iterative Data Processing" accepted at CIKM 2013
> -> https://flink.apache.org/assets/papers/optimistic.pdf
> 
> Available at https://dl.acm.org/citation.cfm?doid=2505515.2505753
> 
> 
> 
> ** Stratosphere got accepted to the Hadoop Summit Europe in Amsterdam
> -> http://hadoopsummit.org/amsterdam/
> ->
> https://hadoopsummit.uservoice.com/forums/196822-future-of-apache-hadoop/filters/top
> 
> I would just remove those links.
> 
> 
> 
> ** Stratosphere 0.4 Released
> ->
> https://flink.apache.org/blog/tutorial/2014/01/12/0.4-migration-guide.html
> ->
> https://flink.apache.org/blog/tutorial/2014/01/12/0.4-migration-guide.html
> ->
> https://ci.apache.org/projects/flink/flink-docs-master/0.4/programming_guides/iterations.html
> ->
> https://ci.apache.org/projects/flink/flink-docs-master/0.4/programming_guides/iterations.html
> ->
> https://ci.apache.org/projects/flink/flink-docs-master/0.4/program_execution/local_executor.html
> -> https://flink.apache.org/quickstart/
> 
> 
> 
> 
> ** Optimizer Plan Visualization Tool
> -> http://stratosphere.eu/docs/0.4/program_execution/web_interface.html
> -> http://stratosphere.eu/docs/0.4/program_execution/local_executor.html
> 
> 
> 
> ** Use Stratosphere with Amazon Elastic MapReduce
> ->
> https://ci.apache.org/projects/flink/flink-docs-master/0.4/setup/yarn.html
> 
> 
> 
> ** Stratosphere version 0.5 available
> -> http://stratosphere.eu/docs/0.5/
> -> http://stratosphere.eu/docs/0.5/programming_guides/examples_java.html
> 
> 
> 
> ** Hadoop Compatibility in Flink
> ->
> https://ci.apache.org/projects/flink/flink-docs-release-0.7/hadoop_compatibility.html
> 
> Works; however, we might want to point to a newer version (maybe current
> master?)
> 
> 
> 
> ** January 2015 in the Flink community
> -> http://data-artisans.com/computing-recommendations-with-flink.html
> ->
> http://2015.hadoopsummit.org/amsterdam-blog/announcing-the-community-vote-session-winners-for-the-2015-hadoop-summit-europe/
> 
> 
> 
> ** Introducing Flink Streaming
> ->
> http://ci.apache.org/projects/flink/flink-docs-master/apis/streaming_guide.html#sources
> ->
> http://ci.apache.org/projects/flink/flink-docs-master/apis/streaming_guide.html#stream-connectors
> ->
> http://ci.apache.org/projects/flink/flink-docs-master/apis/streaming_guide.html#window-operators
> 
> Those can be fixed easily (point to current master)
> 
> 
> 
> ** February 2015 in the Flink community
> ->
> https://github.com/apache/flink/tree/master/flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/example
> -> https://github.com/apache/flink/tree/master/flink-staging/flink-table
> -> https://github.com/apache/flink/tree/master/flink-staging/flink-hcatalog
> 
> I would just remove those.
> 
> 
> 
> ** Peeking into Apache Flink's Engine Room
> ->
> https://ci.apache.org/projects/flink/flink-docs-master/apis/dataset_transformations.html#join-algorithm-hints
> 
> Can be fixed with
> https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/index.html#semantic-annotations
> 
> 
> 
> ** March 2015 in the Flink community
> -> http://data-artisans.com/dataflow.html
> -> https://github.com/apache/flink/tree/master/flink-staging/flink-table
> ->
> https://github.com/apache/flink/blob/master/flink-staging/flink-table/src/main/java/org/apache/flink/examples/java/JavaTableExample.java
> ->
> https://github.com/apache/flink/tree/master/flink-staging/flink-table/src/main/scala/org/apache/flink/examples/scala
> -> https://github.com/apache/flink/tree/master/flink-staging/flink-ml
> ->
> https://ci.apache.org/projects/flink/flink-docs-master/setup/flink_on_tez.html
> 
> 
> 
> ** Announcing Flink 0.9.0-milestone1 preview release
> ->
> 

[jira] [Created] (FLINK-3715) Move Accumulating/Discarding from Trigger to WindowOperator

2016-04-07 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-3715:
---

 Summary: Move Accumulating/Discarding from Trigger to 
WindowOperator
 Key: FLINK-3715
 URL: https://issues.apache.org/jira/browse/FLINK-3715
 Project: Flink
  Issue Type: Sub-task
  Components: Streaming
Reporter: Aljoscha Krettek


As mentioned in 
https://docs.google.com/document/d/1Xp-YBf87vLTduYSivgqWVEMjYUmkA-hyb4muX3KRl08/edit#heading=h.pyk1sn8f33q2
 we should move the decision of whether to {{PURGE}} a window upon firing from 
the {{Trigger}} to the {{WindowOperator}}. This also requires adding an API so that 
the user can specify whether windows should be purged upon trigger firing 
(discarding) or kept (accumulating).

As mentioned in the above doc, the {{Trigger}} can react with 4 results right 
now: {{CONTINUE}}, {{FIRE}}, {{PURGE}}, {{FIRE_AND_PURGE}}. The behavior of a 
trigger is not apparent without looking at its code, which has confused a 
number of users. With the new regime, a {{Trigger}} can just decide 
whether to {{CONTINUE}} or {{FIRE}}. The setting of accumulating/discarding 
decides whether to purge the window or keep it.

This depends on FLINK-3714 where we introduce an "allowed lateness" setting. 
Having a choice between accumulating and discarding only makes sense with an 
allowed lateness greater than zero. Otherwise no late elements could ever arrive 
that would go into the kept windows.
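
A hypothetical sketch of what such an API could look like (method names like 
accumulating()/discarding() are purely illustrative, not an agreed-upon design):

{code}
// Illustrative only; not the actual or proposed Flink API surface.
// The Trigger now only answers CONTINUE or FIRE; whether window contents are
// purged after firing is configured on the windowed stream itself.
DataStream<Result> result = input
    .keyBy(new MyKeySelector())
    .timeWindow(Time.minutes(10))
    .allowedLateness(Time.minutes(5))   // see FLINK-3714
    .accumulating()                     // keep window contents after each firing
    .apply(new MyWindowFunction());
{code}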





[jira] [Created] (FLINK-3714) Add Support for "Allowed Lateness"

2016-04-07 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-3714:
---

 Summary: Add Support for "Allowed Lateness"
 Key: FLINK-3714
 URL: https://issues.apache.org/jira/browse/FLINK-3714
 Project: Flink
  Issue Type: Sub-task
  Components: Streaming
Reporter: Aljoscha Krettek


As mentioned in 
https://docs.google.com/document/d/1Xp-YBf87vLTduYSivgqWVEMjYUmkA-hyb4muX3KRl08/edit#
 we should add support for an allowed lateness setting.

This includes several things:

 - API for setting allowed lateness
 - Dropping of late elements 
 - Garbage collection of event-time windows 

Lateness only makes sense for event-time windows. So we also have to figure out 
what the API for this should look like and especially what should happen with 
the "stream-time characteristic" switch. For example, consider this:

{code}
env.setStreamTimeCharacteristic(ProcessingTime)
...
DataStream in = ...

result = in
  .keyBy()
  .timeWindow()
  .allowedLateness()
  .apply()
{code}

I think the setting can be silently ignored when doing processing-time 
windowing.
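
As a rough illustration of the intended event-time semantics (a sketch only, under 
the assumption that lateness is measured against the watermark relative to the end 
of the window):

{code}
// Hypothetical sketch of the drop/garbage-collection rule for event-time windows.
// A window is considered expired once the watermark has passed the end of the
// window plus the allowed lateness; elements for expired windows are dropped and
// the window state can be garbage collected.
long cleanupTime = window.maxTimestamp() + allowedLateness;
boolean windowExpired = currentWatermark > cleanupTime;
if (windowExpired) {
    // drop the late element and/or clear the window state
}
{code}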






Re: Table API / SQL Queries and Code Generation

2016-04-07 Thread Gábor Horváth
Hi,

Thank you for the responses!

On 7 April 2016 at 12:38, Fabian Hueske  wrote:

> Hi Gabor,
>
> the Table API / SQL translator generates the code as a String and ships the
> String as a member of the function object.
> The code is compiled with Janino when the function is initialized in the
> open() method. So we are not shipping classes but compiling the code at the
> worker.
>
> Not sure if this approach would work for serializers and comparators as
> well.
>

I think I will go in the same direction and generate code as a string and
compile it instead of generating bytecode directly. In order to reduce the
dependencies I think I will use the same compiler. Did you make any attempt
to support debugging of the generated code? What is the motivation for doing
the compilation at the worker? Distributing the load coming from the
compilation? Isn't it a problem to compile the same code multiple
times on multiple workers?
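
(For reference, a minimal sketch of the string-based approach: generating source 
as a String and compiling it at runtime with Janino's SimpleCompiler. The class 
and method names below are only illustrative.)

{code}
import org.codehaus.janino.SimpleCompiler;

public class GeneratedCodeExample {
    public static void main(String[] args) throws Exception {
        // Source code generated as a plain String (illustrative example).
        String source =
            "public class GeneratedMapper {\n" +
            "  public int map(int value) { return value * 2; }\n" +
            "}";

        // Compile the String in memory and load the resulting class.
        SimpleCompiler compiler = new SimpleCompiler();
        compiler.cook(source);
        Class<?> cls = compiler.getClassLoader().loadClass("GeneratedMapper");

        // Instantiate and invoke the generated method via reflection.
        Object mapper = cls.getDeclaredConstructor().newInstance();
        Object result = cls.getMethod("map", int.class).invoke(mapper, 21);
        System.out.println(result); // prints 42
    }
}
{code}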


>
> Best, Fabian
>
> 2016-04-05 16:47 GMT+02:00 Timo Walther :
>
> > Hi Gábor,
> >
> > the code generation in the Table API is in a very early stage and does
> > not contain much optimization logic so far. Currently we are extending the
> > functionality to support the most important SQL operations. It will need
> > some time until we can further optimize the generated code (e.g. for
> > tracking nulls).
>

Do you plan to utilize annotations for optimizations? E.g., something cannot
be null or cannot be subclassed. In case you do, it might be beneficial to
use the same set of annotations.


> >
> > We are using the Janino compiler [1] for in-memory compilation and class
> > loading. You can have a look at the code generation here [2].
>

Thank you, I will definitely look into the code.


> >
> > Regards,
> > Timo
> >
> > [1] http://unkrig.de/w/Janino
> > [2]
> >
> https://github.com/apache/flink/blob/master/flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/CodeGenerator.scala
> >
> >
> >
> > On 05.04.2016 16:25, Gábor Horváth wrote:
> >
> >> Hi!
> >>
> >> During this summer I plan to introduce runtime code generation in the
> >> serializers [1]
> >> to improve the performance of Flink.
> >>
> >> As Stephan Ewen pointed out, in the Table API / SQL effort code generation
> >> will also be used to generate functions and data types. In order to share
> >> as much code as possible and align the code generation efforts I would
> >> like to cooperate with the authors of that project.
> >> Who are they, what libraries and approach are they planning to use? Is
> >> there a design
> >> document or a requirement list somewhere?
> >>
> >> I know about one document [2], but that did not contain the answers I
> was
> >> looking for.
> >>
> >> Thanks in advance,
> >> Gábor Horváth
> >>
> >> [1]
> >>
> >>
> https://docs.google.com/document/d/1VC8lCeErx9kI5lCMPiUn625PO0rxR-iKlVqtt3hkVnk
> >> [2]
> >>
> >>
> https://docs.google.com/document/d/1sITIShmJMGegzAjGqFuwiN_iw1urwykKsLiacokxSw0
> >>
> >>
> >
>

Regards,
Gábor


Re: Table API / SQL Queries and Code Generation

2016-04-07 Thread Fabian Hueske
Hi Gabor,

the Table API / SQL translator generates the code as a String and ships the
String as a member of the function object.
The code is compiled with Janino when the function is initialized in the
open() method. So we are not shipping classes but compiling the code at the
worker.

Not sure if this approach would work for serializers and comparators as
well.

Best, Fabian

2016-04-05 16:47 GMT+02:00 Timo Walther :

> Hi Gábor,
>
> the code generation in the Table API is in a very early stage and does not
> contain much optimization logic so far. Currently we are extending the
> functionality to support the most important SQL operations. It will need
> some time until we can further optimize the generated code (e.g. for
> tracking nulls).
>
> We are using the Janino compiler [1] for in-memory compilation and class
> loading. You can have a look at the code generation here [2].
>
> Regards,
> Timo
>
> [1] http://unkrig.de/w/Janino
> [2]
> https://github.com/apache/flink/blob/master/flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/CodeGenerator.scala
>
>
>
> On 05.04.2016 16:25, Gábor Horváth wrote:
>
>> Hi!
>>
>> During this summer I plan to introduce runtime code generation in the
>> serializers [1]
>> to improve the performance of Flink.
>>
>> As Stephan Ewen pointed out, in the Table API / SQL effort code generation
>> will also be used to generate functions and data types. In order to share
>> as much code as possible and align the code generation efforts I would like
>> to cooperate with the authors of that project.
>> Who are they, what libraries and approach are they planning to use? Is
>> there a design
>> document or a requirement list somewhere?
>>
>> I know about one document [2], but that did not contain the answers I was
>> looking for.
>>
>> Thanks in advance,
>> Gábor Horváth
>>
>> [1]
>>
>> https://docs.google.com/document/d/1VC8lCeErx9kI5lCMPiUn625PO0rxR-iKlVqtt3hkVnk
>> [2]
>>
>> https://docs.google.com/document/d/1sITIShmJMGegzAjGqFuwiN_iw1urwykKsLiacokxSw0
>>
>>
>


[jira] [Created] (FLINK-3713) DisposeSavepoint message uses system classloader to discard state

2016-04-07 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-3713:
-

 Summary: DisposeSavepoint message uses system classloader to 
discard state
 Key: FLINK-3713
 URL: https://issues.apache.org/jira/browse/FLINK-3713
 Project: Flink
  Issue Type: Bug
  Components: JobManager
Reporter: Robert Metzger


The {{DisposeSavepoint}} message in the JobManager is using the system 
classloader to discard the state:

{code}
val savepoint = savepointStore.getState(savepointPath)

log.debug(s"$savepoint")

// Discard the associated checkpoint
savepoint.discard(getClass.getClassLoader)

// Dispose the savepoint
savepointStore.disposeState(savepointPath)
{code}

Which leads to issues when the state contains user classes:

{code}
2016-04-07 03:02:12,225 INFO  org.apache.flink.yarn.YarnJobManager  
- Disposing savepoint at 
'hdfs:/bigdata/flink/savepoints/savepoint-7610540d089f'.
2016-04-07 03:02:12,233 WARN  org.apache.flink.runtime.checkpoint.StateForTask  
- Failed to discard checkpoint state: StateForTask 
eed5cc5a12dc2e0672848ba81bd8fa6d-0 : SerializedValue
java.lang.ClassNotFoundException: 
.MetricsProcessor$CombinedKeysFoldFunction
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at 
org.apache.flink.util.InstantiationUtil$ClassLoaderObjectInputStream.resolveClass(InstantiationUtil.java:64)
at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at java.util.HashMap.readObject(HashMap.java:1184)
at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at 
org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:290)
at 
org.apache.flink.util.SerializedValue.deserializeValue(SerializedValue.java:58)
at 
org.apache.flink.runtime.checkpoint.StateForTask.discard(StateForTask.java:109)
at 
org.apache.flink.runtime.checkpoint.CompletedCheckpoint.discard(CompletedCheckpoint.java:85)
at 

[jira] [Created] (FLINK-3712) YARN client dynamic properties are not passed correctly to the leader election service on the client

2016-04-07 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-3712:
-

 Summary: YARN client dynamic properties are not passed correctly 
to the leader election service on the client
 Key: FLINK-3712
 URL: https://issues.apache.org/jira/browse/FLINK-3712
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Affects Versions: 1.0.0
Reporter: Robert Metzger


The issue was reported here: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/YARN-High-Availability-td3558.html

Dynamic properties (for example the zookeeper root path) are not properly 
passed to the leader election service on the client.
The election service is using the configuration values from the config file 
instead of the dynamically passed properties.







Re: [DISCUSS] Macro-benchmarking for performance tuning and regression detection

2016-04-07 Thread Till Rohrmann
Hi Greg,

I like the idea of having a macro-benchmarking suite to test exactly the
points you've mentioned. If we don't have reliable performance numbers,
then it will always be hard to tell whether an improvement makes sense or
not (performance-wise).

I think we already undertook a first attempt to solve this problem with
Yoka [1]. The idea was to run a set of algorithms continuously on a machine
in the cloud. Yoka was running for some time, but I'm not sure whether this
is still the case.

Another tool I know of, and which people use to run benchmark suites with
Flink, is Peel [2]. Researchers at DIMA are using it to benchmark different
distributed engines against each other. But I have never really worked with
it.

[1] https://github.com/mxm/yoka
[2] https://github.com/stratosphere/peel

Cheers,
Till

On Wed, Apr 6, 2016 at 6:56 PM, Greg Hogan  wrote:

> I'd like to discuss the creation of a macro-benchmarking module for Flink.
> This could be run during pre-release testing to detect performance
> regressions and during development when refactoring or performance tuning
> code on the hot path.
>
> Many users have published benchmarks and the Flink libraries already
> contain a modest selection of algorithms. Some benefits of creating a
> consolidated collection of macro-benchmarks include:
>
> - comprehensive code coverage: a diverse set of algorithms can stress every
> aspect of Flink (streaming, batch, sorts, joins, spilling, cluster, ...)
>
> - codify best practices: benchmarks should be relatively stable and
> repeatable
>
> - efficient: an automated system can run many more tests and generate more
> accurate results
>
> Macro-benchmarks would be useful in analyzing improved performance with the
> proposed specialized serializes and comparators [FLINK-3599] or making
> Flink NUMA-aware [FLINK-3163].
>
> I've also been looking recently at some of the hot code and see roughly a
> 12-14% total improvement when modifying NormalizedKeySorter.compare/swap to
> use bitshift and bitmask rather than divide and modulo. The trade-off is
> that to align on a power of 2 we leave holes in the buffers and require
> additional MemoryBuffers. And I'm testing on a single data type, IntValue,
> and there
> may be different results for LongValue or StringValue or custom types or
> with different algorithms. And replacing multiply with a left shift reduces
> performance, demonstrating the need to test changes in isolation.
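
An illustrative sketch of the addressing change described above (not the actual 
NormalizedKeySorter code; constants and names are made up for the example), 
assuming records-per-segment is rounded down to a power of two:

{code}
// Locating a record inside fixed-size memory segments: divide/modulo vs. shift/mask.
final class SortIndexAddressing {
    static final int RECORDS_PER_SEGMENT = 1 << 10;          // power of two
    static final int SEGMENT_SHIFT = 10;                      // log2(RECORDS_PER_SEGMENT)
    static final int SEGMENT_MASK = RECORDS_PER_SEGMENT - 1;
    static final int RECORD_SIZE = 16;

    // Division/modulo variant (current approach).
    static long addressDiv(int index) {
        int segment = index / RECORDS_PER_SEGMENT;
        int offset = (index % RECORDS_PER_SEGMENT) * RECORD_SIZE;
        return ((long) segment << 32) | offset;
    }

    // Shift/mask variant; gives the same result when RECORDS_PER_SEGMENT is a
    // power of two, at the cost of unused bytes at the end of each segment.
    static long addressShift(int index) {
        int segment = index >>> SEGMENT_SHIFT;
        int offset = (index & SEGMENT_MASK) * RECORD_SIZE;
        return ((long) segment << 32) | offset;
    }
}
{code}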
>
> There are many more ideas, e.g. NormalizedKeySorter writing keys before the
> pointer so that the offset computation is performed outside of the compare
> and sort methods. Also, SpanningRecordSerializer could skip to the next
> buffer rather than writing length across buffers. These changes might each
> be worth a few percent. Other changes might be less than a 1% speedup, but
> taken in aggregate will yield a noticeable performance increase.
>
> I like the idea of profile first, measure second, then create and discuss
> the pull request.
>
> As for the actual macro-benchmarking framework, it would be nice if the
> algorithms also verified correctness alongside performance. The
> algorithm interface would be warmup (run only once) and execute, which
> would be run multiple times in an interleaved manner. The benchmarking
> duration should be tunable.
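
A hypothetical sketch of that interface (names are illustrative, not an 
agreed-upon design):

{code}
// Illustrative sketch of the benchmark algorithm interface described above.
public interface MacroBenchmark {

    /** Run once, e.g. to generate input data and warm up the JVM and cluster. */
    void warmup() throws Exception;

    /**
     * Run repeatedly, interleaved with other benchmarks. Returns true if the
     * computed result is correct, so correctness is verified alongside speed.
     */
    boolean execute() throws Exception;
}
{code}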
>
> The framework would be responsible for configuring, starting, and stopping
> the cluster, executing algorithms and recording performance, and comparing
> and analyzing results.
>
> Greg
>