[jira] [Commented] (FLINK-1710) Expression API Tests take very long
[ https://issues.apache.org/jira/browse/FLINK-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492352#comment-14492352 ]

ASF GitHub Bot commented on FLINK-1710:
---------------------------------------

Github user aljoscha closed the pull request at:

    https://github.com/apache/flink/pull/584

Expression API Tests take very long
-----------------------------------

                Key: FLINK-1710
                URL: https://issues.apache.org/jira/browse/FLINK-1710
            Project: Flink
         Issue Type: Bug
         Components: Expression API
   Affects Versions: 0.9
           Reporter: Stephan Ewen
           Assignee: Aljoscha Krettek
            Fix For: 0.9

The tests of the Expression API take an immense amount of time, compared to the other API tests.
Is that because they execute on large (generated) data sets, because the program compilation overhead is high, or because there is an inefficiency in the execution still?

Running org.apache.flink.api.scala.expressions.AggregationsITCase
Running org.apache.flink.api.scala.expressions.SelectITCase
Running org.apache.flink.api.scala.expressions.AsITCase
Running org.apache.flink.api.scala.expressions.StringExpressionsITCase
Running org.apache.flink.api.scala.expressions.CastingITCase
Running org.apache.flink.api.scala.expressions.JoinITCase
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 34.652 sec - in org.apache.flink.api.scala.expressions.AsITCase
Running org.apache.flink.api.scala.expressions.GroupedAggreagationsITCase
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 55.797 sec - in org.apache.flink.api.scala.expressions.StringExpressionsITCase
Running org.apache.flink.api.scala.expressions.PageRankExpressionITCase
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 63.072 sec - in org.apache.flink.api.scala.expressions.SelectITCase
Running org.apache.flink.api.scala.expressions.ExpressionsITCase
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 65.628 sec - in org.apache.flink.api.scala.expressions.CastingITCase
Running org.apache.flink.api.scala.expressions.FilterITCase
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 74.174 sec - in org.apache.flink.api.scala.expressions.AggregationsITCase
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 93.878 sec - in org.apache.flink.api.scala.expressions.JoinITCase
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 63.4 sec - in org.apache.flink.api.scala.expressions.GroupedAggreagationsITCase
Tests run: 12, Failures: 0, Errors: 0, Skipped: 4, Time elapsed: 44.179 sec - in org.apache.flink.api.scala.expressions.FilterITCase
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 53.801 sec - in org.apache.flink.api.scala.expressions.ExpressionsITCase
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 124.365 sec - in org.apache.flink.api.scala.expressions.PageRankExpressionITCase

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
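To put a single number on the report above, the per-suite "Time elapsed" values can be added up. The ten values listed sum to roughly 673 seconds; since the output is interleaved, the suites appear to have run concurrently, so this is summed suite time rather than wall-clock time. The following is a minimal sketch of that arithmetic; the class name `ElapsedTimeSummary` and its method are hypothetical helpers, not part of Flink or Surefire.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: sums the "Time elapsed" values found in Surefire-style result lines. */
final class ElapsedTimeSummary {

    // Matches fragments such as "Time elapsed: 34.652 sec" and captures the number.
    private static final Pattern ELAPSED =
            Pattern.compile("Time elapsed: (\\d+(?:\\.\\d+)?) sec");

    /** Returns the sum of all elapsed times (in seconds) that occur in the given log text. */
    static double totalSeconds(String log) {
        double total = 0.0;
        Matcher m = ELAPSED.matcher(log);
        while (m.find()) {
            total += Double.parseDouble(m.group(1));
        }
        return total;
    }

    public static void main(String[] args) {
        // The ten elapsed times quoted in the issue description.
        double[] reported = {34.652, 55.797, 63.072, 65.628, 74.174,
                             93.878, 63.4, 44.179, 53.801, 124.365};
        StringBuilder log = new StringBuilder();
        for (double seconds : reported) {
            log.append("Tests run: 1, Time elapsed: ").append(seconds).append(" sec\n");
        }
        System.out.printf("summed suite time: %.3f sec%n", totalSeconds(log.toString()));
    }
}
```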
[jira] [Commented] (FLINK-1710) Expression API Tests take very long
[ https://issues.apache.org/jira/browse/FLINK-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487952#comment-14487952 ]

ASF GitHub Bot commented on FLINK-1710:
---------------------------------------

Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/584#issuecomment-91325622

    Looks good! Any concrete numbers on how the compile time changes?

    I assume this is now shipping Java code strings which get compiled in each function / serializer / comparator?
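The pattern asked about in this comment (ship source text, compile it where the function runs) can be illustrated with plain Java serialization. This is only a sketch under stated assumptions: `ShippedFunction` and `ToyCompiler` are hypothetical names, and `ToyCompiler` is a stand-in that understands nothing beyond `x + <int>`; it is not Janino and not the code from the pull request.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.IntUnaryOperator;

/** Hypothetical toy stand-in for a compiler backend: understands only "x + <int>". */
final class ToyCompiler {
    static IntUnaryOperator compile(String source) {
        String[] parts = source.split("\\+");
        int addend = Integer.parseInt(parts[1].trim());
        return x -> x + addend;
    }
}

/** Ships only the source string; the compiled form is rebuilt lazily after deserialization. */
final class ShippedFunction implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String source;

    // transient: skipped by Java serialization, so it is null on the receiving side.
    private transient IntUnaryOperator compiled;

    ShippedFunction(String source) {
        this.source = source;
    }

    boolean isCompiled() {
        return compiled != null;
    }

    int apply(int x) {
        if (compiled == null) {
            // First use on this side of the wire: compile the shipped source.
            compiled = ToyCompiler.compile(source);
        }
        return compiled.applyAsInt(x);
    }

    /** Simulates shipping by serializing to bytes and reading the object back. */
    static ShippedFunction roundTrip(ShippedFunction f) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(f);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ShippedFunction) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The trade-off in this shape is that every deserialized copy pays the compile cost once, which is only acceptable when the backend compiles quickly.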
[jira] [Commented] (FLINK-1710) Expression API Tests take very long
[ https://issues.apache.org/jira/browse/FLINK-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487285#comment-14487285 ]

ASF GitHub Bot commented on FLINK-1710:
---------------------------------------

GitHub user aljoscha opened a pull request:

    https://github.com/apache/flink/pull/584

    [FLINK-1710] [table] Switch compile backend to Janino

    This greatly reduces compile time while still supporting the same feature set.

    I added Janino in flink-bin/LICENSE and flink-bin/NOTICE. I hope this is correct. Can someone comment on this?

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/aljoscha/flink table-janino

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/584.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #584

commit 21d85679b4b8fb47c8b0200cc06d7a39b188be69
Author: Aljoscha Krettek aljoscha.kret...@gmail.com
Date:   2015-04-07T16:11:16Z

    [FLINK-1710] [table] Switch compile backend to Janino

    This greatly reduces compile time while still supporting the same feature set.
[jira] [Commented] (FLINK-1710) Expression API Tests take very long
[ https://issues.apache.org/jira/browse/FLINK-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371131#comment-14371131 ]

Aljoscha Krettek commented on FLINK-1710:
-----------------------------------------

But what should I ask? The Scala reflection API is simply not thread-safe in 2.10.
[jira] [Commented] (FLINK-1710) Expression API Tests take very long
[ https://issues.apache.org/jira/browse/FLINK-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371133#comment-14371133 ]

Stephan Ewen commented on FLINK-1710:
-------------------------------------

Maybe there are some known tweaks / best practices to
  - speed up compilation
  - serialize generated code and integrate it into class loaders
[jira] [Commented] (FLINK-1710) Expression API Tests take very long
[ https://issues.apache.org/jira/browse/FLINK-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371518#comment-14371518 ]

Aljoscha Krettek commented on FLINK-1710:
-----------------------------------------

This might sound a bit crazy, but we could switch to using Janino (http://docs.codehaus.org/display/JANINO/Home) as the compiler backend and completely remove the use of Scala Reflection.
[jira] [Commented] (FLINK-1710) Expression API Tests take very long
[ https://issues.apache.org/jira/browse/FLINK-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371543#comment-14371543 ]

Stephan Ewen commented on FLINK-1710:
-------------------------------------

That would be a major rewrite, I guess? Is this part not dependent on certain Scala compiler features?
[jira] [Commented] (FLINK-1710) Expression API Tests take very long
[ https://issues.apache.org/jira/browse/FLINK-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366933#comment-14366933 ]

Aljoscha Krettek commented on FLINK-1710:
-----------------------------------------

It is executed on the same data sets as the other tests. The problem is that the expressions are compiled to byte code at runtime, which takes time. For these short tests, where the data size is very small, we notice the additional compilation overhead. For real jobs there is no difference in runtime. I ran tests for that.
[jira] [Commented] (FLINK-1710) Expression API Tests take very long
[ https://issues.apache.org/jira/browse/FLINK-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366939#comment-14366939 ]

Stephan Ewen commented on FLINK-1710:
-------------------------------------

This compilation overhead is pretty massive. Two minutes for the page rank test, which otherwise takes 5 seconds or so, is huge.

I think that this means we need to invest in making this a bit better. If I understand it correctly, the code is currently compiled in every serializer and comparator? What about compiling it on the client (reusing the compiler instance) and then shipping the code?
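The "compile once and reuse" half of this suggestion can be sketched independently of where compilation happens. The following is a minimal illustration, not Flink code: `CompileCache` is a hypothetical name, and the "compiler" is whatever function the caller passes in. It does not address shipping the compiled result from client to workers.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical helper: caches the result of an expensive compile step, keyed by source text. */
final class CompileCache<T> {
    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> compiler;
    private final AtomicInteger compilations = new AtomicInteger();

    CompileCache(Function<String, T> compiler) {
        this.compiler = compiler;
    }

    /** Returns the compiled form of the source, invoking the compiler only on a cache miss. */
    T get(String source) {
        return cache.computeIfAbsent(source, src -> {
            compilations.incrementAndGet();
            return compiler.apply(src);
        });
    }

    /** Number of times the underlying compiler was actually invoked. */
    int compilations() {
        return compilations.get();
    }
}
```

With such a cache, a serializer and a comparator that are built from the same expression text would trigger one compilation between them instead of one each.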
[jira] [Commented] (FLINK-1710) Expression API Tests take very long
[ https://issues.apache.org/jira/browse/FLINK-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367273#comment-14367273 ]

Aljoscha Krettek commented on FLINK-1710:
-----------------------------------------

The more important problem could be that the Scala code generation stuff is not thread-safe in 2.10, so I have to use a lock to protect all code-generation code. I will look into whether this can be solved.
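The locking approach described in this comment serializes every code-generation call, which is one reason concurrently running tests would queue up behind each other. A minimal sketch of the pattern follows, assuming a single process-wide lock; `LockedCodeGen` and the demo helper are hypothetical names, and the wrapped generator is a placeholder rather than the Scala reflection toolbox.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/** Hypothetical wrapper: guards a non-thread-safe code generator with one global lock. */
final class LockedCodeGen<T> {
    // Static on purpose: the assumption is that the unsafe backend shares global state,
    // so all instances must exclude each other.
    private static final ReentrantLock CODEGEN_LOCK = new ReentrantLock();

    private final Function<String, T> unsafeGenerator;

    LockedCodeGen(Function<String, T> unsafeGenerator) {
        this.unsafeGenerator = unsafeGenerator;
    }

    T generate(String expression) {
        CODEGEN_LOCK.lock();
        try {
            return unsafeGenerator.apply(expression);
        } finally {
            CODEGEN_LOCK.unlock();
        }
    }

    /** Demo helper: calls the generator from several threads and reports the peak overlap seen. */
    static int observedPeakConcurrency(int threads, int callsPerThread) {
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        LockedCodeGen<String> gen = new LockedCodeGen<>(expr -> {
            int now = inFlight.incrementAndGet();
            peak.accumulateAndGet(now, Math::max);
            Thread.yield(); // give other threads a chance to overlap if the lock were missing
            inFlight.decrementAndGet();
            return "compiled(" + expr + ")";
        });
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int c = 0; c < callsPerThread; c++) {
                    gen.generate("a + b");
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return peak.get();
    }
}
```

The lock keeps the generator correct but turns compilation into a sequential bottleneck, so total compile time grows with the number of operators that need generated code.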
[jira] [Commented] (FLINK-1710) Expression API Tests take very long
[ https://issues.apache.org/jira/browse/FLINK-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367213#comment-14367213 ]

Aljoscha Krettek commented on FLINK-1710:
-----------------------------------------

It is currently being compiled in one of ExpressionJoinFunction, ExpressionFilterFunction or ExpressionSelectFunction. Compiling on the client is quite hard, since that would require a way of shipping the compiled code. This means finding out what files are generated by the Scala compiler, packing them into a jar and then shipping it along with the user-code-jar.
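The "packing them into a jar" step mentioned in this comment can be sketched with the JDK's `java.util.jar` classes. This is an illustration only: `GeneratedClassJar` is a hypothetical name, the class-file bytes are assumed to have been collected already, and discovering which files the Scala compiler emits (the hard part named above) is not covered.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

/** Hypothetical helper: packs already-collected class files into an in-memory jar. */
final class GeneratedClassJar {

    /** Keys are entry paths such as "gen/Expr1.class"; values are the class-file bytes. */
    static byte[] pack(Map<String, byte[]> classFiles) {
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
                for (Map.Entry<String, byte[]> file : classFiles.entrySet()) {
                    jar.putNextEntry(new JarEntry(file.getKey()));
                    jar.write(file.getValue());
                    jar.closeEntry();
                }
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Lists the entry names in the jar; JarInputStream consumes the manifest itself. */
    static List<String> entryNames(byte[] jarBytes) {
        List<String> names = new ArrayList<>();
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            JarEntry entry;
            while ((entry = jar.getNextJarEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```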