Github user uce commented on the pull request:
https://github.com/apache/flink/pull/729#issuecomment-106093530
This is a very cool! Thanks for all the effort you put into this. I think
this will be a great addition to the project. :-)
---
I've skimmed over the code and checked it out. Some initial feedback:
- **Naming UDF**. The feedback of committers giving talks about Flink some
time ago was that the name "UDF" was sometimes confusing. @rmetzger, can you
confirm this? We might take this into account and rename the `UdfAnalysisMode`
to something else, for example just `CodeAnalysisMode`.
- **Default mode**: the default mode is currently set to hinting. Running
the WordCount example, it didn't work out of the box though. After some
debugging I found this in the `UdfAnalyzer`, which prevented the analyis:
```java
if (internalUdfClassName.startsWith(EXCLUDED_CLASSPATH)
&&
!internalUdfClassName.startsWith("org/apache/flink/api/java/sca")) {
return false;
}
```
After commenting the second condition out, it worked fine. For a
WordCount with a reducer, which only touches the second field of the Tuple2, it
worked like a charm. The produced output looks like this:
```
23:23:49,207 INFO org.apache.flink.api.java.operators.UdfOperatorUtils
- Function 'org.apache.flink.examples.java.wordcount.WordCount$Tokenizer'
has been analyzed with the following result:
Number of object creations (should be kept to a minimum): 1 in method / 36
transitively
A need for forwarded fields annotations could not be found.
23:23:49,217 INFO org.apache.flink.api.java.operators.UdfOperatorUtils
- Function 'org.apache.flink.examples.java.wordcount.WordCount$1' has been
analyzed with the following result:
Number of object creations (should be kept to a minimum): 1 in method / 1
transitively
You could use the following annotation: @ForwardedFields("f0->f0;")
23:23:49,243 INFO org.apache.flink.api.java.operators.UdfOperatorUtils
- Function 'org.apache.flink.api.java.Utils$CollectHelper' has been
analyzed with the following result:
Number of object creations (should be kept to a minimum): 0 in method / 6
transitively
A need for forwarded fields annotations could not be found.
```
Very nice to see this working. This is great news for the optimizer. :-)
What do the transitive object creations refer to exactly? I'm wondering
how a user could influence them? Reusing a result object in the WordCount sum
reducer is correctly detected as 0 creations in the method as well.
I was wondering whether it could be possible to configure code analysis
on a per-function level. For example, all library related functions should not
print the hints imo.
---
I very much like this. Regarding merging this before the release: I would
vote to only merge it if we disable it by default. This will essentially affect
every program written in the upcoming Flink release and merging it without
proper review, testing, and exposure seems rushed to me. But I wouldn't veto
it, if everyone wants it in asap.
I think this feature alone would warrant a new 0.10 release soon after 0.9.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---