Github user uce commented on the pull request:

    https://github.com/apache/flink/pull/729#issuecomment-106093530
  
    This is a very cool! Thanks for all the effort you put into this. I think 
this will be a great addition to the project. :-)
    
    ---
    
    I've skimmed over the code and checked it out. Some initial feedback:
    - **Naming UDF**. The feedback of committers giving talks about Flink some 
time ago was that the name "UDF" was sometimes confusing. @rmetzger, can you 
confirm this? We might take this into account and rename the `UdfAnalysisMode` 
to something else, for example just `CodeAnalysisMode`.
    - **Default mode**: the default mode is currently set to hinting. Running 
the WordCount example, it didn't work out of the box though. After some 
debugging I found this in the `UdfAnalyzer`, which prevented the analyis:
      ```java
    if (internalUdfClassName.startsWith(EXCLUDED_CLASSPATH)
            && 
!internalUdfClassName.startsWith("org/apache/flink/api/java/sca")) {
            return false;
    }
    ```
      After commenting the second condition out, it worked fine. For a 
WordCount with a reducer, which only touches the second field of the Tuple2, it 
worked like a charm. The produced output looks like this:
    
      ```
    23:23:49,207 INFO  org.apache.flink.api.java.operators.UdfOperatorUtils     
     - Function 'org.apache.flink.examples.java.wordcount.WordCount$Tokenizer' 
has been analyzed with the following result:
    Number of object creations (should be kept to a minimum): 1 in method / 36 
transitively
    A need for forwarded fields annotations could not be found.
    
    23:23:49,217 INFO  org.apache.flink.api.java.operators.UdfOperatorUtils     
     - Function 'org.apache.flink.examples.java.wordcount.WordCount$1' has been 
analyzed with the following result:
    Number of object creations (should be kept to a minimum): 1 in method / 1 
transitively
    You could use the following annotation: @ForwardedFields("f0->f0;")
    
    23:23:49,243 INFO  org.apache.flink.api.java.operators.UdfOperatorUtils     
     - Function 'org.apache.flink.api.java.Utils$CollectHelper' has been 
analyzed with the following result:
    Number of object creations (should be kept to a minimum): 0 in method / 6 
transitively
    A need for forwarded fields annotations could not be found.
    ```
      Very nice to see this working. This is great news for the optimizer. :-)
    
      What do the transitive object creations refer to exactly? I'm wondering 
how a user could influence them? Reusing a result object in the WordCount sum 
reducer is correctly detected as 0 creations in the method as well.
    
      I was wondering whether it could be possible to configure code analysis 
on a per-function level. For example, all library related functions should not 
print the hints imo.
    
    ---
    
    I very much like this. Regarding merging this before the release: I would 
vote to only merge it if we disable it by default. This will essentially affect 
every program written in the upcoming Flink release and merging it without 
proper review, testing, and exposure seems rushed to me. But I wouldn't veto 
it, if everyone wants it in asap.
    
    I think this feature alone would warrant a new 0.10 release soon after 0.9.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

Reply via email to