belugabehr commented on a change in pull request #1067:
URL: https://github.com/apache/hive/pull/1067#discussion_r436092979



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java
##########
@@ -990,24 +955,17 @@ private void flushHashTable(boolean complete) throws 
HiveException {
     // changed in the future
 
     if (complete) {
-      Iterator<Map.Entry<KeyWrapper, AggregationBuffer[]>> iter = 
hashAggregations
-          .entrySet().iterator();
-      while (iter.hasNext()) {
-        Map.Entry<KeyWrapper, AggregationBuffer[]> m = iter.next();
-        forward(m.getKey().getKeyArray(), m.getValue());
+      for (Map.Entry<KeyWrapper, AggregationBuffer[]> entry : 
hashAggregations.entrySet()) {
+        forward(entry.getKey().getKeyArray(), entry.getValue());
       }
-      hashAggregations.clear();
       hashAggregations = null;
-      if (LOG.isInfoEnabled()) {
-        LOG.info("Hash Table completed flushed");
-      }
+      LOG.info("Hash Table completed flushed");
       return;
     }
 
     int oldSize = hashAggregations.size();
-    if (LOG.isInfoEnabled()) {
-      LOG.info("Hash Tbl flush: #hash table = " + oldSize);
-    }
+    LOG.info("Hash Tbl flush: #hash table = " + oldSize);

Review comment:
       Thanks for chiming in @pgaref 
   
   I love that you're passionate about using the parameters.  I usually like to 
do this, but it's 50-50 on who likes to see it and who not.  My point is that 
using parameters:
   
   > can significantly boost logging performance for _disabled_ logging 
statement.
   
   The problem is that it adds overhead when there are _enabled_ logging 
statements.  For an INFO level logging, it's almost certainly enabled in a 
production environment so using the anchor ({}) just adds needless overhead.
   
   Think about it, what is faster?
   
   ```
   public static void main(String[] args) {
      info1("foo: ", "bar");
      info2("foo: {}", "bar");
   }
   
   void info1(String s, String param) {
     System.out.println(s.concat(param));
   }
   
   void info2(String s, String param) {
      int index = s.lastIndexOf("{");
      if (index < 0) {
        System.out.println(s);
      } else {
         if (s.lastIndexOf{"}") == (index + 1) {
            String firstHalf= s.subString(0, index);
            String secondHalf = s.subString(index + 2) s.length());
           System.out.println(firstHalf + param + secondHalf);
       } else {
           System.out.println(s);
       }
   }
   ```
   
   Clearly the second option is going to be slower.  `Info1` generates a string 
certainly, and `Info2` does not.  However, if the application is almost always 
going to be running at the INFO level, it doesn't really make a lot of sense to 
have to always go down `Info2`.
   
   That said, there is certainly value in having consistency (always using 
parameters), and I think messages are easier to read when formatted this way 
than when they are all piecemeal with the "+" signs, but if you want to 
understand the raw performance, then what is currently there is preferable in 
an environment where the log level is set to a minimum of INFO (which almost 
every environment I've worked on is).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to