zabetak opened a new pull request, #5159:
URL: https://github.com/apache/hive/pull/5159

   ### Why are the changes needed?
   
   When BITVECTOR or KLL stats are disabled or not present in the metastore, 
the following message may appear far too often in the HMS logs.
   ```
   2024-03-22T01:50:57,849 DEBUG [CachedStore-CacheUpdateService: Thread-240] metastore.MetastoreDirectSqlUtils: Expected blob type but got java.lang.String
   ```
   In fact, in some cases the message appears more than once for every 
partition in the table(s) being queried. When the number of partitions is 
large, it can easily clog the logs with redundant and useless information.
   
   To put things in perspective, while running cbo_query10.q on the 
statistics of the TPC-DS 30TB dataset, the message occupies more than 50% 
(26MB) of the total log file (46MB).
   
   The presence of the message does not tell us much on its own. In 
conjunction with the code, we can infer that we are not fetching BITVECTOR/KLL 
stats from the metastore, but this could be reported in a different place 
without having to print the same message 170K times.
   
   Removing this message saves disk space, avoids frequent log rotation, and 
improves the overall readability of the log file.
   
   There is another redundant message, which appears when transforming a 
database value to Boolean. It is redundant because it is followed directly by 
an exception, so there is no reason to have both. This message may not appear 
as often as the previous one, but given that it doesn't add much value, it can 
also be removed.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### Is the change a dependency upgrade?
   No
   
   ### How was this patch tested?
   Ran cbo_query10.q before and after the changes and checked the 
occurrences/size of the message.
   ```
   $ mvn test -Dtest=TestTezTPCDS30TBPerfCliDriver -Dqfile=cbo_query10.q
   $ grep -a "Expected blob type but got java.lang.String" target/tmp/log/hive.log | wc -c
   26129538
   $ wc -c target/tmp/log/hive.log 
   46959003 target/tmp/log/hive.log
   ```
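
   The two byte counts above can be turned into a line count and a percentage 
in one step. The following is a minimal sketch, not part of the patch: the 
function name `log_share` is hypothetical, and it only assumes a POSIX shell 
with `grep`, `wc`, and `awk` available.

   ```shell
   #!/bin/sh
   # Hypothetical helper (not part of the patch): report how many lines of a
   # log file contain a fixed string and what share of the file's bytes they use.
   log_share() {
     pattern=$1
     file=$2
     # -a treats the log as text even if it contains binary bytes,
     # -F matches the pattern literally instead of as a regular expression.
     # grep exits with status 1 when nothing matches, hence the "|| true".
     count=$(grep -a -c -F -- "$pattern" "$file" || true)
     matched=$(grep -a -F -- "$pattern" "$file" | wc -c)
     total=$(wc -c < "$file")
     awk -v c="$count" -v m="$matched" -v t="$total" \
       'BEGIN { printf "%d lines, %d of %d bytes (%.1f%%)\n", c, m, t, (t > 0 ? 100 * m / t : 0) }'
   }
   ```

   It would be called as 
`log_share "Expected blob type but got java.lang.String" target/tmp/log/hive.log`. 
With the numbers reported above (26129538 of 46959003 bytes), the share works 
out to about 55.6%.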


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

