[ https://issues.apache.org/jira/browse/HIVE-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757646#action_12757646 ]

Jeff Hammerbacher commented on HIVE-224:
----------------------------------------

Hey Joy,

Out of curiosity, did you guys ever look at this issue further?

Thanks,
Jeff

> implement lfu based flushing policy for map side aggregates
> -----------------------------------------------------------
>
>                 Key: HIVE-224
>                 URL: https://issues.apache.org/jira/browse/HIVE-224
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: Joydeep Sen Sarma
>
> currently we flush some random set of rows when the map side hash table
> approaches memory limits.
> we have discussed a strategy of flushing hash table entries that have
> been seen the least number of times (effectively LFU flushing strategy). This
> will be very effective at reducing the amount of data sent from map to reduce
> step - as well as reduce the chances for any skews.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.