DefaultTuple underestimate the memory footprint for string
----------------------------------------------------------

                 Key: PIG-1443
                 URL: https://issues.apache.org/jira/browse/PIG-1443
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.7.0
            Reporter: Daniel Dai
            Assignee: Daniel Dai
             Fix For: 0.8.0


Currently, DefaultTuple estimates the memory footprint of a string as if it were a char array, using the formula length * 2 + 12. This underestimates the actual memory usage. Here are the real memory footprints of strings, taken from a memory dump:

| length of string | memory in bytes |
| 7 | 56 |
| 3 | 48 |
| 1 | 40 |

After some searching, I found that the following formula accurately estimates the memory footprint of a string:
{code}
8 * (int) (((length * 2) + 45) / 8)
{code}
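
To make the gap concrete, here is a minimal sketch that evaluates both formulas against the three measurements quoted in this issue. The class and method names (StringFootprint, currentEstimate, proposedEstimate) are made up for illustration and are not part of Pig's API. The sketch also assumes length is an integer, in which case the division already truncates and the (int) cast in the formula is redundant.

```java
public class StringFootprint {
    // Estimate described in this issue as the current behaviour:
    // the string is treated as a bare char array (2 bytes per char + 12).
    static long currentEstimate(int length) {
        return length * 2L + 12;
    }

    // Estimate proposed in this issue. Integer division truncates,
    // so the result is always rounded down to a multiple of 8 bytes.
    static long proposedEstimate(int length) {
        return 8L * ((length * 2L + 45) / 8);
    }

    public static void main(String[] args) {
        // Sample lengths and measured sizes copied from the table in this issue.
        int[] lengths = {7, 3, 1};
        long[] measured = {56, 48, 40};

        System.out.println("length\tmeasured\tcurrent\tproposed");
        for (int i = 0; i < lengths.length; i++) {
            System.out.printf("%d\t%d\t%d\t%d%n",
                    lengths[i],
                    measured[i],
                    currentEstimate(lengths[i]),
                    proposedEstimate(lengths[i]));
        }
    }
}
```

For lengths 7, 3, and 1 the current formula gives 26, 18, and 14 bytes, while the proposed formula gives 56, 48, and 40 bytes, matching the memory dump. Three data points from one JVM do not show that the formula holds for every string length or JVM configuration.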