[ https://issues.apache.org/jira/browse/HIVE-8347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alan Gates updated HIVE-8347:
-----------------------------
       Resolution: Fixed
    Fix Version/s: 0.15.0
         Assignee: Alan Gates
           Status: Resolved  (was: Patch Available)

Patch checked into trunk. Thanks Mariappan for the patch.

Note, this should be assigned to Mariappan Asokan, but as Mariappan is not in the contributor list I couldn't do that. JIRA seemed to want it to be assigned to someone, so I assigned it to me. If one of the JIRA admins adds Mariappan to the contributor list, we can properly assign the JIRA.

> Use base-64 encoding instead of custom encoding for serialized objects
> ----------------------------------------------------------------------
>
>                 Key: HIVE-8347
>                 URL: https://issues.apache.org/jira/browse/HIVE-8347
>             Project: Hive
>          Issue Type: Improvement
>          Components: HCatalog
>    Affects Versions: 0.13.1
>            Reporter: Mariappan Asokan
>            Assignee: Alan Gates
>             Fix For: 0.15.0
>
>         Attachments: HIVE-8347.patch
>
>
> Serialized objects that are shipped via Hadoop {{Configuration}} are encoded
> using a custom encoding (see {{HCatUtil.encodeBytes()}} and its complement
> {{HCatUtil.decodeBytes()}}) which has 100% overhead. In other words, each
> byte in the serialized object becomes 2 bytes after encoding. This may be
> one of the reasons for the problem reported in HCATALOG-453. The patch for
> HCATALOG-453 compressed serialized {{InputJobInfo}} objects to solve the
> problem.
> By using Base64 encoding, the overhead will be reduced to about 33%. This
> will alleviate the problem for all serialized objects.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
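
To illustrate the overhead figures quoted in the issue description (100% for an encoding that emits two characters per byte, about 33% for Base64), here is a minimal sketch. The class {{EncodingOverhead}} and the method {{twoCharsPerByte()}} are hypothetical stand-ins written for this comparison, not the actual {{HCatUtil}} code, and the use of {{java.util.Base64}} (Java 8+) is an assumption; the attached patch may rely on a different Base64 implementation.

```java
import java.util.Arrays;
import java.util.Base64;

class EncodingOverhead {

    // Hypothetical stand-in for a two-characters-per-byte encoding:
    // each 4-bit half of a byte is mapped to one letter in 'a'..'p'.
    static String twoCharsPerByte(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append((char) ('a' + ((b >> 4) & 0xF)));
            sb.append((char) ('a' + (b & 0xF)));
        }
        return sb.toString();
    }

    // Base64 emits 4 characters for every 3 input bytes (plus padding).
    static String base64(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    static byte[] fromBase64(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        // Stand-in for a serialized object; the contents do not affect the size.
        byte[] payload = new byte[3000];
        Arrays.fill(payload, (byte) 0x5A);

        String custom = twoCharsPerByte(payload);
        String b64 = base64(payload);

        System.out.println("raw bytes:          " + payload.length);
        System.out.println("two chars per byte: " + custom.length());
        System.out.println("Base64:             " + b64.length());
        System.out.println("round trip ok:      "
                + Arrays.equals(payload, fromBase64(b64)));
    }
}
```

For a 3000-byte payload the two-characters-per-byte form is 6000 characters long, while the Base64 form is 4000, which matches the 100% and roughly 33% figures above. The saving matters here because the encoded string is carried as a value in the job {{Configuration}}.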