Gopal V created HIVE-7144:
-----------------------------

             Summary: GC pressure during ORC StringDictionary writes
                 Key: HIVE-7144
                 URL: https://issues.apache.org/jira/browse/HIVE-7144
             Project: Hive
          Issue Type: Bug
          Components: File Formats
    Affects Versions: 0.14.0
         Environment: ORC Table ~ 12 string columns
            Reporter: Gopal V
            Assignee: Gopal V
         Attachments: orc-string-write.png

When the ORC string dictionary writes data out, it suffers from poor GC performance because of a few allocations made inside the loop.

!orc-string-write.png!

The conversions are as follows.

StringTreeWriter::getStringValue() causes 2 conversions:

LazyString -> Text (LazyString::getWritableObject)
Text -> String (LazyStringObjectInspector::getPrimitiveJavaObject)

Then StringRedBlackTree::add() does one conversion:

String -> Text

This causes some GC pressure through unnecessary String and byte[] allocations.
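
The conversion chain described above can be illustrated with a small standalone sketch. This is not Hive code and not the proposed fix: Utf8Buffer is a hypothetical stand-in for a reusable UTF-8 buffer such as Text, and viaString/direct are made-up names. It only assumes that the data is valid UTF-8, in which case the round trip through String yields the same bytes while allocating a String and a byte[] per value.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ConversionSketch {
    /** Hypothetical stand-in for a reusable UTF-8 byte buffer (similar in spirit to Text). */
    static final class Utf8Buffer {
        byte[] bytes = new byte[0];
        int length = 0;

        /** Copies len bytes from src, growing the backing array only when it is too small. */
        void set(byte[] src, int len) {
            if (bytes.length < len) {
                bytes = new byte[len];
            }
            System.arraycopy(src, 0, bytes, 0, len);
            length = len;
        }

        /** Returns a trimmed copy of the current contents (used here only for comparison). */
        byte[] copy() {
            return Arrays.copyOf(bytes, length);
        }
    }

    /** Round trip as in the report: bytes -> String -> bytes, two allocations per value. */
    static byte[] viaString(Utf8Buffer in) {
        String s = new String(in.bytes, 0, in.length, StandardCharsets.UTF_8); // new String
        return s.getBytes(StandardCharsets.UTF_8);                             // new byte[]
    }

    /** Direct path: copy the bytes into a buffer that is reused across values. */
    static void direct(Utf8Buffer in, Utf8Buffer out) {
        out.set(in.bytes, in.length);
    }

    public static void main(String[] args) {
        Utf8Buffer in = new Utf8Buffer();
        Utf8Buffer out = new Utf8Buffer();
        for (String value : new String[] {"alpha", "beta", "gamma"}) {
            byte[] raw = value.getBytes(StandardCharsets.UTF_8);
            in.set(raw, raw.length);
            direct(in, out);
            // Both paths should carry the same bytes for valid UTF-8 input.
            System.out.println(value + " same bytes: "
                    + Arrays.equals(viaString(in), out.copy()));
        }
    }
}
```

In the sketch the direct path allocates only when the reused buffer has to grow, which is the kind of saving the report is after. Whether the actual writer can take the bytes directly depends on the Hive code paths named above.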