Joe McDonnell created IMPALA-9413:
-------------------------------------
Summary: Calculation of result set memory usage is wrong when
using GCC7 with new ABI
Key: IMPALA-9413
URL: https://issues.apache.org/jira/browse/IMPALA-9413
Project: IMPALA
Issue Type: Bug
Components: Backend
Affects Versions: Impala 3.4.0
Reporter: Joe McDonnell
Attachments: stringcapacity.cc
GCC 5+ uses a new ABI for std::string that includes a small string optimization.
This lets it avoid a separate memory allocation for strings of up to 15
characters by storing them inside the string object itself. For such strings,
string.capacity() returns 15 even though the string uses only sizeof(string)
bytes, so memory usage calculations that add sizeof(string) +
string.capacity() overcount. This happens in the query result set:
[https://github.com/apache/impala/blob/master/be/src/service/query-result-set.cc#L225-L232]
[https://github.com/apache/impala/blob/master/be/src/service/query-result-set.cc#L239-L241]
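One possible way to make the accounting independent of the ABI is sketched here. It is not the code in query-result-set.cc, and the helper names UsesInlineBuffer and StringMemoryUsage are made up for illustration. The sketch assumes that a string using the small string optimization has data() pointing inside the string object itself, and it counts capacity() only when that is not the case. Like the existing calculation, it ignores any allocator or header overhead.

```cpp
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical helper: returns true when the characters of 's' are stored
// inside the std::string object itself (small string optimization) rather
// than in a separate heap buffer.
bool UsesInlineBuffer(const std::string& s) {
  const char* begin = reinterpret_cast<const char*>(&s);
  const char* end = begin + sizeof(std::string);
  const char* data = s.data();
  // std::less gives a well-defined total order even for unrelated pointers.
  std::less<const char*> lt;
  return !lt(data, begin) && lt(data, end);
}

// Hypothetical helper: estimated bytes used by 's'. The object itself is
// always counted; capacity() is added only when the characters live in a
// separate buffer.
int64_t StringMemoryUsage(const std::string& s) {
  int64_t bytes = static_cast<int64_t>(sizeof(std::string));
  if (!UsesInlineBuffer(s)) bytes += static_cast<int64_t>(s.capacity());
  return bytes;
}
```

With the old ABI, data() never points into the string object, so this reduces to the existing sizeof(string) + string.capacity() calculation.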
At the moment, Impala uses GCC 4.9.2, which does not have this optimization, so
this is only a problem when we switch to the new ABI.
I have attached a simple C++ file to demonstrate the difference. On GCC 4.9.2,
the output is:
{noformat}
joe@joemcdonnell:~/view2/Impala/stringcapacity$ ./a.out
init short_string
[Allocating 30 bytes]
sizeof(short_string): 8
short_string.size(): 5
short_string.capacity(): 5
init long_string
[Allocating 54 bytes]
sizeof(long_string): 8
long_string.size(): 29
long_string.capacity(): 29
{noformat}
On GCC 5.4.0:
{noformat}
init short_string
sizeof(short_string): 32
short_string.size(): 5
short_string.capacity(): 15
init long_string
[Allocating 30 bytes]
sizeof(long_string): 32
long_string.size(): 29
long_string.capacity(): 29
{noformat}
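For readers without access to the attachment, a comparison like the one above could be produced roughly as follows. This is a minimal sketch and not the contents of stringcapacity.cc: it assumes that replacing the global operator new is an acceptable way to observe allocations, and the names g_bytes_allocated and ReportString are hypothetical.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>
#include <string>

// Running total of bytes requested through the global operator new.
static std::size_t g_bytes_allocated = 0;

// Replacement global allocation function that records each request.
void* operator new(std::size_t size) {
  g_bytes_allocated += size;
  void* p = std::malloc(size == 0 ? 1 : size);
  if (p == nullptr) throw std::bad_alloc();
  return p;
}

void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Hypothetical helper: constructs a std::string from 'text', prints its
// sizeof, size() and capacity(), and returns the number of heap bytes
// requested during construction.
std::size_t ReportString(const char* label, const char* text) {
  std::size_t before = g_bytes_allocated;
  std::string s(text);
  std::size_t heap_bytes = g_bytes_allocated - before;
  std::printf("%s (\"%s\"): sizeof=%zu size=%zu capacity=%zu heap_bytes=%zu\n",
              label, s.c_str(), sizeof(s), s.size(), s.capacity(), heap_bytes);
  return heap_bytes;
}
```

Calling ReportString("short_string", "hello") and then ReportString with a 29-character string should show whether the short string triggers a heap allocation on a given compiler and ABI.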