Joe McDonnell created IMPALA-9413:
-------------------------------------

             Summary: Calculation of result set memory usage is wrong when 
using GCC7 with new ABI
                 Key: IMPALA-9413
                 URL: https://issues.apache.org/jira/browse/IMPALA-9413
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 3.4.0
            Reporter: Joe McDonnell
         Attachments: stringcapacity.cc

GCC 5+ uses a new ABI for std::string that includes a small string 
optimization. Strings of up to 15 characters are stored inside the string 
object itself, which avoids a separate heap allocation. For such strings, 
string.capacity() is 15 even though the string occupies only sizeof(string) 
bytes, so calculations of memory usage that add sizeof(string) + 
string.capacity() overcount short strings. This happens in the query result set:

[https://github.com/apache/impala/blob/master/be/src/service/query-result-set.cc#L225-L232]

[https://github.com/apache/impala/blob/master/be/src/service/query-result-set.cc#L239-L241]

At the moment, Impala uses GCC 4.9.2, which does not have this optimization, so 
this is only a problem when we switch to the new ABI.
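
One way the accounting could be made independent of the ABI is to add 
capacity() only when the character data lives outside the string object. The 
following is a minimal sketch, not the code in query-result-set.cc; the helper 
names UsesInlineBuffer and EstimateStringBytes are hypothetical, and the 
estimate deliberately ignores the null terminator, allocator overhead, and the 
old ABI's reference-counted header.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical helper: returns true if the string's character data is stored
// inside the std::string object itself (small string optimization) rather
// than in a separately allocated buffer.
inline bool UsesInlineBuffer(const std::string& s) {
  const char* obj_begin = reinterpret_cast<const char*>(&s);
  const char* obj_end = obj_begin + sizeof(std::string);
  const char* data = s.data();
  // std::less / std::greater_equal give a total order over pointers, which
  // the built-in operators do not guarantee for unrelated objects.
  return std::greater_equal<const char*>()(data, obj_begin) &&
         std::less<const char*>()(data, obj_end);
}

// Hypothetical helper: estimated bytes used by one string. The external
// buffer is counted only when the data is not stored inline.
inline std::int64_t EstimateStringBytes(const std::string& s) {
  std::int64_t bytes = static_cast<std::int64_t>(sizeof(std::string));
  if (!UsesInlineBuffer(s)) bytes += static_cast<std::int64_t>(s.capacity());
  return bytes;
}
```

With the new ABI, EstimateStringBytes() on a 5-character string should return 
sizeof(std::string), while a 29-character string should still be charged 
sizeof(std::string) + capacity().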

I have attached a simple C++ file to demonstrate the difference. On GCC 4.9.2, 
the output is:
{noformat}
joe@joemcdonnell:~/view2/Impala/stringcapacity$ ./a.out
init short_string
[Allocating 30 bytes]
sizeof(short_string): 8
short_string.size(): 5
short_string.capacity(): 5

init long_string
[Allocating 54 bytes]
sizeof(long_string): 8
long_string.size(): 29
long_string.capacity(): 29
{noformat}
On GCC 5.4.0:
{noformat}
init short_string
sizeof(short_string): 32
short_string.size(): 5
short_string.capacity(): 15

init long_string
[Allocating 30 bytes]
sizeof(long_string): 32
long_string.size(): 29
long_string.capacity(): 29
{noformat}
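
For readers without the attachment, the "[Allocating N bytes]" lines above can 
be approximated by replacing the global operator new. The following is a 
sketch under that assumption and is not the attached stringcapacity.cc; the 
names g_new_bytes and HeapBytesForString are hypothetical, and the counter is 
not thread-safe.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Running total of bytes requested through the global operator new.
// Not thread-safe; adequate for a single-threaded demonstration.
static std::size_t g_new_bytes = 0;

// Replacement for the global operator new that records each request size.
void* operator new(std::size_t size) {
  g_new_bytes += size;
  void* p = std::malloc(size == 0 ? 1 : size);
  if (p == nullptr) throw std::bad_alloc();
  return p;
}

// Matching replacements so memory obtained from malloc is released by free.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Hypothetical helper: bytes requested from the heap while constructing a
// std::string from the given text.
std::size_t HeapBytesForString(const char* text) {
  const std::size_t before = g_new_bytes;
  std::string s(text);
  volatile std::size_t sink = s.size();  // keep the string observably used
  (void)sink;
  return g_new_bytes - before;
}
```

Based on the output above, a 5-character string would be expected to request 
30 bytes with the old ABI and nothing with the new ABI, while a 29-character 
string would request 54 and 30 bytes respectively.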


