[ 
https://issues.apache.org/jira/browse/IMPALA-12373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17755950#comment-17755950
 ] 

Zoltán Borók-Nagy commented on IMPALA-12373:
--------------------------------------------

Thanks for pointing this out, Daniel!

It's interesting that libc++ and Folly both take a similar approach:
 * [https://github.com/llvm-mirror/libcxx/blob/78d6a7767ed57b50122a161b91f59f19c9bd0d19/include/string#L1420]
 * [https://github.com/facebook/folly/blob/db9fd491eacdbef14aba095dd9ffd1b7e51e2689/folly/FBString.h#L612]

Though I guess libc++, being part of the implementation itself, is free to 
depend on implementation-specific behavior.

On StackOverflow there's a question about Folly on this subject:
 * [https://stackoverflow.com/questions/45900169/does-fbstrings-small-string-optimization-rely-on-undefined-behavior]

The answer seems correct to me, so in {{is_small()}} we could have something like this:
{noformat}
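  // Inspect the last byte of the object through a char pointer, which is
  // allowed to alias an object of any type.
  bool is_small() const {
    char last_char = reinterpret_cast<const char*>(this)[sizeof(*this) - 1];
    return (last_char & 0b10000000) != 0;
  }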
{noformat}
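
To make the idea a bit more concrete, here is a minimal, self-contained sketch of how such a check could fit into a type. Everything in it is an assumption for illustration: {{SmallStringSketch}}, its 12-byte layout, and the constants {{kSmallCapacity}} and {{kSmallFlag}} are hypothetical names, not the actual StringValue code or the attached small_string.cpp. It keeps the representation in a raw byte array and uses {{memcpy}} for the pointer, so no inactive union member is ever read, and it writes the length bytes in a fixed order so that the flag bit always lives in the last byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical type for illustration only; not Impala's StringValue.
class SmallStringSketch {
 public:
  static constexpr std::size_t kPtrSize = sizeof(const char*);
  static constexpr std::size_t kSize = kPtrSize + 4;       // pointer + 4-byte length
  static constexpr std::size_t kSmallCapacity = kSize - 1;  // last byte holds flag + length
  static constexpr unsigned char kSmallFlag = 0b10000000;

  // Does not take ownership of 'data'. Long strings keep pointing at it,
  // small strings are copied into the object itself.
  SmallStringSketch(const char* data, std::uint32_t len) {
    std::memset(bytes_, 0, kSize);
    if (len <= kSmallCapacity) {
      std::memcpy(bytes_, data, len);
      bytes_[kSize - 1] = static_cast<unsigned char>(kSmallFlag | len);
    } else {
      // The top bit of the length must stay clear so it cannot be
      // mistaken for the small flag.
      assert(len < (1u << 31));
      std::memcpy(bytes_, &data, kPtrSize);
      // Store the length least-significant byte first, so its most
      // significant byte is the last byte of the object on any platform.
      for (int i = 0; i < 4; ++i) {
        bytes_[kPtrSize + i] = static_cast<unsigned char>(len >> (8 * i));
      }
    }
  }

  // Same check as in the snippet above: look at the object's last byte
  // through a char-typed pointer and test the top bit.
  bool is_small() const {
    const unsigned char last =
        reinterpret_cast<const unsigned char*>(this)[sizeof(*this) - 1];
    return (last & kSmallFlag) != 0;
  }

  std::uint32_t size() const {
    if (is_small()) return bytes_[kSize - 1] & 0x7F;
    std::uint32_t len = 0;
    for (int i = 0; i < 4; ++i) {
      len |= static_cast<std::uint32_t>(bytes_[kPtrSize + i]) << (8 * i);
    }
    return len;
  }

  const char* data() const {
    if (is_small()) return reinterpret_cast<const char*>(bytes_);
    const char* p;
    std::memcpy(&p, bytes_, kPtrSize);
    return p;
  }

  std::string str() const { return std::string(data(), size()); }

 private:
  unsigned char bytes_[kSize];
};

static_assert(sizeof(SmallStringSketch) == SmallStringSketch::kSize,
              "the flag byte must be the last byte of the object");
```

With this layout, a string of up to {{kSmallCapacity}} bytes is stored inline, and anything longer keeps the pointer plus a length whose top bit is zero. The real change would of course have to match whatever layout and length limits StringValue ends up with.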

> Implement Small String Optimization for StringValue
> ---------------------------------------------------
>
>                 Key: IMPALA-12373
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12373
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: Performance
>         Attachments: small_string.cpp
>
>
> Implement Small String Optimization for StringValue.
> Current memory layout of StringValue is:
> {noformat}
>   char* ptr;  // 8 bytes
>   int len;    // 4 bytes
> {noformat}
> For small strings with size up to 8 we could store the string contents in the 
> bytes of the 'ptr'. Something like this:
> {noformat}
>   union {
>     char* ptr;
>     char small_buf[sizeof(ptr)];
>   };
>   int len;
> {noformat}
> Many C++ string implementations use the {{Small String Optimization}} to 
> speed up work with small strings. For example:
> {code:java}
> Microsoft STL, libstdc++, libc++, Boost, Folly.{code}


