[ 
https://issues.apache.org/jira/browse/THRIFT-5497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475247#comment-17475247
 ] 

Mario Emmenlauer commented on THRIFT-5497:
------------------------------------------

In principle I like this idea. I've also been considering how to best deal with 
binary data in `std::string`.

When addressing this, I have a use case that is a bit different: I'd like to 
avoid a memcopy between thrift and user code. For larger binary data this can 
be relevant, and there is no practical reason why to have it in the first 
place. Basically its quite unlikely that users will keep their binary data in 
`std::string`, so its just a temporary container for having a simple interface. 
While there is nothing totally wrong with that, it does come with the two 
downsides that
 * Its not very understandable that binary data would be contained in 
`std::string`, and
 * The data can not be consumed without a full memory copy

I have not found the time to consider the "ideal" solution yet. Instead of an 
stl container, it would be fairly straightforward to design a simple binary 
container that allows to swap out the data pointer. But that may be a bit 
intrusive on downstream users?

Ideas?

> Change the C++ representation of binary be std::basic_string<unsigned_char>
> ---------------------------------------------------------------------------
>
>                 Key: THRIFT-5497
>                 URL: https://issues.apache.org/jira/browse/THRIFT-5497
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Richard H. Gumpertz
>            Priority: Minor
>
> I would like to change the C++ representation of binary from "std::string" to 
> "std::basic_string<unsigned_char>".  This would allow C++ templates to 
> distinguish the Thrift type "string" from "binary".
> If nothing else, ::apached::thrift::to_string could print binary differently 
> than string.  Currently, it can get pretty ugly when one calls printTo on a 
> struct containing a binary field that has data that is not readable!
> There is an issue as to how they should be displayed by default; using 
> \xDE\xAD\xBE\xEF with a \ before every byte seems ugly.  Alternativlely, 
> "DEADBEEF"_x because it could also be used as input if the C++ operator""_x 
> is defined by the user?  Alternatively just <binary> or <27-byte binary> 
> (mirroring <null> for unset fields), hiding the contents?  Perhaps leave 
> to_string(std::basic_string<unsigned char> &value) undefined and force the 
> user to define it if needed?  Thoughts & suggestions???
> Given that such a change could break existing code, it should probably be 
> enabled under a new thrift compiler option to --gen cpp.  I haven't come up 
> with a good name for such an option yet: --gen cpp:binary_as_unsigned_string 
> is explicit but wordy.  Any suggestions???
> I would be willing to do the work necessary to add the relevant code if there 
> are no volunteers who have a better understanding of precisely what code 
> would have to be updated (or at least help me find such code)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to