Hello,
I’m rather new to Google Protobufs and this discussion group, so if it
seems I’m headed in the wrong direction, feedback is welcome! I’ve done
some searching to see if there have been related discussions before, and I
found a few related things, but nothing that quite tackles my concerns
directly. So hopefully I’m not regurgitating old topics.
I’m looking to integrate protobufs into the data path of a messaging
product, where the product will produce, and therefore serialize, but not
consume (or parse) protobuf messages.
The data path makes use of chained buffers, so it seems clear that the
right way to expose it to protobuf is to implement the ZeroCopyOutputStream
interface so that it can be provided to a CodedOutputStream. So far, so
good.
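To make the chained-buffer idea concrete, here is a rough sketch of the shape such a stream could take. It is only an illustration under assumptions: Segment, OutputStreamLike and ChainedBufferOutputStream are hypothetical names, and OutputStreamLike is a local stand-in that mirrors the Next/BackUp/ByteCount methods of google::protobuf::io::ZeroCopyOutputStream so the sketch compiles without protobuf. A real implementation would derive from the library's class instead.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical fixed-size link in a buffer chain; nothing here touches the heap.
struct Segment {
  std::array<std::uint8_t, 64> bytes{};
  int used = 0;  // number of bytes currently claimed by the writer
};

// Local stand-in (assumption) mirroring the three methods of
// google::protobuf::io::ZeroCopyOutputStream.
class OutputStreamLike {
 public:
  virtual ~OutputStreamLike() = default;
  virtual bool Next(void** data, int* size) = 0;
  virtual void BackUp(int count) = 0;
  virtual std::int64_t ByteCount() const = 0;
};

class ChainedBufferOutputStream : public OutputStreamLike {
 public:
  ChainedBufferOutputStream(Segment* segments, int count)
      : segments_(segments), count_(count) {}

  // Hands out the next whole segment; returns false once the chain is used up.
  bool Next(void** data, int* size) override {
    if (next_ == count_) return false;
    Segment& s = segments_[next_++];
    s.used = static_cast<int>(s.bytes.size());
    *data = s.bytes.data();
    *size = s.used;
    total_ += s.used;
    return true;
  }

  // Gives back the unused tail of the segment most recently handed out.
  void BackUp(int count) override {
    assert(next_ > 0);
    Segment& s = segments_[next_ - 1];
    assert(count >= 0 && count <= s.used);
    s.used -= count;
    total_ -= count;
  }

  // Bytes handed out so far, minus anything backed up.
  std::int64_t ByteCount() const override { return total_; }

 private:
  Segment* segments_;
  int count_;
  int next_ = 0;
  std::int64_t total_ = 0;
};
```

In this sketch the chain is pre-allocated by the caller, so running out of segments is reported by Next returning false rather than by allocating more.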
From here, the ideal interface, it seems to me, is
google::protobuf::internal::WireFormatLite and its static Write... methods.
I don’t view the Message or MessageLite interfaces as adding a lot of value
for what is needed, which is to format a collection of data from numerous
sources into protobuf-formatted bytes in a chain of buffers. Is there
something I’m missing?
If not, I have two concerns with the WireFormatLite methods:
1. They’re in an internal namespace. Presumably this means “don’t use
them, we won’t guarantee backwards compatibility”. If we were willing to
update our code if they changed in a future version, should we be
concerned with using these functions directly?
2. Strings and bytes are specified via std::string. Our data path avoids
the heap entirely, so we can’t use these methods directly. I understand
support for string_view is being considered (in one discussion I saw
mention that support is planned for 2022). Perhaps given this, any
intermediate solution other than changing the interface (or adding
similar interfaces) to use string_view is off the table, as it would
just be a distraction.
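For reference, the bytes that a string or bytes field needs on the wire are simple enough to sketch by hand: a varint tag of (field number << 3) | 2, a varint length, then the raw payload. The following is a minimal illustration taking a std::string_view, not the library's API. WriteVarint32 and WriteStringField are hypothetical helpers, and the sketch assumes the caller has already ensured there is enough contiguous space (it does not handle a field straddling two buffers in a chain).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string_view>

// Writes v as a base-128 varint and returns the position just past it.
inline std::uint8_t* WriteVarint32(std::uint32_t v, std::uint8_t* ptr) {
  while (v >= 0x80) {
    *ptr++ = static_cast<std::uint8_t>(v | 0x80);  // low 7 bits + continuation bit
    v >>= 7;
  }
  *ptr++ = static_cast<std::uint8_t>(v);
  return ptr;
}

// Hypothetical helper: encodes a length-delimited field (wire type 2) directly
// from a string_view, with no std::string and no heap allocation.
// Assumption: ptr has room for up to 5 + 5 + s.size() bytes.
inline std::uint8_t* WriteStringField(std::uint32_t field_num,
                                      std::string_view s, std::uint8_t* ptr) {
  assert(field_num > 0 && field_num < (1u << 29));
  ptr = WriteVarint32((field_num << 3) | 2u, ptr);                 // tag
  ptr = WriteVarint32(static_cast<std::uint32_t>(s.size()), ptr);  // length
  if (!s.empty()) std::memcpy(ptr, s.data(), s.size());            // payload
  return ptr + s.size();
}
```

For example, field number 2 holding "testing" encodes as the tag byte 0x12, the length byte 0x07, and then the seven characters.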
If it’s not practical to wait for string_view support, and there isn’t much
appetite to do anything different, it appears there is *almost* a way to do
this already by making use of some templated methods in EpsCopyOutputStream.
It has the following:
    template <typename T>
    PROTOBUF_ALWAYS_INLINE uint8_t* WriteString(uint32_t num, const T& s,
                                                uint8_t* ptr)
That’s great! I can provide a std::string_view for s and code will be
generated. While slightly awkward, I believe the following would write the
contents of a std::string_view to a CodedOutputStream cs:
    cs.SetCur(cs.EpsCopy()->WriteString(<fieldNum>, <string_view>, cs.Cur()));
Unfortunately, this doesn’t *quite* work because of this section of its
implementation:
    if (PROTOBUF_PREDICT_FALSE(
            size >= 128 || end_ - ptr + 16 - TagSize(num << 3) - 1 < size)) {
      return WriteStringOutline(num, s, ptr);
    }
It calls WriteStringOutline, which requires s to be converted to a
std::string. As a quick experiment, I tried templating this method on the
string type as well, and successfully implemented a prototype. However, I’m
wondering if there may have been a good reason for not templating this
method? It has Outline in its name, which sounds like it should *not* be
inlined, perhaps for performance or code-bloat reasons?
I’m wondering if the project might consider a PR to template this method?
If the problem is that inlining is to be avoided for this method, would it
be an option to instantiate only a std::string variant inside
coded_stream.cc, by including a .tcc file (installed with the headers)
which would then leave the door open for an end user to instantiate a
std::string_view variant of WriteStringOutline in their own project if
desired? I agree it’s not beautiful, but it seems rather minimally invasive
until proper string_view support is available?
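To illustrate the mechanism I have in mind, here is a generic sketch of the explicit-instantiation pattern, collapsed into one file for brevity. None of this is protobuf code: CopyBytesOutline and the file name outline_impl.tcc are hypothetical, and the comments mark which part would live in which file.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <string_view>

// --- public header: declaration only, so callers never inline the body ---
template <typename T>
std::uint8_t* CopyBytesOutline(const T& s, std::uint8_t* ptr);

// --- outline_impl.tcc (hypothetical name, installed with the headers) ---
template <typename T>
std::uint8_t* CopyBytesOutline(const T& s, std::uint8_t* ptr) {
  assert(ptr != nullptr);
  if (s.size() != 0) std::memcpy(ptr, s.data(), s.size());
  return ptr + s.size();
}

// --- library .cc: includes the .tcc and instantiates only std::string ---
template std::uint8_t* CopyBytesOutline<std::string>(const std::string&,
                                                     std::uint8_t*);

// --- end user's .cc: includes the .tcc and adds a string_view variant ---
template std::uint8_t* CopyBytesOutline<std::string_view>(
    const std::string_view&, std::uint8_t*);
```

The body is compiled once per explicitly instantiated type, so the library itself would still carry only the std::string variant.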
If the above sounds reasonable, would it also be reasonable to extend this
up into the WireFormatLite methods as well (i.e. provide template methods
that take a templated string type rather than requiring the use of
std::string)?
My goals are to:
1. Make sure my general plan for using some of the lower level
serialization methods directly doesn’t sound crazy; and
2. Be able to install and use an officially released library that meets
our needs within the next couple of months, to avoid maintaining our own
changes to the library. It seems the change proposed above (or something
similar) could make that possible. I suspect full string_view support is
more than a couple of months away?
I apologize for the length, and thanks in advance for any feedback!
Cheers,
Duane