The unsafe_arena_* functions for strings are deprecated: > When arenas are enabled, string and bytes fields generate unsafe_arena_release_field() and unsafe_arena_set_allocated_field() methods. Note that these methods are deprecated and will be removed in a future release. Those methods were added by mistake and offer no performance advantage over their safe equivalents.
https://developers.google.com/protocol-buffers/docs/reference/arenas#arenastring We do not have any supported interface for borrowing a std::string and releasing it later. Our goal in this area is to reduce (and eventually remove) the use of std::string in generated APIs in favor of string_view, which could potentially support an accessor like: // The "foo" field will point to the given string data, which must outlive the message. void FooMessage::set_alias_foo(string_view str) {} On Thu, May 7, 2020 at 11:10 AM King Kong <[email protected]> wrote: > I guess my last post wasn't clear enough, so here is the clarification. > > With open source version of protobuf, std::string cannot be allocated in > arena, so set allocated and release APIs are necessary for temporarily > sharing std::string pointers, according to official documentation and > generated code, different APIs are needed (unsafe_arena_* vs non-arena), > depending on how the "borrowing" object is allocated, this leads to fragile > code. Allowing using one set of APIs in both cases seems a better > alternative. > > If the recommendation is not to use std::string, what's the replacement? > Thanks. > > On Tuesday, May 5, 2020 at 8:33:21 PM UTC-7, King Kong wrote: >> >> As a related issue, I remember reading some comments in protobuf source >> code that says google internal protobuf supports arena allocation that's >> not portable (probably assumes certain std::string implementation), thus it >> is not open sourced, what's the plan for string arena allocation support? >> If I don't care about portability, is it possible to get the non-portable >> solution somehow? >> >> On Tuesday, May 5, 2020 at 7:02:39 PM UTC-7, Josh Haberman wrote: >>> >>> On Tue, May 5, 2020 at 4:20 PM King Kong <[email protected]> wrote: >>> >>>> Thanks for your reply. >>>> >>>> Yes, I have been using arena, but I wanted to do better than batching >>>> heap allocation, when possible. The "scoped leasing" use case I described >>>> above achieve 0 heap allocation all the time. I guess my code could use >>>> GetArena() at runtime to decide how the "leasing" object is created and >>>> which set of set&release APIs to use. >>>> >>> >>> If you want to avoid any heap allocation, you could give the arena an >>> initial block of memory from the stack. >>> >>> If you are trying to optimize, keep in mind that most of the savings >>> from arenas come at deallocation time, where arena-allocated protos can >>> simply skip the destructor. If you are using heap-allocated protos, the >>> destructor has to check every message field to see if it needs to be >>> deleted. >>> >>> Arenas have given such performance benefits that we are focused on that >>> use case. If we loosen the checks in unsafe_arena functions, it could force >>> us to compromise later to keep supporting the heap-allocated case. I don't >>> think we want to loosen these restrictions. >>> >>> >>>> On Tuesday, May 5, 2020 at 2:25:05 PM UTC-7, Josh Haberman wrote: >>>>> >>>>> The documentation for unsafe_arena_release_foo() says: >>>>> >>>>> > This method should only be used when the parent message is on the >>>>> arena and the submessage is on the same arena, or an arena with equivalent >>>>> lifetime. >>>>> >>>>> The DCHECK() you highlighted is checking the first part of this >>>>> requirement. >>>>> >>>>> Stack/heap-allocated messages must always uniquely own their >>>>> sub-messages, which must always be on the heap. Protobuf doesn't support >>>>> the use case you are describing, where a message temporarily points to a >>>>> message it doesn't own. If we allowed that, protobuf itself would crash if >>>>> a message in this state was ever destroyed. >>>>> >>>>> If you are trying to avoid heap allocation overhead, the recommended >>>>> solution is arenas. >>>>> >>>>> On Tuesday, May 5, 2020 at 12:44:32 PM UTC-7, King Kong wrote: >>>>>> >>>>>> I am using unsafe_arena_set_allocated_* and unsafe_arena_release_* >>>>>> APIs to avoid >>>>>> unnecessary allocations when the sub-objects are only used in a >>>>>> function scope, >>>>>> unsafe_arena_release_* is always called before leaving the scope. At >>>>>> the beginning, >>>>>> >>>>>> I chose to always use the arena versions of unsafe APIs because the >>>>>> non-arena version >>>>>> >>>>>> unsafe_release does extra allocation when the object is allocated on >>>>>> arena, and the arena >>>>>> >>>>>> version does not check if the object is allocated in arena, based on >>>>>> generated code like >>>>>> >>>>>> the following: >>>>>> >>>>>> >>>>>> inline foo* outter1::unsafe_arena_release_foo() { >>>>>> foo* temp = foo_; >>>>>> foo_ = nullptr; >>>>>> return temp; >>>>>> } >>>>>> >>>>>> >>>>>> However, I later also saw the following the generated code with arena >>>>>> check, it means it is not okay to call for heap-allocated objects. >>>>>> >>>>>> inline void outter2::unsafe_arena_set_allocated_bar( >>>>>> std::string* bar) { >>>>>> GOOGLE_DCHECK(GetArenaNoVirtual() != nullptr); >>>>>> ... ... >>>>>> } >>>>>> >>>>>> >>>>>> So, why the discrepancy? >>>>>> >>>>>> >>>>>> The official documentation at >>>>>> https://developers.google.com/protocol-buffers/docs/reference/arenas says >>>>>> the correct set of APIs should be used >>>>>> >>>>>> according to how to the object is allocated, but having two different >>>>>> sets of unsafe APIs makes it harder to use, and it makes the code less >>>>>> robust to future changes. I am wondering if protobuf owners also realize >>>>>> such an issue and start to remove arena checking gradually? It that the >>>>>> case? I would like to hear how others think. >>>>>> >>>>>> >>>>>> Thanks. >>>>>> >>>>> -- >>>> You received this message because you are subscribed to a topic in the >>>> Google Groups "Protocol Buffers" group. >>>> To unsubscribe from this topic, visit >>>> https://groups.google.com/d/topic/protobuf/C26jc_jrCJQ/unsubscribe. >>>> To unsubscribe from this group and all its topics, send an email to >>>> [email protected]. >>>> To view this discussion on the web visit >>>> https://groups.google.com/d/msgid/protobuf/bfa10586-05af-430c-b809-5e4c65d43b5c%40googlegroups.com >>>> <https://groups.google.com/d/msgid/protobuf/bfa10586-05af-430c-b809-5e4c65d43b5c%40googlegroups.com?utm_medium=email&utm_source=footer> >>>> . >>>> >>> -- > You received this message because you are subscribed to a topic in the > Google Groups "Protocol Buffers" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/protobuf/C26jc_jrCJQ/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > [email protected]. > To view this discussion on the web visit > https://groups.google.com/d/msgid/protobuf/64726032-e17f-4c21-aa49-9273940d901b%40googlegroups.com > <https://groups.google.com/d/msgid/protobuf/64726032-e17f-4c21-aa49-9273940d901b%40googlegroups.com?utm_medium=email&utm_source=footer> > . > -- You received this message because you are subscribed to the Google Groups "Protocol Buffers" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/protobuf/CAFFe09hJCYr5c%2B1MXVpvMgg%3Dm5aWQXUd6BuZUe0g7bnVv%3D-NgA%40mail.gmail.com.
