I guess my last post wasn't clear enough, so here is the clarification. With open source version of protobuf, std::string cannot be allocated in arena, so set allocated and release APIs are necessary for temporarily sharing std::string pointers, according to official documentation and generated code, different APIs are needed (unsafe_arena_* vs non-arena), depending on how the "borrowing" object is allocated, this leads to fragile code. Allowing using one set of APIs in both cases seems a better alternative.
If the recommendation is not to use std::string, what's the replacement? Thanks. On Tuesday, May 5, 2020 at 8:33:21 PM UTC-7, King Kong wrote: > > As a related issue, I remember reading some comments in protobuf source > code that says google internal protobuf supports arena allocation that's > not portable (probably assumes certain std::string implementation), thus it > is not open sourced, what's the plan for string arena allocation support? > If I don't care about portability, is it possible to get the non-portable > solution somehow? > > On Tuesday, May 5, 2020 at 7:02:39 PM UTC-7, Josh Haberman wrote: >> >> On Tue, May 5, 2020 at 4:20 PM King Kong <[email protected]> wrote: >> >>> Thanks for your reply. >>> >>> Yes, I have been using arena, but I wanted to do better than batching >>> heap allocation, when possible. The "scoped leasing" use case I described >>> above achieve 0 heap allocation all the time. I guess my code could use >>> GetArena() at runtime to decide how the "leasing" object is created and >>> which set of set&release APIs to use. >>> >> >> If you want to avoid any heap allocation, you could give the arena an >> initial block of memory from the stack. >> >> If you are trying to optimize, keep in mind that most of the savings from >> arenas come at deallocation time, where arena-allocated protos can simply >> skip the destructor. If you are using heap-allocated protos, the destructor >> has to check every message field to see if it needs to be deleted. >> >> Arenas have given such performance benefits that we are focused on that >> use case. If we loosen the checks in unsafe_arena functions, it could force >> us to compromise later to keep supporting the heap-allocated case. I don't >> think we want to loosen these restrictions. >> >> >>> On Tuesday, May 5, 2020 at 2:25:05 PM UTC-7, Josh Haberman wrote: >>>> >>>> The documentation for unsafe_arena_release_foo() says: >>>> >>>> > This method should only be used when the parent message is on the >>>> arena and the submessage is on the same arena, or an arena with equivalent >>>> lifetime. >>>> >>>> The DCHECK() you highlighted is checking the first part of this >>>> requirement. >>>> >>>> Stack/heap-allocated messages must always uniquely own their >>>> sub-messages, which must always be on the heap. Protobuf doesn't support >>>> the use case you are describing, where a message temporarily points to a >>>> message it doesn't own. If we allowed that, protobuf itself would crash if >>>> a message in this state was ever destroyed. >>>> >>>> If you are trying to avoid heap allocation overhead, the recommended >>>> solution is arenas. >>>> >>>> On Tuesday, May 5, 2020 at 12:44:32 PM UTC-7, King Kong wrote: >>>>> >>>>> I am using unsafe_arena_set_allocated_* and unsafe_arena_release_* >>>>> APIs to avoid >>>>> unnecessary allocations when the sub-objects are only used in a >>>>> function scope, >>>>> unsafe_arena_release_* is always called before leaving the scope. At >>>>> the beginning, >>>>> >>>>> I chose to always use the arena versions of unsafe APIs because the >>>>> non-arena version >>>>> >>>>> unsafe_release does extra allocation when the object is allocated on >>>>> arena, and the arena >>>>> >>>>> version does not check if the object is allocated in arena, based on >>>>> generated code like >>>>> >>>>> the following: >>>>> >>>>> >>>>> inline foo* outter1::unsafe_arena_release_foo() { >>>>> foo* temp = foo_; >>>>> foo_ = nullptr; >>>>> return temp; >>>>> } >>>>> >>>>> >>>>> However, I later also saw the following the generated code with arena >>>>> check, it means it is not okay to call for heap-allocated objects. >>>>> >>>>> inline void outter2::unsafe_arena_set_allocated_bar( >>>>> std::string* bar) { >>>>> GOOGLE_DCHECK(GetArenaNoVirtual() != nullptr); >>>>> ... ... >>>>> } >>>>> >>>>> >>>>> So, why the discrepancy? >>>>> >>>>> >>>>> The official documentation at >>>>> https://developers.google.com/protocol-buffers/docs/reference/arenas says >>>>> the correct set of APIs should be used >>>>> >>>>> according to how to the object is allocated, but having two different >>>>> sets of unsafe APIs makes it harder to use, and it makes the code less >>>>> robust to future changes. I am wondering if protobuf owners also realize >>>>> such an issue and start to remove arena checking gradually? It that the >>>>> case? I would like to hear how others think. >>>>> >>>>> >>>>> Thanks. >>>>> >>>> -- >>> You received this message because you are subscribed to a topic in the >>> Google Groups "Protocol Buffers" group. >>> To unsubscribe from this topic, visit >>> https://groups.google.com/d/topic/protobuf/C26jc_jrCJQ/unsubscribe. >>> To unsubscribe from this group and all its topics, send an email to >>> [email protected]. >>> To view this discussion on the web visit >>> https://groups.google.com/d/msgid/protobuf/bfa10586-05af-430c-b809-5e4c65d43b5c%40googlegroups.com >>> >>> <https://groups.google.com/d/msgid/protobuf/bfa10586-05af-430c-b809-5e4c65d43b5c%40googlegroups.com?utm_medium=email&utm_source=footer> >>> . >>> >> -- You received this message because you are subscribed to the Google Groups "Protocol Buffers" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/protobuf/64726032-e17f-4c21-aa49-9273940d901b%40googlegroups.com.
