Kenton,

Thanks for the help. I've confirmed the issue was the stack size, as I 
suspected.

Apparently, one external library was creating threads with a stack of 
PTHREAD_STACK_MIN x2, which is around 32 KB. In our application 
callback, where we received the event and before calling encode(), ~26 KB 
of the stack were already consumed. Using ~8 KB of stack during the 
encoding still sounds like a bit much, but raising the stack of that thread 
to 2 MB solved the problem.
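
For anyone hitting the same thing, here is a minimal sketch of requesting a larger stack when creating a thread. It assumes you control the pthread_create() call yourself, which was not the case with the external library above. The names worker() and run_with_stack() are hypothetical, made up for this illustration; the pthread calls are the standard POSIX ones.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>

// Hypothetical worker standing in for the application callback.
// It touches ~64 KB of stack, more than a 32 KB stack could hold.
static void* worker(void* arg) {
    volatile char buf[64 * 1024];
    for (std::size_t i = 0; i < sizeof(buf); i += 4096) {
        buf[i] = 1;  // touch one byte per page so the stack is really used
    }
    *static_cast<bool*>(arg) = true;  // report that the worker completed
    return nullptr;
}

// Hypothetical helper: runs fn(arg) on a new thread with the requested
// stack size and waits for it. Returns 0 on success, or an errno-style
// code from the failing pthread call.
int run_with_stack(std::size_t stack_bytes, void* (*fn)(void*), void* arg) {
    assert(fn != nullptr);

    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) return rc;

    // Fails with EINVAL if stack_bytes is below PTHREAD_STACK_MIN.
    rc = pthread_attr_setstacksize(&attr, stack_bytes);

    pthread_t tid;
    if (rc == 0) rc = pthread_create(&tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    if (rc != 0) return rc;

    return pthread_join(tid, nullptr);
}
```

Calling run_with_stack(2 * 1024 * 1024, worker, &flag) gives the worker the 2 MB mentioned above. Note that ulimit -s only affects the default stack size for threads created without an explicit attribute, so it has no effect on a library that sets its own size.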

Not sure if this is going to be of any help. Sorry for the fuss.

Regards
marc

On Tuesday, March 6, 2018 at 10:59:05 PM UTC+1, Kenton Varda wrote:
>
> Hi Marc,
>
> I would guess each encode() stack frame is on the order of 100 bytes. 
> Unless your stacks are *really* small, that doesn't seem like it could be 
> the problem.
>
> It looks like the real problem has something to do with unexpectedly null 
> pointers appearing in the schema objects. These objects are declared as 
> static constants in the generated code. Here's an example:
>
>   const ::capnp::_::RawSchema s_e682ab4cf923a417 = {
>     0xe682ab4cf923a417, b_e682ab4cf923a417.words, 225, d_e682ab4cf923a417,
>     m_e682ab4cf923a417, 8, 14, i_e682ab4cf923a417, nullptr, nullptr,
>     { &s_e682ab4cf923a417, nullptr, nullptr, 0, 0, nullptr }
>   };
>
> Your latest stack trace shows a case where, somehow, the second field of 
> this turned out null, which seems impossible.
>
> Is it at all possible that your generated code was created with a 
> different version of Cap'n Proto compiler vs. the runtime library and/or 
> headers you are compiling against?
>
> -Kenton
>
> On Mon, Mar 5, 2018 at 2:34 PM, Marc Sune <ma...@voltanet.io> wrote:
>
>> Kenton,
>>
>> Is it remotely possible that some of the encode() methods are consuming large 
>> amounts of stack? I would need to get deep into the code, but glancing 
>> over it I see some parameters that _seem_ to be passed by value, and a 
>> few autos that I am _not sure_ whether they perform a copy or not.
>>
>> Marc
>>
>> On Monday, March 5, 2018 at 10:43:11 PM UTC+1, Marc Sune wrote:
>>>
>>> Thanks Kenton,
>>>
>>> An update on this. I am leaning towards thinking it is a stack overflow, 
>>> although I couldn't confirm it 100%. The truth is that calling the same code 
>>> from main(), with the exact same objects and encoding routine, doesn't 
>>> crash.
>>>
>>> However, when the encode() method is called from a thread that is 
>>> created by an external library, I consistently get a SIGSEGV. I've tried 
>>> the typical ulimit / pthread_attr_setstacksize() without much success 
>>> (not sure why though).
>>>
>>> Re-arranging the code so that encode() is called with less stack used (I 
>>> cut 2 or 3 frames), I get past the point of libstdc++, but I consistently 
>>> crash here:
>>>
>>> (gdb) bt
>>> #0  0x00000000009d4e46 in capnp::_::DirectWireValue<unsigned int>::get 
>>> *(this=0x0)* at /home/marc/target/rootfs/include/capnp/endian.h:80
>>> #1  0x0000000001e64d37 in 
>>> capnp::_::WirePointer::target(capnp::_::SegmentReader*) const ()
>>> #2  0x0000000001e6a83c in 
>>> capnp::_::WireHelpers::readStructPointer(capnp::_::SegmentReader*, 
>>> capnp::_::CapTableReader*, capnp::_::WirePointer const*, capnp::word const*
>>> , int) ()
>>> #3  0x0000000001e5e7e7 in capnp::_::PointerReader::getStruct(capnp::word 
>>> const*) const ()
>>> #4  0x0000000001e7fe61 in capnp::_::PointerHelpers<capnp::schema::Node, 
>>> (capnp::Kind)3>::get(capnp::_::PointerReader, capnp::word const*) ()
>>> #5  0x0000000001e7f6ec in capnp::ReaderFor_<capnp::schema::Node, 
>>> (kind<capnp::schema::Node>)()>::Type 
>>> capnp::AnyPointer::Reader::getAs<capnp::schema::Node>() const ()
>>> #6  0x0000000001e7de31 in capnp::schema::Node::Reader 
>>> capnp::readMessageUnchecked<capnp::schema::Node>(capnp::word const*) ()
>>> #7  0x0000000001e780db in capnp::Schema::getProto() const ()
>>> #8  0x0000000001e7995f in capnp::EnumSchema::getEnumerants() const ()
>>> #9  0x0000000001e8068f in capnp::DynamicEnum::getEnumerant() const ()
>>> #10 0x0000000001e45c74 in 
>>> capnp::JsonCodec::encode(capnp::DynamicValue::Reader, capnp::Type, 
>>> capnp::JsonValue::Builder) const ()
>>>
>>> Really odd... So I think it is not related to capnproto code
>>>
>>> Thanks
>>> marc
>>>
>>> On Saturday, March 3, 2018 at 1:19:15 AM UTC+1, Kenton Varda wrote:
>>>>
>>>> Sorry, I don't really have any ideas here. The stack trace is deep in 
>>>> STL code for the type handler map, inside find(). If you've registered no 
>>>> type handlers, that map should be empty. It's hard to imagine how find() 
>>>> on an empty std::unordered_map could ever segfault...
>>>>
>>>> -Kenton
>>>>
>>>> On Fri, Mar 2, 2018 at 11:07 AM, Marc Sune <ma...@voltanet.io> wrote:
>>>>
>>>>> Kenton,
>>>>>
>>>>> On Friday, March 2, 2018 at 7:27:30 PM UTC+1, Kenton Varda wrote:
>>>>>>
>>>>>> Hi Marc,
>>>>>>
>>>>>> Do you have any custom type handlers registered via addTypeHandler()? 
>>>>>> Is it possible that the handler class has gone out-of-scope (been 
>>>>>> destroyed) by the time the encoder is executed?
>>>>>>
>>>>>
>>>>> No, I am not using custom handlers. I am just dumping the Builder like 
>>>>> this (simplified):
>>>>>
>>>>> template<typename T>
>>>>> void dump_msg(T& msg){
>>>>>     try{
>>>>>         capnp::JsonCodec enc;
>>>>>         enc.setPrettyPrint(PPRINT);
>>>>>         auto t = enc.encode(msg);
>>>>>         fprintf(stderr, "MSG: %s\n", t.cStr());
>>>>>     }catch(...){}
>>>>> }
>>>>>
>>>>> Where, in this case, msg is:
>>>>>
>>>>> capnp::MallocMessageBuilder builder;
>>>>> auto msg = builder.initRoot<Message>();
>>>>>
>>>>> //Fill it
>>>>>
>>>>> dump_msg(msg);
>>>>>
>>>>> So the builder and encoder are valid, I believe, during the entire dump. 
>>>>> Moreover, valgrind would complain before the SEGFAULT if something were 
>>>>> out of the stack, and the only thing I get is the direct SEGFAULT.
>>>>>
>>>>> Any thoughts? I will keep trying to isolate the problem
>>>>>
>>>>> marc
>>>>>
>>>>> -Kenton
>>>>>>
>>>>>> On Fri, Mar 2, 2018 at 10:18 AM, Marc Sune <ma...@voltanet.io> wrote:
>>>>>>
>>>>>>> > That's about all that I can get, even though capnproto is compiled 
>>>>>>> with DEBUG.
>>>>>>>
>>>>>>> Or shall I say, it should :/
>>>>>>>
>>>>>>>
>>>>>>> On Friday, March 2, 2018 at 7:17:02 PM UTC+1, Marc Sune wrote:
>>>>>>>>
>>>>>>>> Hi guys,
>>>>>>>>
>>>>>>>> I am experiencing a _very_ strange segfault during JSON encoding of 
>>>>>>>> a message:
>>>>>>>>
>>>>>>>> ```
>>>>>>>> Program received signal SIGSEGV, Segmentation fault.
>>>>>>>> [Switching to Thread 27946]
>>>>>>>> 0x0000000001e4b81a in std::__detail::_Hashtable_ebo_helper<1, 
>>>>>>>> capnp::(anonymous namespace)::TypeHash, 
>>>>>>>> true>::_S_cget(std::__detail::_Hashtable_ebo_helper<1, 
>>>>>>>> capnp::(anonymous 
>>>>>>>> namespace)::TypeHash, true> const&) ()
>>>>>>>> (gdb) bt
>>>>>>>> #0  0x0000000001e4b81a in std::__detail::_Hashtable_ebo_helper<1, 
>>>>>>>> capnp::(anonymous namespace)::TypeHash, 
>>>>>>>> true>::_S_cget(std::__detail::_Hashtable_ebo_helper<1, 
>>>>>>>> capnp::(anonymous 
>>>>>>>> namespace)::TypeHash, true> const&) ()
>>>>>>>> #1  0x0000000001e4b11a in 
>>>>>>>> std::__detail::_Hash_code_base<capnp::Type, std::pair<capnp::Type 
>>>>>>>> const, 
>>>>>>>> capnp::JsonCodec::HandlerBase*>, std::__detail::_Select1st, 
>>>>>>>> capnp::(anonymous namespace)::TypeHash, 
>>>>>>>> std::__detail::_Mod_range_hashing, 
>>>>>>>> std::__detail::_Default_ranged_hash, true>::_M_h1() const ()
>>>>>>>> #2  0x0000000001e4a976 in 
>>>>>>>> std::__detail::_Hash_code_base<capnp::Type, std::pair<capnp::Type 
>>>>>>>> const, 
>>>>>>>> capnp::JsonCodec::HandlerBase*>, std::__detail::_Select1st, 
>>>>>>>> capnp::(anonymous namespace)::TypeHash, 
>>>>>>>> std::__detail::_Mod_range_hashing, 
>>>>>>>> std::__detail::_Default_ranged_hash, true>::_M_hash_code(capnp::Type 
>>>>>>>> const&) const ()
>>>>>>>> #3  0x0000000001e4a401 in std::_Hashtable<capnp::Type, 
>>>>>>>> std::pair<capnp::Type const, capnp::JsonCodec::HandlerBase*>, 
>>>>>>>> std::allocator<std::pair<capnp::Type const, 
>>>>>>>> capnp::JsonCodec::HandlerBase*> 
>>>>>>>> >, std::__detail::_Select1st, std::equal_to<capnp::Type>, 
>>>>>>>> >capnp::(anonymous 
>>>>>>>> namespace)::TypeHash, std::__detail::_Mod_range_hashing, 
>>>>>>>> std::__detail::_Default_ranged_hash, 
>>>>>>>> std::__detail::_Prime_rehash_policy, 
>>>>>>>> std::__detail::_Hashtable_traits<true, false, true> 
>>>>>>>> >::find(capnp::Type 
>>>>>>>> const&) const ()
>>>>>>>> #4  0x0000000001e483a3 in std::unordered_map<capnp::Type, 
>>>>>>>> capnp::JsonCodec::HandlerBase*, capnp::(anonymous 
>>>>>>>> namespace)::TypeHash, 
>>>>>>>> std::equal_to<capnp::Type>, std::allocator<std::pair<capnp::Type 
>>>>>>>> const, 
>>>>>>>> capnp::JsonCodec::HandlerBase*> > >::find(capnp::Type const&) const ()
>>>>>>>> #5  0x0000000001e443d5 in 
>>>>>>>> capnp::JsonCodec::encode(capnp::DynamicValue::Reader, capnp::Type, 
>>>>>>>> capnp::JsonValue::Builder) const ()
>>>>>>>> #6  0x0000000001e44a6f in 
>>>>>>>> capnp::JsonCodec::encode(capnp::DynamicValue::Reader, capnp::Type, 
>>>>>>>> capnp::JsonValue::Builder) const ()
>>>>>>>> #7  0x0000000001e45a36 in 
>>>>>>>> capnp::JsonCodec::encodeField(capnp::StructSchema::Field, 
>>>>>>>> capnp::DynamicValue::Reader, capnp::JsonValue::Builder) const ()
>>>>>>>> #8  0x0000000001e45377 in 
>>>>>>>> capnp::JsonCodec::encode(capnp::DynamicValue::Reader, capnp::Type, 
>>>>>>>> capnp::JsonValue::Builder) const ()
>>>>>>>> #9  0x0000000001e45a36 in 
>>>>>>>> capnp::JsonCodec::encodeField(capnp::StructSchema::Field, 
>>>>>>>> capnp::DynamicValue::Reader, capnp::JsonValue::Builder) const ()
>>>>>>>> #10 0x0000000001e45377 in 
>>>>>>>> capnp::JsonCodec::encode(capnp::DynamicValue::Reader, capnp::Type, 
>>>>>>>> capnp::JsonValue::Builder) const ()
>>>>>>>> #11 0x0000000001e45a36 in 
>>>>>>>> capnp::JsonCodec::encodeField(capnp::StructSchema::Field, 
>>>>>>>> capnp::DynamicValue::Reader, capnp::JsonValue::Builder) const ()
>>>>>>>> #12 0x0000000001e45377 in 
>>>>>>>> capnp::JsonCodec::encode(capnp::DynamicValue::Reader, capnp::Type, 
>>>>>>>> capnp::JsonValue::Builder) const ()
>>>>>>>> #13 0x0000000001e45a36 in 
>>>>>>>> capnp::JsonCodec::encodeField(capnp::StructSchema::Field, 
>>>>>>>> capnp::DynamicValue::Reader, capnp::JsonValue::Builder) const ()
>>>>>>>> #14 0x0000000001e45618 in 
>>>>>>>> capnp::JsonCodec::encode(capnp::DynamicValue::Reader, capnp::Type, 
>>>>>>>> capnp::JsonValue::Builder) const ()
>>>>>>>> #15 0x0000000001e45a36 in 
>>>>>>>> capnp::JsonCodec::encodeField(capnp::StructSchema::Field, 
>>>>>>>> capnp::DynamicValue::Reader, capnp::JsonValue::Builder) const ()
>>>>>>>> #16 0x0000000001e45618 in 
>>>>>>>> capnp::JsonCodec::encode(capnp::DynamicValue::Reader, capnp::Type, 
>>>>>>>> capnp::JsonValue::Builder) const ()
>>>>>>>> #17 0x0000000001e43fcc in 
>>>>>>>> capnp::JsonCodec::encode(capnp::DynamicValue::Reader, capnp::Type) 
>>>>>>>> const ()
>>>>>>>> #18 0x00000000009e02c7 in capnp::JsonCodec::encode<Message::Builder 
>>>>>>>> const&> (this=0x7ffff22e1210, value=...) at 
>>>>>>>> /home/marc/.../capnp/compat/json.h:216
>>>>>>>> ```
>>>>>>>>
>>>>>>>> That's about all that I can get, even though capnproto is compiled 
>>>>>>>> with DEBUG.
>>>>>>>>
>>>>>>>> The message being encoded (sorry, I am not sure I can share 
>>>>>>>> the entire set of schemas) is a series of nested simple objects; the 
>>>>>>>> innermost object contains a list that is initialized normally:
>>>>>>>>
>>>>>>>> s.initIfaceType(1);
>>>>>>>>
>>>>>>>> The funny part: not initializing it doesn't make JsonCodec crash. 
>>>>>>>> But initializing it, or initializing it and setting a (valid) value, 
>>>>>>>> always produces the crash.
>>>>>>>>
>>>>>>>> Valgrind etc. don't complain up to that point.
>>>>>>>>
>>>>>>>> I am trying to isolate the problem to make it reproducible, but I 
>>>>>>>> haven't been able to yet.
>>>>>>>>
>>>>>>>> Any ideas on this?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>> -- 
>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>> Groups "Cap'n Proto" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>> send an email to capnproto+...@googlegroups.com.
>>>>>>> Visit this group at https://groups.google.com/group/capnproto.
>>>>>>>
