Re: [capnproto] Re: Segfault during JSON encode() in v0.6.1

2018-03-07 Thread Marc Sune
Kenton,

Thanks for the help. I've confirmed the issue was the stack size, as I 
suspected.

Apparently, an external library was creating threads with a stack of 
PTHREAD_STACK_MIN x2, which is around 32 Kbytes. In our application 
callback, where we receive the event, ~26 Kbytes of the stack were already 
consumed before calling encode(). Using ~8 Kbytes of stack during the 
encoding still sounds like a bit too much, but raising the stack of that 
thread to 2 Mbytes solved the problem.
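
For anyone hitting the same thing, here is a minimal sketch of requesting a 
larger stack when creating a thread. It assumes you control the 
pthread_create() call yourself; when the thread is created by an external 
library, as it was here, the equivalent change has to go through whatever 
that library exposes. The names run_with_stack and worker are hypothetical 
and only serve the illustration.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>

// Hypothetical worker standing in for the callback that calls encode().
// It touches a buffer far larger than a ~32 KiB stack could hold.
static void* worker(void* arg) {
    volatile char buf[256 * 1024];
    buf[0] = 1;
    buf[sizeof(buf) - 1] = 1;
    *static_cast<int*>(arg) = buf[0] + buf[sizeof(buf) - 1];
    return nullptr;
}

// Runs fn(arg) on a new thread with the requested stack size and waits
// for it to finish. Returns 0 on success, or a pthread error code.
int run_with_stack(std::size_t stack_bytes, void* (*fn)(void*), void* arg) {
    assert(fn != nullptr);
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) return rc;

    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0) {
        pthread_t tid;
        rc = pthread_create(&tid, &attr, fn, arg);
        if (rc == 0) rc = pthread_join(tid, nullptr);
    }
    pthread_attr_destroy(&attr);
    return rc;
}
```

Calling run_with_stack(2 * 1024 * 1024, worker, &out) gives the worker a 
2 Mbyte stack, which matches the size that made the crash go away here.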

Not sure if this is going to be of any help. Sorry for the fuss.

Regards
marc

On Tuesday, March 6, 2018 at 10:59:05 PM UTC+1, Kenton Varda wrote:
>
> Hi Marc,
>
> I would guess each encode() stack frame is on the order of 100 bytes. 
> Unless your stacks are *really* small, that doesn't seem like it could be 
> the problem.
>
> It looks like the real problem has something to do with unexpectedly null 
> pointers appearing in the schema objects. These objects are declared as 
> static constants in the generated code. Here's an example:
>
>   const ::capnp::_::RawSchema s_e682ab4cf923a417 = {
>     0xe682ab4cf923a417, b_e682ab4cf923a417.words, 225, d_e682ab4cf923a417,
>     m_e682ab4cf923a417, 8, 14, i_e682ab4cf923a417, nullptr, nullptr,
>     { &s_e682ab4cf923a417, nullptr, nullptr, 0, 0, nullptr }
>   };
>
> Your latest stack trace shows a case where, somehow, the second field of 
> this turned out null, which seems impossible.
>
> Is it at all possible that your generated code was created with a 
> different version of Cap'n Proto compiler vs. the runtime library and/or 
> headers you are compiling against?
>
> -Kenton
>
> On Mon, Mar 5, 2018 at 2:34 PM, Marc Sune wrote:
>
>> Kenton,
>>
>> Is it remotely possible that some of the encode() methods are consuming 
>> large amounts of stack? I would need to dig deeper into the code, but 
>> glancing over it I see some parameters that _seem_ to be passed by value, 
>> and a few autos for which I am _not sure_ whether they perform a copy or not.
>>
>> Marc
>>
>> On Monday, March 5, 2018 at 10:43:11 PM UTC+1, Marc Sune wrote:
>>>
>>> Thanks Kenton,
>>>
>>> An update on this. I am leaning towards thinking it is a stack overflow, 
>>> although I couldn't confirm it 100%. The truth is that calling the same 
>>> code from main(), with the exact same objects and encoding routine, 
>>> doesn't crash.
>>>
>>> However, when the encode() method is called from a thread that is 
>>> created by an external library, I consistently get a SIGSEGV. I've tried 
>>> the typical ulimit / pthread_attr_setstacksize() without much success 
>>> (not sure why though). 
>>>
>>> Re-arranging the code so that encode() is called with less stack in use 
>>> (I cut 2 or 3 frames), I get past the crash point in libstdc++, but I 
>>> consistently crash here:
>>>
>>> (gdb) bt
>>> #0  0x009d4e46 in capnp::_::DirectWireValue::get 
>>> *(this=0x0)* at /home/marc/target/rootfs/include/capnp/endian.h:80
>>> #1  0x01e64d37 in 
>>> capnp::_::WirePointer::target(capnp::_::SegmentReader*) const ()
>>> #2  0x01e6a83c in 
>>> capnp::_::WireHelpers::readStructPointer(capnp::_::SegmentReader*, 
>>> capnp::_::CapTableReader*, capnp::_::WirePointer const*, capnp::word const*
>>> , int) ()
>>> #3  0x01e5e7e7 in capnp::_::PointerReader::getStruct(capnp::word 
>>> const*) const ()
>>> #4  0x01e7fe61 in capnp::_::PointerHelpers (capnp::Kind)3>::get(capnp::_::PointerReader, capnp::word const*) ()
>>> #5  0x01e7f6ec in capnp::ReaderFor_ (kind)()>::Type 
>>> capnp::AnyPointer::Reader::getAs() const ()
>>> #6  0x01e7de31 in capnp::schema::Node::Reader 
>>> capnp::readMessageUnchecked(capnp::word const*) ()
>>> #7  0x01e780db in capnp::Schema::getProto() const ()
>>> #8  0x01e7995f in capnp::EnumSchema::getEnumerants() const ()
>>> #9  0x01e8068f in capnp::DynamicEnum::getEnumerant() const ()
>>> #10 0x01e45c74 in 
>>> capnp::JsonCodec::encode(capnp::DynamicValue::Reader, capnp::Type, 
>>> capnp::JsonValue::Builder) const ()
>>>
>>> Really odd... So I think it is not related to capnproto code
>>>
>>> Thanks
>>> marc
>>>
>>> On Saturday, March 3, 2018 at 1:19:15 AM UTC+1, Kenton Varda wrote:

>>>> Sorry, I don't really have any ideas here. The stack trace is deep in 
>>>> STL code for the type handler map, inside find(). If you've registered no 
>>>> type handlers, that map should be empty. It's hard to imagine how find() 
>>>> on an empty std::unordered_map could ever segfault...
>>>>
>>>> -Kenton
>>>>
>>>> On Fri, Mar 2, 2018 at 11:07 AM, Marc Sune wrote:
>>>>
>>>>> Kenton,
>>>>>
>>>>> On Friday, March 2, 2018 at 7:27:30 PM UTC+1, Kenton Varda wrote:
>>>>>>
>>>>>> Hi Marc,
>>>>>>
>>>>>> Do you have any custom type handlers registered via addTypeHandler()? 
>>>>>> Is it possible that the handler class has gone out-of-scope (been 
>>>>>> destroyed) by the 

Re: [capnproto] Re: Segfault during JSON encode() in v0.6.1

2018-03-05 Thread Marc Sune
Kenton,

Is it remotely possible that some of the encode() methods are consuming 
large amounts of stack? I would need to dig deeper into the code, but 
glancing over it I see some parameters that _seem_ to be passed by value, 
and a few autos for which I am _not sure_ whether they perform a copy or not.
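
One way to check a suspicion like this is to log how much stack the 
calling thread has left right before encode(). The following is only a 
sketch under stated assumptions: it relies on pthread_getattr_np(), which 
is a glibc extension (so Linux only), and it assumes the stack grows 
downwards. StackInfo and current_stack_info are hypothetical names, not 
part of Cap'n Proto.

```cpp
#include <pthread.h>
#include <cstddef>
#include <cstdint>

// Snapshot of the calling thread's stack (hypothetical helper type).
struct StackInfo {
    bool ok = false;            // false if the query failed
    std::size_t size = 0;       // total stack size in bytes
    std::size_t remaining = 0;  // approximate bytes left below this frame
};

// Assumes Linux/glibc (pthread_getattr_np) and a downward-growing stack.
StackInfo current_stack_info() {
    StackInfo info;
    pthread_attr_t attr;
    if (pthread_getattr_np(pthread_self(), &attr) != 0) return info;

    void* lowest = nullptr;  // lowest addressable byte of the stack
    std::size_t size = 0;
    if (pthread_attr_getstack(&attr, &lowest, &size) == 0) {
        char marker;  // lives in the current frame
        auto here = reinterpret_cast<std::uintptr_t>(&marker);
        auto base = reinterpret_cast<std::uintptr_t>(lowest);
        info.ok = here >= base;
        info.size = size;
        info.remaining = info.ok ? here - base : 0;
    }
    pthread_attr_destroy(&attr);
    return info;
}
```

Printing size and remaining from inside the library's callback shows 
whether the thread was created with an unusually small stack and how much 
of it is already used before the encoder starts recursing.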

Marc

On Monday, March 5, 2018 at 10:43:11 PM UTC+1, Marc Sune wrote:
>
> Thanks Kenton,
>
> An update on this. I am leaning towards thinking it is a stack overflow, 
> although I couldn't confirm it 100%. The truth is that calling the same 
> code from main(), with the exact same objects and encoding routine, 
> doesn't crash.
>
> However, when the encode() method is called from a thread that is created 
> by an external library, I consistently get a SIGSEGV. I've tried the 
> typical ulimit / pthread_attr_setstacksize() without much success (not 
> sure why though). 
>
> Re-arranging the code so that encode() is called with less stack in use 
> (I cut 2 or 3 frames), I get past the crash point in libstdc++, but I 
> consistently crash here:
>
> (gdb) bt
> #0  0x009d4e46 in capnp::_::DirectWireValue::get 
> *(this=0x0)* at /home/marc/target/rootfs/include/capnp/endian.h:80
> #1  0x01e64d37 in 
> capnp::_::WirePointer::target(capnp::_::SegmentReader*) const ()
> #2  0x01e6a83c in 
> capnp::_::WireHelpers::readStructPointer(capnp::_::SegmentReader*, 
> capnp::_::CapTableReader*, capnp::_::WirePointer const*, capnp::word const*
> , int) ()
> #3  0x01e5e7e7 in capnp::_::PointerReader::getStruct(capnp::word 
> const*) const ()
> #4  0x01e7fe61 in capnp::_::PointerHelpers (capnp::Kind)3>::get(capnp::_::PointerReader, capnp::word const*) ()
> #5  0x01e7f6ec in capnp::ReaderFor_ (kind)()>::Type 
> capnp::AnyPointer::Reader::getAs() const ()
> #6  0x01e7de31 in capnp::schema::Node::Reader 
> capnp::readMessageUnchecked(capnp::word const*) ()
> #7  0x01e780db in capnp::Schema::getProto() const ()
> #8  0x01e7995f in capnp::EnumSchema::getEnumerants() const ()
> #9  0x01e8068f in capnp::DynamicEnum::getEnumerant() const ()
> #10 0x01e45c74 in 
> capnp::JsonCodec::encode(capnp::DynamicValue::Reader, capnp::Type, 
> capnp::JsonValue::Builder) const ()
>
> Really odd... So I think it is not related to capnproto code
>
> Thanks
> marc
>
> On Saturday, March 3, 2018 at 1:19:15 AM UTC+1, Kenton Varda wrote:
>>
>> Sorry, I don't really have any ideas here. The stack trace is deep in STL 
>> code for the type handler map, inside find(). If you've registered no type 
>> handlers, that map should be empty. It's hard to imagine how find() on an 
>> empty std::unordered_map could ever segfault...
>>
>> -Kenton
>>
>> On Fri, Mar 2, 2018 at 11:07 AM, Marc Sune wrote:
>>
>>> Kenton,
>>>
>>> On Friday, March 2, 2018 at 7:27:30 PM UTC+1, Kenton Varda wrote:

>>>> Hi Marc,
>>>>
>>>> Do you have any custom type handlers registered via addTypeHandler()? 
>>>> Is it possible that the handler class has gone out-of-scope (been 
>>>> destroyed) by the time the encoder is executed?

>>>
>>> No, I am not using custom handlers. I am just dumping the Builder like 
>>> this (simplified):
>>>
>>> template <typename T>
>>> void dump_msg(T& msg){
>>>     try{
>>>         capnp::JsonCodec enc;
>>>         enc.setPrettyPrint(PPRINT);
>>>         // encode() returns a kj::String holding the JSON text
>>>         auto t = enc.encode(msg);
>>>         fprintf(stderr, "MSG: %s\n", t.cStr());
>>>     }catch(...){}
>>> }
>>>
>>> Where, in this case, msg is:
>>>
>>> capnp::MallocMessageBuilder builder;
>>> auto msg = builder.initRoot();
>>>
>>> //Fill it
>>>
>>> dump_msg(msg);
>>>
>>> So the builder and encoder are valid, I believe, during the entire dump. 
>>> Moreover, valgrind would complain before the SEGFAULT if something were 
>>> accessed out of the stack, and the only thing I get is the direct SEGFAULT.
>>>
>>> Any thoughts? I will keep trying to isolate the problem
>>>
>>> marc
>>>
>>> -Kenton

>>>> On Fri, Mar 2, 2018 at 10:18 AM, Marc Sune wrote:
>>>>
>>>>> > That's about all that I can get, even though capnproto is compiled 
>>>>> with DEBUG.
>>>>>
>>>>> Or shall I say, it should :/
>>>>>
>>>>>
>>>>> On Friday, March 2, 2018 at 7:17:02 PM UTC+1, Marc Sune wrote:
>>>>>>
>>>>>> Hi guys,
>>>>>>
>>>>>> I am experiencing a _very_ strange segfault during JSON encoding of a 
>>>>>> message:
>>>>>>
>>>>>> ```
>>>>>> Program received signal SIGSEGV, Segmentation fault.
>>>>>> [Switching to Thread 27946]
>>>>>> 0x01e4b81a in std::__detail::_Hashtable_ebo_helper<1, 
>>>>>> capnp::(anonymous namespace)::TypeHash, 
>>>>>> true>::_S_cget(std::__detail::_Hashtable_ebo_helper<1, capnp::(anonymous 
>>>>>> namespace)::TypeHash, true> const&) ()
>>>>>> (gdb) bt
>>>>>> #0  0x01e4b81a in std::__detail::_Hashtable_ebo_helper<1, 
>>>>>> capnp::(anonymous namespace)::TypeHash, 
>>>>>> 

Re: [capnproto] Re: Segfault during JSON encode() in v0.6.1

2018-03-02 Thread Marc Sune
Kenton,

On Friday, March 2, 2018 at 7:27:30 PM UTC+1, Kenton Varda wrote:
>
> Hi Marc,
>
> Do you have any custom type handlers registered via addTypeHandler()? Is 
> it possible that the handler class has gone out-of-scope (been destroyed) 
> by the time the encoder is executed?
>

No, I am not using custom handlers. I am just dumping the Builder like this 
(simplified):

template <typename T>
void dump_msg(T& msg){
    try{
        capnp::JsonCodec enc;
        enc.setPrettyPrint(PPRINT);
        // encode() returns a kj::String holding the JSON text
        auto t = enc.encode(msg);
        fprintf(stderr, "MSG: %s\n", t.cStr());
    }catch(...){}
}

Where, in this case, msg is:

capnp::MallocMessageBuilder builder;
auto msg = builder.initRoot();

//Fill it

dump_msg(msg);

So the builder and encoder are valid, I believe, during the entire dump. 
Moreover, valgrind would complain before the SEGFAULT if something were 
accessed out of the stack, and the only thing I get is the direct SEGFAULT.

Any thoughts? I will keep trying to isolate the problem

marc

-Kenton
>
> On Fri, Mar 2, 2018 at 10:18 AM, Marc Sune wrote:
>
>> > That's about all that I can get, even though capnproto is compiled with 
>> DEBUG.
>>
>> Or shall I say, it should :/
>>
>>
>> On Friday, March 2, 2018 at 7:17:02 PM UTC+1, Marc Sune wrote:
>>>
>>> Hi guys,
>>>
>>> I am experiencing a _very_ strange segfault during JSON encoding of a 
>>> message:
>>>
>>> ```
>>> Program received signal SIGSEGV, Segmentation fault.
>>> [Switching to Thread 27946]
>>> 0x01e4b81a in std::__detail::_Hashtable_ebo_helper<1, 
>>> capnp::(anonymous namespace)::TypeHash, 
>>> true>::_S_cget(std::__detail::_Hashtable_ebo_helper<1, capnp::(anonymous 
>>> namespace)::TypeHash, true> const&) ()
>>> (gdb) bt
>>> #0  0x01e4b81a in std::__detail::_Hashtable_ebo_helper<1, 
>>> capnp::(anonymous namespace)::TypeHash, 
>>> true>::_S_cget(std::__detail::_Hashtable_ebo_helper<1, capnp::(anonymous 
>>> namespace)::TypeHash, true> const&) ()
>>> #1  0x01e4b11a in std::__detail::_Hash_code_base>> std::pair, 
>>> std::__detail::_Select1st, capnp::(anonymous namespace)::TypeHash, 
>>> std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, 
>>> true>::_M_h1() const ()
>>> #2  0x01e4a976 in std::__detail::_Hash_code_base>> std::pair, 
>>> std::__detail::_Select1st, capnp::(anonymous namespace)::TypeHash, 
>>> std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, 
>>> true>::_M_hash_code(capnp::Type const&) const ()
>>> #3  0x01e4a401 in std::_Hashtable>> std::pair, 
>>> std::allocator>> >, std::__detail::_Select1st, std::equal_to, capnp::(anonymous 
>>> namespace)::TypeHash, std::__detail::_Mod_range_hashing, 
>>> std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, 
>>> std::__detail::_Hashtable_traits >::find(capnp::Type 
>>> const&) const ()
>>> #4  0x01e483a3 in std::unordered_map>> capnp::JsonCodec::HandlerBase*, capnp::(anonymous namespace)::TypeHash, 
>>> std::equal_to, std::allocator> capnp::JsonCodec::HandlerBase*> > >::find(capnp::Type const&) const ()
>>> #5  0x01e443d5 in 
>>> capnp::JsonCodec::encode(capnp::DynamicValue::Reader, capnp::Type, 
>>> capnp::JsonValue::Builder) const ()
>>> #6  0x01e44a6f in 
>>> capnp::JsonCodec::encode(capnp::DynamicValue::Reader, capnp::Type, 
>>> capnp::JsonValue::Builder) const ()
>>> #7  0x01e45a36 in 
>>> capnp::JsonCodec::encodeField(capnp::StructSchema::Field, 
>>> capnp::DynamicValue::Reader, capnp::JsonValue::Builder) const ()
>>> #8  0x01e45377 in 
>>> capnp::JsonCodec::encode(capnp::DynamicValue::Reader, capnp::Type, 
>>> capnp::JsonValue::Builder) const ()
>>> #9  0x01e45a36 in 
>>> capnp::JsonCodec::encodeField(capnp::StructSchema::Field, 
>>> capnp::DynamicValue::Reader, capnp::JsonValue::Builder) const ()
>>> #10 0x01e45377 in 
>>> capnp::JsonCodec::encode(capnp::DynamicValue::Reader, capnp::Type, 
>>> capnp::JsonValue::Builder) const ()
>>> #11 0x01e45a36 in 
>>> capnp::JsonCodec::encodeField(capnp::StructSchema::Field, 
>>> capnp::DynamicValue::Reader, capnp::JsonValue::Builder) const ()
>>> #12 0x01e45377 in 
>>> capnp::JsonCodec::encode(capnp::DynamicValue::Reader, capnp::Type, 
>>> capnp::JsonValue::Builder) const ()
>>> #13 0x01e45a36 in 
>>> capnp::JsonCodec::encodeField(capnp::StructSchema::Field, 
>>> capnp::DynamicValue::Reader, capnp::JsonValue::Builder) const ()
>>> #14 0x01e45618 in 
>>> capnp::JsonCodec::encode(capnp::DynamicValue::Reader, capnp::Type, 
>>> capnp::JsonValue::Builder) const ()
>>> #15 0x01e45a36 in 
>>> capnp::JsonCodec::encodeField(capnp::StructSchema::Field, 
>>> capnp::DynamicValue::Reader, capnp::JsonValue::Builder) const ()
>>> #16 0x01e45618 in 
>>>