[ 
https://issues.apache.org/jira/browse/MESOS-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14203060#comment-14203060
 ] 

Till Toenshoff commented on MESOS-2050:
---------------------------------------

commit ce82e81c12d7f91b158c73465c47725331626f32
Author: Till Toenshoff <toensh...@me.com>
Date:   Sat Nov 8 01:19:50 2014 +0100

    Fixed Authenticator SASL auxiliary memory access.

    Review: https://reviews.apache.org/r/27741

> InMemoryAuxProp plugin used by Authenticators results in SEGFAULT
> -----------------------------------------------------------------
>
>                 Key: MESOS-2050
>                 URL: https://issues.apache.org/jira/browse/MESOS-2050
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 0.21.0
>            Reporter: Vinod Kone
>            Assignee: Till Toenshoff
>
> Observed this on ASF CI:
> Basically, as part of the recent Auth refactor for modules, the loading of 
> secrets is being done once per Authenticator Process instead of once in the 
> Master.  Since, InMemoryAuxProp plugin manipulates static variables (e.g, 
> 'properties') it results in SEGFAULT when one Authenticator (e.g., for slave) 
> does load() while another Authenticator (e.g., for framework) does lookup(), 
> as both these methods manipulate static 'properties'.
> {code}
> [ RUN      ] MasterTest.LaunchDuplicateOfferTest
> Using temporary directory '/tmp/MasterTest_LaunchDuplicateOfferTest_XEBbvp'
> I1104 03:37:55.523553 28363 leveldb.cpp:176] Opened db in 2.270387ms
> I1104 03:37:55.524250 28363 leveldb.cpp:183] Compacted db in 662527ns
> I1104 03:37:55.524276 28363 leveldb.cpp:198] Created db iterator in 4964ns
> I1104 03:37:55.524284 28363 leveldb.cpp:204] Seeked to beginning of db in 
> 702ns
> I1104 03:37:55.524291 28363 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 450ns
> I1104 03:37:55.524333 28363 replica.cpp:741] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1104 03:37:55.524852 28384 recover.cpp:437] Starting replica recovery
> I1104 03:37:55.525188 28384 recover.cpp:463] Replica is in EMPTY status
> I1104 03:37:55.526577 28378 replica.cpp:638] Replica in EMPTY status received 
> a broadcasted recover request
> I1104 03:37:55.527135 28378 master.cpp:318] Master 
> 20141104-033755-3176252227-49988-28363 (proserpina.apache.org) started on 
> 67.195.81.189:49988
> I1104 03:37:55.527180 28378 master.cpp:364] Master only allowing 
> authenticated frameworks to register
> I1104 03:37:55.527191 28378 master.cpp:369] Master only allowing 
> authenticated slaves to register
> I1104 03:37:55.527217 28378 credentials.hpp:36] Loading credentials for 
> authentication from 
> '/tmp/MasterTest_LaunchDuplicateOfferTest_XEBbvp/credentials'
> I1104 03:37:55.527451 28378 master.cpp:408] Authorization enabled
> I1104 03:37:55.528081 28384 master.cpp:126] No whitelist given. Advertising 
> offers for all slaves
> I1104 03:37:55.528548 28383 recover.cpp:188] Received a recover response from 
> a replica in EMPTY status
> I1104 03:37:55.528645 28388 hierarchical_allocator_process.hpp:299] 
> Initializing hierarchical allocator process with master : 
> master@67.195.81.189:49988
> I1104 03:37:55.529233 28388 master.cpp:1258] The newly elected leader is 
> master@67.195.81.189:49988 with id 20141104-033755-3176252227-49988-28363
> I1104 03:37:55.529266 28388 master.cpp:1271] Elected as the leading master!
> I1104 03:37:55.529289 28388 master.cpp:1089] Recovering from registrar
> I1104 03:37:55.529311 28385 recover.cpp:554] Updating replica status to 
> STARTING
> I1104 03:37:55.529500 28384 registrar.cpp:313] Recovering registrar
> I1104 03:37:55.530037 28383 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 497965ns
> I1104 03:37:55.530083 28383 replica.cpp:320] Persisted replica status to 
> STARTING
> I1104 03:37:55.530335 28387 recover.cpp:463] Replica is in STARTING status
> I1104 03:37:55.531343 28381 replica.cpp:638] Replica in STARTING status 
> received a broadcasted recover request
> I1104 03:37:55.531739 28384 recover.cpp:188] Received a recover response from 
> a replica in STARTING status
> I1104 03:37:55.532168 28379 recover.cpp:554] Updating replica status to VOTING
> I1104 03:37:55.532572 28381 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 293974ns
> I1104 03:37:55.532594 28381 replica.cpp:320] Persisted replica status to 
> VOTING
> I1104 03:37:55.532790 28390 recover.cpp:568] Successfully joined the Paxos 
> group
> I1104 03:37:55.533107 28390 recover.cpp:452] Recover process terminated
> I1104 03:37:55.533604 28382 log.cpp:656] Attempting to start the writer
> I1104 03:37:55.534840 28381 replica.cpp:474] Replica received implicit 
> promise request with proposal 1
> I1104 03:37:55.535188 28381 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 321021ns
> I1104 03:37:55.535212 28381 replica.cpp:342] Persisted promised to 1
> I1104 03:37:55.535893 28378 coordinator.cpp:230] Coordinator attemping to 
> fill missing position
> I1104 03:37:55.537318 28392 replica.cpp:375] Replica received explicit 
> promise request for position 0 with proposal 2
> I1104 03:37:55.537719 28392 leveldb.cpp:343] Persisting action (8 bytes) to 
> leveldb took 366858ns
> I1104 03:37:55.537746 28392 replica.cpp:676] Persisted action at 0
> I1104 03:37:55.538872 28391 replica.cpp:508] Replica received write request 
> for position 0
> I1104 03:37:55.538952 28391 leveldb.cpp:438] Reading position from leveldb 
> took 36647ns
> I1104 03:37:55.539482 28391 leveldb.cpp:343] Persisting action (14 bytes) to 
> leveldb took 483191ns
> I1104 03:37:55.539510 28391 replica.cpp:676] Persisted action at 0
> I1104 03:37:55.540045 28379 replica.cpp:655] Replica received learned notice 
> for position 0
> I1104 03:37:55.540402 28379 leveldb.cpp:343] Persisting action (16 bytes) to 
> leveldb took 299913ns
> I1104 03:37:55.540429 28379 replica.cpp:676] Persisted action at 0
> I1104 03:37:55.540455 28379 replica.cpp:661] Replica learned NOP action at 
> position 0
> I1104 03:37:55.541205 28385 log.cpp:672] Writer started with ending position 0
> I1104 03:37:55.542579 28378 leveldb.cpp:438] Reading position from leveldb 
> took 33358ns
> I1104 03:37:55.545608 28377 registrar.cpp:346] Successfully fetched the 
> registry (0B) in 16.064768ms
> I1104 03:37:55.545778 28377 registrar.cpp:445] Applied 1 operations in 
> 31511ns; attempting to update the 'registry'
> I1104 03:37:55.548598 28382 log.cpp:680] Attempting to append 143 bytes to 
> the log
> I1104 03:37:55.548821 28385 coordinator.cpp:340] Coordinator attempting to 
> write APPEND action at position 1
> I1104 03:37:55.549759 28385 replica.cpp:508] Replica received write request 
> for position 1
> I1104 03:37:55.550269 28385 leveldb.cpp:343] Persisting action (162 bytes) to 
> leveldb took 475014ns
> I1104 03:37:55.550300 28385 replica.cpp:676] Persisted action at 1
> I1104 03:37:55.550984 28382 replica.cpp:655] Replica received learned notice 
> for position 1
> I1104 03:37:55.551311 28382 leveldb.cpp:343] Persisting action (164 bytes) to 
> leveldb took 304432ns
> I1104 03:37:55.551332 28382 replica.cpp:676] Persisted action at 1
> I1104 03:37:55.551349 28382 replica.cpp:661] Replica learned APPEND action at 
> position 1
> I1104 03:37:55.552196 28392 registrar.cpp:490] Successfully updated the 
> 'registry' in 6.35904ms
> I1104 03:37:55.552338 28392 registrar.cpp:376] Successfully recovered 
> registrar
> I1104 03:37:55.552423 28385 log.cpp:699] Attempting to truncate the log to 1
> I1104 03:37:55.552577 28382 master.cpp:1116] Recovered 0 slaves from the 
> Registry (105B) ; allowing 10mins for slaves to re-register
> I1104 03:37:55.552647 28378 coordinator.cpp:340] Coordinator attempting to 
> write TRUNCATE action at position 2
> I1104 03:37:55.553383 28391 replica.cpp:508] Replica received write request 
> for position 2
> I1104 03:37:55.553772 28391 leveldb.cpp:343] Persisting action (16 bytes) to 
> leveldb took 357114ns
> I1104 03:37:55.553797 28391 replica.cpp:676] Persisted action at 2
> I1104 03:37:55.554234 28379 replica.cpp:655] Replica received learned notice 
> for position 2
> I1104 03:37:55.554529 28379 leveldb.cpp:343] Persisting action (18 bytes) to 
> leveldb took 273827ns
> I1104 03:37:55.554565 28379 leveldb.cpp:401] Deleting ~1 keys from leveldb 
> took 16455ns
> I1104 03:37:55.554580 28379 replica.cpp:676] Persisted action at 2
> I1104 03:37:55.554596 28379 replica.cpp:661] Replica learned TRUNCATE action 
> at position 2
> I1104 03:37:55.566700 28385 slave.cpp:169] Slave started on 
> 2)@67.195.81.189:49988
> I1104 03:37:55.566766 28385 credentials.hpp:84] Loading credential for 
> authentication from 
> '/tmp/MasterTest_LaunchDuplicateOfferTest_He08c1/credential'
> I1104 03:37:55.566946 28385 slave.cpp:276] Slave using credential for: 
> test-principal
> I1104 03:37:55.567327 28385 slave.cpp:289] Slave resources: cpus(*):2; 
> mem(*):1024; disk(*):3.70122e+06; ports(*):[31000-32000]
> I1104 03:37:55.567436 28385 slave.cpp:318] Slave hostname: 
> proserpina.apache.org
> I1104 03:37:55.567458 28385 slave.cpp:319] Slave checkpoint: false
> W1104 03:37:55.567471 28385 slave.cpp:321] Disabling checkpointing is 
> deprecated and the --checkpoint flag will be removed in a future release. 
> Please avoid using this flag
> I1104 03:37:55.568488 28392 state.cpp:33] Recovering state from 
> '/tmp/MasterTest_LaunchDuplicateOfferTest_He08c1/meta'
> I1104 03:37:55.568739 28380 status_update_manager.cpp:197] Recovering status 
> update manager
> I1104 03:37:55.569128 28387 slave.cpp:3478] Finished recovery
> I1104 03:37:55.569718 28388 status_update_manager.cpp:171] Pausing sending 
> status updates
> I1104 03:37:55.569773 28384 slave.cpp:602] New master detected at 
> master@67.195.81.189:49988
> I1104 03:37:55.569820 28384 slave.cpp:665] Authenticating with master 
> master@67.195.81.189:49988
> I1104 03:37:55.570000 28386 authenticatee.hpp:133] Creating new client SASL 
> connection
> I1104 03:37:55.570137 28384 slave.cpp:638] Detecting new master
> I1104 03:37:55.570248 28385 master.cpp:3869] Authenticating 
> slave(2)@67.195.81.189:49988
> I1104 03:37:55.570291 28385 master.cpp:3880] Using default CRAM-MD5 
> authenticator
> I1104 03:37:55.570592 28377 authenticator.hpp:169] Creating new server SASL 
> connection
> I1104 03:37:55.570808 28377 authenticatee.hpp:224] Received SASL 
> authentication mechanisms: CRAM-MD5
> I1104 03:37:55.570826 28363 sched.cpp:137] Version: 0.21.0
> I1104 03:37:55.570837 28377 authenticatee.hpp:250] Attempting to authenticate 
> with mechanism 'CRAM-MD5'
> I1104 03:37:55.571281 28386 authenticator.hpp:275] Received SASL 
> authentication start
> I1104 03:37:55.571321 28386 authenticator.hpp:397] Authentication requires 
> more steps
> I1104 03:37:55.571599 28391 sched.cpp:233] New master detected at 
> master@67.195.81.189:49988
> I1104 03:37:55.571635 28391 sched.cpp:283] Authenticating with master 
> master@67.195.81.189:49988
> I1104 03:37:55.571771 28378 authenticatee.hpp:133] Creating new client SASL 
> connection
> I1104 03:37:55.571974 28377 authenticatee.hpp:270] Received SASL 
> authentication step
> I1104 03:37:55.572016 28380 master.cpp:3869] Authenticating 
> scheduler-3dc2b4d5-b421-4e73-9088-32ed103e743d@67.195.81.189:49988
> I1104 03:37:55.572082 28377 authenticator.hpp:303] Received SASL 
> authentication step
> I1104 03:37:55.572103 28380 master.cpp:3880] Using default CRAM-MD5 
> authenticator
> I1104 03:37:55.572108 28377 auxprop.cpp:99] Request to lookup properties for 
> user: 'test-principal' realm: 'proserpina.apache.org' server FQDN: 
> 'proserpina.apache.org' SASL_AUXPROP_VERIFY_AGAINST_HASH: false 
> SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false 
> I1104 03:37:55.572119 28377 auxprop.cpp:171] Looking up auxiliary property 
> '*userPassword'
> *** Aborted at 1415072275 (unix time) try "date -d @1415072275" if you are 
> using GNU date ***
> I1104 03:37:55.572304 28387 authenticator.hpp:169] Creating new server SASL 
> connection
> PC: @     0x2aae9cc945cb (unknown)
> I1104 03:37:55.572639 28380 authenticatee.hpp:224] Received SASL 
> authentication mechanisms: CRAM-MD5
> I1104 03:37:55.572667 28380 authenticatee.hpp:250] Attempting to authenticate 
> with mechanism 'CRAM-MD5'
> I1104 03:37:55.572756 28391 authenticator.hpp:275] Received SASL 
> authentication start
> I1104 03:37:55.572796 28391 authenticator.hpp:397] Authentication requires 
> more steps
> I1104 03:37:55.572860 28391 authenticatee.hpp:270] Received SASL 
> authentication step
> I1104 03:37:55.572939 28387 authenticator.hpp:303] Received SASL 
> authentication step
> I1104 03:37:55.572958 28387 auxprop.cpp:99] Request to lookup properties for 
> user: 'test-principal' realm: 'proserpina.apache.org' server FQDN: 
> 'proserpina.apache.org' SASL_AUXPROP_VERIFY_AGAINST_HASH: false 
> SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false 
> I1104 03:37:55.572967 28387 auxprop.cpp:171] Looking up auxiliary property 
> '*userPassword'
> I1104 03:37:55.572999 28387 auxprop.cpp:171] Looking up auxiliary property 
> '*cmusaslsecretCRAM-MD5'
> I1104 03:37:55.573015 28387 auxprop.cpp:99] Request to lookup properties for 
> user: 'test-principal' realm: 'proserpina.apache.org' server FQDN: 
> 'proserpina.apache.org' SASL_AUXPROP_VERIFY_AGAINST_HASH: false 
> SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true 
> I1104 03:37:55.573027 28387 auxprop.cpp:121] Skipping auxiliary property 
> '*userPassword' since SASL_AUXPROP_AUTHZID == true
> I1104 03:37:55.573034 28387 auxprop.cpp:121] Skipping auxiliary property 
> '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
> I1104 03:37:55.573045 28387 authenticator.hpp:389] Authentication success
> I1104 03:37:55.573132 28390 authenticatee.hpp:310] Authentication success
> I1104 03:37:55.573168 28388 master.cpp:3930] Successfully authenticated 
> principal 'test-principal' at 
> scheduler-3dc2b4d5-b421-4e73-9088-32ed103e743d@67.195.81.189:49988
> I1104 03:37:55.573325 28390 sched.cpp:357] Successfully authenticated with 
> master master@67.195.81.189:49988
> I1104 03:37:55.573348 28390 sched.cpp:476] Sending registration request to 
> master@67.195.81.189:49988
> I1104 03:37:55.573606 28388 master.cpp:1378] Received registration request 
> for framework 'default' at 
> scheduler-3dc2b4d5-b421-4e73-9088-32ed103e743d@67.195.81.189:49988
> I1104 03:37:55.573689 28388 master.cpp:1337] Authorizing framework principal 
> 'test-principal' to receive offers for role '*'
> I1104 03:37:55.574188 28388 master.cpp:1442] Registering framework 
> 20141104-033755-3176252227-49988-28363-0000 (default) at 
> scheduler-3dc2b4d5-b421-4e73-9088-32ed103e743d@67.195.81.189:49988
> I1104 03:37:55.574515 28386 hierarchical_allocator_process.hpp:329] Added 
> framework 20141104-033755-3176252227-49988-28363-0000
> I1104 03:37:55.574547 28386 hierarchical_allocator_process.hpp:697] No 
> resources available to allocate!
> I1104 03:37:55.574560 28386 hierarchical_allocator_process.hpp:659] Performed 
> allocation for 0 slaves in 19409ns
> I1104 03:37:55.574631 28388 sched.cpp:407] Framework registered with 
> 20141104-033755-3176252227-49988-28363-0000
> I1104 03:37:55.574676 28388 sched.cpp:421] Scheduler::registered took 20198ns
> *** SIGSEGV (@0xfffffffffffffff8) received by PID 28363 (TID 0x2aae9f399700) 
> from PID 18446744073709551608; stack trace: ***
>     @     0x2aae9d409340 (unknown)
>     @     0x2aae9cc945cb (unknown)
>     @           0x98e4f0 _ZNSt10_List_nodeISsEC2IJRKSsEEEDpOT_
>     @           0x9788c2 
> _ZN9__gnu_cxx13new_allocatorISt10_List_nodeISsEE9constructIS2_JRKSsEEEvPT_DpOT0_
>     @           0x956696 std::list<>::_M_create_node<>()
>     @           0x9376a4 std::list<>::_M_insert<>()
>     @           0x978829 _ZNSt4listISsSaISsEE12emplace_backIJRKSsEEEvDpOT_
>     @           0x956534 std::list<>::_M_initialize_dispatch<>()
>     @           0x9374e1 std::list<>::list()
>     @           0xb0f80f mesos::internal::cram_md5::Property::Property()
>     @     0x2aae9a6779f8 std::_List_node<>::_List_node<>()
>     @     0x2aae9a67664e 
> _ZN9__gnu_cxx13new_allocatorISt10_List_nodeIN5mesos8internal8cram_md58PropertyEEE9constructIS6_JRKS5_EEEvPT_DpOT0_
>     @     0x2aae9a674876 
> _ZNSt4listIN5mesos8internal8cram_md58PropertyESaIS3_EE14_M_create_nodeIJRKS3_EEEPSt10_List_nodeIS3_EDpOT_
>     @     0x2aae9a6721c6 
> _ZNSt4listIN5mesos8internal8cram_md58PropertyESaIS3_EE9_M_insertIJRKS3_EEEvSt14_List_iteratorIS3_EDpOT_
>     @     0x2aae9a670326 std::list<>::push_back()
>     @     0x2aae9a66e704 Multimap<>::get()
>     @     0x2aae9a66b561 
> mesos::internal::cram_md5::InMemoryAuxiliaryPropertyPlugin::lookup()
>     @     0x2aae9a66aa3d 
> mesos::internal::cram_md5::InMemoryAuxiliaryPropertyPlugin::lookup()
>     @     0x2aae9c7b9e29 (unknown)
>     @     0x2aae9c7babe5 (unknown)
>     @     0x2aaea2b8252e (unknown)
>     @     0x2aaea2b82929 (unknown)
>     @     0x2aae9c7c6628 (unknown)
>     @           0xb0c482 
> mesos::internal::cram_md5::CRAMMD5AuthenticatorProcess::step()
>     @           0xb135e9 ProtobufProcess<>::_handler1<>()
>     @           0xb25229 
> _ZNSt5_BindIFPFvPN5mesos8internal8cram_md527CRAMMD5AuthenticatorProcessEMS3_FvRKSsEMNS1_25AuthenticationStepMessageEKFS6_vERKN7process4UPIDES6_ES4_S8_SB_St12_PlaceholderILi1EESI_ILi2EEEE6__callIvJSF_S6_EJLm0ELm1ELm2ELm3ELm4EEEET_OSt5tupleIJDpT0_EESt12_Index_tupleIJXspT1_EEE
>     @           0xb2078c 
> _ZNSt5_BindIFPFvPN5mesos8internal8cram_md527CRAMMD5AuthenticatorProcessEMS3_FvRKSsEMNS1_25AuthenticationStepMessageEKFS6_vERKN7process4UPIDES6_ES4_S8_SB_St12_PlaceholderILi1EESI_ILi2EEEEclIJSF_S6_EvEET0_DpOT_
>     @           0xb1c33b std::_Function_handler<>::_M_invoke()
>     @           0xb2bcc5 std::function<>::operator()()
>     @           0xb2ba8e ProtobufProcess<>::visit()
>     @     0x2aae9a84120a process::MessageEvent::visit()
>     @           0x904766 process::ProcessBase::serve()
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to