[jira] [Updated] (HDFS-9486) Valgrind failures when using more than 1 io_service worker thread.

James Clampffer (JIRA) Fri, 04 Dec 2015 12:14:25 -0800

     [ https://issues.apache.org/jira/browse/HDFS-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


James Clampffer updated HDFS-9486:
----------------------------------
    Attachment: HDFS-9486.HDFS-8707.000.patch

Attached patch.

The cause was stale code, originally meant to prevent a different type of bug 
that used to be common, combined with a missing comment: the note flagging this 
as a potential issue got lost over time.  Here's what was going on:
1) FileSystemImpl's destructor would explicitly reset
FileSystemImpl::io_service_.  
2) Then the NameNodeOperations member of FileSystemImpl would be implicitly 
destroyed; its RpcEngine member would be destroyed in turn.  
3) The RpcEngine destructor would then attempt to do some work on the 
underlying asio::io_service (it keeps a pointer to it) after the io_service 
had already been destroyed.
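
For illustration, the teardown sequence above can be sketched with stand-in 
types.  This is a minimal sketch, not the libhdfs++ code: FakeIoService, 
FakeRpcEngine, BuggyFileSystem and BuggyTeardownOrder are hypothetical names, 
and the engine destructor only logs rather than dereferencing its pointer, so 
the sketch itself never touches freed memory.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins; not the real libhdfs++ classes.
static std::vector<std::string> g_events;  // records the teardown order

struct FakeIoService {
  ~FakeIoService() { g_events.push_back("io_service destroyed"); }
};

struct FakeRpcEngine {
  // The real engine keeps a raw pointer to the io_service and uses it here.
  // This sketch only logs, so it stays well-defined.
  ~FakeRpcEngine() { g_events.push_back("engine destructor runs"); }
};

struct BuggyFileSystem {
  std::unique_ptr<FakeIoService> io_service_{new FakeIoService};
  FakeRpcEngine engine_;

  // A destructor body runs before any member is destroyed, so this explicit
  // reset tears down the io_service while engine_ is still alive.
  ~BuggyFileSystem() { io_service_.reset(); }
};

// Builds and destroys one BuggyFileSystem and returns the recorded order.
std::vector<std::string> BuggyTeardownOrder() {
  g_events.clear();
  { BuggyFileSystem fs; }
  return g_events;
}
```

The recorded order puts the io_service teardown ahead of the engine 
destructor, which is the window in which a pointer held by the engine would 
dangle.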

The fix is just removing the explicit unique_ptr reset and adding a comment 
saying that FileSystemImpl::io_service_ must always be the first declared 
member variable, so that it is guaranteed to be the last member variable 
destroyed (C++ destroys members in reverse order of declaration).
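
A minimal sketch of the declaration-order rule the fix relies on, again with 
hypothetical stand-in types (DemoIoService, DemoRpcEngine, DemoFileSystem and 
EngineSawLiveServiceAtTeardown are not names from the patch).  Here the engine 
destructor does read through its non-owning pointer, which is safe only 
because the service is declared first and therefore destroyed last.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins; not the real libhdfs++ classes.
struct DemoIoService {
  bool alive = true;
};

static bool g_engine_saw_live_service = false;

struct DemoRpcEngine {
  explicit DemoRpcEngine(DemoIoService* service) : service_(service) {}
  // Uses the service during teardown, as the real engine is described to do.
  ~DemoRpcEngine() { g_engine_saw_live_service = service_->alive; }
  DemoIoService* service_;  // non-owning
};

struct DemoFileSystem {
  DemoFileSystem()
      : io_service_(new DemoIoService), engine_(io_service_.get()) {}

  // io_service_ must stay the first declared member: members are destroyed
  // in reverse declaration order, so it outlives everything that points at it.
  std::unique_ptr<DemoIoService> io_service_;
  DemoRpcEngine engine_;
  // No user-written destructor and no explicit reset of io_service_.
};

// Builds and destroys one DemoFileSystem; reports what the engine observed.
bool EngineSawLiveServiceAtTeardown() {
  g_engine_saw_live_service = false;
  { DemoFileSystem fs; }
  return g_engine_saw_live_service;
}
```

Swapping the two member declarations would reintroduce the use-after-free, 
which is why the comment on the member matters.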

> Valgrind failures when using more than 1 io_service worker thread.
> ------------------------------------------------------------------
>
>                 Key: HDFS-9486
>                 URL: https://issues.apache.org/jira/browse/HDFS-9486
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: James Clampffer
>            Assignee: James Clampffer
>         Attachments: HDFS-9486-stacks-sanitized.txt, 
> HDFS-9486.HDFS-8707.000.patch
>
>
> Valgrind catches an invalid read of size 8.  Setup: 4 io_service worker 
> threads, 64 threads doing open-read-close on a small file.
> Stack:
> ==8351== Invalid read of size 8
> ==8351==    at 0x51F45C: 
> asio::detail::reactive_socket_recv_op<asio::mutable_buffers_1, 
> asio::detail::read_op<asio::basic_stream_socket<asio::ip::tcp, 
> asio::stream_socket_service<asio::ip::tcp> >, asio::mutable_buffers_1, 
> asio::detail::transfer_all_t, std::_Bind<std::_Mem_fn<void 
> (hdfs::RpcConnectionImpl<asio::basic_stream_socket<asio::ip::tcp, 
> asio::stream_socket_service<asio::ip::tcp> > >::*)(std::error_code const&, 
> unsigned long)> 
> (hdfs::RpcConnectionImpl<asio::basic_stream_socket<asio::ip::tcp, 
> asio::stream_socket_service<asio::ip::tcp> > >*, std::_Placeholder<1>, 
> std::_Placeholder<2>)> > >::do_complete(asio::detail::task_io_service*, 
> asio::detail::task_io_service_operation*, std::error_code const&, unsigned 
> long) (functional:601)
> ==8351==    by 0x508B10: hdfs::IoServiceImpl::Run() 
> (task_io_service_operation.hpp:37)
> ==8351==    by 0x55BCBEF: ??? (in 
> /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19)
> ==8351==    by 0x5A2D181: start_thread (pthread_create.c:312)
> ==8351==    by 0x5D3D47C: clone (clone.S:111)
> ==8351==  Address 0x67e3eb0 is 0 bytes inside a block of size 216 free'd
> ==8351==    at 0x4C2C2BC: operator delete(void*) (in 
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==8351==    by 0x51F7B2: 
> hdfs::RpcConnectionImpl<asio::basic_stream_socket<asio::ip::tcp, 
> asio::stream_socket_service<asio::ip::tcp> > >::~RpcConnectionImpl() 
> (rpc_connection.h:32)
> ==8351==    by 0x50C104: hdfs::FileSystemImpl::~FileSystemImpl() 
> (unique_ptr.h:67)
> ==8351==    by 0x503A10: hdfs::HadoopFileSystem::~HadoopFileSystem() 
> (unique_ptr.h:67)
> ==8351==    by 0x503B28: hdfs::HadoopFileSystem::~HadoopFileSystem() 
> (hdfs_cpp.cc:140)
> ==8351==    by 0x503580: hdfs_internal::~hdfs_internal() (unique_ptr.h:67)
> ==8351==    by 0x502FEE: hdfsDisconnect (hdfs.cc:127)
> ==8351==    by 0x5010B7: main (threaded_stress_test.cc:74)
> ==8351== 
> pure virtual method called
> terminate called without an active exception



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
