Hi Goncalo,

If possible, it would be great if you could capture a core file for this with
full debugging symbols (preferably glibc debuginfo as well). How you do
that will depend on the ceph version and your OS, but I'm sure we can offer
help if required.
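
As a rough sketch of one common way to allow core dumps on Linux (this is an
assumption, since the right method depends on how ceph-fuse is started on
your system), you could raise the core size limit in the shell that launches
ceph-fuse and check where the kernel writes core files:

```shell
# Assumption: a Linux host, with ceph-fuse started from this same shell.
# Raise the soft core-size limit to whatever the hard limit allows.
ulimit -S -c "$(ulimit -H -c)"

# Show the resulting soft limit ("unlimited" or a size in blocks).
ulimit -S -c

# Linux only: show the pattern the kernel uses to name or pipe core files.
if [ -r /proc/sys/kernel/core_pattern ]; then
    cat /proc/sys/kernel/core_pattern
fi
```

The limit only applies to processes started from that shell afterwards, so
ceph-fuse would need to be started again from it before reproducing the crash.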

Once you have the core, do the following:

$ gdb /path/to/ceph-fuse core.XXXX
(gdb) set pagination off
(gdb) set logging on
(gdb) thread apply all bt
(gdb) thread apply all bt full

Then quit gdb and you should find a file called gdb.txt in your
working directory.
Please attach that file to http://tracker.ceph.com/issues/16610
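
The same gdb steps can also be run non-interactively. The sketch below
assumes a POSIX shell; collect_backtraces is a hypothetical helper name, and
the paths are the same placeholders as above. It relies on gdb's -batch and
-ex options and redirects the output to a file rather than using gdb's own
logging:

```shell
# Hypothetical helper: write all thread backtraces from a core file to a log.
#   $1 = path to the binary that crashed (e.g. /path/to/ceph-fuse)
#   $2 = path to the core file
#   $3 = output file (optional, defaults to gdb.txt)
collect_backtraces() {
    binary=$1
    core=$2
    out=${3:-gdb.txt}
    # -batch runs the given commands and exits; each -ex is one gdb command.
    gdb -batch \
        -ex "set pagination off" \
        -ex "thread apply all bt" \
        -ex "thread apply all bt full" \
        "$binary" "$core" > "$out" 2>&1
}
```

It would be called as: collect_backtraces /path/to/ceph-fuse core.XXXX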

Cheers,
Brad

On Fri, Jul 8, 2016 at 12:06 AM, Patrick Donnelly <pdonn...@redhat.com> wrote:
> On Thu, Jul 7, 2016 at 2:01 AM, Goncalo Borges
> <goncalo.bor...@sydney.edu.au> wrote:
>> Unfortunately, the other user application breaks ceph-fuse again (it is a
>> completely different application than in my previous test).
>>
>> We have tested it on 4 machines with 4 cores. The user is submitting 16
>> single core jobs which are all writing different output files (one per job)
>> to a common dir in cephfs. The first 4 jobs run happily and never break
>> ceph-fuse. But the remaining 12 jobs, running in the remaining 3 machines,
>> trigger a segmentation fault, which is completely different from the other
>> case.
>>
>> ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
>> 1: (()+0x297fe2) [0x7f54402b7fe2]
>> 2: (()+0xf7e0) [0x7f543ecf77e0]
>> 3: (ObjectCacher::bh_write_scattered(std::list<ObjectCacher::BufferHead*,
>> std::allocator<ObjectCacher::BufferHead*> >&)+0x36) [0x7f5440268086]
>> 4: (ObjectCacher::bh_write_adjacencies(ObjectCacher::BufferHead*,
>> std::chrono::time_point<ceph::time_detail::real_clock,
>> std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >, long*,
>> int*)+0x22c) [0x7f5440268a3c]
>> 5: (ObjectCacher::flush(long)+0x1ef) [0x7f5440268cef]
>> 6: (ObjectCacher::flusher_entry()+0xac4) [0x7f5440269a34]
>> 7: (ObjectCacher::FlusherThread::entry()+0xd) [0x7f5440275c6d]
>> 8: (()+0x7aa1) [0x7f543ecefaa1]
>> 9: (clone()+0x6d) [0x7f543df6893d]
>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
>> interpret this.
>
> This one looks like a very different problem. I've created an issue
> here: http://tracker.ceph.com/issues/16610
>
> Thanks for the report and debug log!
>
> --
> Patrick Donnelly
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
