Re: error allocating core memory buffers (code 22) at util2.c(106) [sender=3.1.2]
> But at first blush, it appeared that adding - made things hang
> forever.

Yes. Confirmed against the git HEAD (9e7b8ab7cf66ecd152002926a7da61d8ad862522).

Running:

    rsync -n -iaHJAX $d $b

Does some initial work and then gets to:

    bash-3.2# lldb -p 6458
    (lldb) process attach --pid 6458
    Process 6458 stopped
    * thread #1: tid = 0x1d5ff0, 0x7fff8c2d83fa libsystem_kernel.dylib`__select + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
        frame #0: 0x7fff8c2d83fa libsystem_kernel.dylib`__select + 10
    libsystem_kernel.dylib`__select:
    ->  0x7fff8c2d83fa <+10>: jae    0x7fff8c2d8404 ; <+20>
        0x7fff8c2d83fc <+12>: movq   %rax, %rdi
        0x7fff8c2d83ff <+15>: jmp    0x7fff8c2d3c78 ; cerror
        0x7fff8c2d8404 <+20>: retq

    Executable module set to "/usr/local/bin/rsync".
    Architecture set to: x86_64h-apple-macosx.
    (lldb) bt
    * thread #1: tid = 0x1d5ff0, 0x7fff8c2d83fa libsystem_kernel.dylib`__select + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
      * frame #0: 0x7fff8c2d83fa libsystem_kernel.dylib`__select + 10
        frame #1: 0x00010dd6aedc rsync`perform_io(needed=47, flags=4) + 2924 at io.c:742
        frame #2: 0x00010dd6a0bd rsync`send_msg(code=MSG_INFO, buf="[receiver] receiving flist for dir 1319\n", len=40, convert=0) + 349 at io.c:958
        frame #3: 0x00010dd5c765 rsync`rwrite(code=FINFO, buf="[receiver] receiving flist for dir 1319\n", len=40, is_utf8=0) + 469 at log.c:279
        frame #4: 0x00010dd5d3f3 rsync`rprintf(code=FINFO, format="[%s] receiving flist for dir %d\n") + 627 at log.c:435
        frame #5: 0x00010dd394a6 rsync`read_ndx_and_attrs(f_in=0, f_out=4, iflag_ptr=0x7fff51ed0f64, type_ptr="\x80", buf="", len_ptr=0x7fff51ed0f60) + 646 at rsync.c:362
        frame #6: 0x00010dd4483b rsync`recv_files(f_in=0, f_out=4, local_name=0x) + 379 at receiver.c:542
        frame #7: 0x00010dd566c3 rsync`do_recv(f_in=0, f_out=4, local_name=0x) + 899 at main.c:909
        frame #8: 0x00010dd54ac7 rsync`do_server_recv(f_in=0, f_out=1, argc=1, argv=0x7fff51ed2328) + 1207 at main.c:1078
        frame #9: 0x00010dd5431e rsync`start_server(f_in=0, f_out=1, argc=2, argv=0x7fff51ed2320) + 286 at main.c:1112
        frame #10: 0x00010dd541ee rsync`child_main(argc=2, argv=0x7fff51ed2320) + 46 at main.c:1085
        frame #11: 0x00010dd852ad rsync`local_child(argc=2, argv=0x7fff51ed2320, f_in=0x7fff51ed4328, f_out=0x7fff51ed4324, child_main=(rsync`child_main at main.c:1084)) + 701 at pipe.c:167
        frame #12: 0x00010dd58e42 rsync`do_cmd(cmd=0x, machine=0x, user=0x, remote_argv=0x7fee40c04a50, remote_argc=0, f_in_p=0x7fff51ed4328, f_out_p=0x7fff51ed4324) + 2210 at main.c:543
        frame #13: 0x00010dd57fd2 rsync`start_client(argc=1, argv=0x7fee40c04a40) + 2402 at main.c:1414
        frame #14: 0x00010dd5747b rsync`main(argc=2, argv=0x7fee40c04a40) + 2555 at main.c:1652
        frame #15: 0x7fff96cad5c9 libdyld.dylib`start + 1
        frame #16: 0x7fff96cad5c9 libdyld.dylib`start + 1
    (lldb) up
    frame #1: 0x00010dd6aedc rsync`perform_io(needed=47, flags=4) + 2924 at io.c:742
       739          tv.tv_sec = select_timeout;
       740          tv.tv_usec = 0;
       741
    -> 742          cnt = select(max_fd + 1, _fds, _fds, _fds, );
       743
       744          if (cnt <= 0) {
       745              if (cnt < 0 && errno == EBADF) {
    (lldb) fin

And then it never returns from that stack frame; it seems to hang out in
select() forever (unless interrupted by a breakpoint or whatnot).

Bugzilla account request submitted.

--jh...@mit.edu
  John Hawkinson

--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
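The backtrace in this message ends in select(), called from perform_io() with a timeout loaded into tv. As a minimal sketch of that general pattern, and not rsync's actual code, the helper below waits on a single descriptor with a timeout. wait_readable() and its parameters are hypothetical names chosen for illustration. select() returns 0 when the timeout expires with nothing ready, so a caller that keeps retrying on 0 would sit in the kernel using no CPU for as long as the other end stays silent.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper (not taken from rsync): wait up to timeout_ms
 * milliseconds for fd to become readable.
 * Returns select()'s result: 1 if fd is readable, 0 if the timeout
 * expired with nothing ready, -1 on error (errno is set). */
int wait_readable(int fd, int timeout_ms)
{
    fd_set r_fds;
    struct timeval tv;

    /* FD_SET is only defined for descriptors below FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&r_fds);
    FD_SET(fd, &r_fds);

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* Only the read set is watched; write and except sets are unused. */
    return select(fd + 1, &r_fds, NULL, NULL, &tv);
}
```

On an empty pipe, wait_readable(fd, 50) comes back with 0 once the timeout expires; after a byte is written to the other end it returns 1.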
Re: error allocating core memory buffers (code 22) at util2.c(106) [sender=3.1.2]
Sorry to keep replying to myself:

> Because this is a Time Machine backup, and there were 66 snapshots of a
> 1 TB disk consuming about 1.5 TB, there were a *lot* of hard links. Many
> of directories rather than individual files, so it's a little

Err, whoops? No, I was tired and confused. They are not hard links to
directories, that would screw up the universe. Still, lots of hard linked
files.

> Do I need to run this under lldb and set a breakpoint in expand_item_list()?
> Quick inspection suggests running with - might give some useful output:

Err... the result of this was that it processed a few files for a minute or
so and then hung in select(), consuming no CPU and with no disk activity.

Unfortunately, my clang/lldb workflow was apparently broken and I didn't have
functional debugging symbols (...), and I also lost the stack trace I thought
I had (inadequate scrollback), so I'm not sure what was going on.

But at first blush, it appeared that adding - made things hang forever.
Removing it and rerunning, it's now happily trucking along and has been
actually doing work for the past hour.

--jh...@mit.edu
  John Hawkinson
Re: error allocating core memory buffers (code 22) at util2.c(106) [sender=3.1.2]
> I guess I can turn on core dumps and increase (unlimit completely) the
> stack size...
>
> Although it doesn't seem to have segfaulted, so I'm not sure having
> core dumps enabled would have helped?

Indeed. I reran it on the 50 remaining individual directories, it seems to
have made it through 10 before failing again overnight:

    66596 20:38:03 rsync -vP -iaHJAX `cat /tmp/unback` /Volumes/platinum-barratry/x/Backups.backupdb/pb3/
    ...
    rsync: unpack_smb_acl: sys_acl_get_info(): Undefined error: 0 (0)
    rsync: unpack_smb_acl: sys_acl_get_info(): Undefined error: 0 (0)
    ...
    .fog.a.. 2015-06-19-072520/platinum-bar2/Users/jhawk/.emacs.d/auto-save-list/.saves-1541-platinum-bar2.local~
    ERROR: out of memory in expand_item_list [sender]
    rsync error: error allocating core memory buffers (code 22) at util2.c(106) [sender=3.1.2]
    bash-3.2#

No segfault, no coredump, no syslogs.

Do I need to run this under lldb and set a breakpoint in expand_item_list()?
Quick inspection suggests running with - might give some useful output:

    if (DEBUG_GTE(FLIST, 3)) {
        rprintf(FINFO, "[%s] expand %s to %s bytes, did%s move\n",
            who_am_i(), desc, big_num(new_size * item_size),
            new_ptr == lp->items ? " not" : "");
    }
    if (!new_ptr)
        out_of_memory("expand_item_list");

Although I imagine that output might be voluminous [but maybe not]?

Again, I don't have time to build test cases and reproduce this carefully,
every run is painful and long and slow. But I'd like to do the responsible
thing if someone can tell me what that is.

> p.s.: If I had to start over, I would have spent less time just deleting
> the data and recopying it, rather than trying to fixup the metadata and

Indeed, it's looking like fixing the metadata with rsync is an order of
magnitude slower, even as far as I've gotten. So maybe it's time to find
another method. I don't think fts(3) is optimized any better for large
hardlink farms, so I think maybe I need a homegrown solution? Ug.
--jh...@mit.edu
  John Hawkinson
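The excerpt quoted in this message shows the error being raised when the pointer returned by the reallocation is NULL. As a rough illustration of how a growable list of that kind can fail, here is a minimal sketch. It is not rsync's implementation: struct item_list, item_list_reserve(), the doubling policy and the initial size of 32 are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable array; field names are illustrative only. */
struct item_list {
    void *items;      /* backing storage, NULL when empty */
    size_t count;     /* slots in use */
    size_t capacity;  /* slots allocated */
};

/* Make sure there is room for at least one more item of item_size bytes.
 * Returns 0 on success. Returns -1 if the requested size would overflow
 * size_t or if realloc() fails; the list is left unchanged in that case. */
int item_list_reserve(struct item_list *lp, size_t item_size)
{
    size_t new_cap;
    void *new_ptr;

    assert(lp != NULL);

    if (lp->count < lp->capacity)
        return 0;                       /* still room, nothing to do */

    /* Assumed policy: start at 32 slots, then double. */
    new_cap = lp->capacity ? lp->capacity * 2 : 32;

    /* Reject wrapped capacities and byte counts that overflow size_t. */
    if (item_size == 0 || new_cap < lp->capacity
        || new_cap > SIZE_MAX / item_size)
        return -1;

    new_ptr = realloc(lp->items, new_cap * item_size);
    if (!new_ptr)
        return -1;                      /* allocator refused the request */

    lp->items = new_ptr;
    lp->capacity = new_cap;
    return 0;
}
```

In a sketch like this the failure is an ordinary error return rather than a crash, which would be consistent with seeing an error exit but no segfault and no core dump.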