Bernhard,

Thanks for reporting this new issue.
I committed the attached patch

Jean-Louis

On 04/06/17 07:45 AM, Bernhard Erdmann wrote:
Hello,

after an upgrade to Amanda-3.4.4, I discovered that amrecover eats too much memory. Trying to recover a single directory from a huge GNU tar image (900 GB), amrecover gets killed because of memory constraints.

System is running CentOS 5.11 x86_64 using 8 GB RAM and 24 GB Swap available for several years using Amanda to backup to a 90 slot vTape iSCSI-attached.

Downgrade to Amanda-3.4.3 helped. The same recover procedure (same backup date, same directory to recover) works well using this version.

Last lines of amidxtaped.20170603173611.debug (timezone is CEST (GMT+2)):

Sat Jun 03 17:13:55.770129000 2017: pid 720: thd-0x1f6147f0: amidxtaped: /opt/amanda/lib/amanda/perl/Amanda/Restore.pm:1699:info:4900000 198642624 kb Sat Jun 03 17:14:20.073570000 2017: pid 720: thd-0x1f6147f0: amidxtaped: /opt/amanda/lib/amanda/perl/Amanda/Restore.pm:1699:info:4900000 198645984 kb Sat Jun 03 17:15:14.519889000 2017: pid 720: thd-0x1f6147f0: amidxtaped: /opt/amanda/lib/amanda/perl/Amanda/Restore.pm:1699:info:4900000 198646752 kb Sat Jun 03 17:21:39.862161000 2017: pid 720: thd-0x1f6147f0: amidxtaped: /opt/amanda/lib/amanda/perl/Amanda/Restore.pm:1699:info:4900000 198647680 kb Sat Jun 03 17:21:40.388673000 2017: pid 720: thd-0x20d85690: amidxtaped: xfer_cancel_with_error: Error writing to fd 8: Broken pipe Sat Jun 03 17:21:40.388729000 2017: pid 720: thd-0x20d85690: amidxtaped: xfer_queue_message: MSG: <XMsg@0x2b2e0c0008c0 type=XMSG_ERROR elt=<XferElemen
tGlue@0x20d86020> version=0>
Sat Jun 03 17:21:40.388900000 2017: pid 720: thd-0x20d85690: amidxtaped: xfer_queue_message: MSG: <XMsg@0x2b2e0c009ab0 type=XMSG_CANCEL elt=<XferSourceRecovery@0x20a3d000> version=0> Sat Jun 03 17:21:40.845451000 2017: pid 720: thd-0x1f6147f0: amidxtaped: Cancelling <Xfer@0x20d849c0 (<XferSourceRecovery@0x20a3d000> -> <XferDestFd@0x20d83010>)> Sat Jun 03 17:21:40.922716000 2017: pid 720: thd-0x20d85690: amidxtaped: sending XMSG_CRC message 0x20d83010 Sat Jun 03 17:21:40.922777000 2017: pid 720: thd-0x20d85690: amidxtaped: pull_and_write CRC: 1b42fe0e size 203415191552 Sat Jun 03 17:21:40.922793000 2017: pid 720: thd-0x20d85690: amidxtaped: xfer_queue_message: MSG: <XMsg@0x2b2e0c009a40 type=XMSG_CRC elt=<XferDestFd@0x20d83010> version=0> Sat Jun 03 17:21:40.922818000 2017: pid 720: thd-0x20d85690: amidxtaped: xfer_queue_message: MSG: <XMsg@0x2b2e0c001ae0 type=XMSG_DONE elt=<XferElementGlue@0x20d86020> version=0> Sat Jun 03 17:21:41.313146000 2017: pid 720: thd-0x1f6147f0: amidxtaped: dest_crc: 1b42fe0e:203415191552 Sat Jun 03 17:21:41.674875000 2017: pid 720: thd-0x1f6147f0: amidxtaped: /opt/amanda/lib/amanda/perl/Amanda/Restore.pm:1893:info:4900012 198647680 kb Sat Jun 03 17:21:42.462789000 2017: pid 720: thd-0x1f6147f0: amidxtaped: /opt/amanda/lib/amanda/perl/Amanda/Restore.pm:1921:error:4900055 Error writing to fd 8: Broken pipe Sat Jun 03 17:21:42.958237000 2017: pid 720: thd-0x1f6147f0: amidxtaped: /opt/amanda/lib/amanda/perl/Amanda/Restore.pm:2138:error:4900068 Error writing to fd 8: Broken pipe Sat Jun 03 17:21:43.406247000 2017: pid 720: thd-0x1f6147f0: amidxtaped: user_message feedback: Error writing to fd 8: Broken pipe Sat Jun 03 17:21:44.597549000 2017: pid 720: thd-0x1f6147f0: amidxtaped: CTL >> MESSAGE Error writing to fd 8: Broken pipe Sat Jun 03 17:21:46.940293000 2017: pid 720: thd-0x1f6147f0: amidxtaped: exiting with 1 Sat Jun 03 17:21:46.990525000 2017: pid 720: thd-0x1f6147f0: amidxtaped: ru_utime : 352 Sat Jun 03 17:21:46.990566000 2017: pid 720: thd-0x1f6147f0: amidxtaped: ru_stime : 463 Sat Jun 03 17:21:46.990576000 2017: pid 720: thd-0x1f6147f0: amidxtaped: ru_maxrss : 38984 Sat Jun 03 17:21:46.990595000 2017: pid 720: thd-0x1f6147f0: amidxtaped: ru_ixrss : 0 Sat Jun 03 17:21:46.990604000 2017: pid 720: thd-0x1f6147f0: amidxtaped: ru_idrss : 0 Sat Jun 03 17:21:46.990611000 2017: pid 720: thd-0x1f6147f0: amidxtaped: ru_isrss : 0 Sat Jun 03 17:21:46.990619000 2017: pid 720: thd-0x1f6147f0: amidxtaped: ru_minflt : 13614 Sat Jun 03 17:21:46.990627000 2017: pid 720: thd-0x1f6147f0: amidxtaped: ru_majflt : 2372 Sat Jun 03 17:21:46.990634000 2017: pid 720: thd-0x1f6147f0: amidxtaped: ru_nswap : 0 Sat Jun 03 17:21:46.990641000 2017: pid 720: thd-0x1f6147f0: amidxtaped: ru_inblock : 0 Sat Jun 03 17:21:46.990649000 2017: pid 720: thd-0x1f6147f0: amidxtaped: ru_oublock : 0 Sat Jun 03 17:21:46.990656000 2017: pid 720: thd-0x1f6147f0: amidxtaped: ru_msgsnd : 0 Sat Jun 03 17:21:46.990663000 2017: pid 720: thd-0x1f6147f0: amidxtaped: ru_msgrcv : 0 Sat Jun 03 17:21:46.990670000 2017: pid 720: thd-0x1f6147f0: amidxtaped: ru_nsignals: 0 Sat Jun 03 17:21:46.990678000 2017: pid 720: thd-0x1f6147f0: amidxtaped: ru_nvcsw : 1477628 Sat Jun 03 17:21:46.990685000 2017: pid 720: thd-0x1f6147f0: amidxtaped: ru_nivcsw : 4970121 Sat Jun 03 17:21:47.010360000 2017: pid 720: thd-0x1f6147f0: amidxtaped: pid 720 finish time Sat Jun 3 17:21:47 2017

Lines from /var/log/messages:

Jun  3 17:17:42 ente -- MARK --
Jun 3 17:21:36 ente kernel: sshd invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0
Jun  3 17:21:37 ente kernel:
Jun  3 17:21:37 ente kernel: Call Trace:
Jun 3 17:21:37 ente kernel: [<ffffffff800cb6ae>] out_of_memory+0x8e/0x2f3 Jun 3 17:21:37 ente kernel: [<ffffffff8000f698>] __alloc_pages+0x27f/0x308 Jun 3 17:21:37 ente kernel: [<ffffffff80013083>] __do_page_cache_readahead+0x96/0x17b Jun 3 17:21:37 ente kernel: [<ffffffff800139cc>] filemap_nopage+0x14c/0x360 Jun 3 17:21:37 ente kernel: [<ffffffff8000896c>] __handle_mm_fault+0x1fd/0x1037 Jun 3 17:21:37 ente kernel: [<ffffffff800671ae>] do_page_fault+0x499/0x842 Jun 3 17:21:37 ente kernel: [<ffffffff800a3f9a>] autoremove_wake_function+0x0/0x2e Jun 3 17:21:37 ente kernel: [<ffffffff8003abee>] tty_ldisc_deref+0x6a/0x7d
Jun  3 17:21:37 ente kernel:  [<ffffffff8005ddf9>] error_exit+0x0/0x84
Jun  3 17:21:38 ente kernel:
Jun  3 17:21:39 ente kernel: Mem-info:
Jun  3 17:21:40 ente kernel: Node 0 DMA per-cpu:
Jun  3 17:21:41 ente kernel: cpu 0 hot: high 0, batch 1 used:0
Jun  3 17:21:41 ente kernel: cpu 0 cold: high 0, batch 1 used:0
Jun  3 17:21:41 ente kernel: cpu 1 hot: high 0, batch 1 used:0
Jun  3 17:21:41 ente kernel: cpu 1 cold: high 0, batch 1 used:0
Jun  3 17:21:41 ente kernel: Node 0 DMA32 per-cpu:
Jun  3 17:21:41 ente kernel: cpu 0 hot: high 186, batch 31 used:9
Jun  3 17:21:42 ente kernel: cpu 0 cold: high 62, batch 15 used:51
Jun  3 17:21:42 ente kernel: cpu 1 hot: high 186, batch 31 used:18
Jun  3 17:21:42 ente kernel: cpu 1 cold: high 62, batch 15 used:37
Jun  3 17:21:42 ente kernel: Node 0 Normal per-cpu:
Jun  3 17:21:42 ente kernel: cpu 0 hot: high 186, batch 31 used:17
Jun  3 17:21:42 ente kernel: cpu 0 cold: high 62, batch 15 used:14
Jun  3 17:21:42 ente kernel: cpu 1 hot: high 186, batch 31 used:56
Jun  3 17:21:42 ente kernel: cpu 1 cold: high 62, batch 15 used:54
Jun  3 17:21:42 ente kernel: Node 0 HighMem per-cpu: empty
Jun  3 17:21:42 ente kernel: Free pages:       39636kB (0kB HighMem)
Jun 3 17:21:42 ente kernel: Active:945135 inactive:1018975 dirty:0 writeback:0 unstable:0 free:9909 slab:6321 mapped-file:1157 mapped-anon:1963393 pagetables:19050 Jun 3 17:21:42 ente kernel: Node 0 DMA free:10084kB min:12kB low:12kB high:16kB active:0kB inactive:0kB present:9652kB pages_scanned:0 all_unreclaimable? yes
Jun  3 17:21:43 ente kernel: lowmem_reserve[]: 0 3383 7928 7928
Jun 3 17:21:43 ente kernel: Node 0 DMA32 free:23032kB min:4860kB low:6072kB high:7288kB active:1406744kB inactive:1978380kB present:3464992kB pages_scanned:16393700 all_unreclaimable? yes
Jun  3 17:21:43 ente kernel: lowmem_reserve[]: 0 0 4545 4545
Jun 3 17:21:43 ente kernel: Node 0 Normal free:6520kB min:6528kB low:8160kB high:9792kB active:2374184kB inactive:2097132kB present:4654080kB pages_scanned:10006802 all_unreclaimable? yes
Jun  3 17:21:43 ente kernel: lowmem_reserve[]: 0 0 0 0
Jun 3 17:21:43 ente kernel: Node 0 HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Jun  3 17:21:43 ente kernel: lowmem_reserve[]: 0 0 0 0
Jun 3 17:21:43 ente kernel: Node 0 DMA: 5*4kB 4*8kB 5*16kB 5*32kB 3*64kB 5*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 2*4096kB = 10084kB Jun 3 17:21:43 ente kernel: Node 0 DMA32: 32*4kB 1*8kB 1*16kB 19*32kB 8*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 5*4096kB = 23032kB Jun 3 17:21:43 ente kernel: Node 0 Normal: 126*4kB 56*8kB 48*16kB 2*32kB 0*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 1*4096kB = 6520kB
Jun  3 17:21:43 ente kernel: Node 0 HighMem: empty
Jun  3 17:21:43 ente kernel: 7119 pagecache pages
Jun 3 17:21:43 ente kernel: Swap cache: add 19085619, delete 19079797, find 1173083/1206685, race 0+462
Jun  3 17:21:43 ente kernel: Free swap  = 0kB
Jun  3 17:21:43 ente kernel: Total swap = 25165816kB
Jun  3 17:21:43 ente kernel: Free swap:            0kB
Jun  3 17:21:44 ente kernel: 2228224 pages of RAM
Jun  3 17:21:44 ente kernel: 215581 reserved pages
Jun  3 17:21:44 ente kernel: 15624 pages shared
Jun  3 17:21:44 ente kernel: 5822 pages swap cached
Jun 3 17:21:44 ente kernel: Out of memory: Killed process 707, UID 0, (amrecover).

This message is the property of CARBONITE, INC. and may contain confidential or 
privileged information.
If this message has been delivered to you by mistake, then do not copy or 
deliver this message to anyone.  Instead, destroy it and notify me by reply 
e-mail
diff --git a/recover-src/extract_list.c b/recover-src/extract_list.c
index 2f019ae..71efa85 100644
--- a/recover-src/extract_list.c
+++ b/recover-src/extract_list.c
@@ -3178,11 +3178,12 @@ read_amidxtaped_state(
 typedef struct data_cookie_t {
     char    *buf;
     size_t   size;
+    size_t   alloc_size;
     size_t   count;
     event_handle_t *event;
     ctl_data_t     *ctl_data;
 } data_cookie_t;
-data_cookie_t data_cookie;
+data_cookie_t data_cookie = { NULL, 0, 0, 0, NULL, NULL };
 
 static void write_data_to_app(void *);
 
@@ -3328,7 +3329,13 @@ read_amidxtaped_data(
 	if ((count >= 0 && count < size) || (count == -1 && errno == EAGAIN)) {
 		if (count == -1) count = 0;
 		security_stream_pause(amidxtaped_streams[DATAFD].fd);
-		data_cookie.buf = g_malloc(size);
+		if (!data_cookie.buf || data_cookie.alloc_size < (size_t)size) {
+		    if (data_cookie.buf) {
+			g_free(data_cookie.buf);
+		    }
+		    data_cookie.buf = g_malloc(size);
+		    data_cookie.alloc_size = size;
+		}
 		memcpy(data_cookie.buf, buf, size);
 		data_cookie.size = size;
 		data_cookie.count = count;

Reply via email to