Re: [bareos-users] Re: Backup of clients in different network with long run before scripts terminate with connection reset

2017-08-01 Thread Cristian Mammoli
On Wednesday, 19 July 2017 at 11:15:25 UTC+2, Cristian Mammoli wrote:
> Of course it is C:/Test.bat, not C:\Test.bat
> 
>Run Script {
>  Command = "C:/Test.bat"
>  Runs When = after
>  Fail Job On Error = No
>}
> 

OK, the test.bat script ran without a hitch for 2 weeks. Now I'll try putting my 
command inside the bat script.

-- 
You received this message because you are subscribed to the Google Groups 
"bareos-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to bareos-users+unsubscr...@googlegroups.com.
To post to this group, send email to bareos-users@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [bareos-users] Re: Backup of clients in different network with long run before scripts terminate with connection reset

2017-07-19 Thread Cristian Mammoli

Of course it is C:/Test.bat, not C:\Test.bat

  Run Script {
    Command = "C:/Test.bat"
    Runs When = after
    Fail Job On Error = No
  }
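
Why the forward slash matters can be shown with a minimal Python sketch. It assumes the configuration parser treats a backslash inside a quoted string as an escape that keeps the following character literally, which is consistent with the doubled backslashes in the `del /Q` example elsewhere in this thread. This is a simplified model, not the actual Bareos parser, and `unescape_quoted` is a hypothetical helper:

```python
def unescape_quoted(raw: str) -> str:
    """Simplified model: a backslash makes the next character literal."""
    out = []
    chars = iter(raw)
    for ch in chars:
        if ch == "\\":
            # Drop the backslash, keep whatever follows it (if anything).
            out.append(next(chars, ""))
        else:
            out.append(ch)
    return "".join(out)


# Text between the quotes of each Command directive, as typed in the config.
print(unescape_quoted(r"C:\Test.bat"))   # the single backslash is consumed
print(unescape_quoted(r"C:\\Test.bat"))  # a doubled backslash survives as one
print(unescape_quoted("C:/Test.bat"))    # a forward slash needs no escaping
```

Under that assumption the single-backslash form loses its path separator, while the forward-slash form passes through unchanged.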


On 19/07/2017 10:33, Bruno Friedmann wrote:

> On Tuesday, 18 July 2017 15:34:13 CEST, Cristian Mammoli wrote:
>
>> It works *most of the time* the way it is. It's not easy to reproduce
>> it. I can test with a simple batch but I don't think an escaping problem
>> can cause a "connection reset" once every 10 backups (just saying) :-)
>
> Yeah right, the root cause is still really annoying (especially because it
> is so hard to reproduce).
>
> Just to rule out another possible cause: I guess you're using fixed IPs, and
> not DHCP, where the problem occurs? Then we can exclude problems caused by a
> DHCP lease renewal.
>
> I also saw this kind of trouble last weekend with one 15.2.4 Windows 2012
> client (Hyper-V guest):

> 15-Jul 20:00 europe-fd JobId 10487: Generate VSS snapshots. Driver="Win64 VSS", Drive(s)="CD"
> 15-Jul 20:00 europe-fd JobId 10487: VolumeMountpoints are not processed as onefs = yes.
> 15-Jul 20:00 europe-fd JobId 10487: VolumeMountpoints are not processed as onefs = yes.
> 15-Jul 21:57 oceania-sd JobId 10487: User specified Job spool size reached: JobSpoolSize=68,719,477,111 MaxJobSpoolSize=68,719,476,736
> 15-Jul 21:57 oceania-sd JobId 10487: Writing spooled data to Volume. Despooling 68,719,477,111 bytes ...
> 15-Jul 21:59 oceania-dir JobId 10487: Fatal error: Network error with FD during Backup: ERR=Connection timed out
> 15-Jul 21:59 oceania-dir JobId 10487: Error: Director's comm line to SD dropped.
> 15-Jul 21:59 oceania-dir JobId 10487: Fatal error: No Job status returned from FD.
> 15-Jul 21:59 oceania-dir JobId 10487: Error: Bareos oceania-dir 15.2.4 (09Jun16):

> No idea why the despooling doesn't take place, or why the network error occurs.
> I suspect network-related trouble, as the Ethernet error counters are increasing
> more than expected (which could also point to a switch failure).
> The ports have to be monitored to check whether this is the case.





Re: [bareos-users] Re: Backup of clients in different network with long run before scripts terminate with connection reset

2017-07-19 Thread Cristian Mammoli



On 19/07/2017 10:33, Bruno Friedmann wrote:

> Yeah right, the root cause is still really annoying (especially because it
> is so hard to reproduce).
>
> Just to rule out another possible cause: I guess you're using fixed IPs, and
> not DHCP, where the problem occurs? Then we can exclude problems caused by a
> DHCP lease renewal.

I configured a run-after job this way:

  Run Script {
    Command = "C:\Test.bat"
    Runs When = after
    Fail Job On Error = No
  }

And test.bat only has "echo hello world" in it.

Let's see how it goes; at the moment this is the only run-after job left on the 
problematic servers.



> I also saw this kind of trouble last weekend with one 15.2.4 Windows 2012
> client (Hyper-V guest)
>
> No idea why the despooling doesn't take place, or why the network error occurs.
> I suspect network-related trouble, as the Ethernet error counters are increasing
> more than expected (which could also point to a switch failure).
> The ports have to be monitored to check whether this is the case.


Sadly my environment is a remote "virtual datacenter" running on vSphere 
and I don't have access to the underlying network. The Linux VMs, 
however, are running fine, and I have already tried to rule out all the 
possible causes I could think of:

* Windows Firewall
* KeepAlive settings
* VMware Tools (and NIC driver) versions

I could try replacing the vmxnet3 vNIC with e1000, but I already know 
e1000 is even more unstable on Windows 2012 and later.





Re: [bareos-users] Re: Backup of clients in different network with long run before scripts terminate with connection reset

2017-07-18 Thread Cristian Mammoli
It works *most of the time* the way it is. It's not easy to reproduce 
it. I can test with a simple batch file, but I don't think an escaping problem 
can cause a "connection reset" once every 10 backups (just saying) :-)


On 18/07/2017 14:37, Bruno Friedmann wrote:

> On Tuesday, 18 July 2017 09:18:28 CEST, Cristian Mammoli wrote:
>
>> That was just an example script; for the record, it does work (I create a
>> systemstate backup before the backup and delete it afterwards).
>>
>> Anyway, I get "connection reset by peer" even with a simple script such as:
>>
>> Run Script {
>>   Command = "del /Q \"C:\\Program Files\\MySQL\\MySQL Server 5.7\\Backup\\MySQL.bak\""
>>   Runs When = after
>>   Fail Job On Error = No
>> }
>
> Considering the documentation (especially the Windows considerations):
> http://doc.bareos.org/master/html/bareos-manual-main-reference.html#directiveDirJobRun%20Script
> and, from experience, how fiddly it is to escape and pass a command line
> correctly: what happens if you wrap this in a simple .bat file?
>
> Command = "c:/testme.bat"
>
> Does it work?




On 17/07/2017 18:18, Bruno Friedmann wrote:

> It seems you are asking Windows to delete the running VSS state. I don't know
> whether that can work?





Re: [bareos-users] Re: Backup of clients in different network with long run before scripts terminate with connection reset

2017-07-18 Thread Cristian Mammoli
That was just an example script; for the record, it does work (I create a 
systemstate backup before the backup and delete it afterwards).


Anyway, I get "connection reset by peer" even with a simple script such as:

  Run Script {
    Command = "del /Q \"C:\\Program Files\\MySQL\\MySQL Server 5.7\\Backup\\MySQL.bak\""
    Runs When = after
    Fail Job On Error = No
  }


On 17/07/2017 18:18, Bruno Friedmann wrote:

> It seems you are asking Windows to delete the running VSS state. I don't know
> whether that can work?






Re: [bareos-users] Re: Backup of clients in different network with long run before scripts terminate with connection reset

2017-07-17 Thread Cristian Mammoli

> 05-Jul 21:48 srvbkp-dir JobId 54311: Fatal error: Network error with FD 
> during Backup: ERR=Connection reset by peer
> 05-Jul 21:48 srvbkp-dir JobId 54311: Error: Bareos srvbkp-dir 16.2.4 
> (01Jul16):

I can confirm that the issue is with Client Run After Job scripts such as:

  Run Script {
    Command = "wbadmin delete systemstatebackup -keepversions:0 -quiet"
    Runs When = after
    Fail Job On Error = No
  }

I commented out all scripts like this one and have had no issues so far.
Obviously this is not a solution...

I'm pretty sure this has nothing to do with routers since I noticed it happens 
even in the same network.

So to recap, these are the conditions:

* It doesn't happen with Linux clients, only with Windows ones (2008R2 to 2012R2 tested)
* Switching Windows Firewall on or off doesn't matter
* Setting a Heartbeat Interval does not help
* It happens with "normal" mode, passive clients, and client-initiated connections
* It happens even if server and client are in the same network
* It only happens if there is a "Client Run After Job" script
* I tried updating VMware Tools and NIC drivers (no change)



Re: [bareos-users] Re: Backup of clients in different network with long run before scripts terminate with connection reset

2017-07-06 Thread Cristian Mammoli
I checked the logs and the connection does not reset *before* the client starts 
sending data to the SD but *after* the run after job script!
The backup actually "succeeds":

05-Jul 21:48 css-srvdc02-fd JobId 54311: ClientAfterJob: Deleting system state backup version 07/05/2017-18:42 (1 out of 1)...
05-Jul 21:48 srvbkp-sd JobId 54311: Sending spooled attrs to the Director. Despooling 63,976 bytes ...
05-Jul 21:48 css-srvdc02-fd JobId 54311: ClientAfterJob: The operation to delete system state backups completed,
05-Jul 21:48 css-srvdc02-fd JobId 54311: ClientAfterJob: 1 backups were deleted.
05-Jul 21:48 srvbkp-dir JobId 54311: Fatal error: Network error with FD during Backup: ERR=Connection reset by peer
05-Jul 21:48 srvbkp-dir JobId 54311: Error: Bareos srvbkp-dir 16.2.4 (01Jul16):
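
To check this ordering across many job logs, the sequence can be tested mechanically. Below is a rough Python sketch, assuming each log entry has first been joined back onto a single line in the `DD-Mon HH:MM daemon JobId N: message` layout seen above. `LINE_RE` and `error_follows_after_script` are hypothetical names, not part of Bareos:

```python
import re

# One entry per line: "<day>-<Mon> <HH:MM> <daemon> JobId <n>: <message>"
LINE_RE = re.compile(
    r"^(?P<date>\d{2}-\w{3}) (?P<time>\d{2}:\d{2}) "
    r"(?P<daemon>\S+) JobId (?P<jobid>\d+): (?P<msg>.*)$"
)


def error_follows_after_script(lines):
    """Return True if a 'Fatal error' entry appears after ClientAfterJob output."""
    seen_after_script = False
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip blank or unparseable lines
        msg = m.group("msg")
        if msg.startswith("ClientAfterJob:"):
            seen_after_script = True
        elif msg.startswith("Fatal error:") and seen_after_script:
            return True
    return False


# Two entries taken from the log above, each on one line.
sample = [
    "05-Jul 21:48 css-srvdc02-fd JobId 54311: ClientAfterJob: 1 backups were deleted.",
    "05-Jul 21:48 srvbkp-dir JobId 54311: Fatal error: Network error with FD "
    "during Backup: ERR=Connection reset by peer",
]
print(error_follows_after_script(sample))
```

Since the timestamps only have minute resolution, the sketch relies on the order of the lines in the log rather than on the time field.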



Re: [bareos-users] Re: Backup of clients in different network with long run before scripts terminate with connection reset

2017-07-06 Thread Cristian Mammoli
On Thursday, 6 July 2017 at 13:05:02 UTC+2, Bruno Friedmann wrote:
> Did you try setting the Heartbeat Interval option in the dir, sd and client? You are 
> certainly facing a firewall or router somewhere that drops what it considers 
> an empty, dead connection.

Yes, I already added:

  Heartbeat Interval = 60

in director, client and sd.conf

I even manually configured 

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
"KeepAliveInterval"=dword:03e8
"KeepAliveTime"=dword:ea60

in Windows registry.
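
For reference, the two registry values above are hexadecimal DWORDs, and Windows interprets both parameters in milliseconds. A small Python sketch (the dictionary and function names are only illustrative) converts them to seconds:

```python
# Hex DWORD values exactly as written in the registry snippet above.
registry_values = {
    "KeepAliveInterval": "03e8",
    "KeepAliveTime": "ea60",
}


def dword_to_seconds(hex_dword: str) -> float:
    """Interpret a hex DWORD as milliseconds and return seconds."""
    return int(hex_dword, 16) / 1000


keepalive_seconds = {
    name: dword_to_seconds(value) for name, value in registry_values.items()
}

for name, seconds in keepalive_seconds.items():
    print(f"{name}: {seconds:g} s")
```

So the first keepalive probe goes out after 60 s of idle time, the same figure as the Heartbeat Interval, and unanswered probes are repeated 1 s apart.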

I am currently running a backup while sniffing the traffic with Wireshark, and 
the keepalives seem to be exchanged:

tcp        0      0 10.254.99.100:34562     10.254.96.1:9102        ESTABLISHED keepalive (54.00/0/0)
tcp        0      0 10.254.99.100:34560     10.254.96.1:9102        ESTABLISHED keepalive (43.76/0/0)


453 1020.122463 10.254.96.1   10.254.99.100 TCP 55 [TCP Keep-Alive] 9102 → 34562 [ACK] Seq=106 Ack=185 Win=131584 Len=1
454 1020.123393 10.254.99.100 10.254.96.1   TCP 78 [TCP Keep-Alive ACK] 34562 → 9102 [ACK] Seq=185 Ack=107 Win=29312 Len=0 TSval=186109820 TSecr=94118 SLE=106 SRE=107

But every now and then the backup fails.
The connection is always reset at the end of the "run before job" script, when 
the client should start sending data to the SD (SD and Director run on the same 
server).



[bareos-users] Re: Backup of clients in different network with long run before scripts terminate with connection reset

2017-07-06 Thread Cristian Mammoli
On Thursday, 26 January 2017 at 10:03:58 UTC+1, Cristian Mammoli wrote:
> I forgot to save the file before attaching


Anyone???



[bareos-users] Re: Backup of clients in different network with long run before scripts terminate with connection reset

2017-01-26 Thread Cristian Mammoli
I forgot to save the file before attaching



25-Jan 20:40 srvbkp-dir JobId 24217: Start Backup JobId 24217, Job=css-srvdc02-backup.2017-01-25_20.30.01_53
25-Jan 20:40 srvbkp-dir JobId 24217: Using Device "File" to write.
25-Jan 20:40 css-srvdc02-fd JobId 24217: Created 32 wildcard excludes from FilesNotToBackup Registry key
25-Jan 20:40 css-srvdc02-fd JobId 24217: shell command: run ClientBeforeJob "REG ADD HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\wbengine\SystemStateBackup\ /v AllowSSBToAnyVolume /t REG_DWORD /d 1 /F"
25-Jan 20:40 css-srvdc02-fd JobId 24217: ClientBeforeJob: The operation completed successfully.
25-Jan 20:40 css-srvdc02-fd JobId 24217: ClientBeforeJob: 
25-Jan 20:40 css-srvdc02-fd JobId 24217: shell command: run ClientBeforeJob "wbadmin start systemstatebackup -backuptarget:C: -quiet"
25-Jan 20:40 css-srvdc02-fd JobId 24217: ClientBeforeJob: wbadmin 1.0 - Backup command-line tool
25-Jan 20:40 css-srvdc02-fd JobId 24217: ClientBeforeJob: (C) Copyright 2013 Microsoft Corporation. All rights reserved.
25-Jan 20:40 css-srvdc02-fd JobId 24217: ClientBeforeJob: 
25-Jan 20:40 css-srvdc02-fd JobId 24217: ClientBeforeJob: Starting to back up the system state [25/01/2017 20:40]...
25-Jan 20:40 css-srvdc02-fd JobId 24217: ClientBeforeJob: Retrieving volume information...
25-Jan 20:40 css-srvdc02-fd JobId 24217: ClientBeforeJob: This will back up the system state from volume(s) System Reserved (350.00 MB),(C:) to C:.
25-Jan 20:40 css-srvdc02-fd JobId 24217: ClientBeforeJob: Creating a shadow copy of the volumes specified for backup...
25-Jan 20:40 css-srvdc02-fd JobId 24217: ClientBeforeJob: Creating a shadow copy of the volumes specified for backup...
25-Jan 20:41 css-srvdc02-fd JobId 24217: ClientBeforeJob: Please wait while system state files to back up are identified.
25-Jan 20:41 css-srvdc02-fd JobId 24217: ClientBeforeJob: This might take several minutes...
25-Jan 20:41 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (800) files.
25-Jan 20:41 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (6302) files.
25-Jan 20:41 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (20986) files.
25-Jan 20:41 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (30911) files.
25-Jan 20:41 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (34322) files.
25-Jan 20:42 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (46552) files.
25-Jan 20:42 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (54284) files.
25-Jan 20:42 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (58286) files.
25-Jan 20:42 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (60655) files.
25-Jan 20:42 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (62403) files.
25-Jan 20:42 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (70164) files.
25-Jan 20:43 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (76541) files.
25-Jan 20:43 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (84063) files.
25-Jan 20:43 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (90136) files.
25-Jan 20:43 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (95040) files.
25-Jan 20:43 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (103972) files.
25-Jan 20:43 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (119617) files.
25-Jan 20:44 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (132625) files.
25-Jan 20:44 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (138834) files.
25-Jan 20:44 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (142878) files.
25-Jan 20:44 css-srvdc02-fd JobId 24217: ClientBeforeJob: Found (142878) files.
25-Jan 20:44 css-srvdc02-fd JobId 24217: ClientBeforeJob: The search for system state files is complete.
25-Jan 20:44 css-srvdc02-fd JobId 24217: ClientBeforeJob: Starting to back up files...
25-Jan 20:44 css-srvdc02-fd JobId 24217: ClientBeforeJob: The backup of files reported by 'Task Scheduler Writer' is complete.
25-Jan 20:44 css-srvdc02-fd JobId 24217: ClientBeforeJob: The backup of files reported by 'VSS Metadata Store Writer' is complete.
25-Jan 20:44 css-srvdc02-fd JobId 24217: ClientBeforeJob: The backup of files reported by 'Performance Counters Writer' is complete.
25-Jan 20:44 css-srvdc02-fd JobId 24217: ClientBeforeJob: Overall progress: 0%.
25-Jan 20:44 css-srvdc02-fd JobId 24217: ClientBeforeJob: Currently backing up files reported by 'Registry Writer'...
25-Jan 20:44 css-srvdc02-fd JobId 24217: ClientBeforeJob: The backup of files reported by 'Registry Writer' is complete.
25-Jan 20:44 css-srvdc02-fd JobId 24217: ClientBeforeJob: Overall progress: 2%.
25-Jan 20:44 css-srvdc02-fd JobId 24217: ClientBeforeJob: Cur

[bareos-users] Backup of clients in different network with long run before scripts terminate with connection reset

2017-01-26 Thread Cristian Mammoli
Hi, we are having issues backing up a couple of Windows servers behind a 
firewall.
There are lots of servers in the same network, but these two are the only ones with 
long (20-30 min) run-before scripts.

Every now and then (roughly 1 in every 5 backups) the backup ends with "connection 
reset by peer":

25-Jan 21:19 srvbkp-sd JobId 24217: Sending spooled attrs to the Director. Despooling 79,898 bytes ...
25-Jan 21:19 css-srvdc02-fd JobId 24217: ClientAfterJob: The operation to delete system state backups completed,
25-Jan 21:19 css-srvdc02-fd JobId 24217: ClientAfterJob: 1 backups were deleted.
25-Jan 21:19 srvbkp-dir JobId 24217: Fatal error: Network error with FD during Backup: ERR=Connection reset by peer
25-Jan 21:19 srvbkp-dir JobId 24217: Fatal error: No Job status returned from FD.
25-Jan 21:19 srvbkp-dir JobId 24217: Error: Bareos srvbkp-dir 16.2.4 (01Jul16):

The full log is attached

I already tried adding "Heartbeat Interval = 60" to the server, client and 
storage configuration. 
Then I tried lowering the keepalive time on both the director and the Windows 
client, as described here: http://wiki.bacula.org/doku.php?id=faq

More info:

* Director and Storage daemon run on the same server
* Everything is version 16.2.4
* It doesn't happen with Linux clients
* Windows Firewall on the affected server is on, but there is an exception for Bareos
* It happens with "normal" mode, passive clients, and client-initiated connections as well
* I'm using SpoolAttributes = yes

Thanks

Cristian
