Re: [DRBD-user] Testing new DRBD9 dedicated repo for PVE

2017-02-17 Thread Lars Ellenberg
On Fri, Feb 17, 2017 at 11:30:03AM +0100, Roberto Resoli wrote:
> On 17/02/2017 09:51, Roland Kammerer wrote:
> >> does not fix the situation anymore, with the "start" part failing most
> >> of the time.
> > Hi,
> > 
> > yes, there was a regression. Hopefully 0.99.2 fixed that.
> 
> Thanks, I've now updated to 0.99.2 from the repo; much better, most
> resources are up at startup.
> 
> drbdmanage restart keeps failing (sometimes a separate shutdown, wait,
> startup sequence succeeds):
> 
> # drbdmanage restart
> You are going to shut down the drbdmanaged server process on this node.
> Please confirm:
>   yes/no: yes
> Attempting to startup the server through D-Bus activation...
> ERROR:dbus.proxies:Introspect error on :1.7:/interface:
> dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NoReply: Did
> not receive a reply. Possible causes include: the remote application did
> not send a reply, the message bus security policy blocked the reply, the
> reply timeout expired, or the network connection was broken.
> D-Bus connection successful, server is running and reachable
> 
> >> The only way to recover I found is to issue a
> >>
> >> drbdmanage export-res "*"
> >>
> >> and then manually issue a
> >>
> >> drbdadm up <resource>
> >>
> >> for every <resource>
> > You could have done "drbdadm up all" in that case.

just do: drbdadm adjust all

> Tried that, but it was failing, leaving disks in the "Diskless" state
> (maybe because the command finds .drbdctrl already up and raises an
> exception?)
> 
> It would be handy to have something like wildcard support:
> 
>  drbdadm up vm-*
> 
> I implemented somewhat similar behaviour (providing a list of resources to
> bring up/down, defaulting to all but .drbdctrl) in a custom script.

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc me, but send to list -- I'm subscribed
___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] DRBD Trouble (block drbd0: local WRITE IO error sector)

2017-02-17 Thread Lars Ellenberg
On Fri, Feb 03, 2017 at 03:32:39PM +0900, Seiichirou Hiraoka wrote:
> Hello.
> 
> I use DRBD in the following environment.
> 
> OS: Redhat Enterprise Linux 7.1
> Pacemaker: 1.1.12 (CentOS Repository)
> Corosync: 2.3.4 (CentOS Repository)
> DRBD: 8.4.9 (ELRepo)
> # rpm -qi drbd84-utils
> Name: drbd84-utils
> Version : 8.9.2
> Release : 2.el7.elrepo
> Architecture: x86_64
> Vendor: The ELRepo Project (http://elrepo.org)
> # rpm -qi kmod-drbd84
> Name: kmod-drbd84
> Version : 8.4.9
> Release : 1.el7.elrepo
> Architecture: x86_64
> 
> DRBD runs on two servers (server1, server2). The following error
> messages suddenly appeared, and writing to the DRBD area could no
> longer be performed.
> 
> - server1 (master)
> Jan 20 10:41:16 server1 kernel: block drbd0: local WRITE IO error sector 118616936+40 on dm-0
> Jan 20 10:41:16 server1 kernel: block drbd0: disk( UpToDate -> Failed )
> Jan 20 10:41:16 server1 kernel: block drbd0: Local IO failed in __req_mod. Detaching...
> Jan 20 10:41:16 server1 kernel: block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
> Jan 20 10:41:16 server1 kernel: block drbd0: disk( Failed -> Diskless )
> Jan 20 10:41:16 server1 kernel: block drbd0: Got NegDReply; Sector 117512416s, len 4096.
> Jan 20 10:41:16 server1 kernel: drbd0: WRITE SAME failed. Manually zeroing.


That ^^ is the relevant hint.

VMware "virtual" disks seem to love to pretend to be able to do WRITE SAME,
but when they actually see such requests, they fail them with an IO error.
(Not blaming VMware here; other real or virtual disks may show the same
behavior. It's just the most frequent "offender" currently.)

That's not easy for DRBD to handle.
The next DRBD release will have a config switch to turn off WRITE SAME
support for specific DRBD volumes.
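
Once that switch is available, the idea would be a drbd.conf fragment
along these lines; note that the option name used here
("disable-write-same") and the resource name "r0" are assumptions for
illustration only, so check drbd.conf(5) of the release that ships it:

  # sketch only -- the option name "disable-write-same" is assumed,
  # not confirmed for your DRBD version; "r0" is a placeholder resource
  resource r0 {
    disk {
      disable-write-same yes;
    }
  }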

Meanwhile, the available workarounds:

use a different type of virtual disk ("sata" may work), i.e. something that
does not claim to support a feature it then fails to handle.

or, *early* in the boot process (before you bring up DRBD),
disable WRITE SAME like this:
echo 0 | tee /sys/block/*/device/../*/scsi_disk/*/max_write_same_blocks
(for the relevant backend devices)

If you use LVM, you may need to run "vgchange -an; vgchange -ay" after that
(at least for the relevant VGs) if they have already been activated.
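
If you prefer to touch only the actual backend disks instead of everything
under /sys/block, a minimal boot-time sketch could look like this; "sdb" and
"sdc" are placeholders, and the sysfs layout is assumed to match the tee
command above:

  #!/bin/sh
  # Sketch: disable WRITE SAME on selected backend disks before DRBD starts.
  # "sdb" and "sdc" are placeholder device names; adjust to your setup.
  for dev in sdb sdc; do
      for f in /sys/block/"$dev"/device/scsi_disk/*/max_write_same_blocks; do
          [ -e "$f" ] || continue
          echo 0 > "$f"
      done
  done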

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc me, but send to list -- I'm subscribed
___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] Issues with a DRBD9 two node setup

2017-02-17 Thread Roland Kammerer
On Fri, Feb 17, 2017 at 11:58:02AM +, Gerwim Feiken wrote:
> TLDR: drbdmanage completely breaks while drbdadm commands still work
> (including mounting LVs) when one of the two nodes is down.

Hi,

sure, drbdadm commands still work; why shouldn't they? They are low-level
tools.

In your case (a 2-node cluster), drbdmanage is supposed to "break" when
the second node is not there. You basically have these choices:

- 1 node (okay, a bit lame)
- 2 nodes, with *both* available; otherwise it does not finish leader
  election, in order to avoid split brain.
- more than 2 nodes, with a majority of them available

Regards, rck
___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] repository Key not more valid

2017-02-17 Thread Lars Ellenberg
On Thu, Feb 16, 2017 at 07:00:41PM +0100, Roberto Resoli wrote:
> On Thursday, 16 February 2017 at 18:04:06 CET, Michele Rossetti
>  wrote:
> >Hi,
> >trying to update my 3-server PVE 4.4 with DRBD9 system, I now get this
> >warning:
> >
> >W: An error occurred during the signature verification. The repository is
> >not updated and the previous index files will be used. GPG error:
> >http://packages.linbit.com proxmox-4 Release: The following signatures
> >were invalid: KEYEXPIRED 1485786921 KEYEXPIRED 1485786921 KEYEXPIRED
> >1485786921
> >
> >W: Failed to fetch
> >http://packages.linbit.com/proxmox/dists/proxmox-4/Release
> >
> >Is there some change in the Linbit repository, or did I miss something?
> 
>   You have to download and install the updated key, as you did last time.

something like this should usually do it

apt-key adv --recv-keys --keyserver keyserver.ubuntu.com 0x53B3B037282B6E23

feel free to use your favorite keyserver,
or use the short key id 282B6E23.
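
To check that the refreshed key really got imported, and then pull the new
index, something along these lines should do (the grep is only a
convenience; the apt-key list output format differs between apt versions):

  apt-key list | grep -B1 -i linbit   # the pub line should show a new expiry
  apt-get update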

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc me, but send to list -- I'm subscribed
___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


[DRBD-user] Issues with a DRBD9 two node setup

2017-02-17 Thread Gerwim Feiken
Hi guys,

First of all: I'm really impressed with DRBD9, so good job!

My problem: drbdmanage completely breaks when one of the two nodes is down.
drbdadm commands still work perfectly fine (and I can still mount the LVs
hosted on the online node).
My setup is quite simple: two Ubuntu 16.04 nodes using DRBD9 with drbdmanage
(0.99.2) from the Linbit PPA.

After shutting down node bravo:
root@alpha:~# drbdmanage nodes
Waiting for server: ...
No nodes defined

root@alpha:~# drbdmanage ping
pong

root@alpha:~# drbdadm status
.drbdctrl role:Secondary
  volume:0 disk:UpToDate
  volume:1 disk:UpToDate
  bravo connection:Connecting

volume1 role:Secondary
  disk:UpToDate
  bravo connection:Connecting

volume2 role:Secondary
  disk:UpToDate
  bravo connection:Connecting


I have not made .drbdctrl primary as Robert said here: 
http://lists.linbit.com/pipermail/drbd-user/2016-June/023002.html.

After starting the bravo node again:
root@alpha:~# drbdadm status
.drbdctrl role:Secondary
  volume:0 disk:UpToDate
  volume:1 disk:UpToDate
  bravo role:Secondary
volume:0 peer-disk:UpToDate
volume:1 peer-disk:UpToDate

volume1 role:Secondary
  disk:UpToDate
  bravo role:Secondary
peer-disk:UpToDate

volume2 role:Secondary
  disk:UpToDate
  bravo role:Secondary
peer-disk:UpToDate

root@alpha:~# drbdmanage nodes
Waiting for server: .
+----------------------------------------------+
| Name  | Pool Size | Pool Free |      | State |
|----------------------------------------------|
| alpha |     51196 |     49132 |      |    ok |
| bravo |     51196 |     49132 |      |    ok |
+----------------------------------------------+

TLDR: drbdmanage completely breaks while drbdadm commands still work (including 
mounting LVs) when one of the two nodes is down.

Is there anyone who can point me in the right direction?

Kind regards,
Gerwim Feiken



___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] Testing new DRBD9 dedicated repo for PVE

2017-02-17 Thread Roberto Resoli
On 17/02/2017 09:51, Roland Kammerer wrote:
>> does not fix the situation anymore, with the "start" part failing most
>> of the time.
> Hi,
> 
> yes, there was a regression. Hopefully 0.99.2 fixed that.

Thanks, I've now updated to 0.99.2 from the repo; much better, most
resources are up at startup.

drbdmanage restart keeps failing (sometimes a separate shutdown, wait,
startup sequence succeeds):

# drbdmanage restart
You are going to shut down the drbdmanaged server process on this node.
Please confirm:
  yes/no: yes
Attempting to startup the server through D-Bus activation...
ERROR:dbus.proxies:Introspect error on :1.7:/interface:
dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NoReply: Did
not receive a reply. Possible causes include: the remote application did
not send a reply, the message bus security policy blocked the reply, the
reply timeout expired, or the network connection was broken.
D-Bus connection successful, server is running and reachable

>> The only way to recover I found is to issue a
>>
>> drbdmanage export-res "*"
>>
>> and then manually issue a
>>
>> drbdadm up <resource>
>>
>> for every <resource>
> You could have done "drbdadm up all" in that case.

Tried that, but it was failing, leaving disks in the "Diskless" state
(maybe because the command finds .drbdctrl already up and raises an
exception?)

It would be handy to have something like wildcard support:

 drbdadm up vm-*

I implemented somewhat similar behaviour (providing a list of resources to
bring up/down, defaulting to all but .drbdctrl) in a custom script.
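
A minimal sketch of such a script, assuming "drbdadm sh-resources" lists
all configured resource names on this version:

  #!/bin/sh
  # Bring up every DRBD resource except the drbdmanage control volume.
  # Assumes "drbdadm sh-resources" prints the configured resource names.
  for res in $(drbdadm sh-resources); do
      [ "$res" = ".drbdctrl" ] && continue
      drbdadm up "$res" || echo "failed to bring up $res" >&2
  done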

bye,
rob

___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] Testing new DRBD9 dedicated repo for PVE

2017-02-17 Thread Roland Kammerer
On Thu, Feb 16, 2017 at 05:39:42PM +0100, Roberto Resoli wrote:
> On 09/01/2017 10:56, Roberto Resoli wrote:
> > On 29/12/2016 15:01, Roberto Resoli wrote:
> >> On 28/12/2016 16:00, Roberto Resoli wrote:
> >>> On 27/12/2016 10:19, Roberto Resoli wrote:
>  All is up and running nicely, in any case.
> >> Another issue: drbdmanaged starts at boot, but most resources remain down; 
> >> a
> >>
> >> drbdmanage restart
> >>
> >> fixes that.
> >>
> >> I attach a related fragment from syslog; it seems that some temporary
> >> .res files are not generated correctly.
> > 
> > any hint?
> 
> After having upgraded to python-drbdmanage 0.99.1-1 from the linbit drbd9
> repo for proxmox, a simple
> 
> drbdmanage restart
> 
> does not fix the situation anymore, with the "start" part failing most
> of the time.

Hi,

yes, there was a regression. Hopefully 0.99.2 fixed that.

> The only way to recover I found is to issue a
> 
> drbdmanage export-res "*"
> 
> and then manually issue a
> 
> drbdadm up <resource>
> 
> for every <resource>

You could have done "drbdadm up all" in that case.

Regards, rck
___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user