Re: [ovirt-users] [Gluster-users] Centos 7.1 failed to start glusterd after upgrading to ovirt 3.6

2015-11-12 Thread Stefano Danzi
To temporarily fix this problem I changed the [Unit] section of the
glusterd.service file:


[Unit]
Description=GlusterFS, a clustered file-system server
After=network.target rpcbind.service network-online.target vdsm-network.service
Before=vdsmd.service
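
The same ordering can also be applied without editing the packaged unit
file, via a systemd drop-in (a sketch using standard systemd mechanics;
the drop-in file name below is an arbitrary choice, not something from
this thread):

# /etc/systemd/system/glusterd.service.d/ordering.conf
[Unit]
After=network.target rpcbind.service network-online.target vdsm-network.service
Before=vdsmd.service

followed by "systemctl daemon-reload" so systemd picks it up. A drop-in
survives package upgrades, whereas edits to the unit file shipped by the
package can be overwritten.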

On 10/11/2015 8.02, Kaushal M wrote:

On Mon, Nov 9, 2015 at 9:06 PM, Stefano Danzi  wrote:

Here is the output from systemd-analyze critical-chain and systemd-analyze blame.
I think that glusterd now starts too early (before networking).

You are nearly right. GlusterD did start too early. GlusterD is
configured to start after network.target. But network.target in
systemd only guarantees that the network management stack is up; it
doesn't guarantee that the network devices have been configured and
are usable (Ref [1]). This means that when GlusterD starts, the
network is still not up and hence GlusterD will fail to resolve
bricks.

While we could start GlusterD after network-online.target, it would
break GlusterFS mounts configured in /etc/fstab with the _netdev option.
Systemd automatically schedules _netdev mounts to happen after
network-online.target (Ref [1], network-online.target). This could
allow the GlusterFS mounts to be attempted before GlusterD is up,
causing them to fail. Ordering the mounts after GlusterD would be
possible with systemd-220 [2], which introduced the `x-systemd.requires`
fstab option for ordering mounts after specific services, but not with
EL7, which ships systemd-208.

[1]: https://wiki.freedesktop.org/www/Software/systemd/NetworkTarget/
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=812826
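
For reference, a unit that genuinely needs a configured network is
normally ordered with both of the following lines, per the systemd
documentation cited in [1] (a generic sketch of that convention, not a
change proposed in this thread):

[Unit]
Wants=network-online.target
After=network-online.target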




Re: [ovirt-users] [Gluster-users] Centos 7.1 failed to start glusterd after upgrading to ovirt 3.6

2015-11-09 Thread Stefano Danzi

Here is the output from systemd-analyze critical-chain and systemd-analyze blame.
I think that glusterd now starts too early (before networking).

[root@ovirt01 tmp]# systemd-analyze critical-chain
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.

multi-user.target @17.148s
└─ovirt-ha-agent.service @17.021s +127ms
  └─vdsmd.service @15.871s +1.148s
    └─vdsm-network.service @11.495s +4.373s
      └─libvirtd.service @11.238s +254ms
        └─iscsid.service @11.228s +8ms
          └─network.target @11.226s
            └─network.service @6.748s +4.476s
              └─iptables.service @6.630s +117ms
                └─basic.target @6.629s
                  └─paths.target @6.629s
                    └─brandbot.path @6.629s
                      └─sysinit.target @6.615s
                        └─systemd-update-utmp.service @6.610s +4ms
                          └─auditd.service @6.450s +157ms
                            └─systemd-tmpfiles-setup.service @6.369s +77ms
                              └─rhel-import-state.service @6.277s +88ms
                                └─local-fs.target @6.275s
                                  └─home-glusterfs-data.mount @5.805s +470ms
                                    └─home.mount @3.946s +1.836s
                                      └─systemd-fsck@dev-mapper-centos_ovirt01\x2dhome.service @3.937s +7ms
                                        └─dev-mapper-centos_ovirt01\x2dhome.device @3.936s
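
glusterd.service itself is not on the default target's chain above; its
own ordering chain can be printed by passing the unit name (assuming the
systemd-analyze on this host accepts unit arguments, which EL7's does):

[root@ovirt01 tmp]# systemd-analyze critical-chain glusterd.service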



[root@ovirt01 tmp]# systemd-analyze blame
  4.476s network.service
  4.373s vdsm-network.service
  2.318s glusterd.service
  2.076s postfix.service
  1.836s home.mount
  1.651s lvm2-monitor.service
  1.258s lvm2-pvscan@9:1.service
  1.211s systemd-udev-settle.service
  1.148s vdsmd.service
  1.079s dmraid-activation.service
  1.046s boot.mount
   904ms kdump.service
   779ms multipathd.service
   657ms var-lib-nfs-rpc_pipefs.mount
   590ms systemd-fsck@dev-disk-by\x2duuid-e185849f\x2d2c82\x2d4eb2\x2da215\x2d97340e90c93e.service
   547ms tuned.service
   481ms kmod-static-nodes.service
   470ms home-glusterfs-data.mount
   427ms home-glusterfs-engine.mount
   422ms sys-kernel-debug.mount
   411ms dev-hugepages.mount
   411ms dev-mqueue.mount
   278ms systemd-fsck-root.service
   263ms systemd-readahead-replay.service
   254ms libvirtd.service
   243ms systemd-tmpfiles-setup-dev.service
   216ms systemd-modules-load.service
   209ms rhel-readonly.service
   195ms wdmd.service
   192ms sanlock.service
   191ms gssproxy.service
   186ms systemd-udev-trigger.service
   157ms auditd.service
   151ms plymouth-quit-wait.service
   151ms plymouth-quit.service
   132ms proc-fs-nfsd.mount
   127ms ovirt-ha-agent.service
   117ms iptables.service
   110ms ovirt-ha-broker.service
96ms avahi-daemon.service
89ms systemd-udevd.service
88ms rhel-import-state.service
77ms systemd-tmpfiles-setup.service
71ms sysstat.service
71ms microcode.service
71ms chronyd.service
69ms systemd-readahead-collect.service
68ms systemd-sysctl.service
65ms systemd-logind.service
61ms rsyslog.service
58ms systemd-remount-fs.service
46ms rpcbind.service
46ms nfs-config.service
45ms systemd-tmpfiles-clean.service
41ms rhel-dmesg.service
37ms dev-mapper-centos_ovirt01\x2dswap.swap
29ms systemd-vconsole-setup.service
26ms plymouth-read-write.service
26ms systemd-random-seed.service
24ms netcf-transaction.service
22ms mdmonitor.service
20ms systemd-machined.service
14ms plymouth-start.service
12ms systemd-update-utmp-runlevel.service
    11ms systemd-fsck@dev-mapper-centos_ovirt01\x2dglusterOVEngine.service
 8ms iscsid.service
 7ms systemd-fsck@dev-mapper-centos_ovirt01\x2dhome.service
 7ms systemd-readahead-done.service
     7ms systemd-fsck@dev-mapper-centos_ovirt01\x2dglusterOVData.service
 6ms sys-fs-fuse-connections.mount
 4ms systemd-update-utmp.service
 4ms glusterfsd.service
 4ms rpc-statd-notify.service
 3ms iscsi-shutdown.service
 3ms systemd-journal-flush.service
 2ms sys-kernel-config.mount
 1ms systemd-user-sessions.service


On 06/11/2015 9.27, Stefano Danzi wrote:

Hi!
I have only one node (a test system), I didn't change any IP address,
and the host entry is in /etc/hosts.

I think that gluster now starts before networking.

On 06/11/2015 

Re: [ovirt-users] [Gluster-users] Centos 7.1 failed to start glusterd after upgrading to ovirt 3.6

2015-11-09 Thread Kaushal M
On Mon, Nov 9, 2015 at 9:06 PM, Stefano Danzi  wrote:
> Here is the output from systemd-analyze critical-chain and systemd-analyze blame.
> I think that glusterd now starts too early (before networking).

You are nearly right. GlusterD did start too early. GlusterD is
configured to start after network.target. But network.target in
systemd only guarantees that the network management stack is up; it
doesn't guarantee that the network devices have been configured and
are usable (Ref [1]). This means that when GlusterD starts, the
network is still not up and hence GlusterD will fail to resolve
bricks.

While we could start GlusterD after network-online.target, it would
break GlusterFS mounts configured in /etc/fstab with the _netdev option.
Systemd automatically schedules _netdev mounts to happen after
network-online.target (Ref [1], network-online.target). This could
allow the GlusterFS mounts to be attempted before GlusterD is up,
causing them to fail. Ordering the mounts after GlusterD would be
possible with systemd-220 [2], which introduced the `x-systemd.requires`
fstab option for ordering mounts after specific services, but not with
EL7, which ships systemd-208.

[1]: https://wiki.freedesktop.org/www/Software/systemd/NetworkTarget/
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=812826
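
To illustrate the fstab option mentioned above (only a sketch, since it
needs systemd >= 220 and therefore does not apply to the EL7 host in
this thread; server, volume, and mount point names are placeholders):

server1:/myvolume  /mnt/myvolume  glusterfs  defaults,_netdev,x-systemd.requires=glusterd.service  0 0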


Re: [ovirt-users] [Gluster-users] Centos 7.1 failed to start glusterd after upgrading to ovirt 3.6

2015-11-05 Thread Atin Mukherjee
>> [glusterd-store.c:4243:glusterd_resolve_all_bricks] 0-glusterd:
>> resolve brick failed in restore
The above log message is the culprit here. Generally this function fails
when GlusterD cannot resolve the host associated with a brick. Have any
of the nodes undergone an IP change during the upgrade process?
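
A quick way to check whether the brick hosts still resolve (a sketch;
the paths assume the default /var/lib/glusterd working directory shown
in the log below, and the exact layout can vary by GlusterFS version):

# list the hostnames recorded for each brick
grep -h '^hostname=' /var/lib/glusterd/vols/*/bricks/* | sort -u
# then confirm that each of them resolves on this node
getent hosts HOSTNAME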

~Atin

On 11/06/2015 09:59 AM, Sahina Bose wrote:
> Did you upgrade all the nodes too?
> Are some of your nodes unreachable?
> 
> Adding gluster-users for glusterd error.
> 
> On 11/06/2015 12:00 AM, Stefano Danzi wrote:
>>
>> After upgrading oVirt from 3.5 to 3.6, glusterd fails to start when the
>> host boots.
>> Starting the service manually after boot works fine.
>>
>> gluster log:
>>
>> [2015-11-04 13:37:55.360876] I [MSGID: 100030]
>> [glusterfsd.c:2318:main] 0-/usr/sbin/glusterd: Started running
>> /usr/sbin/glusterd version 3.7.5 (args: /usr/sbin/glusterd -p
>> /var/run/glusterd.pid)
>> [2015-11-04 13:37:55.447413] I [MSGID: 106478] [glusterd.c:1350:init]
>> 0-management: Maximum allowed open file descriptors set to 65536
>> [2015-11-04 13:37:55.447477] I [MSGID: 106479] [glusterd.c:1399:init]
>> 0-management: Using /var/lib/glusterd as working directory
>> [2015-11-04 13:37:55.464540] W [MSGID: 103071]
>> [rdma.c:4592:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event
>> channel creation failed [Nessun device corrisponde]
>> [2015-11-04 13:37:55.464559] W [MSGID: 103055] [rdma.c:4899:init]
>> 0-rdma.management: Failed to initialize IB Device
>> [2015-11-04 13:37:55.464566] W
>> [rpc-transport.c:359:rpc_transport_load] 0-rpc-transport: 'rdma'
>> initialization failed
>> [2015-11-04 13:37:55.464616] W [rpcsvc.c:1597:rpcsvc_transport_create]
>> 0-rpc-service: cannot create listener, initing the transport failed
>> [2015-11-04 13:37:55.464624] E [MSGID: 106243] [glusterd.c:1623:init]
>> 0-management: creation of 1 listeners failed, continuing with
>> succeeded transport
>> [2015-11-04 13:37:57.663862] I [MSGID: 106513]
>> [glusterd-store.c:2036:glusterd_restore_op_version] 0-glusterd:
>> retrieved op-version: 30600
>> [2015-11-04 13:37:58.284522] I [MSGID: 106194]
>> [glusterd-store.c:3465:glusterd_store_retrieve_missed_snaps_list]
>> 0-management: No missed snaps list.
>> [2015-11-04 13:37:58.287477] E [MSGID: 106187]
>> [glusterd-store.c:4243:glusterd_resolve_all_bricks] 0-glusterd:
>> resolve brick failed in restore
>> [2015-11-04 13:37:58.287505] E [MSGID: 101019]
>> [xlator.c:428:xlator_init] 0-management: Initialization of volume
>> 'management' failed, review your volfile again
>> [2015-11-04 13:37:58.287513] E [graph.c:322:glusterfs_graph_init]
>> 0-management: initializing translator failed
>> [2015-11-04 13:37:58.287518] E [graph.c:661:glusterfs_graph_activate]
>> 0-graph: init failed
>> [2015-11-04 13:37:58.287799] W [glusterfsd.c:1236:cleanup_and_exit]
>> (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xfd) [0x7f29b876524d]
>> -->/usr/sbin/glusterd(glusterfs_process_volfp+0x126) [0x7f29b87650f6]
>> -->/usr/sbin/glusterd(cleanup_and_exit+0x69) [0x7f29b87646d9] ) 0-:
>> received signum (0), shutting down
>>
>>