On Tue, Jul 25, 2017 at 11:12 AM, Kasturi Narra <kna...@redhat.com> wrote:
> These errors are because glusternw is not assigned to the correct
> interface. Once you attach it, these errors should go away. This has
> nothing to do with the problem you are seeing.
>
> Sahina, any idea about the engine not showing the correct volume info?
> Please provide the vdsm.log (containing the gluster volume info) and
> the engine.log.
>
> On Mon, Jul 24, 2017 at 7:30 PM, yayo (j) <jag...@gmail.com> wrote:
>
>> Hi,
>>
>> UI refreshed but the problem still remains ...
>>
>> No specific error; I see only the errors below, but I've read that
>> this kind of error is not a problem:
>>
>> 2017-07-24 15:53:59,823+02 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler2) [b7590c4] START, GlusterServersListVDSCommand(HostName = node01.localdomain.local, VdsIdVDSCommandParametersBase:{runAsync='true', hostId='4c89baa5-e8f7-4132-a4b3-af332247570c'}), log id: 29a62417
>> 2017-07-24 15:54:01,066+02 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler2) [b7590c4] FINISH, GlusterServersListVDSCommand, return: [10.10.20.80/24:CONNECTED, node02.localdomain.local:CONNECTED, gdnode04:CONNECTED], log id: 29a62417
>> 2017-07-24 15:54:01,076+02 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler2) [b7590c4] START, GlusterVolumesListVDSCommand(HostName = node01.localdomain.local, GlusterVolumesListVDSParameters:{runAsync='true', hostId='4c89baa5-e8f7-4132-a4b3-af332247570c'}), log id: 7fce25d3
>> 2017-07-24 15:54:02,209+02 WARN [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode01:/gluster/engine/brick' of volume 'd19c19e3-910d-437b-8ba7-4f2a23d17515' with correct network as no gluster network found in cluster '00000002-0002-0002-0002-00000000017a'
>> 2017-07-24 15:54:02,212+02 WARN [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode02:/gluster/engine/brick' of volume 'd19c19e3-910d-437b-8ba7-4f2a23d17515' with correct network as no gluster network found in cluster '00000002-0002-0002-0002-00000000017a'
>> 2017-07-24 15:54:02,215+02 WARN [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode04:/gluster/engine/brick' of volume 'd19c19e3-910d-437b-8ba7-4f2a23d17515' with correct network as no gluster network found in cluster '00000002-0002-0002-0002-00000000017a'
>> 2017-07-24 15:54:02,218+02 WARN [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode01:/gluster/data/brick' of volume 'c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d' with correct network as no gluster network found in cluster '00000002-0002-0002-0002-00000000017a'
>> 2017-07-24 15:54:02,221+02 WARN [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode02:/gluster/data/brick' of volume 'c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d' with correct network as no gluster network found in cluster '00000002-0002-0002-0002-00000000017a'
>> 2017-07-24 15:54:02,224+02 WARN [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode04:/gluster/data/brick' of volume 'c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d' with correct network as no gluster network found in cluster '00000002-0002-0002-0002-00000000017a'
>> 2017-07-24 15:54:02,224+02 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler2) [b7590c4] FINISH, GlusterVolumesListVDSCommand, return: {d19c19e3-910d-437b-8ba7-4f2a23d17515=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@fdc91062, c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@999a6f23}, log id: 7fce25d3
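For anyone hitting the same warnings, a quick way to confirm them and to collect the two logs Kasturi asked for; the paths below are the oVirt defaults, so adjust them if your installation differs:

    # On the engine VM: count the brick-association warnings
    grep -c 'Could not associate brick' /var/log/ovirt-engine/engine.log

    # On a gluster node: capture recent vdsm activity around the periodic
    # gluster queries that feed the engine's volume view
    grep -i gluster /var/log/vdsm/vdsm.log | tail -n 50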
>>
>> Thank you
>>
>> 2017-07-24 8:12 GMT+02:00 Kasturi Narra <kna...@redhat.com>:
>>
>>> Hi,
>>>
>>> Regarding the UI showing incorrect information about the engine and
>>> data volumes, can you please refresh the UI and see if the issue
>>> persists, plus check for any errors in the engine.log files?
>>>
>>> Thanks
>>> kasturi
>>>
>>> On Sat, Jul 22, 2017 at 11:43 AM, Ravishankar N <ravishan...@redhat.com> wrote:
>>>
>>>> On 07/21/2017 11:41 PM, yayo (j) wrote:
>>>>
>>>> Hi,
>>>>
>>>> Sorry for following up again, but, checking the oVirt interface,
>>>> I've found that oVirt reports the "engine" volume as an "arbiter"
>>>> configuration and the "data" volume as a fully replicated volume.
>>>> Check these screenshots:
>>>>
>>>> This is probably some refresh bug in the UI, Sahina might be able to
>>>> tell you.
>>>>
>>>> https://drive.google.com/drive/folders/0ByUV7xQtP1gCTE8tUTFfVmR5aDQ?usp=sharing
>>>>
>>>> But the "gluster volume info" command reports that both volumes are
>>>> fully replicated:
>>>>
>>>> Volume Name: data
>>>> Type: Replicate
>>>> Volume ID: c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 1 x 3 = 3
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: gdnode01:/gluster/data/brick
>>>> Brick2: gdnode02:/gluster/data/brick
>>>> Brick3: gdnode04:/gluster/data/brick
>>>> Options Reconfigured:
>>>> nfs.disable: on
>>>> performance.readdir-ahead: on
>>>> transport.address-family: inet
>>>> storage.owner-uid: 36
>>>> performance.quick-read: off
>>>> performance.read-ahead: off
>>>> performance.io-cache: off
>>>> performance.stat-prefetch: off
>>>> performance.low-prio-threads: 32
>>>> network.remote-dio: enable
>>>> cluster.eager-lock: enable
>>>> cluster.quorum-type: auto
>>>> cluster.server-quorum-type: server
>>>> cluster.data-self-heal-algorithm: full
>>>> cluster.locking-scheme: granular
>>>> cluster.shd-max-threads: 8
>>>> cluster.shd-wait-qlength: 10000
>>>> features.shard: on
>>>> user.cifs: off
>>>> storage.owner-gid: 36
>>>> features.shard-block-size: 512MB
>>>> network.ping-timeout: 30
>>>> performance.strict-o-direct: on
>>>> cluster.granular-entry-heal: on
>>>> auth.allow: *
>>>> server.allow-insecure: on
>>>>
>>>> Volume Name: engine
>>>> Type: Replicate
>>>> Volume ID: d19c19e3-910d-437b-8ba7-4f2a23d17515
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 1 x 3 = 3
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: gdnode01:/gluster/engine/brick
>>>> Brick2: gdnode02:/gluster/engine/brick
>>>> Brick3: gdnode04:/gluster/engine/brick
>>>> Options Reconfigured:
>>>> nfs.disable: on
>>>> performance.readdir-ahead: on
>>>> transport.address-family: inet
>>>> storage.owner-uid: 36
>>>> performance.quick-read: off
>>>> performance.read-ahead: off
>>>> performance.io-cache: off
>>>> performance.stat-prefetch: off
>>>> performance.low-prio-threads: 32
>>>> network.remote-dio: off
>>>> cluster.eager-lock: enable
>>>> cluster.quorum-type: auto
>>>> cluster.server-quorum-type: server
>>>> cluster.data-self-heal-algorithm: full
>>>> cluster.locking-scheme: granular
>>>> cluster.shd-max-threads: 8
>>>> cluster.shd-wait-qlength: 10000
>>>> features.shard: on
>>>> user.cifs: off
>>>> storage.owner-gid: 36
>>>> features.shard-block-size: 512MB
>>>> network.ping-timeout: 30
>>>> performance.strict-o-direct: on
>>>> cluster.granular-entry-heal: on
>>>> auth.allow: *
>>>> server.allow-insecure: on
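As a CLI cross-check of what the UI claims, the "Number of Bricks" line distinguishes the two layouts: a full replica 3 prints "1 x 3 = 3" (as in both outputs above), while on gluster releases of this vintage an arbiter volume prints "1 x (2 + 1) = 3". A minimal sketch:

    # Print just the layout lines for both volumes; "1 x 3 = 3" means a
    # full replica 3, "1 x (2 + 1) = 3" would mean replica 3 arbiter 1
    for vol in data engine; do
        echo "== $vol =="
        gluster volume info "$vol" | grep -E '^Type:|^Number of Bricks:'
    done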
>>>>
>>>> 2017-07-21 19:13 GMT+02:00 yayo (j) <jag...@gmail.com>:
>>>>
>>>>> 2017-07-20 14:48 GMT+02:00 Ravishankar N <ravishan...@redhat.com>:
>>>>>
>>>>>> But it does say something. All these gfids of completed heals in
>>>>>> the log below are for the ones that you have given the getfattr
>>>>>> output of. So what is likely happening is that there is an
>>>>>> intermittent connection problem between your mount and the brick
>>>>>> process, leading to pending heals again after the heal gets
>>>>>> completed, which is why the numbers are varying each time. You
>>>>>> would need to check why that is the case.
>>>>>> Hope this helps,
>>>>>> Ravi
>>>>>>
>>>>>> [2017-07-20 09:58:46.573079] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on e6dfd556-340b-4b76-b47b-7b6f5bd74327. sources=[0] 1 sinks=2
>>>>>> [2017-07-20 09:59:22.995003] I [MSGID: 108026] [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do] 0-engine-replicate-0: performing metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81
>>>>>> [2017-07-20 09:59:22.999372] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81. sources=[0] 1 sinks=2
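To observe the pattern Ravi describes (heals completing and then reappearing), one can poll the pending-heal counts; a minimal sketch using the standard heal-info CLI:

    # Poll pending self-heal entries on the engine volume once a minute;
    # counts that drop to 0 and later climb again fit the intermittent
    # mount-to-brick disconnects described above
    watch -n 60 "gluster volume heal engine info | grep 'Number of entries'"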
>>>>>
>>>>> Hi,
>>>>>
>>>>> Following your suggestion, I've checked the "peer" status and I
>>>>> found that there are too many names for the hosts; I don't know if
>>>>> this can be the problem or part of it:
>>>>>
>>>>> gluster peer status on NODE01:
>>>>> Number of Peers: 2
>>>>>
>>>>> Hostname: dnode02.localdomain.local
>>>>> Uuid: 7c0ebfa3-5676-4d3f-9bfa-7fff6afea0dd
>>>>> State: Peer in Cluster (Connected)
>>>>> Other names:
>>>>> 192.168.10.52
>>>>> dnode02.localdomain.local
>>>>> 10.10.20.90
>>>>> 10.10.10.20
>>>>>
>>>>> gluster peer status on NODE02:
>>>>> Number of Peers: 2
>>>>>
>>>>> Hostname: dnode01.localdomain.local
>>>>> Uuid: a568bd60-b3e4-4432-a9bc-996c52eaaa12
>>>>> State: Peer in Cluster (Connected)
>>>>> Other names:
>>>>> gdnode01
>>>>> 10.10.10.10
>>>>>
>>>>> Hostname: gdnode04
>>>>> Uuid: ce6e0f6b-12cf-4e40-8f01-d1609dfc5828
>>>>> State: Peer in Cluster (Connected)
>>>>> Other names:
>>>>> 192.168.10.54
>>>>> 10.10.10.40
>>>>>
>>>>> gluster peer status on NODE04:
>>>>> Number of Peers: 2
>>>>>
>>>>> Hostname: dnode02.neridom.dom
>>>>> Uuid: 7c0ebfa3-5676-4d3f-9bfa-7fff6afea0dd
>>>>> State: Peer in Cluster (Connected)
>>>>> Other names:
>>>>> 10.10.20.90
>>>>> gdnode02
>>>>> 192.168.10.52
>>>>> 10.10.10.20
>>>>>
>>>>> Hostname: dnode01.localdomain.local
>>>>> Uuid: a568bd60-b3e4-4432-a9bc-996c52eaaa12
>>>>> State: Peer in Cluster (Connected)
>>>>> Other names:
>>>>> gdnode01
>>>>> 10.10.10.10
>>>>>
>>>>> All these IPs are pingable and the hosts are resolvable across all
>>>>> 3 nodes, but only the 10.10.10.0 network is the dedicated network
>>>>> for gluster (resolved using the gdnode* host names) ... Do you
>>>>> think that removing the other entries can fix the problem? And,
>>>>> sorry, but how can I remove the other entries?
>>>>>
>>>> I don't think having extra entries could be a problem. Did you check
>>>> the fuse mount logs for the disconnect messages that I referred to
>>>> in the other email?
>>>>
>>>>> And, what about the selinux?
>>>>>
>>>> Not sure about this. See if there are disconnect messages in the
>>>> mount logs first.
>>>> -Ravi
>>>>
>>>>> Thank you
>>>>
>>>> --
>>>> Linux User: 369739 http://counter.li.org
>>
>> --
>> Linux User: 369739 http://counter.li.org
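A sketch of the disconnect check Ravi keeps referring to: fuse client logs live under /var/log/glusterfs/, with the file name derived from the mount point. The wildcard below assumes oVirt's usual /rhev/data-center/mnt/glusterSD/... mount path, so adjust it to match the actual mount:

    # Look for client-side connection drops to the brick processes in the
    # fuse mount log (file name mirrors the mount point, '/' -> '-')
    grep -i 'disconnected from' \
        /var/log/glusterfs/rhev-data-center-mnt-glusterSD-*.log | tail -n 20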
> _______________________________________________
> Gluster-users mailing list
> gluster-us...@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users

_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users