On Thu, Feb 21, 2019 at 11:11 PM Jason P. Thomas <jthom...@gmualumni.org> wrote:
>
> On 2/20/19 5:33 PM, Darrell Budic wrote:
>
> I was just helping Tristam on #ovirt with a similar problem. We found that
> his two upgraded nodes were running multiple glusterfsd processes per brick
> (but not for all bricks). His volume and brick files in /var/lib/glusterd
> looked normal, but starting glusterd would often spawn extra fsd processes
> per brick, seemingly at random. Gluster bug? Maybe related to
> https://bugzilla.redhat.com/show_bug.cgi?id=1651246, but I’m helping debug
> this one second hand… Possibly related to the brick crashes? We wound up
> stopping glusterd, killing off all the fsds, restarting glusterd, and
> repeating until it only spawned one fsd per brick. We did that on each
> updated server, then restarted glusterd on the not-yet-updated server to get
> it talking to the right bricks. That seemed to get us to a mostly stable
> gluster environment, but he’s still seeing 1-2 files listed as needing
> healing on the upgraded bricks (but not the 3.12 brick), mainly the
> DIRECT_IO_TEST and one of the dom/ids files, but he can probably update
> that. We did manage to get his engine going again, and are waiting to see if
> he’s stable now.
>
> Anyway, figured it was worth posting about so people could check for multiple 
> brick processes (glusterfsd) if they hit this stability issue as well, maybe 
> find common ground.
>
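
For anyone who wants to run the check Darrell suggests, here is a minimal sketch that groups glusterfsd processes by brick and reports any brick served by more than one. It assumes each brick process carries a `--brick-name <path>` argument on its command line and that `ps -eo pid,args` is available; the function names are made up for illustration, so treat it as a starting point rather than a supported tool.

```python
import subprocess
from collections import defaultdict


def group_brick_pids(ps_lines):
    """Map each brick path to the PIDs of the glusterfsd processes serving it.

    Expects lines shaped like the output of `ps -eo pid,args`.
    """
    groups = defaultdict(list)
    for line in ps_lines:
        fields = line.split()
        # Skip the header line and anything that does not start with a PID.
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        # Only brick daemons are of interest, not glusterd or glusterfs clients.
        if not fields[1].endswith("glusterfsd"):
            continue
        # Assumption: the brick path follows a "--brick-name" argument.
        if "--brick-name" not in fields:
            continue
        idx = fields.index("--brick-name")
        if idx + 1 < len(fields):
            groups[fields[idx + 1]].append(int(fields[0]))
    return dict(groups)


def duplicated_bricks(groups):
    """Keep only the bricks that have more than one glusterfsd process."""
    return {brick: pids for brick, pids in groups.items() if len(pids) > 1}


def main():
    # Read the live process table; run this on each gluster node.
    out = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    ).stdout
    dupes = duplicated_bricks(group_brick_pids(out.splitlines()))
    for brick, pids in sorted(dupes.items()):
        print(f"{brick}: {len(pids)} glusterfsd processes, PIDs {pids}")
```

Calling `main()` on a node prints one line per brick that has extra processes and prints nothing when every brick has exactly one. It only reports; deciding which PID to kill is still a manual step.
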
> Note: we also hit https://bugzilla.redhat.com/show_bug.cgi?id=1348434
> while trying to get his engine back up; restarting libvirtd let us get it
> going again. That may have been unnecessary if he’d been able to complete
> his third node upgrade, but he got stuck before then, so...
>
>   -Darrell
>
> Stable is a relative term.  My unsynced entries total for each of my 4
> volumes changes drastically (with the exception of the engine volume, which
> pretty much bounces between 1 and 4).  The cluster has been "healing" for 18
> hours or so and only the unupgraded HC node has healed bricks.  I also had
> the problem that some files/directories were owned by root:root; the
> affected VMs did not boot until I changed ownership to 36:36.  Even after 18
> hours, there are anywhere from 20 to 386 entries in vol heal info for my 3
> non-engine bricks.  Overnight I had one brick on one volume go down on one
> HC node.  When I bounced glusterd, it brought up a new fsd process for that
> brick.  I killed the old one and now vol status reports the right pid on
> each of the nodes.  This is quite the debacle.  If I can provide any info
> that might help get things moving in the right direction, let me know.
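
Since the heal counts swing so much, it may help to record them per brick over time instead of eyeballing them. Below is a rough sketch that parses the output of `gluster volume heal <volume> info`. It assumes the usual layout, where each section starts with a `Brick <host>:<path>` line and ends with a `Number of entries: <n>` line; the function names are only illustrative.

```python
import subprocess


def parse_heal_info(text):
    """Return {brick: pending entry count} from heal info output.

    The count is None when the field is not a number, which is assumed
    to mean the brick could not be queried.
    """
    counts = {}
    brick = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            # Start of a new per-brick section.
            brick = line[len("Brick "):]
        elif line.startswith("Number of entries:") and brick is not None:
            value = line.split(":", 1)[1].strip()
            counts[brick] = int(value) if value.isdigit() else None
            brick = None
    return counts


def heal_counts(volume):
    """Run heal info for one volume and parse the result."""
    out = subprocess.run(
        ["gluster", "volume", "heal", volume, "info"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_heal_info(out)
```

Running `heal_counts` for each volume every few minutes and logging the result with a timestamp would show whether the totals are trending down or just bouncing around.
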

Can you provide the gluster brick logs and glusterd logs from the
servers (from /var/log/glusterfs/)? Since you mention that heal seems
to be stuck, could you also provide the heal logs from
/var/log/glusterfs/glustershd.log?
If you can log a bug with these logs, that would be great. Please use
https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS to log the
bug.
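
If it saves some time, here is a small sketch for bundling those logs into one archive to attach to the bug. It assumes the top-level logs (glusterd, glustershd) sit directly in the log directory and the per-brick logs under a `bricks/` subdirectory; the function name and archive path are just examples.

```python
import tarfile
from pathlib import Path


def collect_logs(log_dir, archive_path):
    """Pack top-level *.log files and bricks/*.log into a .tar.gz archive.

    Returns the archived paths relative to log_dir.
    """
    log_dir = Path(log_dir)
    # Top-level daemon logs first, then the per-brick logs.
    files = sorted(log_dir.glob("*.log")) + sorted(log_dir.glob("bricks/*.log"))
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in files:
            # Store relative names so the archive unpacks cleanly anywhere.
            tar.add(path, arcname=path.relative_to(log_dir).as_posix())
    return [p.relative_to(log_dir).as_posix() for p in files]
```

For example, `collect_logs("/var/log/glusterfs", "/tmp/gluster-logs.tar.gz")` run on each server gives one file per node. Rotated or compressed logs are not picked up by these patterns, so add them by hand if the problem predates the last rotation.
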


>
> Jason aka Tristam
>
>
> On Feb 14, 2019, at 1:12 AM, Sahina Bose <sab...@redhat.com> wrote:
>
> On Thu, Feb 14, 2019 at 2:39 AM Ron Jerome <ronj...@gmail.com> wrote:
>
> Can you be more specific? What things did you see, and did you report bugs?
>
>
> I've got this one: https://bugzilla.redhat.com/show_bug.cgi?id=1649054
> and this one: https://bugzilla.redhat.com/show_bug.cgi?id=1651246
> and I've got bricks randomly going offline and getting out of sync with the 
> others at which point I've had to manually stop and start the volume to get 
> things back in sync.
>
>
> Thanks for reporting these. Will follow up on the bugs to ensure
> they're addressed.
> Regarding bricks going offline: are the brick processes crashing? Can
> you provide logs of glusterd and bricks? Or is this to do with
> ovirt-engine and brick status not being in sync?
>
> _______________________________________________
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/3RVMLCRK4BWCSBTWVXU2JTIDBWU7WEOP/
>
