Hi,

On Sat, Aug 4, 2018 at 3:19 AM Fox <[email protected]> wrote:
> Replying to the last batch of questions I've received...
>
> To reiterate, I am only having problems writing files to disperse volumes
> when mounting them on an armhf system. Mounting the same volume on an
> x86-64 system works fine.
> Disperse volumes running on arm cannot heal.
>
> Replica volumes mount and heal just fine.
>
> All bricks are up and running. I have ensured connectivity and that the
> MTU is correct and identical.
>
> Armhf is 32-bit:
> # uname -a
> Linux gluster01 4.14.55-146 #1 SMP PREEMPT Wed Jul 11 22:31:01 -03 2018
> armv7l armv7l armv7l GNU/Linux
> # file /bin/bash
> /bin/bash: ELF 32-bit LSB shared object, ARM, EABI5 version 1 (SYSV),
> dynamically linked, interpreter /lib/ld-linux-armhf.so.3, for GNU/Linux
> 3.2.0, BuildID[sha1]=e0a53f804173b0cd9845bb8a76fee1a1e98a9759, stripped
> # lsb_release -a
> No LSB modules are available.
> Distributor ID: Ubuntu
> Description:    Ubuntu 18.04.1 LTS
> Release:        18.04
> Codename:       bionic
> # free
>               total        used        free      shared  buff/cache   available
> Mem:        2042428       83540     1671004        6052      287884     1895684
> Swap:             0           0           0
>
> 8 cores total: 4x running 2 GHz and 4x running 1.4 GHz.
> processor : 0
> model name : ARMv7 Processor rev 3 (v7l)
> BogoMIPS : 24.00
> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae
> CPU implementer : 0x41
> CPU architecture: 7
> CPU variant : 0x0
> CPU part : 0xc07
> CPU revision : 3
>
> processor : 4
> model name : ARMv7 Processor rev 3 (v7l)
> BogoMIPS : 72.00
> Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae
> CPU implementer : 0x41
> CPU architecture: 7
> CPU variant : 0x2
> CPU part : 0xc0f
> CPU revision : 3
>
> There IS a 98MB /core file from the fuse mount, so that's cool.
> # file /core
> /core: ELF 32-bit LSB core file ARM, version 1 (SYSV), SVR4-style, from
> '/usr/sbin/glusterfs --process-name fuse --volfile-server=gluster01
> --volfile-id', real uid: 0, effective uid: 0, real gid: 0, effective gid:
> 0, execfn: '/usr/sbin/glusterfs', platform: 'v7l'

One possible cause is some 64/32-bit inconsistency. If you have also
installed the debug symbols and can provide a backtrace from the core dump,
it would help to identify the problem (see the gdb sketch after the quoted
thread below).

Xavi

> I will try and get a bug report with logs filed over the weekend.
>
> This is just an experimental home cluster; I don't have anything on it
> yet. It's possible I could grant someone SSH access to the cluster if it
> helps further the gluster project, but the results should be reproducible
> on something like a Raspberry Pi. I was hoping to run a dispersed volume
> on it eventually, otherwise I would have never found this issue.
>
> Thank you for the troubleshooting ideas.
>
> -Fox
>
> On Fri, Aug 3, 2018 at 3:33 AM, Milind Changire <[email protected]>
> wrote:
>
>> What is the endianness of the armhf CPU?
>> Are you running a 32-bit or 64-bit operating system?
>>
>> On Fri, Aug 3, 2018 at 9:51 AM, Fox <[email protected]> wrote:
>>
>>> Just wondering if anyone else is running into the same behavior with
>>> disperse volumes described below, and what I might be able to do about
>>> it.
>>>
>>> I am using Ubuntu 18.04 LTS on Odroid HC-2 hardware (armhf) and have
>>> installed gluster 4.1.2 via PPA. I have 12 member nodes, each with a
>>> single brick.
>>> I can successfully create a working volume via the command:
>>>
>>> gluster volume create testvol1 disperse 12 redundancy 4
>>> gluster01:/exports/sda/brick1/testvol1
>>> gluster02:/exports/sda/brick1/testvol1
>>> gluster03:/exports/sda/brick1/testvol1
>>> gluster04:/exports/sda/brick1/testvol1
>>> gluster05:/exports/sda/brick1/testvol1
>>> gluster06:/exports/sda/brick1/testvol1
>>> gluster07:/exports/sda/brick1/testvol1
>>> gluster08:/exports/sda/brick1/testvol1
>>> gluster09:/exports/sda/brick1/testvol1
>>> gluster10:/exports/sda/brick1/testvol1
>>> gluster11:/exports/sda/brick1/testvol1
>>> gluster12:/exports/sda/brick1/testvol1
>>>
>>> And start the volume:
>>> gluster volume start testvol1
>>>
>>> Mounting the volume on an x86-64 system, it performs as expected.
>>>
>>> Mounting the same volume on an armhf system (such as one of the cluster
>>> members), I can create directories, but trying to create a file gives an
>>> error and the file system unmounts/crashes:
>>> root@gluster01:~# mount -t glusterfs gluster01:/testvol1 /mnt
>>> root@gluster01:~# cd /mnt
>>> root@gluster01:/mnt# ls
>>> root@gluster01:/mnt# mkdir test
>>> root@gluster01:/mnt# cd test
>>> root@gluster01:/mnt/test# cp /root/notes.txt ./
>>> cp: failed to close './notes.txt': Software caused connection abort
>>> root@gluster01:/mnt/test# ls
>>> ls: cannot open directory '.': Transport endpoint is not connected
>>>
>>> I get many of these in glusterfsd.log:
>>> The message "W [MSGID: 101088] [common-utils.c:4316:gf_backtrace_save]
>>> 0-management: Failed to save the backtrace." repeated 100 times between
>>> [2018-08-03 04:06:39.904166] and [2018-08-03 04:06:57.521895]
>>>
>>> Furthermore, if a cluster member ducks out (reboots, loses connection,
>>> etc.) and needs healing, the self-heal daemon logs messages similar to
>>> those above and cannot heal: there is no disk activity (verified via
>>> iotop) despite very high CPU usage, and the volume heal info command
>>> indicates the volume still needs healing.
>>>
>>> I tested all of the above in virtual environments using x86-64 VMs and
>>> could self-heal as expected.
>>>
>>> Again, this only happens when using disperse volumes. Should I be filing
>>> a bug report instead?
>>
>> --
>> Milind
>
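P.S. In case it helps with the backtrace: a minimal sketch, assuming the
debug symbols for the glusterfs packages are installed (on Ubuntu this is
usually a glusterfs-dbg or -dbgsym package, but the exact name depends on
how the PPA build was done):

# gdb /usr/sbin/glusterfs /core
(gdb) set pagination off
(gdb) thread apply all bt full

Attaching the full "thread apply all bt full" output to the bug report
should be enough to see where the fuse client process crashed.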
_______________________________________________
Gluster-users mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-users
