Re: [Gluster-devel] Jenkins Issues this weekend and how we're solving them
On Mon, Feb 19, 2018 at 8:53 AM, Nigel Babu wrote:
> Hello,
>
> As you all most likely know, we store the tarball of the binaries and core
> if there's a core during regression. Occasionally, we've introduced a bug
> in Gluster and this tar can take up a lot of space. This has happened
> recently with brick multiplex tests. The build-install tar takes up 25G,
> causing the machine to run out of space and continuously fail.
>

AFAIK, we don't have a .t file in the upstream regression suites where hundreds of volumes are created. At that scale, with brick multiplexing enabled, I can understand that the core would be heavily loaded and could consume that much space. FWIW, can we first figure out which test caused this crash and see whether running a gcore after certain steps in the tests leaves us with a core file of a similar size? IOW, have we actually seen core files this large before? If not, what changed so that we have started seeing this is something to be investigated.

> I've made some changes this morning. Right after we create the tarball,
> we'll delete all files in /archive that are greater than 1G. Please be
> aware that this means all large files including the newly created tarball
> will be deleted. You will have to work with the traceback on the Jenkins
> job.
>

We'd really need to first investigate the average size of the core file we get when a system is running with brick multiplexing and ongoing I/O. Without that, immediately deleting core files > 1G will cause trouble for developers debugging genuine crashes, as the traceback alone may not be sufficient.

> --
> nigelb
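A minimal sketch of that kind of measurement, assuming gdb's gcore is available on the regression machine and the glusterfsd PID is passed in by hand (the script, output paths, and helper name are illustrative, not part of the test suite):

#!/usr/bin/env python3
# Hypothetical helper (not part of the regression suite): take a core dump of
# a running glusterfsd process with gdb's gcore and report its size, to
# estimate how large cores get with brick multiplexing enabled.
import os
import subprocess
import sys


def dump_and_measure(pid, prefix="/tmp/glusterfsd-core"):
    # gcore writes the dump to "<prefix>.<pid>"
    subprocess.run(["gcore", "-o", prefix, str(pid)], check=True)
    core_path = "%s.%d" % (prefix, pid)
    size_gib = os.path.getsize(core_path) / float(1024 ** 3)
    print("core for pid %d: %.2f GiB (%s)" % (pid, size_gib, core_path))
    return size_gib


if __name__ == "__main__":
    dump_and_measure(int(sys.argv[1]))

Running this right after the suspect steps of the .t file would give a rough idea of whether 25G cores are expected under brick multiplexing or a sign of something new.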
[Gluster-devel] Jenkins Issues this weekend and how we're solving them
Hello,

As you all most likely know, we store the tarball of the binaries and core if there's a core during regression. Occasionally, we've introduced a bug in Gluster and this tar can take up a lot of space. This has happened recently with brick multiplex tests. The build-install tar takes up 25G, causing the machine to run out of space and continuously fail.

I've made some changes this morning. Right after we create the tarball, we'll delete all files in /archive that are greater than 1G. Please be aware that this means all large files, including the newly created tarball, will be deleted. You will have to work with the traceback on the Jenkins job.

--
nigelb
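For illustration only, a minimal sketch of such a cleanup step, assuming a plain directory walk over /archive with the 1G threshold mentioned above (this is not the actual Jenkins job configuration):

#!/usr/bin/env python3
# Sketch of the cleanup described above (assumption: not the real Jenkins
# job). Walk /archive and delete every file larger than 1 GiB, including the
# freshly created tarball if it crosses the limit.
import os

ARCHIVE_DIR = "/archive"     # archive location mentioned in the mail
LIMIT_BYTES = 1024 ** 3      # 1G threshold mentioned in the mail

for root, _dirs, files in os.walk(ARCHIVE_DIR):
    for name in files:
        path = os.path.join(root, name)
        try:
            size = os.path.getsize(path)
            if size > LIMIT_BYTES:
                print("removing %s (%d bytes)" % (path, size))
                os.remove(path)
        except OSError:
            # the file may have disappeared under us; ignore and continue
            pass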
[Gluster-devel] Weekly Untriaged Bugs
[...truncated 6 lines...]
https://bugzilla.redhat.com/1536908 / build: gluster-block build as "fatal error: api/glfs.h: No such file or directory"
https://bugzilla.redhat.com/1541261 / build: "glustereventsd-SuSE.in" is missing in extras/init.d
https://bugzilla.redhat.com/1544851 / build: Redefinitions of IXDR_GET_LONG and IXDR_PUT_LONG when libtirpc is used
https://bugzilla.redhat.com/1545048 / core: [brick-mux] process termination race while killing glusterfsd on last brick detach
https://bugzilla.redhat.com/1545142 / core: GlusterFS - Memory Leak during adding directories
https://bugzilla.redhat.com/1544090 / core: possible memleak in glusterfsd process with brick multiplexing on
https://bugzilla.redhat.com/1540882 / disperse: Do lock conflict check correctly for wait-list
https://bugzilla.redhat.com/1543585 / fuse: Client Memory Usage Drastically Increased from 3.12 to 3.13 for Replicate 3 Volumes
https://bugzilla.redhat.com/1537602 / geo-replication: Georeplication tests intermittently fail
https://bugzilla.redhat.com/1539657 / geo-replication: Georeplication tests intermittently fail
https://bugzilla.redhat.com/1542979 / geo-replication: glibc fix for CVE-2018-101 breaks geo-replication
https://bugzilla.redhat.com/1544461 / glusterd: 3.8 -> 3.10 rolling upgrade fails (same for 3.12 or 3.13) on Ubuntu 14
https://bugzilla.redhat.com/1544638 / glusterd: 3.8 -> 3.10 rolling upgrade fails (same for 3.12 or 3.13) on Ubuntu 14
https://bugzilla.redhat.com/1540249 / glusterd: Gluster is trying to use a port outside documentation and firewalld's glusterfs.xml
https://bugzilla.redhat.com/1540868 / glusterd: Volume start commit failed, Commit failed for operation Start on local node
https://bugzilla.redhat.com/1536952 / project-infrastructure: build: add libcurl package to regression machines
https://bugzilla.redhat.com/1545003 / project-infrastructure: Create a new list: automated-test...@gluster.org
https://bugzilla.redhat.com/1544378 / project-infrastructure: mailman list moderation redirects from https to http
https://bugzilla.redhat.com/1546040 / project-infrastructure: Need centos machine to validate all test cases while brick mux is on
https://bugzilla.redhat.com/1545891 / project-infrastructure: Provide a automated way to update bugzilla status with patch merge.
https://bugzilla.redhat.com/1538900 / protocol: Found a missing unref in rpc_clnt_reconnect
https://bugzilla.redhat.com/1540478 / quota: Change quota option of many volumes concurrently, some commit operation failed.
https://bugzilla.redhat.com/1539680 / rdma: RDMA transport bricks crash
https://bugzilla.redhat.com/1544961 / rpc: libgfrpc does not export IPv6 RPC methods even with --with-ipv6-default
https://bugzilla.redhat.com/1546295 / rpc: Official packages don't default to IPv6
https://bugzilla.redhat.com/1538978 / rpc: rpcsvc_request_handler thread should be made multithreaded
https://bugzilla.redhat.com/1542934 / rpc: Seeing timer errors in the rebalance logs
https://bugzilla.redhat.com/1542072 / scripts: Syntactical errors in hook scripts for managing SELinux context on bricks #2 (S10selinux-label-brick.sh + S10selinux-del-fcontext.sh)
https://bugzilla.redhat.com/1540759 / tiering: Failure to demote tiered volume file that is continuously modified by client during hot tier detachment.
https://bugzilla.redhat.com/1540376 / tiering: Tiered volume performance degrades badly after a volume stop/start or system restart.
[...truncated 2 lines...]