Hi all,
I would like to ask if, and with how much success, you are using
GlusterFS for virtual machine storage.
My plan: I want to set up a 2-node cluster, where VMs run on the nodes
themselves and can be live-migrated on demand.
I have some questions:
- do you use GlusterFS for a similar setup?
On 23-08-2017 18:14 Pavel Szalbot wrote:
Hi, after many VM crashes during upgrades of Gluster, losing network
connectivity on one node etc. I would advise running replica 2 with
arbiter.
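For readers new to the thread, the replica-2-plus-arbiter layout Pavel
mentions is created roughly like this; a minimal sketch, with hostnames
and brick paths invented for illustration:

```shell
# Sketch only: hostnames (node1, node2, arbiter1) and brick paths are
# hypothetical. "replica 3 arbiter 1" means two data bricks plus one
# metadata-only arbiter brick, which provides quorum without storing a
# third full copy of the data.
gluster volume create vmstore replica 3 arbiter 1 \
  node1:/bricks/vmstore/brick node2:/bricks/vmstore/brick \
  arbiter1:/bricks/vmstore/brick
gluster volume start vmstore
```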
Hi Pavel, this is bad news :(
So, in your case at least, Gluster was not stable? Something as simple
a
On 25-08-2017 08:32 Gionatan Danti wrote:
Hi all,
any other advice from those who use (or do not use) Gluster as a
replicated VM backend?
Thanks.
Sorry, I was not seeing messages because I was not subscribed to the
list; I read it from the web.
So it seems that Pavel and WK have vastly
On 25-08-2017 10:50 lemonni...@ulrar.net wrote:
Yes. Gluster has its own quorum; you can disable it, but that's just a
recipe for disaster.
Free from a lot of problems, but apparently not as good as a replica 3
volume. I can't comment on arbiter, I only have replica 3 clusters. I
can tell
On 23-08-2017 18:51 Gionatan Danti wrote:
On 23-08-2017 18:14 Pavel Szalbot wrote:
Hi, after many VM crashes during upgrades of Gluster, losing network
connectivity on one node etc. I would advise running replica 2 with
arbiter.
Hi Pavel, this is bad news :(
So, in your case at
On 25-08-2017 14:22 Lindsay Mathieson wrote:
On 25/08/2017 6:50 PM, lemonni...@ulrar.net wrote:
I run Replica 3 VM hosting (gfapi) via a 3 node proxmox cluster. Have
done a lot of rolling node updates, power failures etc, never had a
problem. Performance is better than any other DFS I've tr
On 25-08-2017 21:43 lemonni...@ulrar.net wrote:
I think you are talking about DRBD 8, which is indeed very easy. DRBD 9
on the other hand, which is the one that compares to gluster (more or
less), is a whole other story. Never managed to make it work correctly
either
Oh yes, absolutely DRB
On 25-08-2017 21:48 WK wrote:
On 8/25/2017 12:56 AM, Gionatan Danti wrote:
We ran Rep2 for years on 3.4. It does work if you are really, really
careful. But in a crash on one side, you might have lost some bits
that were in flight. The VM would then try to heal.
Without sharding, big
On 26-08-2017 01:13 WK wrote:
Big +1 on what Kevin just said. Just avoiding the problem is the
best strategy.
Ok, never run Gluster with anything less than a replica 2 + arbiter ;)
However, for the record, and if you really, really want to get deep
into the weeds on the subject, the
On 26-08-2017 07:38 Gionatan Danti wrote:
I'll surely give a look at the documentation. I have the "bad" habit
of not putting into production anything I know how to repair/cope
with.
Thanks.
Mmmm, this should read as:
"I have the "bad" habit of not putting into production anything I do
*not* know how to repair/cope with."
On 30-08-2017 03:57 Everton Brogliatto wrote:
Hi Gionatan,
I run Gluster 3.10.x (Replica 3 arbiter or 2 + 1 arbiter) to provide
storage for oVirt 4.x and I have had no major issues so far.
I have done online upgrades a couple of times, power losses,
maintenance, etc with no issues. Overal
On 30-08-2017 17:07 Ivan Rossi wrote:
There has been a bug associated with sharding that led to VM corruption
and that has been around for a long time (difficult to reproduce, as I
understood it). I have not seen reports on it for some time after the
last fix, so hopefully VM hosting is now stable.
Mm
On 31-08-2017 01:17 lemonni...@ulrar.net wrote:
Solved as of 3.7.12. The only bug left is when adding new bricks to
create a new replica set; not sure where we are now on that bug, but
that's not a common operation (well, at least for me).
Hi, same question here: is there any specific information
On 04-09-2017 19:27 Ivan Rossi wrote:
The latter one is the one I have been referring to. And it is pretty
dangerous IMHO.
Sorry, I cannot find the bug report/entry by myself.
Can you link to some more information, or explain which bug you are
referring to and how to trigger it?
Thanks.
On 09-09-2017 09:09 Pavel Szalbot wrote:
Sorry, I did not start the glusterfsd on the node I was shutting down
yesterday, and now killed another one during a FUSE test, so it had to
crash immediately (only one of three nodes was actually up). This
definitely happened for the first time (only one no
Hi list,
I have a replica 3 test cluster and I have a question about how clients
behave on a host shutdown.
If I suddenly switch off one of the gluster servers, the connected
clients see a ~42s stall in I/O: this is expected, as it is the default
client timeout.
However, it is possible to *
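For context, the ~42s figure comes from the network.ping-timeout volume
option; a sketch of inspecting and tuning it per volume (the volume name
below is hypothetical):

```shell
# Inspect and lower the client ping timeout (default 42 seconds).
# Setting it too low risks spurious disconnects under transient load,
# so test carefully before changing it in production.
gluster volume get testvol network.ping-timeout
gluster volume set testvol network.ping-timeout 10
```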
On 16/07/2019 15:27, Ravishankar N wrote:
Yes, if you simply pkill the gluster brick processes of the node before
switching it off, you won't observe the hang on the clients because they
will receive the disconnect notification immediately. But before that,
you would need to check if there are
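If I read Ravishankar's suggestion correctly, the graceful-shutdown
sequence would look something like the sketch below; this is my own
reading, not an official procedure:

```shell
# Hypothetical pre-shutdown sequence on the node being taken down.
# Killing the brick processes first sends clients an immediate TCP
# disconnect, so they fail over at once instead of stalling until
# network.ping-timeout expires.
pkill glusterfsd          # brick processes
systemctl stop glusterd   # management daemon
shutdown -h now
```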
Hi list,
I have a question about the recommended stable gluster version for
hosting virtual disk images.
From my understanding, current RHGS uses the latest 3.x gluster branch.
This is also the same version provided by default in RHEL/CentOS:
[root@localhost ~]# yum info glusterfs.x86_64
On 23/07/2019 13:21, Kaleb Keithley wrote:
That's true, but the glusterfs-3.12.x packages are still available on
the CentOS mirrors.
I'm not sure why the centos-release-gluster312 package has been removed
from the mirrors, but you can still get it from
https://cbs.centos.org/koji/packageinfo
Hi all,
I would like to better understand gluster performance and how to
profile/analyze it.
I set up two old test machines, each with a quad-core i7 CPU, 8 GB RAM
and 4x 5400 RPM disks in software RAID 10. OS is CentOS 8.1 and I am using
Gluster 6.7. To avoid being limited by the mechanical
On 20-01-2020 16:30 Gionatan Danti wrote:
So, I have some questions:
- why is performance so low with fsync?
- why do I have such low IOPS (20-30) for minutes?
- what is capping the non-fsync test?
- why are both glusterd and glusterfsd so CPU intensive? I can
understand glusterfsd itself
On 21-01-2020 11:40 Yaniv Kaul wrote:
How did you fix this?
How did you spot this?
I used iperf3 between the two hosts. It shows that, although bandwidth
was near the 1 Gbps limit, there were frequent retransmissions. "netstat
-s | grep retran" confirmed that retransmissions happened durin
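For anyone wanting to reproduce the check, these are the kinds of
commands involved (the peer hostname is hypothetical; run `iperf3 -s` on
the other host first):

```shell
# Bandwidth test: watch the "Retr" column for TCP retransmissions.
iperf3 -c storage-peer -t 10
# Cumulative TCP retransmission counters on the local host.
netstat -s | grep -i retrans
```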
On 21-01-2020 18:40 Jeff Brown wrote:
Looking at your setup, you have two disk drives satisfying your
random read and write requests. You are mirroring for redundancy and
striping across the mirrors, so your read transactions will logically
be satisfied by two disks. This class of device
On 22-01-2020 14:02 Barak Sason Rofman wrote:
Hello Gionatan,
Some time ago we looked into the effect of gluster logging on
performance.
If it's possible for you, you could try doing a performance test by
disabling logging.
In order to fully disable logging, some code needs to be mo
On 2020-02-17 03:59 Markus Kern wrote:
Greetings!
I am currently evaluating our options to replace our old mixture of
IBM SAN storage boxes. This will be a strategic decision for the next
years.
One of the solutions I am reviewing is a GlusterFS installation.
Planned usage:
- Central NFS s
On 2020-03-18 18:41 Rene Bon Ciric wrote:
[ ID] Interval       Transfer     Bandwidth       Retr  Cwnd
[  4] 0.00-1.00 sec  1.02 GBytes  8.77 Gbits/sec  1459  3.15 MBytes
[  4] 1.00-2.00 sec  1022 MBytes  8.58 Gbits/sec  2284  3.15 MBytes
[  4] 2.00-3.00 sec  1005 MBytes
On 2020-03-21 21:02 Strahil Nikolov wrote:
WARNING: DO NOT DISABLE SHARDING!!!
EVER!
Sorry to hijack, but I am genuinely curious: why should sharding not be
disabled? What happens if/when sharding is disabled?
Thanks.
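For what it's worth, my understanding (happy to be corrected) is that
the danger lies in toggling the option on a volume that already holds
data; a sketch of checking it (the volume name is hypothetical):

```shell
# Check whether sharding is enabled on a volume.
gluster volume get vmstore features.shard
# As I understand it: if sharding is disabled on a volume that already
# contains sharded files, clients only see the base shard of each file,
# so large files suddenly appear truncated. Never toggle it on live data.
```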
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.i
On 2020-05-19 08:35 Susant Palai wrote:
Generally, these logs indicate that there were directories missing
("hole = 1") on one of the bricks. These are harmless messages and you
can ignore them.
I would guess that a message warning about missing directories on a
brick is not to be ignored -
On 2020-05-19 13:07 Susant Palai wrote:
This can happen when a server goes down (reboot, crash, network
partition) during a fop execution. Once the brick is back up, dht will
heal the entry so that the operation goes smoothly.
If there is a resultant error, it should have been logged in the
clie
DISCLAIMER: I *really* appreciate this project and I thank all the
people involved.
On 2020-06-19 21:33 Mahdi Adnan wrote:
The strength of Gluster, in my opinion, is the simplicity of creating
distributed volumes that can be consumed by different clients, and
this is why we chose Gluster back
On 2020-06-21 01:26 Strahil Nikolov wrote:
The efforts are far less than reconstructing the disk of a VM from
CEPH. In gluster, just run a find on the brick searching for the
name of the VM disk and you will find the VM_IMAGE.xyz (where xyz is
just a number) and then concatenate the li
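The concatenation idea can be demonstrated locally: a sharded file is
just a base piece plus fixed-size chunks in order, so `cat` in shard
order rebuilds it. A self-contained simulation (real shards live under
the brick's .shard directory and are named by GFID, which this sketch
does not reproduce):

```shell
# Simulate shard reconstruction: split a file into fixed-size chunks,
# then rebuild it by concatenating the chunks in order.
set -e
workdir=$(mktemp -d)
head -c 200000 /dev/urandom > "$workdir/vm-disk.img"        # the "VM image"
split -b 65536 -d "$workdir/vm-disk.img" "$workdir/shard."  # shard.00, shard.01, ...
cat "$workdir"/shard.* > "$workdir/recovered.img"           # glob sorts in shard order
cmp "$workdir/vm-disk.img" "$workdir/recovered.img" && echo "images match"
```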
On 2020-06-21 14:20 Strahil Nikolov wrote:
With every community project, you are in the position of a Beta
Tester - no matter Fedora, Gluster or CEPH. So far, I had
issues with upstream projects only during and immediately after
patching - but this is properly mitigated with a
On 2020-06-21 20:41 Mahdi Adnan wrote:
Hello Gionatan,
Using a Gluster brick in a RAID configuration might be safer and
require less work from Gluster admins, but it is a waste of disk
space.
Gluster bricks are replicated (assuming you're creating a
distributed-replicated volume), so when a brick
On 2020-06-22 06:58 Hu Bert wrote:
On Sun, 21 June 2020 at 19:43, Gionatan Danti wrote:
For the RAID6/10 setup, I found no issues: simply replace the broken
disk without involving Gluster at all. However, this also means facing
the "iops wall" I described earlier for si
On 2020-07-21 14:45 Stefano Danzi wrote:
Hi!
I have a strange request. I don't know if some gluster settings could
help me.
There are two buildings linked using a wifi bridge.
The main building hosts the data centre. In the other building there
is an office
that needs to use a file serve
On 2020-07-30 15:08 Gilberto Nunes wrote:
I meant: if you power off the server, pull out 1 disk, and then power
on, we get system errors
Hi, you are probably hitting some variant of this bug, rather than
seeing LVM crash: https://bugzilla.redhat.com/show_bug.cgi?id=1701504
If not, pl
On 2020-09-04 01:00 Computerisms Corporation wrote:
For the sake of completeness, I am reporting back that your suspicions
seem to have been validated. I talked to the data center and they made
some changes. We talked again some days later, and they made some
more changes, and for several days
On 2020-09-09 15:30 Miguel Mascarenhas Filipe wrote:
I'm setting up GlusterFS on 2 hosts with the same configuration, 8 HDDs
each. This deployment will grow later on.
Hi, I really suggest avoiding a replica 2 cluster unless it is for
testing only. Be sure to add an arbiter at least (using a replica 2
On 2020-09-10 23:13 Miguel Mascarenhas Filipe wrote:
can you explain better how a single disk failing would bring a whole
node out of service?
Oh, I did a bad cut/paste. A single disk failure will not put the entire
node out-of-service. The main point was the potentially long heal time
(
On 2020-09-11 01:03 Computerisms Corporation wrote:
Hi Danti,
the notes are not very verbose, but it looks like the following lines
were removed from their virtualization config:
They also enabled hyperthreading, so having 12 "cores" instead of 6
now. Guessing that had a lot to do
On 2020-09-11 05:27 Martin Bähr wrote:
Excerpts from Gionatan Danti's message of 2020-09-11 00:35:52 +0200:
The main point was the potentially long heal time
could you (or anyone else) please elaborate on what long heal times are
to be expected?
Hi, there are multiple factors at work he
On 2020-10-13 21:16 Strahil Nikolov wrote:
At least it is a good starting point.
This can also be an interesting read:
https://docs.openshift.com/container-platform/3.11/scaling_performance/optimizing_on_glusterfs_storage.html
On 2020-11-26 06:14 Dmitry Antipov wrote:
In my test setup, all bricks and the client workload (fio) are running
on the same host. So
all network traffic should be routed through the loopback interface,
which is CPU-bound.
Since the server is 32-core and has plenty of RAM, loopback should be
f
On 2020-11-26 09:47 Dmitry Antipov wrote:
On 11/26/20 11:29 AM, Gionatan Danti wrote:
Can you detail your exact client and server CPU models?
Desktop is 8x of:
model name : Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
Server is 32x of:
model name : Intel(R) Xeon(R) Silver 4110
On 2020-11-27 06:53 Dmitry Antipov wrote:
Thanks, it seems you're right. Running local replica 3 volume on 3x1Gb
ramdisks, I'm seeing:
top - 08:44:35 up 1 day, 11:51, 1 user, load average: 2.34, 1.94, 1.00
Tasks: 237 total, 2 running, 235 sleeping, 0 stopped, 0 zombie
%Cpu(s): 38.
On 2020-11-27 09:40 Amar Tumballi wrote:
Let's take a longer look into performance:
Amar, Xavi, thanks for your input - very appreciated.
However, I found that when facing sync writes (ie: fsync) gluster
performance is very low - too low for mere kernel/syscall overhead.
For more info
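To make the fsync effect measurable, this is the kind of micro-test I
mean; run it once on a local filesystem and once on a gluster mount and
compare the speeds each dd reports (paths are hypothetical):

```shell
# Compare per-write-synced vs buffered 4k writes. On a gluster mount the
# oflag=sync case is where the low IOPS show up, because every write must
# be acknowledged by all replicas before the next one starts.
testfile=$(mktemp)
dd if=/dev/zero of="$testfile" bs=4k count=256 oflag=sync 2>&1 | tail -n 1
dd if=/dev/zero of="$testfile" bs=4k count=256 2>&1 | tail -n 1
```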
On 2020-12-27 03:00 Zenon Panoussis wrote:
The goal: a resilient and geographically distributed mailstore. A
mail server is a very dynamic thing, with files being written, moved
and deleted all the time. You can put the mailstore on a SAN and
access it from multiple SMTP and IMAP servers, bu
On 2020-12-27 21:58 Zenon Panoussis wrote:
For such a project, I would simply configure the SMTP server to do
protocol-specific replication and use a low-TTL DNS name to publish
the IMAP/Web frontends.
Either you know something about mail servers that I would love
to know myself, or else t
On 2021-01-03 04:48 Zenon Panoussis wrote:
Any ideas where I should look for the bottleneck? I can't find
anything even remotely relevant in any of the logs.
As already stated by Strahil, it is the latency that is killing your
setup. Just as I warned you before: Gluster sync replication is n
On 2021-03-19 16:03 Erik Jacobson wrote:
A while back I was asked to make a blog or something similar to discuss
the use cases of the team I work on (HPCM cluster management) at HPE.
If you are not interested in reading about what I'm up to, just delete
this and move on.
I really don't have a
On 2021-07-14 08:03 Pranith Kumar Karampuri wrote:
Hi,
I am researching the kind of hardware that would be best for an
archival use case. We probably need to keep the data anywhere between
20-40 years. Do let us know what you think would be best.
I think nobody can recommend anything on
On 2021-07-19 06:44 Pranith Kumar Karampuri wrote:
On Sat, Jul 17, 2021 at 3:56 PM Strahil Nikolov
wrote:
I can add that you should use bitrot detection if you plan to keep
data in glusterfs for a longer period.
My company uses zfs, so it probably would be redundant. zfs does
something similar I
On 2021-07-19 13:18 Pranith Kumar Karampuri wrote:
One option is we will place a zfs brick with tapes as hard drives in
different data centers with glusterfs replicating between them.
For this storage, latency is not that important. But we need the data
to be safe.
For such a setup, I woul
On 2021-07-28 09:20 Yaniv Kaul wrote:
In real life, the 'best' node is the one with the highest overall free
resources, across CPU, network and disk IO. So it could change and it
might change all the time.
Network, disk saturation might be common, disk performing garbage
collection, CPU bein
On 2021-07-28 13:11 Strahil Nikolov wrote:
I think you mean cluster.choose-local, which is enabled by default.
Yet, Gluster will check if the local copy is healthy.
Ah, ok: from reading here [1] I was under the impression that
cluster.choose-local was somewhat deprecated.
Good to know tha
On 2021-08-03 19:51 Strahil Nikolov wrote:
The difference between a thin and a usual arbiter is that the thin
arbiter takes action only when it's needed (one of the data bricks is
down), so the thin arbiter's latency won't affect you as long as both
data bricks are running.
Keep in mind th
On 2021-08-05 06:00 Strahil Nikolov wrote:
I'm not so sure. Imagine that the local copy needs healing (outdated).
Then gluster will check if the other node's copy is blaming the local
one and, if it's "GREEN", it will read locally. This check to the other
servers is the slowest part due to the latte
On 2022-01-23 06:37 Sam wrote:
Hello Everyone,
I am just starting up with Gluster so pardon my ignorance if I am
doing something incorrectly. In order to test the efficiency of
GlusterFS, I wanted to compare its performance with the native file
system on which it resides and thus I kept bot
On 2022-04-03 16:31 Strahil Nikolov wrote:
It's not like that, but most of the active developers are from RH and,
since RHGS is being EOL-ed, they have other priorities.
Is RHGS going to be totally replaced by RHCS?
Thanks.
On 2022-09-16 18:41 dpglus...@posteo.de wrote:
I have made extensive load tests in the last few days and figured out
that it's definitely a network-related issue. I changed from jumbo
frames (MTU 9000) to the default MTU of 1500. With an MTU of 1500 the
problem doesn't occur. I'm able to bump the io-w
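A quick way to verify whether a jumbo-frame path is actually clean end
to end (the peer name and interface below are hypothetical):

```shell
# 8972 = 9000 minus 28 bytes of IP+ICMP headers; -M do forbids
# fragmentation. If this fails while the interface MTU is 9000, something
# in the path is dropping jumbo frames - consistent with the symptoms above.
ping -M do -s 8972 -c 3 storage-peer
ip link show dev eth0 | grep -o 'mtu [0-9]*'
```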
Hi All,
I have a few questions about GlusterFS used in active/active WAN-based
replication scenarios.
Let me first start with a little ASCII chart:
HQ Linux w/SMB share -> low speed link -> Remote Linux w/SMB share ->
WIN7 clients
In short I had to replicate a local server share on a remote Li
On 02/17/2014 11:18 AM, Vijay Bellur wrote:
write-behind can help with write operations but the lookup preceding the
write is sent out to all bricks today and hence that affects overall
performance.
Ok, this is in line with my tests and the auto-generated configuration files
Not as of today.