Yes, the bug has been fixed upstream, and the backports to release-7 and
release-8 of gluster are pending merge. The fix should be available in the next
.x release of gluster-7 and gluster-8. Until then, as Nir suggested, please
turn off performance.stat-prefetch on your volumes.
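For anyone following along, turning the option off is a single volume-set command (a sketch; `myvol` stands in for your actual volume name):

```shell
# Work around the stat-prefetch bug until the patched gluster-7/8 release lands.
# "myvol" is a placeholder; substitute your real volume name.
gluster volume set myvol performance.stat-prefetch off
```

Once the fixed .x release is installed, the same command with `on` restores the default.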
On Tue, Apr 7, 2020 at 7:36 PM Gianluca Cecchi
> OK. So I set the log at least at INFO level on all subsystems and tried a
> redeploy of OpenShift with 3 master nodes and 7 worker nodes.
> One worker got the error and VM in paused mode
> Apr 7, 2020, 3:27:28 PM VM worker-6 has been paused
Agreed. Please share the bug report when you're done filing it. In
addition to the logs Nir requested, include the gluster version and the
`gluster volume info` output in your report.
We'll take the discussion forward on the bz.
On Wed, Mar 25, 2020 at 11:39 PM Nir Soffer wrote:
Sorry about the late response.
I looked at the logs. These errors are originating from posix-acl
*[2019-11-17 07:55:47.090065] E [MSGID: 115050]
[server-rpc-fops_v2.c:158:server4_lookup_cbk] 0-data_fast-server: 162496:
On Sat, Nov 23, 2019 at 3:14 AM Nir Soffer wrote:
> On Fri, Nov 22, 2019 at 10:41 PM Strahil Nikolov
>> On Thu, Nov 21, 2019 at 8:20 AM Sahina Bose wrote:
>> On Thu, Nov 21, 2019 at 6:03 AM Strahil Nikolov
>> Hi All,
>> another clue in the logs :
The idea was explored sometime back here -
But there were some issues that were identified with the approach, so it
had to be dropped.
Thanks for the detailed explanation.
> Best Regards,
> Strahil Nikolov
> On May 21, 2019 08:36, Kr
So in our internal tests (with NVMe SSD drives and a 10G network), we found
read performance to be better with choose-local
disabled in a hyperconverged setup. See
https://bugzilla.redhat.com/show_bug.cgi?id=1566386 for more information.
With choose-local off, the read replica is chosen randomly (based on
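For reference, the setting being described can be toggled like this (a sketch; `myvol` is a placeholder volume name):

```shell
# With choose-local off, reads are served by a randomly chosen replica
# instead of always preferring the local brick.
gluster volume set myvol cluster.choose-local off
```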
Adding back gluster-users
Comments inline ...
On Fri, Mar 29, 2019 at 8:11 PM Olaf Buitelaar
> Dear Krutika,
> 1. I’ve made 2 profile runs of around 10 minutes (see files
> profile_data.txt and profile_data2.txt). Looking at it, most time seems to be
> spent at the fop’s fsync and
Questions/comments inline ...
On Thu, Mar 28, 2019 at 10:18 PM wrote:
> Dear All,
> I wanted to share my experience upgrading from 4.2.8 to 4.3.1. While
> previous upgrades from 4.1 to 4.2 etc. went rather smooth, this one was a
> different experience. After first trying a test upgrade on a 3
On Thu, Mar 28, 2019 at 2:28 PM Krutika Dhananjay
> Gluster 5.x does have two important performance-related fixes that are not
> part of 3.12.x -
> i. in shard-replicate interaction -
Sorry, wrong bug-id. This
What's the situation now with 5.5?
> Best Regards,
> Strahil Nikolov
> On Mar 28, 2019 08:56, Krutika Dhananjay wrote:
> Right. So Gluster stores what are called "indices" for each modified file
> (or shard)
> under a special hidden directory of the "good" br
r node know which shards were modified after it went
> Do the other Gluster nodes keep track of it?
> Indivar Nair
> On Thu, Mar 28, 2019 at 9:45 AM Krutika Dhananjay
>> Each shard is a separate file of size equal to va
will still have to compare each shard to determine whether
> there are any changes that need to be replicated.
> > Am I right?
> +Krutika Dhananjay
> > Regards,
> > Indivar Nair
> > On Wed, Mar 27, 2019 at 4:34
speeds really went down - performing fio tests inside the VM.
> On Wed, Mar 27, 2019, 07:03 Krutika Dhananjay wrote:
>> Could you enable strict-o-direct and disable remote-dio on the src volume
>> as well, restart the vms on "old" and retry migration?
>> # gluster v
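The command above is cut off; the volume-set operations being suggested are presumably along these lines (a sketch, with `srcvol` as a placeholder for the source volume):

```shell
# Honor O_DIRECT end to end on the source volume, then restart the VMs
# on the "old" hosts and retry the migration.
gluster volume set srcvol performance.strict-o-direct on
gluster volume set srcvol network.remote-dio disable
```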
> On 26-03-19 14:23, Sahina Bose wrote:
> > +Krutika Dhananjay and gluster ml
> > On Tue, Mar 26, 2019 at 6:16 PM Sander Hoentjen
> >> Hello,
> >> tl;dr We have disk corruption when doing live storage migration on oVirt
> cluster.granular-entry-heal: enable
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
> On Thu, Mar 7, 2019 at 1:00 AM Krutika Dhananjay
>> So from the profile, it appears th
implementation. This was fixed at
I need the two things I asked for in the previous mail to confirm whether
you're hitting the same issue.
On Thu, Mar 7, 2019 at 12:24 PM Krutika Dhananjay
> Could you share the followi
Could you share the following pieces of information to begin with -
1. output of `gluster volume info $AFFECTED_VOLUME_NAME`
2. glusterfs version you're running
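Both items can be collected in one short session (the volume name is a placeholder):

```shell
# 1. Volume configuration of the affected volume
gluster volume info myvol

# 2. glusterfs version in use on the client
glusterfs --version
```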
On Sat, Mar 2, 2019 at 3:38 AM Drew R wrote:
> Saw some people asking for profile info. So I had started a migration
On Fri, Feb 15, 2019 at 12:30 AM Jayme wrote:
> Running an oVirt 4.3 HCI 3-way replica cluster with SSD backed storage.
> I've noticed that my SSD writes (SMART Total_LBAs_Written) are quite high
> on one particular drive. Specifically I've noticed one volume is much much
> higher total bytes
Gluster's write-behind translator by default buffers writes for flushing to
disk later, *even* when the file is opened with the O_DIRECT flag. Not honoring
O_DIRECT means a reader on another client may READ stale data from the bricks,
because some WRITEs may not yet have been flushed to disk.
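When O_DIRECT semantics must hold end to end, the options usually involved are these (a hedged sketch; option names as in recent gluster releases, `myvol` is a placeholder):

```shell
# Make write-behind honor O_DIRECT instead of buffering such writes
gluster volume set myvol performance.strict-o-direct on
# Alternatively, disable the write-behind translator entirely
gluster volume set myvol performance.write-behind off
```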
Adding Ravi, who works on the replicate component, to help resolve the mismatches.
On Mon, Jul 2, 2018 at 12:27 PM, Krutika Dhananjay
> Sorry, I was out sick on Friday. I am looking into the logs. Will get back
> to you in some time.
Could you share the gluster mount and brick logs? You'll find them under
Also, what's the version of gluster you're using?
Also, output of `gluster volume info`?
On Thu, Jun 21, 2018 at 9:50 AM, Sahina Bose wrote:
> On Wed, Jun 20, 2018 at 11:33 PM, Hanson
Adding Ravi to look into the heal issue.
As for the fsync hang and subsequent IO errors, it seems a lot like
https://bugzilla.redhat.com/show_bug.cgi?id=1497156 and Paolo Bonzini from
qemu had pointed out that this would be fixed by the following commit:
No, you don't need to do any of that. Just executing volume-set commands is
sufficient for the changes to take effect.
On Wed, Jun 21, 2017 at 3:48 PM, Chris Boot <bo...@bootc.net> wrote:
> [replying to lists this time]
> On 20/06/17 11:23, Krutika Dhananjay wr
No. It's just that in the internal testing that was done here, increasing
the thread count beyond 4 did not improve the performance any further.
On Tue, Jun 20, 2017 at 11:30 PM, mabi wrote:
> Dear Krutika,
> Sorry for asking so naively but can you tell me on
Couple of things:
1. Like Darrell suggested, you should enable stat-prefetch and increase
client and server event threads to 4 (`<volname>` is your volume's name):
# gluster volume set <volname> performance.stat-prefetch on
# gluster volume set <volname> client.event-threads 4
# gluster volume set <volname> server.event-threads 4
2. Also glusterfs-3.10.1
I stand corrected.
Just realised the strace command I gave was wrong.
Here's what you would actually need to execute:
strace -y -ff -o
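A full invocation in that style might look like the following; the output prefix and the pid lookup are illustrative assumptions, not part of the original mail:

```shell
# -y  decode file descriptors into the paths they point at
# -ff follow forks, writing one trace file per process
# -o  output file prefix (strace appends .PID per process)
strace -y -ff -o /tmp/gluster-trace -p "$(pidof glusterfs | awk '{print $1}')"
```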
On Tue, Jun 6, 2017 at 3:20 PM, Krutika Dhananjay <kdhan...@redhat.com>
> So for the 'Transport endpoint is not c
The offset (127488) in the log does not seem aligned at 4K.
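The alignment claim is easy to check arithmetically (a small illustrative snippet, not from the original thread):

```python
# O_DIRECT typically requires offsets and lengths aligned to the device's
# logical block size; 4K-sector drives need 4096-byte alignment.
def is_aligned(offset: int, block_size: int = 4096) -> bool:
    return offset % block_size == 0

offset = 127488                  # offset seen in the log
print(offset % 4096)             # 512 -> not 4K-aligned
print(is_aligned(offset))        # False on a 4K-sector drive
print(is_aligned(offset, 512))   # True: fine for 512-byte sectors
```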
>> On Mon, Jun 5, 2017 at 2:47 PM, Abi Askushi <rightkickt...@gmail.com>
>>> Hi Krutika,
>>> I am saying that I am facing this issue with 4k dri
This seems like a case of O_DIRECT reads and writes gone wrong, judging by
the 'Invalid argument' errors.
The two operations that have failed on gluster bricks are:
[2017-06-05 09:40:39.428979] E [MSGID: 113072] [posix.c:3453:posix_writev]
0-engine-posix: write failed: offset 0, [Invalid
> 0-gv2-shard: Lookup on shard 173 failed. Base file gfid =
> 55b94942-dee5-4f69-8b0f-52e251ac6f5e [No data available]
> *From: *"Sahina Bose" <sab...@redhat.com>
Could you please share your volume info output?
On Fri, Mar 10, 2017 at 6:41 PM, p...@email.cz wrote:
> freeze / freezing
> IO operations are paused for some reason
> possible causes are
> 1) net - any TCP framework collapse
> 2) gluster interconnect due
any vms at any point before or after the upgrade?
On Mon, Jul 25, 2016 at 11:30 PM, David Gossage <dgoss...@carouselchecks.com
> On Mon, Jul 25, 2016 at 9:58 AM, Krutika Dhananjay <kdhan...@redhat.com>
>> OK, could you try the followin
On Mon, Jul 25, 2016 at 4:57 PM, Samuli Heinonen <samp...@neutraali.net>
> > On 25 Jul 2016, at 12:34, David Gossage <dgoss...@carouselchecks.com>
> > On Mon, Jul 25, 2016 at 1:01 AM, Krutika Dhananja
Could you also share the brick logs from the affected volume? They're
Also, could you share the volume configuration (output of `gluster volume
info`) for the affected volume(s) AND at the time you actually saw
After glusterfs 3.7.11, around 4-5 bugs were found in sharding and
replicate modules and fixed, some of them causing the VM(s) to pause. Could
you share the glusterfs client logs from around the time the issue was
seen? This will help me confirm it's the same issue, or even debug further