Thank you.
Sincerely,
Artem
--
Founder, Android Police <http://www.androidpolice.com>, APK Mirror
<http://www.apkmirror.com/>, Illogical Robot LLC
beerpla.net <http://beerpla.net/> | +ArtemRussakovskii
<https://plus.google.com/+ArtemRussakovskii> | @ArtemR
<http://twitter.com/ArtemR>
On Tue, Apr 10, 2018 at 9:56 AM, Artem Russakovskii <[email protected]> wrote:
Hi Vlad,
I actually saw that post already and even asked a question 4 days ago
(https://serverfault.com/questions/517775/glusterfs-direct-i-o-mode#comment1172497_540917).
The accepted answer also seems to go against your suggestion to
enable direct-io-mode as it says it should be disabled for better
performance when used just for file accesses.
It'd be great if someone from the Gluster team chimed in about
this thread.
Sincerely,
Artem
On Tue, Apr 10, 2018 at 7:01 AM, Vlad Kopylov <[email protected]> wrote:
I wish I knew, or were able to get, a detailed description of those
options myself.
Here is one for direct-io-mode:
https://serverfault.com/questions/517775/glusterfs-direct-i-o-mode
Same as you, I ran tests on a large volume of files and found
that the main delays are in attribute calls, which is how I ended
up with those mount options to improve performance.
I discovered those options basically by googling this users
list, where people share their tests.
I'm not sure I share your optimism: rather than upgrading,
I downgraded to 3.12 and have no dir view issue now, though I
had to recreate the cluster and re-add bricks with
existing data.
On Tue, Apr 10, 2018 at 1:47 AM, Artem Russakovskii <[email protected]> wrote:
Hi Vlad,
I'm using only localhost: mounts.
Can you please explain what effect each option has on
performance issues shown in my posts?
"negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5"
From what I remember, direct-io-mode=enable didn't make a
difference in my tests, but I suppose I can try again. The
explanations about direct-io-mode are quite confusing on
the web in various guides, saying enabling it could make
performance worse in some situations and better in others
due to OS file cache.
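
For reference, here is a minimal sketch of how the mount options quoted
above could be combined with the existing /etc/fstab entry for the
volume. It only prints the line and does not touch /etc/fstab. The
helper name fstab_line is hypothetical, and whether each option
actually helps (direct-io-mode in particular) is exactly what remains
unsettled in this thread.

```shell
#!/bin/sh
# Hypothetical helper: prints a glusterfs fstab entry for a localhost mount.
# Arguments: volume name, mount point, comma-separated mount options.
fstab_line() {
  printf 'localhost:/%s %s glusterfs %s 0 0\n' "$1" "$2" "$3"
}

# Volume and mount point are taken from the thread; the option string is the
# one quoted above, appended to the original "defaults,_netdev".
opts="defaults,_netdev,negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5"
fstab_line "apkmirror_data1" "/mnt/apkmirror_data1" "$opts"
```

The printed line would replace the existing glusterfs entry, followed by
an unmount and remount of the volume so the new options take effect.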
There are also these gluster volume settings, adding to
the confusion:
Option: performance.strict-o-direct
Default Value: off
Description: This option when set to off, ignores the
O_DIRECT flag.
Option: performance.nfs.strict-o-direct
Default Value: off
Description: This option when set to off, ignores the
O_DIRECT flag.
Re: 4.0. I moved to 4.0 after finding out that it fixes
the disappearing dirs bug related to
cluster.readdir-optimize if you remember
(http://lists.gluster.org/pipermail/gluster-users/2018-April/033830.html).
I was already on 3.13 by then, and 4.0 resolved the issue.
It's been stable for me so far, thankfully.
Sincerely,
Artem
On Mon, Apr 9, 2018 at 10:38 PM, Vlad Kopylov <[email protected]> wrote:
You definitely need to add mount options in /etc/fstab;
use the ones from here:
http://lists.gluster.org/pipermail/gluster-users/2018-April/033811.html
I went with local mounts to get better performance as well.
Also, the 3.12 or 3.10 branches would be preferable for
production.
On Fri, Apr 6, 2018 at 4:12 AM, Artem Russakovskii <[email protected]> wrote:
Hi again,
I'd like to expand on the performance issues and
plead for help. Here's one case which shows these
odd hiccups: https://i.imgur.com/CXBPjTK.gifv
In this GIF where I switch back and forth between
copy operations on 2 servers, I'm copying a 10GB
dir full of .apk and image files.
On server "hive" I'm copying straight from the
main disk to an attached volume block (xfs). As
you can see, the transfers are relatively speedy
and don't hiccup.
On server "citadel" I'm copying the same set of
data to a 4-replica gluster volume which uses block
storage as a brick. As you can see, performance is
much worse, and there are frequent pauses of many
seconds where nothing seems to be happening and the
copy just freezes.
All 4 servers have the same specs, and all of them
have performance issues with gluster and no such
issues when raw xfs block storage is used.
hive has long finished copying the data, while
citadel is barely chugging along and is expected
to take probably half an hour to an hour. I have
over 1TB of data to migrate, at which point if we
went live, I'm not even sure gluster would be able
to keep up instead of bringing the machines and
services down.
Here's the cluster config, though it didn't seem
to make any difference performance-wise before I
applied the customizations vs after.
Volume Name: apkmirror_data1
Type: Replicate
Volume ID: 11ecee7e-d4f8-497a-9994-ceb144d6841e
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 4 = 4
Transport-type: tcp
Bricks:
Brick1: nexus2:/mnt/nexus2_block1/apkmirror_data1
Brick2: forge:/mnt/forge_block1/apkmirror_data1
Brick3: hive:/mnt/hive_block1/apkmirror_data1
Brick4: citadel:/mnt/citadel_block1/apkmirror_data1
Options Reconfigured:
cluster.quorum-count: 1
cluster.quorum-type: fixed
network.ping-timeout: 5
network.remote-dio: enable
performance.rda-cache-limit: 256MB
performance.readdir-ahead: on
performance.parallel-readdir: on
network.inode-lru-limit: 500000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
cluster.readdir-optimize: on
performance.io-thread-count: 32
server.event-threads: 4
client.event-threads: 4
performance.read-ahead: off
cluster.lookup-optimize: on
performance.cache-size: 1GB
cluster.self-heal-daemon: enable
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: on
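
To compare the reconfigured options between servers, or before and
after a change, the list above can be pulled out of the volume info
text. This is a minimal sketch that assumes the output layout shown
above; reconfigured_options is a hypothetical helper name, not a
gluster command.

```shell
#!/bin/sh
# Reads `gluster volume info` text on stdin and prints every line that
# follows "Options Reconfigured:" as a "key value" pair.
reconfigured_options() {
  awk '
    /^[ \t]*Options Reconfigured:/ { in_opts = 1; next }
    in_opts && /:/ {
      line = $0
      sub(/^[ \t]+/, "", line)             # drop leading indentation
      key = line; sub(/:.*/, "", key)      # text before the first colon
      val = line; sub(/^[^:]*:[ \t]*/, "", val)  # text after it
      print key, val
    }
  '
}
```

Possible usage, assuming the gluster CLI is available on the host:
gluster volume info apkmirror_data1 | reconfigured_options | sort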
The mounts are done as follows in /etc/fstab:
/dev/disk/by-id/scsi-0Linode_Volume_citadel_block1 /mnt/citadel_block1 xfs defaults 0 2
localhost:/apkmirror_data1 /mnt/apkmirror_data1 glusterfs defaults,_netdev 0 0
I'm really not sure if direct-io-mode mount tweaks
would do anything here, what the value should be
set to, and what it is by default.
The OS is OpenSUSE 42.3, 64-bit. 80GB of RAM, 20
CPUs, hosted by Linode.
I'd really appreciate any help in the matter.
Thank you.
Sincerely,
Artem
On Thu, Apr 5, 2018 at 11:13 PM, Artem Russakovskii <[email protected]> wrote:
Hi,
I'm trying to squeeze performance out of
gluster on 4 machines (80GB RAM, 20 CPUs each) where
Gluster runs on attached block storage
(Linode) as 4 replicated bricks, and so far
everything I've tried results in sub-optimal
performance.
There are many files (mostly images, several
million of them), and many operations take minutes.
Copying multiple files (even if they're small)
suddenly freezes up for seconds at a time,
then continues; iostat frequently shows large
r_await and w_await values with 100% utilization for
the attached block device, etc.
But anyway, there are many guides out there
for small-file performance improvements, but
more explanation is needed, and I think more
tweaks should be possible.
My question today is about performance.cache-size.
Is this the size of a cache in RAM? If so, how do I
view the current cache size to see if it gets full
and whether I should increase its size? Is it
advisable to bump it up if I have many tens of gigs
of RAM free?
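
As a starting point, the configured value can at least be read and
changed from the gluster CLI with "volume get" and "volume set". The
sketch below wraps those two subcommands; the function names and the
GLUSTER override variable are hypothetical and exist only so the
commands can be previewed without running them. Note that this shows
the configured limit, not how full the cache currently is, which is
the part of the question this thread leaves unanswered.

```shell
#!/bin/sh
# GLUSTER can be overridden (e.g. GLUSTER=echo) to preview the commands
# instead of executing them; by default the real gluster CLI is used.
GLUSTER="${GLUSTER:-gluster}"

# Print the currently configured performance.cache-size for a volume.
show_cache_size() {
  "$GLUSTER" volume get "$1" performance.cache-size
}

# Set performance.cache-size for a volume, e.g. to 2GB.
set_cache_size() {
  "$GLUSTER" volume set "$1" performance.cache-size "$2"
}
```

Possible usage: show_cache_size apkmirror_data1, then
set_cache_size apkmirror_data1 2GB, where 2GB is only an illustrative
value and not a recommendation.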
More generally, in the last 2 months since I
first started working with gluster and set a
production system live, I've been feeling
frustrated because Gluster has a lot of
poorly-documented and confusing options. I
really wish documentation could be improved
with examples and better explanations.
Specifically, it'd be absolutely amazing if
the docs offered a strategy for setting each
value and ways of determining more optimal
values. For example,
for performance.cache-size, if it said
something like "run command abc to see your
current cache size, and if it's hurting, up
it, but be aware that it's limited by RAM,"
it'd be already a huge improvement to the
docs. And so on with other options.
The gluster team is quite helpful on this
mailing list, but in a reactive rather than
proactive way. Perhaps it's tunnel vision: once
you've worked on a project for so long,
less technical explanations and even proper
documentation of options take a back seat.
Still, I encourage you to be more proactive about
helping us understand and optimize Gluster.
Thank you.
Sincerely,
Artem
_______________________________________________
Gluster-users mailing list
[email protected]
<mailto:[email protected]>
http://lists.gluster.org/mailman/listinfo/gluster-users
<http://lists.gluster.org/mailman/listinfo/gluster-users>