To: "Degremont, Aurelien", "lustre-discuss@lists.lustre.org",
"lustre-discuss-requ...@lists.lustre.org",
"lustre-discuss-ow...@lists.lustre.org"
Subject: RE: [EXTERNAL][lustre-discuss] Lustre with ZFS Install
CAUTION: This email originated from outside of the organization.
Hi
It looks like the ‘./configure’ command was not successful. Did you check it?
Also, please copy/paste terminal output as text and not as a picture.
Aurélien
From: lustre-discuss on behalf of Nick dan via lustre-discuss
Reply-To: Nick dan
Date: Tuesday, 24 January 2023 at 09:31
To:
Hi Dan,
There are, a priori, no incompatibilities between Lustre and Postgres. Don't
bother configuring some clients as RW and some as RO before you have properly
completed your Postgres setup.
However, this is a Lustre mailing list and you're asking about Postgres setup.
This is not the right place.
12, 2023, at 2:59 AM, Degremont, Aurelien
wrote:
>
> Hello Daniel,
>
> You should also check whether some user workload is triggering that load,
like a constant stream of SYNCs to files on those OSTs, for example.
>
> Aurélien
>
> Le
Did you try? :)
But the answer is yes, ‘-o ro’ is supported for client mounts
Aurélien
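For reference, a read-only client mount can be sketched like this; the MGS NID, filesystem name and mount point below are assumptions, not values from the thread:

```shell
# Mount a Lustre client read-only; mgsnode@tcp, testfs and /mnt/lustre are placeholders
mount -t lustre -o ro mgsnode@tcp:/testfs /mnt/lustre

# Confirm the 'ro' option took effect
mount | grep lustre
```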
From: lustre-discuss on behalf of Nick dan via lustre-discuss
Reply-To: Nick dan
Date: Friday, 13 January 2023 at 10:48
To: "lustre-discuss@lists.lustre.org",
"lustre-discuss-requ...@lists.lustre.org"
Hello Daniel,
You should also check whether some user workload is triggering that load, like
a constant stream of SYNCs to files on those OSTs, for example.
Aurélien
On 11/01/2023 22:37, « lustre-discuss on behalf of Daniel Szkola via
lustre-discuss » wrote:
Hi
Yes, both parameters at the same time are valid.
Regarding the RPMs you installed: you picked a new kernel; did you reboot to
use it? If not, you should.
You are missing those 2 packages; I'm even surprised Yum did not complain about
missing deps.
Hi Christopher,
As far as I know, this will only prevent you from using a few features that
require either a recent Linux kernel or a kernel with the appropriate
backports for them to work. Off the top of my head, I'm thinking of project
quotas for ldiskfs, but you are using ZFS, or client
Hi François
[root@server1 ~] rm: cannot remove ‘Logs: Cannot send after transport endpoint
shutdown
[root@server1 ~] mv: cannot move /test/lustre/structure1 to
‘/test/lustre/structure2’: Input/output error
These 2 error messages are typical of a client eviction issue. Your client
was
Lustre 2.14.0 supports Linux kernel up to 5.4.
Lustre 2.15.0, which will be released in the coming days, supports up to Linux
5.11 according to the Changelog, but supports clients up to 5.14 according to
this ticket: https://jira.whamcloud.com/browse/LU-15220
This ticket is tracking effort to support
Hi
I'm not a specialist in project quotas, but I have a more generic comment.
I see you said you upgraded to 2.14.58? Is that a version you picked on
purpose? 2.14.58 is not intended for production at all. This is an alpha
version of what
will be Lustre 2.15.
If you want a production-compatible
Hello Michael
Lustre 2.12.8 does not support Linux 5.15
More recent Lustre versions support up to Linux 5.11 but not further.
See these tickets for 5.12 and 5.14 support.
https://jira.whamcloud.com/browse/LU-14651
https://jira.whamcloud.com/browse/LU-15220
It is possible to manually backport
Hello Florin,
As the filesystem servers no longer exist (you deleted them previously), the
client cannot reach them to complete the unmount process.
Try unmounting them using the '-f' flag, i.e. 'umount -f '
You should also reach out to AWS support and check that with them.
Aurélien
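A minimal sketch of the forced unmount; the mount point /mnt/lustre is an assumption:

```shell
# Force the unmount even though the servers are unreachable
umount -f /mnt/lustre

# If processes still hold the mount point busy, a lazy detach may also help
umount -l /mnt/lustre
```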
Le
Hello
This message appears during MDT recovery, likely after an MDS restart. The MDT
first tries to reconnect all the clients that existed when it stopped.
It seems all these clients have also been rebooted. To avoid this message, try
stopping your clients before the servers.
If not possible, you can
the performance with direct I/O is worse when
using the ZFS backend, at least during my tests.
Best
Riccardo
On 10/1/21 2:22 AM, Degremont, Aurelien wrote:
> Hello
>
> To achieve higher throughput with a single threaded process, you should
try to limit
Hello
To achieve higher throughput with a single-threaded process, you should try to
limit latencies and parallelize under the hood.
Try checking the following parameters:
- Stripe your file across multiple OSTs
- Do large I/O, multiple MB per write, to let Lustre send multiple RPCs to
different
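The striping advice above can be sketched with lfs; the file path and sizes are assumptions for illustration:

```shell
# Stripe the file across all available OSTs with a 1 MiB stripe size
lfs setstripe -c -1 -S 1M /mnt/lustre/bigfile

# Write with large I/O sizes so the client can keep RPCs to several OSTs in flight
dd if=/dev/zero of=/mnt/lustre/bigfile bs=16M count=64
```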
>>> [0x2b90e:0x21b4:0x0]
>>> [0x2b90c:0x1574:0x0]
>>> [0x2b90c:0x1575:0x0]
>>> [0x2b90c:0x1576:0x0]
>>>
>>> Doesn't seem to be many open, so I don't think it's a problem of open
files.
>>>
>
?
Aurélien
On 26/04/2021 18:27, « Steve Thompson » wrote:
On Mon, 26 Apr 2021, Degremont, Aurelien wrote:
> On 26/04/2021 09:34, « Degremont, Aurelien » wrote:
>
>This message appears when you
On 26/04/2021 09:34, « Degremont, Aurelien » wrote:
This message appears when you are using Lustre modules built with only
client support, with server support disabled.
This message is quite new and only appears in very recent Lustre releases.
Actually I double-checked
Hello Steve,
This message appears when you are using Lustre modules built with only client
support, with server support disabled.
This message is quite new and only appears in very recent Lustre releases.
What Lustre version are you using? This error does not exist in 2.12.6 as far
as I know.
You're totally correct!
From: lustre-discuss on behalf of Sid Young via lustre-discuss
Reply-To: Sid Young
Date: Friday, 26 February 2021 at 00:45
To: lustre-discuss
Subject: [EXTERNAL] [lustre-discuss] servicenode /failnode
Hello
If I understand correctly, you're saying that you have 2 configuration files:
/etc/modprobe.d/lnet.conf
options lnet networks=tcp
[root@hpc-oss-03 ~]# cat /etc/modprobe.d/lustre.conf
options lnet networks="tcp(ens2f0)"
options lnet ip2nets="tcp(ens2f0) 10.140.93.*
That means you are
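To avoid such conflicts, keeping a single source of truth for the lnet options is usually preferable. A sketch of a consolidated file, assuming ens2f0 is the interface you want (remove the other 'options lnet' lines):

```shell
# /etc/modprobe.d/lnet.conf -- the only place 'options lnet' should appear
options lnet networks="tcp(ens2f0)"
```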
Hi Andrew,
There is no such thing as a filesystem version, as a filesystem is made of
MDTs, OSTs and clients. Each of them could run a different Lustre version
(even though running very different Lustre versions between MDT and OST is not
really supported).
So, to get the Lustre version, you
Thank you,
Amit
-Original Message-
From: Stephane Thiell
Sent: Monday, December 7, 2020 11:43 AM
To: Degremont, Aurelien
Cc: Kumar, Amit ; Russell Dekema ;
lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] Robinhood scan time
Hi Amit,
Thanks for this data point, that's interesting.
Robinhood prints a scan summary in its logfile at the end of the scan. It
would be nice if you could copy/paste it, for future reference.
Aurélien
On 04/12/2020 23:39, « lustre-discuss on behalf of Kumar, Amit »
wrote:
This is a known issue, see https://jira.whamcloud.com/browse/LU-11840 and
https://jira.whamcloud.com/browse/LU-13548
Aurélien
From: lustre-discuss on behalf of Mark Lundie
Date: Tuesday, 1 December 2020 at 13:16
To: fırat yılmaz
Cc: "lustre-discuss@lists.lustre.org"
Subject: RE: [EXTERNAL]
Hi Angelos,
Bug reports can be filed at https://jira.whamcloud.com/
Aurélien
On 04/09/2020 06:11, « lustre-discuss on behalf of Angelos Ching »
wrote:
Do you mean you've never hit client eviction or dirty page discard before?
What was your previous Lustre version?
The dirty page discard warning has existed for a long time, since Lustre 2.4.
Evictions happen for lots of reasons. An eviction just means this client did
not respond to this OSS within 100 sec. It
Hi Phil,
There are conflicts with your already installed Lustre 2.9.0 packages.
Based on the output you provided, you should remove 'kmod-lustre-client-tests'
first.
Actually, only kmod-lustre-client and lustre-client are required. You
probably don't need the other ones (lustre-iokit,
Hi Lana,
Lustre dispatches the data across several servers, MDTs and OSTs. It is likely
that one of these OSTs is full.
To see the usage per sub-component, you should check:
lfs df -h
lfs df -ih
See if this reports that one OST or the MDT is full.
Aurélien
From: lustre-discuss on behalf of Lana
Deere
.>; failed to send uevent qos_threshold_rr=100
On Fri, Mar 6, 2020 at 9:39 AM Michael Di Domenico
wrote:
>
> On Fri, Mar 6, 2020 at 9:36 AM Degremont, Aurelien
wrote:
> >
> > Did you see any actual error on your system?
Did you see any actual error on your system?
Because there is a patch that just decreases the verbosity level of such
messages, it looks like they could be ignored.
https://jira.whamcloud.com/browse/LU-13071
https://review.whamcloud.com/#/c/37718/
Aurélien
On 06/03/2020 15:17, «
are not available.
Aurélien
On 05/03/2020 11:47, « Torsten Harenberg »
wrote:
Hi Aurélien,
thanks for your quick reply, really appreciate it.
On 05.03.20 at 10:20, Degremont, Aurelien wrote:
> - What is the exact error message when the panic happens? Could you
c
Hello Torsten,
- What is the exact error message when the panic happens? Could you copy/paste
a few log messages from this panic?
- Did you try searching for this pattern on jira.whamcloud.com, to see if
this is an already known bug?
- It seems related to quota. Is disabling quota an
Very nice one! Thanks!
I was able to reproduce and confirm that with the proper sequence, the LNET
connections will be initiated from the server to the client's port 988.
Thanks all!
From: Steve Crusan
Date: Friday, 21 February 2020 at 03:53
To: NeilBrown
Cc: "Degremont, Aurelien",
"
that helps.
NeilBrown
On Wed, Feb 19 2020, Degremont, Aurelien wrote:
> Thanks! That's really interesting.
> Do you have a code pointer that could show where the code will establish
this connection if missing?
>
> L
it.
-Cory
On 2/19/20, 2:35 AM, "lustre-discuss on behalf of Degremont, Aurelien"
wrote:
Thanks! That's really interesting.
Do you have a code pointer that could show where the code will
establish this connection if missing?
On 18/02/
ld assume that any LNet node might receive a
connection
> from any other LNet node (for which they share an LNet network), and
> that the connection could come from any port between 512 and 1023
> (LNET_ACCEPTOR_MIN_PORT to LNET_ACCEPTOR_MAX_PORT).
>
>
(LNET_ACCEPTOR_MIN_PORT to LNET_ACCEPTOR_MAX_PORT).
NeilBrown
On Mon, Feb 17 2020, Degremont, Aurelien wrote:
> Hi all,
>
> From what I've understood so far, LNET listens on port 988 by default and
peers connect to it using 1021-1023 TCP ports
Hi all,
From what I've understood so far, LNET listens on port 988 by default and peers
connect to it using 1021-1023 TCP ports as source ports.
At Lustre level, servers listen on 988 and clients connect to them using the
same source ports 1021-1023.
So only accepting connections to port 988 on
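Under those assumptions, the firewall rules on a server could be sketched like this; iptables syntax is used for illustration and the rules are deliberately minimal:

```shell
# Allow inbound LNET connections to the listener port (988)
iptables -A INPUT -p tcp --dport 988 -j ACCEPT

# Allow connections arriving from privileged LNET source ports
# (LNET_ACCEPTOR_MIN_PORT=512 to LNET_ACCEPTOR_MAX_PORT=1023, per the thread)
iptables -A INPUT -p tcp --sport 512:1023 -j ACCEPT
```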
I have no idea what the problem could be, but you should try benchmarking
your network bandwidth with lnet_selftest, with both o2ib and tcp, and compare
the values. You will see whether the problem is related to the Lustre network
layer or something else.
http://wiki.lustre.org/LNET_Selftest
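A minimal lnet_selftest session might look like this (following the wiki page above; the NIDs are assumptions):

```shell
modprobe lnet_selftest
export LST_SESSION=$$          # lst needs a session identifier in the environment

lst new_session bwtest
lst add_group servers 192.168.1.1@tcp
lst add_group clients 192.168.1.2@tcp

lst add_batch bulk
lst add_test --batch bulk --from clients --to servers brw write size=1M

lst run bulk
lst stat clients servers       # watch the reported bandwidth
lst end_session
```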
De :
I did not try Debian, but Ubuntu support works quite well.
You should try building from sources and hopefully it will work.
I know some sites in Germany (GSI) that use Debian for their clusters, so
it must work.
Try downloading the 2.12.4 sources, then
sh ./autogen.sh
./configure
make
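Expanding the sequence above into a full sketch; the --disable-server flag and the 'make debs' packaging target are assumptions about what a client-only Debian build would look like, not steps confirmed in the thread:

```shell
tar xzf lustre-2.12.4.tar.gz && cd lustre-2.12.4
sh ./autogen.sh
./configure --disable-server    # a client-only build is the simplest starting point
make -j"$(nproc)"
make debs                       # build .deb packages on Debian-style systems
```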
Hi Kris,
As people said, each HSM has its own copytool with its own constraints. To
work properly with S3, additional metadata is stored in Lustre when files are
imported from an S3 bucket.
Questions specific to Amazon FSx for Lustre would be better asked to AWS
support.
Aurélien
Le
Hi
These messages mean the client thinks it has lost communication with the
server and reconnects. The server only sees the reconnection and never
thought the client was gone.
It could be related to lots of things. The server could be receiving RPCs from
this client but not processing
Quota is not enabled by default; you need to enable it. Space accounting is on
by default.
Did you check the doc?
See 25.2.1: http://doc.lustre.org/lustre_manual.xhtml#configuringquotas
Aurélien
From: lustre-discuss on behalf of Parag
Khuraswar
Date: Friday, 29 November 2019 at 04:10
To:
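A sketch of enabling and checking quotas along the lines of the manual section cited above; the filesystem name, user and limits are assumptions:

```shell
# On the MGS: enable quota enforcement for users and groups on OSTs and MDTs
lctl conf_param testfs.quota.ost=ug
lctl conf_param testfs.quota.mdt=ug

# On a client: set block/inode limits for a user, then check them
lfs setquota -u alice -b 10G -B 11G -i 100000 -I 110000 /mnt/testfs
lfs quota -u alice /mnt/testfs
```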
Thanks for this info. But actually I was really looking at the number of OSSes,
not OSTs :)
This is really more about how Lustre client nodes and the MDT will cope with a
very large number of OSSes.
From: Andreas Dilger
Date: Friday, 4 October 2019 at 04:54
To: "Degremont, Aurelien"
Cc: "
Hello all!
This doc from the wiki says "Lustre can support up to 2000 OSS per file system"
(http://wiki.lustre.org/Lustre_Server_Requirements_Guidelines).
I'm a bit surprised by this statement. Does somebody have information about
the upper limit on the number of OSSes?
Or what could be the
As Andreas said "it is not relevant for ZFS since ZFS dynamically allocates
inodes and blocks as needed"
"as needed" is the important part. In your example, your MDT is almost empty,
so 17G inodes for an empty MDT seems pretty sufficient.
As you will create new files and use these inodes, you
Hi Bernd,
Lustre is trying to move its default supported ZFS version to 0.8.1. During
the tests of the Lustre 2.13 development branch, there was an increase in test
failures. The default ZFS version was reverted to 0.7.13 until it is clear
whether the problem comes from that or something else. Based on that, the
The Lustre 2.10 branch doesn't support CentOS 7.7.
Lustre 2.12.3 and Lustre 2.13 will.
However, Lustre 2.10.8 can be built on CentOS 7.7 if you can afford to go
without InfiniBand support.
That is the part of the code that causes the problem; './configure
--with-ofed=no' will disable it.
Aurélien
De :
I saw the same issue and downgraded my kernel back to RHEL kernel 3.10.0-957.
But you probably want to keep the 7.7 kernel? :)
From: lustre-discuss on behalf of Tamas
Kazinczy
Organisation: Governmental Agency for IT Development
Date: Tuesday, 10 September 2019 at 14:39
To:
Hello
When an OST dies and you have no choice but to replace it with a freshly
formatted one (using mkfs --replace), Lustre runs a resynchronization
mechanism between the MDT and the OST.
The MDT will send the last object ID it knows for this OST and the OST will
compare this value with
Hello
I think that's expected. Lustre runs an identity upcall command on the MDT to
get the user's secondary groups. If it fails, Lustre returns a permission
error.
See: http://doc.lustre.org/lustre_manual.xhtml#dbdoclet.l_getidentity
Try disabling it to confirm this is the reason for your problem.
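Disabling the upcall for a quick test might look like this on the MDS; the MDT name is an assumption:

```shell
# Replace the l_getidentity upcall with NONE to bypass secondary-group lookups
lctl set_param mdt.testfs-MDT0000.identity_upcall=NONE

# Retry the failing access from the client; restore the original value afterwards
lctl get_param mdt.testfs-MDT0000.identity_upcall
```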
Lustre is transitioning /proc files to /sys per kernel developers' request.
Instead of chasing where these files are now located depending on your Lustre
version, 'lctl get_param' is the official way to read them.
This command will figure out where to properly read the files.
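For example, instead of hunting for the file under /proc or /sys (the tunable name here is just an illustration):

```shell
# Read a client-side tunable wherever the kernel currently exposes it
lctl get_param osc.*.max_dirty_mb

# The same pattern-based syntax works for writes
lctl set_param osc.*.max_dirty_mb=512
```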
De :
Do these lines mean: since the last snapshot there were 211177215x1 and
41332944x2 read I/Os in flight?
Best regards,
Louis
On 16/07/2019 15:50, Degremont, Aurelien wrote:
Hi Louis,
About brw_stats, there is a bit of explanation in the Lustre Doc (not that
detailed, but still)
http://doc.lustre.o
Nobody on this topic?
I'm pretty sure there are lots of people running Lustre on ZFS with various
tunings applied. Don't be shy!
From: "Degremont, Aurelien"
Date: Wednesday, 10 July 2019 at 10:35
To: "lustre-discuss@lists.lustre.org"
Subject: ZFS tunings
Hi all,
I know
Hi Louis,
About brw_stats, there is a bit of explanation in the Lustre Doc (not that
detailed, but still)
http://doc.lustre.org/lustre_manual.xhtml#dbdoclet.50438271_55057
> Last thing, is there any way to get the name of the filesystem an OST is part
> of by using lctl ?
I don't know what
Hi all,
I know good default tunings for ZFS when used with Lustre is a big topic.
There are several pages on the wiki and LUG/LAD slides which are a few years
old, and it is difficult to know which ones are still relevant with recent
Lustre and ZFS versions.
Does anybody have insight about
Hello Kurt,
max_create_count is a tunable per OST. It controls how many object
pre-creations can be done on each OST by the MDT. If you set this to 0 for
one OST, no additional objects will be allocated there and no new file will
have stripes on it.
With the appropriate hardware,
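A sketch of draining one OST this way, run on the MDS; the filesystem and target names, and the restored default of 20000, are assumptions:

```shell
# Stop new object pre-creation on OST0002 as seen from MDT0000
lctl set_param osp.testfs-OST0002-osc-MDT0000.max_create_count=0

# Re-enable it later
lctl set_param osp.testfs-OST0002-osc-MDT0000.max_create_count=20000
```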
Hi folks!
I'm seeing a lot of "Connection restored to " messages in a situation where
everything looks fine, with 2.10.5 filesystems.
Looking at the code, it looks like a client side thing but I'm seeing that on
server side.
Does this mean something is bad on the system or this message is not
This is very unlikely.
The only way that could happen is if this hardware acknowledged I/O to
Lustre that it did not really commit to disk (e.g. a writeback cache), or a
Lustre bug.
On 02/04/2019 14:11, « lustre-discuss on behalf of Hans Henrik Happe »
wrote:
Isn't there a
Another thought I just had while re-reading LU-9341 is whether it would be
better to have the MDS always create files opened with O_APPEND with
stripe_count=1? There is no write parallelism for O_APPEND files, so having
multiple stripes doesn't help the writer. Because the writer
Also, if you’re not using Lustre 2.11 or 2.12, do not forget dnodesize=auto and
recordsize=1M for OST
zfs set dnodesize=auto mdt0
zfs set dnodesize=auto ostX
zfs set recordsize=1M ostX
https://jira.whamcloud.com/browse/LU-8342
(useful for 2.10 LTS. Automatically done by Lustre for 2.11+)
From: lustre-discuss on behalf of
Thanks for the clarifications.
Aurélien
On 14/01/2019 01:36, « Andreas Dilger » wrote:
On Jan 10, 2019, at 04:52, Degremont, Aurelien wrote:
>
>
> On 09/01/2019 21:39, « Andreas Dilger » wrote:
>
>> If admins completely trust the
>> --define "_tmppath $rpmbuilddir/TMP" \
>> --define "_topdir $rpmbuilddir" \
>> --define "dist %{nil}" \
>> -ts lustre-2.12.0.tar.gz || exit 1; \
>> cp $rpmbuilddir/SRPMS/lustre-2.12.0-*.src.r
2.10.6 does not support Linux 4.14.
There are several patches that need to be cherry-picked to get it working.
2.12 is fine though.
Aurélien
On 09/01/2019 18:43, « lustre-discuss on behalf of Tauferner, Andrew T »
wrote:
I configured like so:
09.01.19 um 11:48 schrieb Degremont, Aurelien:
> When disabling identity_upcall on a MDT, you get this message in system
> logs:
>
> lustre-MDT: disable "identity_upcall" with ACL enabled maybe cause
> unexpected "EACCESS"
>
Hello all,
When disabling identity_upcall on a MDT, you get this message in system logs:
lustre-MDT: disable "identity_upcall" with ACL enabled maybe cause
unexpected "EACCESS"
I’m trying to understand what scenario could show this problem.
What is the implication, or rather,
*LAD'18 - Lustre Administrators and Developers Workshop*
September 24th-25th, 2018
Marriott Champs Elysées Hotel, Paris - France
These are the last days to benefit from early bird rates for LAD!
Register before August 26th and get a discount on registration fees.
Hello
This feature is primarily used internally by HSM, but is not limited to it.
data_version is a number computed from each OST object the file is made of.
Two different values for data_version mean the file content has changed.
Two identical values mean the content is identical. There is
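The value can be queried from a client with lfs; the file path is an assumption:

```shell
# Read the current data version of a file
lfs data_version /mnt/testfs/somefile

# Modify the file and read again: a different number means the content changed
echo x >> /mnt/testfs/somefile
lfs data_version /mnt/testfs/somefile
```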
cs.org put
on their own.
I hope this helps clarify my point of view.
This does not blame the really good content that was recently put on
wiki.lustre.org, just the format, not the content :)
Aurélien
On 15/12/2017 at 09:32, Dilger, Andreas wrote:
On Dec 13, 2017, at 09:02, DEGREMON
Hello
My recommendation would be to go for something like documentation-as-code.
It is now very easy to deploy an infrastructure that automatically
generates nice-looking HTML and PDF versions of rst or markdown
documentation, thanks to readthedocs.io.
readthedocs.io uses Sphinx as a backend,
*LAD'17 - Lustre Administrators and Developers Workshop*
October 4th-5th, 2017
Salon des Arts et Métiers, Paris - France
We are pleased to announce that agenda for Lustre Administrator and
Developer Workshop 2017 is now available online:
http://www.eofs.eu/events/lad17
Please register before
Hello Lustre community!
These are the last days to send abstracts for LAD'17, do not wait! We
will be pleased to hear about your Lustre experiences. Not only
developers but also from sites or admins presenting their Lustre
deployment and experiences with it.
*LAD'17 - Lustre Administrators and Developers Workshop*
*October 4th-5th, 2017
Salon des Arts et Métiers, Paris - France*
EOFS and OpenSFS are happy to announce the 7th LAD will be held in
Paris, France, at Salon des Arts et Métiers! This will be a 2-day event,
from 4th to 5th of October,
*LAD'17 - Lustre Administrator and Developer Workshop*
*October 3rd - 4th, 2017*
*Hôtel des Arts et Métiers, Paris - France*
EOFS and OpenSFS are happy to announce the 7th LAD will be held in
Paris, France, at the Salon de l'Hôtel des Arts et Métiers! This will be
a 2-day event, from
*LAD'16 - Lustre Administrator and Developer Workshop*
September 20th - 21st, 2016
Hotel Renaissance Trocadero, Paris - France
We are pleased to announce that agenda for Lustre Administrator and
Developer Workshop 2016 is now available online:
http://www.eofs.eu/?id=lad16
Please register
*LAD'15 - Lustre Administrator and Developer Workshop*
September 22nd - 23rd, 2015
Paris Marriott Champs Elysees Hotel, Paris - France
We are pleased to announce that agenda for Lustre Administrator and
Developer Workshop 2015 is now available online:
http://www.eofs.eu/?id=lad15
Please
*LAD'15 - Lustre Administrator and Developer Workshop*
September 22nd - 23rd, 2015
Paris Marriott Champs Elysees Hotel, Paris - France
*CALL FOR PAPERS*
We are extending call for papers up to August 4th, 2015. You have one
more week to send your abstract!
We are inviting community members to
*LAD'15 - Lustre Administrator and Developer Workshop*
*September 22nd - 23rd, 2015*
*Paris Marriott Champs Elysees Hotel, Paris - France*
EOFS and OpenSFS are happy to announce the 5th LAD will be held in
Paris, France, at Marriott Hotel on Champs-Elysees Avenue! This will be
a 2-day
*LAD'13 - Lustre Administrators and Developers Workshop*
*September 16th - 17th, 2013*
*Hotel Lutetia, Paris - France*
AGENDA
The LAD'13 agenda is now available online!
Check LAD web page: http://www.eofs.eu/?id=lad13
REGISTRATION
There are only a few days left to benefit from the
Hello
AFAIK there are 2 orders:
- If you are starting your filesystem for the first time (or using
--writeconf), the order is:
MGS, MDS, OST, Clients
- On a normal start:
MGS, OST, MDS, Clients
There is a patch in some recent Lustre releases to be able to use the
first order at any time, but I would
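The first-time order above would translate into a mount sequence like this; the device paths, MGS NID and filesystem name are assumptions:

```shell
mount -t lustre /dev/mgs_dev /mnt/mgs            # 1. MGS
mount -t lustre /dev/mdt_dev /mnt/mdt            # 2. MDS (first start / --writeconf)
mount -t lustre /dev/ost_dev /mnt/ost0           # 3. OSTs
mount -t lustre mgsnode@tcp:/testfs /mnt/testfs  # 4. clients
```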
On 24/07/2012 20:10, Daniel Kobras wrote:
Is this the troglodyte type of OST that started its life in times of
prehistoric versions of Lustre? We see this on old files that were created in
the early ages of Lustre 1.6, before the trusted.fid EA was introduced.
No, this filesystem was
Hello
I'm trying to analyze an OST with a few thousand objects and find where they
belong.
Mounting this OST with ldiskfs and using ll_decode_filter_fid tells me that:
- Most of these objects do not have a fid EA back pointer. Does that mean they
are not used?
- Some of them have good
Mark Day wrote:
robinhood supplies this list, if you are using lustre 2.0 it directly
queries the mdt.
http://sourceforge.net/apps/trac/robinhood
On this subject does anyone know of a tool/app that tracks folder sizes?
Robinhood can also do this.
Aurélien
Liam Forbes wrote:
I just built a lustre filesystem, and it appears to be working fine.
However, I think I'd like to reorder the OSTs in order to spread the
work across the OSSes better. Is this something I can do on the fly,
or do I have to rebuild the filesystem in a different order? If
European Lustre Workshop 2011
September 26th - 27th 2011
Hotel Pullman Paris Bercy - Paris, France
EOFS is organizing the first European Lustre Workshop, in Paris, at
Hotel Pullman Paris Bercy during 2 days, 26th and 27th of September, 2011.
This will be a great opportunity for Lustre(tm)
Nathan Rutman wrote:
On May 3, 2011, at 10:09 AM, DEGREMONT Aurelien wrote:
Server and client cooperate for the adaptive timeouts.
OK, they cooperate, and the client will change its timeout through this
cooperation, but will the server also do the same?
If yes, obd_timeout and ldlm_timeout
Hello
Andreas Dilger wrote:
On May 3, 2011, at 13:41, Nathan Rutman wrote:
On May 3, 2011, at 10:09 AM, DEGREMONT Aurelien wrote:
Correct me if I'm wrong, but when I look at the Lustre manual, it says
that the client adapts its timeout, but not the server. I understood
Johann Lombardi wrote:
On Wed, May 04, 2011 at 01:37:14PM +0200, DEGREMONT Aurelien wrote:
I assume that the 25315s is from a bug
BTW, do you see this problem with both extent and inodebits locks?
Yes both. But more often on MDS.
How can I track those dropped RPCs on routers
Hello
We often see some of our Lustre clients being wrongly evicted (the clients
seem healthy).
The pattern is always the same:
All of this is on Lustre 2.0, with adaptive timeouts enabled
1 - A server complains about a client :
### lock callback timer expired... after 25315s...
(nothing on
for AT that might
help for you (increased at_min, etc).
Hopefully Chris or someone at LLNL can comment. I think they were also
documented in bugzilla, though I don't know the bug number.
Cheers, Andreas
On 2011-05-03, at 6:59 AM, DEGREMONT Aurelien aurelien.degrem...@cea.fr
wrote:
Hello
We
Jason Rappleye wrote:
On Mar 18, 2011, at 1:07 AM, DEGREMONT Aurelien wrote:
Yes, that would totally make sense to do regardless of other methods.
Hmm... I do not want to patch 'cp' or 'dd' :)
You might want to have a look at this:
http://code.google.com/p/pagecache
Hello
I'm trying to control the amount of cache memory used by Lustre clients
with Lustre 2.0
There is a /proc tunable called max_cache_mb in llite which is
documented in the Lustre manual.
Unfortunately, after testing it and checking the source code, it seems
this variable is present but is not
Hello
From my understanding, Lustre can return EINTR for some I/O error cases.
I think that when a client gets evicted in the middle of one of its RPCs,
it can return EINTR to the caller.
Could this explain your issue?
Can you verify your clients were not evicted at the same time?
Aurélien
Hello
I advise you to have a look at Robinhood, which has a ton of features to
handle Lustre file management:
http://robinhood.sourceforge.net
And if you think this does not fit your needs I will be very interested
to know why.
Regards,
Aurélien Degrémont
Andrus,
Hi
Nathan Rutman wrote:
Hi Aurelien, Robert -
We also use Hudson and are interested in using it to do Lustre builds
and testing.
Hi
Robert Read wrote:
Hi Aurélien,
Yes, we've noticed Hudson's support for testing is not quite what we need,
so
we're planning to use
Hello
If you want to register different interfaces for different OSTs on the
same OSS, you should use the --network option, introduced in patch
https://bugzilla.lustre.org/show_bug.cgi?id=22078
Regards,
Aurélien
Haisong Cai wrote:
I have a storage server that has two QPI contolllers,
each
It seems the Oracle Broomfield lab is down today; this impacted not just
the git repo but other services such as Bugzilla.
Monday was a holiday in the US.
Business will resume today; I expect that the lab services will be
restored today.
Aurélien
Wojciech Turek wrote:
I am afraid that lustre