Hello!
There’s been an influx of tickets in Jira recently about clients being evicted
for being unresponsive to lock timeouts, which is just a symptom of
potentially a lot of different things.
Having gone through several of those, I did a writeup on what logs are needed
On May 28, 2017, at 3:09 PM, Russell Dekema wrote:
> Greetings,
>
> We have been having various kinds of trouble with our Lustre
> filesystem lately; right now the main problem we are having is
> intermittent severe slowness (such as 30 seconds for an 'ls' of a
> directory containing 100 files
Hello!
On Oct 30, 2016, at 8:33 AM, Thomas Roth wrote:
> Hi all,
>
> we have a large number of files that give ??? on 'ls' and the error "Cannot
> allocate memory"
> The corresponding error on the OSS is
> "lvbo_init failed for resource ... rc = -2"
>
> This seems similar to LU-5457
On Aug 14, 2016, at 1:13 PM, Phill Harvey-Smith wrote:
> On 14/08/2016 03:09, Stephane Thiell wrote:
>> Hi Phil,
>
> Phill :)
>
>> I understand that you’re running master on your clients (tag v2_8_56
>> was created 4 days ago) and 2.1 on the servers? Running master in
>> production is already
On Aug 16, 2016, at 6:55 AM, E.S. Rosenberg wrote:
> I just found this paper:
> http://wiki.lustre.org/images/d/da/Understanding_Lustre_Filesystem_Internals.pdf
>
> It looks interesting but it deals with lustre 1.6 so I am not sure how
> relevant it still is.
Well, I believe it deals with
the case; the check in d_compare was added a few
years later, and before then it was perfectly possible
to find invalid dentries, which is why we had this d_revalidate check hitting.
Bye,
Oleg
--
Oleg Drokin
Senior Software Engineer
Whamcloud, Inc
--
Oleg
of smaller parts and
attach it to http://bugs.whamcloud.com/browse/LU-871, the bug that I
previously opened to track defects found by Clang/LLVM.
It would also be great to run something like this on the 2.2 codebase, I guess.
Bye,
Oleg
--
Oleg Drokin
Senior Software Engineer
Whamcloud, Inc
On Sun, Oct 2, 2011 at 4:31 PM, Jon Zhu jon@gmail.com wrote:
Thanks a lot, the workaround works.
-Jon.
On Sun, Oct 2, 2011 at 3:47 PM, Oleg Drokin gr...@whamcloud.com wrote:
Hello!
Last time I hit this (some years ago), a simple touch
ldiskfs
'
make: *** [rpms] Error 2
Thanks,
-Jon.
On Thu, Sep 29, 2011 at 11:34 PM, Oleg Drokin gr...@whamcloud.com wrote:
Hello!
There is nothing special, same as rhel6.1:
unpack the lustre source, run autogen.sh, run configure and provide the
path to the linux kernel source for your
a procedure on how to build v2.1 GA code on CentOS 5.6 (xen)? On the
Whamcloud wiki I can only find how to build v2.1 on RHEL 6.1 or v1.8 on CentOS
5.6.
BTW, congratulations on the 2.1 release!
Regards,
Jon Zhu
Sent from Google Mail
On Fri, Jun 24, 2011 at 2:43 PM, Oleg Drokin gr
, that would
probably have a pretty negative impact too.
Bye,
Oleg
--
Oleg Drokin
Senior Software Engineer
Whamcloud, Inc.
These are just warnings, I guess you skipped the errors.
I would expect the patches won't apply, and thus it fails.
Bye,
Oleg
--
Oleg Drokin
Senior Software Engineer
Whamcloud, Inc.
does not work properly in 1.8 in that
case.
It does work with 2.1 clients.
Bye,
Oleg
--
Oleg Drokin
Senior Software Engineer
Whamcloud, Inc.
setting the DEBUG_SIZE environment variable to something bigger,
e.g. double your CPU core count (this is in megabytes for debug buffers; if you are
not interested in the debug buffers, set it to 0).
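For example, on a 16-core node that would be something like (the exact value is illustrative):
export DEBUG_SIZE=32   # in MB; e.g. double the CPU core count, or 0 to skip debug buffers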
Bye,
Oleg
--
Oleg Drokin
Senior Software Engineer
Whamcloud, Inc
to
try 1.8.6 and see if it improves things.
Bye,
Oleg
--
Oleg Drokin
Senior Software Engineer
Whamcloud, Inc.
the clients one way or the other.
Bye,
Oleg
--
Oleg Drokin
Senior Software Engineer
Whamcloud, Inc.
never seen anything like that in rhel5 xen kernels,
perhaps it's something with rhel6.1 xen?
Bye,
Oleg
--
Oleg Drokin
Senior Software Engineer
Whamcloud, Inc.
during the test, so that's why the other client cannot list files inside it? I
guess so; after I stopped the fileop test program I can get into the
directory and there is nothing in it.
Thanks,
-Jon.
On Fri, Jun 24, 2011 at 5:11 PM, Oleg Drokin gr...@whamcloud.com wrote:
Did it delete
check out the latest
2.1 and build it against your kernel-devel package
(pass --with-linux=/lib/modules/`uname -r`/build to configure while booted into
the rhel6.1 kernel from RH).
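For illustration, the build steps described above boil down to roughly this (run from the unpacked Lustre source tree; the kernel path is an example):
sh ./autogen.sh
./configure --with-linux=/lib/modules/`uname -r`/build
make rpms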
Bye,
Oleg
--
Oleg Drokin
Senior Software Engineer
Whamcloud, Inc.
--
Oleg Drokin
Senior Software Engineer
Whamcloud, Inc.
Hello!
On Jun 7, 2011, at 7:49 AM, Thomas Roth wrote:
there are some new error messages on our MDT, haven't seen these
before and according to Google nobody else has...
The usual question: what does it mean? Something to worry about?
Jun 7 06:23:53 lxmds kernel: [4565451.097596]
orphan objects.
Alternatively, next time your MDS restarts, such orphaned objects should
also be destroyed.
Bye,
Oleg
--
Oleg Drokin
Senior Software Engineer
Whamcloud, Inc.
it out.
Is the file handle part of the EA, i.e. the extended attributes?
There are multiple things that could be called a file handle, so it would be
great if you could explain a little bit
about what it is you are actually looking for.
Bye,
Oleg
--
Oleg Drokin
Senior Software Engineer
Whamcloud
Lustre by Sun Microsystems
which is now somewhat stale.)
Bye,
Oleg
--
Oleg Drokin
Senior Software Engineer
Whamcloud, Inc.
Hello!
On May 5, 2011, at 2:37 AM, vilobh meshram wrote:
I have noticed that for file or directory operations in Lustre, the
Lock Manager grabs an EX (exclusive) lock on the parent directory and then
creates the directory or file inside it. Is there a specific reason behind this
logic
Hello!
On Mar 17, 2011, at 5:44 PM, Andreas Dilger wrote:
I did not find if this was removed or this was partially included in
Lustre 2.0.
What's the current status of this, and how can I tell my client to
avoid caching too much data?
The client VM usage was one of the areas that was
Hello!
On Mar 6, 2011, at 8:43 PM, Samuel Aparicio wrote:
now an attempt to re-mount the OST fails with
LDISKFS-fs (md14): failed to open journal device unknown-block(152,225): -6
an e2fsck fixes this external superblock
[root@OST2 ~]# e2fsck -j /dev/etherd/e9.24p1 /dev/md14
e2fsck
Hello!
On Feb 9, 2011, at 4:35 PM, James Robnett wrote:
Normally I've had no problems but recently I have multiple clients
reporting the following error:
LustreError: 3935:0:(osc_request.c:1629:osc_brw_redo_request()) @@@ redo
for recoverable error req@8101ae084000
Hello!
It's not necessarily missing; some other factors might be in play. E.g. if you
have a somewhat older version of Lustre and export it via NFS from this node,
I think there was a bug leading to such messages.
If it is indeed missing, e2fsck should fix a case where a directory entry
Hello!
I guess I am a little bit late to the party, but I was just reading comments in
bug 16900 and have this question I really need to ask.
On Aug 23, 2010, at 10:58 PM, Jeremy Filizetti wrote:
> The larger RPCs from bug 16900 offered some significant performance gains when
> working over the WAN.
Hello!
On Dec 22, 2010, at 12:43 AM, Jeremy Filizetti wrote:
In the attachment I created that Andreas posted at
https://bugzilla.lustre.org/attachment.cgi?id=31423 if you look at graph 1
and 2 they are both using larger than default max_rpcs_in_flight. I believe
the data without the
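For reference, that tunable is set per OSC on the client, e.g. (the value here is just an example):
lctl set_param osc.*.max_rpcs_in_flight=32   # default is 8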
Hello!
On Dec 6, 2010, at 6:50 PM, Jeff Johnson wrote:
Previous test incarnations of this filesystem were built where ost name
was not assigned (e.g.: OST) and was assigned upon first mount and
connection to the mds. Is it possible that some clients have residual
pointers or config
Hello!
On Dec 6, 2010, at 7:05 PM, Jeff Johnson wrote:
Previous test incarnations of this filesystem were built where ost name
was not assigned (e.g.: OST) and was assigned upon first mount and
connection to the mds. Is it possible that some clients have residual
pointers or config data
Hello!
Essentially your client(s) got disconnected from MGS for some reason
(somewhere earlier in MGS logs you should see something about that).
Now the clients did not know they were disconnected and discover this sad
fact next time they try to talk to MGS (sending their periodic PINGs
Actually, note that it is conflicting with the existing ext4progs, not ext2, so it
should not be all that hard.
Besides, lustre-enabled ext2 should have all the stuff that's already in
ext4progs, I would imagine.
On Nov 22, 2010, at 11:05 PM, Alexey Lyashkov wrote:
removing e2fsprogs from live system
Hello!
On Nov 18, 2010, at 7:18 AM, Herbert Fruchtl wrote:
Rebooting the client doesn't change anything. Is it broken, or is there some
persistent information that I need to flush? When I do an ls on a partially
broken directory, I get the following two lines in /var/log/messages:
Nov 18
Hello!
So are there any other complaints on the OSS node when you mount that OST?
Did you try to run e2fsck on the OST disk itself (while unmounted)? I assume
one of the possible problems is just on-disk fs corruption
(and it might show unhealthy due to that right after mount too).
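As a rough sketch of that check (device and mount point names are examples only):
umount /mnt/ost0        # make sure the OST is not mounted
e2fsck -fn /dev/sdX     # read-only pass first; rerun with -fp or -fy to actually repair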
Bye,
Hello!
On Sep 3, 2010, at 4:52 PM, John White wrote:
Can someone help me out figuring out what's wrong here? We have an
MDS/T that keeps causing problems. I have 2 dozen or so dumps from threads
crashing. The threads in question all appear to be either ll_mdt_rdpg_ ? or
samba do
that is different? We are using lustre to replace our old nfs server
for serving up home directories in our cluster and the rest of our
systems.
On Fri, Aug 27, 2010 at 6:15 PM, Oleg Drokin oleg.dro...@oracle.com wrote:
Hello!
On Aug 27, 2010, at 6:41 PM, David Noriega wrote:
But I
Hello!
On Aug 26, 2010, at 1:07 PM, Dulcardo Arteaga Clavijo wrote:
I am trying to compare the performance of Lustre for parallel write to
a shared file with locks and
without locks. But after doing some experiments I didn't see any
performance improvement when I run without locks.
It all
Hello!
On Aug 27, 2010, at 6:41 PM, David Noriega wrote:
But I also found out about the flock option for lustre. Should I set
flock on all clients? or can I just use localflock option on the
fileserver?
It depends.
If you are 100% sure none of your other clients use flocks in a way similar to
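For reference, the two mount options being compared look like this (MGS NID and filesystem name are placeholders):
mount -t lustre -o flock mgs@tcp0:/lustre /mnt/lustre        # coherent flock/fcntl locking across all clients
mount -t lustre -o localflock mgs@tcp0:/lustre /mnt/lustre   # locking is only consistent within a single node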
Hello!
You would need to upgrade your clients to at least 1.8.2, otherwise you might
hit bug 19128
during replay, which would lead to losing some of the data being replayed.
The version on the routers is not important for the async journals feature.
Bye,
Oleg
On Aug 21, 2010, at 10:06 AM, Erik
Hello!
On Aug 19, 2010, at 7:07 PM, Andreas Dilger wrote:
If you want to flush all the memory used by a Lustre client between jobs, you
can do lctl set_param ldlm.namespaces.*.lru_size=clear. Unlike Kevin's
suggestion it is Lustre-specific, while drop_caches will try to flush memory
from
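The two commands being compared are, on the client:
lctl set_param ldlm.namespaces.*.lru_size=clear   # Lustre-specific: drops the locks and the pages they cover
echo 3 > /proc/sys/vm/drop_caches                 # generic: asks the VM to drop pagecache, dentries and inodes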
Hello!
On Aug 4, 2010, at 3:41 AM, Andreas Dilger wrote:
mkdir(/mnt/lustre/blah2/b/c/d/e/f/g, 040755) = 0
+1 RPC
lstat(/mnt/lustre/blah2/b/c/d/e/f/g, {st_mode=S_IFDIR|0755, st_size=4096,
...}) = 0
+1 RPC
If we do the mkdir(), the client does not cache the entry?
No. mkdir cannot return
Hello!
On Aug 4, 2010, at 2:04 PM, Daire Byrne wrote:
Hm, initially I was going to say that find is not open-intensive so it should
not benefit from opencache at all.
But then I realized if you have a lot of dirs, then indeed there would be a
positive impact on subsequent reruns.
I assume
Hello!
On Aug 3, 2010, at 12:49 PM, Daire Byrne wrote:
So even with the metadata going over NFS the opencache in the client
seems to make quite a difference (I'm not sure how much the NFS client
caches though). As expected I see no mdt activity for the NFS export
once cached. I think it would
like opencache isn't generally
useful unless enabled on every node. Is there an easy way to force files out
of the cache (i.e., echo 3 > /proc/sys/vm/drop_caches)?
Kevin
On Aug 3, 2010, at 11:50 AM, Oleg Drokin oleg.dro...@oracle.com wrote:
Hello!
On Aug 3, 2010, at 12:49 PM, Daire
Hello!
On Aug 3, 2010, at 10:59 PM, Jeremy Filizetti wrote:
Another consideration for WAN performance when creating files is the stripe
count. When you start writing to a file, the first RPC to each OSC requests
the lock, rather than requesting the lock from all OSCs when the first lock is
Hello!
On Jul 30, 2010, at 7:20 AM, Daire Byrne wrote:
Ah yes... that makes sense. I recall the opencache gave a big boost in
performance for NFS exporting but I wasn't sure if it had become the
default. I haven't been keeping up to date with Lustre developments.
It was default for NFS for
Hello!
On May 3, 2010, at 11:49 AM, Thomas Roth wrote:
We found a user job submission script that probably caused all this by
starting
- several hundred (900) jobs simultaneously
- all of them opening one and the same file for batch system errors and
one and the same file for its output.
Hello!
On Apr 27, 2010, at 7:29 PM, Brian Andrus wrote:
Apr 27 16:15:19 nas-0-1 kernel: LustreError:
4133:0:(ldlm_lib.c:1848:target_send_reply_msg()) @@@ processing error (-107)
r...@810669d35c50 x1334203739385128/t0 o400-?@?:0/0 lens 192/0 e 0
to 0 dl 1272410135 ref 1 fl
Hello!
On Apr 27, 2010, at 9:38 PM, Brian Andrus wrote:
Odd, I just went through the log on the MDT and basically it has been
repeating those errors for over 24 hours (not spewing, but often enough).
only ONE other line on an ost:
Each such message means there was an attempt to send a ping
Hello!
On Mar 18, 2010, at 1:36 PM, Roy Dragseth wrote:
Is it possible to fsck on a disabled and drained OST that is mounted readonly?
We need to fsck an OST and would like to avoid a lengthy downtime while doing
it. My plan is to disable and drain the files from the OST and then remount
Hello!
This only works if all the requests are for the same file; then it is done
for you automatically
(assuming that these are write requests and there is no sync in between).
It's impossible to do
for reads, for the obvious reason that read is a synchronous operation and by
the time we
Hello!
On Mar 5, 2010, at 5:25 PM, Andreas Dilger wrote:
On 2010-03-05, at 15:18, Jagga Soorma wrote:
Is there an impact if the option is turned on, or only if it is
turned on and used? Is the impact local to the file being locked,
the machine on which that file is locked, or the entire
Hello!
On Mar 3, 2010, at 6:35 PM, Jeffrey Bennett wrote:
We are building a very small Lustre cluster with 32 clients (patchless) and
two OSS servers. Each OSS server has 1 OST with 1 TB of Solid State Drives.
All is connected using dual-port DDR IB.
For testing purposes, I am
Hello!
On Feb 28, 2010, at 9:31 PM, huangql wrote:
We have a problem where the MDS has a high load value and the system CPU is up to
60% when running the chown command on a client. It's strange that the load value
and system CPU didn't decrease to the normal level once it got high.
Even we
Hello!
On Nov 19, 2009, at 7:06 AM, Phil Schwan wrote:
Hello old friends! I return with a gift, like an almost-forgotten
uncle visiting from a faraway land.
Long time no see! ;)
I have an interesting issue, on 1.6.6:
# cat /proc/fs/lustre/version
lustre: 1.6.6
kernel: patchless
build:
Hello!
On Oct 5, 2009, at 4:40 PM, Hendelman, Rob wrote:
It looks like the threads finally died. The 2 CPU cores that were
pegged at 100% are idle again.
That seems like one heck of a timeout...
Was there a client eviction right before this message?
The watchdog trace from your previous
Hello!
On Sep 26, 2009, at 1:57 AM, Nick Jennings wrote:
About an hour ago the client completely hung. Hosting co. says it was
a kernel panic. I got no useful feedback in /var/log/messages from
the
client or the MDS. However from the OST I got several complaints.
(below).
Does anyone
Hello!
On Sep 26, 2009, at 9:37 AM, Brian J. Murrell wrote:
Unfortunately that was the only info I could get. The client had no
information in the logs about what happened.
They usually don't when they panic.
Right.
RHEL is configured to have panic on oops too; if you disable that (in /
Hello!
On Sep 23, 2009, at 7:47 AM, Lukas Hejtmanek wrote:
I limit oss_num_threads instead?
Yes. (they are the same thing anyway)
Thanks Oleg. One more question, this limit is per kernel module or
per OST
mount? E.g., I have 1 physical server that hosts 2 OST servers -
OST0, OST1.
This
Hello!
On Sep 22, 2009, at 7:10 AM, Lukas Hejtmanek wrote:
On Thu, Sep 17, 2009 at 04:17:54PM -0400, Oleg Drokin wrote:
If you bring down the load on the OSTs (read this list; recently there were
several methods discussed, like bringing down the number of service threads),
that should help
Hello!
On Sep 20, 2009, at 3:51 PM, Geoff Lustre wrote:
Dear List
The excellent wiki:
http://www.inter-mezzo.org/index.php?title=MDS_striping_format
states that
Note: Limits for stripe settings are:
• Maximum striping count for a single file is 160.
This is still current (work in
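For reference, the stripe count is set per file or directory with lfs, e.g. (values are examples):
lfs setstripe -c 8 /mnt/lustre/dir     # new files in dir are striped over 8 OSTs; -c -1 means all OSTs
lfs getstripe /mnt/lustre/dir/file     # show how an existing file is striped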
Hello!
On Sep 17, 2009, at 7:28 AM, Lukas Hejtmanek wrote:
LustreError: 11-0: an error occurred while communicating with
x.x@tcp.
The mds_connect operation failed with -16
Lustre: Request x112815827 sent from stable-OST0001-osc-
8802855b7800 to
NID x.x@tcp 100s ago has timed
Hello!
On Sep 11, 2009, at 1:17 AM, Muruga Prabu M wrote:
I have a small java application that uploads images into the lustre
filesystem. When I try to upload images from the application, the
MDS server crashes and a kernel panic happens. I have attached the
output of the 'dmesg', and the
Hello!
On Sep 11, 2009, at 9:33 AM, Aaron Knister wrote:
Is the read cache corruption actually causing on-disk corruption? Or
just in-memory corruption? I'm assuming the write cache corruption
would end up causing the file to become corrupt on disk, but if a
node crashes during a write
Hello!
On Sep 9, 2009, at 1:31 PM, Rafael David Tinoco wrote:
One of my OSSs crashes, sometimes one, sometimes another. With the
following error:
That's not a crash.
That's a watchdog timeout, indicative of Lustre spending too much time
waiting on IO.
As such you need to somehow decrease the
Hello!
On Sep 9, 2009, at 2:07 PM, Charles A. Taylor wrote:
Anyway, your email concerned us so we issued the recommended commands
on our OSSs to disable the caching. That promptly crashed two of our
OSSs. We got the servers back up and after fsck'ing (fsck.ext4) all
the OSTs and
Hello!
Any chance you can use a more modern release like 1.8.1? A
number of bugs were fixed, including some readahead-logic issues that could
impede read performance.
Bye,
Oleg
On Aug 20, 2009, at 10:38 PM, Alvaro Aguilera wrote:
Thanks for pointing that out. I was using the
Hello!
On Aug 18, 2009, at 4:27 AM, Patricia Santos Marco wrote:
Our MDT has Lustre 1.6.7. I see in this message
http://lists.lustre.org/pipermail/lustre-discuss/2009-April/010167.html
that this version has a bug that causes directory corruption on
the MDT. Can this bug produce this
Hello!
On Aug 18, 2009, at 8:23 AM, Mag Gam wrote:
just curious, if you didn't compile your own kernel, how do you apply
this patch? Is our only option to upgrade via RPMS or is there another
way to apply the patch?
This patch is to lustre itself, not to a kernel.
So you just need lustre
Hello!
On Aug 17, 2009, at 2:14 PM, Patricia Santos Marco wrote:
The last day our MDS refusing conections too. The logs are the same,
and we should reboot the MDS server . What's is the reason for this?
That means some requests from this client are still being processed
and the server has a
Hello!
On Aug 10, 2009, at 9:39 AM, Wolfgang Stief wrote:
Before I start installing and fiddling around: Are there any reasons
AGAINST setting up a Lustre playground in a VirtualBox environment? I
just want to play around w/ recovery and debugging situations and
upgrades. No performance
Hello!
On Aug 10, 2009, at 11:03 PM, Jim McCusker wrote:
On Monday, August 10, 2009, Oleg Drokin oleg.dro...@sun.com wrote:
What lustre version is it now?
We used to have uncontrolled unlinking where OSTs might get swamped
with
unlink requests.
Now we limit it to 8 unlinks to an OST at any
Hello!
On Aug 6, 2009, at 12:57 PM, Thomas Roth wrote:
Hi,
these ll_inode_revalidate_fini errors are unfortunately quite known
to us.
So what would you guess if that happens again and again, on a number
of
clients - MDT softly dying away?
No, I do not think this is an MDT problem of any
Hello!
On Aug 5, 2009, at 3:12 PM, Daniel Kulinski wrote:
What would cause the following error to appear?
Typically this is some sort of a race where you presume an inode exists
(because you have some traces of it in memory),
but it does not exist anymore (on the MDS, anyway). So when the client comes to
Hello!
On Jul 31, 2009, at 3:15 AM, Guillaume Demillecamps wrote:
All servers and clients are running Lustre 1.8, on SLES 10 SP2. Clients
use patchless kernels, using the same base revision as the ones for the
patched-kernel servers.
We recurrently encounter this error:
Chances are you are
Hello!
On Jul 24, 2009, at 7:04 PM, Andreas Dilger wrote:
On Jul 24, 2009 15:29 -0700, John White wrote:
So we have a new file system set up. beefy OSTs, but certainly
under-
sized metadata (we're still figuring out what we'll use in the end).
We've just started to do friendly-user
Hello!
On Jul 17, 2009, at 2:01 PM, Ettore Enrico Delfino Ligorio wrote:
In my experience, the integration of the most recent kernels with
glusterfs and with Xen hypervisor patches works well. The same with
Lustre is harder to do.
Works out of the box for me both with rhel5 kernels (that
Hello!
On May 15, 2009, at 7:39 AM, Ralf Utermann wrote:
so now I am sure to have libcfs-* enabled modules (probably the Debian
packages also had it, it's not disabled in the configure call) and
did this test
again, however I still do not get any debug lines after accessing
the NFS
Hello!
On May 14, 2009, at 4:05 AM, Ralf Utermann wrote:
Hm, that's really strange.
I hope you did not build your Lustre with --disable-libcfs-*
configure
options?
how can I check this? The modules have been built with debian
utilities (m-a build ...)
I suppose you can take a look at the
Hello!
On May 13, 2009, at 7:53 AM, Ralf Utermann wrote:
What might be useful is if you can reproduce this quickly on as few a set
of Lustre nodes as possible.
Remember your current /proc/sys/lnet/debug value.
On the lustre-client/nfs-server and on the MDS: echo -1 > /proc/sys/lnet/debug,
then do lctl
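A minimal sketch of that capture sequence (the log path is an example):
cat /proc/sys/lnet/debug           # note the current mask so you can restore it later
echo -1 > /proc/sys/lnet/debug     # enable all debug flags on the client/NFS server and the MDS
# reproduce the problem, then dump the kernel debug buffer:
lctl dk /tmp/lustre-debug.log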
Hello!
On May 13, 2009, at 10:48 AM, Ralf Utermann wrote:
Oleg Drokin wrote:
[...]
Either Lustre never got any control at all and your problem is
unrelated to lustre and related to something else in your system, or
the logging is broken somewhat. The way to test it is to do ls -la
/mnt
Hello!
On May 13, 2009, at 8:35 AM, Mag Gam wrote:
I have an application for which I would like to use Lustre as the backing
storage. However, the application (MonetDB) uses mmap(). Would the
application have any problems if using Lustre as its backing storage?
There should be no problems in
Hello!
On Apr 20, 2009, at 6:04 PM, Lukas Hejtmanek wrote:
On Mon, Apr 20, 2009 at 02:42:40PM -0600, Andreas Dilger wrote:
The core looks like this:
#1 0x2b7d5ff825a2 in DumpModeDecode (tif=0x58cdd0,
buf=0xf7f7f7f5f5f5f6f6
Address 0xf7f7f7f5f5f5f6f6 out of bounds, cc=76800, s=2016)
Hello!
On Apr 13, 2009, at 12:55 PM, Jim Garlick wrote:
2) some quick tests of MDS create rates (through lustre now) on the
SSD
and DDN hardware where we seemed to get about 2350 creates/sec no
matter
what hardware we used, and posts from Oleg on this mailing list
indicating
that
Hello!
On Apr 6, 2009, at 10:41 AM, Peter Kjellstrom wrote:
On Friday 03 April 2009, Thomas Wakefield wrote:
Any idea on the timeline for 1.6.7.1 ? Will it be out today, or just
sometime soon?
Knowing if it's hours away or awaiting a complete qa-cycle would be
nice. That
would decide if
Yes, it is for dirty cache limiting on a per-osc basis.
There is also /proc/fs/lustre/llite/*/max_cached_mb that regulates how
much cached
data per client you can have. (default is 3/4 of RAM)
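For example, to check or lower that limit on a client (the value is illustrative):
lctl get_param llite.*.max_cached_mb
lctl set_param llite.*.max_cached_mb=4096   # cap the client-side data cache at 4GB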
On Apr 3, 2009, at 2:52 PM, Lundgren, Andrew wrote:
The parameter is called dirty, is that write
Hello!
On Mar 30, 2009, at 7:06 AM, Simon Latapie wrote:
I currently have a lustre system with 1 MDS, 2 OSS with 2 OSTs each,
and
37 lustre clients (1 login and 36 compute nodes), all using infiniband
as lustre network (o2ib). All nodes are on 1.6.5.1 patched kernel.
For the past two
Hello!
On Mar 24, 2009, at 3:19 PM, Jay Christopherson wrote:
If I have 5 clients, two of which are running an app which requires
fcntl style file locking, do I need to mount lustre with -o flock on
all five clients, or just the two that are using fcntl?
Just two would be fine.
What
Hello!
On Mar 16, 2009, at 5:41 AM, pascal.dev...@bull.net wrote:
Could anyone tell me if I made a mistake, if Lustre does not support
the
group lock or if it is a bug in Lustre ?
Thank you for bringing this to our attention.
Please file a bug. This is a bug in lustre introduced by lockless
Hello!
On Feb 10, 2009, at 12:11 PM, Simon Kelley wrote:
We are also seeing some userspace file operations fail with the
error
"No locks available". These don't generate any logging on the
client so
I don't have exact timing. It's possible that they are associated
with
further ###
Hello!
On Feb 10, 2009, at 12:46 PM, Simon Kelley wrote:
If, by the complete event you mean the received cancel for
unknown cookie, there's not much more to tell. Grepping through the
last month's server logs shows that there are bursts of typically
between 3 and 7 messages, at the same
Hello!
On Jan 29, 2009, at 9:58 PM, Satoshi Isono wrote:
http://wiki.lustre.org/index.php?title=Lustre_FAQ
* What is the maximum number of files in a single file system? In
a single directory?
So, if we use current Lustre 1.6.x on EXT3, we can only support a
single MDT. Then, according
Hello!
On Jan 24, 2009, at 9:08 PM, Craig Prescott wrote:
* any problem (from Lustre's perspective) to run the NFS server and
Samba server from the same client?
No.
* on the NFS/Samba server host, should I mount with certain options,
such
as -oflock?
If you mount with -o flock and plan
Hello!
On Jan 25, 2009, at 6:56 PM, Wojciech Turek wrote:
For my particular case it gives 512 ost_num_threads, which is the
Lustre maximum for this particular parameter. The manual says that each thread
actually uses 1.5MB of RAM, so 768MB of RAM will be consumed on each of
my OSSs for
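One way to cap that, assuming you go the module-option route (the value is only an example), is a line in /etc/modprobe.conf on the OSS:
options ost oss_num_threads=128   # each thread costs roughly 1.5MB of RAM, per the manual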
Hello!
On May 16, 2008, at 6:45 AM, Patrick Winnertz wrote:
As I wrote in #11742 [1] I experienced a kernel panic after doing
heavy I/O
on the 1.6.5rc2 cluster on the MDS. Since nobody has answered this bug
until now (and I think in other cases the lustre team is _really_ fast
(thanks for