If we are talking about requests being blocked for 60+ seconds, those tunings
might not help (they help a lot with average latency during recovery/backfill).
It would be interesting to see the logs for those blocked requests on the OSD
side (they are logged at level 0); the pattern to search for might be "slow
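A quick sketch of scanning for such entries (the log path and the exact wording of the slow-request line are assumptions; adjust them for your cluster):

```python
import re

# Hypothetical sample of OSD log lines; on a real cluster you would read
# /var/log/ceph/ceph-osd.<id>.log instead (the path is an assumption).
log_lines = [
    "2015-09-11 10:12:01.123 osd.3 ... slow request 62.341 seconds old",
    "2015-09-11 10:12:02.456 osd.3 ... ping reply from osd.7",
]

# Match "slow request" entries and pull out how long they were blocked.
pattern = re.compile(r"slow request (\d+\.\d+) seconds")
blocked = []
for line in log_lines:
    m = pattern.search(line)
    if m:
        blocked.append(float(m.group(1)))

print(blocked)  # requests blocked longer than 60 s are the interesting ones
```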
On 09/10/2015 10:56 PM, Robert LeBlanc wrote:
> Things I've tried:
>
> * Lowered nr_requests on the spindles from 1000 to 100. This reduced
> the max latency, sometimes up to 3000 ms, down to a max of 500-700 ms.
> It has also reduced the huge swings in latency, but has also reduced
> throughput
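For reference, that tuning is just a write to the block queue's sysfs knob; a sketch, with a temporary file standing in for the real /sys path (the device name sdb would be an assumption anyway):

```python
import tempfile, os

# On a real host this would be /sys/block/sdb/queue/nr_requests
# (the device name is an assumption); a temp file stands in for it here.
fd, queue_file = tempfile.mkstemp()
os.close(fd)

with open(queue_file, "w") as f:
    f.write("1000")              # a common kernel default

# Lowering nr_requests caps how many requests can queue up in the
# elevator, trimming worst-case latency at some cost in throughput.
with open(queue_file, "w") as f:
    f.write("100")

with open(queue_file) as f:
    new_depth = f.read()
print(new_depth)
os.remove(queue_file)
```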
Is there a thread on the mailing list (or LKML?) with some background about
tcp_low_latency and TCP_NODELAY?
Bill
On Fri, Sep 11, 2015 at 2:30 AM, Jan Schermer wrote:
> Can you try
>
> echo 1 > /proc/sys/net/ipv4/tcp_low_latency
>
> And see if it improves things? I remember
Check this..
http://www.spinics.net/lists/ceph-users/msg16294.html
http://tracker.ceph.com/issues/9344
Thanks & Regards
Somnath
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Bill
Sanders
Sent: Friday, September 11, 2015 11:17 AM
To: Jan Schermer
Cc: Rafael Lopez;
Hello Cephers!
I have an interesting task from our client.
The client has 3000+ video cameras (monitoring streets, porches, entrances,
etc.), and we need to store the data from these cams for 30 days.
Each cam generates 1.3 TB of data over 30 days; the total bandwidth is 14 Gbit/s.
In total we need ( 1.3 × 3000 ) ~4 PB+
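A quick back-of-the-envelope check of those figures (plain arithmetic, no Ceph specifics; raw capacity, before any replication overhead):

```python
cams = 3000
tb_per_cam = 1.3                          # TB per camera over 30 days

total_tb = cams * tb_per_cam              # raw capacity for 30 days
print(round(total_tb))                    # ~3900 TB, i.e. roughly 4 PB

# Cross-check the stated 14 Gbit/s aggregate against the per-cam volume:
seconds = 30 * 24 * 3600
per_cam_bps = tb_per_cam * 1e12 * 8 / seconds   # bits per second per camera
total_gbit = cams * per_cam_bps / 1e9
print(round(total_gbit, 1))               # ~12 Gbit/s, consistent with ~14
```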
- Original Message -
> From: "Wido den Hollander"
> To: "ceph-users"
> Sent: Friday, 11 September, 2015 6:46:11 AM
> Subject: [ceph-users] 9 PGs stay incomplete
>
> Hi,
>
> I'm running into an issue with Ceph 0.94.2/3 where after doing a recovery
>
It’s a long shot, but check if librados is installed.
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Daleep
Bais
Sent: 11 September 2015 10:18
To: Jake Young ; p...@daystrom.com
Cc: Ceph-User
Subject: Re: [ceph-users] RBD with
Well, if you plan for each OSD to have 2 GB per daemon and it suddenly eats
4x as much RAM, you might get the cluster into an unrecoverable state if you
can't just increase the amount of RAM at will. I managed to recover it
because I had only 4 OSDs per machine, but I can't imagine what would
happen on 36 OSD
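The arithmetic behind that worry is simple enough to sketch (the 2 GB baseline and node sizes come from this thread; the 4x spike factor is the behaviour described above):

```python
def ram_needed_gb(osds_per_node, per_osd_gb=2, spike_factor=4):
    """Worst-case RAM if every OSD daemon balloons by spike_factor."""
    return osds_per_node * per_osd_gb * spike_factor

print(ram_needed_gb(4))    # 32 GB: recoverable on a modest node
print(ram_needed_gb(36))   # 288 GB: hard to provision on short notice
```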
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Somnath Roy
> Sent: 11 September 2015 06:23
> To: Rafael Lopez
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] bad perf for librbd vs krbd using FIO
>
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Mark Nelson
> Sent: 10 September 2015 16:20
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] higher read iop/s for single thread
>
> I'm not sure you will be able to get there with
Hi Jake, Hello Paul,
I was able to mount the iscsi target to another initiator. However, after
installing the tgt and tgt-rbd, my rbd was not working. Getting error
message :
*root@ceph-node1:~# rbd ls test1*
*rbd: symbol lookup error: rbd: undefined symbol: _ZTIN8librados9WatchCtx*
I am using
Can you try
echo 1 > /proc/sys/net/ipv4/tcp_low_latency
And see if it improves things? I remember there being an option to disable
Nagle completely, but it's gone, apparently.
Jan
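Whether or not that sysctl still exists, Nagle can be disabled per socket with TCP_NODELAY; a minimal sketch:

```python
import socket

# Create a TCP socket and disable Nagle's algorithm on it.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Read the option back to confirm it took effect.
nodelay = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print(nodelay)  # -> 1
s.close()
```

Note that tcp_low_latency is system-wide, while TCP_NODELAY is per socket; if I recall correctly, Ceph's messenger already sets it (the ms_tcp_nodelay option), so the sysctl would mainly affect other traffic.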
> On 11 Sep 2015, at 10:43, Nick Fisk wrote:
>
>> -Original Message-
>>
Hi Arnoud
On 26/05/15 16:53, Arnoud de Jonge wrote:
> Hi,
[...]
>
> 2015-05-26 17:43:37.352569 7f0fce0ff840 0 ceph version 0.94.1
> (e4bfad3a3c51054df7e537a724c8d0bf9be972ff), process radosgw, pid
> 4259 2015-05-26 17:43:37.435921 7f0f8a4f2700 0
On Wed, 09 Sep 2015 08:59:53 -0500, Chad William Seys
wrote:
>
> > Going from 2GB to 8GB is not normal, although some slight bloating is
> > expected.
>
> If I recall correctly, Mariusz's cluster had a period of flapping OSDs?
The NIC got packet loss under traffic, which
On Wed, Sep 9, 2015 at 11:22 AM, HEWLETT, Paul (Paul)
wrote:
> By setting a parameter osd_max_write_size to 2047…
> This normally defaults to 90
>
> Setting to 2048 exposes a bug in Ceph where signed overflow occurs...
>
> Part of the problem is my expectations.
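That 2047-works / 2048-breaks boundary is what you would expect if the write size ends up in a signed 32-bit byte count somewhere (an assumption about the internals, but the arithmetic lines up):

```python
INT32_MAX = 2**31 - 1  # 2147483647

mb = 1024 * 1024
print(2047 * mb <= INT32_MAX)  # True: 2047 MB still fits in int32
print(2048 * mb > INT32_MAX)   # True: 2048 MB is exactly 2^31, one past max
```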
If you really want to improve the performance of a *distributed* filesystem
like Ceph, Lustre, or GPFS,
you must consider the networking stack of the Linux kernel:
L5: Socket
L4: TCP
L3: IP
L2: Queuing
In this discussion, the problem could be in L2, which is queuing at the
descriptor level.
We may have to take a closer
> note that I only did it after most of the PGs were recovered
My guess / hope is that heap free would also help during the recovery process.
Recovery causing failures does not seem like the best outcome. :)
C.
There should be some complaints in /var/log/messages.
Can you attach it?
Shinobu
- Original Message -
From: "谷枫"
To: "ceph-users"
Sent: Saturday, September 12, 2015 1:30:49 PM
Subject: [ceph-users] ceph-fuse auto down
Hi all,
My cephfs
Ah, you are using Ubuntu, sorry about that.
How about:
/var/log/dmesg
I believe you can attach a file rather than pasting.
Pasting a bunch of logs would not be good for me ;-)
And when did you notice that cephfs was hung?
Shinobu
- Original Message -
From: "谷枫"
To: "Shinobu
On Wed, Sep 9, 2015 at 5:34 PM, Gregory Farnum wrote:
> On Wed, Sep 9, 2015 at 3:27 PM, Kyle Hutson wrote:
>> We are using Hammer - latest released version. How do I check if it's
>> getting promoted into the cache?
>
> Umm...that's a good question. You
Hi all,
My cephfs cluster is deployed on three nodes with Ceph Hammer 0.94.3 on Ubuntu
14.04; the kernel version is 3.19.0.
I mount cephfs with ceph-fuse on 9 clients, but some of them (the ceph-fuse
process) go down automatically sometimes, and I can't find the reason; there
seem to be no other logs to be found
> In your procedure, the umount problems have nothing to do with
> corruption. It's (sometimes) hanging because the MDS is offline. If
How did you notice that the MDS was offline?
Is it just because the ceph client could not unmount the filesystem, or
something else? I would like to see the logs from the MDS and OSDs.
Hi Shinobu,
There is no /var/log/messages on my system, but I looked at /var/log/syslog
and found no useful messages.
I discovered /var/crash/_usr_bin_ceph-fuse.0.crash by grepping for "fuse"
on the system.
Below is the message in it:
ProcStatus:
Name: ceph-fuse
State: D (disk sleep)
Tgid:
On Fri, Sep 11, 2015 at 9:52 AM, Nick Fisk wrote:
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>> Mark Nelson
>> Sent: 10 September 2015 16:20
>> To: ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] higher read
On Thu, Sep 10, 2015 at 9:46 PM, Wido den Hollander wrote:
> Hi,
>
> I'm running into an issue with Ceph 0.94.2/3 where after doing a recovery
> test 9 PGs stay incomplete:
>
> osdmap e78770: 2294 osds: 2294 up, 2294 in
> pgmap v1972391: 51840 pgs, 7 pools, 220 TB data, 185 Mobjects
On Fri, Sep 11, 2015 at 11:57 AM, M.Tarkeshwar Rao
wrote:
> Hi all,
>
> We have a product which is written in c++ on Red hat.
>
> In production our customers use our product with the Veritas cluster file
> system for HA and as shared storage (EMC).
>
> Initially this product
Hi Tarkeshwar,
CephFS is not currently considered ready for production use, mainly because
there is no fsck tool. There are people using it, so YMMV.
However, if this app is written in house, is there any chance you could change
it to write objects directly into the RADOS layer? The RADOS
Dropwatch.stp would help us see where packets were dropped.
To investigate further regarding networking,
I always check:
/sys/class/net/<iface>/statistics/*
The tc command is also quite useful.
Have we already checked whether there is any bo (blocks written out) using
vmstat?
Using vmstat
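For the archives, a small sketch of collecting those per-interface counters (a temporary directory stands in for /sys/class/net/<iface>/statistics, and the counter values are made up, so the snippet runs anywhere):

```python
import os, tempfile

def read_net_stats(stats_dir):
    """Read every counter file in a sysfs-style statistics directory."""
    stats = {}
    for name in os.listdir(stats_dir):
        with open(os.path.join(stats_dir, name)) as f:
            stats[name] = int(f.read())
    return stats

# Mock of /sys/class/net/eth0/statistics (interface name is an assumption).
d = tempfile.mkdtemp()
for counter, value in {"rx_dropped": 42, "tx_errors": 0}.items():
    with open(os.path.join(d, counter), "w") as f:
        f.write(str(value))

stats = read_net_stats(d)
print(stats["rx_dropped"])  # a nonzero value points at the queuing layer
```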
On 11-09-15 12:22, Gregory Farnum wrote:
> On Thu, Sep 10, 2015 at 9:46 PM, Wido den Hollander wrote:
>> Hi,
>>
>> I'm running into an issue with Ceph 0.94.2/3 where after doing a recovery
>> test 9 PGs stay incomplete:
>>
>> osdmap e78770: 2294 osds: 2294 up, 2294 in
>> pgmap