Re: [ceph-users] CephFS: No space left on device

2016-10-12 Thread Yan, Zheng
1. Compile Ceph from source:
   https://github.com/ukernel/ceph/tree/jewel-cephfs-scan-links
2. Run 'ceph daemon mds.xxx flush journal' to flush the MDS journal
3. Stop all MDS daemons
4. Run 'cephfs-data-scan scan_links'
5. Restart the MDS daemons
6. Run 'ceph daemon mds.x scrub_path / recursive repair'
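For illustration, the same sequence as concrete commands; the MDS name
(mds.ceph01) and the systemd unit names below are assumptions, so adjust
them to your cluster:

  $ ceph daemon mds.ceph01 flush journal            # flush the MDS journal
  $ systemctl stop ceph-mds@ceph01                  # repeat on every MDS host
  $ cephfs-data-scan scan_links                     # binary built from the branch above
  $ systemctl start ceph-mds@ceph01                 # restart the MDS
  $ ceph daemon mds.ceph01 scrub_path / recursive repair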


On Wed, Oct 12, 2016 at 9:51 PM, Davie De Smet
 wrote:
> Hi,
>
> That sounds great. I'll certainly try it out.
>
> Kind regards,
>
> Davie De Smet
>
> -Original Message-
> From: Yan, Zheng [mailto:uker...@gmail.com]
> Sent: Wednesday, October 12, 2016 3:41 PM
> To: Davie De Smet 
> Cc: Gregory Farnum ; ceph-users 
> Subject: Re: [ceph-users] CephFS: No space left on device
>
> I have written a tool that fixes this type of error. I'm currently testing 
> it. Will push it out tomorrow
>
> Regards
> Yan, Zheng
>
> On Wed, Oct 12, 2016 at 9:18 PM, Davie De Smet  
> wrote:
>> Hi Gregory,
>>
>> Thanks for the help! I've been looping over all trashcan files and the
>> number of strays is lowering. This is going to take quite some time as there
>> are a lot of files, but so far so good. If I should encounter any further
>> problems regarding this topic, I'll give this thread a heads up.
>>
>> Kind regards,
>>
>> Davie De Smet
>> Director Technical Operations and Customer Services, Nomadesk
>> +32 9 240 10 31 (Office)
>>
>> -Original Message-
>> From: Gregory Farnum [mailto:gfar...@redhat.com]
>> Sent: Wednesday, October 12, 2016 2:11 AM
>> To: Davie De Smet 
>> Cc: Mykola Dvornik ; John Spray
>> ; ceph-users 
>> Subject: Re: [ceph-users] CephFS: No space left on device
>>
>> On Tue, Oct 11, 2016 at 12:20 AM, Davie De Smet  
>> wrote:
>>> Hi,
>>>
>>> We do use hardlinks a lot. The application using the cluster has a built-in
>>> 'trashcan' functionality based on hardlinks. Obviously, all removed files
>>> and hardlinks are not visible anymore on the CephFS mount itself. Can I
>>> manually remove the strays on the OSDs themselves?
>>
>> No, definitely not. At least part of the problem is:
>> *) Ceph stores file metadata organized by its *path* location, not in a 
>> separate on-disk inode data structure like local FSes do.
>> *) When you hard link a file in CephFS, its "primary" location increments 
>> the link counter and its "remote" location just records the inode number 
>> (and it has to look up metadata later on-demand).
>> *) When you unlink the primary link, the inode data gets moved into the 
>> stray directory until one of the remote links comes calling.
>>
>>> Or do you mean that I'm required to do a small touch/write on all files that
>>> have not yet been deleted (this would be painful as the cluster is 200TB+)?
>>
>> Luckily, it doesn't take quite that much work. It looks like just doing a 
>> getattr on all the remote links in your system should do it.
>> If it's just your trash can, "ls -l" on that directory will probably
>> pull them in. Or you could delete the whole trashcan folder (set of
>> folders?) and they'll go away as well.
>> -Greg
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph website problems?

2016-10-12 Thread Dan Mick
Everything should have been back some time ago ( UTC or thereabouts)

On 10/11/2016 10:41 PM, Brian :: wrote:
> Looks like they are having major challenges getting that ceph cluster
> running again.. Still down.
> 
> On Tuesday, October 11, 2016, Ken Dreyer  > wrote:
>> I think this may be related:
>>
> http://www.dreamhoststatus.com/2016/10/11/dreamcompute-us-east-1-cluster-service-disruption/
>>
>> On Tue, Oct 11, 2016 at 5:57 AM, Sean Redmond  > wrote:
>>> Hi,
>>>
>>> Looks like the ceph website and related sub domains are giving errors for
>>> the last few hours.
>>>
>>> I noticed the below that I use are in scope.
>>>
>>> http://ceph.com/
>>> http://docs.ceph.com/
>>> http://download.ceph.com/
>>> http://tracker.ceph.com/
>>>
>>> Thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-osd activate timeout

2016-10-12 Thread Wukongming
Hi All,
I noticed that sometimes udev may receive a time-out message while activating
an OSD after rebooting the system. Something like this:

Oct 11 07:15:40 cvknode2 udevd[965]: timeout: killing '/usr/sbin/ceph-disk 
activate-journal /dev/sdb2' [1834] Oct 11 07:15:40 cvknode2 udevd[965]: 
'/usr/sbin/ceph-disk activate-journal /dev/sdb2' [1834] terminated by signal 9 
(Killed)

This causes the device or partition to be mounted on two directories, the
correct one and a temporary one:

/dev/sdb1 477630464 218421512 259208952 46% /var/lib/ceph/tmp/mnt.rzE_fa
/dev/sdb1 477630464 218421512 259208952 46% /var/lib/ceph/osd/ceph-2

When I commented out everything in /lib/udev/rules.d/95-ceph-osd.rules and
rebooted the system, there was no timeout and the OSD could be activated
normally. I have two questions:
1. What causes udev to receive the timeout message?
2. Can I simply comment out everything in /lib/udev/rules.d/95-ceph-osd.rules?
Since calling activate-all already activates the OSDs, why are the extra
activate and activate-journal calls needed?
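In case it is useful, a hedged sketch of cleaning up the duplicate mount by
hand (the device and tmp path are taken from the output above; adjust them to
your nodes):

  $ umount /var/lib/ceph/tmp/mnt.rzE_fa    # drop the leftover temporary mount
  $ ceph-disk activate /dev/sdb1           # or: ceph-disk activate-all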

Best regards,

-
wukongming ID: 12019
Tel:0571-86760239
Dept:ONEStor RD


-
This e-mail and its attachments contain confidential information from H3C,
which is intended only for the person or entity whose address is listed above.
Any use of the information contained herein in any way (including, but not
limited to, total or partial disclosure, reproduction, or dissemination) by
persons other than the intended recipient(s) is prohibited. If you receive
this e-mail in error, please notify the sender by phone or email immediately
and delete it!
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Server Down?

2016-10-12 Thread Lincoln Bryant
Hi Ashwin,

Seems the website is down. From another thread: 
http://www.dreamhoststatus.com/2016/10/11/dreamcompute-us-east-1-cluster-service-disruption/
 


I’ve been using the EU mirrors in the meanwhile: http://eu.ceph.com/ 
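For example, on a CentOS/RHEL node you can temporarily point the existing repo
file at the mirror; the file name below is the one usually created by the
install docs and is an assumption:

  $ sed -i 's#download.ceph.com#eu.ceph.com#g' /etc/yum.repos.d/ceph.repo
  $ yum clean metadata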


—Lincoln

> On Oct 12, 2016, at 4:15 PM, Ashwin Dev  wrote:
> 
> Hi,
> 
> I've been working on deploying ceph on a cluster. Looks like some of the main 
> repositories are down today - download.ceph.com . 
> It's been down since morning. 
> 
> Any idea what's happening? When can I expect it to be up?
> 
> Thanks!
> 
> -Ashwin
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Server Down?

2016-10-12 Thread Ashwin Dev
Hi,

I've been working on deploying ceph on a cluster. Looks like some of the
main repositories are down today - download.ceph.com. It's been down since
morning.

Any idea what's happening? When can I expect it to be up?

Thanks!

-Ashwin
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Map RBD Image with Kernel 3.10.0+10

2016-10-12 Thread Ilya Dryomov
On Wed, Oct 12, 2016 at 10:51 PM, Mike Jacobacci  wrote:
> Hi Ilya,
>
> I tried disabling feature sets, but nothing worked... What features in
> format 1 are different from format 2?

Format 1 images don't have any additional features, so there is nothing
to enable or disable there.  It sounds like you simply mistyped some of
your commands.  --format is for textual output (json, json-pretty, xml),
--image-format is for on-disk format.
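For illustration, with made-up pool/image names:

  $ rbd info rbd/myimage --format json-pretty          # --format: output encoding only
  $ rbd create rbd/myimage --size 1G --image-format 2  # --image-format: on-disk format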

Thanks,

Ilya
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Map RBD Image with Kernel 3.10.0+10

2016-10-12 Thread Ilya Dryomov
On Wed, Oct 12, 2016 at 10:35 PM, Mike Jacobacci  wrote:
> Figured it out finally!!  RBD images must be in format 1. I had to export
> the old image and import it as format 1; trying to create a format 1 image
> fails, as it doesn't like the --image-format option:
>
> "rbd: the argument for option '--format' is invalid"

You should be able to create format 1 images with:

  $ rbd create --size 1G --image-format 1 $IMAGENAME

That said, images don't have to be in format 1 - you just need to disable the
features unsupported by the kernel (or create images with a supported
feature set in the first place).  See
http://lists.opennebula.org/pipermail/ceph-users-ceph.com/2016-May/009635.html
for details.
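For example, a 3.10 kernel client generally only understands the layering
feature, so something along these lines is usually enough (image names are
assumptions):

  $ rbd feature disable rbd/myimage deep-flatten fast-diff object-map exclusive-lock
  $ rbd create rbd/newimage --size 1G --image-format 2 --image-feature layering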

Thanks,

Ilya
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Map RBD Image with Kernel 3.10.0+10

2016-10-12 Thread Mike Jacobacci
Figured it out finally!!  RBD images must be in format 1. I had to export
the old image and import it as format 1; trying to create a format 1 image
fails, as it doesn't like the --image-format option:

"rbd: the argument for option '--format' is invalid"

Is there a way to create rbd images with format 1 without having to convert
from format 2?

Cheers
Mike

On Wed, Oct 12, 2016 at 12:37 PM, Mike Jacobacci  wrote:

> Hi all,
>
> I need some help yet again... With my cluster back up and running Ceph
> 10.2.3, I am having problems again mounting an RBD image under Xenserver.
> I had this working before I broke everything and started over (previously
> on Ceph 10.1), I made sure to set tunables to legacy and disabled
> chooseleaf_vary_r which is how I got it working before.  Now when I try to
> manually mount the image:
>
> echo "192.168.10.39,192.168.10.40,192.168.10.41 name=admin,secret=secret
> xen vm0" > /sys/bus/rbd/add
>
> It fails with:
>
> echo: write error: No such device or address
>
> This is different from when I was having problems before getting this
> working; something must have changed in 10.2?   Below are more details, I
> hope it helps.
>
> *Here is the trace output:*
>
>> execve("/usr/bin/echo", ["echo", "192.168.10.39 name=admin,secret="...],
>> [/* 22 vars */]) = 0
>> brk(0)  = 0x220e000
>> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
>> = 0x7f6220ca2000
>> access("/etc/ld.so.preload", R_OK)  = -1 ENOENT (No such file or
>> directory)
>> open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
>> fstat(3, {st_mode=S_IFREG|0644, st_size=34586, ...}) = 0
>> mmap(NULL, 34586, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f6220c99000
>> close(3)= 0
>> open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
>> read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0
>> \34\2\0\0\0\0\0"..., 832) = 832
>> fstat(3, {st_mode=S_IFREG|0755, st_size=2107816, ...}) = 0
>> mmap(NULL, 3932736, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0)
>> = 0x7f62206c1000
>> mprotect(0x7f6220877000, 2097152, PROT_NONE) = 0
>> mmap(0x7f6220a77000, 24576, PROT_READ|PROT_WRITE,
>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b6000) = 0x7f6220a77000
>> mmap(0x7f6220a7d000, 16960, PROT_READ|PROT_WRITE,
>> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f6220a7d000
>> close(3)= 0
>> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
>> = 0x7f6220c98000
>> mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
>> = 0x7f6220c96000
>> arch_prctl(ARCH_SET_FS, 0x7f6220c96740) = 0
>> mprotect(0x7f6220a77000, 16384, PROT_READ) = 0
>> mprotect(0x606000, 4096, PROT_READ) = 0
>> mprotect(0x7f6220ca3000, 4096, PROT_READ) = 0
>> munmap(0x7f6220c99000, 34586)   = 0
>> brk(0)  = 0x220e000
>> brk(0x222f000)  = 0x222f000
>> brk(0)  = 0x222f000
>> open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
>> fstat(3, {st_mode=S_IFREG|0644, st_size=106065056, ...}) = 0
>> mmap(NULL, 106065056, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f621a19a000
>> close(3)= 0
>> fstat(1, {st_mode=S_IFREG|0200, st_size=4096, ...}) = 0
>> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
>> = 0x7f6220ca1000
>> write(1, "192.168.10.39 name=admin,secret="..., 85) = -1 ENXIO (No such
>> device or address)
>> close(1)= 0
>> munmap(0x7f6220ca1000, 4096)= 0
>> open("/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 1
>> fstat(1, {st_mode=S_IFREG|0644, st_size=2502, ...}) = 0
>> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
>> = 0x7f6220ca1000
>> read(1, "# Locale name alias data base.\n#"..., 4096) = 2502
>> read(1, "", 4096)   = 0
>> close(1)= 0
>> munmap(0x7f6220ca1000, 4096)= 0
>> open("/usr/share/locale/en_US.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY)
>> = -1 ENOENT (No such file or directory)
>> open("/usr/share/locale/en_US.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY)
>> = -1 ENOENT (No such file or directory)
>> open("/usr/share/locale/en_US/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1
>> ENOENT (No such file or directory)
>> open("/usr/share/locale/en.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) =
>> -1 ENOENT (No such file or directory)
>> open("/usr/share/locale/en.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) =
>> -1 ENOENT (No such file or directory)
>> open("/usr/share/locale/en/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1
>> ENOENT (No such file or directory)
>> write(2, "echo: ", 6echo: )   = 6
>> write(2, "write error", 11write error) = 11
>> open("/usr/share/locale/en_US.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) = -1
>> ENOENT (No such file or directory)
>> 

Re: [ceph-users] is the web site down ?

2016-10-12 Thread German Anders
I think that you can check it over here:

http://www.dreamhoststatus.com/2016/10/11/dreamcompute-us-east-1-cluster-service-disruption/

*German Anders*
Storage Engineer Leader
*Despegar* | IT Team
*office* +54 11 4894 3500 x3408
*mobile* +54 911 3493 7262
*mail* gand...@despegar.com

2016-10-12 16:44 GMT-03:00 Andrey Shevel :

> does anybody know when the site http://docs.ceph.com/docs/jewel/cephfs
> will be available ?
>
> thanks in advance
>
> --
> Andrey Y Shevel
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] is the web site down ?

2016-10-12 Thread Andrey Shevel
Does anybody know when the site http://docs.ceph.com/docs/jewel/cephfs
will be available?

thanks in advance

-- 
Andrey Y Shevel
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Map RBD Image with Kernel 3.10.0+10

2016-10-12 Thread Mike Jacobacci
Hi all,

I need some help yet again... With my cluster back up and running Ceph
10.2.3, I am having problems again mounting an RBD image under Xenserver.
I had this working before I broke everything and started over (previously
on Ceph 10.1), I made sure to set tunables to legacy and disabled
chooseleaf_vary_r which is how I got it working before.  Now when I try to
manually mount the image:

echo "192.168.10.39,192.168.10.40,192.168.10.41 name=admin,secret=secret
xen vm0" > /sys/bus/rbd/add

It fails with:

echo: write error: No such device or address

This is different from when I was having problems before getting this
working; something must have changed in 10.2?   Below are more details, I
hope it helps.
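If it helps, the same mapping can also be attempted through the rbd CLI, which
tends to report errors a bit more clearly; the client id below is an
assumption:

  $ rbd map xen/vm0 --id admin
  $ dmesg | tail     # the kernel usually logs why a map attempt was rejected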

*Here is the trace output:*

> execve("/usr/bin/echo", ["echo", "192.168.10.39 name=admin,secret="...],
> [/* 22 vars */]) = 0
> brk(0)  = 0x220e000
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
> 0x7f6220ca2000
> access("/etc/ld.so.preload", R_OK)  = -1 ENOENT (No such file or
> directory)
> open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
> fstat(3, {st_mode=S_IFREG|0644, st_size=34586, ...}) = 0
> mmap(NULL, 34586, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f6220c99000
> close(3)= 0
> open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
> read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0
> \34\2\0\0\0\0\0"..., 832) = 832
> fstat(3, {st_mode=S_IFREG|0755, st_size=2107816, ...}) = 0
> mmap(NULL, 3932736, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0)
> = 0x7f62206c1000
> mprotect(0x7f6220877000, 2097152, PROT_NONE) = 0
> mmap(0x7f6220a77000, 24576, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b6000) = 0x7f6220a77000
> mmap(0x7f6220a7d000, 16960, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f6220a7d000
> close(3)= 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
> 0x7f6220c98000
> mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
> 0x7f6220c96000
> arch_prctl(ARCH_SET_FS, 0x7f6220c96740) = 0
> mprotect(0x7f6220a77000, 16384, PROT_READ) = 0
> mprotect(0x606000, 4096, PROT_READ) = 0
> mprotect(0x7f6220ca3000, 4096, PROT_READ) = 0
> munmap(0x7f6220c99000, 34586)   = 0
> brk(0)  = 0x220e000
> brk(0x222f000)  = 0x222f000
> brk(0)  = 0x222f000
> open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
> fstat(3, {st_mode=S_IFREG|0644, st_size=106065056, ...}) = 0
> mmap(NULL, 106065056, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f621a19a000
> close(3)= 0
> fstat(1, {st_mode=S_IFREG|0200, st_size=4096, ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
> 0x7f6220ca1000
> write(1, "192.168.10.39 name=admin,secret="..., 85) = -1 ENXIO (No such
> device or address)
> close(1)= 0
> munmap(0x7f6220ca1000, 4096)= 0
> open("/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 1
> fstat(1, {st_mode=S_IFREG|0644, st_size=2502, ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
> 0x7f6220ca1000
> read(1, "# Locale name alias data base.\n#"..., 4096) = 2502
> read(1, "", 4096)   = 0
> close(1)= 0
> munmap(0x7f6220ca1000, 4096)= 0
> open("/usr/share/locale/en_US.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) =
> -1 ENOENT (No such file or directory)
> open("/usr/share/locale/en_US.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) =
> -1 ENOENT (No such file or directory)
> open("/usr/share/locale/en_US/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1
> ENOENT (No such file or directory)
> open("/usr/share/locale/en.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1
> ENOENT (No such file or directory)
> open("/usr/share/locale/en.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1
> ENOENT (No such file or directory)
> open("/usr/share/locale/en/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1
> ENOENT (No such file or directory)
> write(2, "echo: ", 6echo: )   = 6
> write(2, "write error", 11write error) = 11
> open("/usr/share/locale/en_US.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) = -1
> ENOENT (No such file or directory)
> open("/usr/share/locale/en_US.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1
> ENOENT (No such file or directory)
> open("/usr/share/locale/en_US/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT
> (No such file or directory)
> open("/usr/share/locale/en.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) = -1
> ENOENT (No such file or directory)
> open("/usr/share/locale/en.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1
> ENOENT (No such file or directory)
> open("/usr/share/locale/en/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No
> such file or directory)
> write(2, ": No such device or 

Re: [ceph-users] CephFS: No space left on device

2016-10-12 Thread Gregory Farnum
On Wed, Oct 12, 2016 at 7:18 AM, Davie De Smet
 wrote:
> Hi Gregory,
>
> Thanks for the help! I've been looping over all trashcan files and the number
> of strays is lowering. This is going to take quite some time as there are a lot
> of files, but so far so good. If I should encounter any further problems
> regarding this topic, I'll give this thread a heads up.

Note that it sounds like you've got a *very large* trash can
directory, and until we turn on directory fragmentation that can have
a pretty negative effect on your MDS' throughput. The usual mitigation
step is to bump the MDS cache size up as much as possible (which you
should do anyway); check the archives or docs for more as we discuss
it pretty often. :)
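
Concretely, something along these lines raises the cache limit at runtime (the
daemon name and the value are assumptions; mds_cache_size is an inode count):

  $ ceph daemon mds.ceph01 config set mds_cache_size 500000
  # or: ceph tell mds.0 injectargs '--mds_cache_size 500000'
  # and persist it in ceph.conf under [mds]:  mds cache size = 500000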
-Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD journal pool

2016-10-12 Thread Jason Dillaman
On Wed, Oct 12, 2016 at 2:45 AM, Frédéric Nass
 wrote:
> Can we use rbd journaling without using rbd mirroring in Jewel?

Yes, you can use journaling without using rbd-mirroring, but ...

> So that we
> can set rbd journals on SSD pools and improve write IOPS on standard
> (non-mirrored) RBD images.
> Assuming IOs are acknowledged when written to the journal pool.

... write ops are acked as soon as they are written to the in-memory
cache or, if disabled, when the backing image is updated. The journal
is not used to satisfy read-after-write requests, so we cannot ack a
write op until a read can safely (and consistently) be performed --
either from the in-memory cache or from the backing image.
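
For anyone who wants to experiment anyway: journaling is enabled per image, and
the journal objects can be directed to a different (e.g. SSD) pool via the
rbd_journal_pool option. The names below are assumptions, and the exact journal
options may vary by release:

  $ rbd feature enable rbd/myimage exclusive-lock   # journaling depends on it
  $ rbd feature enable rbd/myimage journaling
  # client-side ceph.conf, set before enabling journaling:
  #   [client]
  #   rbd journal pool = ssd-journal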




-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] FOSDEM Dev Room

2016-10-12 Thread Patrick McGarry
Hey cephers,

I know it's a bit early, but I wanted plenty of time to plan on a
coordinated effort like this.  At FOSDEM 2017 in Brussels the Ceph and
Gluster teams are co-managing a dev room on Software Defined Storage
and we'd like to get a full day of community speakers for this event.

If you are interested in presenting your work on (or with) Open Source
Software Defined Storage (we hope it's with Ceph!) please send me a
note with name/org/talk title/abstract and I will put together a
schedule. Thanks!

https://submission.fosdem.org/devroom.php

-- 

Best Regards,

Patrick McGarry
Director Ceph Community || Red Hat
http://ceph.com  ||  http://community.redhat.com
@scuttlemonkey || @ceph
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS: No space left on device

2016-10-12 Thread Davie De Smet
Hi,

That sounds great. I'll certainly try it out. 

Kind regards,

Davie De Smet

-Original Message-
From: Yan, Zheng [mailto:uker...@gmail.com] 
Sent: Wednesday, October 12, 2016 3:41 PM
To: Davie De Smet 
Cc: Gregory Farnum ; ceph-users 
Subject: Re: [ceph-users] CephFS: No space left on device

I have written a tool that fixes this type of error. I'm currently testing it. 
Will push it out tomorrow

Regards
Yan, Zheng

On Wed, Oct 12, 2016 at 9:18 PM, Davie De Smet  
wrote:
> Hi Gregory,
>
> Thanks for the help! I've been looping over all trashcan files and the number
> of strays is lowering. This is going to take quite some time as there are a lot
> of files, but so far so good. If I should encounter any further problems
> regarding this topic, I'll give this thread a heads up.
>
> Kind regards,
>
> Davie De Smet
> Director Technical Operations and Customer Services, Nomadesk
> +32 9 240 10 31 (Office)
>
> -Original Message-
> From: Gregory Farnum [mailto:gfar...@redhat.com]
> Sent: Wednesday, October 12, 2016 2:11 AM
> To: Davie De Smet 
> Cc: Mykola Dvornik ; John Spray 
> ; ceph-users 
> Subject: Re: [ceph-users] CephFS: No space left on device
>
> On Tue, Oct 11, 2016 at 12:20 AM, Davie De Smet  
> wrote:
>> Hi,
>>
>> We do use hardlinks a lot. The application using the cluster has a built-in
>> 'trashcan' functionality based on hardlinks. Obviously, all removed files
>> and hardlinks are not visible anymore on the CephFS mount itself. Can I
>> manually remove the strays on the OSDs themselves?
>
> No, definitely not. At least part of the problem is:
> *) Ceph stores file metadata organized by its *path* location, not in a 
> separate on-disk inode data structure like local FSes do.
> *) When you hard link a file in CephFS, its "primary" location increments the 
> link counter and its "remote" location just records the inode number (and it 
> has to look up metadata later on-demand).
> *) When you unlink the primary link, the inode data gets moved into the stray 
> directory until one of the remote links comes calling.
>
>> Or do you mean that I'm required to do a small touch/write on all files that
>> have not yet been deleted (this would be painful as the cluster is 200TB+)?
>
> Luckily, it doesn't take quite that much work. It looks like just doing a 
> getattr on all the remote links in your system should do it.
> If it's just your trash can, "ls -l" on that directory will probably 
> pull them in. Or you could delete the whole trashcan folder (set of
> folders?) and they'll go away as well.
> -Greg
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS: No space left on device

2016-10-12 Thread Yan, Zheng
I have written a tool that fixes this type of error. I'm currently
testing it. Will push it out tomorrow

Regards
Yan, Zheng

On Wed, Oct 12, 2016 at 9:18 PM, Davie De Smet
 wrote:
> Hi Gregory,
>
> Thanks for the help! I've been looping over all trashcan files and the number
> of strays is lowering. This is going to take quite some time as there are a lot
> of files, but so far so good. If I should encounter any further problems
> regarding this topic, I'll give this thread a heads up.
>
> Kind regards,
>
> Davie De Smet
> Director Technical Operations and Customer Services, Nomadesk
> +32 9 240 10 31 (Office)
>
> -Original Message-
> From: Gregory Farnum [mailto:gfar...@redhat.com]
> Sent: Wednesday, October 12, 2016 2:11 AM
> To: Davie De Smet 
> Cc: Mykola Dvornik ; John Spray 
> ; ceph-users 
> Subject: Re: [ceph-users] CephFS: No space left on device
>
> On Tue, Oct 11, 2016 at 12:20 AM, Davie De Smet  
> wrote:
>> Hi,
>>
>> We do use hardlinks a lot. The application using the cluster has a built-in
>> 'trashcan' functionality based on hardlinks. Obviously, all removed files
>> and hardlinks are not visible anymore on the CephFS mount itself. Can I
>> manually remove the strays on the OSDs themselves?
>
> No, definitely not. At least part of the problem is:
> *) Ceph stores file metadata organized by its *path* location, not in a 
> separate on-disk inode data structure like local FSes do.
> *) When you hard link a file in CephFS, its "primary" location increments the 
> link counter and its "remote" location just records the inode number (and it 
> has to look up metadata later on-demand).
> *) When you unlink the primary link, the inode data gets moved into the stray 
> directory until one of the remote links comes calling.
>
>> Or do you mean that I'm required to do a small touch/write on all files that
>> have not yet been deleted (this would be painful as the cluster is 200TB+)?
>
> Luckily, it doesn't take quite that much work. It looks like just doing a 
> getattr on all the remote links in your system should do it.
> If it's just your trash can, "ls -l" on that directory will probably pull 
> them in. Or you could delete the whole trashcan folder (set of
> folders?) and they'll go away as well.
> -Greg
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS: No space left on device

2016-10-12 Thread Davie De Smet
Hi Gregory,

Thanks for the help! I've been looping over all trashcan files and the number
of strays is lowering. This is going to take quite some time as there are a lot
of files, but so far so good. If I should encounter any further problems
regarding this topic, I'll give this thread a heads up.

Kind regards,

Davie De Smet
Director Technical Operations and Customer Services, Nomadesk
+32 9 240 10 31 (Office)

-Original Message-
From: Gregory Farnum [mailto:gfar...@redhat.com] 
Sent: Wednesday, October 12, 2016 2:11 AM
To: Davie De Smet 
Cc: Mykola Dvornik ; John Spray ; 
ceph-users 
Subject: Re: [ceph-users] CephFS: No space left on device

On Tue, Oct 11, 2016 at 12:20 AM, Davie De Smet  
wrote:
> Hi,
>
> We do use hardlinks a lot. The application using the cluster has a built-in
> 'trashcan' functionality based on hardlinks. Obviously, all removed files and
> hardlinks are not visible anymore on the CephFS mount itself. Can I manually
> remove the strays on the OSDs themselves?

No, definitely not. At least part of the problem is:
*) Ceph stores file metadata organized by its *path* location, not in a 
separate on-disk inode data structure like local FSes do.
*) When you hard link a file in CephFS, its "primary" location increments the 
link counter and its "remote" location just records the inode number (and it 
has to look up metadata later on-demand).
*) When you unlink the primary link, the inode data gets moved into the stray 
directory until one of the remote links comes calling.

> Or do you mean that I'm required to do a small touch/write on all files that
> have not yet been deleted (this would be painful as the cluster is 200TB+)?

Luckily, it doesn't take quite that much work. It looks like just doing a 
getattr on all the remote links in your system should do it.
If it's just your trash can, "ls -l" on that directory will probably pull them 
in. Or you could delete the whole trashcan folder (set of
folders?) and they'll go away as well.
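
For example, something like the following should trigger a getattr on every
entry under the trash can; the mount point and directory name are assumptions:

  $ ls -lR /mnt/cephfs/trashcan > /dev/null
  # or, equivalently:
  $ find /mnt/cephfs/trashcan -exec stat {} + > /dev/null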
-Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD journal pool

2016-10-12 Thread Frédéric Nass

I would have tried it but our cluster is still running RHCS 1.3. :-)

Frederic.

On 12/10/2016 at 08:45, Frédéric Nass wrote:

Hello,

Can we use rbd journaling without using rbd mirroring in Jewel? So
that we can set rbd journals on SSD pools and improve write IOPS on
standard (non-mirrored) RBD images.

Assuming IOs are acknowledged when written to the journal pool.

Everything I read regarding RBD journaling is related to RBD mirroring.

Regards,



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rsync kernel client cepfs mkstemp no space left on device

2016-10-12 Thread Hauke Homburg
On 10.10.2016 at 11:33, John Spray wrote:
> On Mon, Oct 10, 2016 at 9:05 AM, Hauke Homburg  
> wrote:
>> On 07.10.2016 at 17:37, Gregory Farnum wrote:
>>> On Fri, Oct 7, 2016 at 7:15 AM, Hauke Homburg  
>>> wrote:
 Hello,

 I have a Ceph cluster with 5 servers and 40 OSDs. Currently this cluster
 has 85GB of free space, and the rsync source directory holds lots of
 pictures, a data volume of 40GB.

 The servers run CentOS 7 and the latest stable Ceph. The client is a Debian
 8 with kernel 4, and the cluster is mounted via CephFS.

 When I sync the directory I often see the message "rsync mkstemp no space
 left on device (28)". At that point I can still touch a file in another
 directory in the cluster. The directory has ~ 63 files. Is that too many
 files?
>>> Yes, in recent releases CephFS limits you to 100k dentries in a single
>>> directory fragment. This *includes* the "stray" directories that files
>>> get moved into when you unlink them, and is intended to prevent issues
>>> with very large folders. It will stop being a problem once we enable
>>> automatic fragmenting (soon, hopefully).
>>> You can change that by changing the "mds bal fragment size max"
>>> config, but you're probably better off by figuring out if you've got
>>> an over-large directory or if you're deleting files faster than the
>>> cluster can keep up. There was a thread about this very recently and
>>> John included some details about tuning if you check the archives. :)
>>> -Greg
>> Hello,
>>
>> Thanks for the answer.
>> I enabled the mds bal frag = true option on the cluster.
>>
>> Today I read that I have to enable this option on the client, too. With
>> a FUSE mount I can do it with the ceph binary. I use the kernel module.
>> How can I do it there?
> mds_bal_frag is only a server side thing.  You do also need to do the
> "ceph fs set  allow_dirfrags true", which you can run from any
> client with an admin key (but again, this is a server side thing, not
> a client setting).
>
> Note that the reason directory fragmentation is not enabled by default
> is that it wasn't thoroughly tested ahead of Jewel, so there's a
> reason it requires a --yes-i-really-mean-it.
>
> John
>
>> Regards
>>
>> Hauke
>>
>> --
>> www.w3-creative.de
>>
>> www.westchat.de
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Hello,

Yesterday I found the correct command line to enable allow_dirfrags:

ceph mds set allow_dirfrags true --yes-i-really-mean-it

I ran it from a client running Debian 8 with kernel 4 and the Ceph 10.2.3
client packages (ceph-fs-common and ceph-common) installed.

For testing, I am currently rsyncing some TB, including a directory with more
than 100K entries, into the Ceph cluster.

Do you know a roadmap for directory fragmentation becoming stable?
ceph.com is currently offline.

Regards

Hauke
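
For reference, the per-fragment limit discussed above is controlled by the
mds_bal_fragment_size_max setting; a hedged way to raise it at runtime (the
daemon name and the value are assumptions):

  $ ceph daemon mds.ceph01 config set mds_bal_fragment_size_max 500000
  # or persist in ceph.conf under [mds]:  mds bal fragment size max = 500000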





-- 
www.w3-creative.de

www.westchat.de


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How do I restart node that I've killed in development mode

2016-10-12 Thread agung Laksono
Thanks Huang jun.

I will try it.

On Wed, Oct 12, 2016 at 2:06 PM, huang jun  wrote:

> ./init-ceph start mon.a
>
> 2016-10-12 14:54 GMT+08:00 agung Laksono :
> > Hi Ceph Users,
> >
> > I deploy a development cluster using vstart with 3 MONs and 3 OSDs.
> > In my experiment, I kill one of the monitor nodes by its pid, like this:
> >
> >   $ kill -SIGSEGV 27557
> >
> > After a new monitor leader is chosen, I would like to re-run the monitor
> > that I've killed in the previous step. How do I do this?
> >
> >
> > Thanks
> >
> > --
> > Cheers,
> >
> > Agung Laksono
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
>
>
> --
> Thank you!
> HuangJun
>



-- 
Cheers,

Agung Laksono
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How do I restart node that I've killed in development mode

2016-10-12 Thread huang jun
./init-ceph start mon.a
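
A hedged addition: from the vstart working directory you can also restart the
daemon binary directly; the monitor id "a" and the paths are assumptions:

  $ ./ceph-mon -i a -c ./ceph.conf     # restart just mon.a
  $ ./ceph -c ./ceph.conf -s           # confirm it rejoined the quorum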

2016-10-12 14:54 GMT+08:00 agung Laksono :
> Hi Ceph Users,
>
> I deploy a development cluster using vstart with 3 MONs and 3 OSDs.
> In my experiment, I kill one of the monitor nodes by its pid, like this:
>
>   $ kill -SIGSEGV 27557
>
> After a new monitor leader is chosen, I would like to re-run the monitor
> that I've killed in the previous step. How do I do this?
>
>
> Thanks
>
> --
> Cheers,
>
> Agung Laksono
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Thank you!
HuangJun
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] How do I restart node that I've killed in development mode

2016-10-12 Thread agung Laksono
Hi Ceph Users,

I deploy a development cluster using vstart with 3 MONs and 3 OSDs.
In my experiment, I kill one of the monitor nodes by its pid, like this:

  $ kill -SIGSEGV 27557

After a new monitor leader is chosen, I would like to re-run the monitor
that I've killed in the previous step. How do I do this?


Thanks

-- 
Cheers,

Agung Laksono
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] RBD journal pool

2016-10-12 Thread Frédéric Nass

Hello,

Can we use rbd journaling without using rbd mirroring in Jewel? So that
we can set rbd journals on SSD pools and improve write IOPS on standard
(non-mirrored) RBD images.

Assuming IOs are acknowledged when written to the journal pool.

Everything I read regarding RBD journaling is related to RBD mirroring.

Regards,

--

Frédéric Nass

Sous-direction Infrastructures
Direction du Numérique
Université de Lorraine

Tél : +33 3 72 74 11 35

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com