Re: [ceph-users] FW: Ceph data locality

2015-07-07 Thread Lionel Bouton
On 07/07/15 18:20, Dmitry Meytin wrote: Exactly because of that issue I've reduced the number of Ceph replicas to 2, and the number of HDFS copies is also 2 (so we're talking about 4 copies). I want (but haven't tried yet) to change Ceph replication to 1 and change HDFS back to 3. You are ...
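For reference, these are the two knobs in question; a minimal sketch, assuming the RBD volumes live in a pool named "rbd" (the pool name and path are illustrative, not from the thread):

    # Ceph-level copies: set the pool replica count to 2
    ceph osd pool set rbd size 2
    # HDFS-level copies: set dfs.replication to 2 in hdfs-site.xml,
    # or adjust existing files in place:
    hadoop fs -setrep -R 2 /

With Ceph size 1 and HDFS replication 3, a single OSD failure removes all Ceph-level redundancy for the affected objects and leaves recovery entirely to HDFS re-replication between VMs.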

Re: [ceph-users] FW: Ceph data locality

2015-07-07 Thread Dmitry Meytin
... ideas how to improve it? Thank you very much, Dmitry. -----Original Message----- From: Lionel Bouton [mailto:lionel+c...@bouton.name] Sent: Tuesday, July 07, 2015 6:07 PM To: Dmitry Meytin Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] FW: Ceph data locality Hi Dmitry, On 07/07/15 14 ...

Re: [ceph-users] FW: Ceph data locality

2015-07-07 Thread Lionel Bouton
On 07/07/15 17:41, Dmitry Meytin wrote: Hi Lionel, thanks for the answer. The missing info: 1) Ceph 0.80.9 Firefly; 2) map-reduce makes sequential reads of 64 MB (or 128 MB) blocks; 3) HDFS, which runs on top of Ceph, replicates data 3 times between VMs which could be located ...
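Context for point 2: RBD images are striped into 4 MB RADOS objects by default (order 22), so a single 64 MB HDFS block read fans out to roughly 16 objects. A hedged sketch of creating an image with larger objects to reduce that fan-out (image name and size are illustrative; this is untested tuning, not advice from the thread):

    # order 25 = 2^25 bytes = 32 MB objects instead of the default 4 MB
    rbd create hdfs-data --size 102400 --order 25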

Re: [ceph-users] FW: Ceph data locality

2015-07-07 Thread Dmitry Meytin
... To: Dmitry Meytin Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] FW: Ceph data locality On 07/07/15 18:20, Dmitry Meytin wrote: Exactly because of that issue I've reduced the number of Ceph replicas to 2 and the number of HDFS copies is also 2 (so we're talking about 4 copies). I want ...

Re: [ceph-users] FW: Ceph data locality

2015-07-07 Thread Lionel Bouton
Hi Dmitry, On 07/07/15 14:42, Dmitry Meytin wrote: Hi Christian, thanks for the thorough explanation. My case is Elastic Map Reduce on top of OpenStack with a Ceph backend for everything (block, object, images). With the default configuration, performance is 300% worse than on bare metal. I did a ...
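One common lever for VM-backed workloads on Firefly is the librbd client cache; a minimal ceph.conf sketch (values illustrative, not from the thread):

    [client]
    rbd cache = true
    rbd cache writethrough until flush = true
    rbd cache size = 67108864    # 64 MB per client, illustrative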

[ceph-users] FW: Ceph data locality

2015-07-07 Thread Dmitry Meytin
I think it's essential for huge data clusters to deal with data locality. Even a very expensive network stack (100 Gb/s) will not mitigate the problem if you need to move petabytes of data many times a day. Maybe there is some workaround to the problem? ... From: Van Leeuwen, Robert
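Back-of-envelope arithmetic supports this: 1 PB = 8 x 10^15 bits, and a 100 Gb/s link carries 10^11 bits/s, so one full pass takes 8 x 10^15 / 10^11 = 80,000 s, roughly 22 hours at theoretical line rate. Moving petabytes "many times a day" over a single such link is therefore not merely slow but impossible without data locality.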

Re: [ceph-users] FW: Ceph data locality

2015-07-07 Thread Christian Balzer
Hello, On Tue, 7 Jul 2015 11:45:11 + Dmitry Meytin wrote: I think it's essential for huge data clusters to deal with data locality. Even a very expensive network stack (100 Gb/s) will not mitigate the problem if you need to move petabytes of data many times a day. Maybe there is some ...

Re: [ceph-users] FW: Ceph data locality

2015-07-07 Thread Dmitry Meytin
...@ceph.com Cc: Dmitry Meytin Subject: Re: [ceph-users] FW: Ceph data locality Hello, On Tue, 7 Jul 2015 11:45:11 + Dmitry Meytin wrote: I think it's essential for huge data clusters to deal with data locality. Even a very expensive network stack (100 Gb/s) will not mitigate the problem ...

Re: [ceph-users] FW: Ceph data locality

2015-07-07 Thread Wido den Hollander
-----Original Message----- From: Christian Balzer [mailto:ch...@gol.com] Sent: 07 July 2015 15:25 To: ceph-us...@ceph.com Cc: Dmitry Meytin Subject: Re: [ceph-users] FW: Ceph data locality Hello, On Tue, 7 Jul 2015 11:45:11 + Dmitry Meytin wrote: I think it's essential ...