If I understand you correctly, you seem to want to bypass a perceived
bottleneck. Yes, you can run many RADOS Gateway (RGW) instances. You can
also bypass RGW and go directly to librados with your own striping, but I
would not recommend that unless you really know what you're doing and have
time to build and test it. That said, RGW can still give you parity
with AWS S3. We run very large clusters and we use RGW and RBD because we
also run OpenStack. Our original design had an RGW on every box that had
OSDs, but we found it not very efficient in our use case (it could be the
opposite in yours). We turned the number of RGWs down to one for every
three high-density OSD servers. Depending on payload and requests this
could be tuned higher or lower. We also found that we needed to bump the
Apache config to a higher number of running servers (we are still using the
old Apache prefork style; the newer threaded/event MPMs may not need
changing). Even with those changes in place, we are now going to start
testing civetweb in the latest version of Ceph so as to bypass Apache
altogether.
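
For reference, switching RGW from Apache/FastCGI to civetweb is just a
frontends setting in ceph.conf. This is a sketch only; the instance name,
port, and thread count below are placeholders, not our production values:

```ini
[client.rgw.gateway-1]
# civetweb serves HTTP directly from the RGW process, no Apache needed.
# Tune num_threads to your expected concurrent request count.
rgw frontends = civetweb port=7480 num_threads=512
```

With a load balancer in front, the VIP pool members would then point at
that port on each RGW host.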
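
On the "librados with your own striping" route mentioned above: to give a
feel for what you would be signing up to build, here is a minimal sketch of
the kind of client-side layout logic involved. The object-naming scheme and
the 4 MB stripe unit are assumptions for illustration, not RGW's actual
layout:

```python
# Hypothetical client-side striping: map a logical byte offset within a key
# to the RADOS object holding it and the offset inside that object.
STRIPE_UNIT = 4 * 1024 * 1024  # bytes per RADOS object (assumed)

def locate(key: str, offset: int) -> tuple[str, int]:
    """Return (rados_object_name, offset_within_object) for a logical offset."""
    stripe_index = offset // STRIPE_UNIT
    return (f"{key}.{stripe_index:08d}", offset % STRIPE_UNIT)

# Reading a range then means one read op per object it spans, e.g. with the
# Python 'rados' binding (cluster/ioctx setup omitted):
#   data = ioctx.read(obj_name, length, local_offset)
```

And that is before you handle writes, partial stripes, metadata, and
failure cases, which is why I would stick with RGW unless you have a very
specific need.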

The only way to get a good distributed spread is to use a good load
balancer. We found that our load balancers are our bottlenecks so we are
working with our network group on changing those out. Also, depending on
what load balancer you use, you may need to change your distribution
algorithm to round robin instead of least connections. With our load
balancer (rad...) we found that least connections was not distributing the
load as you would expect, which caused the first group of servers to carry
very high loads compared to the others. Once we changed to round robin (it
could
be different on your load balancer) we saw a good even spread. We used
JMeter on AWS to build a fleet of test servers that made 2 MB byte-range
requests to the VIP on our load balancer, then tracked latency, RTT,
errors, etc. for the duration. In addition we monitored the NICs on the
load balancers and sysstat on each server.
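
The shape of that test is simple to reproduce without JMeter if you want a
quick sanity check. A minimal sketch, assuming a placeholder VIP hostname
and bucket path (we used JMeter for the real runs):

```python
# Issue 2 MB byte-range GETs covering an object, as in the load test above.
import urllib.request

CHUNK = 2 * 1024 * 1024  # 2 MB per request

def range_headers(object_size: int, chunk: int = CHUNK) -> list[str]:
    """Build the Range header values that cover the whole object."""
    return [f"bytes={lo}-{min(lo + chunk, object_size) - 1}"
            for lo in range(0, object_size, chunk)]

def fetch_range(url: str, range_value: str) -> bytes:
    """Fetch one byte range; the server should answer 206 Partial Content."""
    req = urllib.request.Request(url, headers={"Range": range_value})
    with urllib.request.urlopen(req) as resp:
        return resp.read()

# Usage (hostname is a placeholder):
#   for hdr in range_headers(10 * 1024 * 1024):
#       fetch_range("http://rgw-vip.example/bucket/obj", hdr)
```

Fan that out across enough client machines and you will quickly see where
the load balancer or the RGWs saturate.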

We are very pleased with RGW and Ceph and strongly recommend it.
thx

On Sat, Mar 28, 2015 at 11:22 AM, <[email protected]> wrote:

> Hi,
>
> I am designing an infrastructure using Ceph.
> The client will fetch data though HTTP.
>
> I saw the radosgw, which is made for that; it has, however, a weakness
> for my use case: as far as I understand, when a client wants to fetch a
> file, it connects to the radosgw, which connects to the right OSD and
> pipes the data to the client.
>
> Is there any way to remove such a bottleneck (proxying all the data)?
>
> A solution would be to create a radosgw on each OSD server, but I still
> need a way to redirect customers to the right place (to the radosgw that
> lives on the correct OSD).
> I dug through the docs, and even with librados, I could not find this
> information.
>
> To conclude with a global overview, my needs:
> - little data, many servers, lots of bandwidth
> - each server is limited by network, not by disk IO
> - client URIs are generated (clients first download some sort of index,
> then the files): a programmable solution could be integrated there
>
> Thanks for reading
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Best Regards,
Chris Jones

<http://www.cloudm2.com>

[email protected]
(p) 770.655.0770

