Re: Knox HA with Apache HTTP Server + mod_proxy + mod_proxy_balancer

Kevin Minder Fri, 25 Oct 2013 06:43:29 -0700

Maksim,
Great work!
Discussion inline below.
Recommended next steps:

1) Add the setup steps required to get all of this working to the user's guide. File a JIRA.

2) Figure out a way to automate these tests.  Might be hard on Apache infra.
Kevin.


On 10/25/13 8:55 AM, Maksim Kononenko wrote:

Hi guys,

I was researching/testing Knox HA with Apache HTTP Server +  mod_proxy +
mod_proxy_balancer.
Here is what I found.
I.   Three load balancer scheduler algorithms are available: Request
Counting, Weighted Traffic Counting and Pending Request Counting. (
http://httpd.apache.org/docs/2.2/mod/mod_proxy_balancer.html#scheduler)
II.  Load balancer stickiness. (
http://httpd.apache.org/docs/2.2/mod/mod_proxy_balancer.html#stickyness)
      I configured and tested stickiness. It worked as expected.
III. Failover. (
http://httpd.apache.org/docs/2.2/mod/mod_proxy.html#proxypass)
      1. I ran the following use cases:
         a) Knox instance is down before client request comes in.
             Steps:
                 - Configure Apache HTTP Server to proxy two Knox instances;
                 - Shut down Knox instance A;
                 - Execute client request;
                 - Verify that Knox instance A is marked as unavailable and
the client's request is routed to Knox instance B;
                 - Verify that all subsequent requests within the same
client session are passed only to Knox instance B;
                 - Verify that requests in a new client session are first
attempted against Knox instance A.
                   This is required because Knox instance A could have been
started again before the new client session began.

This seems a little sub-optimal to me but there may be nothing we can do
about it. The issue that I have is that I don't think Apache should be trying
instance-A first every time in this case.

So the question is how is Apache distributing load over instance-A and
instance-B? Does it always try instance-A first or does it sometimes try
instance-B first?

In addition, if it gets a failure for instance-A, ideally it would take it
out of the "pool" for some (ideally configurable) period of time.
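
For reference, here is a rough Python model of how the Request Counting
scheduler is described in the mod_proxy_balancer documentation, combined with
a retry window for failed workers (mod_proxy documents a "retry" parameter,
in seconds, defaulting to 60). This is only a sketch under those assumptions:
the Worker and Balancer names are hypothetical, it is not Apache's code, and
stickiness is ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Worker:
    name: str
    lbfactor: int = 1                    # relative share of requests
    lbstatus: int = 0                    # running score; highest score wins
    error_since: Optional[float] = None  # time the worker was marked failed


class Balancer:
    """Simplified request-counting selection with a retry window (hypothetical)."""

    def __init__(self, workers, retry_seconds=60.0):
        self.workers = workers
        self.retry_seconds = retry_seconds

    def _usable(self, worker, now):
        # A failed worker is skipped until the retry window has elapsed.
        if worker.error_since is None:
            return True
        return now - worker.error_since >= self.retry_seconds

    def pick(self, now):
        candidate = None
        total = 0
        for w in self.workers:
            if not self._usable(w, now):
                continue
            w.lbstatus += w.lbfactor
            total += w.lbfactor
            # Strict comparison: on a tie the earlier worker is kept.
            if candidate is None or w.lbstatus > candidate.lbstatus:
                candidate = w
        if candidate is not None:
            candidate.lbstatus -= total
        return candidate

    def mark_failed(self, worker, now):
        worker.error_since = now

    def mark_ok(self, worker):
        worker.error_since = None
```

In this model two healthy workers with equal factors are picked alternately,
a worker marked as failed is skipped for retry_seconds, and it is only tried
again once that window has passed. Whether the real setup behaves this way
still needs to be confirmed against the running Apache instance.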

             This use case works fine.
         b) Knox instance goes down while processing a client's PUT request.
             Steps:
                 - Start a PUT of a medium-sized file (200 MB) to HDFS;
                 - After some time, shut down the Knox instance that is
processing this request;
                 - Verify that the client gets a 500 status code and no
failover takes place.
             This use case behaves as described. Apache HTTP Server is
not able to fail over in this case.
         c) Knox instance goes down while processing a client's GET request.
             Steps:
                 - Start a GET of a medium-sized file (200 MB) from HDFS;
                 - After some time, shut down the Knox instance that is
processing this request;
                 - Verify that the client gets a 200 status code, a
'Content-Length' header whose value equals the file size, and only some of
the bytes in the body.
                   To execute this test I used the following clients:
                     1) HttpClient - it doesn't produce any error when the
stream is closed.
                     2) cURL - it doesn't produce any error when the stream
is closed.
                     3) Firefox browser - it doesn't produce any error when
the stream is closed.
                   All clients just download the bytes available before the
stream is closed, so the client has to manually compare the 'Content-Length'
header value with the number of bytes received.
                 - No failover takes place.
             This use case behaves as described. Apache HTTP Server is
not able to fail over in this case.

This is unexpected and unfortunate.

I would have hoped that HttpClient and cURL at least would provide some
indication that the stream was incomplete according to the Content-Length
header.

The only thing I would recommend you try is taking Knox out of the
picture: use cURL to GET the same file directly from HDFS, kill the DataNode
halfway through the stream and ensure that you see the same behavior on the
client side.
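
Until the clients report this themselves, the manual comparison described
above could be done with something like the following Python sketch. It
copies a response body to a sink while counting bytes and raises if the count
differs from the declared Content-Length. The names copy_checked and
TruncatedBodyError are hypothetical, and the stream is assumed to be any
file-like object with a read() method.

```python
import io


class TruncatedBodyError(Exception):
    """Raised when fewer (or more) bytes arrive than Content-Length declared."""


def copy_checked(source, sink, expected_length, chunk_size=64 * 1024):
    """Copy source to sink and verify the byte count against expected_length."""
    received = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty read means the stream was closed
            break
        sink.write(chunk)
        received += len(chunk)
    if received != expected_length:
        raise TruncatedBodyError(
            f"expected {expected_length} bytes, received {received}"
        )
    return received
```

For example, with expected_length taken from the 'Content-Length' header, a
body cut off partway through would raise TruncatedBodyError instead of being
silently accepted as a complete download.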

      2. Additional use cases.
         What additional cases would you suggest?

I just want to confirm that you have tested a scenario for HDFS where
the call to the NameNode goes to instance-A and the subsequent call to
the DataNode goes to instance-B and this works.

IV. What functionality did I miss?

Other than the note above I don't see anything missing.


Maksim.



