Also, how is your MemCacheTicketRegistry configured? We have the following snippet, which is exactly the same on both machines (machine names changed to protect the innocent):
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:util="http://www.springframework.org/schema/util"
       xsi:schemaLocation="
           http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
           http://www.springframework.org/schema/util http://www.springframework.org/schema/util/spring-util-2.5.xsd">

    <util:list id="memcachedServers">
        <value>SERVER1.ess.rutgers.edu:11211</value>
        <value>SERVER2.ess.rutgers.edu:11211</value>
    </util:list>

</beans>

And in another file we have the following:

<bean id="ticketRegistry" class="org.jasig.cas.ticket.registry.MemCacheTicketRegistry">
    <constructor-arg index="0">
        <ref bean="memcachedServers" />
    </constructor-arg>
    <constructor-arg index="1" type="int" value="21600" />
    <constructor-arg index="2" type="int" value="300" />
</bean>

-Scott Battaglia
PGP Public Key Id: 0x383733AA
LinkedIn: http://www.linkedin.com/in/scottbattaglia


On Mon, Oct 27, 2008 at 9:20 AM, Scott Battaglia <[EMAIL PROTECTED]> wrote:

> On Fri, Oct 24, 2008 at 7:58 PM, Adam Rybicki <[EMAIL PROTECTED]> wrote:
>
>> Scott,
>>
>> I mis-diagnosed the issue. I just ran the same test, except I only ran one
>> instance of memcached. I am getting a high error rate on ticket validations,
>> so it has nothing to do with memcached replication. To investigate further,
>> I disabled the second CAS server, and all errors are gone. Of course, that
>> is not a viable workaround. :-)
>>
>> My guess is that the error occurs when a ticket issued by one CAS server is
>> being validated on another CAS server. I could not find a way to enable
>> debug logging in /cas/serviceValidate, but I think I have found a major
>> clue. It took most of the day today to hunt this down.
>>
>> With a single instance of memcached running in verbose mode, you can see a
>> sequence of messages like this:
>> ------------------------------
>> <11 add ST-8023-M0sU2U2ijyQ53QPYWnGm-arybicki1 1 300 2689
>> >11 STORED
>> <7 get ST-8023-M0sU2U2ijyQ53QPYWnGm-arybicki1
>> >7 sending key ST-8023-M0sU2U2ijyQ53QPYWnGm-arybicki1
>> >7 END
>> <7 replace ST-8023-M0sU2U2ijyQ53QPYWnGm-arybicki1 1 300 2689
>> >7 STORED
>> <7 delete ST-8023-M0sU2U2ijyQ53QPYWnGm-arybicki1 0
>> >7 DELETED
>> ------------------------------
>> This is when everything went OK. The sequence below, however, represents a
>> service ticket that failed to validate. That's apparently because an attempt
>> to read the ticket was made before it was actually stored in the cache!
>> ------------------------------
>> <11 add ST-8024-tKeeo5gYhjqoQzstAgqO-arybicki1 1 300 2689
>> <7 get ST-8024-tKeeo5gYhjqoQzstAgqO-arybicki1
>> >7 END
>> >11 STORED
>> ------------------------------
>> There may be some code that synchronizes access to the same object from the
>> same client. However, it would seem that the service ticket is returned by
>> CAS before it's actually stored in memcached. If this service ticket is then
>> presented to another instance of CAS for validation, that instance fails to
>> retrieve it from memcached because the "add" operation has not completed.
>>
>> Again, I have to emphasize that this is not an unrealistic test. jMeter is
>> simply following redirects at the time of the failure, as a browser would.
>
> We never saw that in production, and we ran 500 virtual users. However, if
> you are experiencing it, you most likely could update the
> MemCacheTicketRegistry to block on the Futures. I've actually updated the
> code in HEAD with an option to block on Futures. :-)
>
> I have not tried it at all, since I wrote it all of 30 seconds ago. You can
> grab it from HEAD and try it out. The new property to enable it is
> "synchronizeUpdatesToRegistry".
>
> Let me know if it helps/doesn't help.
>
> -Scott
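Roughly, "blocking on the Futures" means waiting for the memcached client's
asynchronous add to be acknowledged before CAS hands the service ticket back to
the browser, which closes the window visible in Adam's second log excerpt. A
minimal sketch of the idea with the spymemcached client; the wrapper class,
timeout and error handling here are illustrative assumptions, not the actual
MemCacheTicketRegistry code:

    import java.util.concurrent.Future;
    import java.util.concurrent.TimeUnit;
    import net.spy.memcached.AddrUtil;
    import net.spy.memcached.MemcachedClient;

    public class BlockingRegistrySketch {

        private final MemcachedClient client;
        private final int serviceTicketTimeoutSeconds = 300;   // matches the third constructor-arg above

        public BlockingRegistrySketch() throws Exception {
            client = new MemcachedClient(AddrUtil.getAddresses(
                "SERVER1.ess.rutgers.edu:11211 SERVER2.ess.rutgers.edu:11211"));
        }

        public void addTicket(String ticketId, Object ticket) throws Exception {
            // add() is asynchronous: it returns a Future immediately while the client
            // ships the operation to memcached in the background. That gap is where a
            // second CAS node can try, and fail, to validate the freshly issued ticket.
            Future<Boolean> stored = client.add(ticketId, serviceTicketTimeoutSeconds, ticket);

            // Blocking on the Future before returning means the redirect carrying the
            // ticket is not sent until memcached has acknowledged the "add".
            if (!stored.get(2, TimeUnit.SECONDS)) {
                throw new IllegalStateException("memcached did not store " + ticketId);
            }
        }
    }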
>>
>> Adam
>>
>> Scott Battaglia wrote:
>>
>> You have no need for sticky sessions. If you have two repcached servers and
>> you've told your CAS instance about both of them, the memcached client
>> essentially sees them as two memcached servers (since it's not familiar with
>> repcached).
>>
>> The memcached client works in that it takes a hash of the key, and that
>> determines which instance of memcached/repcached the item is stored on.
>> repcached will then do its async replication. When you come to validate a
>> ticket, the memcached client will again hash the key to determine which
>> server the item is stored on. If that server is unreachable (as determined
>> by the memcached client), then it will try the next likely server that would
>> hold the data.
>>
>> -Scott
>>
>> -Scott Battaglia
>> PGP Public Key Id: 0x383733AA
>> LinkedIn: http://www.linkedin.com/in/scottbattaglia
>>
>> On Fri, Oct 24, 2008 at 8:21 AM, Andrew Ralph Feller, afelle1 <[EMAIL PROTECTED]> wrote:
>>
>>> So what you are saying is that even with replication enabled, asynchronous
>>> replication CAS clusters should have sticky sessions on regardless? I
>>> realize that synchronous replication CAS clusters have no need of sticky
>>> sessions, seeing as how it goes to all servers before the user can move on.
>>>
>>> Andrew
>>>
>>> On 10/23/08 9:29 PM, "Scott Battaglia" <[EMAIL PROTECTED]> wrote:
>>>
>>> It actually shouldn't matter if the async works or not. The memcache
>>> clients are designed to hash to a particular server and only check the
>>> backup servers if the primary isn't available.
>>>
>>> So you should always be validating against the original server unless it's
>>> no longer there.
>>>
>>> -Scott
>>>
>>> -Scott Battaglia
>>> PGP Public Key Id: 0x383733AA
>>> LinkedIn: http://www.linkedin.com/in/scottbattaglia
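The selection behaviour described above can be pictured with a small sketch:
hash the ticket id to pick one server from the list, and only walk to the next
server if that one is unreachable. This is a simplified illustration of the
idea, not the real memcached client's algorithm (actual clients typically use a
more elaborate, often consistent, hash):

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public class ServerSelectionSketch {

        private final List<String> servers = Arrays.asList(
            "SERVER1.ess.rutgers.edu:11211", "SERVER2.ess.rutgers.edu:11211");
        private final Set<String> unreachable = new HashSet<String>();

        public String pickServer(String ticketId) {
            // Every client hashing the same ticket id lands on the same server, which
            // is why both CAS nodes normally validate against the node that stored the
            // ticket in the first place -- no sticky sessions required.
            int start = (ticketId.hashCode() & 0x7fffffff) % servers.size();
            for (int i = 0; i < servers.size(); i++) {
                String candidate = servers.get((start + i) % servers.size());
                if (!unreachable.contains(candidate)) {
                    return candidate;   // the primary, or the next server if the primary is down
                }
            }
            throw new IllegalStateException("no memcached server reachable");
        }
    }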
>>> On Thu, Oct 23, 2008 at 9:17 PM, Adam Rybicki <[EMAIL PROTECTED]> wrote:
>>>
>>> Scott,
>>>
>>> I have run into an issue with MemCacheTicketRegistry and was wondering if
>>> you have any thoughts. I didn't want to create a new thread for this note.
>>> Anyone else with comments should feel free to reply, too. ;-)
>>>
>>> My tests have shown that when a ticket is generated on one CAS cluster
>>> member it may sometimes fail to validate. This is apparently because the
>>> memcached asynchronous replication did not manage to send the ticket
>>> replica in time. Fast as repcached may be, under a relatively light load,
>>> ST validation failed in 0.1% of the cases, or once in 1000 attempts. It
>>> would seem that the following sequence of steps should take a while:
>>>
>>> - Browser accesses a CAS-protected service
>>> - Service redirects to CAS for authentication
>>> - CAS validates the TGT
>>> - CAS issues the ST for the service
>>> - CAS redirects the browser to the service
>>> - Service sends the ST for validation
>>>
>>> But they are fast! My jMeter testing showed this taking 28 milliseconds
>>> under light load on the CAS server, which is amazingly fast. Please note
>>> that in real life this can be just as fast, because the browser, CAS, and
>>> service perform these steps without the user slowing them down. CAS is
>>> indeed a lightweight system, and memcached does nothing to slow it down. It
>>> seems that in 0.1% of the cases this outruns repcached's replication under
>>> light load. The bad news is that under heavy load the failure rate
>>> increases; I've seen as bad as an 8% failure rate.
>>>
>>> Have you or anyone else seen this? Have you had to work around this?
>>>
>>> Thanks,
>>> Adam
>>>
>>> Scott Battaglia wrote:
>>>
>>> On Tue, Oct 14, 2008 at 11:15 AM, Andrew Ralph Feller, afelle1 <[EMAIL PROTECTED]> wrote:
>>>
>>> Hey Scott,
>>>
>>> Thanks for answering some questions; really appreciate it. Just a handful more:
>>>
>>> 1. What happens whenever the server it intends to replicate with is down?
>>>
>>> It doesn't replicate. :-) The client will send its request to the primary
>>> server, and if the primary server is down it will replicate to the
>>> secondary. The repcached server itself will not replicate to the other
>>> server if it can't find it.
>>>
>>> 2. What happens whenever it comes back up?
>>>
>>> The repcached servers will sync with each other. The memcache clients will
>>> continue to function as they should.
>>>
>>> 3. Does the newly recovered machine synchronize itself with the other servers?
>>>
>>> The newly recovered machine will synchronize with its paired memcached server.
>>>
>>> -Scott
>>>
>>> Thanks,
>>> Andrew
>>>
>>> On 10/14/08 9:56 AM, "Scott Battaglia" <[EMAIL PROTECTED]> wrote:
>>>
>>> Memcache, as far as I know, uses a hash of the key to determine which
>>> server to write to (and then with repcached, it's replicated to its pair,
>>> which you configure).
>>>
>>> -Scott
>>>
>>> -Scott Battaglia
>>> PGP Public Key Id: 0x383733AA
>>> LinkedIn: http://www.linkedin.com/in/scottbattaglia
>>>
>>> On Tue, Oct 14, 2008 at 10:38 AM, Andrew Ralph Feller, afelle1 <[EMAIL PROTECTED]> wrote:
>>>
>>> Scott,
>>>
>>> I've looked at the sample configuration file on the JA-SIG wiki; however, I
>>> was curious how memcached handles cluster membership, for lack of a better
>>> word. One of the things we are getting burned on by JBoss/JGroups is the
>>> frequency with which the cluster is being fragmented.
>>>
>>> Thanks,
>>> Andrew
>>>
>>> On 10/14/08 8:58 AM, "Scott Battaglia" <[EMAIL PROTECTED]> wrote:
>>>
>>> We've disabled the registry cleaners since memcached has explicit timeouts
>>> (which are configurable on the registry). We've configured it by default
>>> with 1 GB of RAM, I think, though I doubt we need that much.
>>>
>>> -Scott
>>>
>>> -Scott Battaglia
>>> PGP Public Key Id: 0x383733AA
>>> LinkedIn: http://www.linkedin.com/in/scottbattaglia
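The reason the registry cleaner can go away is that every item handed to
memcached carries its own expiration and is evicted server-side once the time
is up; presumably that is what the 21600 and 300 constructor arguments in the
configuration at the top control, and the 300-second expiration does show up on
the "add" lines in the verbose log above. A throwaway demonstration against a
local memcached, with a made-up key and a 5-second expiry purely for
illustration:

    import net.spy.memcached.AddrUtil;
    import net.spy.memcached.MemcachedClient;

    public class ExpirySketch {
        public static void main(String[] args) throws Exception {
            MemcachedClient client = new MemcachedClient(AddrUtil.getAddresses("localhost:11211"));

            client.set("ST-demo", 5, "ticket payload").get();   // store with a 5-second expiration
            System.out.println(client.get("ST-demo"));          // -> ticket payload

            Thread.sleep(6000);
            System.out.println(client.get("ST-demo"));          // -> null, evicted by memcached itself

            client.shutdown();
        }
    }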
>>> On Mon, Oct 13, 2008 at 11:41 PM, Patrick Hennessy <[EMAIL PROTECTED]> wrote:
>>>
>>> I've been working on updating from 3.2 to 3.3 and wanted to give memcached
>>> a try instead of JBoss. I read Scott's message about performance, and we've
>>> had good success here with memcached for other applications. It also looks
>>> like using memcached instead of JBoss will simplify the configuration
>>> changes for the CAS server.
>>>
>>> I do have the JBoss replication working with CAS 3.2, but pounding the heck
>>> out of it with JMeter will cause some not-so-nice stuff to happen. I'm
>>> using VMware VI3 and configured an isolated switch for the clustering and
>>> Linux-HA traffic. I do see higher traffic levels coming to my cluster in
>>> the future, but I'm not sure if they'll reach the levels from my JMeter
>>> test. (I'm just throwing this out there because of the recent best-practice
>>> thread.)
>>>
>>> If I use memcached, is the ticketRegistryCleaner not needed anymore? I left
>>> those beans in the ticketRegistry.xml file and saw all kinds of errors.
>>> After taking them out it seems to load fine and appears to work, but I
>>> wasn't sure what the behavior is, and I haven't tested it further. What if
>>> memcached fills up all the way? Does anyone have a general idea of how much
>>> memory to allocate to memcached with regard to concurrent logins and
>>> tickets stored?
>>>
>>> Thanks,
>>>
>>> Pat
>>> --
>>> Patrick Hennessy ([EMAIL PROTECTED])
>>> Senior Systems Specialist
>>> Division of Information and Educational Technology
>>> Delaware Technical and Community College
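On the memory question, a very rough sizing aid: the verbose log earlier in the
thread shows serialized service tickets of about 2,689 bytes. Taking that
figure, and assuming (purely for illustration) a larger size for
ticket-granting tickets and particular numbers of live tickets, a
back-of-the-envelope estimate looks like this:

    // Ballpark sizing for the memcached instance backing the ticket registry.
    // The ~2.7 KB service-ticket size comes from the "add ... 2689" lines in the
    // verbose log above; the TGT size and the ticket counts are assumptions.
    public class TicketMemoryEstimate {
        public static void main(String[] args) {
            long serviceTicketBytes = 2689;      // observed serialized ST size
            long grantingTicketBytes = 8192;     // assumed TGT size (attributes make these bigger)
            long liveServiceTickets = 10000;     // STs expire after ~300 s, so few are resident
            long liveGrantingTickets = 100000;   // assumed concurrent SSO sessions

            long totalBytes = serviceTicketBytes * liveServiceTickets
                    + grantingTicketBytes * liveGrantingTickets;
            System.out.printf("~%.0f MB for ticket payloads%n", totalBytes / (1024.0 * 1024.0));
            // ~807 MB in this scenario, before memcached's own slab overhead, so a
            // 1 GB cache is in the right ballpark for a fairly large deployment.
        }
    }

By default, if memcached does fill up it evicts the least recently used items
to make room, so the symptom would be long-idle tickets disappearing early
rather than new writes failing outright.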
_______________________________________________
Yale CAS mailing list
[email protected]
http://tp.its.yale.edu/mailman/listinfo/cas
