On Mon, Oct 27, 2008 at 3:19 PM, Adam Rybicki <[EMAIL PROTECTED]> wrote:
> Scott,
>
> Great fix! I cannot make it fail again.

Any noticeable performance difference?

-Scott

> Thanks,
> Adam
>
> Scott Battaglia wrote:
>
> Let me know how the test goes. Also let me know if you find any bugs, since, again, it was written in about 30 seconds :-)
>
> -Scott
>
> -Scott Battaglia
> PGP Public Key Id: 0x383733AA
> LinkedIn: http://www.linkedin.com/in/scottbattaglia
>
> On Mon, Oct 27, 2008 at 1:26 PM, Adam Rybicki <[EMAIL PROTECTED]> wrote:
>
>> Scott,
>>
>> Great. I will grab it and retest. This will probably solve the issue.
>>
>> Adam
>>
>> Scott Battaglia wrote:
>>
>> On Fri, Oct 24, 2008 at 7:58 PM, Adam Rybicki <[EMAIL PROTECTED]> wrote:
>>
>>> Scott,
>>>
>>> I misdiagnosed the issue. I just ran the same test, except I ran only one instance of memcached, and I am getting a high error rate on ticket validations. So it has nothing to do with memcached replication. To investigate further, I disabled the second CAS server, and all the errors are gone. Of course, that is not a viable workaround. :-)
>>>
>>> My guess is that the error occurs when a ticket issued by one CAS server is being validated on another CAS server. I could not find a way to enable debug logging in /cas/serviceValidate, but I think I have found a major clue. It took most of the day today to hunt this down.
>>>
>>> With a single instance of memcached running in verbose mode, you can see a sequence of messages like this:
>>> ------------------------------
>>> <11 add ST-8023-M0sU2U2ijyQ53QPYWnGm-arybicki1 1 300 2689
>>> >11 STORED
>>> <7 get ST-8023-M0sU2U2ijyQ53QPYWnGm-arybicki1
>>> >7 sending key ST-8023-M0sU2U2ijyQ53QPYWnGm-arybicki1
>>> >7 END
>>> <7 replace ST-8023-M0sU2U2ijyQ53QPYWnGm-arybicki1 1 300 2689
>>> >7 STORED
>>> <7 delete ST-8023-M0sU2U2ijyQ53QPYWnGm-arybicki1 0
>>> >7 DELETED
>>> ------------------------------
>>> This is when everything went OK. The sequence below, however, represents a service ticket that failed to validate. That is apparently because an attempt to read the ticket was made before it was actually stored in the cache!
>>> ------------------------------
>>> <11 add ST-8024-tKeeo5gYhjqoQzstAgqO-arybicki1 1 300 2689
>>> <7 get ST-8024-tKeeo5gYhjqoQzstAgqO-arybicki1
>>> >7 END
>>> >11 STORED
>>> ------------------------------
>>> There may be some code that synchronizes access to the same object from the same client. However, it would seem that the service ticket is returned by CAS before it is actually stored in memcached. If this service ticket is then presented to another instance of CAS for validation, that instance fails to retrieve it from memcached because the "add" operation has not yet completed.
>>>
>>> Again, I have to emphasize that this is not an unrealistic test. jMeter is simply following redirects at the time of the failure, just as a browser would.
>>
>> We never saw that in production, and we ran 500 virtual users. However, if you are experiencing it, you most likely could update the MemcacheTicketRegistry to block on the Futures. I've actually updated the code in HEAD with an option to block on the Futures. :-)
>>
>> I have not tried it at all, since I wrote it all of 30 seconds ago. You can grab it from HEAD and try it out. The new property to enable it is "synchronizeUpdatesToRegistry".
>>
>> Let me know if it helps/doesn't help.
>>
>> -Scott
>>
>>> Adam
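An aside for anyone following along: below is a minimal sketch of what blocking on the client's Futures could look like inside a ticket registry, assuming the spymemcached client. The class, field, and timeout names are illustrative, not the actual HEAD code.
------------------------------
// Illustrative sketch only -- not the actual CAS HEAD code. Shows the idea
// behind "synchronizeUpdatesToRegistry": wait for memcached to acknowledge
// the write so CAS never hands out a ticket that is not yet stored.
import java.util.concurrent.Future;

import net.spy.memcached.MemcachedClient;

public final class BlockingRegistrySketch {

    private final MemcachedClient client;
    private final int ticketTimeoutSeconds;             // e.g. 300, as in the trace above
    private final boolean synchronizeUpdatesToRegistry;

    public BlockingRegistrySketch(final MemcachedClient client, final int ticketTimeoutSeconds,
            final boolean synchronizeUpdatesToRegistry) {
        this.client = client;
        this.ticketTimeoutSeconds = ticketTimeoutSeconds;
        this.synchronizeUpdatesToRegistry = synchronizeUpdatesToRegistry;
    }

    public void addTicket(final String ticketId, final Object ticket) {
        // set() is asynchronous: it returns a Future immediately.
        final Future<Boolean> result = this.client.set(ticketId, this.ticketTimeoutSeconds, ticket);
        if (this.synchronizeUpdatesToRegistry) {
            try {
                result.get(); // block until the memcached node has stored the ticket
            } catch (final Exception e) {
                throw new IllegalStateException("memcached update failed for " + ticketId, e);
            }
        }
    }
}
------------------------------
The cost is one blocked round trip per ticket write, which is presumably why it is an opt-in property rather than the default.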
>>> Scott Battaglia wrote:
>>>
>>> You have no need for sticky sessions. If you have two repcached servers and you've told your CAS instance about both of them, the memcached client essentially sees them as two memcached servers (since it's not familiar with repcached).
>>>
>>> The memcached client works by taking a hash of the key, and that hash determines which instance of memcached/repcached the item is stored on. repcached will then do its async replication. When you come to validate a ticket, the memcached client will again hash the key to determine which server the item is stored on. If that server is unreachable (as determined by the memcached client), then it will try the next likely server that would hold the data.
>>>
>>> -Scott
>>>
>>> On Fri, Oct 24, 2008 at 8:21 AM, Andrew Ralph Feller, afelle1 <[EMAIL PROTECTED]> wrote:
>>>
>>>> So what you are saying is that even with replication enabled, asynchronously replicated CAS clusters should have sticky sessions on regardless? I realize that synchronously replicated CAS clusters have no need of sticky sessions, seeing as the update goes to all servers before the user can move on.
>>>>
>>>> Andrew
>>>>
>>>> On 10/23/08 9:29 PM, "Scott Battaglia" <[EMAIL PROTECTED]> wrote:
>>>>
>>>> It actually shouldn't matter whether the async replication works or not. The memcache clients are designed to hash to a particular server and only check the backup servers if the primary isn't available.
>>>>
>>>> So you should always be validating against the original server unless it's no longer there.
>>>>
>>>> -Scott
>>>>
>>>> On Thu, Oct 23, 2008 at 9:17 PM, Adam Rybicki <[EMAIL PROTECTED]> wrote:
>>>>
>>>> Scott,
>>>>
>>>> I have run into an issue with MemCacheTicketRegistry and was wondering if you have any thoughts. I didn't want to create a new thread for this note. Anyone else with comments should feel free to reply, too. ;-)
>>>>
>>>> My tests have shown that when a ticket is generated on one CAS cluster member, it may sometimes fail to validate. This is apparently because the memcached asynchronous replication did not manage to send the ticket replica in time. Fast as repcached may be, under a relatively light load ST validation failed in 0.1% of cases, or once in 1,000 attempts. It would seem that the following sequence of tasks should be fairly involved:
>>>>
>>>> - Browser accesses a CAS-protected service
>>>> - Service redirects to CAS for authentication
>>>> - CAS validates the TGT
>>>> - CAS issues the ST for the service
>>>> - CAS redirects the browser to the service
>>>> - Service sends the ST for validation
>>>>
>>>> But they are fast! My jMeter testing showed this taking 28 milliseconds under light load on the CAS server, which is amazingly fast. Please note that in real life this can be just as fast, because the browser, CAS, and service perform these steps without the user slowing them down. CAS is indeed a lightweight system, and memcached does nothing to slow it down. It seems that in 0.1% of cases this round trip outperforms repcached's replication, even under light load. The bad news is that under heavy load the failure rate increases. I've seen as bad as an 8% failure rate.
>>>>
>>>> Have you or anyone else seen this? Have you had to work around this?
>>>>
>>>> Thanks,
>>>> Adam
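Scott's description of the client-side server selection a few messages up amounts to: hash the ticket id to pick a primary server, and fall through to the next likely server only if the primary is unreachable. A simplified, self-contained model of that behavior follows; real clients such as spymemcached use more elaborate hashing, and every name below is illustrative.
------------------------------
// Simplified model of memcached-client server selection. Because storing and
// validating a ticket hash the same key, both CAS nodes agree on which
// memcached server holds it -- hence no sticky sessions are required.
import java.util.List;
import java.util.function.Predicate;

public final class ServerSelectionSketch {

    static String selectServer(final String key, final List<String> servers,
            final Predicate<String> isReachable) {
        final int primary = Math.floorMod(key.hashCode(), servers.size());
        for (int i = 0; i < servers.size(); i++) {
            // Try the primary first, then the "next likely" servers in order.
            final String candidate = servers.get((primary + i) % servers.size());
            if (isReachable.test(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no memcached server reachable");
    }

    public static void main(final String[] args) {
        final List<String> servers = List.of("cache-a:11211", "cache-b:11211");
        // Issuing node and validating node both compute the same answer:
        System.out.println(selectServer("ST-8024-tKeeo5gYhjqoQzstAgqO-arybicki1", servers, s -> true));
    }
}
------------------------------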
>>>> Scott Battaglia wrote:
>>>>
>>>> On Tue, Oct 14, 2008 at 11:15 AM, Andrew Ralph Feller, afelle1 <[EMAIL PROTECTED]> wrote:
>>>>
>>>> Hey Scott,
>>>>
>>>> Thanks for answering some questions; really appreciate it. Just a handful more:
>>>>
>>>> 1. What happens whenever the server it intends to replicate with is down?
>>>>
>>>> It doesn't replicate :-) The client will send its request to the primary server, and if the primary server is down it will send it to the secondary instead. The repcache server itself will not replicate to the other server if it can't find it.
>>>>
>>>> 2. What happens whenever it comes back up?
>>>>
>>>> The repcache servers will sync with each other. The memcache clients will continue to function as they should.
>>>>
>>>> 3. Does the newly recovered machine synchronize itself with the other servers?
>>>>
>>>> The newly recovered machine will synchronize with its paired memcache server.
>>>>
>>>> -Scott
>>>>
>>>> Thanks,
>>>> Andrew
>>>>
>>>> On 10/14/08 9:56 AM, "Scott Battaglia" <[EMAIL PROTECTED]> wrote:
>>>>
>>>> Memcache, as far as I know, uses a hash of the key to determine which server to write to (and then, with repcache, it's replicated to its pair, which you configure).
>>>>
>>>> -Scott
>>>>
>>>> On Tue, Oct 14, 2008 at 10:38 AM, Andrew Ralph Feller, afelle1 <[EMAIL PROTECTED]> wrote:
>>>>
>>>> Scott,
>>>>
>>>> I've looked at the sample configuration file on the JA-SIG wiki; however, I was curious how memcached handles cluster membership, for lack of a better word. One of the things we are getting burned on with JBoss/JGroups is how frequently the cluster gets fragmented.
>>>>
>>>> Thanks,
>>>> Andrew
>>>>
>>>> On 10/14/08 8:58 AM, "Scott Battaglia" <[EMAIL PROTECTED]> wrote:
>>>>
>>>> We've disabled the registry cleaners, since memcached has explicit timeouts (which are configurable on the registry). We've configured it by default with 1 GB of RAM, I think, though I doubt we need that much.
>>>>
>>>> -Scott
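To make the "explicit timeouts" point concrete: memcached evicts an entry on its own once the expiration passed at write time elapses, so no separate cleaner thread has to sweep the registry. A sketch follows, again assuming the spymemcached client; the timeout values are examples only (the 300-second ST expiry matches Adam's trace earlier in the thread).
------------------------------
// Illustrative sketch: store each ticket with an expiration so memcached
// itself handles cleanup -- the reason the ticketRegistryCleaner goes away.
import net.spy.memcached.MemcachedClient;

final class ExpirationSketch {

    private static final int ST_TIMEOUT_SECONDS = 300;    // short-lived service tickets
    private static final int TGT_TIMEOUT_SECONDS = 28800; // e.g. an 8-hour TGT lifetime

    static void store(final MemcachedClient client, final String id, final Object ticket) {
        // The second argument is the expiration in seconds; once it elapses,
        // memcached discards the entry with no work on the CAS side.
        final int timeout = id.startsWith("TGT") ? TGT_TIMEOUT_SECONDS : ST_TIMEOUT_SECONDS;
        client.set(id, timeout, ticket);
    }
}
------------------------------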
>>>> On Mon, Oct 13, 2008 at 11:41 PM, Patrick Hennessy <[EMAIL PROTECTED]> wrote:
>>>>
>>>> I've been working on updating from 3.2 to 3.3 and wanted to give memcached a try instead of JBoss. I read Scott's message about performance, and we've had good success here with memcached for other applications. It also looks like using memcached instead of JBoss will simplify the configuration changes for the CAS server.
>>>>
>>>> I do have the JBoss replication working with CAS 3.2, but pounding the heck out of it with JMeter will cause some not-so-nice stuff to happen. I'm using VMWare VI3 and configured an isolated switch for the clustering and Linux-HA traffic. I do see higher traffic levels coming to my cluster in the future, but I'm not sure if they'll reach the levels from my JMeter test. (I'm just throwing this out there because of the recent Best practice thread.)
>>>>
>>>> If I use memcached, is the ticketRegistryCleaner not needed anymore? I left those beans in the ticketRegistry.xml file and saw all kinds of errors. After taking them out it seems to load fine and appears to work, but I wasn't sure what the behavior is, and I haven't tested it further. What if memcached fills up all the way? Does anyone have a general idea of how much memory to allocate to memcached with regard to concurrent logins and tickets stored?
>>>>
>>>> Thanks,
>>>>
>>>> Pat
>>>> --
>>>> Patrick Hennessy ([EMAIL PROTECTED])
>>>> Senior Systems Specialist
>>>> Division of Information and Educational Technology
>>>> Delaware Technical and Community College
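On Patrick's sizing question, a back-of-envelope estimate grounded in one number from this thread: the serialized service ticket in Adam's trace is 2,689 bytes. Rounding up to ~3 KB per ticket and assuming, purely hypothetically, 100,000 live tickets at peak:
------------------------------
// Back-of-envelope only; the per-ticket size comes from the trace above,
// and the concurrent-ticket count is a made-up example.
public final class MemorySizingSketch {
    public static void main(final String[] args) {
        final long bytesPerTicket = 3_000L;      // ~2,689 B serialized ST, rounded up
        final long concurrentTickets = 100_000L; // hypothetical peak
        final long total = bytesPerTicket * concurrentTickets;
        System.out.printf("~%d MB%n", total / (1024 * 1024)); // prints ~286 MB
    }
}
------------------------------
That lands well under the 1 GB default Scott mentions. As for filling up entirely: by default memcached evicts least-recently-used entries rather than refusing writes (the -M flag flips that behavior), so stale tickets fall out before the daemon fails outright.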
_______________________________________________
Yale CAS mailing list
[email protected]
http://tp.its.yale.edu/mailman/listinfo/cas
