I'm going to be updating the project page with as much bad English as I can possibly write :)
If anyone else is interested in contributing to the project, please let me know.

On Dec 19, 8:33 pm, "Jeremy Dunck" <[EMAIL PROTECTED]> wrote:
> Here's the IRC chat from today:
>
> Most useful bits:
>
> django-orm-cache.googlecode.com now exists.
>
> todo:
> [10:29pm] jdunck: Investigate whether ferringb (of Curse) supplied signal performance patch
> [10:29pm] jdunck: Add multiget to all cache backends
> [10:29pm] jdunck: Add max_key_length, max_value_length to all cache backends
> [10:29pm] jdunck: add memcache's replace semantics for set-only-if-exists behavior
> [10:29pm] jdunck: Support splitting qs values over max_value_length (in other words, do multiple sets and gets for a single list of objects if needed)
> [10:29pm] jdunck: bench sha vs. (python) hsieh and jenkins
> [10:29pm] jdunck: test w/o CachedModel __metaclass__ since that's a bit silly.
> [10:29pm] jdunck: invalidate whole list if any key in list is missing - ask dcramer
> [10:29pm] jdunck: All related field descriptors should check cache first
> [10:29pm] jdunck: Port to qs-refactor
>
> Full transcript:
>
> [7:27pm] jdunck: so, orm-based caching
> [7:27pm] jdunck: seems like row-level-caching goog soc never went anywhere?
> [7:27pm] jdunck: (you have time to talk now?)
> [7:33pm] zeeg: ya i do
> [7:33pm] zeeg: what you mean goog soc
> [7:33pm] jdunck: http://django-object-level-caching.googlecode.com/
> [7:34pm] zeeg: oh
> [7:34pm] zeeg: i dont like that
> [7:34pm] zeeg: at all
> [7:34pm] zeeg: but ya
> [7:34pm] zeeg: you read over all my stuff?
> [7:34pm] jdunck: well, i saw it's monkey-patching the standard QS
> [7:34pm] jdunck: i did.
> [7:35pm] zeeg: ya really what I want at the core, is a magical CachedModel
> [7:35pm] zeeg: that can handle all invalidation that (for most uses) we need
> [7:35pm] zeeg: which is just delete invalidation
> [7:35pm] zeeg: then using signals for model level dependencies (via registration)
> [7:35pm] zeeg: and reverse key mappings (except a Model + pks mapping) for row-level
> [7:35pm] jdunck: yeah-- over the sprint, i talked to jacob about some new signal ideas-- and he said he doesn't want to add any signals without first improving signal performance.
> [7:36pm] jdunck: i *like* signals, so don't mind improving their performance
> [7:36pm] zeeg: ya i think trunk's signals still suck
> [7:36pm] zeeg: ferringb patched ours (he's one of the devs at Curse)
> [7:36pm] zeeg: im not sure if his patch made trunk tho
> [7:36pm] zeeg: but ya, signals for dependencies is the last of my concerns
> [7:36pm] zeeg: ill rely on expiration based caching mostly
> [7:36pm] zeeg: but being able to handle invalidation at the row level is.. beautiful
> [7:37pm] zeeg: its obviously a bit more of a performance hit handling caching like this, but my tests showed it wasn't big enough to matter
> [7:37pm] zeeg: i was even going to add in the pre-expiration routines
> [7:37pm] zeeg: (so if something expires in <predefined> minutes, it gets automatically locked and recached by the first person to see it)
> [7:38pm] jdunck: not sure what you mean by "locked"
> [7:38pm] zeeg: basically, when you set a key, you set either another key, or in that key you're setting you tokenize it
> [7:38pm] zeeg: and that other key, or the first token
> [7:39pm] zeeg: contains the expiration time, or expiration time - minutes
> [7:39pm] zeeg: and when you fetch that key
> [7:39pm] zeeg: if that expiration time has been reached (the pre-expiration), you set a lock value, which says, if anyone else is looking at this and checking, ignore it
> [7:39pm] zeeg: and then you recreate that cache
> [7:39pm] jdunck: ah. you assume no purging due to MRU memory limits?
> [7:39pm] zeeg: well ya, only so much you can plan for
> [7:39pm] zeeg: but w/ that, it potentially stops heavily accessed keys
> [7:40pm] zeeg: from being regenerated 100s of times
> [7:40pm] jdunck: fwiw, here's a wrapper i made to deal with the same problem: http://code.djangoproject.com/ticket/6199
> [7:40pm] zeeg: if they take too long to generate
> [7:40pm] zeeg: ah ya
> [7:40pm] zeeg: thats your code?
> [7:40pm] jdunck: yeah
> [7:40pm] zeeg: I think I saw that linked on memcached
> [7:40pm] zeeg: they talked about the usage at CNET and I thought it'd be a great addition
> [7:40pm] jdunck: hmm. i posted on that list a while back, but it wasn't a ticket at the time.
> [7:41pm] zeeg: ya i just remember seeing the code
> [7:41pm] jdunck: well, anyway, do you not like that approach? just wrapping stampedes for the whole backend?
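The pre-expiration locking scheme described above (and the wrapper in ticket #6199) could be sketched roughly as follows. This is a minimal illustration, not either implementation: the `StampedeCache` name, the key layout, and the `SOFT_TTL`/`LOCK_TTL` values are all assumptions, and `backend` stands for any object with Django-style `get`/`set`/`add` methods.

```python
import time

# Illustrative constants -- real values would be tuned per deployment.
SOFT_TTL = 60   # regenerate this many seconds before the real expiry
LOCK_TTL = 30   # how long one client may hold the regeneration lock


class StampedeCache:
    """Sketch of stampede protection via a 'soft' expiry stored
    alongside the value, as discussed in the transcript."""

    def __init__(self, backend):
        self.backend = backend  # any object with get/set/add

    def get_or_set(self, key, builder, timeout=300):
        now = time.time()
        entry = self.backend.get(key)
        if entry is not None:
            value, soft_expiry = entry
            if now < soft_expiry:
                return value
            # Soft-expired: only the first caller to win the add()
            # race rebuilds; everyone else keeps the stale value.
            if not self.backend.add(key + ':lock', 1, LOCK_TTL):
                return value
        value = builder()
        # Store the value together with its soft expiry time.
        self.backend.set(key, (value, now + timeout - SOFT_TTL), timeout)
        return value
```

The point of `add` here is that it is atomic on memcached: only one concurrent caller can create the lock key, so a hot key is regenerated once rather than hundreds of times.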
> [7:41pm] zeeg: and im like, cool, it must be useful if others are doing it
> [7:41pm] zeeg: well in the backend I think its the best approach actually
> [7:41pm] jdunck: i can see some ppl being annoyed that it has some book-keeping overhead and doesn't store exactly what you say to store.
> [7:41pm] zeeg: the way CNET did it, was they used 3 keys
> [7:42pm] zeeg: actual data, expiration key, and locking key
> [7:42pm] zeeg: which i can see benefits of doing it both in seperate keys, and in a combined key
> [7:43pm] jdunck: do you use gearman or some other background jobber?
> [7:43pm] zeeg: nope
> [7:43pm] zeeg: not familiar w/ them
> [7:44pm] jdunck: i mean, my understanding is that [EMAIL PROTECTED] went a totally different direction-- have a daemon that feeds in updated keys, so that the web app never misses keys
> [7:44pm] jdunck: (obviously doesn't work for ad hoc stuff)
> [7:44pm] zeeg: ah ya we do that for a few things
> [7:44pm] zeeg: only things that are slow to cache tho
> [7:44pm] jdunck: do you have a > 1MB memcached compilation?
> [7:44pm] jdunck: i was surprised to find that hard limit. QS results can easily reach that.
> [7:45pm] zeeg: one sec brb
> [7:45pm] zeeg: like 1mb in a key?
> [7:45pm] jdunck: yeah
> [7:45pm] jdunck: crazy-talk, i know.
> [7:45pm] jdunck: in your scheme, can you imagine a list of object keys getting to 1MB?
> [7:46pm] jdunck: 100 bytes per key, a list of 10000 object keys would result in ~1mb; missed key set in standard memcache
> [7:48pm] zeeg: hrm
> [7:48pm] zeeg: so you mean a cache that would store 10k objects in it?
> [7:50pm] jdunck: let me back up. a standard memcache will only store a key value of 1mb or less
> [7:50pm] jdunck: you can compile it to store more per key value
> [7:51pm] jdunck: we (pegnews.com) are currently throwing queryset results in cache
> [7:51pm] jdunck: sometimes that results in a miss because the qs is too big.
> [7:51pm] jdunck: we're silly for throwing in huge qs anyway, but quick-n-dirty mostly works
> [7:52pm] jdunck: anyway, if i understand correctly, your cacheqs would store hash(qs kwargs) as the key, and [ct_id:pk_val1, ct_id:pk_val2, ...] as the value
> [7:52pm] jdunck: each individual object has ct_id:pk_val:1 as the key, and the model instance as the value
> [7:52pm] jdunck: right?
> [7:53pm] jdunck: i was just pointing out that a result list long enough would still hit the 1mb limit, resulting in a miss on the qs key lookup.
> [7:55pm] zeeg: ya
> [7:55pm] zeeg: you'd still have the same limitation
> [7:55pm] zeeg: my plan was to store
> [7:55pm] zeeg: hrm
> [7:55pm] zeeg: what was my plan
> [7:55pm] jdunck: hah
> [7:55pm] zeeg: i think it was up in the air
> [7:55pm] zeeg: but it'd be like
> [7:55pm] zeeg: ModelClass,(pk, pk, pk, pk),(related, fields, to, select)
> [7:56pm] zeeg: feel free to poke holes
> [7:56pm] zeeg: the one issue i see
> [7:56pm] zeeg: im not sure how big ModelClass is
> [7:56pm] zeeg: when serialized
> [7:59pm] zeeg: but w/ this cool system
> [7:59pm] zeeg: if you *needed* to
> [7:59pm] zeeg: you could say "oh shit im trying to insert too much"
> [8:00pm] zeeg: and be like ModelClass, (pks*,), (fields*), number_of_keys
> [8:00pm] zeeg: and split it into multiple keys
> [8:00pm] zeeg: it would be nearly just as fast
> [8:00pm] zeeg: basing off of my multi-get bench results
> [8:00pm] zeeg: thats what i like about taking this approach
> [8:00pm] zeeg: is the developer doesnt have to worry about any of that
> [8:04pm] zeeg: im actually hoping to get a rough version of this done over the holidays while im on vaca
> [8:05pm] jdunck: the (related,fields,to,select) bit above is FK/M2M rels to follow?
> [8:05pm] zeeg: select_related more or less
> [8:05pm] zeeg: so it knows what to lookup in the batch keys when it grabs it
> [8:05pm] jdunck: yeah.. i wonder what select_related does for cycles...
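The scheme being discussed (hash the queryset kwargs into one key whose value is a list of per-object keys, and split that list across several keys when it would blow the backend's value limit) might look roughly like this. The function names, the key format, and the chunking-by-summed-key-length heuristic are illustrative assumptions, not the project's actual design:

```python
import hashlib

MAX_VALUE_LENGTH = 1024 * 1024  # memcached's default 1MB value limit


def qs_cache_key(model_label, **kwargs):
    """One stable key per (model, filter-kwargs) combination.
    Sorting the kwargs makes the key independent of argument order."""
    digest = hashlib.sha1(repr(sorted(kwargs.items())).encode()).hexdigest()
    return '%s:%s' % (model_label, digest)


def split_key_list(object_keys, max_len=MAX_VALUE_LENGTH):
    """Split a list of per-object keys into chunks whose total key
    length stays under the backend's value limit (a rough bound --
    a real implementation would measure the serialized size)."""
    chunks, current, size = [], [], 0
    for k in object_keys:
        if current and size + len(k) > max_len:
            chunks.append(current)
            current, size = [], 0
        current.append(k)
        size += len(k)
    if current:
        chunks.append(current)
    return chunks
```

With the list split this way, one multi-get fetches all the chunk keys, and a second multi-get fetches the objects themselves, which is the "nearly just as fast" property zeeg's multi-get benchmarks point at.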
> [8:05pm] zeeg: so it does select list -> select list of pks (batch) -> select huge batch of related_fields
> [8:05pm] jdunck: yeah, i follow
> [8:05pm] zeeg: although that potentially may have to be split up too
> [8:06pm] zeeg: is there a limit on how much data sends back and forth between memcached
> [8:06pm] jdunck: yeah, that's a simple abstraction, no biggie
> [8:06pm] zeeg: or is that the 1mb you were referring to (i was assuming storage)
> [8:06pm] jdunck: it's 1mb per key value by default in memcache.
> [8:06pm] zeeg: k
> [8:06pm] jdunck: other backends are different, i'm sure
> [8:06pm] zeeg: ya dont care about those tho
> [8:06pm] zeeg: if anyone uses anything else they're not looking for the kind of performance this is aimed at
> [8:07pm] zeeg: but in theory, it'd support them
> [8:07pm] zeeg: (i dont think they allow multi-gets tho, so it probably does them one at a time)
> [8:07pm] jdunck: i don't really care about them either, but if this is to go in core, we probly should make max_value_size and supports_multiget as vals on the cache backend
> [8:08pm] zeeg: doesnt cache backend all have multi get by default?
> [8:08pm] zeeg: i saw it in the memcached code so i assumed it was across the board
> [8:08pm] zeeg: (i want to personally add incr/decr into the cache backend)
> [8:08pm] zeeg: thats another thing id like to potentially support with this, is namespaces
> [8:08pm] zeeg: but thats another pretty big addition
> [8:08pm] zeeg: and can come later
> [8:08pm] jdunck: nope, not in file, for example
> [8:08pm] jdunck: easy to add, tho, that's a good point
> [8:09pm] zeeg: but being that cache keys are db_table:hash, should be fairly easy
> [8:09pm] jdunck: honestly, i don't get what incr/decr does. are you hand-rolling ref-counting on something?
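The `max_value_size` / `supports_multiget` capability flags jdunck proposes could be sketched as a thin wrapper: advertise what the backend can do, and fall back to one `get` per key for backends (like the file backend) that have no native multi-get. The class name and the `get_many` method name are assumptions for illustration:

```python
class BackendWrapper:
    """Sketch of per-backend capability flags, as proposed above.
    Wraps any object exposing at least get(key)."""

    def __init__(self, backend, max_value_size=1024 * 1024):
        self.backend = backend
        self.max_value_size = max_value_size
        # A backend "supports multiget" if it exposes a batch getter.
        self.supports_multiget = hasattr(backend, 'get_many')

    def get_many(self, keys):
        if self.supports_multiget:
            return self.backend.get_many(keys)
        # Fallback: one round-trip per key, as the file backend
        # would need -- correct, just slower.
        result = {}
        for key in keys:
            value = self.backend.get(key)
            if value is not None:
                result[key] = value
        return result
```

Callers then code against `get_many` everywhere and let the wrapper decide whether the batch is one network round-trip or many, which keeps the ORM-cache layer backend-agnostic.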
> [8:09pm] jdunck: i mean, i understand what the primitive does, i'm just not smart enough to see the point
> [8:10pm] zeeg: ya if you used namespaces it could help
> [8:10pm] zeeg: if you were threaded
> [8:10pm] zeeg: and you did cache.get then cache.set
> [8:10pm] zeeg: it ...
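The point zeeg starts to make about incr/decr is the classic get-then-set race: two threads can both read the same counter value and both write back value+1, losing an update, whereas a backend-side `incr` applies the increment atomically. A minimal in-process sketch of that atomic behavior (the `CounterCache` class is illustrative; on a real memcached backend the server's incr command plays the role of the lock):

```python
import threading


class CounterCache:
    """Toy in-memory cache whose incr() is atomic, standing in for
    memcached's server-side incr command."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def incr(self, key, delta=1):
        # The lock makes read-modify-write one indivisible step,
        # which is exactly what cache.get + cache.set cannot give you.
        with self._lock:
            self._data[key] = self._data.get(key, 0) + delta
            return self._data[key]

    def get(self, key):
        return self._data.get(key)
```

With plain get-then-set, thread A reads 5, thread B reads 5, and both write 6; with an atomic incr, the two updates serialize and the counter reads 7.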
