Revision: 14432
Author: adrian.chadd
Date: Tue Feb 23 19:31:22 2010
Log: Created wiki page through web user interface.
http://code.google.com/p/lusca-cache/source/detail?r=14432
Added:
/wiki/ProjectAsyncReadCopy.wiki
=======================================
--- /dev/null
+++ /wiki/ProjectAsyncReadCopy.wiki Tue Feb 23 19:31:22 2010
@@ -0,0 +1,48 @@
+#summary Eliminating a memcpy() when reading data from disk
+
+= Introduction =
+
+Because of the architecture of Squid/Lusca, a temporary buffer is
allocated and used to handle read requests from the disk. The data is then
copied to the caller's buffer when the IO completes.
+
+This copy becomes a significant overhead for high-hit workloads which
lean heavily on the operating system disk cache.
+
+This project attempts to eliminate the copy by enforcing, for now, that
the callback and its buffer will always remain valid until the IO is
completed or cancelled.
+
+= Overview =
+
+The Squid/Lusca codebase makes extensive use of a reference-counted object
type called "cbdata". This ensures that callbacks are only made
with "valid" callback data. Unfortunately this is used as an early abort
mechanism by a large pat of the codebase.
+
+Thus, if an asynchronous event is scheduled which uses a callback data
pointer and/or any of the memory it points to (say, a memory buffer to read
data into) there is no guarantee that the data buffer will remain valid for
the duration of the event. In the case of disk IO reads, the kernel may be
in the process of handling a read() call into the buffer in an IO thread
whilst the main Squid/Lusca thread aborts the connection and frees or
reuses the underlying memory buffer.
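+
+The hazard above can be modelled in a few lines of C. This is a
simplified, illustrative sketch of reference-counted callback data with a
validity flag - the names loosely echo the real cbdata API, but the struct
layout and functions here are invented for illustration:

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative model of refcounted callback data with a validity flag.
 * "Freeing" only marks the record invalid; the storage survives until
 * the last pending callback drops its lock. */
typedef struct {
    int refcount;   /* locks held by pending asynchronous callbacks */
    int valid;      /* cleared on "free"; storage lives until refs drop */
    void *data;     /* the caller's state (and, dangerously, its buffers) */
} cbdata_t;

cbdata_t *cbdata_alloc(void *data) {
    cbdata_t *c = calloc(1, sizeof(*c));
    c->valid = 1;
    c->data = data;
    return c;
}

void cbdata_lock(cbdata_t *c) { c->refcount++; }
int cbdata_valid(const cbdata_t *c) { return c->valid; }

/* The early abort: pending callbacks see valid == 0 and bail out.  Note
 * that any raw buffer c->data pointed at may already have been reused -
 * which is exactly the disk-read race described above. */
void cbdata_mark_freed(cbdata_t *c) { c->valid = 0; }

int cbdata_unlock(cbdata_t *c) {
    if (--c->refcount == 0 && !c->valid) {
        free(c);
        return 1;   /* storage really released */
    }
    return 0;
}
```

The model shows why validity of the cbdata record says nothing about the
validity of the raw buffers it points to.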
+
+= Current Method =
+
+The current approach mirrors the support offered by various underlying
operating systems.
This aims to be a "springboard" to layer further abstractions on top of
later on as needed but does not necessarily lock the codebase into a
specific paradigm. It is also currently the riskiest!
+
+Another aim is to evaluate what is required to implement this change for
other asynchronous events with a future goal of allowing cbdata to be
properly shared between active threads. This would allow for further
processing in other threads (eg, URL rewriting, content rewriting, etc)
without requiring an intermediate copy step to ensure data remains valid
for the duration of the asynchronous event.
+
+The caller must supply a callback+cbdata AND buffer which will remain
valid for the duration of the read event. The store client, which is
effectively the callback for the read IO mechanisms, must now remain valid
until the IO completes or is explicitly cancelled.
+
+The store client now tracks whether it is active or not.
storeClientUnregister() now doesn't free the store client; it marks it as
inactive. Callbacks are then responsible for calling storeClientComplete()
to check whether the store client is done - and if so, the callback is
aborted.
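+
+The lifecycle described above can be sketched as follows. The function
names follow the text (storeClientUnregister(), storeClientComplete()),
but the struct fields and control flow are invented for illustration:

```c
#include <assert.h>
#include <stdlib.h>

/* Sketch of the revised store client lifecycle; not the real layout. */
typedef struct {
    int active;       /* set on register, cleared by unregister */
    int io_pending;   /* an asynchronous read has not yet completed */
} store_client;

store_client *storeClientRegister(void) {
    store_client *sc = calloc(1, sizeof(*sc));
    sc->active = 1;
    return sc;
}

/* No longer frees the store client - just marks it inactive, so any
 * in-flight IO can keep using the client's buffer safely. */
void storeClientUnregister(store_client *sc) {
    sc->active = 0;
}

/* Called from IO callbacks: returns 1 (and frees the client) only when
 * it is inactive AND no IO remains outstanding; the callback must then
 * abort without touching the client again. */
int storeClientComplete(store_client *sc) {
    if (!sc->active && !sc->io_pending) {
        free(sc);
        return 1;
    }
    return 0;
}
```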
+
+= Risks with the current method =
+
+The codebase is a very large maze of twisty passages, all alike. There's
more involved in the read path than just straight reference counting - for
example, the general store disk IO stuff involves both the store client
and storeIOState as part of the callback data for various events - and
this will likely need similar separation and treatment.
+
+The aioCancel() path needs further testing before it can be considered
fully fixed; its exact behaviour still isn't completely clear.
+
+A few of the callbacks will call storeClientComplete() to check whether
they need to be freed, and then abort the function if the store client
isn't active. I'm not entirely sure why a callback would be fired for an
in-progress but not-active store client, and this requires further
investigation. (In reality, I've forgotten why I wrote this in the past and
need to fully map out what's going on - then comment things! - before I'm
satisfied with it.)
+
+Mapping out all of the possible interactions with store client and the
storeIOState would be very, very helpful in this.
+
+storeClientComplete() shouldn't do the checks AND free things. They should
be separated out for clarity.
+
+= Alternative Approaches =
+
+One alternative is to refcount the IO buffer separately from the cbdata
for the callback. The IO layer can then increase the buffer reference, read
into it, and then release the reference once the IO completes. If the
callback data has been freed then the callback will not be called but the
IO buffer will still be valid for the duration of the IO.
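+
+A minimal sketch of this refcounted-buffer scheme, with hypothetical
names - the IO layer would take its own reference before the read() and
drop it on completion, so the buffer outlives an early abort by the
caller:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted IO buffer, separate from the callback cbdata. */
typedef struct {
    int refcount;
    char *mem;
    size_t len;
} io_buf;

io_buf *io_buf_new(size_t len) {
    io_buf *b = calloc(1, sizeof(*b));
    b->mem = malloc(len);
    b->len = len;
    b->refcount = 1;    /* the caller's reference */
    return b;
}

/* The IO layer takes a reference for the duration of the read. */
void io_buf_ref(io_buf *b) { b->refcount++; }

/* Returns 1 when the last reference is dropped and the memory is freed. */
int io_buf_unref(io_buf *b) {
    if (--b->refcount == 0) {
        free(b->mem);
        free(b);
        return 1;
    }
    return 0;
}
```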
+
+Another alternative is to request filled in pages from the IO layer - so
instead of the caller supplying a destination buffer, the caller simply
states what size the buffer should be, and is handed memory page(s) with
the relevant data.
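+
+A rough sketch of what such a page-handoff interface might look like,
with entirely hypothetical names; the synchronous stand-in below only
shows the ownership rules (the IO layer allocates the page, the callback
receives it, the caller releases it), not real disk IO:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical page handed back by the IO layer. */
typedef struct {
    char *data;   /* page memory owned by the IO layer */
    size_t len;   /* bytes of valid data */
} io_page;

typedef void (*read_done_fn)(void *cbdata, io_page *page, int error);

/* Synchronous stand-in for the asynchronous page read: no destination
 * buffer is supplied - the caller only states the size it wants. */
void storeReadPage(size_t size, read_done_fn callback, void *cbdata) {
    io_page *p = malloc(sizeof(*p));
    p->data = calloc(1, size);
    p->len = size;
    callback(cbdata, p, 0);
}

void io_page_release(io_page *p) {
    free(p->data);
    free(p);
}

/* Demo callback: just records the page it was handed. */
static io_page *last_page;
static void on_read(void *cbdata, io_page *page, int error) {
    (void)cbdata;
    (void)error;
    last_page = page;
}
```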
+
+= Development Links =
+
+* Branch:
[http://code.google.com/p/lusca-cache/source/list?path=/playpen/LUSCA_HEAD_zerocopy_storeread]
+* Diff against LUSCA_HEAD (r14431):
[http://www.creative.net.au/diffs/LUSCA_HEAD_zerocopy_storeread.r14431.diff]