Re: is it possible to mark buckets to be copied only when to be set-aside?
It's a bug in mod_xslt, if that module trys to set aside a transient bucket. Bill At 12:09 AM 5/19/2004, Stas Bekman wrote: We have the following situation in mod_perl 2 land: we use the same buffer to allocate data in buckets which are passed to the filters. That bucket is created once per request. It works perfectly fine and effective most of the time (we copy from user's program perl space into the re-usable buffer, but no extra allocation happens). But just now one user has reported that it breaks mod_xslt filter, which sets aside the buckets sent from the modperl handler, and then uses them after seeing EOS. By that time the data in all but last bucket is corrupted. Obviously the straightforward solution is to allocate a new buffer for each bucket that mod_perl sends to the filter chain. But this is a huge waste for most users, which don't use this particular kind of output filters that setaside buckets. My question is: Is it possible to mark the bucket's data as volatile or something, so if a downstream filter wants to set them aside it will have to do the copying? At the moment the bucket is created as: bucket = apr_bucket_transient_create(buf, len, ba); and buf is static for the length of the request. Thank you. -- __ Stas BekmanJAm_pH -- Just Another mod_perl Hacker http://stason.org/ mod_perl Guide --- http://perl.apache.org mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com http://modperlbook.org http://apache.org http://ticketmaster.com
Re: is it possible to mark buckets to be copied only when to be set-aside?
William A. Rowe, Jr. wrote: It's a bug in mod_xslt, if that module trys to set aside a transient bucket. Ah, cool, for some reason I thought that transient still has to have a copy :) Thanks Bill! At 12:09 AM 5/19/2004, Stas Bekman wrote: We have the following situation in mod_perl 2 land: we use the same buffer to allocate data in buckets which are passed to the filters. That bucket is created once per request. It works perfectly fine and effective most of the time (we copy from user's program perl space into the re-usable buffer, but no extra allocation happens). But just now one user has reported that it breaks mod_xslt filter, which sets aside the buckets sent from the modperl handler, and then uses them after seeing EOS. By that time the data in all but last bucket is corrupted. Obviously the straightforward solution is to allocate a new buffer for each bucket that mod_perl sends to the filter chain. But this is a huge waste for most users, which don't use this particular kind of output filters that setaside buckets. My question is: Is it possible to mark the bucket's data as volatile or something, so if a downstream filter wants to set them aside it will have to do the copying? At the moment the bucket is created as: bucket = apr_bucket_transient_create(buf, len, ba); and buf is static for the length of the request. Thank you. -- __ Stas BekmanJAm_pH -- Just Another mod_perl Hacker http://stason.org/ mod_perl Guide --- http://perl.apache.org mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com http://modperlbook.org http://apache.org http://ticketmaster.com -- __ Stas BekmanJAm_pH -- Just Another mod_perl Hacker http://stason.org/ mod_perl Guide --- http://perl.apache.org mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com http://modperlbook.org http://apache.org http://ticketmaster.com
Re: is it possible to mark buckets to be copied only when to be set-aside?
On Tue, 18 May 2004, Stas Bekman wrote: extra allocation happens). But just now one user has reported that it breaks mod_xslt filter, which sets aside the buckets sent from the modperl handler, and then uses them after seeing EOS. That seems to me an unnecessarily complex and inefficient XSLT implementation. What XSLT needs to do with its data is to parse to a DOM. By using libxml2/libxslt we can use a parseChunk API, and thus feed every bucket to the parser as soon as it reaches the filter. No need at all to buffer or setaside it. We have at least one implementation that works like that (originally mine, but now more actively developed by others as mod_transform). Perhaps mod_perl users might benefit from switching? -- Nick Kew
Re: is it possible to mark buckets to be copied only when to be set-aside?
On Wed, 2004-05-19 at 07:38 +0100, Nick Kew wrote: We have at least one implementation that works like that (originally mine, but now more actively developed by others as mod_transform). Perhaps mod_perl users might benefit from switching? Currently it is only in a Subversion Repo: http://svn.outoforder.cc/svn/mod_transform/trunk/ We were planing on making our first public 'release' version in the next day or so. -Paul Querna signature.asc Description: This is a digitally signed message part
Re: is it possible to mark buckets to be copied only when to be set-aside?
Nick Kew wrote: On Tue, 18 May 2004, Stas Bekman wrote: extra allocation happens). But just now one user has reported that it breaks mod_xslt filter, which sets aside the buckets sent from the modperl handler, and then uses them after seeing EOS. That seems to me an unnecessarily complex and inefficient XSLT implementation. What XSLT needs to do with its data is to parse to a DOM. By using libxml2/libxslt we can use a parseChunk API, and thus feed every bucket to the parser as soon as it reaches the filter. No need at all to buffer or setaside it. We have at least one implementation that works like that (originally mine, but now more actively developed by others as mod_transform). Perhaps mod_perl users might benefit from switching? Frankly I even have no idea who is the author of mod_xslt, it's not part of the mod_perl project. -- __ Stas BekmanJAm_pH -- Just Another mod_perl Hacker http://stason.org/ mod_perl Guide --- http://perl.apache.org mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com http://modperlbook.org http://apache.org http://ticketmaster.com
Re: is it possible to mark buckets to be copied only when to be set-aside?
On Tue, 18 May 2004, Stas Bekman wrote: Frankly I even have no idea who is the author of mod_xslt, it's not part of the mod_perl project. There are several modules with that name. When you raised it as a problem you had encountered with mod_perl, I thought maybe mod_perl had specific hooks for it - although there would seem to be no reason to do so other than particular system optimisations. -- Nick Kew
Re: is it possible to mark buckets to be copied only when to be set-aside?
On Wed, 19 May 2004, William A. Rowe, Jr. wrote: It's a bug in mod_xslt, if that module trys to set aside a transient bucket. Huh? No it isn't. Half of the reason setaside() even exists is to handle transient buckets. static apr_status_t transient_bucket_setaside(apr_bucket *b, apr_pool_t *pool) { b = apr_bucket_heap_make(b, (char *)b-data + b-start, b-length, NULL); if (b == NULL) { return APR_ENOMEM; } return APR_SUCCESS; } In other words, setting aside a transient bucket does exactly what you want -- it copies the data into a new buffer allocated on the heap. So I think mod_perl is doing the right thing by using a transient bucket here, and I think mod_xslt is doing the right thing by calling setaside. However, setaside is supposed to be used prior to returning from any function in the current call stack, not just when you want to keep a bucket past the end of the current request. In other words, if any filter wants to keep a bucket and return to its caller rather than passing that bucket to the next filter, it should call setaside on that bucket. Failing to do so won't be a problem for most bucket types, but it most assuredly WILL be a problem for transient buckets, by the definition of transient: the storage behind a transient bucket is only guaranteed to exist and have the desired data in it until you return back into the function that created that bucket. Note: ap_save_brigade() handles the setaside and brigade concatenation for you. I suspect there are a number of places in the code that we are incorrectly calling APR_BRIGADE_CONCAT() instead of ap_save_brigade(). That's bug city right there. --Cliff
Re: is it possible to mark buckets to be copied only when to be set-aside?
On Wed, 19 May 2004, Cliff Woolley wrote: Note: ap_save_brigade() handles the setaside and brigade concatenation for you. I suspect there are a number of places in the code that we are incorrectly calling APR_BRIGADE_CONCAT() instead of ap_save_brigade(). That's bug city right there. Worse, there are probably places in the code where we don't even use APR_BRIGADE_CONCAT() and simply append each bucket one at a time from the source brigade to the saved brigade using APR_BRIGADE_INSERT_TAIL() (or even APR_BUCKET_INSERT_AFTER(), though it's hard for me to imagine a case where doing it that way would have made sense to the programmer). Furthermore, I just found a bug in ap_save_brigade(). On line 534 of util_filter.c: rv = apr_bucket_setaside(e, p); if (rv != APR_SUCCESS /* ### this ENOTIMPL will go away once we implement setaside ### for all bucket types. */ rv != APR_ENOTIMPL) { return rv; } Note that ### comment? Yeah it denotes what at this point is a bug. You can't just ignore APR_ENOTIMPL for bucket types, because pipe and socket buckets never had setaside implemented on them. I thought I remembered that that was because Greg and Ryan and I had some huge debate about it and it was decided for some reason that setting aside a pipe or socket didn't make sense. But now I can't find where we talked about that in the archives. Anyway, when Greg originally wrote the lines in question (see http://marc.theaimsgroup.com/?l=apr-devm=99190977519715w=2), he obviously did so assuming that setaside would be implemented on all bucket types. It was not. I recommend that the interested reader go back and take a look at the following thread (this kind of starts somewhere in the middle but this one message and its followups are particularly relevant): http://marc.theaimsgroup.com/?l=apr-devm=99178798619777w=2 As it is, given the current State Of The Buckets: If rv == APR_ENOTIMPL when you call apr_bucket_setaside(), then you're supposed to call apr_bucket_read(), get the data out of the bucket, and stick it into a heap bucket. ap_save_brigade() could possibly do a little optimizing here in the special cases of socket and pipe buckets, and grab the actual APR socket or pipe out of the bucket's internal data and compare the socket/pipe's pool to pool p. If they already match, you don't have to do the read and make-into-heap-bucket process; the reason being is that the point of setaside is to tell the buckets code that you want it to guarantee that the data storage behind bucket e will live at least as long as pool p. I thought I could just refer for some of this back to either of my sets of ApacheCon slides, but I can see I discussed setaside() insufficiently well at both ApacheCons. :( --Cliff
Re: is it possible to mark buckets to be copied only when to be set-aside?
At 08:33 AM 5/19/2004, Cliff Woolley wrote: On Wed, 19 May 2004, William A. Rowe, Jr. wrote: It's a bug in mod_xslt, if that module trys to set aside a transient bucket. Huh? No it isn't. Half of the reason setaside() even exists is to handle transient buckets. I didn't say setaside(), and yes - you identifed the proper behavior (which I doubt the mod_xslt was doing.) Perhaps if I said trys to hold a reference it might have been less ambiguous. Bill
Re: is it possible to mark buckets to be copied only when to be set-aside?
On Wed, 19 May 2004, Cliff Woolley wrote: can't just ignore APR_ENOTIMPL for bucket types, because pipe and socket buckets never had setaside implemented on them. I thought I remembered that that was because Greg and Ryan and I had some huge debate about it and it was decided for some reason that setting aside a pipe or socket didn't make sense. But now I can't find where we talked about that in the archives. BTW - I remembered the reason why it was decided that it didn't make sense to implement setaside for pipe/socket buckets. It's because for all other bucket types, setaside takes one bucket in and produces one bucket out, whereas for pipes and sockets it would take one bucket in and either produce one bucket if the lifetime request was already satisfied or it would produce a chain of n buckets out where n is proportional to the size of the data that had been waiting around to be read in the pipe/socket. It was that inconsistency that was disliked. At this point, I'm willing to agree to whatever makes things work and would probably accept this inconsistency as long as it's documented. What we have now is bug-prone beyond belief. --Cliff
Re: is it possible to mark buckets to be copied only when to be set-aside?
Cliff Woolley wrote: On Wed, 19 May 2004, Cliff Woolley wrote: can't just ignore APR_ENOTIMPL for bucket types, because pipe and socket buckets never had setaside implemented on them. I thought I remembered that that was because Greg and Ryan and I had some huge debate about it and it was decided for some reason that setting aside a pipe or socket didn't make sense. But now I can't find where we talked about that in the archives. BTW - I remembered the reason why it was decided that it didn't make sense to implement setaside for pipe/socket buckets. It's because for all other bucket types, setaside takes one bucket in and produces one bucket out, whereas for pipes and sockets it would take one bucket in and either produce one bucket if the lifetime request was already satisfied or it would produce a chain of n buckets out where n is proportional to the size of the data that had been waiting around to be read in the pipe/socket. It was that inconsistency that was disliked. At this point, I'm willing to agree to whatever makes things work and would probably accept this inconsistency as long as it's documented. What we have now is bug-prone beyond belief. And a lot of pain and wasted time could be saved if all this was documented. I'm not aware of such documentation's existance, besides very scarce notes in the header files. Filters are a very complex area, and w/o proper user docs it's no wonder that we see all these problems. The worst part is that some of the developers who originally wrote things are no longer here to help, so their knowledge and reasons, for doing things in certain ways, get lost. I wish someone who groks filters well could write a proper document explaining most of (all?) the ins and outs of writing a proper filter. The old article written by rbb is nice, but a way too short to cover all the nuances. -- __ Stas BekmanJAm_pH -- Just Another mod_perl Hacker http://stason.org/ mod_perl Guide --- http://perl.apache.org mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com http://modperlbook.org http://apache.org http://ticketmaster.com
Re: is it possible to mark buckets to be copied only when to be set-aside?
Cliff Woolley wrote: On Wed, 19 May 2004, Stas Bekman wrote: And a lot of pain and wasted time could be saved if all this was documented. I know. :( I'm not aware of such documentation's existance, besides very scarce notes in the header files. The best documentation that exists, besides what's in the header files, is in the talk I gave on Bucket Brigades at ApacheCon 2002; see http://www.cs.virginia.edu/~jcw5q/talks/ for links to the slides (ppt) and notes pages (pdf). There are also links there to my slides from ApacheCon 2003, which discuss debugging buckets and brigades. The second best bucket reference is a section in Chapter 13 of Ryan's book Apache Server 2.0: The Complete Reference, pp 310-322. If anybody else knows of other resources, I'd love to know about them... but I suspect that's more or less it. Right from the horses' mouths. But unfortunately neither Ryan nor I were verbose enough in either case to really go into the kind of depth you're looking for. Not that I don't agree it would be useful to have written down somewhere. Thanks Cliff, I've been to both of your great talks :) Perhaps someone on this list will take up an initiative to write such a document, by first putting together the existing resources and then beef it up to add the missing stuff. I'm sure Cliff and other filter gurus will be more than helpful to answer any questions arising while working on such a document. I'd have taken up that project by myself, but I'm already spread too thin and must finish first the mod_perl 2.0 API docs before we can release 2.0. To add to Cliff's list of resources, mod_perl 2.0 filter documentation is pretty extensive and perhaps it's the easiest to start with filters by writing them in perl. For more information see: http://perl.apache.org/docs/2.0/user/handlers/filters.html p.s. may be rbb will be willing to contribute parts of his book's filters chapter. -- __ Stas BekmanJAm_pH -- Just Another mod_perl Hacker http://stason.org/ mod_perl Guide --- http://perl.apache.org mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com http://modperlbook.org http://apache.org http://ticketmaster.com
Re: is it possible to mark buckets to be copied only when to be set-aside?
On Wed, 19 May 2004, Stas Bekman wrote: And a lot of pain and wasted time could be saved if all this was documented. I know. :( I'm not aware of such documentation's existance, besides very scarce notes in the header files. The best documentation that exists, besides what's in the header files, is in the talk I gave on Bucket Brigades at ApacheCon 2002; see http://www.cs.virginia.edu/~jcw5q/talks/ for links to the slides (ppt) and notes pages (pdf). There are also links there to my slides from ApacheCon 2003, which discuss debugging buckets and brigades. The second best bucket reference is a section in Chapter 13 of Ryan's book Apache Server 2.0: The Complete Reference, pp 310-322. If anybody else knows of other resources, I'd love to know about them... but I suspect that's more or less it. Right from the horses' mouths. But unfortunately neither Ryan nor I were verbose enough in either case to really go into the kind of depth you're looking for. Not that I don't agree it would be useful to have written down somewhere. --Cliff
Re: is it possible to mark buckets to be copied only when to be set-aside?
On Wed, 19 May 2004, Stas Bekman wrote: I'm sure Cliff and other filter gurus For the record, I make no claims as to being a filter guru. Some parts of the filtering system baffle me as much as anybody else. I know buckets. I more or less leave the filtering to somebody else. :) will be more than helpful to answer any questions arising while working on such a document. ... but yes, both I and others would, I'm sure, be more than willing to have our brains picked. --Cliff PS: I begin my drive toward California at the crack o' dawn in the morning (stops for a few days in Kentucky and St. Louis)... see you SF types soon!