On 11/13/2012 07:17 PM, André Warnier wrote:
> I didn't want to take too much time of anyone before, which is why I
> somewhat oversimplified the issue.  But considering the traffic on the
> list is low, maybe you want to hear the whole story after all.
> 
> The basic case is this : a bit aside from our usual professional
> activities and for a friend, we run a website which is basically a shop
> with hundreds of individual items which people can view and buy.  (I
> will provide the URL privately to anyone who is more interested. It's a
> cute shop.)
> 
> The pages corresponding to these individual items, at the moment, are
> individual static pages in multiple sub-directories, and there are quite
> a lot of them.  The friend creates and maintains these static pages
> herself; she is an artist rather than a programmer, so she can handle an
> html editor (which she does rather well) to edit static pages and test
> them on her PC before copying them to the server, but we cannot ask her
> to handle any kind of "template" pages or the like.
> Add to this that the basic logic of the website and the design and
> techniques used date back some 10 years, have been patched and
> repatched several times over several years, and are rather bad.
> 
> Now there is a requirement that, instead of being just static pages,
> each of these pages should in addition contain a <form> with some
> specific item-related information, allowing visitors to buy these items
> on-line (so it cannot be done just with an include or a stylesheet).

Have you considered patching the files automatically? CPAN has a bunch
of modules that can also parse quite bad/old HTML (with missing </p> or
</li> and the like).

You could for example keep the original files in one directory tree and
set up a process that waits for changes there either by scanning it on a
regular basis or even by using inotify (there is a cron-like daemon that
is made just for this kind of monitoring. I think it's called incron or
so. Anyway, there is plenty of support for inotify on CPAN.)

Now, if anything changes the daemon can start a program that parses the
HTML, inserts the form and writes the result to another directory which
is your DocumentRoot. On the way it could also call 'git commit'. And
now you have a simple content management system almost for free.
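
A minimal sketch of such a rebuild step, using only core Perl modules.
The regex in insert_form() merely stands in for a real HTML parser from
CPAN, and form_for(), the "/buy" action and the two directory arguments
are made up for illustration:

```perl
#!/usr/bin/perl
# Sketch only: rebuild the published tree from the hand-edited originals.
use strict;
use warnings;
use File::Find ();
use File::Path qw(make_path);
use File::Basename qw(dirname);
use File::Copy qw(copy);

# Hypothetical form markup; the real one would depend on the item.
sub form_for {
    my ($rel_path) = @_;
    return qq{<form method="post" action="/buy">}
         . qq{<input type="hidden" name="page" value="$rel_path">}
         . qq{<input type="submit" value="Buy"></form>\n};
}

# Put the form right after the opening <body ...> tag. If the page has
# no <body> tag at all, put the form in front of the page.
sub insert_form {
    my ($html, $form) = @_;
    return $html if $html =~ s{(<body\b[^>]*>)}{$1\n$form}i;
    return $form . $html;
}

# Walk $src and write every new or changed file to the same relative
# path below $dst: HTML pages get the form, everything else is copied.
sub rebuild {
    my ($src, $dst) = @_;
    File::Find::find({ no_chdir => 1, wanted => sub {
        my $in = $File::Find::name;
        return unless -f $in;
        (my $rel = $in) =~ s{^\Q$src\E/?}{};
        my $out = "$dst/$rel";
        # Skip files whose published copy is already up to date.
        return if -e $out && (stat $out)[9] >= (stat $in)[9];
        make_path(dirname($out));
        if ($rel =~ /\.html?$/i) {
            open my $ifh, '<', $in or die "$in: $!";
            my $html = do { local $/; <$ifh> };
            close $ifh;
            open my $ofh, '>', $out or die "$out: $!";
            print {$ofh} insert_form($html, form_for("/$rel"));
            close $ofh or die "$out: $!";
        }
        else {
            copy($in, $out) or die "copy $in: $!";
        }
    }}, $src);
}

rebuild(@ARGV) if @ARGV == 2;
```

It takes the directory with the originals and the DocumentRoot as its
two arguments, and can be started from cron or from whatever watches
the source tree.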

> What I am trying to achieve, without having to edit each of these
> individual hundreds of pages, change the links to these pages, or
> change the basic design of the application (because there is no budget
> for that), is to find a clever way on the server side
> to respond to a normal "/Shop_xxx/def/xyz.html" URL of one of these
> pages, to combine into one response both the required <form>, and the
> content of the existing unmodified static page.  I also do not want to
> parse the html on-the-fly and insert a <form> right into it, because the
> html that she creates with her (T-online) editor is rather bad to start
> with, and I have no guarantee that the result would be pleasing.
> 
> So that was the reason to think of the <frameset> solution, whereby in
> response to the initial request for "/Shop_xxx/def/xyz.html", I would
> respond with a first <frame> containing the form (generated by a
> back-end application, and depending on the item), and a second <frame>
> containing her artfully-crafted static page describing that item in all
> its glory.
> 
> The static pages in question are in several subdirectories of
> DocumentRoot, and at different levels.  Fortunately, all the top
> sub-directory names start with "/Shop_" (after which there can be
> "Quilts" or "Babydecken" and things like that, and a variable hierarchy
> of sub-directories containing html files and jpg images and the like).
> 
> So I have this configured :
> <LocationMatch "^/Shop_">
>   sethandler modperl
>   PerlResponseHandler My::ShopResponse
>   ...
> </LocationMatch>
> 
> As a result in part of the previous communications on this list, this
> PerlResponseHandler does more or less what I want, except one remaining
> problem which I am trying to resolve right now :
> In response to an initial request for "/Shop_xxx/def/xyz.html", the
> handler generates a <frameset> document as such :
> <html>
> <frameset rows="100,*">
>   <frame name="top_frame" src="..the URI which generates the form.." />  (1)
>   <frame name="bottom_frame" src="/Shop_xxx/def/xyz.html.shop" />  (2)
> </frameset>
> </html>
> 
> (1) for the dynamically-generated html <form> document
> (2) for the static existing page
> 
> Because the second frame's URI also starts with "/Shop_", when the
> browser requests this frame, the same ResponseHandler is called.
> The handler examines the URL and sees that it ends in ".shop" (instead
> of ".html").
> So it knows that this time, it should not send another frameset, but
> instead it should strip the trailing ".shop" and deliver, as is, the
> content of the static document "/Shop_xxx/def/xyz.html".
> 
> But, how do I tell it to do that ?
> I have tried :
>   my $uri = $r->uri();
>   if ($uri =~ m!\/([^/]+\.htm[l]?\.shop)$!i) {
>     $uri =~ s/\.shop$//; # strip the trailing ".shop"
>     $r->internal_redirect($uri);
>     return Apache2::Const::OK ;
>   }
> 
> and also :
> 
>   my $uri = $r->uri();
>   if ($uri =~ m!\/([^/]+\.htm[l]?\.shop)$!i) {
>     $uri =~ s/\.shop$//; # strip the trailing ".shop"
>     my $subr = $r->lookup_uri($uri);
>     $subr->run();
>     return Apache2::Const::OK ;
>   }
> 
> but both of those result in a loop : they end up requesting
> "/Shop_xxx/def/xyz.html", which hits the same Location, which runs the
> same handler, which then produces the frameset, and so on.

Here is how the cycle goes:

1) The server gets a request for /Shop_xxx/def/xyz.html, skips the
if-branch and generates the frameset.

2) The bottom frame generates another request for
/Shop_xxx/def/xyz.html.shop. It enters the if-branch and issues a
subrequest or an internal redirect for /Shop_xxx/def/xyz.html.

3) The subreq/redir enters the handler again. Now it avoids the if-branch
because it does not match /\.shop$/. So it spits out the frameset.

4) goto 2) by means of the browser

How to break the loop? Instead of (or inside) the LocationMatch above use:

PerlFixupHandler "sub {                                         \
  use Apache2::RequestUtil ();                                  \
  use Apache2::RequestRec ();                                   \
  use Apache2::Const -compile=>qw/DECLINED/;                    \
  my ($r)=@_;                                                   \
  if( $r->is_initial_req ) {                                    \
    $r->handler('modperl');                                     \
    $r->set_handlers(PerlResponseHandler=>'My::ShopResponse');  \
  }                                                             \
  return Apache2::Const::DECLINED;                              \
}"

Then step 3) reads:

3) The subreq/redir enters the request cycle again. The if-branch of
fixup handler is skipped because the request is not initial. Hence, the
PerlResponseHandler is skipped completely and the default handler sends
the document.

You can achieve a similar effect with mod_rewrite. It has IS_SUBREQ
available in RewriteCond. But I don't know if that also checks for
!$r->prev.

You can also try to modify your Response handler to decline if
!$r->is_initial_req:

  my $uri = $r->uri();
  if ($uri =~ s!(/[^/]+\.html?)\.shop$!$1!i) {
    $r->internal_redirect($uri);
    return Apache2::Const::OK ;
  }
  return Apache2::Const::DECLINED unless $r->is_initial_req;

I am not sure if that works. I don't think I have ever tried to return
DECLINED from a response handler.
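
Put together, the whole response handler might look roughly like this.
It is a sketch under the same caveat, namely that returning DECLINED
from a response handler is untested here. "/cgi-bin/shop_form" is a
made-up URI standing in for whatever generates the form, and the check
that lets images and other non-HTML files fall through is an added
assumption:

```perl
package My::ShopResponse;
use strict;
use warnings;

use Apache2::RequestRec  ();   # $r->uri, $r->content_type
use Apache2::RequestIO   ();   # $r->print
use Apache2::RequestUtil ();   # $r->is_initial_req
use Apache2::SubRequest  ();   # $r->internal_redirect
use Apache2::Const -compile => qw(OK DECLINED);

sub handler {
    my ($r) = @_;
    my $uri = $r->uri;

    # Bottom frame: strip ".shop" and hand the request back to Apache.
    if ($uri =~ s!(/[^/]+\.html?)\.shop$!$1!i) {
        $r->internal_redirect($uri);
        return Apache2::Const::OK;
    }

    # The redirected request arrives here again. Step aside so that the
    # default handler can serve the static file.
    return Apache2::Const::DECLINED unless $r->is_initial_req;

    # Assumption: anything that is not an HTML page (jpg images and the
    # like) is left to the default handler as well.
    return Apache2::Const::DECLINED unless $uri =~ /\.html?$/i;

    # Initial request for the page itself: send the frameset.
    # "/cgi-bin/shop_form" is hypothetical; it must not start with
    # "/Shop_" or it would end up in this handler too.
    $r->content_type('text/html');
    $r->print(<<"HTML");
<html>
<frameset rows="100,*">
  <frame name="top_frame" src="/cgi-bin/shop_form?page=$uri" />
  <frame name="bottom_frame" src="$uri.shop" />
</frameset>
</html>
HTML
    return Apache2::Const::OK;
}

1;
```

This keeps everything within the one ResponseHandler: the only new
piece of logic is the is_initial_req test between the ".shop" branch
and the frameset.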

> So how do I tell Apache/mod_perl that this time "I mean it", and that it
> should directly deliver the requested file, without re-running the whole
> cycle ?
> 
> I can of course request the corresponding filename() and deliver it
> myself (perhaps with sendfile()), but that does not seem to be the most
> elegant way of doing this. Or is it ?
> 
> Oh, and I'd like it elegant, but I would prefer not having to introduce
> a PerlFixupHandler or a PerlOutputFilter or Javascript, and do it all
> within this ResponseHandler.  That's because I have colleagues who know
> even less about mod_perl than I do, and I'd like to leave them something
> simple to deal with in support and maintenance for another 10 years.
> 
> ...
> 
> Aside : I just tried
> 
>   my $uri = $r->uri();
>   if ($uri =~ m!\/([^/]+\.htm[l]?\.shop)$!i) {
>     $uri =~ s/\.shop$//; # strip the trailing ".shop"
>     my $subr = $r->lookup_uri($uri);
>     $r->sendfile($subr->filename());
>     return Apache2::Const::OK ;
>   }
> 
> and that works.  So I guess that /is/ the right solution here.

No, it's not.

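The default handler does a bit more than just send $r->filename: among
other things it sets Last-Modified, ETag and Content-Length and answers
conditional requests such as If-Modified-Since. For you it may work but
it won't in the general case. See default_handler() in server/core.c.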

Torsten
