On Wed, Jan 20, 2010 at 01:54, Jeff Tharp <[email protected]> wrote:
> That does fix this issue, but the browser still gets a 200 instead of a 404.  
> I know that's caused some confusion for our operation as well.  Think about 
> SEO here -- we have a site behind an Apache-based reverse proxy.  We want to 
> use ProxyErrorOverride and ErrorDocument to make sure we send proper error 
> pages no matter what the backend application spits out (because often times 
> it's more like a stack trace than a nice human-readable page).  Yet, if we 
> trigger a 404, we send a 200 back, which of course means a search engine 
> crawler misses the original 404.  I need ProxyErrorOverride on to deal with 
> the 500/503 type errors from the backend.  And thus I can't send a nice 404 
> from the backend, because the proxy will still override it.  So how do I 
> return a clean 404 in that scenario?

I understand your problem; we had, and still have, the same one.

I am not sure, but my guess is that you cannot get a 404 through
configuration alone when the error page is fetched via an HTTP request.
You do get the 404 when ErrorDocument points at a local file on the
reverse proxy, e.g. ErrorDocument 404 /local_file_on_the_reverse_proxy.
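
A minimal sketch of that local-file setup, assuming the whole site is
proxied with ProxyPass / (the backend URL and the /errors/ path are
made-up names, and /errors/404.html has to exist under the proxy's
DocumentRoot):

```apache
# Keep /errors/ out of the proxy so the error page is read from local disk.
# The exclusion has to come before the catch-all ProxyPass.
ProxyPass        /errors/ !
ProxyPass        /  http://backend.example.internal:8080/
ProxyPassReverse /  http://backend.example.internal:8080/

# Replace backend error bodies with the proxy's own ErrorDocument pages.
ProxyErrorOverride On
ErrorDocument 404 /errors/404.html
```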

There may be ways to still get the 404, but they require code.

For example, you could hook proxy_request_status(int *status,
request_rec *r) and write there

    *status = r->prev->status;

(r is the request that fetches the error page and r->prev is the
original request that gave you the 404.)
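
The status-copying step could look roughly like the sketch below. To
compile by itself it uses a stand-in struct instead of httpd's
request_rec, so the struct and the function name are hypothetical, and
it does nothing about the recursive-error problem. The only thing it
adds to the one-liner is a guard for requests that have no r->prev.

```c
#include <stddef.h>

/* Hypothetical stand-in for httpd's request_rec: only the two fields used here. */
typedef struct request {
    int status;            /* HTTP status of this request */
    struct request *prev;  /* original request if this is an internal redirect */
} request;

/*
 * Copy the original request's status into *status.
 * Returns 1 if *status was overwritten, 0 if there was no original
 * request to copy from (in which case *status is left untouched).
 */
static int restore_original_status(int *status, const request *r)
{
    if (status == NULL || r == NULL || r->prev == NULL) {
        return 0;
    }
    *status = r->prev->status;
    return 1;
}
```

In a real module the body would sit inside the hooked function, with
request_rec in place of the stand-in struct.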

Next, you will have to keep Apache from treating this as a recursive
error (an error while fetching the error document). I don't know how
to do that; you'll have to dig into the Apache sources.

Another line of thought is to hook insert_error_filter and install a
custom filter there that sets the status code of the request that
fetches the error document. But I am just brainstorming; I do not know
whether either of these ideas works.

Sorin

P.S. I think this discussion is more suited to the modules-dev mailing list.
