On Wed, Jan 20, 2010 at 01:54, Jeff Tharp <[email protected]> wrote:
> That does fix this issue, but the browser still gets a 200 instead of a 404.
> I know that's caused some confusion for our operation as well. Think about
> SEO here -- we have a site behind an Apache-based reverse proxy. We want to
> use ProxyErrorOverride and ErrorDocument to make sure we send proper error
> pages no matter what the backend application spits out (because oftentimes
> it's more like a stack trace than a nice human-readable page). Yet, if we
> trigger a 404, we send a 200 back, which of course means a search engine
> crawler misses the original 404. I need ProxyErrorOverride on to deal with
> the 500/503 type errors from the backend. And thus I can't send a nice 404
> from the backend, because the proxy will still override it. So how do I
> return a clean 404 in that scenario?

I understand your problem; we had, and still have, the same one.

My guess, though I am not sure, is that you cannot get a 404 through
configuration alone if the error page is served via an HTTP request. You
will get the 404 when the error document is a file local to the reverse
proxy, i.e. ErrorDocument 404 /local_file_on_the_reverse_proxy.

There may be ways to still get the 404 in the other case, but they need
code. For example, you could think of hooking
proxy_request_status(int *status, request_rec *r). There you could write
*status = r->prev->status. (r is the request that fetches the error page
and r->prev is the original request that gave you the 404.) Next you would
have to prevent Apache from thinking that it hit a recursive error (an
error while getting the error document). I don't know how to do that;
you'll have to dig into the Apache sources.

Another line of thought is to hook insert_error_filter and install a custom
filter there that sets the status code of the request that fetches the
error document.

But I am just brainstorming; I do not know whether any of these ideas work.

Sorin

P.S. I think this discussion is better suited to the modules-dev mailing list.
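
For reference, the local-file case mentioned above could look roughly like
the sketch below. The backend URL, the /errors/ path, and the file names
are assumptions made up for illustration; only the directive names come
from this thread and the stock httpd documentation. The idea is to exclude
the error pages from proxying so they are served from the proxy's own
document root.

```apache
# Hypothetical layout: static error pages live under DocumentRoot/errors/
# on the reverse proxy itself, so exclude that path from proxying.
# The exclusion must appear before the general ProxyPass rule.
ProxyPass        /errors/ !

# Hypothetical backend address.
ProxyPass        / http://backend.example.com:8080/
ProxyPassReverse / http://backend.example.com:8080/

# Let the proxy replace backend error bodies with its own ErrorDocument pages.
ProxyErrorOverride On

# Local URL-paths, not full http:// URLs (a full URL would make httpd
# answer with a redirect instead of the error status).
ErrorDocument 404 /errors/404.html
ErrorDocument 500 /errors/500.html
ErrorDocument 503 /errors/503.html
```

Whether this preserves the 404 in a given setup is best checked with
something like curl -I against a URL the backend does not know.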
