On Sun, Aug 4, 2024 at 9:54 PM Dave Wreski <dwre...@guardiandigital.com.invalid> wrote:
> Hi,
>
>>> I have a rewrite that's creating a loop because the origin is contained in
>>> the final destination. I know it then is processed again by the .htaccess
>>> in the document root, but I don't understand why or how to stop it. What's
>>> the solution here?
>>>
>>> RewriteRule ^/features/linux-malware-the-truth-about-this-growing-threat$ https://linuxsecurity.com/features/linux-malware-the-truth-about-this-growing-threat-updated [L,R=301,END]
>>>
>>> I've tried variations of the above but it always creates a loop.
>>>
>>> $ wget -O /dev/null https://linuxsecurity.com/features/linux-malware-the-truth-about-this-growing-threat 2>&1|grep -E 'Location|HTTP'
>>> HTTP response 302 [https://linuxsecurity.com/features/linux-malware-the-truth-about-this-growing-threat]
>>> HTTP response 301 [https://linuxsecurity.com/features/linux-malware-the-truth-about-this-growing-threat-updated]
>>> HTTP response 200 [https://linuxsecurity.com//feature <https://linuxsecurity.com//features>
>>
>> If you don't depend on mod_rewrite for anything else, I would recommend
>> using RedirectMatch instead.
>
> Yes, we have many existing rules. Can't it be used in combination with
> mod_rewrite? I also tried this RedirectMatch and it has the same loop
> problem.
>
>> That rule on its own won't loop, unless you have other conflicting
>> directives or rewrite rules.
>>
>> If you must use mod_rewrite, then enabling the rewrite log will help you
>> pinpoint the source of the loop.
>
> I thought the problem was that it is a subset of the destination URL?
>
> RewriteRule ^/features/linux-malware-the-truth-about-this-growing-threat$ https://linuxsecurity.com/features/linux-malware-the-truth-about-this-growing-threat-updated [L,R=301]
>
> This is what the request looks like with the above rewriterule:
>
> $ wget -O /dev/null https://linuxsecurity.com/features/linux-malware-the-truth-about-this-growing-threat 2>&1|grep -E 'Location|HTTP'
> HTTP response 301 [https://linuxsecurity.com/features/linux-malware-the-truth-about-this-growing-threat]
> HTTP response 301 [https://linuxsecurity.com/features/linux-malware-the-truth-about-this-growing-threat-updated]
> HTTP response 200 [https://linuxsecurity.com//features]
>
> I don't understand why this didn't match? After this rule, it went on to
> check other rules below it. I've spent an exhausting number of hours
> stepping through the rewriterules to understand what's happening. I hope
> you can follow to help me fix this.
>
> applying pattern '^/features/linux-malware-the-truth-about-this-growing-threat$' to uri '/features/linux-malware-the-truth-about-this-growing-threat-updated'
>
> After it processed all rules, it passed it to .htaccess:
>
> pass through /features/linux-malware-the-truth-about-this-growing-threat-updated
>
> When there was no match until the very end of .htaccess, it replaced the
> URI with index.php:
>
> applying pattern '.*' to uri 'features/linux-malware-the-truth-about-this-growing-threat-updated'
> rewrite 'features/linux-malware-the-truth-about-this-growing-threat-updated' -> 'index.php'
> add per-dir prefix: index.php -> /var/www/linuxsec/html/index.php
> trying to replace prefix /var/www/linuxsec/html/ with /
> internal redirect with /index.php [INTERNAL REDIRECT]
>
> What does [INTERNAL REDIRECT] mean and how is it different from other
> redirects?
>
> It then loops through the rewrite rules, doesn't match any, then loops
> through .htaccess again (as you said) until it matches /index.php.
> RewriteRule ^index.php(/.*){0,1}$ - [L]
>
> $ wget -O /dev/null https://linuxsecurity.com/features/linux-malware-the-truth-about-this-growing-threat 2>&1|grep -E 'Location|HTTP'
> HTTP response 301 [https://linuxsecurity.com/features/linux-malware-the-truth-about-this-growing-threat]
> HTTP response 301 [https://linuxsecurity.com/features/linux-malware-the-truth-about-this-growing-threat-updated]
> HTTP response 200 [https://linuxsecurity.com//features]
>
> What am I missing?
>
>> You likely should be using FallbackResource for that as well.
>
> As well? I need both?
>
> I'm trying to confirm whether the problem is related to the match being a
> subset of the final destination? I really don't see any other matches
>
>> Lastly, why are you using .htaccess files?
>
> Primarily it consists of rules to manage bots, run our image resizer, and
> an explicit list of files that are accessible - it's a default deny policy,
> which is why it redirects to the index.php at the end.
>
> RewriteCond %{REQUEST_URI} !^/index\.php
> RewriteCond %{REQUEST_FILENAME} !-f
> RewriteCond %{REQUEST_FILENAME} !-d
> RewriteRule .* index.php [L]
>
> Thanks,
> Dave

Replace the following:

RewriteCond %{REQUEST_URI} !^/index\.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* index.php [L]

With:

FallbackResource /index.php
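
For illustration, here is a rough sketch of how the redirect and the fallback might sit together. It is untested and rests on assumptions: that the redirect lives in the virtual host (server context), that the catch-all lives in the document-root .htaccess, and that no other rule or directive also matches this path. The placement comments are my own guesses about your layout. The log you quoted shows the per-directory pattern being applied to 'features/...' without a leading slash, so a pattern beginning with ^/ can only match in server context.

```apache
# Virtual host (server context). Sketch only, not a tested configuration.
# The pattern is anchored with ^ and $, so it cannot match the
# "-updated" destination; being a prefix of the target is not a problem.
RedirectMatch 301 "^/features/linux-malware-the-truth-about-this-growing-threat$" "https://linuxsecurity.com/features/linux-malware-the-truth-about-this-growing-threat-updated"

# Document-root .htaccess (assumed location). Takes the place of the
# three RewriteCond lines and the "RewriteRule .* index.php [L]" catch-all.
# Requests that do not map to an existing file are handed to /index.php.
FallbackResource /index.php
```

Whether this removes the extra redirect in your wget output depends on the other rules, which are not shown here, so the rewrite log is still the way to confirm which rule issues the first 301.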