Hi,
This is actually pretty simple to do, but there's an additional problem
that may occur: when you replace consecutive characters or words, you
will double up on your underscores. I'm sure there's a simple way to
avoid that, but I'm not hot on the regex part.
OK, so this will do what you want, but run it and see how you can get
"hello_____world" if you have lots of consecutive replacements.
I've put in a crude fix for that below too.
Notice the array of words uses "_the_" so we can easily identify the
whole word "the" and not just part of a word like "there".
<?php
$txt='This iS my TE%*£S$$$T text & * % & ThEsE are * in Som)E THE woR*ds';
// convert spaces to underscores
$txt=str_replace(" ", "_", $txt);
// convert to lowercase
$txt=strtolower($txt);
// array of common words to remove
$words=array("_and_","_the_","_to_","_in_","_is_");
// strip out the common words
$txt=str_replace($words, "_", $txt);
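// strip any non-alphanumerics (except hyphen & underscore)
// (preg_replace is used here because ereg_replace was removed in PHP 7;
// the commas are dropped from the character class so that commas get
// stripped too)
$txt=preg_replace("/[^a-z0-9_-]/","",$txt);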
// Note: A simple, but ugly, way to remove the extra underscores is
// to make an array of underscore runs to remove, which should catch
// most situations. The longest run goes first, because str_replace
// works through the array in order.
// $undrscrs=array("______","_____","____","___","__");
// $txt=str_replace($undrscrs, "_", $txt);
echo $txt;
?>
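For what it's worth, here is a minimal sketch of another way to avoid
both the doubled underscores and the part-word problem: split the title
into words, drop the unwanted ones by whole-word comparison, then join
what is left with single underscores. The function name make_slug and
its $stopWords parameter are made up for illustration, and it assumes
PHP 7 or later, so treat it as a starting point rather than a finished
solution.

```php
<?php
// Hypothetical helper (not from the original script): builds a
// file-name-friendly string from an article title.
function make_slug(string $title, array $stopWords): string
{
    // lowercase first so the stop-word comparison ignores case
    $text = strtolower($title);
    // drop everything except letters, digits, underscores, hyphens
    // and whitespace
    $text = preg_replace('/[^a-z0-9_\s-]+/', '', $text);
    // split on runs of whitespace or underscores; empty pieces are
    // skipped, so repeated separators cannot produce "__"
    $words = preg_split('/[\s_]+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    // whole-word comparison: "to" is removed, "tone" is left alone
    $kept = array_diff($words, $stopWords);
    return implode('_', $kept);
}

$stop = array('and', 'the', 'to', 'in', 'is');
echo make_slug('This iS my TE%*£S$$$T text & * % & ThEsE are * in Som)E THE woR*ds', $stop);
// prints: this_my_test_text_these_are_some_words
```

Because the words are compared whole, there is no need for the "_the_"
trick, and words at the very start or end of the title are handled too.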
On 4 Apr 2006, at 16:55, W. Smith wrote:
> Hello group, as previously stated, I have a PHP file (actually a set
> of files) that goes to an articles website, it strips the articles
> and places them in a template, then it copies them to my website.
> The code works pretty well, but could use some tweaking to make it
> better, i.e., so I don't have to go back and rename the files it
> creates. It takes the titles of the articles and makes that the name
> of the HTML file, it already replaces the spaces between the words
> with an underscore, what I would also like for it to do is (as far
> as the title of the file is concerned):
>
> 1. Change all upper case letters to lower case.
>
> 2. Remove all characters that don't translate well into file names
> ([EMAIL PROTECTED]&*()+=\/?{}[]<>*=-)
>
> 3. Remove all words that are not necessary and just make the URL
> longer (to, the, it, and, or, with, too, also, an), but this cannot
> alter words that CONTAIN these letters (i.e. it needs to remove "to"
> but ignore the word "tone"). If that part cannot be done properly
> then I can live without it.
>
> 1 and 2 are necessary, 3 is optional, if it can be done, great, if
> not, no big deal.
>
> If anyone here can do this for me and make it work, I would be
> willing to pay for your time, I wouldn't be able to pay a lot, but
> we can discuss that part. The code for the most part is already
> created and works as is, it just needs to be adjusted, it shouldn't
> take much time for someone who is experienced with PHP.
>
> Thanks in advance!
>
> Wretha
>
>
>
>
>
> Community email addresses:
> Post message: [email protected]
> Subscribe: [EMAIL PROTECTED]
> Unsubscribe: [EMAIL PROTECTED]
> List owner: [EMAIL PROTECTED]
>
> Shortcut URL to this page:
> http://groups.yahoo.com/group/php-list
>
>
> YAHOO! GROUPS LINKS
>
> Visit your group "php-list" on the web.
>
> To unsubscribe from this group, send an email to:
> [EMAIL PROTECTED]
>
> Your use of Yahoo! Groups is subject to the Yahoo! Terms of Service.
>
>