Hi,

This is actually pretty simple to do, but there's an additional problem
that may occur: when you replace consecutive characters or words, you
will double up on your underscores. I'm sure there's a simple way to
avoid that, but I'm not hot on the regex part.
OK, so this will do what you want, but run it and see how you can get
"hello_____world" if you have lots of consecutive replacements.
I've put in a crude fix for that below too.

Notice the array of words uses "_the_" so we can easily identify the
whole word "the" and not just part of a word like "there".
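
As a quick illustration of that point, here is a small sketch that is
separate from the script below. The sample string is made up. It shows
that searching for the bare word also hits longer words, while the
underscore-delimited form leaves them alone. Note that the delimited
form cannot match a word at the very start or end of the string,
because there is no underscore on that side.

```php
<?php
// Hypothetical sample title, already lowercased with spaces as underscores.
$title = 'there_is_the_tone';

// Bare search: also eats the "the" inside "there".
$bare = str_replace('the', '', $title);

// Delimited search: only the standalone word "the" matches.
$delimited = str_replace('_the_', '_', $title);

// Limitation: a leading "the" has no underscore before it, so it stays.
$leading = str_replace('_the_', '_', 'the_end');

echo $bare, "\n";       // re_is__tone
echo $delimited, "\n";  // there_is_tone
echo $leading, "\n";    // the_end
```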

<?php
$txt='This iS my TE%*£S$$$T text & * % & ThEsE are * in Som)E THE woR*ds';

// convert spaces to underscores
$txt=str_replace(" ", "_", $txt);

// convert to lowercase
$txt=strtolower($txt);

// array of common words to remove
$words=array("_and_","_the_","_to_","_in_","_is_");
// strip out the common words
$txt=str_replace($words, "_", $txt);

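// strip any non-alphanumerics (except hyphen & underscore)
// preg_replace is used because ereg_replace was removed in PHP 7;
// the commas are left out of the character class so commas get stripped too
$txt=preg_replace("/[^a-z0-9_-]/","",$txt);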

// Note: A simple, but ugly, way to remove the extra underscores is
// to make an array of underscores to remove, which should catch most
// situations. The longest run goes first, otherwise "___" is only
// cut down to "__" and then left alone.
// $undrscrs=array("______","_____","____","___","__");
// $txt=str_replace($undrscrs, "_", $txt);

echo $txt;
?>
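
For anyone more comfortable with regular expressions, here is one
possible way to fold the steps above into a single function. Treat it
as a sketch only: the function name title_to_filename() and its
parameters are made up for illustration, and it assumes the stop words
are passed in lowercase. It also strips the odd characters before
removing the common words, which is a different order from the script
above, and it pads the string with underscores so that words at the
start or end of the title can be matched too. A single regex then
collapses any run of underscores.

```php
<?php
// Hypothetical helper; the name and the stop-word list are illustrative only.
function title_to_filename($title, array $stopWords)
{
    // Lowercase and turn spaces into underscores.
    $txt = str_replace(' ', '_', strtolower($title));

    // Drop everything except a-z, 0-9, underscore and hyphen.
    $txt = preg_replace('/[^a-z0-9_-]/', '', $txt);

    // Pad with underscores so words at the start or end can match too.
    $txt = '_' . $txt . '_';

    foreach ($stopWords as $word) {
        // The lookahead checks for the following underscore without
        // consuming it, so back-to-back stop words are both removed.
        $txt = preg_replace('/_' . preg_quote($word, '/') . '(?=_)/', '', $txt);
    }

    // Collapse any run of underscores into one, then trim the ends.
    $txt = preg_replace('/_+/', '_', $txt);
    return trim($txt, '_');
}

$words = array('and', 'the', 'to', 'in', 'is');
echo title_to_filename(
    'This iS my TE%*£S$$$T text & * % & ThEsE are * in Som)E THE woR*ds',
    $words
), "\n";
// this_my_test_text_these_are_some_words
```

Because whole words are matched between underscores, "these" and
"tone" are left alone while "the" and "to" are removed.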

On 4 Apr 2006, at 16:55, W. Smith wrote:

> Hello group, as previously stated, I have a PHP file (actually a set
> of files) that goes to an articles website, it strips the articles
> and places them in a template, then it copies them to my website.
> The code works pretty well, but could use some tweaking to make it
> better, i.e., so I don't have to go back and re-name the files it
> creates. It takes the titles of the articles and makes that the name
> of the HTML file, it already replaces the spaces between the words
> with an underscore, what I would also like for it to do is (as far
> as the title of the file is concerned):
>
> 1. Change all upper case letters to lower case.
>
> 2. Remove all characters that don't translate well into file names
> ([EMAIL PROTECTED]&*()+=\/?{}[]<>*=-)
>
> 3. Remove all words that are not necessary and just make the URL
> longer, (to, the, it, and, or, with, too, also, an,) but this cannot
> alter words that CONTAIN these letters, (ie it needs to remove "to"
> but ignore the word "tone"). If that part cannot be done properly
> then I can live without it.
>
> 1 and 2 are necessary, 3 is optional, if it can be done, great, if
> not, no big deal.
>
> If anyone here can do this for me and make it work, I would be
> willing to pay for your time, I wouldn't be able to pay a lot, but
> we can discuss that part. The code for the most part is already
> created and works as is, it just needs to be adjusted, it shouldn't
> take much time for someone who is experienced with PHP.
>
> Thanks in advance!
>
> Wretha
>
> Community email addresses:
>   Post message: [email protected]
>   Subscribe:    [EMAIL PROTECTED]
>   Unsubscribe:  [EMAIL PROTECTED]
>   List owner:   [EMAIL PROTECTED]
>
> Shortcut URL to this page:
>   http://groups.yahoo.com/group/php-list
>
>
> YAHOO! GROUPS LINKS
>
>  Visit your group "php-list" on the web.
>
>  To unsubscribe from this group, send an email to:
>  [EMAIL PROTECTED]
>
>  Your use of Yahoo! Groups is subject to the Yahoo! Terms of Service.
>
>


