Re: [PHP] removing text from a string

Jim Lucas Thu, 06 Nov 2008 07:56:03 -0800

Thodoris wrote:
> 
>> Thodoris wrote:
>>  
>>>> Boyd, Todd M. wrote:
>>>>  
>>>>      
>>>>>> -----Original Message-----
>>>>>> From: Ashley Sheridan [mailto:[EMAIL PROTECTED]
>>>>>> Sent: Tuesday, November 04, 2008 1:40 PM
>>>>>> To: Adam Williams
>>>>>> Cc: PHP General list
>>>>>> Subject: Re: [PHP] removing text from a string
>>>>>>
>>>>>> On Tue, 2008-11-04 at 08:04 -0600, Adam Williams wrote:
>>>>>>               
>>>>>>> I have a file that looks like:
>>>>>>>
>>>>>>> 1. Some Text here
>>>>>>> 2. Another Line of Text
>>>>>>> 3. Yet another line of text
>>>>>>> 340. All the way to number 340
>>>>>>>
>>>>>>> And I want to remove the Number, period, and blank space at the
>>>>>>> beginning of each line.  How can I accomplish this?
>>>>>>>
>>>>>>> Opening the file to modify it is easy, I'm just lost at how to
>>>>>>> remove the text.:
>>>>>>>
>>>>>>> <?php
>>>>>>> $filename = "results.txt";
>>>>>>>
>>>>>>> $fp = fopen($filename, "r") or die ("Couldn't open $filename");
>>>>>>> if ($fp)
>>>>>>> {
>>>>>>> while (!feof($fp))
>>>>>>>         {
>>>>>>>         $thedata = fgets($fp);
>>>>>>>         //Do something to remove the "1. "
>>>>>>>         //print the modified line and \n
>>>>>>>         }
>>>>>>> fclose($fp);
>>>>>>> }
>>>>>>> ?>
>>>>>>>
>>>>>>>                     
>>>>>> I'd go with a regular expression any day for something like this.
>>>>>> Something like:
>>>>>>
>>>>>> "/$[0-9]{1,3}\.\ .*^/g"
>>>>>>
>>>>>> should do what you need. Note the space before the last period.
>>>>>>                 
>>>>> That would only work for files with 1-999 lines, and will wind up
>>>>> matching the entire line (since you used $ and ^ and a greedy .*
>>>>> inbetween... also... $ is "end-of-line" and ^ is
>>>>> "beginning-of-line" :))
>>>>> rather than just the "line number" part. I would stick with my
>>>>> originally-posted regex ("/^\d+\.\s/"), but I would modify yours like
>>>>> this if I were to use it instead:
>>>>>
>>>>> "/^[0-9]+\.\ (.*)$/" (What was the "g" modifier for, anyway?)
>>>>>
>>>>> Then, you could grab the capture group made with (.*) and use it as
>>>>> the "clean" data. (It would be group 1 in the match results and "$1" in a
>>>>> preg_replace() call, I believe. Group 0 should be the entire match.)
>>>>>
>>>>>
>>>>> Todd Boyd
>>>>> Web Programmer
>>>>>
>>>>>             
>>>> Personally, I would go this route if you wanted to stick with a regex.
>>>>
>>>> <?php
>>>>
>>>> $lines[] = '01. asdf';
>>>> $lines[] = '02. 323 asdf';
>>>> $lines[] = '03.2323 asdf';
>>>> $lines[] = '04. asdf 23';
>>>> $lines[] = '05.        asdf'; /* tabs used here */
>>>> $lines[] = '06. asdf';
>>>>
>>>> foreach ( $lines AS $line ) {
>>>>     echo preg_replace('/^[0-9]+\.\s*/', '', $line), "\n";
>>>> }
>>>>
>>>> ?>
>>>>
>>>> This takes care of all possible issues related to the char after the
>>>> first period.  Maybe it is there maybe not.
>>>>
>>>> Could be that it is a tab and not a space.  Could even be multiple
>>>> tabs or spaces.
>>>>
>>>>         
>>> There it goes again.
>>>
>>> Every time someone asks a simple question (like the kind it's solved
>>> with a simple trim, ltrim or rtrim) the discussion about which is the
>>> best regular expression for this problem, makes a thread get "elephant"
>>> sized :-) .
>>>
>>> I love this list!!
>>>
>>>     
>>
>> You're not going to be able to get it with any of the xtrim()
>> functions.  You would end up with various nested ltrim() calls that,
>> IMO, would be a nightmare to manage.
>>
>> So, a top to bottom comparison here
>>
>> If $line is this:
>> $line = '01. asdf';
>>
>> And you use either one of these:
>> A) ltrim(ltrim(ltrim($line, '0123456789'), '.'));
>> B) preg_replace('/^[0-9]+\.\s*/', '', $line);
>>
>> Which do you prefer?
>>
>> A's Pros:
>>     Not a regex
>> A's Cons:
>>     A little slower than B
>>     multiple function calls
>>
>> B's Pros:
>>     Slightly faster than A
>>     Single Function call
>> B's Cons:
>>     Regex
>>
>>
>>
>>   
> 
> You should really check the manual again Jim. AFAIK ltrim doesn't remove
> a single character but as long they belong to the list they are all
> removed you just do:
> 
> ltrim($line, '0123456789')
> 
> 
> and this does the job perfectly. So perhaps you need to reconsider your
> thoughts on this.
>


Maybe instead of saying "AFAIK", you should go and check it yourself.  But 
obviously, since you didn't care to do it the first time around, I will
supply the relevant parts for you and the list archive.  And I quote:

Reference: http://us3.php.net/ltrim

Under the "Return Values" section...

Return Values

This function returns a string with whitespace stripped from the beginning of 
str. Without the second parameter, ltrim() will strip these characters:

    * " " (ASCII 32 (0x20)), an ordinary space.
    * "\t" (ASCII 9 (0x09)), a tab.
    * "\n" (ASCII 10 (0x0A)), a new line (line feed).
    * "\r" (ASCII 13 (0x0D)), a carriage return.
    * "\0" (ASCII 0 (0x00)), the NUL-byte.
    * "\x0B" (ASCII 11 (0x0B)), a vertical tab.

Notice the second sentence of the first line?
        "Without the second parameter, ltrim() will strip these characters"

That means there is a default set of chars that it uses IF you do not supply a 
list of chars.
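
For the archive, a minimal sketch of both behaviours, using the sample string from earlier in the thread (the variable names are only for illustration). Without a second argument ltrim() strips just the default whitespace set; with a character list it strips every leading character found in that list and stops at the first one that is not:

```php
<?php
$line = '01. asdf';

// No second argument: only leading whitespace is stripped,
// and this line has none, so it comes back unchanged.
$default = ltrim($line);

// With a character list: all leading digits are removed,
// but the period and the space after it are left in place.
$digits = ltrim($line, '0123456789');

var_dump($default); // string(8) "01. asdf"
var_dump($digits);  // string(6) ". asdf"
```

So the digit list alone still leaves ". asdf" behind; the period and the whitespace need handling as well.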

On a side note, if you look under the ChangeLog section, the second 
parameter wasn't added until PHP 4.1.0.

What you said was "if they belong to the list they are all removed."  Then what 
exactly did this function do before 4.1?

I did my requested homework, but I think you need to go back and study that 
chapter now.
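
To put options A and B from my earlier message side by side, here is a rough sketch run over the sample lines posted earlier in the thread. The helper names strip_with_ltrim() and strip_with_regex() are made up for this example, and "\t\t" stands in for the tabs in line 05:

```php
<?php
// Sample lines from earlier in the thread.
$lines = array(
    '01. asdf',
    '02. 323 asdf',
    '03.2323 asdf',
    '04. asdf 23',
    "05.\t\tasdf",
    '06. asdf',
);

// Option A: nested ltrim() calls (hypothetical helper name).
// Strips leading digits, then leading periods, then leading whitespace.
function strip_with_ltrim($line)
{
    return ltrim(ltrim(ltrim($line, '0123456789'), '.'));
}

// Option B: a single preg_replace() (hypothetical helper name).
// Only strips when digits are followed by a literal period.
function strip_with_regex($line)
{
    return preg_replace('/^[0-9]+\.\s*/', '', $line);
}

foreach ($lines as $line) {
    echo strip_with_ltrim($line), ' | ', strip_with_regex($line), "\n";
}
```

Both give the same result for these six lines. They part ways on input that does not follow the "number, period" shape: for a line such as '2008 was a year', the nested ltrim() version returns 'was a year', while the regex leaves the line alone because no period follows the digits.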

-- 
Jim Lucas

   "Some men are born to greatness, some achieve greatness,
       and some have greatness thrust upon them."

Twelfth Night, Act II, Scene V
    by William Shakespeare


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
