Jochen,

The reason your regex did not match is that there is some whitespace
between the

(<div style="float:left;">.*<div[^>]+class="icon">.*?</div>.*?</div>)+

blocks, so only the first block would match and the rest of the HTML went
to the next pattern. By making the inner .* quantifier greedy, you match all
the blocks within a single repetition of the pattern, so you could even
remove the + sign after the closing parenthesis, as it is not needed.
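
To illustrate the difference, here is a minimal sketch on a cut-down sample.
The two-block $html below is made up for the example (it is not the real
Joomla page), and $lazy, $greedy, $m1 and $m2 are just illustrative names:

```php
<?php
// Hypothetical sample: two blocks separated by a newline.
$html = '<div id="cpanel">' . "\n"
      . '<div style="float:left;"><div class="icon">A</div></div>' . "\n"
      . '<div style="float:left;"><div class="icon">B</div></div>' . "\n"
      . '</div></td>';

// Lazy inner .*? : the group cannot repeat across the newline between blocks.
$lazy   = '#(<div[^>]+id="cpanel">.*?)'
        . '(<div style="float:left;">.*?<div[^>]+class="icon">.*?</div>.*?</div>)+'
        . '(.*?</div>.*?</td>)#ism';

// Greedy inner .* : a single repetition runs up to the last class="icon" div.
$greedy = '#(<div[^>]+id="cpanel">.*?)'
        . '(<div style="float:left;">.*<div[^>]+class="icon">.*?</div>.*?</div>)+'
        . '(.*?</div>.*?</td>)#ism';

preg_match($lazy, $html, $m1);
preg_match($greedy, $html, $m2);

// Count how many icon blocks ended up in capture group 2.
echo substr_count($m1[2], 'class="icon"'), "\n"; // 1
echo substr_count($m2[2], 'class="icon"'), "\n"; // 2
```

On this sample the lazy version leaves the second block to the third group,
while the greedy version captures both blocks in the second group.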

If you really wanted to match this section as multiple repeating blocks,
you would have to use a regex like this:

$regex = '#(<div[^>]+id="cpanel">.*?)'.
         // note you need to account for the spaces and newlines (that's why
         // \s+), plus all the repetitions must be within the pattern itself
         '((?:<div style="float:left;">.*?<div[^>]+class="icon">.*?</div>.*?</div>\s+)+)'.
         '(.*?</div>.*?</td>)#ism';

which produces the same result :)
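
A quick sketch of how this could be checked, again on a made-up two-block
sample ($html, $m, $found and $blocks are assumptions for illustration).
Splitting group 2 into single blocks with preg_match_all is an extra step
added here for the example, not something the regex itself requires:

```php
<?php
// Hypothetical cut-down sample, not the real Joomla page.
$html = '<div id="cpanel">' . "\n"
      . '<div style="float:left;"><div class="icon">A</div></div>' . "\n"
      . '<div style="float:left;"><div class="icon">B</div></div>' . "\n"
      . '</div></td>';

// The regex from above: \s+ consumes the whitespace between blocks, and the
// non-capturing (?:...)+ repeats inside capture group 2.
$regex = '#(<div[^>]+id="cpanel">.*?)'
       . '((?:<div style="float:left;">.*?<div[^>]+class="icon">.*?</div>.*?</div>\s+)+)'
       . '(.*?</div>.*?</td>)#ism';

$blocks = [];
if (preg_match($regex, $html, $m)) {
    // $m[2] holds all repeated blocks; pull out each block on its own.
    preg_match_all(
        '#<div style="float:left;">.*?<div[^>]+class="icon">.*?</div>.*?</div>#is',
        $m[2],
        $found
    );
    $blocks = $found[0];
}

echo count($blocks), "\n"; // 2
```

Because the repetition sits inside the capturing group, group 2 contains
every block rather than only the last repetition.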

Ken

----- Original Message ----- 
From: "Jochen Daum" <[email protected]>
To: <[email protected]>
Sent: Thursday, July 16, 2009 1:16 PM
Subject: [phpug] Re: make specific regex part greedy



On Thu, Jul 16, 2009 at 1:03 PM, Ken Golovin<[email protected]> wrote:
>
> This has worked for me:
> $regex = '#(<div[^>]+id="cpanel">.*?)'.
> // note the question mark next to + quantifier is removed
> '(<div style="float:left;">.*<div[^>]+class="icon">.*?</div>.*?</div>)+'.
> '(.*?</div>.*?</td>)#ism'; //instead the first .*? becomes greedy
>


Yes it does, thanks Ken!

so the trick is to find which part has become greedy inside?

Kind Regards,

Jochen Daum

Chief Automation Officer
Automatem Ltd

Phone: 09 630 3425
Mobile: 021 567 853
Email: [email protected]
Skype: jochendaum
Website: www.automatem.co.nz
http://twitter.com/automatem
http://www.xing.com/go/invite/3425509.181107

>
>
> Ken
> ----- Original Message -----
> From: "Jochen Daum" <[email protected]>
> To: <[email protected]>
> Sent: Thursday, July 16, 2009 12:53 PM
> Subject: [phpug] Re: make specific regex part greedy
>
>
>
> Hi Stig,
>
> On Thu, Jul 16, 2009 at 12:35 PM, Stig Manning<[email protected]> wrote:
>>
>> Hi Jochen,
>>
>> Can you provide some example text that contains matches? (and the
>> correct match)
>>
>> Might help us find a solution,
>>
>
> Please see below (it's from the Joomla control panel)
>
>
> <div id="cpanel">
> <div style="float:left;">
> <div class="icon">
> <a href="index.php?option=com_content&amp;task=add">
> <img
> src="/administrator/templates/khepri/images/header/icon-48-article-add.png"
> alt="Add New Article" /> <span>Add New Article</span></a>
> </div>
> </div>
> <div style="float:left;">
>
> <div class="icon">
> <a
> href="index2.php?option=com_comprofiler&task=editPlugin&cid=501">
> <img
> src="/administrator/templates/khepri/images/header/icon-48-extension.png"
> alt="Global Configuration" />
> <span>Subscriptions</span></a>
> </div>
> </div>
> <div style="float:left;">
> <div class="icon">
>
> <a href="test">
> <img
> src="/administrator/templates/khepri/images/header/icon-48-extension.png"
> alt="Global Configuration" />
> <span>test</span></a>
> </div>
> </div>
> <div style="float:left;">
> <div class="icon">
> <a href="index.php?option=com_content">
>
> <img 
> src="/administrator/templates/khepri/images/header/icon-48-article.png"
> alt="Article Manager" /> <span>Article Manager</span></a>
> </div>
> </div>
> <div style="float:left;">
> <div class="icon">
> <a href="index.php?option=com_frontpage">
> <img
> src="/administrator/templates/khepri/images/header/icon-48-frontpage.png"
> alt="Front Page Manager" /> <span>Front Page Manager</span></a>
>
> </div>
> </div>
> <div style="float:left;">
> <div class="icon">
> <a href="index.php?option=com_sections&amp;scope=content">
> <img 
> src="/administrator/templates/khepri/images/header/icon-48-section.png"
> alt="Section Manager" /> <span>Section Manager</span></a>
> </div>
> </div>
>
> <div style="float:left;">
> <div class="icon">
> <a href="index.php?option=com_categories&amp;section=com_content">
> <img
> src="/administrator/templates/khepri/images/header/icon-48-category.png"
> alt="Category Manager" /> <span>Category Manager</span></a>
> </div>
> </div>
> <div style="float:left;">
> <div class="icon">
>
> <a href="index.php?option=com_media">
> <img src="/administrator/templates/khepri/images/header/icon-48-media.png"
> alt="Media Manager" /> <span>Media Manager</span></a>
> </div>
> </div>
> <div style="float:left;">
> <div class="icon">
> <a href="index.php?option=com_menus">
> <img 
> src="/administrator/templates/khepri/images/header/icon-48-menumgr.png"
> alt="Menu Manager" /> <span>Menu Manager</span></a>
>
> </div>
> </div>
> <div style="float:left;">
> <div class="icon">
> <a href="index.php?option=com_languages&amp;client=0">
> <img
> src="/administrator/templates/khepri/images/header/icon-48-language.png"
> alt="Language Manager" /> <span>Language Manager</span></a>
> </div>
> </div>
>
> <div style="float:left;">
> <div class="icon">
> <a href="index.php?option=com_users">
> <img src="/administrator/templates/khepri/images/header/icon-48-user.png"
> alt="User Manager" /> <span>User Manager</span></a>
> </div>
> </div>
> <div style="float:left;">
> <div class="icon">
>
> <a href="index.php?option=com_config">
> <img 
> src="/administrator/templates/khepri/images/header/icon-48-config.png"
> alt="Global Configuration" /> <span>Global
> Configuration</span></a>
> </div>
> </div>
> </div>
>
> </td>
> <td width="45%" valign="top">
>
>
>
>
>> Cheers,
>> Stig
>>
>> Jochen Daum wrote, on 16/07/2009 10:01 AM:
>>> The regex string is:
>>>
>>> $regex = '#(<div[^>]+id="cpanel">.*?)'.
>>> '(<div style="float:left;">.*?<div class="icon">.*?</div>.*?</div>)+'. //this needs to be greedy
>>> '(.*?</div>.*?</td>)#ism'; //instead the first .*? becomes greedy
>>>
>>>
>>>
>>> The HTML elements matched by the 2nd line repeat, and I don't know how
>>> often. I'd like to capture all of the repetitions; currently it only
>>> captures the first element.
>>>
>>> Any pointers on how to make the ()+ greedy?
>>>
>>>
>>

--~--~---------~--~----~------------~-------~--~----~
NZ PHP Users Group: http://groups.google.com/group/nzphpug
To post, send email to [email protected]
To unsubscribe, send email to
[email protected]
-~----------~----~----~----~------~----~------~--~---
