--- In power-pro@yahoogroups.com, "entropyreduction" 
<alancampbelllists+ya...@...> wrote:
> 
> --- In power-pro@yahoogroups.com, "Sheri" <sherip99@> wrote:
> ok, quick fixes, try 'em and see
> 
> regexPlugin210_100701.zip
> in the usual
> http://tech.groups.yahoo.com/group/power-pro/files/0_TEMP_/AlansPluginProvisional/
> 
> Added
> exec option "slow" == PCRE_NO_START_OPTIMIZE 
> compile option "usp"  == PCRE_UCP
> pseudo exec option "mark", "k" == PCRE_MINE_MARK 
> 
> There's a sample script regexMark.powerpro, simple case seems to work. 

Need more time for testing, but I ran some preliminary tests on mark and ucp 
and they seem to be working. A couple of quibbles:

We said the PCRE_UCP option would be ucp. Any good reason it became usp instead 
of ucp? If it's just a typo, please fix it; as is, it's counter-intuitive.

Ditto for the short form of mark: we said capital K, but it became lowercase k.

The short form works in the space-separated variation of options, but it does 
not currently work in an option cluster.
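
For illustration only, here is a minimal sketch of one way a parser could accept 
both the space-separated form and a cluster of short letters. This is not the 
plugin's actual code: the flag constants, the function name parseOptions, the 
long name "caseless" and the short letter 'i' are all assumptions made up for 
the example. Only "mark", "ucp", "slow" and capital K come from this thread.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical flag bits; the plugin's real values are not shown in this thread.
constexpr std::uint32_t kOptMark     = 1u << 0;  // pseudo exec option "mark" / "K"
constexpr std::uint32_t kOptUcp      = 1u << 1;  // compile option "ucp" (PCRE_UCP)
constexpr std::uint32_t kOptSlow     = 1u << 2;  // exec option "slow" (PCRE_NO_START_OPTIMIZE)
constexpr std::uint32_t kOptCaseless = 1u << 3;  // assumed option, for illustration only

// Parse an option string into flag bits.
// Each whitespace-separated token is tried as a long name first; if that
// fails, it is treated as a cluster and every character must be a known
// short letter. Returns false on the first unknown option.
bool parseOptions(const std::string& text, std::uint32_t& flags)
{
    static const std::map<std::string, std::uint32_t> longNames = {
        {"mark", kOptMark}, {"ucp", kOptUcp},
        {"slow", kOptSlow}, {"caseless", kOptCaseless}
    };
    static const std::map<char, std::uint32_t> shortNames = {
        {'K', kOptMark},      // capital K, as agreed in this thread
        {'i', kOptCaseless}   // assumed short letter
    };

    flags = 0;
    std::istringstream in(text);
    std::string token;
    while (in >> token) {
        auto itLong = longNames.find(token);
        if (itLong != longNames.end()) {
            flags |= itLong->second;
            continue;
        }
        for (char c : token) {
            auto itShort = shortNames.find(c);
            if (itShort == shortNames.end())
                return false;  // unknown letter inside a cluster
            flags |= itShort->second;
        }
    }
    return true;
}
```

With this approach "K", "iK" and "mark ucp" all go through the same lookup 
tables, so a short letter cannot work in one form and fail in the other.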

> Okay, back to *MARK.
> 
> pcretest doc:
> "If the variable that the mark field points to is non-NULL for a match, 
> non-match, or partial match, pcretest prints the string to which it points. 
> For a match, this is shown on a line by itself, tagged with "MK:". "
> 
> Er, okay, not entirely sure I understand why *MARK can be meaningful for a 
> non-match, but who am I to quibble?

Feel the same way.

Ya, it's up to the user to take note of whether he/she is dealing with a match. 
In pcreReplaceCallback, the callback routine always will be. I suppose one had 
also better thoroughly test any pattern developed in connection with mark, 
because the docs list many caveats. I haven't tried it yet with 
pcreReplaceCallback.

> 
> Would it make sense to try to retrieve *MARK name when
> resultPostNoMatchAnchored comes back (if that doesn't mean
> anything to you: and after all this time it doesn't make much
> sense to me: lemme know and I'll go find out how I determine
> resultPostNoMatchAnchored)

I'm mystified, sorry. You would need to define resultPostNoMatchAnchored. Maybe 
that is connected with the plugin advancing the start position and retrying a 
match in a multiple match operation?

If the plugin always copies the mark string after each exec when the mark 
option is set, that sounds like a lot of probably useless activity during a 
multiple match operation, but this is at the user's discretion. I'm not sure 
when to clean out or initialize regex_mark. Maybe once at the start of a 
regex.pcreFunction, regardless of whether the mark option is set? Or should 
this too be the user's responsibility?
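
To make the two questions concrete, here is a rough sketch of the "clear once 
per function call, copy after each exec only when mark is requested" policy. 
It is a stand-alone model, not the plugin's code: MarkTracker, ExecOutcome, 
beginOperation and afterExec are hypothetical names, and the mark pointer 
stands in for whatever PCRE hands back after an exec.

```cpp
#include <string>

// Hypothetical stand-in for the outcome of one exec call.
// mark is null when the exec produced no mark name.
struct ExecOutcome {
    bool matched;
    const char* mark;
};

// Models the lifetime of a regex_mark-style value across one plugin call.
class MarkTracker {
public:
    explicit MarkTracker(bool useMark) : m_useMark(useMark) {}

    // Called once at the start of every regex function, whether or not
    // the mark option is set, so a stale value never leaks between calls.
    void beginOperation() { m_mark.clear(); }

    // Called after every exec. Does no copying unless mark was requested.
    void afterExec(const ExecOutcome& outcome)
    {
        if (!m_useMark)
            return;
        if (outcome.mark != nullptr)
            m_mark = outcome.mark;   // copy: the source buffer may not outlive the exec
        else
            m_mark.clear();          // no mark for this exec
    }

    const std::string& mark() const { return m_mark; }

private:
    bool m_useMark;
    std::string m_mark;
};
```

Under this policy the per-exec copy costs nothing when the option is off, and 
the user only has to remember that the value reflects the most recent exec.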

Regards,
Sheri

> 
> 
>  aResult = m_pPatternCompiledCurrent->exec();
> 
>   if (bUseMark &&  
>     (aResult == resultOK || aResult == resultPartialMatch || 
>      aResult == resultMatch  || aResult == resultNoMatch))
>         strMark = m_pPatternCompiledCurrent->getMarkName();
> 
>   switch (aResult)
>   {
>    case resultOK: 
>    case resultPartialMatch: 
>    case resultMatch:
> 
>     aResult = onMatch(piRetVal);
>     break;
> 
>    case resultPostNoMatchAnchored:
> 
>     break;
> 
>    case resultNoMatch: //not found 
>   
>     if (m_nPatternsMatched > 0)  //not found at end of repeated match
>      return resultMatch;
>     else
>     {
>      *piRetVal = m_usNdxBase - m_usNdxBase;
>      return resultNoMatch;
>     }
>

