On Jan 29, 2008 11:47 AM, Hans wrote:
> Tuesday, January 29, 2008, 2:29:29 PM, The Editor wrote:
>
> > 1) When doing a case insensitive search--I get the matches in the
> > search function, but not the display.
>
> I have the /i qualifier switched for all preg match and replace calls.

Thank you! Works great... I don't suppose there's a way to get the
output to keep the same case as the original? Using this, everything
gets switched to lowercase.

> > 2) When doing a boolean like apples && oranges--I also get the matches
> > but not the display
>
> apples & oranges would need to be entered as regex: apples|oranges

Yes, I know. It's just that I'm using the actual search term entered in
the regex for the highlighting. I could try to do some substitutions or
something, but it gets messy fast.

> > 3) When searching for a phrase I get false matches. IE pages with the
> > words--but not the phrase
>
> can this be because you apply wordboundary character \b ?

No, it's because of the problem above. "this is a phrase" in the search
engine gets treated as "this && is && a && phrase", which of course does
not occur. I don't think PmWiki yet has boolean operators like this (or
phrase searching, for that matter) apart from the work you have been
doing. And BoltWire indexes words but not phrases.

> > 4) What to do with markup. Right now, my returns show the markup.
> > Processing it causes problems. Perhaps it could get stripped out to
> > some extent.
>
> I have no option to show active markup directives and active markup
> expressions, since the results were too horrible.
> Either lines containing directives are ignored, which will give
> perhaps some missing results, or the directives and expressions are
> shown as code. The latter is the default and looks acceptable when
> doing for instance searches in the PmWiki documentation.

I had thought of processing the entire page's markup, then stripping out
everything between < and > (all the html tags). This works easier for me
because of how BoltWire processes sections of the pages. But I think it
will slow things down. I too am left with just escaping code.

> If you look at extract.php function TextExtract case 'code'
> you see a number of replacements are happening for various reasons to
> defuse some active markups. It still fails in a few cases, but works
> for most. PmWiki.ChangeLog is a great page to use as a source for
> testing, full of weird markup!

That is a possibility. I suspect the biggest problems are things like
tables, etc. In BoltWire, I can process a page and specify the markup
rules used. But performance is again an issue.

Thanks for the suggestions. Just thinking out loud with you. Your
sample search form is impressive.

Cheers,
Dan
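
A rough sketch of the highlighting idea from points 1 to 3, in Python
rather than PHP and not taken from PmWiki or BoltWire. It assumes the
query uses && between terms and optional double quotes around a phrase.
The names highlight_pattern and highlight, and the <b> wrapper, are
made up for illustration. Each match is wrapped using the matched text
itself, so the original case survives. PHP's preg_replace should give
the same effect with the /i flag and $0 in the replacement string.

```python
import re


def highlight_pattern(query):
    """Turn 'apples && oranges' or '"a phrase"' into one regex."""
    terms = []
    for part in query.split("&&"):
        # Drop surrounding whitespace and phrase quotes
        part = part.strip().strip('"')
        if part:
            # Escape so user input is matched literally
            terms.append(re.escape(part))
    if not terms:
        return None
    # Longest first, so a longer term wins over a shorter prefix
    terms.sort(key=len, reverse=True)
    return re.compile("|".join(terms), re.IGNORECASE)


def highlight(text, query):
    """Wrap every match in <b>...</b>, keeping the page's own case."""
    pattern = highlight_pattern(query)
    if pattern is None:
        return text
    # m.group(0) is the text as it appears on the page
    return pattern.sub(lambda m: "<b>" + m.group(0) + "</b>", text)


print(highlight("Apples and ORANGES", "apples && oranges"))
print(highlight("This is a phrase, is it?", '"this is a phrase"'))
```

The search itself still needs the && logic to decide which pages
match. This only covers the display side, where alternation is enough
because any of the terms may be highlighted.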
