Bill Janssen said:

> I'm currently using a StoryEnd pattern of (<p>-<p>|<p>-=<p>|<!-- TextEnd -->)
> on a story that contains the following text at the end:
>
>   these hip protectors under your clothing. After you show, they can
>   see that under your normal trousers pockets there is something.''<p>-=<p>On
>   the Web:<p>World Health Organization about hip fractures:
>
> What I *get*, in the scooped story, is
>
>   these hip protectors under your clothing. After you show, they can
>   see that under your normal trousers pockets there is
>   </body></html>
>   </body></html>
>
> So: what happened to the "something.''" part of the story? By the
> way, this happens reliably to all the stories scooped in this way.

This shouldn't be happening -- but it is. It seems to be something that
Perl's regular expression code is doing. I haven't found a way to avoid
it, apart from writing the patterns so that they aim to find only one
pattern in the page rather than several. :(

I think the best thing to do is to avoid using multiple patterns where
possible, especially patterns that may show up inside the story text.

> And: why are there two </body> tags in the output?

Yep -- that's now fixed. I've just checked in a stack of fixes for MHTML
mode, including a fix for that bug you reported at the weekend.

--j.

_______________________________________________
Sitescooper-talk mailing list
http://lists.sourceforge.net/mailman/listinfo/sitescooper-talk
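To illustrate the advice about multiple patterns, here is a rough sketch of how an alternation picks whichever alternative appears first in the page. It uses Python's re module purely as a stand-in, so it does not reproduce the Perl behaviour described above; it only shows what the StoryEnd pattern would be expected to do. The cut_story helper and the sample "body" string are made up for the illustration and are not part of sitescooper.

```python
import re

# Stand-in for the StoryEnd setting quoted above (assumption: treated as a
# plain regular expression searched for in the page text).
STORY_END = re.compile(r"(<p>-<p>|<p>-=<p>|<!-- TextEnd -->)")


def cut_story(page, pattern):
    """Hypothetical helper: return the text before the first match of
    pattern, or the whole page if the pattern never matches."""
    m = pattern.search(page)
    return page if m is None else page[:m.start()]


# Tail of the story from the report above.
tail = ("see that under your normal trousers pockets there is "
        "something.''<p>-=<p>On the Web:<p>World Health Organization "
        "about hip fractures:")

# Expected behaviour: the cut happens right before <p>-=<p>, so the
# story still ends with "something.''".
story = cut_story(tail, STORY_END)

# Invented page where one alternative also shows up inside the story text.
body = "First part.<p>-<p>Second part.<p>-=<p>On the Web:"

# The alternation stops at the earliest alternative, truncating the story.
early = cut_story(body, STORY_END)

# A single, more specific pattern keeps the whole story.
single = cut_story(body, re.compile(r"<p>-=<p>"))
```

With the invented "body" string, the three-way pattern cuts the story at the first <p>-<p>, while the single <p>-=<p> pattern keeps both parts. That is the reason for preferring one specific end marker per site where the page allows it.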
