Hello, I'm trying to use HTML::Parser to parse some web pages. Most of the time I'm only interested in a few specific parts, so I use report_tags. At some point, however, I need to get the whole text including all embedded HTML tags. What I currently have is a start handler, an end handler, and a text handler, which together reconstruct the text. This works reasonably well, but only for the tags that I explicitly list using report_tags.
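To illustrate the setup described above, here is a minimal sketch. The tag list, the sample input, and the variable names are made up for this example and are not my actual code. It assumes the version 3 handler API and the "text" argspec, which passes each event's original source text to the handler.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Buffer that collects the reconstructed markup.
my $raw = '';

# One shared handler: it receives the original source text of the
# event (the "text" argspec) and appends it unchanged.
my $append = sub { $raw .= $_[0] };

my $p = HTML::Parser->new(
    api_version => 3,
    start_h     => [ $append, 'text' ],
    end_h       => [ $append, 'text' ],
    text_h      => [ $append, 'text' ],
);

# Positive filter (illustrative tag list): only these tags
# trigger start and end events.
$p->report_tags(qw(b i));

# Hypothetical sample input.
$p->parse('<p>Some <b>bold</b> and <a href="x">linked</a> text</p>');
$p->eof;

print "$raw\n";
```

In this sketch the <b> element survives in $raw, while the <p> and <a> tags are dropped because they are not in the report_tags list, which is exactly the limitation I am running into.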
So I am looking for either a completely different approach (e.g. is the raw HTML available somehow, so that I don't have to reconstruct it?), or a way to reset report_tags/ignore_tags so that all tags are reported (without having to list every possible HTML tag, that is). I tried ->ignore_tags(()) and ->ignore_tags(qw(none)), but it seems that once ->report_tags() has been called, the parser always uses a positive tag filter.

Any ideas/comments?

Best,
Norbert