Malthe Borch <mbo...@gmail.com> added the comment:

Perhaps we can use ``lxml`` to extract the locations (start and end
string offsets) of the ``<img>`` tags and then simply apply regex
matching to those ranges.

This way, the original document isn't changed, and we avoid the
pitfalls of a heuristic.
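
A rough sketch of the idea, under some assumptions. As far as I know,
``lxml`` only exposes ``sourceline`` on elements, not character offsets,
so this sketch swaps in the standard library's ``html.parser`` to locate
the tags instead. The names ``ImgLocator`` and ``rewrite_img_src``, and
the choice of rewriting the ``src`` attribute, are hypothetical and only
there for illustration.

```python
import re
from html.parser import HTMLParser


class ImgLocator(HTMLParser):
    """Collect (start, end) character offsets of every <img> tag."""

    def __init__(self, text):
        super().__init__()
        self.spans = []
        # Offset at which each line begins, used to turn the parser's
        # (line, column) position into an index into the string.
        self.line_starts = [0] + [m.end() for m in re.finditer(r"\n", text)]
        self.feed(text)
        self.close()

    def _record(self, tag):
        if tag == "img":
            line, col = self.getpos()  # position where the tag starts
            start = self.line_starts[line - 1] + col
            self.spans.append((start, start + len(self.get_starttag_text())))

    def handle_starttag(self, tag, attrs):
        self._record(tag)

    def handle_startendtag(self, tag, attrs):
        self._record(tag)


# Assumption: we only care about a quoted src attribute.
SRC_RE = re.compile(r"""(\ssrc\s*=\s*)(["'])(.*?)\2""", re.I | re.S)


def rewrite_img_src(text, rewrite):
    """Apply rewrite() to each <img> src; all other text is copied verbatim."""
    out, last = [], 0
    for start, end in ImgLocator(text).spans:
        out.append(text[last:start])
        out.append(SRC_RE.sub(
            lambda m: m.group(1) + m.group(2) + rewrite(m.group(3)) + m.group(2),
            text[start:end],
            count=1,
        ))
        last = end
    out.append(text[last:])
    return "".join(out)
```

The parser is used only to find where each tag sits in the string. The
regex never sees anything outside those ranges, so text that merely
looks like an attribute is left alone and the rest of the document is
never re-serialized.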

__________________________________
Repoze Bugs <b...@bugs.repoze.org>
<http://bugs.repoze.org/issue103>
__________________________________