On Wed, 13 Jul 2005 06:19:54 -0700, qwweeeit wrote:

> Hi all,
> I am writing a script to visualize (and print)
> the web references hidden in the html files as:
> '<a href="web reference"> underlined reference</a>'
> Optimizing my code,

[red rag to bull]

Because it was too slow? Or just to prove what a macho programmer you are? Is your code even working yet? If it isn't working, you shouldn't be trying to optimize buggy code.

> I found that an essential step is:
> splitting on a word (in this case 'href').

Then just do it:

py> '<a href="web reference"> underlined reference</a>'.split('href')
['<a ', '="web reference"> underlined reference</a>']

If you are concerned about case issues, you can either convert the entire HTML file to lowercase, or you might write a case-insensitive regular expression to replace any "href" regardless of case with the lowercase version.

[snip]

> To be sure as delimiter I choose chr(127)
> which surely is not present in the html file.

I wouldn't bet my life on that. I've found some weird characters in HTML files.

--
Steven.
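
For what it's worth, here is a rough sketch of the case-insensitive regular expression idea, assuming the standard re module. The sample string with an uppercase HREF and the names HREF, normalised, parts and parts_direct are made up for illustration; they are not from the original script.

```python
import re

# Hypothetical sample input; the uppercase attribute is only for illustration.
html = '<a HREF="web reference"> underlined reference</a>'

# Compile a pattern that matches "href" in any mix of upper and lower case.
HREF = re.compile(r'href', re.IGNORECASE)

# Option 1: rewrite every variant to lowercase "href", then split as usual.
normalised = HREF.sub('href', html)
parts = normalised.split('href')

# Option 2: split directly on the case-insensitive pattern.
parts_direct = HREF.split(html)

print(parts)
print(parts_direct)
```

Either way the rest of the file keeps its original case, which is probably what you want if the references themselves are case-sensitive. It also splits on the word itself, so no special delimiter character is needed.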