robert <[EMAIL PROTECTED]>:
> Often I want to extract some web table contents. Formats are
> mostly static, simple text & numbers in it, other tags to be
> stripped off. So a simple & fast approach would be ok.
>
> What of the different modules around is most easy to use, stable,
> up-to-date, iterator access or best matrix-access (without need
> for callback functions, classes.. for basic tasks)?

Not more than a handful of lines with lxml.html:

    def htmltable2matrix(table):
        """Convert an HTML table to a matrix (a list of rows).

        :param table: The HTML table element
        :type table: An lxml element
        """
        matrix = []
        for row in table:
            # text_content() returns the cell text with nested tags stripped
            matrix.append([e.text_content() for e in row])
        return matrix
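
For completeness, here is a rough sketch of how the function might be called on a parsed page. The sample markup and the "//table" XPath are made up for illustration; a real page will probably need a more specific expression to pick the right table.

```python
import lxml.html


def htmltable2matrix(table):
    """Convert an lxml table element to a list of rows of cell text."""
    matrix = []
    for row in table:
        # text_content() drops nested markup such as <b>...</b>
        matrix.append([e.text_content() for e in row])
    return matrix


# Hypothetical sample page, only for illustration.
SAMPLE = """
<html><body>
<table>
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td><b>Apple</b></td><td>1.20</td></tr>
  <tr><td>Pear</td><td>0.80</td></tr>
</table>
</body></html>
"""

doc = lxml.html.fromstring(SAMPLE)
table = doc.xpath("//table")[0]  # assumption: the first table is the wanted one
matrix = htmltable2matrix(table)

for row in matrix:
    print(row)
```

Note that "for row in table" assumes the tr elements are direct children of the table. If the markup wraps rows in thead/tbody, iterating over table.iter("tr") instead should pick them up, at the cost of also visiting rows of any nested tables.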