You have NO IDEA what a can of worms this is! The problem is, real-world HTML is simply NOT STANDARD. That is, there is a standard, but it's kind of loose, and people violate it all the time. Browsers understand this, and have VERY forgiving parsers.
But a good forgiving parser is a lot harder to write than one that follows a standard, and there are a whole lot of different ways it could work.

If you google, there ARE a number of Java-based HTML parsers out there. It's been a long time since I've used one. I have, on occasion, had to write my own.

The first thing is to ask "Why do you want to parse HTML, rather than XML?" That leads to "What kind of HTML do I have to parse? How many kinds?"

If you just want to extract some bits of information, you may be able to do that with a few well-chosen regular expressions. Or, at the other extreme, you may have a $1 million engineering project on your hands.

The BEST option is to avoid parsing HTML if at all possible. Otherwise, the narrower your expectations of what you want to get from the HTML, the easier it will be to find, adapt, or write a parser to meet your needs.

On Feb 21, 9:54 pm, Alisha <[email protected]> wrote:
> Hi All,
>
> I have to parse a html file using java. I have gone through a lot of
> html parsers, but seem to understand none of them. So please help me
> out with the type of parser that should be used for an android app and
> how to parse a html file.

--
You received this message because you are subscribed to the Google Groups "Android Developers" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/android-developers?hl=en
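
To make the "few well-chosen regular expressions" option concrete, here is a minimal sketch in Java. It assumes you only want the page title and the link targets, and that the markup is reasonably tame. The class and method names (HtmlBits, extractTitle, extractLinks) are made up for illustration, and this is NOT a parser: it will trip over commented-out markup, script contents, and unquoted attribute values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls two narrow bits of information out of HTML text.
final class HtmlBits {
    // <title>...</title>, any case, contents may span several lines.
    private static final Pattern TITLE = Pattern.compile(
            "<title[^>]*>(.*?)</title>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // href="..." or href='...' inside an <a ...> tag; group 2 is the URL.
    // Assumption: attribute values are quoted. Unquoted values are skipped.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the trimmed title text, or null if no title element is found.
    static String extractTitle(String html) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1).trim() : null;
    }

    // Returns every quoted href value found in an <a> tag, in document order.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(2));
        }
        return links;
    }
}
```

You would call HtmlBits.extractTitle(html) and HtmlBits.extractLinks(html) on the page text after downloading it. Once you find yourself adding a third or fourth pattern to cope with odd markup, that is usually the sign you have left the "few regular expressions" case and need a real forgiving parser.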

