Revision: 7901
          http://svn.sourceforge.net/gate/?rev=7901&view=rev
Author:   ian_roberts
Date:     2006-12-11 04:07:23 -0800 (Mon, 11 Dec 2006)

Log Message:
-----------
A new and hopefully more robust HTML document parser based on NekoHTML.  Its
results will differ slightly from those of the old Swing-based parser, but it
should handle JavaScript blocks and XHTML tags much better.

All tag and attribute names are converted to lowercase to conform to the old
parser style.
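
The lowercase conversion described above can be illustrated with a minimal
sketch. This is not the committed NekoHtmlDocumentHandler code: it uses the
JDK's own SAX parser on well-formed input rather than NekoHTML, and the class
name LowercaseNamesSketch, the handler LowercasingHandler, and the method
normalisedNames are all hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class LowercaseNamesSketch {

    /** Hypothetical handler: records each start tag with lowercased names. */
    static class LowercasingHandler extends DefaultHandler {
        final List<String> seen = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes attrs) {
            // Locale.ROOT keeps the result independent of the default locale
            // (for example, the Turkish dotless-i rule).
            StringBuilder sb = new StringBuilder(qName.toLowerCase(Locale.ROOT));
            for (int i = 0; i < attrs.getLength(); i++) {
                sb.append(' ')
                  .append(attrs.getQName(i).toLowerCase(Locale.ROOT))
                  .append('=')
                  .append(attrs.getValue(i)); // attribute values are left as-is
            }
            seen.add(sb.toString());
        }
    }

    /** Parses well-formed markup and returns the normalised start tags. */
    public static List<String> normalisedNames(String xml) throws Exception {
        LowercasingHandler handler = new LowercasingHandler();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.seen;
    }

    public static void main(String[] args) throws Exception {
        // prints [p class=Intro, b]
        System.out.println(normalisedNames("<P CLASS=\"Intro\"><B>hi</B></P>"));
    }
}
```

Only the names are folded to lowercase; attribute values and text content
keep their original case, so downstream code can match on tag names without
worrying about how the source document capitalised them.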

The old parser is still the default, pending further testing.

Added Paths:
-----------
    gate/trunk/lib/nekohtml-0.9.5.jar
    gate/trunk/src/gate/corpora/NekoHtmlDocumentFormat.java
    gate/trunk/src/gate/html/NekoHtmlDocumentHandler.java


_______________________________________________
GATE-cvs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/gate-cvs
