Hi

On Mon, Nov 16, 2009 at 12:02 PM, David Pollak <[email protected]> wrote:

>
>
> On Sun, Nov 15, 2009 at 11:24 PM, <[email protected]> wrote:
>
>> Hello,
>>
>> I am a newbie to both Scala and Lift. Now that that's out of the way, I'm
>> wondering how to properly use PCDataXmlParser to read and parse HTML.
>
>
> PCDataXmlParser requires well-formed XML.  It is an XML parser.
>
> There are plenty of Java libraries that parse HTML.  Please use one of
> those for parsing stuff that's not known to be well-formed XML.
>

I would suggest you start here:
http://www.hars.de/2009/01/html-as-xml-in-scala.html

I implemented this last week and it works well.
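
If you want something to experiment with before pulling in a library, here
is a minimal sketch that uses the lenient HTML parser bundled with the JDK
(javax.swing.text.html). It is not the approach from the article above. The
names LenientHtml and cellTexts are made up for illustration, and collecting
the text of <td> cells is only my guess at what you want from the page.

```scala
import java.io.StringReader
import javax.swing.text.MutableAttributeSet
import javax.swing.text.html.{HTML, HTMLEditorKit}
import javax.swing.text.html.parser.ParserDelegator
import scala.collection.mutable.ListBuffer

// Hypothetical helper: collects the trimmed text of every <td> cell.
// Nested tables are not handled; this is only a sketch.
object LenientHtml {
  def cellTexts(html: String): List[String] = {
    val cells   = ListBuffer.empty[String]
    val current = new java.lang.StringBuilder
    var inCell  = false

    val callback = new HTMLEditorKit.ParserCallback {
      override def handleStartTag(t: HTML.Tag, a: MutableAttributeSet, pos: Int): Unit =
        if (t == HTML.Tag.TD) { inCell = true; current.setLength(0) }

      override def handleText(data: Array[Char], pos: Int): Unit =
        if (inCell) current.append(new String(data))

      override def handleEndTag(t: HTML.Tag, pos: Int): Unit =
        if (t == HTML.Tag.TD && inCell) {
          inCell = false
          cells += current.toString.trim
        }
    }

    // The third argument tells the parser to ignore charset declarations.
    new ParserDelegator().parse(new StringReader(html), callback, true)
    cells.toList
  }
}
```

You would call it as LenientHtml.cellTexts(stream) on the string you already
fetch with dispatch. If you need a real NodeSeq to hand to Lift, the approach
in the article is the better fit.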


>
>
>
>> I pull data from a restful service by doing the following:
>>
>> [code]
>> import dispatch._
>> import Http._
>>
>> import net.liftweb.util._;
>> import scala.xml._;
>>
>> def upcDatabase(): Box[NodeSeq] = {
>> val http = new Http
>> var stream: String = "";
>> http("http://www.upcdatabase.com/item/0606949324124" >- (arg => stream = arg))
>> PCDataXmlParser(stream);
>> }
>>
>> val feedXML: Box[NodeSeq] = upcDatabase;
>> [/code]
>>
>> when doing this I get the following exception:
>>
>> [exception]
>> INF: [console logger] dispatch: GET
>> http://www.upcdatabase.com/item/0606949324124
>> log4j:WARN No appenders could be found for logger
>> (org.apache.http.impl.conn.SingleClientConnManager).
>> log4j:WARN Please initialize the log4j system properly.
>> :96:5: '<' not allowed in attrib value </a> ^
>> :97:1: '<' not allowed in attrib value</p>^
>> :98:1: '<' not allowed in attrib value</td>^
>> :99:1: '<' not allowed in attrib value<td valign="top" width="70%">^
>> :99:27: whitespace expected<td valign="top" width="70%"> ^
>> :99:27: '>' expected instead of '%'<td valign="top" width="70%"> ^
>> Exception in thread "main" java.lang.ExceptionInInitializerError
>> at
>> ca.ctrlspace.loveItHateItWeb.xml.UpcDatabaseFeed.main(UpcDatabaseFeed.scala)
>> Caused by: java.lang.RuntimeException: FATAL
>> at scala.Predef$.error(Predef.scala:76)
>> at scala.xml.parsing.MarkupParser$class.xToken(MarkupParser.scala:267)
>> at net.liftweb.util.PCDataXmlParser.xToken(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.element1(MarkupParser.scala:680)
>> at net.liftweb.util.PCDataXmlParser.element1(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.content1(MarkupParser.scala:481)
>> at net.liftweb.util.PCDataXmlParser.content1(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.content(MarkupParser.scala:505)
>> at net.liftweb.util.PCDataXmlParser.content(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.element1(MarkupParser.scala:682)
>> at net.liftweb.util.PCDataXmlParser.element1(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.content1(MarkupParser.scala:481)
>> at net.liftweb.util.PCDataXmlParser.content1(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.content(MarkupParser.scala:505)
>> at net.liftweb.util.PCDataXmlParser.content(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.element1(MarkupParser.scala:682)
>> at net.liftweb.util.PCDataXmlParser.element1(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.content1(MarkupParser.scala:481)
>> at net.liftweb.util.PCDataXmlParser.content1(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.content(MarkupParser.scala:505)
>> at net.liftweb.util.PCDataXmlParser.content(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.element1(MarkupParser.scala:682)
>> at net.liftweb.util.PCDataXmlParser.element1(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.content1(MarkupParser.scala:481)
>> at net.liftweb.util.PCDataXmlParser.content1(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.content(MarkupParser.scala:505)
>> at net.liftweb.util.PCDataXmlParser.content(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.element1(MarkupParser.scala:682)
>> at net.liftweb.util.PCDataXmlParser.element1(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.content1(MarkupParser.scala:481)
>> at net.liftweb.util.PCDataXmlParser.content1(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.content(MarkupParser.scala:505)
>> at net.liftweb.util.PCDataXmlParser.content(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.element1(MarkupParser.scala:682)
>> at net.liftweb.util.PCDataXmlParser.element1(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.content1(MarkupParser.scala:481)
>> at net.liftweb.util.PCDataXmlParser.content1(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.content(MarkupParser.scala:505)
>> at net.liftweb.util.PCDataXmlParser.content(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.element1(MarkupParser.scala:682)
>> at net.liftweb.util.PCDataXmlParser.element1(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.content1(MarkupParser.scala:481)
>> at net.liftweb.util.PCDataXmlParser.content1(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.content(MarkupParser.scala:505)
>> at net.liftweb.util.PCDataXmlParser.content(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.element1(MarkupParser.scala:682)
>> at net.liftweb.util.PCDataXmlParser.element1(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.content1(MarkupParser.scala:481)
>> at net.liftweb.util.PCDataXmlParser.content1(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.content(MarkupParser.scala:505)
>> at net.liftweb.util.PCDataXmlParser.content(PCDataMarkupParser.scala:91)
>> at scala.xml.parsing.MarkupParser$class.document(MarkupParser.scala:207)
>> at net.liftweb.util.PCDataXmlParser.document(PCDataMarkupParser.scala:91)
>> at net.liftweb.util.PCDataXmlParser$.apply(PCDataMarkupParser.scala:112)
>> at
>> ca.ctrlspace.loveItHateItWeb.xml.UpcDatabaseFeed$.upcDatabase(UpcDatabaseFeed.scala:16)
>> at
>> ca.ctrlspace.loveItHateItWeb.xml.UpcDatabaseFeed$.<init>(UpcDatabaseFeed.scala:19)
>> at
>> ca.ctrlspace.loveItHateItWeb.xml.UpcDatabaseFeed$.<clinit>(UpcDatabaseFeed.scala)
>> ... 1 more
>> [/exception]
>>
>> What is the proper way to parse non-strict HTML? I thought PCDataXmlParser
>> allowed for non-strict XML, as opposed to XML.load().
>>
>> Thanks,
>>
>> Chri
>>
>>
>>
>
>
> --
> Lift, the simply functional web framework http://liftweb.net
> Beginning Scala http://www.apress.com/book/view/1430219890
> Follow me: http://twitter.com/dpp
> Surf the harmonics
>
>
> >
>

