Re: Fw: [dom4j-user] Parsing CDATA

Terry Steichen Tue, 09 Jul 2002 10:09:23 -0700

James,

Thanks for the comments.  Unfortunately, your suggestion doesn't seem to
work.  The List that's returned ('content' in your suggested code below)
contains only one item: the CDATA section.  So it still doesn't let me
decompose that into the component <p> and <b> elements that exist within
the CDATA.  Maybe I'm doing something wrong, but that seems to be the
behavior.
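
For reference, this behavior is what an XML parser is expected to do: markup inside a CDATA section is character data, so the <p> and <b> tags only become elements if that text is parsed a second time. The following is a minimal sketch of that second pass. It deliberately uses the JDK's built-in DOM parser rather than dom4j so that it runs without extra jars; with dom4j the equivalent steps would presumably be Node.getText() followed by DocumentHelper.parseText(). The class name, the sample document and the <fragment> wrapper element are hypothetical, and the sketch assumes the CDATA payload is well-formed XML once wrapped in a single root.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class CdataReparse {

    // Hypothetical sample document: <body> holds markup hidden inside CDATA.
    static final String SAMPLE =
        "<doc><body><![CDATA[<p>Some <b>bold</b> text</p><p>More</p>]]></body></doc>";

    // Parse an XML string with the JDK DOM parser (not dom4j).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Locate the first <body> element of the given XML string.
    static Element findBody(String xml) {
        return (Element) parse(xml).getElementsByTagName("body").item(0);
    }

    // Produce one label per element, text or CDATA child of the parent.
    static List<String> describeChildren(Element parent) {
        List<String> labels = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            switch (child.getNodeType()) {
                case Node.ELEMENT_NODE:
                    labels.add("element:" + child.getNodeName());
                    break;
                case Node.CDATA_SECTION_NODE:
                    labels.add("cdata");
                    break;
                case Node.TEXT_NODE:
                    labels.add("text");
                    break;
                default:
                    break; // comments, processing instructions etc. are ignored
            }
        }
        return labels;
    }

    // Second pass: take the character content of the element (the CDATA
    // payload), wrap it in a single root and parse it as XML.
    static Element reparse(Element body) {
        String markup = body.getTextContent();
        return parse("<fragment>" + markup + "</fragment>").getDocumentElement();
    }

    public static void main(String[] args) {
        Element body = findBody(SAMPLE);
        System.out.println("first pass:  " + describeChildren(body));
        System.out.println("second pass: " + describeChildren(reparse(body)));
    }
}
```

After the second pass the result is an ordinary element tree, so the usual child iteration with type checks applies to it. Going through the node's text content avoids any substring handling of the '<![CDATA[' and ']]>' markers.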


Regards,

Terry

----- Original Message -----
From: "James Strachan" <[EMAIL PROTECTED]>
To: "Terry Steichen" <[EMAIL PROTECTED]>
Cc: "dom4j-user" <[EMAIL PROTECTED]>
Sent: Monday, July 08, 2002 5:32 AM
Subject: Re: Fw: [dom4j-user] Parsing CDATA


> From: "Terry Steichen" <[EMAIL PROTECTED]>
> > Bob,
> >
> > I did end up doing something like your suggestion.  It seems pretty
> > hokey (not your suggestion - what I did) to me, but -- it does work.
> >
> > What I did was extract the body text and then did two substring
> > operations to remove the '<![CDATA[' and ']]'.  Then I enclosed the
> > resultant string inside a root tag ('<doc>mystring</doc>') and parsed
> > that.
> >
> > As I said, I'm not really happy with this hack, but it lets me move
> > ahead.  Any suggestions on a more elegant solution would be much
> > appreciated.
>
> I'm still not quite sure what you really want to do. You should not need
> to parse text or do substring operations. The dom4j Element will contain a
> tree of Node implementations which in your case will probably be a mixture
> of Element, Text and CDATA nodes. So you should just be able to use regular
> Java 2 Collections code and use 'instanceof' to determine which nodes you
> want to process and which you don't.
>
> When you say...
>
> > What I want to do is parse the contents of "body" into the component
> > paragraph, highlighted text and regular text parts
>
> you could just iterate over the contents of <body> and process things
> however you wish...
>
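> // Requires java.util.Iterator, java.util.List and
> // org.dom4j.CDATA, Element, Node, Text.
> Element body = (Element) doc.selectSingleNode( "/doc/body" );
> List content = body.content();
> for (Iterator iter = content.iterator(); iter.hasNext(); ) {
>     Node child = (Node) iter.next();
>     if ( child instanceof Element ) {
>         // process this element; it could be a <b> or <p> etc.
>     }
>     else if ( child instanceof Text ) {
>         // it's a block of text
>     }
>     else if ( child instanceof CDATA ) {
>         // CDATA is a separate node type from Text; its content is
>         // plain character data, available via child.getText()
>     }
> }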
>
> James



