Abdera uses the available, configured JAXP parser. Look in the Parser module for the code that interfaces with the underlying parser.
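
To see which implementation is actually being picked up, a minimal sketch like the following may help. It assumes that Abdera reaches the parser and serializer through the StAX part of JAXP (javax.xml.stream); that is an assumption on my part, not something stated in this thread. The class name ConfiguredStax is made up for illustration.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;

// Hypothetical helper: reports which StAX factories JAXP resolves
// on the current classpath. Not part of Abdera.
public class ConfiguredStax {

    /** Class name of the StAX parser factory that JAXP resolves. */
    static String inputFactoryName() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    /** Class name of the StAX serializer factory that JAXP resolves. */
    static String outputFactoryName() {
        return XMLOutputFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("parser:     " + inputFactoryName());
        System.out.println("serializer: " + outputFactoryName());
    }
}
```

If the lookup goes through XMLOutputFactory.newInstance(), the implementation can be swapped by setting the javax.xml.stream.XMLOutputFactory system property to another factory class, or by changing which StAX jar is on the classpath. Whether a given implementation escapes > in character data is up to that implementation.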
On Wed, Jan 26, 2011 at 7:37 AM, Rick Meyer <[email protected]> wrote:

> Hi James,
>
> I was wondering if you could give me some help locating where exactly in
> the Abdera code the XML parser is being used to HTML encode the content.
> Also, which XML parser is Abdera using by default?
>
> I'm hoping to be able to either configure the existing XML parser to
> encode the > character too, or if necessary swap it out for another one
> that will. I'm hardly an expert in this stuff, but I would think that
> XWork2 would be able to handle this. I don't see that jar in the Abdera
> dependencies directory though, so I guess that is not being used here.
>
> Thanks,
>
> Rick
>
>
> On 12/1/10 12:39 PM, "James Snell" <[email protected]> wrote:
>
> > While the &lt;p> encoding is annoying, it is valid. The > character does
> > not need to be escaped. Nevertheless, the encoding for this is actually
> > handled by the underlying XML parser/serializer and not Abdera itself.
> >
> > On Tue, Nov 30, 2010 at 2:32 PM, Rick Meyer <[email protected]> wrote:
> >
> >> We are using the Abdera client software to transfer html documents to
> >> a client's server.
> >>
> >> In creating a Content object I have attempted to set the content type
> >> to both TEXT and HTML and have run into an issue with each.
> >>
> >> When I set the content type to HTML, only the '<' char of the included
> >> html ends up being HTML encoded, so <p> ends up like this: &lt;p>
> >> It should be encoded like this though: &lt;p&gt;
> >>
> >> Actually, when I set the content type to TEXT I get the exact same
> >> behavior. So if the text includes <p>, what ends up being sent out is
> >> &lt;p>
> >>
> >> Now if I HTML encode the content myself, then the & character ends up
> >> being double encoded. So what I end up with is &amp;lt;p&amp;gt;
> >> It does this if I set the Content object's content type to HTML or TEXT.
> >>
> >> I would expect this last case to occur with HTML, since that should be
> >> HTML encoding the data anyway, but not for TEXT.
> >>
> >> I started using the latest release version of Abdera (1.1) and have now
> >> downloaded the latest source and built that myself, and both versions
> >> have the same behavior.
> >>
> >> Is it possible to resolve this issue immediately? Otherwise we may have
> >> to scrap Abdera and find another solution.
> >>
> >> Here is an example of what was being sent:
> >>
> >> <entry xmlns="http://www.w3.org/2005/Atom"><id>281474978492700</id>
> >> <author><name>Brenda Daverin</name></author>
> >> <title type="text">US Indicts 11 German and Chinese Executives for
> >> Honey Smuggling</title>
> >> <content type="text">&lt;p>For many people with psoriasis, finding safe
> >> and effective treatments can be an ever-moving target. There's no cure
> >> or universal fix, people respond differently to treatment options, and
> >> even when you find a medication - or a combination of them - that
> >> works, it may only be effective for a period of time or may need to be
> >> stopped to avoid potentially damaging side effects.&lt;/p>&lt;p>"There
> >> are a lot of treatments out there and they are quite effective, but
> >> often they stop being effective," says Dr. Mark Lebwohl, chair of the
> >> department of dermatology at Mount Sinai Medical Center in New York
> >> City. "There isn't one treatment over a lifetime,
> >> necessarily."&lt;/p></content><category /></entry>
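
The point quoted above, that escaping only the < character is valid, can be checked with any JAXP parser. The sketch below is only an illustration using the JDK's DOM parser rather than Abdera's own API; the class name EscapeEquivalence and the sample strings are made up. It parses one document that escapes only < and one that escapes both < and >, and compares the decoded text. It also shows what a receiver sees when already-escaped content is escaped a second time.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Abdera.
public class EscapeEquivalence {

    /** Parses a small XML document and returns the decoded text of its root element. */
    static String rootText(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getTextContent();
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the demo short.
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // Only '<' escaped, as in the entry that was sent.
        String onlyLt = "<content type=\"html\">&lt;p>Hello&lt;/p></content>";
        // Both '<' and '>' escaped, as originally expected.
        String both = "<content type=\"html\">&lt;p&gt;Hello&lt;/p&gt;</content>";
        // Pre-escaped content that was escaped again by the serializer.
        String twice = "<content type=\"html\">&amp;lt;p&amp;gt;Hello</content>";

        System.out.println(rootText(onlyLt));                        // decoded markup
        System.out.println(rootText(onlyLt).equals(rootText(both))); // same text either way
        System.out.println(rootText(twice));                         // still escaped once
    }
}
```

Both of the first two forms decode to the same string, so a conforming receiver should not be able to tell them apart. The third form decodes to text that still contains entity references, which is why escaping the content by hand before handing it to the serializer does not help.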
