Re: Concerning ATOM feeds with xhtml content

James M Snell Tue, 15 Apr 2008 10:17:55 -0700

Entirely possible. If you would, can you please open a JIRA issue to track this and I'll look into it further a bit later this week.


- James


Christoph Bauer wrote:

Hi,
OK, I tried it with a JDK 1.5 and it seems to work. I still think there is something fundamentally wrong with the <org.apache.abdera.parser.stax.FOMDiv>.getInternalValue() method, though.
Consider this test:
public static void main(String[] args) throws Exception {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    XMLOutputFactory factory = XMLOutputFactory.newInstance();
    XMLStreamWriter writer = factory.createXMLStreamWriter(out);
    writer.writeStartElement("");
    // simulate <OMNode>.serialize(writer):
    writer.writeStartElement("a");
    writer.writeEndElement();
    writer.flush(); // !
    writer.writeEndElement();
    System.out.println("BEFORE:");
    System.out.println(out.toString());

    // I think you should:
    writer.flush();
    System.out.println("AFTER:");
    System.out.println(out.toString());
}

on my computer it outputs:
BEFORE:
<><a />
AFTER:
<><a /></>

Axiom's <OMNode>.serialize() calls flush() after writing.

The Abdera code does not.
In my eyes, reading the content of a ByteArrayOutputStream this way is unpredictable unless flush() is called on the writer first.
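
The flush point above can be sketched with the JDK's own StAX classes. This is a minimal illustration only, not Abdera code: the class name FlushSketch, the helper serialize(), and the element name "wrapper" are all made up for the example. What the unflushed variant returns depends on the StAX implementation in use, so only the flushed result should be relied on.

```java
import java.io.ByteArrayOutputStream;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

class FlushSketch {

    // Hypothetical helper: writes a wrapper element containing one empty
    // child and returns whatever has reached the byte stream so far.
    static String serialize(boolean flushBeforeRead) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            XMLStreamWriter writer =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
            writer.writeStartElement("wrapper");
            writer.writeEmptyElement("a");
            writer.writeEndElement();
            if (flushBeforeRead) {
                // Push any buffered output into the underlying stream
                // before its content is read.
                writer.flush();
            }
            String result = out.toString("UTF-8");
            writer.close();
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Implementation dependent: may be missing the tail of the document.
        System.out.println("without flush: " + serialize(false));
        // Complete document.
        System.out.println("with flush:    " + serialize(true));
    }
}
```

If the two lines printed differ on a given JDK, that difference is exactly the trailing-tag symptom discussed in this thread.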
The other thing is that calling substring(2) is really silly.

Why generate an entity that you take away afterwards?
Especially after I found out that OMNode can be serialized to an OutputStream.
So the following might work (untested):
  protected String getInternalValue() {
    try {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
//      XMLStreamWriter writer =
//        XMLOutputFactory.newInstance().createXMLStreamWriter(out);
//      writer.writeStartElement("");
      for (Iterator nodes = this.getChildren(); nodes.hasNext();) {
        OMNode node = (OMNode) nodes.next();
//        node.serialize(writer);
        node.serialize(out);
      }
//      writer.writeEndElement();
//      return out.toString().substring(2);
      return out.toString();
    } catch (Exception e) {}
    return "";
  }
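
The idea behind the method above, serializing only the children instead of writing a dummy wrapper and cutting it off again, can be tried without Abdera or Axiom. The following is a rough sketch using only the JDK's DOM and Transformer classes; the class name InnerXmlSketch and the helper innerXml() are made up for the illustration, and it says nothing about how Axiom's OMNode.serialize(OutputStream) behaves.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class InnerXmlSketch {

    // Hypothetical helper: returns the serialized children of the root
    // element, without the root element itself.
    static String innerXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Suppress the <?xml ...?> header that would otherwise be
            // emitted for every child.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            for (Node child = doc.getDocumentElement().getFirstChild();
                 child != null;
                 child = child.getNextSibling()) {
                // Each child is written on its own, so there is no wrapper
                // tag to strip afterwards.
                transformer.transform(new DOMSource(child), new StreamResult(out));
            }
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(innerXml("<div><h3>Header</h3><p>some text here</p></div>"));
    }
}
```

Because no wrapper is ever written, there is nothing like substring(2) to get wrong and no dangling end tag.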

g

Christoph

James M Snell wrote:
The StAX implementation has a lot to do with it. Based on the classpath, it would appear that you're using the StAX impl that ships with JDK 1.6? If so, please note that Abdera has not been tested on 1.6. I've tried both the IBM and Sun 1.5 JDKs and have not been able to duplicate your results.
- James

Christoph Bauer wrote:
Hi,

thanks for the immediate response.
I was testing inside Eclipse Europa with a Sun JDK 1.6.0_03, 64-bit Java (linux-64).
The testing routine contained only the main method.

following libs in classpath
abdera.0.3.0-incubating/abdera.client.0.3.0-incubating.jar
abdera.0.3.0-incubating/abdera.core.0.3.0-incubating.jar
abdera.0.3.0-incubating/abdera.parser.0.3.0-incubating.jar
abdera.0.3.0-incubating/abdera.protocol.0.3.0-incubating.jar
abdera.0.3.0-incubating/lib/axiom-api-1.2.5.jar
abdera.0.3.0-incubating/lib/axiom-impl-1.2.5.jar
abdera.0.3.0-incubating/lib/commons-codec-1.3.jar
abdera.0.3.0-incubating/lib/commons-httpclient-3.1-rc1.jar
abdera.0.3.0-incubating/lib/commons-logging-1.0.4.jar
abdera.0.3.0-incubating/lib/jaxen-1.1.1.jar
abdera.0.3.0-incubating/lib/abdera-i18n-0.3.0-incubating.jar
abdera.0.3.0-incubating/lib/stax-api-1.0.1.jar

I also tried the retro libs with a 32-bit 1.4 JDK (same result).
Initially I found the problem with a self-generated feed, so I tried others and found the same problem.
Concerning StAX: see the classpath, but looking at the code I don't see how that should matter when Abdera generates
<>
<h3>Header</h3>
<p>some text here</p>
</>

and then goes ahead and throws the first two characters away.




James M Snell wrote:
I am unable to duplicate the issue. What StAX implementation are you using? What platform? Does this happen with every entry or just one specific entry? Are you seeing this problem with one feed or several?
- James

Christoph Bauer wrote:
Hi Everyone,

I haven't found a bug report for this, so I thought I'd ask here.

Please consider the following snippet:

public static void main(String[] args) {
    Abdera abdera = new Abdera();
    AbderaClient client = new AbderaClient(abdera);
    ClientResponse resp = client.get("http://mail-archives.apache.org/mod_mbox/incubator-abdera-dev/?format=atom");
    if (resp.getType() == ResponseType.SUCCESS) {
        Document<Feed> doc = resp.getDocument();
        System.out.println(doc.getRoot().getEntries().iterator().next().getContent());
    } else {
    }
}


Right at the end of the content I get an empty </> XML tag.
Something like this:

<pre>My content</pre> </>
I dug through the code and found the <org.apache.abdera.parser.stax.FOMDiv>.getInternalValue() method, which I think is supposed to handle this:
  protected String getInternalValue() {
    try {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      XMLStreamWriter writer =
        XMLOutputFactory.newInstance().createXMLStreamWriter(out);
      writer.writeStartElement("");
      for (Iterator nodes = this.getChildren(); nodes.hasNext();) {
        OMNode node = (OMNode) nodes.next();
        node.serialize(writer);
      }
      writer.writeEndElement();
      return out.toString().substring(2);
    } catch (Exception e) {}
    return "";
  }
If I understand that right, Abdera is trying to remove the surrounding "div" tag. Unfortunately, the output from this method cannot be used if you need valid XHTML, or at least valid XML, because of the empty element tag.
I shudder when I see
out.toString().substring(2);

For now I decided to stick with Abdera, but handle the XHTML myself:
Element c = doc.getRoot().getEntries().iterator().next().getContentElement().getFirstChild();
try {
  StringWriter out = new StringWriter();
  Element e = c.getFirstChild();
  while (e != null) {
      e.writeTo(out);
      e = e.getNextSibling();
  }
  System.out.println(out.toString());
} catch (IOException e1) {
    // TODO Auto-generated catch block
    e1.printStackTrace();
}
It would be nice to know whether this is some kind of bug, or whether I got something wrong.
g
christoph bauer
