Some issues with Doxia

2013-02-25 Thread Corentin Groix
Hi

First, I am quite new to submitting feedback to an Apache
project, so I am not sure this is the right place, but I could not
find out how to create an account in the JIRA issue tracker.

I am using Doxia as a standalone library in my application to
generate PDF files from Markdown sources, using the Markdown and FO
modules together with Apache FOP. I don't know if this is a frequent
use case, but I found some issues:

- First, I encountered this issue:
http://jira.codehaus.org/browse/DOXIA-480 (HTML entities ignored by
the XHTML parser), but the consequences were more dramatic in my case:
AbstractXmlParser.handleEntity(parser, sink) creates a null String
and passes it to sink.text(), causing a NullPointerException in
FoSink.escaped(String, boolean) (line 1600).

Exception in thread "main" java.lang.NullPointerException
    at org.apache.maven.doxia.module.fo.FoSink.escaped(FoSink.java:1600)
    at org.apache.maven.doxia.module.fo.FoSink.content(FoSink.java:1588)
    at org.apache.maven.doxia.module.fo.FoSink.text(FoSink.java:1315)
    at org.apache.maven.doxia.module.fo.FoSink.text(FoSink.java:1321)
    at org.apache.maven.doxia.parser.AbstractXmlParser.handleEntity(AbstractXmlParser.java:390)
    at org.apache.maven.doxia.parser.AbstractXmlParser.parseXml(AbstractXmlParser.java:251)
    at org.apache.maven.doxia.parser.AbstractXmlParser.parse(AbstractXmlParser.java:141)
    at org.apache.maven.doxia.parser.XhtmlBaseParser.parse(XhtmlBaseParser.java:90)
    at org.apache.maven.doxia.module.markdown.MarkdownParser.parse(MarkdownParser.java:71)

The fix for DOXIA-480 resolves this problem, but I think a null
check in FoSink.text or FoSink.escaped would still be useful.

- A second issue: is there any reason the h1 tag is ignored by
the XHTML parser?
I worked around this by using the title() method, since my Markdown
document has no head.

- A more annoying problem: the FoSink class produces an invalid XSL-FO
document. More precisely, fo:flow and fo:page-sequence are closed
by the body_() end method, but the body() start method does not open
them. There is a protected startPageSequence() method, but it is not
called in the FoSink class, even though it is called in the
FoAggregateSink subclass.
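
To illustrate the imbalance described above, here is a minimal sketch of
the behaviour I would expect. It is not the actual FoSink code:
BalancedFoWriter, its fields, and the master-reference value "body" are
all made up for this example. The only point is that body() opens what
body_() later closes.

```java
// Hypothetical stand-in for a sink; it only models the body()/body_() pairing.
class BalancedFoWriter {
    private final StringBuilder out = new StringBuilder();
    private boolean pageSequenceOpen = false;

    // Start of the document body: open fo:page-sequence and fo:flow.
    void body() {
        startPageSequence();
    }

    // End of the document body: close only what was actually opened.
    void body_() {
        if (pageSequenceOpen) {
            out.append("</fo:flow>\n</fo:page-sequence>\n");
            pageSequenceOpen = false;
        }
    }

    // Emit a block of text; assumes the caller passes already escaped content.
    void text(String s) {
        out.append("<fo:block>").append(s).append("</fo:block>\n");
    }

    // Assumed attribute values; a real layout would define its own masters.
    private void startPageSequence() {
        if (!pageSequenceOpen) {
            out.append("<fo:page-sequence master-reference=\"body\">\n");
            out.append("<fo:flow flow-name=\"xsl-region-body\">\n");
            pageSequenceOpen = true;
        }
    }

    String result() {
        return out.toString();
    }
}
```

With this pairing, calling body(), text("hello"), body_() yields one opening
and one closing fo:page-sequence, and calling body_() alone writes nothing.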

I can provide my patches to fix these issues if necessary. I have
attached an example using the Markdown parser, FoSink and FOP that
demonstrates these issues (the first with Doxia 1.3, the others with
the latest Doxia source).

Best regards

Corentin Groix


-
To unsubscribe, e-mail: dev-unsubscr...@maven.apache.org
For additional commands, e-mail: dev-h...@maven.apache.org

Re: Some issues with Doxia

2013-02-25 Thread Lukas Theussl

Hi,

First: are you aware of the maven pdf plugin:
http://maven.apache.org/plugins/maven-pdf-plugin/

It seems that you are trying to re-write it? ;)

See further comments in-line:

On 02/25/2013 11:22 AM, Corentin Groix wrote:
 Hi
 
 First, I am quite new to submitting feedback to an Apache
 project, so I am not sure this is the right place, but I could not
 find out how to create an account in the JIRA issue tracker.
 
 I am using Doxia as a standalone library in my application to
 generate PDF files from Markdown sources, using the Markdown and FO
 modules together with Apache FOP. I don't know if this is a frequent
 use case, but I found some issues:
 
 - First, I encountered this issue:
 http://jira.codehaus.org/browse/DOXIA-480 (HTML entities ignored by
 the XHTML parser), but the consequences were more dramatic in my case:
 AbstractXmlParser.handleEntity(parser, sink) creates a null String
 and passes it to sink.text(), causing a NullPointerException in
 FoSink.escaped(String, boolean) (line 1600).
 
 Exception in thread "main" java.lang.NullPointerException
     at org.apache.maven.doxia.module.fo.FoSink.escaped(FoSink.java:1600)
     at org.apache.maven.doxia.module.fo.FoSink.content(FoSink.java:1588)
     at org.apache.maven.doxia.module.fo.FoSink.text(FoSink.java:1315)
     at org.apache.maven.doxia.module.fo.FoSink.text(FoSink.java:1321)
     at org.apache.maven.doxia.parser.AbstractXmlParser.handleEntity(AbstractXmlParser.java:390)
     at org.apache.maven.doxia.parser.AbstractXmlParser.parseXml(AbstractXmlParser.java:251)
     at org.apache.maven.doxia.parser.AbstractXmlParser.parse(AbstractXmlParser.java:141)
     at org.apache.maven.doxia.parser.XhtmlBaseParser.parse(XhtmlBaseParser.java:90)
     at org.apache.maven.doxia.module.markdown.MarkdownParser.parse(MarkdownParser.java:71)
 
 The fix for DOXIA-480 resolves this problem, but I think a null
 check in FoSink.text or FoSink.escaped would still be useful.

Unfortunately, the Sink javadocs do not specify this explicitly:

http://maven.apache.org/doxia/doxia/doxia-sink-api/apidocs/org/apache/maven/doxia/sink/Sink.html#text(java.lang.String, org.apache.maven.doxia.sink.SinkEventAttributes)

However, the description seems to imply that the text argument must be
non-null. In particular, the most frequently used Sink (XhtmlSink) does not
check for null text. I think DOXIA-480 is the correct fix for this issue.
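
For anyone who still wants a defensive guard on the sink side, a null-tolerant
escape helper could look roughly like the sketch below. This is not the real
FoSink.escaped(): the class name NullSafeEscape is made up, and both the choice
to treat null as empty text and the handling of the verbatim flag are
assumptions for illustration only.

```java
// Hypothetical helper; it mirrors only the (String, boolean) shape of the method.
class NullSafeEscape {
    static String escaped(String text, boolean verbatim) {
        if (text == null) {
            // Assumption: an unresolved entity is treated as empty text
            // instead of triggering a NullPointerException.
            return "";
        }
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                // Assumption: verbatim text keeps its spaces as non-breaking spaces.
                case ' ': out.append(verbatim ? "&#160;" : " "); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }
}
```

Whether silently dropping a null is preferable to failing fast is debatable,
since it can hide parser bugs such as the one behind DOXIA-480.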

 
 - A second issue: Is there any reason the h1 tag is ignored by
 the xhtml parser?

Unfortunately, the Doxia Sink API knows only five section levels, which the
HTML parser/sink maps to h2-h6. See
https://jira.codehaus.org/browse/DOXIA-203
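
As a rough sketch of that mapping (h2 to section level 1 up to h6 to section
level 5, with h1 left unmapped), something like the following would do. The
HeadingLevels class and its sectionLevel method are hypothetical names for
illustration, not part of Doxia.

```java
import java.util.OptionalInt;

// Hypothetical illustration of the h2-h6 to section level 1-5 mapping.
class HeadingLevels {
    // Returns the section level for h2..h6, or empty for h1 and any other tag.
    static OptionalInt sectionLevel(String tagName) {
        if (tagName == null || tagName.length() != 2) {
            return OptionalInt.empty();
        }
        char h = Character.toLowerCase(tagName.charAt(0));
        char digit = tagName.charAt(1);
        if (h != 'h' || digit < '2' || digit > '6') {
            return OptionalInt.empty();
        }
        // '2' - '1' == 1, so h2 becomes level 1 and h6 becomes level 5.
        return OptionalInt.of(digit - '1');
    }
}
```

With five levels available there is simply no slot left for h1, which is why
it falls through here.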

 I worked around this by using the title() method, since my Markdown
 document has no head.
 
 - A more annoying problem: the FoSink class produces an invalid XSL-FO
 document. More precisely, fo:flow and fo:page-sequence are closed
 by the body_() end method, but the body() start method does not open
 them. There is a protected startPageSequence() method, but it is not
 called in the FoSink class, even though it is called in the
 FoAggregateSink subclass.

I vaguely remember that this was done to work around some other issues, probably
in the APT or xdoc parser. If you change this behavior, you should make sure that
all tests pass, in particular those of Doxia Sitetools.

 
 I can provide my patches to fix these issues if necessary. I have
 attached an example using the Markdown parser, FoSink and FOP that
 demonstrates these issues (the first with Doxia 1.3, the others with
 the latest Doxia source).

The correct way to proceed would be to open a JIRA and attach your patches
there. Not sure why you can't register an account, maybe send another specific
mail to the user list, or ask on IRC.

HTH,
-Lukas


 