Hi,
I have attached an Outlook template file (Ticket_Diary.oft). I need to parse
files of this type and get the actual HTML content. I have tested with the
following code, but the parser returns <p> tags instead of the <table> or <div>
tags. How do I bypass the SAFE_ELEMENTS map?
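import java.io.FileInputStream;
import java.io.InputStream;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.parser.html.HtmlMapper;
import org.apache.tika.parser.html.IdentityHtmlMapper;
import org.apache.tika.parser.microsoft.OfficeParser;

String msgfile = "/home/test/Desktop/EmailParse/Ticket Diary.oft";
InputStream stream = new FileInputStream(msgfile);
StringWriter sw = new StringWriter();
Parser parser = new OfficeParser();
Metadata metadata = new Metadata();
ParseContext context = new ParseContext();
// Ask for HTML elements to be passed through unmapped
context.set(HtmlMapper.class, IdentityHtmlMapper.INSTANCE);
// Serialize the SAX events emitted by the parser as HTML text
SAXTransformerFactory factory =
        (SAXTransformerFactory) SAXTransformerFactory.newInstance();
TransformerHandler handler = factory.newTransformerHandler();
handler.getTransformer().setOutputProperty(OutputKeys.METHOD, "html");
handler.getTransformer().setOutputProperty(OutputKeys.INDENT, "no");
handler.setResult(new StreamResult(sw));
try {
    parser.parse(stream, handler, metadata, context);
} finally {
    stream.close();
}
String content = sw.toString();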
Output example:
===============
Ticket Diary
<dl/>
<div class="message-body"><p> _____
</p>
<p> </p>
<p>High Priority Notification</p>
I have also attached a screenshot (TestTicket.jpg) showing how the .oft file
looks when it is opened in Outlook. I need to get the full content, including
the tables and style tags. Can someone help me with this?
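
To narrow down where the <table> tags are lost, here is a minimal sketch that isolates the TransformerHandler part of the code in this post. It assumes only the JDK's built-in javax.xml.transform classes (no Tika or POI on the classpath), and the class name SaxToHtmlSketch and the hand-written SAX events are hypothetical, made up for illustration. It feeds <table>, <tr> and <td> events straight into a handler configured with the same output properties. If those tags come out in the returned string, the serializer itself is probably not what reduces the output to <p> tags, and the choice of elements is made earlier, by whatever produces the SAX events.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical class name; uses only JDK classes, no Tika or POI.
final class SaxToHtmlSketch {

    // Feeds hand-written <table> SAX events into a TransformerHandler
    // configured like the one in the post and returns the serialized text.
    static String serializeTable() {
        try {
            StringWriter sw = new StringWriter();
            SAXTransformerFactory factory =
                    (SAXTransformerFactory) SAXTransformerFactory.newInstance();
            TransformerHandler handler = factory.newTransformerHandler();
            handler.getTransformer().setOutputProperty(OutputKeys.METHOD, "html");
            handler.getTransformer().setOutputProperty(OutputKeys.INDENT, "no");
            handler.setResult(new StreamResult(sw));

            AttributesImpl noAttrs = new AttributesImpl();
            char[] text = "High Priority Notification".toCharArray();

            // These events stand in for what a parser would emit.
            handler.startDocument();
            handler.startElement("", "table", "table", noAttrs);
            handler.startElement("", "tr", "tr", noAttrs);
            handler.startElement("", "td", "td", noAttrs);
            handler.characters(text, 0, text.length);
            handler.endElement("", "td", "td");
            handler.endElement("", "tr", "tr");
            handler.endElement("", "table", "table");
            handler.endDocument();
            return sw.toString();
        } catch (TransformerConfigurationException | SAXException e) {
            // Rethrow unchecked so callers need no throws clause.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serializeTable());
    }
}
```

Running main prints the serialized string, which can then be compared with the output example in this post.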
http://apache-poi.1045710.n5.nabble.com/file/n5643786/Ticket_Diary.oft
Ticket_Diary.oft
http://apache-poi.1045710.n5.nabble.com/file/n5643786/TestTicket.jpg
TestTicket.jpg