I am not at all sure that I understand your question, as I am not familiar with nutch/hadoop. So, I am going to guess that you want to use POI to extract the text of a paragraph or paragraphs from a Word document or documents and then present that text to the user in another application, still formatted as far as the layout of each paragraph is concerned. If this is correct then I think you will have to determine where Word would wrap the text and insert the hard line breaks yourself.
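As a rough illustration of inserting the hard line breaks yourself, here is a minimal sketch of a greedy word wrap in plain Java. It is only an approximation: it assumes a fixed maximum number of characters per line (the hypothetical maxColumns parameter), whereas Word lays text out using the page width, the font metrics and other factors, so the breaks will not match Word's exactly. The class and method names are made up for the example, and the paragraph text is assumed to have been extracted already, for instance with HWPF's WordExtractor.getParagraphText().

```java
import java.util.ArrayList;
import java.util.List;

public class ParagraphWrapper {

    /**
     * Greedy word wrap: splits the paragraph on whitespace and starts a new
     * line whenever adding the next word would exceed maxColumns characters.
     * A single word longer than maxColumns is left on a line of its own.
     */
    public static String wrap(String paragraph, int maxColumns) {
        String trimmed = paragraph.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : trimmed.split("\\s+")) {
            // +1 accounts for the space that would precede the word
            if (line.length() > 0 && line.length() + 1 + word.length() > maxColumns) {
                lines.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        lines.add(line.toString());
        return String.join("\n", lines);
    }

    public static void main(String[] args) {
        // Stand-in for one paragraph of extracted text (assumption for the demo)
        String paragraph = "Word stores a paragraph as one long run of text and "
                + "decides where to break the lines only when it lays out the page.";
        System.out.println(wrap(paragraph, 40));
    }
}
```

The column limit is the thing to tune. A character count is a reasonable stand-in if the text will be shown in a monospaced font; for proportional fonts you would need to measure the rendered width of each word instead.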
The problem you are facing is that Word does not add end-of-line characters to the text of a paragraph to wrap lines. Instead, the application determines where each line should be broken based on the width of the page, among other factors. Therefore, you will need to decide where each line should be broken and insert your own line breaks if you want to emulate the look and feel of MS Word.

Yours

Mark B

JohnRodey wrote:
>
> Using POI for word documents with nutch/hadoop is there a way to force the
> plugin to add an eol character where word would typically do wrapping? Or
> would I have to rewrite the plugin and add custom logic to add the eol
> character? Currently the plugin will print a paragraph on one really long
> line.
>
