I am not entirely sure that I understand your question, as I am not familiar
with nutch/hadoop. So, I am going to guess that you want to use POI to extract
the text of a paragraph or paragraphs from a Word document or documents and
then present that text to the user in another application, still laid out - from
the perspective of a paragraph's line breaks - as Word would show it. If this
is correct then I think you will have to determine where Word would wrap the
text and insert the hard line breaks yourself.

The problem you are facing is that Word does not add end-of-line characters
to the text of a paragraph to wrap lines. Instead, the application determines
where each line should be broken when it lays out the page, based upon the
width of the page amongst other factors such as the margins and the font.
Therefore, you will need to decide where each line should be broken and insert
your own line breaks if you want to emulate the look and feel of MS Word.
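
To give an idea of what "insert your own line breaks" might look like, here
is a minimal sketch in plain Java. It is not part of POI and it does not
reproduce Word's layout; it assumes you already have the paragraph text as a
String (however your plugin obtains it from POI) and that a fixed maximum
number of characters per line is an acceptable stand-in for the page width.
The class name ParagraphWrapper and the maxWidth parameter are my own
inventions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ParagraphWrapper {

    /**
     * Greedy word wrap: breaks the paragraph at whitespace so that no line
     * is longer than maxWidth characters. A single word longer than maxWidth
     * is left on a line of its own rather than being split.
     *
     * Assumption: a fixed character count approximates the page width.
     * Word itself uses font metrics, margins and so on, so the breaks
     * produced here will not match Word's exactly.
     */
    public static String wrap(String paragraph, int maxWidth) {
        if (maxWidth < 1) {
            throw new IllegalArgumentException("maxWidth must be at least 1");
        }
        String trimmed = paragraph.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : trimmed.split("\\s+")) {
            // Start a new line if adding this word (plus a space) would overflow.
            if (line.length() > 0 && line.length() + 1 + word.length() > maxWidth) {
                lines.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        lines.add(line.toString());
        return String.join("\n", lines);
    }
}
```

You would call something like ParagraphWrapper.wrap(paragraphText, 80) on
each paragraph before handing the text on. If you need the breaks to fall
where Word would actually put them, you would have to measure the text using
the document's fonts and page settings, which is a good deal more work.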

Yours

Mark B


JohnRodey wrote:
> 
> Using POI for word documents with nutch/hadoop is there a way to force the
> plugin to add an eol character where word would typically do wrapping?  Or
> would I have to rewrite the plugin and add custom logic to add the eol
> character?  Currently the plugin will print a paragraph on one really long
> line.
> 

-- 
View this message in context: 
http://old.nabble.com/Preserving-word-wrap-tp28029360p28031229.html
Sent from the POI - User mailing list archive at Nabble.com.

