Jakub,

Try the nightly snapshots from here: http://encore.torchbox.com/poi-cvs-build/
--
Best regards,
Sergey

On Mon, Aug 22, 2011 at 6:18 PM, Jakub Liska <[email protected]> wrote:
> Hey,
>
> I create the doc like this:
>
> import java.io.File;
> import java.io.FileOutputStream;
>
> import org.apache.poi.hwpf.HWPFDocument;
> import org.apache.poi.hwpf.usermodel.CharacterProperties;
> import org.apache.poi.hwpf.usermodel.CharacterRun;
> import org.apache.poi.hwpf.usermodel.Paragraph;
> import org.apache.poi.hwpf.usermodel.Range;
> import org.apache.poi.poifs.filesystem.POIFSFileSystem;
>
> private void createDOCDocument(String from, File file) throws Exception {
>     // Load the template document from the classpath.
>     POIFSFileSystem fs = new POIFSFileSystem(
>             DOCGenerator.class.getResourceAsStream("/poi/template.doc"));
>     HWPFDocument doc = new HWPFDocument(fs);
>     Range range = doc.getRange();
>     Paragraph par1 = range.getParagraph(0);
>     // Insert the whole text before the first paragraph of the template.
>     CharacterRun run1 = par1.insertBefore(from, new CharacterProperties());
>     run1.setFontSize(11);
>     // Write the result and always close the output stream.
>     FileOutputStream out = new FileOutputStream(file);
>     try {
>         doc.write(out);
>     } finally {
>         out.close();
>     }
> }
>
> The "from" string has 33,000 characters, but the first 1026 characters
> appear smaller (as you can see in the attached picture), and Apache Tika
> is able to extract only these 1026 characters from such a document.
>
> Any idea what is wrong with it? Btw, is there a nightly/snapshot Maven
> artifact somewhere? I heard some work was done recently as far as the
> horrible format is concerned.
>
> Best regards,
> Jakub
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>

--
Sergey Vladimirov
