https://bz.apache.org/bugzilla/show_bug.cgi?id=66197
--- Comment #3 from earl <[email protected]> ---
The above error occurred during command-line execution. In practice we use the Tika parser (OfficeParser in this case) to parse documents. While parsing a .doc file of around 460 MB with a heap size of around 1024 MB, an OutOfMemoryError occurred. I'll attach that stack trace too.

Stack trace:
  at java.lang.OutOfMemoryError.<init>()V (OutOfMemoryError.java:48)
  at java.util.Arrays.copyOf([BI)[B (Arrays.java:3236)
  at java.io.ByteArrayOutputStream.toByteArray()[B (ByteArrayOutputStream.java:191)
  at org.apache.poi.util.IOUtils.toByteArray(Ljava/io/InputStream;JI)[B (IOUtils.java:199)
  at org.apache.poi.util.IOUtils.toByteArray(Ljava/io/InputStream;I)[B (IOUtils.java:149)
  at org.apache.poi.hwpf.HWPFDocumentCore.getDocumentEntryBytes(Ljava/lang/String;II)[B (HWPFDocumentCore.java:331)
  at org.apache.poi.hwpf.HWPFDocumentCore.<init>(Lorg/apache/poi/poifs/filesystem/DirectoryNode;)V (HWPFDocumentCore.java:169)
  at org.apache.poi.hwpf.HWPFDocument.<init>(Lorg/apache/poi/poifs/filesystem/DirectoryNode;)V (HWPFDocument.java:193)
  at org.apache.tika.parser.microsoft.WordExtractor.parse(Lorg/apache/poi/poifs/filesystem/DirectoryNode;Lorg/apache/tika/sax/XHTMLContentHandler;)V (WordExtractor.java:152)
  at org.apache.tika.parser.microsoft.OfficeParser.parse(Lorg/apache/poi/poifs/filesystem/DirectoryNode;Lorg/apache/tika/parser/ParseContext;Lorg/apache/tika/metadata/Metadata;Lorg/apache/tika/sax/XHTMLContentHandler;)V (OfficeParser.java:216)
  at org.apache.tika.parser.microsoft.OfficeParser.parse(Ljava/io/InputStream;Lorg/xml/sax/ContentHandler;Lorg/apache/tika/metadata/Metadata;Lorg/apache/tika/parser/ParseContext;)V (OfficeParser.java:173)
  at org.apache.tika.parser.CompositeParser.parse(Ljava/io/InputStream;Lorg/xml/sax/ContentHandler;Lorg/apache/tika/metadata/Metadata;Lorg/apache/tika/parser/ParseContext;)V (CompositeParser.java:289)
  at org.apache.tika.parser.CompositeParser.parse(Ljava/io/InputStream;Lorg/xml/sax/ContentHandler;Lorg/apache/tika/metadata/Metadata;Lorg/apache/tika/parser/ParseContext;)V (CompositeParser.java:289)
  at org.apache.tika.parser.AutoDetectParser.parse(Ljava/io/InputStream;Lorg/xml/sax/ContentHandler;Lorg/apache/tika/metadata/Metadata;Lorg/apache/tika/parser/ParseContext;)V (AutoDetectParser.java:150)

In the dominator tree, the thread that occupies the most memory contains a byte array (size=173960244) with the following data:

.............................s!...bjbjS)S)......................4l^.1C.g1C.g.k!.......................................................................................................................................................................8...L...,...x0..................R...4.......4.......4.......4.......4.......#.......#.......#...........................................................$...Y...........X...................................#.......................#.......#.......#.......#.......................................4...............4...............C.......C.......C.......#...............4...............4.......................C.......................................................#.......................C.......C...........t...........................................................................d.......4..................FQ...................3.......T...............r...........0...........\.......g.......C.......g.......d.....................................................................

I'm sorry I didn't ask the question clearly initially.
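
The top frames of the trace (Arrays.copyOf called from ByteArrayOutputStream.toByteArray) suggest why the heap runs out here: toByteArray() returns a copy of the stream's internal buffer, so both the buffer and the copy are live at the same moment. The following is only a rough, standard-library sketch of that behaviour, not POI or Tika code. The class name ToByteArrayCopyDemo and the helper minPeakBytes are made up for illustration, and the 2x figure is a lower bound that ignores everything else on the heap.

```java
import java.io.ByteArrayOutputStream;

public class ToByteArrayCopyDemo {

    // Hypothetical helper: lower bound on the bytes that must be live at once
    // while toByteArray() runs. The internal buffer holds at least payloadBytes
    // and the returned copy holds exactly payloadBytes.
    static long minPeakBytes(long payloadBytes) {
        return 2L * payloadBytes;
    }

    // Shows that toByteArray() hands back an independent copy of the buffer.
    static boolean returnsIndependentCopy() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        for (int i = 0; i < 256; i++) {
            out.write(chunk, 0, chunk.length);
        }
        byte[] copy = out.toByteArray();
        copy[0] = 1; // mutate the copy only
        // A second call copies the internal buffer again; it still starts with 0.
        return copy.length == 4096 * 256 && out.toByteArray()[0] == 0;
    }

    public static void main(String[] args) {
        System.out.println(returnsIndependentCopy());       // true
        // Size of the byte array reported in the dominator tree above.
        System.out.println(minPeakBytes(173_960_244L));     // 347920488
    }
}
```

Under these assumptions, materialising the 173960244-byte array alone needs roughly 348 MB of contiguous-array headroom at that moment, before counting whatever else the parser already holds in the 1024 MB heap.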
