[jira] [Commented] (TIKA-1474) PackageParser leaves 7zip Temp Files behind
[ https://issues.apache.org/jira/browse/TIKA-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15509492#comment-15509492 ]

Redfred Garett commented on TIKA-1474:
--------------------------------------

Great Kathrine Colyn,

7-Zip doesn't know the folder path of the drop target. Only Windows Explorer knows the exact drop target, and Windows Explorer needs the files (the drag source) as decompressed files on disk. So 7-Zip extracts the files from the archive to a temp folder and then notifies Windows Explorer of the paths of these temp files. Windows Explorer then copies these files to the drop target folder.

To avoid temp file usage, you can use the Extract command of 7-Zip or drag-and-drop from 7-Zip to 7-Zip.

Thanks !!!
[https://play.google.com/store/apps/details?id=com.redgage.RedGage&hl=en Redgage]

> PackageParser leaves 7zip Temp Files behind
> -------------------------------------------
>
>                 Key: TIKA-1474
>                 URL: https://issues.apache.org/jira/browse/TIKA-1474
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Fabian Lange
>
> If I put a 7z input stream into the Tika parser, Tika will make a temp file in PackageParser:
> {code}
> ArchiveInputStream ais;
> try {
>     ArchiveStreamFactory factory = context.get(
>             ArchiveStreamFactory.class, new ArchiveStreamFactory());
>     ais = factory.createArchiveInputStream(stream);
> } catch (StreamingNotSupportedException sne) {
>     // Most archive formats work on streams, but a few need files
>     if (sne.getFormat().equals(ArchiveStreamFactory.SEVEN_Z)) {
>         // Rework as a file, and wrap
>         stream.reset();
>         TikaInputStream tstream = TikaInputStream.get(stream);
>
>         // Pending a fix for COMPRESS-269, this bit is a little nasty
>         ais = new SevenZWrapper(new SevenZFile(tstream.getFile()));
>     } else {
>         throw new TikaException("Unknown non-streaming format " +
>                 sne.getFormat(), sne);
>     }
> } catch (ArchiveException e) {
>     throw new TikaException("Unable to unpack document stream", e);
> }
> {code}
> tstream.getFile() will then internally make a new temp file:
> {code}
> // Spool the entire stream into a temporary file
> file = tmp.createTemporaryFile();
> OutputStream out = new FileOutputStream(file);
> {code}
> This file is not deleted because SevenZWrapper does not close the SevenZFile.
> This can be fixed by implementing the following close method in SevenZWrapper:
> {code}
> public void close() throws IOException {
>     try {
>         file.close();
>     } finally {
>         super.close();
>     }
> }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
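
The leak and the proposed fix described in the report can be illustrated with a minimal, self-contained sketch. It does not use Tika or Commons Compress: `FileBackedArchive` and `ArchiveWrapper` are hypothetical stand-ins for `SevenZFile` and `SevenZWrapper`, and `spoolReadAndCleanUp` is an invented helper that mimics spooling a stream to a temporary file. The only point it makes is that the wrapper's `close()` has to release the underlying file handle before the temporary file can be reliably removed.

```java
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;

public class WrapperCloseSketch {

    /** Hypothetical stand-in for SevenZFile: keeps an open handle on a file. */
    static final class FileBackedArchive implements Closeable {
        private final RandomAccessFile handle;
        private boolean open = true;

        FileBackedArchive(File file) throws IOException {
            this.handle = new RandomAccessFile(file, "r");
        }

        int read() throws IOException {
            return handle.read();
        }

        boolean isOpen() {
            return open;
        }

        @Override
        public void close() throws IOException {
            open = false;
            handle.close();
        }
    }

    /** Hypothetical stand-in for SevenZWrapper: adapts the archive to InputStream. */
    static final class ArchiveWrapper extends InputStream {
        private final FileBackedArchive file;

        ArchiveWrapper(FileBackedArchive file) {
            this.file = file;
        }

        @Override
        public int read() throws IOException {
            return file.read();
        }

        // Same shape as the close method proposed in the report:
        // release the file-backed resource, then run the superclass close.
        @Override
        public void close() throws IOException {
            try {
                file.close();
            } finally {
                super.close();
            }
        }
    }

    /**
     * Spools the bytes into a temporary file, reads them back through the
     * wrapper, closes the wrapper, and deletes the temporary file.
     * Returns true if the handle was released and the file is gone.
     */
    static boolean spoolReadAndCleanUp(byte[] data) {
        try {
            File temp = File.createTempFile("sketch-", ".bin");
            Files.write(temp.toPath(), data);

            FileBackedArchive archive = new FileBackedArchive(temp);
            try (InputStream in = new ArchiveWrapper(archive)) {
                while (in.read() != -1) {
                    // drain the stream
                }
            }
            // close() has released the handle, so the delete is not blocked by it
            return !archive.isOpen() && temp.delete() && !temp.exists();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(spoolReadAndCleanUp(new byte[] {1, 2, 3}));
    }
}
```

Without the `close()` override, `ArchiveWrapper` would inherit the no-op `InputStream.close()`, the handle would stay open, and on platforms that lock open files the delete would fail. That is the situation the report describes for `SevenZWrapper`.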
[jira] [Commented] (TIKA-1474) PackageParser leaves 7zip Temp Files behind
[ https://issues.apache.org/jira/browse/TIKA-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14330082#comment-14330082 ]

Kathrine Colyn commented on TIKA-1474:
--------------------------------------

If I put a 7z input stream into tika parser, tika will make a temp file in PackageParser
{code}
ArchiveInputStream ais;
try {
    ArchiveStreamFactory factory = context.get(
            ArchiveStreamFactory.class, new ArchiveStreamFactory());
    ais = factory.createArchiveInputStream(stream);
} catch (StreamingNotSupportedException sne) {
    // Most archive formats work on streams, but a few need files
    if (sne.getFormat().equals(ArchiveStreamFactory.SEVEN_Z)) {
        // Rework as a file, and wrap
        stream.reset();
        TikaInputStream tstream = TikaInputStream.get(stream);
{code}

Thanks !
http://www.fixithere.net
[jira] [Commented] (TIKA-1474) PackageParser leaves 7zip Temp Files behind
[ https://issues.apache.org/jira/browse/TIKA-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213603#comment-14213603 ]

Fabian Lange commented on TIKA-1474:
------------------------------------

Indeed it is, thanks [~lfcnassif]. While I think the patch over at 1411 might be more complex than it needed to be, it should work.
[jira] [Commented] (TIKA-1474) PackageParser leaves 7zip Temp Files behind
[ https://issues.apache.org/jira/browse/TIKA-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14211875#comment-14211875 ]

blog commented on TIKA-1474:
----------------------------

// Spool the entire stream into a temporary file
file = tmp.createTemporaryFile();
OutputStream out = new FileOutputStream(file);

thanks
http://www.contacttelephonenumbers.com/
[jira] [Commented] (TIKA-1474) PackageParser leaves 7zip Temp Files behind
[ https://issues.apache.org/jira/browse/TIKA-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209983#comment-14209983 ]

Luis Filipe Nassif commented on TIKA-1474:
------------------------------------------

I think it was resolved in trunk by TIKA-1411.