I started trying to implement this, but obviously it's a bit more complex
than a 10-minute job and requires knowledge of the design.
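
For what it's worth, here is a rough standalone sketch of the bookkeeping I
have in mind: a temp-file tracker that also remembers every stream opened on
its files, so the streams can be closed before the files are deleted. The
class TrackedTemporaryFiles and all of its methods are hypothetical names for
illustration. None of this is existing Tika API, and closeStreams() in my
snippet below is likewise something that would have to be added.

```java
import java.io.Closeable;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of Tika): tracks temp files and the streams opened on them. */
public class TrackedTemporaryFiles {

    // Maps each tracked file to the streams opened on it (the "file to stream" direction).
    private final Map<File, List<Closeable>> streamsByFile =
            new LinkedHashMap<File, List<Closeable>>();

    /** Creates a temporary file and starts tracking it. */
    public File createTemporaryFile() throws IOException {
        File file = File.createTempFile("sketch-", ".tmp");
        streamsByFile.put(file, new ArrayList<Closeable>());
        return file;
    }

    /** Opens a stream on a tracked file and remembers it for later cleanup. */
    public InputStream open(File file) throws IOException {
        List<Closeable> streams = streamsByFile.get(file);
        if (streams == null) {
            throw new IllegalArgumentException("Not a tracked file: " + file);
        }
        InputStream in = new FileInputStream(file);
        streams.add(in);
        return in;
    }

    /** Closes every stream that was opened through open(). */
    public void closeStreams() {
        for (List<Closeable> streams : streamsByFile.values()) {
            for (Closeable stream : streams) {
                try {
                    stream.close();
                } catch (IOException ignored) {
                    // Best effort: keep closing the remaining streams.
                }
            }
            streams.clear();
        }
    }

    /** Closes any remaining streams first, then deletes the tracked files. */
    public void dispose() {
        closeStreams();
        for (File file : streamsByFile.keySet()) {
            file.delete();
        }
        streamsByFile.clear();
    }
}
```

The point of closing before deleting is that an open stream can keep the file
from going away (on Windows the delete simply fails). How this would fit into
the real TemporaryFiles and TikaInputStream is exactly the design knowledge I
am missing.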

Mark

On Tue, Aug 30, 2011 at 11:27 PM, Mark Kerzner <[email protected]>wrote:

> Guys,
>
> the errors show up again. I already thanked everybody
> <http://shmsoft.blogspot.com/2011/08/freeeed-processing-is-stable.html>!
> I wonder how I can make good on this :)
>
> I think that in ParserContainerExtractor.parse you need to associate
> TikaInputStream with temporary files both ways: from the stream you can
> already find the file, but you should also be able to find the stream from
> the file.
> Then, when you are deleting the file, you can also close the associated
> stream. Something like
>
> public void parse(
>                 InputStream stream, ContentHandler ignored,
>                 Metadata metadata, ParseContext context)
>                 throws IOException, SAXException, TikaException {
>             TemporaryFiles tmp = new TemporaryFiles();
>             try {
>                 TikaInputStream tis = TikaInputStream.get(stream, tmp);
>
>                 // Figure out what we have to process
>                 String filename = metadata.get(Metadata.RESOURCE_NAME_KEY);
>                 MediaType type = detector.detect(tis, metadata);
>
>                 if (extractor == null) {
>                     // Let the handler process the embedded resource
>                     handler.handle(filename, type, tis);
>                 } else {
>                     // Use a temporary file to process the stream twice
>                     File file = tis.getFile();
>
>                     // Let the handler process the embedded resource
>                     handler.handle(filename, type,
>                             TikaInputStream.get(file));
>
>                     // Recurse
>                     extractor.extract(tis, extractor, handler);
>                 }
>             } finally {
>                 tmp.closeStreams();
>                 tmp.dispose();
>
>             }
>         }
>
> Thank you,
> Mark
>
>
> On Tue, Aug 30, 2011 at 9:43 PM, Mark Kerzner <[email protected]>wrote:
>
>> Okay,
>>
>> the error was there because of Java 7. I heard about some weird Java 7
>> error and Lucene. Back to Java 6, and everything works fine: builds,
>> extracts, closes files.
>>
>> Thank you,
>> Mark
>>
>>
>> On Tue, Aug 30, 2011 at 9:10 PM, Mark Kerzner <[email protected]>wrote:
>>
>>> Well,
>>>
>>> that error WAS important. It compiles and pretends to work, but does not
>>> extract any text or metadata (that's why it is so fast!).
>>>
>>> Thank you,
>>> Mark
>>>
>>>
>>> On Tue, Aug 30, 2011 at 7:20 PM, Mark Kerzner <[email protected]>wrote:
>>>
>>>> I do get an error in the build, but it creates the core snapshot jar
>>>> anyway. Should I be concerned?
>>>>
>>>> Thank you,
>>>> Mark
>>>>
>>>> [INFO] -------------------------------------------------------------
>>>> [ERROR] COMPILATION ERROR :
>>>> [INFO] -------------------------------------------------------------
>>>> [ERROR]
>>>> /home/mark/ThirdParty/tika-source/tika-site/tika-parsers/src/main/java/org/apache/tika/parser/image/ImageMetadataExtractor.java:[89,34]
>>>> error: cannot access JPEGDecodeParam
>>>> [INFO] 1 error
>>>> [INFO] -------------------------------------------------------------
>>>> [INFO]
>>>> ------------------------------------------------------------------------
>>>> [INFO] Reactor Summary:
>>>> [INFO]
>>>> [INFO] Apache Tika parent ................................ SUCCESS
>>>> [32.118s]
>>>> [INFO] Apache Tika core .................................. SUCCESS
>>>> [15.994s]
>>>> [INFO] Apache Tika parsers ............................... FAILURE
>>>> [57.498s]
>>>> [INFO] Apache Tika application ........................... SKIPPED
>>>> [INFO] Apache Tika OSGi bundle ........................... SKIPPED
>>>> [INFO] Apache Tika ....................................... SKIPPED
>>>> [INFO]
>>>> ------------------------------------------------------------------------
>>>> [INFO] BUILD FAILURE
>>>> [INFO]
>>>> ------------------------------------------------------------------------
>>>> [INFO] Total time: 2:23.922s
>>>> [INFO] Finished at: Tue Aug 30 18:52:20 CDT 2011
>>>> [INFO] Final Memory: 28M/156M
>>>> [INFO]
>>>> ------------------------------------------------------------------------
>>>> [ERROR] Failed to execute goal
>>>> org.apache.maven.plugins:maven-compiler-plugin:2.3.2:compile
>>>> (default-compile) on project tika-parsers: Compilation failure
>>>> [ERROR]
>>>> /home/mark/ThirdParty/tika-source/tika-site/tika-parsers/src/main/java/org/apache/tika/parser/image/ImageMetadataExtractor.java:[89,34]
>>>> error: cannot access JPEGDecodeParam
>>>> [ERROR] -> [Help 1]
>>>> [ERROR]
>>>> [ERROR] To see the full stack trace of the errors, re-run Maven with the
>>>> -e switch.
>>>> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
>>>> [ERROR]
>>>> [ERROR] For more information about the errors and possible solutions,
>>>> please read the following articles:
>>>> [ERROR] [Help 1]
>>>> http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
>>>> [ERROR]
>>>> [ERROR] After correcting the problems, you can resume the build with the
>>>> command
>>>> [ERROR]   mvn <goals> -rf :tika-parsers
>>>>
>>>>
>>>> On Tue, Aug 30, 2011 at 7:08 PM, Mark Kerzner <[email protected]>wrote:
>>>>
>>>>> SUCCESS!!!!
>>>>>
>>>>> Nick, not only does it close all files, but it also seems to run much
>>>>> faster (I mean, in the debugger; real performance may vary :)
>>>>>
>>>>> Thank you everybody for today's productive discussion and help.
>>>>>
>>>>> Mark
>>>>>
>>>>> PS. If anyone ever gets sued, they should use FreeEed for eDiscovery
>>>>> and come back a winner!
>>>>>
>>>>>
>>>>> On Tue, Aug 30, 2011 at 5:25 PM, Nick Burch 
>>>>> <[email protected]>wrote:
>>>>>
>>>>>> On Tue, 30 Aug 2011, Mark Kerzner wrote:
>>>>>>
>>>>>>> For the time being, is there a workaround that I could use? Right now,
>>>>>>> this is a show-stopper for my application
>>>>>>>
>>>>>>
>>>>>> Any chance you could do a svn checkout, build, and try with that?
>>>>>> After my last email, I have a nagging feeling about the timing of making
>>>>>> NPOIFS implement Closeable... I upgraded the POI dependency earlier today,
>>>>>> so it's worth checking with
>>>>>>
>>>>>> Nick
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
