[ https://issues.apache.org/jira/browse/TIKA-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison resolved TIKA-2604.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 2.0.0
                   1.18

[~goodmansasha] thank you for opening this issue.  I was able to replicate this 
on Linux, and I _think_ I fixed it.  If you could grab the app.jar from the 
most recent successful build and give it a try, that'd be great.  Tika 1.18's 
release cycle is just around the corner, and this is important.

> Error with certain jar paths on OS X
> ------------------------------------
>
>                 Key: TIKA-2604
>                 URL: https://issues.apache.org/jira/browse/TIKA-2604
>             Project: Tika
>          Issue Type: Bug
>          Components: cli
>    Affects Versions: 1.17
>         Environment: tika-app-1.17.jar, OS X 10.13.3. 
>  
>            Reporter: Sasha Goodman
>            Assignee: Tim Allison
>            Priority: Major
>             Fix For: 1.18, 2.0.0
>
>
> I've been developing an R interface to the Tika batch processor for the past 
> month ( see: [https://github.com/predict-r/rtika] ), and this software is 
> awesome. I use the command line to call the batch processor, and my code has 
> worked on Ubuntu, Windows 10 and OS X. Several people have been testing my 
> code as well. It's been working.
> A few days ago I found an issue with the batch processor on OS X. 
> When calling the batch processor with the tika-app-1.17.jar on a path with 
> spaces in it, Tika starts to continually restart.
> Here is an example of calling the jar *when the path has spaces.* It 
> *produces this error, and the unexpected restarts*: 
> {code:java}
> java -Djava.awt.headless=true -jar '/Users/sasha/Downloads/space folder/tika-app.jar' -maxRestarts 1 -t -i '/' -o '/var/folders/nr/74rgb64s3n98yccxwbv6vsxw0000gn/T/Rtmp9VEJvX/rtika_dircf81200b313e' -fileList '/var/folders/nr/74rgb64s3n98yccxwbv6vsxw0000gn/T/Rtmp9VEJvX/rtika_filecf81530d27ee'
> INFO about to start driver
> INFO BatchProcess: Error: Could not find or load main class org.apache.tika.batch.fs.FSBatchProcessCLI
> INFO BatchProcess: Caused by: java.lang.ClassNotFoundException: org.apache.tika.batch.fs.FSBatchProcessCLI
> INFO The child process has finished with an exit value of: 1
> WARN Restarting on unexpected restart code: 1
> WARN Must restart process (exitValue=1 numRestarts=0 receivedRestartMessage=false)
> INFO BatchProcess: Error: Could not find or load main class org.apache.tika.batch.fs.FSBatchProcessCLI
> INFO BatchProcess: Caused by: java.lang.ClassNotFoundException: org.apache.tika.batch.fs.FSBatchProcessCLI
> INFO The child process has finished with an exit value of: 1
> WARN Restarting on unexpected restart code: 1
> WARN Hit the maximum number of process restarts. Driver is shutting down now.
> INFO Process driver has completed
> {code}
> The error ALSO occurs with double quotes around the jar path.
> *Now, in contrast,* calling the jar when the *path does not have spaces 
> produces absolutely NO error*:
> {code:java}
> java -Djava.awt.headless=true -jar '/Users/sasha/Downloads/tika-app.jar' -maxRestarts 1 -t -i '/' -o '/var/folders/nr/74rgb64s3n98yccxwbv6vsxw0000gn/T/Rtmp9VEJvX/rtika_dircf81200b313e' -fileList '/var/folders/nr/74rgb64s3n98yccxwbv6vsxw0000gn/T/Rtmp9VEJvX/rtika_filecf81530d27ee'
> INFO about to start driver
> INFO BatchProcess: log4j:WARN No appenders could be found for logger (org.apache.tika.batch.fs.FSBatchProcessCLI).
> INFO BatchProcess: log4j:WARN Please initialize the log4j system properly.
> INFO BatchProcess: log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> INFO BatchProcess: Mar 09, 2018 12:19:17 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
> INFO BatchProcess: WARNING: JBIG2ImageReader not loaded. jbig2 files will be ignored
> INFO BatchProcess: See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> INFO BatchProcess: for optional dependencies.
> INFO BatchProcess: TIFFImageWriter not loaded. tiff files will not be processed
> INFO BatchProcess: See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> INFO BatchProcess: for optional dependencies.
> INFO BatchProcess: J2KImageReader not loaded. JPEG2000 files will not be processed.
> INFO BatchProcess: See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> INFO BatchProcess: for optional dependencies.
> INFO BatchProcess:
> INFO BatchProcess: Mar 09, 2018 12:19:17 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
> INFO BatchProcess: WARNING: org.xerial's sqlite-jdbc is not loaded.
> INFO BatchProcess: Please provide the jar on your classpath to parse sqlite files.
> INFO BatchProcess: See tika-parsers/pom.xml for the correct version.
> INFO BatchProcess: randomCrawl attribute is ignored by FSListCrawler
> BatchProcess:Main thread in TikaFSBatchCLI has finished processing.
> BatchProcess:
> BatchProcess:
> BatchProcess:ParallelFileProcessingResult{considered=1, added=1, consumed=1, numberHandledExceptions=0, secondsElapsed=0.853, exitStatus=0, causeForTermination='COMPLETED_NORMALLY'}
> INFO The child process has finished with an exit value of: 0
> INFO Process driver has completed
> {code}
> Further, and what makes this a batch processor issue, the path with the 
> space in it produces absolutely *NO error in the normal Tika CLI mode*:  
> {code:java}
> java -jar '/Users/sasha/Downloads/space folder/tika-app.jar' -t /Library/Frameworks/R.framework/Versions/3.4/Resources/library/rtika/extdata/jsonlite.pdf
> {code}
> The last two examples work, but the first does not. 
> The only difference is that the first calls the batch processor, which then 
> restarts repeatedly regardless of the input file.
>  
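
One way a space in the jar path can produce the symptom described above, offered
as an assumption and not as the confirmed cause of TIKA-2604: if a parent
process wraps the classpath in literal double quotes and then hands it to a
child JVM as its own argument, no shell strips the quotes, so the child looks
for a file whose name includes the quote characters and cannot load the main
class. The Java sketch below only checks file existence and launches nothing.
The class ClasspathQuotingSketch and its helpers are hypothetical and are not
Tika code.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class ClasspathQuotingSketch {

    // Hypothetical helper: argument list for a child JVM, one element per
    // argument, in the form accepted by java.lang.ProcessBuilder.
    static List<String> childCommand(String classpath, String mainClass) {
        return Arrays.asList("java", "-cp", classpath, mainClass);
    }

    // True if the classpath element (index 2), taken exactly as the child
    // would receive it, names an existing file.
    static boolean classpathEntryExists(List<String> command) {
        return new File(command.get(2)).exists();
    }

    // Creates an empty stand-in jar under a directory whose name contains a
    // space and returns {plain path found, quoted path found}.
    static boolean[] demo() {
        try {
            Path dir = Files.createTempDirectory("space folder");
            Path jar = Files.createFile(dir.resolve("tika-app.jar"));
            try {
                String mainClass = "org.apache.tika.batch.fs.FSBatchProcessCLI";
                String plain = jar.toString();
                // Assumed failure mode: quotes embedded in the argument itself.
                String quoted = "\"" + plain + "\"";
                return new boolean[] {
                    classpathEntryExists(childCommand(plain, mainClass)),
                    classpathEntryExists(childCommand(quoted, mainClass))
                };
            } finally {
                // Remove the temporary file and directory again.
                Files.delete(jar);
                Files.delete(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] found = demo();
        System.out.println("plain path found:  " + found[0]);
        System.out.println("quoted path found: " + found[1]);
    }
}
```

On Unix-like systems, passing the unquoted path as its own list element should
be enough for a child launched through ProcessBuilder; quoting matters only
when a shell parses the command line, as in the commands quoted above.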



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
