[
https://issues.apache.org/jira/browse/TIKA-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-2604.
-------------------------------
Resolution: Fixed
Fix Version/s: 2.0.0, 1.18
[~goodmansasha] thank you for opening this issue. I was able to replicate this
on Linux, and I _think_ I fixed it. If you could grab the app.jar from the
most recent successful build and give it a try, that'd be great. Tika 1.18's
release cycle is just around the corner, and this is important.
> Error with certain jar paths on OS X
> ------------------------------------
>
> Key: TIKA-2604
> URL: https://issues.apache.org/jira/browse/TIKA-2604
> Project: Tika
> Issue Type: Bug
> Components: cli
> Affects Versions: 1.17
> Environment: tika-app-1.17.jar, OS X 10.13.3.
>
> Reporter: Sasha Goodman
> Assignee: Tim Allison
> Priority: Major
> Fix For: 1.18, 2.0.0
>
>
> I've been developing an R interface to the Tika batch processor for the past
> month ( see: [https://github.com/predict-r/rtika] ), and this software is
> awesome. I use the command line to call the batch processor, and my code has
> worked on Ubuntu, Windows 10 and OS X. Several people have been testing my
> code as well. It's been working.
> A few days ago I found an issue with the batch processor on OS X.
> When calling the batch processor with the tika-app-1.17.jar on a path with
> spaces in it, Tika starts to continually restart.
> Here is an example of calling the jar *when the path has spaces.* It
> *produces this error, and the unexpected restarts*:
> {code:java}
> java -Djava.awt.headless=true -jar '/Users/sasha/Downloads/space
> folder/tika-app.jar' -maxRestarts 1 -t -i '/' -o
> '/var/folders/nr/74rgb64s3n98yccxwbv6vsxw0000gn/T/Rtmp9VEJvX/rtika_dircf81200b313e'
> -fileList
> '/var/folders/nr/74rgb64s3n98yccxwbv6vsxw0000gn/T/Rtmp9VEJvX/rtika_filecf81530d27ee'
> INFO about to start driver
> INFO BatchProcess: Error: Could not find or load main class
> org.apache.tika.batch.fs.FSBatchProcessCLI
> INFO BatchProcess: Caused by: java.lang.ClassNotFoundException:
> org.apache.tika.batch.fs.FSBatchProcessCLI
> INFO The child process has finished with an exit value of: 1
> WARN Restarting on unexpected restart code: 1
> WARN Must restart process (exitValue=1 numRestarts=0
> receivedRestartMessage=false)
> INFO BatchProcess: Error: Could not find or load main class
> org.apache.tika.batch.fs.FSBatchProcessCLI
> INFO BatchProcess: Caused by: java.lang.ClassNotFoundException:
> org.apache.tika.batch.fs.FSBatchProcessCLI
> INFO The child process has finished with an exit value of: 1
> WARN Restarting on unexpected restart code: 1
> WARN Hit the maximum number of process restarts. Driver is shutting down now.
> INFO Process driver has completed{code}
> The error ALSO occurs with double quotes around the jar path.
> *Now, in contrast,* calling the jar when the *path does not have spaces
> produces absolutely NO error*:
> {code:java}
> java -Djava.awt.headless=true -jar '/Users/sasha/Downloads/tika-app.jar'
> -maxRestarts 1 -t -i '/' -o
> '/var/folders/nr/74rgb64s3n98yccxwbv6vsxw0000gn/T/Rtmp9VEJvX/rtika_dircf81200b313e'
> -fileList
> '/var/folders/nr/74rgb64s3n98yccxwbv6vsxw0000gn/T/Rtmp9VEJvX/rtika_filecf81530d27ee'
> INFO about to start driver
> INFO BatchProcess: log4j:WARN No appenders could be found for logger
> (org.apache.tika.batch.fs.FSBatchProcessCLI).
> INFO BatchProcess: log4j:WARN Please initialize the log4j system properly.
> INFO BatchProcess: log4j:WARN See
> http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> INFO BatchProcess: Mar 09, 2018 12:19:17 AM
> org.apache.tika.config.InitializableProblemHandler$3
> handleInitializableProblem
> INFO BatchProcess: WARNING: JBIG2ImageReader not loaded. jbig2 files will be
> ignored
> INFO BatchProcess: See
> https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> INFO BatchProcess: for optional dependencies.
> INFO BatchProcess: TIFFImageWriter not loaded. tiff files will not be
> processed
> INFO BatchProcess: See
> https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> INFO BatchProcess: for optional dependencies.
> INFO BatchProcess: J2KImageReader not loaded. JPEG2000 files will not be
> processed.
> INFO BatchProcess: See
> https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> INFO BatchProcess: for optional dependencies.
> INFO BatchProcess:
> INFO BatchProcess: Mar 09, 2018 12:19:17 AM
> org.apache.tika.config.InitializableProblemHandler$3
> handleInitializableProblem
> INFO BatchProcess: WARNING: org.xerial's sqlite-jdbc is not loaded.
> INFO BatchProcess: Please provide the jar on your classpath to parse sqlite
> files.
> INFO BatchProcess: See tika-parsers/pom.xml for the correct version.
> INFO BatchProcess: randomCrawl attribute is ignored by FSListCrawler
> BatchProcess:Main thread in TikaFSBatchCLI has finished processing.
> BatchProcess:
> BatchProcess:
> BatchProcess:ParallelFileProcessingResult{considered=1, added=1, consumed=1,
> numberHandledExceptions=0, secondsElapsed=0.853, exitStatus=0,
> causeForTermination='COMPLETED_NORMALLY'}
> INFO The child process has finished with an exit value of: 0
> INFO Process driver has completed{code}
> Further, and what makes this a batch processor issue, is that the same path with
> the space in it produces absolutely *NO error in the normal Tika CLI mode
> either*:
> {code:java}
> java -jar '/Users/sasha/Downloads/space folder/tika-app.jar' -t
> /Library/Frameworks/R.framework/Versions/3.4/Resources/library/rtika/extdata/jsonlite.pdf
> {code}
> The last two examples work, but the first does not.
> The failing case is the only one that combines the batch processor with a
> path containing spaces, and the restarts occur whatever the input file.
>
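The symptom reported above (the child JVM cannot load its main class only when the jar path contains a space) is consistent with the driver passing a mangled class path to the child process. The following is a minimal Java sketch of one plausible mechanism, not the actual Tika code or the fix applied for this issue. The class name ChildCommandSketch and the helpers plainCommand and quotedCommand are hypothetical. It assumes the child is launched from an argument list (as with java.lang.ProcessBuilder), where each element arrives as exactly one argument, so adding literal quotes around a path changes the path itself.

```java
import java.util.ArrayList;
import java.util.List;

public class ChildCommandSketch {

    // Hypothetical helper: builds the argument list for a child JVM.
    // Each list element reaches the child as exactly one argument, so a
    // class path containing spaces needs no extra quoting here.
    static List<String> plainCommand(String classPath, String mainClass) {
        List<String> command = new ArrayList<>();
        command.add("java");
        command.add("-cp");
        command.add(classPath);
        command.add(mainClass);
        return command;
    }

    // Hypothetical fragile variant: wraps the class path in literal quotes.
    // When the list is handed to ProcessBuilder on Linux or OS X, no shell
    // strips the quotes, so they become part of the path the JVM looks up.
    static List<String> quotedCommand(String classPath, String mainClass) {
        return plainCommand("\"" + classPath + "\"", mainClass);
    }

    public static void main(String[] args) {
        String jar = "/Users/sasha/Downloads/space folder/tika-app.jar";
        String mainClass = "org.apache.tika.batch.fs.FSBatchProcessCLI";
        // To launch for real (assumption: java is on the PATH):
        // new ProcessBuilder(plainCommand(jar, mainClass)).inheritIO().start();
        System.out.println(plainCommand(jar, mainClass).get(2));
        System.out.println(quotedCommand(jar, mainClass).get(2));
    }
}
```

Under these assumptions, the quoted variant hands the child a class path entry that does not name any file on disk, which would leave the main class unloadable while the main class name itself is still parsed correctly, as in the log above.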
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)