[
https://issues.apache.org/jira/browse/TIKA-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16393353#comment-16393353
]
Sasha Goodman commented on TIKA-2604:
-------------------------------------
Yes!! It's fixed on my end :) Looking forward to Tika 1.18. I'll have to hold off
releasing my interface until that is released.
{code:java}
java -Djava.awt.headless=true -jar "/Users/sasha/Downloads/space
folder/tika-app-1.18.jar" -maxRestarts 1 -t -i '/' -o
'/var/folders/nr/74rgb64s3n98yccxwbv6vsxw0000gn/T/Rtmp9VEJvX/rtika_dircf81200b313e' -fileList
'/var/folders/nr/74rgb64s3n98yccxwbv6vsxw0000gn/T/Rtmp9VEJvX/rtika_filecf81530d27ee'
INFO about to start driver
BatchProcess:No config file set via -bc, relying on tika-app-batch-config.xml
or default-tika-batch-config.xml
INFO BatchProcess: Mar 09, 2018 10:39:18 AM
org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
INFO BatchProcess: WARNING: JBIG2ImageReader not loaded. jbig2 files will be
ignored
INFO BatchProcess: See
https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
INFO BatchProcess: for optional dependencies.
INFO BatchProcess: J2KImageReader not loaded. JPEG2000 files will not be
processed.
INFO BatchProcess: See
https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
INFO BatchProcess: for optional dependencies.
INFO BatchProcess:
INFO BatchProcess: Mar 09, 2018 10:39:18 AM
org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
INFO BatchProcess: WARNING: org.xerial's sqlite-jdbc is not loaded.
INFO BatchProcess: Please provide the jar on your classpath to parse sqlite
files.
INFO BatchProcess: See tika-parsers/pom.xml for the correct version.
INFO BatchProcess: randomCrawl attribute is ignored by FSListCrawler
BatchProcess:BatchProcess starting up
BatchProcess:Main thread in TikaFSBatchCLI has finished processing.
BatchProcess:
BatchProcess:
BatchProcess:ParallelFileProcessingResult{considered=1, added=1, consumed=1,
numberHandledExceptions=0, secondsElapsed=0.798, exitStatus=0,
causeForTermination='COMPLETED_NORMALLY'}
INFO The child process has finished with an exit value of: 0
INFO Process driver has completed
{code}
> Error with certain jar paths on OS X
> ------------------------------------
>
> Key: TIKA-2604
> URL: https://issues.apache.org/jira/browse/TIKA-2604
> Project: Tika
> Issue Type: Bug
> Components: cli
> Affects Versions: 1.17
> Environment: tika-app-1.17.jar, OS X 10.13.3.
>
> Reporter: Sasha Goodman
> Assignee: Tim Allison
> Priority: Blocker
> Fix For: 1.18, 2.0.0
>
>
> I've been developing an R interface to the Tika batch processor for the past
> month ( see: [https://github.com/predict-r/rtika] ), and this software is
> awesome. I use the command line to call the batch processor, and my code has
> worked on Ubuntu, Windows 10 and OS X. Several people have been testing my
> code as well. It's been working.
> A few days ago I found an issue with the batch processor on OS X.
> When calling the batch processor with the tika-app-1.17.jar on a path with
> spaces in it, Tika restarts repeatedly until it hits the restart limit.
> Here is an example of calling the jar *when the path has spaces.* It
> *produces this error, and the unexpected restarts*:
> {code:java}
> java -Djava.awt.headless=true -jar '/Users/sasha/Downloads/space
> folder/tika-app.jar' -maxRestarts 1 -t -i '/' -o
> '/var/folders/nr/74rgb64s3n98yccxwbv6vsxw0000gn/T/Rtmp9VEJvX/rtika_dircf81200b313e'
> -fileList
> '/var/folders/nr/74rgb64s3n98yccxwbv6vsxw0000gn/T/Rtmp9VEJvX/rtika_filecf81530d27ee'
> INFO about to start driver
> INFO BatchProcess: Error: Could not find or load main class
> org.apache.tika.batch.fs.FSBatchProcessCLI
> INFO BatchProcess: Caused by: java.lang.ClassNotFoundException:
> org.apache.tika.batch.fs.FSBatchProcessCLI
> INFO The child process has finished with an exit value of: 1
> WARN Restarting on unexpected restart code: 1
> WARN Must restart process (exitValue=1 numRestarts=0
> receivedRestartMessage=false)
> INFO BatchProcess: Error: Could not find or load main class
> org.apache.tika.batch.fs.FSBatchProcessCLI
> INFO BatchProcess: Caused by: java.lang.ClassNotFoundException:
> org.apache.tika.batch.fs.FSBatchProcessCLI
> INFO The child process has finished with an exit value of: 1
> WARN Restarting on unexpected restart code: 1
> WARN Hit the maximum number of process restarts. Driver is shutting down now.
> INFO Process driver has completed{code}
> The error ALSO occurs when the jar path is wrapped in double quotes.
> *Now, in contrast,* calling the jar when the *path does not have spaces
> produces absolutely NO error*:
> {code:java}
> java -Djava.awt.headless=true -jar '/Users/sasha/Downloads/tika-app.jar'
> -maxRestarts 1 -t -i '/' -o
> '/var/folders/nr/74rgb64s3n98yccxwbv6vsxw0000gn/T/Rtmp9VEJvX/rtika_dircf81200b313e'
> -fileList
> '/var/folders/nr/74rgb64s3n98yccxwbv6vsxw0000gn/T/Rtmp9VEJvX/rtika_filecf81530d27ee'
> INFO about to start driver
> INFO BatchProcess: log4j:WARN No appenders could be found for logger
> (org.apache.tika.batch.fs.FSBatchProcessCLI).
> INFO BatchProcess: log4j:WARN Please initialize the log4j system properly.
> INFO BatchProcess: log4j:WARN See
> http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> INFO BatchProcess: Mar 09, 2018 12:19:17 AM
> org.apache.tika.config.InitializableProblemHandler$3
> handleInitializableProblem
> INFO BatchProcess: WARNING: JBIG2ImageReader not loaded. jbig2 files will be
> ignored
> INFO BatchProcess: See
> https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> INFO BatchProcess: for optional dependencies.
> INFO BatchProcess: TIFFImageWriter not loaded. tiff files will not be
> processed
> INFO BatchProcess: See
> https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> INFO BatchProcess: for optional dependencies.
> INFO BatchProcess: J2KImageReader not loaded. JPEG2000 files will not be
> processed.
> INFO BatchProcess: See
> https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> INFO BatchProcess: for optional dependencies.
> INFO BatchProcess:
> INFO BatchProcess: Mar 09, 2018 12:19:17 AM
> org.apache.tika.config.InitializableProblemHandler$3
> handleInitializableProblem
> INFO BatchProcess: WARNING: org.xerial's sqlite-jdbc is not loaded.
> INFO BatchProcess: Please provide the jar on your classpath to parse sqlite
> files.
> INFO BatchProcess: See tika-parsers/pom.xml for the correct version.
> INFO BatchProcess: randomCrawl attribute is ignored by FSListCrawler
> BatchProcess:Main thread in TikaFSBatchCLI has finished processing.
> BatchProcess:
> BatchProcess:
> BatchProcess:ParallelFileProcessingResult{considered=1, added=1, consumed=1,
> numberHandledExceptions=0, secondsElapsed=0.853, exitStatus=0,
> causeForTermination='COMPLETED_NORMALLY'}
> INFO The child process has finished with an exit value of: 0
> INFO Process driver has completed{code}
> Further, what makes this a batch processor issue is that the path with
> the space in it produces absolutely *NO error in the normal Tika CLI mode
> either*:
> {code:java}
> java -jar '/Users/sasha/Downloads/space folder/tika-app.jar' -t
> /Library/Frameworks/R.framework/Versions/3.4/Resources/library/rtika/extdata/jsonlite.pdf
> {code}
> The last two examples work, but the first does not.
> The only difference is that the first calls the batch processor, which
> restarts no matter which file is processed.
>
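The description above points at the step where the batch driver launches a
child JVM: the main class name is reported correctly, but the class cannot be
found, which is what one would see if the classpath handed to the child did not
match the real jar path. The thread does not establish the actual cause, so the
following is only a minimal sketch of one plausible way a space in the path can
break a child launch. It assumes the child is started with -cp and an argument
list; ChildCommandSketch, buildChildCommand and quoteIfSpaces are hypothetical
names for illustration and are not Tika APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not code from Tika.
class ChildCommandSketch {

    // Builds the child JVM command as a list, one element per argument.
    // ProcessBuilder passes each element through as a single argument,
    // so a path containing spaces needs no extra quoting here.
    static List<String> buildChildCommand(String classPath,
                                          String mainClass,
                                          List<String> appArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(classPath);   // kept verbatim, spaces included
        cmd.add(mainClass);
        cmd.addAll(appArgs);
        return cmd;
    }

    // Fragile variant (assumed, for contrast): wraps the path in literal
    // double quotes. On Unix-like systems no shell strips them, so the
    // quote characters become part of the path the child JVM sees.
    static String quoteIfSpaces(String path) {
        return path.contains(" ") ? "\"" + path + "\"" : path;
    }

    public static void main(String[] args) {
        String jar = "/Users/sasha/Downloads/space folder/tika-app.jar";
        String mainClass = "org.apache.tika.batch.fs.FSBatchProcessCLI";

        List<String> cmd = buildChildCommand(
                jar, mainClass, Arrays.asList("-maxRestarts", "1"));

        // The process is only constructed here, never started.
        ProcessBuilder pb = new ProcessBuilder(cmd);
        System.out.println("safe classpath argument:    " + pb.command().get(2));
        System.out.println("fragile classpath argument: " + quoteIfSpaces(jar));
    }
}
```

With the list form, the jar path stays a single argument regardless of spaces.
With the quoted form, the classpath entry starts and ends with a quote
character, so it no longer names an existing file, and the child JVM would
fail to load the main class even though its name is spelled correctly.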
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)