Curator adds escaped sequenced spaces when reading extractorBinPath tag from
extractor config files
---------------------------------------------------------------------------------------------------
Key: OODT-7
URL: https://issues.apache.org/jira/browse/OODT-7
Project: OODT
Issue Type: Bug
Environment: jpl-esg machine using cas-curator 1.0.0 release
Reporter: Joshua Garcia
If a config file is set up as follows, as shown in the cas-curator
user guide:
[joshu...@jpl-esg mp3extractor]$ cat mp3PythonExtractor.config
<?xml version="1.0" encoding="UTF-8"?>
<cas:externextractor xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
<exec workingDir="">
<extractorBinPath>
/home/joshuaga/extractors/mp3extractor/mp3PythonExtractor.py
</extractorBinPath>
<args>
<arg isDataFile="true"/>
</args>
</exec>
</cas:externextractor>
then Tomcat's catalina.out log file can contain a warning such as:
WARNING: IOException running met extraction: commandLine: [\
/home/joshuaga/extractors/mp3extractor/mp3PythonExtractor.py\ \ \ \ \ \ \
/home/joshuaga/staging/products/mp3/Bach-SuiteNo2.mp3 ]: Message:
java.io.IOException: \
/home/joshuaga/extractors/mp3extractor/mp3PythonExtractor.py\ \ \ \ \ \ \ : not
found
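This shows that the command run for the extractor includes extra spaces, each
escaped with a backslash, before and after the extractor path. They appear to
come from the line breaks and indentation around the path inside the
extractorBinPath tag, and they cause the "not found" error shown.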
The workaround is simply to put the path on the same line as the
extractorBinPath tags, with no whitespace around it:
<?xml version="1.0" encoding="UTF-8"?>
<cas:externextractor xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
<exec workingDir="">
<extractorBinPath>/home/joshuaga/extractors/mp3extractor/mp3PythonExtractor.py</extractorBinPath>
<args>
<arg isDataFile="true"/>
</args>
</exec>
</cas:externextractor>
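A possible fix on the reading side would be to trim the tag's text before
building the command line. The following is only a minimal sketch of that idea
using the standard Java DOM API. The curator code that actually reads
extractorBinPath is not shown in this report, so the class name
ExtractorBinPathDemo and the methods readRawBinPath and readTrimmedBinPath are
hypothetical and are not part of OODT.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical demo class; not taken from the cas-curator sources.
class ExtractorBinPathDemo {

    // Returns the text of the first extractorBinPath element exactly as the
    // DOM parser reports it, including line breaks and indentation.
    static String readRawBinPath(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Node node = doc.getElementsByTagName("extractorBinPath").item(0);
            return node.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse extractor config", e);
        }
    }

    // Same as above, but with leading and trailing whitespace removed so the
    // value can be used directly as an executable path.
    static String readTrimmedBinPath(String xml) {
        return readRawBinPath(xml).trim();
    }

    public static void main(String[] args) {
        // Same layout as the config file from the report.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<cas:externextractor xmlns:cas=\"http://oodt.jpl.nasa.gov/1.0/cas\">\n"
                + "  <exec workingDir=\"\">\n"
                + "    <extractorBinPath>\n"
                + "      /home/joshuaga/extractors/mp3extractor/mp3PythonExtractor.py\n"
                + "    </extractorBinPath>\n"
                + "    <args>\n"
                + "      <arg isDataFile=\"true\"/>\n"
                + "    </args>\n"
                + "  </exec>\n"
                + "</cas:externextractor>\n";

        // Brackets make the surrounding whitespace visible.
        System.out.println("raw:     [" + readRawBinPath(xml) + "]");
        System.out.println("trimmed: [" + readTrimmedBinPath(xml) + "]");
    }
}
```

With trimming in place, both the multi-line form from the user guide and the
single-line workaround would yield the same path.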