Curator adds escaped sequenced spaces when reading extractorBinPath tag from 
extractor config files
---------------------------------------------------------------------------------------------------

                 Key: OODT-7
                 URL: https://issues.apache.org/jira/browse/OODT-7
             Project: OODT
          Issue Type: Bug
         Environment: jpl-esg machine using cas-curator 1.0.0 release
            Reporter: Joshua Garcia


If an extractor config file is laid out as shown in the cas-curator user guide, 
with the extractorBinPath value on its own line:

[joshu...@jpl-esg mp3extractor]$ cat mp3PythonExtractor.config
<?xml version="1.0" encoding="UTF-8"?>
<cas:externextractor xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
  <exec workingDir="">
     <extractorBinPath>
/home/joshuaga/extractors/mp3extractor/mp3PythonExtractor.py
     </extractorBinPath>
     <args>
        <arg isDataFile="true"/>
     </args>
  </exec>
</cas:externextractor>

then Tomcat's catalina.out log file can contain a warning such as:

WARNING: IOException running met extraction: commandLine: [\ 
/home/joshuaga/extractors/mp3extractor/mp3PythonExtractor.py\ \ \ \ \ \ \ 
/home/joshuaga/staging/products/mp3/Bach-SuiteNo2.mp3 ]: Message: 
java.io.IOException: \ 
/home/joshuaga/extractors/mp3extractor/mp3PythonExtractor.py\ \ \ \ \ \ \ : not 
found

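This shows that the command run for the extractor has backslash-escaped spaces 
before and after the extractor path, which causes the "not found" error shown. 
The whitespace surrounding the path inside the extractorBinPath element appears 
to be carried into the command instead of being trimmed.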
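
To illustrate where the extra whitespace can come from, here is a minimal 
sketch in Python (not the curator's own code, which is Java). It assumes the 
element text is read verbatim from the XML; the helper name read_bin_path is 
made up for this example, and how the curator then escapes the whitespace is 
not shown here.

```python
import xml.etree.ElementTree as ET

# Same layout as the config in this report: the path sits on its own line.
CONFIG = b"""<?xml version="1.0" encoding="UTF-8"?>
<cas:externextractor xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
  <exec workingDir="">
     <extractorBinPath>
/home/joshuaga/extractors/mp3extractor/mp3PythonExtractor.py
     </extractorBinPath>
     <args>
        <arg isDataFile="true"/>
     </args>
  </exec>
</cas:externextractor>
"""


def read_bin_path(config_xml, trim):
    """Return the extractorBinPath text, optionally stripped (hypothetical helper)."""
    root = ET.fromstring(config_xml)
    text = root.find("exec/extractorBinPath").text
    return text.strip() if trim else text


raw = read_bin_path(CONFIG, trim=False)
clean = read_bin_path(CONFIG, trim=True)

print(repr(raw))    # the line break and indentation are part of the value
print(repr(clean))  # only the path remains
```

Stripping the text before building the command line would make both config 
layouts behave the same way.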

The workaround is to put the path on the same line as the extractorBinPath 
tags, with no whitespace around it:

<?xml version="1.0" encoding="UTF-8"?>
<cas:externextractor xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
  <exec workingDir="">
     <extractorBinPath>/home/joshuaga/extractors/mp3extractor/mp3PythonExtractor.py</extractorBinPath>
     <args>
        <arg isDataFile="true"/>
     </args>
  </exec>
</cas:externextractor>
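
Until this is fixed, a small check along these lines could flag configs that 
would hit the problem. It is a sketch in Python and not part of cas-curator; 
the function name padded_bin_paths and the shortened example paths are made up 
for illustration.

```python
import xml.etree.ElementTree as ET


def padded_bin_paths(config_xml):
    """Return extractorBinPath values that have leading or trailing whitespace."""
    root = ET.fromstring(config_xml)
    return [
        elem.text
        for elem in root.iter("extractorBinPath")
        if elem.text is not None and elem.text != elem.text.strip()
    ]


# Hypothetical, shortened configs: one padded value, one using the workaround.
PADDED = b"<exec><extractorBinPath>\n/tmp/extractor.py\n  </extractorBinPath></exec>"
FIXED = b"<exec><extractorBinPath>/tmp/extractor.py</extractorBinPath></exec>"

print(padded_bin_paths(PADDED))  # non-empty: this config needs the workaround
print(padded_bin_paths(FIXED))   # empty: nothing to fix
```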
