Milad Fatenejad created STORM-402:
-------------------------------------

             Summary: FileNotFoundException when using storm with apache tika
                 Key: STORM-402
                 URL: https://issues.apache.org/jira/browse/STORM-402
             Project: Apache Storm (Incubating)
          Issue Type: Bug
    Affects Versions: 0.9.2-incubating, 0.9.3-incubating
         Environment: Ubuntu 14.04, amd64
            Reporter: Milad Fatenejad
            Priority: Minor


I have started using Storm with Apache Tika for text extraction and hit a 
FileNotFoundException thrown from the download-storm-code method in 
supervisor.clj when running in local mode. Submitting to a remote cluster 
that I created works fine. The local-mode log:

7887 [Thread-5] INFO  backtype.storm.daemon.supervisor - Downloading code for storm id LocalTopology-1-1405005513 from /tmp/0de4961a-8694-4646-bb3c-b8e2d49da288/nimbus/stormdist/LocalTopology-1-1405005513
7906 [Thread-5] INFO  backtype.storm.daemon.supervisor - Copying resources at jar:file:/home/milad/.m2/repository/edu/ucar/netcdf/4.2-min/netcdf-4.2-min.jar!/resources to /tmp/61a2365c-f99f-49d6-9bb3-93f84e237fd2/supervisor/stormdist/LocalTopology-1-1405005513/resources
7907 [Thread-5] ERROR backtype.storm.event - Error when processing event
java.io.FileNotFoundException: Source 'file:/home/milad/.m2/repository/edu/ucar/netcdf/4.2-min/netcdf-4.2-min.jar!/resources' does not exist
        at org.apache.commons.io.FileUtils.copyDirectory(FileUtils.java:1368) ~[commons-io-2.4.jar:2.4]
        at org.apache.commons.io.FileUtils.copyDirectory(FileUtils.java:1261) ~[commons-io-2.4.jar:2.4]
        at org.apache.commons.io.FileUtils.copyDirectory(FileUtils.java:1230) ~[commons-io-2.4.jar:2.4]
        at backtype.storm.daemon.supervisor$fn__6392.invoke(supervisor.clj:535) ~[storm-core-0.9.3-incubating-SNAPSHOT.jar:0.9.3-incubating-SNAPSHOT]
        at clojure.lang.MultiFn.invoke(MultiFn.java:236) ~[clojure-1.5.1.jar:na]
        at backtype.storm.daemon.supervisor$mk_synchronize_supervisor$this__6286.invoke(supervisor.clj:327) ~[storm-core-0.9.3-incubating-SNAPSHOT.jar:0.9.3-incubating-SNAPSHOT]
        at backtype.storm.event$event_manager$fn__2391.invoke(event.clj:39) ~[storm-core-0.9.3-incubating-SNAPSHOT.jar:0.9.3-incubating-SNAPSHOT]
        at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
        at java.lang.Thread.run(Thread.java:744) [na:1.7.0_55]
7920 [Thread-5] INFO  backtype.storm.util - Halting process: ("Error when processing an event")

In this case, the download-storm-code method in supervisor.clj is trying to 
load resources from the URL:

jar:file:/home/milad/.m2/repository/edu/ucar/netcdf/4.2-min/netcdf-4.2-min.jar!/resources

The relevant code in download-storm-code seems to be:

  (log-message "Copying resources at " (str url) " to " target-dir)
  (FileUtils/copyDirectory (File. (.getFile url)) (File. target-dir))

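Since the URL uses the jar: scheme:
jar:file:/home/milad/.m2/repository/edu/ucar/netcdf/4.2-min/netcdf-4.2-min.jar!/resources

the getFile method simply returns everything after that scheme:
file:/home/milad/.m2/repository/edu/ucar/netcdf/4.2-min/netcdf-4.2-min.jar!/resources

That string is not a filesystem path: it still carries the file: prefix and 
the !/resources entry suffix. Wrapping it in a File therefore names a 
directory that does not exist, and copyDirectory throws the exception.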
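
To illustrate one possible way around this, here is a rough sketch in plain 
Java (not Clojure, and not a patch against supervisor.clj). It checks the URL 
protocol and, for jar: URLs, copies the entries under the named directory out 
of the jar instead of calling copyDirectory. The class and method names 
(JarResourceCopy, isJarUrl, copyJarDirectory) are made up for this example, 
and it assumes the jar itself lives on the local filesystem, as it does in 
the log above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.JarURLConnection;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, for illustration only.
final class JarResourceCopy {

    /** True when the URL points inside a jar rather than at a plain file. */
    static boolean isJarUrl(URL url) {
        return "jar".equals(url.getProtocol());
    }

    /** Copies every jar entry under the URL's directory into targetDir. */
    static void copyJarDirectory(URL url, Path targetDir) throws IOException {
        // For a jar: URL, openConnection() only parses the URL; no I/O yet.
        JarURLConnection conn = (JarURLConnection) url.openConnection();
        String prefix = conn.getEntryName();          // "resources", or null
        if (prefix == null) {
            prefix = "";
        } else if (!prefix.endsWith("/")) {
            prefix = prefix + "/";
        }
        Path jarPath;
        try {
            // Assumes the part before "!/" is a file: URL.
            jarPath = Paths.get(conn.getJarFileURL().toURI());
        } catch (URISyntaxException e) {
            throw new IOException("Unusable jar location in " + url, e);
        }
        Path root = targetDir.toAbsolutePath().normalize();
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (!entry.getName().startsWith(prefix)) {
                    continue;                         // outside the directory
                }
                String relative = entry.getName().substring(prefix.length());
                Path out = root.resolve(relative).normalize();
                if (!out.startsWith(root)) {
                    throw new IOException("Entry escapes target: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    try (InputStream in = jar.getInputStream(entry)) {
                        Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        URL url = new URL("jar:file:/home/milad/.m2/repository/edu/ucar/netcdf/"
                + "4.2-min/netcdf-4.2-min.jar!/resources");
        System.out.println(url.getProtocol());  // the scheme that marks a jar URL
        System.out.println(url.getFile());      // the non-path string from the report
    }
}
```

A caller could branch on isJarUrl and keep the existing copyDirectory call 
for ordinary file: URLs. Whether extracting jar resources is the right 
behavior for local mode, as opposed to skipping them, is a separate question.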



--
This message was sent by Atlassian JIRA
(v6.2#6252)
