The code you are running is intended for local/test mode, where the
resources directory is a path on the file system, not part of a jar.  In
distributed mode it is embedded in the jar.  Your hack seems fine to me.
If you want to file a JIRA and put up a pull request for it, that would be
fine.  My only comment would be to switch it over from checking for
"jar:file:" at the beginning of the URL to pulling off the protocol/scheme
and checking that directly.  URL/URI parsing can be a bit fickle, and it
would be nice to use the built-in APIs to avoid any possible problems.
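
As a rough sketch of that suggestion (not tested against the Storm code
base), the string prefix test could be replaced by asking java.net.URL for
its protocol. The helper names jar-url? and jar-file-path and the example
URLs below are hypothetical, chosen only for illustration:

```clojure
(import '(java.net URL JarURLConnection))

(defn jar-url?
  "True when the URL's scheme is jar, as reported by java.net.URL itself
   rather than by matching a string prefix."
  [^URL url]
  (= "jar" (.getProtocol url)))

(defn jar-file-path
  "Path of the jar file that a jar: URL points into. Assumes (jar-url? url)."
  [^URL url]
  (let [^JarURLConnection conn (.openConnection url)]
    (.getFile (.getJarFileURL conn))))

;; Hypothetical example URLs, not taken from a real classpath.
(def jar-resource (URL. "jar:file:/tmp/example.jar!/resources"))
(def dir-resource (URL. "file:/tmp/example/resources"))

(println (jar-url? jar-resource)) ; true
(println (jar-url? dir-resource)) ; false
;; getFile on a jar: URL keeps the nested file: scheme and the !/ entry,
;; which is why it cannot be handed to java.io.File directly.
(println (.getFile jar-resource)) ; file:/tmp/example.jar!/resources
```

In download-storm-code, (jar-url? url) would then stand in for the
(.startsWith (str url) "jar:file:") test, with the two branches unchanged.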

- Bobby

On 7/11/14, 8:54 AM, "Milad Fatenejad" <[email protected]> wrote:

>Hello:
>
>I have started using Storm with Apache Tika for text extraction, and I
>encountered a FileNotFoundException emanating from the download-storm-code
>method in supervisor.clj when running in local mode (I am able to submit
>to a remote cluster that I created):
>
>7887 [Thread-5] INFO  backtype.storm.daemon.supervisor - Downloading code
>for storm id LocalTopology-1-1405005513 from
>/tmp/0de4961a-8694-4646-bb3c-b8e2d49da288/nimbus/stormdist/LocalTopology-1
>-1405005513
>7906 [Thread-5] INFO  backtype.storm.daemon.supervisor - Copying resources
>at
>jar:file:/home/milad/.m2/repository/edu/ucar/netcdf/4.2-min/netcdf-4.2-min
>.jar!/resources
>to
>/tmp/61a2365c-f99f-49d6-9bb3-93f84e237fd2/supervisor/stormdist/LocalTopolo
>gy-1-1405005513/resources
>7907 [Thread-5] ERROR backtype.storm.event - Error when processing event
>java.io.FileNotFoundException: Source
>'file:/home/milad/.m2/repository/edu/ucar/netcdf/4.2-min/netcdf-4.2-min.ja
>r!/resources'
>does not exist
> at org.apache.commons.io.FileUtils.copyDirectory(FileUtils.java:1368)
>~[commons-io-2.4.jar:2.4]
> at org.apache.commons.io.FileUtils.copyDirectory(FileUtils.java:1261)
>~[commons-io-2.4.jar:2.4]
>at org.apache.commons.io.FileUtils.copyDirectory(FileUtils.java:1230)
>~[commons-io-2.4.jar:2.4]
> at backtype.storm.daemon.supervisor$fn__6392.invoke(supervisor.clj:535)
>~[storm-core-0.9.3-incubating-SNAPSHOT.jar:0.9.3-incubating-SNAPSHOT]
> at clojure.lang.MultiFn.invoke(MultiFn.java:236) ~[clojure-1.5.1.jar:na]
> at
>backtype.storm.daemon.supervisor$mk_synchronize_supervisor$this__6286.invo
>ke(supervisor.clj:327)
>~[storm-core-0.9.3-incubating-SNAPSHOT.jar:0.9.3-incubating-SNAPSHOT]
> at backtype.storm.event$event_manager$fn__2391.invoke(event.clj:39)
>~[storm-core-0.9.3-incubating-SNAPSHOT.jar:0.9.3-incubating-SNAPSHOT]
> at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
>at java.lang.Thread.run(Thread.java:744) [na:1.7.0_55]
>7920 [Thread-5] INFO  backtype.storm.util - Halting process: ("Error when
>processing an event")
>
>
>In this case, the download-storm-code method in supervisor.clj is trying
>to load resources from the URL:
>
>jar:file:/home/milad/.m2/repository/edu/ucar/netcdf/4.2-min/netcdf-4.2-min
>.jar!/resources
>
>The relevant code in download-storm-code seems to be:
>
>  (log-message "Copying resources at " (str url) " to " target-dir)
>  (FileUtils/copyDirectory (File. (.getFile url)) (File. target-dir))
>
>Since the url is of the form:
>jar:file:/home/milad/.m2/repository/edu/ucar/netcdf/4.2-min/netcdf-4.2-min
>.jar!/resources
>
>the getFile method simply returns:
>file:/home/milad/.m2/repository/edu/ucar/netcdf/4.2-min/netcdf-4.2-min.jar
>!/resources
>
>which is incorrect. I was able to produce a very hacky fix for this by
>tweaking download-storm-code. I changed this code:
>
>  (log-message "Copying resources at " (str url) " to " target-dir)
>  (FileUtils/copyDirectory (File. (.getFile url)) (File. target-dir))
>
>to:
>  (log-message "Copying resources at " (str url) " to " target-dir)
>  (if (.startsWith (str url) "jar:file:")
>    (extract-dir-from-jar
>      (.getFile (.getJarFileURL (.openConnection url)))
>      RESOURCES-SUBDIR stormroot)
>    (FileUtils/copyDirectory (File. (.getFile url)) (File. target-dir)))
>)
>
>and this seems to work. I am very new to Storm and Clojure (this is my
>first experience), so I am not sure if this fix is correct or if I am just
>using Storm incorrectly. Any advice on how to proceed? Should I submit
>this as a pull request on GitHub?
>
>Thanks!
>Milad
