I created STORM-402 on JIRA and opened pull request #192 incorporating your suggestions.

Thanks for your help,
Milad

On Fri, Jul 11, 2014 at 2:25 PM, Bobby Evans wrote:

> The code you are running is intended for local/test mode, where the
> resources directory is a path on the file system, not part of a jar. In
> distributed mode it is embedded into the jar. Your hack seems fine to me.
> If you want to file a JIRA and put up a pull request for it, that would be
> fine. My only comment would be to switch from checking for "jar:file:"
> at the beginning of the URL to pulling off the protocol/scheme and
> checking that directly. URL/URI parsing can be a bit fickle, and it would
> be nice to use the built-in APIs to avoid any possible problems.
>
> - Bobby
>
> On 7/11/14, 8:54 AM, "Milad Fatenejad" wrote:
>
> > Hello:
> >
> > I have started using Storm with Apache Tika for text extraction, and I
> > encountered a FileNotFoundException emanating from the
> > download-storm-code method in supervisor.clj when running in local mode
> > (I am able to submit to a remote cluster that I created):
> >
> > 7887 [Thread-5] INFO backtype.storm.daemon.supervisor - Downloading code for storm id LocalTopology-1-1405005513 from /tmp/0de4961a-8694-4646-bb3c-b8e2d49da288/nimbus/stormdist/LocalTopology-1-1405005513
> > 7906 [Thread-5] INFO backtype.storm.daemon.supervisor - Copying resources at jar:file:/home/milad/.m2/repository/edu/ucar/netcdf/4.2-min/netcdf-4.2-min.jar!/resources to /tmp/61a2365c-f99f-49d6-9bb3-93f84e237fd2/supervisor/stormdist/LocalTopology-1-1405005513/resources
> > 7907 [Thread-5] ERROR backtype.storm.event - Error when processing event
> > java.io.FileNotFoundException: Source 'file:/home/milad/.m2/repository/edu/ucar/netcdf/4.2-min/netcdf-4.2-min.jar!/resources' does not exist
> >     at org.apache.commons.io.FileUtils.copyDirectory(FileUtils.java:1368) ~[commons-io-2.4.jar:2.4]
> >     at org.apache.commons.io.FileUtils.copyDirectory(FileUtils.java:1261) ~[commons-io-2.4.jar:2.4]
> >     at org.apache.commons.io.FileUtils.copyDirectory(FileUtils.java:1230) ~[commons-io-2.4.jar:2.4]
> >     at backtype.storm.daemon.supervisor$fn__6392.invoke(supervisor.clj:535) ~[storm-core-0.9.3-incubating-SNAPSHOT.jar:0.9.3-incubating-SNAPSHOT]
> >     at clojure.lang.MultiFn.invoke(MultiFn.java:236) ~[clojure-1.5.1.jar:na]
> >     at backtype.storm.daemon.supervisor$mk_synchronize_supervisor$this__6286.invoke(supervisor.clj:327) ~[storm-core-0.9.3-incubating-SNAPSHOT.jar:0.9.3-incubating-SNAPSHOT]
> >     at backtype.storm.event$event_manager$fn__2391.invoke(event.clj:39) ~[storm-core-0.9.3-incubating-SNAPSHOT.jar:0.9.3-incubating-SNAPSHOT]
> >     at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
> >     at java.lang.Thread.run(Thread.java:744) [na:1.7.0_55]
> > 7920 [Thread-5] INFO backtype.storm.util - Halting process: ("Error when processing an event")
> >
> > In this case, the download-storm-code method in supervisor.clj is trying
> > to load resources from the URL:
> >
> > jar:file:/home/milad/.m2/repository/edu/ucar/netcdf/4.2-min/netcdf-4.2-min.jar!/resources
> >
> > The relevant code in download-storm-code seems to be:
> >
> >   (log-message "Copying resources at " (str url) " to " target-dir)
> >   (FileUtils/copyDirectory (File. (.getFile url)) (File. target-dir))
> >
> > Since the URL is of the form:
> >
> > jar:file:/home/milad/.m2/repository/edu/ucar/netcdf/4.2-min/netcdf-4.2-min.jar!/resources
> >
> > the getFile method simply returns:
> >
> > file:/home/milad/.m2/repository/edu/ucar/netcdf/4.2-min/netcdf-4.2-min.jar!/resources
> >
> > which is not a valid file system path. I was able to produce a very hacky
> > fix for this by tweaking download-storm-code. I changed this code:
> >
> >   (log-message "Copying resources at " (str url) " to " target-dir)
> >   (FileUtils/copyDirectory (File. (.getFile url)) (File. target-dir))
> >
> > to:
> >
> >   (log-message "Copying resources at " (str url) " to " target-dir)
> >   (if (.startsWith (str url) "jar:file:" )
> >     (extract-dir-from-jar (.getFile (.getJarFileURL (.openConnection url))) RESOURCES-SUBDIR stormroot)
> >     (FileUtils/copyDirectory (File. (.getFile url)) (File. target-dir)))
> >   )
> >
> > and this seems to work. I am very new to Storm and Clojure (this is my
> > first experience), so I am not sure if this fix is correct or if I am just
> > using Storm incorrectly. Any advice on how to proceed? Should I submit
> > this as a pull request on GitHub?
> >
> > Thanks!
> > Milad
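
For reference, the suggestion of checking the scheme through the URL API, rather than matching the "jar:file:" string prefix, could look roughly like the sketch below. It is written in Java rather than Clojure only so that it stands alone. ResourceUrlCheck, parse, isJarUrl and jarFilePath are hypothetical names that are not part of Storm, and /tmp/example.jar is a placeholder path that does not need to exist.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.JarURLConnection;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper class for illustration; none of these names come from Storm.
final class ResourceUrlCheck {

    // Parse a URL string, wrapping the checked exception to keep the sketch short.
    static URL parse(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // True when the URL uses the "jar" scheme, as reported by the URL API
    // instead of a string-prefix match on the whole URL.
    static boolean isJarUrl(URL url) {
        return "jar".equals(url.getProtocol());
    }

    // File system path of the jar that encloses a jar: URL.
    // Assumes the caller has already checked isJarUrl(url).
    static String jarFilePath(URL url) {
        try {
            URLConnection conn = url.openConnection();
            return ((JarURLConnection) conn).getJarFileURL().getFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URL jarUrl = parse("jar:file:/tmp/example.jar!/resources");
        URL dirUrl = parse("file:/tmp/resources");

        System.out.println(isJarUrl(jarUrl));    // true
        System.out.println(isJarUrl(dirUrl));    // false
        // getFile() on a jar: URL still carries the nested "file:" scheme and
        // the "!/entry" suffix, which is why File. (.getFile url) failed above.
        System.out.println(jarUrl.getFile());    // file:/tmp/example.jar!/resources
        System.out.println(jarFilePath(jarUrl)); // /tmp/example.jar
    }
}
```

In download-storm-code the equivalent test would presumably be something like (= "jar" (.getProtocol url)); that is an assumption about how such a patch might look, not a description of what pull request #192 contains.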
