Re: Supervisor failed but workers continue to work

2016-09-18 Thread Jungtaek Lim
I guess you hit STORM-1934, which was resolved in Storm 1.0.2.
Since Storm 1.0.2 includes many bug fixes, I strongly encourage you to
upgrade your cluster to 1.0.2.

Thanks,
Jungtaek Lim (HeartSaVioR)

On Sun, Sep 18, 2016 at 11:08 AM, k-2f...@hotmail.com wrote:

> Storm UI shows this supervisor is down. But the workers run properly.
>
> The error log of failed supervisor is:
>
> 2016-09-18 09:47:13.129 o.a.s.d.supervisor [INFO] Shutting down e0c975d4-85c7-45fa-900d-bd3e7240e92c:
> 2016-09-18 09:47:13.129 o.a.s.config [INFO] GET worker-user
> 2016-09-18 09:47:13.129 o.a.s.config [WARN] Failed to get worker user for . #error {
> :cause /home/storm/apache-storm-1.0.1/data/workers-users (Is a directory)
> :via
> [{:type java.io.FileNotFoundException
>    :message /home/storm/apache-storm-1.0.1/data/workers-users (Is a directory)
>    :at [java.io.FileInputStream open0 FileInputStream.java -2]}]
> :trace
> [[java.io.FileInputStream open0 FileInputStream.java -2]
>   [java.io.FileInputStream open FileInputStream.java 195]
>   [java.io.FileInputStream <init> FileInputStream.java 138]
>   [clojure.java.io$fn__9189 invoke io.clj 229]
>   [clojure.java.io$fn__9102$G__9095__9109 invoke io.clj 69]
>   [clojure.java.io$fn__9201 invoke io.clj 258]
>   [clojure.java.io$fn__9102$G__9095__9109 invoke io.clj 69]
>   [clojure.java.io$fn__9163 invoke io.clj 165]
>   [clojure.java.io$fn__9115$G__9091__9122 invoke io.clj 69]
>   [clojure.java.io$reader doInvoke io.clj 102]
>   [clojure.lang.RestFn invoke RestFn.java 410]
>   [clojure.lang.AFn applyToHelper AFn.java 154]
>   [clojure.lang.RestFn applyTo RestFn.java 132]
>   [clojure.core$apply invoke core.clj 632]
>   [clojure.core$slurp doInvoke core.clj 6653]
>   [clojure.lang.RestFn invoke RestFn.java 410]
>   [org.apache.storm.config$get_worker_user invoke config.clj 239]
>   [org.apache.storm.daemon.supervisor$shutdown_worker invoke supervisor.clj 281]
>   [org.apache.storm.daemon.supervisor$kill_existing_workers_with_change_in_components invoke supervisor.clj 536]
>   [org.apache.storm.daemon.supervisor$mk_synchronize_supervisor$this__9078 invoke supervisor.clj 595]
>   [org.apache.storm.event$event_manager$fn__8630 invoke event.clj 40]
>   [clojure.lang.AFn run AFn.java 22]
>   [java.lang.Thread run Thread.java 745]]}
> 2016-09-18 09:47:13.129 o.a.s.d.supervisor [INFO] Shut down e0c975d4-85c7-45fa-900d-bd3e7240e92c:
> 2016-09-18 09:47:13.129 o.a.s.d.supervisor [INFO] Creating symlinks for worker-id: f8ff0276-ead3-4017-88f3-5d73b5e258e9 storm-id: DPI-17-1474162982 for files(1): ("resources")
> 2016-09-18 09:47:13.132 o.a.s.event [ERROR] Error when processing event
> java.nio.file.NoSuchFileException: /home/storm/apache-storm-1.0.1/data/workers/f8ff0276-ead3-4017-88f3-5d73b5e258e9/resources
> at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
> at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
> at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
> at sun.nio.fs.UnixFileSystemProvider.createSymbolicLink(UnixFileSystemProvider.java:457)
> at java.nio.file.Files.createSymbolicLink(Files.java:1043)
> at org.apache.storm.util$create_symlink_BANG_.invoke(util.clj:606)
> at org.apache.storm.util$create_symlink_BANG_.invoke(util.clj:596)
> at org.apache.storm.daemon.supervisor$create_blobstore_links.invoke(supervisor.clj:1038)
> at org.apache.storm.daemon.supervisor$fn__9341.invoke(supervisor.clj:1153)
> at clojure.lang.MultiFn.invoke(MultiFn.java:251)
> at org.apache.storm.daemon.supervisor$get_valid_new_worker_ids$iter__8926__8930$fn__8931.invoke(supervisor.clj:380)
> at clojure.lang.LazySeq.sval(LazySeq.java:40)
> at clojure.lang.LazySeq.seq(LazySeq.java:49)
> at clojure.lang.RT.seq(RT.java:507)
> at clojure.core$seq__4128.invoke(core.clj:137)
> at clojure.core$dorun.invoke(core.clj:3009)
> at clojure.core$doall.invoke(core.clj:3025)
> at org.apache.storm.daemon.supervisor$get_valid_new_worker_ids.invoke(supervisor.clj:367)
> at org.apache.storm.daemon.supervisor$sync_processes.invoke(supervisor.clj:428)
> at clojure.core$partial$fn__4527.invoke(core.clj:2492)
> at org.apache.storm.event$event_manager$fn__8630.invoke(event.clj:40)
> at clojure.lang.AFn.run(AFn.java:22)
> at java.lang.Thread.run(Thread.java:745)
> 2016-09-18 09:47:13.135 o.a.s.util [ERROR] Halting process: ("Error when processing an event")
> java.lang.RuntimeException: ("Error when processing an event")
> at org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341)
> at clojure.lang.RestFn.invoke(RestFn.java:423)
> at org.ap

Supervisor failed but workers continue to work

2016-09-17 Thread k-2f...@hotmail.com
The Storm UI shows this supervisor as down, but its workers keep running properly.
The error log of the failed supervisor is:

2016-09-18 09:47:13.129 o.a.s.d.supervisor [INFO] Shutting down e0c975d4-85c7-45fa-900d-bd3e7240e92c:
2016-09-18 09:47:13.129 o.a.s.config [INFO] GET worker-user
2016-09-18 09:47:13.129 o.a.s.config [WARN] Failed to get worker user for . #error {
:cause /home/storm/apache-storm-1.0.1/data/workers-users (Is a directory)
:via
[{:type java.io.FileNotFoundException
   :message /home/storm/apache-storm-1.0.1/data/workers-users (Is a directory)
   :at [java.io.FileInputStream open0 FileInputStream.java -2]}]
:trace
[[java.io.FileInputStream open0 FileInputStream.java -2]
  [java.io.FileInputStream open FileInputStream.java 195]
  [java.io.FileInputStream <init> FileInputStream.java 138]
  [clojure.java.io$fn__9189 invoke io.clj 229]
  [clojure.java.io$fn__9102$G__9095__9109 invoke io.clj 69]
  [clojure.java.io$fn__9201 invoke io.clj 258]
  [clojure.java.io$fn__9102$G__9095__9109 invoke io.clj 69]
  [clojure.java.io$fn__9163 invoke io.clj 165]
  [clojure.java.io$fn__9115$G__9091__9122 invoke io.clj 69]
  [clojure.java.io$reader doInvoke io.clj 102]
  [clojure.lang.RestFn invoke RestFn.java 410]
  [clojure.lang.AFn applyToHelper AFn.java 154]
  [clojure.lang.RestFn applyTo RestFn.java 132]
  [clojure.core$apply invoke core.clj 632]
  [clojure.core$slurp doInvoke core.clj 6653]
  [clojure.lang.RestFn invoke RestFn.java 410]
  [org.apache.storm.config$get_worker_user invoke config.clj 239]
  [org.apache.storm.daemon.supervisor$shutdown_worker invoke supervisor.clj 281]
  [org.apache.storm.daemon.supervisor$kill_existing_workers_with_change_in_components invoke supervisor.clj 536]
  [org.apache.storm.daemon.supervisor$mk_synchronize_supervisor$this__9078 invoke supervisor.clj 595]
  [org.apache.storm.event$event_manager$fn__8630 invoke event.clj 40]
  [clojure.lang.AFn run AFn.java 22]
  [java.lang.Thread run Thread.java 745]]}
2016-09-18 09:47:13.129 o.a.s.d.supervisor [INFO] Shut down e0c975d4-85c7-45fa-900d-bd3e7240e92c:
2016-09-18 09:47:13.129 o.a.s.d.supervisor [INFO] Creating symlinks for worker-id: f8ff0276-ead3-4017-88f3-5d73b5e258e9 storm-id: DPI-17-1474162982 for files(1): ("resources")
2016-09-18 09:47:13.132 o.a.s.event [ERROR] Error when processing event
java.nio.file.NoSuchFileException: /home/storm/apache-storm-1.0.1/data/workers/f8ff0276-ead3-4017-88f3-5d73b5e258e9/resources
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.createSymbolicLink(UnixFileSystemProvider.java:457)
at java.nio.file.Files.createSymbolicLink(Files.java:1043)
at org.apache.storm.util$create_symlink_BANG_.invoke(util.clj:606)
at org.apache.storm.util$create_symlink_BANG_.invoke(util.clj:596)
at org.apache.storm.daemon.supervisor$create_blobstore_links.invoke(supervisor.clj:1038)
at org.apache.storm.daemon.supervisor$fn__9341.invoke(supervisor.clj:1153)
at clojure.lang.MultiFn.invoke(MultiFn.java:251)
at org.apache.storm.daemon.supervisor$get_valid_new_worker_ids$iter__8926__8930$fn__8931.invoke(supervisor.clj:380)
at clojure.lang.LazySeq.sval(LazySeq.java:40)
at clojure.lang.LazySeq.seq(LazySeq.java:49)
at clojure.lang.RT.seq(RT.java:507)
at clojure.core$seq__4128.invoke(core.clj:137)
at clojure.core$dorun.invoke(core.clj:3009)
at clojure.core$doall.invoke(core.clj:3025)
at org.apache.storm.daemon.supervisor$get_valid_new_worker_ids.invoke(supervisor.clj:367)
at org.apache.storm.daemon.supervisor$sync_processes.invoke(supervisor.clj:428)
at clojure.core$partial$fn__4527.invoke(core.clj:2492)
at org.apache.storm.event$event_manager$fn__8630.invoke(event.clj:40)
at clojure.lang.AFn.run(AFn.java:22)
at java.lang.Thread.run(Thread.java:745)
2016-09-18 09:47:13.135 o.a.s.util [ERROR] Halting process: ("Error when processing an event")
java.lang.RuntimeException: ("Error when processing an event")
at org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341)
at clojure.lang.RestFn.invoke(RestFn.java:423)
at org.apache.storm.event$event_manager$fn__8630.invoke(event.clj:48)
at clojure.lang.AFn.run(AFn.java:22)
at java.lang.Thread.run(Thread.java:745)

Regards,
Junfeng
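
A note on the WARN entry in the log above: the message reads "Failed to get worker user for ." with nothing where a worker id would normally appear, and the path that could not be opened is the workers-users directory itself. One plausible reading (an assumption, not something confirmed in this thread) is that an empty worker id was appended to the directory path, so the read targeted the directory rather than a per-worker file. The Python sketch below only illustrates that failure mode. The function names and the one-file-per-worker layout are hypothetical and are not Storm's actual code.

```python
import os
import tempfile


def worker_user_file(workers_users_dir: str, worker_id: str) -> str:
    """Hypothetical layout: one small file per worker id under workers-users."""
    return os.path.join(workers_users_dir, worker_id)


def read_worker_user(workers_users_dir: str, worker_id: str):
    """Return the user recorded for worker_id, or None if the file cannot be read."""
    path = worker_user_file(workers_users_dir, worker_id)
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError as e:
        # With an empty worker id, `path` points at the directory itself.
        # On Linux open() then raises IsADirectoryError (a subclass of OSError),
        # the Python counterpart of the JVM's "(Is a directory)" message.
        print(f"Failed to get worker user for {worker_id!r}: {type(e).__name__}")
        return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as data_dir:
        workers_users = os.path.join(data_dir, "workers-users")
        os.mkdir(workers_users)

        # Record a user for one made-up worker id.
        with open(os.path.join(workers_users, "worker-1"), "w") as f:
            f.write("storm\n")

        print(read_worker_user(workers_users, "worker-1"))  # normal lookup
        print(read_worker_user(workers_users, ""))          # empty id hits the directory
```

In this sketch the failed lookup is logged and skipped, which matches the WARN level in the log; the supervisor halt further down comes from the separate NoSuchFileException raised while creating the "resources" symlink.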