I had a similar issue and I solved it by setting the worker.heap.memory.mb option.
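
For example, in storm.yaml on each supervisor node (a minimal sketch; 4096 is only an illustrative value, size it to your slots and RAM, and restart the supervisors so it takes effect):

    # storm.yaml (supervisor side); illustrative value only.
    # Heap size in MB for each worker JVM; Storm uses this to build
    # the -Xmx option when the supervisor launches a worker.
    worker.heap.memory.mb: 4096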
On Feb 7, 2017 10:45 AM, "Navin Ipe" <[email protected]> wrote:

Hi,

Even though I ran the topology on a server with 30GB RAM, it still crashed. I had set:

    stormConfig.put(Config.TOPOLOGY_WORKER_CHILDOPTS, "-Xmx" + "15g");

But still, when I look at the workers in htop, their virtual memory is shown as 15G, yet toward the right of the screen, under the command column, it shows "java -Xmx2048m" and a few other options. I assume this is the command Storm used to start the worker.

So how come my memory setting isn't being used by the worker? Why is it still using 2GB instead of 15GB? Also, 25GB of the 30GB was in use. How can that happen when I have only 4 slots and 4 workers running? The exact same topology took up just 5GB on a system with 10GB RAM, where I had configured -Xmx to "2g".

Could you help me understand this?

On Mon, Feb 6, 2017 at 2:29 PM, Navin Ipe <[email protected]> wrote:

Thank you. I've been monitoring it via JConsole, and this is what I see:

    Supervisor used memory:       61MB
    Supervisor committed memory:  171MB
    Supervisor max memory:        239.1MB

    Nimbus used memory:           44.3MB
    Nimbus committed memory:      169.3MB
    Nimbus max memory:            954.7MB

    Zookeeper used memory:        224MB
    Zookeeper committed memory:   529MB
    Zookeeper max memory:         1.9GB

    Worker used memory:           941MB
    Worker committed memory:      1.4GB
    Worker max memory:            1.9GB

So from the look of it, even if the worker memory is managed and kept low, the supervisor can crash because of low memory. The solution appears to be to increase the supervisor memory in storm.yaml, use more RAM, and add swap space.

If you have any other opinions, please let me know.

On Sun, Feb 5, 2017 at 7:10 PM, Andrea Gazzarini <[email protected]> wrote:

Hi Navin,
I think this line is a good starting point for your analysis:

    "There is insufficient memory for the Java Runtime Environment to continue."

I don't believe the JVM surfaces this as a checked exception: in my opinion it belongs to the Error class hierarchy, and that would explain why your catch block is never reached.
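For instance, this small self-contained sketch shows the hierarchy at work (the class name and the simulated throw are only for illustration):

    // ErrorVsException.java
    // OutOfMemoryError extends Error, and Error and Exception are
    // sibling subclasses of Throwable, so catch (Exception) never sees it.
    public class ErrorVsException {
        public static void main(String[] args) {
            try {
                throw new OutOfMemoryError("simulated");
            } catch (Exception ex) {
                // Never reached: an Error is not an Exception.
                System.out.println("caught as Exception: " + ex);
            } catch (Throwable t) {
                // Reached: Throwable covers both Error and Exception.
                System.out.println("caught as Throwable: " + t);
            }
        }
    }

(Catching Throwable in a bolt would technically see it, but it's not a fix: once os::commit_memory fails, the whole process is out of memory.)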
In addition, your assumption could also be right: the code that raises the error could be anywhere in the worker, not necessarily within your class. Unlike ordinary exceptions, memory errors don't have a deterministic point of failure; they depend on the state of the system at that moment.

Please tell us a bit more (or investigate yourself) about your architecture, nodes, hardware resources and anything else that helps us understand your context. Tools like JVisualVM, JConsole and the Storm UI are precious friends in these situations.

Best,
Andrea

On 05/02/17 12:53, Navin Ipe wrote:

Hi,
I have a bolt which sometimes emits around 15000 tuples, and sometimes more than 20000. I think when that happens there's a memory issue and the workers get restarted. This is what worker.log.err contains:

    Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000f1000000, 62914560, 0) failed; error='Cannot allocate memory' (errno=12)
    # There is insufficient memory for the Java Runtime Environment to continue.
    # Native memory allocation (mmap) failed to map 62914560 bytes for committing reserved memory.
    # An error report file with more information is saved as:
    # /home/storm/apache-storm-1.0.0/storm-local/workers/6a1a70ad-d094-437a-a9c5-e837fc1b3535/hs_err_pid2766.log

The odd part is that in all my bolts I have:

    @Override
    public void execute(Tuple tuple) {
        try {
            // ...some code, including the code that emits tuples...
        } catch (Exception ex) {
            logger.info("The exception {}, {}", ex.getCause(), ex.getMessage());
        }
    }

But in the logs I never see the string "The exception". worker.log, however, shows:

    2017-02-05 09:14:01.320 STDERR [INFO] Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000e6f80000, 37748736, 0) failed; error='Cannot allocate memory' (errno=12)
    2017-02-05 09:14:01.320 STDERR [INFO] #
    2017-02-05 09:14:01.330 STDERR [INFO] # There is insufficient memory for the Java Runtime Environment to continue.
    2017-02-05 09:14:01.330 STDERR [INFO] # Native memory allocation (mmap) failed to map 37748736 bytes for committing reserved memory.
    2017-02-05 09:14:01.331 STDERR [INFO] # An error report file with more information is saved as:
    2017-02-05 09:14:01.331 STDERR [INFO] # /home/storm/apache-storm-1.0.0/storm-local/workers/2685b445-c4a9-4f7e-94e1-1ce3fe13de47/hs_err_pid3022.log
    2017-02-05 09:14:06.904 o.a.s.d.worker [INFO] Launching worker for HydraCellGen-138-1486283223 on 3fc3c05e-9769-4033-bf7d-df609d6c4963:6701 with id 575bd7ed-a3fc-4f7f-a7d0-cdd4054c9fc5 and conf {"topology.builtin.metrics.bucket.size.secs" 60, "nimbus.childopts" "-Xmx1024m", ... etc.

These are the settings I'm using for the topology:

    Config stormConfig = new Config();
    stormConfig.setNumWorkers(20);
    stormConfig.setNumAckers(20);
    stormConfig.put(Config.TOPOLOGY_DEBUG, false);
    stormConfig.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 1024);
    stormConfig.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 65536);
    stormConfig.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 65536);
    stormConfig.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, 2);
    stormConfig.put(Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS, 2200);
    stormConfig.put(Config.STORM_ZOOKEEPER_SERVERS, Arrays.asList(new String[]{"localhost"}));
    stormConfig.put(Config.TOPOLOGY_WORKER_CHILDOPTS, "-Xmx" + "2g");

So am I right in assuming the error is not raised in my code but somewhere else in the worker? Do such errors happen when the worker receives more tuples than its queue can hold?
What can I do to avoid this problem?

--
Regards,
Navin
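
For what it's worth, a back-of-the-envelope check against the numbers quoted in this thread (heap ceilings only; each worker JVM also needs off-heap buffers, metaspace and thread stacks beyond -Xmx):

     4 workers x 15 GB (-Xmx15g) = up to 60 GB of heap  vs 30 GB of RAM
    20 workers x  2 GB (-Xmx2g)  = up to 40 GB of heap  vs 10 GB of RAM
                                   (if all 20 requested workers were scheduled)

Either way the JVMs can legitimately try to commit more memory than the machine has, so os::commit_memory fails at whatever allocation arrives next. That is why the crash point looks random, as Andrea notes, and why sizing the worker heap to the machine (worker.heap.memory.mb, as in the first reply) addresses it.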
