Hi all,

Any tips for determining the heap size for a node's single JVM?

> On Oct 5, 2015, at 5:25 AM, anshu shukla <[email protected]> wrote:
>
> I was also facing the same issue of balancing the latency and tradeoff.
> Got a nice discussion here.
>
> Just one query: how can we map
> 1. the number of workers to the number of cores
> 2. the number of slots on one machine to the number of cores on that machine
>
>> On Sun, Oct 4, 2015 at 2:00 AM, Kashyap Mhaisekar <[email protected]>
>> wrote:
>> Thanks guys.
>> So when you say one JVM per node, does it mean one port, say 6700, on
>> each machine, to which we assign a large amount of heap?
>> So in this case, it translates into 5 workers (one per machine) with at
>> least 4g heap each and all bolts spread across these 5 workers?
>>
>> Is there a guideline on how I should arrive at the parallelism hints of
>> the bolts themselves? I mean, when complete latency at the spout is higher
>> but execute latencies at the bolts are very small...
>>
>> Will jump into the links right away.
>>
>> Thanks
>> Kashyap
>>
>>> On Oct 3, 2015 12:00 PM, "Michael Vogiatzis" <[email protected]>
>>> wrote:
>>> I will agree with Javier, one JVM per node should reduce the number of
>>> messages that need to be serialized.
>>>
>>> For tuning Storm topologies you may find the following links useful:
>>>
>>> https://gist.github.com/mrflip/5958028
>>> https://wassermelonemann.wordpress.com/2014/01/22/tuning-storm-topologies/
>>> Talk:
>>> http://demo.ooyala.com/player.html?width=640&height=360&embedCode=Q1eXg5NzpKqUUzBm5WTIb6bXuiWHrRMi&videoPcode=9waHc6zKpbJKt9byfS7l4O4sn7Qn
>>>
>>> Cheers,
>>> Michael
>>> @mvogiatzis
>>>
>>>
>>>> On Sat, 3 Oct 2015 at 14:04 Javier Gonzalez <[email protected]> wrote:
>>>> I would suggest sticking with a single worker per machine. It makes memory
>>>> allocation easier and it makes inter-component communication much more
>>>> efficient. Configure the executors with your parallelism hints to take
>>>> advantage of all your available CPU cores.
>>>>
>>>> Regards,
>>>> JG
>>>>
>>>>> On Sat, Oct 3, 2015 at 12:10 AM, Kashyap Mhaisekar <[email protected]>
>>>>> wrote:
>>>>> Hi,
>>>>> I was trying to come up with an approach to evaluate the parallelism
>>>>> needed for a topology.
>>>>>
>>>>> Assume I have 5 machines with 8 cores and 32 GB each, and my topology
>>>>> has one spout and 5 bolts.
>>>>>
>>>>> 1. Define one worker port per CPU to start off (= 8 workers per machine,
>>>>> i.e. 40 workers overall).
>>>>> 2. If each worker runs one executor per component, that translates to
>>>>> 6 executors per worker, which is 40 x 6 = 240 executors.
>>>>> 3. Of this, if the bolt logic is CPU intensive, then leave the parallelism
>>>>> hint at 40 (total workers), else increase the parallelism hint beyond 40
>>>>> till you hit a number beyond which there is no more visible performance
>>>>> gain.
>>>>>
>>>>> Does this look right?
>>>>>
>>>>> Thanks
>>>>> Kashyap
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Javier González Nicolini
>>>
>>> --
>>> Michael Vogiatzis
>>> Twitter: @mvogiatzis
>>> http://micvog.com/
>
>
>
> --
> Thanks & Regards,
> Anshu Shukla
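
For reference, the arithmetic in Kashyap's original proposal can be set next to the one-worker-per-machine layout Javier suggests. This is only a rough sketch in Python, not Storm code: the layout() helper is made up for illustration, and the parallelism hint of 8 per component in the second case is an assumed starting point that nobody in the thread proposed.

```python
# Cluster and topology sizes as stated in the thread.
MACHINES = 5
CORES_PER_MACHINE = 8
COMPONENTS = 6  # 1 spout + 5 bolts


def layout(workers_per_machine, executors_per_component):
    """Return total workers, total executors, and executors per core.

    Hypothetical helper for illustration; it only does the counting.
    """
    total_workers = MACHINES * workers_per_machine
    total_executors = COMPONENTS * executors_per_component
    total_cores = MACHINES * CORES_PER_MACHINE
    return {
        "workers": total_workers,
        "executors": total_executors,
        "executors_per_core": total_executors / total_cores,
    }


# Kashyap's proposal: one worker per core, one executor per component per
# worker, which is a parallelism hint of 40 for every component.
per_core = layout(workers_per_machine=8, executors_per_component=40)

# Javier's suggestion: one worker per machine. The hint of 8 per component
# is an assumption for this sketch, not a value from the thread.
per_machine = layout(workers_per_machine=1, executors_per_component=8)

print(per_core)
print(per_machine)
```

The first layout gives 40 workers and 240 executors, matching the numbers in the original mail. The second gives 5 workers, and the hints would then be tuned per component until the 40 cores are kept busy.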
