Great, Chris, thanks for the advice!
—T

On Feb 11, 2014, at 9:04 AM, Chris Riccomini <[email protected]> wrote:

> Hey TJ,
> 
> For small containers, you can definitely drop the memory usage. There are
> several things to be aware of when doing this:
> 
> 1. YARN extrapolates virtual memory allocation as a multiple of your
> physical memory (2.1x by default, if memory serves). This means a
> 1G container will give you 2.1G of VM. If you drop the 1G container size,
> the VM limit drops proportionally.
> 2. If your task interacts with disk, you should consider the OS page
> cache, and how much memory you'd like to leave for it. For example, your
> JVM and heap might only use 256M, but you might want the full gig at the
> container level in order to give yourself 768M of page cache for disk IO.
> 3. In practice, going below 256MB on Xmx, and 384MB for
> yarn.container.memory.mb is pretty hard to get right.
> 4. If your job is processing a high throughput stream, you might end up
> using a lot of memory in your eden space even if your task is
> totally stateless. In these scenarios, it is really helpful to use CMS,
> and increase the young gen size.
> 
> The AM actually uses a fair amount of memory because of the dashboard,
> which uses Scalatra and Scalate. These two guys end up chewing through a
> lot of memory when you view the dashboard in YARN. We were running the
> yarn container size at 768MB, and still seeing the NM kill the jobs
> occasionally. I'd recommend leaving the AM as it is, unless you're really
> pressed for memory in your YARN grid.
> 
> Cheers,
> Chris
> 
> On 2/10/14 11:12 PM, "TJ Giuli" <[email protected]> wrote:
> 
>> Folks, does anyone have experience they can share regarding memory
>> allocation for Samza tasks?  Out of the box, it looks like the
>> ApplicationMaster defaults to 1GB of RAM for its container and 1GB per
>> YARN container for each TaskRunner.
>> 
>> Some of my Samza tasks are pretty simple and (I think) use very little
>> runtime memory per partition -- essentially following a pattern of read
>> message, process, commit result to a database or a stream output, repeat.
>> For these kinds of tasks, I'm assuming I can safely scale down the
>> container memory bounds.  What about the ApplicationMaster?  Does it need a
>> full GB per Samza task?  Thanks!
>> --T
> 
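
For reference, the arithmetic behind points 1 and 2 of Chris's reply can be sketched in a few lines of Python. This is only an illustration, not anything Samza or YARN ships: the function name container_budget and its jvm_mb parameter are made up here, and the 2.1 factor is the default of YARN's yarn.nodemanager.vmem-pmem-ratio setting, which a given cluster may override.

```python
# Rough sizing sketch for a YARN container, based on points 1 and 2 above.
# Assumption: 2.1 is the default yarn.nodemanager.vmem-pmem-ratio; check
# your cluster's yarn-site.xml before relying on it.
VMEM_PMEM_RATIO = 2.1


def container_budget(container_mb, jvm_mb, vmem_ratio=VMEM_PMEM_RATIO):
    """Return (virtual memory limit, headroom left for page cache) in MB.

    container_mb: physical memory requested for the container
                  (e.g. the value of yarn.container.memory.mb).
    jvm_mb:       hypothetical estimate of what the JVM and heap really use.
    """
    if jvm_mb > container_mb:
        raise ValueError("JVM footprint exceeds the container size")
    vmem_limit_mb = container_mb * vmem_ratio  # point 1: VM scales with size
    page_cache_mb = container_mb - jvm_mb      # point 2: what is left over
    return vmem_limit_mb, page_cache_mb


if __name__ == "__main__":
    # Compare the default 1G container with the 384MB floor from point 3.
    for container_mb, jvm_mb in [(1024, 256), (384, 256)]:
        vmem, cache = container_budget(container_mb, jvm_mb)
        print(f"{container_mb}MB container: ~{vmem:.0f}MB VM limit, "
              f"{cache}MB left for page cache")
```

The point of running both cases side by side is to show that shrinking a container shrinks the virtual memory limit and the page cache headroom together, which is why a disk-heavy task may want the full gigabyte even with a small heap.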
