Folks, how are you handling the "productionalization" of your Pig submit nodes?
For our PROD environment, I originally thought we'd just have a few VMs from which Pig jobs would be submitted to our cluster. But on our 8 GB VMs, we often hit heap OOM errors with a relatively small set of approximately 50 analytics jobs. As a short-term solution, we scaled these VMs horizontally, which seems a bit messy to me, since we now have to manage which jobs are executed where. Is this heap footprint (300-400 MB per Pig process) consistent with what you see in your environment?

Norbert
