Hi all,

When I do a leftOuterJoin(stream, JoinHint.REPARTITION_SORT_MERGE), I’m running into an IOException caused by too many open files.
The slaves in my YARN cluster (each with 48 slots and 320 GB of memory) are currently set up with an open-file limit of 32767, so I really don’t want to crank this up much higher.

In perusing the code, I assume the issue is that SpillingBuffer.nextSegment() can open a writer per segment. So if I’m joining a lot of data (e.g. several TBs), I can wind up with > 32K segments open for writing. Does that make sense?

And is the best solution currently (short of using more, smaller servers to reduce the per-server load) to increase the taskmanager.memory.segment-size configuration setting from its 32 KB default to something larger, so I’d have fewer active segments? If I do that, are there any known side effects I should worry about?

Thanks,

— Ken

PS - A CoGroup is happening at the same time, so it’s possible that’s also hurting (or is the real source of the problem), but the leftOuterJoin failed first.

--------------------------
Ken Krugler
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr
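PPS - For anyone who hits this later: I assume the concrete change would go in flink-conf.yaml along these lines (128kb is just an illustrative bump, not a value I’ve tested):

```yaml
# flink-conf.yaml -- sketch only, value untested.
# The default segment size is 32kb; larger segments mean fewer of them
# for the same amount of managed memory / spilled data, so fewer segment
# writers -- if my reading of SpillingBuffer.nextSegment() is right.
taskmanager.memory.segment-size: 128kb
```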