Nope, no luck so far. We have upped the number of file descriptors and are having to restart Hive every week or so :(
Any other suggestions would be greatly appreciated.

On 15 Feb 2010, at 14:09, Bennie Schut wrote:

> Did this help? I'm running into a similar problem: slowly leaking
> connections to 50010, and after a Hive restart all is OK again.
>
> Andy Kent wrote:
>> I can try and give it a go. I'm not convinced though, as we are working
>> with CSV files and don't touch sequence files at all at the moment.
>>
>> We are using the Cloudera Ubuntu Packages for Hadoop 0.20.1+133 and Hive 0.40
>>
>> On 25 Jan 2010, at 15:30, Jay Booth wrote:
>>
>>> Actually, we had an issue with this, it was a bug in SequenceFile where if
>>> there were problems opening a file, it would leave a filehandle open and
>>> never close it.
>>>
>>> Here's the patch -- It's already fixed in 0.21/trunk, if I get some time
>>> this week I'll submit it against 0.20.2 -- could you apply this to hadoop
>>> and let me know if it fixes things for you?
>>>
>>> On Mon, Jan 25, 2010 at 10:11 AM, Jay Booth
>>> <[email protected]> wrote:
>>> Yeah, I'd guess that this is a Hive issue, although it could be a
>>> combination.. maybe if you're doing queries and then closing your thrift
>>> connection before reading all results, Hive doesn't know what to do and
>>> leaves the connection open? Once the west coast folks wake up, they might
>>> have a better answer for you than I do.
>>>
>>> On Mon, Jan 25, 2010 at 9:06 AM, Andy Kent
>>> <[email protected]> wrote:
>>> On 25 Jan 2010, at 13:59, Jay Booth wrote:
>>>
>>>> That's the datanode port.. if I had to guess, Hive's connecting to DFS
>>>> directly for some reason (maybe for "select *" queries?) and not finishing
>>>> their reads or closing the connections after.
>>>>
>>> Thanks for the response.
>>>
>>> That's what I was suspecting. I have triple checked our Ruby code and
>>> it is definitely closing its thrift connections properly.
>>>
>>> I'll try running some different queries and see if I can suss out some
>>> examples of which ones are leaky. Is this something that I should post to
>>> Jira or is it a known issue? I can't believe other people haven't noticed
>>> this?
>>>
>>> <SequenceFile.patch>
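
For anyone trying to work out which queries are leaky, one rough approach is to snapshot the connections to the datanode port (50010) before and after each query and compare the counts. Below is a minimal Python sketch, not anything from Hive or Hadoop itself. It assumes text in the usual Linux `netstat -tn` layout; the function name, the sample addresses, and the sample output are all made up for illustration.

```python
from collections import Counter

DATANODE_PORT = 50010  # the datanode port mentioned in this thread


def count_datanode_connections(netstat_output, port=DATANODE_PORT):
    """Count TCP connections whose remote port equals `port`, grouped by state.

    Assumes rows shaped like Linux `netstat -tn` output:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip banner/header lines and anything that is not a TCP row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        foreign_address, state = fields[4], fields[5]
        # Take the text after the last ':' so IPv6-style addresses also work.
        if foreign_address.rsplit(":", 1)[-1] == str(port):
            counts[state] += 1
    return counts


# Hypothetical sample, only to show the expected input shape.
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.5:41234          10.0.0.7:50010          ESTABLISHED
tcp        1      0 10.0.0.5:41250          10.0.0.8:50010          CLOSE_WAIT
tcp        1      0 10.0.0.5:41251          10.0.0.8:50010          CLOSE_WAIT
tcp        0      0 10.0.0.5:22             10.0.0.9:53211          ESTABLISHED
"""

if __name__ == "__main__":
    print(dict(count_datanode_connections(SAMPLE)))
```

Feeding it real `netstat -tn` output captured on the Hive server before and after a single query should show whether that query leaves extra connections behind. A count that keeps growing and never drops back would point at the leak described above.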
