Nope, no luck so far.

We have raised the file descriptor limit, but are still having to restart Hive 
every week or so :(

Any other suggestions would be greatly appreciated.
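
A rough sketch of how the connections to port 50010 could be counted between 
restarts, to see whether they only ever grow. It assumes Linux `netstat -tn` 
output (Proto, Recv-Q, Send-Q, local address, foreign address, state), and the 
function names are made up for illustration.

```python
from collections import Counter
import subprocess

DATANODE_PORT = 50010  # the datanode port discussed in this thread


def count_datanode_connections(netstat_output, port=DATANODE_PORT):
    """Count TCP connections whose remote port is `port`, grouped by state.

    Assumes the column layout printed by `netstat -tn` on Linux:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State
    """
    counts = Counter()
    suffix = ":%d" % port
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip the header lines and anything that is not a TCP entry.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        foreign_address, state = fields[4], fields[5]
        if foreign_address.endswith(suffix):
            counts[state] += 1
    return counts


def live_counts(port=DATANODE_PORT):
    """Run netstat on this machine and count connections (hypothetical helper)."""
    output = subprocess.check_output(["netstat", "-tn"]).decode()
    return count_datanode_connections(output, port)
```

Calling `live_counts()` from cron on the Hive server and logging the result 
would show the trend. Sockets in CLOSE_WAIT are ones the remote end has closed 
but the local process has not, so a steadily growing CLOSE_WAIT count would 
point at connections that are never being closed on the Hive side.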

On 15 Feb 2010, at 14:09, Bennie Schut wrote:

> Did this help? I'm running into a similar problem: slowly leaking 
> connections to 50010, and after a Hive restart all is OK again.
> 
> Andy Kent wrote:
>> I can try and give it a go. I'm not convinced though, as we are working 
>> with CSV files and don't touch sequence files at all at the moment.
>> 
>> We are using the Cloudera Ubuntu packages for Hadoop 0.20.1+133 and Hive 0.4.0.
>> 
>> 
>> On 25 Jan 2010, at 15:30, Jay Booth wrote:
>> 
>> 
>>> Actually, we had an issue with this, it was a bug in SequenceFile where if 
>>> there were problems opening a file, it would leave a filehandle open and 
>>> never close it.
>>> 
>>> Here's the patch -- It's already fixed in 0.21/trunk, if I get some time 
>>> this week I'll submit it against 0.20.2 -- could you apply this to hadoop 
>>> and let me know if it fixes things for you?
>>> 
>>> On Mon, Jan 25, 2010 at 10:11 AM, Jay Booth 
>>> <[email protected]> wrote:
>>> Yeah, I'd guess that this is a Hive issue, although it could be a 
>>> combination..  maybe if you're doing queries and then closing your thrift 
>>> connection before reading all results, Hive doesn't know what to do and 
>>> leaves the connection open?  Once the west coast folks wake up, they might 
>>> have a better answer for you than I do.
>>> 
>>> 
>>> On Mon, Jan 25, 2010 at 9:06 AM, Andy Kent 
>>> <[email protected]> wrote:
>>> On 25 Jan 2010, at 13:59, Jay Booth wrote:
>>> 
>>> 
>>>> That's the datanode port..  if I had to guess, Hive's connecting to DFS 
>>>> directly for some reason (maybe for "select *" queries?) and not finishing 
>>>> its reads or closing the connections afterwards.
>>>> 
>>> Thanks for the response.
>>> 
>>> That's what I was suspecting. I have triple-checked our Ruby code and 
>>> it is definitely closing its Thrift connections properly.
>>> 
>>> I'll try running some different queries and see if I can suss out some 
>>> examples of which ones are leaky. Is this something that I should post to 
>>> Jira, or is it a known issue? I can't believe other people haven't noticed 
>>> this.
>>> 
>>> 
>>> <SequenceFile.patch>
>>> 
>> 
>> 
> 
> 
