This is great info. Thanks a lot for sharing :)
________________________________
From: Paul Ingles <p...@oobaloo.co.uk>
To: user@hive.apache.org
Sent: Wed, May 4, 2011 4:48:20 AM
Subject: Re: HIVE Server multiple instances

For future reference I've posted a little more about our setup here:
http://oobaloo.co.uk/multiple-connections-with-hive

On Tue, May 3, 2011 at 8:01 PM, Paul Ingles <p...@oobaloo.co.uk> wrote:

> Nothing specifically about our Hive setup, although some of us at Forward
> have blogged bits and pieces about Hive + Hadoop and have a few Hadoop/Hive
> related libs on our GitHub account: https://github.com/forward.
>
> I've blogged a few bits (http://www.oobaloo.co.uk/) as has one of my
> colleagues (http://blog.fingertap.org/post/1255463384/hive-thrift-client).
>
> Another colleague also presented a little about our setup during a Hadoop
> meetup last summer
> (http://skillsmatter.com/podcast/home/hadoop-in-context-1591). The numbers
> Andy mentioned will be a little out of date but it does include some
> screenshots of a few of the surrounding apps we built that connect to Hive
> and Hadoop (including a web based Hive query tool + work queue).
>
> I had a quick search through the mailing lists when we had connection
> problems but I think most of it was discussed/resolved during a chat I had
> with Shevek from Karmasphere at a London pub following a Hadoop meetup :)
>
> If you're interested, I've posted a gist (https://gist.github.com/953926)
> that contains our HAProxy config; clients connect to port 10000 and are
> balanced across ports :10001 to :10005 on 2 servers (so actually 10 backend
> servers).
>
> Be happy to talk more about our experience; feel free to ping me an email
> off list if you'd like.
>
> On 3 May 2011, at 19:18, Matthew Rathbone wrote:
>
>> Hey Paul,
>>
>> I'd be very interested in reading about your hadoop/hive setup. Do you
>> have a blog post or anything describing this setup, or some of the issues
>> you've had with hive?
>>
>> --
>> Matthew Rathbone
>> Foursquare | Software Engineer | Server Engineering Team
>> matt...@foursquare.com | @rathboma | 4sq
>>
>> On Tuesday, May 3, 2011 at 2:15 PM, Paul Ingles wrote:
>>
>>> HiveServer does seem to support multiple connections but I think it
>>> still has thread-safety problems
>>> (https://issues.apache.org/jira/browse/HIVE-80).
>>>
>>> We've (www.forward.co.uk) certainly had instability problems with the
>>> thrift server in the past and now run 5 or so instances behind the
>>> HAProxy load-balancer (http://haproxy.1wt.eu/). Since we did that it's
>>> been significantly better.
>>>
>>> I think the JDBC server still operates using thrift to connect to the
>>> HiveServer so I would expect it to have similar problems (but I may have
>>> got that wrong :)
>>>
>>> On 3 May 2011, at 18:59, Matthew Rathbone wrote:
>>>
>>>> Even if it is single threaded it certainly seems to support multiple
>>>> connections.
>>>>
>>>> We run 5 workers all connected at the same time, each executing a
>>>> different query (with a different connection per worker).
>>>>
>>>> Hope that helps
>>>>
>>>> Matthew
>>>>
>>>> On Tuesday, May 3, 2011 at 1:40 PM, V.Senthil Kumar wrote:
>>>>
>>>>> Thanks Matthew. The wiki page
>>>>> http://wiki.apache.org/hadoop/Hive/HiveServer says it's single
>>>>> threaded. I have a queue of queries which gets added to dynamically
>>>>> all the time. By the time I run 1 query using 1 JDBC connection, more
>>>>> queries have been added to the queue and a backlog builds up. That's
>>>>> why I was wondering whether I can run two or more instances to avoid
>>>>> having a big backlog in the queue.
>>>>>
>>>>> ----- Original Message ----
>>>>> From: Matthew Rathbone <matt...@foursquare.com>
>>>>> To: user@hive.apache.org
>>>>> Sent: Tue, May 3, 2011 7:46:49 AM
>>>>> Subject: Re: HIVE Server multiple instances
>>>>>
>>>>> Why would you want to run two? I think it is multithreaded, so you can
>>>>> query it from two different connections
>>>>>
>>>>> --
>>>>> Matthew Rathbone
>>>>> Foursquare | Software Engineer | Server Engineering Team
>>>>> matt...@foursquare.com | @rathboma | 4sq
>>>>>
>>>>> On Monday, May 2, 2011 at 6:41 PM, V.Senthil Kumar wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I have one instance of HIVE JDBC server running on port 10000. Can I
>>>>>> run another instance on a different port? Would it cause a
>>>>>> concurrency issue on the underlying data warehouse files? Please
>>>>>> clarify.
>>>>>>
>>>>>> Thanks,
>>>>>> V.Senthil Kumar
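
For anyone reading this thread later, here is a rough sketch of the fan-out Paul describes: one front port (10000) spread over ports 10001 to 10005 on 2 servers, giving 10 backends. It is only an illustration in Python, not the HAProxy config from the gist. The host names are made up, and round-robin is an assumption, since the thread does not say which balancing algorithm is used.

```python
from itertools import cycle

# Hypothetical host names; the thread only says "2 servers".
HOSTS = ["hive-a.example", "hive-b.example"]

# Ports taken from the thread: each server runs HiveServer on 10001..10005.
FIRST_PORT = 10001
LAST_PORT = 10005


def build_backends(hosts, first_port, last_port):
    """Return every (host, port) pair a balancer could forward to."""
    return [(host, port)
            for host in hosts
            for port in range(first_port, last_port + 1)]


def round_robin(backends):
    """Yield backends in a repeating cycle (assumed policy, not confirmed)."""
    return cycle(backends)


backends = build_backends(HOSTS, FIRST_PORT, LAST_PORT)
picker = round_robin(backends)

# Simulate 12 incoming client connections to the front port.
assignments = [next(picker) for _ in range(12)]

for number, (host, port) in enumerate(assignments, start=1):
    print(f"connection {number} -> {host}:{port}")
```

With this layout each HiveServer instance only handles one of every ten new connections, which is the point of running several instances behind a balancer given the thread-safety concerns mentioned above.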