For future reference I've posted a little more about our setup here: http://oobaloo.co.uk/multiple-connections-with-hive
On Tue, May 3, 2011 at 8:01 PM, Paul Ingles <p...@oobaloo.co.uk> wrote:

> Nothing specifically about our Hive setup although some of us at Forward
> have blogged bits and pieces about Hive + Hadoop and have a few Hadoop/Hive
> related libs on our GitHub account: https://github.com/forward.
>
> I've blogged a few bits (http://www.oobaloo.co.uk/) as has one of my
> colleagues (http://blog.fingertap.org/post/1255463384/hive-thrift-client).
>
> Another colleague also presented a little about our setup during a Hadoop
> meetup last summer
> (http://skillsmatter.com/podcast/home/hadoop-in-context-1591). The numbers
> Andy mentioned will be a little out of date but it does include some
> screenshots of a few of the surrounding apps we built that connect to Hive
> and Hadoop (including a web based Hive query tool + work queue).
>
> I had a quick search through the mailing lists when we had connection
> problems but I think most of it was discussed/resolved during a chat I had
> with Shevek from Karmasphere at a London pub following a Hadoop meetup :)
>
> If you're interested, I've posted a gist (https://gist.github.com/953926)
> that contains our HAProxy config; clients connect to 10000 and are balanced
> between :10001 and :10005 on 2 servers (so actually 10 backend servers).
>
> Be happy to talk more about our experience - feel free to ping me an email
> off list if you'd like.
>
>
> On 3 May 2011, at 19:18, Matthew Rathbone wrote:
>
> > Hey Paul,
> >
> > I'd be very interested in reading about your hadoop/hive setup, do you
> > have a blog post or anything describing this setup, or some of the issues
> > you've had with hive?
> >
> > --
> > Matthew Rathbone
> > Foursquare | Software Engineer | Server Engineering Team
> > matt...@foursquare.com | @rathboma | 4sq
> >
> > On Tuesday, May 3, 2011 at 2:15 PM, Paul Ingles wrote:
> >> HiveServer does seem to support multiple connections but I think it still
> >> has thread-safety problems (https://issues.apache.org/jira/browse/HIVE-80).
> >>
> >> We've (www.forward.co.uk) certainly had instability problems with the
> >> thrift server in the past and now run 5 or so instances behind the HAProxy
> >> load-balancer (http://haproxy.1wt.eu/). Since we did that it's been
> >> significantly better.
> >>
> >> I think the JDBC server still operates using thrift to connect to the
> >> HiveServer so I would expect it to have similar problems (but I may have got
> >> that wrong :)
> >>
> >>
> >> On 3 May 2011, at 18:59, Matthew Rathbone wrote:
> >>
> >>> Even if it is single threaded it certainly seems to support multiple
> >>> connections.
> >>>
> >>> We run 5 workers all connected at the same time executing a different
> >>> query each (with a different connection per worker).
> >>>
> >>> Hope that helps
> >>>
> >>> Matthew
> >>> On Tuesday, May 3, 2011 at 1:40 PM, V.Senthil Kumar wrote:
> >>>> Thanks Matthew. The wiki page
> >>>> http://wiki.apache.org/hadoop/Hive/HiveServer says
> >>>> it's single threaded. I have a queue of queries which gets added to
> >>>> dynamically all the time. By the time I run 1 query using 1 JDBC
> >>>> connection, more queries have been added to the queue and a backlog
> >>>> builds up. So that's why I was wondering whether I can run two or more
> >>>> instances to avoid having a big backlog in the queue.
> >>>>
> >>>>
> >>>>
> >>>> ----- Original Message ----
> >>>> From: Matthew Rathbone <matt...@foursquare.com>
> >>>> To: user@hive.apache.org
> >>>> Sent: Tue, May 3, 2011 7:46:49 AM
> >>>> Subject: Re: HIVE Server multiple instances
> >>>>
> >>>> Why would you want to run two?
> >>>> I think it is multithreaded, so you can query it
> >>>> from two different connections
> >>>>
> >>>> --
> >>>> Matthew Rathbone
> >>>> Foursquare | Software Engineer | Server Engineering Team
> >>>> matt...@foursquare.com | @rathboma | 4sq
> >>>>
> >>>> On Monday, May 2, 2011 at 6:41 PM, V.Senthil Kumar wrote:
> >>>>> Hello,
> >>>>>
> >>>>> I have one instance of HIVE JDBC server running on port 10000. Can I
> >>>>> run another instance on different port? Would it cause a concurrency
> >>>>> issue on the underlying data warehouse files? Please clarify.
> >>>>>
> >>>>> Thanks,
> >>>>> V.Senthil Kumar
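
For anyone finding this thread later, here is a rough sketch of the pattern discussed above: a backlog of queries drained by several workers, each with its own connection, spread over several HiveServer instances. The host names `hive-a` and `hive-b`, and the `run_query` function, are assumptions made up for illustration; `run_query` is a stand-in for a real Thrift or JDBC call and nothing here actually talks to Hive. Note that in the setup described in the thread HAProxy does the balancing and clients only connect to port 10000, whereas this sketch has the client pick a backend itself.

```python
import itertools
import queue
import threading

# Hypothetical backends: 2 hosts x ports 10001-10005 (10 instances in total).
BACKENDS = [(host, port)
            for host in ("hive-a", "hive-b")
            for port in range(10001, 10006)]


def run_query(backend, sql):
    """Hypothetical stand-in for a real HiveServer client call."""
    host, port = backend
    return f"{host}:{port} ran {sql!r}"


def drain(queries, workers=5):
    """Run every query with `workers` threads, one backend per worker."""
    pending = queue.Queue()
    for sql in queries:
        pending.put(sql)

    results = []
    lock = threading.Lock()
    backend_cycle = itertools.cycle(BACKENDS)

    def worker(backend):
        # Each worker keeps to its own backend, so no connection is shared.
        while True:
            try:
                sql = pending.get_nowait()
            except queue.Empty:
                return
            outcome = run_query(backend, sql)
            with lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker, args=(next(backend_cycle),))
               for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `drain(["SELECT 1", "SELECT 2"], workers=2)` returns one result string per query; the order depends on thread scheduling.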