On Wed, Aug 26, 2009 at 3:25 PM, Raghu Murthy<[email protected]> wrote: > Even if we decided to have multiple HiveServers, wouldn't it be possible for > HWI to randomly pick a HiveServer to connect to per query/client? > > On 8/26/09 12:16 PM, "Ashish Thusoo" <[email protected]> wrote: > >> +1 for ajaxing this baby. >> >> On the broader question of whether we should combine HWI and HiveServer - I >> think there are definite deployment and code reuse advantages in doing so, >> however keeping them separate also has the advantage that we can cluster >> HiveServers independently from HWI. Since the HiveServer sits in the data >> path, the independent scaling may have advantages. I am not sure how strong >> of >> an argument that is to not put them together. Simplicity obviously indicates >> that we should have them together. >> >> Thoughts? >> >> Ashish >> >> -----Original Message----- >> From: Edward Capriolo [mailto:[email protected]] >> Sent: Wednesday, August 26, 2009 9:45 AM >> To: [email protected] >> Subject: Re: Adding jar files when running hive in hwi mode or hiveserver >> mode >> >> On Tue, Aug 25, 2009 at 8:13 PM, Vijay<[email protected]> wrote: >>> Yep, I got it and now it works perfectly! I like hwi btw! It >>> definitely makes things easier for a wider audience to try out hive. >>> Your new session result bucket idea is very nice as well. I will keep >>> trying more things and see if anything else comes up but so far it looks >>> great! >>> Thanks Edward! >>> >>> On Tue, Aug 25, 2009 at 7:25 AM, Edward Capriolo >>> <[email protected]> >>> wrote: >>>> >>>> On Tue, Aug 25, 2009 at 10:18 AM, Edward >>>> Capriolo<[email protected]> >>>> wrote: >>>>> On Mon, Aug 24, 2009 at 10:13 PM, Vijay<[email protected]> wrote: >>>>>> Probably spoke too soon :) I added this comment to the JIRA ticket >>>>>> above. >>>>>> >>>>>> Hi, I tried the latest patch on trunk and there seems to be a problem. 
>>>>>> >>>>>> I was interested in using the "add jar " command to add jar files >>>>>> to the path. However, by the time the command flows through the >>>>>> SessionState to the AddResourceProcessor (in >>>>>> >>>>>> ./ql/src/java/org/apache/hadoop/hive/ql/processors/AddResourceProc >>>>>> essor.java), the command word "add" is not being stripped so the >>>>>> resource processor is trying to find a ResourceType of "ADD." >>>>>> >>>>>> I'm not sure if this was an existing bug or was a result of the >>>>>> current set of changes. >>>>>> >>>>>> [ Show > ] >>>>>> Vijay added a comment - 24/Aug/09 07:12 PM Hi, I tried the latest >>>>>> patch on trunk and there seems to be a problem. I was interested >>>>>> in using the "add jar " command to add jar files to the path. >>>>>> However, by the time the command flows through the SessionState to >>>>>> the AddResourceProcessor (in >>>>>> >>>>>> ./ql/src/java/org/apache/hadoop/hive/ql/processors/AddResourceProc >>>>>> essor.java), the command word "add" is not being stripped so the >>>>>> resource processor is trying to find a ResourceType of "ADD." I'm >>>>>> not sure if this was an existing bug or was a result of the >>>>>> current set of changes. >>>>>> On Mon, Aug 24, 2009 at 5:30 PM, Vijay <[email protected]> wrote: >>>>>>> >>>>>>> That's awesome and looks like exactly what I needed. Local file >>>>>>> system requirement is perfectly ok for now. I will check it out right >>>>>>> away! >>>>>>> Hopefully it will be checked in soon. >>>>>>> >>>>>>> Thanks Edward! >>>>>>> >>>>>>> On Mon, Aug 24, 2009 at 5:14 PM, Edward Capriolo >>>>>>> <[email protected]> >>>>>>> wrote: >>>>>>>> >>>>>>>> On Mon, Aug 24, 2009 at 8:09 PM, Prasad >>>>>>>> Chakka<[email protected]> >>>>>>>> wrote: >>>>>>>>> Vijay, there is no solution for it yet. There may be a jira >>>>>>>>> open but AFAIK, no one is working on it. You are welcome to >>>>>>>>> contribute this feature. 
>>>>>>>>> >>>>>>>>> Prasad >>>>>>>>> >>>>>>>>> >>>>>>>>> ________________________________ >>>>>>>>> From: Vijay <[email protected]> >>>>>>>>> Reply-To: <[email protected]> >>>>>>>>> Date: Mon, 24 Aug 2009 16:59:28 -0700 >>>>>>>>> To: <[email protected]> >>>>>>>>> Subject: Re: Adding jar files when running hive in hwi mode or >>>>>>>>> hiveserver mode >>>>>>>>> >>>>>>>>> Hi, is there any solution for this? How does everybody include >>>>>>>>> custom jar files running hive in a non-cli mode? >>>>>>>>> >>>>>>>>> Thanks in advance, >>>>>>>>> Vijay >>>>>>>>> >>>>>>>>> On Sat, Aug 22, 2009 at 6:19 PM, Vijay <[email protected]> wrote: >>>>>>>>> >>>>>>>>> When I run hive in cli mode, I add the hive_contrib.jar file >>>>>>>>> using this >>>>>>>>> command: >>>>>>>>> >>>>>>>>> hive> add jar lib/hive_contrib.jar >>>>>>>>> >>>>>>>>> Is there a way to do this automatically when running hive in >>>>>>>>> hwi or hiveserver modes? Or do I have to add the jar file >>>>>>>>> explicitly to any of the startup scripts? >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> Vijay, >>>>>>>> >>>>>>>> Currently HWI does not support this. The changes in >>>>>>>> https://issues.apache.org/jira/browse/HIVE-716 will make this >>>>>>>> possible (although I did not test but it should work as the cli >>>>>>>> does). The file will have to be in the servers local file >>>>>>>> system. We could probably include 'commons upload' to the web >>>>>>>> interface if there was a need for it. >>>>>>>> >>>>>>>> HIVE-716 should be in trunk soon. It does apply cleanly if its >>>>>>>> something you need today, Edward >>>>>>> >>>>>> >>>>>> >>>>> >>>>> I just committed a new version of the patch. You were correct, the >>>>> clidriver trims the first token off set and add queries hwi was not >>>>> doing that. Also let me know your impressions of HWI. 
>>>>> >>>>> The new features are the 'ResultBucket' a buffer of the last x >>>>> results viewable from the web interface, and the ability to supply >>>>> more then one query at a time. >>>>> >>>>> These two features should add much usability now as you can do >>>>> things like explain, show tables, etc and not have to dump the >>>>> results to a file. >>>>> >>>>> Edward >>>>> >>>> >>>> False statement: >>>>>> I just committed a new version of the patch >>>> >>>> In actuality, I updated the Jira with a new patch. >>>> >>>> It is still early AM. all the gears are not turning yet. >>>> >>>> Edward >>> >>> >> >> Vijay, >> >>>> It definitely makes things easier for a wider audience to try out >>>> hive >> >> That was always the goal. I often wonder which direction we should take HWI >> in. >> Should HWI have some REST-ful stubs to turn it into a remote job submission >> system? >> HiveServer uses thrift and I believe thrift has an HTTP-Transport so you >> might >> not need HWI to provide this. >> >> Should we ajax things like the result bucket or the entire interface so it >> has >> that ooo aaahhh effect? >> >> Really the larger question HWI has it's own multi-session management, >> HiveServer has this as well (now way back when it did not) . Should HWI just >> front end HiveServer? >> >> Does anyone have any thoughts? >> Edward > >
I think Raghu is correct. HiveClient->HiveServer happens over a persistent TCP connection (I think?). If you had a back-end cluster of HiveServers behind a load balancer or proxy with a sticky-session/session-tracking/source-IP policy, HWI would be configured with the virtual IP address of the load balancer and would connect, and stay connected, to a random HiveServer in the farm.

I am naturally partial to the way it is now because I came up with it :) I like the idea of having a REST-ful/XML-RPC or some other web-service-style interface for job submission. My thinking behind HWI has always been KISS: Keep It Simple, Stupid. Anyone should be able to hack a few web pages onto it. Adding Thrift, AJAX, and XML-RPC layers definitely ups the complexity.

I think it makes sense to do HWI->HiveServer. I will have to take a deeper look at what HiveServer and Thrift offer to be sure.

Edward
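
For illustration only, here is a minimal sketch of the "pick a random HiveServer per client and stay on it" idea from Raghu's reply, done on the HWI side instead of in a load balancer. Everything in it is an assumption: the class StickyHiveServerPicker, its methods, and the "host:port" strings are hypothetical and not part of Hive or HWI, and opening the actual Thrift connection is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper (not a Hive class): pins each web session to one HiveServer. */
public class StickyHiveServerPicker {
    // "host:port" entries for the HiveServer farm; assumed to come from configuration.
    private final List<String> servers;
    private final Random random;
    // Remembers which server each session was given, so the choice is sticky.
    private final Map<String, String> assigned = new ConcurrentHashMap<>();

    public StickyHiveServerPicker(List<String> servers, Random random) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("no HiveServers configured");
        }
        this.servers = new ArrayList<>(servers);
        this.random = random;
    }

    /** The first call for a session picks a random server; later calls return the same one. */
    public String serverFor(String sessionId) {
        return assigned.computeIfAbsent(
            sessionId, id -> servers.get(random.nextInt(servers.size())));
    }

    /** Forget a finished session so the map does not grow without bound. */
    public void release(String sessionId) {
        assigned.remove(sessionId);
    }
}
```

A caller would ask serverFor(sessionId) once per query and reuse the returned address for the lifetime of the session. A load balancer with a source-IP or sticky-session policy gives the same effect without any code in HWI, which is closer to the KISS goal.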
