I believe what you're looking for is being worked on and tracked here: http://issues.apache.org/jira/browse/HIVE-80
On 9/8/09 6:37 PM, "Vijay" <[email protected]> wrote:

> I get that HWI does manage sessions but it does that leveraging the internal
> functionality of the "server." One usage pattern I'd like is some kind of a
> "job" API. What I mean by that is an API that lets us simply submit a query,
> get some kind of "job id," and leave. After that we use other APIs to query
> the job status, kill it, get the output once it is done, etc. If we have a
> simple API like this and the semantics to support this within hive, then the
> UI can be completely decoupled and be as stateless as it can (using vanilla
> apache+php as an example, we can't really do threads or stay resident after
> submitting a job). Does something like this exist either within hive or at the
> hadoop level? It seems to me maybe this is something that needs to be built
> first.
>
> Thanks,
> Vijay
>
> On Tue, Sep 8, 2009 at 2:52 PM, Edward Capriolo <[email protected]> wrote:
>> On Tue, Sep 8, 2009 at 5:15 PM, Royce Rollins <[email protected]> wrote:
>>>> OK I see. I just looked at the code in HWISessionManager.java. So it looks
>>>> like either I will have to write my own ruby HWISessionManager that manages
>>>> sessions through thrift or expose the existing HWISessionManager via some
>>>> web service interface. Has anyone done this?
>>>>
>>>> Royce
>>>>
>>>>
>>>> On 9/8/09 1:47 PM, "Edward Capriolo" <[email protected]> wrote:
>>>>
>>>>>> On Tue, Sep 8, 2009 at 4:38 PM, Vijay <[email protected]> wrote:
>>>>>>>> Sorry to inject into this thread but I have the same problem (only I'm
>>>>>>>> trying to use the thrift PHP libraries from apache-php scripts). The
>>>>>>>> problem with this approach is that the http request cannot run
>>>>>>>> indefinitely as the server is executing a query. Are there any
>>>>>>>> solutions for this?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vijay
>>>>>>>>
>>>>>>>> On Tue, Sep 8, 2009 at 1:35 PM, Royce Rollins <[email protected]>
>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Raghu,
>>>>>>>>>> Thanks for the quick response.
>>>>>>>>>> Yes. My application is web based so instead of having to build some
>>>>>>>>>> kind of session model myself for queries that might take a while,
>>>>>>>>>> I'd like to use a session model in the hive service.
>>>>>>>>>>
>>>>>>>>>> Royce
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 9/8/09 1:32 PM, "Raghu Murthy" <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>>> Our model so far has been to create a new connection to the hive
>>>>>>>>>>>> thrift server per session. Is there anything specific you are
>>>>>>>>>>>> looking for in sessions?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 9/8/09 1:06 PM, "Royce Rollins" <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> I'm currently working on an application that connects to hive via
>>>>>>>>>>>> the thrift ruby libraries.
>>>>>>>>>>>>
>>>>>>>>>>>> Does hive support creation of sessions using those libraries? If
>>>>>>>>>>>> so, how?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Royce
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>> Royce,
>>>>>>
>>>>>> The Hive Web Interface deals with this by having a threaded object
>>>>>> (HWISessionManager) in the Web application scope. I am not sure if PHP
>>>>>> has any equivalent to threading and Application Scope.
>>>>>>
>>>>>> Edward
>>>>
>>>>
>>
>> Someone correct me if I am wrong.
>>
>> Royce,
>>
>> You may be able to get at this another way. From my understanding, the
>> internal hive web interface used at facebook would spawn ` bin/hive -e
>> 'INSERT INTO X select * FROM`. All results were written to a hive
>> table.
>>
>> Doing it this way gives you no way to interact with the query and
>> 'stream' the result, so you can't really use 'fetchOne()' or
>> 'fetchAll()' but you could start a query and set flags on completion.
>>
>> As for web interface, we just had some talks, and one of the things I
>> was looking to do was create some type of web service style bindings.
>> (We would also like to have HWI talk to Thrift and have thrift be the
>> code path for everything). However, if we do make some web server
>> style bindings they would really be independent of the back end. Do
>> you want to work on this? I would like to open a Jira and tackle the
>> issue.
>>
>>
>> The big picture here is that we need a 'state holder'. That is really
>> what HWI is. You create a session, detach from it, and optionally
>> check on it later. If an application needs that pattern, how should it
>> handle it?
>>
>> One way to tackle this is
>>
>> INSERT INTO file 'hdfs://path/to/file' select * FROM XXX' &
>>
>> then have your client 'tail' the hdfs://path/to/file or record the
>> last position it saw. I guess the big question is dealing with
>> streaming results. HWI manages the session for you and writes the
>> results to a local file, (and the new SessionBucket
>>
>> What is the usage pattern you need?
>

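For illustration only, the "job" pattern discussed in the quoted thread (submit a query, get a job id, leave, then check status, kill, or fetch output from a later stateless request) could be sketched roughly as follows. This is not an existing Hive API. It simply spawns the CLI in the background, as with the `bin/hive -e` approach mentioned above, and keeps all state in files so that no process has to stay resident. The names `JOB_ROOT`, `submit`, `status`, `kill` and `read_new` are hypothetical, the job directory location is an assumption, and the sketch assumes a POSIX host with `sh` available.

```python
import os
import shlex
import signal
import subprocess
import tempfile
import uuid
from pathlib import Path

# Hypothetical location for per-job state; any shared directory would do.
JOB_ROOT = Path(tempfile.gettempdir()) / "hive-jobs"


def submit(query, hive_cmd=("bin/hive", "-e")):
    """Start the query in a detached process and return a job id at once."""
    job_id = uuid.uuid4().hex
    job_dir = JOB_ROOT / job_id
    job_dir.mkdir(parents=True)
    q = lambda p: shlex.quote(str(p))
    command = " ".join(shlex.quote(part) for part in (*hive_cmd, query))
    # The exit code file is the "flag on completion". It is written under a
    # temporary name and renamed so a poller never reads a partial file.
    wrapper = (
        f"{command} > {q(job_dir / 'stdout')} 2> {q(job_dir / 'stderr')}; "
        f"echo $? > {q(job_dir / 'exit_code.tmp')}; "
        f"mv {q(job_dir / 'exit_code.tmp')} {q(job_dir / 'exit_code')}"
    )
    proc = subprocess.Popen(
        ["sh", "-c", wrapper],
        stdin=subprocess.DEVNULL,
        start_new_session=True,  # own session and process group
    )
    (job_dir / "pid").write_text(str(proc.pid))
    return job_id


def status(job_id):
    """Return 'running', 'succeeded', 'failed' or 'killed' from the files."""
    job_dir = JOB_ROOT / job_id
    flag = job_dir / "exit_code"
    if flag.exists():
        return "succeeded" if flag.read_text().strip() == "0" else "failed"
    if (job_dir / "killed").exists():
        return "killed"
    return "running"


def kill(job_id):
    """Signal the local client's process group; leaves a 'killed' marker."""
    job_dir = JOB_ROOT / job_id
    pid = int((job_dir / "pid").read_text())
    (job_dir / "killed").write_text("")
    try:
        # With start_new_session=True the wrapper's pid is also its group id.
        os.killpg(pid, signal.SIGTERM)
    except ProcessLookupError:
        pass  # already gone


def read_new(job_id, offset=0):
    """Return output written since `offset`, plus the new offset to record."""
    with open(JOB_ROOT / job_id / "stdout", "rb") as fh:
        fh.seek(offset)
        chunk = fh.read()
    return chunk.decode("utf-8", errors="replace"), offset + len(chunk)
```

A web request would call `submit(...)`, store the returned id, and return immediately. Later requests call `status(job_id)` and `read_new(job_id, last_offset)`, which is the "record the last position it saw" idea applied to a local file rather than HDFS. Note that `kill` only signals the local client process; whether work already running on the cluster stops as well is outside what this sketch covers.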