[jira] Commented: (HIVE-30) Hive web interface

Joydeep Sen Sarma (JIRA) Wed, 19 Nov 2008 15:44:07 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649213#action_12649213
 ]


Joydeep Sen Sarma commented on HIVE-30:
---------------------------------------

@Ashish - i think what you are saying makes total sense (in terms of managing 
state for one client/session). but the other angle is that this jsp page 
becomes the place where i can go and see all running sessions (it's both in the 
code as well as one of the features mentioned in the jira-description). that's 
what confuses me.

something like show processlist is very useful for admins - but the 
administrative entity is not clear (unlike in the mysql case). that's where my 
confusion is - what is the resource that we are administering? the compounding 
factor is that there are ways of submitting queries that do not go through the 
jsp gateway (or that there can be multiple jsp gateways) - so we are not going 
to be able to capture all running sessions/queries. i.e. - if there's utility in 
capturing current/historic queries in one place - then we had better have a 
single server side for all access methods.

also - longer term - i think the actual act of running a hive query is fairly 
heavyweight (this is just a guess) - so there are many data path operations 
that we would want to move to the client itself. also - if someone is 
extracting bulk data - we would like this (if possible) to be a direct 
interaction between client and hdfs and keep any central session manager out 
of this datapath.

so what would make sense to me is to have a single session manager for all hive 
access paths (within a deployment say). cli/jsp/jdbc can all open, close, 
authenticate and get queries compiled into physical plans from this session 
manager (which can also take care of authentication etc.). the centralized 
session manager would be the administrative control point for the deployment. 
but the actual execution of the physical plan is then separated from 
centralized session management. cli clients or jsp or jdbc servers would take 
the physical plans and execute them in their own process (interacting with 
map-reduce and/or other resources as required).
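
To make the proposed split concrete, here is a minimal sketch in Java. It is not code from the HIVE-30 patch or from Hive: `SessionManager`, `PhysicalPlan`, `Client` and all of their methods are hypothetical names invented for illustration, the "plan" is just the query string, and authentication is left out. The only point is the shape: sessions and compilation go through one central object, while execution happens in the caller.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a compiled physical plan (not a Hive class).
final class PhysicalPlan {
    final long sessionId;
    final String query;

    PhysicalPlan(long sessionId, String query) {
        this.sessionId = sessionId;
        this.query = query;
    }
}

// Hypothetical central session manager: one per deployment, shared by
// every access path, and the single administrative control point.
final class SessionManager {
    private long nextId = 1;
    private final Map<Long, String> sessions = new LinkedHashMap<>();
    private final List<String> history = new ArrayList<>();

    synchronized long open(String user, String accessPath) {
        long id = nextId++;
        sessions.put(id, user + " via " + accessPath);
        return id;
    }

    // Compilation is centralized, so every query is recorded here
    // regardless of which access path submitted it.
    synchronized PhysicalPlan compile(long sessionId, String query) {
        if (!sessions.containsKey(sessionId)) {
            throw new IllegalStateException("unknown session " + sessionId);
        }
        history.add(sessions.get(sessionId) + ": " + query);
        return new PhysicalPlan(sessionId, query);
    }

    synchronized void close(long sessionId) {
        sessions.remove(sessionId);
    }

    // Rough analogue of "show processlist": all currently open sessions.
    synchronized List<String> processList() {
        return new ArrayList<>(sessions.values());
    }

    synchronized List<String> queryHistory() {
        return new ArrayList<>(history);
    }
}

// Hypothetical client (cli, jsp gateway or jdbc server). It asks the
// manager for a plan but executes that plan in its own process.
final class Client {
    private final SessionManager manager;
    private final String accessPath;

    Client(SessionManager manager, String accessPath) {
        this.manager = manager;
        this.accessPath = accessPath;
    }

    long open(String user) {
        return manager.open(user, accessPath);
    }

    String run(long sessionId, String query) {
        PhysicalPlan plan = manager.compile(sessionId, query);
        // Placeholder for local execution (map-reduce jobs, hdfs reads, ...).
        return accessPath + " executed: " + plan.query;
    }

    void close(long sessionId) {
        manager.close(sessionId);
    }
}
```

With this shape, a cli client and a jsp gateway that share one `SessionManager` both show up in `processList()`, while the result data never passes through the manager.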

does this make sense? (I am hoping we can have a single coherent client-server 
model rather than independent pieces of work that do not mix'n'match with each 
other). we could start/extend this patch to be the central session manager that 
the cli could talk to as well (and future jdbc servers could also talk to).


> Hive web interface
> ------------------
>
>                 Key: HIVE-30
>                 URL: https://issues.apache.org/jira/browse/HIVE-30
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Jeff Hammerbacher
>            Assignee: Edward Capriolo
>            Priority: Minor
>         Attachments: HIVE-30.patch
>
>
> Hive needs a web interface. The initial checkin should have:
> * simple schema browsing
> * query submission
> * query history (similar to MySQL's SHOW PROCESSLIST)
> A suggested feature: the ability to have a query notify the user when it's 
> completed.
> Edward Capriolo has expressed some interest in driving this process.

