[ https://issues.apache.org/jira/browse/HIVE-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649213#action_12649213 ]
Joydeep Sen Sarma commented on HIVE-30:
---------------------------------------

@Ashish - i think what you are saying makes total sense (in terms of managing state for one client/session). but the other angle is that this jsp page becomes the place where i can go and see all running sessions (it's both in the code and one of the features mentioned in the jira description). that's what confuses me. something like show processlist is very useful for admins, but the administrative entity is not clear (unlike in the mysql case). what is the resource that we are administering?

the compounding factor is that there are ways of submitting queries that do not go through the jsp gateway (and there can be multiple jsp gateways), so we are not going to be able to capture all running sessions/queries. i.e., if there's utility in capturing current/historic queries in one place, then we had better have a single server side for all access methods.

also, longer term, i think the actual act of running a hive query is fairly heavyweight (this is just a guess), and there are many data path operations that we would want to move to the client itself. likewise, if someone is extracting bulk data, we would like this (if possible) to be a direct interaction between client and hdfs, with no central session manager in the datapath.

so what would make sense to me is to have a single session manager for all hive access paths (within a deployment, say). cli/jsp/jdbc can all open and close sessions, authenticate, and get queries compiled into physical plans from this session manager (which can also take care of authentication etc.). the centralized session manager would be the administrative control point for the deployment. but the actual execution of the physical plan is then separated from centralized session management.
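
a rough java sketch of the split described above, for illustration only. every name in it (SessionManager, PhysicalPlan, open, compile, close, listSessions, queryHistory) is hypothetical and not an existing hive api, and authentication and real query compilation are stubbed out. it only assumes that the manager hands plans back to the caller instead of running them:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for a compiled query; not a real Hive class.
final class PhysicalPlan {
    final String sessionId;
    final String query;

    PhysicalPlan(String sessionId, String query) {
        this.sessionId = sessionId;
        this.query = query;
    }
}

// Hypothetical central session manager: the one administrative control point
// shared by cli/jsp/jdbc. It tracks sessions and compiles queries, but it
// never executes a plan itself.
class SessionManager {
    // session id -> "user@accessPath", kept in insertion order
    private final Map<String, String> sessions = new LinkedHashMap<>();
    private final List<String> history = new ArrayList<>();

    synchronized String open(String user, String accessPath) {
        // Authentication would happen here; omitted in this sketch.
        String id = UUID.randomUUID().toString();
        sessions.put(id, user + "@" + accessPath);
        return id;
    }

    synchronized PhysicalPlan compile(String sessionId, String query) {
        String owner = sessions.get(sessionId);
        if (owner == null) {
            throw new IllegalStateException("unknown session: " + sessionId);
        }
        // Record the query centrally, whichever access path submitted it.
        history.add(owner + ": " + query);
        return new PhysicalPlan(sessionId, query);
    }

    synchronized void close(String sessionId) {
        sessions.remove(sessionId);
    }

    // Rough analogue of SHOW PROCESSLIST: who is currently connected.
    synchronized List<String> listSessions() {
        return new ArrayList<>(sessions.values());
    }

    synchronized List<String> queryHistory() {
        return new ArrayList<>(history);
    }
}

class SessionManagerDemo {
    public static void main(String[] args) {
        SessionManager manager = new SessionManager();
        String cli = manager.open("alice", "cli");
        String web = manager.open("bob", "jsp");

        PhysicalPlan plan = manager.compile(cli, "SELECT * FROM t");
        // The client would now run `plan` in its own process, talking to
        // map-reduce and hdfs directly; the manager stays out of the datapath.

        System.out.println(manager.listSessions());
        System.out.println(manager.queryHistory());

        manager.close(cli);
        manager.close(web);
    }
}
```

the point of the shape is that listSessions and queryHistory see every access path because everything goes through open and compile, while nothing heavyweight runs inside the manager.
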
cli clients or jsp or jdbc servers would take the physical plans and execute them in their own process (interacting with map-reduce and/or other resources as required).

does this make sense? (I am hoping we can have a single coherent client-server model rather than independent pieces of work that do not mix'n'match with each other). we could start/extend this patch to be the central session manager that the cli could talk to as well (and future jdbc servers could also talk to).

> Hive web interface
> ------------------
>
>                 Key: HIVE-30
>                 URL: https://issues.apache.org/jira/browse/HIVE-30
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Jeff Hammerbacher
>            Assignee: Edward Capriolo
>            Priority: Minor
>         Attachments: HIVE-30.patch
>
>
> Hive needs a web interface. The initial checkin should have:
> * simple schema browsing
> * query submission
> * query history (similar to MySQL's SHOW PROCESSLIST)
> A suggested feature: the ability to have a query notify the user when it's
> completed.
> Edward Capriolo has expressed some interest in driving this process.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.