[jira] Commented: (HIVE-30) Hive web interface

Joydeep Sen Sarma (JIRA) Fri, 21 Nov 2008 09:32:36 -0800

    [ https://issues.apache.org/jira/browse/HIVE-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649712#action_12649712 ]


Joydeep Sen Sarma commented on HIVE-30:
---------------------------------------

Blockers from my side:

hwi shell script: i would like to see this merged with the hive cli shell 
script and written as a generic harness to launch hive utilities. given that 
the bulk of the libraries are common - it seems perfectly fine to add extra 
jars and choose the classname to execute based on the actual utility name 
(cli vs. hwi).

also - i think it will be fairly critical to take in userids and propagate them 
to hive/hadoop (by setting the user.name property). why don't we just replace 
'sessionname' with 'userid'? that should also automatically generate a 
separate log file for each user on the hwi server - so it will be somewhat easy 
to grok the logs if required.
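
A minimal sketch of the userid idea, assuming a per-session property set rather than the JVM-wide system property (System.setProperty would be shared by every session on the hwi server). UserSessionProps and forUser are hypothetical names, not part of the patch or of Hive:

```java
import java.util.Properties;

// Hypothetical helper; class and method names are illustrative only.
class UserSessionProps {
    // Returns properties for one session: user.name is the supplied userid,
    // every other key falls through to the shared base properties.
    static Properties forUser(Properties base, String userId) {
        if (userId == null || userId.trim().isEmpty()) {
            throw new IllegalArgumentException("userid is required");
        }
        Properties p = new Properties(base); // base is consulted as defaults
        p.setProperty("user.name", userId);
        return p;
    }
}
```

Any log path that is templated on user.name would then resolve differently per user, which is the per-user log file effect described above.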

Another thing i just noticed - Hive's current runtime assumes a singleton 
SessionState object. That's just not going to work here (since there's a 
singleton per execution thread now). There are in fact some comments to this 
effect in SessionState.java - we need to make it a thread-local singleton. This 
has to be fixed - otherwise concurrent queries/sessions would be trampling over 
each other. (we can do this in a separate jira - although it would be a blocker 
for this one)
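
To make the thread-local singleton suggestion concrete, here is a rough sketch using java.lang.ThreadLocal. SketchSessionState is a hypothetical stand-in, not the real SessionState class, and the method names are assumptions:

```java
// Hypothetical stand-in for Hive's SessionState; not the real class.
class SketchSessionState {
    // One slot per thread, instead of one static instance shared by all threads.
    private static final ThreadLocal<SketchSessionState> CURRENT =
            new ThreadLocal<SketchSessionState>();

    private final String userId;

    SketchSessionState(String userId) {
        this.userId = userId;
    }

    // Bind a session to the calling thread, e.g. when a query thread starts.
    static void start(SketchSessionState ss) {
        CURRENT.set(ss);
    }

    // Session bound to the calling thread, or null if none was started.
    static SketchSessionState get() {
        return CURRENT.get();
    }

    // Unbind, so a reused thread does not see a finished session.
    static void detach() {
        CURRENT.remove();
    }

    String getUserId() {
        return userId;
    }
}
```

With this shape, code that calls SketchSessionState.get() keeps working unchanged, but two concurrent query threads each see their own session.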

regarding ss.out: in order to capture only data in the results file - please 
set the session to silent mode. otherwise the output will be polluted with 
informational messages. (perhaps this is highlighting that we need to send 
informational messages to a different stream than the actual results - which 
is very doable - but not the way things are set up now)
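
One possible shape for the separate-stream idea, as a sketch only. SketchSessionIo and its members are hypothetical and do not reflect how the session object is laid out today:

```java
import java.io.PrintStream;

// Hypothetical session I/O holder; names are illustrative, not Hive's API.
class SketchSessionIo {
    private final PrintStream out;  // query results only
    private final PrintStream info; // progress and informational messages
    private final boolean silent;   // when true, informational messages are dropped

    SketchSessionIo(PrintStream out, PrintStream info, boolean silent) {
        this.out = out;
        this.info = info;
        this.silent = silent;
    }

    // Result rows always go to the results stream.
    void printResult(String row) {
        out.println(row);
    }

    // Informational messages never touch the results stream.
    void printInfo(String msg) {
        if (!silent) {
            info.println(msg);
        }
    }
}
```

With two streams the results file stays clean even when the session is not silent; silent mode then only controls whether informational messages are emitted at all.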

all of these are really asking the question: how was this tested? both of the 
last two issues are fairly major.

other usability issues that are going to be very important (based on observing 
hipal): one cannot destroy a running session - but one of the most common 
operations that users will want to do is monitor the map-reduce tasks that have 
been spawned by a query and kill them (for example - if the job is running too 
long or the jobconf parameter settings need to be fixed).


Good to have things (in decreasing order of importance):
- regarding reloading HiveConf - if schema browsing is not associated with a 
session - then the same hiveconf can be cached and re-used. minor point - but 
loading the hiveconf is expensive enough that i think you won't be happy if 
this tool becomes really popular :-)
- any reason why QUERY_SET etc. should not be an enum type?
- spell check clientDestory
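
On the enum question in the list above, a sketch of what that could look like. Only QUERY_SET is named in this comment; the other two constants and the isFinished helper are placeholders invented for illustration:

```java
// Hypothetical enum; only QUERY_SET comes from the review comment.
enum SketchQueryState {
    QUERY_SET,       // named in the review comment
    EXAMPLE_RUNNING, // placeholder, not taken from the patch
    EXAMPLE_DONE;    // placeholder, not taken from the patch

    // Behavior can live on the type, which plain int constants cannot offer.
    boolean isFinished() {
        return this == EXAMPLE_DONE;
    }
}
```

Compared with int constants, an enum gives type-checked parameters and readable names in logs for free.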

> Hive web interface
> ------------------
>
>                 Key: HIVE-30
>                 URL: https://issues.apache.org/jira/browse/HIVE-30
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Jeff Hammerbacher
>            Assignee: Edward Capriolo
>            Priority: Minor
>         Attachments: HIVE-30.patch
>
>
> Hive needs a web interface. The initial checkin should have:
> * simple schema browsing
> * query submission
> * query history (similar to MySQL's SHOW PROCESSLIST)
> A suggested feature: the ability to have a query notify the user when it's 
> completed.
> Edward Capriolo has expressed some interest in driving this process.

