On Sun, Sep 15, 2013 at 7:52 AM, Chris Rohr <[email protected]> wrote:
> I have been doing a lot of thinking on how to make this piece more push
> button (like the rest of Blur) and I think I have come up with a possible
> solution (with Aaron's help) and wanted to run it by everyone before I get
> too far down the path.
>
> Server setup:
>
> I think that we should have a simple Jetty server that serves up the
> initial pages and contains an embeddable db for a very small amount of
> application data (will get to what this is in just a second). The start up
> of the server can be added to the blur start-all if we wanted so that it is
> available as soon as blur starts without extra work.
>

Sounds good.

>
> Features:
>
> I want to run through existing features and explain how I think they should
> work in this version. For the most part I am leaning towards having ~95%
> of the application be in javascript running in the user's browser itself.
>
> Dashboard:
>
> This page currently displays node status information (overall zookeepers,
> controllers, and shards) for multiple instances of blur. This page also
> displays some high level HDFS information but depending on what Hue
> provides we might just remove this part. I think we can build in an
> approach very similar to what we have now, where a small piece of code in
> the Jetty server records node status into the embeddable database. The
> browser would then poll for updates to display.
>

I think that having multiple instances within the application is not totally
necessary. If it's easy then sure, but I think that this is a low priority.

>
> Environment:
>
> This page gives detailed information on node status for a specific blur
> instance. This would use the exact same information as the Dashboard, just
> would display differently for the specific instance.
>

I think that this could be the main page.

>
> Tables:
>
> This page lists the tables that are in a specific version of blur.
> Most of this information today comes from the Blur API itself and now that
> Aaron is generating the thrift api in JS we can just ask for the information
> on demand from the browser. Need to make sure there are no performance
> implications on Blur itself if we are making the call on demand as opposed
> to polling. This page also displays some shard server layout and schema
> information which can all be obtained through the API.
>

Sounds good.

>
> Queries:
>
> This page gives information about recent queries on a specific instance
> that have gone through Blur. This is all achieved through the Blur API so
> the information can be obtained on demand through the browser the same way.
> The only issue that I'm not sure yet how to handle is that Blur currently
> keeps queries around for 2 minutes and the current agent was keeping 30-60
> minutes worth of queries to help troubleshoot some things in the running
> system. Maybe use a hybrid approach with the Jetty server.
>

I think that there are likely a lot of people that will want an audit of the
queries executed against each instance.

>
> Search:
>
> This page allows for searching a specific instance of Blur. I think that
> all of this can be done through the JS API, we will just have to do
> something about the user preference to choose a priority column family and
> maybe saved searches (though I'm thinking we can use browser local storage
> to accomplish this)
>

Local storage sounds ok to me. We could provide a storage API to store in a
database of some sort.

>
> HDFS:
>
> This section contained HDFS stats and metrics as well as a file browser. I
> think we can remove this and defer to tools like Hue.
>

Agreed.

>
> Audit:
>
> This section displays an audit of destructive actions performed through
> this tool (disabling tables, deleting tables, forgetting nodes, etc). This
> has been useful in my experience when a table disappears and we didn't know
> who did it.
> Now this doesn't audit actions taken through the shell or other external
> tools. I would like feedback on what others feel about the usefulness of
> this.
>

I would omit at this point. If we need to add this later we will need to
close all the holes like the thrift api, shell, etc.

>
> Admin:
>
> This allows for controlling users and their roles in the system. We had
> originally done this so that 1. this tool isn't available to just anyone
> (will contain production data), 2. restrict destructive actions, 3. Allows
> for control over who can see certain information (i.e. actual query
> content). We can probably utilize the Jetty server to at least do a basic
> user setup, though once someone has the Blur api in JS they could run it
> themselves.
>

Yeah I would put user context as a low priority for now.

>
> I know this is a lot of information, but I think we need to make this tool
> easier and faster for everyone to use. Please let me know what you think.
>

Thanks Chris!

Aaron

>
> Thanks,
> Chris
>
>
> On Fri, Sep 13, 2013 at 5:50 PM, Chris Rohr <[email protected]> wrote:
>
> > I'm not familiar with it, but does Hue do HDFS capacity and node status?
> > I could definitely see taking out the file browser part.
> >
> >
> > On Fri, Sep 13, 2013 at 8:31 AM, Garrett Barton <[email protected]> wrote:
> >
> >> I think hue covers that functionality pretty well. The only slightly odd
> >> part is afaik hue doesn't talk to multi clusters and blur is typically on
> >> a separate cluster.
> >> On Sep 13, 2013 4:26 AM, "Aaron McCurry" <[email protected]> wrote:
> >>
> >> > At this point I would vote to remove it. If we take HBase as an example, I
> >> > don't think that they provide a way to interact with the FileSystem read or
> >> > write through any kind of web interface.
> >> >
> >> > Aaron
> >> >
> >> >
> >> > On Thu, Sep 12, 2013 at 4:29 PM, Chris Rohr <[email protected]> wrote:
> >> >
> >> > > Hi all,
> >> > >
> >> > > As we look to take the console to the next level, I was wondering what
> >> > > everyone's thoughts are on the usefulness of the HDFS portion of the
> >> > > console. This portion has the following features:
> >> > >
> >> > > 1. File system browser
> >> > >    a. Viewer
> >> > >    b. Upload
> >> > >    c. Rename
> >> > >    d. Delete
> >> > > 2. Node status
> >> > > 3. Capacity status
> >> > >
> >> > > I am just wondering if this should be here or if we should just rip it
> >> > > out and let people use something like Hue for this information.
> >> > >
> >> > > Chris
