Agreed on Postgres. It's a lot easier to work with license-wise in Apache projects, and has a lot of the capability we need here, especially if we can find a sensible ORM. Anyone got any thoughts on what would work there?
Simon

> On 2 Aug 2017, at 21:21, Matt Foley <[email protected]> wrote:
>
> Hi Ryan,
> Zookeeper has a default (and seldom changed) max znode size of 1MB, but it is
> “designed to store data on the order of kilobytes in size.”[1] And it’s not
> really intended for frequently-changing data, which is okay here. But I just
> included it for completeness, I’m not advocating for its use here.
>
> I agree with you that the problem, especially because it includes shared
> config, would fit well in a db. I’d suggest you consider PostgreSQL rather
> than MySQL, as postgres is built into Redhat 6 and 7, and Ambari now uses it
> by default, so an available server might be conveniently at hand in most
> deployments. Definitely assume the user will want to use an external db
> instance, rather than one dedicated to this use. Conveniently Postgres also
> has a native REST interface, with the usual authorization options.
>
> Never mind about Ambari Views for now. It’s just a way to get GUI dashboards
> without writing all the infrastructure for it, which as you say is somewhat
> water under the bridge.
> Cheers,
> --Matt
>
> [1] https://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html
>
> On 8/2/17, 12:34 PM, "Ryan Merriman" <[email protected]> wrote:
>
> Matt,
>
> Thank you for the suggestions. I forgot to include Zookeeper. Are there
> any tradeoffs we should be aware of if we decide to use Zookeeper? Are
> there guidelines for how much data can be stored in Zookeeper?
>
> To answer your questions:
>
> 1. I think both use cases make sense so a combination of shared and
>    personal.
> 2. I was planning on managing authorization in the REST layer. For now
>    viewer login auth (which is really REST auth) will suffice but we might
>    consider other methods since authentication is pluggable here.
> 3. I had not considered Ambari Views since this will support an existing
>    UI. How would Ambari Views help us here?
>
> I will proceed initially with a saved search POC using a relational
> database unless you think that is a bad idea or there are other better
> options. Hopefully an example will further the discussion.
>
> Ryan
>
>> On Wed, Jul 26, 2017 at 6:31 PM, Matt Foley <[email protected]> wrote:
>>
>> There’s a couple other places you could put config info (but maybe not
>> saved searches):
>> - Zookeeper
>> - metron-alerts-ui/config.xml or config.json file
>> - the Ambari database, whichever it happens to be
>>
>> Questions that influence the decision include:
>> 1. Should there be one configuration shared among users, or strictly
>>    per-user config? Or a combination of shared and personal?
>> 2. What security do you wish to maintain on changing those settings, both
>>    shared and personal? What authentication/authorization scheme will you
>>    use? Is viewer login auth sufficient for this?
>> 3. Will you assume Ambari exists? Did you consider using Ambari Views as
>>    the basis? (https://cwiki.apache.org/confluence/display/AMBARI/Views)
>>
>> On 7/26/17, 2:54 PM, "Ryan Merriman" <[email protected]> wrote:
>>
>> In anticipation of METRON-988 being merged into master, there will be a
>> need to persist user preferences such as UI layout, saved searches, search
>> history, etc. I think where and how we persist this data should be
>> discussed in order to facilitate a design. This data won't be large in
>> scale and may or may not be relational. The initial features I am aware of
>> don't require a relational model but I'm sure there will be some that do in
>> the future. I'm also assuming this code will live in the REST application
>> but someone correct me if there is a reason to keep it somewhere else.
>>
>> I think it would be preferable to leverage something that is already in our
>> stack and available as a dependency. However I would not be against adding
>> something if it really were the right tool for the job.
>> Assuming others agree we should stick with our current stack, I see these
>> options:
>>
>> - MySQL (or other relational database)
>>   - good fit for the size of data
>>   - relational capabilities
>>   - an ORM framework will be necessary which will increase our
>>     dependencies and complexity
>> - HBase
>>   - client setup and code will likely be simpler and less complex
>>   - limited data model
>> - Elasticsearch
>>   - json is a convenient data model
>>   - we already store user preferences here (Kibana dashboards)
>>   - we have abstracted our search engine interactions in several places
>>     and would have to here too
>>
>> Elasticsearch is out for me because we view search engines as pluggable. I
>> think HBase would be the easiest to implement and get working but I'm
>> worried we'll have similar use cases that won't be a good fit for HBase.
>> In that case we would need to come up with an alternative persistence
>> solution anyways. I think MySQL is a good fit long term but I'm concerned
>> about adding a heavy ORM framework. Also, we can't use Hibernate because
>> it is not license friendly.
>>
>> Does anyone have any thoughts on these options or other ideas?
>>
>> This requirement also brings up another topic that is outside of this
>> discussion. Should we reevaluate our authentication strategy? Currently
>> the REST application uses JDBC for this but if we decide a different
>> mechanism is better then we no longer need a relational database. This
>> might affect our decision to use MySQL for this kind of data persistence.
>>
>> Ryan
