Couldn’t agree with you more, Otto! On the perms / ACLs / AXOs / groups / users etc. concerns though, there are other Apache projects (such as Ranger) which have already done a lot of the hard thinking and architecture / data structure / admin UI and persistence pieces for us, so I’d say we lean on them before designing our own approach to IAM.

Simon

> On 2 Feb 2018, at 13:22, Otto Fowler <ottobackwa...@gmail.com> wrote:
>
> Fair enough, I don’t have a preference. I think my point is that we need to
> understand the use cases we can think of more, especially if we are going to
> be having permissions, grouping and CRUD around that, and preloading, before
> just throwing everything in RDBMS -or- HBase.
>
> On February 2, 2018 at 08:08:24, Simon Elliston Ball
> (si...@simonellistonball.com) wrote:
>
>> True, and that is a requirement I’ve heard a lot (standard views or field
>> sets in shared sets of saved search for example). That would definitely rule
>> out sticking with the current approach (browser local storage, per Casey’s
>> suggestion below).
>>
>> That said, I’m not sure that changes my views on RDBMS. There is an argument
>> that a single query from RDBMS could return a set of group prefs with a user
>> overlay, but that’s not that much better than pulling groups and overwriting
>> the maps clientside with user, from the key value store. We’re not talking
>> about huge amounts of preference data here. I could be swayed the other way
>> if we were to use the RDBMS as a canonical store for user and group
>> information (we use it for users right now, in a really not great way) but I
>> would much rather see us plug in to the Hadoop ecosystem and use something
>> like Ranger to sync users, or an LDAP source directly for user and group
>> data, because I suspect no one wants to have to administer a separate user
>> database for Metron and open up the resulting IAM security hole we currently
>> have (on that, let’s at least stop storing plain text passwords!) /rant.
>>
>> If anything I would like to see the current RDBMS dependency come out to
>> reduce the overall complexity, unless we have a use case that genuinely
>> benefits from a normalised data structure, or from SQL access patterns.
>>
>> In short, I would still go with LDAP / Ranger for users and groups, and
>> instead of adding an RDBMS, using group prefs and user prefs in the existing
>> KV store (HBase) to reduce the operational maintenance burden on the
>> platform.
>>
>> Simon
>>
>>> On 2 Feb 2018, at 12:50, Otto Fowler <ottobackwa...@gmail.com> wrote:
>>>
>>> It is not uncommon to want to have ‘shared’ preferences or setups. Think
>>> of shared dashboards or queries vs. personal version in jira. Would RDBMS
>>> help with that?
>>>
>>> On February 2, 2018 at 07:17:04, Simon Elliston Ball
>>> (si...@simonellistonball.com) wrote:
>>>
>>>> Introducing a RDBMS to the stack seems unnecessary for this.
>>>>
>>>> If we consider the data access patterns for user profiles, we are unlikely
>>>> to query into them, or indeed do anything other than look them up, or
>>>> write them out by a username key. To that end, using an ORM to translate
>>>> a nested config object into a load of tables seems to introduce complexity
>>>> and brittleness we then have to take away through relying on relational
>>>> consistency models. We would also end up with, as Mike points out, a whole
>>>> new set of disk deployment patterns and a bunch of additional DBA ops
>>>> process requirements for every install.
>>>>
>>>> Since the access pattern is almost entirely key => value, HBase seems a
>>>> good option (because we already have it there, it would be kinda crazy at
>>>> this scale if we didn’t already have it) or arguably ZooKeeper, but that
>>>> might be at the other end of the scale argument. I’d even go as far as to
>>>> suggest files on HDFS to keep it simple.
>>>>
>>>> Simon
>>>>
>>>> > On 1 Feb 2018, at 23:24, Michael Miklavcic <michael.miklav...@gmail.com> wrote:
>>>> >
>>>> > Personally, I'd be in favor of something like MariaDB as an open source
>>>> > repo. Or any other ANSI SQL store. On the positive side, it should mesh
>>>> > seamlessly with ORM tools. And the schema for this should be pretty
>>>> > vanilla, I'd imagine. I might even consider skipping ORM for straight
>>>> > JDBC and simple command scripts in Java for something this small. I'm not
>>>> > worried so much about migrations of this sort. Large scale DBs can get
>>>> > involved with major schema changes, but that's usually when the datastore
>>>> > is a massive set of tables with complex relationships, at least in my
>>>> > experience.
>>>> >
>>>> > We could also use HBase, which probably wouldn't be that hard either,
>>>> > but there may be more boilerplate to write for the client as compared to
>>>> > standard SQL. But I'm assuming we could reuse a fair amount of existing
>>>> > code from our enrichments. One additional reason in favor of HBase might
>>>> > be data replication. For a SQL instance we'd probably recommend a RAID
>>>> > store or backup procedure, but we get that pretty easy with HBase too.
>>>> >
>>>> > On Feb 1, 2018 2:45 PM, "Casey Stella" <ceste...@gmail.com> wrote:
>>>> >
>>>> >> So, I'll answer your question with some questions:
>>>> >>
>>>> >> - No matter the data store we use upgrading will take some care, right?
>>>> >> - Do we currently depend on a RDBMS anywhere? I want to say that we do
>>>> >> in the REST layer already, right?
>>>> >> - If we don't use a RDBMS, what's the other option? What are the pros
>>>> >> and cons?
>>>> >> - Have we considered non-server offline persistent solutions (e.g.
>>>> >> https://www.html5rocks.com/en/features/storage)?
>>>> >>
>>>> >> On Thu, Feb 1, 2018 at 9:11 AM, Ryan Merriman <merrim...@gmail.com> wrote:
>>>> >>
>>>> >>> There is currently a PR up for review that allows a user to configure
>>>> >>> and save the list of facet fields that appear in the left column of the
>>>> >>> Alerts UI: https://github.com/apache/metron/pull/853. The REST layer
>>>> >>> has ORM support which means we can store those in a relational database.
>>>> >>>
>>>> >>> However I'm not 100% sure this is the best place to keep this. As we
>>>> >>> add more use cases like this the backing tables in the RDBMS will need
>>>> >>> to be managed. This could make upgrading more tedious and error-prone.
>>>> >>> Is there a better way to store this, assuming we can leverage a
>>>> >>> component that's already included in our stack?
>>>> >>>
>>>> >>> Ryan
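
For reference, the "group prefs with a user overlay" idea from the key-value side of this thread can be sketched roughly as below. This is only an illustration under assumptions, not anything from the Metron code base: a plain dict stands in for the KV store (HBase in the discussion), and the `group:` / `user:` key scheme, the JSON encoding and the `facetFields` / `pageSize` field names are all hypothetical.

```python
import json
from typing import Any, Dict, Iterable, Mapping


def deep_merge(base: Dict[str, Any], overlay: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a new dict in which values from `overlay` win over `base`."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), dict):
            # Both sides hold nested settings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_prefs(store: Mapping[str, str], user: str, groups: Iterable[str]) -> Dict[str, Any]:
    """Look up group prefs, then the user's prefs, purely by key."""
    prefs: Dict[str, Any] = {}
    for group in groups:
        raw = store.get(f"group:{group}")  # hypothetical key scheme
        if raw is not None:
            prefs = deep_merge(prefs, json.loads(raw))
    raw = store.get(f"user:{user}")  # user values overwrite group values
    if raw is not None:
        prefs = deep_merge(prefs, json.loads(raw))
    return prefs


# A dict standing in for the key-value store; values are JSON strings.
store = {
    "group:soc": json.dumps(
        {"alerts": {"facetFields": ["source:type", "ip_src_addr"], "pageSize": 25}}
    ),
    "user:otto": json.dumps({"alerts": {"pageSize": 100}}),
}

prefs = resolve_prefs(store, "otto", ["soc"])
print(prefs)
```

Here the user keeps the group's facet fields but overrides the page size, and every read is a lookup by key, which is the access pattern argued for above. Where the group membership list comes from (LDAP, Ranger or elsewhere) is left open.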