Couldn’t agree with you more Otto! On the perms / ACLs / AXOs / groups / users 
etc concerns though, there are other Apache projects (such as Ranger) which 
have already done a lot of the hard thinking and architecture / data structure 
/ admin ui and persistence pieces for us, so I’d say we lean on them before 
designing our own approach to IAM. 
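
For the preferences themselves, the "group prefs with a user overlay" idea from the key value store discussed further down the thread could look roughly like the sketch below. This is only an illustration under stated assumptions: the `group:` / `user:` key layout, the preference field names, and the `effective_prefs` helper are all hypothetical, and a plain dict stands in for the real store (e.g. an HBase table).

```python
from typing import Any, Dict, Iterable

# Hypothetical in-memory stand-in for the key-value store.
# The key layout and preference fields are invented for illustration.
KV_STORE: Dict[str, Dict[str, Any]] = {
    "group:analysts": {"facet_fields": ["source:type", "ip_src_addr"], "page_size": 25},
    "group:admins": {"page_size": 50},
    "user:alice": {"page_size": 100},
}


def effective_prefs(
    store: Dict[str, Dict[str, Any]], user: str, groups: Iterable[str]
) -> Dict[str, Any]:
    """Look up group prefs by key, then overlay the user's own prefs."""
    prefs: Dict[str, Any] = {}
    for group in groups:
        # Later groups override earlier ones (shallow merge).
        prefs.update(store.get(f"group:{group}", {}))
    # The user's own settings win over anything inherited from a group.
    prefs.update(store.get(f"user:{user}", {}))
    return prefs


alice = effective_prefs(KV_STORE, "alice", ["analysts", "admins"])
```

Every access here is a straight lookup by key, and the merge is shallow, so nested preference objects would need a deeper merge if we wanted per-field overrides inside them.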

Simon

> On 2 Feb 2018, at 13:22, Otto Fowler <ottobackwa...@gmail.com> wrote:
> 
> Fair enough,  I don’t have a preference.  I think my point is that we need to 
> understand the use cases we can think of more, especially if we are going to 
> be having permissions, grouping and crud around that, and preloading, before 
> just throwing everything in RDBMS -or- HBASE.
> 
> 
> 
> On February 2, 2018 at 08:08:24, Simon Elliston Ball 
> (si...@simonellistonball.com <mailto:si...@simonellistonball.com>) wrote:
> 
>> True, and that is a requirement I’ve heard a lot (standard views or field 
>> sets in shared sets of saved search for example). That would definitely rule 
>> out sticking with the current approach (browser local storage, per Casey’s 
>> suggestion below). 
>> 
>> That said, I’m not sure that changes my views on RDBMS. There is an argument 
>> that a single query from RDBMS could return a set of group prefs with a user 
>> overlay, but that’s not that much better than pulling groups and overwriting 
>> the maps clientside with user, from the key value store. We’re not talking 
>> about huge amounts of preference data here. I could be swayed the other way 
>> if we were to use the RDBMS as a canonical store for user and group 
>> information (we use it for users right now, in a really not great way) but I 
>> would much rather see us plug in to the Hadoop ecosystem and use something 
>> like Ranger to sync users, or an LDAP source directly for user and group 
>> data, because I suspect no one wants to have to administer a separate user 
>> database for Metron and open up the resulting IAM security hole we currently 
>> have (on that, let’s at least stop storing plain text passwords!) /rant. 
>> 
>> If anything I would like to see the current RDBMS dependency come out to 
>> reduce the overall complexity, unless we have a use case that genuinely 
>> benefits from a normalised data structure, or from SQL access patterns. 
>> 
>> In short, I would still go with LDAP / Ranger for users and groups, and 
>> instead of adding an RDBMS, use group prefs and user prefs in the existing 
>> KV store (HBase) to reduce the operational maintenance burden on the 
>> platform. 
>> 
>> Simon
>> 
>>> On 2 Feb 2018, at 12:50, Otto Fowler <ottobackwa...@gmail.com 
>>> <mailto:ottobackwa...@gmail.com>> wrote:
>>> 
>>> It is not uncommon to want to have ‘shared’ preferences or setups.   Think 
>>> of shared dashboards or queries vs. personal version in jira.  Would RDBMS 
>>> help with that?
>>> 
>>> 
>>> 
>>> On February 2, 2018 at 07:17:04, Simon Elliston Ball 
>>> (si...@simonellistonball.com <mailto:si...@simonellistonball.com>) wrote:
>>> 
>>>> Introducing a RDBMS to the stack seems unnecessary for this. 
>>>> 
>>>> If we consider the data access patterns for user profiles, we are unlikely 
>>>> to query into them, or indeed do anything other than look them up, or 
>>>> write them out by a username key. To that end, using an ORM to translate a 
>>>> nested config object into a load of tables seems to introduce complexity 
>>>> and brittleness we then have to take away through relying on relational 
>>>> consistency models. We would also end up with, as Mike points out, a whole 
>>>> new set of disk deployment patterns and a bunch of additional DBA ops 
>>>> process requirements for every install. 
>>>> 
>>>> Since the access pattern is almost entirely key => value, hbase seems a 
>>>> good option (because we already have it there, it would be kinda crazy at 
>>>> this scale if we didn’t already have it) or arguably zookeeper, but that 
>>>> might be at the other end of the scale argument. I’d even go as far as to 
>>>> suggest files on HDFS to keep it simple.  
>>>> 
>>>> Simon 
>>>> 
>>>> > On 1 Feb 2018, at 23:24, Michael Miklavcic <michael.miklav...@gmail.com 
>>>> > <mailto:michael.miklav...@gmail.com>> wrote: 
>>>> >  
>>>> > Personally, I'd be in favor of something like MariaDB as an open source 
>>>> > repo. Or any other ANSI SQL store. On the positive side, it should mesh 
>>>> > seamlessly with ORM tools. And the schema for this should be pretty 
>>>> > vanilla, I'd imagine. I might even consider skipping ORM for straight 
>>>> > JDBC and simple command scripts in Java for something this small. I'm not 
>>>> > worried so much about migrations of this sort. Large scale DBs can get 
>>>> > involved with major schema changes, but that's usually when the datastore 
>>>> > is a massive set of tables with complex relationships, at least in my 
>>>> > experience. 
>>>> >  
>>>> > We could also use hbase, which probably wouldn't be that hard either, 
>>>> > but there may be more boilerplate to write for the client as compared to 
>>>> > standard SQL. But I'm assuming we could reuse a fair amount of existing 
>>>> > code from our enrichments. One additional reason in favor of hbase might 
>>>> > be data replication. For a SQL instance we'd probably recommend a RAID 
>>>> > store or backup procedure, but we get that pretty easy with hbase too. 
>>>> >  
>>>> > On Feb 1, 2018 2:45 PM, "Casey Stella" <ceste...@gmail.com 
>>>> > <mailto:ceste...@gmail.com>> wrote: 
>>>> >  
>>>> >> So, I'll answer your question with some questions: 
>>>> >>  
>>>> >> - No matter the data store we use upgrading will take some care, right? 
>>>> >> - Do we currently depend on a RDBMS anywhere? I want to say that we do 
>>>> >> in the REST layer already, right? 
>>>> >> - If we don't use a RDBMS, what's the other option? What are the pros 
>>>> >> and cons? 
>>>> >> - Have we considered non-server offline persistent solutions (e.g. 
>>>> >>  https://www.html5rocks.com/en/features/storage 
>>>> >> <https://www.html5rocks.com/en/features/storage>)? 
>>>> >>  
>>>> >>  
>>>> >>  
>>>> >> On Thu, Feb 1, 2018 at 9:11 AM, Ryan Merriman <merrim...@gmail.com 
>>>> >> <mailto:merrim...@gmail.com>> wrote: 
>>>> >>  
>>>> >>> There is currently a PR up for review that allows a user to configure 
>>>> >>> and save the list of facet fields that appear in the left column of the 
>>>> >>> Alerts UI:  https://github.com/apache/metron/pull/853 
>>>> >>> <https://github.com/apache/metron/pull/853>. The REST layer has ORM 
>>>> >>> support which means we can store those in a relational database. 
>>>> >>>  
>>>> >>> However I'm not 100% sure this is the best place to keep this. As we 
>>>> >>> add more use cases like this the backing tables in the RDBMS will need to 
>>>> >>> be managed. This could make upgrading more tedious and error-prone. Is 
>>>> >>> there a better way to store this, assuming we can leverage a component 
>>>> >>> that's already included in our stack? 
>>>> >>>  
>>>> >>> Ryan 
>>>> >>>  
>>>> >>
