> Glad you agree with me that this isn’t HBase scale… it’s clearly not. I
> would never suggest introducing HBase for something like this, but since
> it’s there.

Ah, gotcha.  Misunderstood your statement.



On Fri, Feb 2, 2018 at 9:01 AM Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> Glad you agree with me that this isn’t HBase scale… it’s clearly not. I
> would never suggest introducing HBase for something like this, but since
> it’s there.
>
> On the idea of using the Ambari RDBMS for the same basis of it being
> there, I see your point. That said, it can be postgres, sql server, mysql,
> maria, oracle… various. Yes we have an ORM, but those are not nearly as
> magic as they claim, and upgrade / schema evolution of an RDBMS often
> involves some sort of platform dependent SQL migration in my experience. I
> would suggest that supporting that range of options is not a good idea for
> us. The Ambari project also pretty much reserves the right to blow away that
> infrastructure in upgrades (which is fair enough). So relying on there
> being an RDBMS owned by another component is not something I would
> necessarily say was a clean choice.
>
> Simon
>
> > On 2 Feb 2018, at 13:50, Nick Allen <n...@nickallen.org> wrote:
> >
> > I fall marginally on the side of an RDBMS.  There is definitely a case to
> > be made on both sides, but I'll point out a few things for the RDBMS.
> >
> >
> > (1) Flexibility.  Using an RDBMS is going to provide us with much greater
> > flexibility going forward.  We really don't know what the specific use
> > cases will be, but I am willing to bet they are user-focused (preferences,
> > etc).  The type of use cases that most web applications use an RDBMS for.
> >
> >
> >> If anything I would like to see the current RDBMS dependency come out...
> >
> > (2) Don't we already have an RDBMS requirement for Ambari?  That's a
> > dependency that we do not control.
> >
> >
> >> ... hbase seems a good option (because we already have it there, it would
> >> be kinda crazy at this scale if we didn’t already have it)
> >
> > (3) In this scenario, the RDBMS would not scale proportionally with the
> > amount of telemetry, it would scale based on usage; primarily the number of
> > users.  This is not "big data" scale.  I don't think we can make the case
> > for HBase based on scale here.
> >
> >
> >> We would also end up with, as Mike points out, a whole new disk
> >> deployment patterns and a bunch of additional DBA ops process requirements
> >> for every install.
> >
> > (4) Most users that need HA/DR (and other 'advanced stuff'), are
> > enterprises and organizations that are already very familiar with RDBMS
> > solutions and have the infrastructure in place to manage those.  For users
> > that don't need HA/DR, just use the DB that gets spun-up with Ambari.
> >
> >
> >
> >
> >
> > On Fri, Feb 2, 2018 at 7:17 AM Simon Elliston Ball <
> > si...@simonellistonball.com> wrote:
> >
> >> Introducing a RDBMS to the stack seems unnecessary for this.
> >>
> >> If we consider the data access patterns for user profiles, we are unlikely
> >> to query into them, or indeed do anything other than look them up, or write
> >> them out by a username key. To that end, using an ORM to translate a
> >> nested config object into a load of tables seems to introduce complexity
> >> and brittleness we then have to take away through relying on relational
> >> consistency models. We would also end up with, as Mike points out, a whole
> >> new disk deployment patterns and a bunch of additional DBA ops process
> >> requirements for every install.
> >>
> >> Since the access pattern is almost entirely key => value, hbase seems a
> >> good option (because we already have it there, it would be kinda crazy at
> >> this scale if we didn’t already have it) or arguably zookeeper, but that
> >> might be at the other end of the scale argument. I’d even go as far as to
> >> suggest files on HDFS to keep it simple.
> >>
> >> Simon
> >>
> >>> On 1 Feb 2018, at 23:24, Michael Miklavcic <michael.miklav...@gmail.com> wrote:
> >>>
> >>> Personally, I'd be in favor of something like MariaDB as an open source
> >>> repo. Or any other ANSI SQL store. On the positive side, it should mesh
> >>> seamlessly with ORM tools. And the schema for this should be pretty
> >>> vanilla, I'd imagine. I might even consider skipping ORM for straight JDBC
> >>> and simple command scripts in Java for something this small. I'm not
> >>> worried so much about migrations of this sort. Large scale DBs can get
> >>> involved with major schema changes, but that's usually when the datastore
> >>> is a massive set of tables with complex relationships, at least in my
> >>> experience.
> >>>
> >>> We could also use hbase, which probably wouldn't be that hard either, but
> >>> there may be more boilerplate to write for the client as compared to
> >>> standard SQL. But I'm assuming we could reuse a fair amount of existing
> >>> code from our enrichments. One additional reason in favor of hbase might
> >>> be data replication. For a SQL instance we'd probably recommend a RAID store
> >>> or backup procedure, but we get that pretty easy with hbase too.
> >>>
> >>> On Feb 1, 2018 2:45 PM, "Casey Stella" <ceste...@gmail.com> wrote:
> >>>
> >>>> So, I'll answer your question with some questions:
> >>>>
> >>>>  - No matter the data store we use upgrading will take some care, right?
> >>>>  - Do we currently depend on a RDBMS anywhere?  I want to say that we do
> >>>>  in the REST layer already, right?
> >>>>  - If we don't use a RDBMS, what's the other option?  What are the pros
> >>>>  and cons?
> >>>>  - Have we considered non-server offline persistent solutions (e.g.
> >>>>  https://www.html5rocks.com/en/features/storage)?
> >>>>
> >>>>
> >>>>
> >>>> On Thu, Feb 1, 2018 at 9:11 AM, Ryan Merriman <merrim...@gmail.com> wrote:
> >>>>
> >>>>> There is currently a PR up for review that allows a user to configure and
> >>>>> save the list of facet fields that appear in the left column of the Alerts
> >>>>> UI:  https://github.com/apache/metron/pull/853.  The REST layer has ORM
> >>>>> support which means we can store those in a relational database.
> >>>>>
> >>>>> However I'm not 100% sure this is the best place to keep this.  As we add
> >>>>> more use cases like this the backing tables in the RDBMS will need to be
> >>>>> managed.  This could make upgrading more tedious and error-prone.  Is
> >>>>> there a better way to store this, assuming we can leverage a component
> >>>>> that's already included in our stack?
> >>>>>
> >>>>> Ryan
> >>>>>
> >>>>
> >>
> >>
>
>
