Re: [DISCUSS] Persisting user data

2017-08-03 Thread Ryan Merriman
Spring is JDBC-generic so I think we're good there.  Improving our docs on
this topic is being discussed in https://github.com/apache/metron/pull/646
so hopefully this will be clear once that's worked out.

Simon is correct: I found out the hard way that Hibernate is not an option
because of its license.  I think EclipseLink would be a good alternative.
I've seen it used in other open source projects (Ambari for example) and I
was able to get it working in a POC without much effort.

On Thu, Aug 3, 2017 at 5:26 AM, Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> Anything Spring-based is likely multi-DB by definition, as long as we
> pick a friendly ORM (not Hibernate, because of its licensing problems with
> Apache; EclipseLink?).  But I suspect we should pick a good default, and
> that default should be Postgres.

Re: [DISCUSS] Persisting user data

2017-08-03 Thread Simon Elliston Ball
Anything Spring-based is likely multi-DB by definition, as long as we pick a 
friendly ORM (not Hibernate, because of its licensing problems with Apache; 
EclipseLink?).  But I suspect we should pick a good default, and that default 
should be Postgres. 

> On 3 Aug 2017, at 10:24, Casey Stella  wrote:
> 
> I'd vote for a DB-based solution, but I'd argue that any solution shouldn't
> be database specific (e.g. Postgres), but JDBC-generic.  People and
> organizations have very strong views regarding databases and I'd prefer to
> side-step those holy wars by being agnostic.

Re: [DISCUSS] Persisting user data

2017-08-03 Thread Casey Stella
I'd vote for a DB-based solution, but I'd argue that any solution shouldn't
be database specific (e.g. Postgres), but JDBC-generic.  People and
organizations have very strong views regarding databases and I'd prefer to
side-step those holy wars by being agnostic.
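
To illustrate the JDBC-generic point, here is a minimal sketch. It is written in Python for brevity, with the Python DB-API standing in for JDBC and the standard-library sqlite3 module standing in for whichever database a deployment prefers. The SavedSearchDao class, the saved_search table and its columns are hypothetical names, not anything that exists in Metron. One caveat: unlike JDBC, DB-API drivers differ in placeholder style, and this sketch assumes the "?" style.

```python
import sqlite3
from typing import Any, Callable, List, Tuple


class SavedSearchDao:
    """Hypothetical DAO that only uses plain SQL through a generic connection."""

    def __init__(self, connect: Callable[[], Any]):
        # The connection factory is the only database-specific piece.
        self._conn = connect()
        cur = self._conn.cursor()
        cur.execute(
            "CREATE TABLE IF NOT EXISTS saved_search ("
            "username VARCHAR(64) NOT NULL, "
            "name VARCHAR(128) NOT NULL, "
            "query VARCHAR(1024) NOT NULL, "
            "PRIMARY KEY (username, name))"
        )
        self._conn.commit()

    def save(self, username: str, name: str, query: str) -> None:
        # Delete-then-insert avoids vendor-specific upsert syntax.
        cur = self._conn.cursor()
        cur.execute(
            "DELETE FROM saved_search WHERE username = ? AND name = ?",
            (username, name),
        )
        cur.execute(
            "INSERT INTO saved_search (username, name, query) VALUES (?, ?, ?)",
            (username, name, query),
        )
        self._conn.commit()

    def find_by_user(self, username: str) -> List[Tuple[str, str]]:
        cur = self._conn.cursor()
        cur.execute(
            "SELECT name, query FROM saved_search "
            "WHERE username = ? ORDER BY name",
            (username,),
        )
        return [(row[0], row[1]) for row in cur.fetchall()]


# Swapping the database means passing a different factory here.
dao = SavedSearchDao(lambda: sqlite3.connect(":memory:"))
dao.save("ryan", "high-severity", "severity:high")
print(dao.find_by_user("ryan"))
```

The design choice is that nothing outside the factory names a database product, which is roughly what staying at the JDBC level would buy us.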

On Wed, Aug 2, 2017 at 9:36 PM, Ryan Merriman  wrote:

> Spring supports a variety of databases including Postgres.  I have no
> problem with using Postgres instead of MySQL.

Re: [DISCUSS] Persisting user data

2017-08-02 Thread Ryan Merriman
Spring supports a variety of databases including Postgres.  I have no
problem with using Postgres instead of MySQL.

On Wed, Aug 2, 2017 at 3:32 PM, Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> Agreed on Postgres. It's a lot easier to work with license-wise in Apache
> projects, and has a lot of the capability we need here, especially if we
> can find a sensible ORM. Anyone got any thoughts on what would work there?
>
> Simon

Re: [DISCUSS] Persisting user data

2017-08-02 Thread Simon Elliston Ball
Agreed on Postgres. It's a lot easier to work with license-wise in Apache 
projects, and has a lot of the capability we need here, especially if we can 
find a sensible ORM. Anyone got any thoughts on what would work there?

Simon 

> On 2 Aug 2017, at 21:21, Matt Foley  wrote:
> 
> Hi Ryan,
> Zookeeper has a default (and seldom changed) max znode size of 1MB, but it is 
> “designed to store data on the order of kilobytes in size.”[1]  And it’s not 
> really intended for frequently-changing data, which is okay here.  But I just 
> included it for completeness, I’m not advocating for its use here.
> 
> I agree with you that the problem, especially because it includes shared 
> config, would fit well in a db.  I’d suggest you consider PostgreSQL rather 
> than MySQL, as Postgres ships with Red Hat 6 and 7, and Ambari now uses it 
> by default, so an available server might be conveniently at hand in most 
> deployments.  Definitely assume the user will want to use an external db 
> instance, rather than one dedicated to this use.  Conveniently, Postgres can 
> also be given a REST interface (PostgREST, for example), with the usual 
> authorization options.
> 
> Never mind about Ambari Views for now.  It’s just a way to get GUI dashboards 
> without writing all the infrastructure for it, which as you say is somewhat 
> water under the bridge.
> Cheers,
> --Matt
> 
> [1] https://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html 
> 
Re: [DISCUSS] Persisting user data

2017-08-02 Thread Ryan Merriman
Matt,

Thank you for the suggestions.  I forgot to include Zookeeper.  Are there
any tradeoffs we should be aware of if we decide to use Zookeeper?  Are
there guidelines for how much data can be stored in Zookeeper?

To answer your questions:

1.  I think both use cases make sense so a combination of shared and
personal.
2.  I was planning on managing authorization in the REST layer.  For now
viewer login auth (which is really REST auth) will suffice but we might
consider other methods since authentication is pluggable here.
3.  I had not considered Ambari Views since this will support an existing
UI.  How would Ambari Views help us here?
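
As a rough sketch of points 1 and 2 above, the following shows shared settings overlaid with per-user settings, with the authorization check done in the service layer rather than in the store. It is in Python for brevity. The SettingsService class, its method names, and the rule that only an "admin" user may change shared settings are all assumptions made for illustration, not a description of how the REST application works.

```python
from typing import Any, Dict, Iterable


class SettingsService:
    """Hypothetical service: shared defaults plus per-user overrides."""

    def __init__(self, admins: Iterable[str]):
        self._admins = set(admins)
        self._shared: Dict[str, Any] = {}
        self._personal: Dict[str, Dict[str, Any]] = {}

    def set_shared(self, caller: str, key: str, value: Any) -> None:
        # Assumed rule: only admins may change settings everyone sees.
        if caller not in self._admins:
            raise PermissionError(f"{caller} may not change shared settings")
        self._shared[key] = value

    def set_personal(self, caller: str, key: str, value: Any) -> None:
        # An authenticated user can only write to their own settings.
        self._personal.setdefault(caller, {})[key] = value

    def effective(self, user: str) -> Dict[str, Any]:
        # Personal values win over shared ones with the same key.
        return {**self._shared, **self._personal.get(user, {})}


service = SettingsService(admins=["admin"])
service.set_shared("admin", "page_size", 25)
service.set_shared("admin", "theme", "light")
service.set_personal("ryan", "theme", "dark")
print(service.effective("ryan"))
```

Because `caller` is whatever identity the login auth hands over, a different authentication method could be plugged in later without touching this logic.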

I will proceed initially with a saved search POC using a relational
database unless you think that is a bad idea or there are other better
options.  Hopefully an example will further the discussion.

Ryan

On Wed, Jul 26, 2017 at 6:31 PM, Matt Foley  wrote:

> There are a couple of other places you could put config info (but maybe not
> saved searches):
> -  Zookeeper
> -  metron-alerts-ui/config.xml or config.json file
> -  the Ambari database, whichever it happens to be
>
> Questions that influence the decision include:
> 1. Should there be one configuration shared among users, or strictly
> per-user config?  Or a combination of shared and personal?
> 2. What security do you wish to maintain on changing those settings, both
> shared and personal?  What authentication/authorization scheme will you
> use?  Is viewer login auth sufficient for this?
> 3. Will you assume Ambari exists?  Did you consider using Ambari Views as
> the basis? (https://cwiki.apache.org/confluence/display/AMBARI/Views )


Re: [DISCUSS] Persisting user data

2017-07-26 Thread Matt Foley
There are a couple of other places you could put config info (but maybe not saved 
searches):
-  Zookeeper
-  metron-alerts-ui/config.xml or config.json file
-  the Ambari database, whichever it happens to be

Questions that influence the decision include:
1. Should there be one configuration shared among users, or strictly per-user 
config?  Or a combination of shared and personal?
2. What security do you wish to maintain on changing those settings, both 
shared and personal?  What authentication/authorization scheme will you use?  
Is viewer login auth sufficient for this?
3. Will you assume Ambari exists?  Did you consider using Ambari Views as the 
basis? (https://cwiki.apache.org/confluence/display/AMBARI/Views )

On 7/26/17, 2:54 PM, "Ryan Merriman"  wrote:

In anticipation of METRON-988 being merged into master, there will be a
need to persist user preferences such as UI layout, saved searches, search
history, etc.  I think where and how we persist this data should be
discussed in order to facilitate a design.  This data won't be large in
scale and may or may not be relational.  The initial features I am aware of
don't require a relational model but I'm sure there will be some that do in
the future.  I'm also assuming this code will live in the REST application
but someone correct me if there is a reason to keep it somewhere else.

I think it would be preferable to leverage something that is already in our
stack and available as a dependency.  However I would not be against adding
something if it really were the right tool for the job.  Assuming others
agree we should stick with our current stack, I see these options:

   - MySQL (or other relational database)
      - good fit for the size of data
      - relational capabilities
      - an ORM framework will be necessary which will increase our
        dependencies and complexity
   - HBase
      - client setup and code will likely be simpler and less complex
      - limited data model
   - Elasticsearch
      - JSON is a convenient data model
      - we already store user preferences here (Kibana dashboards)
      - we have abstracted our search engine interactions in several places
        and would have to here too

Elasticsearch is out for me because we view search engines as pluggable.  I
think HBase would be the easiest to implement and get working but I'm
worried we'll have similar use cases that won't be a good fit for HBase.
In that case we would need to come up with an alternative persistence
solution anyway.  I think MySQL is a good fit long term but I'm concerned
about adding a heavy ORM framework.  Also, we can't use Hibernate because
it is not license friendly.
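
One way to hedge against picking the wrong backend is to code the REST layer against a small storage interface and decide on the backend behind it later. The sketch below is in Python for brevity. UserDataStore and InMemoryStore are hypothetical names, and the in-memory dictionary is only a stand-in for a relational table or an HBase table keyed by user and data kind.

```python
import json
from abc import ABC, abstractmethod
from typing import Dict, Optional, Tuple


class UserDataStore(ABC):
    """Hypothetical persistence interface the REST layer would depend on."""

    @abstractmethod
    def put(self, user: str, kind: str, value: dict) -> None:
        ...

    @abstractmethod
    def get(self, user: str, kind: str) -> Optional[dict]:
        ...


class InMemoryStore(UserDataStore):
    """Stand-in backend; a relational or HBase backend would implement the same two methods."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], str] = {}

    def put(self, user: str, kind: str, value: dict) -> None:
        # Values are serialised to JSON so no backend needs a schema per kind.
        self._rows[(user, kind)] = json.dumps(value)

    def get(self, user: str, kind: str) -> Optional[dict]:
        raw = self._rows.get((user, kind))
        return None if raw is None else json.loads(raw)


store: UserDataStore = InMemoryStore()
store.put("ryan", "ui_layout", {"columns": ["timestamp", "source"]})
print(store.get("ryan", "ui_layout"))
```

This only covers key-value style access. Use cases that need a truly relational model would still need more than this interface offers, which is the concern raised above.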

Does anyone have any thoughts on these options or other ideas?

This requirement also brings up another topic that is outside of this
discussion.  Should we reevaluate our authentication strategy?  Currently
the REST application uses JDBC for this but if we decide a different
mechanism is better then we no longer need a relational database.  This
might affect our decision to use MySQL for this kind of data persistence.

Ryan