Github user justinleet commented on the issue:

    https://github.com/apache/metron/pull/970
  
    @nickwallen For a bit of context, I tried primarily to keep things 
essentially how they were in the ES versions (including the integration tests 
and unit tests).  I definitely appreciate you looking at this, the feedback is 
definitely helpful, especially since like I said, this is a first cut, that 
gets things implemented in a very similar manner to the current implementation. 
 Iterating on it is definitely worthwhile, but I think you're right, the 
feedback is a little too abstract to work with as-is, so let's try to work out 
a more concrete approach.  Here's my initial thoughts, which are also mostly 
too abstract, but we can try to get more detailed from here.
    
    Re (1):
    We could probably do something fairly similar to what we have the other 
DAO's.  Specifically, break things up into SolrMetaAlertUpdateDao + 
ElasticsearchMetaAlertDao, SolrMetaAlertSearchDao + SolrMetaAlertSearchDao, 
etc. Then have the SolrMetaAlertDao just wrap the appropriate calls as needed. 
I think this will help make it more digestible (which is definitely a worthy 
cause), but I don't think it helps make the implementations more independent. 
I'd still expect them to be fairly large and share a lot of code in some 
abstraction (whether that's abstract classes or not). The reason they'd share 
code is that, unlike the SolrSearchDao vs ElasticsearchSearchDao is that the 
metaalerts are (mostly) avoiding digging into Solr and ES internals which is 
definitely not the case for the other DAOs.
    
    Essentially there's a tradeoff between changes in the base classes breaking 
one or both (but if that happens, I'd argue that our testing is incomplete and 
we should be adding more tests like you mention) or having them vary 
independently and having issues that are fixed on one side not fixed in the 
other (which has it's own set of testing challenges).  I'm curious what you 
think is a good approach to breaking things up beyond that that avoids a lot of 
code duplication.
    
    For the implementation specific quirks, I completely agree, the abstraction 
probably isn't as well defined as it could be.  it might be possible to 
refactor some things to make it more well defined, but at some point we're 
other going to be duplicating code or patching abstraction leaks at the DAO 
level.  Because I don't think Solr and ES (and any other implementation we 
choose) are fundamentally able to be papered over in a perfect abstraction. I 
definitely agree it would be nice to make them less tied at the hip, the 
problem is that a lot of this stuff is already broken up at the abstraction 
points we have (particularly at Document level, rather than implementation 
specific level). Right now, the implementation details are obscured from the 
calling REST service, and its the responsibility of the MetaAlertDao to handle 
the implementation details.
    
    Re (2):
    For the specific calculating metascores case, and probably some of the 
other stuff, it only seems useful to break that sort of thing into it's own 
class if we intend on making it pluggable.  After all, that is a function that 
is pretty tied into how metaalerts are laid out and I'm not really sure what 
the benefit of splitting individual operations out like that is (unless we 
potentially needed to make them available across composed DAOs).  The other 
thing is the actual calculation of the scores themselves are maintained by a 
different class.  It's only the actual management of the metaalert 
representation handled by the Dao.  I'm 
    
    To do more unit testing, there's definitely some easy targets in there.  
I'd kept the unit tests that existed, but looking back at them, there's 
definitely more things and helper functions that could be better tested even 
without splitting things into more classes.
    
    Is it worthwhile to do the refactoring into the *MetaAlertUpdateDao and 
*UpdateSearchDao et al., along with adding some of the unit tests, and seeing 
where we are at that point?
    



---

Reply via email to