#542: WebSearch - all-containing perform_request_search() makes it very hard to
re-use search engine
------------------------+----------------------
Reporter: rchyla | Owner: rchyla
Type: defect | Status: assigned
Priority: major | Milestone:
Component: WebSearch | Version:
Resolution: | Keywords: refactor
------------------------+----------------------
Comment (by rchyla):
Joe, thanks for taking time to look at it and for the valuable comments. I
can proceed with further changes thanks to it. The main point being, as
you say, this was a mid-hack solution, so I didn't spent too much
time/energy with certain aspects. Actually, this change was touching a lot
of code, so I was concerned with the main functionality. Please see
comments below.
> * Method signatures - did you auto-generate the refactoring with
eclipse? It looks very bad. All of your child methods have dozens of
parameters (as the parent p_r_s had), and most of them are unused. I
might go so far as to say that none of them are used. It looks like
everything important gets passed in the first argument, kwargs, as a
dictionary. This leads to three recommendations:
a more careful look will reveal that the arguments are used extensively,
true, a few functions (i just checked them) have redundant number of
arguments as I didn't care to remove them all, but this happened mostly
with functions added in later stages. And yes! The function signature
contains many variables, but that is a legacy thing and it would require a
much more changes
> 1. if you're going to have a catch-all keyword parameter at the end of
your parameter list, it should be called {{{**kwargs}}}, not {{{**rest}}}.
Also, you don't need to have it on any of these methods; it's never
dereferenced.
- note (kwargs, one=None, !**kwargs) is a syntactically incorrect in
Python
- I thought it is better to name it !**rest (as that means - 'we are not
using it, it is there to deal with what we have to deal with....')
> 2. if you're going to pass everything through in a dictionary of
options, feel free to do so, but call the dictionary something else, and
document what keys may be put in it. And eliminate all of the other
method parameters, which you're not using anyway.
sure, this can be done
> 3. or you could decide not to pass things around in a dictionary, and
instead actually have each method accept all and only those parameters it
needs, and then remove the rest.
I think that you have a very good feeling of how much more work that would
require, and especially in the phase of refactoring
> a. as a natural corollary of this pruning, you may find that one of
these utility methods has to call other utility methods that require still
other parameters. I think cases like this are where a slightly more
thoughtful driver (in this case, slightly fattening the newly slimmed
p_r_s) can help. But personally, my own go-to thing would be a closure.
When
very good point, there are at least two places where the call goes deep
and many more variables are needed later
you need this kind of shared state, objects are sort of obvious, but we
want to be minimal in how we alter the public API, which is very
functional. And closures are really single-method objects in a functional
style.
> * Method visibility - You've created a lot of utility methods. When I
suggested before that they should be inner functions of p_r_s, you said
that solr needed access to them. I wonder if solr needs access to all of
them? Those that it does, I think should be documented
solr is using search, sorting, filtering, ranking and display (but only
for citation summaries); this means it is using 3/5 of the new functions
are created. That is not insignificant.
> as being part of the API for solr connectivity and otherwise made first-
class components. Those utility methods that solr doesn't call I still
think should be hidden.
and what about being private? (in python sense)
> * This also suggests (though I don't think you should take the time
for this) that eventually someone should audit search_engine looking for
places these utility methods can be plugged in to replace some older cut-
and-pasted code.
> * In websearch_templates you introduce a checkbox for solr. Is this a
developer comfort feature? I don't think we would want this in
production.
sure, but if you were testing solr you would find it very useful ;)
> * And of course, being captured mid-hack, you haven't done anything to
clean it up, but before upstream acceptance I imagine you'll have to fix
some whitespace issues, introduce some docstrings, and generally make sure
that the kwalitee checks pass at at least the level they passed at for
HEAD~1.
which proves my point when I resisted writing unittests for something that
was half-cooked, as a code, it proves the point where the refactoring goes
and it actually *works*, but I never considered it finished
Thanks!
--
Ticket URL: <http://invenio-software.org/ticket/542#comment:15>
Invenio <http://invenio-software.org>