Advitya17 edited a comment on pull request #518:
URL: https://github.com/apache/madlib/pull/518#issuecomment-700952240


   > (6)
   > fmin definition
   > https://github.com/hyperopt/hyperopt/wiki/FMin
   > fmin(loss, space, algo, max_evals)
   > 
   > Looks like this PR is setting max_evals = num_models/num_segments in `get_configs_list()`. For one thing, I'm not sure that
   > 
   > ```
   > self.num_workers = get_seg_number() * get_segments_per_host()
   > ```
   > 
   > gives the total number of workers? On a database with 1 host and 2 segments per host, this returned 4 instead of the expected 2. This also needs to be consistent with the distribution rules set in the mini-batch preprocessor.
   > 
   > (7)
   > The function `get_configs_list()` may not be distributing the workload 
correctly to the segments.
   > For
   > 
   > ```
   >     num_models = 3
   >     num_workers = 3
   > ```
   > 
   > it returns `[(1, 3)]`, which seems OK. But for
   > 
   > ```
   >     num_models = 5
   >     num_workers = 3
   > ```
   > 
   > it returns `[(1, 5)]`, which does not seem OK. It will not do any hyperopt updates if all 5 configs are grouped together. I would have expected `[(1, 3), (4, 5)]`. For
   > 
   > ```
   >     num_models = 20
   >     num_workers = 3
   > ```
   > 
   > it returns `[(1, 4), (5, 8), (9, 11), (12, 14), (15, 17), (18, 20)]`, which means it is running 4 configs on 3 segments multiple times, which does not seem efficient. I would have expected something like `[(1, 3), (4, 6), (7, 9), (10, 12), (13, 15), (16, 18), (19, 20)]`, which runs 3 configs at a time on the 3 segments.
   > 
   > (8)
   > In `find_hyperopt_config()`, confirm that the loss for _each_ model is passed to hyperopt after training, and not just the best one from the group.
   > 
   > (9)
   > defaults
   > Seems like hyperband defaults are being used for hyperopt in the case that the user does not specify automl params. That will probably throw an error.
   
   @fmcquillan99 
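
   (6) Regarding the worker count: a minimal sketch, assuming a Greenplum backend, of counting only the primary segments via the `gp_segment_configuration` catalog instead of multiplying `get_seg_number()` by `get_segments_per_host()` (the helper name is illustrative, not MADlib code):

   ```
   # Illustrative: count primary worker segments on Greenplum directly,
   # excluding the master (content = -1), to avoid the double-counting
   # seen when multiplying get_seg_number() by get_segments_per_host().
   import plpy  # available inside PL/Python functions

   def count_primary_segments():
       res = plpy.execute("""
           SELECT count(*) AS n
           FROM gp_segment_configuration
           WHERE role = 'p' AND content >= 0
       """)
       return int(res[0]['n'])
   ```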
   
   (7) 
   
   (a) As per my discussions with @reductionista , the current implementation seems to run more than 1 round when `number of models >= 2 * number of segments/workers in the cluster`. That said, even though the latter `[(1, 3), (4, 5)]` may leave a worker idle, it may be a better approach for Hyperopt.
   
   (b) In reference to the 2nd example, I believe this was per our discussions to minimize the number of model training function calls (and to make the schedule look relatively more 'balanced'). Observe that the former schedule runs 4 configs on 3 workers for 2 rounds and then 3 configs on 3 workers for the remaining rounds (since 3 doesn't evenly divide 20). The latter runs 2 configs on 3 workers in the final round, which likewise leaves a worker idle (in addition to 1 more model training function call compared to the former approach). A sketch contrasting the two strategies is below.
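
   For concreteness, here's a minimal sketch contrasting the two chunking strategies (both functions are hypothetical reconstructions of the behavior discussed above, not the actual `get_configs_list()` code):

   ```
   def balanced_schedule(num_models, num_workers):
       # Current behavior (reconstructed): spread the models evenly over
       # floor(num_models / num_workers) rounds, so early rounds may get
       # more configs than there are workers.
       num_rounds = max(1, num_models // num_workers)
       base, extra = divmod(num_models, num_rounds)
       schedule, start = [], 1
       for i in range(num_rounds):
           size = base + (1 if i < extra else 0)
           schedule.append((start, start + size - 1))
           start += size
       return schedule

   def capped_schedule(num_models, num_workers):
       # Alternative: never put more than num_workers configs in a round,
       # accepting a possibly short final round (and an idle worker there).
       schedule, start = [], 1
       while start <= num_models:
           end = min(start + num_workers - 1, num_models)
           schedule.append((start, end))
           start = end + 1
       return schedule

   print(balanced_schedule(5, 3))   # [(1, 5)]
   print(capped_schedule(5, 3))     # [(1, 3), (4, 5)]
   print(balanced_schedule(20, 3))  # [(1, 4), (5, 8), (9, 11), (12, 14), (15, 17), (18, 20)]
   print(capped_schedule(20, 3))    # [(1, 3), (4, 6), (7, 9), (10, 12), (13, 15), (16, 18), (19, 20)]
   ```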
   
   (8) I believe all the losses are already passed on to hyperopt as part of my implementation, in the `Hyperopt TPE Update` for loop (starting at line 753 of the AutoML Python file).
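
   For reference, a minimal sketch of the batched pattern that loop follows, assuming hyperopt's low-level `Trials` API (the search space and helper names are illustrative, not the actual MADlib code):

   ```
   # Illustrative: ask TPE for a batch of configs, train them elsewhere,
   # then report EVERY model's loss back before asking for the next batch.
   from hyperopt import hp, tpe, Trials, STATUS_OK
   from hyperopt.base import Domain, JOB_STATE_DONE

   space = hp.uniform('lr', 1e-4, 1e-1)      # placeholder search space
   domain = Domain(lambda params: 0, space)  # objective runs externally
   trials = Trials()

   def suggest_batch(batch_size, seed):
       # Insert a fresh batch of TPE-suggested trials, still unevaluated.
       new_ids = trials.new_trial_ids(batch_size)
       docs = tpe.suggest(new_ids, domain, trials, seed)
       trials.insert_trial_docs(docs)
       trials.refresh()
       return trials.trials[-batch_size:]

   def report_losses(batch, losses):
       # One loss per trained model -- not just the best of the group.
       for trial, loss in zip(batch, losses):
           trial['result'] = {'loss': loss, 'status': STATUS_OK}
           trial['state'] = JOB_STATE_DONE
       trials.refresh()
   ```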
   
   (9) Yes, I did observe that. I tried to change the global defaults in the SQL file based on the chosen automl method, but ran into errors in the process. In my current implementation, the hyperband params are the defaults, and the user may need to explicitly specify hyperopt params when using hyperopt.
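
   For instance, a minimal sketch of what method-aware defaults could look like (parameter names and values below are placeholders, not the actual SQL defaults):

   ```
   # Illustrative: pick the default automl_params based on the chosen
   # method, so hyperband defaults never leak into a hyperopt run.
   HYPERBAND_DEFAULTS = {'R': 6, 'eta': 3, 'skip_last': 0}
   HYPEROPT_DEFAULTS = {'num_configs': 20, 'num_iterations': 5, 'algorithm': 'tpe'}

   def resolve_automl_params(automl_method, automl_params=None):
       defaults = (HYPERBAND_DEFAULTS if automl_method.lower() == 'hyperband'
                   else HYPEROPT_DEFAULTS)
       resolved = dict(defaults)
       resolved.update(automl_params or {})
       return resolved
   ```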
   
   (10) The name changes sound good to me.
   
   Let me know if there are more areas I could address.

