fmcquillan99 edited a comment on pull request #518: URL: https://github.com/apache/madlib/pull/518#issuecomment-699189947
(6) fmin definition https://github.com/hyperopt/hyperopt/wiki/FMin

`fmin(loss, space, algo, max_evals)`

(a) Looks like this PR is setting `max_evals = num_models / num_segments`. For one thing, I'm not sure that
```
self.num_workers = get_seg_number() * get_segments_per_host()
```
gives the total number of workers? On a 1-host, 2-segments-per-host database this returned 4 instead of the expected 2. Also, this needs to be consistent with the distribution rules set in the mini-batch preprocessor.

(b) Looks like the logic is, for each eval: find the losses for multiple models (= num segments) in parallel, return the best loss value to hyperopt, then get a new set of params for multiple models from hyperopt. Or does it pass the loss for *each* trained model to hyperopt? If it is the former, then chunking/grouping in this way might affect hyperband in unknown ways.

(7) We should re-examine the MADlib hyperopt params to see whether we have defined them in the right way. Do we need an option for an explicit `max_evals`?

(8) defaults
Looks like hyperband defaults are being used for hyperopt in the case where hyperband is not specified.

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
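To make the concern in (6) concrete, here is a minimal, hypothetical sketch of the chunked-evaluation arithmetic as I read it from the PR. The function name `plan_evals` and its signature are illustrative only, not identifiers from the PR; the assumption is that each hyperopt eval trains up to `num_workers` models in parallel and reports a single loss back:

```python
import math

def plan_evals(num_models, num_workers):
    """Return (max_evals, models_per_eval) for chunked parallel evaluation.

    Assumes the PR's reading: max_evals = ceil(num_models / num_workers),
    and each eval trains up to num_workers models on the segments at once.
    """
    models_per_eval = min(num_models, num_workers)
    max_evals = math.ceil(num_models / num_workers)
    return max_evals, models_per_eval

# e.g. 10 models on a cluster reporting 4 workers -> 3 evals of up to 4 models
print(plan_evals(10, 4))  # (3, 4)
```

Note that if `get_seg_number() * get_segments_per_host()` overcounts workers (4 instead of 2 in the example above), `max_evals` shrinks accordingly, so hyperopt sees fewer feedback points than intended — which is why the worker count and the chunking both matter here.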