huaxingao edited a comment on issue #26735: [SPARK-30102][WIP][ML][PYSPARK] GMM 
supports instance weighting
URL: https://github.com/apache/spark/pull/26735#issuecomment-563413620
 
 
   I guess that instead of changing maxIter to 5 and comparing the 
logLikelihood at iteration 5, we could use a much bigger maxIter so that it 
converges, and compare the logLikelihood at convergence.
   
   It puzzles me why the logLikelihoods from iteration 7 onward are so 
different from the logLikelihoods computed using the original code. Weight is 
not set in the Python doctest, so it uses the default 1.0. In theory, this 
should behave exactly the same as the original code, and the logLikelihood at 
each iteration should be very close to the one computed using the original 
code, right?
   
   I tried both the original code and the code with the changes. They start to 
have different logLikelihoods at iteration 7, but both of them converge at 
iteration 25 with the same logLikelihood, 65.02945125241477.
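
To pin down where two runs part ways, the per-iteration logLikelihoods can be compared pairwise. Below is a minimal sketch, assuming the two traces have already been collected into plain Python lists. The helper name `first_divergence` is hypothetical, and the numbers are placeholders for illustration only, not values from the actual runs.

```python
import math


def first_divergence(trace_a, trace_b, rel_tol=1e-9):
    """Return the 1-based iteration at which two traces first differ, or None."""
    for iteration, (a, b) in enumerate(zip(trace_a, trace_b), start=1):
        if not math.isclose(a, b, rel_tol=rel_tol):
            return iteration
    return None


# Placeholder values for illustration only, NOT taken from the real runs.
original = [1.0, 2.0, 3.0, 4.0]
patched = [1.0, 2.0, 3.5, 4.0]

diverged_at = first_divergence(original, patched)  # 3 for these placeholders
```

A relative tolerance is used here rather than exact equality so that tiny floating-point differences are not reported as divergence. The right tolerance for this comparison is a judgment call.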
   
   I agree that we probably need to change the current convergence check. It 
seems to me that we also need to compare each logLikelihood difference to the 
previous difference: the differences should get smaller and smaller and 
eventually converge. However, when I tested with the current code, the 
logLikelihood differences did not get smaller consistently:
   
   | iteration | logLikelihoodPrev | logLikelihood | abs diff |
   | --------- | ----------------- | ------------- | -------- |
   | 15  | 36.402816949681664 | 36.55682231506764 | 0.1540053653859772|
   | 16  | 36.55682231506764 | 36.75888971475007 | 0.20206739968242715|
   | 17  | 36.75888971475007 | 37.581643170088086 | 0.8227534553380167|
   | 18  | 37.581643170088086 | 6.674670202869423 | 30.906972967218664|
   | 19  | 6.674670202869423 | 10.601046748584544 | 3.9263765457151205|
   | 20  | 10.601046748584544 | 39.71941181091317 | 29.11836506232863|
   | 21  | 39.71941181091317 | 49.2147989416624 | 9.49538713074923|
   | 22  | 49.2147989416624 | 76.11383657713708 | 26.899037635474677|
   | 23  | 76.11383657713708 | 71.28238165058754 | 4.83145492654954|
   | 24  | 71.28238165058754 | 65.02945125241477 | 6.252930398172765|
   | 25  | 65.02945125241477 | 65.02945125241477 | 0.0|
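
The numbers in the table can be checked mechanically. The sketch below is a rough illustration, not the Spark implementation: it assumes the convergence check is "absolute difference below `tol`", and the function name `analyze` and the `tol` value of 0.01 are assumptions made for this example. It reports the iteration at which that check would first pass, the iterations where the logLikelihood went down, and the iterations where the absolute difference grew compared with the previous step.

```python
# logLikelihood values copied from the table above.
# trace[0] is logLikelihoodPrev at iteration 15; trace[i] is the
# logLikelihood after iteration FIRST_ITERATION + i - 1.
FIRST_ITERATION = 15
trace = [
    36.402816949681664,
    36.55682231506764,
    36.75888971475007,
    37.581643170088086,
    6.674670202869423,
    10.601046748584544,
    39.71941181091317,
    49.2147989416624,
    76.11383657713708,
    71.28238165058754,
    65.02945125241477,
    65.02945125241477,
]


def analyze(trace, first_iteration, tol):
    """Return (converged_at, decreases, growing_steps) for a logLikelihood trace."""
    converged_at = None   # first iteration with |diff| < tol
    decreases = []        # iterations where the logLikelihood went down
    growing_steps = []    # iterations where |diff| grew versus the previous step
    prev_abs_diff = None
    for i in range(1, len(trace)):
        iteration = first_iteration + i - 1
        diff = trace[i] - trace[i - 1]
        if diff < 0:
            decreases.append(iteration)
        if prev_abs_diff is not None and abs(diff) > prev_abs_diff:
            growing_steps.append(iteration)
        if converged_at is None and abs(diff) < tol:
            converged_at = iteration
        prev_abs_diff = abs(diff)
    return converged_at, decreases, growing_steps


# tol=0.01 is an assumed value chosen for this example.
converged_at, decreases, growing_steps = analyze(trace, FIRST_ITERATION, tol=0.01)
# converged_at  == 25
# decreases     == [18, 23, 24]
# growing_steps == [16, 17, 18, 20, 22, 24]
```

On this data the logLikelihood itself drops at iterations 18, 23 and 24. Textbook EM should not decrease the log-likelihood from one iteration to the next, so those drops may be worth looking at separately from the question of which convergence test to use.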
   
   
