viirya commented on issue #26722: [SPARK-24666][ML] Fix infinity vectors 
produced by Word2Vec when numIterations are large
URL: https://github.com/apache/spark/pull/26722#issuecomment-560244916
 
 
   Specifically, I looked into when it first produces an Infinity value while 
aggregating weight vectors at:
   
   ```scala
   val synAgg = partial.reduceByKey { case (v1, v2) =>        
     blas.saxpy(vectorSize, 1.0f, v2, 1, v1, 1)
   }.collect()
   ```
   
   Values at the point where an Infinity first appears among the weights:
   
   alpha: 0.025
   
   v1 (before blas.saxpy): 3.7545144E37, 9.609645E37, -8.751438E37, 
-1.6193201E38, 1.1736736E38, 3.2835947E38, 8.1553495E37, -1.6691325E38, 
-7.576555E37, -5.648573E37, -1.9869322E37, -1.6807897E37, -5.7600233E37, 
-6.2470694E37, -1.4104866E38, -1.4680707E38, -3.1782221E37, 1.8944205E38, 
1.5494958E38, -2.1342228E38, -6.157935E37, 3.9677284E37, 1.1558841E37, 
4.331978E37, -8.0626774E36, -5.8198486E36, 8.500153E37, -5.662092E36, 
-4.009228E37, -1.9031902E38, -2.4923412E38, 7.174913E37, 5.1235664E37, 
-5.5351527E37, 5.5978614E37, -1.8525286E38, 1.066509E37, 1.5285991E37, 
-2.0523789E38, 8.57768E37, -9.894086E37, -1.8595572E38, 2.0450045E37, 
7.084625E37, 1.7256363E38, 1.7746238E37, 1.4823289E37, 1.2560103E38, 
-1.910456E38, -5.6934737E37, 3.9446576E37, 1.9320926E38, 5.9035325E37, 
-1.2072379E38, 7.4097296E37, -8.0367785E37, 1.9674684E38, 5.9296644E37, 
-1.8741689E38, -1.4480887E38, -2.933689E37, -6.161533E37, 1.02056735E36, 
2.3885107E38
   
   v1 (after blas.saxpy): 4.022694E37, 1.0296049E38, -9.376541E37, 
-1.734986E38, 1.2575074E38, Infinity, 8.737874E37, -1.7883562E38, 
-8.1177373E37, -6.0520423E37, -2.128856E37, -1.8008461E37, -6.1714535E37, 
-6.6932885E37, -1.5112356E38, -1.5729329E38, -3.405238E37, 2.0297362E38, 
1.660174E38, -2.2866673E38, -6.597787E37, 4.2511375E37, 1.2384472E37, 
4.641405E37, -8.638583E36, -6.235552E36, 9.107307E37, -6.066527E36, 
-4.2956014E37, -2.0391324E38, -2.6703656E38, 7.687407E37, 5.4895356E37, 
-5.930521E37, 5.997709E37, -1.984852E38, 1.1426883E37, 1.6377848E37, 
-2.1989773E38, 9.190372E37, -1.0600806E38, -1.9923827E38, 2.1910763E37, 
7.5906695E37, 1.848896E38, 1.9013826E37, 1.5882095E37, 1.3457253E38, 
-2.046917E38, -6.10015E37, 4.2264189E37, 2.0700992E38, 6.3252134E37, 
-1.2934692E38, 7.938996E37, -8.610834E37, 2.1080017E38, 6.353212E37, 
-2.008038E38, -1.5515236E38, -3.1432383E37, -6.6016424E37, 1.093465E36, 
2.5591186E38
   
   
   v2: 2.681796E36, 6.8640314E36, -6.251026E36, -1.1566574E37, 8.3833835E36, 
2.3454244E37, 5.825249E36, -1.1922373E37, -5.411826E36, -4.034695E36, 
-1.419237E36, -1.2005642E36, -4.114303E36, -4.4621932E36, -1.0074906E37, 
-1.048622E37, -2.2701585E36, 1.3531576E37, 1.1067827E37, -1.524445E37, 
-4.3985254E36, 2.834092E36, 8.256315E35, 3.09427E36, -5.759055E35, 
-4.1570347E35, 6.071537E36, -4.044352E35, -2.8637343E36, -1.3594217E37, 
-1.7802438E37, 5.1249374E36, 3.65969E36, -3.9536805E36, 3.9984723E36, 
-1.3232347E37, 7.6179225E35, 1.0918563E36, -1.465985E37, 6.126915E36, 
-7.067205E36, -1.3282551E37, 1.4607173E36, 5.0604463E36, 1.2325974E37, 
1.2675882E36, 1.05880635E36, 8.971503E36, -1.3646113E37, -4.066767E36, 
2.817613E36, 1.3800664E37, 4.2168083E36, -8.623128E36, 5.2926634E36, 
-5.7405564E36, 1.4053346E37, 4.2354745E36, -1.3386921E37, 
-1.03434895E37, -2.0954919E36, -4.401094E36, 7.2897654E34, 1.7060787E37
   
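   Some weights are extremely large positive/negative values near the limit of 
float. For example, the sixth element of v1 is 3.2835947E38; adding 
2.3454244E37 from v2 gives about 3.518E38, which exceeds Float.MaxValue (about 
3.4028235E38). So Infinity/-Infinity weights are produced there.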
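
   To make the overflow concrete, here is a minimal sketch that reproduces it 
with the fifth and sixth elements of v1 and v2 from the dump. The `saxpy` helper 
and the `SaxpyOverflowSketch` object are hypothetical stand-ins written for this 
illustration (a plain loop computing y := a * x + y in single precision with 
unit strides); they are not Spark's code or the netlib binding.

   ```scala
   object SaxpyOverflowSketch {
     // Hypothetical stand-in for blas.saxpy(n, a, x, 1, y, 1):
     // updates y in place with y := a * x + y, using Float arithmetic.
     def saxpy(n: Int, a: Float, x: Array[Float], y: Array[Float]): Unit = {
       var i = 0
       while (i < n) {
         y(i) = a * x(i) + y(i)
         i += 1
       }
     }

     def main(args: Array[String]): Unit = {
       // Fifth and sixth elements of v1 (before saxpy) and v2 from the dump.
       val v1 = Array(1.1736736E38f, 3.2835947E38f)
       val v2 = Array(8.3833835E36f, 2.3454244E37f)

       saxpy(v1.length, 1.0f, v2, v1)

       // Largest finite Float, for comparison with the sums.
       println(s"Float.MaxValue = ${Float.MaxValue}")
       // The first sum stays finite; the second exceeds Float.MaxValue.
       println(s"v1 after saxpy = ${v1.mkString(", ")}")
       println(s"contains Infinity: ${v1.exists(_.isInfinite)}")
     }
   }
   ```

   Accumulating the same sums in Double would not overflow at this magnitude, 
but the values would still be far outside any useful range for the weights.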

