viirya commented on issue #26722: [SPARK-24666][ML] Fix infinity vectors produced by Word2Vec when numIterations are large
URL: https://github.com/apache/spark/pull/26722#issuecomment-560244916

Specifically, I looked into when the first Infinity value appears while aggregating weight vectors at:

```scala
val synAgg = partial.reduceByKey { case (v1, v2) =>
  // In-place v1 := 1.0f * v2 + v1
  blas.saxpy(vectorSize, 1.0f, v2, 1, v1, 1)
  v1
}.collect()
```

When it first produces any Infinity values among weights:

alpha: 0.025

v1 (before blas.saxpy):
3.7545144E37, 9.609645E37, -8.751438E37, -1.6193201E38, 1.1736736E38, 3.2835947E38, 8.1553495E37, -1.6691325E38, -7.576555E37, -5.648573E37, -1.9869322E37, -1.6807897E37, -5.7600233E37, -6.2470694E37, -1.4104866E38, -1.4680707E38, -3.1782221E37, 1.8944205E38, 1.5494958E38, -2.1342228E38, -6.157935E37, 3.9677284E37, 1.1558841E37, 4.331978E37, -8.0626774E36, -5.8198486E36, 8.500153E37, -5.662092E36, -4.009228E37, -1.9031902E38, -2.4923412E38, 7.174913E37, 5.1235664E37, -5.5351527E37, 5.5978614E37, -1.8525286E38, 1.066509E37, 1.5285991E37, -2.0523789E38, 8.57768E37, -9.894086E37, -1.8595572E38, 2.0450045E37, 7.084625E37, 1.7256363E38, 1.7746238E37, 1.4823289E37, 1.2560103E38, -1.910456E38, -5.6934737E37, 3.9446576E37, 1.9320926E38, 5.9035325E37, -1.2072379E38, 7.4097296E37, -8.0367785E37, 1.9674684E38, 5.9296644E37, -1.8741689E38, -1.4480887E38, -2.933689E37, -6.161533E37, 1.02056735E36, 2.3885107E38

v1 (after blas.saxpy):
4.022694E37, 1.0296049E38, -9.376541E37, -1.734986E38, 1.2575074E38, Infinity, 8.737874E37, -1.7883562E38, -8.1177373E37, -6.0520423E37, -2.128856E37, -1.8008461E37, -6.1714535E37, -6.6932885E37, -1.5112356E38, -1.5729329E38, -3.405238E37, 2.0297362E38, 1.660174E38, -2.2866673E38, -6.597787E37, 4.2511375E37, 1.2384472E37, 4.641405E37, -8.638583E36, -6.235552E36, 9.107307E37, -6.066527E36, -4.2956014E37, -2.0391324E38, -2.6703656E38, 7.687407E37, 5.4895356E37, -5.930521E37, 5.997709E37, -1.984852E38, 1.1426883E37, 1.6377848E37, -2.1989773E38, 9.190372E37, -1.0600806E38, -1.9923827E38, 2.1910763E37, 7.5906695E37, 1.848896E38, 1.9013826E37, 1.5882095E37, 1.3457253E38, -2.046917E38, -6.10015E37, 4.2264189E37, 2.0700992E38, 6.3252134E37, -1.2934692E38, 7.938996E37, -8.610834E37, 2.1080017E38, 6.353212E37, -2.008038E38, -1.5515236E38, -3.1432383E37, -6.6016424E37, 1.093465E36, 2.5591186E38

v2:
2.681796E36, 6.8640314E36, -6.251026E36, -1.1566574E37, 8.3833835E36, 2.3454244E37, 5.825249E36, -1.1922373E37, -5.411826E36, -4.034695E36, -1.419237E36, -1.2005642E36, -4.114303E36, -4.4621932E36, -1.0074906E37, -1.048622E37, -2.2701585E36, 1.3531576E37, 1.1067827E37, -1.524445E37, -4.3985254E36, 2.834092E36, 8.256315E35, 3.09427E36, -5.759055E35, -4.1570347E35, 6.071537E36, -4.044352E35, -2.8637343E36, -1.3594217E37, -1.7802438E37, 5.1249374E36, 3.65969E36, -3.9536805E36, 3.9984723E36, -1.3232347E37, 7.6179225E35, 1.0918563E36, -1.465985E37, 6.126915E36, -7.067205E36, -1.3282551E37, 1.4607173E36, 5.0604463E36, 1.2325974E37, 1.2675882E36, 1.05880635E36, 8.971503E36, -1.3646113E37, -4.066767E36, 2.817613E36, 1.3800664E37, 4.2168083E36, -8.623128E36, 5.2926634E36, -5.7405564E36, 1.4053346E37, 4.2354745E36, -1.3386921E37, -1.03434895E37, -2.0954919E36, -4.401094E36, 7.2897654E34, 1.7060787E37

The weights already have extremely large positive/negative magnitudes, and summing them exceeds the range of float. For example, the sixth element is 3.2835947E38 + 2.3454244E37, which is above the largest finite float (about 3.4028235E38). So Infinity/-Infinity weights are produced there.
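The overflow can be reproduced without Spark. The following is a minimal sketch, not the code from the PR: `FloatOverflowDemo`, its `saxpy` (a hand-written stand-in for `blas.saxpy` with unit strides), and `infiniteIndices` are hypothetical names introduced only for illustration. It adds the first and sixth elements of `v2` to the matching elements of `v1` from the dump above.

```scala
object FloatOverflowDemo {
  // Hand-written stand-in (an assumption, not the netlib call) for
  // blas.saxpy(n, alpha, x, 1, y, 1): updates y in place as y := alpha * x + y.
  def saxpy(alpha: Float, x: Array[Float], y: Array[Float]): Unit = {
    var i = 0
    while (i < x.length) {
      y(i) += alpha * x(i)
      i += 1
    }
  }

  // Returns the positions holding +Infinity or -Infinity.
  def infiniteIndices(v: Array[Float]): Seq[Int] =
    v.indices.filter(i => java.lang.Float.isInfinite(v(i)))

  def main(args: Array[String]): Unit = {
    // First and sixth elements of v1 (before saxpy) and v2 from the dump.
    val v1 = Array(3.7545144E37f, 3.2835947E38f)
    val v2 = Array(2.681796E36f, 2.3454244E37f)

    saxpy(1.0f, v2, v1)

    println(s"Float.MaxValue = ${Float.MaxValue}")
    println(s"v1 after saxpy = ${v1.mkString(", ")}")
    println(s"infinite at indices: ${infiniteIndices(v1).mkString(", ")}")
  }
}
```

The first sum stays finite, while the second exceeds `Float.MaxValue` and becomes `Infinity`. A check like `infiniteIndices` could be used while debugging to find the first aggregation step at which a weight overflows.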
