mboehm7 commented on pull request #1481: URL: https://github.com/apache/systemds/pull/1481#issuecomment-1003424325
For PCA with many columns we have PPCA, which runs this scenario as follows:

```
Total elapsed time:               6.880 sec.
Total compilation time:           1.586 sec.
Total execution time:             5.293 sec.
Number of compiled Spark inst:    0.
Number of executed Spark inst:    0.
Cache hits (Mem/Li/WB/FS/HDFS):   21/0/0/0/0.
Cache writes (Li/WB/FS/HDFS):     3/6/0/2.
Cache times (ACQr/m, RLS, EXP):   0.000/0.000/0.004/4.511 sec.
HOP DAGs recompiled (PRED, SB):   0/0.
HOP DAGs recompile time:          0.000 sec.
Spark ctx create time (lazy):     0.000 sec.
Spark trans counts (par,bc,col):  0/0/0.
Spark trans times (par,bc,col):   0.000/0.000/0.000 secs.
Total JIT compile time:           3.13 sec.
Total JVM GC count:               4.
Total JVM GC time:                0.138 sec.
Heavy hitter instructions:
  #  Instruction  Time(s)  Count
  1  write          4.511      2
  2  rand           0.324      7
  3  +              0.256     12
  4  leftIndex      0.161      3
  5  /              0.025      1
  6  n+             0.017      1
  7  *              0.015      5
  8  rmvar          0.007     15
  9  ceil           0.004      2
 10  createvar      0.001     21
```

Similar to lm and als, we should dispatch to the appropriate pca algorithm based on heuristics.

Thanks.
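To make the dispatch idea concrete, here is a rough sketch in Python (for illustration only, not DML). The function name `choose_pca_variant`, the `max_dense_cols` threshold, and its default value are all hypothetical and not taken from SystemDS. The only assumption is that the standard PCA works on an `n_cols x n_cols` covariance matrix and therefore becomes expensive as the number of columns grows.

```python
def choose_pca_variant(n_rows, n_cols, k, max_dense_cols=10_000):
    """Pick a PCA variant from the input shape.

    Illustrative heuristic only: the threshold is a made-up placeholder,
    not a tuned or documented SystemDS value.
    """
    if n_rows <= 0 or n_cols <= 0:
        raise ValueError("matrix dimensions must be positive")
    if k <= 0 or k > n_cols:
        raise ValueError("k must be in [1, n_cols]")

    # Assumption: the standard PCA materializes an n_cols x n_cols
    # covariance matrix, so very wide inputs are routed to PPCA instead.
    if n_cols > max_dense_cols:
        return "ppca"
    return "pca"
```

A real heuristic would likely also take sparsity and the available memory budget into account; the sketch only shows the shape-based branch.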