Hi, I'm curious: how can we run Principal Component Analysis (PCA) on streaming data in distributed mode? And if we can, is the result mathematically valid?
Has anyone done this before? Could you share your experience with it? Does Spark provide an API for doing this in Spark Streaming? Thanks, Aakash.