I am going to run some test experiments on the Wikipedia dataset using the Spark framework. I know the Wikipedia dataset may already have been analyzed, but which aspects of Spark, explored or unexplored, can be tested and benchmarked on it?
Thanks,
AJ