big data Mon, 27 May 2019 02:09:25 -0700
Hi all, I have many binary files stored in HDFS. I use SparkContext.binaryFiles to load them into an RDD and then pass them on for computation.
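For context, the loading step looks roughly like the sketch below. It assumes PySpark; the helper name `load_binary_files`, the HDFS path, and the `min_partitions` value are placeholders for illustration, not the real job settings.

```python
def load_binary_files(sc, path, min_partitions=None):
    """Load every file under `path` as one (file_path, file_bytes) record.

    `sc` is expected to be a pyspark.SparkContext. `min_partitions` is passed
    through as a suggested minimum number of partitions; Spark treats it as a
    hint, not a guarantee.
    """
    return sc.binaryFiles(path, minPartitions=min_partitions)


# Hypothetical usage (path and partition count are made up):
#
#   from pyspark import SparkContext
#   sc = SparkContext(appName="binary-load")
#   rdd = load_binary_files(sc, "hdfs:///data/blobs", min_partitions=200)
#   sizes = rdd.mapValues(len)   # (file_path, size_in_bytes) per file
```

Each file is read whole into a single record, so the cost of this step depends on both the number of files and the size of each one.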
However, loading the files is the bottleneck. Is there any way to improve the performance of loading binary files? Thanks.