Hello,
In Spark we can use *newAPIHadoopRDD* to access different distributed
systems such as HDFS, HBase, and MongoDB via different InputFormats.
Is it possible to access the *InputSplit* in Spark directly? Spark can
cache data in local memory, so performing local computation/aggregation
on the local InputSplit could speed up overall performance.

Thanks a lot
