Just to be sure: if I execute a Scan from inside Spark, the execution goes through the RegionServers and I get all the features of an HBase Scan (filters and so on), with the parallelization handled by the RegionServers (even though I'm running the program with Spark). Whereas if I use TableInputFormat, I read all the column families (even if I don't want to), with no prior filtering either: it just opens the files of an HBase table and processes them completely. All the parallelization is done by Spark, and HBase isn't used at all; it just reads from HDFS the files that HBase stored for that specific table.
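For reference, this is the kind of TableInputFormat read I mean (just a sketch; `my_table` is a placeholder table name, and I'm assuming the standard `newAPIHadoopRDD` wiring):

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.{SparkConf, SparkContext}

// Point the input format at the table ("my_table" is a placeholder).
val hbaseConf = HBaseConfiguration.create()
hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_table")

val sc = new SparkContext(new SparkConf().setAppName("hbase-read"))

// Each RDD element is a (row key, Result) pair for one HBase row.
val rdd = sc.newAPIHadoopRDD(
  hbaseConf,
  classOf[TableInputFormat],
  classOf[ImmutableBytesWritable],
  classOf[Result])
```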
Am I missing something?