Matthias Boehm created SYSTEMML-951:
---------------------------------------

             Summary: Efficient spark right indexing via lookup
                 Key: SYSTEMML-951
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-951
             Project: SystemML
          Issue Type: Task
          Components: Runtime
            Reporter: Matthias Boehm


So far, all versions of the Spark right indexing instructions require a full scan over the data set. With existing partitioning (which happens anyway during any conversion from an external format to binary block), such a full scan is unnecessary if we are only interested in a small subset of the data. This task adds an efficient right indexing operation via RDD lookups, which access at most <num_lookup> partitions given existing hash partitioning.

cc [~mwdus...@us.ibm.com]

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
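
For illustration, the lookup idea described in this issue can be sketched in plain Python. This is not SystemML or Spark code: the block size, the partition count, the toy hash function, and all function names (`partition_of`, `to_partitioned_blocks`, `right_index_via_lookup`) are hypothetical and only simulate a hash-partitioned collection of matrix blocks keyed by (row block, column block). The point it tries to show is that, once the partitioning is known, a right indexing request only needs to visit the partitions that own the requested block keys, so the number of partitions touched is bounded by the number of lookups.

```python
from itertools import product

BLOCK = 2            # hypothetical block size (rows and columns per block)
NUM_PARTITIONS = 4   # hypothetical number of hash partitions


def partition_of(key, num_partitions=NUM_PARTITIONS):
    """Toy deterministic stand-in for a hash partitioner over block keys."""
    row_block, col_block = key
    return (31 * row_block + col_block) % num_partitions


def to_partitioned_blocks(matrix, block=BLOCK, num_partitions=NUM_PARTITIONS):
    """Cut a dense matrix into blocks with 1-based keys and hash-partition them."""
    partitions = [dict() for _ in range(num_partitions)]
    nrows, ncols = len(matrix), len(matrix[0])
    for r0 in range(0, nrows, block):
        for c0 in range(0, ncols, block):
            key = (r0 // block + 1, c0 // block + 1)
            data = [row[c0:c0 + block] for row in matrix[r0:r0 + block]]
            partitions[partition_of(key, num_partitions)][key] = data
    return partitions


def right_index_via_lookup(partitions, rl, ru, cl, cu, block=BLOCK):
    """Return matrix[rl..ru, cl..cu] (1-based, inclusive) and the partitions visited."""
    touched = set()
    out = [[None] * (cu - cl + 1) for _ in range(ru - rl + 1)]
    # Block keys that overlap the requested index range.
    row_blocks = range((rl - 1) // block + 1, (ru - 1) // block + 2)
    col_blocks = range((cl - 1) // block + 1, (cu - 1) // block + 2)
    for key in product(row_blocks, col_blocks):
        # One lookup per key: go straight to the partition that owns it.
        pid = partition_of(key, len(partitions))
        touched.add(pid)
        data = partitions[pid][key]
        r_off = (key[0] - 1) * block  # 0-based global row of the block's first row
        c_off = (key[1] - 1) * block  # 0-based global column of the block's first column
        for i, row in enumerate(data):
            for j, val in enumerate(row):
                r, c = r_off + i + 1, c_off + j + 1  # 1-based global indexes
                if rl <= r <= ru and cl <= c <= cu:
                    out[r - rl][c - cl] = val
    return out, touched


# Usage: a 6x6 matrix becomes 9 blocks; the request below needs only 2 of them.
matrix = [[i * 6 + j for j in range(6)] for i in range(6)]
parts = to_partitioned_blocks(matrix)
sub, touched = right_index_via_lookup(parts, rl=3, ru=4, cl=1, cu=3)
```

In Spark itself, the corresponding building block would presumably be `lookup(key)` on a pair RDD, which is documented to search only the partition the key maps to when the RDD has a known partitioner. How SystemML wires this into its right indexing instructions is not shown here.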