[jira] [Created] (SYSTEMML-951) Efficient spark right indexing via lookup

Matthias Boehm (JIRA) Thu, 22 Sep 2016 21:34:47 -0700

Matthias Boehm created SYSTEMML-951:
---------------------------------------


             Summary: Efficient spark right indexing via lookup
                 Key: SYSTEMML-951
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-951
             Project: SystemML
          Issue Type: Task
          Components: Runtime
            Reporter: Matthias Boehm


So far, all versions of the Spark right indexing instruction require a full scan 
over the data set. If the data is already partitioned (which happens anyway for 
any conversion from an external format to binary block), such a full scan is 
unnecessary when we are only interested in a small subset of the data. This task 
adds an efficient right indexing operation via 'rdd lookups', which access at 
most <num_lookup> partitions given existing hash partitioning. 
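
To illustrate the idea, here is a minimal plain-Python sketch (not Spark or 
SystemML code). It assumes a matrix stored as square blocks keyed by 0-based 
(row block, column block) index, and a toy hash partitioner. All names 
(partition_of, build_partitions, right_index_via_lookup, blen) are hypothetical 
and chosen for illustration only. Each block that overlaps the requested range 
is fetched from the single partition its key hashes to, so the number of 
partitions touched is at most the number of lookups.

```python
from collections import defaultdict


def partition_of(key, num_partitions):
    # Toy, deterministic stand-in for a hash partitioner (assumption:
    # any fixed hash of the block index would serve the same purpose).
    bi, bj = key
    return (31 * bi + bj) % num_partitions


def make_blocks(nrows, ncols, blen):
    # Cell (r, c) holds r * ncols + c; dimensions are assumed to be
    # multiples of blen to keep the sketch short.
    blocks = {}
    for bi in range(nrows // blen):
        for bj in range(ncols // blen):
            blocks[(bi, bj)] = [
                [(bi * blen + r) * ncols + (bj * blen + c) for c in range(blen)]
                for r in range(blen)
            ]
    return blocks


def build_partitions(blocks, num_partitions):
    # Group blocks by partition id, mimicking existing hash partitioning.
    parts = defaultdict(dict)
    for key, block in blocks.items():
        parts[partition_of(key, num_partitions)][key] = block
    return parts


def right_index_via_lookup(parts, num_partitions, rl, ru, cl, cu, blen):
    # Extract rows rl..ru and columns cl..cu (0-based, inclusive).
    out = [[None] * (cu - cl + 1) for _ in range(ru - rl + 1)]
    touched = set()
    for bi in range(rl // blen, ru // blen + 1):
        for bj in range(cl // blen, cu // blen + 1):
            # One lookup per overlapping block: only the partition the
            # key hashes to is accessed, never the whole data set.
            p = partition_of((bi, bj), num_partitions)
            touched.add(p)
            block = parts[p][(bi, bj)]
            for r in range(max(rl, bi * blen), min(ru, (bi + 1) * blen - 1) + 1):
                for c in range(max(cl, bj * blen), min(cu, (bj + 1) * blen - 1) + 1):
                    out[r - rl][c - cl] = block[r - bi * blen][c - bj * blen]
    return out, touched


blocks = make_blocks(8, 8, 2)
parts = build_partitions(blocks, 4)
sub, touched = right_index_via_lookup(parts, 4, 2, 3, 4, 5, 2)
print(sub, sorted(touched))
```

In this toy setup the requested 2x2 range falls into a single block, so only 
one of the four partitions is accessed. A range that spans several blocks 
issues one lookup per block and touches at most that many partitions.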

cc [[email protected]]  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
