Fei Hu created SYSTEMML-1830:
--------------------------------

             Summary: Improve the data locality for the tasks in ParFor body
                 Key: SYSTEMML-1830
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1830
             Project: SystemML
          Issue Type: Improvement
    Affects Versions: SystemML 1.0
            Reporter: Fei Hu
            Assignee: Fei Hu

For {{RemoteParForSpark}}, the tasks are parallelized without considering the data locality of the input matrices. This causes a lot of data shuffling when the input data is large. We can predict where the data of the input matrices is located and attach this location information to the tasks when parallelizing the ParFor program body.
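
A rough sketch of the idea, not the SystemML implementation: for each ParFor task, count how many of the input blocks it reads are stored on each host, and keep the top-ranked hosts as location hints for the scheduler. The names {{preferred_hosts}} and {{attach_locations}}, the block-id-to-hosts map, and the {{top_k}} parameter are all hypothetical and only serve the illustration. In Spark, (task, hosts) pairs of this shape could then be handed to the Scala API {{SparkContext.makeRDD(seq: Seq[(T, Seq[String])])}}, which accepts location preferences per element.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple


def preferred_hosts(
    task_blocks: Sequence[Hashable],
    block_hosts: Dict[Hashable, Sequence[str]],
    top_k: int = 2,
) -> List[str]:
    """Rank hosts by how many of the task's input blocks they store.

    block_hosts is an assumed lookup from block id to the hosts holding a
    replica of that block; blocks missing from the map contribute nothing.
    """
    counts: Counter = Counter()
    for block in task_blocks:
        for host in block_hosts.get(block, ()):
            counts[host] += 1
    # Most blocks first; ties are broken by host name to stay deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [host for host, _ in ranked[:top_k]]


def attach_locations(
    tasks: Sequence[Tuple[str, Sequence[Hashable]]],
    block_hosts: Dict[Hashable, Sequence[str]],
    top_k: int = 2,
) -> List[Tuple[str, List[str]]]:
    """Pair every task id with its preferred hosts (possibly an empty list)."""
    return [
        (task_id, preferred_hosts(blocks, block_hosts, top_k))
        for task_id, blocks in tasks
    ]


if __name__ == "__main__":
    # Hypothetical layout: three matrix blocks spread over three nodes.
    block_hosts = {0: ["node1", "node2"], 1: ["node1"], 2: ["node3"]}
    tasks = [("t0", [0, 1]), ("t1", [2])]
    for task_id, hosts in attach_locations(tasks, block_hosts):
        print(task_id, hosts)
```

A task whose blocks have no known location gets an empty host list, so the scheduler can place it anywhere. Limiting the hints to a few hosts per task is a design choice in this sketch, meant to leave the scheduler some freedom if the best host is busy.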