Takeshi Yamamuro created HIVEMALL-185:
-----------------------------------------
Summary: Add an optimizer rule to push down a Sample plan node into fact tables
Key: HIVEMALL-185
URL: https://issues.apache.org/jira/browse/HIVEMALL-185
Project: Hivemall
Issue Type: Sub-task
Reporter: Takeshi Yamamuro
Assignee: Takeshi Yamamuro

Sampling is a common technique for extracting a subset of joined relations (fact tables and dimension tables) to use as training data. The Spark optimizer cannot push a Sample plan node down into the larger fact tables because the node is non-deterministic. However, by using referential integrity (RI) constraints, we could push this node down into fact tables in some cases.
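To illustrate why an RI constraint can make the push-down safe, here is a minimal sketch in plain Python. It is a toy model, not Spark's optimizer or the rule proposed in this issue: the tables, the `sample` and `join` helpers, and the seed are all hypothetical. It assumes every fact row has a non-null foreign key that matches exactly one dimension row, so the join emits exactly one output row per fact row.

```python
import random

# Hypothetical toy tables. The RI constraint assumed here: every fact row
# carries a non-null foreign key that matches exactly one dimension row.
dim = {1: "books", 2: "games", 3: "music"}                    # pk -> attribute
fact = [(row_id, 1 + row_id % 3) for row_id in range(1000)]   # (row_id, fk)


def sample(rows, fraction, seed):
    """Bernoulli sample: one seeded random draw per input row, in order."""
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < fraction]


def join(fact_rows, dim_rows):
    """Inner equi-join on fk = pk that preserves the order of the fact rows."""
    return [(row_id, fk, dim_rows[fk])
            for row_id, fk in fact_rows if fk in dim_rows]


# Under the constraint the join emits exactly one row per fact row, so the
# random draws line up and both plans select the same rows.
sampled_after = sample(join(fact, dim), fraction=0.1, seed=42)
sampled_before = join(sample(fact, fraction=0.1, seed=42), dim)
print(sampled_after == sampled_before)

# A dangling foreign key breaks the one-to-one correspondence: the join drops
# that row, so sampling the fact table is no longer the same operation.
bad_fact = fact + [(1000, 99)]
print(len(join(bad_fact, dim)), len(bad_fact))
```

The second plan joins only the sampled fact rows, which is the saving the push-down is after. The sketch ignores partitioning and join ordering, so it only demonstrates the one-to-one argument and says nothing about how the rule would be implemented.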