Takeshi Yamamuro created HIVEMALL-185:
-----------------------------------------
Summary: Add an optimizer rule to push down a Sample plan node into fact tables
Key: HIVEMALL-185
URL: https://issues.apache.org/jira/browse/HIVEMALL-185
Project: Hivemall
Issue Type: Sub-task
Reporter: Takeshi Yamamuro
Assignee: Takeshi Yamamuro
Sampling is a common technique for extracting a subset of joined relations
(fact tables and dimension tables) as training data. The Spark optimizer
cannot push a Sample plan node down into the larger fact tables because the
node is non-deterministic. However, by using referential integrity (RI)
constraints, we could push this node down into fact tables in some cases.
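
A minimal sketch of why an RI constraint can make the pushdown safe. This is a
plain-Python illustration, not Spark or Hivemall code: the tables and the
helpers `keep`, `join`, and `sample` are hypothetical, and the sampling decision
is assumed to depend only on the fact row and a seed. Because every foreign key
in the fact table matches exactly one primary key in the dimension table, the
inner join preserves fact rows one-to-one, so sampling before or after the join
selects the same rows.

```python
import random

# Hypothetical toy data: a dimension table keyed by primary key, and a fact
# table of (row_id, foreign_key) pairs. Every foreign key exists in `dim`,
# which is the referential integrity (RI) assumption.
dim = {1: "tokyo", 2: "osaka", 3: "kyoto"}
fact = [(row_id, 1 + row_id % 3) for row_id in range(1000)]


def keep(row_id, fraction, seed):
    # Bernoulli decision that depends only on the fact row id and the seed.
    return random.Random(seed * 1_000_003 + row_id).random() < fraction


def join(fact_rows, dim_table):
    # Inner join on fact.foreign_key == dim.primary_key.
    return [(rid, fk, dim_table[fk]) for rid, fk in fact_rows if fk in dim_table]


def sample(rows, fraction, seed):
    # Keep each row according to the per-row decision; row[0] is the row id.
    return [row for row in rows if keep(row[0], fraction, seed)]


# Original plan: Sample on top of the Join.
sample_after_join = sample(join(fact, dim), 0.1, 42)
# Rewritten plan: Sample pushed down into the fact table.
join_after_sample = join(sample(fact, 0.1, 42), dim)

print(sample_after_join == join_after_sample)
```

Without the RI assumption the equivalence can break: a foreign key that matches
no dimension row, or several, changes how many joined rows each fact row
produces, so per-row sampling before the join no longer corresponds to per-row
sampling after it.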