GitHub user DaveBirdsall commented on a diff in the pull request:
https://github.com/apache/incubator-trafodion/pull/195#discussion_r46222783
--- Diff: core/sql/sqlcomp/nadefaults.cpp ---
@@ -2180,6 +2180,10 @@ SDDkwd__(ISO_MAPPING, (char *)SQLCHARSETSTRING_ISO88591),
XDDkwd__(MDAM_SCAN_METHOD, "ON"),
DDflt0_(MDAM_SELECTION_DEFAULT, "0.5"),
+
+ DDflt0_(MDAM_TOTAL_UEC_CHECK_MIN_RC_THRESHOLD, "10000"),
+ DDflt0_(MDAM_TOTAL_UEC_CHECK_UEC_THRESHOLD, "0.01"),
--- End diff --
Hi Hans,
Actually, your Case 2 is better suited for MDAM than Case 1. In Case 2,
we'll materialize each of the values of A (99 of them) and, for each one, do
a begin/end subset from B = 1 to B = 90. That is just 99 subsets. In Case 1,
we'll materialize each of the values of A (done 99 times), and for each value
of A, materialize the values of B (99 x 99 = 9801 times, or fewer if there are
functional dependencies, which MC stats would show), and then for each of
these, do a begin/end subset on C. Each materialization is potentially a
random I/O, not to mention a significantly longer path length than simply
traversing to the next row sequentially. But more to your point, the costing
code takes these differences into account.
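
To make the arithmetic above concrete, here is a minimal sketch of the probe counting. It is not the actual Trafodion costing code: the names `MdamProbeEstimate` and `estimateMdamProbes` are hypothetical, and it assumes the leading key columns are independent (no functional dependencies), so the counts simply multiply.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical tally of the work an MDAM-style scan performs.
struct MdamProbeEstimate {
    std::uint64_t materializations;  // value look-ups on enumerated leading columns
    std::uint64_t subsets;           // begin/end subsets on the predicate column
};

// leadingUecs holds the unique entry count (UEC) of each key column that
// precedes the column carrying the range predicate, in key order.
// Assumption: the columns are independent, so the counts multiply; real
// multi-column statistics could lower them.
MdamProbeEstimate estimateMdamProbes(const std::vector<std::uint64_t>& leadingUecs) {
    MdamProbeEstimate est{0, 1};
    for (std::uint64_t uec : leadingUecs) {
        assert(uec > 0 && "each enumerated column needs at least one value");
        est.subsets *= uec;                   // combinations enumerated so far
        est.materializations += est.subsets;  // one look-up per combination
    }
    return est;
}
```

Under these assumptions, Case 2 (`{99}`) yields 99 materializations and 99 subsets, while Case 1 (`{99, 99}`) yields 99 + 9801 = 9900 materializations and 9801 subsets.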