Hi, Min,

Thanks for clarifying the question. Since the bins do NOT match, they are not currently used to resolve join conditions. There are ways to use these indexes to resolve join conditions anyway, but we are not currently doing that.
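As a rough illustration of one such approach (not something FastBit currently does, and not taken from its code), the bin boundaries from the example quoted below can be used to find which pairs of bins could possibly hold equal values; only rows falling in those pairs then need their raw values compared. All function names here are hypothetical.

```python
from collections import defaultdict

def overlapping_bin_pairs(left_bins, right_bins):
    """Return (i, j) for every pair of half-open ranges [lo, hi) that intersect."""
    pairs = []
    for i, (lo1, hi1) in enumerate(left_bins):
        for j, (lo2, hi2) in enumerate(right_bins):
            if lo1 < hi2 and lo2 < hi1:
                pairs.append((i, j))
    return pairs

def bin_index(value, bins):
    """Index of the bin containing value, or None if it falls outside all bins."""
    for i, (lo, hi) in enumerate(bins):
        if lo <= value < hi:
            return i
    return None

def join_via_bins(left_values, right_values, left_bins, right_bins):
    """Equi-join that compares raw values only inside overlapping bin pairs."""
    left_rows, right_rows = defaultdict(list), defaultdict(list)
    for row, v in enumerate(left_values):
        left_rows[bin_index(v, left_bins)].append(row)
    for row, v in enumerate(right_values):
        right_rows[bin_index(v, right_bins)].append(row)
    matches = []
    # Rows outside every bin are stored under the key None and never visited.
    for i, j in overlapping_bin_pairs(left_bins, right_bins):
        for a in left_rows[i]:
            for b in right_rows[j]:
                if left_values[a] == right_values[b]:  # candidate check
                    matches.append((a, b))
    return sorted(matches)

# Bin boundaries taken from the example in this thread.
table1_c1_bins = [(0.0, 0.3), (0.3, 0.6), (0.6, 0.9), (0.9, 1.2)]
table2_d2_bins = [(0.1, 0.4), (0.4, 0.7), (0.7, 1.0)]
candidate_pairs = overlapping_bin_pairs(table1_c1_bins, table2_d2_bins)
```

With these boundaries, 6 of the 12 possible bin pairs overlap, so the bins alone cannot answer the join; they only narrow down which rows need a raw-value comparison.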
John

On 3/19/2010 8:09 PM, Min Zhou wrote:
> Hi, John,
>
> Since the table with high-cardinality attributes would use bins,
> let's take an example for discussion. If we have two tables like below,
>
> table1 column c1
> bin0 : value range [0.0-0.3)
> bin1 : value range [0.3-0.6)
> bin2 : value range [0.6-0.9)
> bin3 : value range [0.9-1.2)
>
> table2 column d2
> bin0 : value range [0.1-0.4)
> bin1 : value range [0.4-0.7)
> bin2 : value range [0.7-1.0)
>
> How do they do such a query "select table1.c1 as x, table1.c2,
> table2.d2 from table1 join table2 on c1 = d1" if both c1, d1 have
> high-cardinality attributes?
>
> Thanks,
> Min
>
> On Fri, Mar 19, 2010 at 11:54 PM, K. John Wu<[email protected]> wrote:
>> Hi, Min,
>>
>> I am somewhat unsure of what operations you are referring to by
>> "high-cardinality table join." The following is a quick description
>> of the binning strategy. Please clarify your question and I will give
>> it another try to answer it.
>>
>> John
>>
>> ----------------------
>> One can explicitly tell FastBit to bin any numerical values by using
>> an indexing specification containing a <binning .../> directive.
>> However, if you neglect to specify an explicit directive, here is what
>> happens.
>>
>> - for integer values, if the difference between the min and max is
>> less than 1000 or less than 10% of the number of rows, then each
>> distinct value will get its own bin (i.e., no binning). Otherwise, a
>> default binning strategy is used.
>>
>> - for floating-point values, the default binning strategy is used
>>
>> - the default binning strategy samples the current values, builds an
>> exact histogram on the sampled values, and divides the histogram into a
>> certain number of bins, typically around 10,000 bins. We call this
>> approximate equal-weight bins.
>>
>> On 3/19/2010 3:57 AM, Min Zhou wrote:
>>> Hi all,
>>> Can anyone give me a description on the implementation fastbit deal
>>> with high-cardinality table join?
>>> Does it use binning? How do they join?
>>>
>>> Thanks,
>>> Min
>> _______________________________________________
>> FastBit-users mailing list
>> [email protected]
>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
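The default binning rules described in the thread above can be sketched as follows. This is only an illustration of the stated rules, not FastBit's implementation: the function names, the sampling method, and the way boundaries are cut from the sorted sample are all assumptions.

```python
import random

def integers_need_binning(values):
    """Apply the rule quoted above: integer columns get one bin per distinct
    value when (max - min) is below 1000 or below 10% of the row count."""
    span = max(values) - min(values)
    return not (span < 1000 or span < 0.1 * len(values))

def equal_weight_boundaries(values, n_bins, sample_size=None, seed=0):
    """Approximate equal-weight bin boundaries computed from a sample.

    Sorting the sample stands in for building an exact histogram on it;
    boundaries are cut so each bin holds roughly the same number of samples.
    """
    sample = list(values)
    if sample_size is not None and sample_size < len(sample):
        sample = random.Random(seed).sample(sample, sample_size)
    sample.sort()
    boundaries = []
    for k in range(1, n_bins):
        b = sample[(k * len(sample)) // n_bins]
        # Skip repeated boundaries caused by heavily duplicated values.
        if not boundaries or b > boundaries[-1]:
            boundaries.append(b)
    return boundaries

# Usage: four equal-weight bins over 100 evenly spread values.
cuts = equal_weight_boundaries([float(v) for v in range(100)], n_bins=4)
```

Because the boundaries follow the data distribution of each column, two columns binned independently will generally end up with different boundaries, which is the mismatch discussed in this thread.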
