Hi, John

Got your point! But what I focused on is index creation time, not the query processing time you mentioned. Indexing some of the columns is often faster than indexing all of them, especially for columns with high cardinality. Agree?

Thanks,
Min

On Sat, Nov 7, 2009 at 11:51 AM, K. John Wu <[email protected]> wrote:
> Hi, Min,
>
> You are probably concerned about two things, the index size and query
> processing time.
>
> In terms of index size, one typically has enough disk space to store
> all the indexes. Therefore, it should not be a serious problem. If
> you really really don't want to index something, the option is there.
>
> In terms of query processing time, FastBit does not blindly use the
> indexes. It only uses an index if it is expected to reduce the query
> processing time. Unused indexes are not read into memory and therefore
> do not unnecessarily increase the memory usage either. Having the
> indexes around gives it more options for deciding how to evaluate a
> query. Some DBMS systems have trouble deciding which options to take
> when there are too many options for answering a query. In this
> regard, FastBit is a relatively simple system and can make its
> decisions on evaluation strategy very quickly.
>
> Hope this helps.
>
> John
>
>
> On 11/6/2009 7:05 PM, Min Zhou wrote:
>> Thanks, John
>>
>> Another question. Why did you decide to index all dimensions? It
>> seems improper if some of them have a small number of distinct
>> values, and the others are high-cardinality columns.
>>
>>
>> Thanks,
>> Min
>>
>> On Thu, Nov 5, 2009 at 10:53 PM, K. John Wu <[email protected]> wrote:
>>> Hi, Min,
>>>
>>> I think you are on the right track. FastBit actually indexes
>>> everything unless you specify "index = none". When answering a query,
>>> the most common operation is the bitwise OR.
>>>
>>> John
>>>
>>>
>>> On 11/5/2009 12:42 AM, Min Zhou wrote:
>>>> Hi John,
>>>>
>>>> It seems all columns will be indexed if index=true has been turned on.
>>>> When a bitmap needs to be loaded, an operation like a join will be
>>>> done to filter the records that satisfy those bitmaps. Am I right?
>>>>
>>>>
>>>> Regards,
>>>> Min
>>>>
>>>> On Tue, Nov 3, 2009 at 12:38 AM, K. John Wu <[email protected]> wrote:
>>>>> Hi, Min,
>>>>>
>>>>> In a bitmap index, each bit of a bitmap corresponds to a particular
>>>>> row. Typically, the 1st bit corresponds to the 1st row, the 2nd bit
>>>>> corresponds to the 2nd row, and so on. In resolving a query condition
>>>>> such as "col4 < 1.5", a bitmap is produced to represent the answer.
>>>>> For example, if the 1st row and the 4th row of your data satisfy this
>>>>> condition, then the 1st bit of the answer and the 4th bit of the
>>>>> answer are set to 1, while the other bits are set to 0. Based on this
>>>>> answer, we can retrieve the values of col1, col2, and col3 from the
>>>>> 1st row and the 4th row.
>>>>>
>>>>> Take the Java API as an example: one specifies the where clause by
>>>>> calling build_query, which resolves the query condition and produces a
>>>>> bitmap to store the answer. To retrieve the rows that satisfy the
>>>>> query, one invokes the function get_qualified_ints and friends. In
>>>>> this case, the bitmap is not exposed to the client code.
>>>>>
>>>>> Hope this helps.
>>>>>
>>>>> John
>>>>>
>>>>>
>>>>> On 11/1/2009 5:19 PM, Min Zhou wrote:
>>>>>> Hi John,
>>>>>>
>>>>>> I guess I might not have expressed clearly what I meant. Take a
>>>>>> query like the one below:
>>>>>> select col1, col2, col3 from tbl where col4 < 1.5
>>>>>>
>>>>>> How does FastBit find the records for col1, col2, col3 where col4 is
>>>>>> less than 1.5?
>>>>>>
>>>>>> Thanks,
>>>>>> Min
>>>>>>
>>>>> _______________________________________________
>>>>> FastBit-users mailing list
>>>>> [email protected]
>>>>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users

--
My research interests are distributed systems, parallel computing and
bytecode based virtual machine.

My profile:
http://www.linkedin.com/in/coderplay
My blog:
http://coderplay.javaeye.com
_______________________________________________
FastBit-users mailing list
[email protected]
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
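
For readers of the archive, the bitmap evaluation described earlier in this thread (one bit per row, a "col4 < 1.5" condition resolved into an answer bitmap, bitwise OR as the common operation) can be sketched roughly as below. This is a toy illustration only, not FastBit's implementation or API: the function names build_bitmap_index, less_than and rows_of, the sample table, and the use of one uncompressed bitmap per distinct value are all assumptions made for the example.

```python
# Toy illustration only; this is not FastBit's implementation or API.
from collections import defaultdict


def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    index = defaultdict(int)
    for row, value in enumerate(values):
        index[value] |= 1 << row
    return dict(index)


def less_than(index, bound):
    """Answer `column < bound` by OR-ing the bitmaps of all qualifying values."""
    answer = 0
    for value, bitmap in index.items():
        if value < bound:
            answer |= bitmap
    return answer


def rows_of(bitmap):
    """Return the 0-based row numbers whose bits are set in the bitmap."""
    rows = []
    row = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows


# Hypothetical four-row table; the 1st and 4th rows satisfy col4 < 1.5.
table = {
    "col1": [10, 20, 30, 40],
    "col2": ["a", "b", "c", "d"],
    "col3": [0.1, 0.2, 0.3, 0.4],
    "col4": [1.0, 2.5, 1.5, 0.5],
}

col4_index = build_bitmap_index(table["col4"])
hits = less_than(col4_index, 1.5)  # answer bitmap for "col4 < 1.5"

# select col1, col2, col3 from tbl where col4 < 1.5
selected = [
    (table["col1"][r], table["col2"][r], table["col3"][r])
    for r in rows_of(hits)
]
```

In this sketch only col4 needs an index to answer the query; col1, col2 and col3 are read directly by row number. Note also that a column with many distinct values produces many bitmaps in this naive scheme, which gives a feel for why index creation cost depends on cardinality.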
