Hi, John

Got your point! But what I am focused on is index creation time, not the
query processing time you mentioned. Indexing some of the columns is
often faster than indexing all of them, especially for those columns
with high cardinality. Agree?

Thanks,
Min
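
[Editor's sketch] To make the cardinality point concrete, here is a minimal
Python sketch. It assumes the simplest textbook scheme, an uncompressed
equality-encoded bitmap index with one bitmap per distinct value. It does
not measure FastBit's actual build time, and FastBit's real indexes are
compressed, so the numbers only show how the raw amount of work grows with
the number of distinct values. The function names and data are hypothetical.

```python
def count_bitmaps(values):
    """One bitmap per distinct value in a basic equality-encoded index."""
    return len(set(values))

def uncompressed_bits(values):
    """Total bits if every bitmap stores one bit per row, uncompressed."""
    return count_bitmaps(values) * len(values)

n_rows = 1000
low = [i % 4 for i in range(n_rows)]   # low cardinality: 4 distinct values
high = list(range(n_rows))             # high cardinality: all values distinct

print(count_bitmaps(low), uncompressed_bits(low))    # 4 4000
print(count_bitmaps(high), uncompressed_bits(high))  # 1000 1000000
```

Under these assumptions the high-cardinality column needs 250 times as many
bitmaps as the low-cardinality one for the same number of rows.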

On Sat, Nov 7, 2009 at 11:51 AM, K. John Wu <[email protected]> wrote:
> Hi, Min,
>
> You are probably concerned about two things, the index size and query
> processing time.
>
> In terms of index size, one typically has enough disk space to store
> all the indexes.  Therefore, it should not be a serious problem.  If
> you really really don't want to index something, the option is there.
>
> In terms of query processing time, FastBit does not blindly use the
> indexes.  It only uses an index if it is expected to reduce the query
> processing time.  Unused indexes are not read into memory and therefore
> do not unnecessarily increase the memory usage either.  Having the
> indexes around gives it more options for deciding how to evaluate a
> query.  Some DBMS systems have trouble deciding which options to take
> when there are too many options for answering a query.  In this
> regard, FastBit is a relatively simple system and can make its
> decisions on evaluation strategy very quickly.
>
> Hope this helps.
>
> John
>
>
> On 11/6/2009 7:05 PM, Min Zhou wrote:
>> Thanks, John
>>
>> Another question. Why did you decide to index all dimensions? It
>> seems improper if some of them have a small number of distinct
>> values and the others are high-cardinality columns.
>>
>>
>> Thanks,
>> Min
>>
>> On Thu, Nov 5, 2009 at 10:53 PM, K. John Wu <[email protected]> wrote:
>>> Hi, Min,
>>>
>>> I think you are on the right track.  FastBit actually indexes
>>> everything unless you specify "index = none".  When answering a query,
>>> the most common operations are bitwise OR operations.
>>>
>>> John
>>>
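
[Editor's sketch] The remark above about bitwise OR can be illustrated with
a small Python example. It assumes a basic equality-encoded index (one
bitmap per distinct value) and uses plain Python integers as bitmaps. This
is not FastBit's code or API; the function names and the data are
hypothetical, and FastBit's own indexes are compressed and more elaborate.

```python
from collections import defaultdict

def build_index(values):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    index = defaultdict(int)
    for row, v in enumerate(values):
        index[v] |= 1 << row
    return dict(index)

def less_than(index, bound):
    """Answer 'value < bound' by OR-ing the bitmaps of all qualifying values."""
    result = 0
    for v, bitmap in index.items():
        if v < bound:
            result |= bitmap
    return result

def rows_of(bitmap):
    """List the 0-based row numbers whose bits are set."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

col4 = [1.0, 2.5, 3.0, 0.5, 1.0]   # invented data
index = build_index(col4)
hits = less_than(index, 1.5)
print(rows_of(hits))  # [0, 3, 4]
```

The range condition is answered without touching the raw column: the
bitmaps for 0.5 and 1.0 are OR-ed together to form the answer bitmap.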
>>>
>>> On 11/5/2009 12:42 AM, Min Zhou wrote:
>>>> Hi John,
>>>>
>>>> It seems all columns will be indexed if index=true has been turned on.
>>>> When the required bitmaps are loaded, an operation like a join is
>>>> done to filter the records that satisfy those bitmaps. Am I right?
>>>>
>>>>
>>>> Regards,
>>>> Min
>>>>
>>>> On Tue, Nov 3, 2009 at 12:38 AM, K. John Wu <[email protected]> wrote:
>>>>> Hi, Min,
>>>>>
>>>>> In a bitmap index, each bit of a bitmap corresponds to a particular
>>>>> row.  Typically, the 1st bit corresponds to the 1st row, the 2nd bit
>>>>> corresponds to the 2nd row, and so on.  In resolving a query condition
>>>>> such as "col4 < 1.5", a bitmap is produced to represent the answer.
>>>>> For example, if the 1st row and the 4th row of your data satisfy this
>>>>> condition, then the 1st bit of the answer and the 4th bit of the
>>>>> answer are set to 1, while the other bits are set to 0.  Based on this
>>>>> answer, we can retrieve the values of col1, col2, and col3 from the
>>>>> 1st row and the 4th row.
>>>>>
>>>>> Taking the Java API as an example, one specifies the where clause by
>>>>> calling build_query, which resolves the query condition and produces a
>>>>> bitmap to store the answer.  To retrieve the rows that satisfy the query,
>>>>> one invokes the function get_qualified_ints and friends.  In this
>>>>> case, the bitmap is not exposed to the client code.
>>>>>
>>>>> Hope this helps.
>>>>>
>>>>> John
>>>>>
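
[Editor's sketch] The message above describes an answer bitmap with one bit
per row, which is then used to fetch the selected columns. A minimal Python
sketch of that idea follows. The table contents and the function names are
invented for illustration, and the bitmap here is produced by scanning col4
directly rather than by consulting an index, so this shows only how the
answer bitmap drives row retrieval, not how FastBit computes it.

```python
# Hypothetical four-row table; the values are invented for illustration.
table = {
    "col1": [10, 20, 30, 40],
    "col2": ["a", "b", "c", "d"],
    "col3": [0.1, 0.2, 0.3, 0.4],
    "col4": [1.0, 2.5, 3.0, 0.5],
}

def answer_bitmap(values, predicate):
    """Return one bit per row: 1 if the row satisfies the predicate."""
    return [1 if predicate(v) else 0 for v in values]

def fetch_rows(table, columns, bitmap):
    """Collect the requested columns for every row whose bit is 1."""
    return [
        tuple(table[c][i] for c in columns)
        for i, bit in enumerate(bitmap)
        if bit
    ]

# select col1, col2, col3 from tbl where col4 < 1.5
bitmap = answer_bitmap(table["col4"], lambda v: v < 1.5)
rows = fetch_rows(table, ["col1", "col2", "col3"], bitmap)
print(bitmap)  # [1, 0, 0, 1]
print(rows)    # [(10, 'a', 0.1), (40, 'd', 0.4)]
```

As in the example in the message, the 1st and 4th rows satisfy the
condition, so only the 1st and 4th bits of the answer are set.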
>>>>>
>>>>> On 11/1/2009 5:19 PM, Min Zhou wrote:
>>>>>> Hi John,
>>>>>>
>>>>>> I guess I might not have expressed clearly what I meant. Take a query
>>>>>> like the one below:
>>>>>> select col1, col2, col3 from tbl where col4 < 1.5
>>>>>>
>>>>>> How does FastBit find the records for col1, col2, col3 where col4 is
>>>>>> less than 1.5?
>>>>>>
>>>>>> Thanks,
>>>>>> Min
>>>>>>
>>>>> _______________________________________________
>>>>> FastBit-users mailing list
>>>>> [email protected]
>>>>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
>>>>>
>>>>
>>>>
>>>
>>
>>
>>
>



-- 
My research interests are distributed systems, parallel computing and
bytecode based virtual machine.

My profile:
http://www.linkedin.com/in/coderplay
My blog:
http://coderplay.javaeye.com
