Re: [sqlite] A littel question...

2006-07-21 Thread Cesar David Rodas Maldonado

Thanks very much to all of you!

I am working on a full-text search project with SQLite, and I will share the
code when I finish...

Thanks to all for your help.

On 7/21/06, John Stanton <[EMAIL PROTECTED]> wrote:


Cesar David Rodas Maldonado wrote:
> I do not have a substring; I have a list of words (stemmed words from several
> languages) and I just want to get the Id. Each word is unique.
>
In that case the SQLite B-tree index is about as good as you will get.
Just make sure that the word column is indexed.



Re: [sqlite] A littel question...

2006-07-21 Thread Daniel van Ham Colchete
Cesar,

in that case Clay Dowling's answer says it all:
it's faster to use a hash table as an index, preloading all your data into
RAM (without SQLite), but if you don't want to use that
amount of memory then you could use an INDEX, which builds a B-tree
that will make your search faster.

Best regards,
Daniel Colchete

Cesar David Rodas Maldonado wrote:
> I know that, but I would like to know whether it would be better to first
> transform the word into a number (with a hash function), then select on
> that number, and only after that search using the index on the word...
> Do you understand? I am sorry for my English...
>
> On 7/21/06, Daniel van Ham Colchete <[EMAIL PROTECTED]> wrote:
>>
>> Cesar David Rodas Maldonado wrote:
>> > Hello to everybody
>> >
>> > If I have a table with 100,000 unique words, I am wondering whether a
>> > SQLite select is faster and cheaper (RAM, processor, etc.), or whether I
>> > have to help SQLite by using a hash function, and what could that hash
>> > function be?
>> >
>> > Thanks.
>> >
>> Cesar,
>>
>> you should consider using an index:
>> http://www.sqlite.org/lang_createindex.html
>>
>> Best regards,
>> Daniel Colchete
>>
>>
>




Re: [sqlite] A littel question...

2006-07-21 Thread John Stanton

Cesar David Rodas Maldonado wrote:

I do not have a substring; I have a list of words (stemmed words from several
languages) and I just want to get the Id. Each word is unique.

In that case the SQLite B-tree index is about as good as you will get.
Just make sure that the word column is indexed.
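
A minimal sketch of that advice, using Python's standard sqlite3 module. The
table name `words`, its columns and the helper `word_id` are made up for
illustration; they do not come from the thread. Declaring the column UNIQUE
makes SQLite create the B-tree index on it automatically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# UNIQUE on `word` makes SQLite build an index on that column automatically.
conn.execute(
    "CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT NOT NULL UNIQUE)"
)
conn.executemany(
    "INSERT INTO words (word) VALUES (?)", [("run",), ("walk",), ("swim",)]
)

def word_id(conn, word):
    """Return the id stored for `word`, or None if the word is absent."""
    row = conn.execute(
        "SELECT id FROM words WHERE word = ?", (word,)
    ).fetchone()
    return row[0] if row else None
```

With an existing table, a separate CREATE INDEX (or CREATE UNIQUE INDEX)
statement on the word column should have the same effect on lookups.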


Re: [sqlite] A littel question...

2006-07-21 Thread John Stanton
If I were looking up a table with only 100,000 words and wanted it to be 
fast -


CASE OF SEARCHING ON WORD
a. If it is a fixed table.  Sort into alpha sequence in a flat file, 
open and memory map the file and use a binary search.  Very simple and 
fast and would find your word in a few microseconds.
b. If the table is dynamic make the file an AVL tree.  Also very fast 
and quite simple and self maintaining.
c. Make it an SQLite table and index the word column.  Simple and quite
fast.
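
Option (a) can be sketched roughly as follows. This is a simplification: a
sorted in-memory Python list stands in for the memory-mapped flat file, and
the helper name `find_word` is made up for illustration.

```python
from bisect import bisect_left

def find_word(sorted_words, word):
    """Binary search: return the position of `word`, or None if absent."""
    i = bisect_left(sorted_words, word)
    if i < len(sorted_words) and sorted_words[i] == word:
        return i
    return None

# The list must be sorted once up front, as the flat file would be.
words = sorted(["walk", "run", "swim", "jump"])
```

For 100,000 words a binary search needs about 17 comparisons per lookup,
since 2^17 = 131,072.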

CASE OF SEARCHING ON A SUBSTRING
a. Simple string.  Memory map the file of words and search using a fast
string algorithm like Boyer-Moore.  Will perform in a few milliseconds.
b. Complex search.  Memory map the file and use a regex library.
Quite fast.
c. Complex search.  Use an SQLite table and perform a LIKE or regular
expression search (slower because it has to access row-by-row).
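
A rough sketch contrasting an in-memory substring scan with option (c). The
names `contains` and `like_search` and the sample words are made up for
illustration, and Python's `in` operator stands in for a hand-written
Boyer-Moore search.

```python
import sqlite3

words = ["running", "walker", "swimming", "runner"]

def contains(words, sub):
    """Scan the in-memory list; `in` does the substring matching."""
    return [w for w in words if sub in w]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT UNIQUE)")
conn.executemany("INSERT INTO words (word) VALUES (?)", [(w,) for w in words])

def like_search(conn, sub):
    """LIKE with a leading wildcard has to examine every row."""
    rows = conn.execute(
        "SELECT word FROM words WHERE word LIKE ? ORDER BY id",
        ("%" + sub + "%",),
    )
    return [r[0] for r in rows]
```

Note that LIKE is case-insensitive for ASCII letters by default and treats
% and _ in the search term as wildcards, so the two functions are not exact
equivalents.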


If you have a much larger table hashing the words to a token may be 
better, but at only 100,000 my guess is that you are well below that 
threshold.


Cesar David Rodas Maldonado wrote:

Hello to everybody

If I have a table with 100,000 unique words, I am wondering whether a
SQLite select is faster and cheaper (RAM, processor, etc.), or whether I
have to help SQLite by using a hash function, and what could that hash
function be?

Thanks.





Re: [sqlite] A littel question...

2006-07-21 Thread Cesar David Rodas Maldonado

I know that, but I would like to know whether it would be better to first
transform the word into a number (with a hash function), then select on that
number, and only after that search using the index on the word...
Do you understand? I am sorry for my English...
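
One possible reading of this idea, sketched with Python's sqlite3 module.
Everything here is an assumption for illustration: the table layout, the
index name `words_hash`, the helper names, and the choice of CRC-32 as the
hash function. The thread does not name a hash function.

```python
import sqlite3
import zlib

def word_hash(word):
    """Map a word to a 32-bit integer (CRC-32 chosen only as an example)."""
    return zlib.crc32(word.encode("utf-8"))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE words ("
    "id INTEGER PRIMARY KEY, hash INTEGER NOT NULL, word TEXT NOT NULL)"
)
# The index is on the integer hash rather than on the text of the word.
conn.execute("CREATE INDEX words_hash ON words (hash)")
conn.executemany(
    "INSERT INTO words (hash, word) VALUES (?, ?)",
    [(word_hash(w), w) for w in ("run", "walk", "swim")],
)

def word_id_by_hash(conn, word):
    """Select by hash first, then compare the word to rule out collisions."""
    row = conn.execute(
        "SELECT id FROM words WHERE hash = ? AND word = ?",
        (word_hash(word), word),
    ).fetchone()
    return row[0] if row else None
```

The lookup on the hash column still goes through a B-tree index, only with
integer keys instead of text keys.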

On 7/21/06, Daniel van Ham Colchete <[EMAIL PROTECTED]> wrote:


Cesar David Rodas Maldonado wrote:
> Hello to everybody
>
> If I have a table with 100,000 unique words, I am wondering whether a
> SQLite select is faster and cheaper (RAM, processor, etc.), or whether I
> have to help SQLite by using a hash function, and what could that hash
> function be?
>
> Thanks.
>
Cesar,

you should consider using an index:
http://www.sqlite.org/lang_createindex.html

Best regards,
Daniel Colchete




Re: [sqlite] A littel question...

2006-07-21 Thread Cesar David Rodas Maldonado

I do not have a substring; I have a list of words (stemmed words from several
languages) and I just want to get the Id. Each word is unique.


Re: [sqlite] A littel question...

2006-07-21 Thread Daniel van Ham Colchete
Cesar David Rodas Maldonado wrote:
> Hello to everybody
>
> If I have a table with 100,000 unique words, I am wondering whether a
> SQLite select is faster and cheaper (RAM, processor, etc.), or whether I
> have to help SQLite by using a hash function, and what could that hash
> function be?
>
> Thanks.
>
Cesar,

you should consider using an index:
http://www.sqlite.org/lang_createindex.html

Best regards,
Daniel Colchete



Re: [sqlite] A littel question...

2006-07-21 Thread John Stanton

Cesar David Rodas Maldonado wrote:

Hello to everybody

If I have a table with 100,000 unique words, I am wondering whether a
SQLite select is faster and cheaper (RAM, processor, etc.), or whether I
have to help SQLite by using a hash function, and what could that hash
function be?

Thanks.

Do you want to select on whole words, first few characters in the word 
or on sub-strings?


Re: [sqlite] A littel question...

2006-07-21 Thread Clay Dowling

Cesar David Rodas Maldonado said:
> Hello to everybody
>
> If I have a table with 100,000 unique words, I am wondering whether a
> SQLite select is faster and cheaper (RAM, processor, etc.), or whether I
> have to help SQLite by using a hash function, and what could that hash
> function be?

If you're going to the trouble of building your own hash table, why bother
with SQLite?  The hash table will provide faster access to the data,
assuming that you load the entire list of words into RAM, but will be more
memory intensive than using SQLite.  This is assuming that the SQLite
database lives on disk.  If the SQLite database lives in memory, it's
still going to take less RAM (btrees are almost always more compact than
hash tables), but the speed will depend on the efficiency of your hash
implementation.
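
A minimal sketch of the "load everything into a hash table" alternative,
assuming the words already live in an SQLite table. The table `words` and
the name `word_to_id` are made up for illustration; a Python dict plays the
role of the hash table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT NOT NULL UNIQUE)"
)
conn.executemany(
    "INSERT INTO words (word) VALUES (?)", [("run",), ("walk",), ("swim",)]
)

# Read every row once and keep the whole mapping in RAM.
word_to_id = {word: id_ for id_, word in conn.execute("SELECT id, word FROM words")}
```

After the one-time load, `word_to_id.get(word)` answers lookups without
touching the database, at the cost of holding every word in memory.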

Clay
-- 
Simple Content Management
http://www.ceamus.com



[sqlite] A littel question...

2006-07-21 Thread Cesar David Rodas Maldonado

Hello to everybody

If I have a table with 100,000 unique words, I am wondering whether a
SQLite select is faster and cheaper (RAM, processor, etc.), or whether I
have to help SQLite by using a hash function, and what could that hash
function be?

Thanks.