Hi Bernd,
I have fiddled with machine translation a bit more and had fun watching
some German idioms (like killing two flies with one hit) turn into
complete nonsense. What I understood is that by default every bigFORTH
vocabulary is a hash table. That is great in terms of having the hash
data structure available at hand, and I'd like to use it. To make it
fully useful I need a way to remove a single word from a vocabulary,
unlike *forget*, which cuts off the tail. Is there already such a word?
Below are some more of my translations. I have a comment about your
testing remark at the end of the section: you say the words were
normally distributed - it should be uniformly, I guess.
The text:
\section{Hashed Vocabulary}
% The first paragraph is missing - online translator was unhelpful.
Besides disk access, the most time-consuming part is searching for a
word in a vocabulary. FORTH represents its vocabularies as linear
lists, so the search time grows linearly with the length of the
vocabulary, and often the word is not found at all (which happens in a
number of cases). Given the large vocabularies that FORTH indeed has,
a search takes noticeable time: often on the order of 1000 strings
must be compared.
This can be significantly accelerated by changing the structure of the
vocabularies. A tree, e.g., reduces the search time to logarithmic,
but managing a suitable tree (an AVL tree, e.g.) is quite
complex. A hash table, however, offers a linear speedup with
substantially simpler administration.
A key is assigned to each word. The keys should be distributed as
evenly as possible, because words are linked into different slots of
an array according to their key. When a word is looked up, its key is
computed first; then only the list of all words with the same key
needs to be scanned. Since the table in bigFORTH has 128 entries, that
is on average only $1/128$ of the vocabulary's words.
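The structure described above could be sketched like this in Python (a minimal illustration, not bigFORTH's actual code; the class, method names, and the key function are my own, chosen only to show the bucket idea):

```python
TABLE_SIZE = 128  # as in bigFORTH's 128-entry table


def key(name):
    # Illustrative key: sum of the character codes plus the length,
    # reduced modulo the table size.
    return (sum(name.encode()) + len(name)) % TABLE_SIZE


class Vocabulary:
    """An array of buckets; each bucket is a linear list of words
    that share the same key."""

    def __init__(self):
        self.buckets = [[] for _ in range(TABLE_SIZE)]

    def define(self, name, value):
        # New definitions go to the front of their bucket's list.
        self.buckets[key(name)].insert(0, (name, value))

    def find(self, name):
        # Only the bucket with the matching key is scanned --
        # on average 1/128 of the vocabulary.
        for n, v in self.buckets[key(name)]:
            if n == name:
                return v
        return None
```

With linear-list vocabularies every failed lookup walks the whole list; here a failed lookup only walks one bucket.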
bigFORTH uses a simple method to compute the key: all letters of the
word and the COUNT byte are added together; then the remainder of the
division of this sum by the table length is taken as the key. The
algorithm is simple and nevertheless distributes the words
efficiently. (In order to examine the performance I wrote a
statistics--tool. Test result: the words are practically normally
distributed.)
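As I read it, the key computation amounts to the following (a sketch, not bigFORTH's source; the function name and `TABLE_SIZE` are mine, and I assume the COUNT byte is simply the name's length):

```python
TABLE_SIZE = 128


def hash_key(name):
    # Add all letters of the word plus the count byte (the length),
    # then keep the remainder of dividing by the table length.
    return (sum(name.encode()) + len(name)) % TABLE_SIZE
```

The result is always in the range 0..127, i.e. a valid index into the 128-entry table.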
The hash tool works entirely under the hood, so its words are rather
uninteresting. The module is called HASH and exports only its name and
all words in its vocabulary.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]