Could you please share the link? At first glance, a trie looks like a
bad choice for this task.

I'd go with the Levenshtein distance and a BK-tree.
First, implement the Levenshtein distance algorithm to calculate the edit
distance between two strings.
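
For the first step, here is a minimal sketch of the standard two-row
dynamic-programming version, assuming unit cost for insert, delete and
substitute. The function name levenshtein is just my choice.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insert, delete, substitute cost 1 each)."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep the rows short
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

It runs in O(len(a) * len(b)) time and keeps only two rows in memory.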
Second, since Levenshtein distance is a metric, we can use a metric tree
such as a BK-tree and populate it with our dictionary.
Choose a random word from the dictionary as the root, then insert the
remaining dictionary words (picked in random order) into the tree.
A node can have any number of children, but at most one per distance
value: the parent-child edge is labelled with the Levenshtein distance
between the two words, and a new word whose distance is already taken
descends into that child and repeats the step there.
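
The insertion described above could look roughly like this. It is only a
sketch: the nested-dict node layout, the names bk_insert and bk_build, and
the seed parameter are my assumptions, and edit_distance is a compact
memoized Levenshtein that is fine for short dictionary words only.

```python
import random
from functools import lru_cache


@lru_cache(maxsize=None)
def edit_distance(a: str, b: str) -> int:
    """Compact recursive Levenshtein distance, memoized."""
    if not a or not b:
        return len(a) + len(b)
    return min(
        edit_distance(a[1:], b) + 1,                    # delete
        edit_distance(a, b[1:]) + 1,                    # insert
        edit_distance(a[1:], b[1:]) + (a[0] != b[0]),   # substitute
    )


def bk_insert(root: dict, word: str) -> None:
    """Walk down from root until a free edge for this distance is found."""
    node = root
    while True:
        d = edit_distance(word, node["word"])
        if d == 0:
            return  # word is already in the tree
        child = node["children"].get(d)
        if child is None:
            node["children"][d] = {"word": word, "children": {}}
            return
        node = child  # edge d is taken: descend and try again


def bk_build(words, seed=None) -> dict:
    """Build a tree from a non-empty word list, inserting in random order."""
    words = list(words)
    if not words:
        raise ValueError("dictionary must not be empty")
    random.Random(seed).shuffle(words)
    root = {"word": words[0], "children": {}}
    for w in words[1:]:
        bk_insert(root, w)
    return root


tree = bk_build(["book", "books", "cake", "boo", "cape", "cart"], seed=1)
```

Random insertion order is what the steps above ask for; it helps avoid a
degenerate shape when the dictionary is sorted.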

Building the tree is a one-time process. Once the tree is built, we can
devise a way to serialize it and store it.
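
One possible way to serialize it, as a sketch only: if the tree is kept as
nested dicts of the form {"word": ..., "children": {distance: node}} (my
assumption, as are the names bk_dumps and bk_loads), plain JSON is enough.
JSON object keys are always strings, so the distances have to be turned
back into integers on load.

```python
import json


def bk_dumps(root: dict) -> str:
    """Serialize a nested-dict BK-tree; integer edge labels become strings."""
    return json.dumps(root)


def bk_loads(text: str) -> dict:
    """Load a tree written by bk_dumps and restore integer edge labels."""
    def restore(node: dict) -> dict:
        return {
            "word": node["word"],
            "children": {int(d): restore(c) for d, c in node["children"].items()},
        }
    return restore(json.loads(text))


# Small hand-built tree: book -1-> books -2-> boo, book -4-> cake
tree = {
    "word": "book",
    "children": {
        1: {"word": "books", "children": {2: {"word": "boo", "children": {}}}},
        4: {"word": "cake", "children": {}},
    },
}
assert bk_loads(bk_dumps(tree)) == tree
```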

Using this tree we can find all the words with edit distance less than or
equal to some k.
Let's define a method in the Tree class as: List KDTreeSearch(s, k);
which searches for all strings s' in the tree such that |s-s'| <= k, i.e.
all strings within an edit distance of k from s.
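Searching:
Start at the root and calculate the edit distance of s from the root word.
If it is, say, d, then by the triangle inequality only the children whose
edge label lies between d-k and d+k can contain words with distance <= k,
so we know exactly which children we need to descend into. Repeat the same
step at each of those children.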
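
A sketch of that search, under the assumption that nodes are nested dicts
of the form {"word": ..., "children": {distance: node}}. The name bk_search
and the small hand-built TREE are only for illustration, and edit_distance
is a compact memoized Levenshtein suited to short words.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def edit_distance(a: str, b: str) -> int:
    """Compact recursive Levenshtein distance, memoized."""
    if not a or not b:
        return len(a) + len(b)
    return min(
        edit_distance(a[1:], b) + 1,
        edit_distance(a, b[1:]) + 1,
        edit_distance(a[1:], b[1:]) + (a[0] != b[0]),
    )


def leaf(word: str) -> dict:
    return {"word": word, "children": {}}


# Hand-built BK-tree; every edge label is the distance to the parent word.
TREE = {"word": "book", "children": {
    1: {"word": "books", "children": {
        2: {"word": "boo", "children": {1: leaf("boon")}}}},
    4: {"word": "cake", "children": {1: leaf("cape"), 2: leaf("cart")}},
}}


def bk_search(root: dict, query: str, k: int) -> list:
    """Return all words in the tree within edit distance k of query."""
    matches, stack = [], [root]
    while stack:
        node = stack.pop()
        d = edit_distance(query, node["word"])
        if d <= k:
            matches.append(node["word"])
        # Triangle inequality: only edges in [d - k, d + k] can hold matches.
        for edge, child in node["children"].items():
            if d - k <= edge <= d + k:
                stack.append(child)
    return matches


print(sorted(bk_search(TREE, "boo", 1)))  # ['boo', 'book', 'boon']
```

For the query "boo" with k = 1 the whole "cake" subtree is skipped, because
its edge label 4 falls outside the range 0..2.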

Looking for typos:
Scan the document and for each word 'w' make a call:
list = KDTreeSearch(w, 0);
if list.size() == 1 // we have the word in the dictionary
else list = KDTreeSearch(w, 2); // search for all words within an edit
distance of 2 from w
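
The scanning loop might be sketched as below. To keep the example short, the
search argument is any callable with the same contract as KDTreeSearch(w, k),
and make_linear_search is a brute-force stand-in that scans the whole
dictionary instead of walking a tree. All names here are my own, and the
word-splitting regex (lowercase ASCII letters only) is an assumption.

```python
import re
from functools import lru_cache


@lru_cache(maxsize=None)
def edit_distance(a: str, b: str) -> int:
    """Compact recursive Levenshtein distance, memoized."""
    if not a or not b:
        return len(a) + len(b)
    return min(
        edit_distance(a[1:], b) + 1,
        edit_distance(a, b[1:]) + 1,
        edit_distance(a[1:], b[1:]) + (a[0] != b[0]),
    )


def make_linear_search(dictionary):
    """Stand-in for the tree search: same contract, but checks every word."""
    def search(word: str, k: int) -> list:
        return [w for w in dictionary if edit_distance(word, w) <= k]
    return search


def find_typos(text: str, search, max_distance: int = 2) -> dict:
    """Map each word missing from the dictionary to its nearby candidates."""
    suggestions = {}
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in suggestions or search(word, 0):
            continue  # already reported, or an exact dictionary hit
        suggestions[word] = search(word, max_distance)
    return suggestions


search = make_linear_search(["rest", "rent", "the", "of", "day"])
print(find_typos("The redt of the day", search))  # {'redt': ['rest', 'rent']}
```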

The returned 'list' can sometimes be large; we can then filter it by
narrowing down our definition of 'typos',
e.g. for the typo w = REDT [REST is more likely than RENT], or maybe with
some phoneme model etc. You should discuss this at length with the
interviewer.

On 27 October 2012 07:03, Raghavan <its...@gmail.com> wrote:

> By any chance did you read the new blog post by Gayle Laakmann..
>
> I guess to detect typos we can use some sort of Trie implementation..
>
>
> On Fri, Oct 26, 2012 at 7:50 PM, payal gupta <gpt.pa...@gmail.com> wrote:
>
>>
>>    Given a cube with sides length n, write code to print all possible
>> paths from the center to the surface.
>>    Thanx in advance.
>>
>>
>>    Regards,
>>   PAYAL GUPTA,
>>   NIT-B.
>>
>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Algorithm Geeks" group.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msg/algogeeks/-/ZaItRf_9A_IJ.
>>
>> To post to this group, send email to algogeeks@googlegroups.com.
>> To unsubscribe from this group, send email to
>> algogeeks+unsubscr...@googlegroups.com.
>> For more options, visit this group at
>> http://groups.google.com/group/algogeeks?hl=en.
>>
>
>
>
> --
> Thanks and Regards,
> Raghavan KL
>
>
