On Thu, Feb 4, 2010 at 1:45 PM, Robin Anil <robin.a...@gmail.com> wrote:
>
> If you have a clear plan, let's do it, or let's do the first version with just
>
> document -> analyzer -> token array -> vector
>                                    |-> ngram -> vector
>

Ted summed it up perfectly. I think this is great until we get further
along with the document work.
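
To make the quoted pipeline concrete, here is a minimal sketch in plain
Python. It is not Mahout code: analyze() is a hypothetical stand-in for a
real analyzer, and the dictionaries and sparse vectors are ordinary dicts
chosen only for illustration.

```python
from collections import Counter


def analyze(document):
    """Hypothetical analyzer: lowercase the text and split on whitespace."""
    return document.lower().split()


def ngrams(tokens, n=2):
    """Return the contiguous n-grams of a token array, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def to_vector(terms, dictionary):
    """Build a sparse {term id: count} vector, adding unseen terms to the dictionary."""
    vector = Counter()
    for term in terms:
        # setdefault assigns the next free id the first time a term is seen.
        term_id = dictionary.setdefault(term, len(dictionary))
        vector[term_id] += 1
    return dict(vector)


# document -> analyzer -> token array
tokens = analyze("the quick brown fox the quick")

# token array -> vector
unigram_dictionary = {}
unigram_vector = to_vector(tokens, unigram_dictionary)

# token array -> ngram -> vector
bigram_dictionary = {}
bigram_vector = to_vector(ngrams(tokens), bigram_dictionary)
```

Both branches start from the same token array, so the analyzer runs only
once per document.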

>
> Let's not have overlapping ids, otherwise it becomes a pain to merge. Have
> unique ids in the sequence file, and a file with the last id used?
>

Ok, I will read the partial vector/dictionary code to get my head around this.
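
As a rough illustration of the "file with last id used" idea, the sketch
below hands out ids to new terms starting after the last id recorded on
disk, so dictionaries built in separate passes never overlap. Everything
here is an assumption made for illustration: the plain-text counter file,
the function names, and the in-memory dicts stand in for whatever the
sequence-file based code actually does.

```python
import os
import tempfile


def read_last_id(path):
    """Return the last id recorded in the counter file, or -1 if none exists yet."""
    if not os.path.exists(path):
        return -1
    with open(path) as handle:
        return int(handle.read().strip())


def write_last_id(path, last_id):
    """Record the last id handed out so the next pass can continue from it."""
    with open(path, "w") as handle:
        handle.write(str(last_id))


def assign_ids(terms, known, last_id_path):
    """Give each unseen term a fresh id that continues after the recorded last id."""
    next_id = read_last_id(last_id_path) + 1
    new_entries = {}
    for term in terms:
        if term not in known and term not in new_entries:
            new_entries[term] = next_id
            next_id += 1
    write_last_id(last_id_path, next_id - 1)
    return new_entries


# Hypothetical location for the counter file.
last_id_path = os.path.join(tempfile.mkdtemp(), "last_id.txt")

first_pass = assign_ids(["a", "b"], {}, last_id_path)
second_pass = assign_ids(["b", "c", "d"], first_pass, last_id_path)

# Merging is a plain union because no id was handed out twice.
merged = {**first_pass, **second_pass}
```

A single counter file like this assumes the passes run one after another.
Concurrent writers would need some form of coordination, which this sketch
does not attempt.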
