On 10/06/2014 18:40, Ted Dunning wrote:

> On Tue, Jun 10, 2014 at 8:08 AM, Lee Goddard <lee...@gmail.com
> <mailto:lee...@gmail.com>> wrote:
>
> Is it possible to weight the individual initials as words?
>
> Would you recommend employing a stemmer?
>
>
> Yes, it is definitely possible.  But don't just use any stemmer.  You
> need to adapt something so that you preserve initial letters and
> likely use heuristics such as possibly preserving case.

Am I going to have to write a parser in Java for that, or is it a matter of combining what is in the box? I've previously created indexes of photos (with my own parser) and indexes of documents, but indexing a single company name is quite a new idea to me.

> You will also probably want to include alternative forms in other
> fields.  These would include nicknames, stock symbols and
> abbreviations.

Not in this case: it's simply an interface to find information held by the state on the affairs of a company. The only alternative forms are those of the final element of the registered company name: it might be 'Limited' but people may search for 'ltd', or it might be 'SE' but people may search for 'european'.
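
To make the suffix case concrete, here is a rough sketch in plain Java with no Lucene classes at all. The class name CompanySuffixExpander and the two-entry mapping table are made up for illustration, and only the two examples above are covered. It leaves every token except the last untouched (so initials and case survive) and expands only the final element into the forms people might type:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class CompanySuffixExpander {
    // Hypothetical table: registered final element (lower-cased) -> forms to index.
    private static final Map<String, List<String>> SUFFIX_FORMS = new LinkedHashMap<>();
    static {
        SUFFIX_FORMS.put("limited", List.of("limited", "ltd"));
        SUFFIX_FORMS.put("se", List.of("se", "european"));
    }

    /**
     * Splits the registered name on whitespace, keeps all but the last token
     * exactly as written, and replaces the last token with its alternative
     * forms if it is in the table (otherwise the last token is kept as is).
     */
    public static List<String> expand(String registeredName) {
        String[] tokens = registeredName.trim().split("\\s+");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.length - 1; i++) {
            out.add(tokens[i]); // initials and case preserved
        }
        String last = tokens[tokens.length - 1];
        String key = last.toLowerCase(Locale.ROOT); // case-insensitive lookup only
        out.addAll(SUFFIX_FORMS.getOrDefault(key, List.of(last)));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(expand("A B Smith Limited"));
        System.out.println(expand("Acme SE"));
    }
}
```

Whether something like this belongs in a custom filter or can be done with a synonym list that is already in the box is exactly what I am unsure about.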

TIA
Lee
