I peeked at the code and I still think it's not a bad idea to experiment
with extracting a facade for construction and lookup of words. There may
even be a middle ground between size and speed: if you assume a Zipfian
distribution of words, the most common ones could be stored/cached outside
of the FST (even in an associative dictionary). This would require external
frequency information during construction, but that isn't difficult to
provide.
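
To make the idea concrete, here is a minimal Java sketch, not taken from the
actual hunspell code. All names in it (WordLookup, MapBackedLookup,
FrequentWordCache) are hypothetical, the entry data is simplified to a
String, and a plain map stands in for the FST backend. It only shows the
shape: a small lookup facade, plus a wrapper that answers the most frequent
words from a hash map and delegates everything else.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical facade: callers see only lookup, not the storage behind it. */
interface WordLookup {
  /** Returns the data stored for the word, or null if the word is absent. */
  String lookup(String word);
}

/** Stand-in for the compact backend (an FST in the real code; a map here). */
final class MapBackedLookup implements WordLookup {
  private final Map<String, String> entries;

  MapBackedLookup(Map<String, String> entries) {
    this.entries = new HashMap<>(entries);
  }

  @Override
  public String lookup(String word) {
    return entries.get(word);
  }
}

/** Serves the most frequent words from a hash map, the rest from the delegate. */
final class FrequentWordCache implements WordLookup {
  private final Map<String, String> hot = new HashMap<>();
  private final WordLookup delegate;

  /**
   * @param wordsByFrequency external frequency information: words sorted
   *     from most to least frequent (assumed to be supplied by the caller)
   * @param cacheSize maximum number of words kept outside the delegate
   */
  FrequentWordCache(WordLookup delegate, List<String> wordsByFrequency, int cacheSize) {
    this.delegate = delegate;
    for (String word : wordsByFrequency) {
      if (hot.size() >= cacheSize) {
        break;
      }
      String data = delegate.lookup(word);
      if (data != null) { // skip words the dictionary does not contain
        hot.put(word, data);
      }
    }
  }

  @Override
  public String lookup(String word) {
    String cached = hot.get(word);
    return cached != null ? cached : delegate.lookup(word);
  }

  int cachedWordCount() {
    return hot.size();
  }
}

final class WordLookupDemo {
  public static void main(String[] args) {
    Map<String, String> dict = new HashMap<>();
    dict.put("the", "flags-1");
    dict.put("of", "flags-2");
    dict.put("zymurgy", "flags-3");

    WordLookup lookup =
        new FrequentWordCache(new MapBackedLookup(dict), List.of("the", "of", "zymurgy"), 2);
    System.out.println(lookup.lookup("the"));     // answered from the cache
    System.out.println(lookup.lookup("zymurgy")); // answered by the delegate
  }
}
```

Because both classes implement the same interface, the cache size becomes the
knob between memory and speed, and a different backend could be swapped in
without touching the callers.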

D.

On Thu, Feb 11, 2021 at 8:54 AM Dawid Weiss <[email protected]> wrote:

>
> I didn't mean for Peter to write both backends but perhaps, if he's
> experimenting already anyway, make it possible to extract an interface
> which could be substituted externally with different implementations. Makes
> it easier to tinker with various options, even for us.
>
> D.
>
> On Thu, Feb 11, 2021 at 1:16 AM Robert Muir <[email protected]> wrote:
>
>> On Wed, Feb 10, 2021 at 3:05 PM Dawid Weiss <[email protected]>
>> wrote:
>> > Maybe the "backend" could be configurable somehow so that you could
>> > change the strategy depending on your needs? I haven't looked at how
>> > FSTs are used, but if they can be hidden behind a facade then an
>> > alternative implementation could be provided depending on one's needs.
>> >
>> > D.
>> >
>>
>> I don't have any confidence that solr would default to the "smaller"
>> option or fix how they manage different solr cores or thousands of
>> threads or any of the analyzer issues. And who would maintain this
>> separate hunspell backend? I don't think it is fair to Peter to have
>> to cope with 2 implementations of hunspell, 1 is certainly enough...
>> :). It's all Apache-licensed; at the end of the day, if someone wants to
>> step up, let 'em. Otherwise let's get out of their way.
>>
