It is (b).

D.

On Fri, Aug 7, 2015 at 3:05 AM, Trejkaz <[email protected]> wrote:
> I have recently done updates from Lucene 3.6 to 4.x and 4.x to 5.2.
>
> During this process, I noticed that the FST used by the Japanese
> analyser (AKA Kuromoji) was changing between releases. As I fear
> breakages in backwards compatibility, I worried that the dictionary
> had changed, so I wrote a little program to read it in and print the
> words out in order.
>
> What I find is that in all three releases, the list of words is
> exactly the same - even though the files have changed subtly from
> release to release.
>
> What's up with that? I can think of a few possibilities:
>
> (a) the dictionary _has_ actually changed, and merely printing the
> list of words was not enough (e.g., the parts of speech changed)
>
> (b) the dictionary hasn't changed, but the files change when the FST
> format changes
>
> (c) the dictionary hasn't changed, but the files change because
> they're built on demand every time Lucene is built and there is
> something non-deterministic about the process (e.g. something is using
> a HashMap internally.)
>
> I'm hoping that it's (b), but does anybody know?
>
> TX
