Wait, lengthNorm is already abstract, left to whoever implements TfIdfSimilarity to return a float. So we're already letting people mess with length norm.
All I want to remove is the enforcement of a single-byte encoding, to let you encode a full float, int or even long, as it's a NumericDV under the covers. You then decode it back to a float. So lengthNorm will still be float, I don't propose to change it. Only how this float is encoded.

Perhaps a patch would clarify it?

Shai

On Tue, Jun 25, 2013 at 4:34 PM, Robert Muir <[email protected]> wrote:

> On Tue, Jun 25, 2013 at 9:20 AM, Shai Erera <[email protected]> wrote:
>
>> Right now, I need to copy most of the "tf-idf" code into my Sim, and I
>> don't think that's good software engineering. How many people really extend
>> Tf-Idf that the API can get complicated?
>
> And we have to do the same thing if we want to modify MmapDirectory to
> call mappedbytebuffer.load, but everyone is ok with it there!
>
> I just mentioned my concern: if we can come up with a patch that isn't
> trappy, I'll be happy. but past experience shows fucking around with
> lengthnorm has caused confusion, non-obvious back compat breaks, bugs,
> etc... hence my concerns.
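To illustrate the encoding idea from the reply above (keep lengthNorm a float, but store it at full precision in a numeric value instead of a single byte), here is a minimal sketch in plain Java. It is not the proposed patch and does not use any Lucene API: the class NormCodec and its encode/decode methods are hypothetical names made up for this example, and the only assumption is that the stored value is a long.

```java
// Hypothetical helper, NOT part of Lucene: stores a float norm losslessly
// in a long, the kind of value a numeric doc-values field can hold.
final class NormCodec {
    private NormCodec() {}

    /** Packs the IEEE 754 bits of the float into the low 32 bits of a long. */
    static long encode(float norm) {
        // Mask so that negative bit patterns are not sign-extended.
        return Float.floatToIntBits(norm) & 0xFFFFFFFFL;
    }

    /** Recovers the original float from the low 32 bits of the stored value. */
    static float decode(long stored) {
        return Float.intBitsToFloat((int) stored);
    }

    public static void main(String[] args) {
        // Example norm: 1/sqrt(numTerms) for an assumed 7-term field.
        float norm = (float) (1.0 / Math.sqrt(7));
        long stored = encode(norm);
        System.out.println(decode(stored) == norm); // prints true
    }
}
```

The point of the sketch is only that the float survives the round trip unchanged, whereas a single-byte encoding has to quantize it to at most 256 distinct values.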
