In TFIDFPartialVectorReducer.java: if docFreq > maxDocFreq, the vector element at that index is not set (the term is ignored). If docFreq < minDocFreq, the element at that index is set to the TfIdf value calculated with minDocFreq in place of the actual document frequency.
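
To make the asymmetry concrete, here is a minimal standalone sketch of the behavior as I read it. It is not the Mahout source: the class name DfClampSketch, the method names, the plain map standing in for a sparse vector, and the simplified tf * log(numDocs / docFreq) weight are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DfClampSketch {

    // Simplified TF-IDF weight. This formula is an assumption for the sketch,
    // not the weighting the library actually uses.
    static double tfIdf(int tf, int docFreq, int numDocs) {
        return tf * Math.log((double) numDocs / docFreq);
    }

    // termFreqs and docFreqs are keyed by term index; the returned map plays
    // the role of the sparse output vector (hypothetical stand-in).
    static Map<Integer, Double> weigh(Map<Integer, Integer> termFreqs,
                                      Map<Integer, Integer> docFreqs,
                                      int minDocFreq, int maxDocFreq, int numDocs) {
        Map<Integer, Double> out = new LinkedHashMap<>();
        for (Map.Entry<Integer, Integer> e : termFreqs.entrySet()) {
            int index = e.getKey();
            int docFreq = docFreqs.get(index);
            if (docFreq > maxDocFreq) {
                continue; // too frequent: the element is never set
            }
            if (docFreq < minDocFreq) {
                docFreq = minDocFreq; // too rare: clamped, but the element is still set
            }
            out.put(index, tfIdf(e.getValue(), docFreq, numDocs));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> tf = new LinkedHashMap<>();
        tf.put(0, 3);   // rare term
        tf.put(1, 2);   // term inside both thresholds
        tf.put(2, 5);   // very common term
        Map<Integer, Integer> df = new LinkedHashMap<>();
        df.put(0, 1);
        df.put(1, 10);
        df.put(2, 90);
        // With minDocFreq = 5 and maxDocFreq = 50, index 2 is dropped while
        // index 0 is kept and weighted as if its document frequency were 5.
        System.out.println(weigh(tf, df, 5, 50, 100));
    }
}
```

In this sketch the rare term at index 0 still gets a nonzero weight, only a smaller one than its real document frequency would give, while the common term at index 2 gets no entry at all.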
Should minDocFreq not be treated the same as maxDocFreq by skipping setting the vector at that index? In both cases, the vector length remains the same and these settings have no effect on pruning the vector length / term reduction?
