I am seeing unexpected results when I modify my moreLikeThisConfiguration
settings.

Each setting seems intuitive on its own, but the behavior surprises me when
I change the values.

If I set minTermFrequency to 2, I get some reasonable related items as well
as many irrelevant results.

If I set minTermFrequency to 3, I get no related items for most of my use
cases. I have counted the keywords in my title, subject, etc., and I believe
more results should be returned. If the system were behaving as I expect, I
would like to set this value even higher.

Has anyone else experienced this issue?  Can you recommend a solution?

Thanks, Terry

<property name="moreLikeThisConfiguration">
    <bean class="org.dspace.discovery.configuration.DiscoveryMoreLikeThisConfiguration">
        <!--When altering this list also alter the "xmlui.Discovery.RelatedItems.help"
            key as it describes the metadata fields below-->
        <property name="similarityMetadataFields">
            <list>
                <value>dc.title</value>
                <value>dc.contributor.author</value>
                <value>dc.creator</value>
                <value>dc.subject</value>
            </list>
        </property>
        <!--The minimum number of matching terms across the metadata fields
            above before an item is found as related -->
        <property name="minTermFrequency" value="2"/>
        <!--The maximum number of related items displayed-->
        <property name="max" value="3"/>
        <!--The minimum word length below which words will be ignored-->
        <property name="minWordLength" value="4"/>
    </bean>
</property>


-- 
Terry Brady
Applications Programmer Analyst
Lauinger Information Technology
202-687-7053
_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette