[
https://issues.apache.org/jira/browse/SOLR-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044971#comment-13044971
]
Steven Rowe commented on SOLR-1844:
-----------------------------------
Hi David,
The link in the description is dead. This page mentions the new400common.txt
file: http://www.hathitrust.org/node/181 but I'm not sure it's what you were
after.
Looks like this is the sample you're talking about:
http://www.hathitrust.org/blogs/large-scale-search/common-word-list-commongrams
- I can see the comma-delimited values there.
Would you care to make a patch?
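
For reference, here is a minimal sketch of the kind of parsing the issue asks
for. It is only an illustration, not the factory's actual code: the class name
WordListSketch and the parse method are hypothetical, and treating '#' as a
comment marker is an assumption. It accepts either one word per line or
several comma-separated words per line.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not part of Solr) that parses word-list lines. */
class WordListSketch {

    /**
     * Collects words from lines holding either a single word or several
     * comma-separated words. Blank lines and lines starting with '#' are
     * skipped; the '#' comment marker is an assumption of this sketch.
     */
    static Set<String> parse(List<String> lines) {
        Set<String> words = new LinkedHashSet<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            for (String token : trimmed.split(",")) {
                String word = token.trim();
                if (!word.isEmpty()) {
                    words.add(word); // insertion order is preserved
                }
            }
        }
        return words;
    }

    public static void main(String[] args) {
        List<String> lines =
            Arrays.asList("# common words", "the, of, and", "", "to");
        System.out.println(parse(lines)); // prints [the, of, and, to]
    }
}
```

A file with one word per line still parses the same way under this approach,
so existing word lists would not need to change.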
> CommonGramsQueryFilterFactory should read words in a comma-delimited format
> ---------------------------------------------------------------------------
>
> Key: SOLR-1844
> URL: https://issues.apache.org/jira/browse/SOLR-1844
> Project: Solr
> Issue Type: Improvement
> Components: Schema and Analysis
> Affects Versions: 1.4
> Reporter: David Smiley
> Priority: Minor
>
> CommonGramsQueryFilterFactory expects the file(s) given to the "words"
> argument to be a newline-delimited list of words. It doesn't support
> comments either. This file format should be more flexible to support
> comma-delimited values. I came across this because I was trying to use the
> sample file provided by HathiTrust:
> http://www.hathitrust.org/node/180 (in a file named new400common.txt)