[jira] [Commented] (SOLR-1844) CommonGramsQueryFilterFactory should read words in a comma-delimited format

2011-06-06 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044971#comment-13044971
 ] 

Steven Rowe commented on SOLR-1844:
---

Hi David,

The link in the description is dead. This page mentions the new400common.txt 
file: http://www.hathitrust.org/node/181 but I'm not sure it's what you were 
after.

Looks like this is the sample you're talking about: 
http://www.hathitrust.org/blogs/large-scale-search/common-word-list-commongrams 
- I can see the comma-delimited values there.

Would you care to make a patch?

 CommonGramsQueryFilterFactory should read words in a comma-delimited format
 ---

 Key: SOLR-1844
 URL: https://issues.apache.org/jira/browse/SOLR-1844
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Affects Versions: 1.4
Reporter: David Smiley
Priority: Minor

 CommonGramsQueryFilterFactory expects each file given to the words 
 argument to be a carriage-return delimited list of words.  It doesn't support 
 comments either.  The file format should be more flexible and also accept 
 comma-delimited values.  I came across this because I was trying to use the 
 sample file provided by HathiTrust (a file named new400common.txt):
 http://www.hathitrust.org/node/180

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1844) CommonGramsQueryFilterFactory should read words in a comma-delimited format

2011-06-06 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044988#comment-13044988
 ] 

David Smiley commented on SOLR-1844:


On second thought, I think the current behavior is fine. It's consistent with 
the other filters that need lists of words, since they all share the same code 
to load them -- BaseTokenStreamFactory.getWordSet(...). If any change should 
happen, it should happen there. I'm fine with this issue being closed as 
Won't Fix.  It was easy enough for me to simply replace the commas in Hathi's 
file with carriage returns.
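
For anyone who wants to script that workaround, here is a minimal sketch in 
Python. It is not part of Solr. The function names and the file names in the 
usage note are hypothetical, and it assumes that lines starting with "#" are 
comments and should pass through unchanged.

```python
from pathlib import Path


def commas_to_lines(text: str) -> str:
    """Turn comma-delimited words into a one-word-per-line list."""
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            # Drop blank lines.
            continue
        if stripped.startswith("#"):
            # Assumption: '#' marks a comment line; keep it as-is.
            out.append(stripped)
            continue
        # Split on commas and drop empty fragments (e.g. from a trailing comma).
        out.extend(w.strip() for w in stripped.split(",") if w.strip())
    return "".join(w + "\n" for w in out)


def convert_file(src: str, dst: str) -> None:
    """Read a comma-delimited word file and write a line-delimited copy."""
    text = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(commas_to_lines(text), encoding="utf-8")
```

Something like convert_file("new400common.txt", "commongrams.txt") would then 
produce a file that can be passed to the words argument.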




[jira] Commented: (SOLR-1844) CommonGramsQueryFilterFactory should read words in a comma-delimited format

2010-03-24 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849571#action_12849571
 ] 

David Smiley commented on SOLR-1844:


It _does_ support comments; sorry.
