On 1 Nov 2007, at 17:18, Grant Ingersoll (JIRA) wrote:
http://people.apache.org/maven-snapshot-repository/org/apache/lucene/
love++
[ https://issues.apache.org/jira/browse/LUCENE-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540518 ]
Karl Wettin commented on LUCENE-1016:
-
I think this is interesting:
http://www.nabble.com/How-to-generate-TermF
[ https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-1044:
---
Attachment: LUCENE-1044.take3.patch
Attached another rev of the patch.
I changed th
[ https://issues.apache.org/jira/browse/LUCENE-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kyle Maxwell closed LUCENE-1019.
Resolution: Invalid
Lucene Fields: (was: [Patch Available, New])
Ok, I'm satisfied with D
Hi All,
We are experiencing OOMs when binary data contained in text files
(e.g., a base64 section of a text file) is indexed. We have extensive
recognition of file types, but we have encountered binary sections inside
otherwise normal text files.
We are using the default value of 128 for te
I think the binary section recognizer is probably your best bet.
If you write an analyzer that ignores terms that consist only of
hexadecimal digits, or that contain embedded digits, you will probably
reduce the pollution quite a bit; it is trivial to write and not
too expensive to check.
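A minimal sketch of that heuristic in plain Java (not tied to any particular Lucene Analyzer/TokenFilter API; the class name, method names, and the length threshold on the all-hex rule are my own assumptions):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Heuristic noise-term detector: drops terms that are entirely hex
// digits, or that mix letters and digits ("embedded digits") -- a
// common signature of base64/binary garbage leaking into text.
// All names here are hypothetical, not from Lucene.
public class NoiseTermFilter {

    // True if the term is long and every character is a hex digit.
    // The length threshold (8, an assumption) spares short English
    // words that happen to be all-hex letters, like "face" or "bead".
    static boolean isAllHex(String term) {
        return term.length() >= 8
                && term.chars().allMatch(c -> Character.digit(c, 16) != -1);
    }

    // True if the term mixes at least one digit with at least one letter.
    static boolean hasEmbeddedDigit(String term) {
        boolean digit  = term.chars().anyMatch(Character::isDigit);
        boolean letter = term.chars().anyMatch(Character::isLetter);
        return digit && letter;
    }

    static boolean isNoise(String term) {
        return isAllHex(term) || hasEmbeddedDigit(term);
    }

    // Keep only terms the heuristic accepts; in a real analyzer this
    // check would sit inside a token filter's incrementToken loop.
    static List<String> filter(List<String> terms) {
        return terms.stream()
                .filter(t -> !isNoise(t))
                .collect(Collectors.toList());
    }
}
```

For example, "deadbeef" (all hex) and "ab3cd" (embedded digit) would be dropped, while "lucene" survives. The embedded-digit rule will also drop legitimate terms like part numbers, so whether it is acceptable depends on the corpus.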