[ https://issues.apache.org/jira/browse/LUCENE-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980156#action_12980156 ]
Grant Ingersoll commented on LUCENE-2846:
-----------------------------------------

I get the comparison to omitTF, but this functionality has been around a long time. Why does it have to be all or nothing? Couldn't we investigate a sparse data structure to be used instead? We would use the current dense approach when a high percentage of documents contain norms, and the sparse one when fewer do. I'm not sure what that data structure is just yet, but over in Mahout we have sparse and dense vectors, and we have primitive collections that could be useful.

> omitTF is viral, but omitNorms is anti-viral.
> ---------------------------------------------
>
>                 Key: LUCENE-2846
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2846
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>             Fix For: 4.0
>
>         Attachments: LUCENE-2846.patch, LUCENE-2846.patch, LUCENE-2846.patch,
> LUCENE-2846.patch
>
>
> omitTF is viral. if you add document 1 with field "foo" as omitTF, then
> document 2 has field "foo" without omitTF, they are both treated as omitTF.
> but omitNorms is the opposite. if you have a million documents with field
> "foo" with omitNorms, then you add just one document without omitting norms,
> now you suddenly have a million 'real norms'.
> I think it would be good for omitNorms to be viral too, just for consistency,
> and also to prevent huge byte[]'s.
> but another option is to make omitTF anti-viral, which is more "schemaless" i
> guess.
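
To make the dense-versus-sparse suggestion from the comment concrete, here is a minimal sketch of how the choice could look. It is purely illustrative: `NormsSketch`, `Norms`, `DenseNorms`, `SparseNorms`, `choose`, `DEFAULT_NORM` and the 0.2 density threshold are all hypothetical names and values made up for this example, not Lucene or Mahout APIs, and the sparse layout (sorted doc ids plus binary search) is just one possible structure.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch only; none of these types exist in Lucene.
final class NormsSketch {

    // Assumed placeholder value returned for documents that stored no norm.
    static final byte DEFAULT_NORM = 0;

    interface Norms {
        byte get(int docId);
        long approxBytes(); // rough payload size, ignoring object overhead
    }

    // Dense layout: one byte per document, whether or not it has a norm.
    static final class DenseNorms implements Norms {
        private final byte[] values;

        DenseNorms(int maxDoc, SortedMap<Integer, Byte> stored) {
            values = new byte[maxDoc];
            Arrays.fill(values, DEFAULT_NORM);
            for (Map.Entry<Integer, Byte> e : stored.entrySet()) {
                values[e.getKey()] = e.getValue();
            }
        }

        @Override public byte get(int docId) { return values[docId]; }
        @Override public long approxBytes() { return values.length; }
    }

    // Sparse layout: sorted doc ids with a parallel array of norm bytes.
    static final class SparseNorms implements Norms {
        private final int[] docIds;
        private final byte[] values;

        SparseNorms(SortedMap<Integer, Byte> stored) {
            docIds = new int[stored.size()];
            values = new byte[stored.size()];
            int i = 0;
            for (Map.Entry<Integer, Byte> e : stored.entrySet()) {
                docIds[i] = e.getKey();
                values[i] = e.getValue();
                i++;
            }
        }

        @Override public byte get(int docId) {
            int idx = Arrays.binarySearch(docIds, docId);
            return idx >= 0 ? values[idx] : DEFAULT_NORM;
        }

        // 4 bytes per doc id plus 1 byte per norm.
        @Override public long approxBytes() { return 5L * docIds.length; }
    }

    // Pick dense when the fraction of documents with norms reaches the threshold.
    static Norms choose(int maxDoc, SortedMap<Integer, Byte> stored, double denseThreshold) {
        double density = maxDoc == 0 ? 0.0 : stored.size() / (double) maxDoc;
        return density >= denseThreshold ? new DenseNorms(maxDoc, stored) : new SparseNorms(stored);
    }

    public static void main(String[] args) {
        // The scenario from the issue: a million documents, only one with norms.
        SortedMap<Integer, Byte> stored = new TreeMap<>();
        stored.put(42, (byte) 7);
        Norms norms = choose(1_000_000, stored, 0.2);
        System.out.println(norms.getClass().getSimpleName() + " uses ~" + norms.approxBytes() + " bytes");
    }
}
```

With this layout a sparse entry costs roughly five bytes against one byte per document for the dense array, so the break-even point sits near 20% density. That is the only reason 0.2 was picked here; a real implementation would need to weigh lookup cost as well.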