[jira] [Commented] (SOLR-2584) Add a parameter in UIMAUpdateRequestProcessor to avoid duplicated values on insert
[ https://issues.apache.org/jira/browse/SOLR-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067840#comment-13067840 ]

Elmer Garduno commented on SOLR-2584:
-------------------------------------

Koji, I followed your approach and implemented it using an UpdateRequestProcessor. I'm submitting the patch for branch 3x.

Add a parameter in UIMAUpdateRequestProcessor to avoid duplicated values on insert
----------------------------------------------------------------------------------

Key: SOLR-2584
URL: https://issues.apache.org/jira/browse/SOLR-2584
Project: Solr
Issue Type: Improvement
Affects Versions: 3.3, 4.0
Reporter: Elmer Garduno
Priority: Minor
Labels: uima

Hi folks,

I think that UIMAUpdateRequestProcessor should have a parameter to avoid duplicate values in the updated field.

A typical use case: if you are using DictionaryAnnotator and a term matches more than once, it will be added twice to the mapped field. I think we should add a parameter to avoid inserting duplicates, since we are not preserving information on the position of the annotation.

What do you think about it? I've already implemented this for branch 3x; I'm writing some tests and will submit a patch.

Regards

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2584) Add a parameter in UIMAUpdateRequestProcessor to avoid duplicated values on insert
[ https://issues.apache.org/jira/browse/SOLR-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068150#comment-13068150 ]

Elmer Garduno commented on SOLR-2584:
-------------------------------------

Thanks, Koji.

Add a parameter in UIMAUpdateRequestProcessor to avoid duplicated values on insert
----------------------------------------------------------------------------------

Key: SOLR-2584
URL: https://issues.apache.org/jira/browse/SOLR-2584
Project: Solr
Issue Type: Improvement
Affects Versions: 1.4.1, 3.3, 4.0
Reporter: Elmer Garduno
Assignee: Koji Sekiguchi
Priority: Minor
Labels: uima
Fix For: 3.4, 4.0
Attachments: SOLR-2584.patch, SOLR-2584.patch, SOLR-2584.patch
[jira] [Commented] (SOLR-2584) Add a parameter in UIMAUpdateRequestProcessor to avoid duplicated values on insert
[ https://issues.apache.org/jira/browse/SOLR-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047793#comment-13047793 ]

Koji Sekiguchi commented on SOLR-2584:
--------------------------------------

Or we can implement the function in a new update processor and place it after the UIMA update processor in the chain. Either way, I would like to have this function.

Add a parameter in UIMAUpdateRequestProcessor to avoid duplicated values on insert
----------------------------------------------------------------------------------

Key: SOLR-2584
URL: https://issues.apache.org/jira/browse/SOLR-2584
Project: Solr
Issue Type: Improvement
Affects Versions: 3.3, 4.0
Reporter: Elmer Garduno
Priority: Minor
Labels: uima
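
The approach suggested in the comment above (a separate processing step that runs after the UIMA step and removes repeated values from selected fields) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the attached SOLR-2584.patch and not the Solr API: a document is modeled as a plain map from field name to value list, and the names DedupSketch, dedupe and dedupeFields are hypothetical. It keeps the first occurrence of each value and preserves order, which seems reasonable given that annotation positions are not kept anyway.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not the SOLR-2584 patch and not the Solr API.
class DedupSketch {

    // Returns the values with duplicates removed, keeping first-seen order.
    static List<Object> dedupe(Collection<?> values) {
        return new ArrayList<Object>(new LinkedHashSet<Object>(values));
    }

    // De-duplicates only the named fields; all other fields are copied unchanged.
    static Map<String, List<Object>> dedupeFields(Map<String, List<Object>> doc,
                                                  Collection<String> fields) {
        Map<String, List<Object>> out = new LinkedHashMap<String, List<Object>>();
        for (Map.Entry<String, List<Object>> e : doc.entrySet()) {
            List<Object> values = e.getValue();
            out.put(e.getKey(),
                    fields.contains(e.getKey()) ? dedupe(values) : new ArrayList<Object>(values));
        }
        return out;
    }

    public static void main(String[] args) {
        // "entity" stands in for a field mapped from an annotator; the name is made up.
        Map<String, List<Object>> doc = new LinkedHashMap<String, List<Object>>();
        doc.put("id", Arrays.<Object>asList("1"));
        doc.put("entity", Arrays.<Object>asList("Solr", "UIMA", "Solr"));
        System.out.println(dedupeFields(doc, Arrays.asList("entity")));
    }
}
```

Restricting the step to an explicit list of fields leaves fields where repeated values are meaningful untouched.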