[jira] [Commented] (LUCENE-7630) EdgeNGramTokenFilter drops payloads
[ https://issues.apache.org/jira/browse/LUCENE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823735#comment-15823735 ]

ASF subversion and git services commented on LUCENE-7630:
---------------------------------------------------------

Commit a69c632aa54d064515152145bcbcbe1e869d7061 in lucene-solr's branch refs/heads/branch_6x from [~thetaphi]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a69c632 ]

LUCENE-7630: Fix (Edge)NGramTokenFilter to no longer drop payloads and preserve all attributes
[merge branch 'edgepayloads' from Nathan Gass https://github.com/xabbu42/lucene-solr]

Signed-off-by: Uwe Schindler

> EdgeNGramTokenFilter drops payloads
> -----------------------------------
>
> Key: LUCENE-7630
> URL: https://issues.apache.org/jira/browse/LUCENE-7630
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/analysis
> Affects Versions: master (7.0)
> Reporter: Nathan Gass
> Assignee: Uwe Schindler
> Priority: Minor
> Fix For: master (7.0), 6.5
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> Using an EdgeNGramTokenFilter after a DelimitedPayloadTokenFilter discards
> the payloads, whereas most other filters copy the payload to the new tokens.
> I added a test for this issue and a possible fix at
> https://github.com/xabbu42/lucene-solr/tree/edgepayloads
> Greetings
> Nathan Gass

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7630) EdgeNGramTokenFilter drops payloads
[ https://issues.apache.org/jira/browse/LUCENE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823722#comment-15823722 ]

ASF GitHub Bot commented on LUCENE-7630:
----------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/lucene-solr/pull/138
[jira] [Commented] (LUCENE-7630) EdgeNGramTokenFilter drops payloads
[ https://issues.apache.org/jira/browse/LUCENE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823721#comment-15823721 ]

ASF subversion and git services commented on LUCENE-7630:
---------------------------------------------------------

Commit c64a01158e972176256e257d6c1d4629b05783a2 in lucene-solr's branch refs/heads/master from [~thetaphi]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c64a011 ]

LUCENE-7630: Fix (Edge)NGramTokenFilter to no longer drop payloads and preserve all attributes
[merge branch 'edgepayloads' from Nathan Gass https://github.com/xabbu42/lucene-solr]

Signed-off-by: Uwe Schindler
[jira] [Commented] (LUCENE-7630) EdgeNGramTokenFilter drops payloads
[ https://issues.apache.org/jira/browse/LUCENE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823638#comment-15823638 ]

Uwe Schindler commented on LUCENE-7630:
---------------------------------------

Thanks, I will merge and commit this after some testing!
[jira] [Commented] (LUCENE-7630) EdgeNGramTokenFilter drops payloads
[ https://issues.apache.org/jira/browse/LUCENE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821972#comment-15821972 ]

Nathan Gass commented on LUCENE-7630:
-------------------------------------

done
[jira] [Commented] (LUCENE-7630) EdgeNGramTokenFilter drops payloads
[ https://issues.apache.org/jira/browse/LUCENE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821894#comment-15821894 ]

Uwe Schindler commented on LUCENE-7630:
---------------------------------------

bq. The NGramTokenFilter probably has the same issue. I can port the fix to that class when everything is correct.

Please do! You can update the current PR. Otherwise, the PR looks fine.
[jira] [Commented] (LUCENE-7630) EdgeNGramTokenFilter drops payloads
[ https://issues.apache.org/jira/browse/LUCENE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821893#comment-15821893 ]

Nathan Gass commented on LUCENE-7630:
-------------------------------------

I committed the suggested improvements and opened a pull request: https://github.com/apache/lucene-solr/pull/138.

The NGramTokenFilter probably has the same issue. I can port the fix to that class once everything in this PR is correct.
[jira] [Commented] (LUCENE-7630) EdgeNGramTokenFilter drops payloads
[ https://issues.apache.org/jira/browse/LUCENE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821888#comment-15821888 ]

ASF GitHub Bot commented on LUCENE-7630:
----------------------------------------

GitHub user xabbu42 opened a pull request:

    https://github.com/apache/lucene-solr/pull/138

    EdgeNGramTokenFilter drops payloads

    Test and fix for https://issues.apache.org/jira/browse/LUCENE-7630.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xabbu42/lucene-solr edgepayloads

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucene-solr/pull/138.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #138

commit 61e45283061ae486acc5882c5a770025c8291222
Author: Nathan Gass
Date: 2017-01-09T13:59:31Z

    add test that EdgeNGram filter keeps payloads

commit 6570e6ecc2b14a28da9873948083791ba47145d0
Author: Nathan Gass
Date: 2017-01-09T14:00:21Z

    copy all attributes including payload to new tokens

commit 01f2a87c67392a86b533d0c76ba7666845d1945f
Author: Nathan Gass
Date: 2017-01-13T14:54:07Z

    use captureState and restoreState instead of cloneAttributes
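For readers without a Lucene checkout, the behaviour that the first commit tests can be modeled in plain Java. This is only a sketch under stated assumptions: the class and method names (DelimitedPayloadSketch, splitPayload, edgeGramsWithPayload) are hypothetical, payloads are plain strings rather than encoded bytes, and the sketch splits at the first delimiter. It is not the test from the pull request.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: hypothetical names, no Lucene classes involved. */
final class DelimitedPayloadSketch {

    /**
     * Splits "term|payload" at the first delimiter (an assumption of this sketch).
     * Returns {term, payload}; payload is null when no delimiter is present.
     */
    static String[] splitPayload(String raw, char delimiter) {
        int idx = raw.indexOf(delimiter);
        if (idx < 0) {
            return new String[] { raw, null };
        }
        return new String[] { raw.substring(0, idx), raw.substring(idx + 1) };
    }

    /**
     * Produces the edge n-grams of the term and copies the payload to each gram,
     * which is the behaviour the issue asks for. Grams are rendered as "gram|payload".
     */
    static List<String> edgeGramsWithPayload(String raw, char delimiter, int minGram, int maxGram) {
        String[] parts = splitPayload(raw, delimiter);
        String term = parts[0];
        String payload = parts[1];
        List<String> out = new ArrayList<>();
        int limit = Math.min(maxGram, term.length());
        for (int len = minGram; len <= limit; len++) {
            String gram = term.substring(0, len);
            out.add(payload == null ? gram : gram + delimiter + payload);
        }
        return out;
    }

    public static void main(String[] args) {
        // Every gram of "foo" should still carry the payload "1.5".
        System.out.println(edgeGramsWithPayload("foo|1.5", '|', 1, 3));
    }
}
```

A filter that drops payloads would correspond to emitting the bare grams here, which is the regression the added test is meant to catch.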
[jira] [Commented] (LUCENE-7630) EdgeNGramTokenFilter drops payloads
[ https://issues.apache.org/jira/browse/LUCENE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821733#comment-15821733 ]

Uwe Schindler commented on LUCENE-7630:
---------------------------------------

Hi,

could you create a Pull Request and add the link here?

About your branch: I would not use cloneAttributes() because that's slow for this simple case. cloneAttributes() only helps if you want to modify the attributes in the AttributeSource that was created, but it is not useful for simple save/restore use cases. For your case, you should simply use captureState(), save the state object, and then call restoreState() instead of clearAttributes(). After restoring, you can adapt the term text and positions/offsets.

In addition, when you clone or capture state, the call to clearAttributes() is useless and also slows things down. When restoring a state, everything is restored, so the additional clearing beforehand is not needed.
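The difference between clearing and restoring can be illustrated with a small, Lucene-free model. This is a sketch under assumptions, not the committed fix: Token, gramsFromClearedState, and gramsFromSavedState are hypothetical names, an immutable Token object stands in for a captured attribute state, and the payload is a plain string. In a real filter the roles would be played by captureState() and restoreState() as described in the comment above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: these types are hypothetical and are not Lucene classes. */
final class EdgeGramStateSketch {

    /** Stand-in for the full attribute state of one token. */
    static final class Token {
        final String term;
        final int startOffset;
        final int endOffset;
        final String payload; // null means "no payload"

        Token(String term, int startOffset, int endOffset, String payload) {
            this.term = term;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
            this.payload = payload;
        }
    }

    /** Models the bug: each gram starts from a cleared state, so only term and offsets are set. */
    static List<Token> gramsFromClearedState(Token input, int minGram, int maxGram) {
        List<Token> out = new ArrayList<>();
        int limit = Math.min(maxGram, input.term.length());
        for (int len = minGram; len <= limit; len++) {
            // The payload is never copied, so it is lost on every gram.
            out.add(new Token(input.term.substring(0, len), input.startOffset, input.endOffset, null));
        }
        return out;
    }

    /** Models the fix: save the input state once, restore it per gram, then change only the term. */
    static List<Token> gramsFromSavedState(Token input, int minGram, int maxGram) {
        Token saved = input; // Token is immutable, so keeping the reference acts as the saved state
        List<Token> out = new ArrayList<>();
        int limit = Math.min(maxGram, saved.term.length());
        for (int len = minGram; len <= limit; len++) {
            // Everything except the term text is taken over from the saved state.
            out.add(new Token(saved.term.substring(0, len), saved.startOffset, saved.endOffset, saved.payload));
        }
        return out;
    }

    public static void main(String[] args) {
        Token input = new Token("lucene", 0, 6, "boost=2");
        for (Token t : gramsFromSavedState(input, 1, 3)) {
            System.out.println(t.term + " payload=" + t.payload);
        }
    }
}
```

The point of the second variant is that nothing has to be enumerated by hand: any attribute present on the input token, including ones the filter does not know about, survives because the whole state is carried over.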