[jira] [Commented] (LUCENE-7630) EdgeNGramTokenFilter drops payloads

2017-01-16 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823735#comment-15823735
 ] 

ASF subversion and git services commented on LUCENE-7630:
-

Commit a69c632aa54d064515152145bcbcbe1e869d7061 in lucene-solr's branch 
refs/heads/branch_6x from [~thetaphi]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a69c632 ]

LUCENE-7630: Fix (Edge)NGramTokenFilter to no longer drop payloads and preserve 
all attributes
[merge branch 'edgepayloads' from Nathan Gass 
https://github.com/xabbu42/lucene-solr]

Signed-off-by: Uwe Schindler 


> EdgeNGramTokenFilter drops payloads
> ---
>
> Key: LUCENE-7630
> URL: https://issues.apache.org/jira/browse/LUCENE-7630
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/analysis
>Affects Versions: master (7.0)
>Reporter: Nathan Gass
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: master (7.0), 6.5
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Using an EdgeNGramTokenFilter after a DelimitedPayloadTokenFilter discards 
> the payloads, whereas most other filters copy the payload to the new tokens.
> I added a test for this issue and a possible fix at 
> https://github.com/xabbu42/lucene-solr/tree/edgepayloads
> Greetings
> Nathan Gass



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7630) EdgeNGramTokenFilter drops payloads

2017-01-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823722#comment-15823722
 ] 

ASF GitHub Bot commented on LUCENE-7630:


Github user asfgit closed the pull request at:

https://github.com/apache/lucene-solr/pull/138





[jira] [Commented] (LUCENE-7630) EdgeNGramTokenFilter drops payloads

2017-01-16 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823721#comment-15823721
 ] 

ASF subversion and git services commented on LUCENE-7630:
-

Commit c64a01158e972176256e257d6c1d4629b05783a2 in lucene-solr's branch 
refs/heads/master from [~thetaphi]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c64a011 ]

LUCENE-7630: Fix (Edge)NGramTokenFilter to no longer drop payloads and preserve 
all attributes
[merge branch 'edgepayloads' from Nathan Gass 
https://github.com/xabbu42/lucene-solr]

Signed-off-by: Uwe Schindler 





[jira] [Commented] (LUCENE-7630) EdgeNGramTokenFilter drops payloads

2017-01-16 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823638#comment-15823638
 ] 

Uwe Schindler commented on LUCENE-7630:
---

Thanks, I will merge and commit this after some testing!




[jira] [Commented] (LUCENE-7630) EdgeNGramTokenFilter drops payloads

2017-01-13 Thread Nathan Gass (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821972#comment-15821972
 ] 

Nathan Gass commented on LUCENE-7630:
-

done




[jira] [Commented] (LUCENE-7630) EdgeNGramTokenFilter drops payloads

2017-01-13 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821894#comment-15821894
 ] 

Uwe Schindler commented on LUCENE-7630:
---

bq. The NGramTokenFilter probably has the same issue. I can port the fix to 
that class when everything is correct.

Please do! You can update the current PR. Otherwise the PR looks fine.




[jira] [Commented] (LUCENE-7630) EdgeNGramTokenFilter drops payloads

2017-01-13 Thread Nathan Gass (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821893#comment-15821893
 ] 

Nathan Gass commented on LUCENE-7630:
-

I committed the suggested improvements and made a pull request 
https://github.com/apache/lucene-solr/pull/138.

The NGramTokenFilter probably has the same issue. I can port the fix to that 
class when everything is correct.




[jira] [Commented] (LUCENE-7630) EdgeNGramTokenFilter drops payloads

2017-01-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821888#comment-15821888
 ] 

ASF GitHub Bot commented on LUCENE-7630:


GitHub user xabbu42 opened a pull request:

https://github.com/apache/lucene-solr/pull/138

EdgeNGramTokenFilter drops payloads

Test and fix for https://issues.apache.org/jira/browse/LUCENE-7630.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xabbu42/lucene-solr edgepayloads

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/138.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #138


commit 61e45283061ae486acc5882c5a770025c8291222
Author: Nathan Gass 
Date:   2017-01-09T13:59:31Z

add test that EdgeNGram filter keeps payloads

commit 6570e6ecc2b14a28da9873948083791ba47145d0
Author: Nathan Gass 
Date:   2017-01-09T14:00:21Z

copy all attributes including payload to new tokens

commit 01f2a87c67392a86b533d0c76ba7666845d1945f
Author: Nathan Gass 
Date:   2017-01-13T14:54:07Z

use captureState and restoreState instead of cloneAttributes







[jira] [Commented] (LUCENE-7630) EdgeNGramTokenFilter drops payloads

2017-01-13 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821733#comment-15821733
 ] 

Uwe Schindler commented on LUCENE-7630:
---

Hi, could you create a Pull Request and add the link here?

About your branch: I would not use cloneAttributes() because that's slow for 
this simple case. cloneAttributes() only helps if you want to modify the 
attributes in the AttributeSource that was created, but is not useful for 
simple save/restore use cases.

For your case, you should simply use captureState(), save the state object, and 
then call restoreState() instead of clearAttributes(). After restoring, you can 
adapt term text and positions/offsets. In addition, when you clone or capture 
state, the call to clearAttributes() is useless and only slows things down. When 
restoring states, everything is restored, so the additional clearing beforehand 
is not needed.
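
The save/restore pattern described above can be illustrated with a small
stdlib-only sketch. This is a toy model, not Lucene code: ToyAttributes and
edgeNGrams are hypothetical names that only mimic the idea of an
AttributeSource. In a real TokenFilter one would presumably call the
inherited captureState() and restoreState(State) and then set the term text
and offsets through the usual attribute instances.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CaptureRestoreSketch {

    /** Hypothetical stand-in for an attribute source: a bag of named values. */
    static final class ToyAttributes {
        private final Map<String, Object> values = new HashMap<>();

        void set(String name, Object value) { values.put(name, value); }

        Object get(String name) { return values.get(name); }

        /** Takes a snapshot (copy) of all current attribute values. */
        Map<String, Object> captureState() { return new HashMap<>(values); }

        /** Overwrites every attribute from a snapshot; no prior clear is needed. */
        void restoreState(Map<String, Object> state) {
            values.clear();
            values.putAll(state);
        }
    }

    /** Emits edge n-grams of the current term, keeping all other attributes. */
    static List<Map<String, Object>> edgeNGrams(ToyAttributes atts, int minGram, int maxGram) {
        String term = (String) atts.get("term");
        // Save the input token's attributes once, payload included.
        Map<String, Object> saved = atts.captureState();
        List<Map<String, Object>> out = new ArrayList<>();
        for (int n = minGram; n <= Math.min(maxGram, term.length()); n++) {
            // Restore instead of clearing, so nothing from the input token is lost.
            atts.restoreState(saved);
            // After restoring, adapt only what differs for the new token.
            atts.set("term", term.substring(0, n));
            out.add(atts.captureState());
        }
        return out;
    }

    public static void main(String[] args) {
        ToyAttributes atts = new ToyAttributes();
        atts.set("term", "lucene");
        atts.set("payload", "0.5");
        for (Map<String, Object> token : edgeNGrams(atts, 1, 3)) {
            System.out.println(token.get("term") + " payload=" + token.get("payload"));
        }
    }
}
```

Each emitted gram carries the payload of the input token because the saved
state is restored before the term text is changed. Clearing the attributes
first would add work without changing the result.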
