[Bug 26997] CJKFilter wrongly tokenize a CJK and non-CJK mixed string.
https://bugzilla.wikimedia.org/show_bug.cgi?id=26997

Chad H. changed:

      What  |Removed            |Added
  Keywords  |patch-need-review  |
    Status  |NEW                |RESOLVED
        CC  |                   |innocentkil...@gmail.com
Resolution  |---                |WONTFIX

--- Comment #10 from Chad H. ---
I don't see this issue in Cirrus/Elastic. Marking WONTFIX since lsearchd is end of life but adding cirrus-fixed tag.

--
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
[Bug 26997] CJKFilter wrongly tokenize a CJK and non-CJK mixed string.
https://bugzilla.wikimedia.org/show_bug.cgi?id=26997

--- Comment #9 from Andre Klapper ---
In the meantime, lucene-search in Wikimedia has reached its end of life and will not be improved further.

Jun Mizuno: It would be awesome if you could check whether the problem still exists in the CirrusSearch extension that is being worked on (it is also a Lucene-based search for MediaWiki, backed by Elasticsearch instead of Wikimedia's home-grown lsearchd).
[Bug 26997] CJKFilter wrongly tokenize a CJK and non-CJK mixed string.
https://bugzilla.wikimedia.org/show_bug.cgi?id=26997

Andre Klapper changed:

     What  |Removed               |Added
       CC  |                      |wikibugs-l@lists.wikimedia.org
Component  |Lucene Search         |lucene-search-2
  Product  |MediaWiki extensions  |Wikimedia

--- Comment #8 from Andre Klapper ---
[Merging "MediaWiki extensions/Lucene Search" into "Wikimedia/lucene-search2", see bug 46542. You can filter bugmail for: search-component-merge-20130326 ]
[Bug 26997] CJKFilter wrongly tokenize a CJK and non-CJK mixed string.
https://bugzilla.wikimedia.org/show_bug.cgi?id=26997

--- Comment #7 from Sumana Harihareswara 2012-05-25 03:20:17 UTC ---
Jun: Thanks again for the patch. Are you interested in using developer access to submit any future improvements to MediaWiki and its extensions directly to our Git source control system?

https://www.mediawiki.org/wiki/Developer_access
https://www.mediawiki.org/wiki/Git/Workflow#How_to_submit_a_patch
[Bug 26997] CJKFilter wrongly tokenize a CJK and non-CJK mixed string.
https://bugzilla.wikimedia.org/show_bug.cgi?id=26997

--- Comment #6 from Jun Mizuno 2011-12-23 06:17:38 UTC ---
Hi,

It is welcome news that my patch might be reviewed. I have used a patched lucene-search at my site for almost a year; however, I am not sure my patch is valid.

The Lucene-Search extension is a fundamental tool at my site. I don't know whether there is anything I can do, but I will study the implementation of CJK support more closely.
[Bug 26997] CJKFilter wrongly tokenize a CJK and non-CJK mixed string.
https://bugzilla.wikimedia.org/show_bug.cgi?id=26997

--- Comment #5 from Liangent 2011-12-23 05:02:22 UTC ---
If you really want to work on this, I think you can try to incorporate some existing project into the extension:

http://stackoverflow.com/questions/5834371/is-there-any-good-open-source-or-freely-available-chinese-segmentation-algorithm
[Bug 26997] CJKFilter wrongly tokenize a CJK and non-CJK mixed string.
https://bugzilla.wikimedia.org/show_bug.cgi?id=26997

Sumana Harihareswara changed:

What  |Removed  |Added
  CC  |         |orenboch...@gmail.com

--- Comment #4 from Sumana Harihareswara 2011-12-22 16:05:40 UTC ---
Jun, I'm asking Oren Bochman to take a look at your patch. You might also be interested in working with him more generally to improve our Lucene search extension.
[Bug 26997] CJKFilter wrongly tokenize a CJK and non-CJK mixed string.
https://bugzilla.wikimedia.org/show_bug.cgi?id=26997

Sumana Harihareswara changed:

    What  |Removed  |Added
Keywords  |         |need-review
      CC  |         |rain...@eunet.rs, suma...@panix.com

--- Comment #3 from Sumana Harihareswara 2011-11-14 16:18:37 UTC ---
Jun, I'm sorry that it is taking so long for a developer to review your patch! I have added the "need-review" keyword to indicate that your patch awaits review. Thank you for the patch.
[Bug 26997] CJKFilter wrongly tokenize a CJK and non-CJK mixed string.
https://bugzilla.wikimedia.org/show_bug.cgi?id=26997

--- Comment #1 from Jun Mizuno 2011-01-28 14:20:40 UTC ---
Created attachment 8060
  --> https://bugzilla.wikimedia.org/attachment.cgi?id=8060
a patch for CJKFilter.java and its test.

The previous patch contained wrong code: tokens without a CJK character would be filtered incorrectly. I have replaced the patch.
[Bug 26997] CJKFilter wrongly tokenize a CJK and non-CJK mixed string.
https://bugzilla.wikimedia.org/show_bug.cgi?id=26997

Jun Mizuno changed:

                       What  |Removed  |Added
Attachment #8054 is obsolete  |0        |1
[Bug 26997] CJKFilter wrongly tokenize a CJK and non-CJK mixed string.
https://bugzilla.wikimedia.org/show_bug.cgi?id=26997

Mark A. Hershberger changed:

    What  |Removed  |Added
Keywords  |         |patch
      CC  |         |m...@everybody.org