[Bug 42234] Normal search with [some] accented letters fails: rebuild search index for wikidatawiki

2013-03-11 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=42234

Daniel Kinzler daniel.kinz...@wikimedia.de changed:

   What|Removed |Added

 CC||mybugs.m...@gmail.com

--- Comment #25 from Daniel Kinzler daniel.kinz...@wikimedia.de ---
*** Bug 45860 has been marked as a duplicate of this bug. ***

-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are watching all bug changes.
___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l


[Bug 42234] Normal search with [some] accented letters fails: rebuild search index for wikidatawiki

2013-03-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=42234

Andre Klapper aklap...@wikimedia.org changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

--- Comment #21 from Andre Klapper aklap...@wikimedia.org ---
Confirming that Łódź is still a problem for wikidata.org.
Reopening as per comment 19, though not sure if this is the same problem.



[Bug 42234] Normal search with [some] accented letters fails: rebuild search index for wikidatawiki

2013-03-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=42234

--- Comment #22 from Daniel Kinzler daniel.kinz...@wikimedia.de ---
(In reply to comment #19)
> notpeter  I have rebuilt the index from a fresh dump of wikidatawiki. this
> should hopefully fix the problem. if the problem persists, please re-open
> this ticket.

Oh... how does rebuilding the index from a dump work? Which code does it use?
Can it handle non-wikitext content at all? If not, it will index the JSON...

For the live updates, I have implemented the required support in the OAI
extension, so OAI's lsearch output is not JSON but (generated) plain text. The
same needs to be done when re-indexing based on dumps, I suppose. So far, I
assumed that the rebuild would be using the same interface to access the data.
If that is not the case, rebuilding the index might actually cause *more*
breakage.
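
A minimal sketch of the idea described above: flattening JSON item content into plain text before it reaches the indexer, so that the searchable terms are indexed rather than the raw JSON. This is not the OAI extension's code, which is not shown in this thread. The function name `entity_to_text` and the JSON layout with `label`, `description` and `aliases` keys are assumptions made purely for illustration.

```python
import json


def entity_to_text(raw_json: str) -> str:
    """Flatten a JSON entity into plain text lines for a full-text indexer.

    Assumed (hypothetical) layout:
      {"label": {"pl": "..."}, "description": {"en": "..."},
       "aliases": {"en": ["...", "..."]}}
    """
    entity = json.loads(raw_json)
    parts = []

    # Single-valued fields: one string per language code.
    for field in ("label", "description"):
        for _lang, value in sorted(entity.get(field, {}).items()):
            parts.append(value)

    # Multi-valued field: a list of strings per language code.
    for _lang, values in sorted(entity.get("aliases", {}).items()):
        parts.extend(values)

    # One term per line; no braces, quotes or escape sequences survive.
    return "\n".join(parts)


if __name__ == "__main__":
    sample = json.dumps(
        {
            "label": {"pl": "Łódź"},
            "description": {"en": "city in Poland"},
            "aliases": {"en": ["Lodz"]},
        }
    )
    print(entity_to_text(sample))
```

If a dump-based rebuild bypasses a step like this, the indexer sees the serialized JSON instead, where non-ASCII letters may be stored as escape sequences rather than as the letters themselves.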



[Bug 42234] Normal search with [some] accented letters fails: rebuild search index for wikidatawiki

2013-03-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=42234

--- Comment #23 from Munagala Ramanath (Ram) r...@wikimedia.org ---
Not sure exactly how notpeter did it but one way is to use the import-file()
function in puppet/files/lucene/lucene.jobs.sh. There is also an import-db()
function that dumps the DB to a file and runs the former function on that file.

It uses the Java class org.wikimedia.lsearch.importer.BuildAll. I don't yet
know this part of the code well enough to answer the other questions.



[Bug 42234] Normal search with [some] accented letters fails: rebuild search index for wikidatawiki

2013-03-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=42234

--- Comment #24 from Daniel Kinzler daniel.kinz...@wikimedia.de ---
(In reply to comment #23)
> Not sure exactly how notpeter did it but one way is to use the import-file()
> function in puppet/files/lucene/lucene.jobs.sh. There is also an import-db()
> function that dumps the DB to a file and runs the former function on that
> file.
>
> It uses the Java class org.wikimedia.lsearch.importer.BuildAll. I don't yet
> know this part of the code well enough to answer the other questions.

We don't have any handling of non-wikitext content in Java, and I don't see how
it could be added... we'd either have to create specialized dumps, or implement
the entire content handler infrastructure in Java (including java versions of
content handlers supplied by extensions), or not use dumps and always call the
API.

None of the options sounds good :\



[Bug 42234] Normal search with [some] accented letters fails: rebuild search index for wikidatawiki

2013-03-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=42234

Greg Grossmeier g...@wikimedia.org changed:

   What|Removed |Added

 CC||g...@wikimedia.org



[Bug 42234] Normal search with [some] accented letters fails: rebuild search index for wikidatawiki

2013-03-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=42234

Helder mybugs.m...@gmail.com changed:

   What|Removed |Added

   See Also||https://bugzilla.wikimedia.org/show_bug.cgi?id=45860



[Bug 42234] Normal search with [some] accented letters fails: rebuild search index for wikidatawiki

2013-03-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=42234

--- Comment #18 from Andre Klapper aklap...@wikimedia.org ---
Make that last comment RT #4625



[Bug 42234] Normal search with [some] accented letters fails: rebuild search index for wikidatawiki

2013-03-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=42234

jeremyb bugzilla+org.wikime...@tuxmachine.com changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution|--- |FIXED

--- Comment #19 from jeremyb bugzilla+org.wikime...@tuxmachine.com ---
notpeter  I have rebuilt the index from a fresh dump of wikidatawiki. this
should hopefully fix the problem. if the problem persists, please re-open this
ticket.



[Bug 42234] Normal search with [some] accented letters fails: rebuild search index for wikidatawiki

2013-03-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=42234

--- Comment #20 from Nemo federicol...@tiscali.it ---
(In reply to comment #15)
> Names like Łódź are impossible to search:
> http://www.wikidata.org/w/index.php?search=%C5%81%C3%B3d%C5%BA&title=Special%3ASearch

Still getting no result as of now. The other examples here seem to work.



[Bug 42234] Normal search with [some] accented letters fails: rebuild search index for wikidatawiki

2013-03-01 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=42234

Lydia Pintscher lydia.pintsc...@wikimedia.de changed:

   What|Removed |Added

 CC||lydia.pintscher@wikimedia.de

--- Comment #13 from Lydia Pintscher lydia.pintsc...@wikimedia.de ---
Thanks for investigating, Tim. Any chance you can fix this? Anything I can tell
the community (who's rather unhappy about the search)?



[Bug 42234] Normal search with [some] accented letters fails: rebuild search index for wikidatawiki

2013-03-01 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=42234

--- Comment #14 from Munagala Ramanath (Ram) r...@wikimedia.org ---
Looks like Tim fixed it -- timestamp on searchidx1001 for wikidatawiki is today:

cat ../status/wikidatawiki 
#Last incremental update timestamp
#Fri Mar 01 03:42:21 UTC 2013
timestamp=2013-03-01T03\:41\:07Z

Many of the index files have a timestamp of yesterday or today.
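
The status file shown above looks like a Java properties file, which would explain the backslash-escaped colons in the timestamp. As a sketch, assuming only that format and nothing else about how lsearch uses the file, the last-update time could be read as follows. `read_last_update` is a hypothetical helper, not part of any existing tool.

```python
from datetime import datetime, timezone

# Contents copied from the status file quoted in the comment above.
SAMPLE = r"""#Last incremental update timestamp
#Fri Mar 01 03:42:21 UTC 2013
timestamp=2013-03-01T03\:41\:07Z
"""


def read_last_update(status_text: str) -> datetime:
    """Return the 'timestamp' entry of a properties-style status file as UTC."""
    for line in status_text.splitlines():
        line = line.strip()
        # Skip blank lines and '#' comment lines.
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        if key.strip() == "timestamp":
            # Undo the properties-style escaping of ':' before parsing.
            value = value.strip().replace("\\:", ":")
            parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
            return parsed.replace(tzinfo=timezone.utc)
    raise ValueError("no timestamp entry found")


if __name__ == "__main__":
    print(read_last_update(SAMPLE).isoformat())
```

Comparing this value against the current time is one way to notice an index that has stopped receiving incremental updates.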



[Bug 42234] Normal search with [some] accented letters fails: rebuild search index for wikidatawiki

2013-03-01 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=42234

--- Comment #15 from Leinad danny.lei...@gmail.com ---
(In reply to comment #13)
> Thanks for investigating, Tim. Any chance you can fix this? Anything I can
> tell the community (who's rather unhappy about the search)?

Hi,
I would like to suggest postponing the deployment of Wikidata on projects like
plwiki until this bug is fixed. This is a really important issue, and in my
opinion it will create a negative impression of the new tool. On plwiki we still
have trouble convincing the community of the advantages of Wikidata, and bugs
like this won't help us.

Names like Łódź are impossible to search:
http://www.wikidata.org/w/index.php?search=%C5%81%C3%B3d%C5%BA&title=Special%3ASearch
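
The search term in the URL above is percent-encoded UTF-8. A small sketch, using only Python's standard library, shows that it decodes to the name being searched for, which helps rule out the URL itself as the cause:

```python
from urllib.parse import quote, unquote

# The value of the "search" parameter, copied from the URL above.
encoded_term = "%C5%81%C3%B3d%C5%BA"

# unquote() decodes percent-escapes as UTF-8 by default.
decoded_term = unquote(encoded_term)
print(decoded_term)  # Łódź

# Encoding the decoded text again gives back the same escapes.
print(quote(decoded_term) == encoded_term)  # True
```

The request therefore carries the correct string; the failure has to lie in how the term is indexed or matched.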



[Bug 42234] Normal search with [some] accented letters fails: rebuild search index for wikidatawiki

2013-03-01 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=42234

--- Comment #16 from Nemo federicol...@tiscali.it ---
(In reply to comment #15)
> Names like Łódź are impossible to search:
> http://www.wikidata.org/w/index.php?search=%C5%81%C3%B3d%C5%BA&title=Special%3ASearch

On it.wiki users were just told not to use Special:Search at all, because it's
completely useless, and to rely on the search gadget (enabled by default on
Vector) which is activated by clicking the arrow next to the search bar. You
should probably do the same and forget the standard search: this helped a lot
on it.wiki.



[Bug 42234] Normal search with [some] accented letters fails: rebuild search index for wikidatawiki

2013-03-01 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=42234

jeremyb bugzilla+org.wikime...@tuxmachine.com changed:

   What|Removed |Added

 CC||bugzilla+org.wikimedia@tuxmachine.com



[Bug 42234] Normal search with [some] accented letters fails: rebuild search index for wikidatawiki

2013-03-01 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=42234

Daniel Zahn dz...@wikimedia.org changed:

   What|Removed |Added

 CC||dz...@wikimedia.org

--- Comment #17 from Daniel Zahn dz...@wikimedia.org ---
link to RT-4625



[Bug 42234] Normal search with [some] accented letters fails: rebuild search index for wikidatawiki

2013-02-27 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=42234

--- Comment #12 from Tim Starling tstarl...@wikimedia.org ---
Bash history and file modification timestamps on searchidx2 and searchidx1001
seem to indicate that the wikidatawiki index hasn't been rebuilt since November
14.



[Bug 42234] Normal search with [some] accented letters fails: rebuild search index for wikidatawiki

2013-02-25 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=42234

Nemo federicol...@tiscali.it changed:

   What|Removed |Added

   Keywords||ops
    Summary|Normal search with [some] accented letters fails, search by label has to be used|Normal search with [some] accented letters fails: rebuild search index for wikidatawiki
