Re: [sup-devel] Cannot query Japanese characters

2011-06-09 Thread William Morgan
Reformatted excerpts from Horacio Sanson's message of 2011-06-09: > Great, I am downloading my gmail accounts now (again). I can see you > have improved the imap-dumper.rb to handle uidvalidity and uidnext; > that is also great. The git log says gmail labels are also copied > to heliotrope but I d

Re: [sup-devel] Cannot query Japanese characters

2011-06-09 Thread Horacio Sanson
On Wed, Jun 8, 2011 at 2:21 PM, William Morgan wrote: > Reformatted excerpts from Horacio Sanson's message of 2011-05-06: >> Great, let me know when you have the modifications so I can stress >> test them. > > In the most recent version of Heliotrope, there are two hooks you can > use to do this:

Re: [sup-devel] Cannot query Japanese characters

2011-06-09 Thread Horacio Sanson
Unfortunately the gmail sync failed. Below is the error: requesting messages 40266..40315 from imap server ; gmail loving gave us 19 messages in 4.6s = a whopping 4.1m/s scanned 748, indexed 747, skipped 0 bad and 1 seen messages in 400.4s = 1.9 m/s ; requesting messages 40316..40365 from imap s

Re: [sup-devel] Cannot query Japanese characters

2011-06-07 Thread William Morgan
Reformatted excerpts from Horacio Sanson's message of 2011-05-06: > Great, let me know when you have the modifications so I can stress > test them. In the most recent version of Heliotrope, there are two hooks you can use to do this: transform-text and transform-query. To use them, place your Ruby

Re: [sup-devel] Cannot query Japanese characters

2011-05-05 Thread Horacio Sanson
Great, let me know when you have the modifications so I can stress test them. regards, Horacio On Thu, May 5, 2011 at 1:56 AM, William Morgan wrote: > Hi Horacio, > > Thanks for all your help so far. > > Reformatted excerpts from Horacio Sanson's message of 2011-05-04: >> After some hacking I go

Re: [sup-devel] Cannot query Japanese characters

2011-05-04 Thread William Morgan
Hi Horacio, Thanks for all your help so far. Reformatted excerpts from Horacio Sanson's message of 2011-05-04: > After some hacking I got a Heliotrope server that works perfectly with > Japanese text. All I did was follow your comments > and applied the MeCab tokenizer to the message body and que

Re: [sup-devel] Cannot query Japanese characters

2011-05-03 Thread Horacio Sanson
Chasen is the worst tokenizer; it is pretty old. The best one is MeCab, which is faster and by the same author as Chasen. You can see all the major Japanese tokenizers in action at this URL: http://nomadscafe.jp/test/keitaiso/index.cgi. Just put some text in the box and press the button. After some ha

Re: [sup-devel] Cannot query Japanese characters

2011-05-03 Thread Horacio Sanson
Forgot to mention you need the mecab ruby gem. In Ubuntu 10.04 this gem is part of the distribution and can be installed with the command: sudo apt-get install libmecab-ruby1.8 libmecab-ruby1.9.1 mecab-ipadic-utf8 regards Horacio On Wed, May 4, 2011 at 10:42 AM, Horacio Sanson wrote: > Chasen i

Re: [sup-devel] Cannot query Japanese characters

2011-05-03 Thread William Morgan
Reformatted excerpts from Horacio Sanson's message of 2011-05-03: > index = Index.new "index" => # > entry1 = Entry.new => # > entry1.add_string "body", "研究会" => # > docid1 = index.add_entry entry1 => 1 > q1 = Query.new "body", "研究" => body:"研究" > results1 = index.search q1 => [] The problem here
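The excerpt above is cut off before the explanation. One plausible reading, consistent with the later messages in this thread about applying a tokenizer to Japanese text, is that "研究会" is indexed as a single token, so a query for "研究" finds nothing. The sketch below is plain Ruby and does not use Whistlepig; the helper names are hypothetical, and character bigrams stand in for a real morphological tokenizer such as MeCab purely for illustration.

```ruby
# Illustration only: plain Ruby, not Whistlepig's actual tokenizer.

# Split on whitespace, which leaves unspaced Japanese text as one token.
def whitespace_tokens(text)
  text.split
end

# Split into overlapping two-character tokens (character bigrams).
# This is a stand-in for a real tokenizer, not what MeCab does.
def bigram_tokens(text)
  chars = text.gsub(/\s+/, "").chars
  chars.length < 2 ? chars : chars.each_cons(2).map(&:join)
end

# A document matches when every query token is among its tokens.
def matches?(doc_tokens, query_tokens)
  (query_tokens - doc_tokens).empty?
end

body  = "研究会"
query = "研究"

# Whole-token matching: "研究" is not equal to the single token "研究会".
puts matches?(whitespace_tokens(body), whitespace_tokens(query))

# Bigram matching: "研究" is one of the bigrams of "研究会".
puts matches?(bigram_tokens(body), bigram_tokens(query))
```

The same tokenization has to be applied to both the indexed text and the query, otherwise the two token sets cannot line up.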

Re: [sup-devel] Cannot query Japanese characters

2011-05-03 Thread Horacio Sanson
I managed to stop the crash when searching for Japanese text by forcing UTF-8 encoding in the query parameter (see patch). But it seems that Whistlepig cannot speak Japanese. I tried the following small test and as you can see I get no results: > require 'rubygems' => true > require 'whistlepig' =>
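The patch itself is not included in this listing. As a rough sketch of what forcing UTF-8 on a query parameter can look like in Ruby 1.9 or later, assuming the parameter arrives tagged as binary (ASCII-8BIT): the helper name `utf8_query` is hypothetical and is not taken from Heliotrope.

```ruby
# Hypothetical helper, not the actual patch: reinterpret the bytes of a
# query parameter as UTF-8 without changing the bytes themselves.
def utf8_query(param)
  query = param.dup.force_encoding(Encoding::UTF_8)
  # Reject byte sequences that are not well-formed UTF-8.
  raise ArgumentError, "query is not valid UTF-8" unless query.valid_encoding?
  query
end

# Simulate a parameter that arrived as raw bytes. These are the UTF-8
# bytes of "手紙", i.e. the decoded form of %E6%89%8B%E7%B4%99.
raw = "\xE6\x89\x8B\xE7\xB4\x99".b

query = utf8_query(raw)
puts query.encoding
puts query
```

`force_encoding` only relabels the string; it does not transcode, so it is appropriate only when the bytes are already UTF-8.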

Re: [sup-devel] Cannot query Japanese characters

2011-05-01 Thread Horacio Sanson
I also tried with ruby 1.8 and heliotrope does not crash, but searching for any Japanese word returns no matches, even for search terms I know have matches. And by the way the installation instructions should mention that for ruby 1.8 we also need to install the json gem or heliotrope won't start. regar

Re: [sup-devel] Cannot query Japanese characters

2011-05-01 Thread Horacio Sanson
Installed Whistlepig 0.6 but now I get a different error that looks similar to the turnsole problem. Below is the backtrace: http://localhost:8042/search?q=primo -> /search?q=%7Einbox&start=0&num=20 127.0.0.1 - - [02/May/2011 00:31:58] "GET /favicon.ico HTTP/1.1" 404 447 0.0008 localhost - - [02/May/

Re: [sup-devel] Cannot query Japanese characters

2011-04-28 Thread William Morgan
Reformatted excerpts from William Morgan's message of 2011-04-26: > Thanks for the bug report on this one too. It's great to have someone > testing this stuff with non-ASCII code. This is a known bug in > Whistlepig and I should be releasing a fix soon. This is fixed in Whistlepig 0.6. Heliotrope

Re: [sup-devel] Cannot query Japanese characters

2011-04-25 Thread William Morgan
Reformatted excerpts from Horacio Sanson's message of 2011-04-25: > I like sup's idea and have a lot of hope in heliotrope but unfortunately both > have problems when dealing with my language: Japanese. > > When I put a search string like this "subject: 手紙" I get the following > crash: Thanks fo

[sup-devel] Cannot query Japanese characters

2011-04-24 Thread Horacio Sanson
I like sup's idea and have a lot of hope in heliotrope but unfortunately both have problems when dealing with my language: Japanese. When I put a search string like this "subject: 手紙" I get the following crash: 27.0.0.1 - - [25/Apr/2011 10:17:17] "GET /search?q=%E6%89%8B%E7%B4%99 HTTP/1.1" 200