Hello,

On Sunday 10 July 2011 07:22:15 William Morgan wrote:
> Hi Horacio,
>
> Reformatted excerpts from Horacio Sanson's message of 2011-07-05:
> > First any attempt to search using japanese text fails with the dreaded
> > incompatible character encodings error:
>
> I'm having trouble reproducing this, or even understanding why your fix
> would help, since all string literals in the code should be UTF-8-encoded.
>
> Could you please apply this patch and tell me what the output is when
> you feed it a crashing search term? Thanks!
>
> --- cut here ---
> diff --git a/bin/heliotrope-server b/bin/heliotrope-server
> index c9754d4..ca764c0 100644
> --- a/bin/heliotrope-server
> +++ b/bin/heliotrope-server
> @@ -219,6 +219,19 @@ class HeliotropeServer < Sinatra::Base
>      end
>      nav += "</div>"
>
> +    puts "start"
> +    p query.original_query_s.encoding
> +    p query.parsed_query_s.encoding
> +    p header("Search: #{query.original_query_s}", query.original_query_s).enc
> +    p "<div>Parsed query: #{escape_html query.parsed_query_s}</div>".encoding
> +    p "<div>Search took #{sprintf '%.2f', info[:elapsed]}s and #{info[:contin
> +    p "#{nav}<table>".encoding
> +    p results.size
> +    p results.map { |r| threadinfo_to_html r }.join.encoding
> +    p "</table>#{nav}".encoding
> +    p footer.encoding
> +    puts "end"
> +
>      header("Search: #{query.original_query_s}", query.original_query_s) +
>      "<div>Parsed query: #{escape_html query.parsed_query_s}</div>" +
>      "<div>Search took #{sprintf '%.2f', info[:elapsed]}s and #{info[:contin
> --- cut here ---

Seems the problem is not heliotrope. The problem is my hooks, which use
MeCab to split Japanese words. If I run a search for Japanese text using
my query hook, this is the output:

search(body:"飲み会", 0, 20) took 0.1ms
start
#<Encoding:ASCII-8BIT>
#<Encoding:UTF-8>
#<Encoding:ASCII-8BIT>
#<Encoding:UTF-8>
"<div>Search took 0.00s and was NOT continued</div>"
#<Encoding:UTF-8>
0
#<Encoding:ASCII-8BIT>
#<Encoding:UTF-8>
#<Encoding:UTF-8>
end

If I put a force_encoding at the end of the hook, I get:

start
#<Encoding:UTF-8>
#<Encoding:UTF-8>
#<Encoding:UTF-8>
#<Encoding:UTF-8>
"<div>Search took 0.00s and was NOT continued</div>"
#<Encoding:UTF-8>
20
#<Encoding:UTF-8>
#<Encoding:UTF-8>
#<Encoding:UTF-8>
end

I need to re-index my emails with the new UTF-8 hooks and test the search
again.

-- 
regards,
Horacio Sanson
_______________________________________________
Sup-devel mailing list
Sup-devel@rubyforge.org
http://rubyforge.org/mailman/listinfo/sup-devel
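
P.S. For anyone hitting the same error, here is a minimal sketch of what I
believe is going on, in plain Ruby with no MeCab or heliotrope involved. It
is not my actual hook: binary_tagged and retag_utf8 are hypothetical helpers
made up for this illustration, and I am assuming the hook output carries
correct UTF-8 bytes that are merely tagged as ASCII-8BIT.

```ruby
# encoding: utf-8

# Hypothetical stand-in for the hook output: correct UTF-8 bytes,
# but tagged as binary (ASCII-8BIT). This is an assumption.
def binary_tagged(str)
  str.dup.force_encoding(Encoding::ASCII_8BIT)
end

# Hypothetical helper: retag a string as UTF-8 without changing its bytes.
def retag_utf8(str)
  str.dup.force_encoding(Encoding::UTF_8)
end

term = binary_tagged("飲み会")
html = "<div>検索: </div>" # UTF-8 string with non-ASCII content

# Joining two strings that both contain non-ASCII bytes but carry
# different encodings raises Encoding::CompatibilityError.
begin
  html + term
rescue Encoding::CompatibilityError => e
  puts "concat failed: #{e.class}"
end

# Retagging at the end of the hook makes the same bytes usable again.
fixed = retag_utf8(term)
p fixed.encoding           # => #<Encoding:UTF-8>
p fixed.valid_encoding?    # => true
p((html + fixed).encoding) # => #<Encoding:UTF-8>
```

Note that force_encoding only changes the tag, not the bytes, so it is only
safe if the bytes really are UTF-8. Checking valid_encoding? after retagging
is a cheap way to catch the case where they are not.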