Hi Pedro Thanks for that debugging information. So we have an inconsistency between the PostgreSQL function and the Ruby function. Thankfully it wasn't to difficult to fix - I was presuming that the length of the string in bytes is the number of characters - not the case with non-ASCII characters.
The fix is now pushed to Github - are you using Bundler? Because for the moment, it's best to refer to that until a new TS release. Otherwise, let me know, I'll rush a new release through. Cheers -- Pat On 12/01/2011, at 3:33 AM, Pedro Cunha wrote: > Script/Console: > >>> FoodType.find(26).name > => "Indonésia" >>> FoodType.find(26).name.to_crc32 > => 1221489120 > >>> Restaurant.find(71).food_types > => [#<FoodType id: 11, name: "Britânica", code: nil, permalink: > "britanica", created_at: "2011-01-07 12:31:50", updated_at: > "2011-01-07 12:31:50">, #<FoodType id: 26, name: "Indonésia", code: > nil, permalink: "indonesia", created_at: "2011-01-07 12:31:50", > updated_at: "2011-01-07 12:31:50">] > > SQL: (For restaurant ID _71_ ) > id place_id food_type_facet > 71 71 497003136,1932385248 > > > Now another case: > >>> Restaurant.find(70).food_types > => [#<FoodType id: 20, name: "Francesa", code: nil, permalink: > "francesa", created_at: "2011-01-07 12:31:50", updated_at: "2011-01-07 > 12:31:50">, #<FoodType id: 28, name: "Irlandesa", code: nil, > permalink: "irlandesa", created_at: "2011-01-07 12:31:50", updated_at: > "2011-01-07 12:31:50">] > >>> FoodType.find(20).name.to_crc32 > => 1302718991 > > id place_id food_type_facet > 70 70 386799161,1302718991 > > > > > Seems like for accented chars output SQL differs from script/console > > > > > > > > On Jan 11, 1:14 am, Pedro Cunha <[email protected]> wrote: >> This is the resulting sql >> >> id place_id food_type_facet >> 40 40 3615477584,206447654 >> >> About the irb I don't seem able to place accented chars (bash v.4 does >> show them). Will have to dig a bit, those solutions on web to solve >> this are not pretty. >> >> Pat, what also I don't understand is why normal columns which I'm >> indexing (as facets too) display correctly the accented chars, but if >> they are associations like this case "Indonésia" they just give null. >> >> Best regards >> Pedro >> >> On Jan 8, 2:49 am, Pat Allan <[email protected]> wrote: >> >> >> >> >> >> >> >>> Hi Pedro >> >>> What's the output of the following SQL query: >> >>> SELECT "places"."id" * 1 + 0 AS "id", >>> "places"."id" as "place_id", >>> array_to_string(array_accum(COALESCE(crc32("food_types"."name"), >>> '0')), ',') AS "food_type_facet" >>> FROM "places" >>> LEFT OUTER JOIN "food_types_restaurants" ON >>> "food_types_restaurants".restaurant_id = "places".id >>> LEFT OUTER JOIN "food_types" ON "food_types".id = >>> "food_types_restaurants".food_type_id >>> WHERE "places"."id" = 40 >>> AND "places"."type" = 'Restaurant' >>> GROUP BY "places"."id" >> >>> Also, what's the output in script/console of: >>> "Indonésia".to_crc32 >> >>> -- >>> Pat >> >>> On 05/01/2011, at 10:43 PM, Pedro Cunha wrote: >> >>>> sql_query = SELECT "places"."id" * 1 + 0 AS "id" , "places"."name" AS >>>> "name", "places"."concelho" AS "concelho", "places"."freguesia" AS >>>> "freguesia", "places"."locality" AS "locality", "places"."distrito" AS >>>> "distrito", array_to_string(array_accum(COALESCE("zones"."name", >>>> '0')), ' ') AS "zone", >>>> array_to_string(array_accum(COALESCE("ideal_types"."name", '0')), ' ') >>>> AS "ideal_type", >>>> array_to_string(array_accum(COALESCE("food_types"."name", '0')), ' ') >>>> AS "food_type", "places"."id" AS "sphinx_internal_id", >>>> COALESCE(crc32(NULLIF("places"."type",'')), 2764575215::bigint) AS >>>> "class_crc", 0 AS "sphinx_deleted", crc32("places"."locality") AS >>>> "locality_facet", crc32("places"."distrito") AS "distrito_facet", >>>> array_to_string(array_accum(COALESCE(crc32("zones"."name"), '0')), >>>> ',') AS "zone_facet", >>>> array_to_string(array_accum(COALESCE(crc32("ideal_types"."name"), >>>> '0')), ',') AS "ideal_type_facet", >>>> array_to_string(array_accum(COALESCE(crc32("food_types"."name"), >>>> '0')), ',') AS "food_type_facet", "places"."average_price_id" AS >>>> "average_price_id", "places"."has_disabled_access" AS >>>> "has_disabled_access", "places"."has_live_music" AS "has_live_music", >>>> "places"."has_animation" AS "has_animation", "places"."has_beach" AS >>>> "has_beach", "places"."has_outside_tables" AS "has_outside_tables", >>>> "places"."has_vista" AS "has_vista", >>>> array_to_string(array_accum(COALESCE("smoking_types"."id", '0')), ',') >>>> AS "smoking_type_id", "places"."current_configuration_id" AS >>>> "current_configuration_id", cast(extract(epoch from >>>> "places"."created_at") as int) AS "created_at", "places"."total_like" >>>> AS "total_like", "average_prices"."position" AS "average_price" FROM >>>> "places" LEFT OUTER JOIN "place_zones" ON ("places"."id" = >>>> "place_zones"."place_id") LEFT OUTER JOIN "zones" ON ("zones"."id" = >>>> "place_zones"."zone_id") LEFT OUTER JOIN "ideal_types_restaurants" >>>> ON "ideal_types_restaurants".restaurant_id = "places".id LEFT OUTER >>>> JOIN "ideal_types" ON "ideal_types".id = >>>> "ideal_types_restaurants".ideal_type_id LEFT OUTER JOIN >>>> "food_types_restaurants" ON "food_types_restaurants".restaurant_id = >>>> "places".id LEFT OUTER JOIN "food_types" ON "food_types".id = >>>> "food_types_restaurants".food_type_id LEFT OUTER JOIN >>>> "restaurants_smoking_types" ON >>>> "restaurants_smoking_types".restaurant_id = "places".id LEFT OUTER >>>> JOIN "smoking_types" ON "smoking_types".id = >>>> "restaurants_smoking_types".smoking_type_id LEFT OUTER JOIN >>>> "average_prices" ON "average_prices".id = "places".average_price_id >>>> WHERE "places"."id" >= $start AND "places"."id" <= $end AND >>>> "places"."type" = 'Restaurant' GROUP BY "places"."id", >>>> "places"."name", "places"."concelho", "places"."freguesia", >>>> "places"."locality", "places"."distrito", "places"."id", >>>> "places"."locality", "places"."distrito", "places"."average_price_id", >>>> "places"."has_disabled_access", "places"."has_live_music", >>>> "places"."has_animation", "places"."has_beach", >>>> "places"."has_outside_tables", "places"."has_vista", >>>> "places"."current_configuration_id", "places"."created_at", >>>> "places"."total_like", "average_prices"."position", "places"."type" >> >>>> Postgres version: >>>> PostgreSQL 8.4.5 and 9.0.1 >> >>>> Still I'm suspected from the collation I refered on my last post >>>> (en_US) >> >>>> Postgres gem: >>>> pg (0.9.0) >> >>>> On Jan 5, 2:41 am, Pat Allan <[email protected]> wrote: >>>>> Hi Pedro >> >>>>> In your config/development.sphinx.conf file, what's the sql_query value >>>>> for your restaurant_core_0 source? Also, what version of PostgreSQL are >>>>> you using? >> >>>>> -- >>>>> Pat >> >>>>> On 04/01/2011, at 10:55 PM, Pedro Cunha wrote: >> >>>>>>>> Restaurant.find(40).food_types >>>>>> [#<FoodType id: 26, name: "Indonésia", code: nil, permalink: >>>>>> "indonesia", created_at: "2011-01-04 11:39:04", updated_at: >>>>>> "2011-01-04 11:39:04">, #<FoodType id: 27, name: "Internacional", >>>>>> code: nil, permalink: "internacional", created_at: "2011-01-04 >>>>>> 11:39:04", updated_at: "2011-01-04 11:39:04">] >> >>>>>>>> Restaurant.search "Fonte Clotilde Miseráveis" >>>>>> => [#<Restaurant id: 40, name: "Fonte Clotilde Miseráveis" ... >> >>>>>>>> Restaurant.facets "Fonte Clotilde Miseráveis" >>>>>> => {:locality=>{"Quinta das >>>>>> Teresinhas"=>1}, :distrito=>{"Lisboa"=>1}, :ideal_type=>{"Cocktails"=>1, >>>>>> "Descontrair"=>1, "Beira mar"=>1}, :food_type=>{"Internacional"=>1, >>>>>> nil=>1}} >> >>>>>> With: >>>>>> TS 1.3.20 >>>>>> postgres, database is in UTF8 and collation en_US >> >>>>>> Sorry for double posting forgot more specific examples >> >>>>>> On Jan 4, 11:33 am, Pedro Cunha <[email protected]> wrote: >>>>>>> My current index: >> >>>>>>> define_index do >>>>>>> indexes :name, :as => :name >>>>>>> indexes concelho >>>>>>> indexes freguesia >>>>>>> indexes locality, :as => :locality, :facet => true >>>>>>> indexes distrito, :facet => true >>>>>>> indexes ideal_types(:name), :as => :ideal_type, :facet => true >>>>>>> indexes food_types(:name), :as => :food_type, :facet => true >> >>>>>>> has average_price_id >>>>>>> has has_live_music >>>>>>> has has_animation >>>>>>> has has_beach >>>>>>> has has_outside_tables >>>>>>> has has_vista >>>>>>> has smoking_types(:id), :as => :smoking_type_id >> >>>>>>> has created_at >> >>>>>>> # Collections for sorting >>>>>>> has total_like, :as => :likes, :type => :integer >>>>>>> has average_price(:position), :as => :average_price, :type >>>>>>> => :integer >> >>>>>>> set_property :min_prefix_len => 3 >>>>>>> end >> >>>>>>> Facet search. It includes portuguese words, all words which include >>>>>>> accented chars are just not displayed / indexed? Can't figure it out >>>>>>> if I'm doing something wrong >> >>>>>>>>> Restaurant.facets "" >> >>>>>>> => {:locality=>{"Quinta do Alto das Areias"=>1, "Quinta do >>>>>>> Montezelo"=>1, "Ajuda"=>1, "Serra da Luz"=>3, "Braço de Prata"=>1, >>>>>>> "Chelas"=>1, "Campo de Ourique"=>1, "Santa Maria dos Olivais"=>1, >>>>>>> "Bairro Dona Leonor"=>1, "Quinta do Cartucho"=>1, "Alto da Ajuda"=>1, >>>>>>> "Quinta das Rosas"=>3, "Charneca"=>1, "Bairro do Menino de Deus"=>1, >>>>>>> "Olival do Rio"=>2, "Alvalade"=>2, "Bairro das Calvanas"=>1, "Fonte da >>>>>>> Pipa"=>1, "Bairro da Musgueira Sul"=>1, "Olivais Sul"=>1, "Quinta da >>>>>>> Fonte da Calça"=>1, "Casa Branca"=>2, "Encarnação"=>2, "Mesquita"=>1, >>>>>>> "Bairro de Santa Clara"=>1, "Caselas"=>1, "Quinta do Mata Mouros"=>1, >>>>>>> "Grilo"=>1, "Hipódromo"=>2, "Quinta do Baleeiro"=>1, "Quinta das >>>>>>> Teresinhas"=>1, "Francelha de Cima"=>1, "Quinta das Lavadeiras"=>2, >>>>>>> "Quinta dos Azulejos"=>1, "Bairro da Liberdade"=>1, "Quinta do >>>>>>> Pisany"=>1, "Quinta do Souto"=>2, "Quinta da Barroca"=>1, >>>>>>> "Alcântara"=>1}, :ideal_type=>{"Vista"=>11, nil=>27, "Ver e ser >>>>>>> visto"=>16, "Cocktails"=>14, "Descontrair"=>13, "Esplanada"=>8, "Boa >>>>>>> carta de vinhos"=>15, "Beira mar"=>7, "Fora d'horas"=>13, "Sair a >>>>>>> dois"=>14}, :distrito=>{"Lisboa"=>50}, :food_type=>{"Sul >>>>>>> Americana"=>2, "Goesa"=>1, "Francesa"=>2, "Grega"=>3, >>>>>>> "Internacional"=>2, "Judaica"=>2, "Coreana"=>1, "Vietnamita"=>2, >>>>>>> "Steak House"=>2, "Indiana"=>2, "Espanhola"=>4, "Brasileira"=>2, "Fast >>>>>>> Food"=>2, "Light"=>1, "Libanesa"=>2, "Belga"=>4, "Tailandesa"=>1, >>>>>>> "Americana"=>1, "Sueca"=>2, "Israelita"=>2, "Africana"=>2, >>>>>>> "Caribenha"=>1, nil=>18, "Italiana"=>2, "Autor"=>4, "Vegetariana"=>4, >>>>>>> "Nepalesa"=>4, "de Mercado"=>1, "Tibetana"=>2, "Japonesa"=>2, "Fast >>>>>>> Food Gourmet"=>1, "Cubana"=>1, "Portuguesa"=>3, "Irlandesa"=>1, >>>>>>> "Chinesa"=>2, "Grelhados"=>3, "Argentina"=>2, "Marroquina"=>2, >>>>>>> "Tapas"=>3, "Russa"=>1}} >> >>>>>>> Cheers >> >>>>>>> On Jan 4, 1:19 am, Pat Allan <[email protected]> wrote: >> >>>>>>>> Hi Pedro >> >>>>>>>> Not quite sure what you mean... can you give me an example of the data >>>>>>>> you're faceting, and what you expect the results to be? >> >>>>>>>> -- >>>>>>>> Pat >> >>>>>>>> On 04/01/2011, at 12:09 PM, Pedro Cunha wrote: >> >>>>>>>>> Is there any catch to able to get all facets which are refering >>>>>>>>> accented words? >> >>>>>>>>> I get something like nil => 4 >> >>>>>>>>> Or I need to configure a proper char set table? But that doesn't >>>>>>>>> really help my problem because I don't want to lose the proper typing >>>>>>>>> of the words >> >>>>>>>>> Cheers, >> >> ... >> >> read more » > > -- > You received this message because you are subscribed to the Google Groups > "Thinking Sphinx" group. > To post to this group, send email to [email protected]. > To unsubscribe from this group, send email to > [email protected]. > For more options, visit this group at > http://groups.google.com/group/thinking-sphinx?hl=en. > -- You received this message because you are subscribed to the Google Groups "Thinking Sphinx" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/thinking-sphinx?hl=en.
