Ah, I think I know what the problem is. If you look at the facet code in TS, 
you'll find both limit and max_matches are set to either 1000 (Sphinx's 
default max_matches) or the custom value (sometimes larger), so Sphinx looks 
at the widest possible set of results when working out the facet values.

https://github.com/freelancing-god/thinking-sphinx/blob/master/lib/thinking_sphinx/facet_search.rb#L59-60

Try doing the same for your facet search, as it's possible Sphinx just isn't 
getting deep enough into the result set to find the other combinations.
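
As a rough sketch of what that could look like on your side (the helper name 
and the constant are made up for illustration, and I'm assuming your search 
class accepts :limit and :max_matches under those names, which may not match 
your code):

```ruby
# Sphinx's default max_matches, as mentioned above.
SPHINX_DEFAULT_MAX_MATCHES = 1000

# Hypothetical helper (not part of Thinking Sphinx or mongoid-sphinx):
# builds the options for one grouped facet query so that both :limit and
# :max_matches cover the widest result set Sphinx is allowed to return.
def facet_search_options(attribute, options = {})
  # Use the caller's custom max_matches if given, else the Sphinx default.
  depth = options[:max_matches] || SPHINX_DEFAULT_MAX_MATCHES
  options.merge(
    :group_by    => attribute.to_s,
    :limit       => depth,
    :max_matches => depth
  )
end

default_opts = facet_search_options(:has_menu)
custom_opts  = facet_search_options(:total_likes, :max_matches => 5000)
```

With no custom value, both settings fall back to 1000; with a custom 
max_matches, limit follows it so the two never disagree.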

-- 
Pat

On 23/03/2011, at 10:55 AM, Viacheslav Dushin wrote:

> Hi, Pat
> 
> This indexer is a very simplified version of Thinking Sphinx.
> 
> The facets method is in:
> 
> mongoid-sphinx/lib/mongoid_sphinx/mongoid/sphinx.rb
> 
>       def facets(*args)
>         options = args.extract_options!
>         query = args.join(" ")
>         MongoidSphinx::FacetSearch.new(query,self, options).facets
>       end
> 
> Facet search creates an array of bundled searches (facet_search.rb).
> Each search in this array is grouped by a different facet attribute:
>     def search
>       return if class_name.facet_attributes.blank?
>       bundled_search = BundledSearch.new
>       class_name.facet_attributes.each do |attribute|
>         bundled_search.search(query, class_name, facet_search_options(attribute))
>       end
>       bundled_search
>     end
> 
> After that, all results are mapped into a facet hash:
> 
>     def facets
>       self.search.results
>       res = {}
>       self.search.results.each_with_index do |result, index|
>         attr_name = class_name.facet_attributes[index].to_s
>         res[attr_name] = result[:matches].map{|o| [o[:attributes][attr_name], o[:attributes]["@count"]]}.to_hash
>       end
>       res
>     end
> 
> 
> This code is based on Thinking Sphinx.
> 
> See the full code in the attachment.
> 
> 
> I also noticed that group_by in Thinking Sphinx returns incorrect results too:
> Restaurant.search("pizza", :group_by=> :has_menu) -- returns only one result.
> Debugging showed that:
> result[:matches].length == 1
> But when I run 
> Restaurant.facets("pizza") 
> debugging shows that
> results[0][:has_menu].length == 2
> which is correct.
> 
> As far as I understand, Thinking Sphinx uses the group_by parameter to 
> calculate facets. So why are these results different?
> 
> 
> Thanks
> 
> 2011/3/23 Pat Allan <[email protected]>
> Hi Viacheslav
> 
> Can you run me through the code you're using to make these queries? It does 
> seem like something's wrong, but I need a bit more context.
> 
> --
> Pat
> 
> On 23/03/2011, at 2:38 AM, Viacheslav Dushin wrote:
> 
> > Hello,
> >
> > I'm using the latest version of Riddle from GitHub, Sphinx 0.9.9-release
> > (r2117) and xmlpipe2 as the data source for Sphinx.
> > I use group by to implement facets (similar to Thinking Sphinx, but
> > for an xmlpipe2 data source).
> > There is a problem: group by works incorrectly for "int", "bool" and
> > "multi" attributes, but it works OK for float attributes.
> > Here is an example of the output:
> >
> > grouping by has_menu -- bool:
> >
> >>> MongoRestaurant.facets("")[0]
> > => {:status=>0, :total_found=>1, :attribute_names=>["total_likes",
> > "neighborhood_ids", "lon", "has_delivery", "offer_type_ids",
> > "has_reservation", "cuisine_ids", "lat", "source_ids", "name_sort",
> > "address_zipcode", "has_menu", "offer_ids", "restaurant_ids",
> > "price_range", "score", "city_ids", "total_checkins",
> > "reviews_avg_score", "bh_id", "@groupby",
> > "@count"], :attributes=>{"offer_type_ids"=>1073741825, "lon"=>5,
> > "offer_ids"=>1073741825, "@groupby"=>1, "restaurant_ids"=>1,
> > "source_ids"=>1073741825, "reviews_avg_score"=>5, "total_checkins"=>1,
> > "total_likes"=>1, "@count"=>1, "address_zipcode"=>1,
> > "has_delivery"=>4, "city_ids"=>1, "has_menu"=>4,
> > "cuisine_ids"=>1073741825, "neighborhood_ids"=>1073741825, "bh_id"=>1,
> > "score"=>5, "price_range"=>3, "name_sort"=>3, "lat"=>5,
> > "has_reservation"=>4}, :words=>{"400456007"=>{:docs=>70, :hits=>70}}, 
> > :time=>0.0, :fields=>["classnamecrc32",
> > "name", "description", "offer_text", "offer_type_value",
> > "cuisine_name"], :matches=>[{:doc=>598000, 
> > :attributes=>{"offer_type_ids"=>[],
> > "lon"=>-1.29096734523773, "offer_ids"=>[], "@groupby"=>19700101,
> > "restaurant_ids"=>598000, "source_ids"=>[], "reviews_avg_score"=>0.0,
> > "total_checkins"=>0, "total_likes"=>0, "@count"=>70,
> > "address_zipcode"=>10022, "has_delivery"=>1, "city_ids"=>18819,
> > "has_menu"=>1, "cuisine_ids"=>[], "neighborhood_ids"=>[16, 22, 56,
> > 60], "bh_id"=>598000, "score"=>0.0, "price_range"=>69,
> > "name_sort"=>68, "lat"=>0.711285769939423,
> > "has_reservation"=>0}, :index=>0, :weight=>1273}], :total=>1}
> >
> > grouping by total_likes -- integer:
> >
> > => {:status=>0, :total_found=>1, :attribute_names=>["total_likes",
> > "neighborhood_ids", "lon", "has_delivery", "offer_type_ids",
> > "has_reservation", "cuisine_ids", "lat", "source_ids", "name_sort",
> > "address_zipcode", "has_menu", "offer_ids", "restaurant_ids",
> > "price_range", "score", "city_ids", "total_checkins",
> > "reviews_avg_score", "bh_id", "@groupby",
> > "@count"], :attributes=>{"offer_type_ids"=>1073741825, "lon"=>5,
> > "offer_ids"=>1073741825, "@groupby"=>1, "restaurant_ids"=>1,
> > "source_ids"=>1073741825, "reviews_avg_score"=>5, "total_checkins"=>1,
> > "total_likes"=>1, "@count"=>1, "address_zipcode"=>1,
> > "has_delivery"=>4, "city_ids"=>1, "has_menu"=>4,
> > "cuisine_ids"=>1073741825, "neighborhood_ids"=>1073741825, "bh_id"=>1,
> > "score"=>5, "price_range"=>3, "name_sort"=>3, "lat"=>5,
> > "has_reservation"=>4}, :words=>{"400456007"=>{:docs=>70, :hits=>70}}, 
> > :time=>0.001, :fields=>["classnamecrc32",
> > "name", "description", "offer_text", "offer_type_value",
> > "cuisine_name"], :matches=>[{:doc=>598000, 
> > :attributes=>{"offer_type_ids"=>[],
> > "lon"=>-1.29096734523773, "offer_ids"=>[], "@groupby"=>19700101,
> > "restaurant_ids"=>598000, "source_ids"=>[], "reviews_avg_score"=>0.0,
> > "total_checkins"=>0, "total_likes"=>0, "@count"=>70,
> > "address_zipcode"=>10022, "has_delivery"=>1, "city_ids"=>18819,
> > "has_menu"=>1, "cuisine_ids"=>[], "neighborhood_ids"=>[16, 22, 56,
> > 60], "bh_id"=>598000, "score"=>0.0, "price_range"=>69,
> > "name_sort"=>68, "lat"=>0.711285769939423,
> > "has_reservation"=>0}, :index=>0, :weight=>1273}], :total=>1}
> >
> >
> > grouping by reviews_avg_score -- float
> >
> > => {:status=>0, :total_found=>9, :attribute_names=>["total_likes",
> > "neighborhood_ids", "lon", "has_delivery", "offer_type_ids",
> > "has_reservation", "cuisine_ids", "lat", "source_ids", "name_sort",
> > "address_zipcode", "has_menu", "offer_ids", "restaurant_ids",
> > "price_range", "score", "city_ids", "total_checkins",
> > "reviews_avg_score", "bh_id", "@groupby",
> > "@count"], :attributes=>{"offer_type_ids"=>1073741825, "lon"=>5,
> > "offer_ids"=>1073741825, "@groupby"=>1, "restaurant_ids"=>1,
> > "source_ids"=>1073741825, "reviews_avg_score"=>5, "total_checkins"=>1,
> > "total_likes"=>1, "@count"=>1, "address_zipcode"=>1,
> > "has_delivery"=>4, "city_ids"=>1, "has_menu"=>4,
> > "cuisine_ids"=>1073741825, "neighborhood_ids"=>1073741825, "bh_id"=>1,
> > "score"=>5, "price_range"=>3, "name_sort"=>3, "lat"=>5,
> > "has_reservation"=>4}, :words=>{"400456007"=>{:docs=>70, :hits=>70}}, 
> > :time=>0.001, :fields=>["classnamecrc32",
> > "name", "description", "offer_text", "offer_type_value",
> > "cuisine_name"], :matches=>[{:doc=>598261, 
> > :attributes=>{"offer_type_ids"=>[],
> > "lon"=>-1.29073655605316, "offer_ids"=>[], "@groupby"=>20040816,
> > "restaurant_ids"=>598261, "source_ids"=>[], "reviews_avg_score"=>10.0,
> > "total_checkins"=>0, "total_likes"=>0, "@count"=>5,
> > "address_zipcode"=>10028, "has_delivery"=>1, "city_ids"=>18819,
> > "has_menu"=>1, "cuisine_ids"=>[21], "neighborhood_ids"=>[8, 22, 23,
> > 57], "bh_id"=>598261, "score"=>0.0, "price_range"=>69,
> > "name_sort"=>64, "lat"=>0.711645185947418,
> > "has_reservation"=>0}, :index=>0, :weight=>1273},
> > {:doc=>598904, :attributes=>{"offer_type_ids"=>[11],
> > "lon"=>-1.29076039791107, "offer_ids"=>[622122], "@groupby"=>20040804,
> > "restaurant_ids"=>598904, "source_ids"=>[15],
> > "reviews_avg_score"=>9.0, "total_checkins"=>13, "total_likes"=>1,
> > "@count"=>3, "address_zipcode"=>10021, "has_delivery"=>1,
> > "city_ids"=>18819, "has_menu"=>0, "cuisine_ids"=>[],
> > "neighborhood_ids"=>[22, 23, 28, 57], "bh_id"=>598904,
> > "score"=>2.4300000667572, "price_range"=>69, "name_sort"=>26,
> > "lat"=>0.711563467979431,
> > "has_reservation"=>0}, :index=>1, :weight=>1273},
> > {:doc=>598488, :attributes=>{"offer_type_ids"=>[], "lon"=>0.0,
> > "offer_ids"=>[], "@groupby"=>20040722, "restaurant_ids"=>598488,
> > "source_ids"=>[], "reviews_avg_score"=>8.0, "total_checkins"=>0,
> > "total_likes"=>0, "@count"=>3, "address_zipcode"=>10012,
> > "has_delivery"=>1, "city_ids"=>18819, "has_menu"=>1,
> > "cuisine_ids"=>[33], "neighborhood_ids"=>[2, 9, 22, 61],
> > "bh_id"=>598488, "score"=>0.0, "price_range"=>69, "name_sort"=>37,
> > "lat"=>0.0, "has_reservation"=>0}, :index=>2, :weight=>1273},
> > {:doc=>599149, :attributes=>{"offer_type_ids"=>[],
> > "lon"=>-1.29116952419281, "offer_ids"=>[], "@groupby"=>20040628,
> > "restaurant_ids"=>599149, "source_ids"=>[], "reviews_avg_score"=>7.0,
> > "total_checkins"=>22, "total_likes"=>0, "@count"=>2,
> > "address_zipcode"=>10017, "has_delivery"=>1, "city_ids"=>18819,
> > "has_menu"=>0, "cuisine_ids"=>[28], "neighborhood_ids"=>[6, 22, 60],
> > "bh_id"=>599149, "score"=>0.0, "price_range"=>69, "name_sort"=>45,
> > "lat"=>0.711319506168365,
> > "has_reservation"=>0}, :index=>3, :weight=>1273},
> > {:doc=>598304, :attributes=>{"offer_type_ids"=>[],
> > "lon"=>-1.52945172786713, "offer_ids"=>[], "@groupby"=>20040604,
> > "restaurant_ids"=>598304, "source_ids"=>[], "reviews_avg_score"=>6.0,
> > "total_checkins"=>115, "total_likes"=>0, "@count"=>5,
> > "address_zipcode"=>60604, "has_delivery"=>1, "city_ids"=>6335,
> > "has_menu"=>1, "cuisine_ids"=>[], "neighborhood_ids"=>[240],
> > "bh_id"=>598304, "score"=>0.0, "price_range"=>69, "name_sort"=>0,
> > "lat"=>0.73090934753418,
> > "has_reservation"=>0}, :index=>4, :weight=>1273},
> > {:doc=>598791, :attributes=>{"offer_type_ids"=>[],
> > "lon"=>-1.29123413562775, "offer_ids"=>[], "@groupby"=>20040511,
> > "restaurant_ids"=>598791, "source_ids"=>[], "reviews_avg_score"=>5.0,
> > "total_checkins"=>12, "total_likes"=>6, "@count"=>1,
> > "address_zipcode"=>10018, "has_delivery"=>1, "city_ids"=>18819,
> > "has_menu"=>1, "cuisine_ids"=>[2], "neighborhood_ids"=>[7, 22, 60],
> > "bh_id"=>598791, "score"=>0.0, "price_range"=>70, "name_sort"=>7,
> > "lat"=>0.711277902126312,
> > "has_reservation"=>0}, :index=>5, :weight=>1273},
> > {:doc=>598474, :attributes=>{"offer_type_ids"=>[],
> > "lon"=>-1.52936661243439, "offer_ids"=>[], "@groupby"=>20040416,
> > "restaurant_ids"=>598474, "source_ids"=>[], "reviews_avg_score"=>4.0,
> > "total_checkins"=>925, "total_likes"=>5, "@count"=>1,
> > "address_zipcode"=>60611, "has_delivery"=>1, "city_ids"=>6335,
> > "has_menu"=>1, "cuisine_ids"=>[], "neighborhood_ids"=>[217, 287],
> > "bh_id"=>598474, "score"=>0.0, "price_range"=>71, "name_sort"=>10,
> > "lat"=>0.731164395809174,
> > "has_reservation"=>0}, :index=>6, :weight=>1273},
> > {:doc=>598689, :attributes=>{"offer_type_ids"=>[],
> > "lon"=>-1.29161155223846, "offer_ids"=>[], "@groupby"=>20040110,
> > "restaurant_ids"=>598689, "source_ids"=>[], "reviews_avg_score"=>2.0,
> > "total_checkins"=>2, "total_likes"=>0, "@count"=>2,
> > "address_zipcode"=>10007, "has_delivery"=>1, "city_ids"=>18819,
> > "has_menu"=>1, "cuisine_ids"=>[], "neighborhood_ids"=>[2, 17, 22, 65],
> > "bh_id"=>598689, "score"=>0.0, "price_range"=>69, "name_sort"=>15,
> > "lat"=>0.71059387922287,
> > "has_reservation"=>0}, :index=>7, :weight=>1273},
> > {:doc=>598000, :attributes=>{"offer_type_ids"=>[],
> > "lon"=>-1.29096734523773, "offer_ids"=>[], "@groupby"=>19700101,
> > "restaurant_ids"=>598000, "source_ids"=>[], "reviews_avg_score"=>0.0,
> > "total_checkins"=>0, "total_likes"=>0, "@count"=>48,
> > "address_zipcode"=>10022, "has_delivery"=>1, "city_ids"=>18819,
> > "has_menu"=>1, "cuisine_ids"=>[], "neighborhood_ids"=>[16, 22, 56,
> > 60], "bh_id"=>598000, "score"=>0.0, "price_range"=>69,
> > "name_sort"=>68, "lat"=>0.711285769939423,
> > "has_reservation"=>0}, :index=>8, :weight=>1273}], :total=>9}
> >
> >
> >
> > As you can see, grouping by :has_menu and :total_likes
> > returns only one result (:total_found=>1). That is incorrect: there are
> > records with :has_menu == false, total_likes = 1, total_likes = 2, etc.
> > Only grouping by reviews_avg_score returns correct results.
> >
> >
> > Example of the XML data source:
> > <?xml version="1.0" encoding="utf-8"?>
> > <sphinx:docset>
> > <sphinx:schema>
> > <sphinx:field name="classnamecrc32"/>
> > <sphinx:field name="name"/>
> > <sphinx:field name="description"/>
> > <sphinx:field name="offer_text"/>
> > <sphinx:field name="offer_type_value"/>
> > <sphinx:field name="cuisine_name"/>
> > <sphinx:attr name="address_zipcode" type="int"/>
> > <sphinx:attr name="restaurant_ids" type="int"/>
> > <sphinx:attr name="lat" type="float"/>
> > <sphinx:attr name="has_delivery" type="bool"/>
> > <sphinx:attr name="source_ids" type="multi"/>
> > <sphinx:attr name="lon" type="float"/>
> > <sphinx:attr name="has_reservation" type="bool"/>
> > <sphinx:attr name="offer_type_ids" type="multi"/>
> > <sphinx:attr name="price_range" type="str2ordinal"/>
> > <sphinx:attr name="has_menu" type="bool"/>
> > <sphinx:attr name="score" type="float"/>
> > <sphinx:attr name="neighborhood_ids" type="multi"/>
> > <sphinx:attr name="cuisine_ids" type="multi"/>
> > <sphinx:attr name="total_checkins" type="int"/>
> > <sphinx:attr name="offer_ids" type="multi"/>
> > <sphinx:attr name="reviews_avg_score" type="float"/>
> > <sphinx:attr name="city_ids" type="int"/>
> > <sphinx:attr name="name_sort" type="str2ordinal"/>
> > <sphinx:attr name="total_likes" type="int"/>
> > <sphinx:attr name="bh_id" type="int"/>
> > </sphinx:schema>
> > <sphinx:document id="599105">
> > <classnamecrc32>400456007</classnamecrc32>
> > <name><![CDATA[Subway]]></name>
> > <description><![CDATA[test]]></description>
> > <offer_text><![CDATA[]]></offer_text>
> > <offer_type_value><![CDATA[]]></offer_type_value>
> > <cuisine_name><![CDATA[]]></cuisine_name>
> > <address_zipcode>60622</address_zipcode>
> > <restaurant_ids>599105</restaurant_ids>
> > <lat>0.731224661851994</lat>
> > <has_delivery>1</has_delivery>
> > <source_ids></source_ids>
> > <lon>-1.53024979754365</lon>
> > <has_reservation>0</has_reservation>
> > <offer_type_ids></offer_type_ids>
> > <price_range>1</price_range>
> > <has_menu>1</has_menu>
> > <score>0.0</score>
> > <neighborhood_ids>201,202,284</neighborhood_ids>
> > <cuisine_ids></cuisine_ids>
> > <total_checkins>5</total_checkins>
> > <offer_ids></offer_ids>
> > <reviews_avg_score>0</reviews_avg_score>
> > <city_ids>6335</city_ids>
> > <name_sort>Subway</name_sort>
> > <total_likes>0</total_likes>
> > <bh_id>599105</bh_id>
> > </sphinx:document>
> > </sphinx:docset>
> >
> >
> > Thanks, Slava
> >
> > --
> > You received this message because you are subscribed to the Google Groups 
> > "Thinking Sphinx" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to 
> > [email protected].
> > For more options, visit this group at 
> > http://groups.google.com/group/thinking-sphinx?hl=en.
> >
> 
> <mongoid-sphinx.zip>
