On 9/20/06, Clare <[EMAIL PROTECTED]> wrote:
> Hi i'm using ferret to enable geographical postcode. I take a postcode
> and distance in miles from the user, strip off the outcode and then
> retrieve the associated x y coordinates in metres from the db. Then i
> get two temp x's and y's and search for all results that are within the
> box, see code below.
>
> Problems start to occur when i search on big distances so for example
>
> 40 miles from "G1"
> VoObject.ferret_index.search(" x:[206826 335573] AND y:[590526
> 719273]").total_hits
> => 165
>
>
> 300 miles
> VoObject.ferret_index.search("y:[172098 1137702]").total_hits
> Ferret::QueryParser::QueryParseException: Error occured in q_range.c:121
> - range_new
>         Upper bound must be greater than lower bound. "1137702" <
> "172098"
>
>         from
> /usr/lib/ruby/gems/1.8/gems/ferret-0.10.1/lib/ferret/index.rb:572:in
> `parse'
>         from
> /usr/lib/ruby/gems/1.8/gems/ferret-0.10.1/lib/ferret/index.rb:572:in
> `process_query'
>         from
> /usr/lib/ruby/gems/1.8/gems/ferret-0.10.1/lib/ferret/index.rb:560:in
> `do_search'
>         from
> /usr/lib/ruby/gems/1.8/gems/ferret-0.10.1/lib/ferret/index.rb:233:in
> `search'
>         from /usr/lib/ruby/1.8/monitor.rb:229:in `synchronize'
>         from
> /usr/lib/ruby/gems/1.8/gems/ferret-0.10.1/lib/ferret/index.rb:232:in
> `search'
>         from (irb):16
>
>
> So what am i doing wrong? How have other people used ferret for
> geographical searches? Is there another way that i can define the range
> so that it works properly?
>
> because I'm also getting other crazy and just plain wrong results
>
> VoObject.ferret_index.search("y:[0 9]").total_hits
> => 167
>
> that's telling me that all the test data is within 8 metres of the
> origin...
>
> thanks in advance.
> clare
>
>
> if their_outcode && their_outcode.size > 0
>         temp_hwz = HwzPostcode.find(:first, :conditions => ['outcode =
> ?',their_outcode])
>         range_x_left     = temp_hwz.x - (postcode_distance.to_f*1.60934 * 
> 1000)
>         range_x_right    = temp_hwz.x + (postcode_distance.to_f*1.60934 * 
> 1000)
>         range_y_top      = temp_hwz.y + (postcode_distance.to_f*1.60934 * 
> 1000)
>         range_y_bottom   = temp_hwz.y - (postcode_distance.to_f*1.60934 * 
> 1000)
>
>   query += " AND x:[#{range_x_left.to_i} #{range_x_right.to_i}] AND
> y:[#{range_y_bottom.to_i} #{range_y_top.to_i}]"
> end

Hi Clare,

Ranges are calculated according to lexical ordering, not numerical
ordering. Try this:

    puts ["0", "9", "167"].sort

You'll see that "167" does indeed fall between "0" and "9". Now try this:

    puts ["000", "009", "167"].sort
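
The same ordering explains the exception on your 300-mile query. A
minimal sketch in plain Ruby, using the two bounds from your trace
(the variable names are just for illustration):

```ruby
lower = "172098"
upper = "1137702"

# Compared as strings, the second character decides: "7" sorts after "1",
# so the "upper" bound sorts before the "lower" one.
lexical_ok = (lower <= upper)            # => false

# Compared as integers, the bounds are in the expected order.
numeric_ok = (lower.to_i <= upper.to_i)  # => true

puts "lexical: #{lexical_ok}, numeric: #{numeric_ok}"
```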

So that should explain what you have to do. You need to pad all
numbers to a fixed width. Alternatively you could build a custom
IntegerRangeFilter and combine it with a ConstantScoreQuery. Here is
an example for Floats:

    require 'rubygems'
    require 'ferret'

    class FloatRangeFilter
      attr_accessor :field, :upper, :lower, :upper_op, :lower_op

      def initialize(field, options)
        @field = field
        @upper = options[:<] || options[:<=]
        @lower = options[:>] || options[:>=]
        if @upper.nil? and @lower.nil?
          raise ArgumentError, "Must specify a bound"
        end
        @upper_op = options[:<].nil? ? :<= : :<
        @lower_op = options[:>].nil? ? :>= : :>
      end

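      # Returns a bit vector with one bit set for each document whose
      # @field term, parsed as a Float, satisfies both bounds.
      def bits(index_reader)
        bit_vector = Ferret::Utils::BitVector.new
        term_doc_enum = index_reader.term_docs
        # Walk every term indexed under @field and compare it numerically.
        index_reader.terms(@field).each do |term, freq|
          float = term.to_f
          next if @upper and not float.send(@upper_op, @upper)
          next if @lower and not float.send(@lower_op, @lower)
          # Term is in range: mark every document that contains it.
          term_doc_enum.seek(@field, term)
          term_doc_enum.each {|doc_id, freq| bit_vector.set(doc_id)}
        end
        return bit_vector
      end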

      def hash
        return @field.hash ^ @upper.hash ^ @lower.hash ^
               @upper_op.hash ^ @lower_op.hash
      end

      def eql?(o)
        return (o.instance_of?(FloatRangeFilter) and @field == o.field and
                @upper == o.upper and @lower == o.lower and
                @upper_op == o.upper_op and @lower_op == o.lower_op)
      end
    end

You'll have to work out what is going on here yourself though. I have
no time for explanation. Note that this won't perform very well
compared to the padded field version because so much is going on in
the Ruby code. I could possibly be persuaded to implement this in C.
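
For the padded field version, here is a rough sketch of what I mean.
The helper names (pad_coord, padded_range) and the width of 7 digits
are my assumptions, not anything Ferret provides; pick a width that
covers your largest coordinate. It also assumes your stored coordinates
are never negative, so a negative lower bound is simply clamped to 0.

```ruby
# Assumed width: 7 digits covers values up to 9_999_999 metres.
WIDTH = 7

# Hypothetical helper: zero-pad a coordinate so that string order
# matches numeric order. Negative values are clamped to 0 (assumption:
# no stored coordinate is negative).
def pad_coord(value, width = WIDTH)
  n = [value.to_i, 0].max
  s = n.to_s
  raise ArgumentError, "#{n} does not fit in #{width} digits" if s.length > width
  s.rjust(width, "0")
end

# Hypothetical helper: build an inclusive range clause for the query string.
def padded_range(field, lower, upper)
  "#{field}:[#{pad_coord(lower)} #{pad_coord(upper)}]"
end

query = padded_range("y", 172098, 1137702)
puts query
```

The x and y values have to be padded the same way when the documents
are indexed, otherwise the padded bounds will not line up with the
stored terms.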

Cheers,
Dave
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk
