If it were just a couple of colors, you could have a separate field
for each color and then index the percent in that field.

black:70
grey:20

and then you could use a function query to influence the score (or you
could sort by the color percent).
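
As a rough sketch of the per-color-field idea (plain Python, not Solr code; the `color_` field prefix and the sample products are made up for illustration), each product becomes a document with one numeric field per color, and sorting on that field puts the products with the most of that color first:

```python
def color_fields(percentages):
    """Turn {"black": 70, "grey": 20} into one numeric field per color."""
    # The "color_" prefix is a hypothetical naming scheme, not a Solr convention.
    return {f"color_{name}": pct for name, pct in percentages.items()}


def sort_by_color(docs, color):
    """Keep docs containing the color, highest percentage first."""
    field = f"color_{color}"
    matching = [d for d in docs if d.get(field, 0) > 0]
    return sorted(matching, key=lambda d: d[field], reverse=True)


# Hypothetical sample data.
docs = [
    {"id": "dress", **color_fields({"black": 70, "grey": 20, "brown": 10})},
    {"id": "scarf", **color_fields({"black": 30, "red": 70})},
    {"id": "shirt", **color_fields({"white": 100})},
]

ranked_ids = [d["id"] for d in sort_by_color(docs, "black")]
```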

However, this doesn't scale well to a large index with a large number of
colors. Each field used that way takes up 4 bytes of memory per document
in the index.

So if you have 1M documents and 100 colors, that's
1M docs * 100 colors * 4 bytes = 400MB.
That's doable depending on your index size (use the "int" or "float" type
for this, not "sint" or "sfloat"... it will use less memory).
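
The estimate is easy to redo for other index sizes. A back-of-the-envelope sketch (the function name is just for illustration, and the 4 bytes per value is the figure given above):

```python
def per_color_field_memory_bytes(num_docs, num_colors, bytes_per_value=4):
    """Memory for one numeric value per document per color field."""
    return num_docs * num_colors * bytes_per_value


estimate = per_color_field_memory_bytes(1_000_000, 100)
print(f"{estimate / 1_000_000:.0f} MB")  # decimal megabytes
```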

If you need to use less memory than that, you could encode all of the
colors into a single value (perhaps a compact string... one
percentage per byte or something) and then have a custom function that
extracts the value for a particular color. (This involves some Java
development.)
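
To illustrate the encoding only (in Python; the actual Solr function would still have to be written in Java, and the palette order and helper names here are assumptions), one byte is enough to hold a percentage from 0 to 100, so each color gets a fixed byte position in the packed value:

```python
# Hypothetical fixed palette; a color's position is its byte offset.
PALETTE = ["black", "grey", "brown", "red", "white"]
INDEX = {name: i for i, name in enumerate(PALETTE)}


def encode_colors(percentages):
    """Pack {"black": 70, ...} into len(PALETTE) bytes, one per color."""
    packed = bytearray(len(PALETTE))  # colors not present stay at 0
    for name, pct in percentages.items():
        if not 0 <= pct <= 100:
            raise ValueError(f"percentage out of range for {name}: {pct}")
        packed[INDEX[name]] = pct
    return bytes(packed)


def color_percent(packed, color):
    """Extract the percentage stored for a single color."""
    return packed[INDEX[color]]


encoded = encode_colors({"black": 70, "grey": 20, "brown": 10})
```

With a full 100-color palette this comes to 100 bytes per document instead of 400, at the cost of the custom extraction step.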

-Yonik


On 9/28/07, Guangwei Yuan <[EMAIL PROTECTED]> wrote:
> Hi,
>
> We're running an e-commerce site that provides product search. We've been
> able to extract colors from product images, and we think it'd be cool and
> useful to search products by color. A product image can have up to 5 colors
> (from a color space of about 100 colors), so we can implement it easily with
> Solr's facet search (thanks all who've developed Solr).
>
> The problem arises when we try to sort the results by the color relevancy.
> What's different from a normal facet search is that colors are weighted. For
> example, a black dress can have 70% of black, 20% of gray, 10% of brown. A
> search query "color:black" should return results in which the black dress
> ranks higher than other products with less percentage of black.
>
> My question is: how to configure and index the color field so that products
> with higher percentage of color X ranks higher for query "color:X"?
>
> Thanks for your help!
>
> - Guangwei
>
