Hi,

I'd also go with (2), but using dynamic fields, so that instead of defining 
every storeX_price field in your schema you define just one *_price dynamic 
field.  Then when you filter on store:store1 you'd know to sort by 
store1_price, and likewise for units.  That should be pretty straightforward.
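
A rough sketch of the client-side logic, not a definitive implementation. The 
`store` filter field, the `_units` suffix, the "float" field type and the 
`build_params` helper are assumptions made up for illustration; only 
store1_price and the *_price pattern come from the discussion above.

```python
from urllib.parse import urlencode

# Assumed per-store suffixes. Each would be matched by a single dynamicField
# rule in schema.xml, for example (field type name "float" is an assumption):
#   <dynamicField name="*_price" type="float" indexed="true" stored="true"/>
STORE_FIELD_SUFFIXES = {"price": "_price", "units": "_units"}


def build_params(query, store, sort_by="price", order="asc"):
    """Build Solr request parameters that filter on one store and sort by
    that store's own field (hypothetical helper)."""
    if sort_by not in STORE_FIELD_SUFFIXES:
        raise ValueError("unknown store-dependent field: %s" % sort_by)
    if order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    return {
        "q": query,
        # Restrict results to products carried by the chosen store.
        "fq": "store:%s" % store,
        # Derive the sort field name from the store id, e.g. store1_price.
        "sort": "%s%s %s" % (store, STORE_FIELD_SUFFIXES[sort_by], order),
    }


# Query string that could be appended to a /select request.
print(urlencode(build_params("name:shoes", "store1")))
```

Adding a new store then needs no schema change, and a new store-dependent 
field needs only one more dynamic field rule plus one entry in the mapping.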

Hope that helps,
Robi

-----Original Message-----
From: Alejandro Marqués Rodríguez [mailto:amarq...@paradigmatecnologico.com] 
Sent: Thursday, November 21, 2013 1:36 AM
To: solr-user@lucene.apache.org
Subject: Best implementation for multi-price store?

Hi,

I've recently been asked to implement an application to search products from 
several stores, each store having different prices and stock for the same 
product.

So I have products that have the usual fields (name, description, brand,
etc.) and also the number of units and the price for each store. I must be 
able to filter by a given store and sort by stock or price for that store. The 
application should also allow increasing the number of stores, the number of 
store-dependent fields and the number of products without much work.

The numbers for the application are more or less 100 stores and 7M products.

I've been thinking of some ways of defining the index structure but I don't 
know which one is better, as I think each one has its pros and cons.


   1. *Each product-store as a document:* Denormalizing the information so
   for every product and store I have a different document. Pros are that I
   can filter and sort without problems and that adding a new store-dependent
   field is very easy. Cons are that the index goes from 7M documents to 700M
   and that most of the info is redundant, as most of the fields are repeated
   among stores.
   2. *Each field-store as a field:* For example for price I would have
   "store1_price, store2_price, ...". Pros are that the index stays at 7M
   documents, and I can still filter and sort by those fields. Cons are that I
   have to add some logic so that if I filter by one store I sort by the
   associated price field, and that the number of fields grows as the number
   of store-dependent fields x the number of stores. I don't know if having
   more fields affects performance, but adding new store-dependent fields will
   increase the number of fields even more.
   3. *Join:* The first time I read about Solr joins I thought it was the way
   to go in this case, but after reading a bit more and doing some tests I'm
   not so sure about it... Maybe I've done it wrong, but I think it also
   denormalizes the info (so I would also have 700M documents), and besides I
   can't sort or filter by store fields.
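
To put rough numbers on options 1 and 2, here is a back-of-the-envelope 
calculation. It assumes only two store-dependent fields for now (price and 
units); the variable names are just for illustration.

```python
stores = 100                 # approximate number of stores
products = 7_000_000         # approximate number of products
store_dependent_fields = 2   # assumption: price and units only

# Option 1: one document per product-store pair.
docs_option1 = stores * products

# Option 2: one document per product, one field per (store, field) pair.
docs_option2 = products
per_store_fields_option2 = stores * store_dependent_fields

print(docs_option1, docs_option2, per_store_fields_option2)
```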


I must say my preferred option is number 2, so I don't duplicate information, I 
keep a relatively small number of documents and I can filter and sort by the 
store fields. However, my main concern here is I don't know if having too many 
fields in a document will be harmful to performance.

Which one do you think is the best approach for this application? Is there a 
better approach that I have missed?

Thanks in advance



--
Alejandro Marqués Rodríguez

Paradigma Tecnológico
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42
