On Thu, Oct 27, 2011 at 7:13 AM, Anatoli Matuskova < anatoli.matusk...@gmail.com> wrote:
> I don't like the idea of indexing a doc per each value, the dataset can
> grow a lot.

What does a lot mean? How high is the sky?

A million people with 3 year schedules is a billion tiny documents. That
doesn't sound like such an enormous number.

> I have thought that something like this could work:
> At indexing time, if I know the dates of no availability, I could gather
> the availability ones (will consider unknown as available). So, I index 4
> fields aval_yes_start, aval_yes_end, aval_no_start, aval_no_end (all are
> multiValued)
> If the user asks for availability from $start to $end I filter like:
>
> fq=aval_yes_start:[$start TO $end]&fq=aval_yes_end:[$start TO $end]&fq=-aval_no_start:[$start TO $end]&fq=-aval_no_end:[$start TO $end]

This can be done. But given that you want long stretches of availability,
what happens when a reservation is canceled? You have to coalesce
intervals. That isn't impossible, but it is a pain.

Would this count as premature optimization?

Simply retrieving the days in the range and counting them gets the right
answer with less fuss. Additions, deletions, and modifications all work.

If you want to drive down to a resolution of seconds, the model of one
document per time slot doesn't work. But for days, it probably does.
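
To make the day-per-document idea concrete, here is a rough sketch in plain Python rather than Solr. It assumes one tiny document per (person, booked day) and treats unknown days as available, as you suggested. The set `booked` and the functions `reserve`, `cancel`, `booked_count` and `is_available` are names made up for illustration, not anything in Solr.

```python
from datetime import date, timedelta

# Hypothetical in-memory stand-in for the index: one tiny "document"
# per (person, booked day). Days with no document count as available.
booked = set()

def days(start, end):
    """Yield every date from start to end, inclusive."""
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)

def reserve(person, start, end):
    """Add one document per booked day."""
    for day in days(start, end):
        booked.add((person, day))

def cancel(person, start, end):
    """Delete the documents of a canceled reservation; nothing to coalesce."""
    for day in days(start, end):
        booked.discard((person, day))

def booked_count(person, start, end):
    """Count the booked-day documents that fall inside the requested range."""
    return sum(1 for day in days(start, end) if (person, day) in booked)

def is_available(person, start, end):
    """A person is free for the whole range when no booked day falls in it."""
    return booked_count(person, start, end) == 0
```

Canceling a reservation is just a delete of the affected day documents, so there are no intervals to merge or split. In Solr the count would presumably come from a range filter on a date field plus the person id, but that mapping is an assumption on my part.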