On 11/13/06, Bertrand Delacretaz <[EMAIL PROTECTED]> wrote:
I'm also envisioning using Solr to replace a database in some web apps

Yes, querying only the search collection rather than both the search
collection and the database can make a lot of sense: a less
complicated webapp, and you only need to make the search collection
HA.

- but how would you handle (or rather simulate) joins in such a case?

The usual approach is to denormalize the data.  The downside is a
slightly bigger search collection.

Say you have a Book which references an Author in a separate Solr
<document> - how do you suggest inserting the Author's data into each
Book like an SQL join would do?

Is it possible to make the collection book centric and put the
author's data into each book during indexing?
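
A minimal sketch of what book-centric denormalization could look like
on the indexing side. The record layout and field names (author_id,
author_name, author_country) are made up for illustration, and the
step that actually posts the documents to Solr is left out:

```python
# Hypothetical source records; in a real app these would come from the
# database or whatever feeds the indexer.
authors = {
    "a1": {"author_id": "a1", "author_name": "W. Churchill", "author_country": "UK"},
    "a2": {"author_id": "a2", "author_name": "J. Doe", "author_country": "US"},
}
books = [
    {"book_id": "b1", "title": "First Book", "author_id": "a1"},
    {"book_id": "b2", "title": "Second Book", "author_id": "a2"},
    {"book_id": "b3", "title": "Third Book", "author_id": "a1"},
]


def denormalize(book, authors):
    """Return a copy of the book document with its author's fields merged in."""
    doc = dict(book)
    author = authors.get(book["author_id"])
    if author is not None:
        for key, value in author.items():
            if key != "author_id":  # already present on the book
                doc[key] = value
    return doc


# One self-contained document per book, ready to be indexed.
docs = [denormalize(book, authors) for book in books]
```

Each book document then carries author_name itself, so a query on the
author's name needs no second lookup. The cost is that the author's
fields are repeated in every one of that author's books.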

Is it efficient to do a new Lucene query for each Book found, to get
the Author? I can imagine doing that in a loop, and Solr's caches
would probably help. But how does that feel from Lucene's point of
view?

It's doable.  The only advantage is decreased index size, but you give
up some query power and speed.
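
Roughly, the per-hit lookup loop could look like the sketch below. It
is only an in-memory stand-in: AUTHOR_INDEX plays the role of the
Author documents, and functools.lru_cache plays the role of a cache in
front of them. None of the names are Solr APIs.

```python
from functools import lru_cache

# Stand-in for the Author documents in the index (hypothetical data).
AUTHOR_INDEX = {
    "a1": {"author_id": "a1", "author_name": "W. Churchill"},
    "a2": {"author_id": "a2", "author_name": "J. Doe"},
}

lookups = 0  # counts how many lookups actually reach the "index"


@lru_cache(maxsize=1024)
def fetch_author(author_id):
    """Look up one author by ID; repeated IDs are served from the cache."""
    global lookups
    lookups += 1
    return AUTHOR_INDEX.get(author_id)


def attach_authors(book_hits):
    """For each book found, run a second lookup to get its author."""
    results = []
    for book in book_hits:
        merged = dict(book)
        author = fetch_author(book["author_id"])
        if author is not None:
            merged["author_name"] = author["author_name"]
        results.append(merged)
    return results


book_hits = [
    {"book_id": "b1", "author_id": "a1"},
    {"book_id": "b2", "author_id": "a2"},
    {"book_id": "b3", "author_id": "a1"},
]
enriched = attach_authors(book_hits)
```

With three hits and two distinct authors, only two lookups reach the
index; the third is a cache hit. It also shows the limitation: the
author fields can be displayed, but the original query could not have
filtered or sorted on them.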

This wouldn't be a full join, as there's probably no way to do a
single query like

  select * from Book,Author
  where Book.author_id = Author.author_id
  and Author.name like '%chill%'

DB-type joins would probably take a *lot* of work.

Another downside is that it would complicate federated or distributed
search in the future.  Joins go across documents and are thus not
easily distributed.

Being able to do this would be cool, but at this point I'm only
thinking of retrieving related info linked via IDs.

Trying to think of a URL friendly syntax for this that would work for
including fields from more than one other "table"... something like:
addFields=artist_name where artist_id:song_artist
addFields=album_name,album_date where album_id:song_album
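
To make the idea concrete, here is a sketch of how such a parameter
might be parsed and applied to a hit. It assumes one reading of the
syntax: "fields where X:Y" means "find the document whose field X
equals this hit's field Y, and copy the listed fields from it". The
function names and sample data are hypothetical; nothing like this
exists in Solr.

```python
def parse_add_fields(spec):
    """Split 'f1,f2 where match_field:source_field' into its three parts."""
    fields_part, sep, key_part = spec.partition(" where ")
    if not sep:
        raise ValueError("expected '<fields> where <match_field>:<source_field>'")
    match_field, colon, source_field = key_part.partition(":")
    if not colon:
        raise ValueError("expected '<match_field>:<source_field>' after 'where'")
    fields = [name.strip() for name in fields_part.split(",") if name.strip()]
    return fields, match_field.strip(), source_field.strip()


def add_fields(hit, spec, docs):
    """Copy the requested fields from the first related document onto the hit."""
    fields, match_field, source_field = parse_add_fields(spec)
    merged = dict(hit)
    wanted = hit.get(source_field)
    if wanted is None:
        return merged
    for doc in docs:
        if doc.get(match_field) == wanted:
            for name in fields:
                if name in doc:
                    merged[name] = doc[name]
            break
    return merged


# Hypothetical artist and album documents living in the same collection.
docs = [
    {"artist_id": "ar1", "artist_name": "Example Artist"},
    {"album_id": "al1", "album_name": "Example Album", "album_date": 1975},
]
song = {"song_title": "foo", "song_artist": "ar1", "song_album": "al1"}
for spec in (
    "artist_name where artist_id:song_artist",
    "album_name,album_date where album_id:song_album",
):
    song = add_fields(song, spec, docs)
```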

I'm still not sure if it's a good idea or not though... you give up
powerful queries like
+song_title:foo +album_date:[1970 TO 1980] -artist_name:bob

-Yonik
