the trouble I'm having is one of dimension.  an author has many, many
 attributes (name, birthdate, biography in $language, etc).  as does
each book (title in $language, summary in $language, genre, etc).  as
does each library (name, address, directions in $language, etc).  so
an author with N books doesn't seem to scale very well in the flat representations I'm finding in all the lucene/solr docs and
examples... at least not in some way I can wrap my head around.

OG: I'm not sure why the number of attributes worries you.  Imagine
it as a wide RDBMS table, if it helps.  Indices with dozens of fields
are not uncommon.
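For example, a single flat document can simply carry all of those attributes side by side.  A rough sketch (every field name, URL and handler path below is made up, and the exact update URL depends on your Solr version):

import json
import requests  # assumes the requests library and a Solr instance on localhost

# One "wide" flat document, analogous to a wide RDBMS row.  Every field
# name here is invented; a real schema.xml would declare each one (or a
# dynamicField such as *_en could cover the per-language variants).
author_doc = {
    "id": "author-123",
    "type": "author",
    "name": "Jane Doe",
    "birthdate": "1970-01-01T00:00:00Z",
    "biography_en": "An English-language biography ...",
    "biography_fr": "Une biographie en français ...",
    # ... dozens more fields are perfectly normal for a Lucene/Solr index
}

# Post it to Solr's JSON update handler (the exact path and parameters
# vary by Solr version; this is only a sketch, not a recipe).
requests.post(
    "http://localhost:8983/solr/update/json?commit=true",
    data=json.dumps([author_doc]),
    headers={"Content-Type": "application/json"},
)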

it's not necessarily the number of fields, it's the Attribute1 .. AttributeN-style numbering that worries me. but I think it's all starting to make sense now... my holdup seems to have been expecting to pull the data back in multiple queries.

OG: You certainly can do that.  I'm not sure I understand where the
hard part is.  You seem to know what attributes each entity has.
Maybe you are confused by how to handle N different types of entities
in a single index?

yes... or, more properly, how to relate them to each other.

I understand that the schema can hold tons of attributes that are unused in different documents. my question seems to be how to organize my data such that I can answer the question "how do I get a list of libraries with $book like $pattern" - where does the de-normalization typically occur? if a document fully represents "a book by an author in a library", such that the same book (with all its attributes) is in my index multiple times (once for each library), how do I drill down to showing just the directions to a specific library?
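to make that concrete, here's roughly the flattened document I have in mind and the queries I'd want to run against it (just a sketch -- the field names and URLs are all invented):

import requests  # assumes the requests library and a Solr instance on localhost

# One document per (book, library) pair -- the same book repeats once per
# library that holds it.  All field names are invented.
flattened_doc = {
    "id": "book-42@library-7",
    "book_title_en": "A Sample Title",
    "book_summary_en": "A summary in English ...",
    "author_name": "Jane Doe",
    "library_id": "library-7",
    "library_name": "Central Branch",
    "library_directions_en": "Take the 5 bus to Main St ...",
}

# "libraries with $book like $pattern": match on the book fields but only
# return the library fields, so each hit effectively *is* a library.
libraries = requests.get("http://localhost:8983/solr/select", params={
    "q": "book_title_en:dragon*",
    "fl": "library_id,library_name",
    "wt": "json",
}).json()["response"]["docs"]

# drilling down to one library's directions is then just a filter query
# on the library id, on top of whatever book query got me there.
directions = requests.get("http://localhost:8983/solr/select", params={
    "q": "book_title_en:dragon*",
    "fq": "library_id:library-7",
    "fl": "library_directions_en",
    "wt": "json",
}).json()["response"]["docs"]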

(I'm assuming a single index is what you currently
have in mind)

using different indices is what my lucene+compass counterparts are doing. I couldn't find an example of that in the solr docs (unless the answer is running multiple, distinct instances at the same time)
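if I understand what they're doing, translated to solr it would look something like two instances queried separately and joined by hand in application code. pure guesswork on my part -- the URLs, ports and field names below are all invented:

import requests  # assumes the requests library

# Two separate Solr instances (or indices): one for books, one for
# libraries, joined by hand in application code.
BOOKS = "http://localhost:8983/solr/select"
LIBRARIES = "http://localhost:8984/solr/select"

# 1. find the matching books
books = requests.get(BOOKS, params={
    "q": "title_en:dragon*",
    "fl": "id,library_ids",
    "wt": "json",
}).json()["response"]["docs"]

# 2. collect the library ids they point at
library_ids = {lid for book in books for lid in book.get("library_ids", [])}

# 3. look those libraries up in the other index
id_query = " OR ".join("id:%s" % lid for lid in sorted(library_ids)) or "id:none"
libraries = requests.get(LIBRARIES, params={
    "q": id_query,
    "fl": "id,name,directions_en",
    "wt": "json",
}).json()["response"]["docs"]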

eew :)  seriously, though, that's what we have now - all rdbms
driven. if, conceptually, solr could only handle the initial lookup,
there wouldn't be much point.

OG: Well, there might or might not be, depending on how much data you
have, how flexible and fast your RDBMS-powered (full-text?) search is,
and so on.  Lucene/Solr for full-text search + RDBMS/BDB for
display data is a common combination.
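In code that combination often boils down to something like the sketch below: ask Solr the full-text question, get back ids, then fetch the rows you actually display from the database (the URL, table and field names here are all placeholders):

import sqlite3
import requests  # assumes the requests library and a Solr instance on localhost

# Solr answers the full-text question and returns only primary keys; the
# RDBMS supplies everything needed to render the page.
hits = requests.get("http://localhost:8983/solr/select", params={
    "q": "summary_en:dragons",
    "fl": "id",
    "rows": 20,
    "wt": "json",
}).json()["response"]["docs"]

ids = [hit["id"] for hit in hits]

rows = []
if ids:
    conn = sqlite3.connect("catalog.db")
    placeholders = ",".join("?" * len(ids))
    rows = conn.execute(
        "SELECT id, title, author, library_name FROM books"
        " WHERE id IN (%s)" % placeholders,
        ids,
    ).fetchall()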

"the decision has been made to use lucene to replace all rdbms functionality for search"

*cough*

:)


maybe I'm thinking about this all wrong (as is to be expected :), but
I just can't believe that nobody is using solr to represent data a
bit more complex than the examples out there.

OG: Oh, lots of people are, it's just that examples are simple, so
people new to Solr, Lucene, etc. have an easier time learning.

:)

thanks for your help here.

--Geoff
