the trouble I'm having is one of dimension. an author has many, many
attributes (name, birthdate, biography in $language, etc). as does
each book (title in $language, summary in $language, genre, etc). as
does each library (name, address, directions in $language, etc). so
an author with N books doesn't seem to scale very well in the flat
representations I'm finding in all the lucene/solr docs and
examples... at least not in some way I can wrap my head around.
OG: I'm not sure why the number of attributes worries you. Imagine
it as a wide RDBMS table, if it helps. Indices with dozens of fields
are not uncommon.
it's not necessarily the number of fields, it's the Attribute1 ..
AttributeN-style numbering that worries me. but I think it's all
starting to make sense now... if wanting to pull data in multiple
queries was my holdup.
OG: You certainly can do that. I'm not sure I understand where the
hard part is. You seem to know what attributes each entity has.
Maybe you are confused by how to handle N different types of entities
in a single index?
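(a minimal sketch of the multiple-entity-types-in-one-index idea, not
from this thread: give every document a "type" field and only populate
the fields that apply to that type. All field names and values below
are made up for illustration.)

```python
# Sketch: one index holding authors, books, and libraries side by side,
# distinguished by a "type" field. Documents simply leave unrelated
# fields unset; the schema names here are hypothetical.

author_doc = {
    "id": "author-1",
    "type": "author",
    "name": "Jane Doe",
    "birthdate": "1970-01-01",
}

book_doc = {
    "id": "book-1",
    "type": "book",
    "title": "A Sample Title",
    "genre": "fiction",
    "author_id": "author-1",  # join key back to the author document
}

library_doc = {
    "id": "library-1",
    "type": "library",
    "name": "Main Branch",
    "address": "1 Example St",
}

def search(docs, **criteria):
    """Toy stand-in for a query: match documents on field equality."""
    return [d for d in docs
            if all(d.get(k) == v for k, v in criteria.items())]

index = [author_doc, book_doc, library_doc]
# roughly "q=type:book AND author_id:author-1" in Solr query terms:
print(search(index, type="book", author_id="author-1"))
```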
yes... or, more properly, how to relate them to each other.
I understand that the schema can hold tons of attributes that are unused
in different documents. my question seems to be how to organize my data
such that I can answer the question "how do I get a list of libraries
with $book like $pattern" - where does the de-normalization typically
occur? if a document fully represents "a book by an author in a
library" such that the same book (with all its attributes) is in my
index multiple times (one for each library) how do I drill down to
showing just the directions to a specific library?
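(to make the denormalized layout being described concrete: one
document per book-and-library pair, so book attributes repeat once per
library, and the drill-down is just another filter. A rough sketch
with invented field names, not a real schema.)

```python
# Sketch: fully denormalized "book in a library" documents. Book fields
# are duplicated across libraries; drilling down to one library's
# directions is a second, narrower query.

docs = [
    {"id": "book-1@lib-1", "book_id": "book-1", "title": "A Sample Title",
     "library_id": "lib-1", "library_name": "Main Branch",
     "directions": "take the 2nd exit"},
    {"id": "book-1@lib-2", "book_id": "book-1", "title": "A Sample Title",
     "library_id": "lib-2", "library_name": "East Branch",
     "directions": "next to the park"},
]

# Step 1: "list of libraries with $book like $pattern" -- match on the
# book title, then collect the distinct libraries from the hits.
pattern = "sample"
hits = [d for d in docs if pattern in d["title"].lower()]
libraries = sorted({(d["library_id"], d["library_name"]) for d in hits})

# Step 2: drill down to one specific library, e.g. roughly
# "q=book_id:book-1 AND library_id:lib-2" in Solr query terms.
detail = next(d for d in docs
              if d["book_id"] == "book-1" and d["library_id"] == "lib-2")
print(libraries)
print(detail["directions"])
```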
(I'm assuming a single index is what you currently
have in mind)
using different indices is what my lucene+compass counterparts are
doing. I couldn't find an example of that in the solr docs (unless the
answer is running multiple, distinct instances at the same time)
eew :) seriously, though, that's what we have now - all rdbms
driven. if solr could only conceptually handle the initial lookup
there wouldn't be much point.
OG: Well, there might or might not be, depending on how much data you
have, how flexible and fast your RDBMS-powered (full-text?) search is,
and so on. The Lucene/Solr for full-text search + RDBMS/BDB for
display data is a common combination.
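(a minimal sketch of that combination, under assumed names: the index
carries only searchable text plus a primary key, the RDBMS keeps the
full display record. sqlite3 stands in for the real database here, and
the toy "search_ids" function stands in for the full-text engine.)

```python
# Sketch: full-text search returns ids only; display data comes from
# the RDBMS in a second lookup. Schema and data are invented.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE libraries (id TEXT PRIMARY KEY, name TEXT, "
           "address TEXT, directions TEXT)")
db.execute("INSERT INTO libraries VALUES ('lib-1', 'Main Branch', "
           "'1 Example St', 'take the 2nd exit')")

# Pretend search index: just searchable text keyed by primary key.
search_index = [{"id": "lib-1", "text": "main branch example"}]

def search_ids(query):
    """Toy stand-in for the full-text engine: return matching ids."""
    return [d["id"] for d in search_index if query in d["text"]]

# Step 1: search gives ids; step 2: the RDBMS gives display data.
ids = search_ids("main")
placeholders = ",".join("?" * len(ids))
rows = db.execute(
    "SELECT name, directions FROM libraries WHERE id IN (%s)"
    % placeholders, ids).fetchall()
print(rows)
```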
"the decision has been made to use lucene to replace all rdbms
functionality for search"
*cough*
:)
maybe I'm thinking about this all wrong (as is to be expected :), but
I just can't believe that nobody is using solr to represent data a
bit more complex than the examples out there.
OG: Oh, lots of people are, it's just that examples are simple, so
people new to Solr, Lucene, etc. have an easier time learning.
:)
thanks for your help here.
--Geoff