Re: use case: structured DB records with a bunch of related files

2011-07-21 Thread Erick Erickson
I suspect you'll have to use Tika to parse the attachments and, as you
do, add to the meta-data that Tika generates whatever info you need to
display the links. I'm in a bit of a rush, but one approach would be to use
SolrJ to do both the database querying and the indexing. From the SolrJ
program you can ask Tika to parse each attachment, assemble the Solr
document for that attachment from the Tika output, add whatever meta-data
you want for a link back to the DB record, and send the doc to Solr.
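
To make the "assemble the document" step a bit more concrete, here is a
minimal sketch of one way it might look. It deliberately uses only the JDK:
the text extractor is passed in as a function that stands in for the Tika
parse, and the result is a plain map standing in for the document you would
hand to SolrJ. The class name, the method, and every field name (id,
doc_type, record_id, record_url, file_url, content) are assumptions made up
for illustration, not anything Solr or Tika prescribes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch only: builds the field map for one attachment document.
// In a real indexer, extractText would wrap a Tika parse, and the returned
// fields would be copied into a SolrJ input document and sent to Solr.
class AttachmentDocBuilder {

    // All field names here are hypothetical; they have to match whatever
    // the Solr schema actually defines.
    static Map<String, Object> build(String recordId,
                                     String fileName,
                                     String recordUrl,
                                     String fileUrl,
                                     Function<String, String> extractText) {
        Map<String, Object> doc = new LinkedHashMap<>();
        // Deterministic id, so re-indexing a changed file reuses the same key.
        doc.put("id", "file-" + recordId + "-" + fileName);
        doc.put("doc_type", "attachment");
        // Link-back meta-data: which DB record owns this file, and where both live.
        doc.put("record_id", recordId);
        doc.put("record_url", recordUrl);
        doc.put("file_url", fileUrl);
        // Searchable body text, supplied by the extractor (Tika in practice).
        doc.put("content", extractText.apply(fileName));
        return doc;
    }

    public static void main(String[] args) {
        // Stand-in extractor; a real one would return the text Tika pulled from the file.
        Function<String, String> fakeExtractor = name -> "text extracted from " + name;
        Map<String, Object> doc = build("42", "report.pdf",
                "/records/42", "/files/42/report.pdf", fakeExtractor);
        doc.forEach((field, value) -> System.out.println(field + " = " + value));
    }
}
```

With something like this, a hit on the content field of an attachment doc
carries both record_url and file_url, so the UI can show the two links
without a second lookup. Since Solr replaces an existing document when a new
one arrives with the same unique key, a stable id per file is one way to
cope with the files changing over time.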

Hope this helps
Erick

On Tue, Jul 19, 2011 at 12:50 PM, Travis Low t...@4centurion.com wrote:
 Greetings.  I have a bunch of highly structured DB records, and I'm pretty
 clear on how to index those.  However, each of those records may have any
 number of related documents (Word, Excel, PDF, PPT, etc.).  All of this
 information will change over time.

 Can someone point me to a use case or some good reading to get me started on
 configuring Solr to index the DB records and files in such a way as to
 relate the two types of information?  By relate, I mean that if there's a
 hit in a related file, then I need to show the user a link to the DB record
 as well as a link to the file.

 Thanks in advance.

 cheers,

 Travis




use case: structured DB records with a bunch of related files

2011-07-19 Thread Travis Low
Greetings.  I have a bunch of highly structured DB records, and I'm pretty
clear on how to index those.  However, each of those records may have any
number of related documents (Word, Excel, PDF, PPT, etc.).  All of this
information will change over time.

Can someone point me to a use case or some good reading to get me started on
configuring Solr to index the DB records and files in such a way as to
relate the two types of information?  By relate, I mean that if there's a
hit in a related file, then I need to show the user a link to the DB record
as well as a link to the file.

Thanks in advance.

cheers,

Travis

-- 
Travis Low, Director of Development
t...@4centurion.com
Centurion Research Solutions, LLC
14048 ParkEast Circle • Suite 100 • Chantilly, VA 20151
703-956-6276 • 703-378-4474 (fax)
http://www.centurionresearch.com

The information contained in this email message is confidential and
protected from disclosure.  If you are not the intended recipient, any use
or dissemination of this communication, including attachments, is strictly
prohibited.  If you received this email message in error, please delete it
and immediately notify the sender.

This email message and any attachments have been scanned and are believed to
be free of malicious software and defects that might affect any computer
system in which they are received and opened. No responsibility is accepted
by Centurion Research Solutions, LLC for any loss or damage arising from the
content of this email.