RE: Storing data in Solr
When I put PDF documents and rows from a database table into the same index, I create a "dataSource" field to identify the source. I don't store the database fields in the index, only index them, apart from the unique key, which is stored as "document". On search, you process the output before passing it to the user. If dataSource is PDFs etc., then you should have highlighted text to pass on. If dataSource is the table, then fetch the rows from the database and display the searched fields as "highlights". That is a lot of postprocessing of search results, but it makes it easier to create meaningful results if a single row in the table contains what a user wants. You do need a custom indexer and a custom results postprocessor, however.
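The postprocessing step described above could look roughly like the following. This is only a sketch under assumptions: the dataSource values ("pdf", "table"), the `fetch_row` callback, and the `search_fields` list are hypothetical names, and it assumes dataSource is returned with each hit. The `highlighting` argument follows the shape of Solr's highlighting response, which maps each unique key to a dict of field name to snippet list.

```python
from typing import Callable, Dict, List


def postprocess_hits(
    docs: List[Dict[str, str]],
    highlighting: Dict[str, Dict[str, List[str]]],
    fetch_row: Callable[[str], Dict[str, object]],
    search_fields: List[str],
) -> List[Dict[str, object]]:
    """Turn raw Solr hits into displayable results, branching on dataSource."""
    results = []
    for doc in docs:
        key = doc["document"]  # the stored unique key
        if doc.get("dataSource") == "table":  # hypothetical value for table rows
            # Table rows are indexed but not stored, so go back to the database
            # and present the searched columns as if they were highlights.
            row = fetch_row(key)
            snippets = [f"{name}: {row[name]}" for name in search_fields if name in row]
        else:
            # PDFs and similar sources: pass on Solr's highlighted snippets.
            per_field = highlighting.get(key, {})
            snippets = [frag for frags in per_field.values() for frag in frags]
        results.append(
            {"document": key, "dataSource": doc.get("dataSource"), "highlights": snippets}
        )
    return results
```

Here `fetch_row` stands in for whatever database lookup the application already has; keeping it as a callback keeps the postprocessor independent of the database driver.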
Re: Storing data in Solr
Well, a very common pattern is to use Solr for search, storing just enough in each field (stored="true") to return search results that give users enough information to decide whether they want to look at the original document. When they click on a choice (or a link like "download PDF"), fetch the actual file from the system of record.
You'll have to re-index at some point anyway as your requirements change. That means re-ingesting all your data, and that's easiest from the system of record.
Best,
Erick Erickson
On Mon, Aug 7, 2017 at 8:05 PM, sg1973 wrote:
> I have written the code to publish to Solr but i am wondering what is the
> right way to do it. Is directly putting data in Solr OK or putting it in a
> separate cache and then building solr on top of it? what are the pros and
> cons of each?
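A minimal sketch of that pattern, under assumptions: the base URL, collection name, and field names are hypothetical, and a plain dict stands in for the system of record. The query uses Solr's standard `q`, `fl`, `rows`, and `wt` parameters so that only the small stored summary fields come back. The full record is fetched separately, by id, when the user picks a result.

```python
from typing import Dict, List
from urllib.parse import urlencode


def build_select_url(
    base_url: str, collection: str, query: str, summary_fields: List[str], rows: int = 10
) -> str:
    """Build a /select URL that asks Solr only for the stored summary fields."""
    params = {
        "q": query,
        "fl": ",".join(summary_fields),  # restrict the response to these fields
        "rows": rows,
        "wt": "json",
    }
    return f"{base_url}/{collection}/select?{urlencode(params)}"


def fetch_original(doc_id: str, system_of_record: Dict[str, bytes]) -> bytes:
    """Fetch the full document by id; the dict is a stand-in for a DB or file store."""
    return system_of_record[doc_id]


# Example with hypothetical names:
# build_select_url("http://localhost:8983/solr", "docs", "title:fault", ["id", "title", "summary"])
```

Because the index holds only summaries, a full re-index after a schema change reads from the system of record rather than from Solr itself.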
Re: Storing data in Solr
I have written the code to publish to Solr, but I am wondering what the right way to do it is. Is putting data directly in Solr OK, or should I put it in a separate cache and then build Solr on top of it? What are the pros and cons of each?
--
View this message in context: http://lucene.472066.n3.nabble.com/Storing-data-in-Solr-tp4349537p4349541.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Storing data in Solr
Solr indexes data for search, and if search is the main criterion, Solr should be used.
On Mon, 8/7/17, sg1973 wrote:
Subject: Storing data in Solr
To: solr-user@lucene.apache.org
Received: Monday, August 7, 2017, 6:55 PM
> Hello All,
> I am new to Solr and have a question. I have to load about 1 million records from a DB table (with say 30 columns/row) and then run various search queries on it. I see 2 ways to do it: store the data directly in Solr, or store it in a cache and then search on it using Solr. I am trying to understand which approach is better and recommended.
> One use case where I would need a separate cache is when I have to store non-linear data (PDF et al) which won't be supported by Solr. However, if I have tabulated data then I have a choice to store it directly in Solr. Any ideas on what to choose when? Is there a reason I would choose a separate cache even for storing linear data?
> Thanks in advance
> PG
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Storing-data-in-Solr-tp4349537.html
> Sent from the Solr - User mailing list archive at Nabble.com.
Re: Storing data in Solr
Which database is to be integrated? Solr provides Data Import Handlers for several databases including Oracle and MySQL.
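The Data Import Handler is configured on the Solr side. As a different, client-side approach to loading the roughly 1 million rows from the question, rows can be read in batches and posted to Solr's JSON update endpoint. The sketch below rests on assumptions: sqlite3 stands in for the real database, the table name and update URL are hypothetical, and the batch size is arbitrary.

```python
import json
import sqlite3
import urllib.request
from typing import Dict, Iterator, List


def row_batches(
    conn: sqlite3.Connection, table: str, batch_size: int = 1000
) -> Iterator[List[Dict[str, object]]]:
    """Yield rows as lists of column-name -> value dicts, batch_size at a time."""
    # The table name is interpolated directly, so it must come from trusted code.
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [col[0] for col in cur.description]
    while True:
        rows = cur.fetchmany(batch_size)
        if not rows:
            break
        yield [dict(zip(columns, row)) for row in rows]


def update_payload(batch: List[Dict[str, object]]) -> bytes:
    """Serialize one batch as a JSON array of documents for Solr's /update handler."""
    return json.dumps(batch).encode("utf-8")


def post_batch(update_url: str, payload: bytes) -> int:
    """POST one batch, e.g. to http://localhost:8983/solr/<core>/update (assumed URL)."""
    request = urllib.request.Request(
        update_url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Batching keeps memory use flat regardless of table size. A commit would still need to be issued after the last batch, which this sketch leaves out.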