rclabo commented on issue #1130:
URL: https://github.com/apache/lucenenet/issues/1130#issuecomment-2671546445

   It's a common and recommended approach for the primary data repository to 
reside outside of Lucene.NET. In the .NET world, data is very often stored in 
SQL Server, although it could just as easily come from MongoDB, PostgreSQL, 
SQLite, or another source. Typically, when indexing data with Lucene.NET, one 
document is inserted for each record you want returned as a search result. 
Often, this means inserting one document per source record, but that decision 
depends on your design needs.
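
   As a rough illustration of the one-document-per-source-record case, here is 
a minimal Python sketch. It is not Lucene.NET code: an in-memory SQLite table 
stands in for the primary data store, and plain dicts stand in for index 
documents. The table, column, and function names are all hypothetical. Keeping 
the primary key in each document is an assumption on my part, but it is a 
common way to get from a search hit back to the full record.

```python
import sqlite3


def build_documents(conn):
    """Return one dict per source row; each dict stands in for an index document."""
    rows = conn.execute("SELECT id, name, description FROM products ORDER BY id")
    # Keep the primary key in every document (an assumption of this sketch) so a
    # search hit can be resolved back to the full record in the primary store.
    return [{"id": str(r[0]), "name": r[1], "description": r[2]} for r in rows]


# Hypothetical primary store: a tiny in-memory products table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, "V-neck Shirt", "Cotton shirt"), (2, "Crew Sock", "Wool sock")],
)

docs = build_documents(conn)
```

   In a real application each dict would be turned into a Lucene.NET document 
and handed to the index writer; the database remains the source of truth.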
   
   For example, consider a database with a product master and associated 
product detail records (such as product variations). The master record might be 
"V-neck Shirt," while the detail records might include variations like "V-neck 
Shirt, Red, Medium" and "V-neck Shirt, Blue, Small." If you want the Lucene.NET 
search to return a single result for "V-neck Shirt" regardless of color and 
size, you would insert one document into Lucene.NET for that product and add 
the various colors and sizes as indexed terms. Conversely, if you want separate 
search results for each variation, then you should index each product detail 
record separately and include the product name from the master record as part 
of each document.
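
   The two choices above can be sketched in a few lines of Python. This is 
only a model of how the documents would be shaped, using plain dicts rather 
than any Lucene.NET API; the field names and both function names are 
hypothetical.

```python
# Hypothetical master record and its detail (variation) records.
master = {"id": 100, "name": "V-neck Shirt"}
details = [
    {"id": 1, "color": "Red", "size": "Medium"},
    {"id": 2, "color": "Blue", "size": "Small"},
]


def one_document_per_product(master, details):
    """Single document: every color and size is added to it as a searchable value."""
    return [{
        "product_id": master["id"],
        "name": master["name"],
        "colors": sorted({d["color"] for d in details}),
        "sizes": sorted({d["size"] for d in details}),
    }]


def one_document_per_variation(master, details):
    """One document per detail record, each carrying the master's product name."""
    return [
        {
            "detail_id": d["id"],
            "product_id": master["id"],
            "name": master["name"],
            "color": d["color"],
            "size": d["size"],
        }
        for d in details
    ]


product_docs = one_document_per_product(master, details)
variation_docs = one_document_per_variation(master, details)
```

   With the first shape, a search for "red" or "small" yields one hit for the 
shirt; with the second, each matching variation is its own hit.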
   
   Note that while Lucene.NET stores its index data on the file system, it also 
uses memory for caching and performance enhancements. For a deeper dive into 
its architecture, this whitepaper might be helpful: 
http://opensearchlab.otago.ac.nz/paper_10.pdf .


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@lucenenet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
