rclabo commented on issue #1130: URL: https://github.com/apache/lucenenet/issues/1130#issuecomment-2671546445
It's a common and recommended approach for the primary data repository to reside outside of Lucene.NET. In the .NET world, data is very often stored in SQL Server, although it could just as easily come from MongoDB, PostgreSQL, SQLite, or another source.

Typically, when indexing data with Lucene.NET, you insert one document for each record you want returned as a search result. Often, this means one document per source record, but that decision depends on your design needs.

For example, consider a database with a product master and associated product detail records (such as product variations). The master record might be "V-neck Shirt," while the detail records might include variations like "V-neck Shirt, Red, Medium" and "V-neck Shirt, Blue, Small." If you want the Lucene.NET search to return a single result for "V-neck Shirt" regardless of color and size, you would insert one document into Lucene.NET for that product and add the various colors and sizes as indexed terms. Conversely, if you want separate search results for each variation, you should index each product detail record as its own document and include the product name from the master record in each one.

Note that while Lucene.NET stores its index data on the file system, it also uses memory for caching and performance enhancements. For a deeper dive into its architecture, this whitepaper might be helpful: http://opensearchlab.otago.ac.nz/paper_10.pdf
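
The two indexing options described above (one document per product versus one document per variation) can be sketched without any search library. What follows is a minimal, library-agnostic Python sketch, not Lucene.NET code: the `Product` and `ProductDetail` types, the field names, and the ID scheme are all hypothetical, and plain dictionaries stand in for the documents you would actually add to the index.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ProductDetail:
    # Hypothetical detail record: one variation of a product.
    color: str
    size: str


@dataclass
class Product:
    # Hypothetical master record with its associated detail records.
    product_id: int
    name: str
    details: List[ProductDetail]


def one_document_per_product(product: Product) -> Dict[str, object]:
    """Collapse all variations into a single document.

    A search then returns one hit per product; the colors and sizes
    become extra searchable terms on that one document.
    """
    return {
        "id": str(product.product_id),
        "name": product.name,
        "colors": sorted({d.color for d in product.details}),
        "sizes": sorted({d.size for d in product.details}),
    }


def one_document_per_variation(product: Product) -> List[Dict[str, str]]:
    """Emit one document per detail record.

    Each document repeats the product name from the master record so
    that a search on the name matches every variation separately.
    """
    return [
        {
            "id": f"{product.product_id}-{n}",  # assumed ID scheme
            "name": product.name,
            "color": d.color,
            "size": d.size,
        }
        for n, d in enumerate(product.details, start=1)
    ]


shirt = Product(
    product_id=1,
    name="V-neck Shirt",
    details=[
        ProductDetail(color="Red", size="Medium"),
        ProductDetail(color="Blue", size="Small"),
    ],
)

single = one_document_per_product(shirt)
per_variation = one_document_per_variation(shirt)
```

In both cases the source database remains the system of record; the documents carry only what is needed to search and to look the record up again by its ID.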