woj-tek commented on PR #2342:
URL: https://github.com/apache/james-project/pull/2342#issuecomment-2283478008

   Hi Uwe and thank you so much for the very insightful comments. Now it makes 
much more sense :)
   
   > My recommendation would be to remove the current version of Lucene support 
completely and rewrite it, ideally maybe with a configurable Index schema or 
better - usage of Apache Solr, Elasticsearch or Opensearch (to have the 
indexing clearly separated and scaleable for huge installations). E.g., Dovecot 
IMAP server has support for Apache Solr or Elasticsearch, so indexing e-Mail is 
straight forwards and by adapting the schema files shipped with the repo, it is 
possible to customize the text analysis without changing the code.
   
   James already supports OpenSearch, and I think it's the preferred way. That 
said, the Lucene implementation is handy for small deployments where running a 
single service (or rather, a limited number of services) is convenient.
   
   As I said before, I had zero knowledge of Lucene just a couple of days 
ago, and I was just trying to make it work by looking at various documentation, 
SO answers, and so forth.
   
   
   > The correct way to update document in Lucene is the following: Rebuild the 
document using the same code which was used during indexing - don't use any 
information from index. E.g., read your document from the database/mail 
folder/EML file/.... and create a completely new document with applying the 
indexing schema that was configured by the user (languages). Important is also 
to index IDs as StringField, because any other field type is not supported for 
IDs. When you do this, the document is reachable easily using its IDs (case 
sensitive) and can be updated.
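
   For reference, the update flow described above could look roughly like the 
sketch below. It assumes Lucene 9.x (`lucene-core`) on the classpath. The class 
name, the field names (`id`, `subject`, `body`) and the `reindex` helper are 
hypothetical and are not taken from the James code base; `StandardAnalyzer` 
stands in for whatever analysis the user has configured.

```java
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

// Hypothetical sketch, not the James implementation.
public class UpdateByIdSketch {

    // Builds the document from the source data only (never from the index).
    static Document buildDocument(String messageId, String subject, String body) {
        Document doc = new Document();
        // StringField indexes the ID as a single, untokenized, case-sensitive term.
        doc.add(new StringField("id", messageId, Field.Store.YES));
        // TextField values go through the analyzer configured on the writer.
        doc.add(new TextField("subject", subject, Field.Store.NO));
        doc.add(new TextField("body", body, Field.Store.NO));
        return doc;
    }

    // Deletes any document whose "id" term matches, then adds the rebuilt one.
    static void reindex(IndexWriter writer, String messageId, String subject, String body)
            throws IOException {
        writer.updateDocument(new Term("id", messageId),
                buildDocument(messageId, subject, body));
    }

    // Counts documents containing the exact term in the given field.
    static int count(Directory dir, String field, String value) throws IOException {
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            return new IndexSearcher(reader).count(new TermQuery(new Term(field, value)));
        }
    }

    public static void main(String[] args) throws IOException {
        try (Directory dir = new ByteBuffersDirectory();
             IndexWriter writer = new IndexWriter(dir,
                     new IndexWriterConfig(new StandardAnalyzer()))) {
            reindex(writer, "Msg-1", "hello", "first version");
            // Same ID again: the earlier document is replaced, not duplicated.
            reindex(writer, "Msg-1", "hello", "second version");
            writer.commit();
            System.out.println("docs with id Msg-1: " + count(dir, "id", "Msg-1"));
            System.out.println("docs matching 'first': " + count(dir, "body", "first"));
        }
    }
}
```

   The point of the sketch is that nothing is read back from the index: the 
document is rebuilt from the original message data, and the `StringField` ID is 
the only handle used to replace it.
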
   
   @ james team - should we do this, or go further with the `DocValues` approach mentioned earlier?
   
   @uschindler - It would be helpful if the above information could somehow be 
included in the Lucene javadocs, so that it is clearer what to use and what to 
avoid.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

