Hey Justin,

Think of the commit as a normal ACID transaction. In a regular RDBMS you would not add millions of records before committing, so commit at regular intervals to keep the transaction from becoming too big. The documents will not be searchable before the commit.
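Here is a rough sketch of what I mean by committing at regular intervals, written in Java. Note that Writer and CountingWriter are hypothetical stand-ins I made up for illustration, not Lucene classes; they only mirror the add, commit and optimize operations discussed in this thread, and the batch size of 10 is arbitrary. On a real index the corresponding calls are on IndexWriter, and newer releases replace optimize with forceMerge, so check the API of your version.

```java
import java.util.ArrayList;
import java.util.List;

public class BulkIndexSketch {

    /** Hypothetical stand-in for the writer operations discussed here (not a Lucene type). */
    interface Writer {
        void addDocument(String doc);
        void commit();
        void optimize();
    }

    /** In-memory fake that only counts what happens, so the pattern can be run anywhere. */
    static class CountingWriter implements Writer {
        int pending = 0;            // added but not yet committed (not searchable)
        int searchable = 0;         // visible to searches after a commit
        int commits = 0;
        int optimizes = 0;
        int largestTransaction = 0; // biggest number of docs held in one commit

        public void addDocument(String doc) { pending++; }

        public void commit() {
            largestTransaction = Math.max(largestTransaction, pending);
            searchable += pending;
            pending = 0;
            commits++;
        }

        public void optimize() { optimizes++; }
    }

    /** Adds all docs, committing every commitEvery docs, then optimizes once at the end. */
    static void bulkLoad(Writer writer, Iterable<String> docs, int commitEvery) {
        if (commitEvery <= 0) {
            throw new IllegalArgumentException("commitEvery must be positive");
        }
        int sinceCommit = 0;
        for (String doc : docs) {
            writer.addDocument(doc);
            if (++sinceCommit == commitEvery) {
                writer.commit();   // keeps each transaction bounded
                sinceCommit = 0;
            }
        }
        if (sinceCommit > 0) {
            writer.commit();       // flush the final partial batch
        }
        writer.optimize();         // once, after all documents are added
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            docs.add("doc-" + i);
        }
        CountingWriter writer = new CountingWriter();
        bulkLoad(writer, docs, 10);
        // prints "3 commits, 25 searchable docs, 1 optimize"
        System.out.println(writer.commits + " commits, " + writer.searchable
                + " searchable docs, " + writer.optimizes + " optimize");
    }
}
```

With millions of docs you would pick a much larger interval; the point is only that no single commit has to carry the whole load, and the single optimize call at the end corresponds to the step described in the next paragraph.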
Optimization should be done after you have finished adding documents to the index. It merges the segments and removes any leftover deleted documents (Lucene only marks a document as deleted and removes it when optimizing).

Regards,
Anders Lybecker

On Mon, Jun 4, 2012 at 10:40 PM, Kohlhepp, Justin W (Heritage Holdings (HHI)) <justin.kohlh...@thehartford.com> wrote:
> Hello,
>
> I have a process that is going to create a new index and populate the
> initial data for it which ends up being millions of docs. To make this
> process as efficient (fast) as possible, when should I commit the
> writer, and when should I optimize? Is it better to commit/optimize
> once at the end, or am I better off doing it every X documents?
>
> Thanks,
>
> ~ Justin