Hello, I'm new to this mailing list so I hope that I'm sending my question to 
the right place.


We recently ran into issues with the Lucene mass indexing process.
The connection is closed after a few seconds and the indexing is only partially
processed.

I have a piece of code that re-indexes a table containing contacts.
This code was running fine until we executed it on a table containing more than
2 million contacts.

In that configuration the process does not run to completion and stops with the
following exception:
11/15 16:12:32 ERROR org.hibernate.search.exception.impl.LogErrorHandler - HSEARCH000058: HSEARCH000211: An exception occurred while the MassIndexer was fetching the primary identifiers list
org.hibernate.exception.JDBCConnectionException: could not advance using next()
    at org.hibernate.exception.internal.SQLExceptionTypeDelegate.convert(SQLExceptionTypeDelegate.java:48)
    at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:42)
    at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:111)
    at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:97)
    at org.hibernate.internal.ScrollableResultsImpl.convert(ScrollableResultsImpl.java:69)
    at org.hibernate.internal.ScrollableResultsImpl.next(ScrollableResultsImpl.java:104)
    at org.hibernate.search.batchindexing.impl.IdentifierProducer.loadAllIdentifiers(IdentifierProducer.java:148)
    at org.hibernate.search.batchindexing.impl.IdentifierProducer.inTransactionWrapper(IdentifierProducer.java:109)
    at org.hibernate.search.batchindexing.impl.IdentifierProducer.run(IdentifierProducer.java:85)
    at org.hibernate.search.batchindexing.impl.OptionallyWrapInJTATransaction.runWithErrorHandler(OptionallyWrapInJTATransaction.java:69)
    at org.hibernate.search.batchindexing.impl.ErrorHandledRunnable.run(ErrorHandledRunnable.java:32)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.sql.SQLNonTransientConnectionException: (conn=18) Server has closed the connection. If result set contain huge amount of data, Server expects client to read off the result set relatively fast. In this case, please consider increasing net_wait_timeout session variable / processing your result set faster (check Streaming result sets documentation for more information)
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.get(ExceptionMapper.java:234)
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.getException(ExceptionMapper.java:165)
    at org.mariadb.jdbc.internal.com.read.resultset.SelectResultSet.handleIoException(SelectResultSet.java:381)
    at org.mariadb.jdbc.internal.com.read.resultset.SelectResultSet.next(SelectResultSet.java:650)
    at org.apache.commons.dbcp2.DelegatingResultSet.next(DelegatingResultSet.java:1160)
    at org.apache.commons.dbcp2.DelegatingResultSet.next(DelegatingResultSet.java:1160)
    at org.hibernate.internal.ScrollableResultsImpl.next(ScrollableResultsImpl.java:99)
    ... 10 more

Here is the piece of code:
    private void index(EntityManager em, LongConsumer callBack) {
        // Entry point to the full-text API for the current EntityManager
        FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(em);
        // Rebuild the index for all Contact entities
        MassIndexer indexer = fullTextEntityManager.createIndexer(Contact.class);
        indexer.batchSizeToLoadObjects(BATCH_SIZE);
        indexer.threadsToLoadObjects(NB_THREADS);
        indexer.progressMonitor(new IndexerProgressMonitor(callBack));
        // start() runs the indexing asynchronously
        indexer.start();
    }
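
For reference, here is a variant of the same method with the two extra MassIndexer options that looked relevant to us, idFetchSize() and startAndWait(). This is only a sketch: we have not validated it on the 2M-row table, and we are not sure idFetchSize affects the identifier scroll that fails above.

    private void indexAndWait(EntityManager em, LongConsumer callBack) throws InterruptedException {
        FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(em);
        MassIndexer indexer = fullTextEntityManager.createIndexer(Contact.class);
        indexer.batchSizeToLoadObjects(BATCH_SIZE);
        indexer.threadsToLoadObjects(NB_THREADS);
        indexer.idFetchSize(1000);               // fetch size hint for the identifier scroll; the value is a guess
        indexer.progressMonitor(new IndexerProgressMonitor(callBack));
        indexer.startAndWait();                  // block until indexing finishes, instead of the asynchronous start()
    }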


After several unsuccessful tries, the only thing that seems to make the process
run to the end is to edit the MariaDB config file and set:
net_write_timeout = 3600
By default this timeout is 60 seconds. Here we set it to 1 hour, as the process
took about 45 minutes.
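
As a side note, we believe the same value could be scoped to the application's own connections instead of the whole server, assuming the MariaDB Connector/J sessionVariables URL parameter does what we think it does (host and database names below are placeholders):

    // A sketch only, not something we have tried yet
    String url = "jdbc:mariadb://dbhost:3306/contactsdb?sessionVariables=net_write_timeout=3600";

That would avoid changing a global server setting, but it still leaves the timeout at one hour.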
Does anyone have an idea of what we did wrong? It does not seem reasonable to
set such a huge value just to run a full re-index of the table.
Is there a way to ask the MassIndexer to process the data in limited chunks, or
some other way to avoid keeping the connection open for so long?
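
The kind of thing we have in mind is a manual, paginated re-index along these lines (a rough sketch using FullTextEntityManager.index(), with transaction handling omitted; BATCH_SIZE is the same constant as above and Contact is assumed to have a numeric id):

    // Rough sketch of chunked re-indexing as an alternative to the MassIndexer.
    // Each page opens and fully consumes its own query, so no single result set
    // stays open for the whole run. Not tested on the 2M-row table.
    private void indexInChunks(EntityManager em) {
        FullTextEntityManager ftem = Search.getFullTextEntityManager(em);
        int firstResult = 0;
        List<Contact> page;
        do {
            page = em.createQuery("select c from Contact c order by c.id", Contact.class)
                    .setFirstResult(firstResult)
                    .setMaxResults(BATCH_SIZE)
                    .getResultList();
            for (Contact contact : page) {
                ftem.index(contact);       // queue the entity for indexing
            }
            ftem.flushToIndexes();         // write the pending batch to the index
            em.clear();                    // detach processed entities to keep memory flat
            firstResult += BATCH_SIZE;
        } while (!page.isEmpty());
    }

Is this the recommended approach, or is there a MassIndexer setting we are missing?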
Thanks for your time.
