Hey Sylvain,

This seems like a Hibernate Search problem and not really a Lucene one.
Maybe you could post this to https://discourse.hibernate.org/ and we'll
continue the discussion there?

Have a nice day,
Marko

On Wed, 17 Jan 2024 at 15:43, Sylvain Roulet <sylvain.rou...@eloquant.com>
wrote:

> Hello, I'm new to this mailing list, so I hope I'm sending my question
> to the right place.
>
>
> We recently experienced issues with the Lucene mass indexing process.
> The connection is closed after a few seconds and the indexing is only
> partially completed.
>
> I've got a piece of code that re-indexes a table containing contacts.
> This code ran fine until we executed it on a table containing more than
> 2 million contacts.
>
> In that configuration the process does not run to completion and stops
> with the following exception:
> 11/15 16:12:32 ERROR org.hibernate.search.exception.impl.LogErrorHandler -
> HSEARCH000058: HSEARCH000211: An exception occurred while the MassIndexer
> was fetching the primary identifiers list
> org.hibernate.exception.JDBCConnectionException: could not advance using
> next()
> at
> org.hibernate.exception.internal.SQLExceptionTypeDelegate.convert(SQLExceptionTypeDelegate.java:48)
> at
> org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:42)
> at
> org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:111)
> at
> org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:97)
> at
> org.hibernate.internal.ScrollableResultsImpl.convert(ScrollableResultsImpl.java:69)
> at
> org.hibernate.internal.ScrollableResultsImpl.next(ScrollableResultsImpl.java:104)
> at
> org.hibernate.search.batchindexing.impl.IdentifierProducer.loadAllIdentifiers(IdentifierProducer.java:148)
> at
> org.hibernate.search.batchindexing.impl.IdentifierProducer.inTransactionWrapper(IdentifierProducer.java:109)
> at
> org.hibernate.search.batchindexing.impl.IdentifierProducer.run(IdentifierProducer.java:85)
> at
> org.hibernate.search.batchindexing.impl.OptionallyWrapInJTATransaction.runWithErrorHandler(OptionallyWrapInJTATransaction.java:69)
> at
> org.hibernate.search.batchindexing.impl.ErrorHandledRunnable.run(ErrorHandledRunnable.java:32)
> at
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.sql.SQLNonTransientConnectionException: (conn=18) Server
> has closed the connection. If result set contain huge amount of data,
> Server expects client to read off the result set relatively fast. In this
> case, please consider increasing net_wait_timeout session variable /
> processing your result set faster (check Streaming result sets
> documentation for more information)
> at
> org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.get(ExceptionMapper.java:234)
> at
> org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.getException(ExceptionMapper.java:165)
> at
> org.mariadb.jdbc.internal.com.read.resultset.SelectResultSet.handleIoException(SelectResultSet.java:381)
> at
> org.mariadb.jdbc.internal.com.read.resultset.SelectResultSet.next(SelectResultSet.java:650)
> at
> org.apache.commons.dbcp2.DelegatingResultSet.next(DelegatingResultSet.java:1160)
> at
> org.apache.commons.dbcp2.DelegatingResultSet.next(DelegatingResultSet.java:1160)
> at
> org.hibernate.internal.ScrollableResultsImpl.next(ScrollableResultsImpl.java:99)
> ... 10 more
>
> Here is the piece of code (imports shown for completeness; this is the
> Hibernate Search 5 API):
>
>     import java.util.function.LongConsumer;
>     import javax.persistence.EntityManager;
>     import org.hibernate.search.MassIndexer;
>     import org.hibernate.search.jpa.FullTextEntityManager;
>     import org.hibernate.search.jpa.Search;
>
>     private void index(EntityManager em, LongConsumer callBack) {
>         FullTextEntityManager fullTextEntityManager =
>                 Search.getFullTextEntityManager(em);
>         MassIndexer indexer = fullTextEntityManager.createIndexer(Contact.class);
>         indexer.batchSizeToLoadObjects(BATCH_SIZE);
>         indexer.threadsToLoadObjects(NB_THREADS);
>         indexer.progressMonitor(new IndexerProgressMonitor(callBack));
>         // start() is asynchronous (it returns a Future); failures are
>         // reported through the error handler, as in the log above.
>         indexer.start();
>     }
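>
> (IndexerProgressMonitor is our own helper, not shown above. A simplified
> sketch of the idea, built on Hibernate Search's
> SimpleIndexingProgressMonitor; our real class differs in details:)
>
>     import java.util.function.LongConsumer;
>     import org.hibernate.search.batchindexing.impl.SimpleIndexingProgressMonitor;
>
>     // Simplified sketch, not the exact production class: keep the default
>     // periodic logging and also forward the added-documents count to the
>     // caller's callback.
>     public class IndexerProgressMonitor extends SimpleIndexingProgressMonitor {
>         private final LongConsumer callBack;
>
>         public IndexerProgressMonitor(LongConsumer callBack) {
>             this.callBack = callBack;
>         }
>
>         @Override
>         public void documentsAdded(long increment) {
>             super.documentsAdded(increment); // default progress logging
>             callBack.accept(increment);      // notify the caller
>         }
>     }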
>
>
> After several unsuccessful tries, the only thing that makes the process
> run to completion is to edit the MariaDB config file and set:
> net_write_timeout = 3600
> By default the timeout is 60 seconds; here we put it at 1 hour, as the
> process took 45 minutes.
> Does someone have an idea of what we did wrong? It does not seem
> reasonable to need such a huge value just to run a full re-index of the
> table.
> Is there a way to ask MassIndexer to process data in limited chunks, or
> otherwise avoid keeping the connection open for so long?
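>
> For example, is something along these lines the right direction?
> (idFetchSize and startAndWait() are listed in the MassIndexer javadoc;
> the value 10000 is a guess, and we have not verified that it avoids the
> streaming timeout. Same imports as in the snippet above.)
>
>     private void index(EntityManager em, LongConsumer callBack)
>             throws InterruptedException {
>         FullTextEntityManager fullTextEntityManager =
>                 Search.getFullTextEntityManager(em);
>         MassIndexer indexer = fullTextEntityManager.createIndexer(Contact.class);
>         indexer.batchSizeToLoadObjects(BATCH_SIZE);
>         indexer.threadsToLoadObjects(NB_THREADS);
>         // Guess: a larger id fetch size, so the scroll over the 2M+ primary
>         // keys is drained in fewer, bigger round trips.
>         indexer.idFetchSize(10_000);
>         indexer.progressMonitor(new IndexerProgressMonitor(callBack));
>         // Blocks until indexing finishes, unlike the fire-and-forget start().
>         indexer.startAndWait();
>     }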
> Thanks for your time.
>
>
