Hello,
Is your indexer actually saturating CPU or I/O? Check with iostat/vmstat.
If it isn't, take several thread dumps with the jvisualvm sampler or jstack
and try to work out what is blocking your threads from making progress.
You may find that you need to speed up your SQL data consumption. To do this,
you can enable
Pranav,
If possible, you may wish to consider moving a job this large outside
of DataImportHandler to a standalone program, as the SQL processing is
somewhat limited by the N+1 subselects problem.
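
To make the N+1 pattern concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `item` and `tag` tables, the `CountingConnection` wrapper and both loader functions are hypothetical and only for illustration; they are not part of DataImportHandler. The first loader mimics a nested entity by running one subselect per parent row, while the second fetches the same data with a single join.

```python
import sqlite3


class CountingConnection:
    """Thin wrapper (hypothetical) that counts how many queries are issued."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def build_db():
    # Tiny in-memory schema: 3 parent rows, 0..2 child rows each.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE tag (item_id INTEGER, name TEXT);
        INSERT INTO item VALUES (1, 'a'), (2, 'b'), (3, 'c');
        INSERT INTO tag VALUES (1, 'x'), (1, 'y'), (2, 'z');
    """)
    return conn


def load_nested(db):
    """N+1 style: one root query, then one subselect per root row."""
    docs = {}
    for item_id, title in db.query("SELECT id, title FROM item ORDER BY id"):
        tags = db.query(
            "SELECT name FROM tag WHERE item_id = ? ORDER BY name", (item_id,)
        )
        docs[item_id] = {"title": title, "tags": [name for (name,) in tags]}
    return docs


def load_joined(db):
    """Single query: join parent and child, group rows in the client."""
    docs = {}
    rows = db.query(
        "SELECT i.id, i.title, t.name FROM item i "
        "LEFT JOIN tag t ON t.item_id = i.id ORDER BY i.id, t.name"
    )
    for item_id, title, name in rows:
        doc = docs.setdefault(item_id, {"title": title, "tags": []})
        if name is not None:  # LEFT JOIN yields NULL for items without tags
            doc["tags"].append(name)
    return docs


nested_db = CountingConnection(build_db())
joined_db = CountingConnection(build_db())
nested_docs = load_nested(nested_db)
joined_docs = load_joined(joined_db)
print(nested_db.queries, joined_db.queries)
```

Both loaders build the same documents, but the nested version issues one extra query per parent row, which is what dominates the run time once the parent table has millions of rows.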
Michael Della Bitta
Appinions | 18 East 41st St.,
9m*15 - that's a lot of queries (400 QPS).
I would try to reduce the number of queries:
1. Rewrite your main (root) query to select as much of the data as possible
in one pass:
* use SQL joins instead of DIH nested entities
* select data from 1-N related tables (tags, authors, etc.) in the main
query using GROUP_CONCAT
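
A minimal sketch of the GROUP_CONCAT idea, using Python's sqlite3 module so that it runs without a database server. The `book` and `book_tag` tables and the `to_document` helper are hypothetical. SQLite spells the aggregate `GROUP_CONCAT(expr, ',')`, whereas MySQL uses `GROUP_CONCAT(expr SEPARATOR ',')`, so the query would need adjusting for your database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE book_tag (book_id INTEGER, tag TEXT);
    INSERT INTO book VALUES
        (1, 'Solr Basics'), (2, 'SQL Tuning'), (3, 'Untagged');
    INSERT INTO book_tag VALUES
        (1, 'search'), (1, 'lucene'), (2, 'sql');
""")

# One root query returns each book together with all of its tags,
# collapsed into a single comma-separated column.
ROOT_QUERY = """
    SELECT b.id, b.title, GROUP_CONCAT(t.tag, ',') AS tags
    FROM book b
    LEFT JOIN book_tag t ON t.book_id = b.id
    GROUP BY b.id, b.title
    ORDER BY b.id
"""


def to_document(row):
    """Split the concatenated column back into a multi-valued field."""
    book_id, title, tags = row
    # GROUP_CONCAT yields NULL (None) when a book has no tags at all.
    # Concatenation order is not guaranteed here, so sort for stability.
    tag_list = sorted(tags.split(",")) if tags else []
    return {"id": book_id, "title": title, "tags": tag_list}


documents = [to_document(row) for row in conn.execute(ROOT_QUERY)]
for doc in documents:
    print(doc)
```

The concatenated column has to be split back into separate values before indexing. Inside DIH, the RegexTransformer `splitBy` attribute is one way to do that, assuming the separator never appears inside a value.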