Re: DataImportHandler SolrEntityProcessor configuration for local copy

2020-02-06 Thread Mikhail Khludnev
Karl, what would you do if your own implementation stalls in GC, or knocks Solr over? On Thu, Feb 6, 2020 at 1:04 PM Karl Stoney wrote: > Spoke too soon, looks like it leaks memory. After about 1.3m the old gc > times went through the roof and Solr was almost unresponsive, had to > abort. We'

Re: DataImportHandler SolrEntityProcessor configuration for local copy

2020-02-06 Thread Mikhail Khludnev
Egor, would you mind sharing some best practices regarding cursorMark in SolrEntityProcessor? On Thu, Feb 6, 2020 at 1:04 PM Karl Stoney wrote: > Spoke too soon, looks like it leaks memory. After about 1.3m the old gc > times went through the roof and Solr was almost unresponsive, had to > abo

Re: DataImportHandler SolrEntityProcessor configuration for local copy

2020-02-06 Thread Karl Stoney
Spoke too soon, looks like it leaks memory. After about 1.3m the old gc times went through the roof and Solr was almost unresponsive, had to abort. We're going to write our own implementation to copy data from one core to another that runs outside of Solr. On 06/02/2020, 09:57, "Karl Stoney"

Re: DataImportHandler SolrEntityProcessor configuration for local copy

2020-02-06 Thread Karl Stoney
I cannot believe how much of a difference that cursorMark and sort order made. Previously it died at about 800k docs, now we're at 1.2m without any slowdown. Thank you so much. On 06/02/2020, 08:14, "Mikhail Khludnev" wrote: Hello, Karl. Please check these: https://eur03.safelinks.pro

Re: DataImportHandler SolrEntityProcessor configuration for local copy

2020-02-06 Thread Mikhail Khludnev
Hello, Karl. Please check these: https://lucene.apache.org/solr/guide/6_6/pagination-of-results.html#constraints-when-using-cursors https://lucene.apache.org/solr/guide/6_6/uploading-structured-data-store-data-with-the-data-import-handler.html#solrentityprocessor cursorMark="true" Good luck. On
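
For readers following the cursor advice above, here is a minimal sketch of the cursorMark paging loop as an external copier might run it. It is not the DIH implementation and not anyone's code from this thread. The names `cursor_params`, `copy_all`, `fetch_page` and `index_batch` are hypothetical, and the uniqueKey field is assumed to be `id`. What it relies on from the cursor constraints page: the first request sends `cursorMark=*`, the sort must include the uniqueKey field, `start` is left out, and paging ends when the returned `nextCursorMark` equals the cursorMark that was sent.

```python
from typing import Callable, Dict, List, Tuple

Doc = Dict[str, object]
# fetch_page takes request params and returns (docs, nextCursorMark).
FetchPage = Callable[[Dict[str, str]], Tuple[List[Doc], str]]
IndexBatch = Callable[[List[Doc]], None]


def cursor_params(cursor_mark: str, rows: int, unique_key: str = "id") -> Dict[str, str]:
    """Build query params for one cursor page (unique_key is an assumption)."""
    return {
        "q": "*:*",
        "sort": f"{unique_key} asc",  # cursors require the uniqueKey in the sort
        "rows": str(rows),
        "cursorMark": cursor_mark,    # no 'start' param when using a cursor
        "wt": "json",
    }


def copy_all(fetch_page: FetchPage, index_batch: IndexBatch, rows: int = 1000) -> int:
    """Page through the source with cursorMark and hand each page to the target."""
    cursor = "*"  # "*" asks for the first page
    copied = 0
    while True:
        docs, next_cursor = fetch_page(cursor_params(cursor, rows))
        if docs:
            index_batch(docs)
            copied += len(docs)
        if next_cursor == cursor:  # unchanged cursor means no more results
            return copied
        cursor = next_cursor
```

In a real copier, `fetch_page` would issue an HTTP request to the source core's `/select` handler and read `response.docs` and `nextCursorMark` from the JSON, and `index_batch` would post the documents to the target core. Keeping both as plain callables means the loop holds only one page in memory at a time and can be exercised without a running Solr.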

DataImportHandler SolrEntityProcessor configuration for local copy

2020-02-05 Thread Karl Stoney
Hey All, I'm trying to implement a simplistic reindex strategy to copy all of the data out of one collection, into another, on a single node (no distributed queries). It's approximately 4 million documents, with an index size of 26 GB. Based on your experience, I'm wondering what people feel sensible