Re: Data Import Handler takes different time on different machines

2016-02-03 Thread Troy Edwards
While researching the space on the servers, I found that log files from Sept 2015 are still there. These are solr_gc_log_datetime and solr_log_datetime. Is the default logging for Solr ok for production systems or does it need to be changed/tuned? Thanks, On Tue, Feb 2, 2016 at 2:04 PM, Troy

Re: Data Import Handler takes different time on different machines

2016-02-02 Thread Erick Erickson
Scratch that installation and start over? Really, it sounds like something is fundamentally messed up with the Linux install. Perhaps something as simple as file paths, or you have old jars hanging around that are mis-matched. Or someone manually deleted files from the Solr install. Or your disk

Re: Data Import Handler takes different time on different machines

2016-02-02 Thread Troy Edwards
That is helpful! Thank you for the thoughts. On Tue, Feb 2, 2016 at 12:17 PM, Erick Erickson wrote: > Scratch that installation and start over? > > Really, it sounds like something is fundamentally messed up with the > Linux install. Perhaps something as simple as file

Re: Data Import Handler takes different time on different machines

2016-02-02 Thread Troy Edwards
Rerunning the Data Import Handler on the Linux machine has started producing some errors and warnings: On the node on which DIH was started: WARN SolrWriter Error creating document : SolrInputDocument org.apache.solr.common.SolrException: No registered leader was found after waiting

Re: Data Import Handler takes different time on different machines

2016-02-01 Thread Erick Erickson
What happens if you run just the SQL query from the windows box and from the linux box? Is there any chance that somehow the connection from the linux box is just slower? Best, Erick On Mon, Feb 1, 2016 at 6:36 PM, Alexandre Rafalovitch wrote: > What are you importing from?
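
One way to act on the suggestion above is to time the bare query outside Solr on each box and compare the numbers. What follows is only a minimal sketch, not anything posted in the thread: `time_query` is a hypothetical helper, and the in-memory SQLite table is a stand-in for the real source database, which the thread does not name. On the actual machines you would pass in a connection from whatever DB-API driver matches your database and the same SQL that the Data Import Handler runs.

```python
import sqlite3
import time


def time_query(conn, sql, fetch_size=1000):
    """Run sql on a DB-API connection, fetch every row in batches,
    and return (row_count, elapsed_seconds).

    Hypothetical helper for illustration; fetch_size only controls how
    many rows are pulled per fetchmany() call on the client side.
    """
    start = time.perf_counter()
    cur = conn.cursor()
    cur.execute(sql)
    rows = 0
    while True:
        batch = cur.fetchmany(fetch_size)
        if not batch:
            break
        rows += len(batch)
    elapsed = time.perf_counter() - start
    cur.close()
    return rows, elapsed


if __name__ == "__main__":
    # Stand-in data source: an in-memory SQLite table (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO items (name) VALUES (?)",
        [(f"item-{i}",) for i in range(10000)],
    )
    rows, elapsed = time_query(conn, "SELECT id, name FROM items")
    print(f"fetched {rows} rows in {elapsed:.3f} s")
```

If the same query is markedly slower from the Linux box than from the Windows box, the difference most likely lies in the database connection or the network rather than in Solr.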

Data Import Handler takes different time on different machines

2016-02-01 Thread Troy Edwards
We have a Windows development machine on which the Data Import Handler consistently takes about 40 mins to finish. Queries run fine. JVM memory is 2 GB per node. But on a Linux machine it consistently takes about 2.5 hours. The queries also run slower. JVM memory here is also 2 GB per node. How

Re: Data Import Handler takes different time on different machines

2016-02-01 Thread Alexandre Rafalovitch
What are you importing from? Are the source and the Solr machine collocated in the same fashion on dev and prod? Have you tried running this on a Linux dev machine? Perhaps your prod machine is loaded much more than dev. Regards, Alex. Newsletter and resources for Solr beginners and

Re: Data Import Handler takes different time on different machines

2016-02-01 Thread Erick Erickson
The first thing I'd be looking at is how the JDBC batch size compares between the two machines. AFAIK, Solr shouldn't notice the difference, and since a large majority of the development is done on Linux-based systems, I'd be surprised if this was worse than Windows, which would lead me to

Re: Data Import Handler takes different time on different machines

2016-02-01 Thread Troy Edwards
Sorry, I should explain further. The Data Import Handler had been running for a while retrieving only about 15 records from the database. In both the development environment (Windows) and on the Linux machine it took about 3 mins. The query has been changed and we are now trying to retrieve about 10 million