Re: High disk IO during UpdateCSV

2013-11-13 Thread Utkarsh Sengar
Bumping this one again, any suggestions? On Tue, Nov 12, 2013 at 3:58 PM, Utkarsh Sengar utkarsh2...@gmail.com wrote: Hello, I load data from csv to solr via UpdateCSV. There are about 50M documents with 10 columns in each document. The index size is about 15GB and I am using a 3 node

Re: High disk IO during UpdateCSV

2013-11-13 Thread Michael Della Bitta
Utkarsh, Your screenshot didn't come through. I don't think this list allows attachments. Maybe put it up on imgur or something? I'm a little unclear on whether you're using Solr in Cloud mode, or with a single master. Michael Della Bitta Applications Developer o: +1 646 532 3062 | c: +1 917

Re: High disk IO during UpdateCSV

2013-11-13 Thread Utkarsh Sengar
Hi Michael, I am using SolrCloud 4.5, and UpdateCSV loads data to one of these nodes. Attachment: http://i.imgur.com/1xmoNtt.png Thanks, -Utkarsh On Wed, Nov 13, 2013 at 8:33 AM, Michael Della Bitta michael.della.bi...@appinions.com wrote: Utkarsh, Your screenshot didn't come through.

Re: High disk IO during UpdateCSV

2013-11-13 Thread Walter Underwood
Don't load 50M documents in one shot. Break it up into reasonable chunks (100K?) with commits at each point. You will have a bottleneck somewhere, usually disk or CPU. Yours appears to be disk. If you get faster disks, it might become the CPU. wunder On Nov 13, 2013, at 8:22 AM, Utkarsh

Re: High disk IO during UpdateCSV

2013-11-13 Thread Utkarsh Sengar
Thanks guys! I will start by splitting the file into chunks of 5M (10 chunks) and reduce the size if needed. Thanks, -Utkarsh On Wed, Nov 13, 2013 at 9:08 AM, Walter Underwood wun...@wunderwood.org wrote: Don't load 50M documents in one shot. Break it up into reasonable chunks (100K?)
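
For anyone finding this thread later, the chunk-and-commit approach suggested above could be sketched roughly as follows. This is only a minimal sketch, not what anyone in the thread actually ran: the function names iter_csv_chunks and post_chunk are made up for illustration, the update URL is whatever your own collection exposes (for example a /update/csv handler on one node), and the Content-Type value and the commit=true parameter are assumptions to check against your Solr version. It also assumes the first line of the CSV is a header row that has to be repeated in every chunk.

```python
import csv
import io
import itertools
import urllib.request


def iter_csv_chunks(lines, chunk_size):
    """Yield CSV text chunks holding at most chunk_size data rows each.

    The header row (assumed to be the first row) is repeated at the
    top of every chunk so each chunk can be loaded on its own.
    """
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:
        return  # empty input, nothing to yield
    while True:
        rows = list(itertools.islice(reader, chunk_size))
        if not rows:
            break
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)
        yield buf.getvalue()


def post_chunk(chunk_text, update_url):
    """POST one chunk to a Solr CSV update handler and commit.

    update_url is a placeholder for your own handler URL; the
    commit=true parameter and Content-Type are assumptions.
    """
    request = urllib.request.Request(
        update_url + "?commit=true",
        data=chunk_text.encode("utf-8"),
        headers={"Content-Type": "application/csv; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

To use it, open the file with open(path, newline="", encoding="utf-8"), loop over iter_csv_chunks(f, 100000), and call post_chunk on each chunk in turn. Committing once per chunk follows the advice above; the chunk size is a knob to tune against whatever disk or CPU bottleneck you see.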

High disk IO during UpdateCSV

2013-11-12 Thread Utkarsh Sengar
Hello, I load data from csv to solr via UpdateCSV. There are about 50M documents with 10 columns in each document. The index size is about 15GB and I am using a 3 node distributed solr cluster. While loading the data the disk IO goes to 100%. If the load balancer in front of solr hits the