Best-practice indexing doesn't create intermediate files at all: an indexing program constructs the documents in memory and posts them straight to Solr. There are Java, Ruby, Python, etc. clients to help you talk to Solr over HTTP.
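As a rough illustration of that approach, here is a minimal Python sketch that builds documents in memory and posts them to Solr's XML update handler over HTTP. The Solr URL and the field names are placeholder assumptions for illustration, not anything from this thread:

    import urllib.request
    from xml.sax.saxutils import escape

    SOLR_UPDATE_URL = "http://localhost:8983/solr/update"  # adjust to your install

    def post_xml(xml):
        # POST a Solr XML update message and return the raw response body.
        req = urllib.request.Request(
            SOLR_UPDATE_URL,
            data=xml.encode("utf-8"),
            headers={"Content-Type": "text/xml; charset=utf-8"},
        )
        return urllib.request.urlopen(req).read()

    # Build documents in memory (e.g. straight from a database cursor) and
    # send them in batches instead of writing one file per document.
    docs = [{"id": "1", "title": "first doc"}, {"id": "2", "title": "second doc"}]
    adds = "".join(
        "<doc>%s</doc>" % "".join(
            '<field name="%s">%s</field>' % (name, escape(str(value)))
            for name, value in doc.items()
        )
        for doc in docs
    )
    post_xml("<add>%s</add>" % adds)
    post_xml("<commit/>")  # make the new documents searchable

Batching many <doc> elements into each <add> request keeps the number of HTTP round trips low, which helps at the million-document scale.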
If you don't want to do any programming and your data is in a database, using a CSV dump may be the next best option.

-Yonik

On Jan 9, 2008 9:11 AM, Gunther, Andrew <[EMAIL PROTECTED]> wrote:
> Is there a practical reason behind trying to post 1m different files
> instead of several large files? If this is a Unix setup, can you try
> post.sh instead?
>
> -----Original Message-----
> From: zqzuk [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, January 09, 2008 6:14 AM
> To: solr-user@lucene.apache.org
> Subject: Using the post tool - too many files in a folder?
>
> Hi, I am using the post.jar tool to post files to Solr. I'd like to post
> everything in a folder, e.g. "myfolder", so I typed in the command:
>
> java -jar post.jar c:/myfolder/*.xml
>
> This works perfectly when I test on a sample of 100k XML files. But when I
> work on the real dataset, there are over 1m files in the folder, and when I
> type in the same command and hit enter, the program hangs and there is no
> response after a long while.
>
> Is it because there are too many files? What is the best practice? (Should I
> separate the 1 million files into 100 subfolders and do the posting from
> those folders separately?)
>
> Many thanks!
> --
> View this message in context:
> http://www.nabble.com/Using-the-post-tool---too-many-files-in-a-folder--tp14709773p14709773.html
> Sent from the Solr - User mailing list archive at Nabble.com.
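For the CSV route suggested above (no programming beyond dumping the database), a sketch along these lines should work, assuming the CSV request handler is enabled at /update/csv as in the stock example solrconfig.xml; the file name and URL are placeholders:

    import urllib.request

    CSV_UPDATE_URL = "http://localhost:8983/solr/update/csv?commit=true"

    # The dump should have one header row naming the Solr fields, then one
    # row per document; the handler maps columns to fields by that header.
    with open("dump.csv", "rb") as f:
        req = urllib.request.Request(
            CSV_UPDATE_URL,
            data=f.read(),  # reads the whole dump into memory; fine for a sketch
            headers={"Content-Type": "text/csv; charset=utf-8"},
        )
        print(urllib.request.urlopen(req).read())

One CSV file (or a handful of them) also sidesteps the original problem of handing post.jar a million separate files.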