1 - I'm using Commons Digester as the XML parser; how can I find the bottleneck? Should I run the code with the Lucene query part commented out, leaving just the XML parsing?
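A simple alternative to commenting code out is to time the two phases separately. A minimal sketch, where parseQueries() and runSearches() are hypothetical stand-ins for the Digester parsing and the Lucene searching code (not from the original post):

    import java.util.Collections;
    import java.util.List;

    public class Profile {
        public static void main(String[] args) throws Exception {
            long t0 = System.nanoTime();
            List<String> queries = parseQueries("queries.xml"); // Digester only
            long t1 = System.nanoTime();
            runSearches(queries);                               // Lucene only
            long t2 = System.nanoTime();
            System.out.printf("parse: %.1f s, search: %.1f s%n",
                    (t1 - t0) / 1e9, (t2 - t1) / 1e9);
        }

        // Stubs standing in for the real Digester and Lucene code.
        static List<String> parseQueries(String path) { return Collections.emptyList(); }
        static void runSearches(List<String> queries) {}
    }

Whichever phase dominates the wall-clock total is the one worth optimizing.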
2 - I actually also wanted to know the following: how long does it take to run a 100MB queries text file against each single document of a 100MB collection? On an Intel Core 2 Duo with 4GB RAM? Are we talking about a few hours? Can I have an estimate? thanks

On 29 March 2011 11:43, Ian Lea <ian....@gmail.com> wrote:

> You need to figure out what is taking the time, for example by reading
> the XML file without making any Lucene queries. What XML parsing
> process are you using? Some are faster than others. A Google search
> should find loads of info.
>
> If it turns out that it is Lucene searching taking most of the time,
> see http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
>
> But do the figuring out first - there is little point in speeding up
> the bit that is already quick.
>
> --
> Ian.
>
> On Tue, Mar 29, 2011 at 10:22 AM, Patrick Diviacco
> <patrick.divia...@gmail.com> wrote:
> > hi,
> >
> > I am performing multiple queries (stored in a 100MB XML file) against a
> > collection (indexed with Lucene; it was previously stored in a 100MB XML
> > file).
> >
> > The process seems pretty long on my machine (more than 2 hours), so I was
> > wondering if importing the 100MB queries XML file into a MySQL database
> > and extracting them with Java would dramatically improve performance
> > (rather than working with Java + an XML text file).
> >
> > thanks
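For what it's worth, one of the main tips on the ImproveSearchingSpeed page Ian linked is to open the IndexReader and IndexSearcher once and reuse them for every query, rather than reopening per search. A minimal sketch against the Lucene 3.x API current at the time; the index path, field name, and loadQueries() are illustrative assumptions, not details from the thread:

    import java.io.File;
    import java.util.Collections;
    import java.util.List;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    public class SearchAll {
        public static void main(String[] args) throws Exception {
            // Open the reader, searcher, and parser ONCE, outside the query loop.
            IndexReader reader = IndexReader.open(FSDirectory.open(new File("index")));
            IndexSearcher searcher = new IndexSearcher(reader);
            QueryParser parser = new QueryParser(Version.LUCENE_30, "contents",
                    new StandardAnalyzer(Version.LUCENE_30));

            for (String q : loadQueries()) {   // hypothetical query source
                TopDocs hits = searcher.search(parser.parse(q), 10);
                // ... record hits.totalHits, top doc ids, etc. ...
            }
            searcher.close();
            reader.close();
        }

        static List<String> loadQueries() { return Collections.emptyList(); } // stub
    }

If the searcher is already shared like this, the per-phase timing sketch above should show where the two-plus hours actually go; no meaningful runtime estimate is possible without knowing the number and complexity of the queries.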