Hi Stefan,
Thank you for your reply.
We have 31 segments, totaling 106 GB.
Keren
Stefan Groschupf <[EMAIL PROTECTED]> wrote:
How many segments do you have, and how big are they?
Try a disk I/O measurement tool or script. What does it say?
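
For reference, a disk I/O measurement script of the kind suggested here could look like the minimal Python sketch below. It times random block reads from a single file, for example one of the large data files inside a segment directory. The function names `measure_random_reads` and `summarize`, the 4096-byte block size, and the 200-read default are illustrative assumptions, not part of Nutch or of any existing tool. The operating system's page cache can make repeated runs look much faster than a cold read, so treat the numbers as a rough indication only.

```python
import os
import random
import sys
import time


def measure_random_reads(path, reads=200, block_size=4096, seed=None):
    """Time `reads` random block reads from `path`; return latencies in seconds."""
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError("file is empty: " + path)
    rng = random.Random(seed)
    latencies = []
    # buffering=0 disables Python's own buffering; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        for _ in range(reads):
            # Pick an offset that leaves room for a full block when possible.
            offset = rng.randrange(max(1, size - block_size + 1))
            start = time.perf_counter()
            f.seek(offset)
            f.read(block_size)
            latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Return (mean, worst) latency in milliseconds."""
    mean_ms = 1000.0 * sum(latencies) / len(latencies)
    worst_ms = 1000.0 * max(latencies)
    return mean_ms, worst_ms


if __name__ == "__main__":
    if len(sys.argv) > 1:
        mean_ms, worst_ms = summarize(measure_random_reads(sys.argv[1]))
        print("mean %.2f ms, worst %.2f ms" % (mean_ms, worst_ms))
```

Run it with the path of a large file as the only argument. Worst-case latencies in the hundreds of milliseconds or more would point at the disks rather than at the search code.
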
On 08.03.2006 at 17:38, Insurance Squared Inc. wrote:
> I appreciate your patience as we try to get over our search speed
> issues. We're getting closer - it seems we are having huge delays
> when retrieving the summaries for the various search results.
> Below are our logs from a search, you can see that retrieving some
> of the search summaries took double-digit seconds. (I've
> left in the comments from the developer.)
>
> As we continue to dig deeper, I was wondering if the folks here
> that are more intimately familiar with the code had any immediate
> reaction as to what the problem might be, given this additional info.
>
> We've pretty much ruled out Tomcat as the source; we installed
> Resin and search speed was the same.
>
> Nutch 0.71 running on Linux, dual Xeon, 8 GB of RAM, 3x SCSI
> drives in RAID 0. Nothing else is running on the server. The index has
> about 4.5 million pages.
>
> Thanks!
>
> 060308 104251 11 query: term life insurance
> 060308 104251 11 searching for 20 raw hits
> 060308 104253 11 total hits: 20859
> 060308 104253 11 Keren: get hits.
> 060308 104253 11 Keren: get details.
> 060308 104253 11 Keren: get summary.
> 060308 104253 12 Keren: getSegment().
> 060308 104253 13 Keren: getSegment().
> 060308 104253 12 Keren: getDocNo().
> 060308 104253 14 Keren: getSegment().
> 060308 104253 14 Keren: getDocNo().
> 060308 104253 13 Keren: getDocNo().
> 060308 104253 12 Keren: getParseText().
> 060308 104253 15 Keren: getSegment().
> 060308 104253 17 Keren: getSegment().
> 060308 104253 13 Keren: getParseText().
> 060308 104253 15 Keren: getDocNo().
> 060308 104253 15 Keren: getParseText().
> 060308 104253 18 Keren: getSegment().
> 060308 104253 14 Keren: getParseText().
> 060308 104253 16 Keren: getSegment().
> 060308 104253 16 Keren: getDocNo().
> 060308 104253 18 Keren: getDocNo().
> 060308 104253 16 Keren: getParseText().
> 060308 104253 17 Keren: getDocNo().
> 060308 104253 17 Keren: getParseText().
> 060308 104253 19 Keren: getSegment().
> 060308 104253 18 Keren: getParseText().
> 060308 104253 20 Keren: getSegment().
> 060308 104253 21 Keren: getSegment().
> 060308 104253 20 Keren: getDocNo().
> 060308 104253 21 Keren: getDocNo().
> 060308 104253 20 Keren: getParseText().
> 060308 104253 21 Keren: getParseText().
> 060308 104253 19 Keren: getDocNo().
> 060308 104253 19 Keren: getParseText().
> 060308 104253 19 Keren: getText().
> 060308 104253 19 Keren: Summarizer().getSummary. text length=3288
> 060308 104254 19 found resource common-terms.utf8 at file:/var/jakarta-tomcat-4.1.31/webapps/ROOT/WEB-INF/classes/common-terms.utf8
> 060308 104254 18 Keren: getText().
> 060308 104254 18 Keren: Summarizer().getSummary. text length=4770
> 060308 104254 12 Keren: getText().
> 060308 104254 12 Keren: Summarizer().getSummary. text length=9442
> 060308 104257 20 Keren: getText().
> 060308 104257 20 Keren: Summarizer().getSummary. text length=4162
> 060308 104302 14 Keren: getText().
> 060308 104302 14 Keren: Summarizer().getSummary. text length=9364
> 060308 104302 13 Keren: getText().
> 060308 104302 13 Keren: Summarizer().getSummary. text length=9140
> 060308 104303 21 Keren: getText().
> 060308 104303 21 Keren: Summarizer().getSummary. text length=1107
> 060308 104304 17 Keren: getText().
> 060308 104304 17 Keren: Summarizer().getSummary. text length=3315
> 060308 104305 15 Keren: getText().
> 060308 104305 15 Keren: Summarizer().getSummary. text length=3261
> 060308 104305 16 Keren: getText().
> 060308 104305 16 Keren: Summarizer().getSummary. text length=492
> 060308 104305 11 Keren: get requestURL.
> 060308 104305 11 Keren: start try.
> 060308 104305 11 Keren: start detail.
> 060308 104305 11 Keren: detail: 0
> 060308 104305 11 Keren: detail: 1
> 060308 104305 11 Keren: detail: 2
> 060308 104305 11 Keren: detail: 3
> 060308 104305 11 Keren: detail: 4
> 060308 104305 11 Keren: detail: 5
> 060308 104305 11 Keren: detail: 6
> 060308 104305 11 Keren: detail: 7
> 060308 104305 11 Keren: detail: 8
> 060308 104305 11 Keren: detail: 9
> 060308 104306 11 Keren: doGet done.
>
> There are 10 threads that fetch the summaries. After these threads are done,
> it returns the search results as RSS. Let's look at the threads separately:
>
> 060308 104253 12 Keren: getSegment().
> 060308 104253 12 Keren: getDocNo().
> 060308 104253 12 Keren: getParseText().
> 060308 104254 12 Keren: getText().
> 060308 104254 12 Keren: Summarizer().getSummary. text length=9442
>
> Thread 12 took 1 second to get the parse text.
>
> 060308 104253 13 Keren: getSegment().
> 060308 104253 13 Keren: getDocNo().
> 060308 104253 13 Keren: getParseText().
> 060308 104302 13 Keren: getText().
> 060308 104302 13 Keren: Summarizer().getSummary. text length=9140
>
> Thread 13 took 9 seconds to get the parse text.
>
> 060308 104253 14 Keren: getSegment().
> 060308 104253 14 Keren: getDocNo().
> 060308 104253 14 Keren: getParseText().
> 060308 104302 14 Keren: getText().
> 060308 104302 14 Keren: Summarizer().getSummary. text length=9364
>
> Thread 14 took 9 seconds to get the parse text.
>
> 060308 104253 15 Keren: getSegment().
> 060308 104253 15 Keren: getDocNo().
> 060308 104253 15 Keren: getParseText().
> 060308 104305 15 Keren: getText().
> 060308 104305 15 Keren: Summarizer().getSummary. text length=3261
>
> Thread 15 took 12 seconds to get the parse text.
>
> 060308 104253 16 Keren: getSegment().
> 060308 104253 16 Keren: getDocNo().
> 060308 104253 16 Keren: getParseText().
> 060308 104305 16 Keren: getText().
> 060308 104305 16 Keren: Summarizer().getSummary. text length=492
>
> Thread 16 took 12 seconds to get the parse text.
>
> 060308 104253 17 Keren: getSegment().
> 060308 104253 17 Keren: getDocNo().
> 060308 104253 17 Keren: getParseText().
> 060308 104304 17 Keren: getText().
> 060308 104304 17 Keren: Summarizer().getSummary. text length=3315
>
> Thread 17 took 11 seconds to get the parse text.
>
> 060308 104253 18 Keren: getSegment().
> 060308 104253 18 Keren: getDocNo().
> 060308 104253 18 Keren: getParseText().
> 060308 104254 18 Keren: getText().
> 060308 104254 18 Keren: Summarizer().getSummary. text length=4770
>
> Thread 18 took 1 second to get the parse text.
>
> 060308 104253 19 Keren: getSegment().
> 060308 104253 19 Keren: getDocNo().
> 060308 104253 19 Keren: getParseText().
> 060308 104253 19 Keren: getText().
> 060308 104253 19 Keren: Summarizer().getSummary. text length=3288
>
> Thread 19 took less than 1 second to get the parse text.
>
> I think the problem is in how these 10 concurrent threads run.
> I'm not sure they really run concurrently. Thread 16 has the
> smallest text length, 492, yet it took 12 seconds.
>
>
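
The per-thread delays worked out by hand in the quoted message can also be computed from the log. The sketch below is a minimal Python example, assuming the log format shown in this thread (`yymmdd HHMMSS thread message`) and the developer's `getParseText().` and `getText().` markers; `parse_line`, `parse_text_delays`, and `SAMPLE` are hypothetical names introduced for illustration. Because the timestamps have one-second resolution, each result is only accurate to about a second.

```python
from datetime import datetime

# A few lines copied from the quoted log (threads 12, 13, and 16).
SAMPLE = [
    "> 060308 104253 12 Keren: getParseText().",
    "> 060308 104253 13 Keren: getParseText().",
    "> 060308 104253 16 Keren: getParseText().",
    "> 060308 104254 12 Keren: getText().",
    "> 060308 104302 13 Keren: getText().",
    "> 060308 104305 16 Keren: getText().",
]


def parse_line(line):
    """Split 'yymmdd HHMMSS thread message' into (timestamp, thread, message)."""
    line = line.lstrip("> ").strip()  # drop the mail quote prefix, if any
    date, clock, thread, message = line.split(None, 3)
    stamp = datetime.strptime(date + clock, "%y%m%d%H%M%S")
    return stamp, int(thread), message


def parse_text_delays(lines):
    """Seconds between 'getParseText().' and 'getText().' for each thread."""
    started = {}
    delays = {}
    for line in lines:
        try:
            stamp, thread, message = parse_line(line)
        except ValueError:
            continue  # skip wrapped, empty, or non-log lines
        if "getParseText()" in message:
            started[thread] = stamp
        elif "getText()" in message and thread in started:
            elapsed = stamp - started.pop(thread)
            delays[thread] = int(elapsed.total_seconds())
    return delays


if __name__ == "__main__":
    for thread, seconds in sorted(parse_text_delays(SAMPLE).items()):
        print("thread %d: %d s" % (thread, seconds))
```

For the three sample threads this gives 1, 9, and 12 seconds, matching the figures in the message. Feeding it the full log makes it easy to see whether the slow reads cluster on particular threads or segments.
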
---------------------------------------------------------------
company: http://www.media-style.com
forum: http://www.text-mining.org
blog: http://www.find23.net