Hi Stefan,
 
 Thank you for your reply.
 We have 31 segments, totalling 106 GB.
 
 Keren

Stefan Groschupf <[EMAIL PROTECTED]> wrote:
How many segments do you have, and how big are they?
Try a disk I/O measurement tool or script. What does it report?
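A minimal sketch of such a script, in case no dedicated tool is at hand. It times random block reads from one large file and reports the latency spread. The function names and the idea of pointing it at a file inside a segment directory are assumptions for illustration, not part of Nutch. Note that the operating system's page cache can hide the real disk latency for files that were read recently.

```python
import os
import random
import time


def measure_random_reads(path, reads=200, block_size=4096, seed=0):
    """Time `reads` random block reads from `path`; return latencies in seconds."""
    size = os.path.getsize(path)
    if size < block_size:
        raise ValueError("file is smaller than one block")
    rng = random.Random(seed)
    latencies = []
    # buffering=0 avoids Python-level buffering; the OS cache still applies.
    with open(path, "rb", buffering=0) as f:
        for _ in range(reads):
            offset = rng.randrange(0, size - block_size + 1)
            start = time.perf_counter()
            f.seek(offset)
            f.read(block_size)
            latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Return count, mean, 95th percentile and maximum latency in milliseconds."""
    ordered = sorted(latencies)
    return {
        "count": len(ordered),
        "mean_ms": 1000 * sum(ordered) / len(ordered),
        "p95_ms": 1000 * ordered[int(0.95 * (len(ordered) - 1))],
        "max_ms": 1000 * ordered[-1],
    }
```

Calling summarize(measure_random_reads(some_large_file)) on a file from one of the segments, while a slow search is running, would show whether single reads are taking unusually long.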


On 08.03.2006 at 17:38, Insurance Squared Inc. wrote:

> I appreciate your patience as we try to get over our search speed  
> issues.  We're getting closer - it seems we are having huge delays  
> when retrieving the summaries for the various search results.   
> Below are our logs from a search; you can see that retrieving some  
> of the search summaries took more than ten seconds. (I've  
> left in the comments from the developer.)
>
> As we continue to dig deeper, I was wondering if the folks here  
> that are more intimately familiar with the code had any immediate  
> reaction as to what the problem might be, given this additional info.
>
> We've pretty much ruled out Tomcat as the source: we installed  
> Resin and search speed was the same.
>
> Nutch 0.71 running on Linux, dual Xeon, 8 GB of RAM, 3x SCSI  
> drives in RAID 0.  Nothing else is running on the server.  The index has  
> about 4.5 million pages.
>
> Thanks!
>
> 060308 104251 11 query: term life insurance
> 060308 104251 11 searching for 20 raw hits
> 060308 104253 11 total hits: 20859
> 060308 104253 11 Keren: get hits.
> 060308 104253 11 Keren: get details.
> 060308 104253 11 Keren: get summary.
> 060308 104253 12 Keren: getSegment().
> 060308 104253 13 Keren: getSegment().
> 060308 104253 12 Keren: getDocNo().
> 060308 104253 14 Keren: getSegment().
> 060308 104253 14 Keren: getDocNo().
> 060308 104253 13 Keren: getDocNo().
> 060308 104253 12 Keren: getParseText().
> 060308 104253 15 Keren: getSegment().
> 060308 104253 17 Keren: getSegment().
> 060308 104253 13 Keren: getParseText().
> 060308 104253 15 Keren: getDocNo().
> 060308 104253 15 Keren: getParseText().
> 060308 104253 18 Keren: getSegment().
> 060308 104253 14 Keren: getParseText().
> 060308 104253 16 Keren: getSegment().
> 060308 104253 16 Keren: getDocNo().
> 060308 104253 18 Keren: getDocNo().
> 060308 104253 16 Keren: getParseText().
> 060308 104253 17 Keren: getDocNo().
> 060308 104253 17 Keren: getParseText().
> 060308 104253 19 Keren: getSegment().
> 060308 104253 18 Keren: getParseText().
> 060308 104253 20 Keren: getSegment().
> 060308 104253 21 Keren: getSegment().
> 060308 104253 20 Keren: getDocNo().
> 060308 104253 21 Keren: getDocNo().
> 060308 104253 20 Keren: getParseText().
> 060308 104253 21 Keren: getParseText().
> 060308 104253 19 Keren: getDocNo().
> 060308 104253 19 Keren: getParseText().
> 060308 104253 19 Keren: getText().
> 060308 104253 19 Keren: Summarizer().getSummary. text length=3288
> 060308 104254 19 found resource common-terms.utf8 at file:/var/jakarta-tomcat-4.1.31/webapps/ROOT/WEB-INF/classes/common-terms.utf8
> 060308 104254 18 Keren: getText().
> 060308 104254 18 Keren: Summarizer().getSummary. text length=4770
> 060308 104254 12 Keren: getText().
> 060308 104254 12 Keren: Summarizer().getSummary. text length=9442
> 060308 104257 20 Keren: getText().
> 060308 104257 20 Keren: Summarizer().getSummary. text length=4162
> 060308 104302 14 Keren: getText().
> 060308 104302 14 Keren: Summarizer().getSummary. text length=9364
> 060308 104302 13 Keren: getText().
> 060308 104302 13 Keren: Summarizer().getSummary. text length=9140
> 060308 104303 21 Keren: getText().
> 060308 104303 21 Keren: Summarizer().getSummary. text length=1107
> 060308 104304 17 Keren: getText().
> 060308 104304 17 Keren: Summarizer().getSummary. text length=3315
> 060308 104305 15 Keren: getText().
> 060308 104305 15 Keren: Summarizer().getSummary. text length=3261
> 060308 104305 16 Keren: getText().
> 060308 104305 16 Keren: Summarizer().getSummary. text length=492
> 060308 104305 11 Keren: get requestURL.
> 060308 104305 11 Keren: start try.
> 060308 104305 11 Keren: start detail.
> 060308 104305 11 Keren: detail: 0
> 060308 104305 11 Keren: detail: 1
> 060308 104305 11 Keren: detail: 2
> 060308 104305 11 Keren: detail: 3
> 060308 104305 11 Keren: detail: 4
> 060308 104305 11 Keren: detail: 5
> 060308 104305 11 Keren: detail: 6
> 060308 104305 11 Keren: detail: 7
> 060308 104305 11 Keren: detail: 8
> 060308 104305 11 Keren: detail: 9
> 060308 104306 11 Keren: doGet done.
>
> There are 10 threads fetching the summaries. After these threads are done,  
> it returns the search results as RSS. Let's look at the threads separately:
>
> 060308 104253 12 Keren: getSegment().
> 060308 104253 12 Keren: getDocNo().
> 060308 104253 12 Keren: getParseText().
> 060308 104254 12 Keren: getText().
> 060308 104254 12 Keren: Summarizer().getSummary. text length=9442
>
> Thread 12 took 1 second to get the parse text.
>
> 060308 104253 13 Keren: getSegment().
> 060308 104253 13 Keren: getDocNo().
> 060308 104253 13 Keren: getParseText().
> 060308 104302 13 Keren: getText().
> 060308 104302 13 Keren: Summarizer().getSummary. text length=9140
>
> Thread 13 took 9 seconds to get the parse text.
>
> 060308 104253 14 Keren: getSegment().
> 060308 104253 14 Keren: getDocNo().
> 060308 104253 14 Keren: getParseText().
> 060308 104302 14 Keren: getText().
> 060308 104302 14 Keren: Summarizer().getSummary. text length=9364
>
> Thread 14 took 9 seconds to get the parse text.
>
> 060308 104253 15 Keren: getSegment().
> 060308 104253 15 Keren: getDocNo().
> 060308 104253 15 Keren: getParseText().
> 060308 104305 15 Keren: getText().
> 060308 104305 15 Keren: Summarizer().getSummary. text length=3261
>
> Thread 15 took 12 seconds to get the parse text.
>
> 060308 104253 16 Keren: getSegment().
> 060308 104253 16 Keren: getDocNo().
> 060308 104253 16 Keren: getParseText().
> 060308 104305 16 Keren: getText().
> 060308 104305 16 Keren: Summarizer().getSummary. text length=492
>
> Thread 16 took 12 seconds to get the parse text.
>
> 060308 104253 17 Keren: getSegment().
> 060308 104253 17 Keren: getDocNo().
> 060308 104253 17 Keren: getParseText().
> 060308 104304 17 Keren: getText().
> 060308 104304 17 Keren: Summarizer().getSummary. text length=3315
>
> Thread 17 took 11 seconds to get the parse text.
>
> 060308 104253 18 Keren: getSegment().
> 060308 104253 18 Keren: getDocNo().
> 060308 104253 18 Keren: getParseText().
> 060308 104254 18 Keren: getText().
> 060308 104254 18 Keren: Summarizer().getSummary. text length=4770
>
> Thread 18 took 1 second to get the parse text.
>
> 060308 104253 19 Keren: getSegment().
> 060308 104253 19 Keren: getParseText().
> 060308 104253 19 Keren: getText().
> 060308 104253 19 Keren: Summarizer().getSummary. text length=3288
>
> Thread 19 took less than 1 second to get the parse text.
>
> I think the problem lies in how these 10 concurrent threads run.  
> I'm not sure they really run concurrently. Thread 16 has  
> the smallest text length, 492, yet it took 12 seconds.
>
>
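The per-thread breakdown above can also be computed from the log automatically. The following is a rough sketch, assuming the log keeps the format shown (date, time, thread id, then the developer's "Keren:" comment); the function name parse_text_delays is made up for illustration. Timestamps have one-second resolution, so each result can be off by up to a second.

```python
import re
from datetime import datetime

# Matches e.g. "060308 104253 12 Keren: getParseText()." with an optional "> " quote prefix.
LINE = re.compile(r"^(?:>\s*)?(\d{6} \d{6}) (\d+) Keren: (\w+)\(\)\.")


def parse_text_delays(log_text):
    """Return {thread_id: whole seconds between getParseText() and getText()}."""
    started = {}
    delays = {}
    for raw in log_text.splitlines():
        m = LINE.match(raw.strip())
        if not m:
            continue  # skip lines that are not per-thread call markers
        stamp = datetime.strptime(m.group(1), "%y%m%d %H%M%S")
        thread, call = int(m.group(2)), m.group(3)
        if call == "getParseText":
            started[thread] = stamp
        elif call == "getText" and thread in started:
            delays[thread] = int((stamp - started.pop(thread)).total_seconds())
    return delays


sample = """\
060308 104253 12 Keren: getParseText().
060308 104253 13 Keren: getParseText().
060308 104254 12 Keren: getText().
060308 104302 13 Keren: getText().
"""
print(parse_text_delays(sample))  # {12: 1, 13: 9}
```

Run over the full log, this would list the delay for every summary thread side by side with no manual grouping, which makes it easier to compare runs after a configuration change.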

---------------------------------------------------------------
company:        http://www.media-style.com
forum:        http://www.text-mining.org
blog:            http://www.find23.net




                