OK, here is what I did.
 
I split 5 million pages across 8 segments.  They are all indexed, and when I go to load them, 0-3 work fine.  The only problem I had in 3 was that the fetcher seemed to stall while parsing PDF files.  When I load anything beyond 3, I get this:
 

HTTP Status 500 -


type Exception report

message

description The server encountered an internal error () that prevented it from fulfilling this request.

exception

javax.servlet.ServletException
	org.apache.jasper.runtime.PageContextImpl.doHandlePageException(PageContextImpl.java:825)
	org.apache.jasper.runtime.PageContextImpl.handlePageException(PageContextImpl.java:758)
	org.apache.jsp.search_jsp._jspService(search_jsp.java:495)
	org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:94)
	javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
	org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:324)
	org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:292)
	org.apache.jasper.servlet.JspServlet.service(JspServlet.java:236)
	javax.servlet.http.HttpServlet.service(HttpServlet.java:802)

root cause

java.lang.OutOfMemoryError

note The full stack trace of the root cause is available in the Apache Tomcat/5.0.28 logs.


Apache Tomcat/5.0.28

 
These are the segments that work:
 
19M     segments/20040829092114/fetchlist
21M     segments/20040829092114/fetcher
244M    segments/20040829092114/content
82M     segments/20040829092114/parse_text
143M    segments/20040829092114/parse_data
119M    segments/20040829092114/index
625M    segments/20040829092114
26M     segments/20040829122947/fetchlist
32M     segments/20040829122947/fetcher
646M    segments/20040829122947/content
217M    segments/20040829122947/parse_text
358M    segments/20040829122947/parse_data
297M    segments/20040829122947/index
1.6G    segments/20040829122947
70M     segments/20040829124357/fetchlist
84M     segments/20040829124357/fetcher
2.2G    segments/20040829124357/content
811M    segments/20040829124357/parse_text
1.5G    segments/20040829124357/parse_data
1.1G    segments/20040829124357/index
5.6G    segments/20040829124357
90M     segments/20040829130541/fetchlist
102M    segments/20040829130541/fetcher
977M    segments/20040829130541/content
301M    segments/20040829130541/parse_text
547M    segments/20040829130541/parse_data
428M    segments/20040829130541/index
2.4G    segments/20040829130541
28M     segments/20040829212107/fetchlist
34M     segments/20040829212107/fetcher
757M    segments/20040829212107/content
256M    segments/20040829212107/parse_text
502M    segments/20040829212107/parse_data
344M    segments/20040829212107/index
1.9G    segments/20040829212107
28M     segments/20040829225928/fetchlist
33M     segments/20040829225928/fetcher
694M    segments/20040829225928/content
243M    segments/20040829225928/parse_text
398M    segments/20040829225928/parse_data
329M    segments/20040829225928/index
1.7G    segments/20040829225928
29M     segments/20040830042947/fetchlist
35M     segments/20040830042947/fetcher
890M    segments/20040830042947/content
292M    segments/20040830042947/parse_text
673M    segments/20040830042947/parse_data
376M    segments/20040830042947/index
2.3G    segments/20040830042947
29M     segments/20040830043001/fetchlist
35M     segments/20040830043001/fetcher
893M    segments/20040830043001/content
293M    segments/20040830043001/parse_text
678M    segments/20040830043001/parse_data
377M    segments/20040830043001/index
2.3G    segments/20040830043001
29M     segments/20040830065943/fetchlist
35M     segments/20040830065943/fetcher
872M    segments/20040830065943/content
292M    segments/20040830065943/parse_text
661M    segments/20040830065943/parse_data
377M    segments/20040830065943/index
2.3G    segments/20040830065943
154M    segments/20040830111830/fetchlist
183M    segments/20040830111830/fetcher
5.1G    segments/20040830111830/content
1.7G    segments/20040830111830/parse_text
3.9G    segments/20040830111830/parse_data
2.2G    segments/20040830111830/index
13G     segments/20040830111830
12M     segments/20040904035557-0/fetchlist
15M     segments/20040904035557-0/fetcher
316M    segments/20040904035557-0/content
122M    segments/20040904035557-0/parse_text
203M    segments/20040904035557-0/parse_data
163M    segments/20040904035557-0/index
829M    segments/20040904035557-0
13M     segments/20040904035557-1/fetchlist
16M     segments/20040904035557-1/fetcher
333M    segments/20040904035557-1/content
132M    segments/20040904035557-1/parse_text
210M    segments/20040904035557-1/parse_data
177M    segments/20040904035557-1/index
878M    segments/20040904035557-1
12M     segments/20040904035557-2/fetchlist
15M     segments/20040904035557-2/fetcher
328M    segments/20040904035557-2/content
126M    segments/20040904035557-2/parse_text
207M    segments/20040904035557-2/parse_data
170M    segments/20040904035557-2/index
855M    segments/20040904035557-2
73M     segments/20040905133243-0/fetchlist
87M     segments/20040905133243-0/fetcher
2.0G    segments/20040905133243-0/content
762M    segments/20040905133243-0/parse_text
1.4G    segments/20040905133243-0/parse_data
974M    segments/20040905133243-0/index
5.3G    segments/20040905133243-0
73M     segments/20040905133243-1/fetchlist
87M     segments/20040905133243-1/fetcher
2.0G    segments/20040905133243-1/content
746M    segments/20040905133243-1/parse_text
1.4G    segments/20040905133243-1/parse_data
958M    segments/20040905133243-1/index
5.2G    segments/20040905133243-1
72M     segments/20040905133243-2/fetchlist
86M     segments/20040905133243-2/fetcher
1.9G    segments/20040905133243-2/content
722M    segments/20040905133243-2/parse_text
1.4G    segments/20040905133243-2/parse_data
935M    segments/20040905133243-2/index
5.0G    segments/20040905133243-2
71M     segments/20040905133243-3/fetchlist
84M     segments/20040905133243-3/fetcher
1.9G    segments/20040905133243-3/content
727M    segments/20040905133243-3/parse_text
1.3G    segments/20040905133243-3/parse_data
926M    segments/20040905133243-3/index
5.0G    segments/20040905133243-3
57G     segments

When I merge these, the index directory comes out right at 10 gigs.
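
As a rough cross-check on that figure, the per-segment index sizes in the listing above can be added up. This is only a minimal sketch, assuming the `du -h` output has been pasted into a string; `parse_size` and `total_index_bytes` are hypothetical helper names I made up for this, not part of Nutch or any other tool.

```python
import re

# Multipliers for the unit suffixes that `du -h` prints (powers of 1024).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a du -h size such as '119M' or '1.1G' to bytes."""
    match = re.fullmatch(r"([0-9.]+)([KMGT]?)", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    # A bare number with no suffix is treated as a byte count.
    return float(number) * UNITS.get(unit, 1)


def total_index_bytes(du_output):
    """Sum the sizes of every line whose path ends in '/index'."""
    total = 0.0
    for line in du_output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1].endswith("/index"):
            total += parse_size(fields[0])
    return total


# Three lines copied from the listing above; only the two
# '/index' lines are counted, the segment total is skipped.
sample = """\
119M    segments/20040829092114/index
625M    segments/20040829092114
1.1G    segments/20040829124357/index
"""
print(f"{total_index_bytes(sample) / 1024**3:.2f} GiB")
```

Since `du -h` rounds every entry, the total is only approximate, but run over the full listing it should land close to the 10 gigs I see after the merge.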
 
Are there limits to the /index directory size?  Am I missing something simple?
 
Thanks!
 
J
