Well, now I am wondering what my next step should be. I generated five
6 1/2 million page segments, crawled them all, and tried to update, and I am
getting this crazy output. The update appears to go crazy on these angelfire
sites.

Am I missing something easy, or what would anyone suggest?

Jason

趥fhеۡ��>㵾��0.http://www.angelfire.com/ga/achamtb/clubs.html.http://ww
w.angelfire.com/ga/achamtb/clubs.html�
ü
]Ͻڪ�r3}�c�>"캾��31http://www.angelfire.com/ga/achamtb/downhill.html1http
://www.angelfire.com/ga/achamtb/downhill.htmlI7ɿc۸�n�c�>"캾��.,http://www
.angelfire.com/ga/achamtb/faq.html,http://www.angelfire.com/ga/achamtb/faq.h
tml'婩���tx��"ٺ��c�>"캾��53http://www.angelfire.com/ga/achamtb/new_photos
.html3http://www.angelfire.com/ga/achamtb/new_photos.htmlҿ�

� �cǼ#�c�>"캾��31http://www.angelfire.com/ga/achamtb/overview.html1http://
www.angelfire.com/ga/achamtb/overview.htmlÓ¤tÕ¯]
,`��c�>"캾��31http://www.angelfire.com/ga/achamtb/pictures.html1http://www.
angelfire.com/ga/achamtb/pictures.htmleZ{��}��o�c�>"캾��31http://www.angel
fire.com/ga/achamtb/survival.html1http://www.angelfire.com/ga/achamtb/surviv
al.html�ծ
�ֺ=�c�>"캾��1/http://www.angelfire.com/ga/achamtb/trails.html/http://www.a
ngelfire.com/ga/achamtb/trails.html�㥽W{j;g(ӡb��c�>"캾�s&$http://www.ange
lfire.com/ga/aeontrix$http://www.angelfire.com/ga/aeontrixr)f䨹!?l�>�>��
[EMAIL PROTECTED]://www.angelfire.com/ga/angelhugspage/[EMAIL PROTECTED]://ww
w.angelfire.com/ga/angelhugspage/hugsforsinglemoms.htmlNO�
�$�����
���^U>)C>��53http://www.angelfire.com/ga/batwentyone/Attack.html3http://www
.angelfire.com/ga/batwentyone/Attack.htmlܯ�
Ѻc&Ťҳ��^U>
<>��42http://www.angelfire.com/ga/batwentyone/Bases.html2http://www.angelfir
e.com/ga/batwentyone/Bases.html˨J/�ȣ$c%��^U>
<>��64http://www.angelfire.com/ga/batwentyone/Schools.html4http://www.angelf
ire.com/ga/batwentyone/Schools.htmlئA�3f˻V�е�^U>
<>��64http://www.angelfire.com/ga/batwentyone/Spe.Msn.html4http://www.angelf
ire.com/ga/batwentyone/Spe.Msn.html
�
 7֬��?}��^U>
<>��75http://www.angelfire.com/ga/batwenE�
J8�^U>ners.html5http://www.angelfire<>��75http://www.angelfire.com/ga/batwe
ntyone/figthers.html5http://www.angelfire.com/ga/batwentyone/figthers.html�
�
�:飻ӳ��^U>
<>��64http://www.angelfire.com/ga/batwentyone/gallery.html4http://www.angelf
ire.com/ga/batwentyone/gallery.html�ڹ {
/_d}׶�^U>
<>��75http://www.angelfire.com/ga/batwentyone/thankyou.html5http://www.angel
fire.com/ga/batwentyone/thankyou.htmlϬϩ
fͤéƳ½_�^U>
<>��97http://www.angelfire.com/ga/batwentyone/transports.html7http://www.ang
elfire.com/ga/batwentyone/transports.html�U�Mnƨ֡.��^U>
<>��64http://www.angelfire.com/ga/batwentyone/utility.html4http://www.angelf
ire.com/ga/batwentyone/utility.htmlP¼rҫ$.D;ȣҾ�^U>
<>��42http://www.angelfire.com/ga/batwentyone/valor.html2http://www.angelfir
e.com/ga/batwentyone/valor.htmlҥ����x��^U>
<>�q%#http://www.angelfire.com/ga/bazuka/#http://www.angelfire.com/ga/bazuka
//#ʵ�UЫ����>$z��k&�>�>�w read 26994 bytes, should read
779577707
  at net.nutch.io.SequenceFile$Reader.next(SequenceFile.java:192)
  at net.nutch.io.SequenceFile$Reader.next(SequenceFile.java:205)
  at net.nutch.io.MapFile$Reader.next(MapFile.java:300)
  at net.nutch.db.WebDBWriter$PagesByURLProcessor.mergeEdits(WebDBWriter.java:623)
  at net.nutch.db.WebDBWriter$CloseProcessor.closeDown(WebDBWriter.java:543)
  at net.nutch.db.WebDBWriter.close(WebDBWriter.java:1534)
  at net.nutch.tools.UpdateDatabaseTool.close(UpdateDatabaseTool.java:297)
  at net.nutch.tools.UpdateDatabaseTool.main(UpdateDatabaseTool.java:342)
[EMAIL PROTECTED] nutch-nightly]#



_______________________________________________
Nutch-general mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-general
