Hi,
I want to crawl starting from this seed URL:
http://shce.sums.ac.ir/articles/farsi.html

However, when the crawl reaches the PDF and DOC files, parsing fails with
errors like these:
--------------------------------------------------------------------------------
ParseSegment: starting at 2011-10-04 21:08:05
ParseSegment: segment: crawl-2/segments/20111004210620
Error parsing:
http://shce.sums.ac.ir/icarusplus/export/sites/shce/department/technical/pdf/brca.doc:
failed(2,0): Your file contains 124 sectors, but the initial DIFAT array at
index 0 referenced block # 151. This isn't allowed and  your file is corrupt
Error parsing:
http://shce.sums.ac.ir/icarusplus/export/sites/shce/download/asthmprevention.pdf:
failed(2,0): null
Error parsing:
http://shce.sums.ac.ir/icarusplus/export/sites/shce/download/bicycle_safety.pdf:
failed(2,0): null
Error parsing:
http://shce.sums.ac.ir/icarusplus/export/sites/shce/download/ca2.pdf:
failed(2,0): null
Error parsing:
http://shce.sums.ac.ir/icarusplus/export/sites/shce/download/ca3.pdf:
failed(2,0): null
Error parsing:
http://shce.sums.ac.ir/icarusplus/export/sites/shce/download/ca4.pdf:
failed(2,0): null
Error parsing:
http://shce.sums.ac.ir/icarusplus/export/sites/shce/download/ca5.pdf:
failed(2,0): null
Error parsing:
http://shce.sums.ac.ir/icarusplus/export/sites/shce/download/cancerrisks.pdf:
failed(2,0): null
Error parsing:
http://shce.sums.ac.ir/icarusplus/export/sites/shce/download/cellphonehazard.pdf:
failed(2,0): null
Error parsing:
http://shce.sums.ac.ir/icarusplus/export/sites/shce/download/chol.pdf:
failed(2,0): null
Error parsing:
http://shce.sums.ac.ir/icarusplus/export/sites/shce/download/coronarydisprevention.pdf:
failed(2,0): null
Error parsing:
http://shce.sums.ac.ir/icarusplus/export/sites/shce/download/diabetescontrol.pdf:
failed(2,0): null
Error parsing:
http://shce.sums.ac.ir/icarusplus/export/sites/shce/download/diabeteshandouts.pdf:
failed(2,0): null
Error parsing:
http://shce.sums.ac.ir/icarusplus/export/sites/shce/download/farsidiabetes.pdf:
failed(2,0): null
Error parsing:
http://shce.sums.ac.ir/icarusplus/export/sites/shce/download/quitsession.pdf:
failed(2,0): null
Error parsing:
http://shce.sums.ac.ir/icarusplus/export/sites/shce/download/thalassemia3.pdf:
failed(2,0): null
ParseSegment: finished at 2011-10-04 21:08:07, elapsed: 00:00:02
-------------------------------------------------------------------
Can anyone help me?
