When I was researching this issue I first suspected the deflateBytes 
method as well, but when I changed that code the problem persisted 
until I changed the regex filter.  Your problem may actually be in the 
deflateBytes method.  The forum I mentioned earlier was nutch-user, but 
if you don't have that regex then those posts won't help you.  Here is 
the text of a previous conversation I had about this with Stefan. 
------
I have a suspicion that the Inflater class in Java 1.5 is causing 
some problems with spinning, but I can't prove it.  We are using about 
the same Java and Linux versions.

I think the problem is at line 343 of SequenceFile.Reader:

          while (!inflater.finished()) {
            try {
              int count = inflater.inflate(inflateIn);
              inflateOut.write(inflateIn, 0, count);
            } catch (DataFormatException e) {
              throw new IOException (e.toString());
            }
          }

inflate() can sometimes return a count of 0, and I am wondering whether 
inflater.finished() can still return false when it does.  If so, I 
think this loop can spin forever, and some processes will sit and spin 
until they time out.  That would explain some things: it would probably 
happen only while inflating an unusual byte sequence, and since 
inflate() drops into a native method it could affect one platform (in 
this case Linux) more than another.  What do you think? 

Dennis
------
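
To make the concern in that message concrete, here is a rough sketch of 
a guarded version of the loop.  It is not the actual SequenceFile.Reader 
code; the class and method names (GuardedInflate, inflateAll, 
deflateAll) are made up for illustration.  It assumes that a zero count 
with finished() still false means the inflater cannot make progress 
(for example truncated input, where needsInput() is true), and it fails 
fast instead of looping.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of Nutch: shows one way to keep the
// inflate loop from spinning when inflate() returns 0.
public class GuardedInflate {

    // Inflates the whole buffer, throwing instead of looping forever
    // when no output is produced and the stream is not finished.
    static byte[] inflateAll(byte[] compressed) throws IOException {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int count = inflater.inflate(buf);
                if (count == 0 && !inflater.finished()) {
                    // Assumption: zero bytes and not finished means the
                    // inflater is stalled (needs input or a dictionary).
                    throw new IOException("inflater stalled: needsInput="
                            + inflater.needsInput() + ", needsDictionary="
                            + inflater.needsDictionary());
                }
                out.write(buf, 0, count);
            }
        } catch (DataFormatException e) {
            throw new IOException(e.toString());
        } finally {
            inflater.end(); // release the native zlib resources
        }
        return out.toByteArray();
    }

    // Compresses a buffer so the example has real deflate data to read.
    static byte[] deflateAll(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = "spin spin spin spin".getBytes(StandardCharsets.UTF_8);
        byte[] packed = deflateAll(raw);
        System.out.println("round trip ok: "
                + Arrays.equals(raw, inflateAll(packed)));
        try {
            // Cut the stream in half to simulate a damaged record.
            inflateAll(Arrays.copyOf(packed, packed.length / 2));
        } catch (IOException e) {
            System.out.println("truncated input rejected: " + e.getMessage());
        }
    }
}
```

With the unguarded loop, the truncated case is exactly where 
finished() stays false while inflate() keeps returning 0.  Whether that 
is what happens inside the fetcher is still only a guess.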

Hope this helps.

Dennis

Daniel Varela Santoalla wrote:
> Hello Dennis et al
>
> Dennis Kubes wrote:
>> We have seen this before too.  If it is the same problem, it is the 
>> regex url filter.  Comment out the -.*(/.+?)/.*?\1/.*?\1/ expression 
>> in the regex-urlfilter.txt file and it should resolve itself. 
>
> I'm afraid I didn't have a line like that in my regex-urlfilter.txt. 
> Anyway I removed everything except the last line accepting all, but no 
> improvement.
>
>> Also search the forum for "Fetcher stops pushes cpu to 100%".
>
> Which forum? I tried both nutch-user and nutch-dev without luck...
>
>>
>> Dennis
>>
>
> BTW, I'm using 0.7.2.
>
> Regards
> Daniel
>

