Lewis John McGibbney created NUTCH-2851:
-------------------------------------------

             Summary: Random object created and used only once
                 Key: NUTCH-2851
                 URL: https://issues.apache.org/jira/browse/NUTCH-2851
             Project: Nutch
          Issue Type: Sub-task
          Components: dmoz, generator, indexer, segment
    Affects Versions: 1.18
            Reporter: Lewis John McGibbney
            Assignee: Lewis John McGibbney
             Fix For: 1.19



In class org.apache.nutch.crawl.Generator
In method org.apache.nutch.crawl.Generator.partitionSegment(Path, Path, int)
Called method java.util.Random.nextInt()
At Generator.java:[line 1016]
Random object created and used only once in 
org.apache.nutch.crawl.Generator.partitionSegment(Path, Path, int)

This code creates a java.util.Random object, uses it to generate one random 
number, and then discards the Random object. This produces mediocre quality 
random numbers and is inefficient. If possible, rewrite the code so that the 
Random object is created once and saved, and each time a new random number is 
required invoke a method on the existing Random object to obtain it.
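A minimal sketch of the suggested fix, assuming a static field is acceptable in the affected class. The class name SeedSource and its methods are hypothetical and are not taken from the Nutch source; only java.util.Random and its nextInt methods are real API.

```java
import java.util.Random;

// Hypothetical helper for illustration only; not the actual Nutch code.
final class SeedSource {

    // Created once and reused for every call. java.util.Random is
    // thread-safe, although heavy concurrent use can cause contention.
    private static final Random RANDOM = new Random();

    private SeedSource() {
        // Static utility; no instances.
    }

    // Returns a pseudo-random int over the full int range, replacing
    // the "new Random().nextInt()" pattern flagged in the report.
    static int nextSeed() {
        return RANDOM.nextInt();
    }

    // Returns a pseudo-random int in the range [0, bound).
    static int nextBounded(int bound) {
        return RANDOM.nextInt(bound);
    }
}
```

A call site that previously wrote new Random().nextInt() would call SeedSource.nextSeed() instead, so one generator serves all requests.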

If it is important that the generated Random numbers not be guessable, you must 
not create a new Random for each random number; the values are too easily 
guessable. You should strongly consider using a java.security.SecureRandom 
instead (and avoid allocating a new SecureRandom for each random number needed).
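Where unpredictability matters, the same reuse pattern might look like the sketch below. The class name TokenSource and the method nextToken are hypothetical; whether any of the affected Nutch classes actually needs SecureRandom is not stated in the report.

```java
import java.security.SecureRandom;

// Hypothetical helper for illustration only; not the actual Nutch code.
final class TokenSource {

    // One SecureRandom instance shared by all callers; constructing a
    // new one for each value is costly and unnecessary.
    private static final SecureRandom SECURE_RANDOM = new SecureRandom();

    private TokenSource() {
        // Static utility; no instances.
    }

    // Fills a new array of the requested length with random bytes
    // drawn from the shared SecureRandom.
    static byte[] nextToken(int numBytes) {
        byte[] token = new byte[numBytes];
        SECURE_RANDOM.nextBytes(token);
        return token;
    }
}
```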

This bad practice also affects the following classes:

org.apache.nutch.indexer.IndexingJob since first historized release
org.apache.nutch.segment.SegmentReader since first historized release
org.apache.nutch.tools.DmozParser$RDFProcessor since first historized release 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
