Hi,

Could you also try the parsechecker tool on that last URL? It's
possible that the file has a problem, or it may simply be a bug.
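
If it helps, here is a minimal sketch for timing that check from a
script. It assumes a Nutch 1.x install where "bin/nutch parsechecker
<url>" is available and that it is run from the Nutch root directory.
The helper names and the 10-minute timeout are placeholders of mine,
not part of Nutch.

```python
import subprocess
import sys
import time


def parsechecker_cmd(url, nutch_bin="bin/nutch"):
    """Build the argument list for Nutch's parsechecker tool.

    Assumption: nutch_bin points at the bin/nutch launcher script.
    """
    return [nutch_bin, "parsechecker", url]


def time_parsechecker(url, nutch_bin="bin/nutch", timeout_s=600):
    """Run parsechecker on a single URL.

    Returns (exit_code, seconds). exit_code is None if the run was
    killed after timeout_s seconds (a hypothetical cutoff).
    """
    start = time.monotonic()
    try:
        result = subprocess.run(parsechecker_cmd(url, nutch_bin),
                                timeout=timeout_s)
        code = result.returncode
    except subprocess.TimeoutExpired:
        code = None
    return code, time.monotonic() - start


if __name__ == "__main__":
    # Pass the last URL from the parse log as the first argument.
    code, seconds = time_parsechecker(sys.argv[1])
    print("exit code:", code, "elapsed: %.1f s" % seconds)
```

If that single URL takes a long time or hits the timeout, the document
itself is the likely culprit rather than the segment size.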

Remi

On Sunday, February 19, 2012, Magnús Skúlason <[email protected]> wrote:
> Hi,
>
> According to my logs, a really long time (over 2 hours) elapses between
> parsing the last page in a segment and ParseSegment finishing, as
> can be seen here:
>
> 2012-02-19 00:51:43,471 INFO  parse.ParseSegment - Parsing: http:// ....
> 2012-02-19 03:15:18,604 INFO  parse.ParseSegment - ParseSegment:
> finished at 2012-02-19 03:15:18, elapsed: 02:57:24
>
> Since the total time of the parse job is just around 3 hours, this
> represents a huge portion of the overall time.
>
> Is it normal that the last step in the job takes such a long time, and
> is there anything I can do to speed it up? I have been running the
> generator with -topN 20000, which I wouldn't have expected to be a big
> enough value to cause a problem. I have now reconfigured my script to
> skip the -topN parameter to see what happens.
>
> best regards,
> Magnus
>
