Hi,

According to my logs, a really long time (more than 2 hours) elapses
between parsing the last page in a segment and ParseSegment finishing,
as can be seen here:

2012-02-19 00:51:43,471 INFO  parse.ParseSegment - Parsing: http:// ....
2012-02-19 03:15:18,604 INFO  parse.ParseSegment - ParseSegment: finished at 2012-02-19 03:15:18, elapsed: 02:57:24

Since the total time of the parse job is just around 3 hours, this
represents a huge portion of the overall time.
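
To put a number on it, here is a minimal sketch of the arithmetic in
Python. It uses only the two timestamps and the elapsed value from the
log lines above; the variable and function names are just for
illustration, and it assumes the last "Parsing:" line really marks the
end of the page-by-page work.

```python
from datetime import datetime, timedelta

# Timestamp layout used in the log lines (comma before the milliseconds).
LOG_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_log_time(text):
    """Turn a log timestamp such as '2012-02-19 00:51:43,471' into a datetime."""
    return datetime.strptime(text, LOG_FORMAT)


# Last "Parsing:" line and the "finished" line, copied from the log.
last_page_parsed = parse_log_time("2012-02-19 00:51:43,471")
job_finished = parse_log_time("2012-02-19 03:15:18,604")

# Time between the last parsed page and the end of the job.
gap = job_finished - last_page_parsed

# Total elapsed time as reported by ParseSegment (02:57:24).
hours, minutes, seconds = (int(part) for part in "02:57:24".split(":"))
elapsed = timedelta(hours=hours, minutes=minutes, seconds=seconds)

# Fraction of the whole job spent after the last page was parsed.
share = gap / elapsed

print("gap after last page:", gap)
print("share of total job: {:.0%}".format(share))
```

By this calculation the gap is roughly 2 h 24 min, or about 81% of the
whole job.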

Is it normal that the last step in the job takes such a long time, and
is there anything I can do to speed it up? I have been running the
generator with -topN 20000, and I wouldn't have expected that to be a
big enough value to cause a problem. I have now reconfigured my script
to omit the -topN parameter to see what happens.

best regards,
Magnus
