[jira] Commented: (NUTCH-770) Timebomb for Fetcher

2009-12-05 Thread MilleBii (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786443#action_12786443
 ] 

MilleBii commented on NUTCH-770:


Tried it successfully on a Windows platform.

It does not work on an Ubuntu, pseudo-distributed Hadoop configuration with 
mappers running in parallel.



 Timebomb for Fetcher
 

 Key: NUTCH-770
 URL: https://issues.apache.org/jira/browse/NUTCH-770
 Project: Nutch
  Issue Type: Improvement
Reporter: Julien Nioche
Assignee: Andrzej Bialecki 
 Fix For: 1.1

 Attachments: log-770, NUTCH-770-v2.patch, NUTCH-770-v3.patch, 
 NUTCH-770.patch


 This patch provides the Fetcher with a timebomb mechanism. By default the 
 timebomb is not activated; it can be set using the parameter 
 fetcher.timebomb.mins. The number of minutes is relative to the start of the 
 Fetch job. When the number of minutes is reached, the QueueFeeder skips all 
 remaining entries, then all active queues are purged. This keeps the 
 Fetch step under control and works well in combination with NUTCH-769.
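
 As a rough illustration of the behaviour described above, here is a minimal 
 Java sketch. It is not the code from NUTCH-770.patch: the class and method 
 names (TimebombSketch, deadlineMillis, expired, feed, purge) are hypothetical, 
 the queue is a plain java.util.Queue rather than the Fetcher's own queues, and 
 treating a non-positive value as "not activated" is an assumption.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Iterator;
import java.util.Queue;
import java.util.function.LongSupplier;

/** Illustrative only; names and conventions here are assumptions, not Nutch code. */
class TimebombSketch {

    /** Deadline in millis relative to the job start, or -1 when disabled (assumed convention). */
    static long deadlineMillis(long jobStartMillis, long timebombMins) {
        if (timebombMins <= 0) {
            return -1L;
        }
        return jobStartMillis + timebombMins * 60_000L;
    }

    /** True once the clock has reached an active deadline. */
    static boolean expired(long deadlineMillis, long nowMillis) {
        return deadlineMillis != -1L && nowMillis >= deadlineMillis;
    }

    /**
     * Reads every input entry; entries seen after the deadline are consumed
     * but not queued. Returns the number of skipped entries.
     */
    static int feed(Iterator<String> input, Queue<String> queue,
                    long deadlineMillis, LongSupplier clock) {
        int skipped = 0;
        while (input.hasNext()) {
            String url = input.next();
            if (expired(deadlineMillis, clock.getAsLong())) {
                skipped++;
            } else {
                queue.add(url);
            }
        }
        return skipped;
    }

    /** Drops everything still queued and returns how many entries were dropped. */
    static int purge(Queue<String> queue) {
        int dropped = queue.size();
        queue.clear();
        return dropped;
    }

    public static void main(String[] args) {
        long deadline = deadlineMillis(0L, 1);      // one minute after a job start of 0
        long[] now = {0L};                          // fake clock, advances 30 s per read
        Queue<String> queue = new ArrayDeque<>();
        Iterator<String> input = Arrays.asList("a", "b", "c", "d").iterator();

        int skipped = feed(input, queue, deadline, () -> now[0] += 30_000L);
        int purged = purge(queue);
        System.out.println("skipped=" + skipped + " purged=" + purged);
    }
}
```

 Passing the clock in as a LongSupplier is only a convenience for trying the 
 sketch without waiting for real time to pass.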

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (NUTCH-770) Timebomb for Fetcher

2009-12-01 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784250#action_12784250
 ] 

Andrzej Bialecki  commented on NUTCH-770:
-

Fixed in rev. 885776. Thank you!




[jira] Commented: (NUTCH-770) Timebomb for Fetcher

2009-11-30 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783638#action_12783638
 ] 

Andrzej Bialecki  commented on NUTCH-770:
-

bq.   time limit is definitely better than timebomb (but not as amusing). 

:) let's go for informative and less confusing now ... Could you please 
also add the nutch-default.xml property and its documentation?

Re: FetchQueues - ok, you have a point here.

Re: code style - yes.




[jira] Commented: (NUTCH-770) Timebomb for Fetcher

2009-11-28 Thread Julien Nioche (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783248#action_12783248
 ] 

Julien Nioche commented on NUTCH-770:
-

The log simply shows that the patch has not been applied properly. 
See http://markmail.org/message/wbd3r3t5bfxzkbpn for a discussion on how to 
apply patches.

It should work fine from the root directory of Nutch with:
patch -p0 < ~/Desktop/NUTCH-770.patch




[jira] Commented: (NUTCH-770) Timebomb for Fetcher

2009-11-28 Thread MilleBii (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783252#action_12783252
 ] 

MilleBii commented on NUTCH-770:


That's what I did, and I just retried ... so I'm a bit surprised too.
Other patches have worked fine so far.

???




[jira] Commented: (NUTCH-770) Timebomb for Fetcher

2009-11-28 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783283#action_12783283
 ] 

Andrzej Bialecki  commented on NUTCH-770:
-

I propose changing the name of this functionality: "timebomb" is not 
self-explanatory, and it suggests that if you misbehave then your cluster may 
explode ;) Instead I would use "time limit", rename all vars and methods to 
follow this naming, and document it properly in nutch-default.xml.

A few comments to the patch:

* it has some overlap with NUTCH-769 (the emptyQueue() method), but that's easy 
to resolve, see also the next point.

* why change the code in FetchQueues at all? Time limit is a global condition, 
we could just break the main loop in run() and ignore the QueueFeeder (or not 
start it at all if the time limit has already passed when run() starts).

* the patch does not follow the code style (notably whitespace in for/while 
loops and assignments).
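
To make the second point concrete, here is a minimal Java sketch of checking 
the time limit as a global condition in the main loop. It is only an 
illustration under assumptions: TimeLimitLoopSketch and its run() method are 
hypothetical stand-ins, not the Fetcher's actual run(), and a negative limit 
meaning "no limit" is an assumed convention.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;
import java.util.function.LongSupplier;

/** Illustrative only; a stand-in for a main loop, not Nutch's Fetcher. */
class TimeLimitLoopSketch {

    /**
     * Processes queued items until the queue is empty or the time limit has
     * passed. A negative timeLimitMillis means "no limit" (assumed convention).
     * Returns the number of items processed.
     */
    static int run(Queue<String> queue, long timeLimitMillis, LongSupplier clock) {
        // If the limit has already passed when run() starts, do no work at all.
        if (timeLimitMillis >= 0 && clock.getAsLong() >= timeLimitMillis) {
            return 0;
        }
        int processed = 0;
        while (!queue.isEmpty()) {
            // Global condition: break out of the main loop once the limit is hit.
            if (timeLimitMillis >= 0 && clock.getAsLong() >= timeLimitMillis) {
                break;
            }
            queue.poll(); // placeholder for fetching one item
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        Queue<String> queue = new ArrayDeque<>(Arrays.asList("a", "b", "c", "d", "e"));
        long[] now = {0L}; // fake clock, advances 10 ms per read
        int processed = run(queue, 35L, () -> now[0] += 10L);
        System.out.println("processed=" + processed + " remaining=" + queue.size());
    }
}
```

With this shape the queue classes need no knowledge of the limit; whatever is 
still queued when the loop breaks is simply left unprocessed.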
