Re: [VOTE] Apache Nutch 1.1 Release Candidate #1

2010-04-07 Thread Fadzi Ushewokunze
..and here is to a Vote: +1 Oh, per usual, forgot to throw in my +1. So, +1! Cheers, Chris On 4/7/10 1:14 AM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: Hi Folks, I have posted a candidate for the Apache Nutch 1.1 release. The source code is at:

[Nutch Wiki] Update of FrontPage by JulienNioche

2010-04-07 Thread Apache Wiki
Dear Wiki user, You have subscribed to a wiki page or wiki category on Nutch Wiki for change notification. The FrontPage page has been changed by JulienNioche. http://wiki.apache.org/nutch/FrontPage?action=diff&rev1=128&rev2=129 -- *

[Nutch Wiki] Update of Nutch2Roadmap by JulienNioche

2010-04-07 Thread Apache Wiki
Dear Wiki user, You have subscribed to a wiki page or wiki category on Nutch Wiki for change notification. The Nutch2Roadmap page has been changed by JulienNioche. http://wiki.apache.org/nutch/Nutch2Roadmap -- New page: = Nutch2Roadmap = Here

Re: Nutch 2.0 roadmap

2010-04-07 Thread Julien Nioche
Hi, I'm not sure what is the status of the nutchbase - it's missed a lot of fixes and changes in trunk since it's been last touched ... yes, maybe we should start the 2.0 branch from 1.1 instead Dogacan - what do you think? BTW I see there is now a 2.0 label under JIRA, thanks to whoever

[Nutch Wiki] Update of Nutch2Roadmap by JulienNioche

2010-04-07 Thread Apache Wiki
Dear Wiki user, You have subscribed to a wiki page or wiki category on Nutch Wiki for change notification. The Nutch2Roadmap page has been changed by JulienNioche. http://wiki.apache.org/nutch/Nutch2Roadmap?action=diff&rev1=1&rev2=2 -- *

[jira] Updated: (NUTCH-808) Evaluate ORM Frameworks which support non-relational column-oriented datastores and RDBMs

2010-04-07 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated NUTCH-808: Fix Version/s: 2.0 Evaluate ORM Frameworks which support non-relational column-oriented

Re: Nutch 2.0 roadmap

2010-04-07 Thread Doğacan Güney
Hey everyone, On Tue, Apr 6, 2010 at 20:23, Andrzej Bialecki a...@getopt.org wrote: On 2010-04-06 15:43, Julien Nioche wrote: Hi guys, I gather that we'll jump straight to 2.0 after 1.1 and that 2.0 will be based on what is currently referred to as NutchBase. Shall we create a branch for

Re: Nutch 2.0 roadmap

2010-04-07 Thread Enis Söztutar
Hi, On 04/07/2010 07:54 PM, Doğacan Güney wrote: Hey everyone, On Tue, Apr 6, 2010 at 20:23, Andrzej Bialecki a...@getopt.org wrote: On 2010-04-06 15:43, Julien Nioche wrote: Hi guys, I gather that we'll jump straight to 2.0 after 1.1 and that 2.0 will be based on what is

Re: Nutch 2.0 roadmap

2010-04-07 Thread Enis Söztutar
Forgot to say that, at Hadoop, it is the convention that big issues, like the ones under discussion, come with a design document, so that a solid design is agreed upon for the work. We can apply the same pattern at Nutch. On 04/07/2010 07:54 PM, Doğacan Güney wrote: Hey everyone, On Tue, Apr

Re: Nutch 2.0 roadmap

2010-04-07 Thread Andrzej Bialecki
On 2010-04-07 18:54, Doğacan Güney wrote: Hey everyone, On Tue, Apr 6, 2010 at 20:23, Andrzej Bialecki a...@getopt.org wrote: On 2010-04-06 15:43, Julien Nioche wrote: Hi guys, I gather that we'll jump straight to 2.0 after 1.1 and that 2.0 will be based on what is currently referred to

Re: Nutch 2.0 roadmap

2010-04-07 Thread Andrzej Bialecki
On 2010-04-07 19:24, Enis Söztutar wrote: Also, the goal of the crawler-commons project is to provide APIs and implementations of stuff that is needed for every open source crawler project, like: robots handling, url filtering and url normalization, URL state management, perhaps

Re: Nutch 2.0 roadmap

2010-04-07 Thread MilleBii
Just a question: will the new HBase implementation allow more sophisticated crawling strategies than the current score-based one? To give a few examples of what I'd like to do: define different crawling frequencies for different sets of URLs, say weekly for some URLs, monthly or more for others.

[jira] Commented: (NUTCH-570) Improvement of URL Ordering in Generator.java

2010-04-07 Thread Otis Gospodnetic (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854665#action_12854665 ] Otis Gospodnetic commented on NUTCH-570: I'm tempted to close this issue as Won't

[jira] Commented: (NUTCH-570) Improvement of URL Ordering in Generator.java

2010-04-07 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854767#action_12854767 ] Chris A. Mattmann commented on NUTCH-570: - Hi Otis: I think your logic perfectly