[jira] Updated: (NUTCH-700) Neko1.9.11 goes into a loop

2009-03-02 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated NUTCH-700: Fix Version/s: 1.0.0; Assignee: Sami Siren. This one just bit me - the effect is that parsing

Re: planning for nutch-1.0-rc1

2009-03-02 Thread Sami Siren
Andrzej Bialecki wrote: Sami Siren wrote: I am planning to build the first rc for nutch 1.0 at Tue 3.3.2009 morning (EET). There are still some issues marked as fix for 1.0 in Jira. Neither of the two remaining _bugs_ seems too important to me, actually I only count the issues assigned to

[jira] Closed: (NUTCH-419) unavailable robots.txt kills fetch

2009-03-02 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki closed NUTCH-419. Resolution: Fixed; Fix Version/s: 1.0.0; Assignee: Andrzej Bialecki. Fixed in

[jira] Resolved: (NUTCH-700) Neko1.9.11 goes into a loop

2009-03-02 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-700. Resolution: Fixed. Reverted to 0.9.4. Neko1.9.11 goes into a loop

[jira] Resolved: (NUTCH-669) Consolidate code for Fetcher and Fetcher2

2009-03-02 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-669. Resolution: Fixed. Replaced fetcher with fetcher2. Consolidate code for Fetcher and Fetcher2

Re: [jira] Resolved: (NUTCH-669) Consolidate code for Fetcher and Fetcher2

2009-03-02 Thread Andrzej Bialecki
Sami Siren (JIRA) wrote: [ https://issues.apache.org/jira/browse/NUTCH-669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-669. Resolution: Fixed. Replaced fetcher with fetcher2. I'm puzzled .. it seemed

Re: [jira] Resolved: (NUTCH-669) Consolidate code for Fetcher and Fetcher2

2009-03-02 Thread Sami Siren
Andrzej Bialecki wrote: Sami Siren (JIRA) wrote: [ https://issues.apache.org/jira/browse/NUTCH-669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-669. Resolution: Fixed. Replaced fetcher with fetcher2

Re: [jira] Resolved: (NUTCH-669) Consolidate code for Fetcher and Fetcher2

2009-03-02 Thread Andrzej Bialecki
Sami Siren wrote: Andrzej Bialecki wrote: Sami Siren (JIRA) wrote: [ https://issues.apache.org/jira/browse/NUTCH-669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved NUTCH-669. Resolution: Fixed. Replaced

Job offer for Nutch-Lucene Programmer

2009-03-02 Thread Wolfgang Sander-Beuermann
Hi, maybe I'm in totally the wrong place with this post. If so, please excuse me. I'm posting a job offer for a Nutch-Lucene programmer for a job in Germany. Because good knowledge of the German language is mandatory for that job, I may be making a second mistake: I'll post this job offer here in German:

Re: [jira] Resolved: (NUTCH-669) Consolidate code for Fetcher and Fetcher2

2009-03-02 Thread Todd Lipcon
Hey guys, Sorry for the non-responsiveness here. I recently left my old employment and have been packing for a cross-country move. I agree that for 1.0 the best bet is what Sami has done. The code that I was working on is available here: http://github.com/toddlipcon/nutch/tree/nutch-669 But it

Re: planning for nutch-1.0-rc1

2009-03-02 Thread Bartosz Gadzimski
Sami Siren wrote: Andrzej Bialecki wrote: Sami Siren wrote: I am planning to build the first rc for nutch 1.0 at Tue 3.3.2009 morning (EET). There are still some issues marked as fix for 1.0 in Jira. Neither of the two remaining _bugs_ seems too important to me, actually I only count the

Parsing, Indexing multiple values (of same type) per document - Nutch-0.9

2009-03-02 Thread Stefan Dlugolinsky
Hello, I'm writing a DistanceSearch plugin (something similar to GeoPosition plugin), which parses HTML pages and extracts geographic data (addresses, GPS, ...) from the text of the page. This geographic data is geocoded into latitude and longitude - values to be indexed and later used

Build failed in Hudson: Nutch-trunk #741

2009-03-02 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Nutch-trunk/741/changes Changes: [siren] NUTCH-669 - Consolidate code for Fetcher and Fetcher2 [siren] NUTCH-700 - revert to nekohtml-0.9.4 [ab] Commit changes to CHANGES. [ab] NUTCH-419 Unavailable robots.txt kills fetch.