Re: Nutchbase merge strategy

2010-07-21 Thread Andrzej Bialecki
On 2010-07-21 21:12, Mattmann, Chris A (388J) wrote: Hey Andrzej, +1 to all of the above - see below. So if 1-4 make sense, let's do 1, 2 and 3 today or tomorrow -- 4 can happen over the next few weeks. WDYT? This is a serious move - let's wait a bit, say until Monday, to give chance to

[jira] Created: (NUTCH-858) No longer able to set per-field boosts on lucene documents

2010-07-21 Thread Edward Drapkin (JIRA)
No longer able to set per-field boosts on lucene documents Key: NUTCH-858 URL: https://issues.apache.org/jira/browse/NUTCH-858 Project: Nutch Issue Type: Bug Components:

[jira] Updated: (NUTCH-858) No longer able to set per-field boosts on lucene documents

2010-07-21 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated NUTCH-858: Assignee: Andrzej Bialecki Fix Version/s: 1.2 No longer able to set per-field

[jira] Commented: (NUTCH-858) No longer able to set per-field boosts on lucene documents

2010-07-21 Thread Edward Drapkin (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890865#action_12890865 ] Edward Drapkin commented on NUTCH-858: Is there a patch against 1.1 that exists

[jira] Commented: (NUTCH-858) No longer able to set per-field boosts on lucene documents

2010-07-21 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890873#action_12890873 ] Andrzej Bialecki commented on NUTCH-858: Unfortunately no. The patch was included

[jira] Commented: (NUTCH-858) No longer able to set per-field boosts on lucene documents

2010-07-21 Thread Edward Drapkin (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890876#action_12890876 ] Edward Drapkin commented on NUTCH-858: Ah, okay. Is there an ETA on a 1.2 release or

[Nutchbase] Multi-value ParseResult missing

2010-07-21 Thread Andrzej Bialecki
Hi, I noticed that nutchbase doesn't use the multi-valued ParseResult, instead all parse plugins return a simple Parse. As a consequence, it's not possible to return multiple values from parsing a single WebPage, something that parsers for compound documents absolutely require (archives,

Re: [Nutchbase] Multi-value ParseResult missing

2010-07-21 Thread Mattmann, Chris A (388J)
Hey Andrzej, We're having the same sorts of discussions in Tika-ville right now. Check out this page on the Tika wiki: http://wiki.apache.org/tika/MetadataDiscussion Comments, thoughts, welcome. Depending on what comes out of Tika, we may be able to leverage upon it... Cheers, Chris On

Build failed in Hudson: Nutch-trunk #1209

2010-07-21 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Nutch-trunk/1209/ -- [...truncated 3878 lines...] deps-test: deploy: copy-generated-lib: test: [echo] Testing plugin: urlnormalizer-basic [junit] Running