Build failed in Jenkins: Nutch-nutchgora #25

2011-10-03 Thread Apache Jenkins Server
See Changes: [markus] NUTCH-1058 Upgrade Solr schema to version 1.4 -- [...truncated 2491 lines...] [ivy:resolve] :: loading settings :: file = /x1/jenkins/jenkins-slave/workspace/Nutch-nutchgora/

[jira] [Commented] (NUTCH-1058) Upgrade Solr schema to version 1.4

2011-10-03 Thread Hudson (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13119890#comment-13119890 ] Hudson commented on NUTCH-1058: --- Integrated in Nutch-nutchgora #25 (See [https://builds.apa

[jira] [Commented] (NUTCH-1058) Upgrade Solr schema to version 1.4

2011-10-03 Thread Hudson (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13119888#comment-13119888 ] Hudson commented on NUTCH-1058: --- Integrated in Nutch-trunk #1623 (See [https://builds.apach

[jira] [Commented] (NUTCH-1137) LinkDb / invertlinks: command line arguments ignored

2011-10-03 Thread Hudson (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13119889#comment-13119889 ] Hudson commented on NUTCH-1137: --- Integrated in Nutch-trunk #1623 (See [https://builds.apach

Build failed in Jenkins: Nutch-trunk #1623

2011-10-03 Thread Apache Jenkins Server
See Changes: [markus] NUTCH-1058 Upgrade Solr schema to version 1.4 [markus] NUTCH-1137 LinkDB other options ignored with -dir -- [...truncated 937 lines...] A src/plugin/language-identifie

Re: Providing a list of FAQ's with every new subscribe request

2011-10-03 Thread Sami Siren
On Mon, Oct 3, 2011 at 3:48 PM, lewis john mcgibbney < lewis.mcgibb...@gmail.com> wrote: > > Would it be possible to send out a list of our official FAQ's when a new > user confirms their subscription to both user@ and dev@ lists. > > It seems this is possible. Can you craft a piece of text you wo

[jira] [Commented] (NUTCH-1109) Add Sonar targets to Ant build.xml

2011-10-03 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13119367#comment-13119367 ] Lewis John McGibbney commented on NUTCH-1109: - Would like to commit before RC

[jira] [Commented] (NUTCH-1136) Ant pmd target is broken

2011-10-03 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13119365#comment-13119365 ] Lewis John McGibbney commented on NUTCH-1136: - Would like to commit before RC

Re: Check remaining 1.4 issues

2011-10-03 Thread Mattmann, Chris A (388J)
+1 from me, let's target end of week to have all issues knocked back and I'll roll an RC this weekend. Cheers, Chris On Oct 3, 2011, at 6:17 AM, lewis john mcgibbney wrote: > Hi Markus, > > I still see 17 possible issues for inclusion in 1.4. I don't know how others > feel with regards to th

[jira] [Updated] (NUTCH-1035) Tune Solr config for Nutch users

2011-10-03 Thread Markus Jelsma (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1035: - Fix Version/s: (was: 1.4) 1.5 > Tune Solr config for Nutch users > ---

[jira] [Updated] (NUTCH-1034) Create Solr Velocity templates

2011-10-03 Thread Markus Jelsma (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1034: - Fix Version/s: (was: 1.4) 1.5 > Create Solr Velocity templates > -

[jira] [Updated] (NUTCH-717) Make Nutch Solr integration easier

2011-10-03 Thread Markus Jelsma (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-717: Fix Version/s: (was: 1.4) 1.5 > Make Nutch Solr integration easier >

[jira] [Closed] (NUTCH-1058) Upgrade Solr schema to version 1.4

2011-10-03 Thread Markus Jelsma (Closed) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma closed NUTCH-1058. > Upgrade Solr schema to version 1.4 > -- > > Key:

[jira] [Resolved] (NUTCH-1058) Upgrade Solr schema to version 1.4

2011-10-03 Thread Markus Jelsma (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma resolved NUTCH-1058. -- Resolution: Fixed Assignee: Markus Jelsma Committed for 1.4 in rev. 1178409 and for nutch

Re: Check remaining 1.4 issues

2011-10-03 Thread lewis john mcgibbney
Hi Markus, I still see 17 possible issues for inclusion in 1.4. I don't know how others feel with regards to the 'Solr' type issues but my own preference would be that these can be knocked back to 1.5 if they are going to take some time to get sorted. Of the 17 issues affecting the 1.4 release we h

[jira] [Commented] (NUTCH-1143) Omit anchor in webgraph's LinkDatum

2011-10-03 Thread Markus Jelsma (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13119281#comment-13119281 ] Markus Jelsma commented on NUTCH-1143: -- It seems the anchor field was once used for i

Check remaining 1.4 issues

2011-10-03 Thread Markus Jelsma
Hi guys, Can we do a final round of issue checking for 1.4? Move to 1.5 or give a final argument on what to do with an issue. Thanks,

[jira] [Updated] (NUTCH-1142) Normalization and filtering in WebGraph

2011-10-03 Thread Markus Jelsma (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1142: - Attachment: NUTCH-1142-1.4.patch Here's a patch for trunk. > Normalization and

[jira] [Resolved] (NUTCH-1144) Filtering optional in WebGraph

2011-10-03 Thread Markus Jelsma (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma resolved NUTCH-1144. -- Resolution: Won't Fix Decided to do filtering and normalizing in one issue. >

[jira] [Updated] (NUTCH-1142) Normalization and filtering in WebGraph

2011-10-03 Thread Markus Jelsma (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1142: - Description: The WebGraph program performs URL normalization. Since normalization of outlinks i

[jira] [Updated] (NUTCH-1144) Filtering optional in WebGraph

2011-10-03 Thread Markus Jelsma (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1144: - Fix Version/s: (was: 1.5) > Filtering optional in WebGraph >

Re: Providing a list of FAQ's with every new subscribe request

2011-10-03 Thread lewis john mcgibbney
Hi Sami, At the moment I am not in a position to take on the role of mailing list moderator. But I've found out that the list moderators should be able to configure the nature of documentation on a per-list basis by emailing ${list}-help@ from their moderator address and following the instructions

[jira] [Resolved] (NUTCH-1137) LinkDb / invertlinks: command line arguments ignored

2011-10-03 Thread Markus Jelsma (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma resolved NUTCH-1137. -- Resolution: Fixed Committed for 1.4 in rev. 1178376. Reused crawldb code instead. Thanks for o

Re: Choosing an efficient family configuration for GORA HBase

2011-10-03 Thread Ferdy Galema
Ok thanks. I was just wondering whether there were any developments on this. I'm not sure yet what would be the fastest in the case of Nutch; all I know from our own experience is that it is best practice to group frequently-accessed columns together, but nevertheless store large columns in a s