Re: What's the status of Nutch-GUI?

2006-11-23 Thread Scott Green

Hi

I will try my best. But I think Stefan is the right guy to do the
right thing :) He designed the admin GUI and implemented it.

- Scott

On 11/23/06, Zaheed Haque [EMAIL PROTECTED] wrote:

Scott:

Would you be kind enough to upload your Nutch-Gui patch which works
with current trunk? I would like to give it a try.

Regards

On 11/22/06, scott green [EMAIL PROTECTED] wrote:
 On 11/22/06, Sami Siren [EMAIL PROTECTED] wrote:
  scott green wrote:
   Hi
  
   I am now porting Stefan's admin GUI to my dev box. I get some errors
   and hope someone can help me. When I start the embedded Jetty web
   application, I get these exceptions:
  
   06/11/22 02:28:10 INFO util.Credential: Checking Resource aliases
   06/11/22 02:28:11 INFO util.Container: Started [EMAIL PROTECTED]
   Exception in thread "main" java.lang.ClassNotFoundException: org.apache.jasper.servlet.JspServlet
   at java.net.URLClassLoader$1.run(Unknown Source)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(Unknown Source)
   at java.lang.ClassLoader.loadClass(Unknown Source)
   at java.lang.ClassLoader.loadClass(Unknown Source)
   at org.mortbay.http.HttpContext.loadClass(HttpContext.java:1262)
   at org.mortbay.jetty.servlet.Holder.start(Holder.java:188)
   at org.mortbay.jetty.servlet.ServletHolder.start(ServletHolder.java:219)
   at org.mortbay.jetty.servlet.ServletHandler.initializeServlets(ServletHandler.java:445)
   at org.mortbay.jetty.servlet.WebApplicationHandler.initializeServlets(WebApplicationHandler.java:323)
   at org.mortbay.jetty.servlet.WebApplicationContext.doStart(WebApplicationContext.java:511)
   at org.mortbay.util.Container.start(Container.java:72)
   at org.apache.nutch.admin.WebContainer.addComponentExtensions(WebContainer.java:152)
   at org.apache.nutch.admin.AdministrationApp.startContainer(AdministrationApp.java:41)
   at org.apache.nutch.admin.AdministrationApp.main(AdministrationApp.java:158)
   06/11/22 02:28:24 INFO util.Container: Started HttpContext[/,/]
  
   The code snippet:
   WebApplicationContext webContext =
       this.server.addWebApplication(contextName, new File(jsps).getCanonicalPath());

   webContext.setClassLoader(extension.getDescriptor().getClassLoader());
   webContext.setAttribute("component", component);
   webContext.setAttribute("components", components);
   if (instances != null) {
     webContext.setAttribute("instances", instances);
     webContext.setAttribute("container", this);
   }
   webContext.start();
  
   So how can I put some required jars into the classloader?
   Thanks
 
  Is there a start script (bin/nutch?) or something like that where you
  could add jasper-compiler.jar so it gets onto the JVM classpath?

 Hi Sami

 You are right. I added the jars to the JVM classpath and now it works, thanks.

 - Scott

  --
   Sami Siren
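
For reference, the original question (how to get extra jars into the class loader that is handed to the web context) can also be answered in code rather than through the JVM classpath. The following is only a minimal sketch under stated assumptions: the class name ExtraJarsSketch, the method withExtraJars and the path lib/jasper-compiler.jar are hypothetical and not part of the Nutch admin GUI patch. It uses only the standard java.net.URLClassLoader.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Arrays;

// Hypothetical helper, not part of the Nutch admin GUI code.
final class ExtraJarsSketch {

  /**
   * Builds a class loader that can see the given jars in addition to
   * everything the parent loader already sees.
   */
  static URLClassLoader withExtraJars(ClassLoader parent, File... jars) {
    URL[] urls = new URL[jars.length];
    for (int i = 0; i < jars.length; i++) {
      try {
        urls[i] = jars[i].toURI().toURL();
      } catch (MalformedURLException e) {
        throw new IllegalArgumentException("Not a usable jar path: " + jars[i], e);
      }
    }
    return new URLClassLoader(urls, parent);
  }

  public static void main(String[] args) {
    // Assumed jar location; adjust to wherever the Jasper jar actually lives.
    URLClassLoader loader = withExtraJars(
        ExtraJarsSketch.class.getClassLoader(),
        new File("lib/jasper-compiler.jar"));
    // A loader like this could be passed to webContext.setClassLoader(...)
    // in place of the plugin descriptor's loader.
    System.out.println(Arrays.toString(loader.getURLs()));
  }
}
```

Whether this is preferable to editing the start script depends on the setup: the classpath route, which is what worked in this thread, needs no code change, while a dedicated loader keeps the extra jars scoped to the web application.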
 




[jira] Created: (NUTCH-406) Metadata tries to write null values

2006-11-23 Thread JIRA
Metadata tries to write null values
---

 Key: NUTCH-406
 URL: http://issues.apache.org/jira/browse/NUTCH-406
 Project: Nutch
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Doğacan Güney


During parsing, some urls (especially pdfs, it seems) may create <some_key, 
null> pairs in ParseData's parseMeta. 
When Metadata.write() tries to write such a pair, it causes an NPE.

Stack trace will be something like this:
at org.apache.hadoop.io.Text.encode(Text.java:373)
at org.apache.hadoop.io.Text.encode(Text.java:354)
at org.apache.hadoop.io.Text.writeString(Text.java:394)
at org.apache.nutch.metadata.Metadata.write(Metadata.java:214)


I can consistently reproduce this using the following url:
http://www.efesbev.com/corporate_governance/pdf/MergerAgreement.pdf

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (NUTCH-406) Metadata tries to write null values

2006-11-23 Thread JIRA
 [ http://issues.apache.org/jira/browse/NUTCH-406?page=all ]

Doğacan Güney updated NUTCH-406:


Attachment: NUTCH-406.patch

A simple patch that writes nulls as empty strings.

 Metadata tries to write null values
 ---

 Key: NUTCH-406
 URL: http://issues.apache.org/jira/browse/NUTCH-406
 Project: Nutch
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Doğacan Güney
 Attachments: NUTCH-406.patch







[jira] Updated: (NUTCH-251) Administration GUI

2006-11-23 Thread Enis Soztutar (JIRA)
 [ http://issues.apache.org/jira/browse/NUTCH-251?page=all ]

Enis Soztutar updated NUTCH-251:


Attachment: Nutch-251-AdminGUI.tar.gz

I have updated the patch written by Stefan.
This version works with Nutch-0.9-dev and hadoop-0.7.1 (the current version of 
nutch so far).

First extract the tar.gz file into the root of nutch. It should copy:
  src/plugin/admin-*
  lib/xalan.jar, lib/serializer.jar and lib/hadoop-0.7.2-dev.jar
  hadoop_0.7.1_nutch_gui_v2.patch
  nutch_0.9-dev_gui_v2.patch

Then patch nutch with:
  patch -p0 < nutch_0.9-dev_gui_v2.patch
(you can test the patch first by running: 
  patch -p0 --dry-run < nutch_0.9-dev_gui_v2.patch)

Patched hadoop is included in the archive, but if you wish you can patch hadoop 
using:
  patch -p0 < hadoop_0.7.1_nutch_gui_v2.patch


I have:
  converted the necessary java.io.File fields and arguments to 
    org.apache.hadoop.fs.Path
  replaced uses of the deprecated LogFormatter with LogFactory
  used generics with collections (changed only those that I've seen)
  written PathSerializable, which implements the Serializable interface 
    (needed for scheduling)
  made some hadoop changes and some changes due to hadoop conflicts

I have not tested every feature of this plugin, so there may still be some 
bugs. 

 Administration GUI
 --

 Key: NUTCH-251
 URL: http://issues.apache.org/jira/browse/NUTCH-251
 Project: Nutch
  Issue Type: Improvement
Affects Versions: 0.8
Reporter: Stefan Groschupf
Priority: Minor
 Fix For: 0.9.0

 Attachments: hadoop_nutch_gui_v1.patch, Nutch-251-AdminGUI.tar.gz, 
 nutch_gui_plugins_v1.zip, nutch_gui_v1.patch


 Having a web based administration interface would help to make nutch 
 administration and management much more user friendly.





Re: [jira] Updated: (NUTCH-251) Administration GUI

2006-11-23 Thread Zaheed Haque

Super Thanks! Now I can give it a go!

Cheers!






[jira] Updated: (NUTCH-406) Metadata tries to write null values

2006-11-23 Thread Chris A. Mattmann (JIRA)
 [ http://issues.apache.org/jira/browse/NUTCH-406?page=all ]

Chris A. Mattmann updated NUTCH-406:


Assignee: Chris A. Mattmann

 Metadata tries to write null values
 ---

 Key: NUTCH-406
 URL: http://issues.apache.org/jira/browse/NUTCH-406
 Project: Nutch
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Doğacan Güney
 Assigned To: Chris A. Mattmann
 Attachments: NUTCH-406.patch







[jira] Commented: (NUTCH-406) Metadata tries to write null values

2006-11-23 Thread Andrzej Bialecki (JIRA)
[ 
http://issues.apache.org/jira/browse/NUTCH-406?page=comments#action_12452270 ] 

Andrzej Bialecki  commented on NUTCH-406:
-

Null value is not equivalent to an empty String - perhaps we should simply skip 
such values.

 Metadata tries to write null values
 ---

 Key: NUTCH-406
 URL: http://issues.apache.org/jira/browse/NUTCH-406
 Project: Nutch
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Doğacan Güney
 Assigned To: Chris A. Mattmann
 Attachments: NUTCH-406.patch







[jira] Updated: (NUTCH-406) Metadata tries to write null values

2006-11-23 Thread JIRA
 [ http://issues.apache.org/jira/browse/NUTCH-406?page=all ]

Doğacan Güney updated NUTCH-406:


Attachment: NUTCH-406.patch

How about something like this then?

 Metadata tries to write null values
 ---

 Key: NUTCH-406
 URL: http://issues.apache.org/jira/browse/NUTCH-406
 Project: Nutch
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Doğacan Güney
 Assigned To: Chris A. Mattmann
 Attachments: NUTCH-406.patch, NUTCH-406.patch







[jira] Commented: (NUTCH-406) Metadata tries to write null values

2006-11-23 Thread Chris A. Mattmann (JIRA)
[ 
http://issues.apache.org/jira/browse/NUTCH-406?page=comments#action_12452275 ] 

Chris A. Mattmann commented on NUTCH-406:
-

Hi Andrzej, Doğacan,

 +1. I think it makes a lot of sense to just not include the null key in the 
Met container. Doğacan, in the future, when you attach a new version of a patch 
for a JIRA issue, please indicate the change by renaming the patch. Not a big 
deal, but good style points ;)

  I'll commit this patch shortly.

Cheers,
  Chris


 Metadata tries to write null values
 ---

 Key: NUTCH-406
 URL: http://issues.apache.org/jira/browse/NUTCH-406
 Project: Nutch
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Doğacan Güney
 Assigned To: Chris A. Mattmann
 Attachments: NUTCH-406.patch, NUTCH-406.patch







[jira] Commented: (NUTCH-406) Metadata tries to write null values

2006-11-23 Thread Andrzej Bialecki (JIRA)
[ 
http://issues.apache.org/jira/browse/NUTCH-406?page=comments#action_12452282 ] 

Andrzej Bialecki  commented on NUTCH-406:
-

Erhm, -1 from me. This code checks only if the first value is null, and then 
discards all other values (which may be non-null), thus we could lose valuable 
data if only the first value happens to be null ...

I think we should indeed check if the first value is null, but if it is, 
then loop over all the other values, count the non-nulls, and if the count > 0 
then write out the <key, non-null values> set.

 Metadata tries to write null values
 ---

 Key: NUTCH-406
 URL: http://issues.apache.org/jira/browse/NUTCH-406
 Project: Nutch
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Doğacan Güney
 Assigned To: Chris A. Mattmann
 Attachments: NUTCH-406.patch, NUTCH-406.patch







[jira] Commented: (NUTCH-406) Metadata tries to write null values

2006-11-23 Thread Chris A. Mattmann (JIRA)
[ 
http://issues.apache.org/jira/browse/NUTCH-406?page=comments#action_12452285 ] 

Chris A. Mattmann commented on NUTCH-406:
-

Hi Doğacan,

  Looking at your latest patch, I'm not sure that it completely does the right 
thing. For example, what happens if there are 3 met values for a key k, and 
one of them is null, but the other 2 are not? Specifically, what if the first 
value is null, but the other 2 are not? In that case, your patch would skip 
writing the key and all of its values. Wouldn't it just be easier to do 
something like this?

Index: src/java/org/apache/nutch/metadata/Metadata.java
===================================================================
--- src/java/org/apache/nutch/metadata/Metadata.java	(revision 478613)
+++ src/java/org/apache/nutch/metadata/Metadata.java	(working copy)
@@ -211,7 +211,9 @@
       values = getValues(names[i]);
       out.writeInt(values.length);
       for (int j = 0; j < values.length; j++) {
-        Text.writeString(out, values[j]);
+        if (values[j] != null && !values[j].equals("")) {
+          Text.writeString(out, values[j]);
+        }
       }
     }
   }

 Metadata tries to write null values
 ---

 Key: NUTCH-406
 URL: http://issues.apache.org/jira/browse/NUTCH-406
 Project: Nutch
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Doğacan Güney
 Assigned To: Chris A. Mattmann
 Attachments: NUTCH-406.patch, NUTCH-406.patch







[jira] Commented: (NUTCH-406) Metadata tries to write null values

2006-11-23 Thread Chris A. Mattmann (JIRA)
[ 
http://issues.apache.org/jira/browse/NUTCH-406?page=comments#action_12452286 ] 

Chris A. Mattmann commented on NUTCH-406:
-

Hi Andrzej,

  Yup, you caught the same thing as me. +1 for your solution. I will extend my 
above patch by writing getNumNonNullValues(values) instead of values.length.

Cheers,
  Chris
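
The approach agreed on here (write the count of non-null values instead of values.length, then write only the non-null values) can be sketched as follows. This is a minimal, self-contained illustration and not the patch that was committed: the class NonNullValuesSketch and the methods writeEntry and encode are hypothetical, and java.io.DataOutput.writeUTF stands in for Hadoop's Text.writeString so the example runs without Hadoop. Only getNumNonNullValues is a name taken from the comment above. Unlike the diff in the earlier comment, it skips only nulls, not empty strings.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch; not the code committed for NUTCH-406.
final class NonNullValuesSketch {

  /** Counts the values that will actually be written. */
  static int getNumNonNullValues(String[] values) {
    int count = 0;
    for (String value : values) {
      if (value != null) {
        count++;
      }
    }
    return count;
  }

  /**
   * Writes name, count and the non-null values. Writes nothing at all and
   * returns false when every value is null, so the stream stays consistent.
   */
  static boolean writeEntry(DataOutput out, String name, String[] values)
      throws IOException {
    int count = getNumNonNullValues(values);
    if (count == 0) {
      return false;
    }
    out.writeUTF(name);   // stand-in for Text.writeString(out, name)
    out.writeInt(count);  // must match the number of values written below
    for (String value : values) {
      if (value != null) {
        out.writeUTF(value);  // stand-in for Text.writeString(out, value)
      }
    }
    return true;
  }

  /** Convenience wrapper that serializes one entry into a byte array. */
  static byte[] encode(String name, String[] values) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buffer)) {
      writeEntry(out, name, values);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toByteArray();
  }

  public static void main(String[] args) {
    byte[] bytes = encode("k", new String[] {null, "a", "b"});
    System.out.println(bytes.length + " bytes written");
  }
}
```

One caveat for a real write() method: if the number of keys is written up front, any key that is skipped entirely has to be left out of that count as well, otherwise the reader will expect more entries than the stream contains.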


 Metadata tries to write null values
 ---

 Key: NUTCH-406
 URL: http://issues.apache.org/jira/browse/NUTCH-406
 Project: Nutch
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Doğacan Güney
 Assigned To: Chris A. Mattmann
 Attachments: NUTCH-406.patch, NUTCH-406.patch







[jira] Resolved: (NUTCH-406) Metadata tries to write null values

2006-11-23 Thread Chris A. Mattmann (JIRA)
 [ http://issues.apache.org/jira/browse/NUTCH-406?page=all ]

Chris A. Mattmann resolved NUTCH-406.
-

Fix Version/s: 0.9.0
   Resolution: Fixed

Fix applied and tested in trunk.


 Metadata tries to write null values
 ---

 Key: NUTCH-406
 URL: http://issues.apache.org/jira/browse/NUTCH-406
 Project: Nutch
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Doğacan Güney
 Assigned To: Chris A. Mattmann
 Fix For: 0.9.0

 Attachments: NUTCH-406.patch, NUTCH-406.patch







[jira] Commented: (NUTCH-251) Administration GUI

2006-11-23 Thread Sami Siren (JIRA)
[ 
http://issues.apache.org/jira/browse/NUTCH-251?page=comments#action_12452321 ] 

Sami Siren commented on NUTCH-251:
--

 Are you thinking of something like a UI extension point, like in contrib/web2?

Not necessarily; that was also a quick hack I put together. It does, however, 
allow you to plug in new functionality or layout via a plugin (from inside a 
jar). But I guess Stefan has also implemented something like that in his patch.

 Administration GUI
 --

 Key: NUTCH-251
 URL: http://issues.apache.org/jira/browse/NUTCH-251
 Project: Nutch
  Issue Type: Improvement
Affects Versions: 0.8
Reporter: Stefan Groschupf
Priority: Minor
 Fix For: 0.9.0

 Attachments: hadoop_nutch_gui_v1.patch, Nutch-251-AdminGUI.tar.gz, 
 nutch_gui_plugins_v1.zip, nutch_gui_v1.patch


 Having a web based administration interface would help to make nutch 
 administration and management much more user friendly.





Re: Question on adaptive re-fetch plugin

2006-11-23 Thread kauu

Yes, I'm on your side.

On 11/23/06, Scott Green [EMAIL PROTECTED] wrote:


Hi

NUTCH-61 (http://issues.apache.org/jira/browse/NUTCH-61) is about the
adaptive re-fetch plugin, and Jerome Charron commented: "Why not
making FetchSchedule a new ExtensionPoint and then
DefaultFetchSchedule and AdaptiveFetchSchedule some fetch schedule
plugins?" I am for it. Maintaining a non-official Nutch source tree is
painful for me. So why not provide another plugin and test it? When it
is stable enough, we can merge them, right?

- Scott
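
To make the proposal concrete, here is a rough sketch of what a pluggable fetch schedule could look like. Only the names FetchSchedule, DefaultFetchSchedule and AdaptiveFetchSchedule come from the comment quoted above. The method nextInterval, its parameters and the halve-or-double policy are assumptions made purely for illustration; they are not the NUTCH-61 patch and not the Nutch plugin API.

```java
// Hypothetical extension point: decides how long to wait before re-fetching.
interface FetchSchedule {
  /**
   * Returns the next fetch interval in seconds, given the previous interval
   * and whether the page had changed when it was last fetched.
   */
  long nextInterval(long previousInterval, boolean pageChanged);
}

// Keeps the interval fixed, whatever happened on the last fetch.
final class DefaultFetchSchedule implements FetchSchedule {
  @Override
  public long nextInterval(long previousInterval, boolean pageChanged) {
    return previousInterval;
  }
}

// Assumed policy: fetch changed pages sooner and unchanged pages later,
// always staying within [minInterval, maxInterval].
final class AdaptiveFetchSchedule implements FetchSchedule {
  private final long minInterval;
  private final long maxInterval;

  AdaptiveFetchSchedule(long minInterval, long maxInterval) {
    this.minInterval = minInterval;
    this.maxInterval = maxInterval;
  }

  @Override
  public long nextInterval(long previousInterval, boolean pageChanged) {
    long next = pageChanged ? previousInterval / 2 : previousInterval * 2;
    return Math.max(minInterval, Math.min(maxInterval, next));
  }
}

final class FetchScheduleDemo {
  public static void main(String[] args) {
    FetchSchedule schedule = new AdaptiveFetchSchedule(60, 3600);
    System.out.println(schedule.nextInterval(600, true));
    System.out.println(schedule.nextInterval(600, false));
  }
}
```

With an interface like this, the default and adaptive behaviours could live in separate plugins and be tested side by side before anything is merged, which is the point of the proposal.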





--
www.babatu.com


Re: 0.7.3 version

2006-11-23 Thread Piotr Kosiorowski
As no objections were raised I created a 0.7.3 version in JIRA so we can 
start assigning current JIRA issues to it.

Regards
Piotr
Piotr Kosiorowski wrote:

Hello committers,
Based on a recent discussion on the nutch user list (Strategic Direction
of Nutch), I would like to prepare a 0.7.3 release. The idea is to let
people who still use 0.7.2 get rid of the most important bugs and add
some small features they need, as the claim is that 0.8.1 is not good
for small crawls at the moment. It will also allow us to work on the
0.8 branch so that it becomes more friendly to small installations.
My proposed approach: if no one objects, I will create a 0.7.3 release
in JIRA and ask people to assign issues with patches to it. I do not
have a lot of time personally, so I do not plan to do any development
myself, just taking care of high-quality patches and committing them.
After some time, when we have gathered some amount of bugfixes/issues,
I would prepare the 0.7.3 release. Any objections or comments?
Regards
Piotr





Re: [jira] Closed: (NUTCH-406) Metadata tries to write null values

2006-11-23 Thread Andrzej Bialecki

Chris Mattmann wrote:

4. The issue could have been iterated in JIRA a bit further so all these
could have been caught before a commit.



This is true: however, I thought that the point of bringing in new people
was to move forward on some of these critical issues that keep moving their
way down the priority stack? The issues that you raise above (e.g.,
whitespace v. tabs, and unnecessary comments), although relevant points,
really had nothing to do with the fix itself. I wanted to get the fix into
the sources before everyone went away for thanksgiving (at least here in the
U.S.), so that users could pull it down sooner rather than later. Is this
not the correct policy? I'm a n00b, so I dunno ;)
  


My practice is to leave the fix to mature a day or two (or three if it's 
a holiday season), even if it seems innocuous. The reason is that quite 
often people come back with valuable and totally unexpected insights 
(peer review) _when_ and if they had a chance to see the fix - and 
considering different time zones, occupations and workloads this may 
take a day or two even with best intentions... If a fix is complicated I 
explicitly ask for feedback.


--
Best regards,
Andrzej Bialecki 
___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com