[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-02-08 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12365619 ] Doug Cutting commented on NUTCH-139: +1 This looks great. Thanks for all the hard work on this one! Standard metadata property names in the ParseData metadata

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-02-08 Thread Andrzej Bialecki (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12365623 ] Andrzej Bialecki commented on NUTCH-139: - I like this patch, the split of Metadata names into interfaces looks right. +1. Standard metadata property names in the

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-02-03 Thread Jerome Charron (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12365066 ] Jerome Charron commented on NUTCH-139: -- Sorry for this very late response... The idea behind separate subclasses of Metadata for content and parses is to enforce the

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-02-03 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12365089 ] Doug Cutting commented on NUTCH-139: Jerome: yes, it makes sense, but there's also metadata that's not tightly related to the protocol or the parser, e.g., the nutch

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-02-03 Thread Jerome Charron (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12365095 ] Jerome Charron commented on NUTCH-139: -- Ok Doug. Your point of view makes sense to me. I hope I can provide a (final) patch next week. Standard metadata

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-02-03 Thread Andrzej Bialecki (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12365098 ] Andrzej Bialecki commented on NUTCH-139: - FWIW, I agree with Doug on this - I don't see that subclasses would buy us much in terms of functionality, except for the

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-02-03 Thread Jerome Charron (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12365103 ] Jerome Charron commented on NUTCH-139: -- except for the sake of purity of OO approach Andrzej, as you noticed certainly, it is my defect... ;-) You know, I have still

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-27 Thread Jerome Charron (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12364218 ] Jerome Charron commented on NUTCH-139: -- I think we're near agreement here. I really hope ... ;-) We should add an add() method to Metadata, and change set() to

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-26 Thread Jerome Charron (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12364112 ] Jerome Charron commented on NUTCH-139: -- In fact, the more I look at this, the more I agree with Doug's last comment. There is no real need (for now) for such a complicated

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-26 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12364116 ] Chris A. Mattmann commented on NUTCH-139: - Just to add to Jerome's last comment, I think the key here is simplicity. As a software developer, and ultimately as an end

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-26 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12364125 ] Doug Cutting commented on NUTCH-139: I think we're near agreement here. Here are the changes I think this patch still needs: MetadataNames belongs in the protocol

Re: [jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-26 Thread Andrzej Bialecki
Doug Cutting (JIRA) wrote: [ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12364125 ] My apologies for commenting here - JIRA produces broken HTML for me, I can't use it... Doug Cutting commented on NUTCH-139: I think

Re: [jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-26 Thread Doug Cutting
Andrzej Bialecki wrote: Erhm.. please bear with me. I'd rather see these two classes in a separate package altogether, org.apache.nutch.metadata. The reason is that most likely these two classes will be used elsewhere too, not just in the protocol and parse/fetch related context. I'm

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-25 Thread Andrzej Bialecki (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12363942 ] Andrzej Bialecki commented on NUTCH-139: - Yes, this should work ok ... but it strikes me as unnecessarily complicated. After all, in most cases we will have single

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-24 Thread Jerome Charron (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12363834 ] Jerome Charron commented on NUTCH-139: -- Andrzej, I really don't like this X-Nutch naming convention. First it's really protocol level oriented, and it forces to map

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-20 Thread Andrzej Bialecki (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12363394 ] Andrzej Bialecki commented on NUTCH-139: - Yes, I agree with the split into a generic MetaData container, and subclasses that define necessary constants for metadata

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-19 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12363352 ] Chris A. Mattmann commented on NUTCH-139: - Hi Jerome, org.apache.nutch.parse.ParseData * The constructor becomes ParseData(ParseStatus, String, Outlink[],

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-13 Thread Jerome Charron (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12362618 ] Jerome Charron commented on NUTCH-139: -- Here is a new proposal for this issue. org.apache.nutch.util.MetaData * becomes a utility class that is only a container of

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-09 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12362242 ] Doug Cutting commented on NUTCH-139: We can just use different names, rather than two metaData objects: X-nutch names for derived or other values that are usually protocol

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-07 Thread Andrzej Bialecki (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12362049 ] Andrzej Bialecki commented on NUTCH-139: - I see three issues here: * using standard metadata names and handling misspelled/erroneous ones: this patch provides this

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-07 Thread Jerome Charron (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12362061 ] Jerome Charron commented on NUTCH-139: -- I agree with your analysis Andrzej. I suggested to commit this patch because it is a response to this issue: standard metadata

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-06 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12361994 ] Doug Cutting commented on NUTCH-139: Jerome, Some HTTP headers have multiple values. Correctly reflecting that was I thought the primary motivation for adding multiple
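
For readers following the multi-valued header point, here is a minimal sketch of a container that keeps several values per name. The class `MultiValuedMetadata` and its methods are hypothetical illustrations and not the Nutch `Metadata` API. In particular, the assumption that `set()` replaces earlier values while `add()` appends is mine, since the comments in this thread are cut off before spelling that out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical multi-valued container; not the actual Nutch Metadata class. */
class MultiValuedMetadata {
    // Insertion-ordered map from a property name to all of its values.
    private final Map<String, List<String>> values = new LinkedHashMap<>();

    /** Appends a value, keeping any values already stored under the name. */
    void add(String name, String value) {
        values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    /** Assumed semantics: discards earlier values and stores only this one. */
    void set(String name, String value) {
        List<String> single = new ArrayList<>();
        single.add(value);
        values.put(name, single);
    }

    /** Returns the first value, or null when the name is absent. */
    String get(String name) {
        List<String> list = values.get(name);
        return (list == null || list.isEmpty()) ? null : list.get(0);
    }

    /** Returns a copy of all values for the name (empty when absent). */
    List<String> getValues(String name) {
        List<String> list = values.get(name);
        return list == null ? new ArrayList<>() : new ArrayList<>(list);
    }
}
```

With this shape, a protocol plugin could call `add("Set-Cookie", ...)` once per header line, while single-valued callers keep using `get()` and `set()` unchanged.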

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-06 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12362003 ] Doug Cutting commented on NUTCH-139: Also, since the primary use of multiple metadata values should be for protocols where multiple-values are required, the method to add

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-05 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12361924 ] Chris A. Mattmann commented on NUTCH-139: - Hi Doug, While it's true that content-length can be computed from the Content's data, wouldn't it also be nice to have it

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-05 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12361926 ] Chris A. Mattmann commented on NUTCH-139: - Hi Doug, While it's true that content-length can be computed from the Content's data, wouldn't it also be nice to have it

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-05 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12361927 ] Chris A. Mattmann commented on NUTCH-139: - Hi Doug, While it's true that content-length can be computed from the Content's data, wouldn't it also be nice to have it

RE: [jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-05 Thread chris.mattmann
-----Original Message----- From: Doug Cutting (JIRA) [mailto:[EMAIL PROTECTED] Sent: Thursday, January 05, 2006 8:04 PM To: nutch-dev@incubator.apache.org Subject: [jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata [ http://issues.apache.org/jira

RE: [jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-05 Thread Chris Mattmann
-----Original Message----- From: Doug Cutting (JIRA) [mailto:[EMAIL PROTECTED] Sent: Thursday, January 05, 2006 8:04 PM To: nutch-dev@incubator.apache.org Subject: [jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata [ http://issues.apache.org/jira

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2005-12-21 Thread Jerome Charron (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12361041 ] Jerome Charron commented on NUTCH-139: -- Ok, Chris and I will implement MetadataNames in this way. Just a few comments: I plan to move the MetadataNames to a class

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2005-12-21 Thread Andrzej Bialecki (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12361043 ] Andrzej Bialecki commented on NUTCH-139: - Regarding the move to a class with public static fields: I don't have any problem with that. Regarding the Levenshtein

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2005-12-21 Thread Jerome Charron (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12361045 ] Jerome Charron commented on NUTCH-139: -- Andrzej, Do you read my mind? Yes of course, that's the way I want to do it: First check for the most common cases (lower
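
The two-step idea discussed here (cheap checks first, Levenshtein distance as a fallback) could look roughly like the sketch below. Everything in it is an assumption made for illustration: the class `NameNormalizer`, the list of known names, and the distance threshold of 2 are hypothetical and are not taken from the patch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical normalizer for metadata names; not code from the NUTCH-139 patch. */
class NameNormalizer {
    // Assumed sample of canonical names; the real list would come from the patch.
    static final List<String> KNOWN =
        Arrays.asList("Content-Type", "Content-Length", "Last-Modified");

    /** Classic two-row dynamic-programming Levenshtein edit distance. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp; // newest row is now prev
        }
        return prev[b.length()];
    }

    /** Returns the canonical name, or the input unchanged when nothing is close. */
    static String normalize(String name) {
        // Step 1: cheap check for the common case of a case-only difference.
        for (String k : KNOWN) {
            if (k.equalsIgnoreCase(name)) return k;
        }
        // Step 2: fall back to edit distance, accepting only near misses.
        String lower = name.toLowerCase(Locale.ROOT);
        String best = null;
        int bestDist = Integer.MAX_VALUE;
        for (String k : KNOWN) {
            int d = levenshtein(lower, k.toLowerCase(Locale.ROOT));
            if (d < bestDist) { bestDist = d; best = k; }
        }
        return bestDist <= 2 ? best : name; // threshold 2 is an arbitrary choice
    }
}
```

Running the cheap comparison first keeps the quadratic distance computation off the common path, which appears to be the concern raised in this exchange.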

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2005-12-20 Thread Andrzej Bialecki (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12360901 ] Andrzej Bialecki commented on NUTCH-139: - I have an objection, in fact I think the patches miss the main point of using prefixed property names. In this patch

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2005-12-20 Thread Jerome Charron (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12360902 ] Jerome Charron commented on NUTCH-139: -- Andrzej, Thanks for taking time to take a look at the patch. In fact, we had some discussion with Chris about this point (that's

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2005-12-20 Thread Jerome Charron (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12360906 ] Jerome Charron commented on NUTCH-139: -- Andrzej, Here are more comments about my doubts, and how to handle metadata names. if for instance a protocol plugin doesn't have

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2005-12-20 Thread Jerome Charron (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12360920 ] Jerome Charron commented on NUTCH-139: -- And why not use the fact that the ContentProperties object can now handle multi-valued properties? Each piece of code that

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2005-12-20 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12360929 ] Chris A. Mattmann commented on NUTCH-139: - Hi Andrzej, I have an objection, in fact I think the patches miss the main point of using prefixed property names.

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2005-12-20 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12360931 ] Chris A. Mattmann commented on NUTCH-139: - Hmm, Okay, I just finished reading the rest of the comments :-) Sorry, just woke up out here in Los Angeles. Okay, I

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2005-12-20 Thread Andrzej Bialecki (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12360933 ] Andrzej Bialecki commented on NUTCH-139: - I like Jerome's proposal of using the new ContentProperties class; this could save a lot of work, especially this naming

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2005-12-17 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12360681 ] Chris A. Mattmann commented on NUTCH-139: - Hi Doug, Jerome, I'm confused as to why all of the constant names have X_nutch in them. I'd expect to see something like

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2005-12-16 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12360645 ] Doug Cutting commented on NUTCH-139: I'm confused as to why all of the constant names have X_nutch in them. I'd expect to see something like that in their string values,

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2005-12-13 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12360389 ] Chris A. Mattmann commented on NUTCH-139: - According to Andrzej: I agree, too. Perhaps we should use the names as they appear in the Dublin Core for those properties