[ 
http://issues.apache.org/jira/browse/NUTCH-192?page=comments#action_12364542 ] 

Andrzej Bialecki  commented on NUTCH-192:
-----------------------------------------

I have two comments:

* It's not obvious to me what the strong arguments are in favor of storing 
Writables. I'd think that for the vast majority of applications Strings are 
sufficient, which would simplify the code and save a lot of space (at the cost 
of having to convert non-string values to and from Strings, in rare cases).

* If we really, really need Writables, then perhaps it would be better to store 
the mapping dictionary <class name, id>, and then use a single byte as the id. 
I don't think one would need more than 256 different classes in a MapWritable, 
and this way we could avoid the static mapping table (which I'm afraid would 
cause its own problems with changes and versioning).
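
To make the second point more concrete, here is a rough sketch of such a 
dictionary. It is only an illustration and is not taken from the attached 
patch: the class ClassIdDictionary, its methods, the on-disk layout (a count 
followed by the class names in id order) and the example class names are all 
assumptions made up for this sketch. It uses plain java.io streams rather than 
the Writable interface to stay self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical per-map dictionary: class name <-> one-byte id. */
public class ClassIdDictionary {
    private final Map<String, Byte> idsByName = new LinkedHashMap<>();
    private final String[] namesById = new String[256];
    private int size = 0;

    /** Returns the id of a class name, assigning the next free id on first use. */
    public byte idFor(String className) {
        Byte known = idsByName.get(className);
        if (known != null) {
            return known;
        }
        if (size == 256) {
            throw new IllegalStateException("more than 256 classes in one map");
        }
        byte id = (byte) size;          // ids 128..255 wrap to negative bytes
        idsByName.put(className, id);
        namesById[size++] = className;
        return id;
    }

    /** Looks up the class name for an id; the byte is treated as unsigned. */
    public String nameFor(byte id) {
        String name = namesById[id & 0xFF];
        if (name == null) {
            throw new IllegalArgumentException("unknown class id " + (id & 0xFF));
        }
        return name;
    }

    public int size() {
        return size;
    }

    /** Written once per map: entry count, then the class names in id order. */
    public void write(DataOutputStream out) throws IOException {
        out.writeShort(size);
        for (int i = 0; i < size; i++) {
            out.writeUTF(namesById[i]);
        }
    }

    /** Rebuilds the dictionary; ids are implied by the order of the names. */
    public static ClassIdDictionary read(DataInputStream in) throws IOException {
        ClassIdDictionary dict = new ClassIdDictionary();
        int count = in.readUnsignedShort();
        for (int i = 0; i < count; i++) {
            dict.idFor(in.readUTF());
        }
        return dict;
    }

    public static void main(String[] args) throws IOException {
        ClassIdDictionary dict = new ClassIdDictionary();
        // Placeholder class names, chosen only for the example.
        byte textId = dict.idFor("example.TextValue");
        byte longId = dict.idFor("example.LongValue");

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        dict.write(new DataOutputStream(buffer));

        ClassIdDictionary copy = ClassIdDictionary.read(
            new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        System.out.println(textId + " -> " + copy.nameFor(textId));
        System.out.println(longId + " -> " + copy.nameFor(longId));
    }
}
```

With something like this, each entry in the map would carry one byte for its 
class instead of a full class name, and the dictionary travels with the data, 
so no static table has to be kept in sync across versions.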

> meta data support for CrawlDatum
> --------------------------------
>
>          Key: NUTCH-192
>          URL: http://issues.apache.org/jira/browse/NUTCH-192
>      Project: Nutch
>         Type: Improvement
>     Versions: 0.8-dev
>     Reporter: Stefan Groschupf
>      Fix For: 0.8-dev
>  Attachments: metadata300106.patch
>
> Supporting meta data in CrawlDatum would help to realize a set of new Nutch 
> features and would make a lot possible for smaller, specially focused search 
> engines.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira



_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
