[ http://issues.apache.org/jira/browse/NUTCH-192?page=comments#action_12364694 ]
Andrzej Bialecki commented on NUTCH-192:
-----------------------------------------

What I meant was that both keys and values should be Strings (or rather UTF8), for the sake of simplicity.

Let's take your example: if we use Writables, then to store 1 ByteWritable you need:

* 1 byte - type id
* 1 byte - value
* plus whatever it takes to put the class name->id mapping in the MapWritable header (the mapping table): let's assume 40 bytes.

For storing one value that is a substantial overhead. For storing hundreds of values the overhead goes down asymptotically to 1 byte per value.

So the real question is which typical use scenario we want to optimize for: whether you intend to store hundreds of metadata values of different types, or just a couple. If the former, then using MapWritable makes sense; if the latter, using Strings is simpler.

> meta data support for CrawlDatum
> --------------------------------
>
>          Key: NUTCH-192
>          URL: http://issues.apache.org/jira/browse/NUTCH-192
>      Project: Nutch
>         Type: Improvement
>     Versions: 0.8-dev
>     Reporter: Stefan Groschupf
>      Fix For: 0.8-dev
>  Attachments: metadata300106.patch
>
> Supporting meta data in CrawlDatum would help to get a set of new nutch
> features realized and makes a lot possible to smaller special focused search
> engines.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
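
The size estimate in the comment can be worked through with a small back-of-the-envelope sketch. It does not use or measure the real MapWritable serialization: the class name `MetadataOverhead` and its methods are hypothetical, the 40-byte header is the comment's own assumption, all values are assumed to be ByteWritables of a single type, and keys are ignored, as they are in the comment.

```java
// Rough estimate only: constants are taken from the comment's assumptions,
// not from the actual MapWritable wire format.
public class MetadataOverhead {
    // Assumed size of the class name -> id mapping table in the header.
    static final int HEADER_BYTES = 40;
    // One type id byte is stored with every value.
    static final int TYPE_ID_BYTES = 1;
    // Payload of a single ByteWritable.
    static final int VALUE_BYTES = 1;

    // Estimated total size of n values that all share one type.
    static int totalBytes(int n) {
        return HEADER_BYTES + n * (TYPE_ID_BYTES + VALUE_BYTES);
    }

    // Bytes per value that are not payload (header share plus type id).
    static double overheadPerValue(int n) {
        return (double) (totalBytes(n) - n * VALUE_BYTES) / n;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 10, 100, 1000}) {
            System.out.println(n + " value(s): " + totalBytes(n)
                    + " bytes total, " + overheadPerValue(n)
                    + " bytes overhead per value");
        }
    }
}
```

Under these assumptions a single value costs 42 bytes, 41 of them overhead, while the per-value overhead is 1 + 40/n and so approaches 1 byte as n grows. Storing several different value types would enlarge the mapping table, so the header term would grow with the number of distinct types.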
