[ 
http://issues.apache.org/jira/browse/NUTCH-192?page=comments#action_12364683 ] 

Stefan Groschupf commented on NUTCH-192:
----------------------------------------

Andrzej, Doug, I'm not sure I understand you correctly: do you suggest having 
string keys and values, or just string keys?
This confuses me a bit, though I may be misunderstanding things because of my 
English, since I remember that one reason for having no meta data until now was 
performance and the size of the data. 
In one of my personal use cases I have a set of meta data with definitely 
fewer than 255 entries, and I only need to store some long values.
So I would love to use key:ByteWritable and value:LongWritable. 

Storing new LongWritable(23) or new UTF8("23") should make a significant 
difference in size. Also, parsing a byte, int or long from a string takes some time.
There is at least one nice side effect: since this map is also a Writable, we can 
store a Map in a Map, which allows hierarchical meta data.
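
To make the size question concrete, here is a rough sketch of how one could 
measure it. It does not use the Nutch classes: plain java.io.DataOutputStream 
stands in for the Writable serialization, on the assumption that writeLong 
roughly matches what a long-valued Writable emits and that writeUTF (2-byte 
length prefix plus the characters) approximates a UTF8 string. The names 
SizeSketch, binarySize and stringSize are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class SizeSketch {

    // Bytes written when the value is stored as a binary long.
    // Assumption: this mirrors a long-valued Writable, not the real class.
    static int binarySize(long value) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeLong(value);
        }
        return buf.size();
    }

    // Bytes written when the same value is stored as a decimal string.
    // Assumption: writeUTF (2-byte length + characters) approximates UTF8.
    static int stringSize(long value) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(Long.toString(value));
        }
        return buf.size();
    }

    public static void main(String[] args) throws IOException {
        long[] samples = {23L, Long.MAX_VALUE};
        for (long v : samples) {
            System.out.println(v + ": binary=" + binarySize(v)
                    + " bytes, string=" + stringSize(v) + " bytes");
        }
    }
}
```

Under these assumptions the binary form is always 8 bytes, while the string 
form grows with the number of digits, so which one is smaller depends on the 
magnitude of the values stored. The parsing cost on read is not measured here.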

I fully agree with having a manually created mapping table stored in the 
MapWritable class; I will change this and commit a new patch.
Thanks for your comments!

> meta data support for CrawlDatum
> --------------------------------
>
>          Key: NUTCH-192
>          URL: http://issues.apache.org/jira/browse/NUTCH-192
>      Project: Nutch
>         Type: Improvement
>     Versions: 0.8-dev
>     Reporter: Stefan Groschupf
>      Fix For: 0.8-dev
>  Attachments: metadata300106.patch
>
> Supporting meta data in CrawlDatum would help to get a set of new nutch 
> features realized and makes a lot possible to smaller special focused search 
> engines.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
