[ http://issues.apache.org/jira/browse/NUTCH-192?page=all ]

Stefan Groschupf updated NUTCH-192:
-----------------------------------

    Attachment: metadata08_02_06.patch

Doug, I'm afraid there is a misunderstanding, or maybe I just do not 
understand your comments.
A plugin never needs to add a class-id mapping anymore. The later patches 
(after Andrzej's suggestions) can handle any kind of Writable. If the class 
is not known in the mapping, we create an internal id-class tuple and write 
it to, or read it from, the 'header' of each MapWritable. So users can use 
any kind of custom Writable; this just takes some more space in the file 
(one byte for the id and a UTF8 for the class name). If a new Writable turns 
out to be used frequently, we can add it to the mapping.

So, as suggested, I moved the mapping from WritableName into a static block 
of MapWritable, and when unknown Writables are used we read or write a header 
containing this id-class tuple. From my point of view this is the best 
solution for now, and I don't think new, frequently used Writables will come 
up that often.
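
The header scheme described above could look roughly like the following 
sketch. It is not taken from the patch: the class name ClassIdHeaderSketch, 
the KNOWN table, the starting id 64, and the leading entry count are all 
assumptions made for illustration, and plain java.io writeUTF stands in for 
the UTF8 class mentioned above.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; names and ids here are hypothetical, not the patch's code.
public class ClassIdHeaderSketch {

    // Stand-in for the static class-id mapping of frequently used classes.
    static final Map<String, Byte> KNOWN = new HashMap<>();
    static {
        KNOWN.put("java.lang.String", (byte) 1);
        KNOWN.put("java.lang.Integer", (byte) 2);
    }

    // Hypothetical first id handed out to classes missing from KNOWN.
    static final byte FIRST_DYNAMIC_ID = 64;

    /**
     * Assigns an internal id to every class name that is not in KNOWN and
     * writes those id-class tuples as a header: an entry count, then one byte
     * for the id and a UTF string for the class name per entry.
     * Id overflow past 127 is not handled in this sketch.
     */
    static Map<String, Byte> writeHeader(DataOutputStream out,
                                         Iterable<String> classNames) throws IOException {
        Map<String, Byte> dynamic = new LinkedHashMap<>();
        byte next = FIRST_DYNAMIC_ID;
        for (String name : classNames) {
            if (!KNOWN.containsKey(name) && !dynamic.containsKey(name)) {
                dynamic.put(name, next++);
            }
        }
        out.writeByte(dynamic.size());
        for (Map.Entry<String, Byte> e : dynamic.entrySet()) {
            out.writeByte(e.getValue());
            out.writeUTF(e.getKey());
        }
        return dynamic;
    }

    /** Reads the header back into an id-to-class-name table. */
    static Map<Byte, String> readHeader(DataInputStream in) throws IOException {
        Map<Byte, String> idToClass = new HashMap<>();
        int count = in.readUnsignedByte();
        for (int i = 0; i < count; i++) {
            byte id = in.readByte();
            idToClass.put(id, in.readUTF());
        }
        return idToClass;
    }
}
```

Classes found in KNOWN cost nothing in the header, while each unknown class 
costs one id byte plus its class name, which is the extra space mentioned 
above.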


> meta data support for CrawlDatum
> --------------------------------
>
>          Key: NUTCH-192
>          URL: http://issues.apache.org/jira/browse/NUTCH-192
>      Project: Nutch
>         Type: Improvement
>     Versions: 0.8-dev
>     Reporter: Stefan Groschupf
>      Fix For: 0.8-dev
>  Attachments: metadata010206.patch, metadata060206.patch, 
> metadata08_02_06.patch, metadata300106.patch, metadata310106.patch
>
> Supporting meta data in CrawlDatum would help to realize a set of new Nutch 
> features and would make a lot possible for smaller, specially focused 
> search engines.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira



_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
