[ http://issues.apache.org/jira/browse/NUTCH-192?page=all ]
Stefan Groschupf updated NUTCH-192:
-----------------------------------
Attachment: metadata08_02_06.patch
Doug, I'm afraid there is a misunderstanding, or maybe I just do not
understand your comments.
A plugin never needs to add a class-id mapping anymore. The later patches
(after Andrzej's suggestions) can handle any kind of Writable. If the class
is not known in the mapping, we create an internal id-class tuple and write it
to, or read it from, the 'header' of each MapWritable. So users can use any
kind of custom Writable; this just takes some more space in the file (one byte
for the id and a UTF8 for the class name). If there is a frequently used
new Writable, we can add it to the mapping.
So, as suggested, I moved the mapping from WritableName into a static block of
MapWritable, and if unknown Writables are used we write and read a header
containing these id-class tuples. From my point of view this is the best
solution for now, and I don't think we will see new, frequently used
Writables that often.
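
To make the idea concrete, here is a minimal sketch of the id-class header
scheme described above. It is not the code from the attached patch: the class
name ClassIdHeaderSketch, its methods, the example ids, and the example class
names are all hypothetical, and java.io's writeUTF stands in for Hadoop's UTF8
so the example runs without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; names and ids are assumptions. */
final class ClassIdHeaderSketch {
    // Predefined class-id mapping, filled once in a static block.
    private static final Map<String, Byte> KNOWN_IDS = new HashMap<>();
    private static final Map<Byte, String> KNOWN_NAMES = new HashMap<>();

    static {
        addKnown((byte) 1, "java.lang.String");
        addKnown((byte) 2, "java.lang.Integer");
    }

    private static void addKnown(byte id, String className) {
        KNOWN_IDS.put(className, id);
        KNOWN_NAMES.put(id, className);
    }

    // Per-instance tuples for classes that are not in the static mapping.
    private final Map<String, Byte> localIds = new LinkedHashMap<>();
    private final Map<Byte, String> localNames = new HashMap<>();
    private byte nextLocalId = (byte) (KNOWN_IDS.size() + 1);

    /** Returns the predefined id, or assigns an internal id on first use. */
    byte idFor(String className) {
        Byte known = KNOWN_IDS.get(className);
        if (known != null) {
            return known;
        }
        Byte local = localIds.get(className);
        if (local == null) {
            local = nextLocalId++;
            localIds.put(className, local);
            localNames.put(local, className);
        }
        return local;
    }

    /** Resolves an id through the static mapping first, then the header. */
    String classNameFor(byte id) {
        String name = KNOWN_NAMES.get(id);
        return name != null ? name : localNames.get(id);
    }

    /** Header layout: tuple count, then (one id byte, class name) per tuple. */
    byte[] writeHeader() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeByte(localIds.size());
            for (Map.Entry<String, Byte> entry : localIds.entrySet()) {
                out.writeByte(entry.getValue());
                out.writeUTF(entry.getKey());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reads the tuples back so that unknown ids can be resolved. */
    void readHeader(byte[] header) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(header))) {
            int count = in.readUnsignedByte();
            for (int i = 0; i < count; i++) {
                byte id = in.readByte();
                String className = in.readUTF();
                localIds.put(className, id);
                localNames.put(id, className);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch only classes missing from the static mapping cost extra bytes
in the header; a class that turns out to be used frequently could be moved
into the static block instead.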
> meta data support for CrawlDatum
> --------------------------------
>
> Key: NUTCH-192
> URL: http://issues.apache.org/jira/browse/NUTCH-192
> Project: Nutch
> Type: Improvement
> Versions: 0.8-dev
> Reporter: Stefan Groschupf
> Fix For: 0.8-dev
> Attachments: metadata010206.patch, metadata060206.patch,
> metadata08_02_06.patch, metadata300106.patch, metadata310106.patch
>
> Supporting meta data in CrawlDatum would help to realize a set of new Nutch
> features and would make a lot possible for smaller, specially focused search
> engines.
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers