Emmanuel JOKE wrote:
Hi Guys,

I've read an article which explains that we can now use Hadoop's native
libraries to compress our crawled data.

I'm just wondering how we can compress a crawldb and all the other data
that is already saved on disk.
Could you please help me?

You can use the *Merger tools to re-write the data. E.g. CrawlDbMerger for crawldb, giving just a single db as the input argument.
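A minimal sketch of that approach, assuming a standard Nutch/Hadoop setup: enable output compression in the Hadoop job configuration (e.g. in conf/nutch-site.xml or conf/hadoop-site.xml), then run CrawlDbMerger via bin/nutch with a single crawldb as input so the data is simply rewritten. The exact property names and codec below follow the classic Hadoop configuration keys; adjust them to match your Hadoop version.

```shell
# 1. Enable compression for job output in conf/nutch-site.xml
#    (property names as in classic Hadoop; verify against your version):
#
#  <property>
#    <name>mapred.output.compress</name>
#    <value>true</value>
#  </property>
#  <property>
#    <name>mapred.output.compression.codec</name>
#    <value>org.apache.hadoop.io.compress.GzipCodec</value>
#  </property>

# 2. Rewrite the crawldb by "merging" a single db into a new, compressed one:
bin/nutch mergedb crawl/crawldb_compressed crawl/crawldb

# 3. Once verified, swap the compressed db in place of the old one:
mv crawl/crawldb crawl/crawldb.bak
mv crawl/crawldb_compressed crawl/crawldb
```

The same pattern applies to the other *Merger tools (e.g. for segments or the linkdb): with compression enabled, any job that rewrites the data will produce compressed output.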


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
