Hello Doğacan,
Thanks for the reply. I have a few follow-up questions:
>> 1. Does the map-reduce operation involve intermediate data which is as
>> high ratio wise?
> Yes.
1. In your experience, what is the typical ratio between the total size of
the segments being merged and the maximum disk space required during the
merge?
2. If you used Nutch before it was built on Hadoop, did the merge perform
any better without Hadoop?
> Compressing temporary outputs may help you here.
3. I assume compression has a cost. Merging these segments, which total only
3GB, already takes me more than a day, and I need to merge segments of 40GB
or more. Do you have any data on how much slower the merge becomes when
compression of map output is enabled?
Thanks,
VB
--
View this message in context:
http://www.nabble.com/Issue-with-merging-segments-with-s-w-built-from-main-trunk-tp21641977p21650571.html
Sent from the Nutch - User mailing list archive at Nabble.com.