On 01/05/2011 07:01 PM, Ray Van Dolson wrote:
On Wed, Jan 05, 2011 at 07:41:13PM +0100, Diego Calleja wrote:
On Wednesday, 5 January 2011 18:42:42 Gordan Bobic wrote:
So by doing the hash indexing offline, the total amount of disk I/O
required effectively doubles, and the amount of CPU spent on doing the
hashing is in no way reduced.

But there are people who might want to temporarily avoid the extra cost
of online dedup, and do it offline when the server load is lower.

In my opinion, both online and offline dedup have valid use cases, and
the best choice is probably to implement both.

Question from an end-user.  When we say "offline" deduplication, are we
talking about post-process deduplication (a la what Data ONTAP does
with their SIS implementation) during which the underlying file system
data continues to be available, or a process that needs exclusive
access to the blocks to do its job?

I was assuming it was a regular cron-job that grinds away on the disks but doesn't require downtime.
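
For illustration only, here is a rough sketch of what such a post-process pass might look like. It assumes fixed 4 KiB blocks and SHA-256, and the function name find_duplicate_blocks is made up for this example; it is not how btrfs or any real dedup tool is implemented. It only reads the files, so they stay available while it runs, and the read-back in the loop is the extra disk I/O mentioned above.

```python
import hashlib
from collections import defaultdict

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def find_duplicate_blocks(paths, block_size=BLOCK_SIZE):
    """Return {digest: [(path, offset), ...]} for blocks seen more than once.

    Hypothetical helper: it only reports candidate duplicates and never
    modifies the files it scans.
    """
    index = defaultdict(list)
    for path in paths:
        with open(path, "rb") as f:
            offset = 0
            while True:
                # Offline dedup has to read back data that was already
                # written once; this read is the additional I/O.
                block = f.read(block_size)
                if not block:
                    break
                digest = hashlib.sha256(block).hexdigest()
                index[digest].append((path, offset))
                offset += len(block)
    # Keep only hashes that occur at more than one location.
    return {d: locs for d, locs in index.items() if len(locs) > 1}
```

A cron job could call this over a directory tree during quiet hours and hand the resulting candidates to whatever mechanism actually merges the blocks; that merging step is deliberately left out here.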

Gordan
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
