Some advanced recovery steps are documented[1], but there is no
automated "fix it for you" tool.
It's probably a good idea to either set "fs.trash.interval" and/or
"fs.trash.checkpoint.interval" in core-site.xml to values that reflect
how much HDFS space you can afford to devote to trash, or just turn off
trash and take the necessary steps to make sure your data is backed up
(if that's a priority for you).
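For example, a minimal core-site.xml sketch of those two settings (the
values here are illustrative placeholders, not recommendations; both
intervals are in minutes):

  <property>
    <name>fs.trash.interval</name>
    <!-- Minutes that deleted files stay in .Trash before being purged;
         1440 = 24 hours. Setting this to 0 disables trash entirely. -->
    <value>1440</value>
  </property>
  <property>
    <name>fs.trash.checkpoint.interval</name>
    <!-- Minutes between trash checkpoints; should be <= fs.trash.interval.
         If set to 0, it defaults to the value of fs.trash.interval. -->
    <value>60</value>
  </property>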
HDFS (and Accumulo, for that matter) is only as reliable as the hardware
and configuration you run it on. Both are built to be robust and
reliable systems, but they aren't without their flaws given enough time.
[1] http://accumulo.apache.org/1.7/accumulo_user_manual.html#_advanced_system_recovery
James Hughes wrote:
Ok, I can see the benefit of being able to recover data. Is this
process documented? And is there any kind of user-friendly tool for it?
On Mon, Aug 17, 2015 at 4:11 PM, <[email protected]> wrote:
It's not temporary files, it's any file that has been compacted
away. If you keep files around longer than
{dfs.namenode.checkpoint.period}, then you have a chance to recover
in case your most recent checkpoint is corrupt.
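For reference, the property mentioned above lives in hdfs-site.xml; a
minimal sketch, assuming the usual Hadoop default of one hour (the value
is in seconds):

  <property>
    <name>dfs.namenode.checkpoint.period</name>
    <!-- Seconds between NameNode metadata checkpoints; 3600 = 1 hour. -->
    <value>3600</value>
  </property>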
------------------------------------------------------------------------
*From: *"James Hughes" <[email protected] <mailto:[email protected]>>
*To: *[email protected] <mailto:[email protected]>
*Sent: *Monday, August 17, 2015 3:57:57 PM
*Subject: *Accumulo GC and Hadoop trash settings
Hi all,
From reading about the Accumulo GC, it sounds like temporary files
are routinely deleted during GC cycles. In a small testing
environment, I've seen the HDFS Accumulo user's .Trash folder have 10s of
gigabytes of data.
Is there any reason that the default value for gc.trash.ignore is
false? Is there any downside to deleting GC'ed files completely?
Thanks in advance,
Jim
http://accumulo.apache.org/1.6/accumulo_user_manual.html#_gc_trash_ignore
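For reference, a minimal sketch of how gc.trash.ignore would be set in
accumulo-site.xml if you decide to bypass the trash entirely (the value
shown is for illustration, not a recommendation):

  <property>
    <name>gc.trash.ignore</name>
    <!-- When true, the Accumulo GC deletes files outright instead of
         moving them to the HDFS user's .Trash folder. -->
    <value>true</value>
  </property>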