That is a good idea. I currently use a shell script that does the rough equivalent of rsync -av, but it wouldn't be bad to have a one-liner that solves the same problem.
One (slight) benefit to the scripted approach is that I get a list of directories to which files have been moved. That lets me reprocess entire directories for aggregates when something changes. I expect that a clean implementation of rsync could give me a list of files that I could sed into a list of directories.

On 1/2/08 7:03 AM, "Greg Connor" <[EMAIL PROTECTED]> wrote:

> Hello,
>
> Does anyone know of a modified "rsync" that gets/puts files to/from the dfs
> instead of the normal, mounted filesystems? I'm guessing since the dfs can't
> be mounted like a "normal" filesystem that rsync would need to be modified in
> order to access it, as with any other program. We use rsync --daemon a lot
> for moving files around, making backups, etc., so I think it should be a
> logical fit... at least I hope so.
>
> I'm new to hadoop and just got my first standalone node configured. Apologies
> if this has been answered before, or if I'm missing something obvious.
>
> Thanks
> gregc
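For what it's worth, the "sed a file list into a list of directories" step is a one-liner: strip the last path component from each file path, then de-duplicate. This is just a sketch; the file names here (changed_files.txt and the sample paths) are made-up examples, not anything from my actual setup.

```shell
# Hypothetical example input: one changed file path per line.
printf '%s\n' \
  'logs/2008/01/part-0000' \
  'logs/2008/01/part-0001' \
  'logs/2008/02/part-0000' > changed_files.txt

# Strip the trailing filename component with sed (using | as the
# delimiter to avoid escaping slashes), then de-duplicate with sort -u.
sed 's|/[^/]*$||' changed_files.txt | sort -u
```

That prints each affected directory exactly once, which is what you'd feed back into a reprocessing job.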