[EMAIL PROTECTED] wrote:
However I'm getting the following error:

copyFromLocal: Target /user/root/crawl/crawldb/current/part-00000/.data.crc already exists

Please file a bug report. The problem is that copyFromLocal should exclude .crc files when it enumerates local files, but it does not. The enumeration is the listFiles() call at DistributedFileSystem:160. It should filter the results, excluding any file for which FileSystem.isChecksumFile() returns true.
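A rough sketch of the kind of filtering I mean, written against plain java.io rather than the Hadoop classes so that it stands alone. The class and method names (LocalListing, isChecksumName, listNonChecksumFiles) are made up for illustration, and the naming rule (a leading "." and a trailing ".crc") is an assumption based on the .data.crc name in the error above. A real fix would call FileSystem.isChecksumFile() instead.

```java
import java.io.File;

public class LocalListing {
    // Assumed convention: a checksum file is named ".<name>.crc".
    // Hypothetical stand-in for FileSystem.isChecksumFile().
    static boolean isChecksumName(String name) {
        return name.startsWith(".") && name.endsWith(".crc");
    }

    // List the entries of a local directory, skipping checksum files.
    static File[] listNonChecksumFiles(File dir) {
        File[] files = dir.listFiles((d, name) -> !isChecksumName(name));
        // listFiles() returns null if dir is not a readable directory.
        return files == null ? new File[0] : files;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (File f : listNonChecksumFiles(dir)) {
            System.out.println(f.getPath());
        }
    }
}
```

With a filter like this in place, the copy would only see the data files, and the checksum files would be regenerated on the target side rather than copied.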

BTW, as a workaround, it is safe to remove all of the .crc files first, but your files will then no longer be checksummed as they are read. On systems without ECC memory, file corruption is not uncommon, but I have seen very little of it on clusters that have ECC.
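If you want to script the workaround, something along these lines should do it for a local directory tree. This is only a sketch: the class and method names are hypothetical, it assumes checksum files are exactly those named ".<name>.crc", and it deletes files, so try it on a copy first.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RemoveCrcFiles {
    // Assumed convention: a checksum file is named ".<name>.crc".
    static boolean isChecksumName(String name) {
        return name.startsWith(".") && name.endsWith(".crc");
    }

    // Delete every checksum file under root; return how many were removed.
    static int removeChecksumFiles(Path root) {
        try {
            List<Path> targets;
            // Collect first, then delete, so the walk is not disturbed.
            try (Stream<Path> walk = Files.walk(root)) {
                targets = walk
                    .filter(Files::isRegularFile)
                    .filter(p -> isChecksumName(p.getFileName().toString()))
                    .collect(Collectors.toList());
            }
            for (Path p : targets) {
                Files.delete(p);
            }
            return targets.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("removed " + removeChecksumFiles(root) + " checksum files");
    }
}
```

Run it against the local crawl directory before copyFromLocal, passing the directory as the first argument.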

Doug
