We have a compression utility that tries to pick up all the subdirectories of a directory on HDFS. It makes a call like this:

    FileStatus[] subdirs = fs.globStatus(new Path(inputdir, "*"));

and handles files vs. directories accordingly. We tried to run our utility against a directory containing a computed Solr shard, which has files that look like this:

    -rw-r--r-- 2 hadoopuser visible 8538430603 2011-09-01 18:58 /test/output/solr-20110901165238/part-00000/data/index/_ox.fdt
    -rw-r--r-- 2 hadoopuser visible 233396596 2011-09-01 18:57 /test/output/solr-20110901165238/part-00000/data/index/_ox.fdx
    -rw-r--r-- 2 hadoopuser visible 130 2011-09-01 18:57 /test/output/solr-20110901165238/part-00000/data/index/_ox.fnm
    -rw-r--r-- 2 hadoopuser visible 2147948283 2011-09-01 18:55 /test/output/solr-20110901165238/part-00000/data/index/_ox.frq
    -rw-r--r-- 2 hadoopuser visible 87523726 2011-09-01 18:57 /test/output/solr-20110901165238/part-00000/data/index/_ox.nrm
    -rw-r--r-- 2 hadoopuser visible 920936168 2011-09-01 18:57 /test/output/solr-20110901165238/part-00000/data/index/_ox.prx
    -rw-r--r-- 2 hadoopuser visible 22619542 2011-09-01 18:58 /test/output/solr-20110901165238/part-00000/data/index/_ox.tii
    -rw-r--r-- 2 hadoopuser visible 2070214402 2011-09-01 18:51 /test/output/solr-20110901165238/part-00000/data/index/_ox.tis
    -rw-r--r-- 2 hadoopuser visible 20 2011-09-01 18:51 /test/output/solr-20110901165238/part-00000/data/index/segments.gen
    -rw-r--r-- 2 hadoopuser visible 282 2011-09-01 18:55 /test/output/solr-20110901165238/part-00000/data/index/segments_2

The globStatus call only seems able to pick up the last two files; the files whose names start with _ don't register. I've skimmed the FileSystem and GlobExpander source for anything related to this but didn't find it, and Google didn't turn up anything about underscores.

Am I misunderstanding something about the glob patterns needed to pick these up, or am I unaware of some filename convention in HDFS?
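
For comparison, here is a minimal local-filesystem sketch of the same "glob everything with *" idea. It uses java.nio rather than Hadoop's FileSystem.globStatus, so it says nothing definitive about HDFS behaviour; the class name LocalGlobCheck, the helper globNames, and the temp-directory contents are made up for illustration. It only shows that under ordinary glob semantics a bare * is not expected to skip names beginning with an underscore.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical local analogue of the utility's listing step (not Hadoop code).
class LocalGlobCheck {

    // Returns the sorted names of the direct children of dir that match glob.
    static List<String> globNames(Path dir, String glob) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path p : stream) {
                names.add(p.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Build a scratch directory that mimics a few entries from the listing.
        Path dir = Files.createTempDirectory("glob-check");
        Files.createFile(dir.resolve("_ox.fdt"));
        Files.createFile(dir.resolve("segments.gen"));
        Files.createDirectory(dir.resolve("part-00000"));

        // Handle files vs. directories separately, as the utility does.
        for (String name : globNames(dir, "*")) {
            String kind = Files.isDirectory(dir.resolve(name)) ? "dir " : "file";
            System.out.println(kind + " " + name);
        }
    }
}
```

If the local run lists the underscore-prefixed file but the HDFS run does not, that would suggest the difference lies in how the Hadoop side filters or expands the glob rather than in the pattern itself.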
