I'm wondering whether the distribution of your few columns across the actual RFiles has an impact. I believe that, even without locality groups configured, a subset of the RFiles could be skipped without even being opened (because we know the column family you asked for doesn't exist in the file).

So, if one column family happens to be in only some files, whereas the other column family happens to be in all the files (or at least some of the larger ones), then in the first case Accumulo is just reading much less data.
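To make the idea concrete, here is a toy model in Python. It is only a sketch of the reasoning above, not Accumulo code: the file names, sizes, and column family names (`cf_common`, `cf_rare`) are made up, and it assumes each file carries a record of which column families it contains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RFileSummary:
    """Hypothetical per-file summary; not a real Accumulo class."""
    name: str
    size_mb: int
    column_families: frozenset


def files_to_read(files, wanted_cf):
    """Return the files that a scan restricted to wanted_cf cannot skip."""
    return [f for f in files if wanted_cf in f.column_families]


def mb_to_read(files, wanted_cf):
    """Total size of the files that must be read for wanted_cf."""
    return sum(f.size_mb for f in files_to_read(files, wanted_cf))


# Made-up layout: cf_common is in every file, cf_rare only in one small file.
FILES = [
    RFileSummary("A.rf", 2000, frozenset({"cf_common"})),
    RFileSummary("B.rf", 1500, frozenset({"cf_common"})),
    RFileSummary("C.rf", 10, frozenset({"cf_common", "cf_rare"})),
]

if __name__ == "__main__":
    for cf in ("cf_rare", "cf_common"):
        names = [f.name for f in files_to_read(FILES, cf)]
        print(cf, names, mb_to_read(FILES, cf), "MB")
```

In this made-up layout, a scan for `cf_rare` only touches the one small file, while a scan for `cf_common` has to go through all three.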

You could try running `accumulo rfile-info` on some of the RFiles in your table and looking for the column families in question.

z11373 wrote:
Hi Keith,
I left that scan command running, and it did return after a minute or so. I
think it's just slow somehow for those particular column families. I'll try
jstack-ing when I have chance later today.

Thanks,
Z



--
View this message in context: 
http://apache-accumulo.1065345.n5.nabble.com/scan-command-hung-tp15286p15302.html
Sent from the Developers mailing list archive at Nabble.com.
