Re: scan command hung

z11373 Mon, 05 Oct 2015 17:51:04 -0700

Thanks Billie/Josh! That did indeed fix the issue; the scan now returns
instantly!!


So when we scan the whole table and filter by column family, Accumulo
still has to go through all rows (ordered by key) and check whether each
entry has the specified column family. In my case, since the column
families are intermingled, the data I am looking for could be somewhere in
the middle or at the end of the rfile. Am I right?
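
To make my understanding concrete, here is a toy Python model of what I
think is happening. It is not Accumulo code: the sorted list standing in
for an rfile, the function name, and the row/cf values are all made up for
illustration.

```python
# Toy model only: a sorted list of (row, cf) keys stands in for an rfile.
# This is not Accumulo code; every name here is hypothetical.

def scan_with_cf_filter(entries, cf):
    """Walk every entry in key order and keep those whose family matches."""
    examined = 0
    hits = []
    for row, family in entries:
        examined += 1
        if family == cf:
            hits.append((row, family))
    return hits, examined

# 1000 rows with a "common" family; one row in the middle also has "rare".
entries = sorted(
    [(f"row{i:04d}", "common") for i in range(1000)] + [("row0500", "rare")]
)

# Families intermingled in one group: the whole list is walked.
hits, examined = scan_with_cf_filter(entries, "rare")

# Modeling a separate locality group: "rare" entries are stored apart,
# so only that group's entries are read.
rare_group = [e for e in entries if e[1] == "rare"]
group_hits, group_examined = scan_with_cf_filter(rare_group, "rare")
```

In this model the intermingled scan examines every entry to find the single
match, while the scan over the separate group examines only that group's
entries.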

I did another experiment: when I specified -b and -e, the scan also
returned instantly (this was before I moved the column families to a
different group and compacted). That makes sense, because Accumulo can
narrow the scan down to the specified range and then filter by column
family.
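
The same kind of toy Python model shows why I think the range helps. Again
this is only a sketch and not Accumulo code: the sorted list, the function
name, and the inclusive handling of the begin and end rows are assumptions
made for illustration.

```python
import bisect

# Toy model only: a sorted list of (row, cf) keys stands in for an rfile.
# This is not Accumulo code; every name here is hypothetical.

def scan_range_with_cf(entries, begin_row, end_row, cf):
    """Seek to begin_row, read through end_row (inclusive), filter by family."""
    # Binary search for the first key at or after begin_row.
    start = bisect.bisect_left(entries, (begin_row, ""))
    examined = 0
    hits = []
    for row, family in entries[start:]:
        if row > end_row:
            break  # past the end of the requested range
        examined += 1
        if family == cf:
            hits.append((row, family))
    return hits, examined

# 1000 rows with a "common" family; one row in the middle also has "rare".
entries = sorted(
    [(f"row{i:04d}", "common") for i in range(1000)] + [("row0500", "rare")]
)

hits, examined = scan_range_with_cf(entries, "row0500", "row0500", "rare")
```

Here only the entries inside the requested row range are examined, even
though the families are still intermingled in a single group.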

I have a follow-up question: does this mean I have to create a new
locality group for each column family, since I wouldn't know in advance
how big or small the data belonging to that cf will be?

Btw, we shard the customers by using their id as the column family, so
we'll add a new column family whenever a new customer is onboarded. I think
the case where we have to scan the table by cf without specifying a range
will be rare (or perhaps never happen, except when I run it from the
shell), but I am worried that this could become a perf bottleneck if I
don't put them in separate locality groups.

Another question: when running the setgroups command, it looks like I have
to list all of the column families, even if I have just added one new cf.
For example, let's say I did:
setgroups mygroup=cf1,cf2 -t mytable
compact -t mytable -w

Then later, if I need to add cf3 to the same group, I have to do "setgroups
mygroup=cf1,cf2,cf3 -t mytable" instead of just "setgroups mygroup=cf3 -t
mytable".

It'd be nice if I could do the latter :-) What happens to cf1 and cf2 if I
did the latter? Does it mean they go back to the default group after
compaction?


Thanks,
Z




--
View this message in context: 
http://apache-accumulo.1065345.n5.nabble.com/scan-command-hung-tp15286p15324.html
Sent from the Developers mailing list archive at Nabble.com.
