Yes, compact the table and then count the entries. You can get close by looking at the monitor page: the tables list has a column called Entries, which should be close to what you would get by adding up the entry counts in the metadata table by hand.
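
If it helps, here is a minimal Python sketch of that "by hand" tally. It is only an illustration, not an Accumulo API: it assumes you have already dumped the file entries from the metadata table into (path, value) pairs, where the value looks like "48,0" (file size, then entry count) as in the example quoted below. The function names and the second sample file are made up for the example.

```python
def parse_file_value(value):
    """Split a metadata file value such as '48,0' into (size, entries).

    Any fields after the first two are ignored.
    """
    fields = value.split(",")
    return int(fields[0]), int(fields[1])


def is_bulk_import_file(path):
    """Bulk-imported RFiles have a file name starting with a capital 'I'."""
    return path.rsplit("/", 1)[-1].startswith("I")


def summarize(file_entries):
    """Sum the entry counts and list bulk files that still report 0 entries."""
    total = 0
    uncounted = []
    for path, value in file_entries:
        _size, entries = parse_file_value(value)
        total += entries
        if is_bulk_import_file(path) and entries == 0:
            # Not compacted yet, so the metadata count is not meaningful.
            uncounted.append(path)
    return total, uncounted


# Sample input: the first pair is the entry from the original question,
# the second is a hypothetical file written by a compaction.
sample = [
    ("hdfs://apps/accumulo/tables/<tableId>/<folder>/I001ahdz.rf", "48,0"),
    ("hdfs://apps/accumulo/tables/<tableId>/<folder>/A0000abc.rf", "2048,150"),
]
total, uncounted = summarize(sample)
print("entries counted:", total)
print("bulk files not yet counted:", uncounted)
```

If the second list is non-empty, the total is an undercount, and the table (or at least those tablets) still needs a compaction before the sum can be trusted.
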
On Tue, Feb 13, 2018 at 2:19 PM Dong Zhou <dz...@phemi.com> wrote:

> I see. Yes, the file is loaded via bulk import.
> I would like to find out the most precise number of entries a table
> contains. Would running a compaction and then scanning the metadata table
> for the entry count be a sufficient method?
> Also, what would happen if a merge operation runs before the compaction?
> Would it try to merge this tablet into other tablets, since the file size
> and entry count look fairly small at the time it scans the metadata table?
> Or would it compact the table before running the merge?
>
> By the way, thanks for the quick reply. :)
>
> Cheers,
> -Dong
>
> On Tue, Feb 13, 2018 at 11:05 AM Michael Wall <mjw...@gmail.com> wrote:
>
>> Hi Dong,
>>
>> That file is the result of a bulk import. I can tell because it starts
>> with a capital "I", see
>> http://accumulo.apache.org/1.8/accumulo_user_manual.html#_file_naming_conventions.
>> Bulk files are inspected on import to find all the ranges of data they
>> contain. They are then assigned to all the tablets hosting that data. So
>> one "I" file can belong to more than one tablet. When that file is
>> included in a compaction, the data that is not part of the range the tablet
>> is hosting is not rewritten to the new files.
>>
>> When inspecting "I" files, Accumulo does not keep track of how many keys
>> are in each range. So for "I" files in the metadata table, the number of
>> keys is 0 until that file is compacted.
>>
>> HTH
>>
>> Mike
>>
>> On Tue, Feb 13, 2018 at 1:37 PM Dong Zhou <dz...@phemi.com> wrote:
>>
>>> Hi all,
>>>
>>> We have noticed that the Accumulo metadata entry for a certain RFile
>>> reports a file size but no entry count.
>>> For example, <tableId>;<tabletEndRow>
>>> file:hdfs://apps/accumulo/tables/<tableId>/<folder>/I001ahdz.rf [] 48,0
>>>
>>> From the metadata's perspective, it looks like the RFile contains zero
>>> entries, but if we run an RFILE-INFO command against the same file, the
>>> output shows that the RFile has a bunch of entries. If we dump the RFile,
>>> we can see that it spills out the actual data too.
>>>
>>> We wonder what the reason behind it is.
>>>
>>> Thanks,
>>> -Dong Zhou