Yes, compact the table and then count the entries.  You can also get close
by looking at the monitor page.  The tables list has a column called Entries,
which should be close to what you get by adding up those entries in the
metadata by hand.
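
Adding up the entries by hand can be sketched in a few lines of Python.
This is only a rough illustration, not an Accumulo tool: it assumes you have
saved the "file" column lines from a metadata scan as text, that each line
ends with a value of the form size,entries (possibly followed by further
comma-separated fields), and that rows contain no spaces.  The function name
and the sample lines are made up for the example.

```python
def summarize_file_entries(scan_lines):
    """Sum sizes and entry counts from metadata 'file' column lines.

    Assumes each line looks like:
        <row> file:<path> []    <size>,<entries>
    Returns (total_size, total_entries, uncounted_bulk_files).
    """
    total_size = 0
    total_entries = 0
    uncounted_bulk_files = []
    for line in scan_lines:
        parts = line.split()
        # Locate the token holding the file column, e.g. "file:hdfs://...".
        path = next((p for p in parts if p.startswith("file:")), None)
        if path is None:
            continue  # not a file entry, skip it
        # The value is the last token; only the first two fields are used.
        fields = parts[-1].split(",")
        size, entries = int(fields[0]), int(fields[1])
        total_size += size
        total_entries += entries
        # Bulk-imported files start with "I" and show 0 entries
        # until they have been compacted.
        name = path.rsplit("/", 1)[-1]
        if name.startswith("I") and entries == 0:
            uncounted_bulk_files.append(name)
    return total_size, total_entries, uncounted_bulk_files


# Hypothetical sample lines, shaped like the entry quoted in this thread.
sample = [
    "2;m file:hdfs://nn/accumulo/tables/2/t-0001/I001ahdz.rf []    48,0",
    "2;m file:hdfs://nn/accumulo/tables/2/t-0001/A0000abc.rf []    1200,35",
    "2< file:hdfs://nn/accumulo/tables/2/default_tablet/I001ahdz.rf []    48,0",
]
size, entries, uncounted = summarize_file_entries(sample)
print(size, entries, uncounted)
```

If the list of uncounted bulk files is not empty, the total is an
undercount, and the table should be compacted before trusting the number.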


On Tue, Feb 13, 2018 at 2:19 PM Dong Zhou <dz...@phemi.com> wrote:

> I see. Yes, the file is loaded via bulk import.
> I would like to find out the most precise number of entries a table
> contains. Would running a compaction and then scanning the metadata table
> for the entry counts be a sufficient method?
> Also, what would happen if a merge operation runs before the compaction?
> Would it try to merge this tablet into other tablets, since the file size
> and entry count look fairly small at the time it scans the metadata table?
> Or would it compact the table before running the merge?
>
> By the way, thanks for the quick reply. :)
>
> Cheers,
> -Dong
>
>
>
> On Tue, Feb 13, 2018 at 11:05 AM Michael Wall <mjw...@gmail.com> wrote:
>
>> Hi Dong,
>>
>> That file is the result of a bulk import.  I can tell because it starts
>> with a capital "I", see
>> http://accumulo.apache.org/1.8/accumulo_user_manual.html#_file_naming_conventions.
>> Bulk files are inspected on import to find all the ranges of data they
>> contain.  They are then assigned to all the tablets hosting that data.  So
>> one "I" file can belong to more than one tablet.  When that file is
>> included in a compaction, the data that is not part of the range the tablet
>> is hosting is not rewritten to the new files.
>>
>> When inspecting "I" files, Accumulo does not keep track of how many keys
>> are in each range.  So for "I" files in the metadata table, the number of
>> keys is 0 until that file is compacted.
>>
>> HTH
>>
>> Mike
>>
>>
>>
>> On Tue, Feb 13, 2018 at 1:37 PM Dong Zhou <dz...@phemi.com> wrote:
>>
>>> Hi all,
>>>
>>> We have noticed that the Accumulo metadata entry for certain RFiles
>>> reports a file size but no entry count.
>>> For example, <tableId>;<tabletEndRow>
>>> file:hdfs://apps/accumulo/tables/<tableId>/<folder>/I001ahdz.rf []   48,0
>>>
>>> From the metadata's perspective, it looks like the RFile contains zero
>>> entries, but if we run an RFILE-INFO command against the same file, the
>>> output shows that the RFile has a bunch of entries. If we dump the RFile,
>>> we can see that it spills out the actual data too.
>>>
>>> We wonder what the reason behind this is.
>>>
>>> Thanks,
>>> -Dong Zhou
>>>
>>
