Hi Dong,

That file is the result of a bulk import.  I can tell because its name
starts with a capital "I"; see
http://accumulo.apache.org/1.8/accumulo_user_manual.html#_file_naming_conventions.
Bulk files are inspected on import to find all the ranges of data they
contain.  They are then assigned to every tablet hosting some of that data,
so one "I" file can belong to more than one tablet.  When a tablet includes
that file in a compaction, only the data inside the range the tablet is
hosting is rewritten to the new file; the rest is skipped.
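
To make the assignment idea concrete, here is a rough toy model in Python.
It is not Accumulo's code: the function names are made up for illustration,
and it assumes tablets are described by a sorted list of inclusive end rows,
with one last tablet that takes everything after the final end row.

```python
from bisect import bisect_left


def tablets_for_file(file_rows, tablet_end_rows):
    """Return the indexes of the tablets that hold at least one row of the file.

    tablet_end_rows is a sorted list of inclusive end rows. Index
    len(tablet_end_rows) stands for the last tablet, which has no end row.
    """
    hits = set()
    for row in file_rows:
        # bisect_left gives the first tablet whose end row is >= row.
        hits.add(bisect_left(tablet_end_rows, row))
    return sorted(hits)


def rows_kept_by_compaction(file_rows, tablet_index, tablet_end_rows):
    """Rows of a shared file that one tablet would rewrite when it compacts."""
    return [
        row for row in file_rows
        if bisect_left(tablet_end_rows, row) == tablet_index
    ]


if __name__ == "__main__":
    end_rows = ["g", "p"]          # three tablets: (-inf, g], (g, p], (p, +inf)
    bulk_file = ["a", "h", "z"]    # rows found in one hypothetical "I" file
    print(tablets_for_file(bulk_file, end_rows))
    print(rows_kept_by_compaction(bulk_file, 1, end_rows))
```

In this model the single file is referenced by all three tablets, and the
middle tablet's compaction would carry over only the row "h".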

When inspecting "I" files, Accumulo does not keep track of how many keys
fall in each range.  So the key count recorded in the metadata table for an
"I" file (the 0 in your "48,0" value) stays at 0 until that file is
compacted.
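
If you want to spot these entries in a metadata scan, something like the
following Python sketch could help.  It assumes the file value has the
"size,entries" shape shown in your example and that a file name starting
with "I" means bulk import, as described above; the helper names are my own
and not part of any Accumulo API.

```python
def parse_file_value(value):
    """Split a metadata file value such as "48,0" into (size, entries)."""
    parts = value.strip().split(",")
    # Only the first two fields are used; any further fields are ignored.
    return int(parts[0]), int(parts[1])


def is_bulk_import_file(path):
    """True if the file name (last path component) starts with a capital I."""
    name = path.rsplit("/", 1)[-1]
    return name.startswith("I")


def entry_count_is_trustworthy(path, value):
    """False for a bulk-imported file whose recorded entry count is still 0."""
    _size, entries = parse_file_value(value)
    return not (is_bulk_import_file(path) and entries == 0)


if __name__ == "__main__":
    path = "hdfs://apps/accumulo/tables/<tableId>/<folder>/I001ahdz.rf"
    print(parse_file_value("48,0"))
    print(entry_count_is_trustworthy(path, "48,0"))
```

For the entry you pasted, this reports the count as not trustworthy, which
matches what rfile-info showed you.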

HTH

Mike



On Tue, Feb 13, 2018 at 1:37 PM Dong Zhou <dz...@phemi.com> wrote:

> Hi all,
>
> We have noticed that the Accumulo metadata entries for certain RFiles
> report a file size but no entry count.
> For example, <tableId>;<tabletEndRow>
> file:hdfs://apps/accumulo/tables/<tableId>/<folder>/I001ahdz.rf []   48,0
>
> From the metadata table's perspective, it looks like the RFile contains
> zero entries, but if we run an RFILE-INFO command against the same file,
> the output shows that the RFile has a bunch of entries. If we dump the
> RFile, we can see that it contains the actual data too.
>
> We wonder what the reason behind this is.
>
> Thanks,
> -Dong Zhou
>
