Metadata DataFileValue not Matching the Output of rfile-info Command

2018-02-13 Thread Dong Zhou
Hi all, We have noticed that the Accumulo metadata entry for certain RFiles reports a file size but no entry count. For example, ; file:hdfs://apps/accumulo/tables///I001ahdz.rf [] 48,0 From the metadata's perspective, it looks like the RFile contains zero entries, but if we run an RFILE-INFO c

Re: Metadata DataFileValue not Matching the Output of rfile-info Command

2018-02-13 Thread Michael Wall
Hi Dong, That file is the result of a bulk import. I can tell because it starts with a capital "I", see http://accumulo.apache.org/1.8/accumulo_user_manual.html#_file_naming_conventions. Bulk files are inspected on import to find all the ranges of data they contain. They are then assigned to all

Re: Metadata DataFileValue not Matching the Output of rfile-info Command

2018-02-13 Thread Dong Zhou
I see. Yes, the file was loaded via bulk import. I would like to find the most precise number of entries a table contains. Would running a compaction and then scanning the metadata table for the entry counts be a sufficient method? Also, what would happen if a merge operation runs before the compaction

Re: Metadata DataFileValue not Matching the Output of rfile-info Command

2018-02-13 Thread Michael Wall
Yes, compact the table and then count the entries. You can get close by looking at the monitor page. The tables list has a column called Entries, which should be close to counting up those entries in the metadata by hand.
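
As a rough illustration of "counting up those entries in the metadata by hand", here is a minimal Python sketch. It assumes the metadata scan output has been saved as text lines shaped like the example in the first message (a `file:` path, then `[]`, then a value of the form `size,entries`). The function names are hypothetical and not part of Accumulo. Files whose names begin with a capital "I" are flagged, since, as noted in the thread, bulk-imported files can show a size with zero entries until the table is compacted.

```python
from typing import Iterable, List, Tuple


def parse_file_entry(line: str) -> Tuple[str, int, int]:
    """Split one scan line into (file path, size in bytes, entry count).

    Assumed layout (taken from the example in the thread):
        <row> file:<path> [] <size>,<entries>
    """
    key_part, value = line.rsplit(None, 1)           # value is the last token
    path = key_part.split("file:", 1)[1].split()[0]  # text after "file:" up to whitespace
    fields = value.split(",")                        # ignore any extra trailing fields
    return path, int(fields[0]), int(fields[1])


def is_bulk_import(path: str) -> bool:
    """True if the file name starts with a capital 'I' (bulk import)."""
    return path.rsplit("/", 1)[-1].startswith("I")


def summarize(lines: Iterable[str]) -> Tuple[int, List[str]]:
    """Return (sum of entry counts, list of bulk-imported file paths)."""
    total = 0
    bulk_files = []
    for line in lines:
        if "file:" not in line:
            continue  # skip lines from other column families
        path, _size, entries = parse_file_entry(line)
        total += entries
        if is_bulk_import(path):
            bulk_files.append(path)
    return total, bulk_files


if __name__ == "__main__":
    import sys

    total, bulk_files = summarize(sys.stdin)
    print("entries recorded in metadata:", total)
    if bulk_files:
        print("bulk-imported files (counts may be unreliable until compaction):")
        for path in bulk_files:
            print("  " + path)
```

If the second list is non-empty, the total should be treated as an estimate; after a full compaction of the table no "I" files should remain and the sum should better reflect the real entry count.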