Are the Hadoop nodes handling your MapReduce job also running tservers? Do the Accumulo log files show the exception? If so, can you post it?
On Wed, Dec 23, 2015 at 9:12 AM, Jeff Kubina <jeff.kub...@gmail.com> wrote:
> I have a MapReduce job that reads RFiles as Accumulo key/value pairs using
> FileSKVIterator within a RecordReader, partitions/shuffles them based on
> the byte string of the key, and writes them out as new RFiles using
> AccumuloFileOutputFormat. The objective is to create larger RFiles for bulk
> ingesting and to minimize the number of tservers each RFile is assigned to
> after it is bulk ingested.
>
> For tables with a simple schema it works fine, but for tables with a complex
> schema the new RFiles cause the tservers to throw a NullPointerException
> during a compaction.
>
> Is there more to an RFile than just the key/value pairs that I am missing?
>
> If I compute an order-independent checksum of the bytes of the key/value
> pairs in the original RFiles and the new RFiles, shouldn't they be the same?
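
For anyone following along, here is a minimal sketch of the write side of the kind of job described above. It is not the poster's actual code: the input format, RecordReader, and partitioner (RFileInputFormat, KeyBytesPartitioner) are hypothetical placeholders for the FileSKVIterator-based pieces mentioned in the email; only the reducer and the AccumuloFileOutputFormat wiring are shown concretely.

import java.io.IOException;

import org.apache.accumulo.core.client.mapreduce.AccumuloFileOutputFormat;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;

public class RewriteRFilesJob {

  // Identity reduce: Key/Value pairs arrive grouped and sorted by Key and
  // are written straight into the new, larger RFiles.
  public static class WriteReducer extends Reducer<Key, Value, Key, Value> {
    @Override
    protected void reduce(Key key, Iterable<Value> values, Context ctx)
        throws IOException, InterruptedException {
      for (Value v : values) {
        ctx.write(key, v);
      }
    }
  }

  public static Job createJob(Configuration conf, Path output) throws IOException {
    Job job = Job.getInstance(conf, "rewrite rfiles");
    job.setJarByClass(RewriteRFilesJob.class);

    // Input side: a custom InputFormat whose RecordReader uses FileSKVIterator
    // to stream Key/Value pairs out of the existing RFiles (hypothetical class).
    // job.setInputFormatClass(RFileInputFormat.class);

    // Shuffle on the serialized bytes of the Key (hypothetical partitioner).
    // job.setPartitionerClass(KeyBytesPartitioner.class);
    job.setMapOutputKeyClass(Key.class);
    job.setMapOutputValueClass(Value.class);

    job.setReducerClass(WriteReducer.class);
    job.setOutputKeyClass(Key.class);
    job.setOutputValueClass(Value.class);

    // Output side: AccumuloFileOutputFormat writes RFiles suitable for bulk import.
    job.setOutputFormatClass(AccumuloFileOutputFormat.class);
    AccumuloFileOutputFormat.setOutputPath(job, output);
    return job;
  }
}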