Isolation should only give you consistency within a row, to ensure you're
not scanning over partial changes from a mutation that is currently being
written to a row. It shouldn't have anything to do with compactions, or with
missing data that was already written before the MapReduce scan started.

Splits shouldn't cause you to miss data either. It's been a while since I
looked, but I believe the MapReduce APIs simply break up a table into
separate ranges to scan based on current tablet boundaries. If there are
splits, then all that means is that some of the ranges will span across
more than one tablet, but that's fine... a scan is a scan... scans don't
need to be limited to a single tablet.
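
To make that concrete, here is a toy model in Python. It is not Accumulo code:
the function names and the (start, end] range convention are assumptions made
for illustration. It plans scan ranges from the split points that exist up
front, then splits a tablet, and the planned ranges still return every row.

```python
def ranges_from_boundaries(split_points):
    """Build (start, end] ranges from split points; None means unbounded."""
    edges = [None] + sorted(split_points) + [None]
    return list(zip(edges[:-1], edges[1:]))


def scan(rows, start, end):
    """Return every row with start < row <= end, whichever tablet holds it."""
    return [
        r for r in rows
        if (start is None or r > start) and (end is None or r <= end)
    ]


rows = sorted(["a", "c", "f", "k", "p", "t", "x"])

# Ranges are computed once, from the split points at planning time.
planned = ranges_from_boundaries(["h", "q"])

# A split at "d" happens afterwards: the tablet layout changes,
# but the planned ranges do not.
tablets_after_split = ranges_from_boundaries(["d", "h", "q"])

# Scan each planned range; the first one now spans two tablets.
seen = [r for start, end in planned for r in scan(rows, start, end)]
```

The point of the sketch is that a range is defined by row keys, not by
tablets, so adding a boundary inside a range does not remove any rows from it.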

Compactions could cause missed data if they transform the data in some way,
but otherwise, I wouldn't expect them to.
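
As a rough illustration of that distinction, here is another toy Python
sketch, again not Accumulo code. The `compact` function and the timestamp
cutoff are hypothetical: a plain compaction only merges entries, while one
with a filtering rule attached can legitimately drop some of them.

```python
def compact(entries, keep=None):
    """Merge entries into one sorted list; an optional rule may drop entries."""
    merged = sorted(entries)
    if keep is None:
        return merged
    return [e for e in merged if keep(e)]


# Each entry is (row, timestamp).
entries = [("row2", 99), ("row1", 10), ("row3", 15)]

# Plain compaction: same entries, just merged and sorted.
plain = compact(entries)

# Hypothetical age-off style rule: keep only timestamps >= 12.
aged = compact(entries, keep=lambda e: e[1] >= 12)
```

In the second case `row1` is gone after the compaction, so a scan that runs
later would not see it, which is expected behavior for such a rule rather than
a scan problem.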

Are you seeing any error messages anywhere?

On Mon, Apr 18, 2022, 15:23 Vincent Russell <vincent.russ...@gmail.com>
wrote:

> Hi Dave,
>
> Yes we are using the new MapReduce API, but we are not setting any
> settings for isolated scan so we are using whatever the default is.
>
> Thanks,
> Vincent
>
> On Mon, Apr 18, 2022 at 3:12 PM Dave Marion <dmario...@gmail.com> wrote:
>
> > Major compactions should not move rows to new tablets, but a tablet split
> > could. Are you using the new MapReduce API introduced in 2.0? Are you
> > setting it to use an isolated scan?
> >
> > On Mon, Apr 18, 2022 at 3:01 PM Vincent Russell <vincent.russ...@gmail.com>
> > wrote:
> >
> > > Hello All,
> > >
> > > Could major compactions that occur while a map reduce job is running
> > > cause the map reduce job to miss records because rows have been moved to a
> > > different tablet?
> > >
> > > How does this work?
> > >
> > > I'm using accumulo 2.0.1
> > >
> > > Thank you,
> > > Vincent
> > >
> >
>
