Hi,

There appears to be a bug where two rows merge into one as a result of
separate calls to the Iface.mutate method using RowMutationType.UPDATE_ROW
and RecordMutationType.REPLACE_ENTIRE_RECORD. (I can also reproduce the
problem using REPLACE_ROW with REPLACE_ENTIRE_RECORD.)

For example, suppose the index has 2 rows, each with 1 record that stores a
copy of the rowId in cf.key:
  row A:   cf.key=A
  row B:   cf.key=B

After calling Iface.mutate on row A with exactly the same data, sometimes
the result is:
  row A:  cf.key=A
  row B:  cf.key=B  cf.key=A

instead of the expected no-op.  The corruption is visible with "blur get",
with "blur query cf.key:B", and via Iface.fetchRow from Java.

For the above, the recordId is always "0" and the rowId is a UUID generated
from Java's UUID.randomUUID() (although my test also reuses the same UUIDs).

I'm not setting a schema at all in my test program, so the defaults apply
for analyzers, fieldless=true, etc.
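
The mutate itself looks roughly like this (again just a sketch from memory,
with the same placeholder connection string and table name):

import java.util.Arrays;
import java.util.UUID;

import org.apache.blur.thrift.BlurClient;
import org.apache.blur.thrift.generated.Blur.Iface;
import org.apache.blur.thrift.generated.Column;
import org.apache.blur.thrift.generated.Record;
import org.apache.blur.thrift.generated.RecordMutation;
import org.apache.blur.thrift.generated.RecordMutationType;
import org.apache.blur.thrift.generated.RowMutation;
import org.apache.blur.thrift.generated.RowMutationType;

public class MutateRow {
  public static void main(String[] args) throws Exception {
    Iface client = BlurClient.getClient("controller1:40010"); // placeholder

    String rowId = UUID.randomUUID().toString();

    // Single record, recordId "0", with a copy of the rowId in cf.key.
    Record record = new Record();
    record.setRecordId("0");
    record.setFamily("cf");
    record.setColumns(Arrays.asList(new Column("key", rowId)));

    RecordMutation recordMutation = new RecordMutation();
    recordMutation.setRecordMutationType(RecordMutationType.REPLACE_ENTIRE_RECORD);
    recordMutation.setRecord(record);

    RowMutation rowMutation = new RowMutation();
    rowMutation.setTable("test_table"); // placeholder
    rowMutation.setRowId(rowId);
    rowMutation.setRowMutationType(RowMutationType.UPDATE_ROW);
    rowMutation.setRecordMutations(Arrays.asList(recordMutation));

    // Sending the same mutation again with identical data should be a no-op,
    // but sometimes the record ends up attached to a different row.
    client.mutate(rowMutation);
    client.mutate(rowMutation);
  }
}

In my actual test I first load both rows this way and then re-send the row A
mutation with identical data.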

I do notice the following show up in the shard server log:  INFO ...
[thrift-processors1] search.PrimeDocCache: PrimeDoc for reader
[_k(4.3):C19/4] not stored, because count [13] and freq [16] do not match.

Restarting Blur doesn't seem to help.

Blur version is 0.2.4; the Hadoop stack is CDH 5.1.0.

The cluster configuration is 1 shard server, 1 controller, and 1 namenode,
all on the same machine (Red Hat 6.3 Santiago).

I have a fairly small test case that, when run repeatedly, sometimes fails
and sometimes doesn't.  I run it after using the blur shell to remove the
old table and create a new one with 1 shard.

Although it isn't 100% reproducible, it fails pretty often for me.  Since I
typed the code in on a different network, I don't have it to share with you
yet.

Have you seen this kind of issue before?

Any suggestions for how to track it down?

Are there any commands you want me to run on the resulting table that might
yield some clues?

Thanks,
-- Tom
