Hi Petter
Don't have answers yet, but I do have some more questions
(inline)
Petter von Dolwitz (Hem) writes:
Hi David,
In short to summarize:
1. I ingest data. Kudu's maintenance threads stop working (soft memory limit) and incoming data is throttled. There are no errors reported on the client side.
What is the "client side"? impala? spark? java/c++.
2. I stop ingestion and wait until I *think* Kudu is finished.
The question above is pertinent. Impala will not return until a query is fully successful, though it may return an error and leave a query only half-way executed. If you're using the client APIs directly, though, are you checking for errors when inserting?
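For reference, with the Java client row-level failures don't necessarily surface as exceptions when flushing in the background; you have to pull them from the session. A rough, untested sketch (master address, table and column names are made up here):

import org.apache.kudu.client.*;

public class InsertWithErrorChecks {
    public static void main(String[] args) throws KuduException {
        // Placeholder master address and table/column names.
        try (KuduClient client =
                 new KuduClient.KuduClientBuilder("kudu-master:7051").build()) {
            KuduTable table = client.openTable("my_table");
            KuduSession session = client.newSession();
            // With background flushes, apply() does not surface row errors directly;
            // they must be collected from the session afterwards.
            session.setFlushMode(SessionConfiguration.FlushMode.AUTO_FLUSH_BACKGROUND);

            Insert insert = table.newInsert();
            PartialRow row = insert.getRow();
            row.addLong("id", 1L);          // hypothetical key column
            row.addString("payload", "x");  // hypothetical value column
            session.apply(insert);

            session.flush();
            // Row-level failures (e.g. writes rejected under memory pressure)
            // show up here, not as exceptions from apply().
            if (session.countPendingErrors() > 0) {
                RowErrorsAndOverflowStatus errs = session.getPendingErrors();
                for (RowError e : errs.getRowErrors()) {
                    System.err.println("Failed row: " + e);
                }
            }
            session.close();
        }
    }
}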
3. I restart Kudu.
4. I validate the inserted data by doing count(*) on groups of data in Kudu. For several groups, Kudu reports a lot of rows missing.
Kudu's default scan mode is READ_LATEST. While this is the most performance-oriented mode, it's also the one with the least guarantees, so on startup it's possible that it reads from a stale replica, giving the _appearance_ that rows went missing. Things to try here:
- Try the same query a few minutes later. Is the answer different?
- If the above is true, consider changing your scan mode to READ_AT_SNAPSHOT (see the sketch after this list). In this mode data is guaranteed not to be stale, though you might have to wait for all replicas to be ready.
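With the Java client that would look roughly like the following (an untested sketch; the master address and table name are placeholders):

import org.apache.kudu.client.*;

public class SnapshotCount {
    public static void main(String[] args) throws KuduException {
        // Placeholder master address and table name.
        try (KuduClient client =
                 new KuduClient.KuduClientBuilder("kudu-master:7051").build()) {
            KuduTable table = client.openTable("my_table");
            // READ_AT_SNAPSHOT waits until the chosen replica has applied all
            // operations up to the snapshot timestamp, so a lagging or freshly
            // restarted replica cannot silently return fewer rows.
            KuduScanner scanner = client.newScannerBuilder(table)
                .readMode(AsyncKuduScanner.ReadMode.READ_AT_SNAPSHOT)
                .build();
            long count = 0;
            while (scanner.hasMoreRows()) {
                RowResultIterator batch = scanner.nextRows();
                count += batch.getNumRows();
            }
            scanner.close();
            System.out.println("rows = " + count);
        }
    }
}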
5. I ingest the same data again. Client reports that all rows are already present.
This isn't surprising _if_ the problem is indeed from stale replicas.
6. Doing the count(*) exercise again now gives me the correct number of rows.
This tells me that the data was ingested into Kudu on the first attempt but a scan did not find the data. Inserting the data again made it visible.
Can it be that by the second scan enough time had simply elapsed for all replicas to catch up? I'd say this is likely the case.
Br,
Petter
2017-12-07 21:39 GMT+01:00 David Alves <[email protected]>:
Hi Petter
I'd like to clarify what exactly happened and exactly what you are referring to as "inconsistency".
From what I understand of the first error you observed, Kudu was underprovisioned, memory-wise, and the ingest jobs/queries failed. Is that right? Since Kudu doesn't have atomic multi-row writes, it's currently expected in this case that you'll end up with partially written data.
If you tried the same job again and it succeeded, then for certain types of operation (UPSERT, INSERT IGNORE) the remaining rows would be written and all the data would be there as expected.
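For example, an idempotent retry with the Java client could look roughly like this (untested sketch; master address, table and column names are made up, and Impala's UPSERT statement achieves the same from SQL):

import org.apache.kudu.client.*;

public class RetryWithUpsert {
    public static void main(String[] args) throws KuduException {
        // Placeholder master address and table/column names.
        try (KuduClient client =
                 new KuduClient.KuduClientBuilder("kudu-master:7051").build()) {
            KuduTable table = client.openTable("my_table");
            KuduSession session = client.newSession();

            // UPSERT inserts the row if the key is new and overwrites it otherwise,
            // so rerunning the whole job after a partial failure fills in only the
            // rows that are actually missing, without "row already present" errors.
            Upsert upsert = table.newUpsert();
            PartialRow row = upsert.getRow();
            row.addLong("id", 1L);
            row.addString("payload", "x");

            // Default flush mode is AUTO_FLUSH_SYNC, so apply() returns the
            // per-operation response directly.
            OperationResponse resp = session.apply(upsert);
            if (resp.hasRowError()) {
                System.err.println("Upsert failed: " + resp.getRowError());
            }
            session.close();
        }
    }
}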
I'd like to distinguish this lack of atomicity on multi-row transactions from "inconsistency", which is what you might observe if an operation didn't fail but you couldn't see all the data. For this latter case there are options you can choose to avoid any inconsistency.
Best
David
On Wed, Dec 6, 2017 at 4:26 AM, Petter von Dolwitz (Hem) <
[email protected]> wrote:
Thanks for your reply Andrew!
>How did you verify that all the data was inserted and how did you find some data missing?
This was done using Impala. We counted the rows for groups representing the chunks we inserted.
>Following up on what I posted, take a look at https://kudu.apache.org/docs/transaction_semantics.html#_read_operations_scans. It seems definitely possible that not all of the rows had finished inserting when counting, or that the scans were sent to a stale replica.
Before we shut down we could only see the following in the logs, i.e. no sign that ingestion was still ongoing.
kudu-tserver.ip-xx-yyy-z-nnn.root.log.INFO.20171201-065232.90314:I1201 07:27:35.010694 90793 maintenance_manager.cc:383] P a38902afefca4a85a5469d149df9b4cb: we have exceeded our soft memory limit (current capacity is 67.52%). However, there are no ops currently runnable which would free memory.
Also, the (Cloudera) metric total_kudu_rows_inserted_rate_across_kudu_replicas showed zero.
Still, it seems like some data became inconsistent after restart. But if the maintenance_manager performs important jobs that are required to ensure that all data is inserted, then I can understand why we ended up with inconsistent data. But, if I understand you correctly, you are saying that these jobs are not critical for ingestion. In the link you provided I read "Impala scans are currently performed as READ_LATEST and have no consistency guarantees.". I would assume this means that it does not guarantee consistency if new data is inserted, but should give valid (and the same) results if no new data is inserted?
I have not tried the ksck tool yet. Thank you for the reminder, I will have a look.
Br,
Petter
2017-12-06 1:31 GMT+01:00 Andrew Wong <[email protected]>:
How did you verify that all the data was inserted and how did you find some data missing? I'm wondering if it's possible that the initial "missing" data was data that Kudu was still in the process of inserting (albeit slowly, due to memory backpressure or some such).
Following up on what I posted, take a look at https://kudu.apache.org/docs/transaction_semantics.html#_read_operations_scans. It seems definitely possible that not all of the rows had finished inserting when counting, or that the scans were sent to a stale replica.
On Tue, Dec 5, 2017 at 4:18 PM, Andrew Wong
<[email protected]> wrote:
Hi Petter,
When we verified that all data was inserted we found that some data was missing. We added this missing data and on some chunks we got the information that all rows were already present, i.e. Impala says something like Modified: 0 rows, nnnnnnn errors. Doing the verification again now shows that the Kudu table is complete. So, even though we did not insert any data on some chunks, a count(*) operation over these chunks now returns a different value.
How did you verify that all the data was inserted and how did you find some data missing? I'm wondering if it's possible that the initial "missing" data was data that Kudu was still in the process of inserting (albeit slowly, due to memory backpressure or some such).
Now to my question. Will data be inconsistent if we recycle Kudu after seeing soft memory limit warnings?
Your data should be consistently written, even with those warnings. AFAIK they would cause a bit of slowness, not incorrect results.
Is there a way to tell when it is safe to restart Kudu to avoid these issues? Should we use any special procedure when restarting (e.g. only restart the tablet servers, only restart one tablet server at a time, or something like that)?
In general, you can use the `ksck` tool to check the health of your cluster. See https://kudu.apache.org/docs/command_line_tools_reference.html#cluster-ksck for more details. For restarting a cluster, I would recommend taking down all tablet servers at once, otherwise tablet replicas may try to replicate data from the server that was taken down.
Hope this helped,
Andrew
On Tue, Dec 5, 2017 at 10:42 AM, Petter von Dolwitz (Hem) <
[email protected]> wrote:
Hi Kudu users,
We just started to use Kudu (1.4.0+cdh5.12.1). To make a baseline for evaluation we ingested 3 months' worth of data. During ingestion we were facing messages from the maintenance threads that a soft memory limit was reached. It seems like the background maintenance threads stopped performing their tasks at this point in time. It also seems like the memory was never recovered even after stopping ingestion, so I guess there was a large backlog being built up. I guess the root cause here is that we were a bit too conservative when giving Kudu memory. After a restart a lot of maintenance tasks were started (i.e. compaction).
When we verified that all data was inserted we found that some data was missing. We added this missing data and on some chunks we got the information that all rows were already present, i.e. Impala says something like Modified: 0 rows, nnnnnnn errors. Doing the verification again now shows that the Kudu table is complete. So, even though we did not insert any data on some chunks, a count(*) operation over these chunks now returns a different value.
Now to my question. Will data be inconsistent if we recycle Kudu after seeing soft memory limit warnings?
Is there a way to tell when it is safe to restart Kudu to avoid these issues? Should we use any special procedure when restarting (e.g. only restart the tablet servers, only restart one tablet server at a time, or something like that)?
The table design uses 50 tablets per day (times 90 days). It is 8 TB of data after 3x replication over 5 tablet servers.
Thanks,
Petter
--
Andrew Wong
--
Andrew Wong
--
David Alves