Hey Petter,

Did you ever get to the bottom of this? We definitely don't expect Kudu to lose data on a restart (and we have hundreds of tests running continuously that try to ensure this).
-Todd

On Fri, Dec 8, 2017 at 10:13 PM, David Alves <davidral...@gmail.com> wrote:

> Hi Petter
>
> Don't have answers yet, but I do have some more questions (inline).
>
> Petter von Dolwitz (Hem) writes:
>
>> Hi David,
>>
>> In short to summarize:
>>
>> 1. I ingest data. Kudu's maintenance threads stop working (soft memory
>> limit) and incoming data is throttled. There are no errors reported on the
>> client side.
>
> What is the "client side"? Impala? Spark? Java/C++?
>
>> 2. I stop ingestion and wait until I *think* Kudu is finished.
>
> The question above is pertinent. Impala will not return until a query
> is fully successful, though it may return an error and leave a query
> only half-way executed. If you're using the client APIs directly,
> though, are you checking for errors when inserting?
>
>> 3. I restart Kudu.
>> 4. I validate the inserted data by doing count(*) on groups of data in
>> Kudu. For several groups, Kudu reports a lot of rows missing.
>
> Kudu's default scan mode is READ_LATEST. While this is the most
> performance-oriented mode, it's also the one with the fewest guarantees,
> so on startup it's possible that it reads from a stale replica, giving
> the _appearance_ that rows went missing. Things to try here:
> - Try the same query a few minutes later. Is the answer different?
> - If the above is true, consider changing your scan mode to
>   READ_AT_SNAPSHOT. In this mode data is guaranteed not to be stale,
>   though you might have to wait for all replicas to be ready.
>
>> 5. I ingest the same data again. Client reports that all rows are already
>> present.
>
> This isn't surprising _if_ the problem is indeed from stale replicas.
>
>> 6. Doing the count(*) exercise again now gives me the correct number of
>> rows.
>>
>> This tells me that the data was ingested into Kudu on the first attempt
>> but a scan did not find the data. Inserting the data again made it
>> visible.
>
> Can it be that after the scan it's just that enough time has elapsed
> so that all replicas are caught up? I'd say this is likely the case.
>
>> Br,
>> Petter
>>
>> 2017-12-07 21:39 GMT+01:00 David Alves <davidral...@gmail.com>:
>>
>>> Hi Petter
>>>
>>> I'd like to clarify what exactly happened and exactly what you are
>>> referring to as "inconsistency".
>>> From what I understand of the first error you observed, Kudu was
>>> underprovisioned, memory-wise, and the ingest jobs/queries failed. Is
>>> that right? Since Kudu doesn't have atomic multi-row writes, it's
>>> currently expected in this case that you'll end up with partially
>>> written data.
>>> If you tried the same job again, and it succeeded, for certain types of
>>> operation (UPSERT, INSERT IGNORE) then the remaining rows would be
>>> written and all the data would be there as expected.
>>> I'd like to distinguish this lack of atomicity on multi-row
>>> transactions from "inconsistency", which is what you might observe if an
>>> operation didn't fail, but you couldn't see all the data. For this latter
>>> case there are options you can choose to avoid any inconsistency.
>>>
>>> Best
>>> David
>>>
>>> On Wed, Dec 6, 2017 at 4:26 AM, Petter von Dolwitz (Hem) <
>>> petter.von.dolw...@gmail.com> wrote:
>>>
>>>> Thanks for your reply Andrew!
>>>>
>>>> >How did you verify that all the data was inserted and how did you find
>>>> >some data missing?
>>>> This was done using Impala. We counted the rows for groups representing
>>>> the chunks we inserted.
>>>>
>>>> >Following up on what I posted, take a look at
>>>> >https://kudu.apache.org/docs/transaction_semantics.html#_read_operations_scans.
>>>> >It seems definitely possible that not all of the rows had finished
>>>> >inserting when counting, or that the scans were sent to a stale replica.
>>>> Before we shut down we could only see the following in the logs, i.e.,
>>>> no sign that ingestion was still ongoing.
>>>>
>>>> kudu-tserver.ip-xx-yyy-z-nnn.root.log.INFO.20171201-065232.90314:I1201
>>>> 07:27:35.010694 90793 maintenance_manager.cc:383] P
>>>> a38902afefca4a85a5469d149df9b4cb: we have exceeded our soft memory
>>>> limit (current capacity is 67.52%). However, there are no ops currently
>>>> runnable which would free memory.
>>>>
>>>> Also the (Cloudera) metric
>>>> total_kudu_rows_inserted_rate_across_kudu_replicas showed zero.
>>>>
>>>> Still it seems like some data became inconsistent after restart. But if
>>>> the maintenance_manager performs important jobs that are required to
>>>> ensure that all data is inserted then I can understand why we ended up
>>>> with inconsistent data. But, if I understand you correctly, you are
>>>> saying that these jobs are not critical for ingestion. In the link you
>>>> provided I read "Impala scans are currently performed as READ_LATEST and
>>>> have no consistency guarantees.". I would assume this means that it does
>>>> not guarantee consistency if new data is inserted but should give valid
>>>> (and same) results if no new data is inserted?
>>>>
>>>> I have not tried the ksck tool yet. Thank you for the reminder. I will
>>>> have a look.
>>>>
>>>> Br,
>>>> Petter
>>>>
>>>> 2017-12-06 1:31 GMT+01:00 Andrew Wong <aw...@cloudera.com>:
>>>>
>>>>>> How did you verify that all the data was inserted and how did you find
>>>>>> some data missing? I'm wondering if it's possible that the initial
>>>>>> "missing" data was data that Kudu was still in the process of
>>>>>> inserting (albeit slowly, due to memory backpressure or somesuch).
>>>>>
>>>>> Following up on what I posted, take a look at
>>>>> https://kudu.apache.org/docs/transaction_semantics.html#_read_operations_scans.
>>>>> It seems definitely possible that not all of the rows had finished
>>>>> inserting when counting, or that the scans were sent to a stale replica.
>>>>>
>>>>> On Tue, Dec 5, 2017 at 4:18 PM, Andrew Wong <aw...@cloudera.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Petter,
>>>>>>
>>>>>>> When we verified that all data was inserted we found that some data
>>>>>>> was missing. We added this missing data and on some chunks we got the
>>>>>>> information that all rows were already present, i.e. Impala says
>>>>>>> something like Modified: 0 rows, nnnnnnn errors. Doing the
>>>>>>> verification again now shows that the Kudu table is complete. So,
>>>>>>> even though we did not insert any data on some chunks, a count(*)
>>>>>>> operation over these chunks now returns a different value.
>>>>>>
>>>>>> How did you verify that all the data was inserted and how did you find
>>>>>> some data missing? I'm wondering if it's possible that the initial
>>>>>> "missing" data was data that Kudu was still in the process of
>>>>>> inserting (albeit slowly, due to memory backpressure or somesuch).
>>>>>>
>>>>>>> Now to my question. Will data be inconsistent if we recycle Kudu after
>>>>>>> seeing soft memory limit warnings?
>>>>>>
>>>>>> Your data should be consistently written, even with those warnings.
>>>>>> AFAIK they would cause a bit of slowness, not incorrect results.
>>>>>>
>>>>>>> Is there a way to tell when it is safe to restart Kudu to avoid these
>>>>>>> issues? Should we use any special procedure when restarting (e.g.
>>>>>>> only restart the tablet servers, only restart one tablet server at a
>>>>>>> time or something like that)?
>>>>>>
>>>>>> In general, you can use the `ksck` tool to check the health of your
>>>>>> cluster. See
>>>>>> https://kudu.apache.org/docs/command_line_tools_reference.html#cluster-ksck
>>>>>> for more details.
>>>>>> For restarting a cluster, I would recommend taking down all tablet
>>>>>> servers at once, otherwise tablet replicas may try to replicate data
>>>>>> from the server that was taken down.
>>>>>>
>>>>>> Hope this helped,
>>>>>> Andrew
>>>>>>
>>>>>> On Tue, Dec 5, 2017 at 10:42 AM, Petter von Dolwitz (Hem) <
>>>>>> petter.von.dolw...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Kudu users,
>>>>>>>
>>>>>>> We just started to use Kudu (1.4.0+cdh5.12.1). To make a baseline for
>>>>>>> evaluation we ingested 3 months' worth of data. During ingestion we
>>>>>>> were facing messages from the maintenance threads that a soft memory
>>>>>>> limit was reached. It seems like the background maintenance threads
>>>>>>> stopped performing their tasks at this point in time. It also seems
>>>>>>> like the memory was never recovered even after stopping ingestion so
>>>>>>> I guess there was a large backlog being built up. I guess the root
>>>>>>> cause here is that we were a bit too conservative when giving Kudu
>>>>>>> memory. After a restart a lot of maintenance tasks were started
>>>>>>> (i.e. compaction).
>>>>>>>
>>>>>>> When we verified that all data was inserted we found that some data
>>>>>>> was missing. We added this missing data and on some chunks we got the
>>>>>>> information that all rows were already present, i.e. Impala says
>>>>>>> something like Modified: 0 rows, nnnnnnn errors. Doing the
>>>>>>> verification again now shows that the Kudu table is complete. So,
>>>>>>> even though we did not insert any data on some chunks, a count(*)
>>>>>>> operation over these chunks now returns a different value.
>>>>>>>
>>>>>>> Now to my question. Will data be inconsistent if we recycle Kudu
>>>>>>> after seeing soft memory limit warnings?
>>>>>>>
>>>>>>> Is there a way to tell when it is safe to restart Kudu to avoid these
>>>>>>> issues?
>>>>>>> Should we use any special procedure when restarting (e.g.
>>>>>>> only restart the tablet servers, only restart one tablet server at a
>>>>>>> time or something like that)?
>>>>>>>
>>>>>>> The table design uses 50 tablets per day (times 90 days). It is 8 TB
>>>>>>> of data after 3x replication over 5 tablet servers.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Petter
>>>>>>
>>>>>> --
>>>>>> Andrew Wong
>>>>>
>>>>> --
>>>>> Andrew Wong
>
> --
> David Alves

--
Todd Lipcon
Software Engineer, Cloudera
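
The stale-replica explanation David gives in this thread can be illustrated with a small toy model. This is only a sketch and is not the Kudu client API: `Replica`, `scan_latest`, `scan_at_snapshot`, and `insert` are hypothetical names invented for the illustration, and the row counts (1000 rows written, 600 applied on the lagging replica) are made-up numbers. It assumes a single ordered write log shared by all replicas, which is a simplification of how replication actually works.

```python
# Toy model only: NOT the Kudu API. All names and numbers are hypothetical.

class Replica:
    """A tablet replica that applies entries from a shared, ordered write log."""

    def __init__(self, log):
        self.log = log        # shared list of row keys, in commit order
        self.applied = 0      # number of log entries this replica has applied

    def catch_up(self, upto=None):
        """Apply log entries up to index `upto` (default: the whole log)."""
        target = len(self.log) if upto is None else min(upto, len(self.log))
        self.applied = max(self.applied, target)

    def count_rows(self):
        """Count only the rows this replica has applied so far."""
        return self.applied


def scan_latest(replica):
    """READ_LATEST-like scan: return whatever the chosen replica has right now."""
    return replica.count_rows()


def scan_at_snapshot(replica, snapshot):
    """READ_AT_SNAPSHOT-like scan: wait until the replica has applied
    everything up to `snapshot`, then count the rows visible at that point."""
    replica.catch_up(upto=snapshot)   # stands in for "wait for the replica"
    return min(replica.count_rows(), snapshot)


def insert(log, key):
    """Insert a key; a duplicate is reported as an error and nothing is written."""
    if key in log:
        return False                  # "row already present"
    log.append(key)
    return True


log = []
follower = Replica(log)

# First ingest: every row is durably in the log.
for key in range(1000):
    insert(log, key)

# After the restart the follower has only applied part of the log.
follower.catch_up(upto=600)
stale_count = scan_latest(follower)

# Re-ingesting the same data writes nothing: every row is a duplicate.
results = [insert(log, key) for key in range(1000)]
modified = sum(results)
errors = len(results) - modified

# A snapshot scan waits for the replica instead of returning a stale answer.
snapshot_count = scan_at_snapshot(follower, snapshot=len(log))
later_count = scan_latest(follower)

print("READ_LATEST on lagging replica:", stale_count)
print("re-ingest: modified", modified, "rows,", errors, "errors")
print("READ_AT_SNAPSHOT:", snapshot_count)
print("READ_LATEST after catch-up:", later_count)
```

In this model the second ingest changes nothing in the log, so the corrected count comes from the replica catching up, not from the re-insert. That matches David's reading of steps 4 to 6.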