Hi Petter,
> Before we shut down we could only see the following in the logs.
> I.e., no sign that ingestion was still ongoing.
Interesting. Just to be sure, were those messages seen on one tserver, or
did you see them across all of them?
> But if the maintenance_manager performs important jobs that are
> required to ensure that all data is inserted then I can understand
> why we ended up with inconsistent data.
The maintenance manager's role is somewhat orthogonal to writes: data
is first written to the on-disk write-ahead log and also kept
in-memory to be accessible by scans. The maintenance manager
periodically shuttles this in-memory data to disk, among various other
tasks like cleaning up WAL segments, compacting rowsets, etc. Given
that, a lack of maintenance ops shouldn't cause incorrectness in data,
even after restarting.
> I would assume this means that it does not guarantee consistency
> if new data is inserted but should give valid (and same) results
> if no new data is inserted?
Right, if /all/ tservers are truly caught up and done processing the
writes, with no tablet copies going on, and with no new data coming
in, then the results should be consistent.
Hope this helped,
Andrew
On Wed, Dec 6, 2017 at 7:33 AM, Boris Tyukin <[email protected]> wrote:
This is smart, we are doing the same thing. But the part that
attracts me to Kudu is replacing our main HDFS storage with Kudu
to enable near-real-time use cases without having to deal with
HBase and a Lambda-architecture mess, so reliability and
scalability are a big deal for us as we look to move most of our
data to Kudu.
On Wed, Dec 6, 2017 at 9:58 AM, Petter von Dolwitz (Hem)
<[email protected]> wrote:
Hi Boris,
We do not have a Cloudera contract at the moment. Until we have
gained more Kudu experience we keep our master data in Parquet
format so that we can rebuild the Kudu tables upon errors. We are
still in the early learning phase.
Br,
Petter
2017-12-06 14:35 GMT+01:00 Boris Tyukin <[email protected]>:
This is a definitely concerning thread for us, as we are looking
to use Impala for storing mission-critical company data. Petter,
are you a paid Cloudera customer, btw? I wonder if you opened a
support ticket as well.
On Wed, Dec 6, 2017 at 7:26 AM, Petter von Dolwitz (Hem)
<[email protected]> wrote:
Thanks for your reply Andrew!
> How did you verify that all the data was inserted and
> how did you find some data missing?
This was done using Impala. We counted the rows for
groups representing the chunks we inserted.
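(For illustration only: the per-chunk verification described here
might look roughly like the Impala queries below, with the Parquet
master copy providing the expected counts to compare against. The
table names and the "chunk" column are placeholders, since the
real schema is not shown in this thread.)

  -- count rows per ingested chunk (here: per day) in the Kudu table
  SELECT event_day, COUNT(*) AS row_cnt
  FROM events_kudu
  GROUP BY event_day
  ORDER BY event_day;

  -- the same query against the Parquet master copy gives the
  -- expected counts for each chunk
  SELECT event_day, COUNT(*) AS row_cnt
  FROM events_parquet
  GROUP BY event_day
  ORDER BY event_day;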
> Following up on what I posted, take a look at
> https://kudu.apache.org/docs/transaction_semantics.html#_read_operations_scans.
> It seems definitely possible that not all of the rows had
> finished inserting when counting, or that the scans were sent
> to a stale replica.
Before we shut down we could only see the following in
the logs. I.e., no sign that ingestion was still ongoing.
kudu-tserver.ip-xx-yyy-z-nnn.root.log.INFO.20171201-065232.90314:I1201
07:27:35.010694 90793 maintenance_manager.cc:383] P
a38902afefca4a85a5469d149df9b4cb: we have exceeded our
soft memory limit (current capacity is 67.52%).
However, there are no ops currently runnable which
would free memory.
Also, the (Cloudera) metric
total_kudu_rows_inserted_rate_across_kudu_replicas
showed zero.
Still, it seems like some data became inconsistent
after the restart. But if the maintenance_manager
performs important jobs that are required to ensure
that all data is inserted, then I can understand why
we ended up with inconsistent data. But, if I
understand you correctly, you are saying that these
jobs are not critical for ingestion. In the link you
provided I read "Impala scans are currently performed
as READ_LATEST and have no consistency guarantees."
I would assume this means that it does not guarantee
consistency if new data is inserted but should give
valid (and same) results if no new data is inserted?
I have not tried the ksck tool yet. Thank you for
the reminder. I will have a look.
Br,
Petter
2017-12-06 1:31 GMT+01:00 Andrew Wong <[email protected]>:
How did you verify that all the data was
inserted and how did you find some data
missing? I'm wondering if it's possible that
the initial "missing" data was data that Kudu
was still in the process of inserting (albeit
slowly, due to memory backpressure or somesuch).
Following up on what I posted, take a look at
https://kudu.apache.org/docs/transaction_semantics.html#_read_operations_scans.
It seems definitely possible that not all of the
rows had finished inserting when counting, or that
the scans were sent to a stale replica.
On Tue, Dec 5, 2017 at 4:18 PM, Andrew Wong
<[email protected]>
wrote:
Hi Petter,
> When we verified that all data was
> inserted we found that some data was
> missing. We added this missing data and on
> some chunks we got the information that
> all rows were already present, i.e. Impala
> says something like "Modified: 0 rows,
> nnnnnnn errors". Doing the verification
> again now shows that the Kudu table is
> complete. So, even though we did not
> insert any data on some chunks, a count(*)
> operation over these chunks now returns a
> different value.
How did you verify that all the data was
inserted and how did you find some data
missing? I'm wondering if it's possible that
the initial "missing" data was data that Kudu
was still in the process of inserting (albeit
slowly, due to memory backpressure or somesuch).
> Now to my question. Will data be
> inconsistent if we recycle Kudu after
> seeing soft memory limit warnings?
Your data should be consistently written, even
with those warnings. AFAIK they would cause a
bit of slowness, not incorrect results.
> Is there a way to tell when it is safe to
> restart Kudu to avoid these issues? Should
> we use any special procedure when
> restarting (e.g. only restart the tablet
> servers, only restart one tablet server at
> a time or something like that)?
In general, you can use the `ksck` tool to
check the health of your cluster. See
https://kudu.apache.org/docs/command_line_tools_reference.html#cluster-ksck
for more details. For restarting a cluster, I
would recommend taking down all tablet servers
at once, otherwise tablet replicas may try to
replicate data from the server that was taken
down.
Hope this helped,
Andrew
On Tue, Dec 5, 2017 at 10:42 AM, Petter von
Dolwitz (Hem) <[email protected]> wrote:
Hi Kudu users,
We just started to use Kudu
(1.4.0+cdh5.12.1). To make a baseline for
evaluation we ingested 3 months' worth of
data. During ingestion we were facing
messages from the maintenance threads that
a soft memory limit was reached. It seems
like the background maintenance threads
stopped performing their tasks at this
point in time. It also seems like the
memory was never recovered even after
stopping ingestion, so I guess there was a
large backlog being built up. I guess the
root cause here is that we were a bit too
conservative when giving Kudu memory.
After a restart a lot of maintenance
tasks were started (e.g. compaction).
When we verified that all data was
inserted we found that some data was
missing. We added this missing data and on
some chunks we got the information that
all rows were already present, i.e. Impala
says something like "Modified: 0 rows,
nnnnnnn errors". Doing the verification
again now shows that the Kudu table is
complete. So, even though we did not
insert any data on some chunks, a count(*)
operation over these chunks now returns a
different value.
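(For illustration only: re-inserting a missing chunk
from the Parquet master copy might look roughly like
the statement below; the table names, columns and day
value are placeholders. Rows whose primary key already
exists in Kudu are reported by Impala as row errors
rather than modified rows, which matches the
"Modified: 0 rows, nnnnnnn errors" outcome described
above.)

  -- hypothetical re-insert of one chunk (one day)
  -- from the Parquet master copy into the Kudu table
  INSERT INTO events_kudu
  SELECT event_day, event_id, payload
  FROM events_parquet
  WHERE event_day = '2017-09-15';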
Now to my question. Will data be
inconsistent if we recycle Kudu after
seeing soft memory limit warnings?
Is there a way to tell when it is safe to
restart Kudu to avoid these issues? Should
we use any special procedure when
restarting (e.g. only restart the tablet
servers, only restart one tablet server at
a time or something like that)?
The table design uses 50 tablets per day
(times 90 days). It is 8 TB of data after
3x replication over 5 tablet servers.
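(For concreteness, a layout of roughly 50 tablets per
day could be expressed in Impala as sketched below;
the schema, column names and dates are made up, since
the real DDL is not shown in this thread. 50 hash
buckets times 90 daily range partitions gives ~4500
tablets before 3x replication.)

  CREATE TABLE events_kudu (
    event_day STRING,
    event_id BIGINT,
    payload STRING,
    PRIMARY KEY (event_day, event_id)
  )
  PARTITION BY HASH (event_id) PARTITIONS 50,
  RANGE (event_day) (
    PARTITION VALUE = '2017-09-01',
    PARTITION VALUE = '2017-09-02'
    -- ... one range partition per day, 90 in total
  )
  STORED AS KUDU;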
Thanks,
Petter
--
Andrew Wong