Hey Boris,

Thanks for reporting back with results!
On Wed, Jan 3, 2018 at 10:38 AM, Boris Tyukin <[email protected]> wrote:

> so it was the page cache that made this difference. we did a series of
> tests, restarting either Kudu only, Impala only, or both, and either
> resetting or not touching the page cache.
>
> as for the Kudu failures after restart, it was the sequence in which
> services need to be started before Kudu. If we start Kudu after HDFS,
> everything is fine. Data is intact.

Is it possible that Kudu is sharing disks with ZK?

> thanks again for your help, J-D
>
> On Sat, Dec 16, 2017 at 4:05 PM, Jean-Daniel Cryans <[email protected]> wrote:
>
>> I'm more thinking in terms of the startup IO having some impact on the
>> co-located services, but we really need to know what "went down" means.
>>
>> On Sat, Dec 16, 2017 at 12:50 PM, Boris Tyukin <[email protected]> wrote:
>>
>>> yep, it is really weird since Kudu uses neither one. I'll get with him
>>> on Monday to gather more details.
>>>
>>> On Sat, Dec 16, 2017 at 3:28 PM, Jean-Daniel Cryans <[email protected]> wrote:
>>>
>>>> Hi Boris,
>>>>
>>>> How exactly did HDFS and ZK go down? A Kudu restart is fairly
>>>> IO-intensive, but I don't know how that could cause things like
>>>> DataNodes to fail.
>>>>
>>>> J-D
>>>>
>>>> On Sat, Dec 16, 2017 at 11:45 AM, Boris Tyukin <[email protected]> wrote:
>>>>
>>>>> well, our admin had fun for two days - it was the first time we
>>>>> restarted Kudu on our DEV cluster and it did not go well. He is still
>>>>> troubleshooting what happened, but after the Kudu restart, ZooKeeper
>>>>> and HDFS went down after 3-4 minutes. If we disable Kudu, all is
>>>>> well. No errors in the Kudu logs... I will have more details next
>>>>> week, so I am not asking for help as I do not know all the details.
>>>>> What is obvious though is that it has something to do with Kudu :)
>>>>>
>>>>> On Thu, Dec 14, 2017 at 9:40 AM, Boris Tyukin <[email protected]> wrote:
>>>>>
>>>>>> thanks for your suggestions, J-D, I am sure you are right more often
>>>>>> than that! :))
>>>>>>
>>>>>> I will report back with our results. So far I am really impressed
>>>>>> with Kudu - we have been benchmarking ingest and egress throughput
>>>>>> and our typical queries' runtimes. The biggest pain so far is the
>>>>>> lack of support for decimals.
>>>>>>
>>>>>> On Wed, Dec 13, 2017 at 5:07 PM, Jean-Daniel Cryans <[email protected]> wrote:
>>>>>>
>>>>>>> On Wed, Dec 13, 2017 at 11:30 AM, Boris Tyukin <[email protected]> wrote:
>>>>>>>
>>>>>>>> thanks J-D! we are going to try that and see how it impacts the
>>>>>>>> runtime.
>>>>>>>>
>>>>>>>> is there any way to load this metadata upfront? a lot of our
>>>>>>>> queries are ad hoc in nature, but they will be hitting the same
>>>>>>>> tables with different predicates and join patterns though.
>>>>>>>
>>>>>>> You could use Impala to compute the stats of all the tables after
>>>>>>> each Kudu restart. Actually, do try that: restart Kudu, then
>>>>>>> compute stats and see how fast it scans.
>>>>>>>
>>>>>>>> I am curious why this metadata does not survive restarts though.
>>>>>>>> We are going to run our benchmarks again and this time restart
>>>>>>>> Kudu and Impala.
>>>>>>>
>>>>>>> It's in the tserver's memory, so it can't survive a restart.
>>>>>>>
>>>>>>>> I just ran another query for the first time; it hits 2 large
>>>>>>>> tables that had been scanned by the previous query, and this time
>>>>>>>> I do not see any difference in query time between the first and
>>>>>>>> second runs - I guess this confirms your statement about "first
>>>>>>>> time ever scanning the table since a Kudu restart" and collecting
>>>>>>>> metadata.
>>>>>>>
>>>>>>> Maybe, I've been known to be right once or twice a year :)
>>>>>>>
>>>>>>>> On Wed, Dec 13, 2017 at 11:18 AM, Jean-Daniel Cryans <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Boris,
>>>>>>>>>
>>>>>>>>> Given that we don't have much data we can use here, I'll have to
>>>>>>>>> extrapolate. As an aside though, this is yet another example
>>>>>>>>> where we need more Kudu-side metrics in the query profile.
>>>>>>>>>
>>>>>>>>> So, Kudu lazily loads a bunch of metadata and that can really
>>>>>>>>> affect scan times. If this was your first time ever scanning the
>>>>>>>>> table since a Kudu restart, it's very possible that that's where
>>>>>>>>> that time was spent. There's also the page cache in the OS that
>>>>>>>>> might now be populated. You could do something like "sync; echo
>>>>>>>>> 3 > /proc/sys/vm/drop_caches" on all the machines and run the
>>>>>>>>> query 2 times again, without restarting Kudu, to understand the
>>>>>>>>> effect of the page cache itself. There's currently no way to
>>>>>>>>> purge the cached metadata in Kudu though.
>>>>>>>>>
>>>>>>>>> Hope this helps a bit,
>>>>>>>>>
>>>>>>>>> J-D
>>>>>>>>>
>>>>>>>>> On Wed, Dec 13, 2017 at 8:07 AM, Boris Tyukin <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi guys,
>>>>>>>>>>
>>>>>>>>>> I am doing some benchmarks with Kudu and Impala/Parquet and hope
>>>>>>>>>> to share them soon, but there is one thing that bugs me. This is
>>>>>>>>>> perhaps an Impala question, but since I am using Kudu with
>>>>>>>>>> Impala I am going to try and ask anyway.
>>>>>>>>>>
>>>>>>>>>> One of my queries takes 120 seconds to run the very first time.
>>>>>>>>>> It joins one large 5B-row table with a bunch of smaller tables
>>>>>>>>>> and then stores the result in Impala/Parquet (not Kudu).
>>>>>>>>>>
>>>>>>>>>> Now if I run it a second and third time, it only takes 60
>>>>>>>>>> seconds. Can someone explain why? Are there any settings to
>>>>>>>>>> decrease this gap?
>>>>>>>>>>
>>>>>>>>>> I've compared query profiles in CM and the only thing that was
>>>>>>>>>> very different is the scan against the Kudu table (the large one):
>>>>>>>>>>
>>>>>>>>>> ***************************
>>>>>>>>>> first time:
>>>>>>>>>> ***************************
>>>>>>>>>> KUDU_SCAN_NODE (id=0) (47.68s)
>>>>>>>>>>
>>>>>>>>>> - BytesRead: 0 B
>>>>>>>>>> - InactiveTotalTime: 0ns
>>>>>>>>>> - KuduRemoteScanTokens: 0
>>>>>>>>>> - NumScannerThreadsStarted: 20
>>>>>>>>>> - PeakMemoryUsage: 35.8 MiB
>>>>>>>>>> - RowsRead: 693,502,241
>>>>>>>>>> - RowsReturned: 693,502,241
>>>>>>>>>> - RowsReturnedRate: 14643448 per second
>>>>>>>>>> - ScanRangesComplete: 20
>>>>>>>>>> - ScannerThreadsInvoluntaryContextSwitches: 1,341
>>>>>>>>>> - ScannerThreadsTotalWallClockTime: 36.2m
>>>>>>>>>> - MaterializeTupleTime(*): 47.57s
>>>>>>>>>> - ScannerThreadsSysTime: 31.42s
>>>>>>>>>> - ScannerThreadsUserTime: 1.7m
>>>>>>>>>> - ScannerThreadsVoluntaryContextSwitches: 96,855
>>>>>>>>>> - TotalKuduScanRoundTrips: 52,308
>>>>>>>>>> - TotalReadThroughput: 0 B/s
>>>>>>>>>> - TotalTime: 47.68s
>>>>>>>>>>
>>>>>>>>>> ***************************
>>>>>>>>>> second time:
>>>>>>>>>> ***************************
>>>>>>>>>> KUDU_SCAN_NODE (id=0) (4.28s)
>>>>>>>>>>
>>>>>>>>>> - BytesRead: 0 B
>>>>>>>>>> - InactiveTotalTime: 0ns
>>>>>>>>>> - KuduRemoteScanTokens: 0
>>>>>>>>>> - NumScannerThreadsStarted: 20
>>>>>>>>>> - PeakMemoryUsage: 37.9 MiB
>>>>>>>>>> - RowsRead: 693,502,241
>>>>>>>>>> - RowsReturned: 693,502,241
>>>>>>>>>> - RowsReturnedRate: 173481534 per second
>>>>>>>>>> - ScanRangesComplete: 20
>>>>>>>>>> - ScannerThreadsInvoluntaryContextSwitches: 1,451
>>>>>>>>>> - ScannerThreadsTotalWallClockTime: 19.5m
>>>>>>>>>> - MaterializeTupleTime(*): 4.20s
>>>>>>>>>> - ScannerThreadsSysTime: 38.22s
>>>>>>>>>> - ScannerThreadsUserTime: 1.7m
>>>>>>>>>> - ScannerThreadsVoluntaryContextSwitches: 480,870
>>>>>>>>>> - TotalKuduScanRoundTrips: 52,142
>>>>>>>>>> - TotalReadThroughput: 0 B/s
>>>>>>>>>> - TotalTime: 4.28s
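[Editor's note] The page-cache experiment discussed in the thread (flush the OS page cache on every node between query runs, without restarting Kudu) could be scripted roughly as below. This is only a sketch: the host list is a placeholder and passwordless root ssh is an assumption, so the loop prints each command as a dry run instead of executing it.

```shell
#!/bin/sh
# Sketch of the page-cache reset suggested in the thread. HOSTS is a
# placeholder; substitute your real cluster hostnames.
HOSTS="node1 node2 node3"

# "sync" flushes dirty pages to disk; writing 3 to drop_caches evicts
# the page cache plus dentries and inodes (requires root).
CMD='sync; echo 3 > /proc/sys/vm/drop_caches'

for h in $HOSTS; do
  # Dry run: print one ssh command per host. To actually run it,
  # replace this echo with: ssh root@"$h" "$CMD"
  echo "ssh root@$h '$CMD'"
done
```

Running the benchmark query twice after this, with Kudu left up, isolates the page-cache effect from Kudu's lazily loaded metadata.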
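[Editor's note] The other suggestion in the thread, computing stats on all tables after each Kudu restart so the first real query doesn't pay the metadata-loading cost, could be wrapped in a small script like this. The coordinator address and table names are hypothetical, so the loop prints the impala-shell invocations as a dry run rather than executing them.

```shell
#!/bin/sh
# Sketch: warm Kudu's lazily loaded metadata after a restart by running
# COMPUTE STATS (a full scan) over each Kudu-backed table.
# IMPALAD and TABLES are placeholders for your coordinator and tables.
IMPALAD="impalad-host:21000"
TABLES="db.big_fact db.dim_one db.dim_two"

for t in $TABLES; do
  # Dry run: print the command; pipe to "sh" or drop the printf/quotes
  # to execute for real.
  printf 'impala-shell -i %s -q "COMPUTE STATS %s"\n' "$IMPALAD" "$t"
done
```

As a side benefit, the planner gets fresh table and column stats, which helps the join order of the ad hoc queries mentioned above.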
