Re: New Impala PMC member: Joe McDonnell

2018-08-21 Thread Yongjun Zhang
Congratulations Joe, great achievement! --Yongjun On Tue, Aug 21, 2018 at 3:30 PM, Tim Armstrong wrote: > The Project Management Committee (PMC) for Apache Impala has invited Joe > McDonnell to become a PMC member and we are pleased to announce that they > have accepted. > Congratulations and

Re: Improving Kudu Build Support

2018-08-21 Thread Tim Armstrong
Is there a path to building a version of Kudu locally for an arbitrary Linux distro? Personally I am less concerned about 14.04 support and more concerned about what the path to upgrading to 18.04 is. It would also be nice for it to be at least possible to develop on RedHat-derived distros even if

New Impala PMC member: Joe McDonnell

2018-08-21 Thread Tim Armstrong
The Project Management Committee (PMC) for Apache Impala has invited Joe McDonnell to become a PMC member and we are pleased to announce that they have accepted. Congratulations and welcome, Joe!

Re: Improving latency of catalog update propagation?

2018-08-21 Thread Tim Armstrong
Yeah I think the angle makes sense to pursue. I don't feel strongly about whether now or later is the right time to pursue it, but it does seem like it's not the immediate highest priority. On Tue, Aug 21, 2018 at 1:57 PM, Tianyi Wang wrote: > GetCatalogDelta used to block catalogd from

Re: Improving latency of catalog update propagation?

2018-08-21 Thread Tianyi Wang
GetCatalogDelta used to block catalogd from executing DDLs and the pending struct was yet another cache to smooth things a little. On Tue, Aug 21, 2018 at 11:28 AM Todd Lipcon wrote: > One more parting thought: why don't we just call 'GetCatalogDelta()' > directly from the catalog callback in

Re: Impalad JVM OOM minutes after restart

2018-08-21 Thread Brock Noland
Jeszy - yes unfortunately I cannot share the query details at this time. No hs_err file was generated. Philip - Yeah that seems to be the way to go. On Tue, Aug 21, 2018 at 1:51 PM, Philip Zeyliger wrote: > Hi Brock, > > If you want to make Eclipse MAT more usable, set JAVA_TOOL_OPTIONS="-Xmx2g

Re: Impalad JVM OOM minutes after restart

2018-08-21 Thread Philip Zeyliger
Hi Brock, If you want to make Eclipse MAT more usable, set JAVA_TOOL_OPTIONS="-Xmx2g -XX:+HeapDumpOnOutOfMemoryError" and you should see the max heap at 2GB, thereby making Eclipse MAT friendlier. Folks have also been using http://www.jxray.com/. The query itself will also be interesting. If

Re: Impalad JVM OOM minutes after restart

2018-08-21 Thread Jeszy
Hm, that's interesting because: - I haven't yet seen query planning itself cause OOM - if it was catalog related to the tables involved in the query, following initial topic size would be bigger Can you share diagnostic data, like the query text, definitions and stats for tables involved,

Re: Impalad JVM OOM minutes after restart

2018-08-21 Thread Brock Noland
Hi Jeszy, Thanks, good tip. The MS is quite small. Even mysqldump format is only 12MB. The largest catalog-update I could find is only 1.5MB, which should be easy to process with 32GB of heap. Lastly, it's possible we can reproduce by running the query the impalad was processing during the

Re: Improving latency of catalog update propagation?

2018-08-21 Thread Todd Lipcon
One more parting thought: why don't we just call 'GetCatalogDelta()' directly from the catalog callback in order to do a direct handoff, instead of storing them in this 'pending' struct? Given the statestore uses a dedicated thread per subscriber (right?) it seems like it would be fine for the

Re: Impalad JVM OOM minutes after restart

2018-08-21 Thread Jeszy
Hey, If it happens shortly after a restart, there is a fair chance you're crashing while processing the initial catalog topic update. Statestore logs will tell you how big that was (it takes more memory to process it than the actual size of the update). If this is the case, it should also be

Impalad JVM OOM minutes after restart

2018-08-21 Thread Brock Noland
Hi folks, I've got an Impala CDH 5.14.2 cluster with a handful of users, 2-3, at any one time. All of a sudden the JVM inside the Impalad started running out of memory. I got a heap dump, but the heap was 32GB, host is 240GB, so it's very large. Thus I wasn't able to get Memory Analyzer Tool

Re: Improving latency of catalog update propagation?

2018-08-21 Thread Todd Lipcon
Thanks, Tim. I'm guessing once we switch over these RPCs to KRPC instead of Thrift we'll alleviate some of the scalability issues and maybe we can look into increasing frequency or doing a "push" to the statestore, etc. I probably won't work on this in the near term to avoid complicating the

Re: Improving latency of catalog update propagation?

2018-08-21 Thread Tim Armstrong
This is somewhat relevant for admission control too - I had thought about some of these issues in that context, because reducing the latency of admission control state propagation helps avoid overadmission, but having a very low statestore frequency is very inefficient and doesn't scale well to

Improving latency of catalog update propagation?

2018-08-21 Thread Todd Lipcon
Hey folks, In my recent forays into the catalog->statestore->impalad metadata propagation code base, I noticed that the latency of any update is typically between 2 and 4 seconds with the standard 2-second statestore polling interval. That's because the code currently works as follows: 1. in the

Re: Range partition on HDFS

2018-08-21 Thread Vuk Ercegovac
Were you thinking of something like this? https://www.cloudera.com/documentation/enterprise/latest/topics/impala_partitioning.html On Tue, Aug 21, 2018 at 7:37 AM Yuming Wang wrote: > Hi, > > Only kudu supports range partition, can HDFS support this feature? > > >

Re: Re: New Impala committer - Quanlong Huang

2018-08-21 Thread Zoltan Borok-Nagy
Congrats Quanlong! On Tue, Aug 21, 2018 at 9:34 AM Gabor Kaszab wrote: > Congrats! > > On Sat, Aug 18, 2018 at 3:11 AM Quanlong Huang > wrote: > > > Thanks! Glad to work with you all!--Quanlong > > > > At 2018-08-18 03:09:38, "Yongjun Zhang" wrote: > > >Congratulations Quanlong! > > > > >

Range partition on HDFS

2018-08-21 Thread Yuming Wang
Hi, Only Kudu supports range partitioning; can HDFS support this feature? https://kudu.apache.org/docs/kudu_impala_integration.html#basic_partitioning Thanks.

Re: Improving Kudu Build Support

2018-08-21 Thread Laszlo Gaal
+1 for simplifying Kudu updates. I am also still on Ubuntu 14.04, but I am all for simplifying Kudu integration: I agree with Thomas that Kudu snapshots should be grouped with the other CDH components. Given that Ubuntu 14.04 will be EOL'd next spring, upgrading the development OS is a reasonably

Re: Re: New Impala committer - Quanlong Huang

2018-08-21 Thread Gabor Kaszab
Congrats! On Sat, Aug 18, 2018 at 3:11 AM Quanlong Huang wrote: > Thanks! Glad to work with you all!--Quanlong > > At 2018-08-18 03:09:38, "Yongjun Zhang" wrote: > >Congratulations Quanlong! > > > >--Yongjun > > > >On Fri, Aug 17, 2018 at 12:07 PM, Jeszy wrote: > > > >> Congrats Quanlong! > >>