Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-05 Thread Jacky Lee
+1 (non-binding). Thanks John! It's great to see ViewCatalog moving forward; it's a nice feature. On Sat, Feb 5, 2022 at 11:57, Terry Kim wrote: > +1 (non-binding). Thanks John! > > Terry > > On Fri, Feb 4, 2022 at 4:13 PM Yufei Gu wrote: > >> +1 (non-binding) >> Best, >> >> Yufei >> >> `This is not a

Re: spark, autoscaling and handling node loss with autoscaling

2022-02-05 Thread Mich Talebzadeh
I ran some tests on a three-node Dataproc cluster with autoscaling on: one master node and two worker nodes. The master node was called ctpcluster-m and the worker nodes were ctpcluster-w-0 and ctpcluster-w-1, respectively. I started a spark-submit job with the following autoscaling parameters added

[Spark Core] [Feature] unionByName parameters

2022-02-05 Thread Daniel Davies
Hello dev@, I had a quick question about the unionByName function. This function currently seems to accept a parameter, "allowMissingColumns", that allows some tolerance when merging datasets with different schemas [e.g. here

spark, autoscaling and handling node loss with autoscaling

2022-02-05 Thread Mich Talebzadeh
This question arises when Spark is offered as a managed service on a cluster of VMs in the cloud, for example Google Dataproc or Amazon EMR, among others. From what I can see in the autoscaling setup, you will always need a minimum of two