Yeah, I think approximate percentile is good enough most of the time.
I don't have a specific need for a precise median. I was interested in
implementing it more as a Catalyst learning exercise, but it turns out I
picked a bad learning exercise to solve. :)
On Mon, Dec 13, 2021 at 9:46 PM Reynold
tl;dr: there's no easy way to implement aggregate expressions that require
multiple passes over the data. It is simply not something that's supported, and
doing so would be very costly.
Would you be OK using approximate percentile? That's relatively cheap.
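(In Spark SQL the cheap option referred to here is the built-in `percentile_approx` function, e.g. `SELECT percentile_approx(col, 0.5) FROM t`.) To illustrate the trade-off being discussed, here is a hedged plain-Python sketch, not Spark code: an exact median must hold and sort all values, while an approximation can be computed in a single pass with bounded memory. The reservoir-sampling approach below is just one illustrative stand-in for what an approximate percentile buys you; the function names and sample size are made up for the example.

```python
import random

def exact_median(values):
    # Exact median: requires materializing and sorting every value,
    # which is the part that is expensive to do as a distributed aggregate.
    s = sorted(values)
    n = len(s)
    mid = n // 2
    if n % 2:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2

def approx_median(values, sample_size=100, seed=0):
    # One-pass approximation via reservoir sampling: bounded memory,
    # no second pass over the data -- the same kind of trade-off an
    # approximate percentile makes (real implementations use sketches
    # such as t-digest rather than a plain random sample).
    rng = random.Random(seed)
    reservoir = []
    for i, v in enumerate(values):
        if i < sample_size:
            reservoir.append(v)
        else:
            j = rng.randint(0, i)
            if j < sample_size:
                reservoir[j] = v
    return exact_median(reservoir)

data = list(range(1, 10001))
print(exact_median(data))   # 5000.5
print(approx_median(data))  # close to 5000.5, not exact
```

The exact version is what a precise median aggregate would force on the engine; the approximate version is why `percentile_approx` is "relatively cheap."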
On Mon, Dec 13, 2021 at 6:43 PM, Nichola
No takers here? :)
I can see now why a median function is not available in most data
processing systems. It's pretty annoying to implement!
On Thu, Dec 9, 2021 at 9:25 PM Nicholas Chammas wrote:
> I'm trying to create a new aggregate function. It's my first time working
> with Catalyst, so it's
My understanding is that we don’t need to do anything. log4j2-core is not used
in Spark.
> On Dec 13, 2021, at 12:45 PM, Pralabh Kumar wrote:
>
> Hi developers, users
>
> Spark is built using log4j 1.2.17. Is there a plan to upgrade, based on the
> recently detected CVE?
>
>
> Regards
> Pralabh kumar
--
You would want to shade this dependency in your app, in which case you
would be using log4j 2. If you don't shade and just include it, you will
also be using log4j 2 as some of the API classes are different. If they
overlap with log4j 1, you will probably hit errors anyway.
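As a rough sketch of what shading could look like, here is an illustrative Maven Shade Plugin relocation; the relocation prefix `myapp.shaded` and the exact pattern are assumptions for the example, not a vetted configuration:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <!-- Move the log4j 2 classes your app bundles into a private
               package so they cannot clash with the logging classes
               Spark already puts on the classpath. -->
          <relocation>
            <pattern>org.apache.logging</pattern>
            <shadedPattern>myapp.shaded.org.apache.logging</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```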
On Mon, Dec 13, 2021 at
This has come up several times over the years - search JIRA. The very short
summary is: Spark does not use log4j 1.x, but its dependencies do, and
that's the issue.
Anyone that can successfully complete the surgery at this point is welcome
to, but I failed ~2 years ago.
On Mon, Dec 13, 2021 at 10:02 A
Is it in any case appropriate to use log4j 1.x, which is not maintained anymore
and has other security vulnerabilities that won't be fixed?
> On Dec 13, 2021, at 06:06, Sean Owen wrote:
>
>
> Check the CVE - the log4j vulnerability appears to affect log4j 2, not 1.x.
> There was menti
Probably it is because you are using an older version of Spark.
This works for version 3.1.1:
gsutil ls gs://etcbucket/ojdbc8.jar
gs://etcbucket/ojdbc8.jar
spark-sql add jar gs://etcbucket/ojdbc8.jar
21/12/13 08:56:00 WARN NativeCodeLoader: Unable to load native-hadoop
library for your platform