Re: [Spark 2] BigDecimal and 0
From: Efe Selcuk

I should have noted that I understand the 0E-18 notation (scientific notation) and that in the normal case it is no different from 0; I just wanted to make sure there wasn't something tricky going on, since the representation was seemingly changing.

Michael, that's a fair point. I keep operating under the assumption of some guaranteed exactness from BigDecimal, but I realize there is probably some math happening that produces results which can't be represented perfectly.

Thanks guys. I'm good now.
Re: [Spark 2] BigDecimal and 0
From: Jakob Odersky

Yes, thanks for elaborating, Michael. The other thing I wanted to highlight is that in this specific case the value is actually exactly zero: 0E-18 = 0 * 10^(-18) = 0.
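To see this outside of Spark, here is a minimal sketch in plain Scala (standard library only). The key point is that 0E-18 is just zero stored with scale 18; Scala's == compares BigDecimals numerically, while Java's equals also compares scale:

    import java.math.{BigDecimal => JBigDecimal}

    // 0E-18 is the unscaled value 0 with scale 18, i.e. 0 * 10^-18
    val zeroScaled = BigDecimal(JBigDecimal.valueOf(0L, 18))

    zeroScaled.toString                                     // "0E-18"
    zeroScaled == BigDecimal(0)                             // true:  Scala compares numerically
    zeroScaled.underlying == JBigDecimal.ZERO               // false: Java's equals also checks scale
    zeroScaled.underlying.compareTo(JBigDecimal.ZERO) == 0  // true:  numeric comparison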
Re: [Spark 2] BigDecimal and 0
From: Michael Matsko

Efe,

I think Jakob's point is that there is no problem. When you deal with real numbers, you don't get exact representations. There is always some slop in the representations; things don't ever cancel out exactly. Testing reals for equality to zero will almost never work.

Look at Goldberg's paper
https://ece.uwaterloo.ca/~dwharder/NumericalAnalysis/02Numerics/Double/paper.pdf
for a quick intro.

Mike
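In practice that means comparing against an application-chosen tolerance rather than against zero itself. A minimal sketch (the epsilon here is an arbitrary assumption; pick a value that matches the scale of your data):

    // Treat anything within eps of zero as zero, instead of testing == 0.
    val eps = BigDecimal("1E-12") // tolerance: an arbitrary choice for illustration

    def nearlyZero(x: BigDecimal): Boolean = x.abs <= eps

    nearlyZero(BigDecimal("0E-18")) // true
    nearlyZero(BigDecimal("0.01"))  // false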
Re: [Spark 2] BigDecimal and 0
From: Efe Selcuk

Okay, so this isn't contributing to any kind of imprecision. I suppose I need to go digging further, then. Thanks for the quick help.
Re: [Spark 2] BigDecimal and 0
From: Jakob Odersky

What you're seeing is merely a strange representation: 0E-18 is zero. The E-18 reflects the scale (number of fractional digits) that Spark uses to store the decimal.
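Where that scale comes from: Spark 2.x encodes a Scala BigDecimal field in a case class as its system-default decimal type, decimal(38, 18), so every value comes back rescaled to 18 fractional digits. A small sketch, assuming a spark-shell session with spark.implicits._ in scope:

    case class Data(num: BigDecimal)

    val ds = Seq(Data(BigDecimal(0))).toDS
    ds.printSchema()
    // root
    //  |-- num: decimal(38,18) (nullable = true)

    ds.head.num // 0E-18: numerically still zero, rescaled to 18 fractional digits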
Re: [Spark 2] BigDecimal and 0
From: Jakob Odersky

An even smaller example that demonstrates the same behaviour:

    Seq(Data(BigDecimal(0))).toDS.head
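If the rescaled form is cosmetically bothersome, the underlying java.math.BigDecimal can be normalized back. A sketch, with the caveat that this discards the scale information (and that stripTrailingZeros only normalizes zero correctly on Java 8 and later):

    val d = BigDecimal("0E-18")

    // stripTrailingZeros removes trailing fractional zeros from the underlying
    // java.math.BigDecimal; on Java 8+ it turns 0E-18 back into plain 0.
    val normalized = BigDecimal(d.underlying.stripTrailingZeros)

    normalized.toString // "0"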
[Spark 2] BigDecimal and 0
From: Efe Selcuk

I’m trying to track down what seems to be a very slight imprecision in our Spark application; two of our columns, which should be netting out to exactly zero, are coming up with very small non-zero fractions. The only thing that I’ve found out of place is that a case class entry in a Dataset, generated with BigDecimal("0"), will end up as 0E-18 after it goes through Spark, and I don’t know if there’s any appreciable difference between that and the actual 0 value that BigDecimal normally produces. Here’s a contrived example:

    scala> case class Data(num: BigDecimal)
    defined class Data

    scala> val x = Data(0)
    x: Data = Data(0)

    scala> x.num
    res9: BigDecimal = 0

    scala> val y = Seq(x, x.copy()).toDS.reduce( (a,b) => a.copy(a.num + b.num))
    y: Data = Data(0E-18)

    scala> y.num
    res12: BigDecimal = 0E-18

    scala> BigDecimal("1") - 1
    res15: scala.math.BigDecimal = 0

Am I looking at anything valuable?

Efe