Re: [Spark 2.0.1] Error in generated code, possible regression?

2016-10-26 Thread Efe Selcuk
> could open a JIRA. > > On Mon, Oct 24, 2016 at 6:21 PM, Efe Selcuk <efema...@gmail.com> wrote: > > I have an application that works in 2.0.0 but has been dying at runtime on > the 2.0.1 distribution. > > at > org.apache.spark.sql.catalyst.expressions.codegen.Co

Re: [Spark 2.0.1] Error in generated code, possible regression?

2016-10-25 Thread Efe Selcuk
would be great. Kazuaki Ishizaki From: Efe Selcuk <efema...@gmail.com> To: "user @spark" <user@spark.apache.org> Date: 2016/10/25 10:23 Subject: [Spark 2.0.1] Error in generated code, possible regression? -- I ha

Re: [Spark 2] BigDecimal and 0

2016-10-24 Thread Efe Selcuk
for equality to zero will almost never work. > > > > Look at Goldberg's paper > > > https://ece.uwaterloo.ca/~dwharder/NumericalAnalysis/02Numerics/Double/paper.pdf > > for a quick intro. > > > > Mike > > > > On Oct 24, 2016, at 10:36 PM, Efe Selcuk
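
As a tiny illustration of the point about exact comparisons, here is the classic binary floating point example together with the tolerance-style check the paper motivates; the epsilon value is arbitrary and only for demonstration:

    val x = 0.1 + 0.2 - 0.3
    println(x)          // 5.551115123125783E-17, not 0.0
    println(x == 0.0)   // false: exact equality on doubles is fragile

    // Compare against a tolerance instead of expecting an exact zero.
    val eps = 1e-9
    println(math.abs(x) < eps)   // true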

Re: [Spark 2] BigDecimal and 0

2016-10-24 Thread Efe Selcuk
> The E-18 represents the precision that Spark uses to store the decimal > > On Mon, Oct 24, 2016 at 7:32 PM, Jakob Odersky <ja...@odersky.com> wrote: > > An even smaller example that demonstrates the same behaviour: > > > > Seq(Data(BigDecimal(0))).toDS.head > >
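
A minimal sketch of what is being described, runnable in a spark-shell (the Data case class is the hypothetical one from the thread): Spark maps a Scala BigDecimal field to DecimalType(38, 18) by default, so a zero round-tripped through a Dataset comes back with scale 18, which prints as 0E-18 even though the value is still exactly zero:

    case class Data(d: BigDecimal)

    import spark.implicits._

    val ds = Seq(Data(BigDecimal(0))).toDS
    ds.printSchema()
    // root
    //  |-- d: decimal(38,18) (nullable = true)

    val head = ds.head.d
    println(head)              // 0E-18 -- scale 18, numeric value still zero
    println(head.signum == 0)  // true: checks the value and ignores the scale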

[Spark 2] BigDecimal and 0

2016-10-24 Thread Efe Selcuk
I’m trying to track down what seems to be a very slight imprecision in our Spark application; two of our columns, which should be netting out to exactly zero, are coming up with very small fractions of non-zero value. The only thing that I’ve found out of place is that a case class entry into a

[Spark 2.0.1] Error in generated code, possible regression?

2016-10-24 Thread Efe Selcuk
I have an application that works in 2.0.0 but has been dying at runtime on the 2.0.1 distribution. at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:893) at

Re: [Spark 2.0.0] error when unioning to an empty dataset

2016-10-24 Thread Efe Selcuk
n where I can't easily build from source. On Mon, Oct 24, 2016 at 12:29 PM Cheng Lian <lian.cs@gmail.com> wrote: > > > On 10/22/16 1:42 PM, Efe Selcuk wrote: > > Ah, looks similar. Next opportunity I get, I'm going to do a printSchema > on the two datasets and s

Re: [Spark 2.0.0] error when unioning to an empty dataset

2016-10-22 Thread Efe Selcuk
Could you print the schema for data and for > someCode.thatReturnsADataset() and see if there is any difference between > the two ? > > On Fri, Oct 21, 2016 at 9:14 AM, Efe Selcuk <efema...@gmail.com> wrote: > > Thanks for the response. What do you mean by "semantic
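
A sketch of that check, where data and someCode.thatReturnsADataset() are the placeholders from the thread rather than real APIs:

    import org.apache.spark.sql.Dataset

    // Print both schemas and flag any field-level difference
    // (name, type, or nullability) that would make a union fail.
    def compareSchemas(a: Dataset[_], b: Dataset[_]): Unit = {
      a.printSchema()
      b.printSchema()
      a.schema.fields.zipAll(b.schema.fields, null, null).foreach {
        case (l, r) if l != r => println(s"mismatch: $l vs $r")
        case _                => // fields agree
      }
    }

    // compareSchemas(data, someCode.thatReturnsADataset())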

Re: [Spark 2.0.0] error when unioning to an empty dataset

2016-10-20 Thread Efe Selcuk
Thu, Oct 20, 2016 at 8:34 PM Agraj Mangal <agraj@gmail.com> wrote: I believe this normally comes when Spark is unable to perform the union due to a "difference" in the schema of the operands. Can you check whether the schemas of the two datasets are semantically the same? On Tue, Oct 18, 2016 at

Re: [Spark 2.0.0] error when unioning to an empty dataset

2016-10-17 Thread Efe Selcuk
Bump! On Thu, Oct 13, 2016 at 8:25 PM Efe Selcuk <efema...@gmail.com> wrote: > I have a use case where I want to build a dataset based off of > conditionally available data. I thought I'd do something like this: > > case class SomeData( ... ) // parameters are basic en

[Spark 2.0.0] error when unioning to an empty dataset

2016-10-13 Thread Efe Selcuk
I have a use case where I want to build a dataset based off of conditionally available data. I thought I'd do something like this: case class SomeData( ... ) // parameters are basic encodable types like strings and BigDecimals var data = spark.emptyDataset[SomeData] // loop, determining what
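
A rough sketch of the pattern being described, with SomeData reduced to a couple of hypothetical fields, plus one way to sidestep the empty dataset entirely by collecting the pieces and reducing over union:

    import org.apache.spark.sql.{Dataset, SparkSession}

    case class SomeData(id: String, amount: BigDecimal)  // hypothetical fields

    val spark = SparkSession.builder.appName("union-sketch").getOrCreate()
    import spark.implicits._

    // The approach from the thread: start empty and union in each piece.
    var data = spark.emptyDataset[SomeData]
    val batchA = Seq(SomeData("a", BigDecimal(1))).toDS
    val batchB = Seq(SomeData("b", BigDecimal(2))).toDS
    data = data.union(batchA).union(batchB)

    // Alternative: accumulate the conditional pieces and reduce,
    // only falling back to an empty Dataset when there are none.
    val pieces: Seq[Dataset[SomeData]] = Seq(batchA, batchB)
    val combined = pieces.reduceOption(_ union _).getOrElse(spark.emptyDataset[SomeData])
    combined.show()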

"Schemaless" Spark

2016-08-19 Thread Efe Selcuk
Hi Spark community, This is a bit of a high level question as frankly I'm not well versed in Spark or related tech. We have a system in place that reads columnar data in through CSV and represents the data in relational tables as it operates. It's essentially schema-based ETL. This restricts our
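
Without knowing more of the pipeline, a small sketch of what looser ingestion can look like in Spark: let the CSV reader take column names from the header and infer the types at runtime, so no relational schema has to be declared in code (the input path is made up):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder.appName("csv-infer-sketch").getOrCreate()

    // Column names come from the header row and types are inferred by
    // sampling the data, so the code stays agnostic of the file layout.
    val df = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("/path/to/input/*.csv")   // hypothetical path

    df.printSchema()   // whatever the files happen to contain
    df.show(5)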

Re: [Spark2] Error writing "complex" type to CSV

2016-08-19 Thread Efe Selcuk
>> ("a", Date.valueOf("1990-12-13")), >> ("a", Date.valueOf("1990-12-13")), >> ("a", Date.valueOf("1990-12-13")) >> ).toDF("a", "b").as[ClassData] >> ds.write.csv("/tmp/data.csv") >

Re: [Spark2] Error writing "complex" type to CSV

2016-08-18 Thread Efe Selcuk
The CSV format can't represent the nested types in its > own format. > > I guess supporting them when writing to external CSV is rather a bug. > > I think it'd be great if we could write and read back CSV in its own format, > but I guess we can't. > > Thanks! > > On 19 Aug 2016 6:

[Spark2] Error writing "complex" type to CSV

2016-08-18 Thread Efe Selcuk
We have an application working in Spark 1.6. It uses the databricks csv library for the output format when writing out. I'm attempting an upgrade to Spark 2. When writing with both the native DataFrameWriter#csv() method and with first specifying the "com.databricks.spark.csv" format (I suspect
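
One hedged workaround, since the built-in CSV source only handles flat atomic columns: flatten any nested struct fields into top-level columns before calling DataFrameWriter#csv. The Inner/Outer classes and output path below are made up for illustration:

    import org.apache.spark.sql.SparkSession

    case class Inner(city: String, zip: String)   // hypothetical nested type
    case class Outer(name: String, address: Inner)

    val spark = SparkSession.builder.appName("csv-flatten-sketch").getOrCreate()
    import spark.implicits._

    val ds = Seq(Outer("a", Inner("x", "1"))).toDS

    // Writing ds directly fails because "address" is a struct; select the
    // leaf fields with dot notation so every output column is atomic.
    val flat = ds.select($"name", $"address.city".as("city"), $"address.zip".as("zip"))
    flat.write.csv("/tmp/flattened")   // hypothetical output path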

Re: Spark2 SBT Assembly

2016-08-11 Thread Efe Selcuk
Bump! On Wed, Aug 10, 2016 at 2:59 PM, Efe Selcuk <efema...@gmail.com> wrote: > Thanks for the replies, folks. > > My specific use case is maybe unusual. I'm working in the context of the > build environment in my company. Spark was being used in such a way that >

Re: Spark2 SBT Assembly

2016-08-10 Thread Efe Selcuk
> On 10 August 2016 at 20:35, Holden Karau <hol...@pigscanfly.ca> wrote: > >> What are you looking to use the assembly jar for - maybe we can think of >> a workaround :) >> >> >> On Wednesday, August 10, 2016, Efe Selcuk <efema...@gmail.com> wrote: >> >

Re: Spark2 SBT Assembly

2016-08-10 Thread Efe Selcuk
assembly > jar. > > To build now use "build/sbt package" > > > > On Wed, 10 Aug 2016 at 19:40, Efe Selcuk <efema...@gmail.com> wrote: >> Hi Spark folks, >> >> With Spark 1.6 the 'assembly' target for sbt would build a fat jar with >> all of th

Spark2 SBT Assembly

2016-08-10 Thread Efe Selcuk
Hi Spark folks, With Spark 1.6 the 'assembly' target for sbt would build a fat jar with all of the main Spark dependencies for building an application. Against Spark 2, that target is no longer building a spark assembly, just ones for e.g. Flume and Kafka. I'm not well versed with maven and sbt,
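
For an application build (as opposed to building Spark itself), a rough sketch of what the fat-jar setup can look like with the sbt-assembly plugin, marking Spark as "provided" so the jar only carries the application's own dependencies; names and versions below are illustrative:

    // project/plugins.sbt
    addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")

    // build.sbt
    name := "my-spark-app"        // hypothetical project name
    scalaVersion := "2.11.8"

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "2.0.0" % "provided",
      "org.apache.spark" %% "spark-sql"  % "2.0.0" % "provided"
    )

    // "sbt assembly" then produces target/scala-2.11/my-spark-app-assembly-<version>.jar,
    // which can be handed to spark-submit on a cluster that already provides Spark.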