Re: When to expect UTF8String?

2015-06-12 Thread Michael Armbrust
1. Custom aggregators that do map-side combine. This is something I'm hoping to add in Spark 1.5.
2. UDFs with more than 22 arguments, which are not supported by ScalaUdf, and to avoid wrapping a Java function interface in one of 22 different Scala function interfaces depending on the number of arguments.
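
For context on the first point: this aggregator hook did ship in Spark 1.5 as UserDefinedAggregateFunction. A minimal sketch, assuming the 1.5 API (the class names below are from that release, not from this thread); merge() is what enables the map-side combine:

import org.apache.spark.sql.Row
import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
import org.apache.spark.sql.types._

class LongSum extends UserDefinedAggregateFunction {
  def inputSchema: StructType = StructType(StructField("value", LongType) :: Nil)
  def bufferSchema: StructType = StructType(StructField("sum", LongType) :: Nil)
  def dataType: DataType = LongType
  def deterministic: Boolean = true

  def initialize(buffer: MutableAggregationBuffer): Unit = buffer(0) = 0L

  // Called per input row within each partition (the "map side").
  def update(buffer: MutableAggregationBuffer, input: Row): Unit =
    if (!input.isNullAt(0)) buffer(0) = buffer.getLong(0) + input.getLong(0)

  // Combines partial aggregates from different partitions -- the combine step.
  def merge(buffer1: MutableAggregationBuffer, buffer2: Row): Unit =
    buffer1(0) = buffer1.getLong(0) + buffer2.getLong(0)

  def evaluate(buffer: Row): Any = buffer.getLong(0)
}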

RE: When to expect UTF8String?

2015-06-12 Thread Zack Sampson
Are there methods we can use to convert to/from the internal representation in these cases?
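
For what it's worth, the internal string class does expose round-trip helpers. A minimal sketch, using the post-1.4 location of the class (in the 1.4 timeframe of this thread it lived under org.apache.spark.sql.types; it moved to org.apache.spark.unsafe.types in 1.5):

import org.apache.spark.unsafe.types.UTF8String

// Convert between java.lang.String and Spark's internal representation.
val internal: UTF8String = UTF8String.fromString("hello")  // external -> internal
val external: String = internal.toString                   // internal -> external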

Re: When to expect UTF8String?

2015-06-11 Thread Michael Armbrust
Through the DataFrame API, users should never see UTF8String. Expression (and any class in the catalyst package) is considered internal and so uses the internal representation of various types. Which type we use here is not stable across releases. Is there a reason you aren't defining a UDF?
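
To illustrate the recommended path: a UDF defined through the public API only ever sees plain java.lang.String; Spark converts to and from UTF8String at the boundary. A minimal sketch (the DataFrame df and its column name are hypothetical; note too that because Scala only defines Function1 through Function22, this path caps out at 22 arguments, which is the limit mentioned earlier in the thread):

import org.apache.spark.sql.functions.udf

// The function body works with ordinary Strings; the engine handles the
// internal representation on either side of the call.
val shout = udf((s: String) => if (s == null) null else s.toUpperCase)

// Hypothetical usage: df has a string column named "name".
val out = df.select(shout(df("name")).as("name_upper"))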