On Thu, Mar 15, 2018 at 8:00 PM, Alan Featherston Lago wrote:
> I'm a pretty new user of spark and I've run into this issue with the
> pyspark docs:
>
> The functions pyspark.sql.functions.to_date and
> pyspark.sql.functions.to_timestamp
> appear to behave in the same way, as in both functions
Page 33 of the Sparkling Water Booklet:
http://docs.h2o.ai/h2o/latest-stable/h2o-docs/booklets/SparklingWaterBooklet.pdf
# long form: pass the H2O frame id via an option
df = sqlContext.read.format("h2o").option("key", frame.frame_id).load()
# equivalent shorthand: pass the frame id directly to load()
df = sqlContext.read.format("h2o").load(frame.frame_id)
On Thu, Jan 12, 2017 at 1:17 PM, Md. Rezaul Kar
Amen
> On Nov 13, 2016, at 7:55 PM, janardhan shetty wrote:
>
> These JIRAs are still unresolved:
> https://issues.apache.org/jira/browse/SPARK-11215
>
> Also there is https://issues.apache.org/jira/browse/SPARK-8418
>
>> On Wed, Aug 17, 2016 at 11:15 AM, Nisha Muktewar wrote:
>>
>> The O
I did get *some* help from Databricks in terms of programmatically grabbing
the categorical variables, but I can't figure out where to go from here:
# Get all string cols/categorical cols
stringColList = [i[0] for i in df.dtypes if i[1] == 'string']
# generate OHEs for every col in stringColList
I have a dataset where I need to convert some of the variables to dummy
variables. The get_dummies function in pandas works perfectly on smaller
datasets, but since it requires collecting the data to the driver, I'll
always be bottlenecked by the master node.
I've looked at Spark's OHE feature and while that will work in theory I