On Thu, Mar 15, 2018 at 8:00 PM, Alan Featherston Lago wrote:
> I'm a pretty new user of spark and I've run into this issue with the
> pyspark docs:
>
> The functions pyspark.sql.functions.to_date and
> pyspark.sql.functions.to_timestamp
> appear to behave in the same way, as in both functions
Page 33 of the Sparkling Water Booklet:
http://docs.h2o.ai/h2o/latest-stable/h2o-docs/booklets/SparklingWaterBooklet.pdf
# long form: pass the H2O frame id via an option
df = sqlContext.read.format("h2o").option("key", frame.frame_id).load()
# equivalent shorthand: pass the frame id directly to load()
df = sqlContext.read.format("h2o").load(frame.frame_id)
On Thu, Jan 12, 2017 at 1:17 PM, Md. Rezaul Kar
Amen
> On Nov 13, 2016, at 7:55 PM, janardhan shetty wrote:
>
> These JIRAs are still unresolved:
> https://issues.apache.org/jira/browse/SPARK-11215
>
> Also there is https://issues.apache.org/jira/browse/SPARK-8418
>
>> On Wed, Aug 17, 2016 at 11:15 AM, Nisha Muktewar wrote:
>>
>> The O
I did get *some* help from Databricks in terms of programmatically grabbing
the categorical variables, but I can't figure out where to go from here:
# Get all string cols/categorical cols
stringColList = [i[0] for i in df.dtypes if i[1] == 'string']
# generate OHEs for every col in stringColList
I have a dataset where I need to convert some of the variables to dummy
variables. The get_dummies function in pandas works perfectly on smaller
datasets, but since it requires collecting the data to the driver, I'll
always be bottlenecked by the master node.
I've looked at Spark's OHE feature and while that will work in theory I