You can have a list of all the columns and pass it to a recursive function to
fit and apply the transformation.
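The per-column fit-then-transform idea can be sketched in plain Python (iteratively rather than recursively, and without Spark, so the names and helpers here are illustrative only). In Spark the same loop would typically build one StringIndexer + OneHotEncoder stage per column inside a Pipeline, which avoids collecting the data to the driver:

```python
# Illustrative sketch: "fit" collects the distinct categories per listed
# column; "transform" replaces each categorical column with 0/1 dummy
# columns, mimicking what Pandas get_dummies produces. Function names are
# hypothetical, not part of any library API.

def fit_dummies(rows, columns):
    """Collect the sorted distinct categories for each listed column."""
    return {col: sorted({row[col] for row in rows}) for col in columns}

def transform_dummies(rows, categories):
    """Replace each categorical column with one 0/1 column per category."""
    out = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k not in categories}
        for col, cats in categories.items():
            for cat in cats:
                new_row[f"{col}_{cat}"] = 1 if row[col] == cat else 0
        out.append(new_row)
    return out

rows = [{"color": "red", "size": "S"}, {"color": "blue", "size": "M"}]
cats = fit_dummies(rows, ["color", "size"])
dummies = transform_dummies(rows, cats)
# dummies[0] is {"color_blue": 0, "color_red": 1, "size_M": 0, "size_S": 1}
```

The key point is that the fit step (learning the category list) is separated from the transform step, so the same fitted categories can be reused on new partitions of the data, which is exactly how Spark's per-column encoder stages behave.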
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Finding-a-Spark-Equivalent-for-Pandas-get-dummies-tp28064p28079.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
I have a dataset where I need to convert some of the variables to dummy
variables. The get_dummies function in Pandas works perfectly on smaller
datasets, but since it requires collecting the data, I'll always be
bottlenecked by the master node. I've looked at Spark's OHE feature, and
while that will work in theory, I still need to turn the categorical
variables into dummy variables and then save the transformed data back to
CSV. That is why I'm so interested in get_dummies, but it's not scalable
enough for my data size (500-600GB per file).

Thanks in advance.

Nick