Hi Bob,

I tested your scenario with Spark 1.3 and I assumed you did not miss the
second parameter of pow(x,y).

from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)
df = sqlContext.jsonFile("/vagrant/people.json")
# Displays the content of the DataFrame to stdout
df.show()

# These are all fine
df.select("name", (df.age)*(df.age)).show()
name    (age * age)
Michael null       
Andy    900        
Justin  361  
df.select("name", (df.age)+1).show()
name    (age + 1)
Michael null     
Andy    31       
Justin  20
However, the following tests give the same error.

df.select("name", pow(df.age,2)).show()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-27-ce7299d3ef76> in <module>()
----> 1 df.select("name", pow(df.age,2)).show()

TypeError: unsupported operand type(s) for ** or pow(): 'Column' and 'int'

df.select("name", (df.age)**2).show()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-24-29540c3536bf> in <module>()
----> 1 df.select("name", (df.age)**2).show()

TypeError: unsupported operand type(s) for ** or pow(): 'Column' and 'int'
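
My reading of the traceback (an assumption on my part, I have not checked the 1.3 source) is that Python's built-in pow(x, y) and x ** y both look for x.__pow__(y), and the Column class in this version supports * and + but not **. A minimal sketch with a made-up ToyColumn class, which is only a stand-in and not the real pyspark Column, reproduces the same behaviour without Spark:

```python
class ToyColumn(object):
    """Hypothetical stand-in for a column expression (not pyspark's Column)."""

    def __init__(self, expr):
        self.expr = expr

    # Multiplication and addition are defined, so these build new expressions.
    def __mul__(self, other):
        return ToyColumn("(%s * %s)" % (self.expr, _expr(other)))

    def __add__(self, other):
        return ToyColumn("(%s + %s)" % (self.expr, _expr(other)))

    # Note: no __pow__ here, so both ** and pow() have nothing to call.


def _expr(value):
    """Render either a ToyColumn or a plain Python value as text."""
    return value.expr if isinstance(value, ToyColumn) else str(value)


age = ToyColumn("age")
print((age * age).expr)  # (age * age)
print((age + 1).expr)    # (age + 1)

try:
    pow(age, 2)
except TypeError as exc:
    pow_error = str(exc)
    print(pow_error)
```

The last call fails with the same kind of TypeError as above, only naming 'ToyColumn' instead of 'Column'.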
Moreover, when tested individually on plain numbers, the functions work fine.

pow(2,4)
16
2**4
16
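
Since multiplication on the column works, one possible workaround for small positive integer exponents is to build the power from repeated multiplication. This is only a sketch: int_power is a hypothetical helper name of mine, not a Spark function, and it is shown here on plain numbers so it runs without Spark.

```python
from functools import reduce
import operator


def int_power(value, exponent):
    """Raise `value` to a positive integer power using only the * operator.

    Hypothetical helper: works for any object that supports `*`.
    """
    if not isinstance(exponent, int) or exponent < 1:
        raise ValueError("exponent must be a positive integer")
    # [value] * exponent is a list of `exponent` copies; reduce multiplies them.
    return reduce(operator.mul, [value] * exponent)


print(int_power(30, 2))  # 900
print(int_power(19, 2))  # 361
```

For an exponent of 2 this evaluates value * value, so int_power(df.age, 2) should build the same expression as (df.age)*(df.age) above, although I have not run that exact call. If upgrading is an option, pyspark.sql.functions has a pow function from Spark 1.4 onwards, as far as I know.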

Kind Regards
Salih Oztop
      From: Bob Corsaro <[email protected]>
 To: user <[email protected]> 
 Sent: Monday, June 29, 2015 7:27 PM
 Subject: SparkSQL built in functions
   
I'm having trouble using "select pow(col) from table". It seems the function is
not registered for SparkSQL. Is this on purpose or an oversight? I'm using
pyspark.

  
