Yeah, it would be great to support a Column. Can you create a JIRA, and possibly a pull request?
On Fri, May 29, 2015 at 2:45 AM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:

> Actually, the Scala API too is only based on the column name.
>
> On Fri, May 29, 2015 at 11:23 AM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
>
>> Hi,
>> Testing 1.4 a bit more, it seems that the .drop() method in PySpark
>> doesn't accept a Column as input:
>>
>>     .join(only_the_best, only_the_best.pol_no == df.pol_no, "inner").drop(only_the_best.pol_no)
>>   File "/usr/local/lib/python2.7/site-packages/pyspark/sql/dataframe.py", line 1225, in drop
>>     jdf = self._jdf.drop(colName)
>>   File "/usr/local/lib/python2.7/site-packages/py4j/java_gateway.py", line 523, in __call__
>>     (new_args, temp_args) = self._get_args(args)
>>   File "/usr/local/lib/python2.7/site-packages/py4j/java_gateway.py", line 510, in _get_args
>>     temp_arg = converter.convert(arg, self.gateway_client)
>>   File "/usr/local/lib/python2.7/site-packages/py4j/java_collections.py", line 490, in convert
>>     for key in object.keys():
>> TypeError: 'Column' object is not callable
>>
>> This doesn't seem very consistent with the rest of the API, and it is
>> especially annoying when executing joins, because drop("my_key") is not
>> a qualified reference to the column.
>>
>> What do you think about changing that? Or what is the best practice as a
>> workaround?
>>
>> Regards,
>>
>> Olivier.
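Until drop() accepts a Column, two workarounds avoid the problem entirely. Below is a minimal sketch, assuming Spark 1.4's PySpark API; the data and the "payload"/"score" columns are illustrative stand-ins for the df and only_the_best frames from the traceback above:

    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext("local", "drop-workaround")
    sqlContext = SQLContext(sc)

    # Toy stand-ins for the df and only_the_best frames in the traceback.
    df = sqlContext.createDataFrame([(1, "a"), (2, "b")], ["pol_no", "payload"])
    only_the_best = sqlContext.createDataFrame([(1, 0.9)], ["pol_no", "score"])

    # Workaround 1: pass the join column *name* instead of an expression.
    # This USING-style equi-join keeps a single pol_no column in the result,
    # so there is no ambiguous duplicate left to drop.
    joined = df.join(only_the_best, "pol_no")

    # Workaround 2: keep the expression join, then select the qualified
    # columns to keep rather than dropping the one you don't want.
    joined2 = (df.join(only_the_best, only_the_best.pol_no == df.pol_no, "inner")
                 .select(df.pol_no, df.payload, only_the_best.score))

Either sidesteps drop() until it can take a qualified Column reference.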