Name resolution is not that easy, I think. Wenchen can maybe give you some advice on resolution for this one.

On Sat, May 30, 2015 at 9:37 AM, Yijie Shen <henry.yijies...@gmail.com> wrote:

> I think just matching the Column’s expr as UnresolvedAttribute and using
> the UnresolvedAttribute’s name to match the schema’s field name is enough.
>
> There seems to be no need to treat expr as a more general expression. :)
>
> On May 30, 2015 at 11:14:05 PM, Girardot Olivier
> (o.girar...@lateral-thoughts.com) wrote:
>
> Jira done: https://issues.apache.org/jira/browse/SPARK-7969
> I've already started working on it, but it's less trivial than it seems
> because I don't exactly know the inner workings of the catalog,
> and how to get the qualified name of a column to match it against the
> schema/catalog.
>
> Regards,
>
> Olivier.
>
> On Sat, May 30, 2015 at 09:54, Reynold Xin <r...@databricks.com> wrote:
>
>> Yea, would be great to support a Column. Can you create a JIRA, and
>> possibly a pull request?
>>
>>
>> On Fri, May 29, 2015 at 2:45 AM, Olivier Girardot <
>> o.girar...@lateral-thoughts.com> wrote:
>>
>>> Actually, the Scala API too is only based on column name
>>>
>>> On Fri, May 29, 2015 at 11:23, Olivier Girardot <
>>> o.girar...@lateral-thoughts.com> wrote:
>>>
>>>> Hi,
>>>> Testing 1.4 a bit more, it seems that the .drop() method in PySpark
>>>> doesn't accept a Column as input datatype:
>>>>
>>>>
>>>> .join(only_the_best, only_the_best.pol_no == df.pol_no,
>>>> "inner").drop(only_the_best.pol_no)
>>>>   File "/usr/local/lib/python2.7/site-packages/pyspark/sql/dataframe.py", line 1225, in drop
>>>>     jdf = self._jdf.drop(colName)
>>>>   File "/usr/local/lib/python2.7/site-packages/py4j/java_gateway.py", line 523, in __call__
>>>>     (new_args, temp_args) = self._get_args(args)
>>>>   File "/usr/local/lib/python2.7/site-packages/py4j/java_gateway.py", line 510, in _get_args
>>>>     temp_arg = converter.convert(arg, self.gateway_client)
>>>>   File "/usr/local/lib/python2.7/site-packages/py4j/java_collections.py", line 490, in convert
>>>>     for key in object.keys():
>>>> TypeError: 'Column' object is not callable
>>>>
>>>> It doesn't seem very consistent with the rest of the APIs, and is
>>>> especially annoying when executing joins, because drop("my_key") is not a
>>>> qualified reference to the column.
>>>>
>>>> What do you think about changing that? Or what is the best practice as
>>>> a workaround?
>>>>
>>>> Regards,
>>>>
>>>> Olivier.
>>>>
>>>
>>
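
For what it's worth, the name-matching idea from Yijie's message can be sketched in plain Python. This is only a toy model under stated assumptions, not Spark's code: `Column`, `UnresolvedAttribute` and `drop` below are hypothetical stand-ins that borrow the names used in this thread, and the schema is just a list of field names. It also hints at why an unqualified name is awkward after a join: in this model both `pol_no` fields match.

```python
from dataclasses import dataclass
from typing import Any, List, Union


@dataclass(frozen=True)
class UnresolvedAttribute:
    """Hypothetical stand-in: a column reference known only by its name."""
    name: str


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in: wraps an arbitrary expression."""
    expr: Any


def drop(field_names: List[str], col: Union[str, Column]) -> List[str]:
    """Return the field names that remain after dropping `col`.

    A plain string is matched directly. A Column is matched only when its
    expr is an UnresolvedAttribute, using that attribute's name. Any other
    expression leaves the schema unchanged (an assumption of this sketch).
    """
    if isinstance(col, str):
        target = col
    elif isinstance(col, Column) and isinstance(col.expr, UnresolvedAttribute):
        target = col.expr.name
    else:
        return list(field_names)
    return [f for f in field_names if f != target]


# Field names after a join where both sides carry "pol_no".
joined = ["pol_no", "amount", "pol_no", "score"]

# Matching on the bare name cannot tell the two "pol_no" fields apart,
# so both are removed in this toy model.
remaining = drop(joined, Column(UnresolvedAttribute("pol_no")))
print(remaining)  # ['amount', 'score']
```

Telling the two `pol_no` fields apart would need a qualified reference, which is where name resolution comes in.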