The changes look good to me. Jenkins is somehow not responding. Will merge once Jenkins comes back happy.
On Fri, Apr 24, 2015 at 2:38 AM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:

> done: https://github.com/apache/spark/pull/5683 and
> https://issues.apache.org/jira/browse/SPARK-7118
> thx
>
> On Fri, Apr 24, 2015 at 7:34 AM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
>
>> I'll try, thanks
>>
>> On Fri, Apr 24, 2015 at 12:09 AM, Reynold Xin <r...@databricks.com> wrote:
>>
>>> You can do it similar to the way countDistinct is done, can't you?
>>>
>>> https://github.com/apache/spark/blob/master/python/pyspark/sql/functions.py#L78
>>>
>>> On Thu, Apr 23, 2015 at 1:59 PM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
>>>
>>>> I found another way: setting SPARK_HOME to a released version and
>>>> launching an IPython shell to load the contexts.
>>>> I may need your insight, however. I found why it hasn't been done at the
>>>> same time: this method (like some others) uses varargs in Scala, and for
>>>> now the way functions are called supports only one parameter.
>>>>
>>>> So at first I tried to just generalise the helper function "_" in the
>>>> functions.py file to multiple arguments, but py4j's handling of varargs
>>>> forces me to create an Array[Column] if the target method is expecting
>>>> varargs.
>>>>
>>>> But from Python's perspective, we have no idea whether the target
>>>> method will be expecting varargs or just multiple arguments (to un-tuple).
>>>> I can create a special case for "coalesce", or for "methods that take a
>>>> list of columns as arguments", considering they will be varargs-based
>>>> (and therefore need an Array[Column] instead of just a list of arguments).
>>>>
>>>> But this seems very specific and very prone to future mistakes.
>>>> Is there any way in Py4j to know the signature of a method before
>>>> calling it?
>>>>
>>>> On Thu, Apr 23, 2015 at 10:17 PM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
>>>>
>>>>> What is the way of testing/building the PySpark part of Spark?
>>>>>
>>>>> On Thu, Apr 23, 2015 at 10:06 PM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
>>>>>
>>>>>> yep :) I'll open the JIRA when I've got the time.
>>>>>> Thanks
>>>>>>
>>>>>> On Thu, Apr 23, 2015 at 7:31 PM, Reynold Xin <r...@databricks.com> wrote:
>>>>>>
>>>>>>> Ah damn. We need to add it to the Python list. Would you like to
>>>>>>> give it a shot?
>>>>>>>
>>>>>>> On Thu, Apr 23, 2015 at 4:31 AM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
>>>>>>>
>>>>>>>> Yep, no problem, but I can't seem to find the coalesce function in
>>>>>>>> pyspark.sql.{*, functions, types or whatever :) }
>>>>>>>>
>>>>>>>> Olivier.
>>>>>>>>
>>>>>>>> On Mon, Apr 20, 2015 at 11:48 AM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
>>>>>>>>
>>>>>>>>> a UDF might be a good idea, no?
>>>>>>>>>
>>>>>>>>> On Mon, Apr 20, 2015 at 11:17 AM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi everyone,
>>>>>>>>>> let's assume I'm stuck on 1.3.0: how can I benefit from the *fillna* API
>>>>>>>>>> in PySpark? Is there any efficient alternative to mapping the records
>>>>>>>>>> myself?
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>
>>>>>>>>>> Olivier.
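
To make the varargs point in the thread concrete, here is a minimal sketch of the kind of wrapper being discussed. This is not the code from the pull request above: the import path of _to_java_column (it moved between Spark releases), the use of sc._gateway.new_array, and the assumption that the Scala side exposes a Java-friendly varargs overload (via @scala.annotation.varargs) are all assumptions of the sketch, not facts from the thread.

    from pyspark import SparkContext
    from pyspark.sql.column import Column, _to_java_column  # module location varies by Spark version


    def coalesce(*cols):
        """Return the first non-null column among cols (sketch only)."""
        sc = SparkContext._active_spark_context
        # Py4J will not expand a Python list into Scala varargs, so build a
        # Java Column[] explicitly; Py4J can then match it to the varargs
        # (Column*) overload, as described in the thread.
        jcols = sc._gateway.new_array(sc._jvm.org.apache.spark.sql.Column, len(cols))
        for i, col in enumerate(cols):
            jcols[i] = _to_java_column(col)
        return Column(sc._jvm.org.apache.spark.sql.functions.coalesce(jcols))

With a wrapper like this (or the coalesce that eventually shipped in functions.py), the original fillna question can be approximated on 1.3 column by column, e.g. df.select(coalesce(df["x"], lit(0)).alias("x")), rather than mapping the records by hand.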