(test_data.name, 'a', 'X')
>
> You would need a UDF if you wanted to do something on the string
> value of a single row (e.g. return data + "bla")
>
> Assaf.
>
> *From:* Perttu Ranta-aho [mailto:ranta...@iki.fi]
> *Sent:* Thursday, November 17, 2
Hi,
Shouldn't this work?
from pyspark.sql.functions import regexp_replace, udf
def my_f(data):
    return regexp_replace(data, 'a', 'X')
my_udf = udf(my_f)
test_data = sqlContext.createDataFrame([('a',), ('b',), ('c',)], ('name',))
test_data.select(my_udf(test_data.name)).show()
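As the replies in this thread point out, the body of a UDF receives a plain Python string, not a Column, so pyspark.sql.functions.regexp_replace (a Column-level function) cannot be called inside it. A minimal sketch of the per-row logic that would work, using Python's re module; the pyspark wiring is omitted so the snippet runs standalone:

```python
import re

# Inside a UDF, `data` arrives as a plain Python string, so use
# re.sub (a Python-level function) instead of the Column-level
# regexp_replace from pyspark.sql.functions.
def my_f(data):
    return re.sub('a', 'X', data)

# The same per-row transformation that udf(my_f) would apply:
print([my_f(s) for s in ['a', 'b', 'c']])  # ['X', 'b', 'c']
```

Wrapping this my_f with udf() and selecting it would then behave as the original message expected.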
But instead
So it was something obvious, thanks!
-Perttu
On Thu, 10 November 2016 at 21:19, Davies Liu <dav...@databricks.com>
wrote:
> On Thu, Nov 10, 2016 at 11:14 AM, Perttu Ranta-aho <ranta...@iki.fi>
> wrote:
> > Hello,
> >
> > I want to create a UDF which
Hello,
I want to create a UDF which modifies one column value depending on the value
of some other column. But the Python version of the code always fails in the
column value comparison. Below are simple examples; the Scala version works as
expected, but the Python version throws an exception. Am I missing something
has to be found in whatever is causing the DAGScheduler to need to shut
down in the first place.
On Sun, May 25, 2014 at 12:10 PM, Perttu Ranta-aho
perttu.ranta...@gmail.com wrote:
Hi,
We have a small Mesos (0.18.1) cluster with 4 nodes. Upgraded to Spark
1.0.0-rc9, to overcome some PySpark bugs. But now we are experiencing
random crashes with almost every job. Local jobs run fine, but the same code
with the same data set on the Mesos cluster leads to errors like:
14/05/22 15:03:34