ion, it runs faster, but when
>>>> executed in the Spark environment the processing time is more than expected.
>>>> We have one column whose value is large (BinaryType, ~600 KB), and we are
>>>> wondering whether this could make the Arrow computation slower.
>>>>
>>>> Is there any profiling tool, or a recommended way to debug the cost
>>>> incurred by a pandas UDF?
>>>>
>>>>
>>>> Thanks,
>>>> Subash
>>>>
>>>> --
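Not a reply from the thread, but one way to see where time goes inside a UDF body is to wrap it with the standard-library profiler. A minimal, Spark-free sketch: `process` is a hypothetical stand-in for the real pandas UDF body, and the 600 KB binary values are simulated.

```python
import cProfile
import io
import pstats

def process(batch):
    # Hypothetical UDF body: compute the length of each binary value.
    return [len(b) for b in batch]

def profiled(batch):
    """Run `process` under cProfile and return (result, stats report text)."""
    prof = cProfile.Profile()
    result = prof.runcall(process, batch)
    out = io.StringIO()
    pstats.Stats(prof, stream=out).sort_stats("cumulative").print_stats(10)
    return result, out.getvalue()

batch = [b"\x00" * 600_000 for _ in range(4)]  # four simulated 600 KB values
result, report = profiled(batch)
print(report)
```

Inside a real pandas UDF the same wrapping would run on the executors, so the report has to be written somewhere retrievable (e.g. executor logs) rather than printed on the driver.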
>
> Thanks,
> Russell Jurney @rjurney <http://twitter.com/rjurney>
> russell.jur...@gmail.com LI <http://linkedin.com/in/russelljurney> FB
> <http://facebook.com/jurney> datasyndrome.com
>
--
Takuya UESHIN
ache/spark/pull/20280
> >>> [2] https://www.python.org/dev/peps/pep-0468/
> >>> [3] https://issues.apache.org/jira/browse/SPARK-29748
>
>
>
> --
> Shane Knapp
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>
--
Takuya UESHIN
Tokyo, Japan
http://twitter.com/ueshin
expr, Literal(value))
> }
> }
>
>
> It does pattern matching to detect whether value is of type Column. If it
> is, it uses the column's .expr; otherwise it works as it used to.
>
> Any suggestion or opinion on the proposition?
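For illustration only, the dispatch being proposed can be sketched in Python with a hypothetical `Column` stand-in (not Spark's real class): match on the argument's type, unwrap `.expr` for columns, and wrap anything else as a literal.

```python
class Column:
    """Hypothetical stand-in for Spark's Column wrapper."""
    def __init__(self, expr):
        self.expr = expr

def lit_or_expr(value):
    # A Column contributes its underlying expression; any other value
    # is wrapped as a literal, preserving the existing behaviour.
    if isinstance(value, Column):
        return value.expr
    return ("literal", value)

print(lit_or_expr(Column("colB")))  # uses the column's expression
print(lit_or_expr(3))               # falls back to a literal
```

The existing call sites keep working unchanged, since non-Column arguments take the same path as before.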
>
>
> Kind regards,
> Chongguang LIU
>
>
.nabble.com/SparkSQL-Language-Integrated-query-OR-clause-and-IN-clause-tp9298.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
correct me if I'm wrong).
I would love to be able to do something like the following:
val casRdd = sparkCtx.cassandraTable(ks, cf)
// registerAsTable etc
val res = sql("SELECT id, xmlGetTag(xmlfield, 'sometag') FROM cf")
--
Best regards,
Martin Gammelsæter
Martin Gammelsæter <martingammelsae...@gmail.com> wrote:
Takuya, thanks for your reply :)
I am already doing that, and it is working well. My question is, can I
define arbitrary functions to be used in these queries?
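As a Spark-free sketch of what such a function could look like, here the hypothetical `xmlGetTag` is an ordinary Python function using standard-library XML parsing; in PySpark a plain function like this can be registered for use in SQL queries (e.g. with `spark.udf.register` in recent versions).

```python
import xml.etree.ElementTree as ET

def xml_get_tag(xml_text, tag):
    """Hypothetical xmlGetTag: return the text of the first matching tag,
    or None when the tag is absent."""
    root = ET.fromstring(xml_text)
    node = root.find(f".//{tag}")
    return node.text if node is not None else None

print(xml_get_tag("<row><sometag>hello</sometag></row>", "sometag"))
```

Once registered as a UDF, the function can be called by name inside the SQL string, exactly as in the `xmlGetTag(xmlfield, 'sometag')` query above.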
On Fri, Jul 4, 2014 at 11:12 AM, Takuya UESHIN ues...@happy-camper.st
wrote:
Hi,
An error is thrown when running:
val queryResult = sql("select * from Table")
queryResult.groupBy('colA)('colA,Sum('colB) as
'totB).aggregate(Sum('totB)).collect().foreach(println)
Thanks
subacini
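Aside from the error itself, the intended two-step aggregation (per-group sums of colB, then a sum of those totals) can be sketched in plain Python with hypothetical rows:

```python
from collections import defaultdict

rows = [("a", 1), ("a", 2), ("b", 3)]  # hypothetical (colA, colB) rows

# Step 1: group by colA and sum colB per group ('totB).
tot_b = defaultdict(int)
for col_a, col_b in rows:
    tot_b[col_a] += col_b

# Step 2: aggregate the per-group totals (Sum('totB)).
grand_total = sum(tot_b.values())
print(dict(tot_b), grand_total)
```

In Spark SQL the same result is usually expressed with a subquery, e.g. `SELECT SUM(totB) FROM (SELECT colA, SUM(colB) AS totB FROM Table GROUP BY colA) t`.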