Take a look at the implementation of typed sum/avg:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/expressions/scalalang/typed.scala
You can implement a typed max/min.
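A typed max can follow the same Aggregator contract (zero, reduce, merge, finish) that typed.scala uses for sum/avg. The sketch below only illustrates that contract in plain Scala, with no Spark dependency, so it runs standalone; the object name and the Long element type are my assumptions, and a real implementation would extend org.apache.spark.sql.expressions.Aggregator and expose a TypedColumn instead.

```scala
// Sketch of a typed max following Spark's Aggregator contract
// (zero, reduce, merge, finish), shown here without Spark deps.
// A real version would extend
// org.apache.spark.sql.expressions.Aggregator[IN, BUF, OUT].
object TypedMax {
  // Buffer starts at the smallest Long so any input replaces it.
  val zero: Long = Long.MinValue

  // Fold one input value into the buffer.
  def reduce(b: Long, a: Long): Long = math.max(b, a)

  // Combine two partial buffers (one per partition).
  def merge(b1: Long, b2: Long): Long = math.max(b1, b2)

  // Final result from the buffer.
  def finish(b: Long): Long = b

  // Run the contract over a partitioned input, as Spark would.
  def apply(partitions: Seq[Seq[Long]]): Long =
    finish(partitions.map(_.foldLeft(zero)(reduce)).foldLeft(zero)(merge))
}

println(TypedMax(Seq(Seq(2L, 5L), Seq(6L, 1L))))  // 6
```

A typed min is the same skeleton with math.min and a zero of Long.MaxValue.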
On Tue, Jun 7, 2016 at 4:31 PM, Alexander Pivovarov wrote:
>
Please go ahead.
On Tue, Jun 7, 2016 at 4:45 PM, franklyn wrote:
> Thanks for reproducing it Ted, should I make a JIRA issue?
Thanks for reproducing it Ted, should I make a JIRA issue?
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/Can-t-use-UDFs-with-Dataframes-in-spark-2-0-preview-scala-2-10-tp17845p17852.html
Sent from the Apache Spark Developers List mailing list
Ted, it does not work like that; you have to .map(toAB).toDS
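For context: toDS() on a Seq of tuples takes no column-name arguments, so named columns come from mapping to a case class first. A minimal sketch of that mapping, assuming a hypothetical case class AB; the toDS() call itself needs a Spark session, so it is only shown in a comment:

```scala
// Hypothetical case class giving the tuple's fields names.
case class AB(a: Int, b: Int)

// toDS() takes no arguments; named columns come from the case class,
// so the tuples are mapped first:
val toAB: ((Int, Int)) => AB = { case (a, b) => AB(a, b) }

val rows = Seq(1 -> 2, 1 -> 5, 3 -> 6).map(toAB)
// In a Spark shell: val ds = rows.toDS()  // Dataset[AB] with columns a, b

println(rows.head)  // AB(1,2)
```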
On Tue, Jun 7, 2016 at 4:07 PM, Ted Yu wrote:
> Have you tried the following ?
>
> Seq(1->2, 1->5, 3->6).toDS("a", "b")
>
> then you can refer to columns by name.
>
> FYI
>
>
> On Tue, Jun 7, 2016 at 3:58 PM,
I built with Scala 2.10
>>> df.select(add_one(df.a).alias('incremented')).collect()
The above just hung.
On Tue, Jun 7, 2016 at 3:31 PM, franklyn wrote:
> Thanks Ted!
>
> I'm using
>
Have you tried the following?
Seq(1->2, 1->5, 3->6).toDS("a", "b")
Then you can refer to columns by name.
FYI
On Tue, Jun 7, 2016 at 3:58 PM, Alexander Pivovarov wrote:
> I'm trying to switch from RDD API to Dataset API
> My question is about reduceByKey method
>
I'm trying to switch from the RDD API to the Dataset API.
My question is about the reduceByKey method.
E.g. in the following example I'm trying to rewrite
sc.parallelize(Seq(1->2, 1->5, 3->6)).reduceByKey(math.max).take(10)
using the DS API. This is what I have so far:
Seq(1->2, 1->5,
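For reference, this is what the reduceByKey(math.max) line computes, sketched with plain Scala collections so it runs standalone. The Dataset-side analogue noted in the comment (groupByKey followed by reduceGroups) is the usual Spark 2.0 pattern, not code taken from this thread:

```scala
// What the RDD line computes, shown with plain collections.
// With the Dataset API the analogue would be roughly:
//   ds.groupByKey(_._1).reduceGroups((a, b) => if (a._2 >= b._2) a else b)
val pairs = Seq(1 -> 2, 1 -> 5, 3 -> 6)

val maxPerKey: Map[Int, Int] =
  pairs
    .groupBy { case (k, _) => k }                      // group pairs by key
    .map { case (k, kvs) => k -> kvs.map(_._2).max }   // keep the max value

println(maxPerKey)
```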
Thanks Ted!
I'm using
https://github.com/apache/spark/commit/8f5a04b6299e3a47aca13cbb40e72344c0114860
and building with scala-2.10
I can confirm that it works with scala-2.11
With commit 200f01c8fb15680b5630fbd122d44f9b1d096e02 using Scala 2.11:
Using Python version 2.7.9 (default, Apr 29 2016 10:48:06)
SparkSession available as 'spark'.
>>> from pyspark.sql import SparkSession
>>> from pyspark.sql.types import IntegerType, StructField, StructType
>>> from
I've built spark-2.0-preview (8f5a04b) with scala-2.10 using the following:

./dev/change-version-to-2.10.sh
./dev/make-distribution.sh -DskipTests -Dzookeeper.version=3.4.5 -Dcurator.version=2.4.0 -Dscala-2.10 -Phadoop-2.6 -Pyarn -Phive

and then ran the following code in a pyspark
As far as I know the process is just to copy docs/_site from the build
to the appropriate location in the SVN repo (i.e.
site/docs/2.0.0-preview).
Thanks
Shivaram
On Tue, Jun 7, 2016 at 8:14 AM, Sean Owen wrote:
> As a stop-gap, I can edit that page to have a small section
Hi,
I'm searching for how and where spark allocates cores per executor in the
source code.
Is it possible to programmatically control the allocated cores in standalone
cluster mode?
Regards,
Matteo
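Not an answer from the thread, but for reference the standalone scheduler honours two documented settings that bound core allocation; the values below are illustrative only:

```
# spark-defaults.conf (standalone mode) -- illustrative values
spark.executor.cores   2     # cores per executor on each worker
spark.cores.max        8     # total cores the application may claim
```

Both can also be set programmatically via SparkConf.set(...) before the SparkContext is created.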
Thanks Sean, you were right, hard refresh made it show up.
Seems like we should at least link to the preview docs from
http://spark.apache.org/documentation.html.
Tom
On Tuesday, June 7, 2016 10:04 AM, Sean Owen wrote:
It's there (refresh maybe?). See the end of the
As a stop-gap, I can edit that page to have a small section about
preview releases and point to the nightly docs.
Not sure who has the power to push 2.0.0-preview to site/docs, but, if
that's done then we can symlink "preview" in that dir to it and be
done, and update this section about preview
It's there (refresh maybe?). See the end of the downloads dropdown.
For the moment you can see the docs in the nightly docs build:
https://home.apache.org/~pwendell/spark-nightly/spark-branch-2.0-docs/latest/
I don't know, what's the best way to put this into the main site?
under a /preview
Congrats!!
On Mon, Jun 6, 2016, 8:12 AM Gayathri Murali
wrote:
> Congratulations Yanbo Liang! Well deserved.
>
>
> On Sun, Jun 5, 2016 at 7:10 PM, Shixiong(Ryan) Zhu <
> shixi...@databricks.com> wrote:
>
>> Congrats, Yanbo!
>>
>> On Sun, Jun 5, 2016 at 6:25 PM,
Hi,
I don't know if it is a bug or a feature, but one thing in streaming error
handling seems confusing to me: I create a streaming context, start it, and
call #awaitTermination like this:
try {
    ssc.awaitTermination();
} catch (Exception e) {
    LoggerFactory.getLogger(getClass()).error("Job failed.