Basic error: transformations like map give you back an RDD.
sc.textFile("filename").map(x => x.split(","))
On 5 Sep 2016 6:19 pm, "Ashok Kumar" wrote:
> Hi,
>
> I have a text file as below that I read in
>
> 74,20160905-133143,98.11218069128827594148
>
If the file is splittable, say TSV, CSV etc., it will be distributed
across all executors.
On Sat, Sep 3, 2016 at 3:38 PM, Somasundaram Sekar
<somasundar.sekar@tigeranalytics.com> wrote:
> Hi All,
>
>
>
> Would like to gain some understanding on the questions listed below,
>
>
If you split each line, you will get an RDD of arrays.
> What is your expected outcome of 2nd map?
>
> On Mon, Sep 5, 2016 at 11:30 PM, Ashok Kumar <ashok34...@yahoo.com.invalid>
> wrote:
>
> Thank you sir.
>
> This is what I get
>
> scala> textFile.map(x=> x.split(",")).map(x => (x.getString(0))
>      | )
> <console>:27: error: value getString is not a member of Array[String]
>        textFile.map(x=> x.split(",")).map(x => (x.getString(0))
>
> regards
>
>
>
>
> On Monday, 5 September 2016, 13:51, Somasundaram Sekar
> <somasundar.sekar@tigeranalytics.com> wrote:
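A minimal sketch of the fix, assuming the goal is to pull out the first
field of each line: split(",") produces an RDD[Array[String]], so index
into the array; getString belongs to Row, not to Array:

  val textFile = sc.textFile("filename")
  // Each element is now an Array[String]; take the first field by index.
  val firstFields = textFile.map(_.split(",")).map(arr => arr(0))
  firstFields.take(5).foreach(println)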
Can you try this:
https://www.linkedin.com/pulse/hive-functions-udfudaf-udtf-examples-gaurav-singh
On 4 Sep 2016 9:38 pm, "janardhan shetty" wrote:
> Hi,
>
> Is there any way that we can send multiple columns to a UDF and
> generate a new column for Spark ML?
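One possible approach, sketched here with made-up column names (f1, f2)
rather than anything from the thread: a Scala UDF can simply take one
argument per input column:

  import org.apache.spark.sql.functions.{col, udf}

  // Toy data; in spark-shell the implicits needed for toDF are preloaded.
  val df = Seq((1.0, 2.0), (3.0, 4.0)).toDF("f1", "f2")

  // The UDF receives one argument per column passed to it.
  val interaction = udf((a: Double, b: Double) => a * b)
  df.withColumn("interaction", interaction(col("f1"), col("f2"))).show()

For a large number of columns, passing struct(cols: _*) and accepting a Row
inside the UDF is another option.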
Please suggest some good resources to learn Spark administration.
Hi All,
Would like to gain some understanding on the questions listed below,
1. When processing a large file with Apache Spark, with, say,
sc.textFile("somefile.xml"), does it split the file for parallel processing
across executors, or will it be processed as a single chunk in a single
executor?
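As a quick empirical check in spark-shell (assuming the file sits on HDFS;
plain text is split by block, while non-splittable formats like gzip are
read as one partition):

  val rdd = sc.textFile("somefile.xml")
  // Number of partitions the input was split into for parallel processing.
  println(rdd.partitions.length)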
Hi,
I want to concat multiple columns into a single column after grouping the
DataFrame,
I want a functional equivalent of the Redshift LISTAGG function:
pg_catalog.listagg(column, '|') WITHIN GROUP (ORDER BY column) AS name
LISTAGG Function: for each group in a query, the
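A rough Spark equivalent, as a sketch (df, name, and column are
placeholders; note that collect_list gives no ordering guarantee, unlike
WITHIN GROUP (ORDER BY ...)):

  import org.apache.spark.sql.functions.{col, collect_list, concat_ws}

  // Per group, gather the values and join them with "|".
  val agged = df.groupBy("name")
    .agg(concat_ws("|", collect_list(col("column"))).as("listagg"))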
Hi,
I have a GroupedData object on which I perform aggregation of a few columns;
since GroupedData.agg takes a Map keyed by column name, I cannot perform
multiple aggregates on the same column, say both max and min of amount.
Such a call returns only one aggregate per column.
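The Column-based agg overload avoids the duplicate-key problem of the Map
variant; a sketch, with df and key as placeholders:

  import org.apache.spark.sql.functions.{max, min}

  // A Map collapses duplicate column keys; Column arguments do not.
  val result = df.groupBy("key")
    .agg(max("amount").as("max_amount"), min("amount").as("min_amount"))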
Learning Spark, the O'Reilly publication, as a starter, plus the official docs.
On 4 Dec 2017 9:19 am, "Manuel Sopena Ballesteros"
wrote:
> Dear Spark community,
>
>
>
> Is there any resource (books, online course, etc.) available that you know
> of to learn about Spark? I am
Hi,
Is it possible to write a DataFrame backed by a Kafka streaming source into
AWS Redshift? We have in the past used
https://github.com/databricks/spark-redshift to write into Redshift, but I
presume it will not work with writeStream. Also, writing through the JDBC
connector with a ForeachWriter may
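For the ForeachWriter route, a minimal sketch (the JDBC URL, table, and
schema are placeholders; row-by-row INSERTs into Redshift are slow, so the
common production pattern is staging micro-batches to S3 and issuing COPY):

  import java.sql.{Connection, DriverManager, PreparedStatement}
  import org.apache.spark.sql.{ForeachWriter, Row}

  val writer = new ForeachWriter[Row] {
    var conn: Connection = _
    var stmt: PreparedStatement = _

    override def open(partitionId: Long, version: Long): Boolean = {
      // Placeholder connection details; requires the Redshift JDBC driver.
      conn = DriverManager.getConnection(
        "jdbc:redshift://host:5439/db", "user", "password")
      stmt = conn.prepareStatement(
        "INSERT INTO events (id, payload) VALUES (?, ?)")
      true
    }

    override def process(row: Row): Unit = {
      stmt.setLong(1, row.getLong(0))
      stmt.setString(2, row.getString(1))
      stmt.executeUpdate()
    }

    override def close(errorOrNull: Throwable): Unit = {
      if (stmt != null) stmt.close()
      if (conn != null) conn.close()
    }
  }

  // streamingDf stands for the Kafka-backed streaming DataFrame.
  streamingDf.writeStream.foreach(writer).start()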