> Subhash
>
>
> > On Mar 7, 2017, at 6:37 AM, El-Hassan Wanas wrote:
> >
> > As an example, this is basically what I'm doing:
> >
> > val myDF =
> > originalDataFrame.select(col(columnName).when(col(columnName) …
I want to modify the SQL statement to extract the data in the right format
and push some preprocessing to the database.
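One way to push that preprocessing into the database is to hand Spark's JDBC reader a subquery instead of a bare table name, via the "dbtable" option. A minimal sketch, assuming a SparkSession `spark` is in scope and that `jdbcUrl`, `big_table`, and the column names are placeholders, not anything from the original post:

```scala
// Pushdown subquery: the CASE expression runs inside the database,
// so Spark only ever sees the already-encoded column.
// Note the parentheses and alias -- Spark wraps this as its FROM clause.
val pushdownQuery =
  """(SELECT id,
    |        CASE WHEN status = 'ACTIVE' THEN 1 ELSE 0 END AS status_code
    |   FROM big_table) AS t""".stripMargin

val myDF = spark.read
  .format("jdbc")
  .option("url", jdbcUrl)            // hypothetical JDBC connection string
  .option("dbtable", pushdownQuery)  // subquery executes in the database
  .load()
```

This keeps the reduced encoding out of Spark entirely, at the cost of expressing the transform in the source database's SQL dialect rather than in DataFrame operations.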
On 7 Mar 2017, at 12:04, El-Hassan Wanas wrote:
Hello,
There is, as usual, a big table lying on some JDBC data source. I am
doing some data processing on that data from Spark, however, in order to
speed up my analysis, I use reduced encodings and minimize the general
size of the data before processing.
Spark has been doing a great job at
Interestingly, if I don't cache the data it works. However, as I need to
re-use the data to apply different kinds of filtering, this really slows
down the job, since it has to read from S3 again and again.
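One pattern worth trying for the re-use case is an explicit persist with a storage level that spills to local disk instead of dropping partitions. A sketch, assuming a SparkSession `spark`; the path and column names are made up for illustration:

```scala
import org.apache.spark.sql.functions.col
import org.apache.spark.storage.StorageLevel

val data = spark.read.parquet("s3a://my-bucket/events")  // hypothetical path

// MEMORY_AND_DISK keeps what fits in memory and spills the rest to
// local disk, so partitions are never re-fetched from S3 on reuse.
data.persist(StorageLevel.MEMORY_AND_DISK)

val active  = data.filter(col("status") === "active").count()
val expired = data.filter(col("status") === "expired").count()  // hits the cache

data.unpersist()
```

Compared with the default `cache()` (MEMORY_ONLY for RDDs), spilling to disk trades some local I/O for avoiding repeated S3 round trips when the data does not fit in executor memory.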
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/SPARK-S3-
Hi
I'm trying to read LZO-compressed files from S3 using Spark. The LZO files
are not indexed. The Spark job starts to read the files just fine, but after
a while it just hangs, with no network throughput. I have to restart the
worker process to get it back up. Any idea what could be causing this? We
were u
I'm trying to read input files from S3. The files are compressed using LZO,
i.e. from spark-shell:
sc.textFile("s3n://path/xx.lzo").first returns 'String = �LZO?'
Spark does not uncompress the data from the file. I am using Cloudera
Manager 5 with CDH 5.0.2. I've already installed the 'GPLEXTRAS' parcel
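`sc.textFile` on its own won't decode LZO; it needs the hadoop-lzo codec to be picked up. One approach, sketched here on the assumption that the hadoop-lzo jar and native libraries from the GPLEXTRAS parcel are on the executor classpath, is to go through its input format explicitly:

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import com.hadoop.mapreduce.LzoTextInputFormat  // from hadoop-lzo

// Reads and decompresses .lzo files; keys are byte offsets, values are lines.
val lines = sc.newAPIHadoopFile(
    "s3n://path/xx.lzo",
    classOf[LzoTextInputFormat],
    classOf[LongWritable],
    classOf[Text])
  .map { case (_, text) => text.toString }

lines.first()  // should now be readable text rather than raw LZO bytes
```

If `textFile` is preferred, the alternative is registering the codec in the Hadoop configuration (`io.compression.codecs` including `com.hadoop.compression.lzo.LzopCodec`) so Spark selects it by file extension.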
just use -Dspark.executor.memory=
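How that property is passed depends on the Spark version; a hedged example of the two usual forms (the 4g value is purely illustrative):

```
# Older releases: pass the system property via the environment
SPARK_JAVA_OPTS="-Dspark.executor.memory=4g" ./bin/spark-shell

# Spark 1.0+: a first-class flag
./bin/spark-shell --executor-memory 4g
```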
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Setting-executor-memory-when-using-spark-shell-tp7082p7103.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
We just started using spark where I work. I'd be really interested to hear
more about this role.
--Hassan
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/job-offering-tp3858p7010.html