...need to allow for a learning curve and some troubleshooting.
On Fri, Dec 9, 2016 at 4:31 PM, Cassa L <lcas...@gmail.com> wrote:
> Hi,
> So far, I ran spark jobs directly using spark-submit options. I have a
> use case to use Spark Job server to run the job. I wanted to find out PROS
Hi,
So far, I have run Spark jobs directly with spark-submit. I now have a use
case for running jobs through Spark Job Server, and I would like to know the
pros and cons of using it. If anyone can share their experience, that would be great.
My jobs usually connect to multiple data sources like Kafka, Custom
Yes, only for the engine, but maybe newer versions have more optimizations
from the Tungsten project? At least since Spark 1.6?
> -- Forwarded message --
> From: Mich Talebzadeh <mich.talebza...@gmail.com>
> Date: 27 May 2016 at 17:09
> Subject: Re: Pros and Cons
From: Mich Talebzadeh <mich.talebza...@gmail.com>
Date: 27 May 2016 at 17:09
Subject: Re: Pros and Cons
To: Teng Qiu <teng...@gmail.com>
Cc: Ted Yu <yuzhih...@gmail.com>, Koert Kuipers <ko...@tresata.com>, Jörn
Franke <jornfra...@gmail.com>, user <user@spark.apache.org>, Aa
tried spark 2.0.0 preview, but no assembly jar there... then just gave up... :p
2016-05-27 17:39 GMT+02:00 Ted Yu :
> Teng:
> Why not try out the 2.0 SNAPSHOT build?
>
> Thanks
>
>> On May 27, 2016, at 7:44 AM, Teng Qiu wrote:
>>
>> ah, yes, the version
Hi Ted,
Do you mean Hive 2 with the Spark 2 snapshot build as the execution engine, using just
the binaries for the snapshot (all OK)?
Dr Mich Talebzadeh
LinkedIn
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
Teng:
Why not try out the 2.0 SNAPSHOT build?
Thanks
> On May 27, 2016, at 7:44 AM, Teng Qiu wrote:
>
> ah, yes, the version is another mess!... no vendor's product
>
> i tried hadoop 2.6.2, hive 1.2.1 with spark 1.6.1, doesn't work.
>
> hadoop 2.6.2, hive 2.0.1 with
Ah, yes, the versions are another mess!... No vendor's product.
I tried Hadoop 2.6.2, Hive 1.2.1 with Spark 1.6.1: doesn't work.
Hadoop 2.6.2, Hive 2.0.1 with Spark 1.6.1: works, but you need to fix this
on the Hive side: https://issues.apache.org/jira/browse/HIVE-13301
the jackson-databind lib from
Hi Teng,
What version of Spark are you using as the execution engine? Are you using a
vendor's product here?
thanks
Dr Mich Talebzadeh
LinkedIn
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
I agree with Koert and Reynold: Spark works well with large datasets now.
Back to the original discussion, comparing SparkSQL vs Hive in Spark vs Spark API.
For SparkSQL vs Spark API, you can simply imagine you are in the RDBMS world:
SparkSQL is pure SQL, and Spark API is the language for writing stored
We do disk-to-disk iterative algorithms in spark all the time, on datasets
that do not fit in memory, and it works well for us. I usually have to do
some tuning of number of partitions for a new dataset but that's about it
in terms of inconveniences.
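
A minimal sketch of the kind of partition tuning mentioned above, assuming you aim for partitions of roughly equal size. The helper name `suggest_num_partitions` and the 128 MiB default target are illustrative assumptions, not something stated in this thread or defined by Spark; the right target depends on your cluster and workload.

```python
def suggest_num_partitions(dataset_bytes, target_partition_bytes=128 * 1024 * 1024):
    """Return a starting partition count for a dataset of the given size.

    Hypothetical helper: divides the dataset size by an assumed target
    partition size (128 MiB by default) and rounds up, with a floor of 1.
    """
    if dataset_bytes < 0:
        raise ValueError("dataset_bytes must be non-negative")
    if target_partition_bytes <= 0:
        raise ValueError("target_partition_bytes must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    partitions = (dataset_bytes + target_partition_bytes - 1) // target_partition_bytes
    return max(1, partitions)


if __name__ == "__main__":
    ten_gib = 10 * 1024 ** 3
    print(suggest_num_partitions(ten_gib))
```

The result is only a starting point; it could be passed to something like `repartition` on an RDD or DataFrame and then adjusted after observing how the job behaves.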
On May 26, 2016 2:07 AM, "Jörn Franke"
Spark can handle this, true, but it is optimized for the idea that it
works on the same full dataset in memory, due to the underlying nature of
machine learning algorithms (iterative). Of course, you can spill over, but
you should avoid that.
That being said you should have read my
On Wed, May 25, 2016 at 9:52 AM, Jörn Franke wrote:
> Spark is more for machine learning working iteratively over the whole same
> dataset in memory. Additionally it has streaming and graph processing
> capabilities that can be used together.
>
Hi Jörn,
The first part is
> HTH
>
>
> Dr Mich Talebzadeh
>
> LinkedIn
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
>
>> On 25 May 2016 at 16:34, Aakash Basu <raj2coo...@gmail.com> wrote:
> Hi,
>
> I’m new to the Spark ecosystem and need to understand the *pros and cons* of
> fetching data using *SparkSQL vs Hive in Spark vs Spark API*.
>
> *PLEASE HELP!*
>
> Thanks,
>
> Aakash Basu.
Hi,
I’m new to the Spark ecosystem and need to understand the *pros and cons* of
fetching data using *SparkSQL vs Hive in Spark vs Spark API*.
*PLEASE HELP!*
Thanks,
Aakash Basu.
Hi,
I am a newbie to Spark, and I am exploring the options, with their pros and cons,
to find which one will work best in a Spark and Hive context. My input datasets are CSV
files; I am using Spark to process the data and saving it in Hive using
HiveContext.
1) Process the CSV file using the spark-csv package and create
> I am a newbie to Spark, and I am exploring the options, with their pros and cons,
> to find which one will work best in a Spark and Hive context. My input datasets are CSV
> files; I am using Spark to process the data and saving it in Hive using
> HiveContext.
>
> 1) Process the CSV file using spark-csv pac