Since you are a data engineer, I would start by learning Scala. The parts of
Scala you would need to learn are pretty basic. Start with the examples on
the Spark website, which are given in multiple languages. Think of Scala as
a statically typed version of Python. You will find that the error messages
tend to be much more meaningful in Scala because Spark itself is written in
Scala. If you don’t want to install the JVM and Scala, I highly recommend
Databricks Community Edition as a place to start.

On Thu, Jul 4, 2019 at 11:22 PM Vikas Garg <sperry...@gmail.com> wrote:

> I am currently working as a data engineer, using Power BI and SSIS (an ETL
> tool). For learning purposes, I have set up PySpark and am also able to
> run queries through Spark on a multi-node cluster database (I am using
> Vertica and will later move to HDFS or SQL Server).
>
> I also have good knowledge of Python.
>
> On Fri, 5 Jul 2019 at 10:32, Kurt Fehlhauer <kfehl...@gmail.com> wrote:
>
>> Are you a data scientist or data engineer?
>>
>>
>> On Thu, Jul 4, 2019 at 10:34 PM Vikas Garg <sperry...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am a new Spark learner. Can someone guide me on a strategy for
>>> gaining expertise in PySpark?
>>>
>>> Thanks!!!
>>>
>>
