Thank you. I guess I have to use a common mount or S3 to access those files.
On Sun, Jun 25, 2017 at 4:42 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Thanks. In my experience certain distros like Cloudera only support yarn
> client mode, so AFAIK the driver stays on the edge node. Happy to be
> corrected :)
>
> Dr Mich Talebzadeh
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On 25 June 2017 at 10:37, Anastasios Zouzias <zouz...@gmail.com> wrote:
>
>> Hi Mich,
>>
>> If the driver starts on the edge node with cluster mode, then I don't see
>> the difference between client and cluster deploy mode.
>>
>> In cluster mode, it is the responsibility of the resource manager (YARN,
>> etc.) to decide where to run the driver (at least for Spark 1.6 this is
>> what I have experienced).
>>
>> Best,
>> Anastasios
>>
>> On Sun, Jun 25, 2017 at 11:14 AM, Mich Talebzadeh
>> <mich.talebza...@gmail.com> wrote:
>>
>>> Hi Anastasios,
>>>
>>> Are you implying that in YARN cluster mode, even if you submit your
>>> Spark application on an edge node, the driver can start on any node? I
>>> was under the impression that the driver starts on the edge node, and
>>> that the executors can be on any node in the cluster (where Spark agents
>>> are running)?
>>>
>>> Thanks,
>>>
>>> Dr Mich Talebzadeh
>>>
>>> On 25 June 2017 at 09:39, Anastasios Zouzias <zouz...@gmail.com> wrote:
>>>
>>>> Just to note that in cluster mode the Spark driver might run on any
>>>> node of the cluster, hence you need to make sure that the file exists
>>>> on *all* nodes. Push the file to all nodes, or use client deploy mode.
>>>>
>>>> Best,
>>>> Anastasios
>>>>
>>>> On 24.06.2017 at 23:24, "Holden Karau" <hol...@pigscanfly.ca> wrote:
>>>>
>>>>> addFile is supposed to not depend on a shared FS unless the semantics
>>>>> have changed recently.
>>>>>
>>>>> On Sat, Jun 24, 2017 at 11:55 AM varma dantuluri <dvsnva...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Sudhir,
>>>>>>
>>>>>> I believe you have to use a shared file system that is accessed by
>>>>>> all nodes.
>>>>>>
>>>>>> On Jun 24, 2017, at 1:30 PM, sudhir k <k.sudhi...@gmail.com> wrote:
>>>>>>
>>>>>> I am new to Spark and I need some guidance on how to fetch files
>>>>>> passed with the --files option on spark-submit.
>>>>>>
>>>>>> I read on some forums that we can fetch the files with
>>>>>> SparkFiles.get(fileName) and use them in our code, and all nodes
>>>>>> should be able to read them.
>>>>>>
>>>>>> But I am facing some issues.
>>>>>>
>>>>>> Below is the command I am using:
>>>>>>
>>>>>> spark-submit --deploy-mode cluster --class com.check.Driver --files
>>>>>> /home/sql/first.sql test.jar 20170619
>>>>>>
>>>>>> So when I use SparkFiles.get("first.sql"), I should be able to read
>>>>>> the file path, but it is throwing a FileNotFoundException.
>>>>>>
>>>>>> I tried SparkContext.addFile("/home/sql/first.sql") and then
>>>>>> SparkFiles.get("first.sql"), but still the same error.
>>>>>>
>>>>>> It works in standalone mode but not in cluster mode. Any help is
>>>>>> appreciated. Using Spark 2.1.0 and Scala 2.11.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> Regards,
>>>>>> Sudhir K
>>>>>>
>>>>> --
>>>>> Cell: 425-233-8271
>>>>> Twitter: https://twitter.com/holdenkarau
>>>>
>> --
>> -- Anastasios Zouzias
>> <a...@zurich.ibm.com>
>
> --
> Sent from Gmail Mobile
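For reference, a minimal sketch of the usage discussed in the thread, using the path and jar command from the original message. One common pitfall is that `SparkFiles.get` expects only the bare file name as a quoted string, not the submit-time path. The actual Spark calls are shown as comments, since they need a live cluster; only the base-name logic runs here.

```scala
// Sketch: SparkFiles.get("first.sql") wants the base name, not
// "/home/sql/first.sql". Paths taken from the spark-submit command
// in the thread; the Spark calls themselves are comments.
import java.io.File

object SparkFilesSketch {
  // What was shipped: spark-submit ... --files /home/sql/first.sql ...
  val submittedPath = "/home/sql/first.sql"

  // Only the file-name portion is used to look the file up later:
  val baseName = new File(submittedPath).getName // "first.sql"

  // Inside the driver/executors the lookup would then be (not run here):
  //   val localPath = org.apache.spark.SparkFiles.get(baseName)
  //   val sql = scala.io.Source.fromFile(localPath).mkString

  def main(args: Array[String]): Unit = {
    assert(baseName == "first.sql")
    println(baseName)
  }
}
```

In client deploy mode the driver runs where spark-submit was launched, so a path on the edge node resolves; in cluster mode the resource manager may place the driver on any node, which is why the thread suggests shipping the file to all nodes or reading it from shared storage.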