Just a follow-up question: how does the Spark task/job/master know how to split
a file that is in S3? In most cases it would be better to fetch different
parts of the file in parallel. Is that something that is done by the workers?
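
For concreteness, a minimal sketch of the kind of read I mean (the master URL,
bucket name, and partition count are placeholders, and I'm assuming S3
credentials are configured as in the EC2 docs quoted below):

    import org.apache.spark.SparkContext

    val sc = new SparkContext("spark://master:7077", "S3SplitTest")
    // The second argument is a hint for the minimum number of splits.
    // My understanding is that Hadoop's input format turns the file into
    // byte-range splits, which the workers then fetch from S3 in parallel.
    val lines = sc.textFile("s3n://some-bucket/big-file.txt", 16)
    println(lines.count())
    sc.stop()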

> On Oct 23, 2013, at 18:28, Ayush Mishra <[email protected]> wrote:
> 
> You can check 
> http://blog.knoldus.com/2013/09/09/running-standalone-scala-job-on-amazon-ec2-spark-cluster/.
> 
> 
>> On Thu, Oct 24, 2013 at 6:54 AM, Nan Zhu <[email protected]> wrote:
>> Great!!!
>> 
>> 
>>> On Wed, Oct 23, 2013 at 9:21 PM, Matei Zaharia <[email protected]> 
>>> wrote:
>>> Yes, take a look at 
>>> http://spark.incubator.apache.org/docs/latest/ec2-scripts.html#accessing-data-in-s3
>>> 
>>> Matei
>>> 
>>> 
>>>> On Oct 23, 2013, at 6:17 PM, Nan Zhu <[email protected]> wrote:
>>>> 
>>>> Hi, all
>>>> 
>>>> Is there any solution for running Spark with Amazon S3?
>>>> 
>>>> Best,
>>>> 
>>>> Nan
> 
