Re: S3 token times out during data frame "write.csv"

2018-01-28 Thread Jörn Franke
He is using CSV, and either ORC or Parquet would be fine.

> On 28. Jan 2018, at 06:49, Gourav Sengupta wrote:
>
> Hi,
>
> There is definitely a parameter while creating temporary security credential
> to mention the number of minutes those credentials will be active.

Re: S3 token times out during data frame "write.csv"

2018-01-27 Thread Gourav Sengupta
Hi, There is definitely a parameter, when creating temporary security credentials, for specifying the number of minutes those credentials will be active. There is an upper limit, of course, which is around 3 days if I remember correctly, and the default, as you can see, is 30 mins. Can you let me
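
To illustrate the duration parameter mentioned above, here is a minimal sketch that requests temporary credentials with an explicit lifetime through boto3's STS client. The helper names (`clamp_duration`, `request_session_token`) are hypothetical, and the 900 to 129600 second bounds are assumptions for illustration: the allowed range depends on which STS call issues the token and on account or role settings, so it is worth checking against the AWS documentation.

```python
def clamp_duration(requested_seconds, minimum=900, maximum=129600):
    """Clamp a requested credential lifetime to an allowed range.

    The default bounds are assumptions for illustration; the real limits
    depend on the STS operation and on account or role settings.
    """
    return max(minimum, min(int(requested_seconds), maximum))


def request_session_token(duration_seconds):
    """Request temporary credentials valid for roughly duration_seconds.

    Requires boto3 and configured AWS credentials; boto3 is imported
    lazily so clamp_duration can be used without it.
    """
    import boto3

    sts = boto3.client("sts")
    response = sts.get_session_token(
        DurationSeconds=clamp_duration(duration_seconds)
    )
    # The response holds AccessKeyId, SecretAccessKey, SessionToken
    # and Expiration for the new temporary credentials.
    return response["Credentials"]
```

If the credentials come from an assumed role rather than a session token, the equivalent parameter on `assume_role` is also called `DurationSeconds`, with its own limits.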

Re: S3 token times out during data frame "write.csv"

2018-01-25 Thread Jean Georges Perrin
Are you writing from an Amazon instance or from an on-premises install to S3? How many partitions are you writing from? Maybe you can try to “play” with repartitioning to see how it behaves?

> On Jan 23, 2018, at 17:09, Vasyl Harasymiv wrote:
>
> It is about 400
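
The repartitioning suggestion could be tried along the following lines. This is only a sketch: `choose_num_partitions`, `write_csv_repartitioned` and the 5 million rows-per-file target are hypothetical names and values picked for illustration, and `df` is assumed to be a PySpark DataFrame. Whether a different partition count shortens the write enough to fit inside the token lifetime depends on the cluster.

```python
import math


def choose_num_partitions(total_rows, rows_per_file=5_000_000):
    """Pick an output partition count so each file holds about rows_per_file rows.

    rows_per_file is an arbitrary illustrative target, not a recommendation.
    """
    return max(1, math.ceil(total_rows / rows_per_file))


def write_csv_repartitioned(df, s3_location, total_rows):
    """Repartition a Spark DataFrame, then write it as CSV part files."""
    num_partitions = choose_num_partitions(total_rows)
    # DataFrame.repartition(n) shuffles the rows into n partitions; each
    # partition is written as a separate part file under s3_location.
    df.repartition(num_partitions).write.csv(s3_location)
```

For the roughly 400 million rows mentioned in this thread, `choose_num_partitions(400_000_000)` gives 80 partitions with the assumed target.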

Re: S3 token times out during data frame "write.csv"

2018-01-23 Thread Vasyl Harasymiv
It is about 400 million rows. S3 automatically chunks the file on its end while writing, so that's fine; e.g., it creates the same file name with alphanumeric suffixes. However, the write session expires due to token expiration.

On Tue, Jan 23, 2018 at 5:03 PM, Jörn Franke

Re: S3 token times out during data frame "write.csv"

2018-01-23 Thread Jörn Franke
How large is the file? If it is very large, then you should have several partitions for the output anyway. This is also important in case you need to read from S3 again: having several files there enables parallel reading.

> On 23. Jan 2018, at 23:58, Vasyl Harasymiv

S3 token times out during data frame "write.csv"

2018-01-23 Thread Vasyl Harasymiv
Hi Spark Community,

Saving a data frame into a file on S3 using:

*df.write.csv(s3_location)*

If run for longer than 30 mins, the following error persists:

*The provided token has expired. (Service: Amazon S3; Status Code: 400; Error Code: ExpiredToken;)*

Potentially, because there is a
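
For context, one way temporary credentials reach a Spark job is through Hadoop S3A settings. The sketch below assumes the S3A connector (`s3a://` paths), which the original post does not confirm; other connectors use different settings. The helper names `s3a_session_conf` and `build_session` are hypothetical. The write cannot outlive the credentials passed in here, so their lifetime needs to cover the whole job.

```python
def s3a_session_conf(access_key, secret_key, session_token):
    """Build Spark settings for S3A access with temporary credentials.

    Assumes the Hadoop S3A connector; the spark.hadoop. prefix forwards
    each setting to the Hadoop configuration.
    """
    return {
        "spark.hadoop.fs.s3a.aws.credentials.provider":
            "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider",
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        "spark.hadoop.fs.s3a.session.token": session_token,
    }


def build_session(conf):
    """Create a SparkSession with the given settings (requires pyspark)."""
    from pyspark.sql import SparkSession

    builder = SparkSession.builder
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

With a session built this way, the original call would target an `s3a://` location, e.g. `df.write.csv(s3_location)` where `s3_location` starts with `s3a://`.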