Re: Spark SQL - saving to multiple partitions in parallel - FileNotFoundException on _temporary directory possible bug?

2015-12-08 Thread Jiří Syrový
Hi,

I have a very similar issue on a standalone SQL context, but when using
save() instead. I guess it might be related to
https://issues.apache.org/jira/browse/SPARK-8513. It also usually happens
after using groupBy.
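
For illustration, a minimal sketch of the save()-style write being referred
to here, i.e. writing one partition's data straight to an explicit DFS path
instead of going through insertInto(). The path, format, column name and the
DataFrame df are placeholders, not taken from the thread:

    import org.apache.spark.sql.SaveMode

    // df is assumed to be an existing DataFrame; path layout and partition
    // column are placeholders for whatever the real table uses.
    val partValue = "2015-12-08"
    val partPath  = s"/warehouse/tableName/partCol=$partValue"

    df.filter(df("partCol") === partValue)   // keep only rows for this partition
      .drop("partCol")                       // the value is encoded in the directory name
      .write
      .format("parquet")
      .mode(SaveMode.Append)
      .save(partPath)                        // direct save(), no dynamic partitioning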

Regards,
Jiri

2015-12-08 0:16 GMT+01:00 Deenar Toraskar :

> Hi
>
> I have a process that writes to multiple partitions of the same table in
> parallel, using multiple threads sharing the same SQL context:
> df.write.partitionBy("partCol").insertInto("tableName"). I am getting a
> FileNotFoundException on the _temporary directory. Each write only goes
> to a single partition. Is there some way of avoiding dynamic partitioning
> with df.write without having to resort to .save, as I don't want to hardcode
> a physical DFS location in my code?
>
> This is similar to the issue described here:
> https://issues.apache.org/jira/browse/SPARK-2984
>
> Regards
> Deenar
>
> *Think Reactive Ltd*
> deenar.toras...@thinkreactive.co.uk
>
>
>
>


Spark SQL - saving to multiple partitions in parallel - FileNotFoundException on _temporary directory possible bug?

2015-12-07 Thread Deenar Toraskar
Hi

I have a process that writes to multiple partitions of the same table in
parallel, using multiple threads sharing the same SQL context:
df.write.partitionBy("partCol").insertInto("tableName"). I am getting a
FileNotFoundException on the _temporary directory. Each write only goes
to a single partition. Is there some way of avoiding dynamic partitioning
with df.write without having to resort to .save, as I don't want to hardcode
a physical DFS location in my code?
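
To make the setup concrete, here is a minimal sketch of the pattern described
above: several threads share one SQL context and each inserts a DataFrame
covering a single partition into the same table. The pool size, table name,
partition column and the partitionDfs collection are placeholders, and it
assumes Spark 1.x behaviour where partitionBy() can be combined with
insertInto() and the target table already exists partitioned by partCol:

    import java.util.concurrent.Executors
    import scala.concurrent.{Await, ExecutionContext, Future}
    import scala.concurrent.duration.Duration
    import org.apache.spark.sql.DataFrame

    // A small pool so several partition writes run in parallel on the driver.
    implicit val ec = ExecutionContext.fromExecutor(Executors.newFixedThreadPool(4))

    // Each DataFrame in partitionDfs is assumed to hold the rows for exactly
    // one value of partCol; partitionDfs itself is a placeholder collection.
    def writeOnePartition(df: DataFrame): Future[Unit] = Future {
      df.write.partitionBy("partCol").insertInto("tableName")
    }

    // Kick off all writes and wait for them; this is the point at which the
    // FileNotFoundException on _temporary was reported.
    val writes = partitionDfs.map(writeOnePartition)
    Await.result(Future.sequence(writes), Duration.Inf)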

This is similar to the issue described here:
https://issues.apache.org/jira/browse/SPARK-2984

Regards
Deenar

*Think Reactive Ltd*
deenar.toras...@thinkreactive.co.uk