have the first path be something like
.csv("file:///home/user/dataset/data.csv")
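A local absolute path written as a file: URI has an empty host part, so it carries three slashes after the scheme (with only two, "home" would be parsed as a host name). A minimal Python sketch of building such a URI; the helper name local_file_uri is hypothetical, not part of Spark:

```python
from urllib.parse import quote


def local_file_uri(abs_path: str) -> str:
    """Build a file: URI with an empty host from an absolute POSIX path."""
    if not abs_path.startswith("/"):
        raise ValueError("expected an absolute POSIX path")
    # "file://" + empty host + path that itself starts with "/"
    return "file://" + quote(abs_path)


uri = local_file_uri("/home/user/dataset/data.csv")
print(uri)
```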
If you are working with files that big:
- don't use the inferSchema option, as that will trigger two scans through the
data
- try with a smaller file first, say 1MB or so
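A rough PySpark sketch of the three steps asked about in this thread (read TSV, convert to ORC, store remotely), passing an explicit schema instead of inferSchema so the data is scanned once. Everything specific here is an assumption for illustration: the function name tsv_to_orc, the column names, the presence of a header row, the swift:// container and provider names, and a Spark version whose reader accepts a DDL schema string (older versions would need a StructType, and may need Hive support for ORC). Writing to swift:// also assumes the hadoop-openstack module is on the classpath and configured with credentials.

```python
def tsv_to_orc(spark, src_uri, dst_uri, schema_ddl, has_header=True):
    """Read a tab-separated file with a fixed schema and write it out as ORC.

    `spark` is an existing SparkSession; the schema is given up front so
    Spark does not make an extra pass over the data to infer column types.
    """
    df = (
        spark.read
        .schema(schema_ddl)            # explicit schema, no inferSchema
        .option("sep", "\t")           # tab-separated input
        .option("header", has_header)  # assumption: first line holds column names
        .csv(src_uri)
    )
    df.write.mode("overwrite").orc(dst_uri)
    return df


# Usage sketch (needs pyspark; paths, container, provider and columns are made up):
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.appName("tsv-to-orc").getOrCreate()
#   tsv_to_orc(spark,
#              "file:///home/user/dataset/data.csv",
#              "swift://container.provider/dataset_orc",
#              "id INT, name STRING")
```

Following the advice above, it is probably worth running this against a file of 1MB or so before pointing it at the full dataset.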
Trying to use spark *or any other tool* to
Hi, the source file I have is on my local machine and it's pretty huge, around 150
GB. How should I go about it?
On Sun, Nov 20, 2016 at 8:52 AM, Steve Loughran
wrote:
>
> On 19 Nov 2016, at 17:21, vr spark wrote:
>
> Hi,
> I am looking for scala or python
Hi,
I am looking for Scala or Python code samples to convert a local TSV file to
an ORC file and store it on distributed cloud storage (OpenStack).
So I need these 3 samples. Please suggest.
1. read tsv
2. convert to orc
3. store on distributed cloud storage
thanks
VR