Cheers
>
> On Sun, Dec 27, 2015 at 9:28 AM, Նարեկ Գալստեան <ngalsty...@gmail.com>
> wrote:
>
>>
>> http://spark.apache.org/docs/1.4.1/api/scala/index.html#org.apache.spark.sql.DataFrameWriter
>> I did try, but it was all in vain.
>> It is also exp
ber...@gmail.com> wrote:
> Have you tried specifying the format of your output? Parquet might be the
> default format.
> df.write().format("json").mode(SaveMode.Overwrite).save("/tmp/path");
>
> On 27 December 2015 at 15:18, Նարեկ Գալստեան <ngalsty...@gmail.com> wrote:
Hey all!
I want to partition *json* data by a column name and store the result
as a collection of json files to be loaded into another database.
I could use Spark's built-in *partitionBy* function, but it only outputs in
parquet format, which is not desirable for me.
Could you suggest a way to do this?
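(For context: the Spark 1.4 scaladoc does note that partitionBy is only
applicable to Parquet at that point, which matches the behaviour described;
in newer Spark releases it works with other file-based sources, including
JSON. A minimal sketch, untested, with placeholder column name and paths:)

```scala
// Sketch: partitioned JSON output via DataFrameWriter.
// "someColumn" and the paths below are placeholders; assumes an
// existing SQLContext named sqlContext.
import org.apache.spark.sql.SaveMode

val df = sqlContext.read.json("/path/to/input")
df.write
  .partitionBy("someColumn")      // one output subdirectory per column value
  .format("json")
  .mode(SaveMode.Overwrite)
  .save("/path/to/output")
```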
A question regarding the topic:
I am using IntelliJ to write Spark applications and then have to ship the
source code to my cluster in the cloud to compile and test it.
Is there a way to automate the process using IntelliJ?
Narek Galstyan
Նարեկ Գալստյան
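(Not an IntelliJ feature as such, but one common workaround is to build a
fat jar locally and ship only the jar, so the cluster never compiles
source. The hostname, paths, and class name below are hypothetical:)

```shell
# Build a fat jar locally (requires the sbt-assembly plugin),
# copy it to the cluster, and submit it there.
sbt assembly
scp target/scala-2.10/myapp-assembly-0.1.jar user@cluster:/tmp/
ssh user@cluster spark-submit \
  --class com.example.Main \
  --master yarn-client \
  /tmp/myapp-assembly-0.1.jar
```

IntelliJ can run such a script as an external tool or before-launch step,
which automates the ship-and-test cycle.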
On 29 November 2015 at 20:51, Ndjido Ardo wrote:
> move files matching your pattern to a staging location and then load them
> using sc.textFile. You should find HDFS filesystem calls that are
> equivalent to the normal filesystem ones if command-line tools like distcp
> or mv don't meet your needs.
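(The filesystem calls mentioned above can be sketched like this — the
programmatic equivalent of `hdfs dfs -mv`, using Hadoop's FileSystem API
from the driver. Paths are hypothetical; assumes an existing SparkContext
named sc:)

```scala
// Sketch: move files matching a glob into a staging directory.
import org.apache.hadoop.fs.{FileSystem, Path}

val fs = FileSystem.get(sc.hadoopConfiguration)
val matches = fs.globStatus(new Path("/data/incoming/part-*.json"))
matches.foreach { status =>
  val dest = new Path("/data/staging/" + status.getPath.getName)
  fs.rename(status.getPath, dest)   // move, like `mv`
}
```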
> On 27 Oct 2015 1:49 p.m., "Նարեկ Գալստեան" wrote:
Dear Spark users,
I am reading a set of json files to compile them to the Parquet data format.
I would like to mark the folders in some way after having read their
contents, so that I do not read them again (e.g. I could change the name of
the folder).
I use the .textFile("path/to*/dir/*/*/*.js") technique.
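(One way to sketch the rename-after-read idea, using Hadoop's FileSystem
API. The "_DONE" suffix, output path, and glob below are hypothetical;
assumes an existing SparkContext sc and SQLContext sqlContext:)

```scala
// Sketch: convert each unprocessed folder's JSON to Parquet, then
// rename the folder so later runs can skip it.
import org.apache.hadoop.fs.{FileSystem, Path}

val fs = FileSystem.get(sc.hadoopConfiguration)
val dirs = fs.globStatus(new Path("path/to*/dir/*/*")).filter(_.isDirectory)
dirs.filterNot(_.getPath.getName.endsWith("_DONE")).foreach { d =>
  val df = sqlContext.read.json(d.getPath.toString + "/*.js")
  df.write.parquet("/output/parquet/" + d.getPath.getName)
  // mark the folder as read by renaming it in place
  fs.rename(d.getPath, new Path(d.getPath.getParent, d.getPath.getName + "_DONE"))
}
```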
I have a significant amount of data stored on my Hadoop HDFS as Parquet
files.
I am using Spark Streaming to interactively receive queries from a web
server and transform the received queries into SQL to run on my data using
Spark SQL.
In this process I need to run several SQL queries and then return