I also wonder why there isn't a JDBC connector for Spark SQL?
Sent from my iPhone
> On Aug 10, 2017, at 2:45 PM, Jules Damji wrote:
>
> Yes, it's used more in Hive than in Spark
>
> Sent from my iPhone
> Pardon the dumb thumb typos :)
>
>> On Aug 10, 2017, at 2:24 PM, Sathish Kumaran Vairavelu wrote:
Hi,
I tried to start the Spark Thrift server but got the following error:
javax.security.sasl.SaslException: GSS initiate failed [Caused by
GSSException: No valid credentials provided (Mechanism level: Failed to
find any Kerberos tgt)]
java.io.IOException: javax.security.sasl.SaslException: GSS initiate
The correct link is
https://docs.databricks.com/spark/latest/spark-sql/index.html .
This link does have the core syntax, such as the BNF for DDL, DML, and
SELECT. It does *not* have a reference for date / string / numeric
functions: is there any such reference at this point? It is not sufficient
on its own.
You could exit with an error code just like a normal Java/Scala application,
and get it from the driver/YARN.
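For example, a minimal sketch of that pattern (the app name and job body are
placeholders):

    import org.apache.spark.sql.SparkSession

    object MyJob {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("my-job").getOrCreate()
        try {
          // ... run the actual job here ...
        } catch {
          case e: Throwable =>
            e.printStackTrace()
            spark.stop()
            sys.exit(1) // non-zero exit code shows up as a failed application in YARN
        }
        spark.stop()
      }
    }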
On Fri, Aug 11, 2017 at 9:55 AM, Wei Zhang wrote:
> I suppose you can find the job status from the YARN UI application view.
>
>
>
> Cheers,
>
> -z
>
>
>
> *From:* 陈宇航 [mailto:yuhang.c...@foxmail.com]
I suppose you can find the job status from the YARN UI application view.
Cheers,
-z
From: 陈宇航 [mailto:yuhang.c...@foxmail.com]
Sent: Thursday, August 10, 2017 5:23 PM
To: user
Subject: How can I tell if a Spark job is successful or not?
I want to do some clean-ups after a Spark job is finished, and the action I
would do depends on whether the job is successful or not.
Hi,
I am facing issues while trying to recover a textFileStream from checkpoint.
Basically, it is trying to load the files from the original job start,
whereas I am deleting the files after processing them. I have the following
configs set, so I was thinking that it should not look for files from before
the restart.
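For reference, the standard recovery pattern is StreamingContext.getOrCreate;
on restart it rebuilds pending batches from the file metadata recorded in the
checkpoint, which is why files deleted after processing can still be requested.
A minimal sketch (paths and batch interval are illustrative):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val checkpointDir = "hdfs:///checkpoints/file-stream" // illustrative

    def createContext(): StreamingContext = {
      val conf = new SparkConf().setAppName("file-stream")
      val ssc = new StreamingContext(conf, Seconds(60))
      ssc.checkpoint(checkpointDir)
      ssc.textFileStream("hdfs:///incoming").count().print() // illustrative input dir
      ssc
    }

    // Fresh start: calls createContext. Restart: rebuilds the context,
    // including remembered file batches, from the checkpoint.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()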
I refer to docs.databricks.com/Spark/latest/Spark-sql/index.html.
Cheers
Jules
Sent from my iPhone
Pardon the dumb thumb typos :)
> On Aug 10, 2017, at 1:46 PM, Stephen Boesch wrote:
>
>
> While DataFrames/Datasets are useful in many circumstances, they are
> cumbersome for many types of complex SQL queries.
Yes, it's used more in Hive than in Spark
Sent from my iPhone
Pardon the dumb thumb typos :)
> On Aug 10, 2017, at 2:24 PM, Sathish Kumaran Vairavelu wrote:
>
> I think it is for the Hive dependency.
>> On Thu, Aug 10, 2017 at 4:14 PM kant kodali wrote:
>> Since I see a Calcite dependency in Spark, I wonder where Calcite is being used?
I think it is for the Hive dependency.
On Thu, Aug 10, 2017 at 4:14 PM kant kodali wrote:
> Since I see a Calcite dependency in Spark, I wonder where Calcite is being
> used?
>
> On Thu, Aug 10, 2017 at 1:30 PM, Sathish Kumaran Vairavelu <vsathishkuma...@gmail.com> wrote:
>
>> Spark SQL doesn't use Calcite.
Since I see a Calcite dependency in Spark, I wonder where Calcite is being
used?
On Thu, Aug 10, 2017 at 1:30 PM, Sathish Kumaran Vairavelu <vsathishkuma...@gmail.com> wrote:
> Spark SQL doesn't use Calcite
>
> On Thu, Aug 10, 2017 at 3:14 PM kant kodali wrote:
>
>> Hi All,
>>
>> Does Spark SQL use Calcite? If so, what for?
While DataFrames/Datasets are useful in many circumstances, they are
cumbersome for many types of complex SQL queries.
Is there an up-to-date *SQL* reference, i.e. not DataFrame DSL operations,
for version 2.2?
An example of what is not clear: what constructs are supported within a
SELECT?
Spark SQL doesn't use Calcite
On Thu, Aug 10, 2017 at 3:14 PM kant kodali wrote:
> Hi All,
>
> Does Spark SQL use Calcite? If so, what for? I thought Spark SQL has
> Catalyst, which would generate its own logical plans, physical plans, and
> other optimizations.
>
> Thanks,
> Kant
>
Hi All,
Does Spark SQL use Calcite? If so, what for? I thought Spark SQL has
Catalyst, which would generate its own logical plans, physical plans, and
other optimizations.
Thanks,
Kant
Got the answer from
https://groups.google.com/a/lists.datastax.com/forum/#!topic/spark-connector-user/ETCZdCcaKq8
On Thu, Aug 10, 2017 at 11:59 AM, shyla deshpande wrote:
> I have a 3-node Cassandra cluster. I want to pass all 3 nodes in
> spark-submit. How do I do that?
> Any code samples will help.
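The spark-cassandra-connector reads its contact points from the
spark.cassandra.connection.host setting, which accepts a comma-separated
list, so something along these lines should work (host names are
illustrative):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("cassandra-job")
      // All three nodes as contact points; the driver discovers the rest of the ring.
      .config("spark.cassandra.connection.host", "node1,node2,node3")
      .getOrCreate()

or equivalently on the command line:
--conf spark.cassandra.connection.host=node1,node2,node3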
I have a 3-node Cassandra cluster. I want to pass all 3 nodes in
spark-submit. How do I do that?
Any code samples will help.
Thanks
Thanks Cody.
On Wed, Aug 9, 2017 at 8:46 AM, Cody Koeninger wrote:
> org.apache.spark.streaming.kafka.KafkaCluster has methods
> getLatestLeaderOffsets and getEarliestLeaderOffsets
>
> On Mon, Aug 7, 2017 at 11:37 PM, shyla deshpande wrote:
> > Thanks TD.
> >
> > On Mon, Aug 7, 2017 at 8:59 PM
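For reference, a sketch of using those two methods, assuming the 0.8
connector where KafkaCluster takes the usual kafkaParams map and both calls
return an Either (broker and topic names are illustrative):

    import kafka.common.TopicAndPartition
    import org.apache.spark.streaming.kafka.KafkaCluster

    val kc = new KafkaCluster(Map("metadata.broker.list" -> "broker1:9092")) // illustrative

    // Resolve the topic's partitions, then ask the leaders for their latest offsets.
    val latest = for {
      parts   <- kc.getPartitions(Set("my-topic")).right
      offsets <- kc.getLatestLeaderOffsets(parts).right
    } yield offsets

    latest match {
      case Right(offsets) => offsets.foreach { case (tp, lo) => println(s"$tp -> ${lo.offset}") }
      case Left(errs)     => errs.foreach(println)
    }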
Hi,
I have a Spark Streaming task that basically does the following:
1. Read a batch using a custom receiver
2. Parse and apply transforms to the batch
3. Convert the raw fields to a bunch of features
4. Use a pre-built model to predict the class of each record in the batch
I want to do some clean-ups after a Spark job is finished, and the action I
would do depends on whether the job is successful or not.
So where can I get the result for the job?
I already tried the SparkListener; it worked fine when the job was successful,
but if the job fails, the listener seems not to get called.
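For reference, a minimal sketch of that listener route (sc is an existing
SparkContext). Note it reports per-job status only; if the driver itself
dies, events may never fire, which is where the exit-code approach mentioned
elsewhere in the thread helps:

    import org.apache.spark.scheduler.{JobSucceeded, SparkListener, SparkListenerJobEnd}

    sc.addSparkListener(new SparkListener {
      override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit =
        jobEnd.jobResult match {
          case JobSucceeded => println(s"Job ${jobEnd.jobId} succeeded")
          // JobFailed is private[spark], so match everything else as failure.
          case other        => println(s"Job ${jobEnd.jobId} failed: $other")
        }
    })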
Yeah, installing HDFS in our environment is unfortunately going to take a lot
of time (approvals/planning etc). I will have to live with the local FS for now.
The other option I had already tried is collect(), sending everything to the
driver node. But my data volume is too huge for the driver node to handle.
Also, why are you trying to write results locally if you're not using a
distributed file system? Spark is geared towards writing to a distributed file
system. I would suggest trying to collect() so the data is sent to the master
and then do a write if the result set isn't too big, or repartition the data
before writing.
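A sketch of that repartition route (myDataFrame and the path are illustrative;
note the single output file still lands on whichever node runs the task, so
the target should be a shared mount):

    // Shrink to one partition so a single task writes a single file.
    myDataFrame
      .repartition(1)
      .write.mode("overwrite")
      .csv("file:///shared/mount/output") // illustrative; must be visible to the executor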
Yes, I have tried with file:/// and the full path, as well as just the full
path without the file:/// prefix.
Spark session has been closed, no luck though ☹
Regards,
Hemanth
From: Femi Anthony
Date: Thursday, 10 August 2017 at 11.06
To: Hemanth Gudela
Cc: "user@spark.apache.org"
Subject: Re: spar
Is your filePath prefaced with file:/// and the full path, or is it relative?
You might also try calling close() on the Spark context or session at the end
of the program execution to try and ensure that cleanup is completed.
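i.e. something along these lines (a sketch; df and the path are illustrative):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("csv-writer").getOrCreate()
    try {
      df.write.mode("overwrite").csv("file:///full/path/output") // absolute path with file:///
    } finally {
      spark.stop() // explicit shutdown at the end of execution
    }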
Sent from my iPhone
> On Aug 10, 2017, at 3:58 AM, Hemanth Gudela wrote:
Thanks for the reply, Femi!
I’m writing the file like this -->
myDataFrame.write.mode("overwrite").csv("myFilePath")
There are absolutely no errors/warnings after the write.
The _SUCCESS file is created on the master node, but the problem of _temporary
is noticed only on worker nodes.
I know spark.write.csv
Normally the *_temporary* directory gets deleted as part of the cleanup
when the write is complete and a _SUCCESS file is created. I suspect that
the writes are not properly completed. How are you specifying the write?
Any error messages in the logs?
On Thu, Aug 10, 2017 at 3:17 AM, Hemanth Gudela wrote:
Hi,
I’m running Spark in cluster mode on 4 nodes, and trying to write CSV
files to the nodes’ local path (not HDFS).
I’m using spark.write.csv to write the CSV files.
On the master node:
spark.write.csv creates a folder with the CSV file name and writes many files
with a part-r-000n suffix. This is okay for me.