Hi,
I wish to know which type of SQL syntax is followed when we write a plain
SQL query inside spark.sql. Is it MySQL or PostgreSQL? I know it isn't SQL
Server or Oracle, as I had to convert a lot of SQL functions while migrating.
Also, could you point me to documentation which clearly states the above
at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:244)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStream$ResponseProcessor.run(DFSOutputStream.java:733)
More information can be seen here:
https://stackoverflow.com/questions/61202566/spark-sql-datasetrow-collect-to-driver-throw-java-io
> if (!${cond.isNull} && ${cond.value}) {
>   ${res.code}
>   $resultState = (byte)(${res.isNull} ? $HAS_NULL : $HAS_NONNULL);
>   ${ev.value} = ${res.value};
>   continue;
> }
> } while (false)
>
> Refer to:
> https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala#L208
Here is a full generated cod
I do not know the answer to this question so I am also looking for it, but
@kant maybe the generated code can help with this.
Hi,
What is the unix_timestamp() function equivalent in a plain spark SQL query?
I want to subtract one timestamp column from another, but in plain SQL I am
getting the error "Should be numeric or calendarinterval and not timestamp."
But when I did it through the above function inside
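For reference, a minimal sketch of subtracting two timestamp columns in plain
Spark SQL; the table and column names below are hypothetical:

spark.sql("""
  SELECT unix_timestamp(end_ts) - unix_timestamp(start_ts) AS duration_seconds
  FROM events
""").show()
// Casting to long gives the same epoch-second arithmetic:
// SELECT CAST(end_ts AS LONG) - CAST(start_ts AS LONG) AS duration_seconds FROM events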
Just wanted to follow up on this. If anyone has any advice, I’d be
interested in learning more!
On Thu, Feb 20, 2020 at 6:09 PM Ruijing Li wrote:
> Hi all,
>
> I’m interested in hearing the community’s thoughts on best practices to do
> integration testing for spark sql jobs.
Hi all,
I’m interested in hearing the community’s thoughts on best practices to do
integration testing for spark sql jobs. We run a lot of our jobs with cloud
infrastructure and hdfs - this makes debugging a challenge for us,
especially with problems that don’t occur from just initializing
Hello,
I am wondering whether there is a clear-cut performance advantage to using
the DSL API instead of Spark SQL for queries in Java? I am interested in Joins,
Aggregates, and Group By (with several fields) clauses.
Thank you.
RajevA
Is there a chance your data is all even or all odd?
On Tue, Dec 17, 2019 at 11:01 AM Tzahi File <tzahi.f...@ironsrc.com> wrote:
I have in my spark sql query a calculated field that gets the
value of field1 % 3.
I'm using this field as a partition so I expected to get 3
no.. there're 100M records both even and odd
On Tue, Dec 17, 2019 at 8:13 PM Russell Spitzer
wrote:
> Is there a chance your data is all even or all odd?
>
> On Tue, Dec 17, 2019 at 11:01 AM Tzahi File
> wrote:
>
>> I have in my spark sql query a calculated fiel
Is there a chance your data is all even or all odd?
On Tue, Dec 17, 2019 at 11:01 AM Tzahi File wrote:
> I have in my spark sql query a calculated field that gets the value if
> field1 % 3.
>
> I'm using this field as a partition so I expected to get 3 partitions in
> the mentio
I have in my spark sql query a calculated field that gets the value of
field1 % 3.
I'm using this field as a partition so I expected to get 3 partitions in
the mentioned case, and I do get. The issue happened with even numbers
(instead of 3 - 4,2 ... ).
When I tried to use even numbers
Hello,
I am new to Spark, so I have a basic question to which I couldn't find an
answer online.
If I want to run SQL queries on a Spark dataframe, do I have to create a
temporary table first?
I know I could use the Spark SQL API, but is there a way of simply reading
the data and running SQL queries
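A minimal sketch of both routes, assuming a hypothetical JSON file; registering
a temporary view is the usual way to make a DataFrame queryable with plain SQL:

val df = spark.read.json("/data/people.json")    // hypothetical path
df.createOrReplaceTempView("people")             // register the DataFrame as a temp view
spark.sql("SELECT name, age FROM people WHERE age > 21").show()
// Without a view, SQL expressions can be used directly on the DataFrame:
df.selectExpr("name", "age").where("age > 21").show()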
for this task? Because I bet that's what's slowing you down.
On Mon, Nov 11, 2019 at 9:46 AM Tzahi File wrote:
> Hi,
>
> Currently, I'm using hive huge cluster(m5.24xl * 40 workers) to run a
> percentile function. I'm trying to improve this job by moving it to run
> with spa
Hi,
Currently, I'm using a huge Hive cluster (m5.24xl * 40 workers) to run a
percentile function. I'm trying to improve this job by moving it to run
with Spark SQL.
Any suggestions on how to use a percentile function in Spark?
Thanks,
--
Tzahi File
Data Engineer
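For anyone searching later, a minimal sketch of a percentile computation in
Spark SQL; the table and column names are hypothetical:

spark.sql("""
  SELECT country,
         percentile_approx(latency, 0.95) AS p95_latency,
         percentile(latency, 0.5)         AS median_latency
  FROM events
  GROUP BY country
""").show()
// percentile_approx is usually preferable at scale; the exact percentile
// function has to buffer every value per group.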
Hi all,
I'm using spark 2.4.0, my spark.sql.catalogImplementation is set to hive
while spark.sql.warehouse.dir is set to a specific s3 bucket.
I want to execute a CTAS statement in spark sql like the one below.
*create table db_name.table_name as (select ..)*
When writing, spark always uses
Hi Team,
I have Kafka messages where the JSON is coming in as a string. How can I
create a table after converting the JSON string to JSON using Spark SQL?
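A minimal sketch with from_json, assuming a hypothetical payload schema and a
DataFrame already read from the Kafka source:

import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types._

val schema = new StructType()                 // hypothetical schema of the JSON payload
  .add("id", LongType)
  .add("name", StringType)

val parsed = kafkaDf                          // the DataFrame read from Kafka
  .selectExpr("CAST(value AS STRING) AS json")
  .select(from_json(col("json"), schema).as("data"))
  .select("data.*")

parsed.createOrReplaceTempView("events")      // now queryable with spark.sql("SELECT ... FROM events")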
everything Hadoop,
you can also implement ExternalCatalog:
https://github.com/apache/spark/blob/5264164a67df498b73facae207eda12ee133be7d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala
See https://jira.apache.org/jira/browse/SPARK-23443 for ongoing progress
Is it possible to use our own metastore instead of Hive Metastore with Spark
SQL?
Can you please point me to some docs or code I can look at to get it done?
We are moving away from everything Hadoop.
Hi,
I have a scenario like the one below
https://stackoverflow.com/questions/58134379/how-to-handle-backup-scenario-in-spark-structured-streaming-using-joins
How to handle this use-case ( back-up scenario) in
spark-structured-streaming?
Any clues would be highly appreciated.
Thanks,
Shyam
Using SQL, is it possible to query a column's metadata?
Thanks,
Kyunam
1) this is not a use case, but a technical solution. Hence nobody can tell you
whether it makes sense or not
2) do an upsert in Cassandra. However keep in mind that the application
submitting to the Kafka topic and the one consuming from the Kafka topic need
to ensure that they process messages in
What exactly is your requirement?
Is the read before write mandatory?
Are you maintaining states in Cassandra?
Regards
Prathmesh Ranaut
https://linkedin.com/in/prathmeshranaut
> On Aug 29, 2019, at 3:35 PM, Shyam P wrote:
>
>
> thanks Aayush. For every record I need to get the data
thanks Aayush.
For every record I need to get the data from the Cassandra table and
update it? Otherwise it may not update the existing record.
What is this datastax-spark-connector? Is that not a "Cassandra
connector library written for spark"?
If not, how do we write one ourselves?
Where and how to
Cassandra writes are upserts, so you should be able to do what you need with a
single statement unless you're looking to maintain counters.
I’m not sure if there is a Cassandra connector library written for spark
streaming because we wrote one ourselves when we wanted to do the same.
Regards
Prathmesh
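In case it helps, a minimal sketch of an upsert-style write through the DataStax
spark-cassandra-connector (the keyspace and table names are hypothetical, and the
connector package must be on the classpath); since Cassandra writes are upserts,
appending rows with an existing primary key overwrites the stored record:

val updates = spark.read.parquet("/data/updates")   // hypothetical source of records to upsert
updates.write
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "my_ks", "table" -> "my_table"))
  .mode("append")                                   // append behaves as upsert-by-primary-key in Cassandra
  .save()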
Hi,
I need to do a PoC for a business use-case.
*Use case :* Need to update a record in a Cassandra table if it exists.
Will Spark Streaming support comparing each record and updating the existing
Cassandra record?
For each record received from the Kafka topic, if I want to check and compare
each record
on your use case.
> On 14.08.2019 at 05:08, Shyam P wrote:
>
> Hi,
> Any advice how to do this in spark sql ?
>
> I have a scenario as below
>
> dataframe1 = loaded from an HDFS Parquet file.
>
> dataframe2 = read from a Kafka Stream.
>
> If column1
Hi,
Any advice on how to do this in Spark SQL?
I have a scenario as below
dataframe1 = loaded from an HDFS Parquet file.
dataframe2 = read from a Kafka Stream.
If the column1 value of dataframe1 is in the columnX values of dataframe2,
then I need to replace the column1 value of dataframe1
> i.e.hdfs://xxx:8020/apps/hive/warehouse/
> For this the code ran fine.
>
> Thanks for the help,
> -Nirmal
>
> From: Nirmal Kumar
> Sent: 19 June 2019 11:51
> To: Raymond Honderdors
> Cc: user
> Subject: RE: Unable to run simple spark-sql
>
> Hi Raymond,
filesystem.
I created a new database and confirmed that the location was in HDFS
i.e.hdfs://xxx:8020/apps/hive/warehouse/
For this the code ran fine.
Thanks for the help,
-Nirmal
From: Nirmal Kumar
Sent: 19 June 2019 11:51
To: Raymond Honderdors
Cc: user
Subject: RE: Unable to run simple spark-sql
directory of hive user
(/home/hive/).
Why is it referring to the local file system, and from where?
Thanks,
Nirmal
From: Raymond Honderdors
Sent: 19 June 2019 11:18
To: Nirmal Kumar
Cc: user
Subject: Re: Unable to run simple spark-sql
Hi Nirmal,
I came across the following article
"
> Sent: Tuesday, June 18, 2019 5:56:06 PM
> To: Raymond Honderdors; Nirmal Kumar
> Cc: user
> Subject: RE: Unable to run simple spark-sql
>
> Hi Raymond,
>
> Permission on hdfs is 777
> drwxrwxrwx - impadmin hdfs 0 2019-06-13 16:09
> /home/hive/spark-warehouse
>
>
> But it’
From: Nirmal Kumar
Sent: Tuesday, June 18, 2019 5:56:06 PM
To: Raymond Honderdors; Nirmal Kumar
Cc: user
Subject: RE: Unable to run simple spark-sql
Hi Raymond,
Permission on hdfs is 777
drwxrwxrwx - impadmin hdfs 0 2
-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1'
Thanks,
-Nirmal
From: Raymond Honderdors
Sent: 18 June 2019 17:52
To: Nirmal Kumar
Cc: user
Subject: Re: Unable to run simple spark-sql
Hi
Can you check the permission of the user running spark
Hi
Can you check the permissions of the user running Spark
on the HDFS folder where it tries to create the table?
On Tue, Jun 18, 2019, 15:05 Nirmal Kumar
wrote:
> Hi List,
>
> I tried running the following sample Java code using Spark2 version 2.0.0
> on YARN (HDP-2.5.0.0)
>
> public class
Hi List,
I tried running the following sample Java code using Spark2 version 2.0.0 on
YARN (HDP-2.5.0.0)
public class SparkSQLTest {
public static void main(String[] args) {
SparkSession sparkSession = SparkSession.builder().master("yarn")
.config("spark.sql.warehouse.dir",
You can check out
https://github.com/hortonworks-spark/spark-atlas-connector/
On Wed, 15 May 2019 at 19:44, lk_spark wrote:
> hi,all:
> When I use spark , if I run some SQL to do ETL how can I get
> lineage info. I found that , CDH spark have some config about lineage :
>
Hi,
spark.lineage.enabled is Cloudera specific and doesn't work with vanilla
Spark.
BR,
G
On Thu, May 16, 2019 at 4:44 AM lk_spark wrote:
> hi,all:
> When I use spark , if I run some SQL to do ETL how can I get
> lineage info. I found that , CDH spark have some config about lineage :
Hi all,
When I use Spark, if I run some SQL to do ETL, how can I get lineage
info? I found that CDH Spark has some config about lineage:
spark.lineage.enabled=true
spark.lineage.log.dir=/var/log/spark2/lineage
Do these also work for Apache Spark?
2019-05-16
Hi,
I have an Oracle table which has a
column with schema: DATA_DATE DATE, holding values like 31-MAR-02.
I am trying to retrieve data from Oracle using spark-sql-2.4.1. I
tried to set the JdbcOptions as below:
.option("lowerBound", "2002-03-31 00:00:00");
.option
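For context, a minimal sketch of a partitioned JDBC read on a DATE column
(date/timestamp partition columns are supported from Spark 2.4); the URL, table,
and bounds below are hypothetical:

val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL")  // hypothetical connection URL
  .option("dbtable", "MY_TABLE")
  .option("partitionColumn", "DATA_DATE")
  .option("lowerBound", "2002-03-31")
  .option("upperBound", "2019-03-31")
  .option("numPartitions", "8")
  .load()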
Hi All ,
I am using spark 2.2 in EMR cluster. I have a hive table in ORC format and
I need to create a persistent view on top of this hive table. I am using
spark sql to create the view.
By default spark sql creates the view with LazySerde. How can I change the
inputformat to use ORC ?
PFA
Hi, All,
I need to overwrite data in a Hive table and I use the following code to
do so:
df = sqlContext.sql(my-spark-sql-statement);
df.count
df.write.format("orc").mode("overwrite").saveAsTable("foo") // I also
tried 'insertInto("foo")
The "df
t 10:23 PM kant kodali wrote:
>
>> Hi All,
>>
>> Is there a way to validate the syntax of raw spark SQL query?
>>
>> for example, I would like to know if there is any isValid API call spark
>> provides?
>>
> val query = "select * from table"
> Hi All,
>
> Is there a way to validate the syntax of raw spark SQL query?
>
> for example, I would like to know if there is any isValid API call spark
> provides?
>
> val query = "select * from table"
> if (isValid(query)) {
>   sparkSession.sql(query)
> } else {
Hi All,
Is there a way to validate the syntax of raw spark SQL query?
for example, I would like to know if there is any isValid API call spark
provides?
val query = "select * from table"
if (isValid(query)) {
  sparkSession.sql(query)
} else {
  log.error("Invalid Syn
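One possible approach, sketched below: Spark's own parser can syntax-check a
statement without executing it. sessionState is an internal/unstable API, so
this may differ between versions:

import scala.util.Try
import org.apache.spark.sql.SparkSession

def isValid(spark: SparkSession, query: String): Boolean =
  Try(spark.sessionState.sqlParser.parsePlan(query)).isSuccess  // parses only, never runs the query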
Thank you so much. I tried your suggestion and it really works!
From:
"Ramandeep Singh Nanda"
To:
l...@china-inv.cn
Cc:
"Shahab Yunus" , "Tomas Bartalos"
, "user @spark/'user @spark'/spark
users/user@spark"
Date:
2019/01/26 05:42
Subject:
Re: Re: Ho
tried the suggested approach and it works, but it requires running the
> SQL statement first.
>
> I just want to parse the SQL statement without running it, so I can do
> this in my laptop without connecting to our production environment.
>
> I tried to write a tool which uses th
bundled with SPARK SQL
to extract names of the input tables and it works as expected.
But I have a question:
The parser generated by SqlBase.g4 only accepts 'select' statement with
all keywords such as 'SELECT', 'FROM' and table names capitalized
e.g. it accepts 'SELECT * FROM FOO
Thanks all for your help.
I'll try your suggestions.
Thanks again :)
From:
"Shahab Yunus"
To:
"Ramandeep Singh Nanda"
Cc:
"Tomas Bartalos" , l...@china-inv.cn, "user
@spark/'user @spark'/spark users/user@spark"
Date:
2019/01/24 06:45
Subject:
Re: Ho
this info.
Some details here:
https://jaceklaskowski.gitbooks.io/mastering-spark-sql/spark-sql-Dataset.html#queryExecution
On Wed, Jan 23, 2019 at 5:35 PM Ramandeep Singh Nanda
wrote:
> Explain extended or explain would list the plan along with the tables. Not
> aware of any statements that expl
:43 wrote:
>
>> Hi, All,
>>
>> We need to get all input tables of several SPARK SQL 'select' statements.
>>
>> We can get those information of Hive SQL statements by using 'explain
>> dependency select....'.
>> But I can't find the equivalent
This might help:
show tables;
On Wed, Jan 23, 2019 at 10:43, wrote:
> Hi, All,
>
> We need to get all input tables of several SPARK SQL 'select' statements.
>
> We can get those information of Hive SQL statements by using 'explain
> dependency select'.
> But I can'
Hi, All,
We need to get all input tables of several SPARK SQL 'select' statements.
We can get that information for Hive SQL statements by using 'explain
dependency select'.
But I can't find the equivalent command for SPARK SQL.
Does anyone know how to get this information of a SPARK SQL
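One possible approach, sketched below, is to parse the statement without running
it and collect the unresolved relations; this leans on Catalyst internals, so it
may differ between Spark versions:

import org.apache.spark.sql.catalyst.analysis.UnresolvedRelation

val stmt = "SELECT a FROM db.foo f JOIN bar b ON f.id = b.id"   // example statement
val plan = spark.sessionState.sqlParser.parsePlan(stmt)
val inputTables = plan.collect { case r: UnresolvedRelation => r.tableName }
// inputTables == Seq("db.foo", "bar")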
> hand:
> a. Turn on 'Hive on spark' feature and run HQLs and
> b. Run those query statements with spark SQL
>
> What the difference between these options?
>
> Another question is:
> There is a hive setting 'hive.optimize.ppd' to enable 'predicate pushdown'
> query optimi
want to improve the performance of these queries and have two options
at hand:
a. Turn on 'Hive on spark' feature and run HQLs and
b. Run those query statements with spark SQL
What is the difference between these options?
Another question is:
There is a hive setting 'hive.optimize.ppd' to enable
Sent: 2018-11-29 (Thu) 7:55
To: "user"
Subject: Java: pass parameters in spark sql query
Hello there,
I am trying to pass parameters in spark.sql query in Java code, the same as in
this link
https://forums.databricks.com/questions/115/how-do-i-pass-parame
That's string interpolation. You could create your own, for example :bind,
and then do replaceAll to replace the named parameter.
On Wed, Nov 28, 2018, 18:55 Mann Du wrote:
> Hello there,
>
> I am trying to pass parameters in spark.sql query in Java code, the same
> as in this link
>
>
Hello there,
I am trying to pass parameters in spark.sql query in Java code, the same
as in this link
https://forums.databricks.com/questions/115/how-do-i-pass-parameters-to-my-sql-statements.html
The link suggested to use 's' before 'select' as -
val param = 100
spark.sql(s""" select * from
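A minimal sketch of the interpolation approach in Scala (the table and column
are hypothetical); Java has no string interpolation, so String.format is one
substitute:

val param = 100
val df = spark.sql(s"SELECT * FROM records WHERE value < $param")
// Java equivalent:
// Dataset<Row> df = spark.sql(String.format("SELECT * FROM records WHERE value < %d", 100));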
Class Test object. So will this work
in Spark SQL after using the SQLUserDefinedType tag and extending the
UserDefinedType class?
As UserDefinedType is private in Spark 2.0, I just want to know if UDT
is supported in Spark 2.3+. If yes, which is better to use: UserDefinedType or
UDTRegisteration
st UDFMethod(string name, int age){
>
>Test ob = new Test();
>
>ob.name = name;
>
>ob.age = age;
>
> }
>
> Sample Spark query- `Select *, UDFMethod(name, age) From SomeTable;`
>
> Now UDFMethod(name, age) will return Class Test object.
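For what it's worth, a UDF that returns a Product type (a case class) works
without touching the UDT machinery, because case classes come back as struct
columns. A minimal sketch reusing the names from the example above:

case class Test(name: String, age: Int)

spark.udf.register("UDFMethod", (name: String, age: Int) => Test(name, age))
// `test` comes back as a struct column with fields name and age:
spark.sql("SELECT *, UDFMethod(name, age) AS test FROM SomeTable").printSchema()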
Not sure what you mean about “raw” Spark sql, but there is one parameter which
affects whether the optimizer chooses a broadcast join automatically or not:
spark.sql.autoBroadcastJoinThreshold
You can read the Spark doc about the above parameter setting and use explain to check
whether your join uses broadcast
Hi All,
How to do a broadcast join using raw Spark SQL 2.3.1 or 2.3.2?
Thanks
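In raw SQL the broadcast hint syntax (available since Spark 2.2) is probably
what you want; the table names below are hypothetical:

spark.sql("""
  SELECT /*+ BROADCAST(small_table) */ l.*, s.label
  FROM large_table l
  JOIN small_table s ON l.id = s.id
""")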
Not sure I get what you mean….
I ran the query that you had – and don’t get the same hash as you.
From: Gokula Krishnan D
Date: Friday, September 28, 2018 at 10:40 AM
To: "Thakrar, Jayesh"
Cc: user
Subject: Re: [Spark SQL] why spark sql hash() are returns the same hash value
thoug
> 4589)|hash(40004)|
> +-----------+-----------+
> |  777096871|-1593820563|
> +-----------+-----------+
>
>
>
>
>
> scala>
>
>
>
> *From: *Gokula Krishnan D
> *Date: *Tuesday, September 25, 2018 at 8:57 PM
> *To: *user
Date: Tuesday, September 25, 2018 at 8:57 PM
To: user
Subject: [Spark SQL] why spark sql hash() are returns the same hash value
though the keys/expr are not same
Hello All,
I am calculating the hash value of few columns and determining whether its an
Insert/Delete/Update Record but found a s
Hello All,
I am calculating the hash value of a few columns and determining whether it's
an Insert/Delete/Update record, but I found a scenario which is a little weird:
some of the records return the same hash value though the keys are
totally different.
For instance,
scala> spark.sql("select
You can use the Spark dataframe 'when'/'otherwise' clause to replace the SQL case
statement.
This piece will need to be calculated beforehand -
'select student_id from tbl_student where candidate_id = c.candidate_id and
approval_id = 2
and academic_start_date is null'
Take the count of the above DF after
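A minimal sketch of the when/otherwise pattern mentioned above; df stands for
the source DataFrame, and the column names and literal results are hypothetical
stand-ins for the original CASE logic:

import org.apache.spark.sql.functions.{col, when, lit}

val out = df.withColumn("student_flag",
  when(col("approval_id") === 2 && col("academic_start_date").isNull, lit("pending"))
    .otherwise(lit("approved")))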
Dear Spark Users,
I came across a little weird MSSQL query to replace with Spark and I have
no clue how to do it in an efficient way with Scala + SparkSQL. Can someone
please throw some light on it? I can create a view of the DataFrame and do it as
*spark.sql *(query)
but I would like to do it with Scala + Spark
ted the code.
Regards,
Nikita
On Tue, Aug 28, 2018 at 2:34 PM kant kodali wrote:
> Hi All,
>
> How do I generate current UTC timestamp using spark sql?
>
> When I do curent_timestamp() it is giving me local time.
>
> to_utc_timestamp(current_time(), ) takes timezone i
Hi All,
How do I generate current UTC timestamp using spark sql?
When I do current_timestamp() it is giving me local time.
to_utc_timestamp(current_time(), ) takes timezone in the second
parameter and I see no udf that can give me current timezone.
when I do
spark.conf.set
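A couple of options, sketched under the assumption of Spark 2.2+ where the
session time zone setting exists:

// Render timestamps in UTC by switching the session time zone:
spark.conf.set("spark.sql.session.timeZone", "UTC")
spark.sql("SELECT current_timestamp() AS utc_now").show(false)

// Or convert explicitly, passing the JVM's current zone to to_utc_timestamp:
val tz = java.util.TimeZone.getDefault.getID
spark.sql(s"SELECT to_utc_timestamp(current_timestamp(), '$tz') AS utc_now").show(false)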
https://docs.databricks.com/spark/latest/spark-sql/skew-join.html
The above might help, in case you are using a join.
On Mon, Jul 23, 2018 at 4:49 AM, 崔苗 wrote:
> but how to get count(distinct userId) group by company from count(distinct
> userId) group by company+x?
> cou
Hi All,
is there a way to parse and modify raw spark sql query?
For example, given the following query
spark.sql("select hello from view")
I want to modify the query or logical plan such that I can get the result
equivalent to the below query.
spark.sql("select foo, hello f
Hi,
I want to know when I create a dataset by reading files in hdfs in spark sql,
like : Dataset user = spark.read().format("json").load(filePath) , what
defines the partition number of the dataset?
And what if the filePath is a directory instead of a single file?
Why we can't get the
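A minimal sketch for inspecting and influencing the number of read partitions;
the path is hypothetical:

// For file sources the split size is governed by spark.sql.files.maxPartitionBytes
// (128 MB by default), so the partition count roughly follows input size / split size:
spark.conf.set("spark.sql.files.maxPartitionBytes", (64 * 1024 * 1024).toString)
val users = spark.read.format("json").load("/data/users/")   // a directory of files works too
println(users.rdd.getNumPartitions)                          // partitions produced by the file scan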
xecution engine of Spark Streaming
> (DStream API): Does Spark streaming jobs run over the Spark SQL engine?
>
> For example, if I change a configuration parameter related to Spark SQL
> (like spark.sql.streaming.minBatchesToRetain or
> spark.sql.objectHashAggregate.sortBased.fallbackTh
Hi,
I have a question regarding the execution engine of Spark Streaming
(DStream API): Does Spark streaming jobs run over the Spark SQL engine?
For example, if I change a configuration parameter related to Spark SQL
(like spark.sql.streaming.minBatchesToRetain
with
1 vs 5 stages for the very same query plan (even if I changed number of
executors or number of cores or anything execution-related).
So my question is, is this possible that Spark SQL could give 1-stage
execution plan and 5-stage execution plan for the very same query?
(I am not saying
Please help me with the below error and suggest a different approach for the
data manipulation below.
Error:Unable to find encoder for type stored in a Dataset. Primitive types
(Int, String, etc) and Product types (case classes) are supported by
importing spark.implicits._ Support for serializing other
I am using spark to run a merge query in postgres sql.
The way it's being done now is to save the data to be merged in postgres as
temp tables.
Then the merge queries are run in postgres using a java sql connection and
statement.
So basically this query runs in postgres.
The queries are insert into source
Ok, this one works:
.withColumn("hour", hour(from_unixtime(typedDataset.col("ts") / 1000)))
2018-03-20 22:43 GMT+01:00 Serega Sheypak :
> Hi, any updates? Looks like some API inconsistency or bug..?
>
> 2018-03-17 13:09 GMT+01:00 Serega Sheypak
Hi, any updates? Looks like some API inconsistency or bug..?
2018-03-17 13:09 GMT+01:00 Serega Sheypak :
> > Not sure why you are dividing by 1000. from_unixtime expects a long type
> It expects seconds, I have milliseconds.
>
>
>
> 2018-03-12 6:16 GMT+01:00 vermanurag
> Not sure why you are dividing by 1000. from_unixtime expects a long type
It expects seconds, I have milliseconds.
2018-03-12 6:16 GMT+01:00 vermanurag :
> Not sure why you are dividing by 1000. from_unixtime expects a long type
> which is time in milliseconds
Not sure why you are dividing by 1000. from_unixtime expects a long type
which is time in milliseconds from reference date.
The following should work:
val ds = dataset.withColumn("hour",hour(from_unixtime(dataset.col("ts"
Hi, I'm desperately trying to extract the hour from unix seconds.
The year, month, and dayofmonth functions work as expected.
The hour function always returns 0.
val ds = dataset
.withColumn("year", year(to_date(from_unixtime(dataset.col("ts") / 1000
.withColumn("month",
Hi All,
My spark Configuration is following.
spark = SparkSession.builder.master(mesos_ip) \
.config('spark.executor.cores','3')\
.config('spark.executor.memory','8g')\
.config('spark.es.scroll.size','1')\
.config('spark.network.timeout','600s')\
a similar fashion I want
to see how I can create a new Column using the raw sql. I am looking at
this reference https://docs.databricks.com/spark/latest/spark-sql/index.html
and I am not seeing a way.
Thanks!
On Thu, Feb 1, 2018 at 4:01 AM, Jean Georges Perrin <j...@jgp.net> wrote:
> Sur
Sure, use withColumn()...
jg
> On Feb 1, 2018, at 05:50, kant kodali wrote:
>
> Hi All,
>
> Is there any way to create a new timeuuid column of a existing dataframe
> using raw sql? you can assume that there is a timeuuid udf function if that
> helps.
>
> Thanks!
Hi All,
Is there any way to create a new timeuuid column of a existing dataframe
using raw sql? you can assume that there is a timeuuid udf function if that
helps.
Thanks!
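A minimal sketch, assuming a UDF is registered first; the generator below is a
plain random UUID stand-in rather than a real time-based UUID:

spark.udf.register("timeuuid", () => java.util.UUID.randomUUID().toString)  // stand-in generator
df.createOrReplaceTempView("t")                                             // df is the existing DataFrame
val withId = spark.sql("SELECT *, timeuuid() AS event_uuid FROM t")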
onvert a string with decimal value to decimal in Spark Sql
> and load it into Hive/Sql Server.
>
> In Hive instead of getting converted to decimal all my values are coming
> as null.
>
> In Sql Server instead of getting decimal values are coming without
> precision
>
> Can
Hi Experts,
I am trying to convert a string with a decimal value to decimal in Spark SQL
and load it into Hive/SQL Server.
In Hive, instead of getting converted to decimal, all my values are coming out as
null.
In SQL Server, instead of decimals, the values are coming out without precision.
Can you please
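A minimal sketch of an explicit cast with precision and scale; values that do
not fit the target precision come back as null, which is one common cause of the
Hive symptom. The names below are hypothetical:

spark.sql("SELECT CAST(amount_str AS DECIMAL(18,4)) AS amount FROM staging_table")
// DataFrame API equivalent:
// df.withColumn("amount", col("amount_str").cast("decimal(18,4)"))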
Hi Umar,
While this answer is a bit dated, you may find it useful in diagnosing a
store for Spark SQL tables:
https://stackoverflow.com/a/39753976/3723346
I don't know much about Pentaho or Arcadia, but I assume many of the listed
options have a JDBC or ODBC client.
Hope this helps,
Pierce
Hi All,
We are currently looking at real-time streaming analytics of data stored as
Spark SQL tables. Is there any external connectivity available to connect
with BI tools (Pentaho/Arcadia)?
Currently, we are storing data in Hive tables, but the response on the
Arcadia dashboard is slow
>
> Can anyone kindly advice how to dump the spark SQL jobs for audit? Just
> like the one for the MapReduce jobs (https://hadoop.apache.org/
> docs/current/hadoop-yarn/hadoop-yarn-site/WebServicesIntro.html).
>
> Thanks again,
> Wenxing
>
#rest-api), I
still can't get the jobs for a given application with the endpoint:
*/applications/[app-id]/jobs*
Can anyone kindly advise how to dump the spark SQL jobs for audit? Just
like the one for the MapReduce jobs (
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site