[jira] [Comment Edited] (SPARK-13699) Spark SQL drops the table in "overwrite" mode while writing into table

2018-04-24 Thread Manish Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16449684#comment-16449684
 ] 

Manish Kumar edited comment on SPARK-13699 at 4/24/18 11:50 AM:


I am not sure whether the issue has been resolved. But as a workaround, I have 
used JDBC to drop and recreate the table, and then saved the data using append mode.
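
A minimal sketch of what such a workaround can look like (the JDBC URL, credentials, and DDL below are placeholders, assuming a HiveServer2 endpoint with the Hive JDBC driver on the classpath; tgtFinal is the dataframe being saved):

import java.sql.DriverManager
import org.apache.spark.sql.SaveMode

// Recreate the target table over JDBC so Spark never has to drop it.
val conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "user", "")
val stmt = conn.createStatement()
try {
  stmt.execute("DROP TABLE IF EXISTS tgt_table")
  stmt.execute("CREATE TABLE tgt_table (col1 STRING) STORED AS ORC") // placeholder DDL
} finally {
  stmt.close()
  conn.close()
}

// The freshly created table is empty, so appending to it amounts to an
// overwrite without saveAsTable's drop-and-recreate step.
tgtFinal.write.mode(SaveMode.Append).saveAsTable("tgt_table")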


was (Author: mkbond777):
I am not sure whether the issue is resolved or not. But as a workaround, I have 
used JDBC to drop the table and then saved data using SAVE mode.

> Spark SQL drops the table in "overwrite" mode while writing into table
> --
>
> Key: SPARK-13699
> URL: https://issues.apache.org/jira/browse/SPARK-13699
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: Dhaval Modi
>Priority: Major
> Attachments: stackTrace.txt
>
>
> Hi,
> When writing a dataframe to a Hive table with the "SaveMode.Overwrite" option, e.g.
> tgtFinal.write.mode(SaveMode.Overwrite).saveAsTable("tgt_table")
> sqlContext drops the table instead of truncating it.
> This causes an error while overwriting.
> Attaching a stacktrace & commands to reproduce the issue.
> Thanks & Regards,
> Dhaval



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-13699) Spark SQL drops the table in "overwrite" mode while writing into table

2016-03-05 Thread Dhaval Modi (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181780#comment-15181780
 ] 

Dhaval Modi edited comment on SPARK-13699 at 3/5/16 5:30 PM:
-

== Code Snippet ===
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
import sqlContext.implicits._ // for the $"a.col1" column syntax used below
val src = sqlContext.sql("select * from src_table")
val tgt = sqlContext.sql("select * from tgt_table")

var tgtFinal=tgt.filter("currind = 'N'"); //Add to final table
val tgtActive=tgt.filter("currind = 'Y'");



// src.select("col1").except(src.select("col1").as('a).join(tgtActive.select("col1").as('b),"col1"))


val newTgt1 = tgtActive.as('a).join(src.as('b),$"a.col1" === $"b.col1")

// val newTgt2 = tgtActive.except(newTgt1.select("a.*"));
tgtFinal = tgtFinal.unionAll(tgtActive.except(newTgt1.select("a.*")));

var srcInsert = src.except(newTgt1.select("b.*"))

import org.apache.spark.sql._
import org.apache.spark.sql.functions._ // udf, col, current_timestamp, date_format

// UDFs stamping the batch id and the SCD-style current-row indicator.
val inBatchID = udf((t: String) => "13")
val inCurrInd = udf((t: String) => "Y")
val NCurrInd = udf((t: String) => "N")
val endDate = udf((t: String) => "9999-12-31 23:59:59") // assumed open-ended end-date sentinel

// Close out the matched target rows: mark them non-current and stamp end/update timestamps.
tgtFinal = tgtFinal.unionAll(newTgt1.select("a.*")
  .withColumn("currInd", NCurrInd(col("col1")))
  .withColumn("endDate", current_timestamp())
  .withColumn("updateDate", current_timestamp()))

// Prepare the incoming source rows as the new current records.
srcInsert = src.withColumn("batchId", inBatchID(col("col1")))
  .withColumn("currInd", inCurrInd(col("col1")))
  .withColumn("startDate", current_timestamp())
  .withColumn("endDate", date_format(endDate(col("col1")), "yyyy-MM-dd HH:mm:ss"))
  .withColumn("updateDate", current_timestamp())

tgtFinal = tgtFinal.unionAll(srcInsert)

tgtFinal.write.mode(SaveMode.Overwrite).saveAsTable(tgt_table)

=== Code Snippet =
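
A possible alternative to the last line (a sketch, untested on 1.6): writing with insertInto issues an INSERT OVERWRITE against the existing table, replacing the data while keeping the table definition, whereas saveAsTable drops and recreates the table. It assumes tgt_table already exists and that tgtFinal's columns match the table's column order.

// Overwrite the data in place; the table itself is never dropped.
tgtFinal.write.mode(SaveMode.Overwrite).insertInto("tgt_table")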


was (Author: mysti):
== Code Snippet ===
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
import sqlContext.implicits._ // for the $"a.col1" column syntax used below
val src = sqlContext.sql("select * from src_table")
val tgt = sqlContext.sql("select * from tgt_table")

var tgtFinal=tgt.filter("currind = 'N'"); //Add to final table
val tgtActive=tgt.filter("currind = 'Y'");



// src.select("col1").except(src.select("col1").as('a).join(tgtActive.select("col1").as('b),"col1"))


val newTgt1 = tgtActive.as('a).join(src.as('b),$"a.col1" === $"b.col1")

// val newTgt2 = tgtActive.except(newTgt1.select("a.*"));
tgtFinal = tgtFinal.unionAll(tgtActive.except(newTgt1.select("a.*")));

var srcInsert = src.except(newTgt1.select("b.*"))

import org.apache.spark.sql._
import org.apache.spark.sql.functions._ // udf, col, current_timestamp, date_format

// UDFs stamping the batch id and the SCD-style current-row indicator.
val inBatchID = udf((t: String) => "13")
val inCurrInd = udf((t: String) => "Y")
val NCurrInd = udf((t: String) => "N")
val endDate = udf((t: String) => "9999-12-31 23:59:59") // assumed open-ended end-date sentinel

// Close out the matched target rows: mark them non-current and stamp end/update timestamps.
tgtFinal = tgtFinal.unionAll(newTgt1.select("a.*")
  .withColumn("currInd", NCurrInd(col("col1")))
  .withColumn("endDate", current_timestamp())
  .withColumn("updateDate", current_timestamp()))

// Prepare the incoming source rows as the new current records.
srcInsert = src.withColumn("batchId", inBatchID(col("col1")))
  .withColumn("currInd", inCurrInd(col("col1")))
  .withColumn("startDate", current_timestamp())
  .withColumn("endDate", date_format(endDate(col("col1")), "yyyy-MM-dd HH:mm:ss"))
  .withColumn("updateDate", current_timestamp())

tgtFinal = tgtFinal.unionAll(srcInsert)

tgtFinal.write().mode(SaveMode.Overwrite).saveAsTable(tgt_table)

=== Code Snippet =

> Spark SQL drops the table in "overwrite" mode while writing into table
> --
>
> Key: SPARK-13699
> URL: https://issues.apache.org/jira/browse/SPARK-13699
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: Dhaval Modi
> Attachments: stackTrace.txt
>
>
> Hi,
> When writing a dataframe to a Hive table with the "SaveMode.Overwrite" option, e.g.
> tgtFinal.write.mode(SaveMode.Overwrite).saveAsTable("tgt_table")
> sqlContext drops the table instead of truncating it.
> This causes an error while overwriting.
> Attaching a stacktrace & commands to reproduce the issue.
> Thanks & Regards,
> Dhaval



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-13699) Spark SQL drops the table in "overwrite" mode while writing into table

2016-03-05 Thread Dhaval Modi (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181780#comment-15181780
 ] 

Dhaval Modi edited comment on SPARK-13699 at 3/5/16 5:30 PM:
-

== Code Snippet ===
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
import sqlContext.implicits._ // for the $"a.col1" column syntax used below
val src = sqlContext.sql("select * from src_table")
val tgt = sqlContext.sql("select * from tgt_table")

var tgtFinal=tgt.filter("currind = 'N'"); //Add to final table
val tgtActive=tgt.filter("currind = 'Y'");



// src.select("col1").except(src.select("col1").as('a).join(tgtActive.select("col1").as('b),"col1"))


val newTgt1 = tgtActive.as('a).join(src.as('b),$"a.col1" === $"b.col1")

// val newTgt2 = tgtActive.except(newTgt1.select("a.*"));
tgtFinal = tgtFinal.unionAll(tgtActive.except(newTgt1.select("a.*")));

var srcInsert = src.except(newTgt1.select("b.*"))

import org.apache.spark.sql._
import org.apache.spark.sql.functions._ // udf, col, current_timestamp, date_format

// UDFs stamping the batch id and the SCD-style current-row indicator.
val inBatchID = udf((t: String) => "13")
val inCurrInd = udf((t: String) => "Y")
val NCurrInd = udf((t: String) => "N")
val endDate = udf((t: String) => "9999-12-31 23:59:59") // assumed open-ended end-date sentinel

// Close out the matched target rows: mark them non-current and stamp end/update timestamps.
tgtFinal = tgtFinal.unionAll(newTgt1.select("a.*")
  .withColumn("currInd", NCurrInd(col("col1")))
  .withColumn("endDate", current_timestamp())
  .withColumn("updateDate", current_timestamp()))

// Prepare the incoming source rows as the new current records.
srcInsert = src.withColumn("batchId", inBatchID(col("col1")))
  .withColumn("currInd", inCurrInd(col("col1")))
  .withColumn("startDate", current_timestamp())
  .withColumn("endDate", date_format(endDate(col("col1")), "yyyy-MM-dd HH:mm:ss"))
  .withColumn("updateDate", current_timestamp())

tgtFinal = tgtFinal.unionAll(srcInsert)

tgtFinal.write.mode(SaveMode.Overwrite).saveAsTable("tgt_table")

=== Code Snippet =


was (Author: mysti):
== Code Snippet ===
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
import sqlContext.implicits._ // for the $"a.col1" column syntax used below
val src = sqlContext.sql("select * from src_table")
val tgt = sqlContext.sql("select * from tgt_table")

var tgtFinal=tgt.filter("currind = 'N'"); //Add to final table
val tgtActive=tgt.filter("currind = 'Y'");



// src.select("col1").except(src.select("col1").as('a).join(tgtActive.select("col1").as('b),"col1"))


val newTgt1 = tgtActive.as('a).join(src.as('b),$"a.col1" === $"b.col1")

// val newTgt2 = tgtActive.except(newTgt1.select("a.*"));
tgtFinal = tgtFinal.unionAll(tgtActive.except(newTgt1.select("a.*")));

var srcInsert = src.except(newTgt1.select("b.*"))

import org.apache.spark.sql._
import org.apache.spark.sql.functions._ // udf, col, current_timestamp, date_format

// UDFs stamping the batch id and the SCD-style current-row indicator.
val inBatchID = udf((t: String) => "13")
val inCurrInd = udf((t: String) => "Y")
val NCurrInd = udf((t: String) => "N")
val endDate = udf((t: String) => "9999-12-31 23:59:59") // assumed open-ended end-date sentinel

// Close out the matched target rows: mark them non-current and stamp end/update timestamps.
tgtFinal = tgtFinal.unionAll(newTgt1.select("a.*")
  .withColumn("currInd", NCurrInd(col("col1")))
  .withColumn("endDate", current_timestamp())
  .withColumn("updateDate", current_timestamp()))

// Prepare the incoming source rows as the new current records.
srcInsert = src.withColumn("batchId", inBatchID(col("col1")))
  .withColumn("currInd", inCurrInd(col("col1")))
  .withColumn("startDate", current_timestamp())
  .withColumn("endDate", date_format(endDate(col("col1")), "yyyy-MM-dd HH:mm:ss"))
  .withColumn("updateDate", current_timestamp())

tgtFinal = tgtFinal.unionAll(srcInsert)

tgtFinal.write.mode(SaveMode.Overwrite).saveAsTable(tgt_table)

=== Code Snippet =

> Spark SQL drops the table in "overwrite" mode while writing into table
> --
>
> Key: SPARK-13699
> URL: https://issues.apache.org/jira/browse/SPARK-13699
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: Dhaval Modi
> Attachments: stackTrace.txt
>
>
> Hi,
> When writing a dataframe to a Hive table with the "SaveMode.Overwrite" option, e.g.
> tgtFinal.write.mode(SaveMode.Overwrite).saveAsTable("tgt_table")
> sqlContext drops the table instead of truncating it.
> This causes an error while overwriting.
> Attaching a stacktrace & commands to reproduce the issue.
> Thanks & Regards,
> Dhaval



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-13699) Spark SQL drops the table in "overwrite" mode while writing into table

2016-03-05 Thread Dhaval Modi (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181780#comment-15181780
 ] 

Dhaval Modi edited comment on SPARK-13699 at 3/5/16 5:29 PM:
-

== Code Snippet ===
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
import sqlContext.implicits._ // for the $"a.col1" column syntax used below
val src = sqlContext.sql("select * from src_table")
val tgt = sqlContext.sql("select * from tgt_table")

var tgtFinal=tgt.filter("currind = 'N'"); //Add to final table
val tgtActive=tgt.filter("currind = 'Y'");



// src.select("col1").except(src.select("col1").as('a).join(tgtActive.select("col1").as('b),"col1"))


val newTgt1 = tgtActive.as('a).join(src.as('b),$"a.col1" === $"b.col1")

// val newTgt2 = tgtActive.except(newTgt1.select("a.*"));
tgtFinal = tgtFinal.unionAll(tgtActive.except(newTgt1.select("a.*")));

var srcInsert = src.except(newTgt1.select("b.*"))

import org.apache.spark.sql._
import org.apache.spark.sql.functions._ // udf, col, current_timestamp, date_format

// UDFs stamping the batch id and the SCD-style current-row indicator.
val inBatchID = udf((t: String) => "13")
val inCurrInd = udf((t: String) => "Y")
val NCurrInd = udf((t: String) => "N")
val endDate = udf((t: String) => "9999-12-31 23:59:59") // assumed open-ended end-date sentinel

// Close out the matched target rows: mark them non-current and stamp end/update timestamps.
tgtFinal = tgtFinal.unionAll(newTgt1.select("a.*")
  .withColumn("currInd", NCurrInd(col("col1")))
  .withColumn("endDate", current_timestamp())
  .withColumn("updateDate", current_timestamp()))

// Prepare the incoming source rows as the new current records.
srcInsert = src.withColumn("batchId", inBatchID(col("col1")))
  .withColumn("currInd", inCurrInd(col("col1")))
  .withColumn("startDate", current_timestamp())
  .withColumn("endDate", date_format(endDate(col("col1")), "yyyy-MM-dd HH:mm:ss"))
  .withColumn("updateDate", current_timestamp())

tgtFinal = tgtFinal.unionAll(srcInsert)

tgtFinal.write().mode(SaveMode.Overwrite).saveAsTable(tgt_table)

=== Code Snippet =


was (Author: mysti):

== Code Snippet ===
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
import sqlContext.implicits._ // for the $"a.col1" column syntax used below
val src = sqlContext.sql("select * from src_table")
val tgt = sqlContext.sql("select * from tgt_table")

var tgtFinal=tgt.filter("currind = 'N'"); //Add to final table
val tgtActive=tgt.filter("currind = 'Y'");



// src.select("col1").except(src.select("col1").as('a).join(tgtActive.select("col1").as('b),"col1"))


val newTgt1 = tgtActive.as('a).join(src.as('b),$"a.col1" === $"b.col1")

// val newTgt2 = tgtActive.except(newTgt1.select("a.*"));
tgtFinal = tgtFinal.unionAll(tgtActive.except(newTgt1.select("a.*")));

var srcInsert = src.except(newTgt1.select("b.*"))

import org.apache.spark.sql._
import org.apache.spark.sql.functions._ // udf, col, current_timestamp, date_format

// UDFs stamping the batch id and the SCD-style current-row indicator.
val inBatchID = udf((t: String) => "13")
val inCurrInd = udf((t: String) => "Y")
val NCurrInd = udf((t: String) => "N")
val endDate = udf((t: String) => "9999-12-31 23:59:59") // assumed open-ended end-date sentinel

// Close out the matched target rows: mark them non-current and stamp end/update timestamps.
tgtFinal = tgtFinal.unionAll(newTgt1.select("a.*")
  .withColumn("currInd", NCurrInd(col("col1")))
  .withColumn("endDate", current_timestamp())
  .withColumn("updateDate", current_timestamp()))

// Prepare the incoming source rows as the new current records.
srcInsert = src.withColumn("batchId", inBatchID(col("col1")))
  .withColumn("currInd", inCurrInd(col("col1")))
  .withColumn("startDate", current_timestamp())
  .withColumn("endDate", date_format(endDate(col("col1")), "yyyy-MM-dd HH:mm:ss"))
  .withColumn("updateDate", current_timestamp())

tgtFinal = tgtFinal.unionAll(srcInsert)

tgtFinal.write().mode(SaveMode.Append).saveAsTable(tgt_table)

=== Code Snippet =

> Spark SQL drops the table in "overwrite" mode while writing into table
> --
>
> Key: SPARK-13699
> URL: https://issues.apache.org/jira/browse/SPARK-13699
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: Dhaval Modi
> Attachments: stackTrace.txt
>
>
> Hi,
> When writing a dataframe to a Hive table with the "SaveMode.Overwrite" option, e.g.
> tgtFinal.write.mode(SaveMode.Overwrite).saveAsTable("tgt_table")
> sqlContext drops the table instead of truncating it.
> This causes an error while overwriting.
> Attaching a stacktrace & commands to reproduce the issue.
> Thanks & Regards,
> Dhaval



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org