[GitHub] [carbondata] ajantha-bhat commented on pull request #3771: [CARBONDATA-3849] pushdown array_contains filter to carbon for array of primitive types

2020-06-11 Thread GitBox


ajantha-bhat commented on pull request #3771:
URL: https://github.com/apache/carbondata/pull/3771#issuecomment-642476599


   retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
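
For context, a minimal sketch of the kind of query this PR targets (table, data,
and names here are illustrative, not taken from the PR): with this change, an
array_contains filter on an array-of-primitive-type column can be evaluated
during the carbon scan instead of row by row on the Spark side.

{code}
// Hypothetical example table; names are illustrative only.
spark.sql("create table arr_tab (id int, vals array<int>) stored as carbondata")
spark.sql("insert into arr_tab select 1, array(1, 2, 3)")
spark.sql("insert into arr_tab select 2, array(4, 5)")
// The array_contains predicate is the filter the PR pushes down to carbon
// for arrays of primitive types.
spark.sql("select id from arr_tab where array_contains(vals, 2)").show()
{code}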




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3771: [CARBONDATA-3849] pushdown array_contains filter to carbon for array of primitive types

2020-06-11 Thread GitBox


CarbonDataQA1 commented on pull request #3771:
URL: https://github.com/apache/carbondata/pull/3771#issuecomment-642548570


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3145/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3771: [CARBONDATA-3849] pushdown array_contains filter to carbon for array of primitive types

2020-06-11 Thread GitBox


CarbonDataQA1 commented on pull request #3771:
URL: https://github.com/apache/carbondata/pull/3771#issuecomment-642548932


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1421/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3789: [WIP] test

2020-06-11 Thread GitBox


CarbonDataQA1 commented on pull request #3789:
URL: https://github.com/apache/carbondata/pull/3789#issuecomment-642657316


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1422/
   







[jira] [Created] (CARBONDATA-3851) Merge Update and Insert with Partition Table is giving different results in different spark deploy modes

2020-06-11 Thread Sachin Ramachandra Setty (Jira)
Sachin Ramachandra Setty created CARBONDATA-3851:


 Summary: Merge Update and Insert with Partition Table is giving 
different results in different spark deploy modes
 Key: CARBONDATA-3851
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3851
 Project: CarbonData
  Issue Type: Bug
  Components: spark-integration
Affects Versions: 2.0.0
Reporter: Sachin Ramachandra Setty


The result sets are different when the queries are run in spark-shell --master local 
and spark-shell --master yarn (two different Spark deploy modes).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3789: [WIP] test

2020-06-11 Thread GitBox


CarbonDataQA1 commented on pull request #3789:
URL: https://github.com/apache/carbondata/pull/3789#issuecomment-642659075


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3146/
   







[GitHub] [carbondata] kunal642 commented on a change in pull request #3772: [CARBONDATA-3832]Added block and blocket pruning for the polygon expression processing

2020-06-11 Thread GitBox


kunal642 commented on a change in pull request #3772:
URL: https://github.com/apache/carbondata/pull/3772#discussion_r438818433



##
File path: geo/src/main/java/org/apache/carbondata/geo/scan/filter/executor/PolygonFilterExecutorImpl.java
##
@@ -0,0 +1,84 @@
+package org.apache.carbondata.geo.scan.filter.executor;

Review comment:
   Add header
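
   (For reference: in Apache projects this review comment conventionally asks
   for the standard ASF license header at the top of the new source file,
   i.e. the following comment block.)

{code}
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
{code}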









[jira] [Updated] (CARBONDATA-3851) Merge Update and Insert with Partition Table is giving different results in different spark deploy modes

2020-06-11 Thread Sachin Ramachandra Setty (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sachin Ramachandra Setty updated CARBONDATA-3851:
-
Description: 
The result sets are different when the queries are run in spark-shell --master 
local and spark-shell --master yarn (two different Spark deploy modes).

Steps to Reproduce the Issue:

{code}
import scala.collection.JavaConverters._
import java.sql.Date
import org.apache.spark.sql._
import org.apache.spark.sql.CarbonSession._
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.execution.command.mutation.merge.
{CarbonMergeDataSetCommand, DeleteAction, InsertAction, 
InsertInHistoryTableAction, MergeDataSetMatches, MergeMatch, UpdateAction, 
WhenMatched, WhenNotMatched, WhenNotMatchedAndExistsOnlyOnTarget}

import org.apache.spark.sql.functions._
import org.apache.spark.sql.test.util.QueryTest
import org.apache.spark.sql.types.
{BooleanType, DateType, IntegerType, StringType, StructField, StructType}
import spark.implicits._

sql("drop table if exists order").show()
sql("drop table if exists order_hist").show()
sql("create table order_hist(id string, name string, quantity int, price int, 
state int) PARTITIONED BY (c_name String) STORED AS carbondata").show()

val initframe = sc.parallelize(1 to 10, 4).map
{ x => ("id"+x, s"order$x",s"customer$x", x*10, x*75, 1)}
.toDF("id", "name", "c_name", "quantity", "price", "state")

initframe.write
 .format("carbondata")
 .option("tableName", "order")
 .option("partitionColumns", "c_name")
 .mode(SaveMode.Overwrite)
 .save()

val dwframe = spark.read.format("carbondata").option("tableName", 
"order").load()
val dwSelframe = dwframe.as("A")

val ds1 = sc.parallelize(3 to 10, 4)
 .map {x =>
 if (x <= 4)
{ ("id"+x, s"order$x",s"customer$x", x*10, x*75, 2) }
else
{ ("id"+x, s"order$x",s"customer$x", x*10, x*75, 1) }
}.toDF("id", "name", "c_name", "quantity", "price", "state")

ds1.show()

 val ds2 = sc.parallelize(1 to 2, 4)
 .map
{x => ("newid"+x, s"order$x",s"customer$x", x*10, x*75, 1) }
.toDS().toDF()

 ds2.show()

 val ds3 = ds1.union(ds2)

 ds3.show()

val odsframe = ds3.as("B")

var matches = Seq.empty[MergeMatch]
 val updateMap = Map(col("id") -> col("A.id"),
 col("price") -> expr("B.price + 1"),
 col("state") -> col("B.state"))

val insertMap = Map(col("id") -> col("B.id"),
 col("name") -> col("B.name"),
 col("c_name") -> col("B.c_name"),
 col("quantity") -> col("B.quantity"),
 col("price") -> expr("B.price * 100"),
 col("state") -> col("B.state"))

val insertMap_u = Map(col("id") -> col("id"),
 col("name") -> col("name"),
 col("c_name") -> lit("insert"),
 col("quantity") -> col("quantity"),
 col("price") -> expr("price"),
 col("state") -> col("state"))

val insertMap_d = Map(col("id") -> col("id"),
 col("name") -> col("name"),
 col("c_name") -> lit("delete"),
 col("quantity") -> col("quantity"),
 col("price") -> expr("price"),
 col("state") -> col("state"))

matches ++= Seq(WhenMatched(Some(col("A.state") =!= col("B.state")))
  .addAction(UpdateAction(updateMap))
  .addAction(InsertInHistoryTableAction(insertMap_u, TableIdentifier("order_hist"))))
matches ++= Seq(WhenNotMatched().addAction(InsertAction(insertMap)))
matches ++= Seq(WhenNotMatchedAndExistsOnlyOnTarget()
  .addAction(DeleteAction())
  .addAction(InsertInHistoryTableAction(insertMap_d, TableIdentifier("order_hist"))))
{code}
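
The description stops before the merge is actually executed; presumably the matches are then run along these lines (a sketch based on the CarbonMergeDataSetCommand API imported above; the A.id = B.id join condition is an assumption):

{code}
CarbonMergeDataSetCommand(dwSelframe, odsframe,
  MergeDataSetMatches(col("A.id").equalTo(col("B.id")), matches.toList)).run(spark)
{code}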
 
SQL Queries :

{code}
sql("select count(*) from order").show()
 sql("select count(*) from order where state = 2").show()
 sql("select price from order where id = 'newid1'").show()
 sql("select count(*) from order_hist where c_name = 'delete'").show()
 sql("select count(*) from order_hist where c_name = 'insert'").show()
{code}

Results in spark-shell --master yarn

{code}
scala> sql("select count(*) from order").show()
+--------+
|count(1)|
+--------+
|      10|
+--------+

scala> sql("select count(*) from order where state = 2").show()
+--------+
|count(1)|
+--------+
|       0|
+--------+

scala> sql("select price from order where id = 'newid1'").show()
+-----+
|price|
+-----+
+-----+

scala> sql("select count(*) from order_hist where c_name = 'delete'").show()
+--------+
|count(1)|
+--------+
|       0|
+--------+

scala> sql("select count(*) from order_hist where c_name = 'insert'").show()
+--------+
|count(1)|
+--------+
|       0|
+--------+
{code}

Results in spark-shell --master local
{code}
scala> sql("select count(*) from order").show()
+--------+
|count(1)|
+--------+
|      10|
+--------+

scala> sql("select count(*) from order where state = 2").show()
+--------+
|count(1)|
+--------+
|       2|
+--------+

scala> sql("select price from order where id = 'newid1'").show()
+-----+
|price|
+-----+
| 7500|
+-----+

scala> sql("select count(*) from order_hist where c_name = 'delete'").show()
+--------+
|count(1)|
+--------+
|       2|
+--------+

scala> sql("select count(*) from order_hist where c_name = 'insert'").show()
 

[GitHub] [carbondata] kunal642 commented on a change in pull request #3772: [CARBONDATA-3832]Added block and blocket pruning for the polygon expression processing

2020-06-11 Thread GitBox


kunal642 commented on a change in pull request #3772:
URL: https://github.com/apache/carbondata/pull/3772#discussion_r438820691



##
File path: 
geo/src/main/java/org/apache/carbondata/geo/scan/filter/executor/PolygonFilterExecutorImpl.java
##
@@ -0,0 +1,84 @@
+package org.apache.carbondata.geo.scan.filter.executor;
+
+import java.util.BitSet;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.datastore.block.SegmentProperties;
+import org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk;
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.filter.GenericQueryType;
+import org.apache.carbondata.core.scan.filter.executer.RowLevelFilterExecuterImpl;
+import org.apache.carbondata.core.scan.filter.resolver.resolverinfo.DimColumnResolvedFilterInfo;
+import org.apache.carbondata.core.scan.filter.resolver.resolverinfo.MeasureColumnResolvedFilterInfo;
+import org.apache.carbondata.core.scan.processor.RawBlockletColumnChunks;
+import org.apache.carbondata.core.util.DataTypeUtil;
+import org.apache.carbondata.geo.scan.expression.PolygonExpression;
+
+import org.apache.log4j.Logger;
+
+public class PolygonFilterExecutorImpl extends RowLevelFilterExecuterImpl {
+  public PolygonFilterExecutorImpl(List<DimColumnResolvedFilterInfo> dimColEvaluatorInfoList,

Review comment:
   Format this class





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (CARBONDATA-3852) CCD Merge with Partition Table is giving different results in different spark deploy modes

2020-06-11 Thread Sachin Ramachandra Setty (Jira)
Sachin Ramachandra Setty created CARBONDATA-3852:


 Summary: CCD Merge with Partition Table is giving different 
results in different spark deploy modes
 Key: CARBONDATA-3852
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3852
 Project: CarbonData
  Issue Type: Bug
  Components: spark-integration
Affects Versions: 2.0.0
Reporter: Sachin Ramachandra Setty


The result sets are different when the SQL queries are run in spark-shell --master 
local and in spark-shell --master yarn (two different Spark deploy modes).

{code}
import scala.collection.JavaConverters._
import java.sql.Date
import org.apache.spark.sql._
import org.apache.spark.sql.CarbonSession._
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.execution.command.mutation.merge.{CarbonMergeDataSetCommand, DeleteAction, InsertAction, InsertInHistoryTableAction, MergeDataSetMatches, MergeMatch, UpdateAction, WhenMatched, WhenNotMatched, WhenNotMatchedAndExistsOnlyOnTarget}

import org.apache.spark.sql.functions._
import org.apache.spark.sql.test.util.QueryTest
import org.apache.spark.sql.types.{BooleanType, DateType, IntegerType, StringType, StructField, StructType}
import spark.implicits._

val df1 = sc.parallelize(1 to 10, 4).map { x => ("id"+x, s"order$x", s"customer$x", x*10, x*75, 1) }.toDF("id", "name", "c_name", "quantity", "price", "state")


df1.write.format("carbondata").option("tableName", "order").mode(SaveMode.Overwrite).save()
val dwframe = spark.read.format("carbondata").option("tableName", "order").load()
val dwSelframe = dwframe.as("A")


val ds1 = sc.parallelize(3 to 10, 4)
  .map {x =>
if (x <= 4) {
  ("id"+x, s"order$x",s"customer$x", x*10, x*75, 2)
} else {
  ("id"+x, s"order$x",s"customer$x", x*10, x*75, 1)
}
  }.toDF("id", "name", "c_name", "quantity", "price", "state")
  

val ds2 = sc.parallelize(1 to 2, 4).map { x => ("newid"+x, s"order$x", s"customer$x", x*10, x*75, 1) }.toDS().toDF()
val ds3 = ds1.union(ds2)
val odsframe = ds3.as("B")

sql("drop table if exists target").show()

val initframe = spark.createDataFrame(Seq(
  Row("a", "0"),
  Row("b", "1"),
  Row("c", "2"),
  Row("d", "3")
).asJava, StructType(Seq(StructField("key", StringType), StructField("value", StringType))))

initframe.write
  .format("carbondata")
  .option("tableName", "target")
  .option("partitionColumns", "value")
  .mode(SaveMode.Overwrite)
  .save()
  
val target = spark.read.format("carbondata").option("tableName", "target").load()

var ccd =
  spark.createDataFrame(Seq(
    Row("a", "10", false, 0),
    Row("a", null, true, 1),
    Row("b", null, true, 2),
    Row("c", null, true, 3),
    Row("c", "20", false, 4),
    Row("c", "200", false, 5),
    Row("e", "100", false, 6)
  ).asJava,
    StructType(Seq(StructField("key", StringType),
      StructField("newValue", StringType),
      StructField("deleted", BooleanType), StructField("time", IntegerType))))
  
ccd.createOrReplaceTempView("changes")

ccd = sql("SELECT key, latest.newValue as newValue, latest.deleted as deleted FROM ( SELECT key, max(struct(time, newValue, deleted)) as latest FROM changes GROUP BY key)")

val updateMap = Map("key" -> "B.key", "value" -> "B.newValue").asInstanceOf[Map[Any, Any]]

val insertMap = Map("key" -> "B.key", "value" -> "B.newValue").asInstanceOf[Map[Any, Any]]

target.as("A").merge(ccd.as("B"), "A.key=B.key").
  whenMatched("B.deleted=false").
  updateExpr(updateMap).
  whenNotMatched("B.deleted=false").
  insertExpr(insertMap).
  whenMatched("B.deleted=true").
  delete().execute()
  
{code}
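
A note on the dedup step above: structs compare field by field, so max(struct(time, newValue, deleted)) keeps the row with the greatest time per key. Applied to the changes above it should reduce to the following (an illustration worked out from the rows, not captured output):

{code}
key  newValue  deleted
a    null      true     -- time 1 beats time 0
b    null      true
c    200       false    -- time 5 beats times 3 and 4
e    100       false
{code}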



--
This message was sent by Atlassian Jira
(v8.3.4#803005)



[jira] [Updated] (CARBONDATA-3852) CCD Merge with Partition Table is giving different results in different spark deploy modes

2020-06-11 Thread Sachin Ramachandra Setty (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sachin Ramachandra Setty updated CARBONDATA-3852:
-
Description: 
The result sets are different when the SQL queries are run in spark-shell --master 
local and in spark-shell --master yarn (two different Spark deploy modes).

{code}
import scala.collection.JavaConverters._
import java.sql.Date
import org.apache.spark.sql._
import org.apache.spark.sql.CarbonSession._
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.execution.command.mutation.merge.{CarbonMergeDataSetCommand, DeleteAction, InsertAction, InsertInHistoryTableAction, MergeDataSetMatches, MergeMatch, UpdateAction, WhenMatched, WhenNotMatched, WhenNotMatchedAndExistsOnlyOnTarget}

import org.apache.spark.sql.functions._
import org.apache.spark.sql.test.util.QueryTest
import org.apache.spark.sql.types.{BooleanType, DateType, IntegerType, StringType, StructField, StructType}
import spark.implicits._

val df1 = sc.parallelize(1 to 10, 4).map { x => ("id"+x, s"order$x", s"customer$x", x*10, x*75, 1) }.toDF("id", "name", "c_name", "quantity", "price", "state")


df1.write.format("carbondata").option("tableName", "order").mode(SaveMode.Overwrite).save()
val dwframe = spark.read.format("carbondata").option("tableName", "order").load()
val dwSelframe = dwframe.as("A")


val ds1 = sc.parallelize(3 to 10, 4)
  .map {x =>
if (x <= 4) {
  ("id"+x, s"order$x",s"customer$x", x*10, x*75, 2)
} else {
  ("id"+x, s"order$x",s"customer$x", x*10, x*75, 1)
}
  }.toDF("id", "name", "c_name", "quantity", "price", "state")
  

val ds2 = sc.parallelize(1 to 2, 4).map { x => ("newid"+x, s"order$x", s"customer$x", x*10, x*75, 1) }.toDS().toDF()
val ds3 = ds1.union(ds2)
val odsframe = ds3.as("B")

sql("drop table if exists target").show()

val initframe = spark.createDataFrame(Seq(
  Row("a", "0"),
  Row("b", "1"),
  Row("c", "2"),
  Row("d", "3")
).asJava, StructType(Seq(StructField("key", StringType), StructField("value", StringType))))

initframe.write
  .format("carbondata")
  .option("tableName", "target")
  .option("partitionColumns", "value")
  .mode(SaveMode.Overwrite)
  .save()
  
val target = spark.read.format("carbondata").option("tableName", "target").load()

var ccd =
  spark.createDataFrame(Seq(
    Row("a", "10", false, 0),
    Row("a", null, true, 1),
    Row("b", null, true, 2),
    Row("c", null, true, 3),
    Row("c", "20", false, 4),
    Row("c", "200", false, 5),
    Row("e", "100", false, 6)
  ).asJava,
    StructType(Seq(StructField("key", StringType),
      StructField("newValue", StringType),
      StructField("deleted", BooleanType), StructField("time", IntegerType))))
ccd.createOrReplaceTempView("changes")

ccd = sql("SELECT key, latest.newValue as newValue, latest.deleted as deleted FROM ( SELECT key, max(struct(time, newValue, deleted)) as latest FROM changes GROUP BY key)")

val updateMap = Map("key" -> "B.key", "value" -> "B.newValue").asInstanceOf[Map[Any, Any]]

val insertMap = Map("key" -> "B.key", "value" -> "B.newValue").asInstanceOf[Map[Any, Any]]

target.as("A").merge(ccd.as("B"), "A.key=B.key").
  whenMatched("B.deleted=false").
  updateExpr(updateMap).
  whenNotMatched("B.deleted=false").
  insertExpr(insertMap).
  whenMatched("B.deleted=true").
  delete().execute()
  
{code}

SQL Queries to run :

{code} 
sql("select count(*) from target").show()
sql("select * from target order by key").show()
{code}



Results in spark-shell --master yarn

{code}
scala> sql("select count(*) from target").show()
++
|count(1)|
++
|   4|
++


scala> sql("select * from target order by key").show()
+---+-+
|key|value|
+---+-+
|  a|0|
|  b|1|
|  c|2|
|  d|3|
+---+-+
{code}


Results in spark-shell --master local

{code}
scala> sql("select count(*) from target").show()
++
|count(1)|
++
|   3|
++


scala> sql("select * from target order by key").show()
+---+-+
|key|value|
+---+-+
|  c|  200|
|  d|3|
|  e|  100|
+---+-+
{code}





[GitHub] [carbondata] kunal642 commented on a change in pull request #3772: [CARBONDATA-3832]Added block and blocket pruning for the polygon expression processing

2020-06-11 Thread GitBox


kunal642 commented on a change in pull request #3772:
URL: https://github.com/apache/carbondata/pull/3772#discussion_r438837210



##
File path: 
geo/src/main/java/org/apache/carbondata/geo/scan/filter/executor/PolygonFilterExecutorImpl.java
##
@@ -0,0 +1,84 @@
+package org.apache.carbondata.geo.scan.filter.executor;
+
+import java.util.BitSet;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.datastore.block.SegmentProperties;
+import org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk;
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.filter.GenericQueryType;
+import org.apache.carbondata.core.scan.filter.executer.RowLevelFilterExecuterImpl;
+import org.apache.carbondata.core.scan.filter.resolver.resolverinfo.DimColumnResolvedFilterInfo;
+import org.apache.carbondata.core.scan.filter.resolver.resolverinfo.MeasureColumnResolvedFilterInfo;
+import org.apache.carbondata.core.scan.processor.RawBlockletColumnChunks;
+import org.apache.carbondata.core.util.DataTypeUtil;
+import org.apache.carbondata.geo.scan.expression.PolygonExpression;
+
+import org.apache.log4j.Logger;
+
+public class PolygonFilterExecutorImpl extends RowLevelFilterExecuterImpl {
+  public PolygonFilterExecutorImpl(List<DimColumnResolvedFilterInfo> dimColEvaluatorInfoList,
+      List<MeasureColumnResolvedFilterInfo> msrColEvalutorInfoList, Expression exp,
+      AbsoluteTableIdentifier tableIdentifier, SegmentProperties segmentProperties,
+      Map<Integer, GenericQueryType> complexDimensionInfoMap) {
+    super(dimColEvaluatorInfoList, msrColEvalutorInfoList, exp, tableIdentifier, segmentProperties,
+        complexDimensionInfoMap);
+  }
+
+  private int getNearestRangeIndex(List<Long[]> ranges, long searchForNumber) {
+    Long[] range;
+    int low = 0, mid = 0, high = ranges.size() - 1;
+    while (low <= high) {
+      mid = low + ((high - low) / 2);
+      range = ranges.get(mid);
+      if (searchForNumber >= range[0]) {
+        if (searchForNumber <= range[1]) {
+          // Return the range index if the number is between min and max values of the range
+          return mid;
+        } else {
+          // Number is bigger than this range's min and max. Search on the right side of the range
+          low = mid + 1;
+        }
+      } else {
+        // Number is smaller than this range's min and max. Search on the left side of the range
+        high = mid - 1;
+      }
+    }
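+    // If the loop exits without finding a containing range, searchForNumber fell in a
+    // gap between the ranges and mid holds the index of the last range probed by the
+    // binary search, e.g. for ranges [[10,20],[40,50],[70,80]], 45 returns 1 (contained)
+    // while 55 returns 2 (last probed).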
+    return mid;
+  }
+
+  private boolean isScanRequired(byte[] maxValue, byte[] minValue) {

Review comment:
   Please write a detailed explanation for the logic





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (CARBONDATA-3853) Dataload fails for date column configured as BUCKET_COLUMNS

2020-06-11 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-3853:
---

 Summary: Dataload fails for date column configured as 
BUCKET_COLUMNS
 Key: CARBONDATA-3853
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3853
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Affects Versions: 2.0.0
Reporter: Chetan Bhat


Steps and Issue

0: jdbc:hive2://10.20.255.171:23040/> create table if not exists 
all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number 
int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal 
float,customdecimal decimal(38,15),words string,smallwords char(8),varwords 
varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber 
smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal 
float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords 
char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES 
('BUCKET_NUMBER'='2', *'BUCKET_COLUMNS'='day'*);
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.494 seconds)
0: jdbc:hive2://10.20.255.171:23040/> LOAD DATA INPATH 
'hdfs://hacluster/chetan/datafile_0.csv' into table all_data_types1 
OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='bool_1 ,bool_2 
,chinese ,Number ,smallNumber ,BigNumber ,LargeDecimal ,smalldecimal 
,customdecimal,words ,smallwords ,varwords ,time ,day ,emptyNumber 
,emptysmallNumber ,emptyBigNumber ,emptyLargeDecimal 
,emptysmalldecimal,emptycustomdecimal ,emptywords ,emptysmallwords 
,emptyvarwords');
*Error: java.lang.Exception: DataLoad failure (state=,code=0)*
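
A narrower sketch that presumably reproduces the same failure (hypothetical table name; assuming the date BUCKET_COLUMNS alone is the trigger rather than the wide schema):

{code}
create table date_bucket_tab(day date) stored as carbondata
TBLPROPERTIES('BUCKET_NUMBER'='2', 'BUCKET_COLUMNS'='day');
insert into date_bucket_tab values ('2020-06-11');
{code}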

 

*Log-*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3853) Dataload fails for date column configured as BUCKET_COLUMNS

2020-06-11 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-3853:

Description: 
Steps and Issue

0: jdbc:hive2://10.20.255.171:23040/> create table if not exists 
all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number 
int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal 
float,customdecimal decimal(38,15),words string,smallwords char(8),varwords 
varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber 
smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal 
float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords 
char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES 
('BUCKET_NUMBER'='2', *'BUCKET_COLUMNS'='day'*);
+---------+--+
| Result  |
+---------+--+
+---------+--+
 No rows selected (0.494 seconds)
 0: jdbc:hive2://10.20.255.171:23040/> LOAD DATA INPATH 
'hdfs://hacluster/chetan/datafile_0.csv' into table all_data_types1 
OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='bool_1 ,bool_2 
,chinese ,Number ,smallNumber ,BigNumber ,LargeDecimal ,smalldecimal 
,customdecimal,words ,smallwords ,varwords ,time ,day ,emptyNumber 
,emptysmallNumber ,emptyBigNumber ,emptyLargeDecimal 
,emptysmalldecimal,emptycustomdecimal ,emptywords ,emptysmallwords 
,emptyvarwords');
 *Error: java.lang.Exception: DataLoad failure (state=,code=0)*

 

*Log-*

java.lang.Exception: DataLoad failure
 at org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:560)
 at org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.loadData(CarbonLoadDataCommand.scala:207)
 at org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:168)
 at org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:148)
 at org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:145)
 at org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
 at org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:141)
 at org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:145)
 at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
 at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
 at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
 at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259)
 at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
 at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258)
 at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
 at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
 at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
 at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
 at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232)
 at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175)
 at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
 at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
2020-06-11 23:47:24,973 | ERROR | [HiveServer2-Background-Pool: Thread-104] | Error running hive query: | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:179)
org.apache.hive.service.cli.HiveSQLException: java.lang.Exception: DataLoad failure
 at 

[jira] [Created] (CARBONDATA-3854) Quote char support to unprintable character like \u0009 \u0010

2020-06-11 Thread Mahesh Raju Somalaraju (Jira)
Mahesh Raju Somalaraju created CARBONDATA-3854:
--

 Summary: Quote char support to unprintable character like \u0009 
\u0010
 Key: CARBONDATA-3854
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3854
 Project: CarbonData
  Issue Type: New Feature
Reporter: Mahesh Raju Somalaraju


Quote char support for unprintable characters like \u0009 \u0010.

Currently, CarbonData does not support setting the quote char to an unprintable character like \u0009.

The current behaviour is that the quote char option throws an exception if more than one character is given.

It needs to support more than one character, the same as the delimiter already does.
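
For illustration, the desired usage might look something like this (hypothetical file path, table name and option values, mirroring what the delimiter option already accepts):

{code}
LOAD DATA INPATH 'hdfs://hacluster/data/sample.csv' INTO TABLE sample_tab
OPTIONS('DELIMITER'='\u0010', 'QUOTECHAR'='\u0009\u0010', 'FILEHEADER'='c1,c2');
{code}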



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] maheshrajus opened a new pull request #3790: [CARBONDATA-3854] Quotechar support to more than one character

2020-06-11 Thread GitBox


maheshrajus opened a new pull request #3790:
URL: https://github.com/apache/carbondata/pull/3790


### Why is this PR needed?
   Quote char needs to support more than one character, the same as the **delimiter** already does.
   Quote char support for unprintable characters like \u0009 \u0010.
   Currently, CarbonData does not support setting the quote char to an unprintable char like \u0009.
   The current behaviour is that the quote char option throws an exception if more than one character is given.
   

### What changes were proposed in this PR?
   
   
### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)
   
### Is any new testcase added?
- No
- Yes
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3790: [CARBONDATA-3854] Quotechar support to more than one character

2020-06-11 Thread GitBox


CarbonDataQA1 commented on pull request #3790:
URL: https://github.com/apache/carbondata/pull/3790#issuecomment-642882015


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1423/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3790: [CARBONDATA-3854] Quotechar support to more than one character

2020-06-11 Thread GitBox


CarbonDataQA1 commented on pull request #3790:
URL: https://github.com/apache/carbondata/pull/3790#issuecomment-642881814


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3147/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org