Hello,

I have created a carbondata table from Spark 2.2.1 on Azure (Hive 1.2.1) via
CarbonSession.

The Spark code looks like this :

val carbon = SparkSession.builder()
  .config("spark.sql.warehouse.dir", warehouse)
  .config("spark.sql.crossJoin.enabled", "true")
  .enableHiveSupport()
  .getOrCreateCarbonSession(storeLocation)

val df = carbon.sparkContext.parallelize(1 to 1000000).map(x => .... <my
object creation>)

import carbon.implicits._

df.write
  .format("carbondata")
  .option("tableName", "carbon_table_20180816_1m_50f_p8")
  .mode(SaveMode.Overwrite)
  .save()


Note: I can query this carbondata table from Spark without problems, but I
also need to query it from Hive, and that does not work.

If I look at the table definition in Hive I see this :

hive> show create table carbon_table_20180816_1m_50f_p8;
OK
CREATE EXTERNAL TABLE `carbon_table_20180816_1m_50f_p8`(
  `col` array<string> COMMENT 'from deserializer')
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
WITH SERDEPROPERTIES ( 
  'carbonSchemaPartsNo'='6', 
  'dbName'='default', 
  'isExternal'='false', 
  'isTransactional'='true', 
  'isVisible'='true', 
  'path'='hdfs://hostname.internal.cloudapp.net/tmp/store/default/carbon_table_20180816_1m_50f_p8', 
  'tableName'='carbon_table_20180816_1m_50f_p8', 
  'tablePath'='hdfs://hostname.internal.cloudapp.net/tmp/store/default/carbon_table_20180816_1m_50f_p8')
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.SequenceFileInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat'
LOCATION
  'wasb://[email protected]/hive/warehouse/carbon_table_20180816_1m_50f_p8-__PLACEHOLDER__'
TBLPROPERTIES (
  'spark.sql.sources.provider'='org.apache.spark.sql.CarbonSource', 
  'spark.sql.sources.schema.numParts'='1', 
 
'spark.sql.sources.schema.part.0'='{\"type\":\"struct\",\"fields\":[{\"name\":\"c1\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c2\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c3\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c4\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c5\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c6\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c7\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c8\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c9\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c10\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c11\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c12\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c13\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c14\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c15\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c16\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c17\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c18\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c19\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c20\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c21\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c22\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c23\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c24\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c25\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c26\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c27\",\"type\":\"string\",\"nul
lable\":true,\"metadata\":{}},{\"name\":\"c28\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c29\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c30\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c31\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c32\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c33\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c34\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c35\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c36\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c37\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c38\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c39\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c40\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c41\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c42\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c43\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c44\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c45\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c46\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c47\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c48\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c49\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c50\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}}]}',
 
  'transient_lastDdlTime'='1534416196')


It is clear that it is not using the Hive Carbondata SerDe.

I applied the following modifications based on
https://github.com/cenyuhai/incubator-carbondata/blob/CARBONDATA-727/integration/hive/hive-guide.md#alter-schema-in-hive



hive> alter table carbon_table_20180816_1m_50f_p8  set FILEFORMAT
    > INPUTFORMAT "org.apache.carbondata.hive.MapredCarbonInputFormat"
    > OUTPUTFORMAT "org.apache.carbondata.hive.MapredCarbonOutputFormat"
    > SERDE "org.apache.carbondata.hive.CarbonHiveSerDe";
OK


hive> alter table carbon_table_20180816_1m_50f_p8 set LOCATION
'hdfs://hostname.ax.internal.cloudapp.net/tmp/store/default/carbon_table_20180816_1m_50f_p8';
OK



hive> alter table carbon_table_20180816_1m_50f_p8 change col c1 string;
OK




After these ALTER operations, the table definition looks like this:


hive> show create table carbon_table_20180816_1m_50f_p8;
OK
CREATE EXTERNAL TABLE `carbon_table_20180816_1m_50f_p8`(
  `c1` string COMMENT '')
ROW FORMAT SERDE 
  'org.apache.carbondata.hive.CarbonHiveSerDe' 
WITH SERDEPROPERTIES ( 
  'carbonSchemaPartsNo'='6', 
  'dbName'='default', 
  'isExternal'='false', 
  'isTransactional'='true', 
  'isVisible'='true', 
  'path'='hdfs://hostname.internal.cloudapp.net/tmp/store/default/carbon_table_20180816_1m_50f_p8', 
  'tableName'='carbon_table_20180816_1m_50f_p8', 
  'tablePath'='hdfs://hostname.internal.cloudapp.net/tmp/store/default/carbon_table_20180816_1m_50f_p8')
STORED AS INPUTFORMAT 
  'org.apache.carbondata.hive.MapredCarbonInputFormat' 
OUTPUTFORMAT 
  'org.apache.carbondata.hive.MapredCarbonOutputFormat'
LOCATION
  'hdfs://hostname.internal.cloudapp.net/tmp/store/default/carbon_table_20180816_1m_50f_p8'
TBLPROPERTIES (
  'last_modified_by'='sshuser', 
  'last_modified_time'='1534423308', 
  'numFiles'='7', 
  'spark.sql.sources.provider'='org.apache.spark.sql.CarbonSource', 
  'spark.sql.sources.schema.numParts'='1', 
 
'spark.sql.sources.schema.part.0'='{\"type\":\"struct\",\"fields\":[{\"name\":\"c1\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c2\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c3\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c4\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c5\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c6\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c7\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c8\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c9\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c10\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c11\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c12\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c13\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c14\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c15\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c16\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c17\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c18\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c19\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c20\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c21\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c22\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c23\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c24\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c25\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c26\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c27\",\"type\":\"string\",\"nul
lable\":true,\"metadata\":{}},{\"name\":\"c28\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c29\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c30\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c31\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c32\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c33\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c34\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c35\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c36\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c37\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c38\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c39\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c40\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c41\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c42\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c43\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c44\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c45\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c46\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c47\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c48\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c49\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"c50\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}}]}',
 
  'totalSize'='11446940', 
  'transient_lastDdlTime'='1534423308')


Before adding the extra columns (c2 to c50), I tried a basic select query,
and I get the following error:

hive> select c1 from carbon_table_20180816_1m_50f_p8 limit 1;
OK
Failed with exception java.io.IOException:java.io.IOException:
org.apache.carbondata.core.exception.InvalidConfigurationException: Database
name is not set.


The database name seems to be correctly defined in the SerDe properties
('dbName'='default')...
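
For completeness, the remaining columns (c2 to c50) would presumably be added
with a Hive ALTER TABLE ... ADD COLUMNS statement. The Scala snippet below is
only a sketch of how that statement could be generated: the object and method
names are hypothetical, I have not run the resulting DDL, and I do not know
whether Hive accepts it for a table using CarbonHiveSerDe.

```scala
// Hypothetical helper (not part of CarbonData or Hive): builds the Hive DDL
// that would add the string columns still missing from the table definition.
object AddColumnsDdl {

  // Returns e.g. "ALTER TABLE t ADD COLUMNS (`c2` string, `c3` string)"
  // for build("t", 2, 3). All columns are assumed to be of type string,
  // as in the Spark schema stored in the table properties.
  def build(table: String, first: Int, last: Int): String = {
    val cols = (first to last).map(i => s"`c$i` string").mkString(", ")
    s"ALTER TABLE $table ADD COLUMNS ($cols)"
  }

  def main(args: Array[String]): Unit = {
    // Print the statement for c2 to c50 so it can be pasted into the Hive CLI.
    println(build("carbon_table_20180816_1m_50f_p8", 2, 50))
  }
}
```

The printed statement would be run in the Hive CLI after the ALTER operations
shown above.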


Is there a better, easier way to query Carbondata tables (created from Spark
SQL) directly from Hive?


Your help would be appreciated.

Thanks

Yann



