[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-22 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r510102591



##
File path: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonOutputFormat.java
##
@@ -92,6 +95,14 @@ public void checkOutputSpecs(FileSystem fileSystem, JobConf jobConf) throws IOEx
 }
 String tablePath = FileFactory.getCarbonFile(carbonLoadModel.getTablePath()).getAbsolutePath();
 TaskAttemptID taskAttemptID = TaskAttemptID.forName(jc.get("mapred.task.id"));
+// taskAttemptID will be null when the insert job is fired from presto. Presto send the JobConf
+// and since presto does not use the MR framework for execution, the mapred.task.id will be
+// null, so prepare a new ID.
+if (taskAttemptID == null) {
+  SimpleDateFormat formatter = new SimpleDateFormat("MMddHHmm");
+  String jobTrackerId = formatter.format(new Date());
+  taskAttemptID = new TaskAttemptID(jobTrackerId, 0, TaskType.MAP, 0, 0);

Review comment:
   > OK, if this task number is used in the file name, then in the case of
   > non-transactional concurrent writes, two files can have the same file
   > name, leading to many issues. So I suggested a UUID. You can check again.

   I set the task ID on the load model only if mapred.task.id is present and
   the taskAttemptID is not null. If it is null, I don't set the task ID on
   the load model; when we call super.getRecordWriter,
   CarbonTableOutputFormat will set the load model based on DEFAULT_TASK_NO.
   Please have a look. Transactional tables shouldn't be a problem either.
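
   A minimal sketch of the collision concern in the quoted comment, using only
   the JDK. It assumes nothing about CarbonData internals: `TaskIdSketch`,
   `timestampId` and `uuidId` are hypothetical names for illustration. It shows
   that an ID built from the `MMddHHmm` pattern in the diff is identical for
   two writers started within the same minute, while a random UUID differs per
   call.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.UUID;

// Illustrative only: these names are hypothetical and not part of
// CarbonData, Hadoop or Presto.
public class TaskIdSketch {

  // Same pattern as the diff: month, day, hour, minute. Anything finer
  // than a minute is dropped, so two calls in one minute give one ID.
  static String timestampId(Date now) {
    return new SimpleDateFormat("MMddHHmm").format(now);
  }

  // The alternative suggested in the review: a random UUID per writer.
  static String uuidId() {
    return UUID.randomUUID().toString();
  }

  public static void main(String[] args) {
    Date first = new Date(1603260474337L);
    Date second = new Date(first.getTime() + 1L); // one millisecond later

    // Two "concurrent" writers in the same minute share the timestamp ID.
    System.out.println("timestamp IDs equal: "
        + timestampId(first).equals(timestampId(second)));

    // Two random UUIDs are, for practical purposes, never equal.
    System.out.println("UUIDs equal: " + uuidId().equals(uuidId()));
  }
}
```

   Whether such an ID ever reaches a file name depends on the code path
   described in the reply above (the task ID is only set on the load model
   when mapred.task.id is present), so this only illustrates the reviewer's
   worry, not a confirmed bug.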





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r509084981



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoInsertIntoTableTestCase.scala
##
@@ -0,0 +1,207 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto.integrationtest
+
+import java.io.File
+import java.util
+import java.util.UUID
+import java.util.concurrent.{Callable, Executor, Executors, Future}
+
+import scala.collection.JavaConverters._
+
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach, FunSuiteLike}
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.filesystem.{CarbonFile, CarbonFileFilter}
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.metadata.schema.SchemaReader
+import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, CarbonTableIdentifier}
+import org.apache.carbondata.core.statusmanager.SegmentStatusManager
+import org.apache.carbondata.core.util.path.CarbonTablePath
+import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
+import org.apache.carbondata.presto.server.PrestoServer
+import org.apache.carbondata.presto.util.CarbonDataStoreCreator
+
+class PrestoInsertIntoTableTestCase extends FunSuiteLike with BeforeAndAfterAll with BeforeAndAfterEach {
+
+  private val logger = LogServiceFactory
+.getLogService(classOf[PrestoAllDataTypeTest].getCanonicalName)
+
+  private val rootPath = new File(this.getClass.getResource("/").getPath
+  + "../../../..").getCanonicalPath
+  private val storePath = s"$rootPath/integration/presto/target/store"
+  private val prestoServer = new PrestoServer
+  private val executorService = Executors.newFixedThreadPool(1)
+
+  override def beforeAll: Unit = {
+CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_WRITTEN_BY_APPNAME,
+  "Presto")
+val map = new util.HashMap[String, String]()
+map.put("hive.metastore", "file")
+map.put("hive.metastore.catalog.dir", s"file://$storePath")
+map.put("hive.allow-drop-table", "true")
+prestoServer.startServer("testdb", map)
+prestoServer.execute("drop schema if exists testdb")
+prestoServer.execute("create schema testdb")
+  }
+
+  override protected def beforeEach(): Unit = {
+val query = "create table testdb.testtable(ID int, date date, country varchar, name varchar, phonetype varchar, serialname varchar,salary decimal(6,1), bonus decimal(8,6), monthlyBonus decimal(5,3), dob timestamp, shortField smallint, iscurrentemployee boolean) with(format='CARBONDATA') "
+createTable(query, "testdb", "testtable")
+  }
+
+  private def createTable(query: String, databaseName: String, tableName: String): Unit = {
+prestoServer.execute(s"drop table if exists ${databaseName}.${tableName}")
+prestoServer.execute(query)
+logger.info("Creating The Carbon Store")
+val absoluteTableIdentifier: AbsoluteTableIdentifier = getAbsoluteIdentifier(databaseName, tableName)
+CarbonDataStoreCreator.createTable(absoluteTableIdentifier, true)
+logger.info(s"\nCarbon store is created at location: $storePath")
+  }
+
+  private def getAbsoluteIdentifier(dbName: String,
+  tableName: String) = {
+val absoluteTableIdentifier = AbsoluteTableIdentifier.from(
+  storePath + "/" + dbName + "/" + tableName,
+  new CarbonTableIdentifier(dbName,
+tableName,
+UUID.randomUUID().toString))
+absoluteTableIdentifier
+  }
+
+  test("test insert with different storage format names") {
+val query1 = "create table testdb.testtable(ID int, date date, country varchar, name varchar, phonetype varchar, serialname varchar,salary decimal(6,1), bonus decimal(8,6), monthlyBonus decimal(5,3), dob timestamp, shortField smallint, iscurrentemployee boolean) with(format='CARBONDATA') "
+val query2 = "create table testdb.testtable(ID int, date date, country varchar, name varchar, phonetype varchar, serialname 

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r509015656



##
File path: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonOutputFormat.java
##
@@ -92,6 +95,14 @@ public void checkOutputSpecs(FileSystem fileSystem, JobConf jobConf) throws IOEx
 }
 String tablePath = FileFactory.getCarbonFile(carbonLoadModel.getTablePath()).getAbsolutePath();
 TaskAttemptID taskAttemptID = TaskAttemptID.forName(jc.get("mapred.task.id"));
+// taskAttemptID will be null when the insert job is fired from presto. Presto send the JobConf
+// and since presto does not use the MR framework for execution, the mapred.task.id will be
+// null, so prepare a new ID.
+if (taskAttemptID == null) {
+  SimpleDateFormat formatter = new SimpleDateFormat("MMddHHmm");
+  String jobTrackerId = formatter.format(new Date());
+  taskAttemptID = new TaskAttemptID(jobTrackerId, 0, TaskType.MAP, 0, 0);

Review comment:
   > Also, please check the file names while testing, to confirm that the
   > segment ID and other info are correct in the file names created by Presto.

   `Fact/Part0/Segment_10/part-0-0_batchno0-0-10-1603260474337.snappy.carbondata`,
   `Fact/Part0/Segment_10/10_1603260475282.carbonindexmerge`

   These are the carbondata file and the index merge file inside the segment
   directory for segment 10, so the naming is fine.
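
   The check described above can be sketched as a small JDK-only helper. The
   layout it assumes is inferred solely from the two example names in this
   comment (a `Segment_<id>` directory, a data file ending in
   `-<id>-<timestamp>.<compressor>.carbondata`, and a merge file named
   `<id>_<timestamp>.carbonindexmerge`); it is not CarbonData's own parsing
   logic, and `SegmentNameCheck` is a hypothetical name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: patterns are inferred from the two example paths in
// the review comment, not taken from CarbonData's path utilities.
public class SegmentNameCheck {

  // Segment directory, e.g. "Segment_10/".
  private static final Pattern SEGMENT_DIR = Pattern.compile("Segment_(\\d+)/");

  // Data file tail, e.g. "-10-1603260474337.snappy.carbondata" (assumed layout).
  private static final Pattern DATA_FILE =
      Pattern.compile("-(\\d+)-\\d+\\.[a-z0-9]+\\.carbondata$");

  // Index merge file, e.g. "/10_1603260475282.carbonindexmerge" (assumed layout).
  private static final Pattern MERGE_FILE =
      Pattern.compile("/(\\d+)_\\d+\\.carbonindexmerge$");

  // Returns true when the segment ID embedded in the file name equals the
  // ID of the Segment_<id> directory that contains the file.
  static boolean segmentIdMatches(String path) {
    Matcher dir = SEGMENT_DIR.matcher(path);
    if (!dir.find()) {
      return false;
    }
    String segmentId = dir.group(1);
    Matcher data = DATA_FILE.matcher(path);
    if (data.find()) {
      return segmentId.equals(data.group(1));
    }
    Matcher merge = MERGE_FILE.matcher(path);
    return merge.find() && segmentId.equals(merge.group(1));
  }

  public static void main(String[] args) {
    String dataFile =
        "Fact/Part0/Segment_10/part-0-0_batchno0-0-10-1603260474337.snappy.carbondata";
    String mergeFile = "Fact/Part0/Segment_10/10_1603260475282.carbonindexmerge";
    System.out.println(segmentIdMatches(dataFile));
    System.out.println(segmentIdMatches(mergeFile));
  }
}
```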
   









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r509014972



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoInsertIntoTableTestCase.scala
##
@@ -0,0 +1,207 @@

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-20 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r508502780



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbondataModule.java
##
@@ -127,7 +127,8 @@ public void configure(Binder binder) {
 .in(Scopes.SINGLETON);
 binder.bind(HivePartitionManager.class).in(Scopes.SINGLETON);
 binder.bind(LocationService.class).to(HiveLocationService.class).in(Scopes.SINGLETON);
-binder.bind(HiveMetadataFactory.class).in(Scopes.SINGLETON);
+binder.bind(HiveLocationService.class).to(CarbonDataLocationService.class).in(Scopes.SINGLETON);

Review comment:
   Added to the JIRA.









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-20 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r508461938



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataFileWriter.java
##
@@ -0,0 +1,188 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.io.IOException;
+import java.io.UncheckedIOException;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Properties;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hive.CarbonHiveSerDe;
+import org.apache.carbondata.hive.MapredCarbonOutputFormat;
+import org.apache.carbondata.presto.impl.CarbonTableConfig;
+
+import com.google.common.collect.ImmutableList;
+import io.prestosql.plugin.hive.HiveFileWriter;
+import io.prestosql.plugin.hive.HiveType;
+import io.prestosql.plugin.hive.HiveWriteUtils;
+import io.prestosql.spi.Page;
+import io.prestosql.spi.PrestoException;
+import io.prestosql.spi.block.Block;
+import io.prestosql.spi.type.Type;
+import io.prestosql.spi.type.TypeManager;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.exec.FileSinkOperator;
+import org.apache.hadoop.hive.ql.io.HiveOutputFormat;
+import org.apache.hadoop.hive.ql.io.IOConstants;
+import org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector;
+import org.apache.hadoop.hive.serde2.SerDeException;
+import org.apache.hadoop.hive.serde2.objectinspector.SettableStructObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.StructField;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.Reporter;
+import org.apache.log4j.Logger;
+
+import static com.google.common.collect.ImmutableList.toImmutableList;
+import static io.prestosql.plugin.hive.HiveErrorCode.HIVE_WRITER_DATA_ERROR;
+import static java.util.Objects.requireNonNull;
+import static java.util.stream.Collectors.toList;
+import static org.apache.hadoop.hive.conf.HiveConf.ConfVars.COMPRESSRESULT;
+
+/**
+ * This class implements HiveFileWriter and it creates the carbonFileWriter to write the page data
+ * sent from presto.
+ */
+public class CarbonDataFileWriter implements HiveFileWriter {
+
+  private static final Logger LOG =
+  LogServiceFactory.getLogService(CarbonDataFileWriter.class.getName());
+
+  private final JobConf configuration;
+  private final Path outPutPath;
+  private final FileSinkOperator.RecordWriter recordWriter;
+  private final CarbonHiveSerDe serDe;
+  private final int fieldCount;
+  private final Object row;
+  private final SettableStructObjectInspector tableInspector;
+  private final List structFields;
+  private final HiveWriteUtils.FieldSetter[] setters;
+
+  private boolean isCommitDone;
+
+  public CarbonDataFileWriter(Path outPutPath, List inputColumnNames, Properties properties,
+  JobConf configuration, TypeManager typeManager) throws SerDeException {
+requireNonNull(outPutPath, "path is null");
+// take the outputPath same as location in compliance with the carbon store folder structure.
+this.outPutPath = new Path(properties.getProperty("location"));
+this.configuration = requireNonNull(configuration, "conf is null");
+List columnNames = Arrays
+.asList(properties.getProperty(IOConstants.COLUMNS, "").split(CarbonCommonConstants.COMMA));
+List fileColumnTypes =
+HiveType.toHiveTypes(properties.getProperty(IOConstants.COLUMNS_TYPES, "")).stream()
+.map(hiveType -> hiveType.getType(typeManager)).collect(toList());
+this.fieldCount = columnNames.size();
+this.serDe = new CarbonHiveSerDe();
+serDe.initialize(configuration, properties);
+this.tableInspector = (ArrayWritableObjectInspector) serDe.getObjectInspector();
+
+this.structFields =
+ImmutableList.copyOf(inputColumnNames.stream().map(tableInspector::getStructFieldRef)
+   

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-20 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r508357056



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoInsertIntoTableTestCase.scala
##
@@ -0,0 +1,207 @@

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-20 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r508356404



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataFileWriter.java
##
@@ -0,0 +1,188 @@

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-20 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r508356222



##
File path: 
integration/hive/src/main/java/org/apache/carbondata/hive/util/HiveCarbonUtil.java
##
@@ -137,7 +137,7 @@ public static CarbonLoadModel getCarbonLoadModel(String 
tableName, String databa
   carbonTable = CarbonTable.buildFromTableInfo(
   SchemaReader.inferSchema(absoluteTableIdentifier, false, 
configuration));
 }
-carbonTable.setTransactionalTable(false);
+carbonTable.setTransactionalTable(true);

Review comment:
   done









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-20 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r508284573



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoInsertIntoTableTestCase.scala
##
@@ -0,0 +1,207 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto.integrationtest
+
+import java.io.File
+import java.util
+import java.util.UUID
+import java.util.concurrent.{Callable, Executor, Executors, Future}
+
+import scala.collection.JavaConverters._
+
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach, FunSuiteLike}
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.filesystem.{CarbonFile, 
CarbonFileFilter}
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.metadata.schema.SchemaReader
+import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, 
CarbonTableIdentifier}
+import org.apache.carbondata.core.statusmanager.SegmentStatusManager
+import org.apache.carbondata.core.util.path.CarbonTablePath
+import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
+import org.apache.carbondata.presto.server.PrestoServer
+import org.apache.carbondata.presto.util.CarbonDataStoreCreator
+
+class PrestoInsertIntoTableTestCase extends FunSuiteLike with 
BeforeAndAfterAll with BeforeAndAfterEach {
+
+  private val logger = LogServiceFactory
+.getLogService(classOf[PrestoAllDataTypeTest].getCanonicalName)
+
+  private val rootPath = new File(this.getClass.getResource("/").getPath
+  + "../../../..").getCanonicalPath
+  private val storePath = s"$rootPath/integration/presto/target/store"
+  private val prestoServer = new PrestoServer
+  private val executorService = Executors.newFixedThreadPool(1)
+
+  override def beforeAll: Unit = {
+
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_WRITTEN_BY_APPNAME,
+  "Presto")
+val map = new util.HashMap[String, String]()
+map.put("hive.metastore", "file")
+map.put("hive.metastore.catalog.dir", s"file://$storePath")
+map.put("hive.allow-drop-table", "true")
+prestoServer.startServer("testdb", map)
+prestoServer.execute("drop schema if exists testdb")
+prestoServer.execute("create schema testdb")
+  }
+
+  override protected def beforeEach(): Unit = {
+val query = "create table testdb.testtable(ID int, date date, country varchar, name varchar, phonetype varchar, serialname varchar,salary decimal(6,1), bonus decimal(8,6), monthlyBonus decimal(5,3), dob timestamp, shortField smallint, iscurrentemployee boolean) with(format='CARBONDATA') "

Review comment:
   No. Create table is not supported yet, so I have not looked closely at the 
existing behaviour or at what Presto supports. I will handle this when I take 
up create table support, which is already planned once the insert requirement 
is finished.









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-20 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r508282935



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataFileWriter.java
##
@@ -0,0 +1,188 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.io.IOException;
+import java.io.UncheckedIOException;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Properties;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hive.CarbonHiveSerDe;
+import org.apache.carbondata.hive.MapredCarbonOutputFormat;
+import org.apache.carbondata.presto.impl.CarbonTableConfig;
+
+import com.google.common.collect.ImmutableList;
+import io.prestosql.plugin.hive.HiveFileWriter;

Review comment:
   Yes, you are right, this needs to be done for prestodb as well. My plan is 
to first finish the prestosql changes and address all the comments, then copy 
them over and raise a new PR for prestodb. That is easier for both me and the 
reviewer, since it can be merged directly. Otherwise this PR becomes huge, 
review gets difficult, and every comment has to be handled twice.









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-20 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r508281094



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoInsertIntoTableTestCase.scala
##
@@ -0,0 +1,207 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto.integrationtest
+
+import java.io.File
+import java.util
+import java.util.UUID
+import java.util.concurrent.{Callable, Executor, Executors, Future}
+
+import scala.collection.JavaConverters._
+
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach, FunSuiteLike}
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.filesystem.{CarbonFile, 
CarbonFileFilter}
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.metadata.schema.SchemaReader
+import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, 
CarbonTableIdentifier}
+import org.apache.carbondata.core.statusmanager.SegmentStatusManager
+import org.apache.carbondata.core.util.path.CarbonTablePath
+import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
+import org.apache.carbondata.presto.server.PrestoServer
+import org.apache.carbondata.presto.util.CarbonDataStoreCreator
+
+class PrestoInsertIntoTableTestCase extends FunSuiteLike with 
BeforeAndAfterAll with BeforeAndAfterEach {
+
+  private val logger = LogServiceFactory
+.getLogService(classOf[PrestoAllDataTypeTest].getCanonicalName)
+
+  private val rootPath = new File(this.getClass.getResource("/").getPath
+  + "../../../..").getCanonicalPath
+  private val storePath = s"$rootPath/integration/presto/target/store"
+  private val prestoServer = new PrestoServer
+  private val executorService = Executors.newFixedThreadPool(1)
+
+  override def beforeAll: Unit = {
+
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_WRITTEN_BY_APPNAME,
+  "Presto")
+val map = new util.HashMap[String, String]()
+map.put("hive.metastore", "file")
+map.put("hive.metastore.catalog.dir", s"file://$storePath")
+map.put("hive.allow-drop-table", "true")
+prestoServer.startServer("testdb", map)
+prestoServer.execute("drop schema if exists testdb")
+prestoServer.execute("create schema testdb")
+  }
+
+  override protected def beforeEach(): Unit = {
+val query = "create table testdb.testtable(ID int, date date, country 
varchar, name varchar, phonetype varchar, serialname varchar,salary 
decimal(6,1), bonus decimal(8,6), monthlyBonus decimal(5,3), dob timestamp, 
shortField smallint, iscurrentemployee boolean) with(format='CARBONDATA') "
+createTable(query, "testdb", "testtable")
+  }
+
+  private def createTable(query: String, databaseName: String, tableName: 
String): Unit = {
+prestoServer.execute(s"drop table if exists ${databaseName}.${tableName}")
+prestoServer.execute(query)
+logger.info("Creating The Carbon Store")
+val absoluteTableIdentifier: AbsoluteTableIdentifier = 
getAbsoluteIdentifier(databaseName, tableName)
+CarbonDataStoreCreator.createTable(absoluteTableIdentifier, true)
+logger.info(s"\nCarbon store is created at location: $storePath")
+  }
+
+  private def getAbsoluteIdentifier(dbName: String,
+  tableName: String) = {
+val absoluteTableIdentifier = AbsoluteTableIdentifier.from(
+  storePath + "/" + dbName + "/" + tableName,
+  new CarbonTableIdentifier(dbName,
+tableName,
+UUID.randomUUID().toString))
+absoluteTableIdentifier
+  }
+
+  test("test insert with different storage format names") {
+val query1 = "create table testdb.testtable(ID int, date date, country 
varchar, name varchar, phonetype varchar, serialname varchar,salary 
decimal(6,1), bonus decimal(8,6), monthlyBonus decimal(5,3), dob timestamp, 
shortField smallint, iscurrentemployee boolean) with(format='CARBONDATA') "
+val query2 = "create table testdb.testtable(ID int, date date, country 
varchar, name varchar, phonetype varchar, serialname 

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-20 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r508280110



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoInsertIntoTableTestCase.scala
##
@@ -0,0 +1,207 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto.integrationtest
+
+import java.io.File
+import java.util
+import java.util.UUID
+import java.util.concurrent.{Callable, Executor, Executors, Future}
+
+import scala.collection.JavaConverters._
+
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach, FunSuiteLike}
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.filesystem.{CarbonFile, 
CarbonFileFilter}
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.metadata.schema.SchemaReader
+import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, 
CarbonTableIdentifier}
+import org.apache.carbondata.core.statusmanager.SegmentStatusManager
+import org.apache.carbondata.core.util.path.CarbonTablePath
+import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
+import org.apache.carbondata.presto.server.PrestoServer
+import org.apache.carbondata.presto.util.CarbonDataStoreCreator
+
+class PrestoInsertIntoTableTestCase extends FunSuiteLike with 
BeforeAndAfterAll with BeforeAndAfterEach {
+
+  private val logger = LogServiceFactory
+.getLogService(classOf[PrestoAllDataTypeTest].getCanonicalName)
+
+  private val rootPath = new File(this.getClass.getResource("/").getPath
+  + "../../../..").getCanonicalPath
+  private val storePath = s"$rootPath/integration/presto/target/store"
+  private val prestoServer = new PrestoServer
+  private val executorService = Executors.newFixedThreadPool(1)
+
+  override def beforeAll: Unit = {
+
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_WRITTEN_BY_APPNAME,

Review comment:
   Thanks for the reminder. Yes, it is not needed here, so I removed it. Please 
check `MapredCarbonOutputFormat`: its static block sets the app name to `hive`, 
but when it is called from the Presto flow, `mapred.task.id` will be null, and 
in that case I override the app name with `presto`.
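
   The override described above could look roughly like the sketch below. It 
is only an illustration, not the code in this PR: `AppNameSketch`, 
`resolveAppName`, and the plain `Map` standing in for the job configuration 
are hypothetical names. Only the `mapred.task.id` key and the `hive` / 
`presto` app names come from the comment.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: the real code works with a Hadoop JobConf and CarbonProperties.
class AppNameSketch {
  static final String TASK_ID_KEY = "mapred.task.id";

  // Keeps "hive" when a MapReduce task id is present; falls back to "presto"
  // when it is missing, which is the case described for the Presto flow.
  static String resolveAppName(Map<String, String> conf) {
    return conf.get(TASK_ID_KEY) == null ? "presto" : "hive";
  }

  public static void main(String[] args) {
    Map<String, String> prestoConf = new HashMap<>();
    Map<String, String> hiveConf = new HashMap<>();
    hiveConf.put(TASK_ID_KEY, "attempt_202010201200_0001_m_000000_0");
    System.out.println(resolveAppName(prestoConf));
    System.out.println(resolveAppName(hiveConf));
  }
}
```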









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-20 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r508273208



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoInsertIntoTableTestCase.scala
##
@@ -0,0 +1,207 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto.integrationtest
+
+import java.io.File
+import java.util
+import java.util.UUID
+import java.util.concurrent.{Callable, Executor, Executors, Future}
+
+import scala.collection.JavaConverters._
+
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach, FunSuiteLike}
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.filesystem.{CarbonFile, 
CarbonFileFilter}
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.metadata.schema.SchemaReader
+import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, 
CarbonTableIdentifier}
+import org.apache.carbondata.core.statusmanager.SegmentStatusManager
+import org.apache.carbondata.core.util.path.CarbonTablePath
+import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
+import org.apache.carbondata.presto.server.PrestoServer
+import org.apache.carbondata.presto.util.CarbonDataStoreCreator
+
+class PrestoInsertIntoTableTestCase extends FunSuiteLike with 
BeforeAndAfterAll with BeforeAndAfterEach {
+
+  private val logger = LogServiceFactory
+.getLogService(classOf[PrestoAllDataTypeTest].getCanonicalName)
+
+  private val rootPath = new File(this.getClass.getResource("/").getPath
+  + "../../../..").getCanonicalPath
+  private val storePath = s"$rootPath/integration/presto/target/store"
+  private val prestoServer = new PrestoServer
+  private val executorService = Executors.newFixedThreadPool(1)
+
+  override def beforeAll: Unit = {
+
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_WRITTEN_BY_APPNAME,
+  "Presto")
+val map = new util.HashMap[String, String]()
+map.put("hive.metastore", "file")
+map.put("hive.metastore.catalog.dir", s"file://$storePath")
+map.put("hive.allow-drop-table", "true")
+prestoServer.startServer("testdb", map)
+prestoServer.execute("drop schema if exists testdb")
+prestoServer.execute("create schema testdb")
+  }
+
+  override protected def beforeEach(): Unit = {
+val query = "create table testdb.testtable(ID int, date date, country 
varchar, name varchar, phonetype varchar, serialname varchar,salary 
decimal(6,1), bonus decimal(8,6), monthlyBonus decimal(5,3), dob timestamp, 
shortField smallint, iscurrentemployee boolean) with(format='CARBONDATA') "
+createTable(query, "testdb", "testtable")
+  }
+
+  private def createTable(query: String, databaseName: String, tableName: 
String): Unit = {
+prestoServer.execute(s"drop table if exists ${databaseName}.${tableName}")
+prestoServer.execute(query)
+logger.info("Creating The Carbon Store")
+val absoluteTableIdentifier: AbsoluteTableIdentifier = 
getAbsoluteIdentifier(databaseName, tableName)
+CarbonDataStoreCreator.createTable(absoluteTableIdentifier, true)
+logger.info(s"\nCarbon store is created at location: $storePath")
+  }
+
+  private def getAbsoluteIdentifier(dbName: String,
+  tableName: String) = {
+val absoluteTableIdentifier = AbsoluteTableIdentifier.from(
+  storePath + "/" + dbName + "/" + tableName,
+  new CarbonTableIdentifier(dbName,
+tableName,
+UUID.randomUUID().toString))
+absoluteTableIdentifier
+  }
+
+  test("test insert with different storage format names") {
+val query1 = "create table testdb.testtable(ID int, date date, country 
varchar, name varchar, phonetype varchar, serialname varchar,salary 
decimal(6,1), bonus decimal(8,6), monthlyBonus decimal(5,3), dob timestamp, 
shortField smallint, iscurrentemployee boolean) with(format='CARBONDATA') "
+val query2 = "create table testdb.testtable(ID int, date date, country 
varchar, name varchar, phonetype varchar, serialname 

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-20 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r508272165



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataPageSinkProvider.java
##
@@ -0,0 +1,182 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.util.List;
+import java.util.Map;
+import java.util.OptionalInt;
+import java.util.Set;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.ImmutableSet;
+import com.google.common.util.concurrent.ListeningExecutorService;
+import com.google.inject.Inject;
+import io.airlift.event.client.EventClient;
+import io.airlift.json.JsonCodec;
+import io.airlift.units.DataSize;
+import io.prestosql.plugin.hive.HdfsEnvironment;
+import io.prestosql.plugin.hive.HiveConfig;
+import io.prestosql.plugin.hive.HiveFileWriterFactory;
+import io.prestosql.plugin.hive.HivePageSink;
+import io.prestosql.plugin.hive.HivePageSinkProvider;
+import io.prestosql.plugin.hive.HiveSessionProperties;
+import io.prestosql.plugin.hive.HiveWritableTableHandle;
+import io.prestosql.plugin.hive.HiveWriterStats;
+import io.prestosql.plugin.hive.LocationService;
+import io.prestosql.plugin.hive.OrcFileWriterFactory;
+import io.prestosql.plugin.hive.PartitionUpdate;
+import io.prestosql.plugin.hive.metastore.HiveMetastore;
+import io.prestosql.plugin.hive.metastore.HivePageSinkMetadataProvider;
+import io.prestosql.plugin.hive.metastore.SortingColumn;
+import io.prestosql.spi.NodeManager;
+import io.prestosql.spi.PageIndexerFactory;
+import io.prestosql.spi.PageSorter;
+import io.prestosql.spi.connector.ConnectorInsertTableHandle;
+import io.prestosql.spi.connector.ConnectorPageSink;
+import io.prestosql.spi.connector.ConnectorSession;
+import io.prestosql.spi.connector.ConnectorTransactionHandle;
+import io.prestosql.spi.type.TypeManager;
+
+import static 
com.google.common.util.concurrent.MoreExecutors.listeningDecorator;
+import static io.airlift.concurrent.Threads.daemonThreadsNamed;
+import static 
io.prestosql.plugin.hive.metastore.CachingHiveMetastore.memoizeMetastore;
+import static java.util.Objects.requireNonNull;
+import static java.util.concurrent.Executors.newFixedThreadPool;
+
+public class CarbonDataPageSinkProvider extends HivePageSinkProvider {
+
+  private final Set fileWriterFactories;
+  private final HdfsEnvironment hdfsEnvironment;
+  private final PageSorter pageSorter;
+  private final HiveMetastore metastore;
+  private final PageIndexerFactory pageIndexerFactory;
+  private final TypeManager typeManager;
+  private final int maxOpenPartitions;
+  private final int maxOpenSortFiles;
+  private final DataSize writerSortBufferSize;
+  private final boolean immutablePartitions;
+  private final LocationService locationService;
+  private final ListeningExecutorService writeVerificationExecutor;
+  private final JsonCodec partitionUpdateCodec;
+  private final NodeManager nodeManager;
+  private final EventClient eventClient;
+  private final HiveSessionProperties hiveSessionProperties;
+  private final HiveWriterStats hiveWriterStats;
+  private final OrcFileWriterFactory orcFileWriterFactory;
+  private final long perTransactionMetastoreCacheMaximumSize;
+
+  @Inject
+  public CarbonDataPageSinkProvider(Set 
fileWriterFactories,
+  HdfsEnvironment hdfsEnvironment, PageSorter pageSorter, HiveMetastore 
metastore,
+  PageIndexerFactory pageIndexerFactory, TypeManager typeManager, 
HiveConfig config,
+  LocationService locationService, JsonCodec 
partitionUpdateCodec,
+  NodeManager nodeManager, EventClient eventClient, HiveSessionProperties 
hiveSessionProperties,
+  HiveWriterStats hiveWriterStats, OrcFileWriterFactory 
orcFileWriterFactory) {
+super(fileWriterFactories, hdfsEnvironment, pageSorter, metastore, 
pageIndexerFactory,

Review comment:
   `HivePageSinkProvider` has no default constructor, so we need to call 
`super` explicitly.
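
   As a reminder of the Java rule behind this, here is a minimal sketch. 
`BaseProvider` and `CarbonProviderSketch` are hypothetical classes that only 
mirror the shape of `HivePageSinkProvider` and its subclass; they are not the 
Presto API.

```java
// Hypothetical base class: its only constructor takes an argument,
// so the compiler does not generate a no-arg constructor for it.
class BaseProvider {
  private final String name;

  BaseProvider(String name) {
    this.name = name;
  }

  String name() {
    return name;
  }
}

// Hypothetical subclass: without the explicit super(...) call the compiler
// would insert super(), which does not exist, and compilation would fail.
class CarbonProviderSketch extends BaseProvider {
  CarbonProviderSketch(String name) {
    super(name);
  }
}
```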






[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-20 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r508270595



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataLocationService.java
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import com.google.inject.Inject;
+import io.prestosql.plugin.hive.HdfsEnvironment;
+import io.prestosql.plugin.hive.HiveLocationService;
+import io.prestosql.plugin.hive.HiveWriteUtils;
+import io.prestosql.plugin.hive.LocationHandle;
+import io.prestosql.plugin.hive.metastore.SemiTransactionalHiveMetastore;
+import io.prestosql.plugin.hive.metastore.Table;
+import io.prestosql.spi.connector.ConnectorSession;
+import org.apache.hadoop.fs.Path;
+
+public class CarbonDataLocationService extends HiveLocationService {
+
+  private final HdfsEnvironment hdfsEnvironment;
+
+  @Inject
+  public CarbonDataLocationService(HdfsEnvironment hdfsEnvironment) {
+super(hdfsEnvironment);
+this.hdfsEnvironment = hdfsEnvironment;
+  }
+
+  @Override
+  public LocationHandle forNewTable(SemiTransactionalHiveMetastore metastore,
+  ConnectorSession session, String schemaName, String tableName) {
+// TODO: test in cloud scenario in S3/OBS and make it compatible for cloud scenario
+super.forNewTable(metastore, session, schemaName, tableName);
+HdfsEnvironment.HdfsContext context =
+new HdfsEnvironment.HdfsContext(session, schemaName, tableName);
+Path targetPath = HiveWriteUtils
+.getTableDefaultLocation(context, metastore, this.hdfsEnvironment, 
schemaName, tableName);
+return new LocationHandle(targetPath, targetPath, false,
+LocationHandle.WriteMode.DIRECT_TO_TARGET_NEW_DIRECTORY);
+  }
+
+  @Override
+  public LocationHandle forExistingTable(SemiTransactionalHiveMetastore 
metastore,
+  ConnectorSession session, Table table) {
+// TODO: test in cloud scenario in S3/OBS and make it compatible for cloud scenario

Review comment:
   Testing on S3 is not complete because I don't have the environment. I added 
the TODO only for tracking: since we use the same path for both `targetPath` 
and `writePath`, it should work on S3 too. I will remove the comment once I 
get an environment and verify this.









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-20 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r508269521



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataLocationService.java
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import com.google.inject.Inject;
+import io.prestosql.plugin.hive.HdfsEnvironment;
+import io.prestosql.plugin.hive.HiveLocationService;
+import io.prestosql.plugin.hive.HiveWriteUtils;
+import io.prestosql.plugin.hive.LocationHandle;
+import io.prestosql.plugin.hive.metastore.SemiTransactionalHiveMetastore;
+import io.prestosql.plugin.hive.metastore.Table;
+import io.prestosql.spi.connector.ConnectorSession;
+import org.apache.hadoop.fs.Path;
+
+public class CarbonDataLocationService extends HiveLocationService {

Review comment:
   We need `CarbonDataLocationService` because `HiveLocationService` always 
uses a temporary directory as the store path and then moves the data to the 
staging directory, whereas we need `targetPath` and `writePath` to be the 
same. This will also work for S3.
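
   The difference between the two behaviours can be sketched as follows. This 
is a simplified illustration under my own assumptions: `LocationSketch`, 
`direct`, `staged`, and `needsMove` are hypothetical and do not reproduce 
Presto's `LocationHandle`.

```java
// Hypothetical model of a write location: where data is written first
// (writePath) and where it must finally live (targetPath).
class LocationSketch {
  final String targetPath;
  final String writePath;

  private LocationSketch(String targetPath, String writePath) {
    this.targetPath = targetPath;
    this.writePath = writePath;
  }

  // Carbon-style: write straight into the table location.
  static LocationSketch direct(String tableLocation) {
    return new LocationSketch(tableLocation, tableLocation);
  }

  // Hive-style: write somewhere else first, move to the target afterwards.
  static LocationSketch staged(String tableLocation, String stagingDir) {
    return new LocationSketch(tableLocation, stagingDir);
  }

  // A move (rename) step is only needed when the two paths differ.
  boolean needsMove() {
    return !targetPath.equals(writePath);
  }
}
```

   With both paths equal there is no rename step after the write, which is 
the property the comment relies on for object stores.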









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-20 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r508265258



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataFileWriter.java
##
@@ -0,0 +1,188 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.io.IOException;
+import java.io.UncheckedIOException;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Properties;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hive.CarbonHiveSerDe;
+import org.apache.carbondata.hive.MapredCarbonOutputFormat;
+import org.apache.carbondata.presto.impl.CarbonTableConfig;
+
+import com.google.common.collect.ImmutableList;
+import io.prestosql.plugin.hive.HiveFileWriter;
+import io.prestosql.plugin.hive.HiveType;
+import io.prestosql.plugin.hive.HiveWriteUtils;
+import io.prestosql.spi.Page;
+import io.prestosql.spi.PrestoException;
+import io.prestosql.spi.block.Block;
+import io.prestosql.spi.type.Type;
+import io.prestosql.spi.type.TypeManager;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.exec.FileSinkOperator;
+import org.apache.hadoop.hive.ql.io.HiveOutputFormat;
+import org.apache.hadoop.hive.ql.io.IOConstants;
+import org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector;
+import org.apache.hadoop.hive.serde2.SerDeException;
+import 
org.apache.hadoop.hive.serde2.objectinspector.SettableStructObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.StructField;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.Reporter;
+import org.apache.log4j.Logger;
+
+import static com.google.common.collect.ImmutableList.toImmutableList;
+import static io.prestosql.plugin.hive.HiveErrorCode.HIVE_WRITER_DATA_ERROR;
+import static java.util.Objects.requireNonNull;
+import static java.util.stream.Collectors.toList;
+import static org.apache.hadoop.hive.conf.HiveConf.ConfVars.COMPRESSRESULT;
+
+/**
+ * This class implements HiveFileWriter and it creates the carbonFileWriter to 
write the page data
+ * sent from presto.
+ */
+public class CarbonDataFileWriter implements HiveFileWriter {
+
+  private static final Logger LOG =
+  LogServiceFactory.getLogService(CarbonDataFileWriter.class.getName());
+
+  private final JobConf configuration;
+  private final Path outPutPath;
+  private final FileSinkOperator.RecordWriter recordWriter;
+  private final CarbonHiveSerDe serDe;
+  private final int fieldCount;
+  private final Object row;
+  private final SettableStructObjectInspector tableInspector;
+  private final List structFields;
+  private final HiveWriteUtils.FieldSetter[] setters;
+
+  private boolean isCommitDone;
+
+  public CarbonDataFileWriter(Path outPutPath, List inputColumnNames, Properties properties,
+  JobConf configuration, TypeManager typeManager) throws SerDeException {
+requireNonNull(outPutPath, "path is null");
+// take the outputPath same as location in compliance with the carbon store folder structure.
+this.outPutPath = new Path(properties.getProperty("location"));
+this.configuration = requireNonNull(configuration, "conf is null");
+List columnNames = Arrays
+.asList(properties.getProperty(IOConstants.COLUMNS, "").split(CarbonCommonConstants.COMMA));
+List fileColumnTypes =
+HiveType.toHiveTypes(properties.getProperty(IOConstants.COLUMNS_TYPES, "")).stream()
+.map(hiveType -> hiveType.getType(typeManager)).collect(toList());
+this.fieldCount = columnNames.size();
+this.serDe = new CarbonHiveSerDe();
+serDe.initialize(configuration, properties);
+this.tableInspector = (ArrayWritableObjectInspector) 
serDe.getObjectInspector();
+
+this.structFields =
+
ImmutableList.copyOf(inputColumnNames.stream().map(tableInspector::getStructFieldRef)
+   

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-20 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r508264164



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataInsertTableHandle.java
##
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.presto;
+
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.collect.ImmutableMap;
+import io.prestosql.plugin.hive.HiveBucketProperty;
+import io.prestosql.plugin.hive.HiveColumnHandle;
+import io.prestosql.plugin.hive.HiveInsertTableHandle;
+import io.prestosql.plugin.hive.HiveStorageFormat;
+import io.prestosql.plugin.hive.LocationHandle;
+import io.prestosql.plugin.hive.metastore.HivePageSinkMetadata;
+import io.prestosql.spi.connector.ConnectorInsertTableHandle;
+
+import static java.util.Objects.requireNonNull;
+
+public class CarbonDataInsertTableHandle extends HiveInsertTableHandle implements

Review comment:
   Yes, we need this. We have to send the load model, which we prepared in
`carbondataMetadata` `beginInsert()`, to all the workers during writing, and
there is no other place where we can add the load model and send it to the
workers. So we need `CarbonDataInsertTableHandle`, where we add the load model
to `additionalConf` and send it to the workers to support transactions.
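
   As a rough sketch of the idea (not the code in this PR), the load model can
be serialized to a string on the coordinator, placed in a string map such as
`additionalConf`, and decoded again on the worker. The property key is the one
named elsewhere in this thread; `DemoLoadModel`, `onCoordinator`, `onWorker` and
the choice of Java serialization plus Base64 are assumptions made only for this
example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

final class LoadModelTransportSketch {
  // Key name taken from this review thread.
  static final String LOAD_MODEL_KEY = "carbondata.presto.encoded.loadmodel";

  /** Hypothetical stand-in for the real load model; it only has to be Serializable. */
  static final class DemoLoadModel implements Serializable {
    private static final long serialVersionUID = 1L;
    final String tablePath;
    final String segmentId;

    DemoLoadModel(String tablePath, String segmentId) {
      this.tablePath = tablePath;
      this.segmentId = segmentId;
    }
  }

  /** Coordinator side: encode the model and put it into the extra configuration. */
  static Map<String, String> onCoordinator(DemoLoadModel model) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(model);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    Map<String, String> additionalConf = new HashMap<>();
    additionalConf.put(LOAD_MODEL_KEY, Base64.getEncoder().encodeToString(bytes.toByteArray()));
    return additionalConf;
  }

  /** Worker side: read the encoded string back and rebuild the model. */
  static DemoLoadModel onWorker(Map<String, String> additionalConf) {
    byte[] raw = Base64.getDecoder().decode(additionalConf.get(LOAD_MODEL_KEY));
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
      return (DemoLoadModel) in.readObject();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    Map<String, String> conf = onCoordinator(new DemoLoadModel("/store/testdb/testtable", "0"));
    System.out.println(onWorker(conf).tablePath);
  }
}
```

   Carrying the model as a plain string keeps the handle JSON-friendly, since
only a `Map<String, String>` has to travel from coordinator to worker.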









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-20 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r508236769



##
File path: 
integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonOutputFormat.java
##
@@ -92,6 +95,14 @@ public void checkOutputSpecs(FileSystem fileSystem, JobConf jobConf) throws IOEx
 }
 String tablePath = FileFactory.getCarbonFile(carbonLoadModel.getTablePath()).getAbsolutePath();
 TaskAttemptID taskAttemptID = TaskAttemptID.forName(jc.get("mapred.task.id"));
+// taskAttemptID will be null when the insert job is fired from presto. Presto send the JobConf
+// and since presto does not use the MR framework for execution, the mapred.task.id will be
+// null, so prepare a new ID.
+if (taskAttemptID == null) {
+  SimpleDateFormat formatter = new SimpleDateFormat("MMddHHmm");
+  String jobTrackerId = formatter.format(new Date());
+  taskAttemptID = new TaskAttemptID(jobTrackerId, 0, TaskType.MAP, 0, 0);

Review comment:
   Here `taskAttemptID` is a `TaskAttemptID` object. Since a new task is
created for every writer, there should be no problem. We get the JobConf from
presto and prepare the `TaskAttemptID` only to initialize and close the writer,
so it should be fine, I guess. What do you think?
   
   As for the ORC writer, ORC uses a different `FileOutputFormat`, from the
`mapred` package, while we use the one from the `mapreduce` package. In `mapred`
the task context is not used, so they do not need this.
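
   For reference, a small JDK-only sketch of the identifier built in the diff
above (the Hadoop `TaskAttemptID` class is left out so the example stays
self-contained). It shows that the `MMddHHmm` pattern has minute resolution, so
two writers created within the same minute get the same `jobTrackerId`. As
argued above, that is expected to be harmless as long as the ID is used only to
initialize and close the writer. `FallbackTaskIdSketch`, `minuteStamp` and
`uniqueId` are hypothetical names, the time zone is pinned to UTC only to make
the output reproducible, and the UUID variant is just one possible alternative.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;
import java.util.UUID;

final class FallbackTaskIdSketch {
  /** Same pattern as in the diff; UTC is fixed here only for reproducible output. */
  static String minuteStamp(Date now) {
    SimpleDateFormat formatter = new SimpleDateFormat("MMddHHmm");
    formatter.setTimeZone(TimeZone.getTimeZone("UTC"));
    return formatter.format(now);
  }

  /** Hypothetical alternative: an identifier that differs on every call. */
  static String uniqueId() {
    return UUID.randomUUID().toString();
  }

  public static void main(String[] args) {
    Date first = new Date(0L);       // 1970-01-01 00:00:00 UTC
    Date second = new Date(10_000L); // ten seconds later, same minute
    System.out.println(minuteStamp(first).equals(minuteStamp(second)));
    System.out.println(uniqueId().equals(uniqueId()));
  }
}
```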









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-09-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r491866753



##
File path: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
##
@@ -2525,4 +2525,9 @@ private CarbonCommonConstants() {
* property which defines the presto query default value
*/
   public static final String IS_QUERY_FROM_PRESTO_DEFAULT = "false";
+
+  /**
+   * property to send load model from coordinator to worker in presto
+   */
+  public static final String CARBON_PRESTO_LOAD_MODEL = "presto.carbondata.encoded.loadmodel";

Review comment:
   It's not configurable by the user, you are right. I have moved the constant
to `CarbonTableConfig` in the presto module and renamed it to
`carbondata.presto.encoded.loadmodel`.

##
File path: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
##
@@ -2525,4 +2525,9 @@ private CarbonCommonConstants() {
* property which defines the presto query default value
*/
   public static final String IS_QUERY_FROM_PRESTO_DEFAULT = "false";
+
+  /**
+   * property to send load model from coordinator to worker in presto

Review comment:
   Not a user config, same as the comment above. Moved to the presto module.

##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataFileWriter.java
##
@@ -0,0 +1,183 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.io.IOException;
+import java.io.UncheckedIOException;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Properties;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hive.CarbonHiveSerDe;
+import org.apache.carbondata.hive.MapredCarbonOutputFormat;
+import org.apache.carbondata.presto.impl.CarbonTableConfig;
+
+import com.google.common.collect.ImmutableList;
+import io.prestosql.plugin.hive.HiveFileWriter;
+import io.prestosql.plugin.hive.HiveType;
+import io.prestosql.plugin.hive.HiveWriteUtils;
+import io.prestosql.spi.Page;
+import io.prestosql.spi.PrestoException;
+import io.prestosql.spi.block.Block;
+import io.prestosql.spi.type.Type;
+import io.prestosql.spi.type.TypeManager;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.exec.FileSinkOperator;
+import org.apache.hadoop.hive.ql.io.HiveOutputFormat;
+import org.apache.hadoop.hive.ql.io.IOConstants;
+import org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector;
+import org.apache.hadoop.hive.serde2.SerDeException;
+import 
org.apache.hadoop.hive.serde2.objectinspector.SettableStructObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.StructField;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.Reporter;
+import org.apache.log4j.Logger;
+
+import static com.google.common.collect.ImmutableList.toImmutableList;
+import static io.prestosql.plugin.hive.HiveErrorCode.HIVE_WRITER_DATA_ERROR;
+import static java.util.Objects.requireNonNull;
+import static java.util.stream.Collectors.toList;
+import static org.apache.hadoop.hive.conf.HiveConf.ConfVars.COMPRESSRESULT;
+
+/**
+ * This class implements HiveFileWriter and it creates the carbonFileWriter to 
write the age data
+ * sent from presto.
+ */
+public class CarbonDataFileWriter implements HiveFileWriter {
+
+  private static final Logger LOG =
+  LogServiceFactory.getLogService(CarbonDataFileWriter.class.getName());
+
+  private final JobConf configuration;
+  private Path outPutPath;
+  private final FileSinkOperator.RecordWriter recordWriter;
+  private final CarbonHiveSerDe serDe;
+  private final int fieldCount;
+  private final Object row;
+  private final SettableStructObjectInspector tableInspector;
+  private final List structFields;
+  private final HiveWriteUtils.FieldSetter[] setters;
+
+  private boolean 

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-09-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r492162320



##
File path: 
integration/presto/src/test/scala/org/apache/carbondata/presto/integrationtest/PrestoInsertIntoTableTestCase.scala
##
@@ -0,0 +1,170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto.integrationtest
+
+import java.io.File
+import java.util
+import java.util.UUID
+
+import scala.collection.JavaConverters._
+
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach, FunSuiteLike}
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.filesystem.{CarbonFile, 
CarbonFileFilter}
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.metadata.schema.SchemaReader
+import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, 
CarbonTableIdentifier}
+import org.apache.carbondata.core.statusmanager.SegmentStatusManager
+import org.apache.carbondata.core.util.path.CarbonTablePath
+import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
+import org.apache.carbondata.presto.server.PrestoServer
+import org.apache.carbondata.presto.util.CarbonDataStoreCreator
+
+class PrestoInsertIntoTableTestCase extends FunSuiteLike with 
BeforeAndAfterAll with BeforeAndAfterEach {
+
+  private val logger = LogServiceFactory
+.getLogService(classOf[PrestoAllDataTypeTest].getCanonicalName)
+
+  private val rootPath = new File(this.getClass.getResource("/").getPath
+  + "../../../..").getCanonicalPath
+  private val storePath = s"$rootPath/integration/presto/target/store"
+  private val systemPath = s"$rootPath/integration/presto/target/system"
+  private val prestoServer = new PrestoServer
+
+  override def beforeAll: Unit = {
+
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_WRITTEN_BY_APPNAME,
+  "Presto")
+val map = new util.HashMap[String, String]()
+map.put("hive.metastore", "file")
+map.put("hive.metastore.catalog.dir", s"file://$storePath")
+map.put("hive.allow-drop-table", "true")
+prestoServer.startServer("testdb", map)
+prestoServer.execute("drop schema if exists testdb")
+prestoServer.execute("create schema testdb")
+  }
+
+  override protected def beforeEach(): Unit = {
+val query = "create table testdb.testtable(ID int, date date, country 
varchar, name varchar, phonetype varchar, serialname varchar,salary 
decimal(6,1), bonus decimal(8,6), monthlyBonus decimal(5,3), dob timestamp, 
shortField smallint, iscurrentemployee boolean) with(format='CARBONDATA') "
+createTable(query, "testdb", "testtable")
+  }
+
+  private def createTable(query: String, databaseName: String, tableName: 
String): Unit = {
+prestoServer.execute(s"drop table if exists ${databaseName}.${tableName}")
+prestoServer.execute(query)
+logger.info("Creating The Carbon Store")
+val absoluteTableIdentifier: AbsoluteTableIdentifier = 
getAbsoluteIdentifier(databaseName, tableName)
+CarbonDataStoreCreator.createTable(absoluteTableIdentifier, true)
+logger.info(s"\nCarbon store is created at location: $storePath")
+  }
+
+  private def getAbsoluteIdentifier(dbName: String,
+  tableName: String) = {
+val absoluteTableIdentifier = AbsoluteTableIdentifier.from(
+  storePath + "/" + dbName + "/" + tableName,
+  new CarbonTableIdentifier(dbName,
+tableName,
+UUID.randomUUID().toString))
+absoluteTableIdentifier
+  }
+
+  test("test insert with different storage format names") {
+val query1 = "create table testdb.testtable(ID int, date date, country 
varchar, name varchar, phonetype varchar, serialname varchar,salary 
decimal(6,1), bonus decimal(8,6), monthlyBonus decimal(5,3), dob timestamp, 
shortField smallint, iscurrentemployee boolean) with(format='CARBONDATA') "
+val query2 = "create table testdb.testtable(ID int, date date, country 
varchar, name varchar, phonetype varchar, serialname varchar,salary 
decimal(6,1), bonus decimal(8,6), monthlyBonus 

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-09-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r492064374



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataFileWriter.java
##
@@ -0,0 +1,183 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.io.IOException;
+import java.io.UncheckedIOException;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Properties;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hive.CarbonHiveSerDe;
+import org.apache.carbondata.hive.MapredCarbonOutputFormat;
+import org.apache.carbondata.presto.impl.CarbonTableConfig;
+
+import com.google.common.collect.ImmutableList;
+import io.prestosql.plugin.hive.HiveFileWriter;
+import io.prestosql.plugin.hive.HiveType;
+import io.prestosql.plugin.hive.HiveWriteUtils;
+import io.prestosql.spi.Page;
+import io.prestosql.spi.PrestoException;
+import io.prestosql.spi.block.Block;
+import io.prestosql.spi.type.Type;
+import io.prestosql.spi.type.TypeManager;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.exec.FileSinkOperator;
+import org.apache.hadoop.hive.ql.io.HiveOutputFormat;
+import org.apache.hadoop.hive.ql.io.IOConstants;
+import org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector;
+import org.apache.hadoop.hive.serde2.SerDeException;
+import 
org.apache.hadoop.hive.serde2.objectinspector.SettableStructObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.StructField;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.Reporter;
+import org.apache.log4j.Logger;
+
+import static com.google.common.collect.ImmutableList.toImmutableList;
+import static io.prestosql.plugin.hive.HiveErrorCode.HIVE_WRITER_DATA_ERROR;
+import static java.util.Objects.requireNonNull;
+import static java.util.stream.Collectors.toList;
+import static org.apache.hadoop.hive.conf.HiveConf.ConfVars.COMPRESSRESULT;
+
+/**
+ * This class implements HiveFileWriter and it creates the carbonFileWriter to 
write the age data
+ * sent from presto.
+ */
+public class CarbonDataFileWriter implements HiveFileWriter {
+
+  private static final Logger LOG =
+  LogServiceFactory.getLogService(CarbonDataFileWriter.class.getName());
+
+  private final JobConf configuration;
+  private Path outPutPath;
+  private final FileSinkOperator.RecordWriter recordWriter;
+  private final CarbonHiveSerDe serDe;
+  private final int fieldCount;
+  private final Object row;
+  private final SettableStructObjectInspector tableInspector;
+  private final List structFields;
+  private final HiveWriteUtils.FieldSetter[] setters;
+
+  private boolean isCommitDone;
+
+  public CarbonDataFileWriter(Path outPutPath, List inputColumnNames, 
Properties properties,
+  JobConf configuration, TypeManager typeManager) throws SerDeException {
+this.outPutPath = requireNonNull(outPutPath, "path is null");
+this.outPutPath = new Path(properties.getProperty("location"));
+outPutPath = new Path(properties.getProperty("location"));
+this.configuration = requireNonNull(configuration, "conf is null");
+List columnNames = Arrays
+.asList(properties.getProperty(IOConstants.COLUMNS, 
"").split(CarbonCommonConstants.COMMA));
+List fileColumnTypes =
+HiveType.toHiveTypes(properties.getProperty(IOConstants.COLUMNS_TYPES, 
"")).stream()
+.map(hiveType -> hiveType.getType(typeManager)).collect(toList());
+fieldCount = columnNames.size();
+serDe = new CarbonHiveSerDe();
+serDe.initialize(configuration, properties);
+tableInspector = (ArrayWritableObjectInspector) serDe.getObjectInspector();
+
+structFields = ImmutableList.copyOf(
+inputColumnNames.stream().map(tableInspector::getStructFieldRef)
+.collect(toImmutableList()));
+
+row 

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-09-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r492063738



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataFileWriter.java
##
@@ -0,0 +1,183 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.io.IOException;
+import java.io.UncheckedIOException;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Properties;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hive.CarbonHiveSerDe;
+import org.apache.carbondata.hive.MapredCarbonOutputFormat;
+import org.apache.carbondata.presto.impl.CarbonTableConfig;
+
+import com.google.common.collect.ImmutableList;
+import io.prestosql.plugin.hive.HiveFileWriter;
+import io.prestosql.plugin.hive.HiveType;
+import io.prestosql.plugin.hive.HiveWriteUtils;
+import io.prestosql.spi.Page;
+import io.prestosql.spi.PrestoException;
+import io.prestosql.spi.block.Block;
+import io.prestosql.spi.type.Type;
+import io.prestosql.spi.type.TypeManager;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.exec.FileSinkOperator;
+import org.apache.hadoop.hive.ql.io.HiveOutputFormat;
+import org.apache.hadoop.hive.ql.io.IOConstants;
+import org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector;
+import org.apache.hadoop.hive.serde2.SerDeException;
+import 
org.apache.hadoop.hive.serde2.objectinspector.SettableStructObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.StructField;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.Reporter;
+import org.apache.log4j.Logger;
+
+import static com.google.common.collect.ImmutableList.toImmutableList;
+import static io.prestosql.plugin.hive.HiveErrorCode.HIVE_WRITER_DATA_ERROR;
+import static java.util.Objects.requireNonNull;
+import static java.util.stream.Collectors.toList;
+import static org.apache.hadoop.hive.conf.HiveConf.ConfVars.COMPRESSRESULT;
+
+/**
+ * This class implements HiveFileWriter and it creates the carbonFileWriter to 
write the age data

Review comment:
   It's page data. Changed.









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-09-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r492063867



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataFileWriter.java
##
@@ -0,0 +1,183 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.io.IOException;
+import java.io.UncheckedIOException;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Properties;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hive.CarbonHiveSerDe;
+import org.apache.carbondata.hive.MapredCarbonOutputFormat;
+import org.apache.carbondata.presto.impl.CarbonTableConfig;
+
+import com.google.common.collect.ImmutableList;
+import io.prestosql.plugin.hive.HiveFileWriter;
+import io.prestosql.plugin.hive.HiveType;
+import io.prestosql.plugin.hive.HiveWriteUtils;
+import io.prestosql.spi.Page;
+import io.prestosql.spi.PrestoException;
+import io.prestosql.spi.block.Block;
+import io.prestosql.spi.type.Type;
+import io.prestosql.spi.type.TypeManager;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.exec.FileSinkOperator;
+import org.apache.hadoop.hive.ql.io.HiveOutputFormat;
+import org.apache.hadoop.hive.ql.io.IOConstants;
+import org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector;
+import org.apache.hadoop.hive.serde2.SerDeException;
+import 
org.apache.hadoop.hive.serde2.objectinspector.SettableStructObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.StructField;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.Reporter;
+import org.apache.log4j.Logger;
+
+import static com.google.common.collect.ImmutableList.toImmutableList;
+import static io.prestosql.plugin.hive.HiveErrorCode.HIVE_WRITER_DATA_ERROR;
+import static java.util.Objects.requireNonNull;
+import static java.util.stream.Collectors.toList;
+import static org.apache.hadoop.hive.conf.HiveConf.ConfVars.COMPRESSRESULT;
+
+/**
+ * This class implements HiveFileWriter and it creates the carbonFileWriter to 
write the age data
+ * sent from presto.
+ */
+public class CarbonDataFileWriter implements HiveFileWriter {
+
+  private static final Logger LOG =
+  LogServiceFactory.getLogService(CarbonDataFileWriter.class.getName());
+
+  private final JobConf configuration;
+  private Path outPutPath;

Review comment:
   done

##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataFileWriter.java
##
@@ -0,0 +1,183 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.io.IOException;
+import java.io.UncheckedIOException;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Properties;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hive.CarbonHiveSerDe;
+import org.apache.carbondata.hive.MapredCarbonOutputFormat;
+import 

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-09-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r492063498



##
File path: 
integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonOutputFormat.java
##
@@ -92,6 +95,11 @@ public void checkOutputSpecs(FileSystem fileSystem, JobConf jobConf) throws IOEx
 }
 String tablePath = FileFactory.getCarbonFile(carbonLoadModel.getTablePath()).getAbsolutePath();
 TaskAttemptID taskAttemptID = TaskAttemptID.forName(jc.get("mapred.task.id"));
+if (taskAttemptID == null) {
+  SimpleDateFormat formatter = new SimpleDateFormat("MMddHHmm");

Review comment:
   done
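
For readers following the thread, here is a minimal sketch of the fallback being discussed, assuming no Hadoop classes are on the classpath. `FallbackTaskId` and its methods are hypothetical names, and the `attempt_..._0000_m_000000_0` string only imitates the usual shape of a Hadoop task attempt ID. It contrasts the minute-granularity `MMddHHmm` timestamp from the patch with the UUID that a reviewer suggested elsewhere in this thread.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.UUID;

// Hypothetical sketch: not part of CarbonData or Hadoop.
final class FallbackTaskId {

  // Mirrors the patch: the job tracker id is a timestamp with minute granularity,
  // so two IDs created within the same minute are identical.
  static String timestampBased(Date now) {
    String jobTrackerId = new SimpleDateFormat("MMddHHmm").format(now);
    return "attempt_" + jobTrackerId + "_0000_m_000000_0";
  }

  // Alternative raised in review: a random UUID makes a collision very unlikely.
  static String uuidBased() {
    return "attempt_" + UUID.randomUUID() + "_0000_m_000000_0";
  }

  public static void main(String[] args) {
    Date now = new Date();
    // Same instant, same ID: prints true.
    System.out.println(timestampBased(now).equals(timestampBased(now)));
    // Two random IDs differ: prints false.
    System.out.println(uuidBased().equals(uuidBased()));
  }
}
```

Whether a collision matters depends on whether the ID ends up in a file name, which is the point debated in the review.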





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-09-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r492063411



##
File path: 
integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonOutputCommitter.java
##
@@ -52,25 +53,30 @@
 
   @Override
   public void setupJob(JobContext jobContext) throws IOException {
-ThreadLocalSessionInfo.setConfigurationToCurrentThread(jobContext.getConfiguration());
-String a = jobContext.getJobConf().get(JobConf.MAPRED_MAP_TASK_ENV);
 Random random = new Random();
 JobID jobId = new JobID(UUID.randomUUID().toString(), 0);
 TaskID task = new TaskID(jobId, TaskType.MAP, random.nextInt());
 TaskAttemptID attemptID = new TaskAttemptID(task, random.nextInt());
 org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl context =
 new TaskAttemptContextImpl(jobContext.getJobConf(), attemptID);
-CarbonLoadModel carbonLoadModel =
-HiveCarbonUtil.getCarbonLoadModel(jobContext.getConfiguration());
-CarbonTableOutputFormat.setLoadModel(jobContext.getConfiguration(), carbonLoadModel);
+CarbonLoadModel carbonLoadModel = null;
+String encodedString = jobContext.getJobConf().get(CarbonTableOutputFormat.LOAD_MODEL);
+if (encodedString != null) {
+  carbonLoadModel =
+  (CarbonLoadModel) ObjectSerializationUtil.convertStringToObject(encodedString);
+}
+if (null == carbonLoadModel) {
+  ThreadLocalSessionInfo.setConfigurationToCurrentThread(jobContext.getConfiguration());
+  String a = jobContext.getJobConf().get(JobConf.MAPRED_MAP_TASK_ENV);
+  carbonLoadModel = HiveCarbonUtil.getCarbonLoadModel(jobContext.getConfiguration());
+  CarbonTableOutputFormat.setLoadModel(jobContext.getConfiguration(), carbonLoadModel);
+  String loadModelStr = jobContext.getConfiguration().get(CarbonTableOutputFormat.LOAD_MODEL);
+  jobContext.getJobConf().set(JobConf.MAPRED_MAP_TASK_ENV, a + ",carbon=" + loadModelStr);

Review comment:
   Added a comment for the base code. @kunal642, please check whether the
comment is accurate.
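
The lookup order in the hunk above can be sketched as follows: reuse an encoded load model if one is already stored under the `LOAD_MODEL` key, otherwise build one and store it. This is only an illustration under simplifying assumptions. A plain `Map` stands in for the job configuration, a `String` stands in for the serialized `CarbonLoadModel`, and `LoadModelLookup` and `resolve` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: a Map replaces JobConf and a String replaces the encoded model.
final class LoadModelLookup {

  // Key value quoted from the diff of CarbonTableOutputFormat.
  static final String LOAD_MODEL = "mapreduce.carbontable.load.model";

  // Returns the model already present in conf; otherwise builds one,
  // stores it under LOAD_MODEL, and returns it.
  static String resolve(Map<String, String> conf, Supplier<String> builder) {
    String encoded = conf.get(LOAD_MODEL);
    if (encoded != null) {
      return encoded;
    }
    String built = builder.get();
    conf.put(LOAD_MODEL, built);
    return built;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    // Nothing stored yet, so the builder runs: prints "built-on-demand".
    System.out.println(resolve(conf, () -> "built-on-demand"));
    // A model is now stored, so the second builder is ignored: prints "built-on-demand".
    System.out.println(resolve(conf, () -> "never-used"));
  }
}
```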









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-09-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r492063105



##
File path: 
integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonOutputCommitter.java
##
@@ -52,25 +53,30 @@
 
   @Override
   public void setupJob(JobContext jobContext) throws IOException {
-ThreadLocalSessionInfo.setConfigurationToCurrentThread(jobContext.getConfiguration());
-String a = jobContext.getJobConf().get(JobConf.MAPRED_MAP_TASK_ENV);
 Random random = new Random();
 JobID jobId = new JobID(UUID.randomUUID().toString(), 0);
 TaskID task = new TaskID(jobId, TaskType.MAP, random.nextInt());
 TaskAttemptID attemptID = new TaskAttemptID(task, random.nextInt());
 org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl context =
 new TaskAttemptContextImpl(jobContext.getJobConf(), attemptID);
-CarbonLoadModel carbonLoadModel =
-HiveCarbonUtil.getCarbonLoadModel(jobContext.getConfiguration());
-CarbonTableOutputFormat.setLoadModel(jobContext.getConfiguration(), carbonLoadModel);
+CarbonLoadModel carbonLoadModel = null;
+String encodedString = jobContext.getJobConf().get(CarbonTableOutputFormat.LOAD_MODEL);
+if (encodedString != null) {

Review comment:
   Actually, this is a refactoring of the base code; I have added a comment.
@kunal642, please check whether the comment is accurate or whether I need to
modify it.









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-09-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r492062721



##
File path: 
hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java
##
@@ -76,7 +76,7 @@
 // TODO Move dictionary generator which is coded in spark to MR framework.
 public class CarbonTableOutputFormat extends FileOutputFormat {
 
-  protected static final String LOAD_MODEL = "mapreduce.carbontable.load.model";
+  public static final String LOAD_MODEL = "mapreduce.carbontable.load.model";

Review comment:
   done









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-09-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r492058430



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataPageSinkProvider.java
##
@@ -0,0 +1,182 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.util.List;
+import java.util.Map;
+import java.util.OptionalInt;
+import java.util.Set;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.ImmutableSet;
+import com.google.common.util.concurrent.ListeningExecutorService;
+import com.google.inject.Inject;
+import io.airlift.event.client.EventClient;
+import io.airlift.json.JsonCodec;
+import io.airlift.units.DataSize;
+import io.prestosql.plugin.hive.HdfsEnvironment;
+import io.prestosql.plugin.hive.HiveConfig;
+import io.prestosql.plugin.hive.HiveFileWriterFactory;
+import io.prestosql.plugin.hive.HivePageSink;
+import io.prestosql.plugin.hive.HivePageSinkProvider;
+import io.prestosql.plugin.hive.HiveSessionProperties;
+import io.prestosql.plugin.hive.HiveWritableTableHandle;
+import io.prestosql.plugin.hive.HiveWriterStats;
+import io.prestosql.plugin.hive.LocationService;
+import io.prestosql.plugin.hive.OrcFileWriterFactory;
+import io.prestosql.plugin.hive.PartitionUpdate;
+import io.prestosql.plugin.hive.metastore.HiveMetastore;
+import io.prestosql.plugin.hive.metastore.HivePageSinkMetadataProvider;
+import io.prestosql.plugin.hive.metastore.SortingColumn;
+import io.prestosql.spi.NodeManager;
+import io.prestosql.spi.PageIndexerFactory;
+import io.prestosql.spi.PageSorter;
+import io.prestosql.spi.connector.ConnectorInsertTableHandle;
+import io.prestosql.spi.connector.ConnectorPageSink;
+import io.prestosql.spi.connector.ConnectorSession;
+import io.prestosql.spi.connector.ConnectorTransactionHandle;
+import io.prestosql.spi.type.TypeManager;
+
+import static com.google.common.util.concurrent.MoreExecutors.listeningDecorator;
+import static io.airlift.concurrent.Threads.daemonThreadsNamed;
+import static io.prestosql.plugin.hive.metastore.CachingHiveMetastore.memoizeMetastore;
+import static java.util.Objects.requireNonNull;
+import static java.util.concurrent.Executors.newFixedThreadPool;
+
+public class CarbonDataPageSinkProvider extends HivePageSinkProvider {
+
+  private final Set<HiveFileWriterFactory> fileWriterFactories;
+  private final HdfsEnvironment hdfsEnvironment;
+  private final PageSorter pageSorter;
+  private final HiveMetastore metastore;
+  private final PageIndexerFactory pageIndexerFactory;
+  private final TypeManager typeManager;
+  private final int maxOpenPartitions;
+  private final int maxOpenSortFiles;
+  private final DataSize writerSortBufferSize;
+  private final boolean immutablePartitions;
+  private final LocationService locationService;
+  private final ListeningExecutorService writeVerificationExecutor;
+  private final JsonCodec<PartitionUpdate> partitionUpdateCodec;
+  private final NodeManager nodeManager;
+  private final EventClient eventClient;
+  private final HiveSessionProperties hiveSessionProperties;
+  private final HiveWriterStats hiveWriterStats;
+  private final OrcFileWriterFactory orcFileWriterFactory;
+  private final long perTransactionMetastoreCacheMaximumSize;
+
+  @Inject
+  public CarbonDataPageSinkProvider(Set<HiveFileWriterFactory> fileWriterFactories,
+  HdfsEnvironment hdfsEnvironment, PageSorter pageSorter, HiveMetastore metastore,

Review comment:
   The superclass takes this many parameters in its constructor, so I followed
the same signature. It is called from only one place, and the arguments are
supplied by the injection framework (`@Inject`), so it should be fine?

##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonMetadataFactory.java
##
@@ -0,0 +1,134 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in 

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-09-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r492058162



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataMetaData.java
##
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.io.IOException;
+import java.util.Collection;
+import java.util.Optional;
+import java.util.Properties;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.util.ThreadLocalSessionInfo;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hive.MapredCarbonOutputCommitter;
+import org.apache.carbondata.hive.util.HiveCarbonUtil;
+import org.apache.carbondata.presto.impl.CarbonTableConfig;
+import org.apache.carbondata.processing.loading.model.CarbonLoadModel;
+
+import com.google.common.collect.ImmutableMap;
+import io.airlift.slice.Slice;
+import io.prestosql.plugin.hive.HdfsEnvironment;
+import io.prestosql.plugin.hive.HiveInsertTableHandle;
+import io.prestosql.plugin.hive.HiveMetadata;
+import io.prestosql.plugin.hive.HivePartitionManager;
+import io.prestosql.plugin.hive.LocationService;
+import io.prestosql.plugin.hive.PartitionUpdate;
+import io.prestosql.plugin.hive.TypeTranslator;
+import io.prestosql.plugin.hive.metastore.MetastoreUtil;
+import io.prestosql.plugin.hive.metastore.SemiTransactionalHiveMetastore;
+import io.prestosql.plugin.hive.metastore.Table;
+import io.prestosql.plugin.hive.security.AccessControlMetadata;
+import io.prestosql.plugin.hive.statistics.HiveStatisticsProvider;
+import io.prestosql.plugin.hive.util.ConfigurationUtils;
+import io.prestosql.spi.connector.ConnectorInsertTableHandle;
+import io.prestosql.spi.connector.ConnectorOutputMetadata;
+import io.prestosql.spi.connector.ConnectorSession;
+import io.prestosql.spi.connector.ConnectorTableHandle;
+import io.prestosql.spi.connector.SchemaTableName;
+import io.prestosql.spi.statistics.ComputedStatistics;
+import io.prestosql.spi.type.TypeManager;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.JobContextImpl;
+import org.apache.hadoop.mapred.JobID;
+
+import org.apache.hadoop.mapred.TaskAttemptContextImpl;
+import org.apache.log4j.Logger;
+import org.joda.time.DateTimeZone;
+
+public class CarbonDataMetaData extends HiveMetadata {
+
+  private static final Logger LOG =
+  LogServiceFactory.getLogService(CarbonDataMetaData.class.getName());
+
+  private HdfsEnvironment hdfsEnvironment;
+  private SemiTransactionalHiveMetastore metastore;
+  private MapredCarbonOutputCommitter carbonOutputCommitter;
+  private JobContextImpl jobContext;
+
+  public CarbonDataMetaData(SemiTransactionalHiveMetastore metastore,
+  HdfsEnvironment hdfsEnvironment, HivePartitionManager partitionManager, DateTimeZone timeZone,
+  boolean allowCorruptWritesForTesting, boolean writesToNonManagedTablesEnabled,

Review comment:
   The superclass takes this many parameters in its constructor, so I followed
the same signature. It is called from only one place, so it should be fine?









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-09-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r492044263



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataMetaData.java
##
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.io.IOException;
+import java.util.Collection;
+import java.util.Optional;
+import java.util.Properties;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.util.ThreadLocalSessionInfo;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hive.MapredCarbonOutputCommitter;
+import org.apache.carbondata.hive.util.HiveCarbonUtil;
+import org.apache.carbondata.presto.impl.CarbonTableConfig;
+import org.apache.carbondata.processing.loading.model.CarbonLoadModel;
+
+import com.google.common.collect.ImmutableMap;
+import io.airlift.slice.Slice;
+import io.prestosql.plugin.hive.HdfsEnvironment;
+import io.prestosql.plugin.hive.HiveInsertTableHandle;
+import io.prestosql.plugin.hive.HiveMetadata;
+import io.prestosql.plugin.hive.HivePartitionManager;
+import io.prestosql.plugin.hive.LocationService;
+import io.prestosql.plugin.hive.PartitionUpdate;
+import io.prestosql.plugin.hive.TypeTranslator;
+import io.prestosql.plugin.hive.metastore.MetastoreUtil;
+import io.prestosql.plugin.hive.metastore.SemiTransactionalHiveMetastore;
+import io.prestosql.plugin.hive.metastore.Table;
+import io.prestosql.plugin.hive.security.AccessControlMetadata;
+import io.prestosql.plugin.hive.statistics.HiveStatisticsProvider;
+import io.prestosql.plugin.hive.util.ConfigurationUtils;
+import io.prestosql.spi.connector.ConnectorInsertTableHandle;
+import io.prestosql.spi.connector.ConnectorOutputMetadata;
+import io.prestosql.spi.connector.ConnectorSession;
+import io.prestosql.spi.connector.ConnectorTableHandle;
+import io.prestosql.spi.connector.SchemaTableName;
+import io.prestosql.spi.statistics.ComputedStatistics;
+import io.prestosql.spi.type.TypeManager;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.JobContextImpl;
+import org.apache.hadoop.mapred.JobID;
+
+import org.apache.hadoop.mapred.TaskAttemptContextImpl;
+import org.apache.log4j.Logger;
+import org.joda.time.DateTimeZone;
+
+public class CarbonDataMetaData extends HiveMetadata {
+
+  private static final Logger LOG =
+  LogServiceFactory.getLogService(CarbonDataMetaData.class.getName());
+
+  private HdfsEnvironment hdfsEnvironment;
+  private SemiTransactionalHiveMetastore metastore;
+  private MapredCarbonOutputCommitter carbonOutputCommitter;
+  private JobContextImpl jobContext;
+
+  public CarbonDataMetaData(SemiTransactionalHiveMetastore metastore,
+  HdfsEnvironment hdfsEnvironment, HivePartitionManager partitionManager, DateTimeZone timeZone,
+  boolean allowCorruptWritesForTesting, boolean writesToNonManagedTablesEnabled,
+  boolean createsOfNonManagedTablesEnabled, TypeManager typeManager,
+  LocationService locationService,
+  io.airlift.json.JsonCodec<PartitionUpdate> partitionUpdateCodec,
+  TypeTranslator typeTranslator, String prestoVersion,
+  HiveStatisticsProvider hiveStatisticsProvider, AccessControlMetadata accessControlMetadata) {
+super(metastore, hdfsEnvironment, partitionManager, timeZone, allowCorruptWritesForTesting,
+true, createsOfNonManagedTablesEnabled, typeManager,
+locationService, partitionUpdateCodec, typeTranslator, prestoVersion,
+hiveStatisticsProvider, accessControlMetadata);
+this.hdfsEnvironment = hdfsEnvironment;
+this.metastore = metastore;
+  }
+
+  @Override
+  public CarbonDataInsertTableHandle beginInsert(ConnectorSession session,
+  ConnectorTableHandle tableHandle) {
+HiveInsertTableHandle hiveInsertTableHandle = super.beginInsert(session, tableHandle);
+SchemaTableName tableName = hiveInsertTableHandle.getSchemaTableName();
+Optional<Table> table =
+this.metastore.getTable(tableName.getSchemaName(), 

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-09-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r492039832



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataLocationService.java
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import com.google.inject.Inject;
+import io.prestosql.plugin.hive.HdfsEnvironment;
+import io.prestosql.plugin.hive.HiveLocationService;
+import io.prestosql.plugin.hive.HiveWriteUtils;
+import io.prestosql.plugin.hive.LocationHandle;
+import io.prestosql.plugin.hive.metastore.SemiTransactionalHiveMetastore;
+import io.prestosql.plugin.hive.metastore.Table;
+import io.prestosql.spi.connector.ConnectorSession;
+import org.apache.hadoop.fs.Path;
+
+public class CarbonDataLocationService extends HiveLocationService {
+
+  private final HdfsEnvironment hdfsEnvironment;
+
+  @Inject
+  public CarbonDataLocationService(HdfsEnvironment hdfsEnvironment) {
+super(hdfsEnvironment);
+this.hdfsEnvironment = hdfsEnvironment;
+  }
+
+  @Override
+  public LocationHandle forNewTable(SemiTransactionalHiveMetastore metastore,
+  ConnectorSession session, String schemaName, String tableName) {
+// TODO: check and make it compatible for cloud scenario

Review comment:
   If we do not override these methods, Presto assigns each writer a temporary
write path, much as Carbon uses a temp path while writing. That conflicts with
our write flow, so I have overridden them and set the write path and the
target path to the same location. In the Presto superclass, no temporary write
path or staging path is created for S3 or any encrypted store. So we need to
test this once on S3 or OBS and, if it works fine, remove this TODO. That is
why I added the TODO here. I did not have an S3/OBS test environment, so I
have tested only on HDFS.
   
   You can refer to
https://github.com/prestosql/presto/blob/8b177120661e600b5595b18826f5c415b7824b81/presto-hive/src/main/java/io/prestosql/plugin/hive/HiveLocationService.java#L55
   
   
https://github.com/prestosql/presto/blob/8b177120661e600b5595b18826f5c415b7824b81/presto-hive/src/main/java/io/prestosql/plugin/hive/HiveLocationService.java#L76
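
The difference between the two location styles described above can be sketched without any Presto dependency. Everything in the block is hypothetical: `WriteLocationSketch`, `Location`, `staged` and `direct` are illustrative names, not Presto or CarbonData classes, and the paths are made up.

```java
// Hypothetical sketch: none of these types are Presto or CarbonData classes.
final class WriteLocationSketch {

  // Pairs the final table location with the location writers actually write to.
  static final class Location {
    final String targetPath;
    final String writePath;

    Location(String targetPath, String writePath) {
      this.targetPath = targetPath;
      this.writePath = writePath;
    }

    // True when data must be moved from writePath to targetPath at commit time.
    boolean needsRename() {
      return !targetPath.equals(writePath);
    }
  }

  // Staging style as described in the comment: write to a temporary directory,
  // except on stores where staging is skipped (the comment names S3 and encrypted stores).
  static Location staged(String targetPath, String stagingDir, boolean stagingSupported) {
    return stagingSupported
        ? new Location(targetPath, stagingDir)
        : new Location(targetPath, targetPath);
  }

  // Style chosen in the override: write path and target path are the same.
  static Location direct(String targetPath) {
    return new Location(targetPath, targetPath);
  }

  public static void main(String[] args) {
    String table = "hdfs://nn/warehouse/db/t1";
    // Staged write needs a move at commit: prints true.
    System.out.println(staged(table, "/tmp/staging/q1", true).needsRename());
    // Direct write does not: prints false.
    System.out.println(direct(table).needsRename());
  }
}
```

With the direct style, the writer's own commit handling is the only mechanism that publishes data, which appears to be the reason for avoiding a second, engine-managed staging directory here.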









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-09-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r492017696



##
File path: 
integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonTableConfig.java
##
@@ -40,6 +40,12 @@
   private String endPoint;
   private String pushRowFilter;
 
+  /**
+   * Property to send load model from coordinator to worker in presto. This is internal constant
+   * and not exposed to user.

Review comment:
   As mentioned in the comment above, this is the same case: we use it as the
property name under which the load model is sent from the coordinator to the
worker. Its value is the load model prepared for each load.
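
As a rough illustration of passing an object between processes through a string property, the sketch below serializes a small model, stores it under the property name mentioned in this thread, and reads it back. It assumes plain Java serialization plus Base64, which is not necessarily what CarbonData's `ObjectSerializationUtil` does. `LoadModelTransport` and `DemoLoadModel` are hypothetical stand-ins, and `Properties` stands in for whatever configuration object travels from coordinator to worker.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;
import java.util.Properties;

// Hypothetical sketch: the encoding here is an assumption, not CarbonData's actual format.
final class LoadModelTransport {

  // Property name as given in this review thread.
  static final String KEY = "carbondata.presto.encoded.loadmodel";

  // Minimal stand-in for the real load model.
  static final class DemoLoadModel implements Serializable {
    private static final long serialVersionUID = 1L;
    final String tableName;
    final String segmentId;

    DemoLoadModel(String tableName, String segmentId) {
      this.tableName = tableName;
      this.segmentId = segmentId;
    }
  }

  // Coordinator side: turn the object into a string that fits in a property value.
  static String encode(Serializable value) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    }
    return Base64.getEncoder().encodeToString(bytes.toByteArray());
  }

  // Worker side: rebuild the object from the property value.
  static Object decode(String encoded) throws IOException, ClassNotFoundException {
    byte[] raw = Base64.getDecoder().decode(encoded);
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
      return in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    props.setProperty(KEY, encode(new DemoLoadModel("t1", "0")));
    DemoLoadModel received = (DemoLoadModel) decode(props.getProperty(KEY));
    // Prints "t1 / 0".
    System.out.println(received.tableName + " / " + received.segmentId);
  }
}
```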









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-09-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r492015787



##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataFileWriter.java
##
@@ -0,0 +1,183 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.io.IOException;
+import java.io.UncheckedIOException;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Properties;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hive.CarbonHiveSerDe;
+import org.apache.carbondata.hive.MapredCarbonOutputFormat;
+import org.apache.carbondata.presto.impl.CarbonTableConfig;
+
+import com.google.common.collect.ImmutableList;
+import io.prestosql.plugin.hive.HiveFileWriter;
+import io.prestosql.plugin.hive.HiveType;
+import io.prestosql.plugin.hive.HiveWriteUtils;
+import io.prestosql.spi.Page;
+import io.prestosql.spi.PrestoException;
+import io.prestosql.spi.block.Block;
+import io.prestosql.spi.type.Type;
+import io.prestosql.spi.type.TypeManager;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.exec.FileSinkOperator;
+import org.apache.hadoop.hive.ql.io.HiveOutputFormat;
+import org.apache.hadoop.hive.ql.io.IOConstants;
+import org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector;
+import org.apache.hadoop.hive.serde2.SerDeException;
+import org.apache.hadoop.hive.serde2.objectinspector.SettableStructObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.StructField;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.Reporter;
+import org.apache.log4j.Logger;
+
+import static com.google.common.collect.ImmutableList.toImmutableList;
+import static io.prestosql.plugin.hive.HiveErrorCode.HIVE_WRITER_DATA_ERROR;
+import static java.util.Objects.requireNonNull;
+import static java.util.stream.Collectors.toList;
+import static org.apache.hadoop.hive.conf.HiveConf.ConfVars.COMPRESSRESULT;
+
+/**
+ * This class implements HiveFileWriter and it creates the carbonFileWriter to write the page data
+ * sent from presto.
+ */
+public class CarbonDataFileWriter implements HiveFileWriter {
+
+  private static final Logger LOG =
+  LogServiceFactory.getLogService(CarbonDataFileWriter.class.getName());
+
+  private final JobConf configuration;
+  private Path outPutPath;
+  private final FileSinkOperator.RecordWriter recordWriter;
+  private final CarbonHiveSerDe serDe;
+  private final int fieldCount;
+  private final Object row;
+  private final SettableStructObjectInspector tableInspector;
+  private final List structFields;
+  private final HiveWriteUtils.FieldSetter[] setters;
+
+  private boolean isCommitDone;
+
+  public CarbonDataFileWriter(Path outPutPath, List<String> inputColumnNames, Properties properties,
+  JobConf configuration, TypeManager typeManager) throws SerDeException {
+this.outPutPath = requireNonNull(outPutPath, "path is null");
+this.outPutPath = new Path(properties.getProperty("location"));
+outPutPath = new Path(properties.getProperty("location"));
+this.configuration = requireNonNull(configuration, "conf is null");
+List<String> columnNames = Arrays
+.asList(properties.getProperty(IOConstants.COLUMNS, "").split(CarbonCommonConstants.COMMA));
+List<Type> fileColumnTypes =
+HiveType.toHiveTypes(properties.getProperty(IOConstants.COLUMNS_TYPES, "")).stream()
+.map(hiveType -> hiveType.getType(typeManager)).collect(toList());
+fieldCount = columnNames.size();
+serDe = new CarbonHiveSerDe();
+serDe.initialize(configuration, properties);
+tableInspector = (ArrayWritableObjectInspector) serDe.getObjectInspector();
+
+structFields = ImmutableList.copyOf(
+inputColumnNames.stream().map(tableInspector::getStructFieldRef)
+.collect(toImmutableList()));
+
+row 

[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-09-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r491867047



##
File path: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
##
@@ -2525,4 +2525,9 @@ private CarbonCommonConstants() {
* property which defines the presto query default value
*/
   public static final String IS_QUERY_FROM_PRESTO_DEFAULT = "false";
+
+  /**
+   * property to send load model from coordinator to worker in presto

Review comment:
   This is not a user config, same as the comment above. I have moved it to the presto module.









[GitHub] [carbondata] akashrn5 commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-09-21 Thread GitBox


akashrn5 commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r491866753



##
File path: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
##
@@ -2525,4 +2525,9 @@ private CarbonCommonConstants() {
* property which defines the presto query default value
*/
   public static final String IS_QUERY_FROM_PRESTO_DEFAULT = "false";
+
+  /**
+   * property to send load model from coordinator to worker in presto
+   */
+  public static final String CARBON_PRESTO_LOAD_MODEL = "presto.carbondata.encoded.loadmodel";

Review comment:
   You are right, it is not configurable by the user. I have moved the constant 
to `CarbonTableConfig` in the presto module and renamed it to 
`carbondata.presto.encoded.loadmodel`.
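
   For readers following the thread, here is a rough sketch of how an encoded 
load model could travel from coordinator to worker under that property key. 
Only the key string comes from the comment above. The class name 
`LoadModelPropertySketch`, the helper methods, and the choice of Java 
serialization plus Base64 are assumptions for illustration, not the code in 
this PR.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;
import java.util.Properties;

// Hypothetical stand-in class; this is NOT the real CarbonTableConfig from the presto module.
public final class LoadModelPropertySketch {

  // Key name as given in the review comment; internal, not a user-facing config.
  public static final String CARBON_PRESTO_LOAD_MODEL = "carbondata.presto.encoded.loadmodel";

  private LoadModelPropertySketch() {
  }

  // Coordinator side (assumed): serialize the model and store it as a Base64 string.
  public static void putModel(Properties props, Serializable model) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(model);
    }
    props.setProperty(CARBON_PRESTO_LOAD_MODEL,
        Base64.getEncoder().encodeToString(bytes.toByteArray()));
  }

  // Worker side (assumed): decode the string back into an object, or null if the key is absent.
  public static Object getModel(Properties props) throws IOException, ClassNotFoundException {
    String encoded = props.getProperty(CARBON_PRESTO_LOAD_MODEL);
    if (encoded == null) {
      return null;
    }
    byte[] raw = Base64.getDecoder().decode(encoded);
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
      return in.readObject();
    }
  }
}
```

   Keeping the key in the presto module, as described above, means it does not 
appear next to the user-configurable properties in `CarbonCommonConstants`.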




