[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3982: [CARBONDATA-4032] Fix drop partition command clean data issue

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3982:
URL: https://github.com/apache/carbondata/pull/3982#issuecomment-712604515


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4523/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ShreelekhyaG commented on pull request #3914: [CARBONDATA-3979] Added Hive local dictionary support example

2020-10-19 Thread GitBox


ShreelekhyaG commented on pull request #3914:
URL: https://github.com/apache/carbondata/pull/3914#issuecomment-712602617


   retest this please







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3982: [CARBONDATA-4032] Fix drop partition command clean data issue

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3982:
URL: https://github.com/apache/carbondata/pull/3982#issuecomment-712602226


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2769/
   







[GitHub] [carbondata] kunal642 commented on pull request #3983: [CARBONDATA-4036]Fix special char(`) issue in create table, when column name contains ` character

2020-10-19 Thread GitBox


kunal642 commented on pull request #3983:
URL: https://github.com/apache/carbondata/pull/3983#issuecomment-712600158


   retest this please







[GitHub] [carbondata] Indhumathi27 commented on pull request #3789: [CARBONDATA-3864] Store Size Optimization

2020-10-19 Thread GitBox


Indhumathi27 commented on pull request #3789:
URL: https://github.com/apache/carbondata/pull/3789#issuecomment-712598719


   retest this please







[GitHub] [carbondata] Karan980 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

2020-10-19 Thread GitBox


Karan980 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712595181


   retest this please







[jira] [Resolved] (CARBONDATA-3824) Error when a secondary index is created on a table that does not exist is not correct.

2020-10-19 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3824.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> Error when a secondary index is created on a table that does not exist is not 
> correct.
> ---
>
> Key: CARBONDATA-3824
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3824
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.0.0
> Environment: Spark 2.3.2, Spark 2.4.5
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>
> *Issue :-*
> Table uniqdata_double does not exist.
> A secondary index is created on this table; the error message returned is 
> incorrect.
> CREATE INDEX indextable2 ON TABLE uniqdata_double (DOB) AS 'carbondata' 
> PROPERTIES('carbon.column.compressor'='zstd');
> *Error: java.lang.RuntimeException: Operation not allowed on non-carbon table 
> (state=,code=0)*
>  
> *Expected :-*
> *Error: java.lang.RuntimeException: Table does not exist (state=,code=0)*
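A minimal sketch, assuming a ScalaTest QueryTest-style suite (the test, sql and intercept helpers are assumptions drawn from the project's test conventions, not from this issue), of how the expected behaviour could be asserted:

// Illustrative fragment only, meant to live inside such a suite; not the test added by the fix.
test("SI creation on a non-existent table reports a missing table") {
  sql("drop table if exists uniqdata_double")
  val ex = intercept[RuntimeException] {
    sql("CREATE INDEX indextable2 ON TABLE uniqdata_double (DOB) AS 'carbondata' " +
      "PROPERTIES('carbon.column.compressor'='zstd')")
  }
  // expected: a "table does not exist" message instead of "non-carbon table"
  assert(ex.getMessage.toLowerCase.contains("does not exist"))
}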



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-3903) Documentation Issue in Github Docs Link https://github.com/apache/carbondata/tree/master/docs

2020-10-19 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3903.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> Documentation Issue in Github  Docs Link 
> https://github.com/apache/carbondata/tree/master/docs
> --
>
> Key: CARBONDATA-3903
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3903
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 2.0.1
> Environment: https://github.com/apache/carbondata/tree/master/docs
>Reporter: PURUJIT CHAUGULE
>Priority: Minor
> Fix For: 2.1.0
>
>
> dml-of-carbondata.md
> LOAD DATA:
>  * Mention that each load is considered a segment.
>  * Give all possible options for SORT_SCOPE, i.e. GLOBAL_SORT/LOCAL_SORT/NO_SORT 
> (with an explanation of the difference between each type).
>  * Add an example of a complete LOAD query with and without OPTIONS (see the 
> sketch below).
> INSERT DATA:
>  * Mention that each insert is a segment.
> LOAD Using Static/Dynamic Partitioning:
>  * Can give a hyperlink to static/dynamic partitioning.
> UPDATE/DELETE:
>  * Mention the delta-file concept used by update and delete.
> DELETE:
>  * Add an example for deleting all records from a table (delete from tablename).
> COMPACTION:
>  * Can mention that minor compaction is of two types, auto and manual 
> (carbon.auto.load.merge=true/false), and that if carbon.auto.load.merge=false 
> the compaction must be triggered manually.
>  * Hyperlink to the configurable properties of compaction.
>  * Mention that compacted segments do not get cleaned automatically and 
> must be cleaned manually using clean files.
>  
> flink-integration-guide.md
>  * Mention what stages are and how they are used.
>  * Describe the process of inserting and deleting stages in a carbon table 
> (and how they are stored in the table).
>  
> language-manual.md
>  * Mention a Compaction hyperlink in the DML section.
>  
> spatial-index-guide.md
>  * Mention the TBLPROPERTIES supported / not supported for a geo table.
>  * Mention that the spatial index does not make a new column.
>  * Mention that CTAS from one geo table to another does not create another geo table.
>  * Mention that a certain combination of spatial index table properties needs 
> to be added in create table, without which a geo table does not get created.
>  * Mention that we cannot alter the columns (change datatype, change name, drop) 
> mentioned in spatial_index.
>  
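To illustrate the LOAD and DELETE examples requested above, a hedged Scala/Spark-SQL sketch; it assumes a SparkSession named spark with the Carbon extensions enabled, and the table name, file path and option values are made up for illustration:

// Illustrative only: hypothetical table and path names. Each LOAD DATA or INSERT
// creates a new segment in the CarbonData table.
spark.sql(
  """LOAD DATA INPATH 'hdfs://hacluster/data/sample.csv' INTO TABLE sample_table
    |OPTIONS('DELIMITER'=',', 'HEADER'='true', 'SORT_SCOPE'='LOCAL_SORT')""".stripMargin)

// Without OPTIONS the load falls back to the table/system defaults.
spark.sql("LOAD DATA INPATH 'hdfs://hacluster/data/sample.csv' INTO TABLE sample_table")

// Delete all records from a table (as requested in the DELETE item above).
spark.sql("DELETE FROM sample_table")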





[jira] [Updated] (CARBONDATA-3901) Documentation issues in https://github.com/apache/carbondata/tree/master/docs

2020-10-19 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-3901:

Description: 
*Issue 1 -* 
[https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.md] SORT_SCOPE: 
"Sort scope of the load. Options include no sort, local sort, batch sort and 
global sort" --> batch sort should be removed as it is not supported.

*Issue 2 -* 
[https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream]
The CLOSE STREAM link is not working.

 

  was:
*Issue 1 :* 
[https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] 
getOrCreateCarbonSession is not used in the Carbon 2.0 version and should be removed.
The section "Testing use alluxio by CarbonSession" contains:

import org.apache.spark.sql.CarbonSession._
import org.apache.spark.sql.SparkSession

val carbon = SparkSession.builder().master("local").appName("test")
  .getOrCreateCarbonSession("alluxio://localhost:19998/carbondata")
carbon.sql("CREATE TABLE carbon_alluxio(id String, name String, city String, age Int) STORED as carbondata")
carbon.sql(s"LOAD DATA LOCAL INPATH '${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into table carbon_alluxio")
carbon.sql("select * from carbon_alluxio").show

*Issue 2  -* 
[https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.md] SORT_SCOPE: 
"Sort scope of the load. Options include no sort, local sort, batch sort and 
global sort" --> batch sort should be removed as it is not supported.

*Issue 3 -* 
[https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream]
   CLOSE STREAM link is not working.

*Issue 4 -*  
[https://github.com/apache/carbondata/blob/master/docs/index/bloomfilter-index-guide.md]
  Explain query does not hit the bloom. Hence the line "User can verify whether 
a query can leverage BloomFilter Index by executing {{EXPLAIN}} command, which 
will show the transformed logical plan, and thus user can check whether the 
BloomFilter Index can skip blocklets during the scan."  needs to be removed.


> Documentation issues in https://github.com/apache/carbondata/tree/master/docs
> -
>
> Key: CARBONDATA-3901
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3901
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 2.0.1
> Environment: https://github.com/apache/carbondata/tree/master/docs
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> *Issue 1 -* 
> [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.md] SORT_SCOPE: 
> "Sort scope of the load. Options include no sort, local sort, batch sort and 
> global sort" --> batch sort should be removed as it is not supported.
> *Issue 2 -* 
> [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream]
> The CLOSE STREAM link is not working.
>  





[jira] [Resolved] (CARBONDATA-3901) Documentation issues in https://github.com/apache/carbondata/tree/master/docs

2020-10-19 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3901.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> Documentation issues in https://github.com/apache/carbondata/tree/master/docs
> -
>
> Key: CARBONDATA-3901
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3901
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 2.0.1
> Environment: https://github.com/apache/carbondata/tree/master/docs
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> *Issue 1 :* 
> [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] 
> getOrCreateCarbonSession is not used in the Carbon 2.0 version and should be removed.
> The section "Testing use alluxio by CarbonSession" contains:
>
> import org.apache.spark.sql.CarbonSession._
> import org.apache.spark.sql.SparkSession
>
> val carbon = SparkSession.builder().master("local").appName("test")
>   .getOrCreateCarbonSession("alluxio://localhost:19998/carbondata")
> carbon.sql("CREATE TABLE carbon_alluxio(id String, name String, city String, age Int) STORED as carbondata")
> carbon.sql(s"LOAD DATA LOCAL INPATH '${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into table carbon_alluxio")
> carbon.sql("select * from carbon_alluxio").show
>
> *Issue 2  -* 
> [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.md] SORT_SCOPE: 
> "Sort scope of the load. Options include no sort, local sort, batch sort and 
> global sort" --> batch sort should be removed as it is not supported.
> *Issue 3 -* 
> [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream]
>    CLOSE STREAM link is not working.
> *Issue 4 -*  
> [https://github.com/apache/carbondata/blob/master/docs/index/bloomfilter-index-guide.md]
>   Explain query does not hit the bloom. Hence the line "User can verify 
> whether a query can leverage BloomFilter Index by executing {{EXPLAIN}} 
> command, which will show the transformed logical plan, and thus user can 
> check whether the BloomFilter Index can skip blocklets during the scan."  
> needs to be removed.





[GitHub] [carbondata] asfgit closed pull request #3980: [CARBONDATA-3901] [CARBONDATA-3903] [CARBONDATA-3824] SI creation on unknown table and doc changes.

2020-10-19 Thread GitBox


asfgit closed pull request #3980:
URL: https://github.com/apache/carbondata/pull/3980


   







[GitHub] [carbondata] akashrn5 commented on pull request #3980: [CARBONDATA-3901] [CARBONDATA-3903] [CARBONDATA-3824] SI creation on unknown table and doc changes.

2020-10-19 Thread GitBox


akashrn5 commented on pull request #3980:
URL: https://github.com/apache/carbondata/pull/3980#issuecomment-712582284


   LGTM







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3980: [CARBONDATA-3901] [CARBONDATA-3903] [CARBONDATA-3824] SI creation on unknown table and doc changes.

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3980:
URL: https://github.com/apache/carbondata/pull/3980#issuecomment-712579135


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2768/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3980: [CARBONDATA-3901] [CARBONDATA-3903] [CARBONDATA-3824] SI creation on unknown table and doc changes.

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3980:
URL: https://github.com/apache/carbondata/pull/3980#issuecomment-712578745


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4522/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3950:
URL: https://github.com/apache/carbondata/pull/3950#issuecomment-712560439


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4520/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3950:
URL: https://github.com/apache/carbondata/pull/3950#issuecomment-712559981


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2766/
   







[GitHub] [carbondata] marchpure commented on pull request #3982: [CARBONDATA-4032] Fix drop partition command clean data issue

2020-10-19 Thread GitBox


marchpure commented on pull request #3982:
URL: https://github.com/apache/carbondata/pull/3982#issuecomment-712560057


   retest this please







[GitHub] [carbondata] nihal0107 commented on a change in pull request #3980: [CARBONDATA-3901] [CARBONDATA-3903] [CARBONDATA-3824] SI creation on unknown table and doc changes.

2020-10-19 Thread GitBox


nihal0107 commented on a change in pull request #3980:
URL: https://github.com/apache/carbondata/pull/3980#discussion_r508168162



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala
##
@@ -267,6 +267,7 @@ class CarbonFileMetastore extends CarbonMetaStore {
   lookupRelation(tableIdentifier)(sparkSession)
 } catch {
   case _: NoSuchTableException =>
+LOGGER.error(s"Table ${tableIdentifier.table} does not exist.")

Review comment:
   done









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712405367


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2763/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#issuecomment-712395039











[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712386082


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4517/
   







[GitHub] [carbondata] Karan980 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

2020-10-19 Thread GitBox


Karan980 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712318046


   retest this please







[GitHub] [carbondata] akashrn5 commented on a change in pull request #3980: [CARBONDATA-3901] [CARBONDATA-3903] [CARBONDATA-3824] SI creation on unknown table and doc changes.

2020-10-19 Thread GitBox


akashrn5 commented on a change in pull request #3980:
URL: https://github.com/apache/carbondata/pull/3980#discussion_r507902315



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala
##
@@ -267,6 +267,7 @@ class CarbonFileMetastore extends CarbonMetaStore {
   lookupRelation(tableIdentifier)(sparkSession)
 } catch {
   case _: NoSuchTableException =>
+LOGGER.error(s"Table ${tableIdentifier.table} does not exist.")

Review comment:
   I think this can be a debug log; otherwise the user may get confused by the error.









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712277239


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2761/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3967: [CARBONDATA-4004] [CARBONDATA-4012] Issue with select after update command

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3967:
URL: https://github.com/apache/carbondata/pull/3967#issuecomment-712271195


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2762/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3967: [CARBONDATA-4004] [CARBONDATA-4012] Issue with select after update command

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3967:
URL: https://github.com/apache/carbondata/pull/3967#issuecomment-712270661


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4516/
   







[jira] [Created] (CARBONDATA-4037) Improve the table status and segment file writing

2020-10-19 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4037:
-

 Summary: Improve the table status and segment file writing
 Key: CARBONDATA-4037
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4037
 Project: CarbonData
  Issue Type: Improvement
Reporter: SHREELEKHYA GAMPA


Currently we update the table status and segment files multiple times for a single 
IUD/merge/compact operation, and we delete the index files immediately after merge. 
When concurrent queries run, a user query may try to access segment index files 
that are no longer present, which is an availability issue.
 * Instead of deleting carbon index files immediately after merge, delete index 
files only when the clean files command is executed, and delete only those that 
have existed for more than 1 hour.
 * Generate the segment file after merge index, and update the table status at the 
beginning and after merge index.
Proposed order (see the sketch below):
create table status file => index files => merge index => generate segment file 
=> update table status
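A minimal Scala sketch of the proposed ordering; the helper names are hypothetical stubs standing in for the real writer code, and only the call sequence reflects the proposal above:

object SegmentWriteOrder {
  case class Segment(id: String)

  // Stubs for illustration only; the real implementation lives in the write path.
  private def createTableStatusEntry(s: Segment): Unit = println(s"status(in-progress) ${s.id}")
  private def writeIndexFiles(s: Segment): Unit = println(s"index files ${s.id}")
  private def mergeIndexFiles(s: Segment): Unit = println(s"merge index ${s.id}") // index files kept until clean files
  private def writeSegmentFile(s: Segment): Unit = println(s"segment file ${s.id}")
  private def updateTableStatus(s: Segment): Unit = println(s"status(success) ${s.id}")

  def writeSegment(segment: Segment): Unit = {
    createTableStatusEntry(segment) // 1. create table status entry
    writeIndexFiles(segment)        // 2. write carbon index files
    mergeIndexFiles(segment)        // 3. merge index
    writeSegmentFile(segment)       // 4. generate segment file after merge index
    updateTableStatus(segment)      // 5. final table status update
  }
}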





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712260963


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4515/
   







[GitHub] [carbondata] akashrn5 commented on a change in pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update

2020-10-19 Thread GitBox


akashrn5 commented on a change in pull request #3986:
URL: https://github.com/apache/carbondata/pull/3986#discussion_r507834166



##
File path: 
core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentUpdateStatusManager.java
##
@@ -415,44 +415,66 @@ public boolean accept(CarbonFile pathName) {
   }
 
   /**
-   * Return all delta file for a block.
-   * @param segmentId
-   * @param blockName
-   * @return
+   * Get all delete delta files mapped to each block of the specified segment.
+   * First list all deletedelta files in the segment dir, then loop the files 
and find
+   * a map of blocks and .deletedelta files related to each block.
+   *
+   * @param seg the segment which is to find blocks
+   * @return a map of block and its file list
*/
-  public CarbonFile[] getDeleteDeltaFilesList(final Segment segmentId, final 
String blockName) {
-String segmentPath = CarbonTablePath.getSegmentPath(
-identifier.getTablePath(), segmentId.getSegmentNo());
-CarbonFile segDir =
-FileFactory.getCarbonFile(segmentPath);
+  public Map> getDeleteDeltaFilesList(final Segment 
seg) {
+
+Map blockDeltaStartAndEndTimestampMap = new HashMap<>();

Review comment:
   @shenjiayu17 why exactly do we need this change? Even after introducing the 
map, we still do the list-files call, which is a costly operation. I have some 
points, please check (see the sketch below):
   
   1. There is no need to create these maps, listing files again just to fill 
them. We already have the block name, and every block has only one corresponding 
delete delta file, so the number of delta files per block is always one.
   2. The update details contain the block name, the actual block name, the 
timestamps and the delete delta timestamps, so if the timestamp is not empty you 
can form the delta file name from this information and return it from this 
method.
   
   With the above approach you avoid the list-files operation, the filtering by 
timestamp and the creation of all these maps, so you can avoid all these changes.
   
   We always keep the horizontal compaction threshold at 1, and we do not change 
it or recommend users to change it, in order to get better performance.
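A rough Scala sketch of point 2 above, forming the delete delta file path from the update details instead of listing the segment directory; the accessor names and the "<blockName>-<timestamp>.deletedelta" naming are assumptions, not the merged change:

import org.apache.carbondata.core.mutate.SegmentUpdateDetails

// Sketch only: derive the delete delta file path for one block from its
// SegmentUpdateDetails entry; returns None when the block has no delete delta.
def deleteDeltaFileFor(segmentPath: String,
    detail: SegmentUpdateDetails): Option[String] = {
  val timestamp = detail.getDeleteDeltaStartTimestamp
  if (timestamp == null || timestamp.isEmpty) {
    None
  } else {
    // assumed naming convention: <blockName>-<timestamp>.deletedelta
    Some(s"$segmentPath/${detail.getBlockName}-$timestamp.deletedelta")
  }
}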

##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/HorizontalCompaction.scala
##
@@ -173,6 +176,9 @@ object HorizontalCompaction {
 
 val db = carbonTable.getDatabaseName
 val table = carbonTable.getTableName
+
+LOG.info(s"Horizontal Delete Compaction operation is getting valid 
segments for [$db.$table].")

Review comment:
   same as above

##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/HorizontalCompaction.scala
##
@@ -125,6 +125,9 @@ object HorizontalCompaction {
   segLists: util.List[Segment]): Unit = {
 val db = carbonTable.getDatabaseName
 val table = carbonTable.getTableName
+
+LOG.info(s"Horizontal Update Compaction operation is getting valid 
segments for [$db.$table].")

Review comment:
   I think this log does not give any useful information here; a log after line 
133 that prints `validSegList` would be more useful (see the sketch below).
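A small Scala sketch of the suggested alternative log; the segment type is kept generic so the snippet stays self-contained, and the wording of the message is only illustrative:

import java.util.{List => JList}
import scala.collection.JavaConverters._
import org.apache.log4j.Logger

// Sketch: log the segments actually resolved, which is more informative than a
// generic "getting valid segments" message.
def logValidSegments[T](log: Logger, db: String, table: String,
    validSegList: JList[T]): Unit = {
  log.info(s"Horizontal Update Compaction valid segments for [$db.$table]: " +
    validSegList.asScala.mkString(", "))
}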









[GitHub] [carbondata] akkio-97 commented on pull request #3967: [CARBONDATA-4004] [CARBONDATA-4012] Issue with select after update command

2020-10-19 Thread GitBox


akkio-97 commented on pull request #3967:
URL: https://github.com/apache/carbondata/pull/3967#issuecomment-712186261


   retest this please







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712135326


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4514/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712134800


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2760/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712132333


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2759/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-712129726


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4513/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3950:
URL: https://github.com/apache/carbondata/pull/3950#issuecomment-712125005


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2758/
   







[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3984: [CARBONDATA-4035]Fix MV query issue with aggregation on decimal column

2020-10-19 Thread GitBox


Indhumathi27 commented on a change in pull request #3984:
URL: https://github.com/apache/carbondata/pull/3984#discussion_r507703419



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/MVCreateTestCase.scala
##
@@ -1471,6 +1471,16 @@ class MVCreateTestCase extends QueryTest with 
BeforeAndAfterAll {
 sql("drop table if exists t2")
   }
 
+  test("test sum aggregations on decimal columns") {
+sql("drop table if exists sum_agg_decimal")

Review comment:
   Can also add the drop to afterAll (see the sketch below).
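A hedged Scala sketch of the suggested cleanup; the QueryTest import path is an assumption drawn from the project's test layout, and the object names come from the test above:

import org.scalatest.BeforeAndAfterAll
import org.apache.spark.sql.test.util.QueryTest // assumed location of QueryTest

// Sketch: also drop the objects created by the new test in afterAll,
// so reruns start from a clean state.
class MVDecimalAggCleanupExample extends QueryTest with BeforeAndAfterAll {
  override def afterAll(): Unit = {
    sql("drop materialized view if exists decimal_mv")
    sql("drop table if exists sum_agg_decimal")
  }
}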









[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3984: [CARBONDATA-4035]Fix MV query issue with aggregation on decimal column

2020-10-19 Thread GitBox


Indhumathi27 commented on a change in pull request #3984:
URL: https://github.com/apache/carbondata/pull/3984#discussion_r507702557



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/MVCreateTestCase.scala
##
@@ -1471,6 +1471,16 @@ class MVCreateTestCase extends QueryTest with 
BeforeAndAfterAll {
 sql("drop table if exists t2")
   }
 
+  test("test sum aggregations on decimal columns") {
+sql("drop table if exists sum_agg_decimal")
+sql("create table sum_agg_decimal(salary1 decimal(7,2),salary2 
decimal(7,2),salary3 decimal(7,2),salary4 decimal(7,2),empname string) stored 
as carbondata")
+sql("drop materialized view if exists decimal_mv")
+sql("create materialized view decimal_mv as select empname, sum(salary1 - 
salary2) from sum_agg_decimal group by empname")
+sql("explain select empname, sum( salary1 - salary2) from sum_agg_decimal 
group by empname").show(false)

Review comment:
   can revert this change









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3950:
URL: https://github.com/apache/carbondata/pull/3950#issuecomment-712097634


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4512/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3950:
URL: https://github.com/apache/carbondata/pull/3950#issuecomment-712066559


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2754/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3980: [CARBONDATA-3901] [CARBONDATA-3903] [CARBONDATA-3824] SI creation on unknown table and doc changes.

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3980:
URL: https://github.com/apache/carbondata/pull/3980#issuecomment-712049673











[jira] [Commented] (CARBONDATA-4025) storage space for MV is double that of the table on which the MV has been created.

2020-10-19 Thread suyash yadav (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17216618#comment-17216618
 ] 

suyash yadav commented on CARBONDATA-4025:
--

Hi Team,

 

Can somebody look into this request?

 

Regards

Suyash Yadav

> storage space for MV is double that of the table on which the MV has been 
> created.
> ---
>
> Key: CARBONDATA-4025
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4025
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.0.1
> Environment: Apcahe carbondata 2.0.1
> Apache spark 2.4.5
> Hadoop 2.7.2
>Reporter: suyash yadav
>Priority: Major
>
> We are doing a POC based on CarbonData and have observed that when we 
> create an MV on a table with a timeseries function of the same granularity, 
> the MV takes double the space of the table.
>  
> In my scenario, my table has 1.3 million records and the MV has the same 
> number of records, but the size of the table is 3.6 MB while the size of the 
> MV is around 6.5 MB.
> This is really important for us, as critical business decisions are getting 
> affected by this behaviour.





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3950:
URL: https://github.com/apache/carbondata/pull/3950#issuecomment-711928114


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4508/
   







[GitHub] [carbondata] nihal0107 commented on a change in pull request #3980: [CARBONDATA-3901] [CARBONDATA-3903] [CARBONDATA-3824] SI creation on unknown table and doc changes.

2020-10-19 Thread GitBox


nihal0107 commented on a change in pull request #3980:
URL: https://github.com/apache/carbondata/pull/3980#discussion_r507589219



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/strategy/DDLStrategy.scala
##
@@ -204,6 +204,10 @@ class DDLStrategy(sparkSession: SparkSession) extends 
SparkStrategy {
   ExecutedCommandExec(CarbonCreateSecondaryIndexCommand(
 indexModel, tableProperties, ifNotExists, isDeferredRefresh, 
isCreateSIndex)) :: Nil
 } else {
+  if (!sparkSession.sessionState.catalog.

Review comment:
   done

##
File path: docs/spatial-index-guide.md
##
@@ -62,13 +62,16 @@ create table source_index(id BIGINT, latitude long, 
longitude long) stored by 'c
 'SPATIAL_INDEX.mygeohash.maxLatitude'='20.225281',
 'SPATIAL_INDEX.mygeohash.conversionRatio'='100');
 ```
-Note: `mygeohash` in the above example represent the index name.
+Note: 
+   * `mygeohash` in the above example represent the index name.
+   * Columns present in spatial_index table properties cannot be altered
+i.e., sourcecolumns: `longitude, latitude` and index column: `mygeohash` 
in the above example.
 
  List of spatial index table properties
 
 |Name|Description|
 
|---|-|
-| SPATIAL_INDEX | Used to configure Spatial Index name. This name is appended 
to `SPATIAL_INDEX` in the subsequent sub-property configurations. `xxx` in the 
below sub-properties refer to index name.|
+| SPATIAL_INDEX | Used to configure Spatial Index name. This name is appended 
to `SPATIAL_INDEX` in the subsequent sub-property configurations. `xxx` in the 
below sub-properties refer to index name. Newly created column name is same as 
that of spatial index name. This column is not allowed in any properties except 
in SORT_COLUMNS table property.|

Review comment:
   done









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3986:
URL: https://github.com/apache/carbondata/pull/3986#issuecomment-711838443


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4507/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3914: [CARBONDATA-3979] Added Hive local dictionary support example

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3914:
URL: https://github.com/apache/carbondata/pull/3914#issuecomment-711832466


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2752/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3986:
URL: https://github.com/apache/carbondata/pull/3986#issuecomment-711817892


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2753/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3950:
URL: https://github.com/apache/carbondata/pull/3950#issuecomment-711793379


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2750/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3914: [CARBONDATA-3979] Added Hive local dictionary support example

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3914:
URL: https://github.com/apache/carbondata/pull/3914#issuecomment-711781738


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4506/
   







[jira] [Resolved] (CARBONDATA-3965) Adaptive encoding of Complex primitive float is using long value to store float (4 bytes) data

2020-10-19 Thread Ajantha Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat resolved CARBONDATA-3965.
--
Fix Version/s: 2.1.0
   Resolution: Fixed

> Adaptive encoding of Complex primitive float is using long value to store 
> float (4 bytes) data
> -
>
> Key: CARBONDATA-3965
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3965
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> I have tested this; it is hit with the current UTs themselves. For [Null, 5.512] 
> long is used as the storage type for complex primitive adaptive encoding. The 
> base behavior needs to be checked; I guess it can be analyzed separately.
>  
> For this, I have checked the following: if there is no complex type (i.e. it is 
> just a primitive type), the same values go to DirectCompress, not adaptive. But 
> a complex primitive goes to adaptive because of the code below, and since 
> min/max is stored with double precision, long is chosen for it.
> DefaultEncodingFactory#selectCodecByAlgorithmForFloating():
>  
> } else if (decimalCount < 0 && !isComplexPrimitive) {
>   return new DirectCompressCodec(DataTypes.DOUBLE);
> } else {
>   return getColumnPageCodec(stats, isComplexPrimitive, columnSpec, srcDataType, 
>   maxValue, minValue, decimalCount, absMaxValue);
> }
>  
> I don't know (or remember) why a complex primitive should not enter direct 
> compress, or why that check is explicitly added.





[GitHub] [carbondata] asfgit closed pull request #3985: [CARBONDATA-3965]Fixed float variable target datatype in case of adaptive encoding

2020-10-19 Thread GitBox


asfgit closed pull request #3985:
URL: https://github.com/apache/carbondata/pull/3985


   







[GitHub] [carbondata] akashrn5 commented on a change in pull request #3980: [CARBONDATA-3901] [CARBONDATA-3903] [CARBONDATA-3824] SI creation on unknown table and doc changes.

2020-10-19 Thread GitBox


akashrn5 commented on a change in pull request #3980:
URL: https://github.com/apache/carbondata/pull/3980#discussion_r507519490



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/strategy/DDLStrategy.scala
##
@@ -204,6 +204,10 @@ class DDLStrategy(sparkSession: SparkSession) extends 
SparkStrategy {
   ExecutedCommandExec(CarbonCreateSecondaryIndexCommand(
 indexModel, tableProperties, ifNotExists, isDeferredRefresh, 
isCreateSIndex)) :: Nil
 } else {
+  if (!sparkSession.sessionState.catalog.

Review comment:
   We already call the `tableExists` function to check whether this is a carbon 
table; if the table is not found it throws `NoSuchTableException`, but we catch 
the exception in `org.apache.spark.sql.hive.CarbonFileMetastore#tableExists` and 
return false. So you can add an error log in CarbonFileMetastore saying the table 
does not exist (mentioning the table name), and avoid the call to the Hive 
metastore that your change adds here, which is a costly operation.
   
   Also, you can just make the error message more general so that it covers both 
the non-carbon-table scenario and the table-not-exists scenario.









[GitHub] [carbondata] asfgit closed pull request #3948: [HOTFIX] Fix random 11 testcase failure in CI

2020-10-19 Thread GitBox


asfgit closed pull request #3948:
URL: https://github.com/apache/carbondata/pull/3948


   







[GitHub] [carbondata] QiangCai commented on pull request #3948: [HOTFIX] Fix random 11 testcase failure in CI

2020-10-19 Thread GitBox


QiangCai commented on pull request #3948:
URL: https://github.com/apache/carbondata/pull/3948#issuecomment-711717006


   LGTM,  we will raise more PRs to fix other random failures in CI







[GitHub] [carbondata] ajantha-bhat commented on pull request #3985: [CARBONDATA-3965]Fixed float variable target datatype in case of adaptive encoding

2020-10-19 Thread GitBox


ajantha-bhat commented on pull request #3985:
URL: https://github.com/apache/carbondata/pull/3985#issuecomment-711712512


   LGTM







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3985: [CARBONDATA-3965]Fixed float variable target datatype in case of adaptive encoding

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3985:
URL: https://github.com/apache/carbondata/pull/3985#issuecomment-711693802


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2751/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3985: [CARBONDATA-3965]Fixed float variable target datatype in case of adaptive encoding

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3985:
URL: https://github.com/apache/carbondata/pull/3985#issuecomment-711690989


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4505/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code

2020-10-19 Thread GitBox


CarbonDataQA1 commented on pull request #3950:
URL: https://github.com/apache/carbondata/pull/3950#issuecomment-71106


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4504/
   







[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3875: [CARBONDATA-3934]Support write transactional table with presto.

2020-10-19 Thread GitBox


ajantha-bhat commented on a change in pull request #3875:
URL: https://github.com/apache/carbondata/pull/3875#discussion_r507464156



##
File path: 
integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonOutputFormat.java
##
@@ -92,6 +95,14 @@ public void checkOutputSpecs(FileSystem fileSystem, JobConf 
jobConf) throws IOEx
 }
 String tablePath = 
FileFactory.getCarbonFile(carbonLoadModel.getTablePath()).getAbsolutePath();
 TaskAttemptID taskAttemptID = 
TaskAttemptID.forName(jc.get("mapred.task.id"));
+// taskAttemptID will be null when the insert job is fired from presto. 
Presto send the JobConf
+// and since presto does not use the MR framework for execution, the 
mapred.task.id will be
+// null, so prepare a new ID.
+if (taskAttemptID == null) {
+  SimpleDateFormat formatter = new SimpleDateFormat("MMddHHmm");
+  String jobTrackerId = formatter.format(new Date());
+  taskAttemptID = new TaskAttemptID(jobTrackerId, 0, TaskType.MAP, 0, 0);

Review comment:
   Concurrent inserts may use the same taskAttemptID. Can you use a UUID as the 
taskAttemptID, or check how the ORC writer does it? (See the sketch below.)
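A hedged Scala sketch of one way to avoid the collision, randomizing the task id per writer; the date pattern and the exact construction are assumptions, not necessarily what the PR ends up doing:

import java.text.SimpleDateFormat
import java.util.Date
import java.util.concurrent.ThreadLocalRandom

import org.apache.hadoop.mapred.TaskAttemptID
import org.apache.hadoop.mapreduce.TaskType

// Sketch: when mapred.task.id is absent (e.g. a JobConf coming from Presto),
// build a TaskAttemptID whose task id is randomized so concurrent inserts
// into the same table do not collide on (0, 0).
def freshTaskAttemptId(): TaskAttemptID = {
  val jobTrackerId = new SimpleDateFormat("yyyyMMddHHmm").format(new Date())
  val taskId = ThreadLocalRandom.current().nextInt(Int.MaxValue)
  new TaskAttemptID(jobTrackerId, 0, TaskType.MAP, taskId, 0)
}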

##
File path: 
integration/presto/src/main/prestosql/org/apache/carbondata/presto/CarbonDataFileWriter.java
##
@@ -0,0 +1,188 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.io.IOException;
+import java.io.UncheckedIOException;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Properties;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.hadoop.api.CarbonTableOutputFormat;
+import org.apache.carbondata.hive.CarbonHiveSerDe;
+import org.apache.carbondata.hive.MapredCarbonOutputFormat;
+import org.apache.carbondata.presto.impl.CarbonTableConfig;
+
+import com.google.common.collect.ImmutableList;
+import io.prestosql.plugin.hive.HiveFileWriter;
+import io.prestosql.plugin.hive.HiveType;
+import io.prestosql.plugin.hive.HiveWriteUtils;
+import io.prestosql.spi.Page;
+import io.prestosql.spi.PrestoException;
+import io.prestosql.spi.block.Block;
+import io.prestosql.spi.type.Type;
+import io.prestosql.spi.type.TypeManager;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.exec.FileSinkOperator;
+import org.apache.hadoop.hive.ql.io.HiveOutputFormat;
+import org.apache.hadoop.hive.ql.io.IOConstants;
+import org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector;
+import org.apache.hadoop.hive.serde2.SerDeException;
+import 
org.apache.hadoop.hive.serde2.objectinspector.SettableStructObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.StructField;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.Reporter;
+import org.apache.log4j.Logger;
+
+import static com.google.common.collect.ImmutableList.toImmutableList;
+import static io.prestosql.plugin.hive.HiveErrorCode.HIVE_WRITER_DATA_ERROR;
+import static java.util.Objects.requireNonNull;
+import static java.util.stream.Collectors.toList;
+import static org.apache.hadoop.hive.conf.HiveConf.ConfVars.COMPRESSRESULT;
+
+/**
+ * This class implements HiveFileWriter and it creates the carbonFileWriter to 
write the page data
+ * sent from presto.
+ */
+public class CarbonDataFileWriter implements HiveFileWriter {
+
+  private static final Logger LOG =
+  LogServiceFactory.getLogService(CarbonDataFileWriter.class.getName());
+
+  private final JobConf configuration;
+  private final Path outPutPath;
+  private final FileSinkOperator.RecordWriter recordWriter;
+  private final CarbonHiveSerDe serDe;
+  private final int fieldCount;
+  private final Object row;
+  private final SettableStructObjectInspector tableInspector;
+  private final List structFields;
+  private final HiveWriteUtils.FieldSetter[] setters;
+
+  private boolean isCommitDone;
+
+  public CarbonDataFileWriter(Path outPutPath, List inputColumnNames, 
Properties properties,
+